--- Comment #39 from Michael Gmelin <[hidden email]> ---
(In reply to Allan Jude from comment #38)
> The patch I committed only stops the deadlock in the case where the vnode being reclaimed belongs to a deleted file. It can still happen in other circumstances.
Obviously (like avg pointed our), but the main issue in this test case was
`find` iterating over files that were deleted. So beforehand it would hang 9
out of 10 times when running the test case, while I couldn't reproduce it after
patching the kernel. So the patch is effective for this specific scenario.
Let's see if Andreas' CI runs into issues again (like he pointed out, his
observations in the last two weeks were done using an unpatched kernel by
> I am wondering if the correct approach is just to limit the number of times it can loop, and return an error. I am not sure what side effects returning an error will have.
Based on my gut feeling this sounds dangerous, just thinking of possible
scenarios where one process could cause another process to make an important
write fail that way where it wasn't expected. What kind of error would you
return in such a scenario, maybe EINTR?