-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure 'everyone' knows about changes to the sharetab. #1991
Conversation
this may help with bug #2004 testing now |
@FransUrbo Using the already open handle in |
Since we have no locking on sharetab, any process/thread can overwrite it anyway. So the use of a temporary file is pointless. |
@FransUrbo It's not totally pointless. By using a temporary file we ensure that all the updates are atomic. Other well behaved system utilities do the same for this very reason. The result is that while the file may be stale/wrong we never access in a state where it's only been partially updated. For example, half of a line has only been written. |
@behlendorf so how do we solve #2004? |
@FransUrbo Good question. I think we need to understand the root cause of #2004, because it's not clear to me _why_this patch would fix the crash. Calling But like I said above updating |
This may have been fixed elsewhere, here. https://lkml.org/lkml/2014/1/9/349 I'll test tonight if I can. |
@FransUrbo I could merge this patch if it were refreshed to just use the shared handle and preserve the |
I've separated them into two commits. Feel free to merge 0ba2b48 and we can discuss/wait with the second one... I'm still not convinced that using a temporary file is the way to go. IF we end up with a partially written file, something really bad must have happened. And we'll need to rewrite it anyway. Exactly under what circumstances CAN we end up with a partially written file? |
@FransUrbo Thanks. That will make it easier to get at least half of this optimization in.
You're right about the case your considering. A partially written file on disk is unlikely, it would mean the process writing the file crashed while it was performing the update. Which is unlikely (although it could happen). The more likely case is that you have two processes accessing the shared file concurrently. One of them is overwriting the file with new contents writing N sized chunks at a time. The other is reading those contents. What can happen is that the reading process accessed beyond what the writing process has finished updating and finds garbled data because it was partially overwritten. Since the file isn't protected by any locking there's nothing preventing this from happening. As long as the reads occur entirely before or after the writes this can't happen and that's what the tempfile/rename does. It ensures that the file is always atomically updated and never is some half updated state. |
The only time I can see two processes writing at the same time is if two admins starts sharing their own dataset/zvolume at the same time. Then the one that finishes last will 'win'. This is wrong in any case, because then the first ones won't be in the sharetab. And with regards to one reading and one writing... What happens if the reader get a new file mid read? Let's say that there is the following entries in sharetab when the reader starts reading:
Now, the reader is at So in that case, I think it's much better to just introduce locking to the file when someone needs to write to it. This should hopefully also solve the problem above with one reader and one writer... |
This is exactly the point. By writing to a new file, unlinking the old, and renaming the file it ensure this case is impossible. Readers with the old file open will continue to see that version of the data until they close the file. The kernel won't remove the file until all processes reading it have closed it. The process writing the file then doesn't make the file active under the expected name until it has written everything. |
@behlendorf But that will still invalidate By not creating a temp file AND use locking, we make sure that all changes, even those that are simultaneous is maintained. |
Why not spawn a Singleton thread whose sole responsibility it is to |
It double occurs to me that a sharetab necessarily is only valid until |
Both of this is in theory good. But they are really bad in It's not possible to store it in memory, because no process The only 'two' process that runs continuously is the kernel Starting a kernel thread just to deal with the sharetab is |
@FransUrbo In the interest of not holding up the next tag any longer than we have to I'm bumping this to 0.6.4. But in the interest of not having these changes linger any longer I would like to sort this out shortly after the tag. |
@behlendorf Agreed, this needs more thought and discussion. |
+ Reuse the handle, fseek to beginning and read...
just open it with freopen() writeable and write the info. + If sharetab isn't opened, open it with fopen() instead of freopen() to avoid segfault if it doesn't exist.
@FransUrbo any objection to just closing this as stale? The original issues this patch was trying to address have already been resolved by other means. |
This might help (somewhat!) with #821, #845 and #1484, but it is no way close enough to fix the issues completely. Most it can do is cut the amount of time used with a few minutes in total. But it's still a worthy enough pull request to consider (until such time we can remove or redesign/rewrite libshare completely).
This was part of my SMB rewrites patch (#1476), but it's better served as a separate pull request.
Signed-off-by: Turbo Fredriksson [email protected]