Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure 'everyone' knows about changes to the sharetab. #1991

Closed
wants to merge 2 commits into from

Conversation

FransUrbo
Copy link
Contributor

  • Instead of opening a temporary sharetab and then renaming it, just open it with freopen() writeable and write the info.
  • If sharetab isn't opened, open it with fopen() instead of freopen() to avoid segfault if it doesn't exist.
  • Don't open sharetab in parse_sharetab() - it's already open!
  • Reuse the handle, fseek to beginning and read...

This might help (somewhat!) with #821, #845 and #1484, but it is no way close enough to fix the issues completely. Most it can do is cut the amount of time used with a few minutes in total. But it's still a worthy enough pull request to consider (until such time we can remove or redesign/rewrite libshare completely).

This was part of my SMB rewrites patch (#1476), but it's better served as a separate pull request.

Signed-off-by: Turbo Fredriksson [email protected]

@prometheanfire
Copy link
Contributor

this may help with bug #2004 testing now

@behlendorf
Copy link
Contributor

@FransUrbo Using the already open handle in parse_sharetab() seems like a good idea to me. However, I'm concerned about remove the temporarily file from update_sharetab(). The temporarily file is required to allow us to make atomic updates to the sharetab file, without it parse_sharetab() may see an inconsistent view of the file.

@FransUrbo
Copy link
Contributor Author

Since we have no locking on sharetab, any process/thread can overwrite it anyway. So the use of a temporary file is pointless.

@behlendorf
Copy link
Contributor

@FransUrbo It's not totally pointless. By using a temporary file we ensure that all the updates are atomic. Other well behaved system utilities do the same for this very reason. The result is that while the file may be stale/wrong we never access in a state where it's only been partially updated. For example, half of a line has only been written.

@FransUrbo
Copy link
Contributor Author

@behlendorf so how do we solve #2004?

@behlendorf
Copy link
Contributor

@FransUrbo Good question. I think we need to understand the root cause of #2004, because it's not clear to me _why_this patch would fix the crash. Calling mkstemp() is entirely safe many many many applications use it without issue. Are we positive this patch prevented the SELinux crash?

But like I said above updating parse_sharetab and update_sharetab to use the shared handle in libzfs we should certainly do. But we really do need to keep temp file in order to ensure the sharetab update is atomic.

@prometheanfire
Copy link
Contributor

This may have been fixed elsewhere, here.

https://lkml.org/lkml/2014/1/9/349

I'll test tonight if I can.

@behlendorf
Copy link
Contributor

@FransUrbo I could merge this patch if it were refreshed to just use the shared handle and preserve the mkstemp() functionality.

@FransUrbo
Copy link
Contributor Author

I've separated them into two commits. Feel free to merge 0ba2b48 and we can discuss/wait with the second one...

I'm still not convinced that using a temporary file is the way to go. IF we end up with a partially written file, something really bad must have happened. And we'll need to rewrite it anyway.

Exactly under what circumstances CAN we end up with a partially written file?

@behlendorf
Copy link
Contributor

@FransUrbo Thanks. That will make it easier to get at least half of this optimization in.

Exactly under what circumstances CAN we end up with a partially written file?

You're right about the case your considering. A partially written file on disk is unlikely, it would mean the process writing the file crashed while it was performing the update. Which is unlikely (although it could happen).

The more likely case is that you have two processes accessing the shared file concurrently. One of them is overwriting the file with new contents writing N sized chunks at a time. The other is reading those contents. What can happen is that the reading process accessed beyond what the writing process has finished updating and finds garbled data because it was partially overwritten.

Since the file isn't protected by any locking there's nothing preventing this from happening. As long as the reads occur entirely before or after the writes this can't happen and that's what the tempfile/rename does. It ensures that the file is always atomically updated and never is some half updated state.

@FransUrbo
Copy link
Contributor Author

The only time I can see two processes writing at the same time is if two admins starts sharing their own dataset/zvolume at the same time. Then the one that finishes last will 'win'. This is wrong in any case, because then the first ones won't be in the sharetab.

And with regards to one reading and one writing... What happens if the reader get a new file mid read?

Let's say that there is the following entries in sharetab when the reader starts reading:

file1
file2
file3
[etc up to 100]

Now, the reader is at file35 when the writer have finished writing it's new file with file2a between file2 and file3. That file won't be catched by the reader.

So in that case, I think it's much better to just introduce locking to the file when someone needs to write to it. This should hopefully also solve the problem above with one reader and one writer...

@behlendorf
Copy link
Contributor

And with regards to one reading and one writing... What happens if the reader get a new file mid read?

This is exactly the point. By writing to a new file, unlinking the old, and renaming the file it ensure this case is impossible. Readers with the old file open will continue to see that version of the data until they close the file. The kernel won't remove the file until all processes reading it have closed it. The process writing the file then doesn't make the file active under the expected name until it has written everything.

@FransUrbo
Copy link
Contributor Author

By writing to a new file, unlinking the old, and renaming the file it ensure this case is impossible.

@behlendorf But that will still invalidate writer2's change(s).

By not creating a temp file AND use locking, we make sure that all changes, even those that are simultaneous is maintained.

@aarcane
Copy link

aarcane commented Jun 1, 2014

Why not spawn a Singleton thread whose sole responsibility it is to
maintain the file and it's records, processing requests to make add and
delete transactions, and returning the atomic, correct data as of the
instant that a request is made. The end result is that all procees get a
correct atomic result, the single thread assures that all changes are
atomic, and with properly chosen record IDs, every delete can be guaranteed
to succeed.

@aarcane
Copy link

aarcane commented Jun 1, 2014

It double occurs to me that a sharetab necessarily is only valid until
system reboot, so why not further simplify the matter and store it in
memory only?

@FransUrbo
Copy link
Contributor Author

Why not spawn a Singleton thread

store it in memory only?

Both of this is in theory good. But they are really bad in
practice.

It's not possible to store it in memory, because no process
runs continuously. Sharing/unsharing is a userland process
(zfs share ....) and the userland command startup, do its
thing and die. There's no communication between the
different command executions (except the sharetab file).

The only 'two' process that runs continuously is the kernel
thread(s) and now ZED.

Starting a kernel thread just to deal with the sharetab is
in my opinion WAY to much work and overhead to even
consider. And ZoL is not yet capable of sending sharing
events to ZED. Eventually, this is our goal - remove the
whole libshare part and let ZED take care of this, and then
this problem will be moot.

@behlendorf
Copy link
Contributor

@FransUrbo In the interest of not holding up the next tag any longer than we have to I'm bumping this to 0.6.4. But in the interest of not having these changes linger any longer I would like to sort this out shortly after the tag.

@behlendorf behlendorf modified the milestones: 0.6.4, 0.6.3 Jun 5, 2014
@FransUrbo
Copy link
Contributor Author

@behlendorf Agreed, this needs more thought and discussion.

+ Reuse the handle, fseek to beginning and read...
just open it with freopen() writeable and write the info.
+ If sharetab isn't opened, open it with fopen() instead of
  freopen() to avoid segfault if it doesn't exist.
@behlendorf
Copy link
Contributor

@FransUrbo any objection to just closing this as stale? The original issues this patch was trying to address have already been resolved by other means.

@behlendorf behlendorf modified the milestones: 0.6.5, 0.6.4 Feb 5, 2015
@behlendorf behlendorf closed this Feb 14, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants