-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/sys/unix: Solaris lacks Flock (plus related constants) #11113
Comments
Have you tried golang.org/x/sys/unix?
(The issue subject mentions x/sys, but your code still
references the syscall pakage)
|
Yes, it seems not implemented there either, unless I'm missing it.
(similar for Flock; hits, but not in something that looks like it would build on Solaris) |
We can fix this trivially in x/sys, but it still won't solve your problem, as bolt has to be changed as well to start using x/sys instead of syscall. |
That's not a problem though. Bolt I can fix, but I'm not sure of the changes necessary in x/sys. |
The Solaris OS itself does not currently support flock(); could you use FcntlFlock() instead? |
It compiles and doesn't return an error... Although I didn't perform any deeper testing of it.
Nevertheless that's the least important part of it, as you note there are other ways to do locking anyway. |
I was referring to Solaris, not OpenSolaris derivatives such as SmartOS. Solaris had an flock method historically but it was not semantically compatible with Linux and was removed from later versions of Solaris. Since the Solaris port of Go is targeted to both Solaris and OpenSolaris-based derivatives, flock is not supportable on Solaris for Go for now. I also think the precise semantics of whatever version of flock available on SmartOS are would need to be verified to be the same as other OS' given what I know about the original sources. |
I have a working implementation for locking using fcntl() interface that is currently used by etcd. I made version for bolt as well. |
|
@akolb-imd Thanks! I'm working on boltdb for Solaris as well. I think I have a patch to x/sys/unix which fixes the build and provides mmap ... is this of interest? -- Ken |
@ksedgwic Yes, please. I am especially interested in your mmap fix. I hacked golfing to provide mmap, but fixing it in bolt is definitely better. BTW, why bolt uses mmap? It is very OS-specific and in many cases isn't even faster. Would be better to avoid it altogether. |
@ksedgwic Does the flock/funlock fix look reasonable? |
That's what I did for Solaris build:
|
I'm new to boltdb so can't speak to the reason they used mmap. My patch is for x/sys/unix itself. It appears the solaris build has been broken for some time:
Here is my x/sys patch:
|
Please let me know if the patch works for you as well; if it does I'll submit it to x/sys folks ... |
@ksedgwic Thanks I'll try it. And since you are there, can you also add
|
@akolb1 I'll try out the flock code straightaway and let you know how it goes. re: SYS_IOCTL, will do ... |
@ksedgwic The flock diff was for bolt but something similar can be done in golang to provide flock on Solaris. I have not checked the golang implementation. BTW, now the newJoyent version of illumos provides flock() but I don't think that golfing can be that specific (unless it is possible to detect whether flock() is provided by the library) |
@akolb1 It's working! I can run the "bolt bench" command! I had to add Madvise support to solaris as well. Here is the complete patch for x/sys: Here is the complete patch for bolt: I think we should be careful about introducing flock into x/sys for solaris. Since it is not supported directly by the underlying OS/libraries we'd have to guarantee that we'd considered all possible semantics, not just the ones that bolt needs (example, can you flock a directory?) |
@ksedgwic Have you tried running bolt tests? They fail for me. Regarding flock(), here is Joyent implementation of flock() for Linux brand - we can do exactly the same in GO:
|
@akolb1 Three tests fail for me; I think they are flock relateddb_test.go:57: --- FAIL: TestOpen_Timeout (0.00s) --- FAIL: TestOpen_Wait (0.00s) --- FAIL: TestOpen_ReadOnly (0.00s) I'm looking at the first test now, looks like we failing to prevent a second open of the same db ... |
Ok, I think I understand the issue. An important difference between flock and fcntl is how they deal with threads in the same process: My read is that fcntl ignores the fd (and threads) and considers whether the process holds the lock. The unit test TestOpen_Timeout is opening the same file twice from the same process; so flock based locking will prevent that (test PASSES) and fcntl locking will allow it (test FAILS). |
I see. Is this specific to the test or not actually depends on such semantics? I would argue that the test is broken. Also there is an important difference is that fcntl-based locking usually works on NFS while lockf doesn't. |
@ksedgwic Your Solaris patch looks great (I don't understand GO bindings to libc so I can't comment on these). For madvise support - are Solaris-specific NUMA advise flags supported? (MADV_ACCESS_LWP, MADV_ACCESS_MANY, MADV_ACCESS_DEFAULT) ? |
@ksedgwic Do you plan to push your golang patches to the upstream? |
@akolb1 I plan on pushing the x/sys patch upstream. I would like to get it all working first. I think the flock semantics may actually be important. Consider a multi-threaded server of some sort which opens the boltdb separately from each thread. With flock it all works fine; their access is consistent. With fcntl it breaks. I'm currently investigating a fix, will know in an hour or so how it's going ... |
If multiple threads have the same file open with different fds: |
Moving this offline. On August 14, 2015 at 2:08:25 PM, Ken Sedgwick ([email protected]) wrote: If multiple threads have the same file open with different fds: Hmm - fcntl provides locks at the file level so it doesn’t matter whether you go to the file through the same fd or through different fds - so why would it allow corruption from multiple threads? |
Because if different threads in the same process open the file with different fds, they will be given concurrent access, which will cause corruption. See: https://gist.github.com/MerlinDMC/3197f4d13f8145c457e4 |
This shouldn’t be the case - fcntl is implemented at vnode, so it shouldn’t give concurrent access. So I wrote a simple test in go to verify this. It isn’t multithreaded - you need to open two windows to try it. The source is attached. In window 1: ./locks In window2: ./locks So I go through two different opens with different fds. |
Right, but if you implement both parts of the test in the same process (like a multithreaded server) then it will fail and allow concurrent exclusive access. This is what the failing unit test does. This is what a reasonable multi-threaded server might do. See the problem? In other words, the problem never happens when you have separate processes (like your test), that part works fine. The problem only happens when both lockers are inside the same process, even with two different fds. |
@binarycrusader Looks great! I'm pretty sure we're still going to need to fix the specific flock != fcntl issue, but that is probably better handled on the boltdb side (where the issue, and the fix would lie). Thanks! |
@ksedgwic Oh, I see, you are right. I'll investigate this a bit further. |
|
On the recent illumos-joyent the man page for fcntl says:
But this isn't part of Solaris or other illumos distributions. |
@ksedgwic It seems that for bolt fcntl locks are is fine. |
@ksedgwic the changes I submitted for review are only for fixing the build, they don't add support for any calls that aren't supported today. |
Here's more color on the status of implementing a proper flock in Illumos: |
I've submitted a change for x/sys/unix which adds mmap, munmap and madvise https://go-review.googlesource.com/#/c/13720/ Ken On Wed, Aug 19, 2015 at 3:22 PM, Fazal Majid [email protected]
Ken Sedgwick |
@ksedgwic Your change will not make it to 1.5 - correct? Can you send me your Solaris patch? |
@ksedgwic Also I noticed that your change didn't have SYS_IOCTL for Solaris. |
@ksedgwic What should I do to apply golang/sys to standard golfing source tree? I tried copying files, but Solaris files are in unix package instead of syscall package. |
@ksedgwic Oh, I see. It isn't a patch to golfing itself but rather a GO package. I was thinking that this is part of the core golang. |
Fixed for x/sys/unix by #21410. |
Primarily needed by Bolt which currently builds on "everything" except Solaris.
The text was updated successfully, but these errors were encountered: