Exceptions and cache cleaning #744

Closed
jastax opened this issue Jan 28, 2021 · 7 comments · Fixed by #745
@jastax

jastax commented Jan 28, 2021

Hi @sclassen ,

you asked me (here #707) to create a new issue for this.

The error and the cache deletion still occur (I launched 3 instances of the application).

OWS 1.3.2, Windows 10
log.zip

Thanks!

@sclassen
Contributor

sclassen commented Feb 5, 2021

Hi @jastax
It took us some time to track this one down.
Looks like we have a glitch in locking the cache index file. It allows multiple processes to write at the same time, which leads to chaos and explains both the exceptions and the cache cleaning.

There is a fix on the way in #745
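
For readers unfamiliar with the kind of locking discussed here, the following is a minimal sketch of guarding writes to a shared file with an exclusive `java.nio` file lock. It is not OpenWebStart code and not the content of #745; the class `LockedIndexWriter` and the method `appendEntry` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of OpenWebStart.
final class LockedIndexWriter {

    /** Appends one line to the file while holding an exclusive lock on it. */
    static void appendEntry(Path indexFile, String entry) throws IOException {
        try (FileChannel channel = FileChannel.open(indexFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             // lock() blocks until other processes have released their lock.
             FileLock lock = channel.lock()) {
            // Move to the end of the file so existing entries are preserved.
            channel.position(channel.size());
            byte[] bytes = (entry + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
            ByteBuffer buffer = ByteBuffer.wrap(bytes);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            // Flush to disk before the lock is released at the end of the try block.
            channel.force(true);
        }
    }
}
```

Note that `FileLock` is held on behalf of the whole JVM: it coordinates separate processes, but two threads in the same JVM requesting overlapping locks get an `OverlappingFileLockException` instead of blocking.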

@sclassen
Contributor

sclassen commented Feb 5, 2021

For now as a workaround you can delete the entire cache in the file system and then try to not start multiple jnlps at the same time.
The dependencies all get downloaded at start time and thus the cache should only see minor activity once the application is running.
It is then (almost) safe to launch the next jnlp.

@jastax
Author

jastax commented Feb 8, 2021

@sclassen That's great, thanks!

"For now as a workaround you can delete the entire cache in the file system and then try to not start multiple jnlps at the same time.
The dependencies all get downloaded at start time and thus the cache should only see minor activity once the application is running."

Yes, I noticed that. We cannot roll out OWS with such an error though, because thousands of users will be using it every day. I noticed the error because, when I started the application, the OWS splash screen showed briefly and the downloads started. For some time between downloads there is no window or progress bar visible at all, so users would think the launch had failed and start the application again.

@sclassen
Contributor

sclassen commented Feb 9, 2021

@jastax
We tried the changes in the file locking behavior and ran into a very old JDK issue. In short: on NTFS, when a Java application releases a file lock, the underlying NTFS lock is not released immediately. It is released only after a full GC frees the memory-mapped file.
Since there is no way to force a full GC, there is also no way to release the file lock reliably. Multiple processes will therefore lock each other out, causing massive problems, so the solution in the above-mentioned PR is not practical.
We will need to think of an alternative to allow both parallel execution of OWS and guaranteed file locking to prevent cache corruption.

see: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=4715154
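
The thread does not say which alternative was eventually adopted. Purely as an illustration of one generic way to get cross-process mutual exclusion without `FileLock` or memory-mapped files, here is a sketch based on atomically creating a marker file. The class `LockFileMutex` and its methods are hypothetical, and this is not claimed to be what OpenWebStart does.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not the approach confirmed in this thread.
final class LockFileMutex implements AutoCloseable {

    private final Path lockFile;

    private LockFileMutex(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** Tries once to take the lock; returns null if it is already held. */
    static LockFileMutex tryAcquire(Path lockFile) throws IOException {
        try {
            // createFile checks for existence and creates the file atomically,
            // so at most one caller can succeed for a given path.
            Files.createFile(lockFile);
            return new LockFileMutex(lockFile);
        } catch (FileAlreadyExistsException e) {
            return null;
        }
    }

    /** Releases the lock by deleting the marker file. */
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(lockFile);
    }
}
```

A caller would retry `tryAcquire` with a short sleep until it returns a non-null mutex, then do its cache work inside a try-with-resources block. The main weakness of this scheme is that a crashed process leaves a stale marker file behind, so a real implementation would need some form of timeout or owner check.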

@sclassen sclassen self-assigned this Feb 11, 2021
@jastax
Author

jastax commented Feb 11, 2021

Is there some way I can help you with this one?

@sclassen
Contributor

sclassen commented Feb 11, 2021

We are trying a different approach today. I can give you feedback about the outcome tomorrow.

@sclassen
Contributor

Ok, the new approach is working. The code is merged.

@sclassen sclassen linked a pull request Feb 18, 2021 that will close this issue