-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test to demonstrate issue mesalib #237
base: master
Are you sure you want to change the base?
Conversation
@doutriaux1 I will take a look at this. |
@danlipsa that would be great, please do take a look into this.. Since 2015 it's been a problem (see CDAT/cdat#1424) and it's really tripping over most of the inhouse More chatter in PCMDI/amipbcs#10 |
@danlipsa I have a fairly big investment in this, so please do ping me if/when you want me to provide some feedback - the demo that @doutriaux1 has provided should provide much of the code that we're testing.. But my environment is significantly different to yours.. |
@durack1 Thanks! I can reproduce the bug so I am all set. The test as it it written has a small vcs.elements leak, but fixing that did not make a difference. So I am still digging ... |
@doutriaux1 @durack1 @aashish24 @zshaheen It seems that certain resources get accumulated in the render window and are not released. As a workarounds, just set x.backend.renWin = None at the end of your loop. That fixes Charles' test. |
Could you try that in your programs. Thanks! |
@danlipsa excellent, will kick off a run now.. @doutriaux1 what was the other object that I needed to purge/delete - in the code that I provided you? |
@danlipsa I added in the
And full disclosure this code is running on an active VNC session |
@durack1 Is this better than what you had before x.backend.renWin = None? This instruction solves completly, Charles' test simplified a little. But of course, this is not necessary the best solution as creating and destroying the window at every iteration takes time. It will just help me narrow down the problem. |
@danlipsa, the numbers below are before I included the
|
@danlipsa, ok now I am using the correct
|
@durack1 in your scriot you also want to plot s1s (not s1) and use s1s-s2 to compute diff, your memory use will go down significantly (like 1.5Gb) |
@doutriaux1 @durack1 In Charles' test there is a small leak, I solved using: |
thanks @danlipsa did I forget to clean it up? Or should it be cleaned up automatically? |
oh yes looking at the code, I forgot to remove the object. |
@danlipsa @doutriaux1 great, this is looking better. Following your suggestions above:
It still seems those python objects are continuing to grow however..? @danlipsa not sure whether the comment by @dlonie at CDAT/cdat#1424 (comment) is something to consider? |
@durack1 I would not worry about the number of objects if the memory does not grow. So now you don't see any leak in your program (just looking at memory reported)? How is the running time? Maybe this is a valid workaround for now. |
@danlipsa I'll run the script to completion and see what happens to time and memory usage.. In the past it's taken a little time to get to the ridiculous numbers.. It used to take a day or more to complete the script, with the shorter times (reported above ~4 vs ~15s) I'll hopefully have a log to upload here this afternoon.. |
@durack1 @doutriaux1 I run a simplified version of Charles' test with the workaround over 200 times and I don't get any increase in memory or the number objects. The simplification is that I only run one plot not three so that I get the simplest program where I still was seeing the problem.
|
@danlipsa attached is the logfile for the output using all the latest tweaks you guys suggested, it seems these tweaks have considerably reduced the time taken, and the memory usage, which now maxes out at 2.2Gb rather than 100+Gb.. I am a little curious about the huge virtual memory footprint (which is over 30Gb) but that's an aside If we were able to speed things up to be sub-1sec timing for each panel that would be ideal, I know your example above is simpler, but the timing is certainly appealing.. |
I have been following the resource monitoring of this thread with interest. I'm wondering how the results of time.time() would differ from time.clock()
Is |
@doutriaux1 @aashish24 @durack1 @williams13 I built a OpenGL2 VTK backend and run the test (without the workaround). It does not show the memory leak - also seems to run much faster. So, I would use the workaround in the client applications for now and remove the workaround once we switch to the new backend. |
@danlipsa should we try to stick it somewhere in |
@danlipsa "..much faster.." now you have me excited.. How much faster, can you quantify? Also apologies for harping, but any insights on the very large (~40Gb) virtual memory footprint? |
@danlipsa @doutriaux1 as a user, I would strongly advocate that all the "workaround" bits are incorporated into the |
@durack1 we cannot do that it would just be plain wrong. BUT I think you're right we should add a |
OpenGL2: ~ .39 So it seems window closing/opening does have some overhead which is to be expected. We could close the window every 10 time steps if we are worried about that.
|
@durack1 are you doing any regridding in your script before plotting? This can be really memory hungry... :/ |
@doutriaux1 @durack1 We cannot really close() the window when the user only requested clear(). So this has to be done at user level. I would not add an additional function - close() already does this. |
@danlipsa this sounds great.. I would certainly like to throw myself out for testing of the openGL2 stuff, as I'm using the VNC virtual sessions I'm sure to hit a bunch of issues that would be great to solve.. Is RHEL6 going to be a problem for openGL2? Most of our workstations are still running the older OS |
@doutriaux1 I think something like this would be useful - well I would certainly use it to make sure that I delete all the items I have unwittingly (or knowingly) created |
according to: https://bugs.freedesktop.org/show_bug.cgi?id=102844 a mem leak has been fixed in |
nope... still no diff
|
@doutriaux1 I'd be happy to provide feedback on this when you think you have a fix in place |
@durack1 no luck with updating mesalib... |
@doutriaux1 bummer, what are the next steps to isolate and resolve this issue? |
@zshaheen did we ever fix this? Is it still an issue? |
@durack1 merged master in and tryting to see if it still fails. But needed EzTemplate. So retriggered now. |
@doutriaux1 To my knowledge, no. The way we run |
@danlipsa the added test passes under a regular VTK build, but a VTK-mesalib shows a leak.
This is kind of a show stopper for both @zshaheen and @durack1 do you or @sankhesh or @aashish24 have any cycle for this at the moment? Thanks.