Memory leak in GUI [timebox: 4.5 days] #7212

Closed
FreddieAkeroyd opened this issue Jul 1, 2022 · 5 comments

@FreddieAkeroyd
Member

After monitoring the GUI with VisualVM, the attached trace was captured. It appears that at 15:03 on 1 July the GUI allocated a large amount of memory and entered a garbage-collection loop with high CPU usage, which froze the GUI.
[Image: polref (attached VisualVM trace)]

@FreddieAkeroyd
Member Author

Note from scientist: "I tried opening the reflectometry server in a device screen and this caused it to freeze. In the end I needed to kill the client and restart it."

@FreddieAkeroyd
Member Author

VisualVM will be installed locally on POLREF, and the GUI patched to allow Nagios monitoring.

@FreddieAkeroyd changed the title from "Memory leak in GUI" to "Memory leak in GUI [timebox: 4.5 days]" on Jul 7, 2022
@rerpha
Contributor

rerpha commented Jul 13, 2022

VisualVM installed on POLREF and INTER, along with the GUI patch for JMX handles.

@FreddieAkeroyd
Member Author

Nagios is now monitoring NDXPOLREF and NDXINTER via the "IBEX CLIENT Java Heap Memory" item.

@Tom-Willemsen
Contributor

Tom-Willemsen commented Aug 3, 2022

This wasn't really a memory leak, but rather highly excessive resource consumption by the reflectometry OPI, caused by large numbers of rules being implemented in JavaScript.

I've added a mechanism that lets us completely bypass the JavaScript layer for a number of the most common rules. Most rules actually use a relatively small number of distinct conditions (see output of grep -roP "bool_exp=\".*?\"" | cut -d ":" -f 2 | sort | uniq -c | sort -nr). By implementing these common conditions in Java, and only falling back to JavaScript for uncommon conditions, we can save a lot of memory and CPU.
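As a rough illustration of the fast-path idea described above (not the actual Java implementation in the GUI), the sketch below looks up a rule's condition text in a table of natively implemented predicates and only falls back to a run-time interpreter when the condition is not in the table. The names `FAST_PATHS`, `slow_path` and `evaluate`, and the example conditions, are hypothetical; Python is used purely for brevity.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical table of common rule conditions, keyed by the exact condition text.
# The conditions listed here are illustrative, not taken from the real OPIs.
FAST_PATHS: Dict[str, Callable[[Sequence[float]], bool]] = {
    "pvInt0 == 1": lambda pvs: int(pvs[0]) == 1,
    "pvInt0 == 0": lambda pvs: int(pvs[0]) == 0,
    "pv0 > 0": lambda pvs: pvs[0] > 0,
}


def slow_path(condition: str, pvs: Sequence[float]) -> bool:
    """Stand-in for the script engine: interpret the condition text at run time."""
    names = {}
    for i, value in enumerate(pvs):
        names[f"pv{i}"] = value
        names[f"pvInt{i}"] = int(value)
    return bool(eval(condition, {"__builtins__": {}}, names))


def evaluate(condition: str, pvs: Sequence[float]) -> Tuple[bool, str]:
    """Evaluate a condition, reporting which path handled it."""
    native = FAST_PATHS.get(condition)
    if native is not None:
        return native(pvs), "fast"      # common condition: no interpreter involved
    return slow_path(condition, pvs), "slow"  # uncommon condition: fall back


if __name__ == "__main__":
    print(evaluate("pvInt0 == 1", [1.0]))
    print(evaluate("pv0 > 2 and pv1 < 5", [3.0, 4.0]))
```

The design choice is that the table only needs to cover the handful of conditions that dominate the grep output; anything else still works, just at the old cost.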

This was exacerbated on INTER and POLREF by previously allowing them to open the reflectometry OPI in three different places: refl perspective, device screen, and synoptic. At ~400MB each time, this got POLREF to ~1.2GB of memory used purely for the reflectometry OPI, and INTER similarly got to ~800MB (they only had perspective + synoptic, not device screen). It is then easy to see how INTER ran into their 1GB heap limit with a little consumption from other OPIs/scripting etc. I believe this has since been fixed, and the reflectometry OPI can no longer be opened from multiple places. Combined with the changes in this ticket, this should mean the reflectometry OPI now uses reasonable amounts of resources.
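The arithmetic behind those figures can be written out as a small sketch. The ~400MB per open copy comes from the comment above; treating the 1G heap limit as 1024MB is an assumption made here for illustration, and the function name is hypothetical.

```python
OPI_COST_MB = 400      # approximate cost of one open copy of the reflectometry OPI
HEAP_LIMIT_MB = 1024   # assumption: the "1G" heap limit interpreted as 1024 MB


def refl_opi_usage_mb(open_copies: int) -> int:
    """Approximate memory used purely by open copies of the reflectometry OPI."""
    return open_copies * OPI_COST_MB


polref_usage = refl_opi_usage_mb(3)  # perspective + device screen + synoptic
inter_usage = refl_opi_usage_mb(2)   # perspective + synoptic
inter_headroom = HEAP_LIMIT_MB - inter_usage  # left for everything else on INTER

if __name__ == "__main__":
    print(f"POLREF: ~{polref_usage} MB, INTER: ~{inter_usage} MB, "
          f"INTER headroom: ~{inter_headroom} MB")
```

Under these assumptions INTER has only a couple of hundred MB left for every other OPI and script, which is consistent with it hitting the heap limit.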


Although it wasn't a primary goal of this ticket, I also did a quick benchmark of CPU time using the old & new rule implementations. For executing the following (reasonably representative) rule:

importPackage(Packages.org.csstudio.opibuilder.scriptUtil); 
var pvInt0 = PVUtil.getLong(pvs[0]);
if(pvInt0 == 1)
	widget.setPropertyValue("border_color",ColorFontUtil.getColorFromRGB(255,0,0));
else
	widget.setPropertyValue("border_color",ColorFontUtil.getColorFromRGB(0,128,255));
  • For the first execution (e.g. when OPI first opens & pvs connect), RhinoWithFastPath takes 27 μs, while executing with JS takes 45407 μs on my machine. This represents a performance increase of about 1645x.
  • For subsequent executions (e.g. when a linked PV updates), RhinoWithFastPath takes 14 μs while executing with JS takes 2792 μs. This represents a performance increase of about 200x.

The significant difference between first and subsequent executions for JS arises because the first execution has to compile the JS before running it, whereas subsequent executions can reuse the cached result of that compilation.


PRs:
