Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory Leak #259

Closed
Tracked by #1186
KatieWoe opened this issue Dec 3, 2024 · 15 comments
Closed
Tracked by #1186

Memory Leak #259

KatieWoe opened this issue Dec 3, 2024 · 15 comments

Comments

@KatieWoe
Copy link
Contributor

KatieWoe commented Dec 3, 2024

For phetsims/qa#1178. Seen on Win 11 Chrome. First screenshot taken at start, and subsequent screenshots taken one minute apart during fuzzing.
There seems to be a noticeable memory leak in the dev test.
memleak

@pixelzoom
Copy link
Contributor

pixelzoom commented Dec 4, 2024

Memory testing for the 1.0 release was done by me in #130, and by @KatieWoe in phetsims/qa#711 (comment). There were no memory leaks at that time.

Because this sim has been converted to support dynamic locale, the most likely culprit for leaks is a link to a LocalizedStringProperty that has not been cleaned up. That is most typically a Text or RichText that needs to be disposed, but could be an explicit link, Multilink, or DerivedProperty.

Unfortunately, the listenerLimit query parameter is not very useful, because of the large number of listener for localeProperty -- every LocalizedStringProperty and then some.

@pixelzoom
Copy link
Contributor

pixelzoom commented Dec 4, 2024

There is definitely a leak in TickLabelSet, tracking in phetsims/bamboo#64.

@pixelzoom
Copy link
Contributor

pixelzoom commented Dec 4, 2024

Because the listenerLimit query parameter is so useless, identifying leaks has been brutal. It's like looking for a needle in a haystack. I've had to manually inspect occurrence of things that link to LocalizedStringProperty -- see the checklist below.❗designates memory leaks that I found. I'm not confident that this is all of them.

Text

  • LabeledExpandCollapseButton
  • SecondaryWaveformCheckbox
  • InfiniteHarmonicsCheckbox
  • ContinuousWaveformCheckbox
  • WaveformEnvelopeCheckbox
  • SumSymbolNode
  • DiscreteInfoDialog
  • PeriodCheckbox
  • WavelengthCheckbox
  • AmplitudeControlsSpinner
  • WaveGameInfoDialog
  • WaveGameLevelNode
  • WaveGameLevelSelectionNode
  • WavePacketInfoDialog
  • WidthIndicatorsCheckbox

RichText

  • AmplitudeKeypadDialog
  • AmplitudeNumberDisplay
  • CalipersNode
  • DomainChartNode
  • FMWComboBox
  • InteractiveAmplitudesChartNode
  • SeriesTypeRadioButtonGroup
  • SumSymbolNode
  • DiscreteInfoDialog
  • DiscreteSumEquationNode
  • ExpandedFormDialog
  • HarmonicsEquationNode
  • PeriodClockNode
  • WaveGameLevelNode
  • CenterControl
  • ComponentSpacingControl
  • ConjugateStandardDeviationControl
  • StandardDeviationControl
  • WavePacketInfoDialog
  • WavePacketMeasurementToolNode
  • ComponentSpacingToolNode
  • WavePacketLengthToolNode
  • WavePacketScreenView
  • WavePacketSumEquationNode
  • WidthIndicatorPlot
  • ❗TickLabelUtils.createSymbolicTickLabel => see phetsims/bamboo@44f6124

DerivedStringProperty

  • AmplitudeKeypadDialog
  • AmplitudeNumberDisplay
  • DomainChartNode
  • EquationMarkup
  • SumSymbolNode
  • PeriodClockNode
  • CaliperCheckbox
  • LengthToolCheckbox
  • WavePacketMeasurementToolNode
  • WavePacketLengthToolNode
  • WavePacketScreenView
  • WidthIndicatorPlot
  • ❗TickLabelUtils.createSymbolicTickLabel => see 8a158da

PatternStringProperty

  • AmplitudeKeypadDialog
  • GameLevel
  • WaveGameModel

@pixelzoom
Copy link
Contributor

pixelzoom commented Dec 5, 2024

To check where this is at, I tested with https://phet-dev.colorado.edu/html/fourier-making-waves/1.1.0-dev.7/phet/fourier-making-waves_all_phet.html?fuzz. Snapshots 1-10 are at 1-minute intervals. Snapshot 11 is 10 minutes after Snapshot 10.

No improvement at all here. In fact, this is much worse than what @KatieWoe reported. Either I did something to make it worse, or I picked up something from common code before I published /1.1.0-dev.7. Ugh.

Heap comparisons unfortunately did not tell me anything, because I forgot to build with --minify.mangle=false.

screenshot_3628

@pixelzoom
Copy link
Contributor

pixelzoom commented Dec 5, 2024

https://phet-dev.colorado.edu/html/fourier-making-waves/1.1.0-dev.8/phet/fourier-making-waves_all_phet.html?fuzz was built with grunt dev --brands=phet --minify.mangle=false. Same code, same test protocol. Similar (bad) results shown below.

screenshot_3629

I tried comparing Snapshots 6 & 7, because there was a big increase there. But I could not grok the comparison. Sorting by "# Delta", the top 10 are all very obfuscated, nothing that I recognize:

screenshot_3631

@pixelzoom
Copy link
Contributor

pixelzoom commented Dec 5, 2024

@jbphet and I looked at this, and he gave me a couple of good ideas for what state the sim should be in when taking heap snapshots. Basically:

  1. Fuzz the sim for ~1 minute, so that all code has been loaded.
  2. Pause fuzzing by typing phet.chipper.queryParameters.fuzz=false in the console.
  3. Press the resetAllButton in all screens.
  4. Take snapshot 1.
  5. Resume fuzzing by typing phet.chipper.queryParameters.fuzz=true in the console.
  6. Fuzz the sim for ~1 minute.
  7. Pause fuzzing by typing phet.chipper.queryParameters.fuzz=false in the console.
  8. Press the resetAllButton in all screens.
  9. Take snapshot 2.
  10. Compare snapshots 1 & 2.

I'll probably do further debugging in the unbuilt version, because we saw fewer mangled names there.

I'm also going to test with "supportsInteractiveDescription": false, in package.json, to see if these leaks are related to PDOM features.

@pixelzoom
Copy link
Contributor

Running unbuilt in my local copy, http://localhost:8080/fourier-making-waves/fourier-making-waves_en.html?brand=phet&ea&debugger&fuzz, I am no longer seeing a memory leak. See snapshots below. Snapshots 1-10 are at 1-minute intervals. Snapshot 11 is 10 minutes after Snapshot 10. So heap size stabilizes at ~185 MB, and is still in that range after 20 minutes of fuzzing.

This test was done after fixing memory leaks for TickLabelSet over in phetsims/bamboo#64. I also made a simplification to how TickLabelSet removes unused labels from its cache, and that simplification involved removing a Set instance. I don't know if it's related, but in #259 (comment), I had identified Set as the largest "Size Delta" in heap comparisons.

screenshot_3632

@pixelzoom
Copy link
Contributor

I backed out the memory leak fixes for TickLabelSet, then compared heap snaphots at startup and 20 minutes of fuzzing. See below. The memory leak did not return, and after 20 minutes of fuzzing, it's still in the ~185MB ballpark -- consistent with the test in #259 (comment).

So TickLabelSet was not the culprit. Perhaps something got fixed in common code since the tests that I did yesterday? I'll ask in Slack#developer.

screenshot_3633

@pixelzoom
Copy link
Contributor

pixelzoom commented Dec 9, 2024

... Perhaps something got fixed in common code since the tests that I did yesterday? I'll ask in Slack#developer.

No one reported any common-code fixes in Slack#developer, but @jessegreenberg replied:

That seems really strange. Do you have a dev version that always shows the leak? And if so, does it leak in all browsers? Maybe Chrome had something weird going on where it wasn't running GCs correctly.

Unfortunately, this issue has not been specific about Chrome version. As noted above, 1.1.0-dev.5 through 1.1.0-dev.8 were all leaking badly. So I retested 1.1.0-dev.7, noting that I'm using Chrome Version 131.0.6778.109 on macOS 15.1.1. After 30 minutes of fuzzing, the heap size grew to ~500MB. So it's not a problem with Chrome.

@pixelzoom
Copy link
Contributor

pixelzoom commented Dec 9, 2024

https://phet-dev.colorado.edu/html/fourier-making-waves/1.1.0-dev.9/phet/fourier-making-waves_all_phet.html?fuzz was built with grunt dev --brands=phet --minify.mangle=false. The number of snapshots is off because I forgot to clear snapshots. But the test protocol is the same: Snapshots 3-12 are at 1-minute intervals. Snapshot 13 is 10 minutes after Snapshot 12. Like the unbuilt version in #259 (comment), there is no indication of a memory leak. Heap size stabilizes at ~150MB.

screenshot_3634

@pixelzoom
Copy link
Contributor

Revisiting 1.1.0-dev.8, I used the test protocol that @jbphet suggested in #259 (comment). Digging down into the Set instances, which had the highest "Size Delta" in heap comparisons, I find createLabel in the Retainers, see below. This is the function called by TickLabelSet to create tick labels, and it is what was fixed in phetsims/bamboo#64. So I'm going to conclude that (despite #259 (comment)) TickLabelSet was indeed responsible for the memory leak.

screenshot_3635

@pixelzoom
Copy link
Contributor

I've tested this so many times that I'm not going to ask @KatieWoe to test again, and I'll simply close the issue. We'll be doing another memory test in RC, and that should suffice.

@pixelzoom
Copy link
Contributor

Actually... I leave this open, to be verified in 1.1.0-rc.1.

@pixelzoom
Copy link
Contributor

Please do a memory test for phetsims/qa#1186. If it looks OK, you can close this issue.

@KatieWoe
Copy link
Contributor Author

Memory leak seems to be gone in rc.1
goodmem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants