Custom sample filtering performance and security improvement #11274

Merged

Member

@dippindots dippindots commented Dec 12, 2024

Fix #11195

  • concat performed poorly in my tests; we should avoid using concat inside a large foreach clause
  • ClickHouse supports native arrays in queries, and they perform better; try replacing the large foreach with an array

In this pull request:

  • Switch back from string-built queries to prepared statements for security reasons
  • Use native arrays to improve performance
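
To make the array point concrete, here is a minimal sketch of how the two approaches differ in the shape of the generated statement. The class, method, and column names are hypothetical, and using ClickHouse's has(array, element) function as the membership test is an assumption for illustration; this excerpt does not show the query the PR actually generates.

```java
import java.util.Collections;

public class PlaceholderSketch {

    // foreach-style expansion: one "?" placeholder per sample ID,
    // so the statement text grows with the size of the list.
    public static String expandedInClause(String column, int count) {
        String placeholders = String.join(", ", Collections.nCopies(count, "?"));
        return column + " IN (" + placeholders + ")";
    }

    // Array-style: a single "?" meant to be bound to one array value,
    // so the statement text is the same for any list size.
    // has(array, element) is an assumed choice of membership test.
    public static String arrayClause(String column) {
        return "has(?, " + column + ")";
    }

    public static void main(String[] args) {
        // "sample_unique_id" is a hypothetical column name.
        System.out.println(expandedInClause("sample_unique_id", 3));
        System.out.println(arrayClause("sample_unique_id"));
    }
}
```

In both forms the values are bound as parameters rather than concatenated into the SQL text, which is what keeps the statement a prepared statement.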

@dippindots dippindots force-pushed the demo-fix-prepared-statements branch 4 times, most recently from 8e58dbb to 5ff57e9 Compare December 12, 2024 16:14
@@ -65,6 +66,13 @@ private StudyViewFilterHelper(@NonNull StudyViewFilter studyViewFilter,
this.categorizedGenericAssayDataCountFilter = extractGenericAssayDataCountFilters(studyViewFilter, genericAssayProfilesMap);
this.customDataSamples = customDataSamples;
this.involvedCancerStudies = involvedCancerStudies;
if (studyViewFilter != null && studyViewFilter.getSampleIdentifiers() != null) {
Contributor

@dippindots Is there a reason why we don't just calculate this on the fly? It's derived from getSampleIdentifiers. I guess saving it is an optimization? I'm coming from the MobX world, where we would make this a computed.

Collaborator

Yeah, Java only recently introduced something like the property concept you're talking about, with Records.

But I agree this shouldn't be in the constructor. We could just add it to the filteredSampleIdentifiers method and calculate it dynamically.

Contributor

But if it's called multiple times then, absent some kind of MobX-like computed, we should just leave it. I think that's the existing pattern with "build", right? You do the calculation up front?

Member Author

I added this to the filteredSampleIdentifiers method so that it is calculated dynamically.
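
For readers weighing the trade-off discussed in this thread, a middle ground between computing in the constructor and recomputing on every call is to compute on first use and cache the result, which is roughly what a MobX computed gives you. The following is only a sketch of that idea, not the code in this PR: the class name, the fields, and the filtering rule are all hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class FilterHelperSketch {
    private final List<String> rawIds;   // hypothetical stand-in for getSampleIdentifiers()
    private List<String> cachedIds;      // null until the first call
    private int computeCount = 0;        // only here to make the caching observable

    public FilterHelperSketch(List<String> rawIds) {
        this.rawIds = rawIds;            // no derived work in the constructor
    }

    // Computes the derived list on first use, then reuses it.
    // Not thread-safe; a shared instance would need synchronization.
    public List<String> filteredSampleIdentifiers() {
        if (cachedIds == null) {
            computeCount++;
            cachedIds = rawIds == null
                ? Collections.emptyList()
                : rawIds.stream()
                        .filter(id -> id != null && !id.isEmpty())
                        .collect(Collectors.toList());
        }
        return cachedIds;
    }

    public int getComputeCount() {
        return computeCount;
    }
}
```

Calling filteredSampleIdentifiers() several times runs the stream only once, so repeated callers do not pay for the derivation again.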

public String getUniqueSampleId() {
// Assuming studyId and sampleId are available in SampleIdentifier
// Concatenate with "_" in between if both values are not null
if (getStudyId() != null && getSampleId() != null) {
Collaborator

Can this even happen? A sample ID with one of these null?

Member Author

I don't think we have this case, but we do see study view filters that are missing studyIds. Do you think I should remove this check?
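
As a point of reference for this question, here is a small sketch of what the guarded concatenation could look like and how a caller might cope with a missing studyId. The diff excerpt above is cut off after the if line, so returning null when either part is missing is an assumption, as are the class name and the toUniqueIdArray helper.

```java
import java.util.List;
import java.util.Objects;

public class SampleIdentifierSketch {
    private final String studyId;
    private final String sampleId;

    public SampleIdentifierSketch(String studyId, String sampleId) {
        this.studyId = studyId;
        this.sampleId = sampleId;
    }

    // Joins studyId and sampleId with "_" when both are present.
    // Returning null otherwise is an assumption, not the PR's behavior.
    public String getUniqueSampleId() {
        if (studyId != null && sampleId != null) {
            return studyId + "_" + sampleId;
        }
        return null;
    }

    // Hypothetical helper: collects the unique IDs into one array,
    // dropping identifiers that could not be built.
    public static String[] toUniqueIdArray(List<SampleIdentifierSketch> identifiers) {
        return identifiers.stream()
                .map(SampleIdentifierSketch::getUniqueSampleId)
                .filter(Objects::nonNull)
                .toArray(String[]::new);
    }
}
```

With the check in place, an identifier that lacks a studyId is skipped instead of producing a string such as "null_s1"; removing the check would change that behavior.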

@dippindots dippindots force-pushed the demo-fix-prepared-statements branch from 5ff57e9 to a9b44d6 Compare December 17, 2024 16:35
sonarcloud bot commented Dec 17, 2024

@alisman alisman merged commit 7ea2fe3 into cBioPortal:demo-rfc80-poc Dec 18, 2024
15 of 17 checks passed