Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Threaded validation #513

Merged
merged 8 commits into from
Mar 3, 2023
Merged

Threaded validation #513

merged 8 commits into from
Mar 3, 2023

Conversation

br648
Copy link
Contributor

@br648 br648 commented Feb 17, 2023

Checklist

  • Appropriate branch selected (all PRs must first be merged to dev before they can be merged to master)
  • Any modified or new methods or classes have helpful JavaDoc and code is thoroughly commented
  • The description lists all applicable issues this PR seeks to resolve
  • The description lists any configuration setting(s) that differ from the default settings
  • All tests and CI builds passing

Description

Working on top of https://github.com/ibi-group/datatools-server/pull/512/files original and mobility data validation are now in their own separate threads.

@br648
Copy link
Contributor Author

br648 commented Feb 20, 2023

@binh-dam-ibigroup, Miles is to carry out some testing, but on the whole is happy to merge this into his working PR. If you don't mind, can you review please just in case I've missed something?

Copy link
Contributor

@miles-grant-ibigroup miles-grant-ibigroup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be working nice and fast, only I noticed to resolve is a race condition that sometimes prevents results from being persisted to mongo

@@ -382,22 +414,6 @@ public void validate(MonitorableJob.Status status) {

// This will persist the document to Mongo
this.mobilityDataResult = Document.parse(json);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "regular" parsing step was previously persisting the this to mongo. If the mobility data parser finishes first, this still happens. If it finishes second, this fails to happen and the data isn't persisted! I think the solution is to add a mongo persistence line right here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@miles-grant-ibigroup We need to be careful here because we don't want to create a new Mongo object if the other validator is expecting to do that. See ValidateFeedJob -> jobFinished(). Both validators need to be aware of the Mongo object state, so that either can create or update. Let me take a look. Good spot btw.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@miles-grant-ibigroup This should now be addressed. Unit tests look good. Lemme know how you get on.

Robin Beer and others added 5 commits February 22, 2023 16:17
Bumps [mongodb-driver-sync](https://github.com/mongodb/mongo-java-driver) from 4.0.5 to 4.0.6.
- [Release notes](https://github.com/mongodb/mongo-java-driver/releases)
- [Commits](mongodb/mongo-java-driver@r4.0.5...r4.0.6)

---
updated-dependencies:
- dependency-name: org.mongodb:mongodb-driver-sync
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
…ongodb-driver-sync-4.0.6

chore(deps): bump mongodb-driver-sync from 4.0.5 to 4.0.6
} else {
Persistence.feedVersions.replace(feedVersion.id, feedVersion);
}
feedVersion.persistFeedVersionAfterValidation(isNewVersion);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written, it looks like the new version is shown in the UI (on the mobility-data-validator branch) after the first validator completes. Is that the desired effect?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's ok as long as we refresh the UI when the mobility data results come in, which is something we can do on the client.

Copy link
Contributor

@miles-grant-ibigroup miles-grant-ibigroup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things are working well for me, I think this is good to merge back into the main mobility data branch thanks for the big improvement!

@miles-grant-ibigroup
Copy link
Contributor

I've noticed that on some large feeds the jobs are running sequentially?
Screenshot 2023-03-01 at 11 50 35 AM

@br648
Copy link
Contributor Author

br648 commented Mar 2, 2023

This is to be expected. Once both threads have started we are then effectively passing control to the JVM which then decides how much CPU each thread will get. If the first gets 100% CPU, the second will have to wait.

Copy link
Contributor

@binh-dam-ibigroup binh-dam-ibigroup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good deal!

@binh-dam-ibigroup binh-dam-ibigroup removed their assignment Mar 2, 2023
@br648 br648 merged commit 18767dd into mobility-data-validator Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants