Add assessment API #455
Conversation
Only 85 files changed? Easy peasy. I'll get to this early next week.
It would be nice to have a 5-10 point list of what was changed.
Should this be in the new service toolkit?
We will be looking up engines by ID. The engine ID should be a second index.
Will we want to also be getting verses by reference? Will there be corpora defined as well? Pretranslations have more indexing specified.
Could this be shared?
So, a corpus will only be one language? Wouldn't they need to be two languages? How flexible are we imagining this API to be? I could imagine different types of assessments needing different input corpora, as well as referencing other engines for AQuA.
Previously, johnml1135 (John Lambert) wrote…
I see how languages are handled - using an optional second corpus. That sounds acceptable.
I assume that referencing other jobs from other engines would be done in these options for AQuA assessments.
Could any more of this be shared with the Machine engines?
You are organizing results by job instead of by engine and corpus (as with pretranslations). Can we reasonably assume that people would want legacy results? Wouldn't the tracking and correlation of assessment results be the responsibility of the client?
How much overlap/separation is there meant to be between the assessment pipeline and the translation pipeline?
Also, are there plans to make an issue to track adding additional testing? How soon do we expect this to be added to production? Is it worth exploring ignoring the assessment code for code coverage for now?
Reviewed 85 of 85 files at r1, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ddaspit and @johnml1135)
.github/workflows/ci.yml
line 42 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Nice addition!
src/Serval/src/Serval.Shared/Controllers/Endpoints.cs
line 5 at r1 (raw file):
public static class Endpoints
{
    public const string GetDataFile = "GetDataFile";
}
Very nice
Is this what we really want? I think when we were looking at this before, I recommended using something fashioned after the language server's diagnostic message: severity, range, code, description, tags, data, etc.
Is there any way that more of this could be shared with Machine? Could a ServiceBase class be created that is shared? There is so much that overlaps.
Do we want a cleanup function, i.e. "show me everything that is not connected to a corpus"? Removing this endpoint disables cleanup of hanging files. If a person loses the ID of the file, they lose it completely.
Is there anything tiered? Can't we share using the ServiceToolkit?
See other comment.
Why is the message here, but not in the Dto?
Ah, some gRPC stuff is in shared now.
Same with files - are we sure we want to have dangling webhooks? Will they auto-die?
We should likely have an issue for finishing this, detailing all of what would be needed, including testing and the first integration.
Reviewed 85 of 85 files at r1, all commit messages.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @ddaspit)
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @johnml1135)
src/Machine/src/Serval.Machine.Shared/Services/ServalHealthServiceV1.cs
line 4 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Should this be in the new service toolkit?
No, the service toolkit wouldn't be appropriate, but it could be moved to the Serval.Grpc library.
src/Serval/src/Serval.Assessment/Configuration/IMongoDataAccessConfiguratorExtensions.cs
line 14 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
We will be looking up engines by ID. The engine ID should be a second index.
The id property automatically has an index.
src/Serval/src/Serval.Assessment/Configuration/IMongoDataAccessConfiguratorExtensions.cs
line 36 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Will we want to also be getting verses by reference? Will there be corpora defined as well? Pretranslations have more indexing specified.
Right now, we only support filtering by book. We can always add more indexes when we add more filtering options.
src/Serval/src/Serval.Assessment/Consumers/DataFileDeletedConsumer.cs
line 3 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Could this be shared?
No, this is specific to the assessment API. It depends on the assessment engine service.
src/Serval/src/Serval.Assessment/Contracts/AssessmentJobConfigDto.cs
line 14 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
I assume that referencing other jobs from other engines would be done in these options for AQUA assessments.
Right now, I haven't designed it to reference engines from other domains. I would like to understand better why we would need to do that and how it might work before we design the interaction.
src/Serval/src/Serval.Assessment/Contracts/AssessmentResultDto.cs
line 6 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Could any more of this be shared with the Machine engines?
I would like to keep the domains separate so they are free to evolve separately from each other.
src/Serval/src/Serval.Assessment/Controllers/AssessmentEnginesController.cs
line 446 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
You are organizing results by job instead of by engine and corpus (as with pretranslations). Can we reasonably assume that people would want legacy results? Wouldn't the tracking and correlation of assessment results be the responsibility of the client?
If we want to correlate different sets of results, what would be the best way to present the correlated results to the user? Wouldn't it be to have a unique ID for the result that is preserved with updates? Would we want to say "this was there before, but it no longer applies"? Would we update an existing record or make a new one?
Those are good questions. To tell you the truth, I don't know exactly how the API will be used yet. I am not far enough into designing Lynx to know. I want to have something in place, so that the AQuA work isn't blocked and I have something to work with when implementing Lynx. Initially, I am taking cues from the AQuA API, but I fully expect that we will be making changes to it. The nice thing about this design is that we don't have to worry about only allowing one job to run at a time.
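To make the correlation question concrete, here is a rough, purely illustrative sketch of the kind of client-side bookkeeping being discussed. Under this PR's design (results stored per job), correlating a new job's results against an old job's would be the client's responsibility; the data and keys below are made up for illustration, in Python rather than the repo's C#:

```python
# Hypothetical sketch: correlating results from two assessment jobs,
# keyed by scripture reference. Scores and references are invented.
old_job = {"MAT 1:1": 0.9, "MAT 1:2": 0.4}   # results from an earlier job
new_job = {"MAT 1:1": 0.95, "MAT 1:3": 0.7}  # results from a re-run

resolved = []
for ref in sorted(set(old_job) | set(new_job)):
    before, after = old_job.get(ref), new_job.get(ref)
    if after is None:
        # "this was there before, but it no longer applies"
        resolved.append((ref, "no longer applies"))
    elif before is None:
        resolved.append((ref, "new"))
    else:
        resolved.append((ref, "updated"))

print(resolved)
```

The same logic could instead live server-side with stable result IDs preserved across updates; the sketch only shows the client-side alternative raised in the comment above.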
src/Serval/src/Serval.Assessment/Models/Result.cs
line 11 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Is this what we really want? I think when we were looking at this before, I recommended using something fashioned after the language server's diagnostic message: severity, range, code, description, tags, data, etc.
This model is certainly not finalized yet. I'm not sure exactly what will be needed. Properties like severity are Lynx concepts and not AQuA concepts, so those probably won't be added.
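For reference, an LSP-style result shape (as suggested in the earlier comment) might look something like the following. This is an illustrative Python sketch, not the actual C# model in Serval.Assessment/Models/Result.cs, and every field name here is hypothetical:

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Hypothetical result model loosely fashioned after the LSP
# Diagnostic message (severity, range, code, description, data).
# Fields like severity may be Lynx-side concepts, as noted above.

@dataclass
class Span:
    start: int  # e.g. character offset within the verse text
    end: int

@dataclass
class AssessmentResult:
    text_id: str                    # which text/book the result applies to
    ref: str                        # scripture reference, e.g. "MAT 1:1"
    score: Optional[float] = None   # AQuA-style numeric score
    description: str = ""           # human-readable message
    severity: Optional[int] = None  # 1=error .. 4=hint, if ever adopted
    code: Optional[str] = None      # machine-readable result code
    span: Optional[Span] = None     # location within the verse text
    data: dict[str, Any] = field(default_factory=dict)

result = AssessmentResult(text_id="MAT", ref="MAT 1:1", score=0.87,
                          description="Possible missing word")
```

The point of the sketch is only to show which LSP-inspired fields are optional extras; which of them survive into the real model is exactly what this thread is debating.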
src/Serval/src/Serval.Assessment/Services/AssessmentPlatformServiceV1.cs
line 51 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Is there any way that more of this could be shared with Machine? Could a ServiceBase class be created that is shared? There is so much that overlaps.
There are similarities, but they are different domains with different data models, so I'm not sure what more could be shared.
src/Serval/src/Serval.DataFiles/Services/DataFileService.cs
line 37 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Do we want a cleanup function, i.e. "show me everything that is not connected to a corpus"? Removing this endpoint disables cleanup of hanging files. If a person loses the ID of the file, they lose it completely.
This method was not removed. It was moved to the base class.
src/Serval/src/Serval.Grpc/Protos/serval/assessment/v1/platform.proto
line 22 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Is there anything tiered? Can't we share using the ServiceToolkit?
I am intentionally keeping the APIs for different domains separate.
src/Serval/src/Serval.Shared/Contracts/AssessmentJobFinished.cs
line 9 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Why is the message here, but not in the Dto?
I forgot to update the webhook library to consume the event.
src/Serval/src/Serval.Webhooks/Services/WebhookService.cs
line 9 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Same with files - are we sure we want to have dangling webhooks? Will they auto-die?
Same as with the data files, the method is now defined in the base class.
src/Serval/src/Serval.Shared/Services/GrpcServiceHealthCheck.cs
line 17 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Ah, some Grpc stuff is in shared now.
I created a separate health API that all engines can use.
Previously, ddaspit (Damien Daspit) wrote…
Ah - I just forgot. This is to add other indexes beyond the default.
Previously, ddaspit (Damien Daspit) wrote…
Never mind - it's already shared. I was mostly not wanting it to be duplicated.
Previously, ddaspit (Damien Daspit) wrote…
True.
Previously, ddaspit (Damien Daspit) wrote…
Corpora are handled significantly differently - I see it now.
Previously, ddaspit (Damien Daspit) wrote…
I think the build options are flexible enough to handle future requirements.
Previously, ddaspit (Damien Daspit) wrote…
For any testing or diagnostics done in VS Code or any other platform, I have not seen history kept on diagnostics. The one exception is baselining for regulated industries, which may be wanted for Bible translation, to work off the issues. For that, though, the diagnostics would be stored in a separate system to make sure that they are actioned, not in the original test platform. I think it adds an unneeded feature, and I would recommend removing it.
Previously, ddaspit (Damien Daspit) wrote…
How would Lynx determine severity? It would need data fed up to it from the lower levels. If we want to work on it more, let's open an issue for "finishing the Assessment API" that covers these items, including test coverage.
Previously, ddaspit (Damien Daspit) wrote…
I'll wait for the update then.
You can think of this PR as a "partial" or "in-development" implementation of the assessment API to enable Lynx development. This way I don't have to continue to maintain a long-lived development branch. Once we have a clearer idea of what more will need to be done, we can open follow-up issues.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @johnml1135)
src/Serval/src/Serval.Assessment/Models/Result.cs
line 11 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
How would Lynx determine severity? It would need data fed up to it from the lower levels. If we want to work on it more, let's open an issue for "finishing the Assessment API" that covers these items, including test coverage.
Serval would be responsible for generating the assessment results. Lynx would be responsible for "interpreting" the assessment results. Severity might be determined by assessment type or score threshold.
I agree that we should open further issues to finish up the work, but I would like to hold off on doing that for now until we have a better idea of what we will need.
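To illustrate the "severity might be determined by assessment type or score threshold" idea, here is a minimal sketch of how an interpreting client such as Lynx might map raw scores to severities. The thresholds and severity names are entirely made up for illustration; nothing here is part of the actual API:

```python
# Hypothetical client-side interpretation of assessment results:
# derive a severity from a numeric score using fixed thresholds.
# All cutoffs and labels are invented for this sketch.
SEVERITY_THRESHOLDS = [
    (0.3, "error"),    # score below 0.3 -> error
    (0.6, "warning"),  # score below 0.6 -> warning
    (0.9, "info"),     # score below 0.9 -> info
]

def severity_for(score: float) -> str:
    for threshold, severity in SEVERITY_THRESHOLDS:
        if score < threshold:
            return severity
    return "hint"  # high-confidence results are mere hints

print(severity_for(0.25))  # -> error
print(severity_for(0.95))  # -> hint
```

This keeps severity out of the Serval data model entirely, matching the separation described above: Serval generates the scores, the client interprets them.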
Force-pushed from 493c87b to 0a9a67b:
- add assessment library
- assessment API disabled using a feature flag
- add separate health gRPC API
- add GHA unit test report
- add constants for all endpoint names
- add assessment engine gRPC API
- add new assessment scopes
- update webhook library to push assessment events
- add initial set of assessment integration tests
Force-pushed from 0a9a67b to ec397d3.
I added a list of changes to the commit message.
Reviewable status: 79 of 90 files reviewed, 17 unresolved discussions (waiting on @Enkidu93 and @johnml1135)
src/Serval/src/Serval.Shared/Contracts/AssessmentJobFinished.cs
line 9 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
I'll wait for the update then.
Done.
Good enough for this stage.
Reviewed 11 of 11 files at r2, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @ddaspit)
Dismissed @Enkidu93 from a discussion.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @ddaspit)