Add assessment API #455

ddaspit · 2024-08-15T22:47:47Z

This change is

add assessment library
assessment API disabled using a feature flag
add separate health gRPC API
add GHA unit test report
add constants for all endpoint names
add assessment engine gRPC API
add new assessment scopes
update webhook library to push assessment events
add initial set of assessment integration tests

codecov-commenter · 2024-08-15T22:56:50Z

Codecov Report

Attention: Patch coverage is 43.46591% with 597 lines in your changes missing coverage. Please review.

Project coverage is 56.68%. Comparing base (d9e928d) to head (ec397d3).

Files	Patch %	Lines
...Assessment/Services/AssessmentPlatformServiceV1.cs	0.00%	224 Missing ⚠️
...al/src/Serval.Assessment/Services/EngineService.cs	38.20%	105 Missing and 5 partials ⚠️
...essment/Controllers/AssessmentEnginesController.cs	67.22%	70 Missing and 8 partials ⚠️
...erval/src/Serval.Assessment/Services/JobService.cs	7.50%	37 Missing ⚠️
...ebhooks/Consumers/AssessmentJobFinishedConsumer.cs	0.00%	29 Missing ⚠️
...Webhooks/Consumers/AssessmentJobStartedConsumer.cs	0.00%	26 Missing ⚠️
...sessment/Configuration/IServalBuilderExtensions.cs	35.48%	16 Missing and 4 partials ⚠️
src/Serval/src/Serval.ApiServer/Startup.cs	77.35%	11 Missing and 1 partial ⚠️
...l.Machine.Shared/Services/ServalHealthServiceV1.cs	0.00%	7 Missing ⚠️
...c/Serval.Shared/Contracts/AssessmentJobFinished.cs	0.00%	7 Missing ⚠️
... and 27 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #455      +/-   ##
==========================================
- Coverage   61.16%   56.68%   -4.48%     
==========================================
  Files         240      275      +35     
  Lines       11999    14115    +2116     
  Branches     1520     1895     +375     
==========================================
+ Hits         7339     8001     +662     
- Misses       4132     5529    +1397     
- Partials      528      585      +57

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

johnml1135 · 2024-08-16T22:32:14Z

Only 85 files changed? Easy peasy. I'll get to this early next week.

johnml1135 · 2024-08-19T13:26:49Z

It would be nice to have a 5-10 point list of what was changed.

johnml1135 · 2024-08-19T13:27:51Z

.github/workflows/ci.yml line 42 at r1 (raw file):

          name: NUnit Tests
          path: src/**/TestResults/test-results.trx
          reporter: dotnet-trx

Ah - https://github.com/dorny/test-reporter.

johnml1135 · 2024-08-19T13:31:25Z

src/Machine/src/Serval.Machine.Shared/Services/ServalHealthServiceV1.cs line 4 at r1 (raw file):

using Serval.Health.V1;

namespace Serval.Machine.Shared.Services;

Should this be in the new service toolkit?

johnml1135 · 2024-08-19T14:04:00Z

src/Serval/src/Serval.Assessment/Configuration/IMongoDataAccessConfiguratorExtensions.cs line 14 at r1 (raw file):

            {
                await c.Indexes.CreateOrUpdateAsync(
                    new CreateIndexModel<Engine>(Builders<Engine>.IndexKeys.Ascending(e => e.Owner))

We will be looking up engines by ID. The engine ID should be a second index.

johnml1135 · 2024-08-19T14:06:42Z

src/Serval/src/Serval.Assessment/Configuration/IMongoDataAccessConfiguratorExtensions.cs line 36 at r1 (raw file):

                );
                await c.Indexes.CreateOrUpdateAsync(
                    new CreateIndexModel<Result>(Builders<Result>.IndexKeys.Ascending(pt => pt.TextId))

Will we want to also be getting verses by reference? Will there be corpora defined as well? pretranslations have more indexing specified.

johnml1135 · 2024-08-19T14:09:58Z

src/Serval/src/Serval.Assessment/Consumers/DataFileDeletedConsumer.cs line 3 at r1 (raw file):

namespace Serval.Assessment.Consumers;

public class DataFileDeletedConsumer(IEngineService engineService) : IConsumer<DataFileDeleted>

Could this be shared?

johnml1135 · 2024-08-19T14:12:58Z

src/Serval/src/Serval.Assessment/Contracts/AssessmentCorpusConfigDto.cs line 13 at r1 (raw file):

    /// The language tag.
    /// </summary>
    public required string Language { get; init; }

So, a corpus will only be one language? Wouldn't they need to be two languages? How flexible are we imagining this API to be? I could imagine different types of assessments needing different input corpora - as well as referencing other engines for AQUA.

johnml1135 · 2024-08-19T14:17:54Z

src/Serval/src/Serval.Assessment/Contracts/AssessmentCorpusConfigDto.cs line 13 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

So, a corpus will only be one language? Wouldn't they need to be two languages? How flexible are we imagining this API to be? I could imagine different types of assessments needing different input corpora - as well as referencing other engines for AQUA.

I see how languages are handled - using an optional second corpora. That sounds acceptable.

johnml1135 · 2024-08-19T14:19:01Z

src/Serval/src/Serval.Assessment/Contracts/AssessmentJobConfigDto.cs line 14 at r1 (raw file):

    /// }
    /// </example>
    public object? Options { get; init; }

I assume that referencing other jobs from other engines would be done in these options for AQUA assessments.

johnml1135 · 2024-08-19T14:20:35Z

src/Serval/src/Serval.Assessment/Contracts/AssessmentResultDto.cs line 6 at r1 (raw file):

{
    public required string TextId { get; init; }
    public required string Ref { get; init; }

Could any more of this be shared with the Machine engines?

johnml1135 · 2024-08-19T14:40:21Z

src/Serval/src/Serval.Assessment/Controllers/AssessmentEnginesController.cs line 446 at r1 (raw file):

    /// <response code="503">A necessary service is currently unavailable. Check `/health` for more details.</response>
    [Authorize(Scopes.ReadAssessmentEngines)]
    [HttpGet("{id}/jobs/{jobId}/results")]

You are organizing results by job instead of by engine and corpus (as with pretranslations). Can we reasonably assume that people would want legacy results? Wouldn't the tracking, correlation resolution of assessment results be the responsibility of the client?
If we want to correlate different sets of results, what would be the best way to present the correlated results to the user? Wouldn't it be to have a unique ID for the result that is preserved with updates? Would we want to say "this was there before, but it no longer applies"? Would we update an existing record or make a new one?

Enkidu93

How much overlap/separation is there meant to be between the assessment pipeline and the translation pipeline?

Also, are there plans to make an issue to track adding additional testing? How soon do we expect this to be added to production? Is it worth exploring ignoring the assessment code for code coverage for now?

Reviewed 85 of 85 files at r1, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ddaspit and @johnml1135)

.github/workflows/ci.yml line 42 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Ah - https://github.com/dorny/test-reporter.

Nice addition!

src/Serval/src/Serval.Shared/Controllers/Endpoints.cs line 5 at r1 (raw file):

public static class Endpoints
{
    public const string GetDataFile = "GetDataFile";

Very nice

johnml1135 · 2024-08-19T15:08:27Z

src/Serval/src/Serval.Assessment/Models/Result.cs line 11 at r1 (raw file):

    public required string TextId { get; init; }
    public required string Ref { get; init; }
    public double? Score { get; init; }

Is this what we really want? I think when we were looking at this before, I recommended using something fasioned after the language server's diagnostic message. Severity, range, code, description, tags, data, etc.

johnml1135 · 2024-08-19T15:09:56Z

src/Serval/src/Serval.Assessment/Services/AssessmentPlatformServiceV1.cs line 51 at r1 (raw file):

            },
            cancellationToken: context.CancellationToken
        );

Is there any way that more of this could be shared with Machine? Could a ServiceBase class be created that is shared? There is so much that overlaps.

johnml1135 · 2024-08-19T15:19:07Z

src/Serval/src/Serval.DataFiles/Services/DataFileService.cs line 37 at r1 (raw file):

    }

    public async Task<IEnumerable<DataFile>> GetAllAsync(string owner, CancellationToken cancellationToken = default)

Do we want a cleanup function "show me everything that is not connected to a corpus"? Removing this endpoint disables cleanup of hanging files. If a person loses the id of the file, they lose it completely.

johnml1135 · 2024-08-19T15:52:53Z

src/Serval/src/Serval.Grpc/Protos/serval/assessment/v1/platform.proto line 22 at r1 (raw file):

    optional double percent_completed = 2;
    optional string message = 3;
}

Is there anything tiered? Can't we share using the ServiceToolkit?

johnml1135 · 2024-08-19T15:53:26Z

src/Serval/src/Serval.Grpc/Protos/serval/translation/v1/platform.proto line 55 at r1 (raw file):

}

message DeleteAllPretranslationsRequest {

See other comment.

johnml1135 · 2024-08-19T16:03:11Z

src/Serval/src/Serval.Shared/Contracts/AssessmentJobFinished.cs line 9 at r1 (raw file):

    public required string Owner { get; init; }
    public required JobState JobState { get; init; }
    public required string Message { get; init; }

Why is the message here, but not in the Dto?

johnml1135 · 2024-08-19T16:06:33Z

src/Serval/src/Serval.Shared/Services/GrpcServiceHealthCheck.cs line 17 at r1 (raw file):

            $"{context.Registration.Name}-Health"
        );
        HealthCheckResponse? healthCheckResponse = await client.HealthCheckAsync(

Ah, some Grpc stuff is in shared now.

johnml1135 · 2024-08-19T16:10:18Z

src/Serval/src/Serval.Webhooks/Services/WebhookService.cs line 9 at r1 (raw file):

    private readonly IBackgroundJobClient _jobClient = jobClient;

    public async Task<IEnumerable<Webhook>> GetAllAsync(string owner, CancellationToken cancellationToken = default)

Same with files - are we sure we want to have dangling webhooks? Will they auto-die?

johnml1135 · 2024-08-19T16:12:00Z

We should likely have an issue for finishing this, detailing all of what would be needed - including testing and the first integration.

johnml1135

Reviewed 85 of 85 files at r1, all commit messages.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @ddaspit)

ddaspit

Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @johnml1135)

src/Machine/src/Serval.Machine.Shared/Services/ServalHealthServiceV1.cs line 4 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Should this be in the new service toolkit?

No, the service toolkit wouldn't be appropriate, but it could be moved to the Serval.Grpc library.

src/Serval/src/Serval.Assessment/Configuration/IMongoDataAccessConfiguratorExtensions.cs line 14 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

We will be looking up engines by ID. The engine ID should be a second index.

The id property automatically has an index.

src/Serval/src/Serval.Assessment/Configuration/IMongoDataAccessConfiguratorExtensions.cs line 36 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Will we want to also be getting verses by reference? Will there be corpora defined as well? pretranslations have more indexing specified.

Right now, we only support filtering by book. We can always add more indexes when we add more filtering options.

src/Serval/src/Serval.Assessment/Consumers/DataFileDeletedConsumer.cs line 3 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Could this be shared?

No, this is specific to the assessment API. It depends on the assessment engine service.

src/Serval/src/Serval.Assessment/Contracts/AssessmentJobConfigDto.cs line 14 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

I assume that referencing other jobs from other engines would be done in these options for AQUA assessments.

Right now, I haven't designed it to reference engines from other domains. I would like to understand better why we would need to do that and how it might work before we design the interaction.

src/Serval/src/Serval.Assessment/Contracts/AssessmentResultDto.cs line 6 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Could any more of this be shared with the Machine engines?

I would like to keep the domains separate so they are free to evolve separately from each other.

src/Serval/src/Serval.Assessment/Controllers/AssessmentEnginesController.cs line 446 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

You are organizing results by job instead of by engine and corpus (as with pretranslations). Can we reasonably assume that people would want legacy results? Wouldn't the tracking, correlation resolution of assessment results be the responsibility of the client?
If we want to correlate different sets of results, what would be the best way to present the correlated results to the user? Wouldn't it be to have a unique ID for the result that is preserved with updates? Would we want to say "this was there before, but it no longer applies"? Would we update an existing record or make a new one?

Those are good questions. To tell you the truth, I don't know exactly how the API will be used yet. I am not far enough into designing Lynx to know. I want to have something in place, so that the AQuA work isn't blocked and I have something to work with when implementing Lynx. Initially, I am taking cues from the AQuA API, but I fully expect that we will be making changes to it. The nice thing about this design is that we don't have to worry about only allowing one job to run at a time.

src/Serval/src/Serval.Assessment/Models/Result.cs line 11 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Is this what we really want? I think when we were looking at this before, I recommended using something fasioned after the language server's diagnostic message. Severity, range, code, description, tags, data, etc.

This model is certainly not finalized yet. I'm not sure exactly what will be needed. Properties like severity are Lynx concepts and not AQuA concepts, so those probably won't be added.

src/Serval/src/Serval.Assessment/Services/AssessmentPlatformServiceV1.cs line 51 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Is there any way that more of this could be shared with Machine? Could a ServiceBase class be created that is shared? There is so much that overlaps.

There are similarities, but they are different domains with different data models, so I'm not what more could be shared.

src/Serval/src/Serval.DataFiles/Services/DataFileService.cs line 37 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Do we want a cleanup function "show me everything that is not connected to a corpus"? Removing this endpoint disables cleanup of hanging files. If a person loses the id of the file, they lose it completely.

This method was not removed. It was moved to the base class.

src/Serval/src/Serval.Grpc/Protos/serval/assessment/v1/platform.proto line 22 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Is there anything tiered? Can't we share using the ServiceToolkit?

I am intentionally keeping the APIs for different domains separate.

src/Serval/src/Serval.Shared/Contracts/AssessmentJobFinished.cs line 9 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Why is the message here, but not in the Dto?

I forgot to update the webhook library to consume the event.

src/Serval/src/Serval.Webhooks/Services/WebhookService.cs line 9 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Same with files - are we sure we want to have dangling webhooks? Will they auto-die?

Same as with the data files, the method is now defined in the base class.

src/Serval/src/Serval.Shared/Services/GrpcServiceHealthCheck.cs line 17 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Ah, some Grpc stuff is in shared now.

I created a separate health API that all engines can use.

johnml1135 · 2024-08-20T20:25:20Z

src/Serval/src/Serval.Assessment/Configuration/IMongoDataAccessConfiguratorExtensions.cs line 14 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

The id property automatically has an index.

Ah - I just forgot. This is to add other indexes from the default.

johnml1135 · 2024-08-20T20:28:30Z

src/Machine/src/Serval.Machine.Shared/Services/ServalHealthServiceV1.cs line 4 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

No, the service toolkit wouldn't be appropriate, but it could be moved to the Serval.Grpc library.

Nevermind - it's already shared. I was mostly not wanting it to be duplicated.

johnml1135 · 2024-08-20T20:28:46Z

src/Serval/src/Serval.Assessment/Configuration/IMongoDataAccessConfiguratorExtensions.cs line 36 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Right now, we only support filtering by book. We can always add more indexes when we add more filtering options.

True.

johnml1135 · 2024-08-20T20:29:29Z

src/Serval/src/Serval.Assessment/Consumers/DataFileDeletedConsumer.cs line 3 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

No, this is specific to the assessment API. It depends on the assessment engine service.

Corpora are handled significantly differently - I see it now.

johnml1135 · 2024-08-20T20:33:23Z

src/Serval/src/Serval.Assessment/Contracts/AssessmentJobConfigDto.cs line 14 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Right now, I haven't designed it to reference engines from other domains. I would like to understand better why we would need to do that and how it might work before we design the interaction.

I think the build options is flexible enough to handle future requirements.

johnml1135 · 2024-08-20T20:37:53Z

src/Serval/src/Serval.Assessment/Controllers/AssessmentEnginesController.cs line 446 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Those are good questions. To tell you the truth, I don't know exactly how the API will be used yet. I am not far enough into designing Lynx to know. I want to have something in place, so that the AQuA work isn't blocked and I have something to work with when implementing Lynx. Initially, I am taking cues from the AQuA API, but I fully expect that we will be making changes to it. The nice thing about this design is that we don't have to worry about only allowing one job to run at a time.

For any testing or diagnostics done in VSCode or any other platform, I have not seen history on diagnostics. The one exception is baselining for regulated industries, which may want to be done with Bible translation, to work off the issues. For that though, the diagnostics would be stored in a separate system to make sure that they are actioned, not the original test platform. I think it adds an unneeded feature and would recommend it being removed.

johnml1135 · 2024-08-20T20:42:31Z

src/Serval/src/Serval.Assessment/Models/Result.cs line 11 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This model is certainly not finalized yet. I'm not sure exactly what will be needed. Properties like severity are Lynx concepts and not AQuA concepts, so those probably won't be added.

How would Lynx determine severity? It would need data fed up to it from the lower levels. If we want to work on it more, let's open an issue for "finishing Assessment API" and adding these issues including test coverage.

johnml1135 · 2024-08-20T20:44:59Z

src/Serval/src/Serval.Shared/Contracts/AssessmentJobFinished.cs line 9 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I forgot to update the webhook library to consume the event.

I'll wait for the update then.

ddaspit

You can think of this PR as a "partial" or "in-development" implementation of the assessment API to enable Lynx development. This way I don't have to continue to maintain a long-lived development branch. Once we have a clearer idea of what more will need to be done, we can open follow-up issues.

Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @johnml1135)

src/Serval/src/Serval.Assessment/Models/Result.cs line 11 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

How would Lynx determine severity? It would need data fed up to it from the lower levels. If we want to work on it more, let's open an issue for "finishing Assessment API" and adding these issues including test coverage.

Serval would be responsible for generating the assessment results. Lynx would be responsible for "interpreting" the assessment results. Severity might be determined by assessment type or score threshold.

I agree that we should open further issues to finish up the work, but I would like to hold off on doing that for now until we have a better idea of what we will need.

- add assessment library - assessment API disabled using a feature flag - add separate health gRPC API - add GHA unit test report - add constants for all endpoint names - add assessment engine gRPC API - add new assessment scopes - update webhook library to push assessment events - add initial set of assessment integration tests

ddaspit

I added a list of changes to the commit message.

Reviewable status: 79 of 90 files reviewed, 17 unresolved discussions (waiting on @Enkidu93 and @johnml1135)

src/Serval/src/Serval.Shared/Contracts/AssessmentJobFinished.cs line 9 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

I'll wait for the update then.

Done.

johnml1135 · 2024-08-21T11:34:28Z

-- commits line 7 at r2:
This is exactly what I was looking for! Thank you.

johnml1135 · 2024-08-21T11:42:25Z

Good enough for this stage.

johnml1135 · 2024-08-21T11:43:23Z

johnml1135

Reviewed 11 of 11 files at r2, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @ddaspit)

johnml1135

Dismissed @Enkidu93 from a discussion.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @ddaspit)

johnml1135 self-requested a review August 16, 2024 22:31

Enkidu93 reviewed Aug 19, 2024

View reviewed changes

johnml1135 reviewed Aug 19, 2024

View reviewed changes

ddaspit commented Aug 19, 2024

View reviewed changes

ddaspit commented Aug 20, 2024

View reviewed changes

ddaspit force-pushed the assessment-api branch from 493c87b to 0a9a67b Compare August 20, 2024 22:10

ddaspit force-pushed the assessment-api branch from 0a9a67b to ec397d3 Compare August 20, 2024 22:44

ddaspit commented Aug 20, 2024

View reviewed changes

johnml1135 approved these changes Aug 21, 2024

View reviewed changes

johnml1135 merged commit 0d33b4f into main Aug 21, 2024
3 checks passed

johnml1135 deleted the assessment-api branch August 21, 2024 17:57

Add assessment API #455

Add assessment API #455

Conversation

ddaspit commented Aug 15, 2024 • edited Loading

codecov-commenter commented Aug 15, 2024 • edited Loading

Codecov Report

johnml1135 commented Aug 16, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

Enkidu93 left a comment

Choose a reason for hiding this comment

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 commented Aug 19, 2024

johnml1135 left a comment

Choose a reason for hiding this comment

ddaspit left a comment

Choose a reason for hiding this comment

johnml1135 commented Aug 20, 2024

johnml1135 commented Aug 20, 2024

johnml1135 commented Aug 20, 2024

johnml1135 commented Aug 20, 2024

johnml1135 commented Aug 20, 2024

johnml1135 commented Aug 20, 2024

johnml1135 commented Aug 20, 2024

johnml1135 commented Aug 20, 2024

ddaspit left a comment

Choose a reason for hiding this comment

ddaspit left a comment

Choose a reason for hiding this comment

johnml1135 commented Aug 21, 2024

johnml1135 commented Aug 21, 2024

johnml1135 commented Aug 21, 2024

johnml1135 left a comment

Choose a reason for hiding this comment

johnml1135 left a comment

Choose a reason for hiding this comment

ddaspit commented Aug 15, 2024 •

edited

Loading

codecov-commenter commented Aug 15, 2024 •

edited

Loading