
🧪 Integrate process validation into the application #7

Open
1 of 3 tasks
jadudm opened this issue Nov 9, 2024 · 2 comments


jadudm commented Nov 9, 2024

At a glance

In order to have confidence in the application
as a developer
I want integration tests

Acceptance Criteria

We use DRY behavior-driven development wherever possible.

then...

Shepherd

  • UX shepherd:
  • Design shepherd:
  • Engineering shepherd:

Background

Ultimately, we want a variety of tests: unit tests for small portions of functionality (e.g. algorithmic operations, network/service interactions) as well as larger/integrative tests. This ticket is about integration tests, which make sure that multiple components work together correctly.

Traditionally, we might have tests that run outside of the application. That is, we might run some code, and ask if it did the right thing. However, in production... we have nothing. We might catch errors on a micro level ("did this file save?"), but we don't know if the process as a whole is progressing.

Inspired by the architecture of nanopass compilers, we will integrate our testing and validation directly into the pipeline of the application itself. We will then run this both locally (when developing) as well as in production. This way, we cannot let our tests fall behind our application, because they are part of the application itself.
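As a minimal sketch of this idea, each stage could queue its output for the next stage and, in parallel, for the validator, so validation is just another consumer of the same messages. The queue names and message shape below are hypothetical stand-ins; the real system would use its own queue/work infrastructure.

```python
# Sketch: every stage writes to the next stage's queue AND to the
# validator's queue, so validation runs as part of the pipeline itself.
from queue import Queue

queues = {name: Queue() for name in ("extract", "pack", "serve", "validate")}

def fetch(url: str) -> None:
    # Stand-in for the real fetch output (metadata + .raw file in S3).
    result = {"stage": "fetch", "url": url}
    queues["extract"].put(result)    # queue for extraction...
    queues["validate"].put(result)   # ...and always for validation

def validator_step() -> str:
    # In production this would loop forever; here we drain one message
    # and dispatch to the per-stage rules.
    msg = queues["validate"].get()
    return msg["stage"]  # e.g. "fetch" -> apply the fetch rules

fetch("https://example.gov/")
```

The key property is that `fetch` cannot run without also emitting a validation message, so the tests cannot fall behind the application.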

In a picture, our application might have looked like this:

```mermaid
sequenceDiagram
  participant f as fetch
  participant e as extract
  participant p as pack
  participant s as serve
  participant v as validator
  f ->> e: fetch content, queue for extraction
  f ->> v: validate fetch output
  e ->> p: extract, queue for packing
  e ->> v: validate extract output
  p ->> s: pack, queue for search
  p ->> v: validate pack output
  s ->> v: confirm search results
```

Or:

  1. After fetching a page, we should have a file in S3. That file should have a JSON metadata document as well as a .raw file. We should be able to confirm the existence of the metadata file, its contents (e.g. is it valid JSON? Does it have the fields we expect, like content-length or content-type?) as well as the existence of the .raw file and its size (e.g. at least check, is it non-zero size? Within 1% of the content-length reported in the metadata?)
  2. After extracting content, we should have a JSON document with a content field, and that field should be of non-zero length. Ultimately, we might have more robust checks.
  3. Once we have packed content, there should be a metadata/manifest file about everything packed and an sqlite database in the serve bucket. We can confirm at this point that the DB is valid, and that all of the URLs in the manifest are present.
  4. Finally, we should be able to pick some words from the extract files and run a query against the live search component for the database. We expect at least one result. This confirms the search is working, serving content, and that the particular site was completely indexed/updated.
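The first of these checks can be sketched as a small rule set. This is a hypothetical validator for the fetch output only; the field names (`content-length`, `content-type`) follow the examples in this ticket, and the real schema may differ.

```python
import json

def validate_fetch_output(metadata_bytes: bytes, raw_size: int) -> list[str]:
    """Return a list of validation errors; an empty list means the fetch passed."""
    errors = []

    # Is the metadata document valid JSON at all?
    try:
        metadata = json.loads(metadata_bytes)
    except json.JSONDecodeError as e:
        return [f"metadata is not valid JSON: {e}"]

    # Does it have the fields we expect?
    for field in ("content-length", "content-type"):
        if field not in metadata:
            errors.append(f"metadata missing field: {field}")

    # Is the .raw file non-empty...
    if raw_size <= 0:
        errors.append("raw file is empty")
    # ...and within 1% of the content-length reported in the metadata?
    elif "content-length" in metadata:
        expected = int(metadata["content-length"])
        if expected > 0 and abs(raw_size - expected) / expected > 0.01:
            errors.append(
                f"raw size {raw_size} deviates >1% from content-length {expected}"
            )
    return errors
```

For example, a 995-byte `.raw` file against a reported `content-length` of 1000 passes (0.5% deviation), while an empty file with incomplete metadata accumulates errors.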

The validator can become a living component of the system. It should not be an external process that is run only when we are testing; instead, it should run always, as part of the production stack. In this way, every application action is validated constantly. If the validator ever fails/throws an error, we know something very, very bad has happened. (Cats and dogs, living together... mass hysteria!)

It is, of course, possible to then run test scenarios against the stack. That is, in CI/CD, we can have a test site that provides "easy path" and "diabolical" test cases. Our system should, then, always pass all our test cases (e.g. sites to crawl). And, in production, we will be able to see when we venture off the map, and into the region where there be 🐉 ...

Security Considerations

Required per CM-4.

There are no security concerns; the validator runs within the stack, and communicates only with the queue. Having continuous validation of inputs/outputs is a security-enhancing feature.


Process checklist
  • Has a clear story statement
  • Can reasonably be done in a few days (otherwise, split this up!)
  • Shepherds have been identified
  • UX youexes all the things
  • Design designs all the things
  • Engineering engineers all the things
  • Meets acceptance criteria
  • Meets QASP conditions
  • Presented in a review
  • Includes screenshots or references to artifacts
  • Tagged with the sprint where it was finished
  • Archived

If there's UI...

  • Screen reader - Listen to the experience with a screen reader extension, ensure the information is presented in order
  • Keyboard navigation - Run through acceptance criteria with keyboard tabs, ensure it works.
  • Text scaling - Adjust viewport to 1280 pixels wide and zoom to 200%, ensure everything renders as expected. Document 400% zoom issues with USWDS if appropriate.
@jadudm jadudm added this to jemison Nov 9, 2024
@github-project-automation github-project-automation bot moved this to triage in jemison Nov 9, 2024
@jadudm jadudm moved this from triage to backlog in jemison Nov 9, 2024
jadudm added a commit that referenced this issue Nov 9, 2024

jadudm commented Nov 9, 2024

The first step is complete: the validator is included in the app (locally), and it responds to a validate_fetch message. However, rules are not implemented, nor are the other services.


jadudm commented Nov 17, 2024

Blocked only because I'm waiting for some other things to stabilize before I bring this in fully.
