Populate domains drop down with what's been ingested in datahub #407

Merged
merged 9 commits into from
Jun 11, 2024

@MatMoore MatMoore commented Jun 6, 2024

Resolves #385

Our hardcoded domain list has drifted from the CaDeT domain model. We now have a way to ingest domains from CaDeT, so we can dynamically populate our domain model in the service.

On dev, the drop down now looks like this (some old domains are still hanging around):

Screenshot 2024-06-10 at 12 07 01

I've sorted these alphabetically and removed the subdomains drop down for now. This is based on there being no subdomains available to select any more, regardless of which domain you choose.

Also noticed a small bug where domains weren't being displayed for Chart results, so fixed this in the process.

MatMoore added 2 commits June 6, 2024 16:18
- remove entity which is not currently present
- enable the no_duplicates test (we have fixed this)
Previously we hardcoded the list of domains shown in the search filter,
and had different lists per environment.

This was useful in alpha when we had some junk domains we wanted to
filter out, but now we're at a point where every domain in Datahub
should be one we want to use.

This commit means we now fetch every domain that has something linked to
it, and display that in alphabetical order.
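
A minimal sketch of the behaviour described in this commit message, not the code from this PR. `DomainOption`, `domain_choices`, the shape of the facet data and the blank "All domains" entry are all assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainOption:
    """Hypothetical container for one domain facet returned by the catalogue."""

    urn: str
    label: str
    total: int  # number of entities linked to the domain


def domain_choices(facets: list[DomainOption]) -> list[tuple[str, str]]:
    """Return (value, label) pairs for non-empty domains, in alphabetical order."""
    # Only domains that have something linked to them are shown.
    non_empty = [d for d in facets if d.total > 0]
    # casefold() makes the ordering case-insensitive.
    ordered = sorted(non_empty, key=lambda d: d.label.casefold())
    # Assumed: a blank first choice meaning "search across every domain".
    return [("", "All domains")] + [(d.urn, d.label) for d in ordered]
```

The result is in the (value, label) format that a Django `ChoiceField` accepts for its `choices`.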
MatMoore added 3 commits June 10, 2024 10:39
Ideally we would just fetch the facets once per request,
but in practice we do this from a few different places.

1. In the view we instantiate a SearchService, which uses the domain
   model in constructing filters for Datahub.
2. The SearchForm also needs them to know what choices are valid, so we
   need to pass a callback to the form's ChoiceField. That callback does
   not share any data with the view.

Caching the value is a quick way to avoid making extra requests for the
same data.
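
One way to picture the caching described above is a small time-based cache wrapped around whatever function fetches the facets. This is a sketch under assumptions, not the implementation in this PR: `TimedCache`, the 300 second default and the injectable `clock` are hypothetical.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TimedCache(Generic[T]):
    """Hypothetical helper: remember the result of `fetch` for `ttl_seconds`."""

    def __init__(
        self,
        fetch: Callable[[], T],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[T] = None
        # Start already expired so the first call always fetches.
        self._expires_at = float("-inf")

    def get(self) -> T:
        now = self._clock()
        if now >= self._expires_at:
            # Cache miss or stale value: fetch again and reset the expiry.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value  # type: ignore[return-value]
```

Both the view and the form's choices callback could call `get()` on a shared instance, so only the first call in each time window would reach Datahub. Django's own cache framework would be another option for the same job.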
This is the case at the moment, because the domain model we've pulled in
from CaDeT doesn't have subdomains. This might change later though so I
don't want to remove the subdomain code completely.
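
The "hide it only while it is empty" idea can be sketched as a simple check on the loaded domain model. The function name and the mapping of domain to subdomain choices are assumptions for illustration, not the structures used in this PR:

```python
def show_subdomain_filter(
    subdomains_by_domain: dict[str, list[tuple[str, str]]],
) -> bool:
    """Return True only if at least one domain has subdomains to choose from."""
    # An empty list is falsy, so this is False when every domain has no subdomains
    # (and also when there are no domains at all).
    return any(subdomains_by_domain.values())
```

Because the check reads the data rather than a hardcoded flag, the drop down would reappear by itself if subdomains are ingested later.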
@MatMoore MatMoore marked this pull request as ready for review June 10, 2024 11:16
@MatMoore MatMoore changed the title [WIP] Populate domains drop down with what's been ingested in datahub Populate domains drop down with what's been ingested in datahub Jun 10, 2024
@murdo-moj

Because of how our search is structured, only domains that contain at least one of the datatypes we have defined are being pulled through. Is this the intended behaviour? e.g. electronic_monitoring https://datahub-catalogue-dev.apps.live.cloud-platform.service.justice.gov.uk/domain/urn:li:domain:electronic_monitoring/Entities?is_lineage_mode=false

Previously it was only returning domains with tables in. We should
include any that show as non-empty in Find MOJ Data.
@MatMoore

Sort of, but there was a bug - it's supposed to pull through anything non-empty. This means that if nothing in the domain is tagged to dc_display_in_catalogue, the domain will be hidden as well.

By default it was filtering on result type = table so we were missing a few, but fixed now.

@MatMoore MatMoore merged commit aee5e43 into main Jun 11, 2024
5 checks passed
@MatMoore MatMoore deleted the align-domains branch June 11, 2024 15:52

sentry-io bot commented Jun 12, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ CatalogueError: Unable to execute facets query /search View Issue
  • ‼️ ConnectivityError /search View Issue
  • ‼️ CatalogueError: Unable to execute facets query /search View Issue
  • ‼️ CatalogueError: Unable to execute facets query /search View Issue
  • ‼️ SystemExit: 1 /search View Issue

mitchdawson1982 pushed a commit that referenced this pull request Jun 14, 2024
* Add missing domain information from charts

* Update search tests that hit datahub dev

- remove entity which is not currently present
- enable the no_duplicates test (we have fixed this)

* Load the list of domains from Datahub

Previously we hardcoded the list of domains shown in the search filter,
and had different lists per environment.

This was useful in alpha when we had some junk domains we wanted to
filter out, but now we're at a point where every domain in Datahub
should be one we want to use.

This commit means we now fetch every domain that has something linked to
it, and display that in alphabetical order.

* Move domain model to models and remove unused model

* Refactor: decouple SearchFacetFetcher from DomainModel

* Cache facets fetched from datahub

Ideally we would just fetch the facets once per request,
but in practice we do this from a few different places.

1. In the view we instantiate a SearchService, which uses the domain
   model in constructing filters for Datahub.
2. The SearchForm also needs them to know what choices are valid, so we
   need to pass a callback to the form's ChoiceField. That callback does
   not share any data with the view.

Caching the value is a quick way to avoid making extra requests for the
same data.

* Hide subdomains if there aren't any defined

This is the case at the moment, because the domain model we've pulled in
from CaDeT doesn't have subdomains. This might change later though so I
don't want to remove the subdomain code completely.

* Include missing domains

Previously it was only returning domains with tables in. We should
include any that show as non-empty in Find MOJ Data.
mitchdawson1982 added a commit that referenced this pull request Jun 17, 2024
* add .env.tpl env template file

* Add MOJ internal service header (#405)

* Add MOJ internal service header

The main links are now in a primary nav component.

This should go below the phase banner as the banner is supposed to touch
the black header.

I've also changed the phase from alpha -> beta, and changed the
capitalization in the service name.

* Remove commented out html

* Populate domains drop down with what's been ingested in datahub (#407)

* Cleanup - bring through tags and glossary terms consistently, and remove dead code for data products (#418)

* Extend tags and include glossary terms in search results

* Remove remaining references to data product

This is currently unused, because we no longer include data products in
the search.

* Set chromedriver path to one installed by setup-chromedriver

* Remove metrics ingress config (#425)

remove allowed subnets

* Show when stuff is an ESDA (#421)

* Show when stuff is an ESDA

This is only shown on a handful of assets.

Also remove metadata fields we have not populated yet (these will always
display as blank)

* Correct casing

* Metrics ingress test (#428)

* remove allowed subnets

* change ecr_region from input to var

* Update workflow variable assignments (#431)

* add replaces vars with inputs

* Remove inputs and pull vars from respective environments

* Fmd 366 add dataset lineage link (#416)

* add upstream and downstream lineage to getDatasetDetails graphql query

* refactor parse_relations() helper to handle more relations

* add upstream and downstream lineage to RelationshipType enum

* update parse_relations() input args

* update parse_relations() input args in search

* add has_lineage and lineage_url to dataset details context

* add lineage link to details_table template

* remove redundant block in query for data product relationships

* return entity name for lineage

* have only 1 RelationshipType for lineage

* simplify `parse_relations()` helper function

* update DatasetDetails to use single lineage type

* align url to rest of table

* update tests

* add default value for url

* design suggestions for lineage label, from Alex and Jess

* spell it right

* suggestions from Mat

* update readme

* remove .env.example

---------

Co-authored-by: Mat <[email protected]>
Co-authored-by: Matt <[email protected]>
Successfully merging this pull request may close these issues.

Align dev find-moj-data domains to current cadet domains