Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

⛩ Subsetting of data model / rules #987

Merged
merged 10 commits into from
Feb 12, 2025
Merged

Conversation

nikokaoja
Copy link
Collaborator

  • [ALPHA] Ability to subset data model to desired concepts (classes/views)

Copy link

github-actions bot commented Feb 12, 2025

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines Covered Coverage Threshold Status
14732 11757 80% 60% 🟢

New Files

File Coverage Status
cognite/neat/_session/_subset.py 39% 🟢
TOTAL 39% 🟢

Modified Files

File Coverage Status
cognite/neat/_alpha.py 100% 🟢
cognite/neat/_graph/extractors/_mock_graph_generator.py 20% 🟢
cognite/neat/_rules/analysis/_base.py 81% 🟢
cognite/neat/_rules/transformers/init.py 100% 🟢
cognite/neat/_rules/transformers/_converters.py 85% 🟢
cognite/neat/_session/_base.py 78% 🟢
cognite/neat/_session/_read.py 75% 🟢
cognite/neat/utils/rdf.py 68% 🟢
TOTAL 76% 🟢

updated for commit: 9fbfd6e by action🐍

Copy link
Collaborator

@doctrino doctrino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff



@session_class_wrapper
class SubsetAPI:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if there is a simpler name, subset is a technical term that might not encourage our less technical users.

if self._state.rule_store.empty:
raise NeatSessionError("No rules to set the data model ID.")

warnings.filterwarnings("default")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this one?

raise NeatSessionError("Something went terrible wrong. Please contact the neat team.")

if not issues:
print(f"Subset to {after} concepts.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after can risk not existing

information = self._state.rule_store.provenance[-1].target_entity.information

if dms:
views = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This selection logic should probably be in the transformer and not here

after = len(self._state.rule_store.last_verified_dms_rules.views)

elif information:
classes = {ClassEntity(prefix=information.metadata.space, suffix=concept) for concept in concepts}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no warning about typos here. So if you mistype, you can easily drop all classes.

@@ -101,6 +102,7 @@ def __init__(
self.inspect = InspectAPI(self._state)
self.mapping = MappingAPI(self._state)
self.drop = DropAPI(self._state)
self.subset = SubsetAPI(self._state)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets have a short discussion on name

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Keeping the name

subset = subset.union(ancestors)

if not subset:
raise ValueError("None of the requested classes are defined in the rules!")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NeatValueError? This could happen as a result of poor user decisions?

subset = subset.union(ancestors)

if not subset:
raise ValueError("None of the requested classes are defined in the rules!")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. NeatValueError? This could happen as a result of poor user decisions?

try:
return DMSRules.model_validate(subsetted_rules)
except ValidationError as e:
raise NeatValueError(f"Cannot subset rules: {e}") from e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets unpack the pydantic errors into nice, readable NeatErrors.

Suggested change
raise NeatValueError(f"Cannot subset rules: {e}") from e
neat_errors = from_pydantic_errors(e.errors())
body = "\n".join(f" - {neat_error.as_message()}" for neat_error in neat_errors)
raise NeatValueError(f"Cannot subset rules: \n{body}") from e

remember

from cognite.neat._issues._factory import from_pydantic_errors

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better, make from_pydanti_errors return an IssueList and have a create_body method.

try:
return InformationRules.model_validate(subsetted_rules)
except ValidationError as e:
raise NeatValueError(f"Cannot subset rules: {e}") from e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from cognite.neat._issues._factory import from_pydantic_errors
Suggested change
raise NeatValueError(f"Cannot subset rules: {e}") from e
neat_errors = from_pydantic_errors(e.errors())
body = "\n".join(f" - {neat_error.as_message()}" for neat_error in neat_errors)
raise NeatValueError(f"Cannot subset rules: \n{body}") from e

@nikokaoja nikokaoja merged commit 02f1873 into main Feb 12, 2025
6 checks passed
@nikokaoja nikokaoja deleted the NEAT-759-Refactor-subset-rules branch February 12, 2025 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants