Improve quality of `sequence_index` #1765

frances-h · 2024-01-31T20:22:57Z

CU-86az5xcqz
Resolve #1760

The sequence index is now split into two columns: a diff column, which goes through the sequential model, and a context column, which is added to the context model. Additionally, FloatFormatter is applied to the diff column to force sampled values to fall within the given min/max range.
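As a rough illustration of the split, the sketch below separates one sequence's index into a start value (the part that would go to the context model, going by the linked task title) and row-to-row diffs (the part that would go to the sequential model), then rebuilds the index from sampled diffs clipped to a min/max range. The function names are hypothetical, and the clipping is a plain-Python stand-in for the range enforcement, not the code in `par.py`.

```python
from itertools import accumulate


def split_sequence_index(values):
    """Split one sequence's index into a start value and row-to-row diffs."""
    start = values[0]
    # The first row has no predecessor, so its diff is set to 0 (an assumption).
    diffs = [0] + [current - previous for previous, current in zip(values, values[1:])]
    return start, diffs


def rebuild_sequence_index(start, sampled_diffs, min_diff, max_diff):
    """Rebuild an index from a start value and diffs clipped to [min_diff, max_diff]."""
    clipped = [min(max(diff, min_diff), max_diff) for diff in sampled_diffs]
    # Running total of the clipped diffs, offset by the start value.
    return [start + total for total in accumulate(clipped)]


start, diffs = split_sequence_index([10, 12, 15])
rebuilt = rebuild_sequence_index(start, [0, 2.5, -4, 9], min_diff=0, max_diff=3)
```

If the diffs come from sorted data, the minimum diff is non-negative, so clipping to that range also keeps the rebuilt index from going backwards.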

In addition, `sample_sequential_columns` now uses conditional sampling to generate any missing extra context columns before it draws the sequential samples.
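To make the idea of conditionally filling in missing context columns concrete, here is a minimal sketch. It assumes a toy "context model" that is just a list of training rows and draws the missing columns from rows that agree with the known values. `fill_missing_context` and the column names are hypothetical; the real context model is a fitted synthesizer, not a lookup table.

```python
import random


def fill_missing_context(known_context, training_context, missing_columns, rng):
    """Draw values for missing context columns conditioned on the known ones."""
    completed = []
    for row in known_context:
        # Keep only training rows that agree with every known value.
        matches = [
            candidate for candidate in training_context
            if all(candidate[column] == value for column, value in row.items())
        ]
        # Fall back to the whole table when nothing matches the condition.
        donor = rng.choice(matches or training_context)
        completed.append({**row, **{column: donor[column] for column in missing_columns}})
    return completed


training = [
    {'region': 'east', 'timestamp.context': 10},
    {'region': 'west', 'timestamp.context': 100},
]
filled = fill_missing_context(
    [{'region': 'west'}], training, ['timestamp.context'], random.Random(0)
)
```

The completed context rows can then be handed to the sequential sampling step, which needs every context column to be present.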

sdv-team · 2024-01-31T20:23:01Z

Task linked: CU-86az5xcqz SDV - Improve quality of sequence_index: Move the start dates into the context model #1760

codecov-commenter · 2024-01-31T20:32:44Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (e6e508b) 97.11% compared to head (1428364) 97.12%.
Report is 1 commit behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1765      +/-   ##
==========================================
+ Coverage   97.11%   97.12%   +0.01%     
==========================================
  Files          48       48              
  Lines        4570     4598      +28     
==========================================
+ Hits         4438     4466      +28     
  Misses        132      132

amontanez24

This looks good! Maybe we should add an integration test to show that the issues that led to this PR were resolved.
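One property such a test could assert is that the sampled sequence index never goes backwards within a sequence. This is an assumption for illustration; the thread does not spell out what the test checks. The helper below is hypothetical and works on plain rows rather than a DataFrame.

```python
from collections import defaultdict


def find_bad_sequences(rows, sequence_key, sequence_index):
    """Return keys of sequences whose index decreases between consecutive rows."""
    sequences = defaultdict(list)
    for row in rows:  # rows are assumed to be in sampled order
        sequences[row[sequence_key]].append(row[sequence_index])
    return sorted(
        key for key, values in sequences.items()
        if any(later < earlier for earlier, later in zip(values, values[1:]))
    )


sampled = [
    {'id': 'a', 'timestamp': 1},
    {'id': 'a', 'timestamp': 3},
    {'id': 'b', 'timestamp': 5},
    {'id': 'b', 'timestamp': 4},
]
bad = find_bad_sequences(sampled, 'id', 'timestamp')
```

An integration test could sample from a fitted synthesizer and assert that this list is empty.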

amontanez24 · 2024-02-01T22:01:56Z

sdv/sequential/par.py

+        data = data.merge(
+            sequence_index_context,
+            left_on=self._sequence_key,
+            right_index=True)


It's funny that we merge this back in only to separate it out again later. Not sure if there's a way to avoid that.

frances-h · 2024-02-02

Maybe, but not easily right now. PAR also takes in the context columns when assembling the sequences and fitting the model, so we'd need to do the join regardless. We could investigate whether the context is really needed there, though, and skip the join here if it isn't.
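The round trip being discussed can be sketched with plain dictionaries standing in for the DataFrame merge: context values are attached to every row of a sequence, then separated back out when the sequences are assembled. Both helpers and the column names are hypothetical, not PAR's actual code.

```python
def merge_context(rows, context_by_key, sequence_key):
    """Attach each sequence's context values to every row of that sequence."""
    return [{**row, **context_by_key[row[sequence_key]]} for row in rows]


def assemble_sequences(rows, sequence_key, context_columns):
    """Group rows by key, separating the context back out from the sequential data."""
    sequences = {}
    for row in rows:
        entry = sequences.setdefault(
            row[sequence_key],
            # Context is constant within a sequence, so the first row's values suffice.
            {'context': {column: row[column] for column in context_columns}, 'data': []},
        )
        entry['data'].append({
            column: value for column, value in row.items()
            if column != sequence_key and column not in context_columns
        })
    return sequences


rows = [
    {'id': 'a', 'timestamp.diff': 0},
    {'id': 'a', 'timestamp.diff': 2},
    {'id': 'b', 'timestamp.diff': 0},
]
context = {'a': {'timestamp.context': 10}, 'b': {'timestamp.context': 100}}
merged = merge_context(rows, context, 'id')
sequences = assemble_sequences(merged, 'id', ['timestamp.context'])
```

The merge repeats each context value once per row, and the assembly step collapses it back to one value per sequence, which is the redundancy pointed out above.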

amontanez24

LGTM!

frances-h added 4 commits January 31, 2024 12:14

Add sequence_index base to context model and apply FloatFormatter to diff column

3f475ea

conditionally sample extra context columns when sampling sequential and fix for single sequence

cc6b677

fix tests

4dba291

lint

1e83531

split into helper and add unit tests

bd62116

frances-h marked this pull request as ready for review January 31, 2024 21:26

frances-h requested a review from a team as a code owner January 31, 2024 21:26

frances-h requested review from amontanez24 and rwedge and removed request for a team January 31, 2024 21:26

amontanez24 reviewed Feb 1, 2024

add sequence index validation to integration test

1428364

amontanez24 approved these changes Feb 2, 2024

rwedge approved these changes Feb 5, 2024

frances-h merged commit 1d2b03e into main Feb 5, 2024
37 checks passed

frances-h deleted the issue-1760-improve-sequence_index-quality branch February 5, 2024 19:56
