Support extracting multiple subsets from one table #142
This is a more in-depth approach to #139 than the `--data-only` flag I proposed in #140. That flag may still be helpful in other cases, so I'd consider that PR separately.

This introduces the concept of table Subsets, which allow multiple Filter, Anonymise, and Relationships configurations to be attached to a single table. All functionality related to reading table data is refactored from operating on an entire table to operating on a single subset (ReadTable becomes ReadSubset).
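As a rough illustration of the shape described above, here is a minimal Python sketch, not the code in this PR. The `Subset` and `Table` classes, the `read_subset` and `read_all_subsets` helpers, and the equality-only `match` filter are all hypothetical stand-ins chosen to keep the example small.

```python
from dataclasses import dataclass, field


@dataclass
class Subset:
    """One named Filter/Anonymise/Relationships bundle (hypothetical model)."""
    name: str
    filter: dict = field(default_factory=dict)
    anonymise: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)


@dataclass
class Table:
    """A table that carries any number of subsets (hypothetical model)."""
    name: str
    subsets: list = field(default_factory=list)


def read_subset(table, subset, rows):
    """Return the rows selected by one subset.

    Assumption: the filter is a dict of column -> required value under the
    key "match". A real filter would be richer than this.
    """
    match = subset.filter.get("match", {})
    return [r for r in rows if all(r.get(k) == v for k, v in match.items())]


def read_all_subsets(table, rows):
    """Read every subset of a table independently, keyed by subset name."""
    return {s.name: read_subset(table, s, rows) for s in table.subsets}
```

The point of the sketch is only that the unit of work is the subset rather than the table: a table with two subsets is read twice, once per filter.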
For compatibility, Filter, Anonymise, and Relationships blocks defined at the root Table level are copied into a `_default` Subset when the config file is parsed.

So, a config like this:
Will be parsed into something like this behind the scenes:
One area that I'm somewhat unsure about is how things should be handled when Filter config exists both at the root level and in Subsets.
For example:
This config will currently be parsed into a tree that looks like this:
I think this is a reasonable approach overall — it treats the root level as its own separate subset — but I can also see how it could be confusing if people thought that the Filter, Anonymise, and Relationships blocks defined at the root level would be inherited by all subsets.
The alternative is to make this scenario throw an error, so a table can either use a root-level configuration, or it can define Subsets, but it cannot do both.
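To make the two options concrete, here is a minimal Python sketch of the parse-time step. It is not the code in this PR: the dict-based config shape, the `normalise_table` function, the `strict` switch, the subset name `_default`, and the choice to put the default subset first are all assumptions for illustration.

```python
DEFAULT_SUBSET = "_default"  # assumed name of the compatibility subset
ROOT_KEYS = ("Filter", "Anonymise", "Relationships")


def normalise_table(table, strict=False):
    """Move root-level blocks into a default subset.

    strict=False: the root level becomes its own subset next to any
                  explicitly defined ones (the current behaviour described above).
    strict=True:  mixing root-level blocks with Subsets is rejected
                  (the alternative described above).
    """
    subsets = [dict(s) for s in table.get("Subsets", [])]
    root = {k: table[k] for k in ROOT_KEYS if k in table}

    if root:
        if strict and subsets:
            raise ValueError(
                f"table {table['Name']!r} defines both root-level config and Subsets"
            )
        # Root-level blocks are NOT inherited by other subsets; they only
        # form one more subset of their own.
        subsets.insert(0, {"Name": DEFAULT_SUBSET, **root})

    return {"Name": table["Name"], "Subsets": subsets}
```

With `strict=False`, a table that has a root-level Filter and one explicit subset ends up with two independent subsets. With `strict=True`, the same input raises an error, which removes the ambiguity about inheritance at the cost of rejecting a config that currently parses.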
Thoughts?