Problem Description
For any column that has sdtype 'id' and a provided regex, the SDV currently generates the matching values in sequential (alphanumeric) order. The resulting data doesn't look realistic.
Expected behavior
For any id columns with a regex, ensure that the synthetic data does not have values in sequential order. A simple way to do this is to scramble them.
In technical terms: We assign RDT's RegexGenerator to accomplish this. By default, we should assign the RegexGenerator with generation_order='scrambled' to these columns.
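To illustrate the difference between the two orders, here is a minimal standard-library sketch. It is not RDT's implementation: the helper names `sequential_ids` and `scrambled_ids` and the `ID_` prefix are hypothetical, and shuffling a pre-generated list is only one assumed way to scramble.

```python
import itertools
import random


def sequential_ids(prefix, digits, count):
    """Return `count` ids in alphanumeric order, e.g. ID_000, ID_001, ..."""
    combos = itertools.product('0123456789', repeat=digits)
    return [prefix + ''.join(combo) for combo in itertools.islice(combos, count)]


def scrambled_ids(prefix, digits, count, seed=None):
    """Return the same ids as sequential_ids, but in shuffled order."""
    ids = sequential_ids(prefix, digits, count)
    random.Random(seed).shuffle(ids)  # same set of values, different order
    return ids


ordered = sequential_ids('ID_', 3, 5)
shuffled = scrambled_ids('ID_', 3, 5, seed=0)
print(ordered)
print(shuffled)
```

Both lists contain the same unique values; only the order differs, which is the behavior this issue asks for by default.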
Additional context
This change only applies if a column has sdtype 'id' AND there is a user-provided 'regex_format'. If no regex is provided in the metadata, this issue makes no changes.
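The condition above could be sketched as a small check over per-column metadata. This assumes the metadata is available as a plain dictionary with 'sdtype' and 'regex_format' keys. The function name `choose_generation_order` and the example columns are hypothetical, not part of SDV.

```python
def choose_generation_order(column_metadata):
    """Return 'scrambled' for id columns with a regex, else None (no change)."""
    is_id = column_metadata.get('sdtype') == 'id'
    has_regex = column_metadata.get('regex_format') is not None
    return 'scrambled' if is_id and has_regex else None


# Hypothetical column definitions used only for illustration.
columns = {
    'user_id': {'sdtype': 'id', 'regex_format': 'ID_[0-9]{3}'},
    'session_id': {'sdtype': 'id'},  # no regex: left unchanged
    'age': {'sdtype': 'numerical'},  # not an id column: left unchanged
}

orders = {name: choose_generation_order(meta) for name, meta in columns.items()}
print(orders)
```

Only 'user_id' satisfies both conditions, so it is the only column that would receive the scrambled order.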
This issue depends on RDT changes that add the generation_order parameter. See RDT issue #800.