If the input data has a different index, the reverse transformed data may be out of order #277

Closed
amontanez24 opened this issue Oct 4, 2021 · 0 comments

Environment Details

Please indicate the following details about the environment in which you found the bug:

  • RDT version: Any
  • Python version: Any
  • Operating System: Any

Error Description

If the DataFrame passed to the HyperTransformer has an index that isn't in incremental order, the reverse-transformed data may come back in the wrong order. Some transformers reset the index, so when the BaseTransformer assigns the new columns, their rows can end up out of order. This happens in the following lines, because on assignment pandas reorders the rows of columns_data to match the index of data.

def _set_columns_data(data, columns_data, columns):
    if len(columns_data.shape) == 1:
        data[columns[0]] = columns_data
    else:
        data[columns] = columns_data
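
The reordering can be seen with pandas alone. The snippet below is a minimal sketch, not RDT code: the index values, the 'reversed' column name and the sample numbers are made up for illustration.

```python
import pandas as pd

# Input frame with a non-incremental index (illustrative values).
data = pd.DataFrame({'integer': [1, 2, 1, 3]}, index=[3, 1, 0, 2])

# Stand-in for a transformer output whose index was reset: the rows are in
# the same positional order as `data`, but labelled 0..3.
columns_data = pd.Series([10, 20, 10, 30])

# Column assignment aligns on index labels, not on position, so each row of
# `data` receives the value whose label matches its own index label.
data['reversed'] = columns_data

print(data['reversed'].tolist())  # [30, 20, 10, 10], not [10, 20, 10, 30]
```

The row labelled 3 sits first in data, so it receives columns_data[3] rather than the first value, and so on for the other rows.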

Steps to reproduce

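import pandas as pd
from rdt import HyperTransformer
from rdt.transformers import CategoricalTransformer

# The index below is an assumed example of a non-incremental index;
# the original snippet did not set one.
data = pd.DataFrame({
    'integer': [1, 2, 1, 3],
    'float': [0.1, 0.2, 0.1, 0.1],
    'categorical': ['a', 'a', 'b', 'a'],
    'bool': [False, False, True, False],
    'names': ['Jon', 'Arya', 'Jon', 'Jon'],
}, index=[3, 1, 0, 2])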
ht = HyperTransformer(data_type_transformers={'categorical': CategoricalTransformer})
ht.fit(data)
transformed = ht.transform(data)
reversed = ht.reverse_transform(transformed)
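
One possible way to avoid the reordering is to make the assignment positional by giving columns_data the same index as data before assigning. The function below is a hypothetical sketch under that assumption; its name and the return value are not part of RDT, and it is not necessarily the fix that was adopted.

```python
import pandas as pd


def set_columns_data_by_position(data, columns_data, columns):
    """Hypothetical variant of _set_columns_data that ignores index labels."""
    if isinstance(columns_data, (pd.Series, pd.DataFrame)):
        # Work on a copy and relabel its rows with the index of `data`,
        # so the label alignment done on assignment becomes a no-op.
        columns_data = columns_data.copy()
        columns_data.index = data.index

    if len(columns_data.shape) == 1:
        data[columns[0]] = columns_data
    else:
        data[columns] = columns_data

    return data


# Illustrative usage with a non-incremental index.
frame = pd.DataFrame({'integer': [1, 2, 1, 3]}, index=[3, 1, 0, 2])
result = set_columns_data_by_position(frame, pd.Series([10, 20, 10, 30]), ['reversed'])
print(result['reversed'].tolist())  # [10, 20, 10, 30]
```

This assumes columns_data has the same number of rows as data and that its rows are already in the same positional order.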
@amontanez24 added the "bug" and "pending review" labels Oct 4, 2021
@amontanez24 self-assigned this Oct 26, 2021
@amontanez24 added this to the 0.6.0 milestone Oct 26, 2021