Add sampling methods to PARSynthesizer #1083

Closed
amontanez24 opened this issue Oct 21, 2022 · 1 comment

@amontanez24
Contributor

Problem Description

As a user, I want a clear way to sample sequential data, whether or not I have to provide the context.

Acceptance criteria

Add the following methods to the PARSynthesizer:

  • sample
    • Parameters:
      • (required) num_sequences (int): The number of sequences to sample
      • sequence_length (int): The length of every sequence
        • (default) None: The model determines the length of each sequence
      • randomize_samples (bool):
        • True: Every sample() call generates different random data
        • (default) False: Every sample() call generates the same data
  • sample_sequential_columns
    • Parameters:
      • (required) context_columns (pandas.DataFrame): A DataFrame containing the rows of unchanging context values
      • sequence_length (int): The length of every sequence
        • (default) None: The model determines the length of each sequence
      • randomize_samples (bool):
        • True: Every sample_sequential_columns() call generates different random data
        • (default) False: Every sample_sequential_columns() call generates the same data
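
To make the proposed `sequence_length` and `randomize_samples` semantics concrete, here is a minimal sketch using only the standard library. It is not the SDV implementation: the module-level `sample` function, the `_FIXED_SEED` constant, and the "random length between 1 and 5" rule are all assumptions made for illustration.

```python
import random

# Hypothetical stand-in, not the SDV implementation: it only mimics the
# argument semantics proposed above using a seeded random generator.
_FIXED_SEED = 0  # assumed seed used when randomize_samples is False


def sample(num_sequences, sequence_length=None, randomize_samples=False):
    """Return `num_sequences` toy sequences as lists of floats."""
    # An unseeded generator gives different data on every call;
    # a fixed seed makes every call reproduce the same data.
    rng = random.Random() if randomize_samples else random.Random(_FIXED_SEED)
    sequences = []
    for _ in range(num_sequences):
        # With sequence_length=None the "model" picks the length itself;
        # here that is just a random length between 1 and 5 (an assumption).
        if sequence_length is None:
            length = rng.randint(1, 5)
        else:
            length = sequence_length
        sequences.append([rng.random() for _ in range(length)])
    return sequences


first = sample(5, sequence_length=10)
second = sample(5, sequence_length=10)
print(first == second)            # True: the default is reproducible
print(len(first), len(first[0]))  # 5 10
```

With `randomize_samples=True` the same call would draw from an unseeded generator, so repeated calls would be expected to differ.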

Expected behavior

>>> import pandas as pd
>>> synthesizer.sample(5, sequence_length=10, randomize_samples=True)
>>> context = pd.DataFrame({'name': ['sam', 'sam', 'sam']})
>>> synthesizer.sample_sequential_columns(context, sequence_length=5, randomize_samples=True)

Additional context

  • The current PAR model has only one sampling method, which takes many different arguments. Its functionality can essentially be split between the two methods above.
@amontanez24 added the "feature request" label on Oct 21, 2022
@amontanez24 added this to the 1.0.0 milestone on Oct 21, 2022
@npatki
Contributor

npatki commented Oct 25, 2022

When making this change, we can also fix #1052: in sample_sequential_columns, the columns provided as context_columns can be in any order. (We can simply reorder them as needed.)
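
The reordering suggested here could look roughly like the sketch below. The helper name `reorder_context_columns`, the `expected_order` argument, and the extra `age` column are hypothetical; how the synthesizer actually tracks its context column order is not specified in this issue.

```python
import pandas as pd


def reorder_context_columns(context_columns, expected_order):
    """Return `context_columns` with its columns in `expected_order`."""
    missing = [name for name in expected_order if name not in context_columns.columns]
    if missing:
        raise ValueError(f'Missing context columns: {missing}')
    # Selecting with a list of names returns a new DataFrame in that order.
    return context_columns[list(expected_order)]


# The user passes the columns in a different order than the one assumed
# to be expected by the synthesizer.
context = pd.DataFrame({'age': [30, 41], 'name': ['sam', 'alex']})
ordered = reorder_context_columns(context, ['name', 'age'])
print(list(ordered.columns))  # ['name', 'age']
```

Note that selecting by a list of names also drops any column not listed in `expected_order`; whether extra columns should instead raise an error is a design choice this sketch leaves open.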
