Initialize empty DataFrame with dataclass #60530

Open
1 of 3 tasks
brandonchinn178 opened this issue Dec 9, 2024 · 0 comments
Labels
Enhancement, Needs Triage (issue that has not been reviewed by a pandas team member)

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

I wish there were a way to ensure that a DataFrame built from a list of dataclass objects has the same schema (columns and dtypes) when the list happens to be empty as it would have if the schema were inferred from a non-empty list.
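To illustrate the problem, here is a minimal sketch. The `Point2D` dataclass and its two float fields are assumptions made up for this example; only the name appears in this issue.

```python
import dataclasses

import pandas as pd


@dataclasses.dataclass
class Point2D:  # hypothetical fields, for illustration only
    x: float
    y: float


# With at least one instance, pandas infers the columns from the dataclass.
non_empty = pd.DataFrame([Point2D(1.0, 2.0), Point2D(3.0, 4.0)])

# With an empty list there is nothing to infer from, so no columns are created.
empty = pd.DataFrame([])

print(list(non_empty.columns))  # ['x', 'y']
print(list(empty.columns))      # []
```

Downstream code that selects `df["x"]` then fails only on the empty input, which is the case I would like to avoid.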

Feature Description

One possibility is to allow passing the dataclass as pd.DataFrame(dtype=Point2D). Ref #4464.

Note that while this is similar to #4464 (and possibly blocked by #4464), I don't think it's a duplicate, since there's still the issue of building the same dtype schema that would be inferred for a dataclass object, even if we had compound dtypes.

Alternative Solutions

Current workaround:

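import dataclasses
import datetime
import typing

import numpy as np
import pandas as pd

T = typing.TypeVar("T")

def _dataframe_with_schema(data: list[T], dataclass: type[T]) -> pd.DataFrame:
    # Resolve annotations (including string annotations) to real types.
    types = typing.get_type_hints(dataclass)
    # Python types that need an explicit NumPy dtype instead of the class itself.
    overrides = {
        datetime.datetime: np.dtype("datetime64[ns]"),
    }
    # Map each field name to its dtype, in field-declaration order.
    schema = {
        field.name: overrides.get(types[field.name], types[field.name])
        for field in dataclasses.fields(dataclass)
    }

    # Passing columns explicitly keeps them present even when data is empty.
    df = pd.DataFrame(data, columns=list(schema))
    return df.astype(schema)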
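As a usage sketch, the following shows the workaround producing the same columns and dtypes for empty and non-empty input. The `Event` dataclass is hypothetical, and the helper is repeated in condensed form only so the snippet runs on its own. It assumes every field annotation is a plain type that pandas accepts as a dtype; annotations such as `Optional[int]` would need additional entries in `overrides`.

```python
import dataclasses
import datetime
import typing

import numpy as np
import pandas as pd


@dataclasses.dataclass
class Event:  # hypothetical dataclass, for illustration only
    id: int
    value: float
    at: datetime.datetime


def dataframe_with_schema(data, dataclass):
    # Same approach as the workaround, condensed so this runs standalone.
    types = typing.get_type_hints(dataclass)
    overrides = {datetime.datetime: np.dtype("datetime64[ns]")}
    schema = {
        f.name: overrides.get(types[f.name], types[f.name])
        for f in dataclasses.fields(dataclass)
    }
    return pd.DataFrame(data, columns=list(schema)).astype(schema)


empty = dataframe_with_schema([], Event)
full = dataframe_with_schema(
    [Event(1, 1.5, datetime.datetime(2024, 12, 9))], Event
)

# Compare the schema of the empty frame with that of the populated one.
print(empty.dtypes)
print(full.dtypes)
```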

Additional Context

No response

brandonchinn178 added the Enhancement and Needs Triage labels on Dec 9, 2024