Add support for numpy structured array conversion to and from #8564
Comments
Can you use |
In a literal sense, yes, but I'm trying to reduce dependence on pandas. |
Does the answer to this Stack Overflow question help? |
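For statsmodels with the formulaic interface, it seems fairly easy to write your own wrapper; see "working with large datasets" in the user guide. Something along these lines (note: not tested):
class DataSet(dict):
    # Thin mapping wrapper: looks columns up lazily in the wrapped DataFrame.
    def __init__(self, df):
        self._df = df

    def __getitem__(self, key):
        try:
            # Return the requested column as a NumPy array.
            return self._df[key].to_numpy()
        except Exception as exc:
            # Report any failed lookup as a missing key, as a mapping would.
            raise KeyError(key) from exc |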
@tikkanz Yes, it's helpful, although their return-trip method reuses the structured array they started from. That said, we can fix that like this:
|
Done; we'll have native init/export support for numpy structured/record arrays in the upcoming |
What's the syntax to export a structured array? Also, thanks for your efforts, they're much appreciated. |
Have a look at the pull request - it's all detailed there ;) |
Problem description
A while back there was this question which would have benefited from being able to convert a structured array into a polars df.
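As a rough illustration of the conversion being asked for: a structured array can already be split into named columns by hand, which is the columnar form a DataFrame constructor would typically accept. This is only a sketch; the array `rec` and its field names are made up for the example, and no polars call is shown.

```python
import numpy as np

# Hypothetical structured array with three named, differently typed fields.
rec = np.array(
    [(1, 2.5, "a"), (2, 3.5, "b")],
    dtype=[("id", "i8"), ("val", "f8"), ("tag", "U1")],
)

# Split it into one array per field; each keeps its own dtype.
cols = {name: rec[name] for name in rec.dtype.names}

for name, col in cols.items():
    print(name, col.dtype, col.tolist())
```

Native support would remove the need for this manual step in both directions.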
Another use case for to_structured_array is feeding the result to something like statsmodels so that column names and dtypes are preserved.
Right now I can do something like
but then the dtypes are all floats. I don't think that matters in the context of OLS but in other contexts it might be more important to retain dtypes.
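To make the dtype loss concrete, here is a small NumPy-only sketch (not the snippet from the issue; the column names and values are invented). Stacking columns into a plain 2D array forces a single common dtype, while a structured array keeps one dtype per named field.

```python
import numpy as np

# Hypothetical columns standing in for a DataFrame's data.
columns = {
    "x": np.array([1, 2, 3], dtype=np.int64),
    "y": np.array([0.5, 1.5, 2.5], dtype=np.float64),
    "flag": np.array([True, False, True]),
}

# A plain 2D array has one dtype, so every column is upcast to float.
plain = np.column_stack(list(columns.values()))

# A structured array keeps a separate dtype (and name) for each field.
dtype = [(name, col.dtype) for name, col in columns.items()]
structured = np.empty(3, dtype=dtype)
for name, col in columns.items():
    structured[name] = col

print(plain.dtype)
print(structured.dtype)
```

The structured version is what a dtype-preserving export would look like on the NumPy side.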
Alternatively, other packages like patsy and formulaic produce their own Matrix datatype, which is just an np.ndarray that retains the column names in a way statsmodels recognizes. I haven't dug into the source code enough to see how that works, but it would be enough for me, although perhaps not as extensible.