Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: fix Dataframe.to_records examples #22419

Merged
merged 9 commits into from
Aug 21, 2018
2 changes: 1 addition & 1 deletion ci/doctests.sh
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ if [ "$DOCTEST" ]; then

# DataFrame / Series docstrings
pytest --doctest-modules -v pandas/core/frame.py \
-k"-assign -axes -combine -isin -itertuples -join -nlargest -nsmallest -nunique -pivot_table -quantile -query -reindex -reindex_axis -replace -round -set_index -stack -to_dict -to_records -to_stata -transform"
-k"-assign -axes -combine -isin -itertuples -join -nlargest -nsmallest -nunique -pivot_table -quantile -query -reindex -reindex_axis -replace -round -set_index -stack -to_dict -to_stata -transform"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this on-purpose?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was asked to make this change by @TomAugspurger in issue #18160. I believe the point is to check that the examples are correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a whitelist of doctests that are allowed to fail till we fix them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, going to open an issue about this


if [ $? -ne "0" ]; then
RET=1
Expand Down
42 changes: 17 additions & 25 deletions pandas/core/frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -1337,22 +1337,25 @@ def to_records(self, index=True, convert_datetime64=None):
"""
Convert DataFrame to a NumPy record array.

Index will be put in the 'index' field of the record array if
Index will be included as the first field of the record array if
requested.

Parameters
----------
index : boolean, default True
Include index in resulting record array, stored in 'index' field.
convert_datetime64 : boolean, default None
index : bool, default True
Include index in resulting record array, stored in 'index'
field or using the index label, if set.
convert_datetime64 : bool, default None
.. deprecated:: 0.23.0

Whether to convert the index to datetime.datetime if it is a
DatetimeIndex.

Returns
-------
y : numpy.recarray
numpy.recarray
NumPy ndarray with the DataFrame labels as fields and each row
of the DataFrame as entries.

See Also
--------
Expand All @@ -1374,31 +1377,20 @@ def to_records(self, index=True, convert_datetime64=None):
rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
dtype=[('index', 'O'), ('A', '<i8'), ('B', '<f8')])

If the DataFrame index has no label then the recarray field name
is set to 'index'. If the index has a label then this is used as the
field name:

>>> df.index = df.index.rename("I")
>>> df.to_records()
rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
dtype=[('I', 'O'), ('A', '<i8'), ('B', '<f8')])

The index can be excluded from the record array:

>>> df.to_records(index=False)
rec.array([(1, 0.5 ), (2, 0.75)],
dtype=[('A', '<i8'), ('B', '<f8')])

By default, timestamps are converted to `datetime.datetime`:

>>> df.index = pd.date_range('2018-01-01 09:00', periods=2, freq='min')
>>> df
A B
2018-01-01 09:00:00 1 0.50
2018-01-01 09:01:00 2 0.75
>>> df.to_records()
rec.array([(datetime.datetime(2018, 1, 1, 9, 0), 1, 0.5 ),
(datetime.datetime(2018, 1, 1, 9, 1), 2, 0.75)],
dtype=[('index', 'O'), ('A', '<i8'), ('B', '<f8')])

The timestamp conversion can be disabled so NumPy's datetime64
data type is used instead:

>>> df.to_records(convert_datetime64=False)
rec.array([('2018-01-01T09:00:00.000000000', 1, 0.5 ),
('2018-01-01T09:01:00.000000000', 2, 0.75)],
dtype=[('index', '<M8[ns]'), ('A', '<i8'), ('B', '<f8')])
"""

if convert_datetime64 is not None:
Expand Down