DOC: Accessing files from a S3 bucket. #23639
Add some documentation about accessing files from a remote S3 bucket in pandas. pandas-dev#12206
Codecov Report
@@ Coverage Diff @@
## master #23639 +/- ##
==========================================
- Coverage 92.24% 92.24% -0.01%
==========================================
Files 161 161
Lines 51326 51314 -12
==========================================
- Hits 47347 47335 -12
Misses 3979 3979
doc/source/cookbook.rst
Load a file from S3
-------------------

Pandas support loading files from a S3 bucket for remote file interactivity.
NIT: But I think this should read 'Pandas supports loading files'
This looks good to me - CC. @jreback and @datapythonista
We already have this (https://pandas.pydata.org/pandas-docs/stable/io.html#reading-remote-files). I'd expand that section if anything is missing. Since there is a page specific to IO operations, I wouldn't expect many people to look in the cookbook for S3 options. Does that make sense?
@datapythonista I've expanded the section in io and removed it from the cookbook.
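For readers following the thread, here is a minimal sketch of the feature being documented: passing an `s3://` URL to a pandas reader. It is not taken from the PR itself. The helper names `s3_url` and `load_csv_from_s3`, and the bucket and key in the usage note, are hypothetical. The sketch assumes the optional `s3fs` package is installed and that AWS credentials are configured for any non-public bucket.

```python
def s3_url(bucket, key):
    """Build an s3:// URL from a bucket name and an object key.

    Hypothetical helper for illustration; not part of pandas.
    """
    return "s3://{}/{}".format(bucket, key.lstrip("/"))


def load_csv_from_s3(bucket, key, **read_csv_kwargs):
    """Read a CSV object stored in S3 into a DataFrame.

    Assumes s3fs is installed; pandas relies on it to open s3:// URLs.
    """
    # Imported here so the URL helper stays usable without pandas installed.
    import pandas as pd

    return pd.read_csv(s3_url(bucket, key), **read_csv_kwargs)
```

A call such as `load_csv_from_s3("my-example-bucket", "data/tips.csv")` would then return a DataFrame, provided the bucket and object exist and are readable.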
Thanks for updating @myles. That looks good to me, but I don't know much about S3, so will let other people review.
thanks!
* upstream/master: (25 commits)
  DOC: Delete trailing blank lines in docstrings. (pandas-dev#23651)
  DOC: Change release and whatsnew (pandas-dev#21599)
  DOC: Fix format of the See Also descriptions (pandas-dev#23654)
  DOC: update pandas.core.groupby.DataFrameGroupBy.resample docstring. (pandas-dev#20374)
  ENH: Allow export of mixed columns to Stata strl (pandas-dev#23692)
  CLN: Remove unnecessary code (pandas-dev#23696)
  Pin flake8-rst version (pandas-dev#23699)
  Implement _most_ of the EA interface for DTA/TDA (pandas-dev#23643)
  CI: raise clone depth limit on CI
  BUG: Fix Series/DataFrame.rank(pct=True) with more than 2**24 rows (pandas-dev#23688)
  REF: Move Excel names parameter handling to CSV (pandas-dev#23690)
  DOC: Accessing files from a S3 bucket. (pandas-dev#23639)
  Fix errorbar visualization (pandas-dev#23674)
  DOC: Surface / doc mangle_dupe_cols in read_excel (pandas-dev#23678)
  DOC: Update is_sparse docstring (pandas-dev#19983)
  BUG: Fix read_excel w/parse_cols & empty dataset (pandas-dev#23661)
  Add to_flat_index method to MultiIndex (pandas-dev#22866)
  CLN: Move to_excel to generic.py (pandas-dev#23656)
  TST: IntervalTree.get_loc_interval should return platform int (pandas-dev#23660)
  CI: Allow to compile docs with ipython 7.11 pandas-dev#22990 (pandas-dev#23655)
  ...
git diff upstream/master -u -- "*.py" | flake8 --diff