DOC: Accessing files from a S3 bucket. #23639

myles · 2018-11-12T16:43:03Z

Add some documentation about accessing files from a remote S3 bucket in pandas.

closes DOC: improve s3 access doc-strings / docs #12206
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

Add some documentation about accessing files from a remote S3 bucket in pandas. pandas-dev#12206

codecov · 2018-11-12T17:20:12Z

Codecov Report

Merging #23639 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #23639      +/-   ##
==========================================
- Coverage   92.24%   92.24%   -0.01%     
==========================================
  Files         161      161              
  Lines       51326    51314      -12     
==========================================
- Hits        47347    47335      -12     
  Misses       3979     3979

| Flag | Coverage Δ | |
|---|---|---|
| #multiple | `90.63% <ø> (-0.01%)` | ⬇️ |
| #single | `42.31% <ø> (-0.02%)` | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| pandas/io/common.py | `70.54% <0%> (-0.23%)` | ⬇️ |
| pandas/io/parquet.py | `84.61% <0%> (-0.15%)` | ⬇️ |
| pandas/io/html.py | `91.2% <0%> (-0.06%)` | ⬇️ |
| pandas/core/series.py | `93.68% <0%> (-0.03%)` | ⬇️ |
| pandas/io/json/json.py | `93.09% <0%> (-0.02%)` | ⬇️ |
| pandas/core/frame.py | `97.02% <0%> (-0.01%)` | ⬇️ |
| pandas/io/parsers.py | `95.55% <0%> (-0.01%)` | ⬇️ |
| pandas/tseries/offsets.py | `97.07% <0%> (-0.01%)` | ⬇️ |
| pandas/core/groupby/ops.py | `96.79% <0%> (ø)` | ⬆️ |
| pandas/plotting/_core.py | `83.63% <0%> (ø)` | ⬆️ |

... and 4 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b9ba708...21487b1. Read the comment docs.
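
The headline figures in the Coverage Diff can be checked by hand from the Hits and Lines rows. The sketch below assumes that coverage is simply hits divided by lines; the helper name `coverage_percent` is made up for illustration and is not part of Codecov.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Return line coverage as a percentage (assumed formula: hits / lines)."""
    return 100.0 * hits / lines


# Figures copied from the Coverage Diff in the report.
master_hits, master_lines, master_misses = 47347, 51326, 3979
pr_hits, pr_lines, pr_misses = 47335, 51314, 3979

master = coverage_percent(master_hits, master_lines)
pr = coverage_percent(pr_hits, pr_lines)
delta = pr - master  # difference in percentage points

print(f"master: {master:.4f}%")
print(f"#23639: {pr:.4f}%")
print(f"delta:  {delta:+.4f} percentage points")
```

Under that assumption the change is a small fraction of a percentage point, which is consistent with both columns displaying 92.24% and with the "decrease coverage by <.01%" summary. Misses stay at 3979 while Lines and Hits each drop by 12.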

alimcmaster1 · 2018-11-12T22:14:45Z

doc/source/cookbook.rst

+Load a file from S3
+-------------------
+
+Pandas support loading files from a S3 bucket for remote file interactivity.


NIT: But I think this should read 'Pandas supports loading files'

alimcmaster1 · 2018-11-12T22:24:20Z

This looks good to me - CC. @jreback and @datapythonista

datapythonista · 2018-11-12T22:51:05Z

We already have this: https://pandas.pydata.org/pandas-docs/stable/io.html#reading-remote-files

I'd expand that section if anything is missing. I wouldn't expect many people to look in the cookbook for S3 options when there is a page specific to IO operations.

Does it make sense?

myles · 2018-11-13T14:59:00Z

@datapythonista I've expanded the section in io and removed it from the cookbook.
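
For context, here is a minimal sketch of the kind of S3 read this documentation is about. It is not the text added in this PR. The helper names `s3_url` and `read_csv_from_s3` and the bucket and key in the usage comment are hypothetical; the sketch assumes the optional `s3fs` package is installed and that AWS credentials are configured if the bucket is private.

```python
def s3_url(bucket: str, key: str) -> str:
    """Build an s3:// URL from a bucket name and an object key (hypothetical helper)."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def read_csv_from_s3(bucket: str, key: str, **read_csv_kwargs):
    """Read a CSV object stored in S3 into a DataFrame (hypothetical helper).

    pandas is imported lazily so that s3_url works without pandas installed.
    pandas.read_csv accepts s3:// URLs; the actual transfer relies on the
    optional s3fs dependency.
    """
    import pandas as pd

    return pd.read_csv(s3_url(bucket, key), **read_csv_kwargs)


# Usage (needs network access, s3fs, and a bucket you can read):
# df = read_csv_from_s3("my-example-bucket", "data/example.csv")
# print(df.head())
```

Passing the URL straight to `pd.read_csv` keeps the call identical to reading a local file, which is the main point the io section makes about remote files.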

datapythonista

Thanks for updating @myles. That looks good to me, but I don't know much about S3, so will let other people review.

jreback · 2018-11-14T13:58:17Z

thanks!

* upstream/master: (25 commits)
  DOC: Delete trailing blank lines in docstrings. (pandas-dev#23651)
  DOC: Change release and whatsnew (pandas-dev#21599)
  DOC: Fix format of the See Also descriptions (pandas-dev#23654)
  DOC: update pandas.core.groupby.DataFrameGroupBy.resample docstring. (pandas-dev#20374)
  ENH: Allow export of mixed columns to Stata strl (pandas-dev#23692)
  CLN: Remove unnecessary code (pandas-dev#23696)
  Pin flake8-rst version (pandas-dev#23699)
  Implement _most_ of the EA interface for DTA/TDA (pandas-dev#23643)
  CI: raise clone depth limit on CI
  BUG: Fix Series/DataFrame.rank(pct=True) with more than 2**24 rows (pandas-dev#23688)
  REF: Move Excel names parameter handling to CSV (pandas-dev#23690)
  DOC: Accessing files from a S3 bucket. (pandas-dev#23639)
  Fix errorbar visualization (pandas-dev#23674)
  DOC: Surface / doc mangle_dupe_cols in read_excel (pandas-dev#23678)
  DOC: Update is_sparse docstring (pandas-dev#19983)
  BUG: Fix read_excel w/parse_cols & empty dataset (pandas-dev#23661)
  Add to_flat_index method to MultiIndex (pandas-dev#22866)
  CLN: Move to_excel to generic.py (pandas-dev#23656)
  TST: IntervalTree.get_loc_interval should return platform int (pandas-dev#23660)
  CI: Allow to compile docs with ipython 7.11 pandas-dev#22990 (pandas-dev#23655)
  ...

…fixed

* upstream/master:
  DOC: Delete trailing blank lines in docstrings. (pandas-dev#23651)
  DOC: Change release and whatsnew (pandas-dev#21599)
  DOC: Fix format of the See Also descriptions (pandas-dev#23654)
  DOC: update pandas.core.groupby.DataFrameGroupBy.resample docstring. (pandas-dev#20374)
  ENH: Allow export of mixed columns to Stata strl (pandas-dev#23692)
  CLN: Remove unnecessary code (pandas-dev#23696)
  Pin flake8-rst version (pandas-dev#23699)
  Implement _most_ of the EA interface for DTA/TDA (pandas-dev#23643)
  CI: raise clone depth limit on CI
  BUG: Fix Series/DataFrame.rank(pct=True) with more than 2**24 rows (pandas-dev#23688)
  REF: Move Excel names parameter handling to CSV (pandas-dev#23690)
  DOC: Accessing files from a S3 bucket. (pandas-dev#23639)
  Fix errorbar visualization (pandas-dev#23674)
  DOC: Surface / doc mangle_dupe_cols in read_excel (pandas-dev#23678)
  DOC: Update is_sparse docstring (pandas-dev#19983)
  BUG: Fix read_excel w/parse_cols & empty dataset (pandas-dev#23661)
  Add to_flat_index method to MultiIndex (pandas-dev#22866)
  CLN: Move to_excel to generic.py (pandas-dev#23656)
  TST: IntervalTree.get_loc_interval should return platform int (pandas-dev#23660)

📝 Add docs for accessing files from S3.

6551a42

myles changed the title ~~Add Documentation for accessing files from S3.~~ DOC: Accessing files from a S3 bucket. Nov 12, 2018

alimcmaster1 requested changes Nov 12, 2018

datapythonista added the Docs and IO Data (IO issues that don't fit into a more specific label) labels Nov 12, 2018

✏️ Move S3 docs from cookbook to io.

21487b1

datapythonista approved these changes Nov 13, 2018

jreback added this to the 0.24.0 milestone Nov 14, 2018

jreback merged commit 6f8c6e1 into pandas-dev:master Nov 14, 2018

JustinZhengBC pushed a commit to JustinZhengBC/pandas that referenced this pull request Nov 14, 2018

DOC: Accessing files from a S3 bucket. (pandas-dev#23639)

c8ac3bf

tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018

DOC: Accessing files from a S3 bucket. (pandas-dev#23639)

c06e26c

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

DOC: Accessing files from a S3 bucket. (pandas-dev#23639)

f438dd9

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

DOC: Accessing files from a S3 bucket. (pandas-dev#23639)

551a918
