BUG: Fix loading files from S3 with # characters in URL (GH25945) (#2…
swt2c authored and WillAyd committed Apr 9, 2019
1 parent af0ecbe commit 2f6b90a
Showing 4 changed files with 8 additions and 1 deletion.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.25.0.rst
@@ -361,6 +361,7 @@ I/O
- Bug in ``read_csv`` which would not raise ``ValueError`` if a column index in ``usecols`` was out of bounds (:issue:`25623`)
- Improved the explanation for the failure when value labels are repeated in Stata dta files and suggested work-arounds (:issue:`25772`)
- Improved :meth:`pandas.read_stata` and :class:`pandas.io.stata.StataReader` to read incorrectly formatted 118 format files saved by Stata (:issue:`25960`)
+- Fixed bug in loading objects from S3 that contain ``#`` characters in the URL (:issue:`25945`)

Plotting
^^^^^^^^
2 changes: 1 addition & 1 deletion pandas/io/s3.py
@@ -10,7 +10,7 @@

 def _strip_schema(url):
     """Returns the url without the s3:// part"""
-    result = parse_url(url)
+    result = parse_url(url, allow_fragments=False)
     return result.netloc + result.path
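
The effect of `allow_fragments=False` can be reproduced with the standard library alone. The sketch below assumes that `parse_url` behaves like `urllib.parse.urlparse` (this page does not show the import), and `strip_schema` is a hypothetical stand-in for the private `_strip_schema` helper, not the pandas function itself.

```python
from urllib.parse import urlparse

# Same URL as the new test in this commit.
url = "s3://pandas-test/tips#1.csv"


def strip_schema(url, allow_fragments):
    """Hypothetical stand-in for _strip_schema: bucket plus key, no scheme."""
    result = urlparse(url, allow_fragments=allow_fragments)
    return result.netloc + result.path


# Default parsing treats everything after '#' as a URL fragment,
# so the fragment is split off and the S3 key is truncated.
before_fix = strip_schema(url, allow_fragments=True)

# With fragments disabled, '#' stays in the path and the key survives.
after_fix = strip_schema(url, allow_fragments=False)

print(before_fix)  # pandas-test/tips
print(after_fix)   # pandas-test/tips#1.csv
```

In S3, `#` is an ordinary character in an object key, so dropping the fragment would point the reader at a different (usually nonexistent) object.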


1 change: 1 addition & 0 deletions pandas/tests/io/conftest.py
@@ -59,6 +59,7 @@ def s3_resource(tips_file, jsonl_file):
moto = pytest.importorskip('moto')

test_s3_files = [
+('tips#1.csv', tips_file),
('tips.csv', tips_file),
('tips.csv.gz', tips_file + '.gz'),
('tips.csv.bz2', tips_file + '.bz2'),
5 changes: 5 additions & 0 deletions pandas/tests/io/parser/test_network.py
@@ -198,3 +198,8 @@ def test_read_csv_chunked_download(self, s3_resource, caplog):
read_csv("s3://pandas-test/large-file.csv", nrows=5)
# log of fetch_range (start, stop)
assert ((0, 5505024) in {x.args[-2:] for x in caplog.records})
+
+    def test_read_s3_with_hash_in_key(self, tips_df):
+        # GH 25945
+        result = read_csv('s3://pandas-test/tips#1.csv')
+        tm.assert_frame_equal(tips_df, result)
