Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a BaseBlobService.list_blob_names method #545

Merged
merged 2 commits into from
Feb 15, 2019

Conversation

xhochy
Copy link
Contributor

@xhochy xhochy commented Jan 24, 2019

This can currently also be achieved via [b.name for b in bbs.list_blobs()] but that parses the full XML and discards most of the parsed information again. With this change listing the blob names is not anymore CPU-bound for us.

This can currently also be achieved via
`[b.name for b in bbs.list_blobs()]` but that parses the full XML and
discards most of the parsed information again. With this change listing
the blob names is not anymore CPU-bound for us.
@msftclas
Copy link

msftclas commented Jan 24, 2019

CLA assistant check
All CLA requirements met.

@xhochy
Copy link
Contributor Author

xhochy commented Jan 24, 2019

Travis fails as the casette in the tests needs to be recorded. Any advice on this?

@zezha-msft
Copy link
Contributor

Hi @xhochy, thanks for the PR!

The additional method seems to be very specific to your use case. Is it possible to keep it in your application code instead?

@xhochy
Copy link
Contributor Author

xhochy commented Jan 25, 2019

I can keep that in my application code but I made a PR here as I don't think that it is specific for my use case but rather generally useful.

When working with object stores, often people are only interested in the filenames and not any additional information. The additional information information is often not required but in my measurements, a request for 5000 keys took 20-30s whereof 3s were for the request itself and the remaining time was spent on parsing the XML and constructing the Blob objects.

I would use the code in this PR in https://github.com/mbr/simplekv and an ABS-compatible version of https://github.com/martindurant/filesystem_spec / https://github.com/dask/s3fs for dask. Thus I hope this is enough to convince that the method is generally useful. I will for now copy it into the specific applications but hope to still get it merged here.

@xhochy
Copy link
Contributor Author

xhochy commented Jan 25, 2019

I've opened a PR over at mbr/simplekv#88 to use the code directly instead of changing anything in azure-storage-blob but I'm not really happy as it uses a lot of the private APIs of azure-storage-*.

@zezha-msft
Copy link
Contributor

Hi @xhochy, thanks for getting back to me.

I see your point, and tend to agree with you. But let me talk to the Team first, to make sure.

In the mean time, could you please run the test in record mode? To generate the recording file.

@paularmand
Copy link

I would like to back this request. We also have a need for just filenames.

@zezha-msft
Copy link
Contributor

@seguler what do you think?

@xhochy
Copy link
Contributor Author

xhochy commented Feb 8, 2019

@seguler @zezha-msft Any updates in whether this can go in?

Copy link
Contributor

@seguler seguler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me.

@jank
Copy link

jank commented Feb 12, 2019

Is there a plan or schedule on when this PR will be included in a release?

@zezha-msft zezha-msft changed the base branch from master to dev February 12, 2019 23:40
@zezha-msft zezha-msft changed the base branch from dev to master February 12, 2019 23:40
@zezha-msft
Copy link
Contributor

Hi @BY-jk, I was planning on making a release by the end of this week. Is there a timeline requirement?

@jank
Copy link

jank commented Feb 13, 2019

@zezha-msft - end of this week would be great. While we do have workarounds in place, this will allow for a number of down-stream libraries to avoid adding more custom code. Thanks for getting back and pushing this forward. Very much appreciated.

@zezha-msft zezha-msft merged commit f97958b into Azure:master Feb 15, 2019
@zezha-msft
Copy link
Contributor

@xhochy @BY-jk this change was just released. Thanks for your patience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants