Async Methods #35
Comments
This can be a good thing if you're writing a very busy service, I agree. We could prototype this in some way and compare the performance. Performance will be slightly lower on a single thread for sure, but with multiple concurrent transforms it could be a big win.
Async won't make it faster, but it will make it more scalable. For I/O, async is definitely the way to go. Recent releases of .NET Core brought a lot of async enhancements, such as performance and memory allocation improvements and new APIs. Do you have plans to drop support for old .NET versions and migrate to .NET Core 3.1/.NET 5? Span can also bring performance improvements; many new async APIs use Span to avoid copying memory.
Thrift for netstd supports only .NET 4.6.1+: https://github.com/apache/thrift/blob/master/lib/netstd/README.md
The issue is that the Thrift part plays a minuscule role in performance, and making it async will actually slow things down a bit. You want async in the core, where data pages are read from disk.
@aloneguid are you talking about APIs using BinaryReader/BinaryWriter? I sent a PR where I changed ParquetActor.ReadMetadata and ParquetRowGroupReader.ReadColumn to async, for example, but yes, I started it based on the Thrift APIs. BinaryReader/BinaryWriter still don't have async overloads, so we may need to read directly from Stream, or use an open-source alternative: https://github.com/ronnieoverby/AsyncBinaryReaderWriter
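To illustrate the "read directly from Stream" option, here is a minimal sketch of an async counterpart to BinaryReader.ReadInt32. The class and method names (AsyncStreamReaderSketch, ReadExactlyAsync, ReadInt32Async) are hypothetical and are not part of parquet-dotnet, Thrift, or the linked library. It relies only on Stream.ReadAsync and assumes little-endian encoding, which is what BinaryReader uses.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not an existing library API.
public static class AsyncStreamReaderSketch
{
    // Reads exactly `count` bytes. A loop is needed because Stream.ReadAsync
    // may return fewer bytes than requested.
    public static async Task<byte[]> ReadExactlyAsync(
        Stream stream, int count, CancellationToken ct = default)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = await stream
                .ReadAsync(buffer, offset, count - offset, ct)
                .ConfigureAwait(false);
            if (read == 0)
                throw new EndOfStreamException();
            offset += read;
        }
        return buffer;
    }

    // Async counterpart of BinaryReader.ReadInt32 (little-endian).
    public static async Task<int> ReadInt32Async(
        Stream stream, CancellationToken ct = default)
    {
        byte[] b = await ReadExactlyAsync(stream, sizeof(int), ct)
            .ConfigureAwait(false);
        // Assemble the value by hand so the result does not depend on host endianness.
        return b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24);
    }
}
```

A caller would use it as `int length = await AsyncStreamReaderSketch.ReadInt32Async(stream);`. Each call allocates a small buffer, so a real implementation would probably reuse or pool buffers; whether this beats the synchronous path would still need to be measured.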
Partially blocked by dotnet/runtime#17229
Closing as a very old issue that is unlikely to be worked on.
Kinda picking up off of https://github.com/elastacloud/parquet-dotnet/issues/20
I see now that the thrift lib ONLY has async methods, and the older clients are being removed next release.
I'm not sure how often parquet updates its thrift spec, or how often thrift adds new features or w/e though, so this might not be a big deal.
Curious whether you have done any testing in the past to see what kind of negative performance impact making some of the reading/writing async has?
Was considering helping and trying to update to the latest netstd thrift generation, but I'm more interested in making this faster... not slower ;)