simplify interactions with arrow flight APIs (#377)
* simplify interactions with arrow flight APIs

  Initial work to implement some basic traits

* more polishing and introduction of a couple of wrapper types

  Some more polishing of the basic code I provided last week.

* More polishing

  Add support for representing tickets as base64 encoded strings.

  Also: more polishing of Display, etc...

* improve BOOLEAN writing logic and report error on encoding fail

  When writing BOOLEAN data, writing more than 2048 rows of data will
  overflow the hard-coded 256-byte buffer set for the bit-writer in the
  PlainEncoder (256 bytes * 8 bits = 2048 values). Once this occurs,
  further attempts to write to the encoder fail, because capacity is
  exceeded, but the errors are silently ignored.

  This fix improves the error detection and reporting at the point of
  encoding and modifies the logic for bit_writing (BOOLEANS). The
  bit_writer is initially allocated 256 bytes (as at present), then each
  time the capacity is exceeded the capacity is incremented by another
  256 bytes.

  This certainly resolves the current problem, but it's not exactly a
  great fix because the capacity of the bit_writer could now grow
  substantially.

  Other data types seem to have a more sophisticated mechanism for
  writing data which doesn't involve growing or having a fixed size
  buffer. It would be desirable to make the BOOLEAN type use this same
  mechanism if possible, but that level of change is more intrusive and
  probably requires greater knowledge of the implementation than I
  possess.

  resolves: #349

* only manipulate the bit_writer for BOOLEAN data

  Tacky, but I can't think of a better way to do this without
  specialization.

* better isolation of changes

  Remove the byte tracking from the PlainEncoder and use the existing
  bytes_written() method in BitWriter. This is neater.

* add test for boolean writer

  The test ensures that we can write > 2048 rows to a parquet file and
  that when we read the data back, it finishes without hanging (defined
  as taking < 5 seconds).

  If we don't want that extra complexity, we could remove the
  thread/channel stuff and just try to read the file and let the test
  runner terminate hanging tests.

* fix capacity calculation error in bool encoding

  The values.len() reports the number of values to be encoded and so
  must be divided by 8 (bits in a byte) to determine the effect on the
  byte capacity of the bit_writer.

* make BasicAuth accessible

  Following merge with master, make sure this is exposed so that
  integration tests work.

  also: there has been a release since I last looked at this so update
  the deprecation warnings.

* fix documentation for ipc_message_from_arrow_schema

  TryFrom, not From

* replace deprecated functions in integration tests with traits

  clippy complains about using deprecated functions, so replace them
  with the new trait support.

  also: fix the trait documentation

* address review comments

  - update deprecated warnings
  - improve TryFrom for DescriptorType
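
For illustration only, the BOOLEAN fix and its timeout-guarded test described above could be sketched roughly as follows. This is not the parquet crate's code: `GrowableBitWriter`, `encode_bools`, `encode_with_timeout` and `CHUNK_BYTES` are hypothetical names, and the sketch encodes in memory rather than writing and reading a parquet file. It assumes only what the message states: a 256-byte initial buffer, growth in 256-byte steps, a capacity check based on `values.len()` divided by 8, an error instead of a silent failure when the buffer is full, and a worker thread plus channel with a 5 second limit.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical growth step, mirroring the 256-byte allocation in the message.
const CHUNK_BYTES: usize = 256;

// Hypothetical stand-in for a bit writer; NOT the parquet crate's BitWriter.
struct GrowableBitWriter {
    buffer: Vec<u8>,
    bits_written: usize,
}

impl GrowableBitWriter {
    fn new() -> Self {
        Self { buffer: vec![0; CHUNK_BYTES], bits_written: 0 }
    }

    fn capacity(&self) -> usize {
        self.buffer.len()
    }

    // Bytes touched so far, rounding a partial byte up.
    fn bytes_written(&self) -> usize {
        (self.bits_written + 7) / 8
    }

    // Report an error when the buffer is full instead of silently ignoring it.
    fn put_bool(&mut self, value: bool) -> Result<(), String> {
        let byte = self.bits_written / 8;
        if byte >= self.buffer.len() {
            return Err(format!("bit writer full at {} bytes", self.buffer.len()));
        }
        if value {
            self.buffer[byte] |= 1 << (self.bits_written % 8);
        }
        self.bits_written += 1;
        Ok(())
    }

    // Grow in CHUNK_BYTES steps until `extra_values` more bits fit.
    // Values are bits, so the byte requirement is the value count divided by 8.
    fn reserve_values(&mut self, extra_values: usize) {
        let needed_bytes = (self.bits_written + extra_values + 7) / 8;
        while self.buffer.len() < needed_bytes {
            let new_len = self.buffer.len() + CHUNK_BYTES;
            self.buffer.resize(new_len, 0);
        }
    }
}

// Encode a slice of booleans, growing the writer first and surfacing errors.
fn encode_bools(writer: &mut GrowableBitWriter, values: &[bool]) -> Result<(), String> {
    writer.reserve_values(values.len());
    for &v in values {
        writer.put_bool(v)?;
    }
    Ok(())
}

// Run the encoding on a worker thread and give up after `timeout`,
// so a hang shows up as an error rather than a stuck test.
fn encode_with_timeout(values: Vec<bool>, timeout: Duration) -> Result<usize, String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut writer = GrowableBitWriter::new();
        let result = encode_bools(&mut writer, &values).map(|_| writer.bytes_written());
        let _ = tx.send(result);
    });
    rx.recv_timeout(timeout)
        .map_err(|e| format!("timed out or worker failed: {}", e))?
}
```

Calling `encode_with_timeout(vec![true; 3000], Duration::from_secs(5))` exercises the path that used to overflow: 3000 values need 375 bytes, so the sketch grows the buffer once to 512 bytes before writing.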