Reading of a data source #11
Comments
@loleg It reads the data source and casts the data according to the schema (if provided). It also doesn't load the whole data file into memory; it streams one row at a time.
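To make the behaviour described above concrete, here is a minimal Python sketch of streaming rows one at a time and casting each cell against a schema. The `SCHEMA` mapping and the `iter_rows` helper are hypothetical names invented for illustration; they are not part of tableschema-py or TableSchema.jl.

```python
import csv
import io

# Hypothetical schema: field name -> cast function (an assumption for this sketch).
SCHEMA = {"id": int, "name": str, "price": float}


def iter_rows(source, schema=None):
    """Yield one row at a time from a CSV text stream, cast if a schema is given."""
    reader = csv.reader(source)
    headers = next(reader, None)
    if headers is None:
        return  # empty source: nothing to yield
    for row in reader:
        if schema is None:
            yield row  # no schema provided: pass raw strings through
        else:
            yield [schema[h](v) for h, v in zip(headers, row)]


# An in-memory stream stands in for a file; only one row is held at a time.
data = io.StringIO("id,name,price\n1,apple,0.5\n2,pear,1.25\n")
rows = list(iter_rows(data, SCHEMA))
```

Because `iter_rows` is a generator, the caller decides how many rows to pull; collecting them with `list` is only done here to inspect the result.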
As per #4 I think we should continue the discussion of the streaming library choice here. In reviewing the implementation notes, I wondered:
Is this really a generator, or an iterator? In Julia, iterators are made by implementing the iteration protocol for a type. Since in our case generators are being used for streaming data, I am inclined to think that Channels would be more appropriate and performant. But the most sensible course of action right now would probably be simply to follow whatever the data-loading approach we settle on recommends.
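As a rough illustration of the distinction raised above, the Python sketch below shows the same row stream written once as a type that implements the iterator protocol and once as a generator function. It is only an analogy: Julia's iteration protocol and Channels differ in detail, and the names `RowIterator` and `row_generator` are hypothetical.

```python
class RowIterator:
    """Iterator protocol implemented on a type (hypothetical name)."""

    def __init__(self, lines):
        self._lines = iter(lines)

    def __iter__(self):
        return self

    def __next__(self):
        # next() raises StopIteration when the lines run out, ending iteration.
        line = next(self._lines)
        return line.rstrip("\n").split(",")


def row_generator(lines):
    """Generator function: the interpreter keeps the iteration state for us."""
    for line in lines:
        yield line.rstrip("\n").split(",")


lines = ["a,b\n", "1,2\n"]
from_iterator = list(RowIterator(lines))
from_generator = list(row_generator(lines))
```

Both forms produce the same rows lazily; the difference is whether the iteration state lives in an explicit object or in the suspended generator frame.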
@loleg
rows = table.iter(keyed=True)
for row in rows:
    print(row)
Probably a good idea could be to implement the simplest
@roll I have updated this issue description with my current understanding; please confirm.
I've prototyped some approaches that include an external library, but I would rather avoid heavy dependencies and slow-running unit tests. Therefore I have made the Table
I'm still a bit stuck on the way forward here. I've been considering implementing a DataStreams.jl interface or supporting IterableTables.jl.
In addition to the basic streamed loading of data sources into a Table as per #6, provide the ability to read through a Table with cast on iteration, so that the individual cells of the table are formatted according to the Schema provided.
The read API should be accessible for a Table class.
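One possible shape for such a class, sketched in Python under stated assumptions: the `Table` below is a hypothetical stand-in, not the tableschema-py or TableSchema.jl implementation. The schema is assumed to be a plain mapping from field name to cast function, and the `limit` parameter is an illustrative addition.

```python
class Table:
    """Hypothetical sketch of a Table with cast-on-iteration and a read API."""

    def __init__(self, headers, raw_rows, schema=None):
        self.headers = headers
        self._raw_rows = raw_rows   # any re-iterable of lists of strings
        self.schema = schema or {}  # assumed: field name -> cast function

    def iter(self, keyed=False):
        """Yield rows one at a time, casting each cell as it is read."""
        for raw in self._raw_rows:
            # Fields without a schema entry are left as strings.
            row = [self.schema.get(h, str)(v) for h, v in zip(self.headers, raw)]
            yield dict(zip(self.headers, row)) if keyed else row

    def read(self, keyed=False, limit=None):
        """Collect rows from iter() into a list, optionally stopping early."""
        rows = []
        for count, row in enumerate(self.iter(keyed=keyed)):
            if limit is not None and count >= limit:
                break
            rows.append(row)
        return rows


table = Table(["id", "name"], [["1", "a"], ["2", "b"]], {"id": int})
keyed_rows = table.read(keyed=True)
first_only = table.read(limit=1)
```

Building `read` on top of `iter` keeps the casting logic in one place, so streaming and eager callers see identically formatted cells.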