Message printed to console on `read.fst` #181

jangorecki · 2018-12-02T05:28:00Z

When reading an fst file, an extra message about loading the data.table package is printed to the console. There should be an option to suppress that message.

fst::write.fst(iris, "iris.fst")
ir=fst::read.fst("iris.fst")
Loading required namespace: data.table

After investigating, I found that reading an fst file actually requires the data.table package to be installed, while DESCRIPTION lists it only as a suggested dependency (`Suggests`). Any use of data.table should be properly escaped in that case. When we try to read an fst file without data.table installed, we get the following error:

fst::write.fst(iris, "iris.fst")
ir=fst::read.fst("iris.fst")
Loading required namespace: data.table
Failed with error:  'there is no package called 'data.table''
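A minimal sketch of what "properly escaped" could look like for a suggested package. The helper name `as_table_if_available` and its `as_data_table` argument are hypothetical and are not taken from fst's source; the point is only that `requireNamespace()` is called with `quietly = TRUE` and only on the code path that actually needs data.table.

```r
# Hypothetical helper (not fst's actual code): touch data.table only when the
# caller explicitly asks for a data.table result.
as_table_if_available <- function(df, as_data_table = FALSE) {
  if (!as_data_table) {
    # Default path: never loads data.table, so no message and no hard dependency.
    return(df)
  }
  # quietly = TRUE avoids the "Loading required namespace: data.table" message.
  if (!requireNamespace("data.table", quietly = TRUE)) {
    stop("Please install package 'data.table' to use as_data_table = TRUE.")
  }
  data.table::as.data.table(df)
}

# Default call works whether or not data.table is installed.
res <- as_table_if_available(iris)
```

With a guard like this, the default read path stays silent and works without data.table installed, and the failure mode for the optional path is an explicit error rather than a stray console message.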


MarcusKlik · 2018-12-02T22:00:45Z

Hi @jangorecki, thanks a lot for the fix!

I think at some point data.table will end up in the Imports field again, as I plan to use data.table's fast sorting capabilities to sort chunks of data that constitute one or more groups of the data-set. Together with a merge-sort algorithm (for the chunks), that would allow for out-of-memory sorting of very big tables that are stored in a fst file.

Thanks again for the corrections!

xiaodaigh · 2018-12-02T23:00:53Z

> Together with a merge-sort algorithm (for the chunks)

I wish I knew enough C++ to help. I have started work on implementing an R-code-only version.

MarcusKlik · 2018-12-02T23:17:29Z

Hi @xiaodaigh, an R only version using fst as a backend for writing the chunks might be almost as fast as a C++ implementation!

I think most of the computational work during a merge sort goes into serializing and de-serializing the chunks and into writing the data to disk and reading it back. The actual sorting of the chunks themselves (using data.table) will probably take less time.

You'll have to coordinate your workers however, and that will be relatively slow (especially on Windows :-))
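For illustration, the chunk-sort-then-merge idea could be sketched in base R roughly as follows. This is only a sketch under stated assumptions: `saveRDS()`/`readRDS()` stand in for fst as the on-disk backend, `order()` stands in for data.table's sort, the key column is assumed to be numeric, all function names are made up for this example, and the merged result is held in memory (a real out-of-memory version would stream the merge output back to disk and would not start from an in-memory table).

```r
# Step 1: sort each chunk on its own and write it to a temporary file.
# saveRDS() is a stand-in for fst; order() is a stand-in for data.table's sort.
sort_chunks <- function(df, key, chunk_rows) {
  starts <- seq(1L, nrow(df), by = chunk_rows)
  vapply(starts, function(s) {
    chunk <- df[s:min(s + chunk_rows - 1L, nrow(df)), , drop = FALSE]
    chunk <- chunk[order(chunk[[key]]), , drop = FALSE]
    path <- tempfile(fileext = ".rds")
    saveRDS(chunk, path)
    path
  }, character(1))
}

# Step 2: merge two tables that are already sorted on a numeric key.
# Each row's final position is its own rank plus the number of rows from the
# other table that must precede it (ties: rows of `a` go first).
merge_sorted <- function(a, b, key) {
  pos_a <- seq_len(nrow(a)) + findInterval(a[[key]], b[[key]], left.open = TRUE)
  pos_b <- seq_len(nrow(b)) + findInterval(b[[key]], a[[key]])
  out <- rbind(a, b)
  idx <- integer(nrow(out))
  idx[c(pos_a, pos_b)] <- seq_len(nrow(out))  # invert the permutation
  out[idx, , drop = FALSE]
}

# Step 3: read the sorted chunks back one at a time and fold them together.
external_sort <- function(df, key, chunk_rows) {
  paths <- sort_chunks(df, key, chunk_rows)
  on.exit(unlink(paths))
  Reduce(function(acc, p) merge_sorted(acc, readRDS(p), key),
         paths[-1], readRDS(paths[1]))
}

set.seed(1)
df <- data.frame(k = sample(100), v = seq_len(100))
sorted <- external_sort(df, "k", chunk_rows = 30)
```

In this sketch the per-chunk sort and the merge are cheap vectorized steps, while every chunk makes a round trip through serialization and disk, which is where the comment above expects most of the time to go.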

jangorecki mentioned this issue Dec 2, 2018

properly escape use of suggested dependency data.table, closes #181 #182

Merged

MarcusKlik closed this as completed in #182 Dec 2, 2018

MarcusKlik added this to the fst v0.8.10 milestone Dec 2, 2018

MarcusKlik added the bug label Dec 2, 2018

MarcusKlik assigned jangorecki Dec 2, 2018
