added HDF5 lesson to intermediate R curriculum #687
theme: united
---
HDF5 is a format that allows the storage of large heterogeneous data sets with self-describing metadata. It support compression, parallel I/O, and easy data slicing which means large files don't need to be completely read into RAM (a real benefit to `R`). Plus it has wide support in the many programming languages, `R` included. To be able to access HDF5 files, you'll need to first install the base [HDF5 libraries](http://www.hdfgroup.org/HDF5/release/obtain5.html#obtain). It might also be useful to install [HDFview](http://www.hdfgroup.org/products/java/hdfview/) which will allow you to explore the contents of an HDF5 file easily. HDF5 as a format can essentially be thought of as a file system that you load slices of at a time. HDF5 files consists of groups (directories) and datasets (files). The dataset holds the actual data, but the groups provide structure to that data, as you'll see in our example.
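The group/dataset layout and the slice-at-a-time reading described in this paragraph can be sketched in a few lines of `R`. This is only an illustration, not code from the lesson in this PR: it assumes the Bioconductor package `rhdf5` is installed, and the group name `site_a`, the dataset name `temperature`, and the temporary file are hypothetical.

```r
# A minimal sketch, assuming the Bioconductor package rhdf5 is installed.
# All names below (site_a, temperature) are made up for illustration.
library(rhdf5)

path <- tempfile(fileext = ".h5")       # scratch file, removed with the R session
h5createFile(path)

# A group plays the role of a directory inside the file.
h5createGroup(path, "site_a")

# A dataset plays the role of a file and holds the actual values.
m <- matrix(1:20, nrow = 5, ncol = 4)
h5write(m, path, "site_a/temperature")

# List the contents of the file: one group and one dataset.
contents <- h5ls(path)
print(contents)

# Read only rows 2 and 3 (all columns) instead of the whole dataset.
# A NULL entry in `index` selects every element along that dimension.
slice <- h5read(path, "site_a/temperature", index = list(2:3, NULL))
print(slice)
```

Only the requested rows are returned to `R`, which is the property that makes HDF5 attractive for files larger than available RAM.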
'It support' --> 'It supports'
Fixed this typo
Did you commit the typo fix? It should show the updated version once you do so. Currently the file in the PR still has the typo.
Thanks for all the feedback, all. I'll take those into consideration and add some more commits. I will also definitely work on a bit about making your own HDF5 files. As for using a smaller file, @sje30, I take your point. I was hoping to provide learners with a real-world file as an example. Perhaps I can start off with writing and reading a file, and then go on to using the larger file.
Great. Also, check out https://github.com/sje30/waverepo/blob/master/paper/waverepo_paper.Rnw if you want a real example of combining hdf5+R+knitr to make a published
Stephen
On Fri, Sep 05 2014, Edmund Hart wrote:
Thanks for the reviews, @chendaniely and @sje30. @emhart, this PR received good reviews. Do you have time to address their suggestions in the next few days? The haste is due to the imminent breakup of the bc repo (see #759 and Greg's blog post). If not, we can close this issue and you can send a new PR in the future.
I'll send a new PR this weekend, @jdblischak. Will that make it before the breakup?
Thanks, @emhart. If you send in the final changes in the next few weeks you should be fine. You have two choices:
Please let me know if I need to explain more or if you need any help with this.
Closing and reissuing a new PR from a new fork with updated lessons.
Per Greg's request I've added in the lesson for working with HDF5 in R and an associated data set.