**** Please note: the central luarocks server has another package called hdf5 (http://colberg.org/lua-hdf5/) - if you use 'luarocks install' you may get that one instead. ****
**** Please note also: torch-hdf5 now requires version 1.8.14 or greater of hdf5! ****
brew tap homebrew/science
brew install hdf5
git clone https://github.com/deepmind/torch-hdf5
cd torch-hdf5
luarocks make hdf5-0-0.rockspec
Note: if luarocks make
fails with an unsatisfied dependency, the luarocks being used is likely not the one provided by torch. Try using [torch install directory]/install/bin/luarocks
instead.
sudo apt-get install libhdf5-serial-dev hdf5-tools
git clone https://github.com/deepmind/torch-hdf5
cd torch-hdf5
luarocks make hdf5-0-0.rockspec
sudo apt-get install libhdf5-serial-dev hdf5-tools
git clone https://github.com/deepmind/torch-hdf5
cd torch-hdf5
luarocks make hdf5-0-0.rockspec LIBHDF5_LIBDIR="/usr/lib/x86_64-linux-gnu/"
require 'hdf5'
local myFile = hdf5.open('/path/to/write.h5', 'w')
myFile:write('/path/to/data', torch.rand(5, 5))
myFile:close()
require 'hdf5'
local myFile = hdf5.open('/path/to/read.h5', 'r')
local data = myFile:read('/path/to/data'):all()
myFile:close()
h5read /path/to/file.h5 /location/of/data
See the Matlab documentation for further information.
You need to install a library:
$ pip install h5py
Then:
import h5py
myFile = h5py.File('/path/to/file.h5', 'r')
# The '...' means retrieve the whole tensor
data = myFile['location']['of']['data'][...]
print(data)
See also the h5py manual.
You need to install a library:
source("http://bioconductor.org/biocLite.R")
biocLite("rhdf5")
Then:
library(rhdf5)
mydata <- h5read("/path/to/file.h5", "/location/of/data")
str(mydata)
Alternative libraries for R include 'h5r' and 'ncdf4'.
You can optionally pass a DataSetOptions
object to specify how you want data to be written:
require 'hdf5'
local myFile = hdf5.open('/path/to/write.h5', 'w')
local options = hdf5.DataSetOptions()
options:setChunked(32, 32)
options:setDeflate()
myFile:write('/path/to/data', torch.rand(500, 500), options)
myFile:close()
You can read from a dataset without loading the whole thing at once:
local myFile = hdf5.open('/path/to/read.h5','r')
-- Specify the range for each dimension of the dataset.
local data = myFile:read('/path/to/data'):partial({start1, end1}, {start2, end2})
myFile:close()
Note that, for efficiency, hdf5 may still load (but not return) more than just the piece you ask for - depending on what options the file was written with. For example, if the dataset is chunked, it should just load the chunks that overlap with the part you ask for.
Getting the size of the dataset without loading the data:
local myFile = hdf5.open('/path/to/read.h5','r')
local dim = myFile:read('/path/to/data'):dataspaceSize()
myFile:close()
Checking the type of torch.Tensor without loading the data:
local myFile = hdf5.open('/path/to/read.h5','r')
local factory = myFile:read('/path/to/data'):getTensorFactory()
myFile:close()
If you want to use HDF5 from multiple threads, you will need a thread-safe build of the underlying HDF5 library. Otherwise, you will get random crashes. See the HDF5 docs for how to build a thread-safe version.
If you want to do this from torch you will also need to install torch threads. Then you can
local mainfile = hdf5.open('/path/to/read.h5','r')
local nthreads = 2
local data = nil
local worker = function(h5file)
torch.setnumthreads(1)
print(__threadid)
return h5file:read("data" .. __threadid):all()
end
local pool = threads.Threads(nthreads, function(threadid) require'torch' require'hdf5'end)
pool:specific(true)
for i=1,nthreads do
pool:addjob(i, worker, function(_data) data = _data end, mainfile)
end
for i=1,nthreads do
pool:dojob()
print(data:size(1)==10)
end
mainfile:close()
There are also a number of handy command-line tools.
Lists specified features of HDF5 file contents.
Examine the contents of an HDF5 file and dump those contents to an ASCII file.
Compare two HDF5 files.
Copies HDF5 objects from a file to a new file
See this page for many more HDF5 tools.
Libraries for many other languages and tools exist, too. See this list for more information.