HDF5 file structure reorganisation, and InterfaceComponent readability (#280)

* Reorganise hdf5 unit tests

- Move uint16 to char/str functions to the unit test utils file
- Create hdf5_and_tdms_objects subdirectory of unit/ to hold unit tests on interaction between hdf5 and tdms classes
- Move the Matrix<double> test from test_hdf5_io into the new subdirectory
- Data files needed for unit tests are defined in a unit_test_utils namespace to avoid redefinition across multiple files
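
The "uint16 to char/str" helpers mentioned above are not shown in this diff. As a rough sketch of what such a conversion might look like, assuming the strings arrive as 16-bit code units restricted to ASCII (the function name `uint16s_to_string_sketch` is hypothetical and is not taken from the repository):

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not the repository's implementation: converts a run of
// 16-bit code units into a std::string, assuming every unit is 7-bit ASCII.
std::string uint16s_to_string_sketch(const std::vector<std::uint16_t> &units) {
  std::string result;
  result.reserve(units.size());
  for (std::uint16_t unit : units) {
    // Anything outside the ASCII range cannot be narrowed to a char safely.
    if (unit > 127) {
      throw std::invalid_argument("non-ASCII code unit in input");
    }
    result.push_back(static_cast<char>(unit));
  }
  return result;
}
```

Rejecting non-ASCII input rather than truncating it is a design choice made for this sketch only; the real helpers may behave differently.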

* Create file to test interface and hdf5 interactions

- Add docstrings to interface.h since these are missing and I've just had to work out what they do
- Create a matlab script that can reproduce the class_data.mat file which the hdf5 unit tests will try to create tdms objects from
- Create the barebones test_hdf5_interface file

* HDF5Reader can read from .mat file and produce an InterfaceComponent

* File restructure: accounting for how many tests we are going to have with HDF5

* Prune includes

* Add .mat file for HDF5-TDMS-object unit tests to run.

- Adds scripts to reproduce this data, so in theory a new user can run a short MATLAB script to reproduce this
- Had a play with trying to get setup-matlab to run these scripts before the unit tests, but alas, no.
willGraham01 authored and samcunliffe committed May 22, 2023
1 parent f69f59a commit 4b3b3d3
Showing 18 changed files with 755 additions and 522 deletions.
214 changes: 0 additions & 214 deletions tdms/include/hdf5_io.h

This file was deleted.

92 changes: 92 additions & 0 deletions tdms/include/hdf5_io/hdf5_base.h
@@ -0,0 +1,92 @@
/**
* @file hdf5_base.h
* @brief Helper classes for HDF5 file I/O.
* @details The main classes are `HDF5Reader` and `HDF5Writer` with the methods
* `HDF5Reader::read` and `HDF5Writer::write` respectively.
*/
#pragma once

#include <memory>
#include <string>
#include <vector>

#include <H5Cpp.h>

#include "cell_coordinate.h"

/**
* @brief Convert from a vector of HDF5's hsize_t back to our struct of ints.
* @note Local scope utility function as only this code needs to interact with
* the HDF5 H5Cpp library.
*
* @param dimensions a 1, 2, or 3 element vector of dimensions.
* @return ijk The dimensions in a struct.
*/
ijk to_ijk(const std::vector<hsize_t> dimensions);

/**
* @brief The base class for HDF5 I/O.
* @details Provides common functionality and wraps the std::shared_ptr that
* holds the H5::H5File object.
*/
class HDF5Base {

protected:
std::string filename_; /**< The name of the file. */
std::shared_ptr<H5::H5File> file_; /**< Pointer to the underlying H5::H5File. */

/**
* @brief Construct a new HDF5{Reader/Writer} for a named file.
* @param filename The name of the file.
* @param mode The H5 file access mode (RDONLY for a HDF5Reader, TRUNC for a
* HDF5Writer.)
* @throws H5::FileIException if the file doesn't exist or can't be created.
*/
HDF5Base(const std::string &filename, int mode = H5F_ACC_RDONLY)
: filename_(filename) {
file_ = std::make_shared<H5::H5File>(filename, mode);
}

/**
* @brief Destructor closes the file.
* @details Closes file when HDF5Reader(or HDF5Writer) goes out of scope.
* Since the file pointer is a smart pointer it is deallocated automatically.
*/
~HDF5Base() { file_->close(); }

public:
/**
* @brief Get the name of the file.
* @return std::string the filename.
*/
std::string get_filename() const { return filename_; }

/**
* @brief Get the names of all datasets (data tables) currently in the file.
* @return std::vector<std::string> A vector of their names.
*/
std::vector<std::string> get_datanames() const;

/**
* @brief Print the names of all datasets to std::cout.
*/
void ls() const;

/**
* @brief Return shape/dimensionality information about the array data stored
* with `dataname`.
* @param dataname The name of the data table.
* @return std::vector<hsize_t> The extent of the data along each dimension.
*/
// IJKDimensions shape_of(const std::string &dataname) const;
std::vector<hsize_t> shape_of(const std::string &dataname) const;

/**
* @brief Checks the file is a valid HDF5 file, and everything is OK.
* TODO: Can perhaps remove.
*
* @return true If all is well.
* @return false Otherwise.
*/
bool is_ok() const;
};
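
The body of `to_ijk` is not part of this header, so its exact behaviour is not visible here. The sketch below shows one way such a conversion could work. It uses a stand-in alias for `hsize_t` and a hypothetical `IjkSketch` struct in place of the real `ijk` from `cell_coordinate.h`, and it assumes that missing extents are left at 0 and that anything other than 1 to 3 dimensions is rejected.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Stand-in for HDF5's hsize_t so the sketch compiles without H5Cpp.h
// (an assumption made for illustration only).
using hsize_t = unsigned long long;

// Hypothetical stand-in for the ijk struct defined in cell_coordinate.h.
struct IjkSketch {
  int i = 0;
  int j = 0;
  int k = 0;
};

// Copies up to three extents into the struct. Extents that are not supplied
// keep their default of 0, which is an assumption of this sketch.
IjkSketch to_ijk_sketch(const std::vector<hsize_t> &dimensions) {
  if (dimensions.empty() || dimensions.size() > 3) {
    throw std::invalid_argument("expected 1, 2, or 3 dimensions");
  }
  IjkSketch out;
  out.i = static_cast<int>(dimensions[0]);
  if (dimensions.size() > 1) { out.j = static_cast<int>(dimensions[1]); }
  if (dimensions.size() > 2) { out.k = static_cast<int>(dimensions[2]); }
  return out;
}
```

A caller holding the result of `shape_of` could pass it straight to a function like this to get integer extents for loop bounds.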
89 changes: 89 additions & 0 deletions tdms/include/hdf5_io/hdf5_reader.h
@@ -0,0 +1,89 @@
#pragma once

#include "hdf5_io/hdf5_base.h"

#include "arrays.h"
#include "interface.h"

/**
* @brief Class wrapper of the reading of HDF5 format files.
* @details Opens files in readonly and retrieves the datasets (in our case
* **double, but can be anything in general).
*/
class HDF5Reader : public HDF5Base {

public:
/**
* @brief Construct a new HDF5Reader for a named file.
* @param filename The name of the file.
* @throws H5::FileIException if the file doesn't exist or can't be opened.
*/
HDF5Reader(const std::string &filename)
: HDF5Base(filename, H5F_ACC_RDONLY) {}

/**
* @brief Reads a named dataset from the HDF5 file.
* @param dataset_name The name of the dataset to be read.
* @param data A pointer to an array of correct size.
*/
// template <typename T>
// void read(const std::string &dataname, T *data) const;
template<typename T>
void read(const std::string &dataset_name, T *data) const {
spdlog::debug("Reading {} from file: {}", dataset_name, filename_);

// get the dataset and dataspace
H5::DataSet dataset = file_->openDataSet(dataset_name);
H5::DataSpace dataspace = dataset.getSpace();

// read the data using the data type stored in the file
dataset.read(data, dataset.getDataType());
spdlog::trace("Read successful.");
}

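/**
* @brief Reads a named field of a struct saved in the HDF5 file.
* @details Structs are saved as groups, with each field stored as a dataset
* inside that group.
* @param struct_name The name of the struct (group) to read from.
* @param field_name The name of the field (dataset) to be read.
* @param data A pointer to a buffer of the correct size.
*/
template<typename T>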
void read_field_from_struct(const std::string &struct_name,
const std::string &field_name, T *data) const {
spdlog::debug("Reading {} from file: {}", struct_name, filename_);

// Structs are saved as groups, so we need to fetch the group this struct is
// contained in
H5::Group structure_array = file_->openGroup(struct_name);
// Then fetch the requested data and read it into the buffer provided
H5::DataSet requested_field = structure_array.openDataSet(field_name);
requested_field.read(data, requested_field.getDataType());
}

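/**
* @brief Reads a named 2D dataset from the HDF5 file into a Matrix.
* @param dataset_name The name of the dataset to be read.
* @param data_location The Matrix to allocate and fill with the data.
* @throws std::runtime_error if the dataset is not two-dimensional.
*/
template<typename T>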
void read(const std::string &dataset_name, Matrix<T> &data_location) const {
spdlog::debug("Reading {} from file: {}", dataset_name, filename_);

std::vector<hsize_t> dimensions = shape_of(dataset_name);
if (dimensions.size() != 2) {
throw std::runtime_error(
"Cannot read " + dataset_name + " into a 2D matrix, it has " +
std::to_string(dimensions.size()) + " dimensions");
}
int n_rows = dimensions[0];
int n_cols = dimensions[1];

SPDLOG_DEBUG("n_rows = {}; n_cols = {}", n_rows, n_cols);
T *buff = (T *) malloc(n_rows * n_cols * sizeof(T));
read(dataset_name, buff);

data_location.allocate(n_rows, n_cols);
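// The buffer is a flat, row-major copy of the dataset
for (int i = 0; i < n_rows; i++) {
for (int j = 0; j < n_cols; j++) {
data_location[i][j] = buff[i * n_cols + j];
}
}
// Release the temporary buffer now that the values have been copied
free(buff);
return;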
}

void read(const std::string &plane, InterfaceComponent *ic) const;
InterfaceComponent read(const std::string &plane) const {
InterfaceComponent ic;
read(plane, &ic);
return ic;
}
};
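
The `Matrix<T>` overload of `read` above copies a flat buffer into a 2D container using the index `i * n_cols + j`. The following standalone sketch isolates that indexing step. It uses nested `std::vector`s in place of the project's `Matrix` class, and the function name `unflatten_row_major` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: rebuilds an n_rows x n_cols table from a flat,
// row-major buffer, so element (i, j) is taken from flat[i * n_cols + j].
template<typename T>
std::vector<std::vector<T>> unflatten_row_major(const std::vector<T> &flat,
                                                std::size_t n_rows,
                                                std::size_t n_cols) {
  if (flat.size() != n_rows * n_cols) {
    throw std::invalid_argument("buffer size does not match n_rows * n_cols");
  }
  std::vector<std::vector<T>> out(n_rows, std::vector<T>(n_cols));
  for (std::size_t i = 0; i < n_rows; i++) {
    for (std::size_t j = 0; j < n_cols; j++) {
      out[i][j] = flat[i * n_cols + j];
    }
  }
  return out;
}
```

Holding the temporary data in a `std::vector` means it is released automatically, including when an exception is thrown, which is one alternative to pairing `malloc` with `free` by hand.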