Skip to content

Commit

Permalink
Infrastructure for removing MATLAB (#314)
Browse files Browse the repository at this point in the history
* Handle structs (except with complex data, that's nasty) (#278)

* Trying to decipher structs

* Further towards reading a struct

* Bypass the stack-smash bug.

MATLAB saves character arrays as uint16s, and HDF5 is unable to interpret them as chars. As such, we need to do manual conversion in the test.

* Rename binary and update contents

* `HDF5` file structure reorganisation, and `InterfaceComponent` readability (#280)

* Reorganise hdf5 unit tests

- Move uint16 to char/str functions to the unit test utils file
- Create hdf5_and_tdms_objects subdirectory of unit/ to hold unit tests on interaction between hdf5 and tdms classes
- Move the Matrix<double> test from test_hdf5_io into the new subdirectory
- Data files needed for unit tests are defined in a unit_test_utils namespace to avoid redefinition across multiple files

* Create file to test interface and hdf5 interactions

- Add docstrings to interface.h since these are missing and I've just had to work out what they do
- Create a matlab script that can reproduce the class_data.mat file which the hdf5 unit tests will try to create tdms objects from
- Create the barebones test_hdf5_interface file

* HDF5Reader can read from .mat file and produce an InterfaceComponent

* File restructure: accounting for how many tests we are going to have with HDF5

* Prune includes

* Add .mat file for HDF5-TDMS-object unit tests to run.

- Adds scripts to reproduce this data, so in theory a new user can run a short MATLAB script to reproduce this
- Had a play with trying to get setup-matlab to run these scripts before the unit tests, but alas, no.

* Infrastructure for removing `MATLAB` (#281)

* Skeleton for new method

* FrequencyVectors use std::vectors instead of our custom object as members. [DOESN'T COMPILE YET]

- FrequencyVectors is now just a struct that uses std::vector datatypes
- SimualationManager is broken, along with most parts of the codebase that use FrequencyVectors
- Pending update to allow code to COMPILE

* Allow TDMS to actually be compiled again. [TESTS STILL FAIL, READER METHOD NOT READY YET]

- Move .mat data generation scripts for unit tests into benchmarking folder
- Adjust paths to unit test data accordingly
- InputMatrices stores the filename so that we can minimise disruption as we transition away from MATLAB

* IT WORKS

* Create the abstract H5Dimensions class to ease reading objects.

- Update HDF5Base::shape_of to return instances of this class
- Add unit tests for class
- Update HDF5Reader method to avoid recycled code

* Remove MATLAB pointers to InterfaceComponents in initialisation.

- simulation_manager can now instantate these members using HDF5Reader rather than still relying on the old InputMatrices object

* Add docstrings to files touched

* Produce unit test binaries on CI rather than track in repo

* - Detaches MATLAB from Cuboid class.
- Turns Cuboid class into a struct because MATLAB is now gone.
- Adds failure-case checks to HDF5Reader::read() functions.

Squashed commit of the following:

commit ce8d15e311e9dbb93bd7d7fbcd6c9cffe69657e3
Author: willGraham01 <[email protected]>
Date:   Fri May 5 15:19:51 2023 +0100

    Rename shapes.h to cuboid.h because that's all it contains

commit fe83f568c3da7491d450693574975df90a3f15e1
Author: willGraham01 <[email protected]>
Date:   Fri May 5 15:13:48 2023 +0100

    Fix faulty logic thrown up by test

commit b37a89f031519c6a356b50f91a12ad8867b78333
Author: willGraham01 <[email protected]>
Date:   Fri May 5 15:01:58 2023 +0100

    Allow tdms to actually compile

commit d7c982b9165e3ae4ccf3f42e0fb8c5cdde55b644
Author: willGraham01 <[email protected]>
Date:   Fri May 5 14:31:43 2023 +0100

    Write unit test and read method for Cuboid. Add failure test for FrequencyVectors

    - Add new .m script to produce an input file with bad object data
    - Update unit_test_utils with this new filepath
    - Write HDF5Reader(Cuboid *cube) method and unit test
    - Add failure-test check for FrequencyVectors using bad data file

commit cf61970d2052ab02ae23822f0edd28e89044734f
Author: willGraham01 <[email protected]>
Date:   Fri May 5 14:12:11 2023 +0100

    Change Cuboid to be a struct because a class is overkill

* Use one master script for setting up .mat files for unit tests.

- setup_unit_tests.m now calls all the other data-generating scripts in the folder to save updating the ci.yaml each time.

* Apply suggestions from code review

Co-authored-by: Sam Cunliffe <[email protected]>

* Move cuboid class into array/ directory, which will be populated as we deatch other classes from MATLAB

* Update tdms/src/input_matrices.cpp

Co-authored-by: Sam Cunliffe <[email protected]>

* Update tdms/src/input_matrices.cpp

Co-authored-by: Sam Cunliffe <[email protected]>

* Clang format.

---------

Co-authored-by: Will Graham <[email protected]>
Co-authored-by: willGraham01 <[email protected]>
  • Loading branch information
3 people authored May 26, 2023
1 parent 3afe4b7 commit e0de2cc
Show file tree
Hide file tree
Showing 32 changed files with 529 additions and 287 deletions.
5 changes: 5 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,11 @@ jobs:
- name: Set up MATLAB
uses: matlab-actions/[email protected]

- name: Produce MATLAB unit test data
uses: matlab-actions/run-command@v1
with:
command: cd('tdms/tests/unit/matlab_benchmark_scripts/'), setup_unit_tests

# -------------------------------------------------------------------------------
# Ubuntu
- name: Install dependencies for Ubuntu
Expand Down
10 changes: 4 additions & 6 deletions tdms/include/arrays.h
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
#include <complex>
#include <stdexcept>
#include <string>
#include <vector>

#include <fftw3.h>

Expand Down Expand Up @@ -325,12 +326,9 @@ class FrequencyExtractVector : public Vector<double> {
double max();
};

class FrequencyVectors {
public:
Vector<double> x;
Vector<double> y;

void initialise(const mxArray *ptr);
struct FrequencyVectors {
std::vector<double> x;
std::vector<double> y;
};

/**
Expand Down
21 changes: 21 additions & 0 deletions tdms/include/arrays/cuboid.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
#pragma once

#include "mat_io.h"

/**
* @brief Defines a cuboid by specifying the first and last Yee cells in each
axial direction that form part of the cube.
*
* @details For example, { 0, 5, 2, 7, 1, 10 } corresponds to the cuboid that
contains all Yee cells indexed (i,j,k) where;
* 0 <= i <= 5,
* 2 <= j <= 7,
* 1 <= k <= 10.
*
* TODO: Check inclusivity of the inequalities above.
*/
struct Cuboid {
int array[6] = {0, 0, 0, 0, 0, 0};

int operator[](int index) const { return array[index]; }
};
19 changes: 15 additions & 4 deletions tdms/include/hdf5_io/hdf5_base.h
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
#include <H5Cpp.h>

#include "cell_coordinate.h"
#include "hdf5_io/hdf5_dimension.h"

/**
* @brief The base class for HDF5 I/O.
Expand Down Expand Up @@ -64,12 +65,22 @@ class HDF5Base {

/**
* @brief Return shape/dimensionality information about the array data stored
* with `name`.
* with `dataname`.
* @param dataname The name of the data table.
* @return IJKDimensions The dimensions of the data.
* @return H5Dimension The dimensions of the data.
*/
// IJKDimensions shape_of(const std::string &dataname) const;
std::vector<hsize_t> shape_of(const std::string &dataname) const;
H5Dimension shape_of(const std::string &dataname) const;
/**
* @brief Return shape/dimensionality information about array data stored
* within a group.
*
* @param group_name The name of the HDF5 Group in which the data array is
* stored.
* @param dataname The name of the data array to check dimensions of.
* @return The dimensions of the data.
*/
H5Dimension shape_of(const std::string &group_name,
const std::string &dataname) const;

/**
* @brief Checks the file is a valid HDF5 file, and everything is OK.
Expand Down
34 changes: 34 additions & 0 deletions tdms/include/hdf5_io/hdf5_dimension.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
#pragma once

#include <algorithm>
#include <vector>

#include <H5Cpp.h>

class H5Dimension : public std::vector<hsize_t> {
public:
H5Dimension() = default;
H5Dimension(const H5::DataSpace &data_space);

/**
* @brief Whether these dimensions describe an array that is castable to a 1D
* array.
* @details In the event that these dimensions only have one entry, or at
* most one of the entries is greater than 1, the shape described can be cast
* to a 1D-array of length max_dim().
*
* @return true These dimensions describe (an object castable to) a 1D-array
* @return false Otherwise
*/
bool is_1D() const;

/**
* @brief Returns the dimension of the greatest extent.
* @details For instances where is_1D() returns true, this conincides with the
* number of elements in the array, and the length of a 1D array necessary to
* hold all the elements.
*
* @return hsize_t
*/
hsize_t max_dim() const { return *std::max_element(begin(), end()); };
};
83 changes: 68 additions & 15 deletions tdms/include/hdf5_io/hdf5_reader.h
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
#include "hdf5_io/hdf5_base.h"

#include "arrays.h"
#include "arrays/cuboid.h"
#include "interface.h"

/**
Expand All @@ -21,13 +22,33 @@ class HDF5Reader : public HDF5Base {
HDF5Reader(const std::string &filename)
: HDF5Base(filename, H5F_ACC_RDONLY) {}

/**
* @brief Read the dataset stored within a group into the buffer provided. Can
* be used to read MATLAB structs by treating the struct as the Group and
* field as the Dataset.
* @tparam T C++ datatype to read data into.
* @param group The Group within the file in which the dataset lives.
* @param dataset The name of the dataset to fetch data from.
* @param data The buffer into which to write the data.
*/
template<typename T>
void read_dataset_in_group(const std::string &group,
const std::string &dataset, T *data) const {
spdlog::debug("Reading {} from file: {}", group, filename_);

// Structs are saved as groups, so we need to fetch the group this struct is
// contained in
H5::Group structure_array = file_->openGroup(group);
// Then fetch the requested data and read it into the buffer provided
H5::DataSet requested_field = structure_array.openDataSet(dataset);
requested_field.read(data, requested_field.getDataType());
}

/**
* @brief Reads a named dataset from the HDF5 file.
* @param dataname The name of the datset to be read.
* @param data A pointer to an array of correct size.
*/
// template <typename T>
// void read(const std::string &dataname, T *data) const;
template<typename T>
void read(const std::string &dataset_name, T *data) const {
spdlog::debug("Reading {} from file: {}", dataset_name, filename_);
Expand All @@ -41,19 +62,13 @@ class HDF5Reader : public HDF5Base {
spdlog::trace("Read successful.");
}

template<typename T>
void read_field_from_struct(const std::string &struct_name,
const std::string &field_name, T *data) const {
spdlog::debug("Reading {} from file: {}", struct_name, filename_);

// Structs are saved as groups, so we need to fetch the group this struct is
// contained in
H5::Group structure_array = file_->openGroup(struct_name);
// Then fetch the requested data and read it into the buffer provided
H5::DataSet requested_field = structure_array.openDataSet(field_name);
requested_field.read(data, requested_field.getDataType());
}

/**
* @brief Reads a 2D-dataset into a Matrix object.
*
* @tparam T C++ datatype of the Matrix object
* @param dataset_name Name of the dataset to read data from
* @param data_location Matrix object buffer
*/
template<typename T>
void read(const std::string &dataset_name, Matrix<T> &data_location) const {
spdlog::debug("Reading {} from file: {}", dataset_name, filename_);
Expand All @@ -80,10 +95,48 @@ class HDF5Reader : public HDF5Base {
return;
}

/**
* @brief Read an InterfaceComponent into the buffer provided.
*
* @param[in] plane The plane {I,J,K}{0,1} to read from the file.
* @param[out] ic InterfaceComponent reference to populate/overwrite.
*/
void read(const std::string &plane, InterfaceComponent *ic) const;
/**
* @brief Read an InterfaceComponent from the file.
*
* @param plane The plane {I,J,K}{0,1} to read from the file.
* @return InterfaceComponent corresponding to the requested plane.
*/
InterfaceComponent read(const std::string &plane) const {
InterfaceComponent ic;
read(plane, &ic);
return ic;
}

/**
* @brief Read FrequencyVectors into the buffer provided.
*
* @param[out] f_vec FrequencyVectors reference to populate/overwrite.
*/
void read(FrequencyVectors *f_vec) const;
/**
* @brief Read FrequencyVectors from the file.
*
* @return FrequencyVectors object containing the data from the input file.
*/
FrequencyVectors read() const {
FrequencyVectors f_vec;
read(&f_vec);
return f_vec;
}

/**
* @brief
*
* Cuboid is just the phasorsurface array which is it's own dataset, but needs
* to be offset by -1 b/c indexing things
* @param cube
*/
void read(Cuboid *cube) const;
};
5 changes: 5 additions & 0 deletions tdms/include/input_matrices.h
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,11 @@ class InputMatrices {
void validate_assigned_pointers();

public:
/*! Name of the input file from which MATLAB objects were extracted. Akward
* middle-child between full removal of MATLAB and the slow removal of class
* dependencies */
std::string input_filename;

InputMatrices() = default;


Expand Down
3 changes: 1 addition & 2 deletions tdms/include/interface.h
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,7 @@
*/
class InterfaceComponent {
public:
/*!
* Whether or not a source or boundary condition is applied at this interface
/*! Whether or not a source or boundary condition is applied at this interface
*/
bool apply = false;
/*! The value of the constant Yee-cell index for cells in this plane */
Expand Down
2 changes: 1 addition & 1 deletion tdms/include/output_matrices/output_matrices.h
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@

#include <string>

#include "arrays/cuboid.h"
#include "cell_coordinate.h"
#include "field.h"
#include "fieldsample.h"
Expand All @@ -14,7 +15,6 @@
#include "matrix.h"
#include "output_matrices/id_variables.h"
#include "output_matrices/output_matrix_pointers.h"
#include "shapes.h"
#include "simulation_parameters.h"
#include "surface_phasors.h"
#include "vertex_phasors.h"
Expand Down
20 changes: 0 additions & 20 deletions tdms/include/shapes.h

This file was deleted.

2 changes: 1 addition & 1 deletion tdms/include/simulation_manager/objects_from_infile.h
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
#include "matrix.h"

#include "arrays.h"
#include "arrays/cuboid.h"
#include "cell_coordinate.h"
#include "field.h"
#include "fieldsample.h"
Expand All @@ -18,7 +19,6 @@
#include "input_flags.h"
#include "input_matrices.h"
#include "interface.h"
#include "shapes.h"
#include "simulation_parameters.h"
#include "source.h"
#include "vertex_phasors.h"
Expand Down
9 changes: 0 additions & 9 deletions tdms/src/arrays.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -236,15 +236,6 @@ double FrequencyExtractVector::max() {
return tmp;
}

void FrequencyVectors::initialise(const mxArray *ptr) {

if (mxIsEmpty(ptr)) { return; }

assert_is_struct_with_n_fields(ptr, 2, "f_vec");
x = Vector<double>(ptr_to_vector_in(ptr, "fx_vec", "f_vec"));
y = Vector<double>(ptr_to_vector_in(ptr, "fy_vec", "f_vec"));
}

void Pupil::initialise(const mxArray *ptr, int n_rows, int n_cols) {

if (mxIsEmpty(ptr)) { return; }
Expand Down
25 changes: 13 additions & 12 deletions tdms/src/hdf5_io/hdf5_base.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -33,18 +33,19 @@ void HDF5Base::ls() const {
return;
}

std::vector<hsize_t> HDF5Base::shape_of(const std::string &dataname) const {
spdlog::debug("shape_of {}", dataname);

// get the dataset and dataspace (contains dimensionality info)
H5::DataSet dataset = file_->openDataSet(dataname);
H5::DataSpace dataspace = dataset.getSpace();

// need the rank in order to declare the vector size
int rank = dataspace.getSimpleExtentNdims();
std::vector<hsize_t> dimensions(rank);
dataspace.getSimpleExtentDims(dimensions.data(), nullptr);
return dimensions;
H5Dimension HDF5Base::shape_of(const std::string &dataname) const {
// Get the dataspace (contains dimensionality info)
H5::DataSpace dataspace = file_->openDataSet(dataname).getSpace();
return H5Dimension(dataspace);
}

H5Dimension HDF5Base::shape_of(const std::string &group_name,
const std::string &dataname) const {
// Open the group that contains the dataset
H5::Group group = file_->openGroup(group_name);
// Get the DataSpace for the DataSet within the group
H5::DataSpace dataspace = group.openDataSet(dataname).getSpace();
return H5Dimension(dataspace);
}

bool HDF5Base::is_ok() const {
Expand Down
23 changes: 23 additions & 0 deletions tdms/src/hdf5_io/hdf5_dimension.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
#include "hdf5_io/hdf5_dimension.h"

#include <algorithm>

using namespace std;

H5Dimension::H5Dimension(const H5::DataSpace &data_space) {
// Fetch rank to declare the vector size
int rank = data_space.getSimpleExtentNdims();
// Resize to the correct rank then populate entries with the data
resize(rank);
data_space.getSimpleExtentDims(data(), nullptr);
}

bool H5Dimension::is_1D() const {
int n_non_trivial_dimensions = 0;
if (size() != 1) {
for (hsize_t dim : *this) {
if (dim > 1) { n_non_trivial_dimensions++; }
}
}
return n_non_trivial_dimensions <= 1;
}
Loading

0 comments on commit e0de2cc

Please sign in to comment.