For LGDO arrays, I suggest that we separate the capacity (i.e. the size of the underlying numpy array) from the size of the LGDO object. This is similar to how C++ std::vector works. When increasing the size, we would only increase the capacity if needed (I think the typical strategy in C++ is to double the capacity when needed, although the growth factor is implementation-dependent; we could also double up to a certain size and then add fixed amounts after that). When decreasing the size, there is no need to change the capacity. I would suggest we only take this approach for the outer-most dimension (i.e. ArrayOfArrays would still have fixed lengths in the other dimensions). We would also want to turn [lgdo_type].nda into a property that returns an array view over the range [0:size].
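To make the proposal concrete, here is a minimal sketch of the size/capacity split. The class name `GrowableArray` and the `reserve`/`resize`/`capacity` names are hypothetical and are not taken from the existing LGDO code; only the idea of `nda` being a view over `[0:size]` comes from the description above.

```python
import numpy as np


class GrowableArray:
    """Hypothetical stand-in for an LGDO array with separate size and capacity."""

    def __init__(self, shape=(0,), dtype=np.float64):
        # The capacity is the outer-most length of the underlying buffer.
        self._buf = np.zeros(shape, dtype=dtype)
        self._size = self._buf.shape[0]

    def __len__(self):
        return self._size

    @property
    def capacity(self):
        return self._buf.shape[0]

    @property
    def nda(self):
        # A view (no copy) over the logically valid rows only.
        return self._buf[: self._size]

    def reserve(self, capacity):
        """Grow the underlying buffer to at least `capacity` rows."""
        if capacity <= self.capacity:
            return
        new_buf = np.zeros((capacity,) + self._buf.shape[1:], dtype=self._buf.dtype)
        new_buf[: self._size] = self._buf[: self._size]
        self._buf = new_buf

    def resize(self, new_size):
        """Change the logical size; reallocate only if the capacity is exceeded."""
        if new_size > self.capacity:
            # Doubling is one possible growth strategy, not the only one.
            self.reserve(max(new_size, 2 * self.capacity))
        # Rows exposed by growing within the existing capacity keep whatever
        # the buffer held before; they are not reset here.
        self._size = new_size


arr = GrowableArray(shape=(4, 3))
arr.resize(2)  # shrink: capacity stays at 4
arr.resize(5)  # grow past the capacity: buffer is reallocated to 8 rows
```

Only the outer-most dimension changes here; the trailing dimensions are carried over unchanged on reallocation, matching the suggestion for ArrayOfArrays.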
This would have several benefits:
1. Simplify reading into pre-defined buffers. If the read is smaller than the pre-defined buffer, we simply change the size without causing any problems.
2. We no longer need to return a pair of array + size, which makes read easier to use (and avoids the inconsistent output type that we have right now).
3. Helpful for LH5Iterator.
4. Could help with performance when reading from many files, as an alternative to the approach suggested in "Determine the total buffer size before starting to read LH5 data from a list of files" (#93). Instead of having to re-allocate and copy the array each time a new file is opened, the number of re-allocations would scale logarithmically with the number of files (depending on the approach we take to increasing the capacity).
5. In certain situations this would avoid the error raised when trying to resize an array while another reference to it exists. This is mostly useful for benefit 1, but there may be other situations where shrinking an array is useful. See potential annoyance 1.
Note that VectorOfVectors is already more or less handled this way, because resizing the data array is a common need
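As a rough illustration of the performance point about reading many files, the sketch below counts how often the buffer would have to be reallocated when rows are appended file by file. The function name and the two growth modes are made up for this example, and it assumes the doubling strategy mentioned above.

```python
def count_reallocations(chunk_sizes, growth="double"):
    """Count buffer reallocations when appending one chunk per file."""
    size = 0
    capacity = 0
    reallocations = 0
    for n in chunk_sizes:
        size += n
        if size > capacity:
            if growth == "double":
                # Grow to at least the needed size, or double, whichever is larger.
                capacity = max(size, 2 * capacity)
            else:
                # "exact": grow to exactly the needed size every time.
                capacity = size
            reallocations += 1
    return reallocations


files = [1000] * 64  # 64 files with 1000 rows each
n_exact = count_reallocations(files, growth="exact")
n_double = count_reallocations(files, growth="double")
```

For these 64 equally sized files, growing to the exact size reallocates 64 times, while doubling reallocates 7 times.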
Potential annoyances:
1. If multiple variables point to the same LGDO object, we need to be careful that the size is treated as a mutable reference rather than an immutable value. Otherwise, a change made through one of them could change the underlying array without updating the size seen by the others.
2. This may not be totally trivial to plug into view_as and read_as if external libraries use the shape of the array. Changing the behavior of nda hopefully keeps this simple.
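One possible way to address the first annoyance is to keep the buffer and the size together in a single shared object, so that every handle sees the same size after a resize. This is only a sketch; `Storage` and `Handle` are hypothetical names and do not correspond to existing LGDO classes.

```python
import numpy as np


class Storage:
    """Hypothetical shared state: the buffer and the logical size live together."""

    def __init__(self, capacity):
        self.buf = np.zeros(capacity)
        self.size = 0


class Handle:
    """Hypothetical array object that refers to shared storage."""

    def __init__(self, storage):
        self._storage = storage

    @property
    def nda(self):
        # Always read buffer and size through the shared storage.
        return self._storage.buf[: self._storage.size]

    def resize(self, new_size):
        s = self._storage
        if new_size > len(s.buf):
            new_buf = np.zeros(max(new_size, 2 * len(s.buf)))
            new_buf[: s.size] = s.buf[: s.size]
            s.buf = new_buf  # every handle sees the new buffer
        s.size = new_size  # every handle sees the new size


storage = Storage(capacity=2)
a = Handle(storage)
b = Handle(storage)
a.resize(5)  # reallocates; b sees the new buffer and the new size
```

Because `nda` is recomputed on each access, both handles stay consistent. A view obtained from `nda` before the resize would still point at the old buffer, so callers that hold on to such views would need to fetch them again.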