
Dimension Mismatch Error When Using FASTOutputFile.toDataFrame(): operands could not be broadcast together with shapes (4801,138) (138,1) #41

Open
Asionm opened this issue Mar 4, 2025 · 0 comments


Description

When loading a FAST binary file using FASTOutputFile from the openfast_toolbox.io module and converting it to a DataFrame, a dimension mismatch error occurs in non-buffered mode. The error originates from openfast_toolbox/io/fast_output_file.py during the data scaling step.

Error Trigger Code:

from openfast_toolbox.io import FASTOutputFile
out_file = FASTOutputFile("test.outb").toDataFrame()  # Fails here

Error Message:

ValueError: operands could not be broadcast together with shapes (NT, NumOutChans) (NumOutChans, 1)

Affected File:
openfast_toolbox/io/fast_output_file.py


Steps to Reproduce

  1. Import FASTOutputFile and load a FAST binary file:
    from openfast_toolbox.io import FASTOutputFile
    out_file = FASTOutputFile("test.outb").toDataFrame()
  2. Ensure the file is in a compressed format (e.g., FileFmtID_WithTime or FileFmtID_WithoutTime).
  3. The error occurs during the data scaling step in non-buffered mode (use_buffer=False).

Root Cause

In fast_output_file.py, the scaling arrays ColOff and ColScl are shaped as column vectors, (NumOutChans, 1), while the data array has shape (NT, NumOutChans). NumPy broadcasting aligns shapes from the trailing dimension: NumOutChans against 1 is compatible, but NT against NumOutChans is not unless the two happen to be equal, so the element-wise operation fails:

# In fast_output_file.py (non-buffered mode):
data = (data - ColOff) / ColScl  # Shapes: (NT,138) vs (138,1)

Buffered mode works because it uses 1D arrays and applies scaling column-by-column.
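
The shape mismatch can be reproduced without a FAST file. The snippet below is a minimal sketch using small stand-in arrays (the sizes 5 and 3 and the zero/one values are placeholders, not values read from an .outb file); it only shows that NumPy rejects (NT, NumOutChans) against (NumOutChans, 1).

```python
import numpy as np

# Small stand-ins for the shapes in the report (4801 and 138 in the title).
NT, NumOutChans = 5, 3

data = np.zeros((NT, NumOutChans))
ColOff = np.zeros((NumOutChans, 1), dtype=np.float32)  # column vector
ColScl = np.ones((NumOutChans, 1), dtype=np.float32)   # column vector

try:
    scaled = (data - ColOff) / ColScl
    error_message = None
except ValueError as exc:
    # NumPy cannot align (NT, NumOutChans) with (NumOutChans, 1).
    error_message = str(exc)

print(error_message)
```

Note that if NT happened to equal NumOutChans, the same expression would broadcast without raising and would scale each row instead of each column, so the shapes are worth fixing even where no exception is seen.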


Proposed Fix

Adjust the dimensions of ColOff and ColScl to align with broadcasting rules.

Option 1: Flatten to 1D Arrays

Modify the code in fast_output_file.py to convert ColOff and ColScl to 1D arrays:

# Before (line ~X in fast_output_file.py):
ColScl = fread(fid, NumOutChans, 'float32')  # Shape: (138, 1)
ColOff = fread(fid, NumOutChans, 'float32')  # Shape: (138, 1)

# After:
ColScl = fread(fid, NumOutChans, 'float32').flatten()  # Shape: (138,)
ColOff = fread(fid, NumOutChans, 'float32').flatten()  # Shape: (138,)

Option 2: Transpose Scaling Arrays

Alternatively, transpose ColOff and ColScl to row vectors:

data = (data - ColOff.T) / ColScl.T  # Shapes: (NT,138) vs (1,138)
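
As a quick check that the two options are interchangeable, the sketch below applies both to random stand-in arrays (the sizes, seed, and values are illustrative assumptions, not data from a FAST file) and compares the results.

```python
import numpy as np

rng = np.random.default_rng(0)
NT, NumOutChans = 6, 4  # illustrative sizes only

data = rng.normal(size=(NT, NumOutChans))
# Column vectors, matching the (NumOutChans, 1) shape described in the issue.
ColOff = rng.normal(size=(NumOutChans, 1)).astype(np.float32)
ColScl = rng.uniform(1.0, 2.0, size=(NumOutChans, 1)).astype(np.float32)

# Option 1: flatten to 1D, shape (NumOutChans,)
scaled_flat = (data - ColOff.flatten()) / ColScl.flatten()

# Option 2: transpose to row vectors, shape (1, NumOutChans)
scaled_T = (data - ColOff.T) / ColScl.T

print(scaled_flat.shape, np.allclose(scaled_flat, scaled_T))
```

Both forms broadcast the per-channel offset and scale across the rows of data, so each column is scaled by its own pair of values.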

Why Buffered Mode Works

In buffered mode (use_buffer=True):
• ColOff and ColScl are 1D arrays (shape (NumOutChans,)).
• Scaling is applied column-wise in a loop, avoiding broadcasting:

for iCol in range(NumOutChans):
    data[:, iCol+1] = (data[:, iCol+1] - ColOff[iCol]) / ColScl[iCol]
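
For comparison, here is a hedged sketch showing that the column-wise loop and a broadcast over 1D scaling arrays give the same result. The array layout (one extra leading column, presumably time, inferred from the iCol+1 indexing above) and all values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
NT, NumOutChans = 5, 3  # illustrative sizes only

# Assumed layout: column 0 is left unscaled, channels occupy columns 1..NumOutChans.
raw = rng.normal(size=(NT, NumOutChans + 1))
ColOff = rng.normal(size=NumOutChans)             # 1D, shape (NumOutChans,)
ColScl = rng.uniform(1.0, 2.0, size=NumOutChans)  # 1D, shape (NumOutChans,)

# Column-by-column scaling, as in the buffered loop quoted above.
looped = raw.copy()
for iCol in range(NumOutChans):
    looped[:, iCol + 1] = (looped[:, iCol + 1] - ColOff[iCol]) / ColScl[iCol]

# Same operation expressed as a single broadcast over 1D arrays.
vectorized = raw.copy()
vectorized[:, 1:] = (vectorized[:, 1:] - ColOff) / ColScl

print(np.allclose(looped, vectorized))
```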

Impact

Affected Users: Anyone using FASTOutputFile.toDataFrame() in non-buffered mode.
