Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reporting::CheckedImpl::RetrieveClusterData crashes when attribute read fails #35306

Closed
bzbarsky-apple opened this issue Aug 30, 2024 · 3 comments · Fixed by #35319
Closed

reporting::CheckedImpl::RetrieveClusterData crashes when attribute read fails #35306

bzbarsky-apple opened this issue Aug 30, 2024 · 3 comments · Fixed by #35319
Assignees
Labels
bug Something isn't working linux needs triage

Comments

@bzbarsky-apple
Copy link
Contributor

Reproduction steps

scripts/examples/gn_build_example.sh examples/energy-management-app/linux out/debug chip_config_network_layer_ble=false chip_use_data_model_interface='"check"'

Then run ./out/debug/chip-energy-management-app, commission it, and do:

chip-tool any read-by-id 0x94 0x00 ${NODE_ID} 1

The app crashes. This happens because that attribute read fails in both the "data model" and "ember" versions of read, but the "ember" version does the equivalent of AttributeReportBuilder::PrepareAttribute before it tries the part that fails, while the "data model" version does them in the opposite order. So while statusEmber != statusDm tests false, lengthWrittenEmber != reportBuilder.GetWriter()->GetLengthWritten() is true and we hit the fatal assert.

There's a comment there that says:

    // NOTE: RetrieveClusterData is responsible for encoding StatusIB errors in case of failures
    //       so we validate length written requirements for BOTH success and failure. 

but that's just not true. The StatusIB encoding is done by the caller of RetrieveClusterData (Engine::BuildSingleReportDataAttributeReportIBs) after rolling back the TLV in error cases. Except for IsOutOfSpaceEncodingResponse() situations.

So presumably either the logic in Read-Checked needs to be fixed to be looser in its assertions or the two RetrieveClusterData implementations need to be changed to produce equivalent output as this code expects.

Bug prevalence

Always

GitHub hash of the SDK that was being used

ebc4237

Platform

core

Platform Version(s)

No response

Anything else?

No response

@bzbarsky-apple bzbarsky-apple added bug Something isn't working needs triage labels Aug 30, 2024
@github-actions github-actions bot added the linux label Aug 30, 2024
bzbarsky-apple added a commit to bzbarsky-apple/connectedhomeip that referenced this issue Aug 30, 2024
On Linux we were using the "check" mode, but it's buggy in ways that cause app
crashes (see project-chip#35306).
Just disabling it for now so this is not an issue for SVE.
@bzbarsky-apple
Copy link
Contributor Author

Disabling this in #35307 for now to work around SVE problems.

mergify bot pushed a commit that referenced this issue Aug 30, 2024
…35307)

On Linux we were using the "check" mode, but it's buggy in ways that cause app
crashes (see #35306).
Just disabling it for now so this is not an issue for SVE.
@andy31415
Copy link
Contributor

Reproduced via python script by:

    attr = await devCtrl.ReadAttribute(
        node_id, [Clusters.WaterHeaterManagement.Attributes.HeaterTypes]
    )

(wrote a test script in https://github.com/andy31415/chip-repl-tests/blob/main/energy_management_read.py).
Looking into fixing this

@andy31415
Copy link
Contributor

It seems the logic in the code states that we do not compare data length for non-unit-tests because it is not reliable (time-dependent sizes may differ). However the logic for this seems broken as CONFIG_BUILD_FOR_HOST_UNIT_TEST seems to be enabled when building the energy management app. This seems odd.

@mergify mergify bot closed this as completed in #35319 Aug 30, 2024
shgutte pushed a commit to shgutte/connectedhomeip that referenced this issue Sep 10, 2024
…roject-chip#35307)

On Linux we were using the "check" mode, but it's buggy in ways that cause app
crashes (see project-chip#35306).
Just disabling it for now so this is not an issue for SVE.
ArekBalysNordic pushed a commit to ArekBalysNordic/sdk-connectedhomeip that referenced this issue Nov 28, 2024
…#35307)

On Linux we were using the "check" mode, but it's buggy in ways that cause app
crashes (see project-chip/connectedhomeip#35306).
Just disabling it for now so this is not an issue for SVE.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working linux needs triage
Projects
None yet
2 participants