Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AP_IOMCU: status read error counter never resets #28534

Open
RickReeser opened this issue Nov 6, 2024 · 4 comments
Open

AP_IOMCU: status read error counter never resets #28534

RickReeser opened this issue Nov 6, 2024 · 4 comments

Comments

@RickReeser
Copy link
Contributor

RickReeser commented Nov 6, 2024

Bug report

Copter 4.5.6
CubeOrange; CubeOrangePlus

If I leave my vehicle turned on for a long time, I inevitably encounter this prearm error:

AP: PreArm: Internal errors 0x1000 1:469 iomcu_reset error

Which points to this line: https://github.com/ArduPilot/ardupilot/blob/Copter-4.5/libraries/AP_IOMCU/AP_IOMCU.cpp#L469

An overnight disarmed log of IOMCU errors looks like this:
image

So I guess the status read errors are accruing until they hit the limit of 20, triggering the prearm error. The comments in the code indicate that they are supposed to reset upon a successful read:

    if (read_status_ok == 0) {
        // reset error count on first good read
        read_status_errors = 0;
    }

This never happens because read_status_ok is never 0. I guess this line is supposed to be read_status_ok > 0? This would cause the errors to reset (maybe the conditional is not needed at all since we return early if the read is not successful).

However, this would cause another issue. read_status_errors is used elsewhere:

bool AP_IOMCU::healthy(void)
{
    return crc_is_ok && protocol_fail_count == 0 && !detected_io_reset && read_status_errors < read_status_ok/128U;
}

Resetting the read errors will make the last check useless, because read_status_ok is always incrementing and will be some arbitrarily large number. It should be compared to the total read errors ever seen, so I guess we should track the total errors and errors-since-last-success separately.

This check was added by this commit.

Version
Copter 4.5.6

Platform
[ x ] All

@peterbarker
Copy link
Contributor

@RickReeser did you want to put together a PR with your proposal?

@RickReeser
Copy link
Contributor Author

RickReeser commented Nov 24, 2024

@RickReeser did you want to put together a PR with your proposal?

#28725

I have been testing this on 3.5.6 for a few weeks now, everything seems good.

There is one difference between this PR and my tests, though, and that is the logging: this PR changes the logging to match the comments (the counter resets upon successful read). But in my test code, I left it to accumulate so that I could verify that the prearm error did not appear once it reached 20+ counts.

@peterbarker
Copy link
Contributor

@RickReeser were you running BRD_IO_DSHOT?

@RickReeser
Copy link
Contributor Author

@RickReeser were you running BRD_IO_DSHOT?

No, no dshot. I was using regular PWM. Here is the param file from the log that was screenshotted in the initial post.

iomcu-error-accumulation.txt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants