[Bug] OLED over I2C on RP2040 causes secondary half of split keyboard to slowdown and crash #17720
Comments
This isn't an issue I've experienced using my Corne with SparkFun RP2040s. I'm assuming you have the OLEDs fitted? Do you have a branch with the code you are using to cause this issue? |
Here you go! And yes, I do have the OLED fitted on both sides. |
I have experienced this behavior on boards (CRKBDs and TidBit) with either the KB2040 or the PM-RP2040. In my case, it was due to I2C errors caused by some poorly soldered joints. I suggest you enable debug and see if any failed transactions occur :) |
I can reproduce this across two separate keyboards, and it does not happen on an elite-c, for example. I will look at enabling debug though! |
Set CONSOLE_ENABLE = yes in rules.mk and added the below to my keymap.c
The attached file is the output from QMK toolbox. The issue was occurring at the time of capture. Toolbox would stop outputting when keypresses would fail on the secondary side. |
Interesting... Does the bug present itself when using the other half as master? Or even when only using just 1 half without the other connected? |
It is always the secondary side that stops responding to keypresses. So I can plug in either side, and the secondary will have the issue. The 'main' half always types as expected. |
I have also experienced this issue on my crkbd w/ KB2040. When it happens on my keyboard I notice the following symptoms, all exclusively on the Secondary side.
|
Interesting, just flashed @CodyMathis123 keymap in my Crkbd with KB2040s and it locks up after a minute or two. Funny enough my keymap which uses a simple Update: using an animation taken from my AVR CRKBD also causes lock up... |
Just tested with both PM-RP2040 and KB2040, either as a slave or master, and in all cases the slave dies the same. I'm using this to reliably crash my slave half:
@CodyMathis123 @amadea-system, can you confirm that this is the same behavior as yours? Video: Img.1456-1.mp4 (KB2040 as master and PM-RP2040 as slave) |
Shouldn't they be 512 bytes? |
Oh shoot! My maths did me wrong. Sadly, even after changing the size and making it static, it still locks up. Hmmm.
Update: It makes sense that changing the size doesn't affect the reproducibility of the bug, thanks to the sanity check in the OLED driver for dumb mistakes like mine... |
Yeah I can confirm the issue, the RGB matrix and key issues aren't as apparent when you have a slow
Using the following code Jpe230 provided triggers the issue pretty quickly, thanks for that.
static int current_block = 0;
current_block = (current_block + 1) % 512;
static char white_bg[512] = {0};
memset(white_bg, 0, 512);
memset(white_bg, 0xff, current_block);
oled_write_raw(white_bg, 512);
From some quick initial testing calling
diff --git a/platforms/chibios/drivers/i2c_master.c b/platforms/chibios/drivers/i2c_master.c
index 21e064b1dc..cbf3b84d9f 100644
--- a/platforms/chibios/drivers/i2c_master.c
+++ b/platforms/chibios/drivers/i2c_master.c
@@ -115,6 +115,7 @@ static i2c_status_t chibios_to_qmk(const msg_t* status) {
             return I2C_STATUS_TIMEOUT;
         // I2C_BUS_ERROR, I2C_ARBITRATION_LOST, I2C_ACK_FAILURE, I2C_OVERRUN, I2C_PEC_ERROR, I2C_SMB_ALERT
         default:
+            i2c_stop();
             return I2C_STATUS_ERROR;
     }
 } |
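For readers following along, here is a minimal host-side sketch of what the one-line patch above changes: every failure other than a timeout now also stops the bus before the error is reported. All names in it (`bus_result_t`, `status_t`, `mock_i2c_stop`, `map_status`) are hypothetical stand-ins so the snippet compiles on a PC; they are not the real QMK or ChibiOS definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's types, NOT the real QMK/ChibiOS ones. */
typedef enum { BUS_OK, BUS_TIMEOUT, BUS_ERROR } bus_result_t;
typedef enum { STATUS_SUCCESS = 0, STATUS_ERROR = -1, STATUS_TIMEOUT = -2 } status_t;

static bool bus_running = true; /* models whether the peripheral is started */
static int  stop_calls  = 0;    /* counts how often the bus was shut down   */

/* Models the effect of i2c_stop() in the patch: the peripheral is shut down. */
static void mock_i2c_stop(void) {
    bus_running = false;
    stop_calls++;
}

/* Same shape as the patched chibios_to_qmk(): success and timeout pass
 * through unchanged, any other result stops the bus and reports an error. */
static status_t map_status(bus_result_t result) {
    switch (result) {
        case BUS_OK:
            return STATUS_SUCCESS;
        case BUS_TIMEOUT:
            return STATUS_TIMEOUT;
        default:
            mock_i2c_stop();
            return STATUS_ERROR;
    }
}
```

The sketch only models control flow. It does not explain why the errors occur in the first place, which is the question the rest of the thread goes on to investigate.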
This does seem to resolve the issue! Greatly appreciated. I will keep an eye on this and see if it reoccurs in any way. I am curious about what I2C errors are happening? And should they be happening? |
@Jpe230 @CodyMathis123 @daskygit I'm not sure on the root cause of the problem (yet), but I was able to reproduce the lockup easily. The cause was an I2C IRQ tail chaining of the TX FIFO EMPTY event, after a transmission was done. My fix is to disable all IRQ sources after a successful transmission (just like in the zephyr driver). That solved the lockups for me. Changes are already in a PR -> ChibiOS/ChibiOS-Contrib#329 |
I was going to open an issue after discovering and digging into the same thing -- I just wanted to add my anecdata to the mix. I have seen it most frequently on the slave side while testing the SNAP, but have also seen the same lockups on the master side, albeit with a lower rate of occurrence. The problem seems specific to RP2040, as I haven't seen any I2C errors with AVR on the same board. |
@jaygreco could you test the ChibiOS-contrib PR and see if this fixes your lockups as well? |
No problem. I'll grab your PR and test with my setup and share results. |
As of this afternoon, I was able to reproduce the issue on both master and slave of my test board with your ChibiOS-contrib patch pulled in. Is there any additional debug info I can pull during the failure that might help? |
Chiming in...the bug is still reproducible with the linked PR. I now managed to reliably crash the master half by spamming I2C commands to the OLED display |
Interesting, could both of you post your configs and the code that make it crash? |
Sure! My config is here: https://github.com/Jpe230/Jpe230sKeebs/blob/main/keyboards/crkbd/keymaps/jpe230_rp/config.h And to crash the board I use this:
|
45 min with the patch applied and no lockups -- Gonna leave it plugged for the rest of the day but so far it seems that it has resolved the lockups 👯 |
Thanks @KarlK90 -- happy to test over the weekend. No worries there. I'll post some updates in a few days. |
It's looking good on my end -- no OLED hangs so far. |
PRs: ChibiOS-contrib: ChibiOS/ChibiOS-Contrib#329
I have mostly rewritten the RP2040 I2C driver in the last couple of days and pushed the current state of affairs to ChibiOS-Contrib. The driver now fully utilizes the
Please try out these changes and leave your feedback in the PRs if this fixes your problems.
Torture test:
void housekeeping_task_user(void) {
    static int current_block = 0;
    current_block = (current_block + 1) % 512;
    static char white_bg[512] = {0};
    memset(white_bg, 0, 512);
    memset(white_bg, 0xff, current_block);
    oled_write_raw(white_bg, 512);
} |
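To make the traffic pattern of the torture test easier to see, here is a small host-side sketch of the buffer it sends on every housekeeping pass: a 512-byte frame whose first `current_block` bytes are 0xFF and whose remaining bytes are 0x00. The helper names (`fill_frame`, `next_block`, `count_lit`) and the `FRAME_SIZE` macro are made up for illustration and are not part of QMK; the call to `oled_write_raw` is left out so the snippet builds without the firmware.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SIZE 512 /* same buffer size as in the torture test */

/* Fill the frame like the torture test: first `lit` bytes 0xFF, rest 0x00.
 * Assumes lit <= FRAME_SIZE, which next_block() guarantees. */
static void fill_frame(uint8_t frame[FRAME_SIZE], size_t lit) {
    memset(frame, 0x00, FRAME_SIZE);
    memset(frame, 0xFF, lit);
}

/* Advance the counter the same way as (current_block + 1) % 512. */
static size_t next_block(size_t current) {
    return (current + 1) % FRAME_SIZE;
}

/* Count how many bytes of the frame are fully lit (0xFF). */
static size_t count_lit(const uint8_t frame[FRAME_SIZE]) {
    size_t lit = 0;
    for (size_t i = 0; i < FRAME_SIZE; i++) {
        if (frame[i] == 0xFF) {
            lit++;
        }
    }
    return lit;
}
```

Because the counter wraps at 512, at most 511 bytes are ever lit, yet the full 512-byte frame is pushed over I2C on every pass. That constant stream of full-frame writes is what makes the test stress the bus.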
Looking good as well on my end: currently running the Dilemma with OLED on the slave side, Cirque trackpad on the master side, and I haven't had any issues in the last 6 hours that the keyboard has been plugged in after being flashed with the patches referenced in the comment above. The same configuration used to lock up in just a few minutes before those changes. Thanks, @KarlK90! |
With #17817 merged the fixes should now be available on the latest develop branch. |
A late reply: I torture-tested one extra display (using my old code + rendering it each ms) over the course of the night and am happy to report that I couldn't find any problem with the aforementioned PRs! Amazing job as always, and thanks for the fix, @KarlK90! |
Nice, thanks for the feedback everyone. I'll close this issue then. |
Describe the Bug
When OLED screens are enabled and used on a KB2040 (assuming this affects other RP2040 boards as well) over I2C, the secondary side of the keyboard becomes slow, misses keystrokes, and eventually becomes unresponsive.
System Information
Keyboard: Corne / CRKBD
Revision (if applicable): v3
Operating system: Windows 10
qmk doctor output:
Any keyboard related software installed?
Additional Context
I am not sure if this only impacts split keyboards or not, but it only seems to impact the secondary side of the split.