[Bug] Looks like 0.11.0 broke something for jj40 #11389

Open
Artheg opened this issue Jan 1, 2021 · 58 comments

Comments

@Artheg

Artheg commented Jan 1, 2021

Hi,

I wanted to alter the keymap for my jj40 board. When I flashed the firmware, the keys started to behave strangely.
Sometimes there was no output at all, sometimes they would 'stick' (e.g. I type 'g' and the output goes 'gggggg...' until I hit 'g' again), and sometimes there would be a delay after I hit the key. Backlight Underglow wouldn't work at all.
I've tried to compile the firmware both locally and on the website and I've had the same effect.

With help from people on Discord (spidey3, Dasky), we figured out that the last working firmware was compiled with 0.10.54. Everything after that just didn't work.

@Artheg Artheg changed the title from "[Bug] 0.11.0 broke something for jj40" to "[Bug] Looks like 0.11.0 broke something for jj40" Jan 1, 2021
@spidey3
Contributor

spidey3 commented Jan 3, 2021

I believe that @zvecr planned to take a look at this...

@Artheg
Author

Artheg commented Jan 21, 2021

I'm on the latest firmware now (0.11.53).
Looks like 'sticking' and delay are gone.
Backlight Underglow still doesn't work.

@spidey3
Contributor

spidey3 commented Jan 21, 2021

Can you describe the backlight issue in more detail?
Do the underglow LEDs work?
Does Raise+Lower+S or Raise+Lower+D change anything?
What about Raise+Lower+X or Raise+Lower+C?

@Artheg
Author

Artheg commented Jan 22, 2021

I'm sorry, I used the wrong words here.
What I meant by backlight was RGB Lighting (underglow).
I've tried the combinations you suggested (default keymap), but unfortunately nothing happens.

@spidey3
Contributor

spidey3 commented Jan 23, 2021

Recapping:
OP does not have backlight LEDs installed.
The remaining issue is to diagnose the difficulty enabling the RGB Lighting (underglow).

@benthepoet

benthepoet commented Jan 24, 2021

I can confirm I'm having this issue with the 0.11.53 release. After flashing, the board feels sluggish, often not responding to several key presses, and will start endlessly repeating a character if you roll your fingers along the top row quickly. When flashed with 0.10.54, as @Artheg mentioned, the board works fine (my underglow works also).

dmesg in Linux shows some weird "reset low-speed USB device" messages. These don't show up when using 0.10.54.

[11025.087805] usb 5-1: New USB device found, idVendor=4b50, idProduct=0040, bcdDevice= 2.00
[11025.087815] usb 5-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[11025.087820] usb 5-1: Product: JJ40
[11025.087824] usb 5-1: Manufacturer: KPrepublic
[11025.117297] input: KPrepublic JJ40 as /devices/pci0000:00/0000:00:13.0/usb5/5-1/5-1:1.0/0003:4B50:0040.0013/input/input49
[11025.120741] usb 5-1: ctrl urb status -62 received
[11025.173358] hid-generic 0003:4B50:0040.0013: input,hidraw0: USB HID v1.01 Keyboard [KPrepublic JJ40] on usb-0000:00:13.0-1/input0
[11025.189945] input: KPrepublic JJ40 System Control as /devices/pci0000:00/0000:00:13.0/usb5/5-1/5-1:1.1/0003:4B50:0040.0014/input/input50
[11025.246311] input: KPrepublic JJ40 Consumer Control as /devices/pci0000:00/0000:00:13.0/usb5/5-1/5-1:1.1/0003:4B50:0040.0014/input/input51
[11025.246559] hid-generic 0003:4B50:0040.0014: input,hidraw1: USB HID v1.01 Device [KPrepublic JJ40] on usb-0000:00:13.0-1/input1
[11026.692777] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11028.906366] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11031.116836] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11033.336801] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11035.200201] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11037.367141] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11038.073905] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11039.960724] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11040.647408] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11042.430964] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11044.287781] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11046.401295] usb 5-1: reset low-speed USB device number 8 using ohci-pci
[11048.441467] usb 5-1: reset low-speed USB device number 8 using ohci-pci
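
A rough way to put a number on this when comparing firmware versions is to count the reset lines and measure the gap between them. The following is only a sketch: the names `reset_stats_t`, `count_usb_resets` and `mean_reset_interval` are made up for illustration, and it assumes every line starts with a `[seconds.micros]` timestamp, as in the excerpt above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Summary of the reset messages found in a dmesg excerpt (hypothetical type). */
typedef struct {
    int    count;    /* number of "reset low-speed USB device" lines */
    double first_ts; /* timestamp of the first reset, in seconds */
    double last_ts;  /* timestamp of the last reset, in seconds */
} reset_stats_t;

/* Scan a NUL-terminated dmesg excerpt line by line and collect reset stats. */
static reset_stats_t count_usb_resets(const char *log) {
    assert(log != NULL);
    reset_stats_t stats = {0, 0.0, 0.0};
    const char *line = log;

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        /* Copy the line so the substring search cannot run into the next one. */
        char buf[512];
        if (len >= sizeof(buf)) len = sizeof(buf) - 1;
        memcpy(buf, line, len);
        buf[len] = '\0';

        if (buf[0] == '[' && strstr(buf, "reset low-speed USB device") != NULL) {
            double ts = strtod(buf + 1, NULL); /* parse the number after '[' */
            if (stats.count == 0) stats.first_ts = ts;
            stats.last_ts = ts;
            stats.count++;
        }

        line = end ? end + 1 : line + strlen(line);
    }
    return stats;
}

/* Mean time between consecutive resets; 0.0 if there are fewer than two. */
static double mean_reset_interval(reset_stats_t s) {
    return s.count > 1 ? (s.last_ts - s.first_ts) / (s.count - 1) : 0.0;
}
```

Fed the excerpt above, this would count 13 resets between 11026.692777 and 11048.441467, a mean gap of about 1.8 seconds. Running the same count against a 0.10.54 build would give a comparable number instead of an impression.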

@Artheg
Author

Artheg commented Jan 26, 2021

@benthepoet Are you sure that you're having 'sticking' issues with 0.11.53? I'm on 0.11.53 with my JJ40 and sticking is gone, although underglow is not working.

Also, my underglow doesn't work at all. I've tried flashing my board with all versions from 0.11.0 back to 0.10.54, and it still doesn't work. All layers are working properly. I've tried to reset EEPROM. I've also tried to map RGB keys to layer 0.
Is my underglow broken for good?

@benthepoet

Yep, I'm definitely getting the sticking/delay issues with 0.11.53. Cleared EEPROM and flashed several times with the same result.

One strange thing is that if I plug the keyboard into a USB hub then the sticking/delay issues don't occur but if I plug the board directly into the USB ports on my motherboard that's when it starts acting up. Using 0.10.54 I can plug directly into my motherboard without any issues.

Any time I've flashed the board, my underglow has worked; it usually defaults to solid red when I use the default layout.

@spidey3
Contributor

spidey3 commented Jan 26, 2021

> One strange thing is that if I plug the keyboard into a USB hub then the sticking/delay issues don't occur but if I plug the board directly into the USB ports on my motherboard that's when it starts acting up. Using 0.10.54 I can plug directly into my motherboard without any issues.

This could be a USB device state issue. Sometimes hubs behave differently.
I wonder - can you try the following versions:

  • 0.11.43
  • 0.11.18
  • 0.11.15
  • 0.11.0

I'd like to figure out the version where the problem started...

@benthepoet

I did some testing and it looks like the problems start with 0.11.0. The board only registers keypresses for a little bit and then it becomes almost completely unresponsive.

@tzarc
Member

tzarc commented Feb 3, 2021

Reproduced this with my Mysterium and 0.11.57. Left it sitting, connected directly to onboard USB. Connecting through a hub didn't trigger it.

@tzarc
Member

tzarc commented Feb 3, 2021

Seems like I've come to the same conclusion -- 0.11.0 is broken, 0.10.54 is not.

@MajorKoos
Contributor

I've got the same issue with a port I'm working on for Leeku PCBs (atmega32a + bootloadhid).
Type a single "." and I get a row of "............................" and then it hangs or sometimes even disconnects.
0.10.54 is stable, but 0.11.x is not.

@sowbug
Contributor

sowbug commented Feb 28, 2021

I'm experiencing similar behavior on a custom keyboard (sowbug/68keys). This is a Blue Pill board. Problems began when I rebased to master around February 10 and flashed the resulting firmware after problem-free usage of firmware built around March 2020. Note that the issue seems to happen only on a Google Pixelbook, but not on my personal Linux machine. I had assumed it was thus a problem with a recent Chrome OS update, but I'm now starting to suspect the QMK update. I'll try rolling back to last year's firmware and see if the issue goes away.

@sowbug
Contributor

sowbug commented Mar 1, 2021

I updated to yesterday's master, rebuilt, and flashed. So far I haven't seen the problem again.

@jhbruhn

jhbruhn commented Mar 5, 2021

I am having the same problem with a Discipad (atmega328p as well) and the most recent master. The problem is also present in 0.11.0, but not in 0.10.54. The only relevant change I could find was in tmk_core/protocol/vusb/main.c running the housekeeping tasks, but reverting that did not fix it.
Interestingly enough, it works on a USB3 port, but not via a USB2.0 hub, while my ATMega32 Discipline works on that same hub.

@spidey3
Contributor

spidey3 commented Mar 6, 2021

Is this still happening in 0.12.0 and later?

@jhbruhn

jhbruhn commented Mar 6, 2021

Yes. I tested with the latest master and a couple of versions between 0.10.54 and now.

@jhbruhn

jhbruhn commented Mar 11, 2021

I assume the underlying problem is some kind of performance regression, possibly also the avr-gcc compiler version I am using? Using 0.12.15, I set the USB_POLLING_INTERVAL_MS to 40 instead of the default 10 and the "reset low-speed USB" message count was heavily reduced (0 at the moment). I additionally set OPT to 2 instead of s because the ATMega328 seems to have enough memory.
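
For anyone wanting to try the same workaround, the settings described above would look roughly like this. It is a sketch only: it assumes the define goes in a keyboard-level or keymap-level config.h and that OPT is set in the matching rules.mk, and the value 40 is simply the one reported above, not a recommendation.

```c
/* config.h (keyboard or keymap level; the exact location is an assumption) */

/* Poll every 40 ms instead of the default 10 ms mentioned above. */
#define USB_POLLING_INTERVAL_MS 40

/* The optimisation level is a build setting rather than a C define.
 * In rules.mk (make syntax, not C) the change described above would be:
 *
 *     OPT = 2
 */
```

A longer polling interval means the host asks for reports less often, so this trades some input latency for fewer resets.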

@daskygit
Member

daskygit commented May 1, 2021

I was helping someone on Discord with this issue using a Sesame and a ThinkPad dock; version 0.10.54 worked fine.

@jeffjewiss

I was having the above issue of repeated keys or my keyboard locking up with my Discipline65.

A couple days ago I pulled down master and reflashed the keyboard and I haven't had repeated characters or a lockup since (fingers crossed). I think this PR is what contained the fix: #12576

@fauxpark
Member

fauxpark commented May 6, 2021

@jeffjewiss definitely not, the Discipline65 (and the JJ40 as well) runs an ATmega32A, so QMK will use V-USB rather than ChibiOS or LUFA. I would be interested to know what it was, though - I don't think there's been any meaningful changes to the V-USB code recently.

@jeffjewiss

@fauxpark fair enough, thanks for explaining. I just guessed at what the fix might have been from looking at the commits over the last 2-3 weeks.

My discipline went from being basically unusable since it would lockup or repeat characters multiple times an hour to not having any issues at all. The only thing I did was pull qmk master and reflash.

I could try to git bisect to find what commit added the fix, but otherwise I'm not familiar with the internals of qmk at all.

@MajorKoos
Contributor

I can't find the specific PR at the moment, but when I did some digging I saw that one of the PRs related to adding the extra USB endpoint mentioned that it added some additional latency but should still be in spec for devices slower than 16 MHz. I wonder if that's it, since some folks have mentioned that changing the polling interval helps.

@kosmiciatakuja

I started bisecting the repo for the problem, and so far I've narrowed it down to the transition between 0.11.69 and 0.12.0. Everything seems to work on 0.11.69, while 0.12.0 causes the keyboard to lock up over time. I'm still testing this; if that is indeed the case, I'm going to test every commit between 0.11.69 and 0.12.0 to see what causes the problem. Fortunately there are only a few of them. Once I find it, it should be possible to reverse-patch it onto the current version (if nothing else has changed in that area). I'll report back soon.

@kosmiciatakuja

I managed to track this down to this specific commit. When I burn the previous commit (804d5c1) it works, but with this one (1581ea4) compiled and burned, the keyboard hangs within maybe 30 minutes and must be replugged to work. The only thing I don't understand is that this commit only changes a bunch of *.py files, which I believe are responsible for the CLI, and no .c files with any serious code in them. That is strange, and I'm not sure how to proceed because of it.

@sigprof
Contributor

sigprof commented Sep 15, 2021

The commits 804d5c1 and 1581ea4 are not consecutive, however — there are 645 commits between them (or 144, if you ignore merge commits). Maybe you need to do another bisect round just between these points, or you just pasted a wrong commit ID.

@kosmiciatakuja

I'm confused then, this is how it looks in my git log:

git log screenshot

I circled both commits in yellow. Both are from February 27, so they should be close, I guess. It may make a difference that one is on master and the other is on develop, but as seen in the screenshot, the second one (from develop) was merged to master just one commit later, so there shouldn't be many differences...

@sigprof
Contributor

sigprof commented Sep 15, 2021

This graph is somewhat misleading — the commit 804d5c1 was made in the master branch just before the February 27 breaking changes merge, while the commit 1581ea4 was made in the develop branch, again just before that merge. So the difference between those commits is basically the whole content of the develop branch that accumulated over ≈3 months since the previous breaking changes round, and your result basically says “something that was added in the February 27 breaking changes merge broke things”, which is actually a lot of code.

You probably should use the commit 3cc7d22 as the base instead — it is the point where that incarnation of the develop branch was forked from master. The history from 3cc7d22 to 1581ea4 contains both commits to develop and merges from master.

Also be sure to run make git-submodule before every compile — the develop branch contains some changes to submodules, and if you miss updating them, you won't be testing the correct code. (Although this particular commit range seems to have only chibios-related changes, which won't affect your board.)
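
To give a feel for the effort involved: a bisect halves the suspect range on every build, so even the 645 commits mentioned earlier need only about ten flash-and-test rounds. The sketch below shows that halving on a simplified, linear list of commits. Real history is a graph with merges, which `git bisect` handles for you, and `first_bad_commit`, `is_bad_fn` and `bug_since_400` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test callback: returns nonzero if the firmware built from
 * commit number `index` shows the bug. */
typedef int (*is_bad_fn)(size_t index);

/* Binary search over commits numbered 0..n-1, where commit 0 is known good
 * and commit n-1 is known bad. Returns the index of the first bad commit and
 * stores the number of builds that had to be tested in *builds. */
static size_t first_bad_commit(size_t n, is_bad_fn is_bad, unsigned *builds) {
    assert(n >= 2 && is_bad != NULL && builds != NULL);
    size_t good = 0;     /* newest commit known to work */
    size_t bad  = n - 1; /* oldest commit known to fail */
    *builds = 0;

    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        (*builds)++; /* one build, flash and test per step */
        if (is_bad(mid)) {
            bad = mid;
        } else {
            good = mid;
        }
    }
    return bad;
}

/* Example predicate for illustration: pretend commit 400 introduced the bug. */
static int bug_since_400(size_t index) {
    return index >= 400;
}
```

Calling `first_bad_commit(647, bug_since_400, &builds)` models the two endpoints plus 645 commits in between, and finds the culprit without testing every commit. The catch with this particular bug is that each "test" can take a while, since the hang may only show up after some time.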

@kosmiciatakuja

Okay, understood, I think (about the commits). For me it would be simplest to just stick to one branch (master) and go commit by commit there. In that way, the commit for 12.0 breaks everything. But I have a slight breakthrough in this case. Since my keyboard is a split one (Keebio Nyquist), it has two USB ports, one on the left half and one on the right. I'd been using the left port since the beginning, as it is closer. Now I've switched to the right port and poof - all my problems are gone. The keyboard has been working for over a day now on a later version with no hangs, absolutely 100% good behavior. I tried the same version with the left-hand port and it hangs as usual. I just need to burn QMK again after switching ports. Can anybody confirm a) whether the problems they were experiencing were on split keyboards, and b) whether they go away after switching the USB connection to the other keyboard half?

@sigprof
Contributor

sigprof commented Sep 17, 2021

Hmm, looks like we are discussing your problem with Keebio Nyquist in the wrong place then (and I did not notice that you are writing about a completely different keyboard). This issue is about problems with jj40, which is a V-USB board; you are experiencing problems with a board based on ATmega32U4 with a native USB interface, therefore your problem is probably caused by something completely different.

Please open a separate issue about your problem.

And “the commit for 12.0 breaks everything” is unfortunately not very useful — running a bisect over the develop changes in that range could pinpoint a single problematic commit, however, which would be really appreciated. Although your findings that the problem is linked with using a specific half as master may also mean that you have some hardware issues with one of the halves (assuming that you always reflashed both halves with the same firmware when testing).

@sweetsuicide

Hi, I tried flashing my jj40 using QMK Toolbox 0.1.1, with firmware I modified and compiled on the web page (I have no idea what version it is). I have the very same issue as the one described in the first ticket. I am available to help analyse the issue.

@ollien

ollien commented May 3, 2022

I'm still experiencing this with a freshly built Sesame keyboard. I managed to make it work with firmware 0.10.54.

Anyway - I flashed what's currently at master (c03e18f) and experienced this problem. I also collected a capture of me pressing "p" and "o", with "p" eventually getting stuck. If I'm honest, I'm not experienced enough to know what I'm looking at here, but hopefully I was able to isolate it enough.

pcap.zip

I'm going to try my hand at bisecting this and will report back if I find anything.

@ollien

ollien commented May 3, 2022

I found it, I think! 75a18e6 breaks in exactly the way described, but 69d8bbf (its parent) works totally fine. I unfortunately do not have the experience necessary to see a problem (nor to know if any of the intermediate commits in the original pull request, #10491, are safe to flash without harming my keyboard).

Please let me know if there's anything else I can provide to help track down this problem.

@jhbruhn

jhbruhn commented May 3, 2022

Nice work! My guess is: The timing of the vusb implementation is broken through these new atomics disabling interrupts, which leads to the USB endpoint failing.

I do not know how critical these are, but can they be disabled by doing a #define IGNORE_ATOMIC_BLOCK?

@ollien

ollien commented May 3, 2022

@jhbruhn Yep - I applied this patch to 75a18e6 and I'm typing this comment on it now...

diff --git a/quantum/quantum.h b/quantum/quantum.h
index 42e8c00091..c1320d2645 100644
--- a/quantum/quantum.h
+++ b/quantum/quantum.h
@@ -220,6 +220,7 @@ typedef ioline_t pin_t;
 #    define togglePin(pin) palToggleLine(pin)
 #endif
 
+#define IGNORE_ATOMIC_BLOCK
 // Atomic macro to help make GPIO and other controls atomic.
 #ifdef IGNORE_ATOMIC_BLOCK
 /* do nothing atomic macro */
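
For readers wondering what the define actually changes, here is a host-side sketch of the idea. It is not the code in quantum.h: the interrupt flag is simulated with a plain variable, and `SKETCH_ATOMIC_BLOCK`, `sim_cli`, `sim_sei` and `sim_read_row` are made-up names. The assumption being illustrated is the guess above, namely that an atomic block turns interrupts off around its body, and that defining IGNORE_ATOMIC_BLOCK swaps it for a block that runs the body once and leaves interrupts alone.

```c
#include <assert.h>

/* Simulated global interrupt flag; on real AVR hardware this would be the
 * I bit in SREG. Everything in this sketch is a stand-in, not QMK code. */
static int sim_irq_enabled = 1;
static int sim_irq_disable_count = 0;

/* Return values are chosen so the calls can drive the for-loop below. */
static int sim_cli(void) { sim_irq_enabled = 0; sim_irq_disable_count++; return 1; }
static int sim_sei(void) { sim_irq_enabled = 1; return 0; }

#ifdef IGNORE_ATOMIC_BLOCK
/* Run the body exactly once, leaving interrupts alone. */
#    define SKETCH_ATOMIC_BLOCK \
        for (int sketch_todo_ = 1; sketch_todo_; sketch_todo_ = 0)
#else
/* Run the body exactly once with interrupts off, then force them back on. */
#    define SKETCH_ATOMIC_BLOCK \
        for (int sketch_todo_ = sim_cli(); sketch_todo_; sketch_todo_ = sim_sei())
#endif

/* Stand-in for a matrix row read. Returns whether interrupts were enabled
 * while the body of the atomic block was running. */
static int sim_read_row(void) {
    int irq_state_inside = -1;
    SKETCH_ATOMIC_BLOCK {
        irq_state_inside = sim_irq_enabled;
    }
    assert(sim_irq_enabled == 1); /* interrupts are on again afterwards */
    return irq_state_inside;
}
```

Built as is, `sim_read_row()` reports interrupts off inside the block; built with `-DIGNORE_ATOMIC_BLOCK`, it reports them on. V-USB does its USB signalling in software from an interrupt handler, so if the guess above is right, any stretch with interrupts off is a window in which a host request can be missed.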

@jhbruhn

jhbruhn commented May 4, 2022

I currently can't test this myself, but as this only seems to happen for ATMEGA32-based keyboards (?), can we do a patch which disables the atomics implementation for that processor? Or maybe even only for the specific keyboards in the associated config.h?

@ollien

ollien commented May 4, 2022

@jhbruhn I guess you could, but that would reintroduce the RGB bugs that the original PR aimed to fix. That said, maybe by luck none of the boards affected here were affected by that? I'd have to dig a bit to answer that.

@fauxpark
Member

fauxpark commented May 5, 2022

I am not seeing this issue with either my JJ4x4 (32a) or my Plaid-Pad (328p) - but those are both 4x4 macropads, so perhaps it has something to do with matrix size (ie. larger matrix takes more time to process atomically).

@ollien

ollien commented May 8, 2022

It seems that a couple of keyboards already disable atomics as a workaround for matrix delay (see: https://github.com/qmk/qmk_firmware/blob/master/keyboards/massdrop/alt/config.h#L43-L44, which may very well be due to this issue, although that board doesn't use an ATMEGA32). I've opened a PR for the Sesame, which is the only affected keyboard I can test. It doesn't have LEDs, so it isn't affected by the issue that #10491 was addressing.

@zvecr
Member

zvecr commented May 8, 2022

The Drop boards do this for a different reason. Mainly that waitInputPinDelay is not implemented, and there is no benefit to having the matrix interactions be atomic, where its inclusion throws off the expected timings.

Setting IGNORE_ATOMIC_BLOCK has nothing to do with running the keyboard "without matrix delay".

@ollien

ollien commented May 8, 2022

@zvecr Gotcha, thanks. I was mostly inferring based on the comments above. This solution does seem to work, but obviously it's not optimal.

Daveyr pushed a commit to Daveyr/qmk_firmware that referenced this issue May 26, 2022
@eddible

eddible commented Jun 2, 2022

Just to add to the above, I had this exact issue with a FaceW PCB, and adding #define IGNORE_ATOMIC_BLOCK to config.h fixed it.

@val-m4

val-m4 commented Jun 16, 2022

@eddible this fix didn't help me.

I have found out that the keyboard (Sesame) does not work with a laptop (Dell XPS) with USB 3. But when I plug in a USB 2.0 hub, it starts working and processing keypresses, though the stutter is still present. I've put some debug code into the firmware. This is how it looks for me

@tzarc
Member

tzarc commented Jun 16, 2022

> @eddible this fix didn't help me.
>
> I have found out, that the keyboard (sesame) does not work with a laptop (dell XPS) with USB 3. But when I plug in the USB hub 2.0, it starts working and processing keypresses, stutter still present. I've put some debug code into the firmware. This is how it looks for me

Matrix debug output is likely the culprit, here. Does it stutter with the printouts disabled?

@val-m4

val-m4 commented Jun 17, 2022

> > @eddible this fix didn't help me.
> > I have found out, that the keyboard (sesame) does not work with a laptop (dell XPS) with USB 3. But when I plug in the USB hub 2.0, it starts working and processing keypresses, stutter still present. I've put some debug code into the firmware. This is how it looks for me
>
> Matrix debug output is likely the culprit, here. Does it stutter with the printouts disabled?

Yes, the same. If I press a key really fast, the keyboard outputs it with a small delay, then freezes for a second or so, and then outputs it again without the key being pressed. The behavior is the same with debug mode, and it only works with a USB 2.0 hub.

@tzarc
Member

tzarc commented Jun 17, 2022

I'm not sure, then. I can't reproduce this behaviour on my jj40, running the default keymap, with no core modifications.

@val-m4

val-m4 commented Jun 17, 2022

> I'm not sure, then. I can't reproduce this behaviour on my jj40, running the default keymap, with no core modifications.

It seems to me that the problem is also in the USB.
I've used the old version 0.10.47 and it works almost OK with a USB 2.0 hub, and doesn't work in case of a direct connection to USB 3.0, without a hub. Also, it has more connection issues when the laptop is running on battery.

@tzarc
Member

tzarc commented Jun 17, 2022

> and doesn't work in case of a direct connection to USB 3.0, without a hub

That's the scenario I was testing here, and it works fine...
