Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ergodox EZ speedup #5498

Merged
merged 9 commits into from
Apr 3, 2019
Merged

Ergodox EZ speedup #5498

merged 9 commits into from
Apr 3, 2019

Conversation

alex-ong
Copy link
Contributor

@alex-ong alex-ong commented Mar 27, 2019

Added Eager_PR algorithm that should reduce ergodox debounce latency

Description

Added a new algorithm and integrated ergodox into it.
Similar to #2675 but uses the shiny new Debounce API.
This requires TESTING - i have not got an ergodox EZ.

For whoever guinea-pigs into testing, i would like to know

  1. scan rate with this algorithm (hopefully faster than official one, which has to update 60+ uint8_t's every scan)
  2. scan rate with previous algorithm (updates 60+ uint8_t's every update, so will be a bit costly)
  3. scan rate with debouncing turned off completely (simply use this branch and remove the call to debounce(). You won't get any keyboard input but it should still log the scan_rate.

Types of Changes

  • Core
  • Bugfix
  • New feature
  • Enhancement/optimization
  • Keyboard (addition or update)
  • Keymap/layout/userspace (addition or update)
  • Documentation

Issues Fixed or Closed by This PR

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • I have tested the changes and verified that they work and don't break anything (as well as I can manage).

@alex-ong
Copy link
Contributor Author

@seebs any chance you could test this for me :)

@drashna
Copy link
Member

drashna commented Mar 27, 2019

I'm more than willing to test this. Though, i"m not sure how to find out the scan rate, short of using something web based.

Though, this appears to be identical to #5496, except for that it adds support to the Ergodox keyboard, correct?

@alex-ong
Copy link
Contributor Author

@drashna - it's the same as the other commit + ergodoxez changes. There's a scanrate #ifdef in matrix.c - just enable that and it should pump out the scan rate into qmk-flasher

@drashna
Copy link
Member

drashna commented Mar 27, 2019

Well, I'll have to do that later. This change seems to have soft bricked my EZ Glow. And I'm having issues getting it back into the bootloader, at all.

I have a second ergodox ez, but I'm hesitant to test, unless I can get the other one working again.

On the plus side, testing this on other boards .... seems to work fine. So it's something specific with the EZ here.

Edit: Update, this was a lot of fun. I did get the EZ working again thankfully, but the hub I'm using for it crashed (part of a monitor), and I had to connect directly to the PC and hard reset it to be able to take a new image.

I'll test more in the morning, when I'm no longer stressed out.

@alex-ong
Copy link
Contributor Author

alex-ong commented Mar 27, 2019

Edit: Update, this was a lot of fun. I did get the EZ working again thankfully, but the hub I'm using for it crashed (part of a monitor), and I had to connect directly to the PC and hard reset it to be able to take a new image.

Phew - i saw the post just as i left a real pc™ i was going to mark my condolences and say sorry etc but you seem to have sorted it out :) best of luck on attempt number 2!

@seebs
Copy link
Contributor

seebs commented Mar 27, 2019

Interesting! I was using something that used a hid listener or something to print scan rate info at one point, but I don't recall what it was now. It's been ages.

@drashna
Copy link
Member

drashna commented Mar 27, 2019

Edit: Update, this was a lot of fun. I did get the EZ working again thankfully, but the hub I'm using for it crashed (part of a monitor), and I had to connect directly to the PC and hard reset it to be able to take a new image.

Phew - i saw the post just as i left a real pc™ i was going to mark my condolences and say sorry etc but you seem to have sorted it out :) best of luck on attempt number 2!

Yeah. But that's also why I test a lot of the PRs. So that I can sacrifice my boards rather than some poor soul after it's been merged.

That said, I can confirm that this kills my Glow and Shine models, and even on the master branch.
So something is bugged here.

And by "kills", I mean that it causes the controller to crash. And as long as my USB hub doesn't also crash (part of the issue last night), I can easily recover it with the reset button.

I added the define for DEBUG_MATRIX_SCAN_RATE, have console enabled, and HID Listen hooked up. I get the "connected" message, but that's it. Nothing else, and the board locks up.
Even with the normal debug stuff set.... nothing.

Interesting! I was using something that used a hid listener or something to print scan rate info at one point, but I don't recall what it was now. It's been ages.

QMK Toolbox uses HID LIsten actually. It's built into it. And I think that's what @alex-ong meant.
But it's only for Windows and MacOS. Otherwise, you have to manually set it up.

@drashna drashna requested review from drashna and ezuk March 27, 2019 15:59
@alex-ong
Copy link
Contributor Author

alex-ong commented Mar 27, 2019

Sad that the board locks up - Since master soft-bricks your board, is someone able to find a working master branch that I can extend off and/or patch master so i can patch this? 👍

Yes i was referring to QMK Toolbox for HID Listen. I remember when i was still in TMK Land i had to use HID_Listen, but now QMK Toolbox does it for you :)

@drashna
Copy link
Member

drashna commented Mar 28, 2019

To clarify, master with this PR locks it up. Using just master works fine.

I tested that because I have a lot of custom code that I'm running on my main branch, and wanted to make sure it wasn't that code.

@alex-ong
Copy link
Contributor Author

To clarify, master with this PR locks it up. Using just master works fine.

I tested that because I have a lot of custom code that I'm running on my main branch, and wanted to make sure it wasn't that code.

Ahh thanks!

Not sure why my code is locking it up - i dont have the keyboard but i'll try and somehow check the code...

@alex-ong
Copy link
Contributor Author

alex-ong commented Mar 28, 2019

Had a closer squiz compared to qmk's matrix.c. I found that I

  • forgot to call debounce_init() - this would definitely cause some crashing since we needed to allocate an array. 🤦
  • had a bug where left/right hand would be duplicated 🤦

Hopefully it should work now!

Looking forward to seeing results :) @drashna

Sorry again for crashing your board - definitely not a fun experience. I'm glad you recovered it!

@drashna
Copy link
Member

drashna commented Mar 28, 2019

Awesome.

And no worries. That's why I started testing stuff like this out. It may compile, but that doesn't mean it runs correctly!

And it's very hard to permanently brick a board (I would know, I've accidently tried on MANY occasions :D)

@drashna
Copy link
Member

drashna commented Mar 28, 2019

Well, at the very least, I can confirm that it works now. :)
I'm typing on it, right now. So there is that!

I'll try it out for a bit, and see if I notice any issues with the debouncing.

@drashna
Copy link
Member

drashna commented Mar 28, 2019

@seebs this is on an Ergodox EZ Glow, which has the rgb matrix feature. That may actually impact performance.

Yeah, it does. On my ergodox EZ Shine, I'm getting:

  > matrix scan frequency: 399
  > matrix scan frequency: 567
  > matrix scan frequency: 568
  > matrix scan frequency: 569
  > matrix scan frequency: 568
  > matrix scan frequency: 569
  > matrix scan frequency: 568
  > matrix scan frequency: 569
  > matrix scan frequency: 569
  > matrix scan frequency: 569
  > matrix scan frequency: 569

@alex-ong
Copy link
Contributor Author

alex-ong commented Mar 28, 2019

I did like the results I got from "take key hits immediately, don't accept a key release until we've seen it be stable for a bit", since that seemed to work well; in practice, you don't get a lot of false positive hits that way, and you don't get the long extra delay on accepting a keypress.

I definitely encourage you to implement your algo in the debounce library - the more algo's are added the better for users! I don't personally understand why you would need to debounce for several cycles on key-up if the key-down is perfectly accurate - if the switches had some sort of defect that causes >5ms bounce on key-up but not key-down, it makes sense, but my experience with cherries/gateron is they are well within the 5ms spec.

Also not sure how low scan rate affects eager algorithms - I always assume humans take 20ms between keyup/down (this equates to 50hz tapping which is insanely fast), so i would guess that having an eager timeout of 5ms (or whatever the switch spec is) should be sufficient to filter out any noise if there is any. I am assuming that noise only occurs on key events and there arent random keydown events when you don't actually hit the key.

scanrate

yeah 250 is quite slow (i target at least 1000+ on my keyboards), but i guess i2c is an issue on this one.

If you comment out the line with debounce(), it will disable debouncing but also let you see the raw scan rate (and also not let you type anything. but its a good number to start with). Latency is affected by scan rate too; Latency is simply scanrate latency + debounce time + key process time. In this case, debounce time is effectively set to 0, 500hz scanning is 2ms, key processing == 1ms, so total latency is on the order of 3ms for key down, and 3ms-8ms for keyup, assuming the row constraint is adhered to.

The previous algorithm in master was 5 cycles (not ms) debounce @ 200-300hz + scanning time + key processing which would equate to almost 16-25ms delay.

In the end, the debounce algorithm here (eager_pr) is just a compromise that attempts to reduce latency while also keeping scanrate sufficiently high. I'm sure there is an asm version of eager_pr with sick bit-shifts that would make the debouncing 2-5x faster. I definitely encourage people to optimize this!

@drashna
Copy link
Member

drashna commented Mar 28, 2019

Yeah, it does. On my ergodox EZ Shine, I'm getting:

  > matrix scan frequency: 567
  > matrix scan frequency: 568
  > matrix scan frequency: 569
  > matrix scan frequency: 568
  > matrix scan frequency: 569
  > matrix scan frequency: 568
  > matrix scan frequency: 569
  > matrix scan frequency: 569
  > matrix scan frequency: 569
  > matrix scan frequency: 569

vs ergodox ez shine without this PR:

  > matrix scan frequency: 536
  > matrix scan frequency: 537
  > matrix scan frequency: 535
  > matrix scan frequency: 536
  > matrix scan frequency: 537
  > matrix scan frequency: 537
  > matrix scan frequency: 537
  > matrix scan frequency: 537
  > matrix scan frequency: 537
  > matrix scan frequency: 537
  > matrix scan frequency: 537
  > matrix scan frequency: 537

the Glow, without this. (with reactive keypresses enabled on both

  > matrix scan frequency: 259
  > matrix scan frequency: 260
  > matrix scan frequency: 260
  > matrix scan frequency: 260
  > matrix scan frequency: 260
  > matrix scan frequency: 260
  > matrix scan frequency: 260
  > matrix scan frequency: 261
  > matrix scan frequency: 259
  > matrix scan frequency: 261
  > matrix scan frequency: 259
  > matrix scan frequency: 261
  > matrix scan frequency: 259
  > matrix scan frequency: 260
  > matrix scan frequency: 260

vs with this:

  > matrix scan frequency: 316
  > matrix scan frequency: 315
  > matrix scan frequency: 313
  > matrix scan frequency: 315
  > matrix scan frequency: 316
  > matrix scan frequency: 313
  > matrix scan frequency: 316
  > matrix scan frequency: 316
  > matrix scan frequency: 316
  > matrix scan frequency: 316
  > matrix scan frequency: 316
  > matrix scan frequency: 316
  > matrix scan frequency: 316
  > matrix scan frequency: 316
  > matrix scan frequency: 316
  > matrix scan frequency: 316

So while there isn't a huge different, it may be noticable.

@alex-ong Yeah, the IO Expander and related code are known for making the Ergodox EZ a very slow board.

@drashna
Copy link
Member

drashna commented Mar 28, 2019

Shine without debouncing:

  > matrix scan frequency: 643
  > matrix scan frequency: 643
  > matrix scan frequency: 642
  > matrix scan frequency: 643
  > matrix scan frequency: 642
  > matrix scan frequency: 643
  > matrix scan frequency: 643
  > matrix scan frequency: 643
  > matrix scan frequency: 643
  > matrix scan frequency: 643

Glow without debouncing:

  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337
  > matrix scan frequency: 337

@alex-ong
Copy link
Contributor Author

Thanks!

Let's break that down a bit:
Keeb : nodebounce / pr /pk
Shine: 643 / 570 / 536
Glow: 337/316 /260

Then convert from hz to Ms/cycle
Shine: 1.55 / 1.75 / 1.86
Glow: 2.96 /3.16 /3.84

Looks like my eager_pr takes 0.2 Ms per scan, and the old alg took .3ms or .9ms per scan.

If the .3ms number is right, then my alg could be optimized to .1ms (14 updates vs 80 updates). I did see some funky bitshifting. If the .9ms number is right then .2ms for my alg sounds right 14/80*.9 is about .2ms.

At least we know the ceiling from debounce is optimizing .2ms/scan away since the rest of the time is grabbing the input.

My calls to eager_pr force a full update every scan - this can be optimized to only do it lazily, changing that .2ms to close to 0. I'll try and get a test of that too.

@seebs
Copy link
Contributor

seebs commented Mar 28, 2019

So, the reason my debounce works is precisely that there's no noise without a key hit. There can be a ton of noise, but you still know that the first sign of noise is conclusive evidence that the key's been hit. So you can consider that first signal to be proof of a hit. But then you might have a lot of samples that show key-up, some of them before the key is actually fully down, some of them while it's releasing, so you have to be conservative in recognizing key-up.

So when I say "several cycles", I mean "you want several consecutive scans that show the key up". The worst case, I think, is a really short actual key-down state, with a bunch of noise on both sides, and you get unlucky and read several key-up scans from that noise before the noise is totally over. Basically, if you have an actual key-down state that's less than a scan interval, you can have a situation where you're effectively seeing both the key-down and key-up noise as one long run of noise.

I have not actually looked closely at the debounce code so I don't know how similar it is or isn't to this.

@alex-ong
Copy link
Contributor Author

alex-ong commented Mar 29, 2019

So, the reason my debounce works is precisely that there's no noise without a key hit. There can be a ton of noise, but you still know that the first sign of noise is conclusive evidence that the key's been hit. So you can consider that first signal to be proof of a hit. But then you might have a lot of samples that show key-up, some of them before the key is actually fully down, some of them while it's releasing, so you have to be conservative in recognizing key-up.

Right. Our debounce algorithms have similar assumptions, though mine assumes the input is even more perfect than yours.

Here's a link to a few debounce algorithms in QMK right now:
https://imgur.com/pyqd7Uz

  1. is the default (symmetric, 5ms stable time), which 99% of keyboards have been using forever. It looks for a stable period of 5ms before reporting key up or key down, hence adding >5ms of delay on these events . If your debounce time is 5ms, it will actually take 10ms before it is reported, since it takes 5ms for your switch to reach stable state, and another 5ms for the algo to determine it stable enough to report. It effectively shields against EMI, so you won't get presses when you arent pressing, and you definitely won't get releases while you have the key held down.

  2. is your algorithm. It assumes that EMI is not a thing - the signal will always be no key pressed if you just leave your keyboard sitting there. It has protection after the key is pressed down - if it detects a key_up while the key is down, it won't report it until a period of stability. This means for a 5ms stable period, you get a 10ms latency. Since we are using ergodox_ez and scancycles rather than milliseconds, and the scanrate is roughly 500hz, we actually end up with 20ms latency on key-release.

  3. is eager on key-press and key-release. It assumes that EMI will never occur, and all signals are directly from keypresses, and that noise will only occur on a key press or release. It has no latency, but also no EMI protection.

Note i have ignored other factors such as whether debounce is per-row, per-key or global to the whole keyboard. Let's dig into that:

(1) The default QMK method is global. This means it waits for the entire keyboard to have a period of stability before marking keys as "accurate". This means if you strum "asdf" 4ms apart from each other, no keys get reported until the "F" is stable. That means the "A" is delayed 16+10ms, the F is only delayed 10ms.

eager_pk in the debounce library, as well as the current master branch debounce algo for ergodox_ez, are per key. They have a counter for every key. This means keys dont affect each other (good), but you have to update 80 counters every update (expensive)

(2 and 3) (your algorithm and my algorithm) is per electronic row, which is actually physically columns on this particular keyboard. It's a hybrid of 1 and 2 - "3edc" affect each other, but "asdf" do not. Since touch typists use one finger per column, it saves memory and computational time.

In an ideal world, debounce would be per key. However, the more counters you have, the longer debouncing takes. Since the ergodox_ez is already time constrained from just polling the inputs, having a per-key debounce reduces performance even further, reducing the polling rate even further.

@drashna
Copy link
Member

drashna commented Apr 3, 2019

Is there anything else that needs to be done here?

I would like to see this get merged sooner rather than later, as there is an improvement here.
However, if you think you can improve it further (as mentioned above), waiting is fine.

@seebs
Copy link
Contributor

seebs commented Apr 3, 2019

Ahh, cool, you had the same observation I did about rows/columns. :) (This would be broken by the otherwise clever idea that it would have made more sense to have 6 rows of 7 keys, which would take 1/7 less time to scan.) So I think at this point, I'd say go ahead and use this. I think my conclusion was that I needed at least 4 scans to be reasonably confident about key-up events, which i think came out to 8-10ms. But just taking the first key-down as "real" seemed to improve responsiveness quite a bit.

@drashna
Copy link
Member

drashna commented Apr 3, 2019

Great minds think alike, apparently. :)

@drashna drashna merged commit 17e7762 into qmk:master Apr 3, 2019
@alex-ong alex-ong deleted the ergodox_ez_speedup branch April 3, 2019 23:19
@drashna
Copy link
Member

drashna commented Apr 5, 2019

@zvecr zvecr mentioned this pull request Apr 5, 2019
13 tasks
legooolas added a commit to legooolas/qmk_firmware that referenced this pull request Apr 5, 2019
* upstream/master: (445 commits)
  Update ps2avrgb template to use standard matrix/i2c code (qmk#4957)
  [Keyboard] Simplified handwired/xealous since most of the features are in core now. (qmk#5556)
  [Keyboard] Move scrabblepad into donutcables directory (qmk#5553)
  [Keymap] Additional RGB options set (qmk#5551)
  [Keyboard] Add Budget96 by Donut Cables (qmk#5550)
  [Keyboard] Added configurable defaults for RGB backlight parameters. (qmk#5549)
  Added Hacked Motospeed keyboard (qmk#5534)
  [Keymap] New HS60/v2 HHKB keymap for goatmaster (qmk#5545)
  [Keyboard] add treeadstone48 (qmk#5405)
  [Keyboard] Doro67 Multi PCB port (qmk#5539)
  [Keyboard] V60 Type R - Turn on leds for Configurator + Refactor (qmk#5546)
  RGB Matrix support for Massdrop CTRL/ALT (qmk#5328)
  Added encoder support to split common code (qmk#5477)
  Eager Per Row Debouncing added (added to Ergodox) (qmk#5498)
  Added configurable defaults for RGB backlight indicators.
  [Keyboard] Small Refactor of Duck boards (qmk#5521)
  [Keyboard] Quantrik Kyuu 65% Board (qmk#5541)
  Call default zeal60 rgb file
  remove call to custom rgb file
  Removed duplicated zeal60 files
  ...
@drashna
Copy link
Member

drashna commented Apr 8, 2019

@alex-ong interesting issue here ....

If you set the debounce type to sym_g on the EZ, it doesn't register any keypresses. But eager_pk and eager_pr work just fine.

It's not a huge thing, but it's something that should probably be fixed, sooner or later.

StevenWolfe pushed a commit to StevenWolfe/qmk_firmware that referenced this pull request Apr 8, 2019
* Implemented Eager Per Row debouncing algorithm.

Good for when fingers can only press one row at a time (e.g. when keyboard is wired so that "rows" are vertical)

* Added documentation for eager_pr

* Ported ergodox_ez to eager_pr debouncing.

* Removed check for changes in matrix_scan.

* Added further clarification in docs.

* Accidental merge with ergodox_ez

* Small cleanup in eager_pr

* Forgot to debounce_init - this would probably cause seg-faults.
@alex-ong
Copy link
Contributor Author

alex-ong commented Apr 9, 2019

Yup, its because i (lazily) just called debounce(changed=true) instead of calculating it. sym_g relies on debounce(changed) to actually be accurate.
I'll update it now...

drashna pushed a commit to drashna/qmk_firmware that referenced this pull request Apr 9, 2019
* Implemented Eager Per Row debouncing algorithm.

Good for when fingers can only press one row at a time (e.g. when keyboard is wired so that "rows" are vertical)

* Added documentation for eager_pr

* Ported ergodox_ez to eager_pr debouncing.

* Removed check for changes in matrix_scan.

* Added further clarification in docs.

* Accidental merge with ergodox_ez

* Small cleanup in eager_pr

* Forgot to debounce_init - this would probably cause seg-faults.
@drashna drashna mentioned this pull request Apr 15, 2019
danielo515 pushed a commit to danielo515/qmk_firmware that referenced this pull request May 15, 2019
* Implemented Eager Per Row debouncing algorithm.

Good for when fingers can only press one row at a time (e.g. when keyboard is wired so that "rows" are vertical)

* Added documentation for eager_pr

* Ported ergodox_ez to eager_pr debouncing.

* Removed check for changes in matrix_scan.

* Added further clarification in docs.

* Accidental merge with ergodox_ez

* Small cleanup in eager_pr

* Forgot to debounce_init - this would probably cause seg-faults.
shimesaba-type0 pushed a commit to shimesaba-type0/qmk_firmware that referenced this pull request Jun 22, 2019
* Implemented Eager Per Row debouncing algorithm.

Good for when fingers can only press one row at a time (e.g. when keyboard is wired so that "rows" are vertical)

* Added documentation for eager_pr

* Ported ergodox_ez to eager_pr debouncing.

* Removed check for changes in matrix_scan.

* Added further clarification in docs.

* Accidental merge with ergodox_ez

* Small cleanup in eager_pr

* Forgot to debounce_init - this would probably cause seg-faults.
Timbus pushed a commit to Timbus/qmk_firmware that referenced this pull request Jun 23, 2019
* Implemented Eager Per Row debouncing algorithm.

Good for when fingers can only press one row at a time (e.g. when keyboard is wired so that "rows" are vertical)

* Added documentation for eager_pr

* Ported ergodox_ez to eager_pr debouncing.

* Removed check for changes in matrix_scan.

* Added further clarification in docs.

* Accidental merge with ergodox_ez

* Small cleanup in eager_pr

* Forgot to debounce_init - this would probably cause seg-faults.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants