From 6841246dccac1839687197facfe397636689fabb Mon Sep 17 00:00:00 2001
From: Spence Konde <spencekonde@gmail.com>
Date: Sun, 24 Oct 2021 14:39:51 -0400
Subject: [PATCH] Update Docs

---
 README.md                             |  29 ++-
 megaavr/extras/Comparison.md          |  83 ++++-----
 megaavr/extras/DA-series-notes.md     |  18 --
 megaavr/extras/InteruptVectorNames.md | 175 -----------------
 megaavr/extras/Libraries.md           |   2 +-
 megaavr/extras/NotesOnPeripherals.md  |  95 ----------
 megaavr/extras/PWMandTimers.md        | 226 +++++++++++++---------
 megaavr/extras/Performance.md         |  24 ---
 megaavr/extras/PinInterrupts.md       | 113 -----------
 megaavr/extras/Ref_Interrupts.md      | 259 ++++++++++++++++++++++++++
 megaavr/extras/Ref_Timers.md          | 209 +++++++++++++++++++++
 11 files changed, 669 insertions(+), 564 deletions(-)
 delete mode 100644 megaavr/extras/DA-series-notes.md
 delete mode 100644 megaavr/extras/InteruptVectorNames.md
 delete mode 100644 megaavr/extras/NotesOnPeripherals.md
 delete mode 100644 megaavr/extras/Performance.md
 delete mode 100644 megaavr/extras/PinInterrupts.md
 create mode 100644 megaavr/extras/Ref_Interrupts.md
 create mode 100644 megaavr/extras/Ref_Timers.md

diff --git a/README.md b/README.md
index 3af037e1..7f7cda15 100644
--- a/README.md
+++ b/README.md
@@ -197,7 +197,7 @@ All pins can be used with attachInterrupt() and detachInterrupt(), on `RISING`,
 For full information and example, but postentially dated information (attachInterrupt() should suck less now): [Pin Interrupts](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/PinInterrupts.md)
 
 #### attachInterrupt() rework
-There are three options, controlled by the Tools -> attachInterrupt Mode submenu: the new, enabled on all pins always (like the old one), manual (ports must be enabled before attaching to them), and old version (if the new implementation turns out to break something). Manual mode is required for the main benefit. You must call attacbPortAEnable() (replace A with the letter of the port) before attachbing the interrupt. The main point of this is that (in addition to saving an amount of flash that doesn't much matter on the Dx-series) attachInterrupt() on one pin (called by a library, say) will not glom onto every single port's pin interrupt vectors so you can't manually define any. The interrupts are still just as slow (it's inherrent to calling a function by pointer from an ISR - and low-numbered pins are faster to start executing than high numbered ones. . The method to enable may change, I am not happy with it, but "can't use any pin interrupts if using a library that does attachInterrupt" is significantly worse.
+There are three options, controlled by the Tools -> attachInterrupt Mode submenu: the new implementation, enabled on all pins always (like the old one), manual (ports must be enabled before attaching to them), and the old version (if the new implementation turns out to break something). Manual mode is required for the main benefit. In manual mode, you must call attachPortAEnable() (replace A with the letter of the port) before attaching the interrupt. The main point of this is that (in addition to saving an amount of flash that doesn't much matter on the Dx-series) attachInterrupt() on one pin (called by a library, say) will not glom onto every single port's pin interrupt vectors so you can't manually define any. The interrupts are still just as slow (it's inherent to calling a function by pointer from an ISR), and low-numbered pins are faster to start executing than high-numbered ones. The method to enable may change - I had hoped that I could detect which pins were used, but I couldn't get the function that chooses which ports to enable to not count as "referencing" those ports, and hence pull in the ISR. I am not happy with it, but "can't use any pin interrupts except through attachInterrupt() if using a library that uses attachInterrupt()" is significantly worse.
 
 ### On-chip Opamps
 The DB-series parts have 2 or 3 on-chip opamps, with programmable resistor ladder, configurable for a variety of applications. They can be used as a voltage follower (you can follow the DAC and then use the output to drive VDDIO2, though the current is still only tens of mA, that's usually enough - driving heavy loads at the lower voltage is an unusual use case (I imagine powering low voltage sensors is not particularly rare - but those sort of modern sensors are also usually very low current).
@@ -288,7 +288,7 @@ See the [Improved Digital I/O Reference](https://github.com/SpenceKonde/DxCore/b
 * Tools -> MVIO - MVIO option is back in 1.3.7. It is not a risk of hardware damage if it is turned off, and it saves 0.5 uA power consumption to disable it. Disabling it when you shouldn't doesn't keep the pins from being readable and writable, nor does it short the VDDIO pin to VDD.... it just no longer watches the voltage to ensure sane behavior if insufficient voltage is applied on VDDIO2. This is in effect an extra layer of monitoring like the BOD is,
 * Tools -> Write flash from App - Either disabled (Flash.h library does not work), "Everywhere" (allow writes everywhere in the flash after first page), or allow writes only above a certain address. On Optiboot definirtions, it's always enabled for writes anywhere.
 * Tools -> printf() imoplementation - The default option can be swapped for a lighter weight version that omits most functionality to save a tiny amount of flash, or for a full implementation (which allows printing floats with it) at the cost of about 1k extra. Note that if non-default options are selected, the implementation is always linked in, and will take space even if not called. Normal Arduino boards are set to default. They also don't have `Serial.printf()`
-* Tools -> attachInterrupt Mode - Choose from 3 options - the new, enabled on all pins always (like the old one), Manual (as above, but you must call attacbPortAEnable() (replace A with the letter of the port) before attachbing the interrupt. The main point of this is that (in addition to saving an amount of flash that doesn't much matter on the Dx-series) attachInterrupt() on one pin (called by a library, say) will not glom onto every single port's pin interrupt vectors so you can't manually define any. The interrupts are still just as slow (it's inherrent to calling a function by pointer from an ISR). The method to enable may change, I am not happy with it, but "can't use any pin interrupts if using a library that does attachInterrupt" is significantly worse.
+* Tools -> attachInterrupt Mode - Choose from 3 options - the new implementation, enabled on all pins always (like the old one), Manual, or the old implementation in case of regressions in the new implementation. When in Manual mode, you must call attachPortAEnable() (replace A with the letter of the port) before attaching the interrupt. The main point of this is that (in addition to saving an amount of flash that doesn't much matter on the Dx-series) attachInterrupt() on one pin (called by a library, say) will not glom onto every single port's pin interrupt vectors so you can't manually define any. The interrupts are still just as slow (it's inherent to calling a function by pointer from an ISR), and low-numbered pins are faster to start executing than high-numbered ones. The method to enable may change - I had hoped that I could detect which pins were used, but I couldn't get the function that chooses which ports to enable to not count as "referencing" those ports, and hence pull in the ISR. I am not happy with it, but "can't use any pin interrupts except through attachInterrupt() if using a library that uses attachInterrupt()" is significantly worse.
 
 
 
@@ -299,25 +299,36 @@ See the [Improved Digital I/O Reference](https://github.com/SpenceKonde/DxCore/b
 The API reference for the analog-related functionality that is included in this core beyond the standard Arduino API.
 #### [Digital I/O and enhanced options](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Digital.md)
 The API reference for the digital I/O-related functionality that is included in this core beyond the standard Arduino API, as well as a few digital I/O related features that exist in the hardware which we provide no wrapper around.
+#### [Interrupts](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Interrupts.md)
+Includes a list of all interrupt vectors that can be used, how the flags are cleared (not a substitute for the datasheet - just a very quick reminder), which parts each vector exists on, and what parts of the core, if any, make use of a vector. It also has general guidance and warnings relating to interrupts and their handling, including estimates of real-world interrupt response times.
+#### [Timers and PWM](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Timers.md)
+We configure the timers in specific ways upon startup, which determines the frequency of PWM output, and some parameters of millis() timekeeping.
 #### [TCD0](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_TCD.md)
 The type D timer is a powerful timer, but has quirks which one must be aware of if using it. This describes what you can do without having to take full control of the timer.
-#### [Mapped flash and PROGMEM in DxCore](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_PROGMEM.md) - There are two ways to access constants stored in flash on DxCore. Which ones can read data stored where can be confusing.
-#### [Optiboot Bootloader](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Optiboot.md) - an Optiboot-derived bootloader is provided and may be optionally used. This covers relevant considerations for deciding whether to use it as well.
-#### [SerialUPDI](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_SerialUPDI.md) - the recommended tool for UPDI programming.
+#### [Mapped flash and PROGMEM in DxCore](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_PROGMEM.md)
+There are two ways to access constants stored in flash on DxCore. Which ones can read data stored where can be confusing; this document should make this clear.
+#### [Optiboot Bootloader](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Optiboot.md)
+An Optiboot-derived bootloader is provided and may be optionally used. How that impacts operations is described here. This covers relevant considerations for deciding whether to use it as well.
+#### [SerialUPDI](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_SerialUPDI.md)
+SerialUPDI is our recommended tool for UPDI programming.
 #### [Clock Information](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Clocks.md)
 Supported clock sources and considerations for the use thereof.
 #### [Callbacks/weakly defined functions](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_CallBacka.md)
 These are provided by the core and can be overridden with code to run in the event of certain conditions, or at certain times in the startup process.
-#### [Identification of core features programmatically](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Defines.md) - used by megaTinyCore as well.
+#### [Identification of core features programmatically](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Defines.md)
+These are used by megaTinyCore and other cores I maintain as well.
 #### [Reset control and the WDT](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Reset.md)
 The sources of reset, and how to handle reset cause flags to ensure clean resets and proper functioning in adcverse events. **Must read for production systems**
 #### [Considerations for robust applications](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Robust.md)
 Covers a variety of design considerations for making something that will opperate reliably in the field, some specific to DxCore, others general. Lately I have been seeing a lot of projects get too far along without considering these. **Must read for production systems**
 ### Older guides inherited from megaTinyCore.
 These guides may not account for all of differences between DxCore and megaTinyCore, and may not reflect recent changes.
-#### [Power Saving techniques and Sleep](megaavr/extras/PowerSave.md) - There are plans for a better wrapper around this sort of functionality, which keep getting deferred as more pressing issues come up.
-#### [Direct Port Manipulation](megaavr/extras/DirectPortManipulation.md) - It's similar to classic AVRs, but a bit more complicated. See also digital I/O
-#### [Pin Interrupts](megaavr/extras/PinInterrupts.md) - Manually defining pin interrupts, becauuse attachInterrupt results in interrupts which respond slowly.
+#### [Power Saving techniques and Sleep](megaavr/extras/PowerSave.md)
+There are plans for a better wrapper around this sort of functionality, which keep getting deferred as more pressing issues come up.
+#### [Direct Port Manipulation](megaavr/extras/DirectPortManipulation.md)
+It's similar to classic AVRs, but a bit more complicated. See also the Digital I/O reference.
+#### [Pin Interrupts](megaavr/extras/PinInterrupts.md)
+Manually defining pin interrupts, because attachInterrupt results in interrupts which respond slowly. See also Interrupt Reference
 
 ## Support Continued Development
 I sell breakout boards with regulator, UPDI header, and Serial header and other basic supporting parts in my Tindie shop, as well as the bare boards. Buying from my store helps support further development on the core, and is a great way to get started using these exciting new parts with Arduino. Note that we do not currently sell a 28-pin version - this did not seem like a compelling part with the availability of the 32-pin version; the main appeal of the 28-pin part is that it is available in a through-hole version. As we would not be able to make the 28-pin version significantly smaller, there did not seem to be a compelling reason to create a 28-pin version. We may revisit this decision in the future, including potentially a 28-pin bare board for the through-hole version, suitable for assembly by those not experienced with drag soldering.
diff --git a/megaavr/extras/Comparison.md b/megaavr/extras/Comparison.md
index 08fb78a5..d68b280d 100644
--- a/megaavr/extras/Comparison.md
+++ b/megaavr/extras/Comparison.md
@@ -1,69 +1,62 @@
-# DA vs tinyAVR 1-series and megaAVR 0-series peripherals
-While reading datasheets, I looked for differences as those are places where the core would need to be adapted. These are some notes, presented without further editing.
+# Dx-series peripherals vs other modern AVRs
+Based on notes from initial reading of datasheets for DA, DB-series parts. Much was unchanged, and essentially all of the changes were welcome.
 
-## VREF
-Same deal, except the reference voltages are better now. With 12-bit resolution, and 4.096V, 1 LSB is 1mV! It doesn't get any easier than that!
-Also, you can now (finally!) use Vcc as the DAC reference voltage.
+## NVMCTRL
+Dramatic changes. 0/1-series parts erase and write with page granularity. Dx-series erases with page granularity but writes with word granularity. The flash protection mechanism now only requires the instruction that results in the actual write or erase (an SPM or ST instruction with NVMCTRL.CTRLA set appropriately) to execute from the bootloader section.
+
+## CLKCTRL
+Major differences from 0/1-series, and further changes for DB-series.
+* For internal oscillator, you just set the frequency: Options are 1/2/3/4 MHz, then increments of 4. Datasheet stops at 24. Part still works if you keep incrementing it up to 28, 32. Then it repeats 20-32 again.
+* There's also a PLL, rated for operation with input frequency between 16 and 24 MHz (very conservative, it turns out) and an output frequency 2 or 3 times that (or 4x - known from the earliest I/O headers, but not otherwise documented), but all it can do is clock the type D timer. It also
+* You can enable "AutoTuning" from a watch crystal.
+* The DB and DD-series also support using an external crystal and have clock failure detection.
 
 ## TCA
-EVACTB added.
-Otherwise, appears identical, complete with the wonky CTRLExxx CMDEN bits in split mode, with values of NONE (0b00) and BOTH (0b11), but the other two marked reserved. Something tells me they screwed up and this didn't work, and they didn't bother to fix it in the DA series. Were I microchip, I'd have just disabled those bits and made them act like 1's - unless they thought maybe they'd fix it in some version of the chip?
-The **REAL** change with TCA is that you get 2 of them on 48 and 64 pin parts! Which meaaans: TWO INDEPENDENT PRESCALERS FOR PWM!
+Second event channel, EVACTB added. Otherwise, appears identical, complete with the wonky CTRLExxx CMDEN bits in split mode, with values of NONE (0b00) and BOTH (0b11), but the other two marked reserved.
+
+The tinyAVR 2-series also got the new version.
 
 ## TCB
-The only timer with Real, Meaningful improvements. These have gone from great utility timers to amazing ones, with a few simple additions:
-Added CLKSEL TCA1 option - all TCBs can pick their clock source! Now you can take over a TCA and still use TCBs (and the other TCA) without messing up the prescaler for your TCBs!
-(the big one) Added EVENT clock option - now TCB as well as TCA can be clocked from events!
-They got an overflow interrupt too! Though in most modes it is redundant...
-Oh, and did I mention the CASCADE bit? You can clock one timer off another timer's overflow, and set cascade on the one that counts the two MSBs to make sure they are in sync (ie, correct for propagation delays) and get a 32-bit input capture. Yes, that's right, you can time something lasting nearly 3 minutes TO A TWENTYFOURTH OF A MICROSECOND!!!
+The only timer with real and meaningful improvements. These have gone from great utility timers to amazing ones, with a few additions:
+* Added EVENT clock option - now TCB as well as TCA can be clocked from events.
+* They got an overflow interrupt too! Though in most modes it is redundant and/or only fires in very unusual conditions.
+* Cascade functionality - when one timer is clocked from the other's overflow event, the cascade bit can be set on the one that counts the two MSBs to make sure they are in sync (i.e., correct for propagation delays), which permits 32-bit input capture. You can time something lasting nearly 3 minutes (at 24 MHz system clock) down to single processor clock cycles.
+* TCA1 clock option on parts that have it (obviously)
+
+The tinyAVR 2-series also got the new version (without the TCA1 clock option, of course).
 
 ## TCD0
-Only difference appears to be the addition of the PLL as a clock source! Probably also has different errata.
+Only difference appears to be the addition of the PLL as a clock source. It looks like in general everyone ignored the TCD until the Dx-series came out - there was hardly any errata listed for it until the year following the Dx-series release, but it has now grown significantly, with the same bugs appearing both here and for Dx-series.
+
+## EVSYS
+Largely unchanged aside from basics needed to adjust for the larger number of channels.
+
+## VREF
+Reference voltages are 1.024, 2.048, and 4.096V, as well as 2.5V, Vdd, and an external reference. Works the same for AC, DAC, and ADC. The first three are clearly meant to make life easier for people using the ADC - with 12-bit resolution and a 4.096V reference, 1 LSB is 1mV (4.096V / 4096 counts). It doesn't get any easier than that!
+This means Vdd can be used for DAC reference voltage.
 
 ## DAC
-Only significant difference is that it's 10 bit... and the DATA register is left-adjusted, so if you just want 8 bits, you can just write the one register. Nifty.
-And that VREF lets you use Vcc.
+Only significant difference is that it's 10-bit... and the DATA register is left-adjusted, so if you just want 8 bits, you can just write the one register. The DAC reference can also now be set to Vdd, so you can get rail-to-rail output.
 
 ## ADC
-RESSEL now chooses between 10 and 12 bits
-differential ADC is now supported. I know this is very exciting to everyone in Arduino-land, because I know you all do differential ADC measurements on classic AVRs all the time right? No? Uh, me neither... I don't think I've ever taken a differential ADC reading in my life.
+RESSEL now chooses between 10 and 12 bits. Differential ADC is supported (but the measured voltage still must not exceed VRef for correct results - no high-side current sensing using the smallest reference voltage; that will have to wait until the 2-series/EA-series parts). There is a slightly higher maximum accumulation option and a much higher maximum sample duration. Specs permit a slightly faster ADC clock.
 
-Oh, and you can also measure the AC DACREFs, if you want to measure a number you control or something...? Probably mostly a clever way to either check accuracy or make math easy for Vcc measurement?
+Note that the 2-series tinyAVR has a completely reworked ADC, which will also be featured on the EA-series.
 
 ## AC
-Looks pretty much the same to me....
+General layout of megaAVR 0-series, with ACn.DACREF, instead of additional DACn peripherals with no output buffer and DAC0 being shared with AC0's DACREF, and the windowed mode configuration like the "golden" tinyAVR 1-series.
 
 ## TWI
-Same one as the megaAVR 0-series got, with the dual mode thing...
+No changes from megaAVR 0-series.
 
 ## SPI
-Same as before. On early 128k chips, SPI0 must have SSD=1 to run in master mode if set to the default pin mapping. We always set that because we NEVER supported slave
+No changes.
 
 ## USART
-Same as before. Still has the same errata with open drain mode... they might as well just add it to the datasheet at this point.
+Largely unchanged, except that one of the RS-485 modes, which was poorly described in the 1-series datasheets, appears to be gone entirely.
 
 ## CCL
-Errta doesn't mention the D-latch? Does that mean we have a working D-latch finally?
-Other than that, minor changes to the INSEL bits to make sense with all the extra peripherals the DA series has, but otherwise, it's pretty much the same as the CCL that the megaAVR 0-series got.
-
-## EVSYS
-Basically the event system from the megaAVR 0-series.
-Errata: ~Early~ AVR128DA64 parts have no connection to event system for the PE, PB pins that aren't present on the 48-pin parts. [8 months later, still no silicon revision to fix this]
-
-## NVMCTRL
-TOTALLY DIFFERENT!
-Writing via UPDI means manipulating NVMCTRL to do your bidding. This was quite the challenge. The writing is lovely though,
-
-Errata: Early 128k parts incorrectly apply the bootloader protections to all 32k blocks of flash, so you can't write using the memory mapping... Read still works fine assuming the bit to make the bootloader section unreadable from app is not set.
-
-
-## CLKCTRL
-Lot of differences!
-For one thing, you just set the frequency: Options are 1/2/3/4 MHz, then increments of 4. Datasheet stops at 24. Part still works if you keep incrementing it up to 28, 32. Then it repeats 20-32 again.
-
-There's also a PLL. Only works at >=16MHz internal oscillator speed, max frequency 48, multiply by 2 or 3 (basically, 16 and 24 get 48, 20 gets 40) - too bad all it can do is clock the single Type D timer!
-
-And... there's AUTOTUNE! Use a crystal... but a watch crystal! It's used to adjust the main oscillator frequency. It ain't an MHz crystal at the system clock frequency, but it's better than nothing...
+D-latch works.
 
 ## RTC
-Same as megaAVR 0-series
+Same as megaAVR 0-series. Free of the bugs that afflicted the tinyAVR 0/1-series.
diff --git a/megaavr/extras/DA-series-notes.md b/megaavr/extras/DA-series-notes.md
deleted file mode 100644
index 76e3606b..00000000
--- a/megaavr/extras/DA-series-notes.md
+++ /dev/null
@@ -1,18 +0,0 @@
-# DA-series notes
-
-Be sure to read the AVR128DA errata, there are some nasty silicon bugs in the first release of these (which is all that's currently available):
-* TCA1 PWM output can only be mappend to ports B and C - E and G are unavailable (this core uses port B on all parts)
-* The memory mapping applies the bootloader section protections to the memory mapped flash, regardless of which page (ie, if bootloader was set to be the first 512b of flash, the first 512b of EVERY page would be treated as bootloader memory when accessing via memory map) - not an issue when using with Arduino typically, though.
-* TWI doesn't work if the pins aren't set low - Wire.h will be adapted to insure that Wire.Begin() configures this correctly.
-* SPI with default pin mapping MUST disable SS Detect in order to operate in master. Currently the SPI.h library does this anyway and does not support slave mode anyway.
-* PB6, PB7, PE4-PE7 are not connected to the event system, both output and input. Hence, alternate locations for EVOUTB and EVOUTE do not work, nor will using those pins as event generators.
-* Digital input disabled when pin used for analog input. This will be compensated for by the core (unless this is more serious than their app notes indicate) and should not be seen if using the digital/analogRead functions
-* As with... basically every modern AVR, the USART Open Drain mode for Serial requires the TX pin to NOT be set as an output, otherwise it can drive the pin high despite being in open drain mode... You'd think after 4 years without fixing it, they'd have just changed the datasheet to say that the bit in PORTx.DIR had to be 0 to ensure that it did not drive the pin high...
-
-Several of those give the impression that most of the development was done with the 48-pin version (Port G, and the extra pins on Ports B and E are only on the 64-pin version)
-
-## Discoveries
-* The fuse listing in datasheet is apparently inaccurate or incomplete, as the parts come with an impossible fuse combination set. So far, no problems resulting from this have been encountered. More to come as this is further investigated.
-* UPDI: On these parts, the 24-bit addressing mode must always be used, even for addressed reachable by the first 16 bits. If the ST_ptr instruction is used, the high byte of the pointer will be used for the high byte of a 16-bit STS instruction. jtag2updi does this correctly.
-* You can't view CLKCTRL.OSCHFTUNE register while AutoTune is engaged.
-* The tuning range is much narrower. Only 64 values of the OSCHFTUNE are valid and over the whole range the frequency varies within 10% of nominal. Not like classic AVRs where you could get the 8MHz internal going at 12.8, or even 16...
diff --git a/megaavr/extras/InteruptVectorNames.md b/megaavr/extras/InteruptVectorNames.md
deleted file mode 100644
index 7a689608..00000000
--- a/megaavr/extras/InteruptVectorNames.md
+++ /dev/null
@@ -1,175 +0,0 @@
-# Interrupts vectors and DxCore
-Use of advanced functionality of these frequenctly requires use of interrupts.
-
-## What is an interrupt
-As the name implies, an interrupt is something that can cause the currently running code to stop in it's tracks. The location of the instruction it was about to execcute is pushed onto the stack, and it then jumps to a specific "interrupt vector" near the start of the flash. This in turn is a jump (or rjump for parts with 8k or less of flash) the Interruipt Service Routine (ISR). This runs, and then returns to the code that was interrupted through a RETI instruction. Almost every peripheral can generate at least one interrupt, and most can generate several. See the datasheet for more information on what conditions they can be generated.
-
-## Creating an ISR
-There are two ways that you may end up writing an ISR, but the same considerations described at the bottom of this page apply to both of them. The first most Arduino users will see is an `attachInterrupt()` function or method - these take a function name as an argument. Somewhere in the core or library is the ISR itself, which checks if you've attached one, and calls it if so. This is simpler - where it's an option - though naturally the performance suffers as there's another layer of calls and returns, and a larger minimum number of registers that the ISR will have to save and restore (see the notes below). The other way is directly - you declare a function, but instead of the name, you use the ISR() macro with the vector name as it's argument:
-
-```
-ISR(CCL_CCL_vect) {
-  //try to keep this part fast.
-}
-```
-
-## List of interrupt vector names
-
-If there is a list of the names defined for the interrupt vectors is present somewhere in the datasheet, I was never able to find it. These are the possible names for interrupt vectors on the parts supported by megaTinyCore. Not all parts will have all interrupts listed below (interrupts associated with hardware not present on a chip won't exist there). An ISR is created with the ISR() macro.
-
-**WARNING** If you misspell the name of a vector, you will get a compiler warning BUT NOT AN ERROR! Hence, you can upload the bad code... in this case the chip will freeze the instant the ISR you thought you assigned is called, as it jumps to BAD_ISR, which in turn jumps to the reset vector... but since the interrupt flag never gets cleared, as soon as interrupts are enabled, it will do it all over again, so the chip will be hung. Encountering this (and the annoying lack of a nice list anywhere outside of the io.h) was the impetus for creating this list.
-
-
-| Vector Name         | DA | DB | DD | Cleared By       | Notes                                               |
-|---------------------|----|----|----|------------------|-----------------------------------------------------|
-| `AC0_AC_vect`       | XX | XX | XX | Manually         |                                                     |
-| `AC1_AC_vect`       | XX | XX |    | Manually         |                                                     |
-| `AC2_AC_vect`       | XX | XX |    | Manually         |                                                     |
-| `ADC0_RESRDY_vect`  | XX | XX | XX | Read result reg  |                                                     |
-| `ADC0_WCMP_vect`    | XX | XX | XX | Manually         |                                                     |
-| `BOD_VLM_vect`      | XX | XX | XX | Manually(?)      |                                                     |
-| `CCL_CCL_vect`      | XX | XX | XX | Manually         | Check flags to see which triggered, like with PORT  |
-| `CLKCTRL_CFD_vect`  |    | XX | XX | Manually         | Called when ext. clock fails, used by core for blink|
-| `MVIO_MVIO_vect`    |    | XX | XX | Manually         | Called when MVIO enables or disables (due to vDDIO2)|
-| `NMI_vect`          | XX | XX | XX | Reset            |                                                     |
-| `NVMCTRL_EE_vect`   | XX | XX | XX | Write(?)         | Unclear if can clear, or is like DRE on USARTs      |
-| `PORTA_PORT_vect`   | XX | XX | XX | Manually         |                                                     |
-| `PORTB_PORT_vect`   |  X |  X |    | Manually         |                                                     |
-| `PORTC_PORT_vect`   | XX | XX | XX | Manually         |                                                     |
-| `PORTD_PORT_vect`   | XX | XX | XX | Manually         |                                                     |
-| `PORTE_PORT_vect`   |  X |  X |    | Manually         |                                                     |
-| `PORTF_PORT_vect`   | XX | XX | XX | Manually         |                                                     |
-| `PORTG_PORT_vect`   |  X |  X |    | Manually         |                                                     |
-| `PTC_PTC_vect`      | XX |    |    | Handled by QTouch| All aspects of PTC only handled by QTouch library   |
-| `RTC_CNT_vect`      | XX | XX | XX | Manually         | Two possible flags, CNT and OVF                     |
-| `RTC_PIT_vect`      | XX | XX | XX | Manually         | Time to first PIT int is random from 0 to period    |
-| `SPI0_INT_vect`     | XX | XX | XX | Depends on mode  | 2 or 4 flags, some autoclear, some dont             |
-| `SPI1_INT_vect`     | XX | XX | XX | Depends on mode  | 2 or 4 flags, some autoclear, some dont             |
-| `TCA0_CMP0_vect`    | XX | XX | XX | Manually         | Alias: `TCA0_LCMP0_vect`                            |
-| `TCA0_CMP1_vect`    | XX | XX | XX | Manually         | Alias: `TCA0_LCMP1_vect`                            |
-| `TCA0_CMP2_vect`    | XX | XX | XX | Manually         | Alias: `TCA0_LCMP2_vect`                            |
-| `TCA0_HUNF_vect`    | XX | XX | XX | Manually         | Split Mode                                          |
-| `TCA0_OVF_vect`     | XX | XX | XX | Manually         | Alias: `TCA0_LUNF_vect`                             |
-| `TCA1_CMP0_vect`    |  X |  X |    | Manually         | Alias: `TCA1_LCMP0_vect`                            |
-| `TCA1_CMP1_vect`    |  X |  X |    | Manually         | Aloas: `TCA1_LCMP1_vect`                            |
-| `TCA1_CMP2_vect`    |  X |  X |    | Manually         | Alias: `TCA1_LCMP2_vect`                            |
-| `TCA1_HUNF_vect`    |  X |  X |    | Manually         | Split Mode                                          |
-| `TCA1_OVF_vect`     |  X |  X |    | Manually         | Alias: `TCA1_LUNF_vect`                             |
-| `TCB0_INT_vect`     | XX | XX | XX | Depends on mode  | Two flags: CMP on read ccmp in capt mode, OVF Manual|
-| `TCB1_INT_vect`     | XX | XX | XX | Depends on mode  | Two flags: CMP on read ccmp in capt mode, OVF Manual|
-| `TCB2_INT_vect`     | XX | XX |  X | Depends on mode  | Two flags: CMP on read ccmp in capt mode, OVF Manual|
-| `TCB3_INT_vect`     |  X |  X |    | Depends on mode  | Two flags: CMP on read ccmp in capt mode, OVF Manual|
-| `TCB4_INT_vect`     |  X |  X |    | Depends on mode  | Two flags: CMP on read ccmp in capt mode, OVF Manual|
-| `TCD0_OVF_vect`     | XX | XX | XX | Manually         |                                                     |
-| `TCD0_TRIG_vect`    | XX | XX | XX | Manually         |                                                     |
-| `TWI0_TWIM_vect`    | XX | XX | XX | Usually Auto     | See datasheet for list of what clears it            |
-| `TWI0_TWIS_vect`    | XX | XX | XX | Usually Auto     | See datasheet for list of what clears it            |
-| `TWI1_TWIM_vect`    | XX | XX |    | Usually Auto     | See datasheet for list of what clears it            |
-| `TWI1_TWIS_vect`    | XX | XX |    | Usually Auto     | See datasheet for list of what clears it            |
-| `USART0_DRE_vect`   | XX | XX | XX | Write/Disable    | ISR must write data or disable interrupt            |
-| `USART0_RXC_vect`   | XX | XX | XX | RXCIF, on read   | Error flags, if enabled only clear manually         |
-| `USART0_TXC_vect`   | XX | XX | XX | Manually         | Often polled and not cleared until next write       |
-| `USART1_DRE_vect`   | XX | XX | XX | Write/Disable    | ISR must write data or disable interrupt            |
-| `USART1_RXC_vect`   | XX | XX | XX | RXCIF, on read   | Error flags, if enabled only clear manually         |
-| `USART1_TXC_vect`   | XX | XX | XX | Manually         | Often polled and not cleared until next write       |
-| `USART2_DRE_vect`   | XX | XX |    | Write/Disable    | ISR must write data or disable interrupt            |
-| `USART2_RXC_vect`   | XX | XX |    | RXCIF, on read   | Error flags, if enabled only clear manually         |
-| `USART2_TXC_vect`   | XX | XX |    | Manually         | Often polled and not cleared until next write       |
-| `USART3_DRE_vect`   |  X |  X |    | Write/Disable    | ISR must write data or disable interrupt            |
-| `USART3_RXC_vect`   |  X |  X |    | RXCIF, on read   | Error flags, if enabled only clear manually         |
-| `USART3_TXC_vect`   |  X |  X |    | Manually         | Often polled and not cleared until next write       |
-| `USART4_DRE_vect`   |  X |  X |    | Write/Disable    | ISR must write data or disable interrupt            |
-| `USART4_RXC_vect`   |  X |  X |    | RXCIF, on read   | Error flags, if enabled only clear manually         |
-| `USART4_TXC_vect`   |  X |  X |    | Manually         | Often polled and not cleared until next write       |
-| `USART5_DRE_vect`   |  X |  X |    | Write/Disable    | ISR must write data or disable interrupt            |
-| `USART5_RXC_vect`   |  X |  X |    | RXCIF, on read   | Error flags, if enabled only clear manually         |
-| `USART5_TXC_vect`   |  X |  X |    | Manually         | Often polled and not cleared until next write       |
-| `ZCD0_ZCD_vect`     | XX | XX |    | Manually         |                                                     |
-| `ZCD1_ZCD_vect`     |  X |  X |    | Manually         |                                                     |
-| `ZCD2_ZCD_vect`     |  X |  X |    | Manually         |                                                     |
-| `ZCD3_ZCD_vect`     |    |    | XX | Manually         | Yeah for whatever reason they have ZCD3 only on DDs |
-
-`XX` indicates available on at least three of the four pincounts that series is available in (ex: PORTF is on 20, 28 and 32-pin DD)
-
-` X` indicates available on only one or two of the four sizes that series is available in (ex: TCA1 is only on 48 and 64 pin DA or DB)
-
-## Additional notes
-
-### Clearing flags - why so complicated?
-Almost all flags *can* be manually cleared - the ones that can be cleared automatically generally do that to be helpful:
-* when the purpose of the flag is to tell you that something is ready to be read, reading it clears the flag. ADC, serial interfaces, and TCB input capture do that.
-* The TWI interrupts work the same way - you need to read, write, or ack/nack something to respond to the bus event; doing so clears the flag too.
-* Sometimes interrupts like that can have error flags that can trigger them enabled too; those typically have to be manually cleared - by enabling them, you declare an intent to do something about them, so you're responsible for telling the hardware you did it.
-* USART, and buffered SPI have DRE interrupt that can only be cleared by writing more data - otherwise you need to disable the interurpt from within the ISR. The TXC (transfer/transmit complete) flags are freqently polled rather than used to fire interrupts. It's not entirley clear from the datasheet if the EEPROM ready interrupt is like that, or can be cleared manually.
-* The NMI is a very special interrupt; it can be configured to be a normal interrupt *or* a Non-Maskable Interrupt. In NMI mode, the part will sit there running the interrupt instead of almost-working with damaged firmware - which could potentially create a dangerous situation if it was part of a life-saftety critical device, like the controller for your antilock breaks. No matter what the damaged firmware tries to do, it cannot disable or bypass the NMI. Onlu loading working firmware and resetting it will clear the NMI. This is of particular relevance in life-safety-critical applications which these parts (but NOT this software package nor Arduino in genberal) are certified for. Not something likely to be used in Arduino-land.
-
-### Vectors linked to many flags
-There are a few vectors with a lot of flags that can trigger them. For example, each of the PORT interrupts has 8 flags that can trigger it. One hazard with these is that if you have a large number enabled - especially if your ISR is longer than it ought to be - that interrupts could fire whikle the ISR is running. You need to make sure you aren't missing those:
-
-```c++
-ISR(PORTA_PORT_vect) {
-  if (VPORTA.INTFLAGS & (1 << 0)) {
-    doSomething();
-  }
-  if (VPORTA.INTFLAGS & (1 << 1)) {
-    doSomethingElse();
-  }
-  VPORTA.INTFLAGS=VPORTA.INTFLAGS //WRONG - if an interrupt happened after it's conditional, it would be missed.
-}
-```
-
-```c++
-ISR(PORTA_PORT_vect) {
-  byte flags=PORTA.INTFLAGS; //Note: slower than VPORT!
-  if (flags & (1 << 0)) {
-    doSomething();
-  }
-  if (flags & (1 << 1)) {
-    doSomethingElse();
-  }
-  PORTA.INTFLAGS=flags; // Better... if you care whether one of those conditions happens again, though, you could still miss it.
-}
-```
-```c++
-ISR(PORTA_PORT_vect) {
-  byte flags=VPORTA.INTFLAGS;
-  PORTA.INTFLAGS=flags; // Very common approach
-  if (flags & (1 << 0)) {
-    doSomething();
-  }
-  if (flags & (1 << 1)) {
-    doSomethingElse();
-  }
-}
-```
-Another approach
-```c++
-ISR(PORTA_PORT_vect) {
-  // Only handles one interrupt source per call of the ISR
-  if (VPORTA.INTFLAGS & (1 << 0)) {
-    VPORTA.INTFLAGS |= (1 << 0);
-    doSomething();
-  }   // Could be made into an else-if in order to let other interrupts fire if your ISR
-      // is slow, and likely to be called with multiple flags set;
-      // that case goes particularly well with round-robin interrupt scheduling option.
-  if (VPORTA.INTFLAGS & (1 << 1)) {
-    VPORTA.INTFLAGS |= (1 << 1);
-    doSomethingElse();
-  }
-}
-```
-
-
-Note: `if (VPORTx.INTFLAGS & (1 << n))` is a maximally efficient way to test for a bit in a `VPORTx.anything` register or one of the 4 GPIORn/GPIOn/GPR.GPRn. Those registers (like many assorted important registers on classic AVRs - and unlike any other registers besides VPORTs and GPR/GPIOR/GPIO registers (over the past 6 years, the've been known by every one of those officially; based on current trends, by the time we get the EA series, we should have a yet another name for them, and by the tinyAVR 4-series, maybe we will be writing to `GOGO1`))
-
-### Reminders
-* ISRs should run FAST. Minimize the time that the code spends in the ISR. Never use polling loops, and avoid writing to serial. Most interrupts should just set a flag that is checked elsewhere, and do what must be done that instant (eg, reading from certain hardware registers). If you absolutely must write something to serial, maybe for debugging - can you make it a single character? `Serial.write('*');` is significantly less bad than `Serial.print("Int1 fired");`
-* Read the datasheet, particularly relating to the relevant INTFLAGS register - make sure you understand when it is and is not cleared automatically, and when you need to clear it. And be sure you do so.
-* Any global variable that an interrupt changes, if used outside the ISR, must be declared volatile - otherwise the compiler may optimize away access to it, resulting in code elsewhere not knowing that it was changed by the ISR.
-* Any global variable read by the ISR and written to by code outside of the ISR which are larger than a byte must be written to with interrupts disabled - if the interrupt triggers in the middle of a write, the ISR would see a corrupted value.
-
-
-### A bit more on timing
-So as described above, execution reaches the ISR within 6 system clock cycles (5 on 8k or less parts); Then the interrupt has to take special measures to save the state of what was interrupted, and then restore it at the end - this is automatically done by AVR-GCC at the beginning and end of an ISR (these are called the prologue and epilogue); the compiler must assume that every working register, plus the SREG, contains something that must be saved. These get saved by `push`ing them onto the stack. All of the needed registers are freed up this way before your code starts to actually execute (the more variables your ISR needs at a time, the more it needs to `push` onto the stack, prolonmging the delay before your code begins executing. At least on the modern AVRs, `push` is only 1 clock cycle (in the past, it was two). Of course, saving the state is only half of the job - after running the ISR that you wrote, the epilogue has to `pop` all those registers off the stack - this takes two clocks a piece. So while the datasheet loves to talk about 6 clocks to enter the interrupt routine - even the bare minimum is 11: 6 to get there, then push r1, and r0, load SREG into one of those, push that onto the stack, and then zero out r1 (gcc needs a known zero register, even though it often doesn't always use it when it could). So 11 clocks to enter, and 7 to restore the minimal variables and 4 for the reti - 11 on either end.... plus the overhead of any registers your code uses. In DxCore or megaTinyCore, when a TCB is used for millis, all the ISR does is load the millisecond count, add 1 to it, and save it (that's all it does!) requires 17 clocks in the prologue, 12 more to load the current millis tally, 3 to increment it, 8 more to save it, 3 to clear the interrupt flag, then the 4 registers we pushed need to be restored taking 8 clocks plus the 11 standard ones. That ISR wind up taking 17 + 26 + 19 = 62 clock cycles! 
This amplification of execution time is a big part of why everyone always tell folks to make the ISRs fast. What often isn't mentioned is the importance of mimnimizing the number of simultaneously used variables.
-
-If you're desperate for speed - or space - and if all you are doing is setting a flag, you can use one of the general purpose registers:GPR.GPR0/1/2/3 - the only place the core uses those is when using a bootloader, the reset cause is stashed in GPR.GPR0 (you can clear in setup: `GPR.GPR0 = 0` What is magic about the GPRs is that they are in the low I/O space. So something like `GPR.GPR1 |= (1 << n)` where n is known at compile time, is a single clock operation which consumes no registers - it gets turned into a `sbi` - set bit index. The same goes for `GPR.GPR1 &= ~(1 << n)`  - these are also atomic (an interrupt couldn't interrupt them like it could a read-modify-write). There are analogous instructions that make `if(GPR.GPR1 & (1 << n))` and `if (!(GPR.GPR1 & (1 << n))` lightning fast tests. But this only works for 1 bit at a time! (`GPR.GPR1 |= 3` is a 3 clock read-modify-write operation) (though you can read or write to them in one clock, instead of 2 or 3 like most memory locations.)
diff --git a/megaavr/extras/Libraries.md b/megaavr/extras/Libraries.md
index 56a4f91c..7f1a9399 100644
--- a/megaavr/extras/Libraries.md
+++ b/megaavr/extras/Libraries.md
@@ -47,7 +47,7 @@ These are the very-basic libraries that all Arduino-compatible board packages (w
 [EEPROM readme](../libraries/EEPROM/README.md) This is the standard wrapper around interacting with the 512b on-chip EEPROM available on all DA-series and DB-series parts. The future DD-series parts will have 256b EEPROM. It replicates the standard API exactly. There are no special concerns - except that other libraries which build upon the EEPROM library may make assumptions about how EEPROM is implemented which are not compatible with these parts.
 
 ### SoftwareSerial
-SoftwareSerial was - against my better judgement - brought over. It is unmodified from the version in the official megaavr core. Avoid using software serial wherever possible - these parts have between 3 and 6 hardware serial ports; a hardware serial port will always be less likely to cause problematic interactions. Just like on other devices, SoftwareSerial takes over ALL pin interrupts, because it calls `attachInterrupt()` (more of the blame rests with the abominable implementation of `attachInterrupt()`, really for no reason other than that attaching one interrupt to one port takes over all port interrupts on all ports.). Note that I hope to introduce a better software serial implementation in the somewhat near future.
+SoftwareSerial was - against my better judgement - brought over. It is unmodified from the version in the official megaavr core. Avoid using software serial wherever possible - these parts have between 3 and 6 hardware serial ports; a hardware serial port will always be less likely to cause problematic interactions. Just like on other devices, SoftwareSerial takes over ALL pin interrupts, because it calls `attachInterrupt()` (see [the interrupt reference](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Interrupts.md) - as of 1.3.7 there is a way around this).
 
 ### SPI
 [SPI.h readme](../libraries/SPI/README.md) The included version of SPI.h includes all the standard Arduino API functions, the `swap()` and `pins()` methods, and as of 1.3.0 supports using either SPI0 or SPI1 to increase the range of available pins without compromising compatibility. See the readme for details and the sordid story of why that was an issue. If you're not using the SPI1 pinsets, you could manually implement SPI slave on SPI1 while using SPI.h as master with the SPI0 pin sets.
diff --git a/megaavr/extras/NotesOnPeripherals.md b/megaavr/extras/NotesOnPeripherals.md
deleted file mode 100644
index 7ed3edb8..00000000
--- a/megaavr/extras/NotesOnPeripherals.md
+++ /dev/null
@@ -1,95 +0,0 @@
-# Notes on peripherals and other things
-This document lays out some general notes on the peripherals - some general, some advanced, and some particularly aimed at people who are used to classic AVRs...
-
-## With a #if, you can test for `_bm`, `_bp`, and `_gm` constants **but NOT `_gc` ones**!
-The `_gc` constants are implemented as enums, not #defines. The preprocessor does not know how to deal with them - or rather, it deals with them precisely how it deals with all other "Identifiers which are not macros" - they are treated as equal to zero. It is not an error condition. More specifically, what you can and cannot do:
-
-```
-#define TCA0_PINS    PORTMUX_TCA0_PORTC_gc     // this is OK - and TCA0_PINS gets replaced with PORTMUX_TCA0_PORTC_gc
-                           // (which is a constant, 2 in this case), when it gets substituted into C/C++ code.
-
-#ifdef TCA0_PINS           // SOMETHING is defined, yup this is okay too
-  conditionallyCompiled();
-#endif
-
-#if defined(TCA0_PINS)     // Same as the #ifdef
-  conditionallyCompiled();
-#endif
-#if TCA0_PINS == 2         // However, here the preprocessor does not know the value of TCA0_PINS, because
-  conditionallyCompiled(); // PORTMUX_TCA0_PORTC_gc is an enum; so while TCA0_PINS is defined, it is defined as
-#endif                     // an enum; which as stated above is not defined as far as the preprocessor is concerned.
-                           // So it treats TCA0_PINS as 0, and instead of this resolving to #if 2 == 2 and including
-                           // the conditionally-compiled code, it turns into 0 == 2, and that code is not included.
-```
-The creators of the preprocessor considered that "shortcut" to be a "feature" - and so even -Wall does not warn about it. Since it is all but certain to be a bug in Arduino-land, I have enabled -Wundef. when any warnings are enabled. The use of enums for the group code constants is incredibly inconvenient for this reason. And you VERY frequently want to test for them too!
-
-## Interrupts
-
-### Don't let your code try to call an ISR that doesn't exist
-
-If you don't have an ISR defined, it will go to the "bad interrupt" handler, which jumps to the reset vector... This puts your sketch into a guaranteed broken state: The INTFLAG is still set, and the interrupt is still enabled - but interrupts won't be firing because CPUINT thinks you're already in an interrupt because the `reti` instruction (RETurn from Interrupt) was never executed... The result is generally completely broken sketch behavior. So don't let that happen!
-
-### Flags MUST be cleared in (many) interrupts
-Unlike classic AVRs, you often must clear the interrupt flag in the ISR by writing a 1 to it. Not for all interrupts, though. In some cases, there is a specific condition that will result in the flag being cleared automatically (often something that is directly related to it, like reading the CAPTURE register for TCB input capture, or reading the ADC result for an ADC result ready interrupt. In a few cases (such as DRE for serial or buffered SPI), you can't control thestate of the flag at all, and can only enable or disable the interrupt). This is all described on a case-by-case basis in the datasheet where the INTFLAG registers in question is described
- Failing to do so can produce surprising results, because the processor doesn't *halt* - it just runs agonizingly slowly (plus whatever the interrupt does keeps happening - but this may not be as obvious) - because at least one instruction will always happen between interrupts.
-
-Here's a sketch that demonstrates this, and another interesting thing; the loop waiting for the ISR is exited as soon as it fires the first time... but it keeps running continually, letting the while loop after it run one instruction for each time the ISR runs. Both of those unsigned ints cause the ISR and while() loop to toggle the pins more slowly, so you can see how fast the two are running relative to each other with even really crude means.
-
-```c++
-volatile unsigned int test=0;
-unsigned int test2=0;
-void setup() {
-  PORTA.DIR=0xC0;
-  TCA0.SINGLE.CTRLA=0x0F; //TCA0 1024 prescaler, Running
-  TCB0.CTRLA=0; TCB0.CTRLB=0; TCB0.INTCTRL=1; //Stop TCB0, Periodic Interrupt timing mode, interrupt on
-  TCB0.CNT=0;
-  TCB0.CCMP=F_CPU/1024; // Set TOP to number of clocks until overflow
-  TCB0.CTRLA=0x05;
-  while(!test);
-  while(1){
-    __asm__ __volatile__("nop");
-    if (!test2++) PORTA.OUTTGL=0x40;
-  }
-}
-void loop () {}
-ISR(TCB0_INT_vect){
-  if (!test++) PORTA.OUTTGL=0x80;
-  //TCB0.INTFLAGS=1; //Should be here!
-}
-
-```
-
-As it happens the "test2" PA7 transitions 10 times for each time PA6 does - and sure enough in the assembly listing (sketch -> export compiled binary with this core exports it to sketch folder!), there are 10 instructions. **And the ISR takes a whopping 50 instructions!** (Lots of POPs and the RETI at the end; loading a byte from RAM into registers is also 3 cycles, and storing it back is 2 more. So it is surprisingly slow). Just like you'd expect (`20,000,000/(2^16*51)`), it transitionas once per 167mS or so.
-
-#### Corollary
-Accessing variables in an ISR is extra expensive because it has to save and restore the register used hold it's variable, since it doesn't know what it might be interrupting so it has to assume all registers are used and preserve them. The prologue and epilogue of an ISR is surprisingly costly. In this example, all we're doing is incrementing one unsigned integer, testing if the result is 0 and if it is executing a single instruction, and we're at 50 clock cycles! Well, yeah... loading that 2 byte global is 2x LDS, and saving it is 2x STS (2x3 + 2x2), the math and test is 4?, then the normal required prologue and epilogue is 3 push, 1 in, 1 clr, 3 pop, 1 out - another 12, and we needed 3 registers? So there's another 3 push 3 pop, 9 more clocks there, reti is 4, it takes 5 to actually start executing the ISR at all (2 to store the program counter to stack and three to execute the jmp from the vector to the routine - so without even having to look at the assembly listing, I can count up 40 clocks. Your brain should translate this to an extra-bold underline under the frequent admonation to **keep ISRs as small and short as possible** and that that includes the number of variables they access!
-
-
-### Watch out for the optimizer...
-You may be wondering about the nop instruction there.
-That's because there's another wacky thing demonstrated there: The compiler can get awfully clever when optimizing - as in, too clever. Without that, it will perform loop unrolling - replacing the loop with with all the values test2 will hold:
-
-This:
-```c++
-while(1){
-  if (!test2++) PORTA.OUTTGL=0x40;
-}
-```
-Becomes:
-```c++
-if (!0) PORTA.OUTTGL=0x40;
-if (!1) PORTA.OUTTGL=0x40;
-if (!2) PORTA.OUTTGL=0x40;
-...
-if (!65534) PORTA.OUTTGL=0x40;
-if (!65535) PORTA.OUTTGL=0x40;
-```
-And the constants where the test for the if() evaluates to false (ie, 65535 of the 65536 cases) can be eliminated - they do nothing right?
-Leaving:
-```c++
-while(1){
-  PORTA.OUTTGL=0x40;
-}
-```
-
-Which is why one normally writes busy-wait and cycle-counting stuff using inline assembly!
diff --git a/megaavr/extras/PWMandTimers.md b/megaavr/extras/PWMandTimers.md
index 9f0a4865..6b08c148 100644
--- a/megaavr/extras/PWMandTimers.md
+++ b/megaavr/extras/PWMandTimers.md
@@ -1,7 +1,7 @@
 # PWM and Timer usage
-This document describes how the timers are configured by the core prior to the sketch starting and/or by the built-in peripherals, and how this may impact users who wish to take full control of these peripherals.
+This document describes how the timers are configured by the core prior to the sketch starting and/or by the built-in peripherals, and how this may impact users who wish to take full control of these peripherals. Apart from the background section, this document applies only to DxCore, though much of it is very similar to megaTinyCore. The corresponding document for megaTinyCore is more accurate for that core.
 
-## The Timers on Dx-series parts
+## Background: The Timers on Dx-series parts
 This applies to the DA, DB, and in overwhelming liklihood, the DD-series as well. These timers are, with very few changes, the same "modern" timers introduced on the tinyAVR 0/1-series, and featured on the megaAVR 0-series (including the ATmega4809 on the Nano Every) and tinyAVR 2-series. The megaAVR 0-series parts are all supported by @MCUdude's MegaCoreX [Hans/MCUdude](https://github.com/MCUdude)'s excellent [MegaCoreX](https://github.com/MCUdude/MegaCoreX).
 
 
@@ -17,9 +17,9 @@ There are a few examples of using TCA0 to generate PWM at specific frequencies a
 ### TCBn - Type B 16-bit Timer
 The type B timer is what I would describe as a "utility timer". It is also the only timer which got a significant upgrade with the Dx-series... it received a new event user, `TCBn_COUNT`, and a new `TCBn.CTRLA` register layout with an option to clock on events, rather than just. This is a pretty big deal for the type B timers. But that's  Although, unlike the earlier 0/1-series parts, (though these both call the same ISR) (they now have `CAPT` and OVF), the behavior is somewhat muddled to retain compatibility with code written for the older timers, and the benefit . The input clock source can be either the system clock, optionally prescaled by 2, or whatever the prescaled clock of TCA0 (or TCA1 if present) is.
 
-They can be set to act as 8 bit PWM source. When used for PWM, they can only generate 8-bit PWM, despite being a 16-bit timer, because the 16-bit `TCBn.CCMP` register is used for both the period and the compare value in the high and low bytes respectively. They always operate in single-slope mode, counting upwards. In other words, the type B timers are not good for generating PWM. Note also that on many parts, `TCBn.CCMP` is effected by silicon errata: It still acts like a 16-bit register, using the temp register for access, so you must read the low byte first, then high byte, and always write the high byte after the low one, lest it not be written. T
+They can be set to act as an 8-bit PWM source. When used for PWM, they can only generate 8-bit PWM, despite being 16-bit timers, because the 16-bit `TCBn.CCMP` register is used for both the period and the compare value, in the low and high bytes respectively. They always operate in single-slope mode, counting upwards, and the frequency depends on that of the TCAn (since CLK_PER/2 is far too fast for 8-bit PWM). In other words, **the type B timers are not very good at generating PWM**. Note also that `TCBn.CCMP` is affected by silicon errata: it still acts like a 16-bit register, using the temp register for access, so you must read the low byte first, then the high byte, and always write the high byte after the low one, lest it not be written, or a bad value be written over the low byte!
 
-While this makes them poor output generators, they are excellent utility timers, which is what they are clearly designed for. They can be used to time the duration of events down to single system clock cycles in the input capture modes, and with the event being timed coming from the event system, any pin can be used as the source for the input capture, as well as the analog comparators, the CCL modules, and more. As input capture timers, they are far more powerful than the 16-bit timers of the classic AVR parts. They can also be used as high resolution timers independent of the builtin millis()/micros() timekeeping system if this is needed for specific applications. The Dx-series adds support for two exciting new options - first, they can be clocked from events (those single cycle events became a lot more useful) - and secondly, you can use that to cascade two timers together, in order to do 32-bit input capture. 32 bits gives you a maximum count of 4.2 billion... How precise is that? How about precise enough to time an event lasting nearly 3 minutes to an accuracy of 24ths of a microsecond? Yes, this is possible now!
+While this makes them poor output generators, they are excellent utility timers, which is what they are clearly designed for. They can be used to time the duration of events down to single system clock cycles in the input capture modes, and with the event being timed coming from the event system, any pin can be used as the source for the input capture, as well as the analog comparators, the CCL modules, and more. As input capture timers, they are far more powerful than the 16-bit timers of the classic AVR parts. They can also be used as high resolution timers independent of the builtin millis()/micros() timekeeping system if this is needed for specific applications (in some cases during development, millis/micros and a TCB were compared in order to detect errors in the timekeeping). The Dx-series adds support for two exciting new options - first, they can be clocked from events (those single cycle events became a lot more useful) - and secondly, you can use that to cascade two timers together, in order to do 32-bit input capture. 32 bits gives you a maximum count of 4.2 billion; with CLK_PER as the clock source, events with durations of several minutes can be timed to an accuracy of single clock cycles.
 
 ### TCD0 - Type D 12-bit Async Timer
 The Type D timer, is a very strange timer indeed. It can run from a totally separate clock supplied on EXTCLK, or from the unprescaled internal oscillator - or, on the Dx-series, from the on-chip PLL at 2 or 3 times the speed of the external clock or internal oscillator! It was apparently designed with a particular eye towards motor control and SMPS control applications. This makes it very nice for those sorts of use cases, but in a variety of ways,these get in the way of using it for the sort of things that people who would be using the Arduino IDE typical arduino-timer purposes. First, none of the control registers can be changed while it is running; it must be briefly stopped, the register changed, and the timer restarted. In addition, the transition between stopping and starting the timer is not instant due to the synchronization process. This is fast (it looks to me to be about 2 x the synchronizer prescaler 1-8x Synchronizer-prescaler, in clock cycless. The same thing applies to reading the value of the counter - you have to request a capture by writing the SCAPTUREx bit of TCD0.CTRLE, and wait a sync-delay for it. can *also* be clocked from the unprescaled 20 MHz (or 16 MHz) internal oscillator, even if the main CPU is running more slowly. - though it also has it's own prescaler - actually, two of them - a "synchronizer" clock that can then be further prescaled for the timer itself. It supports normal PWM (what they call one-ramp mode) and dual slope mode without that much weirdness, beyond the fact that `CMPBSET` is TOP, rather than it being set by a dedicated register. But the other modes are quite clearly made for driving motors and switching power supplies. Similar to Timer1 on the ATtiny x5 and x61 series parts in the classic AVR product line,  this timer can also create programmable dead-time between cycles.
@@ -34,112 +34,170 @@ Information on the RTC and PIT will be added in a future update.
 
 ## Timer Prescaler Availability
 
-Prescaler    | TCA0   | TCBn  | TCD0   | TCD0 sync | TD0 counter|
------------- | -------|-------| -------| -------| -------|
-CLK          |  YES   |  YES  |  YES   |  YES   |  YES   |
-CLK2         |  YES   |  YES  |  YES*  |  YES   |  NO    |
-CLK/4        |  YES   |  TCA  |  YES   |  YES   |  YES   |
-CLK/8        |  YES   |  TCA  |  YES   |  YES   |  NO    |
-CLK/16       |  YES   |  TCA  |  YES*  |  NO    |  NO    |
-CLK/32       |  NO    |  NO   |  YES   |  NO    |  YES   |
-CLK/64       |  YES   |  TCA  |  YES*  |  NO    |  NO    |
-CLK/128      |  NO    |  NO   |  YES*  |  NO    |  NO    |
-CLK/256      |  YES   |  TCA  |  YES*  |  NO    |  NO    |
-CLK/1024     |  YES   |  TCA  |  NO    |  NO    |  NO    |
-
-* Requires using the synchronizer prescaler as well. My understanding is that this results in synch cycles taking longer.
+Prescaler    | TCAn  | TCBn  | TCD0  | TCD0 sync | TCD0 counter|
+------------ | ------|-------|-------|-----------|-------------|
+CLK          |  YES  |  YES  |  YES  |  YES      |  YES        |
+CLK/2        |  YES  |  YES  |  YES* |  YES      |  NO         |
+CLK/4        |  YES  |  TCA  |  YES  |  YES      |  YES        |
+CLK/8        |  YES  |  TCA  |  YES  |  YES      |  NO         |
+CLK/16       |  YES  |  TCA  |  YES* |  NO       |  NO         |
+CLK/32       |  NO   |  NO   |  YES  |  NO       |  YES        |
+CLK/64       |  YES  |  TCA  |  YES* |  NO       |  NO         |
+CLK/128      |  NO   |  NO   |  YES* |  NO       |  NO         |
+CLK/256      |  YES  |  TCA  |  YES* |  NO       |  NO         |
+CLK/1024     |  YES  |  TCA  |  NO   |  NO       |  NO         |
+
+`*` Requires using the synchronizer prescaler as well as the counter prescaler. My understanding is that this results in sync cycles taking longer.
+`TCA` indicates that a type B timer can only get this prescaler indirectly: a TCA must be configured to use that prescaler, and the TCB set to use that TCA's clock.
 
 ## Resolution, Frequency and Period
 When working with timers, I constantly found myself calculating periods, resolution, frequency and so on for timers at the common prescaler settings. While that is great for adhoc calculations, I felt it was worth some time to make a nice looking chart that showed those figures at a glance. The numbers shown are the resolution (when using it for timing), the frequency (at maximum range), and the period (at maximum range - ie, the most time you can measure without accounting for overflows).
 ### [In Google Sheets](https://docs.google.com/spreadsheets/d/10Id8DYLRtlp01KA7vvslC3cHaR4S2a1TrH7u6pHXMNY/edit?usp=sharing)
 
-## Usage of Timers by DxCore
-This section applies only to DxCore - though much of it is very similar to megaTinyCore.
 
+## PWM ( analogWrite() )
+### TCAn
+The core reconfigures the type A timers in split mode, so each can generate up to 6 PWM channels simultaneously. The `LPER` and `HPER` registers are set to 254, giving a period of 255 cycles (it starts from 0), thus allowing 255 levels of dimming (though 0, which would be a 0% duty cycle, is not used via analogWrite, since analogWrite(pin,0) calls digitalWrite(pin,LOW) to turn off PWM on that pin). This is used instead of PER=255 because analogWrite(pin,255) in the world of Arduino is 100% on, and sets that via digitalWrite(); if the timer counted to 255, the Arduino API would provide no way to set the 255/256th duty cycle. Additionally, modifications would be needed to make millis()/micros() timekeeping work without drift at that period - see notes about millis/micros period and prescale requirement above.
 
-### PWM ( analogWrite() )
-In it's stock configuration, TCA0 (and TCA1) is configured in split mode and used to generate the PWM controlled by analogWrite(). The `LPER` and `HPER` registers are set to 254, giving a period of 255 cycles (it starts from 0), thus allowing 255 levels of dimming (though 0, which would be a 0% duty cycle, is not used via analogWrite, since analogWrite(pin,0) calls digitalWrite(pin,LOW) to turn off PWM on that pin). This is used instead of a PER=255 because analogWrite(255) in the world of Arduino is 100% on, and sets that via digitalWrite(), so if it counted to 255, the arduino API would provide no way to set the 255/256th duty cycle). Additionally, modifications would be needed to make millis()/micros() timekeeping work without drift at that period - see notes about millis/micros period and prescale requirement above.
+### TCD0
+TCD0, by default, is configured for generating PWM (unlike the TCAs, that's about all it can do usefully). TCD0 is clocked from CLK_PER when the system is using the internal clock without prescaling. On the prescaled clocks (5 and 10 MHz, which exist for 0-series compatibility) it is run off the unprescaled oscillator, keeping the PWM frequency near the center of the target range. When an external clock is used, we run it from the internal oscillator at 8 MHz, which is right on target.
 
-Similarly, TCD0, by default, is configured for generating PWM. TCD0 is clocked from the CLK_PER except on the prescaled clocks (5 and 10 MHz, which exist for 0-series compatibility and run it off the unprescaled oscillator - like the 1-series tinyAVR does with megaTinyCore). It is always used in single-ramp mode, with `CMPBCLR` (hence TOP) set to either 244, 509, or 1019 (for 255 tick, 510 tick, or 1020 tick cycles), the sync prescaler set to 1 for fastest synchronization, system clock and TCD0 for millis, as noted below). `CMPACLR` is set to 0xFFF (the timer maximum, 4095). The `CMPxSET` registers are controlled by analogWrite() which subtracts the supplied dutycycle from 255, and then doubles it (a lower `CMPxSET` value corresponds to a higher duty cycle in one ramp mode). Counting to 509 instead of 254 allows it to output PWM at the same 1.225 kHz (at 20 MHz) or 980 Hz (at 16 MHz) that TCA0 does.
+It is always used in single-ramp mode, with `CMPBCLR` (hence TOP) set to either 254, 509, or 1019 (for 255 tick, 510 tick, or 1020 tick cycles), the sync prescaler set to 1 for fastest synchronization, and the count prescaler set to 32 except at 1 MHz. `CMPACLR` is set to 0xFFF (the timer maximum, 4095). The `CMPxSET` registers are controlled by analogWrite(), which subtracts the supplied duty cycle from 255, checks the current `CMPBCLR` high byte to see how many places to left-shift that result, then subtracts 1 and writes it to the register. If the channel is already outputting PWM, the `SYNCEOC` command is sent to synchronize the compare value registers at the end of the current PWM cycle. If it isn't, we have to briefly disable the timer, turn on the pin, and then reenable it, producing a glitch on the other channel. To mitigate this issue we treat 0 and 255 duty cycles differently for the TCD pins: for 0, the duty cycle is set to 0% without disconnecting the pin from the timer, and for 255 (the 100% duty cycle case) we additionally invert the pin (setting `CMPxSET` to 0 won't produce a constant output). This eliminates the glitches when the channels are enabled or disabled.
 
-Timer has two output channels - however, each of them can go to either of two pins. PA5 and PA7 use WOB, and PA4 and PA6 use WOA. If you try to write one of the pins controlled by a given channel:
+TCD0 has two output channels - however, each of them can go to either of two pins. PA5 and PA7 use WOB, and PA4 and PA6 use WOA. For example:
 ```
 analogWrite(PIN_PA4,64);  // outputting 25% on PA4
 analogWrite(PIN_PA5,128); // 25% on PA4, 50% on PA5
 analogWrite(PIN_PA5,0);   // 25% on PA4, PA5 constant LOW, but *still connected to timer*
-digitalWrite(PIN_PA5,LOW);// NOW PA5 totally disconnected from timer.
-analogWrite(PIN_PA6,192); // This is on same channel as PA4. We connect channel to PA6 too (not in place of - this is the same way that the Timer1 module is handled on ATTinyCore for ATtiny167 is done, actually)
-                          // so now, both PA4 and PA6 will be outputting a 75% duty cycle. Turn the first pin off with digitalWrite() if you don't want this.
-
+digitalWrite(PIN_PA5,LOW);// NOW PA5 totally disconnected from timer. A glitch will show up briefly on PA4.
+analogWrite(PIN_PA6,192); // This is on same channel as PA4. We connect the channel to PA6 too (not in place of PA4 - recent versions of ATTinyCore do the same thing for Timer1 PWM output on the ATtiny167),
+                          // so now, both PA4 and PA6 will be outputting a 75% duty cycle. Use digitalWrite() on the first pin to explicitly turn it off.
 ```
+You can get a lot of control over the frequency without having to take over full management of the timer (which is rather complicated, and difficult to reconfigure) as long as you follow the rules carefully; see the [TCD0 reference](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_TCD.md). Step off that narrow path, however, and analogWrite() will not work correctly.
+### TCBn
+The type B timers, while not particularly good for PWM, can be used for PWM as well; they are set to use the TCA1 clock by default. A type B timer used for millis cannot be used to output PWM.
+
+### PWM Frequencies
+The frequency of PWM output using the settings supplied by the core is shown in the table below. The "target" is 1 kHz, never less than 490 Hz or more than 1.5 kHz. As can be seen below, there are several clock speeds where this has proven an unachievable goal. The upper end of that range is the point at which - if PWMing the gate of a MOSFET - you have to start giving thought to the gate charge and switching losses, and may not be able to directly drive the gate of a modern power MOSFET and expect to get acceptable results (ie, the MOSFET turns on and off completely in each cycle, there is minimal distortion of the duty cycle, and it spends most of its "on" time with the low resistance quoted in the datasheet, instead of something much higher that would cause it to overheat and fail). That is not to say that it **definitely** will work with a given MOSFET under those conditions (see [the PWM section of my MOSFET guide](https://github.com/SpenceKonde/ProductInfo/blob/master/MOSFETs/Guide.md#pwm) ), but the intent was to try to keep the frequency low enough that that use case was viable (nobody wants to be forced into using a gate driver), without compromising the ability of the timers to be useful for timekeeping.
+
+Note that no attention had been paid to these frequencies in DxCore prior to the 1.3.0 release, and serious bugs were not discovered until 1.3.7.
+
+|   CLK_PER | Prescale A |   fPWM  | Prescale D  | TOP D |  fPWM (D) |
+|-----------|------------|---------|-------------|-------|-----------|
+| ** 48 MHz |        256 |  735 Hz |             |       |           |
+| ** 44 MHz |        256 |  674 Hz |             |       |           |
+| ** 40 MHz |        256 |  613 Hz |             |       |           |
+| ** 36 MHz |        256 |  551 Hz |             |       |           |
+|  External |            |         | OSCHF@8  32 |   254 |    980 Hz |
+|  * 32 MHz |        256 |  490 Hz |          32 |  1019 |    980 Hz |
+|  * 30 MHz |         64 | 1838 Hz | OSCHF@8  32 |   254 |    980 Hz |
+|  * 28 MHz |         64 | 1716 Hz |          32 |  1019 |    858 Hz |
+|    25 MHz |         64 | 1532 Hz |          32 |  1019 |    766 Hz |
+|    24 MHz |         64 | 1471 Hz |          32 |  1019 |    735 Hz |
+|    20 MHz |         64 | 1225 Hz |          32 |   509 |   1225 Hz |
+|    16 MHz |         64 |  980 Hz |          32 |   509 |    980 Hz |
+|    12 MHz |         64 |  735 Hz |          32 |   509 |    735 Hz |
+|    10 MHz |         64 |  613 Hz | OSCHF@20 32 |   509 |   1225 Hz |
+|     8 MHz |         64 |  490 Hz |          32 |   254 |    980 Hz |
+|     5 MHz |         16 | 1225 Hz |  OSCHF   32 |   509 |   1225 Hz |
+|     4 MHz |         16 |  980 Hz |          32 |   254 |    490 Hz |
+|     1 MHz |          8 |  490 Hz |           4 |   254 |    980 Hz |
+
+`*` Overclocked (generally works, 28 and 32 can be achieved with internal oscillator)
+
+`**` Way overclocked, may not work (requires external crystal or oscillator).
+
+An external clock or crystal will always cause TCD0 to use the internal oscillator by default. Speeds higher than 32 MHz can only be reached with external clock sources, so they always act as described on the External line (unless reconfigured at runtime).
+
+`Prescale A` and `fPWM` apply to all pins not on TCD0. TOP is always set to 254 for TCA.
+
+`Prescale D`, `TOP D`, and `fPWM (D)` apply to the pins on TCD0.
+Where marked, we clock TCD0 from OSCHF instead of CLK_PER, prescaled by 32. For speeds other than 5 MHz and 10 MHz, we set the internal oscillator to 8 MHz.
 
-The type B timers, while not particularly good for PWM, can be used for this as well; they are set to use the TCA0 clock.
-
-#### PWM Frequencies
-The frequency of PWM output using the settings supplied by the core is shown in the table below. The "target" is 1 kHz, never above 1.5 kHz or below 500 Hz. This range is the highest frequency at which you can directly drive the gate of anything resembling a modern power MOSFET and expect to get acceptable results (ie, MOSFET turns on and off completely in each cycle, there is minimal distortion of the duty cycle, and it spends most of it's "on" time with the low resistance quoted in the datasheet, instead of something much higher that would cause it to overheat and fail). Not to say that it **definitely** will work with a given MOSFET under those conditions (see [the PWM section of my MOSFET guide](https://github.com/SpenceKonde/ProductInfo/blob/master/MOSFETs/Guide.md#pwm) ).
-Note that no attention had been paid to these for DxCore prior to the 1.3.0 release.
-
-|   CLK_PER | Prescale A |   fPWM  | Prescale D | TOP D |  fPWM (D) |
-|-----------|------------|---------|------------|-------|-----------|
-| ** 48 MHz |        256 |  735 Hz |        TBD |   TBD |       TBD |
-| ** 44 MHz |        256 |  674 Hz |        TBD |   TBD |       TBD |
-| ** 40 MHz |        256 |  613 Hz |        TBD |   TBD |       TBD |
-| ** 36 MHz |        256 |  551 Hz |        TBD |   TBD |       TBD |
-|  * 32 MHz |        256 |  490 Hz |         32 |  1019 |    980 Hz |
-|  * 28 MHz |        256 |  429 Hz |         32 |  1019 |    858 Hz |
-|    25 MHz |         64 | 1532 Hz |         32 |  1019 |    766 Hz |
-|    24 MHz |         64 | 1471 Hz |         32 |  1019 |    735 Hz |
-|    20 MHz |         64 | 1225 Hz |         32 |   509 |   1225 Hz |
-|    16 MHz |         64 |  980 Hz |         32 |   509 |    980 Hz |
-|    12 MHz |         64 |  735 Hz |         32 |   509 |    735 Hz |
-|    10 MHz |         64 |  613 Hz |  OSCHF  32 |   509 |   1225 Hz |
-|     8 MHz |         64 |  490 Hz |         32 |   254 |    980 Hz |
-|     5 MHz |         16 | 1225 Hz |  OSCHF  32 |   509 |   1225 Hz |
-|     4 MHz |         16 |  980 Hz |         32 |   254 |    490 Hz |
-|     1 MHz |          8 |  490 Hz |          4 |   254 |    980 Hz |
-
-* Overclocked (generally works and can be achieved with internal oscillator)
-** Way overclocked (untested, requires external crystal or oscillator).
-** TCD0 at those speeds does not have to use CLK_PER - it can use the internal oscillator, which we might set to, oh, whatever the most friendly and convenient speeds is
-Prescale A and F_PWM apply to all pins not on TCD0. TOP is always set to 254
-Prescale D, TOP D, and F_PWM D apply to the pins on TCD0.
-Where marked, we clock from OSCHF instead of using CLK_PER
 These are the overall Timer D prescaler (in all cases, by default only the count prescaler is used), TOP, and resulting frequency of TCD0 PWM output.
-Where TOP is not 254, (that is, everywhere that matters), all duty cycles passed to analogWrite() for those pins will by left-shifted as necessary to get an appropriate duty cycle. One of very few times that we react well to register values being changed on us...
 
-#### Potential enhancement
-When using an external clock source, it may be reasonable to configure the internal oscillator to run at 8 MHz and clock TCD0 with that to get the target 1 kHz PWM without having to bitshift the duty cycle.
+Where TCD0 TOP is not 254, but is 509, 1019, or (above 32 MHz only) 2039, all duty cycles passed to analogWrite() for those pins will be left-shifted as necessary to get an appropriate duty cycle. You can change this at runtime, along with the TCD0 clock settings, while still using analogWrite(), as long as you follow the rules in the TCD0 reference linked above.
 
 #### Previous versions
-On versions 1.3.0 and earlier, TCA prescale was fixed at 64 for CLK_PER > 5 MHz, resulting in PWM frequencies 4x higher when clocked at speeds in excess of the manufacturer's maximum rated clock speed.
-
-Similarly, TCD0 TOP was fixed at 509, with the same result at high frequencies, plus PWM around 250-300 Hz at 4-5 MHz, and 61 Hz at 1 MHz system clock. This issue has not yet been fixed.
-The prescale of 32 is a highly favorable setting - under normal conditions, sync prescale can be 1 and count prescale 32. But if you are using the TCDThirdPWM trick from [the Logic library examples](../libraries/Logic/examples/TCDThirdPWM/TCDThirdPWM.ino), you can get the same prescale from sync prescale of 8 and count prescale of 4 - If TOP = 254, match that by setting delay prescale to 4, and if TOP = 509, set delay prescale to 8 for the same period. In the aforementioned case with the third PWM channel,where F_CPU > 25 MHz, you would need to set CMPBCLR back to 509, and you would have to accept that higher output frequency.
-
-### Millis/Micros Timekeeping
-DxCore allows any of the type A or B timers to be selected as the clock source for timekeeping via the standard millis timekeeping functions. The RTC timers will be added after the sleep/low power library for this and tinyAVR 0/1-series is completed. There are no plans to support the type D timer - this is not like tinyAVR where we are desperately short of timers, and the comparatively difficult to use type D timer is an irresistible victim to palm off the task of millis timekeeping on - and the calculations are more complicated since there are a great many possible speeds it could be running at, as opposed to just 16 or 20 on the tinyAVR 0/1-series. The timer used and system clock speed will effect the resolution of millis() and micros(), the time spent in the millis ISR when the timer overflows, and the time it takes for micros() to return a value (micros always takes several times it's resolution to return - the time returned corresponds to the time micros() was called, regardless of how long it takes to return).
-
-#### TDBn for millis timekeeping
-When TCB2 (or other type B timer) is used for millis() timekeeping, it is set to run at the system clock prescaled by 2 (1 at 1 MHz system clock) and tick over every millisecond (every 2 milliseconds at 1 MHz). This makes the millis ISR very fast, and provides 1ms resolution at all clock speeds except 1 MHz (where it has 2ms resolution). The micros() function also has 1 us resolution at all clock speeds. The type B timer is an ideal timer for millis - as these parts have plenty of them, TCB2 is used by default on all parts, like with the megaAVR 0-series devices in MegaCoreX - typically without resulting in a shortage of Type B timers for other purposes, like tone(), servo, input capture or outputting pulses of a controlled length, which is a relatively common procedure; it is anticipated that as libraries for IR, 433MHz OOK'ed remote control, and similar add support for the modern AVR parts, that these timers will see even more use.
-
-#### TCA0 for millis timekeeping
-When TCA0 is used as the millis timekeeping source, it is set to run at the system clock prescaled by 8 when system clock is 1MHz, 16 when system clock is 4 MHz or 5 MHz, and 64 for faster clock speeds, with a period of 255 (as with PWM). This provides a millis() resolution of 1 or 2 mss, and a micros() resolution of between 2us and 8us. The time taken for micros() to return is slightly faster than with TCD0 as the timekeeping source; the same goes for the time spent in the millis overflow ISR. This is the default timekeeping timer for the 0-series parts, as they do not have a type D timer. Since the timer is run in split mode, the interrupt is set on TCA0_HUNF - that means that, if you wanted to, you could change `TCA0.SPLIT.LPER` and `TCA0.SPLIT.LCMPn` registers to increase the frequency of PWM output on the WO0/1/2 pins at the cost of reducing PWM resolution, without disrupting millis timekeeping.
-
-#### TCD0 for millis timekeeping
-TCD0 is not supported for millis timekeeping on these parts. Originally it was imagined that the implementation from megaTinyCore could simply be used - but there the main clock was prescaled from 16 or 20 MHz. So TCD0 ran from unprescaled osc, giving 2 channels of normal speed PWM anda predictable timebase for millis even clocked at 1 MHz. Here, the value proposition isn't there (unless we were to change how we generate lower clocks specifically for that reason), but that's a *dun dun daaaa* decision. And I don't feel like making that one yet!  , and the clear motivation (a full res timer and full speed PWM on a part running at a lower speed) for using the type D timer in this way. Use a different timer for millis on these parts.
-
-### Tone
+On versions 1.3.0 and earlier, TCA prescale was fixed at 64 for CLK_PER > 5 MHz, resulting in much higher PWM frequencies when clocked at speeds in excess of the manufacturer's maximum rated clock speed. Similarly, TCD0 TOP was fixed at 509, with the same result at high frequencies. Because that code was derived from megaTinyCore, where the timebase was always 16 or 20 MHz, it never did anything about lower clock speeds, so it would give PWM at 250-300 Hz at 4-5 MHz, and 61 Hz at 1 MHz system clock...
+
+Until 1.3.7, TCD0 PWM was half of the stated frequency at 4 MHz and 8 MHz, and 8x the stated frequency at 20 MHz and 40 MHz. Both issues were due to the wrong number of 0s after a F_CPU value. TCD0 at speeds above 32 MHz other than 40 MHz used the same settings as 32 MHz. 1.3.7 is also the first version where, whenever an external clock source is used, TCD0 is clocked from OSCHF with the oscillator set to 8 MHz.
+
+## Millis/Micros Timekeeping
+DxCore allows any of the type A or B timers to be selected as the clock source for timekeeping via the standard millis timekeeping functions. The RTC timers will be added after the sleep/low power library for this and tinyAVR 0/1-series is completed. There are no plans to support the type D timer - this is not like tinyAVR, where we are desperately short of timers and the comparatively difficult to use type D timer is an irresistible victim to palm off the task of millis timekeeping on. Here, the calculations would also be more complicated, since there are a great many possible speeds it could be running at, as opposed to just 16 or 20 MHz on the tinyAVR 0/1-series. The timer used and system clock speed will affect the resolution of `millis()` and `micros()`, the time spent in the millis ISR, and the time it takes for `micros()` to return a value. The `micros()` function will typically take several times its resolution to return, and the time returned corresponds to the time `micros()` was called, regardless of how long it takes to return.
+
+A table is presented for each type of timer comparing the percentage of CPU time spent in the ISR, the resolution of the timekeeping functions, and the execution time of micros. Typically micros() can have one of three execution times, the shortest one being overwhelmingly more common, and the differences between them are small.
+
+
+### TCAn for millis timekeeping
+When TCA0 is used as the millis timekeeping source, it is set to run at the system clock prescaled by 8 when system clock is 1 MHz, 16 when system clock is 4 MHz or 5 MHz, and 64 for faster clock speeds, with a period of 255 (as with PWM). This provides a millis() resolution of 1-2 ms; between 16 and 30 MHz it is effectively no coarser than 1 ms, while micros() resolution remains at 4 us or less. At 32 MHz or higher, to continue generating PWM output within the target range, we are forced to switch to a larger prescaler (by a factor of 4), so the resolution becomes coarser by the same factor, and the ISR is called that much less often.
+
+#### TCA timekeeping resolution
+|   CLK_PER | millis() | micros() | % in ISR | micros() time |
+|-----------|----------|----------|----------|---------------|
+|    48 MHz |  1.36 ms |   5.3 us |   0.19 % |        2.5 us |
+|    44 MHz |  1.48 ms |   5.8 us |   0.19 % |               |
+|    40 MHz |  1.63 ms |   6.4 us |   0.19 % |        3.5 us |
+|    36 MHz |  1.81 ms |   7.1 us |   0.19 % |          4 us |
+|    32 MHz |  2.04 ms |   8.0 us |   0.19 % |          4 us |
+|    30 MHz |  0.54 ms |   2.1 us |   0.72 % |               |
+|    28 MHz |  0.58 ms |   2.3 us |   0.72 % |          4 us |
+|    25 MHz |  0.65 ms |   2.6 us |   0.72 % |          4 us |
+|    24 MHz |  0.68 ms |   2.7 us |   0.72 % |          5 us |
+|    20 MHz |  0.82 ms |   3.2 us |   0.72 % |          7 us |
+|    16 MHz |  1.02 ms |   4.0 us |   0.72 % |          9 us |
+|    12 MHz |  1.36 ms |   5.3 us |   0.72 % |         10 us |
+|    10 MHz |  1.63 ms |   6.4 us |   0.72 % |         14 us |
+|     8 MHz |  2.04 ms |   8.0 us |   0.72 % |         17 us |
+|     5 MHz |  0.82 ms |   3.2 us |   2.99 % |         27 us |
+|     4 MHz |  1.02 ms |   4.0 us |   2.99 % |         33 us |
+|     1 MHz |  2.04 ms |   8.0 us |   5.98 % |        112 us |
+
+In contrast to the type B timer, where the prescaler is held constant while the period changes, here the period (in ticks) is constant but the prescaler is not. Hence each prescaler option is associated with a fixed % of time spent in the ISR (and yes, for reasons I don't understand, the generated ISR code is slightly faster for /64 prescaling compared to /256, /16, and /8, which are equal to each other).
+The micros() execution time, counted in clock cycles, does not depend strongly on F_CPU. Except where the millis() resolution is at its coarsest (around 2 ms), more time is spent in the ISR with a TCA than with a TCB. Notice that at the points that - barely - favor TCAn, the TCB interrupt it is being compared to is firing twice as frequently!
+
+
+### TCBn for millis timekeeping
+When TCB2 (or other type B timer) is used for millis() timekeeping, it is set to run at the system clock prescaled by 2 (1 at 1 MHz system clock) and tick over every millisecond. This makes the millis ISR very fast, and provides 1 ms resolution at all speeds for millis. The micros() function also has 1 us resolution at all clock speeds (though there are small deterministic distortions due to the performance shortcuts used for the microsecond calculations). The type B timer is an ideal timer for millis - as these parts have plenty of them, TCB2 is used by default on all parts except DD-series devices with only 2 type B timers, which use TCB1 instead. Except on those smaller DD-series parts, there is rarely competition for type B timers for other purposes, like tone(), servo, input capture or outputting pulses of a controlled length, which is a relatively common procedure; it is anticipated that as libraries for IR, 433MHz OOK'ed remote control, and similar add support for the modern AVR parts, these timers will see even more use.
+
+
+|   CLK_PER | millis() | micros() | % in ISR | micros() time |
+|-----------|----------|----------|----------|---------------|
+|    48 MHz |     1 ms |     1 us |   0.14 % |          3 us |
+|    44 MHz |     1 ms |     1 us |   0.15 % |               |
+|    40 MHz |     1 ms |     1 us |   0.17 % |          4 us |
+|    36 MHz |     1 ms |     1 us |   0.18 % |          3 us |
+|    32 MHz |     1 ms |     1 us |   0.20 % |          3 us |
+|    30 MHz |     1 ms |     1 us |   0.22 % |               |
+|    28 MHz |     1 ms |     1 us |   0.23 % |               |
+|    25 MHz |     1 ms |     1 us |   0.26 % |          6 us |
+|    24 MHz |     1 ms |     1 us |   0.27 % |          7 us |
+|    20 MHz |     1 ms |     1 us |   0.33 % |          7 us |
+|    16 MHz |     1 ms |     1 us |   0.40 % |          6 us |
+|    12 MHz |     1 ms |     1 us |   0.54 % |         12 us |
+|    10 MHz |     1 ms |     1 us |   0.65 % |         13 us |
+|     8 MHz |     1 ms |     1 us |   0.80 % |         11 us |
+|     5 MHz |     1 ms |     1 us |   1.30 % |         25 us |
+|     4 MHz |     1 ms |     1 us |   1.60 % |         21 us |
+|     1 MHz |     1 ms |     1 us |   6.50 % |         78 us |
+Resolution is always exactly 1 ms for millis, and whereas TCAn micros() is limited by the resolution of the timer, here it's instead limited only by the resolution of the value we are returning. The timer count and the running tally of overflows could give us a time resolution limited only by the F_CPU/2 timer clock.
+The percentage of time spent in the ISR varies in inverse proportion to the clock speed - the ISR simply increments a counter and clears its flags. Implemented in assembly, the ISR could be pushed under 50 clocks from the interrupt flag becoming set to the interrupted code resuming, but even as is, it runs in about half the time that the TCA ISR does. The time that micros takes to return a value varies significantly with F_CPU. Specifically, powers of 2 are most favorable, and the special case of 1 MHz particularly so, as literally all calculations drop out except for multiplying the overflow count by 1000 (which takes more time than the rest of micros combined) and adding the result to the current timer count.
+
+### TCD0 for millis timekeeping
+TCD0 is not supported for millis timekeeping on these parts. Originally it was imagined that the implementation from megaTinyCore could simply be used - but there the main clock was prescaled from 16 or 20 MHz, and TCD0 ran from the unprescaled oscillator, giving 2 channels of normal speed PWM and a predictable timebase for millis even when clocked at 1 MHz. Here, the value proposition isn't as strong: there are more timers available, and the type D timer is readily put to use generating high frequency PWM with the PLL and higher maximum system clock speeds (for example, in motor control applications, where the PWM must be outside of the range of human hearing for quieter operation). Note that such applications most certainly require a MOSFET gate driver: 50 kHz is an order of magnitude above the highest plausible frequency of PWM without one.
+
+## Tone
 The tone() function included with DxCore uses one Type B timer. It defaults to using TCB0; do not use that for millis timekeeping if using tone(). Tone is not compatible with any sketch that needs to take over TCB0. If possible, use a different timer for your other needs. When used with Tone, it will use CLK_PER or CLK_PER/2 as it's clock source - the TCA clock will never be used, so it does not care if you change the TCA0 prescaler (unlike the official megaAVR core).
 
 Tone works the same was as the normal tone() function on official Arduino boards. Unlike the official megaAVR board package's tone function, it can be used to generate arbitrarily low frequency tones (as low as 1 Hz). If the period between required toggling's of the pin is greater than the maximum timer period possible, it will calculate how many cycles it has to wait through between switching the pins in order to achieve the desired frequency.
 
 It can only generate a tone on one pin at a time.
 
-All tone generation is done via interrupts. The hardware output compare functionality is not used for generating tones.
+All tone generation is done via interrupts. The hardware output compare functionality is not used for generating tones because in PWM mode, the type B timers kind of suck.
 
-### Servo Library
-The Servo library included with this core uses one Type B timer. It defaults to using TCB1 if available, unless that timer is selected for Millis timekeeping. Otherwise, it will use TCB0. The Servo library is not compatible with any sketch that needs to take over these timers - if possible, use a different timer for your other needs. Servo and tone() can only be used together on parts with both TCB0 and TCB1.
+## Servo Library
+The Servo library included with this core uses one Type B timer. It defaults to using TCB1 if available, unless that timer is selected for millis timekeeping. Otherwise, it will use TCB0. The Servo library is not compatible with any sketch that needs to take over these timers - if possible, use a different timer for your other needs. Servo and tone() can only be used together when neither TCB0 nor TCB1 is used for millis timekeeping.
 
 Regardless of which type B timer it uses, Servo configures that timer in Periodic Interrupt mode (`CNTMODE`=0) mode with CLK_PER/2 or CLK_PER as the clock source, so there is no dependence on the TCA prescaler. The timer's interrupt vector is used, and it's period is constantly adjusted as needed to generate the requested pulse lengths. In 1.1.9 and later, CLK_PER is used if the system clock is below 10MHz to generate smoother output and improve performance at low clock speeds.
 
-The above also applies to the Servo_megaTinyCore library; it is an exact copy except for the name. If you have installed a version of Servo via Library Manager or by manually placing it in your sketchbook/libraries folder, the IDE will use that in preference to the one supplied with this core. Unfortunately, that version is not compatible with the tinyAVR parts. Include Servo_megaTinyCore.h instead in this case. No changes to your code are needed other than the name of the library you include.
+The above also applies to the Servo_DxCore library; it is an exact copy except for the name. If you have installed a version of Servo via Library Manager or by manually placing it in your sketchbook/libraries folder, the IDE will use that in preference to the one supplied with this core. Unfortunately, that version is not compatible with the Dx-series parts. Include Servo_DxCore.h instead in this case. No changes to your code are needed other than the name of the library you include.
diff --git a/megaavr/extras/Performance.md b/megaavr/extras/Performance.md
deleted file mode 100644
index 1dace2cf..00000000
--- a/megaavr/extras/Performance.md
+++ /dev/null
@@ -1,24 +0,0 @@
-# Performance of common API functions
-
-Speed  | digitalWrite(OUTPUT)  | digitalWrite(INPUT)   | digitalRead()   | analogRead()  | pinMode()
-------------   | ------------  | ------------  | ------------  | ------------  | ------------
-Clock Cycles   | 130   | 153   | 108.5   | N/A   | 52.5
-20 MHz   | 6.50  | 7.65  | 5.43  | 56.01   | 2.63
-16 MHz   | 8.13  | 9.56  | 6.78  | 70.01   | 3.28
-10 MHz   | 13.00   | 15.30   | 10.85   | 89.35   | 5.25
-8 MHz  | 16.25   | 19.13   | 13.56   | 111.68  | 6.56
-5 MHz  | 26.00   | 30.60   | 21.70   | 91.09   | 10.50
-4 MHz  | 32.50   | 38.25   | 27.13   | 113.87  | 13.13
-1 MHz  | 130.0   | 153.00  | 108.50  | 125.46  | 52.50
-
-## Notes
-
-### digitalWrite() on INPUT vs OUTPUT pins
-On classic AVRs, the PORT register controlled whether the pullups were enabled; That is not the case with the megaAVR parts, but much code is in circulation that assumes this, so the behavior is emulated. However, this imposes a performance penaly on digitalWrite() when called on an input pin.
-
-### pinMode() is average of INPUT and OUTPUT
-Setting a pin OUTPUT is simpler than setting it as INPUT. Result shown is an average.
-
-### analogRead depends on the ADC clock
-analogRead depends in large part on the ADC clock, which is set to 156.25 kHz for 20/10/5 MHz boards, and 125 kHz for other clock speeds. However, it does still require some normal processor time before and after the conversion, as shown by the longer times at slower clock speeds.
-I am not sure why analogRead() is so much faster at 20 MHz either - this may be addressed in a future version.
diff --git a/megaavr/extras/PinInterrupts.md b/megaavr/extras/PinInterrupts.md
deleted file mode 100644
index 71f0511c..00000000
--- a/megaavr/extras/PinInterrupts.md
+++ /dev/null
@@ -1,113 +0,0 @@
-# Pin interrupts
-While the usual attachInterrupt() functionality is provided by megaTinyCore (and works on every pin, with every type of trigger), these will take slightly longer to run, and use more flash, than an equivalent interrupt implemented manually (due to the need to check for all 8 pins - while a manually implemented scheme would know that only the pins configured to generate interrupts need to be checked; that takes both time and flash space). Additionally, there are common use cases (for example, reading rotary encoders, particularly more than one) where each pin being handled separately prevents convenient shortcuts from being taken. For these reasons, it is often desirable or necessary to manually implement a pin interrupt.
-
-## Manually implementing pin interrupts
-The system for interrupts on megaavr parts is different from, and vastly more powerful than that of the classic AVR parts. Unlike classic AVR parts, there are no special interrupt pins (INT0, INT1, etc.) - instead, all pins can be configured to generate an interrupt on change, rising, falling or LOW level. While all pins on a port are handled by the same ISR (like classic AVR PCINT's), the pin that triggered the interrupt is recorded in the INTFLAGS register, making it easy to determine which pin triggered the interrupt.
-
-### Enabling the interrupt
-The pin interrupt control is handled by the PORTx.PINnCTRL register.
-Bits 0-2 control interrupt and sense behavior:
-000 = no interrupt, normal operation
-001 = interrupt on change
-010 = interrupt on rising
-011 = interrupt on falling
-100 = digital input buffer disabled entirely (equivalent of DIDR register on classic AVRs that have it)
-101 = interrupt on LOW level
-
-Bit 3 controls the pullup.
-
-Bit 3 is set when pinMode() is used to set the pin to INPUT_PULLUP. When manually writing the PINnCTRL registers, be sure to either use bitwise operators to preserve this bit, or set it to the correct value.
-
-### The ISR
-Each port has one interrupt vector; their names are:
-
-    PORTA_PORT_vect
-    PORTB_PORT_vect
-    PORTC_PORT_vect
-
-When the interrupt condition occurs, the bit int PORTx.INTFLAGS corresponding to the interrupt will be set. If multiple pins in a port are used for interrupts corresponding to different things, you can use this to determine which pin triggered the interrupt. **YOU MUST CLEAR THIS BIT WITHIN THE ISR** - the interrupt will continue to be generated as long as the flag is set, so if you do not unset it, the ISR will run continuously after it was triggered once. To clear the bit, write a 1 to it; writing 255 to it will clear all of them.
-
-### A basic example for the x16/x06
-
-```cpp
-unsigned long previousMillis;
-byte ledState;
-volatile byte interrupt1;
-volatile byte interrupt2;
-volatile byte interrupt3;
-
-void setup() {
-  pinMode(LED_BUILTIN, OUTPUT);
-  pinMode(PIN_PC2,INPUT_PULLUP); //PC2
-  pinMode(PIN_PA1,INPUT_PULLUP); //PA1
-  pinMode(PIN_PA2,INPUT_PULLUP); //PA2
-  pinMode(PIN_PB0,INPUT_PULLUP); //PB0
-  PORTC.PIN2CTRL=0b00001101; //PULLUPEN=1, ISC=5 trigger low level
-  PORTA.PIN1CTRL=0b00001010; //PULLUPEN=1, ISC=2 trigger rising
-  PORTA.PIN2CTRL|=0x02; //ISC=2 trigger rising - uses |= so current value of
-  PORTB.PIN0CTRL=0b00001001; //PULLUPEN=1, ISC=1 trigger both
-  Serial.begin(115200);
-  delay(10);
-  Serial.println("Startup");
-
-}
-
-void loop() {
-  if (interrupt1){
-    interrupt1=0;
-    Serial.println("I1 fired");
-  }
-  if (interrupt2){
-    interrupt2=0;
-    Serial.println("I2 fired");
-  }
-  if (interrupt3){
-    interrupt3=0;
-    Serial.println("I3 fired");
-  }
-  //BlinkWithoutDelay, just so you can confirm that the sketch continues to run.
-  unsigned long currentMillis = millis();
-  if (currentMillis - previousMillis >= 1000) {
-    previousMillis = currentMillis;
-    if (ledState == LOW) {
-      ledState = HIGH;
-    } else {
-      ledState = LOW;
-    }
-    digitalWrite(LED_BUILTIN, ledState);
-  }
-}
-
-ISR(PORTA_PORT_vect) {
-  byte flags=PORTA.INTFLAGS;
-  PORTA.INTFLAGS=flags; //clear flags
-  if (flags&0x02) {
-    interrupt1=1;
-  }
-  if (flags&0x04) {
-    interrupt2=1;
-  }
-}
-
-ISR(PORTB_PORT_vect) {
-  PORTB.INTFLAGS=1; //we know only PB0 has an interrupt, so that's the only flag that could be set.
-  interrupt3=1;
-}
-
-ISR(PORTC_PORT_vect) {
-  _PROTECTED_WRITE(RSTCTRL.SWRR,1); //virtual reset
-}
-```
-
-### Synchronous and Asynchronous pins
-Certain pins (pin 2 and 6 in each port) are "fully asynchronous" - These pins have several special properties:
-* They can be triggered by conditions which last less than one processor cycle.
-* They can wake the system from sleep on change, rising, falling or level interrupt. Other pins can only wake on change or level interrupt.
-* There is no "dead time" between successive interrupts (other pins have a 3 clock cycle "dead time" between successive interrupts)
-
-In the example above, note that the interrupts on pin 14 and 15 (PA1 and PA2) are configured identically. However, if the part was put to sleep, only the one on pin PA2 would be able to wake the part, as they trigger on rising edge, and only PA2 is a fully asynchronous pin.
-
-### Interrupt response time
-The datasheet reports the interrupt response time as 6 clock cycles (1 to finish current instruction, 2 to save PC for return, and 3 to execute the jmp to the ISR). That is, like any vendor spec, technically accurate, but paints an overly optimistic picture, because you don't care about when execution reaches the start of the ISR(), you care when it starts executing the code you wrote in the ISR (right?). But before that happens, the prologue has to execute. The prologue is responsible for saving the state of the status register and working registers that your ISR modifies; so that they can be restored in the epilogue after the ISR is done and the function that was interrupted can continue and produce expected results. All ISRs written in C will push r0 and r1 onto the stack, load sreg into r0 and push that, and then clear r1 so it can act as `__zero_reg` as gcc requires. That's 5 more clocks (and was 8 on classic AVRs - push is single cycle now, just like st - push and pop are almost identical to st X+ and ld -X, only using the stack pointer instead of the X register). So you're up to a minimum of 11 cycles - and that only gets you one register, r0. For every byte of local storage you need (local variables, and I think potentially intermediate values, depending on complexity of the code) that's going to cost you a longer prologue. The serial transmit ISR in DxCore needs another 12 bytes, so the prologue takes 17 clocks, and it then calls (3 clocks) another routine, which pushes 2 more registers in it's prologue - so it's 27 clocks from when the tx data register is empty to when the ISR starts running! For every register pushed in the prologue, it needs to be popped back in epilogue at the end of the ISR; pop takes 2 clocks, and the reti instruction itself takes another 4. So at minimum, it's 12 clocks between when the last of the code you wrote in the isr finishes, and whatever it interrupted resumes.
-
-In the above example of the serial TX interrupt, there are the 2 registers of the interior function, and it's ret, then 12 pops to restore the 12 pushes, then the 8 clocks of boiler plate regarding sreg and r0/r1, then reti: 8 + 24 + 8 + 4 = 44 clocks. The total overhead overhead is a whopping 71 clock cycles there! This is actually longer than the interrupt itself, which runs for around 35 clock cycles. At low F_CPU and high baud rate, the sum of these, 136, is *longer than it takes to transmit that character* (the point where this manifests is close to 57600 baud at 1 MHz - proportionally higher at faster clock speeds, and so rarely a problem if not running at unusually low system clock speed) - though there are significant complicating factors, and in some cases the buffer doesn't actually get used at all. It may be possible to improve this (unlike some parts of the Serial library, this has not been rewritten for DxCore) in a future update, but the code is rather intimidating. The RX interrupt is considerably faster and doesn't suffer from this problem (if it did, it would miss characters!)
diff --git a/megaavr/extras/Ref_Interrupts.md b/megaavr/extras/Ref_Interrupts.md
new file mode 100644
index 00000000..b00013e5
--- /dev/null
+++ b/megaavr/extras/Ref_Interrupts.md
@@ -0,0 +1,259 @@
+# Interrupts defined manually
+Use of the advanced functionality of these parts frequently requires the use of interrupts.
+
+## What is an interrupt
+As the name implies, an interrupt is something that can cause the currently running code to stop in its tracks. The location of the instruction it was about to execute is pushed onto the stack, and execution then jumps to a specific "interrupt vector" near the start of the flash. This in turn is a jump (`jmp` or `rjmp`) to the Interrupt Service Routine (ISR). This runs, and then returns to the code that was interrupted through a `reti` instruction. Almost every peripheral can generate at least one interrupt, and most can generate several. See the datasheet for more information on the conditions under which they can be generated.
+
+## Creating an ISR
+There are two ways that you may end up writing an ISR, but most of the same considerations apply to both. The first, which most Arduino users will have seen, is an `attachInterrupt()` function or method - these take a function name as an argument. Somewhere in the core or library is the ISR itself, which checks if you've attached one, and calls it if so. This is simpler - where it's an option - though the performance suffers profoundly, as there's another layer of calls and returns, and a larger minimum number of registers that the ISR will have to save and restore (see the notes below). We recommend avoiding attachInterrupt() type calls when possible. For pin interrupts, even after the 1.3.7 rewrite, `attachInterrupt(pin, isr)` interrupts will take 50-100 clock cycles before the ISR even starts executing, and have a total overhead of around 150 clock cycles. This method is easy, but the slow response time limits its usefulness. This is inherent to any system wherein an arbitrary function pointer is called from an ISR (about half the overhead), and the fact that the API permits a separate function on each pin within a port and has to check for null pointers makes it that much worse.
+
+The other way is directly - it's as if you are declaring a function, but instead of the name, you use the ISR() macro with the vector name as its argument; these can have overhead as low as 21 clock cycles, split approximately evenly between before and after:
+
+```
+ISR(CCL_CCL_vect) {
+  //try to keep this part fast.
+}
+```
+
+That 21 clock overhead figure is a best-case value - it can be worse (the worst case possible is 68 clocks - 26 before and 42 after - for an ISR that uses every call-used working register, which is what the compiler is forced to assume in the attachInterrupt case). This is in addition to any register shuffling overhead that would be imposed for very register-intensive code in a normal function call; well written interrupts are short and sweet and will be closer to 21 than 68.
+
+## Only one definition per vector
+You cannot define the same vector as two different things. This is most often a problem with the default settings for attachInterrupt() - to mimic the standard API, by default, we permit attaching an interrupt to any pin. Even if no pin in a port is attached to, it will still always take over every port interrupt vector. There is a menu option added in 1.3.7 to select between 3 versions of attachInterrupt() - the new version (default), the old one (in case there is a new bug introduced by this), and manual. In manual mode you can restrict it to specific ports, such that it leaves other port vectors unused. You must call attachPortAEnable() (replace A with the letter of the port) before attaching the interrupt. The main point of this is that (in addition to saving an amount of flash that doesn't much matter on the Dx-series) attachInterrupt() on one pin (called by a library, say) will not glom onto every single port's pin interrupt vectors so you can't manually define any. The interrupts are still just as slow (it's inherent to calling a function by pointer from an ISR), and low-numbered pins are faster to start executing than high-numbered ones. The method to enable may change - I had hoped that I could detect which pins were used, but I couldn't get the function that chooses which ports to enable to not count as "referencing" those ports, and hence pull in the ISR. I am not happy with it, but "can't use any pin interrupts except through attachInterrupt() if using a library that uses attachInterrupt()" is significantly worse.
+
+## List of interrupt vector names
+If a list of the names defined for the interrupt vectors is present somewhere in the datasheet, I was never able to find it. These are the possible names for interrupt vectors on the parts supported by DxCore. Not all parts will have all interrupts listed below (interrupts associated with hardware not present on a chip won't exist there). An ISR is created with the `ISR()` macro.
+
+**WARNING** If you misspell the name of a vector, you will get a compiler warning BUT NOT AN ERROR! Hence, you can upload the bad code... in this case the chip will freeze the instant the ISR you thought you assigned is called, as it jumps to BAD_ISR, which in turn jumps to the reset vector... but since the reti instruction is never executed, it still thinks it's in an interrupt. If the recommended procedures in the [reset reference](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_Reset.md) were followed, this will be detected through the absence of a new reset flag, triggering a software reset; otherwise, the code will start running with interrupts "enabled" but not actually occurring, which usually but not always results in a hang or bootloop, but could potentially do something more subtly wrong. Encountering this (and the annoying lack of a nice list anywhere outside of the io.h) was the impetus for creating this list.
+
+| Vector Name         | DA | DB | DD | Cleared By       | Notes                                               | Used by              |
+|---------------------|----|----|----|------------------|-----------------------------------------------------|----------------------|
+| `AC0_AC_vect`       | XX | XX | XX | Manually         |                                                     | Comparator.h library |
+| `AC1_AC_vect`       | XX | XX |    | Manually         |                                                     | Comparator.h library |
+| `AC2_AC_vect`       | XX | XX |    | Manually         |                                                     | Comparator.h library |
+| `ADC0_RESRDY_vect`  | XX | XX | XX | Read result reg  |                                                     |                      |
+| `ADC0_WCMP_vect`    | XX | XX | XX | Manually         | ADC Window Comparator interrupt.                    |                      |
+| `BOD_VLM_vect`      | XX | XX | XX | Manually(?)      |                                                     |                      |
+| `CCL_CCL_vect`      | XX | XX | XX | Manually         | Check flags to see which triggered, like with PORT  | Logic.h library      |
+| `CLKCTRL_CFD_vect`  |    | XX | XX | Manually         | Called when ext. clock fails, used by core for blink| For ext. clock/xtal  |
+| `MVIO_MVIO_vect`    |    | XX | XX | Manually         | Called when MVIO enables or disables (due to vDDIO2)|                      |
+| `NMI_vect`          | XX | XX | XX | Reset            | Can only be triggered by CRC failure.               |                      |
+| `NVMCTRL_EE_vect`   | XX | XX | XX | Write(?)         | Unclear if can clear, or is like DRE on USARTs      |                      |
+| `PORTA_PORT_vect`   | XX | XX | XX | Manually         |                                                     | attachInterrupt()    |
+| `PORTB_PORT_vect`   |  X |  X |    | Manually         |                                                     | attachInterrupt()    |
+| `PORTC_PORT_vect`   | XX | XX | XX | Manually         |                                                     | attachInterrupt()    |
+| `PORTD_PORT_vect`   | XX | XX | XX | Manually         |                                                     | attachInterrupt()    |
+| `PORTE_PORT_vect`   |  X |  X |    | Manually         |                                                     | attachInterrupt()    |
+| `PORTF_PORT_vect`   | XX | XX | XX | Manually         |                                                     | attachInterrupt()    |
+| `PORTG_PORT_vect`   |  X |  X |    | Manually         |                                                     | attachInterrupt()    |
+| `PTC_PTC_vect`      | XX |    |    | Handled by QTouch| All aspects of PTC only handled by QTouch library   |                      |
+| `RTC_CNT_vect`      | XX | XX | XX | Manually         | Two possible flags, CNT and OVF                     |                      |
+| `RTC_PIT_vect`      | XX | XX | XX | Manually         | Time to first PIT int is random from 0 to period    |                      |
+| `SPI0_INT_vect`     | XX | XX | XX | Depends on mode  | 2 or 4 flags, some autoclear, some don't            |                      |
+| `SPI1_INT_vect`     | XX | XX | XX | Depends on mode  | 2 or 4 flags, some autoclear, some don't            |                      |
+| `TCA0_CMP0_vect`    | XX | XX | XX | Manually         | Split Mode: `TCA0_LCMP0_vect`                       |                      |
+| `TCA0_CMP1_vect`    | XX | XX | XX | Manually         | Split Mode: `TCA0_LCMP1_vect`                       |                      |
+| `TCA0_CMP2_vect`    | XX | XX | XX | Manually         | Split Mode: `TCA0_LCMP2_vect`                       |                      |
+| `TCA0_HUNF_vect`    | XX | XX | XX | Manually         | Split Mode only                                     | If used for millis   |
+| `TCA0_OVF_vect`     | XX | XX | XX | Manually         | Split Mode: `TCA0_LUNF_vect`                        |                      |
+| `TCA1_CMP0_vect`    |  X |  X |    | Manually         | Split Mode: `TCA1_LCMP0_vect`                       |                      |
+| `TCA1_CMP1_vect`    |  X |  X |    | Manually         | Split Mode: `TCA1_LCMP1_vect`                       |                      |
+| `TCA1_CMP2_vect`    |  X |  X |    | Manually         | Split Mode: `TCA1_LCMP2_vect`                       |                      |
+| `TCA1_HUNF_vect`    |  X |  X |    | Manually         | Split Mode only                                     | If used for millis   |
+| `TCA1_OVF_vect`     |  X |  X |    | Manually         | Split Mode: `TCA1_LUNF_vect`                        |                      |
+| `TCB0_INT_vect`     | XX | XX | XX | Depends on mode  | Two flags: CMP on read ccmp in capt mode, OVF Manual| If used for millis   |
+| `TCB1_INT_vect`     | XX | XX | XX | Depends on mode  | Two flags: CMP on read ccmp in capt mode, OVF Manual| If used for millis   |
+| `TCB2_INT_vect`     | XX | XX |  X | Depends on mode  | Two flags: CMP on read ccmp in capt mode, OVF Manual| If used for millis   |
+| `TCB3_INT_vect`     |  X |  X |    | Depends on mode  | Two flags: CMP on read ccmp in capt mode, OVF Manual| If used for millis   |
+| `TCB4_INT_vect`     |  X |  X |    | Depends on mode  | Two flags: CMP on read ccmp in capt mode, OVF Manual| If used for millis   |
+| `TCD0_OVF_vect`     | XX | XX | XX | Manually         |                                                     |                      |
+| `TCD0_TRIG_vect`    | XX | XX | XX | Manually         |                                                     |                      |
+| `TWI0_TWIM_vect`    | XX | XX | XX | Usually Auto     | See datasheet for list of what clears it            | Wire.h library       |
+| `TWI0_TWIS_vect`    | XX | XX | XX | Usually Auto     | See datasheet for list of what clears it            | Wire.h library       |
+| `TWI1_TWIM_vect`    | XX | XX |    | Usually Auto     | See datasheet for list of what clears it            | Wire.h library       |
+| `TWI1_TWIS_vect`    | XX | XX |    | Usually Auto     | See datasheet for list of what clears it            | Wire.h library       |
+| `USART0_DRE_vect`   | XX | XX | XX | Write, not manual| ISR must write data or disable interrupt            | Serial class         |
+| `USART0_RXC_vect`   | XX | XX | XX | RXCIF, on read   | Error flags, if enabled, only clear manually        | Serial class         |
+| `USART0_TXC_vect`   | XX | XX | XX | Manually         | Often polled and not cleared until next write       |                      |
+| `USART1_DRE_vect`   | XX | XX | XX | Write, not manual| ISR must write data or disable interrupt            | Serial class         |
+| `USART1_RXC_vect`   | XX | XX | XX | RXCIF, on read   | Error flags, if enabled, only clear manually        | Serial class         |
+| `USART1_TXC_vect`   | XX | XX | XX | Manually         | Often polled and not cleared until next write       |                      |
+| `USART2_DRE_vect`   | XX | XX |    | Write, not manual| ISR must write data or disable interrupt            | Serial class         |
+| `USART2_RXC_vect`   | XX | XX |    | RXCIF, on read   | Error flags, if enabled, only clear manually        | Serial class         |
+| `USART2_TXC_vect`   | XX | XX |    | Manually         | Often polled and not cleared until next write       |                      |
+| `USART3_DRE_vect`   |  X |  X |    | Write, not manual| ISR must write data or disable interrupt            | Serial class         |
+| `USART3_RXC_vect`   |  X |  X |    | RXCIF, on read   | Error flags, if enabled, only clear manually        | Serial class         |
+| `USART3_TXC_vect`   |  X |  X |    | Manually         | Often polled and not cleared until next write       |                      |
+| `USART4_DRE_vect`   |  X |  X |    | Write, not manual| ISR must write data or disable interrupt            | Serial class         |
+| `USART4_RXC_vect`   |  X |  X |    | RXCIF, on read   | Error flags, if enabled, only clear manually        | Serial class         |
+| `USART4_TXC_vect`   |  X |  X |    | Manually         | Often polled and not cleared until next write       |                      |
+| `USART5_DRE_vect`   |  X |  X |    | Write, not manual| ISR must write data or disable interrupt            | Serial class         |
+| `USART5_RXC_vect`   |  X |  X |    | RXCIF, on read   | Error flags, if enabled, only clear manually        | Serial class         |
+| `USART5_TXC_vect`   |  X |  X |    | Manually         | Often polled and not cleared until next write       |                      |
+| `ZCD0_ZCD_vect`     | XX | XX |    | Manually         |                                                     | ZCD.h library        |
+| `ZCD1_ZCD_vect`     |  X |  X |    | Manually         |                                                     | ZCD.h library        |
+| `ZCD2_ZCD_vect`     |  X |  X |    | Manually         |                                                     | ZCD.h library        |
+| `ZCD3_ZCD_vect`     |    |    | XX | Manually         | DD-series has no ZCD0-2, and instead has ZCD3       | ZCD.h library        |
+
+`XX` indicates available on at least three of the four pincounts that series is available in.
+
+` X` indicates available on only one or two of the four sizes that series is available in (ex: TCA1 is only on 48 and 64 pin DA or DB)
+
+## Why clearing flags is so complicated
+Almost all flags *can* be manually cleared - the ones that can be cleared automatically generally do that to be helpful:
+* When the purpose of the flag is to tell you that something is ready to be read, reading it clears the flag. ADC, serial interfaces, and TCB input capture do that.
+* The TWI interrupts work the same way - you need to read, write, or ack/nack something to respond to the bus event; doing so clears the flag too.
+* Sometimes interrupts like that can have error flags that can trigger them enabled too; those typically have to be manually cleared - by enabling them, you declare an intent to do something about them, so you're responsible for telling the hardware you did it.
+* USART and buffered SPI have a DRE interrupt that can only be cleared by writing more data - otherwise you need to disable the interrupt from within the ISR. The TXC (transfer/transmit complete) flags are frequently polled rather than used to fire interrupts. It's not entirely clear from the datasheet if the EEPROM ready interrupt is like that, or can be cleared manually.
+* The NMI is a very special interrupt; it can be configured to be a normal interrupt *or* a Non-Maskable Interrupt. In NMI mode, the part will sit there running the interrupt instead of almost-working with damaged firmware - which could potentially create a dangerous situation if it was part of a life-safety-critical device, like the controller for an airbag or the antilock brakes in a car. In such cases, corrupted firmware might appear to work fine if the corruption only impacted code-paths related to handling the relevant emergency situations, so the vehicle and hence the operator would not be aware of a problem until one of those situations occurred. No matter what the damaged firmware tries to do, it cannot disable or bypass the NMI. Only loading working firmware and resetting it will clear the NMI. This is of particular relevance in life-safety-critical applications which these parts (but NOT this software package nor Arduino in general) are certified for. Not something likely to be used in Arduino-land.
+
+### Vectors linked to many flags
+There are a few vectors with a lot of flags that can trigger them. For example, each of the PORT interrupts has 8 flags that can trigger it. One hazard with these is that if you have a large number enabled - especially if your ISR is longer than it ought to be - interrupts could fire while the ISR is running. You need to make sure you aren't missing those:
+*note - these depict calling a function from the ISR. That's generally bad unless it will be automatically inlined because it is only referenced here, since it increases overhead due to required register saving and restoration.*
+
+**Wrong**
+```c++
+ISR(PORTA_PORT_vect) {
+  if (VPORTA.INTFLAGS & (1 << 0)) {
+    doSomething();
+  }
+  if (VPORTA.INTFLAGS & (1 << 1)) {
+    doSomethingElse();
+  }
+  VPORTA.INTFLAGS=VPORTA.INTFLAGS; //WRONG - if an interrupt happened after its conditional, it would be missed.
+}
+```
+
+**Less wrong**
+```c++
+ISR(PORTA_PORT_vect) {
+  byte flags=PORTA.INTFLAGS; //Note: slower than VPORT; use VPORTx, not PORTx for INTFLAGS
+  if (flags & (1 << 0)) {
+    doSomething();
+  }
+  if (flags & (1 << 1)) {
+    doSomethingElse();
+  }
+  PORTA.INTFLAGS=flags; // Better... if you care whether one of those conditions happens again, though, you could still miss it.
+}
+```
+
+**Correct**
+```c++
+// Check and clear flags at start of ISR.
+ISR(PORTA_PORT_vect) {
+  byte flags=VPORTA.INTFLAGS;
+  VPORTA.INTFLAGS=flags; // Very common approach
+  if (flags & (1 << 0)) {
+    doSomething();
+  }
+  if (flags & (1 << 1)) {
+    doSomethingElse();
+  }
+}
+```
+**Also correct**
+```c++
+// This could be made into an else-if in order to let other interrupts fire if your ISR is slow, and is
+// likely to be called often with multiple flags set - that case goes particularly well with round-robin
+// interrupt scheduling - but if you're in a situation where you need this, you should be concerned about
+// larger scale problems with your code - you're either generating way too many interrupts, or they are
+// far too slow...
+
+ISR(PORTA_PORT_vect) {
+  if (VPORTA.INTFLAGS & (1 << 0)) {
+    VPORTA.INTFLAGS |= (1 << 0);
+    doSomething();
+  }
+  if (VPORTA.INTFLAGS & (1 << 1)) {
+    VPORTA.INTFLAGS |= (1 << 1);
+    doSomethingElse();
+  }
+}
+```
+
+Note: `if (VPORTx.INTFLAGS & (1 << n))` is a maximally efficient way to test for a bit in a `VPORTx.anything` register or one of the 4 GPIORn/GPIOn/GPR.GPRn registers. Those registers (like many assorted important registers on classic AVRs - and unlike any other registers besides the VPORTs and the GPR/GPIOR/GPIO registers; over the past 6 years, they've been known by every one of those names officially) are in the "Low I/O space", where instructions for atomic bit-level access exist (set, clear, and skip-if-set/cleared).
+
+
+## If you don't need to do anything in the ISR
+For example, if its only purpose is to wake from sleep, `ISR(PERIPHERAL_INT_vect) {;}` is slower and larger than `EMPTY_INTERRUPT(PERIPHERAL_INT_vect);`. The latter will produce an ISR containing only a `reti` instruction. The former will generate the standard prologue and epilogue, for 21-26 clocks of overhead, instead of just the approx. 10 clock minimum overhead from getting to the vector and returning from it.
+
+## If two interrupts have an identical implementation
+You can point two interrupt vectors at the same code, which imposes no flash penalty compared to having only one of them. However, you have ZERO information on which one triggered it in this case; if you need to figure that out, that costs you some time, which you need to weigh against the flash cost.
+
+```c
+ISR(PERIPHERAL0_INT_vect, ISR_ALIASOF(PERIPHERAL1_INT_vect));
+ISR(PERIPHERAL1_INT_vect){
+  /* handle PERIPHERAL0_INT and PERIPHERAL1_INT here */
+}
+```
+
+`ISR_ALIASOF()` can also point a vector at an `EMPTY_INTERRUPT`, which saves 2 bytes of flash compared to declaring each one as an `EMPTY_INTERRUPT` individually.
+
+## Reminders
+* ISRs should run FAST. Minimize the time that the code spends in the ISR. Never use polling loops unless you know that they will only need a couple of passes (an example would be TCD0 ENRDY or CMDRDY, which will never take longer than 16 clocks to clear), and avoid writing to serial. Most interrupts should just set a flag that is checked elsewhere, and do what must be done that instant (ex, read an incoming data byte from a register and store it in a buffer then set a flag or byte indicating at what point in the buffer it's at. Don't process the byte you received and figure out what it's instructing you to do - that should be done outside of the interrupt).
+* delay() must never be used in an ISR. It depends on the timekeeping interrupt firing every millisecond. Depending on the timer used and other details, it will either last forever or expire within 2 ms.
+* If you absolutely must write something to serial, maybe for debugging - can you make it a single character? `Serial.write('*');` is much less bad than `Serial.print("Int1 fired");` (notice the single quotes - use those when you print a single character for faster execution; instead of printing a char array of length 2, the second one being the null terminator, this prints a single byte).
+* Read the datasheet, particularly relating to the relevant INTFLAGS register - make sure you understand when it is and is not cleared automatically, and when you need to clear it. And be sure you do so. If execution appears to slow to a crawl once the ISR fires once, you probably didn't clear the flags, resulting in the ISR executing once for every instruction outside of the ISR that is executed.
+* Any global variable that an interrupt changes, if used outside the ISR, must be declared volatile - otherwise the compiler may optimize away access to it, resulting in code elsewhere not knowing that it was changed by the ISR.
+* Any global variable read by the ISR and written to by code outside of the ISR which is larger than a byte must be written to with interrupts disabled - if the interrupt triggers in the middle of a write, the ISR would see a corrupted value.
+* Any global variable or register subject to a read-modify-write cycle in code outside the ISR and written to within an ISR must have interrupts disabled while the read-modify-write is performed, in order to make the operation atomic.
+* For the three cases immediately above, if a level 1 priority interrupt is set, the same three rules apply to variables used by a level 0 ISR and the level 1 ISR.
+* The core does not use the level 1 interrupt priority option. Thus, an interrupt blocks all other interrupts, like on classic AVRs (though the global interrupt flag in the SREG is not set or cleared). Hence it has the same caveats as other things that disable interrupts (this is one of the major reasons that ISRs need to run fast):
+  * An interrupt that lasts longer than 10/(usart baud rate) can cause received characters to be missed. Technically, you have twice that long because it's double-buffered, minus the time overhead at the start of the interrupt that reads it: the "timer" starts from when the first byte arrives, and the interrupt must fire and reach the point where the RXDATA register is read before the third byte arrives. The ISR is not something from the Museum of Efficient and Well Written ISRs, either.
+  * An interrupt that lasts longer than 0.5-4ms (details depend on the timer used for millis, and at what point in the cycle it was at) will lose time as viewed through millis() and micros().
+  * After 0.5-2 ms with interrupts disabled (potentially as little as 512 system clocks on 1.3.6 and earlier), micros() will jump backwards in time by around 0.5-2 ms. The threshold for this occurring and the size of the jump will depend on the system clock speed, the millis timer, and what point in the cycle it is at when interrupts are disabled.
+
+
+## A bit more on timing
+As described above, execution reaches the ISR within 6 system clock cycles (sometimes 5, now that rjmp is used instead of jmp when possible). Then the interrupt has to take special measures to save the state of what was interrupted, which it must restore at the end. This is automatically done by AVR-GCC at the beginning and end of an ISR (these are called the prologue and epilogue); the compiler must assume that every working register, plus the SREG, contains something that must be saved. These get saved by `push`ing them onto the stack. All of the needed registers are freed up this way before your code starts to actually execute (the more variables your ISR needs at a time, the more the prologue needs to `push` onto the stack, and the longer the delay before your code begins executing). At least on the modern AVRs, `push` is only 1 clock cycle (in the past, it was two). Of course, saving the state is only half of the job - after running the ISR that you wrote, the epilogue has to `pop` all those registers off the stack, which takes two clocks apiece. So while the datasheet loves to talk about 6 clocks to enter the interrupt routine, the bare minimum the compiler will produce is 10: 5 to get there (assuming the compiler was able to use an rjmp), then push r1 and r0, load SREG into one of those, push that onto the stack, and zero out r1 (gcc needs a known zero register). So 10 clocks to enter. Then at the end: pop the SREG value and store it, pop r0 and r1, then 4 clocks for the reti, for 11 on the way out... plus the overhead of saving and restoring any registers your code uses. In DxCore or megaTinyCore, when a TCB is used for millis, all the ISR does is load the millisecond count, add 1 to it, and save it (that's all it does!).
+This requires 17 clocks in the prologue, 12 more to load the current millis tally, 4 to increment it, 8 more to save it, and 3 to clear the interrupt flag; then the 4 registers we pushed need to be restored, taking 8 clocks plus the 11 standard ones. That ISR winds up taking 17 + 27 + 19 = 63 clock cycles for a task that would take half that long outside of an ISR. (Note that a dedicated individual could implement that in assembly to save 2-3 clocks on the prologue, 4 in the body, and 4-6 in the epilogue - but that's beside the point; you can always look at generated assembly and find missed opportunities, and the compiler misses more opportunities than usual in ISRs.) 128k parts add 2 clocks to the prologue and 3 to the epilogue because they also have to save and restore RAMPZ (even if it isn't changed by the ISR). This just makes it that much more important to be careful about ISRs, since execution time is amplified by the added constraints on ISRs.
+One of the worst things is calling a function that won't end up being inlined or can't be optimized - like the "attachInterrupt" functions - in this case, the prologue + jump is minimum 24 clocks, and the epilogue 39, as it must assume the function uses all of the "call used" registers and save and restore them all.
+
+### ISRs benefit the most from using the GPRs
+If you're desperate for speed - or space - and if all you are doing is setting a flag, you can use one of the general purpose registers: GPR.GPR0/1/2/3. The only place the core uses any of those is when using a bootloader, where the reset cause is stashed in `GPR.GPR0` before the reset flags are cleared and the sketch is run (you can clear it in setup: `GPR.GPR0 = 0`). To get the full benefit, use single-bit operations - they're in the low I/O space. So something like `GPR.GPR1 |= (1 << n)`, where n is known at compile time, is a single clock operation which consumes no registers - it gets turned into an `sbi` (set bit in I/O register), with the register and bit being encoded by the opcode itself. The same goes for `GPR.GPR1 &= ~(1 << n)` - these are also atomic (an interrupt couldn't interrupt them like it could a read-modify-write). There are analogous instructions that make things like `if (GPR.GPR1 & (1 << n))` and `if (!(GPR.GPR1 & (1 << n)))` lightning fast tests. These registers are only magic when manipulating a single bit, and when the bit and GPR are known at compile time: `GPR.GPR1 |= 3` is a 3 clock non-atomic read-modify-write operation which needs a working register to store the intermediate value in while modifying it, which is just slightly faster than `MyGlobalByte |= 3`, which is a 6-clock non-atomic read-modify-write using 1 working register for the intermediate. `GPR.GPR1 |= 1; GPR.GPR1 |= 2;` is 2 clocks, each of which is an atomic operation which does not require a register to store any intermediate values. Note that atomicity is only a concern for code running outside the ISR, or code within a level 0 priority ISR when a level 1 priority ISR uses the same variable or hardware register.
+
+## Naked ISRs
+An advanced technique. This requires that either your ISR be written entirely in assembly, with your own prologue and epilogue hand optimized for this use case, or that you know for a fact that the tiny piece of C code you use doesn't change SREG or use any working registers.
+In a naked ISR, all the compiler does for you is tell the linker that this code is where it should point the vector. When writing a naked ISR, you are responsible for everything: prologue, epilogue, and the `reti` at the end. Don't forget the `reti`; without that you'll get crazy (and severely broken) behavior.
+
+```c
+/* OK - results in an sbi that neither changes SREG nor uses a register*/
+ISR(PERIPHERAL_INT_vect, ISR_NAKED)
+{
+  GPR.GPR1 |= (1 << 0);
+  reti();
+}
+/* OK - digitalWriteFast() where both arguments are constant maps directly to sbi*/
+ISR(PERIPHERAL_INT_vect, ISR_NAKED)
+{
+  digitalWriteFast(PIN_PB4,HIGH);
+  reti();
+}
+/* NO! Setting multiple bits requires a register */
+ISR(PERIPHERAL_INT_vect, ISR_NAKED)
+{
+  GPR.GPR1 |= 3;
+  reti();
+}
+/* NO! This is a read-modify-write, AND the addition changes SREG. */
+ISR(PERIPHERAL_INT_vect, ISR_NAKED)
+{
+  GPR.GPR1++;
+  reti();
+}
+/* NO! You must reti(), or it will continue on and execute whatever happens to be after it in the flash (potentially anything) */
+ISR(PERIPHERAL_INT_vect, ISR_NAKED)
+{
+  GPR.GPR1 |= (1 << 0);
+}
+/* OK - turns into sbic (Skip-next-instruction-if-Bit-in-I/O-is-Clear), sbi  */
+ISR(PERIPHERAL_INT_vect, ISR_NAKED)
+{
+  if (GPR.GPR0 & (1 << 0))
+    GPR.GPR1 |= (1 << 0);
+  reti();
+}
+/* Anything more complicated written in C - even when it can be safe - should be verified by reading the assembly listing.
+ * The compiler does not "know" that it has to avoid using any working registers or changing the SREG.
+ * ISR_NAKED is really meant to run hand-optimized assembly, rather than relying on the compiler not doing anything dumb
+ */
+```
diff --git a/megaavr/extras/Ref_Timers.md b/megaavr/extras/Ref_Timers.md
new file mode 100644
index 00000000..045600db
--- /dev/null
+++ b/megaavr/extras/Ref_Timers.md
@@ -0,0 +1,209 @@
+# PWM and Timer usage
+This document describes how the timers are configured by the core prior to the sketch starting and/or by the built-in peripherals, and how this may impact users who wish to take full control of these peripherals. This document, besides the background section, applies only to DxCore, though much of it is very similar to megaTinyCore. The corresponding document for megaTinyCore is more accurate for that core.
+
+## Background: The Timers on Dx-series parts
+This applies to the DA, DB, and in overwhelming likelihood, the DD-series as well. These timers are, with very few changes, the same "modern" timers introduced on the tinyAVR 0/1-series, and featured on the megaAVR 0-series (including the ATmega4809 on the Nano Every) and tinyAVR 2-series. The megaAVR 0-series parts are all supported by [Hans/MCUdude](https://github.com/MCUdude)'s excellent [MegaCoreX](https://github.com/MCUdude/MegaCoreX).
+
+
+### TCA0 - Type A 16-bit Timer with 3/6 PWM channels
+This timer is the crown jewel of the modern AVR devices, as far as timers go. It can be operated in two very different modes. The default mode on startup is "Normal" or `SINGLE` mode - it acts as a single 16-bit timer with 3 output compare channels. It can count up or down, can be used as an event counter (ie, effectively "clocked" off the event), can generate PWM in single and dual slope modes, and has a 7-output prescaler. For most use cases, a TCA in SINGLE mode is on the same level as the classic AVR 16-bit timers, only with more outputs (especially for 8-bit PWM) - the newly added features aren't ones that are particularly relevant for Arduino users. In this mode, TCA0 can generate events or interrupts on compare match for each channel (independently), as well as on an overflow.
+
+The Type A timer can also be set to split mode to get six 8-bit PWM channels (this is how it is configured for analogWrite() PWM in megaTinyCore). In split mode the high and low bytes of the timer count `TCA0.SINGLE.CNT` register become `TCA0.SPLIT.HCNT` and `TCA0.SPLIT.LCNT`; likewise the period and compare registers. *In SINGLE mode, these are 16-bit registers; accessing them uses the temporary register. In SPLIT mode, they are 8-bit registers!* The count frequency of the two "halves" of the timer is always the same. However, the HPER and LPER registers can be used to adjust the period (that is, the periods of the high and low halves can be adjusted independently), and hence the frequency of the PWM - albeit at a cost in resolution. Assuming 20 MHz and prescaler 64 (the default configuration), one could be generating 1.225 kHz PWM with a period of 255 (LPER=254 - the default) on three channels, and PWM with a frequency of 2 kHz with a period of 156 (HPER=155) on the other three channels.
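+
+The arithmetic behind those two figures can be checked with a short host-side calculation. This is only a sketch: `pwm_frequency_hz()` is a hypothetical helper, not something DxCore provides, and it assumes single-slope operation where one PWM cycle lasts `PER + 1` timer ticks.
+
+```c
+#include <stdint.h>
+
+/* Hypothetical helper (not a DxCore API): single-slope PWM frequency in Hz.
+ * One PWM cycle is (per + 1) timer ticks; each tick is `prescaler` clocks. */
+static double pwm_frequency_hz(uint32_t f_clk_hz, uint16_t prescaler, uint16_t per) {
+  return (double)f_clk_hz / ((double)prescaler * ((double)per + 1.0));
+}
+```
+
+With a 20 MHz clock and prescaler 64, `pwm_frequency_hz(20000000UL, 64, 254)` comes out near 1225 Hz and `pwm_frequency_hz(20000000UL, 64, 155)` near 2003 Hz, in line with the figures quoted above.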
+
+In megaTinyCore 2.2.6 and earlier and DxCore 1.3.0 and earlier, all digitalWrite() calls to pins that are output to in split mode will result in PWM being turned off on whichever pin normally corresponds to that channel (that is, if it has been remapped to a different pin, it will get turned off there), and no mechanism was provided to disable that. megaTinyCore 2.2.7 and DxCore 1.3.1 provide takeOverTCA0() and takeOverTCA1() (DxCore only). Calling those functions will tell analogWrite()/digitalWrite()/etc functions to "forget about" that timer. PWM will not be altered, turned on, or turned off by functions supplied by the core.
+
+There are a few examples of using TCA0 to generate PWM at specific frequencies and duty cycles in the document on [Taking over TCA0](TakingOverTCA0.md).
+
+### TCBn - Type B 16-bit Timer
+The type B timer is what I would describe as a "utility timer". It is also the only timer which got a significant upgrade with the Dx-series: it received a new event user, `TCBn_COUNT`, and a new `TCBn.CTRLA` register layout with an option to clock on events, rather than just from the system clock or a TCA's prescaled clock. This is a pretty big deal for the type B timers. Unlike the earlier 0/1-series parts, they now have both `CAPT` and `OVF` interrupt flags (though these both call the same ISR); the behavior is somewhat muddled to retain compatibility with code written for the older timers. The input clock source can be either the system clock, optionally prescaled by 2, or whatever the prescaled clock of TCA0 (or TCA1 if present) is.
+
+They can be set to act as an 8-bit PWM source. When used for PWM, they can only generate 8-bit PWM, despite being 16-bit timers, because the 16-bit `TCBn.CCMP` register is used for both the period and the compare value, in the low and high bytes respectively. They always operate in single-slope mode, counting upwards, and the frequency depends on that of the TCAn (since CLK_PER/2 is far too fast for 8-bit PWM). In other words, **the type B timers are not very good at generating PWM**. Note also that `TCBn.CCMP` is affected by silicon errata: It still acts like a 16-bit register, using the temp register for access, so you must read the low byte first, then high byte, and always write the high byte after the low one, lest it not be written or a bad value written over the low byte!
+
+While this makes them poor output generators, they are excellent utility timers, which is what they are clearly designed for. They can be used to time the duration of events down to single system clock cycles in the input capture modes, and with the event being timed coming from the event system, any pin can be used as the source for the input capture, as well as the analog comparators, the CCL modules, and more. As input capture timers, they are far more powerful than the 16-bit timers of the classic AVR parts. They can also be used as high resolution timers independent of the builtin millis()/micros() timekeeping system if this is needed for specific applications (in some cases during development, millis/micros and a TCB were compared in order to detect errors in the timekeeping). The Dx-series adds support for two exciting new options - first, they can be clocked from events (those single cycle events became a lot more useful) - and secondly, you can use that to cascade two timers together, in order to do 32-bit input capture. 32 bits gives you a maximum count of about 4.29 billion; with CLK_PER as the clock source, events with durations of several minutes can be timed to an accuracy of single clock cycles.
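+
+To put a number on "several minutes", the longest interval a counter can time before overflowing is simply its count range divided by the clock frequency. The helper below is a hypothetical host-side illustration (the name `max_capture_seconds()` is made up for this sketch), assuming the counter is clocked directly from CLK_PER.
+
+```c
+#include <stdint.h>
+
+/* Hypothetical helper: longest interval, in seconds, that a counter of
+ * `counter_bits` bits can time before overflowing when clocked at f_clk_hz. */
+static double max_capture_seconds(uint8_t counter_bits, uint32_t f_clk_hz) {
+  double counts = 1.0;
+  for (uint8_t i = 0; i < counter_bits; i++) {
+    counts *= 2.0;  /* build 2^counter_bits without integer overflow */
+  }
+  return counts / (double)f_clk_hz;
+}
+```
+
+At 24 MHz, a single 16-bit TCB covers about 2.7 ms, while two cascaded TCBs (32 bits) cover about 179 seconds, roughly three minutes.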
+
+### TCD0 - Type D 12-bit Async Timer
+The Type D timer is a very strange timer indeed. It can run from a totally separate clock supplied on EXTCLK, or from the unprescaled internal oscillator - or, on the Dx-series, from the on-chip PLL at 2 or 3 times the speed of the external clock or internal oscillator! It was apparently designed with a particular eye towards motor control and SMPS control applications. This makes it very nice for those sorts of use cases, but in a variety of ways, these design choices get in the way of using it for typical Arduino timer purposes. First, none of the control registers can be changed while it is running; it must be briefly stopped, the register changed, and the timer restarted. In addition, the transition between stopping and starting the timer is not instant due to the synchronization process. This is fast (it looks to me to be about 2 x the synchronizer prescaler, which can be 1 to 8, in clock cycles). The same thing applies to reading the value of the counter - you have to request a capture by writing the SCAPTUREx bit of TCD0.CTRLE, and wait a sync-delay for it. Because it can be clocked from the unprescaled internal oscillator, it can keep running at full speed even if the main CPU is running more slowly - though it also has its own prescaler - actually, two of them: a "synchronizer" prescaler, whose output can then be further prescaled for the timer itself. It supports normal PWM (what they call one-ramp mode) and dual slope mode without that much weirdness, beyond the fact that `CMPBCLR` is TOP, rather than it being set by a dedicated register. But the other modes are quite clearly made for driving motors and switching power supplies. Similar to Timer1 on the ATtiny x5 and x61 series parts in the classic AVR product line, this timer can also create programmable dead-time between cycles.
+
+It also has a 'dither' option to allow PWM at a frequency in between frequencies possible by normal division of the clock - a 4-bit value is supplied to the TCD0.DITHER register, and this is added to a 4-bit accumulator at the end of each cycle; when this rolls over, another clock cycle is inserted in the next TCD0 cycle.
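+
+The accumulator behaviour described above can be modelled on a host machine to see how many extra clocks a given dither value inserts. This is an illustrative model only, not register-accurate: `dither_extra_clocks()` is a hypothetical name, and the accumulator is assumed to start at 0.
+
+```c
+#include <stdint.h>
+
+/* Host-side model of the 4-bit dither accumulator (illustration only).
+ * Returns how many extra clock cycles are inserted over `cycles`
+ * consecutive timer cycles for a given 4-bit dither value. */
+static unsigned dither_extra_clocks(uint8_t dither, unsigned cycles) {
+  uint8_t acc = 0;            /* 4-bit accumulator, assumed to start at 0 */
+  unsigned extra = 0;
+  for (unsigned i = 0; i < cycles; i++) {
+    acc = (uint8_t)(acc + (dither & 0x0F));
+    if (acc > 0x0F) {         /* rolled over past 4 bits */
+      acc &= 0x0F;
+      extra++;                /* one extra clock goes into the next cycle */
+    }
+  }
+  return extra;
+}
+```
+
+In this model, a dither value of `d` inserts `d` extra clocks over 16 consecutive cycles, so the average period is lengthened by `d/16` of a clock.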
+
+The asynchronous nature of this timer, however, comes at a great cost: It is much harder to use than the other timers. Most changes to settings require it to be disabled - and you need to wait for that operation to complete (check for the `ENRDY` bit in `TCD0.STATUS`). Similarly, to tell it to apply changes made to the `CMPxSET` and `CMPxCLR` registers, you must use `TCD0.CTRLE` (the "command" register) to instruct it to synchronize the registers. Likewise, to capture the current count, you need to issue a SCAPTUREx command (x is A or B - there are two capture channels) - and then wait for the corresponding bit to be set in the `TCD0.STATUS` register. In the case of turning PWM channels on and off, not only must the timer be stopped, but a timed write sequence is needed, ie, `_PROTECTED_WRITE(TCD0.FAULTCTRL,value)`, to write to the register that controls whether PWM is enabled; this is apparently because, in the intended use-cases of motor and switching power supply control, changing this accidentally (due to a wild pointer or other software bug) could have catastrophic consequences. Writes to any register when it is not "legal" to write to it will be ignored. Thus, making use of the type D timer for even simple tasks requires careful study of the datasheet - which is itself quite terse in places where it really shouldn't be - and can be frustrating and counterintuitive.
+
+
+### RTC - 16-bit Real Time Clock and Programmable Interrupt Timer
+Information on the RTC and PIT will be added in a future update.
+
+## Timer Prescaler Availability
+
+Prescaler    | TCAn  | TCBn  | TCD0  | TCD0 sync | TCD0 counter|
+------------ | ------|-------|-------|-----------|-------------|
+CLK          |  YES  |  YES  |  YES  |  YES      |  YES        |
+CLK/2        |  YES  |  YES  |  YES* |  YES      |  NO         |
+CLK/4        |  YES  |  TCA  |  YES  |  YES      |  YES        |
+CLK/8        |  YES  |  TCA  |  YES  |  YES      |  NO         |
+CLK/16       |  YES  |  TCA  |  YES* |  NO       |  NO         |
+CLK/32       |  NO   |  NO   |  YES  |  NO       |  YES        |
+CLK/64       |  YES  |  TCA  |  YES* |  NO       |  NO         |
+CLK/128      |  NO   |  NO   |  YES* |  NO       |  NO         |
+CLK/256      |  YES  |  TCA  |  YES* |  NO       |  NO         |
+CLK/1024     |  YES  |  TCA  |  NO   |  NO       |  NO         |
+
+`*` Requires using the synchronizer prescaler as well. My understanding is that this results in sync cycles taking longer.
+
+`TCA` indicates that for this prescaler, a TCA must also use it, and then that can be prescaled, and the TCB set to use that TCA's clock.
+
+## Resolution, Frequency and Period
+When working with timers, I constantly found myself calculating periods, resolution, frequency and so on for timers at the common prescaler settings. While that is great for ad hoc calculations, I felt it was worth some time to make a nice looking chart that showed those figures at a glance. The numbers shown are the resolution (when using it for timing), the frequency (at maximum range), and the period (at maximum range - ie, the most time you can measure without accounting for overflows).
+### [In Google Sheets](https://docs.google.com/spreadsheets/d/10Id8DYLRtlp01KA7vvslC3cHaR4S2a1TrH7u6pHXMNY/edit?usp=sharing)
+
+
+## PWM ( analogWrite() )
+### TCAn
+The core reconfigures the type A timers in split mode, so each can generate up to 6 PWM channels simultaneously. The `LPER` and `HPER` registers are set to 254, giving a period of 255 cycles (it starts from 0), thus allowing 255 levels of dimming (though 0, which would be a 0% duty cycle, is not used via analogWrite, since analogWrite(pin,0) calls digitalWrite(pin,LOW) to turn off PWM on that pin). This is used instead of PER=255 because analogWrite(pin,255) in the world of Arduino is 100% on, and sets that via digitalWrite(); if it counted to 255, the Arduino API would provide no way to set the 255/256th duty cycle. Additionally, modifications would be needed to make millis()/micros() timekeeping work without drift at that period anyway.
+The core supports generating PWM using up to 6 channels per timer, and will work with alternate PORTMUX settings as long as the selected option isn't one of the three-channel ones for TCA1 - those are not supported. TCA1 can be on PB0-5 or PG0-5 (and not even the latter on DA due to errata). TCA0 can go on pins 0-5 in any port (though they must all be on the same port). We default to configuring it for PD on 28/32 pin parts and PC on 48/64 pin ones.
+
+analogWrite() checks the PORTMUX.TCAROUTEA register.
+
+### TCD0
+TCD0, by default, is configured for generating PWM (unlike the TCAs, that's about all it can do usefully). TCD0 is clocked from CLK_PER when the system is using the internal clock without prescaling. On the prescaled clocks (5 and 10 MHz) it is run off the unprescaled oscillator (just like on the 0/1-series parts that it inherits the frequencies from), keeping the PWM frequency near the center of the target range. When an external clock is used, we run it from the internal oscillator at 8 MHz, which is right on target.
+
+It is always used in single-ramp mode, with `CMPBCLR` (hence TOP) set to either 254, 509, or 1019 (for 255 tick, 510 tick, or 1020 tick cycles), the sync prescaler set to 1 for fastest synchronization, and the count prescaler to 32 except at 1 MHz. `CMPACLR` is set to 0xFFF (the timer maximum, 4095). The `CMPxSET` registers are controlled by analogWrite(), which subtracts the supplied duty cycle from 255, checks the current `CMPBCLR` high byte to see how many places to left-shift that result by, then subtracts 1 and writes the result to the register. The `SYNCEOC` command is sent to synchronize the compare value registers at the end of the current PWM cycle if the channel is already outputting PWM. If it isn't, we have to briefly disable the timer, turn on the pin, and then reenable it, producing a glitch on the other channel. To mitigate this issue we treat 0 and 255 duty cycles differently for the TCD pins: analogWrite(pin,0) sets the duty cycle to 0% without disconnecting the pin from the timer, and for the 100% duty cycle case we invert the pin (setting CMPxSET to 0 won't produce a constant output). This eliminates the glitches when the channels are enabled or disabled.
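+
+The duty cycle mapping described above can be sketched as follows. This is a reading of the prose, not the core's actual source: `tcd_cmpset_for_duty()` is a hypothetical name, and the sketch assumes `duty` is between 1 and 254, since 0 and 255 are handled separately as described.
+
+```c
+#include <stdint.h>
+
+/* Sketch of the CMPxSET calculation as described in the text (assumption,
+ * not DxCore source). `top` is the CMPBCLR value: 254, 509, or 1019.
+ * `duty` must be 1..254; 0 and 255 take the special-case paths instead. */
+static uint16_t tcd_cmpset_for_duty(uint8_t duty, uint16_t top) {
+  uint8_t high = (uint8_t)(top >> 8);  /* 0 for 254, 1 for 509, 3 for 1019 */
+  uint8_t shift = (high == 0) ? 0 : (high == 1) ? 1 : 2;
+  return (uint16_t)((((uint16_t)(255 - duty)) << shift) - 1);
+}
+```
+
+For example, a duty of 64 with TOP = 254 gives 190, and a duty of 128 with TOP = 509 gives 253, so the output is set later in the cycle as the requested duty cycle gets smaller.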
+
+TCD0 has two output channels - however, each of them can go to either of two pins. PA5 and PA7 use WOB, and PA4 and PA6 use WOA. For example:
+```
+analogWrite(PIN_PA4,64);  // outputting 25% on PA4
+analogWrite(PIN_PA5,128); // 25% on PA4, 50% on PA5
+analogWrite(PIN_PA5,0);   // 25% on PA4, PA5 constant LOW, but *still connected to timer*
+digitalWrite(PIN_PA5,LOW);// NOW PA5 totally disconnected from timer. A glitch will show up briefly on PA4.
+analogWrite(PIN_PA6,192); // This is on same channel as PA4. We connect channel to PA6 too (not in place of - we do the same thing on ATTinyCore for the 167 pwm output from Timer1 on the latest versions).
+                          // so now, both PA4 and PA6 will be outputting a 75% duty cycle. Turn the first pin off with digitalWrite() to explicitly turn off that pin.
+```
+You can get a lot of control over the frequency without having to take over full management of the timer (which is rather complicated, and difficult to reconfigure) as long as you follow the rules carefully. See the [TCD0 reference](https://github.com/SpenceKonde/DxCore/blob/master/megaavr/extras/Ref_TCD.md). Step off that narrow path, however, and analogWrite() will not work correctly.
+### TCBn
+The type B timers, while not particularly good for PWM, can be used for PWM as well; they are set to use the TCA1 clock by default. A type B timer used for millis cannot be used to output PWM.
+
+### PWM Frequencies
+The frequency of PWM output using the settings supplied by the core is shown in the table below. The "target" is 1 kHz, never less than 490 Hz or more than 1.5 kHz. As can be seen below, there are several frequencies where this has proven an unachievable goal. The upper end of that range is the point at which - if PWMing the gate of a MOSFET - you have to start giving thought to the gate charge and switching losses, and may not be able to directly drive the gate of a modern power MOSFET and expect to get acceptable results (ie, MOSFET turns on and off completely in each cycle, there is minimal distortion of the duty cycle, and it spends most of its "on" time with the low resistance quoted in the datasheet, instead of something much higher that would cause it to overheat and fail). Not to say that it **definitely** will work with a given MOSFET under those conditions (see [the PWM section of my MOSFET guide](https://github.com/SpenceKonde/ProductInfo/blob/master/MOSFETs/Guide.md#pwm) ), but the intent was to try to keep the frequency low enough that that use case was viable (nobody wants to be forced into using a gate driver), without compromising the ability of the timers to be useful for timekeeping.
+
+Note that no attention had been paid to these for DxCore prior to the 1.3.0 release, and serious bugs were not discovered until 1.3.7.
+
+|   CLK_PER | Prescale A |   fPWM  | Prescale D  | TOP D |  fPWM (D) |
+|-----------|------------|---------|-------------|-------|-----------|
+| ** 48 MHz |        256 |  735 Hz |             |       |           |
+| ** 44 MHz |        256 |  674 Hz |             |       |           |
+| ** 40 MHz |        256 |  613 Hz |             |       |           |
+| ** 36 MHz |        256 |  551 Hz |             |       |           |
+|  External |            |         | OSCHF@8  32 |   254 |    980 Hz |
+|  * 32 MHz |        256 |  490 Hz |          32 |  1019 |    980 Hz |
+|  * 30 MHz |         64 | 1836 Hz | OSCHF@8  32 |   254 |    980 Hz |
+|  * 28 MHz |         64 | 1716 Hz |          32 |  1019 |    858 Hz |
+|    25 MHz |         64 | 1532 Hz |          32 |  1019 |    766 Hz |
+|    24 MHz |         64 | 1471 Hz |          32 |  1019 |    735 Hz |
+|    20 MHz |         64 | 1225 Hz |          32 |   509 |   1225 Hz |
+|    16 MHz |         64 |  980 Hz |          32 |   509 |    980 Hz |
+|    12 MHz |         64 |  735 Hz |          32 |   509 |    735 Hz |
+|    10 MHz |         64 |  613 Hz | OSCHF@20 32 |   509 |   1225 Hz |
+|     8 MHz |         64 |  490 Hz |          32 |   254 |    980 Hz |
+|     5 MHz |         16 | 1225 Hz | OSCHF@20 32 |   509 |   1225 Hz |
+|     4 MHz |         16 |  980 Hz |          32 |   254 |    490 Hz |
+|     1 MHz |          8 |  490 Hz |           4 |   254 |    980 Hz |
+
+`*` Overclocked (generally works, 28 and 32 can be achieved with internal oscillator)
+
+`**` Way overclocked, may not work (requires external crystal or oscillator).
+
+External clock or crystal will always cause TCD0 to use the internal oscillator by default. Speeds higher than 32 MHz can only use external clock sources, so they always act as described on the External line (unless reconfigured at runtime)
+
+`Prescale A` and `fPWM` apply to all pins not on TCD0. TOP is always set to 254 for TCA.
+
+`Prescale D`, `TOP D`, and `fPWM (D)` apply to the pins on TCD0.
+Where marked, we clock TCD0 from OSCHF instead of using CLK_PER, prescaled by 32. For speeds other than 5 MHz and 10 MHz, we set the internal oscillator to 8 MHz.
+
+These are the overall Timer D prescaler (in all cases, by default only the count prescaler is used), TOP, and resulting frequency of TCD0 PWM output.
+
+Where TCD0 TOP is not 254, but is 509, 1019, or (above 32 MHz only) 2039, all duty cycles passed to analogWrite() for those pins will be left-shifted as necessary to get an appropriate duty cycle.
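+
+The figures in the table follow from the timer clock, the prescaler, and TOP. The block below is a minimal sketch of that arithmetic, not code taken from the core; `pwm_hz`, `tcd_duty_shift`, and `tcd_scaled_duty` are hypothetical helper names, and the shift amounts (1, 2, and 3 bits for TOP of 509, 1019, and 2039) are an assumption inferred from those TOP values being roughly 2, 4, and 8 times 255.
+
+```cpp
+#include <stdint.h>
+
+// f_PWM = timer clock / (prescaler * (TOP + 1)), rounded to the nearest Hz.
+// For TCD0 rows marked OSCHF@n, pass the oscillator speed instead of CLK_PER.
+constexpr uint32_t pwm_hz(uint32_t timer_clock_hz, uint32_t prescaler, uint32_t top) {
+  return (timer_clock_hz + (prescaler * (top + 1)) / 2) / (prescaler * (top + 1));
+}
+
+// Assumed number of bits an 8-bit duty cycle is shifted left for a given TCD0 TOP.
+constexpr uint8_t tcd_duty_shift(uint16_t top) {
+  return (top == 2039) ? 3 : (top == 1019) ? 2 : (top == 509) ? 1 : 0;
+}
+
+// Scale an 8-bit analogWrite() value to the range implied by TOP.
+constexpr uint16_t tcd_scaled_duty(uint8_t duty, uint16_t top) {
+  return (uint16_t)(duty << tcd_duty_shift(top));
+}
+```
+
+For example, `pwm_hz(16000000UL, 64, 254)` evaluates to 980, matching the 16 MHz row, and `tcd_scaled_duty(128, 1019)` evaluates to 512, which is the same 50% duty cycle against the larger TOP.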
+
+#### Previous versions
+Versions 1.3.1-1.3.6 had issues with PWM frequency under some conditions, and with micros readout under others. In 1.3.0 and earlier, there was no coherent list of timer settings based on part and clock speed.
+
+## Millis/Micros Timekeeping
+DxCore allows any of the type A or B timers to be selected as the clock source for timekeeping via the standard millis timekeeping functions. The RTC timers will be added after the sleep/low power library for this and tinyAVR 0/1-series is completed. There are no plans to support the type D timer - this is not like tinyAVR where we are desperately short of timers, with the comparatively difficult to use type D timer an irresistible victim to palm off the task of millis timekeeping on. Here, the calculations are more complicated since there are a great many possible speeds the part could be running at, as opposed to just 16 or 20 MHz on the tinyAVR 0/1-series. The timer used and system clock speed will affect the resolution of `millis()` and `micros()`, the time spent in the millis ISR, and the time it takes for `micros()` to return a value. The `micros()` function will typically take several times its resolution to return, and the time returned corresponds to the time `micros()` was called, regardless of how long it takes to return.
+
+A table is presented for each type of timer comparing the percentage of CPU time spent in the ISR, the resolution of the timekeeping functions, and the execution time of micros. Typically micros() can have one of three execution times, the shortest one being overwhelmingly more common, and the differences between them are small.
+
+
+### TCAn for millis timekeeping
+When TCA0 is used as the millis timekeeping source, it is set to run at the system clock prescaled by 8 when the system clock is 1 MHz, by 16 when the system clock is 4 MHz or 5 MHz, and by 64 for faster clock speeds, with a period of 255 (as with PWM). This provides a millis() resolution of 1-2 ms, which is effectively no coarser than 1 ms between 16 and 30 MHz, while micros() resolution remains at 4 us or less. At 32 MHz or higher, to continue generating PWM output within the target range, we are forced to switch to a larger prescaler (by a factor of 4), so the resolution figures get coarser by a similar factor, and the ISR is called that much less often.
+
+#### TCA timekeeping resolution
+|   CLK_PER | millis() | micros() | % in ISR | micros() time |
+|-----------|----------|----------|----------|---------------|
+|    48 MHz |  1.36 ms |   5.3 us |   0.19 % |        2.5 us |
+|    44 MHz |  1.48 ms |   5.8 us |   0.19 % |               |
+|    40 MHz |  1.63 ms |   6.4 us |   0.19 % |        3.5 us |
+|    36 MHz |  1.81 ms |   7.1 us |   0.19 % |          4 us |
+|    32 MHz |  2.04 ms |   8.0 us |   0.19 % |          4 us |
+|    30 MHz |  0.54 ms |   2.1 us |   0.72 % |               |
+|    28 MHz |  0.58 ms |   2.3 us |   0.72 % |          4 us |
+|    25 MHz |  0.65 ms |   2.6 us |   0.72 % |          4 us |
+|    24 MHz |  0.68 ms |   2.7 us |   0.72 % |          5 us |
+|    20 MHz |  0.82 ms |   3.2 us |   0.72 % |          7 us |
+|    16 MHz |  1.02 ms |   4.0 us |   0.72 % |          9 us |
+|    12 MHz |  1.36 ms |   5.3 us |   0.72 % |         10 us |
+|    10 MHz |  1.63 ms |   6.4 us |   0.72 % |         14 us |
+|     8 MHz |  2.04 ms |   8.0 us |   0.72 % |         17 us |
+|     5 MHz |  0.82 ms |   3.2 us |   2.99 % |         27 us |
+|     4 MHz |  1.02 ms |   4.0 us |   2.99 % |         33 us |
+|     1 MHz |  2.04 ms |   8.0 us |   5.98 % |        112 us |
+
+In contrast to the type B timer, where the prescaler is held constant while the period changes, here the period (in ticks) is constant but the prescaler is not. Hence each prescaler option is associated with a fixed % of time spent in the ISR (and yes, for reasons I don't understand, the generated ISR code is slightly faster for /64 prescaling compared to /256, /16, and /8, which are equal to each other).
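+
+The resolution columns can be reproduced from the prescaler and the fixed 255-tick period. This is a rough sketch of that relationship rather than the core's implementation; the function names are hypothetical, and the prescaler chosen for clock speeds that do not appear in the table (2 MHz, for instance) is an assumption.
+
+```cpp
+#include <stdint.h>
+
+// Prescaler per the text above: 8 at 1 MHz, 16 at 4 or 5 MHz,
+// 256 at 32 MHz and up, otherwise 64. Speeds between those are a guess.
+constexpr uint16_t tca_prescaler(uint32_t f_cpu) {
+  return (f_cpu >= 32000000UL) ? 256
+       : (f_cpu >  5000000UL)  ? 64
+       : (f_cpu >  1000000UL)  ? 16
+       : 8;
+}
+
+// Nanoseconds per timer tick: the micros() resolution.
+constexpr uint32_t tca_tick_ns(uint32_t f_cpu) {
+  return (uint32_t)((uint64_t)tca_prescaler(f_cpu) * 1000000000ULL / f_cpu);
+}
+
+// Microseconds per overflow (255 ticks): the millis() resolution.
+constexpr uint32_t tca_overflow_us(uint32_t f_cpu) {
+  return (uint32_t)((uint64_t)tca_prescaler(f_cpu) * 255ULL * 1000000ULL / f_cpu);
+}
+```
+
+At 16 MHz this gives a 4000 ns tick and a 1020 us overflow, in line with the 4.0 us and 1.02 ms entries in the table.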
+
+Measured in clock cycles, the micros() execution time does not depend strongly on F_CPU, ranging from 112 to 145 clock cycles.
+
+Except when the resolution is near its coarsest, the device spends more time in the ISR with TCAn than with a type B timer on these parts. Notice that at the points that - barely - favor TCAn, the TCBn interrupt they're being compared to is firing twice as frequently!
+
+
+### TCBn for millis timekeeping
+When TCB2 (or another type B timer) is used for millis() timekeeping, it is set to run at the system clock prescaled by 2 (1 at 1 MHz system clock) and tick over every millisecond. This makes the millis ISR very fast, and provides 1 ms resolution at all speeds for millis. The micros() function also has 1 us resolution at all clock speeds (though there are small deterministic distortions due to the performance shortcuts used for the microsecond calculations). The type B timer is an ideal timer for millis - as these parts have plenty of them, TCB2 is used by default on all parts except DD-series devices with only 2 type B timers, which use TCB1 instead. Except on those smaller DD-series parts, there is rarely competition for type B timers for other purposes, like tone(), servo, input capture, or outputting pulses of a controlled length (a relatively common procedure); it is anticipated that as libraries for IR, 433 MHz OOK'ed remote control, and similar add support for the modern AVR parts, these timers will see even more use.
+
+
+|   CLK_PER | millis() | micros() | % in ISR | micros() time |
+|-----------|----------|----------|----------|---------------|
+|    48 MHz |     1 ms |     1 us |   0.14 % |          3 us |
+|    44 MHz |     1 ms |     1 us |   0.15 % |               |
+|    40 MHz |     1 ms |     1 us |   0.17 % |          4 us |
+|    36 MHz |     1 ms |     1 us |   0.18 % |          3 us |
+|    32 MHz |     1 ms |     1 us |   0.20 % |          3 us |
+|    30 MHz |     1 ms |     1 us |   0.22 % |               |
+|    28 MHz |     1 ms |     1 us |   0.23 % |               |
+|    25 MHz |     1 ms |     1 us |   0.26 % |          6 us |
+|    24 MHz |     1 ms |     1 us |   0.27 % |          7 us |
+|    20 MHz |     1 ms |     1 us |   0.33 % |          7 us |
+|    16 MHz |     1 ms |     1 us |   0.40 % |          6 us |
+|    12 MHz |     1 ms |     1 us |   0.54 % |         12 us |
+|    10 MHz |     1 ms |     1 us |   0.65 % |         13 us |
+|     8 MHz |     1 ms |     1 us |   0.80 % |         11 us |
+|     5 MHz |     1 ms |     1 us |   1.30 % |         25 us |
+|     4 MHz |     1 ms |     1 us |   1.60 % |         21 us |
+|     1 MHz |     1 ms |     1 us |   6.50 % |         78 us |
+
+Resolution is always exactly 1 ms for millis, and whereas TCAn micros() is limited by the resolution of the timer, here it's instead limited only by the resolution of the value we are returning. The timer count and the running tally of overflows could give us a time resolution limited only by the F_CPU/2 timer clock.
+
+The percentage of time spent in the ISR varies in inverse proportion to the clock speed - the ISR simply increments a counter and clears its flags. It takes 65 clocks from the interrupt bit being set to the interrupted code resuming.
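+
+The inverse proportionality can be checked with a few lines of arithmetic. This is only an illustrative sketch under the figures stated above (65 clocks per ISR, one ISR per millisecond); `tcb_isr_ppm` is a hypothetical name, and the result is in parts per million, so divide by 10000 to get a percentage.
+
+```cpp
+#include <stdint.h>
+
+// Clocks from interrupt to resumed code, as stated in the text above.
+constexpr uint32_t TCB_ISR_CLOCKS = 65;
+
+// Share of CPU time spent in the millis ISR, in parts per million,
+// assuming the ISR fires 1000 times per second.
+constexpr uint32_t tcb_isr_ppm(uint32_t f_cpu) {
+  return (uint32_t)((uint64_t)TCB_ISR_CLOCKS * 1000ULL * 1000000ULL / f_cpu);
+}
+```
+
+`tcb_isr_ppm(1000000UL)` evaluates to 65000 (6.50 %), and `tcb_isr_ppm(16000000UL)` to 4062 (about 0.41 %), which agrees with the table to within rounding.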
+
+The time that micros takes to return a value varies significantly with F_CPU. Specifically, powers of 2 are highly favorable, and almost all the calculations drop out of the 1 MHz case. micros takes between 78 and 160 clocks to run. Each factor of 2 increase in clock speed results in 10 extra clocks being added to micros in most cases (bitshifts, while faster than division, are still slow when you need multiples of them on larger types).
+
+### TCD0 for millis timekeeping
+TCD0 is not supported for millis timekeeping on these parts. Originally it was imagined that the implementation from megaTinyCore could simply be used - but there the main clock was prescaled from 16 or 20 MHz, and TCD0 ran from the unprescaled oscillator, giving 2 channels of normal speed PWM and a predictable timebase for millis even when clocked at 1 MHz. Here, the value proposition isn't as strong: there are more timers available, and the type D timer is readily put to use generating high frequency PWM, which is more accessible thanks to the PLL and the higher maximum system clock speeds. Those clock speeds determine not only the maximum frequency, but also how complicated the calculations that have to be performed at a specified frequency can be, including calculations that were on the edge of being too slow on the 0/1-series (ex, motor control applications, which must run above the range of human hearing for quieter operation). Note that such applications most certainly require a MOSFET gate driver: 50 kHz is an order of magnitude above the highest plausible frequency of PWM that could be used to drive a gate directly.
+
+## Tone
+The tone() function included with DxCore uses one Type B timer. It defaults to using TCB0; do not use that for millis timekeeping if using tone(). Tone is not compatible with any sketch that needs to take over TCB0. If possible, use a different timer for your other needs. When used for tone(), the timer will use CLK_PER or CLK_PER/2 as its clock source - the TCA clock will never be used, so it does not care if you change the TCA0 prescaler (unlike the official megaAVR core).
+
+Tone works the same way as the normal tone() function on official Arduino boards. Unlike the official megaAVR board package's tone function, it can be used to generate arbitrarily low frequency tones (as low as 1 Hz). If the period between required togglings of the pin is greater than the maximum timer period possible, it will calculate how many cycles it has to wait through between switching the pins in order to achieve the desired frequency.
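+
+One way to picture the low-frequency case is to split the interval between toggles into equal timer periods that each fit in the 16-bit timer. The sketch below is a hypothetical illustration of that idea and is not the code used by tone() in this core; `TonePlan` and `plan_tone` are made-up names, and `frequency_hz` is assumed to be nonzero.
+
+```cpp
+#include <stdint.h>
+
+struct TonePlan {
+  uint16_t period_ticks;        // timer ticks per timer period
+  uint32_t periods_per_toggle;  // timer periods to wait between pin toggles
+};
+
+// Split the ticks between toggles into equal periods of at most 65535 ticks.
+inline TonePlan plan_tone(uint32_t timer_clock_hz, uint32_t frequency_hz) {
+  uint32_t ticks = timer_clock_hz / (2UL * frequency_hz);  // two toggles per cycle
+  uint32_t periods = ticks / 65536UL + 1;                   // smallest count that fits
+  return TonePlan{ (uint16_t)(ticks / periods), periods };
+}
+```
+
+With an 8 MHz timer clock (CLK_PER/2 at 16 MHz), a 440 Hz tone fits in a single period of 9090 ticks, while a 1 Hz tone needs 62 periods of 64516 ticks between toggles.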
+
+It can only generate a tone on one pin at a time.
+
+All tone generation is done via interrupts. The hardware output compare functionality is not used for generating tones because in PWM mode, the type B timers kind of suck.
+
+## Servo Library
+The Servo library included with this core uses one Type B timer. It defaults to using TCB1 if available, unless that timer is selected for millis timekeeping. Otherwise, it will use TCB0. The Servo library is not compatible with any sketch that needs to take over these timers - if possible, use a different timer for your other needs. Servo and tone() can only be used together when neither of those timers is used for millis timekeeping.
+
+Regardless of which type B timer it uses, Servo configures that timer in Periodic Interrupt mode (`CNTMODE`=0) with CLK_PER/2 or CLK_PER as the clock source, so there is no dependence on the TCA prescaler. The timer's interrupt vector is used, and its period is constantly adjusted as needed to generate the requested pulse lengths. In 1.1.9 and later, CLK_PER is used if the system clock is below 10 MHz to generate smoother output and improve performance at low clock speeds.
+
+The above also applies to the Servo_DxCore library; it is an exact copy except for the name. If you have installed a version of Servo via Library Manager or by manually placing it in your sketchbook/libraries folder, the IDE will use that in preference to the one supplied with this core. Unfortunately, that version is not compatible with the Dx-series parts. Include Servo_DxCore.h instead in this case. No changes to your code are needed other than the name of the library you include.