Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

attachInterrupt(rxPin, onRx, FALLING) results in immediate reset after updating from 2.0.2 to 2.0.3 (and 2.0.4) #7202

Closed
1 task done
dagnall53 opened this issue Aug 30, 2022 · 12 comments · Fixed by #7904
Closed
1 task done
Assignees

Comments

@dagnall53
Copy link

dagnall53 commented Aug 30, 2022

Board

ESP32 devboard

Device Description

The fault was noted in code that uses a software UART.
I have added a number of debug printlns to explore when and why the code resets repeatedly after being compiled with 2.0.3 (and 2.0.4).
These debug printlns appear to show that the code resets if attachInterrupt(rxPin, onRx, FALLING); is called from within a timer interrupt.
When compiled using 2.0.2, the code functions ok, and progresses past this point and then works fully.
I have made a much simplified code (below) that demonstrates this effect.

Hardware Configuration

Fault occurs with bare board. Resets during setup with 2.0.3 (+) revision compilers. but works with 2.0.2 and earlier.

Version

v2.0.4

IDE Name

Version: 2.0.0-rc9.2 Date: 2022-08-10T13:03:18.962Z CLI Version: 0.26.0-rc.1 [fc2ea723] Copyright © 2022 Arduino SA

Operating System

win10

Flash frequency

80Mhz

PSRAM enabled

no

Upload speed

115200

Description

The problem manifests as the Hardware resetting during code start-up.

BY adding debugprint lines I have traced the issue to a single line that attaches a falling edge interrupt as part of a Timer interrupt.
In "normal" use as compiled by earlier versions (eg 2.0.2), the code will setup a timerAlarmEnable(timer_1);, and do subsequent debug prints and will function as expected.
But with 2.0.3 (+) the code appears to fail and reset the ESP at this line.

This timer interrupt code does many things, but the FIRST time it is run, it stops the timer and Sets a falling edge interrupt. It is this element that appears to be failing.

The UART code was originally designed to overcome the (undocumented?) ESP32 hardware(?) issue that results in the FIRST timer interrupt always occurring immediately after the timer is setup. (If this 'first immediate trigger' was a SOFTWARE fault, it is possible that we can redesign the code to accept 2.0.3 on.. but I have not been able to explore this possibility)

I have sketched the key parts of the code below it does not "do" anything, but it does demonstrate how the code will not complete my setup with compiler 2.0.3 on.
.
Many thanks

Sketch

//  Updated with a simpler example.. 
//This partial code shows the general form of the UART setup. and is sufficient to demonstrate the reset behaviour with 2.0.3 on.
// if pin IO18 is pulled down it will start (simulated) reading a byte
// but if compiled with 2.0.3 (0n), it will give a GURU mediation error, like this...
//setup-start
//v4.4.1-472-gc9140caf8c
//0
//attachInterrupt
//Guru Meditation Error: Core  1 panic'ed (Interrupt wdt timeout on CPU1). 
 //interrupt  wdt interrupt crash
//******** code starts **************
#define RX_ST_PIN 18
hw_timer_t * timer_1 = NULL;

volatile byte rxBitCounter = 0;

void IRAM_ATTR onRx() { 
  detachInterrupt(RX_ST_PIN);
  Serial.println("onRx");  
  timerStart(timer_1);
  timerAlarmWrite(timer_1, 1.8 * 10000000.0 / 4800, true);
  timerAlarmEnable(timer_1);
}

void IRAM_ATTR onRxTimer() {
  Serial.println(rxBitCounter);
  if ( rxBitCounter == 0 ) {   // special case for first setup. to overcome initial zero time timeout
    rxBitCounter = 1;
    timerStop(timer_1);
    Serial.println("attachInterrupt");
    attachInterrupt(RX_ST_PIN, onRx, FALLING);//Crashes here in v2.0.3+
    Serial.println("attachInterrupt-done");    
  }
  if (rxBitCounter == 10) {
    rxBitCounter = 1;
    timerStop(timer_1);   
    attachInterrupt(RX_ST_PIN, onRx, FALLING);
    return;
  }
  rxBitCounter++;
}

void setup() {
  Serial.begin(115200, SERIAL_8N1, 3, 1);
  delay(1000);
  Serial.println("setup-start");
  Serial.println(ESP.getSdkVersion());
  pinMode(RX_ST_PIN, INPUT_PULLUP);
  timer_1 = timerBegin(1, 8, true);
  timerAttachInterrupt(timer_1, &onRxTimer, true);
  timerAlarmWrite(timer_1, 10, true);
  timerAlarmEnable(timer_1);  
  Serial.println("setup-end");
}

void loop() {
  Serial.println("loop");
  delay (10000);
}

Debug Message

When compiled with 2.0.3 or subsequent, the code does not progress past  the "attachInterrupt(rxPin, onRx, FALLING);" statement: 

This is demonstrated in the debug prints:  (here with V2.0.4)
setup-start
v4.4.1-472-gc9140caf8c
0
attachInterrupt
Guru Meditation Error: Core  1 panic'ed (Interrupt wdt timeout on CPU1). 

Core  1 register dump:
PC      : 0x4008a7f0  PS      : 0x00060735  A0      : 0x80089a46  A1      : 0x3ffbec7c  
A2      : 0x3ffb7bf8  A3      : 0x3ffb8890  A4      : 0x00000004  A5      : 0x00060723  

Other Steps to Reproduce

Code works with previous versions of compiler, its just versions after 2.0.3 that fail.

I have checked existing issues, online documentation and the Troubleshooting Guide

  • I confirm I have checked existing issues, online documentation and Troubleshooting guide.
@dagnall53 dagnall53 added the Status: Awaiting triage Issue is waiting for triage label Aug 30, 2022
@Enigma644
Copy link

Confirmed, I also see this error with my ESP32 Dev board.

@SuGlider
Copy link
Collaborator

@P-R-O-C-H-Y, could you please check it out? I think you have been working on Timers, right?
It has to do with using attachInterrupt() inside an ISR, as we talked a week ago in the Sprint Meeting.

@P-R-O-C-H-Y
Copy link
Member

Hi @dagnall53, Timer functions cannot be called within ISR same will be with attachInterrupt as they don't have IRAM_ISR attribute. Related issue #6693.

@P-R-O-C-H-Y P-R-O-C-H-Y self-assigned this Aug 31, 2022
@dagnall53
Copy link
Author

dagnall53 commented Aug 31, 2022

Hi @P-R-O-C-H-Y .. Thanks for looking at this. I realise that code with a SerialPrint inside the Interrupt is very bad, but it was one way to try and find out (in a huge program) why it would not run under 2.0.3 (+).
What I fail to understand is that in 2.0.2 it does work - and reliably. But not under 2.0.3 (+).

This code fragment is essentially a UART and we have been using very similar code to swap between Edge Triggered Interrupt (to detect start of transmission) and timed interrupts (to get the actual data) very happily up until now!.

Looking at esp32-hal-timer.c I see some big changes from 2.0.2 (left) to 2.0.4. (right). and Although I do not understand them, I wonder if these might be missing something?
I'm afraid that I do not understand the "IRAM_ISR attribute" implications, but did previous compilers have this attribute for the timer / level interrupt functions and has this been lost?

image

@P-R-O-C-H-Y
Copy link
Member

P-R-O-C-H-Y commented Aug 31, 2022

Hi @dagnall53,

First about the change in code you sent 2.0.2 - 2.0.4. The Timer API got refactored to use ESP-IDF functions and the timer_enable_intr(group_num, timer_num); is called inside timer_isr_callback_add(); so there is no need to call it separately and the timer_info was not used at all.

I will try to look more into that and find if there is a way to make it work is it was in 2.0.2.
But I think it won't be possible to call Timer functions inside ISR or even attachInterrupt inside ISR.

@VojtechBartoska VojtechBartoska added the Status: Needs investigation We need to do some research before taking next steps on this issue label Aug 31, 2022
@dagnall53
Copy link
Author

dagnall53 commented Aug 31, 2022

@P-R-O-C-H-Y . Thanks, it would be great if the original functionality could be retained.

In the meantime, have been "playing" and I think that using a token (here "LookingForFalling" ) might be one way to this to possibly work with 2.0.4, by having the timing and edge sensing interrupts running concurrently, with simple switches in their routines.
But I am a bit concerned that doing it this way leaves both the "falling Edge" and the "bit timing" interrupts active all the time, and so may have other consequences in the "real" code.

I would appreciate any insights..

EG.. to get the bit timing synchronous with the falling edge I propose to use timerWrite(My_Timer, 0); // reset bit timer?.. will this reset the timer as I expect it to?

//This partial code now "Works" with 2.0.4 (or at least it does not crash!!)
// (not necessarily minimal code.. !!)

#define RX_ST_PIN 18
hw_timer_t * My_Timer = NULL;

volatile byte rxBitCounter = 0;
bool Has_Been_Printed = true;
bool Setup_done=false;

unsigned long previousMillis ;

bool LookingForFalling = false;

String debugString = "";

void IRAM_ATTR onRxFall() {
if (!LookingForFalling) {return;}
if (!digitalRead(RX_ST_PIN)) {return;} // wait for signal high before allowing any more progress with byte reading
LookingForFalling = false;
rxBitCounter = 1;
timerAlarmWrite(My_Timer, 1.8 * 10000000.0 / 4800, true); // set baud rate
timerWrite(My_Timer, 0); // reset bit timer?
}

void IRAM_ATTR onRxTimer() {
if (LookingForFalling){return;}
//Serial.println(rxBitCounter);

if (rxBitCounter == 10) {
LookingForFalling = true;
debugString.concat("byte Read\n\r");
Has_Been_Printed=false; // replaces serial print , allows loop to show bit progress
return;
}
rxBitCounter++;
//debugString.concat(rxBitCounter);
}

void Setup_timer () {
if (Setup_done){ Serial.println( " already setup"); return; }
pinMode(RX_ST_PIN, INPUT_PULLUP);
Serial.println("setup-timer -begin "); delay(500);
My_Timer = timerBegin(2, 8, true);
timerAlarmWrite(My_Timer, 100000, true); // set bit rate (here set to a simple "fast" to get to the bit count=10 and set the interrupt)
timerAttachInterrupt(My_Timer, onRxTimer, true);
rxBitCounter = 9; // test will initially count to 10 and then set the falling edge interrupt
LookingForFalling = false;
Serial.println("Enabling timer"); delay(500);
timerStart(My_Timer);
timerAlarmEnable(My_Timer);
Serial.println("setup-end"); delay(500);

Setup_done=true;
Has_Been_Printed=false;
}

void setup() {

Setup_done=false;
Serial.begin(115200, SERIAL_8N1, 3, 1);
delay(500);
Serial.println("******************");
Serial.println(ESP.getSdkVersion());
delay(500);
LookingForFalling = false;
attachInterrupt(RX_ST_PIN, onRxFall, FALLING);//Crashes here in v2.0.3+
//Setup_timer ();
previousMillis = millis();
}

void loop() {
if (!Has_Been_Printed) {
Serial.print(debugString);Serial.print(" ");
Has_Been_Printed=true;
}

if (millis() >= (previousMillis +5000)){
Serial.print("loop ");
previousMillis = millis();
if (!Setup_done) {Setup_timer ();}

}

delay(1);
}

@dagnall53
Copy link
Author

@P-R-O-C-H-Y .. After having read the ESP documents, (and maybe partially understanding them), I think that the new interrupt approach is using shared interrupts (?). And that this brings some critical limitations.

I am not sure but, it appears one additional limitation appears to be that shared edge triggered interrupts become problematic, and so are converted to level interrupts (I may have seen a compiler verbose comment to this effect ? ) (Is this correct???)

Since an edge triggered interrupt is key to timing accuracy in a UART, I am concerned about timing implications, and how to overcome them in a way compatible with 2.0.3 + compiler..

1: Will all edge triggered interrupts now actually be programmed at assembler as a shared interrupt, LOW detect? If so, its going to make accurate detection of the falling edge exact time of occurance a big problem!

2: If I CAN ( ?how? ) still program a high priority edge triggered interrupt with the new compiler, I know (because it crashes!) that I cannot now do attach and detach inside another ISR. My ‘hack’ is to add a token and a minimal ISR addition that initially checks the token and rapidly returns if the Interrupt is not required. But this seems inelegant and will add a small timing penalty on every falling edge. - Unless there is some other way to switch on / off the interrupt when it is not needed?

Comments much appreciated.
Many thanks
Dagnall

@SuGlider
Copy link
Collaborator

SuGlider commented Sep 7, 2022

2: If I CAN ( ?how? ) still program a high priority edge triggered interrupt with the new compiler, I know (because it crashes!) that I cannot now do attach and detach inside another ISR. My ‘hack’ is to add a token and a minimal ISR addition that initially checks the token and rapidly returns if the Interrupt is not required. But this seems inelegant and will add a small timing penalty on every falling edge. - Unless there is some other way to switch on / off the interrupt when it is not needed?

I think that a possible solution for such a critical time application would be use some direct IDF or LL code to detach it, instead of Arduino Classic code... it may require more investigation.

@dagnall53
Copy link
Author

dagnall53 commented Oct 24, 2022

@P-R-O-C-H-Y / @SuGlider EDIT.. Sorry It would appear that 2.0.5 has not "solved" this problem-

@dagnall53
Copy link
Author

@SuGlider @P-R-O-C-H-Y
< an upate >
A friend and colleague has managed to get this code working by the extremely simple method of placing a timerStop() immediately after the TimerBegin in the setup routine:
timer_1 = timerBegin(1, 8, true);
**timerStop(timer_1); // added **
timerAttachInterrupt(timer_1, &onRxTimer, true);

This appears to be related to the "immediate trigger" issue discussed in
https://docs.espressif.com/projects/esp-idf/en/v4.3/esp32/api-reference/peripherals/timer.html#_CPPv422timer_isr_callback_add13timer_group_t11timer_idx_t11timer_isr_tPvi

I am assuming that the issue I had with the code was NOT therefore with the ATTACH interrupt, but was actually caused by the timer ?
If you agree, we can close this Issue.. (but perhaps also add it to notes on timer use? )

@Z-e-e-v
Copy link

Z-e-e-v commented Jan 22, 2023

@dagnall53 all you did was add that one line? Could you send the whole code again? I am having trouble with ISRs as well.

@P-R-O-C-H-Y
Copy link
Member

Closing this as its solved by refactored driver #7904.

@github-project-automation github-project-automation bot moved this from Under investigation to Done in Arduino ESP32 Core Project Roadmap Aug 9, 2023
@P-R-O-C-H-Y P-R-O-C-H-Y added Status: Solved and removed Status: Awaiting triage Issue is waiting for triage Status: Needs investigation We need to do some research before taking next steps on this issue labels Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging a pull request may close this issue.

6 participants