Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems when trying to update esp-idf and toolchain to version 8.2 (the latest) #263

Closed
leofabri opened this issue Sep 11, 2019 · 12 comments
Labels

Comments

@leofabri
Copy link

leofabri commented Sep 11, 2019

Hi everyone, I'm fairly new to esp-idf as well as c++. I'm trying to use the Open Source OVMS3 as a base for a personal project. I'm sorry if I came here to ask about this, I hope that some good developer can enlighten me.

The problem: the current version of esp-idf is quite old

I've noticed that you are using an old version of both the toolchain and the compiler (v 5.2 if I remember correctly), and that's why I've spent quite some time trying to implement all the changes that you have made to the stock esp-idf components to make it work with this version.

What I'm trying to achieve:

I want to get OVMS to work with the most updated version of esp-idf, respectively with:

  • Toolchain version: esp32-2019r1
  • Compiler version: 8.2.0

The code compiles, but panics on boot

Unfortunately, I got stuck and now I can't really understand why when I flash my ESP32 with the updated code I get this segmentation fault:

I (940) ovms-mdns: Initialising MDNS (8100)
Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.
Core 0 register dump:
PC      : 0x40184479  PS      : 0x00060330  A0      : 0x8013af84  A1      : 0x3ffe3a70  
0x40184479: std::local_Rb_tree_decrement(std::_Rb_tree_node_base*) at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/gcc/libstdc++-v3/src/c++98/tree.cc:98

A2      : 0x3ffb7f80  A3      : 0x3ffe3b88  A4      : 0x0000000c  A5      : 0x00000000  
A6      : 0x00001004  A7      : 0x3ffae72c  A8      : 0x00000000  A9      : 0x3ffe3a60  
A10     : 0x3ffbb51c  A11     : 0x3ffe3b88  A12     : 0x0000000c  A13     : 0x00000080  
A14     : 0x0000001f  A15     : 0x00000000  SAR     : 0x00000004  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x00000004  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff  

ELF file SHA256: 897edbb72e8f84efb785d64d5b515037f2286af9bafeb202a60e63f5c2bc4136

Backtrace: 0x40184476:0x3ffe3a70 0x4013af81:0x3ffe3a90 0x40109b5e:0x3ffe3ab0 0x40109c2a:0x3ffe3ae0 0x40109ccb:0x3ffe3b10 0x40130692:0x3ffe3b80 0x401307aa:0x3ffe3bf0 0x4008126b:0x3ffe3c10 0x40081494:0x3ffe3c60 0x40079311:0x3ffe3c80 |<-CORRUPTED
0x40184476: std::local_Rb_tree_decrement(std::_Rb_tree_node_base*) at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/gcc/libstdc++-v3/src/c++98/tree.cc:97

0x4013af81: std::_Rb_tree_decrement(std::_Rb_tree_node_base*) at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/gcc/libstdc++-v3/src/c++98/tree.cc:123

0x40109b5e: std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::list<EventCallbackEntry*, std::allocator<EventCallbackEntry*> >*>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::list<EventCallbackEntry*, std::allocator<EventCallbackEntry*> >*> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::list<EventCallbackEntry*, std::allocator<EventCallbackEntry*> >*> > >::_M_get_insert_unique_pos(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/leofabri/.espressif/tools/xtensa-esp32-elf/esp32-2019r1-8.2.0/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/8.2.0/bits/stl_tree.h:302
 (inlined by) std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::list<EventCallbackEntry*, std::allocator<EventCallbackEntry*> >*>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::list<EventCallbackEntry*, std::allocator<EventCallbackEntry*> >*> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::list<EventCallbackEntry*, std::allocator<EventCallbackEntry*> >*> > >::_M_get_insert_unique_pos(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/leofabri/.espressif/tools/xtensa-esp32-elf/esp32-2019r1-8.2.0/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/8.2.0/bits/stl_tree.h:2063

0x40109c2a: std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::list<EventCallbackEntry*, std::allocator<EventCallbackEntry*> >*>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::list<EventCallbackEntry*, std::allocator<EventCallbackEntry*> >*> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::list<EventCallbackEntry*, std::allocator<EventCallbackEntry*> >*> > >::_M_get_insert_hint_unique_pos(std::_Rb_tree_const_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::list<EventCallbackEntry*, std::allocator<EventCallbackEntry*> >*> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/leofabri/.espressif/tools/xtensa-esp32-elf/esp32-2019r1-8.2.0/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/8.2.0/bits/stl_tree.h:2192

0x40109ccb: OvmsEvents::RegisterEvent(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*)>) at /home/leofabri/.espressif/tools/xtensa-esp32-elf/esp32-2019r1-8.2.0/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/8.2.0/bits/stl_tree.h:2418
 (inlined by) ?? at /home/leofabri/.espressif/tools/xtensa-esp32-elf/esp32-2019r1-8.2.0/xtensa-esp32-elf/xtensa-esp32-elf/include/c++/8.2.0/bits/stl_map.h:499
 (inlined by) OvmsEvents::RegisterEvent(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*)>) at /home/leofabri/esp/ovms/main/ovms_events.cpp:190

Now, please notice that due to the changes I've made to OVMS the lines are no longer matching those of the original OVMS. However, from what I see, there is a problem when the software loads MDNS.

After some tracing, the problematic line seems to be this:

m_map[event] = new EventCallbackList();

void OvmsEvents::RegisterEvent(std::string caller, std::string event, EventCallback callback)
  {
  auto k = m_map.find(event);
  if (k == m_map.end())
    {
    m_map[event] = new EventCallbackList(); // <- the problem occurs here wen we invoke the Callback
    k = m_map.find(event);
    }
  if (k == m_map.end())
    {
    ESP_LOGE(TAG, "Problem registering event %s for caller %s",event.c_str(),caller.c_str());
    return;
    }

  EventCallbackList *el = k->second;
  el->push_back(new EventCallbackEntry(caller,callback));
  }
@dexterbg
Copy link
Member

Bold attempt. Hint: enable the gdb stub so you can inspect registers and parameters on the crash.

@leofabri
Copy link
Author

I'm still developing this and I'm using OVMSV3 as a base. Now that ESP-IDF V4 has been released I'm trying to completely transition to it, but it's not as simple.
I made a lot of changes since the beginning and the code is now compiling properly, but there is still something I don't understand.

  1. I really wish I could understand why you inherited an std::map here when I read in many places that this should be avoided due to the impossibility of having a destructor acting on the map itself. Read here why this shouldn't be done.

Solution:

I made the map private inside of the class (actually, I removed the class entirely and moved everything inside OvmsCommand.


  1. Now, I noticed a really strange behavior here. When you issue MyCommandApp.RegisterCommand("blablabla") you basically call this function here but something bad happens.

It's difficult to explain, so please follow me.

In my code, I noticed that the first library that tries to register a new command is pcp.cpp, this line here
. Then other registrations occur in order.
At some point, the commands inside of the OvmsCommandApp() get registered:

  m_root.RegisterCommand("help", "Ask for help", help, "", 0, 0, false);
  m_root.RegisterCommand("exit", "End console session", cmd_exit, "", 0, 0, false);
  OvmsCommand* cmd_log = MyCommandApp.RegisterCommand("log","LOG framework");
  cmd_log->RegisterCommand("file", "Start logging to specified file", log_file, "[<vfspath>]\nDefault: config log[file.path]", 0, 1);
  cmd_log->RegisterCommand("open", "Start file logging", log_open);
  cmd_log->RegisterCommand("close", "Stop file logging", log_close);
  cmd_log->RegisterCommand("status", "Show logging status", log_status);
  cmd_log->RegisterCommand("expire", "Expire old log files", log_expire, "[<keepdays>]", 0, 1);
  OvmsCommand* level_cmd = cmd_log->RegisterCommand("level", "Set logging level", NULL, "$C [<tag>]", 0, 0, false);
  level_cmd->RegisterCommand("verbose", "Log at the VERBOSE level (5)", log_level , "[<tag>]", 0, 1);
  level_cmd->RegisterCommand("debug", "Log at the DEBUG level (4)", log_level , "[<tag>]", 0, 1);
  level_cmd->RegisterCommand("info", "Log at the INFO level (3)", log_level , "[<tag>]", 0, 1);
  level_cmd->RegisterCommand("warn", "Log at the WARN level (2)", log_level , "[<tag>]", 0, 1, false);
  level_cmd->RegisterCommand("error", "Log at the ERROR level (1)", log_level , "[<tag>]", 0, 1, false);
  level_cmd->RegisterCommand("none", "No logging (0)", log_level , "[<tag>]", 0, 1, false);
  monitor = cmd_log->RegisterCommand("monitor", "Monitor log on this console", log_monitor , "[$C]");
  monitor_yes = monitor->RegisterCommand("yes", "Monitor log", log_monitor);
  monitor->RegisterCommand("no", "Don't monitor log", log_monitor);
  m_root.RegisterCommand("enable","Enter secure mode (enable access to all commands)", enable, "[<password>]", 0, 1, false);
  m_root.RegisterCommand("disable","Leave secure mode (disable access to most commands)", disable);

And those commands are registered during the creation of the constructor here.

The problem:

the map of commands contains just the ones registered after the creation of the constructor and not the additional ones like: metrics, power...

When I saw this, I initially thought that the map or even m_root had something to do with it. But the program was compiling successfully and the code didn't seem wrong to me.

So, I decided to trace things a little bit more and observe what was happening during the creation of those commands inside of m_root.

I replaced the OvmsCommandApp::RegisterCommand(...) with the following code:

OvmsCommand *OvmsCommandApp::RegisterCommand(const char *name, const char *title, void (*execute)(int, OvmsWriter *, OvmsCommand *, int, const char *const *),
                                               const char *usage, int min, int max, bool secure)
{
  OvmsCommand *test = m_root.RegisterCommand(name, title, execute, usage, min, max, secure);

  printf("\n---START---\n");
  int i = 0;
  for (const auto &x : m_root.m_children)
  {
    // Just show me what's in the map
    printf("%d) key is: %s \n- title: %s \n- name: %s\n", i, x.first.c_str(), x.second->m_title, x.second->m_name);
    i++; // Nothing personal here :)
  }
  printf("\n---END---\n");

  return test;
}

and everything got clearer on the next run.

What's happening

m_children is populated with all the commands, I can see them getting registered and appended one after the other.
Then, at some point, I see "Initialising COMMAND (1000)" and the list of registered commands is immediately cleared of its content and just the ones registered after the initialization are retained on the console.

[...] previous console logs...

---START---
0) key is: boot 
- title: BOOT framework
- name: boot
1) key is: event 
- title: EVENT framework
- name: event
2) key is: metrics                        <--- Notice metrics. It's inside m_root already
- title: METRICS framework
- name: metrics
3) key is: notify
- title: NOTIFICATION framework 
- name: notify
4) key is: power 
- title: Power control
- name: power

---END---
I (947) command: Initialising COMMAND (1000)  <--- Look at this here! The previous content seems to be gone.

---START---
0) key is: exit
- title: End console session
- name: exit
1) key is: help
- title: Ask for help 
- name: help
2) key is: log
- title: LOG framework 
- name: log
                                     <---- where are metrics and the other commands?

---END---
I (969) config: Initialising CONFIG (1400)

---START---
0) key is: disable
- title: Leave secure mode (disable access to most commands) 
- name: disable
1) key is: enable
- title: Enter secure mode (enable access to all commands)
- name: enable
2) key is: exit 
- title: End console session
- name: exit
3) key is: help
- title: Ask for help
- name: help
4) key is: log 
- title: LOG framework
- name: log
5) key is: store
- title: STORE framework 
- name: store

---END---

Thoughts and considerations

I have spent a lot of time looking for the reasons why the constructor gets called with such a delay.
My idea is that after some time, the objects inside of class OvmsCommandApp get completely overwritten or reinitialized. We clearly don't want this to happen.

Why despite having registered OvmsCommandApp as global MyCommandApp is this happening?

I mean: shouldn't the OvmsCommandApp constructor be called immediately when I ask to register a command?

@dexterbg
Copy link
Member

The OvmsCommand code has mostly not been written by me, but I think I can answer your questions.

  1. Regarding the use of std::map: there is nothing wrong with using std::map as a base class in a non-virtual sub class, especially not if it's never destructed -- as is the case here.

The issue you're relating to has to do with virtual object destruction. If OvmsCommandMap was virtual and would need to be addressable using a std::map pointer, that would not be working. But it isn't, and it's never even accessed through a pointer. It's just a functional extension to std::map only used within OvmsCommand.

So there's nothing wrong with that and it doesn't need a fix.

  1. There is exactly one instance of OvmsCommandApp, which is MyCommandApp. That instance is created before any other of the static component instances are created. The instantiation order is managed at the source level through the init priority attributes, e.g.:
OvmsCommandApp MyCommandApp __attribute__ ((init_priority (1000)));

Init priorities are also logged in brackets from the constructors during the boot process:

…
I (1930) cpu_start: Pro cpu start user code
I (1934) spiram: Adding pool of 4096K of external SPI memory to heap allocator
I (63) ovms_main: Set default logging level for * to INFO
I (64) command: Initialising COMMAND (1000)
I (65) boot: Initialising BOOT (1100)
I (68) boot: Power cycle reset detected
I (72) events: Initialising EVENTS (1200)
I (77) config: Initialising CONFIG (1400)
I (82) time: Initialising TIME (1500)
I (87) script: Initialising SCRIPTS (1600)
I (91) script: Using DUKTAPE javascript engine
I (97) metrics: Initialising METRICS (1810)
…

The OvmsCommandApp constructor must not be delayed by the compiler until the first use, it must obey the init priority as defined, as the boot process depends on the order. And of course there must be no calls to RegisterCommand() before the instantiation.

So if you don't have this exact order in your boot log or see some entries having a second creation log entry, something is probably broken with your compiler or build setup.

In a normal boot, the OvmsCommandApp constructor itself is the first to call RegisterCommand(), with the first command registered being "help":
https://github.com/openvehicles/Open-Vehicle-Monitoring-System-3/blob/master/vehicle/OVMS.V3/main/ovms_command.cpp#L731

Regards,
Michael

@leofabri
Copy link
Author

@dexterbg thank you so much for clarifying this.
I found your answer very clear, and you are totally right about this: the map is never destroyed and it's not virtual, this is a good point.

Regarding the last question, yes, I did notice that something was wrong with the __attribute__(init_priority()) rule on esp-idf V4. Even when changing it, it didn't seem to affect anything in the initialization of the variable, and the boot sequence is clearly messy right now.

I noticed that this rule seems to be ignored on esp-idf v4 as it is, or at least not with my current configuration. I think I'll try to get some more details about this to the esp-idf guys.

It doesn't look like so, but many things have changed from V3 to V4, especially after they decided to move to CMake.

Best regards,
Fabrizio

@slcasner
Copy link
Contributor

Thanks for taking this, Michael. I am the author of code deriving from std::map, but the first comment from Fabrizio today was not forwarded to me by email as is usually the case, so I didn't know about it until I saw the email resulting from Michael's reply. I did get an email for Fabrizio's second comment today.

@leofabri
Copy link
Author

I can confirm this will now be fixed by the next commit of esp-idf as reported here.
Thank you all for your support.

@eddyv19
Copy link

eddyv19 commented Apr 25, 2020

Hello Fabrizio,

Are you still actively working on this? How far are you now?

Thanks,
Eddy

@leofabri
Copy link
Author

Hi @eddyv19 I don't feel like I'm actively working on this since I am busy with my academic studies in this field.
I'm currently developing other applications so I had to stop for a while developing this new version. Are you interested in this? I'm not developing a new OVMS version even though I appreciate the hard work that the guys here are carrying on.

@eddyv19
Copy link

eddyv19 commented Apr 25, 2020

Hi, yeah i am interested. Is the work that you have done so far accessible somewhere? How far did you get? Is it compiling with the updated libraries and tools.
I am very new to the OVMS code base.

@leofabri
Copy link
Author

leofabri commented Apr 25, 2020

Hi, despite being OVMS codebase Open Source, the code I've been working on is not accessible. It's been developed as a private repo for a custom project which unfortunately I can not share. I'm sorry about that.

I want to clarify that my version of the code is not currently supporting any of the OVMS components. They have been disabled permanently (we don't need them).
The app is currently compiling (but it's loading just pieces of the code): because of the errors reported above and many deprecations that have happened since V3.
The OVMS guys are doing a remarkable job supporting this project by manually updating the esp-idf framework. This codebase is quite big and porting it to V4 is not as easy as it seems.

When I started this, I was as new as you to the OVMS codebase, and I have to say that if you want to develop a new project based on this, you will probably have to work quite hard to get this working on esp-idf 4.x. It might not be worth it depending on your requirements.

@leres
Copy link
Contributor

leres commented Apr 26, 2020

Fabrizio: Do you have anything you can share with us?

Eddy: I want to see this happen. I've been working on building the esp-2020r1 toolchain on FreeBSD (a lot of weekends effort over the last month) and am very close to finishing this part.

@openvehicles openvehicles locked as too heated and limited conversation to collaborators Apr 27, 2020
@markwj
Copy link
Member

markwj commented Apr 27, 2020

The original poster has indicated he will not be able to contribute his changes back to the project. That is disappointing, but understandable.

I have raised a new GitHub issue for further discussions regarding this:

Moving OVMS to latest ESP-IDF framework #360

@markwj markwj added wontfix cantfix Can't fix and removed wontfix labels Apr 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

6 participants