Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use placement new for ETSTimer - no heap fragmentation #6164

Merged
merged 13 commits into from
Jun 5, 2019

Conversation

dok-net
Copy link
Contributor

@dok-net dok-net commented May 29, 2019

When using many Ticker objects, possible in a container, each allocates an ETSTimer object on the heap. On an embedded system, this is poor practice, this PR changes this, without affecting constructor/destructor effects of ETSTimer.

@dok-net
Copy link
Contributor Author

dok-net commented May 29, 2019

@igrr Could you kindly review the proposed change, please?

@d-a-v
Copy link
Collaborator

d-a-v commented May 29, 2019

Why don't you use a plain ETSTimer instead of keeping ETSTimer*.
Since it's a struct without explicit constructor, the in-place new does nothing, does it ?

@dok-net
Copy link
Contributor Author

dok-net commented May 30, 2019

I'm trying to stick as closely as possible to how it worked before, constructing and destructing ETSTimer objects - the pointer is used as predicate for armed/disarmed just like before. Should ETSTimer ever have a constructor/destructor, the object lifetime is unaffected - making struct ETSTimer an immediate member, that's different.

@igrr
Copy link
Member

igrr commented May 30, 2019

The change looks correct, but I agree with @d-a-v that making ETSTimer struct a member of the ticker object would lead to simpler code. The pointer checks in attach/detach can then be removed.

@devyte
Copy link
Collaborator

devyte commented May 30, 2019

I agree with the idea, but I think what should be done is:

  1. add member ETSTimer _timerObj
  2. point _timer to that member in the constructor
  3. open an issue to clean up for v3.0.0 (would break any class derived from Ticker)

libraries/Ticker/Ticker.h Outdated Show resolved Hide resolved
@dok-net
Copy link
Contributor Author

dok-net commented May 30, 2019

@devyte I have to respectfully reply that I don't care about a change for v3.0.0 but that I'm working on something for right now. That said, please consider:
The reviews question the sensibility of the seemingly C++-overengineering approach with placement new, but continue to note the incompatibility of their proposed mitigations.
I believe the following is true: Ticker is a non-virtual concrete class, no virtuals, not even a virtual destructor. I've move the placement memory into visibility private, making the change IMHO completely transparent to any derived classes - at compile time, naturally, ABI is affected, but being a "library" member, not strictly in core, this can be argued, can't it?
Changing to a plain ETSTimer member and dropping the new/delete/destructor calls can be done another time - this change is supposed to support using Ticker as a stable wrapper for the OS ETSTimer without the heap clutter.

Please keep in mind, this PR is there to remove polling for timeouts in Scheduler.

@d-a-v
Copy link
Collaborator

d-a-v commented May 30, 2019

@doknet @hreintke @devyte @igrr @earlephilhower @anyone

About this:

Please keep in mind, this PR is there to remove polling for timeouts in Scheduler.

I must understand this PR is aimed at removing the recent recurrent scheduled function list called at every loop() and yield(), leaving only the former one called once at loop() because of the overhead looping through this list which is not as trivial as it was before, am I right ?

I would like to explain why I need this feature and why ticker is not sufficient.

I'm currently trying to find my way to add pure ethernet drivers in this core :
ethernet chips without TCP or with TCP disabled. TCP would be managed by lwIP so the ethernet interfaces could transparently be used by all network libraries we currently and will have. We could also use PPP interfaces (= IP (tcp, udp, ...) over Serial so).

I have some ethernet shields to work with based on W5100 without exposed interrupt pin (even if W5100 itself has one). With that, or given the number of gpio (users may not wish to use the interrupt pin if possible), or given the fact current Ethernet library does not use it (or show me I didn't find): a function like ethernet.handlePacket() has to be called regularly, and preferably from CONT stack.

One could argue: what's the issue with that, just put your ethernet.handlePacket() right next to your MDNS.update() in your main loop. And I did. Then I got this kind of issue. What is certain is external libraries looping for a long time need to call yield() so I just need to call ethernet from yield(), and I did, with the recurrent scheduled functions.

My current issues:

  • os_timer is in SYS (CONT is preferred, lwIP is not protected in this core)
  • Ticker is in CONT but only called only at every loop() (not compatible with external network libraries)
  • recurrent scheduled functions are OK but it appears polling seems to be too much of overhead at every loop()/yield().

I am currently seeing two solutions

  • separate the current scheduled functions into two different lists.
    They will be much simpler to process. Overhead will be lower.
  • revert recurrent scheduled function, use two different lists, the legacy one called in loop() and the other called from both loop() and yield().
    I would use Ticker with the second one for ethernet.

I'm open to any proposal.

I would like to show you what os_timer would look like, frequently called too:
https://github.com/espressif/ESP8266_RTOS_SDK/blob/45f305243f5bc8527fabf7b648e89fa0739b0e9a/components/freertos/freertos/timers.c#L538
What about its overhead ?

@igrr
Copy link
Member

igrr commented May 30, 2019

I would like to show you what os_timer would look like, frequently called too:

Note that this is the FreeRTOS timer. It is unrelated to the os_timer (aka ETSTimer) implemented in the non-OS SDK.

(Not claiming that ETSTimer doesn't have overhead, just clearing this up to avoid confusion)

@dok-net
Copy link
Contributor Author

dok-net commented May 30, 2019

I cannot quite follow the claims made about Ticker, and if what I understand was true, there's still no problem, as the Schedule run_scheduled_functions only runs in CONT anyway.
Here's a test sketch that in my mind shows that what I am trying to do works.
It uses Ticker to schedule functions for immediate execution, but only at the given time, because that's when the Ticker schedules them in the first place - which is the same as scheduling them for delayed execution when run_scheduled_functions polls them all the time, right?

// QueueTest.cpp : This file contains the 'main' function. Program execution begins and ends there.
//

#include <Ticker.h>
#include <atomic>
#include "PolledTimeout.h"

constexpr long unsigned MSGCNT = 1000;
constexpr long unsigned MAXTICKERS = 50;

int pending = 1;

std::atomic<long unsigned> msgNo(0);
long unsigned nextVal[] = { 0 };

uint32_t startMillis;

bool checkMessage(long unsigned msg)
{
	const auto id = msg & 0x1f;
	const auto val = msg >> 5;

	if (id < 0 || id >= 1) {
		String str = "received bad thread id = ";
		str += int(id);
		Serial.println(str);
		return true;
	}
	if (val == MSGCNT - 1) {
		--pending;
		return false;
	}

	if (val != nextVal[id])
	{
		if (val < nextVal[id]) {
			String str = "value in reverse, id = ";
			str += int(id);
			str += ", val = ";
			str += int(val);
			Serial.println(str);
			nextVal[id] = val + 1;
		}
		else {
			String str = "values skipped, id = ";
			str += int(id);
			str += ", lost = ";
			str += int(val - nextVal[id]);
			Serial.println(str);
			nextVal[id] = val + 1;
		}
	}
	else
	{
		nextVal[id] = val + 1;
	}
	return true;
}

void setup()
{
	Serial.begin(115200);
	delay(1000);
	Serial.println("scheduler test");
	startMillis = millis();

	if (schedule_function_us([]() { Serial.print(".."); return true; }, 500000))
	{
		Serial.println("scheduled 'alive'");
	}
	else
	{
		Serial.println("could not schedule 'alive'");
	}

	std::atomic<uint32_t> tickerCnt(0);
	for (int i = 0; i < MSGCNT; ++i)
	{
		// prevent heap exhaustion by limiting Tickers simultaneously created
		while (tickerCnt.load() > MAXTICKERS) { delay(10); wdt_reset(); };
		tickerCnt.store(tickerCnt.load() + 1);
		auto _ticker = new Ticker;
		auto mn = msgNo.load();
		msgNo.store(mn + 1);
		_ticker->once_ms(i, [mn, _ticker, &tickerCnt]() {
			if (mn < MSGCNT) {
				auto msg = (mn << 5) + 0;
				while (!schedule_function([msg]()
					{
						//Serial.print("!");
						checkMessage(msg);
					}, SCHEDULE_FUNCTION_WITHOUT_YIELDELAYCALLS)) {
				}
			}
			delete _ticker;
			tickerCnt.store(tickerCnt.load() - 1);
			});
	}
}

esp8266::polledTimeout::periodicFastUs loopAlive(5000000);

void loop()
{
	if (pending == 0)
	{
		const auto val = nextVal[0];
		if (val != MSGCNT - 1)
		{
			String str = "received bad last message, id = ";
			str += 0;
			str += ", val = ";
			str += int(val);
			Serial.println(str);
		}

		Serial.print("done after ");
		Serial.print(millis() - startMillis);
		Serial.println("ms");

		pending = -1;
	}

	if (loopAlive) Serial.println("loop alive");
}

@d-a-v
Copy link
Collaborator

d-a-v commented May 30, 2019

@dok-net

It uses Ticker to schedule functions for immediate execution, but only at the given time, because that's when the Ticker schedules them in the first place - which is the same as scheduling them for delayed execution when run_scheduled_functions polls them all the time, right?

Yes you are right.

You missed the part where I claim I need scheduled functions be run also at yield() time.
Former scheduled function don't do that.
Current sources (to be reverted because of overhead?) do that.
Please tell me where I am unclear so I can explain in a different way. or whether I'm wrong.

@devyte
Copy link
Collaborator

devyte commented May 30, 2019

We're drifting off the topic of this PR.
@dok-net I didn't ask for a change for 3.0.0, I asked for an issue to track the change, which would be a minor cleanup after this PR, so as not to forget about it. Also, that comment is not necessarily aimed at you, but at whoever merges this PR.
As I said, I agree with your idea about reducing the heap fragmentation, but not with your solution. What I proposed is equivalent changes and results in cleaner code. My review stands. If you have further questions let's discuss in gitter directly.
Making the new member private protects against future access, hence is a good addition, so please keep the idea, just applied to ETSTimer _timerObj.

@dok-net
Copy link
Contributor Author

dok-net commented May 30, 2019

@d-a-v We both want your work to stay in place! What I am trying to do/prove is using Ticker to schedule timed functions instead of using the polling approach. When you say that this conflicts with execution of said scheduled functions does not happen during yield, in the case that the Ticker schedules them when their time case come, confused me - is there something about the Ticker that prevents it from executing during yield?

See, if you substitute the delay(10); wdt_reset(); line with

while (tickerCnt.load() > MAXTICKERS) { yield();/* delay(10); wdt_reset();*/ };

the test still completes successfully. Doesn't this indicate that yield() indeed lets both the Ticker execute and run_scheduled_functions() do it's job? So if this really works as it seems to me, it's a minor thing to
set up schedule_function_us to hand the function to a Ticker instance, which will schedule to the queue only upon elapsed timer, saving the queue from polling all timed functions. Once a timed function want to recur, it will again be given to a Ticker, and so forth.

If this makes sense, we need to figure out whether to allocate the Tickers on the heap, or in a prepared array, again using placement new(), if I can work out the reservations against that everyone seems to have.

@dok-net
Copy link
Contributor Author

dok-net commented May 30, 2019

@devyte Naturally just my opinion, but I see absolutely nothing wrong with placement new. It expresses quite succinctly that what was going on before is still unchanged, namely, an instance of ETSTimer is constructed, and destructed later. The compiler should actually generate code that is superior to supplying an immediate ETSTimer member in the Timer class, because in common interpretation of the C++ standard, this POD class will zero-initialize during construction. A detach() - attach*() cycle may easily rely on this initialization, if os_timer_setfn() and os_timer_arm() should rely on a freshly constructed ETSTimer, that's zeroed out.

@d-a-v
Copy link
Collaborator

d-a-v commented May 30, 2019

@dok-net

How is ticker going to call a function in cont stack while an external library is busy waiting on network ?

#include <coredecls.h>
#include <Ticker.h>
#include <PolledTimeout.h>

Ticker tick1, tick2;
esp8266::polledTimeout::periodicFastMs extLibHasFinished(10000);

void mycb (void)
{
  if (cont_can_yield(g_pcont))
    Serial.println("now in CONT stack");
  else
    Serial.println("NOT now");
}

void mycb1 (void)
{
  Serial.printf("attach_ms: ");
  mycb();
}

void mycb2 (void)
{
  Serial.printf("attach_ms_scheduled: ");
  mycb();
}

void setup() {
  Serial.begin(115200);
  tick1.attach_ms(2000, mycb1);
  tick2.attach_ms_scheduled(2000, mycb2);
  extLibHasFinished.reset();
}

void loop() {
  Serial.println("external library now busy");
  while (!extLibHasFinished)
    yield();
  Serial.println("external library finally gets out");
}

output:

01:28:04.435 -> attach_ms: NOT now
01:28:06.427 -> attach_ms: NOT now
01:28:08.420 -> attach_ms: NOT now
01:28:10.412 -> attach_ms: NOT now
01:28:10.412 -> external library finally gets out
01:28:10.412 -> attach_ms_scheduled: now in CONT stack
01:28:10.412 -> attach_ms_scheduled: now in CONT stack
01:28:10.446 -> attach_ms_scheduled: now in CONT stack
01:28:10.446 -> attach_ms_scheduled: now in CONT stack
01:28:10.446 -> attach_ms_scheduled: now in CONT stack
01:28:10.446 -> external library now busy
01:28:12.441 -> attach_ms: NOT now
01:28:14.433 -> attach_ms: NOT now
01:28:16.426 -> attach_ms: NOT now
01:28:18.419 -> attach_ms: NOT now
01:28:20.412 -> attach_ms: NOT now
01:28:20.412 -> external library finally gets out
01:28:20.412 -> attach_ms_scheduled: now in CONT stack
01:28:20.445 -> attach_ms_scheduled: now in CONT stack
01:28:20.445 -> attach_ms_scheduled: now in CONT stack
01:28:20.445 -> attach_ms_scheduled: now in CONT stack
01:28:20.445 -> attach_ms_scheduled: now in CONT stack
01:28:20.445 -> external library now busy
01:28:22.438 -> attach_ms: NOT now
01:28:24.430 -> attach_ms: NOT now

@dok-net
Copy link
Contributor Author

dok-net commented May 31, 2019

@d-a-v This PR is for improving memory performance. Please hop over to #6139 for the answer :-)

@dok-net
Copy link
Contributor Author

dok-net commented May 31, 2019

Core "Host tests" require some fixes - use of uint32_t for pointer type was incorrect anyway.

@dok-net
Copy link
Contributor Author

dok-net commented May 31, 2019

@igrr @devyte @d-a-v Next question: I have a choice of either adding ICACHE_RAM_ATTR to Ticker::once_ms (and assorted further members) or using the upgraded new "cont scheduler" to allow ISRs to safely schedule timed functions. The latter approach causes an additional scheduler roundtrip delay for timed functions... Which is preferable from your POV?

Copy link
Collaborator

@devyte devyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much cleaner! Only minor comments now.

libraries/Ticker/Ticker.cpp Outdated Show resolved Hide resolved
libraries/Ticker/Ticker.cpp Outdated Show resolved Hide resolved
libraries/Ticker/Ticker.cpp Outdated Show resolved Hide resolved
libraries/Ticker/Ticker.h Outdated Show resolved Hide resolved
@devyte devyte merged commit 7910121 into esp8266:master Jun 5, 2019
@dok-net dok-net deleted the TickerSingleMemChunk branch June 5, 2019 05:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants