Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recurrent scheduled functions max delay update #8949

Draft
wants to merge 29 commits into
base: master
Choose a base branch
from

Conversation

dok-net
Copy link
Contributor

@dok-net dok-net commented Jul 1, 2023

Fixes #8947
This pulls in an older PR from myself, that added remaining() to PolledTimeout. That is needed to properly calculate the maximum permissible delay until the next recurrent scheduled function is due.
Replaces the expensive and IMHO incorrect logic using std::gcd() recently added to Scheduler.

This now works with microsecond precision for the timed functions, instead of rounding up to the nearest millisecond delay as before.

@dok-net dok-net changed the title Fix 8947: delay(0) must yield(); recurrent scheduled functions max delay fixed Fix #8947: delay(0) must yield(); recurrent scheduled functions max delay fixed Jul 1, 2023
@dok-net dok-net marked this pull request as draft July 1, 2023 15:00
@dok-net dok-net marked this pull request as ready for review July 2, 2023 07:47
@dok-net dok-net force-pushed the fix_8947 branch 2 times, most recently from 84f4b73 to cebad90 Compare July 19, 2023 22:16
Copy link
Collaborator

@d-a-v d-a-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I find it funny that in your humble opinion you think the logic is incorrect, I took time to review your proposal.
The idea to recalculate next call date at every call is interesting and could be indeed more efficient than relying on the durations gcd to check on them all (while I'm not certain yet, because gcd is calculated once).

Also, this PR's title is outdated.

cores/esp8266/Schedule.cpp Outdated Show resolved Hide resolved
cores/esp8266/Schedule.cpp Outdated Show resolved Hide resolved
}
#endif
}
if (!rFirst) return ~static_cast<decltype(micros())>(0) >> 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~static_cast<decltype(micros())>(0) >> 1 should be usefully replaced by std::numeric_limits<decltype(micros())>::max()

Why not using using micros_t = decltype(micros())
and using micros_signed_t = std::make_signed_t<micros_t> ?

Copy link
Contributor Author

@dok-net dok-net Jul 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~static_cast<decltype(micros())>(0) >> 1 should be usefully replaced by std::numeric_limits<decltype(micros())>::max()

The results are dramatically different, unless also >> 1. I feel that the purpose becomes less "obvious" when working with "max", which had given thought and discarded due to that reason. The wrap-over logic is a very technical detail.

Why not using using micros_t = decltype(micros()) and using micros_signed_t = std::make_signed_t<micros_t> ?

I am really not a fan of aliasing. If your change is a prerequisite to merging, re-request, please.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wrap-over logic is a very technical detail.

Really?

The fact is that this code does not compile without warnings in 64 bit host mode.

How would your logic be different without >>1 ?
Or from rTarget = micros() + (~static_cast<decltype(micros())>(0) >> 1) to rTarget = micros() - 1

I tried this and it seems to works with a trivial example.
Demixing decltype(micros()) and uint32 appears to be necessary.

--- a/cores/esp8266/Schedule.cpp
+++ b/cores/esp8266/Schedule.cpp
@@ -146,7 +146,7 @@ bool schedule_recurrent_function_us(const std::function<bool(void)>& fn,
 
 uint32_t get_scheduled_recurrent_delay_us()
 {
-    if (!rFirst) return ~static_cast<decltype(micros())>(0) >> 1;
+    if (!rFirst) return std::numeric_limits<uint32_t>::max();
     // handle already expired rTarget.
     const int32_t remaining = rTarget - micros();
     return (remaining > 0) ? static_cast<uint32_t>(remaining) : 0;
@@ -154,7 +154,7 @@ uint32_t get_scheduled_recurrent_delay_us()
 
 uint32_t get_scheduled_delay_us()
 {
-    return sFirst ? 0 : ~static_cast<decltype(micros())>(0) >> 1;
+    return sFirst ? 0 : std::numeric_limits<uint32_t>::max();
 }
 
 void run_scheduled_functions()
@@ -223,7 +223,7 @@ void run_scheduled_recurrent_functions()
 
     // prevent scheduling of new functions during this run
     stop = rLast;
-    rTarget = micros() + (~static_cast<decltype(micros())>(0) >> 1);
+    rTarget = micros() - 1;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to think that the new comment at HALF_MAX_MICROS explains everything. If it doesn't, let's improve the wording together.

cores/esp8266/Schedule.cpp Outdated Show resolved Hide resolved
cores/esp8266/Schedule.cpp Outdated Show resolved Hide resolved
@dok-net dok-net changed the title Fix #8947: delay(0) must yield(); recurrent scheduled functions max delay fixed recurrent scheduled functions max delay update Jul 19, 2023
@dok-net dok-net force-pushed the fix_8947 branch 3 times, most recently from 72381c8 to 4486191 Compare July 27, 2023 19:07
@dok-net dok-net marked this pull request as draft July 27, 2023 22:56
@dok-net dok-net marked this pull request as ready for review July 30, 2023 08:15
@dok-net dok-net requested a review from d-a-v August 1, 2023 06:06
Copy link
Collaborator

@d-a-v d-a-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run this test:

#include <Schedule.h>

void setup ()
{
    Serial.begin(74880);
    Serial.println();
    Serial.println("result in usec:");
    
    for (int i = 0; i < 4; i++)
        schedule_recurrent_function_us([]()
        {
            return true;
        }, 1000 + (300 * i) /*us*/);
}

uint64_t between;
uint32_t count;
decltype(esp_get_cycle_count()) last, now;

void loop ()
{
    now = esp_get_cycle_count();
    
    if (!(count & 0xfffff))
        Serial.printf("%f\n", between / (1.0 * F_CPU * count / 1000000));

    between += now - last;
    count++;

    last = esp_get_cycle_count();
}

It is designed to count time passed by core between loops.
I get 12us with master and 23us with this PR.

@dok-net
Copy link
Contributor Author

dok-net commented Sep 7, 2023

@d-a-v I'm beginning to wonder if the test methodology is off target: we are going the extra mile to extract a good delay duration - that is what this PR and yours are all about. But the test doesn't focus on the "quality" at all. I know it becomes more complicated to extract a metric that matters, but without delaying, and perhaps measuring the effect on power consumption, it seems without much value to the overall picture. What do you think?

In the meantime I have taken some USB power measurements. It's well known that a lot of energy on the breakout and evaluation boards is used for the peripherals. In summary, I have not found difference in power consumption that are worth mentioning, using the MCVE alone. It's all a bit sobering :-) Maybe we should explore the effects of computing a maximum delay vs. not computing one at all and using a fixed value in the first place?

@d-a-v
Copy link
Collaborator

d-a-v commented Nov 7, 2023

What I understand from your message

the test methodology is off target
the test doesn't focus on the "quality" at all
measure the effect on power consumption (<= no difference after measurement)

Can you explain what is "quality" if it is not speed of execution, which is directly related to the number of instructions executed between loops. As you found no difference on power consumption we can deduce that internal work between loops consume twice more energy for the same purpose. I am aware that your changes deal with usecs where current code is only msecs-aware. Current code uses a cache, yours recalculate bounds at every loop. Can't you change this ? Or should we update the current code to use usecs instead ?

we should explore the effects of computing a maximum delay vs. not computing one at all and using a fixed value in the first place

This looks like missing the target, or I don't understand this point.

… - that's wasted time.

Expired keeps executing yield policy on already expired oneShot - that's counter intuitive and wasted time.
…and expiredRetrigger().

_oneShotExpired qualified as mutable member.
This is done as intermediate fix for use in derived class ESP8266WiFiMesh ExpiringTimeTracker.
@dok-net
Copy link
Contributor Author

dok-net commented Jan 5, 2024

@d-a-v unfortunately I have to focus my attention on other projects. While I think there's still something strange about using gcd, my current proposal has a significant performance impact and would need fixing.
Should we leave this MR standing (as draft?) or shall I withdraw it completely?

@dok-net dok-net marked this pull request as draft January 5, 2024 09:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

delay(0) behavior changed
2 participants