Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changed node:work() signature #105

Merged
merged 1 commit into from
Jul 11, 2023

Conversation

RalphSteinhagen
Copy link
Member

... to {requested, performed, status} work(std::size requested) {..}

This is preparatory to support:

  • limit amount of work to be processed based on explicit scheduler constraints, i.e.
    • optimise for latency
    • optimise for throughput
    • optimise for actual processing time (i.e. if '1k work' takes n-ms, then schedule next time 10k to minimise work overhead or vice verse)
  • transparent blocking IO calls: of BlockingIO is declared node-implicitly runs internal work function on independent io thread (injected by graph/scheduler tbd.) -- separation between public and internal work function required for decoupling.

N.B. for non-blocking blocks nothing changes compared to before.

@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
@RalphSteinhagen RalphSteinhagen temporarily deployed to configure coverage July 10, 2023 14:55 — with GitHub Actions Inactive
Copy link
Member

@wirew0rm wirew0rm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good code-wise, thanks for pushing this though and going through the work of changing everything to fit together again 👍

I have a few small notes/questions:

Why is requested_work returned in the result struct? is there any case where it is/would be different from the input parameter of the same name?

Also is performed_work always the number of processed samples or just an arbitrary number. In the first case renaming to processed_samples would be more clear, in the other case (which i think is the correct one, because it is also meaningful for decimating blocks and blocks with inputs of different sample rate) maybe this should be explicitly stated so that users do not draw wrong conclusions.

Finally the described scheduler use-case depends on there being a direct dependency between requested/performed_work and execution time (which the author of the work function has to ensure), if the performed work is not linear this makes the naming again somewhat misleading.

Overall there seem to be 2 concepts in one, one where a block gets a certain amount of "time" as a deadline and one where a block's work function can be deterministically scaled. They are very similar, but need to be treated differently by the scheduler. The scheduler can probably handle this by treating blocking and non-blocking nodes differently.

But in general I like the interface and we can always tweak further once we have a useful number of implemented blocks and schedulers to find out.

@wirew0rm wirew0rm merged commit 81c8e8e into main Jul 11, 2023
@wirew0rm wirew0rm deleted the work_expected_upgrade_and_blocking_IO_prep branch July 11, 2023 16:04
@RalphSteinhagen
Copy link
Member Author

Why is requested_work returned in the result struct? is there any case where it is/would be different from the input parameter of the same name?

For consistency reasons because -- in a multi-threaded environment -- by the time the return value is returned to the thread handled by the scheduler, it isn't clear anymore which requested work the returned value corresponded to. To note: this is used for cases where the blocking IO work function is executed independently from the scheduler working pool.

Also is performed_work always the number of processed samples or just an arbitrary number.

For cases/blocks that are deterministic and linked to a fixed input->output ratio this could be the process input samples (80% use-case)... but doesn't have to be. There are cases where the input rates are variable and/or the work performed by the block is not linked to the input samples where a more generic "work" measure is needed ... I thought it is unlikely that we'll find a one-fits-all solution because there are some hard corner cases.

Finally the described scheduler use-case depends on there being a direct dependency between requested/performed_work and execution time (which the author of the work function has to ensure), if the performed work is not linear this makes the naming again somewhat misleading.

Not necessarily linear dependence though. The execution time is also highly subjective and hard to specify/control from a block-implementer point of view.

Overall there seem to be 2 concepts in one, one where a block gets a certain amount of "time" as a deadline and one where a block's work function can be deterministically scaled.

I think that we can handle and conflate both cases to one. Counting and keeping an integer definition consistent within the same block is easier than time-keeping since there a lot of other aspects (which clock? granularity, dead-time/latencies, etc.) that need to be defined when using 'time' as a general constraint.

Using integers should be flexible enough.... and also ... if this doesn't fit we could change it. I needed the interface changes primarily for the blocking IO implementation and at least the sample-type definition to accumulate multiple work executions that are queried by the scheduler potentially only as a grouped execution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants