-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Proposal] remove most rxjs-related code from the SegmentBuffers and transport code #916
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
peaBerberian
added
work-in-progress
This Pull Request or issue is not finished yet
Refacto
This Pull Request changes a lot of RxPlayer's code and/or logic
proposal
This Pull Request or Issue is only a proposal for a change with the expectation of a debate on it
SegmentBuffers
Relative to the `SegmentBuffers` media buffers
labels
Mar 9, 2021
peaBerberian
commented
Mar 9, 2021
@@ -26,7 +26,9 @@ import { | |||
mergeMap, | |||
} from "rxjs/operators"; | |||
import log from "../../log"; | |||
import observablifyCancellablePromise from "../../utils/observablify_promise"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very ugly name, I know. Though I wanted to communicate the fact that a TaskCanceller
was used here (but perhaps "observablify" is the uglier part of the name :p)
peaBerberian
changed the title
remove most rxjs-related code from the SegmentBuffers
[Proposal] remove most rxjs-related code from the SegmentBuffers
Mar 10, 2021
peaBerberian
force-pushed
the
misc/no-rxjs-poc
branch
from
March 10, 2021 18:00
f1b41ff
to
60fe3fb
Compare
peaBerberian
force-pushed
the
misc/no-rxjs-poc
branch
from
March 26, 2021 20:57
e638ded
to
f29def3
Compare
peaBerberian
force-pushed
the
misc/no-rxjs-poc
branch
6 times, most recently
from
April 9, 2021 10:30
4f4b7af
to
eb3c5e8
Compare
peaBerberian
changed the title
[Proposal] remove most rxjs-related code from the SegmentBuffers
[Proposal] remove most rxjs-related code from the SegmentBuffers and transport code
May 4, 2021
peaBerberian
added a commit
that referenced
this pull request
May 4, 2021
Every segment parser in `src/transports` (which allows to parse things like data, encryption metadata and time information from loaded segments according to the current streaming protocol) do so synchronously. However, those are typed as Observables. My guess (this was before my time on the project) is that it was mainly for two reasons: 1. Manifest parsing can be asynchronous (for example to fetch MPD xlinks or the clock) and it follows the same pipeline architecture (a loader and a parser) than segments. Letting segment parser be asynchronous thus allows to harmonize all resource-fetching code to a similar interface. 2. Theorically, segment parsing could be asynchronous. As far as I can see for now, this could be because of two reasons: - We need to perform an HTTPS request to fully parse segment information. We already encountered such case when trying to implement the "sidx ref type 1" feature (#754). However we did not merge that work for now due to the complexity and most of all because no real DASH MPD seems to rely on this feature (nor ar we able to envisage a real use-case for it). - segment parsing is performed in a Worker. But We do not feel the need to do that for now as the CPU footprint of parsing segments is really low. So theorically, there's good reasons to make the parsing operation asynchronous, but reastically for now, there's none. Meanwhile, making that operation asynchronous lead to big headaches: - it is one of the main reasons why the PR on making the init and first media segment's requests in parallel [#918] was not merged. Subtle bugs, some that would only be seen in a case where the parsing is asynchronous (so not really existant for now) could arise because of that. - Even if the response of the parser is wrapped in an Observable, it wasn't lazy (the code would be called even if the parser was not subscribed to). We could fix that by wrapping all parsers in a `defer` call from RxJS, but it would complexify even more the code. - Handling Observables is a LOT harder than just using directly a response in general. There's pending work about not depending so much on RxJS in the future [#916] and I think beginning to remove it where it is not even needed is a good first step. If it become needed again in the future (e.g. to support the "sidx ref type 1" feature), we could always revert that work, maybe even with a more Promise-based solution which might be much easier to reason about. Thoughts?
Every segment parser in `src/transports` (which allows to parse things like data, encryption metadata and time information from loaded segments according to the current streaming protocol) do so synchronously. However, those are typed as Observables. My guess (this was before my time on the project) is that it was mainly for two reasons: 1. Manifest parsing can be asynchronous (for example to fetch MPD xlinks or the clock) and it follows the same pipeline architecture (a loader and a parser) than segments. Letting segment parser be asynchronous thus allows to harmonize all resource-fetching code to a similar interface. 2. Theorically, segment parsing could be asynchronous. As far as I can see for now, this could be because of two reasons: - We need to perform an HTTPS request to fully parse segment information. We already encountered such case when trying to implement the "sidx ref type 1" feature (#754). However we did not merge that work for now due to the complexity and most of all because no real DASH MPD seems to rely on this feature (nor ar we able to envisage a real use-case for it). - segment parsing is performed in a Worker. But We do not feel the need to do that for now as the CPU footprint of parsing segments is really low. So theorically, there's good reasons to make the parsing operation asynchronous, but reastically for now, there's none. Meanwhile, making that operation asynchronous lead to big headaches: - it is one of the main reasons why the PR on making the init and first media segment's requests in parallel [#918] was not merged. Subtle bugs, some that would only be seen in a case where the parsing is asynchronous (so not really existant for now) could arise because of that. - Even if the response of the parser is wrapped in an Observable, it wasn't lazy (the code would be called even if the parser was not subscribed to). We could fix that by wrapping all parsers in a `defer` call from RxJS, but it would complexify even more the code. - Handling Observables is a LOT harder than just using directly a response in general. There's pending work about not depending so much on RxJS in the future [#916] and I think beginning to remove it where it is not even needed is a good first step. If it become needed again in the future (e.g. to support the "sidx ref type 1" feature), we could always revert that work, maybe even with a more Promise-based solution which might be much easier to reason about. Thoughts?
This commit proposes a simplification of types used by the segment parsers by defining a unique segment data type instead of two: one for the init segment and the other for media segments. As `SegmentBuffers` - on which those segment data will be pushed - are only generic based on a single segment data type, this is not really necessary anyway. The real reason behind creating two types was to better indicate that text and image Representation had no initialization segment (by setting their type to `null`) but considering that a parsed media segment could theoretically be `null` as well, nothing is lost by using the same type for both. Being generic over a single type simplify the way we approach segment parsers in my opinion. Now we're thinking: "which type is the parsed segment data in?" not "which type is a parsed init segment data and which type is a media segment data?". Both should generally be in the same format and the only difference until now was that one could be always `null`. Also, considering that media segments could theorically contain initialization data themselves, this is one more argument in the direction of only defining a single type. --- Another improvement in that commit is to limit `any` typings to the small `segment_fetcher_creator.ts` file. Doing this there should reduce possible typing errors as segment loader and parsers are now properly type-checked (both their inputs and outputs).
This commit is a first attempt to remove most RxJS-related code from the RxPlayer code. Here it is mainly done for the code inside `src/core/segment_buffers`. There are multiple reasons for this weird new idea: 1. Observables in general are complex beasts. By default, they are lazy, they are called as much times as they are subscribed to, they are "teared down" (all linked side-effects such as event listeners are cleaned-up) automatically on unsubscription, depending on the scheduler used they may emit or be subscribed to in a synchronous or multiple types of asynchronous ways, they keep being open until you either complete them, emit an error or unsubscribe from them - when you're not careful, you thus have a chance of leaking memory. Even if we understand it well, we still make frequent mistakes with it, some having pretty big consequences (e.g. big memory leak in #913, API events sent too much time in #850 - well to be honest - I am guilty on both I think!). And that's only talking about us, with multiple years of experience with it. Newer developper may find it even more complex to understand. 2. Observables are IMO too much used in the code, even when `Promises` seems like a simpler and better solution (for example when there is only one item emitted). 3. They are too risky (in my opinion) when we're talking about code with important side-effects (e.g. HTTP requests, SourceBuffer interactions, API interactions) due to the easyness with which we can perform them multiple times by mistake. 4. I find Observables tear-down too implicit (the fact that it happens silently on unsubscription, mostly) for what we use it for: Unsubscribing from some Observables will stop the content, others will stop segments from being pushed, others will cancel requests. Here I would prefer a much more explicit solution: one where we explicitly write a line of code anouncing that operation are aborted at the time we abort them. 5. Call stacks are unreadable and too long currently. I don't know if we ever exploited any call stack information because of that. The fact that call stacks are unreadable seems to be directly linked to the heavy usage of RxJS operators. 6. Observables cannot be used with async/await. We often end-up with a huge callback hell when inner Observables contain other inner Observables and so one, whereas it could in effect be much simpler. I still think that there are real good use cases for RxJS in the RxPlayer, for example I found the text synchronization code in the `HTMLTextSegmentBuffer` really simple to grasp and functional. In this commit, I also created the `TaskCanceller` util, which allows to implement explicit Promise cancellation roughly the same way that is is done for the native `fetch` API (on that note, I was heavily inspired by the `AbortController` interface - used to cancel fetch operations - to write it).
peaBerberian
force-pushed
the
misc/no-rxjs-poc
branch
from
May 11, 2021 08:41
eb3c5e8
to
b91af53
Compare
peaBerberian
added a commit
that referenced
this pull request
May 19, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
May 19, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
May 19, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
May 19, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
May 20, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
May 20, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
May 20, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
May 20, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
Closed for now in profit of #962 |
peaBerberian
added a commit
that referenced
this pull request
May 25, 2021
Every segment parser in `src/transports` (which allows to parse things like data, encryption metadata and time information from loaded segments according to the current streaming protocol) do so synchronously. However, those are typed as Observables. My guess (this was before my time on the project) is that it was mainly for two reasons: 1. Manifest parsing can be asynchronous (for example to fetch MPD xlinks or the clock) and it follows the same pipeline architecture (a loader and a parser) than segments. Letting segment parser be asynchronous thus allows to harmonize all resource-fetching code to a similar interface. 2. Theorically, segment parsing could be asynchronous. As far as I can see for now, this could be because of two reasons: - We need to perform an HTTPS request to fully parse segment information. We already encountered such case when trying to implement the "sidx ref type 1" feature (#754). However we did not merge that work for now due to the complexity and most of all because no real DASH MPD seems to rely on this feature (nor ar we able to envisage a real use-case for it). - segment parsing is performed in a Worker. But We do not feel the need to do that for now as the CPU footprint of parsing segments is really low. So theorically, there's good reasons to make the parsing operation asynchronous, but reastically for now, there's none. Meanwhile, making that operation asynchronous lead to big headaches: - it is one of the main reasons why the PR on making the init and first media segment's requests in parallel [#918] was not merged. Subtle bugs, some that would only be seen in a case where the parsing is asynchronous (so not really existant for now) could arise because of that. - Even if the response of the parser is wrapped in an Observable, it wasn't lazy (the code would be called even if the parser was not subscribed to). We could fix that by wrapping all parsers in a `defer` call from RxJS, but it would complexify even more the code. - Handling Observables is a LOT harder than just using directly a response in general. There's pending work about not depending so much on RxJS in the future [#916] and I think beginning to remove it where it is not even needed is a good first step. If it become needed again in the future (e.g. to support the "sidx ref type 1" feature), we could always revert that work, maybe even with a more Promise-based solution which might be much easier to reason about. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
May 25, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
May 26, 2021
Every segment parser in `src/transports` (which allows to parse things like data, encryption metadata and time information from loaded segments according to the current streaming protocol) do so synchronously. However, those are typed as Observables. My guess (this was before my time on the project) is that it was mainly for two reasons: 1. Manifest parsing can be asynchronous (for example to fetch MPD xlinks or the clock) and it follows the same pipeline architecture (a loader and a parser) than segments. Letting segment parser be asynchronous thus allows to harmonize all resource-fetching code to a similar interface. 2. Theorically, segment parsing could be asynchronous. As far as I can see for now, this could be because of two reasons: - We need to perform an HTTPS request to fully parse segment information. We already encountered such case when trying to implement the "sidx ref type 1" feature (#754). However we did not merge that work for now due to the complexity and most of all because no real DASH MPD seems to rely on this feature (nor ar we able to envisage a real use-case for it). - segment parsing is performed in a Worker. But We do not feel the need to do that for now as the CPU footprint of parsing segments is really low. So theorically, there's good reasons to make the parsing operation asynchronous, but reastically for now, there's none. Meanwhile, making that operation asynchronous lead to big headaches: - it is one of the main reasons why the PR on making the init and first media segment's requests in parallel [#918] was not merged. Subtle bugs, some that would only be seen in a case where the parsing is asynchronous (so not really existant for now) could arise because of that. - Even if the response of the parser is wrapped in an Observable, it wasn't lazy (the code would be called even if the parser was not subscribed to). We could fix that by wrapping all parsers in a `defer` call from RxJS, but it would complexify even more the code. - Handling Observables is a LOT harder than just using directly a response in general. There's pending work about not depending so much on RxJS in the future [#916] and I think beginning to remove it where it is not even needed is a good first step. If it become needed again in the future (e.g. to support the "sidx ref type 1" feature), we could always revert that work, maybe even with a more Promise-based solution which might be much easier to reason about. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
May 26, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Jun 17, 2021
Every segment parser in `src/transports` (which allows to parse things like data, encryption metadata and time information from loaded segments according to the current streaming protocol) do so synchronously. However, those are typed as Observables. My guess (this was before my time on the project) is that it was mainly for two reasons: 1. Manifest parsing can be asynchronous (for example to fetch MPD xlinks or the clock) and it follows the same pipeline architecture (a loader and a parser) than segments. Letting segment parser be asynchronous thus allows to harmonize all resource-fetching code to a similar interface. 2. Theorically, segment parsing could be asynchronous. As far as I can see for now, this could be because of two reasons: - We need to perform an HTTPS request to fully parse segment information. We already encountered such case when trying to implement the "sidx ref type 1" feature (#754). However we did not merge that work for now due to the complexity and most of all because no real DASH MPD seems to rely on this feature (nor ar we able to envisage a real use-case for it). - segment parsing is performed in a Worker. But We do not feel the need to do that for now as the CPU footprint of parsing segments is really low. So theorically, there's good reasons to make the parsing operation asynchronous, but reastically for now, there's none. Meanwhile, making that operation asynchronous lead to big headaches: - it is one of the main reasons why the PR on making the init and first media segment's requests in parallel [#918] was not merged. Subtle bugs, some that would only be seen in a case where the parsing is asynchronous (so not really existant for now) could arise because of that. - Even if the response of the parser is wrapped in an Observable, it wasn't lazy (the code would be called even if the parser was not subscribed to). We could fix that by wrapping all parsers in a `defer` call from RxJS, but it would complexify even more the code. - Handling Observables is a LOT harder than just using directly a response in general. There's pending work about not depending so much on RxJS in the future [#916] and I think beginning to remove it where it is not even needed is a good first step. If it become needed again in the future (e.g. to support the "sidx ref type 1" feature), we could always revert that work, maybe even with a more Promise-based solution which might be much easier to reason about. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Jun 21, 2021
Every segment parser in `src/transports` (which allows to parse things like data, encryption metadata and time information from loaded segments according to the current streaming protocol) do so synchronously. However, those are typed as Observables. My guess (this was before my time on the project) is that it was mainly for two reasons: 1. Manifest parsing can be asynchronous (for example to fetch MPD xlinks or the clock) and it follows the same pipeline architecture (a loader and a parser) than segments. Letting segment parser be asynchronous thus allows to harmonize all resource-fetching code to a similar interface. 2. Theorically, segment parsing could be asynchronous. As far as I can see for now, this could be because of two reasons: - We need to perform an HTTPS request to fully parse segment information. We already encountered such case when trying to implement the "sidx ref type 1" feature (#754). However we did not merge that work for now due to the complexity and most of all because no real DASH MPD seems to rely on this feature (nor ar we able to envisage a real use-case for it). - segment parsing is performed in a Worker. But We do not feel the need to do that for now as the CPU footprint of parsing segments is really low. So theorically, there's good reasons to make the parsing operation asynchronous, but reastically for now, there's none. Meanwhile, making that operation asynchronous lead to big headaches: - it is one of the main reasons why the PR on making the init and first media segment's requests in parallel [#918] was not merged. Subtle bugs, some that would only be seen in a case where the parsing is asynchronous (so not really existant for now) could arise because of that. - Even if the response of the parser is wrapped in an Observable, it wasn't lazy (the code would be called even if the parser was not subscribed to). We could fix that by wrapping all parsers in a `defer` call from RxJS, but it would complexify even more the code. - Handling Observables is a LOT harder than just using directly a response in general. There's pending work about not depending so much on RxJS in the future [#916] and I think beginning to remove it where it is not even needed is a good first step. If it become needed again in the future (e.g. to support the "sidx ref type 1" feature), we could always revert that work, maybe even with a more Promise-based solution which might be much easier to reason about. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Jun 22, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Jun 24, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Jun 24, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Jul 27, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Nov 23, 2021
Every segment parser in `src/transports` (which allows to parse things like data, encryption metadata and time information from loaded segments according to the current streaming protocol) do so synchronously. However, those are typed as Observables. My guess (this was before my time on the project) is that it was mainly for two reasons: 1. Manifest parsing can be asynchronous (for example to fetch MPD xlinks or the clock) and it follows the same pipeline architecture (a loader and a parser) than segments. Letting segment parser be asynchronous thus allows to harmonize all resource-fetching code to a similar interface. 2. Theorically, segment parsing could be asynchronous. As far as I can see for now, this could be because of two reasons: - We need to perform an HTTPS request to fully parse segment information. We already encountered such case when trying to implement the "sidx ref type 1" feature (#754). However we did not merge that work for now due to the complexity and most of all because no real DASH MPD seems to rely on this feature (nor ar we able to envisage a real use-case for it). - segment parsing is performed in a Worker. But We do not feel the need to do that for now as the CPU footprint of parsing segments is really low. So theorically, there's good reasons to make the parsing operation asynchronous, but reastically for now, there's none. Meanwhile, making that operation asynchronous lead to big headaches: - it is one of the main reasons why the PR on making the init and first media segment's requests in parallel [#918] was not merged. Subtle bugs, some that would only be seen in a case where the parsing is asynchronous (so not really existant for now) could arise because of that. - Even if the response of the parser is wrapped in an Observable, it wasn't lazy (the code would be called even if the parser was not subscribed to). We could fix that by wrapping all parsers in a `defer` call from RxJS, but it would complexify even more the code. - Handling Observables is a LOT harder than just using directly a response in general. There's pending work about not depending so much on RxJS in the future [#916] and I think beginning to remove it where it is not even needed is a good first step. If it become needed again in the future (e.g. to support the "sidx ref type 1" feature), we could always revert that work, maybe even with a more Promise-based solution which might be much easier to reason about. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Nov 23, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Nov 24, 2021
Every segment parser in `src/transports` (which allows to parse things like data, encryption metadata and time information from loaded segments according to the current streaming protocol) do so synchronously. However, those are typed as Observables. My guess (this was before my time on the project) is that it was mainly for two reasons: 1. Manifest parsing can be asynchronous (for example to fetch MPD xlinks or the clock) and it follows the same pipeline architecture (a loader and a parser) than segments. Letting segment parser be asynchronous thus allows to harmonize all resource-fetching code to a similar interface. 2. Theorically, segment parsing could be asynchronous. As far as I can see for now, this could be because of two reasons: - We need to perform an HTTPS request to fully parse segment information. We already encountered such case when trying to implement the "sidx ref type 1" feature (#754). However we did not merge that work for now due to the complexity and most of all because no real DASH MPD seems to rely on this feature (nor ar we able to envisage a real use-case for it). - segment parsing is performed in a Worker. But We do not feel the need to do that for now as the CPU footprint of parsing segments is really low. So theorically, there's good reasons to make the parsing operation asynchronous, but reastically for now, there's none. Meanwhile, making that operation asynchronous lead to big headaches: - it is one of the main reasons why the PR on making the init and first media segment's requests in parallel [#918] was not merged. Subtle bugs, some that would only be seen in a case where the parsing is asynchronous (so not really existant for now) could arise because of that. - Even if the response of the parser is wrapped in an Observable, it wasn't lazy (the code would be called even if the parser was not subscribed to). We could fix that by wrapping all parsers in a `defer` call from RxJS, but it would complexify even more the code. - Handling Observables is a LOT harder than just using directly a response in general. There's pending work about not depending so much on RxJS in the future [#916] and I think beginning to remove it where it is not even needed is a good first step. If it become needed again in the future (e.g. to support the "sidx ref type 1" feature), we could always revert that work, maybe even with a more Promise-based solution which might be much easier to reason about. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Nov 24, 2021
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Jan 26, 2022
Every segment parser in `src/transports` (which allows to parse things like data, encryption metadata and time information from loaded segments according to the current streaming protocol) do so synchronously. However, those are typed as Observables. My guess (this was before my time on the project) is that it was mainly for two reasons: 1. Manifest parsing can be asynchronous (for example to fetch MPD xlinks or the clock) and it follows the same pipeline architecture (a loader and a parser) than segments. Letting segment parser be asynchronous thus allows to harmonize all resource-fetching code to a similar interface. 2. Theorically, segment parsing could be asynchronous. As far as I can see for now, this could be because of two reasons: - We need to perform an HTTPS request to fully parse segment information. We already encountered such case when trying to implement the "sidx ref type 1" feature (#754). However we did not merge that work for now due to the complexity and most of all because no real DASH MPD seems to rely on this feature (nor ar we able to envisage a real use-case for it). - segment parsing is performed in a Worker. But We do not feel the need to do that for now as the CPU footprint of parsing segments is really low. So theorically, there's good reasons to make the parsing operation asynchronous, but reastically for now, there's none. Meanwhile, making that operation asynchronous lead to big headaches: - it is one of the main reasons why the PR on making the init and first media segment's requests in parallel [#918] was not merged. Subtle bugs, some that would only be seen in a case where the parsing is asynchronous (so not really existant for now) could arise because of that. - Even if the response of the parser is wrapped in an Observable, it wasn't lazy (the code would be called even if the parser was not subscribed to). We could fix that by wrapping all parsers in a `defer` call from RxJS, but it would complexify even more the code. - Handling Observables is a LOT harder than just using directly a response in general. There's pending work about not depending so much on RxJS in the future [#916] and I think beginning to remove it where it is not even needed is a good first step. If it become needed again in the future (e.g. to support the "sidx ref type 1" feature), we could always revert that work, maybe even with a more Promise-based solution which might be much easier to reason about. Thoughts?
peaBerberian
added a commit
that referenced
this pull request
Jan 26, 2022
After doing a proof-of-concept looking at how some parts of the code looks like without RxJS (#916), this is a first functional proposal which looks good enough to me to be merged. It removes all RxJS code from the `transports` code in `src/transports`. As a reminder, the reasons for doing this are: 1. Observables are complicated and full of implicit behaviors (lazily running, sync or async, unsubscribing automatically after the last unsubscription etc.) which is difficult to reason about, especially for a newcomer. Things like exploiting schedulers through the `deferSubscriptions` util to work-around some subtle potential race-conditions, or merging Observables in a specific order for similar reasons, are ugly hacks that are difficult to explain to someone not familiar with that code. Even for us with multiple years of experience with it, we sometimes struggle with it. 2. Promises, event listeners - and direct callbacks in general - are generally much more explicit and most developpers (at least JS/TS devs) are familiar with them. 3. Call stacks are close to inexploitable when using RxJS. 4. Promises provide async/await syntax which can improve drastically the readability of our async-heavy code, which for the moment suffer from callback hells almost everywhere. However, I'm still not sure if this wish (getting rid of RxJS) is shared by other maintainers and/or contributors, so it is still only a proposal. Thoughts?
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
proposal
This Pull Request or Issue is only a proposal for a change with the expectation of a debate on it
Refacto
This Pull Request changes a lot of RxPlayer's code and/or logic
SegmentBuffers
Relative to the `SegmentBuffers` media buffers
work-in-progress
This Pull Request or issue is not finished yet
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR is a first attempt to remove most RxJS-related code from the RxPlayer code. Here it is mainly done for the code inside
src/core/segment_buffers
.There are multiple reasons for this weird new idea:
Observables in general are complex beasts. By default, they are lazy, they are called as much times as they are subscribed to, they are "teared down" (all linked side-effects such as event listeners are cleaned-up) automatically on unsubscription,
depending on the scheduler used they may emit or be subscribed to in a synchronous or multiple types of asynchronous ways, they keep being open until you either complete them, emit an error or unsubscribe from them - when you're not careful, you thus have a chance of leaking memory.
Even if we understand it well, we still make frequent mistakes with it, some having pretty big consequences (e.g. big memory leak in fix memory leak in RepresentationStream #913, API events sent too much time in eme/api: stop sending eme-related events twice #850 - well to be honest - I am guilty on both I think!).
And that's only talking about us, with multiple years of experience with it. Newer developper may find it even more complex to understand.
Observables are IMO too much used in the code, even when
Promises
seems like a simpler and better solution (for example when there is only one item emitted).They are too risky (in my opinion) when we're talking about code with important side-effects (e.g. HTTP requests, SourceBuffer interactions, API interactions) due to the easyness with which we can perform them multiple times by mistake.
I find Observables tear-down too implicit (the fact that it happens silently on unsubscription, mostly) for what we use it for:
Unsubscribing from some Observables will stop the content, others will stop segments from being pushed, others will cancel requests.
Here I would prefer a much more explicit solution: one where we explicitly write a line of code anouncing that operations are aborted at the time we abort them.
Call stacks are unreadable and too long currently. I don't know if we ever exploited any call stack information because of that.
The fact that call stacks are unreadable seems to be directly linked to the heavy usage of RxJS operators.
Observables cannot be used with async/await. We often end-up with a huge callback hell when inner Observables contain other inner Observables and so one, whereas it could in effect be much simpler.
I still think that there are real good use cases for RxJS in the RxPlayer, for example I found the text synchronization code in the
HTMLTextSegmentBuffer
really simple to grasp and functional.In this PR, I also created the
TaskCanceller
util, which allows to implement explicit Promise cancellation roughly the same way that is is done for the nativefetch
API (on that note, I was heavily inspired by theAbortController
interface - used to cancel fetch operations - to write it).Thoughts?