-
Notifications
You must be signed in to change notification settings - Fork 425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YouTube] Fixes for n param deobfuscation function #1253
base: dev
Are you sure you want to change the base?
Conversation
…ixup function to prevent early return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
// Pattern.compile(SINGLE_CHAR_VARIABLE_REGEX + "=\"nn\"\\[\\+" + MULTIPLE_CHARS_REGEX | ||
// + "\\." + MULTIPLE_CHARS_REGEX + "]," + MULTIPLE_CHARS_REGEX + "\\(" | ||
// + MULTIPLE_CHARS_REGEX + "\\)," + MULTIPLE_CHARS_REGEX + "=" | ||
// + MULTIPLE_CHARS_REGEX + "\\." + MULTIPLE_CHARS_REGEX + "\\[" | ||
// + MULTIPLE_CHARS_REGEX + "]\\|\\|null\\).+\\|\\|(" + MULTIPLE_CHARS_REGEX | ||
// + ")\\(\"\"\\)"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you remove the previous regex? It's better if you keep it, so in case YouTube reverts something it's already there and it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.
This pattern is prone to find wrong functions.
Imagine this:
- NewPipeExtractor searches for the n deobfuscation function
- the first regex doesn't work so it uses the second one (the one I commented out)
- the (commented out) regex finds the wrong function
- NewPipeExtractor runs the wrong function and gets the wrong n parameter
- User gets 403
- No exception is thrown in NewPipeExtractor so it is harder to find the correct place which needs fixing
2.
The regex which comes after the one commented out already finds a very similar pattern, only different groups and more specific.
In this example:
a.D&&(b="nn"[+a.D],WL(a),c=a.j[b]||null)&&(c=SDa[0](c),a.set(b,c),SDa.length||Wma("")
Looking at different versions of the player code it seems more often correct to use the function which is in SDa[0]
but using Wma
in this case would find the wrong function in newer versions.
Going for SDa[0]
(and searching for what is in the array afterwards) seems more robust across multiple versions. The next regex already catches that.
From what I have seen I think if the regex which I commented out is added back in it should be swapped with the next one to hopefully prevent false positives more often.
(Also it might be a good idea to add some additional checks to validate it found the correct function but I consider this out of scope for this PR.)
But I have to say I'm very new to this YT/NewPipeExtractor stuff and only looked at some but not all player version so I might lack some experience with YTs past which means I might be wrong. I can only talk about what I have seen so far so if you know better feel free to correct me.
My suggestion for now:
I will add it back in but swap it with the next one.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Also it might be a good idea to add some additional checks to validate it found the correct function but I consider this out of scope for this PR.)
That's a pretty good idea, however if you are using other clients not requiring deobfuscation like it is the case currently (we do not use HTML5 clients for now except for age-restricted videos, but this is a broken workaround that will be removed soon), you are preventing streams extraction. This already happened in past. A general error/logging warning is something that would fit the best.
My suggestion for now:
I will add it back in but swap it with the next one.
What do you think?
Keep the current third version as it is and move the current first one, potentially finding the wrong function, at the third place and the current regexes from this position to the last one after the new third regex. Let me know if I am not clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I have done now compared to the current code on the dev
branch:
- swap the first two regexes
- insert the new regex in third place
Is this correct?
...main/java/org/schabi/newpipe/extractor/services/youtube/YoutubeThrottlingParameterUtils.java
Show resolved
Hide resolved
Is it really needed to add a new regex matching partially what the current second one does? I think it could increase the chances to extract the wrong function, as there is nothing specific about the |
The old regexes don't find any matches and the current player js does not contain the The chance of extracting the wrong function in the future will probably never be zero. |
… is a function which takes one parameter
Fixes #1252
Changes: