Video related primitives should be provided in scraperlist #194

Open
kelson42 opened this issue Sep 19, 2024 · 3 comments
Labels
enhancement New feature or request question Further information is requested

@kelson42
Contributor

We are publishing more and more ZIM files with videos using many different scrapers.

To do that, we mainly:

  • Re-encode video/audio streams
  • Handle subtitles
  • Display all of this using video.js

For the moment, many of these video-related functions are spread across different places (in each scraper relying on them).

At least to me, this causes several problems:

  • It is difficult to understand where each scraper stands regarding the development of these pieces of code
  • Fixes have to be requested/tracked in each scraper
  • Releases are de facto made at very different times

I'm not prescriptive about the exact solution, but I believe we should try to consolidate this in one place.

@kelson42 kelson42 added enhancement New feature or request question Further information is requested labels Sep 19, 2024
@benoit74
Collaborator

For me, this is a dependency management problem: which software is using which version of which library.

Would a dashboard of dependency versions per scraper help? I'm not sure it is sufficient, because one still needs to know that problem x has been fixed in version x.y of dependency nnn.

What makes it even harder is that we need a solution which can handle both Python (because video re-encoding is done in scraperlib so we want to track the scraperlib version per scraper) and JS (because display is done with video.js which is ... JS).
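To make the dashboard idea a bit more concrete, here is a minimal sketch of a version matrix covering both ecosystems. Everything in it is an assumption for illustration: the scraper names, the version numbers, and the `PINS`, `build_matrix` and `render` names are hypothetical, and a real tool would have to read the pins from each scraper's Python and JS manifests instead of a hard-coded list.

```python
from collections import defaultdict

# Hypothetical pins: (scraper, ecosystem, dependency, version).
# Names and version numbers are made up for illustration only.
PINS = [
    ("scraper-a", "python", "scraperlib", "2.1.0"),
    ("scraper-a", "js", "video.js", "2.0.0"),
    ("scraper-b", "python", "scraperlib", "2.0.0"),
    ("scraper-b", "js", "video.js", "1.9.0"),
]


def build_matrix(pins):
    """Group pins by scraper, keyed by 'ecosystem:dependency'."""
    matrix = defaultdict(dict)
    for scraper, ecosystem, dependency, version in pins:
        matrix[scraper][f"{ecosystem}:{dependency}"] = version
    return dict(matrix)


def render(matrix):
    """Render the matrix as plain text, one row per scraper."""
    columns = sorted({col for row in matrix.values() for col in row})
    lines = [" | ".join(["scraper", *columns])]
    for scraper in sorted(matrix):
        cells = [matrix[scraper].get(col, "-") for col in columns]
        lines.append(" | ".join([scraper, *cells]))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(build_matrix(PINS)))
```

As noted above, such a matrix only shows which version is used where; it does not say which version fixes which problem.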

If it were up to me, I would propose a very radical solution, because the problems you describe are typically the strong argument for a monorepo of all scrapers: all scrapers are at the same level of development, fixes have to be requested and tracked only once, and releases are synchronized. Unfortunately, it comes with its own share of drawbacks.

@kelson42
Contributor Author

kelson42 commented Oct 21, 2024

@benoit74 a monorepo is a no-go for me. For the rest, I'm very open. If no technical solution can be found, we should at least have a procedural approach.

@benoit74
Collaborator

Since a monorepo is a no-go, only a procedural approach or tooling can solve the problem you're describing. Because you would like an overview of the situation, we will always need some kind of dashboard that allows us to:

  • track issues known to impact many ZIMs / scrapers, and whether each has to be fixed by a dependency update (with the minimum required version that contains the fix) or by a code change at scraper level (and, once fixed there, which scraper version contains the fix)
  • track which ZIM is built with which scraper version (because even if the scraper is released, the problem is not fixed for our users until the ZIM is updated)
  • track which versions of dependencies are used in which scraper versions (because fixing a problem in a shared dependency is great, but it does not fix the end problem until the fix is used and released in the scrapers)

This is what it takes to be able to quickly say something like "issue xxx is fixed by updating dependency xyz to version x.y.z, this version has been deployed in scraper aaa version x.y.z and scraper bbb version y.z.x, not yet in other scrapers, and we have xxx ZIMs built with a fixed scraper version or newer, but zzz ZIMs still built with older versions".
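The kind of query described above can be sketched as follows. This is only an illustration under stated assumptions: the `Fix`, `ScraperRelease` and `Zim` records and both functions are hypothetical names, versions are assumed to be plain dotted integers, and a fix is assumed to stay in every release after the first one that contains it.

```python
from dataclasses import dataclass


def parse(version: str) -> tuple[int, ...]:
    """Turn 'x.y.z' into a comparable tuple (plain dotted integers only)."""
    return tuple(int(part) for part in version.split("."))


@dataclass
class Fix:  # hypothetical record: an issue fixed by a dependency update
    issue: str
    dependency: str
    min_version: str


@dataclass
class ScraperRelease:  # hypothetical record: dependency pins of one release
    scraper: str
    version: str
    dependencies: dict[str, str]


@dataclass
class Zim:  # hypothetical record: which scraper version built a ZIM
    name: str
    scraper: str
    scraper_version: str


def first_fixed_release(fix, releases):
    """Map each scraper to its oldest release embedding the fix, or None."""
    result = {}
    for release in releases:
        result.setdefault(release.scraper, None)
        pinned = release.dependencies.get(fix.dependency)
        if pinned is None or parse(pinned) < parse(fix.min_version):
            continue
        current = result[release.scraper]
        if current is None or parse(release.version) < parse(current):
            result[release.scraper] = release.version
    return result


def split_zims(zims, fixed_in):
    """Split ZIM names into (built with the fix, still built without it)."""
    fixed, pending = [], []
    for zim in zims:
        first = fixed_in.get(zim.scraper)
        if first is not None and parse(zim.scraper_version) >= parse(first):
            fixed.append(zim.name)
        else:
            pending.append(zim.name)
    return fixed, pending
```

The three record types mirror the three things to track in the list above; collecting and maintaining that data across scrapers and ZIMs is the part that needs real tooling.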

To me, this is not a small thing to build and deploy, because it requires tooling. We need to find funding to develop and configure this tooling and these procedures.

Without that funding / tooling, we are back to square one, doing all this manually when the need arises.

@benoit74 benoit74 added this to the backlog milestone Nov 14, 2024