[🚀 Feature]: Package Selenium Manager Separately #12318

Closed
aaltat opened this issue Jul 6, 2023 · 12 comments

Comments

@aaltat

aaltat commented Jul 6, 2023

Feature and motivation

When webdriver-manager was added as a binary to the Python package, the package size grew from 985.8 kB (release 4.4.0) to 6.5 MB (release 4.10.0). Whenever one spins up new VMs, containers, or similar, that extra data has to be transferred from PyPI (or whatever blob storage one is using) to the target machine, and that extra transfer has a cost: the time taken to transfer the data, the cost of storing it, the cost of the transfer itself, and so on. I agree that when doing this once or only a few times, the total cost of shipping webdriver-manager in the PyPI package is not significant, but when one spins up a large number of VMs/containers, tens of thousands or more, this increase actually starts to show in the bottom line.
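To put a rough number on that, here is a back-of-the-envelope sketch using only the two package sizes quoted above. The figure of 10,000 installs is an assumption taken from the low end of "tens of thousands", and the sketch assumes decimal units and one full download per install.

```python
# Package sizes quoted in this issue (decimal units: 1 MB = 1000 kB).
SIZE_4_4_0_MB = 985.8 / 1000   # release 4.4.0, before the binary was bundled
SIZE_4_10_0_MB = 6.5           # release 4.10.0, with the binary bundled

# Extra data per install caused by the size increase.
EXTRA_PER_INSTALL_MB = SIZE_4_10_0_MB - SIZE_4_4_0_MB


def extra_transfer_gb(installs: int) -> float:
    """Extra data in GB for a given number of installs (assumed full download each time)."""
    return EXTRA_PER_INSTALL_MB * installs / 1000


# Assumed workload: 10,000 installs, the low end of "tens of thousands".
extra_for_10k = extra_transfer_gb(10_000)
print(f"{EXTRA_PER_INSTALL_MB:.2f} MB extra per install, {extra_for_10k:.1f} GB per 10,000 installs")
```

Under those assumptions the increase is about 5.5 MB per install, or roughly 55 GB per 10,000 installs.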

Therefore it would be useful to make the installation of webdriver-manager optional. One could, for example, use optional dependencies or some other method to provide an opt-out or opt-in for webdriver-manager. This would be generally useful for people who used Selenium before webdriver-manager existed and already have a way to bake the needed webdrivers into their VMs/containers.

Usage example

I would like to have: pip install selenium[webdriver-manager] to also install the webdriver-manager binaries.
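A minimal sketch of how such an optional dependency ("extra") would behave, assuming the binaries lived in a separate distribution. The name `selenium-manager-bin` and the base requirement list are hypothetical and are not Selenium's real packaging metadata. In setuptools this kind of mapping is declared through `extras_require`.

```python
# Hypothetical metadata for illustration only; not Selenium's real configuration.
BASE_REQUIREMENTS = ["urllib3"]  # illustrative base dependency
EXTRAS = {
    # "selenium-manager-bin" is an invented distribution that would hold the binaries.
    "webdriver-manager": ["selenium-manager-bin"],
}


def resolve_requirements(requested_extras=()):
    """Model what an installer pulls in: base requirements plus any requested extras."""
    resolved = list(BASE_REQUIREMENTS)
    for extra in requested_extras:
        resolved.extend(EXTRAS[extra])
    return resolved


# pip install selenium
plain = resolve_requirements()
# pip install selenium[webdriver-manager]
with_manager = resolve_requirements(["webdriver-manager"])
```

Note that extras are additive: a plain `pip install selenium` gets only the base requirements, so this route is opt-in rather than opt-out.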

@github-actions

github-actions bot commented Jul 6, 2023

@aaltat, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@titusfortner titusfortner changed the title [🚀 Feature]: Make installation of webdirver-manager optional in Python package [🚀 Feature]: Package Selenium Manager Separately Jul 7, 2023
@titusfortner
Member

This proposal isn't unique to Python. This idea potentially could be applied to all the bindings.

The minimum requirement, though, is that if someone requests "selenium" that the selenium manager would be included by default. It can't be opt-in, it has to be opt-out. Does your Python suggestion allow this behavior, or would it require users to specify it additionally?

We can consider splitting out the code into separate packages, but only if we see distinct benefit to doing so, and decreasing size of dependency by itself probably isn't sufficient. But I'll keep this issue open for future discussion.

@dtopuzov

dtopuzov commented Jul 7, 2023

Selenium Manager (and in general driver manager solutions) is useful for easier getting started and local development.
Things are a bit different if we talk about CI/CD. Nowadays we run everything on containers with specific browser and driver versions, so we don't need driver managers.

Having several MB more is not a problem for us at the moment, just saying there are valid cases when you don't need driver managers.

@diemol
Member

diemol commented Jul 7, 2023

Why are you installing the Python bindings every time you spin up a new container? Doesn't that make things prone to error if there are connectivity issues with PyPI? Isn't it more efficient to install the bindings through the Dockerfile? In the end, Selenium releases are more or less in sync with most browser releases.

@titusfortner
Member

The other thing here that could help your situation is to allow the bindings to look for the binary in the cache directory. So long as everything is kept up to date nothing would be downloaded.

@aaltat
Author

aaltat commented Jul 8, 2023

I don’t install dependencies each time I spin up the container/VM. I build a container with all my Python dependencies once (meaning it is rebuilt each time I update my dependencies) and store it in my blob storage. But I do need to push the container from my blob storage to my cloud provider, where the tests actually run. Before the Selenium version change the container size was a bit over 50 MB, and now it’s a little over 60 MB. No other dependencies were updated.

I understand the need for opting out of the installation of webdriver-manager. I have no idea whether that is possible with Python or other package managers. At least with Python it’s possible to create two PyPI packages, one selenium and another selenium-no-wdm. I have never needed to look at this kind of thing before, so I am not sure which solution would be best.

@aaltat
Author

aaltat commented Aug 26, 2023

Another option that came to my mind, and that should work at least in Python land, is to build OS-specific packages. Right now each selenium package contains binaries for Windows, Linux and Mac, but (usually) a downloaded package is installed on only one type of operating system. Therefore, if there were a separate build for each OS, containing only the binary for that specific OS, it would reduce the size significantly.

Also, there might be other ways to make the binary smaller, for example the things mentioned here: https://github.com/johnthagen/min-sized-rust. But because I am not a Rust expert, I cannot tell whether you are already using some of them. Those instructions might provide easy wins for all the Selenium language bindings.

@titusfortner
Member

How does it work to install only one architecture by default? The packaging happens before runtime. Wouldn't we need multiple releases that users would need to choose between? (suboptimal)

We've already looked at how to minimize what is required in Rust and taken it into account. Our original implementation was larger.

For comparison, I just checked and the cypress zip file is 178 MB (567MB uncompressed), and a default install of playwright is 15MB of node modules and 888MB of custom browsers and tools.

Selenium 4.12 is going to increase size again because we're adding direct support for Mac/ARM architecture.
But ~10 MB of manager and ~10 MB per driver seems pretty reasonable for total size.
Chrome for Testing is like 129 MB and Firefox about 120?

So Selenium with Chrome we're looking at ~150MB total; add Firefox and it's another ~130? This seems reasonable tradeoff for the value.

We can get more clever in our packaging, but I don't think focusing on shaving a few MB is worth our time right now when we have so many other things on our plate.

@aaltat
Author

aaltat commented Aug 29, 2023

> How does it work to install only one architecture by default? The packaging happens before runtime. Wouldn't we need multiple releases that users would need to choose between? (suboptimal)

I have never built a package that is not universal (meaning a package that works on every OS where Python works), but I have used many packages that are not universal. NumPy, for example, is not a universal package: on PyPI (https://pypi.org/project/numpy/#files) it has several wheel files for different OSes and different Python versions. But at install time people install the package with pip: pip install numpy, and do not have to care which OS they are on. Pip does some magic behind the scenes to find the correct wheel for the user's OS.

For Selenium this should work similarly, at least in Python land: pip install selenium, and if there are OS-specific packages, pip will automatically select the correct one. In Selenium's case the Python code is the same; the only difference is that each OS-specific package contains only the webdriver manager for that OS. For example, the Windows package contains only selenium-manager.exe and not the other executables.

From the user's perspective this could be the easiest way to go, at least in Python land. No new packages, install works the same way, just a smaller package and a faster install. A good win for all users, but it does add some extra steps at build time. How big that complexity is, I have no idea, because I have never done it.
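The "magic" is driven by the tags at the end of each wheel filename (`{python tag}-{abi tag}-{platform tag}.whl`). Below is a simplified sketch of that selection, not pip's actual algorithm: real installers also rank Python and ABI tags. The wheel filenames are hypothetical examples of what per-OS Selenium wheels could look like; no such files are implied to exist.

```python
def parse_wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are always python, abi and platform.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def pick_wheel(filenames, supported_platforms):
    """Return the first wheel matching the platforms, which are listed in preference order."""
    for platform in supported_platforms:
        for name in filenames:
            _, _, platform_tag = parse_wheel_tags(name)
            # A wheel may declare several platforms separated by dots.
            if platform in platform_tag.split("."):
                return name
    return None


# Hypothetical per-OS wheels, each bundling only one selenium-manager binary.
HYPOTHETICAL_WHEELS = [
    "selenium-4.12.0-py3-none-win_amd64.whl",
    "selenium-4.12.0-py3-none-manylinux2014_x86_64.whl",
    "selenium-4.12.0-py3-none-macosx_10_9_universal2.whl",
    "selenium-4.12.0-py3-none-any.whl",  # pure-Python fallback without a binary
]

linux_choice = pick_wheel(HYPOTHETICAL_WHEELS, ["manylinux2014_x86_64", "any"])
windows_choice = pick_wheel(HYPOTHETICAL_WHEELS, ["win_amd64", "any"])
fallback_choice = pick_wheel(HYPOTHETICAL_WHEELS, ["linux_riscv64", "any"])
```

With this layout the user still runs the same `pip install selenium`; a platform without a matching wheel would fall back to the `any` wheel, if one is published.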

@titusfortner
Member

This requires generating multiple wheels which is what I meant by multiple releases, which wouldn't be ideal, but maybe?

@AutomatedTester / @isaulv / @symonk any idea how much work would go into this from the implementation side? Bazel might be an added pain for it. :)

I'm kind of curious how the automatic conditional for architecture works, and what other package managers might support this kind of thing...

Again, not a priority for us right now, but worth considering.

@titusfortner
Member

I'm going to track this feature request in a new issue since the discussion migrated a bit — #13021


github-actions bot commented Dec 2, 2023

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 2, 2023