Multilabel Classification FutureWarning with KNeighborsClassifier.predict() #45
Comments
Thanks for your feedback @tg2k . If you don't want to see this warning, you can upgrade Scikit-Learn to 1.1.2 or later. Hope this helps. |
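For readers who want to confirm whether their installed Scikit-Learn already meets the 1.1.2 threshold mentioned above, here is a minimal sketch. The helper names (`parse_version`, `is_at_least`, `sklearn_needs_upgrade`) are hypothetical and not part of any library, and the version parsing is deliberately simplistic (it ignores pre-release suffixes), so treat it as an illustration rather than a robust comparison.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.1.2' into (1, 1, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed, required):
    """True if the installed version is the required one or newer."""
    return parse_version(installed) >= parse_version(required)


def sklearn_needs_upgrade(required="1.1.2"):
    """Return True/False, or None if scikit-learn is not installed at all."""
    try:
        installed = metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None
    return not is_at_least(installed, required)


if __name__ == "__main__":
    print("Upgrade needed:", sklearn_needs_upgrade())
```

Running it inside the activated environment reports whether an upgrade is still needed; it only reads the installed package metadata and changes nothing.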
Would have replied sooner but was traveling. This turned out to be tricky, because I was running an anaconda install (installed via Chocolatey). Attempting to upgrade Scikit-Learn landed me in this problem. After checking this article I went with a miniconda install (also via Chocolatey) and used a .yml file similar to that article. For anyone not experienced with miniconda who may be reading this, the basic steps were
Activate miniconda base env:
Update the conda install further:
Then install yml file with
Within the .yml file, my dependencies so far look about like this:
Unlike the Medium article, I opted for conda packages wherever possible, and used pip only where there was no conda package. I saw a note during an operation that recommended I install
With this done, I have a more current install, and I would discourage anyone from using Anaconda as in the Medium article above. |
This is great feedback, and I'm sure it will be useful to other readers, thanks a lot @tg2k ! 👍 |
Actually, as I've progressed through the book, it became more and more difficult to keep a working Windows environment. Conflicts grew worse, I installed Mamba instead of Conda, pulled more from conda-forge, etc., but at a certain point in Chapter 11 or so it became untenable even for Mamba to find a solution in Windows. I think some of the Conda packages aren't available for critical versions in Windows, leading to a situation where a consistent environment was difficult or impossible to attain. For all I know this could change, but the experience has left me with the distinct impression that the data science community sees Windows as an afterthought.
I also kept running into conflicts with multiple libiomp5md.dll. I fixed one of them by forcing a newer version of numpy (maybe during Chapter 10) but it hit me again and so I looked at why I had the versions of libraries I did. TensorFlow 2.10 was one of the packages that used this DLL. I noted that 2.11 is out and as I began looking at that, I discovered that the TensorFlow team dropped native Windows GPU support, in favor of using WSL2. This is annoying for Windows users, in that if you want the best TensorFlow experience you are forced to use a bit of Linux on Windows, but WSL2 is better integrated than any VM solution I've ever used. And it is far, far easier to get all the package versions. I'm unclear on whether packaging issues themselves are related to the TensorFlow team's decision. For those interested, 2.11 release notes here and non-explanations from the team are here and here.
WSL2-based Install Process
The overall install process becomes significantly longer but at least it's workable. For me it was roughly:
Install WSL 2 with
Open WSL by running "Ubuntu" app or going to a command line and just running
If you have Windows folders for relevant scripts, use ln -s to link them for ease of use. Make sure those scripts have Unix line endings.
Install CUDA per this and this.
The prior CUDA install has a problem that causes
Then close anything running inside WSL and restart WSL (you can use the same
Inside WSL, verify libcuda now has symlinks with
Install graphviz with
Install mamba with
Run a new shell (this will activate the base environment too)
Add the Conda lib path to the end of
Create mamba/conda env (in a section below I've posted some yml)
Register Jupyter kernel
Verify TensorFlow
Another verification of CUDA per NVidia:
TensorFlow can print a lot of warnings about NUMA support, so edit .bashrc and around another export add
Install protoc
Restart VSCode if it is running. Install the Python extension and the WSL extension (formerly Remote-WSL). In the bottom left of the VSCode window connect to the WSL instance, which will re-launch VSCode to connect to WSL.
YML
My YML looks roughly like so:
Performance with WSL2
I have a new and reasonably powerful desktop computer, and prior to WSL I was able to run the heavier (generally Scikit-Learn) loads in 1/3 - 1/2 of the times that the book warned about. However, with WSL these loads are generally running slower than the book mentions, so I think my throughput has been significantly cut down. Theoretically on very large loads the GPU support should make up for the WSL overhead. Moreover, at least I shouldn't have to deal with as much head-banging to resolve package installation issues. Fingers crossed on that as I continue through the book.
Make sure though to put all files for the book inside the WSL Ubuntu directory structure. If instead you do as I initially did, and run it off /mnt/c (or a symlink thereof), you'll encounter significant slowdowns as I mention in a comment below.
TensorFlow 2.11 Compatibility with book's Jupyter notebooks
Installing TensorFlow 2.11 revealed some incompatibilities between the current Jupyter notebook code and TensorFlow. In Chapter 11 you can use this line:
The current optimizer requires
In Chapter 12 you can use this line:
The .legacy avoids an issue with a missing _set_hyper() method. Or new code could be provided for the current optimizer.
In Chapter 14
can be turned into a |
This is gold @tg2k , thanks so much for taking the time to write this thorough review of your ML experience on Windows. I agree with you that Windows does not seem to be a high priority for most of the ML community, sadly. In fact, I'm currently consulting for a company that's entirely running on Windows, and I keep running into issues like the ones you encountered, it's quite frustrating. That said, I did run all the notebooks in this project on Windows before the book came out (on a Windows Server VM on Google Cloud), but it looks like some things have broken since then. I'll investigate as soon as I can. |
@ageron I have no doubt this worked better on Windows some time ago. Some combination of rapidly advancing packages, lack of attention/support for Windows at coding and packaging levels, and TF 2.11's lack of Windows GPU support all conspired over time to ruin the pure-Windows experience. It's unfortunate because even if TF still had full Windows support I could still run into problems with other packages, yet running on WSL comes with significant performance penalties. With pure Windows my new computer easily beat out your years-old laptop stats by a factor of 2-3x, but under WSL it's often 2-4x slower than any numbers you included. It gets even worse if the files are on the Windows side rather than in the Ubuntu filesystem. Chapter 13's Exercise 10 IMDB tree printout is quick with all the files on WSL, but when the files were under /mnt/c (that is, marshalled from the Windows side over 9P, which should eventually get somewhat faster) it became a multi-minute operation. None of this is intended to tell Windows users to steer clear. I just hope that some of the info above (which I've been editing as I continue through the book) helps other users, and you may want to incorporate some of it into your own instructions. |
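To check whether a notebook session is running off the Windows side, which the comment above associates with severe slowdowns, a small sketch like the following could flag paths under /mnt/<drive letter>. The function names are hypothetical, and the check assumes the default WSL mount layout (/mnt/c, /mnt/d, and so on); a customized mount root would not be detected.

```python
import os
import re
from pathlib import PurePosixPath

# Assumption: default WSL layout, where Windows drives appear as /mnt/<letter>.
_WINDOWS_MOUNT = re.compile(r"^/mnt/[a-z](/|$)")


def is_windows_mount(posix_path: str) -> bool:
    """True if the path is /mnt/<drive letter> or anything below it."""
    normalized = str(PurePosixPath(posix_path))
    return bool(_WINDOWS_MOUNT.match(normalized))


def working_dir_on_windows_mount() -> bool:
    """Resolve symlinks first, since a symlink into /mnt/c is just as slow."""
    return is_windows_mount(os.path.realpath(os.getcwd()))


if __name__ == "__main__":
    if working_dir_on_windows_mount():
        print("Working directory is on a Windows drive; expect slow file access.")
    else:
        print("Working directory is not under /mnt/<drive>.")
```

Resolving the path with `os.path.realpath` matters here because the thread notes that a symlink to /mnt/c behaves the same as working in /mnt/c directly.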
An interesting thing came up when I went to upgrade to TensorFlow 2.12. Here the
See also this issue I created in the TF project. For TensorFlow 2.12 newer packages are required elsewhere, some of which (like cudnn and the huggingface transformers/datasets) are currently only on pip. Also I ended up reapplying the Windows-side
|
In the Multilabel Classification section, the following code generates a warning:
The warning is:
I verified I have the latest repo code.
Versions: