
Many things to improve or fix #269

Open
darkanubis0100 opened this issue Oct 2, 2024 · 4 comments

Comments

@darkanubis0100

  • 3,000 and 1 bugs...

  • Using CLANG is impossible because it immediately surfaces an endless stream of bugs. Which version is this supposed to work on without bugs? I'm using 18.1.3 on Ubuntu 24.04, and it's impossible to even use EXO without a GPU because of CLANG bugs (the same ones that have been reported many times in the repo's issues).

  • As if CLANG not working weren't enough, CUDA is even worse. Does it not support WSL, or do I have to run EXO on a physical Linux machine? CUDA fails with a strange error, and if I disable bfloat16 the error becomes a “Segmentation Fault” right after it asks me to install “llvmlite”.

  • Detailed instructions and requirements in the repository? They don't exist. I have to work magic to figure out which programs or modules are missing, and it still fails. Where is the complete list of dependencies? The docs don't even mention that “build-essential” is required.

  • How am I supposed to run it on Android? I tried using Termux with an Ubuntu environment and ran into several errors, most notably with Tailscale.
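For reference, a minimal setup sketch pulled together from the problems above, assuming Ubuntu 24.04. The only packages actually named in this thread are build-essential, clang, and llvmlite; everything else here is a guess, not an official dependency list:

```shell
# Hypothetical Ubuntu 24.04 setup sketch -- assembled from this thread,
# NOT an official exo dependency list.
sudo apt update
sudo apt install -y build-essential clang python3-dev python3-pip

# "llvmlite" is the module the CUDA path reportedly asked for:
pip install llvmlite
```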

@AlexCheema
Contributor

AlexCheema commented Oct 2, 2024

Thanks for being patient with us. exo is still highly experimental and there are indeed a lot of bugs that need to be fixed.

I think the general point here is the tinygrad inference engine isn't stable enough (no fault of tinygrad, it's a new library that we've integrated into a highly experimental project). We have other inference engines coming (PyTorch and llama.cpp). PyTorch is almost ready to merge: #139. I'm hoping since PyTorch is more mature and people are more familiar with it, it will be a lot more stable.

I will prioritise better instructions and requirements. The idea is that there shouldn't need to be much since it should "just work".

I've run on Android successfully with termux before we introduced tailscale. Is the tailscale dependency breaking it now?

@darkanubis0100
Author


I understand the situation. PyTorch is certainly more than welcome, but I don't understand why I can't get it working in WSL. Does it require something from dbus?
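As a quick sanity check under WSL, one could first confirm that the Windows-side CUDA driver is actually visible inside the distro (a sketch; the library path assumes a standard WSL2 CUDA passthrough setup):

```shell
# Check the Windows CUDA driver is reachable from inside WSL2.
nvidia-smi

# On a standard WSL2 install, the driver libraries are mounted here:
ls -l /usr/lib/wsl/lib/libcuda.so*
```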

In the case of Android, that's exactly it: Tailscale. It seems the Tailscale Python package doesn't exist for ARM, which is why I can't install it.
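A sketch of how a project could detect this situation and skip Tailscale-based discovery on ARM. The function name and the fallback behaviour are hypothetical, not exo's actual code:

```python
import platform

# Hypothetical helper: decide whether Tailscale-based discovery is usable.
# Per this thread, the Tailscale Python package has no ARM build, so skip it there.
ARM_ARCHES = ("aarch64", "arm64", "armv7l", "armv8l")

def use_tailscale_discovery(arch=None):
    """Return True unless we're on an ARM machine (e.g. Termux on Android)."""
    arch = (arch or platform.machine()).lower()
    return arch not in ARM_ARCHES

# Example: fall back to another discovery method on ARM.
if not use_tailscale_discovery():
    print("ARM detected: falling back to a non-Tailscale discovery method")
```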

@fullofcaffeine

fullofcaffeine commented Nov 22, 2024

+1

I really appreciate what you folks are doing, but it's been almost impossible to run Exo on a cluster of 3 Linux systems (NVIDIA/CUDA) + a Mac M2 on tinygrad. All sorts of exceptions, network errors, OOM errors, etc. -- it's very brittle. I managed to run it a few times, but one of the nodes always ends up crashing somehow. It's hard to debug and track (I've created a few issues about some of the bugs I've found).

I have fallen back to just running Llama 3.1 8B locally (via Ollama) on my M2 for now, which ended up being much more stable than the Exo cluster (albeit a bit slower, fewer tokens/s).

I'll wait a few months and try again, as at the moment I have neither the time nor the know-how to help (other than testing it out on my systems). I don't mean to discourage your work at all; I'm still very excited about Exo!

@AlexCheema
Contributor


Appreciate the feedback. Our approach has been depth-first: making Mac support as good as it can be before focusing on Linux again. Hopefully we'll be able to delight you in a few months' time once Linux support is more stable and mature.
