Add recommendation for model in readme for newcomers #410

security-companion:
Hi,
I think it would be good to have a recommendation in the README for a model to start with. Otherwise, newcomers to the project might not know which model is best to begin with. What do you think?
Comments
@security-companion Sounds like a great idea, can you do a PR? Another option could be to split the models list in the README.
For sure, I can make a PR.
I would personally go with
@raezor117 Separating the recommendation based on the system specs is a good idea, but Vicuna has been surpassed for some time now: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
@noproto Agree, and that gave me an idea: make the Models list filter out any model that requires more RAM than the amount reported by the system. For example, if your host has 8 GB, there's no point in listing models that require 20, 30, or 40 GB of RAM.
@gaby: That's a good idea. Perhaps we could even add an on/off switch that says "Show only models that are supported on my machine".
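
To make the proposal above concrete, here is a minimal sketch of such a filter with an optional "show all" toggle. The model names, RAM figures, and the use of psutil are illustrative assumptions, not Serge's actual registry or API:

```python
# Minimal sketch of the proposed filter (illustrative only: the model
# names, RAM requirements, and registry shape are assumptions, not
# Serge's actual data).
import psutil

MODELS_RAM_GIB = {
    "example-7B": 8,
    "example-13B": 16,
    "example-30B": 32,
    "example-65B": 64,
}

def visible_models(show_all: bool = False) -> list[str]:
    """List models that fit in this host's total RAM, unless show_all is set."""
    total_gib = psutil.virtual_memory().total / 2**30
    if show_all:
        return list(MODELS_RAM_GIB)
    return [name for name, need in MODELS_RAM_GIB.items() if need <= total_gib]

print(visible_models())               # filtered view (the proposed default)
print(visible_models(show_all=True))  # the "show everything" toggle
```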
I agree with this. And, as the previous person said, I also think it would be good to have a "speed" ranking, since that is the major problem for me. A question: I can't find a sweet spot for the number of threads; is there any "best" amount? I tried up to 32. Last but not least, is there GPU support yet?
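
On the thread-count question, a common rule of thumb for llama.cpp-based backends (an assumption from general experience, not a guarantee) is to start at the number of physical cores rather than logical threads; oversubscribing, e.g. 32 threads on an 8-core host, often slows generation down. A minimal sketch, assuming psutil is available:

```python
# Rule-of-thumb starting point for the thread count: physical cores,
# not logical threads. This is a common heuristic, not a guarantee.
import psutil

physical = psutil.cpu_count(logical=False) or 1
logical = psutil.cpu_count(logical=True) or 1
print(f"physical cores: {physical}, logical threads: {logical}")
print(f"suggested starting point for the thread setting: {physical}")
```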
I also have an instance spun up on my virtual server, with the following specs and the following prompt:

With the above specs, the models take the following time to respond with a result:
In order to finish this issue: what models should we add?
I have a Synology NAS RS1619xs+ and I downloaded GPT4All-13B. Then I typed a simple question, which took over 11 minutes and did not complete the answer. I had to close the chat and restart the container for the NAS to breathe. I agree, there should be a README section for newcomers with recommended models.
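
For context on why a 13B model can overwhelm NAS-class hardware, here is a rough back-of-envelope RAM estimate for 4-bit quantized models. The 0.5 bytes-per-parameter figure and the flat overhead are assumed approximations, not measured values:

```python
# Rough back-of-envelope RAM estimate for 4-bit quantized models.
# Both the bytes-per-parameter figure and the flat overhead are
# assumptions, not measurements.
def approx_ram_gib(params_billion: float, bits_per_weight: int = 4,
                   overhead_gib: float = 1.5) -> float:
    weights_gib = params_billion * 1e9 * bits_per_weight / 8 / 2**30
    return weights_gib + overhead_gib

for size in (7, 13, 30, 65):
    print(f"{size:>2}B @ 4-bit: ~{approx_ram_gib(size):.1f} GiB of RAM")
```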
Any news about suggested models?
I personally had the best experience with the following models:
As for which specific one: I would say start with the minimum, 7B. Then, if your hardware is capable of more, you can try the larger models. However, don't jump straight to the 65B (my opinion). Work your way up until you find an acceptable response time for the number of tokens. I will personally be looking into Orca next, as I have read that it's quite a good competitor.
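
One way to "work your way up" methodically is to time a short completion per model and compare tokens per second. A minimal sketch using the llama-cpp-python bindings; the model path, thread count, and prompt are placeholders, not Serge defaults:

```python
# Minimal timing sketch using the llama-cpp-python bindings. The model
# path, thread count, and prompt are placeholders; adjust for your setup.
import time
from llama_cpp import Llama

llm = Llama(model_path="./models/your-model.bin", n_threads=8)

start = time.perf_counter()
out = llm("Explain in one sentence what a README is for.", max_tokens=64)
elapsed = time.perf_counter() - start

tokens = out["usage"]["completion_tokens"]
print(f"{tokens} tokens in {elapsed:.1f}s -> {tokens / elapsed:.2f} tokens/s")
```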
#866 introduces support for GGUF models. By default, Serge will include the top downloaded ones. These are: LLaMA2, CodeLLaMA, Zephyr, and Mistral.