Add recommendation for model in readme for newcomers #410

Closed
security-companion opened this issue Jun 10, 2023 · 13 comments
Comments

@security-companion
Contributor

Hi,
I think it would be good to have a recommendation in the readme for a model to start with.
Otherwise newcomers to the project might not know which model is best to start with.
What do you think?

@gaby
Member

gaby commented Jun 10, 2023

@security-companion Sounds like a great idea, can you do a PR? Another option could be to split the models list in the README.

@security-companion
Contributor Author

Sure, I can make a PR.
Which model would you recommend to start with?

@raezor117

I would personally go with Vicuna-v1.1-7B for lower-end instances (5 GB RAM) or Vicuna-v1.1-13B for better-specced instances (12 GB RAM).

@noproto
Contributor

noproto commented Jun 11, 2023

@raezor117 Separating the recommendation based on the system specs is a good idea, but Vicuna has been surpassed for some time now: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

@gaby
Member

gaby commented Jun 11, 2023

@noproto Agree, and that gave me an idea: make the Models list filter out any model that requires more RAM than the amount reported by the system.

For example, if your host has 8 GB, there's no point in listing models that require 20, 30, or 40 GB of RAM.
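
A minimal sketch of that filtering idea in Python, assuming a hypothetical hard-coded catalog with approximate RAM requirements (the names and numbers below are illustrative, taken from this thread, not from Serge's real model metadata), and using psutil only to read the host's total memory:

```python
# Minimal sketch of RAM-based model filtering, not Serge's actual implementation.
# MODELS is a hypothetical catalog; the names and RAM figures are illustrative only.
import psutil  # third-party: pip install psutil

MODELS = [
    ("Vicuna-v1.1-7B", 5),    # (name, approx. RAM needed in GB)
    ("Vicuna-v1.1-13B", 12),
    ("Lazarus-30B", 24),
]

def models_for_this_host(catalog=MODELS):
    """Keep only the models whose RAM requirement fits the host's total memory."""
    total_gb = psutil.virtual_memory().total / 1024**3
    return [(name, ram) for name, ram in catalog if ram <= total_gb]

if __name__ == "__main__":
    for name, ram in models_for_this_host():
        print(f"{name} (~{ram} GB RAM)")
```

On an 8 GB host this would list only the 7B entry, which is the behaviour described above.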

@security-companion
Contributor Author

@gaby: That's a good idea. Perhaps we could even add an on/off toggle that says "Show only models that are supported on my machine".

@oktaborg

@gaby: That's a good idea. Perhaps we could even add an on/off toggle that says "Show only models that are supported on my machine".

I agree with this. And, as the previous person said, I also think it would be good to have a "speed" ranking, since this is the major problem for me.

A question: I can't find a sweet spot for the number of threads, is there any "best" amount? I tried up to 32.

Last but not least, is there GPU support yet?

@raezor117

@gaby: That's a good idea. Perhaps we could even add an on/off toggle that says "Show only models that are supported on my machine".

I agree with this. And, as the previous person said, I also think it would be good to have a "speed" ranking, since this is the major problem for me.

A question: I can't find a sweet spot for the number of threads, is there any "best" amount? I tried up to 32.

Last but not least, is there GPU support yet?

I also have an instance spun up on my virtual server, with the following specs:
CPU: 20 vCores (AMD EPYC™ Milan 7763 Base Clocks 2.45GHz with Max. 3.5GHz)
RAM: 16 GB (3200MHz)

with the following prompt:

Provide only the string that would best describe the Invoice Number in the following piece of text:
sample Invoice (4 Click to edit Billed To Your Client 1234 Clients Street City, California 90210 United States 1-888-123-8910
Date Issued 26/3/2021 Due Date 25/4/2021 YOUR COMPANY 1234 Your Street City, California 90210 United States 1-888-123-4567 Invoice Number Amount Due INV-00456 $1,699.48

With the above specs, the response times for each model were:

  • Vicuna-v1.1-7B-q6_K = 6.54 seconds
  • Vicuna-v1.1-13B = 12.68 seconds
  • Lazarus-30B = timed out after 1m30s (PS: I know my specs do not meet the recommended requirements)
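
For anyone wanting to reproduce this kind of per-model timing, here is a rough Python sketch; the endpoint, port, and payload shape are assumptions for illustration, not Serge's documented API:

```python
# Rough sketch for timing per-model responses. The URL and JSON payload below
# are hypothetical placeholders, not Serge's documented API.
import time
import requests  # third-party HTTP client

PROMPT = "Provide only the string that would best describe the Invoice Number ..."  # full prompt from above
MODELS = ["Vicuna-v1.1-7B-q6_K", "Vicuna-v1.1-13B", "Lazarus-30B"]

for model in MODELS:
    start = time.monotonic()
    try:
        requests.post(
            "http://localhost:8008/api/chat",    # assumed endpoint
            json={"model": model, "prompt": PROMPT},
            timeout=90,                          # match the 1m30s cutoff above
        )
        print(f"{model}: {time.monotonic() - start:.2f} seconds")
    except requests.Timeout:
        print(f"{model}: timed out after 90 seconds")
```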

@security-companion
Contributor Author

In order to finish this issue: what models should we add?
Should we go with the following?

I would personally go with Vicuna-v1.1-7B for lower-end instances (5 GB RAM) or Vicuna-v1.1-13B for better-specced instances (12 GB RAM).

@hassansf

I have a Synology NAS RS1619xs+ and I downloaded GPT4All-13B. Then I typed a simple question, which took over 11 minutes and did not complete the answer. I had to close the chat and restart the container to let the NAS breathe.

I agree, there should be a readme for newcomers with recommended models.

@security-companion
Contributor Author

Any news about suggested models?
What do you suggest so that we can proceed with that pull request?

@raezor117

I personally had the best experience with the following models:

  • Vicuna
  • Wizard
  • Alpaca

As for which specific one, I would say start with the minimum of 7B. Then, if your hardware is capable of more, you can try the larger models. However, don't jump straight to the 65B (my opinion). Work your way up until you find an acceptable response time for the number of tokens.

I will personally be looking into Orca next, as I have read that it's quite a good competitor.

@gaby
Member

gaby commented Nov 14, 2023

#866 introduces support for GGUF models. By default, Serge will include the top downloaded ones. These are: LLaMA2, CodeLLaMA, Zephyr, and Mistral.

gaby closed this as completed Nov 14, 2023