Context sensitive help #3556
Conversation
Absolutely nothing against you personally, but I really, really don't like the approach of scanning source files to try to determine that information. I think it's likely to be pretty brittle, hard to understand/confusing and overly complicated. A much simpler approach would just be to make a big enum of the possible options and have a function that parses/generates help based on a list of options. Then the individual examples can just pass a list of the options they support and "context-sensitive help" is the result.
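For illustration, a minimal sketch of what that enum-based approach might look like; all the names here (`gpt_option`, `k_option_info`, `print_help_for`) are hypothetical, not existing llama.cpp API:

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical master enum of every supported command line option.
enum gpt_option { OPT_THREADS, OPT_CTX_SIZE, OPT_TEMP /* , ... */ };

// One central table mapping each option to its flag and help text.
static const std::map<gpt_option, std::pair<std::string, std::string>> k_option_info = {
    { OPT_THREADS,  { "-t, --threads N",  "number of threads to use"  } },
    { OPT_CTX_SIZE, { "-c, --ctx-size N", "size of the prompt context" } },
    { OPT_TEMP,     { "--temp T",         "sampling temperature"       } },
};

// Each example passes only the options it supports; the "context-sensitive
// help" then falls out of the list it provides.
void print_help_for(const std::vector<gpt_option> & opts) {
    for (gpt_option opt : opts) {
        const auto & info = k_option_info.at(opt);
        printf("  %-20s %s\n", info.first.c_str(), info.second.c_str());
    }
}
```

An example that only supports threads and context size would then call something like `print_help_for({OPT_THREADS, OPT_CTX_SIZE})`.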
Thanks for the response. I understand your reservations, and I don't take constructive criticism personally, so no worries there, but I think you're underestimating the overhead of keeping the enum you suggest up to date. I did consider that, as well as the "binary coding" option I outlined in the discussion, before rejecting it, but both ultimately demand some kind of automation or they become prohibitively cumbersome. Unless it's done automatically - which brings us back to some version of where we are - every time there is a new implementation or an enhancement to the whole suite of apps, someone needs to go through and change everything, and in my experience that ceases to happen pretty quickly and is itself a bit "brittle". As for complicated: if the underlying code is written tightly according to consistent standards - and it is - then it isn't that complicated; if it were, I wouldn't have been able to do it!
I definitely respect that! I still hate saying something negative about someone else's work, but in this case I'm really not in favor of this type of solution.
I don't think it would be that bad. Most examples fall into one of three categories: supports everything, supports only a few things, or supports everything except for a few options. The option interface could export a couple of convenience functions, like one that returns all the options: examples that support all except a few can just remove the ones they don't support, examples that only support a few can start with an empty list of supported options and just add the ones they support, and of course the examples that support everything don't need to do anything. If there are other common sets of options then there could be other convenience functions that provide them as a starting point. Just an example. On the other hand, an application like

There would still be the need to maintain a master list of options and associated help text, parsing those options, etc., but that's the case with the current status quo and also with your changes as far as I know. So that part isn't really the issue.

(Not sure if I explained it clearly enough. If not, let me know and I can show a more concrete example of what I'm talking about.)
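A hypothetical sketch of the convenience functions described here (illustrative names only, building on the same imaginary `gpt_option` enum from the earlier sketch):

```cpp
#include <algorithm>
#include <vector>

// OPT_COUNT is a sentinel marking the end of the hypothetical enum.
enum gpt_option { OPT_THREADS, OPT_CTX_SIZE, OPT_TEMP, /* ..., */ OPT_COUNT };

// "Supports everything": start from the full list.
std::vector<gpt_option> all_options() {
    std::vector<gpt_option> opts;
    for (int i = 0; i < OPT_COUNT; i++) {
        opts.push_back(static_cast<gpt_option>(i));
    }
    return opts;
}

// "Supports everything except a few": remove just the unsupported ones.
std::vector<gpt_option> all_options_except(const std::vector<gpt_option> & excluded) {
    std::vector<gpt_option> opts = all_options();
    for (gpt_option ex : excluded) {
        opts.erase(std::remove(opts.begin(), opts.end(), ex), opts.end());
    }
    return opts;
}

// "Supports only a few": start with an empty vector and push_back options.
```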
Thanks for this work - interesting approach. I'm inclined to say no atm, as I think we are over-engineering this problem.
No problem at all. @KerfuffleV2 and I had a conversation about it, as you will have seen, but it's absolutely your call. And apologies (again) for my incompetence causing you extra work when so many PR runs failed because of my lack of experience with GitHub, VSC and the system. I think the version I recently pushed will compile, and I've fixed the previous failure to catch

Anyway, I've learned a lot about the whole suite of apps by doing this, so nothing is lost. Thanks for your patience and, more particularly, for the
I suspect that this horse has already bolted, but just for the sake of completeness I feel that I should respond to what at least appear to be misreadings, in your evaluation, of what I have done.

As I indicated before, your solution seems to involve some kind of manual intervention in every app to indicate which argument/parameter pairs it supports. The apps don't decide for themselves which they support: unless it is manual, the app must be scanned - either by a fallible human or by some kind of program - to ascertain which they are; this is exactly what `find-implemented-args.py` does.

The "master list" you mention - as things stand - arises from a combination of `common.h` and `common.cpp`.

My point, in sum, is that if a developer for some reason wanted to add
@pudepiedj I don't think running a python script should be part of the process of adding a new option to a llama.cpp example (or examples), especially one that tries to parse C++ code with a regex.
> Basically, but that doesn't mean the options have to be dealt with individually for each app.
I mean, they basically do, otherwise you could just use all options for every example. In reality, some apps don't implement the stuff that's required to respect/be compatible with certain options. Just for example,
I think you misunderstood me. I wasn't saying maintaining the master list was something specific to your approach. I was saying it was an issue (if you want to look at it that way - I do, personally) shared by all three approaches: the status quo, your approach and what I was suggesting as well. The reason I brought that up is that I don't think that requirement would really be part of the ideal solution. What I was suggesting was a relatively simple compromise. I mean, if I add that
This is basically the issue: you can't reliably tell whether something is "implemented" using a regex.

```cpp
#ifdef LLAMA_SUPPORT_CARROT_CHOPPING
printf("%d", params.dice_carrots);
#endif
```

Your script will think I implemented `dice_carrots` even when `LLAMA_SUPPORT_CARROT_CHOPPING` isn't defined.

```cpp
gpt_params my_params;
// Stuff
printf("%d", my_params.dice_carrots);
```

Your script won't recognize that I "implement" `dice_carrots` here. Also, it doesn't look like your regexes will handle

Also, if I happen to name some random variable `params`, the scan will pick up whatever I access through it.

It's really not maintaining lists or whatever that I was concerned about; it was arbitrary/unintuitive effects and restrictions, and also the difficulty of reliably determining stuff like "this example implements this feature".

edit: Another thing just occurred to me. I have some changes in #3543 that move the sampling-related parameters out of `gpt_params`. Will your approach be able to deal with that? If not, how will you deal with stuff like:

```cpp
gpt_params params;
llama_sampling_params & sampling_params = params.sampling_params;
```

and that kind of thing (an approach I used a lot in my change to avoid having to write `params.sampling_params` everywhere)?
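For illustration, a minimal sketch of the fragility being pointed out above, assuming the scan keys on a `params.attribute` pattern (the actual script is Python; this C++ snippet just demonstrates the trade-off):

```cpp
#include <iostream>
#include <regex>
#include <string>

int main() {
    // An attribute accessed through a renamed/aliased gpt_params variable.
    const std::string aliased = "printf(\"%d\", my_params.dice_carrots);";

    // Anchored pattern: the word boundary before "params" means it misses
    // the aliased access entirely (a false negative).
    const std::regex anchored(R"(\bparams\.(\w+))");
    std::cout << std::regex_search(aliased, anchored) << "\n"; // prints 0

    // Loose pattern: catches the alias, but also fires on any variable whose
    // name merely ends in "params", whatever its type (false positives), and
    // neither pattern can see that a match is guarded by an #ifdef.
    const std::regex loose(R"(params\.(\w+))");
    std::cout << std::regex_search(aliased, loose) << "\n"; // prints 1
    return 0;
}
```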
You are right, and I wouldn't dream of it. I just find Python easier to script in terms of "proof of concept"; a working version would need to be in C++.
I think that there are two big categories of flags: model-related and sampling-related. Nearly everything else is specific to one or two examples. So maybe it would be good to have common parsing for these two big categories (that can be toggled per example), and then just do the rest of the parsing in each example? We could extend the command line parsing functions in `common` to support this.

I would avoid any over-engineered solution, because examples are supposed to be simple and easy to understand. It may be preferable to duplicate some code than to make the code harder to understand by adding more layers.
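A minimal sketch of what those per-category toggles might look like; the flag names and signature here are hypothetical, not the existing `gpt_params_parse()`:

```cpp
// Hypothetical category bitmask for the common parser.
enum parse_category : unsigned {
    PARSE_MODEL    = 1u << 0, // -m, --ctx-size, --n-gpu-layers, ...
    PARSE_SAMPLING = 1u << 1, // --temp, --top-k, --top-p, ...
};

struct gpt_params; // the existing common parameter struct

// Each example opts in to the categories it actually uses; anything
// example-specific is still parsed by the example itself afterwards.
bool gpt_params_parse_ex(int argc, char ** argv, gpt_params & params, unsigned categories);

// e.g. an embedding example that never samples:
//   gpt_params_parse_ex(argc, argv, params, PARSE_MODEL);
```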
These are great points! Thank you! I absolutely think (amongst many things) that we are at the point where we need some kind of common parsing/naming if this project is to go on. My issue in this conversation has absolutely nothing to do with the relative merits of Python and C, and my ineptitude in both should prove that lack of expertise in coding is not in itself disproof of concept. Honestly: who (with any sense) cares?

But the question you raise is really important: no doubt we can create heaps of ad hoc solutions to any number of perceived intermediate problems, but the key is that what we are doing can be generalised wherever we want to go. We should not accept temporary "fix-it" solutions: observed limitations should feed back to redefine fundamental concepts (modus tollendo tollens), and so we shouldn't build limitations into this fantastic project by opting for quick fixes to systemic difficulties. Specifically: if we need (right now) to regularise the syntax/naming of the code, we should do it. Otherwise, we are building on sand.
@pudepiedj : we are not making the rules, the lead contributors are (and the success of the project proves them right).
In this case, my main concern is that I should be able to understand everything in an example without having to look too far. If I see something like this in an example, it is fairly obvious what it does and where the value of `n_parallel` comes from:

```cpp
add_model_arguments(&mparams);
add_sampling_arguments(&sparams);
add_argument({"-np", "--parallel"}, "number of parallel sequences to decode", 1, [&](auto s) {
    n_parallel = std::stoi(s);
});
parse_arguments(argc, argv);

auto [model, ctx] = init_model(mparams);
// ...
llama_token token = sample_token(ctx, sparams, ...);
```

When you add another abstraction layer on top of all of this, I am no longer able to understand at a glance what is happening in the code. In some cases that complexity may be completely justified, but in this case simplicity is the main concern, because the reason I am looking at the example is to learn how to use llama.cpp, not to learn about command line parsing schemes.

This is just an example; I am not saying that the command line parsing should look exactly like that (that code isn't even C++11), and I don't want to discourage you from exploring different options either.
That's very, very similar to the approach I was thinking of. The only real difference is that I might have done it where you pass an enum for the argument type, like
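For illustration, one hypothetical shape of that enum-typed variant (names invented here; the argument-type value replaces the per-argument lambda in the sketch above):

```cpp
#include <string>
#include <vector>

// Hypothetical: declare how the raw string should be converted, instead of
// supplying a conversion lambda per argument.
enum arg_type { ARG_INT, ARG_FLOAT, ARG_STRING, ARG_FLAG };

struct arg_def {
    std::vector<std::string> names; // e.g. {"-np", "--parallel"}
    std::string              help;
    arg_type                 type;  // how to parse the value
    void *                   dest;  // where the parsed value lands
};

// e.g. add_argument({{"-np", "--parallel"},
//                    "number of parallel sequences to decode",
//                    ARG_INT, &n_parallel});
```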
Lots of interesting things to consider have been mentioned, and many deserve long consideration - especially about how to organise multi-contributor projects and the extent to which some kind of central control is eventually necessary - but this isn't the place for that. I think this PR has served its purpose, so I will close it.
All the apps/executables in `llama.cpp/examples` currently use the same `help`, based on lists in `common.cpp`, which don't distinguish between the arguments/parameters that are implemented in a particular app, so it's hard to know which ones should work (implemented) and which shouldn't (not implemented) or for some reason don't (bug).

The discussion #3514 details early thoughts and shows output from the latest implementation, which scans all the `.cpp` source files in `examples/` and identifies which are using one of the `gpt_params` attributes by searching on `params.attribute`. At the moment this is implemented in Python in `find-implemented-args.py`, but it could be ported to C++, and some part would obviously need to be (TODO) to incorporate it into, say, `common.cpp`.

`readcommonh.py` grabs all the attributes from `common.h` and is then imported into `find-implemented-args.py`, providing up-to-date scans of `common/common.h` for all current param attributes (although there are a couple that escape this scan - `logit_bias` and `lora_adapter`). There are two syntactically distinct forms for the attributes: their `params.attribute` form (e.g. `n_threads`) and their `common.cpp` help form (e.g. `--threads`), as triggered by `/bin/app --help`. These are matched by `find-implemented-args.py` to perform substitutions/links.

About 12 implemented args/parameters are not listed in the `master` version of `common.cpp`, so they have been added in the `context-sensitive-help` branch of my fork - `--embedding, --beams, --ppl-stride, --ppl-output-type, --memory-f32, --no-mmap, --mlock, --use-color, --nprobs, --alias, --infill, --prompt-file` - but the `fprintf()` sequence will also need to be checked (TODO).

Discussion #3514 shows a complete output run based on current values, although only the `help` variety is really needed, but if this is of any general interest we can work on a C++ version that could be built into `common.cpp` along with the rest of the help.

Written and tested on Apple M2 MAX (38-core) with 32GB RAM running Sonoma 14.0.
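For illustration, a rough C++ sketch of the scan-and-match step described above. The real implementation is the Python script `find-implemented-args.py`; the attribute-to-flag table here is abbreviated, and the whole thing is only a proof-of-concept outline of the approach:

```cpp
#include <fstream>
#include <iostream>
#include <map>
#include <regex>
#include <set>
#include <sstream>
#include <string>

int main(int argc, char ** argv) {
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " <example.cpp>\n";
        return 1;
    }

    // Abbreviated stand-in for the table derived from common.h/common.cpp,
    // mapping each gpt_params attribute to its help-style flag.
    const std::map<std::string, std::string> attr_to_flag = {
        { "n_threads", "-t, --threads N"  },
        { "n_ctx",     "-c, --ctx-size N" },
        { "temp",      "--temp T"         },
        // ... one entry per attribute scraped from common.h
    };

    // Read the example's source and collect every params.<attribute> use.
    std::ifstream in(argv[1]);
    std::stringstream buf;
    buf << in.rdbuf();
    const std::string src = buf.str();

    std::set<std::string> used;
    const std::regex attr_re(R"(params\.(\w+))");
    for (std::sregex_iterator it(src.begin(), src.end(), attr_re), end; it != end; ++it) {
        used.insert((*it)[1].str());
    }

    // Emit help lines only for the options this example appears to implement.
    for (const std::string & attr : used) {
        const auto hit = attr_to_flag.find(attr);
        if (hit != attr_to_flag.end()) {
            std::cout << "  " << hit->second << "\n";
        }
    }
    return 0;
}
```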