RFC-0001-The-Llama-Stack #8
Conversation
One question I have about the lifecycle concerns the monitoring/human feedback portion of this diagram, and whether this proposal could look into adding a more complete observability standard for development and deployment. I think it would be a good idea to include a standard for how observability is taken into account in this lifecycle. There have been first efforts to address this in the gen-ai semantic conventions in OpenTelemetry, although the conventions as they exist today still have issues.
With work on activation steering, such as the Activation Addition paper and the sparse autoencoder work done by Anthropic and OpenAI, it's only a matter of time before we get better information about the internals of these models. Last month Anthropic also opened a waitlist for their Steering API beta, which will hopefully expose feature clamping and monitoring. Recent work has shown that the single shot we give models when answering a prompt is a fundamentally simplistic way of using them, and I imagine a strong agent framework would allow for monitoring and observability of internal states, which are a richer, continuous, and differentiable representation of the model. Unfortunately, API-only models are stuck in the "Chat API" era: they can't give any access to the underlying activation space because that is their multi-billion-dollar intellectual property. This is where I see open-source models completely leapfrogging in capability, since having this access would allow for much more than we have today. What I would like: a tracing setup that is extensible, so it can incorporate the useful information we know about the models today while allowing for future improvements.
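To make the "extensible tracing" idea concrete, here is a rough sketch using the OpenTelemetry Python API. The `gen_ai.*` attribute names are in the spirit of the gen-ai semantic conventions; everything under `llm.internals.*`, as well as the `client.complete` and `result.internals` objects, is hypothetical and only illustrates where model-internal information could be attached.

```python
# Sketch only: one span carries standard gen-ai attributes plus hypothetical
# extension attributes for model internals (e.g. which SAE features fired).
from opentelemetry import trace

tracer = trace.get_tracer("llama-agent")

def traced_completion(client, model: str, prompt: str):
    with tracer.start_as_current_span("llm.completion") as span:
        span.set_attribute("gen_ai.request.model", model)
        span.set_attribute("gen_ai.prompt", prompt)

        # `client.complete` is a placeholder for whatever inference call is used.
        result = client.complete(model=model, prompt=prompt)

        span.set_attribute("gen_ai.completion", result.text)
        # Hypothetical: if the serving stack exposed internal state, the same
        # span could carry a summary of it alongside the usual request data.
        if getattr(result, "internals", None) is not None:
            span.set_attribute(
                "llm.internals.active_features",
                result.internals.top_feature_ids,
            )
        return result
```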
I've been thinking about this a lot, and while there are tools like transformer_lens for development and research, there don't seem to be many tools or efforts that enable this for deployed systems. This might be scope creep for this spec, but observability/monitoring is an essential part of any complex software system, and the LLM agent world hasn't caught up yet; that gap has been limiting some of the development I want to do on other downstream tasks.
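For comparison, this is roughly what the research side already looks like with transformer_lens: capturing internal activations takes a couple of lines. The model name and hook key below are just examples; the point is that nothing comparable exists once the model sits behind a deployed chat endpoint.

```python
# Research-time introspection with transformer_lens: run the model and keep
# a cache of every intermediate activation for inspection.
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # example model

logits, cache = model.run_with_cache("The Llama Stack should expose")

# The cache maps hook names to activation tensors, e.g. the post-MLP
# activations of block 0; a deployed system has no equivalent surface.
mlp_acts = cache["blocks.0.mlp.hook_post"]
print(mlp_acts.shape)  # (batch, seq_len, d_mlp)
```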
A couple of things I'm interested in:
Force-pushed from e7cd58a to 417ba2a
Force-pushed from cc3614f to 124b2c1
Thanks for the pointer. This is a great addition! As an update to this PR itself, we have added an Observability API which can be used from fine-tuning. Please take a look and let us know what you think.
We are just getting started right now :) and we'd like to make sure we cover the basics well and provide enough value that the ecosystem finds this worth building on and adopting. If things go well, everything you suggest could follow.
At present I am working on putting LLMs into production. Many production-level standards are missing, which makes it hard for me to know what the right approach is. I hope Llama can set a standard through open source.
As part of the Llama 3.1 release, Meta is releasing an RFC for ‘Llama Stack’, a comprehensive set of interfaces / APIs for ML developers building on top of Llama foundation models. We are looking for feedback on where the API can be improved, any corner cases we may have missed, and your general thoughts on how useful this will be. Ultimately, our hope is to create a standard for working with Llama models in order to simplify the developer experience and foster innovation across the Llama ecosystem.