
updated local APIs to return usage info #585

Merged: cpacker merged 3 commits into main from track-localllm-tokens on Dec 14, 2023
Conversation

@cpacker (Collaborator) commented on Dec 6, 2023

Please describe the purpose of this pull request.
Closes #580

How to test
Run a local LLM backend and check for runtime errors and token tracking with --debug.
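
For context on what "usage info" means here: each local backend wrapper now returns token counts alongside the completion text. Below is a minimal, hypothetical sketch of the idea; the endpoint URL, payload shape, and the character-based fallback estimate are illustrative assumptions, not MemGPT's actual code.

```python
import requests  # assumed dependency for this sketch


def get_completion_with_usage(prompt: str,
                              endpoint: str = "http://localhost:8000/v1/completions"):
    """Hypothetical wrapper: call a local LLM backend, return (text, usage).

    The endpoint URL and payload shape are assumptions; each backend
    (webui, lmstudio, llamacpp, koboldcpp, ollama, vllm) reports usage
    differently, if at all.
    """
    resp = requests.post(endpoint, json={"prompt": prompt}, timeout=60)
    resp.raise_for_status()
    data = resp.json()
    text = (data.get("choices") or [{}])[0].get("text", "")

    # Prefer backend-reported usage; otherwise fall back to a rough
    # ~4-characters-per-token estimate so token tracking never breaks.
    usage = data.get("usage") or {
        "prompt_tokens": len(prompt) // 4,
        "completion_tokens": len(text) // 4,
    }
    usage.setdefault("total_tokens",
                     usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0))
    return text, usage
```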

Have you tested this PR?

- [x] webui
  - [ ] webui-legacy
- [x] lmstudio
- [x] llamacpp
- [x] koboldcpp
- [x] ollama
- [x] vllm

@cpacker changed the title from "updated APIs to return usage info" to "updated local APIs to return usage info" on Dec 6, 2023
@cpacker merged commit 986567a into main on Dec 14, 2023
2 checks passed
@cpacker deleted the track-localllm-tokens branch on December 14, 2023 05:11
cpacker added a commit that referenced this pull request Dec 15, 2023
* updated local APIs to return usage info (#585)

* updated APIs to return usage info

* tested all endpoints

* added autogen as an extra (#616)

* added autogen as an extra

* updated docs

Co-authored-by: hemanthsavasere <[email protected]>

* Update LICENSE

* Add safeguard on tokens returned by functions (#576)

* swapping out hardcoded str for prefix (forgot to include in #569)

* add extra failout when the summarizer tries to run on a single message

* added function response validation code; currently truncates responses based on character count (see the sketch after this commit message)

* added return type hints (functions/tools should either return strings or None)

* discuss function output length in custom function section

* made the truncation more informative

* patch bug where None.copy() throws runtime error (#617)

* allow passing custom host to uvicorn (#618)

* feat: initial poc for socket server

* feat: initial poc for frontend based on react

Set up an nx workspace, which makes it easy to manage dependencies, and added shadcn components
that allow us to build good-looking UI in a fairly simple way.
The UI is a simple, basic chat that starts with a message from the user and then displays the
answer string sent back from the FastAPI WS endpoint.

* feat: map arguments to JSON and return new messages

Except for the previous user message, we return all newly generated messages and let the frontend figure out how to display them.

* feat: display messages based on role and show inner thoughts and connection status

* chore: build newest frontend

* feat(frontend): show loader while waiting for first message and disable send button until connection is open

* feat: make agent send the first message and loop similar to CLI

Currently the CLI loops until the correct function call sends a message to the user; this is an initial attempt to achieve similar behavior in the socket server.

* chore: build new version of frontend

* fix: rename lib directory so it is not excluded as part of python gitignore

* chore: rebuild frontend app

* fix: save agent at end of each response to allow the conversation to carry on over multiple sessions

* feat: restructure server to support multiple endpoints and add agents and sources endpoint

* feat: setup frontend routing and settings page

* chore: build frontend

* feat: another iteration of web interface

Changes include: WebSocket for chat, switching between different agents, and introduction of Zustand state management.

* feat: adjust frontend to work with memgpt rest-api

* feat: adjust existing rest_api to serve and interact with frontend

* feat: build latest frontend

* chore: build latest frontend

* fix: cleanup workspace

---------

Co-authored-by: Charles Packer <[email protected]>
Co-authored-by: hemanthsavasere <[email protected]>
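
One item in the commit message above, "Add safeguard on tokens returned by functions (#576)", truncates oversized function/tool responses by character count. A minimal sketch of that idea follows; the limit constant, function name, and message wording are illustrative assumptions, not the actual implementation.

```python
from typing import Optional

FUNCTION_RETURN_CHAR_LIMIT = 3000  # illustrative value, not the real constant


def validate_function_response(response: Optional[str],
                               limit: int = FUNCTION_RETURN_CHAR_LIMIT) -> Optional[str]:
    """Hypothetical safeguard: functions/tools should return str or None.

    Oversized strings are truncated by character count (a cheap proxy for
    token count), with an informative suffix so the agent knows output was cut.
    """
    if response is None:
        return None
    if not isinstance(response, str):
        # Coerce unexpected return types to a string before measuring.
        response = str(response)
    if len(response) > limit:
        trimmed = len(response) - limit
        response = (response[:limit]
                    + f"... [NOTE: function output truncated; {trimmed} more characters not shown]")
    return response
```

Truncating by characters rather than tokens avoids a tokenizer dependency at the cost of some precision, which matches the "based on character count" wording in the commit bullet.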
sarahwooders pushed a commit that referenced this pull request Dec 26, 2023
* updated APIs to return usage info

* tested all endpoints
goetzrobin added a commit to goetzrobin/MemGPT that referenced this pull request Jan 11, 2024
norton120 pushed a commit to norton120/MemGPT that referenced this pull request Feb 15, 2024
norton120 pushed a commit to norton120/MemGPT that referenced this pull request Feb 15, 2024
mattzh72 pushed a commit that referenced this pull request Oct 9, 2024
mattzh72 pushed a commit that referenced this pull request Oct 9, 2024
Successfully merging this pull request may close these issues:
Track token use with local LLMs (#580)