[v0.2.1] change default server settings (add small and large), enable env variables to config appl, update cursorrules

* Can configure default small and large models, unifying the interface (server name) for developers and users.
* Enable using environment variables to update APPL configs (requires `jsonargparse`).
* Update cursorrules to make better use of Cursor for composing code.
* Update docs for setting up observability platforms.
* Support caching requests with `response_format` as a Pydantic model.
* Add `.env.example` to help set up new projects.
* Add a hint when Langfuse or Lunary is not configured.
dhh1995 committed Dec 23, 2024
1 parent 4fbdab8 commit 248c409
Showing 22 changed files with 348 additions and 104 deletions.
21 changes: 14 additions & 7 deletions .cursorrules
@@ -7,7 +7,7 @@ Best Practices for writing code:
</general_rules>

<working_with_llms>
When calling Large Language Models (LLMs), you should use the package `appl` to build the application.
When calling Large Language Models (LLMs), you should use the package `appl` to build the application (the package name on PyPI is `applang`, so it should be installed with `pip install -U applang`).

When working with LLMs, you should provide clear format instructions to guide them toward the desired output, and an appropriate parser to convert that output into the desired format. The format should be clear and easy to parse; consider using a format like Markdown code blocks.
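For instance, a minimal parser for the Markdown code-block format mentioned above could look like the following sketch (illustrative only, not part of this commit; the fence convention and regex are assumptions):

```python
import re

FENCE = "`" * 3  # markdown code fence


def extract_code_block(text: str, lang: str = "python") -> str:
    """Return the contents of the first fenced code block in an LLM response."""
    pattern = re.escape(FENCE + lang) + r"\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("no fenced code block found in the response")
    return match.group(1).strip()
```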

@@ -16,13 +16,14 @@ You should add `stream=True` for tasks that require LLMs to generate a large amo

<explain_appl>
APPL is a package that integrates LLM prompts into code.
- `@ppl` is a decorator that marks a function as a prompt function; the function cannot be a coroutine (async function).
- Grow your prompt by calling `grow()`; an implicit newline is added between components. When asked to be implicit, you can remove the `grow()` call and leave its content as it is; APPL will automatically add the `grow()` call for you at runtime.
- The docstring of a `@ppl` function is not counted as part of the prompt by default. If it is meant to be the system prompt, you can specify that using `@ppl(docstring_as="system")`.
- The `gen` function is a wrapper around `litellm.completion`; it returns a future object and automatically takes the prompt captured so far as the prompt for the LLM. See the example below for more details. Note that you do not need to wrap `gen` in an AIRole() scope to call it for generation.
- You can use `with role:` to specify the role of a message, for example `with AIRole():` to mark the prompt grown in that scope as the assistant message. The default scope is `user`.
- To get the result of `gen` immediately, use `str()` to convert it to a string. Otherwise, it is a `Generation` object whose `result` attribute holds the result.
- Try to delay the time you get the result of `gen` as much as possible, so that the code can be more parallelized.
- You should avoid using multi-line strings in `@ppl` functions as much as possible. When needed, write them with indentation aligned with the code; they will be dedented (similar to docstrings) before being used in the code.
- Try to delay getting the result of `gen` as much as possible, so that the code can be more parallelized. See the example below for more details.
- When writing a multi-line prompt, it is recommended to `grow` the prompt multiple times to take advantage of the implicit compositor that adds a newline between components. This gives better control over the prompt, since you can easily comment out parts of it. You can also use a multi-line string with indentation aligned with the code (it will be dedented, similar to a docstring, before being used in the code).
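A minimal sketch tying these points together (illustrative only and not part of this commit; the `AIRole` import path is an assumption):

```python
from appl import AIRole, gen, grow, ppl


@ppl(docstring_as="system")
def answer(question: str):
    """You are a concise assistant."""
    grow(question)  # grown as a user message by default
    with AIRole():
        grow("The answer is:")  # grown as the beginning of the assistant message
    return gen()  # returns a future; the LLM call starts in the background


# both calls are launched before either result is needed, so they can run in parallel
a = answer("Who painted the Mona Lisa? Answer with the name only.")
b = answer("What is the capital of France? Answer with the name only.")
print(str(a), str(b))  # str() waits for and returns the generated text
```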

<example>

@@ -94,6 +95,7 @@ Prompt:
```
Output: Mona Lisa.

The two questions are answered in parallel: the generations are future objects until they are evaluated by `str` or printed, so the main process is not blocked by the LLM.
</example>

<example>
@@ -158,6 +160,8 @@ The prompt for both `gen` calls in `hello1` and `hello2` will look like:

<example>
You can use compositors to build the prompt; they specify the indexing, indentation, and separation between the parts of the prompt (grown by `grow()`) inside their scope. Some useful compositors include: Tagged, NumberedList.
Tagged wraps the content inside with opening and closing tags, and NumberedList numbers each prompt part.
You are encouraged to use Tagged to wrap content to make the prompt more readable.

```python
from appl import ppl, gen, grow
@@ -169,6 +173,7 @@ def guess_output(hints: list[str], inputs: str):
grow("Guess the output of the input.")
with Tagged("hints"):
with NumberedList():
grow("First hint")
grow(hints) # list will be captured one by one

with Tagged("inputs"):
@@ -178,15 +183,17 @@ def guess_output(hints: list[str], inputs: str):

    return gen()

print(guess_output(["The output is the sum of the numbers"], "1, 2, 3"))
print(guess_output(["The output involves addition", "The output is a single number"], "1, 2, 3"))
```

The prompt will look like:
```yaml
- User:
  Guess the output of the input.
  <hints>
  1. The output is the sum of the numbers
  1. First hint
  2. The output involves addition
  3. The output is a single number
  </hints>
  <inputs>
  1, 2, 3
@@ -197,7 +204,7 @@ The prompt will look like:
</example>

<best_practices>
Though you can call LLMs with simple tasks sharing the same context multiple times, you are encouraged to combine them into a single call with proper formatting and parsing (or using pydantic model) to reduce cost. For example, when being asked to generate a person's name and age:
Though you can call LLMs to generate one thing at a time, you are encouraged to combine such calls into a single one with proper formatting and parsing (or using a Pydantic model as `response_format`) to reduce cost. For example, when asked to generate a person's name and age:
```python
class Person(BaseModel):
    name: str
@@ -226,7 +233,7 @@ def parse_to_get_name_and_age() -> Person:
    person: Person = parse_response(response)
    return person

# or this (generally more recommended)
# or this (generally more recommended, you do not need to include format instructions in the prompt)
@ppl
def pydantic_to_get_name_and_age() -> Person:
grow("Generate a person's name and age.")
14 changes: 14 additions & 0 deletions .env.example
@@ -0,0 +1,14 @@
# API keys
OPENAI_API_KEY=<your-openai-api-key>
ANTHROPIC_API_KEY=<your-anthropic-api-key>

# Observability platform
## Langfuse
## You can find the keys at: <your-langfuse-host>/project/<project-id>/setup (Project Dashboard -> Configure Tracing)
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
LANGFUSE_HOST=<your-langfuse-host>

## Lunary
LUNARY_PUBLIC_KEY=<your-lunary-public-key>
LUNARY_API_URL=<your-lunary-api-url>
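These keys are read from environment variables (the `.env` workflow is described in `docs/setup.md` below). If you want to load them in a standalone script, a minimal sketch using `python-dotenv` (an assumed extra dependency, not added by this commit) could be:

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv (assumption)

load_dotenv()  # reads the .env file from the current working directory
print("OPENAI_API_KEY configured:", bool(os.getenv("OPENAI_API_KEY")))
```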
3 changes: 3 additions & 0 deletions .gitignore
@@ -17,6 +17,9 @@ dumps
docs/reference
docs/docs

# changelog
changelogs

# appl
appl.yml

10 changes: 6 additions & 4 deletions README.md
@@ -212,14 +212,16 @@ For a more comprehensive tutorial, please refer to the [tutorial](https://appl-t
- [Prompt Coding Helpers](https://appl-team.github.io/appl/tutorials/6_prompt_coding)
- [Using Tracing](https://appl-team.github.io/appl/tutorials/7_tracing)

### Cookbook
### Cookbook and Applications
For more detailed usage and examples, please refer to the [cookbook](https://appl-team.github.io/appl/cookbook).

APPL can be used to reproduce some popular LM-based applications easily, such as:
* [Tree of Thoughts](https://github.com/princeton-nlp/tree-of-thought-llm)[[APPL implementation](examples/advanced/tree_of_thoughts/)]: deliberate problem solving with Large Language Models.
We use APPL to reimplement popular LLM and prompting algorithms in [Reppl](https://github.com/appl-team/reppl), such as:
* [Tree of Thoughts](https://github.com/princeton-nlp/tree-of-thought-llm) [[Re-implementation](https://github.com/appl-team/reppl/tree/main/tree-of-thoughts/)] [[APPL Example](examples/advanced/tree_of_thoughts/)]: deliberate problem solving with Large Language Models.

We use APPL to build popular LM-based applications, such as:
* [Wordware's TwitterPersonality](https://twitter.wordware.ai/) [[APPL implementation](https://github.com/appl-team/TwitterPersonality)]: analyzes your tweets to determine your Twitter personality.

We also use APPL to build small LLM-powered libraries, such as:
We use APPL to build small LLM-powered libraries, such as:
* [AutoNaming](https://github.com/appl-team/AutoNaming): automatically generate names for experiments based on argparse arguments.
* [ExplErr](https://github.com/appl-team/ExplErr): a library for error explanation with LLMs.

53 changes: 51 additions & 2 deletions docs/setup.md
@@ -16,6 +16,11 @@ For example, you can create a `.env` file with the following content to specify
OPENAI_API_KEY=<your openai api key>
```

We provide an example `.env.example` file in the root directory; you can copy it to your project directory and modify it.
```bash title=".env.example"
--8<-- ".env.example"
```

### Export or Shell Configuration
Alternatively, you can export the environment variables directly in your terminal, or add them to your shell configuration file (e.g., `.bashrc`, `.zshrc`). For example:
```bash
@@ -30,8 +35,15 @@ export OPENAI_API_KEY=<your openai api key>
--8<-- "src/appl/default_configs.yaml"
```

??? note "You should setup your own default server."
The default server (currently `gpt-4o-mini`) set in APPL could be outdated and changed in the future. We recommend you to specify your own default model in the `appl.yaml` file.
??? note "Setup your default models"
You should specify your own default model in the `appl.yaml` file. You may also specify the default "small" and "large" models, which fall back to the default model if not specified.
The name can be a server name in your configuration (`servers` section), or a model name that is supported by litellm.
```yaml title="appl.yaml (example)"
settings:
  model:
    default: gpt-4o-mini  # the small model falls back to this if not set
    large: gpt-4o
```
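Assuming the `server` argument of `gen` selects among these defaults (as the comments in `examples/usage/cmd_args.py` later in this commit suggest), usage might look like the following sketch:

```python
from appl import gen, grow, ppl


@ppl
def summarize(text: str):
    grow(f"Summarize the following in one sentence:\n{text}")
    # gen() with no `server` uses the `default` model;
    # server="small" / server="large" pick the corresponding defaults
    return gen(server="small")


print(summarize("APPL integrates LLM prompts into Python code."))
```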

### Override Configs
You can override these configurations by creating an `appl.yaml` file in the root directory of your project (or other directories; see [Priority of Configurations](#priority-of-configurations) for more details). A typical usage is to override the `servers` configuration to specify the LLM servers you want to use, as shown in the following example `appl.yaml` file.
@@ -96,6 +108,43 @@ settings:
To resume from a previous trace, you can specify the `APPL_RESUME_TRACE` environment variable with the path to the trace file. See more details in the [tutorial](./tutorials/7_tracing.md).
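For example (the script name below is a placeholder):

```bash
APPL_RESUME_TRACE=<path-to-previous-trace> python your_script.py
```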

## Visualize Traces

### Langfuse (Recommended)

Langfuse is an open-source web-based tool for visualizing traces and LLM calls.

You can host Langfuse [locally](https://langfuse.com/self-hosting) or use the [public version](https://langfuse.com/).

```bash
git clone https://github.com/langfuse/langfuse.git
cd langfuse
docker compose up
```

Then you can set the environment variables for the Langfuse server as follows:

```bash title=".env"
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
LANGFUSE_HOST=<your-langfuse-host>
# Set to http://localhost:3000 if you are hosting Langfuse locally
```
You can find your Langfuse public and private API keys on the project settings page (Project Dashboard -> Configure Tracing).

Please see [the tutorial](./tutorials/7_tracing.md#langfuse-recommended) for more details.

You can see conversations like:

![Langfuse Conversation](./_assets/tracing/langfuse_convo.png)

and the timeline like:

![Langfuse Timeline](./_assets/tracing/langfuse_timeline.png)

### Lunary
Please see [the tutorial](./tutorials/7_tracing.md#lunary) for more details.

### LangSmith
To enable [LangSmith](https://docs.smith.langchain.com/) tracing, you need to [obtain your API key](https://smith.langchain.com/settings) from LangSmith and add the following environment variables to your `.env` file:

14 changes: 9 additions & 5 deletions docs/tutorials/7_tracing.md
@@ -78,12 +78,13 @@ docker compose up

Then you can set the environment variables for the Langfuse server as follows:

```bash
export LANGFUSE_PUBLIC_KEY=<your public key>
export LANGFUSE_SECRET_KEY=<your secret key>
export LANGFUSE_HOST=http://localhost:3000
```bash title=".env"
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
LANGFUSE_HOST=<your-langfuse-host>
# Set to http://localhost:3000 if you are hosting Langfuse locally
```
You can find your Langfuse public and private API keys in the project settings page.
You can find your Langfuse public and private API keys on the project settings page (Project Dashboard -> Configure Tracing).

Then you can visualize the traces by:

@@ -99,6 +100,9 @@ and the timeline like:

![Langfuse Timeline](../_assets/tracing/langfuse_timeline.png)

??? question "Troubleshooting: Incomplete traces on Langfuse"
You may see incomplete traces (function-call trees) in Langfuse when you click through from the `Traces` page. This might be because Langfuse applies a filter based on the timestamp; try removing the `?timestamp=<timestamp>` part of the URL and refreshing the page.

### Lunary

Lunary is another open-source web-based tool for visualizing traces and LLM calls.
4 changes: 3 additions & 1 deletion examples/advanced/tree_of_thoughts/appl.yaml
@@ -1,5 +1,7 @@
servers:
default_servers:
  default: gpt4o-t07

servers:
  gpt4o-t07:
    model: gpt-4o
    temperature: 0.7
4 changes: 3 additions & 1 deletion examples/appl.yaml
@@ -11,9 +11,11 @@ settings:
  tracing:
    enabled: true

# default_servers:
#   default: azure-gpt35 # override the default server according to your needs

# example for setting up servers
servers:
  # default: azure-gpt35 # override the default server according to your needs
  azure-gpt35: # the name of the server
    model: azure/gpt35t # the model name
    # temperature: 1.0 # set the default temperature for the calls to this server
13 changes: 9 additions & 4 deletions examples/usage/cmd_args.py
@@ -3,15 +3,20 @@
import appl
from appl import gen, ppl

# option 1: update part of the configs in a dict
# appl.init(servers={"default": "gpt-4o"})
# * option 1: update part of the configs in a dict
# appl.init(default_servers={"default": "gpt-4o", "small": "gpt-4o-mini"})
# !! update the default server to `gpt-4o`, and the small server to `gpt-4o-mini`
# !! used for gen(...) and gen(server="small", ...)

# option 2: get options from command line
# * option 2: get options from command line or environment variables
parser = appl.get_parser()
parser.add_argument("--name", type=str, default="APPL")
args = parser.parse_args()
appl.update_appl_configs(args.appl)
# python cmd_args.py --appl.servers.default gpt-4o

# * Both change the default server to `gpt-4o`
# python cmd_args.py --appl.default_servers.default gpt-4o
# _APPL__DEFAULT_SERVERS__DEFAULT=gpt-4o python cmd_args.py


@ppl # the @ppl decorator marks the function as an `APPL function`
16 changes: 15 additions & 1 deletion pdm.lock

Some generated files are not rendered by default.

3 changes: 2 additions & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "pdm.backend"

[project]
name = "applang"
version = "0.2.0"
version = "0.2.1"
description = "A Prompt Programming Language"
authors = [
{ name = "Honghua Dong", email = "[email protected]" },
@@ -30,6 +30,7 @@ dependencies = [
"rich>=13.8.1",
"pillow>=11.0.0",
"deprecated>=1.2.15",
"jsonargparse>=4.35.0",
]
requires-python = ">=3.9"
readme = "README.md"
6 changes: 4 additions & 2 deletions src/appl/__init__.py
@@ -90,9 +90,11 @@
from .version import __version__


def get_parser():
def get_parser(
    env_prefix: str = "", default_env: bool = True, **kwargs: Any
) -> ArgumentParser:
    """Get an argument parser with configurable APPL configs."""
    parser = ArgumentParser()
    parser = ArgumentParser(env_prefix=env_prefix, default_env=default_env, **kwargs)
    parser.add_argument("--appl", type=APPLConfigs, default=global_vars.configs)
    return parser
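A hedged usage sketch of the new signature (mirroring `examples/usage/cmd_args.py` above; the environment-variable naming comes from jsonargparse's `default_env` handling):

```python
import appl

# with default_env=True, jsonargparse also reads values from environment variables,
# e.g. _APPL__DEFAULT_SERVERS__DEFAULT=gpt-4o overrides the default server
parser = appl.get_parser()
args = parser.parse_args()
appl.update_appl_configs(args.appl)
```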

19 changes: 17 additions & 2 deletions src/appl/caching/db.py
@@ -7,6 +7,7 @@

from litellm import ModelResponse
from loguru import logger
from pydantic import BaseModel

from ..core.globals import global_vars
from ..core.types.caching import DBCacheBase
@@ -186,6 +187,20 @@ def insert(
)


def _serialize_args(args: Dict[str, Any]) -> str:
    args = args.copy()
    for k, v in args.items():
        # dump as schema if it is a pydantic model
        if isinstance(v, type):
            if issubclass(v, BaseModel):
                args[k] = v.model_json_schema()
            else:
                # TODO: convert to a schema
                logger.warning(f"Unknown type during serialization: {type(v)}")
                args[k] = str(v)
    return json.dumps(args)


def find_in_cache(
    args: Dict[str, Any], cache: Optional[DBCacheBase] = None
) -> Optional[ModelResponse]:
@@ -207,7 +222,7 @@ def find_in_cache(
    ):
        return None
    # only cache the completions with temperature == 0
    value = cache.find(json.dumps(args))
    value = cache.find(_serialize_args(args))
    if value is None:
        return None
    return dict_to_pydantic(value, ModelResponse)
@@ -226,7 +241,7 @@ def add_to_cache(
        and not global_vars.configs.settings.caching.allow_temp_greater_than_0
    ):
        return
    args_str = json.dumps(args)
    args_str = _serialize_args(args)
    value_dict = pydantic_to_dict(value)
    logger.info(f"Adding to cache, args: {args_str}, value: {value_dict}")
    cache.insert(args_str, value_dict)
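As an illustration (not part of the diff), the new `_serialize_args` lets a Pydantic `response_format` be folded into the cache key via its JSON schema instead of failing to serialize; the snippet below mirrors that logic with a standalone model:

```python
import json

from pydantic import BaseModel


class Person(BaseModel):
    name: str
    age: int


args = {"model": "gpt-4o-mini", "temperature": 0, "response_format": Person}
# a BaseModel subclass is dumped as its JSON schema, so the key is JSON-serializable and stable
serialized = {
    k: v.model_json_schema() if isinstance(v, type) and issubclass(v, BaseModel) else v
    for k, v in args.items()
}
print(json.dumps(serialized))
```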