Langfuse is an open source observability & analytics solution for LLM-based applications. It is mostly geared towards production usage but some users also use it for local development of their LLM applications.
Langfuse is focused on applications built on top of LLMs. Many new abstractions and common best practices have evolved recently, e.g. agents, chained prompts, embedding-based retrieval, LLM access to REPLs & APIs. These make applications more powerful but also less predictable, as developers cannot fully anticipate how changes impact the quality, cost and overall latency of their application. Langfuse helps to monitor and debug these applications.
Demo (2 min)
Explore demo project in Langfuse here (free account required): https://langfuse.com/demo
Langfuse offers an admin UI to explore the ingested data.
- Nested view of LLM app executions, with detailed information on latency, cost and scores along each trace
- Segment execution traces by user feedback, e.g. to identify production issues
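As a rough illustration of what a nested view with cost roll-up and feedback-based segmentation involves, here is a minimal Python sketch. The `Observation` and `Trace` classes, their field names, and the `negative_feedback` helper are hypothetical and only illustrate the idea; they are not the Langfuse data model or SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Observation:
    """Hypothetical node in a trace tree, e.g. a span or an LLM generation."""
    name: str
    latency_ms: float = 0.0  # wall-clock time of this step
    cost_usd: float = 0.0    # cost attributed to this step alone
    children: List["Observation"] = field(default_factory=list)

    def total_cost(self) -> float:
        """Cost of this step plus the cost of all nested steps."""
        return self.cost_usd + sum(child.total_cost() for child in self.children)

    def render(self, depth: int = 0) -> List[str]:
        """Return one indented line per observation, parents before children."""
        line = f"{'  ' * depth}{self.name}: {self.latency_ms:.0f} ms, ${self.total_cost():.4f}"
        lines = [line]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


@dataclass
class Trace:
    """Hypothetical trace: a root observation plus named scores."""
    id: str
    root: Observation
    scores: Dict[str, float] = field(default_factory=dict)


def negative_feedback(traces: List[Trace]) -> List[Trace]:
    """Keep only traces whose (assumed) 'user_feedback' score is below zero."""
    return [t for t in traces if t.scores.get("user_feedback", 0) < 0]
```

Calling `"\n".join(trace.root.render())` on such a trace yields an indented outline of the execution, and `negative_feedback(traces)` narrows a list of traces down to those that users rated poorly.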
Reporting on
- Token usage by model
- Volume of traces
- Scores/evals
Broken down by
- Users
- Releases
- Prompt/chain versions
- Prompt/chain types
- Time
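The kind of breakdown listed above can be sketched in a few lines of Python. This is only an illustration of grouping token usage by one or more dimensions; the record layout (`model`, `release`, `prompt_tokens`, `completion_tokens`) and the `token_usage_by` helper are assumptions made for the example, not the Langfuse schema or API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple


def token_usage_by(records: Iterable[Mapping], *keys: str) -> Dict[Tuple, int]:
    """Sum prompt and completion tokens, grouped by the given record keys."""
    totals: Dict[Tuple, int] = defaultdict(int)
    for record in records:
        group = tuple(record[key] for key in keys)
        totals[group] += record["prompt_tokens"] + record["completion_tokens"]
    return dict(totals)


# Hypothetical generation records, as they might be exported for reporting.
records = [
    {"model": "gpt-4", "release": "v1", "prompt_tokens": 100, "completion_tokens": 50},
    {"model": "gpt-4", "release": "v2", "prompt_tokens": 10, "completion_tokens": 5},
    {"model": "gpt-3.5-turbo", "release": "v1", "prompt_tokens": 20, "completion_tokens": 20},
]

usage_by_model = token_usage_by(records, "model")
usage_by_model_and_release = token_usage_by(records, "model", "release")
```

Passing more keys (for example a user or prompt version field, if the records carry one) gives finer-grained breakdowns with the same function.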
→ Expect releases with more ways to analyze the data over the next weeks.
Managed deployment by the Langfuse team, generous free-tier (hobby plan) available, no credit card required.
Links: Create account, learn more
Requirements: docker, docker compose (e.g. using Docker Desktop)
# Clone repository
git clone https://github.com/langfuse/langfuse.git
cd langfuse
# Run server and database
docker compose up -d
Fully async, typed SDKs to instrument any LLM application. Currently available for Python & JS/TS.
→ Guide with an example of how the SDK can be used
Description | Links |
---|---|
Python | docs, repo |
JS/TS: Node >= 18, Edge runtimes | docs, repo |
JS/TS: Node <18 | docs, repo |
The Langfuse callback handler automatically instruments Langchain applications. Currently available for Python and JS/TS.
Python
pip install langfuse
# Initialize Langfuse handler
from langfuse.callback import CallbackHandler
handler = CallbackHandler(PUBLIC_KEY, SECRET_KEY)
# Setup Langchain
from langchain.chains import LLMChain
...
chain = LLMChain(llm=llm, prompt=prompt)
# Add Langfuse handler as callback
chain.run(input="<user_input>", callbacks=[handler])
→ Langchain integration docs for Python
JS/TS
→ Langchain integration docs for JS/TS
Quality/evaluation of traces is tracked via scores (docs). Scores are related to traces and optionally to observations. Scores can be added via:
- Backend SDKs (see docs above): {trace, event, span, generation}.score()
- API (see docs below): POST /api/public/scores
- Client-side using Web SDK, e.g. to capture user feedback or other user-based quality metrics:
npm install langfuse
// Client-side (browser)
import { LangfuseWeb } from "langfuse";

const langfuseWeb = new LangfuseWeb({
  publicKey: process.env.LANGFUSE_PUBLIC_KEY,
});

// frontend handler (example: React)
export function UserFeedbackComponent(props: { traceId: string }) {
  const handleUserFeedback = async (value: number) => {
    await langfuseWeb.score({
      traceId: props.traceId,
      name: "user_feedback",
      value,
    });
  };
  return (
    <div>
      <button onClick={() => handleUserFeedback(1)}>👍</button>
      <button onClick={() => handleUserFeedback(-1)}>👎</button>
    </div>
  );
}
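For the API option (`POST /api/public/scores`), a request could be prepared along the following lines. This is a minimal sketch using only the Python standard library. It assumes the endpoint accepts HTTP Basic authentication with the public key as username and the secret key as password, and that the JSON body uses the same `traceId`, `name` and `value` fields as the Web SDK call above. The host URL and the `build_score_request` helper are placeholders for illustration; check the API docs for the authoritative schema.

```python
import base64
import json
import urllib.request


def build_score_request(
    host: str,
    public_key: str,
    secret_key: str,
    trace_id: str,
    name: str,
    value: float,
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that attaches a score to a trace."""
    # Assumed body shape, mirroring the fields of the Web SDK score() call.
    payload = {"traceId": trace_id, "name": name, "value": value}
    # Assumed auth scheme: HTTP Basic with public key / secret key.
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/public/scores",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Placeholder host and keys; replace with your own deployment and credentials.
request = build_score_request(
    "http://localhost:3000", "pk-example", "sk-example", "trace-123", "user_feedback", 1
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL, headers and body before pointing the code at a real deployment.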
- POST/PATCH routes to ingest data
- GET routes to use data in downstream applications (e.g. embedded analytics)
The maintainers are very active in the Langfuse Discord and are happy to answer questions or discuss feedback/ideas regarding the future of the project.
Join the community on Discord.
To contribute, send us a PR, raise a GitHub issue, or email us at [email protected]
See CONTRIBUTING.md for details on how to set up a development environment.
Langfuse is MIT licensed, except for the ee/ folder. See LICENSE and docs for more details.
# Stop server and db
docker compose down
# Pull latest changes
git pull
docker compose pull
# Run server and db
docker compose up -d
Check out the GitHub Actions workflows of the Python SDK and JS/TS SDK.
By default, Langfuse automatically reports basic usage statistics to a centralized server (PostHog).
This helps us to:
- Understand how Langfuse is used and improve the most relevant features.
- Track overall usage for internal and external (e.g. fundraising) reporting.
None of the data is shared with third parties, and it does not include any sensitive information. We want to be super transparent about this and you can find the exact data we collect here.
You can opt out by setting TELEMETRY_ENABLED=false.