External / Internal frameworks implementation (+ multi-agent) #28

Open
GALLLASMILAN opened this issue Jan 21, 2025 · 5 comments

GALLLASMILAN commented Jan 21, 2025

Final suggestion [draft]

I would like to split the solution into several parts, each with its own priority.

Common data

Here is a list of the data (based on the openinference packages) that we are able to collect from all of the external frameworks mentioned below.

attribute   example
traceId     DM4tjgcj7C45km4kXGBmdw
status      error / ok
framework   langchain / dspy / crewai
language    JavaScript / Python
input       How are you today?
output      I am an agent, I am fine
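
A minimal sketch of how these common attributes could be attached to an OpenTelemetry span; the attribute keys here are illustrative placeholders, not a fixed schema:

from opentelemetry import trace

tracer = trace.get_tracer("agent-observability-demo")

# Illustrative only: the attribute keys are placeholders, not a fixed schema.
with tracer.start_as_current_span("agent-run") as span:
    # traceId is not set manually; it comes from the span context itself
    span.set_attribute("status", "ok")
    span.set_attribute("framework", "langchain")
    span.set_attribute("language", "python")
    span.set_attribute("input", "How are you today?")
    span.set_attribute("output", "I am an agent, I am fine")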

extended data

Some frameworks (like dspy) do not provide this data.

attribute   example
provider    ollama
llm         llama3.1

framework specific data

See the whole span tree data for each framework, together with the summaries I prepared, to get a full view of the provided data. The data is very often framework-specific.

framework    summary                             all spans data
LangChain    openinference-langchain-data.txt    openinference-langchain-spans.json
CrewAI       openinference-crewai-data.txt       openinference-crewai-spans.json
Dspy         openinference-dspy-data.txt         openinference-dspy-spans.json
smolagents   openinference-smolagent-data.txt    openinference-smolagents-spans.json

I will also add more specific traces for tool calling later.

Priority 1 = Python frameworks

Openinference ✅

I found the openinference repo, which contains a set of OpenTelemetry packages for many of our chosen frameworks (CrewAI, LangChain, and LangGraph) that we can use in our stack. The packages are technology independent and give us a solid telemetry solution; otherwise it would be very hard to create a telemetry solution for each framework from scratch.

IBM solution ⛔

openllmetry ⛔

Here is a list of the supported Python technologies we will use:

TODO: specify the supported frameworks in the first version


  • the agent creates the traceId
  • generate a flowId


Call bee-agent-framework + observability

SINGLE agent

  • Tracing between the JavaScript and Python languages will be unified. In the Python version of bee-hive, the emitter principle is already used. TODO: It will be necessary to provide the same trace interface, so that the same thing is sent in both languages.
    @tomas.dvorak
  • Single-agent support = we will use external frameworks (CrewAI, Autogen, AWS Labs), and for each framework we will need to write a "wrapper" that compiles events compatible with our framework from the internal emitter/log (see the sketch after this list). For some frameworks this will not work at all, for example AWS Labs => it will not work for those that do not support the emitter pattern (and there will be more of these).
  • In terms of compatibility of basic events, we will be able to provide most event names, but not the data. So for now we will not be able to select specific data, such as rawPrompt (an internal thing).
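
A hypothetical sketch of such a wrapper; the TraceEvent shape and event names like "run.start" / "run.success" are assumptions here, not the actual bee emitter API:

from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class TraceEvent:
    name: str                      # e.g. "run.start", "run.success" (assumed names)
    trace_id: str
    data: dict[str, Any] = field(default_factory=dict)

class EmitterAdapter:
    """Translates framework-specific callbacks into unified TraceEvents."""

    def __init__(self, emit: Callable[[TraceEvent], None]):
        self._emit = emit

    # A framework wrapper would call these from its own hooks/callbacks:
    def on_start(self, trace_id: str, prompt: str) -> None:
        self._emit(TraceEvent("run.start", trace_id, {"input": prompt}))

    def on_finish(self, trace_id: str, output: str) -> None:
        self._emit(TraceEvent("run.success", trace_id, {"output": output}))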

MULTI agent

  • For "multi-agent systems" support in the framework, we will only support input/output in the first version (for example: agent xx from crewAI has started, agent xx from crewAI has finished); a possible event shape is sketched after this list.
  • "Multi-agent approach" => we will support both (agent/workflow). This means that, for example, in aws-labs we can export a single agent and then use it within a workflow in bee, but we can also export the entire orchestration (multi-agent) and then use it in our bee. So we will have to provide instrumentation at both levels.

I found the project https://github.com/Arize-ai/openinference?tab=readme-ov-file, where there are OpenTelemetry providers for more frameworks (CrewAI, LangChain, DSPy, AWS Bedrock). I will take inspiration from this project.

CrewAI

Instrumentation implementation (openinference)

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from openinference.instrumentation.crewai import CrewAIInstrumentor

# Export spans in batches over OTLP/HTTP to a local collector
endpoint = "http://127.0.0.1:4319/v1/traces"
trace_provider = TracerProvider()
trace.set_tracer_provider(trace_provider)
trace_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(endpoint)))

# Patch CrewAI so its runs emit OpenTelemetry spans
CrewAIInstrumentor().instrument(tracer_provider=trace_provider)
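
A hedged usage sketch: once the instrumentor is active, any Crew run should be traced automatically. This assumes the standard crewai Agent/Task/Crew API and a configured default LLM:

from crewai import Agent, Task, Crew

agent = Agent(role="assistant", goal="answer questions", backstory="a helpful agent")
task = Task(description="Say hello", expected_output="a short greeting", agent=agent)

# kickoff() runs the crew; the instrumentor exports the resulting spans
Crew(agents=[agent], tasks=[task]).kickoff()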

Very importantly, each span has the traceId property, so all spans from one run can be grouped into a single trace.
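
For reference, the trace id can be read from the current span's context with the standard OpenTelemetry API (a minimal sketch):

from opentelemetry import trace

span = trace.get_current_span()
ctx = span.get_span_context()
# format_trace_id renders the 128-bit trace id as a 32-character hex string
print(trace.format_trace_id(ctx.trace_id))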

Crew AI span (openinference)

Crew AI span (agent-analytics)

LangChain

Instrumentation implementation

from opentelemetry import trace as trace_api
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, BatchSpanProcessor
from openinference.instrumentation.langchain import LangChainInstrumentor

endpoint = "http://127.0.0.1:4319/v1/traces"
tracer_provider = trace_sdk.TracerProvider()
trace_api.set_tracer_provider(tracer_provider)
# Send spans to the collector and mirror them to stdout for debugging
tracer_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(endpoint)))
tracer_provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))

# Uses the globally registered tracer provider
LangChainInstrumentor().instrument()
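
A hedged usage sketch: after instrument(), LangChain runnables should emit spans without any further wiring. RunnableLambda is just a stand-in for a real chain here:

from langchain_core.runnables import RunnableLambda

chain = RunnableLambda(lambda text: text.upper())
print(chain.invoke("how are you today?"))  # the invocation is traced as a span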

LangChain span

Autogen

AUTOGEN ANALYSIS 👁

Aws labs

Multi-Agent (awslabs) ANALYSIS

Dspy

Instrumentation implementation (openinference)

from openinference.instrumentation.dspy import DSPyInstrumentor
from opentelemetry import trace as trace_api
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

endpoint = "http://127.0.0.1:4319/v1/traces"
tracer_provider = trace_sdk.TracerProvider()
trace_api.set_tracer_provider(tracer_provider)
# SimpleSpanProcessor exports each span synchronously (fine for local testing;
# BatchSpanProcessor is the usual choice in production)
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))

DSPyInstrumentor().instrument()
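
A hedged usage sketch, assuming an Ollama-served llama3.1 (matching the extended-data example above); any dspy.LM would work the same way:

import dspy

dspy.configure(lm=dspy.LM("ollama/llama3.1"))  # the model string is an assumption
qa = dspy.Predict("question -> answer")
print(qa(question="How are you today?").answer)  # traced by DSPyInstrumentor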

smolagents

Instrumentation implementation (openinference)

from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from openinference.instrumentation.smolagents import SmolagentsInstrumentor

endpoint = "http://127.0.0.1:4319/v1/traces"
trace_provider = TracerProvider()
trace_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(endpoint)))

# The provider is passed explicitly, so it does not need to be registered globally
SmolagentsInstrumentor().instrument(tracer_provider=trace_provider)
@GALLLASMILAN GALLLASMILAN self-assigned this Jan 21, 2025
@GALLLASMILAN GALLLASMILAN changed the title External frameworks implementation External / Internal frameworks implementation (+ multi-agent) Jan 21, 2025
@GALLLASMILAN

Hello @anafucs, I added some information about the collected data to the issue description. See the Common data, extended data and framework specific data sections.

I will continue with the data analytics tomorrow, so I may add more info there.

CC: @tomkis


tomkis commented Jan 30, 2025

@anafucs please let us know if common + extended data is something that would work in your designs.

If you feel like there's some more info that would be good to visualise, let us know and we'll investigate the option.

@GALLLASMILAN

Another inspiration for us could be the Phoenix observability tool, which is part of the Arize-ai platform. Phoenix is based on the same data I analyzed from the openinference packages.

I don't think this tool is a good design inspiration, but it's good to know how their native UI visualizes the data.

They don't pick apart the data and only visualize it in 2 main forms:

  • info = pretty view
  • attributes = json view

Then they have 2 more tabs:

  • events = for example errors
  • feedback = human feedback (we don't support this in our observe yet)

List (image)

Trace error span detail on the info tab (image)

The trace span detail on the attributes page (image)

@anafucs @tomkis @mmurad2 @matoushavlena


anafucs commented Jan 30, 2025

Thanks @GALLLASMILAN and @tomkis. We are thinking of a solution very similar to the one above. On the common/extended data, the one item I'm not sure about is the language (as primary info). It might be helpful in a drill-down or when we add evaluation features (comparing agents)? We will share some drafts later today!

@GALLLASMILAN

Hello @anafucs, I only mentioned the language as an option; I don't think it is as important as the others. (We can skip it.)
