OpenLLMetry is a set of extensions built on top of OpenTelemetry that gives you complete observability over your LLM application with minimal complexity.
Because it uses OpenTelemetry under the hood, it can be connected to existing observability solutions such as our polyglot stack, qryn and qryn.cloud ⭐⭐⭐
Step 1: Traceloop SDK Setup
OpenLLMetry lets you easily trace prompts and embedding calls to OpenAI, giving you a complete view of your OpenAI application through traces and spans.
To get started, install the Traceloop SDK (`pip install traceloop-sdk`) and initialize it within your code.
🧠 OpenAI Example
Automatically log all calls to OpenAI, with prompts and completions
import openai
from traceloop.sdk import Traceloop
from traceloop.sdk.decorators import workflow

# Initialize the Traceloop SDK once at application startup
Traceloop.init(app_name="joke_generation_service")

# The @workflow decorator groups the OpenAI spans under a named workflow span
@workflow(name="joke_creation")
def create_joke():
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Tell me a joke about opentelemetry"}],
    )
    return completion.choices[0].message.content
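To exercise the instrumented workflow, just call the function as usual. A minimal sketch, assuming `OPENAI_API_KEY` is set in your environment:

# Each invocation emits a "joke_creation" workflow span that wraps the
# underlying OpenAI chat-completion span, prompts and completions included.
if __name__ == "__main__":
    print(create_joke())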
🦙 LlamaIndex Example
Automatically log all LLM and embedding calls made through LlamaIndex, with prompts and completions
import chromadb
import os
import openai
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.vector_stores import ChromaVectorStore
from llama_index.storage.storage_context import StorageContext
from llama_index.embeddings import HuggingFaceEmbedding
from traceloop.sdk import Traceloop
openai.api_key = os.environ["OPENAI_API_KEY"]
# Initialize Traceloop
Traceloop.init()
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("quickstart")
# define embedding function
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
# load documents
documents = SimpleDirectoryReader("./data/my_docs/").load_data()
# set up ChromaVectorStore and load in data
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(embed_model=embed_model)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, service_context=service_context
)
# Query Data
query_engine = index.as_query_engine()
response = query_engine.query("Summarize the documents in context")
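To see what was traced, print the answer and, optionally, the retrieved chunks. `source_nodes` is a standard LlamaIndex response field you can cross-check against the vector-store spans in qryn; this is a sketch for the 0.9-era API imported above:

# The query above already produced embedding, retrieval, and LLM spans.
print(response)

# Inspect the retrieved chunks and their similarity scores locally,
# then compare them with the Chroma spans exported to qryn.
for node_with_score in response.source_nodes:
    print(node_with_score.score, node_with_score.node.get_content()[:80])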
For more information, refer to the Traceloop SDK Documentation.
Step 2: Grafana Agent Sender
Configure a Grafana Agent instance to feed Traceloop traces into qryn / qryn.cloud:
traces:
  configs:
    - name: default
      remote_write:
        - endpoint: <Gigapipe qryn.cloud endpoint>:443
          basic_auth:
            username: <Gigapipe qryn X-API-Key>
            password: <Gigapipe qryn X-API-Secret>
      receivers:
        otlp:
          protocols:
            grpc:
            http:  # the Traceloop SDK exports OTLP over HTTP (port 4318) by default
# Environment variable for your local app with Traceloop
TRACELOOP_BASE_URL=http://<grafana-agent-hostname>:4318
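If you prefer to configure this in code rather than in the shell, a minimal sketch, assuming the agent runs on `localhost`; the variable must be set before `Traceloop.init()` runs:

import os

# Point the Traceloop SDK at the local Grafana Agent's OTLP/HTTP receiver
# instead of the default Traceloop backend; must happen before init().
os.environ["TRACELOOP_BASE_URL"] = "http://localhost:4318"

from traceloop.sdk import Traceloop

Traceloop.init(app_name="joke_generation_service")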
That's it! You're now ready to explore your LLM activity using qryn.
You can immediately get started with some popular examples:
👉 Trace prompts and completions
Call OpenAI and see prompts, completions, and token usage for your call.
👉 Trace your RAG retrieval pipeline
Build a RAG pipeline with Chroma and OpenAI. See the vectors returned from Chroma, the full prompt sent to OpenAI, and the responses.
Are you ready?
Sign up for a free account on qryn.cloud or install our OSS stack on-premises ⭐