LLM Observability with Traceloop + qryn

OpenLLMetry is a set of extensions built on top of OpenTelemetry that gives you complete observability over your LLM application with minimal complexity.

Because it uses OpenTelemetry under the hood, it can be connected to existing observability solutions such as our polyglot stack qryn and qryn.cloud

Step 1: Traceloop SDK Setup

OpenLLMetry lets you easily trace prompt and embedding calls to OpenAI, giving you a complete view of your OpenAI application through traces and spans.

To get started, install the Traceloop SDK and initialize it within your code.
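The SDK is distributed on PyPI as traceloop-sdk:

pip install traceloop-sdk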

🧠 OpenAI Example

Automatically log all calls to OpenAI, with prompts and completions:

import openai
from traceloop.sdk import Traceloop
from traceloop.sdk.decorators import workflow

Traceloop.init(app_name="joke_generation_service")

@workflow(name="joke_creation")
def create_joke():
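    # Note: openai.ChatCompletion targets the pre-1.0 OpenAI Python SDK;
    # with openai>=1.0 the equivalent call is client.chat.completions.create.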
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Tell me a joke about opentelemetry"}],
    )

    return completion.choices[0].message.content
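Calling the decorated function is all that's needed; the SDK emits the workflow span and the nested OpenAI spans automatically. For example:

print(create_joke())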

🦙 LlamaIndex Example

Automatically trace a LlamaIndex RAG pipeline, including embedding, retrieval, and completion calls:

import chromadb
import os
import openai

from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.vector_stores import ChromaVectorStore
from llama_index.storage.storage_context import StorageContext
from llama_index.embeddings import HuggingFaceEmbedding
from traceloop.sdk import Traceloop
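# Note: these llama_index import paths assume llama-index<0.10;
# newer releases moved these modules under llama_index.core.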

openai.api_key = os.environ["OPENAI_API_KEY"]

# Initialize Traceloop
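# By default the SDK exports telemetry to Traceloop's hosted backend; set the
# TRACELOOP_BASE_URL environment variable to send it elsewhere (see Step 2).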
Traceloop.init()

chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("quickstart")

# define embedding function
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")

# load documents
documents = SimpleDirectoryReader("./data/my_docs/").load_data()

# set up ChromaVectorStore and load in data
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(embed_model=embed_model)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, service_context=service_context
)

# Query Data
query_engine = index.as_query_engine()
response = query_engine.query("Summarize the documents in context")
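The query behaves as usual, while the embedding, retrieval, and completion calls are captured as spans behind the scenes:

print(response)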

For more information, refer to the Traceloop SDK documentation.

Step 2: Grafana Agent Sender

Configure a Grafana Agent instance to forward Traceloop traces into qryn / qryn.cloud:

traces:
  configs:
    - name: default
      remote_write:
        - endpoint: <Gigapipe qryn.cloud endpoint>:443
          basic_auth:
            username: <Gigapipe qryn X-API-Key>
            password: <Gigapipe qryn X-API-Secret>
      receivers:
        otlp:
          protocols:
            grpc:
            http: # OTLP over HTTP listens on :4318, matching TRACELOOP_BASE_URL below

# Environment variable for your local app with Traceloop
TRACELOOP_BASE_URL=http://<grafana-agent-hostname>:4318
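With the agent running (for example via grafana-agent -config.file agent.yaml, config path assumed), start your application with the variable set; app.py here stands in for your own entry point:

TRACELOOP_BASE_URL=http://<grafana-agent-hostname>:4318 python app.py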

That's it! You're now ready to explore your LLM activity using qryn

You can immediately get started with some popular examples:

👉 Trace prompts and completions

Call OpenAI and see prompts, completions, and token usage for your call.

👉 Trace your RAG retrieval pipeline

Build a RAG pipeline with Chroma and OpenAI, and see the vectors returned from Chroma, the full prompt sent to OpenAI, and the responses.

Are you ready?

Sign up for a free account on qryn.cloud or install our OSS stack on-premises ⭐
