OpenAI Observability

If you are building applications with OpenAI's API, this post is for you!

Introduction

In an era where AI and machine learning are at the forefront of technological advancements, services like OpenAI and ChatGPT have gained immense popularity for their ability to transform industries, streamline customer interactions, and automate various processes.

However, with great power comes great responsibility. Monitoring and observability are vital to keeping such AI services running smoothly, using them efficiently, and avoiding unpleasant surprises (such as unexpected costs).

In this quick article, we will explore how the qryn.cloud polyglot observability stack can collect metrics and logs that provide valuable insights into the performance, behavior, and usage patterns of OpenAI and ChatGPT, enabling organizations to harness the platform's full potential while ensuring reliability and efficiency.

Benefits of Monitoring OpenAI ✨

Here are some good reasons to monitor OpenAI, as explained by ChatGPT:

Performance Optimization: Effective monitoring allows organizations to keep a close eye on the performance of OpenAI and ChatGPT. By collecting and analyzing metrics, businesses can identify bottlenecks, latency issues, and areas where optimization is needed. This, in turn, leads to improved response times and enhanced user experiences.
Cost Management: Running AI models like ChatGPT can be resource-intensive. Through comprehensive monitoring, you can gain insights into usage patterns and cost trends. This data enables informed decisions about resource allocation, helping organizations optimize their budget and prevent unexpected overages.
User Experience Enhancement: Understanding how users interact with ChatGPT and OpenAI is crucial for delivering a seamless user experience. Observing user behavior and analyzing logs can uncover pain points, frequently asked questions, and other valuable insights to tailor AI responses and services to better meet user needs.
Security and Compliance: Security is a top concern when dealing with sensitive data or information. Effective monitoring helps detect and prevent security breaches, unauthorized access, and potential vulnerabilities. It also aids in ensuring compliance with data protection regulations and industry standards.
Predictive Maintenance: Proactive monitoring can help identify issues before they become critical. By setting up alerts for anomalies and unusual behavior, organizations can implement preventive measures, reducing downtime and the risk of service disruptions.
Scaling and Resource Allocation: As demand for AI services fluctuates, organizations need to be agile in scaling resources. Monitoring helps in understanding usage patterns and trends, enabling efficient resource allocation and scaling to meet demand, whether for peak hours or seasonal changes.
Customization and Improvement: Observability data can provide insights into how users are interacting with AI models. This information can be used to refine and customize AI responses, improving the quality of interactions and ultimately enhancing user satisfaction.

Requirements

For this experiment, we can use any Linux system with Python 3.x.

Before starting, install the grafana-openai-monitoring dependency using pip:

pip install grafana-openai-monitoring

We will need a couple of tokens to configure our monitoring script:

OpenAI API Key
Scoped Token for qryn.cloud (not needed for qryn OSS)
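
One way to keep these tokens out of the script itself is to read them from environment variables. The sketch below is only an illustration: QRYN_API_KEY and QRYN_API_SECRET are hypothetical variable names chosen for this example (OPENAI_API_KEY is the name the openai package conventionally reads), and load_monitoring_config is a made-up helper, not part of any library.

```python
import os


def load_monitoring_config(env=os.environ):
    """Collect the credentials used by the monitoring examples from the environment."""
    # Variable names are illustrative assumptions, not mandated by qryn or Grafana.
    required = ["OPENAI_API_KEY"]
    optional = ["QRYN_API_KEY", "QRYN_API_SECRET"]  # not needed for qryn OSS

    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    config = {name: env[name] for name in required}
    # Optional values default to an empty string when unset.
    config.update({name: env.get(name, "") for name in optional})
    return config
```

Calling load_monitoring_config() at the top of the script fails early if the OpenAI key is missing. Under the assumptions above, the two qryn values would be passed as the username and access_token parameters in place of the X-API-Key and X-API-Secret placeholders.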

It's Monitoring Time 🔭

To monitor Chat completions using the OpenAI API, you can use the chat_v2.monitor decorator. This decorator automatically tracks API calls and sends metrics and logs to the qryn or qryn.cloud endpoints.

import openai
from grafana_openai_monitoring import chat_v2

# Set your OpenAI API key
openai.api_key = "YOUR_OPEN_AI_API_KEY"

# Apply the custom decorator to the OpenAI API function.
# Note: openai.ChatCompletion is the pre-1.0 interface of the openai package.
# Replace the X-API-Key and X-API-Secret placeholders with your qryn.cloud credentials.
openai.ChatCompletion.create = chat_v2.monitor(
  openai.ChatCompletion.create,
  metrics_url="https://qryn.gigapipe.com/api/v1/prom/remote/write",
  logs_url="https://qryn.gigapipe.com/loki/api/v1/push",
  metrics_username="X-API-Key",
  logs_username="X-API-Key",
  access_token="X-API-Secret",
)

# Now any call to openai.ChatCompletion.create will be automatically tracked
response = openai.ChatCompletion.create(
  model="gpt-4",
  max_tokens=100,
  messages=[{"role": "user", "content": "What is Observability?"}],
)

print(response)
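
To make the idea of a monitoring decorator more concrete, here is a minimal sketch of the general wrapping pattern. It is not the implementation used by grafana-openai-monitoring: monitor_sketch, on_record, and fake_create are hypothetical names invented for this illustration, and the fake function stands in for the real API so the example runs offline.

```python
import functools
import time


def monitor_sketch(func, on_record):
    """Wrap func so every call reports its duration and token usage to on_record."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        response = func(*args, **kwargs)
        duration = time.perf_counter() - start
        # Chat responses carry a "usage" mapping with token counts;
        # fall back to an empty mapping if the response has none.
        usage = response.get("usage", {}) if isinstance(response, dict) else {}
        on_record({
            "model": kwargs.get("model", "unknown"),
            "duration_seconds": duration,
            "prompt_tokens": usage.get("prompt_tokens", 0),
            "completion_tokens": usage.get("completion_tokens", 0),
        })
        return response
    return wrapper


def fake_create(**kwargs):
    """Stand-in for an API call so the sketch runs without network access."""
    return {"usage": {"prompt_tokens": 5, "completion_tokens": 7}, "choices": []}


records = []
tracked_create = monitor_sketch(fake_create, records.append)
tracked_create(model="gpt-4", messages=[{"role": "user", "content": "What is Observability?"}])
```

In this sketch the records simply accumulate in a list. A real monitor would forward them to a metrics and logs backend instead, which is the role the metrics_url and logs_url parameters play in the example above.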

To monitor Completions using the OpenAI API, you can use the chat_v1.monitor decorator. This decorator adds monitoring capabilities to the OpenAI API function and sends metrics and logs to the qryn or qryn.cloud endpoints.

import openai
from grafana_openai_monitoring import chat_v1

# Set your OpenAI API key
openai.api_key = "YOUR_OPEN_AI_API_KEY"

# Apply the custom decorator to the OpenAI API function.
# Note: openai.Completion is the pre-1.0 interface of the openai package.
# Replace the X-API-Key and X-API-Secret placeholders with your qryn.cloud credentials.
openai.Completion.create = chat_v1.monitor(
  openai.Completion.create,
  metrics_url="https://qryn.gigapipe.com/api/v1/prom/remote/write",
  logs_url="https://qryn.gigapipe.com/loki/api/v1/push",
  metrics_username="X-API-Key",
  logs_username="X-API-Key",
  access_token="X-API-Secret",
)

# Now any call to openai.Completion.create will be automatically tracked
response = openai.Completion.create(
  model="davinci",
  max_tokens=100,
  prompt="What is Observability?",
)

print(response)

Once the parameters are configured, each call to the monitored API function is tracked automatically, and its metrics and logs are sent to the specified endpoints.
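
For a sense of what travels to the logs endpoint, the sketch below builds a JSON body in the shape accepted by the Loki push API (the /loki/api/v1/push path used in the examples). The label names, the log line contents, and the build_loki_payload helper are assumptions made for illustration. The labels and log format actually emitted by the monitoring library are not shown in this post.

```python
import json
import time


def build_loki_payload(labels, message, timestamp_ns=None):
    """Build a JSON-serializable body in the Loki push API format."""
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    return {
        "streams": [
            {
                # Labels identify the log stream (hypothetical names here).
                "stream": dict(labels),
                # Each entry is [timestamp in nanoseconds as a string, log line].
                "values": [[str(timestamp_ns), message]],
            }
        ]
    }


payload = build_loki_payload(
    {"job": "openai-demo", "model": "gpt-4"},
    json.dumps({"prompt": "What is Observability?", "total_tokens": 42}),
    timestamp_ns=1700000000000000000,
)
body = json.dumps(payload)
```

The resulting body would be sent as an HTTP POST with a JSON content type and the appropriate credentials. The sending step is left out so the sketch stays runnable offline.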

Grafana Dashboard 🛸

Once our data is ingested into qryn or qryn.cloud, we can download the OpenAI Dashboard and import it into our connected Grafana instance to display metrics and logs.

Potential Unlocked 🔥

Monitoring OpenAI and ChatGPT with the qryn.cloud polyglot observability stack is not just about tracking metrics and logs; it's about unlocking the full potential of AI services while ensuring reliability, security, and cost-effectiveness. With the right observability tools and practices in place, organizations can harness the power of AI to its fullest, staying competitive in a rapidly evolving technological landscape.

Are you Ready?

Sign up for a free account on qryn.cloud or install our OSS stack on-premises ⭐