LangWatch: Your Open Source LLM Monitoring, Evaluation, and DSPy Visualization Tool
Are you struggling to understand how users interact with your LLM-powered application? Do you need a way to measure its performance and identify areas for improvement? LangWatch is here to help! This open-source LLMOps platform provides the tools you need to confidently build, monitor, and evaluate your AI applications.
Why Choose LangWatch for LLM Application Performance Monitoring?
LangWatch empowers you to:
- Gain confidence in your AI application's performance.
- Visualize DSPy experiments for optimal prompt engineering.
- Understand user behavior to improve engagement.
- Measure LLM output quality with concrete metrics.
- Collaborate effectively with stakeholders on a single platform.
- Iterate rapidly towards a more valuable and reliable LLM application.
Key Features for LLM Monitoring and Evaluation
LangWatch offers a comprehensive suite of features to streamline your LLMOps workflow:
- Real-time Telemetry: Track LLM cost, latency, and other vital metrics in real time. This data lets you optimize your application's performance and efficiency.
- Detailed Debugging: Capture every step of your LLM calls, including all metadata and history. Easily troubleshoot issues and reproduce errors with complete context.
- Measurable Quality: Define and track key performance indicators (KPIs) for your LLM pipeline using LangEvals evaluators. Improve your prompts, switch models, and optimize performance with data-driven confidence.
Visualize DSPy Experiments with Ease
LangWatch's DSPy Visualizer allows you to track the progress of your DSPy experiments effortlessly. You can easily inspect runs, compare results, and iterate towards optimal prompts and pipelines.
- Track your optimizer compilation
- Keep history and compare runs
Gain Deeper Insights with User Analytics
Understand how users are interacting with your LLM application. LangWatch provides metrics on engagement, user interactions, and behavior patterns to help you refine your product and improve user experience.
- Engagement metrics
- User interaction tracking
- Behavioral insights
Protect Your Application with Guardrails
Implement guardrails to detect potential issues such as PII leaks and toxic language. LangWatch offers built-in guardrails using services like Google DLP and Azure Moderation, and also allows you to create custom guardrails using semantic matching or LLMs.
- Detect PII leaks with Google DLP
- Filter toxic language with Azure Moderation
- Build custom guardrails with semantic matching
Quickstart Guide: OpenAI Python Integration for LLM monitoring
Get started with LangWatch in just a few simple steps:
- Install the LangWatch library: `pip install langwatch`
- Add the `@langwatch.trace()` decorator: this enables tracking for your LLM pipeline function.
- Enable OpenAI call autotracking: use `autotrack_openai_calls()` to automatically capture OpenAI API calls.
- Set your API key: export your LangWatch API key to authenticate your application. Set up your project on LangWatch to generate your API key.
DSPy Visualizer Quickstart
```python
import langwatch

langwatch.login()

# Initialize langwatch for this run, to track the optimizer compilation
langwatch.dspy.init(experiment="my-awesome-experiment", optimizer=optimizer)

compiled_rag = optimizer.compile(RAG(), trainset=trainset)
```
That's it! Now open the link provided when the compilation starts or go to your LangWatch dashboard to follow the progress of your experiments:
Ready to Get Started with LLM Application Performance Monitoring?
Visit the LangWatch documentation for detailed guides on integration with OpenAI, LangChain, and custom REST APIs.
Contributing to LangWatch
LangWatch thrives on community contributions. Please review our Contribution Guidelines to learn how you can help improve the platform. Join us in building the future of LLMOps!