Deploy DocSum on Kubernetes: A Step-by-Step Guide with Helm Charts
This guide walks through deploying the DocSum document summarization service on Kubernetes using Helm charts. DocSum builds on LLM microservices, and the sections below cover installation, configuration, and verification on CPU, Gaudi, and AMD ROCm devices.
Prerequisites for DocSum Deployment
Before deploying DocSum, ensure you have the following:
- A Kubernetes cluster.
- Helm installed.
- kubectl configured to connect to your cluster.
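A quick way to confirm the client-side prerequisites is to check that both tools are on your PATH. The following is a minimal sketch; the helper name `missing_tools` is hypothetical, and it only checks that the binaries exist, not that the cluster is reachable.

```shell
# Hypothetical helper: print each named tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools kubectl helm)
if [ -n "$missing" ]; then
  printf 'Missing prerequisites:\n%s\n' "$missing"
else
  printf 'kubectl and helm are installed.\n'
fi
```

Once both tools are present, `kubectl cluster-info` reports whether kubectl can actually reach your cluster.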
Installing the DocSum Helm Chart
Follow these steps to install the DocSum Helm chart and get your document summarization service up and running.
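- Clone the GenAIInfra Repository: This repository hosts the DocSum Helm chart.
- Prepare the Model Directory: Create the directory that the deployment will use for model files.
- Update Dependencies: Refresh the chart's dependencies before installing.
- Export Environment Variables: Configure your Hugging Face token, model directory, and model name. Replace "insert-your-huggingface-token-here" with your actual token.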
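The four steps above might look like the following sketch. Several details are assumptions rather than facts stated in this guide: the repository URL, the `helm-charts` directory and `update_dependency.sh` script, the variable names `HFTOKEN`, `MODELDIR`, and `MODELNAME`, and the example model directory `/mnt/opea-models`. Check each against the GenAIInfra repository before use. The script defaults to a dry run that only prints each command; set `DRY_RUN=0` to execute them.

```shell
# Dry-run wrapper: prints the command unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# 1. Clone the repository (URL and directory layout are assumptions).
run git clone https://github.com/opea-project/GenAIInfra.git
run cd GenAIInfra/helm-charts

# 2. Prepare the model directory (example path; may need elevated permissions).
MODELDIR="${MODELDIR:-/mnt/opea-models}"
run mkdir -p "$MODELDIR"

# 3. Update chart dependencies (script name and chart directory are assumptions).
run ./update_dependency.sh
run helm dependency update docsum

# 4. Export the variables used at install time (names are assumptions).
export HFTOKEN="insert-your-huggingface-token-here"
export MODELDIR
export MODELNAME="insert-your-model-id-here"
```

Keeping the dry run as the default lets you review the exact commands before anything touches your machine or cluster.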
Deploying DocSum with Different Devices
Each device type is deployed with its own YAML configuration. Follow the section that matches your hardware.
Deploying DocSum on CPU
To run DocSum on CPU, install the chart with Helm using the CPU configuration.
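A CPU installation could look like the sketch below. The release and chart name `docsum`, the `--set` keys, and the `cpu-values.yaml` file name are assumptions, not details confirmed by this guide; verify them against the chart's `values.yaml`. The sketch also assumes the variables `HFTOKEN`, `MODELDIR`, and `MODELNAME` (assumed names) and falls back to placeholders when they are unset. By default it only prints the command; set `DRY_RUN=0` to run it.

```shell
# Dry-run wrapper: prints the command unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Placeholders used only when the variables are not already set.
: "${HFTOKEN:=insert-your-huggingface-token-here}"
: "${MODELDIR:=/mnt/opea-models}"
: "${MODELNAME:=insert-your-model-id-here}"

# Install the chart from the helm-charts directory.
# All --set keys and the values file name below are assumptions.
run helm install docsum docsum \
  --set "global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}" \
  --set "global.modelUseHostPath=${MODELDIR}" \
  --set "llm-uservice.LLM_MODEL_ID=${MODELNAME}" \
  -f docsum/cpu-values.yaml
```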
Deploying DocSum on Gaudi
To run DocSum on Gaudi, install the chart with the Gaudi configuration.
Deploying DocSum on AMD ROCm
To run DocSum on an AMD ROCm device, install the chart with the ROCm configuration.
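Because the device-specific installs differ only in the values file, the choice can be reduced to a small lookup. The sketch below is illustrative: the helper `values_file_for` is hypothetical, and the file names `cpu-values.yaml`, `gaudi-values.yaml`, and `rocm-values.yaml` are assumptions that should be checked against the files actually shipped in the chart directory. It prints the resulting command rather than running it.

```shell
# Hypothetical helper: map a device name to an (assumed) values file.
values_file_for() {
  case "$1" in
    cpu)   printf '%s\n' "docsum/cpu-values.yaml" ;;
    gaudi) printf '%s\n' "docsum/gaudi-values.yaml" ;;
    rocm)  printf '%s\n' "docsum/rocm-values.yaml" ;;
    *)     printf 'unknown device: %s\n' "$1" >&2; return 1 ;;
  esac
}

DEVICE="${DEVICE:-gaudi}"
if values=$(values_file_for "$DEVICE"); then
  # Add whatever --set overrides your chart needs for the token,
  # model directory, and model name before running this.
  printf 'helm install docsum docsum -f %s\n' "$values"
fi
```

Run it with `DEVICE=rocm` to see the ROCm variant; an unrecognized device name produces an error instead of a command.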
Verifying the DocSum Installation
To confirm that the deployment succeeded, check the pod status and then test the service with curl or through the UI.
- Check Pod Status: Verify all pods are running.
- Port Forwarding: Expose the DocSum service for local access.
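These two checks can be sketched as follows. The service name `docsum` and port `8888` are assumptions; use `kubectl get svc` to find the real values for your release. The script prints the commands by default; set `DRY_RUN=0` to execute them.

```shell
# Dry-run wrapper: prints the command unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# List pods in the current namespace; each should reach STATUS Running.
run kubectl get pods

# Optionally block until every pod reports Ready (or the timeout expires).
run kubectl wait --for=condition=Ready pods --all --timeout=300s

# Forward a local port to the service (name and port are assumptions).
# This command stays in the foreground until interrupted with Ctrl+C.
run kubectl port-forward svc/docsum 8888:8888
```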
Testing with Curl
To verify that the document summarization service is correctly deployed, send it a test request with curl. Port forwarding keeps the first terminal busy, so open a new terminal for this step.
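A test request might look like the sketch below. The endpoint path `/v1/docsum`, the local port `8888`, and the form fields `type`, `messages`, and `max_tokens` are all assumptions about the service's API; consult the chart's documentation for the request format your version expects. The command is printed by default; set `DRY_RUN=0` to send it.

```shell
# Dry-run wrapper: prints the command unless DRY_RUN=0.
# Note: the printed form does not preserve quoting around arguments.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Assumed endpoint, reachable through the port-forward in the other terminal.
DOCSUM_URL="${DOCSUM_URL:-http://localhost:8888/v1/docsum}"

# Send a short text as multipart form data (field names are assumptions).
# curl sets the multipart Content-Type header itself when -F is used.
run curl "$DOCSUM_URL" \
  -F "type=text" \
  -F "messages=Replace this sentence with the text you want summarized." \
  -F "max_tokens=32"
```

A response containing a summary of the submitted text indicates that the request reached the LLM backend.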
Accessing the UI for Verification
The UI provides an interface to interact with the service.
- Determine the NodePort: Look up the NodePort assigned to the UI service. The service to query differs between the default Gradio UI and other UIs (e.g., NGINX).
- Access the UI in a Browser: Open a browser and navigate to http://<k8s-node-ip-address>:${port}.
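The NodePort lookup and the resulting URL can be sketched with two small helpers. Both helper names are hypothetical, and the service name in the commented example (`docsum-docsum-ui`) is an assumption; list the real services with `kubectl get svc`. The address `192.0.2.10` and port `30080` are placeholder values used only to show the URL format.

```shell
# Hypothetical helper: print the first NodePort of the named service.
node_port() {
  kubectl get service "$1" --output='jsonpath={.spec.ports[0].nodePort}'
}

# Hypothetical helper: build the UI URL from a node IP and a port.
ui_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# With cluster access (service name is an assumption):
#   port=$(node_port docsum-docsum-ui)
#   ui_url "<k8s-node-ip-address>" "$port"

# Placeholder values, shown only to illustrate the URL format:
ui_url 192.0.2.10 30080
```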
Now you can interact with the DocSum workload through the UI, testing document summarization with various inputs.
Conclusion
Deploying DocSum with Helm comes down to three stages: preparing the chart and environment variables, installing with the configuration for your device, and verifying the result through pod status, a curl request, and the UI. Once those checks pass, DocSum is ready to serve document summarization requests from within your Kubernetes environment.