# Monitoring
Monitor CCProxy performance, health, and usage with comprehensive monitoring solutions.
## Key Metrics to Monitor

### Service Health
- Service uptime: Monitor the `/health` endpoint
- Response time: Track API response latency
- Error rate: Monitor failed requests
- Request volume: Track requests per second
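These checks can be scripted without extra dependencies. The sketch below (standard-library Python only; the `/health` path and port `3456` are taken from the examples on this page) probes the endpoint, measures response time, and treats connection failures as unhealthy rather than raising:

```python
import time
import urllib.request
import urllib.error


def check_health(url: str, timeout: float = 10.0) -> tuple[bool, float]:
    """Probe a health endpoint; return (healthy, latency_in_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            healthy = resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout all count as unhealthy
        healthy = False
    return healthy, time.monotonic() - start


if __name__ == "__main__":
    ok, latency = check_health("http://localhost:3456/health")
    print(f"healthy={ok} latency={latency * 1000:.1f}ms")
```

Run it from cron or a loop and feed the output into whichever alerting channel you use.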
### Provider Performance
- Provider latency: Time taken for provider API calls
- Provider errors: Failed provider API calls
- Token usage: Monitor token consumption per provider
- Rate limiting: Track rate limit hits
### System Resources
- Memory usage: Monitor container/process memory
- CPU usage: Track CPU utilization
- Network I/O: Monitor network bandwidth
- Disk usage: Track log file sizes
## Monitoring Stack

### Prometheus + Grafana

#### Prometheus Configuration
```yaml
# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'ccproxy'
    static_configs:
      - targets: ['localhost:3456']
    metrics_path: /health
    scrape_interval: 30s
```

#### Grafana Dashboard
```json
{
  "dashboard": {
    "title": "CCProxy Dashboard",
    "panels": [
      {
        "title": "Service Health",
        "type": "stat",
        "targets": [
          {
            "expr": "up{job=\"ccproxy\"}",
            "legendFormat": "Service Status"
          }
        ]
      },
      {
        "title": "Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(http_requests_total[5m])",
            "legendFormat": "Requests/sec"
          }
        ]
      }
    ]
  }
}
```

### Log-Based Monitoring
#### ELK Stack Integration
```conf
# logstash.conf
input {
  file {
    path => "/var/log/ccproxy/app.log"
    codec => "json"
    type => "ccproxy"
  }
}

filter {
  if [type] == "ccproxy" {
    if [action] == "anthropic_request" {
      mutate {
        add_field => { "metric_type" => "request" }
      }
    }
    if [action] == "anthropic_response" {
      mutate {
        add_field => { "metric_type" => "response" }
      }
    }
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "ccproxy-%{+YYYY.MM.dd}"
  }
}
```

#### Kibana Visualizations
- Request Volume: Line chart showing requests over time
- Error Rate: Pie chart showing error distribution
- Response Time: Histogram of API response times
- Provider Usage: Bar chart showing provider utilization
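If you want these aggregates without standing up a full ELK stack, they can be computed directly from the structured logs. A minimal sketch, assuming one JSON object per line with the `level` and `action` fields used throughout this page:

```python
import json


def summarize(lines):
    """Count requests and error-level entries in JSON-lines logs."""
    requests = errors = total = 0
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines (e.g. startup banners)
        total += 1
        if entry.get("action") == "anthropic_request":
            requests += 1
        if entry.get("level") == "error":
            errors += 1
    error_rate = errors / total if total else 0.0
    return {"requests": requests, "errors": errors, "error_rate": error_rate}


sample = [
    '{"level":"info","action":"anthropic_request","duration_ms":120}',
    '{"level":"error","action":"anthropic_response"}',
    "not json",
]
print(summarize(sample))  # {'requests': 1, 'errors': 1, 'error_rate': 0.5}
```

Point it at `/var/log/ccproxy/app.log` (or a rotated slice of it) for ad-hoc analysis.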
## Health Check Monitoring

### Uptime Monitoring
```bash
#!/bin/bash
# health-check.sh
ENDPOINT="http://localhost:3456/health"
TIMEOUT=10

while true; do
  if curl -f -s --max-time "$TIMEOUT" "$ENDPOINT" > /dev/null; then
    echo "$(date): Service is healthy"
  else
    echo "$(date): Service is unhealthy!"
    # Send alert
  fi
  sleep 30
done
```

### Kubernetes Monitoring
```yaml
apiVersion: v1
kind: Service
metadata:
  name: ccproxy-monitoring
  labels:
    app: ccproxy
spec:
  selector:
    app: ccproxy
  ports:
    - name: http
      port: 3456
      targetPort: 3456
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ccproxy
spec:
  selector:
    matchLabels:
      app: ccproxy
  endpoints:
    - port: http
      path: /health
      interval: 30s
```

## Alerting
### Prometheus Alerts
```yaml
# alerts.yml
groups:
  - name: ccproxy
    rules:
      - alert: CCProxyDown
        expr: up{job="ccproxy"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "CCProxy service is down"
          description: "CCProxy has been down for more than 1 minute"

      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.1
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High error rate detected"
          description: "Error rate is {{ $value }} errors per second"

      - alert: HighLatency
        expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 2
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: "High latency detected"
          description: "95th percentile latency is {{ $value }} seconds"
```

### Notification Channels
```yaml
# alertmanager.yml
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'

receivers:
  - name: 'web.hook'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#alerts'
        title: 'CCProxy Alert'
        text: '{{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
```

## Custom Monitoring Scripts
### Provider Health Check
```bash
#!/bin/bash
# provider-health.sh
PROVIDERS=("groq" "openai" "gemini" "mistral" "xai" "ollama")

for provider in "${PROVIDERS[@]}"; do
  echo "Checking $provider provider..."

  # Test provider endpoint
  response=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "Content-Type: application/json" \
    -X POST http://localhost:3456/v1/messages \
    -d '{
      "model": "test",
      "messages": [{"role": "user", "content": "test"}],
      "max_tokens": 10
    }')

  if [ "$response" -eq 200 ]; then
    echo "$provider: OK"
  else
    echo "$provider: ERROR (HTTP $response)"
  fi
done
```

### Performance Metrics
```bash
#!/bin/bash
# performance-metrics.sh
LOG_FILE="/var/log/ccproxy/app.log"

# Request count in the last hour
echo "Requests in last hour:"
grep "$(date -d '1 hour ago' '+%Y-%m-%dT%H')" "$LOG_FILE" | \
  grep '"action":"anthropic_request"' | wc -l

# Average response time
echo "Average response time:"
grep '"duration_ms"' "$LOG_FILE" | \
  jq '.duration_ms' | \
  awk '{sum+=$1; count++} END {print sum/count "ms"}'

# Error rate (percent)
echo "Error rate:"
total=$(grep '"level":"' "$LOG_FILE" | wc -l)
errors=$(grep '"level":"error"' "$LOG_FILE" | wc -l)
echo "scale=2; $errors * 100 / $total" | bc -l
```

## Integration with External Services
### Datadog
```yaml
# datadog.yaml
logs:
  - type: file
    path: /var/log/ccproxy/app.log
    service: ccproxy
    source: go
    sourcecategory: sourcecode
```

### New Relic
```bash
# Install the New Relic Go agent
go get github.com/newrelic/go-agent/v3/newrelic

# Configuration
export NEW_RELIC_LICENSE_KEY=your_license_key
export NEW_RELIC_APP_NAME=CCProxy
```

## Best Practices
- Monitor key metrics: Focus on SLIs (Service Level Indicators)
- Set up proactive alerting: Don't wait for users to report issues
- Use dashboards: Visualize metrics for quick understanding
- Regular review: Analyze trends and adjust thresholds
- Test alerts: Ensure notifications work correctly
- Document runbooks: Create incident response procedures
## Troubleshooting

### Common Monitoring Issues
- No data in dashboards: Check Prometheus scraping configuration
- False alerts: Adjust alert thresholds and timing
- Missing metrics: Verify log format and parsing
- High monitoring overhead: Optimize scraping intervals
### Performance Impact
Monitor the monitoring system itself:
```bash
# Check Prometheus memory usage
ps aux | grep prometheus

# Monitor log file sizes
du -h /var/log/ccproxy/

# Check disk space
df -h
```
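Unbounded log growth is the usual culprit for disk pressure. A logrotate policy keeps sizes bounded; this is a sketch assuming logs live under `/var/log/ccproxy/` as in the examples above, so adjust paths and retention to your deployment:

```conf
# /etc/logrotate.d/ccproxy (hypothetical path)
/var/log/ccproxy/*.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
}
```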