This pipeline automatically collects, processes, and stores performance metrics for comparing Shortfin LLM Server with SGLang Server. The pipeline runs daily and handles:
- Benchmark collection for both servers
- Metrics processing and standardization
- Data storage in RDS
- Automatic dashboard updates in Grafana
# Install required Python packages
pip install schedule mysql-connector-python sqlalchemy pandas py7zr
# Clone your benchmark repository (if applicable)
git clone <your-benchmark-repo>
cd <repo-directory>
Ensure these files are in your working directory:
- benchmark-collector.py - Your benchmark collection script
- metrics-processor.py - Metrics processing script
- metrics_pipeline.py - Main pipeline script
- Set up a GitHub issue with the set of actions to be run on their end
The pipeline uses Amazon RDS MySQL. Your database connection details:
Host: llm-metrics.c3kwuosg6kjs.us-east-2.rds.amazonaws.com
Database: llm_metrics (for example)
Username: [user-name]
Password: [password]
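For reference, here is a minimal sketch of turning these details into a SQLAlchemy connection. Both sqlalchemy and mysql-connector-python are installed above; the environment-variable names are illustrative, and the real password should stay out of source control:

```python
# A minimal connection sketch; the environment-variable names are
# illustrative, and the password should never be hard-coded.
import os

from sqlalchemy import create_engine, text

DB_HOST = "llm-metrics.c3kwuosg6kjs.us-east-2.rds.amazonaws.com"
DB_NAME = "llm_metrics"
DB_USER = os.environ.get("LLM_METRICS_DB_USER", "admin")
DB_PASSWORD = os.environ["LLM_METRICS_DB_PASSWORD"]

# mysql-connector-python is installed above, so use the mysqlconnector dialect.
engine = create_engine(
    f"mysql+mysqlconnector://{DB_USER}:{DB_PASSWORD}@{DB_HOST}/{DB_NAME}"
)

# Quick connectivity check.
with engine.connect() as conn:
    conn.execute(text("SELECT 1"))
    print(f"Connected to {DB_NAME} on {DB_HOST}")
```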
- Runs benchmarks for multiple configurations (an example sweep is sketched after this list):
  - Request rates: 1, 2, 4, 8, 16, 32
  - Servers: sglang, shortfin
  - Model types: none, trie (for shortfin)
- Metrics collected:
  - Median E2E Latency
  - Median TTFT (Time to First Token)
  - Median ITL (Inter-Token Latency)
  - Request Throughput
  - Duration
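The exact collector interface is defined by benchmark-collector.py; the loop below only illustrates sweeping the configurations listed above, and the command-line flags it passes are hypothetical:

```python
# Illustrative sweep over the configurations above; the benchmark-collector.py
# flags shown here are hypothetical and should be adapted to the real script.
import subprocess

REQUEST_RATES = [1, 2, 4, 8, 16, 32]
MODEL_TYPES = {"sglang": ["none"], "shortfin": ["none", "trie"]}

for server, model_types in MODEL_TYPES.items():
    for model_type in model_types:
        for rate in REQUEST_RATES:
            subprocess.run(
                [
                    "python", "benchmark-collector.py",
                    "--server", server,
                    "--model-type", model_type,
                    "--request-rate", str(rate),
                ],
                check=True,  # stop the sweep if a run fails
            )
```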
- Processes raw benchmark data
- Standardizes metrics format
- Creates timestamped CSV files
- Stores processed data in ./processed_data/ (a processing sketch follows this list)
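As a rough sketch of what this step produces, assuming the collector writes one JSON result per run with the field names used below (the real metrics-processor.py may use different names and locations):

```python
# Sketch of standardizing raw results into a timestamped CSV under
# ./processed_data/; the raw file layout and field names are assumptions.
import json
from datetime import datetime
from pathlib import Path

import pandas as pd

RAW_DIR = Path("./raw_results")      # assumed location of collector output
OUT_DIR = Path("./processed_data")
OUT_DIR.mkdir(exist_ok=True)

rows = []
for path in RAW_DIR.glob("*.json"):
    raw = json.loads(path.read_text())
    rows.append({
        "server": raw.get("server"),
        "request_rate": raw.get("request_rate"),
        "model_type": raw.get("model_type"),
        "median_latency": raw.get("median_e2e_latency"),
        "median_ttft": raw.get("median_ttft"),
        "median_itl": raw.get("median_itl"),
        "throughput": raw.get("request_throughput"),
        "duration": raw.get("duration"),
    })

df = pd.DataFrame(rows)
stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
df.to_csv(OUT_DIR / f"metrics_{stamp}.csv", index=False)
```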
- Automatically loads processed metrics to RDS (see the loading sketch after this list)
- Maintains historical data
- Enables time-series analysis
- Supports Grafana visualization
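Loading is essentially an append into the llm_metrics table (schema shown further below). A sketch using pandas and SQLAlchemy, with the same connection URL as the earlier example and hypothetical environment-variable names for credentials:

```python
# Append the most recent processed CSV to the llm_metrics table.
import os
from pathlib import Path

import pandas as pd
from sqlalchemy import create_engine

# Same connection URL as in the sketch above; credentials via env vars
# (variable names are illustrative).
engine = create_engine(
    "mysql+mysqlconnector://{user}:{pw}@"
    "llm-metrics.c3kwuosg6kjs.us-east-2.rds.amazonaws.com/llm_metrics".format(
        user=os.environ["LLM_METRICS_DB_USER"],
        pw=os.environ["LLM_METRICS_DB_PASSWORD"],
    )
)

latest = max(Path("./processed_data").glob("metrics_*.csv"))  # newest file by name
df = pd.read_csv(latest)

# Column names in the CSV are assumed to match the llm_metrics table below.
df.to_sql("llm_metrics", engine, if_exists="append", index=False)
```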
# Run the complete pipeline once
python metrics_pipeline.py
# Run individual components
python metrics_pipeline.py --collect-only # Only collect benchmarks
python metrics_pipeline.py --process-only # Only process existing data
python metrics_pipeline.py --load-only # Only load to database
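A minimal sketch of how metrics_pipeline.py could dispatch these flags; the function bodies are placeholders standing in for the real collect, process, and load steps:

```python
# Sketch of the pipeline's flag handling; the three step functions are stubs.
import argparse

def collect_benchmarks():
    print("collecting benchmarks...")

def process_metrics():
    print("processing metrics...")

def load_to_database():
    print("loading metrics to RDS...")

def main():
    parser = argparse.ArgumentParser(description="LLM metrics pipeline")
    parser.add_argument("--collect-only", action="store_true", help="Only collect benchmarks")
    parser.add_argument("--process-only", action="store_true", help="Only process existing data")
    parser.add_argument("--load-only", action="store_true", help="Only load to database")
    args = parser.parse_args()

    # With no flags, run the complete pipeline.
    run_all = not (args.collect_only or args.process_only or args.load_only)
    if run_all or args.collect_only:
        collect_benchmarks()
    if run_all or args.process_only:
        process_metrics()
    if run_all or args.load_only:
        load_to_database()

if __name__ == "__main__":
    main()
```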
The pipeline is configured to run daily at midnight using cron:
- Set up the cron job:
# Create the runner script
cat << 'EOF' > run_pipeline.sh
#!/bin/bash
cd /home/cloudshell-user
source ~/.bashrc
python metrics_pipeline.py >> pipeline.log 2>&1
EOF
# Make executable
chmod +x run_pipeline.sh
# Add to crontab
crontab -e
# Add: 0 0 * * * /home/cloudshell-user/run_pipeline.sh
- Pipeline logs: llm_pipeline.log
- Cron execution logs: pipeline.log
- RDS metrics available in CloudWatch
- Pipeline status viewable in Grafana
CREATE TABLE llm_metrics (
id INT AUTO_INCREMENT PRIMARY KEY,
server VARCHAR(50),
date DATE,
request_rate INT,
model_type VARCHAR(50),
dataset VARCHAR(50),
input_tokens INT,
output_tokens INT,
output_tokens_retokenized INT,
mean_latency FLOAT,
median_latency FLOAT,
median_ttft FLOAT,
median_itl FLOAT,
throughput FLOAT,
duration FLOAT,
completed_requests INT,
tokens_per_second FLOAT,
timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
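For a concrete sense of how rows land in this table, here is an example INSERT using mysql-connector-python (installed earlier); the credential placeholders and metric values are purely illustrative:

```python
# Insert one illustrative benchmark row into llm_metrics.
import mysql.connector

conn = mysql.connector.connect(
    host="llm-metrics.c3kwuosg6kjs.us-east-2.rds.amazonaws.com",
    user="admin",              # replace with your username
    password="your-password",  # replace with your password
    database="llm_metrics",
)
cursor = conn.cursor()
cursor.execute(
    """
    INSERT INTO llm_metrics
        (server, date, request_rate, model_type, median_latency,
         median_ttft, median_itl, throughput, duration)
    VALUES (%s, CURDATE(), %s, %s, %s, %s, %s, %s, %s)
    """,
    # Example values only; real rows come from the processed CSVs.
    ("shortfin", 8, "trie", 1.42, 0.08, 0.012, 7.9, 120.5),
)
conn.commit()
conn.close()
```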
- Data source: MySQL
- Default dashboard provided
- Key visualizations (an example panel query is sketched below):
  - Latency comparison
  - Throughput analysis
  - Request rate impact
  - Token processing efficiency
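As an example of the kind of query a latency-comparison panel runs, the sketch below executes it with pandas for a quick local check; the connection URL matches the details above and the environment-variable names are illustrative:

```python
# Pull median latency and TTFT by server and request rate, as a Grafana
# panel for latency comparison might; run locally with pandas.
import os

import pandas as pd

url = (
    "mysql+mysqlconnector://{user}:{pw}@"
    "llm-metrics.c3kwuosg6kjs.us-east-2.rds.amazonaws.com/llm_metrics"
).format(
    user=os.environ["LLM_METRICS_DB_USER"],
    pw=os.environ["LLM_METRICS_DB_PASSWORD"],
)

query = """
    SELECT date, server, request_rate, median_latency, median_ttft
    FROM llm_metrics
    ORDER BY date, server, request_rate
"""
print(pd.read_sql(query, url).head())
```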
- Benchmark Collection Fails
  - Check server accessibility
  - Verify benchmark script parameters
  - Check disk space for outputs
- Database Connection Issues
  - Verify RDS security group settings
  - Check credentials
  - Confirm network connectivity
- Missing Data
  - Check pipeline logs
  - Verify file permissions
  - Ensure sufficient disk space
# Check pipeline status
tail -f llm_pipeline.log
# Check database connectivity
python -c "import mysql.connector; mysql.connector.connect(host='your-rds-endpoint', user='admin', password='your-password', database='llm_metrics')"
# Verify cron job
crontab -l
- Log rotation:
# Compress old logs
find . -name "*.log" -mtime +7 -exec gzip {} \;
- Data cleanup:
# Remove processed files older than 7 days
find ./processed_data -type f -mtime +7 -exec rm {} \;
- Database backup:
# Create RDS snapshot
aws rds create-db-snapshot \
    --db-instance-identifier llm-metrics \
    --db-snapshot-identifier metrics-backup-$(date +%Y%m%d)