Troubleshooting Guide
Solutions for common issues and debugging techniques.
Common Issues
Services Won’t Start
Problem: docker-compose up fails or services crash
Solutions:
# Check Docker is running
docker ps
# View service logs
docker-compose logs hemostat-redis
docker-compose logs hemostat-monitor
# Rebuild from scratch
docker-compose down -v
docker-compose build --no-cache
docker-compose up -d
# Check for port conflicts
lsof -i :8501 # Streamlit
lsof -i :3000 # Arcane
lsof -i :6379 # Redis
Monitor Not Detecting Issues
Problem: Container anomalies not appearing in logs
Check:
# View Monitor logs
docker-compose logs -f hemostat-monitor
# Verify Redis connection
docker exec hemostat-redis redis-cli ping
# Check test-api is running
docker-compose ps | grep test-api
# Manually trigger issue
docker exec hemostat-test-api apk add stress
docker exec hemostat-test-api stress --cpu 4 --timeout 10
Solutions:
Increase memory/CPU stress to exceed thresholds (80% memory, 85% CPU)
Check Monitor polling interval (default 30 seconds)
Verify Redis is healthy:
docker-compose logs hemostat-redis
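One reason a manual stress test may go unnoticed is timing: the test above runs for 10 seconds, while the Monitor polls every 30 seconds by default. The sketch below works through that arithmetic. It assumes the Monitor takes one reading per poll at a fixed interval, which may not match the real implementation; the function name and parameters are illustrative only.

```python
import math


def burst_is_sampled(burst_start, burst_length, poll_interval=30.0, first_poll=0.0):
    """Return True if at least one poll lands inside the load burst.

    Assumption: polls happen at first_poll, first_poll + poll_interval, ...
    All times are in seconds.
    """
    burst_end = burst_start + burst_length
    # Index of the first poll at or after the start of the burst
    n = max(0, math.ceil((burst_start - first_poll) / poll_interval))
    next_poll = first_poll + n * poll_interval
    return next_poll <= burst_end


# A 10 s burst starting 5 s after a poll ends before the next poll at t=30
print(burst_is_sampled(burst_start=5, burst_length=10))   # False
# A 40 s burst is longer than the interval, so a poll always lands in it
print(burst_is_sampled(burst_start=5, burst_length=40))   # True
```

Under this assumption, running the stress command with a timeout longer than the polling interval makes detection independent of when the burst starts.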
Analyzer Errors
Problem: Analyzer crashes or doesn’t process alerts
Check:
# View Analyzer logs
docker-compose logs -f hemostat-analyzer
# Check if OpenAI API key is set
echo $OPENAI_API_KEY
# Verify Redis connection
docker exec hemostat-redis redis-cli KEYS "hemostat:*"
Solutions:
Without an API key, the system falls back to rule-based analysis. This is expected behavior, not an error.
Check API key is valid:
curl https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY"
Check rate limits on OpenAI account
Review Analyzer logs for error details
Analyzer - Anthropic API Authentication Error (401)
Problem: Error code: 401 - invalid x-api-key when using Claude models
Root Causes:
Wrong environment file being loaded - Docker Compose uses .env by default, not .env.docker.{platform}
Incorrect ChatAnthropic parameter - Using model_name instead of model
API key not in the correct .env file - Key is in .env.docker.windows but not in .env
Check:
# Verify API key is in the container
docker exec hemostat-analyzer printenv | grep ANTHROPIC_API_KEY
# Inspect the environment variables the container actually received
docker inspect hemostat-analyzer | grep -A 20 "Env"
# Verify the API key format (should start with sk-ant-)
docker exec hemostat-analyzer printenv ANTHROPIC_API_KEY
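The format check above can also be scripted. The following is a minimal sketch, not part of HemoStat: check_anthropic_key is a hypothetical helper, and the whitespace and quote checks are assumptions about how a value copied into an env file might get damaged. Only the sk-ant- prefix comes from this guide.

```python
def check_anthropic_key(value):
    """Return a list of human-readable problems; an empty list means none found."""
    if not value:
        return ["ANTHROPIC_API_KEY is empty or unset"]
    problems = []
    if value != value.strip():
        problems.append("leading or trailing whitespace")
    stripped = value.strip()
    # Quotes copied literally from an env file end up inside the value
    if stripped[:1] in ("'", '"') or stripped[-1:] in ("'", '"'):
        problems.append("value is wrapped in quotes")
    if not stripped.strip("'\"").startswith("sk-ant-"):
        problems.append("does not start with sk-ant-")
    return problems


print(check_anthropic_key("sk-ant-api03-YOUR_KEY_HERE"))    # []
print(check_anthropic_key('"sk-ant-api03-YOUR_KEY_HERE"'))  # ['value is wrapped in quotes']
```

To check the real value, pass os.environ.get("ANTHROPIC_API_KEY") from a Python process running inside the analyzer container. A key that passes these checks can still be rejected if it is revoked or belongs to a different account.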
Solutions:
Option 1: Use correct env file with Docker Compose (Recommended)
Always use the platform-specific env file when building/running:
# Windows
docker compose -f docker-compose.yml -f docker-compose.windows.yml --env-file .env.docker.windows build analyzer --no-cache
docker compose -f docker-compose.yml -f docker-compose.windows.yml --env-file .env.docker.windows up -d analyzer
# Linux
docker compose -f docker-compose.yml -f docker-compose.linux.yml --env-file .env.docker.linux build analyzer --no-cache
docker compose -f docker-compose.yml -f docker-compose.linux.yml --env-file .env.docker.linux up -d analyzer
# macOS
docker compose -f docker-compose.yml -f docker-compose.macos.yml --env-file .env.docker.macos build analyzer --no-cache
docker compose -f docker-compose.yml -f docker-compose.macos.yml --env-file .env.docker.macos up -d analyzer
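Because the three command pairs above differ only in the platform suffix, a small wrapper can pick the right files automatically. This is a sketch under the assumption that the override and env files follow the naming shown above; compose_command is a hypothetical helper, not something shipped with the project.

```python
import platform

# Map platform.system() values to the suffix used in the file names above
_SUFFIX = {"Windows": "windows", "Linux": "linux", "Darwin": "macos"}


def compose_command(action, system=None):
    """Build the docker compose argument list for the current (or given) platform.

    action is the trailing part of the command, e.g. ["up", "-d", "analyzer"].
    """
    system = system or platform.system()
    try:
        suffix = _SUFFIX[system]
    except KeyError:
        raise ValueError(f"unsupported platform: {system!r}") from None
    return [
        "docker", "compose",
        "-f", "docker-compose.yml",
        "-f", f"docker-compose.{suffix}.yml",
        "--env-file", f".env.docker.{suffix}",
        *action,
    ]


print(" ".join(compose_command(["up", "-d", "analyzer"], system="Linux")))
# docker compose -f docker-compose.yml -f docker-compose.linux.yml --env-file .env.docker.linux up -d analyzer
```

The returned list can be passed directly to subprocess.run(..., check=True), which avoids shell quoting differences between platforms.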
Option 2: Add API key to default .env file
Copy your Anthropic API key from .env.docker.windows to .env:
# Edit .env and set:
ANTHROPIC_API_KEY=sk-ant-api03-YOUR_KEY_HERE
AI_MODEL=claude-haiku-4-5-20251001
Then rebuild normally:
docker compose build analyzer --no-cache
docker compose up -d analyzer
See README.md section “Building & Rebuilding Services” for complete platform-specific commands.
Responder Not Fixing Issues
Problem: Remediation not executing
Check:
# View Responder logs
docker-compose logs -f hemostat-responder
# Check Docker socket permissions
ls -la /var/run/docker.sock
# Verify container can be restarted
docker exec hemostat-responder docker restart hemostat-test-api
Solutions:
Check cooldown not active:
docker exec hemostat-redis redis-cli KEYS "hemostat:remediation:*"
Verify Docker socket is mounted correctly in docker-compose.yml
Check safety mechanisms (cooldown, max retries) aren’t triggered
Review Responder logs for error details
Dashboard Not Updating
Problem: Streamlit shows no data
Check:
# View Streamlit logs
docker-compose logs -f hemostat-dashboard
# Verify Redis has data
docker exec hemostat-redis redis-cli GET "hemostat:stats:hemostat-test-api"
# Check dashboard can connect to Redis
docker exec hemostat-dashboard ping redis
# Refresh browser
# Streamlit auto-refreshes every 5 seconds
Solutions:
Wait 30+ seconds for Monitor to collect first stats
Manually refresh Streamlit page
Check Redis is healthy and storing data
Verify dashboard/app.py is reading correct Redis keys
Alert Not Sending Slack Notifications
Problem: No Slack notifications despite fixes
Check:
# View Alert logs
docker-compose logs -f hemostat-alert
# Check Slack webhook URL is set
echo $SLACK_WEBHOOK_URL
# Test webhook manually
curl -X POST "$SLACK_WEBHOOK_URL" \
-H 'Content-type: application/json' \
-d '{
"attachments": [{
"color": "good",
"title": "HemoStat Test",
"text": "This is a test notification"
}]
}'
Solutions:
Without Slack webhook, system still works (just no notifications)
Verify webhook URL is correct and active
Check Slack workspace permissions
Review Alert logs for HTTP errors
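The manual webhook test can also be run from Python, which is handy on machines without curl. This sketch sends the same payload as the curl command in this section using only the standard library; the function names are illustrative and are not taken from the Alert agent.

```python
import json
import urllib.request


def build_test_payload():
    """Return the same JSON body as the curl test, encoded as bytes."""
    payload = {
        "attachments": [{
            "color": "good",
            "title": "HemoStat Test",
            "text": "This is a test notification",
        }]
    }
    return json.dumps(payload).encode("utf-8")


def send_test_notification(webhook_url, timeout=10):
    """POST the test payload and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=build_test_payload(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Call send_test_notification(os.environ["SLACK_WEBHOOK_URL"]) to run it. An HTTPError raised here points to a problem with the webhook itself rather than with the Alert agent.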
Performance Issues
Problem: System slow or laggy
Check:
# View system resources
docker stats
# Check individual service performance
docker-compose logs hemostat-monitor | grep "published"
docker-compose logs hemostat-analyzer | grep "Published"
# Monitor Redis performance
docker exec hemostat-redis redis-cli --stat
Solutions:
Increase the Monitor polling interval so it polls less often (edit hemostat_monitor.py)
Increase Docker resource limits (edit docker-compose.yml)
Check Redis is not full:
docker exec hemostat-redis redis-cli INFO memory
Clear old events:
docker exec hemostat-redis sh -c 'redis-cli KEYS "hemostat:events:*" | xargs redis-cli DEL'
Docker Permissions
Problem: Permission denied errors
Check:
# Check Docker socket permissions
ls -la /var/run/docker.sock
# Check if user is in docker group
groups $USER
Solutions:
# Add user to docker group (Linux)
sudo usermod -aG docker $USER
newgrp docker
# Restart Docker service
sudo systemctl restart docker
# Or run docker-compose with sudo
sudo docker-compose up -d
Debug Mode
Enable Verbose Logging
Edit each agent to add more logging:
import logging
logging.basicConfig(level=logging.DEBUG) # Change from INFO to DEBUG
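Rather than editing every agent each time, the level could be read from the environment. The sketch below assumes a LOG_LEVEL variable, which is a hypothetical name; check whether the agents already define their own setting before adopting it.

```python
import logging
import os


def resolve_log_level(env=None, default="INFO"):
    """Translate a LOG_LEVEL-style setting into a logging level constant."""
    env = os.environ if env is None else env
    name = str(env.get("LOG_LEVEL", default)).upper()
    level = logging.getLevelName(name)
    # getLevelName returns a string such as "Level FOO" for unknown names
    return level if isinstance(level, int) else logging.INFO


logging.basicConfig(level=resolve_log_level())
```

With this in place, setting LOG_LEVEL=DEBUG in the service's environment switches on verbose logging, and unknown values fall back to INFO.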
Monitor Redis Activity
# Watch all Redis events in real-time
docker exec hemostat-redis redis-cli PSUBSCRIBE "hemostat:*"
# List all Redis keys
docker exec hemostat-redis redis-cli KEYS "*"
# View specific key value
docker exec hemostat-redis redis-cli GET "hemostat:stats:hemostat-test-api"
# Monitor Redis traffic
docker exec hemostat-redis redis-cli MONITOR
Test Individual Components
# Test Monitor independently
docker run -it hemostat-agents-hemostat-monitor bash
python hemostat_monitor.py
# Test Docker SDK
docker run -it -v /var/run/docker.sock:/var/run/docker.sock python:3.11 bash
pip install docker
python -c "import docker; c = docker.from_env(); print(c.containers.list())"
# Test Redis connection
docker run --rm -it --network hemostat-agents_default redis:7-alpine redis-cli -h redis ping
Getting Help
Check logs first: Most issues show up in Docker logs
Search TROUBLESHOOTING.md: Common issues documented here
Review docker-compose logs: Full system trace
Check GitHub issues: If this was forked from a repo
Ask AI/LLM: Paste logs into Claude/ChatGPT for analysis
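Logs may contain secrets, so it is worth masking them before pasting output into an external tool. This is a minimal sketch that covers only the two secret formats mentioned in this guide, Anthropic keys beginning with sk-ant- and Slack incoming-webhook URLs. The redact helper is hypothetical, and other credentials would need their own patterns.

```python
import re

_PATTERNS = [
    # Anthropic keys start with sk-ant-
    (re.compile(r"sk-ant-[A-Za-z0-9_\-]+"), "sk-ant-[REDACTED]"),
    # Slack incoming-webhook URLs
    (re.compile(r"https://hooks\.slack\.com/services/[A-Za-z0-9/_\-]+"),
     "https://hooks.slack.com/services/[REDACTED]"),
]


def redact(text):
    """Replace known secret formats in text with a placeholder."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


print(redact("ANTHROPIC_API_KEY=sk-ant-api03-abc123 loaded"))
# ANTHROPIC_API_KEY=sk-ant-[REDACTED] loaded
```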
Performance Tuning
Monitor Polling Interval
# In agents/hemostat_monitor/monitor.py
time.sleep(30) # Change to 10 for faster detection, 60 for less CPU
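To make the trade-off concrete, here is a sketch of a polling loop with the interval as a parameter instead of a hard-coded value. It is not the Monitor's actual loop: run_monitor, poll_once and max_cycles are illustrative names, and the injectable sleep exists only so the loop can be exercised without waiting.

```python
import time


def run_monitor(poll_once, interval=30.0, max_cycles=None, sleep=time.sleep):
    """Call poll_once() repeatedly, sleeping `interval` seconds between calls.

    max_cycles limits the number of polls (None means run forever).
    Returns the number of polls performed.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        poll_once()
        cycles += 1
        sleep(interval)
    return cycles


# Dry run: record the sleeps instead of actually waiting
slept = []
run_monitor(lambda: None, interval=10, max_cycles=3, sleep=slept.append)
print(slept)  # [10, 10, 10]
```

At a 10 second interval the loop polls three times as often as at the default 30 seconds, which is where the extra CPU cost comes from.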
Analyzer Thresholds
# In agents/hemostat_analyzer/analyzer.py
if memory_pct > 90: # Lower for earlier alerts
if cpu_pct > 90: # Lower for earlier alerts
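The effect of lowering these thresholds can be checked in isolation. The function below is a sketch that reproduces only the two comparisons shown above, with the limits as parameters; classify is a hypothetical name, and the real Analyzer may weigh other signals as well.

```python
def classify(memory_pct, cpu_pct, memory_limit=90.0, cpu_limit=90.0):
    """Return the list of resources whose usage is strictly above its limit."""
    breaches = []
    if memory_pct > memory_limit:
        breaches.append("memory")
    if cpu_pct > cpu_limit:
        breaches.append("cpu")
    return breaches


print(classify(memory_pct=92.0, cpu_pct=40.0))  # ['memory']
print(classify(memory_pct=85.0, cpu_pct=88.0))  # []
# The same readings with lower limits now count as breaches
print(classify(memory_pct=85.0, cpu_pct=88.0, memory_limit=80.0, cpu_limit=85.0))  # ['memory', 'cpu']
```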
Responder Cooldown
# In agents/hemostat_responder/responder.py
self.cooldown_period = 3600 # 1 hour, lower for more restarts
self.max_actions_per_hour = 3 # Increase for more remediation attempts
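How these two settings interact is easier to see in a small model. The sketch below assumes both limits are tracked per container and that the cooldown is measured from the last action taken. RemediationGate is a hypothetical class, and the real Responder may keep this state in Redis and apply the limits differently.

```python
import time


class RemediationGate:
    """Decide whether a remediation action is allowed for a container."""

    def __init__(self, cooldown_period=3600, max_actions_per_hour=3):
        self.cooldown_period = cooldown_period
        self.max_actions_per_hour = max_actions_per_hour
        self._history = {}  # container name -> timestamps of allowed actions

    def allow(self, container, now=None):
        now = time.time() if now is None else now
        history = self._history.setdefault(container, [])
        # Keep only actions from the last hour
        history[:] = [t for t in history if now - t < 3600]
        if history and now - history[-1] < self.cooldown_period:
            return False  # still cooling down after the last action
        if len(history) >= self.max_actions_per_hour:
            return False  # hourly budget used up
        history.append(now)
        return True


gate = RemediationGate(cooldown_period=600, max_actions_per_hour=3)
print(gate.allow("hemostat-test-api", now=0))     # True
print(gate.allow("hemostat-test-api", now=300))   # False (cooldown)
print(gate.allow("hemostat-test-api", now=600))   # True
```

Under this per-container interpretation, the default one-hour cooldown already limits each container to one action per hour, so raising max_actions_per_hour has no visible effect unless the cooldown is lowered too.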
Dashboard Refresh Rate
# In dashboard/app.py - Streamlit auto-refreshes every 5 seconds
# To change, add to Streamlit config
Advanced Debugging
Network Inspection
# Check if agents can communicate
docker exec hemostat-monitor ping hemostat-redis
docker exec hemostat-analyzer ping hemostat-redis
# Test inter-container connectivity
docker network inspect hemostat-agents_default
Resource Limits
# View resource usage
docker stats hemostat-monitor
docker stats hemostat-analyzer
docker stats hemostat-responder
# Set resource limits in docker-compose.yml
# See docker-compose.yml for examples
Collecting Debug Information
If you’re still stuck, capture the output of:
docker-compose logs > hemostat-debug.log
docker ps > containers.log
docker-compose ps > services.log
Review the logs carefully for error messages!