# Live Monitoring with `tonl top`

Real-time server monitoring dashboard for the TONL MCP Server. Like `htop`, but for your TONL infrastructure.
## Overview

The `tonl top` command provides a live, terminal-based dashboard that displays real-time metrics from your TONL MCP Server. Monitor CPU, memory, conversions, token savings, and business impact in a single view.
## Quick Start

```bash
# Start server
export TONL_AUTH_TOKEN=your-token
npm run mcp:start

# In another terminal
tonl top
```

![Dashboard showing CPU 3.3%, RAM 32.9 MB, 156 conversions, 38k tokens saved]
## Features

### System Resources
- CPU Usage: Real-time percentage with color-coded bars
- Memory (RSS): Actual process memory usage vs container limit (default 512MB)
- Heap Usage: Node.js heap percentage (shown as additional info)
- Event Loop Lag: Monitor for performance issues
- Health Status: 🟢 Online / 🔴 Degraded
### Live Activity
- Active Streams: Current SSE connections
- Conversions: Total, successful, and failed counts
- Error Tracking: By type (auth, validation, internal, stream)
### Business Impact
- Tokens Saved: Cumulative token savings
- Cost Saved: Estimated cost savings in USD
- Avg Compression: Average compression ratio across all operations
## Installation

```bash
npm install -g tonl-mcp-bridge
```

## Basic Usage
### Monitor Local Server

```bash
tonl top
```

Connects to `http://localhost:3000/metrics` by default.
### Monitor Remote Server

```bash
tonl top --url https://production.company.com/metrics
```

### With Authentication
```bash
# Via environment variable
export TONL_AUTH_TOKEN=your-secure-token
tonl top

# Via command line
tonl top --token your-secure-token
```

### Custom Refresh Rate
```bash
# Update every 2 seconds (default: 1s)
tonl top --interval 2000
```

## Command Options
```
tonl top [options]

Options:
  -u, --url <url>        Server metrics URL (default: http://localhost:3000/metrics)
  -i, --interval <ms>    Refresh interval in milliseconds (default: 1000)
  -t, --token <token>    Auth token (or use TONL_AUTH_TOKEN env var)
  --no-stream            Disable SSE streaming (use polling instead)
  -h, --help             Display help
```

## Dashboard Sections
### Header

```
TONL SERVER MONITOR  🟢 ONLINE  v1.0.0 • http://localhost:3000/metrics • Uptime: 0h 12m
```

- Status indicator (🟢/🔴)
- Server version
- Server URL
- Uptime
### Resources

```
Resources:
  CPU [████░░░░░░░░░░░░░░░░] 3.3%
  RAM [███░░░░░░░░░░░░░░░░░] 93.0 MB (Heap: 67%)
  Lag [2.1ms] Healthy
```

CPU Bar Colors:
- Green: 0-50%
- Yellow: 50-80%
- Red: 80%+
Memory Bar:
- Shows RSS (Resident Set Size) - actual process memory
- Bar represents percentage of typical container limit (512MB)
- Heap percentage shown as additional context
- Why RSS? More accurate for operations/K8s resource planning
- Heap % high (80-95%) is normal - Node.js triggers GC automatically
Memory Colors:
- Green: 0-50% of 512MB (0-256MB)
- Yellow: 50-80% (256-410MB)
- Red: 80%+ (410MB+)
Event Loop Lag:
- Healthy: <100ms
- Slow: ≥100ms
### Live Activity

```
Live Activity:
  👥 Active Streams: 1
  ⚡ Conversions:    156 total (156 ✓, 0 ✗)
  🔥 Errors (total): 0
```

Counters:
- Active Streams: Current SSE connections
- Conversions: Success/failure breakdown
- Errors: Detailed by type when > 0
### Business Impact

```
Business Impact:
  💰 Tokens Saved:    38,168
  💵 Cost Saved:      $0.0954
  📉 Avg Compression: 64.9%
```

Metrics:
- Tokens Saved: Cumulative since server start
- Cost Saved: Based on GPT-4o pricing ($2.50/1M tokens)
- Compression: Average savings percentage across all operations (higher = more bytes saved)
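As a quick sanity check on the figures above, here is a sketch of the cost arithmetic, assuming the flat $2.50-per-million-token rate mentioned in this section (actual model pricing may differ):

```shell
# Reproduce the dashboard's cost estimate from its token counter.
# 38,168 tokens saved at $2.50 per 1M tokens.
tokens_saved=38168
price_per_million=2.50
awk -v t="$tokens_saved" -v p="$price_per_million" \
  'BEGIN { printf "$%.4f\n", t / 1000000 * p }'
# Prints $0.0954, matching the Cost Saved field shown above.
```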
## Live Streaming Mode

By default, `tonl top` uses SSE (Server-Sent Events) for real-time updates:

```bash
tonl top   # SSE streaming (recommended)
```

Benefits:
- Real-time updates (push-based)
- Lower latency
- More efficient than polling
- Works great over networks
### Polling Mode

Fall back to HTTP polling if SSE is unavailable:

```bash
tonl top --no-stream
```

Uses periodic HTTP requests instead of a persistent connection.
## Authentication

### Development (No Auth)

```bash
# Server without token
npm run mcp:start

# Dashboard without token
tonl top
```

The server auto-generates session tokens.
### Production (With Auth)

```bash
# Server with token
export TONL_AUTH_TOKEN=your-secure-token
npm run mcp:start

# Dashboard with token
export TONL_AUTH_TOKEN=your-secure-token
tonl top
```

Or use the CLI flag:

```bash
tonl top --token your-secure-token
```

## Configuration
### Environment Variables

```bash
# Server URL (if not default)
export TONL_SERVER_URL=https://production.company.com/metrics

# Auth token
export TONL_AUTH_TOKEN=your-secure-token

# Then simply
tonl top
```

### Rate Limiting
Server-side rate limiting is configurable:
```bash
# Disable rate limiting (development)
export TONL_RATE_LIMIT_ENABLED=false
npm run mcp:start

# Configure limits (production)
export TONL_RATE_LIMIT_ENABLED=true
export TONL_RATE_LIMIT_WINDOW_MS=900000  # 15 minutes
export TONL_RATE_LIMIT_MAX=100           # 100 requests
npm run mcp:start
```

Note: Rate limiting applies to `/stream/convert`, not `/metrics/live`.
## Use Cases

### 1. Development Monitoring
Monitor server during local development:
```bash
# Terminal 1: Server
npm run mcp:start

# Terminal 2: Dashboard
tonl top

# Terminal 3: Load testing
node load-test.js
```

Watch metrics update in real time as you test.
### 2. Production Monitoring
Monitor remote production server:
```bash
tonl top --url https://tonl-prod.company.com/metrics --token $PROD_TOKEN
```

Keep the dashboard open in a dedicated terminal or tmux pane.
### 3. Performance Testing
Monitor server under load:
```bash
# Terminal 1: Server
npm run mcp:start

# Terminal 2: Dashboard
tonl top

# Terminal 3: Load test
for i in {1..1000}; do
  curl -X POST http://localhost:3000/stream/convert \
    -H "Content-Type: application/x-ndjson" \
    --data-binary @test.ndjson
done
```

Watch CPU, memory, and event loop lag.
### 4. Troubleshooting
Debug server issues:
```bash
tonl top
```

Check for:
- High CPU (>80%)
- Memory leaks (RSS growing continuously)
- Event loop lag (>100ms)
- Error spikes
- Failed conversions
### 5. Team Collaboration
Share server status with team:
```bash
# In a tmux or screen session
tonl top --url https://shared-server.company.com/metrics
```

Team members can view the same dashboard.
## Metrics Explained

### CPU Percentage
Calculated from Prometheus counters:

```
CPU% = (cpu_delta / time_delta) * 100
```

Requires at least two snapshots to calculate (shows 0% initially).
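A minimal sketch of that calculation, assuming you already have two readings of the server's cumulative CPU-seconds counter (from the `tonl_process_cpu_*` family) taken a known interval apart:

```shell
# CPU% from two snapshots of a cumulative CPU-seconds counter.
cpu_pct() {
  # $1 = first reading, $2 = second reading (CPU seconds)
  # $3 = wall-clock seconds between the two readings
  awk -v c1="$1" -v c2="$2" -v dt="$3" \
    'BEGIN { printf "%.1f\n", (c2 - c1) / dt * 100 }'
}

cpu_pct 12.40 12.45 1.5   # 0.05s of CPU over 1.5s of wall time -> 3.3
```

The 3.3% figure matches the Resources bar shown earlier; until a second snapshot arrives there is no delta, hence the initial 0%.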
### Memory Percentage

RSS (Resident Set Size):

```
RSS  = actual process memory footprint
Bar% = (RSS / 512MB limit) * 100
```

Why we show RSS instead of heap:
- RSS is what Kubernetes/Docker sees
- RSS is what counts against container limits
- RSS is stable and predictable
- Heap % can be 90%+ and still healthy (GC triggers automatically)
Heap Percentage (shown for context):

```
Heap% = (heap_used / heap_total) * 100
```

- High heap % (80-95%) is NORMAL and HEALTHY
- Node.js V8 engine triggers GC when needed
- Focus on RSS for operational decisions
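The bar percentage itself is simple arithmetic; a sketch assuming the default 512 MB container limit described above:

```shell
# Memory bar % = RSS as a share of the assumed container limit.
mem_bar_pct() {
  # $1 = RSS in MB, $2 = container limit in MB
  awk -v rss="$1" -v limit="$2" \
    'BEGIN { printf "%.0f\n", rss / limit * 100 }'
}

mem_bar_pct 93.0 512   # 93 MB of a 512 MB limit -> 18 (green, healthy)
```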
Example:

```
RAM [███░░░░░░░] 93.0 MB (Heap: 95%)
     ^^^                  ^^^
     RSS: 18% of 512MB    Heap almost full - GC will run soon
     Green bar = healthy  This is normal!
```

### Event Loop Lag
Measured in milliseconds:
- 0-10ms: Excellent
- 10-50ms: Good
- 50-100ms: Fair (warning)
- 100ms+: Poor (server is struggling)
### Compression Ratio

```
Compression% = (1 - tonl_bytes / json_bytes) * 100
```

A higher percentage means better compression.
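For example, computing the ratio for a single conversion; the byte counts below are illustrative, chosen to match the 64.9% average shown in the dashboard:

```shell
# Compression % from input (JSON) and output (TONL) sizes in bytes.
compression_pct() {
  awk -v json="$1" -v tonl="$2" \
    'BEGIN { printf "%.1f\n", (1 - tonl / json) * 100 }'
}

compression_pct 1000 351   # 1000 bytes of JSON -> 351 bytes of TONL: 64.9
```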
## Troubleshooting

### Dashboard Shows DEGRADED
Possible causes:
- High CPU usage (>80%)
- Event loop lag (>100ms)
- High RSS (>400MB for 512MB limit)
Note: High Heap % alone does NOT cause DEGRADED status - this is normal!
Solutions:
- Check server logs for errors
- Reduce concurrent requests
- Scale horizontally (more replicas)
- Optimize payload sizes
- Check if RSS is growing continuously (memory leak)
### Connection Failed (401/403)
Issue: Authentication error
Solutions:
```bash
# Check the token is set
echo $TONL_AUTH_TOKEN

# Use the correct token
export TONL_AUTH_TOKEN=correct-token-here
tonl top
```

### No Data / Stuck on Loading
Issue: Server not responding
Solutions:

- Check the server is running: `ps aux | grep node`
- Verify the endpoint: `curl http://localhost:3000/metrics`
- Check firewall settings
- Try polling mode: `tonl top --no-stream`
### Metrics Not Updating
Issue: SSE stream disconnected
Solutions:

- Check network stability
- Restart the dashboard: `Ctrl+C`, then `tonl top`
- Use polling mode: `tonl top --no-stream`
### High Error Count
Issue: Server reporting errors
Check error types:
- Auth: Wrong tokens
- Validation: Bad input data
- Internal: Server bugs
- Stream: Streaming failures
View server logs for details:

```bash
npm run mcp:start 2>&1 | tee server.log
```

## Performance
### Dashboard Overhead
- CPU: <1% typical
- Memory: ~50MB
- Network: ~1KB/second (SSE mode)
### Update Frequency
Default 1-second updates provide:
- Real-time feel
- Reasonable network usage
- Low server impact
Adjust for your needs:

```bash
tonl top --interval 5000   # Every 5 seconds
```

## Keyboard Shortcuts
- `Ctrl+C`: Exit the dashboard
- `q`: Quit (if implemented)
## Integration

### With Prometheus
The dashboard reads Prometheus metrics exposed by the server:
```bash
# View raw metrics
curl http://localhost:3000/metrics
```

Key metrics used:

- `tonl_process_cpu_*`: CPU usage
- `tonl_nodejs_heap_size_*`: Memory
- `tonl_nodejs_eventloop_lag_*`: Event loop
- `tonl_tokens_saved_total`: Token savings
- `tonl_conversion_requests_total`: Conversions
- `tonl_errors_total`: Errors
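A sketch of pulling just the counters you care about out of the exposition output; the snapshot below uses illustrative values rather than real server output:

```shell
# Filter the dashboard's key counters from a saved metrics snapshot.
# In practice the snapshot would come from:
#   curl -s http://localhost:3000/metrics > /tmp/metrics.txt
cat > /tmp/metrics.txt <<'EOF'
tonl_tokens_saved_total 38168
tonl_conversion_requests_total 156
tonl_errors_total 0
EOF

grep -E '^tonl_(tokens_saved|errors)_total' /tmp/metrics.txt
```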
### With Grafana
For advanced monitoring, import metrics to Grafana:
```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'tonl'
    static_configs:
      - targets: ['localhost:3000']
    metrics_path: '/metrics'
```

Use `tonl top` for quick checks and Grafana for dashboards.
## Best Practices

### Always Monitor Under Load
Don't just check when things are working:
```bash
# Start monitoring BEFORE load testing
tonl top

# Then run load tests in another terminal
```

### Set Up Alerts
For production, monitor continuously:
- Use Grafana alerts for CPU/memory thresholds
- Use `tonl top` for manual checks
### Keep Token Secure
Never commit auth tokens:
```bash
# .env
TONL_AUTH_TOKEN=secret-token

# Use the env file
source .env
tonl top
```

### Monitor Multiple Servers
Use tmux/screen for multiple dashboards:
```bash
# Window 1: Production
tonl top --url https://prod.company.com/metrics

# Window 2: Staging
tonl top --url https://staging.company.com/metrics

# Window 3: Development
tonl top
```

## Examples
### Basic Local Monitoring
```bash
# Terminal 1
npm run mcp:start

# Terminal 2
tonl top
```

### Production Monitoring with Auth
```bash
export TONL_AUTH_TOKEN=$(vault read -field=token secret/tonl)
tonl top --url https://tonl-prod.company.com/metrics
```

### Custom Refresh Rate
```bash
# Every 500ms (high frequency)
tonl top --interval 500

# Every 5s (low frequency)
tonl top --interval 5000
```

### Polling Mode (No SSE)
```bash
tonl top --no-stream --interval 2000
```

## See Also
- MCP Server Guide - Server setup
- Metrics Reference - All available metrics
- Deployment - Production deployment
- Troubleshooting - Common issues