Autoscaler Architecture
The Productify Autoscaler uses a two-tier architecture that combines time-series forecasting with mathematical optimization to make intelligent scaling decisions.
System Overview
┌────────────────────────────────────────────────────────┐
│ Nomad Autoscaler Framework │
│ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ APM Plugin (Metrics Source) │ │
│ │ - Collects application metrics │ │
│ │ - CPU, Memory, Request count │ │
│ └──────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Strategy Plugin (Scaling Logic) │ │
│ │ - Evaluates scaling policies │ │
│ │ - Determines when to scale │ │
│ └──────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Target Plugin (Productify Nomadscaler) │ │
│ │ - Manages second-level cache │ │
│ │ - Calls optimizer for predictions │ │
│ │ - Updates Nomad job counts │ │
│ └──────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────┘
│
│ HTTP API
▼
┌────────────────────────────────────────────────────────┐
│ Optimizer Service (Python) │
│ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Time-Series Forecasting │ │
│ │ - SARIMAX model │ │
│ │ - Historical data analysis │ │
│ │ - Future demand prediction │ │
│ └──────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ MILP Optimization │ │
│ │ - Constraint-based optimization │ │
│ │ - Cost vs. performance balance │ │
│ └──────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Prediction Cache Generation │ │
│ │ - N seconds of predictions │ │
│ │ - One value per second │ │
│ │ - Includes metadata │ │
│ └──────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────┘
Components
Nomadscaler Plugin (Go)
Role: Nomad autoscaler target plugin
Responsibilities:
- Receive scaling policy evaluations from Nomad autoscaler
- Request predictions from optimizer service
- Manage second-level prediction cache
- Select appropriate cached value based on elapsed time
- Update Nomad job replica counts
- Handle optimizer failures gracefully
Key Features:
- Second-level time-based cache
- Automatic cache refresh strategy
- Fallback to cached values on optimizer failure
- Constraint enforcement (min/max replicas)
Optimizer Service (Python)
Role: Time-series forecasting and optimization engine
Responsibilities:
- Fetch historical metrics from Nomad
- Train SARIMAX time-series model
- Forecast demand for next N seconds
- Calculate optimal replica counts using MILP
- Provide HTTP API for predictions
- Manage model persistence
Key Features:
- SARIMAX time-series forecasting
- Mixed-Integer Linear Programming
- Configurable prediction horizon
- Multi-objective optimization
- Token-based authentication
Data Flow
1. Metrics Collection
Application → Nomad Metrics → Historical Database
Metrics collected:
- CPU usage (%)
- Memory usage (MB)
- Request rate (req/s)
- Active connections
- Custom application metrics
2. Policy Evaluation
Nomad Autoscaler → Check Policy → Evaluate Strategy
Policy defines:
- Target metric
- Evaluation interval
- Min/max replicas
- Strategy plugin to use
3. Prediction Request
Nomadscaler Plugin → HTTP POST /optimize → Optimizer Service
Request payload:
{
  "token": "SECRET",
  "check": {
    "metric_app_name": "my-app"
  },
  "current_replicas": 3,
  "min_replicas": 1,
  "max_replicas": 10,
  "cache_size": 10
}
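For illustration, the equivalent call in Python (the plugin itself is written in Go, and the optimizer URL below is a placeholder, not a real endpoint):

import requests

payload = {
    "token": "SECRET",
    "check": {"metric_app_name": "my-app"},
    "current_replicas": 3,
    "min_replicas": 1,
    "max_replicas": 10,
    "cache_size": 10,
}

# Placeholder address for wherever the optimizer service is reachable
OPTIMIZER_URL = "http://optimizer.example.internal:8000"

resp = requests.post(f"{OPTIMIZER_URL}/optimize", json=payload, timeout=5)
resp.raise_for_status()
desired = resp.json()["desired"]   # e.g. [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]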
4. Time-Series Forecasting
Optimizer → Fetch Metrics → Train SARIMAX → Forecast Demand
SARIMAX model:
- Seasonal
- AutoRegressive
- Integrated
- Moving Average with eXogenous variables
The optimizer automatically selects the best SARIMAX order from multiple candidates (sketched in code after this list):
- Candidate orders: (1,1,1), (1,0,1), (2,1,1), (1,1,0)
- Selection criterion: minimum AIC (Akaike Information Criterion)
- Exogenous variables: avg_response_time, authentication_awaiting_users, queue_waiting, avg_processing_time
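A minimal sketch of that selection step, assuming statsmodels with a pandas demand series and an exogenous DataFrame (function names are illustrative, not the optimizer's actual code):

import warnings

import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

CANDIDATE_ORDERS = [(1, 1, 1), (1, 0, 1), (2, 1, 1), (1, 1, 0)]

def fit_best_sarimax(demand: pd.Series, exog: pd.DataFrame):
    """Fit every candidate order and keep the model with the lowest AIC."""
    best_fit, best_aic = None, float("inf")
    for order in CANDIDATE_ORDERS:
        try:
            with warnings.catch_warnings():
                warnings.simplefilter("ignore")
                fit = SARIMAX(demand, exog=exog, order=order).fit(disp=False)
        except Exception:
            continue  # skip orders that fail to converge
        if fit.aic < best_aic:
            best_fit, best_aic = fit, fit.aic
    return best_fit

def forecast_demand(fit, future_exog: pd.DataFrame, horizon: int):
    """Forecast demand for the next `horizon` steps (one per second)."""
    return fit.forecast(steps=horizon, exog=future_exog)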
5. MILP Optimization
Forecasted Demand → MILP Solver → Optimal Replica Counts
Optimization objective (a solver sketch follows the constraints):
Minimize: Σ (replica_cost × replicas[t] +
penalty × sla_shortfall[t] +
startup_cost × scale_up[t] +
shutdown_cost × scale_down[t])
Subject to:
min_replicas ≤ replicas[t] ≤ max_replicas ∀t
capacity × replicas[t] + sla_shortfall[t] ≥ demand[t] ∀t
scale_down[t] ≤ replicas[t-1] ∀t
scale_up[t] ≤ max_scale_up ∀t
scale_down[t] ≤ max_scale_down ∀t
replicas[t] ∈ Integer ∀t
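A minimal PuLP-style sketch of this model follows. The default costs, the replica-balance link between consecutive steps, and the CBC solver are assumptions for illustration, not the optimizer's exact formulation:

from pulp import LpMinimize, LpProblem, LpVariable, lpSum, PULP_CBC_CMD

def optimize_replicas(demand, current, min_r, max_r, capacity,
                      replica_cost=1.0, penalty=100.0,
                      startup_cost=0.5, shutdown_cost=0.5,
                      max_scale_up=2, max_scale_down=2):
    T = range(len(demand))
    prob = LpProblem("autoscaler", LpMinimize)

    replicas = LpVariable.dicts("replicas", T, min_r, max_r, cat="Integer")
    shortfall = LpVariable.dicts("sla_shortfall", T, 0)
    up = LpVariable.dicts("scale_up", T, 0, max_scale_up, cat="Integer")
    down = LpVariable.dicts("scale_down", T, 0, max_scale_down, cat="Integer")

    # Objective: balance replica cost, SLA shortfall, and scaling churn
    prob += lpSum(replica_cost * replicas[t] + penalty * shortfall[t]
                  + startup_cost * up[t] + shutdown_cost * down[t] for t in T)

    for t in T:
        prev = current if t == 0 else replicas[t - 1]
        prob += capacity * replicas[t] + shortfall[t] >= demand[t]  # serve demand
        prob += replicas[t] == prev + up[t] - down[t]               # replica balance (assumed link)
        prob += down[t] <= prev                                     # cannot stop more than exist

    prob.solve(PULP_CBC_CMD(msg=False, timeLimit=5))                # bounded solve time
    return [int(replicas[t].value()) for t in T]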
6. Response Generation
Optimizer → Generate N values → Return to Plugin
Response:
{
  "desired": [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
}
The array contains one replica count per second for the next 10 seconds.
7. Cache Management
Plugin → Cache Response → Serve Values → Refresh
Cache behavior (sketched in code after this list):
- Store all N values with timestamp
- Select value based on seconds_elapsed % cache_size
- After 2 cache uses, request fresh predictions
- On optimizer failure, continue using cache
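A simplified Python sketch of this cache behavior (the production plugin is written in Go; class, method, and argument names are illustrative):

import time

class PredictionCache:
    """Time-based cache over the optimizer's per-second replica predictions."""

    def __init__(self, fetch_predictions, cache_size=10, max_uses=2):
        self.fetch = fetch_predictions   # callable returning the "desired" list
        self.cache_size = cache_size
        self.max_uses = max_uses
        self.values, self.fetched_at, self.uses = [], 0.0, 0

    def desired_replicas(self, fallback):
        # Refresh after max_uses lookups (or when the cache is empty)
        if not self.values or self.uses >= self.max_uses:
            try:
                self.values = self.fetch()
                self.fetched_at = time.time()
                self.uses = 0
            except Exception:
                pass                     # optimizer unavailable: keep the stale cache
        if not self.values:
            return fallback              # never had a cache: conservative default (assumed)
        self.uses += 1
        elapsed = int(time.time() - self.fetched_at)
        return self.values[elapsed % self.cache_size]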
8. Scaling Execution
Plugin → Update Nomad Job → Nomad Schedules Tasks
Nomad actions:
- Calculate delta (desired - current), see the sketch after this list
- Schedule new allocations (scale up)
- Stop allocations (scale down)
- Wait for health checks
- Update service catalog
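The delta calculation and min/max clamping reduce to a few lines (illustrative helper, not the plugin's actual code):

def scaling_delta(desired, current, min_replicas, max_replicas):
    """Clamp the desired count to policy bounds and return (target, delta)."""
    target = max(min_replicas, min(max_replicas, desired))
    return target, target - current   # positive delta: scale up, negative: scale down

# e.g. scaling_delta(desired=7, current=3, min_replicas=1, max_replicas=10) -> (7, 4)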
Failure Modes & Handling
Optimizer Unavailable
Behavior: Plugin uses cached predictions
Impact: Continues scaling with last known good predictions
Recovery: Automatic reconnection on next refresh attempt
Stale Metrics
Detection: Check metric timestamp
Behavior: Use default conservative scaling
Mitigation: Alert on metric staleness
MILP Solver Timeout
Behavior: Fall back to simple linear interpolation (see the sketch below)
Impact: Less optimal but still functional scaling
Configuration: Set solver timeout in config
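One plausible form of that fallback, interpolating linearly from the current replica count toward the count implied by forecast demand at the end of the horizon (the details are assumptions):

import math

def fallback_replicas(current, forecast, capacity, min_r, max_r):
    """Linear interpolation from the current count toward the demand-implied count."""
    horizon = len(forecast)
    end_target = max(min_r, min(max_r, math.ceil(forecast[-1] / capacity)))
    step = (end_target - current) / max(horizon - 1, 1)
    return [max(min_r, min(max_r, round(current + step * t))) for t in range(horizon)]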
Network Partition
Behavior: Plugin caches last 10 seconds of predictions
Impact: Up to 10 seconds of scaling with stale data
Recovery: Resume normal operation when network restored
Scaling Patterns
Proactive Scale-Up
09:25 - Forecast predicts 09:30 traffic increase
09:25 - Start adding replicas
09:30 - Fully scaled when traffic arrives
Gradual Scale-Down
23:00 - Forecast predicts decreasing demand
23:00 - Begin removing replicas one by one
00:00 - Minimal replicas for overnight
Event-Driven
11:50 - Scheduled event at 12:00
11:50 - Rapidly scale to predicted capacity
12:00 - Ready for event traffic
12:30 - Gradual scale down
See Also
- Quick Start - Get started quickly
- Nomadscaler Plugin - Plugin configuration
- Optimizer Service - Optimizer setup
- Scaling Policies - Policy examples