Autoscaler Architecture

The Productify Autoscaler uses a two-tier architecture: a Nomad Autoscaler target plugin written in Go, and a Python optimizer service that combines time-series forecasting with mathematical optimization to produce scaling decisions.

System Overview

┌────────────────────────────────────────────────────────┐
│              Nomad Autoscaler Framework                │
│                                                        │
│  ┌──────────────────────────────────────────────────┐  │
│  │      APM Plugin (Metrics Source)                 │  │
│  │  - Collects application metrics                  │  │
│  │  - CPU, Memory, Request count                    │  │
│  └──────────────────────────────────────────────────┘  │
│                       │                                │
│                       ▼                                │
│  ┌──────────────────────────────────────────────────┐  │
│  │      Strategy Plugin (Scaling Logic)             │  │
│  │  - Evaluates scaling policies                    │  │
│  │  - Determines when to scale                      │  │
│  └──────────────────────────────────────────────────┘  │
│                       │                                │
│                       ▼                                │
│  ┌──────────────────────────────────────────────────┐  │
│  │  Target Plugin (Productify Nomadscaler)          │  │
│  │  - Manages second-level cache                    │  │
│  │  - Calls optimizer for predictions               │  │
│  │  - Updates Nomad job counts                      │  │
│  └──────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────┘

                       │ HTTP API

┌────────────────────────────────────────────────────────┐
│           Optimizer Service (Python)                   │
│                                                        │
│  ┌──────────────────────────────────────────────────┐  │
│  │      Time-Series Forecasting                     │  │
│  │  - SARIMAX model                                 │  │
│  │  - Historical data analysis                      │  │
│  │  - Future demand prediction                      │  │
│  └──────────────────────────────────────────────────┘  │
│                       │                                │
│                       ▼                                │
│  ┌──────────────────────────────────────────────────┐  │
│  │      MILP Optimization                           │  │
│  │  - Constraint-based optimization                 │  │
│  │  - Cost vs. performance balance                  │  │
│  └──────────────────────────────────────────────────┘  │
│                       │                                │
│                       ▼                                │
│  ┌──────────────────────────────────────────────────┐  │
│  │      Prediction Cache Generation                 │  │
│  │  - N seconds of predictions                      │  │
│  │  - One value per second                          │  │
│  │  - Includes metadata                             │  │
│  └──────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────┘

Components

Nomadscaler Plugin (Go)

Role: Nomad autoscaler target plugin

Responsibilities:

  • Receive scaling policy evaluations from Nomad autoscaler
  • Request predictions from optimizer service
  • Manage the per-second prediction cache (one predicted replica count per second)
  • Select appropriate cached value based on elapsed time
  • Update Nomad job replica counts
  • Handle optimizer failures gracefully

Key Features:

  • Per-second, time-based prediction cache
  • Automatic cache refresh strategy
  • Fallback to cached values on optimizer failure
  • Constraint enforcement (min/max replicas)

Optimizer Service (Python)

Role: Time-Series forecasting and optimization engine

Responsibilities:

  • Fetch historical metrics from Nomad
  • Train SARIMAX time-series model
  • Forecast demand for next N seconds
  • Calculate optimal replica counts using MILP
  • Provide HTTP API for predictions
  • Manage model persistence

Key Features:

  • SARIMAX time-series forecasting
  • Mixed-Integer Linear Programming
  • Configurable prediction horizon
  • Multi-objective optimization
  • Token-based authentication

Data Flow

1. Metrics Collection

Application → Nomad Metrics → Historical Database

Metrics collected:

  • CPU usage (%)
  • Memory usage (MB)
  • Request rate (req/s)
  • Active connections
  • Custom application metrics

2. Policy Evaluation

Nomad Autoscaler → Check Policy → Evaluate Strategy

Policy defines:

  • Target metric
  • Evaluation interval
  • Min/max replicas
  • Strategy plugin to use

3. Prediction Request

Nomadscaler Plugin → HTTP POST /optimize → Optimizer Service

Request payload:

json
{
  "token": "SECRET",
  "check": {
    "metric_app_name": "my-app"
  },
  "current_replicas": 3,
  "min_replicas": 1,
  "max_replicas": 10,
  "cache_size": 10
}
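A minimal sketch of how a client could assemble and sanity-check this payload before posting it. Only the field names come from the payload above; the function name `build_optimize_request` and the validation rules are assumptions made for illustration, and the real plugin (written in Go) may validate differently.

```python
import json


def build_optimize_request(token, app_name, current, min_replicas, max_replicas, cache_size):
    """Assemble the /optimize payload; field names follow the example payload."""
    # Assumed sanity checks, not taken from the service itself.
    if not 0 <= min_replicas <= max_replicas:
        raise ValueError("need 0 <= min_replicas <= max_replicas")
    if cache_size < 1:
        raise ValueError("cache_size must be at least 1")
    return {
        "token": token,
        "check": {"metric_app_name": app_name},
        "current_replicas": current,
        "min_replicas": min_replicas,
        "max_replicas": max_replicas,
        "cache_size": cache_size,
    }


payload = build_optimize_request("SECRET", "my-app", 3, 1, 10, 10)
body = json.dumps(payload).encode("utf-8")  # bytes ready to send as the POST body
```

Here `cache_size` doubles as the number of per-second predictions requested, which matches the 10-element response shown later.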

4. Time-Series Forecasting

Optimizer → Fetch Metrics → Train SARIMAX → Forecast Demand

SARIMAX stands for Seasonal AutoRegressive Integrated Moving Average with eXogenous variables.

The optimizer automatically selects the best SARIMAX order from multiple candidates:

  • Candidate orders: (1,1,1), (1,0,1), (2,1,1), (1,1,0)
  • Selection criterion: minimum AIC (Akaike Information Criterion)
  • Exogenous variables: avg_response_time, authentication_awaiting_users, queue_waiting, avg_processing_time
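The selection step can be sketched as a loop that fits each candidate and keeps the one with the lowest AIC. This is an illustration under assumptions, not the optimizer's actual code: the helper names `sarimax_aic` and `select_order` are hypothetical, the fit uses the statsmodels `SARIMAX` class, and no seasonal order is set because the source does not list one.

```python
CANDIDATE_ORDERS = [(1, 1, 1), (1, 0, 1), (2, 1, 1), (1, 1, 0)]


def sarimax_aic(endog, exog, order):
    """Fit one SARIMAX candidate with statsmodels and return its AIC."""
    # Imported lazily so the selection logic below has no hard dependency.
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    return SARIMAX(endog, exog=exog, order=order).fit(disp=False).aic


def select_order(endog, exog, candidates=CANDIDATE_ORDERS, fit_aic=sarimax_aic):
    """Return (best_order, best_aic), skipping candidates that fail to fit."""
    best = None
    for order in candidates:
        try:
            aic = fit_aic(endog, exog, order)
        except Exception:
            continue  # assumed policy: a failed fit just drops that candidate
        if best is None or aic < best[1]:
            best = (order, aic)
    if best is None:
        raise RuntimeError("no candidate SARIMAX order could be fitted")
    return best
```

Passing `fit_aic` as a parameter keeps the comparison logic testable without fitting real models; `exog` would hold the exogenous columns listed above, aligned row by row with the demand series.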

5. MILP Optimization

Forecasted Demand → MILP Solver → Optimal Replica Counts

Optimization objective:

Minimize: Σ (replica_cost × replicas[t] +
              penalty × sla_shortfall[t] +
              startup_cost × scale_up[t] +
              shutdown_cost × scale_down[t])

Subject to:
  min_replicas ≤ replicas[t] ≤ max_replicas                  ∀t
  capacity × replicas[t] + sla_shortfall[t] ≥ demand[t]      ∀t
  scale_down[t] ≤ replicas[t-1]                              ∀t
  scale_up[t] ≤ max_scale_up                                 ∀t
  scale_down[t] ≤ max_scale_down                             ∀t
  replicas[t] ∈ Integer                                      ∀t
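To make the objective concrete, the sketch below solves a small instance exactly. It is a dynamic-programming stand-in, not a MILP solver and not the optimizer's implementation. It assumes that scale_up[t] and scale_down[t] are the positive and negative parts of replicas[t] - replicas[t-1], and that the shortfall term is the unmet demand max(0, demand[t] - capacity × replicas[t]). The function name and all default cost values are hypothetical.

```python
def plan_replicas(demand, current, min_r, max_r, capacity,
                  replica_cost=1.0, penalty=10.0,
                  startup_cost=0.5, shutdown_cost=0.5,
                  max_scale_up=2, max_scale_down=1):
    """Return the cheapest replica count per time step for the given demand."""
    # best maps a replica count to (total cost, plan) for the cheapest way to reach it.
    best = {current: (0.0, [])}
    for d in demand:
        nxt = {}
        for prev, (cost, plan) in best.items():
            for r in range(min_r, max_r + 1):
                up, down = max(0, r - prev), max(0, prev - r)
                if up > max_scale_up or down > max_scale_down:
                    continue  # violates the per-step scaling limits
                shortfall = max(0.0, d - capacity * r)  # unmet demand at this step
                total = (cost + replica_cost * r + penalty * shortfall
                         + startup_cost * up + shutdown_cost * down)
                if r not in nxt or total < nxt[r][0]:
                    nxt[r] = (total, plan + [r])
        if not nxt:
            raise ValueError("no feasible plan within the scaling limits")
        best = nxt
    return min(best.values(), key=lambda entry: entry[0])[1]


plan = plan_replicas([100, 100, 400], current=1, min_r=1, max_r=10, capacity=100)
```

With these illustrative costs the plan is `[1, 2, 4]`: because at most two replicas can be added per step, the model starts scaling one step before the demand spike instead of paying the shortfall penalty.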

6. Response Generation

Optimizer → Generate N values → Return to Plugin

Response:

json
{
  "desired": [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
}

The array contains one replica count per second for the next 10 seconds, matching the cache_size of 10 in the request.
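A hedged sketch of how a caller might decode and validate this response before caching it. The `desired` field comes from the response above; the function name `parse_desired`, the length check against `cache_size`, and the defensive clamping are assumptions for illustration.

```python
import json


def parse_desired(body, cache_size, min_replicas, max_replicas):
    """Decode the optimizer response and return a validated list of replica counts."""
    desired = json.loads(body).get("desired")
    if not isinstance(desired, list) or len(desired) != cache_size:
        raise ValueError(f"expected {cache_size} values in 'desired'")
    for value in desired:
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(f"non-integer replica count: {value!r}")
    # Assumed safeguard: clamp to the policy bounds even if the optimizer already did.
    return [max(min_replicas, min(max_replicas, v)) for v in desired]


values = parse_desired('{"desired": [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]}', 10, 1, 10)
```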

7. Cache Management

Plugin → Cache Response → Serve Values → Refresh

Cache behavior:

  • Store all N values with timestamp
  • Select the value at index seconds_elapsed % cache_size, where seconds_elapsed is measured from the stored timestamp
  • After 2 cache uses, request fresh predictions
  • On optimizer failure, continue using cache
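The cache behavior listed above can be sketched as follows. The real plugin is written in Go; this Python version only illustrates the logic. It assumes that a "cache use" means one lookup, that elapsed time is measured from the moment the values were stored, and that any exception from the fetch function counts as an optimizer failure. The class and parameter names are hypothetical.

```python
import time


class PredictionCache:
    """Serve per-second predictions and refresh them after a number of uses."""

    def __init__(self, fetch, max_uses=2, clock=time.monotonic):
        self._fetch = fetch          # callable returning a list of replica counts
        self._max_uses = max_uses    # assumed meaning of "after 2 cache uses"
        self._clock = clock
        self._values = []
        self._stored_at = 0.0
        self._uses = 0

    def _refresh(self):
        try:
            values = self._fetch()
        except Exception:
            return  # optimizer failure: keep serving the existing cache
        if values:
            self._values = list(values)
            self._stored_at = self._clock()
            self._uses = 0

    def desired(self):
        """Return the replica count for the current second, or None if nothing is cached."""
        if not self._values or self._uses >= self._max_uses:
            self._refresh()
        if not self._values:
            return None
        elapsed = int(self._clock() - self._stored_at)
        self._uses += 1
        return self._values[elapsed % len(self._values)]
```

Injecting `clock` and `fetch` makes the time-dependent behavior easy to exercise with a fake clock. Note that the modulo wraps around, so a cache that cannot be refreshed keeps cycling through the same stored values.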

8. Scaling Execution

Plugin → Update Nomad Job → Nomad Schedules Tasks

Nomad actions:

  • Calculate delta (desired - current)
  • Schedule new allocations (scale up)
  • Stop allocations (scale down)
  • Wait for health checks
  • Update service catalog
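The delta calculation, combined with the min/max constraint enforcement mentioned for the plugin, can be illustrated in a few lines. The function name `scaling_action` and the returned tuple shape are hypothetical.

```python
def scaling_action(desired, current, min_replicas, max_replicas):
    """Clamp the desired count to the policy bounds and describe the change."""
    target = max(min_replicas, min(max_replicas, desired))
    delta = target - current
    if delta > 0:
        return ("scale_up", delta)      # new allocations to schedule
    if delta < 0:
        return ("scale_down", -delta)   # allocations to stop
    return ("none", 0)
```

For example, a desired count of 12 with 3 running replicas and a maximum of 10 gives `("scale_up", 7)`, because the target is clamped to 10 first.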

Failure Modes & Handling

Optimizer Unavailable

Behavior: Plugin uses cached predictions

Impact: Continues scaling with last known good predictions

Recovery: Automatic reconnection on next refresh attempt

Stale Metrics

Detection: Check metric timestamp

Behavior: Use default conservative scaling

Mitigation: Alert on metric staleness

MILP Solver Timeout

Behavior: Fall back to simple linear interpolation

Impact: Less optimal but still functional scaling

Configuration: Set solver timeout in config
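The source does not say which endpoints the fallback interpolates between. As one possible reading, the sketch below moves linearly from the current replica count to a target count over the prediction horizon, rounding up and clamping to the policy bounds. The function name, its parameters, and the choice to round up are assumptions.

```python
import math


def interpolate_replicas(current, target, steps, min_replicas, max_replicas):
    """Move linearly from current to target over `steps` one-second slots."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    plan = []
    for i in range(1, steps + 1):
        value = current + (target - current) * i / steps
        # Round up so the fallback never provisions less than the straight line.
        plan.append(max(min_replicas, min(max_replicas, math.ceil(value))))
    return plan
```

Rounding up trades a little extra cost for headroom, which seems a reasonable default for a degraded mode.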

Network Partition

Behavior: Plugin keeps serving the most recently cached predictions (10 seconds' worth with a cache_size of 10)

Impact: Up to 10 seconds of scaling with stale data

Recovery: Resume normal operation when network restored

Scaling Patterns

Proactive Scale-Up

09:25 - Forecast predicts 09:30 traffic increase
09:25 - Start adding replicas
09:30 - Fully scaled when traffic arrives

Gradual Scale-Down

23:00 - Forecast predicts decreasing demand
23:00 - Begin removing replicas one by one
00:00 - Minimal replicas for overnight

Event-Driven

11:50 - Scheduled event at 12:00
11:50 - Rapidly scale to predicted capacity
12:00 - Ready for event traffic
12:30 - Gradual scale down
