Productify Autoscaler

The autoscaler component for the Productify Framework, built primarily for use with Nomad. This repository holds two components; each has its own README with detailed usage and configuration.

Overview

The autoscaler uses a time-based cache with one-second granularity to improve scaling accuracy:

  1. The optimizer service forecasts resource requirements at second-level granularity
  2. It returns a list of desired replica counts for the next N seconds (configurable via cache_size, default: 10)
  3. The nomadscaler plugin caches these values with a timestamp
  4. Each cached value corresponds to one second of elapsed time
  5. The plugin selects the appropriate cached value based on seconds elapsed since cache update
  6. After the cache has been used twice, the plugin requests fresh predictions (falling back to the cached values on failure)

This approach allows for granular, second-by-second scaling decisions based on the optimizer's MILP-calculated resource requirements over time.
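The caching cycle above can be sketched in Python. This is an illustrative model only: the real plugin is written in Go, and the class, method, and parameter names here are mine, not the plugin's.

```python
import time


class PredictionCache:
    """Illustrative second-level prediction cache (the real plugin is Go)."""

    def __init__(self, fetch, max_uses=2):
        self.fetch = fetch        # callable returning a list of desired replica counts
        self.max_uses = max_uses  # request fresh predictions after this many cache reads
        self.values = []
        self.fetched_at = 0.0
        self.uses = 0

    def desired_replicas(self, now=None):
        now = time.time() if now is None else now
        if self.uses >= self.max_uses or not self.values:
            try:
                self.values = self.fetch()
                self.fetched_at = now
                self.uses = 0
            except Exception:
                pass  # optimizer unreachable: fall back to the stale cache
        self.uses += 1
        # Index by whole seconds elapsed since the cache was filled,
        # clamping to the last prediction if the horizon is exceeded.
        idx = min(int(now - self.fetched_at), len(self.values) - 1)
        return self.values[idx]
```

For example, with cached predictions `[3, 3, 4, ...]`, a read at the moment the cache is filled yields 3 replicas, and a read two seconds later yields 4.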

Components

  • nomadscaler — the Nomad Autoscaler target plugin (Go); reads scaling policies, manages the second-level cache, and updates Nomad job counts
  • optimizer — the prediction service (Python); runs SARIMAX forecasting and MILP optimization

Prediction Horizon

The prediction horizon is configurable:

  • Default: 10 seconds (returns 10 values)
  • Configure via cache_size parameter in the scaling policy
  • Optimizer uses SARIMAX time-series forecasting for predictions
  • Historical metrics, sampled over the past minutes, are used to forecast demand for the next several seconds
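A hypothetical scaling-policy fragment showing where cache_size might be set. The surrounding blocks follow standard Nomad scaling-policy syntax, but the exact key placement this plugin accepts is an assumption here; consult the component READMEs for the real schema.

```hcl
scaling {
  min = 1
  max = 10

  policy {
    # Placement of cache_size is illustrative; see the nomadscaler README
    # for the exact key the target plugin accepts.
    target {
      cache_size = 10   # prediction horizon: 10 one-second values
    }
  }
}
```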

Architecture

┌───────────────────────────────────────────────────┐
│         Nomad Autoscaler Framework                │
│  ┌────────────────────────────────────────────┐   │
│  │   Nomadscaler Plugin (Target Plugin)       │   │
│  │   - Reads scaling policies                 │   │
│  │   - Manages cache (second-level)           │   │
│  │   - Updates Nomad job counts               │   │
│  └────────────────────────────────────────────┘   │
└───────────────────────────────────────────────────┘
                    │ HTTP API

┌───────────────────────────────────────────────────┐
│         Optimizer Service (Python)                │
│  - SARIMAX time-series forecasting                │
│  - MILP optimization                              │
│  - Returns N second predictions                   │
│  - Historical metrics analysis                    │
└───────────────────────────────────────────────────┘


          ┌───────────────────┐
          │   Nomad Metrics   │
          │  (APM, Telemetry) │
          └───────────────────┘

How It Works

1. Prediction Request

Nomadscaler plugin requests predictions from optimizer service:

```json
{
  "check": {
    "metric_app_name": "my-app"
  },
  "current_replicas": 3,
  "min_replicas": 1,
  "max_replicas": 10,
  "cache_size": 10
}
```

2. Time-Series Forecasting

Optimizer analyzes historical metrics and forecasts demand for next N seconds.
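The optimizer's actual model is SARIMAX; as a dependency-free illustration of the forecasting step, here is a seasonal-naive stand-in (my simplification, not the service's model) that projects the last observed cycle forward:

```python
def seasonal_naive_forecast(history, horizon, season=60):
    """Forecast `horizon` future points by repeating the last full season.

    A deliberately simple stand-in for SARIMAX: it assumes demand
    roughly repeats with period `season` samples.
    """
    if len(history) < season:
        season = len(history)  # fall back to repeating the whole history
    last_cycle = history[-season:]
    return [last_cycle[i % season] for i in range(horizon)]
```

Unlike this stand-in, SARIMAX also captures trend and autocorrelation, which is what makes proactive scaling on gradual ramps possible.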

3. MILP Optimization

For each second in the prediction horizon, optimizer calculates optimal replica count considering:

  • Forecasted resource demand
  • Min/max constraints
  • Cost optimization
  • Performance requirements
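The per-second calculation can be illustrated with a greatly simplified stand-in for the MILP: choose the fewest replicas that cover the forecasted demand, clamped to the min/max bounds. All names here (including capacity_per_replica) are illustrative; the real MILP also weighs cost and performance terms jointly.

```python
import math


def replica_plan(forecast, capacity_per_replica, min_replicas, max_replicas):
    """One replica count per forecasted second: the cheapest count that
    covers demand, within [min_replicas, max_replicas]."""
    plan = []
    for demand in forecast:
        needed = math.ceil(demand / capacity_per_replica)
        plan.append(max(min_replicas, min(needed, max_replicas)))
    return plan
```

For a forecast of `[90, 250, 500]` requests/s at 100 requests/s per replica with bounds 1..4, this yields `[1, 3, 4]` — the last value clamped by max_replicas.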

4. Second-Level Caching

Optimizer returns array of desired replica counts:

```json
{
  "desired": [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
}
```

Each value corresponds to one second in the future.

5. Cache-Based Scaling

Plugin caches predictions and selects value based on elapsed time:

  • Second 0: Use desired[0] = 3 replicas
  • Second 1: Use desired[1] = 3 replicas
  • Second 2: Use desired[2] = 4 replicas
  • ...and so on

6. Automatic Refresh

After the cache has been used twice, the plugin requests fresh predictions, falling back to the cached values if the optimizer is unreachable.

Use Cases

Traffic Spike Prevention

Predict and scale before traffic increases:

Historical Pattern:
  09:00 - Low traffic
  09:30 - Gradual increase
  10:00 - Peak traffic

Autoscaler Action:
  09:25 - Start scaling up (proactive)
  09:30 - Continue scaling
  10:00 - Ready for peak (no lag)

Cost Optimization

Scale down during predictable low-traffic periods:

Forecasted Demand:
  23:00 - High traffic ends
  00:00 - Low traffic predicted

Autoscaler Action:
  23:00 - Begin gradual scale down
  00:00 - Minimal replicas (cost savings)
  07:00 - Proactive scale up for morning traffic

Event-Driven Scaling

Handle scheduled events:

Event: Product launch at 12:00
Historical: Similar launches caused 10x traffic

Autoscaler Action:
  11:50 - Start scaling to predicted capacity
  12:00 - Fully scaled for event
  12:30 - Gradual scale down as traffic normalizes

Benefits

Accuracy

  • Second-level predictions vs. minute-level in traditional autoscalers
  • MILP optimization for mathematically optimal scaling decisions
  • Historical pattern recognition for accurate forecasting

Performance

  • Proactive scaling reduces reaction lag
  • Smooth transitions with gradual scaling
  • Caching dampens scaling oscillations

Cost Efficiency

  • Minimize over-provisioning with accurate predictions
  • Optimize resource usage with MILP
  • Scheduled scale-down during predictable low-traffic periods

Reliability

  • Fallback mechanisms maintain service during optimizer outages
  • Cache redundancy ensures continuous operation
  • Constraint enforcement prevents under/over-scaling

Requirements

  • Nomad 1.4+ with the Nomad Autoscaler
  • Go 1.25+ (for building nomadscaler)
  • Python 3.12+ (for optimizer service)
  • Metrics source (Nomad APM, Prometheus, etc.)

Next Steps

  1. Read the Architecture overview
  2. Follow the Quick Start guide
  3. Configure Scaling Policies
  4. Deploy to Nomad