# Scaling Policies

Advanced scaling policy configuration and patterns for the Productify Autoscaler.

## Policy Structure

Based on Nomad's `scaling` block with the `productify-scaler` strategy:

```hcl
scaling {
  enabled = true
  min     = <minimum instances>
  max     = <maximum instances>

  policy {
    evaluation_interval = "<how often to evaluate>"
    cooldown            = "<minimum time between actions>"

    check "<check name>" {
      source = "nomad-apm" # or "prometheus"
      query  = "<metric query>"

      strategy "productify-scaler" {
        optimizer_url   = "<optimizer service URL>"
        min             = <min replicas>
        max             = <max replicas>
        metric_app_name = "<application name for metrics>"
        cache_size      = <seconds to cache predictions> # optional, default: 10
      }
    }
  }
}
```

## Strategy Configuration

### Required Parameters

The `productify-scaler` strategy requires the following configuration:

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000" # Required
  min             = 0                       # Required
  max             = 10                      # Required
  metric_app_name = "my-app"                # Required
}
```

| Parameter | Required | Default | Description |
|---|---|---|---|
| `optimizer_url` | Yes | - | URL of the optimizer service |
| `min` | Yes | - | Minimum number of replicas |
| `max` | Yes | - | Maximum number of replicas |
| `metric_app_name` | Yes | - | Application name for metrics queries |
| `cache_size` | No | 10 | Number of seconds to cache predictions |

### Optional Parameters

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"
  min             = 1
  max             = 20
  metric_app_name = "web-app"
  cache_size      = 15 # Cache predictions for 15 seconds
}
```

## Complete Example from Code

From `autoscaler/nomadscaler/config/test.hcl`:

```hcl
job "webapp" {
  datacenters = ["dc1"]

  group "demo" {
    count = 3

    network {
      port "webapp_http" {}
    }

    scaling {
      enabled = true
      min     = 0
      max     = 10

      policy {
        evaluation_interval = "3s"
        cooldown            = "10s"

        check "productify-scale-check" {
          source = "nomad-apm"
          query  = "avg_cpu-allocated"

          strategy "productify-scaler" {
            min             = 0
            max             = 10
            metric_app_name = "test"
            cache_size      = 10
          }
        }
      }
    }

    task "webapp" {
      driver = "docker"

      config {
        image = "hashicorp/demo-webapp-lb-guide"
        ports = ["webapp_http"]
      }

      resources {
        cpu    = 100
        memory = 16
      }
    }
  }
}
```

## Advanced Configuration

### Cache Size Tuning

Adjust the number of seconds predictions are cached:

**Longer cache (more stable):**

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"
  min             = 2
  max             = 20
  metric_app_name = "my-app"
  cache_size      = 30 # Cache 30 seconds of predictions
}
```

**Shorter cache (more responsive):**

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"
  min             = 2
  max             = 20
  metric_app_name = "my-app"
  cache_size      = 5 # Cache 5 seconds of predictions
}
```

### Predictive Scaling

The `productify-scaler` strategy automatically uses SARIMAX forecasting for predictive scaling:

1. Historical metrics are analyzed to identify patterns
2. Future demand is forecasted at second-level granularity
3. MILP optimization determines optimal scaling decisions
4. Predictions are cached for smooth scaling transitions

No additional configuration is needed; forecasting is built-in.
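
The prediction cache also interacts with the evaluation interval. As a rough sketch (assuming the strategy simply reuses its last forecast for `cache_size` seconds), the illustrative settings below would serve about three consecutive evaluations from a single cached forecast before querying the optimizer again:

```hcl
policy {
  evaluation_interval = "3s" # evaluate every 3 seconds

  check "forecast-check" {
    source = "nomad-apm"
    query  = "avg_cpu-allocated"

    strategy "productify-scaler" {
      optimizer_url   = "http://optimizer:8000"
      min             = 1
      max             = 10
      metric_app_name = "my-app"
      cache_size      = 10 # roughly 3 evaluations (10s / 3s) per cached forecast
    }
  }
}
```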

## Best Practices

### Cooldown Periods

- **Fast-changing workloads:** 30-60s
- **Stable workloads:** 2-5 minutes
- **Database scaling:** 5-15 minutes
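
For example, a stable workload might pair a moderate evaluation interval with a multi-minute cooldown (illustrative values):

```hcl
policy {
  evaluation_interval = "30s"
  cooldown            = "3m" # stable workload: within the 2-5 minute range

  # check and strategy blocks as shown above
}
```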

### Min/Max Bounds

- **Min:** Set to handle baseline load + small buffer
- **Max:** Set to cost limit or infrastructure capacity
- **Gap:** Ensure sufficient room for scaling (`min < max * 0.5`)
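
For instance, a service with a baseline of about three instances and capacity for twenty might use these bounds (illustrative values that keep `min` well under `max * 0.5`):

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"
  min             = 4  # baseline load plus a small buffer
  max             = 20 # cost limit or infrastructure capacity
  metric_app_name = "my-app"
}
```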

### Evaluation Interval

- **Latency-sensitive:** 2-30s
- **Throughput-focused:** 30-60s
- **Cost-optimized:** 60-120s
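
A latency-sensitive service, for example, might use a short interval so demand spikes are noticed quickly (illustrative values):

```hcl
policy {
  evaluation_interval = "5s" # latency-sensitive: within the 2-30s range
  cooldown            = "30s"

  # check and strategy blocks as shown above
}
```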

### Metric Selection

- **Single metric:** Use one primary metric per scaling policy
- **Choose wisely:** Select the metric that best represents service health
- **`metric_app_name`:** Must match the application name in your metrics system
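
The check below (hypothetical names and query) ties a single primary metric to the application name used by the metrics system:

```hcl
check "request-rate" {
  source = "prometheus"
  query  = "sum(rate(http_requests_total{app=\"checkout\"}[1m]))"

  strategy "productify-scaler" {
    optimizer_url   = "http://optimizer:8000"
    min             = 2
    max             = 20
    metric_app_name = "checkout" # must match the app name in the metrics system
  }
}
```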

## Troubleshooting

### Over-Scaling

**Symptoms:**

- Frequent scale-ups beyond what the load requires
- High costs

**Solutions:**

- Increase target values
- Lengthen the cooldown
- Increase the scale-up cost in the MILP
- Review metric accuracy
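
Lengthening the cooldown is often the first lever to try (illustrative values):

```hcl
policy {
  evaluation_interval = "10s"
  cooldown            = "2m" # lengthened to damp unnecessary scale-ups

  # check and strategy blocks as shown above
}
```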

### Under-Scaling

**Symptoms:**

- High latency/errors
- Maxed-out instances

**Solutions:**

- Decrease target values
- Shorten the evaluation interval
- Increase max instances
- Review the MILP violation penalty
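
A shorter evaluation interval combined with a higher ceiling gives the autoscaler room to catch up with demand (illustrative values):

```hcl
policy {
  evaluation_interval = "5s" # shortened so demand spikes are detected sooner
  cooldown            = "30s"

  check "cpu" {
    source = "nomad-apm"
    query  = "avg_cpu-allocated"

    strategy "productify-scaler" {
      optimizer_url   = "http://optimizer:8000"
      min             = 2
      max             = 30 # raised ceiling for peak load
      metric_app_name = "my-app"
    }
  }
}
```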

### Oscillation

**Symptoms:**

- Rapid up/down scaling
- Unstable instance count

**Solutions:**

- Increase the cooldown period
- Increase the cache size
- Smooth metrics (use longer query windows)
- Adjust SARIMAX parameters
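
A longer cooldown combined with a larger prediction cache usually damps oscillation (illustrative values):

```hcl
policy {
  evaluation_interval = "10s"
  cooldown            = "2m" # suppresses rapid flip-flopping

  check "cpu" {
    source = "nomad-apm"
    query  = "avg_cpu-allocated"

    strategy "productify-scaler" {
      optimizer_url   = "http://optimizer:8000"
      min             = 2
      max             = 20
      metric_app_name = "my-app"
      cache_size      = 30 # larger cache smooths successive decisions
    }
  }
}
```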