Scaling Policies

Advanced scaling policy configuration and patterns for the Productify Autoscaler.

Policy Structure

Based on Nomad's autoscaling block with the productify-scaler strategy:

```hcl
scaling {
  enabled = true
  min     = <minimum instances>
  max     = <maximum instances>

  policy {
    evaluation_interval = "<how often to evaluate>"
    cooldown            = "<minimum time between actions>"

    check "<check name>" {
      source = "nomad-apm"  # or "prometheus"
      query  = "<metric query>"

      strategy "productify-scaler" {
        optimizer_url   = "<optimizer service URL>"
        min             = <min replicas>
        max             = <max replicas>
        metric_app_name = "<application name for metrics>"
        cache_size      = <seconds to cache predictions>  # optional, default: 10
      }
    }
  }
}
```

Strategy Configuration

Required Parameters

The productify-scaler strategy requires the following configuration:

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"  # Required
  min             = 0                        # Required
  max             = 10                       # Required
  metric_app_name = "my-app"                 # Required
}
```
| Parameter | Required | Default | Description |
|---|---|---|---|
| `optimizer_url` | Yes | - | URL of the optimizer service |
| `min` | Yes | - | Minimum number of replicas |
| `max` | Yes | - | Maximum number of replicas |
| `metric_app_name` | Yes | - | Application name for metrics queries |
| `cache_size` | No | 10 | Number of seconds to cache predictions |

Optional Parameters

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"
  min             = 1
  max             = 20
  metric_app_name = "web-app"
  cache_size      = 15  # Cache predictions for 15 seconds
}
```

Complete Example from Code

From autoscaler/nomadscaler/config/test.hcl:

```hcl
job "webapp" {
  datacenters = ["dc1"]

  group "demo" {
    count = 3

    network {
      port "webapp_http" {}
    }

    scaling {
      enabled = true
      min     = 0
      max     = 10

      policy {
        evaluation_interval = "3s"
        cooldown            = "10s"

        check "productify-scale-check" {
          source = "nomad-apm"
          query  = "avg_cpu-allocated"

          strategy "productify-scaler" {
            min             = 0
            max             = 10
            metric_app_name = "test"
            cache_size      = 10
          }
        }
      }
    }

    task "webapp" {
      driver = "docker"

      config {
        image = "hashicorp/demo-webapp-lb-guide"
        ports = ["webapp_http"]
      }

      resources {
        cpu    = 100
        memory = 16
      }
    }
  }
}
```

Note that this excerpt omits the required `optimizer_url` parameter; a real deployment must set it (see Required Parameters above).

Advanced Configuration

Cache Size Tuning

Adjust the number of seconds to cache predictions:

Longer cache (more stable):

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"
  min             = 2
  max             = 20
  metric_app_name = "my-app"
  cache_size      = 30  # Cache 30 seconds of predictions
}
```

Shorter cache (more responsive):

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"
  min             = 2
  max             = 20
  metric_app_name = "my-app"
  cache_size      = 5   # Cache 5 seconds of predictions
}
```

Predictive Scaling

The productify-scaler strategy automatically uses SARIMAX forecasting for predictive scaling:

  • Historical metrics are analyzed to identify patterns
  • Future demand is forecasted at second-level granularity
  • MILP optimization determines optimal scaling decisions
  • Predictions are cached for smooth scaling transitions

No additional configuration is needed; forecasting is built in.

Best Practices

Cooldown Periods

  • Fast-changing workloads: 30-60s
  • Stable workloads: 2-5 minutes
  • Database scaling: 5-15 minutes
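These are guideline values, not requirements. As a minimal fragment (values assumed, not taken from the repo), a stable workload might use:

```hcl
policy {
  cooldown = "3m"  # stable workload: 2-5 minutes between scaling actions
  # evaluation_interval and checks as in the policy structure above
}
```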

Min/Max Bounds

  • Min: Set to handle baseline load + small buffer
  • Max: Set to cost limit or infrastructure capacity
  • Gap: Leave sufficient headroom for scaling (keep min below half of max)
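A sketch of these bounds with hypothetical numbers (a baseline of roughly two instances, a cost ceiling of twenty):

```hcl
scaling {
  enabled = true
  min     = 3   # baseline load (~2 instances) plus a small buffer
  max     = 20  # cost limit / infrastructure capacity; min is well under half of max
}
```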

Evaluation Interval

  • Latency-sensitive: 2-30s
  • Throughput-focused: 30-60s
  • Cost-optimized: 60-120s
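For example, a cost-optimized service could evaluate less frequently (the value below is illustrative):

```hcl
policy {
  evaluation_interval = "90s"  # cost-optimized range: 60-120s
  # cooldown and checks as above
}
```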

Metric Selection

  • Single metric: Use one primary metric per scaling policy
  • Choose wisely: Select the metric that best represents service health
  • metric_app_name: Must match the application name in your metrics system
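As a hedged sketch of a single-metric check using the `prometheus` source, where the query expression and `web-app` job label are illustrative assumptions rather than values from this repo:

```hcl
check "latency-check" {
  source = "prometheus"
  query  = "avg(rate(http_request_duration_seconds_sum{job=\"web-app\"}[1m]))"

  strategy "productify-scaler" {
    optimizer_url   = "http://optimizer:8000"
    min             = 1
    max             = 10
    metric_app_name = "web-app"  # must match the app name in your metrics system
  }
}
```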

Troubleshooting

Over-Scaling

Symptoms:

  • Frequent scale-up beyond necessary
  • High costs

Solutions:

  • Increase target values
  • Lengthen cooldown
  • Increase scale-up cost in MILP
  • Review metric accuracy

Under-Scaling

Symptoms:

  • High latency/errors
  • Maxed-out instances

Solutions:

  • Decrease target values
  • Shorten evaluation interval
  • Increase max instances
  • Review MILP violation penalty

Oscillation

Symptoms:

  • Rapid up/down scaling
  • Unstable instance count

Solutions:

  • Increase cooldown period
  • Increase cache size
  • Smooth metrics (longer query windows)
  • Adjust SARIMAX parameters
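One way to damp flapping using only the knobs documented above (values are illustrative, assuming the `test.hcl` example as a starting point):

```hcl
policy {
  cooldown = "60s"  # raised from "10s": a longer cooldown suppresses rapid up/down cycles

  check "productify-scale-check" {
    source = "nomad-apm"
    query  = "avg_cpu-allocated"

    strategy "productify-scaler" {
      optimizer_url   = "http://optimizer:8000"
      min             = 2
      max             = 10
      metric_app_name = "my-app"
      cache_size      = 30  # a larger cache smooths successive predictions
    }
  }
}
```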

See Also