Scaling Policies

Advanced scaling policy configuration and patterns for the Productify Autoscaler.

Policy Structure

Based on Nomad's autoscaling block with the productify-scaler strategy:

```hcl
scaling {
  enabled = true
  min     = <minimum instances>
  max     = <maximum instances>

  policy {
    evaluation_interval = "<how often to evaluate>"
    cooldown            = "<minimum time between actions>"

    check "<check name>" {
      source = "nomad-apm"  # or "prometheus"
      query  = "<metric query>"

      strategy "productify-scaler" {
        optimizer_url   = "<optimizer service URL>"
        min             = <min replicas>
        max             = <max replicas>
        metric_app_name = "<application name for metrics>"
        cache_size      = <seconds to cache predictions>  # optional, default: 10
      }
    }
  }
}
```

Strategy Configuration

Required Parameters

The productify-scaler strategy requires the following configuration:

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"  # Required
  min             = 0                        # Required
  max             = 10                       # Required
  metric_app_name = "my-app"                 # Required
}
```
| Parameter | Required | Default | Description |
|---|---|---|---|
| `optimizer_url` | Yes | - | URL of the optimizer service |
| `min` | Yes | - | Minimum number of replicas |
| `max` | Yes | - | Maximum number of replicas |
| `metric_app_name` | Yes | - | Application name for metrics queries |
| `cache_size` | No | 10 | Number of seconds to cache predictions |

Optional Parameters

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"
  min             = 1
  max             = 20
  metric_app_name = "web-app"
  cache_size      = 15  # Cache predictions for 15 seconds
}
```

Complete Example from Code

From autoscaler/nomadscaler/config/test.hcl:

```hcl
job "webapp" {
  datacenters = ["dc1"]

  group "demo" {
    count = 3

    network {
      port "webapp_http" {}
    }

    scaling {
      enabled = true
      min     = 0
      max     = 10

      policy {
        evaluation_interval = "3s"
        cooldown            = "10s"

        check "productify-scale-check" {
          source = "nomad-apm"
          query  = "avg_cpu-allocated"

          strategy "productify-scaler" {
            min             = 0
            max             = 10
            metric_app_name = "test"
            cache_size      = 10
          }
        }
      }
    }

    task "webapp" {
      driver = "docker"

      config {
        image = "hashicorp/demo-webapp-lb-guide"
        ports = ["webapp_http"]
      }

      resources {
        cpu    = 100
        memory = 16
      }
    }
  }
}
```

Note that this excerpt omits the required `optimizer_url` parameter; a real deployment must set it (see Required Parameters above).

Advanced Configuration

Cache Size Tuning

Adjust the number of seconds to cache predictions:

Longer cache (more stable):

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"
  min             = 2
  max             = 20
  metric_app_name = "my-app"
  cache_size      = 30  # Cache 30 seconds of predictions
}
```

Shorter cache (more responsive):

```hcl
strategy "productify-scaler" {
  optimizer_url   = "http://optimizer:8000"
  min             = 2
  max             = 20
  metric_app_name = "my-app"
  cache_size      = 5   # Cache 5 seconds of predictions
}
```

Predictive Scaling

The productify-scaler strategy automatically uses SARIMAX forecasting for predictive scaling:

  • Historical metrics are analyzed to identify patterns
  • Future demand is forecasted at second-level granularity
  • MILP optimization determines optimal scaling decisions
  • Predictions are cached for smooth scaling transitions

No additional configuration is needed; forecasting is built in.

Best Practices

Cooldown Periods

  • Fast-changing workloads: 30-60s
  • Stable workloads: 2-5 minutes
  • Database scaling: 5-15 minutes
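These are guideline values, not requirements. As a minimal fragment (values assumed, not taken from the repo), a stable workload might use:

```hcl
policy {
  cooldown = "3m"  # stable workload: 2-5 minutes between scaling actions
  # evaluation_interval and checks as in the policy structure above
}
```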

Min/Max Bounds

  • Min: Set to handle baseline load + small buffer
  • Max: Set to cost limit or infrastructure capacity
  • Gap: Leave sufficient headroom for scaling (keep min below half of max)
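A sketch of these bounds with hypothetical numbers (a baseline of roughly two instances, a cost ceiling of twenty):

```hcl
scaling {
  enabled = true
  min     = 3   # baseline load (~2 instances) plus a small buffer
  max     = 20  # cost limit / infrastructure capacity; min is well under half of max
}
```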

Evaluation Interval

  • Latency-sensitive: 2-30s
  • Throughput-focused: 30-60s
  • Cost-optimized: 60-120s
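For example, a cost-optimized service could evaluate less frequently (the value below is illustrative):

```hcl
policy {
  evaluation_interval = "90s"  # cost-optimized range: 60-120s
  # cooldown and checks as above
}
```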

Metric Selection

  • Single metric: Use one primary metric per scaling policy
  • Choose wisely: Select the metric that best represents service health
  • metric_app_name: Must match the application name in your metrics system
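As a hedged sketch of a single-metric check using the `prometheus` source, where the query expression and `web-app` job label are illustrative assumptions rather than values from this repo:

```hcl
check "latency-check" {
  source = "prometheus"
  query  = "avg(rate(http_request_duration_seconds_sum{job=\"web-app\"}[1m]))"

  strategy "productify-scaler" {
    optimizer_url   = "http://optimizer:8000"
    min             = 1
    max             = 10
    metric_app_name = "web-app"  # must match the app name in your metrics system
  }
}
```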

Troubleshooting

Over-Scaling

Symptoms:

  • Frequent scale-up beyond necessary
  • High costs

Solutions:

  • Increase target values
  • Lengthen cooldown
  • Increase scale-up cost in MILP
  • Review metric accuracy

Under-Scaling

Symptoms:

  • High latency/errors
  • Maxed-out instances

Solutions:

  • Decrease target values
  • Shorten evaluation interval
  • Increase max instances
  • Review MILP violation penalty

Oscillation

Symptoms:

  • Rapid up/down scaling
  • Unstable instance count

Solutions:

  • Increase cooldown period
  • Increase cache size
  • Smooth metrics (longer query windows)
  • Adjust SARIMAX parameters
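One way to damp flapping using only the knobs documented above (values are illustrative, assuming the `test.hcl` example as a starting point):

```hcl
policy {
  cooldown = "60s"  # raised from "10s": a longer cooldown suppresses rapid up/down cycles

  check "productify-scale-check" {
    source = "nomad-apm"
    query  = "avg_cpu-allocated"

    strategy "productify-scaler" {
      optimizer_url   = "http://optimizer:8000"
      min             = 2
      max             = 10
      metric_app_name = "my-app"
      cache_size      = 30  # a larger cache smooths successive predictions
    }
  }
}
```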

See Also