
Productify Autoscaler — Nomadscaler

Nomad target plugin for the Productify autoscaler. This component is implemented in Go and is intended to run as a container or a Nomad job.

Overview

The plugin:

  • Interfaces with Nomad Autoscaler
  • Queries the Optimizer service for predictions
  • Implements a per-second (second-granularity) prediction caching mechanism
  • Applies scaling decisions to Nomad jobs

Installation

Build from Source

bash
cd nomadscaler
./build.sh

The compiled binary will be in ./bin/nomadscaler.

Run Locally

bash
docker run --rm ghcr.io/productifyfw/nomadscaler:latest

Configuration

nomadscaler is configured via HCL. An example job that renders the config.hcl is available at:

  • nomadscaler/config/autoscaler.hcl

Strategy Configuration Parameters

The productify-scaler strategy plugin supports the following configuration parameters:

  • optimizer_token (required): Authentication token for the optimizer API
  • optimizer_url (required): URL of the optimizer service
  • min (required): Minimum number of replicas
  • max (required): Maximum number of replicas
  • metric_app_name (required): Name of the application for metrics
  • cache_size (optional, default: 10): Number of desired replica values to cache from optimizer response
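
Putting these parameters together, a strategy block might look like the following sketch. The URL, token, and values are placeholders, and the surrounding policy/check nesting follows the standard Nomad Autoscaler policy layout:

```hcl
policy {
  check "predictive" {
    strategy "productify-scaler" {
      optimizer_url   = "http://optimizer.example.internal:8080"
      optimizer_token = "example-token"
      min             = 1
      max             = 10
      metric_app_name = "example-app"
      cache_size      = 10
    }
  }
}
```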

Cache Behavior

The plugin uses a time-based cache with per-second granularity to keep scaling decisions accurate:

  • Optimizer predictions: The optimizer forecasts demand at per-second granularity and returns desired replica counts for the next N seconds (where N = cache_size)
  • Cache storage: All returned values are cached with a timestamp
  • Value selection: Each second corresponds to one cache position. The plugin selects the value matching elapsed seconds since cache update
  • Cache exhaustion: If more seconds have elapsed than the cache holds, the last cached value is reused
  • Refresh strategy: After using the cache 2 times, the plugin attempts a fresh optimizer request
  • Fallback: If the optimizer request fails, the plugin continues using cached values
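
The value-selection and exhaustion rules above can be sketched in Go. This is a hypothetical helper with illustrative names, not the plugin's actual code:

```go
package main

import "fmt"

// pickPrediction selects the cached replica count for the current moment.
// elapsed is the number of whole seconds since the cache was last refreshed;
// if it runs past the end of the cache, the last value is reused.
func pickPrediction(cache []int64, elapsed int) int64 {
	if elapsed >= len(cache) {
		elapsed = len(cache) - 1 // cache exhausted: reuse the last value
	}
	return cache[elapsed]
}

func main() {
	cache := []int64{3, 3, 4, 4, 5, 5, 6, 6, 7, 7}
	fmt.Println(pickPrediction(cache, 2))  // 4
	fmt.Println(pickPrediction(cache, 15)) // 7 (past the end, last value reused)
}
```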

Example with cache_size=10:

Time (s)  | 0  1  2  3  4  5  6  7  8  9  | 10+
Value     | 3  3  4  4  5  5  6  6  7  7  | 7 (last)
Request   | ^  -  -  F  -  -  F  -  -  -  | F
          | ^=Initial request, F=Fresh request, -=Use cache

This design ensures smooth scaling transitions while maintaining responsiveness to changing demand patterns. For example, at T=3s the plugin requests new predictions in the background, and at T=4s it serves prediction[3] = 4.
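
The refresh rule described above can be sketched in Go. The type and method names here are hypothetical, not the plugin's actual internals:

```go
package main

import "fmt"

// cacheState tracks how many times the current cache has been served.
// Once the cache has been used twice, the next read triggers an attempt
// at a fresh optimizer request.
type cacheState struct {
	uses int
}

// serve records one cache read and reports whether a background refresh
// should be attempted now.
func (c *cacheState) serve() bool {
	c.uses++
	if c.uses >= 2 {
		c.uses = 0 // reset the counter after scheduling a refresh
		return true
	}
	return false
}

func main() {
	c := &cacheState{}
	for t := 1; t <= 4; t++ {
		fmt.Printf("read %d: refresh=%v\n", t, c.serve())
	}
	// read 1: refresh=false
	// read 2: refresh=true
	// read 3: refresh=false
	// read 4: refresh=true
}
```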


Scaling Logic

go
func (p *Plugin) Scale(ctx context.Context) (int64, error) {
    // Get current metrics
    metrics := p.getMetrics()

    // Check cache validity
    if p.cacheValid() {
        return p.getCachedPrediction(), nil
    }

    // Request new predictions
    predictions, err := p.optimizer.Predict(metrics, p.cacheSize)
    if err != nil || len(predictions) == 0 {
        // Fall back to the last cached value if the optimizer call
        // fails or returns no predictions
        return p.getLastPrediction(), nil
    }

    // Update cache
    p.updateCache(predictions)

    return predictions[0], nil
}

Metrics Collection

Nomad APM

Use Nomad's built-in metrics:

hcl
check "cpu" {
  source = "nomad-apm"
  query  = "avg_cpu_percent"
}

check "memory" {
  source = "nomad-apm"
  query  = "avg_memory_percent"
}

Prometheus

Query Prometheus metrics:

hcl
check "cpu" {
  source = "prometheus"
  query  = "avg(rate(container_cpu_usage_seconds_total[5m]))"
}

Troubleshooting

Plugin Not Loading

Check:

  • Plugin binary is in correct directory
  • Binary has execute permissions
  • Nomad Autoscaler logs for errors

bash
tail -f /var/log/nomad-autoscaler.log | grep nomadscaler

Optimizer Unreachable

Verify:

  • Optimizer service is running
  • URL is correct and accessible
  • Network connectivity
  • Firewall rules

Test connection:

bash
curl http://{ip}:{optimizer_port}/health

No Scaling Actions

Debug:

  • Check if policy is enabled
  • Verify metrics are being collected
  • Review cooldown period
  • Check min/max constraints
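
As a reminder of where the cooldown and min/max constraints live, here is a hypothetical policy fragment following the standard Nomad Autoscaler scaling policy layout (names and values are illustrative):

```hcl
scaling "example" {
  enabled = true
  min     = 1
  max     = 10

  policy {
    cooldown            = "2m"
    evaluation_interval = "10s"
  }
}
```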

Enable debug logging:

hcl
log_level = "DEBUG"

See Also