Productify Autoscaler — Optimizer
A small optimizer service used by Productify's autoscaler. It provides calculation utilities and an HTTP endpoint for running optimizations.
The optimizer uses second-level granularity for predictions, returning a list of desired replica counts for the next N seconds (configurable via `cache_size`; default: 10 seconds).
Minimal requirements: Python 3.12+ and Poetry
Overview
The Optimizer:
- Forecasts future resource demand using SARIMAX
- Optimizes scaling decisions with MILP
- Returns second-level predictions for caching
- Exposes REST API for integration
Installation
Via Poetry
Install Poetry (if you don't have it):
Follow the instructions at https://python-poetry.org/docs/
Install project dependencies:
```bash
cd optimizer
poetry install
```

Usage
Run the example calculation (quick check):
```bash
poetry run testcalc
```

Start the HTTP endpoint (starts a small web service exposing the optimizer):

```bash
poetry run web
```

Via Docker
There is a Dockerfile in this folder. To build and run the image locally:
```bash
cd optimizer
docker build -t ghcr.io/productifyfw/optimizer:latest .

# Run the optimizer
docker run --rm -p 8015:8015 ghcr.io/productifyfw/optimizer:latest
```

Adjust the published ports to match the service configuration.
Configuration
config.ini
The optimizer reads configuration from a config.ini file in the project root. Below is an example config.ini and a short explanation of each value. Do not commit secrets — keep tokens and credentials out of version control.
Example config.ini:
```ini
[main]
loglevel=debug
api_loglevel=warning
only_test_data=true
token=SUPER_SECRET_TOKEN
```

Keys:

- `loglevel` — Controls logging verbosity for the optimizer. Typical values: `debug`, `info`, `warning`, `error`.
- `api_loglevel` — Controls logging for API/request handling.
- `only_test_data` — Set to `true` to force the optimizer to use bundled test/sample data instead of real inputs (useful for local testing).
- `token` — API token used to authenticate requests. Treat this as a secret; prefer injecting it via a secrets manager or environment variable in production.
Usage notes:
- Edit `config.ini` before starting the service.
- If your deployment platform supports secrets or environment variables, use those instead of storing tokens in plaintext.
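As a minimal sketch of how such a config could be loaded with an environment-variable override for the token (a hypothetical helper using Python's standard `configparser`, not the optimizer's actual code; the `OPTIMIZER_TOKEN` variable name is an assumption):

```python
import configparser
import os

def load_config(path="config.ini"):
    """Load optimizer settings; environment variables override secrets
    so the token need not live in the file (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read(path)
    main = parser["main"] if parser.has_section("main") else {}
    return {
        "loglevel": main.get("loglevel", "info"),
        "api_loglevel": main.get("api_loglevel", "warning"),
        "only_test_data": main.get("only_test_data", "false").lower() == "true",
        # Prefer OPTIMIZER_TOKEN from the environment over the plaintext file.
        "token": os.environ.get("OPTIMIZER_TOKEN", main.get("token", "")),
    }
```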
API
POST /optimize
The main optimization endpoint that returns desired replica counts.
Request Body:
```json
{
  "token": "SUPER_SECRET_TOKEN",
  "check": {
    "name": "scaling-check",
    "source": "nomad-apm",
    "group": "webapp",
    "metric_app_name": "my-app"
  },
  "current_replicas": 3,
  "min_replicas": 1,
  "max_replicas": 10,
  "cache_size": 10
}
```

Parameters:

- `cache_size` (optional, default: 10): Number of seconds to predict ahead. The optimizer will return this many values in the `desired` array.
Response:
```json
{
  "desired": [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
}
```

The response contains a list of desired replica counts, one for each second in the prediction horizon. The nomadscaler plugin caches these values and serves the appropriate value based on elapsed time.
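One way a consumer of this response might pick the value for the current second is sketched below (illustrative only; this is not the nomadscaler plugin's actual code):

```python
def desired_at(desired, elapsed_seconds):
    """Pick the replica count for the time elapsed since the prediction
    was made, clamping to the last value once the horizon is exceeded
    (illustrative sketch)."""
    if not desired:
        raise ValueError("empty prediction")
    index = min(int(elapsed_seconds), len(desired) - 1)
    return desired[index]

# 2.5 seconds after the prediction above, serve desired[2] == 4.
desired_at([3, 3, 4, 4, 5, 5, 6, 6, 7, 7], 2.5)
```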
GET /health
Health check endpoint.
GET /metrics
Prometheus-compatible metrics, available only if the `enable_test_metrics` setting is enabled.
SARIMAX Forecasting
Automatic Order Selection
The optimizer automatically selects the best SARIMAX model from multiple candidate orders:
```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

candidate_orders = [(1, 1, 1), (1, 0, 1), (2, 1, 1), (1, 1, 0)]
best_result = None
best_aic = float("inf")

for order in candidate_orders:
    try:
        model = SARIMAX(
            y,        # historical request rate
            exog=X,   # exogenous variables
            order=order,
            enforce_stationarity=False,
            enforce_invertibility=False,
        )
        res = model.fit(disp=False)
        if res.aic < best_aic:
            best_aic = res.aic
            best_result = res
    except Exception:
        continue

# Use the best model for forecasting
forecast = best_result.get_forecast(
    steps=forecast_horizon_seconds,
    exog=exog_forecast,
)
```

Exogenous Variables
The model uses the following exogenous variables:
- `avg_response_time`: Average response time per request
- `authentication_awaiting_users`: Number of users waiting for authentication
- `queue_waiting`: Queue waiting time
- `avg_processing_time`: Average processing time per request
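SARIMAX expects these variables as one row per timestep. A hypothetical helper for assembling that matrix from per-second metric samples might look like this (the field names come from the list above; defaulting missing values to 0.0 is an assumption):

```python
EXOG_FIELDS = [
    "avg_response_time",
    "authentication_awaiting_users",
    "queue_waiting",
    "avg_processing_time",
]

def build_exog(samples):
    """Turn a list of per-second metric dicts into the row-per-timestep
    matrix SARIMAX expects as `exog` (hypothetical helper; missing
    values default to 0.0)."""
    return [[float(s.get(field, 0.0)) for field in EXOG_FIELDS]
            for s in samples]
```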
Demand Calculation
Forecasted request rate is converted to demand:
```
demand[t] = forecast_requests[t] * avg_response_time * 0.05 +
            queue_waiting * 0.1 +
            authentication_awaiting_users * 0.2
```

MILP Optimization
OR-Tools SCIP Solver
The optimizer uses OR-Tools with the SCIP solver for Mixed-Integer Linear Programming:
```python
from ortools.linear_solver import pywraplp

solver = pywraplp.Solver.CreateSolver("SCIP")
```

Objective Function
Minimize total cost including replica costs, SLA penalties, and scaling costs:
```python
total_cost = solver.Sum(
    replica_cost * x[t] +        # Cost of running replicas
    penalty * sla[t] +           # SLA violation penalty
    startup_cost * up[t] +       # Cost to start instances
    shutdown_cost * down[t] +    # Cost to stop instances
    extra_penalty * no_rep[t]    # Penalty for no replicas when needed
    for t in range(T)
)
solver.Minimize(total_cost)
```

Decision Variables
```
# Integer variables
x[t]      = Number of replicas at time t (min_replicas to max_replicas)
up[t]     = Number of instances to scale up (0 to max_scale_up, default 2)
down[t]   = Number of instances to scale down (0 to max_scale_down, default 2)
no_rep[t] = Binary indicator: 1 if no replicas are running, 0 otherwise

# Continuous variables
sla[t]    = SLA shortfall/unmet demand at time t (>= 0)
```

Constraints
```python
# 1. Meet predicted demand (with SLA slack)
solver.Add(x[t] * capacity_per_replica + sla[t] >= demand[t])

# 2. Replica count evolution
solver.Add(x[t] == x[t-1] + up[t] - down[t])

# 3. Scale-down limit (can't stop more than you have)
solver.Add(down[t] <= x[t-1])

# 4. Initial state
solver.Add(x[0] == initial_replicas)
```

Parameters
Default values:
- `capacity_per_replica`: Based on allocated CPU/memory
- `replica_cost`: Based on CPU/memory pricing (configurable weights)
- `penalty`: 1.0 (SLA violation cost)
- `startup_cost`: 0.5
- `shutdown_cost`: 0.3
- `max_scale_up`: 2 (instances per time step)
- `max_scale_down`: 2 (instances per time step)
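To make the objective and constraints concrete, here is a pure-Python sketch that evaluates the objective for a fixed replica schedule, deriving `up`, `down`, `sla`, and `no_rep` from the constraints above. This is not the solver itself; `replica_cost=1.0` and `extra_penalty=5.0` are assumed values, not documented defaults:

```python
def plan_cost(x, demand, capacity_per_replica,
              replica_cost=1.0, penalty=1.0,
              startup_cost=0.5, shutdown_cost=0.3,
              extra_penalty=5.0, initial_replicas=None):
    """Evaluate the MILP objective for a fixed replica schedule `x`
    (pure-Python sketch; replica_cost and extra_penalty are assumed)."""
    total = 0.0
    prev = x[0] if initial_replicas is None else initial_replicas
    for t, replicas in enumerate(x):
        delta = replicas - prev
        up, down = max(delta, 0), max(-delta, 0)
        # SLA shortfall: demand not covered by running capacity.
        sla = max(demand[t] - replicas * capacity_per_replica, 0.0)
        no_rep = 1 if replicas == 0 and demand[t] > 0 else 0
        total += (replica_cost * replicas + penalty * sla +
                  startup_cost * up + shutdown_cost * down +
                  extra_penalty * no_rep)
        prev = replicas
    return total
```

For example, scaling from 2 to 3 replicas over three timesteps with 5 units of capacity per replica costs the running replicas plus one startup.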
Troubleshooting
Slow Predictions
Optimize:
- Reduce forecast horizon
- Simplify SARIMAX parameters
- Increase solver time limit
- Add more CPU resources
Inaccurate Forecasts
Improve:
- Provide more historical metrics
- Tune SARIMAX parameters
- Adjust seasonal period
- Filter metric outliers
Memory Issues
Solutions:
- Limit metric history size
- Reduce forecast horizon
- Implement metric sampling
- Increase memory limits
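For instance, one simple way to cap metric history size is a bounded `collections.deque` (an illustrative sketch; the optimizer's actual storage may differ, and the one-hour limit is an assumption):

```python
from collections import deque

# Keep at most the last hour of per-second samples; the oldest entries
# are dropped automatically once the deque is full.
metric_history = deque(maxlen=3600)

def record_sample(sample):
    """Append a metric sample, discarding the oldest once 3600 are stored."""
    metric_history.append(sample)
```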
Monitoring
Logging
```python
import logging

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
```

Development
If you plan on developing the optimizer library:
- Use `poetry install` to install dev dependencies.
- Run the web service locally with `poetry run web`.
- Run unit tests frequently and add tests for new behaviour.
Testing
Run the project's test suite with:
```bash
poetry run test
```