Deployment Architecture
Complete architecture overview for deploying the Productify Framework.
System Architecture
┌─────────────────┐
│ Internet │
└────────┬────────┘
│
┌────────▼────────┐
│ Productify │
│ Proxy (Caddy) │
│ - TLS │
│ - Routing │
│ - Load Balance │
└────────┬────────┘
│
┌────────────────┼────────────────┐
│ │ │
┌───────▼──────┐ ┌──────▼──────┐ ┌─────▼────────┐
│ Manager │ │ Manager │ │ Manager │
│ API (1) │ │ API (2) │ │ API (N) │
│ Port 8080 │ │ Port 8080 │ │ Port 8080 │
└───────┬──────┘ └──────┬──────┘ └─────┬────────┘
└────────────────┼────────────────┘
│
┌────────▼────────┐
│ Manager │
│ Executor │
│ (Single) │
│ - Triggers │
└────────┬────────┘
│
┌────────▼────────┐
│ PostgreSQL │
│ Database │
│ (HA Cluster) │
└─────────────────┘
┌───────────────────────────────────────────────────────┐
│ Nomad Cluster │
│ ┌──────────────┐ ┌─────────────────────┐ │
│ │ Autoscaler │◄──────►│ Optimizer Service │ │
│ │ Plugin │ HTTP │ (Python/MILP) │ │
│ └──────────────┘ └─────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Application Workloads │ │
│ │ (Auto-scaled by Nomadscaler) │ │
│ └─────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────┘
Component Overview
Manager
API Instances (Stateless)
- Handle GraphQL and REST requests
- Horizontally scalable
- Load-balanced by Proxy
- No persistent state
Executor Instance (Stateful)
- Runs trigger execution loop
- Single instance (no scaling)
- Manages cron schedules
- Sends backend callbacks
Database
- PostgreSQL cluster
- Primary + standby replicas
- Connection pooling via PgBouncer
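As a concrete example, pooling in front of the cluster could look like the following PgBouncer excerpt. This is a sketch: the database name, hostnames, and pool sizes are illustrative, not shipped defaults.

```ini
; pgbouncer.ini — illustrative pooling setup
[databases]
; Route all connections to the current primary (e.g. resolved via Patroni)
productify = host=pg-primary.internal port=5432 dbname=productify

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
; Transaction pooling keeps the server-side connection count small even
; with many stateless Manager API instances connecting
pool_mode = transaction
default_pool_size = 20
max_client_conn = 500
```

Transaction pooling is the usual choice here because the API instances are stateless and hold no session-level PostgreSQL state.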
Proxy
Caddy Server
- TLS termination
- Service discovery (Nomad)
- Load balancing
- HTTP/2 support
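A minimal Caddyfile sketch of this setup follows. The domain and upstream addresses are placeholders; in a real deployment the upstreams would be populated from Nomad service discovery rather than hard-coded.

```caddyfile
# Illustrative Caddyfile — TLS certificates are obtained automatically
productify.example.com {
	# Load-balance across the stateless Manager API instances
	reverse_proxy manager-api-1:8080 manager-api-2:8080 manager-api-3:8080 {
		lb_policy round_robin
		# Active health checks; the /healthz path is an assumption
		health_uri /healthz
		health_interval 10s
	}
}
```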
Autoscaler
Nomadscaler Plugin
- Integrates with Nomad Autoscaler
- Queries Optimizer for predictions
- Second-level caching
Optimizer Service
- SARIMAX forecasting
- MILP optimization
- Stateless, horizontally scalable
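The Autoscaler/Optimizer interaction can be sketched as a Nomad scaling policy. The `nomadscaler` plugin name and the query format are assumptions based on the description above, not a documented interface.

```hcl
# Illustrative scaling policy attached to a workload group
scaling {
  min = 1
  max = 20

  policy {
    cooldown            = "1m"
    evaluation_interval = "30s"

    check "predicted_load" {
      # The plugin queries the Optimizer's HTTP API and caches responses
      source = "nomadscaler"
      query  = "predicted_instance_count"

      # Use the Optimizer's MILP result directly as the target count
      strategy "pass-through" {}
    }
  }
}
```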
Network Architecture
┌─────────── DMZ ───────────┐
│ │
│ ┌──────────────────────┐ │
│ │ Proxy (Public) │ │
│ │ Ports: 80, 443 │ │
│ └──────────┬───────────┘ │
└─────────────┼─────────────┘
│
┌─────────────▼─────────────┐
│ Application Network │
│ (Private) │
│ │
│ ┌──────────────────────┐ │
│ │ Manager API │ │
│ │ Port: 8080 │ │
│ └──────────────────────┘ │
│ │
│ ┌──────────────────────┐ │
│ │ Manager Executor │ │
│ │ Port: 8080 │ │
│ └──────────────────────┘ │
│ │
│ ┌──────────────────────┐ │
│ │ Optimizer │ │
│ │ Port: 8000 │ │
│ └──────────────────────┘ │
└───────────────────────────┘
│
┌─────────────▼─────────────┐
│ Data Network │
│ (Private) │
│ │
│ ┌──────────────────────┐ │
│ │ PostgreSQL │ │
│ │ Port: 5432 │ │
│ └──────────────────────┘ │
│ │
│ ┌──────────────────────┐ │
│ │ PgBouncer │ │
│ │ Port: 6432 │ │
│ └──────────────────────┘ │
└───────────────────────────┘
Deployment Models
Single Node (Development)
┌─────────────────────────────┐
│ Single Server │
│ ┌────────────────────────┐ │
│ │ Docker Compose │ │
│ │ - Proxy │ │
│ │ - Manager │ │
│ │ - PostgreSQL │ │
│ │ - Optimizer │ │
│ └────────────────────────┘ │
└─────────────────────────────┘
Multi-Node (Production)
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Proxy │ │ Proxy │ │ Proxy │
│ (LB) │ │ (LB) │ │ (LB) │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
└─────────────────┼─────────────────┘
│
┌─────────────────┼─────────────────┐
│ │ │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ Manager │ │ Manager │ │ Manager │
│ API │ │ API │ │ API │
└─────────────┘ └─────────────┘ └─────────────┘
┌─────────────┐
│ Manager │
│ Executor │
└──────┬──────┘
│
┌──────▼──────────────────────┐
│ PostgreSQL Cluster │
│ Primary + Replicas │
└─────────────────────────────┘
Nomad Cluster
┌───────────────────────────────────────┐
│ Nomad Cluster │
│ │
│ ┌─────────────────────────────────┐ │
│ │ System Jobs │ │
│ │ - Proxy (on all nodes) │ │
│ └─────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────┐ │
│ │ Service Jobs │ │
│ │ - Manager API (3 instances) │ │
│ │ - Manager Executor (1 instance) │ │
│ │ - Optimizer (2 instances) │ │
│ └──────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────┐ │
│ │ Batch Jobs │ │
│ │ - Application workloads │ │
│ │ - Auto-scaled │ │
│ └──────────────────────────────────┘ │
└───────────────────────────────────────┘
Scaling Strategy
Manager API
Horizontal Scaling:
- Add instances based on load
- Stateless, can scale freely
- Load balanced by Proxy
- Typical: 2-10 instances
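A Nomad group for the API might look like the following sketch. The image name, port mapping, and health endpoint are assumptions for illustration.

```hcl
group "manager-api" {
  count = 3  # anywhere from 2 to 10 depending on load

  network {
    port "http" {
      to = 8080
    }
  }

  service {
    name = "manager-api"
    port = "http"

    check {
      type     = "http"
      path     = "/healthz"
      interval = "10s"
      timeout  = "2s"
    }
  }

  task "api" {
    driver = "docker"
    config {
      image = "productify/manager:latest"  # image name is a placeholder
      ports = ["http"]
    }
  }
}
```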
Manager Executor
No Scaling:
- Single instance only
- Stateful (runs cron loop)
- Leader election NOT implemented
- High availability via quick restart
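In Nomad terms, "high availability via quick restart" can be expressed with restart and reschedule settings like these (values are illustrative):

```hcl
group "manager-executor" {
  count = 1  # exactly one instance; no leader election

  restart {
    attempts = 3
    interval = "5m"
    delay    = "10s"   # well under the ~30s recovery target
    mode     = "delay" # keep retrying rather than failing the allocation
  }

  reschedule {
    delay          = "10s"
    delay_function = "constant"
    unlimited      = true      # always bring the executor back somewhere
  }
}
```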
Optimizer
Horizontal Scaling:
- Stateless service
- Can scale freely
- Each instance independent
- Typical: 2-5 instances
Database
Vertical + Replication:
- Scale up primary for write performance
- Add read replicas for read scaling
- Use connection pooling (PgBouncer)
High Availability
Manager API
- Multiple instances - 3+ spread across availability zones
- Health checks - Proxy removes unhealthy instances
- Graceful shutdown - Connection draining
Manager Executor
- Fast restart - Restart on failure (< 30s)
- No data loss - Trigger state in database
- Missed executions - Catch up on restart
Database
- Streaming replication - Primary + 2 standby replicas
- Automatic failover - Via Patroni or similar
- Point-in-time recovery - WAL archiving
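An illustrative Patroni excerpt tying these together; the cluster name and WAL-archiving tool are assumptions, not requirements.

```yaml
# patroni.yml excerpt (sketch)
scope: productify-pg
bootstrap:
  dcs:
    ttl: 30            # leader lease; failover begins when it lapses
    loop_wait: 10
    retry_timeout: 10
postgresql:
  parameters:
    archive_mode: "on"
    # WAL archiving enables point-in-time recovery; tool choice varies
    archive_command: "wal-g wal-push %p"
```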
Proxy
- Multiple instances - DNS round-robin or L4 load balancer
- Health checks - Remove failed instances
- Certificate replication - Shared cert storage
Deployment Strategies
Blue-Green Deployment
Zero-downtime deployment by running two identical environments:
- Deploy green (new version) alongside blue (current)
- Test green in isolated environment
- Switch traffic from blue to green
- Keep blue running for quick rollback
- Decommission blue after verification
Canary Deployment
Gradual rollout to minimize risk:
- Deploy canary (new version) with small traffic percentage (5-10%)
- Monitor metrics (errors, latency, resource usage)
- Gradually increase canary traffic (25% → 50% → 100%)
- Full rollout if metrics are acceptable
- Rollback immediately if issues detected
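If the Proxy handles the traffic split, a Caddyfile sketch could look like this. The upstream names are placeholders, and the weighted policy requires a recent Caddy release.

```caddyfile
productify.example.com {
	reverse_proxy manager-api-stable:8080 manager-api-canary:8080 {
		# ~10% of requests go to the canary; shift the weights
		# (9 1 → 3 1 → 1 1) as metrics stay healthy
		lb_policy weighted_round_robin 9 1
	}
}
```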
Rolling Update
Update instances sequentially:
- Update instance 1, wait for health check
- Update instance 2, wait for health check
- Continue until all instances updated
- Automatic rollback on health check failure
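The steps above map directly onto a Nomad `update` stanza, sketched here with illustrative values:

```hcl
update {
  max_parallel     = 1      # update one instance at a time
  min_healthy_time = "30s"  # must pass health checks this long
  healthy_deadline = "5m"   # give up on an instance after this
  auto_revert      = true   # roll back automatically on failure
}
```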
Disaster Recovery
Backup Strategy
Database:
- Full backup daily
- WAL archiving continuous
- Retain 30 days
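Continuous WAL archiving is configured in PostgreSQL itself; an illustrative excerpt (the archive destination is a placeholder):

```ini
# postgresql.conf excerpt (sketch)
archive_mode = on
# Copy each completed WAL segment to backup storage, never overwriting
archive_command = 'test ! -f /backup/wal/%f && cp %p /backup/wal/%f'
```

The daily full backup would typically be a `pg_basebackup` run from cron or a scheduled Nomad batch job, with old backups pruned past the 30-day retention window.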
Configuration:
- Version control (Git)
- Environment variables and secrets
Certificates:
- Backup Let's Encrypt data directory
- Export certificates for DR
Recovery Procedures
Database Failure:
- Promote standby replica
- Update connection strings
- Restore from backup if needed
Complete Failure:
- Restore database from backup
- Deploy infrastructure from IaC
- Restore configuration
- Verify services
Security
Network Security
- Firewall rules: Restrict access to internal components
- VPC/Private network: Isolate backend services from internet
- TLS everywhere: Encrypt all internal communication
- Service mesh: Optional for microservice architectures
Secrets Management
- Never commit secrets: Use environment variables or secrets managers (Vault, AWS Secrets Manager)
- Rotate credentials: Regular password/token rotation (90 days)
- Minimal privileges: Each component gets only needed permissions
- Encrypted at rest: Sensitive data encrypted in database
Access Control
- VPN required: Access to internal services via VPN only
- SSH keys: No password authentication
- Audit logs: Track all administrative actions
- RBAC: Role-based access control in Manager
Monitoring
Metrics
- Manager: Request rate, latency, error rate
- Database: Connections, query time, replication lag
- Proxy: Requests, TLS handshakes, upstream health
- Optimizer: Prediction latency, forecast accuracy
Logging
- Centralized: ELK stack or Loki
- Structured: JSON format
- Retention: 30-90 days
Alerting
- Service down
- High error rate
- Database replication lag
- Certificate expiration
- Disk space low
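As an example, several of these alerts could be written as Prometheus-style rules. The metric names are assumptions about what the services export, not a fixed contract.

```yaml
groups:
  - name: productify
    rules:
      - alert: ServiceDown
        expr: up == 0
        for: 2m
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
        for: 5m
      - alert: ReplicationLag
        expr: pg_replication_lag_seconds > 30
        for: 5m
      - alert: CertificateExpiringSoon
        expr: (cert_expiry_timestamp - time()) < 14 * 24 * 3600
```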