Setup Cost Gates

What Are Cost Gates?

Cost Gates is a GitHub App that automatically creates pull requests with rightsizing recommendations for your Kubernetes workloads.

How It Works

text
Kubeadapt analyzes usage → Recommendation ready → Bot creates PR → Team reviews → Merge → Optimization applied

Key Features

GitOps-First Approach:

  • All changes trackable and auditable through Git
  • No direct kubectl apply calls to clusters
  • Infrastructure teams maintain complete control

Intelligent Throttling:

  • Prevents PR spam with configurable rules
  • Per-deployment throttling (multiple workloads can have PRs simultaneously)
  • PR grouping (updates existing PR instead of creating duplicates for maintainability)

Flexible Actions:

  • autopr: Create PR, require manual review (default)
  • automerge: Create PR and merge it automatically

Prerequisites

Before setting up Cost Gates, ensure you have:

  • Kubeadapt cluster connected with active monitoring
  • GitHub repository for your Kubernetes manifests (YAML files)
  • Admin access to install GitHub Apps
  • GitOps workflow (Argo CD, Flux, or manual kubectl apply from Git)

Supported Platforms:

  • GitHub
  • GitLab (coming soon)

Step 1: Install Kubeadapt GitHub App

Cost Gates operates as a GitHub App that monitors your repository.

Install App

  1. Navigate to the Kubeadapt GitHub App
  2. Click "Install"
  3. Select repositories:
    • All repositories (if you manage multiple clusters)
    • Only select repositories (recommended: choose repos with K8s manifests)
  4. Click "Install & Authorize"

Onboarding

After installation, the bot automatically:

  1. Creates onboarding PR with .github/kubeadapt.yaml
  2. Scans repository for tracking comments
  3. Validates configuration and posts status in PR description

Review the onboarding PR:

  • Check default throttling settings
  • Adjust configuration if needed
  • Merge to activate the bot

Step 2: Add Tracking Comments

To track a workload, add inline tracking comments next to resource values in your YAML files.

Why Inline Comments?

Inline comments work with any YAML structure:

  • Helm chart values.yaml
  • Kustomize overlays
  • Plain Kubernetes manifests
  • Custom template structures

Basic Format

yaml
resources:
  requests:
    cpu: "1"       # kubeadapt.io/cluster=prod,deployment=api-server,resource=cpu
    memory: "2Gi"  # kubeadapt.io/cluster=prod,deployment=api-server,resource=memory

Full Format with Limits and HPA

yaml
resources:
  requests:
    cpu: "2"       # kubeadapt.io/cluster=prod,deployment=api-server,resource=cpu
    memory: "4Gi"  # kubeadapt.io/cluster=prod,deployment=api-server,resource=memory
  limits:
    cpu: "4"       # kubeadapt.io/cluster=prod,deployment=api-server,resource=cpu-limit
    memory: "8Gi"  # kubeadapt.io/cluster=prod,deployment=api-server,resource=memory-limit

minReplicas: 3   # kubeadapt.io/cluster=prod,deployment=api-server,hpa=min
maxReplicas: 10  # kubeadapt.io/cluster=prod,deployment=api-server,hpa=max

Comment Attributes

Required:

  • cluster: Cluster name in Kubeadapt (e.g., prod, staging)
  • deployment: Workload name
  • resource: Resource type (cpu, memory, cpu-limit, or memory-limit). For HPA replica bounds, use hpa=min or hpa=max in place of resource.

Note: Namespace, controller type (Deployment/StatefulSet/DaemonSet), and action (autopr/automerge) are configured in .github/kubeadapt.yaml using path patterns and environment overrides.
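The comment format above can be illustrated with a small parser. This is only a sketch: the function name, the regular expression, and the validation rules (cluster and deployment required, plus exactly one of resource or hpa) are inferred from the examples in this section and are not taken from the bot's actual implementation.

```python
import re
from typing import Optional

# Hypothetical pattern: grammar inferred from the examples in this guide.
TRACKING_RE = re.compile(r"#\s*kubeadapt\.io/(?P<attrs>\S+)")


def parse_tracking_comment(line: str) -> Optional[dict]:
    """Return the attributes of a tracking comment, or None if absent/invalid."""
    match = TRACKING_RE.search(line)
    if match is None:
        return None
    attrs = {}
    for pair in match.group("attrs").split(","):
        key, sep, value = pair.partition("=")
        if not sep or not key or not value:
            return None  # malformed key=value pair
        attrs[key] = value
    # Assumed rule: cluster and deployment are always required.
    if "cluster" not in attrs or "deployment" not in attrs:
        return None
    # Assumed rule: exactly one of resource / hpa must be present.
    if ("resource" in attrs) == ("hpa" in attrs):
        return None
    return attrs


if __name__ == "__main__":
    line = 'cpu: "1" # kubeadapt.io/cluster=prod,deployment=api-server,resource=cpu'
    print(parse_tracking_comment(line))
```

Because the parser only looks at the trailing comment, the same function would apply to Helm values, Kustomize patches, and plain manifests alike.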


Step 3: Configure Throttling & Auto-Merge

Edit .kubeadapt/config.yaml to customize bot behavior.

Example Configuration

yaml
version: "1.0"

# Throttling prevents PR spam
throttling:
  min_interval: "24h"        # Min time between PRs for same deployment
  min_cost_impact: 25.0      # Min $25/month savings required
  min_percentage_gain: 5.0   # Min 5% improvement required
  max_concurrent_prs: 10     # Max 10 open PRs across all deployments

# Auto-merge settings
auto_merge:
  enabled: false               # Enable auto-merge globally
  require_approval: true       # Require human approval before merge
  wait_time_after_merge: "1h"  # Cooldown period after merge

# Periodic check schedule (cron format)
schedule: "0 9 * * 1"  # Every Monday at 9 AM

# Environment-specific overrides
environments:
  - name: production
    cluster_id: prod
    path_patterns:
      - "k8s/production/**"
    auto_merge: false  # Conservative for prod

  - name: staging
    cluster_id: staging
    path_patterns:
      - "k8s/staging/**"
    auto_merge: true  # Aggressive for staging

Configuration Profiles

Conservative (Production):

yaml
throttling:
  min_interval: "7d"         # Weekly max
  min_cost_impact: 100.0     # Significant savings only
  min_percentage_gain: 15.0  # Substantial improvements
  max_concurrent_prs: 3      # Limited PRs

Aggressive (Staging):

yaml
throttling:
  min_interval: "12h"       # Twice daily
  min_cost_impact: 5.0      # Any savings
  min_percentage_gain: 2.0  # Small improvements OK
  max_concurrent_prs: 20    # Many PRs allowed

Balanced (Recommended):

yaml
throttling:
  min_interval: "24h"       # Daily max
  min_cost_impact: 25.0     # Meaningful savings
  min_percentage_gain: 5.0  # Reasonable improvements
  max_concurrent_prs: 10    # Moderate concurrency
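The profiles express intervals as strings such as "12h", "24h", and "7d". As a rough illustration of how such values could be interpreted, the sketch below converts them to timedelta objects. The accepted units (m, h, d) and the function name are assumptions; the guide only shows hour and day values.

```python
import re
from datetime import timedelta

# Assumed unit suffixes; only "h" and "d" appear in this guide.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_interval(text: str) -> timedelta:
    """Convert a string like '24h' or '7d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported interval: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


if __name__ == "__main__":
    for value in ("12h", "24h", "7d"):
        print(value, "->", parse_interval(value))
```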

Step 4: Track Your Workloads

Add tracking comments to your Kubernetes manifests.

Example 1: Helm Chart Values (Auto-PR)

yaml
# values.yaml for api-server
replicaCount: 3

image:
  repository: myapp
  tag: latest

resources:
  requests:
    cpu: "1"       # kubeadapt.io/cluster=prod,deployment=api-server,resource=cpu
    memory: "2Gi"  # kubeadapt.io/cluster=prod,deployment=api-server,resource=memory

Configuration in .github/kubeadapt.yaml:

yaml
environments:
  - name: production
    cluster_id: prod
    path_patterns:
      - "helm/api-server/values.yaml"
    action: autopr  # Default: create PR, require review

Result: The bot creates a PR whenever a recommendation is available and meets the throttling thresholds.

Example 2: Kustomize Overlay with Auto-Merge

yaml
# overlays/production/postgres-patch.yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: database
spec:
  template:
    spec:
      containers:
        - name: postgres
          resources:
            requests:
              cpu: "2"       # kubeadapt.io/cluster=prod,deployment=postgres,resource=cpu
              memory: "4Gi"  # kubeadapt.io/cluster=prod,deployment=postgres,resource=memory
            limits:
              cpu: "4"       # kubeadapt.io/cluster=prod,deployment=postgres,resource=cpu-limit
              memory: "8Gi"  # kubeadapt.io/cluster=prod,deployment=postgres,resource=memory-limit

Configuration in .kubeadapt/config.yaml:

yaml
environments:
  - name: production-database
    cluster_id: prod
    path_patterns:
      - "overlays/production/postgres-patch.yaml"
    action: automerge

Result: The bot creates PRs and auto-merges them.


Step 5: Understand PR Format

When a recommendation is ready, the bot creates a PR.

PR Structure

Title:

text
chore(ci): rightsize <Type>/<name> resources

Examples:

  • chore(ci): rightsize Deployment/api-server resources
  • chore(ci): rightsize StatefulSet/postgres resources

Branch:

text
kubeadapt/<type>/<deployment-name>
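To make the naming scheme concrete, here is a minimal sketch that fills in the title and branch templates. The helper names are hypothetical, and lowercasing the controller kind for the branch segment is an assumption; the guide does not say how <type> is normalized.

```python
def pr_title(kind: str, name: str) -> str:
    """Fill in the title template shown in this section."""
    return f"chore(ci): rightsize {kind}/{name} resources"


def pr_branch(kind: str, name: str) -> str:
    """Fill in the branch template.

    Assumption: <type> is the lowercased controller kind.
    """
    return f"kubeadapt/{kind.lower()}/{name}"


if __name__ == "__main__":
    print(pr_title("Deployment", "api-server"))
    print(pr_branch("Deployment", "api-server"))
```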

PR Description includes:

  • Current vs. recommended resources
  • Cost impact (monthly savings)
  • Safety analysis
  • Interactive controls (checkboxes)

Interactive PR Controls

Bot includes Renovate-inspired controls:

Checkbox Commands:

markdown
- [ ] If you want to rebase/retry this PR, check this box

Check the box → Bot rebases PR onto latest base branch

Slash Commands:

  • /rebase: Rebase PR onto latest base
  • /recreate: Close and recreate PR with fresh analysis
  • /skip: Close PR without merging
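A comment handler for these commands might look roughly like the sketch below. It assumes a command must appear alone on its own line of the PR comment; that rule, the function name, and the returned action labels are illustrative and not documented bot behavior.

```python
from typing import Optional

# The three slash commands listed above, mapped to hypothetical action labels.
COMMANDS = {
    "/rebase": "rebase",
    "/recreate": "recreate",
    "/skip": "skip",
}


def find_command(comment_body: str) -> Optional[str]:
    """Return the action for the first slash command found, or None."""
    for line in comment_body.splitlines():
        word = line.strip()
        if word in COMMANDS:  # assumption: command stands alone on its line
            return COMMANDS[word]
    return None


if __name__ == "__main__":
    print(find_command("Looks stale.\n/rebase"))
```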

Throttling Behavior

Throttling prevents PR spam while ensuring important optimizations aren't missed.

Per-Deployment Throttling

Key Principle: Throttling applies per-deployment, not globally.

  • Allowed: 5 PRs open simultaneously for different deployments
  • Blocked: 2 PRs for the same deployment within 24 hours (Kubeadapt can generate 1 or 2 recommendations depending on the spike trend)

Throttling Checks

Check 0: Concurrent Limit (Global)

  • Max 10 open PRs across all deployments

Check 1: Time Interval (Per-Deployment)

  • Min 24h between PRs for same deployment

Check 2: Cost Impact

  • Savings must exceed $25/month

Check 3: Percentage Gain

  • Improvement must exceed 5%
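The four checks can be sketched as a single function that returns the first reason a recommendation is blocked. This is a simplified model using the default thresholds quoted above. It assumes the checks run in the listed order and that all of them must pass; the class, function, and parameter names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Throttling:
    # Defaults mirror the values quoted in the checks above.
    min_interval: timedelta = timedelta(hours=24)
    min_cost_impact: float = 25.0
    min_percentage_gain: float = 5.0
    max_concurrent_prs: int = 10


def throttle_reason(
    cfg: Throttling,
    *,
    open_prs: int,
    last_pr_at: Optional[datetime],
    now: datetime,
    monthly_savings: float,
    percentage_gain: float,
) -> Optional[str]:
    """Return why a PR is blocked, or None if it may be created."""
    # Check 0: global limit on open PRs.
    if open_prs >= cfg.max_concurrent_prs:
        return "concurrent limit reached"
    # Check 1: per-deployment interval since the last PR.
    if last_pr_at is not None and now - last_pr_at < cfg.min_interval:
        return "min interval not elapsed"
    # Check 2: savings must exceed the threshold.
    if monthly_savings <= cfg.min_cost_impact:
        return "cost impact too low"
    # Check 3: improvement must exceed the threshold.
    if percentage_gain <= cfg.min_percentage_gain:
        return "percentage gain too low"
    return None
```

Returning a reason string rather than a boolean makes it straightforward to report in logs or a PR status why a recommendation was held back.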

Example Scenario

yaml
# Repo tracks 3 deployments

Deployment A: Last PR 12h ago → New rec: $50/mo → BLOCKED (< 24h)
Deployment B: Last PR 3d ago  → New rec: $30/mo → ALLOWED
Deployment C: Last PR 7d ago  → New rec: $20/mo → BLOCKED (< $25/mo)

Result: 1 PR created (B). A waits 12 more hours; C is skipped because its savings fall below the $25/month minimum.


PR Grouping

To keep your repository maintainable and avoid PR spam, the bot updates existing PRs whenever possible instead of creating new ones.

How It Works

text
Day 1: Recommendation arrives → $60 savings → Create PR #123

Day 3: New recommendation arrives → $70 savings
       → UPDATE PR #123 (rebase + add new changes)
       → Don't create new PR

Day 5: PR #123 merged ✅
       → Next recommendation creates new PR

Key Principle: The bot prefers updating existing PRs to maintain a clean PR history.

When PRs Are Updated vs. Created

Bot UPDATES existing PR when:

  • Open PR already exists for the deployment
  • New recommendation is available

Bot CREATES new PR when:

  • No open PR exists
  • Previous PR was merged or closed
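The update-versus-create rule reduces to a lookup of the open PR for a deployment. The sketch below models it with an in-memory dictionary keyed by deployment name. The data structure and function name are assumptions for illustration; a real bot would presumably query the GitHub API for open PRs instead.

```python
from typing import Optional


def plan_action(open_prs: dict[str, int], deployment: str) -> tuple[str, Optional[int]]:
    """Decide whether to update an existing PR or create a new one.

    open_prs maps deployment name -> number of its currently open PR.
    """
    number = open_prs.get(deployment)
    if number is None:
        return ("create", None)  # no open PR (never opened, merged, or closed)
    return ("update", number)    # reuse the open PR


if __name__ == "__main__":
    open_prs: dict[str, int] = {}
    print(plan_action(open_prs, "api-server"))  # Day 1: nothing open yet
    open_prs["api-server"] = 123
    print(plan_action(open_prs, "api-server"))  # Day 3: PR #123 still open
    del open_prs["api-server"]                  # Day 5: PR #123 merged
    print(plan_action(open_prs, "api-server"))
```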

Best Practices

1. Use Auto-Merge for Non-Critical Services

Configuration in .kubeadapt/config.yaml:

yaml
environments:
  - name: staging
    cluster_id: staging
    path_patterns:
      - "helm/**"
    action: automerge  # Fully automated

Staging environments benefit from automatic optimization.

2. Set Conservative Throttling for Production

yaml
throttling:
  min_interval: "7d"     # Weekly max
  min_cost_impact: 50.0  # Significant savings only

3. Different Settings for Different Environments

yaml
environments:
  - name: production
    cluster_id: prod
    path_patterns:
      - "k8s/production/**"
    action: autopr
    auto_merge: false  # Manual review required

  - name: staging
    cluster_id: staging
    path_patterns:
      - "k8s/staging/**"
    action: automerge
    auto_merge: true  # Fully automated

4. Use Path Patterns for Organization

yaml
environments:
  - name: backend-services
    cluster_id: prod
    path_patterns:
      - "helm/backend/**"
    action: autopr

  - name: job-workers
    cluster_id: prod
    path_patterns:
      - "helm/jobs/**"
    action: automerge
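To show how path patterns could map a file to an action, the sketch below resolves a path against the two environments above. It uses Python's fnmatch as a stand-in for the bot's glob matching, which may treat ** differently, and it assumes that the first matching environment wins and that unmatched files fall back to autopr (the documented default action).

```python
from fnmatch import fnmatchcase

# Mirrors the example configuration above.
ENVIRONMENTS = [
    {"name": "backend-services", "path_patterns": ["helm/backend/**"], "action": "autopr"},
    {"name": "job-workers", "path_patterns": ["helm/jobs/**"], "action": "automerge"},
]


def resolve_action(path: str, environments=ENVIRONMENTS, default: str = "autopr") -> str:
    """Return the action of the first environment whose pattern matches path."""
    for env in environments:
        # fnmatch's "*" also matches "/", so "**" spans nested directories here.
        if any(fnmatchcase(path, pattern) for pattern in env["path_patterns"]):
            return env["action"]
    return default  # assumption: unmatched files use the default action


if __name__ == "__main__":
    print(resolve_action("helm/jobs/worker/values.yaml"))
```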