Nixpacks + Cloud Registry Integration Guide

This document describes the complete Nixpacks + Cloud Registry integration for BitFlow, enabling automatic container image builds from Git repositories.

🏗️ Architecture Overview

graph TD
    A[Git Push] --> B[Webhook]
    B --> C[API Server]
    C --> D[RabbitMQ Queue]
    D --> E[Build Worker]
    E --> F[Nixpacks]
    F --> G[Cloud Registry]
    G --> H[Federated Training]

Components

  1. API Server - Receives webhooks, manages builds, queues jobs
  2. Build Worker - Processes build jobs using Nixpacks
  3. RabbitMQ - Reliable job queue with retries and dead letters
  4. Cloud Registry - Secure container image storage (GCR, ECR, ACR, etc.)
  5. Nixpacks - Automatic buildpack detection and building
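
The "retries and dead letters" behaviour of the queue can be sketched as follows. This is a minimal illustration, not BitFlow's actual worker code: the exchange and queue names and the retry limit are hypothetical. The x-dead-letter-exchange and x-dead-letter-routing-key queue arguments and the x-death message header are standard RabbitMQ features.

```python
from typing import Any, Dict, Optional

MAX_RETRIES = 3  # hypothetical limit, not a documented BitFlow default


def build_queue_arguments(dead_letter_exchange: str, dead_letter_routing_key: str) -> Dict[str, Any]:
    """Queue arguments that route rejected or expired jobs to a dead-letter exchange."""
    return {
        "x-dead-letter-exchange": dead_letter_exchange,
        "x-dead-letter-routing-key": dead_letter_routing_key,
    }


def death_count(headers: Optional[Dict[str, Any]], queue: str) -> int:
    """Total number of times RabbitMQ has dead-lettered this message from `queue`."""
    entries = (headers or {}).get("x-death") or []
    return sum(int(e.get("count", 0)) for e in entries if e.get("queue") == queue)


def should_retry(headers: Optional[Dict[str, Any]], queue: str, max_retries: int = MAX_RETRIES) -> bool:
    """Retry until the message has been dead-lettered `max_retries` times."""
    return death_count(headers, queue) < max_retries
```

The dictionary returned by build_queue_arguments would be passed as the arguments of the queue declaration in whichever client library the worker uses (for example the arguments parameter of pika's queue_declare).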

🚀 Quick Start

1. Prerequisites

# Install dependencies (Debian/Ubuntu; run as root or prefix with sudo)
apt-get update
apt-get install -y docker.io git

# Install Nixpacks
curl -sSL https://nixpacks.com/install.sh | bash

# Verify installation
nixpacks --version
docker --version

2. Configure Services

RabbitMQ Setup:

# Example credentials for local development only; change them before exposing the broker
docker run -d --name rabbitmq \
  -p 5672:5672 -p 15672:15672 \
  -e RABBITMQ_DEFAULT_USER=bitflow \
  -e RABBITMQ_DEFAULT_PASS=bitflow123 \
  rabbitmq:3-management

Cloud Registry Setup (Google Container Registry example):

# Install Google Cloud SDK
curl https://sdk.cloud.google.com | bash

# Authenticate with your Google Cloud account
gcloud auth login

# Configure Docker to use GCR
gcloud auth configure-docker

# Alternative: Use service account key
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

3. Start Build Worker

# Copy configuration
cp configs/worker.yaml.example configs/worker.yaml

# Edit with your settings
vim configs/worker.yaml

# Start worker
./build/worker --config configs/worker.yaml

📋 API Endpoints

Build Management

Method  Endpoint                  Description
GET     /v1/builds                List image builds
POST    /v1/builds                Create new build
GET     /v1/builds/:id            Get build details
GET     /v1/builds/:id/logs       View build logs
POST    /v1/builds/:id/cancel     Cancel running build
POST    /v1/builds/:id/retry      Retry failed build
GET     /v1/builds/builder-types  List available builders

Webhook Integration

Method  Endpoint                    Description
POST    /v1/webhooks/github/events  GitHub webhook handler
POST    /v1/webhooks/gitlab/events  GitLab webhook handler
POST    /v1/webhooks/gitee/events   Gitee webhook handler
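
A webhook handler should authenticate each delivery before queueing a build. The sketch below shows how the GitHub handler could check the X-Hub-Signature-256 header, which GitHub computes as an HMAC-SHA256 of the raw request body using the webhook secret. This guide does not state how BitFlow verifies webhooks, so the function name and its place in the handler are assumptions.

```python
import hashlib
import hmac


def verify_github_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Return True if `signature_header` matches the HMAC-SHA256 of the raw body."""
    prefix = "sha256="
    if not signature_header or not signature_header.startswith(prefix):
        return False
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    received = signature_header[len(prefix):]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The check must run on the raw request bytes, before any JSON parsing, because re-serializing the payload can change the byte sequence that was signed.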

🛠️ CLI Usage

Basic Operations

# List all builds
bitflow builds list

# Create manual build
bitflow builds create owner/repo --branch main --tag v1.0.0

# Monitor build progress
bitflow builds show abc123-def456
bitflow builds logs abc123-def456 --follow

# List available builders
bitflow builds builders

Repository Management

# Connect repository with auto-build
bitflow repos connect owner/repo --platform github

# Configure webhook (auto-setup)
bitflow repos init --template python

🔧 Build Configuration

Nixpacks Configuration

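Create .nixpacks/nixpacks.toml in your repository. Nixpacks itself looks for nixpacks.toml at the repository root unless a path is passed with --config, so this location assumes the build worker points Nixpacks at it.

# Select the Python provider explicitly instead of relying on auto-detection
providers = ["python"]

[variables]
# Nixpacks reads language versions from NIXPACKS_-prefixed variables
NIXPACKS_PYTHON_VERSION = "3.11"

[phases.build]
# Phase commands are given as a list under "cmds"
cmds = ["pip install -r requirements.txt"]

[start]
# The start command lives in the top-level [start] table, not under [phases]
cmd = "python app.py"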

Build Request Example

{
  "repository_id": "owner/lungadenosquam",
  "branch": "main",
  "builder_type": "nixpacks",
  "image_tag": "v1.0.0",
  "build_config": {
    "nixpacks_version": "1.0.0",
    "registry_url": "gcr.io/my-project",
    "image_name": "medical/lungadenosquam",
    "build_timeout": 1800,
    "scan_on_push": true,
    "environment": {
      "MODEL_TYPE": "classification",
      "DATASET_PATH": "/data"
    }
  }
}
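
As a rough sketch, the request above could be submitted from Python with the standard library. The host and the bearer-token header are taken from the monitoring example elsewhere in this guide; that POST /v1/builds accepts this body unchanged is an assumption, and make_build_request is a hypothetical helper.

```python
import json
import urllib.request

API_BASE = "http://api.bitflow.ai"  # host as used in this guide's monitoring example


def make_build_request(token: str, payload: dict, api_base: str = API_BASE) -> urllib.request.Request:
    """Prepare (but do not send) a POST /v1/builds request carrying `payload` as JSON."""
    return urllib.request.Request(
        url=f"{api_base}/v1/builds",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the prepared request to urllib.request.urlopen(request, timeout=30) and decode the JSON response.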

🔄 Workflow Integration

Using Built Images in Workflows

{
  "name": "lung-cancer-training",
  "tasks": [
    {
      "task_type": "container",
      "configuration": {
        "image": "gcr.io/my-project/medical/lungadenosquam:v1.0.0",
        "resources": {
          "gpu_count": 1,
          "memory": "32Gi"
        },
        "dataset_mounts": [
          {
            "dataset_id": "lung-wsi-dataset",
            "mount_path": "/data",
            "permissions": "ro"
          }
        ]
      }
    }
  ]
}

Auto-Build Triggers

  1. Push to main/master - Automatic build with tag main-auto-abc1234
  2. Manual API call - Custom tag and configuration
  3. Scheduled builds - Future: cron-based rebuilds
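
The automatic tag in trigger 1 appears to follow the pattern branch, the literal "auto", then a short commit hash. A minimal sketch of that derivation, assuming a 7-character hash and that characters not allowed in Docker tags are replaced with hyphens (both are assumptions, not documented BitFlow behaviour):

```python
import re


def auto_build_tag(branch: str, commit_sha: str) -> str:
    """Build a tag such as 'main-auto-abc1234' from a branch name and commit SHA."""
    short_sha = commit_sha[:7].lower()
    # Docker tags allow only letters, digits, underscores, periods and hyphens.
    safe_branch = re.sub(r"[^A-Za-z0-9_.-]", "-", branch)
    # Docker tags are limited to 128 characters.
    return f"{safe_branch}-auto-{short_sha}"[:128]
```

For example, auto_build_tag("main", "abc1234deadbeef") returns "main-auto-abc1234".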

🏥 Medical AI Example: LungAdenoSquam

Repository Setup

# Connect the LungAdenoSquam repository
bitflow repos connect bitroc-ai/LungAdenoSquam --platform github

# Enable auto-builds (default: enabled)
# Push to main will trigger: gcr.io/my-project/bitroc-ai/lungadenosquam:main-auto-abc1234
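
The image name in the comment above is the repository path lowercased and prefixed with the registry URL, since Docker repository names must be lowercase. A small sketch of that mapping, where image_reference is a hypothetical helper rather than part of the BitFlow CLI:

```python
def image_reference(registry_url: str, repository: str, tag: str) -> str:
    """Combine registry, repository path and tag into a full image reference."""
    name = repository.lower()  # 'bitroc-ai/LungAdenoSquam' -> 'bitroc-ai/lungadenosquam'
    return f"{registry_url.rstrip('/')}/{name}:{tag}"
```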

Training Workflow

{
  "name": "lung-adenosquam-federated-training",
  "dataset_mounts": [
    { "dataset_id": "lung-wsi-us", "mount_path": "/data/us" },
    { "dataset_id": "lung-wsi-eu", "mount_path": "/data/eu" }
  ],
  "tasks": [
    {
      "task_type": "container",
      "configuration": {
        "image": "gcr.io/my-project/bitroc-ai/lungadenosquam:latest",
        "command": ["python", "train.py", "--federated"],
        "resources": { "gpu_count": 1, "memory": "32Gi" }
      }
    }
  ]
}

Federated Execution

  • US Datacenter: Uses local lung-wsi-us dataset only
  • EU Datacenter: Uses local lung-wsi-eu dataset only
  • Both pull same image: gcr.io/my-project/bitroc-ai/lungadenosquam:latest
  • Results aggregated: Central orchestrator combines model weights
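
This guide does not specify how the orchestrator combines model weights. One common choice is federated averaging (FedAvg), where each site's weights are averaged in proportion to the number of local training samples. The sketch below illustrates that technique with plain Python lists; the data layout and function name are assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Weights = Dict[str, List[float]]  # parameter name -> flattened values


def federated_average(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Average per-site weights, weighting each site by its sample count."""
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("at least one site must contribute samples")
    result: Weights = {}
    for weights, n in updates:
        share = n / total
        for name, values in weights.items():
            acc = result.setdefault(name, [0.0] * len(values))
            for i, v in enumerate(values):
                acc[i] += share * v
    return result
```

Only the weights and sample counts leave each datacenter in this scheme; the lung-wsi-us and lung-wsi-eu datasets themselves stay local.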

🔍 Monitoring & Debugging

Build Status Tracking

# Monitor queue status
curl -H "Authorization: Bearer $TOKEN" \
  http://api.bitflow.ai/v1/builds/queue/stats

# Check build logs
bitflow builds logs abc123-def456

# View worker metrics
docker logs bitflow-worker

Common Issues

Issue                Solution
Nixpacks not found   Install: curl -sSL https://nixpacks.com/install.sh | bash
Registry push fails  Check authentication and permissions
Build timeout        Increase build_timeout in BuildConfig
Queue backing up     Raise per-worker concurrency or add more workers

Log Locations

  • API Server: Application logs
  • Build Worker: /var/log/bitflow/worker.log
  • Build Logs: Database image_builds.build_log field
  • RabbitMQ: Management UI at :15672

🔒 Security Considerations

Cloud Registry Security

  1. Service Accounts - Use service accounts with minimal required permissions
  2. Project/Namespace Isolation - Separate registries per team/environment
  3. Image Scanning - Enable vulnerability scanning (GCR: Container Analysis API)
  4. Access Control - Use IAM policies for fine-grained access control

Build Security

  1. Isolated Builds - Each build runs in a separate directory
  2. Resource Limits - Configure memory/CPU limits per build
  3. Network Isolation - Restrict worker egress to what builds need (Git host, package sources, registry); block all other outbound traffic
  4. Secret Management - Use environment variables, not hardcoded secrets
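
Points 1 and 4 can be sketched as a worker step that clones into a throwaway directory and passes build-time variables to Nixpacks on the command line. This is an illustration under assumptions, not the actual worker: the function names are hypothetical, and only nixpacks build with the --name and --env options is used.

```python
import subprocess
import tempfile
from typing import Dict, List


def nixpacks_command(source_dir: str, image_name: str, build_env: Dict[str, str]) -> List[str]:
    """Assemble the nixpacks invocation as an argument list (no shell involved)."""
    cmd = ["nixpacks", "build", source_dir, "--name", image_name]
    for key, value in sorted(build_env.items()):
        cmd += ["--env", f"{key}={value}"]
    return cmd


def run_isolated_build(repo_url: str, branch: str, image_name: str,
                       build_env: Dict[str, str], timeout_s: int = 1800) -> None:
    """Clone and build inside a temporary directory that is deleted afterwards."""
    with tempfile.TemporaryDirectory(prefix="bitflow-build-") as workdir:
        subprocess.run(
            ["git", "clone", "--depth", "1", "--branch", branch, repo_url, workdir],
            check=True, timeout=timeout_s,
        )
        # The timeout applies to each step separately, not to the build as a whole.
        subprocess.run(nixpacks_command(workdir, image_name, build_env),
                       check=True, timeout=timeout_s)
```

Passing arguments as a list avoids shell interpolation of branch names or variable values that originate from webhook payloads. Values passed with --env are visible in the process list, so real secrets are better supplied through the worker's own environment.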

Network Architecture

[Internet] -> [Load Balancer] -> [API Server]
                                      |
[Cloud Registry] <- [Build Workers] <-+-> [RabbitMQ]
                                      |
                               [Database]

📈 Scaling & Performance

Horizontal Scaling

# Scale workers across multiple machines
# Machine 1
./build/worker --config configs/worker-1.yaml

# Machine 2
./build/worker --config configs/worker-2.yaml

# Configure different work directories to avoid conflicts

Performance Tuning

  • Concurrent Workers: Start with 2 per CPU core
  • Build Timeout: 30 minutes for typical projects
  • RabbitMQ: Use cluster mode for high availability
  • Cloud Registry: Automatically scales, no configuration needed
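
The starting point of two concurrent builds per core, together with an average build time, gives a rough estimate of how long a backlog takes to clear. The sketch below assumes builds run in equal-sized batches, which is a simplification; both function names are hypothetical.

```python
import math
import os
from typing import Optional


def initial_concurrency(cpu_cores: Optional[int] = None, per_core: int = 2) -> int:
    """Starting concurrency for one worker machine: `per_core` builds per CPU core."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return max(1, cores * per_core)


def queue_drain_minutes(queue_depth: int, avg_build_minutes: float, total_concurrency: int) -> float:
    """Rough time to empty the queue when builds run in batches of `total_concurrency`."""
    if total_concurrency <= 0:
        raise ValueError("total_concurrency must be positive")
    return math.ceil(queue_depth / total_concurrency) * avg_build_minutes
```

For example, a 4-core machine starts at 8 concurrent builds, and 20 queued builds averaging 6 minutes each would take about 18 minutes to drain.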

Monitoring Metrics

  • Build queue depth
  • Average build time
  • Success/failure rates
  • Registry storage usage
  • Worker CPU/memory usage
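
The first three metrics can be derived from build records alone. A minimal sketch, assuming each record carries a status and a duration; the status names and the BuildRecord type are hypothetical, not the actual image_builds schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class BuildRecord:
    status: str  # assumed values: "queued", "running", "succeeded", "failed"
    duration_s: Optional[float] = None


def summarize(builds: Iterable[BuildRecord]) -> Dict[str, Optional[float]]:
    """Compute queue depth, success rate and average build time from build records."""
    records = list(builds)
    finished = [b for b in records if b.status in ("succeeded", "failed")]
    succeeded = [b for b in finished if b.status == "succeeded"]
    durations = [b.duration_s for b in finished if b.duration_s is not None]
    return {
        "queue_depth": sum(1 for b in records if b.status == "queued"),
        "success_rate": len(succeeded) / len(finished) if finished else None,
        "avg_build_time_s": sum(durations) / len(durations) if durations else None,
    }
```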

🔄 Future Enhancements

  1. Multi-stage Builds - Optimize image layers
  2. Cache Management - Build cache sharing between workers
  3. Build Templates - Pre-configured builds for common frameworks
  4. Parallel Builds - Multi-architecture builds (AMD64, ARM64)
  5. Build Artifacts - Store build reports and test results

This integration automates container builds for federated learning. Every datacenter pulls the same image and trains on its local dataset, so training stays consistent across sites while the data never leaves them. Using a managed cloud registry instead of a self-hosted one also hands image distribution, scaling, and access control to the provider.