Jailer Integration Plan

Executive Summary

Goal: Provide developers with fast, manageable database subsets while maintaining staging environment relevance and testing production backups.

Flow:

text
Production DB (30GB)
    ↓ Daily backup
Staging Server (Ubuntu, DigitalOcean MySQL)
    ↓ Import full dump (tests backup validity)
    ↓ Extract Jailer subsets (1-5GB)
    ↓ Upload to storage
Dev Machines (macOS, Docker)
    ↓ Download subset
    ↓ Import locally

Benefits:

  • ✅ Tests production backups daily (staging import)
  • ✅ Keeps staging relevant (fresh prod data)
  • ✅ Devs get manageable subsets (30GB → 2GB)
  • ✅ Fast dev imports (25min → 5min)
  • ✅ Automated daily refresh

Architecture Overview

Components

text
┌─────────────────────────────────────────────────────────────────┐
│                        PRODUCTION                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  MySQL 8.0 (DigitalOcean Managed)                        │  │
│  │  - 30GB database                                         │  │
│  │  - Daily mysqldump backup                                │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

                    Daily at 2:00 AM UTC

┌─────────────────────────────────────────────────────────────────┐
│                         STAGING                                  │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  Ubuntu 22.04 Server                                     │  │
│  │  ┌────────────────────────────────────────────────────┐ │  │
│  │  │  1. Import full dump (30GB)                        │ │  │
│  │  │     - Validates backup integrity                   │ │  │
│  │  │     - Keeps staging fresh                          │ │  │
│  │  │     - Takes 25-40 mins                             │ │  │
│  │  └────────────────────────────────────────────────────┘ │  │
│  │  ┌────────────────────────────────────────────────────┐ │  │
│  │  │  2. Jailer extraction (multiple profiles)         │ │  │
│  │  │     - Small (1% sample, 300MB)                     │ │  │
│  │  │     - Medium (10% sample, 2GB)                     │ │  │
│  │  │     - Recent (6 months, 3GB)                       │ │  │
│  │  │     - Takes 15-30 mins total                       │ │  │
│  │  └────────────────────────────────────────────────────┘ │  │
│  │  ┌────────────────────────────────────────────────────┐ │  │
│  │  │  3. Upload to DigitalOcean Spaces (S3)            │ │  │
│  │  │     - jailer-dumps/small-YYYYMMDD.sql.gz          │ │  │
│  │  │     - jailer-dumps/medium-YYYYMMDD.sql.gz         │ │  │
│  │  │     - jailer-dumps/recent-YYYYMMDD.sql.gz         │ │  │
│  │  │     - jailer-dumps/latest-small.sql.gz (symlink)  │ │  │
│  │  └────────────────────────────────────────────────────┘ │  │
│  └──────────────────────────────────────────────────────────┘  │
│  MySQL: DigitalOcean Managed Database                           │
└─────────────────────────────────────────────────────────────────┘

                    Download on demand

┌─────────────────────────────────────────────────────────────────┐
│                    DEV MACHINES (macOS)                          │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  Docker Compose (MySQL 8.0)                              │  │
│  │  - Download latest subset from Spaces                    │  │
│  │  - Import locally (2-5 mins)                             │  │
│  │  - Work with realistic subset                            │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Phase 1: Staging Server Setup (Ubuntu)

1.1 Prerequisites

Server Specs (Recommended):

  • CPU: 4+ cores
  • RAM: 16GB+
  • Disk: 200GB SSD
  • OS: Ubuntu 22.04 LTS

Access Required:

  • SSH access to staging server
  • MySQL connection to DigitalOcean managed DB (staging)
  • DigitalOcean Spaces API keys (S3-compatible storage)
  • Read access to production DB backups

1.2 Install Dependencies

bash
# Update system
sudo apt update && sudo apt upgrade -y

# Install MySQL client (for accessing managed DB)
sudo apt install -y mysql-client

# Install Java (required for Jailer)
sudo apt install -y openjdk-17-jdk

# Install compression tools
sudo apt install -y pigz pv

# Install s3cmd for DigitalOcean Spaces
sudo apt install -y s3cmd

# Verify installations
java -version
mysql --version
pigz --version
s3cmd --version

1.3 Install Jailer

bash
# Create directory
sudo mkdir -p /opt/jailer
cd /opt/jailer

# Download latest Jailer (check https://github.com/Wisser/Jailer/releases)
JAILER_VERSION="16.3"
sudo wget https://github.com/Wisser/Jailer/releases/download/v${JAILER_VERSION}/jailer_${JAILER_VERSION}.zip

# Extract
sudo unzip jailer_${JAILER_VERSION}.zip
sudo chmod +x jailer.sh

# Verify
./jailer.sh version

# Create symlink for easier access
sudo ln -s /opt/jailer/jailer.sh /usr/local/bin/jailer
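
The extraction scripts below assume a MySQL JDBC driver at /opt/jailer/lib/mysql-connector-java.jar. Whether the Jailer release bundles one, and under which jar name, varies, so it is worth a quick check before wiring up the scripts:

bash
# Look for a bundled MySQL JDBC driver; the exact jar name differs by release.
ls /opt/jailer/lib/ | grep -i mysql \
    || echo "No MySQL connector found - download Connector/J and place the jar in /opt/jailer/lib/"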

1.4 Configure MySQL Connection

First create the working directory (sudo mkdir -p /opt/jailer-dumps), then create /opt/jailer-dumps/.my.cnf:

ini
[client]
host = staging-db.digitalocean.com
port = 25060
user = doadmin
password = YOUR_STAGING_DB_PASSWORD
ssl-ca = /opt/jailer-dumps/ca-certificate.crt

Place the DigitalOcean MySQL CA certificate alongside it. Download it from the DigitalOcean control panel (Databases → your cluster → Connection Details → "Download CA certificate"); it is not served from a public URL:

bash
cd /opt/jailer-dumps

# Copy the certificate downloaded from the control panel (adjust the source path)
sudo cp /path/to/ca-certificate.crt ca-certificate.crt

# Lock down the credentials file; the certificate itself is not secret
sudo chmod 600 .my.cnf
sudo chmod 644 ca-certificate.crt
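
To confirm the credentials file and TLS settings work end to end before moving on, run a quick connection test (sudo because the credentials file is readable by root only):

bash
# Smoke-test the connection using the credentials file and CA certificate.
sudo mysql --defaults-extra-file=/opt/jailer-dumps/.my.cnf -e "SELECT VERSION(), NOW();"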

1.5 Configure DigitalOcean Spaces (S3)

bash
# Configure s3cmd
s3cmd --configure

# Enter when prompted:
# - Access Key: YOUR_SPACES_ACCESS_KEY
# - Secret Key: YOUR_SPACES_SECRET_KEY
# - Region: nyc3 (or your region)
# - Endpoint: nyc3.digitaloceanspaces.com
# - DNS-style bucket: %(bucket)s.nyc3.digitaloceanspaces.com

# Test connection
s3cmd ls s3://ixdf-jailer-dumps/

# Create bucket if doesn't exist
s3cmd mb s3://ixdf-jailer-dumps
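
Optionally, verify write access with a small throwaway object before relying on the bucket in the pipeline:

bash
# Upload, list and delete a tiny test object to confirm read/write access.
echo "ok" > /tmp/spaces-test.txt
s3cmd put /tmp/spaces-test.txt s3://ixdf-jailer-dumps/spaces-test.txt --acl-private
s3cmd ls s3://ixdf-jailer-dumps/
s3cmd rm s3://ixdf-jailer-dumps/spaces-test.txt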

Phase 2: Staging Server Scripts

2.1 Directory Structure

text
/opt/jailer-dumps/
├── .my.cnf                    # MySQL credentials
├── ca-certificate.crt         # DigitalOcean SSL cert
├── config/
│   ├── exclusions.txt         # Tables to exclude
│   └── profiles/
│       ├── small.conf         # 1% sample config
│       ├── medium.conf        # 10% sample config
│       └── recent.conf        # 6 months config
├── model/                     # Jailer data model (generated)
├── dumps/                     # Temporary dump storage
│   ├── production-full.sql.gz
│   └── subsets/
│       ├── small-YYYYMMDD.sql.gz
│       ├── medium-YYYYMMDD.sql.gz
│       └── recent-YYYYMMDD.sql.gz
├── scripts/
│   ├── 1-import-production-dump.sh
│   ├── 2-generate-jailer-model.sh
│   ├── 3-extract-jailer-subsets.sh
│   ├── 4-upload-to-spaces.sh
│   └── daily-refresh.sh       # Main orchestrator
└── logs/
    └── daily-refresh-YYYYMMDD.log
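
The skeleton can be created up front so every script finds its expected paths (a one-time sketch; the dumps and model directories are also recreated by the scripts themselves):

bash
# Create the directory layout once on the staging server.
sudo mkdir -p /opt/jailer-dumps/{config/profiles,model,dumps/subsets,scripts,logs}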

2.2 Script: Import Production Dump

File: /opt/jailer-dumps/scripts/1-import-production-dump.sh

bash
#!/bin/bash
set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BASE_DIR="$(dirname "$SCRIPT_DIR")"
DUMP_DIR="$BASE_DIR/dumps"
LOG_FILE="$BASE_DIR/logs/import-$(date +%Y%m%d-%H%M%S).log"

exec > >(tee -a "$LOG_FILE") 2>&1

echo "================================"
echo "STEP 1: Import Production Dump"
echo "Started: $(date)"
echo "================================"

# Download production dump from forge server
echo "Downloading production dump..."
scp forge@information-architecture.org:/home/forge/backups/lastbackup--dev.sql.gz \
    "$DUMP_DIR/production-full.sql.gz"

DUMP_SIZE=$(du -h "$DUMP_DIR/production-full.sql.gz" | cut -f1)
echo "Downloaded: $DUMP_SIZE"

# Decompress with pigz
echo "Decompressing with pigz..."
pigz -d -k -f "$DUMP_DIR/production-full.sql.gz"
echo "Decompressed: $(du -h "$DUMP_DIR/production-full.sql" | cut -f1)"

# Import to staging database
echo "Importing to staging database..."
DB_NAME="ixdf_staging"

# Drop and recreate database
mysql --defaults-extra-file="$BASE_DIR/.my.cnf" -e "DROP DATABASE IF EXISTS \`$DB_NAME\`"
mysql --defaults-extra-file="$BASE_DIR/.my.cnf" -e "CREATE DATABASE \`$DB_NAME\` DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_unicode_ci"

# Import with optimizations.
# The relaxed checks must live in the same connection that runs the import, so
# they are passed via --init-command rather than a separate mysql call (whose
# session would end before the import starts). autocommit=0 and sql_log_bin=0
# are intentionally omitted: the dump has no final COMMIT, and sql_log_bin
# requires SUPER, which DigitalOcean managed MySQL does not grant.
pv "$DUMP_DIR/production-full.sql" | \
    mysql --defaults-extra-file="$BASE_DIR/.my.cnf" \
    --init-command="SET SESSION foreign_key_checks=0, unique_checks=0" \
    --max-allowed-packet=1G \
    --net-buffer-length=1M \
    "$DB_NAME"

# The settings above are session-scoped, so nothing needs to be re-enabled.

echo "✅ Import complete!"
echo "Finished: $(date)"
echo ""

2.3 Script: Generate Jailer Model

File: /opt/jailer-dumps/scripts/2-generate-jailer-model.sh

bash
#!/bin/bash
set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BASE_DIR="$(dirname "$SCRIPT_DIR")"
MODEL_DIR="$BASE_DIR/model"
LOG_FILE="$BASE_DIR/logs/model-$(date +%Y%m%d-%H%M%S).log"

exec > >(tee -a "$LOG_FILE") 2>&1

echo "================================"
echo "STEP 2: Generate Jailer Model"
echo "Started: $(date)"
echo "================================"

# Clean old model
rm -rf "$MODEL_DIR"
mkdir -p "$MODEL_DIR"

# Extract DB credentials from .my.cnf
DB_HOST=$(grep 'host' "$BASE_DIR/.my.cnf" | cut -d'=' -f2 | tr -d ' ')
DB_PORT=$(grep 'port' "$BASE_DIR/.my.cnf" | cut -d'=' -f2 | tr -d ' ')
DB_USER=$(grep 'user' "$BASE_DIR/.my.cnf" | cut -d'=' -f2 | tr -d ' ')
DB_PASS=$(grep 'password' "$BASE_DIR/.my.cnf" | cut -d'=' -f2 | tr -d ' ')
DB_NAME="ixdf_staging"
CA_CERT="$BASE_DIR/ca-certificate.crt"

echo "Analyzing database schema..."
jailer import-schema \
    -jdbcjar /opt/jailer/lib/mysql-connector-java.jar \
    -datamodel "$MODEL_DIR" \
    -url "jdbc:mysql://${DB_HOST}:${DB_PORT}/${DB_NAME}?useSSL=true&requireSSL=true&verifyServerCertificate=true&trustCertificateKeyStoreUrl=file:${CA_CERT}" \
    -user "$DB_USER" \
    -password "$DB_PASS"

echo "✅ Model generated!"
echo "Tables analyzed: $(ls "$MODEL_DIR" | wc -l)"
echo "Finished: $(date)"
echo ""

2.4 Script: Extract Jailer Subsets

File: /opt/jailer-dumps/scripts/3-extract-jailer-subsets.sh

bash
#!/bin/bash
set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BASE_DIR="$(dirname "$SCRIPT_DIR")"
MODEL_DIR="$BASE_DIR/model"
DUMP_DIR="$BASE_DIR/dumps/subsets"
CONFIG_DIR="$BASE_DIR/config"
LOG_FILE="$BASE_DIR/logs/extract-$(date +%Y%m%d-%H%M%S).log"
DATE_STAMP=$(date +%Y%m%d)

exec > >(tee -a "$LOG_FILE") 2>&1

echo "================================"
echo "STEP 3: Extract Jailer Subsets"
echo "Started: $(date)"
echo "================================"

mkdir -p "$DUMP_DIR"

# DB connection details
DB_HOST=$(grep 'host' "$BASE_DIR/.my.cnf" | cut -d'=' -f2 | tr -d ' ')
DB_PORT=$(grep 'port' "$BASE_DIR/.my.cnf" | cut -d'=' -f2 | tr -d ' ')
DB_USER=$(grep 'user' "$BASE_DIR/.my.cnf" | cut -d'=' -f2 | tr -d ' ')
DB_PASS=$(grep 'password' "$BASE_DIR/.my.cnf" | cut -d'=' -f2 | tr -d ' ')
DB_NAME="ixdf_staging"
CA_CERT="$BASE_DIR/ca-certificate.crt"

JDBC_URL="jdbc:mysql://${DB_HOST}:${DB_PORT}/${DB_NAME}?useSSL=true&requireSSL=true&verifyServerCertificate=true&trustCertificateKeyStoreUrl=file:${CA_CERT}"

# Build exclusions
EXCLUSIONS=""
while IFS= read -r line; do
    [[ "$line" =~ ^#.*$ || -z "$line" ]] && continue
    EXCLUSIONS="$EXCLUSIONS -exclude $line"
done < "$CONFIG_DIR/exclusions.txt"

# Profile 1: Small (1% sample)
echo "Extracting: Small (1% sample)..."
jailer export \
    -datamodel "$MODEL_DIR" \
    -jdbcjar /opt/jailer/lib/mysql-connector-java.jar \
    -url "$JDBC_URL" \
    -user "$DB_USER" \
    -password "$DB_PASS" \
    -format SQL \
    -e "member" \
    -where "MOD(id, 100) = 0" \
    -threads 4 \
    $EXCLUSIONS \
    -d "$DUMP_DIR/small-${DATE_STAMP}.sql"

pigz -f "$DUMP_DIR/small-${DATE_STAMP}.sql"
echo "✅ Small: $(du -h "$DUMP_DIR/small-${DATE_STAMP}.sql.gz" | cut -f1)"

# Profile 2: Medium (10% sample)
echo "Extracting: Medium (10% sample)..."
jailer export \
    -datamodel "$MODEL_DIR" \
    -jdbcjar /opt/jailer/lib/mysql-connector-java.jar \
    -url "$JDBC_URL" \
    -user "$DB_USER" \
    -password "$DB_PASS" \
    -format SQL \
    -e "member" \
    -where "MOD(id, 10) = 0" \
    -threads 4 \
    $EXCLUSIONS \
    -d "$DUMP_DIR/medium-${DATE_STAMP}.sql"

pigz -f "$DUMP_DIR/medium-${DATE_STAMP}.sql"
echo "✅ Medium: $(du -h "$DUMP_DIR/medium-${DATE_STAMP}.sql.gz" | cut -f1)"

# Profile 3: Recent (last 6 months)
echo "Extracting: Recent (6 months)..."
jailer export \
    -datamodel "$MODEL_DIR" \
    -jdbcjar /opt/jailer/lib/mysql-connector-java.jar \
    -url "$JDBC_URL" \
    -user "$DB_USER" \
    -password "$DB_PASS" \
    -format SQL \
    -e "member" \
    -where "created_at > DATE_SUB(NOW(), INTERVAL 6 MONTH)" \
    -threads 4 \
    $EXCLUSIONS \
    -d "$DUMP_DIR/recent-${DATE_STAMP}.sql"

pigz -f "$DUMP_DIR/recent-${DATE_STAMP}.sql"
echo "✅ Recent: $(du -h "$DUMP_DIR/recent-${DATE_STAMP}.sql.gz" | cut -f1)"

echo "Finished: $(date)"
echo ""

2.5 Script: Upload to Spaces

File: /opt/jailer-dumps/scripts/4-upload-to-spaces.sh

bash
#!/bin/bash
set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BASE_DIR="$(dirname "$SCRIPT_DIR")"
DUMP_DIR="$BASE_DIR/dumps/subsets"
LOG_FILE="$BASE_DIR/logs/upload-$(date +%Y%m%d-%H%M%S).log"
DATE_STAMP=$(date +%Y%m%d)
S3_BUCKET="s3://ixdf-jailer-dumps"

exec > >(tee -a "$LOG_FILE") 2>&1

echo "================================"
echo "STEP 4: Upload to Spaces"
echo "Started: $(date)"
echo "================================"

# Upload dated files
echo "Uploading small-${DATE_STAMP}.sql.gz..."
s3cmd put "$DUMP_DIR/small-${DATE_STAMP}.sql.gz" \
    "$S3_BUCKET/small-${DATE_STAMP}.sql.gz" \
    --acl-private

echo "Uploading medium-${DATE_STAMP}.sql.gz..."
s3cmd put "$DUMP_DIR/medium-${DATE_STAMP}.sql.gz" \
    "$S3_BUCKET/medium-${DATE_STAMP}.sql.gz" \
    --acl-private

echo "Uploading recent-${DATE_STAMP}.sql.gz..."
s3cmd put "$DUMP_DIR/recent-${DATE_STAMP}.sql.gz" \
    "$S3_BUCKET/recent-${DATE_STAMP}.sql.gz" \
    --acl-private

# Update "latest" symlinks by copying
echo "Updating latest-small.sql.gz..."
s3cmd cp "$S3_BUCKET/small-${DATE_STAMP}.sql.gz" \
    "$S3_BUCKET/latest-small.sql.gz"

echo "Updating latest-medium.sql.gz..."
s3cmd cp "$S3_BUCKET/medium-${DATE_STAMP}.sql.gz" \
    "$S3_BUCKET/latest-medium.sql.gz"

echo "Updating latest-recent.sql.gz..."
s3cmd cp "$S3_BUCKET/recent-${DATE_STAMP}.sql.gz" \
    "$S3_BUCKET/latest-recent.sql.gz"

# Create metadata file
cat > "/tmp/metadata-${DATE_STAMP}.json" <<EOF
{
  "date": "$(date -Iseconds)",
  "files": {
    "small": {
      "filename": "small-${DATE_STAMP}.sql.gz",
      "size": "$(du -b "$DUMP_DIR/small-${DATE_STAMP}.sql.gz" | cut -f1)",
      "size_human": "$(du -h "$DUMP_DIR/small-${DATE_STAMP}.sql.gz" | cut -f1)"
    },
    "medium": {
      "filename": "medium-${DATE_STAMP}.sql.gz",
      "size": "$(du -b "$DUMP_DIR/medium-${DATE_STAMP}.sql.gz" | cut -f1)",
      "size_human": "$(du -h "$DUMP_DIR/medium-${DATE_STAMP}.sql.gz" | cut -f1)"
    },
    "recent": {
      "filename": "recent-${DATE_STAMP}.sql.gz",
      "size": "$(du -b "$DUMP_DIR/recent-${DATE_STAMP}.sql.gz" | cut -f1)",
      "size_human": "$(du -h "$DUMP_DIR/recent-${DATE_STAMP}.sql.gz" | cut -f1)"
    }
  }
}
EOF

s3cmd put "/tmp/metadata-${DATE_STAMP}.json" \
    "$S3_BUCKET/latest-metadata.json" \
    --acl-private

rm "/tmp/metadata-${DATE_STAMP}.json"

# Cleanup old dumps (keep last 7 days on server)
echo "Cleaning up old local dumps..."
find "$DUMP_DIR" -name "*.sql.gz" -mtime +7 -delete

# Cleanup old dumps on Spaces (keep last 30 days)
echo "Cleaning up old remote dumps..."
s3cmd ls "$S3_BUCKET/" | \
    awk '{print $4}' | \
    grep -E '\d{8}\.sql\.gz$' | \
    while read -r file; do
        file_date=$(echo "$file" | grep -oE '\d{8}')
        days_old=$(( ($(date +%s) - $(date -d "$file_date" +%s)) / 86400 ))
        if [ "$days_old" -gt 30 ]; then
            echo "Deleting old file: $file (${days_old} days old)"
            s3cmd rm "$file"
        fi
    done

echo "✅ Upload complete!"
echo "Finished: $(date)"
echo ""

# List current files
echo "Current files in Spaces:"
s3cmd ls "$S3_BUCKET/" | tail -10

2.6 Script: Daily Orchestrator

File: /opt/jailer-dumps/scripts/daily-refresh.sh

bash
#!/bin/bash
set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BASE_DIR="$(dirname "$SCRIPT_DIR")"
LOG_FILE="$BASE_DIR/logs/daily-refresh-$(date +%Y%m%d).log"

exec > >(tee -a "$LOG_FILE") 2>&1

echo "========================================"
echo "DAILY JAILER REFRESH"
echo "Started: $(date)"
echo "========================================"
echo ""

# Run all steps in sequence
bash "$SCRIPT_DIR/1-import-production-dump.sh"
bash "$SCRIPT_DIR/2-generate-jailer-model.sh"
bash "$SCRIPT_DIR/3-extract-jailer-subsets.sh"
bash "$SCRIPT_DIR/4-upload-to-spaces.sh"

echo "========================================"
echo "✅ DAILY REFRESH COMPLETE!"
echo "Finished: $(date)"
echo "Duration: $SECONDS seconds"
echo "========================================"

# Send notification (optional - integrate with Slack, email, etc.)
# curl -X POST https://hooks.slack.com/... \
#     -d '{"text":"Jailer daily refresh completed successfully!"}'

2.7 Configuration Files

File: /opt/jailer-dumps/config/exclusions.txt

text
# Tables to exclude from Jailer subsets
# Based on scripts/database/backup--dev.sh

# Logging
logging__admin_logs
logging__logs

# Leads
leads
subscription__contact_list_subscribers
lead__download_requests
guest_downloaders

# Notifications
notification__message_log_email
notification__message_log_postcard
notification__message_log_sms
notification__message_queue_email
notification__message_queue_postcard
notification__message_queue_sms
notification__notification_log
notification__notification_queue

# Session data
session_references
jobs--failed

# Statistics
statistic__metric_values

# Membership renewals
membership__renewal_results

# OAuth tokens
oauth_access_tokens

2.8 Setup Cron Job

bash
# Edit crontab
sudo crontab -e

# Add daily refresh at 3:00 AM UTC (after production backup completes)
0 3 * * * /opt/jailer-dumps/scripts/daily-refresh.sh

# Or for testing, run every hour
#0 * * * * /opt/jailer-dumps/scripts/daily-refresh.sh
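
If a refresh ever overruns (for example a slow import), overlapping cron runs could clobber the working files. A hedged variant of the same entry uses flock to skip a run while the previous one is still going, and captures cron output to a log:

bash
# Skip the run if the previous one still holds the lock; keep cron output.
0 3 * * * /usr/bin/flock -n /tmp/jailer-daily-refresh.lock /opt/jailer-dumps/scripts/daily-refresh.sh >> /opt/jailer-dumps/logs/cron.log 2>&1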

2.9 Make Scripts Executable

bash
chmod +x /opt/jailer-dumps/scripts/*.sh

Phase 3: Dev Machine Setup (macOS)

3.1 Install Dependencies

bash
# Install Homebrew (if not already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install tools
brew install pigz pv s3cmd

# Verify
pigz --version
pv --version
s3cmd --version

3.2 Configure DigitalOcean Spaces Access

bash
# Configure s3cmd
s3cmd --configure

# Enter:
# - Access Key: YOUR_SPACES_ACCESS_KEY (read-only recommended)
# - Secret Key: YOUR_SPACES_SECRET_KEY
# - Region: nyc3
# - Endpoint: nyc3.digitaloceanspaces.com

# Test
s3cmd ls s3://ixdf-jailer-dumps/
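
Because the staging pipeline publishes latest-metadata.json (see 2.5), a read-only key can be sanity-checked, and subset freshness inspected, by fetching that file:

bash
# Fetch the metadata written by the staging pipeline to confirm read access
# and see how fresh the current subsets are.
s3cmd get s3://ixdf-jailer-dumps/latest-metadata.json /tmp/jailer-metadata.json --force
cat /tmp/jailer-metadata.json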

3.3 Update Makefile

Add Jailer download and import commands:

File: Makefile

makefile
# Jailer subsets from DigitalOcean Spaces
S3_BUCKET := s3://ixdf-jailer-dumps
JAILER_LOCAL_DIR := tools/jailer/subsets

.PHONY: db\:update\:jailer\:small db\:update\:jailer\:medium db\:update\:jailer\:recent db\:jailer\:list

db\:jailer\:list:
	@echo "📦 Available Jailer subsets in Spaces:"
	@s3cmd ls $(S3_BUCKET)/ | grep 'latest-' || echo "No subsets found"
	@echo ""
	@echo "Download with:"
	@echo "  make db:update:jailer:small   - 1% sample (~300MB)"
	@echo "  make db:update:jailer:medium  - 10% sample (~2GB)"
	@echo "  make db:update:jailer:recent  - Last 6 months (~3GB)"

db\:update\:jailer\:small:
	@echo "📥 Downloading small Jailer subset (1%, ~300MB)..."
	@mkdir -p $(JAILER_LOCAL_DIR)
	s3cmd get $(S3_BUCKET)/latest-small.sql.gz $(JAILER_LOCAL_DIR)/latest-small.sql.gz --force
	@echo "✅ Downloaded: $$(du -h $(JAILER_LOCAL_DIR)/latest-small.sql.gz | cut -f1)"
	@echo "Decompressing..."
	pigz -d -k -f $(JAILER_LOCAL_DIR)/latest-small.sql.gz
	@echo "Importing to local database..."
	./scripts/database/import.sh \
		-d $${DB_DATABASE:-ixdf} \
		-u root \
		-p $${DB_PASSWORD:-password} \
		-h 127.0.0.1 \
		-P $${FORWARD_DB_PORT:-3306} \
		-s $(JAILER_LOCAL_DIR) \
		-f latest-small.sql
	@echo "✅ Small subset imported!"

db\:update\:jailer\:medium:
	@echo "📥 Downloading medium Jailer subset (10%, ~2GB)..."
	@mkdir -p $(JAILER_LOCAL_DIR)
	s3cmd get $(S3_BUCKET)/latest-medium.sql.gz $(JAILER_LOCAL_DIR)/latest-medium.sql.gz --force
	@echo "✅ Downloaded: $$(du -h $(JAILER_LOCAL_DIR)/latest-medium.sql.gz | cut -f1)"
	@echo "Decompressing..."
	pigz -d -k -f $(JAILER_LOCAL_DIR)/latest-medium.sql.gz
	@echo "Importing to local database..."
	./scripts/database/import.sh \
		-d $${DB_DATABASE:-ixdf} \
		-u root \
		-p $${DB_PASSWORD:-password} \
		-h 127.0.0.1 \
		-P $${FORWARD_DB_PORT:-3306} \
		-s $(JAILER_LOCAL_DIR) \
		-f latest-medium.sql
	@echo "✅ Medium subset imported!"

db\:update\:jailer\:recent:
	@echo "📥 Downloading recent Jailer subset (6 months, ~3GB)..."
	@mkdir -p $(JAILER_LOCAL_DIR)
	s3cmd get $(S3_BUCKET)/latest-recent.sql.gz $(JAILER_LOCAL_DIR)/latest-recent.sql.gz --force
	@echo "✅ Downloaded: $$(du -h $(JAILER_LOCAL_DIR)/latest-recent.sql.gz | cut -f1)"
	@echo "Decompressing..."
	pigz -d -k -f $(JAILER_LOCAL_DIR)/latest-recent.sql.gz
	@echo "Importing to local database..."
	./scripts/database/import.sh \
		-d $${DB_DATABASE:-ixdf} \
		-u root \
		-p $${DB_PASSWORD:-password} \
		-h 127.0.0.1 \
		-P $${FORWARD_DB_PORT:-3306} \
		-s $(JAILER_LOCAL_DIR) \
		-f latest-recent.sql
	@echo "✅ Recent subset imported!"

3.4 Update .gitignore

bash
echo "tools/jailer/subsets/" >> .gitignore

3.5 Developer Workflow

bash
# List available subsets
make db:jailer:list

# Download and import small subset (quickest, for quick tests)
make db:update:jailer:small

# Download and import medium subset (balanced, daily work)
make db:update:jailer:medium

# Download and import recent subset (most complete, specific features)
make db:update:jailer:recent

Phase 4: Monitoring & Maintenance

4.1 Staging Server Monitoring

Create monitoring script: /opt/jailer-dumps/scripts/check-health.sh

bash
#!/bin/bash

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BASE_DIR="$(dirname "$SCRIPT_DIR")"
S3_BUCKET="s3://ixdf-jailer-dumps"

echo "==================================="
echo "Jailer Health Check"
echo "==================================="
echo ""

# Check last successful run
LAST_LOG=$(ls -t "$BASE_DIR"/logs/daily-refresh-*.log 2>/dev/null | head -1)
if [ -n "$LAST_LOG" ]; then
    echo "Last run log: $LAST_LOG"
    if grep -q "✅ DAILY REFRESH COMPLETE" "$LAST_LOG"; then
        echo "Status: ✅ SUCCESS"
        echo "Completed: $(grep 'Finished:' "$LAST_LOG" | tail -1)"
    else
        echo "Status: ❌ FAILED"
        echo "Check log for errors: $LAST_LOG"
    fi
else
    echo "Status: ⚠️  No logs found"
fi
echo ""

# Check Spaces uploads
echo "Latest files in Spaces:"
s3cmd ls "$S3_BUCKET/" | grep 'latest-' || echo "No files found"
echo ""

# Check disk space
echo "Disk usage:"
df -h /opt/jailer-dumps
echo ""

# Check staging DB connection
echo "Staging DB connection:"
if mysql --defaults-extra-file="$BASE_DIR/.my.cnf" -e "SELECT 1" &>/dev/null; then
    echo "✅ Connected"
else
    echo "❌ Failed to connect"
fi
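
The check can also run on a schedule; a possible root crontab entry (the time and log path here are suggestions, not part of the pipeline above):

bash
# Run the health check every morning at 09:00 UTC and keep the output.
0 9 * * * /opt/jailer-dumps/scripts/check-health.sh >> /opt/jailer-dumps/logs/health.log 2>&1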

4.2 Alerting

Slack notifications (optional):

Add to daily-refresh.sh and register it as an EXIT trap. Because the script uses set -e, a plain exit-code check at the end of the file would only ever run on success; the trap fires on both success and failure:

bash
notify() {
    local exit_code=$?
    if [ "$exit_code" -eq 0 ]; then
        STATUS="✅ SUCCESS"
        COLOR="good"
    else
        STATUS="❌ FAILED"
        COLOR="danger"
    fi

    curl -X POST https://hooks.slack.com/services/YOUR/WEBHOOK/URL \
        -H 'Content-Type: application/json' \
        -d "{
            \"attachments\": [{
                \"color\": \"$COLOR\",
                \"title\": \"Jailer Daily Refresh\",
                \"text\": \"Status: $STATUS\",
                \"fields\": [
                    {\"title\": \"Date\", \"value\": \"$(date)\", \"short\": true},
                    {\"title\": \"Duration\", \"value\": \"${SECONDS}s\", \"short\": true}
                ]
            }]
        }"
}
trap notify EXIT

4.3 Log Rotation

Create: /etc/logrotate.d/jailer-dumps

text
/opt/jailer-dumps/logs/*.log {
    daily
    rotate 30
    compress
    delaycompress
    notifempty
    create 0644 root root
    sharedscripts
}
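
A dry run shows what the rule would do without touching any files, which is a quick way to validate the syntax and the glob:

bash
# Debug mode: prints the actions logrotate would take for this rule.
sudo logrotate -d /etc/logrotate.d/jailer-dumps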

4.4 Backup Validation

The daily staging import serves as backup validation:

  • If import fails → backup is corrupted
  • If import succeeds → backup is valid
  • Automatically tests restore procedure

Add validation to import script:

bash
# After import, run basic sanity checks
echo "Running sanity checks..."

# Check table count
TABLE_COUNT=$(mysql --defaults-extra-file="$BASE_DIR/.my.cnf" "$DB_NAME" -N -e "
    SELECT COUNT(*) FROM information_schema.TABLES WHERE TABLE_SCHEMA='$DB_NAME'
")

if [ "$TABLE_COUNT" -lt 100 ]; then
    echo "❌ ERROR: Only $TABLE_COUNT tables imported (expected 100+)"
    exit 1
fi

# Check member count
MEMBER_COUNT=$(mysql --defaults-extra-file="$BASE_DIR/.my.cnf" "$DB_NAME" -N -e "
    SELECT COUNT(*) FROM member
")

if [ "$MEMBER_COUNT" -lt 1000 ]; then
    echo "❌ ERROR: Only $MEMBER_COUNT members (expected 1000+)"
    exit 1
fi

echo "✅ Sanity checks passed"
echo "  - Tables: $TABLE_COUNT"
echo "  - Members: $MEMBER_COUNT"

Phase 5: Team Onboarding

5.1 Developer Documentation

Update: README.md

Add the following section:

Database Setup

We use Jailer subsets for fast local development (2GB instead of 30GB).

Quick Start

bash
# First time only: install dependencies
brew install pigz pv s3cmd
s3cmd --configure  # Get credentials from 1Password

# List available subsets
make db:jailer:list

# Download and import (recommended for daily work)
make db:update:jailer:medium

Subset Profiles

Profile   Size     Use Case                 Command
Small     ~300MB   Quick tests, bug fixes   make db:update:jailer:small
Medium    ~2GB     Daily development        make db:update:jailer:medium
Recent    ~3GB     Recent data (6 months)   make db:update:jailer:recent
Full      ~30GB    Rare, special cases      make db:update

Import Times

  • Small: ~2 minutes
  • Medium: ~5 minutes
  • Recent: ~8 minutes
  • Full: ~25 minutes

Freshness

Jailer subsets are regenerated daily at 3 AM UTC from production data.

5.2 DigitalOcean Spaces Access

Create read-only access keys for developers:

  1. Go to DigitalOcean → API → Spaces Keys
  2. Create new key: "Dev Team - Jailer Subsets (Read Only)"
  3. Store in 1Password vault: "Engineering / Jailer Subsets Access"
  4. Share with team

Key configuration in s3cmd:

text
Access Key: <READ_ONLY_KEY>
Secret Key: <READ_ONLY_SECRET>
Region: nyc3
Endpoint: nyc3.digitaloceanspaces.com

5.3 Troubleshooting Guide

Update: docs/troubleshooting-jailer.md

Create new file with the following content:

Jailer Troubleshooting

"s3cmd: command not found"

Install s3cmd:

bash
brew install s3cmd

"403 Forbidden" when downloading

Reconfigure s3cmd with correct credentials:

bash
s3cmd --configure
# Get keys from 1Password: "Engineering / Jailer Subsets Access"

"Import failed: foreign key constraint"

Jailer maintains referential integrity, so this shouldn't happen. If it does:

  1. Check which table failed
  2. Try full dump instead: make db:update
  3. Report to #backend channel

Subset too old

Check when last generated:

bash
make db:jailer:list
# Shows file dates

If older than 2 days, check staging server:

bash
ssh staging-server
sudo /opt/jailer-dumps/scripts/check-health.sh

Need specific data not in subset

Options:

  1. Use larger subset: make db:update:jailer:recent
  2. Use full dump: make db:update
  3. Request custom subset from DevOps

Phase 6: Rollout Plan

Week 1: Staging Setup

  • [ ] Install Jailer on staging server
  • [ ] Configure MySQL connections
  • [ ] Set up DigitalOcean Spaces
  • [ ] Create all scripts
  • [ ] Test manual execution
  • [ ] Validate dump integrity

Week 2: Automation

  • [ ] Set up cron job
  • [ ] Monitor first automated run
  • [ ] Verify Spaces uploads
  • [ ] Test downloads from dev machine
  • [ ] Set up alerting

Week 3: Beta Testing

  • [ ] Share with 2-3 developers
  • [ ] Test all three profiles
  • [ ] Gather feedback
  • [ ] Fix issues
  • [ ] Update documentation

Week 4: Team Rollout

  • [ ] Create onboarding video
  • [ ] Share credentials via 1Password
  • [ ] Update team documentation
  • [ ] Announce in #backend channel
  • [ ] Provide support during transition

Success Metrics

Performance

Metric                 Current   Target    Measured
Dev import time        25 mins   5 mins    Weekly avg
Staging refresh time   N/A       60 mins   Daily log
Download time          N/A       2 mins    Per subset
Disk usage (dev)       60GB      5GB       Per machine

Reliability

Metric                               Target
Daily refresh success rate           >95%
Backup validation (staging import)   100%
Subset integrity (FK checks)         100%

Adoption

Metric                           Target
% devs using Jailer subsets      >80%
Avg subset age when downloaded   <24 hours
Support tickets                  <2/month

Costs

DigitalOcean Spaces

Storage:

  • Small subset: 300MB × 30 days = 9GB
  • Medium subset: 2GB × 30 days = 60GB
  • Recent subset: 3GB × 30 days = 90GB
  • Total: ~160GB, within the base Spaces subscription ($5/month includes 250GB of storage)

Bandwidth:

  • 10 devs × 3 downloads/week × 2GB = 240GB/month
  • Cost: included in the base subscription (first 1TB of outbound transfer)

Total: ~$5/month

Staging Server Time

Daily task:

  • Import: 30 mins
  • Jailer extraction: 20 mins
  • Upload: 5 mins
  • Total: ~55 mins/day CPU time

Cost: Negligible (runs during off-hours)


Appendix: Advanced Topics

A. Custom Subset Profiles

Create custom extraction for specific use cases:

File: /opt/jailer-dumps/config/profiles/specific-course.sh

bash
# Extract specific course and all related data
COURSE_ID=123

jailer export \
    -datamodel "$MODEL_DIR" \
    -jdbcjar /opt/jailer/lib/mysql-connector-java.jar \
    -url "$JDBC_URL" \
    -user "$DB_USER" \
    -password "$DB_PASS" \
    -format SQL \
    -e "course__courses" \
    -where "id = $COURSE_ID" \
    -d "$DUMP_DIR/course-${COURSE_ID}.sql"

B. Parallel Extraction

Speed up subset generation by running profiles in parallel:

bash
# In 3-extract-jailer-subsets.sh, wrap each profile's export in a small
# function (extract_small, extract_medium, extract_recent) and run them
# as background jobs:
(extract_small) &
(extract_medium) &
(extract_recent) &
wait  # Wait for all three background jobs to complete
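
One caveat: a bare wait returns 0 even if a background job failed, so with set -e a broken extraction would go unnoticed. The sketch below (assuming the extract_* wrapper functions described above) collects the job PIDs and waits on each one, so any failed profile aborts the script:

bash
# Run the three profiles concurrently and fail if any of them fails.
pids=()
(extract_small)  & pids+=($!)
(extract_medium) & pids+=($!)
(extract_recent) & pids+=($!)

for pid in "${pids[@]}"; do
    wait "$pid"   # a non-zero exit here aborts the script via `set -e`
done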

C. Incremental Updates

For very large databases, consider extracting only new data:

bash
# Store last extraction timestamp (fall back to the epoch on the first run)
LAST_RUN=$(cat "$BASE_DIR/.last-extraction-timestamp" 2>/dev/null || echo "1970-01-01")

# Extract only new/updated records
jailer export \
    -e "member" \
    -where "updated_at > '$LAST_RUN'" \
    ...

# Update timestamp
date -Iseconds > "$BASE_DIR/.last-extraction-timestamp"

D. Multi-Region Distribution

For distributed teams, replicate Spaces to multiple regions:

bash
# In the upload script, also publish the subsets to an EU-region bucket.
# Note: the two buckets live on different Spaces endpoints, so a single s3cmd
# config cannot copy between them directly; in practice this means downloading
# and re-uploading, or using a tool such as rclone configured with both regions.
s3cmd sync s3://ixdf-jailer-dumps-us/ s3://ixdf-jailer-dumps-eu/

Summary

This plan provides:

  1. Daily backup validation - Staging imports test production backups
  2. Fresh staging data - Staging stays relevant with daily refresh
  3. Fast dev imports - 25min → 5min with Jailer subsets
  4. Multiple profiles - Small, medium, recent for different use cases
  5. Automated pipeline - Cron job handles entire workflow
  6. Easy distribution - DigitalOcean Spaces for downloads
  7. Monitoring & alerts - Health checks and notifications
  8. Cost effective - ~$5/month total costs

Timeline: 4 weeks from setup to full team adoption
ROI: 20 mins saved per dev per week × 10 devs = 3.3 hours/week saved