Database Maintenance¶

This guide covers database maintenance tasks for Ibutsu, including cleanup policies, vacuum operations, and storage optimization.

Cleanup Tasks¶

Ibutsu uses Celery Beat scheduled tasks to automatically clean up old data from the database. These tasks help manage database size and performance.

Import Files Cleanup¶

Task: prune_old_import_files

Schedule: Daily at 2 AM

Default Retention: 7 days

Import files contain the raw uploaded content (JUnit XML, tar.gz archives) which can be large. After the import task completes, the file content is automatically cleared to save space, but the import record is kept for audit purposes. This cleanup task removes both the import records and their associated files after the retention period.

Configuration:

The retention period can be adjusted in celery_utils.py:

"prune-old-import-files": {
    "task": "ibutsu_server.tasks.db.prune_old_import_files",
    "schedule": crontab(minute=0, hour=2),  # 2 am daily
    "args": (7,),  # delete any import files older than 7 days
}

Storage Optimization:

After an import completes successfully, the clear_import_file_content task automatically clears the binary content from the import_files table while keeping the import record for history. This significantly reduces database storage requirements.

Artifact Files Cleanup¶

Task: prune_old_files

Schedule: Weekly on Saturday at 4 AM

Default Retention: 3 months

Deletes artifact files (test logs, screenshots, etc.) older than the specified retention period.

Results Cleanup¶

Task: prune_old_results

Schedule: Weekly on Saturday at 5 AM

Default Retention: 5 months

Removes test result records older than the specified retention period.

Important: The retention period for results must be greater than the retention period for artifact files to avoid foreign key constraint errors.

Runs Cleanup¶

Task: prune_old_runs

Schedule: Weekly on Saturday at 6 AM

Default Retention: 12 months

Removes test run records older than the specified retention period.

Important: The retention period for runs must be greater than the retention period for results to avoid foreign key constraint errors.

Database Vacuum¶

PostgreSQL VACUUM is essential for reclaiming storage space and maintaining database performance after large deletions.

Why VACUUM is Important¶

PostgreSQL uses Multi-Version Concurrency Control (MVCC), which means deleted rows are not immediately removed from disk. Instead, they are marked as “dead tuples” and remain until a VACUUM operation reclaims the space.

After running cleanup tasks that delete large amounts of data, you should run VACUUM to:

Reclaim disk space - Remove dead tuples and return space to the operating system
Update statistics - Help the query planner make better decisions
Prevent transaction ID wraparound - Essential for database health

VACUUM Operations¶

Standard VACUUM:

VACUUM (VERBOSE);

Reclaims space but doesn’t return it to the OS
Can run concurrently with normal operations
Recommended for regular maintenance

VACUUM FULL:

VACUUM FULL (VERBOSE);

Reclaims space and returns it to the OS
Requires an exclusive lock on tables (blocks all operations)
Use only during maintenance windows
Rewrites entire tables, which can take significant time

Automated VACUUM¶

Ibutsu deployments include a CronJob for automated VACUUM operations:

iqe-keeper (Production):

Located in ibutsu/prod/database_vacuum.yaml:

schedule: 1 1 * * *  # Daily at 1:01 AM

OCP Templates:

Located in ocp-templates/*/postgres.yaml:

schedule: ${VACUUM_SCHEDULE}  # Default: 1 1 * * * (Daily at 1:01 AM)

The automated VACUUM runs after the cleanup tasks complete, ensuring efficient space reclamation.

Manual VACUUM¶

To manually run VACUUM on specific tables:

# Connect to the database
psql -h $PGHOST -U $PGUSER -d $PGDATABASE

# VACUUM specific tables
VACUUM (VERBOSE, ANALYZE) imports;
VACUUM (VERBOSE, ANALYZE) import_files;
VACUUM (VERBOSE, ANALYZE) artifacts;
VACUUM (VERBOSE, ANALYZE) results;
VACUUM (VERBOSE, ANALYZE) runs;

# Or VACUUM the entire database
VACUUM (VERBOSE, ANALYZE);

Monitoring VACUUM¶

Check when tables were last vacuumed:

SELECT
    schemaname,
    relname,
    last_vacuum,
    last_autovacuum,
    n_dead_tup,
    n_live_tup
FROM pg_stat_user_tables
WHERE schemaname = 'public'
ORDER BY n_dead_tup DESC;

Storage Optimization Best Practices¶

Run cleanup tasks during low-traffic periods

The default schedule runs cleanup tasks on Saturday mornings when usage is typically lower.

Monitor database size

SELECT
    pg_size_pretty(pg_database_size('ibutsu')) as database_size;

SELECT
    relname as table_name,
    pg_size_pretty(pg_total_relation_size(relid)) as total_size,
    pg_size_pretty(pg_relation_size(relid)) as table_size,
    pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as index_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;

Adjust retention periods based on needs

Shorter retention periods reduce storage requirements but may limit historical analysis.
Schedule VACUUM after cleanup tasks

The automated VACUUM runs at 1 AM daily, before the 2 AM import cleanup. For optimal space reclamation, consider adjusting the VACUUM schedule to run after cleanup tasks (e.g., at 3 AM).
Consider VACUUM FULL for major cleanups

If you significantly reduce retention periods or perform a one-time large deletion, schedule a maintenance window for VACUUM FULL to reclaim maximum space.

Import Record Lifecycle¶

Understanding the lifecycle of import records helps optimize storage:

Upload - User uploads a file (XML or tar.gz)
- imports record created with status='pending'
- import_files record created with full binary content
- created timestamp set automatically
Processing - Celery worker processes the import
- Status updated to status='running'
- Test results and artifacts extracted and stored
- Run records created
Completion - Import finishes
- Status updated to status='done' or status='error'
- clear_import_file_content task automatically clears binary content
- Import record kept for audit trail
Cleanup - After retention period (7 days)
- prune_old_import_files task deletes import record
- Cascade delete removes associated import_files record
- VACUUM reclaims disk space

Database Schema Considerations¶

Import Tables¶

imports table:

id (UUID) - Primary key
filename (TEXT) - Original filename
format (TEXT) - File format (junit, archive)
status (TEXT) - Import status (pending, running, done, error)
created (TIMESTAMP) - Creation time (NOT NULL, indexed)
data (JSONB) - Metadata about the import

import_files table:

id (UUID) - Primary key
import_id (UUID) - Foreign key to imports (NOT NULL, indexed)
content (BYTEA) - Binary file content (cleared after processing)

The import_id foreign key ensures referential integrity and enables cascade deletion.

Indexes¶

The created field on the imports table is indexed to optimize the cleanup query:

CREATE INDEX ix_imports_created ON imports (created);

This index makes the age-based deletion query efficient even with millions of records.

Troubleshooting¶

Import Cleanup Not Running¶

Check if the scheduler is running:

# In Kubernetes/OpenShift
kubectl get pods -l component=scheduler

Check Celery Beat logs:

kubectl logs -l component=scheduler -c scheduler

Verify the task is registered:

from ibutsu_server.tasks.db import prune_old_import_files
print(prune_old_import_files)

Database Size Not Decreasing¶

Check if VACUUM is running:
```
kubectl get cronjobs
kubectl get jobs
```

Manually run VACUUM:

kubectl exec -it postgresql-0 -- psql -U $PGUSER -d $PGDATABASE -c "VACUUM (VERBOSE, ANALYZE);"

Consider VACUUM FULL during a maintenance window:

# WARNING: This locks tables and can take hours on large databases
kubectl exec -it postgresql-0 -- psql -U $PGUSER -d $PGDATABASE -c "VACUUM FULL (VERBOSE);"

Foreign Key Constraint Errors¶

If you see foreign key constraint errors during cleanup:

Ensure cleanup tasks run in the correct order:
- Import files (7 days)
- Artifact files (3 months)
- Results (5 months)
- Runs (12 months)

Check for orphaned records:

-- Find import_files without imports
SELECT COUNT(*) FROM import_files
WHERE import_id NOT IN (SELECT id FROM imports);

The foreign key relationship should handle cascades automatically. If not, check the migration:

sa.ForeignKeyConstraint(
    ["import_id"],
    ["imports.id"],
    ondelete="CASCADE",
)