Integrity Check (fsck)
Overview#
The ncps fsck command checks for consistency issues between the database and storage backend. Over time, database records and physical storage files can drift apart due to server crashes, manual file deletions, failed migrations, or other unexpected events. Running fsck detects and optionally repairs these inconsistencies.
What fsck Checks#
Standard Checks (always performed)#
| Issue | Description |
|---|---|
| Narinfos without nar_files | Narinfo records in the database that have no linked nar_file entry |
| Orphaned nar_files (DB only) | nar_file records in the database not linked to any narinfo |
| Nar_files missing from storage | nar_file records in the database whose physical file is absent from storage |
| Orphaned NAR files in storage | NAR files on disk or in S3 that have no corresponding database record |
CDC Checks (when CDC is enabled)#
| Issue | Description |
|---|---|
| Orphaned chunks (DB only) | Chunk records in the database not linked to any nar_file |
| Chunks missing from storage | Chunk records in the database whose physical chunk file is absent |
| Orphaned chunk files | Chunk files in storage that have no corresponding database record |
How fsck Works#
fsck runs in three phases:
- Phase 1 – Collect suspects: Runs database queries and walks storage to identify potential issues.
- Phase 2 – Re-verify: Each suspected issue is individually re-checked to filter out items that were in-flight (being added or removed concurrently). This prevents false positives from in-progress cache operations.
- Phase 3 – Repair (optional): Re-verifies each item one final time before deleting, then removes all confirmed issues.
The double re-verify design means fsck is safe to run against a live cache without taking it offline.
Usage#
Check Only (Report Mode)#
Run fsck without any flags to detect issues and print a summary. If issues are found, you will be prompted to confirm repair:
ncps fsck \
--cache-database-url="sqlite:/var/lib/ncps/db.sqlite" \
--cache-storage-local="/var/lib/ncps"Example output when issues are found:
ncps fsck summary
=================
Narinfos without nar_files: 0
Orphaned nar_files (DB only): 0
Nar_files missing from storage: 1
Orphaned NAR files in storage: 5
-----------------
Total issues: 6
Repair all issues? [y/N]:Example output when everything is consistent:
ncps fsck summary
=================
Narinfos without nar_files: 0
Orphaned nar_files (DB only): 0
Nar_files missing from storage: 0
Orphaned NAR files in storage: 0
-----------------
Total issues: 0
All checks passed.Automatic Repair#
Use --repair to fix all issues without a prompt:
ncps fsck \
--cache-database-url="sqlite:/var/lib/ncps/db.sqlite" \
--cache-storage-local="/var/lib/ncps" \
--repairDry Run#
Use --dry-run to print the summary of issues without making any changes:
ncps fsck \
--cache-database-url="sqlite:/var/lib/ncps/db.sqlite" \
--cache-storage-local="/var/lib/ncps" \
--dry-runWhen issues are found in dry-run mode, the command exits with a non-zero status so it can be used in scripts.
S3 Storage#
ncps fsck \
--cache-database-url="postgresql://user:pass@localhost/ncps" \
--cache-storage-s3-bucket="ncps-cache" \
--cache-storage-s3-endpoint="https://s3.amazonaws.com" \
--cache-storage-s3-region="us-east-1" \
--cache-storage-s3-access-key-id="..." \
--cache-storage-s3-secret-access-key="..."PostgreSQL or MySQL#
ncps fsck \
--cache-database-url="postgresql://user:pass@localhost/ncps" \
--cache-storage-local="/var/lib/ncps"Flags Reference#
Core Flags#
| Flag | Description |
|---|---|
--repair |
Automatically fix all detected issues |
--dry-run |
Show what would be fixed without making any changes |
Storage Flags#
| Flag | Description |
|---|---|
--cache-storage-local |
Path to local cache storage directory |
--cache-storage-s3-bucket |
S3 bucket name |
--cache-storage-s3-endpoint |
S3-compatible endpoint URL |
--cache-storage-s3-region |
S3 region (optional) |
--cache-storage-s3-access-key-id |
S3 access key ID |
--cache-storage-s3-secret-access-key |
S3 secret access key |
--cache-storage-s3-force-path-style |
Force path-style S3 addressing |
Database Flags#
| Flag | Default | Description |
|---|---|---|
--cache-database-url |
(required) | Database URL (sqlite:, postgresql://, or mysql://) |
--cache-database-pool-max-open-conns |
— | Maximum open database connections |
--cache-database-pool-max-idle-conns |
— | Maximum idle database connections |
Distributed Locking Flags (optional)#
These flags are optional and only needed if you want fsck to use distributed (Redis-backed) locks during the check. For standalone use, local locking is used by default.
| Flag | Default | Description |
|---|---|---|
--cache-redis-addrs |
— | Redis server addresses (enables distributed locking) |
--cache-redis-username |
— | Redis username |
--cache-redis-password |
— | Redis password |
--cache-redis-db |
0 |
Redis database number |
--cache-redis-use-tls |
false |
Use TLS for Redis connections |
--cache-redis-pool-size |
10 |
Redis connection pool size |
--cache-lock-backend |
local |
Lock backend: local or redis |
--cache-lock-allow-degraded-mode |
false |
Fall back to local locks if Redis is unavailable |
Repair Behaviour#
When --repair is used (or confirmed interactively), fsck deletes the following:
| Issue | Action |
|---|---|
| Narinfos without nar_files | Delete the narinfo DB record (and its references/signatures via cascade) |
| Orphaned nar_files (DB only) | Delete the nar_file DB record |
| Nar_files missing from storage | Delete the nar_file DB record; also deletes any narinfo that becomes orphaned as a result (cascade cleanup — no second run needed) |
| Orphaned NAR files in storage | Delete the physical file from storage |
| [CDC] Orphaned chunks (DB only) | Delete the chunk DB record |
| [CDC] Chunks missing from storage | Delete the chunk DB record |
| [CDC] Orphaned chunk files | Delete the physical chunk file from storage |
Note: Repair does not recover missing data — it removes the inconsistent records. If you need to recover missing NAR files, restore from a backup before running repair.
Exit Codes#
| Exit Code | Meaning |
|---|---|
0 |
All checks passed (or repair completed successfully) |
| Non-zero | Issues were found and either --dry-run was used, the prompt was answered with N, or an error occurred |
Scheduling Regular Checks#
For production deployments it is good practice to run fsck periodically as a health check:
# Daily check — alert if issues found (no repair)
ncps fsck \
--cache-database-url="$CACHE_DATABASE_URL" \
--cache-storage-local="$CACHE_STORAGE_LOCAL" \
--dry-run
# Weekly automated repair
ncps fsck \
--cache-database-url="$CACHE_DATABASE_URL" \
--cache-storage-local="$CACHE_STORAGE_LOCAL" \
--repairAs a systemd timer:
[Unit]
Description=ncps integrity check
[Service]
Type=oneshot
ExecStart=ncps fsck \
--cache-database-url=sqlite:/var/lib/ncps/db.sqlite \
--cache-storage-local=/var/lib/ncps \
--dry-run
[Install]
WantedBy=timers.targetTroubleshooting#
Large number of orphaned NAR files in storage#
Possible causes:
- The server was stopped mid-write (NAR file written, DB record not yet committed)
- Manual file operations on the storage directory
- A failed migration left files behind
Action: Run with --repair to remove the orphaned files. The re-verify phase guards against removing in-flight writes.
Nar_files missing from storage#
Possible causes:
- Files were deleted manually from the storage directory or S3 bucket
- A storage backend failure caused partial data loss
Action: If backups are available, restore the missing files before running repair. Otherwise, run with --repair to remove the orphaned DB records. Any narinfos that become orphaned as a result are also cleaned up automatically in the same repair pass.
fsck is slow on large caches#
Causes: Walking all NAR files in storage (phase 1d) and checking each DB entry against storage (phase 1c) are O(n) operations.
Tips:
- Run during low-traffic periods
- For S3 storage, fsck performs one
HeadObjectper NAR file — this is billed per request - For very large caches, consider running the check against a read replica for the database checks
CDC checks not appearing in summary#
CDC checks are only performed when CDC is enabled in the database configuration. If you see only 4 rows in the summary (no [CDC] lines), CDC is disabled. Enable it via the serve command with the appropriate --cache-cdc-* flags.
Related Documentation#
- Backup Restore - Back up before repairing data loss
- NarInfo Migration - Migrate narinfo metadata to database
- NAR to Chunks Migration - Migrate NAR files to content-defined chunks
- Database - Database configuration
- Storage - Storage backend configuration