
Lake maintenance

A lakehouse degrades without upkeep. Streaming-sized commits pile up small files, merge-on-read folds accumulate delete files that slow every scan, snapshots and manifests grow without bound, and a crash can strand files no commit ever adopted. Modak owns this lifecycle by default: every table gets a maintenance pass on a schedule, driven by a per-table policy, inside safety bounds only Modak can compute.

Maintenance is also optional. Every pass can be disabled per table or fleet-wide, so a team that already runs its own compaction keeps doing exactly that. What is not optional is the safety story below.

Who owns what

The design splits ownership along the same seam as the rest of Modak:

  • Modak owns scheduling, policy, and safety. It decides when a pass runs, resolves the settings, and computes two bounds every pass must honor: no snapshot at or above the oldest pinned reader horizon may expire (pinned reads scan an old metadata_location whose files must stay put), and staged Stream Load files awaiting adoption must never be deleted.
  • The format plugin owns the work. Which operations exist, what the policy keys mean, and what a pass reports are Iceberg's business today and a future format's tomorrow. Nothing in the catalog, console, or metrics assumes Iceberg's vocabulary.

What must stay with Modak

The dividing line is whether an operation deletes files:

| Operation | Externally safe? | Why |
|---|---|---|
| Small-file bin-pack | Yes | Snapshot-additive, old snapshots stay readable |
| Delete-debt compaction | Yes | Snapshot-additive |
| Manifest rewrite | Yes | Snapshot-additive, metadata only |
| Snapshot expiry | No | Deletes data files. Only Modak knows the pinned reader horizon |
| Orphan file removal | No | Deletes files by listing. Only Modak knows which staged loads await adoption |

Never run an external expire_snapshots or remove_orphan_files

Spark's and Trino's maintenance procedures do not know about Modak's read pins or staged loads. An external expiry can delete data files a pinned reader is scanning mid-query, and an external orphan sweep can delete staged files a load is about to adopt. Keep both with Modak, whatever else you disable.

Bringing your own maintenance

If you already run compaction through Spark, Amoro, or a catalog service, disable the rewrite passes and let Modak keep only the file-deleting ones:

modak-worker policy --table public.events \
  --set rewrite_enabled=false \
  --set delete_compaction_enabled=false \
  --set manifest_rewrite_enabled=false

Snapshot expiry stays on and pin-safe. Disabling it too (snapshot_expiry_enabled=false) is allowed, but then nothing expires snapshots for that table and no external tool can do so safely, so plan for metadata and storage growth.

The longer-term answer is the engine seam: maintenance runs as a MaintenancePlan handed to a MaintenanceEngine, and the plan carries the safety bounds. MODAK_MAINTENANCE_ENGINE=embedded (the in-worker engine) is the only engine so far. An external engine such as Spark would receive the same plan and the same bounds, which is the supported way to move the heavy lifting out of the worker without giving up safety.
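To make the seam concrete, here is a minimal sketch of a plan that carries its bounds to whichever engine runs it. Only the names MaintenancePlan and MaintenanceEngine come from this page. SafetyBounds, every field name, the run method, and DryRunEngine are assumptions made for illustration, not Modak's actual types.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Protocol


@dataclass(frozen=True)
class SafetyBounds:
    """Hypothetical container for the two bounds Modak computes."""
    # Snapshots at or above this commit sequence must not expire (None: no pins).
    pinned_horizon_sequence: Optional[int]
    # Staged load files awaiting adoption; never deletable.
    protected_staged_files: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class MaintenancePlan:
    """Hypothetical plan shape: table, resolved policy, and the bounds."""
    table: str
    policy: Dict[str, object]
    bounds: SafetyBounds


class MaintenanceEngine(Protocol):
    """Any engine, embedded or external, receives the same plan."""
    def run(self, plan: MaintenancePlan) -> Dict[str, int]: ...


class DryRunEngine:
    """Stand-in engine: does no work, only reports the bounds it was handed."""
    def run(self, plan: MaintenancePlan) -> Dict[str, int]:
        return {
            "has_pinned_horizon": int(plan.bounds.pinned_horizon_sequence is not None),
            "protected_staged_files": len(plan.bounds.protected_staged_files),
        }


def run_pass(engine: MaintenanceEngine, plan: MaintenancePlan) -> Dict[str, int]:
    # The scheduler side stays the same whichever engine is plugged in.
    return engine.run(plan)
```

The point of the shape is that an engine never computes the bounds itself. It only receives them, which is what would let an external engine stay pin-safe.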

Enabling and disabling

Everything layers: worker env defaults first, then the table's policy.

| Setting | Default | Scope |
|---|---|---|
| MODAK_MAINTENANCE_ENABLED | true | Fleet-wide default for the whole pass |
| maintenance_enabled | env | Per table, the whole pass |
| rewrite_enabled | true | Per table or env-defaulted, small-file bin-pack |
| delete_compaction_enabled | true | Delete-debt compaction |
| manifest_rewrite_enabled | true | Manifest rewrite |
| snapshot_expiry_enabled | true | Snapshot expiry |
| orphan_sweep_enabled | false | Listing-based orphan sweep, opt-in |

Disabling maintenance does not disable monitoring or Modak's own bookkeeping: lake health collection keeps running, and staged files of failed loads are still cleaned up from the load journal (that is journal-driven, not listing-driven, and touches only files Modak itself staged).

Policy

Per-table settings live in the catalog (modak.tables.maintenance_policy) and are edited with the policy command:

modak-worker policy --table public.events                                # view
modak-worker policy --table public.events --set snapshot_retention_hours=6
modak-worker policy --table public.events --unset snapshot_retention_hours
modak-worker policy --table public.events --reset                        # back to defaults

The view marks each setting (table) or (default) so the resolution is never a guess. The full key set the Iceberg plugin understands:

| Key | Default | Meaning |
|---|---|---|
| maintenance_enabled | env | Master switch for the table's maintenance pass |
| rewrite_enabled | true | Small-file bin-pack on/off |
| rewrite_target_bytes | env | Data files smaller than this are bin-pack candidates |
| rewrite_min_input_files | env | Small files that must accumulate before a rewrite runs |
| delete_compaction_enabled | true | Delete-debt compaction on/off |
| delete_compaction_min_deletes | 1 | Delete files a data file must carry before it is rewritten |
| manifest_rewrite_enabled | true | Manifest rewrite on/off |
| manifest_rewrite_min_manifests | 100 | Manifest count that triggers a manifest rewrite |
| snapshot_expiry_enabled | true | Snapshot expiry on/off |
| snapshot_retention_hours | env | Snapshots older than this are expirable |
| snapshot_min_retained | env | Snapshots always kept, regardless of age |
| orphan_sweep_enabled | false | Opt-in listing-based orphan file sweep |
| orphan_grace_hours | 72 | Age before an unreferenced or failed-load file may be deleted |

Env defaults for the env rows are in the configuration reference.
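The layering (defaults first, then the table's policy) and the (table) / (default) marks in the view can be pictured with a small sketch. The function name, the dictionary shapes, and the env values in the usage note are assumptions for illustration. Only the keys and the fixed defaults are taken from the table above.

```python
from typing import Dict, Tuple

# Fixed defaults copied from the key table; "env" rows are supplied separately.
FIXED_DEFAULTS: Dict[str, object] = {
    "rewrite_enabled": True,
    "delete_compaction_enabled": True,
    "delete_compaction_min_deletes": 1,
    "manifest_rewrite_enabled": True,
    "manifest_rewrite_min_manifests": 100,
    "snapshot_expiry_enabled": True,
    "orphan_sweep_enabled": False,
    "orphan_grace_hours": 72,
}


def resolve_policy(
    env_defaults: Dict[str, object],
    table_policy: Dict[str, object],
) -> Dict[str, Tuple[object, str]]:
    """Return key -> (value, source), where source is "table" or "default"."""
    resolved: Dict[str, Tuple[object, str]] = {}
    # Lowest layer: fixed and env-supplied defaults.
    for key, value in {**FIXED_DEFAULTS, **env_defaults}.items():
        resolved[key] = (value, "default")
    # Top layer: anything set explicitly on the table wins.
    for key, value in table_policy.items():
        resolved[key] = (value, "table")
    return resolved
```

With made-up env values `{"maintenance_enabled": True, "snapshot_retention_hours": 24}` and a table policy of `{"snapshot_retention_hours": 6}`, the retention resolves to `(6, "table")` while orphan_grace_hours stays `(72, "default")`. Unsetting the key on the table would drop it back to the env value.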

What a pass does (Iceberg)

Every table gets a pass every MODAK_MAINTENANCE_INTERVAL_SECONDS (default hourly). Mirrored tables need it most, since the pump commits one snapshot per flush.

Delete-debt compaction: a data file carrying at least delete_compaction_min_deletes delete files is rewritten with the deletes applied, and delete files no surviving data file needs are dropped. Merge-mode folds write equality deletes, so mirrored and keep-heap tables build this debt in normal operation.
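As a rough illustration of the selection rule, the sketch below models each delete file as the set of data files it applies to. That model is a simplification assumed for the example (real Iceberg delete scoping is more involved), and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def plan_delete_compaction(
    applies_to: Dict[str, Set[str]],
    min_deletes: int = 1,
) -> Tuple[List[str], List[str]]:
    """applies_to maps a delete file to the data files it applies to.

    Returns (data files to rewrite, delete files that can then be dropped).
    """
    # Count how many delete files each data file carries.
    carried = Counter()
    for data_files in applies_to.values():
        for data_file in data_files:
            carried[data_file] += 1

    # A data file is rewritten once it carries at least min_deletes delete files.
    rewrite = {d for d, n in carried.items() if n >= min_deletes}

    # A delete file is droppable only if every data file it applies to was
    # rewritten, because a surviving, un-rewritten data file still needs it.
    droppable = {
        delete_file
        for delete_file, data_files in applies_to.items()
        if data_files <= rewrite
    }
    return sorted(rewrite), sorted(droppable)
```

For example, with `{"d1": {"a"}, "d2": {"a", "b"}}` and `min_deletes=2`, only data file `a` is rewritten and only `d1` is dropped: `d2` still applies to `b`, which was left alone. With the default of 1, both data files are rewritten and both delete files go away.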

Small-file bin-pack: once rewrite_min_input_files data files smaller than rewrite_target_bytes accumulate, they are rewritten into one file per partition in a single atomic commit. Files still under deletes are left to the compaction pass above.
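The candidate selection can be sketched as follows. The DataFile shape and the function name are hypothetical, and the sketch assumes the rewrite_min_input_files threshold is counted across the table rather than per partition, which this page does not spell out.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical view of one data file in the current snapshot."""
    path: str
    partition: str
    size_bytes: int
    has_deletes: bool = False


def plan_bin_pack(
    files: List[DataFile],
    target_bytes: int,
    min_input_files: int,
) -> Dict[str, List[str]]:
    """Return partition -> input paths to rewrite into one file, or {} if not due."""
    # Small files are candidates; files still under deletes are left to
    # delete-debt compaction.
    small = [f for f in files if f.size_bytes < target_bytes and not f.has_deletes]

    # Assumption: the threshold counts candidates table-wide.
    if len(small) < min_input_files:
        return {}

    groups: Dict[str, List[str]] = defaultdict(list)
    for f in small:
        groups[f.partition].append(f.path)
    return dict(groups)
```

Each returned group would become one output file, and all groups would land in a single commit. The commit itself is outside the scope of this sketch.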

Manifest rewrite: past manifest_rewrite_min_manifests manifests, the manifest list is folded and clustered by partition. Metadata files themselves are bounded from birth: tables are created with write.metadata.delete-after-commit.enabled, so old metadata JSON is pruned on commit.

Snapshot expiry: snapshots older than snapshot_retention_hours are expired, always keeping snapshot_min_retained and never crossing the pinned horizon.
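The three conditions combine as in the sketch below. The Snapshot shape, the use of a commit sequence number to express the pinned horizon, and the function name are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot summary."""
    sequence: int      # commit order, higher is newer
    age_hours: float


def expirable_snapshots(
    snapshots: List[Snapshot],
    retention_hours: float,
    min_retained: int,
    pinned_horizon: Optional[int] = None,
) -> List[int]:
    """Return the sequences that may expire, oldest first."""
    newest_first = sorted(snapshots, key=lambda s: s.sequence, reverse=True)
    # The newest min_retained snapshots are always kept, regardless of age.
    always_kept = {s.sequence for s in newest_first[:min_retained]}

    result = []
    for snap in newest_first:
        if snap.sequence in always_kept:
            continue
        if snap.age_hours <= retention_hours:
            continue  # not older than the retention window
        if pinned_horizon is not None and snap.sequence >= pinned_horizon:
            continue  # at or above the oldest pinned reader horizon
        result.append(snap.sequence)
    return sorted(result)
```

Take five snapshots with sequences 1 to 5 and ages 30, 20, 10, 5, and 1 hours, a 6-hour retention, and one snapshot always retained. With no pins, sequences 1, 2, and 3 may expire. With a reader pinned at sequence 2, only sequence 1 may. An external expire_snapshots cannot see that pin, which is why the two results differ.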

Orphan sweep (opt-in): files in the table's data directory older than orphan_grace_hours that no retained snapshot references are deleted. Only a crash between write and commit produces such files, so it is off by default. Separately and always on, staged files of loads that ended in failed are deleted after the same grace period, driven by the load journal rather than listing.
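A minimal sketch of the listing-based check, including the bound that staged files awaiting adoption are never touched. The function name and the three input shapes are hypothetical stand-ins for a directory listing, the retained snapshots' file references, and the set of staged files.

```python
from typing import Dict, List, Set


def orphan_candidates(
    listed_age_hours: Dict[str, float],
    referenced: Set[str],
    staged_pending: Set[str],
    grace_hours: float = 72,
) -> List[str]:
    """Return the listed paths that are safe to delete as orphans.

    listed_age_hours: path -> age in hours, from listing the data directory.
    referenced:       paths referenced by any retained snapshot.
    staged_pending:   staged load files still awaiting adoption.
    """
    return sorted(
        path
        for path, age in listed_age_hours.items()
        if age > grace_hours            # old enough to be past the grace period
        and path not in referenced      # no retained snapshot needs it
        and path not in staged_pending  # never delete files awaiting adoption
    )
```

The third condition is the one an external remove_orphan_files has no way to evaluate: a staged file looks exactly like an orphan in a listing until the load commits.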

Every non-noop pass is journaled in modak.op_log with its counters, and the console's table page shows the last pass next to the policy in force.

Forcing a pass

A pass can be requested out of schedule, from the CLI or the console's table page:

modak-worker maintain --table public.events            # wait and print the result
modak-worker maintain --table public.events --no-wait  # file and return

The request is a row in modak.maintenance_requests that the leader claims atomically on its next cycle, so a forced pass runs under the same coordination as a scheduled one and never concurrently with it. Only one request per table is pending at a time; repeating the command just refreshes it.
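The request semantics (one pending row per table, refreshed on repeat, claimed once) can be mimicked in memory. This sketch is only an analogy: the real mechanism is a catalog table, and the class, its methods, and the lock are assumptions standing in for it.

```python
import threading
from typing import Dict


class MaintenanceRequests:
    """In-memory stand-in for a one-pending-request-per-table queue."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Dict[str, float] = {}  # table -> requested-at timestamp

    def request(self, table: str, now: float) -> None:
        # Keyed by table, so repeating the request refreshes the existing
        # entry instead of queueing a second one.
        with self._lock:
            self._pending[table] = now

    def claim_all(self) -> Dict[str, float]:
        # The leader takes every pending request in one atomic step, so no
        # request can be claimed twice.
        with self._lock:
            claimed, self._pending = self._pending, {}
            return claimed
```

Requesting a pass for public.events twice and then claiming yields a single entry carrying the later timestamp, and a second claim returns nothing.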

Lake health

Every MODAK_LAKE_STATS_INTERVAL_SECONDS the worker refreshes modak.lake_stats: the plugin's counters, its health warnings (Iceberg warns on delete-file debt and manifest sprawl), and the policy in force. The console shows it per table, /metrics exports it as modak_lake_*{table} gauges (see the metrics reference), and new warnings are logged once as WARN.

Health collection runs regardless of whether maintenance is enabled, so a table maintained externally still shows its file counts, delete debt, and warnings in the console. If you disabled a pass and its warning keeps firing, that is the signal your external maintenance is not keeping up.