feat: rotate the internal PostgreSQL logs #312
The internal vector-db PostgreSQL redirected all output to a single "$DATA_DIR/logfile" (pg_ctl -l) that grew without bound and was never rotated, eventually filling the persistent storage on long-running instances.

Drop the -l redirect and enable PostgreSQL's built-in logging collector with the documented weekday scheme: one file per day named postgresql-Mon.log through postgresql-Sun.log under "$DATA_DIR/log". Size-based rotation stays off (log_rotation_size=0): with weekday filenames PostgreSQL reopens the same file on a size rotation without truncating, so a size cap cannot bound a day's file anyway.

The collector overwrites a weekday file only when it rotates while running across a day boundary, not on startup, so a container that restarts frequently would keep appending to the same-named weekday file week after week. To keep the logs bounded to about a week regardless of restart frequency, remove the legacy single logfile and any weekday log at least six days old before starting the server, so last week's file is always gone before its name is reused. This also reclaims the old unbounded logfile on upgraded installs.

The options are passed on the pg_ctl command line, so they also take effect for already-initialised data directories on the next restart and need no edits to persisted configuration.

Document the new log location and retention in the README.

Fixes nextcloud#303

Signed-off-by: Cesar <275373127+sanzakicesarr@users.noreply.github.com>
21f9cd5 to e65c0a2
# Rotate the internal PostgreSQL logs with the built-in logging collector: one
# file per weekday (postgresql-Mon.log ... postgresql-Sun.log) under
# "$DATA_DIR/log". The collector overwrites a weekday file only on time-based
# rotation, not on startup, so a frequently-restarted container would otherwise
# append to last week's same-named file. Before starting, drop the pre-rotation
# single logfile and any weekday log at least six days old (find -mtime +5), so
# last week's file is always gone before its name is reused. See #303.
rm -f "${DATA_DIR}/logfile"
find "${DATA_DIR}/log" -maxdepth 1 -type f -name 'postgresql-*.log' -mtime +5 -delete 2>/dev/null || true
PG_LOG_OPTS="-c logging_collector=on -c log_directory=log -c log_filename=postgresql-%a.log -c log_rotation_age=1d -c log_rotation_size=0 -c log_truncate_on_rotation=on"
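To make the `-mtime +5` boundary in the hunk above concrete, here is a minimal sketch that runs the same `find` expression against a throwaway directory. The directory, the file ages, and the weekday names are made up for illustration, and backdating with `touch -d` assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch: which weekday logs does `find -mtime +5 -delete` remove?
# LOG_DIR is a throwaway directory, not the real "$DATA_DIR/log".
LOG_DIR="$(mktemp -d)"

# Three fake logs with backdated modification times (GNU touch syntax).
# The weekday names are arbitrary; only the file ages matter here.
touch -d '1 day ago'  "$LOG_DIR/postgresql-Mon.log"
touch -d '5 days ago' "$LOG_DIR/postgresql-Thu.log"
touch -d '7 days ago' "$LOG_DIR/postgresql-Tue.log"

# -mtime +5 matches files whose age, truncated to whole days, is greater
# than 5, i.e. files at least six full days old.
find "$LOG_DIR" -maxdepth 1 -type f -name 'postgresql-*.log' -mtime +5 -delete

REMAINING="$(ls -1 "$LOG_DIR")"
echo "$REMAINING"
rm -rf "$LOG_DIR"
```

Only the seven-day-old file is deleted; the five-day-old one survives because its truncated age is exactly 5, which is not greater than 5.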
wdyt of having the log rotation based on file size instead of the current day of the week? It would give us a fixed estimate of how large the logs can grow, and also keep logs going back much further than a week.
something like this maybe:
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_file_mode = 0640
log_rotation_size = 50MB
also, maybe it makes sense to keep all the args in one place, in the env file in the same pgsql folder as this file, so PG_LOG_OPTS could be declared there too
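For comparison with the PR's current `PG_LOG_OPTS`, the suggested settings could be written as `-c name=value` flags along the lines below. This is only a sketch: how the env file is sourced and the exact `pg_ctl` start line are assumptions, not taken from the repository.

```shell
#!/bin/sh
# Sketch only: the postgresql.conf-style settings suggested above, rewritten
# as postgres "-c name=value" flags in the same shape as the PR's PG_LOG_OPTS.
PG_LOG_OPTS="-c logging_collector=on"
PG_LOG_OPTS="$PG_LOG_OPTS -c log_directory=log"
PG_LOG_OPTS="$PG_LOG_OPTS -c log_filename=postgresql-%Y-%m-%d_%H%M%S.log"
PG_LOG_OPTS="$PG_LOG_OPTS -c log_file_mode=0640"
PG_LOG_OPTS="$PG_LOG_OPTS -c log_rotation_size=50MB"
export PG_LOG_OPTS

# Hypothetical start line (the real invocation in setup.sh is not shown here);
# pg_ctl's -o option forwards the flags to the postgres server process:
#   pg_ctl -D "$DATA_DIR" -o "$PG_LOG_OPTS" start
echo "$PG_LOG_OPTS"
```

Because every rotation opens a new timestamped file, these flags cap each file at the rotation size but do not by themselves limit how many files accumulate.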
- add gh workflows for docker builds and do separate cpu, cuda and rocm (vulkan) images (#295) @kyteinsky

### Changed
- rotate the internal PostgreSQL logs into one file per weekday instead of a single ever-growing logfile (#312) @sanzakicesarr
# append to last week's same-named file. Before starting, drop the pre-rotation
# single logfile and any weekday log at least six days old (find -mtime +5), so
# last week's file is always gone before its name is reused. See #303.
rm -f "${DATA_DIR}/logfile"
it would be cleaner to put this in a repair step
see https://github.com/nextcloud/context_chat_backend#how-to-generate-a-repair-step-file
use APP_VERSION=5.4.0 ./genrepair.sh
this can be used as an example on how to structure the repair file: https://github.com/nextcloud/context_chat_backend/blob/e869eb21964bf2241b1f3c269509388b5f6a7ed3/context_chat_backend/repair/repair5004_date20260521105831.py
Really like the size-based approach, and it also fixes something the weekday scheme quietly suffers from: log_truncate_on_rotation only fires on time-based rotation, never on startup, so a container that restarts often just keeps appending to the same weekday file. Unique timestamps sidestep that completely. I ran your settings locally and each file caps at the rotation size under normal load 👍🏽

One thing I want to settle with you before reworking it: with timestamped files nothing is ever removed, so the total log size keeps climbing over time, which is sort of the inverse of what #303 set out to fix (logs eating the persistent storage). Two ways to go: keep the long history as is, or add a small retention cap that prunes the oldest files past some count or total size. A cap would have to run periodically, so I'd keep that piece in setup.sh on startup, and move the one-off cleanup of the legacy single logfile into a repair step via genrepair.sh, exactly like you pointed out.

The env file makes sense too, I'll declare PG_LOG_OPTS in dockerfile_scripts/pgsql/env next to the rest. And the changelog line comes out. Just say which way you lean on retention and I'll push the reworked version √
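A count-based retention cap of the kind floated above might look roughly like this. The function name `prune_logs`, the keep count, and the sample filenames are all hypothetical; the sketch relies on the `postgresql-%Y-%m-%d_%H%M%S.log` names sorting chronologically and containing no whitespace.

```shell
#!/bin/sh
# prune_logs DIR KEEP: delete all but the KEEP newest postgresql-*.log files.
# Hypothetical helper; assumes timestamped names that sort chronologically
# and contain no whitespace.
prune_logs() {
    dir="$1"
    keep="$2"
    [ -d "$dir" ] || return 0
    # Newest first, skip the first KEEP names, delete whatever is left.
    ls -1 "$dir" | grep '^postgresql-.*\.log$' | sort -r |
        tail -n "+$((keep + 1))" |
        while IFS= read -r name; do
            rm -f "$dir/$name"
        done
}

# Demo on a throwaway directory with five made-up log files.
LOG_DIR="$(mktemp -d)"
for day in 01 02 03 04 05; do
    : > "$LOG_DIR/postgresql-2026-01-${day}_000000.log"
done

prune_logs "$LOG_DIR" 3
KEPT="$(ls -1 "$LOG_DIR")"
echo "$KEPT"
rm -rf "$LOG_DIR"
```

Together with a per-file size limit, a count cap puts a rough upper bound of count times rotation size on the log directory; a total-size cap would need a different loop.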
What & why
Fixes #303.
The internal vector-db PostgreSQL was started with `pg_ctl -l "$DATA_DIR/logfile"`, redirecting all output to a single file that grows without bound and is never rotated (the `logfile` the issue points at).

This enables PostgreSQL's built-in logging collector with the rotation scheme from the PostgreSQL manual's own example, and drops the `-l` redirect so the collector is the only log sink:

- `logging_collector=on`
- `log_directory=log`: `$DATA_DIR/log/` (PostgreSQL's default location, inside the persistent volume)
- `log_filename=postgresql-%a.log`: `postgresql-Mon.log` … `postgresql-Sun.log`
- `log_rotation_age=1d` + `log_truncate_on_rotation=on`
- `log_rotation_size=0`

Bounding across restarts
`log_truncate_on_rotation` only fires on time-based rotation while the server runs across a day boundary, not on startup. Since this ExApp restarts the internal PostgreSQL on every app restart/update, the weekday files would otherwise be appended to week after week and keep growing (verified empirically: a restart doubles the file).

So before starting the server, `setup.sh` removes the legacy single `logfile` and any weekday log more than five days old, so last week's file is always gone before its weekday name is reused.

This keeps the logs bounded to about a week no matter how often the container restarts, and reclaims the old unbounded `logfile` on upgraded installs. Same-day restarts keep appending to the current day's file, so no log lines are lost.

Notes for review
- […] `docker logs`), consistent with how the rest of the stack already logs via supervisord.
- The options are passed on the `pg_ctl` command line, so existing data dirs get rotation on their next restart without touching `postgresql.conf`.
- `log_rotation_size=0` is intentional: with `%a` filenames PostgreSQL reopens the same file on a size-based rotation without truncating it (verified empirically: a 10 kB cap still let the file grow to ~220 kB), so a size cap cannot bound a day's file. Time-based daily rotation plus the startup cleanup is what bounds the logs.
- The README's `## Logs` section now documents where the internal PostgreSQL logs live and their ~1-week retention.

Tested
In a stock `postgres:16` container: the server starts, no `logfile` is created, `logging_collector` is on, `$DATA_DIR/log/postgresql-<weekday>.log` is created and receives output; a restart appends to the same-day file; and the startup cleanup removes the legacy `logfile`, deletes a weekday log older than five days, and preserves both the current day's file and a five-day-old file.