author     Drew DeVault <sir@cmpwn.com>   2020-06-30 14:42:29 -0400
committer  Drew DeVault <sir@cmpwn.com>   2020-06-30 14:42:29 -0400
commit     598c64ac1df4d3c8a02768419a544096db1865b4
tree       e6cfdbe532535e85fbfc6d49abd4b3669ddaab1e /ops
parent     72fe32c37fd3cf615c368f0344edad310e425031
download   sr.ht-docs-598c64ac1df4d3c8a02768419a544096db1865b4.tar.gz
Updates to PSQL planning document
Diffstat (limited to 'ops')

 ops/robust-psql.md | 161
 1 file changed, 129 insertions, 32 deletions
diff --git a/ops/robust-psql.md b/ops/robust-psql.md
index 69ad2ec..8a157f5 100644
--- a/ops/robust-psql.md
+++ b/ops/robust-psql.md
@@ -7,22 +7,20 @@ availability, performance, and data integrity.
 
 1. Real-time off-site backup of the database
 1. Ability to access the database as it appeared in the past
-1. Ability to failover to the secondary in the event of an outage on the
+1. Ability to failover to the standby in the event of an outage on the
    primary, with minimal application downtime
-1. Ability to failover to the secondary during planned maintenance of the
+1. Ability to failover to the standby during planned maintenance of the
    primary, with minimal application downtime
 
 **Prerequisites**
 
-✓ PostgreSQL 12.x on primary
-
-✓ PostgreSQL 12.x on secondary
+- ✓ PostgreSQL 12.x on primary
+- ✓ PostgreSQL 12.x on standby
 
 **Nice-to-have**
 
-✗ Alpine 3.12 on primary
-
-✓ Alpine 3.12 on secondary
+- ✗ Alpine 3.12 on primary
+- ✓ Alpine 3.12 on standby
 
 If we have to restart the primary PostgreSQL daemon for any reason, we should
 take the opportunity to upgrade the host to Alpine 3.12.
@@ -33,28 +31,28 @@ First we'll have to prepare a [warm standby][warm standby] (which can be
 promoted to primary at short notice, but does not accept queries), then we can
 reconfigure it as a [hot standby][hot standby] (which will accept queries).
 
-1. Set up a user on the primary for the secondary to connect as for streaming
+1. Set up a user on the primary for the standby to connect as for streaming
    replication. It must have the REPLICATION and LOGIN privileges and
    [pg_hba.conf][hba changes] must be updated.
 1. Take a [base backup][base backup] of the primary and ship it to the
-   secondary. Be advised: we have `full_page_writes` disabled as a ZFS
+   standby. Be advised: we have `full_page_writes` disabled as a ZFS
    optimization; this may cause performance degradation during the basebackup
    process. `pg_basebackup -PzFt -D basebackup`. Place 'basebackup' in
    `/var/lib/postgresql`.
-1. Nuke the postgres data directory's contents on the secondary, su to postgres,
+1. Nuke the postgres data directory's contents on the standby, su to postgres,
    then run these commands: `tar -C ~/12/data -xf ~/basebackup/base.tar.gz`,
    `tar -C ~/12/data/pg_wal -xf ~/basebackup/pg_wal.tar.gz`,
    `touch ~/12/data/standby.signal`
-1. Fill out `primary_conninfo` on the secondary with the connection details for
+1. Fill out `primary_conninfo` on the standby with the connection details for
    the account created for replication. `host=... port=5432 user=... password=...`
-1. Configure a replication slot on the primary for the secondary:
+1. Configure a replication slot on the primary for the standby:
 
        SELECT * FROM pg_create_physical_replication_slot('konpaku');
 
-   Then set `primary_slot_name` to `konpaku` on the secondary.
-1. Set `hot_standby` to "on" on the secondary.
+   Then set `primary_slot_name` to `konpaku` on the standby.
+1. Set `hot_standby` to "on" on the standby.
 1. Start the database, monitor logs for restore completion.
-1. Sanity check the secondary database.
+1. Sanity check the standby database.
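Condensed into commands, the bootstrap procedure described above looks roughly
like the sketch below. The `replicator` role name and all connection details are
placeholders, and the data directory layout (`~/12/data` for the postgres user)
is taken from the steps above; this is an illustration, not a tested runbook.

```
# On the primary: replication role, replication slot, and base backup.
# pg_hba.conf must allow 'replicator' to connect for replication.
psql -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD '...';"
psql -c "SELECT * FROM pg_create_physical_replication_slot('konpaku');"
pg_basebackup -PzFt -D basebackup   # then ship 'basebackup' to the standby

# On the standby, as postgres, with the backup placed in /var/lib/postgresql:
rm -rf ~/12/data/*
tar -C ~/12/data -xf ~/basebackup/base.tar.gz
tar -C ~/12/data/pg_wal -xf ~/basebackup/pg_wal.tar.gz
touch ~/12/data/standby.signal

# Standby settings (postgresql.conf), then start the database and watch the logs:
#   primary_conninfo  = 'host=... port=5432 user=replicator password=...'
#   primary_slot_name = 'konpaku'
#   hot_standby       = on
```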
 
 Note: some of this procedure is adapted from the
 [restore the base backup][restore base backup] docs, but modified cause those
@@ -66,26 +64,117 @@ docs are shit.
 [restore base backup]: https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-PITR-RECOVERY
 [hba changes]: https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-AUTHENTICATION
 
-# Failover procedure (primary & secondary role swap)
+✓ Process tested
 
-TBD
+# Deploying pgbouncer
+
+pgbouncer will allow us to quickly change which database server the fleet is
+using at any given moment. It does not handle load balancing directly; we would
+need to use a secondary approach for this if desirable (DNS round robin and
+haproxy are suggested by the docs).
+
+A configuration similar to the following will be deployed fleet-wide:
+
+```
+[pgbouncer]
+listen_addr = 127.0.0.1
+listen_port = 16432
+pool_mode = transaction
+
+auth_type = trust
+auth_file = /etc/pgbouncer/userlist.txt
+
+max_client_conn = 100
+default_pool_size = 20
+
+[databases]
+xxx.sr.ht=host=remilia.sr.ht user=xxx.sr.ht password=...
+```
+
+config.ini's will point at localhost through pgbouncer rather than directly at
+the database servers. The initd scripts will need to be updated to `use
+pgbouncer` so that services are started up correctly.
+
+✓ Process tested
+
+# Application-level updates
+
+For web applications, we should abort requests which would write to the database
+and instead show a page explaining the circumstances. The user can attempt a
+refresh later to re-attempt the same action.
+
+For the SSH pipelines:
+
+- Repository auto-creation attempts should abort and tell the user
+- Repository mtime updates should be silently dropped
+- If unknown SSH keys can be fetched from meta.sr.ht, use them, but don't store them
+- Webhook delivery attempts should be dropped and a warning printed
+
+For builds.sr.ht, if any database update is met with a read-only error,
+retry with an exponential backoff and block until it succeeds. If it exceeds a
+deadline, then abort the build.
+
+For webhook processing celery workers, it should be acceptable for them to simply
+fail. Webhooks are only generated when the database is updated, so if the
+database can't be updated, ¯\\\_(ツ)\_/¯!
+
+There may be some room for future improvements in all of these areas, but this
+is sufficient for the MVP.
+
+✓ core.sr.ht patch to (somewhat) gracefully handle a read-only database
+
+✗ read-only support for celery jobs
+
+✗ read-only support for git.sr.ht, hg.sr.ht SSH pipeline
+
+✗ read-only support for ongoing builds.sr.ht jobs
+
+# Failover procedure (read-only mode)
+
+*Ideally, this is to be announced in advance, as it will cause a partial outage
+across all services.*
+
+First, prepare all services for failover by updating their pgbouncer
+configurations to point to the standby. Do not reload pgbouncer until all
+service configs have been updated; then use a script to reload them all at once.
+
+TODO: Write that script
+
+✓ Process tested
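The "reload them all at once" script does not exist yet; a minimal sketch of what
it might look like is below. The host list, the pgbouncer config path, and the
standby's address are placeholders rather than real fleet details, and it assumes
each host's `[databases]` entry currently points at remilia.sr.ht as in the
example config above.

```
#!/bin/sh
# Sketch only: stage the new database host everywhere, then reload every
# pgbouncer as close to simultaneously as possible.
set -e

STANDBY="standby.example.org"      # placeholder address of the standby
HOSTS="host1 host2 host3"          # placeholder list of hosts running pgbouncer

# Stage the config change without reloading anything yet
for h in $HOSTS; do
	ssh "$h" "sed -i 's/host=remilia\.sr\.ht/host=$STANDBY/' /etc/pgbouncer/pgbouncer.ini"
done

# Now reload them all; pgbouncer re-reads its config on SIGHUP
for h in $HOSTS; do
	ssh "$h" "pkill -HUP pgbouncer" &
done
wait
```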
+
+# Standby promotion procedure (read/write mode)
+
+1. Prerequisite: complete the read-only failover procedure
+1. Shut off the master
+1. Run `pg_ctl promote` on the standby
+1. Convert the master to a standby
+
+✗ TODO: Test this process
 
 # Database upgrade procedure
 
 TBD
 
+✗ TODO: Test this process
+
 ## Minor version upgrades
 
 TBD
 
+✗ TODO: Test this process
+
 ## Major version upgrades
 
 TBD
 
+✗ TODO: Test this process
+
 # Accessing past data
 
 TBD
 
+✗ TODO: Test this process
+
 # Changes to make on the next primary reboot
 
 - Set `wal_level` to replica; we are not going to use logical replication.
 
 TODO:
@@ -95,10 +184,6 @@ TBD
 
 # Open questions
 
-- Would pgbouncer/pgpool be valuable here? It could be used to quickly redirect
-  traffic from some or all nodes to one server or another. If so, we should
-  probably put it on a dedicated host so that it's robust to host-level failures
-  in the primary or secondary servers.
 - What is the plan of execution for replacing the server hardware?
 - How often are WALs recycled, how far into the past can we peek?
 - Can we/should we detect failure conditions and automate the failover
@@ -107,13 +192,13 @@ TBD
 - How can we maintain an ongoing high level of confidence in our system?
 
 It may be possible to *always* service SELECT statements from either the primary
-or secondary. If we eventually require load balancing, these queries could be
+or standby. If we eventually require load balancing, these queries could be
 eligible. However, queries (or transactions?) which mutate the database may only
 be executed on the primary. Therefore:
 
 - If we require more capacity for writes, is vertical scaling our only option?
 - In the event of a primary outage, are we able to transparently direct reads
-  to the secondary?
+  to the standby?
 - Can application-level changes be implemented which would allow for reduced
   service availability (read-only) in the event of a primary outage?
 - If applications cannot write to the database, but can write to other data
@@ -123,22 +208,22 @@ be executed on the primary. Therefore:
 
 # Mock contingency scenarios
 
-We need to set up a test network with a primary and secondary and induce failure
+We need to set up a test network with a primary and standby and induce failure
 scenarios to learn how our applications behave under those conditions and
-practice the failure resolution steps.
+practice the failure resolution steps. Such a test network is already partially
+provisioned on Drew's home network.
 
 This will require at least the following:
 
 - A couple of sr.ht applications (meta.sr.ht and git.sr.ht would be good)
-- A primary and secondary PostgreSQL server
+- A primary and standby PostgreSQL server
 
-At least three separate VMs are required here, four if we set up a standalone
-pgbouncer or pgpool. Then we should play out the following scenarios:
+We should play out the following scenarios:
 
-- Promoting the secondary to the primary
-- Shut down the primary or secondary without pre-preparation
+- Promoting the standby to the primary
+- Shut down the primary or standby without pre-preparation
 - Accessing the database state at an arbitrary point in the past
-- Bringing the secondary back up to date after a primary outage
+- Bringing the standby back up to date after a primary outage
 - Major and minor version upgrades (pg 11 to 12)
 - Host OS upgrades (Alpine 3.11 to 3.12 or edge)
 - Cause a ZFS resilver on the primary and monitor database performance: does
@@ -147,3 +232,15 @@ pgbouncer or pgpool. Then we should play out the following scenarios:
 
 At each stage of these processes, observe application behavior and record these
 observations.
+
+# Verification process
+
+The whole setup needs to be performed on a secondary deployment and each service
+carefully tested under normal and failure mode conditions. Of particular
+interest are:
+
+- Do git.sr.ht's more sophisticated pg commands from the SSH pipeline work
+  correctly?
+- Do lists.sr.ht's prepared statements get along well with pgbouncer?
+- Do all services work reasonably well in a read-only mode?
+- Do all services work correctly with pgbouncer's transaction mode?
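For the standby promotion procedure in the patch above, the PostgreSQL-side
commands might look like the following. The pg_rewind-based conversion of the
old primary is only one option and requires that `wal_log_hints` (or data
checksums) was enabled and that the old primary shut down cleanly; otherwise the
old primary has to be re-seeded from a fresh base backup. Hostnames and
credentials are placeholders.

```
# On the standby, once the old primary has been shut off:
pg_ctl promote -D ~/12/data          # or: psql -c "SELECT pg_promote();"

# Converting the old primary into a standby (pg_rewind variant):
pg_ctl stop -D ~/12/data
pg_rewind --target-pgdata ~/12/data \
    --source-server 'host=new-primary.example.org port=5432 user=... password=...'
touch ~/12/data/standby.signal
# ...then point primary_conninfo / primary_slot_name at the new primary and
# start the old primary back up as a standby.
```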
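While working through these scenarios (and when sanity checking the standby in
general), replication health can be spot-checked with PostgreSQL's built-in
views and functions, for example:

```
# On the primary: is the standby attached, and how far behind is it?
psql -c "SELECT client_addr, state, sent_lsn, replay_lsn FROM pg_stat_replication;"
psql -c "SELECT slot_name, active, restart_lsn FROM pg_replication_slots;"

# On the standby: still in recovery, and what was last replayed?
psql -c "SELECT pg_is_in_recovery(), pg_last_wal_replay_lsn();"
```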