author     Drew DeVault <sir@cmpwn.com>  2020-06-30 14:42:29 -0400
committer  Drew DeVault <sir@cmpwn.com>  2020-06-30 14:42:29 -0400
commit     598c64ac1df4d3c8a02768419a544096db1865b4 (patch)
tree       e6cfdbe532535e85fbfc6d49abd4b3669ddaab1e /ops
parent     72fe32c37fd3cf615c368f0344edad310e425031 (diff)
Updates to PSQL planning document
Diffstat (limited to 'ops')
-rw-r--r--  ops/robust-psql.md  161
1 file changed, 129 insertions, 32 deletions
diff --git a/ops/robust-psql.md b/ops/robust-psql.md
index 69ad2ec..8a157f5 100644
--- a/ops/robust-psql.md
+++ b/ops/robust-psql.md
@@ -7,22 +7,20 @@ availability, performance, and data integrity.
1. Real-time off-site backup of the database
1. Ability to access the database as it appeared in the past
-1. Ability to failover to the secondary in the event of an outage on the
+1. Ability to failover to the standby in the event of an outage on the
primary, with minimal application downtime
-1. Ability to failover to the secondary during planned maintenance of the
+1. Ability to failover to the standby during planned maintenance of the
primary, with minimal application downtime
**Prerequisites**
-✓ PostgreSQL 12.x on primary
-
-✓ PostgreSQL 12.x on secondary
+- ✓ PostgreSQL 12.x on primary
+- ✓ PostgreSQL 12.x on standby
**Nice-to-have**
-✗ Alpine 3.12 on primary
-
-✓ Alpine 3.12 on secondary
+- ✗ Alpine 3.12 on primary
+- ✓ Alpine 3.12 on standby
If we have to restart the primary PostgreSQL daemon for any reason, we should
take the opportunity to upgrade the host to Alpine 3.12.
@@ -33,28 +31,28 @@ First we'll have to prepare a [warm standby][warm standby] (which can be
promoted to primary at short notice, but does not accept queries), then we can
reconfigure it as a [hot standby][hot standby] (which will accept queries).
-1. Set up a user on the primary for the secondary to connect as for streaming
+1. Set up a user on the primary for the standby to connect as for streaming
replication. It must have the REPLICATION and LOGIN privileges and
[pg_hba.conf][hba changes] must be updated.
1. Take a [base backup][base backup] of the primary and ship it to the
- secondary. Be advised: we have `full_page_writes` disabled as a ZFS
+ standby. Be advised: we have `full_page_writes` disabled as a ZFS
optimization; this may cause performance degradation during the basebackup
process. `pg_basebackup -PzFt -D basebackup`. Place 'basebackup' in
`/var/lib/postgresql`.
-1. Nuke the postgres data directory's contents on the secondary, su to postgres,
+1. Nuke the postgres data directory's contents on the standby, su to postgres,
then run these commands: `tar -C ~/12/data -xf ~/basebackup/base.tar.gz`,
`tar -C ~/12/data/pg_wal -xf ~/basebackup/pg_wal.tar.gz`,
`touch ~/12/data/standby.signal`
-1. Fill out `primary_conninfo` on the secondary with the connection details for
+1. Fill out `primary_conninfo` on the standby with the connection details for
the account created for replication. `host=... port=5432 user=... password=...`
-1. Configure a replication slot on the primary for the secondary:
+1. Configure a replication slot on the primary for the standby:
SELECT * FROM pg_create_physical_replication_slot('konpaku');
- Then set `primary_slot_name` to `konpaku` on the secondary.
-1. Set `hot_standby` to "on" on the secondary.
+ Then set `primary_slot_name` to `konpaku` on the standby.
+1. Set `hot_standby` to "on" on the standby.
1. Start the database, monitor logs for restore completion.
-1. Sanity check the secondary database.
+1. Sanity check the standby database.
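+
+Taken together, the procedure above looks roughly like the commands below. This
+is a sketch, not a tested runbook: the `replicator` role name, the host
+placeholders, and the pg_hba.conf path are illustrative, and the slot name
+`konpaku` is the one used above.
+
+```
+# On the primary: create a replication role, allow it in pg_hba.conf, and
+# create the physical replication slot referenced above.
+psql -U postgres -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD '...';"
+echo "host replication replicator <standby address>/32 md5" \
+    >> /var/lib/postgresql/12/data/pg_hba.conf
+psql -U postgres -c "SELECT pg_reload_conf();"
+psql -U postgres -c "SELECT * FROM pg_create_physical_replication_slot('konpaku');"
+
+# Base backup, shipped to the standby:
+pg_basebackup -PzFt -D basebackup
+scp -r basebackup <standby>:/var/lib/postgresql/
+
+# On the standby, as postgres, with an empty data directory:
+tar -C ~/12/data -xf ~/basebackup/base.tar.gz
+tar -C ~/12/data/pg_wal -xf ~/basebackup/pg_wal.tar.gz
+touch ~/12/data/standby.signal
+
+# In the standby's postgresql.conf:
+#   primary_conninfo = 'host=<primary> port=5432 user=replicator password=...'
+#   primary_slot_name = 'konpaku'
+#   hot_standby = on
+
+# Sanity checks once the standby is started:
+psql -U postgres -c "SELECT * FROM pg_stat_replication;"  # on the primary
+psql -U postgres -c "SELECT pg_is_in_recovery();"         # on the standby
+```
+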
Note: some of this procedure is adapted from the
[restore the base backup][restore base backup] docs, but modified because those
@@ -66,26 +64,117 @@ docs are shit.
[restore base backup]: https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-PITR-RECOVERY
[hba changes]: https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-AUTHENTICATION
-# Failover procedure (primary & secondary role swap)
+✓ Process tested
-TBD
+# Deploying pgbouncer
+
+pgbouncer will allow us to quickly change which database server the fleet is
+using at any given moment. It does not handle load balancing itself; if we
+need that later, we would have to use a separate mechanism (the pgbouncer docs
+suggest DNS round robin or haproxy).
+
+A configuration similar to the following will be deployed fleet-wide:
+
+```
+[pgbouncer]
+listen_addr = 127.0.0.1
+listen_port = 16432
+pool_mode = transaction
+
+auth_type = trust
+auth_file = /etc/pgbouncer/userlist.txt
+
+max_client_conn = 100
+default_pool_size = 20
+
+[databases]
+xxx.sr.ht=host=remilia.sr.ht user=xxx.sr.ht password=...
+```
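+
+The `auth_file` above has to list each user pgbouncer will accept. A minimal
+sketch of what it might contain (the entry is illustrative; with
+`auth_type = trust` the password column is not verified, but the user must
+still be listed):
+
+```
+"xxx.sr.ht" "..."
+```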
+
+Each service's config.ini will point at localhost, through pgbouncer, rather
+than directly at the database servers. The init.d scripts will need to be
+updated to `use pgbouncer` so that services start up in the correct order.
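+
+For example, a service section in config.ini might change along these lines
+(the key name and URL format are assumptions based on the standard sr.ht
+config, and `xxx.sr.ht` is a placeholder as above):
+
+```
+[xxx.sr.ht]
+# before: connection-string=postgresql://xxx.sr.ht:...@remilia.sr.ht/xxx.sr.ht
+connection-string=postgresql://xxx.sr.ht:...@127.0.0.1:16432/xxx.sr.ht
+```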
+
+✓ Process tested
+
+# Application-level updates
+
+For web applications, we should abort requests which would write to the database
+and instead show a page explaining the circumstances. The user can refresh
+later to retry the same action.
+
+For the SSH pipelines:
+
+- Repository auto-creation attempts should abort and tell the user
+- Repository mtime updates should be silently dropped
+- If unknown SSH keys can be fetched from meta.sr.ht, use them, but don't store them
+- Webhook delivery attempts should be dropped and a warning printed
+
+For builds.sr.ht, if any database update is met with a read-only error,
+retry with an exponential backoff and block until it succeeds. If it exceeds a
+deadline, then abort the build.
+
+For the webhook processing celery workers, it should be acceptable for them to
+simply fail. Webhooks are only generated when the database is updated, so if the
+database can't be updated, ¯\\\_(ツ)\_/¯!
+
+There may be some room for future improvements in all of these areas, but this
+is sufficient for the MVP.
+
+✓ core.sr.ht patch to (somewhat) gracefully handle a read-only database
+
+✗ read-only support for celery jobs
+
+✗ read-only support for git.sr.ht, hg.sr.ht SSH pipeline
+
+✗ read-only support for ongoing builds.sr.ht jobs
+
+# Failover procedure (read-only mode)
+
+*Ideally, this is to be announced in advance, as it will cause a partial outage
+across all services.*
+
+First, prepare all services for failover by updating their pgbouncer
+configurations to point to the standby. Do not reload pgbouncer until all
+service configs have been updated; then use a script to reload them all at once.
+
+TODO: Write that script
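+
+A minimal sketch of what that script might look like, assuming ssh access to
+each host, pgbouncer configs at `/etc/pgbouncer/pgbouncer.ini`, and a
+hand-maintained host list (all placeholders, not the final design):
+
+```
+#!/bin/sh -e
+hosts="..."                 # fleet hosts running pgbouncer
+old=remilia.sr.ht           # current primary
+new=...                     # standby to fail over to
+
+# Update every config first, without reloading anything.
+for h in $hosts; do
+    ssh root@"$h" "sed -i 's/host=$old/host=$new/' /etc/pgbouncer/pgbouncer.ini"
+done
+
+# Then reload them all at once; pgbouncer re-reads its config on SIGHUP.
+for h in $hosts; do
+    ssh root@"$h" "pkill -HUP pgbouncer"
+done
+```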
+
+✓ Process tested
+
+# Standby promotion procedure (read/write mode)
+
+1. Prerequisite: complete the read-only failover procedure
+1. Shut off the old primary
+1. Run `pg_ctl promote` on the standby
+1. Convert the old primary into a standby (see the sketch below)
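+
+A rough sketch of steps 2-4 with PostgreSQL 12 tooling (paths as above; using
+`pg_rewind` to convert the old primary is one option, a fresh base backup is
+the slower but simpler alternative, and neither has been tested here yet):
+
+```
+# On the standby, as postgres:
+pg_ctl promote -D ~/12/data
+
+# On the old primary, after it has been shut down cleanly:
+# (pg_rewind needs wal_log_hints or data checksums enabled on the old primary)
+pg_rewind --target-pgdata ~/12/data \
+    --source-server='host=<new primary> user=... password=...'
+touch ~/12/data/standby.signal
+# ...then point primary_conninfo (and primary_slot_name) at the new primary
+# before starting it back up.
+```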
+
+✗ TODO: Test this process
# Database upgrade procedure
TBD
+✗ TODO: Test this process
+
## Minor version upgrades
TBD
+✗ TODO: Test this process
+
## Major version upgrades
TBD
+✗ TODO: Test this process
+
# Accessing past data
TBD
+✗ TODO: Test this process
+
# Changes to make on the next primary reboot
- Set `wal_level` to replica; we are not going to use logical replication. TODO:
@@ -95,10 +184,6 @@ TBD
# Open questions
-- Would pgbouncer/pgpool be valuable here? It could be used to quickly redirect
- traffic from some or all nodes to one server or another. If so, we should
- probably put it on a dedicated host so that it's robust to host-level failures
- in the primary or secondary servers.
- What is the plan of execution for replacing the server hardware?
- How often are WALs recycled, and how far into the past can we peek?
- Can we/should we detect failure conditions and automate the failover
@@ -107,13 +192,13 @@ TBD
- How can we maintain an ongoing high level of confidence in our system?
It may be possible to *always* service SELECT statements from either the primary
-or secondary. If we eventually require load balancing, these queries could be
+or standby. If we eventually require load balancing, these queries could be
eligible. However, queries (or transactions?) which mutate the database may only
be executed on the primary. Therefore:
- If we require more capacity for writes, is vertical scaling our only option?
- In the event of a primary outage, are we able to transparently direct reads
- to the secondary?
+ to the standby?
- Can application-level changes be implemented which would allow for reduced
service availability (read-only) in the event of a primary outage?
- If applications cannot write to the database, but can write to other data
@@ -123,22 +208,22 @@ be executed on the primary. Therefore:
# Mock contingency scenarios
-We need to set up a test network with a primary and secondary and induce failure
+We need to set up a test network with a primary and standby and induce failure
scenarios to learn how our applications behave under those conditions and
-practice the failure resolution steps.
+practice the failure resolution steps. Such a test network is already partially
+provisioned on Drew's home network.
This will require at least the following:
- A couple of sr.ht applications (meta.sr.ht and git.sr.ht would be good)
-- A primary and secondary PostgreSQL server
+- A primary and standby PostgreSQL server
-At least three separate VMs are required here, four if we set up a standalone
-pgbouncer or pgpool. Then we should play out the following scenarios:
+We should play out the following scenarios:
-- Promoting the secondary to the primary
-- Shut down the primary or secondary without pre-preparation
+- Promoting the standby to the primary
+- Shutting down the primary or standby without preparation
- Accessing the database state at an arbitrary point in the past
-- Bringing the secondary back up to date after a primary outage
+- Bringing the standby back up to date after a primary outage
- Major and minor version upgrades (pg 11 to 12)
- Host OS upgrades (Alpine 3.11 to 3.12 or edge)
- Cause a ZFS resilver on the primary and monitor database performance: does
@@ -147,3 +232,15 @@ pgbouncer or pgpool. Then we should play out the following scenarios:
At each stage of these processes, observe application behavior and record these
observations.
+
+# Verification process
+
+The whole setup needs to be performed on a separate test deployment, and each
+service carefully tested under normal and failure-mode conditions. Of particular
+interest:
+
+- Do git.sr.ht's more sophisticated pg commands from the SSH pipeline work
+ correctly?
+- Do lists.sr.ht's prepared statements get along well with pgbouncer?
+- Do all services work reasonably well in a read-only mode?
+- Do all services work correctly with pgbouncer's transaction mode?