| | | |
|---|---|---|
| author | Drew DeVault <sir@cmpwn.com> | 2020-06-30 10:47:56 -0400 |
| committer | Drew DeVault <sir@cmpwn.com> | 2020-06-30 10:48:20 -0400 |
| commit | 134328b0ed45995ec33928d0f6738bcdd071fe0a (patch) | |
| tree | e4aa302038351a41f642673fc45ae368dd745d6c /ops | |
| parent | 6355d0aefc1bb120d956076b84ca41936b36f3d8 (diff) | |
Add robust PSQL planning document
Diffstat (limited to 'ops'):

| | | |
|---|---|---|
| -rw-r--r-- | ops/index.md | 1 |
| -rw-r--r-- | ops/robust-psql.md | 143 |

2 files changed, 144 insertions, 0 deletions

    diff --git a/ops/index.md b/ops/index.md
    index dc9c75e..6aaeb95 100644
    --- a/ops/index.md
    +++ b/ops/index.md
    @@ -17,6 +17,7 @@ Additional resources:
     - [Monitoring & alarms](/ops/monitoring.md)
     - [Network topology](/ops/topology.md)
     - [Provisioning & allocation](/ops/provisioning.md)
    +- [PostgreSQL robustness planning](/ops/robust-psql.md)
     
     # Operational Resources

    diff --git a/ops/robust-psql.md b/ops/robust-psql.md
    new file mode 100644
    index 0000000..bd24387
    --- /dev/null
    +++ b/ops/robust-psql.md
    @@ -0,0 +1,143 @@

PostgreSQL is a critical part of the SourceHut infrastructure in terms of
availability, performance, and data integrity.

# Streaming replication

**Goals**

1. Real-time off-site backup of the database
1. Ability to access the database as it appeared in the past
1. Ability to fail over to the secondary in the event of an outage on the
   primary, with minimal application downtime
1. Ability to fail over to the secondary during planned maintenance of the
   primary, with minimal application downtime

**Prerequisites**

✓ PostgreSQL 12.x on primary

✓ PostgreSQL 12.x on secondary

**Nice-to-have**

✗ Alpine 3.12 on primary

✓ Alpine 3.12 on secondary

If we have to restart the primary PostgreSQL daemon for any reason, we should
take the opportunity to upgrade the host to Alpine 3.12.

## Preparing the hot standby

First we'll have to prepare a [warm standby][warm standby] (which can be
promoted to primary at short notice, but does not accept queries), then we can
reconfigure it as a [hot standby][hot standby] (which will accept queries).
Sketches of the primary- and secondary-side configuration follow this list.

1. Set up a user on the primary for the secondary to connect as for streaming
   replication. It must have the REPLICATION and LOGIN privileges, and
   [pg_hba.conf][hba changes] must be updated.
1. Take a [base backup][base backup] of the primary and ship it to the
   secondary. Be advised: we have full_page_writes disabled as a ZFS
   optimization; this may cause performance degradation during the base backup
   process.
1. Fill out `primary_conninfo` on the secondary with the connection details for
   the account created for replication.
1. Configure a replication slot on the primary for the secondary:

        SELECT * FROM pg_create_physical_replication_slot('konpaku');

   Then set `primary_slot_name` to `konpaku` on the secondary.
1. Set `hot_standby` to true on the secondary.
1. [Restore the base backup][restore base backup] on the secondary. It will
   automatically sync with the primary when complete; monitor the logs. TODO:
   What should our `restore_command` be? We want to figure it out in advance.
1. Sanity check the secondary database.
1. Roll back the `restore_command` change so that we aren't surprised by it
   later. Drop the base backup files as well.
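
Here is a minimal sketch of the primary-side half of these steps (1 and 4).
The `replicator` role name, password, and standby address are placeholders,
not decided values; adjust them and the authentication method to match the
real deployment:

    -- On the primary: create a role the secondary can connect as for
    -- streaming replication. Name and password are placeholders.
    CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'changeme';

And the corresponding `pg_hba.conf` entry on the primary:

    # pg_hba.conf on the primary: allow the replication role to connect to
    # the special "replication" pseudo-database from the standby's address
    # (address and auth method are placeholders).
    # Reload the configuration after editing, e.g. SELECT pg_reload_conf();
    host    replication    replicator    203.0.113.10/32    scram-sha-256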
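
And a sketch of the secondary-side steps (2, 3, and 5), assuming the
PostgreSQL 12 file layout; host names, paths, and the password are
placeholders, and the `restore_command` shown is only an illustration since
the TODO above is still open. One option for the base backup is to run
`pg_basebackup` on the secondary, pulling it over the replication connection
rather than shipping an archive by hand:

    # On the secondary, with PostgreSQL stopped and an empty data directory
    # (the path is a guess at the Alpine layout; adjust as needed):
    pg_basebackup -h primary.example.org -U replicator \
        -D /var/lib/postgresql/12/data -X stream --progress

    # PostgreSQL 12 starts in standby mode when this file exists:
    touch /var/lib/postgresql/12/data/standby.signal

Then the standby-specific settings in `postgresql.conf` on the secondary:

    primary_conninfo  = 'host=primary.example.org user=replicator password=changeme'
    primary_slot_name = 'konpaku'
    hot_standby       = on

    # Only relevant if we also replay WAL from an archive; the real command
    # is still an open question above.
    #restore_command = 'cp /var/lib/postgresql/wal-archive/%f "%p"'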
[warm standby]: https://www.postgresql.org/docs/current/warm-standby.html
[hot standby]: https://www.postgresql.org/docs/current/hot-standby.html
[base backup]: https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-BASE-BACKUP
[restore base backup]: https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-PITR-RECOVERY
[hba changes]: https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-AUTHENTICATION

# Failover procedure (primary & secondary role swap)

TBD

# Database upgrade procedure

TBD

## Minor version upgrades

TBD

## Major version upgrades

TBD

# Accessing past data

TBD

# Changes to make on the next primary reboot

- Set `wal_level` to replica; we are not going to use logical replication. TODO:
  Will this create WAL files which are incompatible with one another?
- Drop the `archive_command` and set `archive_mode` to off; we're going to use
  streaming replication instead of log shipping.

# Open questions

- Would pgbouncer/pgpool be valuable here? It could be used to quickly redirect
  traffic from some or all nodes to one server or another. If so, we should
  probably put it on a dedicated host so that it's robust to host-level failures
  in the primary or secondary servers.
- What is the plan of execution for replacing the server hardware?
- How often are WALs recycled, and how far into the past can we peek?
- Can we/should we detect failure conditions and automate the failover
  procedure?
- What kinds of monitoring and alarms can we/should we set up?
- How can we maintain an ongoing high level of confidence in our system?

It may be possible to *always* service SELECT statements from either the primary
or the secondary. If we eventually require load balancing, these queries could be
eligible. However, queries (or transactions?) which mutate the database may only
be executed on the primary. Therefore:

- If we require more capacity for writes, is vertical scaling our only option?
- In the event of a primary outage, are we able to transparently direct reads
  to the secondary?
- Can application-level changes be implemented which would allow for reduced
  service availability (read-only) in the event of a primary outage?
- If applications cannot write to the database, but can write to other data
  stores (filesystem, redis, etc.), can inconsistent states be produced?
- Can we simulate typical and excessive levels of application traffic against
  the database for mock scenarios?

# Mock contingency scenarios

We need to set up a test network with a primary and a secondary and induce
failure scenarios to learn how our applications behave under those conditions
and to practice the failure resolution steps.

This will require at least the following:

- A couple of sr.ht applications (meta.sr.ht and git.sr.ht would be good)
- A primary and secondary PostgreSQL server

At least three separate VMs are required here, four if we set up a standalone
pgbouncer or pgpool. Then we should play out the following scenarios:

- Promoting the secondary to the primary (see the sketch after this list)
- Shutting down the primary or secondary without preparation
- Accessing the database state at an arbitrary point in the past
- Major and minor version upgrades (pg 11 to 12)
- Host OS upgrades (Alpine 3.11 to 3.12 or edge)
- Causing a ZFS resilver on the primary and monitoring database performance:
  does this trigger a "soft outage" where performance is awful but the database
  is ostensibly online?
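
A rough sketch of the promotion scenario, assuming PostgreSQL 12; the exact
runbook is what these exercises are meant to produce:

    -- On the primary (while it is still up): confirm the standby is attached
    -- and check how far it lags behind.
    SELECT application_name, state, sent_lsn, replay_lsn,
           pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
    FROM pg_stat_replication;

    -- On the secondary: end recovery and promote it to a primary
    -- (pg_promote() is available since PostgreSQL 12; pg_ctl promote also works).
    SELECT pg_promote();

Note that after a promotion the old primary must not accept writes again until
it has been reconfigured as a standby of the new primary (or re-synced),
otherwise the two timelines diverge.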
At each stage of these processes, observe application behavior and record these
observations.
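
For the open question about simulating application traffic, a minimal sketch
using `pgbench` against a scratch database; connection details, scale factor,
and client counts are placeholders, and synthetic pgbench load is only a
stand-in for real application traffic:

    # Initialize a throwaway benchmark database on the primary
    # (scale factor 50 is an arbitrary starting point):
    createdb bench
    pgbench -i -s 50 bench

    # Sustained read/write load while a scenario is played out
    # (16 clients, 4 worker threads, 10 minutes):
    pgbench -c 16 -j 4 -T 600 bench

    # Read-only load (-S), which should also be serviceable by the hot standby:
    pgbench -c 16 -j 4 -T 600 -S -h secondary.example.org bench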