I have a Postgres 9.4 RDS instance with Multi-AZ, plus a read-only slave replica.
Up to this point the load balancing was done in the business layer of my app, but it's inefficient, and I was hoping to use PGPool so the app interacts with a single Postgres connection.
It turns out that using PGPool has been a pain in the ass. If I set it to act as a load balancer, simple SELECT queries throw errors like:
SQLSTATE[HY000]: General error: 7
message contents do not agree with length in message type "N"
server sent data ("D" message)
without prior row description ("T" message)
If I set it to act in master/slave mode with streaming replication (as suggested on the Postgres mailing list) I get:
psql: ERROR: MD5 authentication is unsupported
in replication and master-slave modes.
HINT: check pg_hba.conf
Yeah, well, pg_hba.conf is off limits in RDS, so I can't alter it.
Has anyone got PGPool to work in RDS? Are there other tools that can act as middleware to take advantage of reading replicas in RDS?
I was able to make it work. Here are my working config files.
You have to use md5 authentication and sync the username/password from your database into the pool_passwd file. You also need enable_pool_hba, load_balance_mode, and master_slave_mode turned on.
pgpool.conf
listen_addresses = '*'
port = 9999
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/tmp'
listen_backlog_multiplier = 1
backend_hostname0 = 'master-rds-database-with-multi-AZ.us-west-2.rds.amazonaws.com'
backend_port0 = 5432
backend_weight0 = 0
backend_flag0 = 'ALWAYS_MASTER'
backend_hostname1 = 'readonly-replica.us-west-2.rds.amazonaws.com'
backend_port1 = 5432
backend_weight1 = 999
backend_flag1 = 'ALWAYS_MASTER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
ssl = on
num_init_children = 1
max_pool = 2
connection_cache = off
replication_mode = off
load_balance_mode = on
master_slave_mode = on
pool_hba.conf
local all all md5
host all all 127.0.0.1/32 md5
pool_passwd
username:md5d51c9a7e9353746a6020f9602d452929
To update pool_passwd you can use pg_md5, or build the entry by hand. The stored hash is md5 of the password concatenated with the username, prefixed with the literal string md5 (note that md5sum appends " -" to its output, which has to be stripped):
echo "username:md5$(echo -n 'passwordusername' | md5sum | cut -d' ' -f1)"
username:md5d51c9a7e9353746a6020f9602d452929
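The same hash can also be produced in Postgres itself (the credentials here are placeholders):
SELECT 'md5' || md5('mypassword' || 'username');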
Output of running example:
psql --dbname=database --host=localhost --username=username --port=9999
database=> SHOW POOL_NODES;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+-------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------
0 | master-rds-database.us-west-2.rds.amazonaws.com | 8193 | up | 0.000000 | primary | 0 | false | 0
1 | readonly-replica.us-west-2.rds.amazonaws.com | 8193 | up | 1.000000 | standby | 0 | true | 0
database=> select now();
...
database=> SHOW POOL_NODES;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+-------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------
0 | master-rds-database.us-west-2.rds.amazonaws.com | 8193 | up | 0.000000 | primary | 0 | false | 0
1 | readonly-replica.us-west-2.rds.amazonaws.com | 8193 | up | 1.000000 | standby | 1 | true | 1
database=> CREATE TABLE IF NOT EXISTS tmp_test_read_write ( data varchar(40) );
CREATE TABLE
database=> INSERT INTO tmp_test_read_write (data) VALUES (concat('',inet_server_addr()));
INSERT 0 1
database=> select data as master_ip,inet_server_addr() as replica_ip from tmp_test_read_write;
master_ip | replica_ip
--------------+---------------
172.31.37.69 | 172.31.20.121
(1 row)
You can also see from the logs that it hits both databases:
2018-10-16 07:56:37: pid 124528: LOG: DB node id: 0 backend pid: 21731 statement: CREATE TABLE IF NOT EXISTS tmp_test_read_write ( data varchar(40) );
2018-10-16 07:56:47: pid 124528: LOG: DB node id: 0 backend pid: 21731 statement: INSERT INTO tmp_test_read_write (data) VALUES (concat('',inet_server_addr()));
2018-10-16 07:56:52: pid 124528: LOG: DB node id: 1 backend pid: 24890 statement: select data as master_ip,inet_server_addr() as replica_ip from tmp_test_read_write;
Notice that the insert used the IP address of the master, and the subsequent select used the IP address of the read-only replica.
I can update after more testing, but psql client testing looks promising.
There is Citus (pgShard), which is supposed to work with standard Amazon RDS instances. It has catches, though: you will have a single point of failure if you use the open source version, since its coordinator node is not duplicated.
You can get a fully HA, seamless-failover version of it, but you have to buy the enterprise license, and it is CRAZY expensive: it will easily cost you $50,000 to $100,000 or more per year.
Also, they are REALLY pushing their cloud version now, which is even more insanely expensive.
https://www.citusdata.com/
I have also heard of people using HAProxy to balance between Postgres or MySQL nodes.
Related
I'm trying to create logical replication with two local PostgreSQL servers (node1: port 5434, node2: port 5435).
I was able to successfully create a publication and a subscription on node1 and node2 for a table in the public schema.
Node1:
CREATE PUBLICATION my_pub FOR TABLE t1;
GRANT SELECT ON t1 TO repuser;
Node2:
CREATE SUBSCRIPTION my_sub CONNECTION 'host=localhost port=5434 dbname=pub user=repuser password=password' PUBLICATION my_pub;
Node2 public.t1 replicates all data in node1 public.t1.
However, when I create the publication and subscription with the same code but in a different schema, node2 fails to replicate.
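For example, for the cdl schema the statements look like this (reconstructed to match the catalog output below; the password is a placeholder, as in the public-schema example):
Node1:
CREATE PUBLICATION cdl_test FOR TABLE cdl.t1;
GRANT SELECT ON cdl.t1 TO repuser;
Node2:
CREATE SUBSCRIPTION cdl_sub_test CONNECTION 'host=localhost port=5434 dbname=pub user=repuser password=password' PUBLICATION cdl_test;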
Below is the output of some pg_catalog queries:
Node1:
pub=# select * from pg_catalog.pg_publication_tables;
pubname | schemaname | tablename
----------+------------+-----------
my_pub | public | t1
cdl_test | cdl | t1
pub_test | test | t1
Node2:
sub=# \dRs
List of subscriptions
Name | Owner | Enabled | Publication
--------------+----------+---------+-------------
cdl_sub_test | postgres | t | {cdl_test}
my_sub | postgres | t | {my_pub}
sub_test | postgres | t | {pub_test}
sub=# select * from pg_catalog.pg_replication_origin;
roident | roname
---------+----------
2 | pg_18460
1 | pg_18461
3 | pg_18466
sub=# select * from pg_catalog.pg_subscription_rel ;
srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+------------
18461 | 16386 | r | 0/3811C810
18466 | 18463 | d |
18460 | 18456 | d |
As shown in select * from pg_catalog.pg_subscription_rel, the two subscriptions for the test and cdl schemas are stuck in the d (data is being copied) state.
Any recommendation on how to go about this problem, or on how to diagnose why it occurs?
As jjanes has suggested, a snippet of the log file is shown below:
2022-01-17 16:05:25.165 PST [622] WARNING: out of logical replication worker slots
2022-01-17 16:05:25.165 PST [622] HINT: You might need to increase max_logical_replication_workers.
2022-01-17 16:05:25.168 PST [970] LOG: logical replication table synchronization worker for subscription "cdl_sub_test", table "t1" has started
2022-01-17 16:05:25.245 PST [970] ERROR: could not start initial contents copy for table "cdl.t1": ERROR: permission denied for schema cdl
2022-01-17 16:05:25.247 PST [471] LOG: background worker "logical replication worker" (PID 970) exited with exit code 1
2022-01-17 16:05:25.797 PST [488] postgres#sub LOG: statement: /*pga4dash*/
It seems like the subscriber doesn't have permission to read the cdl schema on the publisher, even after I ran GRANT SELECT ON cdl.t1 TO repuser.
You have to give the user repuser permission to read the table that should be replicated. That also requires USAGE permission on the schema that contains the table.
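On the publisher, that means something like this (using the schema, table, and role names from the question):
GRANT USAGE ON SCHEMA cdl TO repuser;
GRANT SELECT ON cdl.t1 TO repuser;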
I'm trying to move a Postgres (11.6) database on EC2 to RDS (Postgres 11.6). I started replication a couple of nights ago, and I have now noticed that replication has slowed down considerably, judging by how fast the database size is increasing on the subscriber (SELECT pg_database_size('db_name')/1024/1024). Here are some stats of the environment:
Publisher Node:
Instance type: r5.24xlarge
Disk: 5 TB GP2 with 16,000 PIOPS
Database size (pg_database_size/1024/1024): 2,295,955 MB
Subscriber Node:
Instance type: RDS r5.24xlarge
Disk: 3 TB GP2
Here is the current DB size for the subscriber and publisher:
Publisher:
SELECT pg_database_size('db_name')/1024/1024 db_size_publisher;
db_size_publisher
-------------------
2295971
(1 row)
Subscriber:
SELECT pg_database_size('db_name')/1024/1024 as db_size_subscriber;
db_size_subscriber
--------------------
1506348
(1 row)
That difference leaves about 789 GB still to replicate, and I've noticed that the subscriber database is growing at only about 250 KB/s:
db_name=> SELECT pg_database_size('db_name')/1024/1024, current_timestamp;
?column? | current_timestamp
----------+-------------------------------
1506394 | 2020-05-21 06:27:46.028805-07
(1 row)
db_name=> SELECT pg_database_size('db_name')/1024/1024, current_timestamp;
?column? | current_timestamp
----------+-------------------------------
1506396 | 2020-05-21 06:27:53.542946-07
(1 row)
At this rate it would take another month to finish replication (roughly 789,000 MB remaining at 0.25 MB/s works out to about 36 days), which makes me think I've set something up wrong.
Here are also some other stats from the publisher and subscriber:
Subscriber pg_stat_subscription:
db_name=> select * from pg_stat_subscription;
subid | subname | pid | relid | received_lsn | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time
-------+----------------+-------+-------+---------------+-------------------------------+-------------------------------+----------------+-------------------------------
21562 | rds_subscriber | 2373 | 18411 | | 2020-05-20 18:41:54.132407-07 | 2020-05-20 18:41:54.132407-07 | | 2020-05-20 18:41:54.132407-07
21562 | rds_subscriber | 43275 | | 4811/530587E0 | 2020-05-21 06:15:55.160231-07 | 2020-05-21 06:15:55.16003-07 | 4811/5304BD10 | 2020-05-21 06:15:54.931053-07
(2 rows)
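For comparison, the sending side can be watched on the publisher with pg_stat_replication (standard Postgres 11 columns; nothing RDS-specific is assumed here):
SELECT pid, application_name, state, sent_lsn, replay_lsn FROM pg_stat_replication;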
At this rate it would take weeks to complete. What am I doing wrong here?
I have Celery set up and working together with Django. I have some periodic tasks that run. The celery log shows that the tasks are executed and that they return something.
[2017-03-26 14:34:27,039: INFO/MainProcess] Received task: my_webapp.apps.events.tasks.clean_outdated[87994396-04f7-452b-a964-f6bdd07785e0]
[2017-03-26 14:34:28,328: INFO/PoolWorker-1] Task my_webapp.apps.events.tasks.clean_outdated[87994396-04f7-452b-a964-f6bdd07785e0] succeeded in 0.05246314400005758s: 'Removed 56 event(s)
| Removed 4 SGW(s)
'
But the results are not showing up on the django-celery-results admin page.
These are my settings:
CELERY_BROKER_URL = os.environ.get('BROKER_URL')
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'Europe/Stockholm'
CELERY_RESULT_BACKEND = 'django-cache'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
CELERY_RESULT_DB_SHORT_LIVED_SESSIONS = True # Fix for low traffic sites like this one
I have also tried setting CELERY_RESULT_BACKEND = 'django-db'. I know the migrations have been run (when using those settings), and the table exists in the database:
my_webapp=> \dt
List of relations
Schema | Name | Type | Owner
--------+--------------------------------------+-------+----------------
...
public | django_celery_beat_crontabschedule | table | my_webapp
public | django_celery_beat_intervalschedule | table | my_webapp
public | django_celery_beat_periodictask | table | my_webapp
public | django_celery_beat_periodictasks | table | my_webapp
public | django_celery_results_taskresult | table | my_webapp
...
(26 rows)
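To check whether any results are landing in that table at all, it can be queried directly (a quick sanity check; task_id, status, and date_done are core columns of django-celery-results' TaskResult model):
SELECT task_id, status, date_done FROM django_celery_results_taskresult ORDER BY date_done DESC LIMIT 10;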
Google won't give me much help; most answers are about old libraries like djcelery. Any idea how to get the results into the table?
I have a table in my Postgres database which has data structured strangely. Here is an example of the data structure:
id | 1
name | name
data | :type: information
| :url: url
| :platform:
| android: ''
| iphone: ''
created_at | 2016-07-29 11:39:44.938359
updated_at | 2016-08-22 12:24:32.734321
How do I change data > platform > android, for example?
Just did some more research and found this, which did the trick:
postgresql - replace all instances of a string within text field
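A minimal sketch of that approach, assuming the serialized blob lives in a plain text column named data (the table name and the replacement value here are hypothetical):
-- Dollar-quoting avoids having to double up the single quotes in the match string.
UPDATE my_table SET data = replace(data, $$android: ''$$, $$android: 'com.example.app'$$) WHERE id = 1;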
Trying to connect to Postgres using pyodbc.
I can connect to the DB with isql:
echo "select 1" | isql -v my-connector
Returns:
+---------------------------------------+
| Connected! |
| |
| sql-statement |
| help [tablename] |
| quit |
| |
+---------------------------------------+
SQL> select 1
+------------+
| ?column? |
+------------+
| 1 |
+------------+
SQLRowCount returns 1
1 rows fetched
But when I try to connect with pyodbc:
import pyodbc
con = pyodbc.connect("DRIVER={PostgreSQL Unicode}; DATABASE=<dbname>; UID=<username>; PWD=<password>; SERVER=localhost; PORT=5432;")
I get the following error:
pyodbc.Error: ('08001', '[08001] [unixODBC]connction string lacks some options (202) (SQLDriverConnect)')
odbc.ini file looks like this:
[my-connector]
Description = PostgreSQL connection to '<dbname>' database
Driver = PostgreSQL Unicode
Database = <dbname>
Servername = localhost
UserName = <username>
Password = <password>
Port = 5432
Protocol = 9.3
ReadOnly = No
RowVersioning = No
ShowSystemTables = No
ShowOidColumn = No
FakeOidIndex = No
ConnSettings =
odbcinst.ini file looks like this:
[PostgreSQL ANSI]
Description = PostgreSQL ODBC driver (ANSI version)
Driver = psqlodbca.so
Setup = libodbcpsqlS.so
Debug = 0
CommLog = 1
UsageCount = 1
[PostgreSQL Unicode]
Description = PostgreSQL ODBC driver (Unicode version)
Driver = psqlodbcw.so
Setup = libodbcpsqlS.so
Debug = 0
CommLog = 1
UsageCount = 1
Notes:
Ubuntu 14.04
Python 3
Postgresql 9.3
I have used psycopg2 in the past to connect to Postgres; however, my current company uses Netezza, Postgres, and MySQL. I want to write one connection module and use different drivers to connect to the different databases.
Any help would be greatly appreciated.
-- Thanks
Since you already have a working DSN defined in odbc.ini you can just use that:
con = pyodbc.connect("DSN=my-connector")
Also, for the record, the extra whitespace in your connection string may have been confusing the issue, because this worked fine for me (under Python 2.7, at least):
import pyodbc
conn_str = (
"DRIVER={PostgreSQL Unicode};"
"DATABASE=postgres;"
"UID=postgres;"
"PWD=whatever;"
"SERVER=localhost;"
"PORT=5432;"
)
conn = pyodbc.connect(conn_str)
crsr = conn.execute("SELECT 123 AS n")
row = crsr.fetchone()
print(row)
crsr.close()
conn.close()