Golang Panic Crash on Searching Postgres

So I have a feature that I want to add to a program I'm writing. Basically, if a user has cookies that say what their username and password are, the server looks up the username and the hashed value of the cookie in a Postgres database and returns a positive response plus a JWT token if there's a match (made as secure as I can with bcrypt hashing - whether storing hashed passwords like this is a good idea is maybe another question).
This request is made to the server every time a user goes to a page that requires a saved password.
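For context, the cookie check described above presumably boils down to a bcrypt comparison. A minimal sketch, assuming golang.org/x/crypto/bcrypt and hypothetical names (this is not the asker's actual code):
package authentication

import "golang.org/x/crypto/bcrypt"

// cookieMatches reports whether the raw cookie value matches the bcrypt hash
// stored in Postgres for this user. Illustrative only.
func cookieMatches(storedHash, cookieValue string) bool {
    // CompareHashAndPassword returns nil only when the plaintext matches the hash.
    return bcrypt.CompareHashAndPassword([]byte(storedHash), []byte(cookieValue)) == nil
}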
My problem is that every now and then I get the following:
inside CookieLogin
app_1 | search_userinfo_table started
db_1 | FATAL: sorry, too many clients already
app_1 | 2018/08/26 16:22:25 pq: sorry, too many clients already
app_1 | 2018/08/26 16:22:25 http: panic serving 172.21.0.1:37304: pq: sorry, too many clients already
app_1 | goroutine 704 [running]:
app_1 | net/http.(*conn).serve.func1(0xc420156f00)
app_1 | /usr/local/go/src/net/http/server.go:1721 +0xd0
app_1 | panic(0x6d63e0, 0xc420102c20)
app_1 | /usr/local/go/src/runtime/panic.go:489 +0x2cf
app_1 | log.Panic(0xc4200438c0, 0x1, 0x1)
app_1 | /usr/local/go/src/log/log.go:322 +0xc0
app_1 | github.com/patientplatypus/webserver/database.Search_userinfo_table(0xc42017f29d, 0xb, 0xc4201500c0, 0xc4200c22e0, 0xc420043a10)
app_1 | /go/src/github.com/patientplatypus/webserver/database/search.go:18 +0x169
app_1 | github.com/patientplatypus/webserver/authentication.loginHandler(0xc42017f29d, 0xb, 0xc42017f2ba, 0x3, 0x899a80, 0xc420148460)
app_1 | /go/src/github.com/patientplatypus/webserver/authentication/login.go:53 +0x39
app_1 | github.com/patientplatypus/webserver/authentication.CookieLogin(0x899a80, 0xc420148460, 0xc4203fc800)
app_1 | /go/src/github.com/patientplatypus/webserver/authentication/login.go:35 +0x279
app_1 | net/http.HandlerFunc.ServeHTTP(0x75c658, 0x899a80, 0xc420148460, 0xc4203fc800)
app_1 | /usr/local/go/src/net/http/server.go:1942 +0x44
app_1 | github.com/gorilla/mux.(*Router).ServeHTTP(0xc420052460, 0x899a80, 0xc420148460, 0xc4203fc800)
app_1 | /go/src/github.com/gorilla/mux/mux.go:162 +0x101
app_1 | main.JWTHandler.func1(0x899a80, 0xc420148460, 0xc4203fc600)
app_1 | /go/src/github.com/patientplatypus/webserver/main.go:34 +0x342
app_1 | net/http.HandlerFunc.ServeHTTP(0xc4200b12a0, 0x899a80, 0xc420148460, 0xc4203fc600)
app_1 | /usr/local/go/src/net/http/server.go:1942 +0x44
app_1 | github.com/rs/cors.(*Cors).Handler.func1(0x899a80, 0xc420148460, 0xc4203fc600)
app_1 | /go/src/github.com/rs/cors/cors.go:200 +0xe9
app_1 | net/http.HandlerFunc.ServeHTTP(0xc4200b12c0, 0x899a80, 0xc420148460, 0xc4203fc600)
app_1 | /usr/local/go/src/net/http/server.go:1942 +0x44
app_1 | net/http.serverHandler.ServeHTTP(0xc42008a630, 0x899a80, 0xc420148460, 0xc4203fc600)
app_1 | /usr/local/go/src/net/http/server.go:2568 +0x92
app_1 | net/http.(*conn).serve(0xc420156f00, 0x899fc0, 0xc4200506c0)
app_1 | /usr/local/go/src/net/http/server.go:1825 +0x612
app_1 | created by net/http.(*Server).Serve
app_1 | /usr/local/go/src/net/http/server.go:2668 +0x2ce
...and then the Go app crashes and I have to restart it manually. This error only happens sometimes, which is incredibly frustrating because it makes debugging difficult (sorry for the verbose terminal log - I'm not sure when the error will fire again).
Does anyone have any advice or suggestions? Is this a common problem with a no-duh solution?
Thanks!
EDIT:
As far as I can tell, here is the offending postgres search query:
package data

import (
    "fmt"
    "log"

    _ "github.com/lib/pq"
)

func Search_userinfo_table(searchEmail string) (bool, string) {
    fmt.Println("search_userinfo_table started")
    rows, err1 := db.Query("SELECT * FROM userinfo")
    if err1 != nil {
        log.Panic(err1)
    }
    for rows.Next() {
        var email string
        var password string
        var regString string
        var regBool bool
        var uid int
        err2 := rows.Scan(&email, &password, &regString, &regBool, &uid)
        if err2 != nil {
            log.Panic(err2)
        }
        fmt.Println("email | password")
        fmt.Printf("%s | %s", email, password)
        if email == searchEmail {
            return false, password
        }
    }
    return true, ""
}
This returns whether or not the user is in the database, as well as the hashed password. Talking with a few people, the answer appears to be that Postgres gets a new connection every time this read runs and then eventually falls over, but I want the app to reuse connections (although I thought that happened automagically?). Anyway, more information.
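For reference, database/sql does pool and reuse connections automatically, but only once each connection is released back to the pool; you can also cap the pool explicitly so that a leak surfaces as slow queries rather than a crashed database. A sketch, assuming lib/pq and a hypothetical openDB helper with illustrative limits:
package database

import (
    "database/sql"
    "time"

    _ "github.com/lib/pq"
)

// openDB is a sketch of configuring the connection pool explicitly.
// The limits below are illustrative, not tuned for this app.
func openDB(dsn string) (*sql.DB, error) {
    db, err := sql.Open("postgres", dsn)
    if err != nil {
        return nil, err
    }
    db.SetMaxOpenConns(20)                  // hard cap on simultaneous connections to Postgres
    db.SetMaxIdleConns(10)                  // idle connections kept around for reuse
    db.SetConnMaxLifetime(30 * time.Minute) // recycle long-lived connections periodically
    return db, nil
}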

You're missing the required defer rows.Close().
Each instance of *sql.Rows has a connection associated with it. When you close a rows instance the associated connection gets released and put back into the pool (*DB).
Not closing a rows instance, on the other hand, means its connection is never released, and not releasing connections in turn causes the pool to keep opening new ones until the server refuses additional clients.
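A minimal sketch of the question's function with that fix applied, assuming the same package-level db *sql.DB and imports as in the question (the log.Panic error handling is kept as-is and the debug prints are dropped; returning errors would be more idiomatic):
func Search_userinfo_table(searchEmail string) (bool, string) {
    fmt.Println("search_userinfo_table started")
    rows, err := db.Query("SELECT * FROM userinfo")
    if err != nil {
        log.Panic(err)
    }
    // Closing the rows releases the underlying connection back to the pool.
    // Without this, every call leaks one connection until Postgres refuses new clients.
    defer rows.Close()

    for rows.Next() {
        var email, password, regString string
        var regBool bool
        var uid int
        if err := rows.Scan(&email, &password, &regString, &regBool, &uid); err != nil {
            log.Panic(err)
        }
        if email == searchEmail {
            // The deferred rows.Close() still runs on this early return.
            return false, password
        }
    }
    // rows.Err reports any error that ended the iteration early.
    if err := rows.Err(); err != nil {
        log.Panic(err)
    }
    return true, ""
}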

Related

postgresql subscription not working for schemas other than public

I'm trying to set up logical replication between two local PostgreSQL servers (node1: port 5434, node2: port 5435).
I could successfully create a publication and a subscription on node1 and node2 for a table in the public schema.
Node1:
CREATE PUBLICATION my_pub FOR TABLE t1;
GRANT SELECT ON t1 TO repuser;
Node2:
CREATE SUBSCRIPTION my_sub CONNECTION 'host=localhost port=5434 dbname=pub user=repuser password=password' PUBLICATION my_pub;
Node2 public.t1 replicates all data in node1 public.t1.
However, my problem is that when I create the publication and subscription with the same commands but in a different schema, node2 fails to replicate.
Below is the output of some pg_catalog queries:
Node1:
pub=# select * from pg_catalog.pg_publication_tables;
pubname | schemaname | tablename
----------+------------+-----------
my_pub | public | t1
cdl_test | cdl | t1
pub_test | test | t1
Node2:
sub=# \dRs
List of subscriptions
Name | Owner | Enabled | Publication
--------------+----------+---------+-------------
cdl_sub_test | postgres | t | {cdl_test}
my_sub | postgres | t | {my_pub}
sub_test | postgres | t | {pub_test}
sub=# select * from pg_catalog.pg_replication_origin;
roident | roname
---------+----------
2 | pg_18460
1 | pg_18461
3 | pg_18466
sub=# select * from pg_catalog.pg_subscription_rel ;
srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+------------
18461 | 16386 | r | 0/3811C810
18466 | 18463 | d |
18460 | 18456 | d |
As shown in select * from pg_catalog.pg_subscription_rel, the two subscriptions for the test and cdl schemas are stuck in the d (data is being copied) state.
Any recommendation on how to go about this problem or diagnose why the problem occurs?
As jjanes has suggested, a snippet of the log file is shown below:
2022-01-17 16:05:25.165 PST [622] WARNING: out of logical replication worker slots
2022-01-17 16:05:25.165 PST [622] HINT: You might need to increase max_logical_replication_workers.
2022-01-17 16:05:25.168 PST [970] LOG: logical replication table synchronization worker for subscription "cdl_sub_test", table "t1" has started
2022-01-17 16:05:25.245 PST [970] ERROR: could not start initial contents copy for table "cdl.t1": ERROR: permission denied for schema cdl
2022-01-17 16:05:25.247 PST [471] LOG: background worker "logical replication worker" (PID 970) exited with exit code 1
2022-01-17 16:05:25.797 PST [488] postgres#sub LOG: statement: /*pga4dash*/
It seems like the subscriber doesn't have permission to read the cdl schema on the publisher, even after I granted SELECT ON cdl.t1 TO repuser;.
You have to give the user repuser permission to read the table that should be replicated. That also requires USAGE permission on the schema that contains the table.
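A sketch of the grants that would address the permission error above, using the schema, table, and role names from the question (run on the publisher; adjust to your own objects):
GRANT USAGE ON SCHEMA cdl TO repuser;
GRANT SELECT ON cdl.t1 TO repuser;
GRANT USAGE ON SCHEMA test TO repuser;
GRANT SELECT ON test.t1 TO repuser;
Once the permissions are in place, the table synchronization workers should retry the initial copy for the tables that are stuck in the d state.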

Index not creating in kibana for mongodb + logstash +elasticsearch

I'm trying to visualize MongoDB data in Kibana using a Logstash configuration. Below is my configuration. I'm getting some output in the terminal and it loops forever, but I can't see any index created with the name mentioned in the config file, and even if the index was generated it doesn't have any data in it - the Discover tab says there are no results to match. How do I make this configuration visualize the data in Kibana?
input {
  mongodb {
    uri => "mongodb+srv:###############?retryWrites=true&w=majority"
    placeholder_db_dir => "C:/logstash-mongodb"
    placeholder_db_name => "logstash1_sqlite.db"
    collection => "logs"
    batch_size => 1
  }
}
filter {
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    action => "index"
    index => "ayesha_logs"
    hosts => ["localhost:9200"]
  }
}
http://localhost:9200/ayesha_logs/_search?pretty
Terminal logs:
D, [2020-10-01T08:11:45.717000 #2372] DEBUG -- : MONGODB | range-api-test-cluster-shard-00-02.icqif.azure.mongodb.net:27017 req:259 conn:1:1 sconn:231839 | coexistence-poc.listCollections | STARTED | {"listCollections"=>1, "cursor"=>{}, "nameOnly"=>true, "$db"=>"coexistence-poc", "$clusterTime"=>{"clusterTime"=>#<BSON::Timestamp:0x32598cb2 #increment=1, #seconds=1601532700>, "signature"=>{"hash"=><BSON::Binary:0x2622 type=generic data=0xfaf25a8d85...
D, [2020-10-01T08:11:45.755000 #2372] DEBUG -- : MONGODB | range-api-test-cluster-shard-00-02.icqif.azure.mongodb.net:27017 req:259 | coexistence-poc.listCollections | SUCCEEDED | 0.038s
D, [2020-10-01T08:11:50.801000 #2372] DEBUG -- : MONGODB | range-api-test-cluster-shard-00-02.icqif.azure.mongodb.net:27017 req:260 conn:1:1 sconn:231839 | coexistence-poc.find | STARTED | {"find"=>"coexistence-pinfobackfill-logs", "filter"=>{"_id"=>{"$gt"=>BSON::ObjectId('5f71f009b6b9115861d379d8')}}, "limit"=>50, "$db"=>"coexistence-poc", "$clusterTime"=>{"clusterTime"=>#<BSON::Timestamp:0x32598cb2 #increment=1, #seconds=1601532700>, ...
D, [2020-10-01T08:11:50.843000 #2372] DEBUG -- : MONGODB | range-api-test-cluster-shard-00-02.icqif.azure.mongodb.net:27017 req:260 | coexistence-poc.find | SUCCEEDED | 0.042s
D, [2020-10-01T08:11:50.859000 #2372] DEBUG -- : MONGODB | range-api-test-cluster-shard-00-02.icqif.azure.mongodb.net:27017 req:261 conn:1:1 sconn:231839 | coexistence-poc.listCollections | STARTED | {"listCollections"=>1, "cursor"=>{}, "nameOnly"=>true, "$db"=>"coexistence-poc", "$clusterTime"=>{"clusterTime"=>#<BSON::Timestamp:0x32598cb2 #increment=1, #seconds=1601532700>, "signature"=>{"hash"=><BSON::Binary:0x2622 type=generic data=0xfaf25a8d85...
D, [2020-10-01T08:11:50.906000 #2372] DEBUG -- : MONGODB | range-api-test-cluster-shard-00-02.icqif.azure.mongodb.net:27017 req:261 | coexistence-poc.listCollections | SUCCEEDED | 0.047s
Did you create your Kibana index pattern?
If not, just go to Menu > Stack Management > Kibana > Index Patterns, create a new index pattern for your index, and follow the steps.
You will then be able to use your index in the Discover and Visualize tabs.
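It can also be worth confirming that Logstash actually created the index before building the index pattern; with the local Elasticsearch from the question, the standard _cat API would be something like:
http://localhost:9200/_cat/indices/ayesha_logs?v
If that returns nothing, the problem is on the Logstash/Elasticsearch side rather than in Kibana.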

slow running postgres11 logical replication from EC2 to RDS

I'm trying to move a Postgres (11.6) database on EC2 to RDS (Postgres 11.6). I started replication a couple of nights ago and have now noticed that it has slowed down considerably, judging by how fast the database size is increasing on the subscriber (SELECT pg_database_size('db_name')/1024/1024). Here are some stats of the environment:
Publisher Node:
Instance type: r5.24xlarge
Disk: 5Tb GP2 with 16,000 PIOPs
Database Size w/ pg_database_size/1024/1024: 2,295,955 mb
Subscriber Node:
Instance type: RDS r5.24xlarge
Disk: 3Tb GP2
Here is the current DB size for the subscriber and publisher:
Publisher:
SELECT pg_database_size('db_name')/1024/1024 db_size_publisher;
db_size_publisher
-------------------
2295971
(1 row)
Subscriber:
SELECT pg_database_size('db_name')/1024/1024 as db_size_subscriber;
db_size_subscriber
--------------------
1506348
(1 row)
The difference is still about 789 GB left to replicate, it seems, and I've noticed that the subscriber db is increasing at a rate of about 250 kB/sec:
db_name=> SELECT pg_database_size('db_name')/1024/1024, current_timestamp;
?column? | current_timestamp
----------+-------------------------------
1506394 | 2020-05-21 06:27:46.028805-07
(1 row)
db_name=> SELECT pg_database_size('db_name')/1024/1024, current_timestamp;
?column? | current_timestamp
----------+-------------------------------
1506396 | 2020-05-21 06:27:53.542946-07
(1 row)
At this rate, it would take another 30 days to finish replication, which makes me think I've set something up wrong.
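(A quick sanity check on that estimate, using the sizes quoted above: 2,295,971 − 1,506,348 ≈ 789,623 MB still to copy, and at roughly 0.25 MB/s that is 789,623 / 0.25 ≈ 3.2 million seconds, i.e. about 36 days, so the 30-day figure is in the right ballpark.)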
Here are also some other stats from the publisher and subscriber:
Subscriber pg_stat_subscription:
db_name=> select * from pg_stat_subscription;
subid | subname | pid | relid | received_lsn | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time
-------+----------------+-------+-------+---------------+-------------------------------+-------------------------------+----------------+-------------------------------
21562 | rds_subscriber | 2373 | 18411 | | 2020-05-20 18:41:54.132407-07 | 2020-05-20 18:41:54.132407-07 | | 2020-05-20 18:41:54.132407-07
21562 | rds_subscriber | 43275 | | 4811/530587E0 | 2020-05-21 06:15:55.160231-07 | 2020-05-21 06:15:55.16003-07 | 4811/5304BD10 | 2020-05-21 06:15:54.931053-07
(2 rows)
At this rate it would take weeks to complete... what am I doing wrong here?

django-celery-results won't receive results

I have Celery set up and working together with Django. I have some periodic tasks that run. The Celery log shows that the tasks are executed and that they return something.
[2017-03-26 14:34:27,039: INFO/MainProcess] Received task: my_webapp.apps.events.tasks.clean_outdated[87994396-04f7-452b-a964-f6bdd07785e0]
[2017-03-26 14:34:28,328: INFO/PoolWorker-1] Task my_webapp.apps.events.tasks.clean_outdated[87994396-04f7-452b-a964-f6bdd07785e0] succeeded in 0.05246314400005758s: 'Removed 56 event(s)
| Removed 4 SGW(s)
'
But the results are not showing up on the django-celery-results admin page.
These are my settings:
CELERY_BROKER_URL = os.environ.get('BROKER_URL')
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'Europe/Stockholm'
CELERY_RESULT_BACKEND = 'django-cache'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
CELERY_RESULT_DB_SHORT_LIVED_SESSIONS = True # Fix for low traffic sites like this one
I have also tried setting CELERY_RESULT_BACKEND = 'django-db'. I know the migrations have been run (when using those settings), and the table exists in the database:
my_webapp=> \dt
List of relations
Schema | Name | Type | Owner
--------+--------------------------------------+-------+----------------
...
public | django_celery_beat_crontabschedule | table | my_webapp
public | django_celery_beat_intervalschedule | table | my_webapp
public | django_celery_beat_periodictask | table | my_webapp
public | django_celery_beat_periodictasks | table | my_webapp
public | django_celery_results_taskresult | table | my_webapp
...
(26 rows)
Google won't give me much help; most answers are about old libraries like djcelery. Any idea how to get the results into the table?

Can't use PGPool with Amazon RDS Postgres

I have a Postgres 9.4 RDS instance with Multi-AZ, and there's a slave, read-only replica.
Up to this point the load balancing was done in the business layer of my app, but it's inefficient, and I was hoping to use PGPool so that the app interacts with a single Postgres connection.
It turns out that using PGPool has been a pain in the ass. If I set it to act as a load balancer, simple SELECT queries throw errors like:
SQLSTATE[HY000]: General error: 7
message contents do not agree with length in message type "N"
server sent data ("D" message)
without prior row description ("T" message)
If I set it to act in master/slave mode with streaming replication (as suggested on the Postgres mailing list) I get:
psql: ERROR: MD5 authentication is unsupported
in replication and master-slave modes.
HINT: check pg_hba.conf
Yeah, well, pg_hba.conf is off limits in RDS, so I can't alter it.
Has anyone got PGPool to work in RDS? Are there other tools that can act as middleware to take advantage of reading replicas in RDS?
I was able to make it work. Here are my working config files.
You have to use md5 authentication and sync the username/password from your database to the pool_passwd file. You also need enable_pool_hba, load_balance_mode, and master_slave_mode turned on.
pgpool.conf
listen_addresses = '*'
port = 9999
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/tmp'
listen_backlog_multiplier = 1
backend_hostname0 = 'master-rds-database-with-multi-AZ.us-west-2.rds.amazonaws.com'
backend_port0 = 5432
backend_weight0 = 0
backend_flag0 = 'ALWAYS_MASTER'
backend_hostname1 = 'readonly-replica.us-west-2.rds.amazonaws.com'
backend_port1 = 5432
backend_weight1 = 999
backend_flag1 = 'ALWAYS_MASTER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
ssl = on
num_init_children = 1
max_pool = 2
connection_cache = off
replication_mode = off
load_balance_mode = on
master_slave_mode = on
pool_hba.conf
local all all md5
host all all 127.0.0.1/32 md5
pool_passwd
username:md5d51c9a7e9353746a6020f9602d452929
To update pool_passwd you can use pg_md5, or:
echo username:md5`echo -n usernamepassword | md5sum`
username:md5d51c9a7e9353746a6020f9602d452929 -
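If your pgpool-II install ships pg_md5, the equivalent is roughly the following (option names can differ between pgpool-II versions, so treat this as a sketch):
pg_md5 --md5auth --username=username password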
Output of running example:
psql --dbname=database --host=localhost --username=username --port=9999
database=> SHOW POOL_NODES;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+-------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------
0 | master-rds-database.us-west-2.rds.amazonaws.com | 8193 | up | 0.000000 | primary | 0 | false | 0
1 | readonly-replica.us-west-2.rds.amazonaws.com | 8193 | up | 1.000000 | standby | 0 | true | 0
database=> select now();
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+-------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------
0 | master-rds-database.us-west-2.rds.amazonaws.com | 8193 | up | 0.000000 | primary | 0 | false | 0
1 | readonly-replica.us-west-2.rds.amazonaws.com | 8193 | up | 1.000000 | standby | 1 | true | 1
database=> CREATE TABLE IF NOT EXISTS tmp_test_read_write ( data varchar(40) );
CREATE TABLE
database=> INSERT INTO tmp_test_read_write (data) VALUES (concat('',inet_server_addr()));
INSERT 0 1
database=> select data as master_ip,inet_server_addr() as replica_ip from tmp_test_read_write;
master_ip | replica_ip
--------------+---------------
172.31.37.69 | 172.31.20.121
(1 row)
You can also see from the logs that it hits both databases:
2018-10-16 07:56:37: pid 124528: LOG: DB node id: 0 backend pid: 21731 statement: CREATE TABLE IF NOT EXISTS tmp_test_read_write ( data varchar(40) );
2018-10-16 07:56:47: pid 124528: LOG: DB node id: 0 backend pid: 21731 statement: INSERT INTO tmp_test_read_write (data) VALUES (concat('',inet_server_addr()));
2018-10-16 07:56:52: pid 124528: LOG: DB node id: 1 backend pid: 24890 statement: select data as master_ip,inet_server_addr() as replica_ip from tmp_test_read_write;
Notice the INSERT used the IP address of the master, and the next SELECT used the IP address of the read-only replica.
I can update after more testing, but psql client testing looks promising.
There is Citus (pg_shard), which is supposed to work with standard Amazon RDS instances. It has catches, though: you will have a single point of failure if you use the open source version, because its coordinator node is not duplicated.
You can get a fully HA, seamless-failover version of it, but you have to buy the enterprise licence, and it is CRAZY expensive. It will easily cost you $50,000 to $100,000 or more per year.
Also, they are REALLY pushing their cloud version now, which is even more insanely expensive.
https://www.citusdata.com/
I have also heard of people using HAProxy to balance between Postgres or MySQL nodes.