Higher than expected latency in Go/Postgres connection [duplicate]
I'm running the same query against a local PostgreSQL instance from a Go application and from psql, and the timings differ greatly; I'm wondering why. Using explain analyze, the query took 1 ms; using database/sql in Go, it took 24 ms. I've added my code snippets below. I realize that explain analyze may not be equivalent to querying the database directly, and that some network latency may be involved as well, but the discrepancy is still significant. Why is there such a discrepancy?
Edit: I've tried the above with a sample size of 10+ queries, and the discrepancy still holds.
postgres=# \timing
Timing is on.
postgres=# select 1;
?column?
----------
1
(1 row)
Time: 2.456 ms
postgres=# explain analyze select 1;
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
Planning Time: 0.017 ms
Execution Time: 0.012 ms
(3 rows)
Time: 3.748 ms
package main

import (
    "database/sql"
    "fmt"
    _ "github.com/lib/pq"
    "time"
)

func main() {
    // setup database connection
    db, err := sql.Open("postgres", "host='localhost' port=5432 user='postgres' password='' dbname='postgres' sslmode=disable")
    if err != nil {
        panic(err)
    }

    // query database
    firstQueryStart := time.Now()
    _, err = db.Query("select 1;")
    firstQueryEnd := time.Now()
    if err != nil {
        panic(err)
    }
    fmt.Println(fmt.Sprintf("first query took %s", firstQueryEnd.Sub(firstQueryStart).String()))

    // run the same query a second time and measure the timing
    secondQueryStart := time.Now()
    _, err = db.Query("select 1;")
    secondQueryEnd := time.Now()
    if err != nil {
        panic(err)
    }
    fmt.Println(fmt.Sprintf("second query took %s", secondQueryEnd.Sub(secondQueryStart).String()))
}
first query took 13.981435ms
second query took 13.343845ms
Note #1: sql.DB does not represent a single connection; it represents a pool of connections.
Note #2: sql.Open initializes the pool but does not necessarily open a connection; it is allowed to merely validate the DSN input, leaving the actual opening of connections to be handled lazily by the pool.
The reason your 1st db.Query is slow is because you're starting off with a fresh connection pool, one that has 0 idle (but open) connections, and therefore the 1st db.Query will need to first establish a new connection to the server and only after that will it be able to execute the sql statement.
The reason your 2nd db.Query is also slow is because the connection created by the 1st db.Query has not been released back to the pool, and therefore your 2nd db.Query will also need to first establish a new connection to the server before it can execute the sql statement.
To release a connection back to the pool you need to first retain the primary return value of db.Query and then invoke the Close method on it.
To start off with a pool that has at least one available connection, call Ping right after initializing the pool.
Example:
package main

import (
    "database/sql"
    "fmt"
    _ "github.com/lib/pq"
    "time"
)

func main() {
    // setup database connection
    db, err := sql.Open("postgres", "postgres:///?sslmode=disable")
    if err != nil {
        panic(err)
    } else if err := db.Ping(); err != nil {
        panic(err)
    }

    for i := 0; i < 5; i++ {
        // query database
        firstQueryStart := time.Now()
        rows, err := db.Query("select 1;")
        firstQueryEnd := time.Now()
        if err != nil {
            panic(err)
        }
        // put the connection back into the pool so
        // that it can be reused by the next iteration
        rows.Close()
        fmt.Println(fmt.Sprintf("query #%d took %s", i, firstQueryEnd.Sub(firstQueryStart).String()))
    }
}
Times on my machine (without db.Ping; only #0 is slow):
query #0 took 6.312676ms
query #1 took 102.88µs
query #2 took 66.702µs
query #3 took 64.694µs
query #4 took 208.016µs
Times on my machine (with db.Ping; #0 is a lot faster than without):
query #0 took 284.846µs
query #1 took 78.349µs
query #2 took 76.518µs
query #3 took 81.733µs
query #4 took 103.862µs
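Beyond the initial Ping, the pool's connection reuse can be tuned. Here is a minimal sketch (not part of the original answer), assuming the same DSN as above; SetMaxOpenConns, SetMaxIdleConns, and SetConnMaxLifetime are all part of database/sql:

db, err := sql.Open("postgres", "postgres:///?sslmode=disable")
if err != nil {
    panic(err)
}
db.SetMaxOpenConns(10)                  // cap the number of open connections
db.SetMaxIdleConns(5)                   // keep up to 5 idle connections for reuse
db.SetConnMaxLifetime(30 * time.Minute) // recycle long-lived connections
if err := db.Ping(); err != nil {       // open the first connection eagerly
    panic(err)
}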
A note on prepared statements:
If you're executing a simple query with no arguments e.g. db.Query("select 1 where true") then you really are executing just a simple query.
If, however, you're executing a query with arguments e.g. db.Query("select 1 where $1", true) then, in actuality, you are creating and executing a prepared statement.
See 4.2. Value Expressions, which says:
A value expression is one of the following: ...
A positional parameter reference, in the body of a function definition or prepared statement
...
Also Positional Parameters says:
A positional parameter reference is used to indicate a value that is
supplied externally to an SQL statement. Parameters are used in SQL
function definitions and in prepared queries. Some client libraries
also support specifying data values separately from the SQL command
string, in which case parameters are used to refer to the out-of-line
data values.
And here is how the Postgres message-flow protocol specifies simple queries versus extended queries:
The extended query protocol breaks down the above-described simple
query protocol into multiple steps. The results of preparatory steps
can be re-used multiple times for improved efficiency. Furthermore,
additional features are available, such as the possibility of
supplying data values as separate parameters instead of having to
insert them directly into a query string.
And finally, under the covers of the lib/pq driver:
...
// Check to see if we can use the "simpleQuery" interface, which is
// *much* faster than going through prepare/exec
if len(args) == 0 {
    return cn.simpleQuery(query)
}

if cn.binaryParameters {
    cn.sendBinaryModeQuery(query, args)

    cn.readParseResponse()
    cn.readBindResponse()

    rows := &rows{cn: cn}
    rows.rowsHeader = cn.readPortalDescribeResponse()
    cn.postExecuteWorkaround()
    return rows, nil
}

st := cn.prepareTo(query, "")
st.exec(args)
return &rows{
    cn:         cn,
    rowsHeader: st.rowsHeader,
}, nil
...
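So if a parameterized query is executed repeatedly, the prepare step can be amortized by preparing it once yourself. A minimal sketch, assuming a db initialized as above (database/sql's Stmt is pool-aware and transparently re-prepares on other connections as needed):

stmt, err := db.Prepare("select 1 where $1")
if err != nil {
    panic(err)
}
defer stmt.Close()

for i := 0; i < 5; i++ {
    rows, err := stmt.Query(true) // executes the already-prepared statement
    if err != nil {
        panic(err)
    }
    rows.Close() // release the connection back to the pool
}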
Related
Convert PostgreSQL Query to GORM Query
How do I convert the following SQL query to GORM?

SELECT * FROM files WHERE tsv @@ plainto_tsquery('lexeme word');

Or equivalently:

SELECT * FROM files, plainto_tsquery('lexeme word') q WHERE tsv @@ q;

I can use .Exec():

err := d.Connection.DB.Exec("SELECT * FROM scenes WHERE tsv @@ plainto_tsquery(?);", text).Error
if err != nil {
    return nil, err
}

This query doesn't allow me to get an array of data and work with it. Unfortunately, I could not find a solution in the GORM documentation.
Use .Raw() for a raw SQL query:

db.Raw("SELECT id, name, age FROM users WHERE name = ?", 3).Scan(&result)

See the official doc about raw SQL queries.
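For the full-text query from the question, a minimal sketch of how .Raw() with .Scan() might look; the File struct and its fields are assumptions, not from the original question:

type File struct {
    ID   uint
    Name string
}

var files []File
err := db.Raw(
    "SELECT * FROM files WHERE tsv @@ plainto_tsquery(?)",
    "lexeme word",
).Scan(&files).Error
if err != nil {
    // handle error
}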
How do I make this Golang to Postgres query faster? Any specific alternatives?
I am using Golang and Postgres to filter some financial data. I have a Postgres database with a single table containing a single stock market (if that's the correct term). This table has columns for id, symbol, date, open, high, low, close and volume. The total number of rows is 6,610,598 and the number of distinct stocks (symbols) is 2,174.

Now what I want to do is to filter the data from that table and save it to another table. So the first one contains raw data and the second one contains cleaned data. We have three parameters: a date (EVALDATE) and two integers (MINCTD and MINDP). First, we have to select only those stocks that pass our minimum calendar trading days parameter. That is selected by (NOTE: we use Golang for this):

symbols []string got its value from ( Select distinct symbol from table_name; )

[]filteredSymbols
var symbol, date string
var open, high, low, close float64
var volume int

for _, symbol := range symbols {
    var count int
    query := fmt.Sprintf("Select count(*) from table_name where symbol = '%s' and date >= '%s';", symbol, EVALDATE)
    row := db.QueryRow(query)
    if err := row.Scan(&count); err != nil ........
    if count >= MINCTD
        filteredSymbols = append(filteredSymbols, symbol)
}

Basically, the operation above only asks for those symbols which have enough rows from EVALDATE up to the current date (the latest date in the data) to satisfy MINCTD. The operation above took 30 minutes.

If a symbol satisfies the first filter above, it undergoes a second filter which tests whether, within that period (EVALDATE to LATEST_DATE), it has enough rows containing complete data (no OHLC rows without values). The query below is used to filter the symbols which passed the filter above:

Select count(*) from table_name where symbol='symbol' and date>= 'EVALDATE' and open != 0 and high != 0 and low != 0 and close != 0;

This query took 36 minutes.

After getting the slice of symbols which passed both filters, I then grab their data again with a Postgres query and begin a bulk insert into another table. So 1 hour and 6 minutes is not very acceptable. What should I do then? Grab all the data and then filter in memory using Golang?
A couple of things I note from the question. Try to avoid scanning 6 million+ rows to arrive at 2,174 values (i.e. avoid Select distinct symbol from table_name;). Do you not have (or can you build) a "master table" of symbols with a primary key on the symbols?

Combine your queries to test the data, such as the following (see the sketch after this answer):

select count(*) c1
     , count(case when open != 0 and high != 0 and low != 0 and close != 0
             then 1 end) as c2
  from table_name
 where symbol = 'symbol'
   and date >= 'EVALDATE'

An index on (symbol, date) would assist performance.
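A minimal sketch of running that combined query from Go with bind parameters instead of fmt.Sprintf; symbol, EVALDATE, MINCTD and MINDP are the names from the question, and how c2 maps onto MINDP is an assumption:

const q = `
    SELECT count(*) AS c1,
           count(CASE WHEN open != 0 AND high != 0 AND low != 0 AND close != 0
                 THEN 1 END) AS c2
      FROM table_name
     WHERE symbol = $1 AND date >= $2`

var c1, c2 int
if err := db.QueryRow(q, symbol, EVALDATE).Scan(&c1, &c2); err != nil {
    // handle error
}
// one round trip per symbol instead of two; compare c1 and c2
// against MINCTD and MINDP to decide whether the symbol passes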
In Go, clean 7,914,698 rows for 3,142 symbols in 28.7 seconds, which is better than 3,960 seconds (1 hour and 6 minutes) for 6,610,598 rows for 2,174 symbols.

Output:

$ go run clean.go
clean: 7914698 rows 28.679295705s
$ psql
psql (9.6.6)
peter=# \d clean
     Table "public.clean"
 Column |       Type       | Modifiers
--------+------------------+-----------
 id     | integer          |
 symbol | text             | not null
 date   | date             | not null
 close  | double precision |
 volume | integer          |
 open   | double precision |
 high   | double precision |
 low    | double precision |
Indexes:
    "clean_pkey" PRIMARY KEY, btree (symbol, date)

peter=# SELECT COUNT(*) FROM clean;
  count
---------
 7914698
peter=# SELECT COUNT(DISTINCT symbol) FROM clean;
 count
-------
  3142
peter=# \q
$

clean.go:

package main

import (
    "database/sql"
    "fmt"
    "strconv"
    "time"

    _ "github.com/lib/pq"
)

func clean(db *sql.DB, EVALDATE time.Time, MINCTD, MINDP int) (int64, time.Duration, error) {
    start := time.Now()

    tx, err := db.Begin()
    if err != nil {
        return 0, 0, err
    }
    committed := false
    defer func() {
        if !committed {
            tx.Rollback()
        }
    }()

    {
        const query = `DROP TABLE IF EXISTS clean;`
        if _, err := tx.Exec(query); err != nil {
            return 0, 0, err
        }
    }

    var nRows int64
    {
        const query = `
        CREATE TABLE clean AS
        SELECT id, symbol, date, close, volume, open, high, low
        FROM unclean
        WHERE symbol IN (
            SELECT symbol
            FROM unclean
            WHERE date >= $1
            GROUP BY symbol
            HAVING COUNT(*) >= $2
               AND COUNT(CASE WHEN NOT (open >0 AND high >0 AND low >0 AND close >0) THEN 1 END) <= $3
        )
        ORDER BY symbol, date
        ;
        `
        EVALDATE := EVALDATE.Format("'2006-01-02'")
        MINCTD := strconv.Itoa(MINCTD)
        MINDP := strconv.Itoa(MINDP)
        res, err := tx.Exec(query, EVALDATE, MINCTD, MINDP)
        if err != nil {
            return 0, 0, err
        }
        nRows, err = res.RowsAffected()
        if err != nil {
            return 0, 0, err
        }
    }

    {
        const query = `ALTER TABLE clean ADD PRIMARY KEY (symbol, date);`
        _, err := tx.Exec(query)
        if err != nil {
            return 0, 0, err
        }
    }

    if err = tx.Commit(); err != nil {
        return 0, 0, err
    }
    committed = true
    since := time.Since(start)

    {
        const query = `ANALYZE clean;`
        if _, err := db.Exec(query); err != nil {
            return nRows, since, err
        }
    }

    return nRows, since, nil
}

func main() {
    db, err := sql.Open("postgres", "user=peter password=peter dbname=peter")
    if err != nil {
        fmt.Println(err)
        return
    }
    defer db.Close()

    var (
        // one year
        EVALDATE = time.Now().AddDate(-1, 0, 0)
        MINCTD   = 240
        MINDP    = 5
    )

    nRows, since, err := clean(db, EVALDATE, MINCTD, MINDP)
    if err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println("clean:", nRows, "rows", since)
    return
}

Playground: https://play.golang.org/p/qVOQQ6mcU-1

References:
Technical Analysis of the Financial Markets: A Comprehensive Guide to Trading Methods and Applications, John J. Murphy.
An Introduction to Database Systems, 8th Edition, C.J. Date.
PostgreSQL: Introduction and Concepts, Bruce Momjian.
PostgreSQL 9.6.6 Documentation, PostgreSQL.
Golang postgres Commit unknown command error?
Using Postgres 9.3 and Go 1.6, I've been trying to use transactions with the Go pq library.

// Good
txn, _ := db.Begin()
txn.Query("UPDATE t_name SET a = 1")
err := txn.Commit() // err is nil

// Bad
txn, _ := db.Begin()
txn.Query("UPDATE t_name SET a = $1", 1)
err := txn.Commit() // Gives me an "unexpected command tag Q" error,
                    // although the data is committed

For some reason, when I execute a Query with parameters, I always get an unexpected command tag Q error from the Commit(). What is this error (what is Q?) and why am I getting it? I believe this is where the error is created.
To start off, I agree with Dmitri from the comments: in this case you should probably use Exec. However, after receiving this same issue I started digging.

Query returns two values, a Rows pointer and an error. What you always have to do with a Rows object is close it when you are done with it:

// Fixed
txn, _ := db.Begin()
rows, _ := txn.Query("UPDATE t_name SET a = $1", 1)
// Read out rows
rows.Close() // <- This will solve the error
err := txn.Commit()

I was, however, unable to see any difference in the traffic to the database when using rows.Close(), which indicates to me that this might be a bug in pq.
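Following the Exec suggestion above, a minimal sketch of the same transaction using Exec, which is the appropriate call for statements that return no rows:

txn, err := db.Begin()
if err != nil {
    panic(err)
}
if _, err := txn.Exec("UPDATE t_name SET a = $1", 1); err != nil {
    txn.Rollback()
    panic(err)
}
if err := txn.Commit(); err != nil { // commits cleanly; no Rows left open
    panic(err)
}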
Go PostgreSQL: How to obtain number of rows from a db.Query?
As the PostgreSQL connector I import the following package:

_ "github.com/lib/pq"

The query I run is:

res, err := db.Query("SELECT id FROM applications WHERE email='" + email + "'")

where email is naturally a string. One way to count the number of rows in res is the following snippet:

count := 0
for res.Next() {
    count++
    // some other code
}

but there should be a simpler (and quicker) way. It seems RowsAffected() is not the way to go. So, what is your suggestion?
Use the COUNT function:

"SELECT count(*) FROM applications WHERE email='" + email + "'"
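A minimal sketch of scanning that count with QueryRow, using a $1 placeholder rather than the string concatenation from the question:

var count int
err := db.QueryRow(
    "SELECT count(*) FROM applications WHERE email = $1",
    email,
).Scan(&count)
if err != nil {
    // handle error
}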
pq: invalid input syntax for type double precision: "$1" with Golang
I'm trying to do a simple INSERT into a PostgreSQL database in my Go program. I have the number 0, which is a float64, and a column in my database that expects double precision. I have no idea what I need to convert the number to for the database to accept the value.
The PostgreSQL driver handles insertion of float64 into double precision columns quite well:

tmp=# \d test
     Table "public.test"
 Column |       Type       | Modifiers
--------+------------------+-----------
 v      | double precision |

And code:

package main

import (
    "database/sql"
    "fmt"
    _ "github.com/lib/pq"
    "log"
)

func main() {
    db, err := sql.Open("postgres", "user=alex dbname=tmp sslmode=disable")
    if err != nil {
        log.Fatal(err)
    }
    result, err := db.Exec("insert into test values ($1)", float64(0.5))
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(result)
}

And then:

tmp=# select * from test ;
  v
-----
 0.5
 0.5
 0.5
(3 rows)

The question was downvoted because, obviously, the problem description you provided is not enough to reproduce the issue. I've tried to follow it, but as you can see, it works.