I'm working on using Postgres logical replication in Go using the pgx library to get database changes via a logical replication slot using wal2json as the output plugin. The Postgres version being used is v10.1.
In the WaitForReplicationMessage loop, when I receive a message and that message is a ServerHeartbeat, I send a standby status message to update the server with my consumed position in the WAL. This StandbyStatus message has a field called ReplyRequested which, if set to 1, tells the server to send a ServerHeartbeat; if it is 0, the server is not supposed to do anything.
Now I'm sending a StandbyStatus message with ReplyRequested set to 0 (which is the default value when the object is created). On sending this, the server sends a heartbeat message despite my telling it not to. I'm unable to see the cause of this issue.
Here's my code:
for {
log.Info("Waiting for message")
message, err := session.ReplConn.WaitForReplicationMessage(context.TODO())
if err != nil {
log.WithError(err).Errorf("%s", reflect.TypeOf(err))
continue
}
if message.WalMessage != nil {
log.Info(string(message.WalMessage.WalData))
} else if message.ServerHeartbeat != nil {
log.Info("Heartbeat requested")
// set the flushed LSN (and other LSN values) in the standby status and send to PG
log.Info(message.ServerHeartbeat)
// send Standby Status with the LSN position
err = session.sendStandbyStatus()
if err != nil {
log.WithError(err).Error("Unable to send standby status")
}
}
}
The sendStandbyStatus function above is:
func (session *Session) sendStandbyStatus() error {
standbyStatus, err := pgx.NewStandbyStatus(session.RestartLSN)
if err != nil {
return err
}
log.Info(standbyStatus) // the output of this confirms ReplyRequested is indeed 0
standbyStatus.ReplyRequested = 0 // still set it
err = session.ReplConn.SendStandbyStatus(standbyStatus)
if err != nil {
return err
}
return nil
}
Related
I have two tables in a Postgres database, which runs inside Docker: data_table, where the usual types are stored, and a related data_files_table, where file data is stored as bytea. When I insert into data_table and data_files_table in one transaction from a Golang service using sqlx, I sometimes get an error:
unexpected EOF
on the insert query into data_files_table. It often occurs with data bigger than 7 MB.
In the Postgres logs I found a row that might be related:
could not receive data from client: Connection reset by peer
Why does this occur? Is it related to TOAST, TCP connection killing, or something else?
Here is my code:
func (r *repo) CreateGrantWithDocuments(
ctx context.Context,
model models.Grant,
documents []models.GrantDocument,
) (*pgtype.UUID, error) {
tx, err := r.db.Beginx() // sqlx with pgx driver
if err != nil {
return nil, fmt.Errorf("db.Beginx: %w", err)
}
defer helpdb.Rollback(ctx, r.log, tx)
var grantID pgtype.UUID
err = tx.GetContext(
ctx, &grantID, createGrantSQL, args...) // args don't matter here; the query returns the id
if err != nil {
return nil, fmt.Errorf("tx.GetContext: %w", err)
}
for _, document := range documents {
data, err := io.ReadAll(document.File)
if err != nil {
return nil, fmt.Errorf("io.ReadAll: %w", err)
}
if err = document.File.Close(); err != nil { // file is an io.ReadCloser from multipart/form-data request
return nil, fmt.Errorf("document.File.Close: %w", err)
}
_, err = tx.NamedExec(createGrantDocumentSQL, document) // ERROR: unexpected EOF
if err != nil {
return nil, fmt.Errorf("tx.NamedExec: %w", err)
}
}
if err = tx.Commit(); err != nil {
return nil, fmt.Errorf("tx.Commit: %w", err)
}
return &grantID, nil
}
I'm using dependencies:
go 1.19
github.com/jmoiron/sqlx v1.3.4 // db connects
github.com/jackc/pgx/v4 v4.14.0 // db driver
I have already changed the app-side config: set the idle and open connection counts and lifetimes to different values, disabled idle connections, increased and decreased the lifetime, but nothing helps.
The error doesn't occur locally, only on the dev server.
This problem relates to the limits placed on the Docker container by cgroups. docker stats shows that the memory limit is not exceeded, but in fact it is: that command doesn't see some types of memory that Postgres uses, while cgroups do.
This kind of problem can also relate to network issues, but that was not the case here.
Use the rss+cache memory types in your monitoring to prevent this problem.
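A minimal sketch of the idea, assuming the cgroup v1 memory.stat layout (read from e.g. /sys/fs/cgroup/memory/memory.stat inside the container; cgroup v2 uses a different file format):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMemoryStat extracts the rss and cache counters from the contents
// of a cgroup v1 memory.stat file. Monitoring rss+cache, rather than rss
// alone, accounts for the page-cache memory Postgres relies on, which
// plain `docker stats` under-reports.
func parseMemoryStat(data string) (rss, cache uint64) {
	for _, line := range strings.Split(data, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue
		}
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			continue
		}
		switch fields[0] {
		case "rss":
			rss = v
		case "cache":
			cache = v
		}
	}
	return rss, cache
}

func main() {
	sample := "cache 1048576\nrss 2097152\nmapped_file 0\n"
	rss, cache := parseMemoryStat(sample)
	fmt.Println(rss + cache) // the total the cgroup limit actually counts
}
```

Comparing rss+cache against the cgroup limit in your monitoring makes the "invisible" memory pressure visible before the kernel starts killing connections.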
I use Postgres COPY in my Go backend. The COPY is the only operation inside the transaction. Should I roll the transaction back if it fails?
func (pc *Postgres) Copy(records [][]interface{}) error {
tx, err := pc.db.Begin()
if err != nil {
return errors.Wrap(err, "can't open transaction")
}
stmt, err := tx.Prepare(pq.CopyIn(pc.table, pc.columns...))
if err != nil {
return errors.Wrap(err, "can't prepare stmt")
}
for _, record := range records {
if _, err := stmt.Exec(record...); err != nil {
return errors.Wrap(err, "error exec record")
}
}
if _, err = stmt.Exec(); err != nil {
return errors.Wrap(err, "error exec stmt")
}
if err = stmt.Close(); err != nil {
return errors.Wrap(err, "error close stmt")
}
if err = tx.Commit(); err != nil {
return errors.Wrap(err, "error commit transaction")
}
return nil
}
As far as I understand, if \copy fails, the transaction will be aborted (link) and rolled back.
However, in the official lib/pq examples I see they always use a rollback (though they have more than one operation).
Could somebody please guide me through these nuances?
It looks like part of the confusion is because \COPY is not the same thing as COPY.
\COPY (as referenced in this question) is a command in psql that "performs a frontend (client) copy". In simple terms, you run psql (a terminal-based front-end to PostgreSQL) on a computer and can \COPY data between the database and a file stored locally (or accessible from your machine). This command is part of psql and not something you can use via lib/pq.
COPY is a Postgres SQL command that runs on the server; the file you are copying "must be accessible by the PostgreSQL user (the user ID the server runs as) and the name must be specified from the viewpoint of the server". This is what you are calling in your application (from the pq docs: "CopyIn uses COPY FROM internally"; implementation here and here).
So, as the answer referenced by @mh-cbon states:
the deferred rollback is there to ensure that the transaction is rolled back in case of an early return.
Consider:
tx, err := pc.db.Begin()
if err != nil {
return errors.Wrap(err, "can't open transaction")
}
stmt, err := tx.Prepare(pq.CopyIn(pc.table, pc.columns...))
if err != nil {
return errors.Wrap(err, "can't prepare stmt")
}
If the Prepare fails, you have created a transaction and then return without closing it; this leaves the transaction open, which is not a good thing. Adding a defer tx.Rollback() ensures that does not happen.
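The rollback-on-early-return behaviour can be sketched without a live database, using a fakeTx stand-in for *sql.Tx (illustrative only; in the real function the fix is simply a `defer tx.Rollback()` right after `pc.db.Begin()`):

```go
package main

import (
	"errors"
	"fmt"
)

// errTxDone mimics database/sql's sql.ErrTxDone.
var errTxDone = errors.New("transaction has already been committed or rolled back")

// fakeTx stands in for *sql.Tx so the pattern can run without a database.
type fakeTx struct{ done, rolledBack bool }

func (tx *fakeTx) Commit() error {
	if tx.done {
		return errTxDone
	}
	tx.done = true
	return nil
}

func (tx *fakeTx) Rollback() error {
	if tx.done {
		return errTxDone // a no-op after Commit, just like *sql.Tx
	}
	tx.done, tx.rolledBack = true, true
	return nil
}

// copyRecords shows the pattern: defer Rollback immediately after Begin.
// On any early return the deferred Rollback aborts the transaction;
// after a successful Commit it is a harmless no-op.
func copyRecords(tx *fakeTx, prepareFails bool) error {
	defer tx.Rollback()
	if prepareFails {
		return errors.New("can't prepare stmt") // early return: rollback fires
	}
	return tx.Commit()
}

func main() {
	tx := &fakeTx{}
	fmt.Println(copyRecords(tx, true), tx.rolledBack) // rolled back on failure
	tx2 := &fakeTx{}
	fmt.Println(copyRecords(tx2, false), tx2.rolledBack) // committed, no rollback
}
```

This is why lib/pq examples always defer a rollback even for a single-operation transaction: it costs nothing on the success path and closes the transaction on every failure path.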
I'm trying to roll back a transaction in my unit tests, between scenarios, to keep the database empty and not make my tests dirty. So, I'm trying:
for _, test := range tests {
db := connect()
_ = db.RunInTransaction(func() error {
t.Run(test.name, func(t *testing.T) {
for _, r := range test.objToAdd {
err := db.PutObj(&r)
require.NoError(t, err)
}
objReturned, err := db.GetObjsWithFieldEqualsXPTO()
require.NoError(t, err)
require.Equal(t, test.queryResultSize, len(objReturned))
})
return fmt.Errorf("returning error to clean up the database rolling back the transaction")
})
}
I was expecting the transaction to be rolled back at the end of the scenario, so the next iteration of the loop would have an empty database; but when I run it, the data is never rolled back.
I believe I'm trying to do what the doc suggested: https://pg.uptrace.dev/faq/#how-to-test-mock-database, am I right?
More info: I noticed that my interface implements a layer over RunInTransaction:
func (gs *DB) RunInTransaction(fn func() error) error {
f := func(*pg.Tx) error { return fn() }
return gs.pgDB.RunInTransaction(f)
}
IDK what the problem is yet, but I strongly suspect it's something related to that (because the Tx is encapsulated just inside the RunInTransaction implementation).
go-pg uses connection pooling (in common with most go database packages). This means that when you call a database function (e.g. db.Exec) it will grab a connection from the pool (establishing a new one if needed), run the command and return the connection to the pool.
When running a transaction you need to run BEGIN, whatever updates etc you require, followed by COMMIT/ROLLBACK, on a single connection dedicated to the transaction (any commands sent on other connections are not part of the transaction). This is why Begin() (and effectively RunInTransaction) provide you with a pg.Tx; use this to run commands within the transaction.
example_test.go provides an example covering the usage of RunInTransaction:
incrInTx := func(db *pg.DB) error {
// Transaction is automatically rolled back on error.
return db.RunInTransaction(func(tx *pg.Tx) error {
var counter int
_, err := tx.QueryOne(
pg.Scan(&counter), `SELECT counter FROM tx_test FOR UPDATE`)
if err != nil {
return err
}
counter++
_, err = tx.Exec(`UPDATE tx_test SET counter = ?`, counter)
return err
})
}
You will note that this only uses the pg.DB when calling RunInTransaction; all database operations use the transaction tx (a pg.Tx). tx.QueryOne will be run within the transaction; if you ran db.QueryOne then that would be run outside of the transaction.
So RunInTransaction begins a transaction and passes the relevant Tx in as a parameter to the function you provide. You wrap this with:
func (gs *DB) RunInTransaction(fn func() error) error {
f := func(*pg.Tx) error { return fn() }
return gs.pgDB.RunInTransaction(f)
}
This effectively ignores the pg.Tx, and you then run commands using other connections (e.g. err := db.PutObj(&r)), i.e. outside of the transaction. To fix this you need to use the transaction (e.g. err := tx.PutObj(&r)).
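The difference can be sketched with illustrative stand-in types (Tx and DB here are not the real go-pg types; they just record where each statement runs):

```go
package main

import "fmt"

// Tx stands in for pg.Tx: Exec runs on the transaction's dedicated connection.
type Tx struct{ ops []string }

func (tx *Tx) Exec(q string) { tx.ops = append(tx.ops, q) }

// DB stands in for pg.DB: Exec grabs some pooled connection,
// outside any transaction.
type DB struct{ outside []string }

func (db *DB) Exec(q string) { db.outside = append(db.outside, q) }

// RunInTransaction begins a transaction and passes the Tx to fn
// (begin/commit/rollback elided for brevity).
func (db *DB) RunInTransaction(fn func(tx *Tx) error) error {
	return fn(&Tx{})
}

func main() {
	db := &DB{}

	// Broken wrapper: discards the Tx, so the callback can only reach the pool.
	broken := func(fn func() error) error {
		return db.RunInTransaction(func(*Tx) error { return fn() })
	}
	broken(func() error {
		db.Exec("INSERT ...") // escapes the transaction
		return nil
	})
	fmt.Println("statements outside tx:", len(db.outside))

	// Fixed wrapper: hands the Tx through to the caller.
	fixed := func(fn func(tx *Tx) error) error {
		return db.RunInTransaction(fn)
	}
	fixed(func(tx *Tx) error {
		tx.Exec("INSERT ...") // stays inside the transaction
		return nil
	})
}
```

The fix for the wrapper in the question is the same shape as `fixed`: accept a `func(tx *pg.Tx) error` and pass it straight through, then call `tx.PutObj(&r)` in the tests.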
Today when I tried to send 100M of data to my server (a very simple TCP server, also written in Golang), I found that the TCPConn.Write method returned 104857600 and a nil error, and then I closed the socket. But my server received only very little data. I think this is because Write works in async mode, so although the method returned 104857600, only a little of the data was actually sent to the server. So I want to know whether there is a way to make Write work in sync mode, or how to detect whether all the data has been sent to the server from the socket.
The code is as follows:
server:
const ListenAddress = "192.168.0.128:8888"
func main() {
var l net.Listener
var err error
l, err = net.Listen("tcp", ListenAddress)
if err != nil {
fmt.Println("Error listening:", err)
os.Exit(1)
}
defer l.Close()
fmt.Println("listen on " + ListenAddress)
for {
conn, err := l.Accept()
if err != nil {
fmt.Println("Error accepting: ", err)
os.Exit(1)
}
//logs an incoming message
fmt.Printf("Received message %s -> %s \n", conn.RemoteAddr(), conn.LocalAddr())
// Handle connections in a new goroutine.
go handleRequest(conn)
}
}
func handleRequest(conn net.Conn) {
defer conn.Close()
rcvLen := 0
rcvData := make([]byte,20 * 1024 * 1024) // 20M
for {
l , err := conn.Read(rcvData)
if err != nil {
fmt.Printf("%v", err)
return
}
rcvLen += l
fmt.Printf("recv: %d\r\n", rcvLen)
conn.Write(rcvData[:l])
}
}
Client:
conn, err := net.Dial("tcp", "192.168.0.128:8888")
if err != nil {
fmt.Println(err)
os.Exit(-1)
}
defer conn.Close()
data := make([]byte, 500 * 1024 * 1024)
length, err := conn.Write(data)
fmt.Println("send len: ", length)
The output of the client:
send len: 524288000
The output of the server:
listen on 192.168.0.128:8888
Received message 192.168.0.2:50561 -> 192.168.0.128:8888
recv: 166440
recv: 265720
EOF
I know that if I make the client wait for a while via the SetLinger method, all the data will be sent to the server before the socket is closed. But I want to find a way to make the socket send all the data before Write returns, without calling SetLinger().
Please excuse my poor English.
Did you poll the socket before trying to write?
Behind the socket is your operating system's TCP stack. When writing on a socket, you push bytes into the send buffer. Your operating system then determines on its own when and how to send them. If the receiving end has no buffer space available in its receive buffer, your sending end knows this and will not put any more information into the send buffer.
Make sure your send buffer has enough space for whatever you are trying to send next. This is done by polling the socket; this method is usually called Socket.Poll. I recommend you check the golang docs for the exact usage.
You are not handling the error returned by conn.Read correctly. From the docs (emphasis mine):
When Read encounters an error or end-of-file condition after successfully reading n > 0 bytes, it returns the number of bytes read. It may return the (non-nil) error from the same call or return the error (and n == 0) from a subsequent call. [...]
Callers should always process the n > 0 bytes returned before considering the error err. Doing so correctly handles I/O errors that happen after reading some bytes and also both of the allowed EOF behaviors.
Note that you are re-inventing io.Copy (albeit with an excessive buffer size). Your server code can be rewritten as:
func handleRequest(conn net.Conn) {
	defer conn.Close()
	n, err := io.Copy(conn, conn) // echo everything back until EOF or error
	fmt.Printf("echoed %d bytes, err: %v\n", n, err)
}
Talk is cheap, so here is the simple code:
package main
import (
"fmt"
"time"
"net"
)
func main() {
addr := "127.0.0.1:8999"
// Server
go func() {
tcpaddr, err := net.ResolveTCPAddr("tcp4", addr)
if err != nil {
panic(err)
}
listen, err := net.ListenTCP("tcp", tcpaddr)
if err != nil {
panic(err)
}
for {
if conn, err := listen.Accept(); err != nil {
panic(err)
} else if conn != nil {
go func(conn net.Conn) {
buffer := make([]byte, 1024)
n, err := conn.Read(buffer)
if err != nil {
fmt.Println(err)
} else {
fmt.Println(">", string(buffer[0 : n]))
}
conn.Close()
}(conn)
}
}
}()
time.Sleep(time.Second)
// Client
if conn, err := net.Dial("tcp", addr); err == nil {
for i := 0; i < 2; i++ {
_, err := conn.Write([]byte("hello"))
if err != nil {
fmt.Println(err)
conn.Close()
break
} else {
fmt.Println("ok")
}
// sleep 10 seconds and re-send
time.Sleep(10*time.Second)
}
} else {
panic(err)
}
}
Output:
> hello
ok
ok
The Client writes to the Server twice. After the first read, the Server closes the connection immediately, but the Client sleeps 10 seconds and then writes to the Server again using the same, already closed, connection object (conn).
Why does the second write succeed (i.e. the returned error is nil)?
Can anyone help?
PS:
In order to check whether the buffering behaviour of the system affects the result of the second write, I edited the Client like this, but it still succeeds:
// Client
if conn, err := net.Dial("tcp", addr); err == nil {
_, err := conn.Write([]byte("hello"))
if err != nil {
fmt.Println(err)
conn.Close()
return
} else {
fmt.Println("ok")
}
// sleep 10 seconds and re-send
time.Sleep(10*time.Second)
b := make([]byte, 400000)
for i := range b {
b[i] = 'x'
}
n, err := conn.Write(b)
if err != nil {
fmt.Println(err)
conn.Close()
return
} else {
fmt.Println("ok", n)
}
// sleep 10 seconds and re-send
time.Sleep(10*time.Second)
} else {
panic(err)
}
There are several problems with your approach.
Sort-of a preface
The first one is that you do not wait for the server goroutine
to complete.
In Go, once main() exits for whatever reason,
all the other goroutines still running, if any, are simply
torn down forcibly.
You're trying to "synchronize" things using timers,
but this only works in toy situations, and even then
only from time to time.
Hence let's fix your code first:
package main
import (
"fmt"
"log"
"net"
"time"
)
func main() {
addr := "127.0.0.1:8999"
tcpaddr, err := net.ResolveTCPAddr("tcp4", addr)
if err != nil {
log.Fatal(err)
}
listener, err := net.ListenTCP("tcp", tcpaddr)
if err != nil {
log.Fatal(err)
}
// Server
done := make(chan error)
go func(listener net.Listener, done chan<- error) {
for {
conn, err := listener.Accept()
if err != nil {
done <- err
return
}
go func(conn net.Conn) {
var buffer [1024]byte
n, err := conn.Read(buffer[:])
if err != nil {
log.Println(err)
} else {
log.Println(">", string(buffer[0:n]))
}
if err := conn.Close(); err != nil {
log.Println("error closing server conn:", err)
}
}(conn)
}
}(listener, done)
// Client
conn, err := net.Dial("tcp", addr)
if err != nil {
log.Fatal(err)
}
for i := 0; i < 2; i++ {
_, err := conn.Write([]byte("hello"))
if err != nil {
log.Println(err)
err = conn.Close()
if err != nil {
log.Println("error closing client conn:", err)
}
break
}
fmt.Println("ok")
time.Sleep(2 * time.Second)
}
// Shut the server down and wait for it to report back
err = listener.Close()
if err != nil {
log.Fatal("error closing listener:", err)
}
err = <-done
if err != nil {
log.Println("server returned:", err)
}
}
I've made a couple of minor fixes,
like using log.Fatal (which is
log.Print + os.Exit(1)) instead of panicking,
removed useless else clauses to adhere to the coding standard of keeping the main
flow where it belongs, and lowered the client's timeout.
I have also added checks for the errors that Close on sockets may return.
The interesting part is that we now properly shut the server down by closing the listener and then waiting for the server goroutine to report back (unfortunately Go does not return an error of a custom type from net.Listener.Accept in this case so we can't really check that Accept exited because we've closed the listener).
Anyway, our goroutines are now properly synchronized, and there is
no undefined behaviour, so we can reason about how the code works.
Remaining problems
Some problems still remain.
The more glaring one is your wrong assumption that TCP preserves
message boundaries: that is, if you write "hello" to the client
end of the socket, the server reads back exactly "hello".
This is not true: TCP considers both ends of the connection
as producing and consuming opaque streams of bytes.
This means, when the client writes "hello", the client's
TCP stack is free to deliver "he" and postpone sending "llo",
and the server's stack is free to yield "hell" to the read
call on the socket and only return "o" (and possibly some other
data) in a later read.
So, to make the code "real" you'd need to somehow introduce these
message boundaries into the protocol above TCP.
In this particular case the simplest approach would be either
using "messages" consisting of a fixed-length and agreed-upon
endianness prefix indicating the length of the following
data and then the string data itself.
The server would then use a sequence like
var msg [4100]byte
_, err := io.ReadFull(sock, msg[:4])
if err != nil { ... }
mlen := int(binary.BigEndian.Uint32(msg[:4]))
if mlen > len(msg)-4 {
// handle error: advertised length exceeds our buffer
}
if mlen == 0 {
// empty message; go back to reading the next prefix
}
_, err = io.ReadFull(sock, msg[4:4+mlen])
if err != nil { ... }
s := string(msg[4 : 4+mlen])
Another approach is to agree on that the messages do not contain
newlines and terminate each message with a newline
(ASCII LF, \n, 0x0a).
The server side would then use something like
a usual bufio.Scanner loop to get
full lines from the socket.
The remaining problem with your approach is not dealing with
what Read on a socket returns: note that io.Reader.Read
(which is what sockets implement, among other things) is allowed
to return an error while having read some data from the
underlying stream. In your toy example this might rightfully
be unimportant, but suppose you're writing a wget-like
tool which is able to resume downloading a file: even if
reading from the server returned some data and an error, you
have to deal with that returned chunk first and only then
handle the error.
Back to the problem at hand
The problem presented in the question, I believe, happens simply because in your setup you hit some TCP buffering behaviour due to the tiny length of your messages.
On my box which runs Linux 4.9/amd64 two things reliably "fix"
the problem:
Sending messages of 4000 bytes in length: the second call
to Write "sees" the problem immediately.
Doing more Write calls.
For the former, try something like
msg := make([]byte, 4000)
for i := range msg {
msg[i] = 'x'
}
for {
_, err := conn.Write(msg)
...
and for the latter—something like
for {
_, err := conn.Write([]byte("hello"))
...
fmt.Println("ok")
time.Sleep(time.Second / 2)
}
(it's sensible to lower the pause between sending stuff in
both cases).
It's interesting to note that the former example hits the
write: connection reset by peer (ECONNRESET in POSIX)
error while the second one hits write: broken pipe
(EPIPE in POSIX).
This is because when we're sending chunks worth 4000 bytes,
some of the packets generated for the stream manage to be
"in flight" before the server's side of the connection
propagates the information about its closure to the client,
and those packets hit an already closed socket and get rejected
with the RST TCP flag set.
In the second example an attempt to send another chunk of data
sees that the client side already knows the connection
has been torn down, and it fails the send without "touching
the wire".
TL;DR, the bottom line
Welcome to the wonderful world of networking. ;-)
I'd recommend buying a copy of "TCP/IP Illustrated",
read it and experiment.
TCP (and IP and the other protocols above IP)
sometimes do not work the way people expect them to
when they apply their "common sense".