Recently, I tried to improve our upload and download experience, and found that s3manager.Uploader greatly speeds up uploads of large objects by sending their parts in parallel. The code below works well and makes full use of our bandwidth (far better than PutObject).
uploader := s3manager.NewUploaderWithClient(s.Client, func(u *s3manager.Uploader) {
	u.Concurrency = 128
})
So I tried a similar approach to improve download speed, but whatever concurrency I set, the download does not get any faster. The download speed is always below 15 MB/s.
By the way, the GetObject API works very well with very large objects: it makes full use of our bandwidth and can reach our bandwidth limit of about 1250 MB/s.
So I ran a benchmark, and the image above is what go tool pprof shows.
func (b *WriteAtBuffer) WriteAt(p []byte, pos int64) (n int, err error) {
	pLen := len(p)
	expLen := pos + int64(pLen)
	b.m.Lock()
	defer b.m.Unlock()
	if int64(len(b.buf)) < expLen {
		if int64(cap(b.buf)) < expLen {
			if b.GrowthCoeff < 1 {
				b.GrowthCoeff = 1
			}
			newBuf := make([]byte, expLen, int64(b.GrowthCoeff*float64(expLen)))
			copy(newBuf, b.buf)
			b.buf = newBuf
		}
		b.buf = b.buf[:expLen]
	}
	copy(b.buf[pos:], p)
	return pLen, nil
}
Here's my question: why is s3manager.Downloader so slow? Is contention in the WriteAt implementation slowing it down? What is the recommended way to use s3manager.Downloader?
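One thing worth ruling out is the cost of growing the destination buffer under a lock, which is what the WriteAt method above does whenever the buffer starts out too small. The following is a minimal, SDK-free sketch of the alternative: a preallocated io.WriterAt that never grows, filled by concurrent writers at disjoint offsets. The names fixedBuffer, fetchParts and roundTripOK are hypothetical, and the "download" is simulated by copying from an in-memory slice, so this only illustrates the idea and does not measure S3.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// fixedBuffer is a hypothetical io.WriterAt over a preallocated slice.
// It never grows, so WriteAt never reallocates, and it takes no lock
// because callers are assumed to write disjoint ranges.
type fixedBuffer struct{ buf []byte }

func (b *fixedBuffer) WriteAt(p []byte, pos int64) (int, error) {
	if pos < 0 || pos+int64(len(p)) > int64(len(b.buf)) {
		return 0, io.ErrShortBuffer
	}
	return copy(b.buf[pos:], p), nil
}

// fetchParts simulates a ranged download: each goroutine writes one part of
// src into dst at its own offset, with at most `concurrency` parts in flight.
// partSize and concurrency are assumed to be positive.
func fetchParts(dst io.WriterAt, src []byte, partSize, concurrency int) error {
	var wg sync.WaitGroup
	sem := make(chan struct{}, concurrency)
	errs := make(chan error, (len(src)+partSize-1)/partSize)
	for off := 0; off < len(src); off += partSize {
		end := off + partSize
		if end > len(src) {
			end = len(src)
		}
		sem <- struct{}{} // wait for a free slot
		wg.Add(1)
		go func(off, end int) {
			defer wg.Done()
			defer func() { <-sem }()
			if _, err := dst.WriteAt(src[off:end], int64(off)); err != nil {
				errs <- err
			}
		}(off, end)
	}
	wg.Wait()
	close(errs)
	return <-errs // nil when no part failed
}

// roundTripOK reports whether the reassembled bytes match the source.
func roundTripOK(size, partSize, concurrency int) bool {
	src := make([]byte, size)
	for i := range src {
		src[i] = byte(i % 251)
	}
	dst := &fixedBuffer{buf: make([]byte, size)}
	if err := fetchParts(dst, src, partSize, concurrency); err != nil {
		return false
	}
	return bytes.Equal(dst.buf, src)
}

func main() {
	fmt.Println("reassembled correctly:", roundTripOK(1<<20, 64<<10, 8))
}
```

With the real SDK, the analogous choices would be to pass an *os.File (which implements io.WriterAt) or a buffer created from an already sized slice to the downloader, and to experiment with the part size as well as the concurrency. Whether that accounts for the 15 MB/s seen here is an assumption that would need to be confirmed by profiling.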
I have a database consisting of 85.4k documents with an average size of 4 KB.
I wrote a simple Go program that finds and fetches over 70k documents from the database using mongodb-go-driver:
package main
import (
	"context"
	"log"
	"time"
	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)
func main() {
	localC, err := mongo.Connect(context.TODO(), options.Client().ApplyURI("mongodb://127.0.0.1:27017/?gssapiServiceName=mongodb"))
	if err != nil {
		log.Fatal(err)
	}
	localDb := localC.Database("sampleDB")
	collect := localDb.Collection("sampleCollect")
	// bson.M stands in for the JSON map type, which the original snippet did not define.
	localCursor, err := collect.Find(context.TODO(), bson.M{
		"deleted": false,
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("start")
	start := time.Now()
	result := make([]map[string]interface{}, 0)
	if err := localCursor.All(context.TODO(), &result); err != nil {
		log.Fatal(err)
	}
	log.Println(len(result))
	log.Println("done")
	log.Println(time.Since(start))
}
This finishes in around 20 seconds:
2021/03/21 01:36:43 start
2021/03/21 01:36:56 70922
2021/03/21 01:36:56 done
2021/03/21 01:36:56 20.0242869s
After that, I tried to implement the same thing in Rust using mongodb-rust-driver:
use mongodb::{
bson::{doc, Document},
error::Error,
options::FindOptions,
Client,
};
use std::time::Instant;
use tokio::{self, stream::StreamExt};
#[tokio::main]
async fn main() {
let client = Client::with_uri_str("mongodb://localhost:27017/")
.await
.unwrap();
let db = client.database("sampleDB");
let coll = db.collection("sampleCollect");
let find_options = FindOptions::builder().build();
let cursor = coll
.find(doc! {"deleted": false}, find_options)
.await
.unwrap();
let start = Instant::now();
println!("start");
let results: Vec<Result<Document, Error>> = cursor.collect().await;
let es = start.elapsed();
println!("{}", results.iter().len());
println!("{:?}", es);
}
But it took almost 1 minute to complete the same task with a release build:
$ cargo run --release
Finished release [optimized] target(s) in 0.43s
Running `target\release\rust-mongo.exe`
start
70922
51.1356069s
Is this performance normal for Rust in this case, or did I make a mistake in my Rust code that could be improved?
EDIT
As a comment suggested, here is the example document:
The discrepancy here was due to some known bottlenecks in the Rust driver that have since been addressed in the latest beta release (2.0.0-beta.3); so, upgrading your mongodb dependency to use that version should solve the issue.
Re-running your examples with 10k copies of the provided sample document, I now see the Rust one taking ~3.75s and the Go one ~5.75s on my machine.
I've got code that has two UnsafeMutableBufferPointer<UInt32>s and needs to selectively copy from one buffer to the other as fast as possible. There is some logic that decides whether the element at a particular index/address is copied or not, so unfortunately I can't just memcpy() the whole lot.
What is the quickest way to do a per-element buffer copy in Swift?
For the purpose of simplicity, I've removed the per-element logic in the examples below.
This is slow (56ms):
for i in 0..<size {
    if ... {
        dest[i] = source[i]
    }
}
This is much faster (18ms), but I'd like to go quicker:
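var srcPtr = source.baseAddress!
var dstPtr = dest.baseAddress!
for _ in 0..<size {
    if ... {
        dstPtr.pointee = srcPtr.pointee // copy from source into dest
    }
    // Advance both pointers on every iteration, whether or not the element was copied.
    srcPtr = srcPtr.advanced(by: 1)
    dstPtr = dstPtr.advanced(by: 1)
}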
Is there a quicker way?
I am running a query in Go that selects multiple rows from my PostgreSQL database.
I am using the following imports for my query:
"database/sql"
"github.com/lib/pq"
I have narrowed the slowdown to the loop that scans the results into my struct.
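// Returns about 400 rows
rows, err = db.Query("SELECT * FROM infrastructure")
if err != nil {
	return nil, err
}
defer rows.Close()
var arrOfInfra []model.Infrastructure
for rows.Next() {
	obj, ptrs := model.InfrastructureInit()
	if err := rows.Scan(ptrs...); err != nil {
		return nil, err
	}
	arrOfInfra = append(arrOfInfra, *obj)
}
// rows.Err reports any error that ended the iteration early.
if err := rows.Err(); err != nil {
	return nil, err
}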
The above code takes about 8 seconds to run. The query itself is fast; the rows.Next() loop accounts for the entire 8 seconds.
Any ideas? Am I doing something wrong, or is there a better way?
My configuration for my database
// host, port, dbname, user, password masked for obvious reasons
db, err := sql.Open("postgres", "host=... port=... dbname=... user=... password=... sslmode=require")
if err != nil {
panic(err)
}
// I have tried using the default, or setting to high number (100), but it doesn't seem to help with my situation
db.SetMaxIdleConns(1)
db.SetMaxOpenConns(1)
UPDATE 1:
I placed print statements in the for loop. Below is my updated snippet
for rows.Next() {
obj, ptrs := model.InfrastructureInit()
rows.Scan(ptrs...)
arrOfInfra = append(arrOfInfra, *obj)
fmt.Println("Len: " + fmt.Sprint(len(arrOfInfra)))
fmt.Println(obj)
}
I noticed that in this loop, it will actually pause half-way, and continue after a short break. It looks like this:
Len: 221
Len: 222
Len: 223
Len: 224
<a short pause about 1 second, then prints Len: 225 and continues>
Len: 226
Len: 227
...
..
.
and it will happen again later on at another row count, and again after a few hundred records.
UPDATE 2:
Below is a snippet of my InfrastructureInit() method
func InfrastructureInit() (*Infrastructure, []interface{}) {
irf := new(Infrastructure)
var ptrs []interface{}
ptrs = append(ptrs,
&irf.Base.ID,
&irf.Base.CreatedAt,
&irf.Base.UpdatedAt,
&irf.ListingID,
&irf.AddressID,
&irf.Type,
&irf.Name,
&irf.Description,
&irf.Details,
&irf.TravellingFor,
)
return irf, ptrs
}
I am not sure what is causing this slowness. For now I have placed a quick patch on my server that precaches my infrastructures in a Redis database, saved as a string. It seems to be okay for now, but I now have to maintain both Redis and Postgres.
I am still puzzled by this weird behavior, and I'm not sure exactly how rows.Next() works: does it make a query to the database every time I call rows.Next()?
How about doing it like this?
defer rows.Close()
var arrOfInfra []*Infrastructure
for rows.Next() {
irf := &Infrastructure{}
err = rows.Scan(
&irf.Base.ID,
&irf.Base.CreatedAt,
&irf.Base.UpdatedAt,
&irf.ListingID,
&irf.AddressID,
&irf.Type,
&irf.Name,
&irf.Description,
&irf.Details,
&irf.TravellingFor,
)
if err == nil {
arrOfInfra = append(arrOfInfra, irf)
}
}
Hope this helps.
I went down a similar path myself while consolidating my understanding of how rows.Next() works and what might be impacting performance, so I thought I would share this here for posterity (even though the question was asked a long time ago).
Related to:
I am still puzzled over this weird behavior, but I'm not exactly how
rows.Next() work - does it make a query to the database everytime I
call rows.Next()?
It doesn't make a 'query' but it reads (transfers) data from the db through a driver on each iteration which means it can be impacted by e.g. bad network performance. Especially true if, for example, your db is not local to where you are running your Go code.
One approach to confirm whether network performance is an issue would be to run your go app on the same machine where your db is (if possible).
Assuming the columns scanned in the above code are not extremely large and have no custom conversions, reading ~400 rows should take on the order of 100ms at most (in a local setup).
For example - I had a case where I needed to read about 100k rows with about 300B per row and that was taking ~4s (local setup).
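To make the point about per-row transfer concrete, here is a minimal sketch that uses a made-up in-memory driver instead of Postgres. The driver name "slowmem", the types slowDriver, slowConn, slowStmt and slowRows, and the 10ms per-row delay are all assumptions for illustration; the delay stands in for network latency. The time is spent inside the rows.Next() loop, because each call asks the driver for the next row.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"io"
	"time"
)

// slowDriver is a hypothetical driver that serves 5 rows, each after a delay.
type slowDriver struct{}

func (slowDriver) Open(name string) (driver.Conn, error) { return slowConn{}, nil }

type slowConn struct{}

func (slowConn) Prepare(query string) (driver.Stmt, error) { return slowStmt{}, nil }
func (slowConn) Close() error                              { return nil }
func (slowConn) Begin() (driver.Tx, error)                 { return nil, errors.New("no transactions") }

type slowStmt struct{}

func (slowStmt) Close() error  { return nil }
func (slowStmt) NumInput() int { return 0 }
func (slowStmt) Exec(args []driver.Value) (driver.Result, error) {
	return nil, errors.New("exec not supported")
}
func (slowStmt) Query(args []driver.Value) (driver.Rows, error) {
	return &slowRows{total: 5, delay: 10 * time.Millisecond}, nil
}

type slowRows struct {
	sent, total int
	delay       time.Duration
}

func (r *slowRows) Columns() []string { return []string{"id"} }
func (r *slowRows) Close() error      { return nil }

// Next is what database/sql calls for every rows.Next() in user code.
func (r *slowRows) Next(dest []driver.Value) error {
	if r.sent >= r.total {
		return io.EOF
	}
	time.Sleep(r.delay) // simulated network round trip per row
	dest[0] = int64(r.sent)
	r.sent++
	return nil
}

func init() { sql.Register("slowmem", slowDriver{}) }

// demo returns the row count and the milliseconds spent in the scan loop.
func demo() (n int, loopMs int64, err error) {
	db, err := sql.Open("slowmem", "")
	if err != nil {
		return 0, 0, err
	}
	defer db.Close()
	rows, err := db.Query("SELECT id FROM demo")
	if err != nil {
		return 0, 0, err
	}
	defer rows.Close()
	start := time.Now()
	for rows.Next() {
		var id int64
		if err := rows.Scan(&id); err != nil {
			return n, time.Since(start).Milliseconds(), err
		}
		n++
	}
	return n, time.Since(start).Milliseconds(), rows.Err()
}

func main() {
	n, ms, err := demo()
	fmt.Println("rows:", n, "loop ms:", ms, "err:", err)
}
```

Timing the loop separately from db.Query in the real application, in the same way, would show whether the 8 seconds is spent waiting on the driver or in the Go code around it.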
From what I understood here, "V8 has a generational garbage collector. Moves objects around randomly. Node can't get a pointer to raw string data to write to socket." so I shouldn't store data that comes from a TCP stream in a string, especially if that string becomes bigger than Math.pow(2,16) bytes. (I hope I'm right so far.)
What, then, is the best way to handle all the data that's coming from a TCP socket? So far I've been trying to use _:_:_ as a delimiter because I think it's reasonably unique and won't clash with other content.
A sample of the data that would come would be something_:_:_maybe a large text_:_:_ maybe tons of lines_:_:_more and more data
This is what I tried to do:
net = require('net');
var server = net.createServer(function (socket) {
socket.on('connect',function() {
console.log('someone connected');
buf = new Buffer(Math.pow(2,16)); //new buffer with size 2^16
socket.on('data',function(data) {
if (data.toString().search('_:_:_') === -1) { // If there's no separator in the data that just arrived...
buf.write(data.toString()); // ... write it on the buffer. it's part of another message that will come.
} else { // if there is a separator in the data that arrived
parts = data.toString().split('_:_:_'); // the first part is the end of a previous message, the last part is the start of a message to be completed in the future. Parts between separators are independent messages
if (parts.length == 2) {
msg = buf.toString('utf-8',0,4) + parts[0];
console.log('MSG: '+ msg);
buf = (new Buffer(Math.pow(2,16))).write(parts[1]);
} else {
msg = buf.toString() + parts[0];
for (var i = 1; i <= parts.length -1; i++) {
if (i !== parts.length-1) {
msg = parts[i];
console.log('MSG: '+msg);
} else {
buf.write(parts[i]);
}
}
}
}
});
});
});
server.listen(9999);
Whenever I try to console.log('MSG' + msg), it prints out the whole buffer, so I can't tell whether anything worked.
How can I handle this data the proper way? Would the lazy module work, even if this data is not line oriented? Is there some other module to handle streams that are not line oriented?
It has indeed been said that there's extra work going on because Node has to take that buffer and then push it into V8/cast it to a string. However, doing a toString() on the buffer isn't any better. There's no good solution to this right now, as far as I know, especially if your end goal is to get a string and fool around with it. It's one of the things Ryan mentioned at nodeconf as an area where work needs to be done.
As for delimiter, you can choose whatever you want. A lot of binary protocols choose to include a fixed header, such that you can put things in a normal structure, which a lot of times includes a length. In this way, you slice apart a known header and get information about the rest of the data without having to iterate over the entire buffer. With a scheme like that, one can use a tool like:
node-binary - https://github.com/substack/node-binary
node-ctype - https://github.com/rmustacc/node-ctype
As an aside, buffers can be accessed via array syntax, and they can also be sliced apart with .slice().
Lastly, check here: https://github.com/joyent/node/wiki/modules -- find a module that parses a simple tcp protocol and seems to do it well, and read some code.
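The fixed-header-with-length scheme described above can be sketched as follows. This is a minimal illustration in Go rather than Node, under assumed choices that are not from the answer: a 4-byte big-endian length header, a 1 MiB frame cap (maxFrame), and the helper names writeFrame, readFrame and roundTrip. The reader is fed one byte at a time to mimic TCP delivering data in arbitrary chunks.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"testing/iotest"
)

const maxFrame = 1 << 20 // assumed upper bound on a single message

// writeFrame writes a 4-byte big-endian length header followed by the payload.
func writeFrame(w io.Writer, payload []byte) error {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// readFrame reads exactly one frame. It returns io.EOF when the stream ends
// cleanly between frames, and io.ErrUnexpectedEOF if it ends mid-frame.
func readFrame(r io.Reader) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	n := binary.BigEndian.Uint32(hdr[:])
	if n > maxFrame {
		return nil, fmt.Errorf("frame of %d bytes exceeds limit", n)
	}
	payload := make([]byte, n)
	if _, err := io.ReadFull(r, payload); err != nil {
		return nil, err
	}
	return payload, nil
}

// roundTrip encodes msgs into one byte stream and decodes them again,
// reading one byte at a time to simulate fragmented delivery.
func roundTrip(msgs []string) ([]string, error) {
	var wire bytes.Buffer
	for _, m := range msgs {
		if err := writeFrame(&wire, []byte(m)); err != nil {
			return nil, err
		}
	}
	r := iotest.OneByteReader(&wire)
	var out []string
	for {
		p, err := readFrame(r)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, string(p))
	}
}

func main() {
	msgs, err := roundTrip([]string{"something", "maybe a large text", "tons of lines"})
	fmt.Println(len(msgs), msgs, err)
}
```

Because the header states the payload length up front, the receiver never has to scan the payload for a separator, and the payload may contain any bytes, including whatever string would otherwise have been used as a delimiter.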
You should use the new streams2 API: http://nodejs.org/api/stream.html
Here are some very useful examples: https://github.com/substack/stream-handbook
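Whatever stream API is used, the core of delimiter-based parsing is the same: keep the unconsumed tail between reads and only emit a message once its delimiter has arrived. Below is a minimal sketch of that idea in Go, using bufio.Scanner with a custom split function. The helper names splitOn, messages and messagesFromString and the 1 MiB message cap are assumptions for illustration; the _:_:_ delimiter is the one from the question.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// splitOn returns a bufio.SplitFunc that yields the bytes between occurrences
// of delim. Incomplete data is left in the scanner's buffer until more arrives.
func splitOn(delim []byte) bufio.SplitFunc {
	return func(data []byte, atEOF bool) (advance int, token []byte, err error) {
		if i := bytes.Index(data, delim); i >= 0 {
			return i + len(delim), data[:i], nil
		}
		if atEOF && len(data) > 0 {
			return len(data), data, nil // final message without a trailing delimiter
		}
		return 0, nil, nil // ask the scanner for more data
	}
}

// messages reads delimiter-separated messages from r, each at most maxMsg bytes.
func messages(r io.Reader, delim string, maxMsg int) ([]string, error) {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 4096), maxMsg)
	sc.Split(splitOn([]byte(delim)))
	var out []string
	for sc.Scan() {
		out = append(out, sc.Text())
	}
	return out, sc.Err()
}

// messagesFromString feeds s to messages one byte at a time, so the delimiter
// is always split across reads, as it can be on a real socket.
func messagesFromString(s string) ([]string, error) {
	return messages(iotest.OneByteReader(strings.NewReader(s)), "_:_:_", 1<<20)
}

func main() {
	msgs, err := messagesFromString("something_:_:_maybe a large text_:_:_more and more data")
	for _, m := range msgs {
		fmt.Println("MSG:", m)
	}
	fmt.Println("err:", err)
}
```

Searching the accumulated buffer rather than each incoming chunk is what handles a delimiter that straddles two reads, which a per-chunk search cannot detect.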
https://github.com/lvgithub/stick
I've got a piece of code that opens a data reader and, for each record (which contains a URL), downloads and processes that page.
What's the simplest way to make it multi-threaded so that, let's say, there are 10 slots which can be used to download and process pages simultaneously, and as slots become available the next rows are read, and so on?
I can't use WebClient.DownloadDataAsync.
Here's what I have tried to do, but it hasn't worked (i.e. the "worker" is never run):
using (IDataReader dr = q.ExecuteReader())
{
ThreadPool.SetMaxThreads(10, 10);
int workerThreads = 0;
int completionPortThreads = 0;
while (dr.Read())
{
do
{
ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
if (workerThreads == 0)
{
Thread.Sleep(100);
}
} while (workerThreads == 0);
Database.Log l = new Database.Log();
l.Load(dr);
ThreadPool.QueueUserWorkItem(delegate(object threadContext)
{
Database.Log log = threadContext as Database.Log;
Scraper scraper = new Scraper();
dc.Product p = scraper.GetProduct(log, log.Url, true);
ManualResetEvent done = new ManualResetEvent(false);
done.Set();
}, l);
}
}
You do not normally need to play with the Max threads (I believe it defaults to something like 25 per proc for worker, 1000 for IO). You might consider setting the Min threads to ensure you have a nice number always available.
You don't need to call GetAvailableThreads either. You can just start calling QueueUserWorkItem and let it do all the work. Can you repro your problem by simply calling QueueUserWorkItem?
You could also look into the Task Parallel Library, which has helper methods to make this kind of stuff more manageable and easier.
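Independent of the .NET thread pool, the "10 slots" pattern from the question comes down to a fixed number of workers pulling from a queue, with the reader blocking until a worker is free. Here is a minimal sketch of that pattern in Go. The names processAll and peakConcurrency are hypothetical, and the page download is replaced by a short sleep, so this shows only the scheduling shape and not a scraper.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// processAll runs work on every URL using exactly `slots` worker goroutines.
// The jobs channel is unbuffered, so the producing loop (the "data reader")
// only moves on to the next row once some worker is free to take it.
func processAll(urls []string, slots int, work func(string) string) []string {
	type job struct {
		idx int
		url string
	}
	jobs := make(chan job)
	results := make([]string, len(urls))
	var wg sync.WaitGroup
	for w := 0; w < slots; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				results[j.idx] = work(j.url) // each index is written by one worker only
			}
		}()
	}
	for i, u := range urls {
		jobs <- job{i, u}
	}
	close(jobs)
	wg.Wait()
	return results
}

// peakConcurrency runs n simulated downloads and returns the highest number
// that were in flight at the same time.
func peakConcurrency(n, slots int) int32 {
	urls := make([]string, n)
	for i := range urls {
		urls[i] = fmt.Sprintf("page-%d", i)
	}
	var cur, peak int32
	processAll(urls, slots, func(u string) string {
		c := atomic.AddInt32(&cur, 1)
		for {
			p := atomic.LoadInt32(&peak)
			if c <= p || atomic.CompareAndSwapInt32(&peak, p, c) {
				break
			}
		}
		time.Sleep(time.Millisecond) // stands in for download and processing
		atomic.AddInt32(&cur, -1)
		return "processed " + u
	})
	return peak
}

func main() {
	fmt.Println("peak in flight:", peakConcurrency(40, 10))
}
```

The limit is enforced by the number of workers rather than by capping a shared pool, so other code that uses the same runtime is not affected by it.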