S3 uploading high disk I/O and CPU usage - aws-sdk-go

I ran into high CPU and I/O usage when I tried to upload 100 GB of small files (PNG images) to an S3 bucket via a very simple Go S3 uploader.
Is there any way to limit bandwidth (e.g. via the aws-sdk-go config), or anything else that would make the upload process less intensive, to reduce CPU and I/O usage?
I've tried nice for CPU and ionice for I/O, but it doesn't actually help.

Have you tried S3Manager, https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/? From the docs:
Package s3manager provides utilities to upload and download objects from S3 concurrently. Helpful for when working with large objects.
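
For example, here is a minimal sketch (assuming aws-sdk-go v1; the region, bucket, key, and tuning values are placeholders) that caps the uploader's part size and concurrency so fewer buffers and goroutines are in flight at once, which should ease CPU and I/O pressure:

    package main

    import (
        "log"
        "os"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/s3/s3manager"
    )

    func main() {
        sess := session.Must(session.NewSession(&aws.Config{
            Region: aws.String("us-east-1"), // placeholder region
        }))

        // Lower part size and concurrency to reduce memory, CPU, and I/O load.
        uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
            u.PartSize = 5 * 1024 * 1024 // 5 MB, the minimum allowed part size
            u.Concurrency = 2            // default is 5; fewer parallel parts = less load
        })

        f, err := os.Open("image.png") // placeholder file name
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()

        _, err = uploader.Upload(&s3manager.UploadInput{
            Bucket: aws.String("my-bucket"), // placeholder bucket
            Key:    aws.String("images/image.png"),
            Body:   f,
        })
        if err != nil {
            log.Fatal(err)
        }
    }

Since your files are small PNGs, most uploads will be a single part, so throttling how many files you upload concurrently in your own code will likely matter more than the part size itself.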

Related

manipulation of processor speed without changing the processor

Is it possible to replace the RAM with higher-capacity RAM on a machine with a low-speed processor? Would processing speed increase or decrease?
I want to replace my machine's RAM with higher-capacity RAM so that I can try to improve effective processor speed without changing the processor. Will it work?
Yes. You can use faster RAM, currently up to DDR4-4800. Using higher-capacity RAM would also be useful if running out of RAM is what's slowing you down. Using a high-speed solid-state drive (SSD) would also speed up tasks that require reading from or writing to storage, including booting up the computer and running programs that use a lot of assets on the computer. You can also overclock both your CPU and RAM to make them faster, but this may void your warranty.
Additionally, you can use software. For example, you can use CCleaner to remove useless files which clutter your computer. You can also use it to disable unneeded scheduled tasks that your computer runs. If your computer has a spinning-disc hard drive, then you can try defragmenting it by using the built-in Windows defragger, or you can use the Defraggler program from the same people who make CCleaner. Of course, you can also try deleting programs or files you no longer need or use if it’s limited hard drive capacity that’s slowing you down.
If it’s web browsing that’s slow, you may want to consider using a faster browser, like Google Chrome or Firefox. You can also install browser extensions/add-ons like AdBlock and Ghostery to prevent unneeded things from being loaded on pages, making pages load faster.

Storage to serve assets for Toloka.ai tasks

Is there a way to serve files for tasks (like images) from somewhere other than Yandex Disk?
I think Yandex Disk is OK only to get started, because it's personal and limited to 10 GB.
In addition to Yandex Disk, you can use Yandex.Cloud for uploading images. It looks like these are the only possible options, according to the docs: https://yandex.com/support/toloka-requester/concepts/use-object-storage.html

Cache/Memory size recommended for memcached?

Hi, I want to know the recommended cache/memory size for memcached. I'm running a Drupal site on AWS. Thanks!
This is a very subjective question, as it depends on what you are caching and how big your data is. That you will have to figure out yourself.
Once you have, keep in mind that on average you get 60 to 70% utilization out of memcached.
So if you are planning to cache around 6 GB of data, it's good to allocate about 10 GB to memcached.
You can take a look here to see how memcached works; it explains why it is not possible to fully utilize it.
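
As a rough sizing sketch (the 65% figure is just the midpoint of that 60-70% range, and the user name is a placeholder), you would size the daemon at roughly data size divided by expected utilization:

    # ~6 GB of cached data at ~60-70% effective utilization:
    # 6 GB / 0.65 ≈ 9.2 GB, so round up to ~10 GB (10240 MB).
    memcached -m 10240 -u memcache -d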

Data Upload using REST

I am developing a REST application that can be used as a data upload service for large files. I create chunks of the file and upload each chunk. I would like to have multiple instances running this service (for load balancing). I would also like the REST service to be a stateless system (no information stored about each chunk), which helps me avoid server affinity. If I allowed server affinity, I could dedicate a server to each upload request, store the chunks in a temporary file on disk, and move them somewhere else once the upload is complete.
Ideally I would use a central place for the data to be stored, but I would like to avoid that because it is a single point of failure (bad in a distributed system). So I was thinking about using a distributed file system such as HDFS, but appending to a file there is not very efficient, so that is not an option.
Is it possible to use some kind of cache for storing the data? Since the data is quite big (2-3 GB files), traditional cache solutions like Memcache cannot be used.
Is there any other option to solve this problem? Is there a direction I'm not looking in?
Any help will be greatly appreciated.
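
To illustrate the stateless idea being described (this is only a sketch under my own assumptions, not an answer; the route, header names, and the object-store interface are hypothetical), each chunk request could carry an upload ID and chunk index so that any instance can handle any chunk and write it straight to shared storage, with no per-upload state held in the service:

    package main

    import (
        "fmt"
        "io"
        "log"
        "net/http"
    )

    // ObjectStore is a hypothetical interface over whatever shared storage is
    // used (S3, Swift, Ceph, ...). The service itself keeps no chunk state.
    type ObjectStore interface {
        Put(key string, body io.Reader) error
    }

    func chunkHandler(store ObjectStore) http.HandlerFunc {
        return func(w http.ResponseWriter, r *http.Request) {
            uploadID := r.Header.Get("X-Upload-Id")   // assumed header names
            chunkIdx := r.Header.Get("X-Chunk-Index") // client numbers the chunks
            if uploadID == "" || chunkIdx == "" {
                http.Error(w, "missing upload id or chunk index", http.StatusBadRequest)
                return
            }
            // Key the chunk by upload ID + index; assembly happens later, elsewhere.
            key := fmt.Sprintf("uploads/%s/%s", uploadID, chunkIdx)
            if err := store.Put(key, r.Body); err != nil {
                http.Error(w, err.Error(), http.StatusInternalServerError)
                return
            }
            w.WriteHeader(http.StatusCreated)
        }
    }

    func main() {
        var store ObjectStore // wire up a real implementation here
        http.HandleFunc("/chunks", chunkHandler(store))
        log.Fatal(http.ListenAndServe(":8080", nil))
    }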

Ideal Chunk Size for Writing Streamed Content to Disk on iPhone

I am writing an app that caches streaming content from the web on the iPhone. Right now, I'm saving data to disk as it arrives (in chunk sizes ranging from 1KB to about 60KB), but application response is somewhat sluggish (better than I was expecting, but still pretty bad).
My question is: does anyone have a rule of thumb for how frequent and large writes to the device memory should be to maximize performance?
I realize this seems application-specific, and I intend to do performance tuning for my scenario, but this applies generally to any app on the iPhone downloading a lot of data because there is probably a sweet spot (given sufficient incoming data availability) for write frequency/size.
These are the resources I've already read related to the issue, but none of them addresses the specific question of how much data to accumulate before dumping it to disk:
Best way to download large files from web to iPhone for writing to disk
The Joy in Discovering You are an Idiot
One year later, I finally got around to writing a test harness to test chunking performance of streaming downloads.
Here's the set-up: Use an iPhone 4 to download a large file over a Wi-Fi connection* with an asynchronous NSURLConnection. Periodically flush downloaded data to disk (atomically), whenever the amount of data downloaded exceeds a threshold.
And the results: It doesn't make a difference. The performance difference between using 32 kB and 512 kB chunks (and several sizes in between) is smaller than the variance between runs using the same chunk size. The file download time, as expected, consists almost entirely of time spent waiting on the network.
*Average throughput was approximately 8Mbps.
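
For what it's worth, the flush-on-threshold pattern described above is language-agnostic; here is a rough sketch of the same idea in Go (the URL, file name, and threshold are arbitrary, and this is a blocking download rather than the asynchronous connection used in the test):

    package main

    import (
        "bytes"
        "io"
        "log"
        "net/http"
        "os"
    )

    // downloadWithThreshold reads the response body and appends the buffered
    // data to a file on disk whenever more than `threshold` bytes accumulate.
    func downloadWithThreshold(url, path string, threshold int) error {
        resp, err := http.Get(url)
        if err != nil {
            return err
        }
        defer resp.Body.Close()

        out, err := os.Create(path)
        if err != nil {
            return err
        }
        defer out.Close()

        var buf bytes.Buffer
        chunk := make([]byte, 32*1024)
        for {
            n, rerr := resp.Body.Read(chunk)
            if n > 0 {
                buf.Write(chunk[:n])
                if buf.Len() >= threshold {
                    if _, werr := out.Write(buf.Bytes()); werr != nil {
                        return werr
                    }
                    buf.Reset()
                }
            }
            if rerr == io.EOF {
                break
            }
            if rerr != nil {
                return rerr
            }
        }
        // Flush whatever is left.
        _, werr := out.Write(buf.Bytes())
        return werr
    }

    func main() {
        // 512 kB threshold, the same order of magnitude as the test above.
        if err := downloadWithThreshold("https://example.com/big.bin", "big.bin", 512*1024); err != nil {
            log.Fatal(err)
        }
    }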