More Logical Reads but Query Executing in Less Time - sqlperformance

Situation 1
Table 'lead_transaction'. Scan count 10, logical reads 394, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'appt_master'. Scan count 20, logical reads 4532, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Situation 2
Table 'lead_transaction'. Scan count 36466, logical reads 117088, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'appt_master'. Scan count 36466, logical reads 195492, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
In Situation 1 the query executes in 4 seconds and uses a LEFT JOIN.
In Situation 2 the query executes in 3 seconds and uses OUTER APPLY, but the logical reads are much higher.
So which one is better in terms of performance?

Since logical reads come from the data cache (memory), I would think the sheer number of reads makes little difference; the second query appears to be more efficient by reading a lot of data in small chunks, while the first query reads the data in larger chunks.
I'd be interested to see how the performance compares if the queries had to do physical reads rather than logical reads.
Try clearing the buffers and cached execution plans before running each query and see what the performance is like (a scripted example is sketched below):
• DBCC DROPCLEANBUFFERS clears the buffer pool (run CHECKPOINT first so dirty pages are written to disk).
• DBCC FLUSHPROCINDB clears the cached execution plans for that database.
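For reference, here is a minimal sketch of how such a cold-cache comparison could be scripted with Python and pyodbc. The connection string and the two queries are placeholders, not taken from the original post, and DBCC DROPCLEANBUFFERS requires sysadmin rights, so run this only on a test server.

import time
import pyodbc

# Placeholder connection string and queries -- substitute your own.
CONN_STR = ("DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=localhost;DATABASE=test;Trusted_Connection=yes")
QUERIES = {
    "left_join_version": "SELECT 1  -- replace with the LEFT JOIN query",
    "outer_apply_version": "SELECT 1  -- replace with the OUTER APPLY query",
}

def cold_cache_seconds(cursor, sql):
    # Flush dirty pages, then drop clean buffers and cached plans so the
    # next run has to do physical reads and recompile.
    cursor.execute("CHECKPOINT;")
    cursor.execute("DBCC DROPCLEANBUFFERS;")
    cursor.execute("DBCC FREEPROCCACHE;")  # server-wide; FLUSHPROCINDB limits this to one database
    start = time.perf_counter()
    cursor.execute(sql)
    cursor.fetchall()
    return time.perf_counter() - start

conn = pyodbc.connect(CONN_STR, autocommit=True)
cursor = conn.cursor()
for name, sql in QUERIES.items():
    print(f"{name}: {cold_cache_seconds(cursor, sql):.3f}s")
conn.close()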

Related

PostgreSQL autovacuum causing significant performance degradation

Our Postgres DB (hosted on Google Cloud SQL with 1 CPU, 3.7 GB of RAM, see below) consists mostly of one big ~90GB table with about ~60 million rows. The usage pattern consists almost exclusively of appends and a few indexed reads near the end of the table. From time to time a few users get deleted, deleting a small percentage of rows scattered across the table.
This all works fine, but every few months an autovacuum gets triggered on that table, which significantly impacts our service's performance for ~8 hours:
• Storage usage increases by ~1 GB for the duration of the autovacuum (several hours), then slowly returns to the previous value (it might eventually drop below it, due to the autovacuum freeing pages)
• Database CPU utilization jumps from <10% to ~20%
• Disk read/write ops increase from near zero to ~50/second
• Database memory increases slightly, but stays below 2 GB
• Transactions/sec and ingress/egress bytes are also fairly unaffected, as would be expected
This increases our service's 95th-percentile latency from ~100 ms to ~0.5-1 s during the autovacuum, which in turn triggers our monitoring alerts. The service serves around ten requests per second, with each request consisting of a few simple DB reads/writes that normally have a latency of 2-3 ms each.
Here are some monitoring screenshots illustrating the issue:
The DB configuration is fairly vanilla:
The log entry documenting this autovacuum process reads as follows:
automatic vacuum of table "XXX": index scans: 1
    pages: 0 removed, 6482261 remain, 0 skipped due to pins, 0 skipped frozen
    tuples: 5959839 removed, 57732135 remain, 4574 are dead but not yet removable
    buffer usage: 8480213 hits, 12117505 misses, 10930449 dirtied
    avg read rate: 2.491 MB/s, avg write rate: 2.247 MB/s
    system usage: CPU 470.10s/358.74u sec elapsed 38004.58 sec
Any suggestions what we could tune to reduce the impact of future autovacuums on our service? Or are we doing something wrong?
If you increase autovacuum_vacuum_cost_delay, autovacuum will run slower and be less invasive.
However, the better solution is usually the opposite: make autovacuum faster by setting autovacuum_vacuum_cost_limit to 2000 or so, so that it finishes sooner.
You could also schedule manual VACUUMs of the table yourself at times when they hurt least.
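As a rough sketch (the table name big_table and the connection parameters are placeholders, not from the question), the per-table setting and a scheduled manual VACUUM could be issued from Python with psycopg2 like this:

import psycopg2

# Hypothetical connection parameters and table name -- adjust to your setup.
conn = psycopg2.connect("dbname=mydb user=postgres")
conn.autocommit = True  # VACUUM cannot run inside a transaction block
cur = conn.cursor()

# Per-table autovacuum tuning: allow more work per cost cycle so the
# autovacuum finishes sooner instead of dragging on for hours.
cur.execute("ALTER TABLE big_table SET (autovacuum_vacuum_cost_limit = 2000)")

# Alternatively, run a manual VACUUM from a scheduled job at a quiet time;
# autovacuum will then rarely need to touch the table itself.
cur.execute("VACUUM (VERBOSE) big_table")

cur.close()
conn.close()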
But frankly, if a single innocuous autovacuum is enough to disturb your operation, you need more I/O bandwidth.

Unstable insert rate in MongoDB

I have a process that can generate 20,000 records per second (record size ~30 KB). I am trying to insert them as fast as possible into a single instance of MongoDB, but I am getting only ~1,500 inserts per second, with an unstable rate that ranges from 1,000 to 2,000 inserts per second. The question is: what is the reason, and how do I fix it? :) Here is data from mongostat for 2.5 hours:
Set up
I am running an instance in the cloud with 8 cores, 16 GB RAM, a 150 GB HDD, Ubuntu 18.04, and the official MongoDB 4.0 Docker image. On the same instance, two workers each generate 10,000 records per second and insert_many them into MongoDB in chunks of 100 records. Each record is split between two collections, cases and docs; docs uses zlib compression. A cases record is ~1 KB on average. A random record as an example:
{'info': {'judge': 'Орлова Олеся Викторовна', 'decision': 'Отменено с возвращением на новое рассмотрение', 'entry_date': datetime.datetime(2017, 1, 1, 0, 0), 'number': '12-48/2017 (12-413/2016;)', 'decision_date': datetime.datetime(2017, 2, 9, 0, 0)}, 'acts': [{'doc': ObjectId('5c3c76543d495a000c97243b'), 'type': 'Решение'}], '_id': ObjectId('5c3c76543d495a000c97243a'), 'sides': [{'name': 'Кузнецов П. В.', 'articles': 'КоАП: ст. 5.27.1 ч.4'}], 'history': [{'timestamp': datetime.datetime(2017, 1, 1, 15, 6), 'type': 'Материалы переданы в производство судье'}, {'timestamp': datetime.datetime(2017, 2, 9, 16, 0), 'type': 'Судебное заседание', 'decision': 'Отменено с возвращением на новое рассмотрение'}, {'timestamp': datetime.datetime(2017, 2, 17, 15, 6), 'type': 'Дело сдано в отдел судебного делопроизводства'}, {'timestamp': datetime.datetime(2017, 2, 17, 15, 7), 'type': 'Вручение копии решения (определения) в соотв. с чч. 2, 2.1, 2.2 ст. 30.8 КоАП РФ'}, {'timestamp': datetime.datetime(2017, 3, 13, 16, 6), 'type': 'Вступило в законную силу'}, {'timestamp': datetime.datetime(2017, 3, 14, 16, 6), 'type': 'Дело оформлено'}, {'timestamp': datetime.datetime(2017, 3, 29, 14, 33), 'type': 'Дело передано в архив'}], 'source': {'date': datetime.datetime(2017, 1, 1, 0, 0), 'engine': 'v1', 'instance': 'appeal', 'host': 'bratsky.irk.sudrf.ru', 'process': 'adm_nar', 'crawled': datetime.datetime(2018, 12, 22, 8, 15, 7), 'url': 'https://bratsky--irk.sudrf.ru/modules.php?name=sud_delo&srv_num=1&name_op=case&case_id=53033119&case_uid=A84C1A34-846D-4912-8242-C7657985873B&delo_id=1502001'}, 'id': '53033119_A84C1A34-846D-4912-8242-C7657985873B_1_'}
A docs record is ~30 KB on average:
{'_id': ObjectId('5c3c76543d495a000c97243b'), 'data': 'PEhUTUw+PEhFQUQ+DQo8TUVUQSBodHRwLWVxdWl2PUNvbnRlbnQtVHlwZSBjb250ZW50PSJ0ZXh0L2h0bWw7IGNoYXJzZXQ9V2luZG93cy0xMjUxIj4NCjxTVFlMRSB0eXBlPXRleHQvY3NzPjwvU1RZTEU+DQo8L0hFQUQ+DQo8Qk9EWT48U1BBTiBzdHlsZT0iVEVYVC1BTElHTjoganVzdGlmeSI+DQo8UCBzdHlsZT0iVEVYVC1JTkRFTlQ6IDAuNWluOyBURVhULUFMSUdOOiBjZW50ZXIiPtCgINCVINCoINCVINCdINCYINCVPC9QPg0KPFAgc3R5bGU9IlRFWFQtSU5ERU5UOiAwLjVpbjsgVEVYVC1BTElHTjoganVzdGlmeSI+0LMuINCR0YDQsNGC0YHQuiAwOSDRhNC10LLRgNCw0LvRjyAyMDE3INCz0L7QtNCwPC9QPg0KPFAgc3R5bGU9IlRFWFQtSU5ERU5UOiAwLjVpbjsgVEVYVC1BTElHTjoganVzdGlmeSI+0KHRg9C00YzRjyDQkdGA0LDRgtGB0LrQvtCz0L4g0LPQvtGA0L7QtNGB0LrQvtCz0L4g0YHRg9C00LAg0JjRgNC60YPRgtGB0LrQvtC5INC+0LHQu9Cw0YHRgtC4INCe0YDQu9C+0LLQsCDQni7Qki4sINGA0LDRgdGB0LzQvtGC0YDQtdCyINCw0LTQvNC40L3QuNGB0YLRgNCw0YLQuNCy0L3QvtC1INC00LXQu9C+IOKEliAxMi00OC8yMDE3INC/0L4g0LbQsNC70L7QsdC1INC40L3QtNC40LLQuNC00YPQsNC70YzQvdC+0LPQviDQv9GA0LXQtNC/0YDQuNC90LjQvNCw0YLQtdC70Y8g0JrRg9C30L3QtdGG0L7QstCwIDxTUE.....TlQ6IDAuNWluOyBURVhULUFMSUdOOiBqdXN0aWZ5Ij7QoNC10YjQtdC90LjQtSDQvNC+0LbQtdGCINCx0YvRgtGMINC+0LHQttCw0LvQvtCy0LDQvdC+INCyINCY0YDQutGD0YLRgdC60LjQuSDQvtCx0LvQsNGB0YLQvdC+0Lkg0YHRg9C0INCyINGC0LXRh9C10L3QuNC1IDEwINGB0YPRgtC+0Log0YEg0LzQvtC80LXQvdGC0LAg0L/QvtC70YPRh9C10L3QuNGPINC10LPQviDQutC+0L/QuNC4LjwvUD4NCjxQIHN0eWxlPSJURVhULUlOREVOVDogMC41aW47IFRFWFQtQUxJR046IGp1c3RpZnkiPtCh0YPQtNGM0Y8g0J4u0JIuINCe0YDQu9C+0LLQsDwvUD48L1NQQU4+PC9CT0RZPjwvSFRNTD4=', 'extension': '.html'}
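A minimal pymongo sketch of the ingestion path described above, under assumptions of my own: the connection URI, the database name, and the act_body field holding the ~30 KB payload are placeholders, while the zlib block compressor on docs mirrors what is stated in the question.

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder URI
db = client["court"]                               # placeholder database name

# The docs collection stores the large HTML blobs, so create it with the
# zlib block compressor; cases keeps the default (snappy) compression.
if "docs" not in db.list_collection_names():
    db.create_collection(
        "docs",
        storageEngine={"wiredTiger": {"configString": "block_compressor=zlib"}},
    )

def insert_chunk(records):
    # Insert one 100-record chunk, splitting each record into cases/docs.
    case_rows, doc_rows = [], []
    for rec in records:
        doc_rows.append(rec.pop("act_body"))  # hypothetical field with the ~30 KB payload
        case_rows.append(rec)
    # ordered=False lets the server continue past individual errors.
    db.docs.insert_many(doc_rows, ordered=False)
    db.cases.insert_many(case_rows, ordered=False)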
Analysis
To figure out what is going on I use docker stats and mongostat. Key metrics are highlighted:
I collected metrics for 2.5 hours during data insertion and plotted CPU %, insert, and dirty from the pictures above:
One can see that the insert rate drops when dirty peaks at 20% and goes back up to ~2,000 when dirty falls below 20%:
Dirty goes down when the CPU is active. When CPU is around 300%, dirty starts to drop (the plots are a bit out of sync since docker stats and mongostat run separately); when CPU is around 200%, dirty grows back to 20% and inserts slow down:
Question
Is my analysis correct? It is my first time using MongoDB, so I may be wrong.
If the analysis is correct, why does MongoDB not always use 300%+ CPU (the instance has 8 cores) to keep dirty low and the insert rate high? Is it possible to force it to do so, and is that the right way to solve my issue?
Update
Maybe HDD I/O is the issue?
I did not log I/O utilisation, but I remember looking at cloud.mongodb.com/freemonitoring during the insertion process; there is a plot called "Disk Utilisation", and it peaked at about 50%.
Currently my problem is the instability of the insert rate. I am OK with the current maximum of ~2,000 inserts per second, which suggests the current HDD can handle that, right? I do not understand why the insert rate periodically drops to 1,000.
On sharding
Currently I am trying to reach maximum performance on a single machine.
Solution
Just change the HDD to an SSD.
Before:
After:
With the same ~1,500 inserts per second, dirty is stable at ~5%. Inserts and CPU usage are now stable. This is the behaviour I expected to see. The SSD solves the problem in the title of this question, "Unstable insert rate in MongoDB".
Using a better disk will definitely improve performance. There are other metrics you can monitor.
The percentage of dirty bytes indicates data that has been modified in the WiredTiger cache but not yet persisted to disk. Monitor your disk IOPS to see whether it has reached your provisioned limit; use the iostat command, or get the numbers from MongoDB FTDC data.
When your CPU spikes, check whether the CPU time is being spent in iowait. If the iowait % is high, you have I/O blocking, i.e. a faster disk or more IOPS will help.
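If iostat is not handy, a rough equivalent can be scripted with psutil (assumed to be installed on the MongoDB host); this is only a sketch for spotting iowait spikes alongside mongostat, not a full monitoring solution:

import psutil  # assumed available on the MongoDB host (Linux)

INTERVAL = 5  # seconds between samples
prev = psutil.disk_io_counters()
while True:
    # Percentage of CPU time spent waiting on I/O (Linux-only field).
    cpu = psutil.cpu_times_percent(interval=INTERVAL)
    cur = psutil.disk_io_counters()
    iops = (cur.read_count - prev.read_count +
            cur.write_count - prev.write_count) / INTERVAL
    print(f"iowait={cpu.iowait:5.1f}%  ~{iops:8.0f} IOPS")
    prev = cur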
Monitor qrw (queued read and write requests) and arw (active read and write requests) in the mongostat output. If these numbers remain low, as in your sample output, especially qrw, mongo is able to serve your requests without queuing them up.
Avoid resource contention by moving the ingestion workers to other instances.
You can further optimize by using separate disk partitions for the mongo data path and the journal location.
Client (ingestion worker) performance is often overlooked. The CPU spike might come from your workers, which would lower throughput. Monitor client performance with the top command or an equivalent.
Hope the above helps.

uniformly partition a rdd in spark

I have a text file in HDFS with about 10 million records. I am trying to read the file and do some transformations on that data, and I want to partition the data uniformly before processing it. Here is the sample code:
var myRDD = sc.textFile("input file location")
myRDD = myRDD.repartition(10000)
When I do my transformations on this repartitioned data, I see that one partition has an abnormally large number of records while the others have very little data (image of the distribution).
So the load is high on only one executor.
I also tried the following and got the same result:
myRDD.coalesce(10000, shuffle = true)
Is there a way to uniformly distribute records among partitions?
Attached is the shuffle read size / number of records on that particular executor; the circled one has a lot more records to process than the others.
Any help is appreciated, thank you.
To deal with the skew, you can repartition your data using DISTRIBUTE BY (or using repartition, as you did). For the expression to partition by, choose something that you know will distribute the data evenly.
You can even use the primary key of the DataFrame (RDD).
Even this approach will not guarantee that the data is distributed evenly between partitions; it all depends on the hash of the expression by which we distribute.
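For illustration, a small PySpark sketch of repartitioning by a column expression rather than by count alone; the column name id is just a stand-in for whatever key distributes well in your data:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("repartition-by-expression").getOrCreate()

# Toy DataFrame standing in for the 10M-record file; 'id' is a placeholder key.
df = spark.range(0, 1_000_000)

# Repartition by both a target partition count and a column expression;
# rows are hashed on 'id', so equal hash values land in the same partition.
evened = df.repartition(200, F.col("id"))

# Inspect how many rows each partition received.
sizes = evened.rdd.glom().map(len).collect()
print(min(sizes), max(sizes))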
Spark : how can evenly distribute my records in all partition
Salting can be used, which involves adding a new "fake" key and using it alongside the current key for better distribution of the data.
(here is a link for salting)
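A minimal PySpark sketch of salting, under my own assumptions: a skewed column named key and 16 salt buckets, both of which are placeholders:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("salting-demo").getOrCreate()
num_salts = 16

# Skewed toy data: almost every row shares the same key value (0).
df = spark.range(0, 100_000).withColumn(
    "key", F.when(F.col("id") % 100 == 0, F.col("id")).otherwise(F.lit(0))
)

# Add a random salt and repartition on (key, salt) so the hot key is
# spread over num_salts partitions instead of landing in a single one.
salted = df.withColumn("salt", (F.rand() * num_salts).cast("int"))
evened = salted.repartition(num_salts, "key", "salt")

print(evened.rdd.glom().map(len).collect())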
For small data I have found that I need to enforce uniform partitioning myself. In pyspark the difference is easily reproducible. In this simple example I am just trying to parallelize a list of 100 elements into 10 even partitions. I would expect each partition to hold 10 elements. Instead, I get an uneven distribution with partition sizes anywhere from 4 to 22:
my_list = list(range(100))
rdd = spark.sparkContext.parallelize(my_list).repartition(10)
rdd.glom().map(len).collect()
# Outputs: [10, 4, 14, 6, 22, 6, 8, 10, 4, 16]
Here is the workaround I use, which is to index the data myself and then mod the index to find which partition to place the row in:
my_list = list(range(100))
number_of_partitions = 10
rdd = (
    spark.sparkContext
    .parallelize(zip(range(len(my_list)), my_list))
    .partitionBy(number_of_partitions, lambda idx: idx % number_of_partitions)
)
rdd.glom().map(len).collect()
# Outputs: [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]

Sublinear behavior (MongoDB cluster)

I have the following setup:
Import a CSV file (20 GB) with 90 million rows -> the data takes 9 GB in MongoDB -> index on the "2d" column -> additional integer-column index for sharding -> distribute the data across 1, 2, 4, 6, 8, 16 shards.
Each shard machine in cluster has 20GB disk space and 2GB RAM.
I generated a random query and benchmarked the execution time for each cluster configuration (see attachment).
Now my question:
Using 1, 2, 4, 6, and 8 shards I see a more or less linear decrease in runtime, as expected. With 8 shards I would assume that the data on each shard fits into memory, so I expected no further improvement from 8 shards to 16 shards.
But from my benchmarks I observe a very strong, sublinear decrease in runtime.
Do you have an idea how this behavior might be explained? Any suggestions or references to the manual are much appreciated!
Thanks in advance,
Lydia

mongodb sharding - chunks are not having the same size

I am new to playing with MongoDB.
Because I have to store roughly 50 million documents, I set up a MongoDB sharded cluster with two replica sets.
The document looks like this:
{
    "_id" : "predefined_unique_id",
    "appNr" : "abcde",
    "modifiedDate" : ISODate("2016-09-16T13:00:57.000Z"),
    "size" : NumberLong(803),
    "crc32" : NumberLong(538462645)
}
The shard key is appNr (it was selected because, for query-performance reasons, all documents having the same appNr have to stay within one chunk).
Usually multiple documents share the same appNr.
After loading about two million records, I see the chunks are evenly balanced; however, when running db.my_collection.getShardDistribution(), I get:
Shard rs0 at rs0/...
data : 733.97MiB docs : 5618348 chunks : 22
estimated data per chunk : 33.36MiB
estimated docs per chunk : 255379
Shard rs1 at rs1/...
data : 210.09MiB docs : 1734181 chunks : 19
estimated data per chunk : 11.05MiB
estimated docs per chunk : 91272
Totals
data : 944.07MiB docs : 7352529 chunks : 41
Shard rs0 contains 77.74% data, 76.41% docs in cluster, avg obj size on shard : 136B
Shard rs1 contains 22.25% data, 23.58% docs in cluster, avg obj size on shard : 127B
My question is: what settings should I change in order to get the data distributed equally between the shards? I would also like to understand how the data gets split into chunks. I have defined a ranged shard key and a chunk size of 264 MB.
MongoDB uses the shard key associated with the collection to partition the data into chunks. A chunk consists of a subset of the sharded data. Each chunk has an inclusive lower and an exclusive upper range based on the shard key.
(Diagram of the shard key value space segmented into smaller ranges, or chunks.)
The mongos routes writes to the appropriate chunk based on the shard key value. MongoDB splits chunks when they grow beyond the configured chunk size. Both inserts and updates can trigger a chunk split.
The smallest range a chunk can represent is a single unique shard key
value. A chunk that only contains documents with a single shard key
value cannot be split.
Chunk size has a major impact on the shards.
The default chunk size in MongoDB is 64 megabytes. We can increase or reduce the chunk size, but it should only be modified after considering the items below:
• Small chunks lead to a more even distribution of data at the expense of more frequent migrations. This creates expense at the query routing (mongos) layer.
• Large chunks lead to fewer migrations. This is more efficient both from the networking perspective and in terms of internal overhead at the query routing layer. But these efficiencies come at the expense of a potentially uneven distribution of data.
• Chunk size affects the maximum number of documents per chunk to migrate.
• Chunk size affects the maximum collection size when sharding an existing collection. Post-sharding, chunk size does not constrain collection size.
Given this information and your shard key "appNr", this would have happened because of the chunk size.
Try lowering the chunk size from the 264 MB you currently have and see whether the document distribution changes. This is a trial-and-error approach, though, and it can take a considerable amount of time and several iterations.
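For reference, a sketch of how the cluster-wide chunk size can be lowered from pymongo when connected to a mongos; the URI is a placeholder and the value 64 is just MongoDB's default, not a recommendation tuned to this cluster:

from pymongo import MongoClient

# Connect to a mongos router, not directly to a shard (placeholder URI).
client = MongoClient("mongodb://mongos-host:27017")

# The chunk size lives in the config database; changing it affects the
# whole cluster and only applies to future splits, not to existing chunks.
client.config.settings.update_one(
    {"_id": "chunksize"},
    {"$set": {"value": 64}},  # size in MB
    upsert=True,
)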
Reference : https://docs.mongodb.com/v3.2/core/sharding-data-partitioning/
Hope it helps!
I'll post my findings here - maybe they will have some further use.
The MongoDB documentation says that a chunk gets split "when it grows beyond the specified chunk size".
I think the documentation is not fully accurate, or rather incomplete.
When mongo does auto-splitting, the splitVector command asks the primary shard for splitting points and then splits accordingly. This first happens when roughly 20% of the specified chunk size is reached and, if no splitting points are found, it retries at 40%, 60% and so on, so splitting should not wait for the maximum size.
In my case, this worked as expected for the first half of the chunks, but for the second half the split happened only after the maximum chunk size was exceeded. I still have to investigate why the split did not happen earlier, as I see no reason for this behaviour.
After the splitting into chunks, the balancer starts. It divides the chunks equally across shards without considering chunk size (in this regard, a chunk with 0 documents counts the same as a chunk with 100 documents). The chunks are moved in the order of their creation.
My problem was that the second half of the chunks was almost twice the size of the first half, so as the balancer always moved the first half of the chunk collection to the other shard, the cluster became unbalanced.
I found a much better explanation here.
In order to fix it, I changed the shard key to "hashed".
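A sketch of what sharding on a hashed key can look like through pymongo; the database and collection names and the URI are placeholders, and since the shard key of an existing collection cannot simply be changed in place on that MongoDB version, this implies sharding a fresh collection and reloading the data:

from pymongo import MongoClient

client = MongoClient("mongodb://mongos-host:27017")  # connect through mongos (placeholder URI)

# Enable sharding on the database and shard the (new) collection on a
# hashed appNr; hashing spreads documents evenly across chunks, while
# documents with the same appNr still hash to the same chunk range.
client.admin.command("enableSharding", "mydb")
client["mydb"]["my_collection"].create_index([("appNr", "hashed")])
client.admin.command(
    "shardCollection",
    "mydb.my_collection",
    key={"appNr": "hashed"},
)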