I have a collection that serves concurrent reads while another part of the application updates it. Under load, every read and update operation takes a long time, and the collection gets slower over time.
Here are the log entries for some of the queries:
nscanned:4 nupdated:2 keyUpdates:3 numYields: 1 locks(micros) w:2475463 10247ms
nscanned:4 nupdated:2 keyUpdates:2 numYields: 1 locks(micros) w:2077481 1054ms
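To make the two log entries above easier to compare, here is a minimal sketch that pulls out the documents scanned, the write-lock time, and the total time. The function name parseOpLine is made up for illustration, and the regular expressions only assume the field layout visible in these two lines.

```javascript
// Parse the summary fields of a mongod log line in the format shown above.
// parseOpLine is a hypothetical helper, not part of any MongoDB tool.
function parseOpLine(line) {
  const lockMatch = line.match(/\bw:(\d+)/);          // write-lock time, microseconds
  const totalMatch = line.match(/(\d+)ms\s*$/);        // total operation time, milliseconds
  const scannedMatch = line.match(/nscanned:(\d+)/);   // index entries/documents scanned
  if (!lockMatch || !totalMatch) return null;
  return {
    nscanned: scannedMatch ? Number(scannedMatch[1]) : null,
    writeLockMs: Number(lockMatch[1]) / 1000,
    totalMs: Number(totalMatch[1]),
  };
}

const lines = [
  "nscanned:4 nupdated:2 keyUpdates:3 numYields: 1 locks(micros) w:2475463 10247ms",
  "nscanned:4 nupdated:2 keyUpdates:2 numYields: 1 locks(micros) w:2077481 1054ms",
];
const parsed = lines.map(parseOpLine);
for (const op of parsed) {
  console.log(
    `scanned ${op.nscanned}, write lock ${Math.round(op.writeLockMs)} ms, total ${op.totalMs} ms`
  );
}
```

Both operations scan only 4 entries yet spend about 2 seconds in the write lock, which suggests the time goes to lock contention rather than to finding the documents.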
The collection has only 70K records.
There are about 10 concurrent reads and writes.
This is what I have already done:
Sharding with a 3-member replica set.
The shard key is hashed, and sharding is enabled at both the database and collection level.
Each replica box has enough CPU power and RAM.
Queries are bounded by an index, and db.collection.find().explain() gives this output:
{
"cursor" : "BtreeCursor fp.ulc_1_c_1_p_1",
"isMultiKey" : true,
"n" : 0,
"nscannedObjects" : 2,
"nscanned" : 2,
"nscannedObjectsAllPlans" : 2,
"nscannedAllPlans" : 2,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"fp.ulc" : [
[
"0ca01c47c984b5583d455e42aafded2c",
"0ca01c47c984b5583d455e42aafded2c"
]
],
"c" : [
[
false,
false
]
],
"p" : [
[
1372062247612,
1.7976931348623157e+308
]
]
}
}
I have also tried setting the read preference to secondary, but after a while that gets slow as well.
I have also noticed locking in mongostat. Here is the mongostat output:
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time
*0 *0 6 *0 4 2|0 0 54.4g 109g 1.74g 0 collectDb:199.7% 0 6|0 0|1 3k 130k 21 set1 PRI 08:27:55
*0 *0 15 *0 11 8|0 1 54.4g 109g 1.74g 0 collectDb:200.1% 0 6|0 0|1 11k 357k 21 set1 PRI 08:27:58
7 *0 34 *0 18 26|0 0 54.4g 109g 1.75g 0 collectDb:202.9% 0 6|0 0|1 36k 362k 21 set1 PRI 08:28:00
1 *0 13 *0 8 7|0 0 54.4g 109g 1.75g 0 collectDb:192.3% 0 6|0 0|1 12k 287k 21 set1 PRI 08:28:03
1 *0 9 *0 7 8|0 0 54.4g 109g 1.75g 0 collectDb:196.1% 0 6|0 0|1 5k 258k 21 set1 PRI 08:28:04
5 *0 20 *0 10 13|0 0 54.4g 109g 1.75g 0 collectDb:207.7% 0 6|0 0|1 23k 214k 21 set1 PRI 08:28:08
8 *0 38 *0 21 29|0 0 54.4g 109g 1.74g 0 collectDb:215.9% 0 5|0 0|1 40k 548k 21 set1 PRI 08:28:12
6 *0 44 *0 24 22|0 0 54.4g 109g 1.75g 0 collectDb:199.5% 0 3|0 0|1 45k 509k 21 set1 PRI 08:28:15
2 4 27 *0 11 28|0 0 54.4g 109g 1.75g 0 collectDb:169.2% 0 6|0 0|1 21k 318k 21 set1 PRI 08:28:18
2 *0 29 *0 18 20|0 0 54.4g 109g 1.74g 0 collectDb:255.5% 0 5|0 0|1 28k 588k 21 set1 PRI 08:28:24
So I finally figured out some ways to reduce locking in MongoDB.
What I did:
Updated MongoDB to the latest stable production release, 2.4.8.
Upgraded my EBS volumes to 2000 IOPS in a RAID 10 configuration.
Monitored slow queries in the mongod.log file, and also iowait for each drive.
Added some multikey and compound indexes, following the MongoDB index docs.
Watched RAM consumption on each EC2 instance, including the primary and secondary members of the replica set.
Changed the instance type to EBS-optimized with a Gigabit Ethernet interface and more than 16 GB of RAM on each server, so that the indexes and the current data set fit in RAM most of the time.
The Amazon documentation on instance types and their best use cases is worth reading so that you can understand your requirements better.
Locking is a major issue in MongoDB, but I think they are working on collection-level locking, so an upcoming version may solve most of the performance degradation caused by locking.
There is a JIRA ticket where you can check the status.
I am trying to replace, per row, the values enclosed between two non-zero values, provided the distance between those two values is within an acceptable tolerance. For example:
A=[0 0 0 0 1 *0 *0 *0 *0 *0 -1 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 -1 *0 *0 *0 1 0 0 0;
0 0 1 *0 *0 *0 *0 *0 *0 *0 -1 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 -1 *0 *0 *0 1 0 0 0]
* marks the values that may be considered enclosed. With a tolerance of 6 (for instance), the values inside are replaced with the first boundary value:
A_replaced=[0 0 0 0 1 *1 *1 *1 *1 *1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 *-1 *-1 *-1 -1 0 0 0;
0 0 1 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 *-1 *-1 *-1 -1 0 0 0]
As you can see, everything that counts as an enclosure is replaced by the first boundary value (including the closing boundary value) wherever the distance between the two boundaries does not exceed 6 (the tolerance). Is there a vectorized method to do this for a large matrix? Thanks in advance!
Because I've already answered half of this question, I will do it in two steps. Note that the way you phrased the question here is not the fully general case, since you also need to handle matrices that have runs of consecutive non-zeros in the places where you put a single 1 here. This relies on your previous question and comments.
So the first step is to find all the zero sequences whose length is within the tolerance and replace them with 1. The second step is your previous question: replace all non-zeros with the starting value of their sequence.
A preliminary step is to pad each row of A with zeros at the beginning. Doing everything with linear indexing and treating A as one long vector is much more convenient, but we don't want to catch zero sequences that run from the end of one row into the start of the next. We can remove the padding at the end.
A=[
0 0 0 0 0 0.5 0.9 -1 0.1 0.3 1 1 0 0 0 0.9 -1 0.1 0 0 0 0 0.3 -1 -1 0 0 0 0 0;
0 0 0 0 0 0.1 0.3 1 -1 -1 -1 -1 0 0 0 0 0 -1 -1 -1 0 0 1 1 1 0 0 0 0 0 ];
tol=6;
A=[zeros(size(A,1),tol), A]; % add zeros at beginning of each row
A=A.'; % working on the transposed A lets us index the sequences contiguously
zero_starts=strfind([A(1)==0; A(:)~=0].',[1 0]).'; % indices where `0` sequences start
zero_starts(end)=[]; % last sequence is not enclosed
zero_length=diff([0; find(diff(find(~A))>1)]); % length of each of these zero-sequences
zero_tol=find(zero_length<=tol); % zero-sequences that are no longer than the tolerance
zero_tol(strfind(diff([0; zero_tol].')-1>0,[1 0])+1)=[]; % remove every second one of consecutive zero-sequences
% replace all short zero-sequences with `1`
sl=arrayfun(@(x)0:x-1, zero_length(zero_tol),'uni',0);
A(repelem(zero_starts(zero_tol),zero_length(zero_tol))+cat(2,sl{:}).')=1;
Now, the second step is to replace each non-zero sequence with its first element:
B=find(A);
A(A~=0)=repelem(A(B([0 ;find(diff(B)>1)]+1)),diff([0;find(diff(B)>1) ;length(B)]));
Lastly, we can restore A to its original shape and size:
A=A(tol+1:end,:).'
A=[
0 0 0 0 0 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0 0 0 0 0.3 -1 -1 0 0 0 0 0;
0 0 0 0 0 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0 0 1 1 1 0 0 0 0 0 ];
When we run the show dbs command, the x-development database doesn't appear.
> show dbs
admin (empty)
local 0.203125GB
p-development 0.203125GB
p-production 3.9521484375GB
test (empty)
>
We want to make sure this database isn't taking server resources, but mongostat keeps listing it. Why does this happen if the database doesn't exist?
[root@mongo]# mongostat
connected to: 127.0.0.1
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn time
0 5 0 0 0 1 1 2.11g 4.86g 464m 0 x-development:0.0% 0 0|0 0|0 62b 2k 3 11:42:57
0 0 0 0 0 1 0 2.11g 4.86g 464m 0 x-development:0.0% 0 0|0 0|0 62b 2k 3 11:42:58
0 0 0 0 0 1 0 2.11g 4.86g 464m 0 x-development:0.0% 0 0|0 0|0 62b 2k 3 11:42:59
I don't think it's strange or wrong that mongo chooses to retain statistics for a deleted database. After all, they are statistics.
Deleting the database does not change the fact that you ran queries against it before.
I have a single mongod (2.0.2) running on a (Intel Xeon 8 core 2.1 Ghz, 32 GB RAM) server. Here are some sample stats from mongostat (usual normal calm day):
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn repl time
0 186 3 0 0 5 0 42.6g 87.8g 22.6g 2 0.8 0 0|0 0|0 35k 1m 319 M 20:36:00
0 177 3 0 0 4 0 42.6g 87.8g 22.5g 2 0.7 0 0|0 0|0 28k 993k 319 M 20:36:30
0 181 3 0 0 3 0 42.6g 87.8g 22.6g 1 0.6 0 0|0 0|1 28k 878k 319 M 20:37:00
0 177 4 0 0 4 0 42.6g 87.8g 22.6g 2 0.7 0 0|0 0|0 31k 851k 319 M 20:37:30
0 171 2 0 0 3 0 42.6g 87.8g 22.6g 2 0.4 0 0|0 1|0 25k 912k 319 M 20:38:00
0 133 1 0 0 3 0 42.6g 87.8g 22.5g 0 0.3 0 0|0 0|0 20k 673k 319 M 20:38:30
0 180 3 0 0 4 0 42.6g 87.8g 22.5g 1 0.6 0 0|0 1|0 29k 890k 319 M 20:39:00
But sometimes when there are 500-600 users online (I store visit counters in Mongo, so there are a lot of updates when visitors come) queries jump up to ~500 per second and read queue grows quickly and stays at around 40-50 for few minutes which makes scripts time out.
Can adding a replica set member on the same machine (I don't have any more physical servers) help me? I want to set the read preference to point to the secondary member so that writes on the primary instance do not block the reads.
We have a MongoDB collection containing nearly 40 million records. The current size of the collection is 5GB. The data stored in this collection contains following fields:
_id: "MongoDB id"
userid: "user id" (int)
mobile: "users mobile number" (int)
transaction: "transaction id" (int)
sms: "message sent to user mobile" (text)
created_dt: "unix timestamp of the transaction"
Apart from the index on _id (created by default), we have defined separate indexes on the mobile and transaction fields.
However, the following query takes anywhere between 60 to 120 seconds to complete:
{
mobile:<users mobile number>
}
I access MongoDB using RockMongo. MongoDB is hosted on a server with 16GB RAM. Nearly 8GB RAM on this server is free.
What is it that I am doing wrong here?
Update:
Output of explain:
{
"cursor" : "BasicCursor",
"nscanned" : 37145516,
"nscannedObjects" : 37145516,
"n" : 37145516,
"millis" : 296040,
"nYields" : 1343,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
}
}
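A quick way to read the explain output above: in this older explain format, a cursor value of "BasicCursor" means a full collection scan, while a value starting with "BtreeCursor" means an index was used. The sketch below encodes just that check; usesIndex is a hypothetical helper name, and the second document is a made-up example for contrast.

```javascript
// Classify an old-format explain() document by its "cursor" field.
// usesIndex is a hypothetical helper for illustration only.
function usesIndex(explainDoc) {
  return (
    typeof explainDoc.cursor === "string" &&
    explainDoc.cursor.startsWith("BtreeCursor")
  );
}

// Values copied from the explain output above.
const slowQueryExplain = {
  cursor: "BasicCursor",
  nscanned: 37145516,
  nscannedObjects: 37145516,
  n: 37145516,
  millis: 296040,
};

// Hypothetical explain document for a query that does hit an index.
const indexedExplain = { cursor: "BtreeCursor mobile_1" };

console.log(usesIndex(slowQueryExplain)); // false: every document was scanned
console.log(usesIndex(indexedExplain));   // true
```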
Output of mongostat at the time of the query
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
13 2 0 0 0 1 0 168g 336g 6.86g 1 1 0 0|0 1|0 21k 1k 19 11:30:04
16 0 0 0 0 1 0 168g 336g 6.88g 0 0.1 0 0|0 1|0 21k 1k 19 11:30:05
14 0 0 0 0 1 0 168g 336g 6.86g 0 0 0 0|0 1|0 29k 1k 19 11:30:06
10 0 0 0 0 1 0 168g 336g 6.86g 0 0 0 0|0 1|0 19k 1k 19 11:30:07
16 0 0 0 0 1 0 168g 336g 6.88g 0 0.1 0 0|0 1|0 21k 1k 19 11:30:08
9 0 0 0 0 1 0 168g 336g 6.89g 0 0 0 0|0 1|0 13k 1k 19 11:30:09
19 0 0 0 0 1 0 168g 336g 6.89g 0 0 0 0|0 1|0 27k 1k 19 11:30:10
12 0 0 0 0 1 0 168g 336g 6.89g 1 1.2 0 0|0 1|0 24k 1k 19 11:30:11
17 0 0 0 0 1 0 168g 336g 6.89g 1 1.7 0 0|0 1|0 31k 1k 19 11:30:12
15 0 0 0 0 1 0 168g 336g 6.89g 0 0 0 0|0 1|0 19k 1k 19 11:30:13
Update 2:
Until recently, we used to store another collection with about 1.3 billion documents in the same MongoDB server. This collection has now been removed (dropped). This may explain the mapped / vsize column in above output from mongostat.
The server also stores 6 other collections which have frequent inserts. The total storage size currently is about 35GB.
Update 3:
Indexes defined on the collection. Created using RockMongo.
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "mymongodb.transaction_sms_details",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"_transaction_mobile_" : 1
},
"ns" : "mymongodb.transaction_sms_details",
"background" : 1,
"name" : "mobile"
},
{
"v" : 1,
"key" : {
"_transaction_transaction_" : 1
},
"ns" : "mymongodb.transaction_sms_details",
"background" : 1,
"name" : "transaction"
}
]
The keys generated by RockMongo are apparently incorrect:
"_transaction_mobile_" : 1
"_transaction_transaction_" : 1
I don't know what's wrong with RockMongo, but I think this can fix the issue:
db.xxx.dropIndexes();
db.xxx.ensureIndex({mobile: 1});
db.xxx.ensureIndex({transaction: 1});
Note: this may take a VERY long time. Don't do this on a running production machine.
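To see why the query on mobile cannot use any of the existing indexes, here is a small plain-JavaScript sketch (not a mongo shell command) that checks which index definitions have a given field as their leading key. The index list is copied from the question; indexesLeadingWith is a hypothetical helper name.

```javascript
// Return the names of indexes whose first key field equals `field`.
// indexesLeadingWith is a hypothetical helper for illustration only.
function indexesLeadingWith(indexes, field) {
  return indexes
    .filter((ix) => Object.keys(ix.key)[0] === field)
    .map((ix) => ix.name);
}

// Index definitions as listed in the question (only key and name kept).
const indexes = [
  { key: { _id: 1 }, name: "_id_" },
  { key: { _transaction_mobile_: 1 }, name: "mobile" },
  { key: { _transaction_transaction_: 1 }, name: "transaction" },
];

console.log(indexesLeadingWith(indexes, "mobile"));               // []
console.log(indexesLeadingWith(indexes, "_transaction_mobile_")); // [ 'mobile' ]
```

The index is only named "mobile"; its key is a field that does not exist in the documents, so a query on mobile falls back to a full scan.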
I'm running a MongoDB instance in a replica set. When there are a lot of inserts, I see very weird statistics for faults and locked %.
How can locked % be more than 100?
Where do the faults happen? I have no logs mentioning any fault. Does anyone have a clue what it means?
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn set repl time
9 0 0 0 1 4 0 70.3g 141g 4.77g 20 124 0 0|0 0|1 1m 2m 10 socialdb M 18:49:49
18 0 0 0 3 1 0 70.3g 141g 4.77g 17 73.8 0 0|0 0|1 1m 2m 10 socialdb M 18:49:50
21 0 0 0 1 5 0 70.3g 141g 4.77g 18 104 0 0|0 0|1 1m 1m 10 socialdb M 18:49:51
20 0 0 0 3 1 0 70.3g 141g 4.78g 18 98.8 0 0|0 0|1 1m 3m 10 socialdb M 18:49:52
172 0 0 0 5 4 0 70.3g 141g 4.79g 133 72.8 0 0|0 0|0 7m 12m 10 socialdb M 18:49:53
76 0 0 0 3 1 0 70.3g 141g 4.8g 114 65.1 0 0|0 0|1 6m 10m 10 socialdb M 18:49:54
54 0 0 0 4 4 1 70.3g 141g 4.81g 45 90.6 0 0|0 0|1 2m 8m 10 socialdb M 18:49:55
85 0 0 0 4 2 0 70.3g 141g 4.84g 101 98.1 0 0|0 0|1 6m 11m 10 socialdb M 18:49:56
77 0 0 0 3 4 0 70.3g 141g 4.82g 78 74.5 0 0|0 0|1 4m 9m 10 socialdb M 18:49:57
72 0 0 0 3 1 0 70.3g 141g 4.84g 111 95.7 0 0|0 0|1 6m 10m 10 socialdb M 18:49:58
Is there a better (standard) monitoring tool that is free?
Not sure about the other two, but this could be the answer to your first question, if you are using v2.2:
http://docs.mongodb.org/manual/reference/mongostat/
The above page mentions:
locked:
The percent of time in a global write lock.
(Changed in version 2.2: The locked db field replaces the locked % field to more appropriate data regarding the database specific locks in version 2.2)
locked db:
New in version 2.2.
The percent of time in the per-database context-specific lock. mongostat will report the database that has spent the most time since the last mongostat call with a write lock.
This value represents the amount of time the database had a database specific lock and the time that the mongod spent in the global lock. Because of this, and the sampling method, you may see some values greater than 100%.
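As a rough illustration of the quoted explanation, the sketch below adds a database-lock time and a global-lock time for one sampling interval and divides by the interval length. This is only a simplified model, not mongostat's actual code; the function name and all the numbers are hypothetical.

```javascript
// Simplified model (an assumption, not mongostat's real implementation):
// the reported figure combines database-lock time and global-lock time
// measured over one sampling interval.
function lockedPercent(dbLockMicros, globalLockMicros, intervalMicros) {
  return ((dbLockMicros + globalLockMicros) * 100) / intervalMicros;
}

// Hypothetical counters for a nominal 1-second sample.
const sample = lockedPercent(700000, 540000, 1000000);
console.log(sample + "%"); // 124%
```

Because two separate timers are summed against one interval, the result can exceed 100% without any single lock being held longer than the interval.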