I have a single mongod (2.0.2) running on an Intel Xeon server (8 cores, 2.1 GHz, 32 GB RAM). Here are some sample stats from mongostat (a usual, calm day):
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn repl time
0 186 3 0 0 5 0 42.6g 87.8g 22.6g 2 0.8 0 0|0 0|0 35k 1m 319 M 20:36:00
0 177 3 0 0 4 0 42.6g 87.8g 22.5g 2 0.7 0 0|0 0|0 28k 993k 319 M 20:36:30
0 181 3 0 0 3 0 42.6g 87.8g 22.6g 1 0.6 0 0|0 0|1 28k 878k 319 M 20:37:00
0 177 4 0 0 4 0 42.6g 87.8g 22.6g 2 0.7 0 0|0 0|0 31k 851k 319 M 20:37:30
0 171 2 0 0 3 0 42.6g 87.8g 22.6g 2 0.4 0 0|0 1|0 25k 912k 319 M 20:38:00
0 133 1 0 0 3 0 42.6g 87.8g 22.5g 0 0.3 0 0|0 0|0 20k 673k 319 M 20:38:30
0 180 3 0 0 4 0 42.6g 87.8g 22.5g 1 0.6 0 0|0 1|0 29k 890k 319 M 20:39:00
But sometimes, when there are 500-600 users online (I store visit counters in Mongo, so there are a lot of updates when visitors come), queries jump up to ~500 per second, and the read queue grows quickly and stays at around 40-50 for a few minutes, which makes scripts time out.
Can adding a replica set member on the same machine (I don't have any more physical servers) help me? I want to set the read preference to point to the secondary member so that writes on the primary instance do not block the reads.
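For reference, here is a minimal sketch of pointing reads at a secondary with the Python driver (pymongo); the ports, replica set name, database, and collection names are illustrative assumptions, and whether a same-machine secondary actually relieves the contention is exactly the open question:
from pymongo import MongoClient
# Connect to both members of the (hypothetical) replica set on this one host and
# ask the driver to route reads to the secondary when one is available.
client = MongoClient("mongodb://127.0.0.1:27017,127.0.0.1:27018/"
                     "?replicaSet=rs0&readPreference=secondaryPreferred")
counters = client["appdb"]["visit_counters"]    # hypothetical db/collection
doc = counters.find_one({"_id": "front-page"})  # served by the secondary if it is up
# Writes (e.g. the visit-counter updates) still have to go to the primary.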
I am trying to replace, per row, the values that are enclosed between two non-zero values, where the distance between the two boundary values is within an acceptable tolerance. For example:
A=[0 0 0 0 1 *0 *0 *0 *0 *0 -1 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 -1 *0 *0 *0 1 0 0 0;
0 0 1 *0 *0 *0 *0 *0 *0 *0 -1 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 *0 -1 *0 *0 *0 1 0 0 0]
* indicates the possible values to be considered as an enclosure. With a tolerance of 6 (for instance), the values inside will be replaced with the first boundary value:
A_replaced=[0 0 0 0 1 *1 *1 *1 *1 *1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 *-1 *-1 *-1 -1 0 0 0;
0 0 1 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 *-1 *-1 *-1 -1 0 0 0]
As you can see, everything that can be considered an enclosure is replaced by the first boundary value (including the closing boundary value), since the distance between the two boundaries does not exceed 6 (it is smaller than the tolerance). Is there a vectorized method to do this for a large matrix? Thanks in advance!
Because I've already answered half of this question, I will do it in two steps. Note that the way you phrased the question here is not the fully general case, since you also need to handle matrices with runs of consecutive non-zeros in the places where you put only a single 1. This builds on your previous question and its comments.
So the first step is to find all zero sequences whose length is within the tolerance and replace them with 1. The second step is your previous question: replace all non-zeros with the starting value of their sequence.
A preliminary step is to add zeros at the beginning of each row of A, because doing it all with linear indexing and treating A as one long vector is much more convenient, but we don't want to catch zero sequences that cross from one column to the next. We remove the added zeros at the end.
A=[
0 0 0 0 0 0.5 0.9 -1 0.1 0.3 1 1 0 0 0 0.9 -1 0.1 0 0 0 0 0.3 -1 -1 0 0 0 0 0;
0 0 0 0 0 0.1 0.3 1 -1 -1 -1 -1 0 0 0 0 0 -1 -1 -1 0 0 1 1 1 0 0 0 0 0 ];
tol=6;
A=[zeros(size(A,1),tol), A]; % add zeros at beginning of each row
A=A.'; % working on the transposed A lets us index the sequences contiguously
zero_starts=strfind([A(1)==0; A(:)~=0].',[1 0]).'; % indices where zero-sequences start
zero_starts(end)=[]; % the last sequence is not enclosed
zero_length=diff([0; find(diff(find(~A))>1)]); % length of each of these zero-sequences
zero_tol=find(zero_length<=tol); % zero-sequences no longer than the tolerance
zero_tol(strfind(diff([0; zero_tol].')-1>0,[1 0])+1)=[]; % remove each second of consecutive zero-sequences
% replace all short zero-sequences with `1`
sl=arrayfun(@(x)0:x-1, zero_length(zero_tol),'uni',0);
A(repelem(zero_starts(zero_tol),zero_length(zero_tol))+cat(2,sl{:}).')=1;
Now, the second step is to replace each non-zero sequence with the first element of that sequence:
B=find(A);
A(A~=0)=repelem(A(B([0 ;find(diff(B)>1)]+1)),diff([0;find(diff(B)>1) ;length(B)]));
Lastly, we restore A to its original shape and size:
A=A(tol+1:end,:).'
A=[
0 0 0 0 0 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0 0 0 0 0.3 -1 -1 0 0 0 0 0;
0 0 0 0 0 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0 0 1 1 1 0 0 0 0 0 ];
I have an issue with MongoDB becoming unresponsive and refusing connections in production.
In testing, I'm able to consistently reproduce the "lock out" by sending a lot of requests to my app.
Here is an example output from mongostat:
0 0 0 0 0 1 0 1.3g 1.8g 177m 0 0 0 0|0 0|0 62b 1k 18 19:40:11
0 0 0 0 0 1 0 1.3g 1.8g 177m 0 0 0 0|0 0|0 62b 1k 18 19:40:12
--------------LOAD TEST BEGINS-----------------------
1 56 0 0 1 58 0 1.3g 1.81g 177m 0 0 0 0|0 0|0 10k 302k 18 19:40:13
10 116 0 0 2 127 0 1.3g 1.81g 177m 0 0 0 0|0 1|0 32k 263k 19 19:40:14
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
206 431 0 0 2 638 0 1.3g 1.82g 177m 0 1.1 0 0|0 0|0 206k 339k 22 19:40:15
331 649 0 0 2 983 0 1.3g 1.81g 177m 0 1 0 0|0 0|0 317k 521k 21 19:40:16
367 723 0 0 10 1090 0 1.3g 1.81g 177m 0 0.9 0 0|1 1|1 297k 1m 20 19:40:17
500 961 0 0 11 1461 0 1.3g 1.81g 177m 0 1.2 0 0|0 0|0 390k 1m 19 19:40:18
543 1073 0 0 10 1618 0 1.3g 1.82g 177m 0 1.2 0 0|0 2|0 433k 1m 21 19:40:19
461 909 0 0 11 1370 0 1.3g 1.81g 177m 0 1.1 0 0|0 2|0 365k 1m 20 19:40:20
347 685 0 0 6 1033 0 1.3g 1.81g 177m 0 0.8 0 0|0 1|0 411k 962k 21 19:40:21
482 938 0 0 9 1421 0 1.3g 1.81g 177m 0 1.1 0 0|0 0|0 492k 1m 20 19:40:22
444 877 0 0 10 1322 0 1.3g 1.81g 177m 0 1.1 0 0|0 0|1 479k 1m 22 19:40:23
528 1019 0 0 9 1549 0 1.3g 1.81g 177m 0 1.2 0 0|0 0|0 446k 1m 21 19:40:24
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
520 1033 0 0 10 1554 0 1.3g 1.81g 177m 0 1.2 0 0|0 1|0 418k 1m 22 19:40:25
448 878 0 0 10 1326 0 1.3g 1.81g 177m 0 1.2 0 0|0 1|0 349k 1m 22 19:40:26
503 979 0 0 8 1483 0 1.3g 1.82g 177m 0 1.3 0 0|0 1|0 397k 1m 21 19:40:27
379 743 0 0 10 1123 0 1.3g 1.81g 177m 0 1 0 0|0 0|0 315k 1m 22 19:40:28
456 896 0 0 7 1354 0 1.3g 1.81g 177m 0 1.1 0 0|0 0|0 436k 1m 22 19:40:29
371 715 0 0 10 1086 0 1.3g 1.81g 175m 0 0.9 0 0|0 0|0 299k 1m 20 19:40:30
505 993 0 0 7 1499 0 1.3g 1.81g 175m 0 1.2 0 0|0 2|0 394k 1m 21 19:40:31
493 1023 0 0 10 1519 0 1.3g 1.82g 175m 0 1.1 0 0|0 1|0 392k 1m 22 19:40:32
426 868 0 0 11 1295 0 1.3g 1.81g 170m 0 1 0 0|0 1|0 363k 1m 21 19:40:33
352 772 0 0 7 1125 0 1.3g 1.81g 163m 19 4.4 0 0|1 0|1 368k 1m 22 19:40:34
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
330 656 0 0 5 985 0 1.3g 1.81g 159m 4 1.5 0 0|0 0|0 325k 876k 19 19:40:35
398 817 0 0 8 1218 0 1.3g 1.81g 155m 9 1.3 0 0|0 1|0 372k 1m 21 19:40:36
464 1015 0 0 8 1479 0 1.3g 1.81g 155m 2 1.2 0 0|0 1|0 381k 1m 21 19:40:37
457 915 0 0 11 1372 0 1.3g 1.81g 155m 0 1.1 0 0|0 0|0 356k 1m 22 19:40:38
458 960 0 0 7 1419 0 1.3g 1.83g 155m 0 1.1 0 0|0 0|0 376k 1m 22 19:40:39
391 760 0 0 11 1153 0 1.3g 1.81g 155m 0 1 0 0|0 1|0 321k 1m 21 19:40:40
315 639 0 0 5 956 0 1.3g 1.81g 155m 0 0.8 0 0|0 0|0 325k 877k 21 19:40:41
303 696 0 0 7 998 0 1.3g 1.81g 155m 0 0.8 0 0|0 0|0 279k 1m 21 19:40:42
388 772 0 0 8 1163 0 1.3g 1.81g 155m 0 1 0 0|0 0|0 308k 1m 20 19:40:43
273 623 0 0 7 895 0 1.3g 1.81g 155m 0 0.7 0 0|0 2|0 232k 1m 20 19:40:44
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
397 796 0 0 7 1194 0 1.3g 1.81g 155m 0 0.9 0 0|0 0|0 311k 1m 19 19:40:45
347 754 0 0 4 1103 0 1.3g 1.81g 155m 0 0.9 0 0|0 0|0 329k 771k 20 19:40:46
336 700 0 0 8 1042 0 1.3g 1.82g 160m 20 0.8 0 2|2 3|2 370k 1m 23 19:40:47
272 525 0 0 5 805 0 1.3g 1.79g 167m 188 0.6 0 0|0 0|0 306k 818k 18 19:40:48
103 246 0 0 1 350 0 1.3g 1.78g 166m 0 0.4 0 0|0 0|0 136k 234k 15 19:40:49
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:40:50
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:40:51
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:40:52
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:40:53
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:40:54
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:40:55
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:40:56
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:40:57
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:40:58
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:40:59
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:00
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:01
0 3 0 0 0 4 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 775b 4k 15 19:41:02
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:03
0 0 0 0 0 1 1 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:04
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:05
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:06
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:07
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:08
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:09
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:10
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:11
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:12
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:13
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:14
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:15
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:16
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:17
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:18
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:19
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:20
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:21
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:22
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:23
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:24
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:25
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:26
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:27
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:28
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:29
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:30
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:31
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:32
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:33
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:34
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:35
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:36
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:37
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:38
0 0 0 0 0 1 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 62b 1k 15 19:41:39
0 28 0 0 0 29 0 1.3g 1.78g 166m 0 0 0 0|0 0|0 5k 27k 15 19:41:40
86 234 0 0 1 322 0 1.3g 1.79g 166m 0 0.2 0 0|0 1|0 145k 217k 17 19:41:41
216 445 0 0 5 663 0 1.3g 1.79g 167m 0 0.5 0 0|0 0|0 220k 807k 16 19:41:42
302 612 0 0 6 916 0 1.3g 1.79g 167m 0 0.6 0 0|0 0|0 236k 950k 16 19:41:43
113 247 0 0 1 362 0 1.3g 1.79g 167m 0 0.2 0 0|0 0|0 95k 192k 17 19:41:44
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
38 63 0 0 2 103 0 1.3g 1.79g 167m 0 0 0 0|0 0|0 26k 289k 16 19:41:45
--------------LOAD TEST ENDS-----------------------
0 0 0 0 0 2 0 1.3g 1.79g 167m 0 0 0 0|0 0|0 124b 2k 16 19:41:46
0 0 0 0 0 2 0 1.3g 1.79g 167m 0 0 0 0|0 0|0 124b 2k 16 19:41:47
0 0 0 0 0 2 0 1.3g 1.79g 167m 0 0 0 0|0 0|0 124b 2k 16 19:41:48
0 0 0 0 0 2 0 1.3g 1.79g 167m 0 0 0 0|0 0|0 124b 2k 16 19:41:49
As you can see, in the middle of the load test, the reads and writes drop to pretty much nothing. During that time, I'm unable to connect locally to MongoDB via mongo or mongostat (connection refused). As you can also see, the connection count stays pretty low, and mongo's maximum number of connections is 16000.
I'm able to reproduce this consistently. However, each time the "lock out" is a different length. I've observed 51 seconds, 20 seconds, 28 seconds. In each instance, I'm still connected to mongo via mongostat (and receiving output), but new connections will be refused.
user@myhostname:~# mongostat
couldn't connect to [127.0.0.1] couldn't connect to server 127.0.0.1
What could be the issue? I'm running MongoDB v1.8.2.
As an aside, I'm able to reproduce a similar issue, where the commands don't drop off entirely, but are low in volume and fluctuate wildly.
Fixed it:
sudo sysctl net.ipv4.tcp_tw_reuse=1
Then, restart mongo.
Alternatively, you can add it to /etc/sysctl.conf (so it will be run on reboot):
net.ipv4.tcp_tw_reuse=1
Then run this to reload it (without having to reboot):
sudo sysctl -p /etc/sysctl.conf
This "fix" will disable the timewait state for TCP sockets (server-wide). So, it's really not a fix at all. However, until mongo reduces their timewait state using SO_LINGER, a large number of server sockets will pool up in the TIME_WAIT state and remain unusable for new connections. You can see the count of connections in TIME_WAIT with this:
netstat -an | grep TIME_WAIT | wc -l
With that, I was able to see it fail at around 28k TIME_WAIT connections. After widening the local port range with this kernel setting:
sysctl net.ipv4.ip_local_port_range="18000 65535"
The server fails at around 45k connections. So, to reproduce the error more easily, you could shrink the range to 200 ports or something similarly small.
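To make the TIME_WAIT build-up easier to watch, here is a minimal reproduction sketch in Python; the host/port and the iteration count are assumptions, and you would run the netstat command above alongside it:
import socket
HOST, PORT = "127.0.0.1", 27017  # assumed local mongod
for i in range(50000):  # shrink ip_local_port_range to hit the failure sooner
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.connect((HOST, PORT))
    except OSError as e:
        # Once every local ephemeral port for this destination is stuck in
        # TIME_WAIT, connect() starts failing, mirroring the lock-out symptom.
        print("failed after %d connections: %s" % (i, e))
        break
    finally:
        s.close()  # closing first leaves this side's socket in TIME_WAIT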
So, the result of this was a programming question after all (as you can see from the last link):
TCP option SO_LINGER (zero) - when it's required
http://alas.matf.bg.ac.rs/manuals/lspe/snode=105.html
Spent a ton of time debugging a similar issue. In the end, it turned out we had too many short-lived connections, each one getting closed after one small operation. This left a bunch of CLOSE_WAIT connections on the mongo server process and produced the exact same symptom you described in the question.
The solution was to reuse connections instead of opening and closing a connection for every command.
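A minimal sketch of what reusing connections can look like with the Python driver (pymongo); the URI, database, and collection names are illustrative assumptions. The point is that one client (with its internal connection pool) is created once and shared, instead of opening and closing a connection per command:
from pymongo import MongoClient
# Create the client once at application startup; pymongo keeps a pool of
# sockets, so operations borrow and return them instead of reconnecting.
client = MongoClient("mongodb://127.0.0.1:27017")  # assumed local mongod
counters = client["appdb"]["visit_counters"]       # hypothetical db/collection
def record_visit(page_id):
    # Reuses a pooled socket; no per-command connect/close, so no
    # TIME_WAIT / CLOSE_WAIT pile-up on either end.
    counters.update_one({"_id": page_id}, {"$inc": {"visits": 1}}, upsert=True)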
When we run the show dbs command, the x-development database doesn't appear.
> show dbs
admin (empty)
local 0.203125GB
p-development 0.203125GB
p-production 3.9521484375GB
test (empty)
>
We want to make sure this database isn't consuming server resources, but mongostat keeps listing it. Why does this happen if the database doesn't exist?
[root@mongo]# mongostat
connected to: 127.0.0.1
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn time
0 5 0 0 0 1 1 2.11g 4.86g 464m 0 x-development:0.0% 0 0|0 0|0 62b 2k 3 11:42:57
0 0 0 0 0 1 0 2.11g 4.86g 464m 0 x-development:0.0% 0 0|0 0|0 62b 2k 3 11:42:58
0 0 0 0 0 1 0 2.11g 4.86g 464m 0 x-development:0.0% 0 0|0 0|0 62b 2k 3 11:42:59
I don't think it's strange or wrong that mongo chooses to retain statistics for a deleted database. After all, they are statistics.
Deleting the database does not change the fact that you ran queries against it before.
I'm running a MongoDB instance using a replica set. When there are a lot of inserts, I can see very weird statistics for faults and locked %.
How can locked % be more than 100?!
Where do the faults happen? I have no logs mentioning any fault. Does anyone have a clue about what it means?
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn set repl time
9 0 0 0 1 4 0 70.3g 141g 4.77g 20 124 0 0|0 0|1 1m 2m 10 socialdb M 18:49:49
18 0 0 0 3 1 0 70.3g 141g 4.77g 17 73.8 0 0|0 0|1 1m 2m 10 socialdb M 18:49:50
21 0 0 0 1 5 0 70.3g 141g 4.77g 18 104 0 0|0 0|1 1m 1m 10 socialdb M 18:49:51
20 0 0 0 3 1 0 70.3g 141g 4.78g 18 98.8 0 0|0 0|1 1m 3m 10 socialdb M 18:49:52
172 0 0 0 5 4 0 70.3g 141g 4.79g 133 72.8 0 0|0 0|0 7m 12m 10 socialdb M 18:49:53
76 0 0 0 3 1 0 70.3g 141g 4.8g 114 65.1 0 0|0 0|1 6m 10m 10 socialdb M 18:49:54
54 0 0 0 4 4 1 70.3g 141g 4.81g 45 90.6 0 0|0 0|1 2m 8m 10 socialdb M 18:49:55
85 0 0 0 4 2 0 70.3g 141g 4.84g 101 98.1 0 0|0 0|1 6m 11m 10 socialdb M 18:49:56
77 0 0 0 3 4 0 70.3g 141g 4.82g 78 74.5 0 0|0 0|1 4m 9m 10 socialdb M 18:49:57
72 0 0 0 3 1 0 70.3g 141g 4.84g 111 95.7 0 0|0 0|1 6m 10m 10 socialdb M 18:49:58
Is there a better (standard) monitoring tool, ideally free?
Not sure about the other two, but this could be the answer to your first question, if you are using v2.2:
http://docs.mongodb.org/manual/reference/mongostat/
The above page mentions:
locked:
The percent of time in a global write lock.
(Changed in version 2.2: the locked db field replaces the locked % field to provide more appropriate data regarding the database-specific locks in version 2.2.)
locked db:
New in version 2.2.
The percent of time in the per-database context-specific lock. mongostat will report the database that has spent the most time since the last mongostat call with a write lock.
This value represents the amount of time the database held a database-specific lock plus the time that the mongod spent in the global lock. Because of this, and the sampling method, you may see some values greater than 100%. For example, if the reported database held its own lock for 80% of the sampling interval and mongod held the global lock for a further 30% of it, the column shows 110%.
I have an installation of memcached which I want to use in my production environment, but when I ran a couple of tests, it seems that memcached doesn't free up memory even after it has used up all of its allocated memory. Also, I logged in and ran a flush_all command, but the objects are still in the cache.
Here are the outputs from some tests.
memcache-top
memcache-top v0.6 (default port: 11211, color: on, refresh: 3 seconds)
INSTANCE USAGE HIT % CONN TIME EVICT/s READ/s WRITE/s
127.0.0.1:11211 427.1% 0.0% 18 1.4ms 0.0 244 261.0K
AVERAGE: 427.1% 0.0% 18 1.4ms 0.0 244 261.0K
TOTAL: 4.3MB/ 1.0MB 18 1.4ms 0.0 244 261.0K
memcached-tool 127.0.0.1:11211 display
No Item_Size Max_age Pages Count Full? Evicted Evict_Time OOM
1 560B 4s 1 1872 yes 0 0 15488
2 704B 32s 1 559 no 0 0 0
3 880B 4s 1 1191 yes 0 0 1335
4 1.1K 9s 1 116 no 0 0 0
5 1.4K 21s 1 14 no 0 0 0
6 1.7K 4s 1 17 no 0 0 0
7 2.1K 84s 1 24 no 0 0 0
8 2.7K 130s 1 60 no 0 0 0
9 3.3K 25s 1 290 no 0 0 0
10 4.2K 9s 1 194 no 0 0 0
11 5.2K 9s 1 116 no 0 0 0
15 12.7K 816s 1 1 no 0 0 0
16 15.9K 769s 1 5 no 0 0 0
18 24.8K 786s 1 1 no 0 0 0
21 48.5K 816s 1 1 no 0 0 0
memcached-tool 127.0.0.1:11211 stats
127.0.0.1:11211 Field Value
accepting_conns 1
auth_cmds 0
auth_errors 0
bytes 4478060
bytes_read 23964596
bytes_written 546642860
cas_badval 0
cas_hits 0
cas_misses 0
cmd_flush 0
cmd_get 240894
cmd_set 4504
conn_yields 0
connection_structures 21
curr_connections 18
curr_items 4461
decr_hits 0
decr_misses 0
delete_hits 0
delete_misses 0
evictions 0
get_hits 43756
get_misses 197138
incr_hits 0
incr_misses 0
limit_maxbytes 1048576
listen_disabled_num 0
pid 8731
pointer_size 64
reclaimed 0
rusage_system 5.047232
rusage_user 4.311344
threads 4
time 1306247929
total_connections 3092
total_items 4504
uptime 1240
version 1.4.5
-m tells memcached how much RAM to use for item storage (in megabytes). Note carefully that this isn't a global memory limit, so memcached will use a few % more memory than you tell it to. Set this to safe values. Setting it to less than 48 megabytes does not work properly in 1.4.x and earlier. It will still use the memory.
Source: https://github.com/memcached/memcached/wiki/ConfiguringServer#commandline-arguments
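As a quick way to check what limit a running instance actually has, here is a minimal sketch that asks memcached for its stats over a raw socket and prints limit_maxbytes; the host and port match the 127.0.0.1:11211 instance from the question, the rest is illustrative. In the stats above, limit_maxbytes is 1048576, i.e. only 1 MB, which is far below the 48 MB minimum the quoted documentation warns about:
import socket

def get_stat(host="127.0.0.1", port=11211, key="limit_maxbytes"):
    # Send the plain-text "stats" command and read until the terminating END line.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(b"stats\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    for line in data.decode().splitlines():
        parts = line.split()   # lines look like: STAT limit_maxbytes 1048576
        if len(parts) == 3 and parts[1] == key:
            return int(parts[2])
    return None

print(get_stat())  # 1048576 in the question's output, i.e. a 1 MB cache limit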