MongoDB: Increasing read speed on a single machine - mongodb

I use MongoDB to store price events for stocks. Depending on what you want to screen, the number of event can rapidly grow to 1Go-2Go.
I run MongoDB on a single machine and it is taking longer and longer to load the data. I am not able to find a clear answer on the web if "sharding on a single server" is a benefit to read speed.
Is it the right path to increase the read speed?
insert query update delete getmore command flushes mapped vsize res faults locked db
0 1 395 0 1 395 0 63.9g 128g 2.53g 7484 prices:70.5%
0 0 14726 0 5 14728 0 63.9g 128g 2.48g 31555 prices:7.4%
0 0 0 0 0 1 0 63.9g 128g 2.48g 436 prices:0.0%
0 0 0 0 1 1 0 63.9g 128g 2.48g 0 prices:0.0%
0 0 0 0 0 1 1 63.9g 128g 2.49g 3877 .:83.9%
0 0 0 0 0 1 0 63.9g 128g 2.49g 0 prices:0.0%

Related

scbl exception Heap exhausted during garbage collection

When our application run for some time, for example , run for hours, the sbcl will throw heap exhausted exception.
Heap exhausted during garbage collection: 1968 bytes available, 2128 requested.
Gen StaPg UbSta LaSta LUbSt Boxed Unboxed LB LUB !move Alloc Waste Trig WP GCs Mem-age
0: 0 0 0 0 0 0 0 0 0 0 0 5368709 0 0 0.0000
1: 0 0 0 0 0 0 0 0 0 0 0 5368709 0 0 0.0000
2: 0 0 0 0 0 0 0 0 0 0 0 5368709 0 0 0.0000
3: 101912 101913 0 0 19362 20536 0 0 0 162867456 554752 102714709 0 1 1.4405
4: 130984 131071 0 0 29240 18868 0 0 25 191196152 5854216 128537781 14785 1 0.6442
5: 75511 81013 0 0 16567 17127 92 99 36 132974568 5818392 2000000 16565 0 0.0000
6: 0 0 0 0 7949 1232 0 0 0 37605376 0 2000000 7766 0 0.0000
Total bytes allocated = 524643552
Dynamic-space-size bytes = 536870912
GC control variables:
*GC-INHIBIT* = true
*GC-PENDING* = true
*STOP-FOR-GC-PENDING* = false
fatal error encountered in SBCL pid 3281(tid 3067845440):
Heap exhausted, game over.
Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb>
Any suggestion?
SBCL does not allow you to allocate more than (sb-ext:dynamic-space-size) bytes on the heap. Here you have a 512MB default size (536870912 bytes) and the Lisp program already was using nearly that amount when it attempted to make another allocation.
You could double the amount of heap space available to 1024MB by starting SBCL with --dynamic-space-size 1024. However, as several comments point out, there may be a memory leak, where objects are referenced somehow proportional to the time that the system has been running, so this will offer only a temporary respite.
The (room t) standard Common Lisp function call might help debug this, if you call it periodically.
More advanced code like this http://dwim.hu/darcsweb/darcsweb.cgi?r=HEAD%20hu.dwim.debug;a=headblob;f=/source/path-to-root.lisp#l42 which delves into the SB-VM internal map of allocations could shed more light, and SBCL has a statistical profiler, http://www.sbcl.org/manual/#Statistical-Profiler that supports reporting on allocations too.

Check whether unit cell forms a "whole piece" in MATLAB?

I wish to check whether a square unit cell, when repeated side by side, forms a valid "whole piece" or not. I am sorry that I may not have described it in an accurate way, but I really do not know how to say this using scientific terms. Let me give you two examples, one for "whole piece" and the other for "not whole piece".
1 represents rubber, and 0 represents air.
"whole piece"
0 0 1 0 0 0
0 0 1 0 0 0
1 1 1 0 1 1
0 0 1 1 1 0
0 0 1 0 0 0
0 0 1 0 0 0
"not whole piece"
0 0 1 0 0 0
0 0 1 0 0 0
1 1 1 0 1 1
0 0 0 1 1 0
0 0 1 0 0 0
0 0 1 0 0 0
As you can see, in the "whole piece" case, one can find a path (diagonal counted) from top to bottom and left to right, whereas in the "not whole piece", the path is broken. So if you lift the periodic structure up, part of it will fall apart.
How can I do it efficiently in MATLAB? Is there an efficient way? Built-in?
Update
Actually, even if bwconncomp() returns 1, the periodic structure may not be connected. Let's consider this example.
A =
0 0 0 1 0 0
0 0 1 1 0 0
1 1 1 0 1 0
0 0 1 1 1 1
0 0 1 0 0 0
0 0 1 0 0 0
>> bwconncomp(A, 4)
>> 1
If I show 4 A blocks, then it is like
A | A
-----
A | A
0 0 0 1 0 0 | 0 0 0 1 0 0
0 0 1 1 0 0 | 0 0 1 1 0 0
1 1 1 0 1 0|1 1 1 0 1 0
0 0 1 1 1 1 | 0 0 1 1 1 1
0 0 1 0 0 0 | 0 0 1 0 0 0
0 0 1 0 0 0 | 0 0 1 0 0 0
-------------------------
0 0 0 1 0 0 | 0 0 0 1 0 0
0 0 1 1 0 0 | 0 0 1 1 0 0
1 1 1 0 1 0 | 1 1 1 0 1 0
0 0 1 1 1 1 | 0 0 1 1 1 1
0 0 1 0 0 0 | 0 0 1 0 0 0
0 0 1 0 0 0 | 0 0 1 0 0 0
So as you can see, the 1 on the left side collides with the 0 on the right side of another A, causing the thing not a "whole piece". (Placed with no space deliberately to make it patent).
The term is "connected". Equivalently, each piece is called a connected component.
Use bwconncomp (Image Processing Toolbox) to determine the number of connectrd components. Let A denote your matrix. Then
c = bwconncomp(A,4);
returns a structure c whose NumObjects field indicates how many connected components there are in A. The input argument "4" is the number of neighbours you want to consider to defin connectedness: either 4 (don't take diagonals into account; that semes to be what you want) or 8 (include diagonals). So, just check if
c.NumObjects
is 1 ("whole piece") or more than 1 ("not whole piece"). In your examples, the first matrix has 1 connected component and the second has 3.
If you want it in one line:
getfield(bwconncomp(A,4),'NumObjects')==1
(or use Divakar's suggestion).

distribute matrix values of a binary image

I need to take the tril below and distribute the values equally so it ends up like a modified checkerboard.
x=ones(1186,686);
x2=tril(x);
A sample simplified matrix of what I need to have happen is below:
1 1 1 1 1 1
1 1 1 1 1 1
0 1 1 1 1 1
0 0 1 1 1 1
0 0 0 1 1 1
0 0 0 0 1 1
0 0 0 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
The matrix above needs to be changed into:
1 1 1 1 1 1
1 1 1 1 1 1
1 0 1 1 1 1
1 0 1 0 1 1
1 0 1 0 1 0
1 0 1 0 0 0
1 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
I've tried looking for built in MATLAB functions that would evenly distribute the zeros values across the board, but have not found anything that works. In terms of the output, where the ones and zeros appear is somewhat irrelevant. They just need to be distributed as evenly as possible within the matrix. For example the 3rd line of the modified matrix could = 1 1 1 0 1 1 or the like.
Would be be possible to achieve this effect a different way than starting with the tril that I'm not seeing?
For Beaker and anyone that wants to comment on how I could have better asked my original question.
This is the result of what I wanted. It's not exactly what I had described, but it achieves the same function I'm after. It's Completely different direction. I'm doing image analysis of the mixing and segregation of particles. Having contrived images like this allows me to determine if my mixing algorithm is producing the results I'm expecting. I can use the interleaving code you gave to continue making new and interesting patterns (not shown).
img=checkerboard(1,1186,686);
img_bw=im2bw(img);
img_mix=triu(img_bw,-500);
img_neg=imcomplement(img_mix);
imshow(img_neg)
I think this might be what you're after. Note that I borrowed the interleave-by-reshape trick from this blog post. (I also used triu instead of tril since it matched your example.)
x=ones(12,10);
x2=triu(x);
[rows,cols]=size(x2);
a = x2(:,1:cols/2);
b = x2(:,end:-1:(cols/2)+1);
% interlave two same sized matrices by column
a = a.';
b = b.';
col_interleave = reshape([b(:) a(:)]',2*size(a,1), []).'
Output is:
col_interleave =
1 1 1 1 1 1 1 1 1 1
1 0 1 1 1 1 1 1 1 1
1 0 1 0 1 1 1 1 1 1
1 0 1 0 1 0 1 1 1 1
1 0 1 0 1 0 1 0 1 1
1 0 1 0 1 0 1 0 1 0
1 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 0 0 0 0
1 0 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

"netIn" in mongostat output

I have wrote a test script which did millions of updates(using update query) in a collection. Following is the mongostat output
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 0 21156 0 0 1 0 208m 2.45g 119m 0 81.7 0 0|8 0|9 2m 1k 10 12:52:11
0 0 20620 0 0 1 0 208m 2.45g 119m 0 82.5 0 0|8 0|9 1m 1k 10 12:52:12
0 0 21915 0 0 1 0 208m 2.45g 119m 0 81.9 0 0|8 0|9 2m 1k 10 12:52:13
0 0 21634 0 0 1 0 208m 2.45g 119m 0 82.1 0 0|8 0|9 2m 1k 10 12:52:15
0 0 19793 0 0 1 0 208m 2.45g 119m 0 81.8 0 0|8 0|9 1m 1k 10 12:52:16
0 0 22062 0 0 1 0 208m 2.45g 119m 0 81.9 0 0|8 0|8 2m 1k 10 12:52:17
0 0 23395 0 0 1 0 208m 2.45g 119m 0 81.9 0 0|8 0|8 2m 1k 10 12:52:19
The netIn says the total network in bytes per second, i hope. Is there any way to increase the size of netIn to some mb, so that i can increase the update statement per second.
I don't think you understand the netIn statistic. It isn't some limit but the actual amount of data received by MongoDB per interval sample. In other words, the netIn value will increase if you (can) do more updates.
Increasing update throughput itself may be possible but is very application specific.

Memcached using more than max memory

i have an installation on memcache which i want to use in my production environment but when i have ran a couple of tests it seems that memcache doesn't free up memory even after it has used up all of it allocated memory, Also i logged in and ran a flush_all command but the objects are still in the cache.
Here are outputs from some tests
memcached-tool
memcache-top v0.6 (default port: 11211, color: on, refresh: 3 seconds)
INSTANCE USAGE HIT % CONN TIME EVICT/s READ/s WRITE/s
127.0.0.1:11211 427.1% 0.0% 18 1.4ms 0.0 244 261.0K
AVERAGE: 427.1% 0.0% 18 1.4ms 0.0 244 261.0K
TOTAL: 4.3MB/ 1.0MB 18 1.4ms 0.0 244 261.0K
memcached-tool 127.0.0.1:11211 display
No Item_Size Max_age Pages Count Full? Evicted Evict_Time OOM
1 560B 4s 1 1872 yes 0 0 15488
2 704B 32s 1 559 no 0 0 0
3 880B 4s 1 1191 yes 0 0 1335
4 1.1K 9s 1 116 no 0 0 0
5 1.4K 21s 1 14 no 0 0 0
6 1.7K 4s 1 17 no 0 0 0
7 2.1K 84s 1 24 no 0 0 0
8 2.7K 130s 1 60 no 0 0 0
9 3.3K 25s 1 290 no 0 0 0
10 4.2K 9s 1 194 no 0 0 0
11 5.2K 9s 1 116 no 0 0 0
15 12.7K 816s 1 1 no 0 0 0
16 15.9K 769s 1 5 no 0 0 0
18 24.8K 786s 1 1 no 0 0 0
21 48.5K 816s 1 1 no 0 0 0
memcached-tool 127.0.0.1:11211 stats
127.0.0.1:11211 Field Value
accepting_conns 1
auth_cmds 0
auth_errors 0
bytes 4478060
bytes_read 23964596
bytes_written 546642860
cas_badval 0
cas_hits 0
cas_misses 0
cmd_flush 0
cmd_get 240894
cmd_set 4504
conn_yields 0
connection_structures 21
curr_connections 18
curr_items 4461
decr_hits 0
decr_misses 0
delete_hits 0
delete_misses 0
evictions 0
get_hits 43756
get_misses 197138
incr_hits 0
incr_misses 0
limit_maxbytes 1048576
listen_disabled_num 0
pid 8731
pointer_size 64
reclaimed 0
rusage_system 5.047232
rusage_user 4.311344
threads 4
time 1306247929
total_connections 3092
total_items 4504
uptime 1240
version 1.4.5
-m tells memcached how much RAM to use for item storage (in megabytes). Note
carefully that this isn't a global
memory limit, so memcached will use a
few % more memory than you tell it to.
Set this to safe values. Setting it to
less than 48 megabytes does not work
properly in 1.4.x and earlier. It will
still use the memory.
Source: https://github.com/memcached/memcached/wiki/ConfiguringServer#commandline-arguments