Perl fork() shared memory

Is there any way to load read-only variables into memory before forking and keep them there, without using up virtual memory multiplied by the number of children?
It seems to be a common opinion that read-only memory is shared by default and is copied on write. I ran a test and found this to be untrue:
#!/usr/bin/perl
my $data;
$$data = 'a' x 1_000_000; # keep it in a ref just in case that matters
foreach (0..10) {
    last unless my $pid = fork();
}
<STDIN>;
While the processes are sitting on STDIN, I check top:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15982 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
15983 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
15984 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
15985 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
15986 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
15987 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
15988 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
15989 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
15990 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
15991 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
15992 wdev 20 0 121m 2540 188 S 0.0 0.1 0:00.00 foo.t
Sure enough, practically all the memory is in individual child processes.
Is there something I am missing, or does forking a Perl process really copy the entire set of data structures into each child?

Either this is operating-system dependent or you're not seeing what you think you're seeing. I modified your test script to:
#!/usr/bin/env perl
print `free`;
my $data = [ ('a') x 1000000 ];
for (1 .. 10) {
    print `free`;
    last unless fork;
}
sleep 10;
Running it with ./forktest | grep Mem, I get output such as:
Mem: 3979908 2866552 1113356 0 667388 1258560
Mem: 3979908 2917888 1062020 0 667388 1258560
Mem: 3979908 2918284 1061624 0 667388 1258560
Mem: 3979908 2918532 1061376 0 667388 1258560
Mem: 3979908 2918936 1060972 0 667388 1258560
Mem: 3979908 2919404 1060504 0 667388 1258560
Mem: 3979908 2919900 1060008 0 667388 1258560
Mem: 3979908 2919900 1060008 0 667388 1258560
Mem: 3979908 2920148 1059760 0 667388 1258560
Mem: 3979908 2920148 1059760 0 667388 1258560
Mem: 3979908 2920496 1059412 0 667388 1258560
in which the second column of numbers (the "used" column) is the total system RAM usage. Note that it increases from 2866552 to 2917888 when $data is assigned at the start of the program, then stays fairly steady from there on as the forks happen.
I suspect that what you're seeing in top's SHR column is "shared memory" in the IPC-specific sense (i.e., chunks of memory which have been explicitly requested and allocated as shared), and pages which are merely available to multiple processes on a copy-on-write basis do not meet that definition.

The perl docs on fork say "File descriptors (and sometimes locks on those descriptors) are shared, while everything else is copied."
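As a follow-up check of the copy-on-write behaviour, Linux's /proc/&lt;pid&gt;/smaps breaks each mapping down into Shared_* and Private_* pages; summing Private_Dirty tells you how much memory a child has genuinely unshared, independent of top's SHR column. A Linux-specific sketch (the perl one-liner and the sleep timings are just for illustration):

```shell
# Start a perl process that builds ~1 MB of data, then sum the pages
# it has actually made private (Private_Dirty) from /proc/<pid>/smaps.
perl -e 'my $data = "a" x 1_000_000; sleep 3' &
pid=$!
sleep 1   # let the allocation happen before we measure
awk '/^Private_Dirty:/ { kb += $2 }
     END { printf "pid %s: %d kB private\n", PID, kb }' PID="$pid" "/proc/$pid/smaps"
wait
```

Running the same summation against each forked child would show whether the 2540 kB RES reported by top is truly duplicated or mostly COW-shared.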

Related

sed replace all zeros with ones and all ones with zero

I have this:
0|1 1|1 0|0 1|0 .|1 0|.
want this:
1|0 0|0 1|1 0|1 .|0 1|.
I was thinking this would work:
sed -e 's/0/1/g' -e 's/1/0/g'
but it just seems to give me all zeros.
There is a slightly lesser-known sed command to transliterate, y, which comes in handy here:
$ sed 'y/01/10/' <<< '0|1 1|1 0|0 1|0 .|1 0|.'
1|0 0|0 1|1 0|1 .|0 1|.
Let's play computer.
Given this:
0|1 1|1 0|0 1|0 .|1 0|.
... change all 0s to 1s. Now you have this:
1|1 1|1 1|1 1|1 .|1 1|.
... OK, now you are stuck. There are only 1s left and you do not know which to change back.
... so instead change the 0s into 2s, just for a brief moment:
2|1 1|1 2|2 1|2 .|1 2|.
... now change the 1s to 0s
2|0 0|0 2|2 0|2 .|0 2|.
... and then the 2s to 1s
1|0 0|0 1|1 0|1 .|0 1|.
And now the grand finale:
$ sed -e 's/0/2/g' -e 's/1/0/g' -e 's/2/1/g'
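Putting the two approaches side by side shows they produce identical output on the sample input:

```shell
input='0|1 1|1 0|0 1|0 .|1 0|.'
# single-pass transliteration: every 0 becomes 1 and vice versa
echo "$input" | sed 'y/01/10/'
# three-step substitution: park the 0s in a temporary symbol first
echo "$input" | sed -e 's/0/2/g' -e 's/1/0/g' -e 's/2/1/g'
# both print: 1|0 0|0 1|1 0|1 .|0 1|.
```

The y form is also safe when the input could already contain the temporary symbol (here, a literal 2) that the three-step version relies on.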

Akka threads constant in futex_ state -- Is this normal?

I have an Akka program whose threads spend most of their time in the futex_ wait channel (the WCHAN column below), even though the program isn't doing much. I am not familiar with the internals of Akka, so I'm not sure whether this is normal.
F S UID PID PPID LWP C PRI NI ADDR SZ WCHAN TTY TIME CMD
4 S 1 19 0 19 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
5 S 1 19 0 25 0 80 0 - 1149556 futex_ ? 0:04 /usr/bin/java
1 S 1 19 0 26 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 27 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 28 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 29 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 30 1 80 0 - 1149556 futex_ ? 0:12 /usr/bin/java
1 S 1 19 0 31 0 80 0 - 1149556 futex_ ? 0:04 /usr/bin/java
1 S 1 19 0 32 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 33 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 43 0 80 0 - 1149556 futex_ ? 0:02 /usr/bin/java
1 S 1 19 0 44 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 45 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 46 0 80 0 - 1149556 futex_ ? 0:01 /usr/bin/java
1 S 1 19 0 47 0 80 0 - 1149556 futex_ ? 0:02 /usr/bin/java
1 S 1 19 0 50 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 53 0 80 0 - 1149556 futex_ ? 0:01 /usr/bin/java
1 S 1 19 0 60 0 80 0 - 1149556 futex_ ? 0:03 /usr/bin/java
1 S 1 19 0 121 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 122 0 80 0 - 1149556 futex_ ? 0:01 /usr/bin/java
1 S 1 19 0 123 0 80 0 - 1149556 ep_pol ? 0:00 /usr/bin/java
1 S 1 19 0 282 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 283 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 284 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 285 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 295 0 80 0 - 1149556 futex_ ? 0:01 /usr/bin/java
1 S 1 19 0 383 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
1 S 1 19 0 433 0 80 0 - 1149556 futex_ ? 0:01 /usr/bin/java
1 S 1 19 0 1109 0 80 0 - 1149556 futex_ ? 0:00 /usr/bin/java
Checking the syscalls, there is a constant stream of timed-out futex calls. Here's a snippet covering just a fraction of a second:
[pid 43] futex(0x7f36266dfc28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 43] futex(0x7f36266dfc54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=14532, tv_nsec=950443844}, 0xffffffff <unfinished ...>
[pid 47] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 47] futex(0x7f3626f8ab28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 47] futex(0x7f3626f8ab54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=14532, tv_nsec=954449564}, 0xffffffff <unfinished ...>
[pid 33] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 33] futex(0x7f3624a11128, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 33] futex(0x7f3624a11154, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=14532, tv_nsec=998037087}, 0xffffffff <unfinished ...>
[pid 43] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 43] futex(0x7f36266dfc28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 43] futex(0x7f36266dfc54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=14532, tv_nsec=960733704}, 0xffffffff <unfinished ...>
[pid 47] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 47] futex(0x7f3626f8ab28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 47] futex(0x7f3626f8ab54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=14532, tv_nsec=964665830}, 0xffffffff <unfinished ...>
[pid 43] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 43] futex(0x7f36266dfc28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 43] futex(0x7f36266dfc54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=14532, tv_nsec=971001771}, 0xffffffff <unfinished ...>
[pid 47] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 47] futex(0x7f3626f8ab28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 47] futex(0x7f3626f8ab54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=14532, tv_nsec=973981797}, 0xffffffff <unfinished ...>
[pid 43] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 43] futex(0x7f36266dfc28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 43] futex(0x7f36266dfc54, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=14532, tv_nsec=980553240}, 0xffffffff <unfinished ...>
[pid 47] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
My theory is that this is the Akka actors rapidly checking whether there's any new work for them to do, but I'm seeking confirmation that this is a valid theory. Thank you!
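One way to sanity-check that theory is to aggregate the trace rather than eyeballing it: count timed-out futex waits per thread ID over a window. A small awk summary using the strace line format shown above (the embedded sample lines are illustrative; in practice you would capture a log with something like strace -f -e trace=futex -o futex.log -p &lt;jvm-pid&gt;):

```shell
# Three sample lines in the strace format shown above, for illustration
cat > /tmp/futex.log <<'EOF'
[pid 47] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 33] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 47] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
EOF
# Count ETIMEDOUT futex resumes per thread: field 2 is "47]" in
# lines like "[pid 47] <... futex resumed> ) = -1 ETIMEDOUT ..."
awk '/futex resumed/ && /ETIMEDOUT/ {
         gsub(/\]/, "", $2)
         timeouts[$2]++
     }
     END { for (tid in timeouts)
               printf "tid %s: %d timed-out waits\n", tid, timeouts[tid] }' /tmp/futex.log
```

If a handful of thread IDs dominate the counts, matching them against a thread dump (jstack reports native IDs in its nid field) would identify which pools are doing the timed parking.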

Mongodb Out of Memory while total DB size < available RAM

I have read what I could find on memory consumption for MongoDB, but the gist of what I understood was that everything is handled by the OS: if no memory is available, the data is read from disk, replacing something else in memory.
I have a pretty small database
> db.stats()
{
"db" : "prod",
"collections" : 11,
"objects" : 2022,
"avgObjSize" : 43469.34915924827,
"dataSize" : 87895024,
"storageSize" : 113283072,
"numExtents" : 30,
"indexes" : 10,
"indexSize" : 4840192,
"fileSize" : 201326592,
"nsSizeMB" : 16,
"extentFreeList" : {
"num" : 0,
"totalSize" : 0
},
"dataFileVersion" : {
"major" : 4,
"minor" : 22
},
"ok" : 1
}
on a small server with 1 GB of RAM. Given the size of the DB (~100 MB), I would assume 1 GB of RAM is plenty.
I have however been having Out of Memory errors for some time now, first infrequently (once every 2-3 weeks), and now almost twice a day.
I'm at a loss as to what could cause these issues, and thought that I may be missing something completely.
I ran the diagnostics I found on the net:
ulimit
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 7826
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 7826
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
mongod version is 3.0.12
OS info:
NAME="Amazon Linux AMI"
VERSION="2015.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2015.03"
PRETTY_NAME="Amazon Linux AMI 2015.03"
ANSI_COLOR="0;33"
CPE_NAME="cpe:/o:amazon:linux:2015.03:ga"
HOME_URL="http://aws.amazon.com/amazon-linux-ami/"
Amazon Linux AMI release 2015.03
db.serverStatus() is on pastebin. Edit: looking at https://docs.mongodb.com/manual/reference/command/serverStatus/#memory-status
If mem.virtual value is significantly larger than mem.mapped (e.g. 3 or more times), this may indicate a memory leak.
So maybe that's something to look at.
Swap is enabled
free -m execution before launching MongoDB
$ free -m
total used free shared buffers cached
Mem: 996 60 935 0 5 19
-/+ buffers/cache: 36 959
Swap: 4095 9 4086
and right after launch (launching it and running the command immediately):
$ free -m
total used free shared buffers cached
Mem: 996 925 71 0 5 834
-/+ buffers/cache: 84 911
Swap: 4095 9 4086
mongostat run
$ mongostat
insert query update delete getmore command flushes mapped vsize res faults qr|qw ar|aw netIn netOut conn time
*0 *0 *0 *0 0 1|0 0 688.0M 7.6G 764.0M 0 0|0 0|0 79b 10k 9 07:04:55
*0 *0 *0 *0 0 1|0 0 688.0M 7.6G 764.0M 0 0|0 0|0 79b 10k 9 07:04:56
*0 *0 *0 *0 0 3|0 0 688.0M 7.6G 764.0M 0 0|0 0|0 196b 11k 9 07:04:57
*0 *0 *0 *0 0 1|0 0 688.0M 7.6G 764.0M 0 0|0 0|0 79b 10k 9 07:04:58
*0 *0 *0 *0 0 2|0 0 688.0M 7.6G 764.0M 0 0|0 0|0 133b 10k 9 07:04:59
*0 *0 *0 *0 0 1|0 0 688.0M 7.6G 764.0M 0 0|0 0|0 79b 10k 9 07:05:00
Running mongostat a few hours later shows an increase in res memory (Edit: re-running serverStatus() shows no increase in mem.resident, though):
$ mongostat
insert query update delete getmore command flushes mapped vsize res faults qr|qw ar|aw netIn netOut conn time
*0 *0 *0 *0 0 1|0 0 688.0M 7.7G 856.0M 8 0|0 0|0 79b 10k 8 10:39:50
*0 *0 *0 *0 0 1|0 0 688.0M 7.7G 856.0M 0 0|0 0|0 79b 10k 8 10:39:51
*0 *0 *0 *0 0 1|0 0 688.0M 7.7G 856.0M 0 0|0 0|0 79b 10k 8 10:39:52
*0 *0 *0 *0 0 1|0 0 688.0M 7.7G 856.0M 0 0|0 0|0 79b 10k 8 10:39:53
*0 *0 *0 *0 0 4|0 0 688.0M 7.7G 856.0M 0 0|0 0|0 250b 11k 8 10:39:54
*0 2 *0 *0 0 1|0 0 688.0M 7.7G 856.0M 0 0|0 0|0 183b 11k 8 10:39:55
*0 1 *0 *0 0 1|0 0 688.0M 7.7G 856.0M 0 0|0 0|0 131b 11k 8 10:39:56
swapon -s
$ swapon -s
Filename Type Size Used Priority
/swapfile file 4194300 10344 -1
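Before the next OOM kill, it may help to quantify whether res actually grows over time rather than comparing two manual runs. A sketch that distills mongostat samples down to the min and max resident size seen (the --noheaders flag and the column position are assumptions based on the output format above, and the two embedded sample lines are illustrative):

```shell
# In practice, collect samples for a few hours first, e.g.:
#   mongostat --noheaders 60 >> /tmp/mongostat.log
# Two samples in the format shown above, for illustration:
printf '%s\n' \
  '*0 *0 *0 *0 0 1|0 0 688.0M 7.6G 764.0M 0 0|0 0|0 79b 10k 9 07:04:55' \
  '*0 *0 *0 *0 0 1|0 0 688.0M 7.7G 856.0M 8 0|0 0|0 79b 10k 8 10:39:50' \
  > /tmp/mongostat.log
# res is column 10; the crude gsub assumes every sample uses the M suffix
awk '{ gsub(/M$/, "", $10)
       if (min == "" || $10 + 0 < min) min = $10 + 0
       if ($10 + 0 > max) max = $10 + 0 }
     END { printf "res min %.0fM max %.0fM\n", min, max }' /tmp/mongostat.log
# prints: res min 764M max 856M
```

A steadily rising minimum would point at a genuine leak; a flat res with sudden kills would point at something else on the box claiming the memory.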
Edit: I've set up MongoDB Cloud Monitoring, and the issue just reoccurred. This is the report; the mongod process was killed at 02:29.
Do you have any idea what may be causing the issue? Or hints into where I should look?
Thanks for your help!
Seb

I want to parse these columns into a psql table, but how can I handle empty fields?

With this command: awk '{print $3;}' text.txt, it skips over the empty fields and prints the next column's value instead.
StartTime LastTime SrcAddr DstAddr Sport Dport SrcPkts DstPkts SrcBytes DstBytes
13:14:52.088291 13:14:52.088291 192.168.0.23 192.168.0.255 57621 57621 1 0 86 0
13:14:54.936682 13:14:54.936682 192.168.0.23 192.168.0.255 17500 17500 1 0 243 0
13:14:54.936479 13:14:54.936479 192.168.0.23 17500 17500 1 0 243 0
13:14:56.056179 13:14:56.056179 192.168.0.163 17500 17500 1 0 208 0
13:14:56.056370 13:14:56.056370 192.168.0.163 192.168.0.255 17500 17500 1 0 208 0
13:15:00.027462 13:15:00.027462 192.168.0.170 192.168.0.255 17500 17500 1 0 146 0
13:15:00.652690 13:15:00.652874 192.168.0.166 443 57201 1 1 121 66
13:15:04.636177 13:15:04.636177 192.168.0.163 192.168.0.255 57621 57621 1 0 86 0
13:15:09.622847 13:15:09.622847 192.168.0.166 38029 53 1 0 76 0
13:15:13.138197 13:15:13.138197 192.168.0.158 192.168.0.255 57621 57621 1 0 86 0
13:15:15.429763 13:15:15.429763 192.168.0.151 192.168.0.255 57621 57621 1 0 86 0
13:15:19.793651 13:15:19.793651 192.168.0.163 192.168.0.255 138 138 1 0 280 0
13:15:19.795815 13:15:19.795815 192.168.0.90 192.168.0.255 137 137 1 0 92 0
13:15:20.342669 13:15:20.342669 192.168.0.23 192.168.0.255 137 137 1 0 92 0
13:14:40.721537 13:14:42.016881 192.168.0.136 192.168.0.166 22 59301 11 14 1334 1404
13:15:24.981466 13:15:24.981466 192.168.0.23 255.255.255.255 17500 17500 1 0 243 0
13:15:24.981666 13:15:24.981666 192.168.0.23 192.168.0.255 17500 17500 1 0 243 0
13:14:40.996353 13:14:41.996328 192.168.0.22 2 0 120 0
13:14:41.203309 13:15:41.171881 66 66 31 0 1984 0
13:15:26.116537 13:15:26.116537 192.168.0.163 17500 17500 1 0 208 0
13:15:26.116728 13:15:26.116728 192.168.0.163 192.168.0.255 17500 17500 1 0 208 0
13:14:41.863153 13:15:41.852795 0 0 31 0 1860 0
13:15:01.195960 13:15:03.192229 192.168.0.234 3 0 180 0
13:15:04.636774 13:15:04.636774 192.168.0.91 192.168.0.163 1 0 60 0
13:15:10.398423 13:15:10.398423 192.168.0.110 192.168.0.1 1 0 60 0
13:15:40.800831 13:15:40.913802 192.168.0.152 192.168.0.1 49556 53 2 2 148 689
You can extract the data by position, trim it, and print it with awk:
awk '{a=substr($0,40,17);gsub(/ */,"",a);print a}' file
SrcAddr
192.168.0.23
192.168.0.23
192.168.0.23
192.168.0.163
192.168.0.170
192.168.0.163
192.168.0.166
192.168.0.158
192.168.0.151
192.168.0.163
192.168.0.90
192.168.0.23
192.168.0.136
192.168.0.23
192.168.0.23
192.168.0.163
192.168.0.163
192.168.0.91
192.168.0.110
192.168.0.152
It LOOKS like you have fixed-width fields, so using GNU awk:
$ cat file
abcd ef ghi klmno
1234 12 123
$ awk -v FIELDWIDTHS="4 1 2 1 3 1 5" '{print "<" $3 ">"}' file
<ef>
<12>
$ awk -v FIELDWIDTHS="4 1 2 1 3 1 5" '{print "<" $5 ">"}' file
<ghi>
< >
and if you want to strip leading and trailing whitespace from each field for output, just add a gsub():
$ awk -v FIELDWIDTHS="4 1 2 1 3 1 5" '{gsub(/^\s+|\s+$/,"",$5); print "<" $5 ">"}' file
<ghi>
<>
$ awk -v FIELDWIDTHS="4 1 2 1 3 1 5" '{print "<" $7 ">"}' file
<klmno>
< 123 >
$ awk -v FIELDWIDTHS="4 1 2 1 3 1 5" '{gsub(/^\s+|\s+$/,"",$7); print "<" $7 ">"}' file
<klmno>
<123>
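Building on the fixed-width idea, here is a portable-awk sketch (substr instead of the GNU-only FIELDWIDTHS) that converts the toy file above into tab-separated output with empty fields emitted as \N, the marker psql's \copy treats as NULL by default. The column starts/widths and the extra sample line are assumptions for illustration; you would substitute the real offsets of your capture:

```shell
# Recreate the toy fixed-width file from the answer above, plus a line
# with an internal blank field (illustrative)
printf 'abcd ef ghi klmno\n1234    123 abcde\n' > /tmp/fixed.txt
awk '
{
    n = split("1 6 9 13", starts, " ")   # 1-based start of each column
    split("4 2 3 5", widths, " ")        # matching column widths
    line = ""
    for (i = 1; i <= n; i++) {
        f = substr($0, starts[i], widths[i])
        gsub(/^[ \t]+|[ \t]+$/, "", f)   # trim the padding
        line = line (f == "" ? "\\N" : f) (i < n ? "\t" : "")
    }
    print line
}' /tmp/fixed.txt
```

The result can then be loaded with something like psql -c "\copy mytable FROM '/tmp/out.tsv'" (table name and output path hypothetical).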

will there be any negative impact in using the option usePowerOf2Sizes

I am using MongoDB for our application.
I ran db.setProfilingLevel(1,25) in the mongo shell to identify the slow queries from the system.profile collection.
I observed that the read operations are fast, but the update operations are very slow.
This is a sample of my mongostat
insert/s query/s update/s delete/s getmore/s command/s flushes/s mapped vsize res faults/s locked % idx miss % q t|r|w conn time
0 950 469 0 0 471 0 10396 12295 3207 27 34.9 0 0|0|0 152 07:18:49
0 838 418 0 0 422 0 10396 12295 3209 21 34.6 0 0|0|0 152 07:18:50
0 1005 502 0 0 504 0 10396 12295 3211 21 35.5 0 0|0|0 152 07:18:51
0 837 410 0 0 418 0 10396 12295 3212 20 35.7 0 0|0|0 152 07:18:52
0 754 377 0 0 379 0 10396 12295 3214 19 36.7 0 0|0|0 152 07:18:53
0 841 420 0 0 422 0 10396 12295 3216 24 35.9 0 0|0|0 152 07:18:54
0 877 438 0 0 442 0 10396 12295 3217 23 37.2 0 0|0|0 152 07:18:55
0 799 393 0 0 395 0 10396 12295 3219 21 37 0 0|0|0 152 07:18:56
0 947 471 0 0 479 0 10396 12295 3221 26 39 0 0|0|0 152 07:18:57
0 855 427 0 0 429 0 10396 12295 3222 24 38.4 0 0|0|0 152 07:18:58
0 1007 504 0 0 506 0 10396 12295 3224 31 36 0 0|0|0 152 07:18:59
0 841 413 0 0 417 0 10396 12295 3226 23 37.2 0 0|0|0 152 07:19:00
The stats are from a dev environment; I can't really extrapolate to the prod environment.
Per the architecture I cannot reduce the index size on that collection, but I saw that usePowerOf2Sizes could help me improve the write/update response time in MongoDB.
I have heard a lot about usePowerOf2Sizes, which supposedly means that:
usePowerOf2Sizes can reduce fragmentation;
all data will fit in memory and performance will be great;
with this option MongoDB will be able to reuse space more effectively.
usePowerOf2Sizes is useful for collections where you will be inserting and deleting large numbers of documents, to ensure that MongoDB will effectively use space on disk.
I want to know whether there will be any negative impact in using the option usePowerOf2Sizes. I have 17 collections in my MongoDB and want to use usePowerOf2Sizes for only one collection.
Please let me know, thanks in advance.
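One measurable downside is allocation overhead: with power-of-two allocation each record's slot is rounded up to the next power of two, so documents sitting just above a boundary can waste close to half their slot on padding. A quick awk check of that padding (the document sizes are illustrative):

```shell
# Round a document size up to the next power of two, as
# usePowerOf2Sizes-style allocation would, and report the padding.
next_pow2_overhead() {
    awk -v n="$1" 'BEGIN {
        p = 1
        while (p < n) p *= 2
        printf "%d -> %d (%.0f%% padding)\n", n, p, (p - n) * 100 / p
    }'
}
next_pow2_overhead 43469   # prints: 43469 -> 65536 (34% padding)
next_pow2_overhead 1100    # prints: 1100 -> 2048 (46% padding)
```

If memory serves, the setting can be toggled per collection with db.runCommand({collMod: "mycollection", usePowerOf2Sizes: true}) (collection name hypothetical), and it became the default MMAPv1 allocation strategy in MongoDB 2.6, so on recent versions you may already be paying this cost.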