How to understand a memory dump from WinDbg? - windbg

One of our websites is using around 2 GB of memory, and we are trying to understand why it needs so much (we want to move this site to Azure, and a large memory footprint means a higher Azure bill).
I took an IIS dump; Task Manager showed the process using about 2.2 GB of memory.
Then I ran !address -summary and this is what I got:
--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Free 913 7fb`2f5ce000 ( 7.981 Tb) 99.76%
<unknown> 4055 4`a49c9000 ( 18.572 Gb) 96.43% 0.23%
Heap 338 0`1dbd1000 ( 475.816 Mb) 2.41% 0.01%
Image 3147 0`0c510000 ( 197.063 Mb) 1.00% 0.00%
Stack 184 0`01d40000 ( 29.250 Mb) 0.15% 0.00%
Other 14 0`001bf000 ( 1.746 Mb) 0.01% 0.00%
TEB 60 0`00078000 ( 480.000 kb) 0.00% 0.00%
PEB 1 0`00001000 ( 4.000 kb) 0.00% 0.00%
--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_PRIVATE 2206 4`ba7d2000 ( 18.914 Gb) 98.20% 0.23%
MEM_IMAGE 5522 0`148b0000 ( 328.688 Mb) 1.67% 0.00%
MEM_MAPPED 71 0`019a0000 ( 25.625 Mb) 0.13% 0.00%
--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE 913 7fb`2f5ce000 ( 7.981 Tb) 99.76%
MEM_RESERVE 2711 4`378f4000 ( 16.868 Gb) 87.58% 0.21%
MEM_COMMIT 5088 0`9912e000 ( 2.392 Gb) 12.42% 0.03%
--- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
PAGE_READWRITE 1544 0`81afb000 ( 2.026 Gb) 10.52% 0.02%
PAGE_EXECUTE_READ 794 0`0f35d000 ( 243.363 Mb) 1.23% 0.00%
PAGE_READONLY 2316 0`05ea8000 ( 94.656 Mb) 0.48% 0.00%
PAGE_EXECUTE_READWRITE 279 0`020f4000 ( 32.953 Mb) 0.17% 0.00%
PAGE_WRITECOPY 92 0`0024f000 ( 2.309 Mb) 0.01% 0.00%
PAGE_READWRITE|PAGE_GUARD 61 0`000e6000 ( 920.000 kb) 0.00% 0.00%
PAGE_EXECUTE 2 0`00005000 ( 20.000 kb) 0.00% 0.00%
--- Largest Region by Usage ----------- Base Address -------- Region Size ----------
Free 5`3fac0000 7f9`59610000 ( 7.974 Tb)
<unknown> 3`06a59000 0`f9067000 ( 3.891 Gb)
Heap 0`0f1c0000 0`00fd0000 ( 15.813 Mb)
Image 7fe`fe767000 0`007ad000 ( 7.676 Mb)
Stack 0`01080000 0`0007b000 ( 492.000 kb)
Other 0`00880000 0`00183000 ( 1.512 Mb)
TEB 7ff`ffe44000 0`00002000 ( 8.000 kb)
PEB 7ff`fffdd000 0`00001000 ( 4.000 kb)
There are lots of things I don't really get:
The web server has 8 GB of memory in total, but the Free section in the usage summary shows 7.9 TB. Why?
<unknown> shows 18.572 GB, but the web server has only 8 GB of memory in total. Why?
Task Manager shows the private working set was about 2.2 GB, but if I add Heap, Image and Stack together I only get around 700 MB. Where is the remaining 1.5 GB, or am I reading the output completely wrong?
Many Thanks

The web server has 8 GB of memory in total, but the Free section in the usage summary shows 7.9 TB. Why?
The 8 GB of RAM is physical memory, i.e. the modules sitting in the DIMM slots of your server. The ~8 TB is virtual address space, which can also be backed by the page file.
The virtual address space is 4 GB for a 32-bit process; for a 64-bit process it depends on the OS limits (8 TB of user-mode address space on the Windows versions of that era, which matches the roughly 8 TB of free plus busy regions in the output).
<unknown> shows 18.572 GB, but the web server has only 8 GB of memory in total. Why?
The ~18.6 GB is virtual memory whose owner !address cannot classify, typically memory allocated by a custom memory manager such as the .NET runtime or by direct calls to VirtualAlloc().
Even though ~18.6 GB is more than 8 GB, this does not necessarily mean that memory was swapped to disk; it depends on the state of that memory. Looking at MEM_RESERVE, most of it (16.868 GB) is merely reserved address space that is not backed by physical memory or the page file yet, so your application may still perform well.
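Since this is an IIS worker process that most likely hosts .NET code, a natural next step is to break the <unknown> memory down with the managed-heap commands from the SOS extension. A minimal sketch, assuming a .NET 4.x process and that the SOS version matching the dump's CLR loads cleanly:
$$ Load SOS from the CLR module found in the dump (use mscorwks instead of clr for .NET 2.0/3.5)
.loadby sos clr
$$ Show the size of the GC heap generations and segments -- usually the bulk of <unknown>
!eeheap -gc
$$ Group managed objects by type, sorted by total size (largest at the bottom)
!dumpheap -stat
If the GC heap accounts for most of the committed <unknown> memory, the interesting question becomes which object types dominate the !dumpheap -stat output.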
Task Manager shows the private working set was about 2.2 GB, but if I add Heap, Image and Stack together I only get around 700 MB. Where is the remaining 1.5 GB, or am I reading the output completely wrong?
The rest is in <unknown>; if you include the committed part of <unknown>, the sum is actually larger than the 2.2 GB shown by Task Manager. The working set indicates how much physical RAM your process is currently using. Ideally everything would be in RAM, since RAM is fastest, but RAM is limited and not every application fits. Memory that is used infrequently is therefore paged out to disk, which reduces physical RAM usage and thus the working set.
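If you want to see exactly which regions make up the 2.392 GB of committed memory (as opposed to merely reserved address space), newer builds of the !address extension accept a filter; a sketch, assuming your WinDbg version supports the -f: option:
$$ List only regions whose state is MEM_COMMIT
!address -f:MEM_COMMIT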

Related

How to configure U-Boot for 512 MB RAM

I'm using an imx6slevk board with the Yocto BSP.
I use mfgtool to flash the images, and it works fine with 1 GB DRAM.
Now I'm trying to change the DRAM size to 512 MB.
I modified the dts file memory node:
memory {
reg = <0x80000000 0x20000000>; //it was 0x40000000
};
I ran the DDR calibration tool and updated 2 registers:
DATA 4 0x021b0848 0x4644484a //changed for 512 mb old value = 0x4241444a
DATA 4 0x021b0850 0x3a363a30 //changed for 512 mb old value = 0x3030312b
However, the U-Boot log still shows 1 GiB DRAM during flashing:
U-Boot 2017.03-imx_v2017.03_4.9.88_2.0.0_ga+gb76bb1b (Sep 24 2019 - 11:04:03 +0530)
CPU: Freescale i.MX6SL rev1.2 996 MHz (running at 792 MHz)
CPU: Commercial temperature grade (0C to 95C) at 48C
Reset cause: POR
Model: Freescale i.MX6 SoloLite EVK Board
Board: MX6SLEVK
DRAM: 1 GiB
How can I change DRAM from 1 GiB to 512 MiB?
The kernel doesn't flash without this.

joblib Parallel running out of memory

I have something like this:
outputs = Parallel(n_jobs=12, verbose=10)(delayed(_process_article)(article, config) for article in data)
Case 1: Run on ubuntu with 80 cores:
CPU(s): 80
Thread(s) per core: 2
Core(s) per socket: 20
Socket(s): 2
There are a total of 90,000 tasks. At around 67k it fails and is terminated.
joblib.externals.loky.process_executor.BrokenProcessPool: A process in the executor was terminated abruptly, the pool is not usable anymore.
When I monitor top around task 67k, I see free memory drop sharply:
top - 11:40:25 up 2 days, 18:35, 4 users, load average: 7.09, 7.56, 7.13
Tasks: 32 total, 3 running, 29 sleeping, 0 stopped, 0 zombie
%Cpu(s): 7.6 us, 2.6 sy, 0.0 ni, 89.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 33554432 total, 40 free, 33520996 used, 33396 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 40 avail Mem
Case 2: Mac with 8 cores
hw.physicalcpu: 4
hw.logicalcpu: 8
But on the Mac it is much, much slower, and surprisingly it does not get killed at 67k.
Additionally, I reduced the parallelism (in case 1) to 2 and 4, and it still fails. :(
Why is this happening? Has anyone faced this issue before and has a fix?
Note: when I run for 50,000 tasks it runs well and does not give any problems.
Thank you!
I got a machine with more memory (128 GB) and that solved the problem!
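If getting more RAM is not an option, one workaround (not part of the original answer) is to keep the parent process from queueing all 90,000 tasks and holding all their results at once by running the work in batches. A minimal sketch, assuming _process_article, config and data are defined as in the question:
from joblib import Parallel, delayed

def process_in_batches(process_article, data, config, batch_size=5000, n_jobs=12):
    # Each Parallel call only holds one batch of tasks and results,
    # instead of keeping all 90,000 results in memory at once.
    outputs = []
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        outputs.extend(
            Parallel(n_jobs=n_jobs, verbose=10)(
                delayed(process_article)(article, config) for article in batch
            )
        )
        # If the accumulated results are themselves large, write `outputs`
        # to disk here and clear the list.
    return outputs

outputs = process_in_batches(_process_article, data, config)
Smaller batches bound the peak memory of each Parallel call, which is often what exhausts RAM on large runs.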

pg_top output analysis of PuppetDB with Postgres

I recently started using a tool called pg_top that shows statistics for Postgres; however, since I am not very well versed in Postgres internals, I need a bit of clarification on the output.
last pid: 6152; load avg: 19.1, 18.6, 20.4; up 119+20:31:38 13:09:41
41 processes: 5 running, 36 sleeping
CPU states: 52.1% user, 0.0% nice, 0.8% system, 47.1% idle, 0.0% iowait
Memory: 47G used, 16G free, 2524M buffers, 20G cached
DB activity: 151 tps, 0 rollbs/s, 253403 buffer r/s, 86 hit%, 1550639 row r/s,
21 row w/s
DB I/O: 0 reads/s, 0 KB/s, 35 writes/s, 2538 KB/s
DB disk: 233.6 GB total, 195.1 GB free (16% used)
Swap:
My question is about the DB activity line: is 1.5 million rows read per second a lot? If so, what can be done to improve it? I am running PuppetDB 2.3.8 with 6.8 million resources and 2,500 nodes, on Postgres 9.1. All of this runs on a single 24-core box with 64 GB of memory.

strange G-WAN response speed differences

I have just deployed the G-WAN web server to test my code. Strangely, my server sometimes responds very fast (20 ms) and sometimes takes several seconds (6–7 s) or even times out...
I tried simplifying my code to just return a string to clients, and the problem still occurs...
Besides, I logged the time consumed by my code, and it is never over 1 second, so what causes the problem?
I guessed it was network delay, but a network speed test against the same server is very fast. Any idea? (Could the problem be caused by including a third-party library such as MySQL?)
Here is my G-WAN log:
*------------------------------------------------
*G-WAN 4.3.14 64-bit (Mar 14 2013 07:33:12)
* ------------------------------------------------
* Local Time: Mon, 29 Jul 2013 10:09:05 GMT+8
* RAM: (918.46 MiB free + 0 shared + 222.81 MiB buffers) / 1.10 GiB total
* Physical Pages: 918.46 MiB / 1.10 GiB
* DISK: 3.27 GiB free / 6.46 GiB total
* Filesystem Type Size Used Avail Use% Mounted on
* /dev/mapper/vg_centos6-root
* ext4 6.5G 3.2G 3.0G 52% /
* tmpfs tmpfs 1004M 8.2M 995M 1% /dev/shm
* /dev/xvda1 ext4 485M 129M 331M 28% /boot
* 105 processes, including pid:10874 '/opt/gwan/gwan'
* Page-size:4,096 Child-max:65,535 Stream-max:16
* CPU: 1x Intel(R) Xeon(R) CPU E5506 # 2.13GHz
* 0 id: 0 0
* Cores: possible:0-14 present:0 online:0
* L1d cache: 32K line:64 0
* L1i cache: 32K line:64 0
* L2 cache: 256K line:64 0
* L3 cache: 4096K line:64 0
* NUMA node #1 0
* CPU(s):1, Core(s)/CPU:0, Thread(s)/Core:2
* Bogomips: 4,256.14
* Hypervisor: XenVMMXenVMM
* using 1 workers 0[1]0
* among 2 threads 0[]1
* 64-bit little-endian (least significant byte first)
* CentOS release 6.3 (Final) (3.5.5-1.) 64-bit
* user: root (uid:0), group: root (uid:0)
* system fd_max: 65,535
* program fd_max: 65,535
* updated fd_max: 500,000
* Available network interfaces (3):
* 127.0.0.1
* 192.168.0.1
* xxx.xxx.xxx.xxx
* memory footprint: 1.39 MiB.
* Host /opt/gwan/0.0.0.0_8080/#0.0.0.0
* loaded index.c 3.46 MiB MD5:afb6c263-791c706a-598cc77b-e0873517
* memory footprint: 3.40 MiB.
If I use -g mode and increase the number of workers up to the number of CPUs on the server, the problem seems to be resolved.
Then it seems to be a CPU detection issue. Please dump the relevant part of your gwan.log file header (CPU detection) so we can have a look.
When G-WAN has to re-compile a servlet using external libraries that must be searched and linked, this may take time (especially if there's only one worker and other requests are pending).
UPDATE: following your gwan.log file dump, here is what's important:
CPU: 1x Intel(R) Xeon(R) CPU E5506 # 2.13GHz
0 id: 0 0
Cores: possible:0-14 present:0 online:0
CPU(s):1, Core(s)/CPU:0, Thread(s)/Core:2
Hypervisor: XenVMMXenVMM
using 1 workers 0[1]0
among 2 threads 0[]1
The Intel E5506 is a 4-Core CPU... but the Xen Hypervisor is reporting 1 CPU and 0 Cores (and hyperthreading enabled, which makes no sense without any CPU Core).
Why Xen finds it a priority to corrupt the genuine and correct information about the CPU with complete nonsense is beyond the purpose of this discussion.
All I can say is that this is the cause of the issue experienced by 'moriya' (hence the 'fix' with ./gwan -g -w 4 to bypass the wrong information reported by the corrupted Linux kernel /proc and the CPUID instruction).
I can only suggest avoiding brain-damaged hypervisors that prevent multicore software (like G-WAN) from running correctly by sabotaging the two standard ways to detect CPU topology: the Linux kernel /proc structure and the CPUID instruction.
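For completeness, here is what checking the reported topology and applying the workaround quoted above would look like on that CentOS guest (a sketch; the /opt/gwan install path is taken from the log):
# What the Xen-provided kernel exposes as CPU topology:
lscpu
grep -c ^processor /proc/cpuinfo
# The workaround mentioned above: bypass G-WAN's CPU detection and set the worker count explicitly:
cd /opt/gwan
./gwan -g -w 4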

Dedicated database server heavy iowait spikes

We have a dedicated database server running PostgreSQL 8.3 on Debian Linux. The database is regularly queried for a lot of data while updates/inserts also happen frequently. Periodically the database stops responding for a short while (around 10 seconds) and then returns to normal operation.
What I noticed through top is that there's an iowait spike during that time, lasting as long as the database is unresponsive. At the same time pdflush becomes active, so my idea is that pdflush has to write data from the page cache back to disk, based on the dirty page and background ratios. The rest of the time, when PostgreSQL works normally, there's no iowait because pdflush is not active. My vm settings are the following:
dirty_background_ratio = 5
dirty_ratio = 10
dirty_expire_centisecs = 3000
My meminfo:
MemTotal: 12403212 kB
MemFree: 1779684 kB
Buffers: 253284 kB
Cached: 9076132 kB
SwapCached: 0 kB
Active: 7298316 kB
Inactive: 2555240 kB
SwapTotal: 7815544 kB
SwapFree: 7814884 kB
Dirty: 1804 kB
Writeback: 0 kB
AnonPages: 495028 kB
Mapped: 3142164 kB
Slab: 280588 kB
SReclaimable: 265284 kB
SUnreclaim: 15304 kB
PageTables: 422980 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 14017148 kB
Committed_AS: 3890832 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 304188 kB
VmallocChunk: 34359433983 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
I am thinking of tweaking how long a dirty page stays in memory (dirty_expire_centisecs) so as to spread the iowait spikes more evenly over time (call pdflush more often so it writes smaller chunks of data to disk). Any other proposed solutions?
IO spikes are likely to happen when PostgreSQL is checkpointing.
You can verify that by logging checkpoints and seeing whether they coincide with the periods when the server stops responding.
If that's the case, tuning checkpoint_segments and checkpoint_completion_target is likely to help; see the sketch below.
See the PostgreSQL wiki's advice about checkpoint tuning and the documentation on WAL configuration.
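As an illustration only (the values shown are hypothetical and depend on your write volume and on how long a crash recovery you can accept), the relevant postgresql.conf settings on 8.3 look like this:
# postgresql.conf -- example values, tune for your workload
log_checkpoints = on                  # log each checkpoint so it can be correlated with the iowait spikes
checkpoint_segments = 16              # default is 3; more WAL segments mean fewer, less frequent checkpoints
checkpoint_completion_target = 0.9    # default is 0.5; spread checkpoint writes over more of the interval
Reload the configuration after changing these and compare the checkpoint log entries with the times when the server stops responding.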