What does the Mellanox interrupt mlx4-async@pci:0000 ... mean? - operating-system

I'm using a Mellanox InfiniBand card [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] with OFED version 4-1.0.0 on Ubuntu (kernel 3.13.0), running on an x86_64 computer with 4 cores.
Here is the output of ibstat on my computer:
CA 'mlx4_0'
CA type: MT26428
Number of ports: 1
Firmware version: 2.8.600
Hardware version: b0
Node GUID: 0x0002c903004d58ee
System image GUID: 0x0002c903004d58f1
Port 1:
State: Active
Physical state: LinkUp
Rate: 40
Base lid: 1
LMC: 0
SM lid: 1
Capability mask: 0x02510868
Port GUID: 0x0002c903004d58ef
Link layer: InfiniBand
and my /proc/interrupts looks like this:
67: 17923 4654 0 0 PCI-MSI-edge mlx4-async@pci:0000:01:00.0
68: 26696 0 54 0 PCI-MSI-edge mlx4_0-0
69: 0 34 23 0 PCI-MSI-edge mlx4_0-1
70: 0 0 0 0 PCI-MSI-edge mlx4_0-2
71: 0 0 0 0 PCI-MSI-edge mlx4_0-3
I read that each mlx4_0-x interrupt is associated with one CPU. My question is: what does the first interrupt, mlx4-async@pci:0000:01:00.0, mean? I have observed that when the opensm daemon is not yet running, this interrupt occurs every 5 minutes.

mlx4-async is used for asynchronous events other than completion events, e.g. link events, catastrophic events, CQ overruns, etc.
The interrupt is handled by the adapter driver, and depending on the event different modules are activated, such as link event notifications or cleanups due to asynchronous errors.
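To watch how often the async interrupt fires compared with the completion interrupts, the counters in /proc/interrupts can be summed per interrupt name. The following is a minimal Python sketch, not part of the driver or OFED; the function name parse_interrupts and the embedded SAMPLE text are illustrative assumptions based on the listing in the question.

```python
def parse_interrupts(text, name_filter="mlx4"):
    """Return {irq_name: [count_cpu0, count_cpu1, ...]} for matching lines.

    `text` is the content of /proc/interrupts. Only lines whose last
    field (the interrupt name) contains `name_filter` are kept.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Interrupt lines start with "<irq>:"; this skips the CPU header.
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue
        name = fields[-1]
        if name_filter not in name:
            continue
        counts = []
        for field in fields[1:]:
            if not field.isdigit():
                break  # reached the controller/type column
            counts.append(int(field))
        result[name] = counts
    return result


# Sample taken from the question (assumption: 4 CPUs, same column layout).
SAMPLE = """\
 67: 17923 4654 0 0 PCI-MSI-edge mlx4-async@pci:0000:01:00.0
 68: 26696 0 54 0 PCI-MSI-edge mlx4_0-0
 69: 0 34 23 0 PCI-MSI-edge mlx4_0-1
"""

for irq_name, per_cpu in parse_interrupts(SAMPLE).items():
    print(irq_name, per_cpu, "total:", sum(per_cpu))
```

On a live system you would pass open("/proc/interrupts").read() instead of SAMPLE and compare two readings taken a few minutes apart.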

Related

Where does the igmp version get set in RedHat 7

Is there a location or method for setting the default IGMP version for multicast on a RedHat 7 server other than the force parameter (net.ipv4.conf.eth0.force_igmp_version = 0) in sysctl.conf, sysctl.d, etc.? In the example above, the 0 implies that there is a default, which I assume is V3. The output below shows V2 on eth0, but it is not set or forced anywhere that I can find.
Idx Device : Count Querier Group Users Timer Reporter
1 lo : 1 V3
    010000E0 1 0:00000000 0
2 eth0 : 2 V2
    0A0707E7 1 0:00000000 1
    010000E0 1 0:00000000 0
3 eth1 : 1 V3
    010000E0 1 0:00000000 0
4 eth2 : 1 V3
    010000E0 1 0:00000000 0
Any Linux expert out there with an idea?
You can try net.ipv4.conf.all.force_igmp_version=2: instead of forcing the version on one interface, this forces it for every interface.
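To check which querier version each interface reports before and after changing the sysctl, the listing shown in the question can be parsed. This is a minimal Python sketch; it assumes the listing comes from /proc/net/igmp and keeps the column layout shown above, and the function name querier_versions is hypothetical.

```python
def querier_versions(text):
    """Return {device: querier_version} from an IGMP listing.

    Interface lines look like "2 eth0 : 2 V2"; the indented group
    lines and the header are ignored.
    """
    versions = {}
    for line in text.splitlines():
        head, sep, tail = line.partition(":")
        head_fields = head.split()
        tail_fields = tail.split()
        if (
            sep
            and len(head_fields) == 2
            and head_fields[0].isdigit()
            and len(tail_fields) >= 2
        ):
            # tail_fields is [count, querier]
            versions[head_fields[1]] = tail_fields[1]
    return versions


# Sample copied from the question.
SAMPLE = """\
Idx Device : Count Querier Group Users Timer Reporter
1 lo : 1 V3
 010000E0 1 0:00000000 0
2 eth0 : 2 V2
 0A0707E7 1 0:00000000 1
 010000E0 1 0:00000000 0
3 eth1 : 1 V3
 010000E0 1 0:00000000 0
"""

print(querier_versions(SAMPLE))
```

On the server itself, the same function could be fed the content of the file the listing was read from.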

How can I solve the error "Can't switch processors on a single processor kernel triage dump" in WinDbg?

I have a mini dump generated with the default parameters described at Collecting User-Mode Dumps.
The dump was generated while the system was hanging, by pressing right CTRL+SCROLL LOCK+SCROLL LOCK, as enabled by the following registry key:
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kbdhid\Parameters]
"CrashOnCtrlScroll"=dword:00000001
So the call stack that WinDbg shows me after the command 0: kd> !analyze -v is that of the thread that was executing in the kbdhid device driver.
When I try to switch to a different processor, I get the error:
0: kd> ~1
Can't switch processors on a single processor kernel triage dump
How can I solve this error?
What is a "single processor kernel triage dump"? If I search with Google I get only 3 or 4 results, so maybe someone from Microsoft could be of great help here :-).
Is there some particular value of CustomDumpFlags that I have to set? See MINIDUMP_TYPE enumeration.
I know that my system is multiprocessor and WinDbg confirms it:
0: kd> ~8
8 is not a valid processor number
0: kd> ~7
Can't switch processors on a single processor kernel triage dump
A single processor kernel dump, or kernel triage dump, is a feature
that lets you collect the kernel-mode stack trace of a user-mode process
on a machine that was not booted with /DEBUG on. IIRC it is available from Vista onwards.
You can also collect this dump using kdbgctrl:
D:\>tasklist | grep -i edge
xxxxxxxxxxxxxxxxxxxxxx
MicrosoftEdgeCP.exe 12588 Console 5 41,892 K
MicrosoftEdgeCP.exe 9152 Console 5 1,49,064 K
xxxxxxxxxxxxxx
D:\>kdbgctrl -td 9152 edgy.dmp
Dump created in edgy.dmp, 1048564 bytes
D:\>file edgy.dmp
edgy.dmp: MS Windows 64bit crash dump, 1018708 pages
Run the !process -1 1f command to get the stacks of all the threads of the current process.
Only one process's kernel memory will be available in this dump, so
!process 0 0 won't work.
It is not a full kernel memory dump, and it may not contain information about any other processor's stack either.
Run !cpuid: only the info about processor 0 will be present in this dump.
0: kd> !cpuid
CP F/M/S Manufacturer MHz
0 6,142,9 GenuineIntel 2304
Unable to get information for processor 1
Unable to get information for processor 2
Unable to get information for processor 3
0: kd>
The same applies to !irql:
0: kd> !irql 0
Debugger saved IRQL for processor 0x0 -- 0 (LOW_LEVEL)
0: kd> !irql 1
Cannot get PRCB address from processor 0x1
0: kd> !irql 2
Cannot get PRCB address from processor 0x2
0: kd> !irql 3
Cannot get PRCB address from processor 0x3
0: kd>

OpenOCD multiple STLinks

I need to be connected to 2 STM32s over 2 ST-Links at the same time. I found this issue described here.
However, the solution doesn't work for me.
ST-Link ID1: 55FF6B067087534923182367
ST-Link ID2: 49FF6C064983574951291787
OpenOCD cfg file:
source [find interface/stlink-v2.cfg]
hla_serial "55FF6B067087534923182367"
source [find target/stm32f4x.cfg]
# use hardware reset, connect under reset
reset_config srst_only srst_nogate
I get:
$ openocd.exe -f stm32f4_fmboard.cfg
Open On-Chip Debugger 0.10.0
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "hla_swd". To override use 'transport select <transport>'.
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
adapter speed: 2000 kHz
adapter_nsrst_delay: 100
none separate
srst_only separate srst_nogate srst_open_drain connect_deassert_srst
Info : Unable to match requested speed 2000 kHz, using 1800 kHz
Info : Unable to match requested speed 2000 kHz, using 1800 kHz
Info : clock speed 1800 kHz
Error: open failed
in procedure 'init'
in procedure 'ocd_bouncer'
I do not know if this is solved, but:
pi@raspberrypi:~/prog/bootloader $ st-info --probe
Found 1 stlink programmers
serial: 363f65064b46323613500643
openocd: "\x36\x3f\x65\x06\x4b\x46\x32\x36\x13\x50\x06\x43"
flash: 0 (pagesize: 0)
sram: 0
chipid: 0x0000
descr: unknown device
This tool shows the serial numbers of the ST-Links, and its output includes a field called openocd. When I put hla_serial "\x36\x3f\x65\x06\x4b\x46\x32\x36\x13\x50\x06\x43" in the cfg file, it works for me. Your way does not. It also does not work when given as an argument on the command line; it works only in the cfg file, as I described.
The format of the configuration file seems to have changed recently. The following applies for Open On-Chip Debugger 0.10.0+dev-00634-gdb070eb8 (2018-12-30-23:05).
Find out the serial number with lsusb, st-link, or with ls -l /dev/serial/by-id. The latter yields (with two STLink/V2.1 connected):
total 0
lrwxrwxrwx 1 root root 13 Nov 30 14:31 usb-STMicroelectronics_STM32_STLink_066CFF323535474B43125623-if02 -> ../../ttyACM0
lrwxrwxrwx 1 root root 13 Dec 30 23:55 usb-STMicroelectronics_STM32_STLink_0672FF485457725187052924-if02 -> ../../ttyACM1
The serial in the .cfg file is now specified as plain hex. Do not use the C string syntax any longer. To select the latter device, simply write:
#hla_serial "066CFF323535474B43125623"
hla_serial "0672FF485457725187052924"
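Since the two answers use different notations for the same serial (the escaped byte string printed by st-info versus plain hex), a small converter can help when moving between OpenOCD builds. This is a minimal Python sketch; the function names are hypothetical, and which notation a given OpenOCD build accepts is as described in the answers above, not something the code checks.

```python
def hla_serial_escaped(hex_serial):
    """Convert a plain hex serial into the \\xNN notation printed by st-info."""
    raw = bytes.fromhex(hex_serial)
    return "".join("\\x{:02x}".format(byte) for byte in raw)


def hla_serial_plain(escaped_serial):
    """Convert the \\xNN notation back into a plain upper-case hex serial."""
    return escaped_serial.replace("\\x", "").upper()


serial = "363f65064b46323613500643"  # serial from the st-info output above
print('hla_serial "{}"'.format(hla_serial_escaped(serial)))
print('hla_serial "{}"'.format(hla_serial_plain(hla_serial_escaped(serial))))
```

The first printed line matches the form the first answer reports as working on 0.10.0; the second matches the plain hex form described for the later development build.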

Understanding results of mongostat

I am trying to understand the results of mongostat:
example
insert query update delete getmore command flushes mapped vsize res faults locked % idx
0 2 4 0 0 10 0 976m 2.21g 643m 0 0.1 0
0 1 0 0 0 4 0 976m 2.21g 643m 0 0 0
0 0 0 0 0 1 0 976m 2.21g 643m 0 0 0
I see
mapped - 976m
vsize - 2.21g
res - 643m
res - RAM, so ~650MB of my database is in RAM
mapped - total size of database (via memory mapped files)
vsize - ???
I'm not sure why vsize is important or what exactly it means in this context. I'm running an m1.large, so I have about 400GB of HD space and 8GB of RAM.
Can someone help me out here and explain if
I am on the right page
what stats I should monitor in production
This should give you enough information
mapped - amount of data mmaped (total data size) megabytes
vsize - virtual size of process in megabytes
res - resident size of process in megabytes
1) I am on the right page
So mongostat is not really a "live monitor". It's mostly useful for connecting to a specific server and watching for something specific (what's happening when this job runs?). But it's not really useful for tracking performance over time.
Typically, for monitoring the server, you will want to use a tool like Zabbix, Cacti, or Munin, or some third-party server monitor. The MongoDB website has a list.
2) what stats I should monitor in production
You should monitor the same basic stats you would monitor on any server:
CPU
Memory
Disk IO
Network traffic
For MongoDB specifically, you will want to run db.serverStatus() and track the
opcounters
connections
indexcounters
Note that these are increasing counters, so you'll have to create the correct "counter type" in your monitoring system (Zabbix, Cacti, etc.). A few of these monitoring programs already have MongoDB plug-ins available.
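Because the counters only ever increase, a monitoring system has to turn two successive samples into a per-second rate. The following is a minimal Python sketch of that calculation; the function name counter_rates and the sample numbers are made up for illustration, and the dictionaries stand in for the opcounters section of two db.serverStatus() results.

```python
def counter_rates(previous, current, interval_seconds):
    """Turn two samples of increasing counters into per-second rates."""
    rates = {}
    for key, value in current.items():
        if key not in previous:
            continue
        delta = value - previous[key]
        if delta < 0:
            # Counter went backwards, e.g. after a server restart:
            # treat the current value as the increase since the reset.
            delta = value
        rates[key] = delta / interval_seconds
    return rates


# Hypothetical samples taken 60 seconds apart.
sample_t0 = {"insert": 1000, "query": 5000, "update": 200}
sample_t1 = {"insert": 1060, "query": 5300, "update": 200}

print(counter_rates(sample_t0, sample_t1, 60))
```

Tools such as Zabbix or Cacti do this internally once the item is declared as a counter, so the sketch is only meant to show what that "counter type" setting computes.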
Also note that MongoDB has a "free" monitoring service called MMS. I say "free" because you will be receiving calls from salespeople in exchange for setting up MMS.
Also, you can use these mini tools for watching MongoDB:
http://openmymind.net/2011/9/23/Compressed-Blobs-In-MongoDB/
By the way, I remembered this great online tool from 10gen:
https://mms.10gen.com/user/login

How do I determine which are the foreground .NET threads from WinDBG?

How do I determine which are the foreground .NET threads from WinDBG ?
Using the !threads command, the SOS extension tells us the count of the foreground threads but not which ones they are.
The state flag in the !threads output holds a lot of information. If the 0x00000200 flag is set the thread is a background thread.
In SOS for .NET 4 and PSSCOR2, there's a !threadstate command that lists the flag names for a given state value. If you don't have that, there's an overview of the flags in the Rotor source code and in Debugging .NET 2.0 Applications by John Robbins.
You can use the thread state values given in this link and find out if a thread is a background thread or not.
TS_Background 0x00000200 Thread is a background thread
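The flag test described above is a simple bitwise AND against the State column of the !threads output. Here is a minimal Python sketch of it; the thread IDs and state values in the example are hypothetical, and only the 0x00000200 value comes from the answers.

```python
TS_BACKGROUND = 0x00000200  # flag value quoted in the answer above


def is_background(state):
    """Return True if the TS_Background bit is set in a thread state value."""
    return bool(state & TS_BACKGROUND)


def foreground_thread_ids(threads):
    """Filter (thread_id, state_hex) pairs down to the foreground thread IDs.

    `state_hex` is the State column of !threads as a hex string.
    """
    return [tid for tid, state_hex in threads if not is_background(int(state_hex, 16))]


# Hypothetical (ID, State) pairs as they might be copied from !threads output.
threads = [(1, "a020"), (2, "b220"), (3, "1009220")]
print(foreground_thread_ids(threads))
```

This only looks at the background bit; it does not check whether a thread is already dead or was never started, so those flags would need separate tests.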
Netext's command !wthreads shows type and status information:
0:011> !wthreads
Id OSId Address Domain Alloc Start:End COM GC Type Locks Type / Status Last Exception
1 1854 0074f580 00748cd0 02c19308:02c1b2e8 STA Preemptive 0
2 1890 0075ab18 00748cd0 00000000:00000000 MTA Preemptive 0 Background|Finalizer
3 1bac 080ecb98 00748cd0 00000000:00000000 MTA Preemptive 0 Background|Worker
4 ---- 08106068 00748cd0 00000000:00000000 MTA Preemptive 0 Worker|Terminated
5 ---- 0810e988 00748cd0 00000000:00000000 MTA Preemptive 0 Worker|Terminated
6 ---- 080eb1d0 00748cd0 00000000:00000000 MTA Preemptive 0 Worker|Terminated
7 081c 080fcb48 00748cd0 00000000:00000000 MTA Preemptive 0 Background|IOCPort