Zabbix server not starting after upgrade from 4.0 to 6.0

The Zabbix server does not start after upgrading from 4.0 to 6.0.
Environment: Oracle Linux 8, PostgreSQL 14, nginx as the web server. CacheSize is 2048M.
Server log after the upgrade to 6.0:
1142698:20220327:151229.553 ====== Fatal information: ======
1142698:20220327:151229.553 Program counter: 0x5626e015fd3a
1142698:20220327:151229.553 === Registers: ===
1142698:20220327:151229.554 r8 = 5626e247bb70 = 94725005097840 = 94725005097840
1142698:20220327:151229.554 r9 = 5626e246f5c0 = 94725005047232 = 94725005047232
1142698:20220327:151229.554 r10 = 0 = 0 = 0
1142698:20220327:151229.554 r11 = f = 15 = 15
1142698:20220327:151229.554 r12 = 5626e2430800 = 94725004789760 = 94725004789760
1142698:20220327:151229.554 r13 = 5626e027998b = 94724969437579 = 94724969437579
1142698:20220327:151229.554 r14 = 0 = 0 = 0
1142698:20220327:151229.554 r15 = 0 = 0 = 0
1142698:20220327:151229.554 rdi = 0 = 0 = 0
1142698:20220327:151229.554 rsi = 7ffc909bdb64 = 140722734619492 = 140722734619492
1142698:20220327:151229.554 rbp = 7ffc909bdb10 = 140722734619408 = 140722734619408
1142698:20220327:151229.554 rbx = 1 = 1 = 1
1142698:20220327:151229.554 rdx = 7ffc909bdb64 = 140722734619492 = 140722734619492
1142698:20220327:151229.554 rax = 0 = 0 = 0
1142698:20220327:151229.554 rcx = 0 = 0 = 0
1142698:20220327:151229.554 rsp = 7ffc909bdb10 = 140722734619408 = 140722734619408
1142698:20220327:151229.554 rip = 5626e015fd3a = 94724968283450 = 94724968283450
1142698:20220327:151229.554 efl = 10202 = 66050 = 66050
1142698:20220327:151229.554 csgsfs = 2b000000000033 = 12103423998558259 = 12103423998558259
1142698:20220327:151229.554 err = 4 = 4 = 4
1142698:20220327:151229.554 trapno = e = 14 = 14
1142698:20220327:151229.554 oldmask = 0 = 0 = 0
1142698:20220327:151229.554 cr2 = 0 = 0 = 0
1142698:20220327:151229.554 === Backtrace: ===
1142698:20220327:151229.555 15: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](zbx_backtrace+0x3f) [0x5626e0150815]
1142698:20220327:151229.555 14: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](zbx_log_fatal_info+0x141) [0x5626e0150a72]
1142698:20220327:151229.555 13: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](+0x24c25e) [0x5626e015125e]
1142698:20220327:151229.555 12: /lib64/libpthread.so.0(+0x12c30) [0x7f0660ebdc30]
1142698:20220327:151229.555 11: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](zbx_deserialize_uint31_compact+0x10) [0x5626e015fd3a]
1142698:20220327:151229.555 10: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](zbx_get_serialized_expression_functionids+0x27) [0x5626e01418fe]
1142698:20220327:151229.555 9: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](zbx_dc_get_event_maintenances+0x240) [0x5626e00ff7a3]
1142698:20220327:151229.555 8: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](+0x9fb21) [0x5626dffa4b21]
1142698:20220327:151229.555 7: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](timer_thread+0x1f2) [0x5626dffa53ad]
1142698:20220327:151229.555 6: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](zbx_thread_start+0x37) [0x5626e0160849]
1142698:20220327:151229.555 5: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](+0x66b33) [0x5626dff6bb33]
1142698:20220327:151229.555 4: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](MAIN_ZABBIX_ENTRY+0x7b8) [0x5626dff6ca5e]
1142698:20220327:151229.555 3: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](daemon_start+0x2f5) [0x5626e0150434]
1142698:20220327:151229.555 2: /usr/sbin/zabbix_server: timer #1 [started, processing maintenances](main+0x33a) [0x5626dff6afee]
1142698:20220327:151229.557 Please consider attaching a disassembly listing to your bug report.
1142698:20220327:151229.557 This listing can be produced with, e.g., objdump -DSswx zabbix_server.
1142698:20220327:151229.557 ================================
1142677:20220327:151229.559 One child process died (PID:1142698,exitcode/signal:1). Exiting ...
1142677:20220327:151229.657 PROCESS EXIT: 1142698
1142678:20220327:151229.657 HA manager has been paused
zabbix_server [1142677]: Error waiting for process with PID 1142698: [10] No child processes
1142678:20220327:151229.705 HA manager has been stopped
1142677:20220327:151229.745 syncing history data...
1142677:20220327:151229.746 error reason for "404b30b2-eaf6-11e1-8c79-b4c22c3547a9:vmware.hv.cpu.usage.perc" changed: Cannot evaluate function: item "/404b30b2-eaf6-11e1-8c79-b4c22c3547a9/vmware.hv.cpu.usage[{$URL},{HOST.HOST}]" is not supported at "last(//vmware.hv.cpu.usage[{$URL},{HOST.HOST}])*100)/(last(//vmware.hv.hw.cpu.num[{$URL},{HOST.HOST}])*last(//vmware.hv.hw.cpu.freq[{$URL},{HOST.HOST}]))".
1142677:20220327:151229.746 error reason for "38383135-3837-5a43-3331-353033444d35:vmware.hv.network.in[{$URL},{HOST.HOST},bps]" changed: No "vmware collector" processes started.
1142677:20220327:151229.754 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: invalid input syntax for type integer: "1648328400.000000"
LINE 1: ... partitions.history_p2022_03_27 (CHECK ((clock >= '164832840...
^
QUERY: CREATE TABLE IF NOT EXISTS partitions.history_p2022_03_27 (CHECK ((clock >= '1648328400.000000' AND clock < '1648414800.000000'))) INHERITS (history) TABLESPACE history_partitions;
CONTEXT: PL/pgSQL function trg_partition() line 38 at EXECUTE
[insert into history (itemid,clock,ns,value) values (872723,1648383149,527845142,0);
]
1142677:20220327:151229.809 syncing history data... 100.000000%
1142677:20220327:151229.809 syncing history data done
1142677:20220327:151229.809 syncing trend data...
1142677:20220327:151229.829 syncing trend data done
1142677:20220327:151229.867 Zabbix Server stopped. Zabbix 6.0.2 (revision d726a4d916).
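The [Z3005] failure in the log above is separate from the crash in the timer process. The partitioning trigger builds a CHECK constraint from the text '1648328400.000000', which PostgreSQL refuses to cast to the integer clock column. One possible cause, not confirmed by the post, is that PostgreSQL 14 changed EXTRACT(EPOCH FROM ...) to return numeric instead of double precision, so a trigger that pastes that value into SQL text would now emit a fractional part. The source of trg_partition() is not shown here, so the following is only a sketch of the arithmetic such a trigger needs. It assumes daily partitions aligned to local midnight at UTC+3, which is inferred from the two bounds in the log. The function names and the offset parameter are hypothetical.

```python
SECONDS_PER_DAY = 86400


def daily_partition_bounds(clock, utc_offset_seconds=3 * 3600):
    """Return (start, end) of the local day containing `clock` as integer Unix times.

    utc_offset_seconds is an assumption (UTC+3), inferred from the bounds in the log.
    """
    local = clock + utc_offset_seconds
    start = (local // SECONDS_PER_DAY) * SECONDS_PER_DAY - utc_offset_seconds
    return start, start + SECONDS_PER_DAY


def check_clause(clock):
    """Format the CHECK constraint with integer bounds (no fractional part)."""
    start, end = daily_partition_bounds(clock)
    return f"CHECK ((clock >= {start} AND clock < {end}))"


# Clock value taken from the failing INSERT in the log.
print(check_clause(1648383149))
# prints: CHECK ((clock >= 1648328400 AND clock < 1648414800))
```

Inside a PL/pgSQL trigger, the equivalent would be to cast the computed bounds to an integer type before concatenating them into the CREATE TABLE statement.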


How to store packet offset in BPF Map or skb->cb field?

Example Code
I want to pass the offset on to the subsequent tail calls. But when I try to store it in a BPF map (Method 1) or in the skb->cb field (Method 2), the verifier rejects the program with "offset is outside of the packet".
If the Method 1/2 code is removed, the BPF program loads successfully.
#include <vmlinux.h>
#include <bpf/bpf_endian.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
/* vmlinux.h carries types only, not macros, so define the TC verdict here. */
#ifndef TC_ACT_OK
#define TC_ACT_OK 0
#endif
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __type(key, __u32);
    __type(value, __u32);
    __uint(max_entries, 100);
} state_vars SEC(".maps");
SEC("tc")
int tc_ingress(struct __sk_buff *ctx) {
    void *data_end = (void *)(__u64)ctx->data_end;
    void *data = (void *)(__u64)ctx->data;
    u32 data_len = data_end - data;
    u32 rn = 0;
    u32 rn_idx = 0;
    /* Scan at most 1000 bytes for the first '\r' and remember its offset. */
    for (rn = 0; rn < 1000 && rn < data_len && data + rn < data_end; rn++) {
        char c = *(char*)(data + rn);
        if (c == '\r') {
            rn_idx = rn;
            break;
        }
    }
    /// Method 1
    // u32 var_idx = 0;
    // bpf_map_update_elem(&state_vars, &var_idx, &rn_idx, BPF_ANY);
    /// Method 2
    // ctx->cb[0] = rn_idx;
    return TC_ACT_OK;
}
Error Message
Method 1
; void *data = (void *)(__u64)ctx->data;
0: (61) r2 = *(u32 *)(r1 +76)
; void *data_end = (void *)(__u64)ctx->data_end;
1: (61) r3 = *(u32 *)(r1 +80)
2: (b7) r1 = 0
; u32 rn_idx = 0;
3: (63) *(u32 *)(r10 -4) = r1
last_idx 3 first_idx 0
regs=2 stack=0 before 2: (b7) r1 = 0
; u32 data_len = data_end - data;
4: (bf) r4 = r3
5: (1f) r4 -= r2
6: (67) r4 <<= 32
7: (77) r4 >>= 32
; for (rn = 0; rn < 1000 && rn < data_len && data + rn < data_end; rn++) {
8: (15) if r4 == 0x0 goto pc+10
R1_w=invP0 R2_w=pkt(id=0,off=0,r=0,imm=0) R3_w=pkt_end(id=0,off=0,imm=0) R4_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 fp-8=0000????
; for (rn = 0; rn < 1000 && rn < data_len && data + rn < data_end; rn++) {
9: (bf) r5 = r2
10: (0f) r5 += r1
; for (rn = 0; rn < 1000 && rn < data_len && data + rn < data_end; rn++) {
11: (3d) if r5 >= r3 goto pc+7
R1_w=invP0 R2_w=pkt(id=0,off=0,r=0,imm=0) R3_w=pkt_end(id=0,off=0,imm=0) R4_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R5_w=pkt(id=0,off=0,r=0,imm=0) R10=fp0 fp-8=0000????
; char c = *(char*)(data + rn);
12: (71) r5 = *(u8 *)(r5 +0)
invalid access to packet, off=0 size=1, R5(id=0,off=0,r=0)
R5 offset is outside of the packet
processed 13 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
Method 2
; void *data = (void *)(__u64)ctx->data;
1: (61) r3 = *(u32 *)(r1 +76)
; void *data_end = (void *)(__u64)ctx->data_end;
2: (61) r4 = *(u32 *)(r1 +80)
; u32 data_len = data_end - data;
3: (bf) r5 = r4
4: (1f) r5 -= r3
5: (bf) r0 = r5
6: (67) r0 <<= 32
7: (77) r0 >>= 32
; for (rn = 0; rn < 1000 && rn < data_len && data + rn < data_end; rn++) {
8: (15) if r0 == 0x0 goto pc+21
R0_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1=ctx(id=0,off=0,imm=0) R2_w=inv0 R3_w=pkt(id=0,off=0,r=0,imm=0) R4_w=pkt_end(id=0,off=0,imm=0) R5_w=inv(id=0) R10=fp0
9: (b7) r2 = 0
10: (b7) r0 = 0
; for (rn = 0; rn < 1000 && rn < data_len && data + rn < data_end; rn++) {
11: (bf) r6 = r3
12: (0f) r6 += r0
last_idx 12 first_idx 0
regs=1 stack=0 before 11: (bf) r6 = r3
regs=1 stack=0 before 10: (b7) r0 = 0
; for (rn = 0; rn < 1000 && rn < data_len && data + rn < data_end; rn++) {
13: (3d) if r6 >= r4 goto pc+16
R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R2_w=inv0 R3_w=pkt(id=0,off=0,r=0,imm=0) R4_w=pkt_end(id=0,off=0,imm=0) R5_w=inv(id=0) R6_w=pkt(id=0,off=0,r=0,imm=0) R10=fp0
; char c = *(char*)(data + rn);
14: (71) r6 = *(u8 *)(r6 +0)
invalid access to packet, off=0 size=1, R6(id=0,off=0,r=0)
R6 offset is outside of the packet
processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
Question
Why does the error appear when Method 1/2 code is added?
How could I store the offset value in BPF MAP or skb->cb field?
Update 2022.11.1
Following @pchaigno's advice, I added + 1 to the condition, but I get a similar error:
; int tc_ingress(struct __sk_buff *ctx)
0: (b7) r2 = 0
; void *data = (void *)(__u64)ctx->data;
1: (61) r3 = *(u32 *)(r1 +76)
; void *data_end = (void *)(__u64)ctx->data_end;
2: (61) r4 = *(u32 *)(r1 +80)
; u32 data_len = data_end - data;
3: (bf) r5 = r4
4: (1f) r5 -= r3
5: (bf) r0 = r5
6: (67) r0 <<= 32
7: (77) r0 >>= 32
; for (rn = 0; rn < 1000 && rn < data_len && data + rn + 1 < data_end; rn++) {
8: (15) if r0 == 0x0 goto pc+23
R0_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1=ctx(id=0,off=0,imm=0) R2_w=inv0 R3_w=pkt(id=0,off=0,r=0,imm=0) R4_w=pkt_end(id=0,off=0,imm=0) R5_w=inv(id=0) R10=fp0
9: (b7) r2 = 0
10: (b7) r0 = 0
; for (rn = 0; rn < 1000 && rn < data_len && data + rn + 1 < data_end; rn++) {
11: (bf) r6 = r3
12: (0f) r6 += r0
last_idx 12 first_idx 0
regs=1 stack=0 before 11: (bf) r6 = r3
regs=1 stack=0 before 10: (b7) r0 = 0
13: (bf) r7 = r6
14: (07) r7 += 1
; for (rn = 0; rn < 1000 && rn < data_len && data + rn + 1 < data_end; rn++) {
15: (3d) if r7 >= r4 goto pc+16
R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R2_w=inv0 R3_w=pkt(id=0,off=0,r=0,imm=0) R4_w=pkt_end(id=0,off=0,imm=0) R5_w=inv(id=0) R6_w=pkt(id=0,off=0,r=0,imm=0) R7_w=pkt(id=0,off=1,r=0,imm=0) R10=fp0
; char c = *(char*)(data + rn);
16: (71) r6 = *(u8 *)(r6 +0)
invalid access to packet, off=0 size=1, R6(id=0,off=0,r=0)
R6 offset is outside of the packet
processed 17 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
TL;DR. The problem is in how you read the packet data, not in how you store the offset. Your bounds check has an off-by-one issue. It only surfaces once the value you read is actually used, because otherwise the compiler optimizes the read away.
Verifier Error Explanation
; for (rn = 0; rn < 1000 && rn < data_len && data + rn < data_end; rn++) {
11: (3d) if r5 >= r3 goto pc+7
R1_w=invP0 R2_w=pkt(id=0,off=0,r=0,imm=0) R3_w=pkt_end(id=0,off=0,imm=0) R4_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R5_w=pkt(id=0,off=0,r=0,imm=0) R10=fp0 fp-8=0000????
; char c = *(char*)(data + rn);
12: (71) r5 = *(u8 *)(r5 +0)
invalid access to packet, off=0 size=1, R5(id=0,off=0,r=0)
The verifier says that the packet access is out of bounds by one byte: R5's offset + access size > R3's offset. Both offsets are 0 and the access size is 1, and r=0 means that no bytes behind R5 have been proven readable.
Root Cause
Your bounds check is off by one:
for (rn = 0; rn < 1000 && rn < data_len && data + rn < data_end; rn++) {
char c = *(char*)(data + rn);
To account for the access size, it should be:
for (rn = 0; rn < 1000 && rn < data_len && data + rn + 1 < data_end; rn++) {
char c = *(char*)(data + rn);
Why does it only happen once you store rn_idx?
If you don't write rn_idx anywhere, the compiler sees that rn_idx and c are never used. All code for those variables is compiled out, and the out-of-bounds packet access disappears with it.

Register errors when trying to separate strings and print them using BCC

I would like to split a string into words and print them one by one from a BPF program, but the verifier rejects everything I try. Changing const char str[] to const char *str just gives me an opcode 00 error. I think I am on the right path and only need help printing the words one at a time. The same code running in an online compiler shows what the output should look like (String Parser online IDE).
from bcc import BPF
# BPF program
bpfprogram = """
int helloworld2(void *ctx)
{
    const char str[] = "here are some words";
    int length = sizeof(str);
    int start = 0;
    // #pragma unroll  (tried this, but it does not really fix the issue)
    for (int i = 0; i < sizeof(str); i++) {
        if (str[i] == ' ') {
            bpf_trace_printk("%s\\n", i - start, str + start);
            start = i + 1;
        }
    }
    bpf_trace_printk("%s\\n", length - start, str + start);
    return 0;
}
"""
# Compile the program defined by the bpfprogram string into BPF bytecode.
# It is loaded into the kernel, and checked by the BPF verifier, when it is
# attached below.
b = BPF(text=bpfprogram)
# Attach the compiled BPF program to a kernel event of your choosing, in this
# case the clone syscall, so that the BPF program runs every time clone is called.
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="helloworld2")
# Capture and print the BPF program's trace output
b.trace_print()
Here is the error I am seeing. Using a pointer instead of a char array only produces an opcode 00 error, and #pragma unroll does not fix the problem either. Is there a solution that I am missing? One notable error appears near the end: R4 bitwise operator |= on pointer prohibited
bpf: Failed to load program: Permission denied
btf_vmlinux is malformed
Unrecognized arg#0 type PTR
; int helloworld2(void *ctx)
0: (b7) r1 = 7562354
; const char str[] = "here are some words";
1: (63) *(u32 *)(r10 -8) = r1
2: (18) r1 = 0x6f7720656d6f7320
4: (7b) *(u64 *)(r10 -16) = r1
5: (18) r1 = 0x6572612065726568
7: (7b) *(u64 *)(r10 -24) = r1
8: (b7) r6 = 684837
; ({ char _fmt[] = "%s\n"; bpf_trace_printk_(_fmt, sizeof(_fmt), i - start, str + start); });
9: (63) *(u32 *)(r10 -28) = r6
10: (bf) r1 = r10
;
11: (07) r1 += -28
12: (bf) r4 = r10
13: (07) r4 += -24
; ({ char _fmt[] = "%s\n"; bpf_trace_printk_(_fmt, sizeof(_fmt), i - start, str + start); });
14: (b7) r2 = 4
15: (b7) r3 = 4
16: (85) call bpf_trace_printk#6
last_idx 16 first_idx 0
regs=4 stack=0 before 15: (b7) r3 = 4
regs=4 stack=0 before 14: (b7) r2 = 4
17: (b7) r1 = 5
; if (str[i] == ' ') {
18: (71) r2 = *(u8 *)(r10 -19)
; if (str[i] == ' ') {
19: (55) if r2 != 0x20 goto pc+9
R0_w=inv(id=0) R1_w=inv5 R2_w=inv32 R6_w=inv684837 R10=fp0 fp-8=????mmmm fp-16_w=inv8031924080438375200 fp-24_w=inv7310011936944579944 fp-32=mmmm????
;
20: (bf) r4 = r10
21: (07) r4 += -19
; ({ char _fmt[] = "%s\n"; bpf_trace_printk_(_fmt, sizeof(_fmt), i - start, str + start); });
22: (63) *(u32 *)(r10 -28) = r6
23: (bf) r1 = r10
;
24: (07) r1 += -28
; ({ char _fmt[] = "%s\n"; bpf_trace_printk_(_fmt, sizeof(_fmt), i - start, str + start); });
25: (b7) r2 = 4
26: (b7) r3 = 0
27: (85) call bpf_trace_printk#6
last_idx 27 first_idx 0
regs=4 stack=0 before 26: (b7) r3 = 0
regs=4 stack=0 before 25: (b7) r2 = 4
28: (b7) r1 = 6
; if (str[i] == ' ') {
29: (71) r2 = *(u8 *)(r10 -18)
; if (str[i] == ' ') {
30: (55) if r2 != 0x20 goto pc+12
R0=inv(id=0) R1_w=inv6 R2_w=inv32 R6=inv684837 R10=fp0 fp-8=????mmmm fp-16=inv8031924080438375200 fp-24=inv7310011936944579944 fp-32=mmmm????
31: (b7) r2 = 684837
; ({ char _fmt[] = "%s\n"; bpf_trace_printk_(_fmt, sizeof(_fmt), i - start, str + start); });
32: (63) *(u32 *)(r10 -28) = r2
33: (b7) r3 = 6
; ({ char _fmt[] = "%s\n"; bpf_trace_printk_(_fmt, sizeof(_fmt), i - start, str + start); });
34: (1f) r3 -= r1
35: (bf) r4 = r10
;
36: (07) r4 += -24
; ({ char _fmt[] = "%s\n"; bpf_trace_printk_(_fmt, sizeof(_fmt), i - start, str + start); });
37: (4f) r4 |= r1
last_idx 37 first_idx 28
regs=2 stack=0 before 36: (07) r4 += -24
regs=2 stack=0 before 35: (bf) r4 = r10
regs=2 stack=0 before 34: (1f) r3 -= r1
regs=2 stack=0 before 33: (b7) r3 = 6
regs=2 stack=0 before 32: (63) *(u32 *)(r10 -28) = r2
regs=2 stack=0 before 31: (b7) r2 = 684837
regs=2 stack=0 before 30: (55) if r2 != 0x20 goto pc+12
regs=2 stack=0 before 29: (71) r2 = *(u8 *)(r10 -18)
regs=2 stack=0 before 28: (b7) r1 = 6
R4 bitwise operator |= on pointer prohibited
processed 36 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
Traceback (most recent call last):
File "BPFHelloWorld.py", line 31, in <module>
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="helloworld2")
File "/usr/lib/python3/dist-packages/bcc/__init__.py", line 654, in attach_kprobe
fn = self.load_func(fn_name, BPF.KPROBE)
File "/usr/lib/python3/dist-packages/bcc/__init__.py", line 394, in load_func
raise Exception("Failed to load BPF program %s: %s" %
Exception: Failed to load BPF program b'helloworld2': Permission denied
You are using bcc's bpf_trace_printk function, which expects different arguments from the kernel helper of the same name.
In bcc, bpf_trace_printk is a wrapper around the corresponding BPF helper. According to its documentation, it takes one mandatory argument, the format string, followed by several optional arguments:
Syntax: int bpf_trace_printk(const char *fmt, ...)
So you can write things like:
bpf_trace_printk("remote-port: %d, local-port: %d\\n", skk.remote_port,
skk.local_port);
In contrast, the BPF helper expects the first argument to be the string and the second to be the size of the string:
static const struct bpf_func_proto bpf_trace_printk_proto = {
.func = bpf_trace_printk,
.gpl_only = true,
.ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_MEM,
.arg2_type = ARG_CONST_SIZE,
};

Update to Xamarin.Forms 5 causes debugging to be super slow

After upgrading from Xamarin.Forms 4.8 to 5.0.0.2012, the app (in debug mode only) freezes for a few seconds on every interaction and produces tons of output:
Thread started: <Thread Pool> #41
[Mono] GC_BRIDGE waiting for bridge processing to finish
[Mono] GC_TAR_BRIDGE bridges 0 objects 0 opaque 0 colors 0 colors-bridged 0 colors-visible 169 xref 4 cache-hit 0 cache-semihit 0 cache-miss 0 setup 0.07ms tarjan 0.09ms scc-setup 0.06ms gather-xref 0.00ms xref-setup 0.00ms cleanup 0.00ms
[Mono] GC_BRIDGE: Complete, was running for 0.05ms
[Mono] GC_MINOR: (Nursery full) time 15.02ms, stw 18.82ms promoted 1445K major size: 33920K in use: 32001K los size: 9472K in use: 7121K
Thread started: <Thread Pool> #42
Thread started: <Thread Pool> #43
Thread started: <Thread Pool> #44
Thread started: <Thread Pool> #45
Thread started: <Thread Pool> #46
Thread started: <Thread Pool> #47
Thread started: <Thread Pool> #48
Thread started: <Thread Pool> #49
Thread started: <Thread Pool> #50
Thread started: <Thread Pool> #51
[Mono] GC_BRIDGE waiting for bridge processing to finish
[Mono] GC_TAR_BRIDGE bridges 0 objects 0 opaque 0 colors 0 colors-bridged 0 colors-visible 169 xref 4 cache-hit 0 cache-semihit 0 cache-miss 0 setup 0.07ms tarjan 0.09ms scc-setup 0.06ms gather-xref 0.00ms xref-setup 0.00ms cleanup 0.00ms
[Mono] GC_BRIDGE: Complete, was running for 0.07ms
[Mono] GC_MINOR: (Nursery full) time 11.92ms, stw 14.43ms promoted 926K major size: 34896K in use: 32961K los size: 12544K in use: 10620K
[mono] Full thread dump:
[Mono] GC_TAR_BRIDGE bridges 0 objects 0 opaque 0 colors 0 colors-bridged 0 colors-visible 169 xref 4 cache-hit 0 cache-semihit 0 cache-miss 0 setup 0.07ms tarjan 0.09ms scc-setup 0.06ms gather-xref 0.00ms xref-setup 0.00ms cleanup 0.00ms
[Mono] GC_BRIDGE waiting for bridge processing to finish
[Mono] GC_BRIDGE: Complete, was running for 0.07ms
[Mono] GC_MINOR: (Nursery full) time 13.09ms, stw 15.10ms promoted 860K major size: 35728K in use: 33849K los size: 17664K in use: 15619K
[Mono] GC_TAR_BRIDGE bridges 0 objects 0 opaque 0 colors 0 colors-bridged 0 colors-visible 169 xref 4 cache-hit 0 cache-semihit 0 cache-miss 0 setup 0.07ms tarjan 0.09ms scc-setup 0.06ms gather-xref 0.00ms xref-setup 0.00ms cleanup 0.00ms
[Mono] GC_BRIDGE: Complete, was running for 0.07ms
[Mono] GC_MINOR: (Concurrent start) time 3.81ms, stw 13.08ms promoted 1K major size: 35728K in use: 33851K los size: 17664K in use: 15619K
[Mono] GC_MAJOR_CONCURRENT_START: (LOS overflow)
[Mono] GC_BRIDGE waiting for bridge processing to finish
Any idea how to make the debugger usable again?
Removing CachingStrategy="RecycleElement" from the ListView was a partial solution.
Xamarin garbage collection runs often

What are bsdthread_register, ulock_wake and other system calls reported by dtruss?

Here is an example interaction with dtruss on my macOS machine.
$ cat main.c
int main() {
    return 0;
}
$ clang main.c
$ sudo dtruss ./a.out
dtrace: system integrity protection is on, some features will not be available
SYSCALL(args) = return
open("/dev/dtracehelper\0", 0x2, 0x7FFF5AF83930) = 3 0
ioctl(0x3, 0x80086804, 0x7FFF5AF838B8) = 0 0
close(0x3) = 0 0
thread_selfid(0x3, 0x80086804, 0x7FFF5AF838B8) = 5434813 0
bsdthread_register(0x7FFFAF245080, 0x7FFFAF245070, 0x2000) = 1073741919 0
ulock_wake(0x1, 0x7FFF5AF830EC, 0x0) = -1 Err#2
issetugid(0x1, 0x7FFF5AF830EC, 0x0) = 0 0
mprotect(0x104C7F000, 0x88, 0x1) = 0 0
mprotect(0x104C81000, 0x1000, 0x0) = 0 0
mprotect(0x104C97000, 0x1000, 0x0) = 0 0
mprotect(0x104C98000, 0x1000, 0x0) = 0 0
mprotect(0x104CAE000, 0x1000, 0x0) = 0 0
mprotect(0x104CAF000, 0x1000, 0x1) = 0 0
mprotect(0x104C7F000, 0x88, 0x3) = 0 0
mprotect(0x104C7F000, 0x88, 0x1) = 0 0
getpid(0x104C7F000, 0x88, 0x1) = 34522 0
stat64("/AppleInternal/XBS/.isChrooted\0", 0x7FFF5AF82FA8, 0x1) = -1 Err#2
stat64("/AppleInternal\0", 0x7FFF5AF83040, 0x1) = -1 Err#2
csops(0x86DA, 0x7, 0x7FFF5AF82AD0) = -1 Err#22
dtrace: error on enabled probe ID 2158 (ID 552: syscall::sysctl:return): invalid kernel access in action #10 at DIF offset 40
ulock_wake(0x1, 0x7FFF5AF83050, 0x0) = -1 Err#2
csops(0x86DA, 0x7, 0x7FFF5AF823B0) = -1 Err#22
Many of these system calls look familiar, but others don't. Specifically, what are thread_selfid, bsdthread_register, ulock_wake, csops, and dtrace? If these are system calls, they don't have man pages documenting them. Are they private system calls, or something else? Where does dtruss get their names from?
Yes, as far as I know ulock_wake is a private system call used for threads.
You can find it in the XNU source code as system call number 516.

Hung processes resume if attached to strace

I have a network program written in C using TCP sockets. Sometimes the client program hangs forever waiting for input from the server. Specifically, the client hangs in a select() call on an fd used to read characters sent by the server.
I am using strace to find out where the process is stuck. However, sometimes when I attach strace to the hung client process, it immediately resumes its execution and exits properly. Not all hung processes behave this way: some stay stuck in select() even after I attach strace. But most of them resume execution when strace is attached.
I am curious what causes the processes to resume when strace is attached. It might give me a clue as to why the client processes hang.
Any ideas? What causes a hung process to resume its execution when strace is attached to it?
Update:
Here's the output of strace on hung processes.
> sudo strace -p 25645
Process 25645 attached - interrupt to quit
--- SIGSTOP (Stopped (signal)) # 0 (0) ---
--- SIGSTOP (Stopped (signal)) # 0 (0) ---
[ Process PID=25645 runs in 32 bit mode. ]
select(6, [3 5], NULL, NULL, NULL) = 2 (in [3 5])
read(5, "\0", 8192) = 1
write(2, "", 0) = 0
read(3, "====Setup set_oldtempbehaio"..., 8192) = 555
write(1, "====Setup set_oldtempbehaio"..., 555) = 555
select(6, [3 5], NULL, NULL, NULL) = 2 (in [3 5])
read(5, "", 8192) = 0
read(3, "", 8192) = 0
close(5) = 0
kill(25652, SIGKILL) = 0
exit_group(0) = ?
Process 25645 detached
_
> sudo strace -p 14462
Process 14462 attached - interrupt to quit
[ Process PID=14462 runs in 32 bit mode. ]
read(0, 0xff85fdbc, 8192) = -1 EIO (Input/output error)
shutdown(3, 1 /* send */) = 0
exit_group(0) = ?
_
> sudo strace -p 7517
Process 7517 attached - interrupt to quit
--- SIGSTOP (Stopped (signal)) # 0 (0) ---
--- SIGSTOP (Stopped (signal)) # 0 (0) ---
[ Process PID=7517 runs in 32 bit mode. ]
connect(3, {sa_family=AF_INET, sin_port=htons(300), sin_addr=inet_addr("100.64.220.98")}, 16) = -1 ETIMEDOUT (Connection timed out)
close(3) = 0
dup(2) = 3
fcntl64(3, F_GETFL) = 0x1 (flags O_WRONLY)
close(3) = 0
write(2, "dsd13: Connection timed out\n", 30) = 30
write(2, "Error code : 110\n", 17) = 17
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
exit_group(1) = ?
Process 7517 detached
It is not just select(): processes of the same program are stuck in various system calls before I attach strace to them. They suddenly resume once strace is attached. If I don't attach strace, they just hang there forever.
Update 2:
I learned that strace can start a process that was previously stopped (a process in T state). Now I am trying to understand why these processes went into the T state in the first place. Here is the /proc/<pid>/status information:
> cat /proc/12554/status
Name: someone
State: T (stopped)
SleepAVG: 88%
Tgid: 12554
Pid: 12554
PPid: 9754
TracerPid: 0
Uid: 5000 5000 5000 5000
Gid: 48986 48986 48986 48986
FDSize: 256
Groups: 9149 48986
VmPeak: 1992 kB
VmSize: 1964 kB
VmLck: 0 kB
VmHWM: 608 kB
VmRSS: 608 kB
VmData: 156 kB
VmStk: 20 kB
VmExe: 16 kB
VmLib: 1744 kB
VmPTE: 20 kB
Threads: 1
SigQ: 54/73728
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000006
SigCgt: 0000000000004000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
Cpus_allowed: 00000000,00000000,00000000,0000000f
Mems_allowed: 00000000,00000001
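The SigIgn and SigCgt lines in the status output above are hexadecimal bitmasks in which bit n-1 stands for signal number n. Here is a small sketch for decoding them. The function names are made up for illustration, and the symbolic names assume the usual Linux signal numbering.

```python
import signal


def decode_sigmask(mask_hex):
    """Return the signal numbers whose bits are set in a /proc/<pid>/status mask."""
    mask = int(mask_hex, 16)
    return [n for n in range(1, 65) if mask & (1 << (n - 1))]


def signal_names(numbers):
    """Map signal numbers to names, falling back to SIG<n> for unnamed ones."""
    result = []
    for n in numbers:
        try:
            result.append(signal.Signals(n).name)
        except ValueError:
            result.append(f"SIG{n}")
    return result


# Values copied from the status dump above.
for field, value in (("SigIgn", "0000000000000006"), ("SigCgt", "0000000000004000")):
    numbers = decode_sigmask(value)
    print(field, numbers, signal_names(numbers))
```

For this dump, the ignored signals are 2 and 3 (SIGINT, SIGQUIT) and the only caught signal is 15 (SIGTERM). SIGSTOP can be neither caught nor ignored, so it never shows up in these masks.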
strace uses ptrace. The ptrace man page has this:
Since attaching sends SIGSTOP and the tracer usually suppresses it,
this may cause a stray EINTR return from the currently executing system
call in the tracee, as described in the "Signal injection and
suppression" section.
Are you seeing select return EINTR?
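To look at the T state from Update 2 in isolation, here is a minimal sketch, assuming Linux with /proc mounted. It stops a throwaway child process with SIGSTOP, reads the State field from /proc/<pid>/status, and then resumes the child with SIGCONT. The helper names are illustrative. The sketch does not reproduce what strace does on attach; it only shows that a stopped process stays in T until something continues it.

```python
import os
import signal
import subprocess
import sys


def proc_state(pid):
    """Return the one-letter State code from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("State:"):
                return line.split()[1]
    raise RuntimeError("State field not found")


def stop_and_resume():
    """Stop a child, record its state, continue it, and record its state again."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        os.kill(child.pid, signal.SIGSTOP)
        os.waitpid(child.pid, os.WUNTRACED)   # returns once the child has stopped
        stopped = proc_state(child.pid)
        os.kill(child.pid, signal.SIGCONT)
        os.waitpid(child.pid, os.WCONTINUED)  # returns once the child has resumed
        resumed = proc_state(child.pid)
        return stopped, resumed
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    print(stop_and_resume())
```

The first value should be T. The second is whatever state the child is in after resuming, typically S (sleeping) or R (running). Finding out which process sent the stop signal to the real clients is a separate question that this sketch does not answer.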