eBPF: BPF_FUNC_map_lookup_elem calling convention

Looking at the kernel's samples/bpf/sock_example.c:
struct bpf_insn prog[] = {
BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
BPF_LD_ABS(BPF_B, ETH_HLEN + offsetof(struct iphdr, protocol) /* R0 = ip->proto */),
BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -4), /* *(u32 *)(fp - 4) = r0 */
BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4), /* r2 = fp - 4 */
BPF_LD_MAP_FD(BPF_REG_1, map_fd),
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
BPF_MOV64_IMM(BPF_REG_1, 1), /* r1 = 1 */
BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_0, BPF_REG_1, 0),
BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = 0 */
BPF_EXIT_INSN(),
};
I understand that eBPF uses registers r1-r5 to hold the arguments to BPF helpers. What I don't understand is why a map fd is passed to BPF_FUNC_map_lookup_elem. According to the helpers code:
const struct bpf_func_proto bpf_map_lookup_elem_proto = {
.func = bpf_map_lookup_elem,
.gpl_only = false,
.pkt_access = true,
.ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL,
.arg1_type = ARG_CONST_MAP_PTR,
.arg2_type = ARG_PTR_TO_MAP_KEY,
};
which means both arguments are pointers, and neither is the map fd. Or am I looking at the wrong code?

You use file descriptors when writing your program in user space, but the verifier later replaces them with pointers to the map.
File Descriptors When Writing Your eBPF Program
You write your eBPF program in user space, where you don't have any address pointer to the map. So you use a file descriptor for referencing that map for the various operations (lookups, updates, deletes) that your program may run.
If writing your program in C, instead of assembly instructions like you do, this is usually abstracted: The program references the map with a C pointer, but the loader (typically relying on libbpf) performs some relocation step to extract metadata about the map from a dedicated ELF section of the object file, retrieves the file descriptor to the map, and inserts it in the relevant bytecode instructions.
Kernel Verifier Switches to Pointers
But you are correct: in the kernel, the BPF_FUNC_map_lookup_elem() helper and the like use pointers to the maps, not file descriptors. It is at load time, during verification of the program, that the verifier replaces the file descriptors with pointers to the memory areas associated with the maps (see resolve_pseudo_ldimm64() in kernel/bpf/verifier.c). Getting a pointer is possible at that point because the verifier runs in the kernel and has access to the kernel-memory addresses of those maps.
Note that the verifier actually goes even further and, for some map types (hash, arrays), it even replaces the calls to the helpers for map lookups completely, using instead instructions to directly read from the relevant addresses in the map (search for map_gen_lookup for details).
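To make the fd-to-pointer rewrite more concrete, here is a minimal user-space sketch in C. It mirrors the layout of the kernel's struct bpf_insn and the encoding that BPF_LD_MAP_FD produces (a two-slot 64-bit immediate load whose src_reg is set to BPF_PSEUDO_MAP_FD). The names emit_ld_map_fd, resolve_map_fd and ld_imm64_value are hypothetical, and resolve_map_fd is only a stand-in for what the verifier does: the real code first looks up the struct bpf_map behind the fd, and clearing src_reg here is a simplification.

```c
#include <stdint.h>

/* Same field layout as the kernel's struct bpf_insn. */
struct insn {
    uint8_t code;        /* opcode */
    uint8_t dst_reg : 4; /* destination register */
    uint8_t src_reg : 4; /* source register, or a "pseudo" marker */
    int16_t off;         /* signed offset */
    int32_t imm;         /* signed immediate */
};

#define OP_LD_IMM64   0x18 /* BPF_LD | BPF_DW | BPF_IMM */
#define PSEUDO_MAP_FD 1    /* value of BPF_PSEUDO_MAP_FD */

/* Roughly what BPF_LD_MAP_FD(reg, fd) expands to: one 64-bit load spread
 * over two instruction slots, with src_reg tagging imm as a map fd. */
static void emit_ld_map_fd(struct insn out[2], uint8_t reg, int fd)
{
    out[0] = (struct insn){ .code = OP_LD_IMM64, .dst_reg = reg,
                            .src_reg = PSEUDO_MAP_FD, .imm = fd };
    out[1] = (struct insn){ 0 }; /* second slot carries the upper 32 bits */
}

/* Hypothetical stand-in for the verifier's rewrite: replace the fd with
 * the 64-bit address of the map, split across the two imm fields. */
static int resolve_map_fd(struct insn in[2], uint64_t map_addr)
{
    if (in[0].code != OP_LD_IMM64 || in[0].src_reg != PSEUDO_MAP_FD)
        return -1; /* not a map-fd load */
    in[0].imm = (int32_t)(uint32_t)map_addr;         /* lower 32 bits */
    in[1].imm = (int32_t)(uint32_t)(map_addr >> 32); /* upper 32 bits */
    in[0].src_reg = 0; /* simplification: now a plain 64-bit constant */
    return 0;
}

/* Reassemble the 64-bit immediate that the load places in dst_reg. */
static uint64_t ld_imm64_value(const struct insn in[2])
{
    return ((uint64_t)(uint32_t)in[1].imm << 32) | (uint32_t)in[0].imm;
}
```

After the rewrite, r1 holds a map pointer when the helper is called, which is what ARG_CONST_MAP_PTR in the helper prototype describes.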

How to get the name, or the ID, of the System Call that was detected with BCC/eBPF

I have this code, which is largely inspired by the syscount.py tool present in BCC:
BPF_HASH(data, u32, u64);
TRACEPOINT_PROBE(raw_syscalls, sys_exit) {
    u64 pid_tgid = bpf_get_current_pid_tgid();
    u32 key = pid_tgid >> 32;   /* process ID (tgid) */
    u32 tid = (u32)pid_tgid;    /* thread ID (unused here) */
    u64 *val, zero = 0;
    val = data.lookup_or_try_init(&key, &zero);
    if (val) {
        lock_xadd(val, 1);
    }
    return 0;
}
This counts the system calls performed by each process.
How can I get the name, or the ID of the system call being handled, so that I can create an array of the system calls used?
As far as I know, Linux itself does not provide a lookup table from syscall numbers to names, but a number of people have published translation tables, such as:
https://filippo.io/linux-syscall-table/
https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md
The hassle is that the numbers can vary quite a bit between architectures. The most complete lists that I know of are in the Go source code, for example https://cs.opensource.google/go/go/+/refs/tags/go1.20.1:src/syscall/zsysnum_linux_arm.go, which defines most of them for most architectures.
You can use these resources to construct your own lookup table.
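As an illustration of such a hand-built table, here is a minimal sketch in C. It covers only a handful of x86_64 syscall numbers, and the names syscall_names_x86_64 and syscall_name are hypothetical; a real table would be generated from one of the lists above for the target architecture. On the BPF side, syscount.py appears to read the number from the tracepoint's id field (args->id), so keying the map on that value rather than on the PID should give per-syscall counts.

```c
#include <stddef.h>

/* A few x86_64 syscall numbers only; other architectures use different
 * numbers, and unlisted entries in this sketch are left as NULL. */
static const char *const syscall_names_x86_64[] = {
    [0]  = "read",
    [1]  = "write",
    [2]  = "open",
    [3]  = "close",
    [59] = "execve",
    [60] = "exit",
};

/* Map a syscall number to its name, or "unknown" if it is out of range
 * or simply missing from this partial table. */
static const char *syscall_name(long id)
{
    size_t n = sizeof(syscall_names_x86_64) / sizeof(syscall_names_x86_64[0]);

    if (id < 0 || (size_t)id >= n || syscall_names_x86_64[id] == NULL)
        return "unknown";
    return syscall_names_x86_64[id];
}
```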

Dynamic generation of signal spies in testbench

I have a .txt file that contains certain signals that I want to monitor in my testbench during the application of some stimulus.
I am creating an initial block in which I read the file and then try to generate an init_signal_spy() call for each of the lines that I have read.
The code that I have written up until this point has the following format:
module testbench();
logic probes[];
initial begin : read_signals_to_dump_at
automatic int fd;
automatic string fname,line,line_stripped;
if ($value$plusargs("sigfile=%s",fname)) begin : read
fd = $fopen(fname,"r");
while($fgets(line,fd)) begin
//static logic net_to_be_probed;
automatic sig_and_spy entry = new();
// Trim away the '\n' from the line.
line_stripped = line.substr(0,line.len()-2);
// Resize the array
probes = new [probes.size() + 1] (probes);
// Link the extracted new line with the probe list
// - this raises an error "An invalid empty string was passed in as the Destination object."
// - expected since the last element is empty...
$init_signal_spy(line_stripped, probes[probes.size()-1] , 1);
end
end
end : read_signals_to_dump_at
endmodule
In the code above, I understand why the error
An invalid empty string was passed in as the Destination object.
is raised just before the spy is generated. Although the array has been resized, it does not hold any element, i.e., it's empty. So I tried creating a local logic variable that I then assign to the signal spy within the loop, in the following manner:
module testbench();
logic probes[];
initial begin : read_signals_to_dump_at
automatic int fd;
automatic string fname,line,line_stripped;
if ($value$plusargs("sigfile=%s",fname)) begin : read
fd = $fopen(fname,"r");
while($fgets(line,fd)) begin
logic new_probe;
// Trim away the '\n' from the line.
line_stripped = line.substr(0,line.len()-2);
// Resize the array and copy old values.
probes = new [probes.size() + 1] (probes);
// Add the new probe to the Testbenchs' probes array
probes[probes.size()-1] = new_probe;
// Again, An invalid empty string was passed in as the Destination object.
$init_signal_spy(line_stripped, probes[probes.size()-1] , 1);
end
end
end : read_signals_to_dump_at
endmodule
But again, I see the same error at runtime during the simulation. So, is there a way to achieve such "dynamic" signal monitoring in the testbench? As far as I understand, the error means that the destination object is NOT a signal of the testbench, so the logic new_probe has no effect. That is to be expected, but is there a way to achieve the desired behavior in the testbench with SystemVerilog?
You have at least two problems.
Both the source and destination arguments to init_signal_spy() need to be strings. Your destination argument is an integral variable with a 0 value, and that gets interpreted as a null string. init_signal_spy() was designed for mixed language simulation, and using strings was the only way to achieve that.
Your destination variable should be a queue, not a dynamic array. Every time you resize a dynamic array, the previous elements get relocated, and that breaks the connections previously made by signal spy.
This example shows the proper syntax for setting this up:
module top;
int A[$];
int s1,s2;
initial begin
A.push_back(0);
$init_signal_spy("s1","A[0]");
A.push_back(0);
$init_signal_spy("s2","A[1]");
#1 s1 = 1;
#1 s2 = 2;
#1 $display("%p",A);
end
endmodule
A far better solution for performance is converting your .txt file into actual SystemVerilog code that can be compiled into your testbench.
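One way to do that conversion is a small generator run before compilation. The sketch below is in C and rests on several assumptions: the .txt file holds one hierarchical signal path per line, each signal is a single bit (as in the original logic probes[]), and the generated file is pulled into the testbench with `include. The names format_probe, generate_probes and the probe_N wires are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Format one generated SystemVerilog line for signal number idx.
 * Returns what snprintf returns (the length of the full statement). */
static int format_probe(char *buf, size_t size, int idx, const char *path)
{
    return snprintf(buf, size, "wire probe_%d = %s;", idx, path);
}

/* Read one hierarchical path per line from `in` and write one wire
 * declaration per signal to `out`. Returns the number of probes written. */
static int generate_probes(FILE *in, FILE *out)
{
    char line[256];
    char stmt[320];
    int count = 0;

    while (fgets(line, sizeof line, in)) {
        line[strcspn(line, "\r\n")] = '\0'; /* strip the line ending */
        if (line[0] == '\0')
            continue;                       /* skip blank lines */
        format_probe(stmt, sizeof stmt, count++, line);
        fprintf(out, "%s\n", stmt);
    }
    return count;
}
```

Because the generated lines are ordinary continuous assignments with hierarchical references, the simulator resolves them at elaboration time and no string lookup happens during the run.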

Seed for hash-table non cryptographic hash functions

If one sets the hash table seed to a random number during table creation or resize, will that prevent DDoS attacks on the hash table, or can an attacker who knows the hash algorithm still easily get around the seed? What if the algorithm uses the Pearson hash function with randomly generated tables unknown to the attacker? Does such a table hash still need a seed, or is it safe enough?
Context: I want to use an on-disk hash table for a key-value database for my toy web server, where the keys may depend on the user input.
There exist several approaches to protecting your hash subsystem from an "adverse selection" attack. The most popular is universal hashing, where the hash function, or one of its properties, is selected randomly at initialization.
In my own approach, I use a single hash function in which each character is added to the result with non-linear mixing that depends on a random uint32_t[256] array. The array is created during system initialization; in my code this happens at each start, by reading /dev/urandom. See my implementation in the open-source emerSSL program. You're welcome to borrow the entire hash-table implementation, or only the hash function.
Currently, the hash function in the referenced source computes two independent hashes for a double-hashing search algorithm.
Here is a "reduced" form of the hash function from that source, to demonstrate the idea of non-linear mixing with an S-block array:
#include <stdint.h>
uint32_t S_block[0x100]; // Substitution block, filled with random contents
#define NLF(h, c) (S_block[(unsigned char)((c) + (h))] ^ (c))
#define ROL(x, n) (((x) << (n)) | ((x) >> (32 - (n))))
uint32_t hash(const char *key) {
    uint32_t h = 0x1F351F35; // Barker code * 2
    char c;
    for (int i = 0; (c = key[i]) != 0; i++) {
        h = ROL(h, 5);       // rotate the running hash
        h += NLF(h, c);      // mix in the character through the S-block
    }
    return h;
}
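The seeding step described above (filling the table from /dev/urandom at start-up) might look like the following sketch. It is not taken from emerSSL: fill_sblock and hash_with are hypothetical names, the table is passed as a parameter so the effect of the seed is easy to see, and characters are treated as unsigned, which can differ from the reduced function above for bytes of 0x80 and higher.

```c
#include <stdint.h>
#include <stdio.h>

#define SBLOCK_SIZE 0x100
#define ROL32(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

/* Fill the substitution table from the OS random source.
 * Returns 0 on success, -1 if /dev/urandom could not be read in full. */
static int fill_sblock(uint32_t sblock[SBLOCK_SIZE])
{
    FILE *f = fopen("/dev/urandom", "rb");
    size_t got;

    if (f == NULL)
        return -1;
    got = fread(sblock, sizeof sblock[0], SBLOCK_SIZE, f);
    fclose(f);
    return got == SBLOCK_SIZE ? 0 : -1;
}

/* Same mixing idea as the reduced function, with the table passed in:
 * rotate, then add a table entry selected by (character + hash). */
static uint32_t hash_with(const uint32_t sblock[SBLOCK_SIZE], const char *key)
{
    uint32_t h = 0x1F351F35;
    const unsigned char *p;

    for (p = (const unsigned char *)key; *p != 0; p++) {
        h = ROL32(h, 5);
        h += sblock[(uint8_t)(*p + h)] ^ *p;
    }
    return h;
}
```

Two processes that fill their tables independently will generally hash the same key to different values, so an on-disk table would need to store its S-block alongside the data rather than regenerate it at each start.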

System Verilog to Specman E

What is the equivalent syntax in Specman E for $readmemh(file,array) and similar system tasks and functions in SystemVerilog?
I am converting existing SystemVerilog code into Specman E. I have converted and implemented most of the concepts, except for a few system methods like the one below. Please help me implement methods like this in Specman E.
$readmemh(file_s,data_2d_i);//For converting SV code into Specman E
In the vr_ad Package there is an equivalent method. Assuming you have a vr_ad_mem object called data_2d_i, you can e.g. call
data_2d_i.readmemh(file_s,0,1000,0,1000);
This reads addresses 0..1000 from that file into the memory.
Example:
import vr_ad/e/vr_ad_top;
extend sys {
mem: vr_ad_mem;
keep mem.addressing_width_in_bytes == 1;
keep mem.size == 1000;
run() is also {
var data_2d_l: list of byte;
-- read first 16 bytes of mem-file and store the result in a list
mem.readmemh("mem.txt", 0, 15, 0, 15);
data_2d_l = mem.fetch(0, 16);
print data_2d_l;
};
};

Using memcpy/memset

When using memset or memcpy within an Obj-C program, will the compiler optimise the setting (memset) or copying (memcpy) of data into 32-bit writes or will it do it byte by byte?
You can see the libc implementations of these methods in the Darwin source. In 10.6.3, memset works at the word level. I didn't check memcpy, but probably it's the same.
You are correct that it's possible for the compiler to do the work inline instead of calling these functions. I suppose I'll let someone who knows better answer what it will do, though I would not expect a problem.
memset comes as part of your standard C library, so it depends on the implementation you are using. I would guess most implementations write in blocks of the native CPU word size (32/64 bits) and then handle the remainder byte by byte.
Here is glibc's version of memcpy for an example implementation:
void *
memcpy (dstpp, srcpp, len)
void *dstpp;
const void *srcpp;
size_t len;
{
unsigned long int dstp = (long int) dstpp;
unsigned long int srcp = (long int) srcpp;
/* Copy from the beginning to the end. */
/* If there not too few bytes to copy, use word copy. */
if (len >= OP_T_THRES)
{
/* Copy just a few bytes to make DSTP aligned. */
len -= (-dstp) % OPSIZ;
BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ);
/* Copy whole pages from SRCP to DSTP by virtual address manipulation,
as much as possible. */
PAGE_COPY_FWD_MAYBE (dstp, srcp, len, len);
/* Copy from SRCP to DSTP taking advantage of the known alignment of
DSTP. Number of bytes remaining is put in the third argument,
i.e. in LEN. This number may vary from machine to machine. */
WORD_COPY_FWD (dstp, srcp, len, len);
/* Fall out and copy the tail. */
}
/* There are just a few bytes to copy. Use byte memory operations. */
BYTE_COPY_FWD (dstp, srcp, len);
return dstpp;
}
So you can see that it copies a few bytes first to get the destination aligned, then copies in words, and finally copies the remaining bytes. Where the platform supports it, it can also copy whole pages by virtual address manipulation (PAGE_COPY_FWD_MAYBE).
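The align, word-copy, tail pattern can be written out without the glibc macros. The following is a simplified sketch, not the glibc or Darwin implementation: sketch_memcpy and word_t are hypothetical names, the may_alias attribute is a GCC/Clang extension, and the word loop is skipped when the source is not word aligned (glibc instead handles that case with shifts).

```c
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: a word type that may alias any object type. */
typedef unsigned long __attribute__((__may_alias__)) word_t;

static void *sketch_memcpy(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* 1. Copy single bytes until the destination is word aligned. */
    while (len > 0 && (uintptr_t)d % sizeof(word_t) != 0) {
        *d++ = *s++;
        len--;
    }

    /* 2. Copy whole words, but only if the source is aligned as well. */
    if ((uintptr_t)s % sizeof(word_t) == 0) {
        while (len >= sizeof(word_t)) {
            *(word_t *)d = *(const word_t *)s;
            d += sizeof(word_t);
            s += sizeof(word_t);
            len -= sizeof(word_t);
        }
    }

    /* 3. Copy the tail byte by byte (or everything left, if step 2
     *    was skipped because the source was misaligned). */
    while (len > 0) {
        *d++ = *s++;
        len--;
    }
    return dst;
}
```

In practice the compiler may also replace small, fixed-size memcpy or memset calls with inline stores, so the library routine matters mostly for larger or variable-length copies.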