string matching in bpf programs

I am writing a BPF program in which I need to match a prefix of the filename in the openat syscall.
Since we cannot link against libc, and there is no such built-in function, I wrote one myself.
#define MAX_FILE_NAME_LENGTH 128
#define LOG_DIR "/my/prefix"
#define LEN_LOG_DIR sizeof(LOG_DIR)
int matchPrefix(char str[MAX_FILE_NAME_LENGTH]) {
for (int i = 0; i < LEN_LOG_DIR; i++) {
char ch1 = LOG_DIR[i];
if (ch1 == '\0') {
return 0;
}
char ch2 = str[i];
if (ch2 == '\0') {
return -1;
}
if (ch1 != ch2) {
return -2;
}
}
return (-3);
}
I am getting an invalid mem access 'mem_or_null' error when I try to load this program.
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf:
Validating matchPrefix() func#1...
38: R1=mem_or_null(id=2,off=0,imm=0) R10=fp0
; int matchPrefix(char str[MAX_FILE_NAME_LENGTH]) {
38: (18) r0 = 0xffffffff ; R0_w=P4294967295
; char ch2 = str[i];
40: (71) r2 = *(u8 *)(r1 +0)
R1 invalid mem access 'mem_or_null'
processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
libbpf: -- END LOG --
libbpf: failed to load program 'syscall_enter_open'
R1 is the register for the first argument, which is a char array on the stack. Do I need to pass the length of the array separately?
The function is called this way:
char filename[MAX_FILE_NAME_LENGTH];
bpf_probe_read_user(filename, sizeof(filename), args->filename);
if (matchPrefix(filename) != 0) {
return 0;
}
Even if I change the function signature to accept a char *, I get a different error: R1 invalid mem access 'scalar'.
Can someone help me understand why I am getting this error during function verification?

TL;DR. Making your matchPrefix function a static inline one should work around the verifier issue.
I believe this is happening because the BPF verifier recognizes your function as a global one (vs. inlined) and therefore verifies it independently. That means it won't assume anything for the arguments. Thus, the str argument is recognized as mem_or_null and verification fails because you didn't check that pointer isn't null.
Inlining the function will work around this issue because the verifier won't see a function anymore. It will be able to preserve the inferred type of filename when verifying the code that corresponds to the body of matchPrefix.
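As a rough sketch of that workaround, the prefix check could be written as a static inline helper. The name match_prefix and its simplified return values (0 on match, -1 otherwise) are assumptions for illustration, not the asker's exact code. In a real BPF object one would typically also mark the helper __always_inline; that is left out here so the snippet compiles as ordinary C.

```c
#include <stddef.h>

#define LOG_DIR "/my/prefix"

/* Hypothetical helper: returns 0 if str starts with LOG_DIR, -1 otherwise.
 * static inline keeps it from being emitted as a separately verified
 * global function. */
static inline int match_prefix(const char *str)
{
    const char prefix[] = LOG_DIR;

    /* sizeof(prefix) - 1 skips the trailing '\0', so the loop bound is a
     * compile-time constant. */
    for (size_t i = 0; i < sizeof(prefix) - 1; i++) {
        /* A shorter str hits its '\0' here and fails the comparison,
         * so nothing past its terminator is read. */
        if (str[i] != prefix[i])
            return -1;
    }
    return 0;
}
```

The call site stays the same as in the question: the buffer filled by bpf_probe_read_user is passed directly, and the verifier can keep tracking it as a stack pointer.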

There is an easier solution using strcmp.
You can find an implementation in xdp-project/bpf-next.
The code from there is:
int strcmp(const char *cs, const char *ct)
{
unsigned char c1, c2;
while (1) {
c1 = *cs++;
c2 = *ct++;
if (c1 != c2)
return c1 < c2 ? -1 : 1;
if (!c1)
break;
}
return 0;
}
Do let me know if you still have issues.
NOTE: you cannot use #define to define the string.
Do re-verify this line:
char ch1 = LOG_DIR[i];

Related

eBPF verifier: R1 is not a scalar

I have this eBPF code:
struct sock_info {
struct sockaddr addr;
};
SEC("tracepoint/syscalls/sys_enter_accept4")
int sys_enter_accept4(int fd, struct sockaddr *upeer_sockaddr, int *upeer_addrlen, int flags) {
struct sock_info *iad = bpf_ringbuf_reserve(&connections, sizeof(struct sock_info), 0);
if (!iad) {
bpf_printk("can't reserve ringbuf space");
return 0;
}
// https://man7.org/linux/man-pages/man7/bpf-helpers.7.html
bpf_probe_read(&iad->addr, sizeof(struct sockaddr), upeer_sockaddr);
bpf_ringbuf_submit(iad, 0);
return 0;
}
When I try to load it from user space, the Cilium eBPF library returns this verification error:
permission denied
R1 is not a scalar
; int sys_enter_accept4(int fd, struct sockaddr *upeer_sockaddr, int *upeer_addrlen, int flags) {
0: (bf) r6 = r2
R2 !read_ok
processed 1 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
If I remove the bpf_probe_read function, then the code runs. I tried many alternatives to try to read the contents of the *upeer_sockaddr pointer, but did not succeed.
Any hint why the eBPF verifier is complaining?
This is the output of llvm-objdump command:
llvm-objdump -S --no-show-raw-insn pkg/ebpf/bpf_bpfel.o
pkg/ebpf/bpf_bpfel.o: file format elf64-bpf
Disassembly of section tracepoint/syscalls/sys_enter_accept4:
0000000000000000 <sys_enter_accept4>:
0: r6 = r2
1: r1 = 0 ll
3: r2 = 16
4: r3 = 0
5: call 131
6: r7 = r0
7: if r7 != 0 goto +5 <LBB0_2>
8: r1 = 0 ll
10: r2 = 28
11: call 6
12: goto +7 <LBB0_3>
0000000000000068 <LBB0_2>:
13: r1 = r7
14: r2 = 16
15: r3 = r6
16: call 4
17: r1 = r7
18: r2 = 0
19: call 132
00000000000000a0 <LBB0_3>:
20: r0 = 0
21: exit

You have defined your tracepoint program with 4 arguments int sys_enter_accept4(int fd, struct sockaddr *upeer_sockaddr, int *upeer_addrlen, int flags)
But these are not the parameters with which your program will be invoked.
The R2 !read_ok error is caused because you are accessing a second, non-existing parameter and you are not allowed to read from uninitialized registers.
For tracepoints you can find the layout of the context structure in the event's format file under tracefs:
$ cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_accept4/format
name: sys_enter_accept4
ID: 1595
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:int __syscall_nr; offset:8; size:4; signed:1;
field:int fd; offset:16; size:8; signed:0;
field:struct sockaddr * upeer_sockaddr; offset:24; size:8; signed:0;
field:int * upeer_addrlen; offset:32; size:8; signed:0;
field:int flags; offset:40; size:8; signed:0;
print fmt: "fd: 0x%08lx, upeer_sockaddr: 0x%08lx, upeer_addrlen: 0x%08lx, flags: 0x%08lx", ((unsigned long)(REC->fd)), ((unsigned long)(REC->upeer_sockaddr)), ((unsigned long)(REC->upeer_addrlen)), ((unsigned long)(REC->flags))
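If we turn this into a structure we get the following:
struct accept4_args {
u64 pad; // common_* fields, offsets 0-7
u32 __syscall_nr; // offset 8; the compiler pads up to offset 16
u64 fd; // offset 16, size 8
struct sockaddr *upeer_sockaddr; // offset 24
int *upeer_addrlen; // offset 32
u64 flags; // offset 40, size 8
};
Note that I replaced the common_ fields here with u64 pad, since that is likely not what you are interested in. Each syscall argument is 8 bytes wide in the format listing, so fd and flags are declared as u64 to keep the later fields at the listed offsets.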
A pointer to the struct is passed in as one parameter: int sys_enter_accept4(struct accept4_args *args)
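One way to guard against a mismatch between such a struct and the format listing is a compile-time offset check. The sketch below assumes a 64-bit target (8-byte pointers, as in BPF); the struct name accept4_args_sketch and the field names common and syscall_nr are placeholders for illustration, and plain C integer types stand in for the kernel's u32/u64.

```c
#include <stddef.h>
#include <stdint.h>

struct sockaddr; /* only used through a pointer here */

/* Hypothetical mirror of the sys_enter_accept4 format listing. */
struct accept4_args_sketch {
    uint64_t common;                 /* common_* fields, offsets 0..7 */
    int32_t syscall_nr;              /* offset 8, size 4, then 4 bytes padding */
    uint64_t fd;                     /* offset 16, size 8 */
    struct sockaddr *upeer_sockaddr; /* offset 24, size 8 */
    int *upeer_addrlen;              /* offset 32, size 8 */
    uint64_t flags;                  /* offset 40, size 8 */
};

/* Compilation fails if a field drifts from the offset in the listing. */
_Static_assert(offsetof(struct accept4_args_sketch, syscall_nr) == 8, "syscall_nr");
_Static_assert(offsetof(struct accept4_args_sketch, fd) == 16, "fd");
_Static_assert(offsetof(struct accept4_args_sketch, upeer_sockaddr) == 24, "upeer_sockaddr");
_Static_assert(offsetof(struct accept4_args_sketch, upeer_addrlen) == 32, "upeer_addrlen");
_Static_assert(offsetof(struct accept4_args_sketch, flags) == 40, "flags");
```

Declaring fd as a 4-byte integer directly after syscall_nr would place it at offset 12 and shift every later field by 8 bytes, which these assertions would reject.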

C question in logical OR: 2 operands evaluated (0) false, but the result works as TRUE range

My question is about the basic theory of the logical OR operator. Specifically, logical OR returns true if at least one operand is true.
For instance, in the OR expression (x < 0 || x > 8) with x = 5, when I evaluate the two operands, I find that both of them are false.
But I have an example that does not seem to fit this rule. On the contrary, the expression works as a range between 0 and 8, both included.
The code follows:
#include <stdio.h>
int main(void)
{
int x ; //This is the variable for being evaluated
do
{
printf("Imput a figure between 1 and 8 : ");
scanf("%i", &x);
}
while ( x < 1 || x > 8); // Why this expression write in this way determinate the range???
{
printf("Your imput was ::: %d ",x);
printf("\n");
}
printf("\n");
}
I have modified my first question. I would really appreciate any help to clarify my doubt.
Thank you very much in advance. Otto

It's not a while loop; it's a do ... while loop. The formatting makes it hard to see. Reformatted:
#include <stdio.h>
int main(void) {
int x;
// Execute the code in the `do { }` block once, no matter what.
// Keep executing it again and again, so long as the condition
// in `while ( )` is true.
do {
printf("Imput a figure between 1 and 8 : ");
scanf("%i", &x);
} while (x < 1 || x > 8);
// This creates a new scope. While perfectly valid C,
// this does absolutely nothing in this particular case here.
{
printf("Your imput was ::: %d ",x);
printf("\n");
}
printf("\n");
}
The block with the two printf calls is not part of the loop. The while (x < 1 || x > 8) makes it so that the code in the do { } block runs, so long as x < 1 or x > 8. In other words, it runs until x is between 1 and 8. This has the effect of asking the user to input a number again and again, until they finally input a number that's between 1 and 8.
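The relationship between the loop condition and the accepted range can be checked with a small sketch. The function names in_range and must_retry are made up for illustration; must_retry is the condition from the while ( ) above, and it is the logical negation of "x is between 1 and 8".

```c
#include <stdbool.h>

/* True when x is an accepted answer. */
static bool in_range(int x)
{
    return x >= 1 && x <= 8;
}

/* The do ... while condition: true means "ask again". */
static bool must_retry(int x)
{
    return x < 1 || x > 8;
}
```

For x = 5 both operands of the OR are false, so must_retry is false and the loop stops, which is exactly the case where the input is accepted.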

Is there a String size limit when sending strings back to BPF code and back to userspace?

I am sending this sentence through my BPF code through a BPF Char Array here:
jmommyijsadifjasdijfa, hello, world
And when I print out my output, I only seem to get this output
jmommyij
I seem to be hitting some kind of String size limit. Is there any way to go over this string size limit and print the entire string?
Here is what my BPF code looks like:
#include <uapi/linux/bpf.h>
#define ARRAYSIZE 512
BPF_ARRAY(lookupTable, char**, ARRAYSIZE);
int helloworld2(void *ctx)
{
int k = 0;
//print the values in the lookup table
#pragma clang loop unroll(full)
for (int i = 0; i < sizeof(lookupTable); i++) {
//need to use an intermiate variable to hold the value since the pointer will not increment correctly.
k = i;
char *key = lookupTable.lookup(&k);
// if the key is not null, print the value
if (key != NULL && sizeof(key) > 1) {
bpf_trace_printk("%s\n", key);
}
}
return 0;
}
Here is my py file:
import ctypes
from bcc import BPF
b = BPF(src_file="hello.c")
lookupTable = b["lookupTable"]
#add hello.csv to the lookupTable array
f = open("hello.csv","r")
file_contents = f.read()
#append file contents to the lookupTable array
b_string1 = file_contents.encode('utf-8')
print(b_string1)
lookupTable[ctypes.c_int(0)] = ctypes.create_string_buffer(b_string1, len(b_string1))
#print(file_contents)
f.close()
# This attaches the compiled BPF program to a kernel event of your choosing,
#in this case to the sys_clone syscall which will cause the BPF program to run
#everytime the sys_clone call occurs.
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="helloworld2")
# Capture and print the BPF program's trace output
b.trace_print()

You're creating an array of 512 char** values (each basically a u64). So only the first 8 bytes of your string are stored; the rest is discarded.
What you need is an array of 1 entry holding a 512-byte value:
struct data_t {
char buf[ARRAYSIZE];
};
BPF_ARRAY(lookupTable, struct data_t, 1);
Also see https://github.com/iovisor/bpftrace/issues/1957
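The truncation can be reproduced outside BPF with a small sketch that stores a string into a slot of a given value size. The helper store_value is hypothetical and only mimics "a map value holds value_size bytes"; it is not a BCC API. uint64_t stands in for the 8-byte pointer value type.

```c
#include <stdint.h>
#include <string.h>

#define ARRAYSIZE 512

struct data_t {
    char buf[ARRAYSIZE];
};

/* Hypothetical stand-in for writing a map value: keeps at most value_size
 * bytes of src. out must have room for value_size + 1 bytes, because a
 * terminator is appended so the result can be printed. */
static void store_value(char *out, size_t value_size, const char *src)
{
    size_t n = strlen(src);

    if (n > value_size)
        n = value_size;
    memcpy(out, src, n);
    out[n] = '\0';
}
```

With value_size set to sizeof(uint64_t) only the first 8 characters survive, while sizeof(struct data_t) leaves room for the whole sentence from the question.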

strncpy functions produces wrong file names

I am new in C and writing a code to help my data analysis. Part of it opens predetermined files.
This piece of code is giving me problems and I cannot understand why.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXLOGGERS 26
// Declare the input files
char inputfile[];
char inputfile_hum[MAXLOGGERS][8];
// Declare the output files
char newfile[];
char newfile_hum[MAXLOGGERS][8];
int main()
{
int n = 2;
while (n > MAXLOGGERS)
{
printf("n error, n must be < %d: ", MAXLOGGERS);
scanf("%d", &n);
}
// Initialize the input and output file names
strncpy(inputfile_hum[1], "Ahum.csv", 8);
strncpy(inputfile_hum[2], "Bhum.csv", 8);
strncpy(newfile_hum[1], "Ahum.txt", 8);
strncpy(newfile_hum[2], "Bhum.txt", 8);
for (int i = 1; i < n + 1; i++)
{
strncpy(inputfile, inputfile_hum[i], 8);
FILE* file1 = fopen(inputfile, "r");
// Safety check
while (file1 == NULL)
{
printf("\nError: %s == NULL\n", inputfile);
printf("\nPress enter to exit:");
getchar();
return 0;
}
strncpy(newfile, newfile_hum[i], 8);
FILE* file2 = fopen(newfile, "w");
// Safety check
if (file2 == NULL)
{
printf("Error: file2 == NULL\n");
getchar();
return 0;
}
for (int c = fgetc(file1); c != EOF; c = fgetc(file1))
{
fprintf(file2, "%c", c);
}
fclose(file1);
fclose(file2);
}
// system("Ahum.txt");
// system("Bhum.txt");
}
This code produces two files but instead of the names:
Ahum.txt
Bhum.txt
the files are named:
Ahum.txtv
Bhum.txtv
The reason I am using strncpy in the for loop is because n will actually be inputted by the user later.

I see at least three problems here.
The first problem is that your character array is too small for your strings.
"Ahum.txt", etc. need nine characters: eight for the actual text plus one more for the terminating null character.
The second problem is that you have declared the character arrays "newfile" and "inputfile" with no size. These also need a size large enough to hold the strings (at least 9).
You're lucky not to have had a crash from overwriting memory outside those arrays.
The third and final problem is your use of strncpy().
strncpy(dest, src, n) copies at most n characters from src to dest, but it won't write the terminating null character if n is less than or equal to the length of the src string.
From strncpy() manpage: https://linux.die.net/man/3/strncpy
The strncpy() function ... at most n bytes of src are copied.
Warning: If there is no null byte among the first n bytes of src,
the string placed in dest will not be null-terminated.
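Normally you want "n" to be the size of the destination buffer minus 1, leaving room for the null character, and then you write the terminator yourself.
For example:
strncpy(dest, src, sizeof(dest) - 1); // assuming dest is a char array
dest[sizeof(dest) - 1] = '\0'; // strncpy does not terminate a truncated copy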
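Wrapped into a helper, that pattern might look like the sketch below. The name copy_name is an assumption for illustration; the size is passed explicitly because sizeof on a pointer parameter would not give the buffer size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest and always terminates it.
 * dest_size is the full size of the destination buffer and must be > 0. */
static void copy_name(char *dest, size_t dest_size, const char *src)
{
    strncpy(dest, src, dest_size - 1); /* leave room for the terminator */
    dest[dest_size - 1] = '\0';        /* terminate even when src was truncated */
}
```

With a 9-byte buffer, "Ahum.txt" is copied whole. With a smaller buffer the name is cut short but still terminated, so fopen never reads past the end of the array.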

There are a couple of problems with your code.
inputfile_hum and newfile_hum need to be one char bigger for the trailing '\0' on strings.
char inputfile_hum[MAXLOGGERS][9];
...
char newfile_hum[MAXLOGGERS][9];
strncpy expects the first argument to be a char * region big enough to hold the expected results, so inputfile[] and newfile[] need to be declared with a size:
char inputfile[9];
char newfile[9];

How to use fgets() to safely handle user input more than once

I'm sorry if I duplicate, but I have tried EVERYTHING, and I can't figure out why this code keeps breaking. The highest-priority goal was to make this code handle input safely, or just anything that the user can type into the console, without it breaking. However, I also need it to be able to run more than once. fgets() won't let me do that since it keeps reading '\n' somewhere and preventing me from entering input more than once when it hits the end of the do/while loop. I have tried fflushing stdin, I have tried scanf("%d *[^\n]"); and just regular scanf("%d *[^\n]");, but none of those work, and, in fact, they break the code! I used this website to try to get the "Safely handling input" code to work, but I don't completely understand what they're doing. I tried to jerry-rig (spelling?) it as best I could, but I'm not sure if I did it right. Did I miss something? I didn't think a problem this seemingly simple could be so much of a headache! >_<
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
using namespace std;
#define BUF_LEN 100
#define SPACE 32
#define SPCL_CHAR1F 33
#define SPCL_CHAR1L 47
#define SPCL_CHAR2F 58
#define SPCL_CHAR2L 64
#define SPCL_CHAR3F 91
#define SPCL_CHAR3L 96
#define NUMF 48
#define NUML 57
#define UC_CHARF 65
#define UC_CHARL 90
#define LC_CHARF 97
#define LC_CHARL 122
void main ()
{
char* buffer;
int SpcCounter=0, SpclCounter=0, NumCounter=0,LcCounter=0, UcCounter=0;
char line[BUF_LEN],response[4];
char*input="";
bool repeat=false;
do
{
for(int i=0;i<BUF_LEN;i++)
{
line[i]=NULL;
}
buffer=NULL;
printf("Enter your mess of characters.\n");
buffer=fgets(line,BUF_LEN,stdin);
//To handle going over the buffer limit: BROKEN
if(buffer!=NULL)
{
size_t last=strlen(line)-1;
if(line[last]=='\n')
line[last]='\0';
else
{
fscanf(stdin,"%c *[^\n]");
}
}
for(int i=0;i<BUF_LEN;i++)
{
char temp=buffer[i];
if(temp==SPACE||temp==255)
SpcCounter++;
else if((temp >= SPCL_CHAR1F && temp <= SPCL_CHAR1L)||/*Special characters*/
(temp >= SPCL_CHAR2F && temp <= SPCL_CHAR2L)||
(temp >= SPCL_CHAR3F && temp <= SPCL_CHAR3L))
SpclCounter++;
else if (temp >=NUMF && temp <= NUML)/*Numbers*/
NumCounter++;
else if (temp >= UC_CHARF && temp <= UC_CHARL)/*Uppercase letters*/
UcCounter++;
else if (temp >= LC_CHARF && temp <= LC_CHARL)/*Lowercase letters*/
LcCounter++;
}
printf("There were %i space%s, %i special character%s, %i number%s, and %i letter%s,\n"
"consisting of %i uppercase letter%s and %i lowercase.\n",
SpcCounter,(SpcCounter==1?"":"s"),SpclCounter,(SpclCounter==1?"":"s"), NumCounter,(NumCounter==1?"":"s"),UcCounter+LcCounter,
(UcCounter+LcCounter==1?"":"s"), UcCounter,(UcCounter==1?"":"s"), LcCounter);
printf("Would you like to do this again? (yes/no)");
input=fgets(response,4,stdin);
/*
ALL BROKEN
if(input!=NULL)
{
size_t last=strlen(response)-1;
if(response[last]=='\n')
response[last]='\0';
else
{
fscanf(stdin,"%*[^\n]");
fscanf(stdin,"%c");
}
}
*/
//To capitalize the letters
for(int i=0;i<4;i++)
{
char* temp=&response[i];
if (*temp >= LC_CHARF && *temp <= LC_CHARL)
*temp=toupper(*temp);//Capitalize it
}
//To set repeat: WORKS, BUT WEIRD
repeat=!strncmp(input,"YES",4);
}
while(repeat);
}

For safe, secure user input in C (and in C++ if I'm using C-style strings), I usually revert to an old favorite of mine, the getLine function:
// Use stdio.h and string.h for C.
#include <cstdio>
#include <cstring>
#define OK 0
#define NO_INPUT 1
#define TOO_LONG 2
static int getLine (const char *prmpt, char *buff, size_t sz) {
int ch, extra;
// Output prompt then get line with buffer overrun protection.
if (prmpt != NULL) {
printf ("%s", prmpt);
fflush (stdout);
}
if (fgets (buff, sz, stdin) == NULL)
return NO_INPUT;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
if (buff[strlen(buff)-1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? TOO_LONG : OK;
}
// Otherwise remove newline and give string back to caller.
buff[strlen(buff)-1] = '\0';
return OK;
}
This function:
can output a prompt if desired.
uses fgets in a way that avoids buffer overflow.
detects end-of-file during the input.
detects if the line was too long, by detecting lack of newline at the end.
removes the newline if there.
"eats" characters until the next newline to ensure that they're not left in the input stream for the next call to this function.
It's a fairly solid piece of code that's been tested over many years and is a good solution to the problem of user input.
In terms of how you call it for the purposes in your question, I would add something very similar to what you have, but using the getLine function instead of directly calling fgets and fiddling with the results. First some headers and the same definitions:
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cctype>
using namespace std;
#define BUF_LEN 100
#define SPACE 32
#define SPCL_CHAR1F 33
#define SPCL_CHAR1L 47
#define SPCL_CHAR2F 58
#define SPCL_CHAR2L 64
#define SPCL_CHAR3F 91
#define SPCL_CHAR3L 96
#define NUMF 48
#define NUML 57
#define UC_CHARF 65
#define UC_CHARL 90
#define LC_CHARF 97
#define LC_CHARL 122
Then the first part of main gathering a valid line (using the function) to be evaluated:
int main () {
int SpcCounter, SpclCounter, NumCounter, LcCounter, UcCounter;
char line[BUF_LEN], response[4];
bool repeat = false;
do {
SpcCounter = SpclCounter = NumCounter = LcCounter = UcCounter = 0;
// Get a line until valid.
int stat = getLine ("\nEnter a line: ", line, BUF_LEN);
while (stat != OK) {
// End of file means no more data possible.
if (stat == NO_INPUT) {
cout << "\nEnd of file reached.\n";
return 1;
}
// Only other possibility is "Too much data on line", try again.
stat = getLine ("Input too long.\nEnter a line: ", line, BUF_LEN);
}
Note that I've changed where the counters are set to zero. Your method had them accumulating values every time through the loop rather than resetting them to zero for each input line. This is followed by your own code which assigns each character to a class:
for (int i = 0; i < strlen (line); i++) {
char temp=line[i];
if(temp==SPACE||temp==255)
SpcCounter++;
else if((temp >= SPCL_CHAR1F && temp <= SPCL_CHAR1L)||
(temp >= SPCL_CHAR2F && temp <= SPCL_CHAR2L)||
(temp >= SPCL_CHAR3F && temp <= SPCL_CHAR3L))
SpclCounter++;
else if (temp >=NUMF && temp <= NUML)
NumCounter++;
else if (temp >= UC_CHARF && temp <= UC_CHARL)
UcCounter++;
else if (temp >= LC_CHARF && temp <= LC_CHARL)
LcCounter++;
}
printf("There were %i space%s, %i special character%s, "
"%i number%s, and %i letter%s,\n"
"consisting of %i uppercase letter%s and "
"%i lowercase.\n",
SpcCounter, (SpcCounter==1?"":"s"),
SpclCounter, (SpclCounter==1?"":"s"),
NumCounter, (NumCounter==1?"":"s"),
UcCounter+LcCounter, (UcCounter+LcCounter==1?"":"s"),
UcCounter, (UcCounter==1?"":"s"),
LcCounter);
Finally, a similar approach to the above asks whether the user wants to continue:
// Get a line until valid yes/no, force entry initially.
*line = 'x';
while ((*line != 'y') && (*line != 'n')) {
stat = getLine ("Try another line (yes/no): ", line, BUF_LEN);
// End of file means no more data possible.
if (stat == NO_INPUT) {
cout << "\nEnd of file reached, assuming no.\n";
strcpy (line, "no");
}
// "Too much data on line" means try again.
if (stat == TOO_LONG) {
cout << "Line too long.\n";
*line = 'x';
continue;
}
// Must be okay: first char not 'y' or 'n', try again.
*line = tolower (*line);
if ((*line != 'y') && (*line != 'n'))
cout << "Line doesn't start with y/n.\n";
}
} while (*line == 'y');
}
That way, you build up your program logic based on a solid input routine (which hopefully you'll understand as a separate unit).
You could further improve the code by removing the explicit range checks and using the character classification functions from <cctype>, like isalpha() or isspace(). That would make it more portable (to non-ASCII systems), but I'll leave that as an exercise for later.
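A minimal sketch of that exercise, written in plain C with <ctype.h> (the same functions are available through <cctype> in C++). The struct char_counts and the function classify are names invented here. One behavioral difference from the range checks: isspace() also counts tabs and newlines as spaces.

```c
#include <ctype.h>

/* Hypothetical result type: one counter per character class. */
struct char_counts {
    int spaces;
    int special;
    int digits;
    int upper;
    int lower;
};

/* Count each character of s in exactly one class. */
static struct char_counts classify(const char *s)
{
    struct char_counts c = {0, 0, 0, 0, 0};

    for (; *s != '\0'; s++) {
        /* The ctype functions need a value representable as unsigned char. */
        unsigned char ch = (unsigned char)*s;

        if (isspace(ch))
            c.spaces++;
        else if (ispunct(ch))
            c.special++;
        else if (isdigit(ch))
            c.digits++;
        else if (isupper(ch))
            c.upper++;
        else if (islower(ch))
            c.lower++;
    }
    return c;
}
```

Run on the first line of the sample session, this gives the same counts as the range-based version.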
A sample run of the program is:
Enter a line: Hello, my name is Pax and I am 927 years old!
There were 10 spaces, 2 special characters, 3 numbers, and 30 letters,
consisting of 3 uppercase letters and 27 lowercase.
Try another line (yes/no): yes
Enter a line: Bye for now
There were 2 spaces, 0 special characters, 0 numbers, and 9 letters,
consisting of 1 uppercase letter and 8 lowercase.
Try another line (yes/no): no
