Fork() in XV6, does the child process execute in kernel or user mode? - operating-system

In XV6, when a fork() is called, does the child execute in kernel mode or user mode?
This is the fork code in XV6:
// Create a new process copying p as the parent.
// Sets up stack to return as if from system call.
// Caller must set state of returned proc to RUNNABLE.
int fork(void){
  int i, pid;
  struct proc *np;
  struct proc *curproc = myproc();

  // Allocate process.
  if((np = allocproc()) == 0){
    return -1;
  }

  // Copy process state from proc.
  if((np->pgdir = copyuvm(curproc->pgdir, curproc->sz)) == 0){
    kfree(np->kstack);
    np->kstack = 0;
    np->state = UNUSED;
    return -1;
  }
  np->sz = curproc->sz;
  np->parent = curproc;
  *np->tf = *curproc->tf;

  // Clear %eax so that fork returns 0 in the child.
  np->tf->eax = 0;

  for(i = 0; i < NOFILE; i++)
    if(curproc->ofile[i])
      np->ofile[i] = filedup(curproc->ofile[i]);
  np->cwd = idup(curproc->cwd);

  safestrcpy(np->name, curproc->name, sizeof(curproc->name));

  pid = np->pid;

  acquire(&ptable.lock);
  np->state = RUNNABLE;
  release(&ptable.lock);

  return pid;
}
I did some research, but even from the code I can't understand how it works. Understanding how it works in UNIX would also help.

The child is almost an exact copy of the parent process except for the value of the eax register and the parent-process bookkeeping, so it will execute in whichever context the parent process is in.
The fork() function here creates a new process structure by calling allocproc(), fills it with the values of the original process, and copies the parent's user address space into a new page table with copyuvm().
Finally, it sets the process state to RUNNABLE, which allows the scheduler to run the new process alongside the parent.
That means the actual running is performed by the scheduler, not by the fork code here.

What Sedat has written is entirely correct. The forked process, i.e. the child, runs in the same context as its parent, either kernel or user.
In addition to that, I suspect what confused you were the calls made by allocproc(), such as kalloc(), and attributes like kstack. These deal with setting up the new process in the system with regard to the page tables and the memory side of things.
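For reference, this is the kernel-stack setup in allocproc() (paraphrased from xv6's proc.c, comments added) that makes the hand-off work: the scheduler first switches to the new process in kernel mode at forkret(), which then "returns" into trapret(), and trapret() restores the trap frame that fork() copied from the parent, dropping the child back into user mode right after the fork() call with eax already cleared to 0.
// Inside allocproc(): lay out the new process's kernel stack so that the
// first context switch lands in forkret(), then falls through to trapret().
sp = p->kstack + KSTACKSIZE;

// Leave room for the trap frame that fork() later fills in with *curproc->tf.
sp -= sizeof *p->tf;
p->tf = (struct trapframe*)sp;

// Fake return address: when forkret() returns, execution continues at trapret,
// which pops the trap frame and switches back to user mode.
sp -= 4;
*(uint*)sp = (uint)trapret;

sp -= sizeof *p->context;
p->context = (struct context*)sp;
memset(p->context, 0, sizeof *p->context);
p->context->eip = (uint)forkret;   // the scheduler's swtch() resumes the child here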

Related

How to monitor Gtk3 Event Loop latency

I would like to monitor the Gtk3 event loop latency, i.e. the time spent in each iteration of the GTK main event loop. Basically, the idea is to run a custom function at each tick of the main event loop.
I tried g_idle_add, but the documentation is not clear about whether the callback will be invoked on every iteration.
Any thoughts?
Probably writing a custom GSource is your best choice.
GSource *
g_source_new (GSourceFuncs *source_funcs,
              guint struct_size);
The size is specified to allow creating structures derived from GSource that contain additional data.
You should also give it the highest priority.
I'm not sure it will be dispatched on every single iteration, but it will be prepared on every iteration. To bring your source to life, obtain the context with g_main_loop_get_context() and call g_source_attach().
All in all it looks like this:
typedef struct
{
  GSource glib;   /* must be the first member so a GSource* can be cast to MySource* */
  int my_data;
} MySource;

gboolean my_prepare (GSource *source,
                     gint    *timeout_)
{
  g_message ("%li", g_get_monotonic_time ());
  *timeout_ = 0;
  ((MySource *) source)->my_data = 1;   /* cast the source first, then access the member */
  return TRUE;
}

GSourceFuncs funcs = {.prepare = my_prepare};
GSource *src = g_source_new (&funcs, sizeof (MySource));
g_source_set_priority (src, G_PRIORITY_HIGH);
g_source_attach (src, g_main_loop_get_context (loop));   /* loop is your GMainLoop* */
This doesn't include any cleanup.
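For the cleanup part, the usual GLib pattern (as far as I know) is to drop your own reference once the source is attached, since the attached context keeps its own, and destroy the source when you no longer want the callback:
g_source_unref (src);      /* the attached GMainContext holds its own reference */
/* ...later, when the monitoring should stop... */
g_source_destroy (src);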

Difference between sleep(1) and while(sleep(1))

I came across the following piece of code while looking for SIGCHLD examples. In the code below, 50 children are created and the parent process waits in the SIGCHLD handler until all 50 children are destroyed.
I get the expected result if I use while(sleep(1)) at the end of main; however, if I replace it with a plain sleep(1), the parent gets destroyed before all child processes terminate.
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int l = 0;

/* SIGCHLD handler. */
static void sigchld_hdl (int sig)
{
  /* Wait for all dead processes.
   * We use a non-blocking call to be sure this signal handler will not
   * block if a child was cleaned up in another part of the program. */
  while (waitpid(-1, NULL, WNOHANG) > 0) {
    printf(" %d", l++);
  }
  printf("\nExiting from child :: %d\n", l);
}

int main (int argc, char *argv[])
{
  struct sigaction act;
  int i;

  memset (&act, 0, sizeof(act));
  act.sa_handler = sigchld_hdl;
  if (sigaction(SIGCHLD, &act, 0)) {
    perror ("sigaction");
    return 1;
  }

  /* Make some children. */
  for (i = 0; i < 50; i++) {
    switch (fork()) {
    case -1:
      perror ("fork");
      return 1;
    case 0:
      return 0;
    }
  }

  /* Wait until we get a sleep() call that is not interrupted by a signal. */
  while (sleep(1)) {
  }
  // sleep(1);

  printf("\nterminating\n");
  return 0;
}
I came across the following piece of code while looking for SIGCHLD examples. In the code below, 50 children are created and the parent process waits in the SIGCHLD handler until all 50 children are destroyed.
No, it does not. waitpid() with WNOHANG returns immediately if no child has exited yet, and there is no guarantee that all the children have exited (or will exit) during the execution of the handler.
Even with a mere sleep(1) there is no guarantee that any child will manage to exit, but in practice most of them will.
Sleeping is a fundamentally wrong approach here. Since you know how many children you created, you should simply wait for all of them to finish. For instance, you can decrement a counter of live children each time you reap one and wait for it to reach 0.
Depending on what the real program looks like, you may not want the handler in the first place: just put the wait loop at the end, without WNOHANG, as sketched below.
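A minimal sketch of that counting approach, with no SIGCHLD handler at all (nchildren is assumed to be the number of successful fork() calls):
#include <errno.h>
#include <sys/wait.h>

/* Reap exactly nchildren children, then return. */
static void reap_all(int nchildren)
{
  while (nchildren > 0) {
    if (wait(NULL) > 0)            /* blocking wait, no WNOHANG */
      nchildren--;
    else if (errno != EINTR)       /* ECHILD or a real error: give up */
      break;
  }
}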
I also have to comment about this:
/* Wait for all dead processes.
* We use a non-blocking call to be sure this signal handler will not
* block if a child was cleaned up in another part of the program. */
You can't mix a signal handler and waiting on your own. You risk snatching the process from the other code waiting for it; what happens then?
It's a design error. fork/exit behaviour has to either be unified OR decentralized.
From the manual page
Return Value
Zero if the requested time has elapsed, or the number of seconds
left to sleep, if the call was interrupted by a signal handler.
So I guess that without the while loop, the sleep is being interrupted by SIGCHLD, hence the process ending quickly.

Page fault with newlib functions

I've been porting newlib to my very small kernel, and I'm stumped: whenever I include a function that references a system call, my program will page fault on execution. If I call a function that does not reference a system call, like rand(), nothing will go wrong.
Note: By include, I mean as long as the function, e.g. printf() or fopen(), is somewhere inside the program, even if it isn't called through main().
I've had this problem for quite some time now, and have no idea what could be causing this:
- I've rebuilt newlib numerous times
- Modified my ELF loader to load the code from the section headers instead of program headers
- Attempted to build newlib/libgloss separately (which failed)
- Linked the libraries (libc, libnosys) through the ld script using GROUP, gcc and ld
I'm not quite sure what other information I should include with this, but I'd be happy to include what I can.
Edit: To verify, the page faults occurring are not at the addresses of the failing functions; they are elsewhere in the program. For example, when I call fopen(), located at 0x08048170, I will page fault at 0xA00A316C.
Edit 2:
Relevant code for loading ELF:
int krun(u8int *name) {
  int fd = kopen(name);
  Elf32_Ehdr *ehdr = kmalloc(sizeof(Elf32_Ehdr*));
  read(fd, ehdr, sizeof(Elf32_Ehdr));

  if (ehdr->e_ident[0] != 0x7F || ehdr->e_ident[1] != 'E' || ehdr->e_ident[2] != 'L' || ehdr->e_ident[3] != 'F') {
    kfree(ehdr);
    return -1;
  }

  int pheaders = ehdr->e_phnum;
  int phoff = ehdr->e_phoff;
  int phsize = ehdr->e_phentsize;
  int sheaders = ehdr->e_shnum;
  int shoff = ehdr->e_shoff;
  int shsize = ehdr->e_shentsize;

  for (int i = 0; i < pheaders; i++) {
    lseek(fd, phoff + phsize * i, SEEK_SET);
    Elf32_Phdr *phdr = kmalloc(sizeof(Elf32_Phdr*));
    read(fd, phdr, sizeof(Elf32_Phdr));

    u32int page = PMMAllocPage();
    int flags = 0;
    if (phdr->p_flags & PF_R) flags |= PAGE_PRESENT;
    if (phdr->p_flags & PF_W) flags |= PAGE_WRITE;

    int pages = (phdr->p_memsz / 0x1000) + 1;
    while (pages >= 0) {
      u32int mapaddr = (phdr->p_vaddr + (pages * 0x1000)) & 0xFFFFF000;
      map(mapaddr, page, flags | PAGE_USER);
      pages--;
    }

    lseek(fd, phdr->p_offset, SEEK_SET);
    read(fd, (void *)phdr->p_vaddr, phdr->p_filesz);
    kfree(phdr);
  }

  // Removed: code block that zeroes .bss: it's already zeroed whenever I check it anyways
  // Removed: code block that creates thread and adds it to scheduler

  kfree(ehdr);
  return 0;
}
Edit 3: I've noticed that if I call a system call, such as write(), and then call printf() two or more times, I will get an unknown opcode interrupt. Odd.
Whoops! Figured it out: when I map the virtual address, I should allocate a new page each time, like so:
map(mapaddr, PMMAllocPage(), flags | PAGE_USER);
Now it works fine.
For those curious as to why it didn't work: when I wasn't including printf(), the size of the program was under 0x1000 bytes, so mapping only one physical page was okay. When I included printf() or fopen(), the program was much bigger, so every virtual page beyond the first was mapped onto that same single physical page, and that's what caused the issue.
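For reference, the mapping loop from the question with that one-line fix applied (everything else, including the question's map() and PMMAllocPage() helpers, unchanged) would look like:
int pages = (phdr->p_memsz / 0x1000) + 1;
while (pages >= 0) {
  u32int mapaddr = (phdr->p_vaddr + (pages * 0x1000)) & 0xFFFFF000;
  map(mapaddr, PMMAllocPage(), flags | PAGE_USER);   // fresh physical page for each virtual page
  pages--;
}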

Is it possible to prevent children inheriting the CPU/core affinity of the parent?

I'm particularly interested in doing this on Linux, for Java programs. There are already a few questions that say you have no control over this from Java, and some RFEs closed by Sun/Oracle.
If you have access to source code and use a low-level language, you can certainly make the relevant system calls. However, sandboxed systems - possibly without source code - present more of a challenge. I would have thought that a tool to set this per process, or a kernel parameter, would be able to control it from outside the parent process. That is really what I'm after.
I understand the reason why this is the default. It looks like some versions of Windows may allow some control of this, but most do not. I was expecting Linux to allow control of it, but it seems that's not an option.
Provided you have sufficient privileges, you could simply call sched_setaffinity() before exec'ing in the child. In other words, instead of
if (fork() == 0)
    execl("prog", "prog", ...);
use
/* simple example using taskset rather than sched_setaffinity directly */
if (fork() == 0)
    execlp("taskset", "taskset", "-c", "0-999999", ...);
[Of course using 999999 is not nice, but it can be replaced by a program which automatically determines the number of CPUs and resets the affinity mask as desired.]
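If you would rather skip taskset, here is a minimal sketch of the direct variant: widen the child's affinity mask to every online CPU right before exec'ing (the "./prog" path and argument list are placeholders):
/* at the top of the file: */
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

if (fork() == 0) {
    cpu_set_t all;
    CPU_ZERO(&all);
    for (int i = 0; i < (int)sysconf(_SC_NPROCESSORS_ONLN); i++)
        CPU_SET(i, &all);
    sched_setaffinity(0, sizeof all, &all);   /* pid 0 = the calling (child) process */
    execl("./prog", "prog", (char *)NULL);    /* placeholder program */
    _exit(127);
}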
What you could also do is change the affinity of the child from the parent, after the fork(). By the way, I'm assuming you're on Linux; some of this, such as retrieving the number of cores with sysconf(), will differ across OSes and Unix flavors. The example here gets the CPU of the parent process and tries to ensure all child processes are scheduled on different cores, round robin.
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

cpu_set_t mycpuset;
int numcpu, mycpu = 0, cpu = 0, i;
pid_t pid;

/* get the number of cpus */
numcpu = sysconf( _SC_NPROCESSORS_ONLN );

/* get our CPU */
CPU_ZERO(&mycpuset);
sched_getaffinity( getpid(), sizeof mycpuset, &mycpuset );
for( i = 0; i < numcpu; i++ )
{
    if( CPU_ISSET( i, &mycpuset ) )
    {
        mycpu = i;
        break;
    }
}

//...

while(1)
{
    // Some other stuff.....

    /* now the fork */
    if((pid = fork()) == 0)
    {
        // do your child stuff
    }
    /* Parent... can schedule child. */
    else
    {
        cpu = (cpu + 1) % numcpu;
        if(cpu == mycpu)
            cpu = (cpu + 1) % numcpu;
        CPU_ZERO(&mycpuset);
        CPU_SET(cpu, &mycpuset);
        /* set processor affinity */
        sched_setaffinity(pid, sizeof mycpuset, &mycpuset);
        // any other parent stuff
    }
}

Simplest way to process a list of items in a multi-threaded manner

I've got a piece of code that opens a data reader and for each record (which contains a url) downloads & processes that page.
What's the simplest way to make it multi-threaded so that, let's say, there are 10 slots which can be used to download and process pages simultaneously, and as slots become available the next rows are read, etc.?
I can't use WebClient.DownloadDataAsync
Here's what I have tried to do, but it hasn't worked (i.e. the "worker" is never run):
using (IDataReader dr = q.ExecuteReader())
{
    ThreadPool.SetMaxThreads(10, 10);
    int workerThreads = 0;
    int completionPortThreads = 0;

    while (dr.Read())
    {
        do
        {
            ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
            if (workerThreads == 0)
            {
                Thread.Sleep(100);
            }
        } while (workerThreads == 0);

        Database.Log l = new Database.Log();
        l.Load(dr);

        ThreadPool.QueueUserWorkItem(delegate(object threadContext)
        {
            Database.Log log = threadContext as Database.Log;
            Scraper scraper = new Scraper();
            dc.Product p = scraper.GetProduct(log, log.Url, true);
            ManualResetEvent done = new ManualResetEvent(false);
            done.Set();
        }, l);
    }
}
You do not normally need to play with the Max threads (I believe it defaults to something like 25 per proc for worker, 1000 for IO). You might consider setting the Min threads to ensure you have a nice number always available.
You don't need to call GetAvailableThreads either. You can just start calling QueueUserWorkItem and let it do all the work. Can you repro your problem by simply calling QueueUserWorkItem?
You could also look into the Task Parallel Library, which has helper methods to make this kind of stuff more manageable and easier.