I am trying to use inotify_add_watch on a process.
My intent is to get a notification when the process is killed.
My notification code is:
wd = inotify_add_watch(ifd, "/proc", IN_ALL_EVENTS);
But it does not notify me even when the process is killed and its directory is removed from the
/proc folder.
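For context, here is a minimal complete version of what the snippet above implies; the surrounding setup (inotify_init, the read loop) is assumed, not taken from the asker's actual program:

/* Assumed context for the snippet above: watch /proc and wait for events.
 * On a regular directory this works; on /proc the delete events never
 * arrive, for the reasons given in the answer below. */
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    int ifd = inotify_init();
    if (ifd < 0) {
        perror("inotify_init");
        return 1;
    }

    int wd = inotify_add_watch(ifd, "/proc", IN_ALL_EVENTS);
    if (wd < 0) {
        perror("inotify_add_watch");
        return 1;
    }

    /* Blocks waiting for events; /proc never delivers the IN_DELETE
     * we are hoping for when a process dies. */
    ssize_t n = read(ifd, buf, sizeof buf);
    printf("read %zd bytes of event data\n", n);
    return 0;
}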
In many Linux distributions, /proc is mounted as procfs.
Inotify does report some but not all events in sysfs and procfs.
References:
http://en.wikipedia.org/wiki/Inotify#Limitations
http://en.wikipedia.org/wiki/Procfs
http://inotify.aiken.cz/?section=inotify&page=faq (search procfs)
Select function on procfs file
Inotify does not support pseudo-files like those in sysfs and procfs.
The proc and sys file systems are sometimes referred to as process information pseudo-file systems. They do not contain "real" files but rather runtime system information (e.g. system memory, devices mounted, hardware configuration, etc.).
Inotify reports only events that a user-space program triggers
through the filesystem API. As a result, it does not catch remote
events that occur on network filesystems. (Applications must fall
back to polling the filesystem to catch such events.) Furthermore,
various pseudo-filesystems such as /proc, /sys, and /dev/pts are not
monitorable with inotify.
http://man7.org/linux/man-pages/man7/inotify.7.html
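If the underlying goal is only to learn when a specific process dies, a more reliable route than watching /proc is a pidfd. Below is a minimal sketch, assuming Linux 5.3 or later (where pidfd_open(2) exists) and headers that define SYS_pidfd_open; taking the PID from the command line is purely for illustration:

/* Hedged sketch: wait for an arbitrary process to die without inotify.
 * Assumes Linux >= 5.3; syscall(2) is used directly so no glibc wrapper
 * for pidfd_open is required. */
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }

    /* Obtain a file descriptor that refers to the target process. */
    int pidfd = (int)syscall(SYS_pidfd_open, (pid_t)atoi(argv[1]), 0);
    if (pidfd < 0) {
        perror("pidfd_open");
        return 1;
    }

    /* The pidfd becomes readable when the process terminates. */
    struct pollfd pfd = { .fd = pidfd, .events = POLLIN };
    if (poll(&pfd, 1, -1) < 0) {
        perror("poll");
        return 1;
    }
    if (pfd.revents & POLLIN)
        printf("process %s exited\n", argv[1]);
    close(pidfd);
    return 0;
}

poll() reports the pidfd as readable as soon as the target process terminates, with no filesystem polling involved.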
Related
I am trying to get watchman running in order to monitor an NFS-mounted folder.
I was able to get everything running within the local file system.
Now, I have changed the config to monitor a network folder from my NAS. It is locally mounted.
The Watchman server is running on the Linux client, and all watchman commands are run on the Linux client:
watchman watch
watchman -- trigger /home/karsten/CloudStation/karsten/CloudStation/Karsten_NAS/fotos/zerene 'photostack' -- /home/karsten/bin/invoke_rawtherapee.sh
The folder is located on the NAS, according to mtab:
192.168.xxx.xxx:/volume1/homes /home/karsten/CloudStation nfs rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.xxx.xxx,mountvers=3,mountport=892,mountproto=udp,local_lock=none,addr=192.168.xxx.xxx 0 0
If I move files into the folder on the local machine they get recognized and watchman triggers the actions.
BUT if I move files into the same folder from a remote client connected to the same NAS share, nothing happens.
Any idea what I need to change to make watchman recognize the files dropped from another client into that folder?
Many thanks in advance
Karsten
Unfortunately, it is not possible.
From https://facebook.github.io/watchman/docs/install.html#system-requirements:
Watchman relies on the operating system facilities for file notification, which means that you will likely have very poor results using it on any kind of remote or distributed filesystem.
NFS doesn't tie into the inotify layer in any kernel (the protocol simply doesn't support this sort of change notification), so you will only be able to be notified of changes that are made to the mounted volume by the client (because those are looped back through inotify locally), not for changes made anywhere else.
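The inotify man page quoted earlier mentions falling back to polling for exactly this situation. Here is a minimal sketch of that fallback; the path and interval are placeholders, and a real watcher would diff full directory listings rather than just the mtime:

/* Hedged sketch of the polling fallback: since NFS never generates inotify
 * events for remote changes, the only portable option is to re-scan.
 * This just watches the directory's mtime. */
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *dir = "/home/karsten/CloudStation"; /* placeholder path */
    struct stat st;
    time_t last = 0;

    for (;;) {
        if (stat(dir, &st) == 0 && st.st_mtime != last) {
            if (last != 0)
                printf("directory changed\n"); /* e.g. run the trigger script */
            last = st.st_mtime;
        }
        sleep(5); /* poll interval: trades latency against NFS traffic */
    }
    return 0;
}

Note that NFS attribute caching can delay even this: the directory's mtime only updates once the client revalidates its cached attributes.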
I can't seem to find an answer to this question. I'd just like to understand how a single OS can implement and support multiple file systems.
Assume that there's a global name-space where all file and directory names have some sort of prefix to determine which file system the file or directory is from. For some operating systems (DOS) the prefix might be a device letter (e.g. the C:\ at the start of C:\foo\bar.txt). For other operating systems it might look like a normal part of the file's path (e.g. the /home at the start of /home/foo/bar.txt might tell the OS that the file is in the file system mounted at /home).
Once the OS has figured out which file system contains the file, it can ask that file system about the file using the remaining part of the file's "global name" (e.g. for the file /home/foo/bar.txt it'd ask the file system mounted at /home for the file /foo/bar.txt).
To allow this to work there will be a layer built into the OS to register file systems and figure out which file system to ask about which file or directory (likely in addition to providing other features - e.g. caching directory info and file data). Often (but not always) this is called "the Virtual File System" (or VFS).
During boot, and when a new storage device is plugged in, there will be "something" to figure out which type of file system to use and how it will be added to the global name space. This can include auto-detection (e.g. from partition table entries on the storage device), a set of rules for removable media, and/or a configuration file (/etc/fstab).
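To make the "register file systems and dispatch by prefix" idea concrete, here is a toy sketch in C. Every name in it is invented for illustration; real kernels use far richer structures (e.g. Linux's struct file_operations and mount tables):

/* Toy "VFS": map a path prefix to a table of function pointers and hand
 * the remainder of the global name to the chosen filesystem. */
#include <stdio.h>
#include <string.h>

struct fs_ops {
    const char *name;
    int (*open)(const char *relpath); /* per-filesystem implementation */
};

static int ext4_open(const char *p) { printf("ext4 opens %s\n", p); return 0; }
static int nfs_open(const char *p)  { printf("nfs opens %s\n", p);  return 0; }

/* Mount table: the longest matching prefix decides which FS is asked. */
static struct { const char *prefix; struct fs_ops ops; } mounts[] = {
    { "/home", { "nfs",  nfs_open  } },
    { "/",     { "ext4", ext4_open } },
};

int vfs_open(const char *path)
{
    size_t best = 0;
    int idx = -1;
    for (size_t i = 0; i < sizeof mounts / sizeof mounts[0]; i++) {
        size_t len = strlen(mounts[i].prefix);
        if (strncmp(path, mounts[i].prefix, len) == 0 && len >= best) {
            best = len;
            idx = (int)i;
        }
    }
    /* Pass the remainder of the "global name" to the chosen filesystem
     * (the root mount "/" receives the full path). */
    return mounts[idx].ops.open(path + (best > 1 ? best : 0));
}

int main(void)
{
    vfs_open("/home/foo/bar.txt"); /* routed to nfs, asked for /foo/bar.txt */
    vfs_open("/etc/fstab");        /* routed to ext4 */
    return 0;
}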
The basic function of a file system is to provide the mapping to translate virtual blocks into logical blocks (or in ye olde days, physical blocks). For a file system, the operating system has to implement a translation system that will convert virtual block N of a file into logical block Q on the disk.
There is nothing that prevents an operating system from having multiple subsystems for performing that translation in different ways corresponding to multiple file systems.
Most operating systems have some kind of MOUNT command that tells the operating system to connect to a disk and determine what kind of file system it has. It is during the mount process that the operating system selects the appropriate virtual to logical translation software to use.
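As a toy illustration of that virtual-to-logical translation (the numbers are purely hypothetical; real filesystems use extents or indirect blocks rather than a flat array):

/* Per-file map: virtual block i of the file lives at logical block map[i]. */
#include <stdio.h>

static const unsigned long block_map[] = { 9001, 9002, 4096, 4097 };

long virt_to_logical(unsigned long vblock)
{
    if (vblock >= sizeof block_map / sizeof block_map[0])
        return -1; /* past end of file */
    return (long)block_map[vblock];
}

int main(void)
{
    /* Virtual block 2 of this file is stored at logical block 4096. */
    printf("vblock 2 -> lblock %ld\n", virt_to_logical(2));
    return 0;
}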
Operating systems have supported multiple file systems from the beginning. In ye olde days, there were 9-track tapes with their own file systems in addition to disks. The operating system had to support those as well.
I am wondering about the principle behind the Codepad.org website (i.e., the principle of an online C compiler).
I think it follows these steps:
1. The user submits the C code.
2. The website sends it to GCC installed on the server.
3. GCC compiles the code.
4. GCC returns the output strings to the web server.
5. The web server returns the strings to the user.
Are those steps right?
Then, how does it protect against malicious code, such as code that deletes all files on the server?
From http://codepad.org/about:
Code execution is handled by a supervisor based on geordi. The strategy is to run everything under ptrace, with many system calls disallowed or ignored. Compilers and final executables are both executed in a chroot jail, with strict resource limits. The supervisor is written in Haskell.
Also:
Paranoia
When your app is remote code execution, you have to expect security problems. Rather than rely on just the chroot and ptrace supervisor, I've taken some additional precautions:
The supervisor processes run on virtual machines, which are firewalled such that they are incapable of making outgoing connections.
The machines that run the virtual machines are also heavily firewalled, and restored from their source images periodically.
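To make "chroot jail, with strict resource limits" concrete, here is a heavily simplified sketch. It is not codepad's actual supervisor (which, per the quote, also uses ptrace and firewalled VMs); the paths are placeholders, and it must run as root for chroot(2) to succeed:

/* Hedged sketch: run an untrusted binary with CPU and memory limits inside
 * a chroot. Paths are placeholders; error handling is trimmed. */
#include <stdio.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: confine the filesystem view, then cap resources. */
        if (chroot("/var/jail") != 0 || chdir("/") != 0) {
            perror("chroot");
            _exit(1);
        }
        struct rlimit cpu = { .rlim_cur = 2, .rlim_max = 2 };  /* 2s of CPU */
        struct rlimit mem = { .rlim_cur = 64 << 20, .rlim_max = 64 << 20 };
        setrlimit(RLIMIT_CPU, &cpu);
        setrlimit(RLIMIT_AS, &mem);
        execl("/untrusted", "untrusted", (char *)NULL); /* placeholder binary */
        _exit(1);
    }
    int status;
    waitpid(pid, &status, 0);
    printf("child exited with status %d\n", WEXITSTATUS(status));
    return 0;
}

A real sandbox would also drop root privileges after the chroot and filter system calls, which is what the quoted description says codepad does with ptrace.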
I am learning to write character device drivers from the Kernel Module Programming Guide, and used mknod to create a node in /dev to talk to my driver.
However, I cannot find any obvious way to remove it, after checking the manpage and observing that rmnod is a non-existent command.
What is the correct way to reverse the effect of mknod, and safely remove the node created in /dev?
The correct command is just rm :)
A device node created by mknod is just a file that contains a device major and minor number. When you access that file the first time, Linux looks for a driver that advertises that major/minor and loads it. Your driver then handles all I/O with that file.
When you delete a device node, the usual Un*x file behavior applies: Linux will wait until there are no more references to the file, and then it will be deleted from disk.
Your driver doesn't really notice any of this. Linux does not automatically unload modules. Your driver will simply no longer receive requests to do anything. But it will be ready in case anybody recreates the device node.
You are probably looking for a function rather than a command. unlink() is the answer. unlink() will remove the file/special file if no process has the file open. If any processes have the file open, then the file will remain until the last file descriptor referring to it is closed. Read more here: http://man7.org/linux/man-pages/man2/unlink.2.html
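For completeness, a minimal sketch of the unlink(2) route described above; the device path is a placeholder for whatever your mknod created:

/* Remove a device node programmatically with unlink(2). The name disappears
 * immediately; if a process still holds the node open, the inode is only
 * freed on the last close, as described above. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    if (unlink("/dev/mychardev") != 0) { /* placeholder device path */
        perror("unlink /dev/mychardev");
        return 1;
    }
    return 0;
}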
This question is an extension of that question.
Yet again: I'm working under CentOS 6.0 and I have a remote win7 folder, mounted with:
mount -t cifs //PC128/mnt /media/net -o "username=WORKGROUP\user,password=pwd,rw,noexec,soft,uid=user,gid=user"
When the remote folder is not available (e.g. the network cable is pulled out), an attempt to access it locks up the application I'm working on. At first I found that QDir::exists() caused locking for 20-90 seconds (I still can't figure out why the duration varies so much); later I found that any call to the stat() function leads to an application lock.
I followed the advice provided in the topic above and moved the QDir::exists() call (and later the call to stat()) to another thread, but this didn't solve the problem. The application still hangs when the connection is suddenly lost. The Qt trace shows that the lock is somewhere in the kernel:
0 __kernel_vsyscall
1 __xstat64@GLIBC_2.1 /lib/libc.so.6
2 QFSFileEnginePrivate::doStat stat.h
I also tried checking whether the remote share is still mounted before accessing the folder itself, but it didn't help. Approaches such as:
mount | grep /media/net
show that the shared folder is still mounted even if there is no active connection to the network.
Checking folder status differences such as:
stat -fc%t:%T /media/net/ != stat -fc%t:%T /media/net/..
also hangs for ~20 seconds.
So I have several questions:
Is there any way to change the CIFS timeouts? I tried to find out, but it seems there are no appropriate parameters and no CIFS config.
How can I check whether the remote folder is still mounted without getting locked?
How can I check whether the folder exists, also without getting locked?
Your problem: "An unreachable network filesystem" is a very well known example which trigger linux hung task which isn't the same of zombies process at all(killing the parent PID won't do anything)
An hung task, is task which triggered a system call that cause problem in the kernel, so that the system call never return.
The major particularity is that the task is declared in the "D" state by the scheduler which mean the program is in an uninterruptible state. This mean that you can do nothing to stop you program: You can trigger all signal to the task, it would not respond. Launching hundreds of SIGTERM/SIGKILL does nothing!
This the case whith my old kernel: when my nfs server crash, I need to reboot the client to kill the tasks using the filesystem. I compiled it a long time ago (I have still the build tree on my hdd) and during the configuration I saw this in lib/Kconfig.debug:
config DETECT_HUNG_TASK
    bool "Detect Hung Tasks"
    depends on DEBUG_KERNEL
    default LOCKUP_DETECTOR
    help
      Say Y here to enable the kernel to detect "hung tasks",
      which are bugs that cause the task to be stuck in
      uninterruptible "D" state indefinitely.

      When a hung task is detected, the kernel will print the
      current stack trace (which you should report), but the
      task will stay in uninterruptible state. If lockdep is
      enabled then all held locks will also be reported. This
      feature has negligible overhead.
It only proposed to detect such tasks, or to panic on detection: I haven't checked whether recent kernels can actually solve the problem (judging from your question, it seems they cannot), but I didn't think it was worth enabling.
There is a second point: normally, the detection occurs after 120 seconds, but I also saw a Kconfig option for this:
config DEFAULT_HUNG_TASK_TIMEOUT
    int "Default timeout for hung task detection (in seconds)"
    depends on DETECT_HUNG_TASK
    default 120
    help
      This option controls the default timeout (in seconds) used
      to determine when a task has become non-responsive and should
      be considered hung.

      It can be adjusted at runtime via the kernel.hung_task_timeout_secs
      sysctl or by writing a value to
      /proc/sys/kernel/hung_task_timeout_secs.

      A timeout of 0 disables the check. The default is two minutes.
      Keeping the default should be fine in most cases.
This also happens with kernel threads. Example: create a loop device backed by a file on a FUSE filesystem, then crash the userspace program controlling the FUSE filesystem!
You should get a kernel thread, named something like loopX (X normally corresponds to your loop device number), hanging!
weblinks:
https://unix.stackexchange.com/questions/5642/what-if-kill-9-does-not-work (look at the answer written by ultrasawblade)
http://www.linuxquestions.org/questions/linux-general-1/kill-a-hung-task-when-kill-9-doesn't-help-697305/
http://forums-web2.gentoo.org/viewtopic-t-811557-start-0.html
http://comments.gmane.org/gmane.linux.kernel/1189978
http://comments.gmane.org/gmane.linux.kernel.cifs/7674 (This is a case similar to yours)
As for your three questions, you have the answer: this is probably due to a well-known bug in the Linux kernel VFS layer! (There are no CIFS timeouts.)
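That said, questions 2 and 3 can be worked around in userspace: perform the risky stat() in a forked child and give up after a timeout. The child may still end up stuck in "D" state on a dead mount, but the parent (your Qt application) stays responsive. A minimal sketch; the path and timeout are placeholders:

/* Returns 1 if path exists, 0 if not, -1 on timeout or error. */
#include <poll.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

int exists_with_timeout(const char *path, int timeout_ms)
{
    int pipefd[2];
    if (pipe(pipefd) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (pid == 0) {                     /* child: may block forever */
        close(pipefd[0]);
        struct stat st;
        char ok = (stat(path, &st) == 0);
        write(pipefd[1], &ok, 1);
        _exit(0);
    }

    close(pipefd[1]);
    struct pollfd pfd = { .fd = pipefd[0], .events = POLLIN };
    int result = -1;
    if (poll(&pfd, 1, timeout_ms) == 1) {
        char ok;
        if (read(pipefd[0], &ok, 1) == 1)
            result = ok;
        waitpid(pid, NULL, 0);          /* child answered: reap it */
    }
    /* On timeout the child is abandoned: it cannot be killed while in "D"
     * state, and will linger as a zombie once it finally exits unless the
     * parent reaps it (a real app would install a SIGCHLD handler). */
    close(pipefd[0]);
    return result;
}

int main(void)
{
    printf("exists: %d\n", exists_with_timeout("/media/net", 2000));
    return 0;
}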
After much trial & error I found a solution that persists.
# vim /etc/fstab
//192.168.1.122/myshare /mnt/share cifs username=user,password=password,_netdev 0 0
The _netdev option is important since we are mounting a network device. Clients may hang during the boot process if the system encounters any difficulties with the network.
https://www.redhat.com/sysadmin/samba-windows-linux