Inotify: Odd behavior with directory creations - inotify

I have an inotify/kernel question. I'm using the "inotify" Python project in order to make my observations, but my question is still inherently about the core inotify kernel implementation.
The Python inotify project handles recursive inotify watches. It provides a nice generator that allows you to loop over the events. It implements recursive watches by identifying directory-create events and automatically adding those watches before yielding the event.
I noticed some weird behavior with "mkdir -p" calls. Whereas I can rapidly, incrementally create individual directories and see them from the event-loop, "mkdir -p" never produces events for the subdirectory of a subdirectory or a file created in that subdirectory.
Does anyone have any thoughts?
WORKS: "mkdir aa && mkdir aa/bb && touch aa/bb/filename":
(_INOTIFY_EVENT(wd=1, mask=1073742080, cookie=0, len=16), ['IN_ISDIR', 'IN_CREATE'], '/tmp/tmpt3MlIQ', u'aa')
(_INOTIFY_EVENT(wd=2, mask=1073742080, cookie=0, len=16), ['IN_ISDIR', 'IN_CREATE'], u'/tmp/tmpt3MlIQ/aa', u'bb')
(_INOTIFY_EVENT(wd=3, mask=256, cookie=0, len=16), ['IN_CREATE'], u'/tmp/tmpt3MlIQ/aa/bb', u'filename')
(_INOTIFY_EVENT(wd=3, mask=32, cookie=0, len=16), ['IN_OPEN'], u'/tmp/tmpt3MlIQ/aa/bb', u'filename')
(_INOTIFY_EVENT(wd=3, mask=4, cookie=0, len=16), ['IN_ATTRIB'], u'/tmp/tmpt3MlIQ/aa/bb', u'filename')
(_INOTIFY_EVENT(wd=3, mask=8, cookie=0, len=16), ['IN_CLOSE_WRITE'], u'/tmp/tmpt3MlIQ/aa/bb', u'filename')
DOESN'T WORK: "mkdir -p aa/bb && touch aa/bb/filename":
(_INOTIFY_EVENT(wd=1, mask=1073742080, cookie=0, len=16), ['IN_ISDIR', 'IN_CREATE'], '/tmp/tmpuTSxYl', u'aa')
(_INOTIFY_EVENT(wd=1, mask=1073741856, cookie=0, len=16), ['IN_ISDIR', 'IN_OPEN'], '/tmp/tmpuTSxYl', u'aa')
(_INOTIFY_EVENT(wd=1, mask=1073741840, cookie=0, len=16), ['IN_ISDIR', 'IN_CLOSE_NOWRITE'], '/tmp/tmpuTSxYl', u'aa')
Naturally, I did the next obvious, brainless thing I could think of and added the "-p" flag to the "mkdir aa && mkdir aa/bb", just to make sure there weren't any "-p"-specific anomalies, but it didn't make a difference.
The GNU implementation of "mkdir -p" just iterates from separator to separator in the path. No magic. The Python implementation of os.makedirs (same functionality) also just splits the path and enumerates the parts. However, the GNU one doesn't work but the Python one does. This seems to imply a race condition, except that the results are identical no matter how I manipulate the conditions. I even started using a trivial/minuscule timeout on the epoll that we're doing to read the events (read: if there was any delay with the original timeout value, then that's no longer a factor). It's almost as if inotify in the kernel is totally missing the subsequent creations in "mkdir -p".
I'm sure I'm just missing something.
For reference, the calls involved in the GNU implementation:
http://code.metager.de/source/xref/gnu/coreutils/src/mkdir.c
http://code.metager.de/source/xref/gnu/octave/gnulib-hg/lib/mkdir-p.c#85
http://code.metager.de/source/xref/gnu/octave/gnulib-hg/lib/mkancesdirs.c#67
Note that we start in GNU's "coreutils" and apparently proceed into GNU Octave for the implementation of the "mkdir -p". It's the only reference that OpenGrok provided. I can't explain this and I'm in unfamiliar territory.
Python's implementation:
https://github.com/python/cpython/blob/master/Lib/os.py#L196
Am I overlooking some detail of inotify's behavior?

A very interesting catch you've got there!
No, the native kernel inotify library does exactly what the documentation says. GNU mkdir -p is totally fine as well.
Quoting the question: "I noticed some weird behavior with 'mkdir -p' calls. Whereas I can rapidly, incrementally create individual directories and see them from the event-loop, 'mkdir -p' never produces events for the subdirectory of a subdirectory or a file created in that subdirectory."
You will have to try other inotify implementations to judge how reliable PyInotify's recursive watch is. The fact that it is a popular Python implementation doesn't, on its own, earn my trust!
I make the following statement presuming the sample Python output you have posted for reference is accurate. I would say it is the PyInotify implementation that is slacking; it is simply not very good at what it does.
For our speculation here, let's split the flow and start with creation of dirs before considering file creation.
Did you notice that even with your first scenario, IN_CREATE for the dir aa
(_INOTIFY_EVENT(wd=1, mask=1073742080, cookie=0, len=16), ['IN_ISDIR', 'IN_CREATE'], '/tmp/tmpt3MlIQ', u'aa')
does not have corresponding IN_OPEN and IN_CLOSE_NOWRITE events, which for some odd reason seem to be present in the second scenario even though the action is just mkdir?
(_INOTIFY_EVENT(wd=1, mask=1073741856, cookie=0, len=16), ['IN_ISDIR', 'IN_OPEN'], '/tmp/tmpuTSxYl', u'aa')
(_INOTIFY_EVENT(wd=1, mask=1073741840, cookie=0, len=16), ['IN_ISDIR', 'IN_CLOSE_NOWRITE'], '/tmp/tmpuTSxYl', u'aa')
There's clearly something fishy about this. Definitely inconsistent.
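For anyone comparing the raw tuples, the numeric mask field can be decoded by hand. Here is a minimal sketch, assuming the flag values defined in Linux's <sys/inotify.h>; decode_mask is a hypothetical helper written for this illustration, not part of any inotify package:

```python
# Subset of the inotify flag values from <sys/inotify.h>.
IN_FLAGS = {
    "IN_ACCESS": 0x00000001,
    "IN_MODIFY": 0x00000002,
    "IN_ATTRIB": 0x00000004,
    "IN_CLOSE_WRITE": 0x00000008,
    "IN_CLOSE_NOWRITE": 0x00000010,
    "IN_OPEN": 0x00000020,
    "IN_MOVED_FROM": 0x00000040,
    "IN_MOVED_TO": 0x00000080,
    "IN_CREATE": 0x00000100,
    "IN_DELETE": 0x00000200,
    "IN_ISDIR": 0x40000000,
}


def decode_mask(mask):
    """Return the sorted names of the known flags set in an inotify mask."""
    return sorted(name for name, bit in IN_FLAGS.items() if mask & bit)


# Masks taken from the question's output.
for mask in (1073742080, 1073741856, 1073741840):
    print(mask, decode_mask(mask))
```

The three masks decode to the same flag names the Python package printed, so the events themselves are being read correctly; what differs between the two scenarios is which events arrive.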
I have not taken a look at the PyInotify implementation yet, and I don't intend to spend time on it. However, I have worked with the native inotify interface closely enough to vouch for its accuracy! In my experience it has never missed reporting an event that actually occurred.
Now let's move on to the file creation part, which seems to be your major concern: "mkdir -p" never produces events for the subdirectory of a subdirectory or a file created in that subdirectory. This is not always true; it depends on how good the implementation is.
I have reproduced the same set of actions that you performed, with a better inotify implementation, just to prove my claims.
Notice how the event flow is clearly more accurate than PyInotify's report.
Reproduction:
case 1: mkdir aa && mkdir aa/bb && touch aa/bb/filename
root@six-k:/opt/test# ls -la
total 8
drwxr-xr-x 2 root root 4096 Mar 18 13:55 .
drwxr-xr-x 20 root root 4096 Mar 18 13:53 ..
root@six-k:/opt/test# fluffyctl -w ./
root@six-k:/opt/test# mkdir aa && mkdir aa/bb && touch aa/bb/filename
events caught:
root@six-k:/home/lab/fluffy# fluffy
event: CREATE, ISDIR,
path: /opt/test/aa
event: ACCESS, ISDIR,
path: /opt/test/aa
event: ACCESS, ISDIR,
path: /opt/test/aa
event: CLOSE_NOWRITE, ISDIR,
path: /opt/test/aa
event: CREATE, ISDIR,
path: /opt/test/aa/bb
event: ACCESS, ISDIR,
path: /opt/test/aa/bb
event: ACCESS, ISDIR,
path: /opt/test/aa/bb
event: CLOSE_NOWRITE, ISDIR,
path: /opt/test/aa/bb
event: CREATE,
path: /opt/test/aa/bb/filename
event: OPEN,
path: /opt/test/aa/bb/filename
event: ATTRIB,
path: /opt/test/aa/bb/filename
event: CLOSE_WRITE,
path: /opt/test/aa/bb/filename
case 2: mkdir -p aa/bb && touch aa/bb/filename
root@six-k:/opt/test# cd ../
root@six-k:/opt# mkdir test2
root@six-k:/opt# cd test2/
root@six-k:/opt/test2# fluffyctl -w ./
root@six-k:/opt/test2# mkdir -p aa/bb && touch aa/bb/filename
root@six-k:/opt/test2#
events caught:
root@six-k:/home/lab/fluffy# fluffy
event: CREATE, ISDIR,
path: /opt/test2/aa
event: ACCESS, ISDIR,
path: /opt/test2/aa
event: ACCESS, ISDIR,
path: /opt/test2/aa/bb
event: ACCESS, ISDIR,
path: /opt/test2/aa/bb
event: CLOSE_NOWRITE, ISDIR,
path: /opt/test2/aa/bb
event: ACCESS, ISDIR,
path: /opt/test2/aa
event: CLOSE_NOWRITE, ISDIR,
path: /opt/test2/aa
event: CREATE,
path: /opt/test2/aa/bb/filename
event: OPEN,
path: /opt/test2/aa/bb/filename
event: ATTRIB,
path: /opt/test2/aa/bb/filename
event: CLOSE_WRITE,
path: /opt/test2/aa/bb/filename
There you go, events on the sub directory and the file in it.
The answer is getting lengthy!
Nevertheless, no recursive implementation built on top of the native inotify library can guarantee all the events. It's not feasible! If it were, it would have been rather simple for the kernel developers who authored inotify to have introduced recursive watches natively.
Gotchas:
Notice that there is indeed a difference in the create events in my reproduction snippet? There's none reported for the subdirectory in the second case (mkdir -p). Why? Though everything happens very quickly, recursive setups aren't quick enough. By the time the first create event on dir aa is caught, mkdir -p aa/bb has finished creating dir bb as well. So, there's no create event for dir bb. Again, a reminder: this is not the native inotify library's fault. We hadn't even set up a watch on dir aa yet, so how in the world were we going to receive events on it?
I hope that cleared things up!
Hold on, if the create event of dir bb wasn't even caught, how does fluffy seem to have set a watch on it and thereby reported subsequent events on dir 'bb'?
Good, you are following! You guessed right, but not entirely. fluffy received the first create event of dir aa. By the time it processed this event, mkdir -p had finished its work, so there was no create event for dir bb. Right. But by the time fluffy set up its watch on dir aa, dir bb was already present. So fluffy pulled bb into its watch list, because it is indeed a descendant of dir aa. The rest you already know: since bb is being watched, fluffy reports subsequent events, which include the file creation.
Feel free to quote this answer or point to fluffy if you (or anyone reading this) mean to raise a ticket/issue about this on the PyInotify GitHub project page. If you need more info, I'll gladly provide it. You can open an issue at fluffy's GitHub page for general discussions/suggestions/opinions as well. fluffy could use your help to make it better.

It seems to me that it is tricky to catch all of the events you are hoping to get notifications for. My experience with inotify is in C, and I am sure the Python package adds a few things for convenience, so I have tailored my answer to speak to inotify in fairly general terms.
There are no guarantees on the order in which things happen. The creation of the first directory triggers a notification on its parent. Some time later, two things happen: the next directory gets created, and the first directory gets registered for event notification. These two actions can happen in either order.
If the inotify watch is added first, then you will get notified when the second directory is created.
If the directory is created first, no notification arrives, because nothing was watching yet. To cover that case, one might opendir the new directory after adding the watch and check for existing entries, adding a watch for each subdirectory found.
Inotify is a great tool for getting the initial signal to do work. But, it does have its limits. Especially with new directories being created, you need to check for gaps and races.
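The "watch first, then scan for what you missed" idea can be sketched as follows. This is only an illustration under stated assumptions: catch_up is a hypothetical helper, and add_watch stands for whatever call your inotify binding uses to register a single directory; neither name comes from a real library.

```python
import os


def catch_up(new_dir, add_watch):
    """Watch new_dir, then scan it for entries created before the watch existed.

    add_watch is a caller-supplied callable (hypothetical) that registers one
    directory with inotify. Returns the files found by the scan so the caller
    can handle them as if a create event had been delivered.
    """
    missed_files = []
    pending = [new_dir]
    while pending:
        current = pending.pop()
        # Watch first, scan second: anything created after this call produces
        # a real event, and anything created before it shows up in the scan.
        add_watch(current)
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        pending.append(entry.path)
                    else:
                        missed_files.append(entry.path)
        except FileNotFoundError:
            # The directory disappeared between the event and the scan.
            continue
    return missed_files
```

Because the watch is added before the scan, an entry created in between can show up both in the scan and as a real event, so the caller should be prepared to see the same path twice.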

Buildroot/busybox usertable.txt and take away access rights for a group/user

Question 1:
Through the Buildroot usertable.txt I created a user called deviceuser, which belongs to the groups operator and nogroup:
$cat usertable.txt
deviceuser -1 deviceuser -1 =SERIAL_NO /mnt /bin/sh operator Device user for non-trivial maintanence work
After the image is loaded onto the target, this is what I get in /etc/group:
root:x:0:
daemon:x:1:
bin:x:2:
sys:x:3:
adm:x:4:
tty:x:5:
disk:x:6:
lp:x:7:
mail:x:8:
kmem:x:9:
wheel:x:10:root
cdrom:x:11:
dialout:x:18:
floppy:x:19:
video:x:28:
audio:x:29:
tape:x:32:
www-data:x:33:
utmp:x:43:
plugdev:x:46:
staff:x:50:
lock:x:54:
netdev:x:82:
users:x:100:
admin:x:1002:deviceuser <====== not sure where its coming from
nogroup:x:65534:deviceuser
deviceuser:x:1000:
sshd:x:1001:
operator:x:37:deviceuser
$ cat /etc/shadow
root:$1$blahblahblah.:10933:0:99999:7:::
daemon:*:10933:0:99999:7:::
bin:*:10933:0:99999:7:::
sys:*:10933:0:99999:7:::
sync:*:10933:0:99999:7:::
mail:*:10933:0:99999:7:::
www-data:*:10933:0:99999:7:::
operator:*:10933:0:99999:7:::
nobody:*:10933:0:99999:7:::
deviceuser:$1$blahblahblah:::::::
sshd:*:::::::
As noted above, deviceuser gets admin privilege; I need to eliminate that and make deviceuser part of operator and nogroup only.
Question 2:
I want to take the access rights (read/write/execute) for the /etc/ folder away from this deviceuser or the operator group while keeping everybody else's permissions intact. There are a number of users and groups in the system, including www-data. What's the simplest way to do this without causing any permission issues for www-data and others?
If I do "chmod -R o-wrx /etc " then I believe www-data will have issues running some init scripts.
Thanks
Ratin

MacOS, AppleScript and Git

I have a project that will require reading a local repo and collecting the diff between the most recent commit and the one before it. I then need to do additional work with those diffs: add them to an existing log file and make them available for tech writers to edit existing API docs with the changes. I might Slack them, or use the Jira API to build a ticket (I like that option as it leaves a trail).
I can do the yeoman level work in an AppleScript, calling shell scripts when needed then parsing the data, and passing the cleaned data to the various applications/sites I need to. But other, less technical people will also be using this app and it would be nice to give them a simple UI to work with.
Anyway, after much digging through the Google, SO and other sources I was able to get a MacOS app working that can call an AppleScript and now I've run into a wall...
I can run this AppleScript from Script Editor and it works fine:
set strGitLog to do shell script "cd ~/Desktop/xxxxxx/Projects/UnifiedSDK/Repo/xxxxxx && git log -p -- file1.html"
"commit c39c6bb004d2e104b3f8e15a6125e3d68a5323ef
Author: Steve <xxxxxx@xxxxxx.com>
Date: Tue Oct 22 15:42:13 2019 -0400
Added deprecation warning to file1
diff --git a/file1.html b/file1.html
index b7af22b..9fdc781 100644
--- a/file1.html
+++ b/file1.html
@@ -51,6 +51,8 @@
<h2>Class Description</h2>
<p style=\"margin-bottom:10px;\">This is the description of the class</p>
+ <p style=\"margin-bottom:10px;\">Warning: This class is scheduled to be deprecated.</p>
+
<h3>Arguments:</h3>
<p style=\"margin-bottom:10px;\">These are the arguments that the class accepts</p>
...
but, if I place this script within a MacOS application:
script gitMessenger
property parent : class "NSObject"
to readMessage()
set strGitLog to do shell script "cd ~/Desktop/xxxxxx/Projects/UnifiedSDK/Repo/xxxxxx && git log -p -- file1.html"
log strGitLog
end readMessage
end script
I get this error message in the log:
fatal: Unable to read current working directory: Operation not permitted (error 128)
After checking, this seems to be a Git permissions error. If I run pwd, I am pointing to the right directory:
/Users/xxxxxx/Library/Containers/xxxxxx.GitMessenger/Data/Desktop/xxxxxx/Projects/UnifiedSDK/Repo/xxxxxx
and that directory has a Git repository initialized in it, with read/write permission for everyone. So I am a little at a loss right now as to how to get this to work. Any help or suggestions would be appreciated.

Get last commit for every file of a file list in Mercurial

I have an hg repository and I would like to know the last commit date of every file in sources/php/dracca/endpoint/wiki/**/Wiki*.php
So far, I have this one liner:
find sources/php/dracca/endpoint/wiki/ -name "Wiki*.php" -exec hg log --limit 1 --template "{date|shortdate}" {} \; -exec echo {} \;
But this seems utterly slow as (I suppose) find makes one hg call per file, leading to 15 seconds of computation for the (say) ~40 files I have in there...
Is there a faster way?
The output of this command looks like:
2019-09-20 sources/php/dracca/endpoint/wiki/characters/colmarr/WikiCharactersColmarrEndpoint.php
2019-09-20 sources/php/dracca/endpoint/wiki/characters/dracquints/allgroup/WikiCharactersDracquintsAllgroupEndpoint.php
...
The format can be changed a bit if needed (I wouldn't mind having, say, one date followed by the list of files changed on that date, or something like that).
Even with find+exec you can have a shorter chain (by dropping the last exec) with the modified template {date|shortdate}\n
You can use the (accepted) Perl-ism from this question, or ask somebody to update the mentioned lof extension to current Mercurial (the code from 2012 will not work now).
Alternatives (dirty, ugly hacks)
In either case, you can (and have to) call hg only once and then post-process the results.
Before trying these tricks, read hg help filesets and build one common fileset for your files (I suppose it can be just set:sources/php/dracca/endpoint/wiki/**/Wiki*.php, but that is to be tested!)
After that, you can:
Perform hg log like this
hg log setup.* --template "{files % '{file} {rev} {date|shortdate}\n'}"
(I used a simple pattern for the test; you have to use your own fileset)
and get output in this form
setup.py 1163 2018-11-07
README.md 1162 2018-11-07
setup.py 1162 2018-11-07
hggit/git_handler.py 1124 2018-05-01
setup.py 1124 2018-05-01
setup.cfg 1118 2017-11-27
setup.py 1117 2017-11-27
hggit/git2hg.py 1111 2017-11-27
hggit/overlay.py 1111 2017-11-27
setup.py 1111 2017-11-27
…
(there are some unwanted, unexpected files because I output all files of every revision that affects a file of interest, without filtering). You have to grep only the needed files, sort by columns 1 and 2, and use the date of the latest revision of each file.
Use hg grep. For the above test pattern
hg grep "." -I setup.* --files-with-matches -d -q
(find any changes, output only filename, revision and short date)
you'll get something like
setup.py:1163:2018-11-07
setup.cfg:1118:2017-11-27
and the third column will be the last modification date of the file that you need

Where to find logs for a cloud-init user-data script?

I'm initializing spot instances running a derivative of the standard Ubuntu 13.04 AMI by pasting a shell script into the user-data field.
This works. The script runs. But it's difficult to debug because I can't figure out where the output of the script is being logged, if anywhere.
I've looked in /var/log/cloud-init.log, which seems to contain a bunch of stuff that would be relevant to debugging cloud-init, itself, but nothing about my script. I grepped in /var/log and found nothing.
Is there something special I have to do to turn logging on?
The default log location for cloud-init user-data output is already /var/log/cloud-init-output.log on AWS, DigitalOcean and most other cloud providers. You don't need to set up any additional logging to see the output.
You could create a cloud-config file (with "#cloud-config" at the top) for your userdata, use runcmd to call the script, and then enable output logging like this:
output: {all: '| tee -a /var/log/cloud-init-output.log'}
So I tried to replicate your problem. I usually work with Cloud Config, so I just created a simple test user-data script like this:
#!/bin/sh
echo "Hello World. The time is now $(date -R)!" | tee /root/output.txt
echo "I am out of the output file...somewhere?"
yum search git # just for fun
ls
exit 0
Notice that, with CloudInit shell scripts, the user-data "will be executed at rc.local-like level during first boot. rc.local-like means 'very late in the boot sequence'"
After logging in to my instance (a Scientific Linux machine), I first went to /var/log/boot.log, and there I found:
Hello World. The time is now Wed, 11 Sep 2013 10:21:37 +0200! I am
out of the file. Log file somewhere? Loaded plugins: changelog,
kernel-module, priorities, protectbase, security,
: tsflags, versionlock 126 packages excluded due to repository priority protections 9 packages excluded due to repository
protections ^Mepel/pkgtags
| 581 kB 00:00
=============================== N/S Matched: git =============================== ^[[1mGit^[[0;10mPython.noarch : Python ^[[1mGit^[[0;10m Library c^[[1mgit^[[0;10m.x86_64 : A fast web
interface for ^[[1mgit^[[0;10m
...
... (more yum search output)
...
bin etc lib lost+found mnt proc sbin srv tmp var
boot dev home lib64 media opt root selinux sys usr
(other unrelated stuff)
So, as you can see, my script ran and was rightly logged.
Also, as expected, I had my forced log 'output.txt' in /root/output.txt with the content:
Hello World. The time is now Wed, 11 Sep 2013 10:21:37 +0200!
So... I am not really sure what is happening in your script.
Make sure you're exiting the script with
exit 0 #or some other code
If it still doesn't work, you should provide more info, like your script, your boot.log, your /etc/rc.local, and your cloudinit.log.
btw: what is your cloudinit version?

Magento: upgrade pre 1.6 version to most recent one

I've seen a lot of questions about upgrading pre-1.6 Magento installations to the most recent version (at the moment 1.7.0.2), but a lot of the answers don't work for everybody.
So below is the answer to the question:
How to upgrade Magento from a pre 1.6 installation to the most recent one.
There are a lot of versions of this procedure around, and not all of them work. This one has worked for me across a lot of versions, from as far back as 1.3 up to 1.7.
Please add comments with solutions to problems you're experiencing; I can update the answer so other people get help from this topic too!
What you need:
- SUDO rights/root account on your server.
- The Linux package 'nohup'
- make sure NOBODY can trigger the index.php. If your version supports maintenance.flag, put an empty maintenance.flag file in your Magento root.
Walkthrough
1) Download the latest Magento. Overwrite: ./download/* ./lib/* ./mage
2) Run these steps from your Magento root as a sudoer (if you're not root, put 'sudo' in front of all the commands)
find . -type f -exec chmod 644 {} \;
find . -type d -exec chmod 755 {} \;
chmod -R 777 ./var
chmod 550 mage
3) Go to your Magento root folder and type:
./mage list-upgrades
./mage config-set preferred_state stable
./mage upgrade-all --force
./mage install http://connect20.magentocommerce.com/community Mage_All_Latest --force
4) Now there is the last step. Note: In some situations this process can take up to 8+ hours!
nohup php -f ./index.php
Known issues
1) It's possible that your update gets into a loop. To find this loop, enable debugging. Edit: /lib/Varien/Db/Adapter/Pdo/Mysql.php (+/- line 112 and 112)
protected $_debug = true;
protected $_debuglogeverything = true;
This will write a debug log to: /var/debug/[debug_file]
2) Read the file by opening the dir:
cd /var/debug/
tail -f [debug_file] <-- replace with the actual filename
3) If you use debug, the file will get HUGE! Make sure you delete it once in a while.
Tip: as a root user, type:
crontab -e
*/5 * * * * rm /[my_magento_base_folder]/var/debug/[debug_file] <-- add this line
If you want to read the file, add a # to this line and use tail to read it.
These steps help you find common errors and loops (if the tail shows a repeating error message)
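If the debug file is too large to eyeball with tail, a small script can count repeats for you. This is a rough sketch, assuming that a loop shows up as identical lines in the file (if each line carries a timestamp, that would need to be stripped first); repeated_lines and its parameters are hypothetical and not part of Magento.

```python
from collections import Counter, deque


def repeated_lines(path, tail=2000, min_count=5):
    """Return (line, count) pairs for lines repeated in the last `tail` lines."""
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the final `tail` lines of a huge file.
        last = deque(handle, maxlen=tail)
    counts = Counter(line.strip() for line in last if line.strip())
    # most_common() yields the entries ordered from most to least frequent.
    return [(line, n) for line, n in counts.most_common() if n >= min_count]
```

Calling repeated_lines on the actual debug file path and printing the first few pairs should surface a statement that keeps being executed over and over.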