`entr`: How is an update identified? noatime troubles? And why doesn't -r work with -d? - inotify

I have a script regularly appending to a log file. When I use entr (discovered here) to monitor that log file and I then touch the log, everything works fine, but when the script appends to the file, entr fails to react. This may be because I have noatime set in my fstab - but that only stops the updating of the access time, not the modify time, so this confuses me.
I've checked and while atime is not updating, ctime (ls -lc) definitely is. Could entr really be depending on atime? I use noatime because I have an SSD. So what should I do? I just stumbled on lazytime. Would that solve the problem?
Since monitoring the log file was not working, I tried entr -cdr on the directory of files that are updated (a new file is created) at the same time as the log (the log is in a different directory). entr recognizes when the directory contents change, but the -r does not work. The entr process just ends, saying "entr: directory altered".
Any idea how to fix this, or whether I should just go back to inotify, would be appreciated.
Edit: I have written it with inotify now, and the event reported when the log file is written to is, sensibly enough, "MODIFY."
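For reference, watching the file with inotify-tools prints the events as they happen (the path is a placeholder for my log file):
inotifywait -m -e modify,close_write,attrib /path/to/the.log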

It turns out that entr does not respond to IN_MODIFY events, but only to these (in Linux):
IN_CLOSE_WRITE|IN_DELETE_SELF|IN_MOVE_SELF|IN_CREATE
Also, IN_ATTRIB, but only if the file-mode or inode numbers change.
In BSD/OSX, it's:
NOTE_DELETE|NOTE_WRITE|NOTE_RENAME|NOTE_TRUNCATE|NOTE_ATTRIB
Also, the option -r has no effect in the context of the -d option. It only works when entr is monitoring files.
See the developer's comments. Also, more info on entr.
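If you do need to react to every write (IN_MODIFY), a small inotify-tools loop is one way to do it. A rough sketch, with the log path and the command as placeholders:
# inotifywait exits after one event here, so the loop re-arms it each time
while inotifywait -q -e modify /path/to/the.log >/dev/null; do
    ./handle-update.sh   # placeholder for whatever entr was meant to run
done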

Can we wget with a file list and rename destination files?

I have this wget command:
sudo wget --user-agent='some-agent' --referer=http://some-referrer.html -N -r -nH --cut-dirs=x --timeout=xxx --directory-prefix=/directory/for/downloaded/files -i list-of-files-to-download.txt
-N will check if there is actually a newer file to download.
-r will turn the recursive retrieving on.
-nH will disable the generation of host-prefixed directories.
--cut-dirs=X will avoid the generation of the host's subdirectories.
--timeout=xxx will, well, timeout :)
--directory-prefix will store files in the desired directory.
This works nicely, no problem.
Now, to the issue:
Let's say my files-to-download.txt has these kinds of files:
http://website/directory1/picture-same-name.jpg
http://website/directory2/picture-same-name.jpg
http://website/directory3/picture-same-name.jpg
etc...
You can see the problem: on the second download, wget will see we already have a picture-same-name.jpg, so it won't download the second one or any of the following ones with the same name. I cannot mirror the directory structure because I need all the downloaded files to be in the same directory. I can't use the -O option because it clashes with -N, and I need that. I've tried to use -nd, but it doesn't seem to work for me.
So, ideally, I need to be able to:
a.- wget from a list of URLs the way I do now, keeping my parameters.
b.- get all the files in the same directory and be able to rename each file.
Does anybody have any solution to this?
Thanks in advance.
I would suggest two approaches.
First, use the -nc or --no-clobber option. From the man page:
-nc
--no-clobber
If a file is downloaded more than once in the same directory, Wget's behavior depends on a few options, including -nc. In certain cases, the local file will be clobbered, or overwritten, upon repeated download. In other cases it will be preserved.
When running Wget without -N, -nc, -r, or -p, downloading the same file in the same directory will result in the original copy of file being preserved and the second copy being named file.1. If that file is downloaded yet again, the third copy will be named file.2, and so on. (This is also the behavior with -nd, even if -r or -p are in effect.) When -nc is specified, this behavior is suppressed, and Wget will refuse to download newer copies of file. Therefore, "no-clobber" is actually a misnomer in this mode---it's not clobbering that's prevented (as the numeric suffixes were already preventing clobbering), but rather the multiple version saving that's prevented.
When running Wget with -r or -p, but without -N, -nd, or -nc, re-downloading a file will result in the new copy simply overwriting the old. Adding -nc will prevent this behavior, instead causing the original version to be preserved and any newer copies on the server to be ignored.
When running Wget with -N, with or without -r or -p, the decision as to whether or not to download a newer copy of a file depends on the local and remote timestamp and size of the file. -nc may not be specified at the same time as -N.
A combination with -O/--output-document is only accepted if the given output file does not exist.
Note that when -nc is specified, files with the suffixes .html or .htm will be loaded from the local disk and parsed as if they had been retrieved from the Web.
As you can see from this man page entry, the behavior might be unpredictable/unexpected. You will need to see if it works for you.
Another approach would be to use a bash script. I am most comfortable using bash on *nix, so forgive the platform dependency. However, the logic is sound, and with a bit of modification you can get it to work on other platforms/scripts as well.
Sample pseudocode bash script -
while read -r i; do
    wget <all your flags except the -i flag> "$i" -O /path/to/custom/directory/filename
done < list-of-files-to-download.txt
You can modify the script to download each file to a temporary file, parse $i to get the filename from the URL, check whether the file already exists on disk, and then decide whether to rename the temp file to the name that you want.
This offers much more control over your downloads.
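A rough sketch of that idea (the temp handling and the way the name is derived from the URL are just one option; note that -O cannot be combined with -N, so the timestamp check is given up):
#!/bin/bash
# Download each URL to a temp file, then rename it so files that share a
# basename but come from different source directories don't collide.
dest=/directory/for/downloaded/files
while read -r i; do
    tmp=$(mktemp)
    # add your usual flags (user-agent, referer, timeout, ...) here
    if wget -O "$tmp" "$i"; then
        # e.g. http://website/directory1/picture-same-name.jpg -> directory1_picture-same-name.jpg
        name=$(echo "$i" | sed 's|^[a-z]*://[^/]*/||; s|/|_|g')
        mv "$tmp" "$dest/$name"
    else
        rm -f "$tmp"
    fi
done < list-of-files-to-download.txt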

colorgcc perl script with output to non-tty enabled writing to C dependency files

Ok, so here's my issue. I have written a build script in bash that pipes output to tee and sorts different output to different log files (so I can summarize errors/warnings at the end and get some statistics on files built). I wanted to use the colorgcc perl script (colorgcc.1.3.2) to colorize the output from gcc, and had found in other places that this won't work when piping to tee, since the script checks whether it is writing to something that is not a tty. Having disabled this check, everything was working until I did a full build and discovered that some of the code we receive from another group builds C dependency files (we don't control this code; changing it or the build process for these isn't really an option).
The problem is that these .d files have the following form:
filename.o filename.d : filename.c \
dependant_file1.h \
dependant_file2.h (and so on for however many dependencies there are)
This output from GCC gets written into the .d file, but, since it is close enough to a warning/error message, colorgcc outputs color codes (I believe it's the check for filename:lineno:message, but I'm not 100% sure; it could be the filename:message check in the GCCOUT while loop). I've tried editing the regex so it doesn't match this, but my perl-fu is admittedly pretty weak. So what I end up with is a color code on each line of these dependency files, which obviously causes the build to fail.
I ended up just replacing the check for ! -t STDOUT with a check for a NO_COLOR envar that I set and unset in the build script for these directories (this emulates the previous behavior of no color for a non-tty). This works great if I run the full script, but not if I cd into the directory and just run make (obviously setting and unsetting the variable manually would work, but this is a pain to do every time). Anyone have any ideas how to prevent this script from writing color codes into dependency files?
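For reference, the build-script side of that workaround amounts to roughly this (the directory name is made up):
export NO_COLOR=1                  # tell the patched colorgcc to skip colorizing
make -C third-party-with-dot-d-files
unset NO_COLOR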
Here's how I worked around this. I added the following to colorgcc to search the gcc input for the flag to generate the .d files and just directly called the compiler in that case. This was inserted in place of the original TTY check.
# If gcc was invoked with a dependency-generation flag (-M, -MM, -MD, ...),
# skip colorization and exec the real compiler directly.
foreach my $argnum (0 .. $#ARGV)
{
    if ($ARGV[$argnum] =~ m/-M{1,2}/)
    {
        exec $compiler, @ARGV
            or die("Couldn't exec");
    }
}
I don't know if this is the proper 'perl' way of doing this sort of operation but it seems to work. Compiling inside directories that build .d files no longer inserts color codes and the source file builds do (both to terminal and my log files like I wanted). I guess sometimes the answer is more hacks instead of "hey, did you try giving up?".

Execute robocopy from PowerShell continuously between two established times

I have a program that creates temporary files in a specific folder. Then, automatically, after a few seconds, these files are deleted.
I want to copy those temporary files to a specific folder, and I would like to use a PowerShell script to do this:
robocopy startFolder destinationFolder *.TIFF *.JPEG *.jpg *.PNG *.GIF *.BMP *.ICO *.PBM *.PGM *.PPM /s /XO
My problem is that I can't use a scheduled task (because of its limitation around second-level intervals), and installing this PowerShell script as a Windows service is, as far as I know, bad practice. I need this script running all the time, trying to grab the files the moment they are created, before they are deleted.
Could you give me a hand please? Thanks!
Not sure it's quite what you want, but robocopy does have directory-monitoring functionality built in. You could add /mon:1, which should monitor the source directory and re-run the copy when it detects one change (a new or changed file, for example).
However, a down-side of this perhaps is that using this method, robocopy won't exit - it will run until you kill it.
Edit: I've just noticed you specify in your question title that this should run between two established times, in which case you could add the /rh:hhmm-hhmm option to specify times between which new copies can be started. For example, /rh:1000-1200 should only perform the copies (and hence monitoring) between 10am and midday.
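For example, building on the command in the question (untested, and using the same 10am-midday window as above):
robocopy startFolder destinationFolder *.TIFF *.JPEG *.jpg *.PNG *.GIF *.BMP *.ICO *.PBM *.PGM *.PPM /s /XO /mon:1 /rh:1000-1200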
Caveat: I've not tried using the "monitor" option of robocopy, so I'm not sure what sort of delay there would be between a change taking place, and the copy being re-run, but it's worth a shot.

Is it possible to search through all of Xcode's logs

Xcode now keeps the logs from previous runs handy, which is great.
Is there a way to search through all of the logs?
My use case is that I have seen a particular error but can't remember which run it was in. I need to find the error URL from the logs.
Xcode stores debug logs at
~/Library/Developer/Xcode/DerivedData/<YOURAPP>/Logs/Debug/
The .xcactivitylog files are actually just gz archives. Decompress them:
cd ~/Library/Developer/Xcode/DerivedData/<YOURAPP>/Logs/Debug/
EXT=".xcactivitylog"
for LOG in *"$EXT"; do
    NAME=$(basename "$LOG" "$EXT")
    gunzip -c -S "$EXT" "${NAME}${EXT}" > "${NAME}.log"
done
Now you can easily search them using grep or Spotlight or whatever you prefer.
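For instance, still inside that Debug directory (the search term is a placeholder):
grep -n "http" *.log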
To add onto @DrummerB's answer: once the files are unzipped, you can do a search with a custom scope from within Xcode. I prefer this to grep or Spotlight.
The folder where these logs are is
~/Library/Developer/Xcode/DerivedData/[YOURAPPID]/Logs/Debug/
You can open/read/search them for example in TextWrangler.

How can I resume downloads in Perl?

I have a project that depends upon some other binaries being downloaded from the web at install time. For this, what I do is:
# roughly, in shell ($file and $url are placeholders):
if [ -e "src/$file" ]; then
    # skip that file
    :
else
    # use wget to download the file
    wget "$url"
fi
The problem with this approach is that when I interrupt a download in the middle and invoke the script the next time, the partially downloaded file is also skipped (which is not desired); I also want wget to resume the download of the partially downloaded file.
How should I go about it?
Possible solutions I could think of:
Let the file be downloaded to a temporary name, say download_tmp, and move it to the original file name if successful.
Handle SIG{'INT'} to write proper cleanup code.
But none of these helps resume the partial file download.
Any insights?
First, I don't understand what this has to do with Perl, since you're using wget to do the downloading ... You could use libwww-perl (perldoc LWP) and have more control over the download process.
Then I second your idea of downloading to a "tmp" filename and moving the file on success.
However, I think you need to go further and verify the integrity of the files. Computing an MD5 or SHA hash is very easy; match the downloaded file's hash against what you're expecting. You can keep a short file on the server containing the checksum (filename.md5). Declare success only when you have a match.
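A minimal sketch of that in shell; the URL, the partial/ scratch directory, and the "<name>.md5" convention on the server are all assumptions:
#!/bin/sh
url="http://example.com/binaries/tool.bin"       # placeholder URL
file=$(basename "$url")
mkdir -p partial src

[ -e "src/$file" ] && exit 0                     # already installed: skip

wget -c -P partial "$url" || exit 1              # -c resumes a partial download
wget -q -O "partial/$file.md5" "$url.md5" || exit 1

expected=$(awk '{print $1}' "partial/$file.md5")
actual=$(md5sum "partial/$file" | awk '{print $1}')

if [ "$expected" = "$actual" ]; then
    mv "partial/$file" "src/$file"               # only now count the download as done
else
    echo "checksum mismatch for $file" >&2
    exit 1
fi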
Note that catching all the signals and generally trying to make the process unkillable, and then expecting it to have worked is bound to fail at one point or another. There could be a network timeout, a crash, power failure, configuration problem on the server ... you should instead assume downloads can fail, because they will, and code so that your process can recover.
Finally, you're not telling us what kind of binaries you're downloading and what you're doing with them. Since you use wget, I'm going to assume you're on Unix; you should consider using RPM+Yum or the like, which handle all this for you. RPMs are easy to write, really.
Use your first approach:
download to "FileName".tmp
move "FileName".tmp to "FileName" (move, not copy!)
once per diem, clean out all .tmp files (paranoia rulez)
You could just use wget's -N and -c options and remove the entire "if file exists" logic.
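For example (the URL and target directory are placeholders):
# -c resumes a partially-downloaded file; -N only fetches when the remote copy is newer
wget -c -N -P src/ "http://example.com/binaries/tool.bin"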