patch to stdout with process substitution - diff

I am trying to do something like:
patch <( zcat data.201301.gz ) patch_file.diff -o /dev/stdout
result I am getting is:
File /dev/fd/63 is not a regular file -- refusing to patch
4504 out of 4504 hunks ignored -- saving rejects to file /dev/stdout.rej
How can I produce the patched file with process substitution ?

Looking at the source code of patch, this not seems to be possible.
File to be patched must be standard file, any other type of files are rejected.
Unfortunately symlinks, pipes, descriptors etc not working.

To process possible symlinks inside scripts, this helps:
patch -o "$out" "$(readlink "$in")" "$diff"

Related

"log=..." command-line parameter to send script output to STDOUT? [duplicate]

I'm working with a command line utility that requires passing the name of a file to write output to, e.g.
foo -o output.txt
The only thing it writes to stdout is a message that indicates that it ran successfully. I'd like to be able to pipe everything that is written to output.txt to another command line utility. My motivation is that output.txt will end up being a 40 GB file that I don't need to keep, and I'd rather pipe the streams than work on massive files in a stepwise manner.
Is there any way in this scenario to pipe the real output (i.e. output.txt) to another command? Can I somehow magically pass stdout as the file argument?
Solution 1: Using process substitution
The most convenient way of doing this is by using process substitution. In bash the syntax looks as follows:
foo -o >(other_command)
(Note that this is a bashism. There's similar solutions for other shells, but bottom line is that it's not portable.)
Solution 2: Using named pipes explicitly
You can do the above explicitly / manually as follows:
Create a named pipe using the mkfifo command.
mkfifo my_buf
Launch your other command with that file as input
other_command < my_buf
Execute foo and let it write it's output to my_buf
foo -o my_buf
Solution 3: Using /dev/stdout
You can also use the device file /dev/stdout as follows
foo -o /dev/stdout | other_command
Named pipes work fine, but you have a nicer, more direct syntax available via bash process substitution that has the added benefit of not using a permanent named pipe that must later be deleted (process substitution uses temporary named pipes behind the scenes):
foo -o >(other command)
Also, should you want to pipe the output to your command and also save the output to a file, you can do this:
foo -o >(tee output.txt) | other command
For the sake of making stackoverflow happy let me write a long enough sentence because my proposed solution is only 18 characters long instead of the required 30+
foo -o /dev/stdout
You could use the magic of UNIX and create a named pipe :)
Create the pipe
$ mknod -p mypipe
Start the process that reads from the pipe
$ second-process < mypipe
Start the process, that writes into the pipe
$ foo -o mypipe
foo -o <(cat)
if for some reason you don't have permission to write to /dev/stdout
I use /dev/tty as the output filename, equivalent to using /dev/nul/ when you want to output nothing at all. Then | and you are done.

Can we wget with file list and renaming destination files?

I have this wget command:
sudo wget --user-agent='some-agent' --referer=http://some-referrer.html -N -r -nH --cut-dirs=x --timeout=xxx --directory-prefix=/directory/for/downloaded/files -i list-of-files-to-download.txt
-N will check if there is actually a newer file to download.
-r will turn the recursive retrieving on.
-nH will disable the generation of host-prefixed directories.
--cut-dirs=X will avoid the generation of the host's subdirectories.
--timeout=xxx will, well, timeout :)
--directory-prefix will store files in the desired directorty.
This works nice, no problem.
Now, to the issue:
Let's say my files-to-download.txt has these kind of files:
http://website/directory1/picture-same-name.jpg
http://website/directory2/picture-same-name.jpg
http://website/directory3/picture-same-name.jpg
etc...
You can see the problem: on the second download, wget will see we already have a picture-same-name.jpg, so it won't download the second or any of the following ones with the same name. I cannot mirror the directory structure because I need all the downloaded files to be in the same directory. I can't use the -O option because it clashes with --N, and I need that. I've tried to use -nd, but doesn't seem to work for me.
So, ideally, I need to be able to:
a.- wget from a list of url's the way I do now, keeping my parameters.
b.- get all files at the same directory and being able to rename each file.
Does anybody have any solution to this?
Thanks in advance.
I would suggest 2 approaches -
Use the "-nc" or the "--no-clobber" option. From the man page -
-nc
--no-clobber
If a file is downloaded more than once in the same directory, >Wget's behavior depends on a few options, including -nc. In certain >cases, the local file will be
clobbered, or overwritten, upon repeated download. In other >cases it will be preserved.
When running Wget without -N, -nc, -r, or -p, downloading the >same file in the same directory will result in the original copy of file >being preserved and the second copy
being named file.1. If that file is downloaded yet again, the >third copy will be named file.2, and so on. (This is also the behavior >with -nd, even if -r or -p are in
effect.) When -nc is specified, this behavior is suppressed, >and Wget will refuse to download newer copies of file. Therefore, ""no->clobber"" is actually a misnomer in
this mode---it's not clobbering that's prevented (as the >numeric suffixes were already preventing clobbering), but rather the >multiple version saving that's prevented.
When running Wget with -r or -p, but without -N, -nd, or -nc, >re-downloading a file will result in the new copy simply overwriting the >old. Adding -nc will prevent this
behavior, instead causing the original version to be preserved >and any newer copies on the server to be ignored.
When running Wget with -N, with or without -r or -p, the >decision as to whether or not to download a newer copy of a file depends >on the local and remote timestamp and
size of the file. -nc may not be specified at the same time as >-N.
A combination with -O/--output-document is only accepted if the >given output file does not exist.
Note that when -nc is specified, files with the suffixes .html >or .htm will be loaded from the local disk and parsed as if they had been >retrieved from the Web.
As you can see from this man page entry, the behavior might be unpredictable/unexpected. You will need to see if it works for you.
Another approach would be to use a bash script. I am most comfortable using bash on *nix, so forgive the platform dependency. However the logic is sound, and with a bit of modifications, you can get it to work on other platforms/scripts as well.
Sample pseudocode bash script -
for i in `cat list-of-files-to-download.txt`;
do
wget <all your flags except the -i flag> $i -O /path/to/custom/directory/filename ;
done ;
You can modify the script to download each file to a temporary file, parse $i to get the filename from the URL, check if the file exists on the disk, and then take a decision to rename the temp file to the name that you want.
This offers much more control over your downloads.

How do I run the sed command with input and output as the same file?

I'm trying to do use the sed command in a shell script where I want to remove lines that read STARTremoveThisComment and lines that read removeThisCommentEND.
I'm able to do it when I copy it to a new file using
sed 's/STARTremoveThisComment//' > test
But how do I do this by using the same file as input and output?
sed -i (or the extended version, --in-place) will automate the process normally done with less advanced implementations, that of sending output to temporary file, then renaming that back to the original.
The -i is for in-place editing, and you can also provide a backup suffix for keeping a copy of the original:
sed -i.bak fileToChange
sed --in-place=.bak fileToChange
Both of those will keep the original file in fileToChange.bak.
Keep in mind that in-place editing may not be available in all sed implementations but it is in GNU sed which should be available on all variants of Linux, as per your tags.
If you're using a more primitive implementation, you can use something like:
cp oldfile oldfile.bak && sed 'whatever' oldfile >newfile && mv newfile oldfile
You can use the flag -i for in-place editing and the -e for specifying normal script expression:
sed -i -e 's/pattern_to_search/text_to_replace/' file.txt
To delete lines that match a certain pattern you can use the simpler syntax. Notice the d flag:
sed -i '/pattern_to_search/d' file.txt
You really should not use sed for that. This question seems to come up ridiculously often, and it seems very strange that it does since the general solution is so trivial. It seems bizarre that people want to know how to do it in sed, and in python, and in ruby, etc. If you want to have a filter operate on an input and overwrite it, use the following simple script:
#!/bin/sh -e
in=${1?No input file specified}
mv $in ${bak=.$in.bak}
shift
"$#" < $bak > $in
Put that in your path in an executable file name inline, and then the problem is solved in general. For example:
inline input-file sed -e s/foo/bar/g
Now, if you want to add logic to keep multiple backups, or if you have some options to change the backup naming scheme, or whatever, you fix it in one place. What's the command line option to get 1-up counters on the backup file when processing a file in-place with perl? What about with ruby? Is the option different for gnu-sed? How does awk handle it? The whole friggin' point of unix is that tools do one thing only. Handling logic for backup files is a second thing, and needs to be factored out. If you are implementing a tool, do not add logic to create backup files. Tell your users to use a 2nd tool for that. Integration is bad. Modularity is good. That is the unix way.
Notice that this script has several problems. The permissions/mode of the input file may be changed, for example. I'm sure there are innumerable other issues. However, by putting the backup logic in a wrapper script, you localize all of these issues and don't have to worry that sed overwrites the files and changes mode, while python keeps the file in place and does not change the inode (I made up those two cases, the point being that not all tools will use the same logic, while the wrapper script will.)
As far as I know it is not possible to use the same file for input and output. Though one solution is make a shell script which will save it to another file, delete the old input and rename the output to the input file name.
sed -e s/try/this/g input.file > output.file;mv output.file input.file
I suggest using sponge
sponge reads standard input and writes it out to the specified file.
Unlike a shell redirect, sponge soaks up all its input before writing
the output file. This allows constructing pipelines that read from and
write to the same file.
cat test | sed 's/STARTremoveThisComment//' | sponge test

Overwrite a single line in a text file

I am looking for a safe and reliable way to overwrite ONE line in a text file. I don't care if it's using sed, grep, perl whatever. It just needs to be portable and reliable. Specifically what I am trying to do is replace the value of a variable I have saved in a text file at runtime. Let's say I have a file named variables.txt which contains a line that reads userName=stephen. Let's say my program wants to change the userName to frank. Here's what I've come up with using sed:
sed -i '' 's/userName.*/userName=frank/' variables.txt
The concern I have with this is that I've read that on some versions of sed using the '-i' switch without specifying a backup file could cause the command to fail or risk possible file corruption. Not an option. What do you guys think?
Edit
For those asking where I read about command failure and file corruption. The manpage for my version of sed recommends against providing an empty value for the -i switch as well as take a look at the comments on this page here:
It seems that some versions of sed require the argument after -i and others do not. With GNU sed version 4.1.x, it seems that the -i does not require an argument and specifying an empty argument after it actually fails.
It sounds like the unanimous recommendation is to provide a backup file and then delete it after the command completes. However I'm still concerned about this solution since my version of sed doesn't even support the --version switch. My primary concern here is that the solution is both reliable and portable.
t=`tempfile`
sed -e 's/userName.*/userName=frank/' variables.txt >$tempfile
cp $tempfile >variables.txt
rm $tempfile
you can also use mv but that won't preserve file rights
if tempfile is not available then use some other method ($$.bak) to create the filename.
As long as you don't run on windows, sed -i is as safe as anything. Even if the machine crashes mid-process, variables.txt will either have the old content or the new content -- it should never be missing or corrupt.
The concern I have with this is that I've read that on some versions of sed using the '-i' switch without specifying a backup file could cause the command to fail or risk possible file corruption.
Where have you read it?
It is not strictly true: this is not an usual problem in sed but all programs (including sed) can fail and in very unlikely situations corrupt data. So you should not be too afraid of data corruption with sed.
Anyway, why do you not use -i with an extension (such as -i.bak) for granting more safety? In any case you can erase the backup file with rm...
You could copy variables.txt to variables.txt.bak before running the command if you wanted to keep a backup. Or go the other way around:
sed 's/userName.*/userName=frank/' variables.txt > variables.txt.fixed
if [ $? -eq 0 ]
then
cp variables.txt.fixed variables.txt
fi

Unable to use SED to edit files fast

The file is initially
$cat so/app.yaml
application: SO
...
I run the following command. I get an empty file.
$sed s/SO/so/ so/app.yaml > so/app.yaml
$cat so/app.yaml
$
How can you use SED to edit the file and not giving me an empty file?
$ sed -i -e's/SO/so/' so/app.yaml
The -i means in-place.
The > used in piping will open the output file when the pipes are all set up, i.e. before command execution. Thus, the input file is truncated prior to sed executing. This is a problem with all shell redirection, not just with sed.
Sheldon Young's answer shows how to use in-place editing.
You are using the wrong tool for the job. sed is a stream editor (that's why it's called sed), so it's for in-flight editing of streams in a pipe. ed OTOH is a file editor, which can do everything sed can do, except it works on files instead of streams. (Actually, it's the other way round: ed is the original utility and sed is a clone that avoids having to create temporary files for streams.)
ed works very much like sed (because sed is just a clone), but with one important difference: you can move around in files, but you can't move around in streams. So, all commands in ed take an address parameter that tells ed, where in the file to apply the command. In your case, you want to apply the command everywhere in the file, so the address parameter is just , because a,b means "from line a to line b" and the default for a is 1 (beginning-of-file) and the default for b is $ (end-of-file), so leaving them both out means "from beginning-of-file to end-of-file". Then comes the s (for substitute) and the rest looks much like sed.
So, your sed command s/SO/so/ turns into the ed command ,s/SO/so/.
And, again because ed is a file editor, and more precisely, an interactive file editor, we also need to write (w) the file and quit (q) the editor.
This is how it looks in its entirety:
ed -- so/app.yaml <<-HERE
,s/SO/so/
w
q
HERE
See also my answer to a similar question.
What happens in your case, is that executing a pipeline is a two-stage process: first construct the pipeline, then run it. > means "open the file, truncate it, and connect it to filedescriptor 1 (stdout)". Only then is the pipe actually run, i.e. sed is executed, but at this time, the file has already been truncated.
Some versions of sed also have a -i parameter for in-place editing of files, that makes sed behave a little more like ed, but using that is not advisable: first of all, it doesn't support all the features of ed, but more importantly, it is a non-standardized proprietary extension of GNU sed that doesn't work on many non-GNU systems. It's been a while since I used a non-GNU system, but last I used one, neither Solaris nor OpenBSD nor HP-UX nor IBM AIX sed supported the -i parameter.
I believe that redirecting output into the same file you are editing is causing your problem.
You need redirect standard output to some temporary file and when sed is done overwrite the original file by the temporary one.