Overwrite a single line in a text file - perl

I am looking for a safe and reliable way to overwrite ONE line in a text file. I don't care if it's using sed, grep, perl whatever. It just needs to be portable and reliable. Specifically what I am trying to do is replace the value of a variable I have saved in a text file at runtime. Let's say I have a file named variables.txt which contains a line that reads userName=stephen. Let's say my program wants to change the userName to frank. Here's what I've come up with using sed:
sed -i '' 's/userName.*/userName=frank/' variables.txt
The concern I have with this is that I've read that on some versions of sed using the '-i' switch without specifying a backup file could cause the command to fail or risk possible file corruption. Not an option. What do you guys think?
Edit
For those asking where I read about command failure and file corruption. The manpage for my version of sed recommends against providing an empty value for the -i switch as well as take a look at the comments on this page here:
It seems that some versions of sed require the argument after -i and others do not. With GNU sed version 4.1.x, it seems that the -i does not require an argument and specifying an empty argument after it actually fails.
It sounds like the unanimous recommendation is to provide a backup file and then delete it after the command completes. However I'm still concerned about this solution since my version of sed doesn't even support the --version switch. My primary concern here is that the solution is both reliable and portable.

t=`tempfile`
sed -e 's/userName.*/userName=frank/' variables.txt >$tempfile
cp $tempfile >variables.txt
rm $tempfile
you can also use mv but that won't preserve file rights
if tempfile is not available then use some other method ($$.bak) to create the filename.

As long as you don't run on windows, sed -i is as safe as anything. Even if the machine crashes mid-process, variables.txt will either have the old content or the new content -- it should never be missing or corrupt.

The concern I have with this is that I've read that on some versions of sed using the '-i' switch without specifying a backup file could cause the command to fail or risk possible file corruption.
Where have you read it?
It is not strictly true: this is not an usual problem in sed but all programs (including sed) can fail and in very unlikely situations corrupt data. So you should not be too afraid of data corruption with sed.
Anyway, why do you not use -i with an extension (such as -i.bak) for granting more safety? In any case you can erase the backup file with rm...

You could copy variables.txt to variables.txt.bak before running the command if you wanted to keep a backup. Or go the other way around:
sed 's/userName.*/userName=frank/' variables.txt > variables.txt.fixed
if [ $? -eq 0 ]
then
cp variables.txt.fixed variables.txt
fi

Related

Can we wget with file list and renaming destination files?

I have this wget command:
sudo wget --user-agent='some-agent' --referer=http://some-referrer.html -N -r -nH --cut-dirs=x --timeout=xxx --directory-prefix=/directory/for/downloaded/files -i list-of-files-to-download.txt
-N will check if there is actually a newer file to download.
-r will turn the recursive retrieving on.
-nH will disable the generation of host-prefixed directories.
--cut-dirs=X will avoid the generation of the host's subdirectories.
--timeout=xxx will, well, timeout :)
--directory-prefix will store files in the desired directorty.
This works nice, no problem.
Now, to the issue:
Let's say my files-to-download.txt has these kind of files:
http://website/directory1/picture-same-name.jpg
http://website/directory2/picture-same-name.jpg
http://website/directory3/picture-same-name.jpg
etc...
You can see the problem: on the second download, wget will see we already have a picture-same-name.jpg, so it won't download the second or any of the following ones with the same name. I cannot mirror the directory structure because I need all the downloaded files to be in the same directory. I can't use the -O option because it clashes with --N, and I need that. I've tried to use -nd, but doesn't seem to work for me.
So, ideally, I need to be able to:
a.- wget from a list of url's the way I do now, keeping my parameters.
b.- get all files at the same directory and being able to rename each file.
Does anybody have any solution to this?
Thanks in advance.
I would suggest 2 approaches -
Use the "-nc" or the "--no-clobber" option. From the man page -
-nc
--no-clobber
If a file is downloaded more than once in the same directory, >Wget's behavior depends on a few options, including -nc. In certain >cases, the local file will be
clobbered, or overwritten, upon repeated download. In other >cases it will be preserved.
When running Wget without -N, -nc, -r, or -p, downloading the >same file in the same directory will result in the original copy of file >being preserved and the second copy
being named file.1. If that file is downloaded yet again, the >third copy will be named file.2, and so on. (This is also the behavior >with -nd, even if -r or -p are in
effect.) When -nc is specified, this behavior is suppressed, >and Wget will refuse to download newer copies of file. Therefore, ""no->clobber"" is actually a misnomer in
this mode---it's not clobbering that's prevented (as the >numeric suffixes were already preventing clobbering), but rather the >multiple version saving that's prevented.
When running Wget with -r or -p, but without -N, -nd, or -nc, >re-downloading a file will result in the new copy simply overwriting the >old. Adding -nc will prevent this
behavior, instead causing the original version to be preserved >and any newer copies on the server to be ignored.
When running Wget with -N, with or without -r or -p, the >decision as to whether or not to download a newer copy of a file depends >on the local and remote timestamp and
size of the file. -nc may not be specified at the same time as >-N.
A combination with -O/--output-document is only accepted if the >given output file does not exist.
Note that when -nc is specified, files with the suffixes .html >or .htm will be loaded from the local disk and parsed as if they had been >retrieved from the Web.
As you can see from this man page entry, the behavior might be unpredictable/unexpected. You will need to see if it works for you.
Another approach would be to use a bash script. I am most comfortable using bash on *nix, so forgive the platform dependency. However the logic is sound, and with a bit of modifications, you can get it to work on other platforms/scripts as well.
Sample pseudocode bash script -
for i in `cat list-of-files-to-download.txt`;
do
wget <all your flags except the -i flag> $i -O /path/to/custom/directory/filename ;
done ;
You can modify the script to download each file to a temporary file, parse $i to get the filename from the URL, check if the file exists on the disk, and then take a decision to rename the temp file to the name that you want.
This offers much more control over your downloads.

sed command is not working properly

I'm trying to replace the word in shell script with sed -e command but its not replacing , please help on that, i have tried the below
we have separate file in /data/docs/config.log, in that file there is a word ?account for example ,
username acc, passsword acc, ?account.name
this ?account word needs to be replaced with word 'GLOBAL' using sed -e command ,
reacc = GLOBAL
sed -e "s/?account/$reacc/g" /data/docs/config.log > /data/docs/newconfig.log
but here the file newconfig.log has created with 0 size , no output written to the file , its not replacing its an empty file,
the output should be username acc, passsword acc, GLOBAL.name in newconfig.log
Being the only person who can reproduce the problem, you are pretty much on your own. There are plenty of things you can do to analyze the problem, though.
Double-check the shell. Don't have blind faith in #!/bin/sh. In cygwin for example, /bin/sh is an alias for bash. Verify with: echo $SHELL
Check permissions and file system. Do you have rights to write to the output file? Is the disk full? Does cat /data/docs/config.log > /data/docs/newconfig.log work? Test again in a different folder.
Double-check the output file. Is it really empty, or is the file system just slow with updating the file size? Is sed really finished? Test without output redirection; see if the output is dumped to stdout.
Test with a small file; one or two lines is enough.
If even that does not work, then test sed itself. Who knows, maybe you have a weird alias that hides the real sed. The most trivial filter is sed -e '', which should simply echo every line you type (just like cat without parameters). Does that work? Then try some simple patterns.
Systematically iterate between test cases that succeed and test case that fail, until you have found the breaking point. Doing so, you should be able to find the cause. Sorry, that's all I can do for you right now.
Remove spaces around =. Try after making
reacc=GLOBAL

How do I run the sed command with input and output as the same file?

I'm trying to do use the sed command in a shell script where I want to remove lines that read STARTremoveThisComment and lines that read removeThisCommentEND.
I'm able to do it when I copy it to a new file using
sed 's/STARTremoveThisComment//' > test
But how do I do this by using the same file as input and output?
sed -i (or the extended version, --in-place) will automate the process normally done with less advanced implementations, that of sending output to temporary file, then renaming that back to the original.
The -i is for in-place editing, and you can also provide a backup suffix for keeping a copy of the original:
sed -i.bak fileToChange
sed --in-place=.bak fileToChange
Both of those will keep the original file in fileToChange.bak.
Keep in mind that in-place editing may not be available in all sed implementations but it is in GNU sed which should be available on all variants of Linux, as per your tags.
If you're using a more primitive implementation, you can use something like:
cp oldfile oldfile.bak && sed 'whatever' oldfile >newfile && mv newfile oldfile
You can use the flag -i for in-place editing and the -e for specifying normal script expression:
sed -i -e 's/pattern_to_search/text_to_replace/' file.txt
To delete lines that match a certain pattern you can use the simpler syntax. Notice the d flag:
sed -i '/pattern_to_search/d' file.txt
You really should not use sed for that. This question seems to come up ridiculously often, and it seems very strange that it does since the general solution is so trivial. It seems bizarre that people want to know how to do it in sed, and in python, and in ruby, etc. If you want to have a filter operate on an input and overwrite it, use the following simple script:
#!/bin/sh -e
in=${1?No input file specified}
mv $in ${bak=.$in.bak}
shift
"$#" < $bak > $in
Put that in your path in an executable file name inline, and then the problem is solved in general. For example:
inline input-file sed -e s/foo/bar/g
Now, if you want to add logic to keep multiple backups, or if you have some options to change the backup naming scheme, or whatever, you fix it in one place. What's the command line option to get 1-up counters on the backup file when processing a file in-place with perl? What about with ruby? Is the option different for gnu-sed? How does awk handle it? The whole friggin' point of unix is that tools do one thing only. Handling logic for backup files is a second thing, and needs to be factored out. If you are implementing a tool, do not add logic to create backup files. Tell your users to use a 2nd tool for that. Integration is bad. Modularity is good. That is the unix way.
Notice that this script has several problems. The permissions/mode of the input file may be changed, for example. I'm sure there are innumerable other issues. However, by putting the backup logic in a wrapper script, you localize all of these issues and don't have to worry that sed overwrites the files and changes mode, while python keeps the file in place and does not change the inode (I made up those two cases, the point being that not all tools will use the same logic, while the wrapper script will.)
As far as I know it is not possible to use the same file for input and output. Though one solution is make a shell script which will save it to another file, delete the old input and rename the output to the input file name.
sed -e s/try/this/g input.file > output.file;mv output.file input.file
I suggest using sponge
sponge reads standard input and writes it out to the specified file.
Unlike a shell redirect, sponge soaks up all its input before writing
the output file. This allows constructing pipelines that read from and
write to the same file.
cat test | sed 's/STARTremoveThisComment//' | sponge test

Sed creates un-deleteable files in Windows

I'm trying to run the following command in Windows Server 2003 but sed creates a pile of files that I can't delete from the command line inside the current directory.
for /R %f in (*.*) do "C:\Program Files\gnuwin32\bin\sed.exe" -i "s/bad/good/g" "%f"
Does anyone have any suggestions? Mysteriously enough, I'm able to delete the files using Windows Explorer.
As requested, here are some example filenames:
sed0E3WZJ
sed5miXwt
sed6fzFKh
And, more troubleshooting info...
It occurs from both the command prompt & batch files
If I just need to run sed on a single directory, then I use sed "s/bad/good/g" *.* and everything is OK. Alas, I also need it to tackle all the subdirectories.
I only have Sed installed.
Sed is creating the files
I have replicated your setup and I have the following observations.
I dont think there is a problem in the loop. The simple command "C:\Program Files\gnuwin32\bin\sed.exe" -i "s/bad/good/g" . - creates the same set of temporary files.
The files are indeed created by sed. sed creates these temporary files when the "in place" (-i) option is turned on. In the normal course, sed actually deletes the files (that is what happens in cygwin) using a call to the 'unlink' library. In case of gnuwin32, it looks like the 'unlink' fails. I have not been able to figure out why. I took a guess that maybe the unlink call is dependent on the gnuwin32 'coreutils' library and tried to download and install the coreutils library - no dice.
If you remove the 'read-only' restriction in the parent folder before executing the sed command, you can delete the temporary files from windows command prompt. So that should give you some temporary respite.
I think we now have enough information to raise a bug report. If you agree, I think it may be a good idea to bring it to the notice of the good folks responsible for gnuwin32 and ask them for help.
Meanwhile, the following version cleans up its temporary file:
https://github.com/mbuilov/sed-windows
As this is a known bug in sed with the -i option you can run attrib -R <filename> to remove the read only attribute from file after sed completes.
Alternatively do not use the -i option and redirect the output to a new file and then delete and rename the input and output.
Cygwin hoses the ACLs on files sometimes, you'll probably have to use cacls or chmod to fix it up before you can delete the file.
Here is where a bit of troubleshooting comes into play. Does this happen when you run that command from the command-line and a batch file? What if you run sed on an individual file on the command line - does it create these files for every file, or just certain files/filetypes? Does it only happen for that replacement, or all replacements in general, or just always when you run sed.exe on a file? Is it only sed creating these files, or all Gnuwin32 exe's (eg. awk, cat, etc)? Does the same thing happen on a sed.exe from a new install of Gnuwin32? What error message does it give when you try to delete the files? Can you delete the files from explorer while the command prompt is still open? What if you close the command prompt and reopen it, then try to delete the files?
you can just run sed without for loop
c:\test> sed -i.bak "s/bad/good/g" file*.*
This is a stab in the dark, but it wouldn't surprise me in the least if the gnuwin32 implementation of sed is duff (i.e., faulty in some way). Can you try to replicate the problem using the AT&T U/Win POSIX support for windows? It is easy to install and includes the Korn shell, sed, and find, so you can use find instead of the FOR /R. (I'm wondering if part of the problem is that the MS FOR and gnuwin32 sed don't play nicely together.)
I realize this is an old thread but It's still an issue. My fix is to add
"DEL sed*" to the end of a batch file after sed. Quick and dirty.
I am using this command to clean up the temporary files created by gnuwin32's sed:
FOR /f "tokens=*" %%a in ('dir /b ^| findstr /i "^sed[0-9a-zA-Z][0-9a-zA-Z][0-9a-zA-Z][0-9a-zA-Z][0-9a-zA-Z][0-9a-zA-Z]$"') DO del %%a
i know this is old. But just want to share with people what I did that cause this.
It was in fact the temp file for an already open file through gvim example .swap file that causing the sed tmp file didnt get remove completely.
So sed trying to read and append into the opened file which the user is currently viewing and having trouble doing it which causes the tmp file error.

Unable to use SED to edit files fast

The file is initially
$cat so/app.yaml
application: SO
...
I run the following command. I get an empty file.
$sed s/SO/so/ so/app.yaml > so/app.yaml
$cat so/app.yaml
$
How can you use SED to edit the file and not giving me an empty file?
$ sed -i -e's/SO/so/' so/app.yaml
The -i means in-place.
The > used in piping will open the output file when the pipes are all set up, i.e. before command execution. Thus, the input file is truncated prior to sed executing. This is a problem with all shell redirection, not just with sed.
Sheldon Young's answer shows how to use in-place editing.
You are using the wrong tool for the job. sed is a stream editor (that's why it's called sed), so it's for in-flight editing of streams in a pipe. ed OTOH is a file editor, which can do everything sed can do, except it works on files instead of streams. (Actually, it's the other way round: ed is the original utility and sed is a clone that avoids having to create temporary files for streams.)
ed works very much like sed (because sed is just a clone), but with one important difference: you can move around in files, but you can't move around in streams. So, all commands in ed take an address parameter that tells ed, where in the file to apply the command. In your case, you want to apply the command everywhere in the file, so the address parameter is just , because a,b means "from line a to line b" and the default for a is 1 (beginning-of-file) and the default for b is $ (end-of-file), so leaving them both out means "from beginning-of-file to end-of-file". Then comes the s (for substitute) and the rest looks much like sed.
So, your sed command s/SO/so/ turns into the ed command ,s/SO/so/.
And, again because ed is a file editor, and more precisely, an interactive file editor, we also need to write (w) the file and quit (q) the editor.
This is how it looks in its entirety:
ed -- so/app.yaml <<-HERE
,s/SO/so/
w
q
HERE
See also my answer to a similar question.
What happens in your case, is that executing a pipeline is a two-stage process: first construct the pipeline, then run it. > means "open the file, truncate it, and connect it to filedescriptor 1 (stdout)". Only then is the pipe actually run, i.e. sed is executed, but at this time, the file has already been truncated.
Some versions of sed also have a -i parameter for in-place editing of files, that makes sed behave a little more like ed, but using that is not advisable: first of all, it doesn't support all the features of ed, but more importantly, it is a non-standardized proprietary extension of GNU sed that doesn't work on many non-GNU systems. It's been a while since I used a non-GNU system, but last I used one, neither Solaris nor OpenBSD nor HP-UX nor IBM AIX sed supported the -i parameter.
I believe that redirecting output into the same file you are editing is causing your problem.
You need redirect standard output to some temporary file and when sed is done overwrite the original file by the temporary one.