sed overwrites output with another part of replacement pattern - sed

I would like to reorder some events from my .ics calendar. However, for some reason sed overwrites parts of the output.
# original text
$ cat test
BEGIN:VEVENT
DTSTART:20151230
SUMMARY:Blanka Palakova
RRULE:FREQ=YEARLY
DURATION:P1D
END:VEVENT
# command which should work
$ /bin/sed -r 's/^(SUMMARY:)(.*) (.*)$/\1\3, \2/g' test
BEGIN:VEVENT
DTSTART:20151230
, BlankaPalakova
RRULE:FREQ=YEARLY
DURATION:P1D
END:VEVENT
# desired output
$ cat test
BEGIN:VEVENT
DTSTART:20151230
SUMMARY:Palakova, Blanka
RRULE:FREQ=YEARLY
DURATION:P1D
END:VEVENT
Also, I would like the split to happen at the last space before the end of the line, because some of my events have middle names, too.
sed (GNU sed) 4.2.2
GNU bash, version 4.3.39(1)-release (x86_64-unknown-linux-gnu)

The source of the problem
The problem is in the file's line-endings, not the command.
When I run your command, the output is correct:
$ /bin/sed -r 's/^(SUMMARY:)(.*) (.*)$/\1\3, \2/g' test
BEGIN:VEVENT
DTSTART:20151230
SUMMARY:Palakova, Blanka
RRULE:FREQ=YEARLY
DURATION:P1D
END:VEVENT
If I convert your input file to DOS/Windows line-endings, \r\n, then the same problem that you experienced occurs:
$ unix2dos <test >test.dos
$ /bin/sed -r 's/^(SUMMARY:)(.*) (.*)$/\1\3, \2/g' test.dos
BEGIN:VEVENT
DTSTART:20151230
, BlankaPalakova
RRULE:FREQ=YEARLY
DURATION:P1D
END:VEVENT
What happened is that the \r at the end of the line is captured in group 3, \3, and so ends up in the middle of the output line. \r is a carriage return: it moves the cursor (the "carriage") back to the start of the line without advancing to the next line. The terminal therefore prints , Blanka over what had already been printed there.
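One way to confirm this diagnosis is to look at the raw bytes of the output. The sketch below assumes a POSIX shell with od available and uses -E in place of -r (both select extended regular expressions in GNU sed). The one-line sample input is made up for illustration.

```shell
# One CRLF-terminated line, similar to the SUMMARY line in the question.
swapped=$(printf 'SUMMARY:Blanka Palakova\r\n' |
  sed -E 's/^(SUMMARY:)(.*) (.*)$/\1\3, \2/')

# Dump the result byte by byte: the \r sits between "Palakova" and ", Blanka".
printf '%s\n' "$swapped" | od -c
```

In the od output the carriage return is shown as \r, which makes it easy to see where it landed.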
Solutions
One solution is to convert the input file to unix line-endings, \n, with dos2unix or other utility.
Another solution is to make the sed command tolerant of DOS-Windows line-endings:
$ /bin/sed -r 's/^(SUMMARY:)(.*) ([^\r]*)/\1\3, \2/g' test.dos
BEGIN:VEVENT
DTSTART:20151230
SUMMARY:Palakova, Blanka
RRULE:FREQ=YEARLY
DURATION:P1D
END:VEVENT
Since regular expressions in sed are greedy, the expression ([^\r]*) will match either to the end of the line or to the first \r, whichever comes first.
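If dos2unix is not installed, tr -d '\r' can stand in for it. This is a minimal sketch, assuming no carriage returns are wanted anywhere in the file. The sample event with a middle name is invented to show that the greedy first (.*) makes the split happen at the last space.

```shell
# Hypothetical sample with CRLF line endings and a middle name.
sample=$(mktemp)
printf 'DTSTART:20151230\r\nSUMMARY:Blanka Anna Palakova\r\n' > "$sample"

# Strip every \r first, then reorder. The first (.*) is greedy, so the
# split happens at the last space on the line.
fixed=$(tr -d '\r' < "$sample" |
  sed -E 's/^(SUMMARY:)(.*) (.*)$/\1\3, \2/')
printf '%s\n' "$fixed"
rm -f "$sample"
```

The SUMMARY line comes out as SUMMARY:Palakova, Blanka Anna, and the DTSTART line is passed through unchanged.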

Related

UNIX Replacing a character sequence in either tr or sed

Have a file that has been created incorrectly. There are several space delimited fields in the file but one text field has some unwanted newlines. This is causing a big problem.
How can I remove these characters but not the wanted line ends?
file is:
'Number field' 'Text field' 'Number field'
1 Some text 999999
2 more
text 111111111
3 Even more text 8888888888
EOF
So there is a NL after the word "more".
I've tried sed:
sed 's/.$//g' test.txt > test.out
and
sed 's/\n//g' test.txt > test.out
But neither of these works. The newlines do not get removed.
tr -d '\n' does too much - I need to remove ONLY the newlines that are preceded by a space.
How can I delete newlines that follow a space?
SunOS 5.10 Generic_144488-09 sun4u sparc SUNW,Sun-Fire-V440
A sed solution is
sed '/ $/{N;s/\n//}'
Explanation:
/ $/: whenever the line ends in space, then
N: append a newline and the next line of input, and
s/\n//: delete the newline.
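As a quick check, the command can be run on the sample from the question. The sketch assumes that "2 more" is followed by a space before the stray newline, and it adds a ; before the closing brace, which some non-GNU seds are stricter about.

```shell
# Sample from the question; "2 more " ends in a space before the stray newline.
joined=$(printf '1 Some text 999999\n2 more \ntext 111111111\n3 Even more text 8888888888\n' |
  sed '/ $/{N;s/\n//;}')
printf '%s\n' "$joined"
```

Note that the joined line is not checked again: if two consecutive lines both end in a space, only the first newline is removed.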
It might be simplest with Perl:
perl -p0 -e 's/ \n/ /g'
The -0 flag sets Perl's input record separator to the NUL character, so a file that contains no NUL bytes is read as a single record. Then we can substitute using s in the usual way. You can, of course, also add the -i option to edit the file in-place.
How can I delete newlines that follow a space?
If you want every occurrence of $' \n' in the original file to be replaced by a space ($' '), and if you know of a character (e.g. a control character) that does not appear in the file, then the task can be accomplished quite simply using sed and tr (as you requested). Let's suppose, for example, that control-A is a character that is not in the file. For the sake of simplicity, let's also assume we can use bash. Then the following script should do the job:
#!/bin/bash
A=$'\01'
tr '\n' "$A" | sed "s/ $A/ /g" | tr "$A" '\n'

Using sed to keep the beginning of a line

I have a file in which some lines start by a >
For these lines, and only these ones, I want to keep the first eleven characters.
How can I do that using sed ?
Or maybe something else is better ?
Thanks !
Muriel
Let's start with this test file:
$ cat file
line one with something or other
>1234567890abc
other line in file
To keep only the first 11 characters of lines starting with > while keeping all other lines:
$ sed -r '/^>/ s/(.{11}).*/\1/' file
line one with something or other
>1234567890
other line in file
To keep only the first eleven characters of lines starting with > and delete all other lines:
$ sed -rn '/^>/ s/(.{11}).*/\1/p' file
>1234567890
The above was tested with GNU sed. For BSD sed, replace the -r option with -E.
Explanation:
/^>/ is a condition. It means that the command which follows only applies to lines that start with >
s/(.{11}).*/\1/ is a substitution command. It replaces the whole line with just the first eleven characters.
-r turns on extended regular expression format, eliminating the need for some escape characters.
-n turns off automatic printing. With -n in effect, lines are only printed if we explicitly ask them to be printed. In the second case above, that is done by adding a p after the substitute command.
Other forms:
$ sed -r 's/(>.{10}).*/\1/' file
line one with something or other
>1234567890
other line in file
And:
$ sed -rn 's/(>.{10}).*/\1/p' file
>1234567890
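If neither -r nor -E is available, the same substitution can be written with basic regular expressions, at the cost of a few backslashes. This is a sketch using the test file from above, recreated in a temporary file:

```shell
# Recreate the test file from the answer.
file=$(mktemp)
printf 'line one with something or other\n>1234567890abc\nother line in file\n' > "$file"

# BRE form: \( \) for the group and \{11\} for the repetition count.
trimmed=$(sed '/^>/ s/^\(.\{11\}\).*/\1/' "$file")
printf '%s\n' "$trimmed"
rm -f "$file"
```

The output should match the first -r example: only the line starting with > is shortened.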

sed creating ^M at end of line

I have a script on a centos server and I wrote the script on the server using VIM. The script is to edit a configuration file. When I check the configuration file after it has been edited, there is a ^M at the end of every line that was NOT edited. The lines that were edited are fine.
cat hibernate.properties |
sed -i.bk \
-e 's%\(^hibernate\.connection\.url\=ristor:jdbc:postgresql:\/\/127\.0\.0\.1/\).*%\'1$dbname'%' \
-e 's/\(^hibernate\.connection\.username\=\).*/\'1$dbuser'/' \
-e 's/\(^hibernate\.connection\.password\=\).*/\'1$pws'/' hibernate.properties
This is the code that is being used to edit the configuration file. Why is it putting ^M at the end of every line that is NOT edited?
The ^M characters being shown are probably Windows-style line endings on some lines. Try running your file through dos2unix before running your script.
For example:
dos2unix hibernate.properties
Your script is not likely to be adding \r. It is more likely that the file already had them, and that vim detected it as a dos-format file and hid them. Your script removed the \r from each line it touched, so vim no longer considers the file dos and therefore shows the carriage returns that are still left. Once you remove them (%s/<Ctrl-V><Ctrl-M>$// in vim should do), it is not likely to happen again.
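To check this from the shell rather than from vim, you can count the lines that still end in a carriage return. The sketch below runs on a made-up three-line file with mixed line endings, and the file and variable names are only for illustration.

```shell
# Hypothetical properties file with mixed line endings.
props=$(mktemp)
printf 'a=1\r\nb=2\nc=3\r\n' > "$props"

cr=$(printf '\r')                              # a literal carriage return
before=$(grep -c "${cr}\$" "$props" || true)   # lines ending in CR

# Remove a trailing CR from every line, writing a cleaned copy.
sed "s/${cr}\$//" "$props" > "$props.unix"
after=$(grep -c "${cr}\$" "$props.unix" || true)

echo "lines ending in CR: before=$before after=$after"
rm -f "$props" "$props.unix"
```

For this sample the count drops from 2 to 0. The || true is there because grep -c exits with a non-zero status when the count is zero.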

Using variables in sed -f (where sed script is in a file rather than inline)

We have a process which can use a file containing sed commands to alter piped input.
I need to replace a placeholder in the input with a variable value, e.g. in a single -e type of command I can run;
$ echo "Today is XX" | sed -e "s/XX/$(date +%F)/"
Today is 2012-10-11
However I can only specify the sed aspects in a file (and then point the process at the file), E.g. a file called replacements.sed might contain;
s/XX/Thursday/
So obviously;
$ echo "Today is XX" | sed -f replacements.sed
Today is Thursday
If I want to use an environment variable or shell value, though, I can't find a way to make it expand, e.g. if replacements.sed contains;
s/XX/$(date +%F)/
Then;
$ echo "Today is XX" | sed -f replacements.sed
Today is $(date +%F)
Including double quotes in the text of the file just prints the double quotes.
Does anyone know a way to be able to use variables in a sed file?
This might work for you (GNU sed):
cat <<\! > replacements.sed
/XX/{s//'"$(date +%F)"'/;s/.*/echo '&'/e}
!
echo "Today is XX" | sed -f replacements.sed
If you don't have GNU sed, try:
cat <<\! > replacements.sed
/XX/{
s//'"$(date +%F)"'/
s/.*/echo '&'/
}
!
echo "Today is XX" | sed -f replacements.sed | sh
AFAIK, it's not possible. Your best bet will be :
INPUT FILE
aaa
bbb
ccc
SH SCRIPT
#!/bin/bash
# bash is needed for the ${1//.../...} expansion below
STRING="${1//\//\\/}" # using parameter expansion to prevent / collisions
shift
sed "
s/aaa/$STRING/
" "$@"
COMMAND LINE
./sed.sh "fo/obar" <file path>
OUTPUT
fo/obar
bbb
ccc
As others have said, you can't use variables in a sed script, but you might be able to "fake" it using extra leading input that gets added to your hold buffer. For example:
[ghoti@pc ~/tmp]$ cat scr.sed
1{;h;d;};/^--$/g
[ghoti@pc ~/tmp]$ sed -f scr.sed <(date '+%Y-%m-%d'; printf 'foo\n--\nbar\n')
foo
2012-10-10
bar
[ghoti@pc ~/tmp]$
In this example, I'm using process redirection to get input into sed. The "important" data is generated by printf. You could cat a file instead, or run some other program. The "variable" is produced by the date command, and becomes the first line of input to the script.
The sed script takes the first line, puts it in sed's hold buffer, then deletes the line. Then for any subsequent line, if it matches a double dash (our "macro replacement"), it substitutes the contents of the hold buffer. And prints, because that's sed's default action.
Hold buffers (g, G, h, H and x commands) represent "advanced" sed programming. But once you understand how they work, they open up new dimensions of sed fu.
Note: This solution only helps you replace entire lines. Replacing substrings within lines may be possible using the hold buffer, but I can't imagine a way to do it.
(Another note: I'm doing this in FreeBSD, which uses a different sed from what you'll find in Linux. This may work in GNU sed, or it may not; I haven't tested.)
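For shells without process substitution, the same idea can be sketched with a plain pipe. Here the script is written one command per line, which should avoid differences in how various seds parse ; and braces, and a fixed date string stands in for the date command so the result is predictable.

```shell
# Same logic as scr.sed, one command per line.
script=$(mktemp)
cat > "$script" <<'EOF'
1{
h
d
}
/^--$/g
EOF

# The first input line carries the "variable"; the rest is the real data.
out=$( { echo '2012-10-10'; printf 'foo\n--\nbar\n'; } | sed -f "$script" )
printf '%s\n' "$out"
rm -f "$script"
```

The -- line is replaced by the first line of input, just as in the transcript above.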
I am in agreement with sputnick. I don't believe that sed would be able to complete that task.
However, you could generate that file on the fly.
You could change the date to a fixed string, like
__DAYOFWEEK__.
Create a temp file, use sed to replace __DAYOFWEEK__ with $(date +%A).
Then parse your file with sed -f $TEMPFILE.
sed is great, but it might be time to use something like perl that can generate the date on the fly.
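A minimal sketch of the temp-file approach, assuming the process only needs the path of a finished sed script. The placeholder __TODAY__ and the variable names are hypothetical, and the value substituted in must not contain / or &.

```shell
template=$(mktemp)
script=$(mktemp)

# Template sed script with a hypothetical placeholder, __TODAY__.
echo 's/XX/__TODAY__/' > "$template"

# Fill in the placeholder to produce the real sed script.
sed "s/__TODAY__/$(date +%F)/" "$template" > "$script"

# Point the process (here just sed -f) at the generated script.
result=$(echo "Today is XX" | sed -f "$script")
printf '%s\n' "$result"
rm -f "$template" "$script"
```

The shell expands $(date +%F) when the script file is generated, so sed itself never has to evaluate anything.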
To add a newline in the replacement expression using a sed file, what finally worked for me is escaping a literal newline. Example: to append a newline after the string NewLineHere, then this worked for me:
#! /usr/bin/sed -f
s/NewLineHere/NewLineHere\
/g
Not sure it matters but I am on Solaris unix, so not GNU sed for sure.
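The same script can be tried without making it executable by writing it to a temporary file and passing it to sed -f. The here-document delimiter is quoted so that the shell leaves the backslash-newline alone, and the input line is made up.

```shell
script=$(mktemp)
# Quoted 'EOF' keeps the backslash followed by a newline exactly as typed.
cat > "$script" <<'EOF'
s/NewLineHere/NewLineHere\
/g
EOF

out=$(printf 'before NewLineHere after\n' | sed -f "$script")
printf '%s\n' "$out"
rm -f "$script"
```

The input line is split in two right after NewLineHere.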

How to use patches created in windows (with CRLF) in linux?

The standard linux patch seems to be hard-coded for unix text files only.
PS: I do not want to convert ALL files to unix and then convert the result back.
I've run into this problem before a few times. This is what I've discovered:
The Linux patch command will not recognize a patchfile that has CRLF in the patch 'meta-lines'.
The line-endings of the actual patch content must match the line endings of files being patched.
So this is what I did:
Use dos2unix to convert patch files to LF line-endings only.
Use dos2unix to convert the files being patched to LF line-endings only.
Apply patch.
You can use unix2dos to convert patched files back to CRLF line-endings if you want to maintain that convention.
Use the --binary option. Here is the relevant snippet from the man page:
--binary
Write all files in binary mode, except for standard output and /dev/tty. When reading, disable
the heuristic for transforming CRLF line endings into LF line endings. This option is needed
on POSIX systems when applying patches generated on non-POSIX systems to non-POSIX files. (On
POSIX systems, file reads and writes never transform line endings. On Windows, reads and writes
do transform line endings by default, and patches should be generated by diff --binary when
line endings are significant.)
Combined:
dos2unix patchfile.diff
dos2unix $(grep 'Index:' patchfile.diff | awk '{print $2}')
patch --verbose -p0 -i patchfile.diff
unix2dos $(grep 'Index:' patchfile.diff | awk '{print $2}')
The last line depends on whether you want to keep the CRLFs or not.
M.
PS. This should've been a reply to cscrimge's post. DS.
This is a solution one of our guys came up with in our office, so I'm not taking credit for it but it works for me here.
We have a situation of mixed linux and windows line endings in the same file sometimes, and we also create patch files from windows and apply them on linux.
If you are experiencing a patch problem after creating your patch file on windows, or you have mixed line endings, then do this:
dos2unix patch-file
dos2unix $(sed -n 's/^Index: //p' patch-file)
patch -p0 -i patch-file
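The order of the first two commands matters: if the patch file still has CRLF endings, the file names pulled from its Index: lines would each end in a carriage return. The sketch below shows the extraction step on a made-up patch header, with tr -d '\r' standing in for dos2unix.

```shell
# Hypothetical patch header with CRLF line endings.
patchfile=$(mktemp)
printf 'Index: src/a.c\r\n--- src/a.c\r\n+++ src/a.c\r\nIndex: src/b.c\r\n' > "$patchfile"

# Strip the carriage returns first, then print only what follows "Index: ".
files=$(tr -d '\r' < "$patchfile" | sed -n 's/^Index: //p')
printf '%s\n' "$files"
rm -f "$patchfile"
```

This prints one path per line, which is the list the second dos2unix call receives.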
Use perl -i.bak -pe 's/\R/\n/g' inputfile to convert any line ending to the standard LF, keeping a backup of the original in inputfile.bak.