I have a file like this
{CRLF
sum: 21.46,CRLF
first: 99.10,CRLF
last: 57.71 CRLF
}CRLF
{CRLF
sum: 159.32,CRLF
first: 456.71,CRLF
last: 89.27 CRLF
}CRLF
...
P.S. CRLF stands for the Windows line break; it is not literal text in the file.
I want to add a comma at the end of every line containing "last:".
I used the following command
sed '/last/ s/$/,/' old.txt >new.txt
but I got a weird result
{CRLF
sum: 21.46,CRLF
first: 99.10,CRLF
last: 57.71 CR
,CRLF
}CRLF
{CRLF
sum: 159.32,CRLF
first: 456.71,CRLF
last: 89.27 CR
,CRLF
}CRLF
...
The comma isn't appended at the end of the line. Instead, it ends up on a new line. Any ideas would be greatly appreciated. Thanks.
Your data file has DOS-style (Windows-style) CRLF line endings. sed inserts the comma between the CR and the LF (because it doesn't know about CRLF line endings and CR is just another character before the end of line).
Edit your file to remove the DOS line endings: see How to convert DOS/Windows newline to Unix newline in bash script for information on how to do that. Or, as Beta pointed out, you can do that at the same time that you add the comma:
sed -e 's/.$//' -e '/last:/s/$/,/'
This is mildly dangerous if applied to a file with Unix line endings; it will remove the last character on those lines too. It might be better to embed the CR in the script:
sed -e $'s/\r$//' -e '/last:/s/$/,/'
which uses bash's ANSI-C Quoting mechanism to embed a CR into the command string.
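As a rough sketch of the same idea for shells that lack the $'...' quoting, the CR can be held in a variable built with printf. The sample input and the variable names (cr, fixed) are made up for illustration and are not from the question.

```shell
# Hold a literal carriage return in a variable (plain POSIX sh, no $'...' needed).
cr=$(printf '\r')

# Feed a small CRLF sample through sed: strip a trailing CR only where
# one is present, then append the comma to lines containing "last:".
fixed=$(printf '{\r\nsum: 21.46,\r\nlast: 57.71 \r\n}\r\n' |
  sed -e "s/${cr}\$//" -e '/last:/s/$/,/')

printf '%s\n' "$fixed"
```

Note that the result has Unix line endings; lines that never had a CR pass through untouched, which is the advantage over s/.$//.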
You're not completely consistent in your question. You say 'lines containing "last:"' but your code handles lines containing "last" (missing out the colon). Your choice.
Related
I'm attempting to create a single newline at the end of a file.
My command is this:
gsed -i '$a\\r' outfiles/*.txt
Somehow this creates two newlines, and I cannot figure out what I am doing wrong.
Any thoughts?
My first thought was to substitute the end of the last line with a newline.
sed '$s/$/\n/'
But my second thought is nicer:
sed '$G'
The G command appends a newline to the pattern space and then appends the contents of the hold space. Because the hold space is empty, the net effect is to add just the newline.
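A quick way to check that behaviour, assuming a two-line sample piped in with printf (the variable names are illustrative only), is to count lines before and after:

```shell
# Count lines in a two-line sample before and after applying $G.
before=$(printf 'a\nb\n' | wc -l)
after=$(printf 'a\nb\n' | sed '$G' | wc -l)

# On the last line, G appends "\n" plus the (empty) hold space,
# so exactly one extra empty line should appear at the end.
echo "lines before: $before, lines after: $after"
```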
Keep it clear and simple, just use gawk:
gawk -i inplace 'ENDFILE{print ""}' outfiles/*.txt
I am trying to add multiple lines to a file, all with a leading a tab. The lines should be inserted on the first line after matching a string.
Assume a file with only one line, called "my-file.txt" as follows:
foo
I have tried the following sed command:
sed "/^foo\$/a \tinsert1\n\tinsert2" my-file.txt
This produces the following output:
foo
tinsert1
insert2
Notice how the tab that should be on the first (inserted) line is omitted. Instead it prints an extra leading 't'.
Why? And how can I change my command to print the tab on the first line, as expected?
With GNU sed:
sed '/^foo$/a \\tinsert1\n\tinsert2' file
<---- single quotes! --->
Produces:
foo
insert1
insert2
From the manual:
a \
text Append text, which has each embedded newline preceded by a backslash.
Since the text to be appended itself has to be preceded by a backslash, it needs to be \\t at the beginning.
PS: If you need to use double quotes around the sed command because you want to inject shell variables, you need to escape the \ which precedes the text to be appended:
ins1="foo"
ins2="bar"
sed "/^foo\$/a \\\t${ins1}\n\t${ins2}" file
sed is for doing s/old/new on individual strings, that is all. Just use awk:
$ awk '{print} $0=="foo"{print "\tinsert1\n\tinsert2"}' file
foo
insert1
insert2
The above will work using any awk in any shell on every UNIX box and is trivial to modify to do anything else you might want to do in future.
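If the inserted text comes from shell variables, one possible variation is to pass them in with -v rather than splicing them into the program text. This is only a sketch: the awk variable names a and b and the two-line sample input are assumptions for illustration.

```shell
ins1="insert1"
ins2="insert2"

# Pass the shell values in as awk variables; printf supplies the tabs.
# Note: awk interprets backslash escapes in -v values, so this suits
# plain text like the values above.
result=$(printf 'foo\nbar\n' |
  awk -v a="$ins1" -v b="$ins2" '{ print } $0 == "foo" { printf "\t%s\n\t%s\n", a, b }')

printf '%s\n' "$result"
```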
I'm using perl from the command line to replace duplicate spaces in a text file.
The command I use is:
perl -pi -e 's/\s+/ /g' file.csv
The problem: this also removes the newlines from the resulting file.
Any idea why this occurs?
Thanks!
\s matches any whitespace character, which includes [ \f\n\r\t]. So, you're replacing newlines with single spaces too.
In your case, the simplest way is to enable automatic line-ending processing with -l flag:
perl -pi -le 's/\s+/ /g' file.csv
This way, the newline is chomped from each line before the -e code runs and appended again when the line is printed.
I'll add my two cents to the previous answer.
If you use this regexp in a Perl script itself, you can just change it to:
s/[ ]+/ /gis;
[ ] matches only the space character, so this collapses runs of spaces on every line and leaves the line endings (and tabs) alone.
I have a .txt file with two types of paragraphs:
Some statements and numbers (02) and such followed by a return
With some more stuff followed by two returns
Then a single line paragraph that is followed by two returns
Along with some more double line text return
some more text.
I want to remove all single line paragraphs from the text file. So that the result is:
Some statements and numbers (02) and such followed by a return
With some more stuff followed by two returns
Along with some more double line text return
some more text
I have been attempting to do this with sed and awk, but I keep running into problems coming up with a regex that will look for a newline followed by some characters and ending in two consecutive newlines \n\n.
Is there any way to do this with a one-liner, or am I going to have to write a script that reads line by line, determines the length of each paragraph, and strips it out that way?
Thanks.
awk -F '\n' -v RS='' -v ORS='\n\n' 'NF>1' input.txt
When RS is set to the empty string, each record always ends at the first blank line encountered.
When RS is set to the empty string, and FS is set to a single character, the newline character always acts as a field separator.
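To see why NF>1 picks out the multi-line paragraphs, here is a small sketch that prints the field count per record first. The sample text and variable names are invented for illustration.

```shell
# Three paragraphs: two lines, one line, two lines.
sample=$(printf 'a\nb\n\nsingle\n\nc\nd\n')

# In paragraph mode (RS='') with FS set to newline, NF is the number
# of lines in each paragraph.
counts=$(printf '%s\n' "$sample" | awk -F '\n' -v RS='' '{ print NR ": " NF }')
printf '%s\n' "$counts"

# Keeping only records with NF > 1 drops the single-line paragraph.
kept=$(printf '%s\n' "$sample" | awk -F '\n' -v RS='' -v ORS='\n\n' 'NF > 1')
printf '%s\n' "$kept"
```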
I tend to reach for Perl for paragraph-oriented parsing:
perl -00 -lne 'print if tr/\n/\n/ > 0'
I want multi-line strings in java, so I seek a simple preprocessor to convert C-style multi-lines into single lines with a literal '\n'.
Before:
System.out.println("convert trailing backslashes\
this is on another line\
\
\
above are two blank lines\
But don't convert non-trailing backslashes, like: \"\t\" and \'\\\'");
After:
System.out.println("convert trailing backslashes\nthis is on another line\n\n\nabove are two blank lines\nBut don't convert non-trailing backslashes, like: \"\t\" and \'\\\'");
I thought sed would do it well, but sed is line-based, so replacing the '\' and the newline that follows it (effectively joining the two lines) is not very natural in sed. I adapted sredden79's oneliner to the following - it works, it's clever, but it's not clear:
sed ':a { $!N; s/\\\n/\\n/; ta }'
The substitution replaces an escaped literal backslash followed by a newline with an escaped literal backslash followed by n. :a is a label and ta jumps back to that label if the substitution found a match; $ means the last line, and $! is the opposite (i.e. all lines but the last). N appends the next line to the pattern space (thus making the \n character visible).
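The loop can be tried on a tiny input. The sketch below splits the same commands into separate -e expressions, which is an assumption made here for portability across sed implementations rather than the form used above; the sample text is invented.

```shell
# Three lines, the first two ending in a backslash.
# In the printf format, \\ is one backslash and \n is a real newline.
joined=$(printf 'one\\\ntwo\\\nthree\n' |
  sed -e ':a' -e '$!N' -e 's/\\\n/\\n/' -e 'ta')

# Each backslash-newline pair is now the two characters \ and n,
# so the whole input comes out as a single line.
printf '%s\n' "$joined"
```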
EDIT here's a variation to keep compiler error line numbers etc accurate: it turns each extended line into "..."+\n (and handles the first and last lines of the String correctly):
sed ':a { $!N; s/\\\n/\\n"+\n"/; ta }'
giving:
System.out.println("convert trailing backslashes\n"+
"this is on another line\n"+
"\n"+
"\n"+
"above are two blank lines\n"+
"But don't convert non-trailing backslashes, like: \"\t\" and \'\\\'");
EDIT Actually, it would be better to have Perl/Python-style multi-line strings, where the string starts and ends with a special code on one line (""" for Python, I think).
Is there a simpler, saner, clearer way (maybe not using sed)?
Is there a simpler, saner, clearer way.
Forget the pre-processor, live with the limitation, complain about it (so that it will maybe be fixed in Java 7 or 8), and use an IDE to ease the pain.
Other alternatives (too troublesome I suppose, but still better than messing with the compilation process):
use a JVM-based language that does support here-docs
externalize the string into a resource file
A perl one-liner:
perl -0777 -pe 's/\\\n/\\n/g'
This will read either stdin or the file(s) named after it on the command line and write the output to stdout.
If you're using an editor that supports filtering, like vi or emacs, just filter your text through the above command and you're done.
If you're using Windows and have to worry about \r :
C:\> perl -0777 -pe "s/\\\r?\n/\\n/g"
although I think win32 Perl handles \r itself so this may be unnecessary.
The -0777 option is a special case of the -0 (that's a zero) option that defines the line or record separator. In this case, it means that we don't want any separator so read the entire file in as a single string.
The -pe option is a combination of -p (loop over the input one record at a time, which with -0777 means the whole file at once, and print the result) and -e (the next argument is (a line of) the program to execute).
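A Perl script to do what you asked for.
while (<>) {
    chomp;                # drop the trailing newline
    print $_;
    if (/\\$/) {
        print "n";        # line ended in a backslash: complete it as a literal \n
    } else {
        print "\n";       # ordinary line: restore the newline
    }
}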
sed 's/\x5c\x5c$/\x22\x5c\x5cn\x22/'
Hex for backslash and double quote is \x5c and \x22 respectively - it needs to be escaped so \x5c is doubled and the $ anchors to the end of the line.
Updated again per OP comment:
sed "{:a;N;\$!b a};s/\x5c\x5c\n/\x5c\x5cn/g"
The :a creates a label and the N appends the next line to the pattern space; $!b a branches back to the label :a on every line except the last ($!).
Once the whole file is loaded, a single substitution replaces every backslash-newline pair with a literal '\n', using the hex ASCII code \x5c for the backslash.