sed back-reference not working

I need to process every line of a curve CSV and remove the last column, but only from lines that start with 10 commas. The sed command I used was:
$ cat curves.csv
(...)
,,,,,,,,,,2017/10/18,20630.000000
,,,,,,,,,,2017/11/15,20595.000000
,usdSN,:usdSN,,,,8005,$,,2017/08/07,Settlement Date
,,,,,,,,,,2017/12/20,20575.000000
,,,,,,,,,,2018/01/17,20555.000000
,,,,,,,,,,2018/02/21,20535.000000
(...)
,,,,,,,,,,2018/12/21,20290.000000
,usdZS,:usdZS,,,,8007,$,,2017/08/07,Settlement Date
,,,,,,,,,,2017/08/16,2848.500000
(...)
$ sed s/\(,,,,,,,,,,[0-9/]*\),[0-9.]*/\1/g curves.csv
However, it didn't work: it printed all lines unchanged.
Please help.

Another approach with GNU sed:
sed -r '/^,{10}/{s/,[^,]*$//}' file
Output:
(...)
,,,,,,,,,,2017/10/18
,,,,,,,,,,2017/11/15
,usdSN,:usdSN,,,,8005,$,,2017/08/07,Settlement Date
,,,,,,,,,,2017/12/20
,,,,,,,,,,2018/01/17
,,,,,,,,,,2018/02/21
(...)
,,,,,,,,,,2018/12/21
,usdZS,:usdZS,,,,8007,$,,2017/08/07,Settlement Date
,,,,,,,,,,2017/08/16
(...)
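
The -r flag and the bare {10} interval above are extended-regex syntax. A rough equivalent in basic regular expressions, which should also run on a non-GNU sed, is sketched below. The sample rows are copied from the question; the variable names input and out are only for illustration.

```shell
# Sample rows from the question: two data rows and one header row.
input=',,,,,,,,,,2017/10/18,20630.000000
,usdSN,:usdSN,,,,8005,$,,2017/08/07,Settlement Date
,,,,,,,,,,2017/11/15,20595.000000'

# On lines that begin with ten commas, delete the last comma and
# everything after it. \{10\} is the basic-regex spelling of {10}.
out=$(printf '%s\n' "$input" | sed '/^,\{10\}/s/,[^,]*$//')
printf '%s\n' "$out"
```

The header row is left alone because it does not start with ten consecutive commas.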

The problem is the way you are running sed. You did:
sed s/\(,,,,,,,,,,[0-9/]*\),[0-9.]*/\1/g curves.csv
But because the arguments aren't quoted, the shell strips the backslashes itself, and what is actually run is:
sed s/(,,,,,,,,,,[0-9/]*),[0-9.]*/1/g curves.csv
This doesn't match anything because there are no parentheses in your file. You should run it as:
sed 's/\(,,,,,,,,,,[0-9/]*\),[0-9.]*/\1/g' curves.csv
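
To see the effect of the quotes for yourself, you can ask printf to show the argument the shell would hand to sed. This is only a sketch: the short pattern s/\(a\)b/\1/ is a made-up stand-in, and the final command uses [^,]* instead of the question's character classes to keep the example small.

```shell
# printf shows the exact argument the shell would pass to sed.
unquoted=$(printf '%s\n' s/\(a\)b/\1/)
quoted=$(printf '%s\n' 's/\(a\)b/\1/')
printf 'unquoted: %s\n' "$unquoted"   # backslashes removed by the shell
printf 'quoted:   %s\n' "$quoted"     # backslashes preserved for sed

# With single quotes, the back-reference works on a sample row.
fixed=$(printf '%s\n' ',,,,,,,,,,2017/10/18,20630.000000' |
  sed 's/^\(,,,,,,,,,,[^,]*\),[^,]*$/\1/')
printf '%s\n' "$fixed"
```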

Related

sed: get a line number with regex and insert text at that line

I want to get the first line of a file that is not commented out with a hash, then add a line of text just after (or just before) that line.
I managed to get the number of the line:
sed -n '/^\s*#/!{=;q}' file # prints 2
and also to insert text (specifying the line manually):
sed '2 a extralinecontent' file
I can't get them working together as a one liner or in a batch.
I tried command substitution (with $(command) and also with backticks) but I get an error from bash:
sed '$(sed -n '/^\s*#/!{=;q}' file) a extralinecontent' file
-bash: !{=: event not found
and also tried many other combinations, but no luck.
I'm using gnu-sed (via brew) on macOS.

This might work for you (GNU sed):
sed -e '/^\s*#/b;a extra line content' -e ':a;n;ba' file
Bail out of any lines beginning with a comment at the beginning of the file, append an extra line following the first line that is not a comment and keep fetching/printing all the remaining lines of the file.

Here's a way to do it with GNU sed without reading the file twice
$ cat ip.txt
#comment
foo baz good
123 456 7889
$ sed -e '0,/^\s*[^#[:space:]]/ {// a XYZ' -e '}' ip.txt
#comment
foo baz good
XYZ
123 456 7889
GNU sed allows the first address to be 0 when the second address is a regex, so this works even if the first line matches the condition.
/^\s*[^#[:space:]]/: because sed doesn't support possessive quantifiers, the character class must exclude both # and whitespace so that the first non-blank character matched is not a #.
//: a handy shortcut that repeats the last regex.
a XYZ: the line to be appended (your question also mentions insert; if you want that, use i instead of a).
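
The two-step approach from the question can also be made to work. In the attempt shown there, the inner single quotes close the outer ones, which leaves !{= unquoted for bash's history expansion, and single quotes would stop $(...) from expanding anyway. Below is a sketch that stores the line number in a variable first; the temp file, the sample lines and the names n and out are assumptions for illustration.

```shell
# Hypothetical sample file: one comment line, then two data lines.
tmp=$(mktemp)
printf '%s\n' '#comment' 'foo baz good' '123 456 7889' > "$tmp"

# Step 1: print the number of the first non-comment line, then quit.
n=$(sed -n '/^[[:space:]]*#/!{
=
q
}' "$tmp")

# Step 2: double quotes let the shell expand $n before sed sees the script.
# "a\" followed by a newline is the portable spelling of the append command.
out=$(sed "${n}a\\
extralinecontent" "$tmp")
rm -f "$tmp"
printf '%s\n' "$out"
```

This reads the file twice, which the single-pass answers above avoid.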

Using sed to keep the beginning of a line

I have a file in which some lines start with a >
For these lines, and only these ones, I want to keep the first eleven characters.
How can I do that using sed ?
Or maybe something else is better ?
Thanks !
Muriel

Let's start with this test file:
$ cat file
line one with something or other
>1234567890abc
other line in file
To keep only the first 11 characters of lines starting with > while keeping all other lines:
$ sed -r '/^>/ s/(.{11}).*/\1/' file
line one with something or other
>1234567890
other line in file
To keep only the first eleven characters of lines starting with > and deleting all other lines:
$ sed -rn '/^>/ s/(.{11}).*/\1/p' file
>1234567890
The above was tested with GNU sed. For BSD sed, replace the -r option with -E.
Explanation:
/^>/ is a condition. It means that the command which follows only applies to lines that start with >
s/(.{11}).*/\1/ is a substitution command. It replaces the whole line with just the first eleven characters.
-r turns on extended regular expression format, eliminating the need for some escape characters.
-n turns off automatic printing. With -n in effect, lines are only printed if we explicitly ask them to be printed. In the second case above, that is done by adding a p after the substitute command.
Other forms:
$ sed -r 's/(>.{10}).*/\1/' file
line one with something or other
>1234567890
other line in file
And:
$ sed -rn 's/(>.{10}).*/\1/p' file
>1234567890
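
If you would rather not depend on -r or -E at all, the same substitution can be written in basic regular expressions. A small sketch using the test file contents from above (the variable names are only for illustration):

```shell
input='line one with something or other
>1234567890abc
other line in file'

# Basic-regex spelling: \(...\) and \{11\} replace (...) and {11},
# so neither -r nor -E is needed.
out=$(printf '%s\n' "$input" | sed '/^>/s/^\(.\{11\}\).*/\1/')
printf '%s\n' "$out"
```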

Simple SED statement to truncate everything after first instance of underscore AFTER # sign

Input:
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
Required Output:
ab_cd#yahoo.co.uk
ef_gh#gmail.com
I'm looking for a SED statement to do the above.
Essentially I would like everything from and including the first underscore character AFTER the # sign to be removed from the output.
I'm sorry, I only have basic knowledge of programming. I'm on a Windows machine, where I've found a SED editor
that I use to modify simple strings in a batch file from the Windows shell.
Many thanks

give this a try:
sed 's/_[^#]*$//' file
it worked here with your input:
kent$ cat f
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
kent$ sed 's/_[^#]*$//' f
ab_cd#yahoo.co.uk
ef_gh#gmail.com

This can be a way:
$ sed -r 's/(.*#[^_]*).*/\1/' file
ab_cd#yahoo.co.uk
ef_gh#gmail.com
It catches all the text before #, after it and up to _. Then, it prints it back, getting rid of everything coming from _.
Explanation
Matching group: given a sed 's/something/back/' command, whatever you "catch" in the something part can be enclosed in parentheses so that you can refer back to it in the back part with \1 (first group), \2 (second group), and so on up to \9.
$ cat file
hello33bye
hello44goodbye
hello55yeah
$ sed -r 's/hello([0-9]*).*/\1/g' file
33
44
55
So (.*#[^_]*).* means: catch a block of text up to and including #, followed by any run of characters other than _. Then match the rest of the line outside the group.
Finally, print the caught block back.
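
As a small illustration of how groups are numbered, here is a sketch with two groups, written in basic regular expressions so that no -r is needed. The dash in the replacement is just a visual separator chosen for this example.

```shell
# Two groups: \1 is the word "hello", \2 is the run of digits after it.
out=$(printf '%s\n' 'hello33bye' 'hello44goodbye' |
  sed 's/\(hello\)\([0-9]*\).*/\2-\1/')
printf '%s\n' "$out"
```

The replacement prints the groups in swapped order, which shows that \1 and \2 refer to the groups by position in the pattern, not by where they are used.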
To be sure we are not matching a _ within the domain:
sed -r 's/(.*#[^\.]*[^_]*).*/\1/' file
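The added [^\.]* first consumes everything up to the first dot after the #, so an underscore inside the domain name is not treated as the cut point.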
Test
$ cat a
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
aaa#gma_il.com_ST_BB
aaa#gma_il.com
$ sed -r 's/(.*#[^\.]*[^_]*).*/\1/' a
ab_cd#yahoo.co.uk
ef_gh#gmail.com
aaa#gma_il.com
aaa#gma_il.com
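
To make the difference between the two commands visible, the sketch below runs both on a line whose domain contains an underscore. It uses basic-regex syntax, so -r is not required; the names simple and dotted are made up for the example.

```shell
input='ab_cd#yahoo.co.uk_DN_135.PNG
aaa#gma_il.com_ST_BB'

# Simple form: cut at the first underscore after the #.
simple=$(printf '%s\n' "$input" | sed 's/\(.*#[^_]*\).*/\1/')

# Dot-aware form: go to the first dot after the #, then cut at the
# next underscore.
dotted=$(printf '%s\n' "$input" | sed 's/\(.*#[^.]*[^_]*\).*/\1/')

printf 'simple:\n%s\n' "$simple"
printf 'dotted:\n%s\n' "$dotted"
```

The simple form truncates the second line to aaa#gma, while the dot-aware form keeps aaa#gma_il.com.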

Here is an awk version. (awk is normally easier to understand than sed.)
cat file
ab_cd#yahoo.co.uk_DN_135.PNG
ef_gh#gmail.com_ST_19_1_9.jpg
test#home_net.com_BO_22.jpg
awk -F\. '{NF--;split($NF,a,"_");$NF=a[1]}1' OFS=\. file
ab_cd#yahoo.co.uk
ef_gh#gmail.com
test#home_net.com
It removes the last field when the line is split by ., then splits the new last field by _ and replaces it with the first part.

Using variables in sed -f (where sed script is in a file rather than inline)

We have a process which can use a file containing sed commands to alter piped input.
I need to replace a placeholder in the input with a variable value, e.g. in a single -e type of command I can run;
$ echo "Today is XX" | sed -e "s/XX/$(date +%F)/"
Today is 2012-10-11
However I can only specify the sed aspects in a file (and then point the process at the file), E.g. a file called replacements.sed might contain;
s/XX/Thursday/
So obviously;
$ echo "Today is XX" | sed -f replacements.sed
Today is Thursday
If I want to use an environment variable or shell value, though, I can't find a way to make it expand, e.g. if replacements.sed contains;
s/XX/$(date +%F)/
Then;
$ echo "Today is XX" | sed -f replacements.sed
Today is $(date +%F)
Including double quotes in the text of the file just prints the double quotes.
Does anyone know a way to be able to use variables in a sed file?

This might work for you (GNU sed):
cat <<\! > replacements.sed
/XX/{s//'"$(date +%F)"'/;s/.*/echo '&'/e}
!
echo "Today is XX" | sed -f replacements.sed
If you don't have GNU sed, try:
cat <<\! > replacements.sed
/XX/{
s//'"$(date +%F)"'/
s/.*/echo '&'/
}
!
echo "Today is XX" | sed -f replacements.sed | sh

AFAIK, it's not possible. Your best bet will be :
INPUT FILE
aaa
bbb
ccc
SH SCRIPT
#!/bin/bash
# The ${1//...} expansion below is a bash feature, hence the bash shebang.
STRING="${1//\//\\/}" # using parameter expansion to prevent / collisions
shift
sed "
s/aaa/$STRING/
" "$@"
COMMAND LINE
./sed.sh "fo/obar" <file path>
OUTPUT
fo/obar
bbb
ccc
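
If the script has to run under a plain POSIX sh, the escaping step can be done with sed itself instead of the bash-only expansion. The sketch below is an assumption-laden variant rather than part of the script above: it also escapes & and \, uses : as the delimiter for the escaping command, and does not handle newlines in the string.

```shell
# Hypothetical replacement text containing characters special to sed.
string='fo/o&bar'

# Prefix \, / and & with a backslash so they are taken literally
# in the replacement part of s///.
escaped=$(printf '%s\n' "$string" | sed 's:[\\/&]:\\&:g')

# Use the escaped text in a double-quoted sed script.
out=$(printf '%s\n' aaa bbb ccc | sed "s/aaa/$escaped/")
printf '%s\n' "$out"
```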

As others have said, you can't use variables in a sed script, but you might be able to "fake" it using extra leading input that gets added to your hold buffer. For example:
[ghoti@pc ~/tmp]$ cat scr.sed
1{;h;d;};/^--$/g
[ghoti@pc ~/tmp]$ sed -f scr.sed <(date '+%Y-%m-%d'; printf 'foo\n--\nbar\n')
foo
2012-10-10
bar
In this example, I'm using process substitution to get input into sed. The "important" data is generated by printf. You could cat a file instead, or run some other program. The "variable" is produced by the date command, and becomes the first line of input to the script.
The sed script takes the first line, puts it in sed's hold buffer, then deletes the line. Then for any subsequent line, if it matches a double dash (our "macro replacement"), it substitutes the contents of the hold buffer. And prints, because that's sed's default action.
Hold buffers (g, G, h, H and x commands) represent "advanced" sed programming. But once you understand how they work, they open up new dimensions of sed fu.
Note: This solution only helps you replace entire lines. Replacing substrings within lines may be possible using the hold buffer, but I can't imagine a way to do it.
(Another note: I'm doing this in FreeBSD, which uses a different sed from what you'll find in Linux. This may work in GNU sed, or it may not; I haven't tested.)
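
For what it's worth, here is one possible way to replace a substring rather than a whole line with the same hold-buffer trick. It is a sketch that goes beyond the script above: G appends the held value after a newline, and the substitution then moves it into the place of the XX placeholder. The fixed date and the sample lines are made up so the result is predictable.

```shell
# The first input line carries the "variable"; the rest is the real data.
out=$(printf '%s\n' '2012-10-10' 'Today is XX.' 'unchanged line' |
  sed -e '1{h;d;}' -e '/XX/{G;s/XX\(.*\)\n\(.*\)/\2\1/;}')
printf '%s\n' "$out"
```

Only the first XX on each line is replaced, since the appended copy of the held value is consumed by the substitution.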

I am in agreement with sputnick. I don't believe that sed would be able to complete that task.
However, you could generate that file on the fly.
You could change the date in the file to a fixed placeholder string, like __DAYOFWEEK__.
Create a temp file, using sed to replace __DAYOFWEEK__ with the output of a date command such as $(date +%A).
Then process your input with sed -f $TEMPFILE.
sed is great, but it might be time to use something like perl that can generate the date on the fly.
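
A minimal sketch of the generate-on-the-fly idea, assuming a hypothetical placeholder named __TODAY__ and temp files created with mktemp (none of these names come from the question):

```shell
# Hypothetical template: a sed script containing the placeholder __TODAY__.
template=$(mktemp)
script=$(mktemp)
printf '%s\n' 's/XX/__TODAY__/' > "$template"

# Fill in the placeholder, producing the script that sed -f will read.
today=$(date +%F)
sed "s/__TODAY__/$today/" "$template" > "$script"

out=$(echo "Today is XX" | sed -f "$script")
rm -f "$template" "$script"
printf '%s\n' "$out"
```

This works here because the output of date +%F contains no characters that are special in a sed replacement; arbitrary values would need escaping first.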

To add a newline in the replacement expression using a sed file, what finally worked for me is escaping a literal newline. Example: to append a newline after the string NewLineHere, then this worked for me:
#! /usr/bin/sed -f
s/NewLineHere/NewLineHere\
/g
Not sure it matters but I am on Solaris unix, so not GNU sed for sure.
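
The same escaped-newline technique also works when the script is given inline, which makes it easy to try out. A small sketch with a made-up input line:

```shell
# The backslash at the end of the first script line escapes the newline,
# making it part of the replacement text.
out=$(printf '%s\n' 'aNewLineHereb' | sed 's/NewLineHere/NewLineHere\
/g')
printf '%s\n' "$out"
```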

How to use sed to remove last double quote from each line of a file

I am trying to remove the LAST double-quote from every line of a file. I am very new to sed, and I think sed can easily do this, but cannot figure out the proper syntax. Can anyone assist?
Thanks!

Try:
$ sed 's/\(.*\)"/\1/'
aaa"bbb <-- Input
aaabbb <-- Output
aaa"bbb"ccc <-- Input
aaa"bbbccc <-- Output

I guess you want to delete only the last occurrence of a double quote in each line.
See the test:
kent$ cat t.txt
asdf"o"
asdfasdfsadf ix" " 000
"as;ldkfj;laskfj;lkasjdf;ljks
kent$ sed -r 's/"([^"]*$)/\1/' t.txt
asdf"o
asdfasdfsadf ix"  000
as;ldkfj;laskfj;lkasjdf;ljks
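
The same idea can be written without -r in basic regular expressions, which should also work on seds that lack that flag. A sketch with made-up sample lines, including one that has no quote at all:

```shell
input='asdf"o"
"as;ldkfj
no quotes here'

# Match a quote followed only by non-quote characters up to the end of
# the line, and keep just those trailing characters.
out=$(printf '%s\n' "$input" | sed 's/"\([^"]*\)$/\1/')
printf '%s\n' "$out"
```

Lines without a double quote pass through unchanged because the pattern does not match them.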
