I am not getting the expected results from sed 's/$/|2021-07-21/' demotoytable.csv
Before the command the top 3 lines look like:
urlhm|main_code|description|taxable|itemnum|xtras
t3mr.com/guitar/qrc/G19RTE000000753|G19RTE0000007530|Promo_labor_day_006|Consignment|7522831|bag
t3mr.com/guitar/qrc/G19RTE000000753|G19RTE0000007530|Promo_labor_day_006|Consignment|7522835|box
t3mr.com/guitar/qrc/G19RTE000000753|G19RTE0000007530|Promo_labor_day_006|Consignment|7522839|case
But after running the command sed 's/$/|2021-07-21/' demotoytable.csv
I get this result:
|2021-07-21code|description|taxable|itemnum|xtras
|2021-07-21itar/qrc/G19RTE000000753|G19RTE0000007530|Promo_labor_day_006|Consignment|7522831|bag
|2021-07-21itar/qrc/G19RTE000000753|G19RTE0000007530|Promo_labor_day_006|Consignment|7522835|box
|2021-07-21itar/qrc/G19RTE000000753|G19RTE0000007530|Promo_labor_day_006|Consignment|7522839|case
Any ideas on why this is happening, or better yet, how to fix it? I want each line to end with "|2021-07-21", not begin with it. I am on a Mac Pro running Big Sur.
Thanks
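The file has Windows-style (CRLF) line endings, so the appended text lands after the carriage return and the terminal prints it over the start of the line. Remove the carriage returns and then add the text you wish to add: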
sed 's/\r$//; s/$/|2021-07-21/' demotoytable.csv
s/\r$// removes the carriage return at the end of each line; s/$/|2021-07-21/ then appends the value of your choice to the end of each line.
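A quick way to check the diagnosis and try the fix, as a minimal sketch: the two-line sample below is made up to mimic a file with CRLF endings, and the carriage returns are removed with tr rather than with \r inside sed, on the assumption that not every sed build (the BSD sed on macOS in particular) interprets \r as a carriage return.

```shell
# Hypothetical two-line sample with Windows (CRLF) line endings.
sample=$(mktemp)
printf 'urlhm|xtras\r\nt3mr.com/guitar|bag\r\n' > "$sample"

# od -c shows a \r before the \n when the file has CRLF endings.
head -n 1 "$sample" | od -c

# Strip every carriage return with tr, then append the date to each line.
result=$(tr -d '\r' < "$sample" | sed 's/$/|2021-07-21/')
printf '%s\n' "$result"

rm -f "$sample"
```

If the od output shows no \r, the line endings are not the cause and the plain s/$/|2021-07-21/ command should already behave as expected.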
Related
I have a file, shown below:
Section1
George, 1998-1995
Peter, 1999-1990
Simon, 1988-1960
Section2
Gery, 2019-2015
John, 1984-1983
Thomson, 1978-1965
When I give Section1, the expected output is
Simon, 1988-1960
I have lots of sections like this. I want to achieve this with sed, not awk.
I tried the following, but it hard-codes the line number and also prints the complete range:
sed -n '/Section1/,4{p}'
Here I was able to remove the hard-coding, but I am unable to print only the last line, and the next section name is printed as well.
sed -n '/Section1/ , /Section./{p}'
This might work for you (GNU sed):
sed '$b;N;/\nSection/P;D' file
Make a moving window of two lines and print the first line if the second line begins with Section; the last line of the file is always printed.
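The moving window can be tried on the sample from the question. This is a sketch only: the temporary file is an assumption for illustration, and the one-liner is split into separate -e expressions so that the bare b is not followed by a semicolon, which some sed implementations may read as part of a label.

```shell
# Sample input from the question, written to a temporary file.
input=$(mktemp)
printf '%s\n' 'Section1' 'George, 1998-1995' 'Peter, 1999-1990' 'Simon, 1988-1960' \
              'Section2' 'Gery, 2019-2015' 'John, 1984-1983' 'Thomson, 1978-1965' > "$input"

# $b : on the last line, jump to the end of the script (the line is auto-printed)
# N  : append the next input line, giving a two-line window
# P  : print the first line of the window when the second starts with "Section"
# D  : drop the first line of the window and restart the script with the rest
last_lines=$(sed -e '$b' -e 'N' -e '/\nSection/P' -e 'D' "$input")
printf '%s\n' "$last_lines"

rm -f "$input"
```

On this input the command prints the last entry of every section, here Simon and Thomson, which is why the next command narrows it down to one named section.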
For the last line of a specific section use:
sed -n '/^Section1/{:a;h;$!{n;/^\S/!ba};x;s/^\s*//p}' file
A GNU awk solution:
awk -v RS='Section' '$1=="1" {print $(NF-1),$NF}' file
Simon, 1988-1960
By setting the record separator (RS) to Section, awk processes the file one block at a time. For the block whose first field is 1 (the Section prefix has been stripped off, leaving only the number), it prints the second-to-last and last fields.
You may consider using
sed -n '/^Section1$/,/^Section[0-9]*$/{:a;h;n;/^Section[0-9]*$/!ba;x;s/^[ \t]*//;p}' file > newfile
See the online demo.
Details
-n - the switch suppresses default line output mode
/^Section1$/,/^Section[0-9]*$/ - a block of lines between a line that is equal to Section1 and a line that fully matches a Section and any 0 or more digits pattern (the next {...} group of commands relates to the range matched with this)
:a - sets a label named a
h - copies the current line into hold buffer
n - discards the current pattern space value and reads the next line into it
/^Section[0-9]*$/!ba - if the pattern space value does not match the end block line go back to label a
x - else, once we get to the last line, the previous one is in hold space, so x is used to swap hold and pattern space
s/^[ \t]*// - remove initial whitespace
p - print the pattern space.
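The steps listed above can be traced on the two-section sample. This is a sketch under assumptions: the entries are indented by two spaces (the leading-whitespace removal suggests the original file was indented), the script is written one command per line, and [[:space:]] stands in for [ \t] so that it does not depend on sed interpreting \t.

```shell
# Two-section sample; the two-space indentation is an assumption.
input=$(mktemp)
printf '%s\n' 'Section1' '  George, 1998-1995' '  Peter, 1999-1990' '  Simon, 1988-1960' \
              'Section2' '  Gery, 2019-2015' '  John, 1984-1983' '  Thomson, 1978-1965' > "$input"

# Inside the range: keep copying the current line to the hold space (h) and
# reading the next one (n) until a Section line shows up, then swap the
# previously saved line back in (x), strip its indentation and print it.
last=$(sed -n '/^Section1$/,/^Section[0-9]*$/{
:a
h
n
/^Section[0-9]*$/!ba
x
s/^[[:space:]]*//
p
}' "$input")
printf '%s\n' "$last"

rm -f "$input"
```

Because the closing Section line is read by n inside the loop rather than at the start of a cycle, the range itself does not appear to close there, so on files with three or more sections it is worth checking whether later sections also produce output.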
Regex:
(Section1)((\n.*,.*)*\n\s*)(?'lastLine'.*)
Test here.
I did not understand exactly what you want to do with the result, so I cannot tell you the exact sed command.
I am trying to copy the beginning of every line in a text file before a certain character to the end of the same line.
I've tried duplicating each line to the end of itself, and then deleting everything after the character, but the trouble is I haven't been able to figure out how to skip the first instance of the character so the result is that the duplicated text gets deleted as well as everything beyond the first instance of the character.
I've tried things like
sed '/S/ s/$/ append text/' sample.txt > cleaned.txt
but this only adds a fixed text. I've also tried using:
s/\(.*\)/\1 \1/
to duplicate the line, and then deleting everything after the S, but I can't figure out how to get it to go to the 3rd S not the 1st to start deleting.
What I have to start with:
dog 50_50_S5_Scale
cat 10_RV_17_S76_Scale
mouse 15_EQ_S81_Scale
What I'm trying to get:
dog 50_50_S5_Scale dog 50_50_
cat 10_RV_17_S76_Scale cat 10_RV_17_
mouse 15_EQ_S81_Scale mouse 15_EQ_
Where everything before the first S gets copied to the end of the line.
You may use
sed 's/\([^S]*\)S.*/& \1/' file
See the online demo
Details
\([^S]*\) - Capturing group 1 (\1): any 0+ chars other than S
S.* - S and the rest of the string (actually, line, since sed processes line by line by default).
The replacement is the concatenation of the whole match (&), space and Group 1 value.
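As a small check of the substitution described above, here it is applied to three sample lines. The lines are taken from the expected output in the question, and the variable name is only for illustration.

```shell
# Copy everything before the first S to the end of each line.
# &  is the whole matched line, \1 is the text captured before the first S.
result=$(printf '%s\n' 'dog 50_50_S5_Scale' 'cat 10_RV_17_S76_Scale' 'mouse 15_EQ_S81_Scale' |
  sed 's/\([^S]*\)S.*/& \1/')
printf '%s\n' "$result"
```

Lines that contain no S at all do not match the pattern and pass through unchanged.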
You could try:
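awk '{print $0 " " substr($0, 1, index($0,"S") - 1)}' file
We take the substring from the first character up to but not including the first occurrence of "S". The start index is 1 because awk strings are 1-based; a start of 0 gives one character fewer in some awk implementations.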
I was trying to copy an example I found here : http://www.grymoire.com/Unix/Sed.html#uh-35a
here is the sed pattern
/^begin$/,/^end$/{
/begin/n
/end/!d
}
here's the first file
begin
one
end
last line
and here's the second
begin
end
last line
when I run the sed on the first file it deletes what's between the begin/end and all is well. When I run it on the second, it appears to miss the "end" and deletes the rest of the file.
running on first file
$ sed -f x.sed a
begin
end
last line
running on second
$ sed -f x.sed b
begin
end
notice how "last line" is missing on the second run.
I thought that "n" would print the current pattern and suck in the next one. It would then hit the /end/ command and process that.
as it is, it seems like it's somehow causing the end of the range to be missed. Will somebody explain what is happening?
It should be:
/^begin$/,/^end$/{
/^begin$\|^end$/!d
}
Why was your command wrong?
The n command was wrong there. In the second example it will do the following:
1. begin ---> n reads the next line (important: this does not affect the state of the range address (begin,end))
1a. end ---> /end/! does not apply. Don't delete the line
2. last line ---> /end/! applies. Delete the line. (sed is still searching for a line that contains end because the n command skipped that line)
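The difference can be reproduced on the second file. The sketch below runs the original script and a variant without n. The variant uses two plain b commands instead of the \| alternation, which is a substitution on my part to avoid relying on GNU-specific regex syntax, and the temporary file name is illustrative.

```shell
# Second file from the question: "end" directly follows "begin".
b_file=$(mktemp)
printf '%s\n' 'begin' 'end' 'last line' > "$b_file"

# Original script: n consumes "end", so the range never sees its closing line
# and "last line" is still treated as inside the range and deleted.
broken=$(sed '/^begin$/,/^end$/{
/begin/n
/end/!d
}' "$b_file")

# Variant without n: keep the two delimiter lines, delete everything between.
fixed=$(sed '/^begin$/,/^end$/{
/^begin$/b
/^end$/b
d
}' "$b_file")

printf 'original:\n%s\nwithout n:\n%s\n' "$broken" "$fixed"
rm -f "$b_file"
```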
I found another way around this after @hek2mgl's help: I can add a branch around the second statement. I actually need this because I want to see the begin label. So you can also do this:
/^begin$/,/^end$/{
/begin/b skip
/end/!d
:skip
}
I think you were close to getting it to do what you wanted. When you want to delete the next line after a match you simply need to pull it in with the sed n and then hit it with a delete d.
It looks like you want to delete the line that follows begin unless it is end, and print all the other lines.
If so, the following should suffice:
/^begin$/,/^end$/{
/begin/{n;/end/!d}
}
It works by deleting the line after begin unless that line contains end (/end/!).
Also see: sed or awk: delete n lines following a pattern
I have a line of SED, below, that is in a batch command that I run every month. It was written by someone before me, and I am looking to understand the parts of this code. From the two outputs I can tell that it takes one line and deletes another when sequential lines are duplicates, I just don't understand how it is being done with this line.
sed "$!N; /^\(.*\)\n\1$/!P; D" finalish.txt > final.txt
Example of finalish.txt
201408
201409
201409
201409
201409
Example of final.txt
201408
201409
Without going into the basics of sed, here is your sed command broken down:
$!N: If the current line is not the last line of the file, append the next line to the pattern space. The two lines will be separated by a newline (\n). At this point your pattern space is 201408\n201409.
/^\(.*\)\n\1$/!P: If the pattern space does not consist of two identical lines separated by a newline (\n), print up to the first newline (\n). So this will print 201408 to STDOUT. During the second iteration, though, the pattern space will be 201409\n201409; since that does match the regex, the negated P is skipped, nothing gets printed, and we proceed to the next command.
D: Deletes up to the first newline (\n) and restarts the sed script without reading a new line. Remember that during the repeated cycle your pattern space still has the 201409.
So during the first iteration 201408 gets printed, but 201409 doesn't get printed until the last line is reached: there $!N no longer appends anything, the pattern space holds a single 201409, the regex stops matching, and the content gets printed.
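The walkthrough above can be checked directly on the sample data. This sketch is written for a POSIX shell, so it uses single quotes instead of the double quotes from the batch file, and it feeds the lines through a pipe instead of finalish.txt.

```shell
# Collapse runs of identical consecutive lines into a single line.
result=$(printf '%s\n' 201408 201409 201409 201409 201409 |
  sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$result"
```

Only adjacent duplicates are collapsed; a value that reappears later in the file, after a different line, is printed again.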
If you are inheriting a lot of sed code, I would strongly recommend the sedsed utility, which is written in Python and will help you understand convoluted and cryptic sed that can often become a maintenance nightmare.
Here is a sample run from the sedsed utility (I haven't shown all iterations as it is pretty verbose, but you get the picture). I have added a few comments on what the output means. Also notice I am using single quotes since I am on Mac (BSD Unix) and not Windows:
$ sedsed.py -d '$!N; /^\(.*\)\n\1$/!P; D' file
PATT:201408$ # This shows your current pattern space
HOLD:$ # This shows your current hold buffer
COMM:$ !N # This shows the command that is going to run
PATT:201408$ # This shows the pattern space after the command has run
201409$
HOLD:$ # This shows the hold buffer after the command has run
COMM:/^\(.*\)\n\1$/ !P # This shows the command being run
201408 # Anything without a <TAG:> is what gets printed to STDOUT
PATT:201408$
201409$
HOLD:$
COMM:D
PATT:201409$
HOLD:$
...
...
...
COMM:$ !N
PATT:201409$
HOLD:$
COMM:/^\(.*\)\n\1$/ !P
201409
PATT:201409$
HOLD:$
COMM:D
I would also suggest that once you understand what your sed commands were written for, you port them to a friendlier scripting language like awk, Perl or Python.
This will not help you understand the sed, but here is an awk command that prints only the unique lines. Unlike the sed command, it drops every repeated line, not just consecutive duplicates:
awk '!seen[$0]++' finalish.txt
201408
201409
Really would appreciate help on this.
I am using sed to create a CSV file. Essentially multiple html files are all merged to a single html file and sed is then used to remove all the junk pictures etc to get to the raw columnar data.
I have all this working but am stuck on the last bit.
What I want to do is very basic - I want to replace the following lines:
"a variable string"
"end td"
"begin td"
with a single line:
"a variable string"
(with a tab character at the end of this line)
I'M USING DOS.
As you see I'm new to all this. If I could get this working would save me a lot of time in the future so would appreciate the help.
At the moment I have to inject some html headers back into the text file, open it in a html editor, select the table and then paste this into a spreadsheet which is a bit of pain.
P.S. Is there an easy way to get sed to remove the parentheses '(' and ')' from a given line?
I doubt that this is what you really want, but it's what you asked for.
sed "s/\"a variable string\"/&\t/; s/\"end td\"//; s/\"begin td\"//" inputfile
What you probably want to do is replace them when they appear consecutively. Here's how you might do that:
sed "1{N;N}; /\"a variable string\"\n\"end td\"\n\"begin td\"/ s/\n.*$/\t/;ta;bb;:a;N;N;:b;$!P;N;D" inputfile
This will remove all parentheses in a file:
sed "s/[()]//g" inputfile
To select particular lines, you could do something like this:
sed "/foo/ s/[()]//g" inputfile
which will only make the replacement if the word "foo" is somewhere on a line.
Edit: Changed single quotes to double quotes to accommodate GNUWin32 and CMD.EXE.
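For comparison, the two parenthesis commands behave as follows on a made-up two-line input. The sketch is written for a POSIX shell, so it uses single quotes around the sed script; under CMD.EXE the double-quoted forms above apply.

```shell
# Remove parentheses on every line.
all=$(printf '%s\n' 'foo (1)' 'bar (2)' | sed 's/[()]//g')

# Remove parentheses only on lines that contain "foo".
only_foo=$(printf '%s\n' 'foo (1)' 'bar (2)' | sed '/foo/ s/[()]//g')

printf 'every line:\n%s\nfoo lines only:\n%s\n' "$all" "$only_foo"
```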
A previous comment I left doesn't appear to have been saved, so I will try again.
The code to remove the ( and ) worked perfectly, thanks.
You are right: I was looking to merge the 3 lines into one line, so the second example you gave, where it looks like it's reading the next two lines into the pattern space, looks more promising. The output wasn't what I was expecting, however.
I now realize the code is going to have to be more complicated, and I don't want to trouble you any more. My manual method of injecting some html code back into the text file, opening it in OpenOffice and pasting into a spreadsheet only takes a few seconds, and I have a feeling that writing the sed code for this by hand would be a nightmare.
Essentially the rules for converting the html would need to be:
[each tag has been formatted so it appears on its own line]
I have given example of an input file and desired output file below for reference
1) if < tr > is followed by < td > on the next line, completely remove the < tr > and < td > lines [i.e. do not output a carriage return] and on the NEXT line stick a " at the start of that line [it doesn't matter about a carriage return at the end of this line as it is going to be edited later]
2) if < /td > is followed by < td >, completely remove both of these lines [again, do not output a carriage return after these lines], on the PREVIOUS line output a ", [do not output a carriage return] and on the NEXT line stick a " at the start of the line [don't worry about the ending carriage return as it will be edited later]
3) if < /td > is followed by < /tr >, delete both of these lines and on the PREVIOUS line add a " to the end of the line and a final carriage return.
I have given an example of what the input and desired output would be:
input: http://medinfo.redirectme.net/input.txt
[the wanted file will be posted in the next message - this board will not allow new users to post a message with more than one hyperlink!]
There is an added issue: the address column is on multiple lines in the input file. This could be reduced to one line by checking whether the first character of the NEXT line is a ". If it isn't, then do not output the carriage return at the end of the current line.
Phew, that was a nightmare just to type out, never mind actually code. But thanks again for all your help in getting this far!
:-)