I have been using sed (I found a tutorial at Grymoire) to massage ASCII files we get from our hardware suppliers. The files have a structure like this:
Model-Manufacturer:D12-500
Test_Version:2.6.3
But some of the files we receive are randomly "broken" and are missing a "Model-Manufacturer:" entry:
Model-Manufacturer:D12-500
Test_Version:2.6.3
Model-Manufacturer:H24-700
Test_Version:2.6.3
Test_Version:2.6.3
Model-Manufacturer:R15-300
Test_Version:2.6.3
I want to fix this with sed by placing the missing entry "Model-Manufacturer:N/A" before the second of two consecutive occurrences of "Test_Version:2.6.3". This is my code:
sed -n '
/Test_Version/ {
# found "Test_Version" - read in next line
N
# look for "Test_Version" on the second line
# and print if there.
/\n.*Test_Version/ {
# found it - now edit making one line
s/Test_Version/Model-Manufacturer:N/A/
}
}' infile > outfile
It's not working. I believe I need to remember the position of each "Test_Version" and "Model-Manufacturer" before doing the replacement, correct? Can I do this with sed?
Thanks in advance for your input.
Change your substitution to:
s||\nModel-Manufacturer:N/A&|
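Using an alternate delimiter means you don't have to escape the slash in "N/A". An empty left side reuses the most recently used regular expression (here, the /\n.*Test_Version/ address). The ampersand copies the matched text into the right side.
Also, you need to remove the -n: it suppresses automatic printing, and since the script contains no p command, nothing would be output.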
If I understand what you are trying to achieve, you are very close. I think changing the substitution command to the following makes it work:
s/\nTest_Version/\nModel-Manufacturer:N\/A\nTest_Version/
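For reference, here is a minimal sketch of the whole corrected script run against the sample data from the question. It assumes GNU sed (the "\n" in the replacement is a GNU extension), and the function name fix_missing_model is only illustrative.

```shell
# Sample data copied from the question: the third record lacks its
# "Model-Manufacturer:" line.
input='Model-Manufacturer:D12-500
Test_Version:2.6.3
Model-Manufacturer:H24-700
Test_Version:2.6.3
Test_Version:2.6.3
Model-Manufacturer:R15-300
Test_Version:2.6.3'

# fix_missing_model is a hypothetical helper name, not part of sed.
fix_missing_model() {
  sed '/Test_Version/ {
    # Append the next input line to the pattern space.
    N
    # Two Test_Version lines in a row: insert the placeholder between them.
    s/\nTest_Version/\nModel-Manufacturer:N\/A&/
  }'
}

fixed=$(printf '%s\n' "$input" | fix_missing_model)
printf '%s\n' "$fixed"
```

Because N consumes lines two at a time, a run of three or more consecutive Test_Version lines would not be fully repaired by this approach.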
Looking for the syntax to find a pattern in a file and remove the leading character from only that pattern.
For example, find -16 and remove the # and save it to file.
I tried grep 12345-16 testfile2 | sed -e "s/^#//g", which works, but I need to keep all the other lines of the input file as well.
Example:
From this:
something here 12345-14
something here 12345-15
# something here 12345-16
to this:
something here 12345-14
something here 12345-15
something here 12345-16
suggestions would be much appreciated.
You can do it with just sed alone.
sed '/12345-16/s/^# *//' file
You can use sed's -i option to edit the file in place. The /../ in front of the s command is an address: a regex that restricts the substitution to lines matching that pattern. All remaining lines are left untouched and printed as is.
You don't need g for global here, since you are only removing the leading #. I have used the pattern ^# *, which matches a # at the start of the line, optionally followed by spaces. You can create your own pattern based on the structure of your file.
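As a quick check of the address behaviour described above, the following sketch runs the command on the sample lines from the question. The fourth line is an invented extra, added only to show that a commented line that does not match the address stays commented.

```shell
# First three lines come from the question; the fourth is a made-up extra.
sample='something here 12345-14
something here 12345-15
# something here 12345-16
# something else 99999-01'

# The /12345-16/ address limits the substitution to matching lines.
result=$(printf '%s\n' "$sample" | sed '/12345-16/s/^# *//')
printf '%s\n' "$result"
```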
This should be extremely simple, but for the life of me I just can't get gnu-sed to do it this afternoon.
The file in question has lines that look like this:
PART NUMBER PART NUMBER QUANTITY WEIGHT -999 -4,999 -9,999
w/ UL APPROVAL
MIN-3
I need to prepend every line like the "MIN-3" line with a ">" character, and the only thing specifically differentiating those lines from the others are two things:
The first character is a space " ".
The lines do not contain a comma.
I've tried mostly things like any of the following:
/^ +[^,]+$/ s/^/>/
/^ +[\w\-]+$/ s/^/>/
/^ +(\w|\-)+$/ s/^/>/
I will admit, I am somewhat new to sed. :)
Edit: Answers that use perl or awk would also be appreciated, though my initial target is sed.
Try this:
sed '/^ [^,]*$/s/^/>/'
The output contains a leading > only on the MIN-3 line.
By default sed uses basic regular expressions, in which + has to be written as \+ (a GNU extension). I think that is the problem that has been costing you time. Alternatively, you can add -r to make sed use extended regular expressions.
According to your description this should do:
sed 's/^\([ ][^,]*\)$/>\1/' input
which matches the complete line if the line starts with a space and then contains anything but a comma until the end.
Here is a simple answer:
sed 's/^ [^,]*$/>&/'
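The basic-versus-extended distinction mentioned above can be checked with a small sketch. The sample lines are invented for illustration (the second and third start with a space), and -E is assumed to be available as the extended-regex switch, as it is in current GNU and BSD sed.

```shell
# Hypothetical sample: lines 2 and 3 start with a space, and only line 3
# contains no comma.
sample='PART NUMBER QUANTITY
 -999 -4,999 -9,999
 MIN-3'

# Basic regex (the default): "*" needs no escaping.
bre=$(printf '%s\n' "$sample" | sed '/^ [^,]*$/s/^/>/')

# Extended regex via -E: "+" works unescaped.
ere=$(printf '%s\n' "$sample" | sed -E '/^ +[^,]+$/s/^/>/')

printf '%s\n' "$bre"
```

Both variants should mark only the last line, because the first line does not start with a space and the second contains commas.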
This is a simple question, but I'm not sure if I'm able to do this with sed/awk.
How can I make sed search for these 3 lines and replace with a line with a determined string?
<Blarg>
<Bllarg>
<Blllarg>
replace with
<test>
I tried sed "s/<Blarg>\n<Bllarg>\n<Blllarg>/<test>/g", but it just doesn't seem to find these lines. It is probably something to do with my line-break character (\n). Am I missing something?
Because sed usually handles only one line at a time, your pattern will never match. Try this:
sed '1N;$!N;s/<Blarg>\n<Bllarg>\n<Blllarg>/<test>/;P;D' filename
This might work for you:
sed '/<Blarg>/ {N;N;s/<Blarg>\n<Bllarg>\n<Blllarg>/<test>/}' filename
It works as follows:
Search the file till <Blarg> is found
Then append the two following lines to the current pattern space using N;N;
Check if the current pattern space matches <Blarg>\n<Bllarg>\n<Blllarg>
If so, then substitute it with <test>
You can use range addresses with regular expressions and the c command, which does exactly what you are asking for:
sed '/<Blarg>/,/<Blllarg>/c<test>' filename
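The two pattern-space approaches above can be compared side by side. This is a sketch assuming GNU sed and an invented five-line sample. The variable names are illustrative only.

```shell
# Hypothetical sample with one unrelated line before and after the block.
sample='before
<Blarg>
<Bllarg>
<Blllarg>
after'

# Approach 1: on <Blarg>, pull in the next two lines and substitute.
block=$(printf '%s\n' "$sample" |
  sed '/<Blarg>/{N;N;s/<Blarg>\n<Bllarg>\n<Blllarg>/<test>/;}')

# Approach 2: keep a sliding three-line window with N, P and D.
window=$(printf '%s\n' "$sample" |
  sed '1N;$!N;s/<Blarg>\n<Bllarg>\n<Blllarg>/<test>/;P;D')

printf '%s\n' "$block"
```

Both approaches only substitute when all three lines match. The range-plus-c variant replaces everything from the first address to the second, whatever lies in between.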
Is there a way to substitute only within the match space using sed?
I.e. given the following line, is there a way to substitute only the "." chars that are contained within the matching single quotes and protect the "." chars that are not enclosed by single quotes?
Input:
'ECJ-4YF1H10.6Z' ! 'CAP' ! '10.0uF' ! 'TOL' ; MGCDC1008.S1 MGCDC1009.A2
Desired result:
'ECJ-4YF1H10-6Z' ! 'CAP' ! '10_0uF' ! 'TOL' ; MGCDC1008.S1 MGCDC1009.A2
Or is this just a job to which perl or awk might be better suited?
Thanks for your help,
Mark
Give the following a try which uses the divide-and-conquer technique:
sed "s/\('[^']*'\)/\n&\n/g;s/\(\n'[^.]*\)\.\([^']*Z'\)/\1-\2/g;s/\(\n'[^.]*\)\.\([^']*uF'\)/\1_\2/g;s/\n//g" inputfile
Explanation:
s/\('[^']*'\)/\n&\n/g - Add newlines before and after each pair of single quotes with their contents
s/\(\n'[^.]*\)\.\([^']*Z'\)/\1-\2/g - Using a newline and the single quotes to key on, replace the dot with a dash for strings that end in "Z"
s/\(\n'[^.]*\)\.\([^']*uF'\)/\1_\2/g - Using a newline and the single quotes to key on, replace the dot with an underscore for strings that end in "uF"
s/\n//g - Remove the newlines added in the first step
You can restrict the command to acting only on certain lines:
sed "/foo/{s/\('[^']*'\)/\n&\n/g;s/\(\n'[^.]*\)\.\([^']*Z'\)/\1-\2/g;s/\(\n'[^.]*\)\.\([^']*uF'\)/\1_\2/g;s/\n//g}" inputfile
where you would substitute some regex in place of "foo".
Some versions of sed like to be spoon fed (instead of semicolons between commands, use -e):
sed -e "/foo/{s/\('[^']*'\)/\n&\n/g" -e "s/\(\n'[^.]*\)\.\([^']*Z'\)/\1-\2/g" -e "s/\(\n'[^.]*\)\.\([^']*uF'\)/\1_\2/g" -e "s/\n//g}" inputfile
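As a worked run of the divide-and-conquer steps above, the following sketch applies them to the input line from the question, one -e per step. It assumes GNU sed, since "\n" is used in a replacement. The variable names are illustrative.

```shell
line="'ECJ-4YF1H10.6Z' ! 'CAP' ! '10.0uF' ! 'TOL' ; MGCDC1008.S1 MGCDC1009.A2"

# Step 1 wraps each quoted string in newlines, steps 2 and 3 rewrite the dot
# in strings ending in Z or uF, and step 4 removes the helper newlines.
result=$(printf '%s\n' "$line" | sed \
  -e "s/\('[^']*'\)/\n&\n/g" \
  -e "s/\(\n'[^.]*\)\.\([^']*Z'\)/\1-\2/g" \
  -e "s/\(\n'[^.]*\)\.\([^']*uF'\)/\1_\2/g" \
  -e "s/\n//g")

printf '%s\n' "$result"
```

The dots in the unquoted MGCDC1008.S1 and MGCDC1009.A2 tokens are left alone because neither is followed by a string ending in Z' or uF'.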
$ cat phoo1234567_sedFix.sed
#! /bin/sed -f
/'[0-9][0-9]\.[0-9][a-zA-Z][a-zA-Z]'/s/'\([0-9][0-9]\)\.\([0-9][a-zA-Z][a-zA-Z]\)'/'\1_\2'/
This answers your specific question. If the pattern you need to fix isn't always like the example you provided, then you'll need multiple copies of this line, with the regular expressions modified to match your new change targets.
Note that the command is in 2 parts: "/'[0-9][0-9]\.[0-9][a-zA-Z][a-zA-Z]'/" says the line must match this pattern, while the trailing "s/'\([0-9][0-9]\)\.\([0-9][a-zA-Z][a-zA-Z]\)'/'\1_\2'/" is the part that does the substitution. The single quotes are repeated in the replacement so that they are kept in the output. You can add a 'g' after the final '/' to make this substitution happen on all instances of this pattern in each line.
The \(\) pairs in the match pattern get converted into the numbered buffers on the substitution side of the command (i.e. \1 \2). Standard awk's sub and gsub have no equivalent of these back-references (GNU awk adds them through gensub).
If you're going to do much of this kind of work, I highly recommend O'Reilly's Sed & Awk book. The time spent going through how sed works will be paid back many times.
I hope this helps.
P.S. as you appear to be a new user, if you get an answer that helps you please remember to mark it as accepted, or give it a + (or -) as a useful answer.
this is a job most suitable for awk or any language that supports breaking/splitting strings.
IMO, using sed for this task, which is regex based , while doable, is difficult to read and debug, hence not the most appropriate tool for the job. No offense to sed fanatics.
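awk '{
    for (i = 1; i <= NF; i++) {
        # \047 is the octal escape for a single quote
        if ($i ~ /\047/) {
            # the dot must be escaped, otherwise it matches any character
            gsub(/\./, "_", $i)
        }
    }
}1' file
The above says: for each field (the field separator is white space by default), check whether it contains a single quote, and if it does, replace every "." in it with "_". This method is simple and doesn't need a complicated regex. Note that it turns every dot inside a quoted field into an underscore, so the first field becomes 'ECJ-4YF1H10_6Z' rather than the 'ECJ-4YF1H10-6Z' shown in the question.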
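One detail worth checking when using gsub this way: awk treats a string given as the first argument as a regular expression, so a bare "." matches every character. The sketch below contrasts the two spellings on a single quoted value. The variable names are illustrative.

```shell
# "." as a dynamic regex matches any character; /\./ matches a literal dot.
any_char=$(printf '%s\n' "'10.0uF'" | awk '{ gsub(".", "_"); print }')
literal=$(printf '%s\n' "'10.0uF'" | awk '{ gsub(/\./, "_"); print }')

printf '%s\n' "$any_char" "$literal"
```

The first call overwrites all eight characters of the value, quotes included, while the second changes only the dot.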
Really would appreciate help on this.
I am using sed to create a CSV file. Essentially multiple html files are all merged to a single html file and sed is then used to remove all the junk pictures etc to get to the raw columnar data.
I have all this working but am stuck on the last bit.
What I want to do is very basic - I want to replace the following lines:
"a variable string"
"end td"
"begin td"
with a single line:
"a variable string"
(with a tab character at the end of this line)
I'M USING DOS.
As you see I'm new to all this. If I could get this working would save me a lot of time in the future so would appreciate the help.
At the moment I have to inject some html headers back into the text file, open it in a html editor, select the table and then paste this into a spreadsheet which is a bit of pain.
P.S. Is there an easy way to get sed to remove the parentheses '(' and ')' from a given line?
I doubt that this is what you really want, but it's what you asked for.
sed "s/\"a variable string\"/&\t/; s/\"end td\"//; s/\"begin td\"//" inputfile
What you probably want to do is replace them when they appear consecutively. Here's how you might do that:
sed "1{N;N}; /\"a variable string\"\n\"end td\"\n\"begin td\"/ s/\n.*$/\t/;ta;bb;:a;N;N;:b;$!P;N;D" inputfile
This will remove all parentheses in a file:
sed "s/[()]//g" inputfile
To select particular lines, you could do something like this:
sed "/foo/ s/[()]//g" inputfile
which will only make the replacement if the word "foo" is somewhere on a line.
Edit: Changed single quotes to double quotes to accommodate GNUWin32 and CMD.EXE.
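For the consecutive-lines case, a simpler alternative sketch is to read the whole file into the pattern space first and then replace the two marker lines wherever they follow each other. This assumes GNU sed (for "\t" in the replacement and for labels terminated by ";") and a POSIX shell with single quotes, so the quoting would need adapting for CMD.EXE. The sample cell contents are invented.

```shell
# Hypothetical sample: two cells separated by the marker lines.
sample='"first cell"
"end td"
"begin td"
"second cell"'

# The :a;N;$!ba loop gathers all input lines into the pattern space, so the
# multi-line pattern can be handled by a single substitution.
merged=$(printf '%s\n' "$sample" |
  sed ':a;N;$!ba;s/\n"end td"\n"begin td"/\t/g')

printf '%s\n' "$merged"
```

The first line then ends in a tab and the two marker lines are gone. Holding the whole file in memory is fine for small inputs but may not suit very large ones.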
A previous comment I left doesn't appear to have been saved, so I will try again.
The code to remove the ( and ) worked perfectly, thanks.
You are right: I was looking to merge the 3 lines into one line, so the second example you gave, where it looks like it's reading the next two lines into the pattern space, looks more promising. The output wasn't what I was expecting, however.
I now realize the code is going to have to be more complicated, and I don't want to trouble you any more. My manual method of injecting some html code back into the text file, opening it in OpenOffice and pasting into a spreadsheet only takes a few seconds, and I have a feeling that writing the sed code to do this by hand would be a nightmare.
Essentially the rules for converting the html would need to be:
[each tag has been formatted so it appears on its own line]
I have given example of an input file and desired output file below for reference
1) if < tr > is followed by < td > on the next line, completely remove the < tr > and < td > lines [i.e. do not output a carriage return] and on the NEXT line stick a " at the start of that line [it doesn't matter about a carriage return at the end of this line as it is going to be edited later]
2) if < /td > is followed by < td >, completely remove both of these lines [again do not output a carriage return after these lines], on the PREVIOUS line output a ", [do not output a carriage return] and on the NEXT line stick a " at the start of the line [don't worry about the ending carriage return as it will be edited later]
3) if < /td > is followed by < /tr >, delete both of these lines and on the previous line add a " to the end of the line and a final carriage return.
I have given an example of what the input and desired output would be:
input: http://medinfo.redirectme.net/input.txt
[the wanted file will be posted in the next message - this board will not allow new users to post a message with more than one hyperlink!]
There is an added issue: the address column is on multiple lines in the input file. This could be reduced to one line by looking to see if the first character of the NEXT line is a ". If it isn't, then do not output the carriage return at the end of the current line.
Phew that was a nightmare just to type out never mind actually code. But thanks again for all your help in getting this far!
:-)
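The three rules listed above could be sketched in sed by reading the whole file into the pattern space and applying one substitution per rule. This is only a sketch under stated assumptions: GNU sed, tags written without inner spaces (<tr>, not < tr >), each tag on its own line, and invented sample data. It does not handle the multi-line address column.

```shell
# Hypothetical two-row table, one tag or value per line.
rows='<tr>
<td>
Alice
</td>
<td>
12 High St
</td>
</tr>
<tr>
<td>
Bob
</td>
<td>
3 Low Rd
</td>
</tr>'

# Gather all lines, then apply rule 1, rule 2 and rule 3 in that order.
csv=$(printf '%s\n' "$rows" | sed -e ':a' -e 'N' -e '$!ba' \
  -e 's/<tr>\n<td>\n/"/g' \
  -e 's/\n<\/td>\n<td>\n/","/g' \
  -e 's/\n<\/td>\n<\/tr>/"/g')

printf '%s\n' "$csv"
```

Each table row comes out as one line of quoted, comma-separated values.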