How can I swap columns separated by ":" using sed?
for example
string1:string2
string3 string4:string5
string6:string7-string8
into
string2:string1
string5:string3 string4
string7-string8:string6
thanks!
This code will swap the columns around : in a file named example.txt -
sed -i -r 's/(.+):(.+)/\2:\1/' example.txt
Explanation -
-i is for in-place substitution
-r forces sed to use extended regular expression syntax
.+ says look for any character one or more times. This is a very "greedy" regular expression, but it works in this case. The parentheses capture the matched text.
Finally, \2 and \1 are used in the replacement, in reverse order, to swap the columns around :
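To make the greediness concrete, here is a small sketch, assuming a sed that accepts -E (the more portable spelling of GNU's -r). The a:b:c row is a hypothetical extra line, not from the question; it shows that the greedy pattern splits at the last colon, while a [^:]+ first group splits at the first one.

```shell
# Sample rows from the question, plus a hypothetical row with two colons.
rows='string1:string2
string3 string4:string5
a:b:c'

# Greedy first group: the split happens at the LAST colon on the line.
greedy=$(printf '%s\n' "$rows" | sed -E 's/(.+):(.+)/\2:\1/')

# [^:]+ cannot cross a colon, so the split happens at the FIRST colon.
first=$(printf '%s\n' "$rows" | sed -E 's/^([^:]+):(.+)$/\2:\1/')

printf '%s\n' "$greedy"
printf '%s\n' "$first"
```

For input with exactly one colon per line, as in the question, both commands give the same result.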
I just want to replace a few strings in a file with nothing, but sed replaces the whole line. Can someone help me with this?
line in file.xml:
<tag>sample text1 text2</tag>
My code:
sed "s/'text1 text2'//" file.xml 2>/dev/null || :
I also tried
sed -i -e "s/'text1 text2'//" file.xml 2>/dev/null || :
expected result:
<tag>sample</tag>
Actual result:
The whole line is removed from file.
Others:
text1 and text2 are complex strings with . = - characters in them
What can I do to fix this?
TIA
Remove the single quotes:
sed "s/text1 text2//" file.xml
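Since the question mentions that the real strings contain characters like . = and -, note that . is a regex metacharacter. One possible way to handle that, sketched below, is to backslash-escape the literal text before using it as a pattern. The line and the literal here are made-up stand-ins, since the real strings are not shown.

```shell
# Hypothetical line and literal text; the real strings are not in the question.
line='<tag>sample a.b=c-d e.f</tag>'
literal=' a.b=c-d e.f'

# Put a backslash in front of characters that are special in a basic
# regular expression, and in front of /, the delimiter used below.
# | is used as the delimiter here so that / needs no escaping in the bracket.
escaped=$(printf '%s\n' "$literal" | sed 's|[][\\/.*^$]|\\&|g')

# Remove the literal text from the line.
cleaned=$(printf '%s\n' "$line" | sed "s/$escaped//")
printf '%s\n' "$cleaned"
```

Without the escaping step, the dots in the pattern would match any character, not just a literal dot.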
You could use
sed 's/\([^ ]*\)[^<]*\(.*\)/\1\2/' filename
Output:
<tag>sample</tag>
Grouping is used. First all characters till a space are grouped together, then all characters till a < are matched and all following characters are grouped into another group.
If I have
123456red100green
123456bee010yellow
123456usb110orange
123456sos011querty
123456let101bottle
and I want it to be
123456red111green
123456bee111yellow
123456usb111orange
123456sos111querty
123456let111bottle
notice: the first 6 characters don't change,
the following 6 change,
also these strings might be anywhere in a file (beginning, end, anywhere)
I want to specify sed to
1)find 123456
2)skip the next three characters
3)replace the next three with 111
The closest I've come to is:
sed '/s/123456....../123456...111/g'
I know dots mean anything, but I don't know the equivalent on the other side. In short: how do I tell sed to leave characters in a match untouched?
Sorry for having been unclear about what I want; please bear with me.
Matching 123456 followed by three characters that are not to be modified, and then replacing the next three characters with 111:
sed 's/\(123456...\).../\1111/g' file
The \( ... \) captures the part of the string that we don't want to modify. These are re-inserted with \1. The whole matching bit of the line is replaced by "the bit in the \( ... \) (i.e. \1) followed by 111".
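As a quick check, here is that command run on a few of the sample lines. The third row, with text before and after the match, is a made-up variation to cover the "anywhere in a file" case.

```shell
# Two rows from the question plus a hypothetical row with surrounding text.
rows='123456red100green
123456bee010yellow
xx123456usb110orange yy'

# \(123456...\) is kept via \1; the following three characters become 111.
fixed=$(printf '%s\n' "$rows" | sed 's/\(123456...\).../\1111/g')
printf '%s\n' "$fixed"
```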
If you want to change each and every zero (as in your examples), then just sed 's/0/1/g' would do. Or sed -e '/^123456/ s/0/1/g' to do the same on lines starting with 123456.
But to count characters, as you ask, use ( .. ) to capture the varying parts and \1 to replace them (using sed -E). So:
echo 123456abcdefgh | sed -Ee 's/^(123456...).../\1111/'
outputs 123456abc111gh. The \1 puts back the part matched by 123456..., and the three 1s that follow it are literal characters.
(Without -E, you'd need \( .. \) to group.)
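The two approaches only differ when there are zeros outside the three target positions. A small sketch with a made-up input line (not from the question) shows the difference:

```shell
# Hypothetical line with zeros in the skipped part and after the target.
line='123456a0c010zz0'

# Replace every zero on lines that start with 123456.
every_zero=$(printf '%s\n' "$line" | sed -e '/^123456/ s/0/1/g')

# Replace only characters 10-12, whatever they are, with 111.
by_position=$(printf '%s\n' "$line" | sed -E 's/^(123456...).../\1111/')

printf '%s\n' "$every_zero"    # 123456a1c111zz1
printf '%s\n' "$by_position"   # 123456a0c111zz0
```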
I have a text file full of lines looking like:
Female,"$0 to $25,000",Arlington Heights,0,60462,ZD111326,9/18/13 0:21,Disk Drive
I am trying to change all of the commas , to pipes |, except for the commas within the quotes.
Trying to use sed (which I am new to)... and it is not working. Using:
sed '/".*"/!s/\,/|/g' textfile.csv
Any thoughts?
As a test case, consider this file:
Female,"$0 to $25,000",Arlington Heights,0,60462,ZD111326,9/18/13 0:21,Disk Drive
foo,foo,"x,y,z",foo,"a,b,c",foo,"yes,no"
"x,y,z",foo,"a,b,c",foo,"yes,no",foo
Here is a sed command to replace non-quoted commas with pipe symbols:
$ sed -r ':a; s/^([^"]*("[^"]*"[^"]*)*),/\1|/g; t a' file
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
foo|foo|"x,y,z"|foo|"a,b,c"|foo|"yes,no"
"x,y,z"|foo|"a,b,c"|foo|"yes,no"|foo
Explanation
This looks for commas that appear after pairs of double quotes and replaces them with pipe symbols.
:a
This defines a label a.
s/^([^"]*("[^"]*"[^"]*)*),/\1|/g
If 0, 2, 4, or any other even number of quotes precede a comma on the line, then replace that comma with a pipe symbol.
^
This matches at the start of the line.
(
This starts the main grouping (\1).
[^"]*
This looks for zero or more non-quote characters.
("[^"]*"[^"]*)*
The * outside the parens means that we are looking for zero or more of the pattern inside the parens. The pattern inside the parens consists of a quote, any number of non-quotes, a quote and then any number of non-quotes.
In other words, this grouping only matches pairs of quotes. Because of the * outside the parens, it can match any even number of quotes.
)
This closes the main grouping
,
This requires that the grouping be followed by a comma.
t a
If the previous s command successfully made a substitution, then the test command tells sed to jump back to label a and try again.
If no substitution was made, then we are done.
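For reference, the same loop can be written with one -e per command, which should also work on seds that do not accept a label followed by ; on the same line. This is a sketch assuming a sed that supports -E; the g flag is dropped because the pattern is anchored with ^ and can only match once per pass anyway.

```shell
# One of the test lines from above.
line='foo,foo,"x,y,z",foo,"a,b,c",foo,"yes,no"'

# :a defines the label, s/// replaces one unquoted comma per pass,
# and "t a" loops back as long as a substitution was made.
piped=$(printf '%s\n' "$line" |
  sed -E -e ':a' -e 's/^([^"]*("[^"]*"[^"]*)*),/\1|/' -e 't a')
printf '%s\n' "$piped"
```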
Using awk could be easier:
kent$ cat f
foo,foo,"x,y,z",foo,"a,b,c",foo,"yes,no"
Female,"$0 to $25,000",Arlington Heights,0,60462,ZD111326,9/18/13 0:21,Disk Drive
kent$ awk -F'"' -v OFS='"' '{for(i=1;i<=NF;i++)if(i%2)gsub(",","|",$i)}7' f
foo|foo|"x,y,z"|foo|"a,b,c"|foo|"yes,no"
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
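A spelled-out version of the same idea, as a sketch: splitting each line on double quotes means the odd-numbered fields are the parts outside quotes, so only those get their commas replaced. The trailing 7 in the one-liner is just a true condition that prints the line; print is written out here. The input line is a shortened, made-up example.

```shell
# Hypothetical short input line.
line='foo,"x,y,z",bar'

piped=$(printf '%s\n' "$line" | awk -F'"' -v OFS='"' '{
    # Fields 1, 3, 5, ... are outside quotes; 2, 4, 6, ... are inside.
    for (i = 1; i <= NF; i += 2)
        gsub(/,/, "|", $i)
    print
}')
printf '%s\n' "$piped"
```

Like the one-liner, this assumes quotes are balanced and that no quoted field contains an escaped double quote.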
I suggest a language with a proper CSV parser. For example:
ruby -rcsv -ne 'puts CSV.generate_line(CSV.parse_line($_), :col_sep=>"|")' file
Female|$0 to $25,000|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
Here I would use GNU awk's FPAT. It defines what a field looks like, as opposed to FS, which defines what the separator is. Then you can just set the output separator to |
awk '{$1=$1}1' OFS=\| FPAT="([^,]+)|(\"[^\"]+\")" file
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
If your awk does not support FPAT, this can be used:
awk -F, '{for (i=1;i<NF;i++) {c+=gsub(/\"/,"&",$i);printf "%s"(c%2?FS:"|"),$i}print $NF}' file
Female|"$0 to $25,000"|Arlington Heights|0|60462|ZD111326|9/18/13 0:21|Disk Drive
sed 's/"\(.*\),\(.*\)"/"\1##HOLD##\2"/g;s/,/|/g;s/##HOLD##/,/g'
This will match the text in quotes and put a placeholder for the commas, then switch all the other commas to pipes and put the placeholder back to commas. You can change the ##HOLD## text to whatever you want.
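One caveat, as far as I can tell: the first substitution protects only one comma per line, because .* is greedy and the pattern needs exactly one comma between the quotes. The sketch below runs the command on a shortened version of the question's line and on a made-up line with two commas inside the quotes.

```shell
# Same command as above, wrapped in a hypothetical helper function.
hold_swap() {
  sed 's/"\(.*\),\(.*\)"/"\1##HOLD##\2"/g;s/,/|/g;s/##HOLD##/,/g'
}

# One comma inside the quotes: handled as intended.
one=$(printf '%s\n' 'Female,"$0 to $25,000",Arlington' | hold_swap)

# Two commas inside the quotes: only the last one is protected.
two=$(printf '%s\n' 'a,"x,y,z",b' | hold_swap)

printf '%s\n' "$one"   # Female|"$0 to $25,000"|Arlington
printf '%s\n' "$two"   # a|"x|y,z"|b
```

So this is fine for data like the sample line, but the loop or awk approaches are safer if quoted fields can hold several commas.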
I have many lines in a file which only contain '--' on each line, which I want to remove. But there are many other lines in the file that contain 'SOMETEXT--SOMETEXT'.
sed -i "/--/d" will remove every line that contains '--', but I only want to remove the lines that contain only '--'.
You can use ^ and $ to indicate beginning and end of line
sed -i '/^--$/d' file
A line containing only -- would match the regex ^--$
If you want to include lines with leading/trailing whitespaces, it could be extended to
^\s*--\s*$
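A quick way to try this without touching a file is to pipe some test lines through sed without -i. The sketch below uses [[:space:]] instead of \s, since \s is a GNU extension; the input lines are made up.

```shell
# Hypothetical input: one line to keep, a bare --, a padded --, another keeper.
kept=$(printf '%s\n' 'SOMETEXT--SOMETEXT' '--' '  --  ' 'last line' |
  sed '/^[[:space:]]*--[[:space:]]*$/d')
printf '%s\n' "$kept"
```

Only the lines consisting of -- (with optional surrounding whitespace) are dropped; SOMETEXT--SOMETEXT stays.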
sed -i '/^--$/d' file
The ^ and $ chars "anchor" the search to the beginning and end of the line, respectively.
OR, if there can be spaces at the front or back, AND assuming a modernish sed
sed -i '/^[[:space:]]*--[[:space:]]*$/d' file
where [:space:] will find space chars and tabs.
ELSE a total retro sed should handle
sed '/^[ ]*--[ ]*$/d' file > newFile && mv newFile file
and if there could be tabs, then just include a tab char along with the space char, i.e.
[<Space><TAB>]
but not spelled out; just typing a space char and a tab char will do it.
IHTH