Capture repeating groups using sed

I'm looking to use sed to capture repeating groups in order to parse a log line
echo "14:14:52.449 [thread] INFO LOGGER - SYMBOL: FIELD1[1.0] FIELD2[2] FIELD3[141452 (2016-11-24 14:14:52.000)] FIELD4[4]" | sed -E "s/(\d\d:\d\d:\d\d.\d\d\d )(.*?\-)( .*?\:)(.*)( FIELD3\[.*?\]).*/\\1\\3\\5/"
I'm looking to capture only the following fields
14:14:52.449 SYMBOL FIELD3[141452 (2016-11-24 14:14:52.000)]
However, I get the entire line back. Any help is deeply appreciated.
14:14:52.449 [thread] INFO LOGGER - SYMBOL: FIELD1[1.0] FIELD2[2] FIELD3[141452 (2016-11-24 14:14:52.000)] FIELD4[4]

With sed:
sed -E "s/^(([0-9]{2}:){2}[0-9]{2}\.[0-9]{3}).*( [^:]*):.*( FIELD3\[[^]]*\]).*/\1\3\4/"
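The attempt in the question most likely returns the whole line because sed -E uses POSIX extended regular expressions, which have no \d digit class and no lazy .*? quantifier, so the substitution never matches and the line is printed unchanged. A minimal sketch of the working command run against the sample line (the variable names line and out are only for illustration):

```shell
line='14:14:52.449 [thread] INFO LOGGER - SYMBOL: FIELD1[1.0] FIELD2[2] FIELD3[141452 (2016-11-24 14:14:52.000)] FIELD4[4]'

# [0-9] replaces \d, and negated bracket expressions ([^:]* and [^]]*)
# stand in for the lazy quantifiers.
out=$(printf '%s\n' "$line" |
  sed -E 's/^(([0-9]{2}:){2}[0-9]{2}\.[0-9]{3}).*( [^:]*):.*( FIELD3\[[^]]*\]).*/\1\3\4/')

printf '%s\n' "$out"
# prints: 14:14:52.449 SYMBOL FIELD3[141452 (2016-11-24 14:14:52.000)]
```

The inner group ([0-9]{2}:) counts as group 2, which is why the replacement uses \1\3\4 rather than \1\2\3.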

Related

Using sed, delete from specific line until first match(not included)

I have some data that looks like this:
1:Alice 2313
2:Desctop 456
3:Cook 111
4:.filename 50
...
...
100:Good 3
Dir num:10
File num:90
...
...
I want to delete all lines from a specific line (e.g. line 3) until the line "Dir num:" shows up.
The ideal output should be (according to the example above):
1:Alice 2313
2:Desctop 456
Dir num:10
File num:90
...
...
I have googled several solutions, like sed -i '/somestring/,$!d' file.
But these solutions are not suitable because the deletion has to start at a specific line.
How can I do this in one command without any temporary file?
Forgive my poor English; I'm not a native English speaker.
You need to specify the address range from the specified line number (3) to the line matching the pattern (/Dir num/). However, it's not quite as simple as
sed '3,/Dir num/ d' file
because that will delete the "Dir num" line. Try this instead:
sed '3,/Dir num/ {/Dir num/! d}' file
For each line in the range, that checks whether the line matches the pattern; if the pattern is not matched, the line is deleted.
Use sed's /pattern1/,/pattern2/ address range:
$ sed -e '/2:Desctop 456/,/Dir num:10/{//!d}' inputFile
1:Alice 2313
2:Desctop 456
Dir num:10
File num:90
...
...
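Both answers can be checked against a trimmed copy of the sample data. A minimal sketch, assuming a POSIX shell; the data is shortened from the question and the variable names are only for illustration:

```shell
data='1:Alice 2313
2:Desctop 456
3:Cook 111
4:.filename 50
100:Good 3
Dir num:10
File num:90'

# Line-number start: delete from line 3 up to, but not including, "Dir num".
by_line=$(printf '%s\n' "$data" | sed '3,/Dir num/{/Dir num/!d;}')

# Pattern start: the empty regex // reuses the last regex that was tested,
# so both boundary lines survive and only the lines between them are deleted.
by_pattern=$(printf '%s\n' "$data" | sed -e '/2:Desctop 456/,/Dir num:10/{//!d;}')

printf '%s\n' "$by_line"
```

With this input both commands leave the same four lines: 1:Alice 2313, 2:Desctop 456, Dir num:10, and File num:90.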

Search xml for a value using sed

I have the XML file below:
<documents>
<document><title>some title1</title><abstract>Some abstract1</abstract></document>
<document><title>some title2</title><abstract>Some abstract2</abstract></document>
<document><title>some title3</title><abstract>Some abstract3</abstract></document>
<document><title>some title4</title><abstract>Some abstract4</abstract></document>
</documents>
I am trying to write a ksh script to fetch the abstract value for the document whose title is title4.
xmllint and xmlstarlet are not available on my machine (access issues).
I have tried:
sed -n '/abstract/{s/.*<abstract>//;s/<\/abstract.*//;p;}' connections.xml
How can I modify this to search based on a title?
Based on the example you have given:
sed -n '/title>.*title4<\/title>/{s#.*<abstract>##;s#</abstract>.*##;p}' file
Will give you:
Some abstract4
grep approach:
grep -Poz '<title>.*?title4</title><abstract>\K[^<>]+(?=</abstract>)' connections.xml && echo ""
The output:
Some abstract4
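Since the goal is a script that looks up an abstract by title, the sed answer can take the title from a shell variable. A minimal sketch, assuming the title contains no regex metacharacters and no / or # characters; the variable names title, xml, and abstract are hypothetical, and the file is a temporary copy of the sample:

```shell
# Temporary copy of the sample file from the question.
xml=$(mktemp)
cat > "$xml" <<'EOF'
<documents>
<document><title>some title1</title><abstract>Some abstract1</abstract></document>
<document><title>some title2</title><abstract>Some abstract2</abstract></document>
<document><title>some title3</title><abstract>Some abstract3</abstract></document>
<document><title>some title4</title><abstract>Some abstract4</abstract></document>
</documents>
EOF

title='some title4'

# Double quotes let the shell expand ${title} inside the sed address;
# the two substitutions strip everything around the abstract text.
abstract=$(sed -n "/<title>${title}<\/title>/{s#.*<abstract>##;s#</abstract>.*##;p;}" "$xml")

printf '%s\n' "$abstract"
# prints: Some abstract4

rm -f "$xml"
```

This only works while each document element sits on a single line, as in the sample.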

Sed add text after match

I have an XMLTV file that has lines in the following style for programme start/stop times:
<programme start="20150914003000" stop="20150914020000" channel="Noor TV">
I want to add +0000 to the end of the start/stop times, like the following:
<programme start="20150914003000 +0000" stop="20150914020000 +0000" channel="Noor TV">
I am using sed on Windows and got this far:
sed -r "/<programme start=\"/ s/^([0-9]{14})/\1 +0000/g" < "xml.xml" > "xml2.xml"
It's giving me "sed can't read >: invalid argument".
In the DOS window I can see it's adding the +0000, but it's not writing the new file.
I know it's something dumb, but I just can't figure it out.
Thanks.
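Try this. The escaped quote \" appears to be what confuses the Windows shell, and the pattern must not start with ^ because the digits are not at the start of the line:
sed -r "/<programme start=/ s/([0-9]{14})/\1 +0000/g" "xml.xml" > "xml2.xml"
or (POSIX version):
sed "/<programme start=/ s/\([0-9]\{14\}\)/\1 +0000/g" "xml.xml" > "xml2.xml"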
$ cat xml.xml
<programme start="20150914003000" stop="20150914020000" channel="Noor TV">
$ sed -r 's/start="([0-9]{14})" stop="([0-9]{14})"/start="\1 +0000" stop="\2 +0000"/' xml.xml > xml2.xml
$ cat xml2.xml
<programme start="20150914003000 +0000" stop="20150914020000 +0000" channel="Noor TV">
I tested this in an online Linux shell.
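A variant that does not depend on the attribute order is to match any 14-digit run on programme lines, with no ^ anchor since the digits are not at the start of the line. A minimal sketch, assuming a POSIX shell rather than the Windows console; the variable names are only for illustration:

```shell
line='<programme start="20150914003000" stop="20150914020000" channel="Noor TV">'

# ERE form (-E): the g flag applies the substitution to both timestamps.
ere=$(printf '%s\n' "$line" | sed -E '/<programme start=/ s/([0-9]{14})/\1 +0000/g')

# BRE form: the group and the interval need backslashes.
bre=$(printf '%s\n' "$line" | sed '/<programme start=/ s/\([0-9]\{14\}\)/\1 +0000/g')

printf '%s\n' "$ere"
# prints: <programme start="20150914003000 +0000" stop="20150914020000 +0000" channel="Noor TV">
```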
The first < needs a backslash, not a forward slash.
sed -r "\<programme start=\"/ s/^([0-9]{14})/\1 +0000/g" < "xml.xml" > "xml2.xml"

How to remove a special string from a file?

I'm trying to remove the following two lines:
<STREAMINFO> 1 39
<VECSIZE> 39<NULLD><MFCC_D_A_0><DIAGC>
which are repeated many times in a text file (hmmdefs) in the folder hmm0.
How could I do that in Unix?
I tried to remove each one separately, but when running the following command on the command line:
sed "<STREAMINFO> 1 39" hmm0/hmmdefs
I receive the following error:
sed: 1: "<STREAMINFO> 1 39": invalid command code <
You need to use the d command to delete the lines matched by the given regex. And don't forget to enclose the regex within / delimiters.
sed "/<STREAMINFO> 1 39/d" hmm0/hmmdefs
To be more specific, add anchors:
sed "/^<STREAMINFO> 1 39$/d" hmm0/hmmdefs
^ asserts that we are at the start of the line and $ asserts that we are at the end of the line.
Example:
$ cat file
<STREAMINFO> 1 39
<VECSIZE> 39<NULLD><MFCC_D_A_0><DIAGC>
foo bar
$ sed '/<STREAMINFO> 1 39\|<VECSIZE> 39<NULLD><MFCC_D_A_0><DIAGC>/d' file
foo bar
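The \| alternation in a basic regular expression is a GNU extension, and the error message in the question appears to come from BSD sed, so a more portable sketch is one anchored d command per unwanted line, each passed with -e. The variable names input and out are only for illustration:

```shell
input='<STREAMINFO> 1 39
<VECSIZE> 39<NULLD><MFCC_D_A_0><DIAGC>
foo bar'

# One delete command per unwanted line; ^ and $ anchor each pattern
# to the whole line so that longer lines are left alone.
out=$(printf '%s\n' "$input" |
  sed -e '/^<STREAMINFO> 1 39$/d' -e '/^<VECSIZE> 39<NULLD><MFCC_D_A_0><DIAGC>$/d')

printf '%s\n' "$out"
# prints: foo bar
```

Single quotes keep the shell from touching the $ anchors.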

Multiple sed substitutions - what's wrong

The idea is to apply both of these substitutions to the pattern space:
sed -re 's/\v\s+/\t/g' -re 's/[(]([^()]+)[)]\s*\t/\1\t/' movies.list
This gives me what I think is a syntax error.
Here is some test data (although it won't match the first pattern):
& The Oriental Groove, Yacine "Els matins a TV3" (2004)
& Vinícius, João Bosco Show da Virada (2011) Teleton 2009 (2009) Teleton 2012 (2012) "Eliana" (2009)
'77 Big Smoker Pig "Pop ràpid" (2011)
'Ariffin, Syaiful Desire (2014/III)
'Aruhane Shaping Bamboo (1979)
'Atu'ake, Taipaleti When the Man Went South (2014)
Just use a ; between your substitution commands:
sed -re 's/\v\s+/\t/g; s/[(]([^()]+)[)]\s*\t/\1\t/' movies.list
The second -r flag is causing the problem. At that point in the argument processing you can't be using that flag apparently. Just drop it.
sed -re 's/\v\s+/\t/g' -e 's/[(]([^()]+)[)]\s*\t/\1\t/' movies.list
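Either form applies the second substitution to the result of the first. A minimal sketch, assuming GNU sed (for \v, \s, and \t) and an invented input line that contains a vertical tab; the variable names are only for illustration:

```shell
# Invented sample: a vertical tab plus two spaces after the name,
# then a parenthesised year followed by a tab.
input=$(printf 'Name\v  Title (2004)\tnext')

# One script, commands separated by ;
one_script=$(printf '%s\n' "$input" |
  sed -r -e 's/\v\s+/\t/g; s/[(]([^()]+)[)]\s*\t/\1\t/')

# Two scripts, one -e per command, with -r given only once
two_scripts=$(printf '%s\n' "$input" |
  sed -r -e 's/\v\s+/\t/g' -e 's/[(]([^()]+)[)]\s*\t/\1\t/')

[ "$one_script" = "$two_scripts" ] && echo "same result"
```

In both cases the vertical tab and spaces collapse to a single tab and the parentheses around the year are removed.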