sed command to extract from xml


I'm writing a script in my Mac terminal; it basically does:
wget http://p2.edms-pr.ccomrcdn.com/player/player_dispatcher.html?section=radio&action=listen_live
This returns XML, which I can save as txt or XML; I'm saving it as "url.xml":
<PlayerContent>
<ListenLiveInitialize>
<StreamInfo>
<stream id="4694" primary_location="rtmp://cp58082.live.edgefcs.net/live/COR_5103_OR#s5137?auth=daEaIcRcbb.afahbOdwbWdjdYcEdYaOaDdc-bn7nM7-4q-PN0X1_3nqDHom4EBvmEuwr&aifp=1234&CHANNELID=4694&CPROG=_&MARKET=PREMIERE&REQUESTOR=EDMS-PR&SERVER_NAME=p2.edms-pr.ccomrcdn.com&SITE_ID=13293&STATION_ID=EDMS-PR&MNM=_&TYPEOFPLAY=0" backup_location=""/>
</StreamInfo>
<JustPlayed/>
I want to use sed to return the auth code inside "primary_location". So basically I want to store
daEaIcRcbb.afahbOdwbWdjdYcEdYaOaDdc-bn7nM7-4q-PN0X1_3nqDHom4EBvmEuwr
in a variable.
I found this online but it doesn't seem to be working.
sed -n 's/.*\(auth=......................................... ...........................\).*/\1/p' url.xml

Try
sed -n 's|^<stream.*auth\=\(.*\)\&ai.*|\1|p' url.xml
which reads the file and matches the line up to the = before the auth code, stores everything from there up to the & in &ai as \1 which is then substituted for the whole pattern space.
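Since the question asks for the value in a variable, here is a minimal sketch using command substitution. The sample file is a shortened, made-up stand-in for url.xml (the host and the auth value are not real), and the capture uses [^&"]* instead of .* so that it stops at the first & or quote; that is a swapped-in variation, not the exact command above.

```shell
# Hypothetical, shortened stand-in for url.xml
tmp=$(mktemp -d)
cat > "$tmp/url.xml" <<'EOF'
<StreamInfo>
<stream id="4694" primary_location="rtmp://example.invalid/live?auth=daEaIc.afah-bn7_3nq&aifp=1234&CHANNELID=4694" backup_location=""/>
</StreamInfo>
EOF

# -n plus the p flag prints only lines where the substitution succeeded;
# \( \) captures everything after auth= up to the next & or "
auth=$(sed -n 's/.*auth=\([^&"]*\).*/\1/p' "$tmp/url.xml")
echo "$auth"
```

The variable can then be used as "$auth" anywhere later in the script.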

You have a stray space in the middle of your run of dots!
This is neater and will output auth= with the value (it looks like it's a string of alphanumerics with dots, hyphens and underscores):
% grep -o 'auth=[[:alnum:]._-]\+' url.xml
You could even use it like so:
% eval $(grep -o 'auth=[[:alnum:]._-]\+' url.xml)
% echo ${auth}
daEaIcRcbb.afahbOdwbWdjdYcEdYaOaDdc-bn7nM7-4q-PN0X1_3nqDHom4EBvmEuwr
Works on OSX.

Related

Cannot use sed with regex via script

I have the following .sed script:
# replace female-male with F-M
s/female/F/
s/male/M/
# capitalize the name when the sport is volleyball or taekwondo
s/^([^,]*,)([^,]+)((,[^,]*){5},(volleyball|taekwondo),)/\1\U\2\L\3/
And the following csv file (first 10 lines)
id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
532037425,A Lam Shin,KOR,female,1986-09-23,1.68,56,handball,0,0,0,
435962603,Aaron Brown,CAN,male,1992-05-27,1.98,79,athletics,0,0,1,
521041435,Aaron Cook,MDA,male,1991-01-02,1.83,80,taekwondo,0,0,0,
33922579,Aaron Gate,NZL,male,1990-11-26,1.81,,cycling,0,0,0,
173071782,Aaron Royle,AUS,male,1990-01-26,1.80,67,triathlon,0,0,0,
266237702,Aaron Russell,USA,male,1993-06-04,,98,volleyball,0,0,1,
382571888,Aaron Younger,AUS,male,1991-09-25,1.93,100,football,0,0,0,
87689776,Aauri Lorena Bokesa,ESP,female,1988-12-14,1.80,62,athletics,0,0,0,
The output must be done by the following command
sed -f script.sed ./file.csv
The problem I have is that despite making sure the regex is matching all the pertinent lines, I can only get it to replace the female-male values with F-M, the rest of the file is still the exact same. The names are not being capitalized.
If I run each regex directly (i.e. sed -E 's/^([^,]*,)([^,]+)((,[^,]*){5},(volleyball|taekwondo),)/\1\U\2\L\3/' file.csv) it works. But I need to do it via a script, with -f.
What am I missing? Thank you.

You still need to indicate that you're using extended regular expressions:
sed -Ef script.sed file.csv
Otherwise, sed uses basic regular expressions, where escaping rules are different, specifically for () for capture groups, and {} for counts.
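To see the difference in isolation, here is a small sketch with a made-up one-line script file (alt.sed is a hypothetical name). The same file is run with and without -E; in basic mode the parentheses, | and + are taken literally, so nothing matches.

```shell
tmp=$(mktemp -d)
# One-line sed script written with extended-regex syntax
printf '%s\n' 's/(a|b)+/X/' > "$tmp/alt.sed"
printf 'abba\n' > "$tmp/in.txt"

# Basic regex: looks for the literal text "(a|b)+", finds none, line unchanged
bre=$(sed -f "$tmp/alt.sed" "$tmp/in.txt")
# Extended regex: ( ) group, | alternates, + repeats, so "abba" becomes "X"
ere=$(sed -E -f "$tmp/alt.sed" "$tmp/in.txt")

echo "without -E: $bre"
echo "with -E:    $ere"
```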

Have you tried using sed -Ef <script> <csv file>? You need -E to use extended regular expressions.

Exiftool: Want to output to one text file using -w command

I'm currently trying to use exiftool on the Windows command prompt to read metadata from multiple files, then output to a single text file.
The exact command I last tried looked like this:
exiftool.exe -FileName -GPSPosition -CreateDate -d "%m:%d:%Y %H:%M:%S" -c "%d° %d' %.2f"\" -charset UTF-8 -ext jpg -w _Coordinate_Date.txt S:\Nick\Test\
When I run this, I get 7 individual text files with the content for one corresponding file in each of them. However, I simply want to output all of it to one single text file. Any help is greatly appreciated

The -w (textout) option can only be used to write multiple files. It is not meant to be used to output to a single file. As per the docs on -w:
It is not possible to specify a simple filename as an argument -- creating a single output file from multiple source files is typically done by shell redirection
Which is what you're doing with the >> ./output.txt part of your command. The -w _Coordinate_Date.txt isn't doing anything, and I would expect it to throw an Invalid TAG name: "w _Coordinate_Date.txt" error if quoted together like that, as it gets treated as a single argument. The -w option requires two arguments: the -w itself and either an extension or a format string.

I actually figured it out, if you wrap the entire -w _Coordinate_Date.txt command in quotations and append it to a file, you can throw all of the output into one text file.
i.e. "-w _Coordinate_Date.txt >> ./output.txt"

Remove a specific word from a file using shell script

I would request some help with a basic shell script that should do the following job.
Find a particular word in a given file (file path is always constant)
Backup the file
Delete the specific word or replace the word with ;
Save the file changes
Example
File Name - abc.cfg
Contains the following lines
network;private;Temp;Windows;System32
I've used the following SED command for the operation
sed -i -e "/Temp;/d" abc.cfg
The output is not as expected. The complete line is removed instead of just the word Temp;
Any help would be appreciated. Thank you

sed works line by line, and d is the delete command, which is why the whole line is deleted. Instead, use substitution to replace the offending word with nothing:
sed 's/Temp;//g' abc.cfg
The g flag means "global", in case the offending word appears more than once on a line. In general, I would hold off on the -i (in-place) flag until you are sure of your command, or use -i.backup to keep a copy of the original.
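A short sketch of that workflow, run against a throwaway copy of the sample line from the question (the temporary directory is only there so the example does not touch a real abc.cfg): preview the substitution first, then apply it in place with a backup suffix.

```shell
tmp=$(mktemp -d)
printf 'network;private;Temp;Windows;System32\n' > "$tmp/abc.cfg"

# Preview: prints the edited text, leaves the file untouched
preview=$(sed 's/Temp;//g' "$tmp/abc.cfg")
echo "$preview"

# Apply in place; the original is kept as abc.cfg.backup
sed -i.backup 's/Temp;//g' "$tmp/abc.cfg"
edited=$(cat "$tmp/abc.cfg")
original=$(cat "$tmp/abc.cfg.backup")
```

Writing the suffix directly after -i, with no space, is the form I would expect both GNU and BSD sed to accept.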

Thank you. I used your suggestion but couldn't get through. I appreciate the input though.
I was able to achieve this using the following SED syntax
sed -e "s/Temp//g" -i.backup abc.cfg
I wanted to take the backup before the change & hence -i was helpful.

Replace specials characters with sed

I am trying to use a sed command to replace special characters in my file.
The characters %> should be replaced by ].
I'm using sed -r s/\%>\/\]\/g but I get this error: bash: /]/g: No such file or directory. It looks like sed doesn't like it.

Put your sed code inside quotes and also add the file-path you want to work with and finally don't escape the sed delimiters.
$ echo '%>' | sed 's/%>/]/g'
]
ie,
sed 's/%>/]/g' file

To complement Avinash Raj's correct and helpful answer:
Since you were using an overall unquoted string (neither single- nor double-quoted), you were on the right track by \-escaping individual characters in your sed command.
However, you neglected to \-quote >, which is what caused your problem:
> is one of the shell's so-called metacharacters
Metacharacters have special meaning and separate words
Thus, s/\%>\/\]\/g is mistakenly split into 2 arguments by >:
s/\% is passed to sed - as s/%, because the shell removes the \ instances (a process called quote removal).
As you can see, this is not a valid sed command, but that doesn't even come into play - see below.
>\/\]\/g is interpreted by the shell (bash), because it starts with output-redirection operator >; after quote removal, the shell sees >/]/g, tries to open file /]/g for writing, and fails, because your system doesn't have a subdirectory named ] in its root directory.
bash tries to open an output file specified by a redirection before running the command and, if it fails to open the file, does not run the command - which is what happened here:
bash complained about the nonexistent target directory and aborted processing of the command - sed was never even invoked.
Upshot:
In a string that is neither enclosed in single nor in double-quotes, you must \-quote:
all metacharacters: | & ; ( ) < > space tab
additionally, to prevent accidental pathname expansion (globbing): * ? [
Also note that if you need to quote (escape) characters for sed, you need to add an extra layer of quoting; for instance, to instruct sed to use a literal . in the regex, you must pass \\. - two backslashes - so that sed sees the properly escaped \..
Given the above, it is much simpler to (habitually) use single quotes around your sed command, because it ensures that the string is passed as is to sed.
Let's compare a working version of your command to the one from Avinash Raj's answer (leaving out the -r for brevity):
sed s/\%\>\/\]\/g # ok - all metachars. \-quoted, others are, but needn't be quoted
sed s/%\>/]/g # ok - minimum \-quoting
sed 's/%>/]/g' # simplest: single-quoted command
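The extra layer of quoting mentioned above is easy to check with a small sketch; the input string a.b is just an illustration. Unquoted, one backslash is consumed by the shell, so sed only sees an escaped dot when two are typed; inside single quotes one is enough.

```shell
# Unquoted, two backslashes: the shell passes s/\./-/ to sed (literal dot)
unquoted_ok=$(echo 'a.b' | sed s/\\./-/)
# Single-quoted: the string reaches sed exactly as written
quoted=$(echo 'a.b' | sed 's/\./-/')
# Unquoted, one backslash: the shell removes it, sed sees s/./-/
# and replaces the first character of the line, whatever it is
unquoted_wrong=$(echo 'a.b' | sed s/\./-/)

echo "$unquoted_ok $quoted $unquoted_wrong"
```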

I'm not sure whether I got the question correctly. If you want to replace either % or > by ] then sed is not required here. Use tr in this case:
tr '%>' ']' < input.txt
If you want to replace the sequence %> by ] then the sed command as shown by @AvinashRaj is the way to go.
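The two readings give different results on the same input, as this sketch shows. The sample string is made up, and the tr call spells out ]] so that each source character has an explicit replacement instead of relying on how a given tr pads a shorter second set.

```shell
input='%>a%b>'

# tr works per character: every % and every > becomes ]
by_char=$(printf '%s\n' "$input" | tr '%>' ']]')
# sed works on the two-character sequence %> only
by_seq=$(printf '%s\n' "$input" | sed 's/%>/]/g')

echo "tr:  $by_char"
echo "sed: $by_seq"
```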

How do I run the sed command with input and output as the same file?

I'm trying to use the sed command in a shell script where I want to remove lines that read STARTremoveThisComment and lines that read removeThisCommentEND.
I'm able to do it when I copy it to a new file using
sed 's/STARTremoveThisComment//' > test
But how do I do this by using the same file as input and output?

sed -i (or the long form, --in-place) automates what you would otherwise do by hand with less advanced implementations: sending the output to a temporary file, then renaming that back to the original.
The -i is for in-place editing, and you can also provide a backup suffix for keeping a copy of the original:
sed -i.bak 'whatever' fileToChange
sed --in-place=.bak 'whatever' fileToChange
Both of those will keep the original file in fileToChange.bak.
Keep in mind that in-place editing may not be available in all sed implementations but it is in GNU sed which should be available on all variants of Linux, as per your tags.
If you're using a more primitive implementation, you can use something like:
cp oldfile oldfile.bak && sed 'whatever' oldfile >newfile && mv newfile oldfile
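Applied to the question, the fallback might look like the sketch below. The file contents and the .new and .bak names are made up for illustration, and it uses the d command to drop the matching lines, since the question talks about removing lines.

```shell
tmp=$(mktemp -d)
printf 'keep\nSTARTremoveThisComment\nkeep too\nremoveThisCommentEND\n' > "$tmp/test"

# Back up, filter into a new file, then rename it over the original.
# The && chain stops before mv if cp or sed fails.
cp "$tmp/test" "$tmp/test.bak" &&
sed -e '/STARTremoveThisComment/d' -e '/removeThisCommentEND/d' "$tmp/test" > "$tmp/test.new" &&
mv "$tmp/test.new" "$tmp/test"

result=$(cat "$tmp/test")
echo "$result"
```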

You can use the flag -i for in-place editing and the -e for specifying normal script expression:
sed -i -e 's/pattern_to_search/text_to_replace/' file.txt
To delete lines that match a certain pattern you can use the simpler syntax. Notice the d command:
sed -i '/pattern_to_search/d' file.txt

You really should not use sed for that. This question seems to come up ridiculously often, and it seems very strange that it does since the general solution is so trivial. It seems bizarre that people want to know how to do it in sed, and in python, and in ruby, etc. If you want to have a filter operate on an input and overwrite it, use the following simple script:
#!/bin/sh -e
in=${1?No input file specified}
mv "$in" "${bak=.$in.bak}"
shift
"$@" < "$bak" > "$in"
Put that in your path in an executable file named inline, and then the problem is solved in general. For example:
inline input-file sed -e s/foo/bar/g
Now, if you want to add logic to keep multiple backups, or if you have some options to change the backup naming scheme, or whatever, you fix it in one place. What's the command line option to get 1-up counters on the backup file when processing a file in-place with perl? What about with ruby? Is the option different for gnu-sed? How does awk handle it? The whole friggin' point of unix is that tools do one thing only. Handling logic for backup files is a second thing, and needs to be factored out. If you are implementing a tool, do not add logic to create backup files. Tell your users to use a 2nd tool for that. Integration is bad. Modularity is good. That is the unix way.
Notice that this script has several problems. The permissions/mode of the input file may be changed, for example. I'm sure there are innumerable other issues. However, by putting the backup logic in a wrapper script, you localize all of these issues and don't have to worry that sed overwrites the files and changes mode, while python keeps the file in place and does not change the inode (I made up those two cases, the point being that not all tools will use the same logic, while the wrapper script will.)

As far as I know, it is not possible to redirect the output straight back onto the input file. One solution is to make a shell script that saves the output to another file and then renames it over the input file:
sed -e s/try/this/g input.file > output.file;mv output.file input.file

I suggest using sponge
sponge reads standard input and writes it out to the specified file.
Unlike a shell redirect, sponge soaks up all its input before writing
the output file. This allows constructing pipelines that read from and
write to the same file.
cat test | sed 's/STARTremoveThisComment//' | sponge test
