Extracting a string from a file name

My script takes a file name in the form R#TYPE.TXT (# is a number and TYPE is two or three characters).
I want my script to give me TYPE. What should I do to get it? Guess I need to use awk and sed.
I'm using /bin/sh (which is a requirement).

You can use awk:
$ echo R1CcC.TXT | awk '{sub(/.*[0-9]/,"");sub(/\.TXT$/,"")}{print}'
CcC
or
$ echo R1CcC.TXT | awk '{gsub(/.*[0-9]|\.TXT$/,"");print}'
CcC
and if sed is really what you want
$ echo R9XXX.TXT | sed 's/R[0-9]\(.*\)\.TXT/\1/'
XXX

I think this is what you are looking for.
$ echo R3cf.txt | sed "s/.[0-9]\(.*\)\..*/\1/"
cf
If the extension is always upper-case TXT and the filename always starts with R, you could do something like:
$ echo R3CF.TXT | sed "s/R[0-9]\(.*\)\.TXT/\1/"

You can use just the shell (depending on which shell your /bin/sh is):
f=R9ABC.TXT
f="${f%.TXT}" # remove the extension
type="${f#R[0-9]}" # remove the first bit
echo "$type" # ==> ABC
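The parameter-expansion approach above can be wrapped in a small function that also rejects names that do not look like R#TYPE.TXT. This is only a sketch: the name extract_type is made up for illustration, and it assumes exactly one digit after the R, as in the examples.

```sh
#!/bin/sh
# extract_type (hypothetical helper): print TYPE from a name like R9ABC.TXT.
# Assumes exactly one digit after the leading R, as in the examples above.
extract_type() {
    case $1 in
        R[0-9]?*.TXT) ;;             # looks like R#TYPE.TXT, keep going
        *) return 1 ;;               # anything else is rejected
    esac
    name=${1%.TXT}                   # remove the extension
    printf '%s\n' "${name#R[0-9]}"   # remove the leading R and digit
}

extract_type R9ABC.TXT   # prints ABC
```

Only case and parameter expansion are used, so no external command is started.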

Related

How to remove after second period in a string using sed

In my script, I have a possible version number, 15.03.2, stored in the variable $STRING. The numbers change over time. I want to strip it down to 15.03 (or whatever it will be next time).
How do I remove everything after the second . using sed?
Something like:
$(echo "$STRING" | sed "s/\.^$\.//")
(I don't know what ^, $ and others do, but they look related, so I just guessed.)

I think the better tool here is cut
echo '15.03.2' | cut -d . -f -2

This might work for you (GNU sed):
sed 's/\.[^.]*//2g' file
This removes the second and every later occurrence of a period followed by zero or more non-period characters.

$ echo '15.03.2' | sed 's/\([^.]*\.[^.]*\)\..*/\1/'
15.03
More generally to skip N periods:
$ echo '15.03.2.3.4.5' | sed -E 's/(([^.]*\.){2}[^.]*)\..*/\1/'
15.03.2
$ echo '15.03.2.3.4.5' | sed -E 's/(([^.]*\.){3}[^.]*)\..*/\1/'
15.03.2.3
$ echo '15.03.2.3.4.5' | sed -E 's/(([^.]*\.){4}[^.]*)\..*/\1/'
15.03.2.3.4
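For comparison, the cut approach suggested earlier generalizes to any number of leading fields. A minimal sketch, with keep_fields as a made-up name and the field count passed as the first argument:

```sh
#!/bin/sh
# keep_fields N STRING (hypothetical helper): keep the first N
# period-separated fields of STRING and drop the rest.
keep_fields() {
    printf '%s\n' "$2" | cut -d . -f "1-$1"
}

STRING=15.03.2
keep_fields 2 "$STRING"          # prints 15.03
keep_fields 3 15.03.2.3.4.5      # prints 15.03.2
```

If the string has fewer than N fields, cut simply prints what is there.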

Shell: sed pipeline

I'm trying to write a script that redirects data from one serial port to another.
I did it with this command:
cat /dev/ttyS0 > /dev/ttyS1
Everything works, but now I would also like to log the data. I thought I'd use the tee command:
cat /dev/ttyS0 | tee /dev/ttyS1 log.txt
Now I want every line written to the log file to be preceded by the string "from S0 to S1:". I tried this:
cat /dev/ttyS0 | tee /dev/ttyS1 | sed 's/$/from S0 to S1/' | less > log.txt
But it does not work; the file remains empty.
What am I doing wrong?

Try:
cat /dev/ttyS0 | tee /dev/ttyS1 | sed 's/^/from S0 to S1: /' | tee log.txt
Since you wanted to prefix the line with the string, the $ in your sed has been replaced by ^. The substituted output is sent to STDOUT that can serve as an input for tee.

Not sure if this helps, but I'd remove the pager from the pipeline and redirect the sed output directly to the file. Also, if you want to prepend text you need to match the beginning of a line (^) not the end of a line ($).
... | sed 's/^/from S0 to S1: /' > log.txt
Also, what does the input look like in the first place? Does it contain linebreaks that the pattern could match?
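Both answers come down to the same sed substitution. Here it is as a function you can try without a serial port; prefix_lines is a made-up name, and printf stands in for cat /dev/ttyS0. If the log still stays empty on a live port, output buffering in sed is one plausible cause (GNU sed has a -u / --unbuffered option), but whether that applies depends on your setup.

```sh
#!/bin/sh
# prefix_lines (hypothetical helper): prepend a fixed label to every
# line read from standard input.
prefix_lines() {
    sed 's/^/from S0 to S1: /'
}

# printf stands in for the serial port here.
printf 'hello\nworld\n' | prefix_lines
# On the real ports the pipeline would look something like:
#   cat /dev/ttyS0 | tee /dev/ttyS1 | prefix_lines >> log.txt
```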

Replace string with substring in lowercase using sed / awk / tr / perl?

I have a plaintext file containing multiple instances of the pattern $$DATABASE_*$$ and the asterisk could be any string of characters. I'd like to replace the entire instance with whatever is in the asterisk portion, but lowercase.
Here is a test file:
$$DATABASE_GIBSON$$
test me $$DATABASE_GIBSON$$ test me
$$DATABASE_GIBSON$$ test $$DATABASE_GIBSON$$ test
$$DATABASE_GIBSON$$ $$DATABASE_GIBSON$$$$DATABASE_GIBSON$$
Here is the desired output:
gibson
test me gibson test me
gibson test gibson test
gibson gibsongibson
How do I do this with sed/awk/tr/perl?

Here's the perl version I ended up using.
perl -p -i.bak -e 's/\$\$DATABASE_(.*?)\$\$/lc($1)/eg' inputFile

Unfortunately there's no easy, foolproof way with awk, but here's one approach:
$ cat tst.awk
{
    gsub(/[$][$]/,"\n")
    head = ""
    tail = $0
    while ( match(tail, "\nDATABASE_[^\n]+\n") ) {
        head = head substr(tail,1,RSTART-1)
        trgt = substr(tail,RSTART,RLENGTH)
        tail = substr(tail,RSTART+RLENGTH)
        gsub(/\n(DATABASE_)?/,"",trgt)
        head = head tolower(trgt)
    }
    $0 = head tail
    gsub("\n","$$")
    print
}
$ cat file
The quick brown $$DATABASE_FOX$$ jumped over the lazy $$DATABASE_DOG$$s back.
The grey $$DATABASE_SQUIRREL$$ ate $$DATABASE_NUT$$s under a $$DATABASE_TREE$$.
Put a dollar $$DATABASE_DOL$LAR$$ in the $$ string.
$ awk -f tst.awk file
The quick brown fox jumped over the lazy dogs back.
The grey squirrel ate nuts under a tree.
Put a dollar dol$lar in the $$ string.
Note the trick of converting $$ to a newline character so that we can negate that character in the match() regexp. Without it (i.e. if we used ".+" instead of "[^\n]+"), greedy matching would mean that when the pattern appears twice on one input line, the matched string extends from the start of the first occurrence to the end of the second.
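If the names between the markers never contain a $ themselves (an assumption the script above deliberately avoids making), the same match() loop can be written without the newline substitution. A sketch, with lower_db_tags as a made-up function name:

```sh
#!/bin/sh
# lower_db_tags (hypothetical helper): replace every $$DATABASE_NAME$$
# with name in lower case. Assumes NAME contains no "$" character.
lower_db_tags() {
    awk '{
        out = ""
        rest = $0
        while (match(rest, /[$][$]DATABASE_[^$]+[$][$]/)) {
            # "$$DATABASE_" is 11 characters and the closing "$$" is 2
            name = substr(rest, RSTART + 11, RLENGTH - 13)
            out = out substr(rest, 1, RSTART - 1) tolower(name)
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }' "$@"
}

printf '%s\n' 'test me $$DATABASE_GIBSON$$ test me' | lower_db_tags
# prints: test me gibson test me
```

Because [^$]+ cannot cross a dollar sign, two tags on one line are matched separately rather than as one long greedy match.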

This one works with complicated examples.
perl -ple 's/\$\$DATABASE_(.*?)\$\$/lc($1)/eg' filename.txt
And for simpler examples:
echo '$$DATABASE_GIBSON$$' | sed 's#$$DATABASE_\(.*\)\$\$#\L\1#'
In GNU sed, \L converts the rest of the replacement to lower case (\E stops the conversion if needed).

Using awk alone:
> echo '$$DATABASE_AWESOME$$' | awk '{sub(/.*_/,"");sub(/\$\$$/,"");print tolower($0);}'
awesome
Note that I'm in FreeBSD, so this is not GNU awk.
But this can be done using bash alone:
[ghoti#pc ~]$ foo='$$DATABASE_AWESOME$$'
[ghoti#pc ~]$ foo=${foo##*_}
[ghoti#pc ~]$ foo=${foo%\$\$}
[ghoti#pc ~]$ foo=${foo,,}
[ghoti#pc ~]$ echo $foo
awesome
Of the above substitutions, all except the last one (${foo,,}) will work in a POSIX shell. If you don't have bash, you can instead use tr for this step:
$ echo $foo
AWESOME
$ foo=$(echo "$foo" | tr '[:upper:]' '[:lower:]')
$ echo $foo
awesome
$
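Put together, the portable steps above fit in one small function. This is a sketch: db_name_lower is a made-up name, and like the substitutions it is built from, it assumes the name itself contains no underscore and that the string ends with $$.

```sh
#!/bin/sh
# db_name_lower (hypothetical helper): turn $$DATABASE_NAME$$ into name.
db_name_lower() {
    name=${1##*_}          # drop everything up to the last underscore
    name=${name%\$\$}      # drop the trailing $$
    printf '%s\n' "$name" | tr '[:upper:]' '[:lower:]'
}

db_name_lower '$$DATABASE_AWESOME$$'   # prints awesome
```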
UPDATE:
Per comments, it seems that what the OP really wants is to strip the substring out of any text in which it is included -- that is, our solutions need to account for the possibility of leading or trailing spaces, before or after the string he provided in his question.
> echo 'foo $$DATABASE_KITTENS$$ bar' | sed -nE '/\$\$[^$]+\$\$/{;s/.*\$\$DATABASE_//;s/\$\$.*//;p;}' | tr '[:upper:]' '[:lower:]'
kittens
And if you happen to have pcregrep on your path (from the devel/pcre FreeBSD port), you can use that instead, with lookaheads:
> echo 'foo $$DATABASE_KITTENS$$ bar' | pcregrep -o '(?!\$\$DATABASE_)[A-Z]+(?=\$\$)' | tr '[:upper:]' '[:lower:]'
kittens
(For Linux users reading this: this is equivalent to using grep -P.)
And in pure bash:
$ shopt -s extglob
$ foo='foo $$DATABASE_KITTENS$$ bar'
$ foo=${foo##*(?)\$\$DATABASE_}
$ foo=${foo%%\$\$*(?)}
$ foo=${foo,,}
$ echo $foo
kittens
Note that NONE of these three updated solutions will handle situations where multiple tagged database names exist in the same line of input. That's not stated as a requirement in the question either, but I'm just sayin'....

You can do this in a pretty foolproof way with the supercool command cut :)
echo '$$DATABASE_AWESOME$$' | cut -d'$' -f3 | cut -d_ -f2 | tr 'A-Z' 'a-z'

This might work for you (GNU sed):
sed 's/$\$/\n/g;s/\nDATABASE_\([^\n]*\)\n/\L\1/g;s/\n/$$/g' file

Here is the shortest (GNU) awk solution I could come up with that does everything requested by the OP:
awk -vRS='[$][$]DATABASE_([^$]+[$])+[$]' '{ORS=tolower(substr(RT,12,length(RT)-13))}1'
Even if the string indicated by the asterisk (*) contains one or more single dollar signs ($) and/or line breaks, this solution should still work.

awk '{gsub(/\$\$DATABASE_GIBSON\$\$/,"gibson")}1' file
gibson
test me gibson test me
gibson test gibson test
gibson gibsongibson

echo '$$DATABASE_WOOLY$$' | awk '{print tolower($0)}'
awk takes whatever input it is given, in this case the echoed string, applies the tolower function, and prints the result. Note the single quotes: unquoted, the shell would expand $$ to its own process ID.
In your bash script you can do something like this and then use the variable DBLOWER:
DBLOWER=$(echo '$$DATABASE_WOOLY$$' | awk '{print tolower($0)}')

Trim text using sed

How do I remove the first and the last quotes?
echo "\"test\"" | sed 's/"//' | sed 's/"$//'
The above works as expected, but I guess there must be a better way.

You can combine the sed calls into one:
echo "\"test\"" | sed 's/"//;s/"$//'
The command you posted will remove the first quote even if it's not at the beginning of the line. If you want to make sure that it's only done if it is at the beginning, then you can anchor it like this:
echo "\"test\"" | sed 's/^"//;s/"$//'
Some versions of sed don't like multiple commands separated by semicolons. For them you can do this (it also works in the ones that accept semicolons):
echo "\"test\"" | sed -e 's/^"//' -e 's/"$//'

Maybe you prefer something like this:
echo '"test"' | sed 's/^"\(.*\)"$/\1/'
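If the value is already in a shell variable, parameter expansion can do the same job without sed. A sketch, with strip_quotes as a made-up name; it removes at most one double quote from each end and leaves any inner quotes alone:

```sh
#!/bin/sh
# strip_quotes (hypothetical helper): remove one leading and one
# trailing double quote from its argument, if they are present.
strip_quotes() {
    s=${1#\"}       # drop one leading double quote, if present
    s=${s%\"}       # drop one trailing double quote, if present
    printf '%s\n' "$s"
}

strip_quotes '"test"'      # prints test
```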

If you are sure there are no other quotes besides the first and last, just use the g flag:
$ echo "\"test\"" | sed 's/"//g'
test
If you have Ruby (1.9+):
$ echo $s
blah"te"st"test
$ echo $s | ruby -e 's=gets.split("\"");print "#{s[0]}#{s[1..-2].join("\"")+s[-1]}"'
blahte"sttest
Note that in this example the first and last quotes are not at the first and last positions of the string.
An example with more quotes:
$ s='bl"ah"te"st"tes"t'
$ echo $s | ruby -e 's=gets.split("\"");print "#{s[0]}#{s[1..-2].join("\"")+s[-1]}"'
blah"te"st"test

Filter text based in a multiline match criteria

I have the following sed command, which I need to run as a single line:
cat File | sed -n '
/NetworkName/ {
N
/\n.*ims3/ p
}' | sed -n 1p | awk -F"=" '{print $2}'
Assume that File contains:
System.DomainName=shayam
System.Addresses=Fr6
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=AS
System.DomainName=ims5.com
System.DomainName=Ram
System.Addresses=Fr9
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims7.com
System.DomainName=mani
System.Addresses=Hello
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims3.com
After running the command, the only output is Peer. Can anyone please help me out?

You can use a single nawk command, and you can lose the useless cat:
nawk -F"=" '/NetworkName/{n=$2;getline;if($2~/ims3/){print n} }' file
You can use sed as well, as proposed by others, but I prefer less regex and less clutter.
The above saves the value of the network name in "n", then reads the next line and checks its second field against "ims3". If it matches, the value of "n" is printed.
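The same idea can be written without getline by remembering that the previous line was a NetworkName line. This is a sketch only: network_for_domain is a made-up name, it uses plain awk rather than nawk, it stops at the first match, and it assumes the DomainName line immediately follows the NetworkName line, as in the sample file.

```sh
#!/bin/sh
# network_for_domain PATTERN FILE (hypothetical helper): print the
# NetworkName whose next line has a value matching PATTERN.
network_for_domain() {
    awk -F= -v pat="$1" '
        /NetworkName/       { net = $2; pending = 1; next }  # remember the name
        pending && $2 ~ pat { print net; exit }              # next line matches
                            { pending = 0 }                  # otherwise forget it
    ' "$2"
}

# With the sample File from the question:
#   network_for_domain ims3 File     # prints Peer
```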

Put that code in a separate .sh file, and run it as your single-line command.

cat File | sed -n '/NetworkName/ { N; /\n.*ims3/ p }' | sed -n 1p | awk -F"=" '{print $2}'

Assuming that you want the network name for the domain ims3, this command line works without sed:
grep -B 1 ims3 File | head -n 1 | awk -F"=" '{print $2}'

So, you want the network name where the domain name on the following line includes 'ims3', and not the one where the following line includes 'ims7' (even though the network names in the example are the same).
sed -n '/NetworkName/{N;/ims3/{s/.*NetworkName=\(.*\)\n.*/\1/p;};}' File
This avoids abuse of felines, too (not to mention reducing the number of commands executed).
Tested on MacOS X 10.6.4, but there's no reason to think it won't work elsewhere too.
However, empirical evidence shows that Solaris sed is different from MacOS sed. It can all be done in one sed command, but it needs three lines:
sed -n '/NetworkName/{N
/ims3/{s/.*NetworkName=\(.*\)\n.*/\1/p;}
}' File
Tested on Solaris 10.

You just need to put -e pretty much everywhere you'd break the command at a newline or have a semicolon. You don't need the extra call to sed or awk or cat.
sed -n -e '/NetworkName/ {' -e 'N' -e '/\n.*ims3/ s/[^\n]*=\(.*\).*/\1/P' -e '}' File
