Print pattern on a string with special character - sed

How can I print only the string figure from the following line:
\begin{figure}[h!]
I tried :
firstLine='\begin{figure}[h!]'
echo $firstLine | sed -n 's/\\begin{\(.*\)}/\1/p'
but returns :
figure[h!] instead of figure
The problem seems to come from the [] or ! characters.

firstLine='\begin{figure}[h!]'
echo "$firstLine" | sed 's/.*{\(.*\)}.*/\1/'
Output:
figure
With your code, append .* so that the rest of the line is matched and discarded too:
echo "$firstLine" | sed -n 's/\\begin{\(.*\)}.*/\1/p'

This might work for you (GNU sed):
sed 's/.*{\(.*\)}.*/\1/' file
This assumes there is only one {...} expression and one line.
A more rigorous solution would be:
sed -n 's/.*\\begin{\([^}]*\)}.*/\1/p' file
However nothing would be output if no match was found.
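The difference between the two patterns is greediness: \(.*\)} lets .* run to the last } on the line, while \([^}]*\)} stops at the first one. A minimal sketch of that behaviour, assuming a hypothetical helper named extract_env and made-up sample lines:

```shell
#!/bin/sh
# extract_env is a hypothetical helper, not part of any tool.
# It prints the name inside \begin{...}, or nothing if the line has no \begin{...}.
extract_env() {
    printf '%s\n' "$1" | sed -n 's/.*\\begin{\([^}]*\)}.*/\1/p'
}

extract_env '\begin{figure}[h!]'      # prints: figure
extract_env '\begin{table}{lcr}'      # prints: table (a greedy \(.*\) would give table}{lcr)
extract_env 'no environment here'     # prints nothing
```

Passing the line through printf '%s\n' rather than an unquoted echo keeps the backslash and the [h!] part from being altered by the shell.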

Related

Extract substrings between strings

I have a file with text as follows:
###interest1 moreinterest1### sometext ###interest2###
not-interesting-line
sometext ###interest3###
sometext ###interest4### sometext othertext ###interest5### sometext ###interest6###
I want to extract all strings between ### .
My desired output would be something like this:
interest1 moreinterest1
interest2
interest3
interest4
interest5
interest6
I have tried the following:
grep '###' file.txt | sed -e 's/.*###\(.*\)###.*/\1/g'
This almost works but only seems to grab the first instance per line, so the first line in my output only grabs
interest1 moreinterest1
rather than
interest1 moreinterest1
interest2
Here is a single awk command to achieve this that makes ### field separator and prints each even numbered field:
awk -F '###' '{for (i=2; i<NF; i+=2) print $i}' file
interest1 moreinterest1
interest2
interest3
interest4
interest5
interest6
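To see why the loop runs while i<NF rather than i<=NF, here is a small sketch on made-up input that includes a line with an unmatched ###. The helper name extract_marked is hypothetical:

```shell
#!/bin/sh
# extract_marked is a hypothetical helper; it reads lines on standard input.
extract_marked() {
    # Fields 2, 4, 6, ... sit between an opening and a closing ###.
    # The loop stops before NF, so text after an unmatched ### is not printed.
    awk -F '###' '{ for (i = 2; i < NF; i += 2) print $i }'
}

printf '%s\n' \
    'a ###x### b ###y###' \
    'no markers here' \
    'c ###z### d ###unterminated' | extract_marked
# prints x, y and z on separate lines
```

A line that ends with a closing ### has an empty last field, so stopping before NF loses nothing in that case either.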
Here is an alternative grep + sed solution:
grep -oE '###[^#]*###' file | sed -E 's/^###|###$//g'
This assumes there are no # characters in between ### markers.
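The stated assumption can be checked with a short sketch. It relies on grep -o and sed -E, which GNU and BSD tools provide but POSIX does not require. The helper name extract_with_grep and the sample lines are made up for illustration:

```shell
#!/bin/sh
# extract_with_grep is a hypothetical helper; it reads lines on standard input.
extract_with_grep() {
    # grep -o prints each ###...### match on its own line,
    # then sed strips the leading and trailing markers.
    grep -oE '###[^#]*###' | sed -E 's/^###|###$//g'
}

printf '%s\n' 'a ###x### b ###y###' | extract_with_grep   # prints x and y
printf '%s\n' '###a#b###' | extract_with_grep              # prints nothing: the inner # blocks the match
```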
With GNU awk for multi-char RS:
$ awk -v RS='###' '!(NR%2)' file
interest1 moreinterest1
interest2
interest3
interest4
interest5
interest6
You can use pcregrep:
pcregrep -o1 '###(.*?)###' file
The regex - ###(.*?)### - matches ###, then captures into Group 1 any zero or more characters other than line break characters, as few as possible, and then matches ###.
The -o1 option outputs only the Group 1 value.
See the regex demo online.
sed 't x
s/###/\
/;D; :x
s//\
/;t y
D;:y
P;D' file
Replacing "###" with newline, D, then conditionally branching to P if a second replacement of "###" is successful.
This might work for you (GNU sed):
sed -n 's/###/\n/g;/[^\n]*\n/{s///;P;D}' file
Replace all occurrences of ###'s by newlines.
If a line contains a newline, remove any characters before and including the first newline, print the details up to and including the following newline, delete those details and repeat.

insert semi colon after 10 digit number

I have lines that start like this: 2141058222 11/22/2017 and I want to append a ; at the end of the ten digit number like this: 2141058222; 11/22/2017.
I've tried sed with sed -i 's/^[0-9]\{10\}\\$/;&/g' which does nothing.
What am I missing?
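The \\$ in your pattern requires a literal backslash at the end of the line, so nothing matches, and ;& would put the semicolon before the number rather than after it. Try either of these: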
echo "2141058222 11/22/2017" | sed -r 's/^([0-9]{10})/&;/'
echo "2141058222 11/22/2017" | sed 's/ /; /'
Output:
2141058222; 11/22/2017
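The first command can also be written without -r, in basic regex syntax. A minimal sketch, assuming a hypothetical helper named add_semicolon that reads lines on standard input:

```shell
#!/bin/sh
# add_semicolon is a hypothetical helper; it reads lines on standard input.
add_semicolon() {
    # & in the replacement stands for the matched text, so "&;" appends the semicolon.
    sed 's/^[0-9]\{10\}/&;/'
}

printf '%s\n' '2141058222 11/22/2017' | add_semicolon   # prints: 2141058222; 11/22/2017
printf '%s\n' '12345 11/22/2017' | add_semicolon        # fewer than ten digits, printed unchanged
```

Note that a line starting with more than ten digits would still match, with the semicolon landing after the tenth digit.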
If the input is always in the format specified, GNU cut works, and might even be more efficient than sed:
cut -c -10,11- --output-delimiter ';' <<< "2141058222 11/22/2017"
Output:
2141058222; 11/22/2017
For an input file that'd be:
cut -c -10,11- --output-delimiter ';' file

sed replace if part of word matches

My text looks like this:
cat
catch
cat_mouse
catty
I want to replace "cat" with "dog".
When I do
sed "s/cat/dog/"
my result is:
dog
catch
cat_mouse
catty
How do I replace with sed if only part of the word matches?
There's a mistake: you are missing the g modifier.
sed 's/cat/dog/g'
g
Apply the replacement to all matches to the regexp, not just the first.
See
http://www.gnu.org/software/sed/manual/html_node/The-_0022s_0022-Command.html
http://sed.sourceforge.net/sedfaq3.html#s3.1.3
If you want to replace cat with dog only when it is part of a longer word:
$ perl -pe 's/cat(?=.)/dog/' file.txt
cat
dogch
dog_mouse
dogty
This uses a positive lookahead; see http://www.perlmonks.org/?node_id=518444
If you really want sed:
sed '/^cat$/!s/cat/dog/' file.txt
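In that command, the address /^cat$/ selects lines that consist of cat and nothing else, and the ! applies the substitution to every other line. A small sketch on the sample words from the question, with replace_partial as a hypothetical helper name:

```shell
#!/bin/sh
# replace_partial is a hypothetical helper; it reads lines on standard input.
replace_partial() {
    # Skip lines that are exactly "cat"; replace the first "cat" on all other lines.
    sed '/^cat$/!s/cat/dog/'
}

printf '%s\n' cat catch cat_mouse catty | replace_partial
# prints cat, dogch, dog_mouse and dogty on separate lines
```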
bash-3.00$ cat t
cat
catch
cat_mouse
catty
To replace cat only if it is followed by at least one more character:
bash-3.00$ sed 's/cat\(.\)/dog\1/' t
cat
dogch
dog_mouse
dogty
To replace all occurrences of cat:
bash-3.00$ sed 's/cat/dog/g' t
dog
dogch
dog_mouse
dogty
awk solution for this
awk '{gsub("cat","dog",$0); print}' temp.txt

Unix - Removing everything after a pattern using sed

I have a file which looks like below:
memory=500G
brand=HP
color=black
battery=5 hours
For every line, I want to remove everything after = and also the =.
Eventually, I want to get something like:
memory:brand:color:battery:
(All on one line with colons after every word)
Is there a one-line sed command that I can use?
sed -e ':a;N;$!ba;s/=.\+\n\?/:/mg' /my/file
Adapted from this fine answer.
To be frank, however, I'd find something like this more readable:
cut -d = -f 1 /my/file | tr \\n :
Here's one way using GNU awk:
awk -F= '{ printf "%s:", $1 } END { printf "\n" }' file.txt
Result:
memory:brand:color:battery:
If you don't want a colon after the last word, you can use GNU sed like this:
sed -n 's/=.*//; H; $ { g; s/\n//; s/\n/:/g; p }' file.txt
Result:
memory:brand:color:battery
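Where GNU sed is not available, the same result without a trailing colon can be sketched with cut and paste. This is a substitute technique rather than the sed approach above, and join_keys is a hypothetical helper name:

```shell
#!/bin/sh
# join_keys is a hypothetical helper; it reads lines on standard input.
join_keys() {
    # cut keeps the text before the first "=",
    # paste -s joins all lines into one, separated by ":".
    cut -d = -f 1 | paste -s -d : -
}

printf '%s\n' 'memory=500G' 'brand=HP' 'color=black' 'battery=5 hours' | join_keys
# prints: memory:brand:color:battery
```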
This might work for you (GNU sed):
sed -i ':a;$!N;s/=[^\n]*\n\?/:/;ta' file
perl -F= -ane '{print $F[0].":"}' your_file
tested below:
> cat temp
abc=def,100,200,dasdas
dasd=dsfsf,2312,123,
adasa=sdffs,1312,1231212,adsdasdasd
qeweqw=das,13123,13,asdadasds
dsadsaa=asdd,12312,123
> perl -F= -ane '{print $F[0].":"}' temp
abc:dasd:adasa:qeweqw:dsadsaa:
My approach takes two steps.
First step (-E enables the extended syntax that the unescaped ( ) and + rely on):
sed -E 's/([a-z]+)(=.*)/\1:/' Filename > a
cat a
memory:
brand:
color:
battery:
Second step (each pass joins pairs of lines, so two passes cover this four-line file):
sed -e 'N;s/\n//' a | sed -e 'N;s/\n//'
My output is
memory:brand:color:battery:

Can sed search & replace on a match if that match is only part of a line?

The sed below will output the input exactly. What I'd like to do is replace all occurrences of _ with - in the first matching group (\1), but not in the second. Is this possible?
echo 'abc_foo_bar=one_two_three' | sed 's/\([^=]*\)\(=.*\)/\1\2/'
abc_foo_bar=one_two_three
So, the output I'm hoping for is:
abc-foo-bar=one_two_three
I'd prefer not to resort to awk since I'm doing a string of other sed commands too, but I'll resort to that if I have to.
You can do this in sed using the hold space:
$ echo 'abc_foo_bar=one_two_three' | sed 'h; s/[^=]*//; x; s/=.*//; s/_/-/g; G; s/\n//g'
abc-foo-bar=one_two_three
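The one-liner is easier to follow when written one command per line. The sketch below is the same sequence of commands with comments, wrapped in a hypothetical helper named dash_key. The only change is that the final substitution drops the g flag, since G adds exactly one newline:

```shell
#!/bin/sh
# dash_key is a hypothetical helper; it reads lines on standard input.
dash_key() {
    sed '
        # save a copy of the whole line in the hold space
        h
        # pattern space: drop the key, leaving "=value"
        s/[^=]*//
        # swap: pattern space is the original line, hold space is "=value"
        x
        # keep only the key and convert its underscores
        s/=.*//
        s/_/-/g
        # append a newline plus the hold space, then remove that newline
        G
        s/\n//
    '
}

printf '%s\n' 'abc_foo_bar=one_two_three' | dash_key
# prints: abc-foo-bar=one_two_three
```

A line with no = is treated as all key, so every underscore on it is converted.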
You could use awk instead of sed as follows:
echo 'abc_foo_bar=one_two_three' | awk -F= -vOFS== '{gsub("_", "-", $1); print $1, $2}'
The output would be, as expected:
abc-foo-bar=one_two_three
You could use ghc instead of sed as follows:
echo "abc_foo_bar=one_two_three" | ghc -e "getLine >>= putStrLn . uncurry (++) . (map (\x -> if x == '_' then '-' else x) *** id) . break (== '=')"
The output would be, as expected:
abc-foo-bar=one_two_three
This might work for you:
echo 'abc_foo_bar=one_two_three' |
sed 's/^/\n/;:a;s/\n\([^_=]*\)_/\1-\n/;ta;s/\n//'
abc-foo-bar=one_two_three
Or this:
echo 'abc_foo_bar=one_two_three' |
sed 'h;s/=.*//;y/_/-/;G;s/\n.*=/=/'
abc-foo-bar=one_two_three