I have this string: "This is the First - This is the seconde - this is the big"
I am trying to get only the text before the first -, that is, This is the First. I tried:
echo "This is the First - This is the seconde - this is the big" | sed -re 's/\-/\l/g'
but that only removes the - characters.
Thanks for your help.
You can delete everything after matching -:
$ s='This is the First - This is the seconde - this is the big'
$ echo "$s" | sed 's/-.*//'
This is the First
Use s/ *-.*// if you want to remove the spaces before - as well.
If removing spaces before - isn't a concern, you can use cut as well. This solution is also easier to extend:
$ echo "$s" | cut -d- -f1
This is the First
$ echo "$s" | cut -d- -f1-2
This is the First - This is the seconde
This will use - to split the input into fields and then display only the required fields.
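If starting sed or cut is not wanted, the same split can be sketched with plain POSIX parameter expansion. The function name before_first is made up for this illustration, and trimming the trailing spaces is an assumption about the desired result:

```shell
# before_first DELIM STRING: print the part of STRING before the first DELIM.
# (Hypothetical helper, not part of the answers above.)
before_first() {
    delim=$1
    rest=${2%%"$delim"*}            # drop the longest suffix that starts at DELIM
    # Trim the spaces left in front of the delimiter.
    while [ "${rest%" "}" != "$rest" ]; do
        rest=${rest%" "}
    done
    printf '%s\n' "$rest"
}

before_first - 'This is the First - This is the seconde - this is the big'
# prints: This is the First
```

If the delimiter does not occur at all, the whole string is printed unchanged.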
In my script, I have a version number, for example 15.03.2, stored in the variable $STRING. The number changes over time. I want to strip it down to 15.03 (or whatever the first two parts are next time).
How do I remove everything after the second . using sed?
Something like:
$(echo "$STRING" | sed "s/\.^$\.//")
(I don't know what ^, $ and others do, but they look related, so I just guessed.)
I think the better tool here is cut:
echo '15.03.2' | cut -d . -f -2
This might work for you (GNU sed):
sed 's/\.[^.]*//2g' file
This removes the second and every later occurrence of a period followed by zero or more non-period characters.
$ echo '15.03.2' | sed 's/\([^.]*\.[^.]*\)\..*/\1/'
15.03
More generally to skip N periods:
$ echo '15.03.2.3.4.5' | sed -E 's/(([^.]*\.){2}[^.]*)\..*/\1/'
15.03.2
$ echo '15.03.2.3.4.5' | sed -E 's/(([^.]*\.){3}[^.]*)\..*/\1/'
15.03.2.3
$ echo '15.03.2.3.4.5' | sed -E 's/(([^.]*\.){4}[^.]*)\..*/\1/'
15.03.2.3.4
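The pattern above can be wrapped so that N becomes a parameter. This is only a sketch: the name skip_periods is invented here, and it assumes a sed that accepts -E (GNU and BSD sed do):

```shell
# skip_periods N: read lines on stdin and cut each one just before
# period number N+1, so exactly N periods are kept. (Hypothetical helper.)
# Lines that contain N or fewer periods pass through unchanged.
skip_periods() {
    n=$1
    sed -E "s/^(([^.]*\.){$n}[^.]*)\..*/\1/"
}

echo '15.03.2'       | skip_periods 1   # prints: 15.03
echo '15.03.2.3.4.5' | skip_periods 3   # prints: 15.03.2.3
```

Only the number N is interpolated into the sed script, so no escaping of the input data is needed.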
I have a multi-line string, downloaded from the Web:
toast the lemonade
blend with the lemonade
add one tablespoon of the lemonade
grill the spring onions
add the lemonade
add the raisins to the saucepan
rinse the horseradish sauce
I have assigned this to $INPUT, like this:
INPUT=$(lynx --dump 'http://example.net/recipes' \
| python -m json.tool \
| awk '/steps/,/]/' \
| egrep -v "steps|]" \
| sed 's/[",]\|^ *//g; $d')
At this point, $INPUT is ready for substitution into my target file as follows:
sed -i "0,/OLDINPUT/s//$INPUT/" /home/test_file
Of course, sed complains about an unterminated s command - herein lies the problem.
The current workaround I am using is to echo $INPUT prior to giving it to sed, but then the newlines are not preserved. echo strips newlines - which is the problem.
The correct output should maintain its newlines. How can sed be instructed to preserve the newlines?
The hacky direct answer is to replace all newlines with \n, which you can do by adding
| sed ':a $!{N; ba}; s/\n/\\n/g'
to the long command above. A better answer, because substituting shell variables into code is always a bad idea and with sed you wouldn't have a choice, is to use awk instead:
awk -i inplace -v input="$INPUT" 'NR == 1, /OLDINPUT/ { sub(/OLDINPUT/, input) } 1' /home/test_file
This requires GNU awk 4.1.0 or later for the -i inplace.
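Where -i inplace is not available, or where the replacement text may contain & or backslashes (which sub() and -v would both interpret), a more defensive variant is possible. The following is a sketch only: the function name replace_first is made up, it treats the placeholder as a literal string rather than a regex, and it writes to standard output instead of editing in place:

```shell
# replace_first PLACEHOLDER REPLACEMENT FILE
# Print FILE with the first occurrence of the literal PLACEHOLDER replaced by
# REPLACEMENT, which may span several lines. (Hypothetical helper.)
replace_first() {
    # Pass both strings through the environment so awk does no escape processing.
    PLACEHOLDER=$1 REPLACEMENT=$2 awk '
        BEGIN { ph = ENVIRON["PLACEHOLDER"]; rep = ENVIRON["REPLACEMENT"] }
        !done && (i = index($0, ph)) {
            # Splice the text in literally: no regex, no special meaning for & or \.
            $0 = substr($0, 1, i - 1) rep substr($0, i + length(ph))
            done = 1
        }
        { print }
    ' "$3"
}
```

To get the effect of an in-place edit, redirect the output to a temporary file and move it over the original, for example replace_first OLDINPUT "$INPUT" /home/test_file > tmpfile && mv tmpfile /home/test_file (tmpfile being a placeholder name).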
If you're using Bash, you can substitute \n for the newlines:
INPUT="${INPUT//
/\\n}"
If you don't like the literal linefeed in your parameter expansion, you might prefer
INPUT="${INPUT//$'\n'/\\n}"
Side note - you probably mean to change the matched lines to your input, not substitute each of them. In which case, you don't want to quote the newlines, after all...
To clean up your code a bit:
This:
lynx --dump 'http://somesite.net/recipes' | python -m json.tool | awk '/steps/,/]/' | egrep -v "steps|]" | sed 's/"//g' |sed 's/,//g' | sed 's/^ *//g' | sed '$d'
Can be replaced with this:
lynx --dump 'http://somesite.net/recipes' | python -m json.tool | awk '/]/ {f=0} f {if (c--) print line} /steps/{f=1} {gsub(/[",]|^ */,"");line=$0}'
It could probably be shortened further, but I do not know what python -m json.tool does.
This:
awk '/]/ {f=0} f {if (c--) print line} /steps/{f=1} {gsub(/[",]|^ */,"");line=$0}'
Does:
Prints the lines after the one matching steps, up to the line before the one matching ] - awk '/steps/,/]/' | egrep -v "steps|]"
Removes every " and , and all leading spaces from each line - sed 's/"//g' |sed 's/,//g' | sed 's/^ *//g'
Then drops the last line of this group - sed '$d'
Example:
cat file
my data
steps data
more
do not delet this
hei "you" , more data
extra line
here is end ]
this is good
awk '/]/ {f=0} f {if (c--) print line} /steps/{f=1} {gsub(/[",]|^ */,"");line=$0}' file
more
do not delet this
hei you more data
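For readers who find the one-liner dense, here is the same logic spread out with comments. It is meant as an equivalent sketch, not a replacement; the function name extract_steps and the variable names are made up:

```shell
# extract_steps: readable form of the one-liner above; reads stdin.
# (The function and variable names are invented for this sketch.)
extract_steps() {
    awk '
        /]/     { inside = 0 }                 # a closing bracket ends the block
        inside  { if (seen++) print prev }     # print one line late, skipping the first
        /steps/ { inside = 1 }                 # the block starts after this line
        { gsub(/[",]|^ */, ""); prev = $0 }    # clean the line and remember it
    '
}
```

Because each line is printed one step late and the ] line switches printing off first, the last line of the block is never printed, which is what sed '$d' did in the original pipeline.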
Assuming your input JSON fragment looks something like this:
{ "other": "random stuff",
"steps": [
"toast the lemonade",
"blend with the lemonade",
"add one tablespoon of the lemonade",
"grill the spring onions",
"add the lemonade",
"add the raisins to the saucepan",
"rinse the horseradish sauce"
],
"still": "yet more stuff" }
you can extract just the steps member, one step per line, with
jq -r '.steps[]'
To interpolate that into a sed statement, you'd need to escape any regex metacharacters in the result. A less intimidating and hopefully slightly less hacky solution would be to read static text from standard input:
lynx ... | jq ... |
sed -i -e '/OLDINPUT/{s///; r /dev/stdin' -e '}' /home/test_file
The struggle to educate practitioners to use structure-aware tools for structured data has reached epic heights and continues unabated. Before you decide to use the quick and dirty approach, at least make sure you understand the dangers (technical and mental).
You'll want to use an editor instead of sed's substitution:
$ input="toast the lemonade
blend with the lemonade
add one tablespoon of the lemonade
grill the spring onions
add the lemonade
add the raisins to the saucepan
rinse the horseradish sauce"
$ seq 10 > file
$ ed file <<END
1,/5/d
1i
$input
.
w
q
END
$ cat file
toast the lemonade
blend with the lemonade
add one tablespoon of the lemonade
grill the spring onions
add the lemonade
add the raisins to the saucepan
rinse the horseradish sauce
6
7
8
9
10
How do I remove the first and the last quotes?
echo "\"test\"" | sed 's/"//' | sed 's/"$//'
The above works as expected, but I guess there must be a better way.
You can combine the sed calls into one:
echo "\"test\"" | sed 's/"//;s/"$//'
The command you posted will remove the first quote even if it's not at the beginning of the line. If you want to make sure that it's only done if it is at the beginning, then you can anchor it like this:
echo "\"test\"" | sed 's/^"//;s/"$//'
Some versions of sed don't like multiple commands separated by semicolons. For them you can do this (it also works in the ones that accept semicolons):
echo "\"test\"" | sed -e 's/^"//' -e 's/"$//'
Maybe you prefer something like this:
echo '"test"' | sed 's/^"\(.*\)"$/\1/'
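If the value is already in a shell variable, the anchored version can also be written without sed. A minimal sketch using POSIX parameter expansion; the function name strip_quotes is hypothetical:

```shell
# strip_quotes STRING: remove one double quote from the very start and one
# from the very end of STRING, if present. (Hypothetical helper.)
strip_quotes() {
    s=$1
    s=${s#\"}     # drop a leading quote, if there is one
    s=${s%\"}     # drop a trailing quote, if there is one
    printf '%s\n' "$s"
}

strip_quotes '"test"'    # prints: test
```

Quotes in the middle of the string are left alone, which matches the behaviour of the anchored sed commands.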
If you are sure there are no other quotes besides the first and last, just use the g flag:
$ echo "\"test\"" | sed 's/"//g'
test
If you have Ruby (1.9+):
$ echo $s
blah"te"st"test
$ echo $s | ruby -e 's=gets.split("\"");print "#{s[0]}#{s[1..-2].join("\"")+s[-1]}"'
blahte"sttest
Note that in this example the first and last quotes are not at the first and last positions of the string.
An example with more quotes:
$ s='bl"ah"te"st"tes"t'
$ echo $s | ruby -e 's=gets.split("\"");print "#{s[0]}#{s[1..-2].join("\"")+s[-1]}"'
blah"te"st"test
What would be the best way to remove whitespace only around a certain character? Let's say a dash: Some- String- 12345- Here would become Some-String-12345-Here. Something like sed 's/\ -/-/g;s/-\ /-/g' works, but I am sure there must be a better way.
Thanks!
If you mean all whitespace, not just spaces, then you could try \s:
echo 'Some- String- 12345- Here' | sed 's/\s*-\s*/-/g'
Output:
Some-String-12345-Here
Or use the [:space:] character class:
echo 'Some- String- 12345- Here' | sed 's/[[:space:]]*-[[:space:]]*/-/g'
Different versions of sed may or may not support these, but GNU sed does.
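Since the question asks about "a certain character" in general, the [[:space:]] form can be parameterised. This is a rough sketch: the name tighten is invented, and the character is assumed not to be special in a sed expression:

```shell
# tighten CHAR: read lines on stdin and delete whitespace on both sides of
# every CHAR. (Hypothetical helper; CHAR is assumed to be a single character
# that is not special to sed, so not /, \, &, ., *, [ or ^.)
tighten() {
    sed "s/[[:space:]]*$1[[:space:]]*/$1/g"
}

echo 'Some- String- 12345- Here' | tighten -   # prints: Some-String-12345-Here
echo 'key : value'               | tighten :   # prints: key:value
```

Tabs are removed as well, because [[:space:]] matches them.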
Try:
's/ *- */-/g'
You can use awk as well:
$ echo 'Some - String- 12345-' | awk -F" *- *" '{$1=$1}1' OFS="-"
Some-String-12345-
If it is just "- ", as in your example:
$ s="Some- String- 12345-"
$ echo ${s//- /-}
Some-String-12345-
My script takes a file name in the form R#TYPE.TXT (# is a number and TYPE is two or three characters).
I want my script to give me TYPE. What should I do to get it? Guess I need to use awk and sed.
I'm using /bin/sh (which is a requirement)
You can use awk:
$ echo R1CcC.TXT | awk '{sub(/.*[0-9]/,"");sub(/\.TXT$/,"")}{print}'
CcC
or
$ echo R1CcC.TXT | awk '{gsub(/.*[0-9]|\.TXT$/,"");print}'
CcC
And if sed is really what you want:
$ echo R9XXX.TXT | sed 's/R[0-9]\(.*\)\.TXT/\1/'
XXX
I think this is what you are looking for.
$ echo R3cf.txt | sed "s/.[0-9]\(.*\)\..*/\1/"
cf
If the extension is always upper case (TXT) and the filename always starts with R, you could do something like:
$ echo R3CF.TXT | sed "s/R[0-9]\(.*\)\.TXT/\1/"
CF
You can use just the shell (depending on what shell your /bin/sh is):
f=R9ABC.TXT
f="${f%.TXT}" # remove the extension
type="${f#R[0-9]}" # remove the first bit
echo "$type" # ==> ABC
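The question says # is a number, which could have more than one digit. The sketch below extends the pure-shell idea to that case and rejects names that do not fit the R#TYPE.TXT form. The function name get_type is made up, and only POSIX sh features are used:

```shell
# get_type NAME: print the TYPE part of a name of the form R<digits>TYPE.TXT.
# Returns non-zero if NAME does not have that form. (Hypothetical helper.)
get_type() {
    name=$1
    case $name in
        R[0-9]*.TXT) ;;          # looks right, carry on
        *) return 1 ;;           # anything else is rejected
    esac
    name=${name%.TXT}            # remove the extension
    name=${name#R}               # remove the leading R
    # Remove leading digits one at a time.
    while :; do
        case $name in
            [0-9]*) name=${name#[0-9]} ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$name"
}

get_type R9ABC.TXT     # prints: ABC
get_type R12cf.TXT     # prints: cf
```

The length of TYPE (two or three characters) is not checked here; that could be added with another case pattern if needed.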