I have a string
abc\xyz\file.txt#4 - blah blah blah
I want the file number, so I did the following:
echo 'abc\xyz\file.txt#4 - blah blah blah' | sed -e "s/[A-Z,a-z,\,/,.#:,-/s]//g"
It gives the expected output: 4
But when the string is
echo 'abc\xyz1\file.txt#4 - blah blah blah' | sed -e "s/[A-Z,a-z,\,/,.#:,-/s]//g"
it gives the output 14.
So I tried to get the string between '#' and '-' instead:
echo 'abc\xyz1\file.txt#4 - blah blah blah' | sed 's/^.*# //; s/-.*$//'
but it only works when there is a space after the #, which is not the case here.
What am I doing wrong?
Try using
sed 's/.*#\([0-9]*\).*/\1/'
This searches for everything up to and including the #, followed by zero or more digits, followed by the remainder of the line, so the search pattern matches the entire line.
It then replaces the match (that is, the entire line) with the part that matched the sub-pattern between the escaped parentheses, which is the droid number you're looking for.
To print only the matching lines, use
sed -n 's/.*#\([0-9]*\).*/\1/p'
where the -n flag suppresses the default behavior of printing every line, and the p suffix prints the matching lines.
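As a minimal sketch of the above, here is the same substitution wrapped in a small function. The name extract_file_number is made up for illustration. Two details differ from the one-liner and are my own assumptions about what is wanted: [0-9][0-9]* requires at least one digit, and printf is used instead of echo because some shells' echo interprets backslash sequences such as \f in the sample path.

```shell
# Hypothetical helper: print the digits that follow the last '#' in the argument.
extract_file_number() {
    # -n plus the p flag: print only when the substitution actually happened.
    printf '%s\n' "$1" | sed -n 's/.*#\([0-9][0-9]*\).*/\1/p'
}

extract_file_number 'abc\xyz\file.txt#4 - blah blah blah'    # prints 4
extract_file_number 'abc\xyz1\file.txt#4 - blah blah blah'   # prints 4
extract_file_number 'no hash here'                           # prints nothing
```

Because .* is greedy, the digits are taken after the last # on the line, so extra digits earlier in the path (like the 1 in xyz1) no longer leak into the result.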
Try this one:
echo "abc\xyz1\file.txt#4 - blah blah blah" | sed -e "s|.*#||g" -e "s| -.*||g"
I've been trying to extract the bold portion from the following:
[HorribleSubs] Black Clover - 128 [720p].mkv
But for whatever reason, this sed expression
sed --regexp-extended 's#.*#\1#'
returns the entire file, when of course I only want the \1 capture group.
The weird thing is that this expression worked just fine when I tried debugging it with desed, with the capture group and primary match showing up as expected.
I'm using GNU sed 4.8-1.
You can use
sed -n -E '/.*<a href="(\/torrent\/[^"]*)\/">[^<]*<\/a>.*/{s//\1/p;q}'
Details:
-n - suppresses default line output
-E - enables POSIX ERE regex syntax
/.*<a href="(\/torrent\/[^"]*)\/">[^<]*<\/a>.*/ - finds a line containing an <a href="/torrent/.../">...</a> substring, capturing the part between href=" and /"
{s//\1/p;q} - replaces the string matched above with the value of the captured substring, prints it and quits.
See the online demo:
s='blah
[HorribleSubs] Black Clover - 128 [720p].mkv
blah
[HorribleSubs] Black Clover - 128 [720p].mkv
blah'
sed -n -E '/.*<a href="(\/torrent\/[^"]*)\/">[^<]*<\/a>.*/{s//\1/p;q}' <<< "$s"
# => /torrent/4384536/HorribleSubs-Black-Clover-128-720p-mkv
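For anyone who wants to run this locally, here is a self-contained sketch of the same command. The anchor line in the sample input is my assumption, pieced together from the regex and the output shown; the real page markup may well differ.

```shell
# Assumed sample input: one anchor element between two filler lines.
html='blah
<a href="/torrent/4384536/HorribleSubs-Black-Clover-128-720p-mkv/">[HorribleSubs] Black Clover - 128 [720p].mkv</a>
blah'

# Find the first matching line, reduce it to the captured href path, print, quit.
first_link=$(printf '%s\n' "$html" |
    sed -n -E '/.*<a href="(\/torrent\/[^"]*)\/">[^<]*<\/a>.*/{s//\1/p;q}')
printf '%s\n' "$first_link"
```

The empty pattern in s//\1/ reuses the address regex, so the long expression only has to be written once.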
I am trying to extract aa49a30add59 from the output of the following command, but with the -e option the \1 backreference gives me the entire line rather than the substring.
bash$ docker images | grep '^aaa' | sed -e "s/aaa\s+xxx\s+\([0-9]+\)\s+/\1/"
aaa xxx aa49a30add59 33
bash$ docker images | grep '^aaa' | sed -e "s/\(?:aaa\s+xxx\s+\)\([0-9]+\)\s+/\1/"
aaa xxx aa49a30add59 33 minutes ago 1.52 GB
Only aaa and xxx are fixed; everything else is dynamic.
How can I get only the matched substring?
I would use the following:
docker images | sed -nE "s/^aaa\s+xxx\s+([0-9a-f]+).*/\1/p"
-E switches to the ERE regex flavour, where unescaped parentheses and + are parsed as metacharacters. Your + now works, but:
the parentheses must no longer be escaped;
sed has no (?:...) non-capturing groups, so that part of your second attempt cannot match;
[0-9a-f] is needed rather than [0-9] to match hexadecimal digits;
the ID may be followed by further columns or by nothing at all, so a trailing .* replaces \s+ and consumes the rest of the line;
the ^ anchor, the -n flag and the final p make sed do the grep command's job, so grep can be removed.
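To try this without Docker, here is a rough sketch that feeds sed a made-up table shaped like docker images output. The sample rows are invented. I have used [[:space:]] instead of \s for portability, and the pattern ends in .* so that the remaining columns are dropped from the result.

```shell
# Made-up stand-in for `docker images` output, so the sketch runs anywhere.
sample='REPOSITORY   TAG   IMAGE ID       CREATED          SIZE
aaa          xxx   aa49a30add59   33 minutes ago   1.52 GB
bbb          yyy   0123456789ab   2 hours ago      900 MB'

# Keep only the hexadecimal ID from the row that starts with "aaa ... xxx".
image_id=$(printf '%s\n' "$sample" |
    sed -nE 's/^aaa[[:space:]]+xxx[[:space:]]+([0-9a-f]+).*/\1/p')
printf '%s\n' "$image_id"
```

With real data you would replace the printf with docker images; nothing else in the pipeline should need to change.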
I want to find the first line of a file that is not commented out with a hash, then add a line of text just after that line (or just before it).
I managed to get the number of the line:
sed -n '/^\s*#/!{=;q}' file   # prints 2
and also to insert text (specifying the line manually):
sed '2 a extralinecontent' file
I can't get them working together as a one liner or in a batch.
I tried command substitution (with $(command) and also with backticks) but I get an error from bash:
sed '$(sed -n '/^\s*#/!{=;q}' file) a extralinecontent' file
-bash: !{=: event not found
and also tried many other combinations, but no luck.
I'm using gnu-sed (via brew) on macOS.
This might work for you (GNU sed):
sed -e '/^\s*#/b;a extra line content' -e ':a;n;ba' file
Bail out of any lines beginning with a comment at the beginning of the file, append an extra line following the first line that is not a comment and keep fetching/printing all the remaining lines of the file.
Here's a way to do it with GNU sed without reading the file twice
$ cat ip.txt
#comment
foo baz good
123 456 7889
$ sed -e '0,/^\s*[^#[:space:]]/ {// a XYZ' -e '}' ip.txt
#comment
foo baz good
XYZ
123 456 7889
GNU sed allows the first address to be 0 if the other address is a regex; that way this works even if the first line matches the condition.
/^\s*[^#[:space:]]/ is written this way because sed doesn't support possessive quantifiers, so the character class has to ensure that the first character after any leading whitespace is neither a # nor a whitespace character.
// is a handy shortcut to repeat the last regex.
a XYZ is the line to be appended (note that your question mentions insert, so if you want that, use i instead of a).
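For completeness, the two-step approach from the question can also be made to work. Single quotes do not nest and they stop $(...) from being expanded, which is why the attempt in the question fell apart. A minimal sketch, assuming GNU sed (for the one-line a form) and a throwaway sample file created just for the demonstration:

```shell
# Hypothetical sample file, created only for this sketch.
tmp=$(mktemp)
printf '%s\n' '#comment' 'foo baz good' '123 456 7889' > "$tmp"

# Step 1: line number of the first line that is not a comment.
n=$(sed -n '/^[[:space:]]*#/!{=;q;}' "$tmp")

# Step 2: append after that line. Double quotes let the shell expand ${n}.
result=$(sed "${n}a extralinecontent" "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

This reads the file twice, unlike the single-pass answers above. If every line is a comment, n is empty and the a command would apply to every line, so a real script should check for that first.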
I have read sed manual for the s/// command. There it says:
e
This command allows one to pipe input from a shell command into pattern
space. If a substitution was made, the command that is found in
pattern space is executed and pattern space is replaced with its output.
A trailing newline is suppressed; results are undefined if the command
to be executed contains a nul character. This is a GNU sed extension.
I don't see how this is useful:
echo "1234" | sed 's/1/echo ss/e'
echo "1234" | sed 's/1/ss/'
These two commands produce the same output, so what is the e modifier for?
$ printf "%s\n" 1234 2345 3456 |
> sed -e 's/\(..\)\(..\)/echo $((\1 * \2))/e'
408
1035
1904
$
This printf command echoes three 4-digit numbers on three lines. The sed script splits each line into a pair of 2-digit numbers, creates a command echo $((12 * 34)), for example, runs it, and the output (408 for the given values) is included in (as) the pattern space — which is then printed. So, for this script, the pairs of 2-digit numbers are multiplied and the result is shown.
You can get fancier if you wish:
$ printf "%s\n" 1234 2345 3456 |
> sed -e 's/\(..\)\(..\)/echo \1 \\* \2 = $((\1 * \2))/e'
12 * 34 = 408
23 * 45 = 1035
34 * 56 = 1904
$
Note the double backslash — that's rather important. You could avoid the need for that using double quotes:
printf "%s\n" 1234 2345 3456 |
sed -e 's/\(..\)\(..\)/echo "\1 * \2 = $((\1 * \2))"/e'
Beware: the notation will run the external command every time the s/// command actually makes a substitution. If you have millions of lines of data in your files, that could mean millions of commands executed.
The /e option is a GNU sed extension. It causes the result of the replacement to be passed to the shell for evaluation as a command. Observe:
vnix$ sed 's/a/pwd/' <<<a
pwd
vnix$ sed 's/a/pwd/e' <<<a
/home/tripleee
Your example caused identical behavior (echoing back exactly the replaced text) so it was a poorly chosen example.
Out of the box, the pattern space refers to the current input line, but there are ways to put something else in the pattern space. Substitution modifies the pattern space, but there are other sed commands which modify the pattern space in other ways. (For a trivial example, x swaps the pattern space with the hold space, which is basically another built-in variable which you can use for whatever you want.)
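To make the point about the pattern space concrete, here is a small sketch (GNU sed only, since e is a GNU extension). After the substitution, the whole pattern space is run as a command, not just the replacement text, which is why the example in the question looked like a plain substitution.

```shell
# After s///, the pattern space is "echo ss234"; that entire string is executed.
out1=$(echo "1234" | sed 's/1/echo ss/e')

# Here the pattern space becomes "expr 3 + 4", and its output replaces the line.
out2=$(echo "3 4" | sed 's/\([0-9]*\) \([0-9]*\)/expr \1 + \2/e')

printf '%s\n' "$out1" "$out2"
```

The first command prints ss234 either way, while the second only gives 7 because of the e flag; without it you would see the literal text expr 3 + 4.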
I have a function that prints a header that needs to be applied across several files, but if I use command substitution inside a sed command, the inserted lines before the last one end with a backslash \.
E.g.
function print_header() {
cat << EOF
-------------------------------------------------------------------
$(date '+%B %d, %Y # ~ %r') ID:$(echo $RANDOM)
EOF
}
If I then take a file such as test.txt:
line 1
line 2
line 3
line 4
line 5
and run
sed "1 i $(print_header | sed 's/$/\\/g')" test.txt
I get:
-------------------------------------------------------------------\
November 24, 2015 # ~ 11:18:28 AM ID:13187
line 1
line 2
line 3
line 4
line 5
Notice the troublesome backslash at the end of the first line; I'd like it not to appear. Any ideas?
I would use cat for that:
cat <(print_header) file > file_with_header
This behavior depends on the sed dialect; unfortunately, it is one of the things that varies with the version you have.
To simplify debugging, try specifying verbatim text. Here's one from a Debian system.
vnix$ sed '1i\
> foo\
> bar' <<':'
> hello
> goodbye
> :
foo
bar
hello
goodbye
Your diagnostics appear to indicate that your sed dialect does not in fact require the backslash after the first i.
Since you are generating the contents of the header programmatically anyway, my recommended solution would be to refactor the code so that you can avoid this conundrum. If you don't want cat - test.txt <<EOF then maybe experiment with sed '1r/dev/stdin' test.txt <<EOF, bearing in mind that r places the text after line 1 rather than before it. (I could not get 1r- to work, but /dev/stdin should be portable to any Linux.)
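A rough sketch of that refactoring, with sed taken out of the picture entirely: pipe the header into cat and let - stand for standard input. The print_header below is a deterministic stand-in for the one in the question (fixed text instead of date and $RANDOM), and the sample file is created just for the demonstration.

```shell
# Stand-in for the question's print_header, with fixed text so output is stable.
print_header() {
    cat << EOF
-------------------------------------------------------------------
HEADER ID:42
EOF
}

tmp=$(mktemp)
printf '%s\n' 'line 1' 'line 2' 'line 3' > "$tmp"

# "-" makes cat read standard input (the header) before the file itself.
with_header=$(print_header | cat - "$tmp")
printf '%s\n' "$with_header"
rm -f "$tmp"
```

Since no sed text block is involved, no continuation backslashes are needed and none can leak into the output. To update a file in place you would write the result to a temporary file and move it over the original.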
Here is my kludgy fix; if you can find something more elegant, I'll gladly credit you:
sed "1 i $(print_header | sed 's/$/\\/g;$s/$/\x01/')" test.txt | tr -d '\001'
This puts an unprintable SOH character (\x01, ASCII Start Of Header) after the inserted text, which prevents the stray backslash from appearing; I then pipe the result through tr to delete the SOH characters.