OpenSUSE, in their infinite wisdom, has decided that ld -v will return
GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.37.20211103-7.26
I need to extract the 2 and 37 values and throw out the rest, and this needs to work with ld that isn't so screwed up.
I have tried numerous examples found here and elsewhere for extracting the version, but they all get hung up on 15. Does anyone have any idea on how I can extract this using sed?
Currently in the Makefile I am using
LD_MAJOR_VER := $(shell $(LD) -v | perl -pe '($$_)=/([0-9]+([.][0-9]+)+)/' | cut -f1 -d. )
LD_MINOR_VER := $(shell $(LD) -v | perl -pe '($$_)=/([0-9]+([.][0-9]+)+)/' | cut -f2 -d. )
though I would much prefer to use sed like it did before SuSE screwed up our build process with their 15.3 update. Any help would be greatly appreciated.
You can use
LD_MAJOR_VER := $(shell $(LD) -v | sed -n 's/.* \([0-9]*\).*/\1/p')
LD_MINOR_VER := $(shell $(LD) -v | sed -n 's/.* [0-9]*\.\([0-9]*\).*/\1/p')
Details:
-n - an option that suppresses default line output with sed
.* \([0-9]*\).* - a regex that matches the whole string:
.* - any zero or more chars
- space
\([0-9]*\) - Group 1 (the parentheses are escaped to form a capturing group since this is a POSIX BRE pattern): any zero or more digits
.* - any zero or more chars
\1 - the replacement is the Group 1 value
p - only prints the result of the substitution.
In the second regex, [0-9]*\. also matches zero or more digits (the major version number) with a dot after it to skip that value.
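To try these two expressions without depending on whichever ld happens to be installed, here is a minimal sketch that feeds a sample version string to them. The function names ld_major and ld_minor are made up for illustration; the sed scripts are the ones shown above.

```shell
#!/bin/sh
# Hypothetical helpers: read an `ld -v` style line on stdin, print one component.
ld_major() { sed -n 's/.* \([0-9]*\).*/\1/p'; }
ld_minor() { sed -n 's/.* [0-9]*\.\([0-9]*\).*/\1/p'; }

suse='GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.37.20211103-7.26'
# The greedy ".* " runs up to the LAST space, so the "15" earlier in the line is skipped.
printf '%s\n' "$suse" | ld_major
printf '%s\n' "$suse" | ld_minor
```

This relies on the version being the last space-separated word of the line, which holds for both sample strings in this thread.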
I would do it in two steps, which makes the logic clearer:
get the version information
get the major/minor or whatever from the version information
It would be easier to use awk to solve it, but since you said you prefer sed:
kent$ ver=$(sed 's/.*[[:space:]]//' <<< "GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.37.20211103-7.26")
kent$ echo $ver
2.37.20211103-7.26
kent$ major=$(sed 's/[.].*//' <<< $ver)
kent$ echo $major
2
kent$ minor=$(sed 's/^[^.-]*[.]//;s/[.].*//' <<< $ver)
kent$ echo $minor
37
If you use GNU make then its Functions for Transforming Text solve all this:
LD_VERSION := $(subst ., ,$(lastword $(shell $(LD) -v)))
LD_MAJOR_VER := $(word 1,$(LD_VERSION))
LD_MINOR_VER := $(word 2,$(LD_VERSION))
Moreover, it should be fairly robust: it works with any version string where the version is the last word and its components are separated by dots. Demo (where the version string is passed as a make variable instead of being returned by $(LD) -v):
$ cat Makefile
LD_VERSION := $(subst ., ,$(lastword $(LD_VERSION_STRING)))
LD_MAJOR_VER := $(word 1,$(LD_VERSION))
LD_MINOR_VER := $(word 2,$(LD_VERSION))
.PHONY: all
all:
	@echo $(LD_MAJOR_VER)
	@echo $(LD_MINOR_VER)
$ make LD_VERSION_STRING='blah blah blah 1.2.3.4.5.6.7'
1
2
$ make LD_VERSION_STRING='GNU ld (GNU Binutils for Debian) 2.35.2'
2
35
$ make LD_VERSION_STRING='GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.37.20211103-7.26'
2
37
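For comparison, the same "take the last word, then split on dots" idea can be sketched in plain POSIX shell. This is only an illustration of the technique, not part of the make solution; the function name ld_major_minor is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "major minor" for an `ld -v` style string given as $1.
ld_major_minor() {
    last_word=${1##* }              # keep only what follows the last space
    # Split the last word on dots; anything after the second component lands in _rest.
    IFS=. read -r major minor _rest <<EOF
$last_word
EOF
    printf '%s %s\n' "$major" "$minor"
}

ld_major_minor 'GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.37.20211103-7.26'
```

As with the make version, this assumes the version is the last word of the string.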
I want to filter the output of the blkid to get the UUID.
The output of blkid looks like
CASE 1:-
$ blkid
/dev/sda2: LABEL="A" UUID="4CC9-0015"
/dev/sda3: LABEL="B" UUID="70CF-169F"
/dev/sda1: LABEL=" NTFS_partition" UUID="3830C24D30C21234"
In some cases the output of blkid looks like
CASE 2:-
$ blkid
/dev/sda1: UUID="d7ec380e-2521-4fe5-bd8e-b7c02ce41601" TYPE="ext4"
/dev/sda2: UUID="fc54f19a-8ec7-418b-8eca-fbc1af34e57f" TYPE="ext4"
/dev/sda3: UUID="6f218da5-3ba3-4647-a44d-a7be19a64e7a" TYPE="swap"
I want to filter out the UUID.
Using the combination of grep and cut it can be done as
/sbin/blkid | /bin/grep 'sda1' | /bin/grep -o -E 'UUID="[a-zA-Z|0-9|\-]*' | /bin/cut -c 7-
I have tried using awk , grep and cut as below for filtering the UUID
$ /sbin/blkid | /bin/grep 'sda1' | /usr/bin/awk '{print $2}' | /bin/sed 's/\"//g' | cut -c 7-
7ec380e-2521-4fe5-bd8e-b7c02ce41601
The above command (which uses awk) is not reliable, since an extra field such as LABEL may sometimes be present in the blkid output, as shown above.
What is the best way to create a command using awk which works reliably?
Please post if any other elegant method exists for the job using bin and core utils. I don't want to use perl or python since this has to run on busybox.
NOTE:- I am using busybox blkid, to which /dev/sda1 cannot be passed as an argument (the version I am using does not support it), hence the grep to filter the line.
UPDATE:- added the CASE 2 output to show that the field position cannot be relied upon.
Why are you making it so complex?
Try this:
# blkid -s UUID -o value
d7ec380e-2521-4fe5-bd8e-b7c02ce41601
fc54f19a-8ec7-418b-8eca-fbc1af34e57f
6f218da5-3ba3-4647-a44d-a7be19a64e7a
Or this:
# blkid -s UUID -o value /dev/sda1
d7ec380e-2521-4fe5-bd8e-b7c02ce41601
Install proper blkid package if you don't have it:
sudo apt-get install util-linux
sudo yum install util-linux
For all the UUIDs, you can do:
$ blkid | sed -n 's/.*UUID=\"\([^\"]*\)\".*/\1/p'
d7ec380e-2521-4fe5-bd8e-b7c02ce41601
fc54f19a-8ec7-418b-8eca-fbc1af34e57f
6f218da5-3ba3-4647-a44d-a7be19a64e7a
Say, only for a specific sda1:
$ blkid | sed -n '/sda1/s/.*UUID=\"\([^\"]*\)\".*/\1/p'
d7ec380e-2521-4fe5-bd8e-b7c02ce41601
The sed command tries to group the contents present within the double quotes after the UUID keyword, and replaces the entire line with the token.
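One thing to watch for: util-linux blkid often prints a PARTUUID="..." field as well, and a greedy .*UUID= would then pick that one up. Here is a hedged sketch that anchors on the device name and on the space before UUID. The function name uuid_of and the PARTUUID value in the sample line are made up; I have only reasoned this through for the line shapes shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the UUID of device $1 from blkid-style text on stdin.
uuid_of() {
    # "\|...|" is an address using "|" as delimiter, so the slashes in $1 need no escaping.
    # The space before UUID keeps PARTUUID="..." from matching.
    sed -n '\|^'"$1"':|s/.* UUID="\([^"]*\)".*/\1/p'
}

# The PARTUUID value below is invented sample data; the second line is from the question.
printf '%s\n' \
    '/dev/sda1: UUID="d7ec380e-2521-4fe5-bd8e-b7c02ce41601" TYPE="ext4" PARTUUID="0000aaaa-01"' \
    '/dev/sda2: LABEL="A" UUID="4CC9-0015"' | uuid_of /dev/sda1
```

In real use the input would come from blkid, e.g. blkid | uuid_of /dev/sda1. Anchoring on "^/dev/sda1:" also avoids accidentally matching /dev/sda10.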
Here's a short awk solution:
blkid | awk 'BEGIN{FS="[=\"]"} {print $(NF-1)}'
Output:
4CC9-0015
70CF-169F
3830C24D30C21234
Explanation:
BEGIN{FS="[=\"]"} : Use = and " as delimiters
{print $(NF-1)}: NF stands for Number of Fields; here we print the 2nd to last field
This is based on the consistent structure of blkid output: UUID in quotes is at the end of each line.
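Since $(NF-1) depends on UUID being the last field, it prints the TYPE value for the CASE 2 lines in the question. A position-independent variant, sketched under the assumption that every value is wrapped in double quotes as in the samples, is to split on the quote character and look for the field that ends in UUID=. The function name uuid_values is made up.

```shell
#!/bin/sh
# Hypothetical helper: print the UUID value of every blkid-style line on stdin.
uuid_values() {
    awk -F'"' '{
        # With a double quote as separator, odd fields end in KEY= and even fields hold values.
        for (i = 1; i < NF; i += 2) {
            key = $i
            sub(/^.*[ \t]/, "", key)      # drop everything up to the last blank
            if (key == "UUID=") { print $(i + 1); next }
        }
    }'
}

printf '%s\n' \
    '/dev/sda1: UUID="d7ec380e-2521-4fe5-bd8e-b7c02ce41601" TYPE="ext4"' \
    '/dev/sda1: LABEL=" NTFS_partition" UUID="3830C24D30C21234"' | uuid_values
```

Because the key must equal UUID= exactly, fields such as PARTUUID= are ignored, and a label containing spaces does not shift anything.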
Alternatively:
blkid | awk 'BEGIN{FS="="} {print $NF}' | sed 's/"//g'
data.txt
/dev/sda2: LABEL="A" UUID="4CC9-0015"
/dev/sda3: LABEL="B" UUID="70CF-169F"
/dev/sda1: LABEL=" NTFS_partition" UUID="3830C24D30C21234"
awk and sed combination
cat data.txt | awk 'BEGIN{FS="UUID";RS="\n"} {print $2}' | sed -e 's/=//' -e 's/"//g'
Explanation:
Set the Field Separator to the string 'UUID', $2 will give the rest output
use sed then to remove the = and " as shown where -e is a switch so that you can give multiple sed commands/expression in one.
All occurrences of " are removed using the ending g option i.e. global.
The question has a "e.t.c" so I'm going to assume python is one of the options ;)
#!/usr/bin/env python3
import subprocess, re, json

# get blkid output
blkid = subprocess.check_output(["blkid"]).decode('utf-8')

devices = []
for line in [x for x in blkid.split('\n') if x]:
    # NOTE: splitting on whitespace does not handle values that contain
    # spaces, e.g. LABEL=" NTFS_partition"
    parameters = line.split()
    for idx, parameter in enumerate(parameters):
        if idx == 0:
            # the first token is the device node, e.g. "/dev/sda1:"
            devices.append({"DEVICE": re.sub(r':$','',parameter)})
            continue
        key_and_value = parameter.split('=')
        devices[-1].update({
            key_and_value[0]: re.sub(r'"','',key_and_value[1])
        })

uuids = [{dev['DEVICE']: dev['UUID']} for dev in devices if 'UUID' in dev.keys()]
print(json.dumps(uuids, indent=4, sort_keys=True))
Although this is probably overkill, and quite a bit of error handling and optimization is missing from this script XD
I assume you're using busybox in an initramfs and you are waiting for your e.g. USB drive with the rootfs on it to become available.
You could use the following awk script (busybox awk compliant).
# cat get-ruuid.awk
BEGIN {
    ruuid = ENVIRON["RUUID"]
    found = 0
}
/^\/dev\/sd[a-z]/ {
    if (index($0, tolower(ruuid)) || index($0, toupper(ruuid))) {
        split($1, parts, ":")
        printf("%s\n", parts[1])
        found = 1
        exit    # Stop scanning; the END block still runs and sets the exit status.
    }
}
END {
    exit(found ? 0 : 1)    # Fail only if RUUID was never found.
}
Call it as follows from e.g. the init script; this is not the most ideal way.
# The UUID of your root partition
export RUUID="<put proper uuid value here>"
for x in 1 2 3 4 5; do
    mdev -s
    found=$(blkid | awk -f ./get-ruuid.awk)
    test -z "$found" || break    # If no longer zero length, break the loop.
    sleep 1
done
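The retry pattern in that loop can be factored into a small reusable function. This is only a sketch; the name wait_for and its argument order are my own choice, and it decides by the exit status of the command it runs rather than by captured output.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, one second apart,
# and stop as soon as it succeeds.
wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1    # no pointless sleep after the last attempt
    done
    return 1
}
```

For example, wait_for 5 test -b /dev/sda1 would poll for up to five attempts until the block device node exists, assuming something else (such as mdev) is creating the nodes in the meantime.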
But if this is the only reason why you want an initramfs, I would use the 'root=PARTUUID=... rootwait' Linux kernel command line options instead. Check the kernel docs and sources.
Get the proper PARTUUID (NOT UUID) of your root partition with the blkid command.
I have a plaintext file containing multiple instances of the pattern $$DATABASE_*$$ and the asterisk could be any string of characters. I'd like to replace the entire instance with whatever is in the asterisk portion, but lowercase.
Here is a test file:
$$DATABASE_GIBSON$$
test me $$DATABASE_GIBSON$$ test me
$$DATABASE_GIBSON$$ test $$DATABASE_GIBSON$$ test
$$DATABASE_GIBSON$$ $$DATABASE_GIBSON$$$$DATABASE_GIBSON$$
Here is the desired output:
gibson
test me gibson test me
gibson test gibson test
gibson gibsongibson
How do I do this with sed/awk/tr/perl?
Here's the perl version I ended up using.
perl -p -i.bak -e 's/\$\$DATABASE_(.*?)\$\$/lc($1)/eg' inputFile
Unfortunately there's no easy, foolproof way with awk, but here's one approach:
$ cat tst.awk
{
    gsub(/[$][$]/,"\n")
    head = ""
    tail = $0
    while ( match(tail, "\nDATABASE_[^\n]+\n") ) {
        head = head substr(tail,1,RSTART-1)
        trgt = substr(tail,RSTART,RLENGTH)
        tail = substr(tail,RSTART+RLENGTH)
        gsub(/\n(DATABASE_)?/,"",trgt)
        head = head tolower(trgt)
    }
    $0 = head tail
    gsub("\n","$$")
    print
}
$ cat file
The quick brown $$DATABASE_FOX$$ jumped over the lazy $$DATABASE_DOG$$s back.
The grey $$DATABASE_SQUIRREL$$ ate $$DATABASE_NUT$$s under a $$DATABASE_TREE$$.
Put a dollar $$DATABASE_DOL$LAR$$ in the $$ string.
$ awk -f tst.awk file
The quick brown fox jumped over the lazy dogs back.
The grey squirrel ate nuts under a tree.
Put a dollar dol$lar in the $$ string.
Note the trick of converting $$ to a newline char so we can negate that char in the match() RE. Without that (i.e. if we used ".+" instead of "[^\n]+"), greedy RE matching would mean that if the same pattern appeared twice on one input line, the matching string would extend from the start of the first pattern to the end of the second pattern.
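The greedy-match problem described here is easy to see with plain sed. In this sketch the brackets are just markers to show how much text each match swallowed; the function names greedy and bounded are made up.

```shell
#!/bin/sh
# Hypothetical helpers: wrap the captured name in brackets to make the match extent visible.
greedy()  { sed 's/\$\$DATABASE_\(.*\)\$\$/[\1]/g'; }
bounded() { sed 's/\$\$DATABASE_\([^$]*\)\$\$/[\1]/g'; }

line='$$DATABASE_FOX$$ and $$DATABASE_DOG$$'
printf '%s\n' "$line" | greedy     # prints [FOX$$ and $$DATABASE_DOG]
printf '%s\n' "$line" | bounded    # prints [FOX] and [DOG]
```

The bounded form works here, but [^$]* cannot cope with a single $ inside the name (the DOL$LAR case), which is the reason for swapping the two-character $$ for a single newline before matching.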
This one works with complicated examples.
perl -ple 's/\$\$DATABASE_(.*?)\$\$/lc($1)/eg' filename.txt
And for simpler examples :
echo '$$DATABASE_GIBSON$$' | sed 's#$$DATABASE_\(.*\)\$\$#\L\1#'
In GNU sed, \L in the replacement means lower case (\E to stop if needed).
Using awk alone:
> echo '$$DATABASE_AWESOME$$' | awk '{sub(/.*_/,"");sub(/\$\$$/,"");print tolower($0);}'
awesome
Note that I'm in FreeBSD, so this is not GNU awk.
But this can be done using bash alone:
[ghoti@pc ~]$ foo='$$DATABASE_AWESOME$$'
[ghoti@pc ~]$ foo=${foo##*_}
[ghoti@pc ~]$ foo=${foo%\$\$}
[ghoti@pc ~]$ foo=${foo,,}
[ghoti@pc ~]$ echo $foo
awesome
Of the above substitutions, all except the last one (${foo,,}) will work in any POSIX shell. If you don't have bash, you can instead use tr for this step:
$ echo $foo
AWESOME
$ foo=$(echo "$foo" | tr '[:upper:]' '[:lower:]')
$ echo $foo
awesome
$
UPDATE:
Per comments, it seems that what the OP really wants is to strip the substring out of any text in which it is included -- that is, our solutions need to account for the possibility of leading or trailing spaces, before or after the string he provided in his question.
> echo 'foo $$DATABASE_KITTENS$$ bar' | sed -nE '/\$\$[^$]+\$\$/{;s/.*\$\$DATABASE_//;s/\$\$.*//;p;}' | tr '[:upper:]' '[:lower:]'
kittens
And if you happen to have pcregrep on your path (from the devel/pcre FreeBSD port), you can use that instead, with lookaheads:
> echo 'foo $$DATABASE_KITTENS$$ bar' | pcregrep -o '(?!\$\$DATABASE_)[A-Z]+(?=\$\$)' | tr '[:upper:]' '[:lower:]'
kittens
(For Linux users reading this: this is equivalent to using grep -P.)
And in pure bash:
$ shopt -s extglob
$ foo='foo $$DATABASE_KITTENS$$ bar'
$ foo=${foo##*(?)\$\$DATABASE_}
$ foo=${foo%%\$\$*(?)}
$ foo=${foo,,}
$ echo $foo
kittens
Note that NONE of these three updated solutions will handle situations where multiple tagged database names exist in the same line of input. That's not stated as a requirement in the question either, but I'm just sayin'....
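For what it's worth, multiple tags per line can be handled in plain POSIX shell by peeling off one tag per loop iteration. This is a sketch rather than a polished tool; the function name lower_tags is made up, and it assumes a name never contains $$ itself.

```shell
#!/bin/sh
# Hypothetical helper: replace every $$DATABASE_name$$ in $1 with the lowercased name.
lower_tags() {
    line=$1
    out=
    while :; do
        case $line in
            *'$$DATABASE_'*'$$'*)
                head=${line%%'$$DATABASE_'*}    # text before the first tag
                rest=${line#*'$$DATABASE_'}     # text after the first tag opener
                name=${rest%%'$$'*}             # the name, up to the closing $$
                line=${rest#*'$$'}              # whatever follows the closing $$
                out=$out$head$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
                ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$out$line"
}

lower_tags '$$DATABASE_GIBSON$$ $$DATABASE_GIBSON$$$$DATABASE_GIBSON$$'
```

To process a file, it can be wrapped in a loop such as while IFS= read -r l; do lower_tags "$l"; done < file, though forking tr once per tag makes this slow on large inputs.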
You can do this in a pretty foolproof way with the supercool command cut :)
echo '$$DATABASE_AWESOME$$' | cut -d'$' -f3 | cut -d_ -f2 | tr 'A-Z' 'a-z'
This might work for you (GNU sed):
sed 's/$\$/\n/g;s/\nDATABASE_\([^\n]*\)\n/\L\1/g;s/\n/$$/g' file
Here is the shortest (GNU) awk solution I could come up with that does everything requested by the OP:
awk -vRS='[$][$]DATABASE_([^$]+[$])+[$]' '{ORS=tolower(substr(RT,12,length(RT)-13))}1'
Even if the string indicated with the asterisk (*) contained one or more single dollar signs ($) and/or line breaks, this solution should still work.
awk '{gsub(/\$\$DATABASE_GIBSON\$\$/,"gibson")}1' file
gibson
test me gibson test me
gibson test gibson test
gibson gibsongibson
echo '$$DATABASE_WOOLY$$' | awk '{print tolower($0)}'
awk takes whatever input it is given, in this case the echoed string, applies the tolower function, and prints the result. (The single quotes keep the shell from expanding $$ to its process ID.)
For your bash script you can do something like this and use the variable DBLOWER
DBLOWER=$(echo '$$DATABASE_WOOLY$$' | awk '{print tolower($0)}')
The terminal transcript speaks for itself:
iMac:~$ echo -n a | md5
0cc175b9c0f1b6a831c399e269772661
iMac:~$ perl -e 'system "echo -n a | md5"'
c3392e9373ccca33629d82b17699420f
Note that the MD5 hash of a is 0cc175b9c0f1b6a831c399e269772661, the first
result. Why does it turn out to be different when the same command is called
by perl?
By the way, perl is perl 5, version 12, subversion 4 (v5.12.4) built for darwin-thread-multi-2level. And the system: Mac OS 10.8, Darwin 12.0
When in the /bin/sh shell on Mac, the built-in echo does not treat -n as the option that suppresses the trailing newline, as it does in /bin/bash; it prints -n literally. You can see this if you drop into /bin/sh and run echo -n a; your output should look like this:
sh-3.2$ echo -n a
-n a
so you're literally getting -n a instead of the desired a. As perl system runs /bin/sh to evaluate your command, -n a is being passed into md5 instead of your desired a
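A common way to sidestep the difference between echo implementations is printf, which takes a format string and does not append a newline unless the format contains one. A minimal sketch (the function name emit is made up):

```shell
#!/bin/sh
# printf writes its data through the %s format, so "-n" is never taken as an option,
# and no newline is added because the format does not contain one.
emit() { printf '%s' "$1"; }

emit a | od -c
```

In the question's case that would mean piping printf '%s' a into md5 instead of echo -n a, which should behave the same whichever shell runs it.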
The specific question has already been answered, but I want to point out that od is useful to help understand exactly what any command outputs or file contains. This is useful especially to show otherwise non-printing characters.
$ echo -n a | od -tc
0000000 a
0000001
$ perl -e 'system "echo -n a | od -tc";'
0000000 - n a \n
0000005
The following command is correctly changing the contents of 2 files.
sed -i 's/abc/xyz/g' xaa1 xab1
But what I need to do is to change several such files dynamically and I do not know the file names. I want to write a command that will read all the files from current directory starting with xa* and sed should change the file contents.
I'm surprised nobody has mentioned the -exec argument to find, which is intended for this type of use-case, although it will start a process for each matching file name:
find . -type f -name 'xa*' -exec sed -i 's/asd/dsg/g' {} \;
Alternatively, one could use xargs, which will invoke fewer processes:
find . -type f -name 'xa*' | xargs sed -i 's/asd/dsg/g'
Or more simply use the + exec variant instead of ; in find to allow find to provide more than one file per subprocess call:
find . -type f -name 'xa*' -exec sed -i 's/asd/dsg/g' {} +
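If file names may contain spaces or newlines, the plain find | xargs pipeline above will split them incorrectly. A sketch of the null-delimited variant follows; the function name replace_in_matching is made up, and -i.bak (suffix attached, backups kept) is used because, as far as I know, that spelling is accepted by both GNU and BSD sed.

```shell
#!/bin/sh
# Hypothetical helper: $1 = directory, $2 = file name glob, $3 = sed script.
replace_in_matching() {
    # -print0 / -0 separate names with NUL bytes, so whitespace in names is safe.
    find "$1" -type f -name "$2" -print0 | xargs -0 sed -i.bak "$3"
}

# Demo in a scratch directory, including a name with a space in it.
demo_dir=$(mktemp -d)
printf 'abc abc\n' > "$demo_dir/xaa1"
printf 'abc\n' > "$demo_dir/xa b"
replace_in_matching "$demo_dir" 'xa*' 's/abc/xyz/g'
cat "$demo_dir/xaa1" "$demo_dir/xa b"
```

One caveat: if nothing matches, some xargs implementations still run sed once with no file arguments, which makes sed complain. The -exec ... {} + form does not have that problem.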
Better yet:
for i in xa*; do
    sed -i 's/asd/dfg/g' "$i"
done
because nobody knows how many files there are, and it's easy to break command line limits.
Here's what happens when there are too many files:
# grep -c aaa *
-bash: /bin/grep: Argument list too long
# for i in *; do grep -c aaa $i; done
0
... (output skipped)
#
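The limit behind that error applies when an external command is executed with a long argument list; the for loop avoids it because the glob is expanded inside the shell and each command only receives one name. To see the limit on a given system, something like this should work (the function name arg_max is made up; getconf ARG_MAX is the standard query):

```shell
#!/bin/sh
# Upper bound, in bytes, on the arguments plus environment passed to a new process.
arg_max() { getconf ARG_MAX; }

arg_max
```

The value varies between systems, so no particular number is shown here.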
You could use grep and sed together. This allows you to search subdirectories recursively.
Linux: grep -r -l <old> * | xargs sed -i 's/<old>/<new>/g'
OS X: grep -r -l <old> * | xargs sed -i '' 's/<old>/<new>/g'
For grep:
-r recursively searches subdirectories
-l prints file names that contain matches
For sed:
-i extension (Note: An argument needs to be provided on OS X)
Those commands won't work in the default sed that comes with Mac OS X.
From man 1 sed:
-i extension
Edit files in-place, saving backups with the specified
extension. If a zero-length extension is given, no backup
will be saved. It is not recommended to give a zero-length
extension when in-place editing files, as you risk corruption
or partial content in situations where disk space is exhausted, etc.
Tried
sed -i '.bak' 's/old/new/g' logfile*
and
for i in logfile*; do sed -i '.bak' 's/old/new/g' $i; done
Both work fine.
@PaulR posted this as a comment, but people should view it as an answer (and this answer works best for my needs):
sed -i 's/abc/xyz/g' xa*
This will work for a moderate amount of files, probably on the order of tens, but probably not on the order of millions.
Another more versatile way is to use find:
sed -i 's/asd/dsg/g' $(find . -type f -name 'xa*')
I'm using find for similar task. It is quite simple: you have to pass it as an argument for sed like this:
sed -i 's/EXPRESSION/REPLACEMENT/g' `find -name "FILE.REGEX"`
This way you don't have to write complex loops, and it is simple to see, which files you are going to change, just run find before you run sed.
You can do the following, where 'xxxx' is the text you search for and 'yyyy' is its replacement:
grep -Rn 'xxxx' /path | awk -F: '{print $1}' | xargs sed -i 's/xxxx/yyyy/'
There's some good answers above. I thought I'd throw in one more that is succinct and parallelizable, using GNU parallel, which I often prefer to xargs:
parallel sed -i 's/abc/xyz/g' {} ::: xa*
Combine this with the -j N option to run N jobs in parallel.
If you are able to run a script, here is what I did for a similar situation:
Using a dictionary/hashMap (associative array) and variables for the sed command, we can loop through the array to replace several strings. Including a wildcard in the name_pattern will allow to replace in-place in files with a pattern (this could be something like name_pattern='File*.txt' ) in a specific directory (source_dir).
All the changes are written in the logfile in the destin_dir
#!/bin/bash
source_dir=source_path
destin_dir=destin_path
logfile='sedOutput.txt'
name_pattern='File.txt'
echo "--Begin $(date)--" | tee -a $destin_dir/$logfile
echo "Source_DIR=$source_dir destin_DIR=$destin_dir "
declare -A pairs=(
['WHAT1']='FOR1'
['OTHER_string_to replace']='string replaced'
)
for i in "${!pairs[@]}"; do
    j=${pairs[$i]}
    echo "[$i]=$j"
    replace_what=$i
    replace_for=$j
    echo " "
    echo "Replace: $replace_what for: $replace_for"
    find "$source_dir" -name "$name_pattern" | xargs sed -i "s/$replace_what/$replace_for/g"
    find "$source_dir" -name "$name_pattern" | xargs -I{} grep -n "$replace_for" {} /dev/null | tee -a "$destin_dir/$logfile"
done
echo " "
echo "----End $(date)---" | tee -a $destin_dir/$logfile
First, the pairs array is declared, each pair is a replacement string, then WHAT1 will be replaced for FOR1 and OTHER_string_to replace will be replaced for string replaced in the file File.txt. In the loop the array is read, the first member of the pair is retrieved as replace_what=$i and the second as replace_for=$j. The find command searches in the directory the filename (that may contain a wildcard) and the sed -i command replaces in the same file(s) what was previously defined. Finally I added a grep redirected to the logfile to log the changes made in the file(s).
This worked for me in GNU Bash 4.3 sed 4.2.2 and based upon VasyaNovikov's answer for Loop over tuples in bash.
The Silver Searcher Solution
I'm adding another option for those people who don't know about the amazing tool called The Silver Searcher (command line tool is ag).
Note: You can use grep and other tools to do the same thing here, but The Silver Searcher is fantastic :)
TLDR
ag -l 'abc' | xargs sed -i 's/abc/xyz/g'
Install The Silver Searcher
sudo apt install silversearcher-ag # Debian / Ubuntu
sudo pacman -S the_silver_searcher # Arch / EndeavourOS
sudo yum install epel-release the_silver_searcher # RHEL / CentOS
Demo Files
Paste the following into your terminal to create some demonstration files:
mkdir /tmp/food
cd /tmp/food
content="Everybody loves to abc this food!"
echo "$content" > ./milk
echo "$content" > ./bread
mkdir ./fastfood
echo "$content" > ./fastfood/pizza
echo "$content" > ./fastfood/burger
mkdir ./fruit
echo "$content" > ./fruit/apple
echo "$content" > ./fruit/apricot
Using 'ag'
The following ag command will recursively find all the files that contain the string 'abc'. It ignores the .git directory, .gitignore files, and other ignore files:
$ ag 'abc'
milk
1:Everybody loves to abc this food!
bread
1:Everybody loves to abc this food!
fastfood/burger
1:Everybody loves to abc this food!
fastfood/pizza
1:Everybody loves to abc this food!
fruit/apple
1:Everybody loves to abc this food!
fruit/apricot
1:Everybody loves to abc this food!
To just list the files that contain the string 'abc', use the -l switch:
$ ag -l 'abc'
bread
fastfood/burger
fastfood/pizza
fruit/apricot
milk
fruit/apple
Changing Multiple Files
Finally, using xargs and sed, we can replace the 'abc' string with another string:
ag -l 'abc' | xargs sed -i 's/abc/eat/g'
In the above command, ag lists all the files that contain the string 'abc'. The xargs command reads those file names and passes them as arguments to sed, batching as many as fit on one command line.
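To see what xargs actually builds, you can put echo in front of the command so the resulting command line is printed instead of executed. A small sketch using a few of the demo file names (the list is typed in by hand here rather than produced by ag):

```shell
#!/bin/sh
# echo stands in front of sed, so the constructed command line is printed, not run.
printf '%s\n' milk bread fruit/apple | xargs echo sed -i 's/abc/eat/g'

# -n 1 forces one invocation per file name instead of one batch.
printf '%s\n' milk bread fruit/apple | xargs -n 1 echo sed -i 's/abc/eat/g'
```

The first pipeline prints a single sed command line with all three names appended; the second prints three separate ones. The quotes around the sed script are gone in the printed text only because echo shows the arguments after the shell has already removed them.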