Find and replace in UNIX - sed

I'm having the following string in a file called test.txt,
test.log test1.log test2.log
I want to replace it with
test.log -A test1.log -A test2.log
I tried:
sed -i 's/.log/.log -A/g' test.txt
But the output is
test.log -A test1.log -A test2.log -A
I don't want that to be appended in the last file. Can someone help me on this?

If the arguments are separated by space and final argument in the line doesn't have spaces after it, you could use this:
$ cat ip.txt
test.log test1.log test2.log
$ sed 's/\.log /&-A /g' ip.txt
test.log -A test1.log -A test2.log
since . is a metacharacter, you have to use \. to match it literally
& in replacement section represents entire matched portion in search section
You could also use awk here, better suited for field processing and added advantage of stripping away whitespaces at start/end of line
$ awk -v OFS=' -A ' '/\.log/{$1=$1} 1' ip.txt
test.log -A test1.log -A test2.log
default input field separator(FS) is one or more contiguous whitespace, so no need to set that
-v OFS=' -A ' set space followed by -A and space as output field separator(OFS)
/\.log/ if line contains .log
$1=$1 re-build input record, so that input FS will be replaced by OFS
1 idiomatic way to print input record
note that this solution won't change a line if it doesn't contain .log

Related

Get version of Podspec via command line (bash, zsh) [duplicate]

Given a file, for example:
potato: 1234
apple: 5678
potato: 5432
grape: 4567
banana: 5432
sushi: 56789
I'd like to grep for all lines that start with potato: but only pipe the numbers that follow potato:. So in the above example, the output would be:
1234
5432
How can I do that?
grep 'potato:' file.txt | sed 's/^.*: //'
grep looks for any line that contains the string potato:, then, for each of these lines, sed replaces (s/// - substitute) any character (.*) from the beginning of the line (^) until the last occurrence of the sequence : (colon followed by space) with the empty string (s/...// - substitute the first part with the second part, which is empty).
or
grep 'potato:' file.txt | cut -d\ -f2
For each line that contains potato:, cut will split the line into multiple fields delimited by space (-d\ - d = delimiter, \ = escaped space character, something like -d" " would have also worked) and print the second field of each such line (-f2).
or
grep 'potato:' file.txt | awk '{print $2}'
For each line that contains potato:, awk will print the second field (print $2) which is delimited by default by spaces.
or
grep 'potato:' file.txt | perl -e 'for(<>){s/^.*: //;print}'
All lines that contain potato: are sent to an inline (-e) Perl script that takes all lines from stdin, then, for each of these lines, does the same substitution as in the first example above, then prints it.
or
awk '{if(/potato:/) print $2}' < file.txt
The file is sent via stdin (< file.txt sends the contents of the file via stdin to the command on the left) to an awk script that, for each line that contains potato: (if(/potato:/) returns true if the regular expression /potato:/ matches the current line), prints the second field, as described above.
or
perl -e 'for(<>){/potato:/ && s/^.*: // && print}' < file.txt
The file is sent via stdin (< file.txt, see above) to a Perl script that works similarly to the one above, but this time it also makes sure each line contains the string potato: (/potato:/ is a regular expression that matches if the current line contains potato:, and, if it does (&&), then proceeds to apply the regular expression described above and prints the result).
Or use regex assertions: grep -oP '(?<=potato: ).*' file.txt
grep -Po 'potato:\s\K.*' file
-P to use Perl regular expression
-o to output only the match
\s to match the space after potato:
\K to omit the match
.* to match rest of the string(s)
sed -n 's/^potato:[[:space:]]*//p' file.txt
One can think of Grep as a restricted Sed, or of Sed as a generalized Grep. In this case, Sed is one good, lightweight tool that does what you want -- though, of course, there exist several other reasonable ways to do it, too.
This will print everything after each match, on that same line only:
perl -lne 'print $1 if /^potato:\s*(.*)/' file.txt
This will do the same, except it will also print all subsequent lines:
perl -lne 'if ($found){print} elsif (/^potato:\s*(.*)/){print $1; $found++}' file.txt
These command-line options are used:
-n loop around each line of the input file
-l removes newlines before processing, and adds them back in afterwards
-e execute the perl code
You can use grep, as the other answers state. But you don't need grep, awk, sed, perl, cut, or any external tool. You can do it with pure bash.
Try this (semicolons are there to allow you to put it all on one line):
$ while read line;
do
if [[ "${line%%:\ *}" == "potato" ]];
then
echo ${line##*:\ };
fi;
done< file.txt
## tells bash to delete the longest match of ": " in $line from the front.
$ while read line; do echo ${line##*:\ }; done< file.txt
1234
5678
5432
4567
5432
56789
or if you wanted the key rather than the value, %% tells bash to delete the longest match of ": " in $line from the end.
$ while read line; do echo ${line%%:\ *}; done< file.txt
potato
apple
potato
grape
banana
sushi
The substring to split on is ":\ " because the space character must be escaped with the backslash.
You can find more like these at the linux documentation project.
Modern BASH has support for regular expressions:
while read -r line; do
if [[ $line =~ ^potato:\ ([0-9]+) ]]; then
echo "${BASH_REMATCH[1]}"
fi
done
grep potato file | grep -o "[0-9].*"

Removing a specific line in bash with an exact string

I'm having trouble in getting sed to remove just the specific line I want. Let's say I have a file that looks like this:
testfile
testfile.txt
testfile2
Currently I'm using this to remove the line I want:
sed -i "/$1/d" file
The issue is that with this if I were to give testfile as input it would delete all three lines but I want it to only remove the first line. How do I do this?
With grep
grep -x -F -v -- "$1" file
# or
grep -xFv -- "$1" file
-F is for "fixed strings" -- turns off regex engine.
-x is to match entire line.
-v is for "everything but" the matched line(s).
-- to signal the end of options, in case $1 starts with a hyphen.
To save the file
grep -xFv -- "$1" file | sponge file # `moreutils` package
# or
tmp=$(mktemp)
grep -xFv -- "$1" file > "$tmp" && mv "$tmp" file
So match the whole line.
var=testfile
sed -i '/^'"$var"'$/d' file
# or with " quoting
sed -i "/^$var\$/d" file
You can learn regex with fun online with regex crosswords.

How to convert in file csv date in specific column to unix date

I have a file csv with this columns:
"Weight","Impedance","Units","User","Timestamp","PhysiqueRating"
"58.75","5.33","kg","7","2020-7-11 19:29:29","5"
Of course, I can convert the date command:
date -d '2020-7-11 19:29:29' +%s
Results:
1594488569
How to replace this date in csv file in bash script?
With GNU sed
sed -E '2,$ s/(("[^"]*",){4})("[^"]+")(.*)/echo \x27\1"\x27$(date -d \3 +%s)\x27"\4\x27/e'
2,$ to skip header from getting processed
(("[^"]*",){4}) first four columns
("[^"]+") fifth column
(.*) rest of the line
echo \x27\1"\x27 and \x27"\4\x27 preserve first four columns and rest of line after fifth column, along with adding double quotes to result of date conversion
$(date -d \3 +%s) calling shell command with fifth column value
Note that this command will fail if input can contain single quotes. That can be worked around by using s/\x27/\x27\\&\x27/g.
You can see the command that gets executed by using -n option and pe flags
sed -nE '2,$ s/(("[^"]*",){4})("[^"]+")(.*)/echo \x27\1"\x27$(date -d \3 +%s)\x27"\4\x27/pe'
will give
echo '"58.75","5.33","kg","7","'$(date -d "2020-7-11 19:29:29" +%s)'","5"'
For 58.25,5.89, kg, 7,2020 / 7/12 11:23:46, "5" format, try
sed -E '2,$ s/(([^,]*,){4})([^,]+)(.*)/echo \x27\1\x27$(date -d "\3" +%s)\x27\4\x27/e'
or (adapted from https://stackoverflow.com/a/62862416)
awk 'BEGIN{FS=OFS=","} NR>1{$5=mktime(gensub(/[:\/]/, " ", "g", $5))} 1'
Note: For the sed solution, if the input can come from outside source, you'll have to take care to avoid malicious intent as mentioned in the comments. One way is to match the fifth column using [0-9: -]+ or similar.
Using GNU awk:
$ gawk '
BEGIN {
FS=OFS=","
}
{
n=split($5,a,/[-" :]/)
if(n==8)
$5="\"" mktime(sprintf("%s %s %s %s %s %s",a[2],a[3],a[4],a[5],a[6],a[7])) "\""
}1' file
Output:
"Weight","Impedance","Units","User","Timestamp","PhysiqueRating"
"58.75","5.33","kg","7","1594484969","5"
With GNU awk for gensub() and mktime():
$ awk 'BEGIN{FS=OFS="\""} NR>1{$10=mktime(gensub(/[-:]/," ","g",$10))} 1' file
"Weight","Impedance","Units","User","Timestamp","PhysiqueRating"
"58.75","5.33","kg","7","1594513769","5"

Prefix the contents of a file using content of another file having a single line

I am trying to insert the contents of a file1.txt into file2.txt using sed. The content of file1.txt is just a single line, which is a path.
I want it to be added as a prefix to each line in file2.txt as well as add another / character.
$ cat file1.txt
/psot/rot8888/orce/db/tier/data/tine
$ cat file2.txt
o1_mf_users_abchwfg_.dbf
o1_mf_toptbs2_abchrq0_.dbf
o1_mf_toptbs1_abchrl2_.dbf
o1_mf_toptbs1_abchtlf_.dbf
Desired output should be like:
/psot/rot8888/orce/db/tier/data/tine/o1_mf_users_abchwfg_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs2_abchrq0_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs1_abchrl2_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs1_abchtlf_.dbf
Tried command:
$ sed '/o1/ r file1.txt' file2.txt >> test.txt
$ cat test.txt
o1_mf_users_abchwfg_.dbf
/psot/rot8888/orce/db/tier/data/tine
o1_mf_toptbs2_abchrq0_.dbf
/psot/rot8888/orce/db/tier/data/tine
o1_mf_toptbs1_abchrl2_.dbf
/psot/rot8888/orce/db/tier/data/tine
o1_mf_toptbs1_abchtlf_.dbf
/psot/rot8888/orce/db/tier/data/tine
You can use pr for this without having to worry about sed metacharacters, delimiters, etc.
$ cat ip.txt
abcd.xyz
123.txt
foo_baz.txt
$ cat f1
/a/b/c/d/
$ pr -mts"$(< f1)" /dev/null ip.txt
/a/b/c/d/abcd.xyz
/a/b/c/d/123.txt
/a/b/c/d/foo_baz.txt
Where -m allows pasting files parallely and -s is the separator between the files to be merged. Here, /dev/null is used as a dummy for one of the files as only the separator has to be prefixed.
If you need to add some more characters after the contents of file containing the prefix:
$ cat ip.txt
abcd.xyz
123.txt
foo_baz.txt
$ cat f1
/a/b/c/d
$ pr -mts"$(< f1)"'/' /dev/null ip.txt
/a/b/c/d/abcd.xyz
/a/b/c/d/123.txt
/a/b/c/d/foo_baz.txt
This will work using any awk in any shell on every UNIX box:
$ awk 'NR==FNR{p=$0; next} {print p "/" $0}' file1 file2
/psot/rot8888/orce/db/tier/data/tine/o1_mf_users_abchwfg_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs2_abchrq0_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs1_abchrl2_.dbf
/psot/rot8888/orce/db/tier/data/tine/o1_mf_toptbs1_abchtlf_.dbf
This might work for you (GNU sed):
sed '1h;1d;G;s/\(.*\)\n\(.*\)/\2\1/' file1 file2
Copy file1 into the hold space and append it to each line in file2. Using regexp and back references, manipulate the two lines into one in the correct order.
Alternative:
sed 'x;s/.*/cat file1/e;G;s/\n//' file2
Insert file1 into the hold space, append the current line of file2, and remove the newline connecting them.
A third way:
sed 'r file1' file2 | sed -E 'N;s/(.*)\n(.*)/\2\/\1/'

grep -f forEXACT pattern

I want TO extract list of names from other bigger file (input), having that name and some additional information associated with that name. My problem is with grep -f option, as it is not matching the exact entries in input file but some other entries that contain similar name.
I tried:
$ grep -f list.txt -A 1 input >output
Following are the format of files;
list.txt
TE_final_35005
TE_final_1040
Input file
>TE_final_10401
ACGTACGTACGTACGT
>TE_final_35005
ACGTACGATCAGT
>TE_final_1040
ACGTACGTACGT
Required output:
>TE_final_35005
ACGTACGATCAGT
>TE_final_1040
ACGTACGTACGT
output I am getting:
>TE_final_10401
ACGTACGTACGTACGT
>TE_final_35005
ACGTACGATCAGT
>TE_final_1040
ACGTACGTACGT
Although TE_final_10401 is not in the list.txt
How I can use ^ in list?
Please help to match the exact value or suggest other ways to do this.
Add the whole word switch (-w):
grep -w -A1 -f list.txt infile
Output:
>TE_final_35005
ACGTACGATCAGT
>TE_final_1040
ACGTACGTACGT
A couple of things, remove the blanks lines from the files first:
sed -i '/^\s*$/d' file list
Then -w is used to match whole words only and -A1 will print the next line after the match:
$ grep -w -A1 -f list file > new_file
$ cat new_file
>TE_final_35005
ACGTACGATCAGT
>TE_final_1040
ACGTACGTACGT
as others have mentioned, adding the -w flag is the cleanest and easiest approach based on your sample data. but since you explicitly asked how you could use ^ in list.txt, here's another option.
to add ^ and/or $ anchors to each line in list.txt:
$ cat list.txt
^>TE_final_35005[ ]*$
^>TE_final_1040[ ]*$
this searches for your patterns at the start of the line, preceded by a > character, and ignores any trailing spaces. then your previous command will work (assuming you either remove those blank lines or change your argument to -A 2).
if you'd like to add these anchors to the list file automatically (and delete any blank lines at the same time), use this awk construct:
awk '{if($0 != ""){print "^>"$0"[ ]*$"}}' list.txt >newlist.txt
or if you prefer sed inplace editing:
sed -i '/^[ ]*$/d;s/\(.*\)/^>\1[ ]*$/g' list.txt