sed: add semicolon after each sentence

I'm using this grep command to extract the SQL statements from my log file:
grep -oPzZ ' SQL "\K[^"]+' log.log
After that, I need to format the output by appending ; to each extracted SQL statement:
grep -oPzZ ' SQL "\K[^"]+' log.log | sed -E '$s/$/\n/; s/\x0/;/; s/^[[:blank:]]+//'
However, it doesn't seem to work. I'm getting:
alter table HFJ_RES_LINK modify ( SRC_PATH varchar2(200) );create index IDX_VALUESET_EXP_STATUS on TRM_VALUESET(EXPANSION_STATUS)drop index IDX_VALUESET_EXP_STATUS
As you can see, a ; is added after the first detected SQL statement, but not after the later ones.
log.log is similar to:
3_6_0.20180929.1: SQL "alter table HFJ_RES_LINK modify ( SRC_PATH varchar2(200) )" returned 0
4_0_0.20190722.37: SQL "create index IDX_VALUESET_EXP_STATUS on TRM_VALUESET(EXPANSION_STATUS)" returned 0
Any ideas?

It would be better to use awk here as you can combine both grep and sed operations into a single command. Consider this GNU awk solution:
awk -v RS='SQL "[^"]+"' 'RT {gsub(/^SQL "|"|\n/, "", RT); print RT ";"}' file
alter table HFJ_RES_LINK modify( SRC_PATH varchar2(200) );
create index IDX_VALUESET_EXP_STATUS on TRM_VALUESET(EXPANSION_STATUS);
Details:
This awk command uses a custom record separator (RS): SQL followed by a single space and then a double-quoted string.
The text matched by RS is available in gawk's built-in variable RT.
When RT is non-empty, we remove the leading SQL ", the trailing ", and all line breaks from RT, and finally print it with a trailing ;.

A simple sed can replace your grep + sed:
sed -nr 's/.*SQL "([^"]*)".*/\1;/p' log.log
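A side note on why the original pipeline stops after the first ;: grep -z emits NUL-terminated records, so unless a statement itself contains a newline, sed sees the whole stream as a single line, and s/\x0/;/ without the g flag replaces only the first NUL. Below is a minimal sketch of one way around that, assuming each statement fits on one line. printf stands in for grep here, and add_semicolons is a made-up helper name.

```shell
# Assumption: records arrive NUL-terminated, as grep -z would emit them,
# and no statement spans several lines.
add_semicolons() {
  # tr turns every NUL into a newline; sed then trims leading blanks
  # and appends ; to each line
  tr '\0' '\n' | sed 's/^[[:blank:]]*//; s/$/;/'
}

result=$(printf 'alter table A modify ( X varchar2(200) )\0create index I on T(C)\0' | add_semicolons)
printf '%s\n' "$result"
```

With multi-line statements this would put a ; on every line, so the awk approach above is probably the safer choice there.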

Get version of Podspec via command line (bash, zsh) [duplicate]

Given a file, for example:
potato: 1234
apple: 5678
potato: 5432
grape: 4567
banana: 5432
sushi: 56789
I'd like to grep for all lines that start with potato: but only pipe the numbers that follow potato:. So in the above example, the output would be:
1234
5432
How can I do that?
grep 'potato:' file.txt | sed 's/^.*: //'
grep looks for any line that contains the string potato:, then, for each of these lines, sed replaces (s/// - substitute) any character (.*) from the beginning of the line (^) until the last occurrence of the sequence : (colon followed by space) with the empty string (s/...// - substitute the first part with the second part, which is empty).
or
grep 'potato:' file.txt | cut -d' ' -f2
For each line that contains potato:, cut will split the line into multiple fields delimited by space (-d' ' - d = delimiter, ' ' = a quoted space character; a backslash-escaped space would also have worked) and print the second field of each such line (-f2).
or
grep 'potato:' file.txt | awk '{print $2}'
For each line that contains potato:, awk will print the second field (print $2); by default, fields are delimited by runs of whitespace.
or
grep 'potato:' file.txt | perl -e 'for(<>){s/^.*: //;print}'
All lines that contain potato: are sent to an inline (-e) Perl script that takes all lines from stdin, then, for each of these lines, does the same substitution as in the first example above, then prints it.
or
awk '{if(/potato:/) print $2}' < file.txt
The file is sent via stdin (< file.txt sends the contents of the file via stdin to the command on the left) to an awk script that, for each line that contains potato: (if(/potato:/) returns true if the regular expression /potato:/ matches the current line), prints the second field, as described above.
or
perl -e 'for(<>){/potato:/ && s/^.*: // && print}' < file.txt
The file is sent via stdin (< file.txt, see above) to a Perl script that works similarly to the one above, but this time it also makes sure each line contains the string potato: (/potato:/ is a regular expression that matches if the current line contains potato:, and, if it does (&&), then proceeds to apply the regular expression described above and prints the result).
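One thing the variants above share is that they match potato: anywhere in the line, while the question asks for lines that start with it. Here is a minimal sketch of a stricter match, assuming keys and values are always separated by a colon and a single space. potato_values is a hypothetical helper name, and the sweetpotato line is made-up test data.

```shell
# Compare the whole first field instead of searching for a substring.
potato_values() {
  # -F': ' splits each line at "colon + space"; $1 is the key, $2 the value
  awk -F': ' '$1 == "potato" { print $2 }' "$@"
}

result=$(printf 'potato: 1234\nsweetpotato: 99\napple: 5678\npotato: 5432\n' | potato_values)
printf '%s\n' "$result"
```

The sweetpotato line is skipped here, whereas grep 'potato:' would have let it through.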
Or use regex assertions: grep -oP '(?<=potato: ).*' file.txt
grep -Po 'potato:\s\K.*' file
-P to use Perl regular expression
-o to output only the match
\s to match the space after potato:
\K to drop everything matched so far from the reported match
.* to match rest of the string(s)
sed -n 's/^potato:[[:space:]]*//p' file.txt
One can think of Grep as a restricted Sed, or of Sed as a generalized Grep. In this case, Sed is one good, lightweight tool that does what you want -- though, of course, there exist several other reasonable ways to do it, too.
This will print everything after each match, on that same line only:
perl -lne 'print $1 if /^potato:\s*(.*)/' file.txt
This will do the same, except it will also print all subsequent lines:
perl -lne 'if ($found){print} elsif (/^potato:\s*(.*)/){print $1; $found++}' file.txt
These command-line options are used:
-n loop around each line of the input file
-l removes newlines before processing, and adds them back in afterwards
-e execute the perl code
You can use grep, as the other answers state. But you don't need grep, awk, sed, perl, cut, or any external tool. You can do it with pure bash.
Try this (semicolons are there to allow you to put it all on one line):
$ while read line;
do
if [[ "${line%%:\ *}" == "potato" ]];
then
echo ${line##*:\ };
fi;
done< file.txt
## tells bash to delete the longest match of the pattern "*: " from the front of $line.
$ while read line; do echo ${line##*:\ }; done< file.txt
1234
5678
5432
4567
5432
56789
Or, if you wanted the key rather than the value, %% tells bash to delete the longest match of the pattern ": *" from the end of $line.
$ while read line; do echo ${line%%:\ *}; done< file.txt
potato
apple
potato
grape
banana
sushi
The substring to split on is ":\ " because the space character must be escaped with the backslash.
You can find more like these at the Linux Documentation Project.
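The two expansions can be combined so that only the values for one key are printed. This is a minimal sketch in plain POSIX shell; values_for is a hypothetical helper name, and the pattern is quoted as a whole rather than escaping the space with a backslash.

```shell
# Print the value of every "key: value" line on stdin whose key equals $1.
values_for() {
  while IFS= read -r line; do
    # ${line%%: *} strips the longest suffix matching ": *", leaving the key
    if [ "${line%%: *}" = "$1" ]; then
      # ${line##*: } strips the longest prefix matching "*: ", leaving the value
      printf '%s\n' "${line##*: }"
    fi
  done
}

result=$(printf 'potato: 1234\napple: 5678\npotato: 5432\n' | values_for potato)
printf '%s\n' "$result"
```

Quoting the expansions also keeps the shell from word-splitting or globbing the values, which the unquoted echo above does not guard against.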
Modern BASH has support for regular expressions:
while read -r line; do
if [[ $line =~ ^potato:\ ([0-9]+) ]]; then
echo "${BASH_REMATCH[1]}"
fi
done < file.txt
grep potato file | grep -o "[0-9].*"

Use sed to replace every character by itself followed by $n times a char?

I'm trying to run the command below to replace every char in DECEMBER by itself followed by $n question marks. I tried both escaping {$n}, like so: \{$n\}, and leaving it as is. Yet my output just keeps being D?{$n}E?{$n}... Is it just not possible to do this with sed?
How should I go about this?
echo 'DECEMBER' > a.txt
sed -i "s%\(.\)%\1\(?\){$n}%g" a.txt
cat a.txt
This might work for you (GNU sed):
n=5
sed -E ':a;s/[^\n]/&\n/g;x;s/^/x/;/x{'"$n"'}/{z;x;y/\n/?/;b};x;ba' file
Append a newline to each non-newline character in a line $n times then replace all newlines by the intended character ?.
N.B. The newline is chosen as the initial substitute character as it is not possible for it to be within a line (sed uses newlines to separate lines) and if the final substitution character already exists within the current line, the substitutions are correct.
Range (also, interval or limiting quantifiers), like {3} / {3,} / {3,6}, are part of regex, and not replacement patterns.
You can use
sed -i "s/./&$(for i in {1..7}; do echo -n '?'; done)/g" a.txt
See the online demo:
#!/bin/bash
sed "s/./&$(for i in {1..7}; do echo -n '?'; done)/g" <<< "DECEMBER"
# => D???????E???????C???????E???????M???????B???????E???????R???????
Here, . matches any char, and & in the replacement pattern puts it back and $(for i in {1..7}; do echo -n '?'; done) adds seven question marks right after it.
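Since interval quantifiers only exist on the pattern side, the repeated text has to be built before sed sees the replacement. Here is a minimal sketch that takes the count from $n instead of hard-coding it, using only a POSIX shell loop. repeat_char is a hypothetical helper name.

```shell
# Hypothetical helper: print character $1 repeated $2 times.
repeat_char() {
  out=''
  i=0
  while [ "$i" -lt "$2" ]; do
    out="$out$1"
    i=$((i + 1))
  done
  printf '%s' "$out"
}

n=3
fill=$(repeat_char '?' "$n")
# & puts the matched character back, $fill follows it
result=$(printf 'DECEMBER\n' | sed "s/./&$fill/g")
printf '%s\n' "$result"
```

This assumes the fill character has no special meaning in a sed replacement; ? is safe, but &, \ and / are not.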
This one-liner should do the trick:
sed 's/./&'"$(printf '%*s' "$n" '' | tr ' ' '?')"'/g' a.txt
with the assumption that $n expands to a positive integer and the command is executed in a POSIX shell.
Efficiently using any awk in any shell on every Unix box after setting n=2:
$ awk -v n="$n" '
BEGIN {
new = sprintf("%*s",n,"")
gsub(/./,"?",new)
}
{
gsub(/./,"&"new)
print
}
' a.txt
D??E??C??E??M??B??E??R??
To make the changes "inplace" use GNU awk with -i inplace just like GNU sed has -i.
Caveat - if the character you want to use in the replacement text is & then you'd need to use gsub(/./,"\\\\\\&",new) in the BEGIN section to make sure it is treated as a literal instead of a backreference metachar. You'd have that issue and more (e.g. handling \1 or /) with any sed solution; any solution that uses double quotes around the script would have more issues with handling $s, and the solutions that have a shell script expanding unquoted would have even more issues with globbing chars.
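For the sed variants, the same caveat can be handled by escaping the replacement text first. This is a minimal sketch, assuming the appended text is a single line. append_literal is a hypothetical helper name.

```shell
# Append the literal text $1 after every character read from stdin.
append_literal() {
  # Escape \, & and / so sed treats them literally in the replacement.
  # | is used as the delimiter here so that / can sit inside the brackets.
  esc=$(printf '%s\n' "$1" | sed 's|[\\&/]|\\&|g')
  sed "s/./&$esc/g"
}

amp=$(printf 'AB\n' | append_literal '&')
slash=$(printf 'AB\n' | append_literal '/')
printf '%s\n' "$amp" "$slash"
```

Without the escaping step, & would be read as "the matched text" and / would end the s command early.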

How can I achieve the following in sed?

The original text is:
apr_array_pstrcat(anythingbutalwayshereincludingspaces,anythingbutalwayshereincludingspaces, ',')
I want to change it to:
apr_array_pstrcat(samethingasabove,samethingasabove, ", ")
I got the following sed command, but it is not working:
find . -type f -exec sed -i "s/apr_array_pstrcat\((.*),(.*),(.*)','\)/apr_array_pstrcat\($1,$2,$3\", \"\)/g" {} +
How can I do this? I am able to understand PCRE regex, but I am not sure about this sed one.
Issues with OP's attempts:
-E is needed to enable ERE, otherwise \( and ( need to be reversed with default BRE
$1, $2, etc should be \1, \2, etc
there should be only two capture groups as per given sample
also, g flag isn't needed if there can be only one match per line
sed -E "s/apr_array_pstrcat\((.*),(.*)','\)/apr_array_pstrcat\(\1,\2\", \"\)/g"
This can be simplified to:
sed -E "s/(apr_array_pstrcat\(.*),(.*)','\)/\1,\2\", \"\)/g"
# or this one, since using double quotes for entire expression can lead to
# conflict with shell double quote interpretation
sed -E 's/(apr_array_pstrcat\(.*),(.*)\x27,\x27\)/\1,\2", "\)/g'
This can be further simplified depending on what kind of data is present in the input:
# change ',' to ", " if a line contains apr_array_pstrcat(
sed '/apr_array_pstrcat(/ s/\x27,\x27/", "/'
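The first two points in the list of issues can be checked in isolation. This is a minimal sketch that runs the same substitution once with default BRE syntax and once with -E; the argument names arr and pool in the sample line are made up for illustration.

```shell
sample="apr_array_pstrcat(arr, pool, ',')"

# BRE (default): \( \) form a group, a bare ( is a literal parenthesis
bre=$(printf '%s\n' "$sample" | sed "s/\(apr_array_pstrcat(.*\)','/\1\", \"/")

# ERE (-E): ( ) form a group, \( is a literal parenthesis
ere=$(printf '%s\n' "$sample" | sed -E "s/(apr_array_pstrcat\(.*)','/\1\", \"/")

printf '%s\n' "$bre" "$ere"
```

Both commands refer to the group as \1 in the replacement; $1 is never interpreted by sed.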
sed has the -E flag for "use extended regular expressions in the script".
I'd also match the arguments with 'anything that's not a comma': "[^,]+"
So this works for me:
sed -E "s/(apr_array_pstrcat\([^,]+,[^,]+,) ','\)/\1 \", \")/"

How to convert a date in a specific column of a CSV file to a Unix timestamp

I have a CSV file with these columns:
"Weight","Impedance","Units","User","Timestamp","PhysiqueRating"
"58.75","5.33","kg","7","2020-7-11 19:29:29","5"
Of course, I can convert the date with the date command:
date -d '2020-7-11 19:29:29' +%s
Results:
1594488569
How can I replace this date in the CSV file from a bash script?
With GNU sed
sed -E '2,$ s/(("[^"]*",){4})("[^"]+")(.*)/echo \x27\1"\x27$(date -d \3 +%s)\x27"\4\x27/e'
2,$ to skip header from getting processed
(("[^"]*",){4}) first four columns
("[^"]+") fifth column
(.*) rest of the line
echo \x27\1"\x27 and \x27"\4\x27 preserve first four columns and rest of line after fifth column, along with adding double quotes to result of date conversion
$(date -d \3 +%s) calling shell command with fifth column value
Note that this command will fail if input can contain single quotes. That can be worked around by using s/\x27/\x27\\&\x27/g.
You can see the command that gets executed by using the -n option together with the p and e flags:
sed -nE '2,$ s/(("[^"]*",){4})("[^"]+")(.*)/echo \x27\1"\x27$(date -d \3 +%s)\x27"\4\x27/pe'
will give
echo '"58.75","5.33","kg","7","'$(date -d "2020-7-11 19:29:29" +%s)'","5"'
For 58.25,5.89, kg, 7,2020 / 7/12 11:23:46, "5" format, try
sed -E '2,$ s/(([^,]*,){4})([^,]+)(.*)/echo \x27\1\x27$(date -d "\3" +%s)\x27\4\x27/e'
or (adapted from https://stackoverflow.com/a/62862416)
awk 'BEGIN{FS=OFS=","} NR>1{$5=mktime(gensub(/[:\/]/, " ", "g", $5))} 1'
Note: For the sed solution, if the input can come from an outside source, you'll have to take care to guard against malicious input, as mentioned in the comments. One way is to match the fifth column using [0-9: -]+ or similar.
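The same whitelist idea can also be applied in plain shell before a value ever reaches date. This is a minimal sketch; is_safe_timestamp and to_epoch are hypothetical helper names, and to_epoch assumes GNU date as in the commands above.

```shell
# Accept only digits, colons, spaces and hyphens; reject everything else.
is_safe_timestamp() {
  case $1 in
    '' | *[!0-9:\ -]*) return 1 ;;  # empty, or contains some other character
    *) return 0 ;;
  esac
}

# Assumes GNU date (-d); prints nothing and fails for rejected input.
to_epoch() {
  is_safe_timestamp "$1" || return 1
  date -d "$1" +%s
}

if is_safe_timestamp '2020-7-11 19:29:29'; then echo accepted; fi
if ! is_safe_timestamp '$(reboot)'; then echo rejected; fi
```

The check only constrains which characters may appear; whether the value is a valid date is still left to date itself.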
Using GNU awk:
$ gawk '
BEGIN {
FS=OFS=","
}
{
n=split($5,a,/[-" :]/)
if(n==8)
$5="\"" mktime(sprintf("%s %s %s %s %s %s",a[2],a[3],a[4],a[5],a[6],a[7])) "\""
}1' file
Output:
"Weight","Impedance","Units","User","Timestamp","PhysiqueRating"
"58.75","5.33","kg","7","1594484969","5"
With GNU awk for gensub() and mktime():
$ awk 'BEGIN{FS=OFS="\""} NR>1{$10=mktime(gensub(/[-:]/," ","g",$10))} 1' file
"Weight","Impedance","Units","User","Timestamp","PhysiqueRating"
"58.75","5.33","kg","7","1594513769","5"

delete a column with awk or sed

I have a file with three columns. I would like to delete the 3rd column (in-place editing). How can I do this with awk or sed?
123 abc 22.3
453 abg 56.7
1236 hjg 2.3
Desired output
123 abc
453 abg
1236 hjg
try this short thing:
awk '!($3="")' file
With GNU awk for inplace editing, \s/\S, and gensub() to delete
1) the FIRST field:
awk -i inplace '{sub(/^\S+\s*/,"")}1' file
or
awk -i inplace '{$0=gensub(/^\S+\s*/,"",1)}1' file
2) the LAST field:
awk -i inplace '{sub(/\s*\S+$/,"")}1' file
or
awk -i inplace '{$0=gensub(/\s*\S+$/,"",1)}1' file
3) the Nth field where N=3:
awk -i inplace '{$0=gensub(/\s*\S+/,"",3)}1' file
Without GNU awk you need a match()+substr() combo or multiple sub()s + vars to remove a middle field. See also Print all but the first three columns.
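If keeping the original spacing does not matter, a simpler portable route is to rebuild each line from the fields you want to keep. Note that this is a different technique from the match()+substr() combo mentioned above. Here is a minimal sketch for any POSIX awk; drop_field is a hypothetical helper name.

```shell
# Print every line with field number $1 removed; remaining arguments are files.
# Fields are rejoined with single spaces, so the original spacing is lost.
drop_field() {
  n=$1
  shift
  awk -v n="$n" '{
    out = ""
    sep = ""
    for (i = 1; i <= NF; i++) {
      if (i != n) {
        out = out sep $i
        sep = " "
      }
    }
    print out
  }' "$@"
}

result=$(printf '123 abc 22.3\n453 abg 56.7\n1236 hjg 2.3\n' | drop_field 3)
printf '%s\n' "$result"
```

Unlike assigning an empty string to the field, this leaves no stray separator behind, whichever field is dropped.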
This might work for you (GNU sed):
sed -i -r 's/\S+//3' file
If you want to delete the white space before the 3rd field:
sed -i -r 's/(\s+)?\S+//3' file
It seems you could simply go with
awk '{print $1 " " $2}' file
This prints the two first fields of each line in your input file, separated with a space.
Try using cut... it's fast and easy.
First, you have repeated spaces; if you want, you can squeeze those down to a single space between columns with tr -s ' '.
If the columns are already separated by exactly one delimiter, you can use cut -d ' ' -f-2 to print fields (columns) <= 2.
For example, if your data is in a file input.txt, you can do one of the following:
cat input.txt | tr -s ' ' | cut -d ' ' -f-2
Or, if it is easier to think about the problem as removing the 3rd column, you can write the following (--complement is a GNU cut option):
cat input.txt | tr -s ' ' | cut -d ' ' --complement -f3
cut is pretty powerful: you can also extract ranges of bytes or characters, in addition to columns.
Excerpt from the man page on the syntax for specifying the list range:
Each LIST is made up of one range, or many ranges separated by commas.
Selected input is written in the same order that it is read, and is
written exactly once. Each range is one of:
N N'th byte, character or field, counted from 1
N- from N'th byte, character or field, to end of line
N-M from N'th to M'th (included) byte, character or field
-M from first to M'th (included) byte, character or field
So you could also have asked for columns 1 and 2 specifically with:
cat input.txt | tr -s ' ' | cut -d ' ' -f1,2
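The range forms from the man page excerpt, and the effect of squeezing spaces first, can be tried on a single line. This is a minimal sketch; the sample row with repeated spaces is made up to mirror the question's data.

```shell
row='123  abc   22.3'   # repeated spaces between columns (assumed sample)

# Without tr -s every single space is a delimiter, so field 2 is empty
unsqueezed=$(printf '%s\n' "$row" | cut -d ' ' -f-2)

squeezed=$(printf '%s\n' "$row" | tr -s ' ')
first_two=$(printf '%s\n' "$squeezed" | cut -d ' ' -f-2)    # -M form
from_second=$(printf '%s\n' "$squeezed" | cut -d ' ' -f2-)  # N- form
picked=$(printf '%s\n' "$squeezed" | cut -d ' ' -f1,3)      # comma-separated list

printf '[%s]\n' "$unsqueezed" "$first_two" "$from_second" "$picked"
```

The first result is 123 followed by one space, which is why the tr -s step matters for data like this.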
Try this :
awk '$3="";1' file.txt > new_file && mv new_file file.txt
or
awk '{$3="";print}' file.txt > new_file && mv new_file file.txt
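Both commands leave a trailing space on every line, and a fixed temp file name can collide with an existing file. Below is a minimal sketch that addresses both, assuming the file has exactly three columns and that mktemp is available. delete_third_column is a hypothetical helper name.

```shell
# Rewrite file $1 in place without its 3rd column.
delete_third_column() {
  tmp=$(mktemp) || return 1
  # Emptying $3 leaves a trailing separator behind; sub() trims it
  awk '{ $3 = ""; sub(/[ \t]+$/, ""); print }' "$1" > "$tmp" && mv "$tmp" "$1"
}

sample=$(mktemp)
printf '123 abc 22.3\n453 abg 56.7\n1236 hjg 2.3\n' > "$sample"
delete_third_column "$sample"
result=$(cat "$sample")
rm -f "$sample"
printf '%s\n' "$result"
```

mv only runs if awk succeeded, so a failed run leaves the original file untouched.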
Try
awk '{$3=""; print $0}'
If you're open to a Perl solution...
perl -ane 'print "$F[0] $F[1]\n"' file
These command-line options are used:
-n loop around every line of the input file, do not automatically print every line
-a autosplit mode – split input lines into the @F array. Defaults to splitting on whitespace
-e execute the following perl code