I have a file which contains each user's userid and password. I need to fetch the userid and password from that file by passing the userid as a search element to an awk command.
user101,smith,smith#123
user102,jones,passj#007
user103,albert,albpass#01
I am using an awk command inside my Perl script like this:
...
...
my $userid = $ARGV[0];
my $user_report_file = "report_file.txt";
my $data = `awk -F, '$1 ~ /$userid/ {print $2, $3}' $user_report_file`;
my ($user,$pw) = split(" ",$data);
...
...
Here I am getting the error:
awk: ~ /user101/ {print , }
awk: ^ syntax error
But if I run the same command in a terminal window, it gives the expected result:
$] awk -F, '$1 ~ /user101/ {print $2, $3}' report_file.txt
smith smith#123
What could be the issue here?
The backticks are a double-quoted context, so you need to escape any literal $ that you want awk to interpret.
my $data = `awk -F, '\$1 ~ /$userid/ {print \$2, \$3}' $user_report_file`;
If you don't do that, you're interpolating Perl's capture variables from the last successful match (most likely empty strings), which is exactly why awk sees ~ /user101/ {print , } and complains about a syntax error.
When I have these sorts of problems, I try the command as a string first to see if it is what I expect:
my $data = "awk -F, '\$1 ~ /$userid/ {print \$2, \$3}' $user_report_file";
say $data;
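With $userid set to user101, that should print the command exactly as the shell will see it:
awk -F, '$1 ~ /user101/ {print $2, $3}' report_file.txt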
Here's the Perl equivalent of that command:
$ perl -aF, -e '$F[0]=~/101/ && print "@F[1,2]"' report_file
But, this is something you probably want to do in Perl instead of creating another process:
Interpolating data into external commands can go wrong, such as a filename that is foo.txt; rm -rf /.
The awk you run is the first one in the path, so someone can make that a completely different program (so use the full path, like /usr/bin/awk).
Taint checking can tell you when you are passing unsanitized data to the shell.
Inside a program you don't get all the shortcuts. But if this is the part of your program that is slow, you probably want to rethink how you are accessing this data, because scanning the entire file with any tool isn't going to be that fast:
open my $fh, '<', $user_report_file or die;
while ( <$fh> ) {
    chomp;
    my @F = split /,/;
    next unless $F[0] =~ /\Q$userid/;
    print "@F[1,2]";
    last; # if you only want the first one
}
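If you look up more than one userid per run, another option (just a sketch, assuming userids are unique in the file) is to read the file once into a hash keyed by userid and do the lookups from memory:

my %users;
open my $fh, '<', $user_report_file or die "Can't open $user_report_file: $!";
while ( <$fh> ) {
    chomp;
    my ( $id, $user, $pw ) = split /,/;
    $users{$id} = [ $user, $pw ];
}
close $fh;

# later, as often as you like, without rescanning the file
if ( my $record = $users{$userid} ) {
    my ( $user, $pw ) = @{ $record };
    print "$user $pw\n";
}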
Say I have a file like so:
+jaklfjdskalfjkdsaj
fkldsjafkljdkaljfsd
-jslakflkdsalfkdls;
+sdjafkdjsakfjdskal
I only want to find and count the number of times in this file that a line starting with - is immediately followed by a line starting with +.
Rules:
No external scripts
Must be done from within a bash script
Must be inline
I could figure out how to do this in a Python script, for instance, but I've never had to do something this extensive in Bash.
Could anyone help me out? I figure it'll end up being grep, perl, or maybe a talented sed line -- but these are things I'm still learning.
Thank you all!
grep -A1 "^-" $file | grep "^+" | wc -l
The first grep finds all of the lines starting with -, and the -A1 causes it to also output the line immediately after each match.
We then grep that output for any lines starting with +. Logically:
We know the output of the first grep is only the -XXX lines and the following lines
We know that a +xxx line cannot also be a -xxx line
Therefore, any +xxx lines must be following lines, and should be counted, which we do with wc -l
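Assuming the sample above is saved as file, that pipeline counts the single -/+ pair:
$ grep -A1 "^-" file | grep "^+" | wc -l
1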
Easy in Perl:
perl -lne '$c++ if $p and /^\+/; $p = /^-/ }{ print $c' FILE
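The }{ closes the implicit while loop that -n wraps around the code, so print $c runs once after all the input has been read. Written out as an ordinary script, the same logic is roughly:

#!/usr/bin/perl
use strict;
use warnings;

my ( $count, $prev_was_minus ) = ( 0, 0 );
while ( my $line = <> ) {
    $count++ if $prev_was_minus and $line =~ /^\+/;   # previous line was "-", this one is "+"
    $prev_was_minus = $line =~ /^-/ ? 1 : 0;          # remember whether this line starts with "-"
}
print "$count\n";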
awk one-liner:
awk -v FS='' '{x=x sprintf("%s", $1)}END{print gsub(/-\+/,"",x)}' file
e.g.
kent$ cat file
+jaklfjdskalfjkdsaj
fkldsjafkljdkaljfsd
-jslakflkdsalfkdls;
+sdjafkdjsakfjdskal
-
-
-
+
-
+
foo
+
kent$ awk -v FS='' '{x=x sprintf("%s", $1)}END{print gsub(/-\+/,"",x)}' file
3
Another Perl example. Not as terse as choroba's, but more transparent in how it works:
perl -e'while (<>) { $last = $cur; $cur = $_; print $last, $cur if substr($last, 0, 1) eq "-" && substr($cur, 0, 1) eq "+" }' < infile
Output:
-jslakflkdsalfkdls;
+sdjafkdjsakfjdskal
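If you want the count rather than the matching lines themselves, the same substr comparison works with a counter (a sketch):
perl -ne '$c++ if defined $last && substr($last, 0, 1) eq "-" && substr($_, 0, 1) eq "+"; $last = $_; END { print $c + 0, "\n" }' < infile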
Pure bash:
unset c p
while read line ; do
    # p is the exit status of the previous test: 0 means the previous line started with "-"
    [[ $line == +* && $p == 0 ]] && (( c++ ))
    [[ $line == -* ]]
    p=$?
done < FILE
echo $c
I have a text file with the following content:
L,4m,06/03/2013
L,33GJm,06/03/2013,G
L,44Bm,06/03/2013,B
L,4q,08/03/2013
J,4m,04/03/2013
J,3GU,04/03/2013,G
J,3jm,04/03/2013
J,3GJ,04/03/2013,G
J,44Bm,06/03/2013,B
J,34Bq,08/03/2013,B
M,4v,12/03/2013
D,3GU,12/03/2013,G
D,4B,11/03/2013,B
D,4m,12/03/2013
D,3GJ,13/03/2013,G
D,3GU,13/03/2013,G
D,4B,14/03/2013,B
D,4B,14/03/2013,B
D,34Bm,14/03/2013,B
L,33BUq,11/03/2013,B
L,3BJUq,11/03/2013,B
L,44Bq,14/03/2013,B
L,44Bq,14/03/2013,B
L,3Bq,15/03/2013,B
L,3q,15/03/2013
J,34Bjq,11/03/2013,B
J,33GUm,12/03/2013,G
J,4q,13/03/2013
J,33GUq,13/03/2013,G
J,33GUq,13/03/2013,G
J,4q,13/03/2013
M,3BU,18/03/2013,B
M,4B,18/03/2013,B
M,4B,18/03/2013,B
M,3GJ,19/03/2013,G
M,3GJ,19/03/2013,G
D,4B,22/03/2013,B
D,3BU,22/03/2013,B
L,34Bv,18/03/2013,B
L,3jm,19/03/2013
L,4m,19/03/2013
L,33GJm,19/03/2013,G
L,33GUm,19/03/2013,G
J,33BUm,18/03/2013,B
J,4m,18/03/2013
J,4B,18/03/2013,B
J,33BUm,18/03/2013,B
J,4q,22/03/2013
J,4q,22/03/2013
A,3GJ,28/03/2013,G
M,4B,27/03/2013,B
D,4B,25/03/2013,B
L,44Bq,25/03/2013,B
L,34Bq,25/03/2013,B
L,34Bq,25/03/2013,B
L,33BUa,26/03/2013,B
L,33BUq,26/03/2013,B
L,33BUq,26/03/2013,B
L,34Bq,27/03/2013,B
L,34Bq,27/03/2013,B
L,4B,27/03/2013,B
L,34Bq,27/03/2013,B
L,4a,28/03/2013
I want to translate the second column based on the following coding system.
If $2 starts with a 1 or 2 - Change $2 to Excellent
If $2 contains 3BU or 3GU - Change $2 to Good
If $2 contains 3BJ or 3GJ - Change $2 to OK
If $2 starts with a 4 - Change $2 to Poor
If $2 starts with a 5 - Change $2 to Terrible
I can find and change the 3BUs to Good easily enough using the following command:
awk 'BEGIN{FS=",";OFS=","} {if ($2~ /3(B|G)U/)print $1,"Good",$3}' file | sponge file
Though I lose all the other non-3(B|G)U lines. I could use if/else logic, though this seems inelegant. I have tried to use gensub to solve the problem:
awk -F, '{gensub(/3(B|G)U/,Good,"",2)}1' file
But this prints the file contents without substitution. Any hints?
Desired output
L,Poor,06/03/2013
L,Ok,06/03/2013,G
L,Poor,06/03/2013,B
L,Poor,08/03/2013
J,Poor,04/03/2013
J,Good,04/03/2013,G
A perl or sed one-liner would also be helpful, as this code forms part of a bash shell script.
If you want to stick with shell:
(
    IFS=,
    while read -ra f; do   # pick more appropriate variable names
        case ${f[1]} in
            [12]*)    f[1]=Excellent ;;
            *3[BG]U*) f[1]=Good ;;
            *3[BG]J*) f[1]=OK ;;
            4*)       f[1]=Poor ;;
            5*)       f[1]=Terrible ;;
        esac
        echo "${f[*]}"
    done < file
) > tmp && mv tmp file
I ran that in a subshell to localize the change to $IFS.
A sed solution, too:
sed -e 's/\(^.,\)\(1\|2\)[^,]*/\1Excellent/g' \
    -e 's/\(^.,\)3[BG]U[^,]*/\1Good/g' \
    -e 's/\(^.,\)3[BG]J[^,]*/\1OK/g' \
    -e 's/\(^.,\)4[^,]*/\1Poor/g' \
    -e 's/\(^.,\)5[^,]*/\1Terrible/g' <filename>
$ awk '
BEGIN { FS=OFS="," }
$2 ~ /^(1|2)/ { $2 = "Excellent" }
$2 ~ /3(B|G)U/ { $2 = "Good" }
$2 ~ /3(B|G)J/ { $2 = "OK" }
$2 ~ /^4/ { $2 = "Poor" }
$2 ~ /^5/ { $2 = "Terrible" }
1
' foo.txt | head -n 10
L,Poor,06/03/2013
L,OK,06/03/2013,G
L,Poor,06/03/2013,B
L,Poor,08/03/2013
J,Poor,04/03/2013
J,Good,04/03/2013,G
J,3jm,04/03/2013
J,OK,04/03/2013,G
J,Poor,06/03/2013,B
J,34Bq,08/03/2013,B
perl -pe 's{,(\w+)}{ $_ = /^[12]/ ?"Excellent" :/3[BG]U/ ?"Good" :/3[BG]J/ ?"OK" :/^4/ ?"Poor" :/^5/ ?"Terrible" :$_ for $v=$1; ",$v" }e'
A more readable version:
s{,(\w+)}{
for ($v = $1) {
$_ = /^[12]/ ?"Excellent"
:/3[BG]U/ ?"Good"
:/3[BG]J/ ?"OK"
:/^4/ ?"Poor"
:/^5/ ?"Terrible"
:$_;
}
",$v";
}e;
In awk I can write: awk -F: 'BEGIN {OFS = FS} ...'
In Perl, what's the equivalent of FS? I'd like to write
perl -F: -lane 'BEGIN {$, = [what?]} ...'
update with an example:
echo a:b:c:d | awk -F: 'BEGIN {OFS = FS} {$2 = 42; print}'
echo a:b:c:d | perl -F: -ane 'BEGIN {$, = ":"} $F[1] = 42; print @F'
Both output a:42:c:d
I would prefer not to hard-code the : in the Perl BEGIN block, but refer to wherever the -F option saves its argument.
To sum up, what I'm looking for does not exist:
there's no variable that holds the argument for -F, and more importantly
Perl's "FS" is fundamentally a different data type (regular expression) than the "OFS" (string) -- it does not make sense to join a list of strings using a regex.
Note that the same holds true in awk: FS is a string but acts as a regex:
echo a:b,c:d | awk -F'[:,]' 'BEGIN {OFS=FS} {$2=42; print}'
outputs "a[:,]42[:,]c[:,]d"
Thanks for the insight and workarounds though.
You can use perl's -s (similar to awk's -v) to pass a "FS" variable, but the split becomes manual:
echo a:b:c:d | perl -sne '
    BEGIN {$, = $FS}
    @F = split $FS;
    $F[1] = 42;
    print @F;
' -- -FS=":"
If you know the exact length of input, you could do this:
echo a:b:c:d | perl -F'(:)' -ane '$, = $F[1]; @F = @F[0,2,4,6]; $F[1] = 42; print @F'
If the input is of variable lengths, you'll need something more sophisticated than @F[0,2,4,6].
EDIT: -F seems to simply provide input to an automatic split() call, which takes a complete RE as an expression. You may be able to find something more suitable by reading the perldoc entries for split, perlre, and perlvar.
You can sort of cheat it, because perl is actually using the split function with your -F argument, and you can tell split to preserve what it splits on by including capturing parens in the regex:
$ echo a:b:c:d | perl -F'(:)' -ane 'print join("/", @F);'
a/:/b/:/c/:/d
You can see what perl's doing with some of these "magic" command-line arguments by using -MO=Deparse, like this:
$ perl -MO=Deparse -F'(:)' -ane 'print join("/", @F);'
LINE: while (defined($_ = <ARGV>)) {
    our(@F) = split(/(:)/, $_, 0);
    print join('/', @F);
}
-e syntax OK
You'd have to change your @F subscripts to double what they'd normally be ($F[2] = 42).
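For the a:b:c:d example from the question, that looks something like:
$ echo a:b:c:d | perl -F'(:)' -ane '$F[2] = 42; print @F'
a:42:c:d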
Darnit...
The best I can do is:
echo a:b:c:d | perl -ne '$v=":";@F = split("$v"); $F[1] = 42; print join("$v", @F) . "\n";'
You don't need the -F: this way, and you're only stating the colon once. I was hoping there was some way of setting variables on the command line like you can with Awk's -v switch.
For one liners, Perl is usually not as clean as Awk, but I remember using Awk before I knew of Perl and writing 1000+ line Awk scripts.
Trying things like this made people think Awk was either named after the sound someone made when they tried to decipher such a script, or stood for AWKward.
There is no input field separator variable in Perl. You're basically emulating awk by using the -a and -F flags. If you really don't want to hard-code the value, then why not just use an environment variable?
$ export SPLIT=":"
$ perl -F$SPLIT -lane 'BEGIN { $, = $ENV{SPLIT}; } ...'
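Filling in the ... with the a:b:c:d example from the question, that would be something like:
$ echo a:b:c:d | perl -F$SPLIT -lane 'BEGIN { $, = $ENV{SPLIT}; } $F[1] = 42; print @F'
a:42:c:d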