Hi, I need to perform multiple sed operations at a time and then flush the output to a file.
I have a .dat file with data as follows:
indicator.dat
Air_Ind - A.Air_Ind Air_Ind - 0000 - 00- 00
Rpting_Ind - Case When Dstbr_Id Is Null Then 'N' Else 'Y' End Rpting_Ind - 0000 - 00 - 00
Latitude,Longitude - A.Store_Latitude Latitude,A.Store_Longitude Longitude - 0000- 00- 00
coalesce(Pm_Cig_Direct_Ind,'') - Coalesce(Direct_Acct_Ind ,'') - 0004 - 01- 01
coalesce(Pm_Mst_Direct_Ind,'') - Coalesce(Direct_Acct_Ind ,'') - 0004 - 02 - 02
coalesce(Pm_Snus_Direct_Ind,'') - Coalesce(Direct_Acct_Ind ,'') - 0004 - 01 - 02
coalesce(Pm_Snuf_Direct_Ind,'') - Coalesce(Direct_Acct_Ind ,'') - 0004 - 04- 02
coalesce(Jmc_Cgr_Direct_Ind,'') - Coalesce(Direct_Acct_Ind ,'') - 2000 - 02 - 01
coalesce(Usst_Mst_Direct_Ind,'') - Coalesce(Direct_Acct_Ind ,'') - 1070- 02- 02
coalesce(Usst_Snus_Direct_Ind,'') - Coalesce(Direct_Acct_Ind ,'') - 1070 - 01 - 02
coalesce(Usst_Snuf_Direct_Ind,'') - Coalesce(Direct_Acct_Ind ,'') - 1070 - 04 - 02
Now I am trying to replace the parameters defined in the SQL file below and flush the output to a file.
indicator_s.sql:
Select A.Location_Id,
param1
From Edw.Location A
--1Left Outer Join
--1(
--1 Select
--1 Location_Id,
--1 Direct_Acct_Ind
--1 From Edw.Location_Bcbc D
--1 Where Company_Cd = 'param2'
--1 And Prod_Type_Cd = 'param3'
--1 And Prod_Catg_Cd = 'param4'
--1 ) A
--1 On L.Location_Id = A.Location_Id
Inner Join
Mim.Mdm_Xref_Distributor D
On D.Src_Dstbr_Id=A.Location_Id
Where Sdw_Exclude_Ind='N' And Dstrb_Cd='Us'
The else block is never entered at any point:
#!/bin/sh
rm ./Source_tmp.sql
touch ./Source_tmp.sql
while read line
do
MIM=`echo $line | cut -d " " -f1 `
EDW=`echo $line | cut -d "-" -f2 `
Company_Cd=`echo $line | cut -d "-" -f3 `
Prod_Type_Cd=`echo $line | cut -d "-" -f4 `
Prod_Catg_Cd=`echo $line | cut -d "-" -f5 `
echo "Select top 10 * from (" >> ./Source_tmp.sql ;
sed "s/Param1/$MIM/g" indicator.sql >> Source_tmp.sql;
echo "minus">> Source_tmp.sql;
if [ "$MIM"="Air_Ind " ] || [ "$MIM"="Rpting_Ind " ] || [ "$MIM"="Latitude,Longitude " ]
then
sed "s/param1/$EDW/g" indicator_s.sql >> Source_tmp.sql
else
sed -e "s/--1/' '/g" -e "s/param1/$EDW/g" -e "s/param2/$Company_Cd/g" -e "s/param3/$Prod_Type_Cd/g" -e "s/param4/$Prod_Catg_Cd/g" ./indicator_s.sql >> ./Source_tmp.sql
fi
done <indicator.dat
The expected output is that the param1, param2, etc. placeholders I defined are replaced with the values from the indicator.dat file, and in the else branch the commented lines are un-commented as well.
Kindly help me.
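One thing worth noting about the test in the script: in `[ "$MIM"="Air_Ind " ]` there are no spaces around `=`, so the shell passes a single non-empty word to `test`, which is always true. That alone would explain the else branch never running. A minimal sketch of the difference:

```shell
MIM="Latitude"
# No spaces around '=': test sees one non-empty string, so it always succeeds
if [ "$MIM"="Air_Ind" ]; then echo "then branch"; else echo "else branch"; fi
# prints: then branch
# Spaces around '=': a real string comparison
if [ "$MIM" = "Air_Ind" ]; then echo "then branch"; else echo "else branch"; fi
# prints: else branch
```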
As far as I can tell, the sed command "is working", but it does not produce the expected output. To give an output example:
[...]
' ' Where Company_Cd = ' 0000 '
' ' And Prod_Type_Cd = ' 00 '
' ' And Prod_Catg_Cd = ' 00'
[...]
Obviously you have extra quotes at the start of the line, and extra spaces around your values.
Will this fix your issue?
#!/bin/bash
[...]
MIM=`echo -n $line | cut -d " " -f1 `
EDW=`echo -n $line | cut -d "-" -f2 `
Company_Cd=`echo -n $line | cut -d "-" -f3 `
Prod_Type_Cd=`echo -n $line | cut -d "-" -f4 `
Prod_Catg_Cd=`echo -n $line | cut -d "-" -f5 `
# Trim spaces: as you "cut" on "-", the extra spaces around each field are kept
shopt -s extglob
EDW=${EDW%%+([[:space:]])}; EDW=${EDW##+([[:space:]])};
Company_Cd=${Company_Cd%%+([[:space:]])}; Company_Cd=${Company_Cd##+([[:space:]])};
Prod_Type_Cd=${Prod_Type_Cd%%+([[:space:]])}; Prod_Type_Cd=${Prod_Type_Cd##+([[:space:]])};
Prod_Catg_Cd=${Prod_Catg_Cd%%+([[:space:]])}; Prod_Catg_Cd=${Prod_Catg_Cd##+([[:space:]])};
[...]
# Fix "sed" in your "else" clause by removing extra single quotes
sed -e "s/--1/ /g" -e "s/param1/$EDW/g" -e "s/param2/$Company_Cd/g" -e "s/param3/$Prod_Type_Cd/g" -e "s/param4/$Prod_Catg_Cd/g" ./indicator_s.sql >> ./Source_tmp.sql
This now produces much more valid SQL:
[...]
Where Company_Cd = '0000'
And Prod_Type_Cd = '00'
And Prod_Catg_Cd = '00'
[...]
That being said, these are mostly hacks to fix (some of?) the various issues you might have in your script. The whole thing seems a little bit contrived, and fragile: for example, it will break if any replacement string contains a &. Here Be Dragons.
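To illustrate that `&` caveat with a hypothetical value: `&` in a sed replacement stands for the whole match, so replacement strings have to be escaped before being interpolated:

```shell
EDW='A & B'
# Escape the characters that are special in a sed replacement: \, & and the / delimiter
safe=$(printf '%s' "$EDW" | sed 's/[&/\]/\\&/g')
echo 'param1 here' | sed "s/param1/$safe/"
# prints: A & B here
```

Without the escaping step, the unescaped `&` would expand to the matched text `param1`.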
It's impossible to tell what you want the script to do, given that so far you've only posted a script that DOESN'T produce whatever output you want and you haven't posted the output you DO want, but let's start with this and you can update your question to show expected output and clarify your requirements:
$ cat tst.awk
BEGIN{ FS="-" }
NR==FNR { template = (template ? template ORS : "") $0; next }
{
split($0,arr,/ /)
MIM = arr[1]
EDW = $1
Company_Cd = $3
Prod_Type_Cd = $4
Prod_Catg_Cd = $5
$0 = "Select top 10 * from (\n" template
gsub(/Param1/,MIM "\nminus")
gsub(/param1/,EDW)
if ( MIM !~ /^Air_Ind|Rpting_Ind|Latitude,Longitude/ ) {
gsub(/--1/," ")
gsub(/param2/,Company_Cd)
gsub(/param3/,Prod_Type_Cd)
gsub(/param4/,Prod_Catg_Cd)
}
print
}
$ awk -f tst.awk indicator.sql indicator.dat
Select top 10 * from (
Select A.Id,
Air_Ind
minus
From Location A
--1Left Outer Join
--1(
--1 Select
--1 Location_Id,
--1 Direct_Acct_Ind
--1 From Location_Bcbc D
--1 Where Company_Cd = 'param2'
--1 And Prod_Type_Cd = 'param3'
--1 And Prod_Catg_Cd = 'param4'
--1 ) A
Select top 10 * from (
Select A.Id,
Rpting_Ind
minus
From Location A
--1Left Outer Join
--1(
--1 Select
--1 Location_Id,
--1 Direct_Acct_Ind
--1 From Location_Bcbc D
--1 Where Company_Cd = 'param2'
--1 And Prod_Type_Cd = 'param3'
--1 And Prod_Catg_Cd = 'param4'
--1 ) A
I need to generate a file.sql file from a file.csv, so I use this command :
cat file.csv |sed "s/\(.*\),\(.*\)/insert into table(value1, value2)
values\('\1','\2'\);/g" > file.sql
It works perfectly, but when the backreference numbers exceed 9 (for example \10, \11, etc.) sed takes only the first digit into account (\1 in this case) and ignores the rest.
I want to know if I missed something or if there is another way to do it.
Thank you !
EDIT :
The not working example :
My file.csv looks like
2013-04-01 04:00:52,2,37,74,40233964,3860,0,0,4878,174,3,0,0,3598,27.00,27
What I get
insert into table
val1,val2,val3,val4,val5,val6,val7,val8,val9,val10,val11,val12,val13,val14,val15,val16
values
('2013-04-01 07:39:43',
2,37,74,36526530,3877,0,0,6080,
2013-04-01 07:39:430,2013-04-01 07:39:431,
2013-04-01 07:39:432,2013-04-01 07:39:433,
2013-04-01 07:39:434,2013-04-01 07:39:435,
2013-04-01 07:39:436);
After the ninth element I get the first one instead of the 10th,11th etc...
As far as I know, sed has a limitation of supporting only 9 backreferences. The limit might have been lifted in newer versions (though I'm not sure). You are better off using perl or awk for this.
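A quick illustration of the limit (at least with GNU and BSD sed, `\10` is parsed as group 1 followed by a literal `0`, which matches the behavior you observed):

```shell
echo 'abc' | sed 's/\(a\)\(b\)\(c\)/\10/'
# prints: a0   -- "\10" means \1 then the character "0", not a tenth group
```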
Here is how you'd do it in awk:
$ cat csv
2013-04-01 04:00:52,2,37,74,40233964,3860,0,0,4878,174,3,0,0,3598,27.00,27
$ awk 'BEGIN{FS=OFS=","}{print "insert into table values (\x27"$1"\x27",$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16 ");"}' csv
insert into table values ('2013-04-01 04:00:52',2,37,74,40233964,3860,0,0,4878,174,3,0,0,3598,27.00,27);
This is how you can do it in perl:
$ perl -ple 's/([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+),([^,]+)/insert into table values (\x27$1\x27,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16);/' csv
insert into table values ('2013-04-01 04:00:52',2,37,74,40233964,3860,0,0,4878,174,3,0,0,3598,27.00,27);
Try an awk script (based on #JS웃 solution):
script.awk
#!/usr/bin/awk -f
# before looping the file
BEGIN{
FS="," # input separator
OFS=FS # output separator
q="\047" # single quote as a variable
}
# on each line (no pattern)
{
printf "insert into table values ("   # printf: no trailing newline yet
printf "%s,", q $1 q                  # quote only the first (timestamp) field
print $2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16 ");"
}
Run with
awk -f script.awk file.csv
One-liner
awk 'BEGIN{FS=","; q="\047"} {print "insert into table values (" q $1 q "," $2","$3","$4","$5","$6","$7","$8","$9","$10","$11","$12","$13","$14","$15","$16 ");"}' file.csv
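If the column count ever varies, a loop avoids hard-coding `$2` through `$16` (a sketch in the same spirit, not limited to 16 fields):

```shell
awk -F, '{
  printf "insert into table values (\047%s\047", $1   # quote only the first field
  for (i = 2; i <= NF; i++) printf ",%s", $i          # remaining fields verbatim
  print ");"
}' file.csv
```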
I have a file in stanza format. Example of the file are as below.
id_1:
id=241
pgrp=staff
groups=staff
home=/home/id_1
shell=/usr/bin/ks
id_2:
id=242
pgrp=staff
groups=staff
home=/home/id_2
shell=/usr/bin/ks
How do I use sed or awk to process it and return only the id name, id and groups in a single line and tab delimited format? e.g.:
id_1 241 staff
id_2 242 staff
with awk:
BEGIN { FS="="}
$1 ~ /id_/ { sub(/:$/, "", $1); printf("%s", $1) }
$1 ~ /id/ && $1 !~ /_/ { printf("\t%s", $2) }
$1 ~ /groups/ { printf("\t%s\n", $2) }
Here is an awk solution:
translate.awk
#!/usr/bin/awk -f
{
if(match($1, /[^=]:[ ]*$/)){
id_=$1
sub(/:/,"",id_)
}
if(match($1,/id=/)){
split($1,p,"=")
id=p[2]
}
if(match($1,/groups=/)){
split($1,p,"=")
print id_ "\t" id "\t" p[2]
}
}
Execute it either by:
chmod +x translate.awk
./translate.awk data.txt
or
awk -f translate.awk data.txt
For completeness, here comes a shortened version:
#!/usr/bin/awk -f
$1 ~ /[^=]:[ ]*$/ {sub(/:/,"",$1); printf "%s\t", $1; FS="="; next}
$1 == "id"        {printf "%s\t", $2}
$1 == "groups"    {print $2}
sed 'N;N;N;N;N;y/=\n/ /' data.txt | awk '{sub(/:$/,"",$1); print $1,$3,$7}'
Here is the one-liner approach by setting RS:
awk 'NR>1{print "id_"++i,$3,$7}' RS='id_[0-9]+:' FS='[=\n]' OFS='\t' file
id_1 241 staff
id_2 242 staff
Requires GNU awk and assumes the IDs are in increasing order starting at 1.
If the ordering of the ID's is arbitrary:
awk '!/shell/&&NR>1{gsub(/:/,"",$1);print "id_"$1,$3,$5}' RS='id_' FS='[=\n]' OFS='\t' file
id_1 241 staff
id_2 242 staff
awk -F"=" '/id_/{split($0,a,":");}/id=/{i=$2}/groups/{printf a[1]"\t"i"\t"$2"\n"}' your_file
tested below:
> cat temp
id_1:
id=241
pgrp=staff
groups=staff
home=/home/id_1
shell=/usr/bin/ks
id_2:
id=242
pgrp=staff
groups=staff
home=/home/id_2
shell=/usr/bin/ks
> awk -F"=" '/id_/{split($0,a,":");}/id=/{i=$2}/groups/{printf a[1]"\t"i"\t"$2"\n"}' temp
id_1 241 staff
id_2 242 staff
This might work for you (GNU sed):
sed -rn '/^[^ :]+:/{N;N;N;s/:.*id=(\S+).*groups=(\S+).*/\t\1\t\2/p}' file
Look for a line holding an id, then fetch the next three lines and re-arrange them into the desired output.
I have three files, each with an ID and a value.
sdt5z#fir-s:~/test$ ls
a.txt b.txt c.txt
sdt5z#fir-s:~/test$ cat a.txt
id1 1
id2 2
id3 3
sdt5z#fir-s:~/test$ cat b.txt
id1 4
id2 5
id3 6
sdt5z#fir-s:~/test$ cat c.txt
id1 7
id2 8
id3 9
I want to create a file that looks like this...
id1 1 4 7
id2 2 5 8
id3 3 6 9
...preferably using a single command.
I'm aware of the join and paste commands. Paste will duplicate the id column each time:
sdt5z#fir-s:~/test$ paste a.txt b.txt c.txt
id1 1 id1 4 id1 7
id2 2 id2 5 id2 8
id3 3 id3 6 id3 9
Join works well, but for only two files at a time:
sdt5z#fir-s:~/test$ join a.txt b.txt
id1 1 4
id2 2 5
id3 3 6
sdt5z#fir-s:~/test$ join a.txt b.txt c.txt
join: extra operand `c.txt'
Try `join --help' for more information.
I'm also aware that paste can take STDIN as one of the arguments by using "-". E.g., I can replicate the join command using:
sdt5z#fir-s:~/test$ cut -f2 b.txt | paste a.txt -
id1 1 4
id2 2 5
id3 3 6
But I'm still not sure how to modify this to accommodate three files.
Since I'm doing this inside a perl script, I know I can do something like putting this inside a foreach loop, something like join file1 file2 > tmp1, join tmp1 file3 > tmp2, etc. But this gets messy, and I would like to do this with a one-liner.
join a.txt b.txt|join - c.txt
should be sufficient
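If chaining join commands gets unwieldy, a hypothetical awk alternative can merge any number of files in one pass (assuming, as in the sample data, that every ID appears in every file):

```shell
awk '!($1 in v) { order[++n] = $1 }   # remember IDs in first-seen order
     { v[$1] = v[$1] " " $2 }         # append this file'\''s value for the ID
     END { for (i = 1; i <= n; i++) print order[i] v[order[i]] }' a.txt b.txt c.txt
```

With the sample files this should print `id1 1 4 7` and so on, one line per ID.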
Since you're doing it inside a Perl script, is there any specific reason you're NOT doing the work in Perl as opposed to spawning a shell?
Something like (NOT TESTED! caveat emptor):
use File::Slurp; # Slurp the files in if they aren't too big
my @files = qw(a.txt b.txt c.txt);
my %file_data = map { $_ => [ read_file($_) ] } @files;
my @id_orders;
my %data = ();
my $first_file = 1;
foreach my $file (@files) {
    foreach my $line (@{ $file_data{$file} }) {
        my ($id, $value) = split(/\s+/, $line);
        push @id_orders, $id if $first_file;
        $data{$id} ||= [];
        push @{ $data{$id} }, $value;
    }
    $first_file = 0;
}
foreach my $id (@id_orders) {
    print "$id " . join(" ", @{ $data{$id} }) . "\n";
}
perl -lanE'$h{$F[0]} .= " $F[1]"; END{say $_.$h{$_} foreach keys %h}' *.txt
Should work; I can't test it as I'm answering from my mobile. You could also sort the output if you put a sort between foreach and keys.
pr -m -t -s\ file1.txt file2.txt|gawk '{print $1"\t"$2"\t"$3"\t"$4}'> finalfile.txt
Here file1 and file2 have two columns each: $1 and $2 are the columns from file1, and $3 and $4 are the columns from file2.
You can print any column from each file this way, and it will take any number of files as input. If file1 has 5 columns, for example, then $6 will be the first column of file2.
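Applied to the three sample files from the question (a.txt, b.txt, c.txt), that would look something like:

```shell
# pr merges the files side by side; awk then drops the duplicated ID columns
pr -m -t -s' ' a.txt b.txt c.txt | awk '{print $1,$2,$4,$6}'
```

With the sample data this should yield the desired `id1 1 4 7` layout.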
I have some data (separated by semicolons) with close to 240 rows in a text file, temp1.
temp2.txt stores 204 rows of data (separated by semicolons).
I want to:
Sort the data in both files by field1, i.e. the first data field in every row.
Compare the data in both files and redirect the rows that are not equal into separate files.
Sample data:
temp1.txt
1000xyz400100xyzA00680xyz0;19722.83;19565.7;157.13;11;2.74;11.00
1000xyz400100xyzA00682xyz0;7210.68;4111.53;3099.15;216.95;1.21;216.94
1000xyz430200xyzA00651xyz0;146.70;0.00;0.00;0.00;0.00;0.00
temp2.txt
1000xyz400100xyzA00680xyz0;19722.83;19565.7;157.13;11;2.74;11.00
1000xyz400100xyzA00682xyz0;7210.68;4111.53;3099.15;216.95;1.21;216.94
The sort command I'm using:
sort -k1,1 temp1 -o temp1.tmp
sort -k1,1 temp2 -o temp2.tmp
I'd appreciate if someone could show me how to redirect only the missing/mis-matching rows into two separate files for analysis.
Try
cat temp1 temp2 | sort -k1,1 -o tmp
# mis-matching/missing rows:
uniq -u tmp
# matching rows:
uniq -d tmp
You want the difference as described at http://www.pixelbeat.org/cmdline.html#sets
sort -t';' -k1,1 temp1 temp1 temp2 | uniq -u > only_in_temp2
sort -t';' -k1,1 temp1 temp2 temp2 | uniq -u > only_in_temp1
Notes:
Use join rather than uniq, as shown at the link above if you want to compare only particular fields
If the first field is fixed width then you don't need the -t';' -k1,1 params above
Look at the comm command.
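For completeness, a sketch with comm (both inputs must be sorted; `-23` keeps lines unique to the first file, `-13` lines unique to the second):

```shell
sort temp1 -o temp1.srt
sort temp2 -o temp2.srt
comm -23 temp1.srt temp2.srt > only_in_temp1   # rows only in temp1
comm -13 temp1.srt temp2.srt > only_in_temp2   # rows only in temp2
```

Note that comm compares whole lines, so rows sharing field1 but differing elsewhere land in both output files.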
Using gawk, outputting lines in file1 that are not in file2:
awk -F";" 'FNR==NR{ a[$1]=$0;next }
( ! ( $1 in a) ) { print $0 > "afile.txt" }' file2 file1
Interchange the order of file2 and file1 to output lines in file2 that are not in file1.
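That interchange, spelled out (writing the lines unique to file2 into a second output file, hypothetically named bfile.txt):

```shell
awk -F";" 'FNR==NR{ a[$1]=$0; next }       # first pass: index file1 by field1
( ! ( $1 in a) ) { print $0 > "bfile.txt" }' file1 file2
```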