sh: can't return one result after comparing 2 files

As an example I will use different inputs, to keep my files private and to avoid long text; they have the following form:
INPUT1.cfg:
TC # aa # D317
TC # bb # D314
TC # cc # D315
TC # dd # D316
INPUT2.cfg:
BL;nn;3
LY;ww;3
LO;xx;3
TC;vv;3
TC;dd;3
OD;pp;3
TC;aa;3
What I want to do is iterate over the name (column 2) in the rows of INPUT1 and compare it with the name (column 2) in the rows of INPUT2; if they match, the matching line of INPUT2 goes to an output file, otherwise the script should report that the table was not found. Here is my attempt:
#!/bin/bash
input1="input1.cfg";
input2="input2.cfg"
cat $input1|while read line
do
TableNameIN=`echo $line|cut -d"#" -f2`
cat $input2| while read line
do
TableNameOUT=`echo $line|cut -d";" -f2`
if echo "$TableNameOUT" | grep -q $TableNameIN;
then echo "$line" >> output.txt
else
echo "Table $TableNameIN non trouvé"
fi
done
done
This is what I get as a result:
Table bb not found
Table bb not found
Table bb not found
Table cc not found
Table cc not found
Table cc not found
I managed to write out the lines that are equal, but the problem with my code is that it outputs "table not found" for each row, whereas I want to write it only once, at the end of the comparison of all the lines.
Here is the output I want to get:
Table bb not found
Table cc not found
Can anyone help me with this? PS: I don't want to use awk, because this is just one part of my code and I already use sh.

Assumptions:
for file input2.cfg the 2nd column (table name) is unique
input2.cfg is not so large that we run the risk of using up all memory when storing input2.cfg in an associative array (otherwise we could store the table names from input1.cfg - assuming this is the smaller file - in the array and swap the processing order of the two files)
there are no explicit requirements for data to be sorted (otherwise we may need to add a sort or two)
a bash solution is sufficient (based on inclusion of the #!/bin/bash shebang in the OP's current code)
There are many ways to slice-n-dice this one (awk being my preference, but the OP doesn't want to use awk). For this particular answer I'll pull the awk steps out into separate bash commands.
NOTE: While we could use a set of nested loops (as in the OP's code), I've opted to use an associative array to store input2.cfg, thus eliminating the need to repeatedly scan input2.cfg.
#!/usr/bin/bash
input1=input1.cfg
input2=input2.cfg
> output.txt # clear out the target file
# load ${input2} into an associative array
unset lines
typeset -A lines # associative array for storing contents of ${input2}
while read -r line
do
x="${line%;*}" # use parameter expansion
tabname="${x#*;}" # to parse out table name
lines["${tabname}"]="${line}" # add to array
done < "${input2}"
# process ${input1}
while read -r c1 c2 tabname rest_of_line
do
[[ -v lines["${tabname}"] ]] && # if tabname has an entry in our array (the -v array test needs bash 4.3+)
echo "${lines[${tabname}]}" >> output.txt && # then dump the associated line (from ${input2}) to output.txt
continue # process next line from ${input1}
echo "Table ${tabname} not found" # otherwise print 'not found' message
done < "${input1}"
# display contents of output.txt
echo "++++++++++++++++ output.txt"
cat output.txt
echo "++++++++++++++++"
This generates the following:
Table bb not found
Table cc not found
++++++++++++++++ output.txt
TC;aa;3
TC;dd;3
++++++++++++++++
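For completeness, the OP's nested-loop idea can also be fixed without the associative array by letting grep do the scan of input2.cfg, so the "not found" message prints only once per table name. A minimal sketch (assuming the same ' # ' and ';' separators shown above; note it re-reads input2.cfg for every row of input1.cfg, which the array version avoids):
#!/bin/bash
input1=input1.cfg
input2=input2.cfg
> output.txt
while read -r _ _ tabname _ # fields of input1: "TC", "#", "aa", "# D317"
do
    # grep appends the matching input2 line (if any) to output.txt and
    # returns a non-zero status when the table name is absent
    grep ";${tabname};" "${input2}" >> output.txt ||
        echo "Table ${tabname} not found"
done < "${input1}"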

Related

Filtering tshark output for .csv. Preventing errors from missing fields

I am trying to filter a pcap file in tshark with a Lua script and ultimately output it to a .csv. I am most of the way there, but I am still running into a few issues.
This is what I have so far:
tshark -n -V -X lua_script:wireshark_dissector.lua -r myfile.pcap -T fields -e frame.time_epoch -e Something_UDP.field1 -e Something_UDP.field2 -e Something_UDP.field3 -e Something_UDP.field4 -e Something_UDP.field5 -e Something_UDP.field6 -e Something_UDP.field15 -e Something_UDP.field16 -e Something_UDP.field18 -e Something_UDP.field22 -E separator=,
Here is an example of what the frames look like, sort of.
frame 1
time: 1626806198.437893000
Something_UDP.field1: 0
Something_UDP.field2: 1
Something_UDP.field3: 1
Something_UDP.field5: 1
Something_UDP.field6: 1
frame 2
time: 1626806198.439970000
Something_UDP.field8: 1
Something_UDP.field9: 0
Something_UDP.field13: 0
Something_UDP.field14: 0
frame 3
time: 1626806198.440052000
Something_UDP.field15: 1
Something_UDP.field16: 0
Something_UDP.field18: 1
Something_UDP.field19: 1
Something_UDP.field20: 1
Something_UDP.field22: 0
Something_UDP.field24: 0
The output I am looking for would be
1626806198.437893000,0,1,1,,1,1,1,,,,,
1626806198.440052000,,,,,,,,,1,0,,1,1,1,,0,0,,,,
That is, if the frame contains one of the fields I am looking for, it will output its value followed by a comma, but if that field isn't there it will output just a comma. One issue is that not every frame contains info that I am interested in, and I don't want those frames to be output. Part of the issue is that one of the fields I need is epoch time, which is in every frame, but it is only important if the other fields are there. I could use awk or grep to do this, but I am wondering if it can all be done inside tshark. The other issue is that the fields being requested will come from a text file, and there may be fields in the text file that don't actually exist in the pcap file; if that happens I get a "tshark: Some fields aren't valid:" error.
In short I have 2 issues.
1: I need to print data only if the field names match, but not if the only match is epoch time.
2: I need it to work even if one of the fields being requested doesn't exist.
I need to print data only if the field names match but not if the only match is epoch.
Try using a display filter that mentions all the field names in which you're interested, with an "or" separating them, such as
-Y "Something_UDP.field1 or Something_UDP.field2 or Something_UDP.field3 or Something_UDP.field4 or Something_UDP.field5 or Something_UDP.field6 or Something_UDP.field15 or Something_UDP.field16 or Something_UDP.field18 or Something_UDP.field22"
so that only packets containing at least one of those fields will be processed.
I need it to work even if one of the fields being requested doesn't exist.
Then you will need to construct the command line on the fly, avoiding field names that aren't valid.
One way, in a script, to test whether a field is valid is to use the dftest command:
dftest Something_UDP.field1 >/dev/null 2>&1
will exit with a status of 0 if there's a field named "Something_UDP.field1" and will exit with a status of 2 if there isn't; if the scripting language you're using can check the exit status of a command to see if it succeeds, you can use that.
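Putting the two answers together, a bash sketch along these lines could build the tshark command on the fly (fields.txt, holding the requested field names one per line, and myfile.pcap are assumptions based on the question):
#!/bin/bash
fields=()   # -e options for the fields that survive validation
filter=""   # display filter OR-ing those same fields
while read -r f; do
    if dftest "$f" >/dev/null 2>&1; then # keep only fields tshark knows about
        fields+=(-e "$f")
        filter="${filter:+$filter or }$f"
    fi
done < fields.txt
tshark -n -r myfile.pcap -Y "$filter" -T fields \
    -e frame.time_epoch "${fields[@]}" -E separator=,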

How to use fishshell to add numbers to files

I have a very simple mp3 player, and the order in which it plays audio files is based on the file names; the rule is that there must be a 3-digit number at the beginning of each file name, such as:
001file.mp3
002file.mp3
003file.mp3
I want to write a fish shell function sortmp3 to add numbers to the files of a directory. Say directory myfiles contains files:
aaa.mp3
bbb.mp3
ccc.mp3
When I run sortmp3 myfiles, the file names will be changed to:
001aaa.mp3
002bbb.mp3
003ccc.mp3
But my question is:
how to generate some sequential numbers?
how to make sure the size of each number is exactly 3 digits?
I would write this, which makes no assumptions about how many files there are in a directory:
function sortmp3
set -l files *
set -l i
for i in (seq (count $files))
echo mv $files[$i] (printf "%03d%s" $i $files[$i])
end
end
Remove the "echo" if you like how it works.
You can generate sequential numbers with the seq tool - an external program.
This will only take care of the first part, it won't pad to three characters.
To do that, there's a variety of choices:
printf '%s\n' 00(seq 0 99) | rev | cut -c 1-3 | rev
printf '%s\n' 00(seq 0 99) | sed 's/^.*\(...\)$/\1/'
The 00(seq 0 99) part will generate the numbers "0" to "99" with two zeroes prepended - i.e. "000" to "0099". The later parts of the pipeline keep only the last three characters of each, removing the superfluous zeroes again.
Or with the next fish version, you can use the new string tool:
string sub -s -3 -- 00(seq 0 99)
Depending on your specific situation you should use the "seq" command to generate sequential numbers or the "math" command to increment a counter. To format the number with a predictable number of leading zeros use the "printf" command:
set idx 12
printf '%03d' $idx # prints 012

How to read files with numerical names in ascending order

I have several (15) files with names: file1.out, file2.out, file3.out, ..., file15.out. I am reading each file and doing some calculation. Here is a sample.
for file in file*.out; do
echo $file
done
But in this way the files are read in the order file1.out, file10.out, ..., file15.out, file2.out, ..., file9.out. Is there any way to read these files in ascending order, i.e. file1.out, then file2.out, and so on?
Since you know the number of files you have, you can use an integer for loop
for i in $(seq 1 15); do
echo "file$i.out"
done
For full POSIX compliance (seq is not a standard utility), use a while loop and an explicit counter
i=1
while [ "$i" -le 15 ]; do
echo "file$i.out"
i=$((i+1))
done
Rename your files
If you have fewer than 100 files, you can use the following notation:
file1.out => file01.out
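For example, a one-off bash loop could do that renaming (a sketch, assuming single-digit names like file1.out ... file9.out as in the question):
for f in file?.out; do       # matches only the single-digit names
    mv "$f" "file0${f#file}" # file1.out -> file01.out
done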
Change your sort algorithm
i.e. Use ls -v instead of file*.out
for i in `ls -v file*.out`; do
echo "$i";
done;
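If ls -v is not available, GNU sort -V gives the same version ordering (a sketch; sort -V is a GNU extension):
for i in $(printf '%s\n' file*.out | sort -V); do
    echo "$i"
done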

Optimal way of writing to a file after DB Query and then using this file as the data file for BCP in

The requirement is to copy a table, say account, in server A to a table account_two in server B.
There are many tables like this each having thousands of rows.
I want to try BCP for it. The problem is that account_two might have fewer columns than account.
I understand in such scenarios I can either use a format file or a temp table.
The issue is that I do not own the Server A tables, and in case someone changes the order or the number of columns, bcp will fail.
In Sybase, queryout is not working.
The only option left is doing a select A, B from account on server A, then writing this output to a file and using this file as the data file for BCP IN.
However, since it is huge data I am not able to find a convenient way of doing this.
while (my $row = $isth->fetchrow_arrayref) {
    print FILE join("\t", @$row), "\n";
}
But performance will take a hit with this approach.
I cannot use dump_results() or Data::Dumper; it would be an additional task to bring thousands of lines of data into bcp data file format.
I would appreciate it if someone could help me decide the best approach.
PS: I am new to Perl. Sorry if there is an obvious answer to this.
#!/usr/local/bin/perl
use strict;
use warnings;
use Sybase::BCP;

my ($user, $passwd) = ('username', 'password'); # placeholder credentials

my $bcp = new Sybase::BCP $user, $passwd;
$bcp->config(INPUT => 'foo.bcp',
             OUTPUT => 'mydb.dbo.bar',
             SEPARATOR => '|');
$bcp->run;
You should record the column names as well, so that later you can check whether the order has changed. There is no bcp option to retrieve column names, so you have to get that information and store it separately.
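One way to capture the column names is a quick shell sketch (assuming Sybase's isql client, placeholder credentials, and the account table from the question; the output may need minor trimming of headers):
isql -U "$user" -P "$passwd" -S "$server" > account.columns <<'EOF'
select name from syscolumns where id = object_id('account') order by colid
go
EOF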
If you need to reorder them, then:
$bcp->config(...
             REORDER => { 1 => 2,
                          3 => 1,
                          2 => 'foobar',
                          12 => 4 },
             ...);
Non-Perl solution:
-- Create the headers file
sqlcmd -Q"SET NOCOUNT ON SELECT 'col1','col2'" -Syour_server -dtempdb -E -W -h-1 -s" " >c:\temp\headers.txt
-- Output data
bcp "SELECT i.col1, col2 FROM x" queryout c:\temp\temp.txt -Syour_server -T -c
-- Combine the files using DOS copy command. NB switches: /B - binary; avoids appending invalid EOF character 26 to end of file.
copy c:\temp\headers.txt + c:\temp\temp.txt c:\temp\output.txt /B

Bitwise comparison of two directories (files) in Perl

I am trying to achieve the following using Perl:
A script that performs bitwise comparison of files from two directories
(the directory names are passed as arguments to the script in the command line).
The script should read all files from the first directory and all subdirectories, and
compare them to the corresponding files (e.g. files with the same names) in the
second directory.
The result of the script - (PASSED or FAILED) is formed according to:
The result is FAILED when at least one file from the first directory is not bitwise
equal to the corresponding file in the second directory or the second directory
has no corresponding file.
Otherwise test is PASSED.
So far I have tried the approach in this thread created by me: Comparing two directories using Perl. At some point I realized I am essentially trying to simulate "diff -r dir1 dir2", which isn't the goal. How can one perform a bitwise comparison of two directories?
EDIT: Test Case
/dir1 /dir2
-- file1 -- file1
-- file2 -- file2
-- file3
-- ....
-- ...
---/subDir1
--file1
--file2
file1 of dir1 contains: foo bar
file1 of dir2 contains: foo
Result - Fail
file1 of dir1 contains: foo bar
file1 of dir2 contains: foo bar
Result - Pass.
The script should essentially extract files with same names present in different directories.
I would do something like this:
Open dir1
Read all filenames into an array
Open dir2
Read all filenames into an array
For any case in which a filename in dir1 matches a filename in dir2 or vice versa, begin compare logic
Use Digest::MD5 here to perform an MD5 comparison of the two files. If even one bit is off, you will get different checksums.
Code example from the Digest::MD5 documentation:
use Digest::MD5 qw(md5 md5_hex md5_base64);
$digest = md5($data);
$digest = md5_hex($data);
$digest = md5_base64($data);
# OO style
use Digest::MD5;
$ctx = Digest::MD5->new;
$ctx->add($data);
$ctx->addfile(*FILE);
$digest = $ctx->digest;
$digest = $ctx->hexdigest;
$digest = $ctx->b64digest;
Generate an MD5 hash for each file and compare them, then pass or fail accordingly.
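Before writing the Perl version, the PASSED/FAILED logic itself can be sanity-checked with a small shell sketch using md5sum (an illustration only, not the Perl answer; dir1 and dir2 are the two directories passed on the command line):
#!/bin/bash
dir1=$1
dir2=$2
result=PASSED
# hash every file under dir1 and compare it with the file at the
# same relative path under dir2; a missing or differing file fails
while IFS= read -r f; do
    rel=${f#"$dir1"/}
    if [ ! -f "$dir2/$rel" ] ||
       [ "$(md5sum < "$f")" != "$(md5sum < "$dir2/$rel")" ]; then
        result=FAILED
    fi
done < <(find "$dir1" -type f)
echo "$result"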