Extract Matching Lines after first match - perl

I have text data at the command line that is broken into "records", each starting with a Record line that has the same value (always 1). Within a record, each line is a separate key and value (no, this isn't JSON, unfortunately). A key is sometimes repeated within a record, and sometimes a key name is part of a longer key. For example:
Record = 1
Apple = 1
Ball = 2
Car = 3
RedApple = 4
Ball = 5
Dog = 6
Elf = 7
Fudge = 8
Record = 1
Apple = 2
Ball = 4
Car = 6
RedApple = 8
Ball = 10
Dog = 12
Elf = 14
Fudge = 16
Record = 1
Apple = 3
Ball = 6
Car = 9
RedApple = 12
Ball = 15
Dog = 18
Elf = 21
Fudge = 24
Is there a quick way, for each record, to get the lines for a set of keys, returning only the first result per key?
Ex: For each record get keys {Apple, Ball, Dog}
would match the following lines:
Record = 1
Apple = 1
Ball = 2
Dog = 6
Record = 1
Apple = 2
Ball = 4
Dog = 12
...
Basically, the rule is after matching a line with "Record", get the next unique lines with " Apple ", " Ball ", and " Dog " (spacing indicating exact key match) and spit those lines out.
I can write something in Perl and it wouldn't be too complex. I don't know awk, so I don't know whether it's better suited to something like this.

Is there a quick way, for each record, to get the lines for a set of keys, returning only the first result per key?
I don't believe that's actually what you want. I believe you actually want the items labeled Apple, Ball and Dog at the second level, meaning both
Record = 1
Apple = 1
Ball = 2
Car = 3
RedApple = 4
Ball = 5
Dog = 6
Elf = 7
Fudge = 8
and
Record = 1
Apple = 1
Car = 3
RedApple = 4
Ball = 5
Ball = 2
Dog = 6
Elf = 7
Fudge = 8
should produce
Record = 1
Apple = 1
Ball = 2
Dog = 6
If so, you could use
perl -ne'print if /^(?:\S|[ ]{2}(?:Apple|Ball|Dog)[ ]=)/'
or
grep -P '^(?:\S|[ ]{2}(?:Apple|Ball|Dog)[ ]=)'
Output:
Record = 1
Apple = 1
Ball = 2
Dog = 6
Record = 1
Apple = 2
Ball = 4
Dog = 12
Record = 1
Apple = 3
Ball = 6
Dog = 18
See "Specifying file to process to Perl one-liner" for usage.

If this isn't all you need:
$ grep -E '^(Record| (Apple|Ball|Car))' file
Record = 1
Apple = 1
Ball = 2
Car = 3
Record = 1
Apple = 2
Ball = 4
Car = 6
Record = 1
Apple = 3
Ball = 6
Car = 9
then edit your question to show a more truly representative example. Right now you've accepted an answer that's also based on guessing at your needs and may be more complicated than necessary (while this one may be simpler).

awk to the rescue!
$ awk '/^Record/ {h=$0; a["Apple"]=a["Dog"]=a["Ball"]=0}
$1 in a {if(h) {print h; h=""}
if(!a[$1]++) print}' file
Record = 1
Apple = 1
Ball = 2
Dog = 6
Record = 1
Apple = 2
Ball = 4
Dog = 12
Record = 1
Apple = 3
Ball = 6
Dog = 18
Explanation: on each Record line, save the header line and reset the counts. For lines whose first field is one of the required keys, print the header once, then print the line only on the first appearance of that key.
If you want to extract the second-level items only, you need to incorporate the leading spaces as part of the key (to determine the hierarchy). This is one alternative:
$ awk -F' *= *' '/Record/ {h=$0; a[" Apple"]=a[" Dog"]=a[" Ball"]=0}
$1 in a {if(h) {print h;h=""}; if(!a[$1]++) print}'
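For comparison, here is a minimal Python sketch of the same "first line per key, per record" rule. The function name `first_matches` and the `record_key` parameter are made up for illustration. It assumes the key is whatever precedes the first "=" with surrounding whitespace trimmed, so it ignores indentation. Unlike the awk version, it always emits the Record line, even if no wanted key follows it.

```python
def first_matches(lines, keys, record_key="Record"):
    """Return each record header plus the first line for every wanted key."""
    wanted = set(keys)
    seen = set()
    out = []
    for line in lines:
        # The key is the text before the first "=", trimmed, so that
        # "RedApple" is never mistaken for "Apple".
        key = line.split("=", 1)[0].strip()
        if key == record_key:
            seen = set()       # a new record starts: forget earlier matches
            out.append(line)
        elif key in wanted and key not in seen:
            seen.add(key)      # remember that this key was already emitted
            out.append(line)
    return out
```

Calling `first_matches(open("file"), ["Apple", "Ball", "Dog"])` would yield the lines with their trailing newlines intact, so they can be written straight to standard output.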


Table sort by month

I have a table in MATLAB with attributes in the first three columns and data from the fourth column onwards. I was trying to sort the entire table based on the first three columns. However, one of the columns (Column C) contains months ('January', 'February' ...etc). The sortrows function would only let me choose 'ascend' or 'descend' but not a custom option to sort by month. Any help would be greatly appreciated. Below is the code I used.
sortrows(Table, {'Column A','Column B','Column C'} , {'ascend' , 'ascend' , '???' } )
As @AnonSubmitter85 suggested, the best thing you can do is to convert your month names to numeric values from 1 (January) to 12 (December) as follows:
c = {
7 1 'February';
1 0 'April';
2 1 'December';
2 1 'January';
5 1 'January';
};
t = cell2table(c,'VariableNames',{'ColumnA' 'ColumnB' 'ColumnC'});
t.ColumnC = month(datenum(t.ColumnC,'mmmm'));
This will facilitate the access to a standard sorting criterion for your ColumnC too (in this example, ascending):
t = sortrows(t,{'ColumnA' 'ColumnB' 'ColumnC'},{'ascend', 'ascend', 'ascend'});
If, for whatever reason, you are forced to keep your months as literals, you can use a workaround: sort a clone of the table using the approach described above, then apply the resulting row indices to the original table:
c = {
7 1 'February';
1 0 'April';
2 1 'December';
2 1 'January';
5 1 'January';
};
t_original = cell2table(c,'VariableNames',{'ColumnA' 'ColumnB' 'ColumnC'});
t_clone = t_original;
t_clone.ColumnC = month(datenum(t_clone.ColumnC,'mmmm'));
[~,idx] = sortrows(t_clone,{'ColumnA' 'ColumnB' 'ColumnC'},{'ascend', 'ascend', 'ascend'});
t_original = t_original(idx,:);
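The underlying idea (sort on a numeric month key while leaving the month names untouched) is not specific to MATLAB. Below is a small Python sketch of it, offered only as an illustration: `MONTHS`, `MONTH_NUMBER` and `sort_rows` are hypothetical names, and the rows are assumed to be plain tuples mirroring the example table.

```python
# English month names in calendar order; the list index gives the sort key.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
MONTH_NUMBER = {name: number for number, name in enumerate(MONTHS, start=1)}

def sort_rows(rows):
    """Sort (ColumnA, ColumnB, ColumnC) rows, ordering ColumnC by month number."""
    return sorted(rows, key=lambda r: (r[0], r[1], MONTH_NUMBER[r[2]]))

rows = [
    (7, 1, "February"),
    (1, 0, "April"),
    (2, 1, "December"),
    (2, 1, "January"),
    (5, 1, "January"),
]
for row in sort_rows(rows):
    print(row)
```

The month names stay as literals in the output; only the sort key is numeric, which is the same effect the clone-and-index workaround achieves.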

Creating an array of a specific size from data in an array of a larger size - averages

I want to find the average value across an array between the element(x) and element(x+1)
for val = 1: xMid_p-1
eapDia_p = diaArray_p(1,val);
baseDia_p = diaArray_p(1,end);
curDiaArray_p = linspace(eapDia_p, baseDia_p, xMid_p-1);
curRadArray_p = curDiaArray_p/2;
maxRad = max(curRadArray_p);
for val = 1 : xMid_p-1
ln(1,val) = maxRad(:) - curRadArray_p(val);
lnE(1,val) = ln(1,val).^3;
presAn(1,val)= acos(((refDia_p/2)*cos(refPresAng_p))./curRadArray_p(val));
arcToo(1,val) = 2 * curRadArray_p(val)*((twRefDia_p/refDia_p)+(tan(refPresAng_p)-refPresAng_p)-(tan(presAn(1,val))-presAn(1,val)));
chor(1,val) = 2 * curRadArray_p(val) * sin(arcToo(1,val)/(curRadArray_p(1,val)*2));
for val = 1 : xMid_p - 2
lnM(1,val) = maxRad(:) - curRadArray_p(val);
lnME(1,val)=lnM(1,val).^3;
end
end
lnCubed(1,:) = ln.^3;
lnMCubed(1,:) = lnM.^3;
lnEq = lnCubed(2:end) - lnMCubed;
end
Please look at chor(1,val); suppose it gives the values:
chor =
1 2 3 4 5 6 7 8
I want to find the average of each pair of adjacent elements of chor, so the resulting array will be one element shorter and will give the result
aveChor =
1.5 2.5 3.5 4.5 5.5 6.5 7.5
One approach using indexing -
aveChor = (chor(2:end) + [chor(1:end-1)])/2
Another approach using diff -
aveChor = (2*chor(1:end-1) + diff(chor))/2
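Both one-liners compute the mean of each neighbouring pair. The same computation written out in Python may make the pairing explicit; the function name `adjacent_means` is made up for this sketch, and the input is assumed to be a plain list.

```python
def adjacent_means(values):
    """Average each element with its right-hand neighbour (result is one shorter)."""
    # zip pairs values[i] with values[i + 1] and stops at the shorter sequence.
    return [(a + b) / 2 for a, b in zip(values, values[1:])]

chor = [1, 2, 3, 4, 5, 6, 7, 8]
print(adjacent_means(chor))  # [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]
```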

SystemVerilog array random seed of Shuffle function

I get the same output every time I run the code below.
module array_shuffle;
integer data[10];
initial begin
foreach (data[x]) begin
data[x] = x;
end
$display("------------------------------\n");
$display("before shuffle, data contains:\n");
foreach (data[x]) begin
$display("data[%0d] = %0d", x, data[x]);
end
data.shuffle();
$display("------------------------------\n");
$display("after shuffle, data contains:\n");
foreach (data[x]) begin
$display("data[%0d] = %0d", x, data[x]);
end
end
endmodule
Output:
------------------------------
before shuffle, data contains:
data[0] = 0
data[1] = 1
data[2] = 2
data[3] = 3
data[4] = 4
data[5] = 5
data[6] = 6
data[7] = 7
data[8] = 8
data[9] = 9
------------------------------
after shuffle, data contains:
data[0] = 8
data[1] = 6
data[2] = 7
data[3] = 9
data[4] = 5
data[5] = 0
data[6] = 1
data[7] = 4
data[8] = 2
data[9] = 3
Is there a way to seed the randomization of the shuffle function?
Shuffle returns the same result every time because you probably run the simulator with the same seed. This is the intended behavior, because when you run a simulation and find a bug, you want to be able to reproduce it, regardless of any design (and to some extent testbench) changes. To see a different output, try setting the seed on the simulator command line. For Incisive this is:
irun -svseed 1 // sets the seed to 1
irun -svseed random // will set a random seed
It's also possible to manipulate the seed of the random number generator using set_randstate, but I wouldn't mess with that.
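The point about reproducibility holds for any seeded pseudo-random generator, not just a simulator's. As a loose analogy only (this uses Python's `random` module, not the SystemVerilog RNG, and `shuffled` is a hypothetical helper), the same seed always produces the same shuffle:

```python
import random

def shuffled(data, seed):
    """Return a shuffled copy of data using a private, explicitly seeded generator."""
    rng = random.Random(seed)  # independent of the module-level generator
    result = list(data)
    rng.shuffle(result)
    return result

# Two runs with the same seed give the same order, just as rerunning the
# simulator with the same seed reproduces the same shuffle.
print(shuffled(range(10), seed=1) == shuffled(range(10), seed=1))  # True
```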

Matlab - preprocess CSV file

I have a CSV file in a format similar to the following one:
title1
index columnA1 columnA2 columnA3
1 2 3 6
2 23 23 1
3 2 3 45
4 2 2 101
title2
index columnB1 columnB2 columnB3
1 23 53 6
2 22 13 1
3 5 4 43
4 8 6 102
I want to build a function readCustomCSV which receives a CSV file in the format shown above and a row index i, and returns an output file with (for, let's say, i = 3) the following content:
title1
index columnA1 columnA2 columnA3
3 2 3 45
title2
index columnB1 columnB2 columnB3
3 5 4 43
Do you know how to use the csvread function in order to obtain this type of functionality?
It confuses me that there are two types of sections. I was thinking of reading the whole thing as a string, splitting it into two .csv files, and then reading the corresponding line from each.
Try using this function.
I assumed that all tables have an equal number of columns and rows. The code can definitely be shortened / improved / extended ;)
function multi_table_csvread (row_index)
filename_INPUT = 'multi_table.csv' ;
filename_OUTPUT = 'selected_row.csv' ;
fIN = fopen(filename_INPUT,'r');
nextLine = fgetl(fIN);
tableIndex = 0;
tableLine = 0;
csvTable = {};
% start reading the csv file, line by line
% (fgetl returns the number -1 at end of file, so test for a char line)
while ischar(nextLine)
lineStr = strtrim(strsplit(nextLine,',')) ;
% remove empty cells
lineStr(cellfun('isempty',lineStr)) = [] ;
tableLine = tableLine + 1 ;
% if 1 element start new table
if numel(lineStr) == 1
tableIndex = tableIndex + 1;
tableLine = 1;
csvTable{tableIndex,tableLine} = lineStr ;
else
lineStr = add_comas(lineStr) ;
csvTable{tableIndex,tableLine} = lineStr ;
end
nextLine = fgetl(fIN);
end
fclose(fIN);
fOUT = fopen(filename_OUTPUT,'w');
if row_index > size(csvTable,2) -2
error('The row index exceeds the maximum number of rows!')
end
for k = 1 : size(csvTable,1)
title = csvTable{k,1};
columnHeaders = csvTable{k,2};
selected_row = csvTable{k,row_index+2};
fprintf(fOUT,'%s\n',title{:});
fprintf(fOUT,'%s',columnHeaders{:});
fprintf(fOUT,'\n');
fprintf(fOUT,'%s',selected_row{:});
fprintf(fOUT,'\n');
end
fclose(fOUT);
function line_with_comas = add_comas(this_line)
for ii = 1 : length(this_line)-1
this_line{ii} = strcat(this_line{ii},',') ;
end
line_with_comas = this_line ;
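For readers who want to see the selection logic on its own, here is a rough Python sketch of the same idea. It makes several assumptions: the function name `select_row` is made up, a line with a single field is a table title, the line right after a title is the column header, and fields are separated by commas or whitespace. It also matches on the value in the index column rather than on the row's position, which differs from the MATLAB code above.

```python
def select_row(lines, row_index):
    """Keep every title and header line plus the data row whose index matches."""
    out = []
    expect_header = False
    for line in lines:
        line = line.rstrip("\n")
        # Accept comma- or whitespace-separated fields.
        fields = line.replace(",", " ").split()
        if not fields:
            continue                   # skip blank lines
        if len(fields) == 1:           # a lone field marks a new table title
            out.append(line)
            expect_header = True
        elif expect_header:            # first line after a title: column names
            out.append(line)
            expect_header = False
        elif fields[0] == str(row_index):
            out.append(line)           # the requested data row
    return out
```

The returned lines can then be written to the output file with a single `"\n".join(...)`.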

Issue with circular shift BITMASK

I have an enum:
typedef NS_ENUM(NSInteger, Node) {
NodeTop ,
NodeLeft ,
NodeBottom ,
NodeRight ,
} ;
and a property declared as
@property Node node;
Now, in my controller, I'm assigning multiple values to node using the pipe (bitwise OR) operator:
node =top | left | bottom | right ;
(Q-1: does node hold values like 0000 or 0011 built from NodeTop, NodeLeft, and so on, or simply the final result of ORing top|left|bottom|right?)
With this, NSLog("%d",node); prints 1.
Now, if node contains 0001, I want to left-shift it by 1, so I tried
node<<1;
which changes the node value from 1 to 2, but does that really change 0001 to 0010?
In short, I want node to have a value like 0001,
and later on I want to shift its value like this:
0010
0100
1000
0001
0010
...
Any help? Let me know if the question is not clear!
You can use NS_OPTIONS to create a bit mask where each value is defined as 1<<n. This NSHipster post explains the usage of both NS_ENUM and NS_OPTIONS.
typedef NS_OPTIONS(NSInteger, Node) {
NodeTop = 1 << 0, // 1
NodeLeft = 1 << 1, // 2
NodeBottom = 1 << 2, // 4
NodeRight = 1 << 3 // 8
};
That way when you write
NodeTop | NodeBottom
it will be the same as
...0001 | ...0100 = ....0101
And shifting the bit will move it to the next option
Node node = NodeBottom<<1; // = NodeRight
since ...0100 << 1 = ...1000.
which changes the node value from 1 to 2, but does that really change 0001 to 0010?
It does. 0010 binary is 2 decimal.
You can only bitwise OR these values together in a meaningful way if you make each value correspond to a different bit, e.g.
typedef NS_ENUM(NSInteger, Node) {
NodeTop = 1,
NodeLeft = 2,
NodeBottom = 4,
NodeRight = 8
};
node = NodeTop; // 0b0001 = 0x01 = 1
node <<= 1; // 0b0010 = 0x02 = 2 = NodeLeft
node <<= 1; // 0b0100 = 0x04 = 4 = NodeBottom
Well as for the circular bit shifting, I think the best you can do is something like this (quickly thrown together C code)
enum test {
test1 = 1,
test2 = 2,
test4 = 4,
test8 = 8
};
static enum test val(enum test input) {
while(input >= 16) //Or whatever max bit you want
input = input >> 4; //Be sure to shift by the appropriate number
return input;
}
Then when you run this code:
enum test foo = test1;
foo = val(foo << 5);
foo will be test2
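The wrap-around the question asks for (0001, 0010, 0100, 1000, then back to 0001) can also be written as a single rotate within a fixed bit width. Here is a minimal Python sketch of that alternative. The name `rotate_left` and the default width of 4 are assumptions for illustration, and the value is assumed to fit within `width` bits.

```python
def rotate_left(value, shift=1, width=4):
    """Rotate the low `width` bits of value left by `shift` positions."""
    mask = (1 << width) - 1
    shift %= width                     # a full turn leaves the value unchanged
    # Bits pushed past the top come back in at the bottom.
    return ((value << shift) | (value >> (width - shift))) & mask

node = 0b0001
for _ in range(5):
    node = rotate_left(node)
    print(format(node, "04b"))         # 0010, 0100, 1000, 0001, 0010
```

Unlike the loop-based val() above, this also handles values with more than one bit set, for example rotating 0101 gives 1010.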