Matching elements in a table with Perl

I have some data that looks like this:
G1 G2 G3 G4
Pf1 NO B1 NO D1
Pf2 NO NO C1 D1
Pf3 A1 B1 NO D1
Pf4 A1 NO C1 D2
Pf5 A3 B2 C2 D3
Pf6 NO B3 NO D3
My goal is to check, for each column, whether an element (other than "NO") appears exactly twice (like A1 in column 2, for example); elements that appear three or more times should not be in the output. If an element qualifies, I want to write it next to the corresponding element of the first column. Of course, an element of the first column may have several such elements from the other columns. So, the desired output looks like this:
Pf1 B1
Pf2 C1
Pf3 A1 B1
Pf4 A1 C1
Pf5 D3
Pf6 D3
I have some code that works in the opposite direction: it lists the elements of the first column that correspond to the elements appearing exactly twice in the other columns. The code looks like this:
use Data::Dumper;
my %hash;
while (<DATA>) {
    next if $.==1;
    chomp;
    my ($first,@others) = (split /\s+/);
    for (@others){
        $hash{$_}.=' '.$first;
    }
}
print Dumper \%hash;
I need a push in the right direction to adapt it to my new purpose. Any help or suggestion is welcome!

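my %hash;   # element => number of occurrences in the whole table
my @r;      # rows with the "NO" cells removed
while (<DATA>) {
    next if $. == 1;                    # skip the header line
    chomp;
    my @t = grep $_ ne "NO", split;
    push @r, \@t;
    $hash{$_}++ for @t[1 .. $#t];       # count everything except the row label
}
for my $l (@r) {
    my $k = shift @$l;                  # row label (first column)
    my @t = grep { $hash{$_} == 2 } @$l;   # keep elements seen exactly twice
    print "$k @t\n";
}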
__DATA__
G1 G2 G3 G4
Pf1 NO B1 NO D1
Pf2 NO NO C1 D1
Pf3 A1 B1 NO D1
Pf4 A1 NO C1 D2
Pf5 A3 B2 C2 D3
Pf6 NO B3 NO D3
output
Pf1 B1
Pf2 C1
Pf3 A1 B1
Pf4 A1 C1
Pf5 D3
Pf6 D3
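For comparison, here is a minimal Python sketch of the same two-pass idea: first count every non-"NO" element, then print, for each row, only the elements whose count is exactly 2. The function name `elements_seen_twice` and the `TABLE` constant are made up for this illustration. Like the Perl answer, it counts over the whole table rather than per column, which gives the same result as long as each label is used in only one column.

```python
from collections import Counter

# Sample data copied from the question.
TABLE = """\
G1 G2 G3 G4
Pf1 NO B1 NO D1
Pf2 NO NO C1 D1
Pf3 A1 B1 NO D1
Pf4 A1 NO C1 D2
Pf5 A3 B2 C2 D3
Pf6 NO B3 NO D3
"""

def elements_seen_twice(text):
    """Map each row label to the elements that occur exactly twice in the table."""
    rows = []
    counts = Counter()
    for line in text.splitlines()[1:]:          # skip the header line
        label, *cells = line.split()
        cells = [c for c in cells if c != "NO"]  # drop the "NO" placeholders
        rows.append((label, cells))
        counts.update(cells)                     # first pass: count occurrences
    # Second pass: keep only the elements counted exactly twice.
    return {label: [c for c in cells if counts[c] == 2] for label, cells in rows}

if __name__ == "__main__":
    for label, kept in elements_seen_twice(TABLE).items():
        print(label, *kept)
```

If the same label could appear in different columns, the counter key would need to include the column index as well.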

Kafka Streams DSL folding hierarchical data

Using Kafka Streams DSL, this is what I want to do:
Input message Serdes: Avro for both key and value
Key: Record with fields L1, L2, L3
Value: Record with value V (in this case an int)
What I want to do is collapse this hierarchy in such a way that the produced stream has the correct summed-up value. For example,
Input:
L1 L2 L3 V
a1 b1 c1 v1
a1 b1 c2 v2
a1 b2 c1 v3
a1 b2 c2 v4
a2 b1 c1 v5
Output 1: (Data wanted at L1, L2)
L1 L2 V
a1 b1 v1 + v2
a1 b2 v3 + v4
a2 b1 v5
Output 2 (Data wanted at L1)
L1 V
a1 v1 + v2 + v3 + v4
a2 v5
Is there a way to do this with the Streams DSL? Note that the key type changes across all outputs, and I couldn't find a way to perform this rekey + aggregation (since the rekey is essentially supposed to merge multiple values). While there might be ways to achieve this using the Processor API or a basic Kafka consumer, I want to check how to do this in the DSL (if possible).
You should be able to use selectKey():
KStream input = builder.stream(...);
input.selectKey(/*create a new output record with only 2 attributes L1 and L2*/)
     .groupByKey()
     .aggregate(...);
input.selectKey(/*create a new output record with only 1 attribute L1*/)
     .groupByKey()
     .aggregate(...);
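To make the intended semantics concrete, here is a small plain-Python sketch of the "re-key, then sum per new key" idea. It does not use Kafka at all and processes a fixed batch rather than a stream; the numeric values and the names `RECORDS` and `rollup` are assumptions made for this illustration.

```python
from collections import defaultdict

# Hypothetical sample records: key (L1, L2, L3) -> value V,
# with v1..v5 replaced by the numbers 1..5.
RECORDS = [
    (("a1", "b1", "c1"), 1),
    (("a1", "b1", "c2"), 2),
    (("a1", "b2", "c1"), 3),
    (("a1", "b2", "c2"), 4),
    (("a2", "b1", "c1"), 5),
]

def rollup(records, depth):
    """Re-key each record to its first `depth` key fields, then sum per new key."""
    totals = defaultdict(int)
    for key, value in records:
        new_key = key[:depth]      # plays the role of the re-keying step
        totals[new_key] += value   # plays the role of the per-key aggregation
    return dict(totals)

if __name__ == "__main__":
    print(rollup(RECORDS, 2))  # data wanted at (L1, L2)
    print(rollup(RECORDS, 1))  # data wanted at L1
```

Each output level is computed independently from the input, which mirrors the two separate selectKey chains above.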

Notations in Coq

I want to use the notations to represent the predicate test as follows:
Variable A B : Type.
Inductive test : A -> B -> A -> B -> Prop :=
| test1 : forall a1 a2 b1 b2,
a1 \ b1 || a2 \ b2
where "c1 '\' st '||' c2 '\' st'" := (test c1 st c2 st')
.
However, Coq reports an error:
Why is this notation not accepted by Coq?
The notation is accepted; the actual problem is that Coq incorrectly parses your use of the notation within the definition of test1. To parse this notation correctly you need to adjust the parsing levels of its terms. You can do that with a reserved notation, since these where clauses for notations within an inductive definition don't support the syntax for configuring the notation:
Variable A B : Type.
Reserved Notation "c1 '\' st '||' c2 '\' st'" (at level 40, st at next level, c2 at next level, no associativity).
Inductive test : A -> B -> A -> B -> Prop :=
| test1 : forall a1 a2 b1 b2,
a1 \ b1 || a2 \ b2
where "c1 '\' st '||' c2 '\' st'" := (test c1 st c2 st')
.
I don't have a good intuition for what parsing levels work well (40 is somewhat arbitrary above), so the best advice I can give is to experiment and if it's parsed incorrectly somewhere then try adjusting the level.

Emacs rectangle space removal

I have something like the following data set
A, B ,C,D , E
A1 , B121 ,C1,D1 , E1
A2,Ber2 ,C2,D2 , E2
A3, Bat3 ,C3,D3 , E3
I want the commas aligned so that each one comes immediately after its text, followed by a single space before the next column starts.
Like this
A, B, C, D, E
A1, B121, C1, D1, E1
A2, Ber2, C2, D2, E2
A3, Bat3, C3, D3, E3
I tried using delete-whitespace-rectangle, but for some reason that only works as long as the widths of the strings in a column match.
Is there a way to make this happen in emacs?
You want to replace spaces, a comma, and spaces, with a comma and a single space.
You can do this with replace-regexp, replacing "\ *,\ *" with ", " (a comma followed by one space).
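The same substitution can be tried outside Emacs. The following is a minimal Python sketch, assuming the goal stated above (any run of spaces, a comma, and any run of spaces becomes a comma plus one space); the function name `tidy_commas` is made up for this example.

```python
import re

def tidy_commas(line):
    """Replace optional spaces around each comma with a comma and one space."""
    return re.sub(r" *, *", ", ", line)

if __name__ == "__main__":
    for row in ["A, B ,C,D , E", "A1 , B121 ,C1,D1 , E1", "A2,Ber2 ,C2,D2 , E2"]:
        print(tidy_commas(row))
```

Note that this only normalizes the spacing around commas; it does not pad the columns to a common width.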

"for" translation into lists high order functions

As far as I understand, for expressions are translated into Scala expressions which are built upon these higher-order list methods:
map
flatMap
withFilter
foreach
A common example is the one where:
for(b1 <- books; b2 <- books if b1 != b2;
    a1 <- b1.authors; a2 <- b2.authors if a1 == a2) yield a1;
Results in:
books flatMap (b1 =>
books withFilter( b2 => b1 != b2) flatMap( b2 =>
b1.authors flatMap ( a1 =>
b2.authors withFilter ( a2 => a2 == a1 ) map ( a2 => a1 )
)
)
)
Where:
books is a list of class Book objects (List[Book])
Book has a public attribute authors of type List[String]
My question is about this line:
b2.authors withFilter ( a2 => a2 == a1 ) map ( a2 => a1 )
Since the condition is a2 == a1 that line is equivalent to:
b2.authors withFilter ( a2 => a2 == a1 ) map ( x => x )
Why isn't the generated code just:
b2.authors filter ( a2 => a2 == a1 )
Can it be explained by the fact that the example is the reproduction of code automatically generated by Scala's compiler?
Is filter out of the for "building bricks"?
The translation of for/yield syntax into method calls is very simple and mechanical, almost at the level of string manipulation. withFilter is necessary in some places for its laziness, therefore it's used everywhere for simplicity. I don't understand the phrasing of your final question, but for/yield expressions are AIUI never translated into calls to filter except in a deprecated way for objects that don't yet have a withFilter method.

How to search pattern in certain range of text and replace with another string?

$ NAME : corry
$$.Inc s d
$$.Oc s
$$.TO
G1 ty n1 EE EE M T1 T2 $$SRU
G2 n1 y OO OO M T3 T4 $$SRU
.EON
$ NAME : patrick
$$.Inc c d
$$.Oc c
$$.TO
G1 td n3 EE EE M T5 T6 $$SRU
G2 n3 y OO OO M T7 T8 $$SRU
.EON
$ NAME : danny
$$.Inc a b
$$.Oc b
$$.TO
#lc1 corry
#lc2 patrick
1 to n0 EE EE M S1 S2 $$SRU
G2 n0 y OO OO M S3 S4 $$SRU
.EON
$ NAME : sandy
$$.Inc m n
$$.Oc n
$$.TO
G1 te n1 EE EE M b1 b2 $$SRU
G2 n1 o OO OO M b3 b4 $$SRU
.EON
$ NAME : manager
$$.Inc o e
$$.Oc e
$$.TO
#lc3 danny
#lc4 sandy
G1o ty n1 EE EE M T1 T2 $$SRU
G2o n1 y OO OO M T3 T4 $$SRU
.EON
How can I search for a pattern within a certain range of the text? For example, I want to search for G1o in the section that runs from $ NAME : manager to the end of that section (.EON) and replace it with G1o.corry.n.
perl -pe 's/G1o/G1o.corry.n/ if /\$ NAME : manager/ .. /\.EON/' file
From the perlop documentation:
In scalar context, ".." returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each ".." operator maintains its own boolean state, even across calls to a subroutine that contains it. It is false as long as its left operand is false.
sed '/^\$ NAME : manager/,/\.EON/s/G1o/G1o.corry.n/'
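The flip-flop behaviour quoted above can be emulated by hand with a boolean flag that is switched on by the start pattern and off by the end pattern. Below is a minimal Python sketch of that idea, not a reimplementation of Perl's operator; the function name `replace_in_range` and the shortened sample lines (including a made-up G1o line in the sandy section) are assumptions for illustration.

```python
import re

def replace_in_range(lines, start_pat, end_pat, old, new):
    """Replace the first `old` with `new` on each line between a line matching
    start_pat and the next line matching end_pat, both inclusive."""
    inside = False                      # the flip-flop state
    out = []
    for line in lines:
        if not inside and re.search(start_pat, line):
            inside = True               # left operand became true
        if inside:
            out.append(line.replace(old, new, 1))
            if re.search(end_pat, line):
                inside = False          # right operand became true
        else:
            out.append(line)
    return out

# Shortened, partly hypothetical sample: the G1o line under "sandy" is made up
# to show that lines outside the range are left untouched.
SAMPLE = [
    "$ NAME : sandy",
    "G1o te n1",
    ".EON",
    "$ NAME : manager",
    "G1o ty n1",
    ".EON",
]

if __name__ == "__main__":
    for line in replace_in_range(SAMPLE, r"\$ NAME : manager", r"\.EON",
                                 "G1o", "G1o.corry.n"):
        print(line)
```

Because the flag is reset at each .EON, the function handles several matching sections in one pass, just as the one-liners do.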