Difference between Flux.concat and Flux.concatWith - reactive-programming

I am new to reactive streams and am learning to combine two publishers (Flux, to be specific) using the concat/concatWith methods.
Everything I can do with the concat method can also be achieved with the concatWith method. Here are the sample cases I used:
Mono<String> mono1 = Mono.just(" karan ");
Mono<String> mono2 = Mono.just(" | verma ");
Mono<String> mono3 = Mono.just(" | kv ");

Flux<String> flux1 = Flux.just(" {1} ", "{2} ", "{3} ", "{4} ");
Flux<String> flux2 = Flux.just(" |A|", " |B| ", " |C| ");

// Flux that emits one item of flux1 every 1000ms
Flux<String> intervalFlux1 = Flux.interval(Duration.ofMillis(1000))
        .zipWith(flux1, (i, string) -> string);

// Flux that emits one item of flux2 every 1000ms
Flux<String> intervalFlux2 = Flux.interval(Duration.ofMillis(1000))
        .zipWith(flux2, (i, string) -> string);

System.out.println("**************Flux Concat***************");
Flux.concat(mono1, mono2, mono3).subscribe(System.out::print);
System.out.println();
Flux.concat(flux2, flux1).subscribe(System.out::print);
System.out.println();
Flux.concat(intervalFlux2, flux1).subscribe(System.out::print);
Thread.sleep(5000);
System.out.println();
Flux.concat(intervalFlux2, intervalFlux1).subscribe(System.out::print);
Thread.sleep(10000);
System.out.println("----------------------------------------");

System.out.println("**************Flux Concat with***************");
mono1.concatWith(mono2).concatWith(mono3).subscribe(System.out::print);
System.out.println();
flux1.concatWith(flux2).subscribe(System.out::print);
System.out.println();
intervalFlux1.concatWith(flux2).subscribe(System.out::print);
Thread.sleep(5000);
System.out.println();
intervalFlux1.concatWith(intervalFlux2).subscribe(System.out::print);
Thread.sleep(10000);
System.out.println();
System.out.println("----------------------------------------");
and here is the output for both cases:
**************Flux Concat***************
karan | verma | kv
|A| |B| |C| {1} {2} {3} {4}
|A| |B| |C| {1} {2} {3} {4}
|A| |B| |C| {1} {2} {3} {4} ----------------------------------------
**************Flux Concat with***************
karan | verma | kv
{1} {2} {3} {4} |A| |B| |C|
{1} {2} {3} {4} |A| |B| |C|
{1} {2} {3} {4} |A| |B| |C|
----------------------------------------
and the timing was also similar in both cases.
What is the difference between the two?
Are there any specific conditions under which concat or concatWith should be used?

They are equivalent
Java requires that all code be part of a class, so you can't just have Flux concat(Flux, Flux) as a free function, which in my opinion would be the least confusing option.
Some people prefer "always member functions"; others prefer "static functions when taking two (or more) of the same class".
A third alternative would be a constructor of the form Flux::Flux(Flux, Flux) (or Flux::Flux(Flux[])).
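In practice they do the same thing: concatWith is just the fluent, instance-method entry point, and in Reactor's source it essentially delegates to the static concat. A minimal sketch of the equivalence (assuming reactor-core on the classpath):

import reactor.core.publisher.Flux;

public class ConcatDemo {
    public static void main(String[] args) {
        Flux<String> a = Flux.just("1", "2");
        Flux<String> b = Flux.just("A", "B");

        // Static factory form: subscribes to each source in turn, in argument order.
        Flux.concat(a, b).subscribe(System.out::print);   // prints 12AB
        System.out.println();

        // Fluent instance form: concatWith(other) delegates to concat(this, other),
        // so the result is identical.
        a.concatWith(b).subscribe(System.out::print);     // prints 12AB
        System.out.println();
    }
}

So the choice is purely stylistic: Flux.concat reads better when you have all the sources up front, while concatWith fits naturally in the middle of an existing operator chain.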

Related

Dataflow job doesn't emit messages after GroupByKey()

I have a streaming Dataflow pipeline that writes to BQ, and I want to window all the failed rows and do some further analysis. The pipeline looks like this; I'm getting all the error messages in the 2nd step, but all the messages are getting stuck at the beam.GroupByKey(). Nothing moves downstream after that. Does anyone have any idea how to fix this?
data = (
    p  # the Pipeline object (not shown in the original snippet)
    | "Read PubSub Messages" >> beam.io.ReadFromPubSub(
        subscription=options.input_subscription,
        with_attributes=True)
    ...
    | "write to BQ" >> beam.io.WriteToBigQuery(
        table=f"{options.bq_dataset}.{options.bq_table}",
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        method='STREAMING_INSERTS',
        insert_retry_strategy=beam.io.gcp.bigquery_tools.RetryStrategy.RETRY_NEVER
    )
)

(
    data[beam.io.gcp.bigquery.BigQueryWriteFn.FAILED_ROWS]
    | f"Window into: {options.window_size}m" >> GroupWindowsIntoBatches(options.window_size)
    | f"Failed Rows for " >> beam.ParDo(BadRows(options.bq_dataset, 'table'))
)
and
class GroupWindowsIntoBatches(beam.PTransform):
    """A composite transform that groups Pub/Sub messages based on publish
    time and outputs a list of dictionaries, where each contains one message
    and its publish timestamp.
    """

    def __init__(self, window_size):
        # Convert minutes into seconds.
        self.window_size = int(window_size * 60)

    def expand(self, pcoll):
        return (
            pcoll
            # Assigns window info to each Pub/Sub message based on its publish timestamp.
            | "Window into Fixed Intervals" >> beam.WindowInto(window.FixedWindows(10))
            # If the windowed elements do not fit into memory, please consider using `beam.util.BatchElements`.
            | "Add Dummy Key" >> beam.Map(lambda elem: (None, elem))
            | "Groupby" >> beam.GroupByKey()
            | "Abandon Dummy Key" >> beam.MapTuple(lambda _, val: val)
        )
Also, I don't know if it's relevant, but the beam.DoFn.TimestampParam inside my GroupWindowsIntoBatches has an invalid (negative) timestamp.
OK, so the issue was that the messages coming from the BigQuery FAILED_ROWS output were not timestamped, so the fixed windows never closed and GroupByKey held everything back. Adding | 'Add Timestamps' >> beam.Map(lambda x: beam.window.TimestampedValue(x, time.time())) seems to fix the GroupByKey.
import time  # needed for time.time() in the added step

class GroupWindowsIntoBatches(beam.PTransform):
    """A composite transform that groups Pub/Sub messages based on publish
    time and outputs a list of dictionaries, where each contains one message
    and its publish timestamp.
    """

    def __init__(self, window_size):
        # Convert minutes into seconds.
        self.window_size = int(window_size * 60)

    def expand(self, pcoll):
        return (
            pcoll
            | 'Add Timestamps' >> beam.Map(lambda x: beam.window.TimestampedValue(x, time.time()))  # <-- added this line
            | "Window into Fixed Intervals" >> beam.WindowInto(window.FixedWindows(30))
            | "Add Dummy Key" >> beam.Map(lambda elem: (None, elem))
            | "Groupby" >> beam.GroupByKey()
            | "Abandon Dummy Key" >> beam.MapTuple(lambda _, val: val)
        )

Why "a b c [2] d c" -match "b c [2]" is False?

The predicate below returns False, where I expected True:
"a b c [2] d c" -match "b c [2]"
False
But
"a b c d c" -match "b c"
True
This returns True. I wonder what the reason is.
When in doubt, regex escape. You don't need to backslash the spaces. Brackets are used for regex character ranges.
[regex]::escape("b c [2]")
b\ c\ \[2]
"a b c [2] d c" -match [regex]::escape("b c [2]")
True
"a b c" -match "[a-c] [a-c] [a-c]"
True
"a b c " | select-string "([a-c] ){3}" # the whole match is highlighted in ps7
a b c
The [ is a regex character and must be escaped.
PS> "a b c [2] d c" -match "b c \[2]"
True
Edit: Mathias beat me to it while I was typing.
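The same pitfall exists in any regex engine. As a point of comparison, here is a minimal Java sketch where Pattern.quote plays the role of [regex]::escape:

import java.util.regex.Pattern;

public class EscapeDemo {
    public static void main(String[] args) {
        String text = "a b c [2] d c";
        String needle = "b c [2]";

        // Raw pattern: [2] is a character class matching the single character '2',
        // so "b c [2]" looks for "b c 2", which is not in the text.
        System.out.println(Pattern.compile(needle).matcher(text).find());                 // false

        // Quoted pattern: the needle is treated as a literal string.
        System.out.println(Pattern.compile(Pattern.quote(needle)).matcher(text).find());  // true
    }
}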

How to find numbers in text

This is a small example of a pyspark column (String) in my dataframe.
column                                                                                     | new_column
-------------------------------------------------------------------------------------------|---------------------------------------------
Hoy es día de ABC/KE98789T983456 clase.                                                    | 98789
Como ABC/KE 34562Z845673 todas las mañanas                                                 | 34562
Hoy tiene ABC/KE 110330/L63868 clase de matemáticas,                                       | 110330
Marcos se ABC 898456/L56784 levanta con sueño.                                             | 898456
Marcos se ABC898456 levanta con sueño.                                                     | 898456
comienza ABC - KE 60014 -T60058                                                            | 60014
inglés y FOR 102658/L61144 ciencia. Se viste, desayuna                                     | 102658
y comienza FOR ABC- 72981 / KE T79581: el camino hacia la                                  | 72981
escuela. Se FOR ABC 101665 - 103035 - 101926 - 105484 - 103036 - 103247 - encuentra con su | [101665,103035,101926,105484,103036,103247]
escuela ABCS 206048/206049/206050/206051/205225-FG-matemáticas-                            | [206048,206049,206050,206051,205225]
encuentra ABCS 111553/L00847 & 111558/L00895 - matemáticas                                 | [111553,111558]
ciencia ABC 163278/P20447 AND RETROFIT ABCS 164567/P21000 - 164568/P21001 - desayuna       | [163278,164567,164568]
ABC/KE 71729/T81672 - 71781/T81674 71782/T81676 71730/T81673 71783/T81677 71784/T          | [71729,71781,71782,71730,71783,71784]
ciencia ABC/KE2646/L61175:E/F-levanta con sueño L61/62LAV AT Z5CTR/XC D3-1593              | [2646]
escuela ABCS 6048/206049/6050/206051/205225-FG-matemáticas- MSN 2345                       | [6048,206049,6050,206051,205225]
FOR ABC/KE 109038_L35674_DEFINE AND DESIGN IMPROVEMENTS OF 1618 FROM 118(PDS4 BRACKETS)    | [109038]
y comienza FOR ABC- 2981 / KE T79581: el camino hacia la 9856                              | [2981]
I want to extract all numbers consisting of 4, 5, or 6 digits from this text.
Conditions and cases for extracting them:
- attached to ABC/KE (first line in the example above)
- after ABC/KE + space (second and third line)
- after ABC + space (line 4)
- after ABC without a space (line 5)
- after ABC - KE + space
- after the word FOR
- after ABC- + space
- after ABC + space
- after ABCS (lines 10 and 11)
Example of failed cases:
Column                                                                                  | new_column
-----------------------------------------------------------------------------------------|---------------------------------------------------------------------------
FOR ABC/KE 109038_L35674_DEFINE AND DESIGN IMPROVEMENTS OF 1618 FROM 118(PDS4 BRACKETS) | [1618] ==> should be [109038]
ciencia ABC/KE2646/L61175:E/F-levanta con sueño L61/62LAV AT Z5CTR/XC D3-1593           | [1593] ==> should be [2646]
escuela ABCS 6048/206049/6050/206051/205225-FG-matemáticas- MSN 2345                    | [6048,206049,6050,206051,205225,2345] ==> should be [6048,206049,6050,206051,205225]
I hope I have summarized the cases; you can see my examples above and the expected output.
How can I do it?
Thank you
One way is to use regexes to clean up the data and set up a lone anchor with the value ABC to identify the start of a potential match. After str.split(), iterate through the resulting array to flag and retrieve consecutive matching numbers that follow this anchor.
Edit: added underscore _ to the data pattern (\b(\d{4,6})(?=[A-Z/_]|$)) so that it now also allows an underscore to follow the matched substring of 4-6 digits. This fixed the first failed case; the second and third should work with the existing regex patterns.
import re
from pyspark.sql.types import ArrayType, StringType
from pyspark.sql.functions import udf
(1) Use regex patterns to clean up the raw data so that we have only one anchor, ABC, to identify the start of a potential match:

clean1: use [-&\s]+ to convert '&', '-' and whitespace into a single SPACE ' '; these characters are used to connect a chain of numbers.
    example: `ABC - KE` --> `ABC KE`
             `103035 - 101926 - 105484` --> `103035 101926 105484`
             `111553/L00847 & 111558/L00895` --> `111553/L00847 111558/L00895`
clean2: convert text matching the following sub-patterns into 'ABC ':
    + ABCS?(?:[/\s]+KE|(?=\s*\d))
      + ABC followed by an optional `S`
      + followed by at least one slash or whitespace and then `KE` --> `[/\s]+KE`
        example: `ABC/KE 110330/L63868` --> `ABC 110330/L63868`
      + or followed by optional whitespace and then at least one digit --> `(?=\s*\d)`
        example: `ABC898456` --> `ABC 898456`
    + \bFOR\s+(?:[A-Z]+\s+)*
      + the `FOR` words
        example: `FOR DEF HJK 12345` --> `ABC 12345`
data: \b(\d{4,6})(?=[A-Z/_]|$) is a regex to match the actual numbers: 4-6 digits followed by [A-Z/_] or end-of-string.
(2) Create a dict to save all 3 patterns:
ptns = {
    'clean1': re.compile(r'[-&\s]+', re.UNICODE),
    'clean2': re.compile(r'\bABCS?(?:[/\s-]+KE|(?=\s*\d))|\bFOR\s+(?:[A-Z]+\s+)*', re.UNICODE),
    'data':   re.compile(r'\b(\d{4,6})(?=[A-Z/_]|$)', re.UNICODE)
}
(3) Create a function to find matched numbers and save them into an array
def find_number(s_t_r, ptns, is_debug=0):
    try:
        arr = re.sub(ptns['clean2'], 'ABC ', re.sub(ptns['clean1'], ' ', s_t_r.upper())).split()
        if is_debug:
            return arr
        # f: flag to identify if a chain of matches has started, default is 0 (false)
        f = 0
        new_arr = []
        # iterate through arr; start checking numbers when the anchor is detected and set f=1
        for x in arr:
            if x == 'ABC':
                f = 1
            elif f:
                new = re.findall(ptns['data'], x)
                # extend the result if there are any matches, else reset the flag
                if new:
                    new_arr.extend(new)
                else:
                    f = 0
        return new_arr
    except Exception as e:
        # only use print in local debugging
        print('ERROR:{}:\n    [{}]\n'.format(s_t_r, e))
        return []
(4) Define the UDF:
udf_find_number = udf(lambda x: find_number(x, ptns), ArrayType(StringType()))
(5) get the new_column
df.withColumn('new_column', udf_find_number('column')).show(truncate=False)
+------------------------------------------------------------------------------------------+------------------------------------------------+
|column                                                                                    |new_column                                      |
+------------------------------------------------------------------------------------------+------------------------------------------------+
|Hoy es día de ABC/KE98789T983456 clase.                                                   |[98789]                                         |
|Como ABC/KE 34562Z845673 todas las mañanas                                                |[34562]                                         |
|Hoy tiene ABC/KE 110330/L63868 clase de matemáticas,                                      |[110330]                                        |
|Marcos se ABC 898456/L56784 levanta con sueño.                                            |[898456]                                        |
|Marcos se ABC898456 levanta con sueño.                                                    |[898456]                                        |
|comienza ABC - KE 60014 -T60058                                                           |[60014]                                         |
|inglés y FOR 102658/L61144 ciencia. Se viste, desayuna                                    |[102658]                                        |
|y comienza FOR ABC- 72981 / KE T79581: el camino hacia la                                 |[72981]                                         |
|escuela. Se FOR ABC 101665 - 103035 - 101926 - 105484 - 103036 - 103247 - encuentra con su|[101665, 103035, 101926, 105484, 103036, 103247]|
|escuela ABCS 206048/206049/206050/206051/205225-FG-matemáticas-                           |[206048, 206049, 206050, 206051, 205225]        |
|encuentra ABCS 111553/L00847 & 111558/L00895 - matemáticas                                |[111553, 111558]                                |
|ciencia ABC 163278/P20447 AND RETROFIT ABCS 164567/P21000 - 164568/P21001 - desayuna      |[163278, 164567, 164568]                        |
|ABC/KE 71729/T81672 - 71781/T81674 71782/T81676 71730/T81673 71783/T81677 71784/T         |[71729, 71781, 71782, 71730, 71783, 71784]      |
+------------------------------------------------------------------------------------------+------------------------------------------------+
(6) Code for debugging; use find_number(row.column, ptns, 1) to check how/if the first two regex patterns work as expected:
for row in df.limit(10).collect():
    print('{}:\n    {}\n'.format(row.column, find_number(row.column, ptns, 1)))
Some notes:
- In the clean2 pattern, ABCS and ABC are treated the same way. If they are different, just remove the 'S' and add a new alternative ABCS\s*(?=\d) to the end of the pattern:
re.compile(r'\bABC(?:[/\s-]+KE|(?=\s*\d))|\bFOR\s+(?:[A-Z]+\s+)*|ABCS\s*(?=\d)')
- The current clean1 pattern only treats '-', '&' and whitespace as connectors of consecutive numbers; you might add more characters or words like 'AND' and 'OR', for example:
re.compile(r'[-&\s]+|\b(?:AND|OR)\b')
- The FOR-words pattern is \bFOR\s+(?:[A-Z]+\s+)*; this might need adjusting depending on whether digits are allowed in the words, etc.
- This was tested on Python 3. On Python 2 there might be issues with unicode; you can fix them by using the method in the first answer of the reference.

Transpose rows into columns using Perl

I am a newbie with Perl. I have an assignment to transpose rows into columns in a huge data set.
customers goods transportation
---------- ----- --------------
A, B, C, D rice truck
E, G, D corn train
............. ..... .........
T, H, K, M, N wheat air cargo
And I would like the output to look like:
customers goods transportation
---------- ----- --------------
A rice truck
B rice truck
C rice truck
D rice truck
............. ..... .........
N wheat air cargo
Could anyone help? Thank you very much.
Most probably you have come across the map function; study how it works and how you construct rows or columns, and you will get it. Good luck! :)
Thank you all. After a few hours of trying, I figured out how to do my assignment.
It needs a few simple manual interventions on the input and output,
i.e. adding a , at the end of the first column of the input and removing the \ in
the second column of the output using Excel.
It is time to submit the output. I would appreciate it if someone has better Perl
code to solve it.
#!/usr/bin/perl
use strict;
use warnings;

my @records;
while (<DATA>) {
    chomp;
    my @columns = split ", ", $_;
    push @records, \@columns;
}

foreach my $record (@records) {
    foreach my $column (@{$record}) {
        # Print every column except the last, paired with the last column.
        if (\$column != \$$record[-1]) {
            print "$column\t \\$$record[-1]\n";
        }
    }
}
__DATA__
A, B, C, D, Rice Truck
E, G, D, Corn Train
T, H, K, M, N, Wheat Air cargo
__OUTPUT__
A \ Rice Truck
B \ Rice Truck
C \ Rice Truck
D \ Rice Truck
E \ Corn Train
G \ Corn Train
D \ Corn Train
T \ Wheat Air cargo
H \ Wheat Air cargo
K \ Wheat Air cargo
M \ Wheat Air cargo
N \ Wheat Air cargo
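For comparison, here is the same row-expansion logic as a minimal Java sketch (same data; assuming the last comma-separated field holds both goods and transportation):

import java.util.List;

public class Transpose {
    public static void main(String[] args) {
        List<String> rows = List.of(
            "A, B, C, D, Rice Truck",
            "E, G, D, Corn Train",
            "T, H, K, M, N, Wheat Air cargo");

        for (String row : rows) {
            String[] cols = row.split(", ");
            // The last element holds "<goods> <transportation>"; the rest are customers.
            String tail = cols[cols.length - 1];
            for (int i = 0; i < cols.length - 1; i++) {
                System.out.println(cols[i] + "\t" + tail);
            }
        }
    }
}

This prints one "customer goods transportation" line per customer and needs no manual clean-up of stray \ characters.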
It's a long time since I looked at Perl, but is this Perl module any good?
Data::Pivot

How can I count the number of entries of a column based on the distinct entries of another column

My data in a file is like below with multiple columns:
A B
Tiger Animal
Parrot Bird
Lion Animal
Elephant Animal
Crow Bird
Horse Animal
Man Human
Dog Animal
I want to find the number of entries in column A corresponding to the distinct entries in column B, if possible in R, or maybe with a Perl script.
Output as:
Animal 5
Bird 2
Human 1
Moreover, if possible, I'd like to find out whether the entries in column A are repeated for the distinct entries in column B, like:
A B
Tiger Animal
Tiger Animal
tapply from base R will solve this nicely.
with(anm, tapply(A, B, function(x) length(unique(x))))
This is a solution done in R. Is this what you were looking for?
> anm <- data.frame(A = c("Tiger", "Parrot", "Lion", "Elephant", "Crow", "Horse", "Man", "Dog", "Tiger"),
+ B = c("Animal", "Bird", "Animal", "Animal", "Bird", "Animal", "Human", "Animal", "Animal"))
> anm
A B
1 Tiger Animal
2 Parrot Bird
3 Lion Animal
4 Elephant Animal
5 Crow Bird
6 Horse Animal
7 Man Human
8 Dog Animal
9 Tiger Animal
> (col.anm <- colSums(table(anm)))
Animal Bird Human
6 2 1
> table(anm)
B
A Animal Bird Human
Crow 0 1 0
Dog 1 0 0
Elephant 1 0 0
Horse 1 0 0
Lion 1 0 0
Man 0 0 1
Parrot 0 1 0
Tiger 2 0 0 # you can see how many times entry from A comes up
EDIT
To get the desired output format as noted in the comment, wrap your result in a data.frame.
> data.frame(col.anm)
col.anm
Animal 6
Bird 2
Human 1
If your data is in R, you can use table() to get what you need. First some example data:
dat <- data.frame(A=c("tiger","parrot","lion","tiger"),B=c("animal","bird","animal","animal"))
Then we can get counts of B with:
table(dat$B)
and counts of co-occurrence with:
table(dat)
To get the table you specified we can use the plyr package:
library("plyr")
tab <- ddply(dat,.(A,B),nrow)
tab[tab$V1>1,]
A B V1
3 tiger animal 2
Not sure I get the full data structure in the file, but if you're on UNIX:
tr -s ' ' < FILE | sort -u | awk '{ print $2 }' | sort | uniq -c
5 Animal
2 Bird
1 Human
The above works even if I add the line "Tiger Animal" at the end, because of the first sort -u.
The tr -s squeezes multiple blank spaces into one (so the sort commands act as expected).
In case anyone else comes by here, here are a couple more approaches that work.
myout <- lapply(split(anm, list(anm$B)), function(x)
    list(length(unique(x[, "A"])), x[duplicated(x), "A"])
)
unlist(sapply(myout, function(x) x[1]))  # counts in each category
sapply(myout, function(x) x[-1])         # list of duplicated names
or....
library(data.table)
mydt <- data.table(anm,key="B")
mydt[,.N,by=key(mydt)]
mydt[,.N,by="B,A"][N>1]
where....
anm = read.table(textConnection(
"Tiger Animal
Parrot Bird
Lion Animal
Elephant Animal
Crow Bird
Horse Animal
Man Human
Dog Animal
Tiger Animal"))
names(anm) <- c("A","B")
EDIT: Edited in response to comment by Matthew Dowle (author of data.table).
You can do the first easily with awk:
awk '{ myarray[$2]++ } END { for ( key in myarray ) { print key ": " myarray[key] } }' FILE
The second is a bit trickier... ( http://ideone.com/xdKcs )
awk '{ myarray[$2]++ ; myarray2[$2, $1]++ }
END {
    for ( key in myarray ) { print key ": " myarray[key] }
    print
    print "Duplicates: "
    for ( key in myarray2 ) {
        split(key, sep, SUBSEP)
        if ( myarray2[sep[1], sep[2]] > 1 ) {
            print sep[1] ": " sep[2] " " myarray2[sep[1], sep[2]]
        }
    }
}' FILE
Here is an approach using the plyr package in R.
mydf = read.table(textConnection(
"Tiger Animal
Parrot Bird
Lion Animal
Elephant Animal
Crow Bird
Horse Animal
Man Human
Dog Animal
Tiger Animal"))
library(plyr)
ddply(mydf, .(V2), summarize, V3 = length(V1))
V2 V3
1 Animal 6
2 Bird 2
3 Human 1
ddply(mydf, .(V2, V1), summarize, V3 = length(V1))
V2 V1 V3
1 Animal Dog 1
2 Animal Elephant 1
3 Animal Horse 1
4 Animal Lion 1
5 Animal Tiger 2
6 Bird Crow 1
7 Bird Parrot 1
8 Human Man 1
EDIT: add the names of the animals in each category.
ddply(mydf, .(V2), summarize,
V3 = length(V1),
V4 = do.call("paste", as.list(unique(V1))))
V2 V3 V4
1 Animal 6 Tiger Lion Elephant Horse Dog
2 Bird 2 Parrot Crow
3 Human 1 Man
If you're more comfortable with SQL, here's a very short solution using the sqldf package in R:
anm <- data.frame(A = c("Tiger", "Parrot", "Lion", "Elephant", "Crow", "Horse", "Man", "Dog", "Tiger"),
B = c("Animal", "Bird", "Animal", "Animal", "Bird", "Animal", "Human", "Animal", "Animal"))
library(sqldf)
sqldf("select B,count(distinct A) tot from anm group by B")
sqldf("select B,A,count(*) num from anm group by B,A HAVING num > 1")
In Perl (strict and warnings implied):
my ( %uniq, %count_for );

# here $fh = some input source
while ( <$fh> ) {
    s/^\s+//;    # trim left
    s/\s*$//;    # trim right (and chomp)

    # This split allows for spaces between words in a single column;
    # it also handles tab-delimited records.
    my @cols = split /(?:\t|\s{2,})/;

    # Normalize the text and test for uniqueness:
    # by these manipulations,
    #   Tiger      Animal
    # matches
    #   Tiger  Animal
    # despite any column irregularities.
    next if $uniq{ join('-', @cols) }++;

    # count occurrence
    $count_for{ $cols[1] }++;
}
#!/usr/bin/env perl
use strict;
use warnings;
use File::Slurp qw(slurp);

exit unless $ARGV[0];

my @data = slurp($ARGV[0]);
my %h;
for (@data) {
    chomp;
    for my $token (split ' ', $_) {
        next if $token =~ /^(?:A|B)$/;   # skip the header tokens
        $h{$token}++;
    }
}
print "$_: $h{$_}\n" for keys %h;
usage:
$ perl script.pl columns.txt
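For completeness, here are the same two computations (distinct counts per category, plus repeated entries) as a minimal Java sketch, assuming the two-column data has already been parsed into pairs:

import java.util.*;

public class CountDistinct {
    record Row(String a, String b) {}

    public static void main(String[] args) {
        List<Row> rows = List.of(
            new Row("Tiger", "Animal"), new Row("Parrot", "Bird"),
            new Row("Lion", "Animal"), new Row("Elephant", "Animal"),
            new Row("Crow", "Bird"), new Row("Horse", "Animal"),
            new Row("Man", "Human"), new Row("Dog", "Animal"),
            new Row("Tiger", "Animal"));

        // Number of distinct A entries per distinct B entry.
        Map<String, Set<String>> distinct = new TreeMap<>();
        for (Row r : rows)
            distinct.computeIfAbsent(r.b(), k -> new HashSet<>()).add(r.a());
        distinct.forEach((b, as) -> System.out.println(b + " " + as.size()));
        // Animal 5
        // Bird 2
        // Human 1

        // (A, B) pairs that occur more than once.
        Map<Row, Integer> seen = new HashMap<>();
        for (Row r : rows) seen.merge(r, 1, Integer::sum);
        seen.forEach((r, n) -> { if (n > 1) System.out.println(r + " x" + n); });
        // Row[a=Tiger, b=Animal] x2
    }
}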