Combining two different multi-row strings into one using Perl - perl

I'm having some issues trying to combine two multi-row strings into one after performing regular expression manipulations on those strings. As an example, I start with data in this form:
TMS: xxxxxxx11110000
TDI: xxxxxxx00001111
TMS: xxxx00001111
TDI: xxxx11110000
To get it in the form I need, I search the file for the key word "TMS: ", extract just the data, use regular expressions to remove the "x's", reverse the data, and then place each bit on its own line and store it in a string. Resultant string would look like this:
0
0
0
0
1
1
1
1
I then search the file for "TDI: " and repeat that same process. The last step would be to concatenate the first string with the second string to get the following output (given the example above):
01
01
01
01
10
10
10
10
10
10
10
10
01
01
01
01
However, when I concatenate the two strings, what I'm getting as an output right now is
0
0
0
0
1
1
1
1
1
1
1
1
0
0
0
0
1
1
1
1
0
0
0
0
0
0
0
0
1
1
1
1
Is there a way to get the result I'm looking for without changing too much about my process? I've tried using the split command, chomp command, etc. without any luck.

Would be good to have a minimal example to see how you're approaching this problem. Additionally, there are a lot of things about your input file that are not clear. For example, are TMS and TDI always paired in the file, or do you have to check for that? Will you always take the next TDI instance to pair with the preceeding TMS event, or can they be more disjointed? Does TMS always preceed TDI or can they be reversed?
One simple way to do this assuming that the data look just like you've indicated in your example, might be to read each line and store the data in one array for the TMS string and one array for the TDI string. If both arrays are full, then we have a pair to output, so output the pair and clear the arrays for the next events. Otherwise, read the next line to get the TDI data:
#!/usr/bin/env perl
use strict;
use warnings;
my (#first, #second);
while (my $elem = <DATA>) {
($elem =~ /^TMS/)
? (#first = read_string($elem))
: (#second = read_string($elem));
if (#second) {
for my $index (0..$#first) {
print "$first[$index]$second[$index]\n";
}
print "\n";
#first = #second = ();
}
}
sub read_string {
my $string = shift;
my #bits = grep {/\d/} split('', $string);
return reverse(#bits);
}
__DATA__
TMS: xxxxxxx11110000
TDI: xxxxxxx00001111
TMS: xxxx00001111
TDI: xxxx11110000
Output from this would be:
01
01
01
01
10
10
10
10
10
10
10
10
01
01
01
01

What you want is a zip operation. Conveniently List::MoreUtils provides one for you.
#x = qw/a b c d/;
#y = qw/1 2 3 4/;
# returns [a, 1], [b, 2], [c, 3], [d, 4]
#z = zip6 #x, #y;
To get the input for zip either put your resultants into an array in the first place, or split your input string.

hobbs answer from Code Golf: Lasers was solving quite a different problem, but part of the solution was about how to "rotate" a multi-line string, and it could be useful here.
First, don't put each bit on its own line, just separate bits from different rows of input on different lines. Put the multi-line string into $_.
$_ = '0000111111110000
1111000000001111';
Now execute the following code:
$_ = do {
my $o;
$o .= "\n" while s/^./$o.=$&,""/meg;
$o };
(the substitution in hobbs's algorithm started with s/.$/.../. By using s/^./.../, it becomes an algorithm for transposition rather than for rotation)
Input:
$_ = '0000111111110000
1111000000001111';
Output:
01
01
01
01
10
10
10
10
10
10
10
10
01
01
01
01
This algorithm easily generalizes to any number of rows and columns in the input.
Input:
$_='ABCDE
12345
FGHIJ
67890';
Output:
A1F6
B2G7
C3H8
D4I9
E5J0

Related

How to create a numeric ruler and evenly spaced numeric width markers

I am new to Perl. An exercise, where I am to create a numeric ruler from which, I size columns for data at 20 characters-width, is proving a little difficult to complete. So far, I have,
printf “%10d” x 5, (1..6);
#ruler = (1..10) x 7;
Print #ruler, “\n”;
It should look something like,
1 2 3 4
1234567890123456789012345678901234567890
What I get for the top row of numbers is an error, ‘Redundant argument in printf at <script.pl> line #; the bottom row produces numbers from 1 to 10, as it ought with the range operator, but I would like it to produce 1 to 9 with a zero on the end. I did think to start the range from 0, but I haven’t figured out how to remove the first index and only the first index.
I would be grateful for your guidance with both issues.
The warning is due to the fact that you pass 6 numbers to printf, but the format only requires 5.
To me,
1 2 3 4
1234567890123456789012345678901234567890
reads as
11, 12, 13, ..., 19, 10, 21, 22, 23, ...
Why does it start with 11? Why is 10 between 19 and 21?
The following makes more sense:
1 2 3 4
01234567890123456789012345678901234567890 0-based
and
1 2 3 4
1234567890123456789012345678901234567890 1-based
I'm not going to give the solution outright.
If you want the numbers 1 to 9 and 0, that would be 1..9, 0.
%10d will add padding on the left. %-10d will add padding on the right.
Nothing says you can't prefix the output with something that doesn't repeat, like a zero or a space.
Provided desired output starts count from 11 instead 1 -- it doesn't look right.
Perhaps OP intended to start count from 1 until some $max value with placing a digit representing tens above main counter.
Please study following code sample for compliance with your requirements.
use strict;
use warnings;
use feature 'say';
my $max = shift || 45;
rule($max);
sub rule {
my $max = shift;
my($a,$b);
$a .= ' ' x 9 . $_ for 1..$max/10;
$b .= $_ % 10 for 1..$max;
say $a . "\n" . $b;
}
Output
1 2 3 4
123456789012345678901234567890123456789012345
Original OP's code requires slight modification to achieve desired output
use strict;
use warnings;
use feature 'say';
my $max = shift || 45;
printf "%10d" x int($max/10) . "\n", (1..$max/10);
print $_ % 10 for 1..$max;
print "\n";

horizontal absolute values of every line

I am trying to calculate the absolute values of line 2 - values of line 1
and then the horizontal absolute values of every line in my input file. Here's a part of that input.
43 402 51 360
63 60 69 63
65 53 89 55
103 138 135 135
109 36 123 38
To be more precise about what im trying to do I made the following example
initial data
0 2 0 0
0 1 1 1
next stage (absolute value after subscription the second line minus the first line)
2 2 0
1 0 0
final stage (horizontal application of abs values until one column remained)
0
1
The below code was a failed attempt to obtain the final stage of the single column. My problem here is that I don't know how to obtain the final (desired) stage by using subroutine, as I believe that it is a better way to solving my problem. Of course, every idea or better approach is welcome.
#!/usr/bin/perl
use feature qw(say);
use strict;
use warnings;
use Data::Dumper;
my #rows = 'table_only_numbers';
open(my $fh, '<:encoding(UTF-8)', $rows)
sub ori {
for ($num_cols=#{ $rows[$r-1]}; $num_cols=1; $num_cols-- ){
my #diff_diffs = map { abs($diffs[$_-1] - $diffs[$_]) } 1..$num_cols-1;
#final=#diff_diffs;
say join ' ',#final;
return (final) }
my $num_cols = #{ $rows[0] };
for my $r (1..$#rows) {
die "Bad format!" if #{ $rows[$r] } != $num_cols;
my #diffs = map { abs($rows[$r-1][$_] - $rows[$r][$_]) } 0..$num_cols-1;
while ($num_cols>1)
{
$final_output = ori(#{ $rows[0] })
say "final_output";
}
}
close $fh;
Finally, I figure it by myself without subroutines!!! Im posting it in case someone face the same issue in the future.I know that it is an easiest way to do it but as I am newbie in Perl it is the easiest way for me.
So I used:
for the first abs of the line 2 minus the line 1
my #data = map { abs($current[$_]-$previous[$_]) } 0..$#current;
push #final, \#data;
To obtain the absolute value of row 2 minus the row 1
And after I used 3 times as I had 3 columns left (in my case) the following coding line and each time I Substituted the #xxx with a new variable. and I have desired output of I column.
foreach my $row (#XXX) {
my #data = map { abs(#{$row}[$_]-#{$row}[$_+1]) } 0..$#{$row}-1;
say join ' ', #data;
push #XXX, \#data;}

Perl: perl regex for extracting values from complex lines

Input log file:
Nservdrx_cycle 4 servdrx4_cycle
HCS_cellinfo_st[10] (type = (LTE { 2}),cell_param_id = (28)
freq_info = (10560),band_ind = (rsrp_rsrq{ -1}),Qoffset1 = (0)
Pcompensation = (0),Qrxlevmin = (-20),cell_id = (7),
agcreserved{3} = ({ 0, 0, 0 }))
channelisation_code1 16/5 { 4} channelisation_code1
sync_ul_info_st_ (availiable_sync_ul_code = (15),uppch_desired_power =
(20),power_ramping_step = (3),max_sync_ul_trans = (8),uppch_position_info =
(0))
trch_type PCH { 7} trch_type8
last_report 0 zeroth bit
I was trying to extract only integer for my above inputs but I am facing some
issue with if the string contain integer at the beginning and at the end
For ( e.g agcreserved{3},HCS_cellinfo_st[10],Qoffset1)
here I don't want to ignore {3},[10] and 1 but in my code it does.
since I was extracting only integer.
Here I have written simple regex for extracting only integer.
MY SIMPLE CODE:
use strict;
use warnings;
my $Ipfile = 'data.txt';
open my $FILE, "<", $Ipfile or die "Couldn't open input file: $!";
my #array;
while(<$FILE>)
{
while ($_ =~ m/( [+-]?\d+ )/xg)
{
push #array, ($1);
}
}
print "#array \n";
output what I am getting for above inputs:
4 4 10 2 28 10560 -1 1 0 0 -20 7 3 0 0 0 1 16 5 4 1 15 20 3 8 0 7 8 0
expected output:
4 2 28 10560 -1 0 0 -20 7 0 0 0 4 15 20 3 8 0 7 0
If some body can help me with explanation ?
You are catching every integer because your regex has no restrictions on which characters can (or can not) come before/after the integer. Remember that the /x modifier only serves to allow whitespace/comments inside your pattern for readability.
Without knowing a bit more about the possible structure of your output data, this modification achieves the desired output:
while ( $_ =~ m! [^[{/\w] ( [+-]?\d+ ) [^/\w]!xg ) {
push #array, ($1);
}
I have added rules before and after the integer to exclude certain characters. So now, we will only capture if:
There is no [, {, /, or word character immediately before the number
There is no / or word character immediately after the number
If your data could have 2-digit numbers in the { N} blocks (e.g. PCH {12}) then this will not capture those and the pattern will need to become much more complex. This solution is therefore quite brittle, without knowing more of the rules about your target data.

Sorting lines according to the numerical value of the first element on each line?

I have just started learning Perl, hence my question might seem very silly. I apologize in advance.
I have a list say #data which contains a list of lines read from the input. The lines contain numbers that are separated by (unknown number of) spaces.
Now, I would like to sort them and print them out, but not in the lexicographical order but according to the numerical value of the first number appearing on the line.
I know this must be something very simple but I cannot figure out how to do it?
Thanks in advance,
You can use a Schwartzian transform, capturing the first number in the row with a regex
use strict;
use warnings;
my #sorted = map $_->[0],
sort { $a->[1] <=> $b->[1] }
map { [ $_, /^(-?[\d.]+)/ ] } <DATA>;
print #sorted;
__DATA__
21 13 14
0 1 2
32 0 4
11 2 3
1 3 3
Output:
0 1 2
1 3 3
11 2 3
21 13 14
32 0 4
Reading the transform from behind, the <DATA> is the file handle we use, it will return a list of the lines in the file. The first map statement returns an array reference [ ... ], that contains the original line, plus the first number that is captured in the line. Alternatively, you can use the regex /^(\S+)/ here, to just capture whatever non-whitespace that comes first. The sort uses this captured number inside the array ref when comparing lines. And finally, the last map converts the array ref back to the original value, stored in $_->[0].
Be aware that this relies on the lines having a number at the start of the line. If that can be missing, or blank, this will have some unforeseen consequences.
Note that only using a simple numerical sort will also "work", because Perl will convert one of your lines to the correct number, assuming each line begins with a number followed by space. You will have some warnings about that, such as Argument "21 13 14\n" isn't numeric in sort. For example, if I replace my code above with
my #foo = sort { $a <=> $b } <DATA>;
I will get the output:
Argument "21 13 14\n" isn't numeric in sort at foo.pl line 6, <DATA> line 5.
Argument "0 1 2\n" isn't numeric in sort at foo.pl line 6, <DATA> line 5.
Argument "32 0 4\n" isn't numeric in sort at foo.pl line 6, <DATA> line 5.
Argument "11 2 3\n" isn't numeric in sort at foo.pl line 6, <DATA> line 5.
Argument "1 3 3\n" isn't numeric in sort at foo.pl line 6, <DATA> line 5.
0 1 2
1 3 3
11 2 3
21 13 14
32 0 4
But as you can see, it has sorted correctly. I would not advice this solution, but it is a nice demonstration in this context, I think.
You can use the sort function :
#sorted_data = sort(#data);

perl add contents of a column of a file

Column A | Column B | Column C | Column D
35627799100 8 8 2
35627788000 60 34 45
35627799200 10 21 21
35627780000 60 5 8
Basically I have a file as shown above and would like to add the contents of Column B i.e 8+60+10+60. To be frank I'm not sure if need to remove the first line being text and if I can use the split function and put it in a hash something along the lines:
my %hash = map {split/\s+/,$_,4} <$file>;
Thanks in advance for the help.
If you just want to sum up the second column, a hash is overkill. You can do something like this and calculate the sum directly in the map.
my $sum;
$sum += (split /\s+/, $_)[1] while <$file>;
Edit: If you have header rows or other rows with non-numeric values in column 2, then as the comments below indicate, you will run into problems. You can avoid this by trading split for a regular expression, like so:
my $sum = 0;
while (<STDIN>)
{
$sum += $1 if $_ =~ /^\S+\s+(\d+)/;
}
If it's possible that column 1 has no text (ie. the line starts with a single blank and the first non-blank represents the second column), then change the first part of the pattern from ^\S+ to ^\S*.
This is an example based on your data:
use strict;
use warnings;
my $sum_column_b = 0;
<DATA>; #drop header
while( my $line = <DATA>) {
$line =~ m/\s+(\d+)/; #regexpr to catch second column values
$sum_column_b += $1;
}
print $sum_column_b, "\n"; #<-- prints: 138
__DATA__
Column A | Column B | Column C | Column D
35627799100 8 8 2
35627788000 60 34 45
35627799200 10 21 21
35627780000 60 5 8