Logical issue in Perl numerically iterating over 10 directories and populating them with numbers? - perl

My basic goal is to take the integers 1 to 100,
and split them into 10 parts, where the initial part is integers 1 to 10,
and the next part is 11 to 20, each part having 10 numbers each.
I want the first part to go into the directory NUMBERS_1
containing the numbers 1 to 10 in a file called FILE_1.txt
and the next part to go into directory NUMBERS_2
containing the numbers 11 to 20 in a text file: FILE_2.txt
and so forth.
How I approached this problem was that I initialized an array with 100 numbers,
and then created an array reference by destructively splicing the array into 10 parts composed of no more than 10 integers each.
Then I created 10 folders NUMBERS_1 to NUMBERS_10 on a for-loop.
As I was doing this for-loop, I created a directory list of all the directories I created.
Then some problem occurs as I iterate over the directories.
I attempted to iterate over the directories in a foreach loop,
opening each directory in the directory list one at a time,
creating a text file, filling it with ten integers, closing the file, and then closing the directory. It doesn't seem like I'm opening the directories, but I'm not getting any error messages, even though I have multiple open or die statements. Shouldn't I be getting some errors?
ISSUE:
My problem is that my 10 text files are being created in my current working directory, not one in each of the 10 directories that I created, and I just can't see the error in my logic.
#!/usr/bin/env perl
# The objective of this program is to split an array of 100 numbers into 10 parts
# and write each part into its own file, ten numbers per file.
use strict;
use warnings;
use feature 'say';
my @numbers = (1 .. 100);
my $partition_size = 10;
my @number_groups = ();
my $num_elements;
my @directories;
my $directory_handle;
my $incrementer; # Incrementing over the directories ... not the same as my use of $i
my $i;
# This right here is very powerful
# I must confess that I received some help from zdim at stackoverflow:
# https://stackoverflow.com/questions/45158306/splitting-an-array-into-n-accessible-parts-within-perl
# splice is destructive so numbers will be empty,
# but at that cost the array reference @number_groups will have 10 sections filled with 10 numbers each from 1 to 100
push @number_groups, [splice(@numbers, 0, $partition_size)] while @numbers;
$num_elements = scalar(@number_groups); # Retrieving size of array reference.
# Why is it not being treated like an array reference, but like an array?;
# Here I'm saying every item of the array reference @number_groups.
say "Let's take a look at this array reference containing each of the pieces of numbers 1 to 100";
say "@$_\n" for @number_groups;
# Now let's make folders containing the numbers 1 to 100, with 10 numbers in each folder.
# And the folders properly labeled.
say "I will now create $num_elements folders";
for(my $i = 1; $i <= $num_elements; $i++)
{
mkdir "NUMBERS_$i" or warn "Could not create folder $_, probably because it already exists";
push @directories, "NUMBERS_$i";
}
# I know that the script is misbehaving somewhere below this line.
$incrementer = 0; # The incrementer is at zero because the first item in the list of directories is the zero-ith item.
$i = 1; # This incrementer "$i" is for the 10 logical slices of the numbers from 1 to 100
foreach(@directories)
{
opendir($directory_handle, $_) or die "Could not open directory $_";
my $file = "FILE_$i.txt"; my $filehandle;
say "\nThe incrementer is at $i in directory $_";
open($filehandle, '>', $file) or die "Could not open file $_";
while(my $line = <$filehandle>)
{
chomp $line;
foreach(@{$number_groups[$incrementer]}){print "$_\t";}
}
close $filehandle or die "Could not close file $_";
closedir($directory_handle) or die "Could not close directory $_";
$incrementer++;
$i++;
}
Running this script produces the following output:
Let's take a look at this array reference containing each of the pieces of numbers 1 to 100
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
I will now create 10 folders
The incrementer is at 1 in directory NUMBERS_1
1 2 3 4 5 6 7 8 9 10
The incrementer is at 2 in directory NUMBERS_2
11 12 13 14 15 16 17 18 19 20
The incrementer is at 3 in directory NUMBERS_3
21 22 23 24 25 26 27 28 29 30
The incrementer is at 4 in directory NUMBERS_4
31 32 33 34 35 36 37 38 39 40
The incrementer is at 5 in directory NUMBERS_5
41 42 43 44 45 46 47 48 49 50
The incrementer is at 6 in directory NUMBERS_6
51 52 53 54 55 56 57 58 59 60
The incrementer is at 7 in directory NUMBERS_7
61 62 63 64 65 66 67 68 69 70
The incrementer is at 8 in directory NUMBERS_8
71 72 73 74 75 76 77 78 79 80
The incrementer is at 9 in directory NUMBERS_9
81 82 83 84 85 86 87 88 89 90
The incrementer is at 10 in directory NUMBERS_10
91 92 93 94 95 96 97 98 99 100
The standard output being sent to the screen is totally logical, but the text files are not in their appropriate directories, and somehow the files are empty.
What's going on?
Thanks!

my $file = "FILE_$i.txt";
should be
my $file = "$_/FILE_$i.txt";
And get rid of that opendir; you never use the handle.
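For comparison, here is a minimal sketch of the same fix in Python rather than the asker's Perl. The helper name `write_groups` and its `base` parameter are hypothetical, chosen only for this illustration. It shows the point made above: the file path must include the directory, because opening a directory handle does not change where a relative file name is created. It also writes the numbers to the file handle; the posted loop prints to standard output instead, which would explain the empty files.

```python
import os
import tempfile

def write_groups(base, numbers, size=10):
    """Write each chunk of `size` numbers to base/NUMBERS_k/FILE_k.txt."""
    paths = []
    # Non-destructive equivalent of the splice loop: slice the list into chunks.
    groups = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    for k, group in enumerate(groups, start=1):
        directory = os.path.join(base, f"NUMBERS_{k}")
        os.makedirs(directory, exist_ok=True)
        # The directory is part of the path, so the file lands inside it.
        path = os.path.join(directory, f"FILE_{k}.txt")
        with open(path, "w") as handle:
            # Write to the file handle, not to standard output.
            handle.write("\t".join(map(str, group)) + "\n")
        paths.append(path)
    return paths

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        for path in write_groups(base, list(range(1, 101))):
            print(path)
```

The sketch uses a temporary directory so that running it leaves nothing behind; pass any existing directory as `base` to keep the files.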


Sorting wrt to a column value in matlab [duplicate]

I have multiple columns in my dataset, and column 2 contains values from 1 to 7. I want to sort my dataset with respect to the second column. Thanks in advance.
The command you need is sortrows.
By default this sorts with respect to the first column, but an additional argument can be used to change this to the 2nd (or 5th, 17th, etc.).
If A is your original array:
B = sortrows(A,2);
will give you the array B sorted w.r.t. the 2nd column.
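As a rough illustration of what sorting whole rows by one column means, here is a small Python sketch (not MATLAB). The function name `sort_rows` and the sample data are hypothetical; it only mirrors the idea of `sortrows(A,2)` and makes no claim about how MATLAB orders tied rows.

```python
def sort_rows(rows, column):
    """Return the rows ordered by the value in the given 1-based column."""
    # Each row moves as a unit; only the chosen column decides the order.
    return sorted(rows, key=lambda row: row[column - 1])

if __name__ == "__main__":
    A = [[95, 3], [23, 1], [60, 7], [48, 2]]  # made-up data for the example
    for row in sort_rows(A, 2):
        print(row)
```

The difference from a per-column sort is that the rows stay intact: the values in the other columns travel with their row.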
What did you mean by sort with respect to second column? You should be more specific or at least give us an example.
If you need a simple sort on each column use the following
A =
95 45 92 41 13 1 84
23 1 73 89 20 74 52
60 82 17 5 19 44 20
48 44 40 35 60 93 67
89 61 93 81 27 46 83
76 79 91 0 19 41 1
Sort each column of A in ascending order:
c = sort(A, 1)
c =
23 1 17 0 13 1 1
48 44 40 5 19 41 20
60 45 73 35 19 44 52
76 61 91 41 20 46 67
89 79 92 81 27 74 83
95 82 93 89 60 93 84

How can I interrupt a 'loop' in kdb?

numb is a list of numbers:
q))input
42 58 74 51 63 23 41 40 43 16 64 29 35 37 30 3 34 33 25 14 4 39 66 49 69 13..
31 41 39 27 9 21 7 25 34 52 60 13 43 71 10 42 19 30 46 50 17 33 44 28 3 62..
15 57 4 55 3 28 14 21 35 29 52 1 50 10 39 70 43 53 46 68 40 27 13 69 20 49..
3 34 11 53 6 5 48 51 39 75 44 32 43 23 30 15 19 62 64 69 38 29 22 70 28 40..
18 30 60 56 12 3 47 46 63 19 59 34 69 65 26 61 50 67 8 71 70 44 39 16 29 45..
I want to iterate through each row and calculate the sum of the first 2, then 3, then 4 numbers, etc. If that sum is greater than 1000, I want to stop the iteration on that particular row, jump to the next row, and do the same thing. This is my code:
{[input]
tot::tot+{[x;y]
if[1000<sum x;:count x;x,y]
}/[input;input]
}each numb
My problem here is that after the count of x is added to tot the over keeps going on the same row. How can I exit over and jump on the next row?
UPDATE: (QUESTION STILL OPEN) I do appreciate all the answers so far but I am not looking for an efficient way to sum the first n numbers. My question is how do I break the over and jump on the next line. I would like to achieve the same thing as with those small scripts:
C++
for (int i = 0; i <= 100; i++) {
    if (i == 50) { printf("for loop exited at: %i ", i); break; }
}
Python
for i in range(100):
    if i == 50:
        print(i);
        break;
R
for(i in 1:100){
if(i == 50){
print(i)
break
}
}
I think this is what you are trying to accomplish.
sum {(x & sums y) ? x}[1000] each input
It takes a cumulative sum of each row and takes an element wise minimum between that sum and the input limit thereby capping the output at the limit like so:
q)(100 & sums 40 43 16 64 29)
40 83 99 100 100
It then uses the ? operator to find the first occurrence of that limit (i.e. the element where the limit was equaled or passed). Because the result is a 0-based index, it equals the number of elements before the limit is reached; in the example the first 100 is at index 3. You might want to add one to include the element that reaches the limit in the count.
q)40 83 99 100 100 ? 100
3
And then it sums this count over all rows of the input.
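The same cap-and-find logic can be sketched in Python to make the steps explicit. This is an illustration of the idea only, not q; the names `count_before_limit` and `total_count` are hypothetical. As with `?` in q, a row that never reaches the limit yields the row length.

```python
from itertools import accumulate

def count_before_limit(row, limit):
    """0-based index of the first running total that reaches `limit`."""
    # Cap every running total at the limit, like `limit & sums row`.
    capped = [min(limit, total) for total in accumulate(row)]
    # Find the first capped value equal to the limit, like `capped ? limit`.
    return capped.index(limit) if limit in capped else len(row)

def total_count(rows, limit):
    """Sum the per-row counts, like the outer `sum ... each input`."""
    return sum(count_before_limit(row, limit) for row in rows)

if __name__ == "__main__":
    print(count_before_limit([40, 43, 16, 64, 29], 100))  # 3, as in the q example
```

Note that this still computes the full running sum for every row; it does not break out of a loop early, it just locates where the limit was crossed.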
You could use converge in this case to exit when a condition is no longer satisfied:
https://code.kx.com/q/ref/adverbs/#converge-repeat
The first parameter would be a function that does your check based on the current value of x, which will be the next value passed to the main function.
For your example I've made a projection using the main input line, then increased the indexes of what I am summing each time:
q)numb
98 11 42 97 89 80 73 35 4 30
86 33 38 86 26 15 83 71 21 22
23 43 41 80 56 11 22 28 47 57
q){[input] {x+1}/[{100>sum (y+1)#x}[input;];0] }each numb
1 1 2
This returns, for each row, the first index at which the running sum is no longer below 100.
However, this isn't really an ideal use case for KDB;
it could instead be done with something like
(sums#/:numb) binr\: 100
Maybe your real example makes more sense.
You can use while loops in KDB although all KDB developers are generally too afraid of being openly mocked and laughed at for doing so
q){i:0;while[i<>50;i+:1];:"loop exited at ",string i}`
"loop exited at 50"
Kdb does have a "stop loop" mechanism but only in the case of a monadic function with single seed value
/keep squaring until number is no longer less than 1000, starting at 2
q){x*x}/[{x<1000};2]
65536
/keep dealing random numbers under 20 until you get an 18 (seed value 0 is irrelevant)
q){first 1?20}\[18<>;0]
0 19 17 12 15 10 18
However this doesn't really fit your use case and as other people have pointed out, this is not how you would/should solve this problem in kdb.

Function readmtx on matlab

I want to read a matrix that is on my MATLAB path. I was using the function readmtx, but I don't know what to pass for 'precision' (mtx = readmtx(fname,nrows,ncols,precision)).
I was wondering if you could help me with that, or suggest a better way to read the matrix.
You can read a matrix from a text file with the load command. If the first line contains text, it should start with %.
Note that each row of the text file should hold the values of one matrix row, separated by spaces, for example:
%C1 C2 C3
1 2 3
4 5 6
7 8 9
Then, if you use the load command, you can read the text file into a matrix, something like:
myMatrix = load('textFileName.txt')
Now, let's talk about readmtx ;)
About precision, as described here:
Both binary and formatted data files can be read. If the file is binary, the precision argument is a format string recognized by fread. Repetition modifiers such as '40*char' are not supported. If the file is formatted, precision is a fscanf and sscanf-style format string of the form '%nX', where n is the number of characters within which the formatted data is found, and X is the conversion character such as 'g' or 'd'. Fortran-style double-precision output such as '0.0D00' can be read using a precision string such as '%nD', where n is the number of characters per element. This is an extension to the C-style format strings accepted by sscanf. Users unfamiliar with C should note that '%d' is preferred over '%i' for formatted integers. MATLAB syntax follows C in interpreting '%i' integers with leading zeros as octal. Formatted files with line endings need to provide the number of trailing bytes per row, which can be 1 for platforms with carriage returns or linefeed (Macintosh, UNIX®), or 2 for platforms with carriage returns and linefeeds (DOS).
Check this example also:
Write and read a binary matrix file:
fid = fopen('binmat','w');
fwrite(fid,1:100,'int16');
fclose(fid);
mtx = readmtx('binmat',10,10,'int16')
mtx =
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
mtx = readmtx('binmat',10,10,'int16',[2 5],3:2:9)
mtx =
13 15 17 19
23 25 27 29
33 35 37 39
43 45 47 49

File::stat returns "No such file or directory"

When I run this program:
#!/usr/bin/perl -w
use File::Find;
use File::stat;
use Time::gmtime;
use Fcntl ':mode';
my %size = ();
my @directory = ('.');
find(
sub {
my $st = stat($File::Find::name) or die "stat failed for $File::Find::name : $!";
if ( defined $st )
{
my $gm = gmtime $st->mtime;
$size{$gm->year + 1900} += $st->blksize unless S_ISDIR($st->mode);
}
else
{
print "stat failed for ", $File::Find::name, ": $!\n";
}
},
@directory);
foreach my $year (keys %size)
{
print "$year ", $size{$year}, "\n";
}
I get stat failed for ./1128/00 : No such file or directory at ./size.pl line 13. But when I list it, it's there:
# ls ./1128/00
03 05 07 09 12 14 18 20 22 24 27 29 32 34 37 40 43 45 47 50 52 54 57 59 63 65 67 69 75 78 81 83 85 88 90 92 95
04 06 08 11 13 15 19 21 23 25 28 31 33 35 39 41 44 46 48 51 53 55 58 61 64 66 68 71 77 79 82 84 86 89 91 93
Based on diagnostics that I have removed for this question, I can see that it does successfully stat the first 4 files and the . directory and 1128 directory (parent to 1128/00). It always successfully stats the same files and directories and fails on 1128/00. Why is it failing?
By default, File::Find will chdir to each directory as it recurses.
Because of this, performing stat on the $File::Find::name value of ./1128/00 is actually looking for the file ./1128/./1128/00, which does not exist.
To get the behavior that you want, simply perform your file operations on the $_ variable.
my $st = stat($_) or die "stat failed for $_: $!";
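The effect of the working-directory change can be reproduced outside Perl. The following Python sketch is only an analogy (Python's own directory walkers do not chdir); the function name `visible_after_chdir` is hypothetical. It builds a `1128/00` directory, changes into `1128`, and checks both spellings of the path.

```python
import os
import tempfile

def visible_after_chdir(top):
    """Create top/1128/00, chdir into top/1128, and test two spellings of the path."""
    os.makedirs(os.path.join(top, "1128", "00"))
    original = os.getcwd()
    try:
        os.chdir(os.path.join(top, "1128"))
        # Plays the role of $File::Find::name: relative to the starting directory.
        full_name = os.path.exists(os.path.join(".", "1128", "00"))
        # Plays the role of $_: relative to the directory we are now in.
        base_name = os.path.exists("00")
    finally:
        os.chdir(original)  # always restore the caller's working directory
    return full_name, base_name

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        print(visible_after_chdir(top))
```

The first value comes out false and the second true, which matches the failure in the question: the longer name is resolved against the new working directory, so it points one level too deep.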

Hex dump parsing in perl

I have a hex dump of a message in a file, which I want to read into an array
so I can perform the decoding logic on it.
I was wondering if there is an easier way to parse a message which looks like this:
37 39 30 35 32 34 35 34 3B 32 31 36 39 33 34 35
3B 32 31 36 39 33 34 36 00 00 01 08 40 00 00 15
6C 71 34 34 73 69 6D 31 5F 33 30 33 31 00 00 00
00 00 01 28 40 00 00 15 74 65 6C 63 6F 72 64 69
74 65 6C 63 6F 72 64 69
Note that there can be at most 16 bytes on any row, but any row can contain fewer bytes too (minimum: 1).
Is there a nice and elegant way to do this in Perl, rather than reading 2 chars at a time?
Perl has a hex operator that performs the decoding logic for you.
hex EXPR
hex
Interprets EXPR as a hex string and returns the corresponding value. (To convert strings that might start with either 0, 0x, or 0b, see oct.) If EXPR is omitted, uses $_.
print hex '0xAf'; # prints '175'
print hex 'aF'; # same
Remember that the default behavior of split chops up a string at whitespace separators, so for example
$ perl -le '$_ = "a b c"; print for split'
a
b
c
For every line of the input, separate it into hex values, convert the values to numbers, and push them onto an array for later processing.
#! /usr/bin/perl
use warnings;
use strict;
my @values;
while (<>) {
push @values => map hex($_), split;
}
# for example
my $sum = 0;
$sum += $_ for @values;
print $sum, "\n";
Sample run:
$ ./sumhex mtanish-input
4196
I would read a line at a time, strip the whitespace, and use pack 'H*' to convert it. It's hard to be more specific without knowing what kind of "decoding logic" you're trying to apply. For example, here's a version that converts each byte to decimal:
while (<>) {
s/\s+//g;
my @bytes = unpack('C*', pack('H*', $_));
print "@bytes\n";
}
Output from your sample file:
55 57 48 53 50 52 53 52 59 50 49 54 57 51 52 53
59 50 49 54 57 51 52 54 0 0 1 8 64 0 0 21
108 113 52 52 115 105 109 49 95 51 48 51 49 0 0 0
0 0 1 40 64 0 0 21 116 101 108 99 111 114 100 105
116 101 108 99 111 114 100 105
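For readers comparing approaches, the same strip-whitespace-then-decode idea can be sketched in Python with the built-in `bytes.fromhex`. This is an illustration, not part of the Perl answer; the function name `parse_hex_dump` is hypothetical.

```python
def parse_hex_dump(text):
    """Return the byte values of a whitespace-separated hex dump as a list of ints."""
    values = []
    for line in text.splitlines():
        # Drop all whitespace, then decode the remaining hex digits pairwise.
        values.extend(bytes.fromhex("".join(line.split())))
    return values

if __name__ == "__main__":
    sample = (
        "37 39 30 35 32 34 35 34 3B 32 31 36 39 33 34 35\n"
        "3B 32 31 36 39 33 34 36 00 00 01 08 40 00 00 15\n"
    )
    print(parse_hex_dump(sample))
```

Rows with fewer than 16 bytes need no special handling, since each line is decoded on its own and the results are appended in order.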
I think reading in two characters at a time is the appropriate way to parse a stream whose logical tokens are two-character units.
Is there some reason you think that's ugly?
If you're trying to extract a particular sequence, you could do that with whitespace-insensitive regular expressions.