What is the recommended way in Julia to create a shared module?

After some experimenting and searching, I figured out two ways of creating a shared module
that holds some constant values.
SCHEME A:
# in file sharedconstants.jl:
module sharedconstants
kelvin = 273.15
end
# -------------------------
# in file main.jl:
include("./sharedconstants.jl");
using .sharedconstants
print(sharedconstants.kelvin, "\n");
# -------------------------
SCHEME B:
# in file sharedconstants.jl:
module sharedconstants
kelvin = 273.15
end
# -------------------------
# in file main.jl:
import sharedconstants
print(sharedconstants.kelvin, "\n");
# -------------------------
Scheme B does not always work; when it fails, it throws
an error saying that sharedconstants cannot be found in the current path. Also, Scheme B
requires the module name (sharedconstants) to match the stem of
the file name. Which of the two schemes is better in terms of
compilation and execution? Is there any other approach to do the job?
I come from Fortran, where I am used to simply writing
use sharedconstants in my code.

For performance reasons this should be a const (BTW, module names conventionally use CamelCase):
module SharedConstants2
const kelvin = 273.15
end
Writing it this way makes it type-stable, which results in a huge performance difference:
julia> @btime sharedconstants.kelvin * 3
18.574 ns (1 allocation: 16 bytes)
819.4499999999999
julia> @btime SharedConstants2.kelvin * 3
0.001 ns (0 allocations: 0 bytes)
819.4499999999999
Regarding the question "where to place it", I would recommend creating a Julia package. Start reading here: https://pkgdocs.julialang.org/v1/creating-packages/
Finally, you might have a look at the PhysicalConstants.jl package https://github.com/JuliaPhysics/PhysicalConstants.jl

How to generate a good seed

I'm looking for a method to generate a good seed for producing different series of random numbers in processes that start at the same time.
I would like to avoid using one of the math or crypto libraries because I'm picking random numbers very frequently and my CPU resources are very limited.
I found a few examples for setting seeds. I tested them using the following method:
Write a short program that picks 100 random numbers out of 5000 options, so each value has a 2% chance of being selected.
Run this program 100 times, so in theory, in a truly random environment, all possible values should be picked at least once.
Count the number of values that were not selected at all.
This is the Perl code I used. In each test I enabled only one method for generating the seed:
#!/usr/bin/perl
#$seed=5432;
#$seed=(time ^ $$);
#$seed=($$ ^ unpack "%L*", `ps axww | gzip -f`);
$seed=(time ^ $$ ^ unpack "%L*", `ps axww | gzip -f`);
srand ($seed);
for ($i=0 ; $i< 100; $i++) {
printf ("%03d \n", rand (5000)+1000);
}
I ran the program 100 times and counted the values NOT selected using:
# run the program 100 times
for i in `seq 0 99`; do /tmp/rand_test.pl ; done > /tmp/list.txt
# test 1000 values (out of 5000). It should be good-enough representation.
for i in `seq 1000 1999`; do echo -n "$i "; grep -c $i /tmp/list.txt; done | grep " 0" | wc -l
The table shows the results of the tests (lower is better):
count  Seed generation method
114    default - the line "srand ($seed);" is commented out
986    constant seed (5432)
122    time ^ $$
125    $$ ^ unpack "%L*", `ps axww | gzip -f`
163    time ^ $$ ^ unpack "%L*", `ps axww | gzip -f`
The constant seed method showed 986 of 1000 values not selected. In other words, only 1.4% of the possible values were selected. This is close enough to the 2% that was expected.
However, I expected that the last option, which was recommended in a few places, would be significantly better than the default.
Is there any better method to generate a seed for each of the processes?

I'm picking random numbers very frequently and my cpu resources are very limited.
You're worrying before you even have made a measurement.
Is there any better method to generate a seed for each of the processes?
Yes. You have to leave user space which is prone to manipulation. Simply use Crypt::URandom.
It is safe for any purpose, including fetching a seed.
It will use the kernel CSPRNG for each operating system (see source code) and hence avoid the problems shown in the article above.
It does not suffer from the documented rand weakness.
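For readers more at home in Python, the same idea (ask the kernel CSPRNG for a few bytes once, then use them to seed a cheap generator) might look like the sketch below. It uses Python's os.urandom rather than Crypt::URandom, and the helper name kernel_seed is made up for illustration.

```python
import os
import random


def kernel_seed(num_bytes: int = 4) -> int:
    """Return an integer seed built from bytes of the kernel CSPRNG."""
    # os.urandom reads from the operating system's random source.
    return int.from_bytes(os.urandom(num_bytes), "big")


# Seed a cheap, non-cryptographic generator once per process.
rng = random.Random(kernel_seed())
picks = [rng.randrange(1000, 6000) for _ in range(100)]
```

The kernel is consulted only once per process, so the cost of the frequent picks afterwards is that of the ordinary generator.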

Don't generate a seed. Let Perl do it for you. Don't call srand (or call it without a parameter if you do).
Quoting the srand documentation:
If srand is not called explicitly, it is called implicitly without a parameter at the first use of the rand operator
and
When called with a parameter, srand uses that for the seed; otherwise it (semi-)randomly chooses a seed.
It doesn't simply use the time as the seed.
$ perl -M5.014 -E'say for srand, srand'
2665271449
1007037147
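As a rough check on the test described in the question, it helps to know what an ideal generator would score. With 100 runs of 100 picks each over 5000 values, the chance that one particular value is never picked is (1 - 1/5000)^10000, roughly e^-2 or about 13.5%, so about 135 of the 1000 tested values should remain unselected even with perfect seeding. The Python sketch below computes that figure and repeats the experiment in a simulation; the function names are made up for illustration, and a single generator stands in for the 100 separately seeded processes.

```python
import random


def expected_unselected(options: int, draws: int, tested: int) -> float:
    """Expected number of tested values never drawn in `draws` uniform picks."""
    return tested * (1 - 1 / options) ** draws


def simulate_unselected(rng: random.Random, runs: int = 100, picks: int = 100) -> int:
    """Mimic the test: `runs` programs, each printing `picks` values in 1000..5999."""
    seen = set()
    for _ in range(runs * picks):
        seen.add(int(rng.random() * 5000) + 1000)
    # Count how many of the 1000 tested values (1000..1999) never appeared.
    return sum(1 for v in range(1000, 2000) if v not in seen)


theory = expected_unselected(options=5000, draws=100 * 100, tested=1000)
observed = simulate_unselected(random.Random(12345))
print(f"expected about {theory:.1f}, simulated {observed}")
```

Seen this way, counts in the range of 114 to 125 are about what a well-seeded generator would give, which fits the advice to leave seeding to Perl.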

Your goal seems to be how to generate random numbers rather than how to generate seeds. In most cases, just use a cryptographic RNG (such as Crypt::URandom in Perl) to generate the random numbers you want, rather than generate seeds for another RNG. (In general, cryptographic RNGs take care of seeding and other issues for you.) You should not use a weaker RNG unless—
the random values you generate aren't involved in information security (e.g., the random values are neither passwords nor nonces nor encryption keys), and
either—
you care about repeatable "randomness" (which is not the case here), or
you have measured the performance of your application and find random number generation to be a performance bottleneck.
Since you will generate random names for the purpose of querying a database, which may be in a remote location, it will be highly unlikely that the random number generation itself will be the performance bottleneck.

running read csv from qpython and assign to table

I would like to run the following q code in python:
table: ("ISI"; enlist ",") 0:`data.csv
I am starting by exploring qpython, as it's easier to use on Windows for now (compared to pyq), and would like to do the following:
q = qconnection.QConnection(host = 'localhost', port = 5000)
q.sync('table: ("ISI"; enlist ",") 0:`data.csv')
Is something like this possible, or do I need to use pyq in the future when it's stable for Windows? The examples I have seen for q.sync are queries and functions that take a list of parameters rather than directly running code in the q environment. I would like to make sure I am not missing some other functionality that I can use for my current task.

When trying to access a file you have to use its file handle which is of the form `:data.csv (notice the colon at the start), instead of a symbol which is what you are using. You can use hsym to turn a symbol into a file handle.
You should also check that the file is in the same working directory as the q process, using \dir in the q process on Windows; otherwise you will need to adapt your file handle to point to the correct location.
q)hsym `data.csv
`:data.csv
With a file data.csv that has contents:
id,sym,val
1,APPL,50
2,GOOG,100
Running the same command that you did but using the file handle:
In: q.sync('table: ("ISI"; enlist ",") 0: `:data.csv')
or
In: q.sync('table: ("ISI"; enlist ",") 0:hsym `data.csv')
Checking the resulting variable using qpython:
In: q.sync('table')
Out: rec.array([(1, b'APPL', 50), (2, b'GOOG', 100)],
dtype=[('id', '<i4'), ('sym', 'S4'), ('val', '<i4')])
Checking in the q process
q)table
id sym val
-----------
1 APPL 50
2 GOOG 100
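If the file name comes from a Python variable, the q statement can be assembled before it is handed to q.sync. The sketch below only builds the string; q_file_handle and q_csv_load are hypothetical helper names, not part of qpython, and paths containing spaces or other special characters are not handled.

```python
def q_file_handle(path: str) -> str:
    """Return a q file handle literal such as `:data.csv for a file path."""
    # A q file handle is a symbol whose text starts with a colon.
    name = path if path.startswith(":") else ":" + path
    return "`" + name


def q_csv_load(table: str, types: str, path: str, delimiter: str = ",") -> str:
    """Build the q statement that loads a CSV with a header row into `table`."""
    return f'{table}: ("{types}"; enlist "{delimiter}") 0: {q_file_handle(path)}'


statement = q_csv_load("table", "ISI", "data.csv")
print(statement)
```

The resulting string can then be passed to q.sync(statement) on the connection opened in the question.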

In Squeak Smalltalk, how can I type a number in a base-250 positional numeral system?

One thing I particularly like about Smalltalk is that it
can do arithmetic on numbers written in different integer
bases. I guess no other language can do the same.
Please see the code below.
Transcript show: 16raf * 32; cr.
Transcript show: 7r21 - 5r32; cr.
The output is
5600
-2
I understand that if the number is hexadecimal (base 16), the letters a to f can
be used as digits. But what if the base I want is 250 and the digit at some position is 60? How can I type that number in Squeak?

Short answer: you cannot type arbitrary numbers for arbitrary bases above 36 without changing the parser.
Longer answer:
You can use an arbitrary base above 36, but you will run into trouble printing and writing numbers that need digits with a value of 36 or more.
You can check all the symbols for a base:
base := 36.
number := 0.
1 to: base - 1 do: [ :i |
number := number * base + i
].
number printStringBase: base.
the above results in the following
'123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
This is also hard-coded when printing in SmallInteger>>printOn:base:length:padded:
Note that for a number that is smaller than the base, printStringBase: will just use ASCII directly.
36 printStringBase: 37 -> '['
But even if you were to remove the hardcoded limitation and use ASCII directly, you wouldn't be helping yourself.
Sooner or later you will need ASCII symbols that have a different meaning in the syntax. For example, following Z (ASCII 90) is [ (ASCII 91), which is used to start a block.
So 37r2[ will be parsed as 37r2, and [ will be reserved for a block (and thus will result in a syntax error).
But otherwise you can use any base
2001rSpaceOdyssey -> 57685915098460127668088707185846682264
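Since no literal syntax can supply 250 distinct digit symbols, one way around the problem is to give the number as a list of digit values and combine them arithmetically, which is the same accumulation as the loop above. The sketch below shows the idea in Python rather than Smalltalk; from_digits and to_digits are illustrative names, not Squeak or library methods.

```python
def from_digits(digits, base):
    """Combine a list of digit values (most significant first) into an integer."""
    value = 0
    for d in digits:
        if not 0 <= d < base:
            raise ValueError(f"digit {d} is out of range for base {base}")
        value = value * base + d  # Horner's rule
    return value


def to_digits(value, base):
    """Split a non-negative integer into its digit values in the given base."""
    digits = []
    while True:
        value, d = divmod(value, base)
        digits.append(d)
        if value == 0:
            break
    return digits[::-1]


n = from_digits([1, 60], 250)  # digit 1 followed by digit 60, in base 250
print(n, to_digits(n, 250))
```

The same arithmetic reproduces the examples from the question: from_digits([10, 15], 16) * 32 gives 5600, and from_digits([2, 1], 7) - from_digits([3, 2], 5) gives -2.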

Find all differences between .mat files

I am looking for a way to list the differences between two .mat files, something that can be useful for many people.
Though I searched everywhere I could think of, I have not found anything that meets my requirements:
Pick 2 mat files
Find the differences
Save them properly
The closest I have come is visdiff. As long as I stay within matlab, it will allow me to browse the differences, but when I save the result it only shows me the top level.
Here is a simplified example of what my files typically look like:
a = 6;
b.c.d = 7;
b.c.e = 'x';
save f1
f = a;
clear a
b.c.e = 'y';
save f2
visdiff('f1.mat','f2.mat')
If I click here on b, I can find the difference. However if I run this and use 'file>save', I am not able to click on b. Thus I still don't know what has been changed.
Note: I don't have Simulink
Hence my question is:
How can I show all differences between 2 mat files to someone without Matlab?
Here are the answers that I personally consider to be most suitable for different situations:
Answer for users with Simulink
General answer
Answer displaying all value differences

Find all differences between mat files without MATLAB?
You can find the differences between HDF5 based .mat files with the HDF5 Tools.
Example
Let me shorten your MATLAB example and assume you create two mat files with
clear ; a = 6 ; b.c = 'hello' ; save -v7.3 f1
clear ; a = 7 ; b.e = 'world' ; save -v7.3 f2
Outside MATLAB use
h5ls -v -r f1.mat
to get a listing of the kind of data included in f1.mat:
Opened "f1.mat" with sec2 driver.
/ Group
Location: 1:96
Links: 1
/a Dataset {1/1, 1/1}
Attribute: MATLAB_class scalar
Type: 6-byte null-terminated ASCII string
Data: "double"
Location: 1:2576
Links: 1
Storage: 8 logical bytes, 8 allocated bytes, 100.00% utilization
Type: native double
/b Group
Attribute: MATLAB_class scalar
Type: 6-byte null-terminated ASCII string
Data: "struct"
Location: 1:800
Links: 1
/b/c Dataset {5/5, 1/1}
Attribute: H5PATH scalar
Type: 2-byte null-terminated ASCII string
Data: "/b"
Attribute: MATLAB_class scalar
Type: 4-byte null-terminated ASCII string
Data: "char"
Attribute: MATLAB_int_decode scalar
Type: native int
Data: 2
Location: 1:1832
Links: 1
Storage: 10 logical bytes, 10 allocated bytes, 100.00% utilization
Type: native unsigned short
Use of
h5ls -d -r f1.mat
returns the values of the stored data:
/ Group
/a Dataset {1, 1}
Data:
(0,0) 6
/b Group
/b/c Dataset {5, 1}
Data:
(0,0) 104, 101, 108, 108, 111
The data 104, 101, 108, 108, 111 represents the word hello, which can be seen with
h5ls -d -r f1.mat | tail -1 | awk '{FS=",";printf("%c%c%c%c%c \n",$2,$3,$4,$5,$6)}'
You can get the same listing for f2.mat and compare the two outputs with the tool of your choice.
Comparison also works directly with HDF5 Tools. To compare the two numbers a from both files use
h5diff -r f1.mat f2.mat /a
which will show you the values and their difference
dataset: </a> and </a>
size: [1x1] [1x1]
position a a difference
------------------------------------------------------------
[ 0 0 ] 6 7 1
1 differences found
attribute: <MATLAB_class of </a>> and <MATLAB_class of </a>>
0 differences found
Remarks
There are a few more commands and options in the HDF5 Tools, which may help to get your real problem solved.
Binary distributions are available for Linux and Windows from The HDF Group. For OS X you can get them installed via MacPorts. If needed there is also a GUI: HDFView.

If you have Simulink you can use Simulink.saveVars to generate an m-file that, upon execution, creates the same variables in the workspace:
a = 6;
b.c.d = 7;
b.c.e = 'x';
Simulink.saveVars('f1');
f = a;
clear a
b.c.e = 'y';
Simulink.saveVars('f2');
visdiff('f1.m','f2.m')
as illustrated in this screenshot
Note that by default it limits the number of elements in arrays to 1000 and you can increase it to 10000. Arrays larger than that limit will be saved in a separate mat-file.
UPDATE: From R2014a a new function similar to Simulink.saveVars has been added to MATLAB. See matlab.io.saveVariablesToScript

This is only part of the answer, but maybe it helps.
You could use gencode, a Matlab function that generates Matlab code from a variable such that running the code reproduces the variable. You do this for all of the variables in each mat-file (takes some programming, but should be doable) and put the results in different .m-files.
Then you use a standard text comparison tool (maybe even visdiff) to compare the .m-files.

There are several good tools for comparing XML files, so I would proceed this way:
Download struct2xml.m
Load both mat-files
Export each with struct2xml
Compare, using XMLSpy or similar

Simple general answer, without displaying value differences
Due to the insight I gained from the answers of @BHF, @Daniel R and @Dennis Jaheruddin, I have managed to find a simple scalable solution:
[fs1, fs2, er] = comp_struct(load('f1.mat'),load('f2.mat'))
Note that it works for .mat files containing an arbitrary number of variables.
This uses the Compare Structures - File Exchange submission.

Answer for small files, displaying all value differences
Based on the suggestion by @A. Donda I have tried to use gencode to create a variable for everything.
Though it works for my toy example, it is quite slow and tells me that I exceed the allowed amount of variables for my real .mat files.
Anyway, for those who are looking for something that works with small files, I will post this option:
wList=who;
for iLoop = 1:numel(wList)
eval(['generated_' wList{iLoop} '= gencode(' wList{iLoop} ');'])
for jLoop = 1:numel(eval(['generated_' wList{iLoop}]))
eval(['generated_' wList{iLoop} '_' num2str(jLoop) '= generated_' wList{iLoop} '(' num2str(jLoop) ');' ])
end
end
Though it may work, I don't feel like this is the best way to go.

General answer, without displaying value differences
Due to the insight I gained from the answers of @BHF and @Daniel R I have managed to find a reasonably scalable solution.
Step 1: Save all variables from each file as a single struct
This uses the Save workspace to struct - File Exchange submission.
Here are the steps to take assuming you want to compare f1.mat and f2.mat:
clear
load f1
myStruct1 = ws2struct;
save myStruct1 myStruct1
clear
load f2
myStruct2 = ws2struct;
save myStruct2 myStruct2
clear
load myStruct1
load myStruct2
Step 2: Compare the structs
This uses the Compare Structures - File Exchange submission
Given that you want to compare myStruct1 and myStruct2 you can simply call:
[fs1, fs2, er] = comp_struct(myStruct1,myStruct2)
I was positively surprised at how readable the list of differences in er is, here is the output for the example that was used in the question:
er =
's2 is missing field a'
's1(1).b(1).c(1).e and s2(1).b(1).c(1).e do not match'
Note that it will not show values. From a technical point of view it is probably not too hard to change the m-file if displaying value differences is desirable. However, especially if there are some big matrices, I suppose this could result in problematic output.
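To illustrate what a comparison that also reports the values could look like, here is a small recursive sketch in Python. It assumes the contents of the two files are already available as nested dictionaries of plain values (how they are loaded is not shown), and diff_nested is a made-up name, not part of comp_struct or any library. Large arrays would need their own handling, for the reason given above.

```python
def diff_nested(first, second, path=""):
    """Return a list of human-readable differences between two nested dicts."""
    differences = []
    for key in sorted(set(first) | set(second)):
        where = f"{path}.{key}" if path else key
        if key not in second:
            differences.append(f"{where}: only in first ({first[key]!r})")
        elif key not in first:
            differences.append(f"{where}: only in second ({second[key]!r})")
        elif isinstance(first[key], dict) and isinstance(second[key], dict):
            # Descend into nested structs and keep track of the field path.
            differences.extend(diff_nested(first[key], second[key], where))
        elif first[key] != second[key]:
            differences.append(f"{where}: {first[key]!r} != {second[key]!r}")
    return differences


# The toy example from the question, written as nested dicts.
f1 = {"a": 6, "b": {"c": {"d": 7, "e": "x"}}}
f2 = {"f": 6, "b": {"c": {"d": 7, "e": "y"}}}
for line in diff_nested(f1, f2):
    print(line)
```

The output is plain text, so it can be saved and shown to someone who does not have Matlab.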

perlre length limit

From man perlre:
The "*" quantifier is equivalent to "{0,}", the "+" quantifier to "{1,}", and the "?" quantifier to "{0,1}". n and m are limited to integral values less than a preset limit defined when perl is built. This is usually 32766 on the most common platforms. The actual limit can be seen in the error message generated by code such as this:
$_ **= $_ , / {$_} / for 2 .. 42;
Ay that's ugly - Isn't there some constant I can get instead?
Edit: As daxim pointed out (and perlretut hints towards) it might be that 32767 is a magical hardcoded number. A little searching in the Perl code goes a long way, but I'm not sure how to get to the next step and actually find out where the default reg_infty or REG_INFTY is actually set:
~/dev/perl-5.12.2
$ grep -ri 'reg_infty.*=' *
regexec.c: if (max != REG_INFTY && ST.count == max)
t/re/pat.t: $::reg_infty = $Config {reg_infty} // 32767;
t/re/pat.t: $::reg_infty_m = $::reg_infty - 1;
t/re/pat.t: $::reg_infty_p = $::reg_infty + 1;
t/re/pat.t: $::reg_infty_m = $::reg_infty_m; # Surpress warning.
Edit 2: DVK is of course right: It's defined at compile time, and can probably be overridden only with REG_INFTY.

Summary: there are 3 ways I can think of to find the limit: empirical, "matching Perl tests" and "theoretical".
Empirical:
eval {$_ **= $_ , / {$_} / for 2 .. 129};
# To be truly portable, the above should ideally loop forever till $@ is true.
$@ =~ /bigger than (-?\d+) /;
print "LIMIT: $1\n";
This seems obvious enough that it doesn't require explanation.
Matches Perl tests:
Perl has a series of tests for regex, some of which (in pat.t) deal with testing this max value. So, you can approximate that the max value computed in those tests is "good enough" and follow the test's logic:
use Config;
$reg_infty = $Config {reg_infty} // 2 ** 15 - 1; # 32767
print "Test-based reg_infinity limit: $reg_infty\n";
The details below explain which part of the tests this is based on.
Theoretical: This is attempting to replicate the EXACT logic used by C code to generate this value.
This is harder than it sounds, because it's affected by two things: the Perl build configuration and a bunch of C #define statements with branching logic. I was able to delve fairly deeply into that logic, but was stalled on two problems. First, the #ifdefs reference a bunch of tokens that are NOT actually defined anywhere in the Perl code that I can find, and I don't know how to find out from within Perl what the values of those defines were. Second, I don't know how to obtain the ultimate default value (assuming I'm right and those #ifdefs always end up with the default) of #define PERL_USHORT_MAX ((unsigned short)~(unsigned)0). (The actual limit is obtained by removing 1 bit from that resulting all-ones number; details below.)
I'm also not sure how to access the amount of bytes in short from Perl for whichever implementation was used to build perl executable.
So, even if the answer to both those questions can be found (which I'm not sure of), the resulting logic would most certainly be "uglier" and more complex than the straightforward "empirical eval-based" one I offered as the first option.
Below I will provide the details of where the various bits and pieces of logic related to this limit live in the Perl code, as well as my attempts to arrive at a "theoretically correct" solution matching the C logic.
OK, here is some investigation part way; you can complete it yourself, as I have to run, or I will complete it later:
From regcomp.c: vFAIL2("Quantifier in {,} bigger than %d", REG_INFTY - 1);
So, the limit is obviously taken from the REG_INFTY define, which is declared in
regcomp.h:
/* XXX fix this description.
Impose a limit of REG_INFTY on various pattern matching operations
to limit stack growth and to avoid "infinite" recursions.
*/
/* The default size for REG_INFTY is I16_MAX, which is the same as
SHORT_MAX (see perl.h). Unfortunately I16 isn't necessarily 16 bits
(see handy.h). On the Cray C90, sizeof(short)==4 and hence I16_MAX is
((1<<31)-1), while on the Cray T90, sizeof(short)==8 and I16_MAX is
((1<<63)-1). To limit stack growth to reasonable sizes, supply a
smaller default.
--Andy Dougherty 11 June 1998
*/
#if SHORTSIZE > 2
# ifndef REG_INFTY
# define REG_INFTY ((1<<15)-1)
# endif
#endif
#ifndef REG_INFTY
# define REG_INFTY I16_MAX
#endif
Please note that SHORTSIZE is overridable via Config - I will leave details of that out but the logic will need to include $Config{shortsize} :)
From handy.h (this doesn't seem to be part of Perl source at first glance so it looks like an iffy step):
#if defined(UINT8_MAX) && defined(INT16_MAX) && defined(INT32_MAX)
#define I16_MAX INT16_MAX
#else
#define I16_MAX PERL_SHORT_MAX
#endif
I could not find ANY place which defined INT16_MAX at all :(
Someone help please!!!
PERL_SHORT_MAX is defined in perl.h:
#ifdef SHORT_MAX
# define PERL_SHORT_MAX ((short)SHORT_MAX)
#else
# ifdef MAXSHORT /* Often used in <values.h> */
# define PERL_SHORT_MAX ((short)MAXSHORT)
# else
# ifdef SHRT_MAX
# define PERL_SHORT_MAX ((short)SHRT_MAX)
# else
# define PERL_SHORT_MAX ((short) (PERL_USHORT_MAX >> 1))
# endif
# endif
#endif
I wasn't able to find any place which defined SHORT_MAX, MAXSHORT or SHRT_MAX so far. So, for now, I assume that the default of ((short) (PERL_USHORT_MAX >> 1)) applies :)
PERL_USHORT_MAX is defined very similarly in perl.h, and again I couldn't find a trace of definition of USHORT_MAX/MAXUSHORT/USHRT_MAX.
Which seems to imply that it's set by default to: #define PERL_USHORT_MAX ((unsigned short)~(unsigned)0). How to extract that value from the Perl side, I have no clue. It's basically the number you get by bitwise negating a 0 and truncating it to an unsigned short, so if unsigned short is 16 bits, then PERL_USHORT_MAX will be 16 ones, and PERL_SHORT_MAX will be 15 ones, i.e. 2^15-1, i.e. 32767.
Also, from t/re/pat.t (regex tests): $::reg_infty = $Config {reg_infty} // 32767; (to illustrate where the non-default compiled in value is stored).
So, to get your constant, you do:
use strict;
use warnings;
use feature 'say';
use Config;
my $shortsize = $Config{shortsize} // 2;
# Approximates PERL_SHORT_MAX from perl.h as the largest signed value that
# fits in a C short of $shortsize bytes. This is an assumption: the real
# value depends on a bunch of never-seen "#define"s that Perl cannot inspect.
sub get_PERL_SHORT_MAX { return 2 ** (8 * $shortsize - 1) - 1 }
my $c_reg_infty = (defined $Config{reg_infty}) ? $Config{reg_infty}
                : ($shortsize > 2)             ? 2**15 - 1  # regcomp.h: (1<<15)-1
                :                                get_PERL_SHORT_MAX();
say "REAL reg_infinity based on C headers: $c_reg_infty";
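The bit arithmetic behind those macros can be checked outside Perl. The sketch below uses Python's ctypes to mirror the all-ones unsigned short, the one-bit shift, and the regcomp.h branch on SHORTSIZE. It is only an illustration: the function names are made up, and it inspects the C types of the platform Python was built for, which is not necessarily how a given perl executable was compiled.

```python
import ctypes


def ushort_max() -> int:
    """Mirror ((unsigned short)~(unsigned)0): all bits set in an unsigned short."""
    # ctypes integer types do no overflow checking, so -1 wraps to all ones.
    return ctypes.c_ushort(~0).value


def short_max() -> int:
    """Mirror (PERL_USHORT_MAX >> 1): drop one bit to get the signed maximum."""
    return ushort_max() >> 1


def reg_infty_default(shortsize: int, i16_max: int) -> int:
    """Follow the regcomp.h branches when REG_INFTY is not overridden."""
    if shortsize > 2:
        return (1 << 15) - 1
    return i16_max


shortsize = ctypes.sizeof(ctypes.c_short)
print(shortsize, ushort_max(), short_max(), reg_infty_default(shortsize, short_max()))
```

On a platform with a 2-byte short, both branches lead to the same value, 32767, which matches the fallback used in t/re/pat.t.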
