Calculating large Catalan numbers modulo a prime

Does anybody know a fast algorithm for calculating a large Catalan number modulo a prime (1,000,000,007) for an input value of about 500,000?
I have already spent quite some time on it, but I wasn't able to modify the standard formula to work with numbers that large, and the dynamic programming algorithm takes too long.
I'd be very grateful for any help :)

Using Sage, an open-source, Python-based program, it takes 11 seconds on my old dual-core 2.80 GHz laptop to compute the residue class of the Catalan number you asked for:
huisman#thom:~$ sage
┌────────────────────────────────────────────────────────────────────┐
│ SageMath Version 6.10, Release Date: 2015-12-18 │
│ Type "notebook()" for the browser-based notebook interface. │
│ Type "help()" for help. │
└────────────────────────────────────────────────────────────────────┘
sage: time catalan_number(5*10^5).mod(10^9+7)
CPU times: user 11.3 s, sys: 0 ns, total: 11.3 s
Wall time: 11.3 s
213941567
By inspecting the code you can get an idea of how to do it so fast.
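For comparison, here is a minimal Python sketch of a different approach from the Sage call above: instead of building the exact Catalan number and reducing it afterwards, it keeps every intermediate product reduced modulo the prime and divides by multiplying with a modular inverse (Fermat's little theorem). It uses the identity C_n = (2n)! / ((n+1)! n!), and it assumes 2n is smaller than the prime, which holds for n = 500,000 and p = 1,000,000,007. The function name `catalan_mod` is made up for this illustration.

```python
MOD = 10**9 + 7  # the prime from the question


def catalan_mod(n: int, p: int = MOD) -> int:
    """Return the n-th Catalan number modulo the prime p.

    Assumes 2n < p, so that none of the factors is divisible by p
    and n! has a modular inverse.
    """
    if n < 0 or 2 * n >= p:
        raise ValueError("this sketch requires 0 <= n and 2n < p")

    # (2n)! / (n+1)! is the product (n+2) * (n+3) * ... * (2n).
    numerator = 1
    for i in range(n + 2, 2 * n + 1):
        numerator = numerator * i % p

    # n! reduced modulo p.
    denominator = 1
    for i in range(2, n + 1):
        denominator = denominator * i % p

    # Fermat's little theorem: a^(p-2) is the inverse of a modulo a prime p.
    return numerator * pow(denominator, p - 2, p) % p


if __name__ == "__main__":
    print([catalan_mod(n) for n in range(6)])
```

The loop does about 2n modular multiplications, so it stays linear in n and never handles integers larger than p squared.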

Register address in ADXL345 digital accelerometer

I am confused by the registers present in the ADXL345 digital accelerometer.
The first thing which confuses me is where I have to write the data to set resolution for +/-2g. I didn't find any mention of this register in the datasheet.
Secondly, there are two registers in which the measurement value for the X axis is stored. How do I read that data from both registers? Do I need to send the address of the register at the same time, or what?
The first thing which confuses me is where I have to write the data to set resolution for +/-2g. I didn't find any mention of this register in the datasheet.
You'll find this information on page 26 of the data sheet (at least, in Rev. E of the data sheet). The range is controlled by bits 0 and 1 in register 0x31 (DATA_FORMAT).
Register 0x31—DATA_FORMAT (Read/Write)
The DATA_FORMAT register controls the presentation of data
to Register 0x32 through Register 0x37. All data, except that for
the ±16 g range, must be clipped to avoid rollover.
SELF_TEST Bit
A setting of 1 in the SELF_TEST bit applies a self-test force to
the sensor, causing a shift in the output data. A value of 0 disables
the self-test force.
SPI Bit
A value of 1 in the SPI bit sets the device to 3-wire SPI mode,
and a value of 0 sets the device to 4-wire SPI mode.
INT_INVERT Bit
A value of 0 in the INT_INVERT bit sets the interrupts to active
high, and a value of 1 sets the interrupts to active low.
FULL_RES Bit
When this bit is set to a value of 1, the device is in full resolution
mode, where the output resolution increases with the g range
set by the range bits to maintain a 4 mg/LSB scale factor. When
the FULL_RES bit is set to 0, the device is in 10-bit mode, and
the range bits determine the maximum g range and scale factor.
Justify Bit
A setting of 1 in the justify bit selects left-justified (MSB) mode,
and a setting of 0 selects right-justified mode with sign extension.
Range Bits
These bits set the g range as described in Table 21.
Table 21. g Range Setting
╔═════════╦══════════╗
║ Setting ║ ║
╠════╦════╣ g Range ║
║ D1 ║ D0 ║ ║
╠════╬════╬══════════╣
║ 0 ║ 0 ║ +/- 2 g ║
╠════╬════╬══════════╣
║ 0 ║ 1 ║ +/- 4 g ║
╠════╬════╬══════════╣
║ 1 ║ 0 ║ +/- 8 g ║
╠════╬════╬══════════╣
║ 1 ║ 1 ║ +/- 16 g ║
╚════╩════╩══════════╝
So, what you'll want to do is read the current value of register 0x31, mask off bits 0 and 1, set the value you want (as per Table 21), and then write the new value to register 0x31.
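The read-modify-write step can be sketched in Python as a pure function on the register value. The bus access itself (SPI or I2C) is left out because it depends on your platform; the names `with_range` and `RANGE_BITS` are invented for this example, and the bit values are taken from Table 21 above.

```python
DATA_FORMAT = 0x31  # register address given in the data sheet
RANGE_MASK = 0b11   # range bits D1 and D0

# g range -> (D1, D0) setting, as listed in Table 21
RANGE_BITS = {2: 0b00, 4: 0b01, 8: 0b10, 16: 0b11}


def with_range(current: int, g_range: int) -> int:
    """Return the DATA_FORMAT value with only the range bits changed."""
    cleared = current & ~RANGE_MASK & 0xFF  # mask off bits 0 and 1
    return cleared | RANGE_BITS[g_range]    # set the requested range


if __name__ == "__main__":
    # Example: FULL_RES (bit 3) set and range +/-16 g, switched to +/-2 g.
    print(hex(with_range(0x0B, 2)))
```

You would read register 0x31, pass the value through `with_range`, and write the result back, so the other bits (SELF_TEST, SPI, INT_INVERT, FULL_RES, Justify) stay as they were.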
Secondly, there are two registers in which the measurement value for the X axis is stored. How do I read that data from both registers? Do I need to send the address of the register at the same time, or what?
No, you read each register sequentially.
Register 0x32 holds the least-significant bits of the x-axis value, and register 0x33 holds the most-significant bits of the x-axis value. Together, they combine to an x-axis reading with 13 (maximum) bits of precision, in two's-complement format. If you only needed 8 bits of precision, you could read only the MSB from register 0x33, which would be slightly faster than reading both registers.
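Combining the two bytes might look like the following Python sketch. It assumes the default right-justified mode with sign extension (justify bit = 0), so the two registers together can be treated as one little-endian signed 16-bit value; `combine_axis` is a name made up for this example.

```python
def combine_axis(lsb: int, msb: int) -> int:
    """Combine the low byte (0x32) and high byte (0x33) into a signed reading."""
    # Little-endian: the low byte comes first. signed=True applies two's complement.
    return int.from_bytes(bytes([lsb, msb]), byteorder="little", signed=True)


if __name__ == "__main__":
    print(combine_axis(0x00, 0x01))
    print(combine_axis(0x9C, 0xFF))
```

The result is in raw LSB counts; converting it to g requires the scale factor for the range and resolution mode you configured.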
The data sheet does make one additional recommendation that you should pay attention to:
It is recommended that a multiple-byte read of all registers be performed to prevent a change in data between reads of sequential registers.
How exactly you do a multiple-byte read varies, depending on whether you're using the SPI or I2C bus, but either way, it is described in the data sheet. For SPI:
To read or write multiple bytes in a single transmission, the
multiple-byte bit, located after the R/W bit in the first byte transfer
(MB in Figure 37 to Figure 39), must be set. After the register
addressing and the first byte of data, each subsequent set of clock
pulses (eight clock pulses) causes the ADXL345 to point to the
next register for a read or write. This shifting continues until the
clock pulses cease and CS is deasserted. To perform reads or writes
on different, nonsequential registers, CS must be deasserted
between transmissions and the new register must be addressed
separately.
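As a rough illustration of the SPI case, this Python sketch builds the first byte of a transfer. It assumes the layout shown in the data sheet's SPI timing figures: the read flag in bit 7, the MB bit directly after it in bit 6, and the register address in the low six bits. The helper name `spi_read_command` is hypothetical, and the actual transfer call depends on your SPI library.

```python
READ_BIT = 0x80        # bit 7: 1 = read, 0 = write (assumed layout)
MULTI_BYTE_BIT = 0x40  # bit 6: MB, set for multiple-byte transfers
ADDRESS_MASK = 0x3F    # bits 5..0: register address

DATAX0 = 0x32          # first of the six axis data registers (0x32 to 0x37)


def spi_read_command(register: int, count: int) -> int:
    """Return the first byte to clock out when reading `count` registers."""
    command = READ_BIT | (register & ADDRESS_MASK)
    if count > 1:
        command |= MULTI_BYTE_BIT
    return command


if __name__ == "__main__":
    # Read X, Y and Z (six bytes) in one transmission, starting at 0x32.
    print(hex(spi_read_command(DATAX0, 6)))
```

After sending this byte you keep CS asserted and clock out six more bytes to receive registers 0x32 through 0x37 in order.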

Mosek memory issue for large linear programming

I use MOSEK to solve a very large linear programming problem in Matlab (32768 unknowns and 691621 constraints).
The code is submitted to a Linux-based cluster.
In the bash file I request the following amount of memory:
#$ -l h_vmem=20G
#$ -l tmem=20G
but get Mosek error: MSK_RES_ERR_SPACE (Out of space.)
I could request more memory (however, it is unclear how much more?), but this would mean queuing in the cluster for a long time.
Hence, I was wondering whether I can try to ameliorate the issue in some other way.
1) Quoting from some MOSEK FAQs:
Java, .NET, and Python applications run under a virtual machine. MOSEK shares memory
with the virtual machine. This implies it might be necessary to force the virtual machine to
free unused memory by explicitly calling the garbage collector (for example before optimization
is performed) in order to make sufficient memory available to MOSEK.
Can this advice be useful? What does calling the garbage collector mean (i.e., which line should I add to my Matlab code)?
2) From https://docs.mosek.com/9.2/pythonapi/guidelines-optimizer.html (even though this is for Python), it suggests setting
Task.putmaxnumvar. Estimate for the number of variables.
Task.putmaxnumcon. Estimate for the number of constraints.
Task.putmaxnumcone. Estimate for the number of cones.
Task.putmaxnumbarvar. Estimate for the number of semidefinite matrix variables.
Task.putmaxnumanz. Estimate for the number of non-zeros in A.
Task.putmaxnumqnz. Estimate for the number of non-zeros in the quadratic terms.
Can I do that in Matlab? How?
3) From http://ask.cvxr.com/t/how-to-deal-with-out-of-space-error-when-using-mosek-to-solve-a-conic-optimization-problem/7510: "It will reduce memory consumption to some extent if you run on 1 thread (set MSK_IPAR_NUM_THREADS to 1 in cvx solver options or set MSK_IPAR_INTPNT_MULTI_THREAD to 0)"
Can this be done in Matlab as well? I have tried
param_MOSEK.MSK_IPAR_NUM_THREADS = 1;
param_MOSEK.MSK_IPAR_INTPNT_MULTI_THREAD = 'MSK_OFF';
but it does not seem to work as the output file still gives
Optimizer - threads : 16
Optimizer - solved problem : the dual
...
Comments in response to the questions below:
The code runs on my 64-bit macOS machine using 16 threads in just 180 sec. The
computer has 32 GB of 2667 MHz DDR4 memory. The run uses much less than 20G (around 9G).
The code fails when it is run on my university's Linux-based cluster after having requested 20G of vmem and tmem. On the cluster, MOSEK executes the presolve and the GP based matrix reordering, and then fails. This is a typical log file:
Wed 9 Sep 08:10:47 BST 2020
Task ID is 6
< M A T L A B (R) >
Copyright 1984-2019 The MathWorks, Inc.
R2019b Update 3 (9.7.0.1261785) 64-bit (glnxa64)
November 27, 2019
For online documentation, see https://www.mathworks.com/support
For product information, visit www.mathworks.com.
MOSEK Version 9.2.5 (Build date: 2020-4-22 22:56:56)
Copyright (c) MOSEK ApS, Denmark. WWW: mosek.com
Platform: Linux/64-X86
Problem
Name :
Objective sense : min
Type : LO (linear optimization problem)
Constraints : 691597
Cones : 0
Scalar variables : 32768
Matrix variables : 0
Integer variables : 0
Optimizer started.
Presolve started.
Linear dependency checker started.
Linear dependency checker terminated.
Eliminator started.
Freed constraints in eliminator : 0
Eliminator terminated.
Eliminator - tries : 1 time : 0.00
Lin. dep. - tries : 1 time : 0.33
Lin. dep. - number : 0
Presolve terminated. Time: 2.99
GP based matrix reordering started.
GP based matrix reordering terminated.
Optimizer terminated. Time: 20.15
Interior-point solution summary
Problem status : UNKNOWN
Solution status : UNKNOWN
Primal. obj: 0.0000000000e+00 nrm: 1e+00 Viol. con: 1e+00 var: 0e+00
Dual. obj: 0.0000000000e+00 nrm: 0e+00 Viol. con: 0e+00 var: 0e+00
Optimizer summary
Optimizer - time: 20.15
Interior-point - iterations : 0 time: 19.95
Basis identification - time: 0.00
Primal - iterations : 0 time: 0.00
Dual - iterations : 0 time: 0.00
Clean primal - iterations : 0 time: 0.00
Clean dual - iterations : 0 time: 0.00
Simplex - time: 0.00
Primal simplex - iterations : 0 time: 0.00
Dual simplex - iterations : 0 time: 0.00
Mixed integer - relaxations: 0 time: 0.00
Mosek error: MSK_RES_ERR_SPACE (Out of space.)
1) Irrelevant in Matlab.
2) Irrelevant and impossible in Matlab. The MEX interface feeds the problem into Mosek in one go and takes care of all allocations itself.
3) For MSK_IPAR_NUM_THREADS to be respected you must restart the whole process, i.e. Matlab. See https://docs.mosek.com/9.2/faq/faq.html#mosek-is-ignoring-the-limit-on-the-number-of-threads. However, when you set MSK_IPAR_INTPNT_MULTI_THREAD = 'MSK_OFF', Mosek will use 1 thread regardless of the number of threads available, i.e. the number printed to the log is just an upper bound. You should be able to see in the task manager, top, or any other CPU load tracker that only 1 CPU is in use.
The basic question is: have you tried to run the problem without any memory limits to see if it works at all and to estimate the memory consumption? Does it run on other machines?

Multiple sequence alignment of 12 species

I need to perform a multiple sequence alignment (MSA) on nucleotide sequences of 12 wheat varieties. The sequences all have different lengths in base pairs (bp). I followed the MATLAB documentation at http://www.mathworks.in/help/bioinfo/ref/multialign.html, but when I type
ma = multialign(p53,tree,'ScoringMatrix',...
{'pam150','pam200','pam250'})
showalignment(ma)
I get this error:
??? Out of memory. Type HELP MEMORY for
your options.
Error in ==> profalign>affinegap at 648
F = zeros(n+1,m+1,numStates);
Error in ==> profalign at 426
[F, pointer] =
affinegap(prof1,len1,prof2,len2,SM,go1,go2,ge1,ge2,wg1,wg2);
Error in ==> multialign at 655
[profs{rootInd} h1 h2] =
profalign(profs{[i,rootInd]},...
Please help
This is a hard problem to debug, because it is highly dependent on your specific settings. As mentioned in the comments, Matlab is saying that it ran out of memory. This might be because of the way you have Matlab configured or because your computer doesn't have enough RAM (or maybe you were using too much RAM for other things at the time). It's also possible that you just gave it more data than it can handle. However, assuming that the sequences aren't unreasonably long, 12 sequences should be pretty manageable for a progressive alignment algorithm, which multialign seems to be.
Given all of those variables, the simplest solution is just to avoid trying to run it on your computer. There are websites where you can submit your data to be aligned on a server that will definitely have sufficient RAM. The most popular of these websites is Clustal Omega, the successor to ClustalW. These sites will generally return results fairly quickly.

To find execution time on a multi-core machine

I am preparing for a competitive exam and I have an operating systems question.
I don't see how to solve it. Please help me out.
Q-)
A program took 160 seconds to execute on a single processor but only 64 seconds on a
4 core multicore. What is the best estimate for the execution time on a 64 core machine?
I don't think this is strictly relevant to programming (you might find this more relevant on the Math StackExchange), but I'll attempt to answer it anyway.
The answer will depend entirely on how you model execution time vs. number of cores. You could model the execution time as a constant plus a term inversely proportional to the number of cores. For example, I used the following model:
t = c + k/n
where t is time in seconds, n is the number of cores, and c (which could represent overhead) and k (a factor) are constants.
Solve simultaneously
160 = c + k/1
64 = c + k/4
to get k = 128 and c = 32.
Then just substitute n = 64:
t = 32 + 128/64 = 34
So, you get 34 seconds according to this model. Of course, since you don't know the exact model, this can only be an educated guess.
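The same calculation as a short Python sketch, assuming the model has the form t = c + k/n (a fixed part plus a part that shrinks with the number of cores), which is consistent with k = 128 and c = 32. The function names `fit_model` and `predict` are made up for this example.

```python
def fit_model(n1: float, t1: float, n2: float, t2: float) -> tuple[float, float]:
    """Solve t = c + k/n for (c, k) from two measurements (n1, t1) and (n2, t2)."""
    # Subtracting the two equations eliminates c: t1 - t2 = k * (1/n1 - 1/n2).
    k = (t1 - t2) / (1 / n1 - 1 / n2)
    c = t1 - k / n1
    return c, k


def predict(c: float, k: float, n: float) -> float:
    """Estimated execution time on n cores under the same model."""
    return c + k / n


if __name__ == "__main__":
    c, k = fit_model(1, 160, 4, 64)
    print(c, k, predict(c, k, 64))
```

Under this model the time can never drop below c, no matter how many cores are added, which is why going from 4 to 64 cores only saves 30 seconds.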

Perl module to convert between MB/GB/TB without converting to bytes first?

I'm trying to calculate the free space on an LVM physical volume by multiplying the number of free physical extents by the extent size, for example:
3623365 free extents * 4.00 MB each = 13.8 TB
I was using Number::Format to convert the extent size to bytes and convert the results of the multiplication back to a human-readable string, but TB and higher are not supported, so I end up with the longer, less readable 14,153.8 GB.
According to the docs, the reason TB and up are not supported is because of integer overflows on 32-bit systems, which made me wonder if I should even be multiplying arbitrary large numbers without using something like Math::BigInt. I see that Number::Bytes::Human supports numbers up to YB (yottabytes), but it's still in alpha so I hesitate to use it in production code.
My next thought was, why even convert to bytes in the first place when I could calculate the free space in MB and then convert to TB? Unfortunately, it seems like neither Number::Format nor Number::Bytes::Human supports conversions from one "suffix" to another, e.g. MB -> TB. Is there an existing module that does this? I really like how Number::Format and Number::Bytes::Human handle both SI/non-SI units (MB vs. MiB), allow you to set the precision, etc. and so hesitate to roll my own solution if a similarly full-featured module already does it.
Edit: The extent size is not always in MB, nor is the free space always in TB, so I am not asking how to convert from MB to TB (that would be trivial). I am asking if there are any existing modules that can convert from one [arbitrary] suffix to another without converting to bytes first.
To convert from MB to TB without going through bytes:
Number of TB = Number of MB * (Bytes in 1 MB / Bytes in 1 TB)
UPDATE:
To generalize:
Number of new units = Number of old units * (Bytes in 1 new unit's worth of old units inverted, i.e. Bytes in 1 old unit / Bytes in 1 new unit)
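Here is a small sketch of the idea, written in Python rather than Perl purely for illustration. It scales by the ratio (bytes in 1 old unit / bytes in 1 new unit), expressed as a power of the base, so the value is never expanded to a byte count. The prefix table and the `convert` function are invented for this example and are not part of Number::Format or Number::Bytes::Human.

```python
# Prefixes in increasing order; the position in the list is the power of the base.
PREFIXES = ["", "K", "M", "G", "T", "P", "E", "Z", "Y"]


def convert(value: float, old: str, new: str, base: int = 1024) -> float:
    """Convert between size prefixes, e.g. "M" -> "T", without going via bytes.

    Use base=1024 for binary units (MiB, TiB) and base=1000 for SI units.
    """
    # (bytes in 1 old unit) / (bytes in 1 new unit) == base ** (old_power - new_power)
    steps = PREFIXES.index(old) - PREFIXES.index(new)
    return value * base ** steps


if __name__ == "__main__":
    free_mb = 3623365 * 4  # free extents * extent size in MB, from the question
    print(round(convert(free_mb, "M", "T"), 1))
```

Because only the difference in prefix powers is used, the intermediate numbers stay small even for very large volumes.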