Polynomially larger - confusion - polynomial-math

I am studying for my algorithms class. I have a question in context to the Masters theorem:
How is n.log2(n) polynomially larger than n^(log4(3))
(log2(x) = log to the base 2 of x
log4(x) = log to the base 4 of x)
(Note: This is a solved problem on page.95 of 'Introduction to Algorithms' by Cormen et.al.)

log4(3) is less than 1, and so n^(log4(3)) is less than n^1 = n, which is less than n*log2(n).

Related

pinv(H) is not equal to pinv(H'*H)*H'

I'm testing the y = SinC(x) function with single hidden layer feedforward neural networks (SLFNs) with 20 neurons.
With a SLFN, in the output layer, the output weight(OW) can be described by
OW = pinv(H)*T
after adding regularized parameter gamma, which
OW = pinv(I/gamma+H'*H)*H'*T
with
gamma -> Inf, pinv(H'*H)*H'*T == pinv(H)*T, also pinv(H'*H)*H' == pinv(H).
But when I try to calculate pinv(H'*H)*H' and pinv(H), I find a huge difference between these two when neurons number is over 5 (under 5, they are equal or almost the same).
For example, when H is 10*10 matrix, cond(H) = 21137561386980.3, rank(H) = 10,
H = [0.736251410036783 0.499731137079796 0.450233920602169 0.296610970576716 0.369359425954153 0.505556211442208 0.502934880027889 0.364904559142718 0.253349959726753 0.298697900877265;
0.724064281864009 0.521667364351399 0.435944895257239 0.337878535128756 0.364906002569385 0.496504064726699 0.492798607017131 0.390656915261343 0.289981152837390 0.307212326718916;
0.711534656474153 0.543520341487420 0.421761457948049 0.381771374416867 0.360475582262355 0.487454209236671 0.482668250979627 0.417033287703137 0.329570921359082 0.315860145366824;
0.698672860220896 0.565207057974387 0.407705930918082 0.427683127210120 0.356068794706095 0.478412571446765 0.472552121296395 0.443893207685379 0.371735862991355 0.324637323886021;
0.685491077062637 0.586647027111176 0.393799811411985 0.474875155650945 0.351686254239637 0.469385056318048 0.462458480695760 0.471085139463084 0.415948455902421 0.333539494486324;
0.672003357663056 0.607763454504209 0.380063647372632 0.522520267708374 0.347328559602877 0.460377531907542 0.452395518357816 0.498449772544129 0.461556360076788 0.342561958147251;
0.658225608290477 0.628484290731116 0.366516925684188 0.569759064961507 0.342996293691614 0.451395814182317 0.442371323528726 0.525823695636816 0.507817005881821 0.351699689941632;
0.644175558300583 0.648743139215935 0.353177974096445 0.615761051907079 0.338690023332811 0.442445652121229 0.432393859824045 0.553043275759248 0.553944175102542 0.360947346089454;
0.629872705346690 0.668479997764613 0.340063877672496 0.659781468051379 0.334410299080102 0.433532713184646 0.422470940392161 0.579948548513999 0.599160649563718 0.370299272759337;
0.615338237874436 0.687641820315375 0.327190410302607 0.701205860709835 0.330157655029498 0.424662569229062 0.412610204098877 0.606386924575225 0.642749594844498 0.379749516620049];
T=[-0.806458764562879 -0.251682808380338 -0.834815868451399 -0.750626822371170 0.877733363571576 1 -0.626938984683970 -0.767558933097629 -0.921811074815239 -1]';
There is a huge difference between pinv(H'*H)*H*T and pinv(H)*T, where
pinv(H'*H)*H*T = [-4803.39093243484 3567.08623820149 668.037919243849 5975.10699147077
1709.31211566970 -1328.53407325092 -1844.57938928594 -22511.9388736373
-2377.63048959478 31688.5125271114]';
pinv(H)*T = [-19780274164.6438 -3619388884.32672 -76363206688.3469 16455234.9229156
-135982025652.153 -93890161354.8417 283696409214.039 193801203.735488
-18829106.6110445 19064848675.0189]'.
I also find that if I round H , round(H,2), pinv(H'*H)*H*T and pinv(H)*T return the same answer. So I guess one of the reason might be the float calculation issue inside the matlab.
But since cond(H) is large, any small change of H may result in large difference in the inverse of H. I think the round function may not be a good option to test. As Cris Luengo mentioned, with large cond,the numerical imprecision will affect the accuracy of inverse.
In my test, I use 1000 training samples Input:[-10,10], with noise between [-0.2,0.2], and test samples are noise free. 20 neurons are selected. The OW = pinv(H)*Tcan give reasonable results for SinC training, while the performance for OW = pinv(H'*H)*T is worse. Then I try to increase the precision of H'*H by pinv(vpa(H'*H)), there's no significant improvement.
Does anyone know how to solve this?
After some research, the answer is that ELM is very sentive to scaling and activation function.
Please refer to this paper for details: https://dl.acm.org/citation.cfm?id=2797143.2797161
And paper: https://ieeexplore.ieee.org/document/8533625 demonstrated a noval algorithm to improve the perforamance of ELM for scaling.

Challenging Algorithmic Modular Multiples

While practicing coding problems, I ran into this challenging problem:
Say I have a 7-variable equation, (A+B+C+C+D+B)(E+F+B+C)(G+F+F), and a huge list with up to 3500 numbers:
A 1, 2, 3; B 1, 2; C 9, 1; D 1; E 2; F 1; G 1
This list basically gives all the possible values that each variable can have. The question is, how many ways can I choose values of variables such that the resulting number is a multiple of seven?
For example, I could choose from the above list
A = 1, B = 1, C = 1, D = 1, E = 2, F = 1, G = 1. This would make the number (1+2+1+1+1+1)(2+1+1+1)(1+1+1) = 35, which is indeed a multiple of seven.
My solution was to test every possible combination of the seven variables, and check if that sum was a multiple of seven. However, that solution is obviously very slow. Does anyone have a more efficient solution for this problem?
while if the first number is a multiple of 7 than no matter what you times it by it well remain a multiple of 7
so (E+F+B+C)(G+F+F) is completely irrelevant.
if (E+F+B+C) is a multiple of 7 then the total is also a multiple of 7 same with (G+F+F).
If none of them are multiples of 7 than I don't think the answer can be a multiple of 7. At least, I cannot think of a case where this could happen.
ex no amount of 10s will ever be devisible by 7 except say 7 or 14 10s

Martin Odersky : Working hard to keep it simple

I was watching the talk given by Martin Odersky as recommended by himself in the coursera scala course and I am quite curious about one aspect of it
var x = 0
async { x = x + 1 }
async { x = x * 2 }
So I get that it can give 2 if if the first statement gets executed first and then the second one:
x = 0;
x = x + 1;
x = x * 2; // this will be x = 2
I get how it can give 1 :
x = 0;
x = x * 2;
x = x + 1 // this will be x = 1
However how can it result 0? Is it possible that the statement don't execute at all?
Sorry for such an easy question but I'm really stuck at it
You need to think about interleaved execution. Remember that the CPU needs to read the value of x before it can work on it. So imagine the following:
Thread 1 reads x (reading 0)
Thread 2 reads x (reading 0)
Thread 1 writes x + 1 (writing 1)
Thread 2 writes x * 2 (writing 0)
I know this has already been answered, but maybe this is still useful:
Think of it as a sequence of atomic operations. The processor is doing one atomic operation at a time.
Here we have the following:
Read x
Write x
Add 1
Multiply 2
The following two sequences are guaranteed to happen in this order "within themselves":
Read x, Add 1, Write x
Read x, Multiply 2, Write x
However, if you are executing them in parallel, the time of execution of each atomic operation relative to any other atomic operation in the other sequence is random i.e. these two sequences interleave.
One of the possible order of execution will produce 0 which is given in the answer by Paul Butcher
Here is an illustration I found on the internet:
Each blue/purple block is one atomic operation, you can see how you can have different results based on the order of the blocks
To solve this problem you can use the keyword "synchronized"
My understanding is that if you mark two blocks of code (e.g. two methods) with synchronized within the same object then each block will own the lock of that object while being executed, so that the other block cannot be executed while the first hasn't finished yet. However, if you have two synchronised blocks in two different objects then they can execute in parallel.

Find the largest x for which x^b+a = a

Stability (Numerical analysis)
Trying to apply the answer I saw in this question, a+x=a worked just fine with a+eps(a)/2. Suppose we have x^b+a=a, where b is a small integer, say 3 and a=2000. Then a+(eps(a))^3 or a+(eps(a)/2)^3 will always return number a. Can someone help with the measurement of x? Any way, even different from eps will do just fine.
p.s. 1938+(eps(1938)/0.00000000469)^3 is the last number that returns ans = 1.9380e+003.
1938+(eps(1938)/0.0000000047)^3 returns a=1938. Does that have to do with anything?
x = (eps(a)/2).^(1/(b-eps(a)/2))
if b = 3,
(eps(1938)/2).^(1/(3-eps(1938)/2)) > eps(1938)/0.0000000047

How to select the last column of numbers from a table created by FoldList in Mathematica

I am new to Mathematica and I am having difficulties with one thing. I have this Table that generates 10 000 times 13 numbers (12 numbers + 1 that is a starting number). I need to create a Histogram from all 10 000 13th numbers. I hope It's quite clear, quite tricky to explain.
This is the table:
F = Table[(Xi = RandomVariate[NormalDistribution[], 12];
Mu = -0.00644131;
Sigma = 0.0562005;
t = 1/12; s = 0.6416;
FoldList[(#1*Exp[(Mu - Sigma^2/2)*t + Sigma*Sqrt[t]*#2]) &, s,
Xi]), {SeedRandom[2]; 10000}]
The result for the following histogram could be a table that will take all the 13th numbers to one table - than It would be quite easy to create an histogram. Maybe with "select"? Or maybe you know other ways to solve this.
You can access different parts of a list using Part or (depending on what parts you need) some of the more specialised commands, such as First, Rest, Most and (the one you need) Last. As noted in comments, Histogram[Last/#F] or Histogram[F[[All,-1]]] will work fine.
Although it wasn't part of your question, I would like to note some things you could do for your specific problem that will speed it up enormously. You are defining Mu, Sigma etc 10,000 times, because they are inside the Table command. You are also recalculating Mu - Sigma^2/2)*t + Sigma*Sqrt[t] 120,000 times, even though it is a constant, because you have it inside the FoldList inside the Table.
On my machine:
F = Table[(Xi = RandomVariate[NormalDistribution[], 12];
Mu = -0.00644131;
Sigma = 0.0562005;
t = 1/12; s = 0.6416;
FoldList[(#1*Exp[(Mu - Sigma^2/2)*t + Sigma*Sqrt[t]*#2]) &, s,
Xi]), {SeedRandom[2]; 10000}]; // Timing
{4.19049, Null}
This alternative is ten times faster:
F = Module[{Xi, beta}, With[{Mu = -0.00644131, Sigma = 0.0562005,
t = 1/12, s = 0.6416},
beta = (Mu - Sigma^2/2)*t + Sigma*Sqrt[t];
Table[(Xi = RandomVariate[NormalDistribution[], 12];
FoldList[(#1*Exp[beta*#2]) &, s, Xi]), {SeedRandom[2];
10000}] ]]; // Timing
{0.403365, Null}
I use With for the local constants and Module for the things that are other redefined within the Table (Xi) or are calculations based on the local constants (beta). This question on the Mathematica StackExchange will help explain when to use Module versus Block versus With. (I encourage you to explore the Mathematica StackExchange further, as this is where most of the Mathematica experts are hanging out now.)
For your specific code, the use of Part isn't really required. Instead of using FoldList, just use Fold. It only retains the final number in the folding, which is identical to the last number in the output of FoldList. So you could try:
FF = Module[{Xi, beta}, With[{Mu = -0.00644131, Sigma = 0.0562005,
t = 1/12, s = 0.6416},
beta = (Mu - Sigma^2/2)*t + Sigma*Sqrt[t];
Table[(Xi = RandomVariate[NormalDistribution[], 12];
Fold[(#1*Exp[beta*#2]) &, s, Xi]), {SeedRandom[2];
10000}] ]];
Histogram[FF]
Calculating FF in this way is even a little faster than the previous version. On my system Timing reports 0.377 seconds - but such a difference from 0.4 seconds is hardly worth worrying about.
Because you are setting the seed with SeedRandom, it is easy to verify that all three code examples produce exactly the same results.
Making my comment an answer:
Histogram[Last /# F]