How to calculate a number like PI over multiple machines? [closed] - distributed-computing

How would you go about using multiple computers to compute a number like pi?
Are there existing algorithms or solutions that make this easy to do?
How do you split up the work and let the results from other machines come into effect?

Here's one simple way:
generate a huge number of random (x,y) points where x and y are between 0 and 1.
for each point, calculate whether its Cartesian distance to the origin is <= 1 (that is, whether it lies on or inside the circle)
count the number of points inside the circle versus outside the circle
Pi can then be estimated from the fraction of points that land inside: the quarter circle has area π/4 and the unit square has area 1, so π ≈ 4 × (inside / total). A very large number of points is necessary for this to approach pi, but if you have many machines, each computer can generate as many points as you like and simply return its counts to a leader machine, which collects all the results and calculates the final estimate.
This method can be used to calculate pi to whatever precision you want: the more points, the more precision. It's called a 'Monte Carlo' method because it uses randomness. See http://math.fullerton.edu/mathews/n2003/montecarlopimod.html for more information.
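To make that concrete, here is a minimal sketch of what each worker machine might run (the class and method names are my own invention, and the actual distribution mechanism is up to you). The leader would just sum the returned counts from every worker and compute 4 * inside / total:

public class PiWorker
{
    // Generates `samples` random points in the unit square and returns how many fall
    // on or inside the quarter circle of radius 1 centred at the origin.
    static long countInsideCircle(long samples, long seed)
    {
        java.util.Random rnd = new java.util.Random(seed);
        long inside = 0;
        for (long i = 0; i < samples; i++)
        {
            double x = rnd.nextDouble();
            double y = rnd.nextDouble();
            if (x * x + y * y <= 1.0) inside++;
        }
        return inside;
    }

    public static void main(String[] args)
    {
        long samples = 100_000_000L;                  // each machine picks its own sample count and seed
        long inside = countInsideCircle(samples, 42);
        System.out.println("pi is approximately " + 4.0 * inside / samples);
    }
}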

An "easy" version would be using the Bailey–Borwein–Plouffe formula, or its faster variant Bellard Formula. It allows calculating individual (binary) digits of π without calculating the previous ones before.
This means that you can distribute your calculation effort on different computers, which do not have to communicate much. For larger digit indices, you still need distribute the calculation even for a single digit (since you are doing some multiplications and divisions of really large integers).
This was used by the PiHex project to calculate some (binary) digits around digit number 5·1012, some around 4·1013 and some around 1015.
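For a feel of how digit extraction works, here is a compact single-machine sketch of the standard BBP hex-digit spigot (the class name, the 25 tail terms and the plain double arithmetic are my own simplifications; double precision only holds up for moderate digit positions). In a distributed setup, each worker would simply be assigned a range of digit positions n:

public class BBPDigits
{
    // 16^e mod m, by binary exponentiation
    static long powMod(long base, long e, long m)
    {
        long result = 1 % m;
        base %= m;
        while (e > 0)
        {
            if ((e & 1) == 1) result = result * base % m;
            base = base * base % m;
            e >>= 1;
        }
        return result;
    }

    // fractional part of the sum over k >= 0 of 16^(n - k) / (8k + j)
    static double series(int j, int n)
    {
        double sum = 0.0;
        for (int k = 0; k <= n; k++)              // non-negative exponents: use modular arithmetic
        {
            int denom = 8 * k + j;
            sum += (double) powMod(16, n - k, denom) / denom;
            sum -= Math.floor(sum);               // keep only the fractional part
        }
        for (int k = n + 1; k <= n + 25; k++)     // a few rapidly shrinking tail terms
        {
            sum += Math.pow(16, n - k) / (8 * k + j);
        }
        return sum - Math.floor(sum);
    }

    // hexadecimal digit of pi at (0-based) position n after the point
    static int hexDigit(int n)
    {
        double x = 4 * series(1, n) - 2 * series(4, n) - series(5, n) - series(6, n);
        return (int) (16 * (x - Math.floor(x)));
    }

    public static void main(String[] args)
    {
        for (int n = 0; n < 10; n++) System.out.printf("%X", hexDigit(n));   // pi = 3.243F6A8885...
        System.out.println();
    }
}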

On the .NET platform, you could try .NET Remoting.

Related

Newbie to Neural Networks

Just starting to play around with Neural Networks for fun after playing with some basic linear regression. I am an English teacher, so I don't have a math background, and trying to read a book on this stuff is way over my head. I thought this would be a better avenue to get some basic questions answered (even though I suspect there is no easy answer). I'm just looking for some general guidance put in layman's terms. I am using a trial version of an Excel add-in called NEURO XL. I apologize if these questions are too "elementary."
My first project is related to predicting a student's Verbal score on the SAT based on a number of test scores, GPA, practice exam scores, etc. as well as some qualitative data (gender: M=1, F=0; took SAT prep class: Y=1, N=0; plays varsity sports: Y=1, N=0).
In total, I have 21 variables that I would like to feed into the network, with the output being the actual score (200-800).
I have 9000 records of data spanning many years/students. Here are my questions:
How many records of the 9000 should I use to train the network?
1a. Should I completely randomize the selection of this training data or be more involved and make sure I include a variety of output scores and a wide range of each of the input variables?
If I split the data into an even number, say 9x1000 (or however many) and created a network for each one, then tested the results of each of these 9 on the other 8 sets to see which had the lowest MSE across the samples, would this be a valid way to "choose" the best network if I wanted to predict the scores for my incoming students (not included in this data at all)?
Since the scores on the tests that I am using as inputs vary in scale (some are on 1-100, and others 1-20 for example), should I normalize all of the inputs to their respective z-scores? When is this recommended vs not recommended?
I am predicting the actual score, but in reality, I'm NOT that concerned about the exact score but more of a range. Would my network be more accurate if I grouped the output scores into buckets and then tried to predict this number instead of the actual score?
E.g.
750-800 = 10
700-740 = 9
etc.
Is there any benefit to doing this or should I just go ahead and try to predict the exact score?
What if ALL I cared about was whether or not the score was above or below 600. Would I then just make the output 0(below 600) or 1(above 600)?
5a. I read somewhere that it's not good to use 0 and 1, but instead 0.1 and 0.9 - why is that?
5b. What about -1(below 600), 0(exactly 600), 1(above 600), would this work?
5c. Would the network always output -1, 0, 1 - or would it output fractions that I would then have to round up or round down to finalize the prediction?
Once I have found the "best" network from Question #3, would I then play around with the different parameters (number of epochs, number of neurons in hidden layer, momentum, learning rate, etc.) to optimize this further?
6a. What about the Activation Function? Will log-sigmoid do the trick, or should I try the other options my software has as well (threshold, hyperbolic tangent, zero-based log-sigmoid)?
6b. What is the difference between log-sigmoid and zero-based log-sigmoid?
Thanks!
First a little bit of meta content about the question itself (and not about the answers to your questions).
I have to laugh a little that you say 'I apologize if these questions are too "elementary."' and then proceed to ask the single most thorough and well thought out question I've seen as someone's first post on SO.
I wouldn't be too worried that you'll have people looking down their noses at you for asking this stuff.
This is a pretty big question in terms of the depth and range of knowledge required, especially the statistical knowledge needed and familiarity with Neural Networks.
You may want to try breaking this up into several questions distributed across the different StackExchange sites.
Off the top of my head, some of it definitely belongs on the statistics StackExchange, Cross Validated: https://stats.stackexchange.com/
You might also want to try out https://datascience.stackexchange.com/ , a beta site specifically targeting machine learning and related areas.
That said, there is some of this that I think I can help to answer.
Anything I haven't answered is something I don't feel qualified to help you with.
Question 1
How many records of the 9000 should I use to train the network? 1a. Should I completely randomize the selection of this training data or be more involved and make sure I include a variety of output scores and a wide range of each of the input variables?
Randomizing the selection of training data is probably not a good idea.
Keep in mind that truly random data includes clusters.
A random selection of students could happen to consist solely of those who scored above a 30 on the ACT exams, which could potentially result in a bias in your result.
Likewise, if you only select students whose SAT scores were below 700, the classifier you build won't have any capacity to distinguish between a student expected to score 720 and a student expected to score 780 -- they'll look the same to the classifier because it was trained without the relevant information.
You want to ensure a representative sample of your different inputs and your different outputs.
Because you're dealing with input variables that may be correlated, you shouldn't try to do anything too complex in selecting this data, or you could mistakenly introduce another bias in your inputs.
Namely, you don't want to select a training data set that consists largely of outliers.
I would recommend trying to ensure that your inputs cover all possible values for all of the variables you are observing, and all possible results for the output (the SAT scores), without constraining how these requirements are satisfied.
I'm sure there are algorithms out there designed to do exactly this, but I don't know them myself -- possibly a good question in and of itself for Cross Validated.
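One simple approach along those lines is stratified sampling: bucket the records by SAT score and draw the same fraction from every bucket, so the training set spans the full output range. A rough sketch (StudentRecord, the field name and the 50-point bands are all made up for illustration):

import java.util.*;

public class StratifiedSplit
{
    // Hypothetical record type: only the field we bucket on is shown.
    static class StudentRecord
    {
        int satScore;
        StudentRecord(int satScore) { this.satScore = satScore; }
    }

    // Group records into 50-point SAT-score bands and take the same fraction from each band,
    // so every part of the output range is represented in the training data.
    static List<StudentRecord> trainingSample(List<StudentRecord> all, double fraction, long seed)
    {
        Map<Integer, List<StudentRecord>> bands = new HashMap<>();
        for (StudentRecord r : all)
        {
            bands.computeIfAbsent(r.satScore / 50, k -> new ArrayList<>()).add(r);
        }
        List<StudentRecord> training = new ArrayList<>();
        Random rnd = new Random(seed);
        for (List<StudentRecord> band : bands.values())
        {
            Collections.shuffle(band, rnd);
            training.addAll(band.subList(0, (int) Math.round(band.size() * fraction)));
        }
        return training;
    }
}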
Question 3
Since the scores on the tests that I am using as inputs vary in scale (some are on 1-100, and others 1-20 for example), should I normalize all of the inputs to their respective z-scores? When is this recommended vs not recommended?
My understanding is that this is not recommended as the input to a Neural Network, but I may be wrong.
The convergence of the network should handle this for you.
Every node in the network will assign a weight to its inputs, multiply them by their weights, and sum those products as a core part of its computation.
That means that every node in the network is searching for some coefficients for each of their inputs.
To do this, all inputs will be converted to numeric values -- so conditions like gender will be translated into "0=MALE,1=FEMALE" or something similar.
For example, a node's metric might look like this at a given point in time:
2*ACT_SCORE + 0*GENDER + (-5)*VARSITY_SPORTS ...
The coefficients for each value are exactly what the network is searching for as it converges.
If you change the scale of a value, like ACT_SCORE, you just change the scale of the coefficient that will be found, by the reciprocal of that scaling factor.
The result should still be the same.
There are other concerns in terms of accuracy (computers have limited capacity to represent small fractions) and speed that may enter this, but not being familiar with NEURO XL, I can't say whether or not they apply for this technology.
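As a toy illustration of the rescaling argument above (all numbers and variable names are invented for the example): multiplying an input by some factor can always be absorbed by dividing its coefficient by the same factor, so the node's output is unchanged.

public class WeightedSumDemo
{
    // a single node's core computation: the sum of weight * input
    static double node(double[] inputs, double[] weights)
    {
        double sum = 0;
        for (int i = 0; i < inputs.length; i++) sum += weights[i] * inputs[i];
        return sum;
    }

    public static void main(String[] args)
    {
        double[] original = {28, 1, 0};            // ACT score, gender, varsity sports
        double[] weights  = {2.0, 0.0, -5.0};
        double[] rescaled = {2800, 1, 0};          // the same ACT score on a scale 100x larger
        double[] adjusted = {0.02, 0.0, -5.0};     // coefficient scaled by the reciprocal
        System.out.println(node(original, weights));   // 56.0
        System.out.println(node(rescaled, adjusted));  // 56.0
    }
}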
Question 4
I am predicting the actual score, but in reality, I'm NOT that concerned about the exact score but more of a range. Would my network be more accurate if I grouped the output scores into buckets and then tried to predict this number instead of the actual score?
This will reduce accuracy, although you should converge to a solution much faster with fewer possible outputs (scores).
Neural Networks actually describe very high-dimensional functions in their input variables.
If you reduce the granularity of that function's output space, you essentially state that you don't care about local minima and maxima in that function, especially around the borders between your output scores.
As a result, you are sacrificing information that may be an essential component of the "true" function that you are searching for.
I hope this has been helpful, but you really should break this question down into its many components and ask them separately on different sites -- potentially some of them do belong here on StackOverflow as well.

shortest path drawing between two stores in the same mall [closed]

The requirement is: suppose a person goes to the mall and wants to get to a particular store from his current location. The app gives him two options to choose: the current store and the intended store. After selecting these two options, a map pops up showing the shortest path between the selected stores, which might be on different floors.
Since latitude/longitude can't be used because the stores are in the same mall, how could I do this? Somebody please help me.
A solution would be to create a weighted graph representing the mall:
Nodes being the stores and path intersections (e.g. escalators between floors)
Edges being the paths connecting them
Weights of the edges being the distance/time to walk between nodes
Then implement something like Dijkstra's algorithm to find the shortest path between two nodes (stores).
The solution could then be drawn as an overlay onto a map of the mall.
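To sketch what that might look like in code (the store names, distances and the adjacency-map representation are all made up, and this is not tied to any particular mapping library):

import java.util.*;

public class MallRouter
{
    // adjacency list: node -> (neighbour -> walking distance in metres)
    private final Map<String, Map<String, Integer>> graph = new HashMap<>();

    void addPath(String a, String b, int metres)
    {
        graph.computeIfAbsent(a, k -> new HashMap<>()).put(b, metres);
        graph.computeIfAbsent(b, k -> new HashMap<>()).put(a, metres);   // walkable in both directions
    }

    List<String> shortestPath(String from, String to)
    {
        Map<String, Integer> dist = new HashMap<>();
        Map<String, String> prev = new HashMap<>();
        PriorityQueue<String> queue = new PriorityQueue<>(
            (x, y) -> Integer.compare(dist.getOrDefault(x, Integer.MAX_VALUE),
                                      dist.getOrDefault(y, Integer.MAX_VALUE)));
        dist.put(from, 0);
        queue.add(from);
        while (!queue.isEmpty())
        {
            String current = queue.poll();
            if (current.equals(to)) break;
            for (Map.Entry<String, Integer> edge : graph.getOrDefault(current, Collections.emptyMap()).entrySet())
            {
                int candidate = dist.get(current) + edge.getValue();
                if (candidate < dist.getOrDefault(edge.getKey(), Integer.MAX_VALUE))
                {
                    dist.put(edge.getKey(), candidate);
                    prev.put(edge.getKey(), current);
                    queue.remove(edge.getKey());   // re-insert so the queue sees the improved distance
                    queue.add(edge.getKey());
                }
            }
        }
        LinkedList<String> path = new LinkedList<>();          // walk back from the destination
        for (String n = to; n != null; n = prev.get(n)) path.addFirst(n);
        return path;
    }

    public static void main(String[] args)
    {
        MallRouter mall = new MallRouter();
        mall.addPath("Shoe Store (floor 1)", "Escalator (floor 1)", 40);
        mall.addPath("Escalator (floor 1)", "Escalator (floor 2)", 15);
        mall.addPath("Escalator (floor 2)", "Book Store (floor 2)", 60);
        System.out.println(mall.shortestPath("Shoe Store (floor 1)", "Book Store (floor 2)"));
    }
}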
This is an example of a Shortest Path Problem, a classic graph problem (related to, though much simpler than, the Travelling Salesman Problem).
This thread has a link to objective c code that may help: Easy way to apply a shortest path alghoritm in objective c

How long would it take my i7 processor to factorise a 1024-bit number (consisting of just 2 prime factors)? [closed]

We're examining the RSA algorithm and would like to know how much time it would take an Intel i7 core (@ 2.50 GHz) to factorise the RSA public key.
We wrote a piece of Java for this, and I don't know how efficient it is:
public static String factorise(long l)
{
    // Trial division: start at floor(sqrt(l)) and count down until a divisor is found.
    // Using long arithmetic avoids the floating-point precision problems of a
    // double-based version for large inputs.
    long a = (long) Math.sqrt(l);
    while (l % a != 0)
    {
        a--;
    }
    return a + ", " + (l / a);
}
With a number around 2^45 it took the PC approximately 33 milliseconds. In theory, how long would it take to factorise a number around 2^1024?
Thanks in advance :)
Your algorithm is O(2^n), where n is the number of bits in the original number l. (that means that a single bit more will double the runtime, because twice as many numbers a must be checked - on average)
If 45 bits took 33 ms, then 1024 bits will take approx. 2^1024 / 2^45 * 33ms = 5.34654 * 10^285 years.
This of course assumes that the 1024-bit code is exactly as efficient as your code for long numbers (64-bit?). Which is a bold assumption, considering that 10^285 years is more than enough time to switch to the General Number Field Sieve and shave a few million years off that time...
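If you want to reproduce that back-of-the-envelope figure yourself (assuming, as above, that the runtime simply doubles with every extra bit), something like this gives the same order of magnitude:

import java.math.BigDecimal;
import java.math.MathContext;

public class RuntimeExtrapolation
{
    public static void main(String[] args)
    {
        // 33 ms measured for a 45-bit number; assume the runtime doubles for every extra bit
        BigDecimal milliseconds = new BigDecimal(33).multiply(new BigDecimal(2).pow(1024 - 45));
        BigDecimal msPerYear = new BigDecimal("3.156e10");   // roughly 1000 * 3600 * 24 * 365.25
        System.out.println(milliseconds.divide(msPerYear, MathContext.DECIMAL64));   // ~5.3E+285
    }
}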
In 2009 the 768-bit number RSA-768 was cracked using about 1000 cores and 2 years of calculations. Assuming they used the General Number Field Sieve (a very fair assumption), it would take them 7481 years to crack a 1024-bit number using the same hardware.
Or, using only your i7 with this algorithm: about 3 million years. Still a long time... ;)

coin flip generator [closed]

What do you guys think the outcome will be if a program is coded to simulate a coin flip, with a 50% chance of the coin landing either heads or tails, when the results are looked at? Mainly: will a higher percentage of flips land heads when the previous 10 flips were tails, and vice versa?
This really depends on what mechanism is being used to generate the random numbers. If, say, a linear congruential generator is used, i.e. X(n+1) = (a * X(n) + c) mod m, then clearly any given generated number depends on the one that preceded it. The quality of the output also depends on what parameters are used in conjunction with the mechanism (e.g. if a small value was used for m in the above formula, the quality would be poor, or if your seed value was highly predictable).
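As a toy illustration (the constants are chosen deliberately badly, and this is not how java.util.Random is parameterised), a coin flipper built on an LCG with a tiny modulus produces a sequence that is perfectly periodic and predictable:

public class LcgCoin
{
    // Deliberately bad parameters (tiny modulus m): the quality of an LCG
    // depends entirely on the choice of a, c and m.
    private long state;
    private static final long A = 5, C = 3, M = 16;

    LcgCoin(long seed) { state = seed % M; }

    boolean flip()
    {
        state = (A * state + C) % M;   // X(n+1) = (a * X(n) + c) mod m
        return state < M / 2;          // call the lower half of the range "heads"
    }

    public static void main(String[] args)
    {
        LcgCoin coin = new LcgCoin(7);
        for (int i = 0; i < 32; i++) System.out.print(coin.flip() ? "H" : "T");
        System.out.println();          // the pattern repeats with period 16: completely predictable
    }
}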
Despite the fact that computers only generate pseudo-random numbers, some algorithms satisfy the tests for statistical randomness (i.e. have no discernible patterns) and can be used safely.
If you are that concerned about the randomness of your generated numbers, you should look into the actual method being used to generate them within your specific context. For more information, take a look at Wikipedia.
If you program it correctly, the chance of landing on either side of the coin should be equal (50%), regardless of previous flips...
The probability of heads/tails is always 50% on any given toss. The probability of getting x heads (or any specified combination of heads/tails) in a row is 0.5^x (because each toss is independent of the others).
If I am understanding you correctly, you are asking whether there will be more "heads" results or "tails" results if a program is written to give each option a 50% chance?
Statistically, if you run the program a number of times, each side will average out and you'll have an equal number of heads and tails results. (Depending on your language of choice, you may have to seed the randomizer to guarantee true randomness.)
I suppose that it depends on the goodness of the pseudo-random number generator.
On only ten flips, the results may be meaningless...
But if you have a good algorithm for generating pseudo-random numbers, and you extend this experiment over n tries where n is significantly large, the probability still remains 0.5 (50%).
This is because the process has no memory.
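If you'd rather see that empirically than take it on faith, a quick simulation with java.util.Random makes the point: the fraction of heads immediately after a run of ten tails comes out at about 0.5, just like any other flip. (The 100 million flips and the cutoff of ten are arbitrary choices.)

import java.util.Random;

public class StreakCheck
{
    public static void main(String[] args)
    {
        Random rnd = new Random();
        int tailRun = 0;
        long afterTenTails = 0, headsAfterTenTails = 0;
        for (long i = 0; i < 100_000_000L; i++)
        {
            boolean heads = rnd.nextBoolean();
            if (tailRun >= 10)                 // this flip follows at least 10 consecutive tails
            {
                afterTenTails++;
                if (heads) headsAfterTenTails++;
            }
            tailRun = heads ? 0 : tailRun + 1;
        }
        System.out.printf("P(heads | previous 10 flips were tails) = %.4f%n",
                          (double) headsAfterTenTails / afterTenTails);
    }
}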

Is LOC correct parameter for project estimation? [closed]

Is LOC a correct parameter for project estimation?
There are so many scenarios where complexity means a single line of code takes much more time.
Other than LOC, what would be a suggested parameter for project estimation?
When people talk about function points of a program, does that refer to use-case-related information?
I am trying to find a solid basis for estimating full software development, covering analysis, design, test case preparation, and coding. Please suggest one.
Steve McConnell in Rapid Development (Microsoft Press, 1996):
Because different programming languages produce such different bangs for a given number of lines of code, much of the software industry is moving toward a measure called "function points" to estimate program sizes. A function point is a synthetic measure of program size that is based on a weighted sum of the number of inputs, outputs, inquiries, and files. Function points are useful because they allow you to think about program size in a language-independent way.
Google "Function Point" for more information.
Seeing as developers are likely to* spend most of their time trying to test changes, lines of code is never a good indicator of the size of a problem.
Let's suppose you have an existing large application - changing a single line of code may seem trivial, but the test planning and execution could take weeks.
Likewise, adding a relatively large amount of code in a single limited-scope module which is easily testable might be only a few days.
* they should do, at least. If they're spending more time writing code than testing it, it is probably full of bugs. And I mean BEFORE it reaches your dedicated QA team.
Only if you use it in the inverse.
-- Edit
But no. It isn't. It's a mostly useless measure, and generally harmful. As you note, less code is almost always better.
Other things to check? Well, what are you trying to measure? What result do you want to see from a change in the things that you would be checking? What sort of decisions will you be making on the basis of these changes?
LOC is one proxy measure for measuring the problem size.
LOC estimate can be used, and LOC count is relatively cheap to measure from historical projects. But LOC can be problematic if used for anything else than a proxy for problem size, as already pointed out by other answers.
Problem size is rather constant given the requirements. From a size estimate you can go to effort, schedule and cost estimates. It depends on your planning drivers such as cost or schedule. From the historical data you can find correlations showing how problem size translates to effort and how other planning drivers further influence the outcome. So you need to measure size and effort vs. other parameters and keep fine-tuning your estimation process. There are some LOC-to-effort measures available in the literature, but they are not very accurate for your domain, the technology you are using, and the team you have.
Other proxies for problem size are function points and story points. My experience on function points is that they are rarely worth the effort. On the other hand, story points in agile methods work very well since they are deliberately abstract (thus avoiding a lot of problems with LOC) and measured on a sprint-by-sprint basis, with instant feedback into following sprints.
No, it isn't. The reason is simple: if you produce a new line of code during your development, are you one step closer to a solution? If you estimated 1000 lines of code to complete a task, are you now 0.1% complete with that task?
Lines of code can be used as a metric but only in the negative sense: for a greater number of lines of code, it is reasonable to assume that you have a greater number of bugs. Based on historical data, there is generally a linear correlation between lines of code and bug count.
Here are some useful and measurable factors that are worth considering:
Hours of labor.
Dollars spent: this is a good one because it strongly reinforces the reality that you'd rather find bugs at the developer's desktop than in the hands of a tester or customer.
Milestones met: is the system available for the customers on the right date?
Requirements completed: this can be a funny one - what if you discover a new customer need during the project?
In short, lines of code is very nearly the worst possible metric you could ever use.
The only way to get any reasonable estimate on project duration is to COMPLETELY implement and deliver some subset of the final requirements. Then you can estimate the remaining requirements by comparing their complexity against the completed work.