Which costs more while looping; assignment or an if-statement? - variable-assignment

Consider the following 2 scenarios:
boolean b = false;
int i = 0;
while(i++ < 5) {
b = true;
}
OR
boolean b = false;
int i = 0;
while(i++ < 5) {
if(!b) {
b = true;
}
}
Which is more "costly" to do? If the answer depends on used language/compiler, please provide. My main programming language is Java.
Please do not ask questions like why would I want to do either.. They're just barebone examples that point out the relevant: should a variable be set the same value in a loop over and over again or should it be tested on every loop that it holds a value needed to change?

Please do not forget the rules of Optimization Club.
The first rule of Optimization Club is, you do not Optimize.
The second rule of Optimization Club is, you do not Optimize without measuring.
If your app is running faster than the underlying transport protocol, the optimization is over.
One factor at a time.
No marketroids, no marketroid schedules.
Testing will go on as long as it has to.
If this is your first night at Optimization Club, you have to write a test case.
It seems that you have broken rule 2. You have no measurement. If you really want to know, you'll answer the question yourself by setting up a test that runs scenario A against scenario B and finds the answer. There are so many differences between different environments, we can't answer.

Have you tested this? Working on a Linux system, I put your first example in a file called LoopTestNoIf.java and your second in a file called LoopTestWithIf.java, wrapped a main function and class around each of them, compiled, and then ran with this bash script:
#!/bin/bash
function run_test {
iter=0
while [ $iter -lt 100 ]
do
java $1
let iter=iter+1
done
}
time run_test LoopTestNoIf
time run_test LoopTestWithIf
The results were:
real 0m10.358s
user 0m4.349s
sys 0m1.159s
real 0m10.339s
user 0m4.299s
sys 0m1.178s
Showing that having the if makes it slight faster on my system.

Are you trying to find out if doing the assignment each loop is faster in total run time than doing a check each loop and only assigning once on satisfaction of the test condition?
In the above example I would guess that the first is faster. You perform 5 assignments. In the latter you perform 5 test and then an assignment.
But you'll need to up the iteration count and throw in some stopwatch timers to know for sure.

Actually, this is the question I was interested in… (I hoped that I’ll find the answer somewhere to avoid own testing. Well, I didn’t…)
To be sure that your (mine) test is valid, you (I) have to do enough iterations to get enough data. Each iteration must be “long” enough (I mean the time scale) to show the true difference. I’ve found out that even one billion iterations are not enough to fit to time interval that would be long enough… So I wrote this test:
for (int k = 0; k < 1000; ++k)
{
{
long stopwatch = System.nanoTime();
boolean b = false;
int i = 0, j = 0;
while (i++ < 1000000)
while (j++ < 1000000)
{
int a = i * j; // to slow down a bit
b = true;
a /= 2; // to slow down a bit more
}
long time = System.nanoTime() - stopwatch;
System.out.println("\\tasgn\t" + time);
}
{
long stopwatch = System.nanoTime();
boolean b = false;
int i = 0, j = 0;
while (i++ < 1000000)
while (j++ < 1000000)
{
int a = i * j; // the same thing as above
if (!b)
{
b = true;
}
a /= 2;
}
long time = System.nanoTime() - stopwatch;
System.out.println("\\tif\t" + time);
}
}
I ran the test three times storing the data in Excel, then I swapped the first (‘asgn’) and second (‘if’) case and ran it three times again… And the result? Four times “won” the ‘if’ case and two times the ‘asgn’ appeared to be the better case. This shows how sensitive the execution might be. But in general, I hope that this has also proven that the ‘if’ case is better choice.
Thanks, anyway…

Any compiler (except, perhaps, in debug) will optimize both these statements to
bool b = true;
But generally, relative speed of assignment and branch depend on processor architecture, and not on compiler. A modern, super-scalar processor perform horribly on branches. A simple micro-controller uses roughly the same number of cycles per any instruction.

Relative to your barebones example (and perhaps your real application):
boolean b = false;
// .. other stuff, might change b
int i = 0;
// .. other stuff, might change i
b |= i < 5;
while(i++ < 5) {
// .. stuff with i, possibly stuff with b, but no assignment to b
}
problem solved?
But really - it's going to be a question of the cost of your test (generally more than just if (boolean)) and the cost of your assignment (generally more than just primitive = x). If the test/assignment is expensive or your loop is long enough or you have high enough performance demands, you might want to break it into two parts - but all of those criteria require that you test how things perform. Of course, if your requirements are more demanding (say, b can flip back and forth), you might require a more complex solution.

Related

Script is taking 11 - 20 seconds to lookup up an item in an 18,000 row data set

I have two Google sheets workbooks.
One is the "master" source of lookup data with a key based on manufacturer item #, which could be anything from 1234 to A-01/234-Name_1. This sheet, referenced via SpreadsheetApp.openByUrl, has 18,000 rows and 13 columns. The key column has been converted to plain text and the sheet is sorted by this column.
The second is the "template" where people enter item #s that they need to look up against the master, typically 20 - 1500 items at a time.
The script is in the template. It is very slow and routinely times out after 30 minutes. It was written by someone else and I am new to App Script, but I think I've managed to understand what the script is doing and where the bottleneck is occurring.
It does a bunch of stuff, but this is the meat of the lookup:
var numrows = master.getDataRange().getNumRows();
var masterdata = master.getDataRange().getValues();
var itemnumberlist = template.getDataRange().getValues();
var retreiveddata = [];
// iterate through the manf item number list to find all matches in the
// master and return those matches to another sheet
for (i = 1; i < template.getDataRange().getValues().length; i++) {
for (j = 0; j < numrows; j++) {
if (masterdata[j][1].toString() === itemnumberlist[i][1].toString()) {
retreiveddata.push(data[j]);
anothersheet.appendRow(data[j]);
}
}
}
I used Logger.log() to determine that each time through the i loop is taking 11 - 19 seconds, which just seems insane.
I've been doing some google searching and I've tried a couple of different things...
First I tried moving the writing of found data out of the for loop so the script would be doing all of its reading first and then writing in one big chunk, but I couldn't get it exactly right. My two attempts are below.
var mycounter = 0;
for (i = 0; i < template.getDataRange().getValues().length; i++) {
for (j = 0; j < numrows; j++) {
if (masterdata[j][0].toString() === itemnumberlist[i][0].toString()) {
retreiveddata.push(masterdata[j]);
mycounter = mycounter + 1;
}
}
}
// Attempt 1
// var myrange = retreiveddata.length;
// for(k = 0; k < myrange; k++) {
// anothersheet.appendRow(retreiveddata.pop([k]);
// }
//Attempt 2
var myotherrange = anothersheet.getRange(2,1,myothercounter, 13)
myotherrange.setValues(retreiveddata);
I can't remember for sure, because this was on Friday, but I think both attempts resulted in the script trying to write the entire master file into "anothersheet".
So I temporarily set this aside and decided to try something else. I was trying to recreate the issue in a couple of sample spreadsheets, but I was unable to do so. The same script is getting through my 15,000 row sample "master" file in less than 1 second per lookup. The only thing I can think of is that I used a random number as my key instead of a weird text string.
That led me to think that maybe I could use a hash algorithm on both the master data and the values to be looked up, but this is presenting a whole other set of issues.
I borrowed these functions from another forum post:
function GetMD5Hash(value) {
var rawHash = Utilities.computeDigest(Utilities.DigestAlgorithm.MD5,
value);
var txtHash = '';
for (j = 0; j <rawHash.length; j++) {
var hashVal = rawHash[j];
if (hashVal < 0)
hashVal += 256;
if (hashVal.toString(16).length == 1)
txtHash += "0";
txtHash += hashVal.toString(16);
Utilities.sleep(100);
}
return txtHash;
}
function RangeGetMD5Hash(input) {
if (input.map) { // Test whether input is an array.
return input.map(GetMD5Hash); // Recurse over array if so.
Utilities.sleep(100);
} else {
return GetMD5Hash(input)
}
}
It literally took me all day to get the hash value for all 18,000 item #s in my master spreadsheet. Neither GetMD5Hash nor RangeGetMD5Hash will return a value consistently. I can only do a few rows at a time. Sometimes I get "Loading..." indefinitely. Sometimes I get "#Name" with a message about GetMD5Hash being undefined (despite the fact that it worked on the previous row). And sometimes I get "#Error" with a message about an internal error.
This method actually reduces the lookup time of each item to 2 - 3 seconds (much better, but not great). However, I can't get the hash function to consistently work on the input data.
At this point I'm so frustrated and behind on my other work that I thought I'd reach out to the smart people on these forums and hope for some sort of miracle response.
To summarize, I'm looking for suggestions on these three items:
What am I doing wrong in my attempt to move the write out of the for loop?
Is there a way to get my hash value faster or utilize a different method to accomplish the same goal?
What else can I try to help speed up the script?
Any suggestions you can offer would be greatly appreciated!
-Mandy
It sounds like you hit on the right approach with attempting to move the appendRow() call out of the loop. Anytime you are reading or writing to a spreadsheet you can expect the individual call to take 1 to 2 seconds, so this will eat up a lot of time when you get matches. Storing the matches in an array and writing them all at once is the way to go.
Another thing I notice is that your script calls getValues() in the actual for loop condition statement. The condition statement is executed each time on each iteration of the loop, so this is potentially wasting a lot of time even when you don't have matches.
A final tweak that may be helpful depending on your desired behaviour. You can stop the inner for loop after it finds the first match, which, if you only care about the first match or know there will only be one match, will save you a lot of iterations. To do this, put "break" immediately after the retreiveddata.push(masterdata[j]); line.
To fix the getValues issue, Change:
for (i = 1; i < template.getDataRange().getValues().length; i++) {
To:
for (i = 1; i < itemnumberlist.length; i++) {
And that fix along with the appendRow issue, and including the break call:
for (i = 1; i < itemnumberlist.length; i++) {
for (j = 0; j < numrows; j++) {
if (masterdata[j][0].toString() === itemnumberlist[i][0].toString()) {
retreiveddata.push(masterdata[j]);
break; //stop searching after first match, move on to next item
}
}
}
//make sure you have data to write before trying to write it.
if(retreiveddata.length > 0){
var myotherrange = anothersheet.getRange(2,1,retreiveddata.length, retreiveddata[0].length);
myotherrange.setValues(retreiveddata);
}
If you are re-using the same sheet for "anothersheet" on each execution, you may also want to call anothersheet.clear() to erase any existing data before you write your fresh results.
I would pass on the hashing approach altogether, comparing strings is comparing strings, so whether they are hashes or actual part numbers I wouldn't expect a significant difference.

Merge Sort algorithm efficiency

I am currently taking an online algorithms course in which the teacher doesn't give code to solve the algorithm, but rather rough pseudo code. So before taking to the internet for the answer, I decided to take a stab at it myself.
In this case, the algorithm that we were looking at is merge sort algorithm. After being given the pseudo code we also dove into analyzing the algorithm for run times against n number of items in an array. After a quick analysis, the teacher arrived at 6nlog(base2)(n) + 6n as an approximate run time for the algorithm.
The pseudo code given was for the merge portion of the algorithm only and was given as follows:
C = output [length = n]
A = 1st sorted array [n/2]
B = 2nd sorted array [n/2]
i = 1
j = 1
for k = 1 to n
if A(i) < B(j)
C(k) = A(i)
i++
else [B(j) < A(i)]
C(k) = B(j)
j++
end
end
He basically did a breakdown of the above taking 4n+2 (2 for the declarations i and j, and 4 for the number of operations performed -- the for, if, array position assignment, and iteration). He simplified this, I believe for the sake of the class, to 6n.
This all makes sense to me, my question arises from the implementation that I am performing and how it effects the algorithms and some of the tradeoffs/inefficiencies it may add.
Below is my code in swift using a playground:
func mergeSort<T:Comparable>(_ array:[T]) -> [T] {
guard array.count > 1 else { return array }
let lowerHalfArray = array[0..<array.count / 2]
let upperHalfArray = array[array.count / 2..<array.count]
let lowerSortedArray = mergeSort(array: Array(lowerHalfArray))
let upperSortedArray = mergeSort(array: Array(upperHalfArray))
return merge(lhs:lowerSortedArray, rhs:upperSortedArray)
}
func merge<T:Comparable>(lhs:[T], rhs:[T]) -> [T] {
guard lhs.count > 0 else { return rhs }
guard rhs.count > 0 else { return lhs }
var i = 0
var j = 0
var mergedArray = [T]()
let loopCount = (lhs.count + rhs.count)
for _ in 0..<loopCount {
if j == rhs.count || (i < lhs.count && lhs[i] < rhs[j]) {
mergedArray.append(lhs[i])
i += 1
} else {
mergedArray.append(rhs[j])
j += 1
}
}
return mergedArray
}
let values = [5,4,8,7,6,3,1,2,9]
let sortedValues = mergeSort(values)
My questions for this are as follows:
Do the guard statements at the start of the merge<T:Comparable> function actually make it more inefficient? Considering we are always halving the array, the only time that it will hold true is for the base case and when there is an odd number of items in the array.
This to me seems like it would actually add more processing and give minimal return since the time that it happens is when we have halved the array to the point where one has no items.
Concerning my if statement in the merge. Since it is checking more than one condition, does this effect the overall efficiency of the algorithm that I have written? If so, the effects to me seems like they vary based on when it would break out of the if statement (e.g at the first condition or the second).
Is this something that is considered heavily when analyzing algorithms, and if so how do you account for the variance when it breaks out from the algorithm?
Any other analysis/tips you can give me on what I have written would be greatly appreciated.
You will very soon learn about Big-O and Big-Theta where you don't care about exact runtimes (believe me when I say very soon, like in a lecture or two). Until then, this is what you need to know:
Yes, the guards take some time, but it is the same amount of time in every iteration. So if each iteration takes X amount of time without the guard and you do n function calls, then it takes X*n amount of time in total. Now add in the guards who take Y amount of time in each call. You now need (X+Y)*n time in total. This is a constant factor, and when n becomes very large the (X+Y) factor becomes negligible compared to the n factor. That is, if you can reduce a function X*n to (X+Y)*(log n) then it is worthwhile to add the Y amount of work because you do fewer iterations in total.
The same reasoning applies to your second question. Yes, checking "if X or Y" takes more time than checking "if X" but it is a constant factor. The extra time does not vary with the size of n.
In some languages you only check the second condition if the first fails. How do we account for that? The simplest solution is to realize that the upper bound of the number of comparisons will be 3, while the number of iterations can be potentially millions with a large n. But 3 is a constant number, so it adds at most a constant amount of work per iteration. You can go into nitty-gritty details and try to reason about the distribution of how often the first, second and third condition will be true or false, but often you don't really want to go down that road. Pretend that you always do all the comparisons.
So yes, adding the guards might be bad for your runtime if you do the same number of iterations as before. But sometimes adding extra work in each iteration can decrease the number of iterations needed.

Beat Detection on iPhone with wav files and openal

Using this website i have tried to make a beat detection engine. http://www.gamedev.net/reference/articles/article1952.asp
{
ALfloat energy = 0;
ALfloat aEnergy = 0;
ALint beats = 0;
bool init = false;
ALfloat Ei[42];
ALfloat V = 0;
ALfloat C = 0;
ALshort *hold;
hold = new ALshort[[myDat length]/2];
[myDat getBytes:hold length:[myDat length]];
ALuint uiNumSamples;
uiNumSamples = [myDat length]/4;
if(alDatal == NULL)
alDatal = (ALshort *) malloc(uiNumSamples*2);
if(alDatar == NULL)
alDatar = (ALshort *) malloc(uiNumSamples*2);
for (int i = 0; i < uiNumSamples; i++)
{
alDatal[i] = hold[i*2];
alDatar[i] = hold[i*2+1];
}
energy = 0;
for(int start = 0; start<(22050*10); start+=512){
for(int i = start; i<(start+512); i++){
energy+= ((alDatal[i]*alDatal[i]) + (alDatal[i]*alDatar[i]));
}
aEnergy = 0;
for(int i = 41; i>=0; i--){
if(i ==0){
Ei[0] = energy;
}
else {
Ei[i] = Ei[i-1];
}
if(start >= 21504){
aEnergy+=Ei[i];
}
}
aEnergy = aEnergy/43.f;
if (start >= 21504) {
for(int i = 0; i<42; i++){
V += (Ei[i]-aEnergy);
}
V = V/43.f;
C = (-0.0025714*V)+1.5142857;
init = true;
if(energy >(C*aEnergy)) beats++;
}
}
}
alDatal and alDatar are (short*) type;
myDat is NSdata that holds the actual audio data of a wav file formatted to
22050 khz and 16 bit stereo.
This doesn't seem to work correctly. If anyone could help me out that would be amazing. I've been stuck on this for 3 days.
The desired result is after the 10 seconds worth of data has been processed i should be able to multiply that by 6 and have an estimated beats per minute.
My current results are 389 beats every 10 seconds, 2334 BPM the song i know is right around 120 BPM.
That code really has been smacked about with the ugly stick. If you're going to ask other people to find your bugs for you, it's a good idea to make things presentable first. Strangely enough, this will often help you to find them for yourself too.
So, before I point out some of the more fundamental errors, I have to make a few schoolmarmly suggestions:
Don't sprinkle your code with magic numbers. Is it really that hard to type a few lines like const ALuint SAMPLE_RATE = 22050? Trust me, it makes life a lot easier.
Use variable names that you aren't going to mix up easily. One of your bugs is a substitution of alDatal for alDatar. That probably wouldn't have happened if they were called left and right. Similarly, what is the point of having a meaningful variable name like energy if you're just going to stick it alongside the meaningless but more or less identical aEnergy? Why not something informative like average?
Declare variables close to where you're going to use them and in the appropriate scope. Another of your bugs is that you don't reset your calculated energy sum when you move your averaging window, so the energy will just add up and up. But you don't need the energy outside that loop, and if you declared it inside the problem couldn't happen.
There are some other things I personally find a little irksome, like the random bracing and indentation, and mixing of C and C++ allocations, and odd inconsistent scraps of Hungarian prefixing, but at least some of those may be more a matter of taste so I won't go on.
Anyway, here are some reasons why your code doesn't work:
First up, look at the right hand side of this line:
energy+= ((alDatal[i]*alDatal[i]) + (alDatal[i]*alDatar[i]));
You want the square of each channel value, so it should really say:
energy+= ((alDatal[i]*alDatal[i]) + (alDatar[i]*alDatar[i]));
Spot the difference? Not easy with those names, is it?
Second, you should be computing the total energy over each window of samples, but you're only setting energy = 0 outside the outer loop. So the sum accumulates, and consequently the current window energy will always be the biggest you've ever encountered.
Third, your variance calculation is wrong. You have:
V += (Ei[i]-aEnergy);
But it should be the sum of the squares of the differences from the mean:
V += (Ei[i] - aEnergy) * (Ei[i] - aEnergy);
There may well be other errors as well. For instance, you don't allocate the data buffers if they're not NULL, but assume that they're the right length -- which you've only just calculated. You may justify that in terms of some consistent usage you've stuck to throughout your code, but from the perspective of what we can see here it looks like a pretty bad idea.

Random AI / Switch case?

So I have a very simple game going here..., right now the AI is nearly perfect and I want it to make mistakes every now and then. The only way the player can win is if I slow the computer down to a mind numbingly easy level.
My logic is having a switch case statement like this:
int number = randomNumber
case 1:
computer moves the complete opposite way its supposed to
case 2:
computer moves the correct way
How can I have it select case 2 68% (random percentage, just an example) of the time, but still allow for some chance to make the computer fail? Is a switch case the right way to go? This way the difficulty and speed can stay high but the player can still win when the computer makes a mistake.
I'm on the iPhone. If there's a better/different way, I'm open to it.
Generating Random Numbers in Objective-C
int randNumber = 1 + rand() % 100;
if( randNumber < 68 )
{
//68% path
}
else
{
//32% path
}
int randomNumber = GeneraRandomNumberBetweenZeroAndHundred()
if (randomNumber < 68) {
computer moves the complete opposite way its supposed to
} else {
computer moves the correct way
}
Many PRNGs will offer a random number in the range [0,1). You can use an if-statement instead:
n = randomFromZeroToOne()
if n <= 0.68:
PlaySmart()
else:
PlayStupid()
If you're going to generate an integer from 1 to N, instead of a float, beware of modulo bias in your results.

In what circumstances can a compiler change the execution order of programme statements?

If this is not a real question then feel free to close ;)
Not only the compiler can reorder execution (mostly for optimization), most modern processors do so, too. Read more about execution reordering and memory barriers.
The compiler can change the execution order of statements when it sees fit for optimization purposes, and when such changes wouldn't alter the observable behavior of the code.
A very simple example -
int func (int value)
{
int result = value*2;
if (value > 10)
{
return result;
}
else
{
return 0;
}
}
A naive compiler can generate code for this in exactly the sequence shown. First calculate "result" and return it only if the original value is larger than 10 (if it isn't, "result" would be ignored - calculated needlessly).
A sane compiler, though, would see that the calculation of "result" is only needed when "value" is larger than 10, so may easily move the calculation "value*2" inside the first braces and only do it if "value" is actually larger than 10 (needless to mention, the compiler doesn't really look at the C code when optimizing - it works in lower levels).
This is only a simple example. Much more complicated examples can be created. It is very possible that a C function would end up looking almost nothing like its C representation in compiled form, with aggressive enough optimizations.
Many compilers use something called "common subexpression elimination". For example, if you had the following code:
for(int i=0; i<100; i++) {
x += y * i * 15;
}
the compiler would notice that y * 15 is invariant (its value doesn't change). So it would compute y * 15, stick the result in a register and change the loop statement to "x += r0 * i". This is kind of a contrived example, but you often see expressions like this when working with array indexes or any other base + offset type of situation.