Martin Odersky: Working Hard to Keep It Simple - scala

I was watching the talk given by Martin Odersky (recommended by himself in the Coursera Scala course), and I am quite curious about one aspect of it:
var x = 0
async { x = x + 1 }
async { x = x * 2 }
So I get that it can give 2 if the first statement gets executed first and then the second one:
x = 0;
x = x + 1;
x = x * 2; // this will be x = 2
I get how it can give 1:
x = 0;
x = x * 2;
x = x + 1; // this will be x = 1
However, how can it result in 0? Is it possible that the statements don't execute at all?
Sorry for such an easy question, but I'm really stuck on it.

You need to think about interleaved execution. Remember that the CPU needs to read the value of x before it can work on it. So imagine the following:
Thread 1 reads x (reading 0)
Thread 2 reads x (reading 0)
Thread 1 writes x + 1 (writing 1)
Thread 2 writes x * 2 (writing 0)
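You can watch this happen with a minimal Scala sketch (plain Futures standing in for the talk's async construct): run the two updates concurrently many times and collect the distinct final values.

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

def race(): Int = {
  var x = 0
  val f1 = Future { x = x + 1 }   // read x, add 1, write x
  val f2 = Future { x = x * 2 }   // read x, multiply by 2, write x
  Await.ready(f1, 1.second)
  Await.ready(f2, 1.second)
  x
}

// usually prints Set(1, 2); with enough runs, 0 can show up too
println((1 to 100000).map(_ => race()).toSet)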

I know this has already been answered, but maybe this is still useful:
Think of it as a sequence of atomic operations. The processor is doing one atomic operation at a time.
Here we have the following:
Read x
Write x
Add 1
Multiply 2
The following two sequences are guaranteed to happen in this order "within themselves":
Read x, Add 1, Write x
Read x, Multiply 2, Write x
However, if you are executing them in parallel, the time of execution of each atomic operation relative to any other atomic operation in the other sequence is unpredictable, i.e. the two sequences interleave.
One of the possible orders of execution will produce 0, as shown in the answer by Paul Butcher.
Here is an illustration I found on the internet (not reproduced here): each blue/purple block is one atomic operation, and you can see how different orderings of the blocks produce different results.
To solve this problem you can use the keyword synchronized.
My understanding is that if you mark two blocks of code (e.g. two methods) with synchronized on the same object, then each block will hold that object's lock while it is being executed, so the other block cannot start until the first has finished. However, two synchronized blocks on two different objects can still execute in parallel.
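As a sketch of that fix in Scala: both updates lock the same monitor (the enclosing object), so the two read-modify-write sequences can no longer interleave. The final value still depends on which block runs first (1 or 2), but the lost-update result 0 is ruled out.

object Counter {
  private var x = 0
  def incr(): Unit = synchronized { x = x + 1 }     // holds Counter's lock
  def double(): Unit = synchronized { x = x * 2 }   // waits for the same lock
  def value: Int = synchronized(x)
}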

Related

Julia (JuMP): Indicator constraints with multiple conditional values (is a boolean expression possible?)

I want to implement a constraint depending on the change of values in my binary decision variable, x, over "time".
I am trying to implement a minimum operating time constraint for a unit commitment optimization problem for power systems. x is representing the unit activation where 0 and 1 show that a power unit, n, at a certain time, t, respectively is shut off or turned on.
For this, indicator constraints seem to be a promising solution, and with inspiration from a similar problem the implementation seemed quite straightforward.
So, since boolean operators are introduced (! and ¬), I prematurely wanted to express the change in a boolean way:
@constraint(m, xx1[n=1:N,t=2:T], (!x[n,t-1] && x[n,t]) => {next(t, 1) + next(t, 2) == 2})
Saying: if unit was deactivated before but now is on, then demand the unit to be active for the next 2 times.
Where next(t, i) = x[((t - 1 + i) % T) + 1].
I got the following error:
LoadError: MethodError: no method matching !(::VariableRef)
Closest candidates are:
!(!Matched::Missing) at missing.jl:100
!(!Matched::Bool) at bool.jl:33
!(!Matched::Function) at operators.jl:896
I checked that the indicator constraint is working properly with a single term only.
Question: Is this possible or is there another obvious solution?
Troubleshooting and workarounds: I have tried the following (please correct me if my diagnosis is wrong):
Implement change as an expression: indicator constraints only work with binary integer variables.
Implement change as another variable relating to x. I have found a solution, but it is quite sketchy; it is documented in a Julia Discourse thread. The immediate problem, found from the solution, is that indicator constraints do not work as bi-implication but only one way, LHS => RHS. Please see the proper approach given by @Oscar Dowson.
You can get the working code from GitHub.
The trick is to find constraint(s) that have an equivalent truth-table:
# Like
(!x[1] && x[2]) => {z == 1}
# Is equivalent to:
z >= -x[1] + x[2]
# Proof: -x[1] + x[2] <= z holds in every row of the truth table
#
#  x[1]  x[2]   -x[1] + x[2]
#   0     0          0  <= z
#   1     0         -1  <= z
#   0     1          1  <= z   (forces z = 1)
#   1     1          0  <= z
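If you want to convince yourself of the equivalence mechanically, a brute-force check over the 8 binary assignments (written in Scala here, since the check itself is language-agnostic) confirms that the implication and the linear constraint accept exactly the same points:

for (x1 <- 0 to 1; x2 <- 0 to 1; z <- 0 to 1) {
  val indicator = !(x1 == 0 && x2 == 1) || z == 1   // (!x[1] && x[2]) => z == 1
  val linear    = z >= -x1 + x2
  assert(indicator == linear, s"mismatch at x1=$x1, x2=$x2, z=$z")
}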
I was recommended the MOSEK Modeling Cookbook to help work out the correct formulation of constraints.
See the thread here, from which I got the answer, for further details.

How to do bitwise operation decently?

I'm doing analysis on binary data. Suppose I have two uint8 data values:
a = uint8(0xAB);
b = uint8(0xCD);
I want to take the lower two bits from a, and whole content from b, to make a 10 bit value. In C-style, it should be like:
(a[2:1] << 8) | b
I tried bitget:
bitget(a,2:-1:1)
But this just gave me separate [1, 1] logical type values, which is not a scalar, and cannot be used in the bitshift operation later.
My current solution is:
Combine a and b into a 16-bit value (a << 8 | b):
temp1 = bitor(bitshift(uint16(a), 8), uint16(b));
Left shift six bits to get rid of the higher six bits from a:
temp2 = bitshift(temp1, 6);
Right shift six bits to get rid of lower zeros from the previous result:
temp3 = bitshift(temp2, -6);
Putting all these on one line:
result = bitshift(bitshift(bitor(bitshift(uint16(a), 8), uint16(b)), 6), -6);
This doesn't seem efficient, right? I only want to get (a[2:1] << 8) | b, and it takes a long expression to get the value.
Please let me know if there's a well-known solution for this problem.
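For reference, in a language with C-style operators (Scala used here purely for illustration) the target expression is a one-liner, and its expected value is handy for checking the MATLAB/Octave solutions below:

val a = 0xAB
val b = 0xCD
val result = ((a & 0x03) << 8) | b
println(result)   // 973 = 0x3CD: the low two bits of a in front of the 8 bits of b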
Since you are using Octave, you can make use of bitpack and bitunpack:
octave> a = bitunpack (uint8 (0xAB))
a =
1 1 0 1 0 1 0 1
octave> b = bitunpack (uint8 (0xCD))
b =
1 0 1 1 0 0 1 1
Once you have them in this form, it's dead easy to do what you want:
octave> [b a(1:2)]
ans =
1 0 1 1 0 0 1 1 1 1
Then simply pad with zeros accordingly and pack it back into an integer:
octave> postpad ([b a(1:2)], 16, false)
ans =
1 0 1 1 0 0 1 1 1 1 0 0 0 0 0 0
octave> bitpack (ans, "uint16")
ans = 973
That OR is equivalent to an addition here, because the shifted bits of a and the bits of b do not overlap:
result = bitshift(bi2de(bitget(a,1:2)),8) + b;
e.g.
a      = 01010111
b      = 10010010
result = 00000011 10010010
       = a[2]*2^9 + a[1]*2^8 + b
an alternative method could be
result = mod(a,2^x)*2^y + b;
where x is the number of bits you want to extract from a and y is the number of bits of b; in your case:
result = mod(a,4)*256 + b;
an extra alternative solution close to the C one (note the cast to uint16 before the shift, so the intermediate doesn't get truncated to 8 bits):
result = bitor(bitshift(bitand(uint16(a),3), 8), uint16(b));
I think it is important to explain exactly what "(a[2:1] << 8) | b" is doing.
In assembly, referencing individual bits is not a single operation; the convenient-looking a[2:1] must be computed like anything else. Assume all operations take the exact same time, and the "efficient" a[2:1] starts looking extremely inefficient.
The convenience statement actually does (a & 0x03).
If your compiler actually converts a uint8 to a uint16 based on how much it was shifted, this is not a 'free' operation, per se. Effectively, what your compiler will do is first clear the "memory" to the size of uint16 and then copy "a" into the location. This requires an extra step (clearing the "memory" (register)) that wouldn't normally be needed.
This means your statement actually is (uint16(a & 0x03) << 8) | uint16(b)
Now yes, because you're doing a power-of-two shift, you could just move a into AH, move b into AL, AND AH with 0x03, and move it all out, but that's a compiler optimization and not what your C code said to do.
The point is that directly translating that statement into matlab yields
bitor(bitshift(uint16(bitand(a,3)),8),uint16(b))
But, it should be noted that while it is not as TERSE as (a[2:1] << 8) | b, the number of "high level operations" is the same.
Note that all scripting languages are going to be very slow upon initiating each instruction, but will complete said instruction rapidly. The terse nature of Python isn't because "terse is better" but to create simple structures that the language can recognize so it can easily go into vectorized operations mode and start executing code very quickly.
The point here is that you have an "overhead" cost for calling bitand; but when operating on an array it will use SSE and that "overhead" is only paid once. The JIT (just in time) compiler, which optimizes script languages by reducing overhead calls and creating temporary machine code for currently executing sections of code MAY be able to recognize that the type checks for a chain of bitwise operations need only occur on the initial inputs, hence further reducing runtime.
Very high level languages are quite different (and frustrating) from high level languages such as C. You are giving up a large amount of control over code execution for ease of code production; whether matlab actually has implemented uint8 or if it is actually using a double and truncating it, you do not know. A bitwise operation on a native uint8 is extremely fast, but to convert from float to uint8, perform bitwise operation, and convert back is slow. (Historically, Matlab used doubles for everything and only rounded according to what 'type' you specified)
Even now, Octave 4.0.3 has a compiled bitshift function for which bitshift(ones('uint32'), -32) wraps back around to 1. BRILLIANT! VHLLs place you at the mercy of the language: it isn't about how terse or how verbose you write the code, it's how the blasted language decides to interpret it and execute machine-level code. So instead of shifting, uint32(floor(ones / 2^32)) is actually FASTER and more accurate.

What does for (i = 1; i <= power; i += 1) { mean?

Sorry for a pretty basic question.
Just starting out, using Flowgorithm to write code that calculates exponential numbers.
The code to do it is:
function exponential(base, power) {
    var answer;
    answer = 1;
    var i;
    for (i = 1; i <= power; i += 1) {
        answer = answer * base;
    }
    return answer;
}
It then loops up to the number power. I understand that in the Flowgorithm chart, but I don't understand the code for it.
What does each section of the for statement mean?
i = 1 to power, I just need help understanding how it is written. What is the i += 1 bit?
Thanks.
The exponential function will take in 2 parameters, base and power.
You can create this function and call (fire) it whenever it is needed, like so: exponential(2,4).
The for (i = 1; 1 <= power; i+=1) is somewhat of an ugly for loop.
for loops traditionaly take three parameters. The first parameter in this case i =1 is the assignment parameter, the next one 1 <= power is the valadation parameter. So if we call the function like so...exponential(2,4) is i less than 4? The next parameter is an increment/decrement parameter. but this doesnt get executed until the code inside the for loop gets executed. Once the code inside the for loop is executed then this variable i adds 1 to itself so it is now 2. This is usefull because once i is no longer less than or equal to power it will exit the for loop. So in the case of exponential(2,4) once the code inside this for loop is ran 5 times it will exit the for loop because 6 > 5.
So if we look at the variable answer, we can see that before this for loop was entered, answer was equal to 1. After the first iteration of this for loop, answer = answer times base. In the case of exponential(2,4), answer equals 1 times 2, so now answer = 2. But we have only looped through the for loop once, and like I said, a for loop goes (assignment, validation, "code inside the for loop", then back up to increment/decrement). So since we loop through this for loop 4 times in the case of exponential(2,4), it will look like so:
exponential(2,4)
answer = 1 * 2
now answer = 2
answer = 2 * 2
now answer = 4
answer = 4 * 2
now answer = 8
answer = 8 * 2
now answer = 16
So if we write var ans = exponential(2,4)
Then ans would equal 16; hence the return answer; at the last line of your code.
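For comparison, here is the same logic in Scala (a sketch), with the three parts of the for header written out explicitly:

def exponential(base: Int, power: Int): Int = {
  var answer = 1
  var i = 1              // assignment: i starts at 1
  while (i <= power) {   // validation: keep looping while i <= power
    answer = answer * base
    i += 1               // increment: runs after each pass through the body
  }
  answer                 // exponential(2, 4) == 16
}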

In count change recursive algorithm, why do we return 1 if the amount = 0?

I am taking a Coursera course and, for an assignment, I have written code to count the ways to make change for an amount given a list of denominations. After doing a lot of research, I found explanations of various algorithms. In the recursive implementation, one of the base cases is: if the amount is 0, then the count is 1. I don't understand why, but this is the only way the code works. I feel that if the amount is 0 then there is no way to make change for it, and I should throw an exception. The code looks like:
def countChange(amount: Int, denoms: List[Int]): Int = {
  if (amount == 0) return 1
  ...
Any explanation is much appreciated.
To avoid speaking specifically about the Coursera problem, I'll refer to a simpler but similar problem.
How many outcomes are there for 2 coin flips? 4.
(H,H),(H,T),(T,H),(T,T)
How many outcomes are there for 1 coin flip? 2.
(H),(T)
How many outcomes are there for 0 coin flips? 1.
()
Expressing this recursively, how many outcomes are there for N coin flips? Let's call it f(N) where
f(N) = 2 * f(N - 1), for N > 0
f(0) = 1
The N = 0 trivial (base) case is chosen so that the non-trivial cases, defined recursively, work out correctly. Since we're doing multiplication in this example and the identity element for multiplication is 1, it makes sense to choose that as the base case.
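The recursion above, transcribed directly into Scala:

def f(n: Int): Int = if (n == 0) 1 else 2 * f(n - 1)
// f(0) == 1, f(1) == 2, f(2) == 4, ...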
Alternatively, you could argue from combinatorics: n choose 0 = 1, 0! = 1, etc.
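Coming back to the change-counting problem, here is a sketch of a typical recursive formulation (not necessarily the course's reference solution) showing where the base case fits: amount == 0 means there is exactly one way to make change, namely using no coins at all.

def countChange(amount: Int, denoms: List[Int]): Int =
  if (amount == 0) 1                                  // the empty combination counts as one way
  else if (amount < 0 || denoms.isEmpty) 0            // overshoot, or no denominations left
  else countChange(amount - denoms.head, denoms) +    // use a coin of the first denomination
       countChange(amount, denoms.tail)               // or skip that denomination entirely

// countChange(4, List(1, 2)) == 3  (1+1+1+1, 1+1+2, 2+2)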

Calculating prime numbers in Scala: how does this code work?

So I've spent hours trying to work out exactly how this code produces prime numbers.
lazy val ps: Stream[Int] = 2 #:: Stream.from(3).filter(i =>
ps.takeWhile{j => j * j <= i}.forall{ k => i % k > 0});
I've used a number of printlns etc., but nothing's making it clearer.
This is what I think the code does:
/**
* [2,3]
*
* takeWhile 2*2 <= 3
* takeWhile 2*2 <= 4 found match
* (4 % [2,3] > 1) return false.
* takeWhile 2*2 <= 5 found match
* (5 % [2,3] > 1) return true
* Add 5 to the list
* takeWhile 2*2 <= 6 found match
* (6 % [2,3,5] > 1) return false
* takeWhile 2*2 <= 7
* (7 % [2,3,5] > 1) return true
* Add 7 to the list
*/
But if I change j*j in the code to 2*2, which I assumed would work exactly the same, it causes a StackOverflowError.
I'm obviously missing something fundamental here, and could really use someone explaining this to me like I was a five-year-old.
Any help would be greatly appreciated.
I'm not sure that seeking a procedural/imperative explanation is the best way to gain understanding here. Streams come from functional programming and they're best understood from that perspective. The key aspects of the definition you've given are:
It's lazy. Other than the first element in the stream, nothing is computed until you ask for it. If you never ask for the 5th prime, it will never be computed.
It's recursive. The list of prime numbers is defined in terms of itself.
It's infinite. Streams have the interesting property (because they're lazy) that they can represent a sequence with an infinite number of elements. Stream.from(3) is an example of this: it represents the list [3, 4, 5, ...].
Let's see if we can understand why your definition computes the sequence of prime numbers.
The definition starts out with 2 #:: .... This just says that the first number in the sequence is 2 - simple enough so far.
The next part defines the rest of the prime numbers. We can start with all the counting numbers starting at 3 (Stream.from(3)), but we obviously need to filter a bunch of these numbers out (i.e., all the composites). So let's consider each number i. If i is not a multiple of a lesser prime number, then i is prime. That is, i is prime if, for all primes k less than i, i % k > 0. In Scala, we could express this as
nums.filter(i => ps.takeWhile(k => k < i).forall(k => i % k > 0))
However, it isn't actually necessary to check all lesser prime numbers -- we really only need to check the prime numbers whose square is less than or equal to i (this is a fact from number theory*). So we could instead write
nums.filter(i => ps.takeWhile(k => k * k <= i).forall(k => i % k > 0))
So we've derived your definition.
Now, if you happened to try the first definition (with k < i), you would have found that it didn't work. Why not? It has to do with the fact that this is a recursive definition.
Suppose we're trying to decide what comes after 2 in the sequence. The definition tells us to first determine whether 3 belongs. To do so, we consider the list of primes up to the first one greater than or equal to 3 (takeWhile(k => k < i)). The first prime is 2, which is less than 3 -- so far so good. But we don't yet know the second prime, so we need to compute it. Fine, so we need to first see whether 3 belongs ... BOOM!
* It's pretty easy to see that if a number n is composite then the square of one of its factors must be less than or equal to n. If n is composite, then by definition n == a * b, where 1 < a <= b < n (we can guarantee a <= b just by labeling the two factors appropriately). From a <= b it follows that a^2 <= a * b, so it follows that a^2 <= n.
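You can reproduce the "BOOM!" concretely: with k < i, deciding whether 3 belongs forces the stream to read past its own frontier, and forcing a second element throws a StackOverflowError.

lazy val bad: Stream[Int] = 2 #:: Stream.from(3).filter(i =>
  bad.takeWhile(k => k < i).forall(k => i % k > 0))

// bad.take(2).toList   // StackOverflowError: computing element 2 requires element 2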
Your explanations are mostly correct, you made only two mistakes:
takeWhile doesn't include the last checked element:
scala> List(1,2,3).takeWhile(_<2)
res1: List[Int] = List(1)
You assume that ps always contains only a two and a three, but because Stream is lazy it is possible to add new elements to it. In fact, each time a new prime is found it is added to ps, and in the next step takeWhile will consider this newly added element. Here it is important to remember that the tail of a Stream is computed only when it is needed; thus takeWhile can't see it before forall has evaluated to true.
Keep these two things in mind and you should come up with this:
ps = [2]
i = 3
takeWhile
2*2 <= 3 -> false
forall on []
-> true
ps = [2,3]
i = 4
takeWhile
2*2 <= 4 -> true
3*3 <= 4 -> false
forall on [2]
4%2 > 0 -> false
ps = [2,3]
i = 5
takeWhile
2*2 <= 5 -> true
3*3 <= 5 -> false
forall on [2]
5%2 > 0 -> true
ps = [2,3,5]
i = 6
...
While these steps describe the behavior of the code, they are not fully correct, because not only is adding elements to the Stream lazy: every operation on it is lazy too. This means that when you call xs.takeWhile(f), not all values up to the point where f is false are computed at once; they are computed when forall wants to see them (forall is the only function here that needs to look at all elements before it can definitely answer true; for false it can abort earlier). Here is the computation order when laziness is considered everywhere (example only looking at 9):
ps = [2,3,5,7]
i = 9
takeWhile on 2
2*2 <= 9 -> true
forall on 2
9%2 > 0 -> true
takeWhile on 3
3*3 <= 9 -> true
forall on 3
9%3 > 0 -> false
ps = [2,3,5,7]
i = 10
...
Because forall is aborted when it evaluates to false, takeWhile doesn't calculate the remaining possible elements.
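You can observe this interleaving directly by printing from inside both predicates; forcing a short prefix reproduces the order traced above:

lazy val ps: Stream[Int] = 2 #:: Stream.from(3).filter { i =>
  ps.takeWhile { j => println(s"takeWhile: $j * $j <= $i ?"); j * j <= i }
    .forall { k => println(s"forall: $i % $k > 0 ?"); i % k > 0 }
}

ps.take(5).foreach(println)   // the checks print interleaved, then 2 3 5 7 11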
That code is easier (for me, at least) to read with some variables renamed suggestively, as
lazy val ps: Stream[Int] = 2 #:: Stream.from(3).filter(i =>
ps.takeWhile{p => p * p <= i}.forall{ p => i % p > 0});
This reads left-to-right quite naturally, as
primes are 2, and those numbers i from 3 up such that all of the primes p whose square does not exceed i do not divide i evenly (i.e. leave a non-zero remainder).
In a true recursive fashion, to understand this definition as defining the ever increasing stream of primes, we assume that it is so, and from that assumption we see that no contradiction arises, i.e. the truth of the definition holds.
The only potential problem after that, is the timing of accessing the stream ps as it is being defined. As the first step, imagine we just have another stream of primes provided to us from somewhere, magically. Then, after seeing the truth of the definition, check that the timing of the access is okay, i.e. we never try to access the areas of ps before they are defined; that would make the definition stuck, unproductive.
I remember reading somewhere (don't recall where) something like the following, a conversation between a student and a wizard:
student: which numbers are prime?
wizard: well, do you know what number is the first prime?
s: yes, it's 2.
w: okay (quickly writes down 2 on a piece of paper). And what about the next one?
s: well, next candidate is 3. we need to check whether it is divided by any prime whose square does not exceed it, but I don't yet know what the primes are!
w: don't worry, I'll give them to you. It's a magic I know; I'm a wizard after all.
s: okay, so what is the first prime number?
w: (glances over the piece of paper) 2.
s: great, so its square is already greater than 3... HEY, you've cheated! .....
Here's a pseudocode1 translation of your code, read partially right-to-left, with some variables again renamed for clarity (using p for "prime"):
ps = 2 : filter (\i-> all (\p->rem i p > 0) (takeWhile (\p->p^2 <= i) ps)) [3..]
which is also
ps = 2 : [i | i <- [3..], and [rem i p > 0 | p <- takeWhile (\p->p^2 <= i) ps]]
which is a bit more visually apparent, using list comprehensions. and checks that all entries in a list of Booleans are True (read | as "for", <- as "drawn from", , as "such that" and (\p-> ...) as "lambda of p").
So you see, ps is a lazy list of 2, and then of numbers i drawn from a stream [3,4,5,...] such that for all p drawn from ps such that p^2 <= i, it is true that i % p > 0. Which is actually an optimal trial division algorithm. :)
There's a subtlety here of course: the list ps is open-ended. We use it as it is being "fleshed out" (that, of course, because it is lazy). When primes p are drawn from ps, it could potentially be the case that we run past its end, in which case we'd have a non-terminating calculation on our hands (a "black hole"). It just so happens :) (and needs to / can be proved mathematically) that this is impossible with the above definition. So 2 is put into ps unconditionally, so there's something in it to begin with.
But if we try to "simplify",
bad = 2 : [i | i <- [3..], and [rem i p > 0 | p <- takeWhile (\p->p < i) bad]]
it stops working after producing just one number, 2: when considering 3 as the candidate, takeWhile (\p->p < 3) bad demands the next number in bad after 2, but there aren't yet any more numbers there. It "jumps ahead of itself".
This is "fixed" with
bad = 2 : [i | i <- [3..], and [rem i p > 0 | p <- [2..(i-1)] ]]
but that is a much much slower trial division algorithm, very far from the optimal one.
--
1 (Haskell actually, it's just easier for me that way :) )