Unicode BIDI (UBA): R RLI R; Why should RLI stay on level 0 - unicode

Implementing the isolate logic for the BIDI algorithm got me stuck again. Consider the following test case:
#Levels: 1 0 1
#Reorder: 0 1 2
R RLI R ; 2 # LTR flag
\____/ \_/ (my run sequences)
Here is my current understanding:
After processing rules X1-X8 you end up with levels 0 0 1, which convert to two isolating run sequences.
As far as I can tell, rules W1-W7 should not apply to any characters.
Rule N1 however will raise the level of the RLI to 1 as it is an NI between two strong types, the R and the eos of the first sequence.
Rule I1/2 will then push the first R to level 1.
Rule L1 could lower the level of an RLI but does not apply here.
How is it that the RLI should end up on level 0?
I have the same problem with similar tests, just in the opposite direction.
#Levels: 2 1 2
#Reorder: 2 1 0
L LRI L; 4
Sadly http://unicode.org/cldr/utility/bidi.jsp does not implement the latest algorithm including the RLI/LRI.
Any advice welcome :) thank you.

Okay, the issue was that rule N1 should not have been applied, since the eos is of type L and not R.
I had missed the note at X10 stating: "...and if there is none or the last character of the sequence is an isolate initiator (lacking a matching PDI), with the paragraph embedding level."
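To make the resolution concrete, here is a minimal Python sketch of just the pieces involved in these two test cases. It is not a UBA implementation: the function names are made up for illustration, it only handles the strong types L and R, and it takes the levels after X1-X8 as given. It assumes the isolate initiator has no matching PDI, so eos comes from comparing the initiator's level with the paragraph embedding level, as the X10 note says.

```python
def direction_of(level):
    """Even levels are left-to-right (L), odd levels are right-to-left (R)."""
    return "L" if level % 2 == 0 else "R"


def resolve_neutral(before, after, embedding_level):
    """N1 if both neighbours agree, otherwise N2 (embedding direction)."""
    if before == after:
        return before
    return direction_of(embedding_level)


def implicit_level(bidi_type, level):
    """I1/I2 restricted to L and R (EN/AN are deliberately left out)."""
    if level % 2 == 0:
        return level + 1 if bidi_type == "R" else level
    return level + 1 if bidi_type == "L" else level


def resolve_case(outer, inner, explicit_levels, paragraph_level):
    """Levels for `outer <isolate initiator> inner` without a matching PDI."""
    outer_level, initiator_level, inner_level = explicit_levels
    # X10 note: the sequence ends in an unmatched isolate initiator, so eos
    # is taken from the higher of its level and the paragraph level.
    eos = direction_of(max(initiator_level, paragraph_level))
    initiator_type = resolve_neutral(outer, eos, initiator_level)
    return [
        implicit_level(outer, outer_level),
        implicit_level(initiator_type, initiator_level),
        implicit_level(inner, inner_level),
    ]


print(resolve_case("R", "R", [0, 0, 1], paragraph_level=0))  # [1, 0, 1]
print(resolve_case("L", "L", [1, 1, 2], paragraph_level=1))  # [2, 1, 2]
```

In the first case eos is L while the preceding strong type is R, so N1 does not fire and N2 gives the RLI the embedding direction (L, level 0).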


How does the scan operator help to identify indices where a rule holds for 2 observations in a row?

A boolean vector has been created based on some rule and we need to identify the indices where the rule holds for 2 observations in a row. The following code does that
indices:0101001101b
runs:{0 x\x}"f"$;
where 2=runs indices
Could you please help me understand how the scan operator is used in the definition of the runs function? Appreciate your help.
It's using this special shorthand commonly used in calculating exponential moving averages: https://code.kx.com/q/ref/accumulators/#alternative-syntax
So {0 x\x} is equivalent to:
q){z+x*y}\[0;indices;indices]
0 1 0 1 0 0 1 2 0 1
What this is doing is essentially using the booleans as an on/off switch (via the boolean multiplication) for the rolling sum. It keeps adding (z+) until it hits a false boolean (0b), at which point the rolling sum resets to zero.
In English: nextValue + [currentValue (starting at 0) * nextValue]
When nextValue is 1, 1 gets added to the running count. When nextValue is 0, the result is zero (resetting the rolling sum).
Something like this can achieve the same thing, though it is no easier to read at a glance (and it uses two scans instead of one):
q){s-maxs not[x]*s:sums x}indices
0 1 0 1 0 0 1 2 0 1i
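For readers less familiar with q, the on/off-switch accumulation described above can be sketched in Python. This is only an illustration of the same recurrence, not a translation of how q evaluates the scan; the function name `runs` is borrowed from the question.

```python
def runs(flags):
    """Running count of consecutive 1s; the count resets to 0 at every 0."""
    out = []
    acc = 0  # plays the role of the initial value 0 in {0 x\x}
    for flag in flags:
        # nextValue + currentValue * nextValue
        acc = flag + acc * flag
        out.append(acc)
    return out


indices = [0, 1, 0, 1, 0, 0, 1, 1, 0, 1]
counts = runs(indices)
print(counts)  # [0, 1, 0, 1, 0, 0, 1, 2, 0, 1]
print([i for i, c in enumerate(counts) if c == 2])  # [7]
```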
Terry has answered your question about how runs works.
Comparing adjacent items is common. You might prefer to use the prior keyword. Certainly easier to see what it is doing.
q)where (and) prior indices
,7

Principle of duality

The following is given in my textbook-
When we start with a boolean relation, another boolean relation can be derived by:
changing OR to AND
changing AND to OR
replacing 0 with 1 and 1 with 0
This derived expression is known as the dual of the initial expression.
So the dual of 1 AND 0 is 0 OR 1, isn't it?
What about the dual of X OR X' ? Is it X' AND X? (Note that ' stands for complement.)
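As a rough check, the three substitutions can be applied mechanically. The sketch below is a toy, assuming expressions are written as whitespace-separated tokens using the words AND and OR; the function name `dual` is made up for illustration. Variables and the complement mark are left untouched.

```python
SWAPS = {"AND": "OR", "OR": "AND", "0": "1", "1": "0"}


def dual(expression):
    """Swap AND/OR and 0/1 token by token; leave everything else as is."""
    return " ".join(SWAPS.get(token, token) for token in expression.split())


print(dual("1 AND 0"))   # 0 OR 1
print(dual("X OR X'"))   # X AND X'
```

X AND X' is the same as X' AND X because AND is commutative.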
PS -
One change we did not need to make as part of this interchange was to
complement. We say that complement is a self-dual operation. The
identity or do-nothing operation x (copy the input to the output) is
also self-dual.
What does the above line mean?
Source: Duality principle on Wikipedia

Dot Product: * Command vs. Loop gives different results

I have two vectors in Matlab, z and beta. Vector z is a 1x17:
1 0.430742139435890 0.257372971229541 0.0965909090909091 0.694329541928697 0 0.394960106863064 0 0.100000000000000 1 0.264704325268675 0.387774594078319 0.269207605609567 0.472226643323253 0.750000000000000 0.513121013402805 0.697062571025173
... and beta is a 17x1:
6.55269487769363e+26
0
0
-56.3867588816768
-2.21310778926413
0
57.0726052009847
0
3.47223691057151e+27
-1.00249317882651e+27
3.38202232046686
1.16425987969027
0.229504956512063
-0.314243264212449
-0.257394312588330
0.498644243389556
-0.852510642195370
I'm dealing with some singularity issues, and I noticed that if I want to compute the dot product of z*beta, I potentially get 2 different solutions. If I use the * command, z*beta = 18.5045. If I write a loop to compute the dot product (below), I get a solution of 0.7287.
summation = 0;
for i = 1:17
    addition = z(1,i) * beta(i);
    summation = summation + addition;
end
Any idea what's going on here?
Here's a link to the data: https://dl.dropboxusercontent.com/u/16594701/data.zip
The problem here is that addition of floating point numbers is not associative. When summing a sequence of numbers of comparable magnitude, this is not usually a problem. However, in your sequence, most numbers are around 1 or 10, while several entries have magnitude 10^26 or 10^27. Numerical problems are almost unavoidable in this situation.
The Wikipedia page http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems shows a worked example where (a + b) + c is not equal to a + (b + c), demonstrating that the order in which you add up floating point numbers does matter.
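The effect is easy to reproduce with magnitudes like the ones in the question. The sketch below uses Python rather than Matlab, with made-up values (not the asker's data); both languages use IEEE 754 doubles, so the same rounding applies.

```python
import math

big = 1e27    # same order of magnitude as the large beta entries
small = 1.0   # same order of magnitude as the ordinary entries

# Adding the small term first loses it: 1.0 is far below the spacing
# between adjacent doubles near 1e27.
left_to_right = (big + small) - big
# Cancelling the large terms first keeps the small term.
regrouped = (big - big) + small

print(left_to_right)  # 0.0
print(regrouped)      # 1.0

# math.fsum tracks the lost low-order bits and returns the correctly
# rounded sum regardless of the order of the terms.
print(math.fsum([big, small, -big]))  # 1.0
```

A loop and a built-in matrix product may accumulate the terms in different orders, which is enough to explain two different answers here.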
I would guess that this is a homework assignment designed to illustrate these exact issues. If not, I'd ask what the data represents to suss out the appropriate approach. It would probably be much more productive to find out why such large numbers are being produced in the first place than trying to make sense of the dot product that includes them.

How are the columns and rows counted in pascal function in Functional Programming Principles in Scala at coursera?

I'm learning Scala while going through the Coursera course Functional Programming Principles in Scala.
The first exercise says:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
The numbers at the edge of the triangle are all 1, and each number
inside the triangle is the sum of the two numbers above it. Write a
function that computes the elements of Pascal’s triangle by means of a
recursive process.
Do this exercise by implementing the pascal function in Main.scala,
which takes a column c and a row r, counting from 0 and returns the
number at that spot in the triangle. For example, pascal(0,2)=1,
pascal(1,2)=2 and pascal(1,3)=3.
At the start, I understand, as he refers to the 'numbers' we are all familiar with, but then he goes on to use the term "elements." What does he mean by this? What does he want me to compute?
I assumed that he got bored with the word "number" and thought, after defining the names of the numbers in the triangle as 'numbers' he just wanted to use something new, thus "element," but no matter how I count I cannot get the references to work.
I cannot even really understand the term 'column' seeing as the numbers are not vertically above each other.
Can you please explain how he gets pascal(1,3) == 3?
You're thinking about columns a bit wrong. By "xth column," he means the "xth entry" in a given row.
So, if you are looking at the function pascal(c,r), you would want to figure out what the cth number is in the rth row.
So, for example:
pascal(1,2) corresponds to the second entry in the 3rd row
1
1 1
1 *2* 1
pascal(1,3) wants you to look at the second entry in the 4th row.
1
1 1
1 2 1
1 *3* 3 1
Just count from the left. (0,2) is the leftmost number in the row
1 2 1
so (1,3) would be the second number in
1 3 3 1
You can simply make the triangle "rectangle", and everything will become apparent:
cols->  0  1  2  3  4
row-0   1
row-1   1  1
row-2   1  2  1
row-3   1  3  3  1
row-4   1  4  6  4  1
And you were right that the triangle's "elements" are just its numbers; there is a subtle difference between the two terms, but it is insubstantial in this case.
P.S. I would personally advise using the course forum for such questions:
It will avoid controversial issues with the honor code.
Your fellow students will understand the problem at hand more quickly.
They will have access to material that is not available to those not taking the course.
It will help build a sense of membership among the course students, and give you all a chance to create new, possibly fruitful, relationships.
What you're asking is against the Coursera Honor Code: https://www.coursera.org/maestro/auth/normal/tos.php#honorcode
http://www.aiqus.com/questions/41299/coursera-cheating-scala-course
I loved solving this exercise.
My thought process was the following:
Understanding that the problem is a literal description of the binomial coefficient. https://en.wikipedia.org/wiki/Binomial_coefficient
Understanding that the ask is a literal plug into the formula row! / (col! * (row - col)!), and that the formula is right there on the wiki page
The only thing missing now is implementing a tail-recursive factorial function
Bonus. if you use the extension method as such
extension (int: Int) {
  def ! = factorialTailRec(int)
}
// you get to write
(r.!) / ((c.!) * ((r - c).!))
You get to write almost the identical mathematical formula. And at that moment I realised the similarities between doing maths and programming. And I cried a little with the beauty of it.
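The (column, row) indexing and the binomial-coefficient reading above can be checked against the examples given in the exercise text. This is a quick Python sketch using the standard library's `math.comb`, not the recursive Scala solution the exercise asks for; the function name `pascal` mirrors the exercise.

```python
import math


def pascal(c, r):
    """Entry at column c of row r, both counted from 0."""
    return math.comb(r, c)  # r! / (c! * (r - c)!)


print(pascal(0, 2), pascal(1, 2), pascal(1, 3))  # 1 2 3
print([pascal(c, 4) for c in range(5)])          # [1, 4, 6, 4, 1]
```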

Linspace vs range

I was wondering what is better style / more efficient:
x = linspace(-1, 1, 100);
or
x = -1:0.01:1;
As Oli Charlesworth mentioned, in linspace you divide the interval [a,b] into N points, whereas with the : form, you step-out from a with a specified step size (default 1) till you reach b.
One thing to keep in mind is that linspace always includes both end points, whereas the : form includes the second end point only if the step size lands on it exactly at the last step; otherwise it falls short. Example:
0:3:10
ans =
0 3 6 9
That said, which of the two approaches I use depends on what I need to do. If all I need is to sample an interval with a fixed number of points (and I don't care about the step size), I use linspace.
In many cases, I don't care if it doesn't fall on the last point, e.g., when working with polar co-ordinates, I don't need the last point, as 2*pi is the same as 0. There, I use 0:0.01:2*pi.
As always, use the one that best suits your purposes, and that best expresses your intentions. So use linspace when you know the number of points; use : when you know the spacing.
[Incidentally, your two examples are not equivalent; the second one will give you 201 points.]
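The difference in endpoint handling and point count can be mimicked outside MATLAB. The sketch below uses two hypothetical Python helpers, `linspace` and `colon`, that only approximate the behaviour described above; they are not MATLAB's actual algorithms, and the small tolerance in `colon` is an assumption added to absorb rounding in the division.

```python
import math


def linspace(start, stop, num):
    """num evenly spaced points, always including both end points (num >= 2)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num - 1)] + [stop]


def colon(start, step, stop):
    """Step out from start by step; stop before overshooting stop."""
    count = math.floor((stop - start) / step + 1e-10) + 1
    return [start + i * step for i in range(count)]


print(colon(0, 3, 10))              # [0, 3, 6, 9]
print(len(linspace(-1, 1, 100)))    # 100
print(len(colon(-1, 0.01, 1)))      # 201
```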
As Oli already pointed out, it's usually easiest to use linspace when you know the number of points you want and the colon operator when you know the spacing you want between elements.
However, it should be noted that the two will often not give you exactly the same results. As noted here and here, the two approaches use slightly different methods to calculate the vector elements (here's an archived description of how the colon operator works). That's why these two vectors aren't equal:
>> a = 0:0.1:1;
>> b = linspace(0,1,11);
>> a-b
ans =
1.0e-016 *
Columns 1 through 8
0 0 0 0.5551 0 0 0 0
Columns 9 through 11
0 0 0
This is a typical side-effect of how floating-point numbers are represented. Certain numbers can't be exactly represented (like 0.1) and performing the same calculation in different ways (i.e. changing the order of mathematical operations) can lead to ever so slightly different results, as shown in the above example. These differences are usually on the order of the floating-point precision, and can often be ignored, but you should always be aware that they exist.
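The same kind of discrepancy shows up in any language that uses IEEE 754 doubles. The Python sketch below computes the value 0.3 in two ways; it is not how MATLAB builds either vector internally, just an illustration of two computation orders disagreeing in the last bit.

```python
import math

from_step = 3 * 0.1      # three steps of 0.1
from_division = 3 / 10   # a fraction of the interval

print(from_step == from_division)              # False
print(from_step - from_division)               # 5.551115123125783e-17
print(math.isclose(from_step, from_division))  # True
```

When comparing such vectors, a tolerance-based comparison is usually more appropriate than exact equality.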