How to select the indices of a price series where there is a difference of x bips? - kdb

I have a price series and I'd like to know the indices where there has been a change of x bips. I worked out a very ugly way to accomplish this in a loop, e.g.:
q)bips:200
q)level:0.001*bips / 0.2
q)price: 1.0 1.1 1.3 1.8 1.9 2.0 2.3
q)ix:0
q)lastix:0
q)result:enlist lastix
q)do[count price;if[abs(price[ix]-price[lastix])>level;result,:ix;lastix:ix];ix:ix+1];
q)result
0 2 3 6
This is a simple O(n) algorithm that walks through the price series and keeps a marked index (lastix), starting from the first element. When it finds a price whose difference from the marked price is greater than level, it saves that index and updates lastix to it. Is there a more idiomatic way to do it?
The if condition inside my loop is somewhat flawed, and I don't know exactly why: if I check abs(price[lastix]-price[ix]) instead of abs(price[ix]-price[lastix]), it doesn't give correct results.
UPDATE: I was aware of deltas, but it compares consecutive elements only, and that's not what I need in my OP. I apologize if the price series example in the OP was ambiguous and led to correct results by simply using deltas. Here is a counterexample with a new price series:
q)price: 1.0 1.1 1.21 1.42 1.4 1.32 1.63
q)where abs deltas price > level
,0
and this is not correct. The correct result, which is produced by the accepted answer, is still
0 2 3 6
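As an illustration outside q, the difference between comparing against the last marked price and comparing consecutive prices can be sketched in Python. The function names marked_indices and consecutive_indices are made up for this sketch, and level is written directly as 0.2 rather than derived from bips.

```python
def marked_indices(prices, level):
    """Indices where the price has moved more than `level` from the last marked price."""
    result = [0]   # index 0 is always the first mark
    last = 0       # index of the most recently marked price
    for i, p in enumerate(prices):
        if abs(p - prices[last]) > level:
            result.append(i)
            last = i
    return result


def consecutive_indices(prices, level):
    """Indices where a price differs from its immediate predecessor by at least `level`."""
    return [i for i in range(1, len(prices))
            if abs(prices[i] - prices[i - 1]) >= level]


level = 0.2  # 200 bips at 0.001 each
first = [1.0, 1.1, 1.3, 1.8, 1.9, 2.0, 2.3]
second = [1.0, 1.1, 1.21, 1.42, 1.4, 1.32, 1.63]

print(marked_indices(first, level))        # [0, 2, 3, 6]
print(marked_indices(second, level))       # [0, 2, 3, 6]
print(consecutive_indices(second, level))  # [3, 6]
```

On the flawed if condition: q evaluates right to left with no operator precedence, so abs(a-b)>level most likely applies abs to the result of the comparison rather than to the difference, which would explain why swapping the operands changes the outcome. Python does not behave this way.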

I think you're looking for something like this maybe:
f:{where differ{$[level<abs[y-x];y;x]}\[x]}
This carries forward the last value that satisfied your condition, using a conditional ($) for the comparison together with the scan adverb, and then uses differ to pick out where the condition was satisfied and the value was updated.
If I've understood your problem correctly, the same result should come from
newprice:1 1.1 1.3 1.8 1.9 2 2.1
since the final value is more than 0.2 greater than 1.8, the last value at which the level was updated.
q)f newprice
0 2 3 6
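For readers who don't know q, a rough Python equivalent of f might look like the sketch below. itertools.accumulate stands in for the scan adverb and the small differ helper imitates q's differ. Both are approximations written for this illustration, and level is passed as a parameter rather than read from a global.

```python
from itertools import accumulate


def differ(values):
    """True where an item differs from its predecessor; the first item is always True."""
    return [i == 0 or v != values[i - 1] for i, v in enumerate(values)]


def f(prices, level):
    # Scan: keep the previously marked price x unless y has moved more than `level` from it.
    carried = list(accumulate(prices, lambda x, y: y if level < abs(y - x) else x))
    return [i for i, changed in enumerate(differ(carried)) if changed]


newprice = [1, 1.1, 1.3, 1.8, 1.9, 2, 2.1]
print(f(newprice, 0.2))  # [0, 2, 3, 6]
```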
Thanks,
Ryan

I'm not sure if this is exactly what you're looking for, but deltas will give you the change between consecutive pairs:
q)deltas price
1 0.1 0.2 0.5 0.1 0.1 0.3
Checking for your condition returns a boolean list:
q)level<=deltas price
1011001b
Finally 'where' will return the indices:
q)where level<=deltas price
0 2 3 6
Thanks,
Jamie

level:0.001*bips:200;
result:where level<=abs deltas price:1.0 1.1 1.3 1.8 1.9 2.0 2.3;
result
0 2 3 6
Is this close to what you're looking for?
deltas computes the difference between each value and the previous one, abs takes the absolute value, and each difference is then compared against "level", which you have predefined; where then finds the associated indices.

You've included index 0 in your answer but if you want to exclude it you can use the two argument form of deltas:
q)where level<=abs deltas[price 0;price]
2 3 6
Here the first argument sets the initial value to subtract, in this case the first element of the price list.
One case where this is useful: if you were running the function for each date in a partitioned db, you could pass in the last value from the previous day, so that the first index of a day is only reported when the price has actually moved by the required number of bips since the end of the previous day.
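The effect of the seed argument can be sketched in Python as follows. The helper names deltas, at_least and where_moved are hypothetical. The tolerance constant is an assumption meant to approximate q's comparison tolerance. It matters here because 1.3 - 1.1 comes out slightly below 0.2 in floating point and a plain >= would miss index 2.

```python
import math

Q_TOLERANCE = 2 ** -43  # assumption: roughly q's relative comparison tolerance


def deltas(prices, seed=0.0):
    """Differences between each item and its predecessor; the first is taken against `seed`."""
    previous = [seed] + list(prices[:-1])
    return [p - q for p, q in zip(prices, previous)]


def at_least(value, level):
    # Treat values within the tolerance of `level` as equal to it.
    return value >= level or math.isclose(value, level, rel_tol=Q_TOLERANCE)


def where_moved(prices, level, seed=0.0):
    return [i for i, d in enumerate(deltas(prices, seed)) if at_least(abs(d), level)]


price = [1.0, 1.1, 1.3, 1.8, 1.9, 2.0, 2.3]
print(where_moved(price, 0.2))                 # [0, 2, 3, 6]
print(where_moved(price, 0.2, seed=price[0]))  # [2, 3, 6]
```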

Related

Conditionally format crosstab cell based on aggregate values for row

I have a crosstab report that has categories as the rows and month/year as the columns. Additionally, I have the average and std dev for each row.
For instance:
             2022-01  2022-02  2022-03  Average(myData)  stdDev(myData)
electrical   1        0        2        1                1
mechanical   3        3        3        3                0
admin        1        7        1        3                3.46
Now, I am able to format the cells against a static value. For instance, I can set up a conditional format like this:
CellValue () > 2
This will allow me to highlight any crosstab intersection with a value greater than 2.
But I am at a loss as to how to get this to work when comparing against the average and/or standard deviation.
For instance, the following
CellValue ()>[myQuery].[Average(myData)]
highlights nothing, whereas I would have expected this to highlight any cell above average.
My end goal is to highlight any value that is above 1.645 * standard deviation + average, but I cannot even get a simpler rule to work.
I was able to get something to work, but far from ideal.
I made queries to get the summary stats, then I joined those to the original data.
Then I put the categories and each of the summary stats on the left edge.
I could then reference them as expected in a conditional format, e.g. [mydata] > [myAve] + 1.645 * [myStdDev]
It isn't as straightforward as I would like, and it is a bit messy, but it works.
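To make the intended rule concrete outside the reporting tool, here is a small Python sketch of the per-row threshold. It assumes stdDev(myData) is the sample standard deviation, which matches the 3.46 shown for admin. The highlighted function and its z parameter are made up for this illustration.

```python
from statistics import mean, stdev

months = ["2022-01", "2022-02", "2022-03"]
rows = {
    "electrical": [1, 0, 2],
    "mechanical": [3, 3, 3],
    "admin": [1, 7, 1],
}


def highlighted(values, z=0.0):
    """Positions whose value exceeds average + z * sample standard deviation of the row."""
    threshold = mean(values) + z * stdev(values)
    return [i for i, v in enumerate(values) if v > threshold]


for category, values in rows.items():
    above_average = [months[i] for i in highlighted(values)]
    above_limit = [months[i] for i in highlighted(values, z=1.645)]
    print(category, above_average, above_limit)
```

With the sample data, the plain above-average rule flags 2022-03 for electrical and 2022-02 for admin, while the 1.645 rule flags nothing, because even admin's 7 stays below 3 + 1.645 * 3.46.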

How does the scan operator help to identify indices where a rule holds for 2 observations in a row?

A boolean vector has been created based on some rule and we need to identify the indices where the rule holds for 2 observations in a row. The following code does that
indices:0101001101b
runs:{0 x\x}"f"$;
where 2=runs indices
Could you please help me understand how the scan operator is used in the definition of the runs function? Appreciate your help.
It's using this special shorthand commonly used in calculating exponential moving averages: https://code.kx.com/q/ref/accumulators/#alternative-syntax
So {0 x\x} is equivalent to:
q){z+x*y}\[0;indices;indices]
0 1 0 1 0 0 1 2 0 1
What this is doing is essentially using the booleans as an on/off switch (via the boolean multiplication) for the rolling sum. It adds (z+) until it hits a false (0b) boolean, in which case the rolling sum resets back to zero.
In English: nextValue + [currentValue (starting at 0) * nextValue]
When nextValue is 1, 1 gets added. When nextValue is 0 the result is zero (resetting the rolling sum).
Something like this can achieve the same thing, though it is no easier to read at a glance (and it uses two scans instead of one):
q){s-maxs not[x]*s:sums x}indices
0 1 0 1 0 0 1 2 0 1i
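The same recurrence can be written out in Python as a sketch, with itertools.accumulate playing the role of scan. The name runs mirrors the q function; the initial keyword needs Python 3.8 or later.

```python
from itertools import accumulate


def runs(flags):
    """Length of the current run of 1s at each position; a 0 resets the count."""
    # next = flag + flag * acc, seeded with 0. accumulate() emits the seed first,
    # so drop it to keep the output aligned with the input.
    return list(accumulate(flags, lambda acc, flag: flag + flag * acc, initial=0))[1:]


indices = [0, 1, 0, 1, 0, 0, 1, 1, 0, 1]
counts = runs(indices)
print(counts)                                       # [0, 1, 0, 1, 0, 0, 1, 2, 0, 1]
print([i for i, c in enumerate(counts) if c == 2])  # [7]
```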
Terry has answered your question about how runs works.
Comparing adjacent items is common. You might prefer to use the prior keyword; it is certainly easier to see what it is doing.
q)where (and) prior indices
,7

read arrays in Simulink

I need some help solving this issue: I have 5 different voltage values that change at every tick, that is, at every single moment. I need to sort them, and after they have been sorted I want to go to another matrix (like the one at the bottom) and pull out (read) a specific column from it for every predefined state (from a timing that I am designing). This mechanism changes at every state/moment. How can I do this?
The matrix looks like this (and could be larger...):
0 0 0 1 1 1...
0 1 1 0 0 1...
1 0 1 0 1 0...
1 1 0 1 0 0...
.. .. .. .. .. ..
Thanks, Henry
I am not sure I understood it correctly. So I will edit my answer after you make your question a bit more clear.
I see two separate things:
Reading 5 voltage values which change at each step. You want to sort these values. To do this you can use the sort function of matlab. It is really easy to use and you can look at it here.
This is the part I didn't understand well. After sorting the voltage readings what do you want to do with the matrix ? If you want to access just a specific column of the matrix and save it in a variable you can do it in this way. Let's assume you have a matrix A which is N x N, if you want to access the 10th column of the matrix and store it in a variable called column10 you will do something like: column10 = A(:,10)
I hope this will help you but let me know if this is what you wanted and I will edit my answer according to it.
Fab.
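The two steps described in this answer, sorting the readings and reading one column of a matrix, can be sketched in Python as below. The voltage readings and the state number are made-up sample values, and how the sorted result should select the column is left open, as it is in the question.

```python
def column(matrix, j):
    """Return column j (0-based) of a matrix stored as a list of rows."""
    return [row[j] for row in matrix]


voltages = [3.3, 1.2, 5.0, 0.7, 2.5]  # made-up readings for one tick
ascending = sorted(voltages)
# Indices of the readings in ascending order, like the second output of MATLAB's sort.
order = sorted(range(len(voltages)), key=lambda i: voltages[i])

A = [
    [0, 0, 0, 1, 1, 1],
    [0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0],
]
state = 2  # hypothetical state number chosen by the timing logic

print(ascending)         # [0.7, 1.2, 2.5, 3.3, 5.0]
print(order)             # [3, 1, 4, 0, 2]
print(column(A, state))  # [0, 1, 1, 0]
```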

How are the columns and rows counted in pascal function in Functional Programming Principles in Scala at coursera?

I'm learning Scala while going through the Coursera course Functional Programming Principles in Scala.
The first exercise says:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
The numbers at the edge of the triangle are all 1, and each number
inside the triangle is the sum of the two numbers above it. Write a
function that computes the elements of Pascal’s triangle by means of a
recursive process.
Do this exercise by implementing the pascal function in Main.scala,
which takes a column c and a row r, counting from 0 and returns the
number at that spot in the triangle. For example, pascal(0,2)=1,
pascal(1,2)=2 and pascal(1,3)=3.
At the start, I understand, as he refers to the 'numbers' we are all familiar with, but then he goes on to use the term "elements." What does he mean by this? What does he want me to compute?
I assumed that he got bored with the word "number" and thought, after defining the names of the numbers in the triangle as 'numbers' he just wanted to use something new, thus "element," but no matter how I count I cannot get the references to work.
I cannot even really understand the term 'column' seeing as the numbers are not vertically above each other.
Can you please explain how he gets pascal(1,3) == 3?
You're thinking about columns a bit wrong. By "xth column," he means the "xth entry in a given row."
So, if you are looking at the function pascal(c,r), you would want to figure out what the cth number is in the rth row.
So, for example:
pascal(1,2) corresponds to the second entry in the 3rd row
1
1 1
1 *2* 1
pascal(1,3) wants you to look at the second entry in the 4th row.
1
1 1
1 2 1
1 *3* 3 1
Just count from the left. (0,2) is the leftmost number in the row
1 2 1
so (1,3) would be the second number in
1 3 3 1
You can simply make the triangle "rectangle", and everything will become apparent:
cols-> 0 1 2 3 4
row-0 1
row-1 1 1
row-2 1 2 1
row-3 1 3 3 1
row-4 1 4 6 4 1
And you were right that the triangle's "elements" are just its numbers; there is a subtle difference between the two terms, but it is insubstantial in this case.
P.S. I would personally advise using the course forum for such questions:
It will avoid controversial issues with the honor code.
Your fellow students will have a quicker understanding of the problem at hand.
They will have access to material which is not available to those not taking the course.
It will help to build up a sense of membership amongst the course students, and give you all a chance to create new, possibly fruitful, relationships.
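The indexing convention (column c counted from the left of row r, both starting at 0) can be checked with a short recursive sketch. It is written in Python rather than Scala purely for illustration and simply follows the rule quoted in the exercise text.

```python
def pascal(c, r):
    """Entry at column c of row r, both counted from 0, with columns counted from the left."""
    if c == 0 or c == r:  # the numbers at the edge of the triangle are all 1
        return 1
    # Each inner number is the sum of the two numbers above it.
    return pascal(c - 1, r - 1) + pascal(c, r - 1)


for r in range(5):
    print([pascal(c, r) for c in range(r + 1)])

print(pascal(0, 2), pascal(1, 2), pascal(1, 3))  # 1 2 3
```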
What you're asking is against the Coursera Honor Code: https://www.coursera.org/maestro/auth/normal/tos.php#honorcode
http://www.aiqus.com/questions/41299/coursera-cheating-scala-course
I loved solving this exercise.
My thought process was the following:
Understanding that the problem is a literal description of the binomial coefficient. https://en.wikipedia.org/wiki/Binomial_coefficient
Understanding that the ask is a literal plug into the formula row! / (col! * (row - col)!), which is right there on the wiki page.
The only thing missing now is implementing a tail-recursive factorial function.
Bonus: if you use an extension method such as
extension (int: Int) {
def ! = factorialTailRec(int)
}
// you get to write
(r.!) / ((c.!) * ((r - c).!))
You get to write almost the identical mathematical formula. And at that moment I realised the similarities between doing maths and programming. And I cried a little with the beauty of it.
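A sketch of the same binomial-coefficient approach in Python, for comparison. The names factorial_tail_rec and pascal_by_formula are made up here, and Python does not optimise tail calls, so the accumulator style is only illustrative.

```python
def factorial_tail_rec(n, acc=1):
    """Factorial in accumulator-passing style."""
    return acc if n <= 1 else factorial_tail_rec(n - 1, acc * n)


def pascal_by_formula(c, r):
    # r! / (c! * (r - c)!), using integer division because the result is always whole.
    return factorial_tail_rec(r) // (factorial_tail_rec(c) * factorial_tail_rec(r - c))


print(pascal_by_formula(0, 2), pascal_by_formula(1, 2), pascal_by_formula(1, 3))  # 1 2 3
```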

matlab percentage change between cells

I'm a newbie to Matlab and just stumped how to do a simple task that can be easily performed in excel. I'm simply trying to get the percent change between cells in a matrix. I would like to create a for loop for this task. The data is setup in the following format:
DAY1 DAY2 DAY3...DAY 100
SUBJECT RESULTS
I could only perform getting the percent change between two data points. How would I conduct it if across multiple days and multiple subjects? And please provide explanation
Thanks a bunch
FOR EXAMPLE, FOR DAY 1 SUBJECT1(RESULT=1), SUBJECT2(RESULT=4), SUBJECT3(RESULT=5), DAY 2 SUBJECT1(RESULT=2), SUBJECT2(RESULT=8), SUBJECT3(RESULT=10), DAY 3 SUBJECT1(RESULT=1), SUBJECT2(RESULT=4), SUBJECT3(RESULT=5).
I WANT THE PERCENT CHANGE SO OUTPUT WILL BE DAY 2 SUBJECT1(RESULT=100%), SUBJECT2(RESULT=100%), SUBJECT3(RESULT=100%). DAY3 SUBJECT1(RESULT=50%), SUBJECT2(RESULT=50%), SUBJECT3(RESULT=50%)
updated:
Hi thanks for responding guys. sorry for the confusion. zebediah49 is pretty close to what I'm looking for. My data is for example a 10 x 10 double. I merely wanted to get the percentage change from column to column. For example, if I want the percentage change from rows 1 through 10 on all columns (from columns 2:10). I would like the code to function for any matrix dimension (e.g., 1000 x 1000 double) zebediah49 could you explain the code you posted? thanks
updated2:
zebediah49,
(data(1:end,100)- data(1:end,99))./data(1:end,99)
output=[data(:,2:end)-data(:,1:end-1)]./data(:,1:end-1)*100;
Observing the code above, How would I go about modifying it so that column 100 is used as the index against all of the other columns(1-99)? If I change the code to the following:
(data(1:end,100)- data(1:end,:))./data(1:end,:)
MATLAB fails with an error about exceeding matrix dimensions. How would I go about implementing that?
UPDATE 3
zebediah49,
Worked perfectly!!! Originally I created a new variable for the index and repmat the index to match the matrices which was not a good idea. It took forever to replicate when dealing with large numbers.
Thanks for your contribution once again.
Thanks Chris for your contribution too!!! I was looking more on how to address and manipulate arrays within a matrix.
It's matlab; you don't actually want a loop.
output=input(2:end,:)./input(1:end-1,:)*100;
will probably do roughly what you want. Since you didn't give anything about your matlab structure, you may have to change index order, etc. in order to make it work.
If it's not obvious, that line defines output as a matrix consisting of the input matrix, divided by the input matrix shifted by one element along the first dimension. The ./ operator is important, because it means that you will divide each element by its corresponding one, as opposed to doing matrix division.
EDIT: further explanation was requested:
I assumed you wanted % change of the form 1->1->2->3->1 to be 100%, 200%, 150%, 33%.
The other form can be obtained by subtracting 100%.
input(2:end,:) grabs a sub-matrix with the first row cut off. (I put time along the first dimension; if you want it the other way, it would be input(:,2:end).)
MATLAB is 1-indexed, and lets you use the special value end to refer to the last element.
Thus, end-1 is the second-to-last.
The point here is that element (i) of this sub-matrix is element (i+1) of the original.
input(1:end-1,:), like the above, also grabs a sub-matrix, except that it is missing the last row.
I then divide element (i) of the first sub-matrix by element (i) of the second, i.e. element (i+1) of the original by element (i). Because of how I picked out the sub-matrices, they now line up.
As a semi-graphical demonstration, using my above numbers:
input: [1 1 2 3 1]
input(2:end): [1 2 3 1]
input(1:end-1): [1 1 2 3]
When I do the division, it's first/first, second/second, etc.
input(2:end,:)./input(1:end-1,:):
   [1   2   3   1   ]
./ [1   1   2   3   ]
---------------------
== [1.0 2.0 1.5 0.33]
The extra index set to (:) means that the procedure is applied across the whole of the other dimension.
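The same shifted-slice idea can be sketched in plain Python for a single series. The function names are made up for this illustration, and results are rounded only for display.

```python
def ratio_percent(series):
    """Each element as a percentage of the element before it."""
    # zip pairs element i of series[:-1] with element i of series[1:], i.e. x[i] with x[i+1].
    return [100 * nxt / cur for cur, nxt in zip(series[:-1], series[1:])]


def change_percent(series):
    """Percentage change from each element to the next (the ratio form minus 100)."""
    return [r - 100 for r in ratio_percent(series)]


series = [1, 1, 2, 3, 1]
print([round(r, 1) for r in ratio_percent(series)])   # [100.0, 200.0, 150.0, 33.3]
print([round(c, 1) for c in change_percent(series)])  # [0.0, 100.0, 50.0, -66.7]
```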
EDIT2: Revised question: How do I exclude a row, and keep it as an index.
You say you tried something to the effect of (data(1:end,100)- data(1:end,:))./data(1:end,:). Matlab will not like this, because the element-by-element operators need them to be the same size. If you wanted it to only work on the 100th column, setting the second index to be 100 instead of : would do that.
I would, instead, suggest setting the first to be the index, and the rest to be data.
Thus, the data is processed by cutting off the first:
output=[data(2:end,2:end)-data(2:end,1:end-1)]./data(2:end,1:end-1)*100;
Note that (:) is shorthand for (1:end), but MATLAB does not accept a range with the end left off, such as (2:), so the end has to be written out as above.
However, you will probably still want the indices back, in which case you will need to append that subarray back:
output=[data(1,1:end-1); (data(2:end,2:end)-data(2:end,1:end-1))./data(2:end,1:end-1)*100];
This is probably not how you should be doing it though: keep data in one matrix, and time or whatever else in a separate array. That makes it much easier to do stuff like this to data, without having to worry about excluding time. It's especially nice when graphing.
Oh, and one more thing:
(data(:,2:end)-data(:,1:end-1))./data(:,1:end-1)*100;
is mathematically equivalent to
data(:,2:end)./data(:,1:end-1)*100-100;
Assuming zebediah49 guessed right in the comment above and you want
1 4 5
2 8 10
1 4 5
to turn into
1 1 1
-.5 -.5 -.5
then try this:
data = [1,4,5; 2,8,10; 1,4,5];
changes_absolute = diff(data);
changes_absolute./data(1:end-1,:)
ans =
1.0000 1.0000 1.0000
-0.5000 -0.5000 -0.5000
You don't need the intermediate variable; you can directly write diff(data)./data(1:end-1,:). I just thought the above might be easier to read. Getting from that result to percentage numbers is left as an exercise to the reader. :-)
Oh, and if you really want 50%, not -50%, just use abs around the final line.
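For comparison, the diff-and-divide step on the example matrix looks like this as a plain Python sketch, with one row per day and one column per subject. The percent_change name is made up for this illustration.

```python
def percent_change(data):
    """Percentage change of every row relative to the row above it, column by column."""
    return [
        [100 * (cur - prev) / prev for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(data[:-1], data[1:])
    ]


data = [[1, 4, 5], [2, 8, 10], [1, 4, 5]]
print(percent_change(data))  # [[100.0, 100.0, 100.0], [-50.0, -50.0, -50.0]]
```

Wrapping each result in abs() gives the unsigned 50% asked for in the question.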