I have a set of random variables in matlab, with that how to find the discrete rv Z having pdf P(Z = 1) = p, P(Z = 0) = 1 – p, p = 0.3 - matlab

n=1000
x=rand(n,1)
This is my code to find the random samples.

Firstly, let's disambiguate what you want:
By "random sample values of random variable Z having pdf P(Z=1)=p, P(Z=0)=1-p, for p = 0.3" I assume you mean:
You want to randomly choose between two values, 0 and 1.
0 should occur 70% of the time.
1 should occur 30% of the time.
You already have the MATLAB statements:
n = 1000;
x = rand(n,1);
This is a good first step. The next step is for you to read up on "logical indexing" in MATLAB, which is a way to apply a logical condition - say "is greater than 0.3" - to an array of numbers.
Try reading Peter Acklam's excellent reference "MATLAB Tips and Tricks", which will teach you about logical indexing and many other useful tricks for working with arrays in MATLAB.
As regards the phrasing of your question: There's no need to use overly technical language and abbreviations to describe a simple problem.
Also, to me, a PDF ("Probability Density Function") implies a continuous distribution like the normal distribution, which is why I was confused - you had the words "discrete" and "PDF" right next to each other, and it didn't compute. Again, don't use the technical jargon unless you actually have to.
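For reference, here is a minimal sketch of how a logical comparison turns the uniform samples into the 0/1 values described above. It assumes p = 0.3 and n = 1000 as in the question; the variable name z is my own.

```matlab
n = 1000;            % number of samples, as in the question
p = 0.3;             % P(Z = 1)
x = rand(n, 1);      % uniform samples on (0,1)
z = double(x < p);   % 1 where x < p (probability p), 0 otherwise
fprintf('fraction of ones: %.3f\n', mean(z));
```

The fraction of ones should come out near 0.3, but it will vary from run to run because the samples are random.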

Related

K-means Stopping Criteria in Matlab?

I'm implementing the k-means algorithm in MATLAB without using the built-in kmeans function. The stopping criterion is that the new centroids no longer change between iterations, but I cannot implement it in MATLAB. Can anybody help?
Thanks
Using "no change" as a stopping criterion is a bad idea. There are a few main reasons why you shouldn't use a zero-change condition:
Even for a well-behaved function, the difference between zero change and a very small change (say 1e-5) could be 1000+ iterations, so you waste time trying to get the values to be exactly the same, especially because computers usually keep far more digits than we are interested in. If you only need 1 digit of accuracy, why wait for the computer to find an answer within 1e-31?
Computers have floating-point errors everywhere. Try doing some easily reversible matrix operations like a = rand(3,3); b = a*a*inv(a); a-b. Theoretically the result should be 0, but you will see it isn't. These errors alone could prevent your program from ever stopping.
Dithering. Let's say we have a 1-D k-means problem with 3 numbers and we want to split them into 2 groups. In one iteration the grouping can be a,b vs c; in the next it could be a vs b,c; in the next a,b vs c again, and so on. This is of course a simplified example, but there can be instances where a few data points dither between clusters, and you end up with a never-ending algorithm. Since those few points keep being reassigned, the change will never be 0.
The solution is to use a delta threshold: subtract the current values from the previous ones, and if the difference is less than a threshold you are done. This on its own is powerful, but as with any loop, you need a backup escape plan, and that is a max_iterations variable. Look at MATLAB's documentation for kmeans: even it has a MaxIter parameter (default is 100), so even if your k-means doesn't converge, at least it won't run endlessly. Something like this might work:
%problem specific
max_iter = 100;
%choose a small number appropriate to your problem
thresh = 1e-3;
%ensures it runs the first time
delta_mu = thresh + 1;
num_iter = 0;
%do your kmeans in the loop
while (delta_mu > thresh && num_iter < max_iter)
    %save these right away
    old_mu = curr_mu;
    %calculate new means and variances, this is the standard kmeans iteration
    %then store the values in a variable called curr_mu
    curr_mu = newly_calculate_values;
    %use the two norm to find the delta as a single number, no matter what
    %the original dimensionality of mu was. If old_mu - curr_mu is
    %0 the norm is still 0, so it behaves well as a distance measure.
    delta_mu = norm(old_mu - curr_mu,2);
    num_iter = num_iter + 1;
end
Edit: in case you don't know, the 2-norm is essentially the Euclidean distance.
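To make the skeleton above concrete, here is a minimal sketch of a complete loop on toy data. The data, the initial centroids and the choice of the Frobenius norm (norm(...,'fro'), swapped in so that the delta is the Euclidean distance over all centroid coordinates) are my own assumptions, not part of the original answer.

```matlab
% Toy data (assumption): two well-separated 2-D blobs
X = [randn(50, 2); randn(50, 2) + 5];
k = 2;
curr_mu = X([1 end], :);      % hypothetical initialisation: one point from each blob

max_iter = 100;
thresh = 1e-3;
delta_mu = thresh + 1;        % ensures the loop runs the first time
num_iter = 0;

while delta_mu > thresh && num_iter < max_iter
    old_mu = curr_mu;

    % assignment step: squared distance from every point to every centroid
    d = zeros(size(X, 1), k);
    for j = 1:k
        offsets = X - repmat(curr_mu(j, :), size(X, 1), 1);
        d(:, j) = sum(offsets.^2, 2);
    end
    [~, labels] = min(d, [], 2);

    % update step: mean of the points in each cluster (old centroid kept if empty)
    for j = 1:k
        members = (labels == j);
        if any(members)
            curr_mu(j, :) = mean(X(members, :), 1);
        end
    end

    delta_mu = norm(old_mu - curr_mu, 'fro');
    num_iter = num_iter + 1;
end
fprintf('stopped after %d iterations, delta = %g\n', num_iter, delta_mu);
```

The loop exits either because the centroids moved less than thresh or because max_iter was reached, whichever happens first.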

Unable to code non linear equation in MATLAB R2013a - MATLAB giving warning message

I wanted to solve the following equation in MATLAB R2013a using the Symbolic Math Toolbox.
(y/x)-(((1+r)^n)-1)/r=0 where y,x and n>3 are given and r is the dependent variable
I tried myself & coded as follows:
f=solve('(y/x)-(((1+r)^n)-1)/r','r')
but because the solution for r is not exact, i.e. it can only be approached over successive iterations, MATLAB gives a warning with the message
Warning: Explicit solution could not be found.
f =
[ empty sym ]
How do I code this?
There are an infinite number of solutions to this for an unspecified value of n > 3 and unknown r. I hope that it's pretty clear why – it's effectively asking for a greater and greater number of roots of (1+r)^n. You can find solutions for fixed values of n, however. Note that as n becomes larger there are more and more solutions and of course some of them are complex. I'm going to assume that you're only interested in real values of r. You can use solve and symbolic math for n = 4, n = 5, and n = 6 (for n = 6, the solution may not be in a convenient form):
y = 441361;
x = 66990;
n = 5;
syms r;
rsol = solve(y/x-((1+r)^n-1)/r==0,r,'IgnoreAnalyticConstraints',true)
double(rsol)
However, the question is "do you need all the solutions or just a particular solution for a given value of n"? If you just need a particular solution, you shouldn't be using symbolic math at all as it's slower and has practical issues like the ones you're experiencing. You can instead just use a numerical approach to find a zero of the equation that is near a specified initial guess. fzero is the standard function for solving this sort of problem in a single variable:
y = 441361;
x = 66990;
n = 5;
f = @(r)y/x-((1+r).^n-1)./r;
r0 = 1;
rsol = fzero(f,r0)
You'll see that the value returned is the same as one of the solutions from the symbolic solution above. If you adjust the initial guess r0 (say r0 = -3), it will return the other solution. When using numeric approaches in cases when there are multiple solutions, if you want specific solutions you'll need to know about the behavior of your function and you'll need to add some clever extra code to choose initial guesses.
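One way to choose which root you get, sketched here as a suggestion rather than as part of the original answer, is to pass fzero a two-element interval on which the function changes sign instead of a single starting point. The intervals below were picked by hand for these particular values of y, x and n.

```matlab
y = 441361;
x = 66990;
n = 5;
f = @(r) y/x - ((1+r).^n - 1)./r;

% f changes sign on each of these intervals (hand-picked for this y, x, n)
r_pos = fzero(f, [0.1 1]);
r_neg = fzero(f, [-3 -2]);

fprintf('r_pos = %.6f, residual %.2e\n', r_pos, f(r_pos));
fprintf('r_neg = %.6f, residual %.2e\n', r_neg, f(r_neg));
```

Both intervals avoid r = 0, where the expression evaluates to 0/0. fzero raises an error if the function values at the two endpoints do not differ in sign, which is a useful check that the bracket is valid.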
I think you forgot to define n as well.
f=solve('(y/x)-(((1+r)^n)-1)/r=0','n-3>0','r','n')
Should solve your problem :)

Detect signal jumps relative to local activity

In Matlab, is it possible to measure local variation of a signal across an entire signal without using for loops? I.e., can I implement the following:
window_length = <something>
for n = 1:(length_of_signal - window_length + 1)
    global_variance(n) = var(my_signal(n:n+window_length-1))
end
in a vectorized format?
If you have the image processing toolbox, you can use STDFILT:
global_std = stdfilt(my_signal(:),ones(window_length,1));
% square to get the variance
global_variance = global_std.^2;
You could create a 2D array in which each column is shifted by one sample relative to the previous one, with the number of columns equal to the window length; computing the variance is then trivial. This doesn't require any toolboxes. Not sure if it's much faster than the for loop though:
longSignal = repmat(my_signal(:), [1 window_length+1]);
longSignal = reshape(longSignal(1:((length_of_signal+1)*window_length)), [length_of_signal+1, window_length])';
global_variance = var(longSignal, 0, 1);
global_variance = global_variance(1:length_of_signal-window_length+1);
Note that after the reshape each column is shifted by one sample relative to the previous one, so each row holds one window of data; I take the transpose so that each window sits in its own column. After that, var operates along the first dimension, which gives you a row vector with the results you want. However, there is a bit of wrapping of data going on at the end, so we have to limit the output to the number of "good" values.
I don't have matlab handy right now (I'm at home), so I was unable to test the above - but I think the general idea should work. It's vectorized - I can't guarantee it's fast...
Check the "moving window standard deviation" function at Matlab Central. Your code would be:
movingstd(my_signal, window_length, 'forward').^2
There's also moving variance code, but it seems to be broken.
The idea is to use the filter function.
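A minimal sketch of that filter idea, assuming a toy random signal and a hypothetical window length of 11: the running mean and the running mean of squares are each computed with filter, and the variance follows from E[x^2] - E[x]^2.

```matlab
my_signal = randn(1, 200);    % toy signal (assumption)
window_length = 11;           % hypothetical window size

kernel = ones(1, window_length) / window_length;
m1 = filter(kernel, 1, my_signal);       % running mean of the last window_length samples
m2 = filter(kernel, 1, my_signal.^2);    % running mean of squares

% drop the first window_length-1 outputs, which cover incomplete windows
m1 = m1(window_length:end);
m2 = m2(window_length:end);

% E[x^2] - E[x]^2, rescaled to the unbiased (N-1) normalisation var() uses
moving_variance = (m2 - m1.^2) * window_length / (window_length - 1);
```

Here moving_variance(n) corresponds to var(my_signal(n:n+window_length-1)). Be aware that this formula can lose precision when the mean of the signal is large compared to its spread.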

How to calculate value of short options call with Black-Scholes formula?

I am trying to calculate the profit/loss of a short call at various times in the future, but it isn't coming out correct. Compared to the time of expiration, the ones with time left have less profit above the strike price, but at some point below the strike they don't lose value as fast as the t=0 line. Below is the formula in pseudocode, what am I doing wrong?
profit(stockprice) = -1 * (black_scholes_price_of_call(stockPrice,optionStrike,daysTillExpiration) - premium);
Real matlab code:
function [ x ] = sell_call( current,strike,price,days)
if (days > 0)
Sigma = .25;
Rates = 0.05;
Settle = today;
Maturity = today + days;
RateSpec = intenvset('ValuationDate', Settle, 'StartDates', Settle, 'EndDates',...
Maturity, 'Rates', Rates, 'Compounding', -1);
StockSpec = stockspec(Sigma, current);
x = -1 * (optstockbybls(RateSpec, StockSpec, Settle, Maturity, 'call', strike) - price);
else
x = min(price,strike-current-price);
end
end
Your formula ain't right. I don't know why you need that leading -1 as a multiplier, because when I distribute it out the "formula" is a simple one:
profit(stockprice) = premium - black_scholes_price_of_call(stockPrice,optionStrike,daysTillExpiration);
Pretty simple. So that means the problem is buried in that function for the price of the call, right?
When I compare your formula to what I see as the definition on Wikipedia, I don't see a correspondence at all. Your MATLAB code doesn't help, either. Dig into the functions and see where you went wrong.
Did you write those? How did you test them before you assembled them into this larger function? Test the smaller blocks before you assemble them into the bigger thing.
What baseline are you testing against? What known situation are you comparing your calculation to? There are lots of B-S calculators available. Maybe you can use one of those.
I'd assume that it's an error in your code rather than MATLAB. Or you've misunderstood the meaning of the parameters you're passing. Look at your stuff more carefully, re-read the documentation for that function, and get a good set of baseline cases.
I found the problem: it had to do with the RateSpec argument. When you pass in an interest rate, it affects the option pricing.
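As a baseline to test against, here is a sketch of the short-call profit computed from the textbook Black-Scholes formula in plain MATLAB, without the Financial Toolbox. The strike, premium and the days/365 year fraction are hypothetical choices of mine; Sigma and Rates are taken from the question, with the rate treated as continuously compounded.

```matlab
Sigma = 0.25;      % volatility, as in the question
Rates = 0.05;      % interest rate, as in the question (continuous compounding assumed)
strike = 100;      % hypothetical strike
premium = 5;       % hypothetical premium received for selling the call
days = 30;         % days until expiration
T = days / 365;    % assumption: calendar-day year fraction

Phi = @(z) 0.5 * erfc(-z / sqrt(2));   % standard normal CDF
d1 = @(S, K, r, s, T) (log(S ./ K) + (r + s^2 / 2) * T) ./ (s * sqrt(T));
bs_call = @(S, K, r, s, T) S .* Phi(d1(S, K, r, s, T)) ...
    - K * exp(-r * T) .* Phi(d1(S, K, r, s, T) - s * sqrt(T));

stockPrice = 80:5:120;
pl_before = premium - bs_call(stockPrice, strike, Rates, Sigma, T);   % profit with time left
pl_expiry = premium - max(stockPrice - strike, 0);                    % profit at expiration

disp([stockPrice; pl_before; pl_expiry]');
```

With a non-negative rate and no dividends, the call is worth at least its exercise value before expiration, so pl_before should never exceed pl_expiry in this sketch. Setting Rates to 0 shows how much of the difference comes from the interest rate alone.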

MATLAB interview questions?

I programmed in MATLAB for many years, but switched to using R exclusively in the past few years so I'm a little out of practice. I'm interviewing a candidate today who describes himself as a MATLAB expert.
What MATLAB interview questions should I ask?
Some other sites with resources for this:
"Matlab interview questions" on Wilmott
"MATLAB Questions and Answers" on GlobaleGuildLine
"Matlab Interview Questions" on CoolInterview
This is a bit subjective, but I'll bite... ;)
For someone who is a self-professed MATLAB expert, here are some of the things that I would personally expect them to be able to illustrate in an interview:
How to use the arithmetic operators for matrix or element-wise operations.
A familiarity with all the basic data types and how to convert effortlessly between them.
A complete understanding of matrix indexing and assignment, be it logical, linear, or subscripted indexing (basically, everything on this page of the documentation).
An ability to manipulate multi-dimensional arrays.
The understanding and regular usage of optimizations like preallocation and vectorization.
An understanding of how to handle file I/O for a number of different situations.
A familiarity with handle graphics and all of the basic plotting capabilities.
An intimate knowledge of the types of functions in MATLAB, in particular nested functions. Specifically, given the following function:
function fcnHandle = counter
    value = 0;
    function currentValue = increment
        value = value+1;
        currentValue = value;
    end
    fcnHandle = @increment;
end
They should be able to tell you what the contents of the variable output will be in the following code, without running it in MATLAB:
>> f1 = counter();
>> f2 = counter();
>> output = [f1() f1() f2() f1() f2()]; %# WHAT IS IT?!
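For the interviewer's reference: each call to counter creates its own copy of value, so f1 and f2 count independently. A sketch that can be saved as counter_demo.m to confirm this (the wrapper name counter_demo is my own):

```matlab
function output = counter_demo
    % Two independent counters, each with its own nested-function workspace
    f1 = counter();
    f2 = counter();
    output = [f1() f1() f2() f1() f2()];   % [1 2 1 3 2]
end

function fcnHandle = counter
    value = 0;
    function currentValue = increment
        value = value + 1;       % updates the value captured by this handle
        currentValue = value;
    end
    fcnHandle = @increment;
end
```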
We get several new people in the technical support department here at MathWorks. This is all post-hiring (I am not involved in the hiring), but I like to get to know people, so I give them the "Impossible and adaptive MATLAB programming challenge"
I start out with them at MATLAB and give them some .MAT file with data in it. I ask them to analyze it, without further instruction. I can very quickly get a feel for their actual experience.
http://blogs.mathworks.com/videos/2008/07/02/puzzler-data-exploration/
The actual challenge does not mean much of anything, I learn more from watching them attempt it.
Are they making scripts, functions, command line or GUI based? Do they seem to have a clear idea where they are going with it? What level of confidence do they have with what they are doing?
Are they a computer scientist or an engineer who learned to program? CS majors tend to do things like close their parentheses immediately, and other small optimizations like that. People who have been using MATLAB a while tend to capture the handles from plotting commands for later use.
How quickly do they navigate the documentation? Once I see they are going down the 'right' path then I will just change the challenge to see how quickly they can do plots, pull out submatrices etc...
I will throw out some old stuff from Project Euler. Mostly just ramp up the questions until one of us is stumped.
Floating Point Questions
Given that Matlab's main (only?) data type is the double precision floating point matrix, and that most people use floating point arithmetic -- whether they know it or not -- I'm astonished that nobody has suggested asking basic floating point questions. Here are some floating point questions of variable difficulty:
What is the range of |x|, an IEEE dp fpn?
Approximately how many IEEE dp fpns are there?
What is machine epsilon?
x = 10^22 is exactly representable as a dp fpn. What are the fpns xp
and xs just below and just above x ?
How many dp fpns are in [1,2)? How many atoms are on an edge of a
1-inch sugar cube?
Explain why sin(pi) ~= 0, but cos(pi) = -1.
Why is if abs(x1-x2) < 1e-10 then a bad convergence test?
Why is if f(a)*f(b) < 0 then a bad sign check test?
The midpoint c of the interval [a,b] may be calculated as:
c1 = (a+b)/2, or
c2 = a + (b-a)/2, or
c3 = a/2 + b/2.
Which do you prefer? Explain.
Calculate in Matlab: a = 4/3; b = a-1; c = b+b+b; e = 1-c;
Mathematically, e should be zero but Matlab gives e = 2.220446049250313e-016 = 2^(-52), machine epsilon (eps). Explain.
Given that realmin = 2.225073858507201e-308, and Matlab's u = rand gives a dp fpn uniformly distributed over the open interval (0,1):
Are the floating point numbers [2^(-400), 2^(-100), 2^(-1)]
= 3.872591914849318e-121, 7.888609052210118e-031, 5.000000000000000e-001
equally likely to be output by rand ?
Matlab's rand uses the Mersenne Twister rng which has a period of
(2^19937-1)/2, yet there are only about 2^64 dp fpns. Explain.
Find the smallest IEEE double precision fpn x, 1 < x < 2, such that x*(1/x) ~= 1.
Write a short Matlab function to search for such a number.
Answer: Alan Edelman, MIT
Would you fly in a plane whose software was written by you?
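A few of the floating point questions above can be checked directly at the prompt. This is only a quick sketch for the interviewer to verify the observable facts; it does not replace the explanations the questions ask for.

```matlab
% The 4/3 example: e should be zero mathematically, but equals eps
a = 4/3; b = a - 1; c = b + b + b; e = 1 - c;
fprintf('e == eps       : %d\n', e == eps);

% pi is not exactly representable, so sin(pi) is tiny but nonzero
fprintf('sin(pi)        : %g\n', sin(pi));
fprintf('cos(pi) == -1  : %d\n', cos(pi) == -1);

% Doubles in [1,2) are spaced eps apart, so there are 1/eps = 2^52 of them
fprintf('1/eps == 2^52  : %d\n', 1/eps == 2^52);
```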
Colin K would not hire me (and probably fire me) for saying "that
Matlab's main (only?) data type is the double precision floating
point matrix".
When Matlab started that was all the user saw, but over the years
they have added what they coyly call 'storage classes': single,
(u)int8,16,32,64, and others. But these are not really types
because you cannot do USEFUL arithmetic on them. Arithmetic on
these storage classes is so slow that they are useless as types.
Yes, they do save storage but what is the point if you can't do
anything worthwhile with them?
See my post (No. 13) here, where I show that arithmetic on int32s is 12 times slower than
double arithmetic and where MathWorkser Loren Shure says "By
default, MATLAB variables are double precision arrays. In the olden
days, these were the ONLY kind of arrays in MATLAB. Back then even
character arrays were stored as double values."
For me the biggest flaw in Matlab is its lack of proper types,
such as those available in C and Fortran.
By the way Colin, what was your answer to Question 14?
Ask questions about his expertise and experience in applying MATLAB in your domain.
Ask questions about how he would approach designing an application for implementation in MATLAB. If he refers to recent features of MATLAB, ask him to explain them, and how they are different from the older features they replace or supplement, and why they are preferable (or not).
Ask questions about his expertise with MATLAB data structures. Many of the MATLAB 'experts' I've come across are very good at writing code, but very poor at determining what are the best data structures for the job in hand. This is often a direct consequence of their being domain experts who've picked up MATLAB rather than having been trained in computerism. The result is often good code which has to compensate for the wrong data structures.
Ask questions about his experience, if any, with other languages/systems and invite him to expand upon his observations about the relative strengths and weaknesses of MATLAB.
Ask for top tips on optimising MATLAB programs. Expect the answers: vectorisation, pre-allocation, clearing unused variables, etc.
Ask about his familiarity with the MATLAB profiler, debugger and lint tools. I've recently discovered that the MATLAB 'expert' over in the corner here had never, in 10 years using the tool, found the profiler.
That should get you started.
I. I think this recent SO question
on indexing is a very good question
for an "expert".
I have a 2D array, call it 'A'. I have
two other 2D arrays, call them 'ix'
and 'iy'. I would like to create an
output array whose elements are the
elements of A at the index pairs
provided by ix and iy. I can do
this with a loop as follows:
for i=1:nx
for j=1:ny
output(i,j) = A(ix(i,j),iy(i,j));
end
end
How can I do this without the loop? If
I do output = A(ix,iy), I get the
value of A over the whole range of
(ix)X(iy).
II. Basic knowledge of operators like element-wise multiplication between two matrices (.*).
III. Logical indexing - generate a random symmetric matrix with values from 0-1 and set all values above T to 0.
IV. Read a file with some properly formatted data into a matrix (importdata)
V. Here's another sweet SO question
I have three 1-d arrays where elements
are some values and I want to compare
every element in one array to all
elements in other two.
For example:
a=[2,4,6,8,12]
b=[1,3,5,9,10]
c=[3,5,8,11,15]
I want to know if there are same
values in different arrays (in this
case there are 3,5,8)
Btw, there's an excellent chance your interviewee will Google "MATLAB interview questions" and see this post :)
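For the interviewer's own reference, here is one possible set of answers to items I, III and V, sketched with small made-up inputs (the matrices A, ix, iy and the threshold T are my own examples). Other solutions exist; this is not necessarily the approach taken in the linked questions.

```matlab
% I. element-wise lookup A(ix(i,j), iy(i,j)) without loops
A  = magic(4);
ix = [1 2; 3 4];
iy = [4 3; 2 1];
output = A(sub2ind(size(A), ix, iy));

% III. random symmetric matrix with entries in (0,1), values above T set to 0
T = 0.5;                 % hypothetical threshold
R = rand(4);
S = (R + R') / 2;        % symmetric by construction
S(S > T) = 0;            % logical indexing; symmetry is preserved

% V. values that appear in at least two of the three arrays
a = [2,4,6,8,12];
b = [1,3,5,9,10];
c = [3,5,8,11,15];
shared = unique([intersect(a,b), intersect(a,c), intersect(b,c)]);   % [3 5 8]
```

sub2ind converts each (row, column) pair into a linear index, which is why the result has the shape of ix rather than covering the whole range of rows and columns.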
Possible question:
I have an array A of n R,G,B triplets. It is a 3xn matrix. I have another array B in the form 1xn which stores an index value (association to a cluster) for each triplet.
How do I plot the triplets of A in 3D space (using plot3 function), coloring each triplet according to its index in B? (The goal is to qualitatively evaluate my clustering)
Really, really good programmers who are MATLAB novices won't be able to give you an efficient (== MATLAB style) solution. However, it is a very simple problem if you do know your MATLAB.
Depends a bit what you want to test.
To test MATLAB fluency, there are several nice Stack Overflow questions that you could use to test e.g. array manipulations (example 1, example 2), or you could use fix-this problems like this question (I admit, I'm rather fond of that one), or look into this list for some highly MATLAB-specific stuff. If you want to be a bit mean, throw in a question like this one, where the best solution is a loop, and the typical MATLAB-way-of-thinking solution would just fill up the memory.
However, it may be more useful to ask more general programming questions that are related to your area of work and see whether they get the problem solved with MATLAB.
For example, since I do image analysis, I may ask them to design a class for loading images of different formats (a MATLAB expert should know how to do OOP, after all, it has been out for two years now), and then ask follow-ups as to how to deal with large images (I want to see a check on how much memory would be used - or maybe they know memory.m - and to hear about how MATLAB usually works with doubles), etc.