Getting fields out of RDD[String] - Scala

I'm trying to pull out the 7th and 9th fields from each line of an RDD. I used the following code:
val logData = sc.textFile("path",2).map(item => {val comps=item.split(" "); (comps(6).toFloat, comps(8).toFloat)})
But, I got the output as
(x1,y1)
(x2,y2)
(x3,y3)
whereas I need the output as
x1 y1
x2 y2
x3 y3
Can anyone give me a solution to this?

Do you mean as a String, such as "0.54 0.123"? If so, you could replace:
(comps(6).toFloat, comps(8).toFloat)
with
s"${comps(6).toFloat} ${comps(8).toFloat}"
(or you can use f"${comps(6).toFloat}%.3f ${comps(8).toFloat}%.3f" or similar instead of s"..." for greater control over formatting).

Related

Delta compression in MATLAB

I tried delta compression in MATLAB. I tried to use array indexing instead of a for loop, but ran into a problem during decompression. There are no syntax errors, but I can't get the original stream back. Please help me. Here is my code:
clear all;
close all;
m = [20,3,55,11,222,555,6,98,0,46];
subplot(3,1,1);plot(m);title('Raw Data');
delta(1) = m(1);
i = [1:(length(m)-1)];
delta(i+1) = m(i+1)-m(i);
subplot(3,1,2);plot(delta);title('Delta Encoding')
j =[1:(length(delta)-1)];
delta_decode(1) = delta(1);
delta_decode(j+1)=delta(j+1)+delta(j);
subplot(3,1,3);plot(delta_decode);title('Delta Decoding')
To see why your decoding is not working, let's have a look at the math.
Let's assume we have a sequence of N numbers X1,X2,...,XN
The variable delta holds the following information
delta= delta(1), delta(2),delta(3), ...,delta(N)
delta= X1 , X2-X1 , X3-X2 , ..., XN - X(N-1)
What you are doing right now is adding each pair of neighbouring entries, which results in the following:
delta(1)+delta(2), delta(2)+delta(3), delta(3)+delta(4), ...
X1+X2-X1, X2-X1+X3-X2, X3-X2+X4-X3, ...
Simplifying:
X2, X3-X1, X4-X2, ...
As you can see, this is not what you want. To restore (decode) the real values you need the accumulation of all previous information, which is why you have to add up every delta up to the current position.
This means:
X2= X1 + delta(2)=X1+X2-X1=X2
X3= X1 + delta(2)+delta(3)=X2+delta(3)=X2+X3-X2=X3
X4= X1 + delta(2)+delta(3)+delta(4)=X2+delta(3)+delta(4)=X3 +delta(4)=X3+X4-X3=X4
and so on ...
As mentioned in the comments, you can achieve that without a for-loop by using delta_decode = cumsum(delta).
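The running-sum idea can be checked outside MATLAB as well. The following is a minimal sketch in Python, used here only as a neutral illustration; the function names delta_encode and delta_decode are made up for this example, and itertools.accumulate plays the role of cumsum.

```python
from itertools import accumulate


def delta_encode(values):
    """Keep the first value, then store successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """A running sum of the deltas restores the original sequence."""
    return list(accumulate(deltas))


# Same data as in the question.
m = [20, 3, 55, 11, 222, 555, 6, 98, 0, 46]
delta = delta_encode(m)
restored = delta_decode(delta)
print(delta)
print(restored == m)
```

Adding only neighbouring deltas, as in the question, would not pass the final comparison; the running sum does.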

Spline interpolation in matlab in order to predict value

I have the situation shown in the image below:
This plot is the result of two vectors:
fi = [41.309180589278, 41.8087915220215, 42.8081880760916, ...
43.8078181874395, 44.8076823745539, 45.8077808710707, 46.3079179803177]
m = [1.00047608139868, 1.00013712198767, 0.999680989440986, ...
0.999524195487826, 0.999671686649694, 1.00012913666266, 1.00047608139868]
I need to get the values of fi where m is equal to 1. Approximately, that will be 42.2 and 45.5.
I tried to do spline interpolation:
xq = [fi(1):0.25:fi(7)];
vq1 = interp1(fi,m,xq);
[fi1, fi2] = interp1(m, xq, 1)
But that is not working. Can someone help me with this?
One way to find such a crossing is to "turn the graph sideways", treating fi as a function of m, and interpolate to find fi at m=1. But interp1 requires the m input to be monotonic, which it is not here. In fact, this curve has two different fi values for each m.
MATLAB has the fzero function, which finds a zero crossing of a function numerically. It requires a function as input. We can define an anonymous function using interp1 that returns m-1 for any value of x, so that its zeros are the points where m equals 1. Here, x is defined by fi and f(x) by m:
fi = [41.309180589278, 41.8087915220215, 42.8081880760916, ...
43.8078181874395, 44.8076823745539, 45.8077808710707, 46.3079179803177];
m = [1.00047608139868, 1.00013712198767, 0.999680989440986, ...
0.999524195487826, 0.999671686649694, 1.00012913666266, 1.00047608139868];
fun = @(x)interp1(fi,m,x)-1;
x1 = fzero(fun,42)
x2 = fzero(fun,46)
This gives me:
x1 = 42.109
x2 = 45.525
Note that we needed to know the approximate locations for these two zeros. There is no easy way around this that I know of. If one knows that there are two zero crossings, and the general shape of the function, one can find the local minimum:
[~,fimin] = min(m);
fimin = fi(fimin);
and then find the zero crossings between each of the end points and the local minimum:
x1 = fzero(fun,[fi(1),fimin])
x2 = fzero(fun,[fimin,fi(end)])
You need to use an anonymous function so that you can pass additional arguments to interp1.
Try this:
fi = [41.309180589278, 41.8087915220215, 42.8081880760916, ...
43.8078181874395, 44.8076823745539, 45.8077808710707, 46.3079179803177];
m = [1.00047608139868, 1.00013712198767, 0.999680989440986, ...
0.999524195487826, 0.999671686649694, 1.00012913666266, 1.00047608139868];
fzero(@(x) 1-interp1(fi,m,x), 43)
fzero(@(x) 1-interp1(fi,m,x), 45)
The 43 and 45 are the initialization for x for fzero. You need to run the fzero twice to find the two solutions.
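Because interp1 defaults to linear interpolation, the crossings of the interpolated curve can also be located segment by segment, without an initial guess. The sketch below shows that idea in Python as a neutral illustration; it is not the fzero approach from the answers, and the helper name crossings is made up for this example.

```python
def crossings(xs, ys, level):
    """Return the x positions where the piecewise-linear curve
    through (xs, ys) takes the value `level`."""
    roots = []
    for i in range(len(xs) - 1):
        d0 = ys[i] - level
        d1 = ys[i + 1] - level
        if d0 == 0:
            # A sample point lies exactly on the level.
            roots.append(xs[i])
        elif d0 * d1 < 0:
            # Sign change: interpolate linearly inside this segment.
            roots.append(xs[i] + (xs[i + 1] - xs[i]) * d0 / (d0 - d1))
    if ys[-1] == level:
        roots.append(xs[-1])
    return roots


fi = [41.309180589278, 41.8087915220215, 42.8081880760916,
      43.8078181874395, 44.8076823745539, 45.8077808710707, 46.3079179803177]
m = [1.00047608139868, 1.00013712198767, 0.999680989440986,
     0.999524195487826, 0.999671686649694, 1.00012913666266, 1.00047608139868]

roots = crossings(fi, m, 1.0)
print(roots)
```

For this data the two values agree with the fzero results above to about three decimal places. With a non-linear interp1 method such as 'spline', the segment-wise formula would only give starting guesses for fzero.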

matlab - surf with 3 vectors

I got an assignment to implement the following function using 2 vectors and 2 parameters:
f2= ((cos(x)).^2/w2 + (sin(y)).^2/w3)*(-2*10^5+2.5*10^5);
I get a new vector which looks odd.
With w2=3, w3=5, x a vector 1:10 and y a vector 2:2:20,
I get this vector:
-134331.694334541 -156930.983357435 -114422.587024547 -115454.347733941 -178496.698590187 -108777.226902904 -103570.802583501 -194091.395692804 -102621.044915812 -99656.4625498021
My question is: how do I use the surf function on f2?
I am told to use surf on f2,
but when I try surf(x,y,f2) I get an error saying f2 needs to be a matrix.
Any ideas? This is my code:
x=[1:10];
y=[2:2:20];
w2=3;
w3=5;
f2= -2*10^5+2.5*10^5*((cos(x)).^2/w2 + (sin(y)).^2/w3)
surf(x,y,f2);
I am not sure, but the assignment may be looking for something like this:
[newX,newY]=meshgrid(x,y);
f2= ((cos(newX)).^2/w2 + (sin(newY)).^2/w3)*(-2*10^5+2.5*10^5);
surf(newX,newY,f2);
Here, you calculate value of f for all combinations of x and y using meshgrid. Then you visualize them using surf.
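To make clear what meshgrid contributes, here is a minimal sketch in plain Python (a neutral illustration, not MATLAB) that evaluates the same expression for every (x, y) pair. The function name surface is made up for this example; rows follow y and columns follow x, which is the layout meshgrid produces.

```python
import math


def surface(xs, ys, w2, w3):
    """Evaluate f2 for every combination of x and y.
    Result has len(ys) rows and len(xs) columns."""
    scale = -2e5 + 2.5e5  # same constant factor as in the answer's formula
    return [[(math.cos(x) ** 2 / w2 + math.sin(y) ** 2 / w3) * scale
             for x in xs]
            for y in ys]


x = list(range(1, 11))      # 1:10
y = list(range(2, 21, 2))   # 2:2:20
f2 = surface(x, y, 3, 5)
print(len(f2), len(f2[0]))  # 10 10
```

The original code produced only 10 values because cos(x) and sin(y) were combined element by element; surf needs the full 10-by-10 matrix.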

Basic MATLAB - how to "create" a variant

I need to implement Lagrange interpolation in MATLAB.
I (think I've) understood how it works. I don't get how to implement the x.
Let's say I want to calculate it for these points: (0,1) (1,1) (2,4)
So I need to do this:
l_0(x) = (x-1)(x-2) / ((0-1)(0-2))
l_1(x) = (x-0)(x-2) / ((1-0)(1-2))
l_2(x) = (x-0)(x-1) / ((2-0)(2-1))
and so on...
So I want to write a MATLAB function that receives the (x,y) points and returns the coefficients of the resulting polynomial.
In this case: ( 3/2, -3/2, 1 )
I DON'T WANT A CODE FOR AN ANSWER - just how to implement the above x variant.
Thanks
I'm not sure if this is what you need, but I think what you are looking for is MATLAB anonymous functions.
In your case, you would write
l_0 = @(x) (x-1).*(x-2) ./ ((0-1)*(0-2));
l_1 = @(x) (x-0).*(x-2) ./ ((1-0)*(1-2));
l_2 = @(x) (x-0).*(x-1) ./ ((2-0)*(2-1));
Then you can use your Lagrange polynomials like regular functions:
val = y0 * l_0(x) + y1 * l_1(x) + y2 * l_2(x)
Is that what you were looking for?
Well, if you don't want code, then x is simply any value within the range of your input x points. In your case, any value between 0 and 2.

MATLAB XYZ to Grid

I have a tab separated XYZ file which contains 3 columns, e.g.
586231.8 2525785.4 15.11
586215.1 2525785.8 14.6
586164.7 2525941 14.58
586199.4 2525857.8 15.22
586219.8 2525731 14.6
586242.2 2525829.2 14.41
Columns 1 and 2 are the X and Y coordinates (in UTM meters) and column 3 is the associated Z value at the point X,Y; e.g. the elevation (z) at a point is given as z(x,y)
I can read in this file using dlmread() to get 3 variables in the workspace, e.g. X = 41322x1 double, but I would like to create a surface of size (m x n) using these variables. How would I go about this?
Following from the comments below, I tried using TriScatteredInterp (see commands below). I keep getting the result shown below (it appears to be getting some of my surface though):
Any ideas what is causing this result? I think the problem lies with the meshgrid command, though I'm not sure where (or why). I am currently using the following set of commands to produce the above figure (my X and Y columns are in meters, and I know my grid size is 8 m, hence ti/tj going up in 8s):
F = TriScatteredInterp(x,y,z,'nearest');
ti = ((min(x)):8:(max(x)));
tj = ((min(y)):8:(max(y)));
[qx,qy] = meshgrid(ti,tj);
qz = F(qx,qy);
imagesc(qz) %produces the above figure^
I think you want the griddata function. See Interpolating Scattered Data in MATLAB help.
griddata and TriScatteredInterp are extremely slow. Use the utm2deg function from the File Exchange, and from there a combination of vec2mtx to make a regular grid and then imbedm to fit the data to the grid.
For example:
for i = 1:length(X)
[Lat,Lon ] = utm2deg(Easting ,Northing ,Zone);
end
[Grid, R] = vec2mtx(Lat, Lon, gridsize);
Grid= imbedm(Lat, Lon,z, Grid, R);
Maybe you are looking for the function "ndgrid(x,y)" or "meshgrid(x,y)"
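If the points already sit on a regular 8 m grid, as the question suggests, interpolation may not be needed at all: each point can be dropped straight into its cell. The sketch below shows that binning idea in plain Python as a neutral illustration. It is a different technique from griddata or TriScatteredInterp, and the function name xyz_to_grid and the sample points are made up for this example.

```python
def xyz_to_grid(points, cell):
    """Place scattered (x, y, z) points into a regular grid.
    Rows follow y, columns follow x; empty cells stay NaN."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    ncols = int(round((max(xs) - x0) / cell)) + 1
    nrows = int(round((max(ys) - y0) / cell)) + 1
    grid = [[float("nan")] * ncols for _ in range(nrows)]
    for x, y, z in points:
        col = int(round((x - x0) / cell))
        row = int(round((y - y0) / cell))
        grid[row][col] = z  # a later point overwrites an earlier one
    return grid


# Hypothetical points on an 8 m grid (not the data from the question).
pts = [(0.0, 0.0, 1.0), (8.0, 0.0, 2.0), (0.0, 8.0, 3.0)]
grid = xyz_to_grid(pts, 8.0)
print(grid)
```

Cells with no point stay NaN, so gaps in the survey remain visible instead of being filled with interpolated values.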