I am currently creating a new column in a polars data frame using
predictions = [10, 20, 30, 40, 50]
df['predictions'] = predictions
where predictions is a numpy array or list containing values I computed with another tool.
However, polars throws a warning that this option will be deprecated.
How can the same result be achieved using .with_columns()?
The values in the numpy array or list predictions can be added to a polars DataFrame using:
import polars as pl
predictions = [10, 20, 30, 40, 50]
df = df.with_columns(pl.Series(name="predictions", values=predictions))  # returns a new DataFrame, so reassign it
I have a mllib.linalg.Vector in Scala containing Double values in range of (-1; 1). I would like to multiply all of the values by, let's say, 100.
For example I'd like to convert [0.5, 0.3, -0.1] to [50, 30, -10].
How can I do it?
import org.apache.spark.mllib.linalg.Vectors
val vec = Vectors.dense(0.5, 0.3, -0.1)
// Scale every element by 100 and wrap the result in a new dense vector
val vec2 = Vectors.dense(vec.toArray.map(_ * 100))
I can use sort like this to sort elements in an array.
M = sort(A(:));
But is there a good way to get the distinct elements together with their occurrence counts, ordered by how often they occur?
Like this:
ELEM = [10, 60, 30, 20]
OCCU = [30, 25, 10, 5]
You can do the above with a combination of unique() and sort().
First extract only the unique values in the vector using unique(), and use the indices it returns to count how often each value occurs.
Then sort the counts, reorder the unique values the same way, and you'll have what you asked for above.
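The two steps above (count each distinct value, then order by count) can be sketched as follows. This is a Python illustration of the idea rather than the MATLAB code itself; the function name sort_by_occurrence and the sample data are made up for the example, and ties are broken by ascending value, which is an assumption the question does not specify.

```python
from collections import Counter


def sort_by_occurrence(values):
    """Return (elements, counts) ordered from most to least frequent."""
    counts = Counter(values)
    # Sort by descending count; break ties by ascending value (an assumption).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    elems = [value for value, _ in ordered]
    occu = [count for _, count in ordered]
    return elems, occu


# Hypothetical sample: 10 occurs three times, 60 twice, 30 once.
sample = [10, 60, 10, 30, 60, 10]
elems, occu = sort_by_occurrence(sample)
print(elems, occu)  # [10, 60, 30] [3, 2, 1]
```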
I have a 1x5000 matrix named values. Currently I check whether certain values exist in that matrix, like this:
if any(values == 10) && any(values == 45) && any(values == 55) % and so on
% plot graph here
end
This checks whether the numbers 10, 45, 55 exist somewhere in that matrix. Now I want to change this statement so that, instead of only checking that they exist, it checks that the numbers appear one after another in a pre-defined order. Example:
if values has 10, 25, 35, 55, 60 <- they must come like this, not mixed
do stuff
end
Help would be greatly appreciated, as I am new to MATLAB.
So far I have tried:
values = [10, 50, 30, 60, 40];
[~, indices] = ismember([10, 50, 30, 60, 40], values);
if all(indices > 0) && issorted(indices)
% Do stuff
end
This did not work: the if statement is never satisfied.
To make it clearer: if I set values = [10, 20, 50, 25, 33]; then those values must appear somewhere in the matrix directly after each other, in exactly the order I set. Example: the matrix 10, 55, 90, 33, 10, 20, 50, 25, 33, 100, 59 would give true, as it contains one sequence of 10, 20, 50, 25, 33.
If you want to determine if an exact series of values appears within your array, you can use strfind. Although the function was created for strings, it also works for numeric datatypes. If the sub-array exists in the array, the output of strfind is the starting index of each occurrence; if the sub-array does not exist, the output is an empty array [].
if ~isempty(strfind(values, [10 25 35 55 60]))
% Do stuff
end
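For readers who want to see what this kind of search does under the hood, here is a minimal Python sketch of a contiguous sub-array search. It is not how strfind is implemented; find_subsequence is a hypothetical helper, and unlike MATLAB it returns 0-based indices.

```python
def find_subsequence(values, pattern):
    """Return the 0-based start indices where pattern occurs contiguously in values."""
    values, pattern = list(values), list(pattern)
    n, m = len(values), len(pattern)
    if m == 0:
        return []
    # Compare every window of length m against the pattern.
    return [i for i in range(n - m + 1) if values[i:i + m] == pattern]


# Data taken from the example in the question.
matrix = [10, 55, 90, 33, 10, 20, 50, 25, 33, 100, 59]
hits = find_subsequence(matrix, [10, 20, 50, 25, 33])
print(hits)  # [4]

if hits:
    pass  # Do stuff
```

An empty result plays the same role as the empty array [] returned by strfind when there is no match.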
I have a question about finding the index of the maximum value along each row of a matrix. How can I do this in Spark Scala? The function would be like NumPy's argmax in Python.
What's the type of your matrix? If it's a RowMatrix, you can access the RDD of its row vectors using rows.
Then it's a simple matter of finding the maximum of each vector of this RDD[Vector], if I understand correctly. You can therefore use myMatrix.rows.map{_.toArray.max}, or myMatrix.rows.map{_.argmax} if you want the index of the maximum rather than its value.
If you have a DenseMatrix, you can convert it to an Array, but note that toArray returns the elements in column-major order. Transposing the matrix first gives you the elements in row-major order. You can then take the number of columns of your matrix with numCols and use the collections method grouped to obtain the rows.
myMatrix.transpose.toArray.grouped(myMatrix.numCols).map{_.max}
I think you will have to get the values as an array to get the maximum value.
import org.apache.spark.mllib.linalg.{Matrices, Matrix}
val dm: Matrix = Matrices.dense(3, 2, Array(1.0, 3.0, 5.0, 2.0, 4.0, 6.0))
val result = dm.toArray.max
println(result)
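Since the question asks for the index of the maximum in each row rather than a single maximum, here is a small Python sketch of a row-wise argmax over a flat array. It assumes the values are laid out in column-major order, which is what Matrices.dense expects; row_argmax is a hypothetical helper, not a Spark API.

```python
def row_argmax(values, num_rows, num_cols):
    """Index of the largest entry in each row of a column-major flat array."""
    if len(values) != num_rows * num_cols:
        raise ValueError("values does not match the given shape")
    result = []
    for i in range(num_rows):
        # Entry (i, j) of a column-major array lives at offset j * num_rows + i.
        row = [values[j * num_rows + i] for j in range(num_cols)]
        # On ties, the first (lowest) column index wins.
        result.append(max(range(num_cols), key=row.__getitem__))
    return result


# Same data as the 3x2 matrix above; its rows are [1, 2], [3, 4], [5, 6].
flat = [1.0, 3.0, 5.0, 2.0, 4.0, 6.0]
print(row_argmax(flat, 3, 2))  # [1, 1, 1]
```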
Is it possible to select only particular columns of a matrix? E.g. I have a 10x100 shaped matrix and I would only like to get these 4 columns: 231, 82, 12, 493.
Yes, it is possible. If your matrix is named A then A(:, [3,7,12,89]) will retrieve the columns numbered 3, 7, 12, and 89.