Show values in MATLAB depending on previous values in another column

I'm struggling with the following problem in MATLAB:
I've got a table containing a few column vectors: Day, Name, Result.
My goal is to create another column vector (New Vector) that shows, in each row, the previous day's result for the corresponding name.
| Day | Name | Result | New Vector |
|-----|------|--------|------------|
| 1 | A | 1.2 | 0 |
| 1 | C | 0.9 | 0 |
| 1 | B | 0.7 | 0 |
| 1 | D | 1.1 | 0 |
| 2 | B | 1 | 0.7 |
| 2 | A | 1.5 | 1.2 |
| 2 | C | 1.4 | 0.9 |
| 2 | D | 0.9 | 1.1 |
| 3 | B | 1.1 | 1 |
| 3 | C | 1.3 | 1.4 |
| 3 | A | 1 | 1.5 |
| 3 | D | 0.3 | 0.9 |
For example, row 5:
It is day 2 and the name is "B". The vector Result shows 1.0 in the same row, but what I want my new vector to show is the result value of "B" from the previous day (day 1 in this example).
Since "B" appears on the previous day in row 3, the result value is 0.7, which should be shown in row 5 of my New Vector.
When Day equals 1 there is no previous day, so there are no values to show; consequently I want 0 in each row of Day 1.
I've already tried some combinations of unique to get the indices, plus some if clauses, but it did not work at all; I'm relatively new to MATLAB and still quite confused.
Is anybody able to help? Thank you so much!!

Your question is not well defined, but the code below solves your problem as it is stated.
This code works by internally sorting each Day's information in order of Name. This allows New Vector to be created easily by simply shifting and then inverting the sort operation.
close all; clear all; clc;
% A few column vectors
Day = [1;1;1;1;2;2;2;2;3;3;3;3];
Name = ['A';'C';'B';'D';'B';'A';'C';'D';'B';'C';'A';'D'];
Result = [1.2;0.9;0.7;1.1;1;1.5;1.4;0.9;1.1;1.3;1;0.3];
% Sort the table (so Name is in order for each Day)
[~,Index] = sort(max(Name)*Day + Name);
Day = Day(Index);
Name = Name(Index);
Result = Result(Index);
% Shift Result by one day (4 rows = number of distinct names) to get sorted NewVector
NewVector = circshift(Result, 4);
NewVector(1:4) = 0; % no previous day exists for Day 1
% Unsort NewVector, to get original table ordering
ReverseIndex(Index) = 1:length(Index);
NewVector = NewVector(ReverseIndex)
This prints the following result:
NewVector =
0
0
0
0
0.7000
1.2000
0.9000
1.1000
1.0000
1.4000
1.5000
0.9000
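For readers coming from Python, the same previous-day lookup can be sketched with pandas (not part of the original MATLAB answer; a merge on a shifted Day column replaces the sort/unsort trick):

```python
import pandas as pd

# The question's table, rebuilt for illustration.
df = pd.DataFrame({
    'Day':    [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    'Name':   list('ACBDBACDBCAD'),
    'Result': [1.2, 0.9, 0.7, 1.1, 1.0, 1.5, 1.4, 0.9, 1.1, 1.3, 1.0, 0.3],
})

# Shift each row one day forward, then match it back on (Day, Name):
# every row picks up the previous day's Result for its own Name.
prev = df.assign(Day=df['Day'] + 1).rename(columns={'Result': 'NewVector'})
out = df.merge(prev, on=['Day', 'Name'], how='left').fillna({'NewVector': 0})
```

Rows on Day 1 find no match in the shifted copy, so the left merge leaves them NaN and `fillna` turns them into 0, matching the expected New Vector.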

Related

PostgreSQL matrix multiplication of a table (multiply a table by itself)

I have a square table similar to this:
|   | c | d |
|---|---|---|
| a | 1 | 2 |
| b | 3 | 4 |
I want to calculate matrix multiplication result where this table is multiplied by itself, i.e., this:
|   | c  | d  |
|---|----|----|
| a | 7  | 10 |
| b | 15 | 22 |
While I understand that SQL should not be my language of choice for this task, I need to do this in that language. How do I do this?
It will make your life easier if you represent your matrix elements as (i, j, a[i,j]) triples. Note the join condition m1.j = m2.i, which implements the sum over the inner index:
WITH matrix AS (SELECT * FROM
(VALUES ('a','a',1), ('a','b',2), ('b','a',3), ('b','b',4)) AS t(i,j,a))
SELECT m1.i AS i, m2.j AS j, SUM(m1.a * m2.a) AS a
FROM matrix m1, matrix m2
WHERE m1.j = m2.i
GROUP BY m1.i, m2.j
ORDER BY i, j
This will handle sparse matrices nicely as well.
Here is a dbfiddle that you might be able to visualize.
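For comparison, the same triple-representation multiply can be sketched in Python (illustrative only; the nested loop mirrors the SQL join and GROUP BY):

```python
from collections import defaultdict

def matmul_triples(m1, m2):
    """Multiply two sparse matrices given as (i, j, value) triples,
    mirroring SELECT m1.i, m2.j, SUM(m1.a * m2.a) ...
    WHERE m1.j = m2.i GROUP BY m1.i, m2.j."""
    out = defaultdict(int)
    for i1, j1, a1 in m1:
        for i2, j2, a2 in m2:
            if j1 == i2:            # the join condition m1.j = m2.i
                out[(i1, j2)] += a1 * a2
    return dict(out)

m = [('a', 'a', 1), ('a', 'b', 2), ('b', 'a', 3), ('b', 'b', 4)]
result = matmul_triples(m, m)
# result == {('a','a'): 7, ('a','b'): 10, ('b','a'): 15, ('b','b'): 22}
```

Only nonzero entries are stored and iterated, which is why this representation handles sparse matrices well.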

Pyspark - redistribute percentages

I have a table like the following:
city | center | qty_out | qty_out %
----------------------------------------
A | 1 | 10 | .286
A | 2 | 2 | .057
A | 3 | 23 | .657
B | 1 | 40 | .8
B | 2 | 10 | .2
city-center is unique/the primary key.
If any center within a city has a qty_out % of less than 10% (.10), I want to ignore it and redistribute its % among the other centers of the city. So the result above would become
city | center | qty_out_%
----------------------------------------
A | 1 | .3145
A | 3 | .6855
B | 1 | .8
B | 2 | .2
How can I go about this? I was thinking a window function to partition but can't think of a window function to use with this
column_list = ["city","center"]
w = Window.partitionBy([col(x) for x in column_list]).orderBy('qty_out_%')
I am not a statistician, so I cannot comment on the equation; however, if I write the Spark code as literally as you described, it looks like this.
from pyspark.sql import functions as F
from pyspark.sql.window import Window

w = Window.partitionBy('city')
redist_cond = F.when(F.col('qty_out %') < 0.1, F.col('qty_out %'))
df = (df.withColumn('redist', F.sum(redist_cond).over(w) / (F.count('*').over(w) - F.count(redist_cond).over(w)))
.fillna(0, subset=['redist'])
.filter(F.col('qty_out %') >= 0.1)
.withColumn('qty_out %', redist_cond.otherwise(F.col('qty_out %') + F.col('redist')))
.drop('redist'))
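The same redistribution can be sketched in plain pandas (a sketch with hypothetical column names, not the PySpark answer itself):

```python
import pandas as pd

# Hypothetical reconstruction of the question's data.
df = pd.DataFrame({
    'city':   ['A', 'A', 'A', 'B', 'B'],
    'center': [1, 2, 3, 1, 2],
    'pct':    [0.286, 0.057, 0.657, 0.8, 0.2],
})

small = df['pct'] < 0.10
# Per-city total of the ignored shares, split evenly over the kept centers.
redist = (df[small].groupby('city')['pct'].sum()
          / df[~small].groupby('city').size())
out = df[~small].copy()
out['pct'] += out['city'].map(redist).fillna(0)
```

Cities with no small centers (B here) get NaN from the map and fall back to 0, so their percentages pass through unchanged.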

Extract contents from cell array

I have a series of images stored in a cell array A, so every entry of A contains an image (matrix). All matrices are equally sized.
Now I want to extract the values at a specific position (pixel) across all images, but my current approach seems slow and I think there may be a better way to do it.
% Create data that resembles my problem
N = 5
for i = 1:N
A{i} = rand(5,5);
end
% my current approach
I = size(A{1},1);
J = size(A{1},2);
val = zeros(N,1);
for i = 1:I
for j = 1:J
for k = 1:N
B(k) = A{k}(i,j);
end
% do further operations on B for current i,j, don't save B
end
end
I was thinking there should be some way along the lines of A{:}(i,j) or vertcat(A{:}(i,j)) but both lead to
??? Bad cell reference operation.
I'm using Matlab2008b.
For further information, I use fft on B afterwards.
Here are the results of the answer by Cris
| Code         | # images | Extracting values | FFT      | Overall   |
|--------------|----------|-------------------|----------|-----------|
| Original     | 16       | 12.809 s          | 19.728 s | 62.884 s  |
| Original     | 128      | 105.974 s         | 23.242 s | 177.280 s |
| Answer       | 16       | 42.122 s          | 27.382 s | 104.565 s |
| Answer       | 128      | 36.807 s          | 26.623 s | 102.601 s |
| Answer (mod) | 16       | 14.772 s          | 27.797 s | 77.784 s  |
| Answer (mod) | 128      | 13.637 s          | 28.095 s | 83.839 s  |
The answer's code was modified to B = double(squeeze(A(i,j,:))); because without double the FFT took much longer.
Answer (mod) uses B = double(A(i,j,:)); instead, omitting squeeze.
So the improvement really seems to kick in for larger sets of images; I currently plan on processing ~500 images per run.
Update
Measured with the profile function, the result of using/omitting squeeze
| Code | # Calls | Time |
|--------------------------------|---------|----------|
| B = double(squeeze(A(i,j,:))); | 1431040 | 36.325 s |
| B= double(A(i,j,:)); | 1431040 | 14.289 s |
A{:}(i,j) does not work because A{:} is a comma-separated list of elements, equivalent to A{1},A{2},A{3},...A{end}. It makes no sense to index into such an array.
To speed up your operation, I recommend that you create a 3D matrix out of your data, like this:
A3 = cat(3,A{:});
Of course, this will only work if all elements of A have the same size (as was originally specified in the question).
Now you can quickly access the data like so:
for i = 1:I
for j = 1:J
B = squeeze(A3(i,j,:));
% do further operations on B for current i,j, don't save B
end
end
Depending on the operations you apply to each B, you could vectorize those operations as well.
Edit: Since you apply fft to each B, you can obtain that also without looping:
B_fft = fft(A3,[],3); % 3 is the dimension along which to apply the FFT
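The same stack-and-slice idea translates directly to NumPy, for anyone mirroring this outside MATLAB (an illustrative sketch, not part of the original answer):

```python
import numpy as np

# A list of equally sized 2-D images, analogous to the MATLAB cell array A.
A = [np.random.rand(5, 5) for _ in range(5)]

# Stack along a third axis (MATLAB: A3 = cat(3, A{:})).
A3 = np.stack(A, axis=2)              # shape (5, 5, 5)

# All images' values at pixel (i, j) in one slice
# (MATLAB: squeeze(A3(i,j,:))); no loop over k needed.
B = A3[1, 2, :]

# FFT along the image axis for every pixel at once (MATLAB: fft(A3, [], 3)).
B_fft = np.fft.fft(A3, axis=2)
```

As in the MATLAB answer, stacking once and applying the FFT along the third axis avoids the triple loop entirely.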

Boolean expression from a State machine diagram

I'm having trouble determining the Boolean equations for Q1 and Q2. What I did was enter the values into a Karnaugh map. But since the state diagram only consists of 3 states (00, 01 and 11), I'm a bit unsure of how to set up the map. I know what it would have looked like with four states (00, 01, 11 and 10).
This is what my Karnaugh map looks like; it's probably wrong, though.
Edit: Should I add the last row (10) to my Karnaugh map and just enter don't-cares?
I would say the K-map is OK as a draft, but I would suggest giving each of the output variables (the "new" Q_1 and Q_0 in the next step of the state diagram) its own K-map.
That way you can minimize the function separately for each of them.
I have filled the truth table this way:
+-----------------++-----------+
| input variables || next state|
+-----+-----+-----++-----+-----+
| Q_1 | Q_0 | x || Y_1 | Y_0 |
+-----+-----+-----++-----+-----+
| 0 | 0 | 0 || 0 | 1 |
| 0 | 0 | 1 || 0 | 0 |
| 0 | 1 | 0 || 0 | 0 |
| 0 | 1 | 1 || 1 | 1 |
+-----+-----+-----++-----+-----+
| 1 | 0 | 0 || X | X |
| 1 | 0 | 1 || X | X |
| 1 | 1 | 0 || 0 | 0 |
| 1 | 1 | 1 || 1 | 1 |
+-----+-----+-----++-----+-----+
And the output functions determining the next state (Y_1 as the "new" next Q_1, Y_0 as the "new" next Q_0) are:
The indexes in the Karnaugh maps correspond with the rows of the truth table because of the order of the variables.
Also notice that I used the don't-care X outputs (for the 10 state) to advantage in minimizing the second function (Q_0).
The machine should (theoretically) never reach the don't-care state, so you need not worry about using it in the function.
Without circling the X, the Y_0 function would be longer: Y_0 = ¬x·¬Q_1·¬Q_0 + x·Q_0. With the X it is only: Y_0 = ¬x·¬Q_0 + x·Q_0.
If it seems unclear to you, do not hesitate to ask in a comment.
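As a quick sanity check, the minimized equations can be verified against the truth table above in a few lines of Python (a sketch; Y_1 = x·Q_0 is my reading of the first K-map, and the don't-care rows for state 10 are simply omitted):

```python
# (Q_1, Q_0, x) -> (Y_1, Y_0), copied from the truth table; the X rows
# for the unreachable state Q_1=1, Q_0=0 are left out.
truth = {
    (0, 0, 0): (0, 1),
    (0, 0, 1): (0, 0),
    (0, 1, 0): (0, 0),
    (0, 1, 1): (1, 1),
    (1, 1, 0): (0, 0),
    (1, 1, 1): (1, 1),
}

for (q1, q0, x), (y1, y0) in truth.items():
    # Y_1 = x AND Q_0
    assert (x & q0) == y1
    # Y_0 = (NOT x AND NOT Q_0) OR (x AND Q_0)
    assert (((1 - x) & (1 - q0)) | (x & q0)) == y0
```

Every specified row passes, confirming that circling the X changes nothing for the reachable states.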

How to set sequence numbers of sub-elements in T-SQL using the same element as parent?

I need to build a sequence in T-SQL where the first column contains a repeating group marker and another column defines the ordering.
It is hard to explain, so I'll try with an example.
This is what I need:
|------------|-------------|----------------|
| Group Col | Order Col | Desired Result |
|------------|-------------|----------------|
| D | 1 | NULL |
| A | 2 | 1 |
| C | 3 | 1 |
| E | 4 | 1 |
| A | 5 | 2 |
| B | 6 | 2 |
| C | 7 | 2 |
| A | 8 | 3 |
| F | 9 | 3 |
| T | 10 | 3 |
| A | 11 | 4 |
| Y | 12 | 4 |
|------------|-------------|----------------|
So my marker is A (each time I meet A I must start a new group in my result). All rows before the first A must be set to NULL.
I know that I can achieve this with a loop, but that would be slow, and I need to update a lot of rows (sometimes several thousand).
Is there a way to achieve this without a loop?
You can use window version of COUNT to get the desired result:
SELECT [Group Col], [Order Col],
       COUNT(CASE WHEN [Group Col] = 'A' THEN 1 END)
         OVER (ORDER BY [Order Col]) AS [Desired Result]
FROM mytable
If you need all rows before the first A set to NULL (as in your desired output), use SUM instead of COUNT: SUM stays NULL until the first marker appears, while COUNT returns 0.
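The same running-count idea can be sketched in pandas (hypothetical column names; cumsum plays the role of the window SUM):

```python
import pandas as pd

# The question's sample data.
df = pd.DataFrame({'grp': list('DACEABCAFTAY'),
                   'ord': range(1, 13)}).sort_values('ord')

# Running count of 'A' markers seen so far, like
# SUM(CASE WHEN [Group Col] = 'A' THEN 1 END) OVER (ORDER BY [Order Col]);
# .where(...) keeps rows before the first marker as NULL.
running = (df['grp'] == 'A').astype(int).cumsum()
df['result'] = running.where(running > 0)
```

The `where` call is what distinguishes the SUM behaviour (NULL before the first A) from the COUNT behaviour (0 before the first A).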