Shifting line index based on a specific criteria Matlab - matlab

I have a table similar to the one above. For each species, I have the growth rate of year1 to 100. Suppose the table is called Data_Table.
I have to shift the index of growth rate column for each species based on a criteria.
For instance,
if criteria
First_Index = Incubation - x;
else
First_Index = Incubation - x + 1;
end
First index can take value 1 or 2.
If it is equal to 1, the growth rate for the species should not change.
If it is equal to 2, the second growth rate becomes the first growth
rate, the 3rd growth rate becomes the 2nd and so on ..
Data_Table(1,'Growth_Rate') = Data_Table(First_Index, 'Growth_Rate');
Any idea on how I can do that simply on Matlab without touching the other columns ? Circshift as written below is completely messing up the data in data_table.
Data_Table(:,'Growth_Rate') = circshift(Data_Table(:, 'Growth_Rate'),First_Index);
Thanks in advance

Related

Length scaling orientation data in MATLAB

Problem
I have a data set of describing geological structures. Each structure has a row with two attributes - its length and orientation (0-360 degrees).
Within this data set, there are two types of structure.
Type 1: less data points, but the structures are physically larger (large length, and so more significant).
Type 2: more data points, but the structures are physically smaller (small length, and so less significant).
I want to create a rose plot to show the spread of the structures' orientations. However, I want this plot to also represent the significance of the structures in combination with the direction they face - taking into account the lengths.
Is it possible to scale this by length in MATLAB somehow so that the subset which is less numerous is not under represented, when the structures are large?
Example
A data set might contain:
10 structures orientated North-South, 50km long.
100 structures orientated East-West, 0.5km long.
In this situation the East-West population would look to be more significant than the North-South population based on absolute numbers. However, in reality the length of the members contributing to this population are much smaller and so the structures are less significant.
Code
This is the code I have so far:
load('WG_rose_data.xy')
azimuth = WG_rose_data(:,2);
length = WG_rose_data(:,1);
rose(azimuth,20);
Where WG_rose_data.xy is a data file with 2 columns containing the length and azimuth (orientation) data for the geological structures.
For each row in your data, you could duplicate it a given number of times, according to its length value. Therefore, if you had a structure with length 50, it counts for 50 data points, whereas a structure with length 1 only counts as 1 data point. Of course you have to round your lengths since you can only have integer numbers of rows.
This could be achieved like so, with your example data in the matrix d
% Set up example data: 10 large vertical structures, 100 small ones perpendicular
d = [repmat([0, 50], 10, 1); repmat([90, .5], 100, 1)];
% For each row, duplicate the data in column 1, according to the length in column 2
d1 = [];
for ii = 1:size(d,1)
% make d(ii,2) = length copies of d(ii,1) = orientation
d1(end+1:end+ceil(d(ii,2))) = d(ii,1);
end
Output rose plot:
You could fine tune how to duplicate the data to achieve the desired balance of actual data and length weighting.
Thanks for all the help with this. This code is my final working version for reference:
clear all
close all
% Input dataset
original_data = load('WG_rose_data.xy');
d = [];
%reformat azimuth
d(:,1)= original_data(:,2);
%reformat length
d(:,2)= original_data(:,1);
% For each row, duplicate the data in column 1, according to the length in column 2
d1 = [];
for a = 1:size(d,1)
d1(end+1:end+ceil(d(a,2))) = d(a,1);
end
%create oposite directions for rose diagram
length_d1_azi = length(d1);
d1_op_azi=zeros(1,length_d1_azi);
for i = 1:length_d1_azi
d1_op_azi(i)=d1(i)-180;
if d1_op_azi(i) < 1;
d1_op_azi(i) = 360 - (d1_op_azi(i)*-1);
end
end
%join calculated oposites to original input
new_length = length_d1_azi*2;
all=zeros(new_length,1);
for i = 1:length_d1_azi
all(i)=d1(i);
end
for j = length_d1_azi+1:new_length;
all(j)=d1_op_azi(j-length_d1_azi);
end
%convert input aray into radians to plot
d1_rad=degtorad(all);
rose(d1_rad,24)
set(gca,'View',[-90 90],'YDir','reverse');

How to eliminate sudden changes in a vector in Matlab?

Assume that I have vector shown in the figure below. By common sense, we can see that there are 2 values which suddenly depart from the trend of the vector.
How do I eliminate these sudden changes. I mean how do I automatically detect and replace these noise values by the average value of their neighbors.
Define a threshold, compute the average values, then compare the relative error between the values and the averages of their neighbors:
threshold = 5e-2;
averages = [v(1); (v(3:end) + v(1:end-2)) / 2; v(end)];
is_outlier = (v.^2 - averages.^2) > threshold^2 * averages.^2;
Then replace the outliers:
v(is_outlier) = averages(is_outlier);

Matlab creating median on entire movie (Avoid memory issue)

So my computer is not too strong.. to say the least..
Yet I want to create a median of all pixels in an entire specific movie.
I was able to do it for a sequence of frames in memory.. but I am not sure on how to do it when reading more frames each time... how do I give median weight?
(like I'll read 100 frames each time but the median has to update according to the current median * 100 * times I read + 100 * current image..)
I have this code:
mov = VideoReader('MVI_3478.MOV');
seq = read(mov, [1 frames]);
% create background
channels = size(seq, 3);
height = size(seq,1);
width = size(seq,2);
BG = zeros(height, width, channels, 'uint8');
for c = 1:channels
for y = 1:height
for x = 1:width
BG(y,x,c) = median(seq(y,x,c,:));
end
end
end
and my question is, given that I will add another loop above everything, how to give median weight?
Thanks!
There is no possibility to calculate the median this way. The required Information is lost.
Example:
median([1,2,3,4,5,6,7]) is 4
median([1,2,3,3,5,6,7]) is 3
median([1,2,3])=2
median([4,5,6,7])=5
median([3,5,6,7])=5
Thus, for both subsequence you get the partial results 2 and 5, while the median is 3 in one case and 4 in the other case.
The only possibility I see is some binary search approach:
smaller=0
larger=0
equal=0
el=numel(s)
while(smaller>=el/2||larger>el/2||equal==0)
guess=..
smaller=0
larger=0
equal=0
for c = 1:channels
for y = 1:height
for x = 1:width
s=seq(y,x,c,:)
smaller=smaller+numel(s(s<guess);
larger=larger+numel(s(s>guess);
equal=equal+numel(s(s=guess);
end
end
end
end
This is only a sketch, the code has to be completed. Guess has to be filled with some binary search strategy.
In case of a large number of frames, calculating the median in a progressive manner can be problem since the median is a global order statistic and does not have a structure. The classical method is to use the fact that we are working with grayscale 8 bit values (256). Thus for any pixel p(x,y,n) one needs to maintain a histogram with 256 bins with each bin counting n values( as there are n frames).
Thus at each update we will have:
value = p(x,y,i); %for the ith frame
H(x,y,value) = H(x,y,value) + 1; %updating your histogram,
and then sort the histogram by their frequencies and pick the middle value: https://math.stackexchange.com/questions/202302/how-to-calculate-median-and-standard-deviation-from-histogram
The size of this counter can be decided based on the number of frames you have in the video N = log2(n) bit. The median search now is simplified since its constant time search within a histogram. This also helps when concatenating many histograms since the search remains a constant time search independent.
Thus finally the total size of your histograms would be XYN bits, where X and Y are the dimensions of your image.

How do I create ranking (descending) table in matlab based on inputs from two separate data tables? [closed]

This question is unlikely to help any future visitors; it is only relevant to a small geographic area, a specific moment in time, or an extraordinarily narrow situation that is not generally applicable to the worldwide audience of the internet. For help making this question more broadly applicable, visit the help center.
Closed 10 years ago.
I have four data sets (please bear with me here):
1st Table: List of 10 tickers (stock symbols) in one column in txt format in matlab.
2nd table: dates in numerical format in one column (10 days in double format).
3rd table: I have 10*10 data set of random numbers (assume 0-1 for simplicity). (Earnings Per Share growth EPS for example)--so I want high EPS growth in my ranking for portfolio construction.
4th table: I have another 10*10 data set of random numbers (assume 0-1 for simplicity). (Price to earnings ratios for example daily).-so I want low P/E ratio in my ranking for portfolio construction.
NOW: I want to rank portfolio of stocks each day made up of 3 stocks (largest values) from table one for a particular day and bottom three stocks from table 2 (smallest values). The output must be list of tickers for each day (3 in this case) based on combined ranking of the two factors (table 3 & 4 as described).
Any ideas? In short I need to end up with a top bucket with three tickers...
It is not entirely clear from the post what you are trying to achieve. Here is a take based on guessing, with various options.
Your first two "tables" store symbols for stocks and days (irrelevant for ranking). Your third and fourth are scores arranged in a stock x day manner. Let's assume stocks vertical, days horizontal and stocks symbolized with a value in [1:10].
N = 10; % num of stocks
M = 10; % num of days
T3 = rand(N,M); % table 3 stocks x days
T4 = rand(N,M); % table 4 stocks x days
Sort the score tables in ascending and descending order (to get upper and lower scores per day, i.e. per column):
[Sl,L] = sort(T3, 'descend');
[Ss,S] = sort(T4, 'ascend');
Keep three largest and smallest:
largest = L(1:3,:); % bucket of 3 largest per day
smallest = S(1:3,:); % bucket of 3 smallest per day
IF you need the ones in both (0 is nan):
% Inter-section of both buckets
indexI = zeros(3,M);
for i=1:M
z = largest(ismember(largest(:,i),smallest(:,i)));
if ~isempty(z)
indexI(1:length(z),i) = z;
end
end
IF you need the ones in either one (0 is nan):
% Union of both buckets
indexU = zeros(6,M);
for i=1:M
z = unique([largest(:,i),smallest(:,i)]);
indexU(1:length(z),i) = z;
end
IF you need a ranking of scores/stocks from the set of largest_of_3 and smallest_of_4:
scoreAll = [Sl(1:3,:); Ss(1:3,:)];
indexAll = [largest;smallest];
[~,indexSort] = sort(scoreAll,'descend');
for i=1:M
indexBest(:,i) = indexAll(indexSort(1:3,i),i);
end
UPDATE
To get a weighted ranking of the final scores, define the weight vector (1 x scores) and use one of the two options below, before sorting scoreAllW instead of scoreAll:
w = [0.3 ;0.3; 0.3; 0.7; 0.7; 0.7];
scoreAllW = scoreAll.*repmat(w,1,10); % Option 1
scoreAllW = bsxfun(#times, scoreAll, w); % Option 2

How can I generate a binary matrix with specific patterns?

I have a binary matrix of size m-by-n. Given below is a sample binary matrix (the real matrix is much larger):
1010001
1011011
1111000
0100100
Given p = m*n, I have 2^p possible matrix configurations. I would like to get some patterns which satisfy certain rules. For example:
I want not less than k cells in the jth column as zero
I want the sum of cell values of the ith row greater than a given number Ai
I want at least g cells in a column continuously as one
etc....
How can I get such patterns satisfying these constraints strictly without sequentially checking all the 2^p combinations?
In my case, p can be a number like 2400, giving approximately 2.96476e+722 possible combinations.
Instead of iterating over all 2^p combinations, one way you could generate such binary matrices is by performing repeated row- and column-wise operations based on the given constraints you have. As an example, I'll post some code that will generate a matrix based on the three constraints you have listed above:
A minimum number of zeroes per column
A minimum sum for each row
A minimum sequential length of ones per column
Initializations:
First start by initializing a few parameters:
nRows = 10; % Row size of matrix
nColumns = 10; % Column size of matrix
minZeroes = 5; % Constraint 1 (for columns)
minRowSum = 5; % Constraint 2 (for rows)
minLengthOnes = 3; % Constraint 3 (for columns)
Helper functions:
Next, create a couple of functions for generating column vectors that match constraints 1 and 3 from above:
function vector = make_column
vector = [false(minZeroes,1); true(nRows-minZeroes,1)]; % Create vector
[vector,maxLength] = randomize_column(vector); % Randomize order
while maxLength < minLengthOnes, % Loop while constraint 3 is not met
[vector,maxLength] = randomize_column(vector); % Randomize order
end
end
function [vector,maxLength] = randomize_column(vector)
vector = vector(randperm(nRows)); % Randomize order
edges = diff([false; vector; false]); % Find rising and falling edges
maxLength = max(find(edges == -1)-find(edges == 1)); % Find longest
% sequence of ones
end
The function make_column will first create a logical column vector with the minimum number of 0 elements and the remaining elements set to 1 (using the functions TRUE and FALSE). This vector will undergo random reordering of its elements until it contains a sequence of ones greater than or equal to the desired minimum length of ones. This is done using the randomize_column function. The vector is randomly reordered using the RANDPERM function to generate a random index order. The edges where the sequence switches between 0 and 1 are detected using the DIFF function. The indices of the edges are then used to find the length of the longest sequence of ones (using FIND and MAX).
Generate matrix columns:
With the above two functions we can now generate an initial binary matrix that will at least satisfy constraints 1 and 3:
binMat = false(nRows,nColumns); % Initialize matrix
for iColumn = 1:nColumns,
binMat(:,iColumn) = make_column; % Create each column
end
Satisfy the row sum constraint:
Of course, now we have to ensure that constraint 2 is satisfied. We can sum across each row using the SUM function:
rowSum = sum(binMat,2);
If any elements of rowSum are less than the minimum row sum we want, we will have to adjust some column values to compensate. There are a number of different ways you could go about modifying column values. I'll give one example here:
while any(rowSum < minRowSum), % Loop while constraint 2 is not met
[minValue,rowIndex] = min(rowSum); % Find row with lowest sum
zeroIndex = find(~binMat(rowIndex,:)); % Find zeroes in that row
randIndex = round(1+rand.*(numel(zeroIndex)-1));
columnIndex = zeroIndex(randIndex); % Choose a zero at random
column = binMat(:,columnIndex);
while ~column(rowIndex), % Loop until zero changes to one
column = make_column; % Make new column vector
end
binMat(:,columnIndex) = column; % Update binary matrix
rowSum = sum(binMat,2); % Update row sum vector
end
This code will loop until all the row sums are greater than or equal to the minimum sum we want. First, the index of the row with the smallest sum (rowIndex) is found using MIN. Next, the indices of the zeroes in that row are found and one of them is randomly chosen as the index of a column to modify (columnIndex). Using make_column, a new column vector is continuously generated until the 0 in the given row becomes a 1. That column in the binary matrix is then updated and the new row sum is computed.
Summary:
For a relatively small 10-by-10 binary matrix, and the given constraints, the above code usually completes in no more than a few seconds. With more constraints, things will of course get more complicated. Depending on how you choose your constraints, there may be no possible solution (for example, setting minRowSum to 6 will cause the above code to never converge to a solution).
Hopefully this will give you a starting point to begin generating the sorts of matrices you want using vectorized operations.
If you have enough constraints, exploring all possible matrices could be attempted:
// Explore all possibilities starting at POSITION (0..P-1)
explore(int position)
{
// Check if one or more constraints can't be verified anymore with
// all values currently set.
invalid = ...;
if (invalid) return;
// Do we have a solution?
if (position >= p)
{
// print the matrix
return;
}
// Set one more value and continue exploring
for (int value=0;value<2;value++)
{ matrix[position] = value; explore(position+1); }
}
If the number of constraints is low, this approach will take too much time.
In this case, for the kind of constraints you gave as examples, simulated annealing may be a good solution.
You must design an energy function, high when all constraints are met. That would be something like that:
Generate a random matrix
Compute energy E0
Change one cell
Compute energy E1
If E1>E0, or E0-E1 is smaller than f(temperature), keep it, otherwise reverse the move
Update temperature, and goto 2 unless stop criterion is reached
If all the contraints relate to columns (as is the case in the question), then you can find all possible valid columns and check that each column in the matrix is in this set. (i.e. when you consider each column independently, you reduce the number of possibilities a lot.)
I might be way off here, but I remember doing something similar once with some genetic algorithm.
Check out pseudo boolean constraints (also called 0-1 integer programming).
This is virtually impossible if your constraint set is complex enough. You might try to use a stochastic optimizer, like simulated annealing, particle swarm optimization, or a genetic algorithm to find a feasible solution.
However, if you can generate one (non-random) solution to such a problem, then often you can generate others by random permutations made to the existing solution.