Problem
I have a data set describing geological structures. Each structure has a row with two attributes - its length and orientation (0-360 degrees).
Within this data set, there are two types of structure.
Type 1: fewer data points, but the structures are physically larger (large length, and so more significant).
Type 2: more data points, but the structures are physically smaller (small length, and so less significant).
I want to create a rose plot to show the spread of the structures' orientations. However, I want this plot to also represent the significance of the structures in combination with the direction they face - taking into account the lengths.
Is it possible to scale the plot by length in MATLAB somehow, so that the less numerous subset is not under-represented when its structures are large?
Example
A data set might contain:
10 structures orientated North-South, 50km long.
100 structures orientated East-West, 0.5km long.
In this situation the East-West population would look more significant than the North-South population based on absolute numbers. However, in reality the members of this population are much shorter, and so the structures are less significant.
Code
This is the code I have so far:
load('WG_rose_data.xy')
azimuth = WG_rose_data(:,2);
length = WG_rose_data(:,1);
rose(azimuth,20);
Where WG_rose_data.xy is a data file with 2 columns containing the length and azimuth (orientation) data for the geological structures.
For each row in your data, you could duplicate it a given number of times, according to its length value. Therefore, if you had a structure with length 50, it counts for 50 data points, whereas a structure with length 1 only counts as 1 data point. Of course you have to round your lengths since you can only have integer numbers of rows.
This could be achieved like so, with your example data in the matrix d
% Set up example data: 10 large vertical structures, 100 small ones perpendicular
d = [repmat([0, 50], 10, 1); repmat([90, .5], 100, 1)];
% For each row, duplicate the data in column 1, according to the length in column 2
d1 = [];
for ii = 1:size(d,1)
% make d(ii,2) = length copies of d(ii,1) = orientation
d1(end+1:end+ceil(d(ii,2))) = d(ii,1);
end
Output rose plot:
You could fine tune how to duplicate the data to achieve the desired balance of actual data and length weighting.
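One way to fine tune this without rounding the lengths at all is to sum the lengths that fall into each angular bin and plot those sums directly. The following is only a sketch under some assumptions: 15-degree bins, the same example matrix d as above, and a MATLAB release that provides discretize and polarhistogram (R2016b or later). The names edges, bin and binLen are made up for illustration.

```matlab
% Example data as above: column 1 = orientation (degrees), column 2 = length
d = [repmat([0, 50], 10, 1); repmat([90, 0.5], 100, 1)];

edges = 0:15:360;                           % assumed bin edges, in degrees
bin = discretize(mod(d(:,1), 360), edges);  % bin index of each structure
% Total length per bin, instead of the number of structures per bin
binLen = accumarray(bin, d(:,2), [numel(edges)-1, 1]);

% Plot the precomputed sums; polarhistogram expects the edges in radians
polarhistogram('BinEdges', deg2rad(edges), 'BinCounts', binLen.');
```

With the example data, the North-South bin sums to 500 km and the East-West bin to 50 km, so the larger structures dominate the plot even though there are fewer of them.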
Thanks for all the help with this. This code is my final working version for reference:
clear all
close all
% Input dataset
original_data = load('WG_rose_data.xy');
d = [];
%reformat azimuth
d(:,1)= original_data(:,2);
%reformat length
d(:,2)= original_data(:,1);
% For each row, duplicate the data in column 1, according to the length in column 2
d1 = [];
for a = 1:size(d,1)
d1(end+1:end+ceil(d(a,2))) = d(a,1);
end
%create opposite directions for rose diagram
length_d1_azi = length(d1);
d1_op_azi=zeros(1,length_d1_azi);
for i = 1:length_d1_azi
d1_op_azi(i)=d1(i)-180;
if d1_op_azi(i) < 1
d1_op_azi(i) = 360 - (d1_op_azi(i)*-1);
end
end
%join calculated opposites to original input
new_length = length_d1_azi*2;
all_azi=zeros(new_length,1); % named all_azi to avoid shadowing the built-in all
for i = 1:length_d1_azi
all_azi(i)=d1(i);
end
for j = length_d1_azi+1:new_length
all_azi(j)=d1_op_azi(j-length_d1_azi);
end
%convert input array into radians to plot
d1_rad=degtorad(all_azi);
rose(d1_rad,24)
set(gca,'View',[-90 90],'YDir','reverse');
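The two loops that build the opposite directions can probably be collapsed into a few vectorized lines using mod. A minimal sketch, assuming d1 holds azimuths in degrees; the sample values and the names d1_op, both and both_rad are only for illustration:

```matlab
d1 = [0 45 180 270 359];      % sample azimuths in degrees
d1_op = mod(d1 + 180, 360);   % opposite direction, wrapped into [0, 360)
both = [d1(:); d1_op(:)];     % original and mirrored azimuths in one column
both_rad = deg2rad(both);     % rose expects angles in radians
```

Note that mod wraps into [0, 360), so the numbers can differ from the loop version by a full turn in edge cases, but they describe the same directions.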
Related
I have a dataset of points represented by a 2D vector (X).
Each point belongs to a class (Y), represented by an integer value (from 1 to 4).
I want to plot each point with a different symbol depending on its class.
Toy example:
X = randi(100,10,2); % 10 points ranging 1:100 in 2D space
Y = randi(4,10,1); % class of the points (1 to 4)
I create a vector of symbols for each class:
S = {'bx' 'rx' 'b.' 'r.'};
Then I try:
plot(X(:,1), X(:,2), S(Y))
Error using plot
Invalid first data argument
How can I assign to each point of X a different symbol based on the value of Y?
Of course I can use a loop for each class and plot the different classes one by one. But is there a method to directly plot each class with a different symbol?
No need for a loop, use gscatter:
X = randi(100,10,2); % 10 points ranging 1:100 in 2D space
Y = randi(4,10,1); % class of the points (1 to 4)
color = 'brbr';
symbol = 'xx..';
gscatter(X(:,1),X(:,2),Y,color,symbol)
and you will get:
If X has many rows, but there are only a few S types, then I suggest you check out the second approach first. It's optimized for speed instead of readability. It's about twice as fast if the vector has 10 elements, and more than 200 times as fast if the vector has 1000 elements.
First approach (easy to read):
Regardless of approach, I think you need a loop for this:
hold on
arrayfun(@(n) plot(X(n,1), X(n,2), S{Y(n)}), 1:size(X,1))
Or, to write the loop in the "conventional way":
hold on
for n = 1:size(X,1)
plot(X(n,1), X(n,2), S{Y(n)})
end
Second approach (gives same plot as above):
If your dataset is large, you can sort Y with [Y_sorted, sort_idx] = sort(Y), then use sort_idx to reorder the rows of X, like this: X_sorted = X(sort_idx, :);. After this, you split X_sorted into 4 groups, one for each of the individual Y-values, using histc and mat2cell. Then you loop over the four groups and plot each one individually.
This way you only need to loop through four values, regardless of the number of elements in your data. This should be a lot faster if the number of elements is high.
[Y_sorted, Y_index] = sort(Y);
X_sorted = X(Y_index, :);
X_cell = mat2cell(X_sorted, histc(Y,1:numel(S)));
hold on
for ii = 1:numel(X_cell)
plot(X_cell{ii}(:,1),X_cell{ii}(:,2),S{ii})
end
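If the sort and mat2cell steps feel heavy, the same one-plot-per-class idea can be sketched with logical indexing. This assumes the classes are the integers 1 to numel(S), as in the toy example; idx is just an illustrative name.

```matlab
X = randi(100,10,2);            % 10 points ranging 1:100 in 2D space
Y = randi(4,10,1);              % class of the points (1 to 4)
S = {'bx' 'rx' 'b.' 'r.'};      % one line spec per class

figure
hold on
for k = 1:numel(S)
    idx = (Y == k);                      % rows of X that belong to class k
    plot(X(idx,1), X(idx,2), S{k})       % one plot call per class
end
hold off
```

As in the second approach, the loop runs once per class rather than once per point.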
Benchmarking:
I did a very simple benchmarking of the two approaches using timeit. The result shows that the second approach is a lot faster:
For 10 elements:
First approach: 0.0086 s
Second approach: 0.0037 s
For 1000 elements:
First approach: 0.8409 s
Second approach: 0.0039 s
Assume that I have the vector shown in the figure below. By common sense, we can see that there are 2 values which suddenly depart from the trend of the vector.
How do I eliminate these sudden changes? That is, how do I automatically detect these noise values and replace them by the average value of their neighbors?
Define a threshold, compute the average values, then compare the relative error between the values and the averages of their neighbors:
threshold = 5e-2;
averages = [v(1); (v(3:end) + v(1:end-2)) / 2; v(end)];
is_outlier = (v.^2 - averages.^2) > threshold^2 * averages.^2;
Then replace the outliers:
v(is_outlier) = averages(is_outlier);
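To see what these lines do, here is a small worked example. It is only a sketch: the sample vector is made up, and it assumes v is a column vector. Note that, as written, the test only flags values that are larger in magnitude than the average of their neighbors, and the two end points are never flagged because they are compared with themselves.

```matlab
v = [10; 11; 12; 50; 14; 15; 16];   % made-up column vector with one spike
threshold = 5e-2;

% Average of the two neighbors; the end points are compared with themselves
averages = [v(1); (v(3:end) + v(1:end-2)) / 2; v(end)];
is_outlier = (v.^2 - averages.^2) > threshold^2 * averages.^2;

v(is_outlier) = averages(is_outlier);   % the spike 50 becomes (12 + 14)/2 = 13
```

The neighbors of the spike (12 and 14) are not flagged, even though their neighbor averages are pulled up by the 50, because they lie below those averages.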
I am calculating the Local Ternary Pattern of an image. My code is given below. Am I going in the right direction or not?
function [ I3 ] = LTP(I2)
m=size(I2,1);
n=size(I2,2);
for i=2:m-1
for j=2:n-1
J0=I2(i,j);
I3(i-1,j-1)=I2(i-1,j-1)>J0;
end
end
I2 is the image LTP is applied to.
This isn't quite correct. Here's an example of LTP given a 3 x 3 image patch and a threshold t:
(source: hindawi.com)
A pixel in the window is assigned 0 when its intensity lies between c - t and c + t, where c is the intensity of the centre pixel. In this example the centre intensity is 34 and the range is [29,39], which corresponds to t = 5. Any values above 39 get assigned 1 and any values below 29 get assigned -1.
Once you determine the ternary codes, you split them into upper and lower patterns. For the upper pattern, any values that were assigned -1 become 0. For the lower pattern, any values that were assigned -1 become 1, and any values that were 1 in the ternary codes become 0.
The final pattern is obtained by reading the bits starting from the east location with respect to the centre (row 2, column 3), then going around counter-clockwise. Therefore, you should probably modify your function so that it outputs both the lower and the upper patterns for your image.
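The steps above can be traced on a single window. The 3 x 3 patch below is made up for illustration (it is not the patch from the figure), with a centre of 34 and t = 5 so that the range is again [29,39]; the variable names are illustrative too.

```matlab
patch = [20 34 60; 29 34 39; 40 28 35];   % hypothetical 3 x 3 window (double)
t = 5;
c = patch(2,2);                           % centre intensity, 34

% Ternary codes: 1 above c + t, -1 below c - t, 0 otherwise
codes = zeros(3);
codes(patch > c + t) = 1;
codes(patch < c - t) = -1;

% Read the neighbours starting east of the centre, going counter-clockwise
order = [8 7 4 1 2 3 6 9];                % column-major indices into the window
seq = codes(order);                       % [0 1 0 -1 0 1 -1 0]

% Upper pattern keeps the 1s; lower pattern turns the -1s into 1s
upper_code = bin2dec(char('0' + (seq == 1)));    % '01000100'
lower_code = bin2dec(char('0' + (seq == -1)));   % '00010010'
```

For this patch the upper pattern is 01000100 (68) and the lower pattern is 00010010 (18).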
Let's write the corrected version of your code. Bear in mind that I will not give an optimized version. Let's get a basic algorithm working, and it'll be up to you on how you want to optimize this. As such, change your code to something like this, bearing in mind all of the stuff I talked about above. BTW, your function is not defined properly. You can't use spaces to define your function, as well as your variables. It will interpret each word in between spaces as variables or functions, and that's not what you want. Assuming your neighbourhood size is 3 x 3 and your image is grayscale, try something like this:
function [ ltp_upper, ltp_lower ] = LTP(im, t)
%// Get the dimensions
rows=size(im,1);
cols=size(im,2);
%// Reordering vector - Essentially for getting binary strings
reorder_vector = [8 7 4 1 2 3 6 9];
%// For the upper and lower LTP patterns
ltp_upper = zeros(size(im));
ltp_lower = zeros(size(im));
%// For each pixel in our image, ignoring the borders...
for row = 2 : rows - 1
for col = 2 : cols - 1
cen = double(im(row,col)); %// Get centre - cast to double so cen - t and cen + t are not computed in integer arithmetic
%// Get neighbourhood - cast to double for better precision
pixels = double(im(row-1:row+1,col-1:col+1));
%// Get ranges and determine LTP
out_LTP = zeros(3, 3);
low = cen - t;
high = cen + t;
out_LTP(pixels < low) = -1;
out_LTP(pixels > high) = 1;
out_LTP(pixels >= low & pixels <= high) = 0;
%// Get upper and lower patterns
upper = out_LTP;
upper(upper == -1) = 0;
upper = upper(reorder_vector);
lower = out_LTP;
lower(lower == 1) = 0;
lower(lower == -1) = 1;
lower = lower(reorder_vector);
%// Convert to a binary character string, then use bin2dec
%// to get the decimal representation
upper_bitstring = char(48 + upper);
ltp_upper(row,col) = bin2dec(upper_bitstring);
lower_bitstring = char(48 + lower);
ltp_lower(row,col) = bin2dec(lower_bitstring);
end
end
Let's go through this code slowly. First, I get the dimensions of the image so I can iterate over each pixel. Also, bear in mind that I'm assuming that the image is grayscale. Once I do this, I allocate space to store the upper and lower LTP patterns per pixel in our image, as we will need to output this to the user. I have decided to ignore the border pixels: wherever the 3 x 3 window around a pixel would go out of bounds, that location is skipped.
Now, for each valid pixel that is within the valid borders of the image, we extract our pixel neighbourhood. I convert these to double precision to allow for negative differences, as well as for better precision. I then calculate the low and high ranges, then create a LTP pattern following the guidelines we talked about above.
Once I calculate the LTP pattern, I create two versions of it, upper and lower, where any values of -1 get mapped to 0 for the upper pattern and to 1 for the lower pattern. Also, for the lower pattern, any values that were 1 from the original window get mapped to 0. After this, I extract the bits in the order that I laid out - starting from the east, going counter-clockwise. That's the purpose of the reorder_vector, as this will allow us to extract those exact locations. These locations will now become a 1D vector.
This 1D vector is important, as we now need to convert this vector into a character string so that we can use bin2dec to convert the value into a decimal number. These numbers for the upper and lower LTPs are what are finally used for the output, and we place those in the corresponding positions of both output variables.
This code is untested, so it'll be up to you to debug this if it doesn't work to your specifications.
Good luck!
So my computer is not too strong.. to say the least..
Yet I want to create a median of all pixels in an entire specific movie.
I was able to do it for a sequence of frames in memory.. but I am not sure on how to do it when reading more frames each time... how do I give median weight?
(like I'll read 100 frames each time but the median has to update according to the current median * 100 * times I read + 100 * current image..)
I have this code:
mov = VideoReader('MVI_3478.MOV');
seq = read(mov, [1 frames]);
% create background
channels = size(seq, 3);
height = size(seq,1);
width = size(seq,2);
BG = zeros(height, width, channels, 'uint8');
for c = 1:channels
for y = 1:height
for x = 1:width
BG(y,x,c) = median(seq(y,x,c,:));
end
end
end
and my question is, given that I will add another loop above everything, how to give median weight?
Thanks!
It is not possible to calculate the median this way. The required information is lost.
Example:
median([1,2,3,4,5,6,7]) is 4
median([1,2,3,3,5,6,7]) is 3
median([1,2,3])=2
median([4,5,6,7])=5.5
median([3,5,6,7])=5.5
Thus, for both sequences the subsequences give the same partial results, 2 and 5.5, while the overall median is 4 in one case and 3 in the other.
The only possibility I see is some binary search approach:
smaller=0
larger=0
equal=0
el=numel(s)
while(smaller>=el/2||larger>el/2||equal==0)
guess=..
smaller=0
larger=0
equal=0
for c = 1:channels
for y = 1:height
for x = 1:width
s=seq(y,x,c,:)
smaller=smaller+numel(s(s<guess));
larger=larger+numel(s(s>guess));
equal=equal+numel(s(s==guess));
end
end
end
end
This is only a sketch, the code has to be completed. Guess has to be filled with some binary search strategy.
With a large number of frames, calculating the median progressively is a problem, since the median is a global order statistic and cannot be updated from partial results. The classical method uses the fact that we are working with 8-bit grayscale values (256 levels). For each pixel p(x,y) one maintains a histogram with 256 bins, with each bin able to count up to n values (as there are n frames).
Thus at each update we will have:
value = p(x,y,i); %for the ith frame
H(x,y,double(value)+1) = H(x,y,double(value)+1) + 1; %updating your histogram (bin 1 holds intensity 0)
and then accumulate the bin counts in order of intensity and pick the value at which the cumulative count reaches half the number of frames: https://math.stackexchange.com/questions/202302/how-to-calculate-median-and-standard-deviation-from-histogram
The size of each counter can be chosen based on the number of frames n in the video: roughly N = log2(n) bits. The median search is now simple, since it is a constant-time search within a 256-bin histogram. This also helps when combining many histograms, since the search stays constant-time regardless of the number of frames.
Thus the total size of your histograms would be 256*X*Y*N bits, where X and Y are the dimensions of your image.
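A minimal sketch of this histogram idea is shown below. It makes several assumptions: a synthetic grayscale sequence stands in for the frames read from the video, the frames are processed in chunks of 5, and the number of frames is odd so that the median is a single middle value. All names (frames, chunkSize, H, BG and so on) are illustrative.

```matlab
% Synthetic 8-bit grayscale "video" standing in for frames read from a file
rng(0);
height = 4; width = 5; nFrames = 21; chunkSize = 5;
frames = uint8(randi([0 255], height, width, nFrames));

H = zeros(height, width, 256);               % one 256-bin histogram per pixel
[cols, rows] = meshgrid(1:width, 1:height);  % pixel coordinates

% Process the frames chunk by chunk; only H has to stay in memory
for first = 1:chunkSize:nFrames
    last = min(first + chunkSize - 1, nFrames);
    chunk = frames(:, :, first:last);        % a real reader would load these here
    for k = 1:size(chunk, 3)
        bins = double(chunk(:, :, k)) + 1;   % intensity 0 goes to bin 1
        idx = sub2ind(size(H), rows, cols, bins);
        H(idx) = H(idx) + 1;                 % one count per pixel per frame
    end
end

% Median = first intensity whose cumulative count reaches the middle rank
C = cumsum(H, 3);
[~, medBin] = max(double(C >= (nFrames + 1) / 2), [], 3);
BG = uint8(medBin - 1);
```

For an even number of frames, median averages the two middle values, whereas this sketch would return the lower of the two. On a machine with little memory, H could be stored in a smaller integer type, at the cost of capping the number of frames each counter can hold.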
I'm just beginning to teach myself MATLAB, and I'm making a 501x6 array. The columns will contain probabilities for rolling a 101-sided die, and as such, the columns contain 101, 201, 301 entries, not 501. Is there a way to 'stretch the column' so that I add 0s above and below the useful data? So far I've only thought of making a column like a=[zeros(200,1);die;zeros(200,1)] so that the data only shows up in rows 201-301, and similarly b=[zeros(150,1);die2;zeros(150,1)], if I wanted 200 or 150 zeros, respectively, to precede and follow the data in order for it to fit in the array.
Thanks for any suggestions.
You can do several things:
Start with an all-zero matrix, and only modify the elements you need to be non-zero:
A = zeros(501,6);
A(someValue:someOtherValue, 5) = value;
% OR: assign the range to a vector:
A(someValue:someOtherValue, 5) = 1:20; % if someValue:someOtherValue is the same length as 1:20
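For the die example, the start row can be computed from the lengths instead of being typed in by hand. A small sketch, assuming a uniform 101-sided die and, for the second column, the distribution of the sum of two such dice (which has 201 entries); die1, die2, data and pad are illustrative names, and the padding is assumed to split evenly:

```matlab
A = zeros(501, 6);                  % start from an all-zero matrix
die1 = ones(101, 1) / 101;          % assumed: uniform 101-sided die
die2 = conv(die1, die1);            % assumed: sum of two dice, 201 entries

data = {die1, die2};
for c = 1:numel(data)
    n = numel(data{c});
    pad = (size(A, 1) - n) / 2;     % number of zeros above (and below) the data
    A(pad+1 : pad+n, c) = data{c};  % centre the data in column c
end
```

Here die1 lands in rows 201-301 of column 1 and die2 in rows 151-351 of column 2, with zeros everywhere else.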