I have to plot some Venn diagrams to show intersections between arrays of strings. Let me explain: I have a category of names that includes three others that do not intersect each other. I want to understand which of these three occupies the largest percentage of the macro-category mentioned above. I would like to do it with MATLAB but I see that this development environment is somewhat devoid of such functions. If you have any ideas in mind I would be grateful.
Thanks in advance!
Venn Diagrams by Plotting Circles
Not the most elegant way but one method of plotting a Venn diagram can be plotting circles and filling the individual circles using a text() annotation/label. Unfortunately, here I manually placed where the labels are centred. Automating the label positioning may take several additional steps which I will leave out for simplicity. To find the intersections of the respective datasets I simply used the intersect() function.
A = {'A','B','C','D','E','F','G','H','I','L'};
B = {'A','B','C','D'};
C = {'E','F','G','H','I','L'};
%Finding the intersections of the arrays%
Intersection_AB = intersect(A,B);
fprintf("Intersection AB: ")
disp(Intersection_AB);
fprintf("\n");
Intersection_BC = intersect(B,C);
fprintf("Intersection BC: ")
disp(Intersection_BC);
fprintf("\n");
Intersection_AC = intersect(A,C);
fprintf("Intersection AC: ")
disp(Intersection_AC);
fprintf("\n");
Intersection_ABC = intersect(Intersection_AB,C);
fprintf("Intersection ABC: ")
disp(Intersection_ABC);
fprintf("\n");
clc;
clf;
Plotting_Interval = 0.01;
Angles_In_Radians = (0: Plotting_Interval: 2*pi);
Circle_Plot = #(X_Offset,Y_Offset,Radius) plot(X_Offset + Radius*cos(Angles_In_Radians),Y_Offset + Radius*sin(Angles_In_Radians));
hold on
%Plotting the 3 circles%
X_Offset_A = 0; Y_Offset_A = 2; Radius_A = 3;
Circle_A = Circle_Plot(X_Offset_A,Y_Offset_A,Radius_A);
fill(Circle_A.XData, Circle_A.YData,'r','FaceAlpha',0.2,'LineWidth',1);
X_Offset_B = -2; Y_Offset_B = -2; Radius_B = 3;
Circle_B = Circle_Plot(X_Offset_B,Y_Offset_B,Radius_B);
fill(Circle_B.XData, Circle_B.YData,'g','FaceAlpha',0.2,'LineWidth',1);
X_Offset_C = 2; Y_Offset_C = -2; Radius_C = 3;
Circle_Plot(X_Offset_C,Y_Offset_C,Radius_C);
Circle_C = Circle_Plot(X_Offset_C,Y_Offset_C,Radius_C);
fill(Circle_C.XData, Circle_C.YData,'b','FaceAlpha',0.2,'LineWidth',1);
title("Venn Diagram");
%Writing all the labels%
A_Label = strjoin(string(A));
text(X_Offset_A,Y_Offset_A,A_Label,'color','r');
B_Label = strjoin(string(B));
text(X_Offset_B,Y_Offset_B,B_Label,'color','g');
C_Label = strjoin(string(C));
text(X_Offset_C,Y_Offset_C,C_Label,'color','b');
AB_Label = strjoin(string(Intersection_AB));
text(-1.2,0,AB_Label);
BC_Label = strjoin(string(Intersection_BC));
text(0,-2,BC_Label);
AC_Label = strjoin(string(Intersection_AC));
text(1.2,0,AC_Label);
ABC_Label = strjoin(string(Intersection_ABC));
text(0,0,ABC_Label);
%Setting the labels to be relative to the centres%
set(findall(gcf,'type','text'),'HorizontalAlignment','center');
axis equal
axis off
Ran using MATLAB R2019b
Related
I'm trying to plot a knn result from my data, which has 3 columns: x, y , label. There are 3 classes and for each of them I would like to used a different symbol. Here's the way I'm plotting now:
t1 = data(:,3) == 1;
t2 = data(:,3) == 2;
t3 = data(:,3) == 3;
train1 = data(t1,:);
train2 = data(t2,:);
train3 = data(t3,:);
figure(1);
plot(train1(:,1),train1(:,2),'#',train2(:,1),train2(:,2),'*',train3(:,1),train3(:,2),'o');
I want to know if there's a more concise way of doing this. Thanks
Here's the most concise (and working) way to plot your data:
figure(1);
hold all
plot(train1(:,1),train1(:,2),'o')
plot(train2(:,1),train2(:,2),'x')
plot(train3(:,1),train3(:,2),'s')
Here's an example that does what you want in a robust and modular manner. You can easily add classes or modify the figure output.
data = [0.53,0.17,2;0.78,0.60,3;0.93,0.26,1;0.13,0.65,2;0.57,0.69,1;...
0.47,0.75,3;0.010,0.45,1;0.34,0.080,3;0.16,0.23,3;0.79,0.91,3;...
0.31,0.15,1;0.53,0.83,2];
categories = [1,2,3];
symbols = {'s','<','o','d','v','+','x','*'};
figure;
hold all
for loopj = 1:length(categories)
t = data(:,3) == categories(loopj);
train = data(t,:);
label = strcat('Class ',num2str(categories(loopj)));
plot(train(:,1),train(:,2),symbols{loopj},'DisplayName',label,'LineWidth',1.3)
end
lg = legend('show');
lg.Location = 'best';
Use hold all to write on the figure without erasing the previous axi and let Matlab pick line colors.
In any case, you need to manually define the different symbols and each plot command comes with one unique type of line and markers.
I have written the 3x3 average filter. It works fine but it shows the same output image three times instead of one. How to resolve the problem?
The code is
function [filtr_image] = avgFilter(noisy_image)
[x,y] = size(noisy_image);
filtr_image = zeros(x,y);
for i = 2:x-1
for j =2:y-1
sum = 0;
for k = i-1:i+1
for l = j-1:j+1
sum = sum+noisy_image(k,l);
end
end
filtr_image(i,j) = sum/9.0;
filtr_image = uint8(filtr_image);
end
end
end
thanks in advance
What is most likely happening is the fact that you are supplying a colour image when the code is specifically meant for grayscale. The reason why you see "three" is because when you do this to allocate your output filtered image:
[x,y] = size(noisy_image)
If you have a 3D matrix, the number of columns reported by size will be y = size(noisy_image,2)*size(noisy_image,3);. As such, when you are iterating through each pixel in your image, in column major order each plane would be placed side by side each other. What you should do is either convert your image into grayscale from RGB or filter each plane separately.
Also, you have an unnecessary casting performed in the loop. Just do it once outside of the loop.
Option #1 - Filter per plane
function [filtr_image] = avgFilter(noisy_image)
[x,y,z] = size(noisy_image);
filtr_image = zeros(x,y,z,'uint8');
for a = 1 : z
for i = 2:x-1
for j =2:y-1
sum = 0;
for k = i-1:i+1
for l = j-1:j+1
sum = sum+noisy_image(k,l,a);
end
end
filtr_image(i,j,a) = sum/9.0;
end
end
end
end
Then you'd call it by:
filtr_image = avgFilter(noisy_image);
Option #2 - Convert to grayscale
filtr_image = avgFilter(rgb2gray(noisy_image));
Minor Note
You are using sum as a variable. sum is an actual function in MATLAB and you would be overshadowing this function with your variable. This will have unintended consequences if you have other functions that rely on sum later down the line.
I can't see why your code would repeat the image (unless it's a pattern cause by an integer overflow :/ ) but here are some suggestions:
if you want to use loops, at least drop the inner loops:
[x,y] = size(noisy_image);
filtr_image = zeros(x,y);
for i = 2:x-1
for j =2:y-1
% // you could do this in 1 line if you use mean2(...) instead
sub = noisy_image(i-1:i+1, j-1:j+1);
filtr_image = uint8(mean(sub(:)));
end
end
However do you know about convolution? Matlab has a built in function for this:
filter = ones(3)/9;
filtr_image = uint8(conv2(noisy_image, filter, 'same'));
Language : MATLAB
Problem Defenition:
I have a set of 2D points in space. I would like to group the points based on their euclidean distance. My data has a property that two groups are always separated by at least R units. Hence for a given point, all points that are closer than 50 units can be considered to be its neighbors. Combining points having common neighbors would result in the groups (that is the idea at least).
Proposed Method:
Use delaunay triangulation in matlab and get list of edges of the resulting triangles. Remove all edges that are greater than R units. Each group of points left are the groups I am looking for. Remaining unconnected points can be ignored.
Attempt:
I tried to implement the above in MATLAB, but I am making a mistake in grouping the left over points. I am attaching my code.
DT = delaunayTriangulation(double(frame(:,1:2)));
edgeList = edges(DT);
edgeVertex1 = frame(edgeList(:,1),:);
edgeVertex2 = frame(edgeList(:,2),:);
dVec = edgeVertex1 - edgeVertex2;
edgeLengths = sqrt(sum(abs(dVec).^2,2));
requiredEdges = edgeLengths < NEIGH_RADIUS;
edgeLengthsFiltered = edgeLengths(requiredEdges);
edgeListFiltered = edgeList(requiredEdges,:);
% Clustering
edgeOrigins = edgeListFiltered(:,1);
edgeEndings = edgeListFiltered(:,2);
nodeList = unique(edgeOrigins);
if isempty(nodeList)
Result = struct([]);
super_struct(i).result = Result;
else
groups = cell(10,1);
groups{1} = nodeList(1);
groupLength = 2;
flag = 0;
% grouping
for j = 1:1:length(nodeList);
neighbourList = [nodeList(j); edgeEndings(edgeOrigins==nodeList(j))];
% add current node as part of neighbourList
for k = 1:1:groupLength-1
te = ismembc(groups{k}, neighbourList);
if sum(te) ~=0
temp = sort([groups{k}; neighbourList]);
groups{k} = temp([true;diff(temp(:))>0]);
flag = 1;
break;
end
end
if ~flag
groups{groupLength} = neighbourList;
groupLength = groupLength + 1;
end
flag = 0;
end
largeGroups = cell(1,1);
largeGroups_c = 1;
for j = 1:1:groupLength -1;
if ~ isempty(groups{j})
for k = j+1:1:groupLength - 1
te = ismembc(groups{j}, groups{k});
if sum(te) ~= 0
temp = sort([groups{j}; groups{k}]);
groups{j} = temp([true;diff(temp(:))>0]);
groups{k} =[];
end
end
% ignore small groups
if length(groups{j}) > MIN_PTS_IN_GROUP
largeGroups{largeGroups_c} = groups{j};
largeGroups_c = largeGroups_c+1;
end
end
end
in the above code, frame is the variable that has the list of points. The constants NEIGH_RADIUS represents R from the question. The other constant MIN_PTS_IN_GROUP is user defined to select the minimum no of points needed to consider it a cluster of interest.
When I run the above code, there are still instances where a single group of points are still represented as multiple groups.
The red lines border a single group as identified by the code above. Clearly there are intersecting groups which is wrong.
Question 1
Can someone suggest a better (and correct) way of grouping?
Question 2
Any other alternate methods of obtaining the groups faster than Triangulation would also be great!
Thank you in advance
If you know the number of group you search for, you can use kmeans function in matlab statistic toolbox or you can find other implentation on matlab exchange (kmeans clustering)
I have 3 vectors: Y=rand(1000,1), X=Y-rand(1000,1) and ACTid=randi(6,1000,1).
I'd like to create boxplots by groups of Y and X corresponding to their group value 1:6 (from ACTid).
This is rather ad-hoc and looks nasty
for ii=
dummyY(ii)={Y(ACTid==ii)};
dummyX(ii)={X(ACTid==ii)}
end
Now I have the data in a cell but can't work out how to group it in a boxplot. Any thoughts?
I've found aboxplot function that looks like this but I don't want that, I'd like the builtin boxplot function because i'm converting it to matlab2tikz and this one doesn't do it well.
EDIT
Thanks to Oleg: we now have a grouped boxplot... but the labels are all skew-whiff.
xylabel = repmat({'Bleh','Blah'},1000,1); % need a legend instead, but doesn't appear possible
boxplot([Y(:,end); cfu], {repmat(ACTid,2,1), xylabel(:)} ,'factorgap',10,'color','rk')
set(gca,'xtick',1.5:3.2:50)
set(gca,'xticklabel',{'Direct care','Housekeeping','Mealtimes','Medication','Miscellaneous','Personal care'})
>> ylabel('Raw CFU counts (Y)')
How to add a legend?
I had the same problem with grouping data in a box plot. A further constraint of mine was that different groups have different amounts of data points. Based on a tutorial I found, this seems to be a nice solution I wanted to share with you:
x = [1,2,3,4,5,1,2,3,4,6];
group = [1,1,2,2,2,3,3,3,4,4];
positions = [1 1.25 2 2.25];
boxplot(x,group, 'positions', positions);
set(gca,'xtick',[mean(positions(1:2)) mean(positions(3:4)) ])
set(gca,'xticklabel',{'Direct care','Housekeeping'})
color = ['c', 'y', 'c', 'y'];
h = findobj(gca,'Tag','Box');
for j=1:length(h)
patch(get(h(j),'XData'),get(h(j),'YData'),color(j),'FaceAlpha',.5);
end
c = get(gca, 'Children');
hleg1 = legend(c(1:2), 'Feature1', 'Feature2' );
Here is a link to the tutorial.
A two-line approach (although if you want to retain two-line xlables and center those in the first line, it's gonna be hackish):
Y = rand(1000,1);
X = Y-rand(1000,1);
ACTid = randi(6,1000,1);
xylabel = repmat('xy',1000,1);
boxplot([X; Y], {repmat(ACTid,2,1), xylabel(:)} ,'factorgap',10)
The result:
EDIT
To center labels...
% Retrieve handles to text labels
h = allchild(findall(gca,'type','hggroup'));
% Delete x, y labels
throw = findobj(h,'string','x','-or','string','y');
h = setdiff(h,throw);
delete(throw);
% Center labels
mylbl = {'this','is','a','pain','in...','guess!'};
hlbl = findall(h,'type','text');
pos = cell2mat(get(hlbl,'pos'));
% New centered position for first intra-group label
newPos = num2cell([mean(reshape(pos(:,1),2,[]))' pos(1:2:end,2:end)],2);
set(hlbl(1:2:end),{'pos'},newPos,{'string'},mylbl')
% delete second intra-group label
delete(hlbl(2:2:end))
Exporting as .png will cause problems...
My question has two parts. First one is:
How can I include the background as a component in the bwconncomp function, because it's default behavior doesn't include it.
Also, and this is my other question is, how can I select the n-th largest component based on what I get by using bwconncomp.
Currently I was thinking about something like this, but that doesn't work :P
function out = getComponent(im,n)
CC = bwconncomp(im,4);
%image is an binary image here
numPixels = cellfun(#numel,CC.PixelIdxList);
sortedPixels = sort(numPixels,'descend');
w = sortedPixels(n);
[largest, index] = find(numPixels==w);
im(CC.PixelIdxList{index}) = 0;
out = im;
But that doesn't work at all. But im not too sure what the CC.PixelIdxList{index} does, is it just changing elements in the array. I also find it kinda vague what exactly PixelIdxList is.
To find the background, you can use 'not' operation on the image
'PixelIdxList' is not what you need. You need the 'Area' property.
function FindBackgroundAndLargestBlob
x = imread('peppers.png');
I = x(:,:,2);
level = graythresh(I);
bw = im2bw(I,level);
b = bwlabel(bw,8);
rp = regionprops(b,'Area','PixelIdxList');
areas = [rp.Area];
[unused,indexOfMax] = max(areas);
disp(indexOfMax);
end
Update:
You can do it with bwconncomp as well:
function FindBackgroundAndLargestBlob
x = imread('peppers.png');
I = x(:,:,2);
level = graythresh(I);
bw = im2bw(I,level);
c = bwconncomp(bw,4);
numOfPixels = cellfun(#numel,c.PixelIdxList);
[unused,indexOfMax] = max(numOfPixels);
figure;imshow(bw);
bw( c.PixelIdxList{indexOfMax} ) = 0;
figure;imshow(bw);
end
Which will give the following results: