I have to plot some Venn diagrams to show intersections between arrays of strings. Let me explain: I have a category of names that includes three others that do not intersect each other. I want to understand which of these three occupies the largest percentage of the macro-category mentioned above. I would like to do it with MATLAB but I see that this development environment is somewhat devoid of such functions. If you have any ideas in mind I would be grateful.
Thanks in advance!
Venn Diagrams by Plotting Circles
Not the most elegant way, but one method of plotting a Venn diagram is to draw circles, fill them, and label each region with a text() annotation. Unfortunately, the label centres here are placed manually; automating the label positioning would take several additional steps, which I will leave out for simplicity. To find the intersections of the respective datasets I simply used the intersect() function.
A = {'A','B','C','D','E','F','G','H','I','L'};
B = {'A','B','C','D'};
C = {'E','F','G','H','I','L'};
%Finding the intersections of the arrays%
Intersection_AB = intersect(A,B);
fprintf("Intersection AB: ")
disp(Intersection_AB);
fprintf("\n");
Intersection_BC = intersect(B,C);
fprintf("Intersection BC: ")
disp(Intersection_BC);
fprintf("\n");
Intersection_AC = intersect(A,C);
fprintf("Intersection AC: ")
disp(Intersection_AC);
fprintf("\n");
Intersection_ABC = intersect(Intersection_AB,C);
fprintf("Intersection ABC: ")
disp(Intersection_ABC);
fprintf("\n");
clc;
clf;
Plotting_Interval = 0.01;
Angles_In_Radians = (0: Plotting_Interval: 2*pi);
Circle_Plot = @(X_Offset,Y_Offset,Radius) plot(X_Offset + Radius*cos(Angles_In_Radians),Y_Offset + Radius*sin(Angles_In_Radians));
hold on
%Plotting the 3 circles%
X_Offset_A = 0; Y_Offset_A = 2; Radius_A = 3;
Circle_A = Circle_Plot(X_Offset_A,Y_Offset_A,Radius_A);
fill(Circle_A.XData, Circle_A.YData,'r','FaceAlpha',0.2,'LineWidth',1);
X_Offset_B = -2; Y_Offset_B = -2; Radius_B = 3;
Circle_B = Circle_Plot(X_Offset_B,Y_Offset_B,Radius_B);
fill(Circle_B.XData, Circle_B.YData,'g','FaceAlpha',0.2,'LineWidth',1);
X_Offset_C = 2; Y_Offset_C = -2; Radius_C = 3;
Circle_C = Circle_Plot(X_Offset_C,Y_Offset_C,Radius_C);
fill(Circle_C.XData, Circle_C.YData,'b','FaceAlpha',0.2,'LineWidth',1);
title("Venn Diagram");
%Writing all the labels%
A_Label = strjoin(string(A));
text(X_Offset_A,Y_Offset_A,A_Label,'color','r');
B_Label = strjoin(string(B));
text(X_Offset_B,Y_Offset_B,B_Label,'color','g');
C_Label = strjoin(string(C));
text(X_Offset_C,Y_Offset_C,C_Label,'color','b');
AB_Label = strjoin(string(Intersection_AB));
text(-1.2,0,AB_Label);
BC_Label = strjoin(string(Intersection_BC));
text(0,-2,BC_Label);
AC_Label = strjoin(string(Intersection_AC));
text(1.2,0,AC_Label);
ABC_Label = strjoin(string(Intersection_ABC));
text(0,0,ABC_Label);
%Setting the labels to be relative to the centres%
set(findall(gcf,'type','text'),'HorizontalAlignment','center');
axis equal
axis off
Ran using MATLAB R2019b
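Since the question also asks which sub-category occupies the largest percentage of the macro-category, here is a minimal sketch of that calculation. It assumes A is the macro-category and B and C are the sub-categories, as in the example data above; the variable names Share_B and Share_C are made up for illustration.

```matlab
% Example data (same as above): A is the macro-category, B and C are subsets
A = {'A','B','C','D','E','F','G','H','I','L'};
B = {'A','B','C','D'};
C = {'E','F','G','H','I','L'};

% Number of distinct names in the macro-category
Total_A = numel(unique(A));

% Share of the macro-category covered by each subset, in percent.
% intersect() returns each common name only once.
Share_B = 100 * numel(intersect(A,B)) / Total_A;
Share_C = 100 * numel(intersect(A,C)) / Total_A;

fprintf("B covers %.1f%% of A\n", Share_B);
fprintf("C covers %.1f%% of A\n", Share_C);
```

These percentages could then be appended to the text() labels, or used to scale the circle radii if the areas should reflect the set sizes.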
The Classification Learner GUI provides the option to export the code, which looks like this:
function [trainedClassifier, validationAccuracy] = trainClassifier(datasetTable)
% Convert input to table
datasetTable = table(datasetTable);
datasetTable.Properties.VariableNames = {'column'};
% Split matrices in the input table into vectors
datasetTable.column_1 = datasetTable.column(:,1);
datasetTable.column_2 = datasetTable.column(:,2);
datasetTable.column_3 = datasetTable.column(:,3);
datasetTable.column_4 = datasetTable.column(:,4);
datasetTable.column_5 = datasetTable.column(:,5);
datasetTable.column_6 = datasetTable.column(:,6);
datasetTable.column_7 = datasetTable.column(:,7);
datasetTable.column_8 = datasetTable.column(:,8);
datasetTable.column_9 = datasetTable.column(:,9);
datasetTable.column_10 = datasetTable.column(:,10);
datasetTable.column_11 = datasetTable.column(:,11);
datasetTable.column_12 = datasetTable.column(:,12);
datasetTable.column_13 = datasetTable.column(:,13);
datasetTable.column_14 = datasetTable.column(:,14);
datasetTable.column_15 = datasetTable.column(:,15);
datasetTable.column_16 = datasetTable.column(:,16);
datasetTable.column_17 = datasetTable.column(:,17);
datasetTable.column_18 = datasetTable.column(:,18);
datasetTable.column_19 = datasetTable.column(:,19);
datasetTable.column = [];
% Extract predictors and response
predictorNames = {'column_1', 'column_2', 'column_3', 'column_4', 'column_5', 'column_6', 'column_7', 'column_8', 'column_9', 'column_10', 'column_11', 'column_12', 'column_13', 'column_14', 'column_15', 'column_16', 'column_17', 'column_18'};
predictors = datasetTable(:,predictorNames);
predictors = table2array(varfun(@double, predictors));
response = datasetTable.column_19;
% Train a classifier
trainedClassifier = fitctree(predictors, response, 'PredictorNames', {'column_1' 'column_2' 'column_3' 'column_4' 'column_5' 'column_6' 'column_7' 'column_8' 'column_9' 'column_10' 'column_11' 'column_12' 'column_13' 'column_14' 'column_15' 'column_16' 'column_17' 'column_18'}, 'ResponseName', 'column_19', 'ClassNames', [0 1], 'SplitCriterion', 'gdi', 'MaxNumSplits', 20, 'Surrogate', 'off');
% Perform cross-validation
partitionedModel = crossval(trainedClassifier, 'KFold', 5);
% Compute validation accuracy
validationAccuracy = 1 - kfoldLoss(partitionedModel, 'LossFun', 'ClassifError');
%% Uncomment this section to compute validation predictions and scores:
% % Compute validation predictions and scores
% [validationPredictions, validationScores] = kfoldPredict(partitionedModel);
Now, I'd like to pass to trainClassifier a datasetTable of varying size and call it like this:
trainClassifier(datasetTable,tablesize)
So, there's gonna be a for-loop to fill datasetTable.column_i and predictorNames.
I'm not experienced in using tables, so I haven't managed to write something syntactically correct.
I think the strings in predictorNames can be created using
eval(sprintf('column_%d ', i));
So, what can you suggest about the variables of datasetTable?
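One possible direction, sketched under assumptions rather than taken from the exported code: if the input is a numeric matrix whose last column is the response, sprintf alone can build the names (eval is not needed), and array2table can create one table variable per column in a single call. The names dataMatrix, numColumns and columnNames are hypothetical, and the random matrix merely stands in for real data.

```matlab
% Hypothetical input: numeric matrix, last column holds the response (0/1)
dataMatrix = [rand(10,4), randi([0 1],10,1)];
numColumns = size(dataMatrix, 2);

% Build {'column_1', ..., 'column_N'} with sprintf only, no eval
columnNames = arrayfun(@(i) sprintf('column_%d', i), 1:numColumns, ...
    'UniformOutput', false);

% array2table splits the matrix into one table variable per column
datasetTable = array2table(dataMatrix, 'VariableNames', columnNames);

% All columns but the last are predictors, the last one is the response
predictorNames = columnNames(1:end-1);
predictors = table2array(datasetTable(:, predictorNames));
response = datasetTable.(columnNames{end});
```

With this in place, predictorNames and columnNames{end} could be passed to fitctree as the 'PredictorNames' and 'ResponseName' values instead of the hard-coded lists, so no separate tablesize argument would be required.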
I am trying to make a plot whose x-axis shows the dates in my data frame.
However, the axis keeps reading 0, 1, 2, 3, 4, 5, ..., 20 instead of April-01, April-02, ..., April-29.
data.setup<-function(data,loc='yahoo',start.date=Sys.Date()-months(1),
end.date=Sys.Date()) {
getSymbols(data,src=loc)
x<-as.data.frame(window(get(data),
start=as.character(start.date),
end=as.character(end.date)))
x$dates<-row.names(x)
colnames(x)<-c('Open','High','Low','Close','Volume','Adjusted','Dates')
x<-x[c(7,1,2,3,4,5,6)]
return(x)
}
data<-data.setup('AAPL',start.date=Sys.Date()-months(1))
h1<-Highcharts$new()
h1$chart(type='line')
h1$xAxis(category=data$Dates,id='dates')
h1$series(data=data$Low,name='Low',xAxis='dates')
The argument is categories, not category.
# data
df <- data.frame(x = 1:10, y = rnorm(10), s = rnorm(10), z = letters[1:10])
# create plot object
p <- hPlot(y ~ x, data = df, size = "s", type = "line")
# set axis
p$xAxis(categories = as.character(seq(Sys.Date(), by = 1, length.out = 10)))
# show
p
I have 9 years of data, each has the same variables saved e.g. summer, warmest months, length of sunlit months etc. I want to plot each year's variables as a set of horizontal lines, grouped together, with different line properties. So years will be on the y axis and months on the x axis.
Sorry this is a little vague, I'm not sure how else to describe it.
The following code might get you close to what you are looking for, if not exactly there:
%% PLOT YEARLY DATA ON TOP OF EACH OTHER
%% NOTE: Tinker with the text and plot properties to put Y labels and the custom-made legend at custom positions on the plot
%% Data - Insert your data in this way
years = 2001:2009;
months_string = {'Jan'; 'Feb'; 'Mar';'Apr'; 'May'; 'Jun';'Jul'; 'Aug'; 'Sep';'Oct'; 'Nov'; 'Dec'};
num_years = numel(years);
num_months = numel(months_string);
for count = 1:num_years
data.year(count).summer = 12.*rand(num_months,1);
data.year(count).warmest_months = 48*rand(num_months,1);
data.year(count).len_sunlit = 23*rand(num_months,1);
end
%% Params
offset_factor = 0.5;
x_legend_offset_factor = 0.75;
extension_top_legend = 0.2;
ylabel_pos = -1.0;
%% Add some useful info to the struct, to be used later on
for count = 1:num_years
data.year(count).minval = min([ min(data.year(count).summer) min(data.year(count).warmest_months) min(data.year(count).len_sunlit)]);
data.year(count).maxval = max([ max(data.year(count).summer) max(data.year(count).warmest_months) max(data.year(count).len_sunlit)]);
data.year(count).range1 = data.year(count).maxval - data.year(count).minval;
end
%% Global Offset
max_range = max([data.year.range1]); % collects range1 across the struct array; extractfield would need the Mapping Toolbox
global_offset = offset_factor*max_range;
off1 = zeros(num_years,1);
for count = 2:num_years
off1(count) = data.year(count-1).maxval + global_offset;
end
off1 = cumsum(off1);
%% Plot
figure,hold on,grid on,set(gca, 'YTick', []);
xlabel('Months');
for count = 1:num_years
plot(1:num_months,off1(count)+data.year(count).summer,'b')
plot(1:num_months,off1(count)+data.year(count).warmest_months,'r')
plot(1:num_months,off1(count)+data.year(count).len_sunlit,'g')
text(ylabel_pos,(data.year(count).minval+data.year(count).maxval)/2+off1(count),['Year - ',num2str(years(count))])
end
% Find the Y limits and extend the plot at the top to accommodate the custom legend
ylimit = off1(num_years) + data.year(num_years).maxval;
ylim_1 = data.year(1).minval;
ylim_2 = ylimit+(ylimit - data.year(1).minval)*extension_top_legend;
ylim([ylim_1 ylim_2])
xlimits = xlim;
x_legend_offset = (xlimits(2) - xlimits(1))*x_legend_offset_factor;
% Adding text to resemble legends
txstr(1) = {'\color{blue} Summer'};
txstr(2) = {'\color{red} Warmest Months'};
txstr(3) = {'\color{green} Length of sunlit months'};
text(x_legend_offset,ylim_2,txstr,'HorizontalAlignment','center','EdgeColor','red','LineWidth',3)
set(gca, 'XTickLabel',months_string, 'XTick',1:numel(months_string))
For random data, the plot might look like this:
Let us know if the above code works for you!
I have 3 vectors: Y=rand(1000,1), X=Y-rand(1000,1) and ACTid=randi(6,1000,1).
I'd like to create boxplots by groups of Y and X corresponding to their group value 1:6 (from ACTid).
This is rather ad-hoc and looks nasty
for ii=1:6
dummyY(ii)={Y(ACTid==ii)};
dummyX(ii)={X(ACTid==ii)};
end
Now I have the data in a cell but can't work out how to group it in a boxplot. Any thoughts?
I've found the aboxplot function, which produces something like this, but I don't want that; I'd like to use the built-in boxplot function because I'm converting the figure with matlab2tikz, and the aboxplot output doesn't convert well.
EDIT
Thanks to Oleg: we now have a grouped boxplot... but the labels are all skew-whiff.
xylabel = repmat({'Bleh','Blah'},1000,1); % need a legend instead, but doesn't appear possible
boxplot([Y(:,end); cfu], {repmat(ACTid,2,1), xylabel(:)} ,'factorgap',10,'color','rk')
set(gca,'xtick',1.5:3.2:50)
set(gca,'xticklabel',{'Direct care','Housekeeping','Mealtimes','Medication','Miscellaneous','Personal care'})
ylabel('Raw CFU counts (Y)')
How to add a legend?
I had the same problem with grouping data in a box plot. A further constraint of mine was that different groups have different amounts of data points. Based on a tutorial I found, this seems to be a nice solution I wanted to share with you:
x = [1,2,3,4,5,1,2,3,4,6];
group = [1,1,2,2,2,3,3,3,4,4];
positions = [1 1.25 2 2.25];
boxplot(x,group, 'positions', positions);
set(gca,'xtick',[mean(positions(1:2)) mean(positions(3:4)) ])
set(gca,'xticklabel',{'Direct care','Housekeeping'})
color = ['c', 'y', 'c', 'y'];
h = findobj(gca,'Tag','Box');
for j=1:length(h)
patch(get(h(j),'XData'),get(h(j),'YData'),color(j),'FaceAlpha',.5);
end
c = get(gca, 'Children');
hleg1 = legend(c(1:2), 'Feature1', 'Feature2' );
Here is a link to the tutorial.
A two-line approach (although if you want to retain two-line xlabels and center those in the first line, it's going to be hackish):
Y = rand(1000,1);
X = Y-rand(1000,1);
ACTid = randi(6,1000,1);
xylabel = repmat('xy',1000,1);
boxplot([X; Y], {repmat(ACTid,2,1), xylabel(:)} ,'factorgap',10)
The result:
EDIT
To center labels...
% Retrieve handles to text labels
h = allchild(findall(gca,'type','hggroup'));
% Delete x, y labels
throw = findobj(h,'string','x','-or','string','y');
h = setdiff(h,throw);
delete(throw);
% Center labels
mylbl = {'this','is','a','pain','in...','guess!'};
hlbl = findall(h,'type','text');
pos = cell2mat(get(hlbl,'pos'));
% New centered position for first intra-group label
newPos = num2cell([mean(reshape(pos(:,1),2,[]))' pos(1:2:end,2:end)],2);
set(hlbl(1:2:end),{'pos'},newPos,{'string'},mylbl')
% delete second intra-group label
delete(hlbl(2:2:end))
Exporting as .png will cause problems...