Orange data mining boxplots - orange

How can I reduce the number of decimal places shown for the quartiles, mean, median, etc. in the box plots?
Also, how can I order the box plots? For example, I want to create box plots for very rich, rich, middle class, poor, and very poor, in this order.
Also, box plots are horizontal by default. How can I make them vertical?

There is no option to reduce the number of decimal points.
You can visualize subgroups, but you can't order them manually. The default is by variance or alphabetically.
You can't make box plots vertical.

Related

Is there a way to increase density of data of scatter data?

I have this x y data:
I would like to increase the density of points by closing small gaps. How do I go about it? I still want to preserve the structure of the points.
It depends on what you mean by 'closing gaps'.
If you mean that you want to make the data seem more grouped without actually adding more data points, then you might find the 'LineWidth' argument useful. Used correctly, it thickens the outline of each marker in the scatter plot, which makes the data look more grouped, with fewer gaps.
To use it, write the scatter call as follows:
scatter(X, Y, 'LineWidth', width_number)
Replace 'width_number' with different values and see the effect.

I have a histogram plot; how do I choose the appropriate point?

I have written code to obtain a threshold pixel value of an image for imaging particles. The resulting plot is attached below. I want to choose the point where there is a sudden jump in the value; this will be my threshold. I can pick this point manually by looking at the plot, but I want to do it automatically in code. What should I do?
I was thinking of sorting the values and finding their frequencies, then looping over them to compare each one with the previous value. What minimum difference between two consecutive values should I choose?
What other method should I use?
Here is the Image:
Firstly, what do you mean by "sudden jump"? If I understand correctly, it means that there is a big difference (decrease) in pixel counts between two adjacent gray levels. In that case you can shift the histogram vector right by one element and subtract the two vectors, which gives a vector containing the differences between adjacent gray levels. Then you can choose a threshold on those differences. That's all.
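The shift-and-subtract idea above can be sketched as follows. This is a minimal illustration in plain Python, not the asker's code: the function name `largest_drop_threshold` and the sample counts are made up, and it assumes that "sudden jump" means the single largest decrease between neighbouring gray levels. A fixed minimum difference could be used instead.

```python
def largest_drop_threshold(hist):
    """Return (gray_level, drop) for the largest decrease between neighbours.

    hist[i] is the pixel count at gray level i (illustrative input).
    """
    if len(hist) < 2:
        raise ValueError("need at least two bins")
    # diffs[i] = hist[i] - hist[i + 1]; a large positive value is a sharp decrease.
    diffs = [a - b for a, b in zip(hist, hist[1:])]
    # Index of the largest decrease; ties resolve to the lowest gray level.
    level = max(range(len(diffs)), key=diffs.__getitem__)
    return level, diffs[level]


# Made-up 8-bin histogram, purely for illustration.
counts = [5, 7, 120, 118, 115, 20, 18, 17]
level, drop = largest_drop_threshold(counts)
print(level, drop)  # 4 95
```

Here the count falls from 115 to 20 between gray levels 4 and 5, so level 4 is reported as the last level before the drop.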

Visually Comparable Plot

So I have to plot certain data (90 sets total) and a single set looks like this.
However when I hold on and plot 90 sets superimposed, it looks just like a patch of multiple colours.
Now what would be the best way to represent the plots so that we can compare them and study the differences? For example (and this is just my thought, and I am open to opinions), how can I compare these 90 plots in a matrix fashion, viz.
Are there even better ways to represent such a collection of plots instead of just superimposing them?
EDIT: To clear things up, I have 90 graphs that look similar to the first graph and I have to compare them in say, a single page. What would be the best way to do it? Also is subplot the best idea for 90 graphs?
Thanks.
You need subplot(), which allows you to place multiple axes in one figure window.
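For 90 panels the grid dimensions have to be chosen first. The following is a small sketch in plain Python (not MATLAB) of one way to pick a near-square grid. The helper names `grid_shape` and `panel_position` are hypothetical, and the row-by-row numbering mirrors how subplot(m, n, p) counts positions.

```python
import math


def grid_shape(n_plots):
    """Smallest near-square (rows, cols) grid that holds n_plots panels."""
    if n_plots < 1:
        raise ValueError("n_plots must be positive")
    cols = math.ceil(math.sqrt(n_plots))
    rows = math.ceil(n_plots / cols)
    return rows, cols


def panel_position(k, cols):
    """Zero-based (row, col) of the k-th panel; k starts at 1, filled row by row."""
    return divmod(k - 1, cols)


rows, cols = grid_shape(90)
print(rows, cols)                 # 9 10
print(panel_position(90, cols))   # (8, 9)
```

With 90 data sets this gives a 9-by-10 grid, so the k-th set would go into subplot(9, 10, k). Each panel is small at that size, so sharing axis limits across panels keeps them comparable.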
http://www.mathworks.com/help/matlab/ref/subplot.html?requestedDomain=www.mathworks.com

Tableau Control Chart - Attribute measure incorrect

All
I have a control chart, with on the X-axis a time period, and the Y-axis the value of the measure (I'd like to plot all the points in a control chart).
However, I have 2 different values as a measure, which have the exact same date (up to a second match) but different measure values.
When I plot this on a control chart, instead of having 2 points in the control chart with value 500 and 550 for example - it gives me one point with a value of about 200.
It also gives a notification that there is a NULL value in this axis, which points to the X-axis where 2 records have the exact same date.
Any idea what I can do to correct this, or to make Tableau draw the measure points correctly?
Thanks in advance!
It's difficult to answer without seeing more detail about your problem, but this sounds like a good candidate for a blended axis (multiple measures sharing a single axis).
The easiest way to do this is to put your (probably continuous) datetime field on the Columns shelf and one of your measures on the Rows shelf to see one of the control plots. Then drag the second measure onto the Y-axis until you see a little translucent two-bar icon, which indicates that you are adding a second measure to that axis. Release the pointer at that point and you should see two plots on the same axis.
If the scales for the two measures are radically different, you can instead drag the second measure to the right side of the view to get a dual axis.

How to neatly cut off an extreme value in a plot that compresses the rest of a plot?

So basically, the graph labeled "Thermal Wind" has an extreme value that compresses the y-values for all the other plots, making it much harder to see any of the individual variations in the other plots. Is there a way to neatly cut off this extreme value? I could just rescale the y limit to a maximum of 40, but then this looks ugly.
As for the alternative I've tried - it's here:
I would recommend trying to plot it on a log scale. Since the extreme value is on the y-axis, the function you'll want to consider using is semilogy (semilogx puts the log scale on the x-axis instead), though for completeness I recommend also reading the help file on loglog.
Alternately, you could use subplot to generate multiple plots, one of which is zoomed into a region of interest.
Are the outlier points errors in the data, or do they represent extreme cases?
If they are not valid data, just manually exclude them from the data, plot the graph, and include a text clarification when describing the graph. If they are valid data, then trimming them would misrepresent the data, which isn't a good thing.
Graphs of data aren't art: their main goal isn't to be pretty; it's to provide a useful visualization of data. There are some minimum requirements on appearance, however: the axes have to be labeled, the units have to be meaningful, the different curves have to be visually distinct, etc. As long as your graph has these things, you shouldn't expect to lose marks for presentation.
There are two approaches that I use:
One approach is to transform the data so that it fills the plot nicely. Design the transform so that it leaves the range of interest, say -10 to +10, untouched. In your case you could choose it so that 100 transforms to +15 and -100 to -15.
For clarity you then also need to set and label the y ticks appropriately. And for nice style, make sure the line changes slope where it crosses the border.
The other approach is to plot the data as is, but set the axis limits to, say, -10 to +10. Where points lie outside, I place upward and downward triangles along the border to mark the direction in which the "outliers" would be. Obviously this is only good when there aren't too many.
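Both approaches can be sketched in a few lines. This is a rough illustration in plain Python (not MATLAB) that only prepares the values to plot. The function names `squash` and `clip_with_markers` are hypothetical, and the piecewise-linear shape of the transform is an assumption, since the answer only fixes the untouched range and the two end points.

```python
def squash(y, inner=10.0, outer=100.0, edge=15.0):
    """Piecewise-linear transform: identity on [-inner, inner]; beyond that
    the slope shrinks so that +/-outer lands on +/-edge."""
    if abs(y) <= inner:
        return y
    sign = 1.0 if y > 0 else -1.0
    # Map the excess over `inner` linearly from [inner, outer] onto [inner, edge].
    return sign * (inner + (abs(y) - inner) * (edge - inner) / (outer - inner))


def clip_with_markers(ys, lo=-10.0, hi=10.0):
    """Clip values to [lo, hi] and record the direction of each outlier.

    Returns (clipped, markers); markers holds '^' for values above hi,
    'v' for values below lo, and None for values inside the limits.
    """
    clipped, markers = [], []
    for y in ys:
        if y > hi:
            clipped.append(hi)
            markers.append('^')
        elif y < lo:
            clipped.append(lo)
            markers.append('v')
        else:
            clipped.append(y)
            markers.append(None)
    return clipped, markers


# Made-up sample values, purely for illustration.
data = [3.0, -7.5, 100.0, -100.0]
print([squash(y) for y in data])   # [3.0, -7.5, 15.0, -15.0]
print(clip_with_markers(data))
```

With the first approach, a y tick placed at squash(100) = 15 should be labeled "100" so that readers see the original value. With the second, the '^' and 'v' entries indicate where to draw the triangles along the axis border.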