Plotting horizontal lines across histogram bars - charts

I have an experiment where I measured the performance of some algorithms relative to three baselines. I'd therefore like to plot histograms for the algorithms, with horizontal lines of various styles drawn through the histogram bars to show the baselines.
Below is an example which I produced by manually drawing horizontal lines on a graph produced by Gnuplot. The histograms "sentence" and "document" represent the algorithms I tested, and "mono", "random", and "MFS" are the baselines.
Is there some way I can do this within Gnuplot itself? If not, can anyone recommend another tool which can do this? Or perhaps there's a better visualization technique I should be using instead?

This is definitely possible. Here's a little example that I cooked up:
First, the datafile "data.dat" (the two data sets must be separated by two blank lines so that index can select them by name):
#histograms
1 3 stack1
2 2 stack2
3 1 stack3


#mono
.6
.6
1.5
1.5
3.1
3.1
Now the gnuplot script to plot it:
set yrange [0:*]
set style data histograms
set style histogram cluster gap 1
IDX=-1
xpos(x)=(IDX=IDX+1, IDX%2==0)?(IDX/2-.5):(IDX/2+.5)
set style fill solid
plot 'data.dat' index "histograms" u 1:xtic(3) title "column1", \
'' index "histograms" u 2 title "column2", \
'' index "mono" u (xpos($1)):1 w lines ls -1 title "mono"
This is a little trickier than my last version. When plotting a cluster of histograms, each cluster is centered on an integer, starting at 0 and incrementing by 1 for each cluster (regardless of your settings for xtics and labels). I've used that fact to simplify the datafile. The plot command plots two data sets as histograms (one from each column in the "histograms" portion of the datafile); the first one also adds the xtic labels. Then the tricky part: I wrote a function which has side effects (I think gnuplot inline functions are new in gnuplot 4.4). Each time it is called, the variable IDX is incremented, so the current cluster position on the x axis is always IDX/2 (integer division). The function alternates between returning IDX/2-.5 and IDX/2+.5, i.e. the left and right edges of the current cluster. Note that to plot another baseline such as "random", you'll need another function, say xpos2, which is the same as xpos except that it uses a separate counter variable.
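To see which x positions the side-effect function produces, here is a minimal Python sketch that mimics successive calls to xpos. The name xpos_sequence is hypothetical and only illustrates the arithmetic; it assumes gnuplot's integer division for IDX/2.

```python
def xpos_sequence(n_points):
    """Mimic n_points successive calls to the gnuplot function xpos().

    IDX is incremented before each evaluation. Even IDX values map to the
    left edge of the current cluster (IDX/2 - 0.5), odd values to the right
    edge (IDX/2 + 0.5), where IDX/2 is integer division.
    """
    positions = []
    idx = -1
    for _ in range(n_points):
        idx += 1
        centre = idx // 2  # index of the cluster this point belongs to
        positions.append(centre - 0.5 if idx % 2 == 0 else centre + 0.5)
    return positions


positions = xpos_sequence(6)
```

Paired with the six y values in the "mono" block, these positions give one horizontal segment per cluster, each spanning the full width of that cluster.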

Related

Line increment in gnuplot

I have large files (~5 GB) with a constant increment on the x-axis, say dt.
I would like to know if I could set the every command of Gnuplot as logarithmic increment not linear.
plot "fileA.txt" u 1:2 every dt #linear increment of dt
This is because, if the x-axis is in log scale, I want more points at low values of x, in (10^-4,10^-2), without oversampling the (10^2,10^4) range. In other words, I want some kind of variable increment.
Do I have to use an external program like sed to rewrite my file first?
A test plot is included, as well as the data file. The full data is in blue, and the data selected with the every command is in red. As you can see, one loses information at small x and oversamples the plot at large x.
Many thanks.
You could plot smoothed data with points:
set key left
set logscale x
set yrange [3.9:4.8]
set samples 30
set terminal png
set output "log.png"
plot "fort.11" title "raw" with points lc 3 pointtype 5 pointsize 2,\
"" title "smooth" smooth csplines with points lc 1 pointtype 5 pointsize 1
set samples 30 tells gnuplot to use 30 points equidistant in x
smooth csplines interpolates the datapoints
with points plots with points instead of lines, which would be the default
Note that this does not plot the original data, and that smooth csplines introduces new points if the original datapoints are too far apart. This might or might not be what you want.
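If you would rather keep the original data points and thin them before plotting, a small external filter is one option. The following is a minimal Python sketch, not a gnuplot feature: the names log_thin and thin_file and the per_decade parameter are made up for this example. It assumes one data point per line, no comment lines, and a constant x increment, so that the row number is proportional to x.

```python
def log_thin(lines, per_decade=10):
    """Yield the lines whose 1-based row number is closest to 10**(k/per_decade).

    Works on any iterable, so a multi-gigabyte file can be streamed line by
    line without loading it into memory.
    """
    k = 0
    target = 1
    for row, line in enumerate(lines, start=1):
        if row == target:
            yield line
            # Advance to the next target row that is strictly larger.
            while target <= row:
                k += 1
                target = int(round(10 ** (k / per_decade)))


def thin_file(src, dst, per_decade=10):
    """Write the log-thinned rows of src to dst (both are file paths)."""
    with open(src) as fin, open(dst, "w") as fout:
        fout.writelines(log_thin(fin, per_decade))
```

The thinned file can then be plotted without the every option; a larger per_decade keeps more points in each decade of x.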

How to make a heat map on top of worldmap using hist3 in MATLAB?

My x-axis is latitudes, y-axis is longitudes, and z-axis is the hist3 of the two. It is given by: z=hist3(location(:,1:2),[180,360]), where location(:,1) is the latitude column, and location(:,2) is the longitude column.
What I now want is, instead of plotting on a self-created XY plane, I want to plot the same on a worldmap. And instead of representing the frequency of each latitude-longitude pair with the height of the bars of hist3, I want to represent the frequency of each location by a heat map on top of the world map, corresponding to each latitude-longitude pair's frequency on the dataset. I have been searching a lot for this, but have not found much help. How to do this? I could only plot the skeleton of the worldmap like this:
worldmap world
load geoid
geoshow(geoid, geoidrefvec, 'DisplayType', 'texturemap');
load coast
geoshow(lat, long)
I don't know what the colour is being produced based on.
Additionally, if possible, I would also like to know how to plot the hist3 on a 3D map of the world (or globe), where each bar of the hist3 would correspond to the frequency of each location (i.e., each latitude-longitude pair). Thank you.
The hist3 documentation says:
Color the bars based on the frequency of the observations, i.e. according to the height of the bars.
set(get(gca,'child'),'FaceColor','interp','CDataMode','auto');
If that's not what you need, you might want to try colormap (see its documentation for more information). I haven't tried using colormap on histograms directly, so if colormap doesn't help, you can try manually creating a new matrix that holds color values instead of the Z values the histogram originally had.
To do that, you need to first calculate the maximum Z value with:
maxZ=max(Z(:));
Then you need to decide how much the colors should overlap. For example, if you use RGB and assign blue to the lowest values of the histogram, green to the middle and red to the highest, and green starts right after blue with no overlap, then it will look artificial. So suppose you decide on an overlap of 10 values. Counting 255 levels for each of the R, G and B components (8 bits), with 10 levels of each component overlapping the previous one, you get 255 values (from blue) + 245 values (from green, which is 255 - 10 since 10 of the green levels overlap with blue) + 245 (from red, for the same reason), which is a total of 745 values that you can assign to the new colored histogram.
If 745 > maxZ, it makes no sense to map the new Z with more than maxZ values. In that case you can calculate the number of overlapping values like this:
if 745 > maxZ
overlap=floor(255- (maxZ-255)/2)
end
At this point you have either 10 overlapping values (or more, if you still think it doesn't look good) when the maximum of Z is at least as large as the total number of values you are trying to assign to the new Z, or overlap overlapping values (as computed above) when the maximum of Z is smaller.
When you have these two numbers (i.e. 745 and maxZ), you can write the following code to create newZ.
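The arithmetic above can be checked with a short Python sketch. The names LEVELS, total_colors and overlap_for are hypothetical; the sketch counts 255 levels per channel, as the text does.

```python
import math

LEVELS = 255  # levels counted per color channel, as in the text above


def total_colors(overlap):
    """Number of distinct ramp steps when green and red each overlap the
    previous channel by `overlap` levels."""
    return LEVELS + 2 * (LEVELS - overlap)


def overlap_for(max_z):
    """Overlap that makes the ramp exactly max_z steps long (rounded down),
    i.e. the formula overlap = floor(255 - (maxZ - 255) / 2)."""
    return math.floor(LEVELS - (max_z - LEVELS) / 2)
```

For example, an overlap of 10 gives 255 + 245 + 245 = 745 steps, and asking for a ramp of 745 steps gives back an overlap of 10.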
First you need to specify that newZ is of the same size as Z. You can achieve that by creating a zero matrix with the same size as Z, but having in mind that in order to be in color, it has to have an additional dimension, which will specify the three color components (if you are working with RGB).
This can be achieved in the following manner:
newZ=zeros([size(Z),3]);
The number 3 is here, as I said, so you would be able to give color to the new histogram.
Now you need to calculate the step (this is needed only if maxZ > The number of colors you wish to assign). The step can be calculated as:
stepZ=maxZ/Total_Number_of_Colors
If maxZ is, for example, 2000 and Total_Number_of_Colors is 745 (with 10 overlapping colors), then stepZ=2.6845637583892617449664429530201. You will also need a counter so that you know which color to assign to the new matrix. You can initialize it here:
count=0;
Now, finally the assignment is as follows:
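% Build the three color channels separately, then stack them into newZ.
% Thresholds assume an overlap of 10: blue covers steps 1..255,
% green 246..500 and red 491..745.
R = zeros(size(Z)); G = R; B = R;
for i = 1:stepZ:maxZ
    count = count + 1;
    mask = (Z >= i) & (Z < i + stepZ);   % histogram bins that fall in this color step
    if count <= 245                      % blue only
        B(mask) = count;
    elseif count < 256                   % blue/green overlap
        B(mask) = count;
        G(mask) = count - 245;
    elseif count <= 490                  % green only
        G(mask) = count - 245;
    elseif count < 501                   % green/red overlap
        G(mask) = count - 245;
        R(mask) = count - 490;
    else                                 % red only
        R(mask) = count - 490;
    end
end
newZ = uint8(cat(3, R, G, B));           % M-by-N-by-3 RGB image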
At this point you have colored your histogram. Note that you can manually color it with colors other than red, green and blue (even if you are working in RGB), but that would be a bit harder. So if you don't like the colors, you can experiment with the last bit of code (the for loop), or search online for some other automatic way to color your newZ matrix.
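As a cross-check of the color ramp, here is a minimal Python sketch that maps a step index to an RGB triple. The function ramp_color is hypothetical, and the channel boundaries are derived from the overlap so that the ramp is exactly 745 steps long for an overlap of 10. That is an assumption about the intended scheme, not tested MATLAB behavior.

```python
def ramp_color(count, overlap=10, levels=255):
    """Map a 1-based step index to an (R, G, B) triple.

    Blue rises over steps 1..levels; green starts `overlap` steps before
    blue ends, and red starts `overlap` steps before green ends. A channel
    is 0 outside its own range.
    """
    shift = levels - overlap  # offset between the starts of successive channels

    def channel(start):
        value = count - start
        return value if 1 <= value <= levels else 0

    blue = channel(0)
    green = channel(shift)
    red = channel(2 * shift)
    return (red, green, blue)
```

With the defaults, steps 246 to 255 mix blue and green, steps 491 to 500 mix green and red, and step 745 is pure red at full intensity.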
Now, how do you want to superimpose this matrix (histogram) on your map? Do you want only the black lines to be shown over the colored histogram? If so, this can be achieved by resampling the newZ matrix (the colored histogram) to the same resolution as the map. For example, if the map is of size MxN, then the histogram needs to be adjusted to that size. If their sizes are already the same, you can continue directly to the next part.
Your job is to find all pixels that are black in the map. Since the map is not binary (black and white), this will be a bit harder, but still achievable. You need to find a satisfactory threshold for the three components. All the pixels under this threshold should belong to the black lines shown on the map. You can test these values by displaying the map with imshow(worldmap) and inspecting the values of the black lines you wish to preserve (borders and land edges, for example), by pointing the cross tool from the figure's toolbar at every pixel of interest.
You don't need to test all the black lines that you wish to preserve. You just need a rough idea of what the threshold should be. Then you continue with the rest of the code, and if you don't like the result, you adjust the threshold by trial and error. Once you have settled on a threshold of, for example, (40, 30, 60) for the RGB values of the map that you wish to preserve (keep in mind that only values between (0,0,0) and (40,30,60) will be kept this way; all others will be erased), you can add the black lines with the following few commands:
for i = 1:size(worldmap,1)
    for j = 1:size(worldmap,2)
        % copy the dark (border) pixels of the map onto the colored histogram
        if worldmap(i,j,1)<40 && worldmap(i,j,2)<30 && worldmap(i,j,3)<60
            newZ(i,j,:)=worldmap(i,j,:);
        end
    end
end
I should note that I haven't tested this code, since I don't have Matlab at hand at the moment, so it may contain a few errors, but those should be easy to debug.
Hope this is what you need,
Cheers!

Matlab scatter using different color gradients per group

I didn't find a way to plot scattered data (Lon X Lat X variable) classified in groups (>4), where the value of my variable in each group goes from 0.5 to 1. So far I did it in plain colors, no variation (color gradient) per group. I applied a FOR loop, one step per group, changing colors each step.
Thanks in advance!
The simple way around the one-colormap-per-figure limit would be to offset the data, so e.g. group 1 goes from 0.5 to 1, group 2 goes from 1.5 to 2, group 3 from 2.5 to 3 etc. then create a colormap that is the concatenation of all the gradients. That way each group 'indexes' into the correct region of the colormap and achieves the desired effect.
The alternative would be to bypass the indexed colormap and pre-generate a matrix with specific RGB values for each point, then pass that to scatter(). For total control there's the option of traversing the handles to get to the underlying patch objects and setting CData directly, but I'd try to get it done with an easier approach before going that far.
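The offset idea from the first suggestion can be sketched in a few lines of Python. The names offset_value and colormap_row and the parameter rows_per_group are hypothetical; the sketch assumes 0-based group numbers and values in [0.5, 1], as in the question.

```python
def offset_value(group, value):
    """Shift a value so each group occupies its own unit interval:
    group 0 -> 0.5..1, group 1 -> 1.5..2, group 2 -> 2.5..3, ..."""
    return group + value


def colormap_row(group, value, rows_per_group=64, lo=0.5, hi=1.0):
    """Row of the concatenated colormap that a point should index into.

    The concatenated map stacks one gradient of rows_per_group rows per
    group, so group g owns rows g*rows_per_group .. (g+1)*rows_per_group - 1.
    """
    if not lo <= value <= hi:
        raise ValueError("value outside the expected range")
    fraction = (value - lo) / (hi - lo)
    return group * rows_per_group + int(round(fraction * (rows_per_group - 1)))
```

Each group then only ever touches its own block of rows, which is what makes one figure-wide colormap behave like several independent gradients.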

Columnstacked histograms in gnuplot from multiple files

I'm trying to use gnuplot to view some profiling data; I have several files, each of the following format:
file_runXX.dat:
elapsed time, stage
elapsed time, stage
For example:
0 foo
1 step_1
1.5 step_2
2.3 step_3
and
0 bar
0.75 step_1
1.3 step_2
2.1 step_3
To plot them, I use:
set style data histogram
set style histogram columnstack
plot for [i=1:2] sprintf("%02d.log", i) using 1
And I get a graph with two vertical bars: at x=0 I have a bar going from y=0 to y=1, then y=1 to y=1.5 and y=1.5 to y=2.3. At x=1, I have the same data from the second file.
Two questions:
(a) Is this the proper way to do this (i.e., it works, but is there something better?), and
(b) How can I set the xlabels to read "foo" and "bar" (see column 2, row 1, of each file)? I've tried messing around with using 1:xtic(2) or title columnheader and a few other options, but it seems that's only usable if I have one file containing both timestamps (I'm not sure I can do this, since I sometimes have step 2a in one file but not in the other; yes, I'm aware that this can mean the colors are not going to be uniform between bars).
Thanks
Alternatively, you could transpose the data:
#label step_1 step_2 step_3
foo 1 1.5 2.3
bar .75 1.3 2.1
... and then use the following commands:
set style data histograms
set boxwidth .7
set style histogram rowstacked
plot for [COL=2:4] "all.dat" using COL:xticlabels(1)
This adds a legend, which you can suppress or customize.
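Building the transposed file by hand gets tedious with many runs. Here is a minimal Python sketch of the conversion; the function transpose_runs is hypothetical, and it assumes that every run lists the same stages in the same order and that the first line of each run carries its label, as in the example files.

```python
def transpose_runs(runs):
    """Turn per-run 'elapsed stage' listings into one row per run.

    runs is a list of runs; each run is a list of text lines such as
    "1.5 step_2". The label of a run is the second field of its first line.
    Returns the lines of the transposed file, starting with a comment header.
    """
    header = None
    rows = []
    for lines in runs:
        pairs = [line.split() for line in lines if line.strip()]
        label = pairs[0][1]
        stages = [stage for _, stage in pairs[1:]]
        times = [elapsed for elapsed, _ in pairs[1:]]
        if header is None:
            # Stage names are taken from the first run only.
            header = "#label " + " ".join(stages)
        rows.append(" ".join([label] + times))
    return [header] + rows
```

Writing the returned lines to all.dat, one per line, gives a file in the layout shown above. Runs with differing stage lists (for example an extra step 2a) would need padding first, which this sketch does not attempt.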
You could also combine all the data in one tab-separated file all.dat:
foo bar
1 .75
1.5 1.3
2.3 2.1
and then use the following commands:
set style data histograms
set style histogram columnstacked
set boxwidth .7
plot for [COL=1:2] "all.dat" using COL title columnhead

translate matlab plot to gnuplot 3d

I have a matrix of fft data over time, 8192 rows of data x 600 columns of time. The first column is a frequency label, the first row is shown below but doesn't actually exist in the data file, neither do the spaces, they are shown just for ease of reading.
Frequency, Sec1, Sec2, Sec3...Sec600
1e8, -95, -90, -92
1.1e8, -100, -101, -103
...
It is plotted in matlab with the following code (Apologies to other posters, I grabbed the wrong matlab code)
x is a matrix of 8192 rows by 600 columns, f is an array of frequency labels, FrameLength = 1, figN = 3
function [] = TimeFreq(x,f,FrameLength,figN)
[t,fftSize] = size(x);
t = (1:1:t) * FrameLength;
figure(figN);
mesh(f,t,x)
xlabel('Frequency, Hz')
ylabel('time, sec')
zlabel('Power, dBm')
title('Time-Freq Representation')
I can't quite figure out how to make it work in gnuplot. Here is a sample image of what it looks like in Matlab: http://imagebin.org/253633
To make this work in gnuplot, you'll want to take a look at the splot (for "surface plot") command. You can probably figure out quite a lot about it just by running the following commands in your terminal:
$ gnuplot
gnuplot> help splot
Specifically, you want to read the datafile help page: after running the above, type datafile when the prompt asks for a subtopic. That should tell you enough to get you started.
Also, the answers to this question might be helpful.
Here is the gnuplot command script that I ended up using. It has some additional elements that weren't in the original Matlab plot, but all the essentials are there.
set term png size 1900,1080
set datafile separator ","
set pm3d
# reverse our records so that time moves away from our perspective of the chart
set xrange[*:*] reverse
# hide parts of the chart that would make the 3d view look funny
set hidden3d
# slightly rotate our perspective and compress the z axis
set view 45,75,,0.85
set palette defined (-120 "yellow", -70 "red", -30 "blue")
set grid x y z
set xlabel "time (secs)"
set ylabel "frequency"
set zlabel "dBm"
# plot all the data
set output "waterfall.png"
splot 'waterfall.csv' nonuniform matrix using 1:2:3 with pm3d lc palette
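One detail worth checking: the nonuniform matrix format expects the first row of the file to hold the x coordinates, while the data file described in the question has no such row. The following Python sketch prepends one, under the assumption that the columns are consecutive time frames. The function add_time_header and its frame_length parameter are hypothetical, and the corner entry is set to the column count following gnuplot's documented layout.

```python
def add_time_header(rows, frame_length=1):
    """Prepend the coordinate row that gnuplot's nonuniform matrix expects.

    rows are CSV lines of the form "frequency,z1,z2,...". The new first row
    is "N,t1,t2,...,tN", where N is the number of time columns and
    t_i = i * frame_length.
    """
    n_times = len(rows[0].split(",")) - 1
    times = [str((i + 1) * frame_length) for i in range(n_times)]
    header = ",".join([str(n_times)] + times)
    return [header] + list(rows)
```

With that row in place, the first row supplies x (time) and the first column supplies y (frequency), matching the axis labels in the script above.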