Looker: Aggregate by Count - visualization

I have data that looks like this:
User Id  | Login Count
00000000 | 0
00000001 | 1
00000002 | 1
00000003 | 2
I'd like to visualize this data by the number of logins on the X axis and the count of users with that many logins on the Y axis:
4|
3|
2|   *
1| *   *
 _________
   0 1 2
How can I do this?

This can be done by going to the vis menu, then Plot, and toggling "Switch X and Y".
Once it's enabled, the chart renders with the X and Y axes swapped.
Note that this option is only available for scatterplot, line, and area graphs.
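The aggregation the question asks for (how many users share each login count) can be sketched outside Looker in plain Python. This is only an illustration of the grouping, not a Looker feature; the function name users_per_login_count is made up for the example, and the rows are the sample data from the question.

```python
from collections import Counter

# Sample rows from the question: (user id, login count).
rows = [
    ("00000000", 0),
    ("00000001", 1),
    ("00000002", 1),
    ("00000003", 2),
]


def users_per_login_count(rows):
    """Return {login_count: number_of_users}, ordered by login count."""
    counts = Counter(login_count for _user_id, login_count in rows)
    return dict(sorted(counts.items()))


histogram = users_per_login_count(rows)

# Print one line per login count, with one star per user.
for login_count, user_count in histogram.items():
    print(login_count, "*" * user_count)
```

The keys of the resulting dictionary are the X values (login counts) and the values are the Y values (users with that many logins).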

Compute similarity in pyspark

I have a CSV file containing some data, and I want to select the rows that are similar to a given input.
My data looks like this:
H1 | H2 | H3
---+----+---
A  | 1  | 7
B  | 5  | 3
C  | 7  | 2
The data point I want to find similar rows for is [6, 8].
That is, I want to find the rows whose H2 and H3 values are similar to the input and return their H1 values.
I want to use PySpark with a similarity measure such as Euclidean distance, Manhattan distance, or cosine similarity, or a machine learning algorithm.
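As a rough illustration of two of the measures named above, here is a minimal sketch in plain Python (not PySpark) that ranks the sample rows by distance to the input [6, 8]. The helper names euclidean, manhattan, and rank_by_distance are invented for this example. In PySpark the same per-row arithmetic would have to be expressed against a DataFrame instead of a list.

```python
import math

# Rows from the question: (H1 label, H2, H3).
rows = [("A", 1, 7), ("B", 5, 3), ("C", 7, 2)]
query = (6, 8)


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def rank_by_distance(rows, query, distance):
    """Return (H1, distance) pairs, closest first."""
    scored = [(h1, distance((h2, h3), query)) for h1, h2, h3 in rows]
    return sorted(scored, key=lambda pair: pair[1])


print(rank_by_distance(rows, query, euclidean))
print(rank_by_distance(rows, query, manhattan))
```

With this particular sample, A and B are equally far from [6, 8] under both measures, so a real solution would also need a tie-breaking rule or a different measure.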

How does the Graphite summarize function with avg work?

I'm trying to figure out how the Graphite summarize function works. I have the following data points, where the X-axis represents time and the Y-axis represents duration in ms.
+-------+------+
| X     | Y    |
+-------+------+
| 10:20 | 0    |
| 10:30 | 1585 |
| 10:40 | 356  |
| 10:50 | 0    |
+-------+------+
When I pick any time window in Grafana of 2 hours or more (why?) and apply summarize('1h', avg, false), I get a triangle starting at (9:00, 0) and ending at (11:00, 0), with the peak at (10:00, 324).
A formula that a colleague came up with to explain the above observation is as follows.
Let:
a = Number of data points for a peak, in this case 4.
b = Number of non-zero data points, in this case 2.
Then avg = sum / (a + b). It produces (1585+356) / 6 = 324 but doesn't match with the definition of any mean I know of. What is the math behind this?
Your data is at 10-minute intervals, so there are 6 points in each 1-hour period. Graphite simply takes the sum of the non-null values in each period and divides it by their count (a standard average). If you look at the raw series, you'll likely find that there are also zero values at 10:00 and 10:10, which gives (1585 + 356) / 6 ≈ 324.
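The arithmetic in this answer can be checked with a small Python sketch. It is not Graphite's implementation, just the averaging rule described above; summarize_avg is a made-up name, and the zeros at 10:00 and 10:10 are the answer's assumption rather than data from the question.

```python
# Values for the 10:00-11:00 bucket at 10-minute resolution.
# The first two zeros (10:00 and 10:10) are assumed, as suggested in the
# answer; the question only lists the points from 10:20 onward.
bucket = [0, 0, 0, 1585, 356, 0]


def summarize_avg(values):
    """Average of the non-null values in one bucket (None marks a null)."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


print(summarize_avg(bucket))  # 323.5, close to the 324 reported in the question
```

If the two leading points were null instead of zero, the divisor would drop to 4, which is why the presence of those zero values matters for the result.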

MATLAB Generate matrix with logical values according to vector

I have a vertical Nx1 matrix A full of integers.
A:
+---+
| 4 |
| 3 |
| 1 |
| . |
+---+
My goal is to create an NxM matrix B where each cell's value is 1 if its column index is less than or equal to the corresponding number in A, and 0 otherwise.
B:
+-------------+
| 1 1 1 1 0 . |
| 1 1 1 0 0 . |
| 1 0 0 0 0 . |
| . . . . . . |
+-------------+
This could be achieved by iterating row by row, but I'm trying to find a quicker method. I feel this could be done through logical indexing but cannot think of how to exactly off the top of my head.
You can type:
B = A>=1:size(A,1)
% or, in versions earlier than 2016b:
B = bsxfun(@ge,A,1:size(A,1))
This compares each value in A to all the numbers from 1 to the length of A and returns 1 if the value is greater than or equal to the number (@ge), and 0 if not. The result is a matrix where each row k is the comparison of the value A(k) with all values from 1 to the length of A.
Found a solution for my problem.
index = repmat(1:max(A),length(A),1);
B = ones(length(A),max(A));
B(index>repmat(A,1,max(A))) = 0;
index is an NxM matrix where the value of a cell is equal to its column number. Whenever that value is greater than the value in A, the corresponding cell in B is set to 0.
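For readers who want to check the expected result outside MATLAB, the same rule (a cell is 1 when its column number does not exceed the row's value in A) can be sketched in plain Python. The function name prefix_mask and its width parameter are invented for this illustration; by default the width is max(A), as in the second answer.

```python
A = [4, 3, 1]


def prefix_mask(a_values, width=None):
    """Build rows of 1s and 0s: row k has 1s in columns 1..a_values[k]."""
    if width is None:
        width = max(a_values)  # M = max(A), as in the repmat-based answer
    return [
        [1 if col <= a else 0 for col in range(1, width + 1)]
        for a in a_values
    ]


B = prefix_mask(A)
for row in B:
    print(row)
```

Passing a larger width, for example prefix_mask(A, width=5), pads each row with extra zeros, as in the B shown in the question.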

matlab plot(x,y) of different data type

I have data (2945 x 3) of different types imported as a cell array. The 1st column has been imported as text (dates, e.g. 1/1/1990), whereas the 2nd and 3rd columns are numbers.
So far I have used cell2mat to convert both the 2nd and 3rd columns to double. Thus plot(y) works (y being either the 2nd or 3rd column), but I am wondering how to handle the text in my 1st column so that I can use plot(x,y).
Any ideas would be appreciated. Cheers.
--------sample.csv-------------
Date LAST Rt
1/27/2018 20 0.234556
1/26/2019 20.05 0.184556
1/23/2040 20.1 0.134556
1/22/1990 20.15 0.084556
1/21/1991 20.2 0.034556
1/20/1993 20.25 -0.015444
1/19/1998 20.3 -0.065444
1/16/2050 20.35 -0.115444
1/15/2030 20.4 -0.165444
--------cell array appearance------------
1 | 2 | 3
1| '1/27/2018' 20 0.234556
2| '1/26/2019' 20.05 0.184556
3| '1/23/2040' 20.1 0.134556
4| '1/22/1990' 20.15 0.084556
5| '1/21/1991' 20.2 0.034556
6| '1/20/1993' 20.25 -0.015444
7| '1/19/1998' 20.3 -0.065444
8| '1/16/2050' 20.35 -0.115444
9| '1/15/2030' 20.4 -0.165444
You could also use datenum to convert the text to a serial date number (copied from Octave command line):
>> test
test =
{
[1,1] = 1/1/2000
[1,2] = 1/2/2001
[1,3] = 10/2/2001
[1,4] = 10/3/2001
[1,5] = 10/3/2005
}
>> x_dates = cellfun('datenum',test(:,1))
x_dates =
730486 730853 731126 731127 732588
>> y = rand(size(x_dates));
>> plot(x_dates,y)
>> datetick('x','dd/mm/yyyy')
Update:
It looks like cellfun requires a function handle in MATLAB, so you probably need to do something like:
x_dates = cellfun(@datenum,test(:,1))
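The idea behind this answer (turn each date string into a number so it can serve as the x value) can also be sketched in Python. This is an illustration only, not MATLAB or Octave code. The name to_datenum is made up, the month/day/year format is inferred from the sample data, and the offset of 366 is what makes Python's date ordinals line up with the serial date numbers in the Octave output above.

```python
from datetime import datetime

# Offset between Python's proleptic Gregorian ordinal (0001-01-01 is day 1)
# and the datenum-style serial numbers shown in the Octave session.
DATENUM_OFFSET = 366


def to_datenum(text, fmt="%m/%d/%Y"):
    """Convert a month/day/year string to a datenum-style serial number."""
    return datetime.strptime(text, fmt).toordinal() + DATENUM_OFFSET


# A few rows from sample.csv: the date text and the LAST column.
dates = ["1/27/2018", "1/26/2019", "1/22/1990"]
last = [20, 20.05, 20.15]

# Pair numeric x values with y values and sort by date for plotting.
points = sorted(zip((to_datenum(d) for d in dates), last))
print(points)
```

Sorting by the numeric date matters for line plots here, because the sample rows are not in chronological order.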
You could use XTick and XTickLabel. The former sets where and how many ticks you want on your X axis (I guess you'd want one for each X value, but you may also want to step by 10). The latter sets the labels at those tick positions. If there are fewer labels than ticks, the labels repeat, so be careful with that.
Let me illustrate with an example:
x = [0 1 2 3];
y = [2 0 1 1];
plot (x, y);
yourstrings={'Banana', 'T', 'Potato', '45'};
set(gca,'XTick',x(1):x(end))
set(gca,'XTickLabel',yourstrings)
A second option would be to use text. You can put text wherever you like in the plot. Let me illustrate again. Of course, I don't mean for this to look "nice", but if you'd like, you could play with offsetting the text positions and so on to get a more "beautiful" plot.
x = [0 1 2 3];
y = [2 0 1 1];
plot (x, y);
yourstrings={'Banana', 'T', 'Potato', '45'};
for ii=1:length(yourstrings)
text(x(ii),y(ii),yourstrings{ii})
end
You can create a new matrix with usable data (or overwrite your current one) by converting the strings in matrix(:,1) with the cellfun function.
Or, just to plot:
plot(cellfun(@(x)str2double(x), Measures(:,i)))
where i = 1:length(matrix)

gnuplot, two y-ranges far apart

Is it possible to plot two ranges which are far apart from each other?
I mean, if I have a dataset like [ 1, 2, 3, 1001, 1001, 1003 ],
can I draw a plot like this?
|
1003 | x
1002 | x
1001 | x
1000 |
|
===================== omission
|
4 |
3 | x
2 | x
1 | x
-------------
You may want to check out this link:
Gnuplot surprising - Broken axes graph in gnuplot. The author presents three examples of plotting a graph with a broken x axis.
Three helpful examples:
http://gnuplot-surprising.blogspot.com/2011/10/broken-axes-graph-in-gnuplot-3.html
http://www.phyast.pitt.edu/~zov1/gnuplot/html/broken.html
http://www.phyast.pitt.edu/~zov1/
It is not straightforward.