Visualizing a distance matrix - MATLAB

Sorry if there's already an answer to this. I can't seem to find it.
I'm working on an application that pulls legislators' voting records on bills, and I'm trying to come up with some interesting ways of visualizing the data. There's one idea in my head right now, but I'm not sure it's mathematically possible to do the visualization I want to in two dimensions.
The data begins like this:
       HB1  HB2  HB3
Smith    1    0    1
Hill     1    1    1
Davis    0    1    0
Where 1 = aye, 0 = nay.
The next step I take is to measure the "distance" of each legislator from the other by summing the XORs of their voting records, so that each time one legislator disagrees with another they get a distance "point" with that legislator. That creates a table like this:
       Smith  Hill  Davis
Smith      0     1      3
Hill       1     0      2
Davis      3     2      0
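As a rough illustration of the step just described, here is a minimal Python sketch that builds the disagreement table from the voting records. Python is used only for illustration (the question is about Octave), and the names `votes` and `disagreement_matrix` are made up for this example.

```python
from itertools import combinations

# Voting records from the example above: 1 = aye, 0 = nay.
votes = {
    "Smith": [1, 0, 1],
    "Hill":  [1, 1, 1],
    "Davis": [0, 1, 0],
}

def disagreement_matrix(records):
    """Return (names, matrix) where matrix[i][j] counts the bills on which
    legislators i and j voted differently (the sum of the XORs)."""
    names = list(records)
    n = len(names)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = sum(a ^ b for a, b in zip(records[names[i]], records[names[j]]))
        matrix[i][j] = matrix[j][i] = d  # the table is symmetric
    return names, matrix

names, dist = disagreement_matrix(votes)
for name, row in zip(names, dist):
    print(name, row)
```

This count of differing positions is usually called the Hamming distance between the two voting records.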
So my idea is to graph each legislator as a point on a plane, and to have the distances between those points reflect the distance rating in the table. I think it presents an interesting opportunity to see if there are clusters of legislators with similar voting patterns, etc.
Now, obviously, this is easy to do with 3 points, since you can always draw a triangle from three side lengths that satisfy the triangle inequality (which these disagreement counts do, although the triangle may be flat, as with 1, 2 and 3 above). But I can't figure out whether it's mathematically possible to graph lots more (35-70) legislators and still have all the distances right within a 2-dimensional space, or whether you potentially need one additional dimension with each legislator after three.
So, for example, is it possible to preserve all the distances if the data table looks like this?
0 13 6 8 10 14 12 14 12 12
13 0 13 13 13 7 9 11 9 7
6 13 0 12 8 16 14 10 12 14
8 13 12 0 12 10 6 10 10 8
10 13 8 12 0 10 12 12 14 14
14 7 16 10 10 0 10 10 12 8
12 9 14 6 12 10 0 12 8 10
14 11 10 10 12 10 12 0 8 10
12 9 12 10 14 12 8 8 0 10
12 7 14 8 14 8 10 10 10 0
If so, does Octave have a built-in function? Or can anyone point me to an algorithm?

Ok, found the answer.
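No, in general it's not mathematically possible to do what I wanted: an exact layout of n points can need up to n-1 dimensions, and some distance tables can't be reproduced exactly in any number of dimensions.
The best approximation is a technique called multidimensional scaling (MDS), which looks for a low-dimensional layout whose distances approximate the table. Octave has a built-in function: cmdscale.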
Hope others may find this helpful.
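For anyone curious what classical multidimensional scaling does under the hood, below is a minimal pure-Python sketch. It is not how cmdscale is implemented; it is just the textbook recipe (double-centre the squared distances, then keep the top eigenvectors scaled by the square roots of their eigenvalues). Power iteration with deflation is my own choice to keep the example dependency-free; in practice a library eigen-solver such as `numpy.linalg.eigh` would be the sensible tool. The function name `classical_mds` is made up, and the input is assumed to be a symmetric matrix with a zero diagonal.

```python
import math
import random

def classical_mds(dist, dims=2, iters=500):
    """Classical (Torgerson) MDS sketch: returns one coordinate list per point."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    # Shift the spectrum by a Gershgorin bound so power iteration converges
    # to the most positive eigenvalue of B, not the largest in magnitude.
    shift = max(sum(abs(v) for v in row) for row in b)
    work = [[b[i][j] + (shift if i == j else 0.0) for j in range(n)]
            for i in range(n)]
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        vec = [rng.random() - 0.5 for _ in range(n)]
        for _ in range(iters):
            nxt = [sum(work[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(v * v for v in nxt))
            if norm == 0.0:
                break
            vec = [v / norm for v in nxt]
        # Rayleigh quotient gives the eigenvalue of the unshifted matrix B.
        b_vec = [sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)]
        eigval = sum(v * w for v, w in zip(vec, b_vec))
        scale = math.sqrt(max(eigval, 0.0))  # negative eigenvalues give no real axis
        for i in range(n):
            coords[i][d] = scale * vec[i]
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                work[i][j] -= (eigval + shift) * vec[i] * vec[j]
    return coords

dist = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]  # Smith, Hill, Davis
points = classical_mds(dist)
for name, p in zip(["Smith", "Hill", "Davis"], points):
    print(name, [round(c, 3) for c in p])
```

For the three-legislator table the layout is exact (the points land on a line). For larger tables the 2-D distances will generally only approximate the table, and the orientation and sign of the axes are arbitrary.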

How do I set random numbers that fall in a range in kdb+?

In kdb+, how do I use the "roll" function to make the generated random numbers fall within a range that doesn't start at 0? For example, what if I wanted the range to be 2-10 instead of 0-10?
What do I have to add to the code to make it fall into a range instead of the default 0-x? I have searched and tried various approaches but can't seem to find one that works.
You could also just roll from 0-8 and then add two. This doesn't require a list to be pre-generated.
q)2+5?9
10 2 7 10 7
Assuming you want 2-10 inclusive:
// quick and simple method: draw from the list 2 3 ... 10
q)10?2+til 9
6 2 4 3 4 3 4 5 4 7
// or as a function: x = how many numbers to draw, y = start of range, z = end of range (both inclusive)
q)f:{x?y+til 1+z-y}
q)f[10;10;20]
12 17 10 11 19 12 11 18 18 11
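The arithmetic behind both answers is the same: roll a number from 0 up to (end - start), then add the start. Here is a minimal sketch of that idea in Python, purely for illustration outside q; `roll_in_range` is a made-up name.

```python
import random

def roll_in_range(count, low, high, rng=random):
    """Draw `count` integers uniformly from low..high inclusive by rolling
    0..(high - low) and adding the offset `low`."""
    width = high - low + 1  # number of distinct values in the range
    return [low + rng.randrange(width) for _ in range(count)]

sample = roll_in_range(5, 2, 10)
print(sample)
```

The `+ 1` in `width` is what makes the upper bound inclusive; leaving it out is the usual off-by-one mistake with this pattern.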
If you supply a list in the right hand argument then you will get a random value from that list. To roll for a random range from 2-10 you can use til to generate the range:
q)2+til 9
2 3 4 5 6 7 8 9 10
q)1?2+til 9
,6
You can even supply a general list to randomly draw from:
q)3?(`abc;2 3f;10;20;30;"text")
2 3f
`abc
"text"
A simple formula for a random number within a range is:
(rand() mod (1+max-min)) + min
In q, with x as the minimum and y as the maximum:
q)f:{x+rand[0] mod 1+y-x}
q)f[5;10]
7
Update: I failed to notice that you wanted to generate several random numbers in the range. You can easily modify the function above for that by passing the count as a third argument:
q)f:{x+(z?0) mod 1+y-x}
q)f[2;10;4]
6 4 7 2

Displaying data to a map, creating a choropleth

What I would like to do is create a choropleth map which is darker or lighter based on the number of data points in a particular area.
I have the following data:
RO-B, 9
PL-MZ, 24
SE-C, 3
DE-NI, 5
PL-DS, 14
ES-CM, 11
RO-IS, 2
DE-BY, 51
SE-Z, 18
CH-BE, 10
PL-WP, 1
ES-IB, 1
DE-BW, 21
DE-BE, 24
DE-BB, 1
IE-M, 26
ES-PV, 1
DE-SN, 6
CH-ZH, 31
ES-GA, 1
NL-GE, 2
IE-U, 1
ES-AN, 4
FR-J, 82
DE-HH, 34
PL-PD, 1
PL-LD, 6
GB-WLS, 60
GB-ENG, 8619
RO-BV, 45
CH-VD, 2
PL-SL, 1
DE-HE, 17
SE-I, 1
HU-PE, 4
PL-MA, 4
SE-AB, 3
CH-BS, 20
ES-CT, 31
DE-TH, 25
IE-C, 1
CZ-ST, 1
DE-NW, 29
NL-NH, 3
DE-RP, 9
CZ-PR, 4
IE-L, 134
HU-BU, 10
RO-CJ, 1
GB-NIR, 29
ES-MD, 33
CH-LU, 11
GB-SCT, 172
CH-GE, 3
BE-BRU, 30
BE-VLG, 25
Each row gives the ISO 3166-2 code of a country subdivision, followed by the number of data points associated with that region.
I've seen this project on GitHub, which also seems to use ISO 3166-2 codes to reference regions.
I'm trying to figure out how I could modify their code to display my data points, so that the higher the number, the darker the area, and the lower the number, the lighter it is.
It seems like it should be possible. The first thing I tried was modifying this jsfiddle code, which seems very close to what I need, but I couldn't get it to work.
For instance this line:
osmeRegions.geoJSON('RU-MOW',
This seems to reference an ISO 3166-2 code directly, but it's not as simple as just changing that (or maybe it is, but I couldn't get that to work properly).
Does anyone know if I could possibly adapt the code from that project to create the map rendering I've described?
Or perhaps there's a different way to achieve the same goal?
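Independently of which mapping library ends up drawing the regions, the "higher number means darker" step can be sketched on its own. The Python below is only an illustration and does not use the osmeRegions API. The log scale is an assumption on my part (GB-ENG at 8619 would otherwise wash out every other region), and the grey levels 230 and 30 and the name `shade_for_counts` are arbitrary.

```python
import math

def shade_for_counts(counts):
    """Map each region code to a grey hex colour: higher count -> darker."""
    # Assumption: log scaling, so one huge count does not flatten the rest.
    logs = {code: math.log1p(n) for code, n in counts.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero when all counts are equal
    shades = {}
    for code, value in logs.items():
        t = (value - lo) / span        # 0.0 = fewest points, 1.0 = most
        level = round(230 - t * 200)   # 230 = light grey, 30 = dark grey
        shades[code] = "#{0:02x}{0:02x}{0:02x}".format(level)
    return shades

# A few rows from the data above.
counts = {"GB-ENG": 8619, "GB-SCT": 172, "FR-J": 82, "DE-BY": 51, "PL-WP": 1}
shades = shade_for_counts(counts)
for code, colour in shades.items():
    print(code, colour)
```

The resulting colour per ISO 3166-2 code could then be passed as the fill style when each region's outline is added to the map.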

Using ELKI, having troubles with dimensions higher than 14

I'm trying to use SUBCLU in ELKI, but in order to figure things out I've tried DBSCAN, and even KMEANSLloyd, just so I know how to input high-dimensional data. Unfortunately I can only enter up to 14 dimensions; any higher and the program starts complaining that I haven't entered a parameter for "bubble.scaling", even though I clearly have. I'm inputting the data from a .csv file formatted like the "mouse.csv" tutorial file (which is how I figured out how to enter data with more than one dimension in the first place). What am I doing wrong?
Turns out I wasn't formatting the CSV file properly. Rather than a CSV file containing just the data, with the dimensions separated by spaces, I needed to include the headers as well. As I wasn't using randomly generated information and I didn't know the number of clusters beforehand, this is what the CSV looked like:
## Size: 10
########################################################
1 2 3 4 5 6 7 8 9 10 11 12 13 14
1 2 3 4 5 6 7 8 9 10 11 12 13 14
14 13 12 11 10 9 8 7 6 5 4 3 2 1
14 13 12 11 10 9 8 7 6 5 4 3 2 1
I had the same problem. In my case it turned out that my CSV file contained only integer columns, which were read as strings rather than as numbers. Setting dbc.parser to CategoricalDataAsNumberVectorParser made the out-of-bounds error disappear.

How does Matlab extract a subset of a bigger matrix by specifying the indices?

I have a matrix A
A =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
I have another matrix to specify the indices
index =
1 2
3 4
Now, I get a third matrix C from A
C = A(index)
C =
1 6
11 16
Problem: I am unable to understand how I ended up with this matrix C. What is the logic behind it?
The logic behind it is linear indexing: when you provide a single index, Matlab counts down the first column, then down the second column, and so on (and then through any further dimensions, in order). The result takes the shape of the index array, which is why C is 2 x 2.
So in your case (4 x 5 matrix) the entries of A are being accessed in the following order (each number here represents order, not the value of the entry):
1 5 9 13 17
2 6 10 14 18
3 7 11 15 19
4 8 12 16 20
Once you get used to it, you'll see linear indexing is a very powerful tool.
As an example: to obtain the maximum value in A you could just use max(A(1:20)). This could be further simplified to max(A(1:end)) or max(A(:)). Note that "A(:)" is a common Matlab idiom that turns any array into a column vector, which is sometimes called linearizing the array.
See also ind2sub and sub2ind, which convert a linear index to row/column subscripts and vice versa.
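To make the column-first counting concrete, here is a small Python sketch that mimics C = A(index) for the matrices in the question. It is only an illustration of the index arithmetic, not Matlab code, and `linear_index` is a made-up helper.

```python
def linear_index(matrix, k):
    """Return the element at 1-based, column-major linear index k."""
    rows = len(matrix)
    col, row = divmod(k - 1, rows)  # walk down each column before moving right
    return matrix[row][col]

A = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20],
]
index = [[1, 2], [3, 4]]

# The result takes the shape of the index matrix, as in C = A(index).
C = [[linear_index(A, k) for k in row] for row in index]
print(C)  # [[1, 6], [11, 16]]
```

Indices 1 to 4 all fall in the first column of A, which is why C contains 1, 6, 11 and 16 rather than 1, 2, 3 and 4.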

Page limitation AjaxPagingNavigator?

I'm using Wicket's paging support for DataView (AjaxPagingNavigator).
How can I limit the number of page links shown?
For example :
First Previous 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 Next Last
Should be limited to 10 pages max... (First 5 and last 5)
First Previous 1 2 3 4 5 ... 14 15 16 17 18 Next Last
Any idea?
Is it supported by default?
If not... How can I change it?
Thx
Koen
PagingNavigation supports a view size (the number of page links shown), which is 10 by default.
If you want to display the first 5 and last 5, you'll have to roll your own navigation implementation; see AjaxPagingNavigator#newNavigation().
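The page-selection logic such a custom navigation would need is small. The sketch below shows it in Python, independent of Wicket: it uses no Wicket API, `page_window` is a made-up name, and a real implementation would have to be written in Java against Wicket's navigation classes.

```python
def page_window(total_pages, edge=5):
    """Return the page numbers to show: the first `edge` and last `edge` pages,
    with None marking the gap where "..." would be rendered."""
    if total_pages <= 2 * edge:
        return list(range(1, total_pages + 1))  # few enough pages: show them all
    head = list(range(1, edge + 1))
    tail = list(range(total_pages - edge + 1, total_pages + 1))
    return head + [None] + tail

labels = ["..." if p is None else str(p) for p in page_window(18)]
print(" ".join(labels))  # 1 2 3 4 5 ... 14 15 16 17 18
```

With 10 pages or fewer the gap marker is omitted, so short result sets still render as a plain list of page links.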