What does Weka's k-means output (WCSS) mean?

I was using Weka to do k-means clustering, and when I tried a small data set I found that the within-cluster sum of squared errors (WCSS) was not what I expected. I thought WCSS was the sum of squared distances of all elements to their cluster center, but the value did not match.
For example, the data set was:
3.0, 2.0, 3.0, 0.0, 0.0, 2.0, 1.0, 0.0, 1.0
4.0, 1.0, 3.0, 0.0, 1.0, 0.0, 1.0, 4.0, 1.0
4.0, 1.0, 7.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0
3.0, 2.0, 7.0, 0.0, 0.0, 2.0, 1.0, 1.0, 0.0
3.0, 2.0, 6.0, 1.0, 0.0, 1.0, 0.0, 2.0, 1.0
4.0, 2.0, 5.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0
4.0, 1.0, 8.0, 0.0, 1.0, 2.0, 0.0, 0.0, 1.0
3.0, 2.0, 2.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0
3.0, 2.0, 0.0, 0.0, 1.0, 1.0, 1.0, 3.0, 1.0
and the (single) cluster center was 3, 2, 3, 0, 1, 1, 1, 0, 1.
Weka reported the WCSS as 39, but according to my understanding it should be 133.
I know I must be wrong about what WCSS means; could anyone explain it to me?

I believe what is reported is the WCSS after the attribute values have been normalized. Unfortunately, I was not able to replicate your result.
However, using your data set with SimpleKMeans (k=1), I got the following results:
Before normalizing attribute values, WCSS is 26.4375.
After normalizing attribute values, WCSS is 26.4375.
This source also indicates that Weka's k-means implementation automatically normalizes the attribute values.

@relation cancer
@attribute a1 {1,2,3,4,5,6}
@attribute a2 {0,1,2}
@attribute a3 {0,1,2,3,4,5,6,7,8,9,10}
@attribute a4 {0,1,2,3,4,5,8}
@attribute a5 {0,1}
@attribute a6 {0,1,2}
@attribute a7 {0,1}
@attribute a8 {0,1,2,3,4}
@attribute a9 {0,1}
@attribute label {0,1}
@data
3,2,3,0,0,2,1,0,1,1
4,1,3,0,1,0,1,4,1,0
4,1,7,0,1,1,0,1,1,1
3,2,7,0,0,2,1,1,0,0
3,2,6,1,0,1,0,2,1,1
4,2,5,1,1,1,1,0,0,0
4,1,8,0,1,2,0,0,1,0
3,2,2,0,1,1,0,0,1,0
3,2,0,0,1,1,1,3,1,0
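
For reference, the asker's figure of 133 is exactly the raw sum of squared distances from each row to the stated center. A minimal NumPy sketch (an addition for checking the arithmetic, not part of the original exchange):

import numpy as np

X = np.array([
    [3, 2, 3, 0, 0, 2, 1, 0, 1],
    [4, 1, 3, 0, 1, 0, 1, 4, 1],
    [4, 1, 7, 0, 1, 1, 0, 1, 1],
    [3, 2, 7, 0, 0, 2, 1, 1, 0],
    [3, 2, 6, 1, 0, 1, 0, 2, 1],
    [4, 2, 5, 1, 1, 1, 1, 0, 0],
    [4, 1, 8, 0, 1, 2, 0, 0, 1],
    [3, 2, 2, 0, 1, 1, 0, 0, 1],
    [3, 2, 0, 0, 1, 1, 1, 3, 1],
], dtype=float)
center = np.array([3, 2, 3, 0, 1, 1, 1, 0, 1], dtype=float)

print(((X - center) ** 2).sum())  # 133.0

So the discrepancy with Weka's reported 39 presumably comes from how Weka measures distance (normalization, and the fact that the ARFF above declares the attributes as nominal), not from the arithmetic itself.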

Changing code implemented with cairo_t to Cairo::RefPtr<Cairo::Context>

I have some code that I need to reimplement with Cairo::RefPtr<Cairo::Context>. It is a bit confusing, since I could not find a good example that uses a pattern with Cairo::RefPtr<Cairo::Context> instead of cairo_t:
Cairo::RefPtr<Cairo::Surface> surface =
    Cairo::ImageSurface::create(Cairo::FORMAT_ARGB32, width, height);
Cairo::RefPtr<Cairo::Context> cr = Cairo::Context::create(surface);

cairo_pattern_t *cp = cairo_pattern_create_radial(x_off, y_off, 0, x_off, y_off, cent_point_radius);
cairo_pattern_add_color_stop_rgba(cp, 0.0, 0.7, 0.7, 0.7, 0.8);
cairo_pattern_add_color_stop_rgba(cp, 1.0, 0.1, 0.1, 0.1, 0.8);
cairo_set_source(cr, cp);
How can I change "cp" to something that is recognizable by cr->set_source()? cr used to be a cairo_t, but then I had to change it to Cairo::RefPtr<Cairo::Context>.
Best regards
Since you've already decided to do it the C++ way, why not go all the way?
// Create an image surface.
Cairo::RefPtr<Cairo::Surface> refSurface =
    Cairo::ImageSurface::create(Cairo::FORMAT_ARGB32, nWidth, nHeight);

// Create a Cairo context for the image surface.
Cairo::RefPtr<Cairo::Context> refContext =
    Cairo::Context::create(refSurface);

// Create a radial gradient (pattern).
Cairo::RefPtr<Cairo::RadialGradient> refPattern =
    Cairo::RadialGradient::create(x_off, y_off, 0,
                                  x_off, y_off, cent_point_radius);

// Add color stops to the pattern.
refPattern->add_color_stop_rgba(0.0, 0.7, 0.7, 0.7, 0.8);
refPattern->add_color_stop_rgba(1.0, 0.1, 0.1, 0.1, 0.8);

// Set the pattern as the source for the context.
refContext->set_source(refPattern);

// Add a closed path and fill...
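
For instance, the closed path could be a circle over the gradient's extent (a small sketch; it assumes the same x_off, y_off, and cent_point_radius used to create the gradient):

refContext->arc(x_off, y_off, cent_point_radius, 0.0, 2.0 * M_PI);
refContext->fill();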

Multiple camera orbits in Paraview

So I need the camera to orbit my data multiple times. I thought this would be quite easy, but I could not figure it out. Double-clicking the camera sequence in the Animation View allowed me to add another path, but it added a default path that was different from the orbit. Manually (and painfully) copying the path parameters over did not work either. Any ideas on how to do this?
I know this is quite some time later, but I was stuck on this same need. I ended up tracing how a follow-path camera cue works in Python code, and then I stacked a few of those key frames in a for loop, making sure I updated the KeyTime of each frame. This gave me an animation that orbited the focal point once for each loop iteration.
https://discourse.paraview.org/t/issues-with-multiple-orbit-laps-in-single-animation/11371
from paraview.simple import *

anim = GetAnimationScene()
renderView1 = GetActiveViewOrCreate("RenderView")

cameraAnimationCue1 = CameraAnimationCue()
# cameraAnimationCue1 = GetCameraTrack(view=renderView1)
cameraAnimationCue1.Mode = 'Path-based'
cameraAnimationCue1.AnimatedProxy = renderView1

# Create one key frame per orbit; each holds the same closed circular path.
n = 3
for i in range(n):
    keyFrameN = CameraKeyFrame()
    keyFrameN.Position = [-6.6921304299024635, 0.0, 0.0]
    keyFrameN.FocalPoint = [1e-20, 0.0, 0.0]
    keyFrameN.ViewUp = [0.0, 0.0, 1.0]
    keyFrameN.ParallelScale = 1.7320508075688772
    keyFrameN.PositionPathPoints = [0.0, -5.0, 0.0, 2.938926261462365, -4.045084971874736, 0.0, 4.755282581475766, -1.545084971874737, 0.0, 4.755282581475766, 1.5450849718747361, 0.0, 2.938926261462365, 4.045084971874735, 0.0, 1.3322676295501878e-15, 4.9999999999999964, 0.0, -2.9389262614623624, 4.045084971874735, 0.0, -4.755282581475763, 1.5450849718747368, 0.0, -4.755282581475763, -1.5450849718747341, 0.0, -2.9389262614623632, -4.045084971874731, 0.0]
    keyFrameN.FocalPathPoints = [0.0, 0.0, 0.0]
    keyFrameN.ClosedPositionPath = 1
    keyFrameN.KeyTime = i / n
    cameraAnimationCue1.KeyFrames.append(keyFrameN)

# Ending key frame (and scale) at KeyTime 1.0.
keyFrame9333 = CameraKeyFrame()
keyFrame9333.KeyTime = 1.0
keyFrame9333.Position = [-6.6921304299024635, 0.0, 0.0]
keyFrame9333.FocalPoint = [1e-20, 0.0, 0.0]
keyFrame9333.ViewUp = [0.0, 0.0, 1.0]
keyFrame9333.ParallelScale = 1.7320508075688772

# Initialize the animation track.
cameraAnimationCue1.KeyFrames.append(keyFrame9333)
anim.Cues.append(cameraAnimationCue1)
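
With the cue appended, playing the scene should run the stacked orbits back to back (assuming the scene's time range is already configured):

anim.Play()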

Metal shader not working as expected

If I run the following vertex shader in Metal/Swift I get a nice rectangle on the screen:
vertex Vertex vertexShader(uint k [[ vertex_id ]],
                           device float2* position [[buffer(1)]]) {
    Vertex output;
    float2 pos = position[k];
    output.position = float4(pos, 0, 1);
    return output;
};
// position  [0.0, 0.0, 0.5, 0.0, 0.0, 0.5, 0.5, 0.5]
// indexList [0, 1, 2, 2, 1, 3]
Now if I run the following I get a blank screen:
vertex Vertex vertexShader(uint k [[ vertex_id ]],
                           device float3* position [[buffer(1)]]) {
    Vertex output;
    float3 pos = position[k];
    output.position = float4(pos, 1);
    return output;
};
// position  [0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.5, 0.5, 0.0]
// indexList [0, 1, 2, 2, 1, 3]
It seems to me these should produce identical results. What am I missing?
How exactly are you filling the buffer associated with index 1 in your app code?
I suspect you're just supplying an array of floats. Well, float3 is not packed: its layout is not the same as three floats, because there's padding. Its size is actually the same as float4, i.e. four floats.
Probably the simplest fix is to declare position as a pointer to packed_float3.
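For reference, a minimal sketch of that fix (it assumes the Vertex struct and buffer setup from the question are unchanged):

vertex Vertex vertexShader(uint k [[ vertex_id ]],
                           device packed_float3* position [[buffer(1)]]) {
    Vertex output;
    float3 pos = float3(position[k]);  // packed_float3 occupies exactly 3 floats, no padding
    output.position = float4(pos, 1);
    return output;
};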

how to sort multidimensional matrices along multiple columns

I have a tricky matrix-manipulation issue that I could really use some help with.
I need to reorganize a series of 2D matrices so that they align most effectively across subjects. Each matrix has ~50 rows (the observations) and 13 columns (the 'weight' of each observation on a series of 13 outcome measures). Because of the manner in which the data are created, there is no inherent meaning in the order of the rows; however, I need to reorganize each matrix such that the rows carry the same meaning across subjects.
Specifically, I want to reorder the matrices such that the specific pattern of weightings in a given row aligns with a similar pattern in the same row across a group of 20 subjects. To make matters worse, some subjects have missing rows, although all have between 45 and 50 rows.
As an example:
subject 1:
[ 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.5, 0.6, 0.6, 0.7;
0.9, 0.8, 0.8, 0.7, 0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.3]
subject 2:
[ 0.8, 0.7, 0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.3, 0.2, 0.2;
0.0, 0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.6, 0.7]
Problem: row 1 in subject 1 aligns best with row 2 in subject 2 (and vice versa), and I would like to reorganize them as such. [Note: the real-life problem is much more convoluted than this.]
I apologize ahead of time for how idiosyncratic this issue is, but I really appreciate any help that anyone can give.
Mac
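
One way to attack this (a sketch of one possible approach, not from the original post; it assumes Euclidean distance between row profiles is a sensible similarity measure) is to treat row matching as an assignment problem: build a cost matrix of pairwise row distances and solve it with the Hungarian algorithm. Missing rows are handled because the cost matrix may be rectangular:

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

subject1 = np.array([
    [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.5, 0.6, 0.6, 0.7],
    [0.9, 0.8, 0.8, 0.7, 0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.3],
])
subject2 = np.array([
    [0.8, 0.7, 0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.3, 0.2, 0.2],
    [0.0, 0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.6, 0.7],
])

cost = cdist(subject1, subject2)           # pairwise Euclidean distances between rows
rows, cols = linear_sum_assignment(cost)   # optimal one-to-one row matching
subject2_aligned = subject2[cols]          # reorder subject 2 to line up with subject 1
# Here cols is [1, 0]: row 0 of subject 1 pairs with row 1 of subject 2, and vice versa.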

Formatting CIColorCube data

Recently, I've been trying to set up a CIColorCube on a CIImage to create a custom effect. Here's what I have now:
uint8_t color_cube_data[8*4] = {
0, 0, 0, 1,
255, 0, 0, 1,
0, 255, 0, 1,
255, 255, 0, 1,
0, 0, 255, 1,
255, 0, 255, 1,
0, 255, 255, 1,
255, 255, 255, 1
};
NSData *cube_data = [NSData dataWithBytes:color_cube_data length:8*4*sizeof(uint8_t)];
CIFilter *filter = [CIFilter filterWithName:@"CIColorCube"];
[filter setValue:beginImage forKey:kCIInputImageKey];
[filter setValue:@2 forKey:@"inputCubeDimension"];
[filter setValue:cube_data forKey:@"inputCubeData"];
outputImage = [filter outputImage];
I've checked out the WWDC 2012 Core Image session, and what I have still doesn't work. I've also checked the web, and there are very few resources available on this issue. My code above just returns a black image.
In Apple's developer library, it says:
This filter applies a mapping from RGB space to new color values that are defined in inputCubeData. For each RGBA pixel in inputImage the filter uses the R, G, and B values to index into a three-dimensional texture represented by inputCubeData. inputCubeData contains floating point RGBA cells that contain linear premultiplied values. The data is organized into inputCubeDimension number of xy planes, with each plane of size inputCubeDimension by inputCubeDimension. Input pixel components R and G are used to index the data in x and y respectively, and B is used to index in z. In inputCubeData the R component varies fastest, followed by G, then B.
However, this makes no sense to me. How does my inputCubeData need to be formatted?
The accepted answer is incorrect. While the cube data is indeed supposed to be scaled to [0 .. 1], it's supposed to be float, not int.
float color_cube_data[8*4] = {
0.0, 0.0, 0.0, 1.0,
1.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 1.0,
1.0, 1.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0,
1.0, 0.0, 1.0, 1.0,
0.0, 1.0, 1.0, 1.0,
1.0, 1.0, 1.0, 1.0
};
(Technically, you don't have to put the ".0" on each number; the compiler knows how to handle it.)
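One further detail worth flagging (my addition, not from the original answer): once the array becomes float, the length passed to NSData in the question's code has to change accordingly, e.g.:

NSData *cube_data = [NSData dataWithBytes:color_cube_data length:8*4*sizeof(float)];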
I found the issue, and I have updated my question in case anyone has the same problem!
The input array values had to be divided by 255, i.e. scaled down to the [0, 1] range.
The original used 255:
uint8_t color_cube_data[8*4] = {
0, 0, 0, 1,
255, 0, 0, 1,
0, 255, 0, 1,
255, 255, 0, 1,
0, 0, 255, 1,
255, 0, 255, 1,
0, 255, 255, 1,
255, 255, 255, 1
};
It should look like this instead:
uint8_t color_cube_data[8*4] = {
0, 0, 0, 1,
1, 0, 0, 1,
0, 1, 0, 1,
1, 1, 0, 1,
0, 0, 1, 1,
1, 0, 1, 1,
0, 1, 1, 1,
1, 1, 1, 1
};
Your problem is that you are using the value 1 (which is next to zero) for the alpha channel; the max for uint8_t is 255.
See example below:
CIFilter *cubeHeatmapLookupFilter = [CIFilter filterWithName:@"CIColorCube"];
int dimension = 4; // Must be a power of 2, max of 128
int cubeDataSize = 4 * dimension * dimension * dimension;
// Entries beyond those listed are zero-initialized.
unsigned char cubeDataBytes[4*4*4*4] = {
    0,   0,   0,   0,
    255, 0,   0,   170,
    255, 250, 0,   200,
    255, 255, 255, 255
};
NSData *cube_data = [NSData dataWithBytes:cubeDataBytes length:(cubeDataSize * sizeof(char))];

// Applying:
[cubeHeatmapLookupFilter setValue:myImage forKey:@"inputImage"];
[cubeHeatmapLookupFilter setValue:cube_data forKey:@"inputCubeData"];
[cubeHeatmapLookupFilter setValue:@(dimension) forKey:@"inputCubeDimension"];
This is a link to the full project: https://github.com/knerush/heatMap