Separate objects in a point cloud - MATLAB

I am looking to separate a point cloud into unconnected objects. I have tried using the k-means algorithm, but it didn't do the whole job. I'm looking to improve the results shown in the picture I added. Any thoughts or directions?
My cloud, separated using k-means

I suggest you look into Euclidean Segmentation, for which an algorithm is given on this page: http://www.pointclouds.org/documentation/tutorials/cluster_extraction.php. PCL also provides a built-in implementation.
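For a concrete starting point, here is a minimal sketch of the same idea in Python (the question is about MATLAB, where pcsegdist from the Computer Vision Toolbox plays a similar role, but the technique translates directly). Euclidean cluster extraction amounts to finding connected components under a distance threshold, which DBSCAN with min_samples=1 reproduces; the file name and the 0.02 tolerance are placeholders you would tune to your cloud:

    # Sketch: Euclidean cluster extraction as connected components under a
    # distance threshold. With min_samples=1 every point is a core point,
    # so DBSCAN's clusters are exactly the eps-connected components.
    import numpy as np
    from sklearn.cluster import DBSCAN

    points = np.loadtxt("cloud.xyz")  # hypothetical N x 3 point cloud file
    labels = DBSCAN(eps=0.02, min_samples=1).fit_predict(points)

    clusters = [points[labels == k] for k in np.unique(labels)]
    print(f"found {len(clusters)} objects")

Unlike k-means, this never splits a contiguous object and never merges objects farther apart than eps, which is the behaviour the question is after.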

Related

Analyze 3D objects by voxels

I plan to use OpenVDB to analyze 3D objects/meshes. The objective is:
To detect object surface regions with a certain criterion, like slope
Then manipulate those regions
The manipulation might be adding other 3D objects to those regions, for example
OpenVDB has some tools available:
Conversion Tools
Filters
Topological Operations
Level Set Tools
Morphological Operations
Geometric Transforms
Compositing Tools
...
It is a large and confusing set of tools to choose from. Does anybody with OpenVDB experience know:
Is OpenVDB the proper library to achieve my objective?
If so, which OpenVDB tool best suits my needs?
Answer provided by OpenVDB community:
An important question is what you mean by "3D objects/meshes."
OpenVDB is very good at performing those sorts of operations on surfaces by representing them as signed distance fields. But the word "mesh" raises some alarm bells that you may want to maintain topology; in that case another library may be more effective.
It also sounds like you have a problem domain you are trying to
explore. For that, I would not go straight to code but instead
explore solutions using 3D applications first. My own biased first
choice would be Houdini, whose apprentice version you can get for
free. This provides most of the VDB code as separate nodes. So, for
example, you can use a File SOP to load a mesh from disk, a VDB From
Polygons to convert it to a Signed Distance Field, and then a VDB
Analysis to compute the Gradient. The gradient I think matches what
you are looking for as slope, but it is also possible you are looking
for curvature...
To return to mesh land, you can use a VDB Convert. Finally, a ROP
Geometry can save it out.
Attached is a file showing a network to compute an approximate Y-slope
as a volume, apply it back to a mesh, and save to disk.
Attached file
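To make the gradient step concrete outside of Houdini, here is a rough numpy sketch (deliberately not the OpenVDB API) of what taking the gradient of a signed distance field buys you: for an SDF the normalized gradient is the surface normal, so its Y component is the approximate Y-slope the answer refers to. The sphere SDF and the grid resolution are invented for illustration:

    # Conceptual sketch, not OpenVDB: VDB From Polygons would produce the
    # SDF; here a unit sphere stands in for it on a 64^3 grid.
    import numpy as np

    ax = np.linspace(-1.5, 1.5, 64)
    x, y, z = np.meshgrid(ax, ax, ax, indexing="ij")
    sdf = np.sqrt(x**2 + y**2 + z**2) - 1.0   # signed distance to sphere

    # The VDB Analysis "gradient" equivalent: finite-difference gradient.
    gx, gy, gz = np.gradient(sdf, ax, ax, ax)
    norm = np.sqrt(gx**2 + gy**2 + gz**2) + 1e-12
    y_slope = gy / norm   # per-voxel Y-slope in [-1, 1]

Sampling y_slope back at the mesh vertices would then give the per-vertex slope attribute, analogous to what the attached network does.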

Applying vector-based clustering algorithms in a social network context

I have a social network described as edges in a file. I used graph-based clustering algorithms to find dense parts of the graph. However, there is also vector-based clustering, which I need to apply to my data, but I cannot find any context for this. I also have information about each node's features. I think using vectors containing the features of each user makes no sense here. For example, k-means would calculate the distance between user u1 with feature vector v1 = [f1, f2, f3, ...] and user u2 with feature vector v2 = [f1, f2, f3, ...]. However, both vectors would have binary values depending on which features the user has. Additionally, I have a matrix with the users on one axis and the features on the other, where each user can set permissions.
My question now is how I can make use of k-means, DBSCAN, etc. in the context of this topic.
Best wishes.
Many algorithms can be modified to work with distances on binary features. For example, k-means can be adapted to binary data: k-modes.
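As an illustration, here is a minimal k-modes sketch using the third-party kmodes package (pip install kmodes); the binary user-feature matrix is randomly generated stand-in data:

    # Hypothetical example: cluster binary user-feature vectors with
    # k-modes instead of k-means (matching on modes, not means).
    import numpy as np
    from kmodes.kmodes import KModes

    X = np.random.randint(0, 2, size=(100, 12))  # users x binary features

    km = KModes(n_clusters=4, init="Huang", n_init=5)
    labels = km.fit_predict(X)
    print(km.cluster_centroids_)  # modal feature pattern per cluster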
But I don't think it will do anything useful on your data.
Your approach to this problem is bad: don't decide on the algorithm first and then try to make it run; you are then bound to solve the wrong problem. Instead, formalize the problem first, in mathematics: what would a good clustering be? Then identify the appropriate algorithm by its mathematical ability to find a good solution to this objective.

Calculating clustering validity of k-means using RapidMiner

Well, I have been studying up on the different algorithms used for clustering, like k-means, k-medoids, etc., and I was trying to run the algorithms and analyze their performance on the Leaf dataset right here:
http://archive.ics.uci.edu/ml/datasets/Leaf
I was able to cluster the dataset via k-means by first reading the CSV file, filtering out unneeded attributes, and applying k-means to it. The problem I am facing is that I wish to calculate measures such as entropy, precision, recall, and F-measure for the model developed via k-means. Is there an operator available that lets me do this, so that I can quantitatively compare the different clustering algorithms available in RapidMiner?
P.S. I know about performance operators like Performance (Classification) that allow me to calculate precision and recall for a model, but I don't know of any that allows me to calculate entropy.
Help would be much appreciated.
The short answer is to use R. Here's a link to a book chapter about this very subject. There is a revised version coming soon that works for the most recent version of RapidMiner.
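If stepping outside RapidMiner (to Python rather than R) is also an option, scikit-learn ships entropy-based external validity measures: homogeneity, completeness, and V-measure are all computed from the entropies of the class and cluster label distributions. A small sketch with made-up labels standing in for the Leaf species labels and the k-means assignments:

    # Entropy-based external cluster validity with scikit-learn.
    from sklearn.metrics import (homogeneity_score, completeness_score,
                                 v_measure_score)

    true_labels = [0, 0, 1, 1, 2, 2]   # made-up ground-truth classes
    cluster_ids = [1, 1, 0, 0, 2, 0]   # made-up k-means output

    print("homogeneity :", homogeneity_score(true_labels, cluster_ids))
    print("completeness:", completeness_score(true_labels, cluster_ids))
    print("v-measure   :", v_measure_score(true_labels, cluster_ids))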

How to find bridges (community connecting nodes) in large networks represented using the adjacency matrix

I have networks of roughly 10K to 100K nodes, all connected. These nodes are typically grouped into communities that are strongly connected internally, with many edges between their members, and there are hubs, etc. Between the communities there are nodes with a few edges bridging / connecting the communities together. These datasets are stored as adjacency matrices.
I have tried spectral clustering (Ding et al., 2001), but it is really slow on large datasets and seems to stop working when there is a lot of ambiguity (bridges which are not the only bridging route to another cluster; other communities can act as alternative proxy routes).
I have tried some of the methods from Le Martelot, such as the Newman algorithm for modularity optimisation, but have not incorporated the stability optimisation functions in that effort (could that be crucial?). On synthetic datasets where the clusters are created from random graphs (ER graphs) the methods work, but on real ones, where there is a nested hierarchy, the results are scattered. Using a standalone visualization application/tool, the bridges are evident, though.
What methods would you recommend/advise to try? I am using MATLAB.
What do you want to do, exactly: detect communities, or bridges between them? Those are two different problems. Once you have the communities, it is straightforward to identify the edges connecting nodes from two distinct communities. So, I guess you want to detect communities.
There are actually thousands of methods for this purpose, some of them implemented in Matlab, such as the one you cite, or the generalized Louvain algorithm (also based on modularity optimization). However, most of them are available as C or C++ programs instead, such as InfoMap (based on a data-compression paradigm), WalkTrap (clustering using a random-walk-based distance), Markov Cluster (which simulates a propagation mechanism), and the list goes on...
Those tools formalize the notion of community structure more or less differently, potentially leading to different (estimated) community structures when applied to the same network. And of course, different communities mean different bridges, too. So the question is rather how to pick the appropriate method for your data. You seem to have a priori knowledge regarding the networks you are studying, so you should use that to make your choice (rather than the programming language). For instance, even if you don't state it explicitly, you seem to be looking for a hierarchical community structure: not all tools are able to detect this kind of structure. Similarly, if you think one node can belong to several communities at the same time, then you should consider looking for overlapping communities, for instance using CFinder (based on clique percolation).
I'd advise you to have a look at this excellent review of community detection; you might find some information there that helps you pick a method: Community Detection in Graphs. Also, from a programming point of view, I'd advise you to play with the igraph library (available for C, R, and Python): it contains several standard community detection tools. You can try them on your data and see what you get.
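As a sketch of that last suggestion with python-igraph: detect communities with the multilevel (Louvain-style) modularity method, then read the bridges off as the edges whose endpoints fall in different communities. The edge-list file name is an assumption about how your data is stored; you could equally build the graph from an adjacency matrix via Graph.Adjacency:

    # Communities first, then bridges as the community-crossing edges.
    import igraph as ig

    g = ig.Graph.Read_Edgelist("network.txt", directed=False)  # hypothetical file
    communities = g.community_multilevel()  # modularity optimisation

    bridges = [e.tuple for e, crosses in zip(g.es, communities.crossing())
               if crosses]
    print(f"{len(communities)} communities, {len(bridges)} bridging edges")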

ELKI implementation of OPTICS clustering algorithm detects only one cluster

I'm having an issue with the OPTICS implementation in the ELKI environment. I have used the same data with the DBSCAN implementation and it worked like a charm. Probably I'm missing something with the parameters, but I can't figure it out; everything seems to be right.
The data is a simple 300×2 matrix, consisting of 3 clusters with 100 points in each.
DBSCAN result:
Clustering result of DBSCAN
MinPts = 10, Eps = 1
OPTICS result:
Clustering result of OPTICS
MinPts = 10
You apparently already found the solution yourself, but here is the long story:
The OPTICS class in ELKI only computes the cluster order / reachability diagram.
In order to extract clusters, you have different choices, one of which (the one from the original OPTICS publication) is available in ELKI.
So in order to extract clusters in ELKI, you need to use the OPTICSXi algorithm, which will in turn use either OPTICS or the index based DeLiClu to compute the cluster order.
The reason this is split into two parts in ELKI is probably so that you can, on the one hand, implement a different logic for extracting the clusters and, on the other hand, plug in different methods such as DeLiClu for computing the cluster order. That aligns well with the modular architecture of ELKI.
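For comparison, the same two-stage design (compute the cluster order, then extract clusters with the xi method) also exists in scikit-learn's OPTICS. This is not ELKI's API, just the concept in another library, with random stand-in data for the 300×2 matrix:

    # Cluster order plus xi extraction, selected via cluster_method="xi".
    import numpy as np
    from sklearn.cluster import OPTICS

    X = np.random.rand(300, 2)  # stand-in for the 300x2 matrix
    opt = OPTICS(min_samples=10, cluster_method="xi", xi=0.05).fit(X)

    print(np.unique(opt.labels_))  # -1 marks noise points

In ELKI, running the OPTICS class alone stops after the cluster-order stage, which is why the output looked like a single cluster.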
IIRC there is at least one more method (apparently not yet in ELKI) that extracts clusters by looking for local maxima, then extending them horizontally until they hit the end of the valley. And there was a different one that used "inflexion points" of the plot.
@AnonyMousse pretty much put it right. I just can't upvote or comment yet.
We hope to have some students contribute the other cluster extraction methods as small student projects over time. They are not essential for our research, but they are good tasks for students that want to learn about ELKI to get started.
ELKI is a fast-moving project, and it lives from community contributions. We would be happy to see you contribute some code to it. We know that the codebase is not easy to get started with: it is fairly large, and the generality of the implementation and the support for index structures make it a bit hard to get into. We try to add tutorials to help you get started. And once you are used to it, you will actually benefit from the architecture: your algorithms get the benefits of indexing and arbitrary distance functions, whereas if you implemented from scratch, you would likely support only Euclidean distance and no index acceleration.
Seeing that you struggled with OPTICS, I will try to write an OPTICS tutorial in the new year. In particular, OPTICS can benefit a lot from using an appropriate index structure.