Delineating training samples on a mesh for CGAL classification

Is there a mechanism or procedure in CGAL to delineate training samples on a mesh for classification? I note that it is documented in the demo files, but not how the training labels were generated
(e.g. using the ETHZ Random Forest classifier).
Using the Polyhedron_3 demo I can do this for point sets, but it appears I can't for a mesh.

For a mesh, you can do it similarly in the demo, but you have to use the mesh selection plugin (in facet mode) instead of the point set selection one. Apart from that, it's all the same: you just need to select facets and add them to the training set.
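For completeness, if you wanted to do the same bookkeeping programmatically rather than through the demo, it boils down to one label index per face. A rough sketch (the helper name is my own, assuming a CGAL::Surface_mesh; the resulting vector is the kind of ground truth you would later hand to something like the ETHZ random forest classifier's train() call):

#include <CGAL/Simple_cartesian.h>
#include <CGAL/Surface_mesh.h>
#include <set>
#include <vector>

typedef CGAL::Simple_cartesian<double>      Kernel;
typedef CGAL::Surface_mesh<Kernel::Point_3> Mesh;
typedef Mesh::Face_index                    Face_index;

// Record the label of every selected facet; unselected facets stay at -1,
// which is what the CGAL classification examples use for "unlabelled" items.
void add_selection_to_training_set(const Mesh& mesh,
                                   const std::set<Face_index>& selected_faces,
                                   int label_index,            // e.g. 0 = ground, 1 = vegetation
                                   std::vector<int>& ground_truth)
{
  if (ground_truth.size() != mesh.number_of_faces())
    ground_truth.assign(mesh.number_of_faces(), -1);
  for (Face_index f : selected_faces)
    ground_truth[std::size_t(f)] = label_index;
}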

Related

Calculate subregion projection matrix (oblique projection) for an existing camera in Unity

I'm trying to create a custom projection setup in Unity that will allow me to render the perspective of an original camera, given a full-sized render target, onto three sub-textures, with each camera rendering only a section.
This is part of a large-scale projection system which runs the built project on multiple instances, passing an offset of the subregion to each and adjusting the render target size.
I've seen many examples that construct an oblique projection matrix from scratch based on simple parameters, but my requirements are a little more complex. Simply put, I need to take an existing projection matrix that may already have various projection properties set, and split it into subregions.
Is there a way to achieve a subregion projection matrix given an existing one and offset / crop parameters?
Here's a rough attempt at visualizing the problem:
Thank you!
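For what it's worth, a crop like this is usually expressed by left-multiplying the existing projection matrix by a scale-and-offset matrix that maps the chosen sub-rectangle of NDC space back onto the full [-1, 1] range. A rough, language-neutral sketch of that math (my own, assuming the subregion is given as normalized viewport coordinates left/bottom/width/height in [0, 1]; in Unity you would build the equivalent Matrix4x4 and assign the product to camera.projectionMatrix):

#include <array>

// 4x4 matrix in row-major order, element (r, c) stored at m[r * 4 + c].
typedef std::array<float, 16> Mat4;

// Scale-and-offset matrix that remaps the part of NDC covering the normalized
// viewport region [left, left+width] x [bottom, bottom+height] onto [-1, 1].
Mat4 crop_matrix(float left, float bottom, float width, float height)
{
    Mat4 c{};
    c[0]  = 1.0f / width;                               // scale x
    c[3]  = (1.0f - 2.0f * left - width) / width;       // offset x
    c[5]  = 1.0f / height;                              // scale y
    c[7]  = (1.0f - 2.0f * bottom - height) / height;   // offset y
    c[10] = 1.0f;                                       // z passes through unchanged
    c[15] = 1.0f;                                       // w passes through unchanged
    return c;
}

// The sub-region projection is then crop * original_projection
// (crop applied after the original projection).
Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i * 4 + j] += a[i * 4 + k] * b[k * 4 + j];
    return r;
}

Because the z and w rows are left untouched, depth behaviour and any other properties baked into the original projection matrix are preserved; only x and y are rescaled to the chosen subregion.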

What are "Activations", "Activation Gradients", "Weights" and "Weight Gradients" in Convolutional Neural Networks?

I've just finished reading the notes for Stanford's CS231n on CNNs and there is a link to a live demo; however, I am unsure what "Activations", "Activation Gradients", "Weights" and "Weight Gradients" refer to in the demo. The screenshots below have been copied from the demo.
Confusion point 1
I'm first confused by what "activations" refers to for the input layer. Based on the notes, I thought the activation layer referred to the ReLU layer in a CNN, which essentially tells the CNN which neurons should be lit up (using the ReLU function). I'm not sure how that relates to the input layer shown below. Furthermore, why are there two images displayed? The first image seems to be the image that is provided to the CNN, but I'm unable to work out what the second image is displaying.
Confusion point 2
I'm unsure what "activations" and "activation gradients" are displaying here, for the same reason as above. I think the "weights" show what the 16 filters in the convolution layer look like, but I'm not sure what "Weight Gradients" is supposed to be showing.
Confusion point 3
I think I understand what the "activations" refer to in the ReLU layers. They display the output images of all 16 filters after the ReLU function has been applied to every value (pixel) of each output image, which is why each of the 16 images contains pixels that are either black (not activated) or some shade of white (activated). However, I don't understand what "activation gradients" refers to.
Confusion point 4
I also don't understand what "activation gradients" is referring to here.
I'm hoping that by understanding this demo, I'll understand CNNs a little better.
This question is similar to this one, but not quite the same. Also, here's a link to the ConvNetJS example code with comments (and here's a link to the full documentation). You can also look at the code at the top of the demo page itself.
An activation function is a function that takes some input and outputs a value depending on whether that input reaches some "threshold" (the threshold is specific to each activation function). This comes from how neurons work: they take some electrical input and only fire if it reaches some threshold.
Confusion Point 1: The first set of images shows the raw input image (the colored image on the left); the image on the right is the output after going through the activation functions. You shouldn't really expect to be able to interpret the second image, because it has been through non-linear, seemingly random transformations on its way through the network.
Confusion Point 2: Similar to the previous point, the "activations" are the functions the image pixel information is passed into. A gradient is essentially the slope of the activation function. It appears sparser (i.e., colors show up only in certain places) because it shows the possible areas of the image that each node is focusing on. For example, the 6th image in the first row has some color in the bottom-left corner; this may indicate a large change in the activation, suggesting something interesting in that area. This article may clear up some confusion about weights and activation functions, and this article has some really good visuals of what each step is doing.
Confusion Point 3: This confused me at first, because if you think about a ReLU function you will see that it has a slope of one for positive x and 0 everywhere else, so taking the gradient (or slope) of the activation function (ReLU in this case) doesn't seem to make sense. The "max activation" and "min activation" values do make sense for a ReLU: the minimum value will be zero and the max is whatever the maximum value is; that follows straight from the definition of a ReLU. To explain the gradient values, I suspect that some Gaussian noise and a bias term of 0.1 have been added to them. Edit: the gradient refers to the slope of the cost-weight curve shown below. The y-axis is the loss value, or the calculated error, for the weight values w on the x-axis.
Image source https://i.ytimg.com/vi/b4Vyma9wPHo/maxresdefault.jpg
Confusion Point 4: See above.
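To make the ReLU part of Confusion Point 3 concrete, here is a minimal sketch (my own code, not the demo's) of what a ReLU layer's forward and backward passes compute, i.e. its "activations" and "activation gradients":

#include <vector>

// ReLU forward pass: the "activations" are max(0, x) for every input value.
std::vector<float> relu_forward(const std::vector<float>& x)
{
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = x[i] > 0.0f ? x[i] : 0.0f;
    return out;
}

// ReLU backward pass: the "activation gradients" are the upstream gradients
// dL/dout passed through where the input was positive and zeroed elsewhere,
// because the slope of ReLU is 1 for x > 0 and 0 otherwise.
std::vector<float> relu_backward(const std::vector<float>& x,
                                 const std::vector<float>& grad_out)
{
    std::vector<float> grad_in(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        grad_in[i] = x[i] > 0.0f ? grad_out[i] : 0.0f;
    return grad_in;
}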
Confusion point 1: Looking at the code, it seems that for the input layer the "Activations" visualisation is the coloured input image in the first figure. The second figure doesn't really make sense, because the code is trying to display some gradient values but it's not clear where they come from.
// HACK to draw in color in input layer
if(i===0) {
  draw_activations_COLOR(activations_div, L.out_act, scale);        // the input image itself
  draw_activations_COLOR(activations_div, L.out_act, scale, true);  // drawn again with the 4th argument set, which switches to the gradient (.dw) values
}
Confusion point 2, 3 & 4:
Activations: the output of the layer.
Activation Gradients: the name is confusing, but it is basically the gradient of the loss with respect to the input of the current layer l. This is useful if you want to debug the autodiff algorithm.
Weights: only shown if the layer is a convolution; these are the different filters of the convolution.
Weight Gradients: the gradient of the loss with respect to the weights of the current layer l.
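To illustrate the last two items, here is a minimal 1D convolution sketch (my own code; the demo uses 2D convolutions, but the idea is the same) showing what the weights and the weight gradients are:

#include <vector>

// "Weights": the filter itself. Valid 1D convolution (really cross-correlation,
// as in most deep-learning code): out[i] = sum_k w[k] * x[i + k].
std::vector<float> conv1d_forward(const std::vector<float>& x,
                                  const std::vector<float>& w)
{
    std::vector<float> out(x.size() - w.size() + 1, 0.0f);
    for (std::size_t i = 0; i < out.size(); ++i)
        for (std::size_t k = 0; k < w.size(); ++k)
            out[i] += w[k] * x[i + k];
    return out;
}

// "Weight gradients": the gradient of the loss with respect to each filter
// weight, accumulated from the upstream gradients dL/dout:
// dL/dw[k] = sum_i dL/dout[i] * x[i + k].
std::vector<float> conv1d_weight_grad(const std::vector<float>& x,
                                      const std::vector<float>& grad_out,
                                      std::size_t filter_size)
{
    std::vector<float> grad_w(filter_size, 0.0f);
    for (std::size_t i = 0; i < grad_out.size(); ++i)
        for (std::size_t k = 0; k < filter_size; ++k)
            grad_w[k] += grad_out[i] * x[i + k];
    return grad_w;
}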
Confusion point 1
In a CNN, every convolutional layer has the job of detecting features. Imagine you want to detect a human face: the first layer will detect edges, the next layer might detect noses, and so on; towards the last layers, more and more complex features are detected. So for the first layer, what you see is what that layer detected in the image.
Confusion point 2
Looking at the fully connected layers, I think they are probably showing the gradients obtained during back-propagation, because in the fully connected layers you only get gray, black and similar colors.
Confusion point 3
There are no separate ReLU layers as such: after a convolution you apply the activation function, get another matrix, and pass it through the next layer. The colors you see are the output after the ReLU.
Confusion point 4
Same as above.
Please let me know if any point is unclear.

Extracting 2D surface from 3D STEP model

I'm trying to figure out a good way to programmatically generate contours describing a 2D surface from a 3D STEP model. The application is generating NC code for a laser-cutting program from a 3D model.
Note: it's easy enough to do this in a wide variety of CAD systems. I am writing software that needs to do it automatically.
For example, this (a STEP model):
Needs to become this (a vector file, like an SVG or a DXF):
Perhaps the most obvious way of tackling the problem is to parse the STEP model and run some kind of algorithm to detect planes and select the largest as the cut surface, then generate the contour. Not a simple task!
I've also considered using a pre-existing SDK to render the model with an orthographic camera, capture a high-resolution image, and then operate on it to generate the appropriate contours. This method would work, but it would be CPU-heavy, and its accuracy would be limited by the pixel resolution of the rendered image - not ideal.
This is perhaps a long shot, but does anyone have thoughts about this? Cheers!
I would use a CAD library to load the STEP file (not a CAD API), look for the planar face with the highest number of edge curves in its face loop, and transpose those curves onto the XY plane. Afterwards, finding the 2D geometry's min/max for centering etc. would be pretty easy.
Depending on the programming language you are using, I would search Google for "CAD control" or "CAD component" combined with "STEP import".
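To flesh out the approach above, here is a rough sketch of the planar-face search using Open CASCADE as the CAD library (my own choice for illustration, not something prescribed here; error handling and the final projection of the edge curves to 2D are omitted):

#include <STEPControl_Reader.hxx>
#include <TopExp_Explorer.hxx>
#include <TopoDS.hxx>
#include <TopoDS_Shape.hxx>
#include <TopoDS_Face.hxx>
#include <TopAbs_ShapeEnum.hxx>
#include <BRepAdaptor_Surface.hxx>
#include <GeomAbs_SurfaceType.hxx>

// Load a STEP file and return the planar face with the most edges,
// which is a reasonable guess for the cut surface of a laser-cut part.
TopoDS_Face largest_planar_face(const char* path)
{
    STEPControl_Reader reader;
    reader.ReadFile(path);
    reader.TransferRoots();
    TopoDS_Shape shape = reader.OneShape();

    TopoDS_Face best;
    int best_edge_count = -1;
    for (TopExp_Explorer faces(shape, TopAbs_FACE); faces.More(); faces.Next())
    {
        TopoDS_Face face = TopoDS::Face(faces.Current());
        BRepAdaptor_Surface surface(face);
        if (surface.GetType() != GeomAbs_Plane)
            continue;                              // only planar faces are candidates

        int edge_count = 0;
        for (TopExp_Explorer edges(face, TopAbs_EDGE); edges.More(); edges.Next())
            ++edge_count;

        if (edge_count > best_edge_count)
        {
            best_edge_count = edge_count;
            best = face;
        }
    }
    return best;    // its edge curves can then be transformed onto the XY plane
}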

Matlab 3D reconstruction

I have to complete a multi-view 3D scanning project within the next two weeks, and I have searched through books, journals and websites on 3D reconstruction, including the MathWorks examples. I have written code to track matched points between two images and reconstruct them into a 3D plot. However, despite using the detectSURFFeatures() and extractFeatures() functions, some of the object points are still not tracked. How can I reconstruct those as well in my 3D model?
What you are looking for is called "dense reconstruction". The best way to do this is with calibrated cameras. Then you can rectify the images, compute disparity for every pixel (in theory), and then get 3D world coordinates for every pixel. Please check out this Stereo Calibration and Scene Reconstruction example.
The tracking approach you are using is fine but will only get sparse correspondences. The idea is that you would use the best of these to try to determine the difference in camera orientation between the two images. You can then use the camera orientation to get better matches and ultimately to produce a dense match which you can use to produce a depth image.
Tracking every point in an image from frame to frame is hard (it's called scene flow), and you won't achieve it by identifying individual features (such as SURF, ORB, FREAK, SIFT, etc.), because these features are by definition "special" in that they can be clearly identified between images.
If you have access to the Computer Vision Toolbox of Matlab you could use their matching functions.
You can start, for example, by checking out this article about disparity and the related MATLAB functions.
In addition, you can read about different matching techniques such as block matching, semi-global block matching and global optimization procedures, just to name a few keywords. But be aware that stereo matching is a huge topic.
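As a concrete illustration of that disparity pipeline, here is a rough sketch using OpenCV's semi-global block matching (my own choice of library; the MATLAB Computer Vision Toolbox mentioned above has equivalent rectification, disparity and reconstruction functions):

#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>

// Dense reconstruction sketch: assumes the two input images are already
// rectified and that Q is the 4x4 disparity-to-depth matrix obtained from
// stereo calibration (cv::stereoRectify fills it in).
cv::Mat dense_point_cloud(const cv::Mat& left_rectified,
                          const cv::Mat& right_rectified,
                          const cv::Mat& Q)
{
    cv::Mat left_gray, right_gray;
    cv::cvtColor(left_rectified,  left_gray,  cv::COLOR_BGR2GRAY);
    cv::cvtColor(right_rectified, right_gray, cv::COLOR_BGR2GRAY);

    // Semi-global block matching: one disparity value per pixel.
    cv::Ptr<cv::StereoSGBM> matcher = cv::StereoSGBM::create(
        /*minDisparity=*/0, /*numDisparities=*/128, /*blockSize=*/5);
    cv::Mat disparity16;
    matcher->compute(left_gray, right_gray, disparity16);

    // SGBM returns fixed-point disparities scaled by 16.
    cv::Mat disparity;
    disparity16.convertTo(disparity, CV_32F, 1.0 / 16.0);

    // One 3D point per pixel, in the coordinate frame of the left camera.
    cv::Mat xyz;
    cv::reprojectImageTo3D(disparity, xyz, Q);
    return xyz;
}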

Alternative to default OpenGL ES lines (3D)?

I'm currently trying to implement a silhouette algorithm in my project (using OpenGL ES; it's for mobile devices, primarily iPhone at the moment). One of the requirements is that a set of 3D lines be drawn. The issue with the default OpenGL lines is that they don't connect nicely at an angle when they are thick (gaps appear). Other subtle artifacts are also evident, which detract from the visual appeal of the lines.
Now, I have looked into using some sort of quad strip as an alternative to this. However, drawing a quad strip in screen space requires some sort of visibility detection - lines obscured in the actual 3D world should not be visible.
There are numerous approaches to this problem - e.g. quantitative invisibility. But such an approach, particularly on a mobile device with limited processing power, is difficult to implement efficiently, considering that raycasting needs to be employed. Looking around some more I found this paper, which describes a couple of methods for using z-buffer sampling to achieve such an effect. However, I'm not an expert in this area, and while I understand the theory behind the techniques to an extent, I'm not sure how to go about a practical implementation. I was wondering if someone could guide me here at a more technical level - on the OpenGL ES side of things. I'm also open to any suggestions regarding 3D line visibility in general.
The z-buffer technique will be too complex for iOS devices: it needs a heavy pixel shader and (IMHO) it will introduce some visual artifacts.
If your models are not complex you can find the geometric silhouette at runtime, for example by comparing the normals of polygons that share an edge: if the z values of the two normals in view space have different signs (one normal points towards the camera and the other points away from it), then that edge should be used for the silhouette.
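A minimal sketch of that per-edge test (my own code, assuming the two face normals have already been transformed into view space):

// An edge is a silhouette edge when one adjacent face points towards the camera
// and the other points away. In view space the camera looks along -z, so it is
// enough to compare the signs of the two face normals' z components.
// (With a perspective camera you can compare the signs of dot(normal, view vector)
// instead, as in the vertex-shader solution further down.)
bool is_silhouette_edge(float normal_a_view_z, float normal_b_view_z)
{
    return (normal_a_view_z > 0.0f) != (normal_b_view_z > 0.0f);
}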
Another approach is more "FPS friendly": keep an extruded version of your model, render the extruded model first in the silhouette color (without textures and lighting), and then render the normal model over it. You will need more memory for the vertices, but no real-time computation.
PS: in all the games I have looked at, the silhouettes were geometric.
I have worked out a solution that works nicely on an iPhone 4S (not tested on any other devices). It builds on the idea of rendering world-space quads, and does the silhouette detection all on the GPU. It works along these lines (pun not intended):
We generate edge information. This consists of a list of edges ("lines") in the mesh; with each edge we associate the two normals of the triangles on either side of it.
This is processed into a set of quads that are uploaded to the GPU - each quad represents an edge. Each vertex of each quad carries three extra attributes (vec3s): the edge direction vector and the two neighboring triangle normals. All quads are passed without "thickness" - i.e. the vertices at either end are in the same position. However, the edge direction vector is negated for one of the two vertices at each position, which means they will extrude in opposite directions to form a quad when required.
We determine whether a vertex is part of a visible edge in the vertex shader by performing two dot products between each tri norm and the view vector and checking if they have opposite signs. (see standard silhouette algorithms around the net for details)
For vertices that are part of visible edges, we take the cross product of the edge direction vector with the view vector to get a screen-oriented "extrusion" vector. We add this vector to the vertex, but divided by the w value of the projected vertex in order to create a constant thickness quad.
This does not directly resolve the gaps that can appear between neighboring edges, but it is far more flexible when it comes to combating them. One solution may involve bridging the vertices between sharply angled lines with another quad, which I am exploring at the moment.
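For reference, here is a CPU-side sketch of the per-vertex computation described above (the real version runs in the vertex shader; the vector type and names are my own, and the thickness scaling is one interpretation of the last step):

#include <cmath>

struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b)
{ return a.x * b.x + a.y * b.y + a.z * b.z; }

static Vec3 cross(const Vec3& a, const Vec3& b)
{ return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x }; }

// One edge-quad vertex: position, edge direction (negated on the paired vertex),
// and the normals of the two triangles sharing the edge.
struct EdgeVertex
{
    Vec3 position;
    Vec3 edge_dir;
    Vec3 normal_a;
    Vec3 normal_b;
};

// Returns the offset to add to the vertex, or (0,0,0) if the edge is not on the
// silhouette. projected_w is the w of the projected vertex; dividing by it keeps
// the line thickness roughly constant in screen space.
Vec3 silhouette_extrusion(const EdgeVertex& v, const Vec3& view_vector,
                          float thickness, float projected_w)
{
    float a = dot(v.normal_a, view_vector);
    float b = dot(v.normal_b, view_vector);
    if ((a > 0.0f) == (b > 0.0f))
        return { 0.0f, 0.0f, 0.0f };           // both faces agree: not a silhouette edge

    Vec3 e = cross(v.edge_dir, view_vector);    // screen-oriented extrusion direction
    float len = std::sqrt(dot(e, e));
    if (len == 0.0f)
        return { 0.0f, 0.0f, 0.0f };
    float s = thickness / (len * projected_w);  // divide by w for constant thickness
    return { e.x * s, e.y * s, e.z * s };
}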