MATLAB: Rotating an Image in the Frequency Domain [duplicate] - matlab

I've heard that it should be possible to do a lossless rotation of a JPEG image. That means doing the rotation in the frequency domain, without an IDCT. I've tried to google it but haven't found anything. Could someone shed some light on this?
What I mean by lossless is that I don't lose any additional information in the rotation. And of course that's probably only possible when rotating by multiples of 90 degrees.

You do not need to IDCT an image to rotate it losslessly (note that lossless rotation for raster images is only possible for angles that are multiples of 90 degrees).
The following steps achieve a transposition of the image in the DCT domain (this works because the 2-D DCT is separable: DCT(Xᵀ) = DCT(X)ᵀ, so transposing a block's coefficients transposes its pixels):
transpose the elements of each DCT block
transpose the positions of each DCT block
I'm going to assume you can already do the following:
Grab the raw DCT coefficients from the JPEG image (if not, see here)
Write the coefficients back to the file (if you want to save the rotated image)
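In case that reference is lost: if you are using libjpeg, it exposes the quantized coefficients directly. Here is a rough, untested sketch of the first point (error handling omitted, and the function name is just illustrative):
#include <cstdio>
#include <jpeglib.h>

void readDctCoefficients(const char* path)
{
    jpeg_decompress_struct cinfo;
    jpeg_error_mgr jerr;
    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_decompress(&cinfo);

    FILE* f = fopen(path, "rb");
    jpeg_stdio_src(&cinfo, f);
    jpeg_read_header(&cinfo, TRUE);

    // Instead of jpeg_start_decompress(), ask for the coefficient arrays:
    // one virtual array of 8x8 JBLOCKs per color component.
    jvirt_barray_ptr* coefArrays = jpeg_read_coefficients(&cinfo);

    // Access the first row of blocks of the first component.
    JBLOCKARRAY blocks = (*cinfo.mem->access_virt_barray)(
        (j_common_ptr)&cinfo, coefArrays[0], 0, 1, FALSE);
    // blocks[0][b][k] is the k-th of the 64 coefficients of block b in this row.

    jpeg_finish_decompress(&cinfo);
    jpeg_destroy_decompress(&cinfo);
    fclose(f);
}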
I can't show you the full code, because it's quite involved, but here's the bit where I IDCT the image (note the IDCT is for display purposes only):
Size s = coeff.size();
Mat result = cv::Mat::zeros(s.height, s.width, CV_8UC1);
for (int i = 0; i < s.height - DCTSIZE + 1; i += DCTSIZE)
    for (int j = 0; j < s.width - DCTSIZE + 1; j += DCTSIZE)
    {
        Rect rect = Rect(j, i, DCTSIZE, DCTSIZE);
        Mat dct_block(coeff, rect); // view of one DCTSIZE x DCTSIZE block
        idct_step(dct_block, i/DCTSIZE, j/DCTSIZE, result);
    }
This is the image that is shown:
Nothing fancy is happening here -- this is just the original image.
Now, here's the code that implements both the transposition steps I mentioned above:
Size s = coeff.size();
Mat result = cv::Mat::zeros(s.height, s.width, CV_8UC1);
for (int i = 0; i < s.height - DCTSIZE + 1; i += DCTSIZE)
    for (int j = 0; j < s.width - DCTSIZE + 1; j += DCTSIZE)
    {
        Rect rect = Rect(j, i, DCTSIZE, DCTSIZE);
        Mat dct_block(coeff, rect);
        Mat dct_bt(cv::Size(DCTSIZE, DCTSIZE), coeff.type());
        cv::transpose(dct_block, dct_bt);                // first transposition: within the block
        idct_step(dct_bt, j/DCTSIZE, i/DCTSIZE, result); // second transposition: swap i and j
    }
This is the resulting image:
You can see that the image is now transposed. To achieve proper rotation, you need to combine reflection with transposition.
EDIT
Sorry, I forgot that reflection is also not trivial. It also consists of two steps:
Obviously, reflect the positions of each DCT block in the required axis
Less obviously, invert (multiply by -1) every odd row OR column in each DCT block: if you're flipping vertically, invert the odd rows; if you're flipping horizontally, invert the odd columns. This works because the odd-frequency DCT basis functions are antisymmetric about the block center, so reflecting a block negates exactly those coefficients.
Here's code that performs a vertical reflection after the transposition.
for (int i = 0; i < s.height - DCTSIZE + 1; i += DCTSIZE)
    for (int j = 0; j < s.width - DCTSIZE + 1; j += DCTSIZE)
    {
        Rect rect = Rect(j, i, DCTSIZE, DCTSIZE);
        Mat dct_block(coeff, rect);
        Mat dct_bt(cv::Size(DCTSIZE, DCTSIZE), coeff.type());
        cv::transpose(dct_block, dct_bt);
        // This is the less obvious part of the reflection:
        // negate the odd rows of the block.
        Mat dct_flip = dct_bt.clone();
        for (int k = 1; k < DCTSIZE; k += 2)
            for (int l = 0; l < DCTSIZE; ++l)
                dct_flip.at<double>(k, l) *= -1;
        // This is the more obvious part of the reflection:
        // mirror the block's position in the vertical axis.
        idct_step(dct_flip, (s.width - j - DCTSIZE)/DCTSIZE, i/DCTSIZE, result);
    }
Here's the image you get:
You will note that this constitutes a rotation by 90 degrees counter-clockwise.
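For completeness, a clockwise rotation follows from the same two ingredients flipped to the other axis: negate the odd columns instead of the odd rows, and mirror the block positions horizontally instead of vertically. A minimal, untested sketch under the same assumptions as above (double coefficients, the same idct_step helper, and a result buffer sized for the transposed image if yours is not square):
for (int i = 0; i < s.height - DCTSIZE + 1; i += DCTSIZE)
    for (int j = 0; j < s.width - DCTSIZE + 1; j += DCTSIZE)
    {
        Rect rect = Rect(j, i, DCTSIZE, DCTSIZE);
        Mat dct_block(coeff, rect);
        Mat dct_bt(cv::Size(DCTSIZE, DCTSIZE), coeff.type());
        cv::transpose(dct_block, dct_bt);
        Mat dct_flip = dct_bt.clone();
        for (int k = 0; k < DCTSIZE; ++k)          // negate the odd columns this time
            for (int l = 1; l < DCTSIZE; l += 2)
                dct_flip.at<double>(k, l) *= -1;
        // Mirror the block's position along the horizontal axis of the
        // transposed image (whose width is s.height).
        idct_step(dct_flip, j/DCTSIZE, (s.height - i - DCTSIZE)/DCTSIZE, result);
    }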

Related

How to square crop a cameraImage in Flutter

I'm creating a scanner, and I needed to implement a square overlay on the camera preview. I take the image stream from the camera preview and send it to an API; now that I've added the square overlay, I need to square-crop the CameraImage taken from the camera preview before sending it to the API.
I only have the CameraImage, which is in YUV 420 format. How can I crop it programmatically?
I guess that for the full image you coded something like this:
Uint8List getBytes() {
  final WriteBuffer allBytes = WriteBuffer();
  for (final Plane plane in cameraImage.planes) {
    allBytes.putUint8List(plane.bytes);
  }
  return allBytes.done().buffer.asUint8List();
}
In fact, you are concatenating the data of the three YUV planes one after the other: all of Y, then all of U, then all of V.
As you can see on the Wikipedia page, plane Y has the same width and height as the image, while planes U and V use width/2 and height/2. For a 640x480 image, for example, that means 307,200 Y bytes followed by 76,800 U bytes and 76,800 V bytes.
If we go byte after byte that means that the code above is similar to the following code:
int divider = 1; // for the first plane: Y
for (final Plane plane in cameraImage.planes) {
  for (int i = 0; i < cameraImage.height ~/ divider; i++) {
    for (int j = 0; j < cameraImage.width ~/ divider; j++) {
      allBytes.putUint8(plane.bytes[j + i * cameraImage.width ~/ divider]);
    }
  }
  divider = 2; // for planes U and V
}
Now that you're here, I think you understand how to crop:
int divider = 1; // for the first plane: Y
for (final Plane plane in cameraImage.planes) {
  for (int i = cropTop ~/ divider; i < cropBottom ~/ divider; i++) {
    for (int j = cropLeft ~/ divider; j < cropRight ~/ divider; j++) {
      allBytes.putUint8(plane.bytes[j + i * cameraImage.width ~/ divider]);
    }
  }
  divider = 2; // for planes U and V
}
where the crop* variables are computed from the full image.
That's the theory: this piece of code does not take into account the camera orientation, possible edge effects with odd sizes (the crop bounds should be even so that they divide cleanly by 2 in the half-resolution U and V planes), or performance. But that's the general idea.
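If it helps to see the indexing independently of Flutter, here is the same cropping arithmetic as a small, standalone C++ sketch. It assumes tightly packed planes with no row padding (a real CameraImage may pad rows to plane.bytesPerRow, which this ignores), and the function name is just illustrative:
#include <cstdint>
#include <vector>

// Crop a YUV 4:2:0 image stored as three tightly packed planes:
// planes[0] = Y (width x height), planes[1] = U and planes[2] = V (width/2 x height/2).
std::vector<uint8_t> cropYuv420(const uint8_t* planes[3], int width,
                                int cropLeft, int cropTop, int cropRight, int cropBottom)
{
    std::vector<uint8_t> out;
    int divider = 1;                              // Y plane is full resolution
    for (int p = 0; p < 3; ++p) {
        int planeWidth = width / divider;
        for (int i = cropTop / divider; i < cropBottom / divider; ++i)
            for (int j = cropLeft / divider; j < cropRight / divider; ++j)
                out.push_back(planes[p][j + i * planeWidth]);
        divider = 2;                              // U and V planes are half resolution
    }
    return out;
}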

Finding area and centre of stones

I want to find the area and centre of these stones, but some of them cannot be found.
Here is my code:
I = imread('E:/2.png');
level = graythresh(I);
BW = im2bw(I,level);
se = strel('disk',2);
bw1 = imclose(BW,se);
bw1 = imfill(bw1,'holes');  % fill holes in the closed image, not the original BW
bwa = bwareaopen(bw1,25);
cc = bwconncomp(bwa)
stat = regionprops(cc,'centroid','Area');
ss = [stat.Area];
imshow(I); hold on;
for x = 1:numel(stat)
    plot(stat(x).Centroid(1),stat(x).Centroid(2),'wp','MarkerSize',6,'MarkerFaceColor','b');
end
figure, imshow(bwa)
figure, imshow(bwa)
The result is here:
And this is the black-and-white image:
Some of these stones cannot be separated.
Does anyone have an idea?
Erode the stones until you have separated them, find segments via connected components (e.g. findContours), take their centers, then apply flood fill, seeding the floods at those centers in the original BW image (before erosion) to gracefully define the segments. "Gracefully" means that a flood should not 'leak' into another (possibly connected) segment, since that segment will already have been filled with a different label. You may want to play with the parameters of floodFill to tune up your segmentation; I did not have time to do this.
// separate stones
Mat Ibw = imread("bw.png", 0);
imshow("bw", Ibw);
int w = Ibw.cols, h = Ibw.rows;
int ERODE_SZ = 20;
Mat kernel = getStructuringElement(cv::MORPH_RECT, Size(ERODE_SZ, ERODE_SZ));
Mat Ierode;
erode(Ibw, Ierode, kernel);
imshow("erode", Ierode); imwrite("erode.png", Ierode);
vector<vector<Point> > contours;
Mat hierarchy;
Mat Icc = Ierode.clone(); // findContours modifies its input, so work on a copy
findContours(Icc, contours, hierarchy, CV_RETR_LIST, CV_CHAIN_APPROX_NONE);
// find centers (the mean of each contour's points)
Mat Icenters = Ibw.clone();
int sz = contours.size();
vector<Point> centers(sz);
for (int i = 0; i < sz; ++i) {
    centers[i] = Point(0, 0);
    int n = contours[i].size();
    for (int j = 0; j < n; j++) {
        centers[i] += contours[i][j];
    }
    centers[i] *= 1.0/n;
    circle(Icenters, centers[i], 3, 100, 3);
}
imshow("centers", Icenters); imwrite("centers.png", Icenters);
// find segments: flood-fill the original image from each center
Mat Iseg = Ibw.clone();
RNG rng(0xFFFFFFFF);
for (int i = 0; i < sz; ++i) {
    floodFill(Iseg, centers[i], rng.uniform(100, 200));
    circle(Iseg, centers[i], 3, 0, 1);
}
imshow("seg", Iseg); imwrite("result.png", Iseg);
waitKey();

How to find Center of Mass for my entire binary image?

I'm interested in finding the (X,Y) coordinates of the center of mass of my whole, entire binary image, not the CoM of each component separately.
How can I do this efficiently?
I guess regionprops is the way, but I couldn't find the correct way to use it.
You can define all regions as a single region for regionprops
props = regionprops( double( BW ), 'Centroid' );
Depending on the data type of BW, regionprops decides whether it should label each connected component as a different region (logical input) or treat all non-zeros as a single region with several components (numeric input, which is why BW is cast to double above).
Alternatively, you can compute the centroid by yourself
[y x] = find( BW );
cent = [mean(x) mean(y)];
Just iterate over all the pixels and calculate the average of their X and Y coordinates:
void centerOfMass (int[][] image, int imageWidth, int imageHeight)
{
    int SumX = 0;
    int SumY = 0;
    int num = 0;
    for (int i = 0; i < imageWidth; i++)
    {
        for (int j = 0; j < imageHeight; j++)
        {
            if (image[i][j] == WHITE)
            {
                SumX = SumX + i;
                SumY = SumY + j;
                num = num + 1;
            }
        }
    }
    SumX = SumX / num;
    SumY = SumY / num;
    // The coordinate (SumX, SumY) is the center of the image mass
}
Extending this method to grayscale images in the range [0..255]: instead of
if (image[i][j] == WHITE)
{
    SumX = SumX + i;
    SumY = SumY + j;
    num = num + 1;
}
Use the following calculation
SumX = SumX + i*image[i][j];
SumY = SumY + j*image[i][j];
num = num+image[i][j];
In this case a pixel with value 100 has 100 times the weight of a dark pixel with value 1, so dark pixels contribute only a small fraction to the center-of-mass calculation.
Please note that in this case, if your image is large, you might hit a 32-bit integer overflow, so use long (64-bit) sumX, sumY variables instead of int.
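If you happen to be working with OpenCV, the same computation (including the intensity-weighted variant) is available through image moments. A minimal C++ sketch; the file name is just a placeholder:
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // Load the mask as a single-channel image; nonzero pixels count as mass.
    cv::Mat bw = cv::imread("mask.png", cv::IMREAD_GRAYSCALE);
    // binaryImage=true gives every nonzero pixel weight 1; pass false to
    // weight pixels by intensity, matching the grayscale extension above.
    cv::Moments m = cv::moments(bw, true);
    // Centroid = first-order moments divided by the zeroth-order moment.
    cv::Point2d center(m.m10 / m.m00, m.m01 / m.m00);
    std::cout << "center of mass: " << center << std::endl;
    return 0;
}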

Creating a circular mask for my graph

I'm plotting a square image, but since my camera views out of a circular construction, I want the image to look circular as well. To do this, I just want to create a mask for the image (basically create a matrix and multiply my data by the mask, so where I want to retain the image I multiply by 1, and where I want the image to go to black I multiply by 0).
I'm not sure of the best way to make a matrix that represents a circular opening, though. I just want every element within the circle to be a "1" and every element outside the circle to be a "0" so I can color my image accordingly. I was thinking of doing a for loop, but I was hoping there is a faster way. So... all I need is:
A matrix that is 1280x720
A circle with a diameter of 720, centered in the middle of the 1280x720 matrix (by this I mean all elements within the circle are "1" and all other elements are "0")
My attempt
mask = zeros(1280,720)
for i = 1:1280
    for j = 1:720
        if i + j > 640 && i + j < 1360
            mask(i,j) = 1;
        end
    end
end
Well, the above obviously doesn't work; I need to look at it a little more closely to form a better equation for determining when to add a 1 =P but ideally I would like to not use a for loop.
Thanks, let me know if anything is unclear!
@kol's answer looks correct. You can also do this with vectorized code using the meshgrid function:
width = 1280;
height = 720;
radius = 360;
centerW = width/2;
centerH = height/2;
[W,H] = meshgrid(1:width,1:height);
mask = ((W-centerW).^2 + (H-centerH).^2) < radius^2;
Here is a possible solution:
width = 160;
height = 120;
mask = zeros(width, height);
center_x = width / 2;
center_y = height / 2;
radius = min(width, height) / 2;
radius2 = radius ^ 2;
for i = 1 : width
    for j = 1 : height
        dx = i - center_x;
        dy = j - center_y;
        dx2 = dx ^ 2;
        dy2 = dy ^ 2;
        mask(i, j) = dx2 + dy2 <= radius2;
    end
end
picture = randn(width, height); % test image :)
masked_image = picture .* mask;
imagesc(masked_image);
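For what it's worth, if you ever need the same mask outside MATLAB, this is a near one-liner in OpenCV. A small C++ sketch under the original 1280x720, radius-360 setup (the input file name is just a placeholder):
#include <opencv2/opencv.hpp>

int main()
{
    // Build a 720x1280 single-channel mask: 255 inside the circle, 0 outside.
    cv::Mat mask = cv::Mat::zeros(720, 1280, CV_8UC1);
    cv::circle(mask, cv::Point(640, 360), 360, cv::Scalar(255), -1); // thickness -1 = filled
    // Copy only the pixels where the mask is nonzero; the rest stay black.
    cv::Mat image = cv::imread("frame.png"), masked;
    image.copyTo(masked, mask);
    cv::imwrite("masked.png", masked);
    return 0;
}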

Resilient backpropagation neural network - question about gradient

First I want to say that I'm really new to neural networks and I don't understand them very well ;)
I've made my first C# implementation of the backpropagation neural network. I've tested it with XOR and it looks like it works.
Now I would like change my implementation to use resilient backpropagation (Rprop - http://en.wikipedia.org/wiki/Rprop).
The definition says: "Rprop takes into account only the sign of the partial derivative over all patterns (not the magnitude), and acts independently on each "weight".
Could somebody tell me what the partial derivative over all patterns is? And how should I compute this partial derivative for a neuron in a hidden layer?
Thanks a lot
UPDATE:
My implementation is based on this Java code: www_.dia.fi.upm.es/~jamartin/downloads/bpnn.java
My backPropagate method looks like this:
public double backPropagate(double[] targets)
{
    double error, change;

    // calculate error terms for output
    double[] output_deltas = new double[outputsNumber];
    for (int k = 0; k < outputsNumber; k++)
    {
        error = targets[k] - activationsOutputs[k];
        output_deltas[k] = Dsigmoid(activationsOutputs[k]) * error;
    }

    // calculate error terms for hidden
    double[] hidden_deltas = new double[hiddenNumber];
    for (int j = 0; j < hiddenNumber; j++)
    {
        error = 0.0;
        for (int k = 0; k < outputsNumber; k++)
        {
            error = error + output_deltas[k] * weightsOutputs[j, k];
        }
        hidden_deltas[j] = Dsigmoid(activationsHidden[j]) * error;
    }

    // update output weights
    for (int j = 0; j < hiddenNumber; j++)
    {
        for (int k = 0; k < outputsNumber; k++)
        {
            change = output_deltas[k] * activationsHidden[j];
            weightsOutputs[j, k] = weightsOutputs[j, k] + learningRate * change + momentumFactor * lastChangeWeightsForMomentumOutpus[j, k];
            lastChangeWeightsForMomentumOutpus[j, k] = change;
        }
    }

    // update input weights
    for (int i = 0; i < inputsNumber; i++)
    {
        for (int j = 0; j < hiddenNumber; j++)
        {
            change = hidden_deltas[j] * activationsInputs[i];
            weightsInputs[i, j] = weightsInputs[i, j] + learningRate * change + momentumFactor * lastChangeWeightsForMomentumInputs[i, j];
            lastChangeWeightsForMomentumInputs[i, j] = change;
        }
    }

    // calculate error
    error = 0.0;
    for (int k = 0; k < outputsNumber; k++)
    {
        error = error + 0.5 * (targets[k] - activationsOutputs[k]) * (targets[k] - activationsOutputs[k]);
    }
    return error;
}
So can I use the change = hidden_deltas[j] * activationsInputs[i] value as the gradient (partial derivative) for checking the sign?
I think the "over all patterns" simply means "in every iteration"... take a look at the RPROP paper.
For the partial derivative: you've already implemented the normal backpropagation algorithm, which is a method for efficiently calculating the gradient. There you calculate the δ values for the single neurons, which are in fact the negative ∂E/∂w values, i.e. the partial derivatives of the global error as a function of the weights.
So instead of scaling these values by a learning rate, you keep a per-weight step size and multiply it by one of two constants (η+ or η−), depending on whether the sign of the gradient has changed.
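To make this concrete, here is a minimal, illustrative sketch of the per-weight Rprop update following the Riedmiller & Braun scheme (written in C++ for brevity; the constants are the usual values from the paper, and this variant simply skips the weight step when the sign flips rather than backtracking):
#include <algorithm>

// State kept per weight across training epochs.
struct RpropState {
    double stepSize = 0.1;   // current update value (delta)
    double prevGrad = 0.0;   // gradient from the previous epoch
};

const double etaPlus  = 1.2, etaMinus = 0.5;   // growth / shrink factors
const double deltaMax = 50.0, deltaMin = 1e-6; // step-size bounds

inline double sign(double x) { return (x > 0) - (x < 0); }

void rpropUpdate(double& weight, double grad, RpropState& s)
{
    if (s.prevGrad * grad > 0) {
        // Same sign as last epoch: accelerate and step against the gradient.
        s.stepSize = std::min(s.stepSize * etaPlus, deltaMax);
        weight -= sign(grad) * s.stepSize;
        s.prevGrad = grad;
    } else if (s.prevGrad * grad < 0) {
        // Sign flipped: we jumped over a minimum, so shrink the step
        // and skip this update.
        s.stepSize = std::max(s.stepSize * etaMinus, deltaMin);
        s.prevGrad = 0.0;
    } else {
        // One of the two gradients was zero: plain step, no adaptation.
        weight -= sign(grad) * s.stepSize;
        s.prevGrad = grad;
    }
}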
The following is an example of a part of an implementation of the RPROP training technique in the Encog Artificial Intelligence Library. It should give you an idea of how to proceed. I would recommend downloading the entire library, because it will be easier to go through the source code in an IDE rather than through the online svn interface.
http://code.google.com/p/encog-cs/source/browse/#svn/trunk/encog-core/encog-core-cs/Neural/Networks/Training/Propagation/Resilient
http://code.google.com/p/encog-cs/source/browse/#svn/trunk
Note the code is in C#, but it shouldn't be difficult to translate into another language.