Caffe: circular shift of zero-padded upscaled images for interpolation

I want to implement the super-resolution algorithm defined in https://arxiv.org/abs/1609.05158 using Caffe. There are TF implementations, but no Caffe implementation yet: https://github.com/tetrachrome/subpixel
To summarize the algorithm: I want to super-resolve an image by a factor of 3, and I want to do the upsampling at the end of the network rather than at the beginning. To do that, I will have 9 images (Batch x 9 x height x width) at the end of the network.
Then I want to pick one pixel from each image at the same coordinates and place the 9 pixels within a 3x3 square, producing a complete image of size (3*height) x (3*width).
1) Can I use deconvolution layer to upscale an image by 3, filling zeros in between and if so, how?
2) I am thinking of using slice layer to extract 9 images.
3) Is there a way to circularly shift some images to align them as seen in the image and if so, how?
4) Do I really need slice layer before circular shifting and eltwise summing OR can I do it in another way without needing slice layer: Can I circular shift channels separately and can I merge channels of images by summation?
5) Can this be done in a much easier way that I have not thought of?
I have asked quite a lot of questions; I hope that is not too many for one post.
Thank you in advance.
EDIT:
I want to implement this Tensorflow code in caffe:
import tensorflow as tf

def _phase_shift(I, r):
    bsize, a, b, c = I.get_shape().as_list()
    bsize = tf.shape(I)[0] # Handling Dimension(None) type for undefined batch dim
    X = tf.reshape(I, (bsize, a, b, r, r))
    X = tf.transpose(X, (0, 1, 2, 4, 3)) # bsize, a, b, 1, 1
    X = tf.split(1, a, X) # a, [bsize, b, r, r]
    X = tf.concat(2, [tf.squeeze(x, axis=1) for x in X]) # bsize, b, a*r, r
    X = tf.split(1, b, X) # b, [bsize, a*r, r]
    X = tf.concat(2, [tf.squeeze(x, axis=1) for x in X]) # bsize, a*r, b*r
    return tf.reshape(X, (bsize, a*r, b*r, 1))
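To pin down the index arithmetic behind questions 1) to 4), here is a minimal NumPy sketch (not Caffe code) of the rearrangement described above. It assumes the Caffe-style layout Batch x 9 x height x width and assumes that channel k = i*r + j lands at offset (i, j) inside each r x r block; that channel ordering is an assumption and may differ from the TF snippet. The names pixel_shuffle and shift_and_sum are made up for this illustration. The second function follows the "zero-fill, circularly shift, sum" route from the question and produces the same result as the reshape/transpose route.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange (N, r*r, H, W) into (N, 1, H*r, W*r) by reshape + transpose."""
    n, c, h, w = x.shape
    assert c == r * r, "expected r*r channels"
    x = x.reshape(n, r, r, h, w)        # split the channel axis into (i, j)
    x = x.transpose(0, 3, 1, 4, 2)      # reorder to N, H, i, W, j
    return x.reshape(n, 1, h * r, w * r)

def shift_and_sum(x, r):
    """Same result via zero-filled upsampling, a shift per channel, and a sum."""
    n, c, h, w = x.shape
    out = np.zeros((n, 1, h * r, w * r), dtype=x.dtype)
    for k in range(c):
        i, j = divmod(k, r)             # assumed mapping: channel k -> offset (i, j)
        up = np.zeros((n, h * r, w * r), dtype=x.dtype)
        up[:, ::r, ::r] = x[:, k]       # place pixels on a stride-r grid, zeros elsewhere
        out[:, 0] += np.roll(up, shift=(i, j), axis=(1, 2))
    return out

x = np.arange(2 * 9 * 4 * 5, dtype=np.float32).reshape(2, 9, 4, 5)
assert np.array_equal(pixel_shuffle(x, 3), shift_and_sum(x, 3))
```

Because every shift is smaller than r, the circular shift never wraps a non-zero pixel around the border, so in this sketch the slicing step is not strictly needed: the reshape/transpose version works on all channels at once.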

Related

Reference elements in matrices in Julia while assigning values in other matrices

I have a vector X whose elements are zeros and ones. I want to create another vector Z of the same size as X where each element of Z is 0 if the corresponding element in X is zero, otherwise it is a random draw from a uniform distribution. In Matlab I can easily do this by:
n = 1000;
X = randi([0, 1], [1, n]);
Z(X) = rand(); % Here, wherever X takes a value of 1, the element of Z is a draw from a uniform distribution.
I want to implement this in Julia. Is there a cleaner way of doing this than using if conditionals? Thanks!!
Here's one way to do it:
julia> n = 1000;
julia> x = rand(Bool, n);
julia> z = zeros(n);
julia> using Distributions
julia> z[x] .= rand.(Uniform(-10, 10));
julia> z
1000-element Vector{Float64}:
-2.6946644136672004
0.0
0.0
⋮
You can adjust the parameters of the Uniform distribution to what you need, or leave that argument out if the default [0, 1) range is what you need.
The line z[x] .= rand.(Uniform(-10, 10)) uses Julia's logical indexing (same as MATLAB) and broadcasting features - for every x value that is true, the rand call is made and the result assigned to that element of z.
The advantage of using the broadcast (compared to, e.g., creating rand(Uniform(-10, 10), count(x)) and assigning that to z[x]) is that the values are assigned directly in-place to their destination in z, so no unnecessary extra memory is allocated (as mentioned by @DNF in the comments).
First of all, your Matlab code doesn't work in Matlab, for two reasons. Firstly, logical indices must be boolean (logical) values; they cannot be numeric 0 and 1. Secondly, Z(X) = rand() will draw only a single random number and assign it to all the corresponding elements of Z.
Instead, you may want something like this Matlab code:
X = rand(1, n) > 0.5
Z(X) = rand(sum(X), 1)
In Julia you could do
X = rand(Bool, n)
Z = float.(X) # you have to initialize Z
Z[X] .= rand.()
Edit: Here's an alternative with a comprehension, where you don't need to initialize Z:
X = rand(Bool, n)
Z = [x ? rand() : 0.0 for x in X]
Technically, what you are sampling from here is a left-censored uniform distribution -- equivalent to the mixture of a Dirac distribution at 0 and Uniform(0, 1). The next release of Distributions.jl will have an implementation for censored, which will remove the need to do any fancy assignment at all:
Z = rand(censored(Uniform(-1.0, 1.0), lower=0.0), N)
where the extent to the left is chosen so that the mixture components have equal weight.
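For comparison, the same logical-indexing idea can be sketched in Python with NumPy. This is only an illustration of the two behaviours discussed above (one draw per selected element versus a single draw copied everywhere); the seed and the variable names are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)                 # arbitrary seed for the sketch
n = 1000
x = rng.integers(0, 2, size=n).astype(bool)    # boolean mask of zeros and ones

# One independent draw per True entry of the mask.
z = np.zeros(n)
z[x] = rng.uniform(-10, 10, size=x.sum())

# A single scalar draw is broadcast to every True entry instead,
# which mirrors what Z(X) = rand() does in Matlab.
same = np.zeros(n)
same[x] = rng.uniform(-10, 10)
```

Entries where the mask is False stay at 0.0 in both arrays.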

What are the parameters in this overload of AddImage in the IText PDF library?

Sorry if this question is too specific to a particular library, however it seems popular enough that somebody might know the answer to this. The API documentation for AddImage does not say what each of the arguments are:
public PdfXObject addImage(ImageData image,
float a,
float b,
float c,
float d,
float e,
float f)
Creates Image XObject from image and adds it to canvas (as Image XObject).
Parameters:
image - the PdfImageXObject object
a - an element of the transformation matrix
b - an element of the transformation matrix
c - an element of the transformation matrix
d - an element of the transformation matrix
e - an element of the transformation matrix
f - an element of the transformation matrix
Obviously two are x/y coords, and presumably 2 are height and width, but from the "legacy" code I'm working with, it's not apparent which is which, and I can't think what the other two floats could be.
Those six values are elements of a matrix that has three rows and three columns:
[ a b 0 ]
[ c d 0 ]
[ e f 1 ]
You can use this matrix to express a transformation in a two-dimensional system:
[ x' y' 1 ] = [ x y 1 ] * [ a b 0 ; c d 0 ; e f 1 ]
Carrying out this multiplication results in this:
x' = a * x + c * y + e
y' = b * x + d * y + f
The third column in the matrix is fixed: you’re working in two dimensions, so you don’t need to calculate a new z coordinate.
When studying analytical geometry in high school, you’ve probably learned how to apply transformations to objects. In PDF, we use a slightly different approach: instead of transforming objects, we transform the coordinate system.
Nevertheless, you can use your high school knowledge of analytical geometry to understand what the different values are about. For instance:
e and f are the values you will need for the translation of the object, so if you want to add the image at the position x = 36; y = 36, then you will need e = 36; f = 36.
a and d are the values you will need for the scaling in case you don't have any rotation. For instance: if you want the image to have a width of 100 user units and a height of 50 user units, you will need a = 100; b = 0; c = 0; d = 50.
So to add an image of 100 by 50 user units of which the lower-left corner coincides with the coordinate (36, 36), you'd need:
cb.addImage(img, 100, 0, 0, 50, 36, 36);
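To see what those six numbers do, here is a small Python sketch (not iText code) that applies the formulas for x' and y' to the corners of a unit square. It assumes the image occupies the square from (0, 0) to (1, 1) before the matrix is applied; the function name transform is made up for this illustration.

```python
def transform(point, a, b, c, d, e, f):
    """Apply x' = a*x + c*y + e and y' = b*x + d*y + f to one point."""
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)

# Same six values as in the addImage call: scale to 100 x 50, move to (36, 36).
matrix = (100, 0, 0, 50, 36, 36)
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
corners = [transform(p, *matrix) for p in unit_square]
# corners == [(36, 36), (136, 36), (136, 86), (36, 86)]
```

The lower-left corner ends up at (36, 36) and the opposite corner at (136, 86), i.e. a rectangle of 100 by 50 user units.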
You can use the following formulas to compute the values for a, b, c, d, e, and f. For example, if you want to combine a translation (dX, dY), a scaling (sX, sY), and a rotation ϕ:
a = sX * cos(ϕ);
b = sY * sin(ϕ);
c = sX * -sin(ϕ);
d = sY * cos(ϕ);
e = dX;
f = dY;
These are all things you can rediscover if you dig into your high school books. It's simple Math; the stuff I learned at school at the age of 17 ;-)

MATLAB use custom function with pdist

I have a custom function to calculate the weight between two pixels (that represent nodes on a graph) of an image
function [weight] = getWeight(a,b,img, r, L)
ac = num2cell(a);
bc = num2cell(b);
imgint1 = img(sub2ind(size(img),ac{:}));
imgint2 = img(sub2ind(size(img),bc{:}));
weight = (sum((a - b) .^ 2) + (r^2/L) * abs(imgint2 - imgint1)) / (2*r^2);
where a = [x1 y1] and b = [x2 y2] are coordinates that represent pixels of the image, img is a gray-scale image, and r and L are constants. Within the function, imgint1 and imgint2 are the gray intensities of the pixels at a and b.
I need to calculate the weight among set of points of the image.
Instead of two nested loops, I want to use the pdist function because it is WAY FASTER!
For instance, let nodes be a set of pixel coordinates
nodes =
1 1
1 2
2 1
2 2
And img = [ 128 254; 0 255], r = 3, L = 255
In order to get these weights, I am using an intermediate function.
function [weight] = fxIntermediate(a,b, img, r, L)
    weight = bsxfun(@(a,b) getWeight(a,b,img,r,L), a, b);
In order to finally get the whole set of weights
distNodes = pdist(nodes,@(XI,XJ) fxIntermediate(XI,XJ,img,r,L));
But it always gives me an error
Error using pdist (line 373)
Error evaluating distance function '@(XI,XJ)fxIntermediate(XI,XJ,img,r,L)'.
Error in obtenerMatriz (line 27)
distNodes = pdist(nodes,@(XI,XJ) fxIntermediate(XI,XJ,img,r,L));
Caused by:
Error using bsxfun
Invalid output dimensions.
EDIT 1
This is a short example of my code that should work, but it produces the error mentioned above. If you copy/paste the code into MATLAB and run it, you will see the error
function [adjacencyMatrix] = problem
    img = [123, 229; 0, 45]; % 2x2 Image as example
    nodes = [1 1; 1 2; 2 2]; % I want to calculate distance function getWeight()
                             % between pixels img(1,1), img(1,2), img(2,2)
    r = 3; % r is a constant, doesn't matter its meaning
    L = 255; % L is a constant, doesn't matter its meaning
    distNodes = pdist(nodes,@(XI,XJ) fxIntermediate(XI,XJ,img,r,L));
    adjacencyMatrix = squareform(distNodes );
end
function [weight] = fxIntermediate(a,b, img, r, L)
    weight = bsxfun(@(a,b) getWeight(a,b,img,r,L), a, b);
end
function [weight] = getWeight(a,b,img, r, L)
    ac = num2cell(a);
    bc = num2cell(b);
    imgint1 = img(sub2ind(size(img),ac{:}));
    imgint2 = img(sub2ind(size(img),bc{:}));
    weight = (sum((a - b) .^ 2) + (r^2/L) * abs(imgint2 - imgint1)) / (2*r^2);
end
My goal is to obtain an adjacency matrix that represents the distance between pixels. For the above example, the desired adjacency matrix is:
adjacencyMatrix =
0 0.2634 0.2641
0.2634 0 0.4163
0.2641 0.4163 0
The problem is that you are neither fulfilling the expectations for a function to be used with pdist, nor those for a function to be used with bsxfun.
– From the documentation of pdist:
A distance function must be of form
d2 = distfun(XI,XJ)
taking as arguments a 1-by-n vector XI, corresponding to a single row
of X, and an m2-by-n matrix XJ, corresponding to multiple rows of X.
distfun must accept a matrix XJ with an arbitrary number of rows.
distfun must return an m2-by-1 vector of distances d2, whose kth
element is the distance between XI and XJ(k,:).
However, by using bsxfun in fxIntermediate, this function always returns a matrix of values whose size is the larger of the sizes of the two inputs.
– From the documentation of bsxfun:
A binary element-wise function of the form C = fun(A,B) accepts arrays A
and B of arbitrary but equal size and returns output of the same size.
Each element in the output array C is the result of an operation on the
corresponding elements of A and B only. fun must also support scalar
expansion, such that if A or B is a scalar, C is the result of applying
the scalar to every element in the other input array.
However, your getWeight appears to always return a scalar.
I do not understand your problem well enough in order to repair this. Moreover, I think if speed is what you are after, feeding pdist with a function handle is not the way to go. pdist does not perform magic; it is only fast because its built-in distance functions are implemented efficiently. Also, you are using anonymous function handles and conversions to and from cell arrays, all of which slow the process down. I think you should post a new question where you start with a description of what you are trying to compute, include some code that does the job even if inefficiently, and ask how to improve that.
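As an illustration of the contract quoted from the pdist documentation (one row XI against many rows XJ, returning one distance per row of XJ), here is a minimal Python/NumPy sketch of the weight formula from the question. It is not a MATLAB fix; the function names weights_to_rows and adjacency are made up, and the coordinates are assumed to be 1-based (row, column) pairs as in the question, so 1 is subtracted before indexing.

```python
import numpy as np

def weights_to_rows(xi, XJ, img, r, L):
    """Weight between one pixel xi and every row of XJ (returns one value per row)."""
    xi = np.asarray(xi)
    XJ = np.atleast_2d(XJ)
    img = np.asarray(img, dtype=float)
    spatial = ((XJ - xi) ** 2).sum(axis=1)          # squared coordinate distance
    inten_i = img[xi[0] - 1, xi[1] - 1]             # 1-based -> 0-based indexing
    inten_j = img[XJ[:, 0] - 1, XJ[:, 1] - 1]
    return (spatial + (r ** 2 / L) * np.abs(inten_j - inten_i)) / (2 * r ** 2)

def adjacency(nodes, img, r, L):
    """Full symmetric matrix of weights between all pairs of nodes."""
    nodes = np.asarray(nodes)
    return np.stack([weights_to_rows(p, nodes, img, r, L) for p in nodes])

A = adjacency([[1, 1], [1, 2], [2, 2]], [[123, 229], [0, 45]], r=3, L=255)
print(np.round(A, 4))
```

With the example data from EDIT 1, the rounded result agrees with the desired adjacency matrix given in the question.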

RGB to YIQ conversion

I wrote code for RGB to YIQ conversion. I get results, but I don't know if they are correct.
%extract the red green blue elements
ImageGridRed = double(ImageRGB(:,:,1))';
ImageGridGreen = double(ImageRGB(:,:,2))';
ImageGridBlue = double(ImageRGB(:,:,3))';
%make the 300x300 matrices into 1x90000 matrices
flag = 1;
for i =1:1:300
    for j = 1:1:300
        imageGR(flag) = ImageGridRed(j,i);
        imageGG(flag) = ImageGridGreen(j,i);
        imageGB(flag) = ImageGridBlue(j,i);
        flag = flag+1;
    end
end
%put the 3 matrices into 1 matrix 90000x3
for j=1:1:300*300
    colorRGB(j,1) = imageGR(j);
    colorRGB(j,2) = imageGG(j);
    colorRGB(j,3) = imageGB(j);
end
YIQ = rgb2ntsc([colorRGB(:,1) colorRGB(:,2) colorRGB(:,3)]);
I wrote this because the rgb2ntsc function needs an M x 3 matrix as input. I use the number 300 because the picture is 300x300 pixels. I am going to separate the picture into blocks in my project, so don't pay attention to the number 300; I am going to change it and only used it as an example.
Thank you.
What you're doing is completely unnecessary. If you consult the documentation on rgb2ntsc, it also accepts a RGB image. Therefore, when you put in a RGB image, the output will be a 3 channel image, where the first channel is the luminance, or Y component and the second and third channels are the hue and saturation information (I and Q respectively). You don't need to decompose the image into a M x 3 matrix.
Therefore, simply do:
YIQ = rgb2ntsc(ImageRGB);
Make sure that ImageRGB is a RGB image where the first channel is red, second is green and third is blue.
Edit
With your comments, you want to take all of the pixels and place it into a M x 3 matrix where M is the total number of pixels. You would use this as input into rgb2ntsc. The function accepts a M x 3 matrix of RGB values where each row is a RGB tuple. The output in this case will be another M x 3 matrix where each row is its YIQ counterpart. Your code does do what you want it to do, but I would recommend that you do away with the for loops and replace it with:
colorRGB = reshape(permute(ImageRGB, [3 1 2]), 3, []).';
After, do YIQ = rgb2ntsc(colorRGB);. colorRGB will already be a M x 3 matrix, so that column indexing you're doing is superfluous.
With the above using reshape and permute, it's very unnecessary to use the loops. In fact, I would argue that the for loop code is slower. Stick with the above code to get this done fast. Once you have your matrix in this fashion, then I suppose the code is doing what you want it to do.... however, I would personally just do a conversion on the image itself, then split it up into blocks or whatever you want to do after the fact.
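The "reshape to M x 3, convert, reshape back" idea can be sketched in Python with NumPy as follows. This is an illustration only: the coefficient matrix holds the commonly quoted NTSC RGB-to-YIQ values and is assumed, not read out of rgb2ntsc; the input is assumed to be a floating-point (H, W, 3) array with values in [0, 1]; and the name rgb_to_yiq is made up.

```python
import numpy as np

# Commonly quoted NTSC RGB -> YIQ coefficients (an assumption for this sketch).
RGB_TO_YIQ = np.array([
    [0.299,  0.587,  0.114],   # Y (luminance)
    [0.596, -0.274, -0.322],   # I
    [0.211, -0.523,  0.312],   # Q
])

def rgb_to_yiq(image):
    """Convert an (H, W, 3) RGB array with values in [0, 1] to YIQ."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3)        # (H*W, 3): one RGB triple per row, no loops
    yiq = pixels @ RGB_TO_YIQ.T          # convert every pixel with one matrix product
    return yiq.reshape(h, w, 3)          # back to image layout

white = np.ones((2, 3, 3))
out = rgb_to_yiq(white)                  # Y is 1 and I, Q are 0 for pure white
```

NumPy flattens pixels row by row, whereas the MATLAB reshape/permute line lists them column by column. For a per-pixel conversion the order does not matter as long as the result is reshaped back the same way.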

Multiplying a 3x3 matrix to 3nx1 array without using loops

In my code, I have to multiply a matrix A (dimensions 3x3) to a vector b1 (dimensions 3x1), resulting in C. So C = A*b1. Now, I need to repeat this process n times keeping A fixed and updating b to a different (3x1) vector each time. This can be done using loops but I want to avoid it to save computational cost. Instead I want to do it as matrix and vector product. Any ideas?
You need to build a matrix of b vectors, e.g., for n equal to 4:
bMat = [b1 b2 b3 b4];
Then:
C = A * bMat;
provides the solution of size 3x4 in this case. If you want the solution in the form of a vector of length 3n by 1, then do:
C = C(:);
Can we construct bMat for arbitrary n without a loop? That depends on what the form of all your b vectors is. If you let me know in a comment, I can update the answer.
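One possible case, sketched here in Python with NumPy rather than MATLAB: if the b vectors are already stored stacked in a single array of length 3n (an assumption suggested by the question title, not confirmed by the asker), the 3 x n matrix can be built with a reshape and no loop. The variable names b_stacked and c_stacked are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)           # arbitrary seed for the sketch
n = 4
A = rng.random((3, 3))
b_stacked = rng.random(3 * n)            # assumed layout: [b1; b2; ...; bn]

B = b_stacked.reshape(n, 3).T            # 3 x n, column k is the k-th b vector
C = A @ B                                # all n products in one matrix product
c_stacked = C.T.reshape(-1)              # back to a single vector of length 3n

# Check against the explicit loop the question wants to avoid.
for k in range(n):
    assert np.allclose(c_stacked[3 * k:3 * k + 3], A @ b_stacked[3 * k:3 * k + 3])
```

If the b vectors come from somewhere else (for example, each one depends on the previous result), this reshape trick does not apply and a loop may be unavoidable.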