what is the encoded format of airbus kaggle? - image-segmentation

I'm trying to understand the format of the segmentation labels in the Airbus Ship Detection competition on Kaggle:
264661 17 265429 33 266197 33 266965 33 267733...
It looks something like this; it does not look like the VOC format. What kind of format is this?

The evaluation page explains the format.
https://www.kaggle.com/c/airbus-ship-detection/overview/evaluation
In order to reduce the submission file size, our metric uses run-length encoding on the pixel values. Instead of submitting an exhaustive list of indices for your segmentation, you will submit pairs of values that contain a start position and a run length. E.g. '1 3' implies starting at pixel 1 and running a total of 3 pixels (1,2,3).
The competition format requires a space delimited list of pairs. For example, '1 3 10 5' implies pixels 1,2,3,10,11,12,13,14 are to be included in the mask. The pixels are one-indexed
and numbered from top to bottom, then left to right: 1 is pixel (1,1), 2 is pixel (2,1), etc. A prediction of "no ship in image" should have a blank value in the EncodedPixels column.
The metric checks that the pairs are sorted, positive, and the decoded pixel values are not duplicated. It also checks that no two predicted masks for the same image are overlapping.
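This run-length encoding is straightforward to decode. Here is a minimal sketch in Python with a hypothetical `rle_decode` helper, assuming the competition's 768x768 image size; the top-to-bottom, then left-to-right pixel ordering described above corresponds to Fortran-order reshaping:

```python
import numpy as np

def rle_decode(rle, shape=(768, 768)):
    """Decode a run-length-encoded mask string into a binary mask.

    Pixels are 1-indexed and run down columns first (top to bottom,
    then left to right), hence the Fortran-order reshape at the end.
    """
    mask = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    if rle:  # a blank value means "no ship in image"
        nums = list(map(int, rle.split()))
        starts, lengths = nums[0::2], nums[1::2]
        for start, length in zip(starts, lengths):
            mask[start - 1 : start - 1 + length] = 1  # convert to 0-indexing
    return mask.reshape(shape, order="F")
```

For example, `rle_decode("1 3")` sets the first three pixels of the first column, matching the '1 3' example from the evaluation page.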

Related

Interpreting time series dimension?

I am wondering if anyone can explain the interpretation of the size (number of features) in a time series. For example, consider a simple script in MATLAB:
X= randn(2,5,2)
X(:,:,1) =
-0.5530 0.4291 0.3937 -1.2534 0.2811
-1.4926 -0.7019 -0.8305 -1.4034 1.9545
X(:,:,2) =
0.2004 0.1438 2.3655 -0.1589 0.7140
0.4905 0.2301 -0.7813 -0.6737 0.2552
Assume X is a time series with the output above.
This generates 2 matrices of 5 columns, each with 2 rows. Can anyone tell me what exactly the first 2 and the 5 mean?
Some websites say this creates 5 vectors of length 5 and size 2. What does size mean here?
Is 2 the number of features and 5 the number of time series? The reason for this confusion is that I do not understand how to interpret the following sentence:
"Generate 2 vector-valued sequences of length 5; each vector has size
2."
What do size 2 and length 5 mean here?
This entirely depends on your data and how you want to store it. If you have some 2D data over time, I find it convenient to have a data matrix with the 2D data per time step in the 1st and 2nd dimensions, and time in the 3rd dimension.
Say I have a movie of 1920 by 1080 pixels with 100 frames; I'd store this as mov = rand(1080,1920,100) (1080 and 1920 swapped because of the row, col order of indexing). So now mov(:,:,1) would give me the first frame, etc.
By the way, your X is a plain numeric array, not to be confused with MATLAB's timeseries object.
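For comparison, the same layout can be sketched in Python/NumPy (purely illustrative; the shapes mirror the MATLAB examples above):

```python
import numpy as np

# Same shape as the MATLAB example: 2 rows, 5 columns, 2 "pages".
# Each page is a 2x5 matrix: 2 rows (size of each vector), 5 columns (sequence length).
X = np.random.randn(2, 5, 2)
first_page = X[:, :, 0]      # analogue of X(:,:,1)

# The movie example: rows x cols x frames.
mov = np.random.rand(1080, 1920, 100)
first_frame = mov[:, :, 0]   # analogue of mov(:,:,1)
```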

How to change pixel values of an RGB image in MATLAB?

So what I need to do is apply an operation like
(x(i,j)-min(x)) / max(x(i,j)-min(x))
which basically converts each pixel value so that the values range between 0 and 1.
First of all, I realised that MATLAB stores our image (rows * cols * colours) in a 3D matrix when using imread:
Image = imread('image.jpg')
So a simple max operation on the image doesn't give me the max pixel value, and I'm not quite sure what it returns (another multidimensional array?). So I tried using something like
max_pixel = max(max(max(Image)))
I thought it worked fine. Similarly, I used min thrice. My logic was that I was getting the min pixel value across all 3 colour planes.
After performing the above scaling operation, I got an image which seemed to have only 0 or 1 values and nothing in between, which doesn't seem right. Does it have something to do with integer/float rounding?
I = imread('cat.jpg');
maxI = max(max(max(I)));
minI = min(min(min(I)));
new_image = (I - minI) ./ max(max(max(I - minI)))
This gives an output of only 1s and 0s, which doesn't seem correct.
The other approach I'm trying is working on all colour planes separately, as done here. But is that the correct way to do it?
I could also loop through all pixels, but I'm assuming that will be time-consuming. Very new to this; any help will be great.
If you are not sure what a MATLAB function returns or why, you should always do the following first:
Type help functionName or doc functionName in the command window, in your case doc max. This will show you the essential must-know information about that specific function, such as what needs to be put in and what will be output.
In the case of the max function, this yields the following results:
M = max(A) returns the maximum elements of an array.
If A is a vector, then max(A) returns the maximum of A.
If A is a matrix, then max(A) is a row vector containing the maximum
value of each column.
If A is a multidimensional array, then max(A) operates along the first
array dimension whose size does not equal 1, treating the elements as
vectors. The size of this dimension becomes 1 while the sizes of all
other dimensions remain the same. If A is an empty array whose first
dimension has zero length, then max(A) returns an empty array with the
same size as A.
In other words, if you use max() on a matrix, it will output a vector that contains the maximum value of each column (the first non-singleton dimension). If you use max() on a matrix A of size m x n x 3, it will result in a matrix of maximum values of size 1 x n x 3. So this answers your question:
I'm not quite sure what it returns (another multidimensional array?)
Moving on:
I thought it worked fine. Similarly I used min thrice. My logic was that I was getting the min pixel value across all 3 colour planes.
This is correct. Alternatively, you can use max(A(:)) and min(A(:)), which is equivalent if you are just looking for the value.
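The same reduction behaviour can be sketched in NumPy, where the reduced axis is explicit (illustrative only; MATLAB's max(A) reduces the first non-singleton dimension, which corresponds to axis=0 here):

```python
import numpy as np

A = np.arange(24).reshape(2, 4, 3)  # an m x n x 3 array with m=2, n=4

per_column = A.max(axis=0)                       # reduce the first dimension -> shape (4, 3)
per_column_keep = A.max(axis=0, keepdims=True)   # shape (1, 4, 3), like MATLAB's 1 x n x 3
overall = A.max()                                # like max(A(:)) or max(max(max(A))): a scalar
```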
After performing the above operation, I got an image which seemed to have only 0 or 1 values and nothing in between, which doesn't seem right. Does it have something to do with integer/float rounding?
There is no way for us to know why this happens if you do not post a minimal, complete and verifiable example of your code. It could be because your variables are of an integer type such as uint8 (integer division in MATLAB rounds to the nearest integer, which would give exactly the 0s and 1s you see), or it could be because of an error in your calculations.
The other approach I'm trying is working on all colour planes separately as done here. But is that the correct way to do it?
This depends on what the intended end result is. Normalizing each colour channel (red, green, blue) separately will give a different result than normalizing all the values at once (in 99% of cases, anyway).
You have a uint8 RGB image.
Just convert it to a double image by
I=imread('https://upload.wikimedia.org/wikipedia/commons/thumb/0/0b/Cat_poster_1.jpg/1920px-Cat_poster_1.jpg')
I=double(I)./255;
alternatively
I=im2double(I); %does the scaling if needed
Read about image data types
What are you doing wrong?
If what you want to do is convert an RGB image to the [0-1] range, you are approaching the problem badly, regardless of the correctness of your MATLAB code. Let me give you an example of why:
Say you have an image with 2 colours:
a dark red (20,0,0)
a medium blue (0,0,128)
Now you want this changed to [0-1]. How do you scale it? Your suggested approach maps 128 -> 1 and either 20 -> 20/128 or 20 -> 1 (not relevant which). However, when you do this you are changing the colours: you make the medium blue a fully saturated blue (maximum B channel), and you make the red far more intense (20/128 instead of 20/255, roughly double the brightness). With these simple colours only the intensity changes, but with mixed RGB values you may even change the hue itself, not only the intensity. Therefore, the only correct way to convert to the [0-1] range is to assume your min and max are [0, 255].
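The effect is easy to verify numerically; a small sketch in Python/NumPy with the two colours above:

```python
import numpy as np

# A 1x2 "image" with a dark red and a medium blue pixel.
img = np.array([[[20, 0, 0],
                 [0, 0, 128]]], dtype=np.uint8)

# Min-max stretching by the image's own range changes the colours:
# the blue channel of the second pixel becomes 1.0 (fully saturated),
# and the red channel of the first becomes 20/128 instead of 20/255.
stretched = (img.astype(np.float64) - img.min()) / (img.max() - img.min())

# Dividing by the full uint8 range preserves the colours:
scaled = img.astype(np.float64) / 255.0
```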

Number of parameters calculation in Convolutional NN

I'm new to studying CNNs and I started by watching Andrew Ng's lessons.
There is an example that I did not understand:
How did he compute the #parameters value?
As you can see in Answer 1 of this StackOverflow question, the formula for the number of parameters of a convolutional layer is: channels_in * kernel_width * kernel_height * channels_out + channels_out.
But this formula doesn't agree with your data. And in fact the drawing you are showing does not agree with the table you are giving.
If I base myself on the drawing, then the first conv layer has 3 input channels, a 5*5 sliding window and 6 output channels, so the number of parameters should be 3*5*5*6 + 6 = 456.
You give the number 208, and this is the number obtained for 1 input channel and 8 output channels (1*5*5*8 + 8 = 208; the table says 8, while the drawing says 6). So it seems that 208 is correctly obtained from the table data, if we consider that there is one input channel and not three.
As for the second conv layer, with 6 input channels, a 5*5 sliding window and 16 output channels, you need 6*5*5*16 + 16 = 2,416 parameters, which looks suspiciously close to 416, the number given in the table.
As for the remaining (fully connected) layers, the table always uses the number of input dimensions times the number of output dimensions, plus one: 5*5*16*120+1 = 48,001, 120*84+1 = 10,081, 84*10+1 = 841. Strictly speaking there should be one bias per output unit, so those counts should end in +120, +84 and +10 rather than +1.
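The conv-layer counts above are easy to check against the formula from the linked answer (one bias per output channel); a quick sketch in Python:

```python
def conv_params(channels_in, kernel, channels_out):
    """Parameters of a conv layer: one kernel x kernel filter per
    (input channel, output channel) pair, plus one bias per output channel."""
    return channels_in * kernel * kernel * channels_out + channels_out

print(conv_params(3, 5, 6))   # drawing: 3 input channels, 5x5, 6 outputs  -> 456
print(conv_params(1, 5, 8))   # table:   1 input channel,  5x5, 8 outputs  -> 208
print(conv_params(6, 5, 16))  # second conv layer: 6 in, 5x5, 16 outputs   -> 2416
```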

matlab showing 0 for small number values product

I have a matrix whose elements have small values. I am taking the product of some elements of the matrix 100 times. With a 10*10 matrix it shows output, but with a 100*100 matrix it shows 0. I think it shows 0 because the product is a very small value, so how can I compute the product so that this small value is displayed?
Try typing:
format long
It is probably just a display issue: by default (format short) MATLAB shows only about 5 significant digits, while format long displays roughly 15 significant digits for doubles. If you want to go back to MATLAB's default setting, type:
format short
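One caveat: format long only changes how a value is displayed. A product of 100 genuinely small factors can underflow below the smallest representable double (realmin is about 2.2e-308), in which case an exact 0 is stored and no display format recovers it. A common workaround, sketched here in Python for illustration, is to sum logarithms instead of multiplying:

```python
import math

values = [1e-5] * 100

# The true product is 1e-500, which underflows to 0.0 in double precision.
direct = math.prod(values)

# Work in log space instead: log10 of the product is the sum of the log10s.
log10_product = sum(math.log10(v) for v in values)   # about -500
```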

Matlab: Tempo-Alignment according to Timestamps

Maybe it is simple, but I'm new to MATLAB and not good with timestamp issues in general. Sorry!
I have two different cameras, each of which records a timestamp for every frame. I read them into two arrays, TimestampsCam1 and TimestampsCam2:
TimestampsCam1 contains 1500 records, with timestamps in microseconds, as follows:
1 20931160389
2 20931180407
3 20931200603
4 20931220273
5 20931240360 ...
and TimestampsCam2 contains 1000 records, with timestamps in milliseconds, as follows:
1 28275280
2 28315443
3 28355607
4 28395771
5 28435935 ...
The first camera starts capturing first and ends a bit later than the second camera. So what I need to do is determine exactly which frame from the first camera was captured at the same time (or nearly the same time) as each frame from the other camera. In other words, I want to align the two arrays (cameras) in time according to the timestamps, ending up with two arrays of the same size where each record is tempo-aligned with the corresponding record in the other array.
Many thanks to all!
Sam
Make sure they are in the same unit of measurement, e.g. microseconds.
Create an index which contains all timestamp values, without duplicates; suppose this one is 2400 records long.
Create two NaN vectors of length 2400, putting the value (for example the frame number) at the place where the index matches the timestamp.
Now you have two aligned vectors, NaN-padded where required.
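Those steps can be sketched as follows (Python/NumPy for illustration, with made-up timestamps already converted to a common unit):

```python
import numpy as np

# Hypothetical timestamps, already in the same unit.
ts_cam1 = np.array([100, 120, 140, 160])
ts_cam2 = np.array([100, 140, 180])

# Index of all timestamp values, duplicates removed.
index = np.union1d(ts_cam1, ts_cam2)

# NaN vectors with the frame number placed where the index matches a timestamp.
frames1 = np.full(index.size, np.nan)
frames2 = np.full(index.size, np.nan)
frames1[np.isin(index, ts_cam1)] = np.arange(1, ts_cam1.size + 1)
frames2[np.isin(index, ts_cam2)] = np.arange(1, ts_cam2.size + 1)
# frames1 and frames2 are now aligned, with NaN where a camera has no frame.
```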