How to train YOLO (Darknet) on 16 bit depth image files? - neural-network

The current implementation of yolo supports 8 bit depth, 3 channel png / jpg images to train on. I need to train yolo on 16 bit, 3 channel png images. What code do I need to change?
I have currently changed the following code:
In function image load_image_stb(char *filename, int channels), changed:
unsigned char *data = stbi_load(filename, &w, &h, &c, channels); to unsigned short *data = stbi_load(filename, &w, &h, &c, channels);
im.data[dst_index] = (float)data[src_index]/255.; to im.data[dst_index] = (float)data[src_index]/65536.;
In function image load_image_cv(char *filename, int channels), changed src = cvLoadImage(filename, flag) to src = cvLoadImage(filename, -1) since the -1 flag asks opencv to load the image with original depth.
In function void ipl_into_image(IplImage* src, image im), changed:
unsigned char *data = (unsigned char *)src->imageData; to unsigned short *data = (unsigned short *)src->imageData;
im.data[k*w*h + i*w + j] = data[i*step + j*c + k]/255.; to im.data[k*w*h + i*w + j] = data[i*step + j*c + k]/65536.;
What other modifications should I make to ensure yolo is training on 16 bit channels? Thank you.

The detector seems to expect 3 channels (RGB). I was able to get it to train by also disabling the distortion augmentation and updating the areas of the code that were hardcoded to 3 channels, e.g. load_data_seg(). In my case I was using OpenCV and loading 16-bit PGM files.
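Two details in the question's changes are worth checking. In stb_image, stbi_load always returns 8-bit samples, so casting its result to unsigned short * does not give 16-bit data; the library provides stbi_load_16 for that. And IplImage's widthStep is measured in bytes, so when indexing through an unsigned short pointer the row stride has to be divided by sizeof(unsigned short). The following is only a minimal sketch of the normalization step, not Darknet's actual code: the name to_planar_float is hypothetical, the stride is assumed to be given in 16-bit elements, and it divides by 65535 so that the maximum sample maps to exactly 1.0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Darknet): convert interleaved 16-bit
// samples (row, col, channel) into planar floats in [0, 1]
// (channel, row, col), the layout Darknet's image.data uses.
// `step` is the row stride counted in 16-bit elements, NOT in bytes.
std::vector<float> to_planar_float(const std::uint16_t* data,
                                   int w, int h, int c, std::size_t step)
{
    assert(data != nullptr && w > 0 && h > 0 && c > 0);
    assert(step >= static_cast<std::size_t>(w) * c);

    std::vector<float> out(static_cast<std::size_t>(w) * h * c);
    for (int k = 0; k < c; ++k) {
        for (int i = 0; i < h; ++i) {
            for (int j = 0; j < w; ++j) {
                std::size_t src = i * step + static_cast<std::size_t>(j) * c + k;
                std::size_t dst = (static_cast<std::size_t>(k) * h + i) * w + j;
                out[dst] = data[src] / 65535.0f;  // 65535 is the largest 16-bit value
            }
        }
    }
    return out;
}
```

With an IplImage, the stride argument would be src->widthStep / sizeof(unsigned short) under these assumptions.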

Related

pass matlab image to open3d three::Image in a mex script

I am trying to load an image in a mex script and cast it to the corresponding format that the Open3D library uses, i.e. three::Image. I am using the following code:
uint8_t* rgb_image = (uint8_t*) mxGetPr(prhs[3]);
int* dims = (int*) mxGetDimensions(prhs[3]);
int height = dims[0];
int width = dims[2];
int channels = dims[4];
int imsize = height * width;
Image image;
image.PrepareImage(height, width, 3, sizeof(uint8_t)); // parameters: height, width, num_of_channels, bytes_per_channel
memcpy(image.data_.data(), rgb_image, image.data_.size());
The above works well when I pass a grayscale image and set num_of_channels to 1, but not for 3-channel images.
Then I tried to create a function where I am manually looping through the raw data and assigning them to the output image
auto image_ptr = std::make_shared<Image>();
image_ptr->PrepareImage(height, width, channels, sizeof(uint8_t));
for (int i = 0; i < height * width; i++) {
uint8_t *p = (uint8_t *)(image_ptr->data_.data() + i * channels * sizeof(uint8_t));
*p++ = *rgb_image++;
}
But now it seems that the color channels are wrongly assigned.
Any idea how to address this issue? It seems like it should be easy, but since my knowledge of C++ and pointers is quite limited I cannot figure it out.
I also found this solution (Reading image in matlab in a format acceptable to mex), but I am not sure how exactly I can use it. To be honest, I am quite confused.
OK, the solution was quite straightforward, as I had thought in the first place. It was just a matter of handling the pointers correctly:
std::shared_ptr<Image> CreateRGBImageFromMat(uint8_t *mat_image, int width, int height, int channels)
{
auto open3d_image = std::make_shared<Image>();
open3d_image->PrepareImage(height, width, channels, sizeof(uint8_t));
for (int i = 0; i < height * width; i++) {
uint8_t *p = (uint8_t *)(open3d_image->data_.data() + i * channels * sizeof(uint8_t));
*p++ = *(mat_image + i);
*p++ = *(mat_image + i + height*width);
*p++ = *(mat_image + i + height*width*2);
}
return open3d_image;
}
This works since three::Image expects the data in contiguous row x col x channel order, while from MATLAB the image comes in blocks of rows x cols, one block per channel (after you transpose the image, since MATLAB is column-major). My question now is whether I can do the same with memcpy() or std::copy(), copying the block data into contiguous form so that I can bypass the for loop.
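On the memcpy() question: memcpy copies one contiguous range, while interleaving takes each destination byte from a different, strided source location, so some form of loop seems unavoidable. Below is a minimal sketch of the layout conversion, assuming an H x W x C uint8 array exactly as MATLAB stores it (column-major, one channel plane after another). The name planar_to_interleaved is hypothetical and the function returns a plain std::vector rather than a three::Image. Because it computes the column-major source index itself, it does not need the image to be transposed in MATLAB first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert MATLAB's column-major, planar layout
// (all of channel 0, then all of channel 1, ...) into a row-major,
// channel-interleaved buffer indexed as (row, col, channel).
std::vector<std::uint8_t> planar_to_interleaved(const std::uint8_t* mat,
                                                int height, int width, int channels)
{
    assert(mat != nullptr && height > 0 && width > 0 && channels > 0);

    const std::size_t plane = static_cast<std::size_t>(height) * width;
    std::vector<std::uint8_t> out(plane * channels);
    for (int r = 0; r < height; ++r) {
        for (int c = 0; c < width; ++c) {
            for (int k = 0; k < channels; ++k) {
                // MATLAB: element (r, c, k) lives at k*H*W + c*H + r
                std::size_t src = k * plane + static_cast<std::size_t>(c) * height + r;
                // Target: element (r, c, k) lives at (r*W + c)*C + k
                std::size_t dst = (static_cast<std::size_t>(r) * width + c) * channels + k;
                out[dst] = mat[src];
            }
        }
    }
    return out;
}
```

The result could then be copied into image.data_ with a single memcpy, since at that point both buffers share the same layout.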

Segmenting Lungs and nodules in CT images

I am new to image processing in MATLAB, and I am trying to segment the lungs and nodules from a CT image. I have done initial image enhancement.
I have searched a lot for this but haven't found any relevant material.
I am trying to segment the lung region from the given image, and then detect nodules in the lung region.
Code I tried in Matlab:
d1 = dicomread('000000.dcm');
d1ca = imadjust(d1);
d1nF = wiener2(d1ca);
d1Level = graythresh(d1nF);
d1sBW = im2bw(d1nF,d1Level);
sed = strel('diamond',1);
BWfinal = imerode(d1sBW,sed);
BWfinal = imerode(BWfinal,sed);
BWoutline = bwperim(BWfinal);
Segout = d1nF;
Segout(BWoutline) = 0;
edgePrewitt = edge(d1nF,'prewitt');
Result of above code:
Want results like this:
http://oi41.tinypic.com/35me7pj.jpg
http://oi42.tinypic.com/2jbtk6p.jpg
http://oi44.tinypic.com/w0kthe.jpg
http://oi40.tinypic.com/nmfaio.jpg
http://oi41.tinypic.com/2nvdrie.jpg
http://oi43.tinypic.com/2nvdnhk.jpg
I know this may be easy for experts. Please help me out.
Thank you!
The following is not a Matlab answer! However, OpenCV and Matlab share many features in common, and I'm sure you will be able to translate this C++ code to Matlab with no problems.
For more information about the methods being called, check the OpenCV documentation.
#include <iostream>
#include <vector>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
int main(int argc, char* argv[])
{
// Load input image (colored, i.e. 3-channel)
cv::Mat input = cv::imread(argv[1]);
if (input.empty())
{
std::cout << "!!! failed imread()" << std::endl;
return -1;
}
// Convert input image to grayscale (1-channel)
cv::Mat grayscale = input.clone();
cv::cvtColor(input, grayscale, cv::COLOR_BGR2GRAY);
What grayscale looks like:
// Erode & Dilate to remove noises and improve the result of the next operation (threshold)
int erosion_type = cv::MORPH_RECT; // MORPH_RECT, MORPH_CROSS, MORPH_ELLIPSE
int erosion_size = 3;
cv::Mat element = cv::getStructuringElement(erosion_type,
cv::Size(2 * erosion_size + 1, 2 * erosion_size + 1),
cv::Point(erosion_size, erosion_size));
cv::erode(grayscale, grayscale, element);
cv::dilate(grayscale, grayscale, element);
What grayscale looks like after morphological operations:
// Threshold to segment the area of the lungs
cv::Mat thres;
cv::threshold(grayscale, thres, 80, 150, cv::THRESH_BINARY);
What thres looks like:
// Find the contours of the lungs in the thresholded image
std::vector<std::vector<cv::Point> > contours;
cv::findContours(thres, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);
// Fill the areas of the lungs with BLUE for better visualization
cv::Mat lungs = input.clone();
for (size_t i = 0; i < contours.size(); i++)
{
std::vector<cv::Point> cnt = contours[i];
double area = cv::contourArea(cv::Mat(cnt));
if (area > 15000 && area < 35000)
{
std::cout << "* Area: " << area << std::endl;
cv::drawContours(lungs, contours, static_cast<int>(i), cv::Scalar(255, 0, 0),
-1 /* negative thickness fills the contour */, 8, std::vector<cv::Vec4i>(), 0, cv::Point() );
}
}
What lungs looks like:
// Using the image with blue lungs as a mask, we create a new image containing only the lungs
cv::Mat blue_mask = cv::Mat::zeros(input.size(), CV_8UC1);
cv::inRange(lungs, cv::Scalar(255, 0, 0), cv::Scalar(255, 0, 0), blue_mask);
cv::Mat output;
input.copyTo(output, blue_mask);
return 0;
}
What output looks like:
At this point you have the lungs isolated in the image and can proceed to execute other filter operations to isolate the nodules.
Good luck.
Try this:
% dp6BK.png is your original image, I downloaded directly
I = im2double(imread('dp6BK.png'));
I=I(:,:,1);
imshow(I)
cropped = I(50:430,8:500); %% Crop region of interest
thresholded = cropped < 0.35; %% Threshold to isolate lungs
clearThresh = imclearborder(thresholded); %% Remove border artifacts in image
lungs = bwareaopen(clearThresh,100); %% Remove objects less than 100 pixels
lungsFilled = imfill(lungs,'holes'); %% Fill in the vessels inside the lungs
figure,imshow(lungsFilled.*cropped)
What you will get:

How to obtain and modify a pixel value here?

Listing 2 of Apple's Q & A shows an example of how to modify pixels in a CGImageRef. The problem is that it doesn't show how to obtain a pixel and modify its R, G, B and A values.
The interesting part is here:
void *data = CGBitmapContextGetData (cgctx);
if (data != NULL)
{
// **** You have a pointer to the image data ****
// **** Do stuff with the data here ****
}
Now, let's say I want to read Red, Green, Blue and Alpha from the pixel at x = 100, y = 50. How do I get access to that pixel and its R, G, B and A components?
First, you need to know the bytesPerRow of your bitmap, as well as the data type and color format of the pixels in your bitmap. bytesPerRow can be different from width_in_pixels*bytesPerPixel, as there might be padding at the end of each line. The pixels can be 16 bits or 32 bits, or possibly some other size. The format of the pixels can be ARGB or BGRA, or some other format.
For 32-bit ARGB data:
unsigned char *p = (unsigned char *)data; // pointer returned by CGBitmapContextGetData
size_t i = bytesPerRow * y + 4 * x; // byte offset of the pixel, for 32-bit pixels
unsigned char alpha = p[i]; // for ARGB
unsigned char red = p[i+1];
unsigned char green = p[i+2];
unsigned char blue = p[i+3];
Note that, depending on your view transform, the image may appear upside down relative to what you expect, in which case y needs to be flipped.
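The offset arithmetic above can be wrapped in a small function. This is only a sketch in plain C++ that works on any raw buffer; read_argb and the Argb struct are hypothetical names, and it assumes 32-bit pixels stored in A, R, G, B byte order. For a different format such as BGRA, only the order of the four indices changes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One pixel's components (hypothetical type, for illustration only).
struct Argb { std::uint8_t a, r, g, b; };

// Hypothetical helper: read the pixel at (x, y) from a 32-bit ARGB bitmap.
// bytesPerRow may be larger than 4 * width because of row padding.
Argb read_argb(const void* data, std::size_t bytesPerRow,
               std::size_t x, std::size_t y)
{
    assert(data != nullptr && (x + 1) * 4 <= bytesPerRow);

    const std::uint8_t* p =
        static_cast<const std::uint8_t*>(data) + y * bytesPerRow + x * 4;
    return Argb{p[0], p[1], p[2], p[3]};
}
```

For the pixel in the question this would be called as read_argb(data, bytesPerRow, 100, 50), with bytesPerRow taken from CGBitmapContextGetBytesPerRow.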

Transferring an image from Matlab to an OpenCV IplImage

I have an image in Matlab:
img = imread('image.jpg')
which returns a uint8 array of size height x width x channels (3 channels: RGB).
Now I want to use openCV to do some manipulations on it, so I write up a MEX file which takes the image as a parameter and constructs an IplImage from it:
#include "mex.h"
#include "cv.h"
void mexFunction(int nlhs, mxArray **plhs, int nrhs, const mxArray **prhs) {
char *matlabImage = (char *)mxGetData(prhs[0]);
const mwSize *dim = mxGetDimensions(prhs[0]);
CvSize size;
size.height = dim[0];
size.width = dim[1];
IplImage *iplImage = cvCreateImageHeader(size, IPL_DEPTH_8U, dim[2]);
iplImage->imageData = matlabImage;
iplImage->imageDataOrigin = iplImage->imageData;
/* Show the openCV image */
cvNamedWindow("mainWin", CV_WINDOW_AUTOSIZE);
cvShowImage("mainWin", iplImage);
}
This result looks completely wrong, because openCV uses other conventions than matlab for storing an image (for instance, they interleave the color channels).
Can anyone explain what the differences in conventions are and give some pointers on how to display the image correctly?
After spending the day doing fun image format conversions </sarcasm> I can now answer my own question.
Matlab stores images as 3 dimensional arrays: height × width × color
OpenCV stores images as 2 dimensional arrays: (color × width) × height
Furthermore, for best performance, OpenCV pads the images with zeros so rows are always aligned on 32 bit blocks.
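The row padding described above comes down to rounding the number of bytes in a row up to the next multiple of 4. A minimal sketch of that calculation follows; aligned_step is a hypothetical name and not an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: smallest multiple of `alignment` that is >= rowBytes.
// `alignment` must be a power of two; 4 gives rows aligned on 32-bit blocks.
std::size_t aligned_step(std::size_t rowBytes, std::size_t alignment = 4)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (rowBytes + alignment - 1) & ~(alignment - 1);
}
```

For example, a row of 5 pixels with 3 channels of 8 bits each needs 15 bytes, so aligned_step(15) gives a row stride of 16 bytes with one byte of padding.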
I've done the conversion in Matlab:
function [cv_img, dim, depth, width_step] = convert_to_cv(img)
% Exchange rows and columns (handles 3D cases as well)
img2 = permute( img(:,end:-1:1,:), [2 1 3] );
dim = [size(img2,1), size(img2,2)];
% Convert double precision to single precision if necessary
if( isa(img2, 'double') )
img2 = single(img2);
end
% Determine image depth
if( ndims(img2) == 3 && size(img2,3) == 3 )
depth = 3;
else
depth = 1;
end
% Handle color images
if(depth == 3 )
% Switch from RGB to BGR
img2(:,:,[3 2 1]) = img2;
% Interleave the colors
img2 = reshape( permute(img2, [3 1 2]), [size(img2,1)*size(img2,3) size(img2,2)] );
end
% Pad the image
width_step = 4 * ceil( size(img2,1) / 4 ); % round the row length up to a multiple of 4
img3 = uint8(zeros(width_step, size(img2,2)));
img3(1:size(img2,1), 1:size(img2,2)) = img2;
cv_img = img3;
% Output to openCV
cv_display(cv_img, dim, depth, width_step);
The code to transform this into an IplImage is in the MEX file:
#include "mex.h"
#include "cv.h"
#include "highgui.h"
#define IN_IMAGE prhs[0]
#define IN_DIMENSIONS prhs[1]
#define IN_DEPTH prhs[2]
#define IN_WIDTH_STEP prhs[3]
void mexFunction(int nlhs, mxArray **plhs, int nrhs, const mxArray **prhs) {
bool intInput = true;
if(nrhs != 4)
mexErrMsgTxt("Usage: cv_display(image, dimensions, depth, width_step)");
if( mxIsUint8(IN_IMAGE) )
intInput = true;
else if( mxIsSingle(IN_IMAGE) )
intInput = false;
else
mexErrMsgTxt("Input should be a matrix of uint8 or single precision floats.");
if( mxGetNumberOfElements(IN_DIMENSIONS) != 2 )
mexErrMsgTxt("Dimension vector should contain two elements: [width, height].");
char *matlabImage = (char *)mxGetData(IN_IMAGE);
double *imgSize = mxGetPr(IN_DIMENSIONS);
size_t width = (size_t) imgSize[0];
size_t height = (size_t) imgSize[1];
size_t depth = (size_t) *mxGetPr(IN_DEPTH);
size_t widthStep = (size_t) *mxGetPr(IN_WIDTH_STEP) * (intInput ? sizeof(unsigned char):sizeof(float));
CvSize size;
size.height = height;
size.width = width;
IplImage *iplImage = cvCreateImageHeader(size, intInput ? IPL_DEPTH_8U:IPL_DEPTH_32F, depth);
iplImage->imageData = matlabImage;
iplImage->widthStep = widthStep;
iplImage->imageDataOrigin = iplImage->imageData;
/* Show the openCV image */
cvNamedWindow("mainWin", CV_WINDOW_AUTOSIZE);
cvShowImage("mainWin", iplImage);
}
You could optimize your program with mxGetDimensions and mxGetNumberOfDimensions to get the size of the image and use the mxGetClassID to determine the depth more accurately.
I wanted to do the same, but I think it would be better to do this using a MATLAB DLL and calllib. I would not do the conversion of the image to the OpenCV format in MATLAB, because it would be slow. This is one of the biggest problems with using MATLAB and OpenCV together. I think OpenCV 2.2 has some good solutions for that problem. It looks like the OpenCV community has built similar solutions for Octave, but I still don't understand them. They somehow use the C++ functionality of OpenCV.
Try using the library developed by Kota Yamaguchi:
http://github.com/kyamagu/mexopencv
It defines a class called 'MxArray' that can perform all types of conversions from MATLAB mxArray variables to OpenCV objects (and from OpenCV to MATLAB). For example, this library can convert between mxArray and cv::Mat data types. Btw, IplImage is not relevant anymore if you use C++ API of OpenCV, it's better to use cv::Mat instead.
Note: if using the library, make sure to compile your mex function with MxArray.cpp file from the library; you can do so in MATLAB command line with:
mex yourmexfile.cpp MxArray.cpp
Based on the answer above and on how the image matrix is stored in memory in OpenCV, we can do this with OpenCV Mat operations only!
C++: Mat::Mat(int rows, int cols, int type, void* data, size_t step=AUTO_STEP)
C++: void merge(const Mat* mv, size_t count, OutputArray dst)
Then the mex C/C++ code is:
#include "mex.h"
#include <opencv2/opencv.hpp>
#define uint8 unsigned char
void mexFunction(int nlhs, mxArray *out[], int nrhs, const mxArray *input[])
{
// assume the type of image is uint8
if(!mxIsClass(input[0], "uint8"))
{
mexErrMsgTxt("Only image arrays of the UINT8 class are allowed.");
return;
}
uint8* rgb = (uint8*) mxGetData(input[0]);
const mwSize* dims = mxGetDimensions(input[0]); // mwSize may be wider than int
int height = (int) dims[0];
int width = (int) dims[1];
int imsize = height * width;
cv::Mat imR(1, imsize, cv::DataType<uint8>::type, rgb);
cv::Mat imG(1, imsize, cv::DataType<uint8>::type, rgb+imsize);
cv::Mat imB(1, imsize, cv::DataType<uint8>::type, rgb+imsize + imsize);
// opencv is BGR and matlab is column-major order
cv::Mat imA[3];
imA[2] = imR.reshape(1,width).t();
imA[1] = imG.reshape(1,width).t();
imA[0] = imB.reshape(1,width).t();
// done! imf is what we want!
cv::Mat imf;
merge(imA,3,imf);
}

How to save an image as Tiff or PNG with an alpha channel or alpha mask in iPhone SDK?

I have an image with something on a white background. I want to save that image in a format that allows an alpha channel, or use an alpha mask, so that the white pixels become transparent. Any light out there?
I don't know of any libraries where this is super easy. But, there's a lot of relevant sample code in the GLImageProcessing example here. (I haven't run the following)
UIImage *some_image = [UIImage imageNamed:@"somethin'.tiff"];
CGImageRef cg_image = some_image.CGImage;
CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider(cg_image));
size_t bpp = CGImageGetBitsPerPixel(cg_image);
uint32_t *stuff = (uint32_t *)CFDataGetBytePtr(data);
int w = CGImageGetWidth(cg_image);
int h = CGImageGetHeight(cg_image);
int N = w * h;
for (int i = 0; i < N; i++ ) {
// do your stuff, test for white, set the alpha mask
stuff[i] = stuff[i] & ((uint32_t)0xFFFFFFFF | alpha_mask);
}
You could instead use this function
UIKIT_EXTERN NSData *UIImagePNGRepresentation(UIImage *image);
and write the data to disk. I hope this helps. Post the solution if you find it...
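The "test for white, set the alpha" step from the loop above could look roughly like the following. This is a plain C++ sketch on a raw pixel buffer, not iOS API code: white_to_transparent is a hypothetical name, and it assumes 8-bit components stored in R, G, B, A byte order with non-premultiplied alpha. If the bitmap uses premultiplied alpha, the color bytes of cleared pixels would presumably need to be zeroed as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: in an RGBA buffer (bytes in R, G, B, A order), set
// alpha to 0 for every pixel whose R, G and B are all >= threshold.
// Returns the number of pixels that were made transparent.
std::size_t white_to_transparent(std::uint8_t* rgba, std::size_t pixelCount,
                                 std::uint8_t threshold = 255)
{
    assert(rgba != nullptr || pixelCount == 0);

    std::size_t changed = 0;
    for (std::size_t i = 0; i < pixelCount; ++i) {
        std::uint8_t* p = rgba + i * 4;
        if (p[0] >= threshold && p[1] >= threshold && p[2] >= threshold) {
            p[3] = 0;  // fully transparent
            ++changed;
        }
    }
    return changed;
}
```

A threshold slightly below 255 tolerates near-white pixels, such as those produced by JPEG compression artifacts.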