I'm working on some large histological images using the Vips image library. Together with the image I have an array of coordinates. I want to make a binary mask that masks out the part of the image within the polygon defined by those coordinates. I first tried to do this using the vips draw function, but this is very inefficient and takes forever (in my real code the images are about 100000 x 100000 px and the array of polygons is very large).
I then tried creating the binary mask using PIL, and this works great. My problem is converting the PIL image into a vips image. They both have to be vips images for me to use the multiply command. I also want to write and read from memory, as I believe this is faster than writing to disk.
In the im_PIL.save(memory_area,'TIFF') command I have to specify an image format, but since I'm creating a new image, I'm not sure what to put here.
The Vips.Image.new_from_memory(..) command returns: TypeError: constructor returned NULL
from gi.overrides import Vips
from PIL import Image, ImageDraw
import io
# Load the image into a Vips-image
im_vips = Vips.Image.new_from_file('images/image.tif')
# Coordinates for my mask
polygon_array = [(368, 116), (247, 174), (329, 222), (475, 129), (368, 116)]
# Making a new PIL image of only 1's
im_PIL = Image.new('L', (im_vips.width, im_vips.height), 1)
# Draw polygon to the PIL image filling the polygon area with 0's
ImageDraw.Draw(im_PIL).polygon(polygon_array, outline=1, fill=0)
# Write the PIL image to memory ??
memory_area = io.BytesIO()
im_PIL.save(memory_area,'TIFF')
memory_area.seek(0)
# Read the PIL image from memory into a Vips-image
im_mask_from_memory = Vips.Image.new_from_memory(memory_area.getvalue(), im_vips.width, im_vips.height, im_vips.bands, im_vips.format)
# Close the memory buffer ?
memory_area.close()
# Apply the mask with the image
im_finished = im_vips.multiply(im_mask_from_memory)
# Save image
im_finished.tiffsave('mask.tif')
You are saving from PIL in TIFF format, but then using the vips new_from_memory constructor, which is expecting a simple C array of pixel values.
The easiest fix is to use new_from_buffer instead, which will load an image in some format, sniffing the format from the string. Change the middle part of your program like this:
# Write the PIL image to memory in TIFF format
memory_area = io.BytesIO()
im_PIL.save(memory_area,'TIFF')
image_str = memory_area.getvalue()
# Read the PIL image from memory into a Vips-image
im_mask_from_memory = Vips.Image.new_from_buffer(image_str, "")
And it should work.
The vips multiply operation on two 8-bit uchar images will make a 16-bit ushort image, which will look very dark, since the numeric range will be 0 - 255. You could either cast it back to uchar again (append .cast("uchar") to the multiply line) before saving, or use 255 instead of 1 for your PIL mask.
You can also move the image from PIL to VIPS as a simple array of bytes. It might be slightly faster.
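As a rough sketch of the difference between the two routes, the snippet below builds a small stand-in mask with PIL and compares the TIFF-encoded buffer with the raw pixel bytes. The 600 x 300 size is an assumption chosen only to keep the example small, and the final vips call is left as a comment: it mirrors the argument order used in the question and is an untested assumption, not a verified call.

```python
import io

from PIL import Image, ImageDraw

# Small stand-in for the full-size mask (255 outside the polygon, 0 inside).
width, height = 600, 300
polygon = [(368, 116), (247, 174), (329, 222), (475, 129), (368, 116)]
mask = Image.new("L", (width, height), 255)
ImageDraw.Draw(mask).polygon(polygon, outline=255, fill=0)

# Encoded route: a complete TIFF file in memory, header and tags included.
buffer = io.BytesIO()
mask.save(buffer, "TIFF")
encoded = buffer.getvalue()

# Raw route: exactly one byte per pixel for mode "L", with no header at all.
raw = mask.tobytes()

print(encoded[:2])                 # TIFF byte-order marker
print(len(raw) == width * height)  # True

# Assumed hand-off (1 band, unsigned 8-bit), not run here because it needs libvips:
# im_mask = Vips.Image.new_from_memory(raw, width, height, 1, Vips.BandFormat.UCHAR)
```

The raw buffer is what a constructor expecting "a simple C array of pixel values" can interpret, which is why passing the TIFF-encoded bytes to it fails.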
You're right, the draw operations in vips don't work well with very large images in Python. It's not hard to write a thing in vips to make a mask image of any size from a set of points (just combine lots of && and < with the usual winding rule), but using PIL is certainly simpler.
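To illustrate the kind of comparison logic involved, here is a minimal plain-Python sketch of a point-in-polygon test. It uses the even-odd crossing rule (which agrees with the winding rule for simple, non-self-intersecting polygons) and is not libvips code; the function names are made up for this example. Evaluating it pixel by pixel in Python would be far too slow for a 100000 x 100000 image, so treat it only as a description of the rule.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd test: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The edge straddles the ray's height only if one end is above it and the other is not.
        if (y1 > y) != (y2 > y):
            # x position where the edge meets the horizontal line through y.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_mask(width, height, polygon):
    """Rows of 0/1 values: 0 inside the polygon, 1 outside, as in the question."""
    return [
        [0 if point_in_polygon(x, y, polygon) else 1 for x in range(width)]
        for y in range(height)
    ]


polygon_array = [(368, 116), (247, 174), (329, 222), (475, 129), (368, 116)]
print(point_in_polygon(350, 160, polygon_array))  # True
print(point_in_polygon(10, 10, polygon_array))    # False
```

Each edge contributes one comparison on y and one on x, which is the "lots of && and <" structure mentioned above.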
You could also consider having your poly mask as an SVG image. libvips can load very large SVG images efficiently (it renders sections on demand), so you just magnify it up to whatever size you need for your raster images.
I have a VTF file that looks like this inside VTFEdit:
I tried to convert it to a PNG in Python using the code below:
import texture2ddecoder, numpy, cv2
from PIL import Image
img_width = 64
img_height = 64
encoded_binary = open('bracketsTest.vtf','rb').read()
#decompressing dxt5 (compression used for this VTF file) to get actual pixel colors, returns BGRA bytes
decoded_binary = texture2ddecoder.decode_bc5(encoded_binary, img_width, img_height)
#creating RGBA png, converting from BGRA (no support for BGRA in PIL it seems)
dec_img = Image.frombytes("RGBA", (img_width, img_height), decoded_binary, 'raw', ("BGRA"))
dec_img.show()
dec_img.save('testpng.png')
And the resulting image came out like this:
As the resulting image does not look the same as it does in VTFEdit, obviously something went wrong. I suspected that it was an issue with the color channel order going from BGRA (VTFs are BGRA by default, and texture2ddecoder produces BGRA bytes when decompressing) to RGBA, so I tried the following code to convert the image from RGBA to BGRA:
# trying to convert png back to BGRA
image = cv2.imread('testpng.png')
image_bgra = cv2.cvtColor(image, cv2.COLOR_RGBA2BGRA)
cv2.imshow('image',image_bgra)
But the resulting image came out basically the same as before the conversion only with blue squares instead of red ones. What's going on here and how can I fix it? Is there a name for these odd squares?
DXT5 is actually known as Block Compression 3 (BC3). In my case, I incorrectly assumed BC5 = DXT5, so the decompression was wrong (see this wikipedia article for a better explanation). I changed the line decoded_binary = texture2ddecoder.decode_bc5(encoded_binary, img_width, img_height) to decoded_binary = texture2ddecoder.decode_bc3(encoded_binary, img_width, img_height) and the resulting image looked like this:
There are still some odd squares at the top, but deleting/ignoring the header info and low-res thumbnail data seems to fix it:
Know your decompression algorithms!!!
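One way to skip the header and thumbnail, sketched under some assumptions: the texture is DXT5/BC3 with a single frame and a single face, and the full-resolution image is the last thing in the file, so its bytes can be sliced off the end. The helper name bc3_payload is made up for this example; only the first few header fields (signature, version, header size, width, height) are read.

```python
import struct


def bc3_payload(vtf_bytes):
    """Return (width, height, data) for the largest mipmap of a DXT5/BC3 VTF.

    Assumes a single-frame, single-face texture whose full-resolution image
    is stored at the very end of the file.
    """
    if vtf_bytes[:4] != b"VTF\x00":
        raise ValueError("not a VTF file")
    # Signature (4 bytes) and version (2 x uint32) come first, then
    # header size (uint32), width and height (uint16 each), little-endian.
    header_size, width, height = struct.unpack_from("<IHH", vtf_bytes, 12)
    # BC3 stores each 4x4 pixel block in 16 bytes.
    blocks_x = max(1, (width + 3) // 4)
    blocks_y = max(1, (height + 3) // 4)
    size = blocks_x * blocks_y * 16
    if len(vtf_bytes) < header_size + size:
        raise ValueError("file too short for a BC3 image of this size")
    return width, height, vtf_bytes[-size:]


# Intended use with the file from the question (not run here):
# width, height, data = bc3_payload(open('bracketsTest.vtf', 'rb').read())
# decoded_binary = texture2ddecoder.decode_bc3(data, width, height)
```

For a 64 x 64 texture this keeps the last 16 * 16 * 16 = 4096 bytes, and reading the size from the header avoids hard-coding img_width and img_height.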
I have 40 DICOM images and 40 PNG masks for a fully convolutional network. They are stored on my Google Drive, and the notebook can see them: the print(os.listdir(...)) calls in the first block of code below list the names of all 80 files.
I have also globbed the DICOM and PNG files into img_path and mask_path, each of length 40, in the second block of code below.
I am now trying to resize all of the images to 256 x 256 before feeding them into a U-Net-like architecture for segmentation. However, I cannot load them with nib.load(), as it cannot work out the file type of the DCM and PNG files. For the PNG files there is no error, but the result is an empty list, as the last block of code shows.
I assume that once the first couple of lines inside the for loop in the third block of code are fixed, pre-processing will be complete and I can move on to the U-Net implementation.
I have the current pydicom installed in the Colab notebook and tried it in place of the nib.load() call, which produced the same error.
#import data as data
import pydicom
from PIL import Image
import numpy as np
import glob
import imageio
print(os.listdir("/content/drive/My Drive/Images"))
print(os.listdir("/content/drive/My Drive/Masks"))
pixel_data = []
images = glob.glob("/content/drive/My Drive/Images/IMG*.dcm");
for image in images:
    dataset = pydicom.dcmread(image)
    pixel_data.append(dataset.pixel_array)
#print(len(images))
#print(pixel_data)
pixel_data1 = []  # <-- this section is the trouble area
masks = glob.glob("content/drive/My Drive/Masks/IMG*.png");
for mask in masks:
    dataset1 = imageio.imread(mask)
    pixel_data1.append(dataset1.pixel_array)
print(len(masks))
print(pixel_data1)
['IMG-0004-00040.dcm', 'IMG-0002-00018.dcm', 'IMG-0046-00034.dcm', 'IMG-0043-00014.dcm', 'IMG-0064-00016.dcm',....]
['IMG-0004-00040.png', 'IMG-0002-00018.png', 'IMG-0046-00034.png', 'IMG-0043-00014.png', 'IMG-0064-00016.png',....]
0 ----------------> outputs of trouble area <--------------
[]
import glob
img_path = glob.glob("/content/drive/My Drive/Images/IMG*.dcm")
mask_path = glob.glob("/content/drive/My Drive/Masks/IMG*.png")
print(len(img_path))
print(len(mask_path))
40
40
images=[]
a=[]
for a in pixel_data:
    a=resize(a,(a.shape[0],256,256))
    a=a[:,:,:]
    for j in range(a.shape[0]):
        images.append((a[j,:,:]))
No output, this section works fine.
images=np.asarray(images)
print(len(images))
10880
masks=[]  # <-- the other trouble area
b=[]
for b in masks:
    b=resize(b,(b.shape[0],256,256))
    b=b[:,:,:]
    for j in range(b.shape[0]):
        masks.append((b[j,:,:]))
No output, trying to solve the problem of how to fix this section.
masks=np.asarray(masks)  # fix the above section and this should have no issues
print(len(masks))
[]
You are trying to load the DICOM files again using nib.load, which does not work, as you already found out:
for name in img_path:
    a=nib.load(name) # does not work with DICOM files
    a=a.get_data()
    a=resize(a,(a.shape[0],256,256))
You already have the data from the DICOM files in the pixel_data list, so you should use these:
for a in pixel_data:
    a=resize(a,(a.shape[0],256,256)) # or something similar, depending on the shape of pixel_data
    ...
Your last loop, for mask in masks:, is never executed because two lines above it you set masks = [].
It looks like it should be for mask in mask_path:, since mask_path is the list of mask file names.
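A minimal sketch of that pattern: iterate over the list of file names and collect the results under a different name, so the list being looped over is never overwritten. The helper load_all and its read_image parameter are hypothetical names for this example; in the notebook the reader would presumably be imageio.imread, which returns the pixel array directly, so no .pixel_array attribute is needed for the PNG masks.

```python
def load_all(paths, read_image):
    """Read every file in `paths` with `read_image` and return the results in order."""
    loaded = []  # separate name, so the list of paths is never overwritten
    for path in paths:
        loaded.append(read_image(path))
    return loaded


# Intended use in the notebook (not run here):
# mask_arrays = load_all(mask_path, imageio.imread)
# print(len(mask_arrays))
```

With 40 entries in mask_path this should yield 40 arrays, which can then go through the same resize loop as the DICOM data.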
So I've been using Tensorflow's tutorials for neural networks. I completed the "basic classification" that is essentially just MNIST and have been working on making my own custom variation as a little thought experiment. Everything is pretty self-explanatory except putting the datasets into a usable form, as the tutorial uses a premade dataset and seems to cut some corners. All I would like to know is how to turn a color photo into a usable piece of data. I assume that will just be a 1D array. As a side question, will a neural network lose any effectiveness if a 2D photo is stored in a 1D array, if it's not a CNN?
Datasets included in Keras are premade and usually preprocessed so that beginners can easily try their hand at them. To use your own images, as in a cat-dog image classification problem, you can place the images in two separate directories, for example,
in images/cats and images/dogs.
Now, we parse each and every image in these directories,
import os
from PIL import Image
master_dir = 'images'
img_dirs = os.listdir( master_dir )
for img_dir in img_dirs:
    img_names = os.listdir( os.path.join( master_dir , img_dir ) )
    for name in img_names:
        img_path = os.path.join( master_dir , img_dir , name )
        image = Image.open( img_path ).resize( ( 64 , 64 ) ).convert( 'L' )
        # Store this image in an array with its corresponding label
Here, the image is a 64 x 64 grayscale image; converted to a NumPy array it has shape (64, 64). Instead of .convert( 'L' ), we can use .convert( 'RGB' ) to get an RGB image, whose array has shape (64, 64, 3).
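A short check of those shapes, using a blank stand-in image rather than a real photo (the 120 x 80 size and the colour are arbitrary). The last line shows the flattened 1D form the question asks about.

```python
import numpy as np
from PIL import Image

# A blank stand-in for a photo opened with Image.open(...).
photo = Image.new("RGB", (120, 80), (200, 30, 30))
resized = photo.resize((64, 64))

gray = np.asarray(resized.convert("L"))
rgb = np.asarray(resized.convert("RGB"))

print(gray.shape)             # (64, 64)
print(rgb.shape)              # (64, 64, 3)
print(rgb.reshape(-1).shape)  # (12288,)
```

Flattening only changes the layout: all 64 * 64 * 3 values are still present, in a fixed order.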
Now,
Collect all the images and labels in a Python list.
Convert the lists to NumPy arrays.
Store the NumPy arrays in a .npy file using the np.save() method.
In the file which trains the model, load these files using np.load() method and feed them to the model.
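The steps above can be sketched as follows. Random arrays stand in for the parsed images, and the file names x.npy and y.npy and the temporary directory are assumptions made for this example.

```python
import os
import tempfile

import numpy as np

# Stand-in data: 10 grayscale 64x64 "images" with alternating class labels.
rng = np.random.default_rng(0)
image_list = [rng.integers(0, 256, size=(64, 64), dtype=np.uint8) for _ in range(10)]
label_list = [i % 2 for i in range(10)]

# Python lists -> NumPy arrays.
x = np.asarray(image_list)
y = np.asarray(label_list)

# Store the arrays as .npy files.
out_dir = tempfile.mkdtemp()
np.save(os.path.join(out_dir, "x.npy"), x)
np.save(os.path.join(out_dir, "y.npy"), y)

# In the training script: load them back and feed them to the model.
x_loaded = np.load(os.path.join(out_dir, "x.npy"))
y_loaded = np.load(os.path.join(out_dir, "y.npy"))
print(x_loaded.shape, y_loaded.shape)  # (10, 64, 64) (10,)
```

The loaded arrays keep their dtype and shape, so the training script does not need to parse the image directories again.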
I have tested with sample text both alphanumeric and digits only. I am using digits mode.
How do I recognize digits like in the following image:
I think it is because of full height.
I have also tried converting it to .jpg using some online tools (not code)
I am using pytesseract 0.1.6, but I think this is Tesseract problem.
Here is my code:
def classify(hash):
    socket = urllib.urlopen(hash)
    image = StringIO(socket.read())
    socket.close()
    image = Image.open(image)
    number = image_to_string(image, config='digits')
    mapping[hash] = number
    return number
classify('any url')
I think you've got two problems here.
First is that the text is rather small. You can scale the image up by making it 2x as tall and 2x as wide (preferably using AA or cubic interpolation to try and make the letters clearer).
Next there isn't enough white around the edge of the numbers for tesseract to know that it's actually an edge. So you need to add some blank whitespace image around what you've already got.
You can do that manually using photoshop or GIMP or ImageMagick or whatever to validate that it'll actually help. But if you need to do a bunch of images then you'll probably want to use PIL and ImageOps to help.
How do I resize an image using PIL and maintain its aspect ratio?
If you make the new sizes bigger rather than smaller, PIL will grow the image rather than shrink it. Grow it by 2x or 3x both width and height rather than 20% as that'll cause artifacts.
Here's one way to add extra white border:
http://effbot.org/imagingbook/imageops.htm#tag-ImageOps.expand
This question might help you with adding the extra whitespace also:
In Python, Python Image Library 1.1.6, how can I expand the canvas without resizing?
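A minimal sketch of both steps with PIL, assuming a 2x enlargement and a 10-pixel white margin (both values are guesses to tune, and prepare_for_ocr is a made-up helper name):

```python
from PIL import Image, ImageOps


def prepare_for_ocr(image, scale=2, border=10):
    """Enlarge a grayscale image and surround it with a white margin."""
    gray = image.convert("L")
    width, height = gray.size
    # Bicubic interpolation keeps the enlarged strokes smoother than nearest-neighbour.
    enlarged = gray.resize((width * scale, height * scale), Image.BICUBIC)
    # Add `border` white pixels on every side.
    return ImageOps.expand(enlarged, border=border, fill=255)


sample = Image.new("L", (30, 12), 0)  # stand-in for the downloaded image
prepared = prepare_for_ocr(sample)
print(prepared.size)  # (80, 44)
```

The returned image can be passed to image_to_string in place of the original.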
The input image is too small for recognition. Here is my solution:
Upsample the image
Add constant borders
Apply adaptive-threshold
Set configuration to digits
Upsampling the image is required for accurate recognition. Adding constant borders centers the digits. Applying an adaptive threshold makes the features (digit strokes) stand out more clearly. The result will be:
When you read:
049
Code:
import cv2
import pytesseract
img = cv2.imread("0cLW9.png")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(h, w) = gry.shape[:2]
gry = cv2.resize(gry, (w * 2, h * 2))
gry = cv2.copyMakeBorder(gry, 10, 10, 10, 10, cv2.BORDER_CONSTANT, value=255)
thr = cv2.adaptiveThreshold(gry, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 81, 12)
txt = pytesseract.image_to_string(thr, config="digits")
print(txt)
cv2.imshow("thr", thr)
cv2.waitKey(0)
You can achieve the same result using other pre-processing methods.
The image at the end of this question is a PNG with mode I, which stands for Indexed, as far as I can tell.
I'm trying to create a thumbnail out of it, and save it as JPG with PIL.
However, if I leave the mode alone, PIL fails with the error unable to generate thumbnail: cannot write mode I as JPEG.
If I convert it to RGB, the result will be a fully white image.
Is there a way to fix this?
https://www.dropbox.com/s/2d1edk2iu4ixk25/NGC281.png
The input image is a 16-bit grayscale PNG, and it appears PIL has a problem with this. Manually converting it to an 8-bit image before further processing makes it work again.
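One way to do the manual conversion, sketched with NumPy: drop the low byte of each 16-bit value so that 0..65535 maps onto 0..255, then rebuild an 8-bit mode "L" image. The helper name to_8bit is made up for this example. The all-white result from a direct convert is presumably caused by values above 255 being clipped rather than scaled, though that is an assumption.

```python
import numpy as np
from PIL import Image


def to_8bit(image):
    """Scale a 16-bit grayscale image (PIL mode "I" or "I;16") down to mode "L"."""
    values = np.asarray(image).astype(np.int64)
    # Dropping the low byte maps 0..65535 onto 0..255.
    return Image.fromarray((values >> 8).clip(0, 255).astype(np.uint8))


# Intended use with the file from the question (not run here):
# thumb = to_8bit(Image.open("NGC281.png"))
# thumb.thumbnail((256, 256))
# thumb.save("thumb.jpg")

bright = to_8bit(Image.new("I", (4, 4), 65535))
print(bright.mode, bright.getpixel((0, 0)))  # L 255
```

A plain shift keeps the full 16-bit range; an image that only uses part of that range may look better after stretching its minimum and maximum to 0 and 255 instead.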
The problem may originate inside PIL itself. The PyPNG homepage asserts
..PIL only has internal representations (PIL mode) for 1-bit and 8-bit channel values. This makes me wonder if PIL can read PNG files with bit depth 2 or 4 (greyscale or palette), and also bit depth 16 (which PNG supports for greyscale and RGB images).
Then again, that page is from 2009. It could be worth tracking down where PIL is maintained from, and report this as a bug (? Or possibly a feature request?).