I have an image stored in a BytesIO object via Pillow, and I need to save it to a file together with some header information (textual attributes) specific to my problem. I need the bytes to be encoded in some image compression format. Is that possible? If so, how can it be done?
I also need to store more than one image in the file.
Storing more than one image in a file is problematic for PNG, JPEG and most other common formats. One option is TIFF - not sure if that works for you?
Here's how you can store some additional text in a PNG at least:
#!/usr/bin/env python3
from PIL import Image
from PIL.PngImagePlugin import PngInfo
# Create empty metadata and add a couple of text strings
metadata = PngInfo()
metadata.add_text("Key1:","Value1")
metadata.add_text("Key2:","Value2")
# Create red image and save with metadata embedded
im = Image.new('RGB',(64,64),'red')
im.save("result.png", pnginfo=metadata)
If you check that with pngcheck you will see:
pngcheck -7v result.png
Sample Output
File: result.png (200 bytes)
chunk IHDR at offset 0x0000c, length 13
64 x 64 image, 24-bit RGB, non-interlaced
chunk tEXt at offset 0x00025, length 12, keyword: Key1:
Value1
chunk tEXt at offset 0x0003d, length 12, keyword: Key2:
Value2
chunk IDAT at offset 0x00055, length 95
zlib: deflated, 32K window, default compression
chunk IEND at offset 0x000c0, length 0
No errors detected in result.png (5 chunks, 98.4% compression).
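The text chunks can also be read back in Python instead of with pngcheck. This is a minimal sketch, assuming a reasonably recent Pillow in which an opened PNG exposes its text chunks through the text attribute. It saves into an in-memory BytesIO (as in the question) rather than result.png:

```python
import io
from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Build the same metadata as in the example above
metadata = PngInfo()
metadata.add_text("Key1:", "Value1")
metadata.add_text("Key2:", "Value2")

# Save a red image as PNG into an in-memory buffer
buf = io.BytesIO()
Image.new('RGB', (64, 64), 'red').save(buf, format='PNG', pnginfo=metadata)

# Re-open the PNG bytes and read the text chunks back
buf.seek(0)
with Image.open(buf) as im:
    text_chunks = dict(im.text)

print(text_chunks)
```

The same bytes could be written to disk with open("result.png", "wb").write(buf.getvalue()).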
Here's how to save 3 images and a comment in a single TIFF file:
from PIL import Image
from PIL.TiffImagePlugin import ImageFileDirectory_v2, TiffTags
# Create a structure to hold meta-data
ifd = ImageFileDirectory_v2()
ifd[270] = 'Some Funky Comment'
ifd.tagtype[270] = TiffTags.ASCII
# Create red, green and blue images and save them all, with metadata, in one TIFF
im1 = Image.new('RGB',(50,50),'red')
im2 = Image.new('RGB',(64,64),'green')
im3 = Image.new('RGB',(80,80),'blue')
im1.save("result.tif", append_images=[im2,im3], save_all=True, tiffinfo=ifd)
And check that with:
tiffinfo -v result.tif
Sample Output
TIFF Directory at offset 0x8 (8)
Image Width: 50 Image Length: 50
Bits/Sample: 8
Compression Scheme: None
Photometric Interpretation: RGB color
Samples/Pixel: 3
Rows/Strip: 50
Planar Configuration: single image plane
ImageDescription: Some Funky Comment
TIFF Directory at offset 0x1e08 (7688)
Image Width: 64 Image Length: 64
Bits/Sample: 8
Compression Scheme: None
Photometric Interpretation: RGB color
Samples/Pixel: 3
Rows/Strip: 64
Planar Configuration: single image plane
ImageDescription: Some Funky Comment
TIFF Directory at offset 0x4eb8 (20152)
Image Width: 80 Image Length: 80
Bits/Sample: 8
Compression Scheme: None
Photometric Interpretation: RGB color
Samples/Pixel: 3
Rows/Strip: 80
Planar Configuration: single image plane
ImageDescription: Some Funky Comment
You can then extract the images on the command-line with ImageMagick like this.
To extract first image:
magick result.tif[0] first.png
To extract last image:
magick result.tif[-1] last.png
To extract all three images:
magick result.tif image-%d.png
Result
-rw-r--r-- 1 mark staff 457 21 Jan 08:11 image-0.png
-rw-r--r-- 1 mark staff 458 21 Jan 08:11 image-1.png
-rw-r--r-- 1 mark staff 460 21 Jan 08:11 image-2.png
Note: Use convert in place of magick above if you are running v6 ImageMagick.
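If you would rather stay in Python than use ImageMagick, the frames and the comment can be read back with Pillow too. This is a rough sketch, not the only way to do it: it writes the TIFF into a temporary directory (my assumption, just to keep the example self-contained) and uses ImageSequence to step through the frames:

```python
import os
import tempfile
from PIL import Image, ImageSequence
from PIL.TiffImagePlugin import ImageFileDirectory_v2, TiffTags

# Same metadata as above: tag 270 is ImageDescription
ifd = ImageFileDirectory_v2()
ifd[270] = 'Some Funky Comment'
ifd.tagtype[270] = TiffTags.ASCII

im1 = Image.new('RGB', (50, 50), 'red')
im2 = Image.new('RGB', (64, 64), 'green')
im3 = Image.new('RGB', (80, 80), 'blue')

frame_sizes = []
with tempfile.TemporaryDirectory() as tmp:
    # Temporary location used only for this sketch
    path = os.path.join(tmp, 'result.tif')
    im1.save(path, append_images=[im2, im3], save_all=True, tiffinfo=ifd)

    with Image.open(path) as tif:
        # Comment stored in the first directory of the file
        first_comment = tif.tag_v2.get(270)
        # Visit every frame (TIFF directory) in order
        for frame in ImageSequence.Iterator(tif):
            frame_sizes.append(frame.size)

print(first_comment, frame_sizes)
```

Inside the loop, frame.copy() gives an independent image for each page if you need to keep them after the file is closed.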
Keywords: Python, PIL, image processing, multiple images, TIF, comment, tiffinfo, IFD, PNG tEXt.
Related
I use the following minimal PostScript file:
%!PS-Adobe-2.0
%%BoundingBox: 0 0 100 100
0 0 moveto 100 100 rlineto stroke
showpage
This draws a line from (0,0) to (100,100) in a 100x100 box.
I then convert this file minimal.ps to a PNG with this command line, borrowed from GhostScript's documentation:
gs -sDEVICE=pngmono -o minimal.png minimal.ps
With GPL Ghostscript 9.53.3 (current Debian stable) and 9.27 (oldstable) the resulting PNG file has the following dimensions:
> identify minimal.png
minimal.png PNG 612x792 612x792+0+0 8-bit Gray 2c 2994B 0.000u 0:00.000
Both width and height extend far beyond the original BoundingBox, and they are not even equal.
The PNG's dimensions are the same if the output device is pnggray, pngalpha, etc.
The dimensions are also the same if, in the PostScript code, 100 is replaced by 10 or by 10000.
How can I tell GhostScript to create a PNG file whose dimensions automatically fit the PostScript file's BoundingBox?
The image properties for this image say that the width and height are 340 pixels and 471 pixels respectively, and that the bit depth is 24 bits. My understanding is that this means the value of each pixel is encoded using 24 bits, so I expected the file size to be around 471 * 340 * 24 = 3843360 bits = 480420 bytes, or about 480 KB. But one of the image properties says the size of the file is 9.98 KB. Why the big difference?
I am reading the images into a convolutional neural network where I need to supply the input shape.
Below is a screenshot of image properties
Below is a screenshot of actual image:
The PNG image format is designed to support lossless data compression. The file size you calculated is for completely uncompressed, raw image data. To save disk space, the file is compressed, and for an image like your example this can be done very effectively because most of the image is exactly the same color.
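To see the effect, here is a small sketch that compares the raw size with the PNG size for an image of the same dimensions. The picture itself is a hypothetical stand-in (a white canvas with a small black square), since the original screenshot is not available, so the exact compressed size will differ from 9.98 KB:

```python
import io
from PIL import Image

width, height, channels = 340, 471, 3   # 24-bit RGB: 3 channels of 8 bits

# Hypothetical stand-in image: mostly one colour, like the example
im = Image.new('RGB', (width, height), 'white')
im.paste((0, 0, 0), (100, 100, 140, 140))

# Size of the uncompressed pixel data, as calculated in the question
raw_bytes = width * height * channels

# Size of the same pixels after lossless PNG compression
buf = io.BytesIO()
im.save(buf, format='PNG')
png_bytes = len(buf.getvalue())

# Decoding restores the full pixel grid regardless of the file size
buf.seek(0)
with Image.open(buf) as decoded:
    decoded_shape = (decoded.height, decoded.width, len(decoded.getbands()))

print(raw_bytes, png_bytes, decoded_shape)
```

For the neural network, the input shape comes from the decoded pixel grid (height, width, channels), not from the size of the file on disk.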
I have a 3-d numpy array and save it as a JPEG image using Pillow. When I reload the image using Pillow, the resulting numpy array is different.
I wrote some demo code for this:
from PIL import Image
import numpy as np
file_extension = 'jpeg'
# generate a sample image
image = range(1, 2*2*3+1)
image = np.uint8(np.array(image).reshape(2,2,3))
print('image', image)
img = Image.fromarray(image, "RGB")
img.save('test.'+file_extension)
# load image
loaded_image = Image.open('test.'+file_extension)
loaded_image = np.array(loaded_image.convert('RGB'))
print('loaded image', loaded_image)
The output of the code is as follows:
image [[[ 1 2 3]
[ 4 5 6]]
[[ 7 8 9]
[10 11 12]]]
loaded image [[[ 3 4 6]
[ 3 4 6]]
[[ 7 8 10]
[ 8 9 11]]]
The loaded_image is different from the original image. However, if I change file_extension to 'png' or 'bmp' etc., the loaded_image is the same as the original image.
Has anyone had a similar problem, and does anyone know why saving an image in JPEG format using Pillow causes this?
The answer is very simple...
JPEG is "lossy". It discards the least obvious details to save space - see Wikipedia entry for JPEG and scroll down looking for "Quantisation". It also doesn't even get started with 16-bit per sample/channel data.
PNG, BMP and TIFF (other than JPEG-encoded TIFF) are lossless - that means you get back exactly what you saved.
GIF is a bit different as it has a limited palette, so you may get back something different from what you saved depending on how many colours your original image had.
If your data is 16-bit per sample/channel, you should probably use PNG, NetPBM or TIFF because BMP cannot store 16-bit per sample data - what they call 24-bit means 3 channels of 8 bits each.
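A quick way to check this for yourself is to round-trip the same pixels through each format in memory and compare. This is just a sketch using Pillow on its own (no numpy), with the same 2x2 sample values as in the question and Pillow's default JPEG settings; round_trip is a helper name I made up for the example:

```python
import io
from PIL import Image

# Same 2x2 RGB sample as in the question: channel values 1..12
original = Image.frombytes('RGB', (2, 2), bytes(range(1, 13)))

def round_trip(image, fmt):
    """Save to an in-memory file in the given format and return the reloaded pixel bytes."""
    buf = io.BytesIO()
    image.save(buf, format=fmt)
    buf.seek(0)
    with Image.open(buf) as reloaded:
        return reloaded.convert('RGB').tobytes()

# Lossless formats give back exactly the bytes that were saved
png_identical = round_trip(original, 'PNG') == original.tobytes()
# JPEG quantises the data, so the bytes generally change
jpeg_identical = round_trip(original, 'JPEG') == original.tobytes()

print('PNG identical: ', png_identical)
print('JPEG identical:', jpeg_identical)
```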
I'm trying to write a TIFF image with RMagick that tesseract can process. Tesseract objects if bits per pixel is > 32 or samples per pixel is other than 1, 3 or 4.
With the defaults, Image.write generates 3 (RGB) samples plus 1 alpha channel at 16-bits per sample for a total of 64 bits per pixel, violating the first constraint.
If I set the colorspace to GRAYColorspace as follows, it still outputs the alpha channel, giving two samples per pixel, violating the second constraint.
Image.write('image.tif') {self.colorspace = GRAYColorspace}
Per the RMagick documentation, the alpha channel is ignored on method operations unless specified, but even if I do self.channel(GREYChannel), the alpha channel is still output.
I know I can run convert on the file afterwards, but I'd like to find a solution that avoids that.
Here is the tiffinfo output for the file currently generated:
TIFF Directory at offset 0x9c48 (40008)
Image Width: 100 Image Length: 100
Bits/Sample: 16
Compression Scheme: None
Photometric Interpretation: min-is-black
Extra Samples: 1<unassoc-alpha>
FillOrder: msb-to-lsb
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 2
Rows/Strip: 20
Planar Configuration: single image plane
Page Number: 0-1
DocumentName: image-gray-colorspace.tif
White Point: 0.3127-0.329
PrimaryChromaticities: 0.640000,0.330000,0.300000,0.600000,0.150000,0.060000
Does a 24 bit .png file with transparency, such as those that can be generated with Photoshop, really have 24 bits distributed across the colors plus the alpha? Or do the 24 bits refer only to the colors and ignore the alpha (RGBA 8888)?
Is there any tool to examine a PNG file and verify this kind of information? Does Photoshop have any options to verify or configure this?
24 bit + alpha is actually 32 bits per pixel. That means you have the Red, Green, Blue and Alpha channels, each being 8 bits, allowing for 256 levels per channel, which translates to 256 x 256 x 256 x 256 possible combinations. That's what "millions of colours" and "billions of colours" mean in certain graphics and video software.
As I understand it, there are three kinds of "24 bit" pngs:
24 bits with no transparency. No alpha information, truly 24 bits per pixel.
24 bits per pixel with alpha transparency. This would be 24 bits of color information with 8 bits of alpha (allows for various levels of transparency) - 32 bits per pixel total.
24 bits per pixel with binary transparency. This would be 24 bits of color information with 1 bit of alpha (transparent or not transparent) - 25 bits per pixel total.
24 bit PNG doesn't say much. An image has a pixel format. The pixel format describes the Colorspace used (such as CMYK, RGB) and bits per channel information (i.e. how many bits are allocated to represent each channel of the colorspace in use).
Go to File > File Info > Advanced. That should tell you what you are looking for.
After dissecting the exported file myself (from Photoshop CS6), I found that the "24 bit" file generated by Photoshop is actually still 8 bit. The RGBA is still one byte per channel. The IHDR PNG chunk still says that it's 8 bits per channel.
It's an 8 bit PNG.
The exported PNG also contains about 825 bytes of useless marketing text data (per PNG image).
See the image (with the byte for "bits per channel" selected):
See the specification for more details:
http://www.libpng.org/pub/png/spec/1.2/png-1.2.pdf
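For anyone who wants to check the IHDR chunk without a hex editor, here is a minimal sketch that reads the bit depth and colour type straight from the file bytes. The layout (8-byte signature, then the IHDR chunk with width, height, bit depth and colour type) is taken from the PNG specification; png_pixel_format is a name I made up, and the test image is generated with Pillow purely for illustration:

```python
import io
import struct
from PIL import Image

# Channels per pixel for each PNG colour type (from the PNG specification)
CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}

def png_pixel_format(data):
    """Return (bit_depth, colour_type, bits_per_pixel) read from the IHDR chunk."""
    if data[:8] != b'\x89PNG\r\n\x1a\n' or data[12:16] != b'IHDR':
        raise ValueError('not a PNG file')
    # IHDR data starts at byte 16: width, height, bit depth, colour type
    width, height, bit_depth, colour_type = struct.unpack('>IIBB', data[16:26])
    return bit_depth, colour_type, bit_depth * CHANNELS[colour_type]

# Hypothetical test image: RGB with an 8-bit alpha channel
buf = io.BytesIO()
Image.new('RGBA', (4, 4), (255, 0, 0, 128)).save(buf, format='PNG')

bit_depth, colour_type, bits_per_pixel = png_pixel_format(buf.getvalue())
print(bit_depth, colour_type, bits_per_pixel)
```

To inspect a file exported from Photoshop, pass open(path, 'rb').read() to the function instead of the generated bytes.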