Preserving colors during CMYK to RGB transformation in PIL - python-imaging-library

I'm using PIL to process uploaded images, but I'm having trouble with the conversion from CMYK to RGB: the resulting images' tone and contrast change.
I suspect it is only doing direct numeric transformations. Does PIL, or anything built on top of it, have an Adobe-style, dummy-proof "consume embedded profile, convert to destination, preserve numbers" tool I can use for the conversion?
In all my healthy ignorance and inexperience, this sort of jumped out at me, and it's got me in a pinch. I'd really like to get this done without engaging the intricacies of color spaces, transformations and the necessary math for both at this point.
Though I've never used it before, I'm also open to using ImageMagick for this processing step if anyone has experience showing it can handle this gracefully.

It didn't take me long to run into other people mentioning Little CMS, the most popular open-source solution for color management. I ended up snooping around for Python bindings, found the old pyCMS, and some ostensible notions of PIL supporting Little CMS.
Indeed, there is support for Little CMS; it's mentioned in a whole whopping one-liner:
CMS support: littleCMS (1.1.5 or later is recommended).
The documentation contains no references, no topical guides, Google didn't crawl out anything, and their mailing list is closed... but digging through the source, there's a PIL.ImageCms module that's well documented and gets the job done. Hope this saves someone from a messy internet excavation.
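For anyone landing here, a minimal sketch of the PIL.ImageCms route (assuming a Pillow build with LittleCMS support; the fallback path and the helper's name are mine):

```python
import io
from PIL import Image, ImageCms

def cmyk_to_srgb(im):
    """Convert a CMYK PIL image to sRGB, honouring an embedded ICC
    profile when one is present; fall back to PIL's naive channel
    arithmetic when there is none."""
    icc = im.info.get('icc_profile')
    if icc:
        src = ImageCms.ImageCmsProfile(io.BytesIO(icc))  # embedded profile
        dst = ImageCms.createProfile('sRGB')             # built-in sRGB
        return ImageCms.profileToProfile(im, src, dst, outputMode='RGB')
    # No embedded profile: direct number transformation is all we have.
    return im.convert('RGB')
```

With an embedded profile present, this is the "consume embedded profile, convert to destination" behaviour the question asks for.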
Goes off getting himself a cookie...

It's 2019 and things have changed. Your problem is significantly more complex than it may appear at first sight. CMYK to RGB and RGB to CMYK is not a simple there-and-back. If, for example, you open an image in Photoshop and convert it there, the conversion has two additional parameters: a source color profile and a destination color profile. These change things greatly! For a typical use case, you would assume Adobe RGB 1998 on the RGB side and, say, Coated FOGRA39 on the CMYK side. These two additional pieces of information tell the converter how to treat the colors on input and output. What you need next is a transformation mechanism; Little CMS is indeed a great tool for this. It is MIT licensed and, after looking for solutions myself for a considerable time, I would recommend the following setup if you do need a Python way to transform colors:
Python 3.x (necessary because of littlecms)
pip install littlecms
pip install Pillow
In littlecms' /tests folder you will find a great set of examples. I allowed myself a particular adaptation of one test. Before you get to the code, let me tell you something about those color profiles. On Windows, as in my case, you will find a set of files with an .icc extension in C:\Windows\System32\spool\drivers\color, where Windows stores its color profiles. You can download other profiles from sites like https://www.adobe.com/support/downloads/iccprofiles/iccprofiles_win.html and install them on Windows simply by double-clicking the corresponding .icc file. The example I provide depends on such profile files, which Little CMS uses to do those magic color transforms.
I work as a semi-professional graphic designer and needed to convert colors from CMYK to RGB and vice versa for scripts that manipulate objects in InDesign. My setup is RGB: Adobe RGB 1998 and CMYK: Coated FOGRA39 (these settings were recommended by most of the book printers I work with). These color profiles gave me results very close to the same transforms made by Photoshop and InDesign. Still, be warned: the colors are slightly (by around 1%) off compared to what PS and Id will give you for the same inputs. I am still trying to figure out why...
The little program:
import littlecms as lc
from PIL import Image

def rgb2cmykColor(rgb,
                  psrc='C:\\Windows\\System32\\spool\\drivers\\color\\AdobeRGB1998.icc',
                  pdst='C:\\Windows\\System32\\spool\\drivers\\color\\CoatedFOGRA39.icc'):
    ctxt = lc.cmsCreateContext(None, None)
    white = lc.cmsD50_xyY()  # Set white point for D50
    dst_profile = lc.cmsOpenProfileFromFile(pdst, 'r')
    src_profile = lc.cmsOpenProfileFromFile(psrc, 'r')  # cmsCreate_sRGBProfile()
    transform = lc.cmsCreateTransform(src_profile, lc.TYPE_RGB_8, dst_profile, lc.TYPE_CMYK_8,
                                      lc.INTENT_RELATIVE_COLORIMETRIC, lc.cmsFLAGS_NOCACHE)
    n_pixels = 1
    in_comps = 3
    out_comps = 4
    rgb_in = lc.uint8Array(in_comps * n_pixels)
    cmyk_out = lc.uint8Array(out_comps * n_pixels)
    for i in range(in_comps):
        rgb_in[i] = rgb[i]
    lc.cmsDoTransform(transform, rgb_in, cmyk_out, n_pixels)
    cmyk = tuple(cmyk_out[i] for i in range(out_comps * n_pixels))
    return cmyk

def cmyk2rgbColor(cmyk,
                  psrc='C:\\Windows\\System32\\spool\\drivers\\color\\CoatedFOGRA39.icc',
                  pdst='C:\\Windows\\System32\\spool\\drivers\\color\\AdobeRGB1998.icc'):
    ctxt = lc.cmsCreateContext(None, None)
    white = lc.cmsD50_xyY()  # Set white point for D50
    dst_profile = lc.cmsOpenProfileFromFile(pdst, 'r')
    src_profile = lc.cmsOpenProfileFromFile(psrc, 'r')  # cmsCreate_sRGBProfile()
    transform = lc.cmsCreateTransform(src_profile, lc.TYPE_CMYK_8, dst_profile, lc.TYPE_RGB_8,
                                      lc.INTENT_RELATIVE_COLORIMETRIC, lc.cmsFLAGS_NOCACHE)
    n_pixels = 1
    in_comps = 4
    out_comps = 3
    cmyk_in = lc.uint8Array(in_comps * n_pixels)
    rgb_out = lc.uint8Array(out_comps * n_pixels)
    for i in range(in_comps):
        cmyk_in[i] = cmyk[i]
    lc.cmsDoTransform(transform, cmyk_in, rgb_out, n_pixels)
    rgb = tuple(rgb_out[i] for i in range(out_comps * n_pixels))
    return rgb
def rgb2cmykImage(PILImage,
                  psrc='C:\\Windows\\System32\\spool\\drivers\\color\\AdobeRGB1998.icc',
                  pdst='C:\\Windows\\System32\\spool\\drivers\\color\\CoatedFOGRA39.icc'):
    ctxt = lc.cmsCreateContext(None, None)
    white = lc.cmsD50_xyY()  # Set white point for D50
    dst_profile = lc.cmsOpenProfileFromFile(pdst, 'r')
    src_profile = lc.cmsOpenProfileFromFile(psrc, 'r')
    transform = lc.cmsCreateTransform(src_profile, lc.TYPE_RGB_8, dst_profile, lc.TYPE_CMYK_8,
                                      lc.INTENT_RELATIVE_COLORIMETRIC, lc.cmsFLAGS_NOCACHE)
    n_pixels = PILImage.size[0]
    in_comps = 3
    out_comps = 4
    n_rows = 16
    rgb_in = lc.uint8Array(in_comps * n_pixels * n_rows)
    cmyk_out = lc.uint8Array(out_comps * n_pixels * n_rows)
    outImage = Image.new('CMYK', PILImage.size, 'white')
    in_row = Image.new('RGB', (PILImage.size[0], n_rows), 'white')
    out_b = bytearray(n_pixels * n_rows * out_comps)
    row = 0
    while row < PILImage.size[1]:
        in_row.paste(PILImage, (0, -row))  # slide the source up by `row` pixels
        data_in = in_row.tobytes('raw')
        for i in range(in_comps * n_pixels * n_rows):
            rgb_in[i] = data_in[i]
        lc.cmsDoTransform(transform, rgb_in, cmyk_out, n_pixels * n_rows)
        # Copy by index: iterating the array directly yields values, not indices
        for j in range(out_comps * n_pixels * n_rows):
            out_b[j] = cmyk_out[j]
        out_row = Image.frombytes('CMYK', in_row.size, bytes(out_b))
        outImage.paste(out_row, (0, row))
        row += n_rows
    return outImage
def cmyk2rgbImage(PILImage,
                  psrc='C:\\Windows\\System32\\spool\\drivers\\color\\CoatedFOGRA39.icc',
                  pdst='C:\\Windows\\System32\\spool\\drivers\\color\\AdobeRGB1998.icc'):
    ctxt = lc.cmsCreateContext(None, None)
    white = lc.cmsD50_xyY()  # Set white point for D50
    dst_profile = lc.cmsOpenProfileFromFile(pdst, 'r')
    src_profile = lc.cmsOpenProfileFromFile(psrc, 'r')
    transform = lc.cmsCreateTransform(src_profile, lc.TYPE_CMYK_8, dst_profile, lc.TYPE_RGB_8,
                                      lc.INTENT_RELATIVE_COLORIMETRIC, lc.cmsFLAGS_NOCACHE)
    n_pixels = PILImage.size[0]
    in_comps = 4
    out_comps = 3
    n_rows = 16
    cmyk_in = lc.uint8Array(in_comps * n_pixels * n_rows)
    rgb_out = lc.uint8Array(out_comps * n_pixels * n_rows)
    outImage = Image.new('RGB', PILImage.size, 'white')
    in_row = Image.new('CMYK', (PILImage.size[0], n_rows), 'white')
    out_b = bytearray(n_pixels * n_rows * out_comps)
    row = 0
    while row < PILImage.size[1]:
        in_row.paste(PILImage, (0, -row))  # slide the source up by `row` pixels
        data_in = in_row.tobytes('raw')
        for i in range(in_comps * n_pixels * n_rows):
            cmyk_in[i] = data_in[i]
        lc.cmsDoTransform(transform, cmyk_in, rgb_out, n_pixels * n_rows)
        # Copy by index: iterating the array directly yields values, not indices
        for j in range(out_comps * n_pixels * n_rows):
            out_b[j] = rgb_out[j]
        out_row = Image.frombytes('RGB', in_row.size, bytes(out_b))
        outImage.paste(out_row, (0, row))
        row += n_rows
    return outImage

Something to note for anyone implementing this: you probably want to take the uint8 CMYK values (0-255) and scale them into the range 0-100 to better match most color pickers and typical uses of these values. See my code here: https://gist.github.com/mattdesl/ecf305c2f2b20672d682153a7ed0f133
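A tiny sketch of that mapping (the helper name is mine):

```python
def cmyk_bytes_to_percent(cmyk):
    """Map 8-bit CMYK components (0-255) onto the 0-100 scale used by
    most color pickers, rounding to the nearest integer."""
    return tuple(round(v * 100 / 255) for v in cmyk)
```

For example, a pure-cyan pixel `(255, 0, 0, 0)` comes out as `(100, 0, 0, 0)`.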

Related

PyPdf2 merge with scanned PDF yields "An error exists on this page..."

I want to use PyPDF2 to take each page of a scanned PDF document, scale the page to 85% of its original size, and center it on a blank 8.5 x 11 page (keeping the same number of pages), to create the margins needed for printing/adding barcodes.
I've tried a few approaches with mergeScaledTranslatedPage but I keep ending up with an error message when I open the file in Adobe Acrobat DC.
Even if the output appears to be a success, I get the following error when opening the file:
An error exists on this page. Acrobat may not display the page correctly. Please contact the person who created the PDF document to correct the problem.
How can I make it work?
I'm the maintainer of pypdf and PyPDF2. Please use pypdf.
from pypdf import PdfReader, PdfWriter, Transformation
from pypdf.generic import RectangleObject

reader = PdfReader("GeoTopo.pdf")
writer = PdfWriter()
desired_width = 100
desired_height = 100
r = RectangleObject([0, 0, desired_width, desired_height])
for page in reader.pages[:10]:
    old_width = page.mediabox.width
    old_height = page.mediabox.height
    a1 = desired_width / old_width
    a2 = desired_height / old_height
    factor = min(a1, a2)
    new_width = float(old_width * factor)
    new_height = float(old_height * factor)
    dx = (desired_width - new_width) / 2
    dy = (desired_height - new_height) / 2
    op = Transformation().translate(tx=dx, ty=dy)
    page.scale_to(width=new_width, height=new_height)
    page.add_transformation(op)
    page.mediabox = r
    page.artbox = r
    page.cropbox = r
    page.bleedbox = r
    page.trimbox = r
    writer.add_page(page)
with open("foo.pdf", "wb") as fp:
    writer.write(fp)
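For the asker's actual target (85% on a US Letter sheet), only the arithmetic changes. A sketch of that math, with a helper name of my own (Letter is 612 x 792 points):

```python
def center_on_letter(page_w, page_h, scale=0.85):
    """Return (new_w, new_h, dx, dy): the page dimensions after scaling
    by `scale`, and the translation that centers the scaled page on a
    US Letter sheet (612 x 792 pt)."""
    LETTER_W, LETTER_H = 612.0, 792.0
    new_w, new_h = page_w * scale, page_h * scale
    dx = (LETTER_W - new_w) / 2  # equal left/right margins
    dy = (LETTER_H - new_h) / 2  # equal top/bottom margins
    return new_w, new_h, dx, dy
```

Feed `new_w`/`new_h` to `page.scale_to()` and `dx`/`dy` to `Transformation().translate()` exactly as above, with the boxes set to `RectangleObject([0, 0, 612, 792])`.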

Tesseract fails to recognize digits, even with rescaling, char white_listing and filtering

For an open source pokerbot I'm trying to recognize images as implemented here. I have tried the following with an example image that I'd like tesseract to recognize:
pytesseract.image_to_string(img_orig)
Out[32]: 'cies TE'
pytesseract.image_to_string(img_mod, 'eng', config='--psm 6 --oem 1 -c tessedit_char_whitelist=0123456789.$£B')
Out[33]: ''
So then let's use some more sophisticated methods by rescaling:
basewidth = 200
wpercent = basewidth / float(img_orig.size[0])
hsize = int(float(img_orig.size[1]) * float(wpercent))
img_resized = img_orig.convert('L').resize((basewidth, hsize), Image.ANTIALIAS)
if binarize:
    img_resized = binarize_array(img_resized, 200)
Now we end up with an image looking like this:
Let's see what comes out:
pytesseract.image_to_string(img_resized)
Out[34]: 'Stee'
pytesseract.image_to_string(img_resized, 'eng', config='--psm 6 --oem 1 -c tessedit_char_whitelist=0123456789.$£B')
Out[35]: ''
Ok, that didn't work. Let's try applying some filters:
img_min = img_resized.filter(ImageFilter.MinFilter)
img_mod = img_resized.filter(ImageFilter.ModeFilter)
img_med = img_resized.filter(ImageFilter.MedianFilter)
img_sharp = img_resized.filter(ImageFilter.SHARPEN)
pytesseract.image_to_string(img_min)
Out[36]: ''
pytesseract.image_to_string(img_mod)
Out[37]: 'oe Se'
pytesseract.image_to_string(img_med)
Out[38]: 'rete'
pytesseract.image_to_string(img_sharp)
Out[39]: 'ry'
Or maybe binarize will help?
numpy_array = np.array(image)
for i in range(len(numpy_array)):
    for j in range(len(numpy_array[0])):
        if numpy_array[i][j] > threshold:
            numpy_array[i][j] = 255
        else:
            numpy_array[i][j] = 0
img_binarized = Image.fromarray(numpy_array)
pytesseract.image_to_string(img_binarized)
Out[42]: 'Sion'
pytesseract.image_to_string(img_binarized, 'eng', config='--psm 6 --oem 1 -c tessedit_char_whitelist=0123456789.$£B')
Out[44]: '0'
Again, all totally wrong.
What else can I do?
Any suggestions would be greatly appreciated.
Add on example:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
threshold_img = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
im_pil = cv2_to_pil(threshold_img)
pytesseract.image_to_string(im_pil)
Out[5]: 'TUM'
or trying another suggested algorithm:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
threshold_img = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
im_pil = cv2_to_pil(threshold_img)
pytesseract.image_to_string(im_pil, 'eng', config='--psm 7')
Out[5]: '$1.99'
I think you are making it too complicated here. I did simple Otsu thresholding on the image that you've provided and was able to get the output.
image_path = r'path/to/image'
img = cv2.imread(image_path)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]
cv2.imwrite('thresh.png', thresh)
detected_text = pytesseract.image_to_string(Image.open(image_path))
print(detected_text)
The image that I got after thresholding was
Tesseract was easily able to detect it and the output I got: $0.51
You can do it with or without Otsu. The trick for me was to invert the image so that it's black text on white background (which Tesseract seems to prefer).
EDIT One more trick for Tesseract is to add a border around the image. Tesseract does not like for the text to be too close to the edge.
import cv2
import pytesseract
import numpy as np
img = cv2.imread('one_twenty_nine.png', cv2.IMREAD_GRAYSCALE)
thresh = cv2.threshold(img, 100, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)[1]
cv2.imwrite('thresh.png', thresh)
detected_text = pytesseract.image_to_string(thresh, config = '--psm 7')
print(detected_text)
which gives
$1.29
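The border trick can be sketched with plain NumPy if you'd rather not add another OpenCV call (the helper name is mine):

```python
import numpy as np

def add_border(gray, pad=10, value=255):
    """Pad a grayscale image array with a white margin on every side,
    so the text is not flush against the edge; Tesseract handles text
    much better when it is surrounded by background."""
    return np.pad(gray, pad_width=pad, mode='constant', constant_values=value)
```

Apply it to the thresholded array before handing it to `pytesseract.image_to_string`.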

How can I sum up functions that are made of elements of the imported dataset?

See the code and error below. I have already tried Do, For, ... and it is not working.
CODE + error from Mathematica:
Import of survival probabilities _{k}p_x and _{k}p_y (calculated in Excel):
px = Import["C:\Users\Eva\Desktop\kpx.xlsx"];
px = Flatten[Take[px, All], 1];
NOTE: The probability _{k}p_x can be found at position px[[k+2, x-16]]
i = 0.04;
v = 1/(1 + i);
JointLifeIndep[x_, y_, n_] = Sum[v^k*px[[k + 2, x - 16]]*py[[k + 2, y - 16]], {k , 0, n - 1}]
Part::pkspec1: The expression 2+k cannot be used as a part specification.
Part::pkspec1: The expression 2+k cannot be used as a part specification.
Part::pkspec1: The expression 2+k cannot be used as a part specification.
General::stop: Further output of Part::pkspec1 will be suppressed during this calculation.
Part of dataset (left corner of the dataset):
k\x 18 19 20
0 1 1 1
1 0.999478086278185 0.999363078716059 0.99927911905056
2 0.998841497412202 0.998642656911039 0.99858030519133
3 0.998121451605207 0.99794428814123 0.99788275311401
4 0.997423447323642 0.997247180349674 0.997174407432264
5 0.996726703362208 0.996539285828369 0.996437857252448
6 0.996019178300768 0.995803204773039 0.99563600297737
7 0.995283481416241 0.995001861216016 0.994823584922968
8 0.994482556091416 0.994189960607964 0.99405569519175
9 0.993671079225432 0.99342255996206 0.993339856748282
10 0.992904079096455 0.992707177451333 0.992611817294026
11 0.992189069953677 0.9919796017009 0.991832027835091
Without having the exact same data files to work with it is often easy for each of us to make mistakes that the other cannot reproduce or understand.
From your snapshot of your data set I used Export in Mathematica to try to reproduce your .xlsx file. Then I tried the following
px = Import["kpx.xlsx"];
px = Flatten[Take[px, All], 1];
py = px; (* fake some py data *)
i = 0.04;
v = 1/(1 + i);
JointLifeIndep[x_, y_, n_] := Sum[v^k*px[[k+2,x-16]]*py[[k+2,y-16]], {k,0,n-1}];
JointLifeIndep[17, 17, 12]
and it displays 362.402
Notice I used := instead of = in my definition of JointLifeIndep. := and = do different things in Mathematica. = will immediately evaluate the right hand side of that definition. This is possibly the reason that you are getting the error that you do.
You should also be careful with your subscript values and make sure that every subscript is between 1 and the number of rows (or columns) in your matrix.
So see if you can try this example with an Excel sheet containing only the snapshot of data that you showed and see if you get the same result that I do.
Hopefully that will be enough for you to make progress.

Extract numbers from specific image

I am involved in a project that I think you can help me with. I have multiple images, which you can see here: Images to recognize. The goal is to extract the numbers between the dashed lines. What is the best approach? My idea from the beginning has been to find the coordinates of the dashed lines, crop to them, and then just run OCR software. But finding those coordinates is not easy; can you help me? Or, if you have a better approach, please share it.
Best regards,
Pedro Pimenta
You may start by looking at more obvious (bigger) objects in your images. The dashed lines are way too small in some images. Searching for the "euros milhoes" logo and the barcode will be easier and it will help you have an idea of the scale and rotation involved.
To find these objects without using template matching, you can binarize your image (watch out for the background texture) and use Hu moments on the contours/blobs.
Don't expect a good OCR accuracy on images where the numbers are smaller than 8-10 pixels.
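A minimal sketch of that binarization step (assuming a NumPy grayscale array; the threshold choice is mine, and the blob/Hu-moment matching itself is left to OpenCV):

```python
import numpy as np

def binarize(gray, threshold=None):
    """Threshold a grayscale array to a 0/255 binary image. With no
    threshold given, fall back to the mean intensity, which copes
    tolerably with a textured background."""
    if threshold is None:
        threshold = gray.mean()
    return ((gray > threshold) * 255).astype(np.uint8)
```

The resulting mask can then go straight into contour extraction for the logo and barcode search.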
You can use python-tesseract (https://code.google.com/p/python-tesseract/); it works with your image. What you need to do is split the result string. I used your https://www.dropbox.com/sh/kcybs1i04w3ao97/u33YGH_Kv6#f:euro9.jpg to test, and the source code is below. UPDATE
# -*- coding: utf-8 -*-
from PIL import Image
from PIL import ImageEnhance
import tesseract

im = Image.open('test.jpg')
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(4)
im = im.convert('1')
w, h = im.size
im = im.resize((w * (416 / h), 416))
pix = im.load()
LINE_CR = 0.01
WHITE_HEIGHT_CR = int(h * (20 / 416.0))
status = 0
white_line = []
for i in xrange(h):
    line = []
    for j in xrange(w):
        line.append(pix[(j, i)])
    p = line.count(0) / float(w)
    if not p > LINE_CR:
        white_line.append(i)
wp = None
for i in range(10, len(white_line) - WHITE_HEIGHT_CR):
    k = white_line[i]
    if white_line[i + WHITE_HEIGHT_CR] == k + WHITE_HEIGHT_CR:
        wp = k
        break
result = []
flag = 0
while 1:
    if wp < 0:
        result.append(wp)
        break
    line = []
    for i in xrange(w):
        line.append(pix[(i, wp)])
    p = line.count(0) / float(w)
    if flag == 0 and p > LINE_CR:
        l = []
        for xx in xrange(20):
            l.append(pix[(xx, wp)])
        if l.count(0) > 5:
            break
        l = []
        for xx in xrange(416 - 1, 416 - 100 - 1, -1):
            l.append(pix[(xx, wp)])
        if l.count(0) > 17:
            break
        result.append(wp)
        wp -= 1
        flag = 1
        continue
    if flag == 1 and p < LINE_CR:
        result.append(wp)
        wp -= 1
        flag = 0
        continue
    wp -= 1
result.reverse()
for i in range(1, len(result)):
    if result[i] - result[i - 1] < 15:
        result[i - 1] = -1
result = filter(lambda x: x >= 0, result)
im = im.crop((0, result[0], w, result[-1]))
im.save('test_converted.jpg')
api = tesseract.TessBaseAPI()
api.Init(".", "eng", tesseract.OEM_DEFAULT)
api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyz")
api.SetPageSegMode(tesseract.PSM_AUTO)
mImgFile = "test_converted.jpg"
mBuffer = open(mImgFile, "rb").read()
result = tesseract.ProcessPagesBuffer(mBuffer, len(mBuffer), api)
print "result(ProcessPagesBuffer)=", result
Depends on Python 2.7, python-tesseract-win32, python-opencv, numpy and PIL; be sure to follow python-tesseract's setup instructions.

how to determine the transparent color index of ICO image with PIL?

Specifically, this is from an .ico file, so there is no "transparency" info attribute like you would get with a GIF. The example below converts Yahoo!'s favicon to a PNG using the transparency index of 0, which I guessed. How do I detect that the ICO is in fact transparent, and that the transparency index is 0?
import urllib2
import Image
import StringIO
resp = urllib2.urlopen("http://www.yahoo.com/favicon.ico")
image = Image.open(StringIO.StringIO(resp.read()))
f = file("test.png", "wb")  # binary mode, since PNG is a binary format
# I guessed that the transparent index is 0. how to
# determine it correctly ?
image.save(f, "PNG", quality=95, transparency=0)
It looks like someone recognized that PIL doesn't really read ICO correctly (I can see the same thing after reconciling its source code with some research on the ICO format: there is an AND bitmap which determines transparency) and came up with this extension:
http://www.djangosnippets.org/snippets/1287/
Since this is useful for non-Django applications, I've reposted it here with a few tweaks to its exception handling:
import operator
import struct
from PIL import BmpImagePlugin, PngImagePlugin, Image

def load_icon(file, index=None):
    '''
    Load Windows ICO image.

    See http://en.wikipedia.org/w/index.php?oldid=264332061 for file format
    description.
    '''
    if isinstance(file, basestring):
        file = open(file, 'rb')
    try:
        header = struct.unpack('<3H', file.read(6))
    except:
        raise IOError('Not an ICO file')
    # Check magic
    if header[:2] != (0, 1):
        raise IOError('Not an ICO file')
    # Collect icon directories
    directories = []
    for i in xrange(header[2]):
        directory = list(struct.unpack('<4B2H2I', file.read(16)))
        for j in xrange(3):
            if not directory[j]:
                directory[j] = 256
        directories.append(directory)
    if index is None:
        # Select best icon
        directory = max(directories, key=operator.itemgetter(slice(0, 3)))
    else:
        directory = directories[index]
    # Seek to the bitmap data
    file.seek(directory[7])
    prefix = file.read(16)
    file.seek(-16, 1)
    if PngImagePlugin._accept(prefix):
        # Windows Vista icon with PNG inside
        image = PngImagePlugin.PngImageFile(file)
    else:
        # Load XOR bitmap
        image = BmpImagePlugin.DibImageFile(file)
        if image.mode == 'RGBA':
            # Windows XP 32-bit color depth icon without AND bitmap
            pass
        else:
            # Patch up the bitmap height
            image.size = image.size[0], image.size[1] >> 1
            d, e, o, a = image.tile[0]
            image.tile[0] = d, (0, 0) + image.size, o, a
            # Calculate AND bitmap dimensions. See
            # http://en.wikipedia.org/w/index.php?oldid=264236948#Pixel_storage
            # for description
            offset = o + a[1] * image.size[1]
            stride = ((image.size[0] + 31) >> 5) << 2
            size = stride * image.size[1]
            # Load AND bitmap
            file.seek(offset)
            string = file.read(size)
            mask = Image.fromstring('1', image.size, string, 'raw',
                                    ('1;I', stride, -1))
            image = image.convert('RGBA')
            image.putalpha(mask)
    return image
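Worth noting for modern readers: current Pillow decodes the ICO AND mask itself, so the snippet above is mostly historical. A quick self-contained check, built on an icon generated in memory (the helper name is mine):

```python
import io
from PIL import Image

def ico_decodes_with_alpha(data):
    """Return True if Pillow decodes the ICO byte stream with an alpha
    channel (or explicit transparency info) rather than leaving the
    caller to guess a transparent palette index."""
    im = Image.open(io.BytesIO(data))
    return im.mode in ('RGBA', 'LA') or 'transparency' in im.info

# Build a tiny, fully transparent red icon in memory to demonstrate.
buf = io.BytesIO()
Image.new('RGBA', (16, 16), (255, 0, 0, 0)).save(buf, format='ICO')
```

If this returns True for your favicon bytes, you can simply convert to RGBA and save as PNG, with no transparency index needed.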