How to shrink or manage an image's size in bytes - python-imaging-library

Python 3.6.6, Pillow 5.2.0
The Google Vision API has a size limit of 10485760 bytes.
When I'm working with a PIL Image, and save it to Bytes, it is hard to predict what the size will be. Sometimes when I try to resize it to have smaller height and width, the image size as bytes gets bigger.
I've tried experimenting with modes and formats, to understand their impact on size, but I'm not having much luck getting consistent results.
So I start out with a rawImage that is Bytes obtained from some user uploading an image (meaning I don't know much about what I'm working with yet).
rawImageSize = sys.getsizeof(rawImage)
if rawImageSize >= 10485760:
    imageToShrink = Image.open(io.BytesIO(rawImage))
    ## do something to the image here to shrink it
    # ... mystery code ...
    ## ideally, the minimum amount of shrinkage necessary to get it under 10485760
    rawBuffer = io.BytesIO()
    # possibly convert to RGB first
    shrunkImage.save(rawBuffer, format='JPEG') # PNG files end up bigger after this resizing (!?)
    rawImage = rawBuffer.getvalue()
    print(sys.getsizeof(rawImage))
To shrink it I've tried getting a shrink ratio and then simply resizing it:
shrinkRatio = 10485760.0 / float(rawImageSize)
imageWidth, imageHeight = imageToShrink.size
shrunkImage = imageToShrink.resize((int(imageWidth * shrinkRatio),
                                    int(imageHeight * shrinkRatio)), Image.LANCZOS)
Of course I could use a sufficiently small and somewhat arbitrary thumbnail size instead. I've thought about iterating thumbnail sizes until a combination takes me below the maximum bytes size threshold. I'm guessing the bytes size varies based on the color depth and mode and (?) I got from the end user that uploaded the original image. And that brings me to my questions:
Can I predict the size in bytes a PIL Image will be before I convert it for consumption by Google Vision? What is the best way to manage that size in bytes before I convert it?

First of all, you probably don't need to get as close as possible to the 10M limit imposed by the Google Vision API. In most cases, a much smaller file will be just fine, and faster.
In addition to that, you may want to keep in mind that the aspect ratio might lead to different results. See this: https://www.mlreader.com/prepare-image-for-google-vision-api
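The iteration idea from the question can be sketched as a binary search over the JPEG quality setting. This is only a sketch under assumptions: the helper names largest_quality_under and jpeg_encoder are made up for illustration, the encoded size is assumed to grow with quality, and the quality range 1 to 95 is a choice, not something from the question. Note that len(data) is used for the payload size, since sys.getsizeof also counts the Python object overhead.

```python
import io

VISION_LIMIT = 10485760  # byte limit quoted in the question


def largest_quality_under(encode, limit, lo=1, hi=95):
    """Binary-search the highest quality whose encoded size is below limit.

    encode(quality) must return bytes; size is assumed to grow with quality.
    Returns (quality, data), or None if even the lowest quality is too large.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        data = encode(mid)
        if len(data) < limit:
            best = (mid, data)   # fits: remember it and try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1         # too big: try a lower quality
    return best


def jpeg_encoder(raw_image):
    """Build an encode(quality) callable from uploaded bytes (requires Pillow)."""
    from PIL import Image  # imported lazily so the search itself has no dependency

    image = Image.open(io.BytesIO(raw_image)).convert("RGB")

    def encode(quality):
        buffer = io.BytesIO()
        image.save(buffer, format="JPEG", quality=quality)
        return buffer.getvalue()

    return encode
```

With Pillow installed, largest_quality_under(jpeg_encoder(rawImage), VISION_LIMIT) would return the best quality and its bytes, or None, in which case the image would also have to be resized.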

Related

Can Flutter render images from raw pixel data? [duplicate]

Setup
I am using a custom RenderBox to draw.
The canvas object in the code below comes from the PaintingContext in the paint method.
Drawing
I am trying to render pixels individually by using Canvas.drawRect.
I should point out that these are sometimes larger and sometimes smaller than the pixels on screen they actually occupy.
for (int i = 0; i < width * height; i++) {
  // in this case the rect size is 1
  final int x = i % width;
  final int y = i ~/ width;
  canvas.drawRect(
      Rect.fromLTWH(x.toDouble(), y.toDouble(), 1, 1),
      // assumes colors is indexed as [row][column]
      Paint()..color = colors[y][x]);
}
Storage
I am storing the pixels as a List<List<Color>> (colors in the code above). I tried differently nested lists previously, but they did not cause any noticeable discrepancies in terms of performance.
The memory on my Android Emulator test device increases by 282.7MB when populating the list with a 999x999 image. Note that it only temporarily increases by 282.7MB. After about half a minute, the increase drops to 153.6MB and stays there (without any user interaction).
Rendering
With a resolution of 999x999, the code above causes a GPU max of 250.1 ms/frame and a UI max of 1835.9 ms/frame, which is obviously unacceptable. The UI freezes for two seconds when trying to draw a 999x999 image, which should be a piece of cake (I would guess) considering that 4k video runs smoothly on the same device.
CPU
I am not exactly sure how to track this properly using the Android profiler, but while populating or changing the list, i.e. drawing the pixels (which is the case for the above metrics as well), CPU usage goes from 0% to up to 60%. Here are the AVD performance settings:
Cause
I have no idea where to start since I am not even sure what part of my code causes the freezing. Is it the memory usage? Or the drawing itself?
How would I go about this in general? What am I doing wrong? How should I store these pixels instead?
Efforts
I have tried so much that did not help at all that I will try to only point out the most notable ones:
I tried converting the List<List<Color>> to an Image from the dart:ui library, hoping to use Canvas.drawImage. In order to do that, I tried encoding my own PNG, but I have not been able to render more than a single row. However, it did not look like that would boost performance. When trying to convert a 9999x9999 image, I ran into an out of memory exception. Now, I am wondering how video is rendered at all, as any 4k video will easily take up more memory than a 9999x9999 image if a few seconds of it are in memory.
I tried implementing the image package. However, I stopped before completing it as I noticed that it is not meant to be used in Flutter but rather in HTML. I would not have gained anything using that.
This one is pretty important for the following conclusion I will draw: I tried to just draw without storing the pixels, i.e. using Random.nextInt to generate random colors. When trying to randomly generate a 999x999 image, this resulted in a GPU max of 1824.7 ms/frame and a UI max of 2362.7 ms/frame, which is even worse, especially in the GPU department.
Conclusion
This is the conclusion I reached before trying my failed attempt at rendering using Canvas.drawImage: Canvas.drawRect is not made for this task as it cannot even draw simple images.
How do you do this in Flutter?
Notes
This is basically what I tried to ask over two months ago (yes, I have been trying to resolve this issue for that long), but I think that I did not express myself properly back then and that I knew even less what the actual problem was.
The highest resolution I can properly render is around 10k pixels. I need at least 1m.
I am thinking that abandoning Flutter and going for native might be my only option. However, I would like to believe that I am just approaching this problem completely wrong. I have spent about three months trying to figure this out and I did not find anything that led me anywhere.
Solution
dart:ui has a function that converts pixels to an Image easily: decodeImageFromPixels
Example implementation
Issue on performance
Does not work in the current master channel
I was simply not aware of this back when I created this answer, which is why I wrote the "Alternative" section.
Alternative
Thanks to #pslink for reminding me of BMP after I wrote that I had failed to encode my own PNG.
I had looked into it previously, but I thought that it looked too complicated without sufficient documentation. Now, I found this nice article explaining the necessary BMP headers and implemented 32-bit BGRA (ARGB, but BGRA is the order of the default mask) by copying Example 2 from the "BMP file format" Wikipedia article. I went through all sources but could not find an original source for this example. Maybe the authors of the Wikipedia article wrote it themselves.
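To make the header layout concrete, here is a minimal sketch in Python (not the Dart code from the answer) that wraps raw 32-bit BGRA pixels in a BMP file using a 108-byte BITMAPV4HEADER with BI_BITFIELDS masks. The function name bmp_from_bgra is made up for illustration, and two choices are assumptions of this sketch: a negative height is written so that rows are stored top-down, and the resolution fields are set to 2835 pixels per meter.

```python
import struct


def bmp_from_bgra(width, height, bgra):
    """Wrap top-down 32-bit BGRA pixel bytes in a BMP with a BITMAPV4HEADER."""
    if len(bgra) != width * height * 4:
        raise ValueError("expected width * height * 4 bytes of pixel data")
    header_size = 14 + 108  # file header + BITMAPV4HEADER
    # File header: magic, total file size, two reserved words, pixel data offset.
    file_header = struct.pack(
        "<2sIHHI", b"BM", header_size + len(bgra), 0, 0, header_size)
    dib_header = struct.pack(
        "<IiiHHIIiiIIIIIII36x12x",
        108,           # header size
        width,
        -height,       # negative height: rows are stored top-down
        1,             # color planes
        32,            # bits per pixel
        3,             # BI_BITFIELDS
        len(bgra),     # size of the pixel data
        2835, 2835,    # horizontal / vertical pixels per meter (about 72 dpi)
        0, 0,          # palette colors, important colors
        0x00FF0000,    # red mask
        0x0000FF00,    # green mask
        0x000000FF,    # blue mask
        0xFF000000,    # alpha mask
        0x57696E20,    # color space type "Win "
    )                  # 36x / 12x: unused endpoints and gamma fields, zeroed
    return file_header + dib_header + bytes(bgra)
```

Computing the file size, data size and data offset from the pixel buffer, as done here, avoids having to patch those bytes afterwards.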
Results
Using Canvas.drawImage and my 999x999 pixels converted to an image from a BMP byte list, I get a GPU max of 9.9 ms/frame and a UI max of 7.1 ms/frame, which is awesome!
| ms/frame | Before (Canvas.drawRect) | After (Canvas.drawImage) |
|-----------|---------------------------|--------------------------|
| GPU max | 1824.7 | 9.9 |
| UI max | 2362.7 | 7.1 |
Conclusion
Canvas operations like Canvas.drawRect are not meant to be used like that.
Instructions
First off, this is quite straightforward. However, you need to populate the byte list correctly; otherwise, you will get an error that your data is not correctly formatted and see no results, which can be quite frustrating.
You will need to prepare your image before drawing as you cannot use async operations in the paint call.
In code, you need to use a Codec to transform your list of bytes into an image.
final list = [
  0x42, 0x4d, // 'B', 'M'
  ...];
// make sure that you either know the file size, data size and data offset beforehand
// or that you edit these bytes afterwards
final Uint8List bytes = Uint8List.fromList(list);
final Codec codec = await instantiateImageCodec(bytes);
final Image image = (await codec.getNextFrame()).image;
You need to pass this image to your drawing widget, e.g. using a FutureBuilder.
Now, you can just use Canvas.drawImage in your draw call.

BMP image header - biXPelsPerMeter

I have read a lot about the BMP file format structure, but I still cannot work out the real meaning of the fields "biXPelsPerMeter" and "biYPelsPerMeter". I mean, in a practical sense, how are they used or how can they be utilized? Any example or experience? Thanks a lot
biXPelsPerMeter
Specifies the horizontal print resolution, in pixels per meter, of the target device for the bitmap.
biYPelsPerMeter
Specifies the vertical print resolution.
It's not very important. You can leave them at 2835; it's not going to ruin the image.
(72 DPI × 39.3701 inches per meter yields 2834.6472)
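The conversion above is easy to script. A small sketch (the function names are made up for illustration) that converts between DPI and the pixels-per-meter value stored in the header:

```python
INCHES_PER_METER = 39.3701


def dpi_to_pels_per_meter(dpi):
    """Convert dots per inch to the integer pixels-per-meter header value."""
    return round(dpi * INCHES_PER_METER)


def pels_per_meter_to_dpi(pels_per_meter):
    """Convert a pixels-per-meter header value back to dots per inch."""
    return pels_per_meter / INCHES_PER_METER


print(dpi_to_pels_per_meter(72))   # 2835
print(dpi_to_pels_per_meter(96))   # 3780
```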
Think of it this way: The image bits within the BMP structure define the shape of the image using that much data (that much information describes the image), but that information must then be translated to a target device using a measuring system to indicate its applied resolution in practical use.
For example, if the BMP is 10,000 pixels wide, and 4,000 pixels high, that explains how much raw detail exists within the image bits. However, that image information must then be applied to some target. It uses the relationship to the dpi and its target to derive the applied resolution.
If it were printed at 1000 dpi then it's only going to give you an image with 10" x 4" but one with extremely high detail to the naked eye (more pixels per square inch). By contrast, if it's printed at only 100 dpi, then you'll get an image that's 100" x 40" with low detail (fewer pixels per square inch), but both of them have the same overall number of bits within. You can actually scale an image without scaling any of its internal image data by merely changing the dpi to non-standard values.
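As a sketch of how software might use these fields, the following reads the width, height and resolution from a BMP that starts with the usual 14-byte file header followed by a BITMAPINFOHEADER-compatible header, and derives the intended print size in inches. The function name is made up for illustration, and returning None when a resolution field is zero or negative is a choice of this sketch, not something the format prescribes.

```python
import struct

INCHES_PER_METER = 39.3701


def print_size_inches(bmp_bytes):
    """Return (width_in, height_in) implied by the header, or None if unset."""
    # biWidth and biHeight sit at file offsets 18 and 22.
    width, height = struct.unpack_from("<ii", bmp_bytes, 18)
    # biXPelsPerMeter and biYPelsPerMeter sit at file offsets 38 and 42.
    x_ppm, y_ppm = struct.unpack_from("<ii", bmp_bytes, 38)
    if x_ppm <= 0 or y_ppm <= 0:
        return None
    # pixels / (pixels per meter) gives meters; convert meters to inches.
    return (width / x_ppm * INCHES_PER_METER,
            abs(height) / y_ppm * INCHES_PER_METER)
```

For a 10,000 x 4,000 pixel bitmap whose fields are set to 39370 (about 1000 dpi), this gives roughly 10" x 4", matching the example above.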
Also, using 72 dpi is a throwback to ancient printing techniques (https://en.wikipedia.org/wiki/Twip) which are not really relevant in moving forward (except to maintain compatibility with standards) as modern hardware devices often use other values for their fundamental relationships to image data. For video screens, for example, Macs use 72 dpi as the default. Windows uses 96 dpi. Others are similar. In theory you can set it to whatever you want, but be warned that not all software honors the internal settings and will instead assume a particular size. This can affect the way images are scaled within the app, even though the actual image data within hasn't changed.

Why smaller PNG image takes up more space than the original after getting resized by GraphicsMagick

The original PNG image is 800x1200 and takes up about 34K. After the image is resized by GraphicsMagick to 320x480, the resulting image takes up approximately 37K. (For comparison, if the image is resized with Paint on Windows 7, the resulting image is 40K.) What gives? The whole point of resizing an image was to save space. How should GraphicsMagick be used to shrink the image size?
PNG is a lossless format and compresses the image data by first performing a step called prediction and then applying the same algorithm used in zlib. The prediction step is crucial for compressing the file effectively, and it is based on the values of earlier neighboring pixels.
So, suppose you have a large PNG in black & white (by that I really mean only black and white; some people confuse that with grayscale). Also suppose it is not a tiny checkerboard pattern. In many regions of this image, you will have a relatively large white region, and then a relatively large black region, and so on. When the predictor is inside one of these large regions, it has no trouble correctly predicting that the current pixel intensity is exactly equal to the last one. This makes it easier to compress the data describing your image.
Now, let us downscale this black & white image using some resampling filter different than nearest neighbor (let's say Lanczos). This has a great chance to turn your black & white image into a grayscale one, which has a much greater intensity range. This potentially makes the job of the predictor much harder, and thus the final file size might be larger.
For instance here is a black & white 256x256 PNG image which takes 5440 bytes, a resizing of it (using 3-lobed Lanczos) to 120x120 which now takes 7658 bytes, and another resizing (using nearest neighbor) to 120x120 which occupies 2467 bytes.
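The effect can be approximated without any imaging library. The sketch below is a simplification under stated assumptions: it skips PNG's prediction step and uses zlib's DEFLATE alone as a stand-in for the compression stage, it uses a synthetic black & white disc as the test image, and it swaps Lanczos for plain 2x2 averaging, which also introduces gray levels at the edges.

```python
import zlib


def make_disc(size):
    """Return a size x size black & white image (0 or 255) as a list of rows."""
    c = (size - 1) / 2
    r2 = (size * 0.4) ** 2
    return [[255 if (x - c) ** 2 + (y - c) ** 2 <= r2 else 0
             for x in range(size)] for y in range(size)]


def downscale_nearest(img):
    """Halve the image by keeping every second pixel (no new gray levels)."""
    return [row[::2] for row in img[::2]]


def downscale_average(img):
    """Halve the image by averaging 2x2 blocks (creates grays at the edges)."""
    n = len(img) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) // 4
             for x in range(n)] for y in range(n)]


def gray_levels(img):
    """Count the distinct intensity values in the image."""
    return len({v for row in img for v in row})


def deflated_size(img):
    """Size in bytes of the raw pixel data after zlib compression."""
    raw = bytes(v for row in img for v in row)
    return len(zlib.compress(raw, 9))


original = make_disc(256)
for name, img in [("original", original),
                  ("nearest", downscale_nearest(original)),
                  ("averaged", downscale_average(original))]:
    print(name, gray_levels(img), "levels,", deflated_size(img), "bytes")
```

The nearest-neighbor result keeps exactly two intensity values, while the averaged one gains intermediate grays along the edge of the disc, which is the kind of change that typically makes the compressed data larger.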
PNG is a compressed format. Sometimes trying to compress a maximally compressed item actually results in a larger item. So if the 800x1200 is resized to a smaller size, but the result retains everything that was in the original, because the original is already as minimal as possible, you could see this happen. To demonstrate this, try using 7zip to compress some data with ultra compression. Then try compressing the compressed file. Often the second compressed file will be larger than the first.

How to work with images(png's) of size 2-4Mb

I am working with images of size 2 to 4MB. I am working with images of resolution 1200x1600 by performing scaling, translation and rotation operations. I want to add another image on top of that and save it to the photo album. My app crashes after I successfully edit one image and save it to photos. I think it is happening because of the image size. I want to maintain 90% of the resolution of the images.
I am releasing some images when I get a memory warning. But it still crashes, as I am working with 2 images of 3MB each and a context of size 1200x1600, and getting an image from the context at the same time.
Is there any way to compress images and work with it?
I doubt it. Even compressing and decompressing an image without doing anything to it loses information. I suspect that any algorithms to manipulate compressed images would be hopelessly lossy.
Having said that, it may be technically possible. For instance, rotating a Fourier transform also rotates the original image. But practical image compression isn't usually as simple as just computing a Fourier transform.
Alternatively, you could write piecemeal algorithms that chop the image up into bite-sized pieces, transform the pieces and reassemble them afterwards. You might also provide a real-time view of the process by applying the same transform to a smaller version of the full image.
The key will be never to fully decode the entire image into memory at full size.
If you need to display the image, there's no reason to do that at full size -- the display on the iPhone is too small to take advantage of that. For image objects that are for display, decode the image in scaled down form.
For processing, you will need to write custom code that works on a stream of pixels rather than an in-memory array. I don't know if this is available on the iPhone already, but you can write it yourself by writing to the libpng library API directly.
For example, your code right now probably looks something like this (pseudo code)
img = ReadImageFromFile("image.png")
img2 = RotateImage(img, 90)
SaveImage(img2, "image2.png")
The key thing to understand is that in this case, img is not the data in the PNG file (2MB), but the fully uncompressed image (~6MB). RotateImage (or whatever it's called) returns another image of about this same size. If you are scaling up, it's even worse.
You want code that looks more like this (but there might not be any APIs for you to do it, so you might have to write it yourself)
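The uncompressed figure follows directly from the pixel dimensions. A quick check for the 1200x1600 images from the question, assuming 3 bytes per pixel for RGB and 4 for RGBA (the actual in-memory layout on the device may differ):

```python
def decoded_size_bytes(width, height, bytes_per_pixel):
    """Size of a fully decoded image held in memory as a flat pixel array."""
    return width * height * bytes_per_pixel


print(decoded_size_bytes(1200, 1600, 3))  # 5760000 bytes for RGB
print(decoded_size_bytes(1200, 1600, 4))  # 7680000 bytes for RGBA
```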
imgPixelGetter = PixelDecoderFromFile("image.png")
imgPixelSaver = OpenImageForAppending("image2.png")
w = imgPixelGetter.Width
h = imgPixelGetter.Height
// set up a 90 degree rotate
imgPixelSaver.Width = h
imgPixelSaver.Height = w
// read each vertical scanline of pixels
for (x = 0; x < w; ++x) {
    pixelRect = imgPixelGetter.ReadRect(x, 0, 1, h) // x, y, w, h
    pixelRect.Rotate(90); // it's now got a width of h and a height of 1
    imgPixelSaver.AppendScanLine(pixelRect)
}
In this algorithm, you never had the entire image in memory at once -- you read it out piece by piece and saved it. You can write similar algorithms for scaling and cropping.
The tradeoff is that it will be slower than just decoding it into memory -- it depends on the image format and the code that's doing the ReadRect(). Unfortunately, PNG is not designed for this kind of access to the pixels.
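For a format that does allow random access, the idea can be shown in a few lines. The sketch below assumes a raw, uncompressed 8-bit grayscale file (one byte per pixel, rows stored top to bottom), which is not what the question uses, and the function name is made up for illustration. It rotates the image 90 degrees clockwise while holding only one output scanline in memory.

```python
import io


def rotate90_streaming(src, dst, width, height):
    """Rotate a raw 8-bit grayscale image 90 degrees clockwise.

    src is a seekable binary file holding width * height bytes; dst receives
    the rotated image (height pixels wide, width pixels tall), one scanline
    at a time.
    """
    for x in range(width):
        # Output row x is source column x, read from the bottom row upwards.
        scanline = bytearray(height)
        for j in range(height):
            src.seek((height - 1 - j) * width + x)
            scanline[j] = src.read(1)[0]
        dst.write(scanline)


# Usage with in-memory files standing in for files on disk.
source = io.BytesIO(bytes([1, 2, 3,
                           4, 5, 6]))   # 3 wide, 2 tall
target = io.BytesIO()
rotate90_streaming(source, target, 3, 2)
print(list(target.getvalue()))  # [4, 1, 5, 2, 6, 3]
```

Reading one byte per seek is slow; a real implementation would read larger bands, but the memory profile is the point here.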

Image editing using iphone

I'm creating an image editing application for the iPhone. I would like to enable the user to pick an image from the photo library, edit it (grayscale, sepia, etc.) and, if possible, save it back to the filesystem. I've done the image picking (the simplest part, as you know, using the image picker) and also the grayscale conversion. But I got stuck with sepia; I don't know how to implement that. Is it possible to get the values of each pixel of the image so that we can vary them to get the desired effects? Or are there any other possible methods? Please help.
The Apple image picker code will most likely be holding just the file names and some lower-res renderings of the images in RAM til the last moment when a user selects an image.
When you ask for the full frame buffer of the image, the CPU suddenly has to do a lot more work decoding the image at full resolution, but it might be even as simple as this to trigger it off:
CFDataRef CopyImagePixels(CGImageRef inImage)
{
    return CGDataProviderCopyData(CGImageGetDataProvider(inImage));
}
/* IN MAIN APPLICATION FLOW - but see EDIT 2 below */
const char* pixels = [[((NSData*)CopyImagePixels([myImage CGImage]))
    autorelease] bytes]; /* N.B. returned pixel buffer would be read-only */
This is just a guess as to how it works, really, but based on some experience with image processing in other contexts. To work out whether what I suggest makes sense and is good from a memory usage point of view, run Instruments.
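Once the pixel values are accessible, sepia itself is a per-pixel linear transform. The sketch below is in Python rather than Objective-C and is not part of the original answer: it uses a commonly cited set of sepia coefficients (an assumption, other weightings exist) and expects tightly packed 8-bit RGB triplets, whereas the buffer returned on the device may have a different channel order or an alpha channel.

```python
def sepia_pixel(r, g, b):
    """Apply a commonly used sepia matrix to one RGB pixel, clamping to 255."""
    new_r = 0.393 * r + 0.769 * g + 0.189 * b
    new_g = 0.349 * r + 0.686 * g + 0.168 * b
    new_b = 0.272 * r + 0.534 * g + 0.131 * b
    return tuple(min(255, round(v)) for v in (new_r, new_g, new_b))


def sepia_rgb(pixels):
    """Return a sepia-toned copy of a buffer of packed RGB triplets."""
    out = bytearray(len(pixels))
    for i in range(0, len(pixels) - 2, 3):
        out[i:i + 3] = sepia_pixel(pixels[i], pixels[i + 1], pixels[i + 2])
    return bytes(out)
```

Because the source buffer is read-only, the result is written to a new buffer, which ties in with the memory concerns discussed in this answer.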
The Apple docs say (related, may apply to you):
You should avoid creating UIImage objects that are greater than 1024 x 1024 in size. Besides the large amount of memory such an image would consume, you may run into problems when using the image as a texture in OpenGL ES or when drawing the image to a view or layer. This size restriction does not apply if you are performing code-based manipulations, such as resizing an image larger than 1024 x 1024 pixels by drawing it to a bitmap-backed graphics context. In fact, you may need to resize an image in this manner (or break it into several smaller images) in order to draw it to one of your views.
[ http://developer.apple.com/iphone/library/documentation/UIKit/Reference/UIImage_Class/Reference/Reference.html ]
AND
Note: Prior to iPhone OS 3.0, UIView instances may have a maximum height and width of 1024 x 1024. In iPhone OS 3.0 and later, views are no longer restricted to this maximum size but are still limited by the amount of memory they consume. Therefore, it is in your best interests to keep view sizes as small as possible. Regardless of which version of iPhone OS is running, you should consider using a CATiledLayer object if you need to create views larger than 1024 x 1024 in size.
[ http://developer.apple.com/iPhone/library/documentation/UIKit/Reference/UIView_Class/UIView/UIView.html ]
Also worth noting:-
(a) Official how-to
http://developer.apple.com/iphone/library/qa/qa2007/qa1509.html
(b) From http://cameras.about.com/od/cameraphonespdas/fr/apple-iphone.htm
"The image size uploaded to your computer is at 1600x1200, but if you email the photo directly from the iPhone, the size will be reduced to 640x480."
(c) Encoding large images with JPEG image compression requires large amounts of RAM, depending on the size, possibly larger amounts than are available to the application.
(d) It may be possible to use an alternate compression algorithm with (if necessary) its malloc rewired to use temporary memory mapped files. But consider the data privacy/security issues.
(e) From iPhone SDK: After a certain number of characters entered, the animation just won't load
"I thought it might be a layer size issue, as the iPhone has a 1024 x 1024 texture size limit (after which you need to use a CATiledLayer to back your UIView), but I was able to lay out text wider than 1024 pixels and still have this work."
Sometimes the 1024 pixel limit may appear to be a bit soft, but I would always suggest you program defensively and stay within the 1024 pixel limit if you can.
EDIT 1
Added extra line break in code.
EDIT 2
Oops! The code gets a read-only copy of the data (there is a difference between CFMutableDataRef and CFDataRef). Because of the limited RAM available, you then have to either make a lower-res copy of it by smooth-scaling it down yourself, or copy it into a modifiable buffer. If the image is large, you may need to write it in bands to a temporary file, release the unmodifiable data block, and load the data back from the file. Of course, only do this if having the data in a temporary file like this is acceptable. Painful.
EDIT 3
Here's perhaps a better idea: try using a destination bitmap context backed by a memory-mapped CFData block. Does that work? Again, only do this if you're happy with the data going via a temporary file.
EDIT 4
Oh no, it appears that memory-mapped read-write CFData is not available. Maybe try the BSD mmap APIs.
EDIT 5
Added "const char*" and "pixels read-only" comment to code.