Efficient way of saving multiple images of different sizes as HDF5 (h5py)

I have millions of images; each is a different size, but all are generally quite small (~100x100).
What is the best way to save them into a single HDF5 file? The main reason I want a single file is that I don't want to run out of inodes.
Currently, the saved file is huge (~10x larger than the same images saved as PNGs). I tried using h5py's compression option as shown below; however, this results in an even bigger file than without compression.
import h5py

hf_img = h5py.File("imgs.hdf5", "w")
for i, img in enumerate(imgs):
    # Each dataset needs a unique name; reusing the same name
    # raises an error on the second iteration.
    hf_img.create_dataset(f"img_{i}", data=img, compression="gzip", compression_opts=9)
hf_img.close()
I have heard that the larger compressed file has to do with chunking, but I don't understand how to set chunking up efficiently.
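One approach worth trying (my suggestion, not from the original post): since the images are already well compressed as PNGs, keep the PNG encoding and store each image's raw bytes as a variable-length uint8 entry, so HDF5 never has to re-compress the pixels. A minimal sketch, assuming imgs is a list of paths to the existing PNG files:

import h5py
import numpy as np

imgs = ["a.png", "b.png"]  # assumed: paths to the existing PNG files

with h5py.File("imgs.hdf5", "w") as hf:
    # One variable-length uint8 entry per image keeps the PNG compression.
    dt = h5py.vlen_dtype(np.dtype("uint8"))
    ds = hf.create_dataset("png_bytes", shape=(len(imgs),), dtype=dt)
    for i, path in enumerate(imgs):
        with open(path, "rb") as f:
            ds[i] = np.frombuffer(f.read(), dtype=np.uint8)

The resulting file should land close to the sum of the PNG sizes, whereas gzipping millions of tiny ~100x100 pixel arrays one dataset at a time mostly pays per-dataset and per-chunk overhead.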

Related

PowerShell - How to compress images within multiple Word documents

I have a large quantity of Word documents containing images that result in very large file sizes (20-50 MB). This is causing storage and speed issues. I would like a PowerShell script, to add to the scripts I already run on batches of documents, that uses Word's image compression feature (or another method) to reduce the file size of the images without affecting the size at which they appear in the documents.
I found a script on Reddit that claimed to do this, but it simply did not work. Unfortunately, I am fairly inexperienced with PowerShell, and this falls well outside my ability to write myself. Any help would be appreciated.

Storing cached images

I'm looking for the best read performance for a bunch (~200) of cached 80px by 80px images. A large chunk (~50) will all need to be accessed at once.
Should I store the UIImages (as binary data) in a plist or using Core Data?
A couple of basic concepts:
Core Data is probably the worst way to go for the image data; the documentation states that BLOB segments in the store cause massive performance problems.
Use the file system for what it's built for: read/write access to random chunks of data.
The rest depends on how you organize your data, so here are some thoughts:
80x80 is pretty small; you could probably hold 50 or so in memory at a given time.
You need a way to hash the images into some kind of structure so you know which ones to fetch. I would use Core Data to store the locations of the images on the file system, and back your view with an NSFetchedResultsController to pull out the list of file names.
Use some in-memory data structure to store the UIImage objects; a FIFO queue with a size of 50 would work well here: as it gets a new image from the file system, it pops out the oldest one (a sketch follows this answer).
Finally, you have to know which images you're going to view and stay ahead of it. File system reads won't be super fast, so you'll need to either batch your reads or stay far enough ahead of your view to avoid lagging. If your view shows 50, you might want to keep 100 in memory: the 50 visible, plus the previous 25 and the next 25, if you're scrolling for example.
A premature optimization:
If read performance is essential, it would be worthwhile to store the images in "page"-sized chunks, such as a zip of 5 or n images that can be read into memory at once and then split into their corresponding UIImage objects.
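A minimal sketch of the FIFO cache idea above, written in Python purely for illustration (the original context is UIImage on iOS; load_image and the capacity of 50 are assumptions):

from collections import OrderedDict

def load_image(path):
    # Hypothetical stand-in for reading image data off the file system.
    with open(path, "rb") as f:
        return f.read()

class FIFOImageCache:
    def __init__(self, capacity=50):
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, path):
        if path not in self._cache:
            if len(self._cache) >= self.capacity:
                self._cache.popitem(last=False)  # evict the oldest entry
            self._cache[path] = load_image(path)
        return self._cache[path]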

iPhone storing large amounts of images

I have a large number of images that correspond to records in a SQLite db. Where should I store them? I have 3 versions of the same image: large, medium, and thumbnail sizes. (I don't want to store them in the db table, but reference them from each record instead.)
All the images share the same name: the small, medium, and large files for record 1 in the SQLite table would each be called "1.jpg", and so on. This is mainly because I'm using a batch image-resizing program that retains the same file name and creates a new folder for each output size.
Thanks for any help.
For my cached images, I stored them in the tmp folder, which you can access using NSTemporaryDirectory.
I don't know if it suits your case, or if it is good practice in general, but it works quite well.
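For illustration, a tiny sketch (in Python, with hypothetical names) of how the "same file name, different folder per size" scheme above maps a record id to a path:

import os

def image_path(base_dir, size, record_id):
    # size names one of the per-version folders, e.g. "large", "med", "thumb";
    # each folder holds one "<record_id>.jpg" per record, as described above.
    return os.path.join(base_dir, size, f"{record_id}.jpg")

# image_path("Images", "thumb", 1) -> "Images/thumb/1.jpg"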

Can I store a SQLite db as a zip file in an iPhone application?

My SQLite file is 7 MB in size, and I want to reduce it. How can I do that? Simply compressing it brings it down to around 1.2 MB. Can I compress mydb.sqlite into a zip file? If that is not possible, is there any other way to reduce the size of my SQLite file?
It is possible to compress it beforehand, but doing so is redundant. You will compress your binary before distribution, Apple distributes your app through the store compressed, and compressing an already-compressed file is fruitless. Thus, any work you do to compress beforehand should not have much effect on the resulting size of your application.
Without details of what you are storing in the DB, it's hard to give specific advice. The usual generalities of DB design apply: normalise your database. For example:
Reduce/remove repeating data. If you have text/data that is repeated, store it once and use a key to reference it.
If you are storing large chunks of data, you might be able to zip and unzip them in and out of the database in your app code, rather than trying to zip the DB; a sketch follows.
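A minimal sketch of that last idea, in Python with a hypothetical table (on the iPhone you would do the same with a zlib wrapper from Objective-C):

import sqlite3
import zlib

conn = sqlite3.connect("mydb.sqlite")
conn.execute("CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, body BLOB)")

def put_doc(doc_id, text):
    # Compress the large value in app code before it hits the DB.
    blob = zlib.compress(text.encode("utf-8"), 9)
    conn.execute("INSERT OR REPLACE INTO docs VALUES (?, ?)", (doc_id, blob))
    conn.commit()

def get_doc(doc_id):
    # Decompress on the way back out.
    (blob,) = conn.execute("SELECT body FROM docs WHERE id = ?", (doc_id,)).fetchone()
    return zlib.decompress(blob).decode("utf-8")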

Is Tie::File lazily loading a file?

I'm planning on writing a simple text viewer, which I'd expect to be able to deal with very large files, and I was thinking of using Tie::File for it, more or less paginating the lines. Does it load the lines lazily, or all of them at once?
It won't load the whole file. From the documentation:
The file is not loaded into memory, so this will work even for gigantic files.
As far as I can see from its source code, it stores only the used lines in memory. And yes, it loads data only when needed.
You can limit the amount of memory used with the memory parameter.
It also tracks the offsets of all lines in the file to optimize disk access.
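For intuition, a simplified sketch of that strategy in Python (Tie::File itself builds its offset table incrementally and caches recently used lines; this eager one-pass index is only an approximation):

class LazyLines:
    def __init__(self, path):
        # Index the byte offset of each line once; the line contents
        # stay on disk until requested.
        self._fh = open(path, "rb")
        self._offsets = []
        pos = 0
        for line in self._fh:
            self._offsets.append(pos)
            pos += len(line)

    def __getitem__(self, i):
        # Seek straight to line i instead of reading the whole file.
        self._fh.seek(self._offsets[i])
        return self._fh.readline().decode()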