Why do game developers put many images into one big image? - iphone

Over the years I've often asked myself why game developers pack many small images into one big one. And it's not only game developers: I remember the good old Winamp MP3 player had a user-interface skin file that was just one huge image containing lots of small ones.
I have also seen some big JavaScript GUI libraries like ext.js using this technique. In ext.js there is one big image containing many small ones.
One thing I noticed is this: no matter how small my PNG image is, the Finder on the Mac always tells me it consumes at least 4 KB, which is a heck of a lot if you have just 10 pixels.
So is this done because storing 20 or more small images in one big one is much more space-efficient than having 20 separate files, each of them probably with its own header and metadata?
Is it because locating files on the file system is expensive and slow, making it much faster to locate just one big image and then split it up into smaller ones once it is loaded into memory?
Or is it laziness, because it is tedious to think of so many file names?
Is there a name for this technique? And how are those small images separated from the big one at runtime?

This is called spriting - and there are various reasons to do it in different situations.
For web development, it means that only one web request is required to fetch the image, which can be a lot more efficient than several separate requests: there is less overhead from the individual requests, and the final image file may well be smaller in total than the separate files would have been.
The same sort of effect can show up in other scenarios. For example, it may be more efficient to store and load a single large image file than multiple small ones, depending on the file system. That's entirely aside from any savings in raw total file size, and is due to per-file overhead (a directory entry, block-size rounding, etc.). It's a bit like the per-request overhead in the web scenario, but caused by slightly different factors.
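At runtime the small images are just sub-rectangles of the big one, looked up by index or by name. A minimal sketch (with a hypothetical uniform-grid layout and made-up function names) of how a sprite index maps to its source rectangle:

```python
# Minimal sketch, assuming the sheet is a uniform left-to-right,
# top-to-bottom grid of equally sized sprites (names are hypothetical).
def sprite_rect(index, sprite_w, sprite_h, sheet_w):
    """Return (x, y, w, h) of the index-th sprite in the sheet."""
    cols = sheet_w // sprite_w          # sprites per row
    x = (index % cols) * sprite_w
    y = (index // cols) * sprite_h
    return (x, y, sprite_w, sprite_h)

# A 256-pixel-wide sheet of 32x32 sprites holds 8 per row,
# so sprite 10 sits at column 2 of row 1:
print(sprite_rect(10, 32, 32, 256))    # (64, 32, 32, 32)
```

A renderer (or CSS `background-position`) then draws only that rectangle from the big image.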

None of these answers are right. The reason we pack multiple images into one big "sprite sheet" or "texture atlas" is to avoid swapping textures during rendering.
OpenGL and DirectX take a performance hit when you draw from one image (texture) and then switch to another, so we pack multiple images into one big texture and can then draw several (or hundreds) of images without ever switching textures. It has nothing to do with the 4 KB file size (or hasn't in 15 years).
Also, until fairly recently, texture dimensions had to be powers of 2 (64, 128, 256, ...), and if your game had lots of odd-sized images, that meant a lot of wasted memory. Packing them into a single texture could save a lot of space.
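A quick back-of-the-envelope sketch of the power-of-two waste mentioned above (assuming 4 bytes per pixel and no mipmaps; the numbers are illustrative, not from any particular GPU):

```python
# How much memory is wasted when an odd-sized image must be padded up to
# power-of-two texture dimensions (assumes 4 bytes/pixel, no mipmaps).
def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p

def padded_waste(w, h, bytes_per_pixel=4):
    pw, ph = next_pow2(w), next_pow2(h)
    return (pw * ph - w * h) * bytes_per_pixel

# A 130x70 image forces a 256x128 texture, most of it empty:
print(padded_waste(130, 70))   # 94672 bytes wasted
```

Packing many such odd-sized images into one power-of-two atlas lets them share that padding instead of each paying it separately.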

The 4 KB usage is a side effect of how files are stored on disk. The smallest addressable unit of storage in a filesystem is a block, which is usually a fixed size of 512, 1024, 2048, etc. bytes. In your Mac's case, it's using 4 KB blocks. That means that even a 1-byte file requires at least 4 KB of physical space to store, as it's not possible for the file system to address any storage unit SMALLER than 4 KB.
The reasons for these "large" blocks vary, but the big one is that the more "granular" your addressing gets (the smaller the blocks), the more space you waste on indexes to list which blocks are assigned to which files. If you had 1-byte blocks, then for every byte of data you store in a file, you'd also need to store 1+ bytes worth of usage information in the file system's metadata, and you'd end up wasting at least HALF of your storage on nothing but indexes.
The converse is true as well: the bigger the blocks, the more space is wasted for every smaller-than-one-block file you store, so in the end it comes down to what tradeoff you're willing to live with.
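The block-rounding effect described above is easy to sketch (the 4096-byte block size and 300-byte icon size are just assumptions for illustration):

```python
# On-disk usage rounds each file up to a whole number of blocks, so many
# tiny files waste far more space than one big file of the same total size.
def on_disk(size_bytes, block=4096):
    blocks = (size_bytes + block - 1) // block   # round up to whole blocks
    return max(blocks, 1) * block                # even an empty file takes a block

# Twenty 300-byte icons vs one 6000-byte sprite sheet:
print(sum(on_disk(300) for _ in range(20)))  # 81920 bytes on disk
print(on_disk(20 * 300))                     # 8192 bytes on disk
```

Same 6000 bytes of pixel data, roughly a tenfold difference in physical storage.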

The reasons are a bit different in different environments.
On the web the main reason is to reduce the number of requests to the web server. Each request creates overhead, most notably a separate round trip over the network.
When fetching from good ol' mechanical hard drives, good read performance requires contiguous data. If you save data in lots of files you get extra seek time for each file. There is also the block size to consider. Files are made out of blocks, in your case 4 kB. When reading a file of one byte you need to read a whole block anyway. If you have many small images you can stuff a whole bunch of them into a single disk block and get them all in the same time as it takes to read one.

Another reason from days of yore was palettes.
If you had one image, you could theme it with one palette: colour 14 = light grey with a hint of green.
If you had lots of little images, you had to make sure you used the same palette for every one while designing them, or you got all sorts of artifacts.
Given one palette, you could then manipulate it, so everything currently green could be made red by flipping one value in the palette instead of trawling through every image.
Lots of simple animations like fire, smoke, running water are still done with this method.
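A toy illustration of the palette trick (the colour names and index values are made up): the image stores palette indices rather than colours, so re-theming or animating means rewriting one palette entry, never the pixels.

```python
# Pixels are stored as palette indices; changing one palette entry
# recolours every pixel that references it, in one write.
palette = {13: "dark grey", 14: "light grey-green", 15: "blue"}
image = [14, 14, 13, 14, 15, 14]   # six pixels, stored as indices

palette[14] = "red"                # one write re-themes every index-14 pixel

rendered = [palette[i] for i in image]
print(rendered)  # ['red', 'red', 'dark grey', 'red', 'blue', 'red']
```

Cycling a few palette entries each frame is how those fire/smoke/water animations work: the pixel data never changes.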

Related

SVG Assets preferred in Xcode 13/14 over multiple PNG Images?

Xcode 12 introduced support for using Scalable Vector Graphic (SVG) image assets. They come with a lot of benefits, like smaller size, less management effort, etc.
My question is: Does SVG also come with a sacrifice of compiling performance in the latest Xcode 13/14?
My quick test validates one's intuition, namely that compilation is faster (though only slightly) if you prepare the 1×, 2×, and 3× scale rasterized images yourself beforehand. For my test with twenty trivial ~1 kB SVGs (the standard square.and.up.arrow icon), building was 0.3 seconds slower than with the same number of sets of pre-prepared PNGs.
So, it depends upon the number of vector graphics, and the size/complexity of those vector files. But in my current project with ~100 vector assets, the compilation time of the assets has never been the concern. But my assets are, admittedly, relatively simple. Your mileage may vary.
You are probably just going to have to benchmark it with your own collection of images to decide whether the compilation-time difference warrants the time investment of creating all the rasterized assets. Look at your build report to see how much time this step takes in the build.
As an aside, you mention the smaller size. The assets in your project might be smaller, but the resulting app might not be any smaller.
I don’t use vector graphics for size reductions, but for the other reasons you enumerated. Plus, by preserving vector data, I get nice renditions in the accessibility vision scenarios (e.g., where tab buttons become oversized).

SD card lifetime optimization

Simple question:
Which approach is best in terms of prolonging the life expectancy of an SD card?
Writing 10-minute files with 10 Hz lines of data input (~700 kB each)
1) directly to the SD card
or
2) to the internal memory of the device, then moving the file to the SD card
?
The amount of data being written to the SD card remains the same. The question is simply whether a lot of tiny file operations (6000 lines written in the course of ten minutes, 100 ms apart) or one file operation moving the entire file containing the 6000 lines onto the card at once is better. Or does it even matter? Of course the card specifications are hugely important as well, but let's leave that out of the discussion.
1) You should only write to fill flash page boundaries discussed here:
https://electronics.stackexchange.com/questions/227686/sd-card-sector-size
2) Keeping fault-tolerant track of how much data is written where also needs to be written itself. That counts as a write hit on the FAT etc. as well, on a page that gets more traffic than others. Avoid, if possible, techniques (i.e. fdup/fclose/fopen-append) that cause buffered and directory-cached data to be flushed. But I would use this trick every minute or so, so that you never lose more than a minute of data on a crash or accidental removal.
3) OS-supported wear leveling will solve the above, if properly implemented. I have read horror stories about flash memories being destroyed in days.
4) Calculate the expected life using the total wear-leveled lifetime-writes spec of that memory, usually given in TB. If you see numbers in the decades, don't bother doing more than (1).
5) Which OS and file-system you are using matters somewhat. For example EXT3 is supposedly faster than EXT2 due to less drive access at a slightly higher risk ratio. Since your question doesn't ask about OS/FS you use, I'll leave the rest of that up to you.
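Point (1) above amounts to buffering in RAM and flushing in whole flash-page-sized writes. A pure-Python stand-in for what the firmware would do (the 4096-byte page size and 70-byte line length are assumptions; check your card's actual geometry):

```python
# Buffer incoming data lines in RAM and write to the card only in whole
# page-sized chunks, so each flash page is programmed once, not dozens of times.
import io

PAGE = 4096  # assumed flash page size

class PageBufferedWriter:
    def __init__(self, sink):
        self.sink = sink            # any file-like object (the SD card file)
        self.buf = bytearray()

    def write_line(self, line: bytes):
        self.buf += line
        while len(self.buf) >= PAGE:            # flush only whole pages
            self.sink.write(self.buf[:PAGE])
            del self.buf[:PAGE]

    def close(self):                            # flush the partial tail page
        if self.buf:
            self.sink.write(self.buf)
        self.buf = bytearray()

card = io.BytesIO()                  # stands in for the SD card file
w = PageBufferedWriter(card)
for _ in range(6000):                # ten minutes of 10 Hz lines
    w.write_line(b"x" * 70)          # ~70-byte data line
w.close()
print(len(card.getvalue()))          # 420000 bytes, in ~103 page-sized writes
```

Per point (2), you would also call `close()` (or an equivalent flush) periodically so a crash costs at most that interval of data.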

MATLAB: Are there any problems with many (millions) small files compared to few (thousands) large files?

I'm working on a real-time test software in MATLAB. On user input I want to extract the value of one (or a few neighbouring) pixels from 50-200 high resolution images (~25 MB).
My problem is that the total image set is too big (~2000 images) to store in RAM, so consequently I need to read each of the 50-200 images from disk after each user input, which of course is way too slow!
So I was thinking about splitting the images into sub-images (~100x100 pixels) and saving these separately. This would make the image-read process quick enough.
Are there any problems I should be aware of with this approach? For instance, I've read about people having trouble copying many small files; will this affect me too, e.g. by making image reads slower?
rahnema1 is right - imread(...,'PixelRegion') will speed up the read operation. If that is not enough for you, even if your files are not fragmented, maybe it is time to think about a database?
Disk operations are always the bottleneck. First we switch to disk caches, then distributed storage, then RAID, and after some more time, we finish with in-memory databases. You should choose which access speed is reasonable.
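The tiling scheme proposed in the question comes down to mapping a full-resolution pixel coordinate to the sub-image file that contains it, plus the offset inside that tile. A sketch (the 100-pixel tile size comes from the question; the function name is made up):

```python
# Map a full-resolution pixel coordinate to (tile indices, local offset),
# so only one small 100x100 file needs to be read per pixel lookup.
TILE = 100

def locate(px, py):
    tile = (px // TILE, py // TILE)     # which sub-image file
    local = (px % TILE, py % TILE)      # offset inside that tile
    return tile, local

# Pixel (2345, 678) lives in tile (23, 6) at local offset (45, 78):
print(locate(2345, 678))   # ((23, 6), (45, 78))
```

Each user input then reads 50-200 tiny files instead of 50-200 25 MB images, at the cost of many more files on disk.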

Optimizing compression using HDF5/H5 in Matlab

Using Matlab, I am going to generate several data files and store them in H5 format as 20x1500xN, where N is an integer that can vary, but is typically around 2300. Each file will have 4 different data sets with equal structure. Thus, I will quickly run into a storage problem. My two questions:
Is there any reason not to split the 4 different data sets, and instead just save them as 4x20x1500xN? I would prefer having them split, since they are different signal modalities, but if there is any computational/compression advantage to not having them separated, I will join them.
Using Matlab's built-in compression, I set deflate=9 (and DataType=single). However, I have now realized that using deflate multiplies my computation time by 5. I realize this could have something to do with my ChunkSize, which I just set to 20x1500x5 without any reasoning behind it. Is there a strategic way to optimize computational load w.r.t. deflation and compression time?
Thank you.
1- Splitting or merging? It won't make a difference in the compression procedure, since compression is performed in blocks.
2- Your choice of chunk shape does indeed seem bad. The chunk size determines the shape and size of each block that is compressed independently. The bad news is that each chunk is 600 kB, which is much larger than the L2 cache, so your CPU is likely twiddling its thumbs waiting for data to come in. Depending on the nature of your data and the usage pattern you will use the most (reading the whole array at once, random reads, sequential reads...), you may want to target the L1 or L2 size, or something in between. Here are some experiments done with a Python library that may serve you as a guide.
Once you have selected your chunk size (how many bytes each compression block will have), you have to choose a chunk shape. I'd recommend the shape that most closely fits your reading pattern if you are doing partial reads, or filling in fastest-axis-first if you want to read the whole array at once. In your case, this will be something like 1x1500x10, I think (the second axis being the fastest, the last one the second fastest, and the first the slowest; change this if I am mistaken).
Lastly, keep in mind that the details are quite dependent on the specific machine you run it on: the CPU, the quality and load of the hard drive or SSD, the speed of the RAM... so the fine tuning will always require some experimentation.
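The arithmetic behind the advice above is simple enough to check by hand (4 bytes per element for MATLAB `single`; the "good" chunk shape below is the answerer's 1x1500x10 suggestion):

```python
# Bytes per chunk = product of the chunk shape times the element size.
from math import prod

def chunk_bytes(shape, elem_size=4):    # 4 bytes for MATLAB 'single'
    return prod(shape) * elem_size

print(chunk_bytes((20, 1500, 5)))       # 600000 -> the 600 kB chunk above
print(chunk_bytes((1, 1500, 10)))       # 60000  -> ~60 kB, nearer cache-sized
```

Picking a chunk shape whose byte size sits near the L1/L2 range is the knob being tuned; the shape itself should then match the dominant read pattern.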

Streaming huge 3D scenes over internet

I want to stream big scenes made of many objects to clients, but need some advice on what approach to take. I know the PS4 and Battle.NET stream games even when 70% of the game is not downloaded yet, and they work pretty fast on my 18 Mbps connection.
Can anyone please help me with where and how to start streaming big scenes?
A lot of these don't necessarily stream huge scenes per se, if "huge scenes" implies transmitting the lowest-level primitive data (individual points, triangles, unique textures on every single object, etc).
They often stream higher-level data like "maps" with a lot of instanced data. For example, they might not transmit the triangles of a thousand trees in a forest. Instead, they might transmit one unique tree asset which is instanced and just scaled and rotated and positioned differently to form a forest (just a unique transformation matrix per tree instance). The result might be that the entire forest can be transmitted without taking much more memory than a single tree's worth of triangles.
They might have two or more character meshes with identical geometry or topology and just unique deformations (point positions) or textures ("skins"), significantly reducing the amount of unique data that has to be sent/stored.
When doing this kind of instancing/tiling stuff, what might otherwise be terabytes worth of unique data may fit into megabytes due to the amount of instanced, non-unique data.
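A rough illustration of that savings (the per-mesh and per-matrix sizes are assumptions for the sake of the arithmetic): one shared tree mesh plus a 4x4 transform per instance, versus a unique mesh per tree.

```python
# Instancing: transmit/store one mesh plus a small transform per instance,
# instead of a full unique mesh for every tree in the forest.
MESH_BYTES = 2_000_000        # assumed ~2 MB for one detailed tree mesh
MATRIX_BYTES = 16 * 4         # 4x4 float32 transform per instance

def instanced(n):
    return MESH_BYTES + n * MATRIX_BYTES

def unique(n):
    return n * MESH_BYTES

n = 1000
print(instanced(n))   # 2064000 bytes  (~2 MB for the whole forest)
print(unique(n))      # 2000000000 bytes (~2 GB if every tree were unique)
```

The ratio only improves as the instance count grows, which is why instanced worlds can look "huge" while streaming very little unique data.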
So the first step to doing this is typically to build your own level/map editor. That editor can often serialize something considerably higher-level and tighter than, say, a Wavefront OBJ file, due to the sheer amount of tiled/instanced (shared) data. That high-level data ends up being what you stream.
Second is to build scalable servers, and that's a separate beast. To do that often involves very efficient multithreading at the heart of the OS/kernel to achieve very efficient async I/O. There are some great resources out there on this subject, but it's too broad to cover in one simple answer.
And third might be compression of the data to further reduce the required bandwidth.
A commercial game title might seek all three of these, but probably the first thing to realize is that they're not necessarily streaming unique triangles and texels everywhere -- to stream such low-level data would place tremendous strain on the server, especially given the kind of player load that MMOs are designed to handle. There's a whole lot of instanced data that these games, especially MMOs, often use to significantly cut down on the unique data that actually has to occupy memory and be transmitted separately.
Maps and assets are often designed to carefully reuse existing data as much as possible -- carefully made to have maximum repetition to reduce memory requirements but without looking too blatantly redundant (variation vs. economy). They look "huge" but aren't really from a data standpoint given the sheer amount of repetition of the same data, and considering that they don't redundantly store repetitive data. They're typically very, very economical about it.
As far as streaming goes, a simple way might be to break the world down into 2-dimensional regions (with some overlap to allow a seamless experience so that adjacent regions are being streamed as the player travels around the world) with AABBs around them. Stream the data for the region(s) the player is in and possibly visible within the viewing frustum. It can get a lot more elaborate than this but this might serve as a decent starting point.
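The region scheme sketched above can be reduced to a grid lookup with a bit of overlap (the region size and overlap values here are arbitrary assumptions, and a real implementation would also test the view frustum):

```python
# Given a player position on a 2-D region grid, return the grid cells whose
# slightly enlarged bounds contain it, so neighbouring regions begin
# streaming before the player crosses the boundary.
REGION = 512.0     # region side length in world units (assumption)
OVERLAP = 64.0     # how far past a boundary a region still counts as "near"

def regions_to_stream(x, z):
    cells = set()
    for dx in (-OVERLAP, 0.0, OVERLAP):
        for dz in (-OVERLAP, 0.0, OVERLAP):
            cells.add((int((x + dx) // REGION), int((z + dz) // REGION)))
    return sorted(cells)

# Standing near a boundary pulls in the adjacent region too:
print(regions_to_stream(500.0, 100.0))   # [(0, 0), (1, 0)]
print(regions_to_stream(300.0, 300.0))   # [(0, 0)]
```

Each frame (or every few frames) you diff this set against the currently loaded regions: new cells start streaming, departed cells get unloaded.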