Deal with potential memory issue in iPhone

What ways are there to deal with memory issues on the iPhone?
Is it possible to ask how much memory is available before jumping into a memory intensive section of code?
(or perhaps Apple would just say that if you have to use so much memory you are on the wrong platform?)

UIApplicationDelegate's applicationDidReceiveMemoryWarning: will let you know if you're using too much memory. If you want to check before a memory intensive operation, here's a function that gets the available free memory in bytes on iOS:
#include <mach/mach.h>
#include <stdio.h>
// Returns the free system memory in bytes, or 0 if the query fails.
natural_t TSGetFreeSystemMemory(void) {
    mach_port_t host_port = mach_host_self();
    mach_msg_type_number_t host_size = sizeof(vm_statistics_data_t) / sizeof(integer_t);
    vm_size_t pagesize;
    vm_statistics_data_t vm_stat;
    host_page_size(host_port, &pagesize);
    if (host_statistics(host_port, HOST_VM_INFO, (host_info_t)&vm_stat, &host_size) != KERN_SUCCESS) {
        printf("failed to get host statistics\n");
        return 0; // vm_stat holds no valid data on failure
    }
    // natural_t mem_used = (vm_stat.active_count + vm_stat.inactive_count + vm_stat.wire_count) * pagesize;
    natural_t mem_free = vm_stat.free_count * pagesize;
    return mem_free;
}

Apple appears not to be telling developers because they want to change the amount of memory available in new devices and OS releases. The number went way up on a freshly booted iPhone 4 and way down under iOS 4.0 after typical use on an iPhone 3G.
One possible method is to "preflight" the memory required for successful completion of some operation (e.g. malloc, check and then free the blocks that add up to your requirement). You can malloc in small chunks using a timer spanning many milliseconds to see if you can "push" other apps out of memory. But even this method is no guarantee, as Mail or some other background app could jump in and consume memory even when your app is frontmost.
If you use less than 20MB at any point in time, then a huge percentage of games in the App store will fail before your app does (just my random opinion).
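The "preflight" idea above could be sketched roughly as follows. This is only an illustration: `preflight_memory` is a hypothetical helper name, the chunk size is up to the caller, and the timer-based pacing mentioned above is left out. Each block is touched because an allocation that is never written may not be backed by real pages.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `total` bytes in `chunk`-sized blocks,
   write to each page so the memory is really backed, then free everything.
   Returns true only if every allocation succeeded. */
bool preflight_memory(size_t total, size_t chunk) {
    if (chunk == 0) return false;
    size_t count = total / chunk + (total % chunk != 0); /* round up */
    if (count == 0) return true;                          /* nothing to reserve */

    void **blocks = calloc(count, sizeof *blocks);
    if (blocks == NULL) return false;

    bool ok = true;
    for (size_t i = 0; i < count; i++) {
        blocks[i] = malloc(chunk);
        if (blocks[i] == NULL) { ok = false; break; }
        /* Touch one byte per 4 KiB (an assumed page size) through a volatile
           pointer so the writes are not optimized away. */
        volatile unsigned char *bytes = blocks[i];
        for (size_t off = 0; off < chunk; off += 4096) bytes[off] = 0xA5;
    }

    for (size_t i = 0; i < count; i++) free(blocks[i]); /* free(NULL) is a no-op */
    free(blocks);
    return ok;
}
```

A caller might run `preflight_memory(needed_bytes, 64 * 1024)` just before the expensive operation and fall back to a cheaper path when it returns false. As noted above, a successful preflight is still no guarantee.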

Put the following in your app delegate and it will be called when memory is starting to run low. This is the Apple way of doing things:
- (void)applicationDidReceiveMemoryWarning:(UIApplication *)application {
// Free some memory or set some flag that we are low
}

Related

What is the latency of `clwb` and `ntstore` on Intel's Optane Persistent Memory?

This paper reports that an 8-byte sequential write to Optane PM has a latency of 90 ns with clwb and 62 ns with ntstore, and that a sequential read takes 169 ns.
But in my test on an Intel 5218R CPU, clwb takes about 700 ns and ntstore about 1200 ns. My test method differs from the paper's, of course, but the gap seems unreasonably large, and my test is closer to actual usage.
During the test, did the Write Pending Queue of CPU's iMC or the WC buffer in the optane PM become the bottleneck, causing blockage, and the measured latency has been inaccurate? If this is the case, is there a tool to detect it?
#include <libpmem.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <x86intrin.h>
//gcc aep_test.c -o aep_test -O3 -mclwb -lpmem
int main()
{
    size_t mapped_len;
    char str[32];
    int is_pmem;
    sprintf(str, "/mnt/pmem/pmmap_file_1");
    int64_t *p = pmem_map_file(str, 4096 * 1024 * 128, PMEM_FILE_CREATE, 0666, &mapped_len, &is_pmem);
    if (p == NULL)
    {
        printf("map file fail!");
        exit(1);
    }
    if (!is_pmem)
    {
        printf("mapped file is not on persistent memory!");
        exit(1);
    }
    struct timeval start;
    struct timeval end;
    unsigned long diff;
    int loop_num = 10000;
    _mm_mfence();
    gettimeofday(&start, NULL);
    for (int i = 0; i < loop_num; i++)
    {
        p[i] = 0x2222;
        _mm_clwb(p + i);
        // _mm_stream_si64(p + i, 0x2222);
        _mm_sfence();
    }
    gettimeofday(&end, NULL);
    diff = 1000000 * (end.tv_sec - start.tv_sec) + end.tv_usec - start.tv_usec;
    printf("Total time is %lu us\n", diff);
    printf("Latency is %lu ns\n", diff * 1000 / loop_num);
    return 0;
}
Any help or correction is much appreciated!

The main reason is that repeated flushes to the same cacheline are delayed dramatically [1].
You are measuring the average latency, not the best-case latency that the FAST20 paper reports.
ntstore is more expensive than clwb, so its latency is higher. I guess the numbers in your first paragraph contain a typo.
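To make the first point concrete: in the question's loop, `p` is an `int64_t *`, so `p + i` advances 8 bytes per iteration and eight consecutive iterations store to and flush the same 64-byte line. The sketch below only does the index arithmetic. The 64-byte line size and a line-aligned mapping are assumptions, and both function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64 /* assumption: 64-byte cache lines */

/* Cache line (relative to a line-aligned base) that element i falls in. */
size_t line_of_element(size_t i, size_t elem_size) {
    return (i * elem_size) / CACHE_LINE_BYTES;
}

/* Index of the first element in the n-th cache line, so that each timed
   iteration can touch a different line. */
size_t element_for_line(size_t n, size_t elem_size) {
    return n * (CACHE_LINE_BYTES / elem_size);
}
```

With 8-byte elements, indices 0 through 7 map to line 0 and index 8 starts line 1. Replacing `p[i]` with `p[element_for_line(i, sizeof(int64_t))]` in a test loop would flush a new line on every iteration, which is one way to separate the repeated-flush penalty from the sequential-write cost.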
Appended on 4.14:
Q: Are there tools to detect a possible bottleneck in the WPQ or other buffers?
A: You can record a baseline while the PM is idle and compare later measurements against it to spot a possible bottleneck.
Tools:
Intel Memory Bandwidth Monitoring
Read two hardware counters from the performance monitoring unit (PMU) in the processor: 1) UNC_M_PMM_WPQ_OCCUPANCY.ALL, which accumulates the number of WPQ entries at each cycle, and 2) UNC_M_PMM_WPQ_INSERTS, which counts how many entries have been inserted into the WPQ. Then calculate the queueing delay of the WPQ: UNC_M_PMM_WPQ_OCCUPANCY.ALL / UNC_M_PMM_WPQ_INSERTS. [2]
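The ratio above is simple enough to compute from two counter snapshots. The sketch below assumes the counters are sampled at the start and end of an interval by some external tool. The function name and the snapshot parameters are illustrative, and nothing here reads the PMU itself.

```c
#include <stdint.h>

/* Average time an entry spends in the WPQ over a sampling interval,
   in units of the cycles the occupancy counter accumulates over.
   occ_* are snapshots of UNC_M_PMM_WPQ_OCCUPANCY.ALL,
   ins_* are snapshots of UNC_M_PMM_WPQ_INSERTS. */
double wpq_avg_queue_cycles(uint64_t occ_start, uint64_t occ_end,
                            uint64_t ins_start, uint64_t ins_end) {
    uint64_t inserts = ins_end - ins_start;
    if (inserts == 0) return 0.0; /* nothing was queued in this interval */
    return (double)(occ_end - occ_start) / (double)inserts;
}
```

Comparing this value under load against the idle baseline gives a rough indication of whether writes are backing up in the WPQ.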
[1] Chen, Youmin, et al. "Flatstore: An efficient log-structured key-value storage engine for persistent memory." Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems. 2020.
[2] Imamura, Satoshi, and Eiji Yoshida. “The analysis of inter-process interference on a hybrid memory system.” Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops. 2020.

https://www.usenix.org/system/files/fast20-yang.pdf describes what they're measuring: the CPU side of doing one store + clwb + mfence for a cached write (see footnote 1). So it is the CPU-pipeline latency of getting a store "accepted" into something persistent.
This isn't the same thing as making it all the way to the Optane chips themselves; the Write Pending Queue (WPQ) of the memory controllers is part of the persistence domain on Cascade Lake Intel CPUs like yours (wikichip quotes an Intel diagram showing this).
Footnote 1: Also note that clwb on Cascade Lake works like clflushopt - it just evicts. So store + clwb + mfence in a loop test would test the cache-cold case, if you don't do something to load the line before the timed interval. (From the paper's description, I think they do). Future CPUs will hopefully properly support clwb, but at least CSL got the instruction supported so future libraries won't have to check CPU features before using it.
You're doing many stores, which will fill up any buffers in the memory controller or elsewhere in the memory hierarchy. So you're measuring throughput of a loop, not latency of one store plus mfence itself in a previously-idle CPU pipeline.
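One way to look at per-operation latency instead of loop throughput is to time each operation separately and report the minimum and median, not the mean. The sketch below is only an outline of that idea: `measure_latency`, `latency_stats` and `plain_store` are hypothetical names, and `plain_store` is a stand-in that does an ordinary store, where a real test would do store + clwb + fence. `clock_gettime` itself costs tens of nanoseconds, so a cycle counter would be needed for numbers comparable to the paper's.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

typedef void (*op_fn)(void *ctx, size_t i);

struct latency_stats { uint64_t min_ns, median_ns, max_ns; };

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static int cmp_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Times n calls of op individually and fills *out. Returns 0 on success. */
int measure_latency(op_fn op, void *ctx, size_t n, struct latency_stats *out) {
    if (op == NULL || out == NULL || n == 0) return -1;
    uint64_t *samples = malloc(n * sizeof *samples);
    if (samples == NULL) return -1;

    for (size_t i = 0; i < n; i++) {
        uint64_t t0 = now_ns();
        op(ctx, i);
        samples[i] = now_ns() - t0;
    }

    qsort(samples, n, sizeof *samples, cmp_u64);
    out->min_ns = samples[0];
    out->median_ns = samples[n / 2];
    out->max_ns = samples[n - 1];
    free(samples);
    return 0;
}

/* Stand-in operation: one 8-byte store per 64-byte line (8 elements apart).
   A persistent-memory test would add the flush and fence here. */
void plain_store(void *ctx, size_t i) {
    volatile int64_t *p = ctx;
    p[i * 8] = 0x2222;
}
```

If the minimum is close to the paper's figure while the median is far higher, that would be consistent with the buffers filling up as described above.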
Separate from that, rewriting the same line repeatedly seems to be slower than sequential write, for example. This Intel forum post reports "higher latency" for "flushing a cacheline repeatedly" than for flushing different cache lines. (The controller inside the DIMM does do wear leveling, BTW.)
Fun fact: later generations of Intel CPUs (perhaps CPL or ICX) will have even the caches (L3?) in the persistence domain, hopefully making clwb even cheaper. IDK if that would affect back-to-back movnti throughput to the same location, though, or even clflushopt.
During the test, did the Write Pending Queue of CPU's iMC or the WC buffer in the optane PM become the bottleneck, causing blockage, and the measured latency has been inaccurate?
Yes, that would be my guess.
If this is the case, is there a tool to detect it?
I don't know, sorry.

iPhone App floating point calculations when released to the app store.

I released my first solo iPhone app last week that calculates 12V Marine and Boat Battery usage. I had tested it vigorously on the simulator and on my iPhone, and when I was comfortable all was well, I archived the app and released it to Apple. When users started using the app, they noted a calculation was not working as expected. The below code, which is a method on a NSManagedObject model, was producing a DIFFERENT output when released to when in debug.
The code below should sum up the total of the related Discharge items, but if there are no BBDischarge items, it should return zero. Instead, when there are no items (so the for loop never runs), it returns a figure that is derived directly from domesticAH (an NSNumber). If I set domesticAH to 100, dischargeAmpH returns 8; if I set domesticAH to 1000, it returns 83 (the float is rounded when it is displayed). Again, I stress that the code works fine when running on the simulator or when put directly onto my iPhone 4S; only once released through the App Store does it go wrong.
//Other dynamics hidden for simplicity
@dynamic domesticAH;
@dynamic voltage;
@dynamic profileDischarge;
-(NSNumber*) dischargeAmpH
{
    float totalWattsPD;
    //Calculate the watts per day so we can accurately calculate the percentage of each item line
    for(BBDischarge* currentDischarge in self.profileDischarge)
    {
        float totalWatt = [currentDischarge.wattage floatValue] ;
        float totalMinsPD = [currentDischarge.minutesPD floatValue]/60;
        float totaltimesUsed = [currentDischarge.timesUsed floatValue];
        float numberInOperation = [currentDischarge.number floatValue];
        totalWattsPD = totalWattsPD + ((totalWatt * totalMinsPD) * (totaltimesUsed * numberInOperation));
    }
    int voltage = ([self.voltage integerValue] + 1) * 12;
    float totalAmpsPD = totalWattsPD / voltage;
    return [NSNumber numberWithFloat:totalAmpsPD];
}
As you can see, self.domesticAH is not featured in the method anywhere, and as I can't recreate this in the simulator, I am having a hard time tracking this down.
A few questions:
Can I debug the live version of my app, i.e. attach Xcode to my running, App Store-downloaded instance of the app?
Is there a way I can simulate an install from the archive? Would anyone recommend ad hoc distribution for this? I haven't used ad hoc yet.
Any other ideas why this might be behaving in this way?

This is more a hint than an answer, but as I already said in the comments: check for behavior that differs between the Release and Debug builds.
This kind of difference has been seen before, and it comes from the "Optimization Level" build setting. The Debug configuration is set to None [-O0], while the Release configuration is set to Fastest, Smallest [-Os].
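One cause that would fit these symptoms, offered as a guess and not as something this answer confirms: the question's method declares `float totalWattsPD;` without an initializer and then reads it. In C and Objective-C such a local has an indeterminate value. An unoptimized build often happens to see zero, while an optimized build may see whatever was left in a register or stack slot. The sketch below shows the same accumulation in plain C with an explicit starting value; the struct and function names are made up for illustration.

```c
#include <stddef.h>

/* Illustrative stand-in for one BBDischarge item. */
struct discharge {
    float wattage;
    float minutes_per_day;
    float times_used;
    float number;
};

/* Sums watt-hours per day over all items and converts to amp-hours.
   With count == 0 the loop never runs and the result is exactly 0. */
float total_amp_hours(const struct discharge *items, size_t count, int volts) {
    float total_watt_hours = 0.0f; /* explicit initial value */
    for (size_t i = 0; i < count; i++) {
        float hours = items[i].minutes_per_day / 60.0f;
        total_watt_hours += items[i].wattage * hours
                          * items[i].times_used * items[i].number;
    }
    return total_watt_hours / (float)volts;
}
```

Building the Debug configuration with the Release optimization level is a quick way to check whether a bug of this kind reproduces locally.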

NSCache is not evicting data

NSCache is a rarely used tool which actually looks quite useful. I created a simple experiment to see how it works and it looks like it does not auto-evict data in low memory situations (or I am doing something wrong!)
- (void)viewDidLoad
{
    _testCache = [[NSCache alloc] init];
    // Allocate 600 MB of zeros and save to NSCache
    NSMutableData* largeData = [[NSMutableData alloc] init];
    [largeData setLength:1024 * 1024 * 600];
    [_testCache setObject:largeData forKey:@"original_Data"];
}
- (IBAction)buttonWasTapped:(id)sender {
    // Allocate & save to cache 300 MB each time the button is pressed
    NSMutableData* largeData = [[NSMutableData alloc] init];
    [largeData setLength:1024 * 1024 * 300];
    static int count = 2;
    NSString* key = [NSString stringWithFormat:@"test_data_%d", count++];
    [_testCache setObject:largeData forKey:key];
    NSMutableData* dataRecoveredFromCache = [_testCache objectForKey:@"original_Data"];
    if (dataRecoveredFromCache) {
        NSLog(@"Original data is ok");
    } else {
        NSLog(@"Original data is missing (purged from cache)");
    }
}
So I ran the app in the simulator and tapped the button a few times, but no items were evicted. The app eventually crashed:
2012-07-17 14:19:36.877 NSCacheTest[15302:f803] Data is ok
2012-07-17 14:19:37.365 NSCacheTest[15302:f803] Data is ok
2012-07-17 14:19:37.861 NSCacheTest[15302:f803] Data is ok
2012-07-17 14:19:38.341 NSCacheTest[15302:f803] Data is ok
2012-07-17 14:19:38.821 NSCacheTest[15302:f803] Data is ok
NSCacheTest(15302,0xac0942c0) malloc: *** mmap(size=393216000) failed (error code=12)
*** error: can't allocate region

From the doc (Emphasis mine): The NSCache class incorporates various auto-removal policies, which ensure that it does not use too much of the system’s memory. The system automatically carries out these policies if memory is needed by other applications. When invoked, these policies remove some items from the cache, minimizing its memory footprint.
Apple does not state that the memory will be freed on memory warning - in my experience, the cache is most often purged when the app goes to background or when you add more large elements.

Here are the quoted docs:
The NSCache class incorporates various auto-removal policies, which
ensure that it does not use too much of the system’s memory. The
system automatically carries out these policies if memory is needed by
other applications. When invoked, these policies remove some items
from the cache, minimizing its memory footprint.
As you can see, it states that it removes some items, not all items. Which ones depends on NSCache's internal policies, available memory, device status, and so on. You shouldn't need to worry about these policies.
You can influence them with the countLimit and totalCostLimit properties, and you can add an object with an associated cost; see setObject:forKey:cost:.
You can also make objects evictable yourself: implement the NSDiscardableContent protocol in your objects and set setEvictsObjectsWithDiscardedContent: to YES.

I am using that class too. Note that the documentation states that NSCache is tied into the OS, so it probably has access to memory information deep within the OS. The memory warning is just that: it only sends a memory warning to the app delegate and view controllers.
If you really want to test your code, you will probably need a test mode where you start mallocing lots of memory (creating a huge leak, so to speak). You might need to parcel this out in chunks during each main run loop pass, so the OS has the opportunity to see the memory going down. (I have an app that chews up lots of memory, and it does so fast enough on the 3GS that it just gets killed without ever receiving a memory warning.)

Get amount of memory used by app in iOS

I'm working on an upload app that splits files before upload. It splits the files to prevent being closed by iOS for using too much memory as some of the files can be rather large. It would be great if I could, instead of setting the max "chunk" size, set the max memory usage and determine the size using that.
Something like this
#define MAX_MEM_USAGE 20000000 //20MB
#define MIN_CHUNK_SIZE 5000 //5KB
-(void)uploadAsset:(ALAsset*)asset
{
    long totalBytesRead = 0;
    ALAssetRepresentation *representation = [asset defaultRepresentation];
    while(totalBytesRead < [representation size])
    {
        long chunkSize = MAX_MEM_USAGE - [self getCurrentMemUsage];
        chunkSize = MIN([representation size] - totalBytesRead, MAX(chunkSize, MIN_CHUNK_SIZE));//if I can't get 5KB without getting killed then I'm going to get killed
        uint8_t *buffer = malloc(chunkSize);
        //read file chunk in here, adding the result to totalBytesRead
        //upload chunk here
        free(buffer); //release the chunk before reading the next one
    }
}
Is essentially what I'm going for. I can't seem to find a way to get the current memory usage of my app specifically. I don't really care about the amount of system memory left.
The only way I've been able to think of is one I don't like much. Grab the amount of system memory on the first line of main in my app, then store it in a static variable in a globals class then the getCurrentMemUsage would go something like this
-(long)getCurrentMemUsage
{
    long sysUsage = [self getSystemMemoryUsed];
    return sysUsage - [Globals origSysUsage];
}
This has some serious drawbacks. The most obvious one to me is that another app might get killed in the middle of my upload, which could drop sysUsage lower than origSysUsage resulting in a negative number even if my app is using 10MB of memory which could result in my app using 40MB for a request rather than the maximum which is 20MB. I could always set it up so it clamps the value between MIN_CHUNK_SIZE and MAX_MEM_USAGE, but that would just be a workaround instead of an actual solution.
If there are any suggestions as to getting the amount of memory used by an app or even different methods for managing a dynamic chunk size I would appreciate either.
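The clamping mentioned above can be written so that a usage figure at or above the budget never produces a negative or oversized request. This is a sketch in plain C that reuses the two constants from the question. `next_chunk_size` is a hypothetical helper, and where `current_usage` comes from is left open.

```c
#include <stddef.h>

#define MAX_MEM_USAGE 20000000u /* 20 MB, as in the question */
#define MIN_CHUNK_SIZE 5000u    /* 5 KB, as in the question */

/* Chunk size for the next read: whatever is left of the memory budget,
   never less than MIN_CHUNK_SIZE and never more than the bytes remaining. */
size_t next_chunk_size(size_t remaining, size_t current_usage) {
    size_t budget = current_usage < MAX_MEM_USAGE
                  ? MAX_MEM_USAGE - current_usage
                  : 0;                       /* already at or over budget */
    if (budget < MIN_CHUNK_SIZE) budget = MIN_CHUNK_SIZE;
    return remaining < budget ? remaining : budget;
}
```

Because the subtraction only happens when `current_usage` is below the budget, a misleading usage reading can at worst shrink the chunk to the minimum. It cannot inflate the chunk beyond 20 MB.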

As with any virtual memory operating system, "memory used" is not well defined and is notoriously difficult to calculate.
Fortunately, the virtual memory manager also offers an easy way around your problem: the mmap() C function. It lets your app map the file into its address space and treat it as if it were in RAM, while the pages are actually read from storage as they are accessed and can be discarded again when iOS is low on memory.
This is easy to use on iOS through the Cocoa API that wraps it (note that dataWithContentsOfMappedFile: has since been deprecated in favor of dataWithContentsOfFile:options:error: with NSDataReadingMappedIfSafe):
- (void) uploadMyFile:(NSString*)fileName {
    NSData* fileData = [NSData dataWithContentsOfMappedFile:fileName];
    // Work with the data as with any NSData* object. The iOS kernel
    // will take care of loading the file as needed.
}
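For reference, the underlying POSIX call can be used directly. The following is a minimal sketch and not the implementation behind NSData; `map_file_readonly` and `unmap_file` are illustrative names, and error handling is kept to the basics.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Maps a whole file read-only. Returns NULL on failure or for an empty file.
   On success *len_out receives the file size in bytes. */
const unsigned char *map_file_readonly(const char *path, size_t *len_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return NULL; }

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) { close(fd); return NULL; }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) { perror("mmap"); return NULL; }

    *len_out = (size_t)st.st_size;
    return p;
}

/* Releases a mapping obtained from map_file_readonly. */
void unmap_file(const unsigned char *p, size_t len) {
    if (p != NULL) munmap((void *)p, len);
}
```

An upload loop can then walk the returned pointer in fixed-size slices without allocating a buffer per chunk, since only the pages currently being read need to be resident.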

How to find available memory in iPhone programmatically?

I'd like to know how to find programmatically available memory in iPhone from Objective-C?

You can get physical memory with the following:
NSLog(@"physical memory: %llu", [NSProcessInfo processInfo].physicalMemory);
Available memory is not something you can nail down to a hard number, since the OS will kill background apps as needed to give the foreground app more memory, along with clearing file caches and so on. Assuming you're doing this to tune your own caching, you could size your cache based on physical memory and estimate how much you should use. For instance, on an old 128 MB iPhone 3G, your entire app would get maybe 10-15 MB of RAM before it was killed, whereas a brand new 1024 MB iPhone 5 will allow you hundreds of megabytes before the OS decides to kill you.
See memory in devices at http://en.wikipedia.org/wiki/List_of_iOS_devices

You can use the Mach call host_info(host, flavor, host_info, host_info_count). If you call it with flavor=HOST_BASIC_INFO, the buffer that host_info points to is filled with a struct host_basic_info, which looks like this:
struct host_basic_info {
    integer_t max_cpus; /* max number of CPUs possible */
    integer_t avail_cpus; /* number of CPUs now available */
    natural_t memory_size; /* size of memory in bytes, capped at 2 GB */
    cpu_type_t cpu_type; /* cpu type */
    cpu_subtype_t cpu_subtype; /* cpu subtype */
    cpu_threadtype_t cpu_threadtype; /* cpu threadtype */
    integer_t physical_cpu; /* number of physical CPUs now available */
    integer_t physical_cpu_max; /* max number of physical CPUs possible */
    integer_t logical_cpu; /* number of logical cpu now available */
    integer_t logical_cpu_max; /* max number of logical CPUs possible */
    uint64_t max_mem; /* actual size of physical memory */
};
From this structure, you can get the memory size.

Swift 5
Your Ram Size in Bytes:
let totalRam = ProcessInfo.processInfo.physicalMemory
