How to find available memory in iPhone programmatically?

I'd like to know how to programmatically find the available memory on the iPhone from Objective-C.

You can get the amount of physical memory with the following:
NSLog(@"physical memory: %llu", [NSProcessInfo processInfo].physicalMemory);
Available memory is not something you can nail down to a hard number, since the OS will kill off background apps as needed to give the foreground app more memory, clear file caches, and so on. Assuming you're doing this to optimize your own caching, you could set your cache size based on physical memory and guess how much you should use. For instance, on an old 128 MB iPhone 3G, your entire app would only get maybe 10-15 MB of RAM before it got killed, whereas a brand-new 1 GB iPhone 5 will allow you hundreds of megabytes of RAM before the OS decides to kill you.
See memory in devices at http://en.wikipedia.org/wiki/List_of_iOS_devices
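A minimal sketch of that sizing idea in plain C (hw.memsize is the Darwin sysctl for physical RAM; the function name and the 1/8 fraction are just our illustration, not a rule):
#include <stdint.h>
#include <sys/sysctl.h>

// Derive a cache budget from physical RAM; the fraction is an arbitrary guess.
static uint64_t cache_budget_bytes(void)
{
    uint64_t ram = 0;
    size_t len = sizeof(ram);
    if (sysctlbyname("hw.memsize", &ram, &len, NULL, 0) != 0)
        return 0; // query failed
    return ram / 8; // e.g. ~16 MB on a 128 MB iPhone 3G
}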

You can use the Mach call host_info(host, flavor, host_info, host_info_count). If you call it with flavor=HOST_BASIC_INFO, the buffer host_info points to is filled with a struct host_basic_info, which looks like this:
struct host_basic_info {
    integer_t        max_cpus;          /* max number of CPUs possible */
    integer_t        avail_cpus;        /* number of CPUs now available */
    natural_t        memory_size;       /* size of memory in bytes, capped at 2 GB */
    cpu_type_t       cpu_type;          /* cpu type */
    cpu_subtype_t    cpu_subtype;       /* cpu subtype */
    cpu_threadtype_t cpu_threadtype;    /* cpu threadtype */
    integer_t        physical_cpu;      /* number of physical CPUs now available */
    integer_t        physical_cpu_max;  /* max number of physical CPUs possible */
    integer_t        logical_cpu;       /* number of logical CPUs now available */
    integer_t        logical_cpu_max;   /* max number of logical CPUs possible */
    uint64_t         max_mem;           /* actual size of physical memory */
};
From this structure you can get the memory size: the max_mem field holds the physical memory size in bytes.
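A minimal sketch of the call (plain C; the wrapper name is ours), reading max_mem as described above:
#include <mach/mach.h>
#include <mach/mach_host.h>
#include <stdint.h>

// Query HOST_BASIC_INFO and return the physical memory size in bytes (0 on failure).
static uint64_t physical_memory_via_mach(void)
{
    host_basic_info_data_t info;
    mach_msg_type_number_t count = HOST_BASIC_INFO_COUNT;

    if (host_info(mach_host_self(), HOST_BASIC_INFO,
                  (host_info_t)&info, &count) != KERN_SUCCESS)
        return 0;
    return info.max_mem; // actual physical memory, not capped at 2 GB
}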

Swift 5
Your RAM size in bytes:
let totalRam = ProcessInfo.processInfo.physicalMemory


What is the latency of `clwb` and `ntstore` on Intel's Optane Persistent Memory?

In this paper, it is written that an 8-byte sequential write to Optane PM takes 90 ns with clwb and 62 ns with ntstore, and that a sequential read takes 169 ns.
But in my test with an Intel 5218R CPU, clwb takes about 700 ns and ntstore about 1200 ns. Of course, my test method differs from the paper's, but the gap is far too large to be reasonable, and my test is closer to actual usage.
During the test, did the Write Pending Queue (WPQ) of the CPU's iMC or the WC buffer in the Optane PM become the bottleneck, causing stalls and making the measured latency inaccurate? If so, is there a tool to detect it?
#include "libpmem.h"
#include "stdio.h"
#include "x86intrin.h"
//gcc aep_test.c -o aep_test -O3 -mclwb -lpmem
int main()
{
size_t mapped_len;
char str[32];
int is_pmem;
sprintf(str, "/mnt/pmem/pmmap_file_1");
int64_t *p = pmem_map_file(str, 4096 * 1024 * 128, PMEM_FILE_CREATE, 0666, &mapped_len, &is_pmem);
if (p == NULL)
{
printf("map file fail!");
exit(1);
}
if (!is_pmem)
{
printf("map file fail!");
exit(1);
}
struct timeval start;
struct timeval end;
unsigned long diff;
int loop_num = 10000;
_mm_mfence();
gettimeofday(&start, NULL);
for (int i = 0; i < loop_num; i++)
{
p[i] = 0x2222;
_mm_clwb(p + i);
// _mm_stream_si64(p + i, 0x2222);
_mm_sfence();
}
gettimeofday(&end, NULL);
diff = 1000000 * (end.tv_sec - start.tv_sec) + end.tv_usec - start.tv_usec;
printf("Total time is %ld us\n", diff);
printf("Latency is %ld ns\n", diff * 1000 / loop_num);
return 0;
}
Any help or correction is much appreciated!
The main reason is that repeatedly flushing the same cache line is delayed dramatically [1].
You are testing the average latency instead of the best-case latency measured in the FAST20 paper.
ntstore is more expensive than clwb, so its latency is higher; I guess the numbers in your first paragraph are a typo.
Appended on 4.14:
Q: Is there a tool to detect a possible bottleneck in the WPQ or buffers?
A: You can get a baseline when the PM is idle, and use this baseline to indicate a possible bottleneck.
Tools:
Intel Memory Bandwidth Monitoring
It reads two hardware counters from the performance monitoring unit (PMU) in the processor: 1) UNC_M_PMM_WPQ_OCCUPANCY.ALL, which counts the accumulated number of WPQ entries at each cycle, and 2) UNC_M_PMM_WPQ_INSERTS, which counts how many entries have been inserted into the WPQ. The queueing delay of the WPQ is then calculated as UNC_M_PMM_WPQ_OCCUPANCY.ALL / UNC_M_PMM_WPQ_INSERTS. [2]
[1] Chen, Youmin, et al. "FlatStore: An efficient log-structured key-value storage engine for persistent memory." Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems. 2020.
[2] Imamura, Satoshi, and Eiji Yoshida. "The analysis of inter-process interference on a hybrid memory system." Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops. 2020.
https://www.usenix.org/system/files/fast20-yang.pdf describes what they're measuring: the CPU side of doing one store + clwb + mfence for a cached write (see footnote 1). So it's the CPU-pipeline latency of getting a store "accepted" into something persistent.
This isn't the same thing as making it all the way to the Optane chips themselves; the Write Pending Queue (WPQ) of the memory controllers is part of the persistence domain on Cascade Lake Intel CPUs like yours (wikichip quotes an Intel image showing this).
Footnote 1: Also note that clwb on Cascade Lake works like clflushopt - it just evicts. So store + clwb + mfence in a loop test would test the cache-cold case, if you don't do something to load the line before the timed interval. (From the paper's description, I think they do). Future CPUs will hopefully properly support clwb, but at least CSL got the instruction supported so future libraries won't have to check CPU features before using it.
You're doing many stores, which will fill up any buffers in the memory controller or elsewhere in the memory hierarchy. So you're measuring throughput of a loop, not latency of one store plus mfence itself in a previously-idle CPU pipeline.
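For contrast, here is a hedged sketch of timing one operation rather than a loop (not the paper's exact harness; the helper name is ours, and the cycle count still has to be converted to nanoseconds using your machine's TSC frequency):
#include <stdint.h>
#include <x86intrin.h>

// Time a single store + clwb + sfence from an otherwise idle pipeline.
static inline uint64_t time_one_persist(volatile int64_t *p)
{
    unsigned aux;
    (void)*p;                      // warm the line so the write is cache-hot (see footnote 1)
    _mm_mfence();                  // drain earlier work before timing
    uint64_t t0 = __rdtscp(&aux);
    *p = 0x2222;
    _mm_clwb((void *)p);
    _mm_sfence();                  // wait for the flush to be accepted
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;                // cycles, not nanoseconds
}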
Separate from that, rewriting the same line repeatedly seems to be slower than writing sequentially. This Intel forum post, for example, reports "higher latency" for "flushing a cacheline repeatedly" than for flushing different cache lines. (The controller inside the DIMM does do wear leveling, BTW.)
Fun fact: later generations of Intel CPUs (perhaps CPL or ICX) will have even the caches (L3?) in the persistence domain, hopefully making clwb even cheaper. IDK if that would affect back-to-back movnti throughput to the same location, though, or even clflushopt.
During the test, did the Write Pending Queue of the CPU's iMC or the WC buffer in the Optane PM become the bottleneck, causing stalls and making the measured latency inaccurate?
Yes, that would be my guess.
If this is the case, is there a tool to detect it?
I don't know, sorry.

STM32 HAL_I2C_Master_Transmit - Why we need to shift address?

After stumbling upon a very strange thing, I would like to find out if anyone can provide a reasonable explanation.
I have an SHT31 humidity sensor running on I2C, and after trying to run it on an STM32F2 it didn't work:
uint8_t __data[5] = {0};
__data[0] = SHT31_SOFTRESET >> 8;
__data[1] = SHT31_SOFTRESET & 0xFF;
HAL_I2C_Master_Transmit(&hi2c3, (uint16_t)0x44, __data, 2, 1000);
I opened the function and saw:
/**
  * @brief  Transmits in master mode an amount of data in blocking mode.
  * @param  hi2c  Pointer to a I2C_HandleTypeDef structure that contains
  *               the configuration information for the specified I2C.
  * @param  DevAddress Target device address: The device 7 bits address value
  *         in datasheet must be shifted to the left before calling the interface
  * @param  pData Pointer to data buffer
  * @param  Size Amount of data to be sent
  * @param  Timeout Timeout duration
  * @retval HAL status
  */
HAL_StatusTypeDef HAL_I2C_Master_Transmit(I2C_HandleTypeDef *hi2c, uint16_t DevAddress,
                                          uint8_t *pData, uint16_t Size, uint32_t Timeout)
{
  /* Init tickstart for timeout management */
  uint32_t tickstart = HAL_GetTick();

  if (hi2c->State == HAL_I2C_STATE_READY)
  ... and it goes on ...
So, after some frustration at my scope (wondering why my bits were not going out on the wire), I followed the comment and did:
HAL_I2C_Master_Transmit(&hi2c3,((uint16_t)0x44)<<1,__data,2,1000);
Finally my bits are going out and the device ACKs me back - voila, it works!
But why?? What would be the reason behind putting the burden of shifting the address on the programmer?
Because the programmer should be made aware of whether they want to read data from or write data to the I2C slave device.
In common I2C communication, the first seven bits of the "address byte" contain the slave address, whereas the last bit is a read/write bit: 0 means write and 1 means read.
In your case, you want to write data to the device (to perform a soft reset), and therefore a simple left shift will do the trick.
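As an illustration, a small hedged sketch of how the address byte is composed (SHT31_ADDR is our name for the 7-bit address 0x44 from the datasheet):
#include <stdint.h>

#define SHT31_ADDR 0x44 // 7-bit address from the datasheet

// The 7-bit address occupies bits 7:1 on the wire; bit 0 is the R/W flag.
// HAL_I2C_Master_Transmit expects the address already in bits 7:1 and
// manages the R/W bit itself, hence the left shift in the call above.
uint8_t addr_write = (SHT31_ADDR << 1) | 0; // 0x88: address + write bit
uint8_t addr_read  = (SHT31_ADDR << 1) | 1; // 0x89: address + read bit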
It has never been agreed whether an I2C address is to be specified:
such that it needs to be shifted for transmission, or
such that it does not need to be shifted for transmission.
Therefore some device datasheets specify it in variant 1 and some in variant 2. Similarly, some I2C APIs take the address in variant 1 and some in variant 2.
If the device and the API use a different variant, it's the programmer's burden to shift the address.
It creates a lot of confusion and is quite annoying. I doubt it will ever be clarified.
Sorry for the late reply; I just bumped my head against this myself. This should be considered a bug, but ST refuses to acknowledge it as such.
If you check the I2C section of the reference manual, the OAR1 register stores the address in bits 7:1 for 7-bit mode; bits 0, 8 and 9 are ignored. The HAL routine that sets the address should therefore shift the 7 LSBs so that bits 6:0 of your address get written to bits 7:1 of the OAR1 register. This doesn't happen. Essentially, because the code was released, it is now a "feature" and not a bug.
Another way to look at it is that the address byte you pass to the HAL is left-aligned. This is extremely irritating, as it is not consistent between 7-bit and 10-bit addresses.

How can I change the start address on flash?

I'm using STM32F746ZG and FreeRTOS.
The start address of flash is 0x08000000, but I want to change it to 0x08040000. I've searched for this issue on Google but didn't find a solution.
I changed the linker script like the following.
MEMORY
{
  RAM (xrw)   : ORIGIN = 0x20000000, LENGTH = 320K
  /* FLASH (rx) : ORIGIN = 0x8000000, LENGTH = 1024K */
  FLASH (rx)  : ORIGIN = 0x8040000, LENGTH = 768K
}
If I change only this and run it under the debugger, it doesn't work.
If I also change VECT_TAB_OFFSET from 0x00 to 0x40000, it works fine.
/* #define VECT_TAB_SRAM */
#define VECT_TAB_OFFSET 0x40000 /* 0x00 */
SCB->VTOR = FLASH_BASE | VECT_TAB_OFFSET;
But if I don't use the debugger, it doesn't work at all.
That means it only works when using the ST-Link debugger.
Please let me know if you know the solution.
Thank you in advance for your reply.
The boot address can be set in the option bytes.
You can set any address in flash, in 16 KB increments. There are two 16-bit registers in the option bytes area; one is used when the boot pin is low at reset, the other when the pin is high. Write the desired address shifted right by 14 bits, i.e. divided by 16384.
To boot from 0x08040000, write 0x2010 into the register as described in the Option bytes programming chapter of the reference manual.
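A quick sketch of that arithmetic in C (the names are ours):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t boot_target = 0x08040000u;                // desired boot address
    uint16_t ob_value = (uint16_t)(boot_target >> 14); // divide by 16384 (16 KB granularity)
    printf("Option bytes boot value: 0x%04X\n", (unsigned)ob_value); // prints 0x2010
    return 0;
}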
You could also write a bootloader. The bootloader sits at address 0x08000000, loads your application firmware, and then jumps to it.
This is the other way to do it.
You need to place 8 bytes at the original beginning of the flash. The STM32 always boots from address 0x00000000, which is aliased to one of the memories (depending on the boot pins and option bytes).
The first word contains the initial stack pointer, the second one the address of your reset handler. You never get to your code otherwise, as the chip always boots from the same address.
You will need to modify your linker script and the startup files where the vectors are defined.
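A hedged sketch of the hand-off itself (APP_BASE is a hypothetical name for the application's flash base; __set_MSP() is the CMSIS helper for loading the main stack pointer; real code should disable interrupts and deinitialize peripherals first):
#include <stdint.h>
#include "stm32f7xx.h" // device header providing CMSIS (__set_MSP, SCB)

#define APP_BASE 0x08040000u // hypothetical application base address

typedef void (*app_entry_t)(void);

static void jump_to_application(void)
{
    uint32_t sp    = *(volatile uint32_t *)(APP_BASE);      // word 0: initial stack pointer
    uint32_t entry = *(volatile uint32_t *)(APP_BASE + 4u); // word 1: reset handler address

    SCB->VTOR = APP_BASE;   // point the vector table at the application
    __set_MSP(sp);          // load the application's stack pointer
    ((app_entry_t)entry)(); // jump to the application's reset handler
}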

Get amount of memory used by app in iOS

I'm working on an upload app that splits files before upload. It splits the files to prevent being closed by iOS for using too much memory, as some of the files can be rather large. It would be great if I could, instead of setting the max "chunk" size, set the max memory usage and determine the chunk size from that.
Something like this:
#define MAX_MEM_USAGE 20000000 // 20 MB
#define MIN_CHUNK_SIZE 5000    // 5 KB

- (void)uploadAsset:(ALAsset *)asset
{
    long long totalBytesRead = 0;
    ALAssetRepresentation *representation = [asset defaultRepresentation];
    while (totalBytesRead < [representation size])
    {
        long long chunkSize = MAX_MEM_USAGE - [self getCurrentMemUsage];
        // If I can't get 5 KB without getting killed, then I'm going to get killed
        chunkSize = MIN([representation size] - totalBytesRead, MAX(chunkSize, MIN_CHUNK_SIZE));
        uint8_t *buffer = malloc(chunkSize);
        // read file chunk in here, adding the result to totalBytesRead
        // upload chunk here
        free(buffer);
    }
}
That is essentially what I'm going for. I can't seem to find a way to get the current memory usage of my app specifically; I don't really care about the amount of system memory left.
The only way I've been able to think of is one I don't like much: grab the amount of system memory used on the first line of main in my app, store it in a static variable in a globals class, and then getCurrentMemUsage would go something like this:
- (long)getCurrentMemUsage
{
    long sysUsage = [self getSystemMemoryUsed];
    return sysUsage - [Globals origSysUsage];
}
This has some serious drawbacks. The most obvious one is that another app might get killed in the middle of my upload, which could drop sysUsage below origSysUsage, resulting in a negative number even if my app is using 10 MB of memory; that could let my app use 40 MB for a request rather than the 20 MB maximum. I could always clamp the value between MIN_CHUNK_SIZE and MAX_MEM_USAGE, but that would just be a workaround instead of an actual solution.
If there are any suggestions as to getting the amount of memory used by an app, or different methods for managing a dynamic chunk size, I would appreciate either.
As with any virtual-memory operating system, "memory used" is not well defined and is notoriously difficult to calculate.
Fortunately, thanks to the virtual memory manager, your problem can be solved quite easily with the mmap() C function. Basically, it lets your app treat the file as if it were loaded into RAM, while the contents are actually paged in from storage as they are accessed and paged out when iOS is low on memory.
This function is easy to use from iOS through the Cocoa API for it:
- (void)uploadMyFile:(NSString *)fileName
{
    // Note: +dataWithContentsOfMappedFile: is deprecated on modern iOS;
    // +dataWithContentsOfFile:options:error: with NSDataReadingMappedIfSafe
    // is the current equivalent.
    NSData *fileData = [NSData dataWithContentsOfMappedFile:fileName];
    // Work with the data as with any NSData* object. The iOS kernel
    // will take care of loading the file as needed.
}
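If you still want a concrete number for your app's own footprint, one common approximation (only an approximation, and the function name is ours) is the task's resident size from Mach's task_info():
#include <mach/mach.h>
#include <stdint.h>

// Return the app's current resident size in bytes, or 0 on failure.
static uint64_t app_resident_size_bytes(void)
{
    struct mach_task_basic_info info;
    mach_msg_type_number_t count = MACH_TASK_BASIC_INFO_COUNT;

    if (task_info(mach_task_self(), MACH_TASK_BASIC_INFO,
                  (task_info_t)&info, &count) != KERN_SUCCESS)
        return 0;
    return info.resident_size;
}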

Deal with potential memory issue in iPhone

What ways are there to deal with memory issues on the iPhone?
Is it possible to ask how much memory is available before jumping into a memory intensive section of code?
(or perhaps Apple would just say that if you have to use so much memory you are on the wrong platform?)
UIApplicationDelegate's applicationDidReceiveMemoryWarning: will let you know if you're using too much memory. If you want to check before a memory-intensive operation, here's a function that gets the available free memory in bytes on iOS:
natural_t TSGetFreeSystemMemory(void)
{
    mach_port_t host_port = mach_host_self();
    mach_msg_type_number_t host_size = sizeof(vm_statistics_data_t) / sizeof(integer_t);
    vm_size_t pagesize;
    vm_statistics_data_t vm_stat;

    host_page_size(host_port, &pagesize);
    if (host_statistics(host_port, HOST_VM_INFO, (host_info_t)&vm_stat, &host_size) != KERN_SUCCESS)
        printf("failed to get host statistics\n");

    // natural_t mem_used = (vm_stat.active_count + vm_stat.inactive_count + vm_stat.wire_count) * pagesize;
    natural_t mem_free = vm_stat.free_count * pagesize;
    return mem_free;
}
Apple appears not to be telling developers because they want to change the amount of memory available in new devices and OS releases. The number went way up on a freshly booted iPhone 4 and way down under iOS 4.0 after typical use on an iPhone 3G.
One possible method is to "preflight" the memory required for the successful completion of some operation: malloc the blocks that add up to your requirement, check that they all succeeded, and then free them. You can malloc in small chunks, spread over many milliseconds with a timer, to see if you can "push" other apps out of memory. Even this method is no guarantee, though, as Mail or some other background app could jump in and consume memory while your app is frontmost.
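A hedged sketch of that preflight idea (names are ours; note that malloc'd pages that are never touched may not actually be committed, which is why the blocks are written to as well):
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

// Try to allocate `total` bytes in `chunk`-sized blocks, touch them so the
// pages are really committed, then free everything and report success.
static bool preflight_memory(size_t total, size_t chunk)
{
    size_t n = (total + chunk - 1) / chunk;
    void **blocks = calloc(n, sizeof(void *));
    if (!blocks)
        return false;

    bool ok = true;
    for (size_t i = 0; i < n; i++) {
        blocks[i] = malloc(chunk);
        if (!blocks[i]) { ok = false; break; }
        memset(blocks[i], 0, chunk); // touch the pages to commit them
    }
    for (size_t i = 0; i < n && blocks[i] != NULL; i++)
        free(blocks[i]);
    free(blocks);
    return ok;
}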
If you use less than 20MB at any point in time, then a huge percentage of games in the App store will fail before your app does (just my random opinion).
Put the following in your app delegate; it will be called when memory is starting to run low. This is the Apple way of doing things:
- (void)applicationDidReceiveMemoryWarning:(UIApplication *)application
{
    // Free some memory, or set a flag noting that we are low
}