iOS / iPhone Journaling / File System Caching - iphone

From local device testing, I've seen that writing a file to the iOS file system (regardless of how low level the call you use) will often return success before the file is fully committed to the flash. Meaning, if you hard reset the device then reboot, your file could be rolled back (if the write completed or was atomic) or corrupted. What is the source of this delay (documentation appreciated, I haven't been able to find anything), and is there a way to get feedback when the actual filesystem write is completed. For instance, I'd like to acknowledge receipt and storage of a piece of data from a remote server, but I find that acknowledging it after write "reports" success could result in data loss in the event of a hard crash or power failure.

Since this is a 4 years old questions, I'll provide not only the answer, but also the path I took while searching for it.
I was not able to find any clear explanation in the official documentation: File System Programming Guide. There was only a clue in the Performance Tips section. It states that:
Apps can call the BSD fcntl function with the F_NOCACHE flag to enable or disable caching for a file. For more information about this function, see fcntl.
Enabling the F_NOCACHE flag does not solve the problem you're stating, however, the manual for fcntl method states there's an option that you might just find interesting:
F_FULLFSYNC Does the same thing as fsync(2) then asks the drive to flush all buffered data to the permanent storage device
(from man fcntl, see here).
I've checked the manual for fsync for more details. It has given me, eventually, the clearest and most understandable explanation of both the problem and the solution:
Note that while fsync() will flush all data from the host to the drive (i.e. the "permanent storage device"), the drive itself may not physically write the data to the platters for quite some time and it may be written in an out-of-order sequence.
Specifically, if the drive loses power or the OS crashes, the application may find that only some or none of their data was written. The disk drive may also re-order the data so that later writes may be present, while earlier writes are not.
This is not a theoretical edge case. This scenario is easily reproduced with real world workloads and drive power failures.
For applications that require tighter guarantees about the integrity of their data, Mac OS X provides the F_FULLFSYNC fcntl. The F_FULLFSYNC fcntl asks the drive to flush all buffered data to permanent storage. Applications, such as databases, that require a strict ordering of writes should use F_FULLFSYNC to ensure that their data is written in the order they expect.
(from man fsync, see here).
Yeah, it's definitely not a theoretical edge case. Thankfully, once you know the problem, the solution is trivial:
let filePath: String = "your file path"
// you can use other option than read-write
let fd = open(String(path.utf8), O_RDWR)
// if fd is -1, there was an error opening file, handle it as you wish
guard fd != -1 else { return }
// syncResult is -1 if sync operation failed, handle it as you wish
let syncResult = fcntl(fd, F_FULLFSYNC)
// don't forget to close opened file
close(fd)
Once fcntl finishes, your data will be saved.
Notice this operation is slower than a usual writing to file (via NSFileManager or writeToURL methods family). In case of performance issues, it's best to move writing to background thread.

Related

Update file descriptor pointing to /proc/self after fork() from python multiprocess.Process

I'm working on a C++ program that uses boost::python to provide a python wrapper/API for the user. The program tracks and limits its own memory usage by opening /proc/self/statm using a file descriptor. Every timestep it seeks to the beginning of that file and reads the vmsize from it.
proc_self_statm_fd = open( "/proc/self/statm", O_RDONLY );
However, this causes a problem when calling fork(). In particular, when a user writes a python script that does something like this:
proc = multiprocessing.Process(name="bkg_process",target=bkg_process,daemon=True)
The problem is that the forked process gets the file descriptor pointing to /proc/self/statm from the parent process, not its own, and this reports the wrong memory usage. Even worse, if the parent process exits, the child process will fail when trying to read from the file descriptor.
What's the correct solution for this? It needs to be handled at the C++ level because we don't have control over the user's python scripts. Is there a way to have the class auto detect that a fork has happened and grab a new file descriptor? In the worst case I can have it re-open the file for every update. I'm worried that would add runtime overhead though.
You could store the PID in the class, and check it against the value of getpid() on each call, and then reopen the file if the PID has changed. getpid() is typically much cheaper than open - on some systems it doesn't even need a context switch (it just fetches the PID from a magic location in the process's own memory).
That said, you may also want to actually measure the cost of reopening the file each time - it may not actually be significant.

How to read STM32H7 SRAM ECC status?

I want to periodically check the state of the RAM ECC detection in my firmware on a STM32H753.
The ECC mechanism is well described in the Reference Manual for the Flash but not very well (in my opinion) for the RAM.
I've seen that there are dedicated interrupts for RAM ECC but it is not clear if they are triggered for single error or double error ?
Also, why a dedicated interrupt and not a BusFault (like it is done on the Flash if I understood correctly ?)
Also , I don't want to get interrupted for a single error (that is properly corrected). Instead I want to get the status periodically and log the error somewhere. What is the register to read to check whether a single error happened ?
I've got an answer on STMicro forum, I thought it might be helpful to post it here. The answer is from a member called Berendi.
"RAMECC status and interrupt flags are defined in the stm32h7??xx.h headers, and there are some code examples in stm32h7xx_hal_ramecc.h/.c
Apparently it's possible to independently mask single/double ecc fault interrupts.
There is application note AN5342 as well, but it's not very helpful."

Files read immediately after watchman notify are empty

I'm integrating watchman via the socket/bser interface in a JVM program.
I'm seeing odd timing where:
A file is written to by the build system (a small text file)
I get a watchman notification on the bser interface
Thread A listening for bser subscription notifications puts the update onto a queue for a separate thread
Thread B reads the queue, reads the changed file, and then puts the file's data on the wire
However, somehow, Thread B is reading an empty file.
Which, I assume is validly empty at some point, e.g. the IO/syscalls might be:
Clear the file contents
Write chunk 1
Write chunk 2
Close the file
And I assume my Thread B is reading the file between steps 1 and 2. Or maybe 1 and 4, if 4 is when the result is flushed.
My confusion is two fold:
1) I thought watchman's default 20ms wait would account for things like this, and I'd only see an update on my thread A, let alone when my thread B does a read, after step 4, and the data is done being written to the file.
2) Even if watchman did tell me "too soon" about the 1st syscall (say step 1), and I read the results while it was an empty file, there should be another syscall/watchman notification that "btw, the file has some content now".
FWIW/oddly enough, I was seeing this very same behavior when using the Java WatchService API, where I would get an inotify event, but read a file "too soon", and so get either empty or partial results, and then no follow up inotify event when the rest of the data was available.
I assumed this was a fluke/nuance of the WatchService, so I solved it at the time by checking the file mod time before reading it, and just waiting to ensure mod time >2 seconds old before assuming the file is "done" being written.
(Note that this also handled ~100mb+ files being written, where the build process might write a chunk of data every 100ms+, but with WatchService I was seeing 100s of inotify notifications for what was essentially a single continuous write.)
When I ported my WatchService code to watchman, I dropped this "ensureSettled" hack, because I assumed watchman's 20ms settle period (which is way lower than the 2s I was using, but hey it's the default) + it's general robustness compared to the somewhat beta WatchService would mean it wouldn't be a problem.
But within ~a day of using the watchman-ported code, I'm seeing empty file reads, just like I was with the WatchService.
Any ideas about what I'm missing?
I can add back the ensureSettled hack, but at this point I'm curious about what is going on.
The docs aren't very clear on this, sorry!
Dispatching of subscription notifications is subject to the settle timeout, but since file updates are non-atomic it's likely that the default 20ms kicks in before the file contents are visible to you; under the covers, the kernel generates a series of notifications for the various mutations that you're doing, so if the truncate takes 20ms before you write (or perhaps flush) the data out, you'll likely get a notification "in the middle".
This stuff is also operating system dependent. Here's an example of a recently discovered and resolved issue: https://github.com/facebook/watchman/commit/bac383c751b248ae742a2a20df3e8272238c0ae2
it doesn't sound like it is quite the same thing as you're experiencing, it just adds some color to this discussion.
If you already have code to manage the settling in your client, then it may be easier for you to add that back; we do this in watchman-make for example.
You may also wish to try setting https://facebook.github.io/watchman/docs/config.html#settle in a .watchmanconfig file in the root of the directory tree that you're watching and leave that to the watchman server. If/when you change this setting, you will need to delete and restart the watch.
Which you choose depends on how you want to trade ease of configuration against volume of code you want to maintain and (perhaps) volume of support questions from your user base if the .watchmanconfig isn't correctly configured for them.
Note that you can use the command invocation from https://facebook.github.io/watchman/docs/cmd/log-level.html to see the debug logging for the kernel notifications as they come in in real time; this may be helpful for you in understanding exactly which notifications are coming in and when.
Just curious, are you using https://github.com/facebook/watchman/tree/master/java to talk to the watchman server?

Why are there ioctl calls in socket.c?

Trying to understand why there are ioctl calls in socket.c ? I can see a modified kernel that I am using, it has some ioctl calls which load in the required modules when the calls are made.
I was wondering why these calls ended up in socket.c ? Isn't socket kind of not-a-device and ioctls are primarily used for device.
Talking about 2.6.32.0 heavily modified kernel here.
ioctl suffers from its historic name. While originally developed to perform i/o controls on devices, it has a generic enough construct that it may be used for arbitrary service requests to the kernel in context of a file descriptor. A file descriptor is an opaque value (just an int) provided by the kernel that can be associated with anything.
Now if you treat a file descriptor and think of things as files, which most *nix constructs do, open/read/write/close isn't enough. What if you want to label a file (rename)? what if you want to wait for a file to become available (ioctl)? what if you want to terminate everything if a file closes (termios)? all the "meta" operations that don't make sense in the core read/write context are lumped under ioctls; fctls; etc. unless they are so frequently used that they deserve their own system call (e.g. flock(2) functionality in BSD4.2)

Kernel Code vs User Code

Here's a passage from the book
When executing kernel code, the system is in kernel-space execut-
ing in kernel mode.When running a regular process, the system is in user-space executing
in user mode.
Now what really is a kernel code and user code. Can someone explain with example?
Say i have an application that does printf("HelloWorld") now , while executing this application, will it be a user code, or kernel code.
I guess that at some point of time, user-code will switch into the kernel mode and kernel code will take over, but I guess that's not always the case since I came across this
For example, the open() library function does little except call the open() system call.
Still other C library functions, such as strcpy(), should (one hopes) make no direct use
of the kernel at all.
If it does not make use of the kernel, then how does it make everything work?
Can someone please explain the whole thing in a lucid way.
There isn't much difference between kernel and user code as such, code is code. It's just that the code that executes in kernel mode (kernel code) can (and does) contain instructions only executable in kernel mode. In user mode such instructions can't be executed (not allowed there for reliability and security reasons), they typically cause exceptions and lead to process termination as a result of that.
I/O, especially with external devices other than the RAM, is usually performed by the OS somehow and system calls are the entry points to get to the code that does the I/O. So, open() and printf() use system calls to exercise that code in the I/O device drivers somewhere in the kernel. The whole point of a general-purpose OS is to hide from you, the user or the programmer, the differences in the hardware, so you don't need to know or think about accessing this kind of network card or that kind of display or disk.
Memory accesses, OTOH, most of the time can just happen without the OS' intervention. And strcpy() works as is: read a byte of memory, write a byte of memory, oh, was it a zero byte, btw? repeat if it wasn't, stop if it was.
I said "most of the time" because there's often page translation and virtual memory involved and memory accesses may result in switched into the kernel, so the kernel can load something from the disk into the memory and let the accessing instruction that's caused the switch continue.