How can I check whether my app is compiled in 32-bit or 64-bit?
This is helpful when debugging low-level code (working with buffers, for example).
A compile-time check would involve #ifdef'ing on __LP64__, which is defined when compiling for the LP64 data model (64-bit longs and pointers) used by Apple's 64-bit platforms. A runtime solution would involve checking the size of pointers, like so:
if (sizeof(void*) == 4) {
    // Executing in a 32-bit environment
} else if (sizeof(void*) == 8) {
    // Executing in a 64-bit environment
}
Thankfully, pointer sizes are the one thing that the different standards for compiling 64-bit code seem to agree on.
#ifdef __LP64__
NSLog(@"64-bit\t");
#else
NSLog(@"32-bit\t");
#endif
You could check the size of a pointer: on 32-bit it is 4 bytes, and on 64-bit it is 8.

if (sizeof(void *) == 4) { /* 32-bit */ } else { /* 64-bit */ }
I am trying to build the kernel module driver (KMD) for NVDLA, NVIDIA's Deep Learning Accelerator, and got the following error at the end of the build.
[screenshot of the build error]
After doing some research on Google I found that it is due to 64-bit operations (especially 64-bit division) present in the KMD. After further investigation I found that the KMD was written for a 64-bit architecture, while I am trying to compile it for a 32-bit (ARM Cortex-A9) processor. Some people online have suggested using -lgcc, which should take care of the issue.
Could anyone help me edit the makefile to link against the libgcc library?
Thanks in advance.
Linux kernel code that uses 64-bit division should use the functions provided by #include <linux/math64.h>. Otherwise, when building for 32-bit architectures, GCC will emit calls to helper functions from libgcc (typically surfacing as undefined references to symbols such as __aeabi_uldivmod at link time), and libgcc is not linked into the kernel.
For example, the div_u64 function divides a 64-bit unsigned dividend by a 32-bit unsigned divisor and returns a 64-bit unsigned quotient. The KMD code referenced by OP contains this function:
int64_t dla_get_time_us(void)
{
    return ktime_get_ns() / NSEC_PER_USEC;
}
After adding #include <linux/math64.h>, it can be rewritten to use the div_u64 function as follows:
int64_t dla_get_time_us(void)
{
    return div_u64(ktime_get_ns(), NSEC_PER_USEC);
}
(Note that ktime_get_ns() returns a u64 (an unsigned 64-bit integer), and NSEC_PER_USEC has the value 1000, so it can be used as a 32-bit divisor.)
There may be other places in the code where 64-bit division is used, but that is the first one I spotted.
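If any of those turn out to involve a 64-bit divisor as well, the same header provides div64_u64 (plus signed variants); a minimal sketch with hypothetical names, not code from the KMD:

#include <linux/math64.h>

/* Hypothetical example: both dividend and divisor are 64-bit, so div_u64
 * would not fit. div64_u64 avoids the libgcc helper (__aeabi_uldivmod)
 * that a plain '/' would require on 32-bit ARM. */
static u64 periods_elapsed(u64 total_ns, u64 period_ns)
{
    return div64_u64(total_ns, period_ns);
}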
I'm performing a set of activities to make sure Redis runs well on a set of embedded systems, including the Raspberry Pi. In order to fix certain code paths of Redis where unaligned memory accesses are performed (due to a change introduced in Redis 3.2), I'm trying to force the Pi to either log a message on unaligned memory accesses or send a signal to the process when this happens. In this way I can both make sure that Redis will run well on platforms where unaligned accesses are a violation, and that it will run faster on platforms where such accesses can be performed but are slower. ARM v6, the one used in the Pi v1, is apparently able to deal with unaligned memory accesses, so if I use the following command to configure Linux to send a signal to the process performing the unaligned access (4 selects the "signal" mode; 1 would merely log a warning and 2 would fix the access up):
echo 4 > /proc/cpu/alignment
And then run the following program:
#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv) {
    char *buf = "foobareklsjdfklsjdfslkjfskdljfskdfjdslkjfdslkjfsd";
    uint32_t *l = (uint32_t*) (buf+1);  /* deliberately misaligned pointer */
    printf("%p\n", (void*)l);
    printf("%d\n", (int)*l);            /* unaligned 32-bit load */
    return 0;
}
I can't see any signal received by the process, nor the counters at /proc/cpu/alignment incrementing.
My guess is that this is due to ARM v6's ability to deal with unaligned addresses automatically when a given CPU configuration flag is set. My question is: is my hypothesis correct? And if so, how can I force a Pi version 1 to actually raise an exception on unaligned accesses, so that the Linux kernel can trap them and send a signal, log the access, and so forth, according to the /proc/cpu/alignment settings?
EDIT: It is worth noting that not all instructions can perform unaligned accesses, even on ARM v6. For instance STMDB, STMFD, LDMDB, LDMEA and similar multiple-word instructions will indeed raise an exception and be trapped by the Linux kernel.
I think I eventually found my answers:
1. Yes, I was correct: up to the word size, ARM v6 (or greater) silently handles unaligned accesses, so no trap is generated and the whole thing is completely transparent to the Linux kernel. Nothing is logged, nor is the trap counter in /proc/cpu/alignment incremented.
2. AFAIK there is no way to force the kernel to trap word-sized unaligned accesses: to do that, the CPU would apparently have to be configured to trap unaligned addresses in every case, and the Linux kernel does not do that, probably because there is alignment-unsafe code inside the kernel itself. Checking the Linux kernel source code (arch/arm/mm/alignment.c) one can indeed see:
if (cpu_is_v6_unaligned()) {
    set_cr(__clear_cr(CR_A));
    ai_usermode = safe_usermode(ai_usermode, false);
}
What this means is that the SCTLR.A bit is always cleared, so no trap will be generated for the unaligned accesses that ARM v6 can handle.
3. There are a great many instructions that will still generate traps when used with unaligned addresses, for example multiple load/store instructions and loads and stores of double values.
4. However, there are instructions that GCC (the version shipped in the default Raspberry Pi Linux distribution) will happily produce that are not handled correctly by the Linux kernel; these result in a SIGBUS even when /proc/cpu/alignment is set to fix up the access.
5. Point number 4 basically means that it is not a good idea to fix programs to run on ARM v6 by just letting the Linux kernel handle unaligned addresses for us, even when the performance implications of unaligned accesses are not a problem: the program can still crash, since not all instructions are handled.
How to reliably find all the unaligned accesses in a program remains an open question AFAIK: unfortunately the otherwise wonderful Valgrind never implemented this feature, which would have been the trivial way to do it. In the past I had to resort to QEMU emulating SPARC, but that is a very slow process.
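A hedged aside for newer toolchains: recent GCC and Clang implement an alignment checker as part of UBSan (-fsanitize=alignment), which instruments every load and store at compile time and therefore reports misaligned accesses regardless of whether the CPU would trap. Assuming the test program above is saved as unaligned.c:

gcc -fsanitize=alignment -fno-sanitize-recover=alignment -g -o unaligned unaligned.c
./unaligned    # the instrumented binary reports the misaligned uint32_t load and aborts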
For some unknown reason Intel decided not to support AVX2 via the typical /arch: option. /arch: recognizes only the following instruction sets: IA32, SSE, SSE2, SSE3, AVX. So if you want to compile for AVX2 you are basically forced to activate the /QxCORE-AVX2 switch. The problem with this option is that it injects check code: at runtime, that code checks whether your CPU is compatible with the selected instructions. If the CPU is not compatible, this message pops up:
Please verify that both the operating system and the processor support Intel(R)
MOVBE, F16C, FMA, BMI, LZCNT and AVX2 instructions.
Now I'm worried that the same message may pop up on AMD Excavator and Ryzen CPUs because they are not "GenuineIntel". Unfortunately I do not have access to any AMD CPU, so I can't check this on real hardware. To make your life easier, I've compiled this simple code with the /QxCORE-AVX2 option activated:
#include "stdafx.h"
int _tmain(int argc, _TCHAR* argv[])
{
double a, b, c;
a = 3.0;
b = 2.0;
c = 1.0;
a = a*b + c;
printf("a=%1.1f",a);
return 0;
}
and here is the decompiled asm code: http://codepad.org/KL4Vq978
My question to those who understand asm code: do you see anything that might block execution of this code on the latest AMD CPUs? If yes, will this http://www.softpedia.com/get/Programming/Patchers/Intel-Compiler-Patcher.shtml help?
It turns out that /arch:CORE-AVX2 is recognized, and the compiled executable contains FMA instructions! I really do not understand why this option is not listed in Visual Studio or in ICL /help.
Dropdown menu in Visual Studio (no AVX2!)
http://i.cubeupload.com/c1xidV.png
ICL /help
http://i.cubeupload.com/y2Cre6.png
Ryzen supports these instruction sets, but the code will not run on AMD processors because it checks whether the processor is "GenuineIntel". There has been a long discussion and legal battle about this issue; see http://www.agner.org/optimize/blog/read.php?i=49
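For reference, the vendor test itself is simple; here is a minimal sketch of how a "GenuineIntel" check works (my illustration, not the actual dispatcher code), using the __cpuid intrinsic from <intrin.h>:

#include <intrin.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    int info[4];                      /* EAX, EBX, ECX, EDX */
    char vendor[13] = { 0 };

    __cpuid(info, 0);                 /* leaf 0 returns the vendor string */
    memcpy(vendor + 0, &info[1], 4);  /* EBX */
    memcpy(vendor + 4, &info[3], 4);  /* EDX */
    memcpy(vendor + 8, &info[2], 4);  /* ECX */

    /* prints "GenuineIntel" on Intel, "AuthenticAMD" on AMD */
    printf("%s\n", vendor);
    return strcmp(vendor, "GenuineIntel") == 0 ? 0 : 1;
}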
This shouldn't be as hard as one may think, if I got it right. Specifically, I'll begin with iOS and the ELF executable format. Let's clarify that I have a jailbroken iPhone and I don't want to do this in any App Store apps, so please avoid "good advice" like "you can't do it, as it's prohibited by Apple".
So, what I have seen is that there's a Flash player implementation called Frash (by Comex, btw, developer of recent jailbreaks). This utility requires, after installation, that Android's libflashplayer.so is present on (copied to) the iPhone file system. I dug into the source code and found out that the tweak actually opens the Android (ELF) shared object file, "parses" it and executes code from it. I already asked a friend of mine whether this is actually possible, and he told me that it is, because ELF on ARM and Mach-O on ARM are binary compatible (they're both ARM machine code). But he failed to explain it to me in detail, so I'd like to ask how it can be done. I can't exactly understand the source code fragment that handles this, but one thing is sure:
int fd = open("libflashplayer.so", O_RDONLY);
_assert(fd > 0);
fds_init();
sandbox_me();
int symtab_size;
Elf32_Sym *symtab;
void **init_array;
Elf32_Word init_array_size;
char *strtab;
TIME(base_load_elf(fd, &symtab, &symtab_size, &init_array, &init_array_size, &strtab));
// Call the init funcs
_assert(init_array);
while(init_array_size >= 4) {
    void (*x)() = *init_array++;
    notice("Calling %p", x);
    x();
    init_array_size -= 4;
}
(from the original code, as of 02/12/2011 on GitHub)
It seems to me that he uses libelf to perform this, right? And that in an ELF file there are symbols that can be executed on a compatible processor just fine?
I'd also like to know whether this holds for other processor architectures as well. So maybe one can execute symbols from Linux binaries on OS X?
The important thing for compatibility is the underlying processor architecture, not Linux vs. OS X vs. Android. If the ELF shared object and the host are compiled for the same processor instruction set, then this can work. If not, they are not compatible. For example, if both were built for Linux but for different processors, they would not be compatible.
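If you want to verify this for a given binary, the ELF header records the target ISA in its e_machine field; a minimal sketch (my illustration, assuming a 32-bit ELF and a Linux-style <elf.h>):

#include <elf.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    Elf32_Ehdr ehdr;
    FILE *f = argc > 1 ? fopen(argv[1], "rb") : NULL;

    if (!f || fread(&ehdr, sizeof ehdr, 1, f) != 1)
        return 1;
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0)
        return 1;  /* not an ELF file */

    /* EM_ARM means the file contains ARM machine code, which an ARM
     * process can in principle map and execute after relocation. */
    printf("e_machine = %u (EM_ARM = %u)\n",
           (unsigned)ehdr.e_machine, (unsigned)EM_ARM);
    fclose(f);
    return 0;
}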
I am working on an iPhone app which uses an external library for which I have the source. During debugging, some of the objects are listed as 0x0 in the debugger, but the app runs fine. Also, some of the objects' addresses point to the wrong thing.
These symbols are in the external library. The addresses are fine if I am tracing through a file that is actually in the external library.
Does anyone have suggestions how to stop this behavior?
UPDATE: target settings > Build tab > GCC 4.2 Code Generation > "Compile for Thumb"
I turned off this target setting and the gdb problem went away.
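(If you build outside Xcode, the equivalent with GCC's ARM backend should be the -marm flag, which forces ARM rather than Thumb code generation, e.g. gcc -marm -g -c SomeFile.m, where SomeFile.m stands in for whichever unit you need built without Thumb. I haven't tested this on the iPhone toolchain specifically.)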
--
Hi John.
I understand what you're referring to. I'm also seeing a problem where gdb and NSLog() are giving me different results for pointers in certain parts of my code.
In a boiled-down example, gdb fails to report the proper value for "pointer" when I set a breakpoint on any line in this function:
id testPointer( id pointer )
{
    NSLog( @"pointer value: %p", pointer );

    @try
    {
        NSLog( @"foo" );
    }
    @catch ( NSException *e )
    { }
    @finally
    { }

    return pointer;
}
As zPesk notes, 0x0 is nil, which is a common value for objects that have not been initialized (particularly instance variables). I'm not certain what you mean by "point to the wrong thing." If you haven't initialized a local (stack) variable, it may point to any random address until it is initialized. What behavior are you having trouble with?
Were you ever able to resolve this issue? I, too, am noticing strange behavior in gdb when mixing Thumb and ARM modes. For example, it appears that the addresses of variables reported by gdb are off by exactly 64 bytes from the addresses reported using printf("%p\n") statements. Perhaps gdb needs to be explicitly told whether the current operating mode is ARM or Thumb...?
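Possibly relevant: newer gdb builds for ARM expose settings that override the ARM/Thumb mode heuristics. I haven't verified that they cure this particular 64-byte offset, but they may be worth a try:

(gdb) set arm fallback-mode thumb   # mode to assume where symbols don't say
(gdb) set arm force-mode thumb      # assume every instruction is Thumb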