ARM crypto instructions and __ARM_FEATURE_CRYPTO macro - macros

I'm having a hard time determining ARM-64 features across platforms (Linux, Apple, Windows Phone and Windows Store) and toolchains (ARMCC, GCC, Clang, MSVC). According to ARM's documentation at Compiler Toolchain for __ARM_FEATURE_CRYPTO:
Set to 1 if the target has crypto instructions.
Backtracking a bit further, according to the ARM C Language Extensions 2.0 (ACLE):
6.5.7 Crypto Extension
__ARM_FEATURE_CRYPTO is defined to 1 if the Crypto instructions are supported and the intrinsics defined in 12.3.14 are available. These instructions include
AES{E, D}, SHA1{C, P, M} etc. This is only available when __ARM_ARCH>= 8.
And:
12.3.14 Crypto Intrinsics
Crypto extension instructions are part of the Advanced SIMD instruction set.
These intrinsics are available when __ARM_FEATURE_CRYPTO is defined ...
If you notice, section 6.5.7 defers to 12.3.14, and 12.3.14 circles back and defers to 6.5.7.
Which instructions will trigger the define? And when the instruction(s) are present, which intrinsics are available?
From a 64-bit ARMv8-a LeMaker HiKey (asimd is neon in disguise):
$ cat /proc/cpuinfo
Processor : AArch64 Processor rev 3 (aarch64)
processor : 0
...
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
And from the same LeMaker HiKey dev board (-march=native is not available):
$ gcc -dM -E -march=armv8-a -mcpu=cortex-a53 - < /dev/null | egrep -i '(arm|aarch|neon|crc|crypto)'
#define __AARCH64_CMODEL_SMALL__ 1
#define __aarch64__ 1
#define __AARCH64EL__ 1
#define __ARM_NEON 1
And from an Apple toolchain with -arch arm64:
$ clang++ -arch arm64 -dM -E - < /dev/null | sort | egrep -i '(arm|aarch|neon|crc)'
#define __AARCH64EL__ 1
#define __AARCH64_SIMD__ 1
#define __ARM64_ARCH_8__ 1
#define __ARM_64BIT_STATE 1
#define __ARM_ACLE 200
#define __ARM_ALIGN_MAX_STACK_PWR 4
#define __ARM_ARCH 8
#define __ARM_ARCH_ISA_A64 1
#define __ARM_ARCH_PROFILE 'A'
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_CRYPTO 1
#define __ARM_FEATURE_DIV 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_FP 0xe
#define __ARM_FP16_FORMAT_IEEE 1
#define __ARM_FP_FENV_ROUNDING 1
#define __ARM_NEON 1
#define __ARM_NEON_FP 7
#define __ARM_NEON__ 1
#define __ARM_PCS_AAPCS64 1
#define __ARM_SIZEOF_MINIMAL_ENUM 4
#define __ARM_SIZEOF_WCHAR_T 4
#define __aarch64__ 1
#define __arm64 1
#define __arm64__ 1

Being an optional extension, it's generally down to you to tell the compiler if your target implements the crypto instructions. For GCC or regular Clang, that means adding the +crypto feature modifier to your -march or -mcpu setting.
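Since the compiler only knows what you tell it, whether the CPU you actually run on has the instructions is a separate, runtime question. On Linux, one place to look is the Features line of /proc/cpuinfo shown in the question. Below is a minimal sketch of checking such a line for a given token; the function name cpuinfo_has_feature is made up for illustration, and the caller is assumed to have already extracted the text after "Features :".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole whitespace-separated token in
 * `features` (the text after "Features :" in /proc/cpuinfo), else 0.
 * Whole-token matching matters: "sha" must not match "sha1" or "sha2". */
int cpuinfo_has_feature(const char *features, const char *name)
{
    assert(features != NULL && name != NULL);

    size_t len = strlen(name);
    const char *p = features;

    if (len == 0)
        return 0;

    while (*p != '\0') {
        /* Skip separators between tokens. */
        while (*p == ' ' || *p == '\t')
            p++;

        /* Measure the current token. */
        const char *start = p;
        while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n')
            p++;

        if ((size_t)(p - start) == len && strncmp(start, name, len) == 0)
            return 1;

        /* Only the first line is examined. */
        if (*p == '\n')
            break;
    }
    return 0;
}
```

With the HiKey line from the question, cpuinfo_has_feature("fp asimd evtstrm aes pmull sha1 sha2 crc32", "aes") returns 1, while asking for "sha" returns 0.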
It looks like Apple's version of Clang enables it unconditionally, but the target there is implicitly "Apple's CPUs", and I doubt they make non-crypto versions since they don't license their CPU designs at all, let alone for export. As for Windows Phone, whilst ARMv8 does add the crypto instructions to AArch32 as well, the VS2015 ARM compiler still only seems to support ARMv7, so I think it's entirely moot there.
Note that GCC doesn't do much with the crypto feature other than pass it through to the assembler, since it doesn't properly support ACLE. I tried Clang 3.8 as packaged by Arch Linux, and that does happily compile the standard AES intrinsics from arm_neon.h if -march=armv8-a+crypto is given.
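As a rough illustration of how the macro is meant to be used, here is a minimal sketch that only touches the AES intrinsics when __ARM_FEATURE_CRYPTO is defined, and still compiles on any other target. The names HAVE_ARM_CRYPTO, built_with_arm_crypto and aes_round are made up for this example; vaeseq_u8 and vaesmcq_u8 are the ACLE intrinsics for the AESE and AESMC instructions.

```c
#include <stdint.h>

/* The compiler defines __ARM_FEATURE_CRYPTO only when told the target has
 * the extension, e.g. -march=armv8-a+crypto for GCC or regular Clang. */
#if defined(__ARM_FEATURE_CRYPTO)
#include <arm_neon.h>
#define HAVE_ARM_CRYPTO 1   /* hypothetical project-local switch */
#else
#define HAVE_ARM_CRYPTO 0
#endif

/* 1 if this translation unit was built with the crypto intrinsics enabled. */
int built_with_arm_crypto(void)
{
    return HAVE_ARM_CRYPTO;
}

#if HAVE_ARM_CRYPTO
/* One AES encryption round step on a 16-byte state:
 * AESE (AddRoundKey, SubBytes, ShiftRows) followed by AESMC (MixColumns). */
void aes_round(uint8_t state[16], const uint8_t round_key[16])
{
    uint8x16_t s = vld1q_u8(state);
    uint8x16_t k = vld1q_u8(round_key);

    s = vaesmcq_u8(vaeseq_u8(s, k));
    vst1q_u8(state, s);
}
#endif
```

This only reflects what the compiler was told at build time; it says nothing about the CPU the binary eventually runs on.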

Related

g++ cannot link libatomic library on Microsoft user-agent

I have a very simple dummy program called main.c, as below:
#include <stdlib.h>
#include <stdio.h>
#define TEST __atomic_compare_exchange
void test() {
    __int128 unsigned a = 1, b = 2, c = 3;
    __atomic_compare_exchange_16(&a, &b, c, 1, 1, 1);
    printf("hello");
}
When compiled with the following command, it works fine on my local Linux machine (Debian, gcc version 6):
g++ --shared -o libmain.so -latomic main.c -Wl,--no-as-needed
However, on the Microsoft hosted agent ubuntu-18.04 it fails regardless of which command I try. Below is a list of commands that I have tried:
g++ --shared -o libmain.so -latomic main.c -Wl,--no-as-needed
g++ --shared -o libmain.so /usr/lib/x86_64-linux-gnu/libatomic.so.1 main.c -Wl,--no-as-needed
g++ --shared -o libmain.so -L/usr/lib/x86_64-linux-gnu/ -l:libatomic.so.1 main.c -Wl,--no-as-needed
When I run ldd libmain.so, libatomic is not shown in the list:
linux-vdso.so.1 (0x00007ffc3c5b6000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f889324e000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8892eb0000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f8892c98000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f88928a7000)
/lib64/ld-linux-x86-64.so.2 (0x00007f889385d000)
When I run readelf -W -s libmain.so, __atomic_compare_exchange_16 shows as undefined, without any @... suffix indicating which libatomic library to look for.
Symbol table '.dynsym' contains 14 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5 (2)
2: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@GLIBC_2.2.5 (2)
3: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __atomic_compare_exchange_16
4: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __stack_chk_fail@GLIBC_2.4 (3)
5: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTable
6: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
7: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable
8: 0000000000201038 0 NOTYPE GLOBAL DEFAULT 22 _edata
9: 0000000000201040 0 NOTYPE GLOBAL DEFAULT 23 _end
10: 000000000000070a 150 FUNC GLOBAL DEFAULT 12 _Z4testv
11: 00000000000005c0 0 FUNC GLOBAL DEFAULT 9 _init
12: 0000000000201038 0 NOTYPE GLOBAL DEFAULT 23 __bss_start
13: 00000000000007a0 0 FUNC GLOBAL DEFAULT 13 _fini
I have also checked g++ --print-search-dirs, and the library search directories all look correct to me.
Is the Microsoft hosted agent environment just different, or am I missing an obvious linker option?
Update
What I had missed is a fundamental point about the linker's symbol search order. Usually the linker resolves symbols from left to right, although some linkers search all libraries regardless of order. I have tested this on one of my other Ubuntu VMs, and changing the order does work as expected.
I will update once I have tested with the Microsoft hosted agent.
Update 2
I can confirm that the cause is a difference between the gcc linker on Ubuntu and the one on my local Debian machine.
To answer my own question:
On the Ubuntu machine, the gcc linker appears to resolve symbols in left-to-right order. Placing '-latomic' before the source file therefore means the reference is not yet known when the library is seen, and the library is not linked. This would be consistent with Ubuntu's gcc passing --as-needed to the linker by default: -Wl,--no-as-needed only affects libraries that follow it on the command line, so putting it at the end has no effect on '-latomic'.
On my local Debian machine, the linker appears to search in both directions, so the library is linked regardless of where the '-l' option appears in the command.

Need some help in Fuzzing Mosquitto lib

I hit an error while fuzzing the Mosquitto library and would like to know how to fix it.
Step 1. Compile the library
#:~/fuzz/fuzzmqtt/mosquitto$ ls
about.html doc Makefile security
aclfile.example docker man SECURITY.md
appveyor.yml edl-v10 misc service
buildtest.py epl-v10 mosquitto.conf set-version.sh
ChangeLog.txt examples Mosquitto.podspec snap
client installer notice.html src
CMakeLists.txt lib pskfile.example test
compiling.txt libmosquitto.pc.in pwfile.example THANKS.txt
config.h libmosquittopp.pc.in readme.md travis-configure.sh
config.mk LICENSE.txt readme-tests.md travis-install.sh
CONTRIBUTING.md logo readme-windows.txt www
#:~/fuzz/fuzzmqtt/mosquitto$ sudo make install CC="clang -O2 -fno-omit-frame-pointer -g -fsanitize=address -fsanitize-coverage=trace-pc-guard,trace-cmp,trace-gep,trace-div" -j2
Step 2. Compile the fuzzer
#:~/fuzz/fuzzmqtt/mosquitto/lib$ clang -g -O1 -fsanitize=fuzzer,address mos_fuzzer.cc -o mos_fuzzer -lmosquitto
Step 3. Run the fuzzer, which produces the error
#~/fuzz/fuzzmqtt/mosquitto/lib$ ./mos_fuzzer
INFO: Seed: 106983829
INFO: Loaded 1 modules (2337 guards): 2337 [0x7f157cd816b0, 0x7f157cd83b34),
INFO: Loaded 1 modules (1 inline 8-bit counters): 1 [0x787f80, 0x787f81),
INFO: Loaded 1 PC tables (1 PCs): 1 [0x565af8,0x565b08),
ERROR: The size of coverage PC tables does not match the
number of instrumented PCs. This might be a compiler bug,
please contact the libFuzzer developers.
Also check https://bugs.llvm.org/show_bug.cgi?id=34636
for possible workarounds (tl;dr: don't use the old GNU ld)
The code is as follows:
#include "stdio.h"
#include "mosquitto.h"
#include "assert.h"
#include "stdint.h"
#include "stddef.h"
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    bool clean_session = true;
    struct mosquitto *mosq = NULL;
    mosquitto_lib_init();
    /* The fuzz input is passed as the user data pointer of the new client. */
    void *data_1 = (void *)data;
    mosq = mosquitto_new(NULL, clean_session, data_1);
    mosquitto_destroy(mosq);
    mosquitto_lib_cleanup();
    return 0;
}
Thank you

Different behavior of mpiexec in Windows and in Ubuntu

I have a Fortran program (program.f) that I have compiled with Eclipse in Ubuntu 16 and in Windows 7.
The Eclipse configuration for Ubuntu is as follows:
GNU Fortran Compiler: gfortran
Include paths (-I) : /usr/lib/openmpi/include
GNU Fortran Linker : mpif90
Tool Chain Editor : GCC Fortran
The Eclipse configuration for Windows is as follows:
GNU Fortran Compiler: gfortran
Include paths (-I) : C:\cygwin64\usr\include
GNU Fortran Linker : mpif90
Tool Chain Editor : GCC Fortran
When I execute the program in Ubuntu, it works as expected.
In Ubuntu the program is executed with 2 processors by doing
$ mpiexec -np 2 myprogram
And the output is as follows:
$ mpiexec -np 2 myprogram
There are 2 processors running this job.
Rank# 1 d1= 65 d2= 128
Rank# 0 d1= 1 d2= 64
Here d1 and d2 are the bounds of the piece of the problem domain assigned to each processor. In this example the total domain is 128, so 1 to 64 is assigned to processor 0 and 65 to 128 is assigned to processor 1. This is the expected behavior.
On the other hand, in Windows, after compiling the code with the configuration above, I execute the program by doing:
$ mpiexec.exe -n 2 myprogram.exe
And the output is as follows:
$ mpiexec -np 2 myprogram
There are 1 processors running this job.
Rank# 0 d1= 1 d2= 128
Rank# 0 d1= 1 d2= 128
We can see that the behavior is different: the program executed in Windows is not running in parallel as expected. The terminal shows that only 1 processor is running the program, and the whole domain (1 to 128) is assigned to processor 0 twice. This is the problem I am trying to solve: I want the same behavior that I get in Ubuntu.
The mpiexec.exe program for Windows was obtained from the official MS-MPI installer.
The gfortran and OpenMPI libraries for Windows were obtained through Cygwin.
I tried changing the GNU linker and the compiler in Eclipse for Windows, and it does not work. I tried running the code on other machines with Windows 10 and the problem is the same.
Any suggestions on how to solve this issue?
As mentioned by @jcgiret, there is a consistency problem: the program is compiled with OpenMPI but executed with MS-MPI.
To solve this issue, the code was executed using the equivalent of mpiexec provided by the OpenMPI package:
usr/bin/mpiexec -> orterun.exe
The program is then executed in Windows with:
$ orterun.exe -n 2 myprogram.exe
The result is then the same as the one obtained in Ubuntu:
$ orterun.exe -n 2 myprogram.exe
There are 2 processors running this job.
Rank# 1 d1= 65 d2= 128
Rank# 0 d1= 1 d2= 64
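The Fortran source is not shown, so the following is only a sketch of the kind of block decomposition that would produce these numbers, assuming d1 and d2 are derived from the rank and the number of processes reported by MPI. The function name block_range is made up for illustration. It also suggests why the mismatched launcher printed 1 to 128 twice: if each process starts on its own and sees a size of 1 and a rank of 0, each one is handed the whole domain.

```c
#include <assert.h>

/* Splits the 1-based range 1..n into `size` contiguous blocks and stores
 * the bounds of block `rank` in *d1 and *d2. When n is not divisible by
 * size, the lowest ranks each take one extra element. */
void block_range(int n, int size, int rank, int *d1, int *d2)
{
    assert(size > 0 && rank >= 0 && rank < size);

    int base = n / size;    /* elements every rank gets */
    int extra = n % size;   /* leftover elements to spread out */
    int start = rank * base + (rank < extra ? rank : extra);
    int count = base + (rank < extra ? 1 : 0);

    *d1 = start + 1;
    *d2 = start + count;
}
```

For n = 128, a size of 2 gives 1 to 64 for rank 0 and 65 to 128 for rank 1, matching the Ubuntu and orterun.exe output, while a size of 1 gives 1 to 128 for rank 0, matching what each process printed under the MS-MPI launcher.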

cross compile ghc curses not found

I tried to cross-compile from Linux i386 to arm-linux-gnueabihf, but I can't get it to work, because 'make' fails with this error:
checking ncurses.h usability... yes
checking ncurses.h presence... yes
checking for ncurses.h... yes
checking for setupterm in -ltinfo... no
checking for setupterm in -lncursesw... no
checking for setupterm in -lncurses... no
checking for setupterm in -lcurses... no
configure: error: in '/home/edi/ghc_cross/ghc/libraries/terminfo':
configure: error: curses library not found, so this package cannot be built
See 'config.log' for more details
make[2]: *** [libraries/terminfo/dist-install/package-data.mk] Error 1
make[1]: *** [all_libraries/terminfo] Error 2
make[1]: Leave Directory '/home/edi/ghc_cross/ghc'
make: *** [all] Error 2
What I have done:
-) Compiled 7.8.0 from GitHub (the log says 'That should have been 7.8.0'; I chose 7.8 because I thought it would be more stable for cross-compiling) for my i386 (normal boot, configure, make, make install). This worked fine.
-) Installed the newest LLVM from svn (LLVM version 3.5svn).
-) Replaced the libffi-3.0.11.tar.gz in ghc/libffi-tarballs with libffi-3.0.13.
-) Added this version of mk/build.mk:
SRC_HC_OPTS = -H32m -O -fasm -Rghc-timing
GhcStage1HcOpts = -O -fasm
GhcStage2HcOpts = -O0 -DDEBUG -Wall
GhcLibHcOpts = -O -fasm -XGenerics
GhcLibWays = v dyn
SplitObjs = NO
Stage1Only = YES
-) Downloaded the source code of ncurses from ftp.de.debian.org/debian/pool/main/n/ncurses/ncurses_5.9.orig.tar.gz and built it with "./configure arm-linux-gnueabihf --with-gcc=arm-linux-gnueabihf-gcc --target=arm-linux-gnueabihf --prefix=/usr/arm-linux-gnueabihf" followed by "make". Afterwards I added the folder to my $PATH.
-) Ran "perl boot", "./configure --target=arm-linux-gnueabihf --with-gcc=arm-linux-gnueabihf-gcc --prefix=/usr/arm-linux-gnueabihf" and "make".
./configure worked, but make gives me the error above.
I also tried copying all the include files from ncurses to the libraries/terminfo folder, but that didn't work either. I think the mistake is somewhere in the build process, but I'm not exactly sure, which is why I'm posting this.
I also tried getting libncurses5-dev and libtinfo-dev from my Raspberry Pi with "apt-get download libncurses5-dev" and "apt-get download libtinfo-dev", copied them to my i386, extracted them and added them to my $PATH.
Does anyone have an idea how I can fix this problem with curses?
PS: I also made sure that I met the prerequisites mentioned in ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Linux and ghc.haskell.org/trac/ghc/wiki/Building/CrossCompiling
Edit: this is the relevant part of my config.log:
`
configure:3400: checking for setupterm in -lcurses
configure:3425: arm-linux-gnueabihf-gcc -o conftest -fno-stack-protector conftest.c -lcurses >&5
/usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld: skipping incompatible /usr/lib/../lib/libcurses.a when searching for -lcurses
/usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld: skipping incompatible /usr/lib/libcurses.a when searching for -lcurses
/usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld: cannot find -lcurses
collect2: error: ld returned 1 exit status
configure:3425: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "Haskell terminfo package"
| #define PACKAGE_TARNAME "terminfo"
| #define PACKAGE_VERSION "0.2"
| #define PACKAGE_STRING "Haskell terminfo package 0.2"
| #define PACKAGE_BUGREPORT "judah dot jacobson at gmail dot com"
| #define PACKAGE_URL ""
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| /* end confdefs.h. */
|
| /* Override any GCC internal prototype to avoid an error.
| Use char because int might match the return type of a GCC
| builtin and then its argument prototype would still apply. */
| #ifdef __cplusplus
| extern "C"
| #endif
| char setupterm ();
| int
| main ()
| {
| return setupterm ();
| ;
| return 0;
| }
configure:3434: result: no
configure:3450: error: in `/home/edi/ghc_cross/ghc/libraries/terminfo':
configure:3452: error: curses library not found, so this package cannot be built See "config.log" for more details
`
Greetings,
Edi
So, I've found the solution.
What I did is the following:
I compiled the ncurses library with the ARM gcc using the following command:
./configure --target=arm-linux-gnueabihf --with-gcc=arm-linux-gnueabihf-gcc --with-shared --host=arm-linux-gnueabihf --with-build-cpp=arm-linux-gnueabihf-g++
Afterwards I ran make and configured my ghc with the following:
./configure --target=arm-linux-gnueabihf --with-gcc=arm-linux-gnueabihf-gcc --prefix=/usr/arm-linux-gnueabihf --with-shared --with-sysroot=/path/to/cursescompiled/libs
Then it went through and didn't complain about the curses library anymore.
I hope this will be helpful for you.
Ran into a similar problem, and this question led me to the correct solution. Yes, I had to tweak the sysroot, but the author's method didn't work: the configure script just ignored the --with-sysroot flag. So what I did instead:
export SYSROOT=/usr/x86_64-w64-mingw32/sys-root/
export CFLAGS=--sysroot=$SYSROOT
export CPPFLAGS=--sysroot=$SYSROOT
export LDFLAGS=--sysroot=$SYSROOT
Then configure and make as usual, all in the same terminal session.

PPC64/Power7 emulation on x86_64

I have used CellSDK in the past to emulate, debug and test code for the powerpc7 architecture on x86_64 machines. Now I want to emulate and test code for the upcoming (Google/Intel etc.) PowerPC8, for which a compiler is available. Can someone tell me which QEMU can emulate ppc64 on x86_64, so I can test code (a scheduler) on it?
The minimum requirements to emulate a ppc kernel on QEMU are the QEMU binaries, a kernel image and a rootfs.
For ARM it looks like this:
qemu-system-arm -M versatilepb -m 128M -kernel zImage -hda rootfs.ext3 -no-reboot -show-cursor -usb -usbdevice wacom-tablet -no-reboot -serial stdio -m 256 --append "root=/dev/sda rw console=ttyAMA0,115200 console=tty mem=256M highres=off console=ttyS0"
For more details on ppc64, have a look at
http://gmplib.org/~tege/qemu.html
(items 10 and 11 there).
You might want to use the IBM SDK rather than CellIDE.
https://www-304.ibm.com/webapp/set2/sas/f/lopdiags/sdklop.html
Also, if you are doing open source development, you can try to get free access to real ppc64 machines. It works pretty well.
http://openpower.ic.unicamp.br/minicloud/