About using Bob Jenkins' perfect hash library - hash

When I use Bob Jenkins's perfect hash package, after building the "perfect" binary I cannot even get the example to work with "./perfect < sample_input"; it always stops with "fatal error: Cannot perfect hash: cannot build tab[]". Has anyone met this issue before? Are there any other stable perfect hash libraries or packages? Thanks in advance!
Jenkins' perfect hash library is linked here:
http://burtleburtle.net/bob/hash/perfect.html

I faced the same problem with gcc and clang under 64-bit Linux and found the reason:
the type definitions of the 4-byte types ub4 and sb4 in standard.h must be changed from
typedef unsigned long int ub4;
typedef signed long int sb4;
to
typedef unsigned int ub4;
typedef signed int sb4;
or, better, defined as aliases for the fixed-width types from stdint.h (uint32_t and int32_t), as sketched below.
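For reference, a minimal sketch of the stdint.h variant (assuming standard.h keeps the same type names and is otherwise unchanged). The original typedefs assume long is 4 bytes, which no longer holds under the LP64 model used by 64-bit Linux:

#include <stdint.h>

/* Exactly 32 bits on any platform, independent of the data model
   (ILP32, LP64, LLP64, ...). */
typedef uint32_t ub4;
typedef int32_t  sb4;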

Related

Why do Swift's malloc/MemoryLayout.size take/return signed integers?

public func malloc(_ __size: Int) -> UnsafeMutableRawPointer!

@frozen public enum MemoryLayout<T> {
    public static func size(ofValue value: T) -> Int
    ...
When in C, malloc/sizeof take/return size_t, which is unsigned?
Isn't Swift calling libc under the hood?
EDIT: is this the reason why? https://qr.ae/pvFOQ6
They are basically trying to get away from C's legacy?
Yes, it's calling the libc functions under the hood.
The StdlibRationales.rst document in the Swift repo explains why it imports size_t as Int:
Converging APIs to use Int as the default integer type allows users to write fewer explicit type conversions.
Importing size_t as a signed Int type would not be a problem for 64-bit platforms. The only concern is about 32-bit platforms, and only about operating on array-like data structures that span more than half of the address space. Even today, in 2015, there are enough 32-bit platforms that are still interesting, and x32 ABIs for 64-bit CPUs are also important. We agree that 32-bit platforms are important, but the usecase for an unsigned size_t on 32-bit platforms is pretty marginal, and for code that nevertheless needs to do that there is always the option of doing a bitcast to UInt or using C.

C: cast and dereference a pointer under strict aliasing

In http://blog.regehr.org/archives/1307, the author claims that the following snippet has undefined behavior:
unsigned long bogus_conversion(double d) {
    unsigned long *lp = (unsigned long *)&d;
    return *lp;
}
The argument is based on http://port70.net/~nsz/c/c11/n1570.html#6.5p7, which specifies the allowed access circumstances. However, footnote 88 attached to this bullet point says the list is intended only for checking aliasing, so I think this snippet is fine, assuming sizeof(long) == sizeof(double).
My question is whether the above snippet is allowed.
The snippet is erroneous, but not because of aliasing. First, there is a simple rule that says dereferencing a pointer to an object with a type different from its effective type is wrong. Here the effective type is double, so the access is an error.
This safeguard is in the standard because the bit representation of a double need not be a valid representation for unsigned long, although that would be quite exotic nowadays.
Second, from a more practical point of view, double and unsigned long may have different alignment requirements, and accessing one as the other may produce a bus error or simply incur a run-time penalty.
Generally, casting pointers like that is almost always wrong, has no defined behavior, is bad style and, in addition, is mostly useless anyhow. Focusing on aliasing when arguing about these problems is a bad habit that probably originates in incomprehensible and scary gcc warnings.
If you really want to know the bit representation of a value of some type, there are exceptions to the "effective type" rule. Two portable solutions are well defined by the C standard:
Use unsigned char* and inspect the bytes (a short sketch of this follows the union example below).
Use a union that comprises both types, store the value through one member and read it back through the other. By that you are telling the compiler that you want an object that can be seen as both types. But here you should not use unsigned long as the target type but uint64_t, since you have to be sure that the size is exactly what you think it is and that there are no trap representations.
To illustrate that, here is the same function as in the question but with defined behavior.
unsigned long valid_conversion(double d) {
    union {
        unsigned long ul;
        double d;
    } ub = { .d = d, };
    return ub.ul;
}
My compiler (gcc on Debian, nothing fancy) compiles this to exactly the same assembler as the code in the question, only now you know that the code is portable.
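For completeness, here is a sketch of the first option, inspecting the bytes. Copying the object representation with memcpy is equivalent to a byte-by-byte copy through unsigned char and is likewise well defined; uint64_t is used for the same reasons as above (known size, no trap representations):

#include <stdint.h>
#include <string.h>

uint64_t byte_copy_conversion(double d) {
    uint64_t u;
    /* memcpy copies the object representation byte by byte, so there is
       no aliasing violation and no alignment problem. */
    memcpy(&u, &d, sizeof u);
    return u;
}

Modern compilers typically optimize the memcpy away, so in practice this compiles to the same assembler as the union version.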

Char string encoding differences between native C++ and C++/CLI?

I have a strange problem for which I believe there is a solution but I cannot find it. Your help would be appreciated.
On the one hand, I have a native C++ class named Native which has a static wchar_t array containing accented characters. This array is const and defined at build time.
/// Header file
class Native
{
public:
    static const wchar_t* Array() { return mArray; }
private:
    static const wchar_t* mArray;
};
//--------------------------------------------------------------
/// .cpp file
const wchar_t* Native::mArray = {L"This is a description éàçï"};
On the other hand, I have a C++/CLI class that uses the array like this:
/// C++/CLI use
System::String^ S1 = gcnew System::String( Native::Array() );
System::String^ S2 = gcnew System::String( L"This is a description éàçï" );
The problem is that while S2 gives "This is a description éàçï" as expected, S1 gives "This is a description éà çï". I do not understand why passing a pointer to the static array does not give the same result as passing the literal directly.
I guess this is an encoding problem, but I would have expected the same result for both S1 and S2. Do you know how to solve it? The way I must use it in my program is like S1, i.e. by accessing the build-time static array through a static method that returns a const wchar_t*.
Thanks for your help!
EDIT 1
What is the best way to define literals at build time in C++ (using Intel C++ 13.0) so that they are directly usable in the C++/CLI System::String constructor? This could be the ultimate question for my problem.
I don't have enough reputation to add a comment to ask this question, so I apologize for posting this as an answer if that seems inappropriate.
Could the problem be that your compiler defines wchar_t to be 8 bits? I'm basing that possibility on this answer:
Should I use wchar_t when using UTF-8?
To answer your question (in the comments) about building a UTF-16 array at build time, I believe you can force it to be UTF-16 by using u"..." for your literal instead of L"..." (see http://en.cppreference.com/w/cpp/language/string_literal)
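As a small illustration of that suggestion, here is the literal form written as plain C11 rather than C++/CLI (the identifier is made up):

#include <uchar.h>

/* u"..." yields char16_t code units, i.e. UTF-16, regardless of how wide
   wchar_t happens to be on the platform. */
static const char16_t description[] = u"This is a description éàçï";

In C++/CLI you would still have to get from char16_t to what System::String expects, so treat this only as a sketch of the literal syntax, not a drop-in replacement.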
Edit 1:
For what it's worth, I tried your code (after fixing a couple compile errors) using Microsoft Visual Studio 10 and didn't have the same problem (both strings printed as expected).
I don't know if it will help you, but another possible way to statically initialize this wchar_t array is to wrap your literal in a std::wstring and then set your array to the C-string pointer returned by wstring::c_str(), as follows:
std::wstring ws(L"This is a description éàçï");
const wchar_t* Native::mArray = ws.c_str();
This edit was inspired by Dynamic wchar_t array (C++ beginner)

Best practice for typedef of uint32

On a system where both long and int are 4 bytes, which is best and why?
typedef unsigned long u32;
or
typedef unsigned int u32;
note: uint32_t is not an option
Nowadays every platform has stdint.h, or its C++ equivalent cstdint, which defines uint32_t. Please use the standard type rather than creating your own.
http://pubs.opengroup.org/onlinepubs/7999959899/basedefs/stdint.h.html
http://www.cplusplus.com/reference/cstdint/
http://msdn.microsoft.com/en-us/library/hh874765.aspx
The size will be the same for both, so it depends only on your use.
If you need to store decimal values, use long.
A better and complete answer here:
https://stackoverflow.com/questions/271076/what-is-the-difference-between-an-int-and-a-long-in-c/271132
Edit: I'm not sure about decimal with long, if someone can confirm, thanks.
Since you said the standard uint32_t is not an option, and both long and int are correct on 32-bit machines, I'd say
typedef unsigned int u32;
is a little better, because on the two popular 64-bit data models (LLP64 and LP64), int is still 32-bit, while long can be either 32-bit or 64-bit. See 64-bit data models.
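Whichever base type ends up behind the typedef, a compile-time check makes the 4-byte assumption explicit. A minimal sketch, using C11 _Static_assert (older compilers would need an equivalent trick):

/* Fails to compile on any platform where the chosen base type is not
   exactly 4 bytes. */
typedef unsigned int u32;
_Static_assert(sizeof(u32) == 4, "u32 must be exactly 4 bytes");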

How to create PBKDF2 key on iOS device

I need to create a PBKDF2 key to use in my AES encryption routine in my iPhone Xcode application. I have seen references to using OpenSSL to do this, but have not found specific references to which module within OpenSSL to call.
I have scanned various OpenSSL .h files searching for a means to make this call, but have so far been unsuccessful.
The password I will be using is 5 digits, the salt is 12 characters, the iteration count is 1000, and I need a 128-bit derived key.
You can use the PKCS5_PBKDF2_HMAC_SHA1() function in openssl/evp.h. Divining how to use the function is pretty easy from the declaration:
int PKCS5_PBKDF2_HMAC_SHA1(const char *pass, int passlen,
                           const unsigned char *salt, int saltlen, int iter,
                           int keylen, unsigned char *out);
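A minimal usage sketch under the parameters from the question (the password and salt values below are made up; iteration count 1000, 16-byte output for a 128-bit key). It assumes OpenSSL is built for and linked into the iOS target:

#include <openssl/evp.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    const char *pass = "12345";                  /* hypothetical 5-digit password */
    const unsigned char salt[] = "abcdefghijkl"; /* hypothetical 12-character salt */
    unsigned char key[16];                       /* 128-bit derived key */

    /* PKCS5_PBKDF2_HMAC_SHA1 returns 1 on success, 0 on failure. */
    if (PKCS5_PBKDF2_HMAC_SHA1(pass, (int)strlen(pass),
                               salt, (int)(sizeof salt - 1), 1000,
                               (int)sizeof key, key) != 1) {
        fprintf(stderr, "PBKDF2 key derivation failed\n");
        return 1;
    }

    for (size_t i = 0; i < sizeof key; i++)
        printf("%02x", key[i]);
    printf("\n");
    return 0;
}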
I think p5_crpt2.c is what you are looking for.