Perl XS and C++ passing pointer to buffer

I know almost no C++ so that's not helping, and my XS isn't much better. I'm creating an XS interface for a C++ library and I have almost all my methods working except one.
The method in Perl should look like this:
$return_data = $obj->readPath( $path );
The method is declared like this in the .h file:
int readPath(const char* path, char* &buffer, bool flag=true);
The "buffer" will get allocated if it's passed in NULL.
There are two additional overloads of readPath with different signatures, but they are not the ones I want. (Interestingly, when I try to compile, the compiler tells me the "candidates" are the two I don't want.) Is that because it doesn't understand the "char * &"?
Can someone help with the xsub I need to write?
I'm on Perl 5.14.2.
BTW -- I've also mapped "long long int" to T_IV in my typemap. I cannot find any documentation on how to typemap long long correctly. Any suggestions on how I should do it?
Thanks,

I've never dealt with C++ from C or XS. If it was C, it would be:
void
readPath(SV* sv_path)
PPCODE:
{
    /* SvPVbyte_nolen takes only the SV; it has no length argument. */
    char* path = SvPVbyte_nolen(sv_path);
    char* buffer = NULL;
    if (!readPath(path, &buffer, 0))
        XSRETURN_UNDEF;
    ST(0) = sv_2mortal(newSVpv(buffer, 0));
    free(buffer);
    XSRETURN(1);
}
Hopefully, that works or you can adjust it to work.
I assumed:
readPath returns true/false for success/failure.
buffer isn't allocated on failure.
The deallocator for buffer is free.
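On the C++ side, the detail that matters is the char* & parameter: it is a reference to a pointer, so the caller passes the pointer variable itself rather than its address. That is probably also why the compiler only lists the other overloads as candidates when &buffer (a char**) is passed. Here is a minimal sketch under the same assumptions as above. The readPath below is a hypothetical free-function stand-in that just echoes the path, not the real library method, and callReadPath is an illustrative name:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in with the same parameter shape as the library method.
// Assumption: returns non-zero on success and malloc()s the buffer when it
// is passed in as NULL. The real method may behave differently.
int readPath(const char* path, char*& buffer, bool flag = true) {
    (void)flag;                        // unused in this sketch
    if (path == nullptr) return 0;     // treat a missing path as failure
    if (buffer == nullptr) {
        buffer = static_cast<char*>(std::malloc(std::strlen(path) + 1));
        if (buffer == nullptr) return 0;
        std::strcpy(buffer, path);     // stand-in for the data that was "read"
        return 1;
    }
    return 0;                          // caller-supplied buffers: not modelled
}

// What the body of the xsub boils down to on the C++ side.
std::string callReadPath(const char* path) {
    char* buffer = nullptr;
    // Pass `buffer`, not `&buffer`: a `char**` cannot bind to `char*&`.
    if (!readPath(path, buffer)) return std::string();
    assert(buffer != nullptr);         // assumed: success implies allocation
    std::string result(buffer);        // copy out before releasing
    std::free(buffer);                 // assumed deallocator
    return result;
}
```

In the xsub itself the call would presumably have the same shape, made on the object (THIS->readPath(path, buffer) in a C++ XS method), with the SV built from buffer before it is freed.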

The second part of the question is the typemap for long long (or long long int).
Normally long is at least 32 bits and long long is at least 64. The default typemap for long is T_IV. The closest Perl equivalent for long long is T_IV, too (provided your perl was built with 64-bit IVs; otherwise large values are truncated).
But sometimes you want to reduce cast warnings, so you can use T_LONG for long. T_LONG is equivalent to T_IV but explicitly casts the return value to type long. T_LONG is described in $PERLLIB/ExtUtils/typemap.
With this knowledge you can write your own typemap for long long int. (The inline TYPEMAP: heredoc form needs Perl 5.16 or later; on an older perl such as 5.14, put the same entries in a separate typemap file.)
TYPEMAP: <<TYPEMAPS
long long int	T_LONGLONG

INPUT
T_LONGLONG
    $var = (long long int)SvIV($arg)

OUTPUT
T_LONGLONG
    sv_setiv($arg, (IV)$var);
TYPEMAPS

Related

c++ libtomcrypt library outputting shorter hashes/truncated hashes

I am trying to generate hashes for a blockchain project. While looking for a crypto library I stumbled across tomcrypt and chose it because it was easy to install. Now I have a problem: when I create hashes (I'm using SHA3_512, but the bug shows up with every other SHA algorithm as well), the output is sometimes the correct hash, but truncated.
[Image: hash truncation example]
This is the code for the hashing function:
string hashSHA3_512(const std::string& input) {
    //Initial
    unsigned char* hashResult = new unsigned char[sha3_512_desc.hashsize];
    //Initialize a state variable for the hash
    hash_state md;
    sha3_512_init(&md);
    //Process the text - remember you can call process() multiple times
    sha3_process(&md, (const unsigned char*) input.c_str(), input.size());
    //Finish the hash calculation
    sha3_done(&md, hashResult);
    // Convert to string
    string stringifiedHash(reinterpret_cast<char*>(hashResult));
    // Return the result
    return stringToHex(stringifiedHash);
}
And here is the code for the toHex function, although I already checked and the truncation shows up before this function is called:
string stringToHex(const std::string& input)
{
    static const char hex_digits[] = "0123456789abcdef";
    std::string output;
    output.reserve(input.length() * 2);
    for (unsigned char c : input)
    {
        output.push_back(hex_digits[c >> 4]);
        output.push_back(hex_digits[c & 15]);
    }
    return output;
}
If someone knows this library, or this kind of problem in general and possible fixes, please explain. I've been stuck for 3 days.
UPDATE
I figured out that the program truncates the hash when it encounters two consecutive zeros in hex, i.e. 8 zero bits (a single zero byte), but I still don't understand why. If you do, please let me and hopefully other people with the same problem know.
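For what it's worth, the UPDATE is consistent with how std::string's constructor from a bare char* behaves: it treats the bytes as a C string and stops at the first zero byte, so a digest that contains 0x00 is cut short there. (If the digest contains no zero byte at all, that constructor reads past the end of the buffer.) A minimal sketch of the difference, using made-up bytes instead of a real digest; kDigest, bytesToHex, copyAsCString and copyWithLength are illustrative names, not part of libtomcrypt:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for a digest that happens to contain a zero byte.
// (Made-up bytes, not real SHA3 output.)
static const unsigned char kDigest[4] = {0xAB, 0x00, 0xCD, 0xEF};

// Same idea as stringToHex above, but taking raw bytes and a length.
std::string bytesToHex(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    static const char hex_digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(len * 2);
    for (std::size_t i = 0; i < len; ++i) {
        out.push_back(hex_digits[data[i] >> 4]);
        out.push_back(hex_digits[data[i] & 15]);
    }
    return out;
}

// std::string(const char*) stops at the first zero byte.
std::string copyAsCString() {
    return std::string(reinterpret_cast<const char*>(kDigest));
}

// std::string(const char*, size_t) copies exactly that many bytes.
std::string copyWithLength() {
    return std::string(reinterpret_cast<const char*>(kDigest), sizeof kDigest);
}
```

In hashSHA3_512 the corresponding change would presumably be to construct the string with an explicit length, (reinterpret_cast<char*>(hashResult), sha3_512_desc.hashsize), and to delete[] hashResult once the copy has been made.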

D language unsigned hash of string

I am a complete beginner with the D language.
How do I get some hash of a string as a uint (an unsigned 32-bit integer) in D?
I need a quick and dirty hash code (I don't care much about the "randomness" or the "lack of collision", I care slightly more about performance).
import std.digest.crc;
uint string_hash(string s) {
    return crc32Of(s);
}
is not good...
(using gdc-5 on Linux/x86-64 with phobos-2)
While Adam's answer does exactly what you're looking for, you can also use a union to do the cast.
This is a pretty useful trick so may as well put it here:
/**
 * Returns a crc32Of hash of a string
 * Uses a union to store the ubyte[]
 * And then simply reads that memory as a uint
 */
uint string_hash(string s){
    import std.digest.crc;
    union hashUnion{
        ubyte[4] hashArray;
        uint hashNumber;
    }
    hashUnion x;
    x.hashArray = crc32Of(s); // stores the result of crc32Of into the array.
    return x.hashNumber;      // reads the exact same memory as the hashArray
                              // but reads it as a uint.
}
A really quick thing could just be this:
uint string_hash(string s) {
    import std.digest.crc;
    auto r = crc32Of(s);
    return *(cast(uint*) r.ptr);
}
Since crc32Of returns a ubyte[4] instead of the uint you want, a conversion is necessary, but since ubyte[4] and uint are the same thing to the machine, we can just do a reinterpret cast with the pointer trick seen there to convert types for free at runtime.
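One thing to keep in mind with both the union and the pointer cast: the four bytes are read in the machine's native byte order, so the resulting number differs between little-endian and big-endian hosts. For a quick and dirty hash that rarely matters, but it does if the value is stored or sent elsewhere. A small sketch of the two options, written in C++ rather than D; bytesToNative and bytesToLittleEndian are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret four bytes as a 32-bit integer in native byte order,
// which is what the union and the pointer cast both do.
std::uint32_t bytesToNative(const unsigned char* b) {
    assert(b != nullptr);
    std::uint32_t v;
    std::memcpy(&v, b, sizeof v);
    return v;
}

// Build the integer explicitly, least significant byte first, so the
// result is the same on every machine.
std::uint32_t bytesToLittleEndian(const unsigned char* b) {
    assert(b != nullptr);
    return  static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}
```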

Perl SV value from pointer without copy

How can I create an SV from a NUL-terminated string without copying it? Something like newSVpv(const char*, STRLEN), but without the copy and with ownership moving to Perl (so Perl must release the string's memory). I need this to avoid a huge memory allocation and copy.
I found the following example:
SV *r = sv_newmortal();
SvPOK_on(r);
sv_usepvn_mg(r, string, strlen(string) + 1);
But I don't have deep knowledge of XS internals and have some doubts.
If you want Perl to manage the memory block, it needs to know how to reallocate it and deallocate it. The only memory it knows how to reallocate and deallocate is memory allocated using its allocator, Newx. (Otherwise, it would have to associate a reallocator and deallocator with each memory block.)
If you can't allocate the memory block using Newx, then your best option might be to create a read-only SV with SvLEN set to zero. That tells Perl that it doesn't own the memory. That SV could be blessed into a class that has a destructor that will deallocate the memory using the appropriate deallocator.
If you can allocate the memory block using Newx, then you can use the following:
SV* newSVpvn_steal_flags(pTHX_ const char* ptr, STRLEN len, const U32 flags) {
#define newSVpvn_steal_flags(a,b,c) newSVpvn_steal_flags(aTHX_ a,b,c)
    SV* sv;
    assert(!(flags & ~(SVf_UTF8|SVs_TEMP|SV_HAS_TRAILING_NUL)));
    sv = newSV(0);
    sv_usepvn_flags(sv, ptr, len, flags & SV_HAS_TRAILING_NUL);
    if ((flags & SVf_UTF8) && SvOK(sv)) {
        SvUTF8_on(sv);
    }
    SvTAINT(sv);
    if (flags & SVs_TEMP) {
        sv_2mortal(sv);
    }
    return sv;
}
Note: ptr should point to memory that was allocated by Newx, and it must point to the start of the block returned by Newx.
Note: Accepts flags SVf_UTF8 (to specify that ptr is the UTF-8 encoding of the string to be seen in Perl), SVs_TEMP (to have sv_2mortal called on the SV) and SV_HAS_TRAILING_NUL (see below).
Note: Some code expects the string buffer of scalars to have a trailing NUL (even though the length of the buffer is known and even though the buffer can contain NULs). If the memory block you allocated has a trailing NUL beyond the end of the data (e.g. a C-style NUL-terminated string), then pass the SV_HAS_TRAILING_NUL flag. If not, the function will attempt to extend the buffer and add a NUL.

XS typemap for intptr_t

I'm trying to return an intptr_t type from some XS code:
intptr_t
my_func( self )
        myObjPtr self
    CODE:
        RETVAL = (intptr_t) self;
    OUTPUT:
        RETVAL
My typemap doesn't have anything about intptr_t, so of course dmake fails with Could not find a typemap for C type 'intptr_t'. I'm not sure if Perl even works with integers as big as intptr_t can be. If there's no good way to return this to Perl as a number, I'll just stringify it.
IV, Perl's signed integer format, is guaranteed to be large enough to hold a pointer. intptr_t is C's version of what Perl has had for a long time. (In fact, a ref is just a pointer stored in a scalar's IV slot with a flag indicating it's a reference.)
But you don't want to cast directly to an IV as that can result in a spurious warning. As Sinan Ünür points out, use PTR2IV instead.
IV
my_func()
        myObjPtr self
    CODE:
        self = ...;
        RETVAL = PTR2IV(self);
    OUTPUT:
        RETVAL
INT2PTR(myObjPtr, iv) does the inverse operation.
This thread suggests:
The existing mechanism which does work everywhere are the macros INT2PTR and PTR2IV (in perl.h)
From perldoc perlguts:
Because pointer size does not necessarily equal integer size, use the following macros to do it right.
PTR2UV(pointer)
PTR2IV(pointer)
PTR2NV(pointer)
INT2PTR(pointertotype, integer)
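The round trip those macros perform can be illustrated outside of Perl. The sketch below is plain C++ without the perl headers, so std::intptr_t and reinterpret_cast stand in for IV and the PTR2IV / INT2PTR macros; MyObj, objectToHandle and handleToObject are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

struct MyObj { int value; };   // hypothetical object type

// Roughly what PTR2IV does: pointer -> pointer-sized signed integer.
std::intptr_t objectToHandle(MyObj* obj) {
    return reinterpret_cast<std::intptr_t>(obj);
}

// Roughly what INT2PTR(MyObj*, handle) does: integer -> typed pointer.
MyObj* handleToObject(std::intptr_t handle) {
    assert(handle != 0);       // a zero handle would be a null pointer
    return reinterpret_cast<MyObj*>(handle);
}
```

The integer is only meaningful while the object it came from is still alive; converting it back after the object has been freed gives a dangling pointer.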

Using memcpy/memset

When using memset or memcpy within an Obj-C program, will the compiler optimise the setting (memset) or copying (memcpy) of data into 32-bit writes or will it do it byte by byte?
You can see the libc implementations of these functions in the Darwin source. In 10.6.3, memset works at the word level. I didn't check memcpy, but probably it's the same.
You are correct that it's possible for the compiler to do the work inline instead of calling these functions. I suppose I'll let someone who knows better answer what it will do, though I would not expect a problem.
Memset will come as part of your standard C library so it depends on the implementation you are using. I would guess most implementations will copy in blocks of the native CPU size (32/64 bits) and then the remainder byte-by-byte.
Here is glibc's version of memcpy for an example implementation:
void *
memcpy (dstpp, srcpp, len)
     void *dstpp;
     const void *srcpp;
     size_t len;
{
  unsigned long int dstp = (long int) dstpp;
  unsigned long int srcp = (long int) srcpp;
  /* Copy from the beginning to the end. */
  /* If there not too few bytes to copy, use word copy. */
  if (len >= OP_T_THRES)
    {
      /* Copy just a few bytes to make DSTP aligned. */
      len -= (-dstp) % OPSIZ;
      BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ);
      /* Copy whole pages from SRCP to DSTP by virtual address manipulation,
         as much as possible. */
      PAGE_COPY_FWD_MAYBE (dstp, srcp, len, len);
      /* Copy from SRCP to DSTP taking advantage of the known alignment of
         DSTP. Number of bytes remaining is put in the third argument,
         i.e. in LEN. This number may vary from machine to machine. */
      WORD_COPY_FWD (dstp, srcp, len, len);
      /* Fall out and copy the tail. */
    }
  /* There are just a few bytes to copy. Use byte memory operations. */
  BYTE_COPY_FWD (dstp, srcp, len);
  return dstpp;
}
So you can see it copies a few bytes first to get aligned, then copies in words, then finally in bytes again. It does some optimized page copying using some kernel operations.
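The three phases described above (bytes until aligned, then whole words, then the leftover bytes) can be sketched as follows. This is a simplified illustration, not glibc's code: sketchCopy is a hypothetical name, a word is assumed to be sizeof(std::uintptr_t) bytes, the size threshold and the page-copy step are left out, and each word is moved with a fixed-size std::memcpy (which compilers typically turn into a single load and store) to stay within the aliasing rules.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

void* sketchCopy(void* dst, const void* src, std::size_t len) {
    assert(len == 0 || (dst != nullptr && src != nullptr));
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    const std::size_t word = sizeof(std::uintptr_t);

    // Phase 1: copy single bytes until the destination is word-aligned.
    while (len > 0 && reinterpret_cast<std::uintptr_t>(d) % word != 0) {
        *d++ = *s++;
        --len;
    }

    // Phase 2: copy one word at a time while a full word remains.
    while (len >= word) {
        std::uintptr_t w;
        std::memcpy(&w, s, word);   // load one word (source may be unaligned)
        std::memcpy(d, &w, word);   // store it to the aligned destination
        d += word;
        s += word;
        len -= word;
    }

    // Phase 3: copy the tail byte by byte.
    while (len > 0) {
        *d++ = *s++;
        --len;
    }
    return dst;
}
```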