Ok, this compiles fine in GCC under Linux.
char * _v3_get_msg_string(void *offset, uint16_t *len) {/*{{{*/
    char *s;
    memcpy(len, offset, 2);
    *len = ntohs(*len);
    s = malloc(*len+1);
    memset(s, 0, *len+1);
    memcpy(s, offset+2, *len);
    s[*len] = '\0';
    *len+=2;
    return s;
}/*}}}*/
However, I'm having a problem porting it to Windows, due to the line...
memcpy(s, offset+2, *len);
Because offset is a void pointer, VC++ doesn't want to do arithmetic on it. The usual caveat that C++ doesn't allow arithmetic on void pointers SHOULD be moot, as the whole project is being built under extern "C".
Now, this is only 1 function in many, and finding the answer to this will allow them all to be fixed. I would really prefer not having to rewrite the library project from the ground up, and I don't want to build under MinGW. There has to be a way to do this that I'm missing, and not finding in Google.
Well, you cannot do pointer arithmetic with void* in standard C; the only reason this compiles under GCC is a GCC extension that treats sizeof(void) as 1. Try memcpy(s, ((char*)offset)+2, *len);
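For reference, here is a sketch of how the whole function could look with that cast applied. The logic is the same as the original; the unsigned char pointer, the header list and the dropped (redundant) memset are my additions, so treat it as an illustration rather than the library's actual code:

#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <arpa/inet.h>   /* for ntohs(); on Windows include <winsock2.h> instead */

char *_v3_get_msg_string(void *offset, uint16_t *len) {
    const unsigned char *p = (const unsigned char *)offset; /* cast once, then do byte arithmetic on p */
    char *s;
    memcpy(len, p, 2);          /* read the 16-bit length prefix */
    *len = ntohs(*len);
    s = malloc(*len + 1);       /* a NULL check would be prudent here */
    memcpy(s, p + 2, *len);     /* p + 2 is well-defined: p points to unsigned char */
    s[*len] = '\0';             /* terminate explicitly (replaces the original memset) */
    *len += 2;                  /* report total bytes consumed, as the original does */
    return s;
}

The same one-line change (casting to char* or unsigned char* before offsetting) applies to every other function in the library that does arithmetic on a void pointer.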
Related
In http://blog.regehr.org/archives/1307, the author claims that the following snippet has undefined behavior:
unsigned long bogus_conversion(double d) {
    unsigned long *lp = (unsigned long *)&d;
    return *lp;
}
The argument is based on http://port70.net/~nsz/c/c11/n1570.html#6.5p7, which specifies the allowed access circumstances. However, footnote 88 on that bullet point says the list is only for the purpose of checking aliasing, so I think this snippet is fine, assuming sizeof(long) == sizeof(double).
My question is whether the above snippet is allowed.
The snippet is erroneous, but not because of aliasing. First, there is a simple rule that says dereferencing a pointer to an object with a different type than its effective type is wrong. Here the effective type is double, so there is an error.
This safeguard is in the standard because the bit representation of a double need not be a valid representation for an unsigned long, although such a platform would be quite exotic nowadays.
Second, from a more practical point of view, double and unsigned long may have different alignment properties, and accessing one through a pointer to the other may produce a bus error or just incur a run-time penalty.
Generally, casting pointers like that is almost always wrong, has no defined behavior, is bad style, and in addition is mostly useless anyhow. Focusing on aliasing when arguing about these problems is a bad habit that probably originates in incomprehensible and scary gcc warnings.
If you really want to know the bit representation of some type, there are some exceptions to the "effective type" rule. There are two portable solutions that are well defined by the C standard:
Use unsigned char* and inspect the bytes (a short sketch of this follows right after this list).
Use a union that comprises both types, store the value in there and read it with the other type. By that you are telling the compiler that you want an object that can be seen as both types. But here you should not use unsigned long as a target type but uint64_t, since you have to be sure that the size is exactly what you think it is, and that there are no trap representations.
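Here is a minimal sketch of the first option (my own illustration, not from the linked post; the name bytewise_conversion is made up). memcpy reads the object representation as if through unsigned char, which the 6.5p7 list explicitly permits:

#include <stdint.h>
#include <string.h>

uint64_t bytewise_conversion(double d) {
    uint64_t u;
    _Static_assert(sizeof u == sizeof d, "assumes 64-bit double and uint64_t");
    /* Copying the bytes of the object representation is character-type
       access, so it is exempt from the effective-type rule. */
    memcpy(&u, &d, sizeof u);
    return u;
}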
To illustrate the second, union-based option, here is the same function as in the question but with defined behavior.
unsigned long valid_conversion(double d) {
    union {
        unsigned long ul;
        double d;
    } ub = { .d = d, };
    return ub.ul;
}
My compiler (gcc on Debian, nothing fancy) compiles this to exactly the same assembler as the code in the question. The difference is that you know this code is portable.
TL;DR
I'm looking for a way to extract a part of an existing CUDA Toolkit example and turn it into a CUDAKernel executable in MATLAB.
The Story
In an attempt to obtain a short-runtime implementation of the non-local means (NLM) 2D filter, I stumbled upon the imageDenoising example provided with the CUDA Toolkit which implements two variants of this filter, called NLM & NLM2 (or "quick NLM").
Having no previous experience with CUDA coding, I initially attempted to follow MATLAB's documentation on the subject, which resulted in several strange errors including: ptx compilation, multiple entry points and wrong number of inputs in the C prototype. At this point I realized that this isn't going to be a "just works" case and that some tinkering is required.
So I decided to eliminate the multiple entry point issue by simply deleting parts of imageDenoising.cu file and consolidating the relevant .cuh (either ..._nlm_kernel.cuh or ..._nlm2_kernel.cuh) into the .cu so as to obtain a single entry point at any given time.
To my surprise this actually managed to compile and I was finally able to create a CUDAKernel without an error (using the command k = parallel.gpu.CUDAKernel('imageDenoising.ptx', 'uint8_T *, int, int, float, float');).
This however was not enough, because I mistakenly concluded that the first argument is the unprocessed image in the form of an RGB matrix (i.e. X*Y*3 uint8), and so the result I was getting back was exactly the input but with 0 in the first 4 elements.
After searching a bit more I realized that there are additional, and critical, aspects of such a conversion process that I'm entirely unaware of (like the need to initialize __device__ variables), at which stage I decided to ask for help.
The Problem
I'm currently wondering how to continue from here efficiently. I'd love to hear whether this kind of approach can generally bear fruit (and whether a complete example of the process is available somewhere), which other pitfalls I should look out for, and what alternative courses of action I can take (considering my very limited knowledge of CUDA and the fact that I won't hire anybody else to do this for me). But I keep in mind that this is SO, so I need a specific programming problem, and here it is:
How do I modify imageDenoising.cu such that the MATLAB CUDAKernel constructed from it will also accept the unprocessed image as an input?
Note: in my application, the input matrix is a 2d, grayscale, double matrix.
Related: How CudaMalloc work?
P.S.
A working piece of code would obviously be welcomed, but I'd really rather "learn to fish".
I ended up taking an alternative approach to CUDAKernel, using .MEX, by doing the following:
Setting up the external libraries OpenCV v2.4.10 (not v3!) and mexopencv.
Writing a small wrapper function for OpenCV's fastNlMeansDenoising using the guidelines of mexopencv for unimplemented functions, as seen below (excluding the documentation):
#include "mexopencv.hpp"
using namespace cv;
void mexFunction(int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
// Check arguments
if (nlhs != 1 || nrhs<1 || ((nrhs % 2) != 1) )
mexErrMsgIdAndTxt("fastNLM:invalidArgs", "Wrong number of arguments");
// Argument vector
vector<MxArray> rhs(prhs, prhs + nrhs);
// Option processing
// Defaults:
double h = 3;
int templateWindowSize = 7;
int searchWindowSize = 21;
// Parsing input name-value pairs:
for (int i = 1; i<nrhs; i += 2) {
string key = rhs[i].toString();
if (key == "h")
h = rhs[i + 1].toDouble();
else if (key == "templateWindowSize")
templateWindowSize = rhs[i + 1].toInt();
else if (key == "searchWindowSize")
searchWindowSize = rhs[i + 1].toInt();
else
mexErrMsgIdAndTxt("mexopencv:error", "Unrecognized option");
}
// Process
Mat src(rhs[0].toMat()), dst;
fastNlMeansDenoising(src, dst, h, templateWindowSize, searchWindowSize);
// Convert cv::Mat back to mxArray*
plhs[0] = MxArray(dst);
}
Compiling it... and voilà - a working CUDA-accelerated NLM filter.
The answer to my question itself can be found by comparing opencv\sources\modules\photo\src\cuda\nlm.cu (this is the opencv2 path) with imageDenoising_nlm2_kernel.cuh.
This solution worked well for me because getting an NLM filter running mattered more to me than specifically using CUDAKernel.
The main lesson I learned from this (and I'd like to pass on to others) is:
Running CUDA code in MATLAB can also be done in ways other than CUDAKernel, such as using .mex wrappers as shown above.
Yesterday I updated Xcode to the newest version (5.1 (5B130a)) to be compatible with iOS 7.1. Now when I build my project, I get the error "Cast from pointer to smaller type 'int' loses information" in the EAGLView.mm file (line 408) when a 64-bit simulator (e.g. iPhone Retina 4-inch 64-bit) is selected.
I'm using cocos2d-x-2.2.2. Before I updated Xcode, my project could still build and run normally on all devices.
Thanks for any recommendations.
Update: Today I downloaded the latest version of cocos2d-x (2.2.3), but the problem still happens.
Here is the piece of code where the error occurs:
/cocos2d-x-2.2.2/cocos2dx/platform/ios/EAGLView.mm:408:18: Cast from pointer to smaller type 'int' loses information
// Pass the touches to the superview
#pragma mark EAGLView - Touch Delegate
- (void)touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event
{
    if (isKeyboardShown_)
    {
        [self handleTouchesAfterKeyboardShow];
        return;
    }
    int ids[IOS_MAX_TOUCHES_COUNT] = {0};
    float xs[IOS_MAX_TOUCHES_COUNT] = {0.0f};
    float ys[IOS_MAX_TOUCHES_COUNT] = {0.0f};
    int i = 0;
    for (UITouch *touch in touches) {
        ids[i] = (int)touch; // error occur here
        xs[i] = [touch locationInView: [touch view]].x * view.contentScaleFactor;
        ys[i] = [touch locationInView: [touch view]].y * view.contentScaleFactor;
        ++i;
    }
    cocos2d::CCEGLView::sharedOpenGLView()->handleTouchesBegin(i, ids, xs, ys);
}
Apparently the clang version in Xcode 5.1 and above is more strict about potential 32bit vs. 64 bit incompatibilities in source code than older clang versions have been.
To be honest, I think clang is too restrictive here. A sane compiler may throw a warning on lines like this, but by no means should it throw an error, because this code is NOT wrong; it is just potentially error-prone, but can be perfectly valid.
The original code is
ids[i] = (int)touch;
with ids being an array of ints and touch being a pointer.
In a 64bit build a pointer is 64bit (contrary to a 32bit build, where it is 32bit), while an int is 32bit, so this assignment stores a 64bit value in a 32bit storage, which may result in a loss of information.
Therefore it is perfectly valid for the compiler to throw an error for a line like
ids[i] = touch;
However the actual code in question contains an explicit c-style cast to int. This explicit cast clearly tells the compiler "Shut up, I know that this code does not look correct, but I do know what I am doing".
So the compiler is very picky here, and the correct solution to make the code compile again while keeping exactly the same behavior as in Xcode 5.0 is to first cast to an integer type whose size matches that of a pointer, and then do a second cast to the int that we actually want:
ids[i] = (int)(size_t)touch;
I am using size_t here because it always has the same size as a pointer, no matter the platform. A long long would not match the pointer size on 32-bit systems, and a long would not work on 64-bit Windows (64-bit Unix and Unix-like systems such as OS X use the LP64 data model, in which a long is 64-bit, while 64-bit Windows uses the LLP64 data model, in which a long is 32-bit: http://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models).
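As a small illustrative sketch (my own addition, not part of the original answer): on the ILP32, LP64 and LLP64 models discussed above, the size_t/pointer relationship can even be checked at compile time, whereas no such check would hold for long across those models.

#include <stddef.h>

/* Holds on the common ILP32, LP64 and LLP64 data models. */
_Static_assert(sizeof(size_t) == sizeof(void *),
               "size_t is expected to be pointer-sized here");
/* There is no equivalent guarantee for long: it is 64-bit under LP64
   (OS X, Linux) but only 32-bit under LLP64 (64-bit Windows). */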
I met this problem too.
ids[i] = (int)touch; // error occur here => I changed this to the line below:
ids[i] = (uintptr_t)touch;
Then I could continue compiling. Maybe you can try this too.
Xcode 5.1 changed the standard architectures to include 64-bit.
You can restrict compilation to 32-bit by doing all of the below in Build Settings:
use $(ARCHS_STANDARD_32_BIT) for Architectures instead of $(ARCHS_STANDARD)
remove arm64 from Valid Architectures
Hope it helps.
You can fix this error by replacing that line of code with:
ids[i] = (uint64_t)touch;
You should perform the conversion through a 64-bit type on a 64-bit build, because a plain int is too small to hold a 64-bit pointer value.
Surely the solution is to change the type of ids from int to a type that is sufficiently large to hold a pointer.
I'm unfamiliar with Xcode, but the solution should be something like the following:
Change the declaration of ids to:
intptr_t ids[IOS_MAX_TOUCHES_COUNT];
and the line producing the error to:
ids[i] = (intptr_t)touch;
Most of the "solutions" above can lose part of the pointer address when casting to a smaller type. If the value is ever used as pointer again that will prove to be an extremely bad idea.
In ids[i] = (int)touch; put a * before the cast and check it:
ids[i] = *(int *)touch;
In Code Composer, you can define new symbols in the linker command file simply:
_Addr_start = 0x5C00;
_AppLength = 0x4C000;
before the memory map and section assignment. This is done in the bootloader example from TI.
You can then refer to the addresses (as integers) in your C code like this:
extern uint32_t _Addr_start; // note that uint32_t is fake.
extern uint32_t _AppLength; // there is no uint32_t object allocated
printf("start = %X len= %X\r\n", (uint32_t)&_Addr_start, (uint32_t)&_AppLength);
The problem is that if you use the 'small' memory model, the latter symbol (at 0x45C00) gives a linker warning, because it tries to cast it to a 16-bit pointer.
"C:/lakata/hardware-platform/CommonSW/otap.c", line 78: warning #17003-D:
relocation from function "OtapGetExternal_CRC_Calc" to symbol "_AppLength"
overflowed; the 18-bit relocated address 0x3f7fc is too large to encode in
the 16-bit field (type = 'R_MSP_REL16' (161), file = "./otap.obj", offset =
0x00000002, section = ".text:OtapGetExternal_CRC_Calc")
I tried using explicit far pointers, but Code Composer doesn't understand the keyword far. I tried to make the dummy symbol a function pointer, to trick the compiler into thinking that dereferencing it would... The pointer points to code space, and the code space model is "large" while the data space model is "small".
I figured it out before I finished entering the question!
Instead of declaring the symbol as
extern uint32_t _AppLength; // pretend it is a dummy data
declare it as
void _AppLength(void); // pretend it is a dummy function
Then the pointer conversion works properly, because &_AppLength is now assumed to be far. (When it was declared as an integer, &_AppLength was assumed to be near and the link failed.)
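Putting the pieces together, the pattern looks roughly like this (a sketch only; it assumes the same _Addr_start/_AppLength symbols defined in the linker command file above, and the helper name print_app_info is made up):

#include <stdio.h>

/* Pretend the linker symbols are functions so their addresses are treated
   as far (code-space) pointers; they must never actually be called. */
void _Addr_start(void);
void _AppLength(void);

void print_app_info(void)
{
    printf("start = %lX len = %lX\r\n",
           (unsigned long)&_Addr_start,
           (unsigned long)&_AppLength);
}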
I found a C function which I would like to use in my app. Unfortunately, my C knowledge is not great. The first section of code shows the original C code and the second my "translation" to Objective-C. I have a couple of questions I would appreciate help with:
Is my translation of the variables from their C counterparts to their Objective-C counterparts valid? (I have had no compiler warnings.)
Is it acceptable to use free() at the end, or should this be done in another way in Objective-C?
C code:
unsigned int i, j, diagonal, cost, s1len, s2len;
unsigned int *arr;
char *str1, *str2;
general code...
s1len = strlen(str1);
s2len = strlen(str2);
arr = (unsigned int *) malloc(sizeof(unsigned int) * j);
general code...
free(arr);
Objective-C code:
NSUInteger i, j, diagonal, cost, s1len, s2len;
NSUInteger *arr;
const char *str1 = [source cStringUsingEncoding:NSISOLatin1StringEncoding];
const char *str2 = [target cStringUsingEncoding:NSISOLatin1StringEncoding];
general code...
s1len = strlen(str1);
s2len = strlen(str2);
arr = (NSUInteger *) malloc(sizeof(NSUInteger) * j);
general code...
free(arr);
Objective-C is a strict superset of C, therefore you can use any C code without any modifications. I suggest either using the C code as is (as far as possible) or re-implementing the algorithm with objects in Objective-C.
NSString to char
What you need to provide is a way to convert Objective-C objects into C types, like NSString* to char*.
The conversion is correct, but you might want to use -UTF8String to keep all characters intact; Latin-1 might lose some information. The disadvantage of UTF-8 is that your C code might not be able to work with it correctly.
You'd better get the lengths using one of NSString's methods instead of strlen, because strlen has linear running time while NSString's methods can be constant-time.
// utf-8
int len = [source length];
// latin 1
int len = [source lengthOfBytesUsingEncoding:NSISOLatin1StringEncoding];
int to NSInteger
There's no reason to convert ints to NSIntegers. Apple added this type for 64-bit compatibility. NSInteger is typedef'd so that it is 32-bit on 32-bit platforms and 64-bit on 64-bit platforms.
I'd try to change as little as possible of the C code. (Makes it easier to update it when the original gets updated.) So just leave the ints as they are.
Memory management
C's memory management is more basic than Objective-C's, but as you seem to know you use malloc and free, that just stays the same. Retain/release and the garbage collector are only useful for objects anyway.
You can mix Objective-C with pure C when developing for the iPhone. In general, with Objective-C you want to be working with higher-level objects, and as such you shouldn't need to invoke malloc and similar (although of course you can).
I would suggest that you either re-implement the functionality the C code provides from scratch in Objective C, that is think about what you require the code to do and then just write the Objective C code - rather than trying to change the C code line by line. Or, I would just include the pure C code in your project and call the functions you need from Objective C.
What's stopping you from just using the C function in your code as-is? You can use any C function in Objective-C and it won't cause a problem. Many of the functions in Cocoa are C functions (for example NSSearchPathForDirectoriesInDomains()).
Your C version is perfectly valid Objective-C code. You don't need to translate it.