Nsight 5.0 on MacPro lion 10.7.4 - eclipse

I'm new to CUDA development and I'm using Nsight 5 on a Mac Pro.
I'm doing a very simple simulation with two particles (ver1 and ver2 here, two structs that hold pointers to another type of struct, links).
The code compiles but seems to run into a problem when it reaches the end of this block, and it never steps into integrate_functor():
...
thrust::device_vector<Vertex> d_vecGlobalVec(2);
d_vecGlobalVec[0] = ver1;
d_vecGlobalVec[1] = ver2;
thrust::for_each(
d_vecGlobalVec.begin(),
d_vecGlobalVec.end(),
integrate_functor(deltaTime)
);
...
So my questions are:
1. In Nsight, I can see the values of the member variables of ver1 and ver2; but right before the last line of this block, when I expand the hierarchy of d_vecGlobalVec, I cannot see any of these values: the corresponding fields (e.g. of the first element in this vector) are just empty. Why is this the case? Obviously, ver1 and ver2 are in host memory while the values in d_vecGlobalVec are on the device.
2. A member of the Nsight team posted this. So following that, in general, does it mean that I should be able to step in and out between host and device code, and be able to see host/device variables as if there were no barrier between them?
System:
NVIDIA GeForce GT 650M 1024 MB
Mac OS X Lion 10.7.4 (11E2620)

Make sure your device code is actually called. Check all return codes and confirm that the device actually produced the output. Sometimes Thrust may run the code on the host if it believes that is more efficient.
I would really recommend updating to 10.8: it has the latest drivers with the best support for the NVIDIA GeForce 6xx series.
Also note that for the best experience you need different GPUs for display and for CUDA debugging; otherwise Mac OS X may interfere and kill the debugger.

indexing problem when calling fit() function

I'm currently working on a project: a neural network that plays a game similar to Atari games (more details in the link). I'm having trouble with the indexing. Perhaps someone knows what the problem could be? I can't seem to find it. Thank you for your time. Here's my code (click on the link) and here's the full traceback. The problem starts with the way I call fit():
history = network.fit(state, epochs=10, batch_size=10)  # line 82
See this post: Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
As said in the correct answer,
Modern CPUs provide a lot of low-level instructions, besides the usual arithmetic and logic, known as extensions, e.g. SSE2, SSE4, AVX, etc. From the Wikipedia:
The warning states that your CPU does support AVX (hooray!).
Pretty much, AVX speeds up your training, etc. Sadly, TensorFlow is saying that it isn't going to use it... Why?
Because the default TensorFlow distribution is built without CPU extensions, such as SSE4.1, SSE4.2, AVX, AVX2, FMA, etc. The default builds (the ones from pip install tensorflow) are intended to be compatible with as many CPUs as possible. Another argument is that even with these extensions a CPU is a lot slower than a GPU, and medium- and large-scale machine-learning training is expected to be performed on a GPU.
What should you do?
If you have a GPU, you shouldn't care about AVX support, because most expensive ops will be dispatched on a GPU device (unless explicitly set not to). In this case, you can simply ignore this warning by:
# Just disables the warning, doesn't enable AVX/FMA.
# Set this before importing tensorflow, otherwise the startup message is already printed.
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
If you don't have a GPU and want to utilize CPU as much as possible, you should build tensorflow from the source optimized for your CPU with AVX, AVX2, and FMA enabled if your CPU supports them. It's been discussed in this question and also this GitHub issue. Tensorflow uses an ad-hoc build system called bazel and building it is not that trivial, but is certainly doable. After this, not only will the warning disappear, tensorflow performance should also improve.
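Before committing to a source build, it can help to confirm which of these extensions the CPU actually reports. The following is only a minimal sketch, assuming a Linux machine where /proc/cpuinfo exposes a "flags" line; the helper name cpu_extensions is made up for this example and is not part of TensorFlow.

```python
def cpu_extensions(cpuinfo_text, wanted=("sse4_1", "sse4_2", "avx", "avx2", "fma")):
    """Return {flag: bool} based on the first 'flags' line of /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags = set(value.split())
            break  # all cores normally report the same flags
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(cpu_extensions(f.read()))
    except OSError:
        # Assumption: this file only exists on Linux.
        print("no /proc/cpuinfo here (not Linux?)")
```

If every entry comes back False, an optimized build is unlikely to gain anything from these particular extensions.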
You can find all the details and comments in this StackOverflow question.
NOTE: This answer is a product of my professional copy-and-pasting.
Happy coding,
Bobbay
Has the code been debugged line by line? That would trace the error to the line causing it.
I assume the index error comes from the line below, where "i", targets[i] and outs[i] can be checked for the values they hold:
per_sample_losses = loss_fn.call(targets[i], outs[i])

MATLAB error message on startup [duplicate]

For a couple of days now, I have constantly been receiving the same error while using MATLAB; it happens at some point with dlopen. I am pretty new to MATLAB, which is why I don't know what to do. Google doesn't seem to be helping me either. When I try to compute an eigenvector, I get this:
Error using eig
LAPACK loading error:
dlopen: cannot load any more object with static TLS
I also get this while making a multiplication:
Error using *
BLAS loading error:
dlopen: cannot load any more object with static TLS
I did of course look for solutions to this problem, but I don't understand much of it and don't know what to do. These are the threads I found:
How do I use the BLAS library provided by MATLAB?
http://www.mathworks.de/de/help/matlab/matlab_external/calling-lapack-and-blas-functions-from-mex-files.html
Can someone help me please?
Examples of function calls demonstrating this error
>> randn(3,3)
ans =
2.7694 0.7254 -0.2050
-1.3499 -0.0631 -0.1241
3.0349 0.7147 1.4897
>> eig(ans)
Error using eig
LAPACK loading error:
dlopen: cannot load any more object with static TLS
That's bug no. 961964 of MATLAB, known since R2012b (8.0). MATLAB dynamically loads some libs with static TLS (thread local storage, e.g. see gcc compiler flag -ftls-model). Loading too many such libs => no space left.
Until now MathWorks' only workaround is to load the important(!) libs first by using them early (they suggest putting "ones(10)*ones(10);" in startup.m). I'd better not comment on this "solution strategy".
Since R2013b (8.2.0.701) with Linux x86_64 my experience is: Don't use "doc" (the graphical help system)! I think this doc-utility (libxul, etc.) is using a lot of static TLS memory.
Here is an update (2013/12/31)
All the following tests were done with Fedora 20 (with glibc-2.18-11.fc20) and Matlab 8.3.0.73043 (R2014a Prerelease).
For more information on TLS, see
Ulrich Drepper, ELF handling For Thread-Local Storage, Version 0.21, 2013,
currently available at Akkadia and Redhat.
What happens exactly?
MATLAB dynamically (with dlopen) loads several libraries that need TLS initialization. All those libs need a slot in the dtv (dynamic thread vector). Because MATLAB loads several of these libs dynamically at runtime, the linker (at MathWorks) had no chance at compile/link time to count the slots needed (that's the important part). Now it's the task of the dynamic lib loader to handle such a case at runtime. But this is not easy. To cite dl-open.c:
For static TLS we have to allocate the memory here and
now. This includes allocating memory in the DTV. But we
cannot change any DTV other than our own. So, if we
cannot guarantee that there is room in the DTV we don't
even try it and fail the load.
There is a compile time constant (called DTV_SURPLUS, see glibc-source/sysdeps/generic/ldsodefs.h) in the glibc's dynamic lib loader for reserving a number of additional slots for such a mess (dynamically loading libs with static TLS in a multithreading program). In the glibc-Version of Fedora 20 this value is 14.
Here are the first libs (running MATLAB) that needed dtv slots in my case:
matlabroot/bin/glnxa64/libut.so
/lib64/libstdc++.so.6
/lib64/libpthread.so.0
matlabroot/bin/glnxa64/libunwind.so.8
/lib64/libuuid.so.1
matlabroot/sys/java/jre/glnxa64/jre/lib/amd64/server/libjvm.so
matlabroot/sys/java/jre/glnxa64/jre/lib/amd64/libfontmanager.so
matlabroot/sys/java/jre/glnxa64/jre/lib/amd64/libt2k.so
matlabroot/bin/glnxa64/mkl.so
matlabroot/sys/os/glnxa64/libiomp5.so
/lib64/libasound.so.2
matlabroot/sys/jxbrowser/glnxa64/xulrunner/xulrunner-linux-64/libxul.so
/lib64/libselinux.so.1
/lib64/libpixman-1.so.0
/lib64/libEGL.so.1
/lib64/libGL.so.1
/lib64/libglapi.so.0
Yes, more than 14 => too many => no slot left in the dtv. That's what the error message is trying to tell us, and especially MathWorks.
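To get a rough idea of which libraries carry thread-local data at all, one can look for a PT_TLS program header in each shared object. The following is a minimal sketch under stated assumptions, not the method used for the list above: the helper name has_tls_segment is made up for this example, and a PT_TLS segment only shows that the library has TLS data. Whether it requests the static (initial-exec) model is recorded separately, in the DF_STATIC_TLS dynamic flag, which this sketch does not check.

```python
import struct

PT_TLS = 7  # program-header type for a thread-local storage segment


def has_tls_segment(data):
    """Return True if the ELF image in `data` (bytes) has a PT_TLS program header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little-endian, 2 = big-endian
    if is_64:
        (phoff,) = struct.unpack_from(endian + "Q", data, 32)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 54)
    else:
        (phoff,) = struct.unpack_from(endian + "I", data, 28)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 42)
    # p_type is the first 4-byte field of every program header in both classes.
    for i in range(phnum):
        (p_type,) = struct.unpack_from(endian + "I", data, phoff + i * phentsize)
        if p_type == PT_TLS:
            return True
    return False


if __name__ == "__main__":
    import sys
    # Usage: python this_script.py /lib64/libGL.so.1 ...
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, has_tls_segment(f.read()))
```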
For the record: In order not to violate MATLAB's license I didn't debug, decompile or disassemble any part of the binaries shipped with MATLAB. I only debugged the free and open glibc binaries of Fedora 20 that MATLAB was using to dynamically load the libs.
What can be done, to solve this problem?
There are 3 options:
(a) Rebuild MATLAB and do not dynamically load those libs (with initial-exec tls model); instead link against them (then the linker can count the required slots!)
(b) Rebuild those libs and ensure they are NOT using the initial-exec tls model.
(c) Rebuild glibc and increase DTV_SURPLUS in glibc/sysdeps/generic/ldsodefs.h
Obviously options (a) and (b) can only be done by mathworks.
For option (c) no source of MATLAB is needed and thus can be done without mathworks.
What is the status at mathworks?
I really tried to explain this to the "MathWorks Technical Support Department". But my impression is: they don't understand me. They closed my support ticket and suggested a telephone(!) conversation in January 2014 with a technical support manager.
I'll do my very best to explain this, but to be honest: I'm not very confident.
Update (2014/01/10): Currently mathworks is trying option (b).
Update (2014/03/19): For the file libiomp5.so you can download a newly compiled version (without static TLS) at MathWorks, bug report 961964. And the other libs? No improvement there. So don't be surprised to get "dlopen: cannot load any more object with static TLS" with "doc", e.g. see bug report 1003952.
Restarting Matlab solved the problem for me.
Long story short: in the directory that you start MATLAB from, create a file startup.m with the content ones(10)*ones(10);. Restart MATLAB and it will be taken care of.
This is, as I find, an age-old problem yet unsolved by MathWorks.
Here are my two cents, which worked for me (when I wanted IT++ external libraries, with MEX).
Let the library that you found to be the cause of the problem be "libXYZ.so", and assume you know where it lies on your system.
The solution is to tell MATLAB to load that specific library as early as possible during its startup. The error is apparently caused by a lack of slots for thread-local storage (TLS), because they have already been filled up.
Because the latest compilations suddenly required a new library that was not loaded earlier during startup, MATLAB throws this error.
Pity that MATLAB never cared to resolve this problem for so long.
Fortunately, the solution is a single, very simple terminal command.
Typical steps on a Linux machine should be as follows:
1. Open a command prompt (Ctrl+Alt+T in Ubuntu)
2. Issue the following command
export LD_PRELOAD=<PATH-TO-libxyz.so>
e.g.: export LD_PRELOAD=/usr/local/lib/libitpp.so
3. Start matlab from the same terminal
matlab &
Running your program now should resolve the issue, as it did in my case.
Good luck!
Reference:
[1] http://au.mathworks.com/matlabcentral/answers/125117-openmp-mex-files-static-tls-problem
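The LD_PRELOAD steps above can also be scripted, so the variable is set only for the MATLAB process rather than for the whole terminal session. This is just a sketch: the helper names preload_env and start_matlab are made up for this example, and it assumes the matlab launcher is on PATH.

```python
import os
import subprocess


def preload_env(lib_path, base_env=None):
    """Return a copy of the environment with lib_path placed first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Entries in LD_PRELOAD may be separated by colons; keep whatever was already set.
    env["LD_PRELOAD"] = lib_path + (":" + existing if existing else "")
    return env


def start_matlab(lib_path, command=("matlab",)):
    """Launch MATLAB (assumed to be on PATH) with the given library preloaded."""
    return subprocess.Popen(list(command), env=preload_env(lib_path))
```

For example, start_matlab("/usr/local/lib/libitpp.so") corresponds to the two shell commands in the steps above.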
http://www.mathworks.de/support/bugreports/961964 has been updated on 30/01/2014.
There is a zip file attached with libiomp5.so
I tested it on Mageia 4 x86_64 with Matlab R2013b.
I can now use the Documentation of Matlab to open a demo without any problem.
I had the same problem and I think I just solved it.
When installing matlab use the custom installation (I did not do this the first time). Choose to create symbolic links to matlab scripts in the predefined folder (/usr/local/bin). This did the trick for me!
I had the same problem with both Matlab 2013b and Matlab 2014a. The fix provided by mathworks for libiomp5.so only removed the problem of LAPACK not working. However, I could not use external libraries which are using OpenMp (such as VL_FEAT): I still get the error
"dlopen: cannot load any more object with static TLS."
The only thing which worked for me was downgrading to Matlab 2012b.
I came across this problem after "bar" (for bar plots) with an array gave me just a single blue block, with no errors thrown. A reboot solved the problem at first. But after a memory error (after processing a very large file), I just could not get past this blue block problem.
Using "hist" on a matrix input gave me the "BLAS loading error" problem and led me to this thread. The MathWorks workaround fixed the hist and bar problems.
Just wanted to bring recognition to the extent of this bug's influence.
I had the same problem and solved it by increasing my Java Heap memory. Go to Preferences > General > Java-Heap Memory, and increase the allocated memory.
Increasing Java heap memory (to 512 MB) also worked for me on R2013b/Ubuntu 12.04. The "BLAS loading error" began when I processed an 11 GB file (with 16 GB RAM), and has not recurred after increasing Java heap memory and restarting MATLAB.

Specman debugging OS11 in gen

I'm getting OS 11 failure in a gen action. The constraints for this gen action are intensive, and it's too complex to debug.
How can we debug this failure and determine the source of this OS 11?
An OS 11 error probably means you're trying to dereference a NULL pointer. Make sure you load your code (not compile it) to see where this is happening (this applies to all OS 11 errors and not just the ones in constraints). Compiled code removes a lot of debug information (to run faster), making it difficult to trace the exact portion of e code that is causing the problem.
Specman provides a great constraint debugger that can assist you further. I don't know the commands by heart, but you have to set a breakpoint at the point where the failing CFS (connected field set) is being generated. Search for "break on gen" in the documentation.

Running executables of different format on any OS

This shouldn't be as hard as one might think, if I got it right. Specifically, I'll begin with iOS and the ELF executable format. Let me clarify that I have a jailbroken iPhone and I don't want to do this in any App Store apps, so please avoid "good advice" like "you can't do it as it's prohibited by Apple".
So, what I have seen is that there's a Flash player implementation called Frash (by Comex, by the way, the developer of recent jailbreaks). After installation, this utility requires that Android's libflashplayer.so is present on (copied to) the iPhone file system. I dug into the source code and found out that the tweak actually opens the Android (ELF) shared object file, "parses" it and executes code from it. I already asked a friend of mine whether this is actually possible, and he told me that it is, because ELF on ARM and Mach-O on ARM are binary compatible (because they're both ARM). But he failed to explain it to me in detail, so I'd like to ask: how can it be done? I can't fully understand the source code fragment that handles this, but one thing is sure:
int fd = open("libflashplayer.so", O_RDONLY);
_assert(fd > 0);
fds_init();
sandbox_me();
int symtab_size;
Elf32_Sym *symtab;
void **init_array;
Elf32_Word init_array_size;
char *strtab;
TIME(base_load_elf(fd, &symtab, &symtab_size, &init_array, &init_array_size, &strtab));
// Call the init funcs
_assert(init_array);
while(init_array_size >= 4) {
    void (*x)() = *init_array++;
    notice("Calling %p", x);
    x();
    init_array_size -= 4;
}
(from the original code, as of 02/12/2011 on GitHub)
It seems to me that he uses libelf to perform this, right? And that in an ELF file there are symbols that can be executed on a compatible processor just fine?
I'd also like to know whether it is true for all other processor architectures? So maybe one can execute symbols from Linux binaries on OS X?
The important thing about compatibility is the underlying processor architecture, not Linux vs. OS X vs. Android. If the ELF or .so are compiled for the same processor instruction set, then this can work. If not, then they are not compatible. For example, if both were built for Linux but different processors, they would not be compatible.
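Which instruction set an ELF file or .so was built for is recorded in its header, so the check described here can be done before trying to load anything. Below is a minimal sketch, not how Frash itself does it; the helper name elf_target and the small MACHINES table are assumptions made for this example (the numeric values are the e_machine constants from the ELF specification).

```python
import struct

# A few e_machine values from the ELF specification.
MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def elf_target(data):
    """Return (bits, machine_name) read from the first 20 bytes of an ELF file."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(data[4])      # EI_CLASS
    endian = "<" if data[5] == 1 else ">"   # EI_DATA
    (e_machine,) = struct.unpack_from(endian + "H", data, 18)
    return bits, MACHINES.get(e_machine, "unknown (%d)" % e_machine)


if __name__ == "__main__":
    import sys
    # Usage: python this_script.py libflashplayer.so
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, elf_target(f.read(20)))
```

A matching machine field is only the first condition: it tells you the instructions can run on the processor, not that the code will find the libraries and system calls it expects.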