Different behavior of mpiexec in Windows and in Ubuntu - eclipse

I have a code in Fortran (program.f) and I have compiled it with Eclipse in \ubuntu 16 and in Windows 7.
The Eclipse configuration for Ubuntu is the follow:
GNU Fortran Compiler: gfortran
Include paths(-l) : /usr/lib/openmpi/include
GNU Fortran Linker : mpif90
Tool Chain Editor : GCC Fortran
The Eclipse configuration for Windows is the follow:
GNU Fortran Compiler: gfortran
Include paths(-l) : C:\cygwin64\usr\include
GNU Fortran Linker : mpif90
Tool Chain Editor : GCC Fortran
When I execute the program in Ubuntu, the program works how it is expected.
In Ubuntu the program is executed with 2 processors by doing
$ mpiexec -np 2 myprogram
And the behavior is the follow
$ mpiexec -np 2 myprogram
There are 2 processors running this job.
Rank# 1 d1= 65 d2= 128
Rank# 0 d1= 1 d2= 64
Where d1 and d2 are pieces of the problem domain assigned to each processors. In this example the total domain is 128. The domain was assigned from 1 to 64 to processor 0, and from 65 to 128 to processor 1. This is the expected behavior: the model of 128 are divided in 2, from 1 to 64 to processor 0, and from 65 to 128 for the processor 1.
For the other hand, in Windows, after compile the code using the mentioned specifications, I execute the program by doing:
$ mpiexec.exe -n 2 myprogram.exe
And the behavior is the follow
$ mpiexec -np 2 myprogram
There are 1 processors running this job.
Rank# 0 d1= 1 d2= 128
Rank# 0 d1= 1 d2= 128
We can see that the behavior is different: the program executed in Windows is not running in parallel as it is expected. In the terminal we can see that 1 processor is running the program, and the domain is assigned as follow: from 1 to 128 (whole domain) to processor 0 and, from 1 to 128 (whole domain again?) to processor 0. This is the problem that I am trying to solve. I am trying to have the same behavior that I have in Ubuntu.
The mpiexec.exe program for Windows was obtained from the official installer MS-MPI.
The gfortran and the OpenMPI libs for Windows were obtained by using cygwin
I tried to change the GNU linker and the compiler in Eclipse for Windows and does not work. I tried to run the code in others machines with Windows 10 and problem is the same.
Any suggestions on how to try solve this issue?

As mentioned by #jcgiret there are a consistency problem: the program is compiled using OpenMPI and it is executed with MS-MPI.
To solve this issue the code was executed using the equivalent to mpiexec defined in the openmpi package:
usr/bin/mpiexec -> orterun.exe
The program is executed in windows by
$ orterun.exe -n 2 myprogram.exe
Then the results is the same that obtained in Ubuntu:
$ orterun.exe -n 2 myprogram.exe
There are 2 processors running this job.
Rank# 1 d1= 65 d2= 128
Rank# 0 d1= 1 d2= 64

Related

How can I remove COM ports by command line (w/o installing)

The setup: An automated test station, built around a Windows 7 PC. The UUT (Unit Under Test) are connected and disconnected often, creating many COM ports.
The Problem: The test is searching for the device at a specific COM port, requiring the user to manually remove the "ghost" com ports.
The question: Since the software is used by several test stations in parallel, at a production floor, I cannot install additional software (e.g. Devcon, part of Windows SDK). Is there a command line option to remove the COM ports?
Based on This blog entry
The only thing that actually worked for me is not the intended solution... But it fit the case at hand:
Created a batch file at C:\windows\system32
The content:
REG ADD "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\COM Name Arbiter" /v ComDB /t REG_BINARY /d 0206 /f
The actual value of the registry key (0206 in my example) can either be read from that key (use regedit) or calculated from binary:
com8 com7 com6 com5 com4 com3 com2 com1 com16 com15 com14 com13 com12 com11 com10 com9
0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0
since I wished to keep com1, com2, com10 and com11 - 0000 0011 0000 0110 - which stand for 0206
executing this batch file will remove the unnecessary comports while leaving the one's I intended
*The batch should be at system32 to be executed as elevated (administrator)
*for more details refer to this PDF

Install4J fails to startup on Mac OS X

With JabRef, we experience a weird situation where some users get the following error when starting JabRef
We do not ship a JRE with our application and rely on that Java is installed on the machine of the users.
We have a long issue on GitHub about this particular error and despite all our efforts, it seems we cannot find the source of the issue. Especially, since none of the devs can reproduce it. Fortunately, one of the users who experiences this error was kind enough to provide all information about his machine:
$ java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
$ which java
/usr/bin/java
$ ls -l /usr/bin/java
lrwxr-xr-x 1 root wheel 74 Aug 14 12:09 /usr/bin/java -> /System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands/java
$ /usr/libexec/java_home -V
Matching Java Virtual Machines (2):
1.8.0_161, x86_64: "Java SE 8" /Library/Java/JavaVirtualMachines/jdk1.8.0_161.jdk/Contents/Home
1.8.0_121, x86_64: "Java SE 8" /Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home
$ echo $JAVA_HOME
/Library/Java/JavaVirtualMachines/jdk1.8.0_161.jdk/Contents/Home
$ echo $JDK_HOME
/Library/Java/JavaVirtualMachines/jdk1.8.0_161.jdk/Contents/Home
I asked this particular user to provide the INSTALL4J_LOG and so that we can compare the log-files of Install4J on my machine where it works and on the users machine where it fails.
My machine runs OS X 10.13.2 and the system of the user is OS X 10.12.6.
There are two main differences between them. The first difference is that when I invoke the JabRef launcher, it instantly tells me the order of places to search for Java:
2018-01-24 02:10:14.437 JavaApplicationStub[4846:345347] -[Launcher findJavaBundle:] [Line 479] search sequence (
EPATH,
Y,
"EJAVA_HOME",
"EJDK_HOME"
)
This is exactly the order we have configured in Install4J
On the users log-file this part is missing completely. However, it doesn't really matter, because the JRE/JDK is very rarely included in the $PATH variable. The next things to check are the default locations and this will find the JRE in Library/Internet\ Plug-Ins/JavaAppletPlugin.plugin first.
Therefore, on my machine and on the users machine, JabRef will use this JRE and not any of additionally installed JDK's.
Now, there is the probably more important difference. On the machine of the user, it seems the JRE needs to be unpacked which does not happen for me:
2018-01-23 12:27:08.320 JavaApplicationStub[40695:1133965] -[Launcher findJavaBundle:] [Line 474] Running unpacker
2018-01-23 12:27:08.323 JavaApplicationStub[40695:1133965] +[Unpacker unpack:withProgress:] [Line 24] found ()
2018-01-23 12:27:08.323 JavaApplicationStub[40695:1133965] +[Unpacker unpack:withProgress:] [Line 24] found ()
After that, the logs look similar but when finally the JRE is invoked, it cannot be found on the user's machine and we get the error message
2018-01-23 12:27:08.328 JavaApplicationStub[40695:1133965] int launcher_main(int, char **) [Line 925] Could not load JRE from The bundle “Java SE 8” couldn’t be loaded because its executable couldn’t be located..: (
0 CoreFoundation 0x00007fff8ff882cb __exceptionPreprocess + 171
1 libobjc.A.dylib 0x00007fffa4d9e48d objc_exception_throw + 48
2 CoreFoundation 0x00007fff90006c3d +[NSException raise:format:] + 205
3 JavaApplicationStub 0x0000000100008644 -[Launcher launch] + 468
4 JavaApplicationStub 0x0000000100008ef5 launcher_main + 645
5 JavaApplicationStub 0x0000000100009062 main + 34
6 JavaApplicationStub 0x0000000100001504 start + 52
)
Now, I'm not sure why unpacking is necessary and if it can be explained by the particular JRE or the different Mac OS versions.
In any way, I have asked the user to install the Oracle JRE 1.8.0_161 so that we both have the exact same Java and I'll report back if this solved the problem.
However, has someone a bright idea why the Install4J launcher crashes?

Cannot run magic functions in ipython terminal

I am using Enthought's Canopy environment on a 64 bit Linux OS. Everything works fine in the Ipython console which is attached with the editor. But when I ipython in the terminal and try to use magic functions, I get the following error.
---------------------------------------------------------------------------
error Traceback (most recent call last)
<ipython-input-3-29a4050aa687> in <module>()
----> 1 get_ipython().show_usage()
/home/shahensha/Development/Canopy/appdata/canopy-1.0.3.1262.rh5-x86_64/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in show_usage(self)
2931 def show_usage(self):
2932 """Show a usage message"""
-> 2933 page.page(IPython.core.usage.interactive_usage)
2934
2935 def extract_input_lines(self, range_str, raw=False):
/home/shahensha/Development/Canopy/appdata/canopy-1.0.3.1262.rh5-x86_64/lib/python2.7/site-packages/IPython/core/page.pyc in page(strng, start, screen_lines, pager_cmd)
188 if screen_lines <= 0:
189 try:
--> 190 screen_lines += _detect_screen_size(screen_lines_def)
191 except (TypeError, UnsupportedOperation):
192 print(str_toprint, file=io.stdout)
/home/shahensha/Development/Canopy/appdata/canopy-1.0.3.1262.rh5-x86_64/lib/python2.7/site-packages/IPython/core/page.pyc in _detect_screen_size(screen_lines_def)
112 # Proceed with curses initialization
113 try:
--> 114 scr = curses.initscr()
115 except AttributeError:
116 # Curses on Solaris may not be complete, so we can't use it there
/home/shahensha/Development/Canopy/appdata/canopy-1.0.3.1262.rh5-x86_64/lib/python2.7/curses/__init__.pyc in initscr()
31 # instead of calling exit() in error cases.
32 setupterm(term=_os.environ.get("TERM", "unknown"),
---> 33 fd=_sys.__stdout__.fileno())
34 stdscr = _curses.initscr()
35 for key, value in _curses.__dict__.items():
error: setupterm: could not find terminfo database
So, I installed a bare bones iPython shell which is not the one provided by Canopy and tried the same magic functions in there and it works fine.
Have I done something wrong with the installation? Please help
Thanks a lot
shahensha
This is not a solution, but just an observation. My desktop is MacOS-X and I connect to a Centos machine to run Enthought Canopy both 64 bit. I get the same error message as OP if I ssh from iterm2, but not if I use the Terminal app.
I am not sure what the underlying reason is, but may be someone can verify if a similar situation is true for linux. Interestingly I can use either iterm2 or Terminal on the local canopy without any issues.
Update:
I just noticed that the TERM environment variable in iterm2 was set to "xterm" while the Terminal app was showing "xterm-256color". Issuing the command export TERM="xterm-256color" before running the Canopy ipython in terminal solves the issue for me in iterm2.
Problem reproduction:
$ python -c 'import curses; curses.setupterm()'
Traceback (most recent call last):
File "<string>", line 1, in <module>
_curses.error: setupterm: could not find terminfo database
This irc log gave me the idea that this error was to do with libncursesw.
My Canopy version is 1.0.3.1262.rh5-x86_64. I have installed it to ~/src/canopy.
In ~/src/canopy/appdata/canopy-1.0.3.1262.rh5-x86_64/lib we can see that my canopy install has libncursesw.so.5.7.
My machine (Debian Wheezy 64bit) has libncursesw.so.5.9 (in /lib/x86_64-linux-gnu/libncursesw.so.5.9). I made canopy use this. You can toggle the problem on / off by using LD_PRELOAD and pointing at the .so file.
Solution
Replace libncurses.so.5.7 with libncurses.so.5.9:
CANOPYDIR=$HOME/src/canopy
CANOPYLIBS=$CANOPYDIR/appdata/canopy-1.0.3.1262.rh5-x86_64/lib/
SYSTEMLIBS=/lib/x86_64-linux-gnu
cp $SYSTEMLIBS/libncurses.so.5.9 $CANOPYLIBS
ln -sf $CANOPYLIBS/libncurses.so.5.9 $CANOPYLIBS/libncurses.so.5
It appears that Canopy User Python is not your default. See this article:
https://support.enthought.com/entries/23646538-Make-Canopy-s-Python-be-your-default-Python-i-e-on-the-PATH-
Update: Not true here -- instead, see batu's workaround answer.

Schrödinger's file

I am puzzled by the following sequence of commands.
sh-4.2$ pwd
/home/willard
sh-4.2$ ls -l f
-rwxr-xr-x 1 willard users 59116 Jan 23 14:54 f
sh-4.2$ file f
f: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, BuildID[sha1]=0xea0e08ff2b5a062698d45b78177acdd6bf140d1f, stripped
sh-4.2$ ./f
sh: ./f: No such file or directory
sh-4.2$ strace ./f
execve("./f", ["./f"], [/* 32 vars */]) = -1 ENOENT (No such file or directory)
write(2, "strace: exec: No such file or di"..., 40strace: exec: No such file or directory
) = 40
exit_group(1) = ?
+++ exited with 1 +++
sh-4.2$ ls -l f
-rwxr-xr-x 1 willard users 59116 Jan 23 14:54 f
sh-4.2$ uname -a
Linux xdat10 3.6.2-1-ARCH #1 SMP PREEMPT Fri Oct 12 23:58:58 CEST 2012 x86_64 GNU/Linux
How is this possible?
I found someone having the same problem (with relative explanation)
Running 32bit binary on a 64bit system
Quoting the most important sentences:
This situation often arises when you try to run a binary for the right
system (or family of systems) and superarchitecture but the wrong
subarchitecture. Here you have ELF binaries on a system that expects
ELF binaries, so the kernel loads them just fine. They are i386
binaries running on an x86_64 processor, so the instructions make
sense and get the program to the point where it can look for its
loader. But the program is a 32-bit program (as the file output
indicates), looking for the 32-bit loader /lib/ld-linux.so.2, and
you've presumably only installed the 64-bit loader
/lib64/ld-linux-x86-64.so.2 in the chroot.
You need to install the 32-bit runtime system in the chroot: the
loader, and all the libraries the programs need. On Debian amd64, the
32-bit loader is in the libc6-i386 package. You can install a bigger
set of 32-bit libraries by installing ia32-libs.
I bet there's a better way to verify this but i'd try to exec
ldd ./f
and search in the output which loader is needed to exec'it
man 2 execve:
ENOENT The file filename or a script or ELF interpreter does not
exist, or a shared library needed for file or interpreter can‐
not be found.
You could run ldd against this binary to look for libraries that could not be mapped and install them from multilib.

Solaris CPU run queue

Is there a command which can tell me whats in the Solaris run queue?
I can get a count using vmstat, but I need to know what processes/threads are in there.
The run-queue is always changing, so it's almost impossible to get the set of processes in the current run-queue.
That said, you can get an approximation by looking at the STAT (state) field of the process list from ps. When running the command below:
$ ps aux
...the if the STAT field begins with R, then the process is marked RUNNABLE by the kernel, which on most operating systems means that it is in the run-queue. Here's what a runnable process looks like on my machine:
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 78179 0.0 0.0 599828 480 s003 R+ 7:51AM 0:00.00 ps aux
On solaris, you can also use the prstat command and look at the STATE column. The value run indicates that the process is on the run-queue. (Also note that the value cpuN indicates that the process is currently running on processor N.
For example:
$ prstat -s cpu -n 5
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
13974 kincaid 888K 432K run 40 0 36:14.51 67% cpuhog/1
27354 kincaid 2216K 1928K run 31 0 314:48.51 27% server/5
14690 root 136M 46M sleep 59 0 0:00.59 2.3% Xsun/1
14797 kincaid 9192K 7496K sleep 59 0 0:00.10 0.9% dtwm/8
14851 kincaid 24M 14M sleep 48 0 0:00.03 0.3% netscape/1
Total: 97 processes, 190 lwps, load averages: 2.18, 2.15, 2.11
I was about to correct 0xfe answer when I saw you already did it. The run queue is containing theads not processes so the -L option is mandatory with the prstat command if you want to have the number of "state run" lines more or less matching the run queue. Beware that sampling artifacts will probably prevent to get accurate matches.
In any case, if you want to precisely know what processes/threads are sitting in the run queue you'd rather go the dtrace way assuming you are running Solaris 10 or newer.
The whoqueue.d script which might already been in /usr/demo/dtrace directory on your machine will be a good start:
# dtrace -s /usr/demo/dtrace/whoqueue.d
Run queue of length 1:
24349/1 (dtrace)
Run queue of length 3:
0/0 (sched)
0/0 (sched)
0/0 (sched)
Run queue of length 4:
22468/30 (java)
22468/17 (java)
22468/23 (java)
22468/10 (java)
Have a look at this page for details.