Matlab code after compilation - matlab

I am totally a newbie in Matlab
I want to ask that when we write a program in Matlab software or IDE and save it with a
.m (dot m) file and then compile and execute it, then that .m (dot m) file is converted into which file? I want to know this because i heard that matlab is platform independent and i did google this but i got converting matlab file to C, C++ etc
Sorry for the silly question and thanks in advance.

Matlab is an interpreted language. So in most cases there is no persistent intermediate form. However, there is an encrypted intermediate form called pcode and there are also the MATLAB compiler and MATLAB coder which delivers code in other high level languages such as C.
edit:
pcode is not generated automatically and should be platform/version independent. But it's major purpose is to encrypt the code, not to compile it (although, it does some partial compilation). To use pcode, you still need the MATLAB environment installed, so in many ways it acts like interpreted code.
But from your follow-up question I guess you don't quite understand how MATLAB works. The code gets interpreted (although with a bit of Just-In-Time Compilation), so there is no need for a persistent intermediate code file: the actual data structures representing your code are maintained by MATLAB. In contrast to compiled languages, where your development cycle is something like "write code, compile & link, execute", the compilation (actually: interpretation) step is part of the execution, so you end up with "write code, execute" in most of the cases.
Just to give you some intuitive understanding of the difference between a compiler and an interpreter. A compiler translates a high level language to a lower level language (let's say machine code that can be executed by your computer). Afterwards that compiled code (most likely stored in a file) is executed by your computer. An interpreter on the other hand, interprets your high level code piece by piece, determining what machine code corresponds to your high level code during the runtime of the program and immediately executes that machine code. So there is no real need to have a machine code equivalent of your entire program available (so in many cases an interpreter will not store the complete machine code, as that is just wasted effort and space).
You could look at interpretation more or less as a human would interpret code: when you try to manually determine the output of some code, you follow the calculations line by line and keep track of your results. You don't generally translate that entire code into some different form and afterwards execute that code. And since you don't translate the code entirely, there is no need to persistently store the intermediate form.
As I said above: you can use other tools such as MATLAB coder to convert your MATLAB code to other high languages such as C/C++, or you can use the MATLAB compiler to compile your code to executable form that depends on some runtime libraries. But those are only used in very specific cases (e.g. when you have to deploy a MATLAB application on computers/embedded devices without MATLAB, when you need to improve performance of your code, ...)
note: My explanation about compilers and interpreters is a quick comparison of the archetypal interpreter and compiler. Many real-life cases are somewhere in between, e.g. Java generally compiles to (JVM) bytecode which is then interpreted by the JVM and something similar can be said about the .NET languages and its CLR.

Since MATLAB is an interpreter, you can write code and just execute it from the IDE, without compilation.
If you want to deploy your program, you can use the MATLAB compiler to create an stand-alone executable or a shared library that you can use in a C++ project. On Windows, MATLAB code would compile to an .EXE file or a .DLL file, respectively.

Related

Impact of choosing a programming language on the OS performance

Does choosing a programming language decide performance when all of it is compiled to some 1's and 0's
Eg: printf (in C) vs cout (C++) vs print (in Python)
Do all of the above have same binary compiled code ?
Appreciate any help in understanding this concept of programming language and role on hardware in detail! Thanks in advance
The choice of programming language can have many impacts on the performance of your code, how portable it is, the comparability and among other things, how easily the objective can be put into code. To answer you question directly, C and C++ would likely produce the 'same binary' when printing an output, if they were both done for the same target environment. Python is different because it is an interpreted language, meaning the code is read by a program written in code native to the architecture and acted upon accordingly. Python is something of an edge case in this regard because it is technically compiled at execution time (and can be before distribution) but into an intermediate code similar in principle to Java byte code that is only understood by the Python interpreter.
The difference you bring up between lower language's like C and higher ones like Java, Python and even JavaScript is the nature of their execution being done by native hardware or by the interpreter. Language's running on bare metal are generally understood to be faster than those on interpreters as the interpreter takes time to understand the code and uses it's own system resources. Java tends to break this rule because it's interpreter is a full virtual machine that understands very simple byte code, making it competitive in speed to language's like C.
To what kind of binary code they are compiled depends on the compiler. For C and C++ there are dozens of different compilers which might generate different binary code. Besides that, most compilers even have optimization flags that influence the generated binary code a lot.
Python isn't even directly compiled into "machine code", it's compiled into bytecode for a python interpreter. The Python interpreter itself is a program that runs on the machine, then reads the python-bytecode and executes it probably by internally calling predefined functions (that already exist in machine-code)

Executing CUDA code In Matlab

I wanted to ask if anyone have run some C code containing CUDA code on Matlab?
I have read the documentation on Mathworks website but I still cant quite wrap my head around it. I do understand that it is two main type of ways you could do this either executing a CUDA kernel by constructing a object with the function parallel.gpu.CUDAKernel or by constructing a mex file out of a .cu file. There are some things though I do not understand when using these two methods.
Using the mex approach should I use another IDE like Visual Studio to compile a .cu file first before compiling the mex file in Matlab? If so how can I compile the .cu file without a main() function in the .cu file, I always get errors when I try to compile it that way in VS, or is it okay to have a main function in the .cu file and pass the pointers to the GPU arrays to the main function?
For the CUDA kernel approach, should compile the kernel in VS, and in that case how?
Both things can work.
If you want to have flexibility, my suggestion is to write your .cu files with .c (or .cpp) files*. Once you have some basic thing working, you should be able to write a mex wrapper around it in order to grab MATLAB variables and convert them co C/C++ so you can pass them to and from CUDA. This requires you to have a compiler that is both compatible with your versions of MATLAB, CUDA and OS. An example is Visual Studio 2013 in widows and most versions of MATLAB and CUDA, but please check. Generally this is done by linking nvcc to the mex compiler after setup with some xml files (see example here from my toolbox). This approach gives you full flexibility, not only for working with CUDA, but also with working with the anything that you may want to use together with your kernels e.g. tensorflow, eigen, SQL, ... Its full flexibility.
If instead you just want a few operations accelerated with simple methods, use the Parallel computing toolbox with gpuarrays for standard MATLAB operations or with parallel.gpu.CUDAKernel for your own kernels. To use this second one you need to compile a ptx file, which seems pretty straightforward. A priori this gives you less flexibility, as it will run just a kernel, but often complex GPU programs may need several kernels and data handling techniques, as well as communication between kernels etc. However I have personally haven't tried it and perhaps you can achieve full flexibility. Let me know and I will edit the answer.
In short, you choice depends on your application/needs.
*You might not need .c or .cpp files with modern versions of MATLAB and mexcuda.

How does a disassembler work and how is it different from a decompiler?

I'm looking into installing a disassembler (or decompiler) on my Linux Mint 17.3 OS and I wanted to know what the difference is between a disassembler and a decompiler. I have a rough idea of what they are (the names are fairly self-explanatory), but they are still a bit confusing.
I've read that a disassembler turns a program into assembly language, which I don't know, so it seems kind of useless to me. I've also read that a decompiler turns a 'binary file' into its source code. What exactly is a binary file?
Apparently, decompilers cannot decompile to C, only Python and other similar languages. So how can I turn a program into its original C source code?
A disassembler is a pretty straightforward application that transfers machine code into assembly language statements - This activity is the reverse operation that an assembler program does and is straightforward because there is a strict one-to-one relationship between machine code and assembly. A disassembler aims at a specific CPU. The original assembler that was used to create the executable is only of minor relevance.
A decompiler aims at recreating a compiled high-level language program from machine code into its original format - Thus trying the reverse operation of a C or Forth (popular languages for which de-compilers exist) compiler. Because there are so many high-level languages and thus so many ways in how original high-level language constructs could be expressed in machine code (even a lot of different strategies for the same language and construct, even in the same compiler, and even different strategies depending on the compiler mode and situation), this operation is much more complex and very dependent on the original compiler (and maybe even the command line that was used, it's chosen optimization level and also the used version).
Even if all that fits, most of the work of a decompiler is educated guessing and will most probably never reach a point where it can reconstruct the original program in its source code form 100% - It will rather end up with a version of source code that could have been the original program.

Racket Interactive vs Compiled Performance

Whether or not I compile a Racket program seems to make no difference to the runtime performance.
Is it just the loading of the file initially that is improved by compilation? In other words, does running racket src.rkt do a jit compilation on the fly, which is why I see no difference in compiling vs interactive?
Even for tight loops of integer arithmetic, where I thought some difference would occur, the profile times are equivalent whether or not I previously did a raco make.
Am I missing something simple?
PS, I notice that I can run racket against the source file (.rkt) or .zo file. Does racket automatically use the .zo if one is found that corresponds to the .rkt file, or does the .zo file need to be used explicitly? Either way, it makes no difference to the performance numbers I'm seeing.
Yes, you're right.
Racket compiles code in two stages: first, the code is compiled into bytecode form, and then when it runs it gets jitted into machine code. When you compile a file, you're basically creating the bytecode which saves on re-compiling it later. Since that's usually not something that takes a lot of time for small pieces of code, you won't see any noticeable difference in runtimes. For an extreme example, you can delete all *.zo files in the collection tree and start DrRacket -- it will take a lot of time to start since there's a ton of code, but once it does start, it would run almost as usual. (It would also be slow to click "run" since that will reload and recompile some files.) Another concern for bigger pieces of code is that the compilation process can make memory consumption higher, but that's also not an issue with smaller pieces of code.
See also the Performace chapter in the guide for hints on how to improve performance.
Racket will always compile your code, regardless of whether it is run interactively at the REPL or run from the command-line. Here is the section in the guide that explains it. In interactive mode, the compiler turns every expression/definition into bytecode in memory and executes that. Otherwise, the compilers outputs the bytecode to zo files.
Note: Eli replied at the same time I did. See his response for more details.

"All programs are interpreted". How?

A computer scientist will correctly explain that all programs are
interpreted and that the only question is at what level. --perlfaq
How are all programs interpreted?
A Perl program is a text file read by the perl program which causes the perl program to follow a sequence of actions.
A Java program is a text file which has been converted into a series of byte codes which are then interpreted by the java program to follow a sequence of actions.
A C program is a text file which is converted via the C compiler into an assembly program which is converted into machine code by the assembler. The machine code is loaded into memory which causes the CPU to follow a sequence of actions.
The CPU is a jumble of transistors, resistors, and other electrical bits which is laid out by hardware engineers so that when electrical impulses are applied, it will follow a sequence of actions as governed by the laws of physics.
Physicists are currently working out what makes those rules and how they are interpreted.
Essentially, every computer program is interpreted by something else which converts it into something else which eventually gets translated into how the electrons in your local neighborhood fly around.
EDIT/ADDED: I know the above is a bit tongue-in-cheek, so let me add a slightly less goofy addition:
Interpreted languages are where you can go from a text file to something running on your computer in one simple step.
Compiled languages are where you have to take an extra step in the middle to convert the language text into machine- or byte-code.
The latter can easily be easily be converted into the former by a simple transformation:
Make a program called interpreted-c, which can take one or more C files and can run a program which doesn't take any arguments:
#!/bin/sh
MYEXEC=/tmp/myexec.$$
gcc -o $MYEXEC ${1+"$#"} && $MYEXEC
rm -f $MYEXEC
Now which definition does your C program fall into? Compare & contrast:
$ perl foo.pl
$ interpreted-c foo.c
Machine code is interpreted by the processor at runtime, given that the same machine code supplied to a processor of a certain arch (x86, PowerPC etc), should theoretically work the same regardless of the specific model's 'internal wiring'.
EDIT:
I forgot to mention that an arch may add new instructions for things like accessing new registers, in which case code written to use it won't work on older processors in the range. Much like when you try to use an old version of a library and then try to use capabilities only found in newer libraries.
Example: many Linux distros are released as 686 only, despite the fact it's in the 'x86 family'. This is due to the use of new instructions.
My first thought was too look inside the CPU — see below — but that's not right. The answer is much much simpler than that.
A high-level description of a CPU is:
1. execute the current op
2. grab the next op
3. goto 1
Compare it to Perl's interpreter:
while ((PL_op = op = op->op_ppaddr(aTHX))) {
}
(Yeah, that's the whole thing.)
There can be no doubt that the CPU is an interpreter.
It just goes to show how useless it is to classify something is interpreted or not.
Original answer:
Even at the CPU level, programs get rewritten into simpler instructions to allow the CPU to execute more them more quickly. This is done by changing the order in which they are executed and executing them in parallel. For example, Intel's Hyperthreading.
Even deeper, each instruction is considered a program of its own, one that routes electronic signals. See microcode.
The Levels of interpretions are really easy to explain:
2: Runtimelanguage (CLR, Java Runtime...) & Scriptlanguage (Python, Ruby...)
1: Assemblies
0: Binary Code
Edit: I changed the level of Scriptinglanguages to the same level of Runtimelanguages. Thank's for the hint. :-)
I can write a Game Boy interpreter that works similarly to how the Java Virtual Machine works, treating the z80 machine instructions as byte code. Assuming the original was written in C1, does that mean C suddenly became an interpreted language just because I used it like one?
From another angle, gcc can compile C into machine code for a number of different processors. There's no reason the target machine has to be the same as the machine you're compiling on. In fact, this is a common way to compile C code for AVRs and other microcontrollers.
As a matter of abstraction, the compiler's job is to translate flat text into a structure, then translate that structure into something that can be executed somewhere. Whatever is doing the execution may have its own levels of breaking out the structure before really executing it.
A lot of power becomes available once you start thinking along these lines.
A good book on this is Structure and Interpretation of Computer Programs. Even if you only get through the first chapter (or half of the first chapter), I think you'll learn a lot.
1 I think most Game Boy stuff was hand coded ASM, but the principle remains.