What is the difference between Interprocedural and Intraprocedural analysis? - compiler-optimization

I have searched a lot to find the difference between interprocedural and intraprocedural analysis. As far as I understand, intraprocedural analysis is applied to a single procedure, whereas interprocedural analysis is applied to all procedures. This is still not clear to me; more specifically, how is interprocedural analysis applied to all procedures?
Can someone please give me an explanation for that?

Interprocedural (think Internet, a network of networks, vs. intranet, a single network) analysis is analysis that operates across multiple functions. For example, the following interprocedural optimization:
function main() {
return getConstant() * rand()
}
function getConstant() {
return 42
}
Could be inlined to:
function main() {
return 42 * rand()
}
And the following Intraprocedural optimization:
function getNrOfSecondsInWeek() {
secondsAMinute := 60
minutesAnHour := 60
hoursADay := 24
daysAWeek := 7
return secondsAMinute * minutesAnHour * hoursADay * daysAWeek;
}
Could be folded to a constant:
function getNrOfSecondsInWeek() {
return 604800;
}

Intraprocedural analysis is a mechanism for performing optimization (or other analysis) for each function in a compilation unit, using only the information available for that function and compilation unit.
Interprocedural analysis is a mechanism for performing optimization (or other analysis) across function and compilation unit boundaries.
More information here: https://www.ibm.com/docs/en/zos/2.3.0?topic=option-types-procedural-analysis
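
To make the distinction concrete, here is a minimal sketch in Python of the two kinds of pass over a toy expression format. The tuple-based IR and the names `fold`, `inline_calls`, and `program` are assumptions made up for this illustration; they are not how any real compiler represents code.

```python
# Toy IR (hypothetical): an int literal, ("mul", a, b), ("call", name),
# or ("opaque",) for a value unknown at compile time, such as rand().

def fold(expr):
    """Intraprocedural pass: fold constants using only this one expression."""
    if isinstance(expr, tuple) and expr[0] == "mul":
        a, b = fold(expr[1]), fold(expr[2])
        if isinstance(a, int) and isinstance(b, int):
            return a * b
        return ("mul", a, b)
    return expr  # literals, calls and opaque values are left untouched


def inline_calls(expr, program):
    """Interprocedural pass: replace each call with the callee's body.

    Needs the whole program; recursion between functions is not handled.
    """
    if isinstance(expr, tuple):
        if expr[0] == "call":
            return inline_calls(program[expr[1]], program)
        if expr[0] == "mul":
            return ("mul",
                    inline_calls(expr[1], program),
                    inline_calls(expr[2], program))
    return expr


program = {
    "getConstant": 42,
    "main": ("mul", ("call", "getConstant"), ("opaque",)),
    "getNrOfSecondsInWeek": ("mul", ("mul", ("mul", 60, 60), 24), 7),
}

# Looking at one function at a time is enough for the second example...
weekly = fold(program["getNrOfSecondsInWeek"])
# ...but not for main(), whose call cannot be resolved locally.
main_local = fold(program["main"])
# With access to the other function bodies, the call disappears.
main_global = fold(inline_calls(program["main"], program))
```

The intraprocedural pass never looks inside `program`; the interprocedural pass cannot work without it. That dependency on other functions' bodies is the practical difference.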

Has anyone used Dijkstra's algorithm in OPL?

I have a model for a mining problem. I am working on adding into the model to use the shortest path inside a mine(open pit) for hauling ore and waste. For this, I was thinking of Dijkstra's algorithm. I could not find any example of the use of Dijkstra's algorithm in OPL. Has anyone done it before and can you share some ideas, please.
If you need to write Dijkstra's algorithm itself, then Daniel is right and you'd rather use the scripting part. If instead you need a shortest path within an existing OPL model, you could use the following shortest path example:
.mod
tuple edge
{
key int o;
key int d;
int weight;
}
{edge} edges=...;
{int} nodes={i.o | i in edges} union {i.d | i in edges};
int st=1; // start
int en=8; // end
dvar int obj; // distance
dvar boolean x[edges]; // do we use that edge ?
minimize obj;
subject to
{
obj==sum(e in edges) x[e]*e.weight;
forall(i in nodes)
sum(e in edges:e.o==i) x[e]
-sum(e in edges:e.d==i) x[e]
==
((i==st)?1:((i==en)?(-1):0));
}
{edge} shortestPath={e | e in edges : x[e]==1};
execute
{
writeln(shortestPath);
}
.dat
edges=
{
<1,2,9>,
<1,3,9>,
<1,4,8>,
<1,10,18>,
<2,3,3>,
<2,6,6>,
<3,4,9>,
<3,5,2>,
<3,6,2>,
<4,5,8>,
<4,7,7>,
<4,9,9>,
<4,10,10>,
<5,6,2>,
<5,7,9>,
<6,7,9>,
<7,8,4>,
<7,9,5>,
<8,9,1>,
<8,10,4>,
<9,10,3>,
};
which gives
// solution (optimal) with objective 19
{<1 4 8> <4 7 7> <7 8 4>}
If you have a problem that can be solved using Dijkstra's algorithm then it seems a bit of overkill to use OPL or CPLEX to solve it. You could code up the algorithm in any programming language and use it from there. I guess that is why you don't find any examples.
If you still want to implement in OPL then use a scripting (execute) or a main block. The scripting code you can provide there is a superset of JavaScript, so you can implement Dijkstra's algorithm in JavaScript and put it there.
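
For comparison, here is a minimal sketch of Dijkstra's algorithm in plain Python, run on the same edge data as the .dat section above. It assumes, as the OPL model does, that each edge is directed from `o` to `d`; the function name `dijkstra` and the variable names are illustrative choices only.

```python
import heapq

# Same (origin, destination, weight) tuples as the .dat section above.
EDGES = [
    (1, 2, 9), (1, 3, 9), (1, 4, 8), (1, 10, 18), (2, 3, 3), (2, 6, 6),
    (3, 4, 9), (3, 5, 2), (3, 6, 2), (4, 5, 8), (4, 7, 7), (4, 9, 9),
    (4, 10, 10), (5, 6, 2), (5, 7, 9), (6, 7, 9), (7, 8, 4), (7, 9, 5),
    (8, 9, 1), (8, 10, 4), (9, 10, 3),
]


def dijkstra(edges, start, end):
    """Return (distance, path) from start to end, or (None, []) if unreachable."""
    graph = {}
    for o, d, w in edges:
        graph.setdefault(o, []).append((d, w))

    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == end:
            break
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, w in graph.get(node, []):
            new_cost = cost + w
            if new_cost < dist.get(nxt, float("inf")):
                dist[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))

    if end not in dist:
        return None, []
    # Walk the predecessor links back from the end node.
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[end], path[::-1]


distance, path = dijkstra(EDGES, 1, 8)
print(distance, path)  # 19 [1, 4, 7, 8]
```

The result matches the OPL solution above: objective 19 along edges 1 to 4, 4 to 7 and 7 to 8.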

chisel3 arithmetic operations on Doubles

I am having trouble with arithmetic operations on doubles in Chisel. The examples I have seen use only the following types: Int, UInt, SInt.
I saw here that arithmetic operations were described only for SInt and UInt. What about Double?
The output of my code is a Double, so I tried to declare my output out as Double, but I didn't know how.
Is there a way to declare an input and an output of type Double in a Bundle?
Here is my code:
class hashfunc(val k:Int, val n: Int ) extends Module {
val a = k + k
val io = IO(new Bundle {
val b=Input(UInt(k.W))
val w=Input(UInt(k.W))
var out = Output(UInt(a.W))
})
val tabHash1 = new Array[Array[Double]](n)
val x = new ArrayBuffer[(Double, Data)]
val tabHash = new Array[Double](tabHash1.size)
for (ind <- tabHash1.indices){
var sum=0.0
for (ind2 <- 0 until x.size){
sum += ( x(ind2) * tabHash1(ind)(ind2) )
}
tabHash(ind) = ((sum + io.b) / io.w)
}
io.out := tabHash.reduce(_ + _)
}
When I compile the code, I get the following error:
code error
Thank you for your kind attention, looking forward to your responses.
Chisel does have a native FixedPoint type which may be of use. It is in the experimental package:
import chisel3.experimental.FixedPoint
There is also a project, DspTools, that has simulation support for Doubles. It has some nice features; for example, it allows modules to be parameterized on the numeric type (Complex, Double, FixedPoint, SInt), so that you can run simulations on Double to validate the desired mathematical behavior and then switch to a synthesizable number format that meets your precision criteria.
DspTools is an ongoing research project, and the team would appreciate feedback from outside users.
Operations on floating point numbers (Double in this case) are generally not supported directly for synthesis by HDLs. The reason is that, while addition, subtraction and multiplication of fixed point numbers are well defined, floating point hardware involves many design space trade-offs because it is a much more complex piece of hardware.
That is to say, a high performance floating point unit is a significant piece of hardware in its own right and would be time shared in any realistic design.
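
To illustrate why fixed point arithmetic is "well defined" in the sense above, here is a minimal Python sketch that represents values as plain integers with an implied binary point. This is not Chisel's FixedPoint API; the helper names and the choice of 8 fractional bits are assumptions for illustration only.

```python
FRAC_BITS = 8  # assumed number of fractional bits


def to_fixed(x, frac_bits=FRAC_BITS):
    """Encode a real number as an integer scaled by 2**frac_bits."""
    return round(x * (1 << frac_bits))


def to_float(q, frac_bits=FRAC_BITS):
    """Decode a fixed point integer back to a float, for inspection only."""
    return q / (1 << frac_bits)


def fixed_add(a, b):
    """Addition is ordinary integer addition when both scales match."""
    return a + b


def fixed_mul(a, b, frac_bits=FRAC_BITS):
    """The product carries 2 * frac_bits fractional bits, so shift back down."""
    return (a * b) >> frac_bits


a = to_fixed(1.5)    # 384
b = to_fixed(2.25)   # 576
print(to_float(fixed_add(a, b)))  # 3.75
print(to_float(fixed_mul(a, b)))  # 3.375
```

Every operation is an integer add, multiply or shift, which is why fixed point maps onto hardware so directly. Floating point additionally needs exponent alignment, normalization and rounding logic.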

How to test generic performance with whole module optimization

The discussion in WWDC 2015 Session 409, near the 18-minute mark, leads me to believe that generics can be optimized through Generic Specialization by enabling whole module optimization mode. Unfortunately my tests, which I'm not confident in, revealed nothing useful.
I ran some very simple tests between the following two methods to see if the performance was similar:
func genericMax<T : Comparable>(x:T, y:T) -> T {
return y > x ? y : x
}
func intMax(x:Int, y:Int) -> Int {
return y > x ? y : x
}
Simple XCTest:
func testPerformanceExample() {
self.measureBlock {
let x: Int = Int(arc4random_uniform(9999))
let y: Int = Int(arc4random_uniform(9999))
for _ in 0...1000000 {
// let _ = genericMax(x, y: y)
let _ = intMax(x, y: y)
}
}
}
What happened
Without optimization the following tests were reasonably different:
genericMax: 0.018 sec
intMax: 0.005 sec
However, with Whole Module Optimization the results still weren't similar:
genericMax: 0.014 sec
intMax: 0.004 sec
What I Expected
With whole module optimization enabled I expected similar times between the two methods calls. This leads me to believe that my test is flawed.
Question
Assuming my tests are flawed or poor, how could I better measure how Whole Module Optimization mode optimizes generics through Generic Specialization?
Your tests are flawed because they measure test performance, not app performance. Tests live in a separate executable file, and they do not benefit from whole-module optimizations themselves. Because of this, your test always uses the generic, non-specialized implementation even in places where your program doesn't.
If you want to see that whole-module optimizations are enabled in your executable, you need to test a function from your executable. (You also need to make sure that your tests either use the Release build, or that you have WMO enabled in the debug build.)
Adding this function to the executable:
func genericIntMax(x: Int, y: Int) -> Int {
return genericMax(x, y: y)
}
and using it in the tests in place of genericMax, I get identical performance. (Note that this is not really whole module optimization, since the two functions live in the same file; but it shows the difference between app code and test code when it comes to optimizations.)

Translating snippet to functional from imperative [closed]

I have the following Scala snippet. In order to solve my given problem, I "cheat" a little and use a var -- essentially a non-final, mutable data type. Its value is updated at each iteration through the loop. I've spent quite a bit of time trying to figure out how to do this using only recursion, and immutable data types and lists.
Original snippet:
def countChange_sort(money: Int, coins: List[Int]): Int =
if (coins.isEmpty || money < 0)
0
else if (coins.tail.isEmpty && money % coins.head != 0) {
0
} else if (coins.tail.isEmpty && money % coins.head == 0 || money == 0) {
1
} else {
-- redacted --
}
}
Essentially, are there any basic techniques I can use to eliminate the i and especially the accumulating cnt variables?
Thanks!!
There are lots of different ways to solve problems in functional style. Often you start by analysing the problem in a different way than you would when designing an imperative algorithm, so writing an imperative algorithm and then "converting" it to a functional one doesn't produce very natural functional algorithms (and you often miss out on lots of the potential benefits of functional style). But when you're an experienced imperative programmer just starting out with functional programming, that's all you've got, and it is a good way to begin getting your head around the new concepts. So here's how you can approach "converting" such a function as the one you wrote to functional style in a fairly uncreative way (i.e. not coming up with a different algorithm).
Let's just consider the else expression, since the rest is fine.
Functional style has no loops, so if you need to run a block of code a number of times (the body of the imperative loop), that block of code must be a function. Often the function is a simple non-recursive one, and you call a higher-order function such as map or fold to do the actual recursion, but I'm going to presume you need the practice thinking recursively and want to see it explicitly. The loop condition is calculated from the quantities you have at hand in the loop body, so we just have the loop-replacement function recursively invoke itself depending on exactly the same condition:
} else {
var cnt = 0
var i = 0
def loop(????) : ??? = {
if (money - (i * coins.head) > 0) {
cnt += countChange_sort(money - (i * coins.head), coins.tail)
i = i + 1
loop(????)
}
}
loop(????)
cnt
}
Information is only communicated to a function through its input arguments or through its definition, and communicated from a function through its return value.
The information that enters a function through its definition is constant when the function is created (either at compile time, or at runtime when the closure is created). Doesn't sound very useful for the information contained in cnt and i, which needs to be different on each call. So they obviously need to be passed in as arguments:
} else {
var cnt = 0
var i = 0
def loop(cnt : Int, i : Int) : ??? = {
if (money - (i * coins.head) > 0) {
cnt += countChange_sort(money - (i * coins.head), coins.tail)
i = i + 1
loop(cnt, i)
}
}
loop(cnt, i)
cnt
}
But we want to use the final value of cnt after the function call. If information is only communicated from loop through its return value, then we can only get the last value of cnt by having loop return it. That's pretty easy:
} else {
var cnt = 0
var i = 0
def loop(cnt : Int, i : Int) : Int = {
if (money - (i * coins.head) > 0) {
cnt += countChange_sort(money - (i * coins.head), coins.tail)
i = i + 1
loop(cnt, i)
} else {
cnt
}
}
cnt = loop(cnt, i)
cnt
}
coins, money, and countChange_sort are examples of information "entering a function through its definition". coins and money are even "variable", but they're constant at the point when loop is defined. If you wanted to move loop out of the body of countChange_sort to become a stand-alone function, you would have to make coins and money additional arguments; they would be passed in from the top-level call in countChange_sort, and then passed down unmodified in each recursive call inside loop. That would still make loop dependent on countChange_sort itself though (as well as the arithmetic operators * and -!), so you never really get away from having the function know about external things that don't come into it through its arguments.
Looking pretty good. But we're still using assignment statements inside loop, which isn't right. However all we do is assign new values to cnt and i and then pass them to a recursive invocation of loop, so those assignments can be easily removed:
} else {
var cnt = 0
var i = 0
def loop(cnt : Int, i : Int) : Int = {
if (money - (i * coins.head) > 0) {
loop(cnt + countChange_sort(money - (i * coins.head), coins.tail), i + 1)
} else {
cnt
}
}
cnt = loop(cnt, i)
cnt
}
Now there are some obvious simplifications, because we're not really doing anything at all with the mutable cnt and i other than initialising them, and then passing their initial value, assigning to cnt once and then immediately returning it. So we can (finally) get rid of the mutable cnt and i entirely:
} else {
def loop(cnt : Int, i : Int) : Int = {
if (money - (i * coins.head) > 0) {
loop(cnt + countChange_sort(money - (i * coins.head), coins.tail), i + 1)
} else {
cnt
}
}
loop(0, 0)
}
And we're done! No side effects in sight!
Note that I haven't thought much at all about what your algorithm actually does (I have made no attempt to even figure out whether it's actually correct, though I presume it is). All I've done is straightforwardly apply the general principle that information only enters a function through its arguments and leaves through its return values; all mutable state accessible to an expression is really extra hidden inputs and hidden outputs of the expression. Making them immutable, explicit inputs and outputs then allows you to prune away the unneeded ones. For example, i doesn't need to be included in the return value from loop because it's not actually needed by anything, so the conversion to functional style has made it clear that it's purely internal to loop, whereas you had to actually read the code of the imperative version to deduce this.
cnt and i are what is known as accumulators. Accumulators aren't anything special, they're just ordinary arguments; the term only refers to how they are used. Basically, if your algorithm needs to keep track of some data as it goes, you can introduce an accumulator parameter so that each recursive call can "pass forward" the data from what has been done so far. They often fill the role that local temporary mutable variables fill in imperative algorithms.
It's quite a common pattern for the return value of a recursive function to be the value of an accumulator parameter once it is determined that there's no more work left to do, as happens with cnt in this case.
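
The same conversion can be sketched outside Scala. The Python example below is a deliberately generic illustration, not a port of countChange_sort: `limit` and `term` are hypothetical stand-ins for the loop bound and the per-iteration work. It shows an imperative loop next to its accumulator-passing equivalent.

```python
def count_imperative(limit, term):
    """Imperative version: cnt and i are local mutable variables."""
    cnt = 0
    i = 0
    while limit - i > 0:
        cnt += term(i)
        i = i + 1
    return cnt


def count_recursive(limit, term):
    """Functional version: cnt and i become accumulator parameters."""
    def loop(cnt, i):
        if limit - i > 0:
            # "Assignments" become the arguments of the recursive call.
            return loop(cnt + term(i), i + 1)
        # No more work: the accumulator is the result.
        return cnt

    return loop(0, 0)


def square(i):
    return i * i


print(count_imperative(5, square))  # 30
print(count_recursive(5, square))   # 30
```

Note that Python does not eliminate tail calls, so the recursive form is limited by the interpreter's recursion depth; in Scala a self-recursive tail call like this one is typically compiled to a loop.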
Note that this sort of technique doesn't necessarily produce good functional code, but it makes it very easy to convert functions implemented using "local" mutable state to functional style. Pervasive non-local use of mutability, such as is typical of most traditional OO programs, is harder to convert like this; you can do it, but you tend to have to modify the entire program at once, and the resulting functions have large numbers of extra arguments (explicitly exposing all the hidden data-flow that was present in the original program).
I don't have any basic techniques to change the code you have specifically. However, here is a general tip for designing recursive algorithms:
Can you break the problem into sub-problems? In the money example, if you are trying to get to $10 with a $5, that's similar to the question of getting to $5 with a $5 (having already chosen the $5 once). Try to draw it out and make rules. You'll be surprised at how much more obviously correct your solution is.
Since nobody has answered your question, I will try to give you some hints:
What is a loop?
Traversing each element of a collection, stopping when a condition is met.
What can you do with recursion?
Traversing each element of a collection, stopping when a condition is met.
Start simple: write a method without vars which prints each element of a collection.
Then the rest becomes simple: look at your loop and what you are doing.
Instead of manipulating the variables directly (like i = i + 1), simply pass i + 1 to the recursive call of your method.
HTH

In what circumstances can a compiler change the execution order of programme statements?

If this is not a real question then feel free to close ;)
Not only the compiler can reorder execution (mostly for optimization), most modern processors do so, too. Read more about execution reordering and memory barriers.
The compiler can change the execution order of statements when it sees fit for optimization purposes, and when such changes wouldn't alter the observable behavior of the code.
A very simple example -
int func (int value)
{
int result = value*2;
if (value > 10)
{
return result;
}
else
{
return 0;
}
}
A naive compiler can generate code for this in exactly the sequence shown. First calculate "result" and return it only if the original value is larger than 10 (if it isn't, "result" would be ignored - calculated needlessly).
A sane compiler, though, would see that the calculation of "result" is only needed when "value" is larger than 10, so it may easily move the calculation "value*2" inside the first braces and only do it if "value" is actually larger than 10 (needless to say, the compiler doesn't really look at the C code when optimizing; it works at a lower level).
This is only a simple example. Much more complicated examples can be created. It is very possible that a C function would end up looking almost nothing like its C representation in compiled form, with aggressive enough optimizations.
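
The effect of that transformation can be sketched as two functions with the same observable behavior. Python stands in here for the compiler's before and after forms; the names `func_naive` and `func_optimized` are made up for this illustration, and a real compiler performs this on its intermediate representation rather than on source code.

```python
def func_naive(value):
    """Statements in source order: the multiplication always runs."""
    result = value * 2
    if value > 10:
        return result
    return 0


def func_optimized(value):
    """Multiplication moved into the only branch that uses it."""
    if value > 10:
        return value * 2
    return 0


# The reordering is legal because no caller can tell the two apart.
same = all(func_naive(v) == func_optimized(v) for v in range(-50, 50))
print(same)  # True
```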
Many compilers also move computations out of loops. This is usually called "loop-invariant code motion", a close relative of "common subexpression elimination". For example, if you had the following code:
for(int i=0; i<100; i++) {
x += y * i * 15;
}
the compiler would notice that y * 15 is invariant (its value doesn't change). So it would compute y * 15, stick the result in a register and change the loop statement to "x += r0 * i". This is kind of a contrived example, but you often see expressions like this when working with array indexes or any other base + offset type of situation.
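
The hoisting described above can be sketched as a before and after pair. Python is used only to show that the two forms compute the same value; `r0` plays the role of the register mentioned above, and the function names are illustrative assumptions.

```python
def loop_original(x, y):
    """The loop as written: y * i * 15 is recomputed on every iteration."""
    for i in range(100):
        x += y * i * 15
    return x


def loop_hoisted(x, y):
    """The invariant part y * 15 is computed once, before the loop."""
    r0 = y * 15
    for i in range(100):
        x += r0 * i
    return x


print(loop_original(0, 1))  # 74250
print(loop_hoisted(0, 1))   # 74250
```

With Python's arbitrary-precision integers the two forms always agree. For floating point values a compiler would need permission to reassociate the multiplication, because rounding can make the results differ.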