What does assignment mean to a C11 atomic?

For example,
atomic_int test(void)
{
    atomic_int tmp = ATOMIC_VAR_INIT(14);
    tmp = 47;                      // Looks like atomic_store
    atomic_int mc;                 // Probably just uninitialised data
    memcpy(&mc, &tmp, sizeof(mc)); // Probably equivalent to a copy
    tmp = mc + 4;                  // Arithmetic
    return tmp;                    // A copy - perhaps load then store
}
Clang is happy with all this. I've read section 7.17 of the standard, and it says a lot about the memory model and the defined functions (init, store, load etc) but doesn't say anything about the usual operations (+, = etc).
Also of interest is the behaviour of passing struct wot { atomic_int value; } to functions.
I would like to believe that assignment behaves identically to an atomic load then store using memory_order_seq_cst.
Even more optimistically, I would like to believe that struct assignment, passing to a function, returning from a function, and even memcpy also behave identically to carefully copying the bit pattern across under memory_order_seq_cst.
I can't find any supporting evidence for either belief in the standard, though. There's definitely a chance that assignment and memcpy of atomic primitives are undefined behaviour.
How should primitive operations on atomic primitives behave?
Thanks!

Operations on objects that are _Atomic qualified (and atomic_int is just another way of writing that) are guaranteed to have sequential consistency. You find that mentioned at the end of the semantics section for each of the operators. (And maybe the mention for assignment is missing.)
Your code is not correct in one place: the memcpy is undefined (the size of the atomic representation might not agree with that of the underlying type), although it will probably work on most architectures. (Initialization correctly uses the ATOMIC_VAR_INIT macro, as 7.17.2.1 requires.)
Also the line
tmp = mc + 4; // Arithmetic
doesn't do what your comment claims. This is not arithmetic on an atomic object, but a load followed by an ordinary addition. More interesting would be
mc += 4; // Arithmetic
which is an atomic operation with sequential consistency.
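Putting these points together in code (a minimal sketch, not normative wording; the comments state the equivalences described above):

#include <stdatomic.h>

void demo(void)
{
    atomic_int a = ATOMIC_VAR_INIT(0);
    int x;

    a = 47;     // same as atomic_store(&a, 47): one seq_cst store
    x = a;      // same as x = atomic_load(&a): one seq_cst load
    a += 4;     // same as atomic_fetch_add(&a, 4): one atomic read-modify-write
    a = a + 4;  // NOT one atomic operation: a seq_cst load, an ordinary add,
                //   then a seq_cst store; another thread can intervene between them
    (void)x;
}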

Related

Why is "reduce" an infix operator in Chapel?

According to this manual page, we can use reduce to perform reduction like
summation (+):
var a = (+ reduce A) / num;
var b = + reduce abs(A);
var c = sqrt(+ reduce A**2);
and maximum value/location:
var (maxVal, maxLoc) = maxloc reduce zip(A, A.domain);
Here, Chapel defines reduce to be an infix operator rather than a function (e.g., reduce(A, +)). IMHO, the latter form seems a bit more readable because the arguments are always delimited by parentheses. So I am wondering whether there is some reason for this choice (e.g., to simplify some parallel syntax) or whether it is just a matter of history (convention).
I'd say the answer is a matter of history / convention. A lot of Chapel's array and domain features were heavily inspired by the ZPL language from the University of Washington, and I believe this syntax was taken reasonably directly from ZPL.
At the time, we didn't have a notion of passing things like functions and operators around in Chapel, which is probably one of the reasons that we didn't consider more of a function-based approach. (Even now, first-class function support in Chapel is still somewhat in its infancy, and I don't believe we have a way to pass operators around).
I'd also say that Chapel is a language that generally favors syntax for key patterns rather than taking more of a "make everything look like a function / method call" approach (e.g., ranges are supported via a literal syntax and several key operators rather than using an object type with methods).
None of this is to say that the choice was obviously right or couldn't be reconsidered.

What is the difference in atomic_load() and assignment?

I am working on a project that deals with lots of atomic operations. Until now I didn't know about atomic_load() and relied only on the assignment operator to get the value of an atomic type, and I haven't seen an error despite a lot of testing. Those atomic types are changed by multiple processes and threads via atomic_compare_exchange_strong_explicit(), so they need the old value every time, and that's where I always did oldValue = <atomic_type_variable>, and it always works fine.
Is that just by chance? Should I prefer using atomic_load()?
foo = atomic_var is just a shortcut syntax for foo = atomic_load(&atomic_var);
Which itself is a shortcut for foo = atomic_load_explicit(&atomic_var, memory_order_seq_cst). The explicit form has a use-case when you want to use an ordering weaker than the default seq_cst.
The main reason for using atomic_load explicitly in your source code is probably to remind human readers that a variable or pointer is atomic. Or maybe as a part of a macro, using atomic_load(&(macro_input)) would create a compile-time error for a non-atomic pointer.
As a "generic" function, you can't take a normal function-pointer to it.
Its existence may be just to make it easier to write the language standard, and explain everything in terms of functions.
It's not the actual assignment that's key here; it's evaluating the atomic variable in an rvalue context (reading its value as part of an expression, as you typically find on the right-hand side of an =). printf("%d\n", my_atomic_var); is also equivalent to atomic_load.
And BTW, the same thing holds for atomic_var = foo; being exactly the same as atomic_store_explicit with mo_seq_cst. Here it is assignment that's key.
Other kinds of lvalue references to an atomic variable are different: the read-modify-write atomic_var++, for example, is equivalent to atomic_fetch_add(&atomic_var, 1).
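For example (a minimal sketch; counter and reader are invented names, and the relaxed load illustrates the weaker-ordering use-case mentioned above):

#include <stdatomic.h>
#include <stdio.h>

atomic_int counter = ATOMIC_VAR_INIT(0);

void reader(void)
{
    int a = counter;                        // implicit seq_cst load
    int b = atomic_load(&counter);          // identical to the line above
    int c = atomic_load_explicit(&counter,
                memory_order_relaxed);      // weaker: still atomic, but imposes no ordering
    printf("%d %d %d\n", a, b, c);
}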

Is it possible to have a compiler which optimizes a = func(a)? [closed]

Say I have an object of type A. Consider this case for any function of type A -> A (i.e., one that takes an object of type A and returns another object of type A):
foo = func(foo)
Here, the simplest case would be for the result of func(foo) to be copied into foo.
Is it possible to optimize this so that foo gets modified in-place by func?
There are no constraints on the language used. What I want to know is what constraints and properties the language must have to enable such an optimization. Are there any existing languages which perform such an optimization?
Example(in pseudo code):
type Matrix = List<List<int>>

Matrix rotate90Deg(Matrix x):
    Matrix result(x.columns, x.rows)  # Assume it has a constructor which takes as args the num of rows and num of cols.
    for (int i = 0; i < x.rows; i++):
        for (int j = 0; j < x.columns; j++):
            result[i][j] = x[j][i]
    return result
Matrix a = [[1,2,3],[4,5,6],[7,8,9]]
a = rotate90Deg(a)
Here, is it possible to optimize the code so that it doesn't allocate memory for a new matrix (result), and instead just modifies the original matrix passed in?
First of all, you have to realize that some operations inherently cannot be computed in-place. Matrix-matrix multiplication is an example of this, and rotate90Deg would fall under this category, since such an operation is actually a matrix multiplication by an appropriate permutation matrix.
Now as for your example, you actually coded up a matrix transpose function. Matrix transpose can be done in-place, since you are swapping pairs of numbers, but I doubt that any compiler can automatically detect this and optimize it for you. Indeed, there are many, many tricks one can do to optimize matrix transpose to be cache-friendly and gain huge performance increases. Nevertheless, with a naive implementation, you will almost certainly end up with something very similar to what Aditya Kumar describes in his answer.
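For illustration, here is a minimal sketch (in C, with an invented fixed size N for brevity) of the in-place transpose a programmer can write by hand but a compiler is unlikely to derive automatically:

#define N 3

/* Swap each pair (i,j)/(j,i) above the diagonal exactly once,
 * so no second matrix is ever allocated. */
void transpose_inplace(int m[N][N])
{
    for (int i = 0; i < N; i++) {
        for (int j = i + 1; j < N; j++) {
            int tmp = m[i][j];
            m[i][j] = m[j][i];
            m[j][i] = tmp;
        }
    }
}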
As I have foreshadowed by using the word "naive" earlier, programmers can coax the compiler to inline lots and lots of things in extremely optimized ways through advanced templating and other meta-programming techniques. (At least in C++, and maybe other languages that allow you to overload operator =.) For anyone interested in a case study of how this is done and what is involved, take a look at the Eigen matrix library, and how it handles a simple operation like u = v + w; where the three variables are all matrices of floats. Following is a brief overview of the key points.
A naive implementation would overload operator+ to return a temporary and operator= to copy that temporary to the result. Of course, in C++11 it is pretty easy to avoid the final copy during assignment by way of move constructors, but you will still have unnecessary temporaries if you had something a little more complex with multiple operators on the right hand side like u = 3.15f * u.transposed() + 5.0f; since each operator/method would return a temporary, and that temporary would have to be looped over in order to process the next operator.
Long story short, what Eigen does is rather than perform each operation when the corresponding function call occurs, the calls return a templated functor of sorts which merely describes the operation that needs to take place, and all the actual work ends up happening in operator =, thus enabling the compiler to emit a single, inlined loop for traversing the data only once and doing the operation truly in-place.
Yes, it is possible, and this optimization is provided by at least C++ (via inlining). To explain the optimization a little bit, consider this example:
foo_t foo;
foo = func(foo); // #1

foo_t func(foo_t foo1) {
    foo_t new_foo;
    // operate on new_foo by using foo1
    return new_foo;
}
There are three instances of foo_t being made:
foo is copied and passed as foo1 to func.
new_foo is created.
new_foo is assigned to foo by copying the contents of new_foo into foo.
All three copies can be eliminated provided some invariants hold:
foo (the argument passed to the function) is never used later with its original value. This is equivalent to saying that foo is 'dead' at line #1, which is established here because foo is reassigned.
The lifetime of new_foo inside func does not extend beyond func itself. This is also established here: new_foo is created on the stack, and stack objects live only as long as the function that created them.
In C++ this can be achieved by inlining the function func. After inlining, the code will basically look like this:
foo_t foo;
foo_t new_foo;
// operate on new_foo by using foo
foo = new_foo;
Although C++ provides inlining as a language feature, almost any optimizing compiler does inlining these days.
Now it depends on what kind of operations you perform on new_foo and foo whether this extra new_foo will be optimized away or not. For some data types it is trivial: the compiler can do copy propagation followed by dead-code elimination to remove new_foo completely.
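As a concrete illustration (a hypothetical C sketch; foo_t, func, and the increment body are invented for the example), the chain of inlining, copy propagation, and dead-code elimination effectively turns the first function into the second:

typedef struct { int data[16]; } foo_t;

/* The A -> A function as written: the caller does foo = func(foo). */
foo_t func(foo_t in)
{
    foo_t out;
    for (int i = 0; i < 16; i++)
        out.data[i] = in.data[i] + 1; /* operate on out using in */
    return out;
}

/* What the optimizer can reduce the call site to, once it proves the
 * original foo is dead after the call: an in-place update. */
void func_inplace(foo_t *foo)
{
    for (int i = 0; i < 16; i++)
        foo->data[i] += 1;
}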

Constants in MATLAB

I've come into ownership of a bunch of MATLAB code and have noticed a bunch of "magic numbers" scattered about the code. Typically, I like to make those constants in languages like C, Ruby, PHP, etc. When Googling this problem, I found that the "official" way of having constants is to define functions that return the constant value. Seems kludgey, especially because MATLAB can be finicky when allowing more than one function per file.
Is this really the best option?
I'm tempted to use / make something like the C Preprocessor to do this for me. (I found that something called mpp was made by someone else in a similar predicament, but it looks abandoned. The code doesn't compile, and I'm not sure if it would meet my needs.)
Matlab has constants now. The newer (R2008a+) "classdef" style of Matlab OOP lets you define constant class properties. This is probably the best option if you don't require back-compatibility to old Matlabs. (Or, conversely, is a good reason to abandon back-compatibility.)
Define them in a class.
classdef MyConstants
    properties (Constant = true)
        SECONDS_PER_HOUR = 60*60;
        DISTANCE_TO_MOON_KM = 384403;
    end
end
Then reference them from any other code using dot-qualification.
>> disp(MyConstants.SECONDS_PER_HOUR)
3600
See the Matlab documentation for "Object-Oriented Programming" under "User Guide" for all the details.
There are a couple minor gotchas. If code accidentally tries to write to a constant, instead of getting an error, it will create a local struct that masks the constants class.
>> MyConstants.SECONDS_PER_HOUR
ans =
3600
>> MyConstants.SECONDS_PER_HOUR = 42
MyConstants =
SECONDS_PER_HOUR: 42
>> whos
  Name             Size            Bytes  Class     Attributes
  MyConstants      1x1               132  struct
  ans              1x1                 8  double
But the damage is local. And if you want to be thorough, you can protect against it by calling the MyConstants() constructor at the beginning of a function, which forces Matlab to parse it as a class name in that scope. (IMHO this is overkill, but it's there if you want it.)
function broken_constant_use
    MyConstants(); % "import" to protect assignment
    MyConstants.SECONDS_PER_HOUR = 42 % this bug is a syntax error now
The other gotcha is that classdef properties and methods, especially statics like this, are slow. On my machine, reading this constant is about 100x slower than calling a plain function (22 usec vs. 0.2 usec, see this question). If you're using a constant inside a loop, copy it to a local variable before entering the loop. If for some reason you must use direct access of constants, go with a plain function that returns the value.
For the sake of your sanity, stay away from the preprocessor stuff. Getting that to work inside the Matlab IDE and debugger (which are very useful) would require deep and terrible hacks.
I usually just define a variable in UPPER_CASE and place it near the top of the file. But you have to take on the responsibility of not changing its value.
Otherwise you can use MATLAB classes to define named constants.
MATLAB doesn't have an exact const equivalent. I recommend NOT using global for constants - for one thing, you need to make sure they are declared everywhere you want to use them. I would create a function that returns the value(s) you want. You might check out this blog post for some ideas.
You might find some of the answers to How do I create enumerated types in MATLAB? useful. But in short, no, there is no one-line way of specifying a variable whose value shouldn't change after its initial setting in MATLAB.
Any way you do it, it will still be somewhat of a kludge. In past projects, my approach to this was to define all the constants as global variables in one script file, invoke the script at the beginning of program execution to initialize the variables, and include "global MYCONST;" statements at the beginning of any function that needed to use MYCONST. Whether or not this approach is superior to the "official" way of defining a function to return a constant value is a matter of opinion that one could argue either way. Neither way is ideal.
My way of dealing with constants that I want to pass to other functions is to use a struct:
% Define constants
params.PI = 3.1416;
params.SQRT2 = 1.414;
% Call a function which needs one or more of the constants
myFunction( params );
It's not as clean as C header files, but it does the job and avoids MATLAB globals. If you wanted the constants all defined in a separate file (e.g., getConstants.m), that would also be easy:
params = getConstants();
Don't call a constant using myClass.myconst without creating an instance first! Unless speed is not an issue. I was under the impression that the first call to a constant property would create an instance and then all future calls would reference that instance, (Properties with Constant Values), but I no longer believe that to be the case. I created a very basic test function of the form:
tic;
for n = 1:N
    a = myObj.field;
end
t = toc;
With classes defined like:
classdef TestObj
    properties
        field = 10;
    end
end
or:
classdef TestHandleObj < handle
    properties
        field = 10;
    end
end
or:
classdef TestConstant
    properties (Constant)
        field = 10;
    end
end
For different cases of objects, handle-objects, nested objects etc (as well as assignment operations). Note that these were all scalars; I didn't investigate arrays, cells or chars. For N = 1,000,000 my results (for total elapsed time) were:
Access(s)   Assign(s)   Type of object/call
0.0034      0.0042      'myObj.field'
0.0033      0.0042      'myStruct.field'
0.0034      0.0033      'myVar'                    // Plain old workspace evaluation
0.0033      0.0042      'myNestedObj.obj.field'
0.1581      0.3066      'myHandleObj.field'
0.1694      0.3124      'myNestedHandleObj.handleObj.field'
29.2161     -           'TestConstant.const'       // Call directly to class (supposed to be faster)
0.0034      -           'myTestConstant.const'     // Create an instance of TestConstant
0.0051      0.0078      'TestObj > methods'        // Calls get and set methods that loop internally
0.1574      0.3053      'TestHandleObj > methods'  // get and set methods (internal loop)
I also created a Java class and ran a similar test:
12.18       17.53       'jObj.field > in matlab for loop'
0.0043      0.0039      'jObj.get and jObj.set loop N times internally'
The overhead in calling the Java object is high, but within the object, simple access and assign operations happen as fast as regular matlab objects. If you want reference behavior to boot, Java may be the way to go. I did not investigate object calls within nested functions, but I've seen some weird things. Also, the profiler is garbage when it comes to a lot of this stuff, which is why I switched to manually saving the times.
For reference, the Java class used:
public class JtestObj {
    public double field = 10;

    public double getMe() {
        double N = 1000000;
        double val = 0;
        for (int i = 1; i < N; i++) {
            val = this.field;
        }
        return val;
    }

    public void setMe(double val) {
        double N = 1000000;
        for (int i = 1; i < N; i++) {
            this.field = val;
        }
    }
}
On a related note, here's a link to a table of NIST constants: ascii table and a matlab function that returns a struct with those listed values: Matlab FileExchange
I use a script with simple constants in capitals and include the script in other scripts that need them.
LEFT = 1;
DOWN = 2;
RIGHT = 3;
% etc.
I do not mind about these being not constant. If I write "LEFT = 3" then I would be plain stupid, and there is no cure against stupidity anyway, so I do not bother.
But I really hate the fact that this method clutters up my workspace with variables that I would never have to inspect. And I also do not like to use something like "turn(MyConstants.LEFT)" because this makes longer statements a zillion characters wide, making my code unreadable.
What I would need is not a variable but the possibility to have real pre-compiler constants. That is: strings that are replaced by values just before executing the code. That is how it should be. A constant should not have to be a variable; it is only meant to make your code more readable and maintainable. MathWorks: PLEASE, PLEASE, PLEASE. It can't be that hard to implement this...

What is the difference between forward declaration and forward reference?

Forward declaration is, in my head, when you declare a function that isn't yet implemented, but is this incorrect? Do you have to look at the specific situation to decide whether to call it a "forward reference" or a "forward declaration"?
A forward declaration is the declaration of a method or variable before you implement and use it. The purpose of forward declarations is to save compilation time.
The forward declaration of a variable causes storage space to be set aside, so you can later set the value of that variable.
The forward declaration of a function is also called a "function prototype": a declaration statement that tells the compiler a function's return type, its name, and the types of its parameters. Compilers for languages such as C/C++ and Pascal store declared symbols (which include functions) in a lookup table and reference them as they come across them in your code. These compilers read your code sequentially, that is, top to bottom, so if you don't forward declare, the compiler discovers a reference to a symbol that isn't in the lookup table, and it raises an error because it doesn't know how to resolve it.
The forward declaration is a hint to the compiler that you have defined (filled out the implementation of) the function elsewhere.
For example:
int first(int x); // forward declaration of first
...
int first(int x) {
    if (x == 0) return 1;
    else return 2;
}
But, you ask, why don't we just have the compiler make two passes on every source file: the first one to index all the symbols inside, and the second to parse the references and look them up? According to Dan Story:
When C was created in 1972, computing resources were much more scarce and at a high premium -- the memory required to store a complex program's entire symbolic table at once simply wasn't available in most systems. Fixed storage was also expensive, and extremely slow, so ideas like virtual memory or storing parts of the symbolic table on disk simply wouldn't have allowed compilation in a reasonable timeframe... When you're dealing with magnetic tape where seek times were measured in seconds and read throughput was measured in bytes per second (not kilobytes or megabytes), that was pretty meaningful.

C++, while created almost 17 years later, was defined as a superset of C, and therefore had to use the same mechanism.

By the time Java rolled around in 1995, average computers had enough memory that holding a symbolic table, even for a complex project, was no longer a substantial burden. And Java wasn't designed to be backwards-compatible with C, so it had no need to adopt a legacy mechanism. C# was similarly unencumbered.

As a result, their designers chose to shift the burden of compartmentalizing symbolic declaration back off the programmer and put it on the computer again, since its cost in proportion to the total effort of compilation was minimal.
In Java and C#, identifiers are recognized automatically from source files and read directly from dynamic library symbols. In these languages, header files are not needed for the same reason.
A forward reference is the opposite. It refers to the use of an entity before its declaration. For example:
int first(int x) {
    if (x == 0) return 1;
    return second(x-1); // forward reference to second
}

int second(int x) {
    if (x == 0) return 0;
    return first(x-1);
}
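In C specifically, this only compiles with a forward declaration of second in place (C99 and later reject the implicit declaration that the bare forward reference would require); a minimal fixed version:

int second(int x); // forward declaration: lets the single-pass compiler
                   // resolve the forward reference inside first

int first(int x) {
    if (x == 0) return 1;
    return second(x - 1);
}

int second(int x) {
    if (x == 0) return 0;
    return first(x - 1);
}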
Note that "forward reference" is used sometimes, though less often, as a synonym for "forward declaration."
From Wikipedia:
Forward Declaration
Declaration of a variable or function that is not yet defined; the definition can be seen later on.
Forward Reference
Similar to a forward declaration, but where the use of the variable or function appears first, with the definition following later.
Forward declarations are used to allow single-pass compilation of a language (C, Pascal).
If forward references are allowed without a forward declaration (Java, C#), a two-pass compiler is required.