Do the risks caused by bypassing Perl's safe signals, for example as shown in the second timeout example in the DBI documentation, concern only the code that uses such bypassing?
The code in that example works hard to localize the change to just that section of code and any code called from it.
There is no 100% guarantee that no code outside the section that bypasses safe signals will be affected, because signals are no longer safe. In the example, the call being timed out is a DBI->connect. For most DBDs this will be implemented mostly in C, and unless that C code can handle being aborted and retried, you might find that some data structures internal to the DBD, or the libraries it uses, are left in an inconsistent state.
The chances of the example code going wrong are probably incredibly tiny. My personal anecdote on the issue: I had used traditional Perl signal handling for years before safe signals were introduced, and for a long time I never had a problem. I hadn't even been very cautious about what I did in my signal handlers. Then we hit a data set that actually did trigger memory corruption in about 1 out of every 100 runs. Just modifying the signal handlers to use better practices, similar to those in the example, eliminated our issues.
What does that even mean? By using unsafe signals, you can corrupt Perl's internals and Perl variables. It can also cause problems if a non-reentrant C library call is interrupted.
This can lead to SEGFAULTs and other problems, and those may only manifest themselves outside the block where the timeout is in effect.
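For concreteness, the kind of code being discussed looks roughly like the sketch below, in the spirit of the DBI documentation's second timeout example; the DSN, credentials and timeout value here are hypothetical. POSIX::sigaction installs a non-deferred ("unsafe") handler just for the duration of the connect and then restores the previous one, which is how the example localizes the bypass.

use strict;
use warnings;
use DBI;
use POSIX qw(SIGALRM);

my ($dsn, $user, $password) = ('dbi:SQLite:dbname=example.db', '', '');   # hypothetical

my $dbh;
my $error;
{
    # Install the handler with sigaction() so it is NOT deferred; this lets
    # the alarm interrupt C code inside the driver (the risky part).
    my $new_action = POSIX::SigAction->new(sub { die "connect timeout\n" });
    my $old_action = POSIX::SigAction->new;
    POSIX::sigaction(SIGALRM, $new_action, $old_action)
        or die "Error setting SIGALRM handler: $!\n";

    eval {
        alarm(10);                                     # hypothetical timeout in seconds
        $dbh = DBI->connect($dsn, $user, $password, { RaiseError => 1 });
        alarm(0);
    };
    $error = $@;
    alarm(0);                                          # make sure the alarm is really off

    # Restore the previous disposition so the unsafe handler stays local.
    POSIX::sigaction(SIGALRM, $old_action);
}
die "connect failed or timed out: $error" if $error;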
While learning about operating systems, I came across the topic of critical sections. To solve this problem, certain methods are provided, like semaphores, various software solutions, and so on. But I have a question: where does the code implementing these solutions originate? Programmers are never found writing such code in their programs. Suppose I write a simple program in C that calls printf; I never write any code for the critical section problem. The code is converted into low-level instructions and executed by the OS, which behaves as our obedient servant. So where does the code dealing with critical sections originate and fit in? Take a resource like the frame buffer as the critical section.
The OS kernel supplies such inter-thread synchronization mechanisms: mutexes, semaphores, events, critical sections, condition variables, etc. It has to, because the kernel needs to block threads that cannot proceed. Many languages provide convenient wrappers around such calls.
Your app accesses them, directly or indirectly, via system calls, i.e. interrupts that enter kernel mode and ask for such services.
In some cases, a short-term user-space spinlock may get plastered on top, but such code should defer to a system call if the spinner is not quickly satisfied.
In the case of C's printf, the relevant library (usually stdio) will make the calls to lock/unlock the I/O stream (assuming you have linked in a multithreaded version of the library).
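As a concrete illustration, here is a minimal sketch using POSIX threads (the counter and worker function are hypothetical). This is all the critical-section code an application programmer normally writes; the queueing, blocking and waking logic lives behind pthread_mutex_lock/unlock in the C library and, when a thread must actually sleep, in the kernel.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;                  /* the shared resource */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);        /* enter the critical section */
        counter++;
        pthread_mutex_unlock(&lock);      /* leave the critical section */
    }
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", counter);   /* always 200000 with the lock */
    return 0;
}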
Perl currently implements $SIG{__DIE__} in such a way that it will catch any error that occurs, even inside eval blocks. This has the really useful property that you can halt the code at the exact point where the error occurs, collect a stack trace of the actual error, wrap it up in an object, and then call die manually with this object as the parameter.
This abuse of $SIG{__DIE__} is deprecated. Officially, you are supposed to replace $SIG{__DIE__} with *CORE::GLOBAL::die. However, these two are NOT remotely equivalent. *CORE::GLOBAL::die is NOT called when a runtime error occurs! All it does is replace explicit calls to die().
I am not interested in replacing die.
I am specifically interested in catching runtime errors.
I need to ensure that any runtime error, in any function, at any depth, in any module, causes Perl to pass control to me so that I can collect the stack trace and rethrow. This needs to work inside an eval block -- one or more enclosing eval blocks may want to catch the exception, but the runtime error could be in a function without an enclosing eval, inside any module, from anywhere.
$SIG{__DIE__} supports this perfectly—and has served me faithfully for a couple of years or more—but the Powers that Be™ warn that this fantastic facility may be snatched away at any time, and I don't want a nasty surprise one day down the line.
Ideally, for Perl itself, they could create a new signal $SIG{__RTMERR__} for this purpose (switching signals is easy enough, for me anyway, as it's only hooked in one place). Unfortunately, my persuasive powers wouldn't lead an alcoholic to crack open a bottle, so assuming this will not happen, how exactly is one supposed to achieve this aim of catching runtime errors cleanly?
(For example, another answer here recommends Carp::Always, which … also hooks DIE!)
Just do it. I've done it. Probably everyone who's aware of this hook has done it.
It's Perl; it's still compatible going back decades. I interpret "deprecated" here to mean "please don't use this if you don't need it, ew, gross". But you do need it, and seem to understand the implications, so imo go for it. I seriously doubt an irreplaceable language feature is going away any time soon.
And release your work on CPAN so the next dev doesn't need to reinvent this yet again. :)
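For reference, the core of such a hook can be quite small. The following is only a sketch of the technique described in the question (the exception shape and field names are hypothetical), using Carp::longmess to capture the trace at the throw point and $^S to leave compile-time errors alone.

use strict;
use warnings;
use Carp ();

$SIG{__DIE__} = sub {
    my ($err) = @_;

    die $err if ref $err;            # already wrapped: don't wrap twice
    die $err unless defined $^S;     # compile-time error (require/use): leave it alone

    # Rethrow as an object carrying the stack trace from the throw point.
    die { message => $err, trace => Carp::longmess('died') };
};

# Usage: the runtime error below is caught by the eval as usual, but by the
# time it arrives, $@ already carries the trace collected where it happened.
eval { my $zero = 0; my $n = 1 / $zero; };
if (ref $@) {
    print "caught: $@->{message}";
    print $@->{trace};
}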
http://dlang.org/expression.html#AssertExpression
Regarding assert(0): "The optimization and code generation phases of compilation may assume that it is unreachable code."
The same documentation claims assert(0) is a 'special case', but there are several reasons that follow.
Can the D compiler optimize based on general assert-ions made in contracts and elsewhere?
(as if I needed another reason to enjoy the in{} and out{} constructs, but it certainly would make me feel a little more giddy to know that writing them could make things go fwoosh-ier)
In theory, yes; in practice, I don't think it does, especially since the asserts are killed before even getting to the optimizer with dmd -release. I'm not sure about gdc and ldc, but I think they share this portion of the code.
The spec's special-case reference, by the way, is that assert(0) is still present, in some form, with the -release compile flag. It is translated into an illegal instruction there (asm {hlt;} - non-kernel programs on x86 aren't allowed to use that, so it will segfault upon hitting it), whereas all other asserts are simply left out of the code entirely in -release mode.
GDC certainly does optimise based on asserts. The if-conditions make for much better code, even causing unnecessary code to disappear. Unfortunately, the way it is implemented at the moment, the entire assert can disappear in release build mode, so the compiler never sees the beneficial if-condition info and actually generates worse code in release than in debug mode. Ironic. I have to admit that I've only looked at this effect with if-conditions in asserts in the body; I haven't checked what effect in and out blocks have.
The in- and out-contract blocks can be turned off with a command-line switch, IIRC, so they are not even compiled; I think this possibly means the compiler doesn't even look at them. So this is another thing that might affect code generation, but I haven't looked at it.
There is a feature here that I would very much like to see: the truth values of assert conditions (after checking that there is no side-effect code in the expression for the assert condition) always being injected into the compiler as an assumption, just as if there had been an if statement, even in release mode. It would involve pretending you had just seen if ( xxx ), but with the actual code generation for the test suppressed in release mode, and with subsequent code feeling the beneficial effects of, say, known truth values, value-range limits and so on.
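To make the -release behaviour concrete, here is a small sketch (the function names are hypothetical): the bounds assert below is stripped entirely under -release, taking any optimisation hint with it, while the assert(0) survives as a halt instruction.

import std.stdio;

int inBounds(int[] data, size_t i)
{
    // Checked in debug builds; removed completely by -release.
    assert(i < data.length, "index out of range");
    return data[i];
}

int unreachable()
{
    // Kept even under -release, but as an illegal/halt instruction.
    assert(0, "should never get here");
}

void main()
{
    int[] data = [1, 2, 3];
    writeln(inBounds(data, 1)); // prints 2
}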
For example, if I implement some simple object caching, which method is faster?
1. return isset($cache[$cls]) ? $cache[$cls] : $cache[$cls] = new $cls;
2. return @$cache[$cls] ?: $cache[$cls] = new $cls;
I read somewhere @ takes significant time to execute (and I wonder why), especially when warnings/notices are actually being issued and suppressed. isset() on the other hand means an extra hash lookup. So which is better and why?
I do want to keep E_NOTICE on globally, both on dev and production servers.
I wouldn't worry about which method is FASTER. That is a micro-optimization. I would worry more about which is more readable code and better coding practice.
I would certainly prefer your first option over the second, as your intent is much clearer. Also, it's best to avoid edge-condition problems by always explicitly testing variables to make sure you are getting what you expect. For example, what if the class stored in $cache[$cls] is not of type $cls?
Personally, if I typically would not expect the index on $cache to be unset, then I would also put error handling in there rather than using ternary operations. If I could reasonably expect that that index would be unset on a regular basis, then I would make class $cls behave as a singleton and have your code be something like
return $cls::get_instance();
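A sketch of that singleton approach might look like the following (the class name is hypothetical); the class caches its own instance, so callers never touch $cache and never need isset() or @ at all.

<?php
class Widget
{
    private static $instance = null;

    public static function get_instance()
    {
        // Lazily create and cache the single instance.
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }

    private function __construct() {}
}

$widget = Widget::get_instance();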
The isset() approach is better. It is code that explicitly states the index may be undefined. Suppressing the error is sloppy coding.
According to the article 10 Performance Tips to Speed Up PHP, warnings take additional execution time, and the article also claims the @ operator is "expensive."
Cleaning up warnings and errors beforehand can also keep you from
using @ error suppression, which is expensive.
Additionally, the @ will not suppress the errors with respect to custom error handlers:
http://www.php.net/manual/en/language.operators.errorcontrol.php
If you have set a custom error handler function with
set_error_handler() then it will still get called, but this custom
error handler can (and should) call error_reporting() which will
return 0 when the call that triggered the error was preceded by an @.
If the track_errors feature is enabled, any error message generated by
the expression will be saved in the variable $php_errormsg. This
variable will be overwritten on each error, so check early if you want
to use it.
@ temporarily changes the error_reporting state, that's why it is said to take time.
If you expect a certain value, the first thing to do to validate it is to check that it is defined. If you have notices, it's probably because you're missing something. Using isset() is, in my opinion, good practice.
I ran timing tests for both cases, using hash keys of various lengths, also using various hit/miss ratios for the hash table, plus with and without E_NOTICE.
The results were: with error_reporting(E_ALL), the isset() variant was faster than the @ variant by some 20-30%. Platform used: command-line PHP 5.4.7 on OS X 10.8.
However, with error_reporting(E_ALL & ~E_NOTICE), the difference was within 1-2% for short hash keys, and up to 10% for longer ones (16 chars).
Note that the first variant executes two hash table lookups, whereas the variant with @ does only one lookup.
Thus, @ is inferior in all scenarios, and I wonder if there are any plans to optimize it.
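A timing test of this kind can be reproduced with something like the rough sketch below (cache-hit case only; the class name and iteration count are arbitrary, and error_reporting can be switched to E_ALL & ~E_NOTICE for the second scenario).

<?php
error_reporting(E_ALL);

class Foo {}

$cache = array('Foo' => new Foo());
$cls   = 'Foo';
$n     = 1000000;

// Variant 1: isset(), two hash lookups on a hit.
$start = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $obj = isset($cache[$cls]) ? $cache[$cls] : ($cache[$cls] = new $cls);
}
printf("isset(): %.3f s\n", microtime(true) - $start);

// Variant 2: @ suppression, a single lookup on a hit.
$start = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $obj = @$cache[$cls] ?: ($cache[$cls] = new $cls);
}
printf("@:       %.3f s\n", microtime(true) - $start);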
I think you have your priorities a little mixed up here.
First of all, if you want a real-world test of which is faster, load test them. As stated, though, suppressing will probably be slower.
The problem here is that if you have performance issues with regular code, you should upgrade your hardware or optimize the overall logic of your code, rather than preventing proper execution and error checking.
Suppressing errors to steal the tiniest fraction of a speed gain won't do you any favours in the long run. Especially if you think that this error may keep happening time and time again, and cause your app to run more slowly than if the error was caught and fixed.
I've been making some progress with audio programming for iPhone. Now I'm doing some performance tuning, trying to see if I can squeeze more out of this little machine. Running Shark, I see that a significant part of my cpu power (16%) is getting eaten up by objc_msgSend. I understand I can speed this up somewhat by storing pointers to functions (IMP) rather than calling them using [object message] notation. But if I'm going to go through all this trouble, I wonder if I might just be better off using C++.
Any thoughts on this?
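For illustration, IMP caching looks roughly like the sketch below (the Gain class, buffer and frame count are made up): the selector lookup is done once outside the loop, so each sample costs a plain C function call instead of a trip through objc_msgSend.

#import <Foundation/Foundation.h>

// Hypothetical per-sample processor.
@interface Gain : NSObject
- (float)processSample:(float)x;
@end

@implementation Gain
- (float)processSample:(float)x { return 0.5f * x; }
@end

int main(void) {
    @autoreleasepool {
        Gain *filter = [[Gain alloc] init];
        float buffer[512];
        for (int i = 0; i < 512; i++) buffer[i] = (float)i;

        // Resolve the method once, outside the tight loop.
        SEL sel = @selector(processSample:);
        typedef float (*SampleIMP)(id, SEL, float);
        SampleIMP process = (SampleIMP)[filter methodForSelector:sel];

        // Hot loop: direct C calls, no dynamic dispatch per sample.
        for (int i = 0; i < 512; i++) {
            buffer[i] = process(filter, sel, buffer[i]);
        }
        NSLog(@"first processed sample: %f", buffer[1]);
    }
    return 0;
}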
Objective C is absolutely fast enough for DSP/audio programming, because Objective C is a superset of C. You don't need to (and shouldn't) make everything a message. Where performance is critical, use plain C function calls (or use inline assembly, if there are hardware features you can leverage that way). Where performance isn't critical, and your application can benefit from the features of message indirection, use the square brackets.
The Accelerate framework on OS X, for example, is a great high-performance Objective C library. It only uses standard C99 function calls, and you can call them from Objective C code without any wrapping or indirection.
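For example, a sketch of that style (using vDSP_vsmul from the Accelerate framework; the buffer and gain values are made up): the processing path is a single C function call with no message sends.

#import <Accelerate/Accelerate.h>
#import <Foundation/Foundation.h>

int main(void) {
    float buffer[512];
    for (int i = 0; i < 512; i++) buffer[i] = (float)i;

    float gain = 0.5f;
    // Plain C call: scale the whole buffer in place, buffer[i] *= gain.
    vDSP_vsmul(buffer, 1, &gain, buffer, 1, 512);

    NSLog(@"buffer[10] = %f", buffer[10]);
    return 0;
}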
The problem with Objective-C and functions like DSP is not speed per se but rather the uncertainty of when the inevitable bottlenecks will occur.
All languages have bottlenecks, but in statically linked languages like C++ you can better predict when and where in the code they will occur. In the case of Objective-C's runtime coupling, the time it takes to find the appropriate object and the time it takes to send a message are not necessarily long, but they are variable and unpredictable. Objective-C's flexibility in UI, data management and reuse works against it in the case of tightly timed tasks.
Most audio processing in the Apple API is done in C or C++ because of the need to nail down the time it takes code to execute. However, it's easy to mix Objective-C, C and C++ in the same app. This allows you to pick the best language for the immediate task at hand.
Is Objective C fast enough for DSP/audio programming
Real Time Rendering
Definitely not. The Objective-C runtime and its libraries are simply not designed for the demands of real-time audio rendering. The fact is, it's virtually impossible to guarantee that using the ObjC runtime or libraries such as Foundation (or even CoreFoundation) will not result in your renderer missing its deadline.
The common case is a lock -- even a simple heap allocation (malloc, new/new[], [[NSObject alloc] init]) will likely require a lock.
To use ObjC is to utilize libraries and a runtime which assume locks are acceptable at any point within their execution. The lock can suspend execution of your render thread (e.g. during your render callback) while waiting to acquire the lock. Then you can miss your render deadline because your render thread is held up, ultimately resulting in dropouts/glitches.
Ask a pro audio plugin developer: they will tell you that blocking within the realtime render domain is forbidden. You cannot e.g. run to the filesystem or create heap allocations because you have no practical upper bound regarding the time it will take to finish.
Here's a nice introduction: http://www.rossbencina.com/code/real-time-audio-programming-101-time-waits-for-nothing
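To make the constraint concrete, here is a minimal sketch in C (the state layout and parameter names are hypothetical, and this is not a complete render callback): the real-time path touches only preallocated state and a lock-free atomic flag, so there is no malloc, no ObjC messaging and no lock that could stall it past its deadline.

#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    float gain;           /* preallocated parameter block             */
    _Atomic int bypass;   /* written by the UI thread, read lock-free */
} RenderState;

void render(RenderState *state, float *out, size_t frames)
{
    int bypass = atomic_load_explicit(&state->bypass, memory_order_relaxed);
    float gain = bypass ? 1.0f : state->gain;
    for (size_t i = 0; i < frames; i++)
        out[i] *= gain;   /* bounded, allocation-free work only */
}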
Offline Rendering
Yes, it would be acceptably fast in most scenarios for high-level messaging. At the lower levels, I recommend against using ObjC because it would be wasteful -- it could take many, many times longer to render if ObjC messaging is used at that level (compared to a C or C++ implementation).
See also: Will my iPhone app take a performance hit if I use Objective-C for low level code?
objc_msgSend is just a utility.
The cost of sending a message is not just the cost of sending the message.
It is the cost of doing everything that the message initiates.
(Just like the true cost of a function call is its inclusive cost, including I/O if there is any.)
What you need to know is where the time-dominant messages are coming from, where they are going, and why.
Stack samples will tell you which routines / methods are being called so often that you should figure out how to call them more efficiently.
You may find that you're calling them more than you have to.
Especially if you find that many of the calls are for creating and deleting data structures, you can probably find better ways to do that.