What are the parameters on which RTOSes are compared?

I want to compare RTOSes (e.g. Keil RTX, µC/OS-III, and FreeRTOS), but I do not know on what parameters I should compare them, e.g. memory footprint, certification, etc.
On which points do we compare RTOSes?

You need to compare them on the parameters that are important to your application and to meeting its requirements. Those may include, for example:
Context switch time
Message passing performance
Scalability
RAM footprint
ROM footprint
Heap usage
OS primitives (queues, mutexes, event flags, semaphores, timers, etc.)
Scheduling algorithms (priority-preemptive, round-robin, cooperative)
Per developer cost
Per unit royalty cost
Licence type/terms
Source or object code provided
Availability of integrated middleware libraries (filesystem, USB, CAN, TCP/IP etc.)
Safety certification
Platform/target support
RTOS aware debugger support
RTOS/scheduling monitor/debug tools availability
Vendor support
Community support
Documentation quality
The possible parameters are many, and only you can determine what is useful and important to your project.
I suggest selecting about five parameters important to your project, and then analysing each option using the Kepner-Tregoe method. For each parameter you assign a weight based on its relative importance, you score each candidate against each parameter, and then you sum score × weight to get an overall score. The method takes some of the subjectivity out of the selection and, perhaps more importantly, provides evidence of your decision-making process when you have to justify it to your boss.
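A minimal sketch of that weighted-scoring arithmetic, with made-up criteria, weights, and scores purely for illustration:

#include <array>
#include <cstdio>

int main() {
    // Hypothetical criteria and weights (relative importance, higher = more important).
    const std::array<const char*, 4> criteria = {
        "Context switch time", "RAM footprint", "Middleware availability", "Per-unit royalty"};
    const std::array<int, 4> weight = {5, 4, 3, 2};

    // Hypothetical scores (1-10) for two candidate RTOSes against each criterion.
    const std::array<int, 4> score_rtos_a = {8, 6, 9, 4};
    const std::array<int, 4> score_rtos_b = {6, 9, 5, 8};

    int total_a = 0, total_b = 0;
    for (std::size_t i = 0; i < criteria.size(); ++i) {
        total_a += weight[i] * score_rtos_a[i];   // sum of score x weight
        total_b += weight[i] * score_rtos_b[i];
    }
    std::printf("RTOS A: %d, RTOS B: %d\n", total_a, total_b);  // higher total wins
}

With these made-up numbers, RTOS A totals 99 and RTOS B totals 97, so A would edge it out under those weights.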

Can Someone Give me a high-level overview of the VSWS Algorithm used in Operating Systems?

I am trying to find videos/resources that can give me a simple, clear, concise description of the VSWS algorithm but I cannot seem to find any. Any help would be appreciated!
Can Someone Give me a high-level overview of the VSWS Algorithm...
The basic idea of the Variable-Interval Sampled Working Set algorithm is:
each virtual page has a "was used" flag
while the program is running, if/when the program uses a virtual page (including when the page's data had to be fetched from elsewhere/disk before it could be used) the CPU or OS sets the page's "was used" flag.
after a variable amount of time, the OS checks all the "was used" flags and decides that if a page wasn't used then it's not part of the working set (and may evict it to free up physical memory); then it clears all the "was used" flags (ready for the next variable-length interval). A minimal sketch of this sampling step is shown below.
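A hypothetical sketch of that sampling step (the page structure and the eviction stand-in are made up for illustration):

#include <vector>

struct Page {
    bool accessed = false;   // the "was used" flag, set by the CPU/OS on each access
    bool resident = true;    // page is currently in physical memory
};

// Called at the end of each (variable-length) sampling interval.
void vsws_sample(std::vector<Page>& pages) {
    for (Page& p : pages) {
        if (p.resident && !p.accessed) {
            // Not referenced during the interval: not part of the working
            // set, so it is a candidate for eviction.
            p.resident = false;   // stand-in for a real evict_page() call
        }
        p.accessed = false;       // clear the flag ready for the next interval
    }
}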
... used in Operating Systems?
I wouldn't assume it's actually used in modern operating systems.
Most operating systems use something loosely based on "least recently used". A similar "variable sampling" approach is used to build up an estimate of "time when the page was last used" (rather than a single "was used" flag), which is then used to estimate "probability of future use". That estimate might then be combined with "cost of eviction" and "priority of the program" to come up with a combined score, and the pages with the worst score are deemed the best to evict to free up physical memory (a hypothetical scoring sketch follows the notes below).
Note 1: If a page was modified and needs to be written to swap space (and then possibly loaded back from swap space later) then it has a higher "cost of eviction"; and if a page hasn't been modified since it was fetched from a file or swap space last then it has a lower "cost of eviction". To improve performance (reduce the cost of eviction, not forgetting that estimates are crude and often poorly predict future use) it'd make sense to prefer the eviction of "cheaper to evict" pages.
Note 2: When there are multiple tasks running, it's good to give some tasks preferential treatment. For an extreme example, imagine the OS is under "low memory" conditions and constantly thrashing (transferring data to/from) disks, and an admin/user is trying to terminate a buggy program that is causing all the disk thrashing but can't, because the tools they need to fix the problem are unresponsive (those tools were not given preferential treatment and have to be fetched from the already-thrashed disk).
Note 3: In some cases (e.g. a task called sleep() and it's trivial to determine that it will wake up soon) it's possible to use other information to get a better estimate of "probability of future use" than a simple "least recently used" algorithm could provide.
Note 4: Typically when an OS needs to free up some physical memory there's other things (e.g. file data caches) that could also be considered (and could also participate in that "calculate a score and evict whatever has the worst score" system).
Note 5: Modern systems also pre-fetch data (e.g. from files) before the data is actually requested. It's entirely possible for pre-fetched "not requested by any program, not used at all yet" data to be more important than "explicitly requested and previously used" data.
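A hypothetical sketch of the combined-score idea described above; the fields, weights, and score formula are made up for illustration and are not taken from any real kernel:

#include <cstdint>

struct PageInfo {
    uint64_t last_used_tick;  // estimated from periodic sampling of "was used" flags
    bool     dirty;           // modified since last written to disk/swap (higher eviction cost)
    int      task_priority;   // priority of the owning task (higher = more important)
};

// Higher score = more worth keeping; the page with the lowest (worst)
// score is the one chosen for eviction.
double keep_score(const PageInfo& p, uint64_t now_tick) {
    double age      = static_cast<double>(now_tick - p.last_used_tick);
    double recency  = 1.0 / (1.0 + age);            // recently used pages score higher
    double cost     = p.dirty ? 2.0 : 1.0;          // dirty pages are more expensive to evict
    double priority = 1.0 + 0.1 * p.task_priority;  // protect higher-priority tasks
    return recency * cost * priority;
}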

Is there any standard for supporting Lock-step processor?

I want to ask about supporting lockstep (also written lock-step) processors at the SW level.
As I understand it, in AUTOSAR ASIL-D a lockstep processor is used for fault-tolerant systems, following the scenario below.
The input signals for a processor are copied to another processor (its lockstep pair).
The output signals from the two processors are compared.
If the two output signals differ, a trap is generated.
I think that if a trap is generated, it should be handled somewhere at the SW level.
However, I could not find any standard for this processing.
I have read some error handling in SW topics specified in AUTOSAR, but I could not find any satisfying answers.
So, my question is summarized as below.
In AUTOSAR or another standard, where is the right place to handle the lockstep trap (SW-C, RTE, or BSW)?
In AUTOSAR or another standard, what is the right action for handling the lockstep trap (RESET or ABORT)?
Thank you.
There are multiple concepts involved here, from different sources.
The ASIL levels are defined by ISO 26262. ASIL-D is the highest level and using a lockstep CPU is one of the methods typically used to achieve ASIL-D compliance for the whole system. Autosar doesn't define how you achieve ASIL-D, or any ASIL level at all. From an Autosar perspective, lockstep would be an implementation detail of the MCU driver, and Autosar doesn't require MCUs to support lockstep. How a particular lockstep implementation works (whether the outputs are compared after each instruction or not, etc.) depends on the hardware, so you can find those answers in the corresponding hardware manual.
Correspondingly, some decisions have to be made by people working on the system, including an expert on functional safety. The decision on what to do on lockstep failure is one such decision - how you react to a lockstep trap should be defined at the system level. This is also not defined by Autosar, although the most reasonable option is to reset your microcontroller after saving some information about the error.
As for where in the Autosar stack the trap should be handled, this is also an implementation decision, although the reasonable choice is for this to happen at the MCAL level - to the extent that talking about levels even makes sense here, as the trap will run in interrupt/trap context and not the normal OS task context. Typically, a trap would come with a higher priority than any interrupt, and also typically it's not possible to disable the traps in software. A trap will be handled by some routine that is registered by the OS in the same way it registers ISRs, so you'd want to configure the trap handler in whatever tool you're using for OS configuration. The lockstep trap may (again, depending on the hardware) be considered a non-recoverable trap, meaning that the trap handler should trigger a reset eventually. Calling the standard ShutdownOS() function may be reasonable.
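As a purely hypothetical sketch of what such a handler could look like, assuming the project's safety concept calls for saving diagnostics and then resetting via the OS (the helper, the status code, and the registration mechanism below are made up and are hardware/integration-specific):

// Stub declarations so the sketch is self-contained; in a real project these
// come from the AUTOSAR OS header (Os.h) and from project error-memory code.
typedef unsigned char StatusType;
#define E_OS_LOCKSTEP_FAILURE ((StatusType)0x7Fu)   // hypothetical, project-defined status code
extern "C" void ShutdownOS(StatusType error);       // AUTOSAR OS shutdown service

static void SaveLockstepErrorInfo(void) {
    // Hypothetical helper: write minimal diagnostics to error memory or
    // no-init RAM so they survive the reset.
}

// Registered as the handler for the lockstep-compare trap via the OS /
// startup-code configuration (the mechanism is hardware-specific).
extern "C" void LockstepTrapHandler(void)
{
    SaveLockstepErrorInfo();

    // A lockstep mismatch is normally treated as non-recoverable:
    // shut the OS down, which ends in a microcontroller reset.
    ShutdownOS(E_OS_LOCKSTEP_FAILURE);
}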

What is the best definition of an RTOS?

I have yet to find a definition of an RTOS that is specific enough to have meaning. The best one I can find is on wiki:
https://en.wikipedia.org/wiki/Real-time_operating_system
However I have some critical comments/questions:
"Real Time" seems to be undefined in all the definitions for RTOS I've found. Nothing can be fast as actual real time (infinitesimally small!). Therefore, I believe "real time" only makes sense in the context of the observer. Real time for a human using an iPhone user might be <20ms because human eye sight cannot detect changes faster than that. For an air bag deployment it might be <1ms. All definitions on the internet seem to gloss over the definition of "real time"!
If RTOS is defined by the requirement to execute something within a specific time frame ("deadline"), why does jitter come into the definition? If the iPhone response jitters between 12-14ms, is it no longer responding in real time? It meets the 20ms requirement, right? If one time the response went to 100ms, the user might notice, at which point the system is not an RTOS
How can there possibly be a "soft" RTOS?! The definition of RTOS is meeting a particular deadline time requirement. If it doesn't meet it, than its not an RTOS! The very definition of RTOS prohibits a "soft" RTOS
To me it seems there is no formal and precise definition of RTOS. It's a general term to explain the characteristic of an OS who's main priority is the appearance of "real time" (per requirement number) to a particular type of observer. It also seems like the name has taken on implementation meaning such as how things are processed, multi-tasking, message passing, semaphores, etc... all which may NOT be part of an RTOS at all if the system fails to respond within the "deadline" requirement, right?
Sorry about such a ubiquitous question, but I can't get a clear picture in my brain. All definitions I've found are simply not precise enough or cloud the definition with implementation details.
You're right that no definition defines the exact time bounds. That's not the goal of a definition. Real time isn't dependent on the observer, though, but the application. As applications differ, time bounds differ, and therefore a definition cannot give that bound as a number.
Jitter is irrelevant as long as the application's time bound is met. You're absolutely right about the example. If the deadline is 20 ms, taking 100 ms is a failure. If the OS is to blame for the delay, it's not an RTOS.
"Soft realtime" has a very specific meaning, and this is probably the only thing you really got wrong. The concept at work here is, what do you do when a task exceeds its deadline? (Note: this could be either the fault of the task itself or the RTOS.) In a hard realtime system, the task simply has no value anymore. A late outcome is as good as no outcome, and you cancel the task. No point in risking other tasks.
Soft RTOS is actually more complex. Finishing the task still has value, although diminished. So the RTOS cannot hard kill the task, but the OS still has to ensure other tasks meet their deadlines. That requires extra care, which wouldn't have been necessary if you'd just kill the task.
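As an illustrative-only sketch of that distinction (the types and the policy handling are hypothetical, not taken from any real RTOS):

enum class Deadline { Hard, Soft };

// Decide what to do with a task whose deadline has already passed.
// Returns true if the task should be allowed to continue running.
bool on_deadline_miss(Deadline policy) {
    if (policy == Deadline::Hard) {
        // A late result has no value: cancel the task rather than let it
        // consume time that other tasks need to meet their own deadlines.
        return false;
    }
    // Soft real time: the late result still has (diminished) value, so the
    // task may continue, but the scheduler must still protect the deadlines
    // of the other tasks.
    return true;
}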
There is an Embedded Systems Dictionary. Here are some excerpts:
real-time adj. Having timeliness requirements, typically in the form of deadlines that can’t be missed.
real-time operating system n. An operating system designed specifically for use in real-time systems. Abbreviated RTOS.
real-time system n. Any computer system, embedded or otherwise, that has timeliness requirements. The following question can be used to distinguish real-time systems from the rest: “Is a late answer as bad, or even worse, than a wrong answer?” In other words, what happens if the computation doesn’t finish in time? If nothing bad happens, it’s not a real-time system. If someone dies or the mission fails, it’s generally considered “hard” real-time, which is meant to imply that the system has hard deadlines. Everything in between is “soft” real-time.

Object-oriented programming with C++ AMP

I need to update some code I used for the Aho-Corasick algorithm in order to implement the algorithm on the GPU. However, the code relies heavily on the object-oriented programming model. My question is: is it possible to pass objects to parallel_for_each? If not, is there any workable way around it that would spare me from rewriting the entire code once again? My apologies if this seems like a naive question. C++ AMP is the first GPU programming framework I have used, so my experience in this field is quite limited.
The answer to your question is yes, in that you can pass classes or structs to a lambda marked restrict(amp). Note that parallel_for_each itself is not AMP-restricted; its lambda is.
However you are limited to using the types that are supported by the GPU. This is more of a limitation of current GPU hardware, rather than C++ AMP.
A C++ AMP-compatible function or lambda can only use C++ AMP-compatible types, which include the following:
int
unsigned int
float
double
C-style arrays of int, unsigned int, float, or double
concurrency::array_view or references to concurrency::array
structs containing only C++ AMP-compatible types
This means that some data types are forbidden:
bool (can be used for local variables in the lambda)
char
short
long long
unsigned versions of the above
References and pointers (to a compatible type) may be used locally but cannot be captured by a lambda. Function pointers, pointer-to-pointer, and the like are not allowed; neither are static or global variables.
Classes must meet more rules if you wish to use instances of them. They must have no virtual functions or virtual inheritance. Constructors, destructors, and other nonvirtual functions are allowed. The member variables must all be of compatible types, which could of course include instances of other classes as long as those classes meet the same rules.
... From the C++ AMP book, Ch. 3.
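For example, a sketch along these lines should work (the struct and function names here are illustrative, not from your code):

#include <amp.h>
#include <vector>
using namespace concurrency;

// Only AMP-compatible member types (float here), no virtual functions.
struct Body {
    float x, y, z;
    float mass;
};

void scale_masses(std::vector<Body>& bodies, float factor) {
    array_view<Body, 1> view(static_cast<int>(bodies.size()), bodies);
    parallel_for_each(view.extent, [=](index<1> idx) restrict(amp) {
        view[idx].mass *= factor;   // member access on a struct is fine
    });
    view.synchronize();             // copy results back to the host vector
}

Because Body contains only float members and has no virtual functions, it satisfies the compatibility rules quoted above.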
So while you can do this, it may not be the best solution for performance reasons. CPU and GPU caches are somewhat different. This makes arrays of structs a better choice for CPU implementations, whereas GPUs often perform better when structs of arrays are used.
GPU hardware is designed to provide the best performance when all threads within a warp are accessing consecutive memory and performing the same operations on that data. Consequently, it should come as no surprise that GPU memory is designed to be most efficient when accessed in this way. In fact, load and store operations to the same transfer line by different threads in a warp are coalesced into as little as a single transaction. The size of a transfer line is hardware-dependent, but in general, your code does not have to account for this if you focus on making memory accesses as contiguous as possible.
... Ch. 7.
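As a rough illustration of the layout difference (the types are made up):

#include <vector>

// Array of structs: the fields of one element sit next to each other,
// which suits a CPU thread that touches every field of one element.
struct ParticleAoS { float x, y, z, mass; };
using ParticlesAoS = std::vector<ParticleAoS>;

// Struct of arrays: each field is stored contiguously, so the threads of a
// GPU warp reading the same field touch consecutive memory and their loads
// can be coalesced.
struct ParticlesSoA {
    std::vector<float> x, y, z, mass;
};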
If you take a look at the CPU and GPU implementations in my n-body example, you'll see implementations of both approaches for the CPU and the GPU.
The above does not mean that your algorithm will not run faster when you move the implementation to C++ AMP. It just means that you may be leaving some additional performance on the table. I would recommend doing the simplest port possible and then consider if you want to invest more time optimizing the code, possibly rewriting it to take better advantage of the GPU's architecture.

Analysing and generating statistics on your code

I was wondering if anyone had any ideas or procedures for generating general statistics on your source code.
Off the top of my head, I would love to know how many functions in my project's code are called only once or very few times, or which classes are only instantiated once.
I'm sure there are a ton of other interesting things to be found out.
I could do something like the above using grep magic, but has anyone come across tools or tips?
Coverity is the first thing that comes to mind. One of their products currently offers:
Software DNA Map™ analysis system: Generates a comprehensive representation of the entire build system including a semantically correct parsing of every line of code.
Defect Manager: Intuitive interface makes it easy to establish ownership of defects and resolve them via a customized workflow that mirrors your existing development process.
Local Analysis: Enables code to be analyzed locally on developers’ desktops to ensure quality before sharing with other developers.
Boolean Satisfiability: Translates the code into questions based on Boolean values, then applies SAT solvers for the most accurate defect detection and the lowest false positive rate available. Only Prevent offers the added precision of this proprietary method.
Race Conditions Checker: Features an industry-first race conditions checker built specifically for today’s complex multi-threaded applications.
Path Simulation: Simulates 100% of all values and data paths, enabling detection of the most critical defects.
Statistical & Interprocedural Analysis: Ensures a comprehensive analysis of your entire build system by inferring correct behavior based on previously observed behavior and performing whole-program analysis similar to the executing Bin.
False Path Pruning: Efficiently removes false positives to give Prevent an average FP rate of about 15%, with some users reporting FP rates of as low as 5%.
Incremental Analysis: Analyzes source code wholly or incrementally, allowing you to save time by checking only those components that are affected by a change.
Reporting: Measures software quality trends over time via customizable reporting so you can show defects grouped by checker, classification, component, and other defect information.
There are lots of tools that do this, but AFAIK none of them are language-independent (which would be largely impossible anyway; some languages might not even have functions, for example).
Generally you will find those tools under the categories of "code coverage tools" or "profilers".
For .NET you can use Visual Studio or CLR Profiler.