Linking Mulitiple Computers to Process a Task - cpu-architecture

I am unsure whether this question belongs here, so please feel free to migrate it if it doesn't.
My question is this, Is it possible to combine many different PC units to work as one?
Take for example, buying 3 different HP desktop PCs. Then link the hardware so that they act as one PC.
If so, please point me to some resources I can use.
Thanks for your time.
Note
I am not referring to linking them over a network, but rather, making the actual hardware work together.
I am not sure this is possible, so I am sure all my google search terms are not related to the issue.

You should realize that linking them over a network does not obviate their ability to work together to complete a task. Most supercomputers and clusters today are interconnected via a network (albeit a very high speed one like Infiniband). The key is to have software that can understand that it's operating in a distributed environment (e.g. MPI libraries). You might also take a look at OpenMP or Hadoop. It really depends on what you want to do with it.

You can not link some computer together to behave like one!!! Therefore you will need special hardware, which offers you the possibility to extend the numbers of CPU's working together. (Like a cray)

If you are talking about write an application that will be processed by those computers, you may be referring to MPI.
You can use the Open MPI to do that, most of languages nowdays have MPI libraries.
You can find a more elaborated information about Parallel Computing on Wikipedia Parallel Computing Article.

Related

What is the best way to start programming with Real Time Linux?

Although I have implemented many projects in C, I am completely new to operating systems. I tried real time linux on Discovery board (STM32) and got the correct results for blinking LED but I didn't really understand the whole process since I just followed the steps and could not find whole description for each step on the internet.
I want to implement scheduling on real time linux. What is the best way to start? Any sites, books, tutorials available?
Complete RTLinux process description will be appreciated.
Thanks in adv.
The transition from "bare metal" to OS based programming is something that I experienced in reverse. I started out a complete software guy, totally into the OS side of things and over time I have moved to the opposite of that (even designing circuits in VHDL!). My advice would be to start simple. Linux is pretty complex, and everywhere you look there are many layers of things all working together to deliver the final product. If you are dead set on a real time linux extension, I'd be happy to suggest https://xenomai.org/ which is a real time extension for linux.
However, to more specifically address your question about implementing scheduling in Linux, you can, but it will be a large amount of work and can be very complicated. The OS uses a completely fair scheduling process ( http://en.wikipedia.org/wiki/Completely_Fair_Scheduler ) and whenever you spin up a thread, it simply gets added to the list to run. This can differ slightly if you implement your code in kernel space as a driver, rely on hardware interrupts, etc., but in general, this is how Linux works. Real time generally means that it has the ability to assign threads one of several different priorities and utilize thread preemption fully at any given time which are concepts that aren't really a part of vanilla Linux. It has some notion of this, but it has limitations that can cause problems when you are looking for real time behavior from Linux.
What may be helpful to you is an RTOS. If you are looking for a full on Real Time Operating System, check out FreeRTOS http://www.freertos.org/ . It has a large community and supports a lot of different devices out of the box with a large amount of example code. They even support your specific board with an example package, so you can give it a shot with nothing to lose! http://www.freertos.org/FreeRTOS-for-Cortex-M3-STM32-STM32F100-Discovery.html . It gives you access to many OS ish constructs like network APIs, memory management, and threading without the overhead and latency of a huge OS. With an RTOS, you create tasks and assign them priorities so you become the scheduler and are no longer at the mercy of the OS. You run the OS, not the OS runs you (if that makes sense). Plus, the constructs offered within an RTOS will feel much like bare metal code and thus will be much easier to follow, understand, and fully learn. It is a more simple world to learn the base building blocks of a full blown OS such as Linux or Windows. If this option sounds good, I would suggest looking through the supported devices on FreeRTOS website and picking one you would like to experiment with and then go for it. I would highly recommend this as a way to learn about scheduling and OS constructs in general as it is as simple as you can get and open source. Once you have the basics of an RTOS down, buying a book about Linux specifically wouldn't be a bad idea. Although there are many free resources on the web related to learning about Linux, they are commonly contradictory, and can be misleading. Pile on learning Linux specific knowledge along with OS in general, and it can feel overwhelming. Starting simpler will help keep you from getting burnt out and minimize the amount of time you spend feeling lost. Linux is definitely a learning process, but like with any learning process, start simple, keep your ultimate goal in mind, make a plan, and take small, manageable steps along that plan until you look up and find yourself exactly where you want to be. Then go tackle the next mountain!
The real-time Linux landscape is quite confusing. 99.99% of the information out there is just plain obsolete.
First, there are lots of "microkernels" that run Linux as one task. (Such as the defunct RTLinux). The problem is that you must write your real-time task to a different API, and can't depend on anything in Linux, because Linux will be frozen in the background while your task runs. So unless your task is dead-simple ("stop the motors when I press this button"), this approach will cause more pain than gain.
Next, there is the realtime Linux patch set. This hasn't been doing so well. because of the next item:
Lastly, the current Linux kernel has gotten rid of the problems that caused people to need realtime in the past. You can even turn off Linux on one of your processors to have full control of the CPU. See also this paper.
To answer your question: I see two different paths you could take:
1) Start with a normal 3.xx Linux kernel and explore the various APIs and realtime techniques (i.e. realtime priorities, memory pinning, etc.) This can get you "close enough" for 99% of what people want "realtime" for. If it's good enough for high frequency trading, it's probably good enough for you.
2) If you have a hard realtime requirement and you are worried that Linux won't cut it, then (as Nick mentioned above), just go buy a processor and write your realtime code with no OS. By splitting up your "realtime" and "non-realtime" code onto different CPUs, you will make the whole system simpler and much more robust.
If you want to learn real-time operating systems then I suggest that you get an FPGA, for example the Altera DE2, and experiment with your own operating system and ucos. You can read a good text about embedded RTOS here.
You could also get a Linux Raspberry and write your own operating system for that.

Setting up a distributed computing grid on local network using .NET

In our firm we run complex simulations using our own software developed in .NET. These simulations are well-suited to parallel computation and we currently make much use of the various multi-threading features native to .NET. Even so, simulations often take hours or days.
We'd like to explore the potential of distributing computation over our local network of high-performance (24 core) workstations to access more CPU power. However we have no experience in this area.
Searching on Google reveals a few MPI-based options such as Pure MPI, MPI.NET, plus some commercial software such as Frontier.
Which solution should we consider for something that is ideally well-suited to a .NET environment and is relatively easy to set up?
Thanks!
Multithreading != grid computing, so you will need to rewrite some parts of your application regardless of what you will choose in the end.
I don’t know your network infrastructure but it sounded to me, like you would want to use normal desktop workstations to run distribute the code. I wouldn’t use MPI for that. MPI was rather developed for clusters and supercomputers where the network supports high bandwidth and low latency. Those aren’t the properties of a traditional office network (unless I understood something wrong).
The next thing you have to deal with is the fact that users shouldn’t turn off their machines if computations are performed on them. No grid computing platform (including MPI) deals with these kind of issues, as it is usually running on server hardware which has little failures and are running 24/7.
I don’t think there is a simple and inexpensive solution to this. You could have a service running on each machine which could execute code from DLLs with predefined parameters and send responses. Those assemblies could be downloadable from some windowsshare. But you want to have really huge peaces of work to be distributed like this. You wouldn’t get almost any improvements if the application runs only for a minute or less.
In the end you’d need also a service to find those services which are online or not, some kind of in memory DB where every service could write the IP address and that it’s online so that the clients would know to whom they can distribute the work. This could be done using RavenDB (as you said you are working with .Net), Redis or an application which was actually written for these kind of problems, Zookeeper.

VM cloud based OS possible?

"A process virtual machine (also, language virtual machine) is
designed to run a single program, which means that it supports a
single process. Such virtual machines are usually closely suited to
one or more programming languages and built with the purpose of
providing program portability and flexibility (amongst other things).
An essential characteristic of a virtual machine is that the software
running inside is limited to the resources and abstractions provided
by the virtual machine—it cannot break out of its virtual
environment. quote from the Wikipedia Article"
I've been studying usage of virtual machines, especially with their importance in Cloud Computing, and I want to know if it would be possible to develop a VM based operating system that could be scaled dynamically to use the processing power of connected servers? Use it's own local hardware for fast processing, but also boost it's performance by sending processes that don't need to return immediate responses to a cloud service.
Is this possible, or is that concept flawed?
Basically, the OS scaling with the connected cloud servers. The processes that are ok to send to the cloud servers for a latent response would be up to the developers of each program.
At first, I could see this being effective only for corporations in need of cost-effective massive computation. But as internet speeds increase, even front-end interface animation calculations might be possible, having less local hardware, relying more heavily on cloud services.
If it's possible, it would allow many scientific simulations that would otherwise need super computer time to be possible from any location in the world, at no more cost than exactly what processing is done at a specific speed. And would lead eventually to consumer devices being extremely small, 'scaleably' powerful, and very cheap, allowing people to pay for processing the same way they pay for internet service today.
Is this possible, or is that concept flawed?
Both. ;)
What you are talking about seems like what used to be called "Grid Computing". (Sun even sold it in the early 90's.) The concept was that you put a magic library on all your boxes, and your app will be able to scale out with no further work by the programmer.
This is useful -- but only if your problem is "embarrassingly parallel" (i.e. lots of independent calculations that don't affect each other.).
MPI is one such popular way to do this: http://www.linux-mag.com/id/5759/
Unfortunately, most of the time people have problems that are more lumpy (grab a bunch of data from database, do some calculations, generate PDF.) In these cases, it's simpler to figure out a good strategy and manually code it up, than try to use a magic library which can be hard to debug, and even harder to work around performance problems. I know a lot of people using AWS, and none of them use a 'magic grid library' like you are talking about. They communicate between servers using simple protocols like Queues or HTTP interfaces.
That's not because your idea won't work. It's just that their needs can be satisfied by something much simpler to debug/run/tune.
Another neat idea in the same vein: http://www.gocircuit.org/

Any good library or software for queue networks simulation?

I have been trying to make work EZSIM with no luck, which is a software to build discrete event simulators in a graphical DOS environment. In this software, my simulator and many others (of the other people in the course I'm taking) don't work, but teacher's simulator (and examples of the downloaded files) does work.
So, I began to distrust of the software.
Do you know any software that resolves the same kind of problems but really works? It will be good if it is free, or I can download an evaluation copy or something like that.
If you don't know any software, do you know any library which might work? Preferably in C#, Ansi C, Java or Delphi.
This may be more than what you're looking for, but check out NS2. It's the standard for open source network simulations, and will allow you to simulate all kinds of network layer behavior.
I've also used JUNG in the past. It's very flexible, although it also doesn't offer much out of the box.
I used Möbius in my computer systems analysis class. It is free for educational use (which sounds like what you're doing). It's a Java GUI which generates C++ code.
The R package queuecomputer. queuecomputer is a computationally efficient method for simulating queues with arbitrary arrival and service times. There is a submitted paper on arXiv describing the algorithm used in the package. Examples can be found within the arXiv paper and the vignette. A web app based on the package is available at https://ace-ebert.shinyapps.io/queue_simulator_mmk/ .

Comparison of embedded operating systems?

I've been involved in embedded operating systems of one flavor or another, and have generally had to work with whatever the legacy system had. Now I have the chance to start from scratch on a new embedded project.
The primary constraints on the system are:
It needs a web-based interface.
Inputs are required to be processed in real-time (so a true RTOS is needed).
The memory available is 32MB of RAM and FLASH.
The operating systems that the team has used previously are VxWorks, ThreadX, uCos, pSOS, and Windows CE.
Does anyone have a comparison or trade study regarding operating system choice?
Are there any other operating systems that we should consider? (We've had eCos and RT-Linux suggested).
Edit - Thanks for all the responses to date. A pity I can't flag all as "accepted".
I think it would be wise to evaluate carefully what you mean by "RTOS". I have worked for years at a large company that builds high-performance embedded systems, and they refer to them as "real-time", although that's not what they really are. They are low-latency and have deterministic schedulers, and 9 times out of 10, that's what people are really after when they say RTOS.
True real-time requires hardware support and is likely not what you really mean. If all you want is low latency and deterministic scheduling (again, I think this is what people mean 90% of the time when they say "real-time"), then any Linux distribution would work just fine for you. You could probably even get by with Windows (I'm not sure how you control the Windows scheduler though...).
Again, just be careful what you mean by "Real-time".
It all depends on how much time was allocated for your team has to learn a "new" RTOS.
Are there any reasons you don't want to use something that people already have experience with?
I have plenty of experience with vxWorks and I like it, but disregard my opinion as I work for WindRiver.
uC/OS II has the advantage of being fully documented (as in the source code is actually explained) in Labrosse's Book. Don't know about Web Support though.
I know pSos is no longer available.
You can also take a look at this list of RTOSes
I worked with QNX many years ago, and have nothing but great things to say about it. Even back then, QNX 4 (which is positively chunky compared to the Neutrino microkernel) was perfectly suited for low memory situations (though 32MB is oodles compared to the 1-2MB that we had to play with), and while I didn't explicitly play with any web-based stuff, I know Apache was available.
I purchased some development hardware from netburner
It has been very easy to work with and very well documented. It is an RTOS running uCLinux. The company is great to work with.
It might be a wise decision to select an OS that your team is experienced with. However I would like to promote two good open source options:
eCos (has you mentioned)
RTEMS
Both have a lot of features and drivers for a wide variety of architectures. You haven't mentioned what architecture you will be using. They provide POSIX layers which is nice if you want to stay as portable as possible.
Also the license for both eCos and RTEMS is GPL but with an exception so that the executable that is produced by linking against the kernel is not covered by GPL.
The communities are very active and there are companies which provide commercial support and development.
We've been very happy with the Keil RTX system....light and fast and meets all of our tight real time constraints. It also has some nice debugging features built in to monitor stack overflow, etc.
I have been pretty happy with Windows CE, although it is 'heavier'.
Posting to agree with Ben Collins -- your really need to determine if you have a soft real-time requirement (primarily for human interaction) or hard real-time requirement (for interfacing with timing-sensitive devices).
Soft can also mean that you can tolerate some hiccups every once in a while.
What is the reliability requirements? My experience with more general-purpose operating systems like Linux in embedded is that they tend to experience random hiccups due to their smart average-case optimizations that try to avoid starvation and similar for individual tasks.
VxWorks is good:
good documentation;
friendly developing tool;
low latency;
deterministic scheduling.
However, I doubt that WindRiver would convert their major attention to Linux and WindRiver Linux would break into the market of WindRiver VxWorks.
Less market, less requirement of engineers.
Here is the latest study. The last one was done more than 8 years ago so this is most relevant. The tables can be used to add additional RTOS choices. You'll note that this comparison is focused on lighter machines but is equally applicable to heavier machines provided virtual memory is not required.
http://www.embedded.com/design/operating-systems/4425751/Comparing-microcontroller-real-time-operating-systems