In our firm we run complex simulations using our own software developed in .NET. These simulations are well-suited to parallel computation and we currently make much use of the various multi-threading features native to .NET. Even so, simulations often take hours or days.
We'd like to explore the potential of distributing computation over our local network of high-performance (24-core) workstations to access more CPU power. However, we have no experience in this area.
Searching on Google reveals a few MPI-based options, such as Pure MPI and MPI.NET, plus some commercial software such as Frontier.
Which solution should we consider for something that is ideally well-suited to a .NET environment and is relatively easy to set up?
Thanks!
Multithreading != grid computing, so you will need to rewrite some parts of your application regardless of what you choose in the end.
I don't know your network infrastructure, but it sounds like you want to use ordinary desktop workstations to run the distributed code. I wouldn't use MPI for that. MPI was designed for clusters and supercomputers, where the network provides high bandwidth and low latency. Those aren't the properties of a typical office network (unless I've misunderstood something).
The next thing you have to deal with is that users shouldn't turn off their machines while computations are running on them. No grid computing platform (including MPI) deals with this kind of issue, as such platforms usually run on server hardware that fails rarely and runs 24/7.
I don't think there is a simple and inexpensive solution to this. You could have a service running on each machine that executes code from DLLs with predefined parameters and sends back the results. Those assemblies could be downloadable from some Windows share. But you would want really large pieces of work to be distributed this way; you would see almost no improvement if each job runs for only a minute or less.
In the end you'd also need a way to discover which of those services are online: some kind of in-memory registry where every service records its IP address and online status, so that clients know whom they can distribute work to. This could be done with RavenDB (since you said you are working with .NET), Redis, or an application that was actually written for this kind of problem, ZooKeeper.
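To make the registry idea concrete, here is one possible shape for the heartbeat side, sketched in C with hiredis purely for illustration (the host address, port, and `worker:<ip>` key convention are all assumptions, not a prescription):

```c
/* Each worker refreshes a short-lived Redis key. If the machine is
 * turned off, the key simply expires and the worker vanishes from
 * the registry, so clients never dispatch work to a dead node. */
#include <unistd.h>
#include <hiredis/hiredis.h>

int main(void) {
    redisContext *c = redisConnect("192.168.1.10", 6379);  /* assumed registry host */
    if (c == NULL || c->err)
        return 1;

    for (;;) {
        redisReply *r = redisCommand(c, "SET worker:%s online EX 30", "192.168.1.42");
        if (r) freeReplyObject(r);
        sleep(10);   /* refresh well before the 30-second TTL expires */
    }
}
```

Clients would then list the `worker:*` keys to find live machines; the same idea maps directly onto ZooKeeper's ephemeral nodes.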
Although I have implemented many projects in C, I am completely new to operating systems. I tried real-time Linux on a Discovery board (STM32) and got the correct results for blinking an LED, but I didn't really understand the whole process, since I just followed the steps and could not find a full description of each step on the internet.
I want to implement scheduling on real-time Linux. What is the best way to start? Are there any sites, books, or tutorials available?
A complete description of the RTLinux process would be appreciated.
Thanks in advance.
The transition from "bare metal" to OS-based programming is something that I experienced in reverse. I started out a complete software guy, totally into the OS side of things, and over time I have moved to the opposite of that (even designing circuits in VHDL!). My advice would be to start simple. Linux is pretty complex, and everywhere you look there are many layers of things all working together to deliver the final product. If you are dead set on a real-time Linux extension, I'd be happy to suggest https://xenomai.org/, which is a real-time extension for Linux.
However, to more specifically address your question about implementing scheduling in Linux: you can, but it will be a large amount of work and can get very complicated. The OS uses the Completely Fair Scheduler ( http://en.wikipedia.org/wiki/Completely_Fair_Scheduler ), and whenever you spin up a thread, it simply gets added to the list of things to run. This can differ slightly if you implement your code in kernel space as a driver, rely on hardware interrupts, etc., but in general this is how Linux works. "Real time" generally means the ability to assign threads one of several different priorities and to rely on full thread preemption at any given time; those concepts aren't really part of vanilla Linux. It has some notion of them, but with limitations that can cause problems when you are looking for real-time behavior.
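For concreteness, this is roughly how a thread asks vanilla Linux for one of its (limited) real-time priorities; a minimal sketch, and the priority value is an arbitrary choice:

```c
#include <sched.h>
#include <stdio.h>

int main(void) {
    /* Request SCHED_FIFO: such threads preempt all normal (CFS)
     * threads and run until they block or yield. */
    struct sched_param p = { .sched_priority = 50 };  /* valid range 1..99 */
    if (sched_setscheduler(0, SCHED_FIFO, &p) != 0) {
        perror("sched_setscheduler (needs root or CAP_SYS_NICE)");
        return 1;
    }
    /* ... time-critical work here ... */
    return 0;
}
```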
What may be helpful to you is an RTOS. If you are looking for a full-on real-time operating system, check out FreeRTOS http://www.freertos.org/ . It has a large community and supports a lot of different devices out of the box, with a large amount of example code. They even support your specific board with an example package, so you can give it a shot with nothing to lose! http://www.freertos.org/FreeRTOS-for-Cortex-M3-STM32-STM32F100-Discovery.html

It gives you access to many OS-ish constructs like network APIs, memory management, and threading, without the overhead and latency of a huge OS. With an RTOS, you create tasks and assign them priorities, so you become the scheduler and are no longer at the mercy of the OS. You run the OS; the OS doesn't run you (if that makes sense). Plus, the constructs offered by an RTOS feel much like bare-metal code and thus are much easier to follow, understand, and fully learn. It is a simpler world in which to learn the basic building blocks of a full-blown OS such as Linux or Windows.

If this option sounds good, I would suggest looking through the supported devices on the FreeRTOS website, picking one you would like to experiment with, and going for it. I would highly recommend this as a way to learn about scheduling and OS constructs in general, as it is as simple as you can get and open source.

Once you have the basics of an RTOS down, buying a book about Linux specifically wouldn't be a bad idea. Although there are many free resources on the web for learning about Linux, they are often contradictory and can be misleading. Pile Linux-specific knowledge on top of OS concepts in general, and it can feel overwhelming. Starting simpler will keep you from getting burnt out and minimize the time you spend feeling lost.

Linux is definitely a learning process, but as with any learning process: start simple, keep your ultimate goal in mind, make a plan, and take small, manageable steps along that plan until you look up and find yourself exactly where you want to be. Then go tackle the next mountain!
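To make the "you become the scheduler" point concrete, here is a minimal FreeRTOS task sketch (the LED toggle is a placeholder for board-specific code; everything else is the stock FreeRTOS API):

```c
#include "FreeRTOS.h"
#include "task.h"

/* A periodic task: do something every 500 ms. */
static void vBlinkTask(void *pvParameters) {
    (void)pvParameters;
    for (;;) {
        /* toggle_led();  -- board-specific, omitted here */
        vTaskDelay(pdMS_TO_TICKS(500));  /* block so lower-priority tasks can run */
    }
}

int main(void) {
    /* You pick the priority -- you are the scheduling policy. */
    xTaskCreate(vBlinkTask, "blink", configMINIMAL_STACK_SIZE,
                NULL, tskIDLE_PRIORITY + 1, NULL);
    vTaskStartScheduler();  /* does not return once the scheduler starts */
    for (;;);
}
```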
The real-time Linux landscape is quite confusing. 99.99% of the information out there is just plain obsolete.
First, there are lots of "microkernels" that run Linux as one task (such as the defunct RTLinux). The problem is that you must write your real-time task against a different API and can't depend on anything in Linux, because Linux will be frozen in the background while your task runs. So unless your task is dead simple ("stop the motors when I press this button"), this approach will cause more pain than gain.
Next, there is the realtime Linux patch set. This hasn't been doing so well, because of the next item:
Lastly, the current Linux kernel has gotten rid of most of the problems that caused people to need realtime patches in the past. You can even turn Linux off on one of your processors to get full control of that CPU. See also this paper.
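For reference, the "turn off Linux on one processor" trick is usually a boot parameter (e.g. isolcpus=3) plus pinning; a minimal sketch of the pinning side, with the core index chosen arbitrarily:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    /* Assume core 3 was removed from the general scheduler by booting
     * with isolcpus=3; pin this process onto it for exclusive use. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(3, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    /* ... the core is now effectively ours ... */
    return 0;
}
```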
To answer your question: I see two different paths you could take:
1) Start with a normal 3.xx Linux kernel and explore the various APIs and realtime techniques (e.g. realtime priorities, memory pinning, etc.; a sketch of the memory-pinning part appears after this list). This can get you "close enough" for 99% of what people want "realtime" for. If it's good enough for high-frequency trading, it's probably good enough for you.
2) If you have a hard realtime requirement and you are worried that Linux won't cut it, then (as Nick mentioned above) just go buy a processor and write your realtime code with no OS. By splitting your "realtime" and "non-realtime" code across different CPUs, you will make the whole system simpler and much more robust.
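A minimal sketch of the memory pinning mentioned in option 1, combined with the classic absolute-deadline periodic loop (the 1 ms period is an arbitrary example):

```c
#include <sys/mman.h>
#include <time.h>
#include <stdio.h>

int main(void) {
    /* Lock all current and future pages into RAM so a page fault
     * can never stall a time-critical cycle. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return 1;
    }

    /* Periodic loop with absolute deadlines (avoids drift). */
    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (int i = 0; i < 1000; i++) {
        next.tv_nsec += 1000000;  /* +1 ms */
        if (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec++;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        /* ... one cycle of time-critical work ... */
    }
    return 0;
}
```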
If you want to learn about real-time operating systems, then I suggest that you get an FPGA, for example the Altera DE2, and experiment with your own operating system and µC/OS. You can read a good text about embedded RTOSes here.
You could also get a Raspberry Pi and write your own operating system for that.
"A process virtual machine (also, language virtual machine) is
designed to run a single program, which means that it supports a
single process. Such virtual machines are usually closely suited to
one or more programming languages and built with the purpose of
providing program portability and flexibility (amongst other things).
An essential characteristic of a virtual machine is that the software
running inside is limited to the resources and abstractions provided
by the virtual machine—it cannot break out of its virtual
environment. quote from the Wikipedia Article"
I've been studying the usage of virtual machines, especially their importance in cloud computing, and I want to know: would it be possible to develop a VM-based operating system that could scale dynamically to use the processing power of connected servers? It would use its own local hardware for fast processing, but also boost its performance by sending processes that don't need an immediate response to a cloud service.
Is this possible, or is that concept flawed?
Basically, the OS would scale with the connected cloud servers. Which processes are acceptable to send to the cloud servers for a latent response would be up to the developers of each program.
At first, I could see this being effective only for corporations needing cost-effective massive computation. But as internet speeds increase, even front-end interface animation calculations might be offloaded, with devices carrying less local hardware and relying more heavily on cloud services.
If it's possible, it would allow many scientific simulations that would otherwise need supercomputer time to be run from any location in the world, at no more cost than exactly the processing done at a given speed. It could eventually lead to consumer devices that are extremely small, scalably powerful, and very cheap, letting people pay for processing the same way they pay for internet service today.
Is this possible, or is that concept flawed?
Both. ;)
What you are talking about seems like what used to be called "Grid Computing". (Sun even sold it in the early 90's.) The concept was that you put a magic library on all your boxes, and your app would be able to scale out with no further work by the programmer.
This is useful -- but only if your problem is "embarrassingly parallel" (i.e. lots of independent calculations that don't affect each other).
MPI is one such popular way to do this: http://www.linux-mag.com/id/5759/
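To show the programming model, here is a minimal MPI program in C: each rank computes an independent partial result, then the results are combined (the workload is a toy sum, purely for illustration):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Embarrassingly parallel: each rank sums its own slice. */
    long partial = 0;
    for (long i = rank; i < 1000000; i += size)
        partial += i;

    /* Combine the independent results on rank 0. */
    long total = 0;
    MPI_Reduce(&partial, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("total = %ld\n", total);

    MPI_Finalize();
    return 0;
}
```

Compile with mpicc and launch with something like mpirun -np 4 ./a.out.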
Unfortunately, most of the time people have problems that are lumpier (grab a bunch of data from a database, do some calculations, generate a PDF). In these cases, it's simpler to figure out a good strategy and manually code it up than to try to use a magic library, which can be hard to debug and even harder to work around when there are performance problems. I know a lot of people using AWS, and none of them use a 'magic grid library' like you are talking about. They communicate between servers using simple mechanisms like queues or HTTP interfaces.
That's not because your idea won't work. It's just that their needs can be satisfied by something much simpler to debug/run/tune.
Another neat idea in the same vein: http://www.gocircuit.org/
I am unsure whether this question belongs here, so please feel free to migrate it if it doesn't.
My question is this: is it possible to combine many different PCs to work as one?
Take, for example, buying three different HP desktop PCs, then linking the hardware so that they act as one PC.
If so, please point me to some resources I can use.
Thanks for your time.
Note
I am not referring to linking them over a network, but rather making the actual hardware work together.
I am not sure this is possible, so I suspect my Google search terms are missing the mark.
You should realize that linking them over a network does not prevent them from working together to complete a task. Most supercomputers and clusters today are interconnected via a network (albeit a very high-speed one like InfiniBand). The key is to have software that understands it's operating in a distributed environment (e.g. MPI libraries). You might also take a look at OpenMP or Hadoop. It really depends on what you want to do with it.
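For contrast with MPI's message passing, OpenMP parallelizes within a single box using little more than a pragma; a minimal sketch:

```c
#include <stdio.h>

int main(void) {
    double sum = 0.0;
    /* OpenMP splits the loop across the available cores and handles
     * thread creation and the final reduction automatically. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= 10000000; i++)
        sum += 1.0 / i;
    printf("sum = %f\n", sum);
    return 0;
}
```

Build with an OpenMP-capable compiler, e.g. gcc -fopenmp.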
You cannot just link some computers together and have them behave like one! For that, you would need special hardware built to let larger numbers of CPUs work together (like a Cray).
If you are talking about writing an application that will be processed by those computers, you may be looking for MPI.
You can use Open MPI to do that; most languages nowadays have MPI libraries.
You can find more detailed information in the Wikipedia article on Parallel Computing.
I'm taking an introductory course (3 months) on real-time systems design, but without any implementation.
I would like to build something that lets me understand better what I'll learn in theory, but since I have never built a real-time system, I can't estimate how long any project will take. It should be a proof-of-concept project, or something like that, given my available time and knowledge.
Please, could you give me some ideas? Thank you in advance.
I program in T-SQL, Delphi and C#, but I won't have any problem learning another language.
Suggest you consider exploring the Real-Time Specification for Java (RTSJ). While it is not a traditional environment for constructing real-time software, it is an up-and-coming technology with a lot of interest. Even better, you can witness some of the ongoing debate about what matters and what doesn't in real-time systems.
Sun's JavaRTS is freely available for download and has some interesting demonstrations of deterministic behavior that show off its RT garbage collector.
In terms of a specific project, I suggest you start simple:

1) Build a work generator that you can tune to consume a given amount of CPU time (a sketch of one appears below).

2) Put it into a framework that can produce a distribution of work-generator tasks (as threads, or as chunks of work executed in a thread), plus a mechanism for logging the work produced.

3) Produce charts of the execution time, sojourn time, deadline, and slack/overrun of these tasks versus their priority.

4) Demonstrate that tasks running in the context of real-time threads (versus timesharing) behave differently.
Bonus points if you can measure the scheduler overhead by determining at what offered load (total CPU time produced by your work-generator tasks divided by wall-clock time) your tasks begin missing deadlines.
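The answer above assumes RTSJ, but the work generator in step 1 is language-neutral; here is one way to sketch it in C (note the target is consumed CPU time, not wall-clock time, so the load stays tunable):

```c
#include <time.h>

/* Busy-loop until the calling thread has consumed roughly `ms`
 * milliseconds of CPU time (not wall-clock time). */
static void burn_cpu_ms(long ms) {
    struct timespec start, now;
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &start);
    do {
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &now);
    } while ((now.tv_sec - start.tv_sec) * 1000L
             + (now.tv_nsec - start.tv_nsec) / 1000000L < ms);
}

int main(void) {
    burn_cpu_ms(250);  /* one tunable unit of work */
    return 0;
}
```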
Try to think of real-time tasks that are time-critical, for instance video playback, which fails if tasks (e.g. decoding the next frame) are not finished in time.
You can also think of some industrial solutions, but they are probably more difficult to study in your local environment.
You should definitely consider building your system using a hardware development board equipped with a small processor (ARM, PIC, AVR; any one will do). This really helped remove my fear of the low-level when I started developing. You'll have to use C or C++, though.
You will then have two alternatives: either go bare-metal, or use a real-time OS.
Going bare-metal, you can learn:
How to initialize your processor from scratch and, most importantly, how to use interrupts, which are the fastest way you have to respond to an external event (see the sketch after this list)
How to implement lightweight threads with fast context switching, something every real-time OS implements
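As a taste of the interrupt bullet above, here is about the smallest interrupt-driven sketch on a Cortex-M part like your STM32 (SysTick_Handler is the standard CMSIS handler name; the timer setup is left as a comment):

```c
#include <stdint.h>

/* Written by the interrupt handler, read by main(): `volatile`
 * because it changes outside the normal flow of the program. */
volatile uint32_t g_ticks = 0;

/* Standard Cortex-M SysTick interrupt handler (CMSIS naming). */
void SysTick_Handler(void) {
    g_ticks++;
}

int main(void) {
    /* With CMSIS you would arm the timer here, e.g.
     * SysTick_Config(SystemCoreClock / 1000) for a 1 ms tick. */
    for (;;) {
        /* respond to events with bounded latency using g_ticks */
    }
}
```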
In order to ease this a bit, look for a dev kit which comes with lots of documentation and source code. I used Embedded Artists ARM boards and they give you a lot of material.
Going with the RTOS:
You'll fast-track your project, and will be able to learn how to fine-tune an RTOS
You may try your hand at an open-source OS, such as Linux or the BSDs, and learn a lot from the source code
Either choice is good, you will get a really cool hands-on project to show off and hopefully better understand your course material. Good luck!
As most real-time systems are still implemented in C or C++, it may be good to brush up your knowledge of these programming languages. Many real-time systems are also embedded systems, so you might want to play around with a cheap open-source board like the BeagleBoard (http://beagleboard.org/). This will also give you a chance to learn about cross-compiling, etc.
I hear a lot of people talking about the revolution that is coming in programming due to multi-core processors and parallelism, but I can't shake the feeling that for most of us, CPU cycles aren't the bottleneck. Pretty much all of my programs have been I/O bound in one way or another (database, filesystem, network, user interaction, etc.) for a very long time.
Now I can think of a few areas where CPU cycles are a limiting factor, like code breaking, graphics, sound, some forms of simulation (weather, physics, etc.), and some forms of mathematical research, but they all seem like fairly specialized application domains. My general impression is that most programs are still I/O bound and that for most of our industry CPUs have been plenty fast for quite a while now.
Am I off my rocker? What other application domains are CPU bound today? Do any of them include a large portion of the programming population? In essence, I'm wondering whether the multi-core CPUs will impact very many of us, and if so, how?
Visual effects / rendering. (Entertainment industry.)
Artificial Intelligence. (Games and scientific research.)
Biomedical research.
Physical simulations. (Games and scientific research.)
Database applications including SaaS, most webpages, etc.
As the personal computer becomes more and more a browser-based thin client for web applications, this industry will expand, as will the need for more, and more parallel, processing power on the backend. I could see gaming pushing parallel processing in personal computing.
One of the ways to leverage multiple cores is through the use of remote desktop technologies.
It's much easier to deploy desktop applications to one big Citrix server instead of dozens of user desktops.