What kind of apps can perform better on a multi-core CPU?

I just want to know if there is a simple way of judging what kinds of apps perform better on a multi-core CPU, such as Memcached, Redis, MySQL, Cassandra, and so on.

Anything where independent calculations can be performed...
Financial applications and graphics-rendering applications come to mind.

Brute-forcing cryptographic hashes.

There are all kinds of apps that could benefit, but if you want to boil it down to just one important thing, then I would have to say any application that takes advantage of a multithreaded architecture. If developed correctly, the application's threads can run simultaneously on different cores. The trick is to make sure they do not serialize on excessive locking.

Very simple example: anything that has a computation that can be broken down. Say you need to add all the numbers from 1 to 800,000 and you have an 8-core machine.
You can set up 8 loops to add the numbers 1-100,000, 100,001-200,000, and so on; run one on each core, saving each result in its own variable (loop1, loop2, etc.). Then add the variables together when the loops terminate to get your answer.
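Here's roughly what that looks like in C# (a minimal sketch; Parallel.For stands in for manually running one loop per core):

```csharp
using System;
using System.Threading.Tasks;

class ParallelSum
{
    static void Main()
    {
        const long total = 800_000;
        const int cores = 8;
        long chunk = total / cores;          // 100,000 numbers per core
        long[] partial = new long[cores];    // one result variable per loop

        // Each iteration sums its own chunk independently -- no shared
        // state, so there is nothing to lock and nothing to serialize.
        Parallel.For(0, cores, i =>
        {
            long sum = 0;
            for (long n = i * chunk + 1; n <= (i + 1) * chunk; n++)
                sum += n;
            partial[i] = sum;
        });

        long result = 0;
        foreach (long s in partial) result += s;   // combine when the loops finish

        Console.WriteLine(result);   // 320,000,400,000 for 1..800,000
    }
}
```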

Related

Setting up a distributed computing grid on local network using .NET

In our firm we run complex simulations using our own software developed in .NET. These simulations are well-suited to parallel computation and we currently make much use of the various multi-threading features native to .NET. Even so, simulations often take hours or days.
We'd like to explore the potential of distributing computation over our local network of high-performance (24 core) workstations to access more CPU power. However we have no experience in this area.
Searching on Google reveals a few MPI-based options such as Pure MPI, MPI.NET, plus some commercial software such as Frontier.
Which solution should we consider for something that is ideally well-suited to a .NET environment and is relatively easy to set up?
Thanks!
Multithreading != grid computing, so you will need to rewrite some parts of your application regardless of what you choose in the end.
I don't know your network infrastructure, but it sounds like you want to use normal desktop workstations to run the distributed code. I wouldn't use MPI for that. MPI was developed for clusters and supercomputers whose networks provide high bandwidth and low latency. Those aren't the properties of a typical office network (unless I misunderstood something).
The next thing you have to deal with is the fact that users shouldn't turn off their machines while computations are running on them. No grid computing platform (including MPI) deals with this kind of issue, as such platforms usually run on server hardware that has few failures and runs 24/7.
I don't think there is a simple and inexpensive solution to this. You could have a service running on each machine that executes code from DLLs with predefined parameters and sends back responses. Those assemblies could be downloadable from some Windows share. But you want to distribute really big pieces of work this way; you would see almost no improvement if the application runs for only a minute or less.
In the end you'd also need a way to discover which of those services are online: some kind of in-memory DB where every service registers its IP address and the fact that it's online, so clients know whom they can distribute work to. This could be done with RavenDB (as you said you are working with .NET), Redis, or an application that was actually written for this kind of problem, ZooKeeper.
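As a rough sketch of that last idea, assuming Redis with the StackExchange.Redis client (the key names, TTL, and host are made up for illustration):

```csharp
using System;
using System.Net;
using System.Threading;
using StackExchange.Redis;

class WorkerRegistry
{
    static void Main()
    {
        // Hypothetical address; substitute your own Redis host.
        var redis = ConnectionMultiplexer.Connect("redis-host:6379");
        var db = redis.GetDatabase();
        string workerId = Dns.GetHostName();

        // Heartbeat: re-register every 10 s with a 30 s TTL, so a machine
        // that gets switched off silently disappears from the registry.
        while (true)
        {
            db.StringSet($"workers:{workerId}", "online",
                         expiry: TimeSpan.FromSeconds(30));
            Thread.Sleep(TimeSpan.FromSeconds(10));
        }
    }
}

// A client could then enumerate the live workers:
//   var server = redis.GetServer("redis-host", 6379);
//   foreach (var key in server.Keys(pattern: "workers:*")) { /* dispatch */ }
```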

Rubinius + Padrino on production?

Is anyone running Padrino on Rubinius + Puma in production? If so, how stable is it?
Is it better than MRI + Thin? I'm thinking of giving it a try but am a bit worried about its stability.
I use Puma in production; it is fine for stability and gives excellent speed. There are times when you should pick Thin (remember, you're in an event loop) and times when you should pick Puma. Thin moves concurrency away from the code level to the IO level, so it is good for handling lots of realtime or permanent connections, something like a chat server or a realtime application. For an app that is about serving different pages, you want low memory use and good context switching: something like preforking (i.e. Unicorn), or running Puma on Rubinius, which makes concurrency easier to code because it performs well with threading, as opposed to MRI with its global interpreter lock. JRuby, for example, uses native threads and will therefore use all the available processors, so it can be helpful under certain circumstances.
See http://ylan.segal-family.com/blog/2013/05/20/unicorn-vs-puma-redux/.
I've never used Padrino, but I don't see why that would be as much of a factor as your code.
It is silly to ask which is better, because only you can tell whether something is good and does the job for you.
There are certain factors you can use to measure if Rubinius is good for you or not.
Ask yourself these questions:
Do you actually know what Rubinius is?
Why are you considering Rubinius?
Have you benchmarked your app with both runtimes?
What are your tests saying? Do you have tests?
There are probably more questions but it seems you're just looking for something new, right? :)
You might want to join #rubinius on freenode to ask your questions.

What makes a web-based framework scalable?

Thank you very much in advance.
First of all, I think of scalability as the ability to design a system that does not need to change when demand for its services, whatever they are, increases considerably. Do you need more hardware (vertically or horizontally)? Fine, add it at your leisure, because the system has been designed to cope with it.
My question is simple to ask but presumably very complex to answer. I would like to know what I should look at in a framework to make sure it will scale accordingly, both in the number of hits and in the number of sessions running simultaneously.
This question is not about a particular technology or framework at all; it is more a theoretical question.
I know this depends very much on having a good database design and proper hardware behind it with replication, etc. Let's assume all of that exists; even so, my framework must meet some criteria. Which ones?
Provide a memcache?
Ability to run across multiple machines (at the web server level) and use many replicated databases? But what is in the software that makes that possible?
etc...
Please, let's not relate the answers with any particular programming language or technology behind.
Thanks again,
D.
I think scalability depends most of all on the use case: if you expect huge amounts of data, focus on the database; if it's about traffic, focus on the server; if it's about adding new features, focus on your data model and the framework you are using.
Comparing a micropost service like Twitter to a university website or a web service like Google Docs, you will find quite different requirements.
First of all, the common notion of scalability is the ability of software to improve in throughput or capacity as more hardware resources are added (CPUs, memory, bandwidth, etc.).
Software that does not improve with increased resources is not scalable.
Definitions aside, I think your question is about evaluating frameworks you plan to introduce into your implementation that may affect your software's ability to scale.
IMHO the most important factor to evaluate when introducing a framework is whether there is hidden serialization in it (serialization that, in effect, transfers to your software).
So if you introduce a framework that brings serialization into your application, that can limit your ability to scale.
How to evaluate?
Careful source code inspection (if it's open source).
Are there any performance guarantees offered by those who build the framework?
Do measurements yourself to see how introducing the framework affects your performance, and replace it if you're not satisfied (a minimal sketch of such a measurement follows below).
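To make the hidden-serialization point concrete, here is a small, hypothetical C# benchmark: the same CPU-bound work is run once freely in parallel and once behind a global lock, the way a framework with an internal lock would behave. The names and workload are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

class SerializationDemo
{
    static readonly object GlobalLock = new object();

    // ~1 ms of pure CPU work.
    static void BusySpin()
    {
        var sw = Stopwatch.StartNew();
        while (sw.Elapsed.TotalMilliseconds < 1) { }
    }

    static TimeSpan Run(bool serialized, int tasks)
    {
        var sw = Stopwatch.StartNew();
        Parallel.For(0, tasks, _ =>
        {
            if (serialized)
                lock (GlobalLock) BusySpin();   // hidden lock serializes everything
            else
                BusySpin();                     // scales with the number of cores
        });
        return sw.Elapsed;
    }

    static void Main()
    {
        Console.WriteLine($"cores:      {Environment.ProcessorCount}");
        Console.WriteLine($"parallel:   {Run(false, 2000)}");
        Console.WriteLine($"serialized: {Run(true, 2000)}");  // roughly core-count times slower
    }
}
```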

SmartGWT, ZK and GenericFrame - Online Homework

Good day,
Our school, a small high school in semi-rural New Zealand, is currently looking into online homework solutions. Being one of the IT guys, I have been asked to look into some of the options. We have checked around and there are no robust solutions that cover what we are looking for. So, we are considering development of our own system, either on our own or in collaboration with some other schools.
Before I put significant time into any one option, I thought I should ask for some expert advice.
Please keep in mind that one of our major obstacles is that around 20% of our students are on dial-up because broadband is not available in their area.
We are also not limited to the technologies listed; they are just the ones we have been looking into up to this point.
With that in mind, here goes.
1. Is there a way to pre-determine the bandwidth needed for these technologies?
2. If bandwidth continued to be too limiting, could the final solution stand alone so we could distribute it to students on CD or USB stick?
3. What are some pros/cons of each for use with databases, specifically mysql or postgresql? (After all we do need to keep track of lots of data)
4. What are some pros/cons of each of these for RIA development?
I appreciate everyone for sharing their time and expertise on the matter.
Cheers,
Ben
1) If you write a full-AJAX application, such as with GWT, the bandwidth will be:
a) the size of the application's JavaScript, images, etc. You can assume everything is loaded when the user logs in (the image cache may seem big, but it is easily overflowed);
b) the size of the communication. In GWT this depends only on you: there is no magic full-frame reloading, and only what YOU want to send is sent.
2) I don't quite catch your point: stand-alone applications can be distributed that way, but applications that use databases generally can't.
3) PostgreSQL has high compatibility with Oracle: the same transaction + SELECT FOR UPDATE behaviour, and PL/pgSQL is heavily inspired by PL/SQL (stored procedures are easy to rewrite).
I personally suggest MySQL for a school project for its simplicity. PostgreSQL is powerful but a bit complicated to configure, and its visual tool for optimizing queries is not good.
Without considering the bandwidth, I definitely suggest ZK since, again, it is much easier to learn, develop, and maintain (and also much more powerful). The bandwidth consumption and latency of GWT really depend on how much effort you invest and how skilled your people are with distributed computing. With ZK, the network traffic is basically the state of the UI (not the data), which is reasonably small. In short, you could get the best bandwidth and latency by optimizing GWT heavily, while ZK is less to worry about; but if you want to improve it further, you have to use jQuery (i.e., JavaScript).
Thanks lechlukasz, I appreciate your comments and insight.
I will clarify my point about stand alone applications. We have a number of students, as high as 20%, who do not have access to broadband due to their geographic location. We are considering, as part of the design, how we may be able to distribute a stand alone version.
For instance, if we were to abstract all the database calls behind a separate class in GWT, we could recompile a stand-alone version that didn't make the database calls. The database would likely only be for tracking results and reporting.
In reality, we would likely implement the front-end product first with references to empty methods for storing the results in a database, and implement those methods at a later time.
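A rough sketch of that abstraction idea, written here in C# for brevity (the project itself would use GWT/Java; all names are hypothetical):

```csharp
// Hypothetical interface the rest of the app codes against.
public interface IResultStore
{
    void SaveResult(string studentId, string exerciseId, int score);
}

// Online build: forwards results to the server-side database.
public class DatabaseResultStore : IResultStore
{
    public void SaveResult(string studentId, string exerciseId, int score)
    {
        // Issue the RPC / SQL call here in the networked version.
    }
}

// Stand-alone (CD/USB) build: no database available, so results are
// simply dropped, or could be written to a local file for later import.
public class NoOpResultStore : IResultStore
{
    public void SaveResult(string studentId, string exerciseId, int score) { }
}
```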
For the record, we have started to code up some test cases using GWT/SmartGWT and are pleased with the results. Although we cannot comment on the other technologies considered because we didn't try them to the same extent, we are pleased with the results to this point of the project.
Cheers,
Ben

Real-time system proof-of-concept project

I'm taking an introductory course (3 months) on real-time systems design, but without any implementation.
I would like to build something that lets me better understand what I'll learn in theory, but since I have never built a real-time system I can't estimate how long any project will take. It would be a proof-of-concept project, or something like that, given my available time and knowledge.
Please, could you give me some ideas? Thank you in advance.
I program in T-SQL, Delphi, and C#, but I won't have any problem learning another language.
Suggest you consider exploring the Real-Time Specification for Java (RTSJ). While it is not a traditional environment for constructing real-time software, it is an up-and-coming technology with a lot of interest. Even better, you can witness some of the ongoing debate about what matters and what doesn't in real-time systems.
Sun's JavaRTS is freely available for download and has some interesting demonstrations that show deterministic behavior and show off its RT garbage collector.
In terms of a specific project, I suggest you start simple:
1) Build a work generator that you can tune to consume a given amount of CPU time (a minimal sketch follows below);
2) Put this into a framework that can produce a distribution of work-generator tasks (as threads, or as chunks of work executed in a thread), with a mechanism for logging the work produced;
3) Produce charts of the execution time, sojourn time, deadline, and slack/overrun of these tasks versus their priority;
4) Demonstrate that tasks running in the context of real-time threads (versus timesharing) behave differently.
Bonus points if you can measure the scheduler's overhead by determining at what offered load (total CPU time produced by your work-generator tasks divided by wall-clock time) your tasks begin missing deadlines.
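As a starting point for step 1, here is a small, hypothetical C# sketch of a tunable work generator that releases a task every ~10 ms and logs queueing delay, sojourn time, and slack against a fixed deadline (all numbers are made up):

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class WorkGenerator
{
    // Burn approximately 'cpuMs' milliseconds of CPU by spinning.
    static void Burn(double cpuMs)
    {
        var sw = Stopwatch.StartNew();
        while (sw.Elapsed.TotalMilliseconds < cpuMs) { }
    }

    static void Main()
    {
        var rng = new Random(42);
        const double deadlineMs = 20.0;

        for (int i = 0; i < 50; i++)
        {
            int id = i;                                // avoid loop-variable capture
            double cost = 5 + rng.NextDouble() * 10;   // 5-15 ms of CPU per task
            var clock = Stopwatch.StartNew();          // started at release time

            Task.Run(() =>
            {
                double queued = clock.Elapsed.TotalMilliseconds; // time spent waiting
                Burn(cost);
                double finish = clock.Elapsed.TotalMilliseconds; // sojourn time
                double slack = deadlineMs - finish;              // < 0 => deadline miss
                Console.WriteLine(
                    $"task {id}: queued {queued:F1} ms, sojourn {finish:F1} ms, slack {slack:F1} ms");
            });

            Thread.Sleep(10); // release a new task every ~10 ms
        }

        Thread.Sleep(1000); // let the remaining tasks drain before exiting
    }
}
```

Raise the per-task cost or the release rate until slack goes negative, and you have a crude measure of where your scheduler starts missing deadlines.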
Try to think of real-time tasks that are time-critical, for instance video playback, which fails if tasks (e.g. calculating the next frame) are not finished in time.
You can also think of some industrial solutions, but they are probably more difficult to study in your local environment.
You should definitely consider building your system using a hardware development board equipped with a small processor (ARM, PIC, AVR; any one will do). This really helped remove my fear of low-level work when I started developing. You'll have to use C or C++, though.
You will then have two alternatives : either go bare-metal, or use a real-time OS.
Going bare-metal, you can learn:
How to initialize your processor from scratch and, most importantly, how to use interrupts, which are the fastest way you have to respond to an external event
How to implement lightweight threads with fast context switching, something every real-time OS implements
In order to ease this a bit, look for a dev kit which comes with lots of documentation and source code. I used Embedded Artists ARM boards and they give you a lot of material.
Going with an RT OS:
You'll fast-track your project and will be able to learn how to fine-tune an RT OS
You may try your hand at an open-source OS, such as Linux or the BSDs, and learn a lot from the source code
Either choice is good: you will get a really cool hands-on project to show off and, hopefully, a better understanding of your course material. Good luck!
Since most realtime systems are still implemented in C or C++, it may be good to brush up your knowledge of these languages. Many realtime systems are also embedded systems, so you might want to play around with a cheap open-source board like the BeagleBoard (http://beagleboard.org/). This will also give you a chance to learn about cross-compiling, etc.