Designing a Stress Testing Framework

I do a sort of integration/stress test on a very large product (think operating-system size), and recently my team and I have been discussing ways to better organize our test workloads. Up until now, we've been content to have all of our (custom) workload applications in a series of batch-type jobs, each of which represents a single stress-test run. Now that we're at a point where the average test run involves upwards of 100 workloads running across 13 systems, we think it's time to build something a little more advanced.
I've seen a lot out there about unit testing frameworks, but very little for higher-level stress-type tests. Does anyone know of a common (or uncommon) way of solving the problem of managing large numbers of workloads?
Right now we would like to keep a database of each individual workload and provide a front-end to mix and match them into test packages, depending on what kind of stress we need on a given day. What we don't have is any example of the best way to do more advanced things, like ranking the stress that each individual workload places on a system.
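For illustration, one hypothetical shape such a catalogue could take: each workload carries a per-resource stress score that a front-end could sum over when assembling a test package (all names here are invented, not our actual code):

    import java.util.EnumMap;
    import java.util.Map;

    // A catalogue entry the front-end can mix and match from.
    public class Workload {

        public enum Resource { CPU, MEMORY, DISK_IO, NETWORK }

        private final String name;
        // 0-10 rating of how hard this workload hits each resource
        private final Map<Resource, Integer> stressScore = new EnumMap<>(Resource.class);

        public Workload(String name, int cpu, int memory, int diskIo, int network) {
            this.name = name;
            stressScore.put(Resource.CPU, cpu);
            stressScore.put(Resource.MEMORY, memory);
            stressScore.put(Resource.DISK_IO, diskIo);
            stressScore.put(Resource.NETWORK, network);
        }

        public String name() { return name; }

        public int scoreFor(Resource r) { return stressScore.get(r); }
    }

A day's test package would then just be a selection of workloads whose summed scores match the stress profile we want on each of the 13 systems.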
What are my fellow stress testers on large products doing? For us, a few hand-rolled scripts just won't cut it anymore.

My own experience has been initially on the IIS platform with WCAT, and latterly with JMeter and Selenium.
WCAT and JMeter both allow you to walk through a web site along a route, completing forms and so on, and to record the process as a script. The script can then be played back singly, or simulating multiple clients and multiple threads, with the playback randomised to simulate lumpy and unpredictable use.
The scripts can be edited, or can be written by hand once you know where you are going. WCAT will let you play back from the log files as well, allowing you to simulate real-world usage.
Both of the above are installed on a PC or server.
Selenium is a Firefox add-in, but it works in a similar way, recording and playing back scripts and allowing for scaling up.
Somewhat harder is working out the scenarios you are testing for and then designing the tests to fit them. Interaction with the database and other external resources also needs to be factored in. Expect to spend a lot of time looking at log files, so good graphical output is essential.

The most difficult thing for me was to collect and organize performance metrics from servers under load; Performance Monitor was my main tool. When the Visual Studio Tester edition came out, I was astonished at how easy it was to work with performance counters. They pre-packaged lists of counters for a web server, SQL Server, an ASP.NET application, and so on, and I learned about a bunch of performance counters I did not even know existed. In addition, you can collect your own counters, store metrics after each run, and connect to production servers to see how they are feeling today. When I see all those graphs in real time, I feel empowered! :)
If you need more load, you can get VS Load Agents and create a load-generating rig (or should I call it a botnet?). Compared to other products on the market it is relatively inexpensive: the MS license goes per processor, not per concurrent request, which means you can produce as much load as your hardware can handle. On average, I was able to get about 3,000 concurrent web requests from a dual-core computer with 2 GB of memory. In addition, you can incorporate performance tests into your builds.
Of course, this is Windows-only. Also, the price tag of about $6K for the tool could be a bit high, plus the same amount of money per additional load agent.

Related

When to use a workflow engine - use and misuse

THE CONTEXT:
I need to develop software to calculate billing for a large number of customers.
The software will be used by different local administrations, each with its own rules for calculating the billing for its citizens.
At first I thought of a workflow engine, in order to "design" different calculation flows and apply them to the customers.
In the past I had a little experience with a workflow manager product (I worked a little with IBM BPM), and I found it very difficult to debug what happened when something went wrong; I also found a lot of performance issues (compared to a simple OOP application).
Maybe these difficulties were caused by my poor knowledge of the tool, or maybe IBM BPM is not as good as IBM says.
Anyway, with respect to my objective (producing custom billing, as flexible as possible in terms of configuration and process), is a workflow engine a suitable product?
Any suggestions about tools and frameworks, and above all about how to approach the problem, are welcome.
My initial idea for the architecture is to develop the main software in C# (where I'm more confident) and use a workflow engine (like jBPM) as a black box, invoking previously configured flows in the BPM.
I would recommend using Cadence Workflow for your use case. It records all events related to your workflow in an execution history, which makes troubleshooting production issues very straightforward.
As a workflow is essentially a program in Java (or Go), you have unlimited flexibility in implementation. Performance is also not an issue, as Cadence was built from the ground up for high scalability; it has been tested to over a hundred million open workflows and tens of thousands of events per second.
See the presentation that goes over the Cadence programming model.
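A minimal sketch of that programming model with the Cadence Java client, applied to the billing case (the interface names, methods, and timeout are invented for illustration):

    import java.math.BigDecimal;
    import java.time.Duration;
    import com.uber.cadence.activity.ActivityOptions;
    import com.uber.cadence.workflow.Workflow;
    import com.uber.cadence.workflow.WorkflowMethod;

    // The per-administration calculation flow, expressed as a workflow.
    interface BillingWorkflow {
        @WorkflowMethod
        BigDecimal calculateBill(String customerId);
    }

    // Side-effecting steps (DB reads, local tariff rules) live in activities.
    interface BillingActivities {
        BigDecimal fetchUsage(String customerId);
        BigDecimal applyLocalTariff(BigDecimal usage);
    }

    class BillingWorkflowImpl implements BillingWorkflow {
        // Every activity call and its result land in the execution
        // history, which is what makes production debugging easy.
        private final BillingActivities activities = Workflow.newActivityStub(
                BillingActivities.class,
                new ActivityOptions.Builder()
                        .setScheduleToCloseTimeout(Duration.ofMinutes(5))
                        .build());

        @Override
        public BigDecimal calculateBill(String customerId) {
            BigDecimal usage = activities.fetchUsage(customerId);
            return activities.applyLocalTariff(usage);
        }
    }

Each local administration could register its own activity implementations while sharing the same workflow shape.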

How to make/build larger Selenium Test Suites?

I'm building tests for a suite of enterprise-ware. The good advice I had from the author of Test::WWW::Selenium was to build routines of larger functions and then parameterise them. So we have: add_user, post_blog and so on. These routines can then be enriched to check that the right text appears on pages and so on. This software is very configurable and we have tens of sites, all different. But these building blocks can be strung together and the driver data modified appropriately on a per-site basis.
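In WebDriver-flavoured Java, a building block of that kind comes out roughly like this (our Perl routines follow the same pattern; the locators, paths, and the SiteConfig type are invented for illustration):

    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;

    // Hypothetical per-site driver data: base URL, expected texts, etc.
    interface SiteConfig {
        String baseUrl();
        String userCreatedMessage();
    }

    public class SiteActions {
        private final WebDriver driver;
        private final SiteConfig site;

        public SiteActions(WebDriver driver, SiteConfig site) {
            this.driver = driver;
            this.site = site;
        }

        // add_user as a reusable routine: same steps on every site,
        // per-site differences supplied by the driver data.
        public void addUser(String name, String password) {
            driver.get(site.baseUrl() + "/admin/users/new");
            driver.findElement(By.name("username")).sendKeys(name);
            driver.findElement(By.name("password")).sendKeys(password);
            driver.findElement(By.name("save")).click();
            // enrichment: check that the right text appears on the page
            if (!driver.getPageSource().contains(site.userCreatedMessage())) {
                throw new AssertionError("user was not created on " + site.baseUrl());
            }
        }
    }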
Everything I've found out there on Selenium has been very beginner-level; nothing about building larger test suites. Is there anything beyond this, or is this as good as it gets?
It is very much possible to run a very large number of Selenium tests. In my organization we run around 200,000 to 300,000 tests per day across multiple websites. So yes, it's possible.
Note: I code in Java, and all the info below is from a Java perspective.
For large-scale testing using Selenium to be successful, I would say it needs 3 basic components:
The infrastructure
A good framework and easily maintainable code
Easy and clear reporting
Infrastructure
Your infrastructure should be able to support the huge load. We use Selenium Grid (if you are using Selenium 2, it's called Grid 2.0) to achieve this. Selenium Grid allows us to run multiple tests in parallel and across multiple browsers. We use our own servers, on which virtual machines are deployed to support this testing. There are vendors like Saucelabs.com to whom you can outsource the infrastructure maintenance.
Framework and test code
Your framework must support multithreading, and it should be thread-safe, to utilize the Selenium Grid features. We use Java to make sure this happens. TestNG is used to run the tests in parallel: it can run multiple methods in parallel, with all of these tests pointing at one single hub. The hub then distributes the tests across the multiple Remote Controls (RCs) connected to it.
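As a rough sketch (the hub URL and target site are placeholders), a thread-safe test pointing at the hub looks like this; testNG's parallel="methods" setting in testng.xml then fans the methods out across the Grid:

    import java.net.URL;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.remote.DesiredCapabilities;
    import org.openqa.selenium.remote.RemoteWebDriver;
    import org.testng.annotations.Test;

    public class GridSmokeTest {

        // One driver per method and no shared mutable state: that is what
        // keeps the class thread-safe when testNG runs methods in parallel.
        private WebDriver newDriver() throws Exception {
            return new RemoteWebDriver(new URL("http://grid-hub:4444/wd/hub"),
                                       DesiredCapabilities.firefox());
        }

        @Test
        public void homePageLoads() throws Exception {
            WebDriver driver = newDriver();
            try {
                driver.get("http://www.example.com/");
            } finally {
                driver.quit(); // always release the Grid slot
            }
        }
    }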
When you have a large set of tests, maintenance is inevitable. To reduce the rework caused by a change in the application, it's always better to follow the Page Object Model. A page object essentially means that each page in your application has a corresponding class in your code, in which you define all the elements and actions for that page. This is a very modular and reusable structure. You can google for "page object model" and you will find many articles.
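A bare-bones page object might look like this (the page classes and locators are invented for illustration):

    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;

    // One class per page: tests call methods and never touch locators
    // directly, so a UI change means editing exactly one class.
    public class LoginPage {
        private static final By USERNAME = By.id("username");
        private static final By PASSWORD = By.id("password");
        private static final By LOGIN    = By.id("login");

        private final WebDriver driver;

        public LoginPage(WebDriver driver) {
            this.driver = driver;
        }

        public HomePage loginAs(String user, String password) {
            driver.findElement(USERNAME).sendKeys(user);
            driver.findElement(PASSWORD).sendKeys(password);
            driver.findElement(LOGIN).click();
            return new HomePage(driver); // navigation hands back the next page object
        }
    }

    class HomePage {
        private final WebDriver driver;

        HomePage(WebDriver driver) {
            this.driver = driver;
        }
        // elements and actions for the home page go here
    }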
Reporting
It's great to run 200,000 tests per day, but if you cannot show what failed and what passed in an easily accessible way, your automation tests will not be of much use. There are multiple approaches here, from plain HTML reporting to building custom dashboards to show the results.
Once you have all of this in place, the only thing you still need is a tool to keep these tests running continuously. You can use any CI (continuous integration) tool to achieve this; Jenkins, Hudson, and CruiseControl are a few of the most commonly used.

How does "distributed computing" apply to web development or programming in general?

I am about to use Apache Hadoop; the headline reads:
"The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing."
I can relate "scalability" to programming, but I just don't know how this "distributing" can help me in my development. According to Wikipedia:
"A distributed system consists of multiple autonomous computers that communicate through a computer network. The computers interact with each other in order to achieve a common goal."
So does this mean I can deploy my web apps across multiple computers and do some sort of "intense computing"? The terms that come into my mind are Content Delivery Networks and Cloud Computing.
Web development has always been about distributed computing, since clients have been on different machines to the servers they talk to, web pages can pull in resources from many servers to build a page's content, and servers may talk to other machines to achieve their goals. CDNs make this more obvious than before, but really they're just an evolution, an introduction of a virtualization/indirection layer between what you ask for and the hardware used to provide it.
Clouds are about taking the concepts of virtualization and applying them to remote hosting, both of low-level OSes and higher-level software platforms. The really interesting thing about them is that this enables different business models on the part of customers (and with different risks too, but that's mostly not related to the fact that it's distributed computing but rather that it is not wholly under your control in your own jurisdiction).
I've found that the most effective use of distributed computing is when you think in terms of connecting together distinct services, each of which with different capabilities (which might be for technical reasons, or might not; sometimes, it's for business or legal reasons that things have to be divided up) and where each of those services may be provided by many components in multiple locations. There are, and continue to remain, issues with balancing the need for performance (which is a force that brings components together) and the need for robustness (which tends to lead to distribution and replication) within the overall context of the general capabilities map.
My goodness! That paragraph sounds like terrible piffle! What I'm trying to say is that it's all trade-offs, and you should be prepared for not getting it right first time.
(Hadoop is a mechanism for doing a distributed file store, and for efficiently applying certain classes of operation – those that fit well with MapReduce or other similar scatter-gather algorithms – across that whole dataset. If that shoe fits, use it. But it doesn't solve all problems, and thank goodness for that! Things that can do everything tend to look very much like things that can't actually do anything at all, and usefulness and comprehensibility come in the restrictions.)
Hadoop is typically used to process massive data sets by distributing the processing of that data set across multiple machines.
What this means is you probably don't want to use it to "deploy an application". You might use it to process stats on your application, however. For instance, you might have very large logs of user data. This would happen if your user data grows too large to fit on a single hard drive, and/or would take too long for one machine to process stats on (using standard methods like an SQL query).
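To make that concrete, a Hadoop job is essentially just a mapper and a reducer. Here is a hedged sketch (the log format and the field position of the status code are assumptions) that counts HTTP status codes across a huge pile of access logs:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    public class StatusCodeCount {

        // Each mapper processes whatever slice of the logs HDFS put near it.
        public static class LogMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);

            @Override
            protected void map(LongWritable offset, Text line, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = line.toString().split(" ");
                if (fields.length > 8) {
                    ctx.write(new Text(fields[8]), ONE); // assumed status-code field
                }
            }
        }

        // The framework routes all counts for one status code to one reducer.
        public static class SumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text code, Iterable<IntWritable> counts, Context ctx)
                    throws IOException, InterruptedException {
                int total = 0;
                for (IntWritable c : counts) {
                    total += c.get();
                }
                ctx.write(code, new IntWritable(total));
            }
        }
    }

The same code runs unchanged whether the logs sit on one machine or are spread across a hundred; that is the "distributed" part doing the work for you.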
Ygam: while the traditional roles of "client" and "server" were pretty stable from 1960 until about 2005, I believe with every fiber of my being that distributed computing now means that we all carry processors around in our pockets.
Phones do computing work. Phones do NOT need centralized servers, but they DO benefit from them.
Phones, smartphones, and tablets are an example of where distributed computation is going.
You can make a wifi base station out of an Android device now. So now a phone becomes a server of sorts, just for that instant in the coffee shop when you turn it on for the cute person next to you without internet... and now I digress.

How to manage Build time in TDD

Hi, in my project we have hundreds of test cases. These test cases are part of the build process, which gets triggered on every check-in and sends mail to our developer group. This project is fairly big and has been around for more than five years.
Now we have so many test cases that the build takes more than an hour. Some of the test cases are not structured properly, and after refactoring them I was able to reduce the running time substantially, but we have hundreds of test cases, and refactoring them one by one seems a bit too much.
For now, I run some of the test cases (the ones which take really long to execute) only as part of the nightly build and not as part of every check-in.
I am curious how others manage this.
I believe it was in "Working Effectively with Legacy Code" that Michael Feathers said that if your test suite takes longer than a couple of minutes, it will slow developers down too much and the tests will start getting neglected. It sounds like you are falling into that trap.
Are your test cases running against a database? If so, that's most likely your biggest source of performance problems. As a general rule, test cases shouldn't do I/O if it can be avoided. Dependency injection lets you replace the database object with mock objects that simulate the database portion of your code, so you can test the code without worrying about whether the database is set up correctly.
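A hedged sketch of the idea (all names invented); the test below never touches a database:

    import org.junit.Assert;
    import org.junit.Test;

    // Production code depends on an interface, not on the database itself.
    interface AccountStore {
        int balanceFor(long customerId); // the real implementation does the SQL
    }

    class BillingService {
        private final AccountStore store;

        BillingService(AccountStore store) {
            this.store = store; // dependency injected through the constructor
        }

        boolean isDelinquent(long customerId) {
            return store.balanceFor(customerId) < 0;
        }
    }

    public class BillingServiceTest {
        @Test
        public void customerWithNegativeBalanceIsDelinquent() {
            AccountStore fake = id -> -10; // in-memory stand-in, no I/O
            Assert.assertTrue(new BillingService(fake).isDelinquent(42L));
        }
    }

Because the fake does no I/O, thousands of tests like this run in seconds.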
I highly recommend Working Effectively with Legacy Code by Michael Feathers. He discusses how to handle a lot of the headaches that you seem to be running into without having to refactor the code all at once.
UPDATE:
Another possible help would be something like NDbUnit. I haven't used it extensively yet, but it looks promising: http://code.google.com/p/ndbunit/
Perhaps you could consider keeping your Oracle database but running it from a RAM drive? It wouldn't need to be large, because it would only contain test data.
We have about 1,000 tests, a large percentage of which communicate through REST and hit a database. The total run time is about 8 minutes. An hour seems excessive, but I don't know what you are doing or how complex your tests are.
But I think there is a way to help you. We are using TeamCity, which has a nice ability to run multiple build agents. What you could do is split your test project into subprojects, with each subproject containing just a subset of the tests; you could use JUnit/NUnit categories to separate them. Then you would configure TeamCity so that each agent builds just one subproject. This way, you get parallel execution of the tests. With a few agents (you get 3 for free), you should be able to get down to 20 minutes, which might even be acceptable. If you put each agent in a VM, you might not even need additional machines; you just need lots of RAM.
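With JUnit 4, for example, categories are just marker interfaces (the test names here are invented), and a suite annotated with @RunWith(Categories.class) and @Categories.IncludeCategory can then pick out one subset per agent:

    import org.junit.Test;
    import org.junit.experimental.categories.Category;

    // Marker interfaces act as the category labels.
    interface FastTests {}
    interface SlowTests {}

    public class CheckoutTests {

        @Test
        @Category(FastTests.class)
        public void priceIsComputedInMemory() {
            // pure logic, no I/O: belongs in the per-check-in run
        }

        @Test
        @Category(SlowTests.class)
        public void fullCheckoutRoundTripsThroughRestAndDatabase() {
            // expensive end-to-end test: nightly run or a dedicated agent
        }
    }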

What does it mean to say that a framework "scales well"?

When reading about frameworks (.NET, Ruby on Rails, Django, Spring, etc.), I keep seeing that such-and-such does or doesn't scale well.
What does it mean when someone says that a framework "scales well" and what does it mean to say a framework "doesn't scale well"?
Thank you.
When you plot some resource use (memory, time, disk space, network bandwidth) against concurrent users, you get a function that describes how the application works at different scale factors.
Small-scale -- a few users -- uses a few resources.
Large-scale -- a large number of users -- uses a large number of resources.
The critical question is, "How close to linear is the scaling?" If it scales linearly, then serving 2,000 concurrent users costs 2 times as much as serving 1,000 users, and 4 times as much as serving 500 users. This is a tool/framework/language/platform/OS that scales well: it's predictable, and the prediction is linear.
If it does not scale linearly, then serving 4,000 users costs 1,000 times as much as serving 2,000 users, which in turn costs 100 times as much as serving 500 users. This did not scale well: something went wrong as usage went up; it is neither predictable nor linear.
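In rough symbols, with n concurrent users:

    \text{scales well (linear):}\qquad \mathrm{cost}(n) = c \cdot n
    \text{scales badly (super-linear):}\qquad \mathrm{cost}(n) = c \cdot n^{k},\quad k > 1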
It means that a particular framework does (or does not) meet the increased demand that more users put on it. If you have an application written in VBScript, it might not do a good job of handling the 40,000,000 users of Facebook, for example.
This blog post explains some of the scalability pains Twitter experienced a year or so ago. It could provide some more insight into the answer to your question.
Sometimes lack of scalability is used to denigrate a language or framework, so watch out for that. Stick to studies that show real metrics. This applies to my VBScript example in the previous paragraph as well.
If a framework or an application scales well, it means it can handle larger loads. As your site becomes more popular, with more visitors and more hits per day, a framework that scales well handles the larger load the same way it handles a smaller one: it behaves the same when it receives 200,000 hits an hour as it does when it gets 1 hit an hour. And it's not only about hits, but also about being deployed across multiple servers, possibly behind load balancing, possibly with several different database servers. A framework that scales well can handle these increasing demands gracefully.
For instance, Twitter exploded almost overnight last year. It was developed using Ruby on Rails, and it was hotly featured in the ongoing debate about whether Rails scales well or not.
Substitute the phrase "handle expansion" for "scale".
There are a few elements to it in my mind. The first is the obvious one: performance scaling. Can your framework be used to build high-capacity, high-throughput systems, or can it only be used to build smaller applications? Will it scale vertically on hardware (parallel libraries, for example), and will it scale horizontally (web farms, for example)?
The second is whether it can scale to larger teams or the enterprise. That is, does it work well with large code bases? Large development teams? Does it have good tool support? How easy is it to deploy? Can you roll it out to tens or hundreds or even thousands of users? All the way down to: is it easy to hire people who have this skill? Think of trying to put together a development team of 20 or 50 people who all work on this framework. Would that be easy or next to impossible?
IMHO, saying that a framework "scales well" usually means that someone in the hearsay chain was able to use it to handle lots of volume.
In parallel programming, scalability usually describes how an algorithm performs as it is parallelized. An algorithm with a 1:1 speedup is a rare beast, but it will double in performance on twice the hardware/CPUs, treble on three times, and so on.
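The textbook bound here is Amdahl's law: if a fraction p of the algorithm can be parallelized, the speedup on N processors is at most

    S(N) = \frac{1}{(1 - p) + p/N}

so a true 1:1 speedup corresponds to p = 1, which is exactly why it is such a rare beast.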
In my experience, practically any framework can be made to scale given sufficient expertise.
The easier the framework is to use, the greater the chance that a developer with insufficient expertise will run into scalability problems.
It means that some respected company is doing something serious with it and isn't having any problems with it.
Scaling means how easily you can satisfy more demand by using more hardware.
Example: you have a website written in some language that gets 1,000 visits a day. You get featured in some prominent magazine and your user numbers grow; suddenly you have 1,000,000 visits a day, that is, 1,000 times as many. If you can just use 1,000 times as many servers to satisfy the grown need for resources, your website scales well. If, on the other hand, you add 2,000 servers but users still can't connect, because your database can only handle 1,000 requests per day, then your website does not scale well.