SSRS load testing and poor performance - ssrs-2008

I am doing load testing with an SSRS report and getting more and more disappointed with it.
I need an expert opinion whether there is a way to improve performance.
Environment setup:
An SSRS report that calls a stored procedure which selects from a table with 5000 rows and runs in 3 milliseconds at most.
A C# application that creates threads based on input parameters and calls the SSRS report in parallel.
SSRS is accessed by a POST request to the report URL, always under the same NT user.
Stats:
Stored procedure in SQL Server Management Studio runs in 3 milliseconds
SSRS report in IE runs for 50ms
C# Application with single thread gets results back in 157 - 239 milliseconds
4 threads average 500 milliseconds for the same report
8 threads: 800 milliseconds for the same report.
16 threads: 1300 milliseconds for the same report.
Is there any configuration or settings that can be changed so SSRS handles concurrent calls better?

Apparently the RS service only has 2 threads available per CPU. I've seen that number quoted in various forums around the net.
You may find that the threads you are firing up are actually hindering performance, effectively 'overloading' the RS threads. You can check your log file to see whether the thread pool is under stress; reportedly you can expect an entry of the following type in the log:
WARN: Thread pool pressure. Using current thread for a work item
Perhaps you could try matching the maximum number of concurrent threads you create to twice the number of CPUs the RS service has access to.
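The suggestion above can be sketched as a load driver with a fixed-size pool. This is a minimal illustration in Java rather than the asker's C#, and it assumes the "2 threads per CPU" figure quoted from the forums; `CappedReportLoad`, `maxConcurrentCalls` and `runCapped` are hypothetical names, and the report call is only a placeholder.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CappedReportLoad {
    // Assumption taken from the answer: about 2 RS worker threads per CPU.
    static int maxConcurrentCalls(int reportServerCpus) {
        return Math.max(1, reportServerCpus) * 2;
    }

    // Runs `totalCalls` copies of `call` with at most `cap` in flight
    // and returns the highest concurrency that was actually observed.
    static int runCapped(int totalCalls, int cap, Runnable call) {
        ExecutorService pool = Executors.newFixedThreadPool(cap);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < totalCalls; i++) {
            pool.submit(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    call.run();
                } finally {
                    inFlight.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        int cap = maxConcurrentCalls(4); // hypothetical 4-CPU report server
        int peak = runCapped(64, cap, () -> {
            // placeholder for the HTTP POST to the report URL
        });
        System.out.println("cap=" + cap + ", peak in-flight=" + peak);
    }
}
```

Comparing response times at this cap against the 16-thread run would show whether the extra client threads were only queuing behind the RS thread pool.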

Related

Multiple concurrent connections with Vertx

I'm trying to build a web application that should be able to handle at least 15000 rps. Some of the optimizations I have made are increasing the worker pool size to 20 and setting the accept backlog to 25000. Since I have set my worker pool size to 20, will this help with the blocking piece of code?
A worker pool size of 20 seems to be the default.
I believe the important question in your case is how long you expect each request to run. On my side, I expect to have thousands of short-lived requests, each with a payload size of about 5-10KB. All of these will be blocking, because of a blocking database driver I use at the moment. I have increased the default worker pool size to 40 and I have explicitly set the number of verticle instances to deploy using the following formula:
final int instances = Math.min(Math.max(Runtime.getRuntime().availableProcessors() / 2, 1), 2);
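To make the effect of that formula concrete, here is a small sketch that evaluates it for a few core counts. It uses plain arithmetic only and no Vert.x calls; `VerticleInstances` and `instancesFor` are hypothetical names introduced for illustration.

```java
class VerticleInstances {
    // Same expression as the formula above, with the core count passed in
    // so it can be evaluated for machines other than the current one.
    static int instancesFor(int availableProcessors) {
        return Math.min(Math.max(availableProcessors / 2, 1), 2);
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {1, 2, 4, 8, 16}) {
            System.out.println(cpus + " CPUs -> " + instancesFor(cpus) + " instance(s)");
        }
    }
}
```

The expression is a clamp: machines with fewer than 4 cores get 1 instance and everything else gets 2, so it never scales beyond two instances.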
A test run of 500 simultaneous clients running for 60 seconds, on a vert.x server doing nothing but blocking calls, produced an average of 6 failed requests out of 11089. My test payload in this case was ~28KB.
Of course, from experience I know that running my software in production would often produce results that I have not anticipated. Thus, the important thing in my case is to have good atomicity rules in place, so that I don't get half-baked or corrupted data in the database.

How does JMeter start sending requests to the server

If the configuration is Threads: 100, Ramp-up: 1 and Loop count: 1, how will JMeter start sending requests to the server?
Will requests be sent at 1 req/sec, or will they all be sent to the server at once?
JMeter will send requests as fast as it can, to wit:
It will start all threads (virtual users) you define in Thread Group within the ramp-up period (in your case - 100 threads in 1 second)
Each thread (virtual user) will start executing the Samplers present in the Thread Group from top to bottom (or according to the Logic Controllers)
When there are no more samplers to execute or loops to iterate the thread will be shut down
When there are no more active threads left - JMeter test will end.
With regards to requests per second - it mostly depends on your application response time, i.e.
if you have 100 virtual users and response time is 1 second - you will get 100 requests/second
if you have 100 virtual users and response time is 2 seconds - you will get 50 requests/second
if you have 100 virtual users and response time is 500 milliseconds - you will get 200 requests/second
etc.
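The three figures above follow from dividing the number of virtual users by the response time. A minimal sketch of that arithmetic, assuming each virtual user sends its next request as soon as the previous one returns (no timers or think time); `ClosedLoopThroughput` and `requestsPerSecond` are hypothetical names:

```java
class ClosedLoopThroughput {
    // Approximate throughput when every virtual user loops with no pause:
    // users / response time (in seconds).
    static double requestsPerSecond(int virtualUsers, double responseTimeSeconds) {
        if (responseTimeSeconds <= 0) {
            throw new IllegalArgumentException("response time must be positive");
        }
        return virtualUsers / responseTimeSeconds;
    }

    public static void main(String[] args) {
        System.out.println(requestsPerSecond(100, 1.0)); // 100 users, 1 s
        System.out.println(requestsPerSecond(100, 2.0)); // 100 users, 2 s
        System.out.println(requestsPerSecond(100, 0.5)); // 100 users, 500 ms
    }
}
```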
I would recommend increasing (and decreasing) the load gradually. This way you will be able to correlate increasing load with increasing throughput, response time, number of errors, etc., whereas releasing all threads at once will not tell you the full story (unless you're doing a form of spike testing, in which case consider using the Synchronizing Timer).
Setting JMeter's ramp-up period to 1 means that all 100 threads are started within 1 second.
This isn't a recommended setting, as the documentation describes:
The ramp-up period tells JMeter how long to take to "ramp-up" to the full number of threads chosen. If 10 threads are used, and the ramp-up period is 100 seconds, then JMeter will take 100 seconds to get all 10 threads up and running. Each thread will start 10 (100/10) seconds after the previous thread was begun. If there are 30 threads and a ramp-up period of 120 seconds, then each successive thread will be delayed by 4 seconds.
Ramp-up needs to be long enough to avoid too large a work-load at the start of a test, and short enough that the last threads start running before the first ones finish (unless one wants that to happen).
Start with Ramp-up = number of threads and adjust up or down as needed.
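The spacing described in that passage is the ramp-up period divided by the number of threads. A small sketch of the calculation, reproducing the two documented examples and the configuration from the question; `RampUpSchedule` and its methods are hypothetical names, not part of JMeter:

```java
class RampUpSchedule {
    // Delay between the start of one thread and the next.
    static double delayBetweenStartsSeconds(int threads, double rampUpSeconds) {
        if (threads <= 0) {
            throw new IllegalArgumentException("threads must be positive");
        }
        return rampUpSeconds / threads;
    }

    // Start time of the thread with the given zero-based index.
    static double startOffsetSeconds(int threadIndex, int threads, double rampUpSeconds) {
        return threadIndex * delayBetweenStartsSeconds(threads, rampUpSeconds);
    }

    public static void main(String[] args) {
        System.out.println(delayBetweenStartsSeconds(10, 100));  // 10 threads over 100 s
        System.out.println(delayBetweenStartsSeconds(30, 120));  // 30 threads over 120 s
        System.out.println(delayBetweenStartsSeconds(100, 1));   // the question's setup
    }
}
```

For the question's 100 threads and 1 second ramp-up, the threads start about 10 milliseconds apart, which is close to starting them all at once.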
See also Can i set ramp up period 0 in JMeter?
Bear in mind that with a low ramp-up and many threads you may be limited by local resources, so your results may measure the capability of the client rather than the server.

Is this an intelligent use case for OptaPlanner?

I'm trying to clean up an enterprise BI system that currently is using a prioritized FIFO scheduling algorithm (so a priority 4 report from Tuesday will be executed before priority 4 reports from Thursday and priority 3 reports from Monday.) Additional details:
The queue is never empty, jobs are always being added
Jobs range in execution time from under a minute to upwards of 24 hours
There are 40 some odd identical app servers used to execute jobs
I think I could get OptaPlanner up and running for this scenario, with hard rules around priority and some soft rules around average time in the queue. I'm new to scheduling optimization, so I guess my question is: what should I be looking for in this situation to decide whether OptaPlanner is going to help me or not?
The problem looks like a form of bin packing (and possibly job shop scheduling), which are NP-complete, so OptaPlanner will do better than a FIFO algorithm.
But is it really NP-complete? If all of these conditions are met, it might not be:
All 40 servers are identical. So running a priority report on server A instead of server B won't deliver a report faster.
All 40 servers are identical. So total duration (for a specific input set) is a constant.
Total makespan doesn't matter. So given 20 small jobs of 1 hour, 1 big job of 20 hours and 2 machines, it's fine that all the small jobs are done after 10 hours before the big job starts, giving a total makespan of 30 hours. There's no desire to reduce the makespan to 20 hours.
"the average time in the queue" is debatable: do you care about how long the jobs are in the queue until they are started or until they are finished? If the total duration is a constant, this can be done by merely FIFO'ing the small jobs first or last (while still respecting priority of course).
There are no dependencies between jobs.
If all these conditions are met, OptaPlanner won't be able to do better than a correctly written greedy algorithm (which schedules the highest priority job that is the smallest/largest first). If any of these conditions aren't met (for example you buy 10 new servers which are faster), then OptaPlanner can do better. You just have to evaluate if it's worth spending 1 thread to figure that out.
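A minimal sketch of the kind of greedy algorithm described above, for the "smallest first" variant. It assumes identical servers, no dependencies, known duration estimates, and that a higher priority number is more urgent (inferred from the question, where a priority 4 report runs before a priority 3 one); `GreedyReportScheduler`, `Job`, `startOrder` and `averageWaitMinutes` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class GreedyReportScheduler {
    // Hypothetical job model: duration is an estimate in minutes.
    static final class Job {
        final String name;
        final int priority;
        final long minutes;

        Job(String name, int priority, long minutes) {
            this.name = name;
            this.priority = priority;
            this.minutes = minutes;
        }
    }

    // Highest priority first; within a priority, shortest job first.
    static List<Job> startOrder(List<Job> queue) {
        List<Job> ordered = new ArrayList<>(queue);
        ordered.sort(Comparator.comparingInt((Job j) -> j.priority).reversed()
                .thenComparingLong(j -> j.minutes));
        return ordered;
    }

    // Average time jobs wait before starting when each job, taken in
    // greedy order, goes to whichever identical server frees up first.
    static double averageWaitMinutes(List<Job> queue, int servers) {
        if (queue.isEmpty()) {
            return 0.0;
        }
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int i = 0; i < servers; i++) {
            freeAt.add(0L);
        }
        long totalWait = 0;
        for (Job job : startOrder(queue)) {
            long start = freeAt.poll();
            totalWait += start;
            freeAt.add(start + job.minutes);
        }
        return (double) totalWait / queue.size();
    }

    public static void main(String[] args) {
        List<Job> queue = List.of(
                new Job("A", 4, 60), new Job("B", 4, 5), new Job("C", 3, 1));
        for (Job job : startOrder(queue)) {
            System.out.println(job.name);
        }
        System.out.println(averageWaitMinutes(queue, 1));
    }
}
```

This handles a static snapshot of the queue only; with jobs arriving continuously, the order would have to be recomputed as new reports come in.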
If you use OptaPlanner, definitely take a look at real-time scheduling and daemon mode, to replan as new reports enter the system.

Activiti Rest - Calling multiple instances concurrently

I have defined some simple BPM flows (F1) and deployed them in activiti-rest.war. For simplicity, I have taken a simple start-end flow.
I have written a REST client that executes the flow (F1) with its required parameters from 20 parallel threads, for a total of 1000 HTTP requests.
Problem: I can see that the flows run sequentially, with responses coming back one by one for the 20 parallel threads. It takes around 60 seconds to complete with 20 threads, and it is the same even when I increase to 50 threads.
Activiti Version : 5.15
What could be the problem here? Any help will be really useful.
activiti-rest/service/runtime/process-instances - Rest URL used to start the instance
Thanks,
Yoka
At last I found the solution.
It could be for one of two reasons:
1) Make sure the task's "Exclusive" property is set to false. This needs more analysis of how your process tasks will run. Refer to the link below for further information:
http://www.activiti.org/userguide/#exclusiveJobs
2) If you run the activiti-rest application and the client process on a dual-core machine, it might be difficult to assess the response time.
Thanks,
Yoka

SQL Server "Audit Logout" operation takes long.

We have a stored procedure that is called about 300,000 times per day by 15 users throughout the day. I have pored over every line and it is about as efficient as I can get it.
The stored procedure is accessed through an ASP.NET page on 4.0 from a legacy VB6 application on basic Winterms.
When I look at the SQL trace file, I see the following:
1) exec sp_reset_connection (using the connection pool)
2) Audit Login
3) Execution of the stored procedure
4) Audit Logout
I see that on step 4 the reads and writes are very high, which makes sense since they accumulate while the connection is reused in the pool.
What concerns me is how long it takes: sometimes it takes 50ms and other times 400ms, and it's totally random. From the docs I read, "Audit Logout" is the entire duration for all three steps. But steps 1-3 were very quick, like 0-5ms. Why would the "Audit Logout" duration be so long?
I'm dealing with a similar issue right now and stumbled across this post: http://social.msdn.microsoft.com/Forums/en/sqldatabaseengine/thread/84ecfe9e-ff0e-4fc5-962b-cffdcbc619ee
Maybe this excerpt from that post is the explanation:
"One error in my analysis has been identified. When a connection is
pulled out of the pool, the server is sent a sp_reset_connection.
That reset invokes an audit_logout followed by an audit_login. The
next audit_logout doesn’t occur until the next time the connection is
pulled out of the pool… so the long intervals I am seeing include the
time the application processes the results of a query, releases the
connection to the connection pool, does whatever, and finally pulls
the connection back out of the pool to start the next transaction."