what can cause asp.net to stop responding - asp.net-3.5

we have an asp.net application, that makes heavy use of REST/json services through httpwebrequest and heavy use of the web cache. seemingly sporadically, the worker process simply stops responding. the browser sits expecting a response and never gets one. only an app pool restart seems to remedy the problem. there are no unhandled exceptions or any other anomalies that we have noticed. cpu and memory loads are low (<10% cpu util, >80% free memory)
the only possible hint that we have had is that we saw 'deadlock detected' messages in the event log at one point, but not consistently. additionally, we've eliminated what we think to be the only possible cause of such contention (executing multiple concurrent httpwebrequests via begin/end request).
any ideas?
win 03 server/data center edition (on amazon ec2)
asp.net 3.5

When you have a non-responsive application with low cpu used, it's most likely a concurrency issue. You either have deadlocks which cause all your threads to stall, or your asynchronous threads take forever to finish and the ThreadPool ends up starving. Since you're running ASP.Net, you should check how the ThreadPool is doing. You can watch the performance counters or, better, attach to your application with windbg (you can get it from Debugging Tools for Windows http://www.microsoft.com/whdc/DevTools/Debugging/default.mspx).
Launch windbg, attach to the asp.net process and then, in the command prompt:
.loadby mscorwks sos
!ThreadPool
Even better, download the sosex extension (http://www.stevestechspot.com/CommentView,guid,9fdcf4a4-6e09-4807-bc31-ac1adf836f6c.aspx) and check for deadlocks with
!dlk
check what are the threads blocking, switch to one of the threads (using the UI or the command line) and type
!CLRStack
to have a stack trace showing you which method is blocking your application.

Related

When deadlocks occur in modern operating systems?

I know deadlocks was a hot research topic in past. But, even though I studied lots of modern operating systems, I cannot see any major problem about deadlocks now. I know some (most) resources which deadlocks can occur strictly managed by operating system itself and seems it prevent deadlocks someway, I really didn't see any case related to a deadlock. I know lots of features about resources handled different than others in popular systems with different design principles but, they can all maintain system deadlock-free.
Try to use two mutexes in your program and in first thread close in sequence: mutex1, sleep(500ms), mutex2, in second thread: mutex2, sleep(1000ms), mutex1.
In systems. In windows (including 8.1) if your application uses SendMessage and broadcast HWND_BROADCAST - if one application is hung, your application also will be in hung state. Also in part cases of DDE communication (including ShellExecute for part of programs), if one application is not responsive, your application can be in hung state.
But you can use SendMessageTimeout...
The deadlock will always be possible if processes or threads will be synchronized. Synchronization of processes and threads is a "must-have" element of applications.
AND... SYSTEM-WIDE deadlock (Windows):
Save all your documents before this action.
Create HWND h1 with parent=0 or parent=GetDesktopWindow and styles 0x96cf0000
Create HWND h2 with parent=h1 and styles 0x96cf0000
Create HWND h3 with parent=h2 and styles 0x56cf0000 (here must be a child window).
Use ::SetParent(h1, h3);
Then click any of these windows.
The system will in cyclic (triangle) order try to reorder windows. The application is hung but if any other application will try to use SetWindowPos, the application will newer return from this function. The Task Manager won't help, the Alt+Ctrl+Del also stops to work. 100% of usage of CPU... Only hard reset will help you.
There is possibility to prevent it but this situation must be detected ASAP.
Operating system deadlocks still happen. When a system has limited contended resources that it can't reclaim a deadlock is still possible.
In linux, look at kernel stalls, these happen when I/O doesn't release in a timely manner. Kernel stalls are particularly interesting between vmware and guest operating systems.
For external instigators, deadlocks happen when san systems and networks have issues.
New release deadlocks happen fairly often while maturing a kernel, not per user, but as a whole from the community.
Ever get a blue screen or instant reboot? Some of those are caused by lost resources.
Kernels are fairly mature, and have gotten good at reclaiming resources, but aren't perfect.
Most modern resource handlers tend to present as services now instead of being lockable objects. Most resource sharing within the operating system relies on separate channels, alleviating much of the overlap. There's a higher reliance on queues and toggles instead of direct locking contention on shared buffers. These are generalities of trends in OS parts and pieces that contribute to less opportunity for deadlocks, but there's not a way to guarantee a deadlock less system.

PhantomJS not killing webserver client connections

I have a kind of proxy server running on a WebServer module and I noticed that this server is being killed due to its memory consumption.
Every time the server gets a new request it creates a child client process, the problem I see is that the process remains alive indefinitely.
Here is the server I'm using:
server.js
I thought response.close() was closing and killing client connections, but it is not.
Here is the list of child processes displayed on htop:
(Those process are even more, it is just a fragment of the list)
I really need to kill those processes because they are using all the free memory. Am I missing something?
I could simply restart the server, but the memory will still be wasted.
Thanks you !
EDIT:
The processes I mentioned before are threads and no independient processes as I thought (check this).
Every http request creates a new thread, and that's ok, but this thread is not being killed after the script ends.
Also, I found out that no new threads are created if the request handler doesn't run casper (I mean casper.run(..)).
So, new threads are created only if the server runs a casper instance, the problem is that this instance doesn't end after run function does.
I tried casper.done() as mentioned below, but it kill the whole process instead of the current running thread. (I did not find any doc for this function).
When I execute other casper scripts, outside the server in the same machine, the instanced threads and the whole phantom process ends successfully. What would be happening?
I am using Phantom 2.1.1 and Casper 1.1.1 versions.
Please ask me anything if you want more or specific information.
Thanks again for reading !
This is a well known issue with casper:
https://github.com/casperjs/casperjs/issues/1355
It has not been fixed by the casper guys and is currently marked as an enhancement. I guess it's not on their priority list.
Anyways, the workaround is to write a server side component e.g. a node.js server to handle the incoming requests and for every request run a casper script to do the scraping in a new child process. This child process will be closed when casper terminates it's job. While this is a workaround, it is not an optimal solution as the cost of opening a child process for every request is not cheap. it will be hard to heavily scale an approach similar to this. However, it is a sufficient workaround. More on this fully sensible approach is in the link above.

OSB: Analyzing memory of proxy service

I have multiple proxies in a message flow.Is there a way in OSB by which I can monitor the memory utilization of each proxy ? I'm getting OOM, want to investigate which proxy is eating away all/most memory.
Thanks !
If you're getting OOME then it's either because a proxy is not freeing up all the memory it uses (so will eventually fail even with one request at a time), or you use too much memory per invocation and it dies over a certain threshold but is fine under low load. Do you know which it is?
Either way, you will want to generate a heap dump on OOME so you can investigate what's going on. It's annoying but sometimes necessary. A colleague had to do that recently to fix some issues (one problem was an SB-transport platform bug, one was a thread starvation issue due to a platform work manager bug, the last one due to a Muxer bug when used in exalogic).
If it just performs poorly under load, then you'll need to do the usual OSB optimisations, like use fewer Assign steps (but assign more variables per step), do a lot more in xquery rather than proxy steps, especially loops that don't need a service callout, since they can easily be rolled into a for loop in xquery; you know, all the standard stuff.

Is there a way to interrupt Eclipse when it is hanging (eg, on content assist)

I'm using the Scala plugin, and it sometimes likes to hang and spin a rainbow wheel (Mac wait icon) for a long time. It's very annoying. Is there something like a "Control C" for the current thread? I'd like a way to tell Eclipse to kill the current UI command. This would help when use plugins that are not as polished as the one for Java.
There is no such feature and I would not hold your breath waiting for one. Unlike system processes that exist independent of each other, threads are entangled. Force killing a thread is very likely to corrupt various data structures and leave Eclipse process in a bad state.
Various API techniques exist to allow for cancellation of an operation, but they all rely on the running operation to actively check for cancellation request and safely shutdown. Not much help for dealing with unpolished plugins, since graceful handling of cancellation in all cases tends to be implemented as part of polishing.

What does 8badf00d mean?

Sometimes my iPhone application crashes with a weird crashlog, that reads exception code is 0x8badf00d. The stacktraces show random snapshots of app execution, but nothing suspicious. This happens very rarely and I'm not able to find out how to reproduce it. Does anybody know more about this kind of exception and exception code?
Here is an excerpt from my crashlogs:
Exception Type: 00000020
Exception Codes: 0x8badf00d
Highlighted Thread: 0
Application Specific Information:
Failed to deactivate
Thread 0:
...
< nothing suspicious here >
...
Unknown thread crashed with unknown flavor: 5, state_count: 1
0x8badf00d is the error code that the watchdog raises when an application takes too long to launch or terminate. See Apple's Crash Reporting for iPhone OS Applications document
If you get Exception Type: 00000020 and Exception Codes: 0x8badf00d then it is watchdog timeout crash reports. The exception code 0x8badf00d is called "ate bad food".
The most common reason for this crash is synchronous activity on main thread. The fix is to switch to asynchronous activity on main thread.
Refer this Apple document for more detail.
It is HexSpeak, see here: http://en.wikipedia.org/wiki/Hexspeak
It's some programmer's idea of a joke. You have to pick a number for your code, but the number doesn't necessarily mean anything in itself. 8badf00d is just another way to write the number 2,343,432,205, and was chosen because it looks 'funny' when represented in hex for an exception log.
0x8badf00d exception is raised by apple provided watchdog. The most common cause for watchdog timeout crashes in a network application is synchronous networking on the main thread. There are four contributing factors here:
synchronous networking — This is where you make a network request and block waiting for the response.main thread — Synchronous networking is less than ideal in general, but it causes specific problems if you do it on the main thread. Remember that the main thread is responsible for running the user interface. If you block the main thread for any significant amount of time, the user interface becomes unacceptably unresponsive.long timeouts — If the network just goes away (for example, the user is on a train which goes into a tunnel), any pending network request won't fail until some timeout has expired. Most network timeouts are measured in minutes, meaning that a blocked synchronous network request on the main thread can keep the user interface unresponsive for minutes at a time.Trying to avoid this problem by reducing the network timeout is not a good idea. In some situations it can take many seconds for a network request to succeed, and if you always time out early then you'll never make any progress at all.watchdog — In order to keep the user interface responsive, iOS includes a watchdog mechanism. If your application fails to respond to certain user interface events (launch, suspend, resume, terminate) in time, the watchdog will kill your application and generate a watchdog timeout crash report. The amount of time the watchdog gives you is not formally documented, but it's always less than a network timeout.
There are two common solutions:
asynchronous networking — The best solution to this problem is to run your networking code asynchronously. Asynchronous networking code has a number of advantages, not least of which is that it lets you access the network safely without having to worry about threads.synchronous networking on a secondary thread — If it's prohibitively difficult to run your networking code asynchronously (perhaps you're working with a large portable code base that assumes synchronous networking), you can avoid the watchdog by running your synchronous code on a secondary thread.
Refer apple docs for more information.
It's a failure code added by a dev with a good sense of humor. Because hexadecimal uses letters as well as numbers, it's possible to come up with hex numbers that look approximately like english words, such as "0xdeadbeef", etc. I'm sure that the exception has a specific meaning, but if there's no major symptoms associated with it, you can probably ignore it without too much concern.
from wikipedia 0xBAADF00D ("bad food") is used by Microsoft's LocalAlloc(LMEM_FIXED) to indicate uninitialised allocated heap memory when the debug heap is used.[7]