Solaris process taking up large CPU

I have a Java process that seems to be taking up all the CPU processing power.
I'm thinking of killing it, but how do I know which program is actually causing such heavy CPU usage on Solaris?

Try prstat -- it should show you how much CPU each process on your system is using.
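For example, sorting by CPU and limiting the output makes the culprit easy to spot (flags as documented for Solaris prstat; the pid is a placeholder):

    prstat -s cpu -n 10      # top 10 processes, sorted by CPU usage
    prstat -mL -p <pid>      # per-thread (LWP) microstate view of one suspect process

The -mL view is handy for a Java process in particular, since it shows which of its threads are actually burning cycles.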

Although this question was asked some time back, I'm posting this so anyone can refer to it in the future.
You can use the top command to investigate the issue. However, the thing with Solaris is that you have to turn off Irix mode to get the real per-process information. While top is running, press I to toggle Irix mode.
More information can be found here -> How to investigate CPU utilization in Solaris
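For example (the key is interactive, so there is nothing to pass on the command line):

    top     # start top as usual, then press I inside top: with Irix mode
            # off, a process's CPU% is scaled to the machine's total CPU
            # capacity rather than to a single CPU, so one busy multithreaded
            # process no longer looks like it is eating the whole box

Irix mode is a toggle, so pressing I again turns it back on.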

Related

Unusual spikes in CPU utilization in CentOS 6.6 while starting pycharm

My system has been behaving strangely for the last couple of days. I am a regular user of the pycharm software, and it used to work on my system very smoothly with no hiccups at all. But for the last couple of days, whenever I start pycharm, my CPU utilization behaves strangely, as in the image: Unusual CPU util
I am confused because when I go to the process list or try ps/top in a terminal, there is no process using more than 1 or 2% CPU. So I am not sure where these resources are being consumed.
By unusual CPU utilization I mean that first CPU1 gets used 100% for a couple of minutes, then CPU2. That is, only one CPU's utilization goes to 100% for some time, followed by another's. This goes on for 10 to 20 minutes, then the system comes back to normal.
P.S.: I don't think this problem is specific to pycharm, as I face similar issues while doing other work too; it's just that I always face it with pycharm for sure.
POSSIBLE CAUSE: I suspect you have a thrashing problem. The CPU usage of your applications is low because none of them are actually getting much useful work done; all the processing time is being taken up by moving memory pages to and from the disk. Your CPU usage probably settles down after a time because your application has entered a state where its working set has shrunk to the point where it can all be held in memory at once.
This has probably happened because one of the apps on your machine is handling a larger data set than before, and so requires more addressable memory. Another possibility is that, for some reason, a lot more apps are running on your machine.
POTENTIAL SOLUTION: There are several ways you can address this. The simplest is to put more RAM on your machine. If this doesn't work or isn't possible, you'll have to figure out which app is the memory hog. You may simply have to work with smaller problems/data-sets or offload some of the apps onto a different box.
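A quick way to check the thrashing theory on a CentOS box is to watch swap traffic while an episode is happening (standard procps tools; nothing here is specific to pycharm):

    vmstat 1         # watch the si/so columns: sustained nonzero swap-in/swap-out means thrashing
    free -m          # see how much swap is in use overall
    top              # press M inside top to sort by memory and find the hog

If si and so stay at zero during an episode, thrashing is probably not the explanation.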
MIGRATING CPU LOAD: Operating systems will move tasks (user apps, kernel threads) between CPUs for many different reasons, ranging from plain load balancing to certain apps having more of their addressable memory in one bank than another. Given that you are probably doing a lot of thrashing, I'm not surprised that the processor your app runs on appears to change at random over time.
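As an aside, if you want to rule the migration in or out as a separate effect, you can pin a process to one CPU with taskset (from util-linux on CentOS; the pid is a placeholder, and pycharm.sh stands for wherever your PyCharm launcher lives):

    taskset -pc 0 <pid>       # restrict an existing process to CPU 0
    taskset -c 0 pycharm.sh   # or launch it pinned from the start

If the 100% load then stays on CPU0 instead of wandering, the wandering was just the scheduler moving the same single busy thread around.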

What prevents an OS from recovering from a 'blue screen of death'?

If a program violates its instruction path and/or its memory, the OS halts it with some message; it can do so because the program runs inside the virtual-machine-like space the OS provides, and the OS can tell the program is unable to determine its next instruction.
The OS in turn is also a program, sharing the machine's resources like any other program, and it can halt in a similar fashion; but it's sometimes healthy enough to display some debugging info and a blue screen. So as a programmer I'm thinking: if it can do that - emit debugging info and turn the screen blue - why wouldn't it be able to try to recover the OS altogether instead of requiring a cold reboot? After all, it's the OS - it's supposed to be the rock-solid foundation (not talking about Windows, of course) of all software. If the space shuttle ran Windows, what would happen - it wouldn't recover? :)
So: is it only that MS hasn't taken care of trying everything to recover, to the point that a reboot is not required, or is there some deeper problem that has stopped companies like MS from doing that?
It's nothing specific to Microsoft; Linux has a kernel panic mechanism, OS X has a kernel panic mechanism. I expect every non-toy operating system kernel has a panic mechanism of some sort when internal corruption is detected. The corruption could come from faulty hardware, faulty software, gamma rays hitting the memory boards just right, who knows.
The whole point behind the kernel panic is a recognition that something that shouldn't go wrong has gone wrong. What else might be invalid? Depending upon where the crash happened, it might not be safe to sync and unmount the filesystems because that might scribble corrupt data over good data on the drives.
Writing to the video card is a good way to inform the user of events (many systems have monitors attached anyway), and writing to the video card isn't likely to corrupt on-disk data: it would take quite an error for the IOMMU or page tables to be so corrupted that they refer instead to on-disk files. Most operating systems will also refuse to write to block devices after a kernel panic, to protect user data at all costs.
Consider what you would have to do to bring the system back up to a running state: you'd need to tear down all applications that might be associated with corrupted kernel data structures, then restart applications, in the right order, to bring system services back up. A reboot is a very easy way to reliably do both of those things.
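Linux even codifies "reboot is the recovery" as a policy knob: the kernel.panic sysctl tells a panicked kernel to reboot itself after a delay rather than sit at the panic screen (a standard sysctl; the 10-second value here is just an example):

    # /etc/sysctl.conf
    kernel.panic = 10    # after a panic, wait 10 seconds, then reboot automatically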
You can't recover the OS for the same reasons a user-space program can't recover -- when certain types of errors are seen it means that your program is in an undefined state and therefore can't recover. Even if the problem in some sense isn't fatal (i.e. doesn't cause the program to immediately die), it's not safe to continue because things are or are likely corrupted.
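A toy user-space analogue of that point (a hypothetical Scala sketch, not from the original answer): catching the error is not the same as recovering from it, because once a shared invariant is broken mid-update, everything that runs afterwards computes on garbage.

    object UndefinedState {
      val balances = scala.collection.mutable.Map("a" -> 100, "b" -> 100)

      // Invariant: the balances always sum to 200.
      def transfer(from: String, to: String, amt: Int): Unit = {
        balances(from) -= amt
        if (amt == 42) throw new IllegalStateException("bug fired mid-transfer") // simulated bug
        balances(to) += amt
      }

      def main(args: Array[String]): Unit = {
        try transfer("a", "b", 42)
        catch { case e: IllegalStateException =>
          // We "handled" the error, but the invariant is gone: the sum is now
          // 158, not 200. Anything that trusts it from here on computes
          // garbage - the same dilemma, writ small, that a panicked kernel
          // faces with its own data structures.
          println("caught " + e + ", balances = " + balances)
        }
      }
    }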
For example, be it a user-space program or the OS kernel, say a buffer overrun or a messed-up pointer corrupts the stack. How is the program supposed to recover from that? With a blown stack, when the currently executing function ends, where will it return to? The return address is likely gone. Now what?
And it's not just Microsoft. Ever hear of a "kernel panic" in Unix?

Simultaneous programs

I see that the advantages of running programs simultaneously are that the user can run multiple programs and it offers better CPU usage. Can anyone give me an example of when it actually saves CPU time, e.g. busy waiting?
Not sure if I understand right, but when your emacs is waiting for you to type, you save time by scheduling another application while waiting for keyboard inputs.
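To make that concrete, here is a small hypothetical Scala sketch contrasting the two kinds of waiting (all names are made up for illustration):

    object Waiting {
      @volatile var ready = false
      val lock = new Object

      // Busy waiting: this thread spins at ~100% of one core doing no useful
      // work, so its CPU time is simply wasted.
      def busyWait(): Unit = {
        while (!ready) {} // burns CPU until ready flips
      }

      // Blocking wait: the thread sleeps inside wait(), so the scheduler can
      // hand the CPU to other programs (the emacs example) until signalled.
      def blockingWait(): Unit = lock.synchronized {
        while (!ready) lock.wait()
      }

      def signal(): Unit = lock.synchronized {
        ready = true
        lock.notifyAll() // wake any blocked waiters
      }
    }

The saving comes from blockingWait: while one program sleeps on input, the CPU runs something else instead of spinning.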

What sort of things can cause a whole system to appear to hang for 100s-1000s of milliseconds?

I am working on a Windows game and while rendering, some computers will experience intermittent pauses ("hitches" for lack of a better term). When profiled they appear in seemingly random places in the code. Eventually I noticed that it wasn't just my process that was affected, but (seemingly) every process on the system. All of the threads in my application hitch at once. The CPU utilization drops during these hitches and it appears as if most processes make no progress.
This leads me to believe this may be an operating system or driver issue, but it only occurs while playing the game (and only on some systems). What sort of operations might the operating system be doing that would require the kernel to pause all user threads and block? Some kind of I/O? At first I thought of paging, but my impression is that would only affect a single process, no?
Some systems in use: Windows, DirectX (3D), nVidia cards (unknown if it replicates on ATI), using overlapped I/O for streaming.
If you have a lot of graphics in use, it may be paging graphics memory into the swap file.
Or perhaps the stream is getting buffered on disk?
It's worth seeing if the hitches coincide with the PC's disk activity LED.
Heavy use of memory-mapped I/O. This of course includes the system pagefile, but can also include user applications that use memory-mapped files heavily (gcc, for one).
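As a hypothetical JVM-flavoured sketch of why memory-mapped I/O stalls are so hard to see in a profiler: the "read" is just a memory access, so the thread blocks inside a page fault rather than inside any visible I/O call (the file name and page-size stride are made up):

    import java.io.RandomAccessFile
    import java.nio.channels.FileChannel

    object MmapStall {
      def main(args: Array[String]): Unit = {
        // Map a (hypothetical) large file into memory instead of read()ing it.
        val raf = new RandomAccessFile("big.dat", "r")
        val buf = raf.getChannel.map(FileChannel.MapMode.READ_ONLY, 0, raf.length())
        var sum = 0L
        var i = 0
        while (i < buf.limit()) {
          // The first touch of each non-resident page triggers a page fault,
          // and the thread stalls until the disk read completes.
          sum += buf.get(i)
          i += 4096 // stride of one page, assuming 4 KB pages
        }
        println(sum)
        raf.close()
      }
    }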

Computationally intensive scala process using actors hangs uncooperatively

I have a computationally intensive Scala application that hangs. By hangs I mean it is sitting in the process list using 1% CPU, but it does not respond to kill -QUIT, nor can it be attached to via jdb attach.
Runs 2-12 hours at 800-900% CPU before it gets stuck
The application is using ~10 scala.actors.
Until now I have had great success with kill -QUIT, but I am a bit stumped as to how to proceed.
The actors write a fair amount to stdout using println, which is redirected to a text file, but that has not been helpful diagnostically so far.
I am just hoping there is some obvious technique for when kill -QUIT fails that I am ignorant of.
Or just confirmation that having multiple actors println asynchronously is a really bad idea (though I've been doing it for a long time, only recently with these results).
Details
scala 2.8.1 & 2.8.0
mac osx 10.6.5
java version "1.6.0_22"
Thanks
If you just want to remove the process from the run queue, the obvious choice is
kill -9
Of course you want to avoid having to do that in the first place, but you do not provide any information that would allow us to help you with that. Writing to stdout from multiple actors is indeed a bad idea, but the worst thing that could come of it is garbled output.
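To illustrate that claim, here is a minimal hypothetical sketch against the scala.actors API the question is using (the Scala 2.8-era library): run it and the lines from the different actors interleave unpredictably, but nothing deadlocks.

    import scala.actors.Actor.actor

    object NoisyActors {
      def main(args: Array[String]): Unit = {
        for (id <- 1 to 10) actor {
          for (n <- 1 to 100)
            // Each println call is internally synchronized, so individual
            // lines stay intact, but the ordering across actors is
            // arbitrary - "garbled" logs, not a hang.
            println("actor " + id + " step " + n)
        }
      }
    }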
Most times I have seen a JVM react like that, it had run out of PermGen space. It will then be incapable of anything (even dying). Don't you find any traces of that in your printout? Have you tried increasing the PermGen space?
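For what it's worth, when kill -QUIT gets no response, the JDK's own tools can sometimes still pull a dump out of a wedged HotSpot JVM (standard JDK 6 tools, though behavior on Apple's JVM can vary; the pid is a placeholder):

    jstack -F <pid>     # force a thread dump even when the JVM won't answer SIGQUIT
    jmap -heap <pid>    # show heap and PermGen usage

And the PermGen ceiling can be raised at launch time, e.g.:

    scala -J-XX:MaxPermSize=256m YourApp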