AIX: piping the application's command output to more results in a malloc failure in the application

I have a command-line application which, when run from the shell, lists output read from a database. It fetches this information in chunks, allocating and freeing memory for each chunk.
When I execute the command directly (its output spans around 6000 pages), it lists the data correctly.
But, only on AIX, when I run 'command | more', after a random number of pages the memory allocation in the code that fetches the data in chunks starts failing.
(The same command piped through more works fine on Linux with the same data.)
Any idea why it fails on AIX? Does anyone know about the memory allocation criteria on AIX, and why piping the output to more would cause a malloc failure in the application?

It is not clear exactly what the failure is. Are you getting a seg fault, or is the call to malloc returning 0, indicating that you are out of memory?
The fault could be in an AIX library but it could just as easily be within your application.
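A quick way to tell those apart is to check the return value at the allocation site and log errno. A minimal sketch (get_chunk and chunk_size are illustrative stand-ins for whatever your chunk-reading code actually uses):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only: distinguish "malloc returned NULL" from a crash inside the allocator. */
void *get_chunk(size_t chunk_size)
{
    void *p = malloc(chunk_size);
    if (p == NULL) {
        /* Out of memory, or heap corruption has confused the allocator. */
        fprintf(stderr, "malloc(%zu) failed: %s\n",
                chunk_size, strerror(errno));
    }
    return p;
}

If this message appears only when the output is piped to more, the process really is running out of memory; if the program dies without ever printing it, you are more likely looking at heap corruption.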
Go here: http://pic.dhe.ibm.com/infocenter/aix/v6r1/index.jsp (or the page that is appropriate for your level)
Search for "malloc debug". These facilities are not bleeding edge but they are fairly good and complete. With some time and care you can track down memory leaks and using memory after it has been freed (which sounds like the case here).
It's also good to review the available APARs for your level, looking for matches that sound similar.
There are also third-party tools to help out, like ZeroFault (http://www.zerofault.com/index.html) and Purify, which IBM appears to have purchased (http://www-01.ibm.com/software/awdtools/purify/unix/sysreq/).
Good luck

Related

Find CPU usage of a specific process

I'm trying to write a PowerShell program that "records" a process's CPU and RAM usage.
After searching for ways to do it, I found out about the Get-Counter command.
It works perfectly fine for RAM, but I just can't make sense of the values I'm getting for the CPU.
For example, when I tested my program I checked a process that used about 10% CPU (according to Task Manager), but when checking with Get-Counter I get a value around 90.
Now I know Get-Counter takes all the logical processors into account.
But I have 16 logical processors, so I just can't see where the 90 is coming from.
If someone knows either how to make sense of the value I'm getting, or if there is another way to record CPU usage I will be thankful.
Without seeing (at least) part of the code it will be hard to tell why you are getting a different value (it might be the wrong counter, the wrong value [raw, second, cooked], or maybe even some calculation being performed within your script). Either way, my guess would be that you are querying the raw value rather than the CookedValue.

Perl "Out of memory!" when processing a large batch job

A few others and I are now the happy maintainers of a few legacy batch jobs written in Perl. About 30k lines of code, split across maybe 10-15 Perl files.
We have a lot of long-term fixes for improving how the batch process works, but in the short term, we have to keep the lights on for the various other projects that depend on the output of these batch jobs.
At the core of the main part of these batch jobs is a hash that is loaded up with a bunch of data collected from various data files in a bunch of directories. When these were first written, everything fit nicely into memory - no more than 100MB or so. Things of course grew over the years, and the hash now grows up to what the box can handle (8GB), leaving us with a nice message from Perl:
Out of memory!
This is, of course, a poor design for a batch job, and we have a clear (long-term) roadmap to improve the process.
I have two questions however:
What kind of short-term options can we look at, short of throwing more memory at the machine? Any OS settings that can be tweaked? Perl runtime/compile flags that can be set?
I'd also like to understand WHY Perl crashes with the "Out of memory!" error, as opposed to using the swap space that is available on the machine.
For reference, this is running on a Sun SPARC M3000 running Solaris 10 with 8 cores, 8 GB RAM, 10 GB swap space.
The reason throwing more memory at the machine is not really an ideal solution is mostly because of the hardware it's running on. Buying more memory for these Sun boxes is crazy expensive compared to the x86 world, and we probably won't be keeping these around much longer than another year.
The long-term solution is of course refactoring a lot of the codebase, and moving to Linux on x86.
There aren't really any generally applicable methods for reducing a program's memory footprint; it takes someone familiar with Perl to scan the code and find something relevant to your specific situation.
You may find that storing your hash in a disk-based database helps. The most general way is to use Tie::Hash::DBD, which will let you use any database that DBI supports, but it won't help with hashes whose values can be references, such as nested hashes. (As ThisSuitIsBlackNot has commented, DBM::Deep overcomes even this obstacle.)
I presume your Perl code is crashing at startup? If you have a memory leak then it should be simpler to find the cause. Alternatively, it may be obvious to you that the initial population of the hash is wasteful, in that it is storing data that will never be used. If you show that part of your code then I am sure someone will be able to assist.
Try using a 64-bit build of the interpreter. I had the same issue with the "Out of memory" message: in my case, 32-bit Strawberry Perl ate 2 GB of RAM before terminating. A 64-bit interpreter can use a bigger amount; it ate the rest of my 16 GB and then started to swap like hell, but I got a result.

"Out of memory" error for standalone matlab applications - memory fragmentation

I have to deliver an application as a standalone Matlab executable to a client. The code include a series of calls to a function that internally creates several cell arrays.
My problem is that an out-of-memory error happens when the number of calls to this function increases in response to the increase in the user load. I guess this is low-level memory fragmentation, as the workspace variables are independent of the number of loops.
As mentioned here, quitting and restarting Matlab is the only solution for this type of out-of-memory errors at the moment.
My question is how I can implement such a mechanism in a standalone application so that it saves its data, quits, and restarts itself in the case of an out-of-memory error (or when a high likelihood of such an error is somehow predicted).
Is there any best practice available?
Thanks.
This is a bit of a tough one. Instead of looking to restart to clear things out, could you change the code to break the work into chunks to make it more efficient? Fragmentation is mostly proportional to the peak cell-related memory usage and to how much the size of data items varies, and less to the total usage over time. If you can break a large piece of work into smaller pieces done in sequence, this can lower the "high water mark" of your fragmented memory usage. You can also save on memory usage by using "flyweight" data structures that share their backing data values, or sometimes by converting cell-based structures to reference objects or numeric codes. Can you share an example of your code and data structure with us?
In theory, you could get a clean slate by saving your workspace and relevant state out to a mat file and having the executable launch another instance of itself with an option to reload that state and proceed, and then having the original executable exit. But that's going to be pretty ugly in terms of user experience and your ability to debug it.
Another option would be to offload the high-fragmentation code into another worker process which can be killed and restarted while the main executable process survives. If you have the Parallel Computing Toolbox, which can now be compiled into standalone Matlab executables, this would be pretty straightforward: open a worker pool of one or two workers, and run the fraggy code inside them using synchronous calls, periodically killing the workers and bringing up new ones. The workers are independent processes which start out with non-fragmented memory spaces. If you don't have PCT, you could roll your own by compiling your application as two separate apps - the driver app and the worker app - and have the main app spin up a worker and control it via IPC, passing your data back and forth as MAT files or bytestreams. That's not going to be a lot of fun to code, though.
Perhaps you could also push some of the fraggy code down in to the Java layer, which handles cell-like data structures more gracefully.
Changing the code to be less fraggy in the first place is probably the simpler and easier approach, and results in a less complicated application design. In my experience it's often possible. If you share some code and data structure details, maybe we can help.
Another option is to periodically check for memory fragmentation with a function like chkmem.
You could integrate this function so it is called silently from your code every couple of iterations, or use a timer object to have it called every X minutes...
The idea is to use the undocumented functions feature memstats and feature dumpmem to get the largest free memory blocks available, in addition to the largest variables currently allocated. Using that, you could make a guess as to whether there are signs of memory fragmentation.
When fragmentation is detected, you would warn the user and instruct them how to save their current session (export to a MAT-file), restart the app, and restore the session upon restart.

Writing to hard disk from contiguous physical memory

I have an ARM based device, running linux, which is connected to a camera, and I'm trying to store captured frames to HD efficiently.
I'm developing in user space, but can modify drivers at will
I'm coding in C
Frames are written into memory using DMA, and I have their physical memory pointers.
I am able to control the whole frame-capturing flow, and I can tell when the frame buffers are stable (dequeued from the video4linux driver)
Linux version is 3.0.35
I'm familiar with kernel source code, not an expert, but I'm able to find my way in it and figure out things, as long as I get some hints...
I believe I have 2 alternatives:
Find the optimal configuration for my filesystem for opening the file and writing into it. I'm now using ext4 and the standard fopen()/fwrite() functions. I understand I can also use mmap, or add the O_DIRECT flag when calling open(), but I haven't tried either yet.
Find a way to pass the physical address of the buffer (I can get it from my Video4Linux driver) directly to the filesystem/hard drive driver, so the data will be transferred directly from there.
I found method 1 to be slow, having memory transactions as my bottleneck, since fwrite involves copying data from userspace to kernel space, and then again into some sort of cache, and then on to DMA. Too many memory transactions for a simple store...
Regarding method 2 - I don't know if that's possible, but if I was the one designing this system from scratch, this is what I would do.
Any thoughts?
Regarding method 1 (using open() and write(), mmap() and/or O_DIRECT): can you recommend optimal settings for my purpose? (A rough, untested sketch of the O_DIRECT variant I have in mind is shown below.)
Is method 2 (storing to HD directly from an existing DMA buffer) possible? If so - can you point me to an example?
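This is the rough, untested O_DIRECT sketch referred to above; the function name, the bounce buffer, and the 4 KiB alignment are placeholders/assumptions, since the real alignment requirement depends on the device and filesystem:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Sketch of method 1 with O_DIRECT: bypass the page cache for the data copy.
   O_DIRECT requires the user buffer, file offset and transfer length to be
   aligned (assumed to be 4 KiB here). */
int write_frame_direct(const char *path, const void *frame, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0)
        return -1;

    /* Copy the frame into an aligned bounce buffer; if the V4L2 buffer is
       already page-aligned, this extra copy can be skipped. */
    size_t alen = (len + 4095) & ~(size_t)4095;
    void *aligned;
    if (posix_memalign(&aligned, 4096, alen) != 0) {
        close(fd);
        return -1;
    }
    memcpy(aligned, frame, len);
    memset((char *)aligned + len, 0, alen - len);   /* zero the padding */

    ssize_t n = write(fd, aligned, alen);   /* aligned, padded length */
    if (n >= 0)
        (void)ftruncate(fd, (off_t)len);    /* trim the alignment padding */

    free(aligned);
    close(fd);
    return n < 0 ? -1 : 0;
}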
The only problem with writing into a file via mmap on UNIX systems is that you either have to deal with signals in the out-of-disk-space case, or you have to make certain that the file is not sparse, so that all the needed disk space is already allocated.
I think an up-to-date g++ provides a way of converting signals into C++ exception handling, but I'm not certain how well this is supported on systems other than Mac OS.
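To address the sparse-file point above, the disk space can be reserved before the mapping is ever touched. A sketch only, with placeholder names rather than code from the question:

#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch: preallocate the file so it is not sparse, then copy a frame into an
   mmap'ed window.  posix_fallocate() reserves real blocks, unlike a plain
   ftruncate(), which would leave holes whose allocation can still fail (with
   a signal) if the disk fills up. */
int write_frame_mmap(const char *path, const void *frame, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    if (posix_fallocate(fd, 0, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    void *dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (dst == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(dst, frame, len);     /* still one CPU copy from the frame buffer */
    msync(dst, len, MS_SYNC);    /* optional: force write-back before unmapping */
    munmap(dst, len);
    close(fd);
    return 0;
}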

How to get a command line process to use less processing power

I am wondering how to get a process run at the command line to use less processing power. The problem I'm having is that the process is basically taking over the CPU and taking MySQL and the rest of the server with it. Everything is becoming very slow.
I have used nice before but haven't had much luck with it. If it is the answer, how would you use it?
I have also thought of putting in sleep commands, but it'll still be using up memory so it's not the best option.
Is there another solution?
It doesn't matter to me how long it runs for, within reason.
If it makes a difference, the script is a PHP script, but I'm running it at the command line as it already takes 30+ minutes to run.
Edit: the process is a migration script, so I really don't want to spend too much time optimizing it, as it only needs to be run for testing purposes and once to go live. Just for testing, it keeps bringing the server to pretty much a halt... and it's a shared server.
The best you can really do without modifying the program is to change the nice value to the maximum value using nice or renice. Your best bet is probably to profile the program to find out where it is spending most of its time/using most of its memory, and try to find a more efficient algorithm for what you are trying to do. For example, if you are operating on a large result set from MySQL, you may want to process records one at a time instead of loading the entire result set into memory, or perhaps you can optimize your queries or the processing being performed on the results.
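The original script is PHP, but to illustrate the "one record at a time" idea, here is a sketch using the MySQL C API (the table and columns are made up): mysql_use_result() streams rows from the server one by one instead of buffering the whole result set in client memory the way mysql_store_result() does.

#include <mysql.h>
#include <stdio.h>

/* Sketch: iterate over a large result set without holding it all in memory. */
int process_rows(MYSQL *conn)
{
    if (mysql_query(conn, "SELECT id, payload FROM big_table") != 0)
        return -1;

    MYSQL_RES *res = mysql_use_result(conn);   /* unbuffered result set */
    if (res == NULL)
        return -1;

    MYSQL_ROW row;
    while ((row = mysql_fetch_row(res)) != NULL) {
        /* handle one record here; memory usage stays flat */
        printf("%s\n", row[0] ? row[0] : "NULL");
    }
    mysql_free_result(res);
    return 0;
}

(PHP's mysqli offers an equivalent unbuffered query mode, so the same pattern applies to the script itself.)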
You should use nice with a "niceness" of 19; this makes the process very unlikely to run if there are other processes waiting for the CPU.
nice -n 19 <command>
Be sure that the program does not have busy waits and also check the I/O wait time.
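For completeness: if you control the program's source rather than just its launch command, the same effect can be achieved from inside the process. A C sketch, purely illustrative since the script here is PHP:

#include <stdio.h>
#include <sys/resource.h>

/* Sketch: lower the calling process's own scheduling priority, equivalent to
   starting it with "nice -n 19". */
int be_nice(void)
{
    if (setpriority(PRIO_PROCESS, 0 /* 0 = this process */, 19) != 0) {
        perror("setpriority");
        return -1;
    }
    return 0;
}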
Which process is actually taking up the CPU? PHP or MySQL? If it's MySQL, 'nice' won't help at all (since the server is not 'nice'd up).
If it's MySQL in general you have to look at your queries and MySQL tuning as to why those queries are slamming the server.
Slamming your MySQL server process can show up as "the whole system being slow" if your primary view of the system is through MySQL.
You should also consider whether the command-line process is I/O intensive. That can be adjusted on some Linux distros using the 'ionice' command, though its usage is not nearly as simple as the CPU 'nice' command.
Basic usage:
ionice -n7 cmd
will run 'cmd' using 'best effort' scheduler at the lowest priority. See the man page for more usage details.
Using CPU cycles alone shouldn't take over the rest of the system. You can show this by doing:
while true; do :; done
This is an infinite loop and will use as much of the CPU cycles it can get (stop it with ^C). You can use top to verify that it is doing its job. I am quite sure that this won't significantly affect the overall performance of your system to the point where MySQL dies.
However, if your PHP script is allocating a lot of memory, that certainly can make a difference. Linux has a tendency to go around killing processes when the system starts to run out of memory.
I would narrow down the problem and be sure of the cause, before looking for a solution.
You could mount your server's interesting directory/filesystem/whatever on another machine via NFS and run the script there (I know, this means avoiding the problem and is not really practical :| ).