Why is MATLAB allocating so much memory?

I used to run 5-6 MATLAB instances on a server (Win Server 2008) and they allocated about 50-60 MB each. Now, as I plan to run up to 10 instances simultaneously, I upgraded the server RAM from 1 to 2 GB (running on a VPS) on a Win Server 2012 platform. After migrating to the new platform, I see that each MATLAB instance is allocating more than 100 MB of RAM, some up to 150 MB, which makes the whole migration feel useless. Why is this the case? Does MATLAB simply allocate whatever memory is available?
Thanks in advance.

Related

Batch jobs and reduced SSD lifetime?

I am working on a batch job which imports data from a legacy database, transforms the data into third normal form (3NF), and inserts the resulting data into another database (target database). The batch job is written with Spring Batch.
While I was developing the steps of the job, I wrote unit tests to test the functionality of each step. But now I am finished with development of the steps and want to test the system in a kind of testing environment before rolling the batch job out to production. Therefore, I imported the legacy database locally on a MySQL server and also created a local version of the target database. These MySQL servers are deployed on my MacBook Pro with a 256 GB SSD. I have already run the job a few times with little bugfixes, but now it came to my mind that SSDs are more sensitive to write cycles than a standard HDD. Hence, I checked the process mysqld in my activity manager and noticed that 424.64 GB have been written to my SSD in the last three days.
How much influence (lifetime, write cycles) will this volume of written GB have on my SSD? Would you recommend deploying the database on a normal HDD instead of using my SSD? Or do you think that I am falsely alarmed?
I would recommend you deploy the database to a normal HDD, because the NAND flash on your SSD does have a maximum erase threshold. In other words, you are wearing down your SSD. Although SSDs have wear-leveling features to ensure that the NAND flash wears down evenly, you are definitely wearing it down much faster than normal usage would.
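For a rough sense of scale (the endurance figure below is an assumption, not your drive's actual rating): 424.64 GB over three days is roughly 140 GB per day. If the drive were rated for something on the order of 150 TBW, a typical order of magnitude for a 256 GB consumer SSD, sustained runs at that rate would use up the rating in about 150,000 / 140 ≈ 1,000 days, i.e. roughly three years, on top of whatever your normal workload writes. Check the actual TBW figure in your drive's spec sheet before drawing conclusions.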

Setting up the optimal number of processors/cores per processor virtual machine (VMware)

I was looking for an answer but didn't find one.
I'm trying to create a new VM to develop a web application. What would be the optimal processor settings?
I have an i7 (6th gen) with Hyper-Threading.
Host OS: Windows 10. Guest OS: CentOS.
Off topic: should the RAM I give to the VM be about 50% of my memory? Would that be OK? (I have 16 GB of RAM.)
Thanks!
This is referred to as 'right-sizing' a VM, and it is dependent on the application workload that will run inside it. Ideally, you want to provide the VM with the minimum amount of resources the app requires to run correctly. "Correctly" is subjective, based upon your expectations.
Inside your VM (CentOS) you can run top to see how much memory and CPU % is being used. You can also install htop, which you may find friendlier than top.
RAM
If you see a low % of RAM being used, you can probably reduce what you're giving the VM. If you are seeing any swap memory used (paging to disk), you may want to increase the RAM. Start with 2GB and see how the app behaves.
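A minimal way to check this from inside the guest (standard Linux commands, nothing VMware-specific assumed):
free -m     # total/used/free memory and swap, in MB
vmstat 5    # watch the si/so columns; sustained non-zero values mean the guest is swapping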
CPU
You may want to start with no more than 2 vCPUs, check top to see how utilized the application is under load, and then make an assessment for more or fewer vCPUs.
The way a hosted hypervisor (VMware Workstation) handles guest CPU usage is through a CPU scheduler. When you give a VM x number of vCPUs, the VM needs to wait until that many cores are free on the physical CPU to do 'work'. The more vCPUs you give it, the more difficult (slower) it is to schedule. It's more complicated than this, but I'm trying to keep it high level. CPU scheduling deep dive.
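For the CPU side, a quick sketch of what to look at inside the guest:
nproc       # how many vCPUs the guest actually sees
top         # press 1 to show per-core utilization under load
uptime      # a load average persistently above the vCPU count suggests it needs more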

mongodb flushing mmap takes around 20 secs with no updates being required

Hi, one of our customers is running MongoDB v2.2.3 on 64-bit Windows Server 2008 R2 Enterprise.
We're currently seeing mmap flush times of over 20 seconds every minute.
What is confusing me is that it isn't doing any writes to the disk. (Disk write bytes is next to 0)
Our program, which accesses the data, has been temporarily turned off,
so all that is connected is a mongo shell.
Mongostat and mongotop aren't showing anything.
The database has 130 million records. There are 356 files for mmap.
Any suggestions on what could be causing this?
Thanks
If your working set is significantly larger than memory, and MongoDB is constantly going to disk for reads (and not just the normal spikes when syncing writes to disk), then you really should be sharding to spread the data across multiple machines/instances.
Given the behaviour you have described and that you have a large number of files for mmap, I suspect the underlying performance issue is SERVER-12401 in the MongoDB Jira issue tracker:
On Windows, Memory Mapped File flushes are synchronous operations. When the OS Virtual Memory Manager is asked to flush a memory mapped file, it makes a synchronous write request to the file cache manager in the OS. This causes large I/O stalls on Windows systems with high Disk IO latency, while on Linux the same writes are asynchronous.
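To confirm that the stalls line up with the background flush, you can check serverStatus from the mongo shell (a standard helper available in 2.2; the backgroundFlushing section reports recent and average flush durations):
db.serverStatus().backgroundFlushing    // average_ms and last_ms should roughly match the ~20 second stalls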
There are a few possible ways to improve the flush performance on Windows, including code changes in both the MongoDB server and the Windows O/S. There is some ongoing work to address these issues, now that the synchronous flushing behaviour on Windows has been confirmed.
If you are using higher latency local storage (for example, spinning disks) you may be able to mitigate the issue by upgrading to SSD or better spec'd drives.
I would suggest upvoting/watching SERVER-12401 and the related Jira issues for updates.
It would also be worth upgrading from MongoDB 2.2 to a newer version as 2.2 is now past end-of-life for updates. There have been two major production release branches since then, including significant improvements in general performance/features as well as Windows support.

How to grab a full memory dump of a process with large memory usage

I am hosting IIS-based web service applications on a Windows 2008 64-bit system running on a quad-core, 8 GB machine. I ran into a couple of instances where W3WP was running at 7.6 GB of memory usage. Nothing else was responding on the system, including RDP. Right-clicking the process in Task Manager and creating the dump froze the system and all its threads for a long time (close to 30 minutes). When the freeze-up occurred during off hours, we let the dump run for a while (close to 1 hour) but it still didn't complete. In the interest of getting the system back up, we had to kill IIS.
I tried other tools like procexp, DebugDiag, etc. to create a full memory dump, and all had the same results.
So, what tool does the community use to grab dump files quickly? Or without freezing all the threads? I realize the latter might be a rhetorical question. But what are the options for generating such a large dump file without locking up the system for a long time?
IMO you shouldn't have to wait until the process memory grows to 8 GB. I am sure with something like 3 - 4 GB you should be able to detect the memory leak.
Procdump has an option based on memory threshold
-m Memory commit threshold in MB at which to create a dump of the process.
I would use this option to dump the memory of the process.
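For example (the threshold, process name, and output path below are placeholders to adapt; -ma writes a full dump and -m sets the commit threshold in MB):
procdump -ma -m 4096 w3wp.exe c:\dumps\w3wp_highmem.dmp
If several w3wp.exe worker processes are running, target the PID of the affected one instead of the image name.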
An SSD would also help to write the dump faster.
WPA, a.k.a. xperf (http://msdn.microsoft.com/en-us/performance/cc825801.aspx), is a powerful tool for diagnosing applications. You will get the call stack of the culprit allocation. You don't have to collect a dump, it is non-invasive, and it does not add much load on production systems.
Complete step-by-step information is available here: http://msdn.microsoft.com/en-us/library/ff190906(v=VS.85).aspx

How to setup matlabpool for multiple processors?

I just set up an Extra Large Heavy Computation EC2 instance to throw at my Genetic Algorithms problem, hoping to speed things up.
This instance has 8 Intel Xeon processors (around 2.4 GHz each) and 7 GB of RAM.
On my machine I have an Intel Core Duo, and MATLAB is able to work with my two cores just fine by running:
matlabpool open 2
On the EC2 instance, though, MATLAB is only capable of detecting 1 of the 8 processors, and if I try running:
matlabpool open 8
I get an error saying that the ClusterSize is 1 since there's only 1 core on my CPU. True, there is only 1 core on each CPU, but I have 8 CPUs on the given EC2 instance!
So the difference between my machine and the EC2 instance is that I have my 2 cores on a single processor locally, while the EC2 instance has 8 distinct processors.
My question is, how do I get matlab to work with those 8 processors?
I found this paper, but it seems related to setting up matlab with multiple EC2 instances (not related to multiple processors on the same instance, EC2 or not), which is not my problem.
Any help appreciated!
Note: the point is not EC2, I am remoting into it and running matlab on it as if it was any other machine. The point is that I can't get matlab to see the 8 processors!
MATLAB isn't seeing all 8 cores. Set it manually. Parallel menu -> Manage Configurations. Right-click on the "local" line. In the scheduler tab, set the "Number of workers available to scheduler" to 8.
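If you prefer to make the same change programmatically, here is a rough sketch for the matlabpool-era Parallel Computing Toolbox (function and property names varied between releases, so verify against your version's documentation):
sched = findResource('scheduler', 'type', 'local');   % handle to the local scheduler
set(sched, 'ClusterSize', 8);                          % allow up to 8 local workers
matlabpool open local 8
In later releases this became parcluster('local') with the NumWorkers property, followed by parpool.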
The original answer was a question asking for more detail:
Are you trying to use MDCS on EC2 (and MATLAB's user interface on your PC), or are you trying to run MATLAB's user interface and PCT on EC2 (via ssh or vnc or the like)?
This post adds information in response to part of the original poster's question:
[OP] I found this paper, but it seems related to setting up matlab with multiple EC2 instances (not related to multiple processors on the same instance, EC2 or not)...
The paper mentioned above is no longer available.
In its place, MathWorks offers MATLAB users a way to set up and distribute computations on a cluster running MATLAB Distributed Computing Server (MDCS) on Amazon EC2. More information is available here: http://www.mathworks.com/ec2