MATLAB not able to read in large file?

I have a data file (6.3GB) that I'm attempting to work on in MATLAB, but I'm unable to get it to load, and I think it may be a memory issue. I've tried loading in a smaller "sample" file (39MB) and that seems to work, but my actual file won't load at all. Here's my code:
filename = 'C://Users/Andrew/Documents/filename.mat';
load(filename);
??? Error using ==> load
Can't read file C://Users/Andrew/Documents/filename.mat.
exist(filename);
EDU>> ans = 2
Well, at least the file exists. When I check the memory...
memory
Maximum possible array: 2046 MB (2.146e+009 bytes) *
Memory available for all arrays: 3442 MB (3.609e+009 bytes) **
Memory used by MATLAB: 296 MB (3.103e+008 bytes)
Physical Memory (RAM): 8175 MB (8.572e+009 bytes)
* Limited by contiguous virtual address space available.
** Limited by virtual address space available.
So since I have enough RAM, do I need to increase the maximum possible array size? If so, how can I do that without adding more RAM?
System specifics: I'm running 64-bit Windows with 8 GB of RAM and MATLAB Version 7.10.0.499 (R2010a). I don't think I can update to a newer version since I'm on a student license.

Since the size might be the issue, you could try loading the variables one at a time: load('fileName.mat', 'var1'); load('fileName.mat', 'var2'); and so on. For this you'll have to know the variable names, though.
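If you don't know the variable names, whos with the '-file' option will list them without loading anything into memory. A minimal sketch, reusing the placeholder file name from above:
info = whos('-file', 'fileName.mat');    % list the variables stored in the file without loading them
{info.name}                              % their names
load('fileName.mat', info(1).name);      % then load them one at a time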

An option would be to use the matfile object to load/index directly into the file instead of loading everything into RAM.
doc matfile
But one limitation is that you cannot index directly into a struct, so you would need to find a friend (whose machine can actually load the file) to convert the struct in your MAT-file and re-save it with the version option
save(filename, variables, '-v7.3')
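For illustration, a rough sketch of what indexing through matfile could look like once the file has been re-saved as -v7.3 (the variable name bigMatrix is an assumption, and matfile requires R2011b or newer):
m = matfile('fileName.mat');        % no data is read at this point
whos(m)                             % inspect what the file contains
chunk = m.bigMatrix(1:10000, :);    % reads only the requested rows from disk (variable name assumed)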

Maybe you can process your data piece by piece by loading only parts of the variables from the MAT-file. You need MATLAB 7.3 or newer for this.

From your file path I can see you are using Windows. MATLAB is only 32-bit for Windows and Linux (there is no 64-bit version for these OSes, at least for older releases; please see my edit), which means you are limited to <4GB of RAM total for a single application, no matter how much your system has. This is a 32-bit application issue, so there is nothing you can do to remedy it. Interestingly, the Mac version is 64-bit and can use as much RAM as you want (in my computer vision class we often used my Mac for our big video projects because the Windows machines would just say "out of memory").
As you can see from your memory output, you can only have ~3.4GB total for matrix storage, which is far less than the 6.3GB file. You'll also notice you can only use ~2GB for any one matrix (that number changes as you use more memory).
Typically when working with large files you can read the file line by line rather than loading the entire file into memory, but since this is a .mat file that likely won't work. If the file contains multiple variables, maybe separate each of them into its own individual file that is small enough to load.
The take-home message here is that you can't read the entire file at once unless you hop onto a Mac with enough RAM, and even then the limit for a single matrix is still likely less than 6.3GB.
EDIT
Current MATLAB student versions can be purchased in 64-bit for all OSes as of 2014 (see here), so a newer release of MATLAB might allow you to read the entire file at once. I should also add that there was a 64-bit version before 2014, but not for the student license.
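If you are unsure which build you are actually running, MATLAB can report its own architecture; for example:
computer('arch')   % returns 'win32' for a 32-bit installation, 'win64' for a 64-bit one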

Related

How to convert an OS bin file (with a custom bootloader) to an ISO file that can be burnt to a CD or booted from USB? [duplicate]

I have finished writing my simple Operating System and I want to test it on real hardware (a PC), not Bochs or QEMU. My OS has a custom bootloader and a kernel, and I used cat to concatenate them into a single bin file. But I spent hours trying to find a way to convert the bin file to a bootable ISO file and failed each time. According to OSDev.org, I think I need to use genisoimage (mkisofs) to do the conversion, but I don't know exactly how this command works; I did finally produce an ISO file, but it doesn't work. (I think I used the wrong command; can someone explain a little bit more to me?)
Other Approaches I tried:
Directly burn the bin file to a CD. Error: Missing Operating System.
Convert the bin file to an ISO using winbin2iso and other Windows software. Error: Could not boot, not even in QEMU.
Also, what is El-Torito?
When a CD is being booted, the firmware checks the CD for a "boot catalogue" in the CD's meta-data. This is a list of entries with (up to) one entry for each kind of computer; so that it's possible to create a single CD that works for 80x86 BIOS and 80x86 UEFI (and PowerPC and Sparc and ...).
For 80x86 there are 4 different types of entries:
UEFI. The rest of the entry tells the firmware the starting sector and size of a FAT file system image. UEFI figures out which file it wants from the FAT file system based on converting the architecture into a file name (e.g. for 80x86 it'll probably want the file \EFI\BOOT\BOOTX64.EFI).
"No emulation, 80x86 BIOS". The rest of the entry tells the firmware the starting sector and size of a boot loader. The boot loader can be any size you like (up to about 639 KiB); and the "device number" the BIOS tells you (in dl) will be for the CD drive itself, so you can load more stuff from the same disk using it.
"Hard disk emulation, 80x86 BIOS". The rest of the entry tells the firmware the starting sector and size of a disk image. The disk image should have an MBR with a BPB and partition/s, with an active partition pointing to where the operating system's boot loader is at the start of its partition. In this case the BIOS will create a fake "device 0x80" (from info in the BPB in the MBR) and mess up the device numbers for any real hard drives (e.g. the first hard drive which would've been "device 0x80" will become "device 0x81" instead, etc). Note that this is inefficient because sectors on a CD are 2048 bytes but the BIOS will emulate 512 byte sectors, so every time you try to read a 512 byte sector the BIOS will actually read 2048 bytes and throw the "wrong" 75% of the data away. It's also very annoying (trying to squeeze anything good in 512 bytes is impossible). It's mostly only used for obsolete junk (e.g. MS-DOS).
"Floppy disk emulation, 80x86 BIOS". The rest of the entry tells the firmware the starting sector and size of a disk image. The disk image should have a boot loader in the first sector with a BPB. In this case the BIOS will create a fake "device 0x00" (from info in the BPB in the MBR) and mess up the device numbers for any real floppy drives. Just like hard disk emulation, this is inefficient, even more "very annoying" (because it also limits the OS to the size of a floppy which is nowhere near enough space), and only intended for obsolete junk.
The best way to deal with CDs is to write 2 new boot loaders (one for "no emulation, 80x86 BIOS" and another for UEFI); then use any tool you like to create the appropriate structures on the CD (e.g. genisoimage with the -no-emul-boot option for your "no emulation, 80x86 BIOS" boot loader plus some other option for UEFI that doesn't seem to exist!?).
Note that it's easy to write your own utility that is "more clever". You'd want to create a FAT file system (for UEFI) and an ISO9660 file system (for the rest of the operating system's files: help/docs, drivers, etc.), but most tools won't create the FAT file system for you, and it's possible for files in both file systems (FAT and ISO9660) to use the same sectors (so that the same files appear in both file systems without costing twice as much disk space). Something like this would probably only take you a week to write yourself (and you'll learn a lot about CDs and ISO9660 that you're going to have to learn eventually anyway). The relevant documentation (for booting from CD, ISO9660 file systems, FAT file systems, and UEFI) is all easily obtained online.
Also, what is El-Torito?
El-Torito is another name for the "Bootable CD-ROM Specification" that describes the structures needed on a CD to make it bootable.

Is WinDbg supposed to be so excruciatingly slow?

I'm trying to analyze some mini crash dumps. I'm using Windows 10 Pro Build 1607 and WinDbg 10.0.14321.1024. I have my symbol file path set to
SRV*C:\SymCache*https://msdl.microsoft.com/download/symbols
Basically, whenever I load up a minidump (all < 1 MB .dmp files), it takes WinDbg forever to actually analyze them. I understand the first run can take long, but it took mine almost 12 hours before it would let me enter a command. I assumed that, since the symbols were cached, it wouldn't take long at all to re-open the same .dmp. This is not the case. It loads up, goes pretty much instantaneously to "Loading Kernel Symbols", then takes another 30 minutes before it prints the "BugCheck" line. It's been another 30 minutes, and I still can't enter commands into it.
My PC has a 512 GB SSD, 8 GB of RAM, and an i5-4590. I don't think it should be this slow.
What am I doing wrong?
This kind of complaint seems to come up more often lately, and I can reproduce the behavior on my PC. This is not your fault but an issue with the Internet connection or with the symbol server on Microsoft's side.
Monitoring the traffic with Wireshark and watching how the symbol cache on my disk gets populated, I can say:
only one file is being downloaded at a time.
the problem also occurs with older WinDbg versions (6.2.9200)
the problem occurs with HTTP and HTTPS
when symbols are found, the transfer speed is very slow at first and then increases. The effective transfer rate is down at 11 kb/s to 20 kb/s (on a line that can handle 6500 kb/s)
there's quite a high number of packets out of order, duplicate packets etc., especially during the "lookup phase" where no file is downloaded yet. Such a lookup phase can easily take 8 minutes.
even if the file already exists on disk, the "lookup phase" is performed.
the HTTP roundtrip time (request to response) is 8 to 9 seconds
This is the symbol server being really slow. Others have noticed as well: https://twitter.com/BruceDawson0xB/status/772586358556667904
Your symbol path contains a local cache, so it should load faster the next time around, but it seems that the cache is not effective; I can't really tell why (I suspect the downloaded symbols are not a perfect match and are being downloaded again every time).
I would recommend modifying _NT_SYMBOL_PATH (or however your sympath is initialized) to SRV*C:\SymCache only, i.e. do not attempt to download automatically; just use the symbols you already have cached locally. The image should open fairly fast. Only enable the symbol server if you discover missing symbols.
I ran into the same problem (extremely slow WinDbg), but loading/reloading/fixing/caching symbols did not help. By accident, I figured out that the problem appears when I try to print memory at an address taken from a register, like
db rax
The rule of thumb is to always use # with the register name.
db #rax
Without this prefix, the debugger considers rax to be a symbol name, looks for it for some time (depending on how many symbols you have loaded), eventually fails to find it, and only then falls back to treating it as a register name. Printing memory from a register with the # prefix works instantly, even if you have gigabytes of symbols loaded in memory. As you can see, this problem is also symbol-related, but in a different way.

MATLAB used up all my disk space! How can I get it back?

I left MATLAB running on a simple ode45 + plot, and when I came back I saw that the 5 GB of free space I had on my C: drive was gone! MATLAB had stopped due to "no memory".
Can someone please tell me what happened and how I can get my space back???
Thank You.
You can visually inspect hard disk usage and find folders and files which take up a lot of space with a tool such as TreeSize Free.
P.S. You can also try clearing temporary folders, either through the built-in Disk Cleanup tool or with other tools such as CCleaner.
MATLAB is one of those apps that ships with a whole world of computing science when you only want to work on one tiny island of it; its Help folder alone is huge. Anyway, here are some things you can do to make it slimmer on disk:
Install only the packages you need.
Use JPEGMini to compress the JPEG collection of the huge help folder.
Use Pngyu to compress the huge collection of PNG files to 8 bit depth.
Steps 2 and 3 will get you back a gigabyte, if not more.
Use NTFS compression on the MATLAB folder.
That will get you back another 2 gigabytes.
Steps 2 and 3 must be done with admin privileges, and the drag-and-drop of the folder onto those tools must also be done from an app running with admin privileges; you can use Explorer++ as a Windows File Explorer alternative.

load .mat file - run out of memory

I have a matrix cube which I load in my program to read data from. The size of this .mat file is 2.8 GB. I am not able to load it; I get an "out of memory" error. Is there a way to fix this?
You can use the matfile class to work on ranges within variables inside MAT-files. See
Load and save parts of variables in MAT-files
Here's some additional discussion that notes that this feature is new as of R2011b.
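As a sketch of that approach, reading one slice of the cube without loading the whole thing might look like this (the file and variable names are assumptions, and the file must be saved as -v7.3):
m = matfile('cube.mat');         % assumed file name; nothing is loaded here
firstSlice = m.cube(:, :, 1);    % pulls a single slice of the 3-D matrix into memory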
If the size of the data exceeds the available memory on your machine, then you are in trouble; this is unavoidable. However, if you only want certain variables inside the .mat file, you can try to load just those variables using the
load(filename, variables)
version of the load function. It really depends on the contents of your .mat file. If the file is 2.8GB and you need ALL of the variables in the file and your machine does not have enough memory to cope, your only option is to buy more RAM.
EDIT Apparently this answer is incorrect if you are running R2011b or above, as explained in the answer by Ben Voight.

MATLAB slowing down on long debugging sessions

I have noticed that MATLAB (R2011b on Windows 7, 64 bit) tends to slow down if I am in debugging mode for a long period of time (e.g. 3 hours). I don't recall this happening on previous versions of MATLAB.
The slow down is small, but significant enough to have an impact on my productivity (sometimes MATLAB needs to wait for up to 1 sec before I can type on the command line or on the editor).
I usually spend hours in debug mode (e.g. after stopping at a keyboard statement), coding full projects this way. I find working in debug mode convenient for organically growing my code while being able to inspect it at any point during execution.
The odd thing is my machine has 16 GB of RAM and the total size of all workspaces while in debugging mode is usually less than 4 GB. I don't have any other large process running in the background, and my system reports ~8GB of free RAM.
Also, unfortunately, MATLAB does not let me call pack from debug mode; it complains with:
Warning: PACK can only be used from the MATLAB command line.
I have reproduced this behavior after restarting MATLAB, rebooting my system, and on different days. With this, my questions are:
Has anybody else noticed this? Is there anything I could do to prevent this slowdown without exiting debugging mode?
Are there any technical notes or statements from Mathworks addressing this issue?
In case it matters, my code is on a network drive, so I added the following to my startup.m file, which should alleviate any impact on performance resulting from that:
system_dependent('RemoteCWDPolicy', 'None');
system_dependent('RemotePathPolicy', 'None');
system_dependent('DirChangeHandleWarn','Never');
I have experienced some similar issues. The problem ended up being that MathWorks changed how MATLAB caches files. For some users, it now stores data in the TMP folder as defined by the environment variables. This folder was being scanned by antivirus software and causing a lot of performance problems. Of course, IT wouldn't let us exclude the TMP folder from scans, so we added a line to our startup script that changes the TMP environment variable to some other location inside an excluded folder.
You don't have to worry about changing the variable back or messing up other programs. When applications launch, they copy the environment variables into their own local instance of them. Any changes made to them only change the local copy of those variables, not the system copy.
Here is the function you will need.
setenv('TEMP', 'C:\TEMP');
I'm not sure if it was TMP or TEMP. Check your environment variables to be sure.
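As a sketch, the redirection could live in startup.m like this (the folder name is an assumption; it must exist and be inside a location excluded from antivirus scanning):
setenv('TMP',  'C:\MatlabTemp');    % assumed folder; create it beforehand
setenv('TEMP', 'C:\MatlabTemp');
getenv('TEMP')                      % confirm the change took effect for this MATLAB session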
I am using MATLAB R2011 on Linux 10 and Windows 7 (32-bit).
I experienced MATLAB slowing down while printing simple variables in the command window.
It turned out that there was one .m file loaded in my Editor.
It was a big file with 10000 lines. These lines were simple data that should have been saved as a MAT-file. When I closed this file, the editor was back to its normal speed.