Can I force visdiff to display more than the first 2000 bytes? - matlab

I have two binary files that I'm trying to compare using Matlab's built-in function visdiff, but it only displays the first 2000 bytes as a default. Is there any way to force the comparison tool to display the entire contents of both files side by side?

Edit the file matlabroot\toolbox\shared\comparisons\private\bindiff.m, where matlabroot is your MATLAB installation directory. On line 149, you'll see it sets the variable MAXLEN to 2000. Change this to something bigger (even Inf seems to work).
You may need to type rehash toolboxcache after making this change, in order to get MATLAB to notice.
Please note:
As you're making a change to the MATLAB source, this is at your own risk (it seems fine to me though). Keep a backup of the file you've edited.
That truncation at 2000 bytes is there for a reason - comparing the whole of larger binary files does seem to take quite a while, so be patient. Maybe try gradually increasing MAXLEN, rather than going straight to Inf.
I only have R2011b available to me right now, so if you're on a newer version the file path and line number I mentioned above may have changed. It was very easy to trace through the code from visdiff to comparisons_private to bindiff though, so unless they've changed the deeper structure of the Comparisons Tool between 11b and now, it will probably be very similar.

Related

How can I make a saving code faster? -MatLab

I'm running a short script that opens a list of files one by one and saves back only one of the variables each file contains. The process is much slower than I expected and gets slower over time. I don't fully understand why, or how I could make it run faster. I always struggle with optimization, so I'd appreciate any suggestions.
The code is the following (the ... stands in for the actual path):
main_dir=dir(strcat('\\storage2-...\Raw\DAQ5\'));
filename={};
for m=7:size(main_dir,1)
    m
    second_dir=dir([main_dir(m).folder '\' main_dir(m).name '\*.mat']);
    for mm=1:numel(second_dir)
        filename{end+1}=[second_dir(mm).folder '\' second_dir(mm).name];
        for mmm=1:numel(filename)
            namefile=sprintf(second_dir(mm,1).name);
            load(string(filename(1,mmm)));
            save(['\\storage2-...\DAQ5\Ch1_',namefile(end-18:end-4),'.mat'], 'Ch_1_y')
        end
    end
end
The original file is about 17 MB and once the single variable is saved it is about 6 MB in size.
The Matlab load function takes an optional additional argument to specify just a selected variable to read from the input file.
s = load('path/to/file.mat', 'Ch_1_y');
That way you don't spend time loading all the other variables from those input .mat files only to throw them away immediately. Because load is called with an output argument here, the variable comes back as a struct field, s.Ch_1_y.
And using save to save MAT-files over SMB shares can be slow. You might want to call save to write it to a temporary local file first, and then copy the completed file to the final destination. Sounds like more I/O, but it can actually be a net win, depending on your particular system and network. Measure it both ways to see if it's a win in your particular situation.

Use older MATLAB save formats

I'm running a model that has a bunch of DLLs which read some .mat files.
When I use an old version of MATLAB (I think 2011a) to generate the files I get files that work okay, but when I create them with 2017a the files seem not to work with the same script.
I've used 2017 to read in the working 2011 file and then saved it, and these files also don't work.
I've also tried the above with the '-vXX' settings at all available values according to the help, with no success.
Example:
clear; load('v2011file.mat'); save('v2017copy.mat', '-v6', 'var1', 'var2', 'var3');
One thing that I have noticed between the two is that when they're selected in the "Current folder" browser, the preview always shows the 2017 files with the variable names in alphabetical order, regardless of the order that I saved them in, while the older 2011 file seems to maintain the order that they were saved. I can only assume that this is something related to a change in the way that files are saved - it might not be the problem but it does hint toward a change (it does this whether or not I include '-vXX' to use older formats).
It's probably worth noting that the 2011 files are created on XP, while the 2017 files are made on Windows 7.
Essentially I'm looking for anyone who might know whether it's possible for me to change the way the file is put together by MATLAB, rather than having to change the DLLs to accept a newer file.
It looks like I can work around the save order issue and have something that works by doing:
save('new2017file.mat', 'var1');
save('new2017file.mat', 'var3', '-append');
save('new2017file.mat', 'var2', '-append');
Meaning I can put them in a specific order - I have to have the default save set to -v7 in preferences>general>.mat files too.
I wouldn't say no to a more elegant answer if there's one available though!

matlab code turned into unreadable symbols

I hit something wrong in MATLAB and my code was transformed into unreadable strings of symbols. I suspect that this is a simple question for CS people, but I'm just an academic-in-training "end" user of code; that is, I know little theory and forget it easily, unfortunately.
I hit Ctrl+Z but nothing happened. I closed the file and opened it again, but the symbols are still there. And it doesn't run; the error I get is:
The input character is not valid in MATLAB statements or expressions.
This is the beginning of my code, in its unfortunate current state:
MATLAB 5.0 MAT-file, Platform: PCWIN, Created on: Sun Oct 05 06:57:45 2014
"‰\*’fTøÄ^L3:!I]ƒÁCƒÒP>朳÷>—º0ç²öEEHÉm�0fÈçRHñ)—\¢ßZï³æ3öïû£óû�㬽ֻÞËó>ïûîZ‡£ñ-IŽj⻺ø«âÀ§ªú}ÕeßrÏè¼Qƒ3råó$G]µ¾O<ÈÎÉÊÈÍÊLuTø¨Õ4»ÕÁãò²ÇºFå–¯fØ P«iv«•8\¶\¶\¶œ¶œ¶œ¶Ff�¿â¼XuåÚ­Š½•¥bï•«yÙ¹Cs®��ÕÊ»áßüJ»ËWÓìV+Ù‘a' |5Ínµ²#\¶\¶\¶œ¶œ¶œ¶»†Ére9+F4±šVyU:£ÂæËVÓl$T¸0±÷òe…ɬ�9C]W 5kPNîØœ¼Ê«9¹ƒ+{'3ÍVBš­„´Jœ¶:8mupÚêà´ÕÁi«ƒÓV‡r~¨("¾ZAD|µ‚ˆr.±•�f+¡’¶~È°õC†­2lý�aë‡[?äåŒØx)'«8mVÓ*¯–ƒÓn5Í‘3nÐy¡<›—ýòÃùt[ùj†Ü_®°WI«¼·|YüGâ9ËåpT´‚V³s._% 7–¯VÜèPHÍ•72{„+?g°Ê/§íê{Sm÷¦&"_YD†­Ü+÷¦Úí%¹µÄ_IUG¼NVSv”×Éñw­õСþo‰‡E³Ä§ÐÍfæ&|ÓÈÍ©Y¥•ÞÆÍ°ëÏK‹×¿ô5°GÞзÃ�SØ„”Ýö/`mågf›ºê�;ŸÝÃÄéocÇÜ,µú‡
Ÿíçeƒ;îíáfr9c¯ú^¨ãûðöb§;ÞÙ^À¤ØÅǦ°‡÷ç§,Ø<…�¾°müá nöû{ÅÝÛ•zØCþ‡Îo×éÙ#gÿ;ãæc:ô[b2yÍÛ­¼L~åz<ìÏ>æ†|^ÜÓÃ=Oélžg¦8¿ØÄú~�­müIÞ°Ÿuæù­÷²FKöþ‰Òžr3if>÷²æÕ&¿™þ“¿‡ºÙŒ›“žªu£Îž•60Ù$)p Áî�†76`ŠÉÆÉ�õ|¬‰?í°ÎÞJ†4ÑÙù~XÖHܸS‡ß�?ÙdŠä‚d?ä&ûØ´£µû—°¯™ú
ѯ�:¯TUç5/îo ôxÄÇd8ëž1¿ûMè]ÕùOûÙßó…À]Ûô�ø,3Øöñ‡[¤ÞkÂîÛMÄ;ÉÇ—
ÏòÅqñŠØV½­q)ð±Û¤³Lx§õ¢sä1ß½ÎÝjÀn¦ä§XëE‘ãï}ågÌüágâß6?ô3¹\œ„�%>vR˜}C®ñ;釜b?»(àÒâ—
And so on.
Thanks!
Your file isn't code at all, it's a Matlab data file with the wrong extension. You can see this by running
>> x = magic(10); %// creates a 10x10 matrix
>> save('junk.m', 'x'); %// note .m extension rather than .mat
>> edit junk.m
You will see something like this in your editor window -
MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Sun Oct 05 14:21:38 2014
å³B1#V A6å³B1#V å³B1#V // etc etc, lots more junk here
which is what a .mat file looks like when you change its extension to .m and open it as code.
So, sadly, I think you have overwritten your code file with some data. If you rename your file to have a .mat extension and then load it in MATLAB, you will be able to see what the data is.
If you have some kind of backup, you may be able to get your code back. Otherwise you're out of luck.
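To check other .m files for the same problem without opening each one in the editor, the header line can be tested directly. The following is a minimal sketch in Python (run outside MATLAB), assuming the file begins with the same "MATLAB 5.0 MAT-file" text shown above. The function name looks_like_mat_file is made up for this illustration.

```python
# Text that opens the data files shown above. Other MAT-file
# versions may start differently; this sketch only checks for this one.
MAT5_MAGIC = b"MATLAB 5.0 MAT-file"


def looks_like_mat_file(path):
    """Return True if the file at `path` starts with the MAT-file header text."""
    with open(path, "rb") as fh:
        # Read slightly more than the marker and compare the prefix.
        return fh.read(32).startswith(MAT5_MAGIC)
```

Calling looks_like_mat_file on the damaged script should return True if it really is a data file, in which case renaming it to .mat and loading it is safe to try.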
P.S. I used to be "just an academic-in-training end-user of code" as well. I spent some time learning a little bit of CS theory and programming languages (on and off over the course of a year) and it has paid back 100x in productivity gains. Not only will you be able to solve many more issues on your own, you'll also be able to do things that you haven't even considered possible to automate your work, leaving more time for the "fun" bit of research. I highly recommend the time investment!
Check the Extension of your file:
In MATLAB you can use the command pcode, which preparses your MATLAB code to a form that is unreadable by humans, but runs exactly the same (actually, very slightly faster) as the original MATLAB code. What happens is that for each .m file you pcode, you'll get a new file with a .p extension. The .p file runs the same as the .m file, but is unreadable.

How to read DAF (double precision array file) "transfer" files?

I downloaded some data in DAF "transfer" format, which NASA completely
fails to explain here:
http://naif.jpl.nasa.gov/pub/naif/toolkit_docs/C/req/daf.html#Conversion%20and%20Transfer%20of%20DAF%27s
How do I read this file? Here are the first few lines I'm trying to comprehend:
DAFETF NAIF DAF ENCODED TRANSFER FILE
'DAF/SPK '
'2'
'6'
'NIO2SPK '
BEGIN_ARRAY 1 3895604
'URA111 '
'-BC186A96D0E76^8'
'BC0DDF032F041^8'
'2BD'
'7'
'1'
'3'
1024
'-BC18166^8'
'FD2^4'
'-DA4A19AC2BCD18^4'
'-4D5E7E1A67739^4'
'1D46248537C30E^5'
'EBA587DFA5E3B^3'
'-26885CE73CB0D^4'
'-BF0DC6EDB5B2C8^2'
'129C1CFEABE48^3'
'5594FC676368^1'
'-472EBF2225A^1'
'-2198AE1963D^0'
'79CC4CA0C^-1'
'FDD9792D82^-2'
'2001D81A^-2'
'333BCEE2BDD724^4'
'-D78AA10831D9C8^4'
'-6D712677574DF8^4'
'283A14783CDC^4'
'90AC22194ABF6^3'
'-1DEF6219F664FE^3'
'-47318F604096^2'
'9B805F405B1C^1'
'1275B947E2AC^1'
'-16A664664D^0'
'-2F614B9F5^-1'
'-B7C3E41D^-3'
'2F3D71F8^-3'
According to NASA, this is/was a popular format for Fortran programs,
but Google was not at all helpful (Wikipedia doesn't have an entry
either).
OK, I think I finally figured out at least part of this. For
reference, the original file (a whopping 162M in size) is the
ura111.bsp file in:
http://naif.jpl.nasa.gov/pub/naif/generic_kernels/spk/satellites/
and converted to ura111.xsp using the toxfr program in:
http://naif.jpl.nasa.gov/pub/naif/utilities/SunIntel_32bit/
The small files:
http://naif.jpl.nasa.gov/pub/naif/generic_kernels/spk/satellites/ura111.cmt
http://naif.jpl.nasa.gov/pub/naif/generic_kernels/spk/satellites/ura111.inp
explain more about the main file.
Things like "-BC18166^8" really are double precision numbers, written
in modified hexadecimal IEEE-754 format. Wikipedia sort of explains
this format here:
http://en.wikipedia.org/wiki/IEEE-754
and there are IEEE-754-to-decimal converters like this:
http://www.h-schmidt.net/FloatConverter/ (and many others)
However, these don't explain/convert the exact format NASA uses, which
was one reason for my confusion.
For reference "-BC18166^8" is converted as follows:
1. The decimal value of "BC18166" is 197230950
2. We now divide by 16 repeatedly until the result is less than 1 (in
other words, we divide by 16^(length of "BC18166")), yielding
0.734742544591427
3. The '^8' means we multiply by 16**8 to get 3155695200
4. The leading "-" just means we add a minus sign to get -3155695200
Of course, we could've combined steps 2 and 3 and just multiplied
197230950 by 16.
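The conversion described above can be sketched in a few lines of Python (used here only for illustration; the same arithmetic ports directly to Perl). The function name decode_daf_double is hypothetical. Reading the exponent after "^" as hexadecimal is an assumption: every exponent in the sample is a single digit below 10, so the sample cannot confirm it either way. Tokens without a "^", such as '2BD', are rejected rather than guessed at.

```python
def decode_daf_double(token):
    """Decode a transfer-file number such as '-BC18166^8' into a float.

    Layout taken from the worked example: optional sign, hex digits read
    as a fraction below 1, then '^' and a power of 16.
    """
    text = token.strip().strip("'").strip()
    sign = -1.0 if text.startswith("-") else 1.0
    text = text.lstrip("+-")

    mantissa, caret, exponent = text.partition("^")
    if not caret:
        raise ValueError("no '^' exponent in token: %r" % token)

    # Assumption: the exponent is written in hexadecimal, like the mantissa.
    exp = int(exponent, 16)

    # Dividing by 16**len(mantissa) and multiplying by 16**exp collapses
    # into a single power of 16, which a float represents exactly.
    return sign * (int(mantissa, 16) * 16.0 ** (exp - len(mantissa)))


print(decode_daf_double("'-BC18166^8'"))  # -3155695200.0
```

Folding the two scaling steps into one exponent is the same shortcut as multiplying 197230950 by 16 directly.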
@klugerama, to answer your question, yes, I am trying to write a file
parser, this time in Perl, as part of a program that accurately
identifies the positions of various objects in our solar system.
I've already parsed the NASA files relating to planets (and Earth's
own moon) here:
ftp://ssd.jpl.nasa.gov/pub/eph/planets/ascii/
but these are in a much different and far easier-to-parse format.
This document (hosted at ucla.edu) has a complete description of the file format.
Additionally, check out this Python project on GitHub. It appears to provide the DAFTB function you're looking for.
Edit: For the record (cough), it doesn't look like this format was ever intended to be read, per se, by humans. It's a transfer format intended to be converted back to usable binary in whatever executable code is appropriate.
You didn't explain why you want to do this. So unless you are writing a file parser (which has been done already in at least two languages), I'm not sure what the benefit is of being able to read the raw values.
Strictly speaking, the answer to your question is that you use software (see link above) to read it.

A rotating log file in perl

I have implemented a log file that stores the CPU and memory state of a process every minute. I have limited the maximum size of the file to 3MB (that's enough for my purpose).
The script will be called by a cron job after every minute and the script will log the details for that minute and will rename the file as "Log_.log".
When the size reaches "3MB - 100 bytes" I reset the file pointer to point to the beginning, overwrite the first entry in the log file, and rename the file as "Log_<0+some offset>.log".
As I am renaming the file every minute to update the file pointer position, is this a good/efficient approach?
I do not want to maintain more than one log file for this purpose.
Another option is to keep the file pointer position in a separate file, but that means yet another file! I'd rather not maintain one if the current approach is good enough :)
Thanks in Advance.
Are you an engineer? This is a nice example of some simple task, solved by a perfectly working but overly complex solution.
Unless the content you put in takes exactly as many bytes as the content you take out, writing "in" a file will actually cause the whole following part after your writing position to be rewritten to disk. Append is much cheaper.
Renaming the file to store the pointer works - but it's not very elegant, and makes stuff more complex (for one, your process needs write rights to the directory in which the file resides - else just write access to two files is sufficient).
Unless disk space is an issue (and really, it rarely is), your approach is less efficient than say, append everything to a file, and rotate the file when it reaches its maximum size. This way you always have the last 3MB of logs available, and maximum 3MB more in your current file. It will make parsing the file a lot easier too, instead of recalculating the entire pointer position thing.
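The append-and-rotate idea can be sketched as follows. This is a minimal illustration in Python rather than Perl, and it does keep one backup file next to the live log, which the question wanted to avoid. The function name, the ".1" backup suffix and the max_bytes parameter are all assumptions made for the example.

```python
import os

MAX_BYTES = 3 * 1024 * 1024  # the 3MB cap mentioned in the question


def append_with_rotation(path, line, max_bytes=MAX_BYTES):
    """Append one entry to the log, rotating first if it would exceed the cap."""
    data = (line.rstrip("\n") + "\n").encode("utf-8")
    try:
        size = os.path.getsize(path)
    except FileNotFoundError:
        size = 0

    # Rotate before writing so the live file never grows past max_bytes.
    # os.replace overwrites any older backup, so at most two files exist.
    if size > 0 and size + len(data) > max_bytes:
        os.replace(path, path + ".1")

    # Appending never rewrites existing entries, whatever their length.
    with open(path, "ab") as fh:
        fh.write(data)
```

A cron job would call append_with_rotation once a minute with the new entry. No file pointer has to be stored anywhere, because the position is always the end of the file.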
Update to answer your comment:
Renaming a file every minute (or even every second) shouldn't slow down your system significantly, don't worry about that.
Our concerns are mainly with "why you think you need to rename the file". It's not better technically, it's not better from a logical point of view, it makes a lot of other (future) tasks harder. You could store the file pointer in a separate file, or at the end of your file, and there are better^H^H^H^H^H^H simpler solutions that don't require the file pointer at all.
I'm confused why you would rename your file. What does this accomplish?
Are the log entries fixed size? Or variable size?
If the entries are fixed size, then there is no trouble in re-writing the existing file from the start: you won't ever have incomplete entries in your file, and if you are writing a counter or timestamps to the file, it should be clear where the 'cursor' is located.
If the entries are variable size, then you should probably not begin re-writing the file from the beginning without somehow making it clear where the 'cursor' is located in the file, and write code that is resilient to reading truncated log entries.
Can you re-use existing tools such as RRDtool?