Search for files with MATLAB

My question is how to use MATLAB to search for files of a certain type in a folder. Here is an example to illustrate my question:
Suppose we have the following folder as well as files in it:
My_folder
    Sub_folder1
        Sub_sub_folder1
            a.txt
        1.txt
        2.txt
    Sub_folder2
        3.txt
    abc.txt
In this example, I want to find all the .txt files in My_folder as well as its sub-folders. I was wondering how I could do this with MATLAB. Thanks!

To my knowledge Matlab doesn't have an inbuilt function to do recursive directory searches; however, there are a couple available for download on Matlab Central: here and here.
Alternatively, you could write your own recursive function and use the dir function to search at each level for files matching your criteria, or for other directories to recurse into.
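A minimal sketch of that recursive approach, assuming you write it as a standalone .m file (the helper name find_files is made up for illustration):

function fileList = find_files(rootDir, pattern)
% FIND_FILES  Recursively collect full paths of files matching PATTERN under ROOTDIR.
fileList = {};

% Files matching the pattern in this folder
matches = dir(fullfile(rootDir, pattern));
for k = 1:numel(matches)
    if ~matches(k).isdir
        fileList{end+1, 1} = fullfile(rootDir, matches(k).name); %#ok<AGROW>
    end
end

% Recurse into sub-folders
entries = dir(rootDir);
for k = 1:numel(entries)
    if entries(k).isdir && ~strcmp(entries(k).name, '.') && ~strcmp(entries(k).name, '..')
        fileList = [fileList; find_files(fullfile(rootDir, entries(k).name), pattern)]; %#ok<AGROW>
    end
end
end

For the example above, txtFiles = find_files('My_folder', '*.txt'); would return a cell array of full paths.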

I agree with the Matlab Central options. Another method, which I've used when MLC is not an option (no network, a customer's computer, etc.), is the quick and dirty DOS command:
dos(['dir /s/b ' mywildcard])
The /s switch does a recursive directory search for whatever wildcards you specify, and /b means you only get file names (complete with the full path, but no headers, file sizes, etc.).
This is obviously platform dependent, so it is mostly used when you are forced to work without your "standard" set of utilities you've accumulated.
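If you need the result back in MATLAB rather than just printed to the command window, a hedged sketch (mywildcard as in the line above):

[status, result] = dos(['dir /s/b ' mywildcard]);       % result is one long char array
fileNames = regexp(strtrim(result), '\r?\n', 'split');  % split into one cell per full path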

Even though an answer has been accepted, I would like to point out Matlab's dir function.
This built-in function returns the contents of the folder in question. Furthermore, it indicates which entries are folders themselves (via the isdir field). Therefore, with a little loop one could use this function to search sub-directories as well.
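For instance, a minimal single-level sketch of that idea (using the folder name from the question):

listing    = dir('My_folder');
isSub      = [listing.isdir] & ~ismember({listing.name}, {'.', '..'});
subFolders = {listing(isSub).name};                   % folders a loop could recurse into
txtFiles   = dir(fullfile('My_folder', '*.txt'));     % .txt files at this level only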

Related

Matlab function precedence and subfolders

In the Matlab function precedence page, it states that function precedence goes:
Functions in the current folder.
Functions elsewhere on the path, in order of appearance.
My question is, when they say "Functions in the current folder", does this exclude functions in subfolders of the current folder? If so, is there a way to have functions in subfolders be called preferentially without changing the order of my folders in the path?
I need to do this because I have 2 folders (each with subfolders) of code that run functions with the same name. It seems the subfolders aren't given automatic precedence. I really don't want to have to change my path order every time I run one folder, and I really don't want to have to rename 100s of functions and function calls that my team has written.
The only solution I can think of would be to remove the whole subfolder system and just have a jumbled mess of files in one folder. Are there any other things I can do?
Thanks in advance for the help!
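A hedged sketch of how the precedence can be inspected and nudged (the folder name and the function name myfunc are hypothetical):

which -all myfunc                        % lists every myfunc MATLAB can see, in precedence order
addpath(genpath('C:\code\project1'));    % addpath prepends, so project1 and its subfolders take priority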

matlab creating paths to stop copying code

I have created a few general functions in MATLAB that I intend to use in a few separate projects. However, I do not want to copy the functions into each separate project folder.
I have created a folder called Misc_Function where I have placed these general functions. I know I can reference these functions explicitly by using the full path and function name when calling them.
I believe you can add a path (in my case 'H:\MyTeam\Matlab\Misc_Function') when MATLAB loads up. Is that correct, and if so, how do you do this?
Assuming the above can be done, I'm interested to know how MATLAB finds the correct function. In my understanding (guesswork), MATLAB has a list of paths that it checks when trying to find a function with the name specified. Is that correct? If so, what happens when there are functions with the same name?
MATLAB indeed has its own search path, which is a collection of folders that MATLAB searches when you reference a function or class (and a few other things). To see the search path, type path at the MATLAB prompt. From the documentation:
The order of folders on the search path is important. When files with the same name appear in multiple folders on the search path, MATLAB uses the one found in the folder nearest to the top of the search path.
If you have a set of utility functions that you want to make available to your projects, add the folder to the top of the search path with the addpath function, like so
addpath('H:\MyTeam\Matlab\Misc_Function');
You have to do this every time you start MATLAB. Alternatively, and more conveniently, save the current search path with the savepath command or add the above command to your startup.m file.
You can check the actual paths where Matlab searches for functions using
path
You will notice that the topmost path (on start-up) is a path in your home folder. On Linux this is e.g. /home/$USER/Documents/MATLAB. On Windows it is somewhere in c:\Users\%USER%\Documents\Matlab (I think). Placing a file startup.m in this folder allows you to add additional paths using
addpath('H:\MyTeam\Matlab\Misc_Function');
or
addpath(genpath('H:\MyTeam\Matlab\Misc_Function'));
on start-up of Matlab. The latter (genpath) also adds all subdirectories. Simply write a file startup.m and add one of the above lines there.
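Putting that together, a minimal startup.m might contain nothing more than (path as in the question):

% startup.m -- executed automatically when MATLAB starts
addpath(genpath('H:\MyTeam\Matlab\Misc_Function'));   % utilities plus all their subdirectories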
I believe 'addpath' will add the folder to the MATLAB path only for the current MATLAB session. To save the updated path for other sessions, you need to execute the 'savepath' command.
As mentioned in the previous comments, adding the folder in startup.m is a good idea since it will be added to the path on MATLAB startup.
To answer your question about how MATLAB finds the correct function: MATLAB maintains the list of directories on its path in a file called pathdef.m. Any changes to the path will be written to this file when you execute 'savepath'. The search path is initialized from the contents of this file.
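For example, a minimal sketch using the path from the question:

addpath('H:\MyTeam\Matlab\Misc_Function');   % takes effect for this session only
savepath;                                    % persists the change to pathdef.m for future sessions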

Access data files from subfolder of current script directory

I have been working on MATLAB scripts.
Basically, I have a lot of functions and data files (the data files are collectively known as kernels).
I want to organize it a little bit.
The idea is
to create a subfolder named functions and save all the functions in it, and
another subfolder named kernels and save all the kernel data files in it.
Later, by adding these paths at runtime, all the scripts should be able to access these functions and kernels without giving the full path to them, i.e. the scripts should search the subfolders too.
Applying addpath(genpath(pwd)); worked for the functions, but the kernel files could not be accessed,
e.g. when I want to access the file named naif0010.tls inside the subfolder kernels.
It didn't work. Any suggestions?
Example:
% Add the current script directory and subfolders to search path
addpath(genpath(pwd));
% Load NASA Spice (mice) to the script here
% add MICE reference path to MATLAB
addpath('C:\Program Files\MATLAB\R2012b\extern\mice\src\mice');
addpath('C:\Program Files\MATLAB\R2012b\extern\mice\lib');
% Load leap second kernel
% If the leapsecond kernel is placed in script directory
% This file is present in pwd/kernels/naif0010.tls
cspice_furnsh('naif0010.tls');
There are a couple of things to keep in mind. First, your current working directory (pwd) is in the Matlab path by default, so you don't usually need to explicitly call addpath in order to use scripts, functions, or data files there.
Also, in many cases you can access files by providing a relative path rather than an absolute path. In your case, this would look like
cspice_furnsh('kernels/naif0010.tls')
I solved it with a workaround which I know is not the correct answer, but for now I can go ahead...
addpath(genpath(pwd));
% Basically just forming full path of the data file
leapSecondsFile = fullfile(pwd,'kernels','naif0010.tls');
cspice_furnsh(leapSecondsFile);
Still waiting for correct answer or suggestions :-)
Update:
Thanks to nispio's comment above, the correct way is:
% Load current directory and subfolders
addpath(genpath(pwd)); % This is not necessary
cspice_furnsh('kernels\naif0010.tls');

Basic script to replace image files in multiple directories

I have a situation where we have several thousand image files that have become corrupted on our server (Windows 2008 R2 x64). I have a working image file that I want to replace the corrupt files with. The files must retain the same name and path (size, timestamps, etc do not matter).
So the basic idea would be to replace each corrupt image file with the working file.
I do not write code, only the occasional Windows batch file.
Should I use VB or PowerShell (or something else) for this? What will the script look like for this?
I apologize in advance if this question is too basic for stackoverflow.
You don't really need a batch file; try looking at the FOR command, e.g.
FOR /R %f in (*.jpg) DO copy newfile.jpg "%f"
This should do a recursive search and copy newfile.jpg over the JPGs it finds.
It all boils down to how you are identifying the broken jpgs.
When I don't use a wildcard, for example
FOR /R %f in (broken.jpg) DO copy newfile.jpg "%f"
then newfile.jpg gets copied to every subdirectory. If I use a wildcard (*, ?) the command works as expected. Is there a way to have this command work with a (set) that does not contain wildcards?

How do you compare the content of two archive files programmatically?

I'm doing some testing to ensure that the all-in-one zip file that I created using a script produces the same output as the content of the few zip files that I must manually click and create via a web interface. The zips will therefore have different folder structures.
Of course I can manually extract them and use my powerful eyeball technique to scan them, or, even lazier, I can write a script to do that, but before I invest more time and get accused by my boss of company time robbery, I'm asking if there's a better way to do this.
I'm using a Perl LAMP stack, by the way.
Thanks.
You can use Perl's Archive::Zip or Python's zipfile to extract the file names, sizes, and CRC checksums of the files in the archives. Create a file which contains the results sorted by file name (ignore the path).
For your smaller ZIPs, merge the results of the script (cat list1 list2 list3 | sort).
Now, you can use diff to compare the results.
I can wholeheartedly recommend Beyond Compare. Unless you're really getting underpaid, it's the biggest bang for your (boss's) buck.
[Edit] I seem to have scanned over the different folder structure, sorry about that. Beyond Compare can compare all files in folders with the same folder structure. It does not have (I believe) the intelligence to go searching for matches in files in different folders.
Regards,
Lieven
Create a CRC checksum for your files.
If the checksum is the same for the original files and the unzipped files, you can be sure the files are the same. This even works for non-text data.
A checksum can easily be created with an external program such as "SFV Checker" or programmatically (.NET and Java, for example, include libraries to do this).
Taking a cue from Carra's answer: if A.zip is your single big archive and B.zip is the archive generated through the web, then use the following algorithm.
Extract all files from A.zip and recursively (w.r.t. folders) compute the checksums of the files in the folder where the contents were extracted (using cksum, md5sum, etc.), sort the results (pipe them through sort), and save them to a file (say A.txt).
Do the same for B.zip and generate B.txt
Compare A.txt with B.txt; they should be exactly the same.
OR
Use unzip -l to get file/directory lists for both (zip) archives, then flatten the hierarchy of the user-generated zip file and compare it with the contents of your script-generated zip file using something like diff. By flattening the hierarchy I mean you may need to do some kind of pre-processing on one or both lists before you can do a meaningful comparison with diff.