Get a list of all subdirectories in MATLAB

I'm trying to get the absolute paths of all subfolders in project_dirs.
project_dirs='D:\MPhil\Model_Building\Models\TGFB\Vilar2006\SBML_sh_ver\vilar2006_SBSH_test7\Python_project3_IQM_project';
all_project_dirs=dir(project_dirs)
for i=all_project_dirs,
    full_dir=fullfile(project_dirs,i.name)
end
The above code gives a single string of all the subfolder directories concatenated together. How do I modify my code to get a cell array of these absolute paths?

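There's a function for that: genpath(). It returns the folder and all of its subfolders, recursively, as a single string separated by pathsep (';' on Windows, ':' on Linux and macOS). Use strsplit() with pathsep as the delimiter to parse the result.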
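A minimal sketch of the genpath approach. The variable names are illustrative, and pwd merely stands in for the project folder. Note that genpath also returns the root folder itself and skips some special folders (for example, ones named private), so this may not match a plain dir listing exactly.

```matlab
% Hypothetical root folder; replace with your own project path.
root_dir = pwd;

% genpath returns the root and all subfolders joined by pathsep.
path_str = genpath(root_dir);

% Split on the platform's path separator to get a cell array.
all_dirs = strsplit(path_str, pathsep);

% Drop any empty entry left by a trailing separator.
all_dirs = all_dirs(~cellfun(@isempty, all_dirs));
```

Each element of all_dirs is then an absolute path as long as root_dir is absolute.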

You can do this:
all_project_dirs = {all_project_dirs([all_project_dirs.isdir]).name};
How it works:
This selects, among the elements of all_project_dirs, those that are directories;
From them it gets the name field;
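The values of that field are collected into a cell array.
The first two entries are often '.' and '..', but that order is not guaranteed (names beginning with characters such as '!' or '+' can sort before them), so it is safer to remove them by name:
all_project_dirs = all_project_dirs(~ismember(all_project_dirs, {'.', '..'}));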
To obtain full paths, you can use strcat:
all_project_dirs = strcat(project_dirs, filesep, all_project_dirs);
or, as suggested by Jørgen, use fullfile:
all_project_dirs = fullfile(project_dirs, all_project_dirs);
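Putting these steps together, here is a self-contained sketch. It builds a throwaway folder tree under tempname so it can run anywhere; the folder names sub_a and sub_b are made up for illustration. It removes '.' and '..' by name rather than by position, so it does not depend on the order in which dir lists entries.

```matlab
% Build a temporary folder with two subfolders (hypothetical names).
project_dirs = tempname;
mkdir(fullfile(project_dirs, 'sub_a'));
mkdir(fullfile(project_dirs, 'sub_b'));

% List the folder and keep only entries that are directories.
listing = dir(project_dirs);
names = {listing([listing.isdir]).name};

% Remove the '.' and '..' entries by name.
names = names(~ismember(names, {'.', '..'}));

% Prepend the parent folder to get one full path per cell.
sub_dirs = fullfile(project_dirs, names);

% Clean up the temporary tree.
rmdir(project_dirs, 's');
```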

Related

How to remove the last 2 positions in a split and get the remaining first value?

Let's say the string is a variable file name, like the examples below:
file1_name_cr_001.csv
file2_name1_name2.nn.123.456_updt_000.csv
filename_2012.444.1234_utc_del_004.csv
The last 8 characters always have the same length (_001.csv, _000.csv, _004.csv). We only need to extract the values cr, updt, and del.
How can we get everything before _cr, _updt, and _del as a single value?
Any suggestions?
The output should look like this:
file1_name/cr/001
file2_name1_name2.nn.123.456/updt/000
filename_2012.444.1234_utc/del/004
I reproduced the above and got the results below.
First, I stored a sample file name in a variable named sample.
Then, I took the substring from the start up to length-8 and stored it in a variable named before_8:
@substring(variables('sample'),0,sub(length(variables('sample')),8))
For the end folder (stored in end):
@replace(split(substring(variables('sample'),sub(length(variables('sample')),8), 8),'.')[0],'_','')
For the start folder (stored in start):
@substring(variables('before_8'), 0, lastIndexOf(variables('before_8'), '_'))
For the middle folder (stored in middle):
@split(variables('before_8'), '_')[sub(length(split(variables('before_8'), '_')), 1)]
Resulting folder structure:
@concat(variables('start'),'/',variables('middle'),'/',variables('end'))
Use this variable as the source folder path in the copy activity, and it will generate the folder structure for you.
For multiple file names, first store all the file names in an array, then use a ForEach activity and perform the same operations inside it.
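To check the logic of those expressions outside the pipeline, the same decomposition can be sketched in MATLAB. This is only an illustration of the string handling, not a replacement for the pipeline expressions; the variable names mirror the ones used above, with an added _part suffix because end is a reserved word in MATLAB.

```matlab
sample = 'file2_name1_name2.nn.123.456_updt_000.csv';

% Everything except the fixed-length 8-character tail.
before_8 = sample(1:end-8);

% Tail such as '_000.csv': keep the part before the dot, drop the underscore.
tail = sample(end-7:end);
end_part = strrep(strtok(tail, '.'), '_', '');

% Split before_8 at its last underscore.
idx = find(before_8 == '_', 1, 'last');
start_part = before_8(1:idx-1);
middle_part = before_8(idx+1:end);

result = [start_part '/' middle_part '/' end_part];
% result is 'file2_name1_name2.nn.123.456/updt/000'
```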

Matlab - Help in listing files using a name-pattern

I'm trying to create a function that lists the content of a folder based on a pattern, however the listing includes more files than needed. I'll explain by an example: Consider a folder containing the files
file.dat
file.dat._
file.dat.000
file.dat.001
...
file.dat.999
I am interested only in the files that are .000, .001 and so on. The files file.dat and file.dat._ are to be excluded.
The numeric suffix can also be .0000, .0001 and so on, so the number of digits is not necessarily 3.
I tried using the dir command with the pattern file.dat.* - this included file.dat for some reason (why is the last dot treated differently?) and file.dat._, which was expected.
The "obvious" set of solutions is to add an additional regular expression or length check - however I would like to avoid that, if possible.
This needs to work both under UNIX and Windows (and preferably MacOS).
Any elegant solutions?
Get all filenames with dir and filter them with the regular expression '^file\.dat\.\d+$'. This matches:
start of the string (^)
followed by the string file.dat. (file\.dat\.)
followed by one or more digits (\d+)
and then the string must end ($)
Since names is a cell array of char vectors, regexp returns a cell array containing the starting indices of the matches for each char vector. Because the pattern is anchored at the start of the string, those indices can only be 1 or [], so any is applied to each cell's content to reduce it to true or false. The resulting logical index tells which filenames should be kept.
f = dir('path/to/folder');
names = {f.name};
ind = cellfun(@any, regexp(names, '^file\.dat\.\d+$'));
names = names(ind);
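The filter can be tried without touching the file system. The sketch below applies the same pattern to a hand-written list of names (the list is made up to mirror the folder described in the question, including a four-digit suffix).

```matlab
% Hypothetical directory contents, standing in for {f.name}.
names = {'file.dat', 'file.dat._', 'file.dat.000', 'file.dat.0001', 'file.dat.999'};

% regexp returns 1 (match starts at index 1) or [] for each name.
starts = regexp(names, '^file\.dat\.\d+$');

% any([]) is false and any(1) is true, giving a logical index.
ind = cellfun(@any, starts);
kept = names(ind);
% kept is {'file.dat.000', 'file.dat.0001', 'file.dat.999'}
```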

Store user input as wildcard

I am having some trouble with a data processing function in MATLAB. The function takes the name of the file to be processed as an input, finds the desired files, and reads in the data.
However, several of the desired files are variants, such as Data_00.dat, Data.dat, or Data_1_March.dat. Within my function, I would like to search for all files containing Data and condense them into one usable file for processing.
To solve this, I would like desiredfile to be converted into a wildcard.
Here is the statement I would like to use.
selectedfiles = dir *desiredfile*.dat % Search for file names containing desiredfile
This returns all files containing the variable name desiredfile, rather than the user input.
The only solution that I can think of is writing a separate function that manually condenses all the variants into one file before my function is run, but I am trying to keep the number of files used down and would like to avoid this.
You can build the pattern by concatenating strings, with desiredfile as a variable. Pass 's' to input so that the user's text is returned as a string instead of being evaluated as an expression.
desiredfile = input('Files: ', 's');
selectedfiles = dir(['*' desiredfile '*.dat']) % Search for file names containing desiredfile
Enclosing strings in square brackets, [string1 string2 ... stringN], concatenates them. MATLAB's dir function takes a string.
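A runnable sketch of the concatenation approach, using a temporary folder so it does not depend on the current directory. The file names are taken from the question plus one made-up non-matching file (Other.dat), and desiredfile is hard-coded here in place of the interactive input call.

```matlab
% Create a temporary folder with a few empty files.
folder = tempname;
mkdir(folder);
file_names = {'Data.dat', 'Data_00.dat', 'Data_1_March.dat', 'Other.dat'};
for k = 1:numel(file_names)
    fclose(fopen(fullfile(folder, file_names{k}), 'w'));
end

% Stands in for: desiredfile = input('Files: ', 's');
desiredfile = 'Data';

% Build the wildcard pattern from the variable's value.
pattern = ['*' desiredfile '*.dat'];
selectedfiles = dir(fullfile(folder, pattern));
selected_names = sort({selectedfiles.name});

% Clean up the temporary files.
rmdir(folder, 's');
```

Here selected_names should contain the three Data variants and not Other.dat.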
I believe you can achieve that using the dir command.
dataSets = dir('/path/to/dir/containing/Data*.dat');
dataSets = {dataSets.name};
Now simply loop over them.
To quote the MATLAB help:
dir lists the files and folders in the MATLAB® current folder. Results appear in the order returned by the operating system.
dir name lists the files and folders that match the string name. When name is a folder, dir lists the contents of the folder. Specify name using absolute or relative path names. You can use wildcards (*).

Name of files in a specific directory using matlab

I want to find the names of the files in a specific directory. I know that the dir command returns the file names, but they include the extensions. Therefore, I used strfind to remove the extensions as follows:
a = dir(fullfile(dataset_path, [dataset_category '\qrel']))
for i= 3: length(a)
    name{i} = a(i).name(1:strfind(a(i).name, '.')-1)
end
I want a better approach without a loop. I wonder whether there is a way to vectorize this. I tried the following code, but it returns an error:
a = dir(fullfile(dataset_path, [dataset_category '\qrel']))
name = a.name(1:strfind(a.name, '.')-1)
You can do that with regular expressions:
name = regexprep({a.name}, '\.[^\.]*$', '');
This collects all names in a cell array ({a.name}). For each string, it matches a dot (\.) followed by zero or more characters other than a dot ([^\.]*) at the end of the string ($), and removes the match. Thanks to @Shai for the "other than a dot" correction, which makes sure that only the final dot is matched.
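A small sketch of that substitution on a hand-written list of names (the names are invented for illustration). As a cross-check it also strips the extension with fileparts via cellfun, which is an alternative not mentioned in the answer above; for these inputs both give the same result.

```matlab
% Hypothetical file names, standing in for {a.name}.
file_names = {'report.txt', 'archive.tar.gz', 'README'};

% Remove only the final dot and everything after it.
name = regexprep(file_names, '\.[^\.]*$', '');
% name is {'report', 'archive.tar', 'README'}

% Alternative: fileparts returns the name without its last extension.
[~, base] = cellfun(@fileparts, file_names, 'UniformOutput', false);
```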

How can I get a list of all directory names and/or all files in a specific directory in MATLAB?

There are two things I want to do:
Get a list of all the directory names within a directory, and
Get a list of all the file names within a directory
How can I do this in MATLAB?
Right now, I'm trying:
dirnames = dir(image_dir);
but that returns a list of objects, I think. size(dirnames) returns the number of attributes, and dirnames.name only returns the name of the first directory.
The function DIR actually returns a structure array with one structure element per file or subdirectory in the given directory. When getting data from a structure array, accessing a field with dot notation will return a comma-separated list of field values with one value per structure element. This comma-separated list can be collected into a vector by placing it in square brackets [] or a cell array by placing it in curly braces {}.
I usually like to get a list of file or subdirectory names in a directory by making use of logical indexing, like so:
dirInfo = dir(image_dir);            % Get structure of directory information
isDir = [dirInfo.isdir];             % A logical index the length of the
                                     % structure array that is true for
                                     % structure elements that are
                                     % directories and false otherwise
dirNames = {dirInfo(isDir).name};    % A cell array of directory names
fileNames = {dirInfo(~isDir).name};  % A cell array of file names
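To see how the comma-separated list behaves without reading a real folder, the sketch below uses a hand-built structure array with the same name and isdir fields that dir returns. The entries are hypothetical. It also drops '.' and '..' from the directory list, which the snippet above leaves in.

```matlab
% Hand-built struct array standing in for the output of dir.
info = struct('name',  {'.', '..', 'images', 'notes.txt'}, ...
              'isdir', {true, true, true, false});

% info.name expands to a comma-separated list; {} collects it into a cell array.
allNames = {info.name};

% info.isdir expands the same way; [] collects it into a logical vector.
isDir = [info.isdir];

% Directories, excluding the '.' and '..' entries.
dirNames = allNames(isDir & ~ismember(allNames, {'.', '..'}));

% Everything that is not a directory.
fileNames = allNames(~isDir);
```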
No, you are incorrect about what dirnames.name returns.
D = dir;
This is a structure array. If you want a list of which entries are directories, do this:
isdirlist = find(vertcat(D.isdir));
Or I could have used cell2mat here. Note that if you just try D.name, it returns a comma-separated list. You can get all of the names as a cell array simply, though.
nameslist = {D.name};
Assuming that "image_dir" is the name of a directory, the following code shows you how to determine which items are directories and which are files and how to get their names. Once you've gotten that far, building a list of only directories or only files is straightforward.
dirnames = dir(image_dir);
for i = 1:length(dirnames)
    if dirnames(i).isdir
        % It's a subdirectory
        % The name of the subdirectory can be accessed as dirnames(i).name
        % Note that both '.' and '..' are subdirectories of any directory and
        % should be ignored
    else
        % It's a filename
        % The filename is dirnames(i).name
    end
end
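One way the loop skeleton could be filled in is sketched below. It runs against a temporary folder containing one subfolder and one file (both names are made up) and collects the two kinds of entries into separate cell arrays.

```matlab
% Temporary folder with one subdirectory and one empty file.
folder = tempname;
mkdir(fullfile(folder, 'sub'));
fclose(fopen(fullfile(folder, 'a.txt'), 'w'));

entries = dir(folder);
subdirs = {};
files = {};
for i = 1:numel(entries)
    entryName = entries(i).name;
    if entries(i).isdir
        % Skip the '.' and '..' entries that dir reports.
        if ~strcmp(entryName, '.') && ~strcmp(entryName, '..')
            subdirs{end+1} = entryName; %#ok<SAGROW>
        end
    else
        files{end+1} = entryName; %#ok<SAGROW>
    end
end

% Clean up the temporary folder.
rmdir(folder, 's');
```

For large folders, the logical-indexing form avoids growing the cell arrays inside a loop.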