How to only access the file names in a directory?
>> files = dir('*.png');
>> disp(class(dir('*.png')))
struct
>> fields
fields =
'name'
'date'
'bytes'
'isdir'
'datenum'
>> for i=1:numel(fields)
files.(fields{i}.name)
end
Struct contents reference from a non-struct array object.
>> for i=1:numel(fields)
files.(fields{i}).name
end
Expected one output from a curly brace or dot indexing expression, but there were 11 results.
File names are stored in the name field of the struct array returned by dir. So:
files = dir('*.png');
for k = 1:numel(files)
f = files(k).name; % f contains the name of each file
end
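The "Expected one output" error in the question comes from the fact that files is a struct array, so files.name expands to a comma-separated list with one value per file. A minimal sketch of collecting that list into a cell array instead of looping (the variable name names is only illustrative):

```matlab
% Sketch: gather every file name from the struct array returned by dir.
files = dir('*.png');      % N-by-1 struct array, one element per matching file
names = {files.name};      % wrap the comma-separated list in braces -> cell array

% Print one name per line (prints nothing if no file matched).
for k = 1:numel(names)
    fprintf('%s\n', names{k});
end
```

Each names{k} is then an ordinary char vector that can be passed to imread, fullfile and so on.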
You can use ls like this
list=ls('*.png');
for ii=1:size(list,1)
s = strtrim(list(ii,:)); % a string containing the name of each file
end
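ls returns a char array rather than a cell array. On Windows it is a character matrix with one space-padded row per file (hence the strtrim); on Unix-like systems it is a single row of names, so the loop above is Windows-specific.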
Related
I have a folder containing a series of data with file names like this:
abc1
abc2
abc3
bca1
bca2
bca3
bca4
bca5
cba1
... etc
My goal is to load all the relevant files for each file name, so all the "abc" files, and plot them in one graph. Then move on to the next file name, and do the same, and so forth. Is there a way to do this?
This is what I currently have to load and run through all the files, grab the data in them and get their name (without the .mat extension) to be able to save the graph with the same filename.
dirName = 'C:\DataDirectory';
files = dir( fullfile(dirName,'*.mat') );
files = {files.name}';
data = cell(numel(files),1);
for i=1:numel(files)
fname = fullfile(dirName,files{i});
disp(fname);
files{i} = files{i}(1:length(files{i})-4);
disp(files{i});
[Rest of script]
end
You already found out about the cool features of dir, and have a cell array files, which contains all file names, e.g.
files =
'37abc1.mat'
'37abc2.mat'
'50bca1.mat'
'50bca2.mat'
'1cba1.mat'
'1cba2.mat'
The main task now is to find all prefixes, 37abc, 50bca, 1cba, ... which are present in files. This can be done using a regular expression (regexp). The Regexp Pattern can look like this:
'([\d]*[\D]*)[\d]*.mat'
i.e. take any number of digits ([\d]*), then any number of non-digit characters ([\D]*), and capture those (by enclosing them in parentheses). Next come any number of digits ([\d]*), followed by the text .mat. Strictly speaking, the unescaped . matches any character; write \.mat if you want to match a literal dot only.
We call the regexp function with that pattern:
pre = regexp(files,'([\d]*[\D]*)[\d]*.mat','tokens');
resulting in a cell array (one cell for each entry in files), where each cell contains another cell array with the prefix of that file. To convert this to a simple not-nested cell array, we call
pre = [pre{:}];
pre = [pre{:}];
resulting in
pre =
'37abc' '37abc' '50bca' '50bca' '1cba' '1cba'
To remove duplicate entries, we use the unique function:
pre = unique(pre);
pre =
'1cba' '37abc' '50bca'
which leaves us with every prefix that is present (unique returns them in sorted order). Now you can loop through each of these prefixes and do your processing. Everything put together is:
% Find all files
dirName = 'C:\DataDirectory';
files = dir( fullfile(dirName,'*.mat') );
files = {files.name}';
% Find unique prefixes
pre = regexp(files,'([\d]*[\D]*)[\d]*.mat','tokens');
pre = [pre{:}]; pre = [pre{:}];
pre = unique(pre);
% Loop through prefixes
for ii=1:numel(pre)
% Get files with this prefix
curFiles = dir(fullfile(dirName,[pre{ii},'*.mat']));
curFiles = {curFiles.name}';
% Loop through all files with this prefix
for jj=1:numel(curFiles)
% Here the magic happens
end
end
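As a variation on the approach above, the third output of unique can do the grouping directly, which avoids calling dir a second time for each prefix. This is only a sketch: the file list is hard-coded from the example, the pattern is a slightly stricter version with the dot escaped and both ends anchored, and the names prefixes and groupIdx are made up for illustration.

```matlab
% Sketch: group file names by prefix with a single regexp and unique.
files = {'37abc1.mat'; '37abc2.mat'; '50bca1.mat'; ...
         '50bca2.mat'; '1cba1.mat';  '1cba2.mat'};

% 'once' returns one token list per file, so a single unwrapping step is enough.
pre = regexp(files, '^(\d*\D*)\d*\.mat$', 'tokens', 'once');
pre = [pre{:}];                        % 1-by-6 cell array of prefixes

% groupIdx(k) is the position in prefixes of the prefix of files{k}.
[prefixes, ~, groupIdx] = unique(pre);

for ii = 1:numel(prefixes)
    curFiles = files(groupIdx == ii);  % all files sharing this prefix
    fprintf('%s: %d file(s)\n', prefixes{ii}, numel(curFiles));
end
```

A file name that does not match the pattern produces an empty token list and simply drops out when pre is unwrapped, which would misalign groupIdx with files; filter such names out first if they can occur.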
Sorry, I misunderstood your question, I found this solution:
file = dir('*.mat')
matching = regexp({file.name}, '^[a-zA-Z_]+[0-9]+\.mat$', 'match', 'once'); %// Match once on file name, must be a series of A-Z a-z chars followed by numbers.
matching = matching(~cellfun('isempty', matching));
str = unique(regexp(matching, '^[a-zA-Z_]*', 'match', 'once'));
str = str(~cellfun('isempty', str));
group = cell(size(str));
for is = 1:length(str)
ismatch = strncmp(str{is}, matching, length(str{is}));
group{is} = matching(ismatch);
end
Answer came from this source: Matlab Central
I have a directory at '../../My_Dir' relative to the Matlab working directory. This directory itself has several sub-directories in it. Each sub-directory then has several files in it.
I want to create two-dimensional array, or a matrix, of strings. Each row represents one of the sub-directories. The first column in the row is the full path to the sub-directory itself, and the other columns are the full paths to the files within that sub-directory.
Can anyone show me some code that will help me to implement this? Thank you!
You can first get all the sub-folders by
d = dir(pathFolder);
isub = [d(:).isdir];
subFolders = {d(isub).name}';
Note, you further need to remove . and .. from it:
subFolders(ismember(subFolders,{'.','..'})) = [];
And then get files in each of them by using (from this post):
function fileList = getAllFiles(dirName)
dirData = dir(dirName); %# Get the data for the current directory
dirIndex = [dirData.isdir]; %# Find the index for directories
fileList = {dirData(~dirIndex).name}'; %'# Get a list of the files
if ~isempty(fileList)
fileList = cellfun(@(x) fullfile(dirName,x),... %# Prepend path to files
fileList,'UniformOutput',false);
end
subDirs = {dirData(dirIndex).name}; %# Get a list of the subdirectories
validIndex = ~ismember(subDirs,{'.','..'}); %# Find index of subdirectories
%# that are not '.' or '..'
for iDir = find(validIndex) %# Loop over valid subdirectories
nextDir = fullfile(dirName,subDirs{iDir}); %# Get the subdirectory path
fileList = [fileList; getAllFiles(nextDir)]; %# Recursively call getAllFiles
end
end
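Putting the two steps together for the layout asked for in the question (one row per sub-directory), here is a rough sketch. It assumes only one level of nesting, and it builds a throwaway folder tree under tempname so that it runs anywhere; the folder and file names (A, B, a1.txt, ...) and the variable rows are made up. Because each sub-directory can hold a different number of files, the second column is a nested cell array rather than extra columns.

```matlab
% Throwaway tree so the sketch runs anywhere; all names here are hypothetical.
baseDir = tempname;
mkdir(fullfile(baseDir, 'A'));
mkdir(fullfile(baseDir, 'B'));
fclose(fopen(fullfile(baseDir, 'A', 'a1.txt'), 'w'));
fclose(fopen(fullfile(baseDir, 'A', 'a2.txt'), 'w'));
fclose(fopen(fullfile(baseDir, 'B', 'b1.txt'), 'w'));

% Keep only real sub-directories (drop plain files, '.' and '..').
d = dir(baseDir);
d = d([d.isdir] & ~ismember({d.name}, {'.', '..'}));

% Column 1: full path of the sub-directory. Column 2: full paths of its files.
rows = cell(numel(d), 2);
for k = 1:numel(d)
    subPath = fullfile(baseDir, d(k).name);
    f = dir(subPath);
    f = f(~[f.isdir]);                 % files only
    rows{k, 1} = subPath;
    rows{k, 2} = cellfun(@(n) fullfile(subPath, n), {f.name}, ...
                         'UniformOutput', false);
end
```

Replace baseDir with fullfile('..', '..', 'My_Dir') to run it against the directory from the question.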
To get the folders FULL PATH:
d = dir(baseDir);
d(~[d.isdir])= []; %Remove all non directories.
names = setdiff({d.name},{'.','..'});
filesFullPath = names;
for i=1:size(names,2)
filesFullPath{i} = fullfile(baseDir,names{1,i});
end
Matlab... "thanks" for this s...
As an example, let's say I have three subfolders in 'My_Dir' called 'A' (containing 'a1.txt' and 'a2.txt'), 'B' (containing 'b1.txt'), and 'C' (containing 'c1.txt', 'c2.txt', and 'c3.txt'). This will illustrate how to handle a case with different numbers of files in each subfolder...
For MATLAB versions R2016b and later, the dir function supports recursive searching, allowing us to collect a list of files like so:
dirData = dir('My_Dir\*\*.*'); % Get structure of folder contents
dirData = dirData(~[dirData.isdir]); % Omit folders (keep only files)
fileList = fullfile({dirData.folder}.', {dirData.name}.'); % Get full file paths
fileList =
6×1 cell array
'...\My_Dir\A\a1.txt'
'...\My_Dir\A\a2.txt'
'...\My_Dir\B\b1.txt'
'...\My_Dir\C\c1.txt'
'...\My_Dir\C\c2.txt'
'...\My_Dir\C\c3.txt'
As an alternative, in particular for earlier versions, this can be done using a utility I posted to the MathWorks File Exchange: dirPlus. It can be used as follows:
dirData = dirPlus('My_Dir', 'Struct', true, 'Depth', 1);
fileList = fullfile({dirData.folder}.', {dirData.name}.');
Now we can format fileList in the way you specified above. First we can use unique to get a list of unique subfolders and an index. That index can then be used with mat2cell and diff to break fileList up by subfolder into a second level of cell array encapsulation:
[dirList, index] = unique({dirData.folder}.');
outData = [dirList mat2cell(fileList, diff([index; numel(fileList)+1]))]
outData =
3×2 cell array
'...\My_Dir\A' {2×1 cell}
'...\My_Dir\B' {1×1 cell}
'...\My_Dir\C' {3×1 cell}
I am looping through a lot of files and I need to remove the '.jpg' from each name.
Example file name:
20403y.jpg
but I just need the
20403y
All the file names end with 'y' if that helps.
One way is with regular expressions:
filename = 'myfilename.jpg';
pattern = '\.jpg$'; % escape the dot and anchor the match to the end of the name
replacement = '';
regexprep(filename,pattern,replacement)
Result:
ans =
myfilename
If you have the filenames in a cell array feed the cell array to regexprep. As the documentation explains, "If str is a cell array of strings, then the regexprep return value s is always a cell array of strings having the same dimensions as str."
Example:
myfilenames = {'myfilename.jpg' 'afilename.jpg' 'anotherfilename.jpg' };
newfilenames= regexprep(myfilenames,'\.jpg$','');
Result:
newfilenames =
'myfilename' 'afilename' 'anotherfilename'
files = dir('*y.jpg');
% Loop through each
for id = 1:length(files)
% Get the file name (minus the extension)
[p, f] = fileparts(files(id).name); % f will just give you file name
% Use following to rename the files
% I think you don't want to rename them
% movefile(files(id).name, f);
end
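To compare the two approaches side by side, here is a small sketch on made-up names. fileparts strips whatever the last extension is, while regexprep with an escaped, anchored pattern strips only a trailing .jpg; the variable names are illustrative.

```matlab
% Sketch: two ways to drop the extension from a list of file names.
names = {'20403y.jpg', '20404y.jpg'};    % hypothetical file names

% fileparts: the second output is the name without path or extension.
bases = cell(size(names));
for k = 1:numel(names)
    [~, bases{k}] = fileparts(names{k});
end

% regexprep: '\.' matches a literal dot, '$' anchors to the end of the name,
% so only the final '.jpg' is removed.
stripped = regexprep('scan.jpg.jpg', '\.jpg$', '');
```

With an unescaped, unanchored pattern such as '.jpg', every occurrence of "any character followed by jpg" would be removed, including ones in the middle of a name.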
I am trying to find all files in a directory whose names contain 'hello'. I have the following code:
fileData = dir();
m_file_idx = 1;
fileNames = {fileData.name};
index = regexp(fileNames,'\w*hello\w*','match') ;
inFiles = fileNames(~cellfun(@isempty,index));
Ex. if my directory has 3 files with the word hello in it, inFiles returns me
inFiles =
[1x23 char] [1x26 char] [1x25 char]
Instead, I want inFiles to return the names of the files, e.g. thisishello.m, hiandhello.txt.
How can I do this in a simple way?
This code:
fileData = dir();
fileNames = {fileData.name};
disp('The full directory...')
disp(fileNames)
index = regexp(fileNames,'\w*hello\w*','match');
inFiles = fileNames(~cellfun(@isempty,index));
disp('Print out the file names')
inFiles{:}
generates this output:
>> script
The full directory...
Columns 1 through 6
'.' '..' 'andsevenyears.txt' 'fourscore.txt' 'hello1.txt' 'hello2.txt'
Column 7
'script.m'
Print out the file names
ans =
hello1.txt
ans =
hello2.txt
To me it looks as if you were having some issues with understanding cell arrays. Here's a specific tutorial that works through them. (jerad's link also looks like a good resource)
I think what's going on here is that when an element of a cell array is longer than a certain length (appears to be 19 characters for strings), matlab doesn't print the actual element, it prints a description of the content instead (in this case, "[1x23 char]").
For example:
>> names = {'1234567890123456789' 'bar' 'car'}
names =
'1234567890123456789' 'bar' 'car'
>> names = {'12345678901234567890' 'bar' 'car'}
names =
[1x20 char] 'bar' 'car'
celldisp might work better for your situation:
>> celldisp(names)
names{1} =
12345678901234567890
names{2} =
bar
names{3} =
car
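If the goal is just to see the matching names one per line, fprintf with a comma-separated list is another option alongside celldisp. A minimal sketch, using the directory listing from the earlier answer as hard-coded input (isHit is an illustrative name):

```matlab
% Sketch: filter names containing 'hello' and print them one per line.
fileNames = {'.', '..', 'andsevenyears.txt', 'fourscore.txt', ...
             'hello1.txt', 'hello2.txt', 'script.m'};

% With 'once', regexp returns the first match position or [] for each name.
isHit   = ~cellfun(@isempty, regexp(fileNames, 'hello', 'once'));
inFiles = fileNames(isHit);

% inFiles{:} expands to one argument per name; the format string is reused.
fprintf('%s\n', inFiles{:});
```

The cell array itself always held the full names; only the command-window summary abbreviated them.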
I need to get all those files under D:\dic and loop over them to further process individually.
Does MATLAB support this kind of operations?
It can be done in other scripts like PHP,Python...
Update: Given that this post is quite old, and I've modified this utility a lot for my own use during that time, I thought I should post a new version. My newest code can be found on The MathWorks File Exchange: dirPlus.m. You can also get the source from GitHub.
I made a number of improvements. It now gives you options to prepend the full path or return just the file name (incorporated from Doresoom and Oz Radiano) and apply a regular expression pattern to the file names (incorporated from Peter D). In addition, I added the ability to apply a validation function to each file, allowing you to select them based on criteria other than just their names (i.e. file size, content, creation date, etc.).
NOTE: In newer versions of MATLAB (R2016b and later), the dir function has recursive search capabilities! So you can do this to get a list of all *.m files in all subfolders of the current folder:
dirData = dir('**/*.m');
Old code: (for posterity)
Here's a function that searches recursively through all subdirectories of a given directory, collecting a list of all file names it finds:
function fileList = getAllFiles(dirName)
dirData = dir(dirName); %# Get the data for the current directory
dirIndex = [dirData.isdir]; %# Find the index for directories
fileList = {dirData(~dirIndex).name}'; %'# Get a list of the files
if ~isempty(fileList)
fileList = cellfun(@(x) fullfile(dirName,x),... %# Prepend path to files
fileList,'UniformOutput',false);
end
subDirs = {dirData(dirIndex).name}; %# Get a list of the subdirectories
validIndex = ~ismember(subDirs,{'.','..'}); %# Find index of subdirectories
%# that are not '.' or '..'
for iDir = find(validIndex) %# Loop over valid subdirectories
nextDir = fullfile(dirName,subDirs{iDir}); %# Get the subdirectory path
fileList = [fileList; getAllFiles(nextDir)]; %# Recursively call getAllFiles
end
end
After saving the above function somewhere on your MATLAB path, you can call it in the following way:
fileList = getAllFiles('D:\dic');
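For R2016b and later, the recursive form of dir mentioned in the note can replace the helper function entirely. A rough sketch of looping over the results follows; it creates a small throwaway tree under tempname so it runs anywhere, and the folder and file names (sub, deeper, top.m, leaf.m) are made up.

```matlab
% Throwaway tree for demonstration; all names here are hypothetical.
root = tempname;
mkdir(fullfile(root, 'sub', 'deeper'));
fclose(fopen(fullfile(root, 'top.m'), 'w'));
fclose(fopen(fullfile(root, 'sub', 'deeper', 'leaf.m'), 'w'));

% '**' matches zero or more folder levels (needs R2016b or later).
dirData = dir(fullfile(root, '**', '*.m'));
dirData = dirData(~[dirData.isdir]);      % keep files only

% The folder field holds the containing directory of each entry.
for k = 1:numel(dirData)
    fullPath = fullfile(dirData(k).folder, dirData(k).name);
    fprintf('%s\n', fullPath);            % process each file here
end
```

For the question as asked, root would be 'D:\dic' and the pattern '*' or '*.*' instead of '*.m'.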
You're looking for dir to return the directory contents.
To loop over the results, you can simply do the following:
dirlist = dir('.');
for i = 1:length(dirlist)
dirlist(i)
end
This should give you output in the following format, e.g.:
name: 'my_file'
date: '01-Jan-2010 12:00:00'
bytes: 56
isdir: 0
datenum: []
I used the code mentioned in this great answer and expanded it to support 2 additional parameters which I needed in my case. The parameters are file extensions to filter on and a flag indicating whether to concatenate the full path to the name of the file or not.
I hope it is clear enough and that someone will find it useful.
function fileList = getAllFiles(dirName, fileExtension, appendFullPath)
dirData = dir([dirName '/' fileExtension]); %# Get the data for the current directory
dirWithSubFolders = dir(dirName);
dirIndex = [dirWithSubFolders.isdir]; %# Find the index for directories
fileList = {dirData.name}'; %'# Get a list of the files
if ~isempty(fileList)
if appendFullPath
fileList = cellfun(@(x) fullfile(dirName,x),... %# Prepend path to files
fileList,'UniformOutput',false);
end
end
subDirs = {dirWithSubFolders(dirIndex).name}; %# Get a list of the subdirectories
validIndex = ~ismember(subDirs,{'.','..'}); %# Find index of subdirectories
%# that are not '.' or '..'
for iDir = find(validIndex) %# Loop over valid subdirectories
nextDir = fullfile(dirName,subDirs{iDir}); %# Get the subdirectory path
fileList = [fileList; getAllFiles(nextDir, fileExtension, appendFullPath)]; %# Recursively call getAllFiles
end
end
Example for running the code:
fileList = getAllFiles(dirName, '*.xml', 0); %#0 is false obviously
You can use regexp or strcmp to eliminate . and ..
Or you could use the isdir field if you only want files in the directory, not folders.
list=dir(pwd); %get info of files/folders in current directory
isfile=~[list.isdir]; %determine index of files vs folders
filenames={list(isfile).name}; %create cell array of file names
or combine the last two lines:
filenames={list(~[list.isdir]).name};
For a list of folders in the directory excluding . and ..
dirnames={list([list.isdir]).name};
dirnames=dirnames(~(strcmp('.',dirnames)|strcmp('..',dirnames)));
From this point, you should be able to throw the code in a nested for loop, and continue searching each subfolder until your dirnames returns an empty cell for each subdirectory.
This answer does not directly answer the question but may be a good solution outside of the box.
I upvoted gnovice's solution, but want to offer another solution: Use the system dependent command of your operating system:
tic
asdfList = getAllFiles('../TIMIT_FULL/train');
toc
% Elapsed time is 19.066170 seconds.
tic
[status,cmdout] = system('find ../TIMIT_FULL/train/ -iname "*.wav"');
C = strsplit(strtrim(cmdout));
toc
% Elapsed time is 0.603163 seconds.
Positive:
Very fast (in my case for a database of 18000 files on linux).
You can use well tested solutions.
You do not need to learn or reinvent a new syntax to select, e.g., *.wav files.
Negative:
You are not system independent.
You rely on a single string which may be hard to parse.
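On the parsing concern: strsplit with no delimiter splits on any whitespace, so a path containing a space would be cut in two. A minimal sketch of splitting on line breaks only, using a made-up string in place of real find output:

```matlab
% Stand-in for the cmdout returned by system('find ...'); paths are hypothetical.
cmdout = sprintf('./train/a b.wav\n./train/c.wav\n');

% Trim the trailing newline, then split on newline characters only,
% so that spaces inside a path are preserved.
C = strsplit(strtrim(cmdout), char(10));
```

This still assumes that no file name contains a newline, which find does not guarantee.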
I don't know a single-function method for this, but you can use genpath to recurse a list of subdirectories only. This list is returned as one string of directories separated by pathsep (a semicolon on Windows, a colon on Linux and macOS), so you'll have to split it. With a semicolon delimiter, strread does this:
dirlist = strread(genpath('/path/of/directory'),'%s','delimiter',';')
If you don't want to include the given directory, remove the first entry of dirlist, i.e. dirlist(1)=[]; since it is always the first entry.
Then get the list of files in each directory with a looped dir.
filenamelist=[];
for d=1:length(dirlist)
% keep only filenames
filelist=dir(dirlist{d});
filelist={filelist.name};
% remove '.' and '..' entries
filelist([strmatch('.',filelist,'exact');strmatch('..',filelist,'exact')])=[];
% or to ignore all hidden files, use filelist(strmatch('.',filelist))=[];
% prepend directory name to each filename entry, separated by filesep*
for f=1:length(filelist)
filelist{f}=[dirlist{d} filesep filelist{f}];
end
filenamelist=[filenamelist filelist];
end
filesep returns the directory separator for the platform on which MATLAB is running.
This gives you a list of filenames with full paths in the cell array filenamelist. Not the neatest solution, I know.
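A somewhat shorter variant of the same idea, as a sketch only: it splits on pathsep so it is not tied to the semicolon, uses strsplit in place of strread, and filters with the isdir field instead of strmatch. It builds a throwaway tree under tempname so that it runs anywhere; the folder and file names are made up.

```matlab
% Throwaway tree for demonstration; all names here are hypothetical.
root = tempname;
mkdir(fullfile(root, 'x', 'y'));
fclose(fopen(fullfile(root, 'x', 'y', 'data.txt'), 'w'));

% genpath returns root and all of its subfolders in one delimited string.
dirlist = strsplit(genpath(root), pathsep);
dirlist = dirlist(~cellfun(@isempty, dirlist));   % drop any empty trailing piece

% Collect the full path of every file in every listed folder.
filenamelist = {};
for d = 1:numel(dirlist)
    f = dir(dirlist{d});
    f = f(~[f.isdir]);                            % also removes '.' and '..'
    paths = cellfun(@(n) fullfile(dirlist{d}, n), {f.name}, ...
                    'UniformOutput', false);
    filenamelist = [filenamelist, paths];         %#ok<AGROW>
end
```

Note that genpath skips certain folders (for example private folders and package folders starting with +), so this is not a general-purpose directory walker.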
This is a handy function for getting filenames, with the specified format (usually .mat) in a root folder!
function filenames = getFilenames(rootDir, format)
% Get filenames with specified `format` in the given `rootDir` folder
%
% Parameters
% ----------
% - rootDir: char vector
% Target folder
% - format: char vector = 'mat'
% File format
% default values
if ~exist('format', 'var')
format = 'mat';
end
format = ['*.', format];
filenames = dir(fullfile(rootDir, format));
filenames = arrayfun(...
@(x) fullfile(x.folder, x.name), ...
filenames, ...
'UniformOutput', false ...
);
end
In your case, you can use the following snippet :)
filenames = getFilenames('D:/dic/**');
for i = 1:numel(filenames)
filename = filenames{i};
% do your job!
end
With a small modification, an almost identical approach gets the full path of every file in each subfolder:
dataFolderPath = 'UCR_TS_Archive_2015/';
dirData = dir(dataFolderPath); %# Get the data for the current directory
dirIndex = [dirData.isdir]; %# Find the index for directories
fileList = {dirData(~dirIndex).name}'; %'# Get a list of the files
if ~isempty(fileList)
fileList = cellfun(@(x) fullfile(dataFolderPath,x),... %# Prepend path to files
fileList,'UniformOutput',false);
end
subDirs = {dirData(dirIndex).name}; %# Get a list of the subdirectories
validIndex = ~ismember(subDirs,{'.','..'}); %# Find index of subdirectories
%# that are not '.' or '..'
for iDir = find(validIndex) %# Loop over valid subdirectories
nextDir = fullfile(dataFolderPath,subDirs{iDir}); %# Get the subdirectory path
getAllFiles = dir(nextDir);
for k = 1:1:size(getAllFiles,1)
validFileIndex = ~ismember(getAllFiles(k,1).name,{'.','..'});
if(validFileIndex)
filePathComplete = fullfile(nextDir,getAllFiles(k,1).name);
fprintf('The Complete File Path: %s\n', filePathComplete);
end
end
end