Faster copyfiles Matlab - matlab

I have a folder origin_training with subfolders like FM, folder1, folder2, ... . I can get the list of image files in .png format into a cell called FM.
FM_dir = fullfile(origin_training, 'FM');
FM = struct2cell(dir(fullfile(FM_dir, '*.png')))';
My goal is to match the names in my folder with the images in my cd, and make a new folder FM with the images from cd. Image names in the two paths are identical. I can do that with:
% mother_folder = '...' my current directory
filenames = FM(:,1);
destination = fullfile(mother_folder, 'FM', FM(:,1));
cellfun(#copyfile,filenames, destination);
This is really slow. Takes minutes even for small amounts of images (~30).
My current directory has ~ 10000 images (the ones that correspond to FM, folder2, folder3, ...). What am I missing here?

An alternative would be to create a shell command and execute it. Assuming FM contains full paths for each file, or relative paths from the current directory, then:
FM_dir = fullfile(origin_training, 'FM');
destination = fullfile(mother_folder, 'FM');
curdir = cd(FM_dir);
FM = dir('*.png');
cmd = ['cp ', sprintf('%s ',FM.name), destination];
system(cmd);
cd(curdir);
If you're on Windows, replace 'cp' by 'copy'.
Note that here we're not creating a shell command per file to be copied (presumably what copyfile does), but a single shell command that copies all files at once. We're also not specifying the name for each copied file, we specify the names of the files to be copied and where to copy them to.
Note also that for this to work, mother_folder must be an absolute path.

I somehow put my code into a function and now it works at expected speed. I suspected that cd had to do with the low speed so I am only passing the full directories as character vectors.
The same approach works for my purpose or with a slight modification to just copy from A to B. As now, it works to match files in A to those on B and copy to B/folder_name
function [out] = my_copyfiles(origin_dir, folder_name, dest_dir, extension)
new_dir = fullfile(origin_dir, folder_name);
% Get a cell with filenames in the origin_dir/folder_name
% Mind transposing to have them in the rows
NEW = struct2cell(dir(fullfile(new_dir, extension)))';
dest_folder = fullfile(dest_dir, folder_name);
% disp(dest_folder)
if ~exist(dest_folder, 'dir')
mkdir(dest_folder)
end
%% CAUTION, This is the step
% Use names from NEW to generate fullnames in dest_dir that would match
filenames = fullfile(dest_dir, NEW(:,1));
destination = fullfile(dest_folder, NEW(:,1));
% if you wanted from origin, instead run
% filenames = fullfile(new_dir, NEW(:,1));
% make the copies
cellfun(#copyfile,filenames, destination);
%Save the call
out.originDir = new_dir;
out.copyFrom = dest_dir;
out.to = dest_folder;
out.filenames = filenames;
out.destination = destination;
end

Related

How to loop through a set of files of varying type in MATLAB

I have a set of files in a folder that I want to convert to a different type using MATLAB, ie. for each file in this folder do "x".
I have seen answers that suggest the use of the "dir" function to create a structure that contains all the files as elements (Loop through files in a folder in matlab). However, this function requires me to specify the file type (.csv, .txt, etc). The files I want to process are from a very old microscope and have the file suffixes .000, .001, .002, etc so I'm unable to use the solutions suggested elsewhere.
Does anyone know how I can get around this problem? This is my first time using MATLAB so any help would be greatly appreciated. All the best!
As suggested by beaker and Ander Biguri, you can do this by getting all files and then sorting.
Here is the related code with explanations inline:
baseDir = 'C:\mydir';
files = dir(baseDir);
files( [ files.isdir] ) = []; % This removes directories, especially '.' and '..'.
fileNames = cell(size(files)); % In this cell, we store the file names.
fileExt = cell(size(files)); % In this cell, we store the file extensions.
for k = 1 : numel(fileNames )
cf = files(k);
fileNames{k} = [ cf.folder filesep cf.name]; % Get file name
[~,~, fileExt{k}] = fileparts( cf.name ); % Get extension
end
[~,idx] = sort( fileExt ); % Get index to sort by extension
fileNames = fileNames( idx ); % Do the actual sorting

Problem adding tow digits to path string to read multiple files

I want to read multiple files from a folder, the only problem in my code down is that
instead of having the full image path:
'checkboard/Gaussian_Noised/image_01.jpg'
I have the path without the zero in the last two digits
EDIT: There is space instead of zero on the first 9 Image paths.
'checkboard/Gaussian_Noised/image_ 1.jpg'
how can I fix this please
%-------------------------------------------------------
clc
clear all
close all
%------------------------------------------------------
path = 'checkboard/Gaussian_Noised/';
% Define images to process
imageFileNames = cell(1,36);
for n = 1:36
imageFileNames{n} = strcat(path,sprintf('image_%2d.jpg',n))
end
%------------------------------------------------------
In this case, it is better to use the float-format-specifier to pad zeros:
num2str(1,'%02.f')
'01'
num2str(1,'%02.d')
'1'
num2str(1,'%2.d')
'1'
num2str(1,'%d')
'1'
So in your case:
sprintf('image_%02.f.jpg',n)
general advice
As a general advice, you may want just to check the folder for all files of a certain pattern
path2dir = pwd; % path to working directory
pattern = 'image_*.jpg'
Lst = dir(fullfile(path2dir ,pattern));
for i = 1:length(Lst)
path2file = fullfile(Lst(i).folder,Lst(i).name);
% load the file and do something with it
end

How to read specific images from one folder has different types of images and text files and save into another one

I have a folder with 10 subfolder each has about 100 different files including text files and different extension images. I just need to copy the image files with JPG extension and move it to another single folder.
I am using this code:
clear all
clc
M_dir = 'X:\Datasets to be splitted\Action3\Action3\'% source directory
D_dir = 'X:\Datasets to be splitted\Action3\Depth\'
files = dir(M_dir);% main directory
dirFlags = [files.isdir];
subFolders = files(dirFlags);%list of folders
for k = 1 :length(subFolders)
if any(isletter(subFolders(k).name))
c_dtry = strcat(M_dir,subFolders(k).name)
fileList = getAllFiles(c_dtry)%list of files in subfolder
for n1 = 1:length(fileList)
[pathstr,name,ext] = fileparts(fileList{n1})% file type
%s = dir(fileList{n1});
%S = s.bytes/1000;%file size
Im = imread(fileList{n1});
%[h,w,d] = size(Im);%height width and dimension
if strcmp(ext,'.jpg')|strcmp(ext,'.JPG')%)&S>=50&(write image dimension condition))% here you need to modify
baseFileName = strcat(name,ext);
fullFileName = fullfile(D_dir, baseFileName); % No need to worry about slashes now!
imwrite(Im, fullFileName);
else
end
end
end
end
But the code is stopped with an error when a text file being processed.
The error says:
Error using imread>get_format_info (line 491)
Unable to determine the file format.
Error in imread (line 354)
fmt_s = get_format_info(fullname);
Error in ReadFromSubFolder (line 16)
Im = imread(fileList{n1});
Thanks
Your code is reading the data before you check the extension
Im = imread(fileList{n1});
is before
if strcmp(ext,'.jpg')|strcmp(ext,'.JPG')
edit
For finding files which end 'vis' you can usestrcmp
strcmp ( name(end-2:end), 'vis' )
For completeness you should also check that name is longer than 3 char.
Here is a relatively simpler solution using dir and movefile or copyfile depending on whether you want to move or copy:
%searching jpg files in all subdirectories of 'X:\Datasets to be splitted\Action3\Action3'
file = dir('X:\Datasets to be splitted\Action3\Action3\**\*.jpg');
filenames_with_path = strcat({file.folder},'\',{file.name});
destination_dir = 'X:\Datasets to be splitted\Action3\Depth\';
%mkdir(destination_dir); %create the directory if it doesn't exist
for k=1:length(filenames_with_path)
movefile(filenames_with_path{k}, destination_dir, 'f'); %moving the files
%or if you want to copy then: copyfile(filenames_with_path{k}, destination_dir, 'f');
end

Batch Change File Names in Matlab

I have 200 JPEG images numbered from 1200 to 1399. How do I change their names from 1200.jpg-1400.jpg to 1.jpg-200.jpg, while keeping the original order of the image names?
This is a problem which can be tackled more efficiently in bash or other shell scripting languages. Probably, it can even be solved just via a single shell find.
Anyway, matlab can also do the job.
Consider this.
list = dir('./stack*.png'); % # assuming the file names are like stack1200.jpg
offset = -1200; # % numbering offset amount
for idx = 1:length(list) % # we go through every file name
name = list(idx).name;
number = sscanf(name,'stack%f.png',1); % # we extract the number
movefile(name,['stack' num2str(number + offset) '.png' ]); % # we rename the file.
end
I have find this solution which is quiet straightforward.
Pathfold= ('Path\to\images\');
dirData = dir('path\to\images\*.jpg');
fileNames = {dirData.name};
for i = 1:size(fileNames)
newName = sprintf('%d.jpg',i);
movefile(fileNames{i},[Pathfold , newName]) %# Rename the file
end

How to get all files under a specific directory in MATLAB?

I need to get all those files under D:\dic and loop over them to further process individually.
Does MATLAB support this kind of operations?
It can be done in other scripts like PHP,Python...
Update: Given that this post is quite old, and I've modified this utility a lot for my own use during that time, I thought I should post a new version. My newest code can be found on The MathWorks File Exchange: dirPlus.m. You can also get the source from GitHub.
I made a number of improvements. It now gives you options to prepend the full path or return just the file name (incorporated from Doresoom and Oz Radiano) and apply a regular expression pattern to the file names (incorporated from Peter D). In addition, I added the ability to apply a validation function to each file, allowing you to select them based on criteria other than just their names (i.e. file size, content, creation date, etc.).
NOTE: In newer versions of MATLAB (R2016b and later), the dir function has recursive search capabilities! So you can do this to get a list of all *.m files in all subfolders of the current folder:
dirData = dir('**/*.m');
Old code: (for posterity)
Here's a function that searches recursively through all subdirectories of a given directory, collecting a list of all file names it finds:
function fileList = getAllFiles(dirName)
dirData = dir(dirName); %# Get the data for the current directory
dirIndex = [dirData.isdir]; %# Find the index for directories
fileList = {dirData(~dirIndex).name}'; %'# Get a list of the files
if ~isempty(fileList)
fileList = cellfun(#(x) fullfile(dirName,x),... %# Prepend path to files
fileList,'UniformOutput',false);
end
subDirs = {dirData(dirIndex).name}; %# Get a list of the subdirectories
validIndex = ~ismember(subDirs,{'.','..'}); %# Find index of subdirectories
%# that are not '.' or '..'
for iDir = find(validIndex) %# Loop over valid subdirectories
nextDir = fullfile(dirName,subDirs{iDir}); %# Get the subdirectory path
fileList = [fileList; getAllFiles(nextDir)]; %# Recursively call getAllFiles
end
end
After saving the above function somewhere on your MATLAB path, you can call it in the following way:
fileList = getAllFiles('D:\dic');
You're looking for dir to return the directory contents.
To loop over the results, you can simply do the following:
dirlist = dir('.');
for i = 1:length(dirlist)
dirlist(i)
end
This should give you output in the following format, e.g.:
name: 'my_file'
date: '01-Jan-2010 12:00:00'
bytes: 56
isdir: 0
datenum: []
I used the code mentioned in this great answer and expanded it to support 2 additional parameters which I needed in my case. The parameters are file extensions to filter on and a flag indicating whether to concatenate the full path to the name of the file or not.
I hope it is clear enough and someone will finds it beneficial.
function fileList = getAllFiles(dirName, fileExtension, appendFullPath)
dirData = dir([dirName '/' fileExtension]); %# Get the data for the current directory
dirWithSubFolders = dir(dirName);
dirIndex = [dirWithSubFolders.isdir]; %# Find the index for directories
fileList = {dirData.name}'; %'# Get a list of the files
if ~isempty(fileList)
if appendFullPath
fileList = cellfun(#(x) fullfile(dirName,x),... %# Prepend path to files
fileList,'UniformOutput',false);
end
end
subDirs = {dirWithSubFolders(dirIndex).name}; %# Get a list of the subdirectories
validIndex = ~ismember(subDirs,{'.','..'}); %# Find index of subdirectories
%# that are not '.' or '..'
for iDir = find(validIndex) %# Loop over valid subdirectories
nextDir = fullfile(dirName,subDirs{iDir}); %# Get the subdirectory path
fileList = [fileList; getAllFiles(nextDir, fileExtension, appendFullPath)]; %# Recursively call getAllFiles
end
end
Example for running the code:
fileList = getAllFiles(dirName, '*.xml', 0); %#0 is false obviously
You can use regexp or strcmp to eliminate . and ..
Or you could use the isdir field if you only want files in the directory, not folders.
list=dir(pwd); %get info of files/folders in current directory
isfile=~[list.isdir]; %determine index of files vs folders
filenames={list(isfile).name}; %create cell array of file names
or combine the last two lines:
filenames={list(~[list.isdir]).name};
For a list of folders in the directory excluding . and ..
dirnames={list([list.isdir]).name};
dirnames=dirnames(~(strcmp('.',dirnames)|strcmp('..',dirnames)));
From this point, you should be able to throw the code in a nested for loop, and continue searching each subfolder until your dirnames returns an empty cell for each subdirectory.
This answer does not directly answer the question but may be a good solution outside of the box.
I upvoted gnovice's solution, but want to offer another solution: Use the system dependent command of your operating system:
tic
asdfList = getAllFiles('../TIMIT_FULL/train');
toc
% Elapsed time is 19.066170 seconds.
tic
[status,cmdout] = system('find ../TIMIT_FULL/train/ -iname "*.wav"');
C = strsplit(strtrim(cmdout));
toc
% Elapsed time is 0.603163 seconds.
Positive:
Very fast (in my case for a database of 18000 files on linux).
You can use well tested solutions.
You do not need to learn or reinvent a new syntax to select i.e. *.wav files.
Negative:
You are not system independent.
You rely on a single string which may be hard to parse.
I don't know a single-function method for this, but you can use genpath to recurse a list of subdirectories only. This list is returned as a semicolon-delimited string of directories, so you'll have to separate it using strread, i.e.
dirlist = strread(genpath('/path/of/directory'),'%s','delimiter',';')
If you don't want to include the given directory, remove the first entry of dirlist, i.e. dirlist(1)=[]; since it is always the first entry.
Then get the list of files in each directory with a looped dir.
filenamelist=[];
for d=1:length(dirlist)
% keep only filenames
filelist=dir(dirlist{d});
filelist={filelist.name};
% remove '.' and '..' entries
filelist([strmatch('.',filelist,'exact');strmatch('..',filelist,'exact'))=[];
% or to ignore all hidden files, use filelist(strmatch('.',filelist))=[];
% prepend directory name to each filename entry, separated by filesep*
for f=1:length(filelist)
filelist{f}=[dirlist{d} filesep filelist{f}];
end
filenamelist=[filenamelist filelist];
end
filesep returns the directory separator for the platform on which MATLAB is running.
This gives you a list of filenames with full paths in the cell array filenamelist. Not the neatest solution, I know.
This is a handy function for getting filenames, with the specified format (usually .mat) in a root folder!
function filenames = getFilenames(rootDir, format)
% Get filenames with specified `format` in given `foler`
%
% Parameters
% ----------
% - rootDir: char vector
% Target folder
% - format: char vector = 'mat'
% File foramt
% default values
if ~exist('format', 'var')
format = 'mat';
end
format = ['*.', format];
filenames = dir(fullfile(rootDir, format));
filenames = arrayfun(...
#(x) fullfile(x.folder, x.name), ...
filenames, ...
'UniformOutput', false ...
);
end
In your case, you can use the following snippet :)
filenames = getFilenames('D:/dic/**');
for i = 1:numel(filenames)
filename = filenames{i};
% do your job!
end
With little modification but almost similar approach to get the full file path of each sub folder
dataFolderPath = 'UCR_TS_Archive_2015/';
dirData = dir(dataFolderPath); %# Get the data for the current directory
dirIndex = [dirData.isdir]; %# Find the index for directories
fileList = {dirData(~dirIndex).name}'; %'# Get a list of the files
if ~isempty(fileList)
fileList = cellfun(#(x) fullfile(dataFolderPath,x),... %# Prepend path to files
fileList,'UniformOutput',false);
end
subDirs = {dirData(dirIndex).name}; %# Get a list of the subdirectories
validIndex = ~ismember(subDirs,{'.','..'}); %# Find index of subdirectories
%# that are not '.' or '..'
for iDir = find(validIndex) %# Loop over valid subdirectories
nextDir = fullfile(dataFolderPath,subDirs{iDir}); %# Get the subdirectory path
getAllFiles = dir(nextDir);
for k = 1:1:size(getAllFiles,1)
validFileIndex = ~ismember(getAllFiles(k,1).name,{'.','..'});
if(validFileIndex)
filePathComplete = fullfile(nextDir,getAllFiles(k,1).name);
fprintf('The Complete File Path: %s\n', filePathComplete);
end
end
end