I have some 1000 images I need to load as training data for a facial recognition program.
There are 100 people, each with 10 unique pictures.
They are saved in a folder structure like this:
myTraining //main folder
- John // sub folder
- John_Smith_001, John_Smith_002, ... , 00n, //images
- Mary // sub folder
- Mary_Someone_001... you get the idea :)
I'm familiar with a lot of MATLAB, but not with ways of iterating through external files.
What is an easy way to go through each folder, one by one, and load the images, ideally retrieving the file names and using them as variable/image names?
Thanks in advance.
The following commands recursively list all files in a specific directory and its sub-directories. I have listed them for both Windows and Mac/Linux. Unfortunately I cannot test the Mac/Linux version since I am not near any of those machines, but it should be fairly similar to what is written below.
Windows
[~,result] = system('dir C:\Users\username\Desktop /a-d /s /b');
files = regexp(strtrim(result),'\r?\n','split') % strtrim drops the trailing newline so no empty entry is produced
Mac/Linux
[~,result] = system('find /some/Directory -type f'); % -type f selects regular files
files = regexp(strtrim(result),'\n','split')
You can then iterate through the resulting cell array, files, and do whatever loading you need with imread or a similar function.
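As a rough sketch of that iteration: the listing text below is a made-up stand-in for the output of system(...), and the extension list is an assumption, so adjust both to your data. Paths that do not exist on disk are skipped rather than passed to imread.

```matlab
% Hypothetical listing output; in practice this comes from system(...).
result = sprintf('C:\\imgs\\John\\John_Smith_001.jpg\nC:\\imgs\\notes.txt\n');
files = regexp(strtrim(result), '\r?\n', 'split');

% Keep only entries with an image extension (this list is an assumption).
[~, ~, exts] = cellfun(@fileparts, files, 'UniformOutput', false);
imageFiles = files(ismember(lower(exts), {'.jpg', '.jpeg', '.png'}));

% Load every image that actually exists on disk.
images = cell(size(imageFiles));
for k = 1:numel(imageFiles)
    if exist(imageFiles{k}, 'file') == 2
        images{k} = imread(imageFiles{k});
    end
end
```

Filtering by extension first keeps non-image files in the listing from reaching imread.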
You can do it like this:
basePath = pwd; % your base path, which in your case is myTraining
allPaths = dir(basePath); % get all directory content
subFolders = [allPaths(:).isdir]; % logical index of the entries that are folders
foldersNames = {allPaths(subFolders).name}'; % keep only the folder names
foldersNames(ismember(foldersNames,{'.','..'})) = []; % drop the '.' and '..' entries
for i = 1:length(foldersNames) % loop through all folders
    % build the path with fullfile so it works on any OS and needs no cd
    currentPath = fullfile(basePath, foldersNames{i});
    files = dir(fullfile(currentPath, '*.jpg')); % list all images in this folder, e.g. John or Mary
    for j = 1:length(files) % loop through the images
        img = imread(fullfile(currentPath, files(j).name)); % read each image and do what you want
    end
end
For JPG images in the current folder it would be:
files = dir('*.jpg');
for file = files'
img = imread(file.name);
% Do some stuff
end
and if you have multiple extensions use
files = [dir('*.jpg'); dir('*.gif')]
I hope this helps
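To also use the file names as image names, one option is a struct with dynamic field names instead of separate variables. This is only a sketch: the location of myTraining and the .jpg extension are assumptions, and matlab.lang.makeValidName needs R2014a or later.

```matlab
basePath = fullfile(pwd, 'myTraining');   % assumed location of the main folder

% List the person folders, skipping '.' and '..'.
entries = dir(basePath);
entries = entries([entries.isdir] & ~ismember({entries.name}, {'.', '..'}));

images = struct();   % one field per image, named after its file
for i = 1:numel(entries)
    folder = fullfile(basePath, entries(i).name);
    files = dir(fullfile(folder, '*.jpg'));   % assumes .jpg images
    for j = 1:numel(files)
        [~, name] = fileparts(files(j).name);       % e.g. John_Smith_001
        key = matlab.lang.makeValidName(name);      % make it a legal field name
        images.(key) = imread(fullfile(folder, files(j).name));
    end
end
```

After this, images.John_Smith_001 would hold the first picture of John. Two files with the same name in different folders would overwrite each other, so this relies on the names being unique, as they appear to be in the question.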
Related
I have a lot of videos to process, which are kept in a different folder from my current MATLAB directory, and VideoReader is not accepting the directory address of the video. I need help creating a video object for a video kept in a different folder.
filePattern = fullfile(pwd, 'videoDir\videoname.mp4');
fileList = dir (filePattern );
video_name =fileList.name;
obj = VideoReader(video_name);
The .name field of the directory structure is only the final part of the name - it does not include any folders or subfolders. Your very first line defines the entire absolute path and filename for the video file. You can pass that to VideoReader directly.
filePattern = fullfile(pwd, 'videoDir\videoname.mp4');
obj = VideoReader(filePattern);
In fact, there's no reason you need the 'fullfile' call unless you are going to want to reference this file from a different directory at some later date.
obj = VideoReader('videoDir/videoname.mp4');
For a more flexible version of this, consider we have a bunch of *.mp4 files in a bunch of sub-directories and we want to step through all of them.
Directory = dir('*/*.mp4'); % this command works on Windows or Linux
for jj = 1:length(Directory)
obj(jj) = VideoReader(fullfile(Directory(jj).folder,Directory(jj).name));
end
I have a lot of .fig files that are named like this: 20160922_01_id_32509055.fig, 20160921_02_id_53109418.fig and so on.
I would like to create a script that loops through all the .fig files in the folder and groups (copies) them into other folders based on the last number in the file name, with each folder named after the id number. Is this possible?
I have been looking at other solutions that involve looping through folders, but I am totally new to this. It would make it easier for me to check the .fig files while I am learning to do other things in MATLAB.
All is possible with MATLAB! We can use dir to get all .fig files, then use regexp to get the numeric part of each filename, and then use copyfile to copy the file to its new home. If you want to move it rather than copy it, use movefile instead.
% Define where the files are now and where you want them.
srcdir = '/my/input/directory';
outdir = '/my/output/directory';
% Find all .fig files in the source directory
figfiles = dir(fullfile(srcdir, '*.fig'));
figfiles = {figfiles.name};
for k = 1:numel(figfiles)
% Extract the last numeric part from the filename
numpart = regexp(figfiles{k}, '(?<=id_)\d+', 'match', 'once');
% Determine the folder we are going to put it in
destination = fullfile(outdir, numpart);
% Make sure the folder exists
if ~exist(destination, 'dir')
mkdir(destination)
end
% Copy the file there!
copyfile(fullfile(srcdir, figfiles{k}), destination)
end
Here's an example how to identify and copy the files. I'll let you do the for loop :)
>> Figs = dir('*.fig'); % I had two .fig files on my desktop
>> Basename = strsplit(Figs(1).name, '.');
>> Id = strsplit(Basename{1}, '_');
>> Id = Id{3};
>> mkdir(fullfile('./',Id));
>> copyfile(Figs(1).name, fullfile('./',Id));
Play with the commands to see what they do. It should be straightforward :)
I have a folder dat that can contain n subfolders, and these contain various .dat files.
I need to store all the files in these subfolders in a data structure myarchive that contains the file_name, its subfolder_name, and an object resulanalysis that is the result of my analysis script.
The purpose of this operation is to obtain the file_name and subfolder_name of the entries in myarchive that match a given result.
With this code I am able to get the analysis results for all the files in the current folder, and I have the matching function, but I don't know how to solve the classification problem described above.
files = dir('*.dat');
for file = files'
im = load(file.name);
result=myanalyzer(im);
end
Could someone help me?
If someone has a better strategy for my problem, it is welcome.
Thanks.
If I understand you correctly, you want to go through all the subfolders, load the .dat files, run some analysis, and see if the result matches a certain value. If it matches, you want to save the names of the subfolder and the file in a data structure myarchive. If that's the case, here's the code:
topfolder = '...\dat\'; % Specify the full path to the dat folder
subfolderlist = dir(topfolder);
% keep only real subfolders (drops files as well as '.' and '..')
subfolderlist = subfolderlist([subfolderlist.isdir] & ~ismember({subfolderlist.name}, {'.','..'}));
counter = 0;
for ii = 1:length(subfolderlist)
    subfolder = fullfile(topfolder, subfolderlist(ii).name);
    filelist = dir(fullfile(subfolder, '*.dat')); % struct array, one element per .dat file
    for jj = 1:length(filelist)
        im = load(fullfile(subfolder, filelist(jj).name));
        if isequal(myanalyzer(im), result) % result is the value you want to match
            counter = counter + 1;
            myarchive(counter).subfolder_name = subfolderlist(ii).name;
            myarchive(counter).file_name = filelist(jj).name;
        end
    end
end
I have a data set with a large number of videos. I want to read these videos and save each one separately under its own name, because processing all of them every time takes a lot of time, especially for training and classification. Does anyone have an idea how I can read all the .avi video files in the folder D:\words and save each one under its own name as a .MAT file?
This is my code, but it doesn't work:
Thanks.
files = fuf('D:\words');
for i = 1:size(files);
name = files{i};
file = strcat('D:\words',name);
x = VideoReader(file.avi); %NOT SURE FROM THIS LINE%
v = read(x)
name = strcat(name,'.mat');
save(name,'v');
end
You don't need an additional function like fuf to get a list of file names.
If all your files are in "D:\words" (i.e. not in a bunch of sub-directories, which would complicate things), you can just use things like ls to fetch a list of all the avi files.
This is not the most elegant way of doing it (hard coding the directory and not using things like fullfile), but hopefully it's relatively easy to understand what's going on:
% use ls or dir to specifically match *.avi files
files = ls('D:\words\*.avi')
% note that size can return more than one value
% hence size(files,1)
for n = 1:size(files,1);
filename = files(n,:); % pick one file
% assuming this works - you might want to do some error checking
x = VideoReader(filename);
v = read(x);
% now we just want the name minus the ext
[pathstr,name,ext] = fileparts(filename);
fout = ['D:\words\',name,'.mat'];
save(fout,'v');
end
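Your variable file is likely a string, not a structure, so file.avi is not valid. Note also that strcat('D:\words',name) leaves out the separator between the folder and the file name; fullfile inserts it for you:
...
file = fullfile('D:\words',name);
x = VideoReader(file);
...
Or maybe this, if the files in your cell array don't have extensions:
...
file = fullfile('D:\words',name);
x = VideoReader([file '.avi']);
...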
If your fuf function returns files that are not AVI movies, you'll need to do more work.
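One way that extra work might look is to filter the list by extension before opening anything. This is a sketch only: the cell array below is a made-up stand-in for whatever fuf returns, and it assumes the entries are bare file names with extensions.

```matlab
% Hypothetical file list standing in for the output of fuf.
files = {'hello.avi', 'notes.txt', 'WORLD.AVI'};

% Keep only entries whose extension is .avi (case-insensitive).
[~, ~, exts] = cellfun(@fileparts, files, 'UniformOutput', false);
aviFiles = files(strcmpi(exts, '.avi'));

for n = 1:numel(aviFiles)
    file = fullfile('D:\words', aviFiles{n});
    if exist(file, 'file') == 2          % skip anything not present on disk
        x = VideoReader(file);
        v = read(x);
        [~, name] = fileparts(file);
        save(fullfile('D:\words', [name '.mat']), 'v');
    end
end
```

With the example list above, aviFiles contains hello.avi and WORLD.AVI, and notes.txt is dropped.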
1) I have an original directory called "Original_file" that contains a number of images. The code below reads those images from the directory, converts them to greyscale, and then writes them into a new directory called "Target_File".
Target_File='modified_images';
mkdir(Target_File);
directory='original_images';
dnames = {directory};
cI = cell(1,1);
c{1} = dir(dnames{1});
cI{1} = cell(length(c{1}),1);
for j = 1:length(c{1}),
cI{1}{j} = double(imread([dnames{1} '/' c{1}(j).name]))./255;
cI{1}{j} = rgb2gray(cI{1}{j});
imwrite(cI{1}{j}, fullfile(Target_File, ['image' num2str(j) '.jpg']));
end
2) From the "Target_File": the code below randomly selects a specific number of images and puts them in a training folder.
Train_images='training_file';
mkdir(Train_images);
ImageFiles = dir('Target_File');
totalNumberOfFiles = length(ImageFiles)-1;
scrambledList = randperm(totalNumberOfFiles);
numberIWantToUse = 5; % for example 5
loop_counter = 1;
for index = scrambledList(1 :numberIWantToUse)
baseFileName = ImageFiles(index).name
str = fullfile('Target_File', baseFileName);
image = imread(str);
imwrite( image, fullfile(Train_images, ['image' num2str(index) '.jpg']));
loop_counter = loop_counter + 1;
end
What am I asking?
A) Suppose we have a directory that contains several folders (folder1, folder2, ..., foldern), each containing several images. How can I edit my code in 1) to apply the same concept and get a new directory "Target_File" that contains the same number of folders, with each folder containing the greyscale images?
Then, from the Target_File created in A), I want to select randomly (as in 2)) a specific number of images from each folder in Target_File and put them in a training file, and put the remaining images in a testing file. This procedure is repeated for all folders in the directory.
So if the directory contains 3 folders, each of these folders is split into training and test files: the first folder is split into train1 and test1, the second into train2 and test2, the third into train3 and test3, and so on. How should I edit my code in 2)?
Any help would be much appreciated.
You can use the dir command to get a list of sub-directories, and then loop through that list with calls to mkdir to create each one in turn. After that, it is just a matter of matching the file paths so you can save the greyscale image loaded from a source subfolder to its corresponding target folder.
Specifically, D = dir('directory') returns a struct array with one element per entry in 'directory'. D(i).isdir is 1 if D(i).name is the name of one of your subfolders. Note that the listing also includes the navigation entries . and .., which you need to skip; they are usually D(1:2), but comparing D(i).name against '.' and '..' is more robust than relying on their position. So, get your list of directory contents, and then loop through it, calling mkdir whenever D(i).isdir is 1.
I am not sure I understand the rest of your question, but if you just need a random subsample of the entire image set (regardless of subfolder it is stored in), while you are making your subfolders above you can also make secondary calls of dir to the subfolders to get a list of their contents. Loop through and check whether each element is an image, and if it is save it to an array of image path names. When you have compiled this master list, you can grab a random subset from it.
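If the split is meant to happen per subfolder, as the question describes, the approach above might be sketched as follows. The folder names follow the question's code, while splitRoot, the trainN/testN layout, and the restriction to .jpg files are assumptions of mine.

```matlab
srcRoot   = 'original_images';   % source directory with one subfolder per class
dstRoot   = 'modified_images';   % greyscale copy (the question's Target_File)
splitRoot = 'split_images';      % hypothetical root for the trainN/testN folders
numTrain  = 5;                   % training images per subfolder (example value)

D = dir(srcRoot);
D = D([D.isdir] & ~ismember({D.name}, {'.', '..'}));   % real subfolders only

for i = 1:numel(D)
    srcDir   = fullfile(srcRoot, D(i).name);
    dstDir   = fullfile(dstRoot, D(i).name);
    trainDir = fullfile(splitRoot, sprintf('train%d', i));
    testDir  = fullfile(splitRoot, sprintf('test%d', i));
    for d = {dstDir, trainDir, testDir}
        if ~exist(d{1}, 'dir')
            mkdir(d{1});
        end
    end

    files = dir(fullfile(srcDir, '*.jpg'));   % assumes JPEG images
    order = randperm(numel(files));
    isTrain = false(1, numel(files));
    isTrain(order(1:min(numTrain, numel(files)))) = true;   % random training picks

    for k = 1:numel(files)
        img = imread(fullfile(srcDir, files(k).name));
        if size(img, 3) == 3
            img = rgb2gray(img);              % convert colour images only
        end
        imwrite(img, fullfile(dstDir, files(k).name));
        if isTrain(k)
            imwrite(img, fullfile(trainDir, files(k).name));
        else
            imwrite(img, fullfile(testDir, files(k).name));
        end
    end
end
```

Each image is written once to its mirrored greyscale folder and once to either trainN or testN, so the original files are never modified.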