Is it possible to extract files according to filenames listed in a text file using MATLAB?

I have a thousand files in a folder, but I only need to extract a hundred of them into a new folder, according to the filenames listed in a text file. The filenames in the text file are listed in a column. Can this be done with MATLAB, and what code would I need to write? Thanks.
example:
filenames.txt is in C:\matlab
The folder containing the thousand files is named BigFiles and is also in C:\matlab
The files to be extracted from the BigFiles folder are listed in a column, as below:
filenames.txt
a1sndh
sd3rfe
rgd4de
sd5erw
please advise...thanks...

Enumerate all files in a folder of a specific type (if needed) using:
%main directory to process
directory = 'to_process';
%enumerate all files (.m in this case)
files = dir(fullfile(directory,'*.m'));
numfiles = length(files);
fprintf('Found %i files\n',numfiles)
Then you could load the single column using one of the many file I/O functions in Matlab.
Then loop through the names read from the list and check each one against the names of the enumerated files (files(i).name, since dir returns a struct array); if there is a match, move the file.
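The matching step described above can be sketched roughly as follows. This is only a sketch: it assumes the listed names include the file extension, and the temporary folders and sample file names are hypothetical, created just so the example runs on its own.

```matlab
% Hypothetical setup: a temporary source folder holding three empty files.
srcDir = tempname;  mkdir(srcDir);
dstDir = tempname;  mkdir(dstDir);
allNames = {'a1sndh.m', 'sd3rfe.m', 'zzz.m'};
for k = 1:numel(allNames)
    fid = fopen(fullfile(srcDir, allNames{k}), 'w');
    fclose(fid);
end

% Names we want (in practice these would be read from the text file).
wanted = {'a1sndh.m', 'sd3rfe.m', 'missing.m'};

% Enumerate the folder and keep only the names that appear in the list.
files  = dir(fullfile(srcDir, '*.m'));
found  = {files.name};
toMove = found(ismember(found, wanted));

% Move each match into the destination folder.
for k = 1:numel(toMove)
    movefile(fullfile(srcDir, toMove{k}), fullfile(dstDir, toMove{k}));
end
fprintf('Moved %d of %d requested files\n', numel(toMove), numel(wanted));
```

Using ismember avoids a nested loop, and names in the list that have no matching file are skipped rather than raising an error.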

EDIT:
From what I understood, you are looking for a solution along the lines:
filenames.txt
a.txt
b.txt
c.txt
.
.
.
moveMyFiles.m
%# read filenames listed in a text file
fid = fopen('C:\matlab\filenames.txt');
fList = textscan(fid, '%s');
fList = fList{1};
fclose(fid);
%# source/destination folder names
sourceDir = 'C:\matlab\BigFiles';
destDir = 'C:\matlab\out';
if ~exist(destDir,'dir')
mkdir(destDir);
end
%# move files one by one
for i=1:numel(fList)
movefile(fullfile(sourceDir,fList{i}), fullfile(destDir,fList{i}));
end
You can replace the MOVEFILE function by COPYFILE if you simply want to copy the files instead of moving them...
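The names in the question's example have no extension. If the list really holds base names only (an assumption; the question does not say), one possible variant compares the list against the base name of each file, obtained with fileparts, and copies the matches. The .dat extension, the temporary folders and the list file below are hypothetical and exist only to make the sketch self-contained.

```matlab
% Hypothetical setup so the sketch runs on its own.
sourceDir = tempname;  mkdir(sourceDir);
destDir   = tempname;  mkdir(destDir);
stored = {'a1sndh.dat', 'sd3rfe.dat', 'unlisted.dat'};
for k = 1:numel(stored)
    fid = fopen(fullfile(sourceDir, stored{k}), 'w');
    fclose(fid);
end
listFile = [tempname '.txt'];
fid = fopen(listFile, 'w');
fprintf(fid, '%s\n', 'a1sndh', 'sd3rfe', 'rgd4de');
fclose(fid);

% Read the list of base names, one per line.
fid = fopen(listFile);
fList = textscan(fid, '%s');
fList = fList{1};
fclose(fid);

% Compare the list against the base name of every file in the folder.
files = dir(sourceDir);
files = files(~[files.isdir]);          % drop '.' and '..' and subfolders
[~, baseNames] = cellfun(@fileparts, {files.name}, 'UniformOutput', false);
keep = ismember(baseNames, fList);

% Copy the matches; listed names with no matching file are simply skipped.
for k = find(keep)
    copyfile(fullfile(sourceDir, files(k).name), destDir);
end
```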

Related

How to rename a file inside a zip file without extracting it using Matlab commands

I have a bunch of zip folders that I have to extract and read the data from (stored in a single file). The problem is that, because of some error, some of these folders contain two files (instead of one) with the same name. When I use the MATLAB command "unzip", one of the files is overwritten by the other. The two files are not the same: one of them has the information I need, and the other one is almost empty. So I would like to rename these two files to file_a and file_b, extract them, and once both are extracted, keep only the larger one.
Do you know if there is any way to rename files inside a zip?
I made a function which modifies the filenames inside the zip file so they can be uncompressed seamlessly.
The function locates the file names in the zip file and replaces the first letter of each file name it encounters with the next letter of the sequence "A, B, C, D, etc.".
function differentiateFileNames(zipFilename)
%% get the filenames contained in the zip file
filenames = getZipFileNames(zipFilename) ;
nFiles = numel(filenames) ;
%% Find the positions of the file name fields
% read the full file as a string
str = fileread(zipFilename) ;
% if all filenames are identical, we only need to search for the first name
% in our list
idx = strfind( str , filenames{1} ) ;
%% group indices by physical file
% Each filename appears twice in the zip file:
% ex for 2 files: file1 ... file2 ... file1 ...file2
idx = reshape(idx,nFiles,2)-1 ;
%% Now modify each filename
% (replace the first character of each filename)
fid = fopen(zipFilename,'r+') ;
for k=1:nFiles
char2write = uint8('A'+(k-1)) ; % will be: A, B, C, D, etc.
fseek(fid,idx(k,1),'bof') ;
fwrite(fid,char2write,'uint8') ;
fseek(fid,idx(k,2),'bof') ;
fwrite(fid,char2write,'uint8') ;
end
fclose(fid) ;
end
function filenames = getZipFileNames(zipFilename)
% Start empty so the catch block can test it safely.
zipFile = [] ;
try
% Create a Java file of the ZIP filename.
zipJavaFile = java.io.File(zipFilename);
% Create a Java ZipFile and validate it.
zipFile = org.apache.tools.zip.ZipFile(zipJavaFile);
% Extract the entries from the ZipFile.
entries = zipFile.getEntries;
catch exception
if ~isempty(zipFile)
zipFile.close;
end
error(message('MATLAB:unzip:invalidZipFile', zipFilename));
end
% Close the zip file when the function exits, even on error.
cleanUpObject = onCleanup(@()zipFile.close);
k = 0 ;
filenames = {} ;
while entries.hasMoreElements
k=k+1;
filenames{k,1} = char(entries.nextElement.getName) ;
end
end
Be aware that this script assumes that all the files in the zip file have the same name. When it locates the positions of the file names, it only searches for the first file name found.
The subfunction getZipFileNames is lifted from parts of unzip.m, keeping only what is needed to read the file names contained in the zip file.
For testing:
I made a zip file containing 2 files:
New Text Document1.txt
New Text Document2.txt
I modified the file names inside the zip file with a hex editor, in order to have:
New Text Document1.txt
New Text Document1.txt
so both files have the same name in the archive. If I try to unzip that file, as you described, I only get one file in the output (the last file overwrites the other).
If I run differentiateFileNames(zipFilename), then unzip the file, I get 2 files in the output directory:
Aew Text Document1.txt
Bew Text Document1.txt
I know it can look a bit cryptic, but it ensures the files are differentiated. As an exercise, it wouldn't take much to extend the script to unzip the files directly, find the largest one, delete the other, then rename the remaining file to the proper original name.
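The clean-up part of that exercise might look something like the sketch below. It assumes the archive has already been patched and extracted into outDir; here the two extracted files are simulated with hypothetical contents so that the example runs without a zip file.

```matlab
% Simulate the result of unzipping the patched archive (hypothetical data).
outDir = tempname;  mkdir(outDir);
fid = fopen(fullfile(outDir, 'Aew Text Document1.txt'), 'w');
fwrite(fid, repmat('x', 1, 100));   % the file with the real data
fclose(fid);
fid = fopen(fullfile(outDir, 'Bew Text Document1.txt'), 'w');
fwrite(fid, 'x');                   % the almost empty file
fclose(fid);
originalName = 'New Text Document1.txt';

% List the extracted files and find the largest one.
listing = dir(outDir);
listing = listing(~[listing.isdir]);
[~, iMax] = max([listing.bytes]);

% Delete every file except the largest.
for k = 1:numel(listing)
    if k ~= iMax
        delete(fullfile(outDir, listing(k).name));
    end
end

% Give the survivor its original name back.
movefile(fullfile(outDir, listing(iMax).name), fullfile(outDir, originalName));
```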

Converting multiple .txt files to .mat in the same folder

I have many .txt files that contain n rows and 7 columns each delimited with whitespace. I want to convert each file to .mat file and save that in the same folder.
I tried this but it's not working:
files = dir('*.txt');
for file = files'
data=importdata(file.name);
save(file.name, 'data');
end
While this works for a single file, I want to do it programmatically since the number of .txt files I have is very large:
data=importdata('myfile.txt');
save('myfile', 'data');
Thank you for your help
This should work
files = dir('*.txt');
for idx = 1:length(files)
file_name = files(idx).name;
fprintf("Processing File %s\n",file_name);
data=importdata(file_name);
[filepath,name,ext] = fileparts(fullfile(pwd,file_name));
save([name '.mat'],'data');
end
dir returns a structure array that you need to index into, so the for loop starts at 1 and keeps going until all the elements returned by dir have been processed.
Note in the code, I've also added a section to split the file name (e.g file1.txt) in to the file name and extension. This is so we only use the name part and not the extension when creating the mat file.
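As a small illustration of that split, here is what fileparts returns for a sample path. The folder name 'data' is a made-up example.

```matlab
% Split a path into folder, base name and extension.
[folder, name, ext] = fileparts(fullfile('data', 'file1.txt'));

% Build the .mat file name from the base name only.
matName = [name '.mat'];
fprintf('folder=%s  name=%s  ext=%s  ->  %s\n', folder, name, ext, matName);
```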
@scotty3785's answer worked well, and this also worked for me in case somebody needs it:
files = dir('*.txt');
for i=1:length(files)
data=importdata(files(i).name);
save(erase(files(i).name,".txt"), 'data');
end

Loop through .fig files and group them into folders based on the file name

I have a lot of .fig files that are named like this: 20160922_01_id_32509055.fig, 20160921_02_id_53109418.fig and so on.
So I thought that I create a script that loop through all the .fig files in the folder and group(copy) them into another folder(s) based on the last number in the file name. The folder is created based on the id number. Is this possible?
I have been looking on other solutions involving looping through folders but I am totally fresh. This would make it easier for me to check the .fig files while I am learning to do other stuff in Matlab.
All is possible with MATLAB! We can use dir to get all .fig files, then use regexp to get the numeric part of each filename, and then use copyfile to copy the file to its new home. If you want to move it instead, you can use movefile.
% Define where the files are now and where you want them.
srcdir = '/my/input/directory';
outdir = '/my/output/directory';
% Find all .fig files in the source directory
figfiles = dir(fullfile(srcdir, '*.fig'));
figfiles = {figfiles.name};
for k = 1:numel(figfiles)
% Extract the last numeric part from the filename
numpart = regexp(figfiles{k}, '(?<=id_)\d+', 'match', 'once');
% Determine the folder we are going to put it in
destination = fullfile(outdir, numpart);
% Make sure the folder exists
if ~exist(destination, 'dir')
mkdir(destination)
end
% Copy the file there!
copyfile(fullfile(srcdir, figfiles{k}), destination)
end
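To see what the regular expression in the loop extracts, it can be tried on the two sample names from the question. The lookbehind (?<=id_) requires the digits to be preceded by id_ without including id_ in the match.

```matlab
% Sample file names taken from the question.
names = {'20160922_01_id_32509055.fig', '20160921_02_id_53109418.fig'};

% With a cell array input and 'once', regexp returns one match per name.
ids = regexp(names, '(?<=id_)\d+', 'match', 'once');
for k = 1:numel(names)
    fprintf('%s -> folder %s\n', names{k}, ids{k});
end
```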
Here's an example how to identify and copy the files. I'll let you do the for loop :)
>> Figs = dir('*.fig'); % I had two .fig files on my desktop
>> Basename = strsplit(Figs(1).name, '.');
>> Id = strsplit(Basename{1}, '_');
>> Id = Id{3};
>> mkdir(fullfile('./',Id));
>> copyfile(Figs(1).name, fullfile('./',Id));
Play with the commands to see what they do. It should be straightforward :)

Matlab:renaming Files in a Sequential Order

I have a number of text files whose names are not sequential:
010010.txt 010030.txt 010070.txt
How could I change the file names to:
text01.txt text02.txt ....
Is it possible to create a new directory instead of overwriting the old one?
I have used the following script. It runs, but the names go from text001.txt to text021.txt and then to text041.txt.
Any idea?
directory = 'C:\test\'; %//' Directory with txt files
filePattern = fullfile(directory, '*.txt'); %//' files pattern with absolute paths
old_filename = cellstr(ls(filePattern)) %// Get the filenames
file_ID = strrep(strrep(old_filename,'file',''),'.txt','') %// Get numbers associated with each file
file_ID_doublearr = str2double(file_ID)
file_ID_doublearr = file_ID_doublearr - min(file_ID_doublearr)+1
file_ID = strtrim(cellstr(num2str(file_ID_doublearr)))
str_zeros = arrayfun(@(t) repmat('0',1,t), 4-cellfun(@numel,file_ID),'uni',0) %// Get zeros string to be pre-appended to each filename
new_filename = strcat('file',str_zeros,file_ID,'.txt') %// Generate new filenames
cellfun(@(m1,m2) movefile(m1,m2),fullfile(directory,old_filename),fullfile(directory,new_filename)) %// Finally rename files with the absolute paths
That looks pretty complicated. I would simply make a system call to move all of the files to a new directory, then sequentially rename each file one at a time with additional system calls. It also looks like you're using Windows, so I'll provide a solution for that platform. You have the beginning right where you are reading in the files from a source directory.
directory = 'C:\test\'; %// Directory with txt files
directoryToCopyOver = 'C:\out\'; %// Directory where you want to copy the files over
%// Copy source directory to target directory
system(['xcopy ' directory ' ' directoryToCopyOver]);
filePattern = fullfile(directoryToCopyOver, '*.txt'); %//' files pattern with absolute paths
names = dir(filePattern); %// Find all files with above pattern
%// For each file we have...
for idx = 1 : numel(names)
name = names(idx).name; %// Get a name of a file
%// Rename this file to textxx.txt
outName = sprintf('text%2.2d.txt', idx);
%// Call system and rename the file (ren takes the new name without a path)
system(['ren "' directoryToCopyOver name '" "' outName '"']);
end
An important thing to note is that I use system to make system calls to your Windows command prompt. I use xcopy to copy a whole directory from one point to another. In this case, this would be your source directory over to a new target directory. After I do this, I invoke MATLAB's dir to determine all of the file names that match the particular pattern you have laid out, which is all of the text files.
Then, for each text file name we have, we read in this name, then create an output name of type textxx.txt, where xx is a number starting from 1 to as many text files as we have, and then I invoke the Windows command prompt command ren to rename the file from the original name to the new name. Also, take a look at sprintf from MATLAB. It is designed to create strings using formatting delimiters. If you see how I called it, %2.2d means that I am expecting the number to be two digits long, and should the number be less than two digits, fill the spaces with a zero. If you want to increase the amount of digits, simply add more to each place. For example, if you want to have 4 digits, do %4.4d, and so on. This will properly create the right string so that we can rename the right file in this new directory.
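The zero padding can be checked in isolation with a few sprintf calls. The %02d form shown alongside is a common alternative spelling that pads to the same width; the sample numbers are arbitrary.

```matlab
% Two digits, padded with a leading zero when needed.
a = sprintf('text%2.2d.txt', 7);

% A common alternative spelling of the same padding.
b = sprintf('text%02d.txt', 7);

% Four digits.
c = sprintf('text%4.4d.txt', 12);

fprintf('%s\n%s\n%s\n', a, b, c);
```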
Hope this helps!

importing excel into matlab

I have 4 folders in the same directory where each folder contains ~19 .xls files. I have written the code below to obtain the name of each of the folders and the name of each .xls file within the folders.
path='E:\Practice';
folder = path;
dirListing = dir(folder);
dirListing=dirListing(3:end);%first 2 are just pointers
for i=1:length(dirListing);
f{i} = fullfile(path, dirListing(i,1).name);%obtain the name of each folder
files{i}=dir(fullfile(f{i},'*.xls'));%find the .xls files
for j=1:length(files{1,i});
File_Name{1,i}{j,1}=files{1,i}(j,1).name;%find the name of each .xls file
end
end
Now I'm trying to import the data from excel into matlab by using xlsread. What I'm struggling with is knowing how to load the data into matlab within a loop where the excel files are in different directories (different folders).
This leaves me with a 1x4 cell named File_Name where each cell refers to a different folder located under 'path', and within each cell is then the name of the spreadsheets wanting to be imported. The size of the cells vary as the number of spreadsheets in each folder varies.
Any ideas?
thanks in advance
I'm not sure if I'm understanding your problem, but all you have to do is concatenate the strings that contain directory (f{}) and the file name. Modifying your code:
for i=1:length(dirListing);
f{i} = fullfile(path, dirListing(i,1).name);%obtain the name of each folder
files{i}=dir(fullfile(f{i},'*.xls'));%find the .xls files
for j=1:length(files{1,i});
File_Name{1,i}{j,1}=files{1,i}(j,1).name;%find the name of each .xls file
fullpath = [f{i} '/' File_Name{1,i}{j,1}];
disp(['Reading file: ' fullpath])
x = xlsread(fullpath);
end
end
This works on *nix systems. You may have to join the filenames with a '\' on Windows. I'll find a more elegant way and update this posting.
Edit: The command filesep gives the forward or backward slash, depending on your system. The following should give you the full path:
fullpath = [f{i} filesep File_Name{1,i}{j,1}];
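The built-in fullfile function, which the question's code already uses for the folder names, inserts the platform's separator as well. A quick comparison, using hypothetical folder and file names:

```matlab
% Manual concatenation with the platform's separator.
p1 = ['FolderA' filesep 'book.xls'];

% fullfile joins the parts with the same separator.
p2 = fullfile('FolderA', 'book.xls');

fprintf('%s\n%s\n', p1, p2);
```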
Take a look at the rdir helper function, written by a member of the MATLAB community.
It allows you to recursively search through directories to find files that match a certain pattern. This is a super handy function to use when looking to match files.
You should be able to find all your files in a single call to this function. Then you can loop through the results of the rdir function, loading the files one at a time into whatever data structure you want.
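As a possible alternative to a helper function, newer MATLAB releases (R2016b and later, as far as I know) let the built-in dir search subfolders with the '**' wildcard. The sketch below assumes such a release; the folder tree and file names are hypothetical and the files are empty, so the actual xlsread call is left out.

```matlab
% Hypothetical folder tree, created only so the example runs on its own.
root = tempname;
mkdir(fullfile(root, 'FolderA'));
mkdir(fullfile(root, 'FolderB'));
sample = {fullfile(root, 'FolderA', 'one.xls'), ...
          fullfile(root, 'FolderB', 'two.xls')};
for k = 1:numel(sample)
    fid = fopen(sample{k}, 'w');
    fclose(fid);
end

% '**' asks dir to descend into every subfolder (assumes R2016b or later).
found = dir(fullfile(root, '**', '*.xls'));

% Each entry carries its own folder, so building the full path is direct.
fullPaths = cell(numel(found), 1);
for k = 1:numel(found)
    fullPaths{k} = fullfile(found(k).folder, found(k).name);
    fprintf('Would read: %s\n', fullPaths{k});
end
```

In a real run, the body of the loop would call xlsread(fullPaths{k}) and store the result in whatever data structure suits the analysis.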