Name of files in a specific directory using matlab - matlab

I want to find the names of the files in a specific directory. I know that the dir command returns the file names, but they include the extension. Therefore, I used strfind to remove the extension as follows:
a = dir(fullfile(dataset_path, [dataset_category '\qrel']))
for i= 3: length(a)
name{i} = a(i).name(1:strfind(a(i).name, '.')-1)
end
I want a better approach without a loop. I wonder whether there is a way to use vectorization for this purpose. I tried the following code, but it returns an error:
a = dir(fullfile(dataset_path, [dataset_category '\qrel']))
name = a.name(1:strfind(a.name, '.')-1)

You can do that with regular expressions:
name = regexprep({a.name}, '\.[^\.]*$', '');
This collects all names in a cell array ({a.name}). For each string it matches a dot (\.) followed by zero or more characters other than a dot ([^\.]*) at the end of the string ($), and removes that. Thanks to @Shai for the "other than a dot" correction, which makes sure that only the final dot is matched.
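As a quick check of the pattern, here is a minimal sketch on a hypothetical list of names (the names are made up for illustration and do not come from the question's folder):

```matlab
% Hypothetical file names, for illustration only
names = {'report.txt', 'archive.tar.gz', 'README'};

% Remove the final dot and everything after it from each name
stripped = regexprep(names, '\.[^\.]*$', '');
disp(stripped)
```

Only the last extension is removed, so 'archive.tar.gz' becomes 'archive.tar', and a name without a dot is left unchanged. Note that dir also returns the '.' and '..' entries, so you may want to drop directories first, for example with a = a(~[a.isdir]).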

Related

Matlab - Help in listing files using a name-pattern

I'm trying to create a function that lists the content of a folder based on a pattern, however the listing includes more files than needed. I'll explain by an example: Consider a folder containing the files
file.dat
file.dat._
file.dat.000
file.dat.001
...
file.dat.999
I am interested only in the files that are .000, .001 and so on. The files file.dat and file.dat._ are to be excluded.
The trailing number can also be .0000, .0001 and so on, so the number of digits is not necessarily 3.
I tried using the dir command with the pattern file.dat.* - this included file.dat for some reason (why is the last dot treated differently?) and file.dat._, which was expected.
The "obvious" set of solutions is to add an additional regular expression or length check - however I would like to avoid that, if possible.
This needs to work both under UNIX and Windows (and preferably MacOS).
Any elegant solutions?
Get all filenames with dir and filter them with the regex '^file\.dat\.\d+$'. This matches:
start of the string (^)
followed by the string file.dat. (file\.dat\.)
followed by one or more digits (\d+)
and then the string must end ($)
Since the names collected from the output of dir form a cell array of char vectors, regexp returns a cell array with the starting index of the match for each char vector. Because the pattern is anchored with ^, that index can only be 1 or [], so any is applied to each cell's content to reduce it to true or false. The resulting logical index tells which filenames should be kept.
f = dir('path/to/folder');
names = {f.name};
ind = cellfun(@any, regexp(names, '^file\.dat\.\d+$'));
names = names(ind);
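To see the filter in action without touching the file system, here is a small sketch on a hypothetical list of names modelled on the question's example (the list itself is an assumption):

```matlab
% Hypothetical names modelled on the question's folder listing
names = {'file.dat', 'file.dat._', 'file.dat.000', 'file.dat.0001'};

% regexp returns 1 for a match at the start of the name and [] otherwise
starts = regexp(names, '^file\.dat\.\d+$');

% any([]) is false and any(1) is true, giving one logical per name
ind = cellfun(@any, starts);
kept = names(ind);
disp(kept)
```

Here only 'file.dat.000' and 'file.dat.0001' are kept, regardless of how many digits follow the last dot.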

How can I get a part of partfile?

This is probably a very simple question, but I am not able to find a straightforward solution.
[pathstr,name,ext] = fileparts('/xaaa/Data/Q2/CONUS/2002/PRECIPRATE.20020401.000000.tif')
Obviously, fileparts gives /xaaa/Data/Q2/CONUS/2002/
But I only want to access /xaaa/Data/Q2/CONUS/ and disregard the last section.
One way to do it is to simply count the characters, as in pathstr(1:20). But there must be a more elegant alternative.
The most robust way to get a parent folder is to use '..' to access the folder above a provided folder. This is because it is independent of whether you specify an absolute or relative path as the input.
parent = fullfile(folder, '..');
In your case, since you have a filename and you want to get the parent, you can add a 'fileparts' call to that to get the direct parent folder, then pass it to the above.
parent = fullfile(fileparts(filename), '..');
This is more robust because it allows you to specify a relative file path such as 2002/PRECIPRATE.20020401.000000.tif which could fail if you tried to call fileparts multiple times.
If you only have a filename (with no directories because you're in the folder where the file is), you can use which to get an absolute path to the file.
parent = fullfile(fileparts(which(filename)), '..');
One simple way is to repeat the use of fileparts():
>> [pathstr,name,ext] = fileparts('/xaaa/Data/Q2/CONUS/2002/PRECIPRATE.20020401.000000.tif');
>> [parent_pathstr, name, ~] = fileparts(pathstr)
parent_pathstr =
/xaaa/Data/Q2/CONUS
name =
2002
Note: using the tilde ~ just ignores the file extension for the second call to fileparts() because you don't expect an extension.
There are three answers proposed already, but I do believe there's a better solution. I would match .*(?=/.*/) pattern using regexp, like this:
>> originalPath = '/xaaa/Data/Q2/CONUS/2002/PRECIPRATE.20020401.000000.tif';
>> res = char(regexp(originalPath, '.*(?=/.*/)', 'match'))
res =
/xaaa/Data/Q2/CONUS
If you need to go n levels deeper, just keep adding .*/ for each level, e.g.
>> res = char(regexp(originalPath, '.*(?=/.*/.*/)', 'match'))
res =
/xaaa/Data/Q2
For the OS-agnostic version, or if your path contains some mixture of backslashes and forward slashes, you can use the following regex: '.*(?=[/\\].*[/\\])'. Once again, to go several levels deeper, just add an extra .*[/\\] for each level.
The benefit over using strsplit and fileparts is that you don't need to iterate anything - you get the answer with one simple regex.
Regarding .. - I myself used this solution for a long time for generating Matlab Path dynamically. However Matlab is sometimes not able to handle breakpoints correctly in the files that have .. in their path. To be exact, if you place a breakpoint in such a file, Matlab would ignore it unless there's another breakpoint that is triggered first (which is not in a file with .. in path).
It obviously handles relative paths as well.

Loop through files with specific extension

I need to open many files in a loop, with the same extension.
Example file names are: c1_p1_t_r.mat,c1_p3_t_r.mat,c1_p6_t_r.mat,c1_p7_t_r.mat,c1_p10_t_r.mat,etc.
So basically, the first and last part of the file names are the same, but something in the middle changes.
I tried with:
Ext = 'c1_*t_r*.mat';
files = dir(Ext);
but it doesn't work. Any suggestion would be greatly appreciated.
Looking at the file names you shared you should use c1*t_r.mat rather than c1*t_r*.mat
Use files = dir('*.Ext'); You need the apostrophes to pass it as a string, and the asterisk is the wildcard for file names. I think passing multiple asterisks is the problem here. Since the names are so similar, you could instead build each file name as a full string:
for ii = 1:NumberOfFiles
filename = sprintf('c1_p%d_t_r.mat',ii);
%//load file with created name
end

Store user input as wildcard

I am having some trouble with a data processing function in MATLAB. The function takes the name of the file to be processed as an input, finds the desired files, and reads in the data.
However, several of the desired files are variants, such as Data_00.dat, Data.dat, or Data_1_March.dat. Within my function, I would like to search for all files containing Data and condense them into one usable file for processing.
To solve this, I would like desiredfile to be converted into a wildcard.
Here is the statement I would like to use.
selectedfiles = dir *desiredfile*.dat % Search for file names containing desiredfile
This returns all files containing the variable name desiredfile, rather than the user input.
The only solution that I can think of is writing a separate function that manually condenses all the variants into one file before my function is run, but I am trying to keep the number of files used down and would like to avoid this.
You could concatenate strings for that, treating desiredfile as a variable.
desiredfile = input('Files: ', 's');
selectedfiles = dir(['*' desiredfile '*.dat']) % Search for file names containing desiredfile
Enclosing strings in square brackets, [string1 string2 ... stringN], concatenates them. MATLAB's dir function takes a string.
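As a minimal sketch of the concatenation, assuming the user typed Data (the value is hard-coded here for illustration, and the variable name pattern is introduced only for this example):

```matlab
% Stand-in for the user's input; in the function this would come from input(..., 's')
desiredfile = 'Data';

% Build the wildcard pattern by concatenating char vectors
pattern = ['*' desiredfile '*.dat'];
disp(pattern)

% dir accepts the pattern as a char vector; the result is empty if nothing matches
selectedfiles = dir(pattern);
```

Building the pattern in its own variable makes it easy to display and check before passing it to dir.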
I believe you can achieve that using the dir command.
dataSets = dir('/path/to/dir/containing/Data*.dat');
dataSets = {dataSets.name};
Now simply loop over them.
To quote the matlab help:
dir lists the files and folders in the MATLAB® current folder. Results appear in the order returned by the operating system.
dir name lists the files and folders that match the string name. When name is a folder, dir lists the contents of the folder. Specify name using absolute or relative path names. You can use wildcards (*).

Find and replace text file Matlab

I'm writing MATLAB code that generates an array of numbers, and each number should replace all instances of a word in a text file that already exists. The number should be in string format. I've got this far:
ita='"';
for i=1:size(z,2)
word_to_replace=input('Replace? ','s');
tik=input('Replacement? ','s');
coluna=input('Column? ');
files = dir('*.txt');
for i = 1:numel(files)
if ~files(i).isdir % make sure it is not a directory
contents = fileread(files(i).name);
fh = fopen(files(i).name,'w');
val=num2str(z(i,coluna));
word_replacement=strcat(tik,val,ita);
contents = regexprep(contents,'word_to_replace','word_replacement');
fprintf(fh,contents); % write "replaced" string to file
fclose(fh) % close out file
end
end
end
I want the code to open file #1 ('file.txt'), find and replace all instances of word_to_replace with word_replacement, and save to the same file. The number of txt files is undefined; it could be 100 or 10000.
Many thanks in advance.
The problem with your code is the following statement:
contents = regexprep(contents,'word_to_replace','word_replacement');
You are using regular expressions to find any instances of word_to_replace in your text files and changing them to word_replacement. Looking at your code, it seems that these are both variables that contain strings. I'm assuming that you want the contents of the variables instead of the actual name of the variables.
As such, simply remove the quotation marks around the second and third parameters of regexprep and this should work.
In other words, do this:
contents = regexprep(contents, word_to_replace, word_replacement);
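The difference is easy to see on an in-memory string. This is only a sketch: the sample text and the two values below are made up for illustration, and no file is read or written.

```matlab
% Hypothetical text and replacement values, for illustration only
contents = 'id="OLD" ref="OLD"';
word_to_replace = 'OLD';
word_replacement = 'NEW';

% Quoted names: searches for the literal text 'word_to_replace', so nothing changes
unchanged = regexprep(contents, 'word_to_replace', 'word_replacement');

% Variables: uses their contents, so every OLD becomes NEW
replaced = regexprep(contents, word_to_replace, word_replacement);
disp(replaced)
```

If the search text may contain characters that are special in regular expressions, such as a dot, strrep(contents, word_to_replace, word_replacement) performs a plain literal replacement instead.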