Delete File Paths Using Pattern - matlab

I have a 3x1 string array like this:
str = 3x1 string array
"C:\Temp\MyReport.docx"
"C:\Data\Experiment1\Trial1\Sample1.csv"
"C:\Temp\Slides.pptx"
I am trying to delete the file extensions such that the only thing remaining is the path:
str = 3x1 string
"C:\Temp\MyReport\"
"C:\Data\Experiment1\Trial1\"
"C:\Temp\Slides.pptx\"
I tried using this code:
match = "/" + wildcardPattern
new_str = erase(str, match);
together with erase to remove only the extension, but I get the error "unrecognized function 'wilcardpattern'".
Is there a better way to do this, or fix the error?
Thanks

The fileparts and fullfile functions are meant for this kind of work.
If this is your starting point:
s = [ ...
"C:\Temp\MyReport.docx"
"C:\Data\Experiment1\Trial1\Sample1.csv"
"C:\Temp\Slides.pptx"
];
Then you can compute your desired output, per entry, like this:
ix = 1;
[strPath, strName, strExt] = fileparts( s(ix) );
x = fullfile( strPath, strName)
Now x = "C:\Temp\MyReport".
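To handle every entry rather than a single index, the same two calls can be placed in a loop. This is only a minimal sketch, assuming Windows-style paths processed on Windows (where fileparts splits on the backslash); the variable name out is made up for illustration.

```matlab
% Starting point from the answer above
s = [ ...
    "C:\Temp\MyReport.docx"
    "C:\Data\Experiment1\Trial1\Sample1.csv"
    "C:\Temp\Slides.pptx"
    ];

% Preallocate a string array of the same size (hypothetical name "out")
out = strings(size(s));
for ix = 1:numel(s)
    % Split each entry, then rebuild it without the extension
    [strPath, strName] = fileparts(s(ix));
    out(ix) = fullfile(strPath, strName);
end
disp(out)
```

If only the folder is wanted, strPath on its own is the part to keep.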

Related

Extract information from path name

I want to make a script in MATLAB that saves my output data with a certain name. All information for this name is in the path from the input data, like it is shown here:
path = 'C:\projektions100\algorithm1\method_A\data1';
projection =
algorithm =
method =
data =
The script should then extract the text in the path that goes with each keyword (e.g. method) between the adjacent backslashes, so that the script is more flexible in case I made a spelling mistake in some folder names.
This is what I found to extract a text between a start and a end point but I cannot simply use the backslashes since there are a few of them in the path.
How should I proceed?
You can simply use a regexp with named tokens:
>> path = 'C:\projektions100\algorithm1\method_A\data1';
>> all=regexp(path,'[^\\]+\\proje[ck]tion(?<projection>[^\\]+)\\algorithm(?<algorithm>[^\\]+)\\method(?<method>[^\\]+)\\data(?<data>.+$)','names')
all =
struct with fields:
projection: 's100'
algorithm: '1'
method: '_A'
data: '1'
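The fields of the returned struct can then be read directly. The following is a reduced sketch of the same named-token idea, using only two of the keywords; the variable names p2fldr, expr and tok are chosen here so that the built-in functions path and all are not shadowed.

```matlab
% Same example path as above, stored under a name that does not shadow "path"
p2fldr = 'C:\projektions100\algorithm1\method_A\data1';

% Each named token captures everything up to the next backslash
expr = 'algorithm(?<algorithm>[^\\]+)\\method(?<method>[^\\]+)';
tok = regexp(p2fldr, expr, 'names');

algorithmTag = tok.algorithm;   % '1'
methodTag = tok.method;         % '_A'
```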
The problem is how to find the end of your keywords. Here is a bit of code that loops through the keywords and looks for them in the path (stored in p2fldr, because path is a MATLAB function that returns the search path, and you shadow it if you define a variable with that name).
p2fldr = 'C:\projektions100\algorithm1\method_A\data1';
% keywords
kyWrd = {'projection','algorithm','method','data'};
Tag = cell(size(kyWrd));
for i = 1:length(kyWrd)
    % get keyword
    ky = kyWrd{i};
    % look for it in the path
    idx = strfind(p2fldr,ky);
    if ~isempty(idx)
        % remaining path
        idx_offset = idx+strlength(ky);
        prm = p2fldr(idx_offset:end);
        % look for file separator '\'
        idx_tmp = strfind(prm,filesep);
        % if you don't find one, it is probably the last entry, so take the
        % length
        if isempty(idx_tmp)
            idx_tmp = length(prm)+1;
        end
        % this is the index where it ends
        idx2 = idx_tmp(1)-1;
        % assign to tag-cell
        Tag{i} = prm(1:idx2);
    end
end
You can take a shortcut if you know that the keywords are always in the last 4 entries of your path: use strsplit right away and index the last returned cells.
str_splt = strsplit(p2fldr,filesep);
Tag = cell(size(kyWrd));
for i = 1:length(kyWrd)
    % index cells
    str = str_splt{end-length(kyWrd)+i};
    % get keyword
    ky = kyWrd{i};
    Tag{i} = str(length(ky)+1:end);
end
Note that this does not check whether the folder name actually matches your keyword (e.g. your path says 'projektions' but I defined the keyword to be 'projection').

concatenation of the arrayfun output

Assuming that
outputTemp =
2×1 cell array
{122×1 string}
{220×1 string}
finalOutput is a string array (342x1 string).
Is there any way to do the following
outputTemp = arrayfun(@(x)someFunc(x), someInput, 'UniformOutput', false)';
finalOutput= [outputTemp{1}; outputTemp{2}];
in one line?
For a minimal example, someFunc can be a function that returns the names of the files in the folders given in someInput.
Short answer: yes. Here is a MWE:
str1 = ["Test";"Test1";"42"]
str2 = ["new test";"pi = 3"]
C = {str1;str2}
ConCatStr = [C{1};C{2}];
This should answer the question regarding concatenation of string arrays. Note that this is only possible with real strings (not with char arrays). It is hard to tell what you are doing beforehand, as there are no details about getFilesFilt() and mainFolderCUBX.
EDIT MVE for the updated question
% function that returns a matrix
fnc = @(x)[x,1];
% anonymous function that returns a vector
fnc2 = @(x)reshape(fnc(x),2,1)
tmp = arrayfun(@(x)fnc(x), rand(10,1),'UniformOutput',false)
Answer: there is no proper way. However, you can do a little bit of fiddling and force everything into a single line (making the code ugly and less efficient):
tmp = arrayfun(@(x)fnc(x), rand(10,1),'UniformOutput',false);
out = reshape(cell2mat(tmp),numel(cell2mat(tmp)),1);
To get a single line, just replace tmp in the second statement with the expression assigned to it.
You can try the following code using cat() + subsref(), i.e.,
finalOutput= cat(1,subsref(arrayfun(@(x)someFunc(x), someInput, 'UniformOutput', false),struct('type', '{}', 'subs', {{:}})));
Example
S(1).f1 = rand(3,5);
S(2).f1 = rand(6,10);
S(3).f1 = rand(4,2);
cat(1,subsref(arrayfun(@(x) mean(x.f1)',S,'UniformOutput',false),struct('type', '{}', 'subs', {{:}})))
such that
ans =
0.89762
0.53776
0.42440
0.25272
0.58197
0.34503
0.40259
0.41792
0.43527
0.53974
0.49976
0.63342
0.36539
0.58541
0.57042
0.60914
0.60851
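Another option, if one extra definition is acceptable, is a small anonymous helper that expands the cell array into vertcat. This is only a sketch: someFunc and someInput below are hypothetical stand-ins for the ones in the question, and stack is a name invented here.

```matlab
% Hypothetical stand-ins: someFunc returns an n-by-1 string array for input n
someFunc = @(n) repmat("item" + n, n, 1);
someInput = [2 3];

% Helper (made-up name) that stacks all cell contents vertically
stack = @(c) vertcat(c{:});

% The arrayfun call and the concatenation now fit in one statement
finalOutput = stack(arrayfun(@(x) someFunc(x), someInput, 'UniformOutput', false));
```

Unlike indexing outputTemp{1} and outputTemp{2} by hand, this works for any number of cells, as long as their contents can be concatenated vertically.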

What is wrong with my loop code?

I wrote a loop code to extract my table files names into a string or an array and meanwhile collect the data into arrays. But I found my code went wrong as only one file was read and repeated in the loop over and over again. I had no idea where my code is wrong and it already took me several hours to look for the problem. Could somebody help me?
DataCircle = dir('*-circle.xls');
MeanAreaCircle = [];
ColonyNumCircle = [];
PlateNameCircle = [];
for zz = 1:numel(DataCircle)
basefilenamedata1 = DataCircle(w).name; % generate the base name
DataName1 = regexprep(basefilenamedata1,'-circle.xls',''); %replace part of the name and the extension
PlateNameCircle = [PlateNameCircle DataName1]; % collect the file name into a string
T1 = readtable(basefilenamedata1); % read data in
MeanAreaCircle = [MeanAreaCircle mean(T1.area)]; % collect the mean for area
end
What I got is like this, which is wrong:
>> PlateNameCircle
PlateNameCircle ='IMG_0813IMG_0813IMG_0813IMG_0813IMG_0813IMG_0813IMG_0813IMG_0813IMG_0813IMG_0813'
>> MeanAreaCircle
MeanAreaCircle =
1.0e+03 *
6.4152 6.4152 6.4152 6.4152 6.4152 6.4152 6.4152 6.4152 6.4152 6.4152
My input file list:
IMG_0809-CC.xls
IMG_0809-circle.xls
IMG_0810-CC.xls
IMG_0810-circle.xls
IMG_0812-CC.xls
IMG_0812-circle.xls
IMG_0813-CC.xls
IMG_0813-circle.xls
What I want is a column or a character array or a string like this:
PlateNameCircle = 'IMG_0809' 'IMG_0810' 'IMG_0811' 'IMG_0812' 'IMG_0813'
Issues
1: you're using w to index the name, instead of zz, the loop variable. Apparently, there is a stray variable called w in your workspace, equal to 8. That's why you're always reading the same file, regardless of iteration number.
2: you're not adding a space in the name:
PlateNameCircle = [PlateNameCircle ' ' DataName1];
3: you're adding the same name on each two consecutive iterations:
PlateNameCircle = 'IMG_0809-CC IMG_0809 IMG_0810-CC IMG_0810 ...'
Improvements
vectorize and use cell strings
preallocate instead of grow-on-the-fly
give your variables even-better names (although you already did a pretty good job there already, tbh)
something like:
filenames = {DataCircle.name};
PlateNameCircle = regexprep(filenames,'-circle.xls',''); %...doubt this is actually what you want, but it *is* what you've written...
MeanAreaCircle = zeros(numel(filenames),1);
ColonyNumCircle = []; % <- not used?
for zz = 1:numel(filenames)
    T1 = readtable(filenames{zz}); % read data in
    MeanAreaCircle(zz) = mean(T1.area); % collect the mean for area
end
I think the w in
basefilenamedata1 = DataCircle(w).name;
should be zz instead. Otherwise you read the same file, DataCircle(w), on every iteration, whatever value w happens to hold.
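One detail about the regexprep pattern: an unescaped dot matches any character, so escaping it and anchoring the pattern at the end of the name is a little stricter. A minimal sketch, where the hard-coded names stand in for {DataCircle.name}:

```matlab
% Hypothetical file names standing in for {DataCircle.name}
names = {'IMG_0809-circle.xls', 'IMG_0810-circle.xls'};

% '\.' matches a literal dot, '$' anchors the match at the end of the name
plateNames = regexprep(names, '-circle\.xls$', '');
% plateNames is a cell array: {'IMG_0809', 'IMG_0810'}
```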

MATLAB - create a list for multiple files

I don't know if the title is appropriate, but I need to import several files, e.g. 25 (files like info.asd, ina.asd, sdd.asd etc). In my opinion it should be possible to import them via a for loop instead of hardcoding the operation. Any ideas how to implement the list in MATLAB, so the software knows what to import?
You can do it without a loop with this function. sPath is the path containing your files and sExt is the extension of the files you want to list.
function cList = fileList(sPath, sExt)
    if nargin == 1
        sExt = '.asd';
    end
    % List files in the given path
    stDir = dir(sPath);
    tDir = struct2table(stDir);
    tFile = tDir(~tDir.isdir, :);
    % Keep only file with the right extension
    cList = tFile.name;
    [~, cList, cExt] = cellfun(@fileparts , ...
        cList , ...
        'UniformOutput', false);
    vIsIni = cellfun(@(x) strcmpi(x, sExt), cExt);
    cList = cList(vIsIni);
end
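For comparison, dir also accepts a wildcard, which gives the list directly and can then drive the for loop the question asks about. This is a sketch only: the demo folder and its files are created here purely for illustration, and the actual import call is left as a placeholder because the format of the .asd files is not known.

```matlab
% Build a throwaway demo folder with two .asd files and one other file
demoDir = tempname;
mkdir(demoDir);
fclose(fopen(fullfile(demoDir, 'info.asd'), 'w'));
fclose(fopen(fullfile(demoDir, 'ina.asd'), 'w'));
fclose(fopen(fullfile(demoDir, 'notes.txt'), 'w'));

% List only the entries matching the wildcard, skipping folders
listing = dir(fullfile(demoDir, '*.asd'));
listing = listing(~[listing.isdir]);
fileNames = {listing.name};

% Loop over the list; replace the fprintf with the real import call
for k = 1:numel(fileNames)
    fullName = fullfile(demoDir, fileNames{k});
    fprintf('Would import %s\n', fullName);
end
```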

Concatenate strings of digits in matlab

Suppose I have a series of strings such as:
a = '101010101010'
b = '010101'
c = '000101010'
is there a way in Matlab to concatenate them and produce the binary number 101010101010010101000101010?
Use the concatenation operator [ ], with horizontal concatenation , (vertical concatenation ; will fail here unless you reshape() into column vectors):
[a,b,c]
However, I suggest storing your variables in a cell array:
s = {'101010101010','010101', '000101010'};
[s{:}]
or
cat(2,s{:})
To concatenate strings, you could say:
out = [a b c];
Alternatively:
out = strcat(a,b,c);
Yet another way:
out = sprintf('%s', a,b,c);
I think that this should work:
res = [a,b,c]
or alternatively call
res = strcat(a,b,c)
or, yet
res = cat(2,a,b,c)
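All of these return a character vector of digits, not a number. If a numeric value is needed, bin2dec can convert the result; a short sketch (the variable names bits, value and roundTrip are made up here), which relies on the 27 digits being well within the range bin2dec can represent exactly:

```matlab
a = '101010101010';
b = '010101';
c = '000101010';

% Horizontal concatenation gives one character vector of binary digits
bits = [a, b, c];

% Interpret the digits as a binary number (returns a double)
value = bin2dec(bits);

% Converting back reproduces the digit string, since it starts with '1'
roundTrip = dec2bin(value);
```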