Stripping suffix from all elements of string cell array - matlab

I have a cell array containing strings, called old_array. Each element ends with the suffix '.dat'. I want to create a new cell array, called new_array, that has the same elements, but without this suffix.
I know that the following function exists:
[new_array] = arrayfun(func, old_array)
But what do I use for func? I thought about using strsplit(str, '.') and taking the first element of this array, something like:
[new_array] = arrayfun(strsplit(*, '.')[0], old_array)
But what do I place instead of the *? What's the best solution?

If every element in your array ends with .dat, why don't you simply extract all of the characters except the last 4 for each string in your cell array?
new_array = cellfun(@(x) x(1:end-4), old_array, 'UniformOutput', false);
This returns a new cell array, new_array, in which each element is the corresponding string from old_array with its last 4 characters (the .dat suffix) removed.
However, if you want to make this more robust and if you want to accommodate any file name, you can use strsplit like what you have in your post. You would have to structure it like so:
%// Use to split up the strings for each cell and store in individual cells
new_array_temp = cellfun(@(x) strsplit(x, '.'), old_array, 'UniformOutput', false);
%// Extract the first cell of each nested cell
new_array = cellfun(@(x) x{1}, new_array_temp, 'UniformOutput', false);
We need the first step so that we get back a cell array of cells. Each nested cell holds the pieces of one string, split at every period. The second command then extracts the first piece from each nested cell, which is the file name before the first period.
Here's an example that shows you how this is run, as well as the intermediate outputs:
old_array = {'Hi.dat', 'how.dat', 'are.dat', 'you.dat'};
new_array_temp = cellfun(@(x) strsplit(x, '.'), old_array, 'UniformOutput', false);
celldisp(new_array_temp);
new_array_temp{1}{1} =
Hi
new_array_temp{1}{2} =
dat
new_array_temp{2}{1} =
how
new_array_temp{2}{2} =
dat
new_array_temp{3}{1} =
are
new_array_temp{3}{2} =
dat
new_array_temp{4}{1} =
you
new_array_temp{4}{2} =
dat
new_array = cellfun(@(x) x{1}, new_array_temp, 'UniformOutput', false);
disp(new_array);
'Hi' 'how' 'are' 'you'
Minor note
Note: strsplit only works for MATLAB R2013a and up. If you want this to work with previous versions of MATLAB, use regexp. Replace the strsplit call within cellfun with this:
new_array_temp = cellfun(@(x) regexp(x, '\.', 'split'), old_array, 'UniformOutput', false);
This should basically achieve the same thing as strsplit. However, if you really, really, really, really want to use strsplit, there is an implementation on the MathWorks File Exchange: http://www.mathworks.com/matlabcentral/fileexchange/21710-string-toolkits/content/strings/strsplit.m
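As a further option, here is a minimal sketch of two other ways to drop the suffix. It assumes every element is a plain file name like the ones in the example above. regexprep and fileparts are standard MATLAB functions, but treat this as one possible approach rather than the definitive one.

```matlab
old_array = {'Hi.dat', 'how.dat', 'are.dat', 'you.dat'};

% regexprep accepts a cell array of character vectors directly.
% The pattern removes a literal ".dat" only when it ends the string.
new_array = regexprep(old_array, '\.dat$', '');

% fileparts splits each name into folder, base name and extension.
% The second output is the base name, whatever the extension is.
[~, new_array2] = cellfun(@fileparts, old_array, 'UniformOutput', false);
```

Unlike keeping only the text before the first period, both variants preserve earlier periods, so 'a.b.dat' becomes 'a.b'.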

Related

Matlab add string variable as column in table

I'm trying to use YOLOv4 in MATLAB R2022b to carry out detections on all images in a directory, and append results to a text file.
I can append just the detection results to each line, but when I try to add the filename I get this error:
You might have intended to create a one-row table with the character vector '000001.jpg' as one of its variables. To store text data in a table, use a string array or a cell array of character vectors rather than character arrays. Alternatively, create a cell array with one row, and convert that to a table using CELL2TABLE.
I understand that the filename is a string, and the values returned by YOLO are a categorical array, but I don't understand the most efficient way to deal with this.
filesDir = dir("/home/ADL-Rundle-1/img1/");
for k=1:length(filesDir)
    baseFileName=filesDir(k).name
    fullFileName = fullfile(filesDir(k).folder, baseFileName);
    if isfile(fullFileName)
        img = imread(fullFileName);
        [bboxes,scores,labels] = detect(detector,img);
        T = table(baseFileName, labels, bboxes, scores);
        writetable(T,'/home/tableDataPreTrained.txt','WriteMode','Append','WriteVariableNames',0);
    end
end
The format of results from YOLO is
And I'd like a file with
000001.jpg, 1547.3, 347.35, 355.64, 716.94, 0.99729
000001.jpg, 717.81, 370.64, 76.444, 108.92, 0.61191
000002.jpg, 1, 569.5, 246.49, 147.25,0.56831
baseFileName is a char vector.
The error message is telling you to use a cell array of char vectors:
T = table({baseFileName}, labels, bboxes, scores);
or a string array:
T = table(string(baseFileName), labels, bboxes, scores);
I would use the string array: it is the more modern MATLAB approach, and the table looks prettier when displayed. Both accomplish the same thing.
Given that labels and the other two variables have multiple rows, you need to replicate the file name that number of times:
frame = repmat(string(baseFileName), size(labels,1), 1);
T = table(frame, labels, bboxes, scores);
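A minimal sketch of that last step, runnable without the detector. The bboxes and scores values are taken from the desired output in the question; the 'person' label and the outFile path are placeholders I made up for illustration, so substitute your own detector output and file path.

```matlab
% Hypothetical detector output for one image (two detections).
baseFileName = '000001.jpg';
bboxes = [1547.3 347.35 355.64 716.94; 717.81 370.64 76.444 108.92];
scores = [0.99729; 0.61191];
labels = categorical({'person'; 'person'});   % placeholder class name

% One copy of the file name per detection, stored as a string column.
frame = repmat(string(baseFileName), size(labels, 1), 1);
T = table(frame, labels, bboxes, scores);

% Append to a throwaway file; replace outFile with your own path.
outFile = [tempname '.txt'];
writetable(T, outFile, 'WriteMode', 'Append', 'WriteVariableNames', false);
```

Every table variable must have the same number of rows, which is why the single file name is repeated once per detection before the table is built.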

Save Cell Array into 2D-Array in Matlab

I have a 1*42 cell array.
I want to save this cell array as a 311029*42 array in a .mat file.
How do I do it?
You can use the cell2mat function to do this.
You can just horizontally concatenate a comma-separated list generated from the cell array, then save your new variable like so:
newData = [data{:}];
save('your_file.mat', 'newData');
Let C be a cell array of 1x42 size. Then, run the following code to get the output array Y.
N = length(C);
L = size(C{1});
% preallocate the output matrix
Y = zeros(L(1),L(2)*N);
for n = 1:N
    Y(:,1+(n-1)*L(2):n*L(2)) = C{n};
end
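The approaches above can be compared on a small example. This sketch assumes each cell holds a column vector of the same length (311029-by-1 in the question; 4-by-1 here so it runs quickly). The variable names and the temporary file path are placeholders.

```matlab
% Hypothetical stand-in: 3 cells, each a 4-by-1 column.
C = {[1; 2; 3; 4], [5; 6; 7; 8], [9; 10; 11; 12]};

% Comma-separated list expansion: equivalent to [C{1}, C{2}, C{3}].
Y = [C{:}];

% cell2mat gives the same result for a 1-by-N cell of equally sized blocks.
Y2 = cell2mat(C);

% Save the matrix under the variable name Y.
matFile = [tempname '.mat'];
save(matFile, 'Y');
```

If the cells do not all have the same number of rows, horizontal concatenation fails, so it is worth checking the sizes first.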

cell string array manipulation

I have a cell with strings in the following format:
data = {'para1_left = numeric value';'para1_right = numeric value';
'para2_left = numeric value';'para2_right = numeric value';
........
'para100_up = numeric value';'para100_down = numeric value';
and so on...I have a few hundreds of these};
I want two cells out of this cell: one with just the parameter names, p_name, and another with just the values, p_val.
Once I have the two cells, I want to compare the p_name cell with another cell of shorter length. This new cell will have strings in the following format:
new_cell = {'para1';'para5';'para10';...'para25'};
Basically these strings miss the trailing parts: _left, _right, etc.
Then, I want to have a list of indices of p_name that contain any of the strings in new_cell, indx_match = [1;2;10;20....and so on] so that I can get the values of the matching parameter names by doing p_val{indx_match}.
I want to do the above with the minimum number of lines, probably using cellfun. I figured out how to find the indices by using strfind command, but then it creates a cell array and p_val{indx_match} doesn't work (I tried various ways using cellfun, but no success yet).
I'm not sure exactly what you want, but this should get you on the right track.
Org = {'a_l = 5'; 'a_r = 7'; 'b_l = 6'; 'b_r = 7'};
Shr = {'a'};
splt = cellfun(@(s) strsplit(s, {'=', ' '}), Org, 'uni', 0);
p_name = cellfun(@(c) c{1}, splt, 'uni', 0);
p_val = cellfun(@(c) str2num(c{2}), splt);
param = cellfun(@(c) strsplit(c, '_'), p_name, 'uni', 0);
param = cellfun(@(c) c{1}, param, 'uni', 0);
index = cellfun(@(s) strfind(param, s), Shr, 'uni', 0);
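If what you need is a plain numeric index vector rather than a cell array of strfind results, one possible sketch is to compare the parameter prefixes with ismember. It assumes the prefix is everything before the first underscore and that the values are plain numbers. I have swapped in str2double and strtok here; they are not part of the code above.

```matlab
Org = {'a_l = 5'; 'a_r = 7'; 'b_l = 6'; 'b_r = 7'};
Shr = {'a'};

% Split each entry into name and value, as in the code above.
splt   = cellfun(@(s) strsplit(s, {'=', ' '}), Org, 'UniformOutput', false);
p_name = cellfun(@(c) c{1}, splt, 'UniformOutput', false);
p_val  = cellfun(@(c) str2double(c{2}), splt);

% Keep only the part of each name before the first underscore.
param = cellfun(@(n) strtok(n, '_'), p_name, 'UniformOutput', false);

% Exact match of each prefix against the short list.
mask = ismember(param, Shr);
indx_match = find(mask);

% p_val is a numeric vector here, so index it with parentheses.
matched_vals = p_val(indx_match);
```

Because ismember tests for exact equality, 'para1' does not accidentally match 'para10', which a substring search with strfind would.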

Matlab, Convert cell to matrix

Hope some of you can help me. I have converted a PDF with a lot of text and tables to a .txt file. I did this because three values from the PDF have to be written into Excel. This has to be done more than a thousand times a month, so I thought there has to be a better way than doing it manually. The only things that have to be extracted are the date, the report number and a single volume. I found out that the date and report number are always on the same line, so that is pretty easy to extract, even though it is read into a 145x1 cell. But this brings me to my first question.
Each of the cells looks like this:
Date 23/4-2015
Repportnumber 8
How do I remove the whitespace?
I also have to extract the volume. This was more difficult, because the line number of the volume differs from one PDF to another. I therefore created a search function, which works and finds the volume, which is stored in a cell array looking like this:
[233.4 452.2 94.6]
I only need the middle number, so how do I convert this into a matrix?
Keep in mind it is a 1x1 cell, with whitespace!
Hope some of you guys can help me.
For your first question, you can remove the spaces by searching the line of characters and identifying the spaces with strcmp, then setting those elements of the character string to be empty ([]). Here is an example of the code for that:
% number of character
N = length(my_string);
% character to remove (initialize all 0)
icut = zeros(1,N);
% check each character
for i = 1:N
    % if character is a space, tag for removal
    if strcmp(my_string(i),' ')
        icut(i) = 1;
    end
end
% remove space characters
my_string(icut == 1) = [];
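The same removal can also be written without the loop. This is a minimal sketch using logical indexing; the sample string is made up from the two lines shown in the question.

```matlab
my_string = 'Date 23/4-2015 Repportnumber 8';

% Logical indexing: delete every character equal to a space.
no_spaces = my_string;
no_spaces(no_spaces == ' ') = [];

% isspace also catches tabs and newlines, not only the space character.
no_whitespace = my_string(~isspace(my_string));
```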
For your second question, you can convert the contents of the cell to a numeric array then simply take the 2nd element.
% convert the cell contents to an array of numbers
cell_array = str2num(my_cell{1});
% get the middle value
middle_value = cell_array(2);
This assumes the cell contains the array of values as a string, as in my_cell = {'[233.4 452.2 94.6]'};.
You can remove the whitespace from a string using strrep. This works on cells containing strings or on char arrays and returns the same object type that it was applied to. If you pass in a cell to strrep it will return a cell, if you pass in a char array it will return a char array.
>> C = {'Date 23/4-2015 Repportnumber 8'};
>> strrep(C, ' ', '') % Cell containing string (char array)
ans =
'Date23/4-2015Repportnumber8'
>> strrep(C{1}, ' ', '') % String (char array)
ans =
Date23/4-2015Repportnumber8
To convert the volume cell array to a matrix you can use str2num. Then you can use linear indexing to extract the middle value.
>> C = {'[233.4 452.2 94.6]'};
>> C = str2num(C{1});
>> C(2)
ans =
452.2000
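Since str2num evaluates its input with eval, you may prefer not to run it on text scraped from a file. A possible alternative, sketched here under the assumption that the values are non-negative decimals without exponents, is to pick the numbers out with regexp and convert them with str2double.

```matlab
my_cell = {'[233.4 452.2 94.6]'};

% Pull out every run of digits with an optional decimal part.
tokens = regexp(my_cell{1}, '\d+\.?\d*', 'match');

% str2double converts a cell array of character vectors element-wise.
values = str2double(tokens);
middle_value = values(2);
```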

vectorizing a script with cellfun

I'm aiming to import data from various folders and text files into MATLAB.
clear all
main_folder = 'E:\data';
%Directory of data
TopFolder = dir(main_folder);
%exclude the first two cells as they are just pointers.
TopFolder = TopFolder(3:end);
TopFolder = struct2cell(TopFolder);
Name1 = TopFolder(1,:);
%obtain the name of each folder
dirListing = cellfun(@(x)dir(fullfile(main_folder,x,'*.txt')),Name1,'un',0);
Variables = cellfun(@(x)struct2cell(x),dirListing,'un',0);
FilesToRead = cellfun(@(x)x(1,:),Variables,'un',0);
%obtain the name of each text file in each folder
This provides the name for each text file in each folder within 'main_folder'. I am now trying to load the data without using a for loop (I realise that for loops are sometimes faster in doing this but I'm aiming for a compact script).
The method I would use with a for loop would be:
for k = 1:length(FilesToRead)
    filename{k} = cellfun(@(x)fullfile(main_folder,Name1{k},x),FilesToRead{k},'un',0);
    fid{k} = cellfun(@(x)fopen(x),filename{k},'un',0);
    C{k} = cellfun(@(x)textscan(x,'%f'),fid{k},'un',0);
end
Is there a method which would involve not using loops at all? something like cellfun within cellfun maybe?
folder = 'E:\data';
files = dir(fullfile(folder, '*.txt'));
full_names = strcat(folder, filesep, {files.name});
fids = cellfun(@(x) fopen(x, 'r'), full_names);
c = arrayfun(@(x) textscan(x, '%f'), fids); % load data here
res = arrayfun(@(x) fclose(x), fids);
assert(all(res == 0), 'error in closing files');
If the data is in CSV format, it can be even easier:
folder = 'E:\data';
files = dir(fullfile(folder, '*.txt'));
full_names = strcat(folder, filesep, {files.name});
c = cellfun(@(x) csvread(x), full_names, 'UniformOutput', false);
Now all the data is stored in c.
Yes. This is going to be pretty scary, since C depends on fid, which depends on filename. The basic idea will be:
deal(feval(@(filenames_fids){filenames_fids{1}, filenames_fids{2}, ...
<compute C>}, feval(@(filenames){filenames, <compute fid>}, ...
<compute filenames>)));
Let's start with computing the filenames:
arrayfun(@(k)cellfun(@(x)fullfile(main_folder,Name1{k},x),FilesToRead{k},...
'un',0), 1:length(FilesToRead), 'uniformoutput', 0);
This will give us a 1-by-K cell array of filenames. Now we can use that to compute fids:
{filenames, arrayfun(@(k)cellfun(@(x)fopen(x),filenames{k},'un',0), ...
1:length(FilesToRead), 'uniformoutput', 0)};
We stick fids together with filenames in a two-element cell array, ready to pass on to compute our final outputs:
{filenames_fids{1}, filenames_fids{2}, ...
arrayfun(@(k)cellfun(@(x)textscan(x,'%f'), ...
filenames_fids{2}{k},'un',0), 1:length(FilesToRead), 'uniformoutput', 0)}
Then we're putting that final cell array into deal, so that the results end up in three different variables.
[filenames fid C] = deal(feval(@(filenames_fids){filenames_fids{1}, ...
filenames_fids{2}, arrayfun(@(k)cellfun(@(x)textscan(x,'%f'), ...
filenames_fids{2}{k},'un',0), 1:length(FilesToRead), 'uniformoutput', 0)}, ...
feval(@(filenames){filenames, arrayfun(@(k)cellfun(@(x)fopen(x), ...
filenames{k},'un',0), 1:length(FilesToRead), 'uniformoutput', 0)}, ...
arrayfun(@(k)cellfun(@(x)fullfile(main_folder,Name1{k},x),FilesToRead{k}, ...
'un',0), 1:length(FilesToRead), 'uniformoutput', 0))));
Errm... There's probably a nicer way to do this if you don't mind about keeping filenames and fid. Maybe using cellfun instead of arrayfun could also make it more concise, but I'm not very good with cellfuns, so this is what I came up with. I think the for loop version is more compact anyway! (also, I haven't actually tested this. It will probably need some debugging).
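If keeping the file handles is not a requirement, a much shorter loop-free variant may be possible. The sketch below is a different technique from the nested feval approach: it assumes a release where dir accepts a * wildcard in the folder part of the path (R2016b or later) and where readmatrix and writematrix exist (R2019a or later). The folder and file names are made up so that the example runs on its own.

```matlab
% Build a small throwaway directory tree so the sketch runs anywhere.
main_folder = tempname;
mkdir(fullfile(main_folder, 'run1'));
mkdir(fullfile(main_folder, 'run2'));
writematrix([1; 2; 3], fullfile(main_folder, 'run1', 'a.txt'));
writematrix([4; 5],    fullfile(main_folder, 'run2', 'b.txt'));

% List every .txt file exactly one folder level below main_folder.
listing = dir(fullfile(main_folder, '*', '*.txt'));

% Full path of each file, built from the folder and name fields.
paths = fullfile({listing.folder}, {listing.name});

% Read each file into its own cell; no fopen/fclose bookkeeping needed.
C = cellfun(@readmatrix, paths, 'UniformOutput', false);

% Remove the temporary tree again.
rmdir(main_folder, 's');
```

The result is a flat cell array with one entry per file rather than one nested cell per folder, so group by listing.folder afterwards if the folder structure matters.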