Matlab, Convert cell to matrix - matlab

Hope some of you can help me. I have converted a PDF with a lot of text and tables to a .txt file. I did this because three values from the PDF have to be written into Excel. This has to be done more than a thousand times a month, so I thought there has to be a better way than doing it manually. The only things that have to be extracted are the date, the report number and a single volume. I found out that the date and report number are always on the same line, so that is pretty easy to extract, even though it is read into a 145x1 cell. But this brings me to my first question.
Each of the cells looks like this:
Date 23/4-2015
Repportnumber 8
How do I remove the whitespace?
I also have to extract the volume. This was more difficult, because the line number of the volume differs from one PDF to another. I therefore wrote a search function, which works and finds the volume. The result is a cell array looking like this:
[233.4 452.2 94.6]
I only need the middle number, so how do I convert this into a matrix?
Keep in mind it is a 1x1 cell, with whitespace!
Hope some of you guys can help me.

For your first question, you can remove the spaces by searching the line of characters and identifying the spaces with strcmp, then setting those elements of the character string to be empty ([]). Here is an example of the code for that:
% number of character
N = length(my_string);
% character to remove (initialize all 0)
icut = zeros(1,N);
% check each character
for i = 1:N;
% if character is a space, tag for removal
if strcmp(my_string(i),' ');
icut(i) = 1;
end
end
% remove space characters
my_string(icut == 1) = [];
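The same removal can also be done without a loop, using a logical mask. This is a minimal sketch, assuming my_string is a char row vector; the sample value is taken from the question.

```matlab
% Example input, taken from the question
my_string = 'Date 23/4-2015 Repportnumber 8';
% Logical mask that is true wherever the character is a space
is_space = (my_string == ' ');
% Delete all flagged characters in one step
my_string(is_space) = [];
disp(my_string)
```

If tabs or other whitespace may also occur, isspace(my_string) can be used as the mask instead.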
For your second question, you can convert the contents of the cell to a numeric array then simply take the 2nd element.
% convert the cell contents to an array of numbers
cell_array = str2num(my_cell{1});
% get the middle value
middle_value = cell_array(2);
This assumes the cell contains the array of values as a string, as in my_cell = {'[233.4 452.2 94.6]'};.

You can remove the whitespace from a string using strrep. This works on cells containing strings or on char arrays and returns the same object type that it was applied to. If you pass in a cell to strrep it will return a cell, if you pass in a char array it will return a char array.
>> C = {'Date 23/4-2015 Repportnumber 8'};
>> strrep(C, ' ', '') % Cell containing string (char array)
ans =
'Date23/4-2015Repportnumber8'
>> strrep(C{1}, ' ', '') % String (char array)
ans =
Date23/4-2015Repportnumber8
To convert the volume cell array to a matrix you can use str2num. Then you can use linear indexing to extract the middle value.
>> C = {'[233.4 452.2 94.6]'};
>> C = str2num(C{1});
>> C(2)
ans =
452.2000
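As an alternative to str2num, which evaluates the string as MATLAB code, the numbers can be pulled out with regexp and converted with str2double. This is only a sketch; it assumes the values are plain non-negative decimals, as in the example.

```matlab
% Cell as described in the question
C = {'[233.4 452.2 94.6]'};
% Grab every run of digits and dots as a separate token
tokens = regexp(C{1}, '[\d.]+', 'match');
% Convert the cell array of tokens to a numeric row vector
values = str2double(tokens);
% The middle number is the second element
middle_value = values(2);
```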

Related

Matlab add string variable as column in table

I'm trying to use YOLOv4 in MATLAB R2022b to carry out detections on all images in a directory, and append results to a text file.
I can append just the detection results to each line, but when I try to add the filename I get this error:
You might have intended to create a one-row table with the character vector '000001.jpg' as one of its variables. To store text data in a table, use a string array or a cell array of character vectors rather than character arrays. Alternatively, create a cell array with one row, and convert that to a table using CELL2TABLE.
I understand that the filename is a string, and the values returned by YOLO are a categorical array, but I don't understand the most efficient way to deal with this.
filesDir = dir("/home/ADL-Rundle-1/img1/");
for k=1:length(filesDir)
baseFileName=filesDir(k).name
fullFileName = fullfile(filesDir(k).folder, baseFileName);
if isfile(fullFileName)
img = imread(fullFileName);
[bboxes,scores,labels] = detect(detector,img);
T = table(baseFileName, labels, bboxes, scores);
writetable(T,'/home/tableDataPreTrained.txt','WriteMode','Append','WriteVariableNames',0);
end
end
The format of results from YOLO is
And I'd like a file with
000001.jpg, 1547.3, 347.35, 355.64, 716.94, 0.99729
000001.jpg, 717.81, 370.64, 76.444, 108.92, 0.61191
000002.jpg, 1, 569.5, 246.49, 147.25,0.56831
baseFileName is a char vector.
The error message is telling you to use a cell array of char vectors:
T = table({baseFileName}, labels, bboxes, scores);
or a string array:
T = table(string(baseFileName), labels, bboxes, scores);
I would use the string array, it's the more modern MATLAB, and the table looks prettier when displayed. But both accomplish the same thing.
Given that labels and the other two variables have multiple rows, you need to replicate the file name that number of times:
frame = repmat(string(baseFileName), size(labels,1), 1);
T = table(frame, labels, bboxes, scores);
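To try this without a detector, the sketch below uses made-up stand-ins for one image's results. The 'person' label and the temporary output file are assumptions for illustration; the numeric values are copied from the desired output in the question.

```matlab
% Hypothetical stand-ins for the detector output of one image
baseFileName = '000001.jpg';
bboxes = [1547.3 347.35 355.64 716.94; 717.81 370.64 76.444 108.92];
scores = [0.99729; 0.61191];
labels = categorical({'person'; 'person'});   % assumed class name

% One copy of the file name per detection, so all variables have equal height
frame = repmat(string(baseFileName), size(labels,1), 1);
T = table(frame, labels, bboxes, scores);

% Append to a text file, as in the question (temporary file used here)
outFile = [tempname '.txt'];
writetable(T, outFile, 'WriteMode','Append', 'WriteVariableNames',false);
```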

Performing find and replace functions on elements of a table in Matlab

I am working with a 400x1200 imported table (readtable generated from an .xls) which contains strings, doubles, dates, and NaNs. Each column is typed consistently. I am looking for a way to locate all instances in the table of any given string ('Help me please') and replace them all with a double (1). Doing this in Matlab will save me loads of work making changes to the approach used on the rest of this project.
Unfortunately, all of the options I've looked at (regexp, strrep, etc) can only take a string as a replacement. Strfind was similarly unhelpful, because of the typing across the table. The lack of cellfun has also made this harder than it should be. I know the solution should have something to do with finding the indices of the strings I want and then just looping DataFile{subscript} = [1], but I can't find a way to do it.
First you should convert your table to a cell array (for example with table2cell).
Then, you can use strrep along with str2num, e.g.
% For a given cell element
yourCellIndexVariable = strrep(yourCellIndexVariable, 'Help me please', '1');
yourCellIndexVariable = str2num(yourCellIndexVariable);
This will replace the string "Help me please" with the string "1" (the strrep function) and the str2num will change the cell index to the double value according to the string.
By yourCellIndexVariable I mean an element from the cell array. There are several ways to get all cells from a cell array, but I think that you have solved that part already.
What you can do is as follows:
C = table2cell(MyTable); % Convert your table (here called MyTable) to a cell array
YourString = 'Help me please'; % Create your string
TrueString = strcmp(C, YourString); % Compares every cell with the string
TrueString now contains logicals, 1 where the string 'Help me please' is located, and 0 where it is not.
If you have a table containing multiple classes it might be handy to switch to cells though.
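Once the data is in a cell array, the matching cells can be overwritten with a double directly. This is a small sketch on a made-up 2x2 cell array, not the asker's data; the type check via ischar is an extra safety step.

```matlab
% Hypothetical mixed-type cell array
C = {'Help me please', 3.5; 2, 'something else'};
target = 'Help me please';
% True only for cells that hold text equal to the target string
mask = cellfun(@(x) ischar(x) && strcmp(x, target), C);
% Replace every matching cell with the double 1
C(mask) = {1};
```

Afterwards cell2table can turn the result back into a table, provided each column ends up with a consistent type.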
Thank you very much everyone for helping think through to a solution. Here's what I ended up with:
% Reads data
[~, ~, raw] = xlsread ( 'MyTable.xlsx');
MyTable = raw;
% Makes a backup of the data in table form
MyTableBackup = readtable( 'MyTable.xlsx' );
% Begin by ditching 1st row with variable names
MyTable(1,:) = [];
% wizard magic - find all cells with strings
StringIndex = cellfun('isclass', MyTable, 'char');
% strrep goes here to recode bad strings. For example:
MyTable(StringIndex) = strrep(MyTable(StringIndex), 'PlzHelpMe', '1');
% Eventually, we are done, so convert back to table
MyTable = cell2table(MyTable);
% Uses backup Table to add variable names
% (the readtable above means the bad characters in variable names are already escaped!)
MyTable.Properties.VariableNames = MyTableBackup.Properties.VariableNames;
This means the new values exist as strings ('1', not 1 as a double), so now I just str2double when I access them for analysis. My takeaway - Matlab is for numbers. Thanks again all!

Matlab Clipboard Precision - Format long -

I need to copy and paste several matrices from MATLAB to Excel, so I did some research and found a really useful script called num2clip that copies the selected array to the clipboard.
The only problem is that the number format is short, when I would like it to be long.
I suspect the "double" type used in the script, but I'm still new to MATLAB, so there are some important gaps in my knowledge.
Here is the script that I found. What do I have to change in order to keep the long format?
function arraystring = num2clip(array)
%NUM2CLIP copies a numerical-array to the clipboard
%
% ARRAYSTRING = NUM2CLIP(ARRAY)
%
% Copies the numerical array ARRAY to the clipboard as a tab-separated
% string. This format is suitable for direct pasting to Excel and other
% programs.
%
% The tab-separated result is returned as ARRAYSTRING. This
% functionality has been included for completeness.
%
%Author: Grigor Browning
%Last update: 02-Sept-2005
%convert the numerical array to a string array
%note that num2str pads the output array with space characters to account
%for differing numbers of digits in each index entry
arraystring = num2str(array);
%add a carrige return to the end of each row
arraystring(:,end+1) = char(10);
%reshape the array to a single line
%note that the reshape function reshape is column based so to reshape by
%rows one must use the inverse of the matrix
%reshape the array to a single line
arraystring = reshape(arraystring',1,prod(size(arraystring)));
%create a copy of arraystring shifted right by one space character
arraystringshift = [' ',arraystring];
%add a space to the end of arraystring to make it the same length as
%arraystringshift
arraystring = [arraystring,' '];
%now remove the additional space charaters - keeping a single space
%charater after each 'numerical' entry
arraystring = arraystring((double(arraystring)~=32 |...
double(arraystringshift)~=32) &...
~(double(arraystringshift==10) &...
double(arraystring)==32) );
%convert the space characters to tab characters
arraystring(double(arraystring)==32) = char(9);
format long e
%copy the result to the clipboard ready for pasting
clipboard('copy',arraystring);
Best regards.
Just replace the line:
arraystring = num2str(array) ;
with a line like this:
arraystring = num2str(array,'%15.15f') ;
This prints 15 digits after the decimal point, which is about the precision a double can hold (roughly 15 to 17 significant decimal digits).
Look at the num2str documentation for more custom format.
Thank you Hoki for your help.
I did not have the time to go through all the documentation, which is great by the way.
When I tried your solution, the copied data was all inserted into one cell. I just had to change:
arraystring = num2str(array,'%15.15f') ;
to
arraystring = num2str(array,15) ;
Have a nice day !
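Another way to control both the precision and the separators is to build the tab-separated string with sprintf. This is a sketch rather than a change to num2clip itself; the sample matrix and the choice of '%.15g' are assumptions.

```matlab
% Sample matrix to copy
array = [pi, exp(1); 1/3, 2/3];
% One '%.15g' per column, tabs between columns, newline at the end of a row
nCols = size(array, 2);
fmt = [repmat('%.15g\t', 1, nCols-1), '%.15g\n'];
% sprintf consumes elements column by column, so pass the transpose
arraystring = sprintf(fmt, array.');
% Uncomment on a machine with a clipboard:
% clipboard('copy', arraystring);
```

Because every value is followed by an explicit tab or newline, adjacent numbers cannot run together in the pasted result.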

Variable labels in MATLAB

I have a huge table data = {1000 x 1000} of binary data.
The table's variable names are encoded, e.g. D1, D2, ..., DA2, DA3, ..., with their real labels given in a .txt file.
The .txt file also contains some other text, e.g.:
D1: Age
Mean age: 33
Median :
.
.
.
D2: weight
I would just like to pick out these names from the text file and create a table with the real variable names.
Any suggestions?
If there is a specific number of lines between each of those labels, then you can extract them by reading in the file and looping over the relevant lines. For each label, it is simple to extract the label with strsplit().
e.g. Let's say there's 5 lines between each label
uselessLines = 5;
% imports as a vertical matrix with each line from the file.
dataLabelsFile = importdata(filename);
% get the total number of lines
numLines = numel(dataLabelsFile);
% pre-allocate a column cell array for the labels, a cell is used for a string
dataLabels = cell(ceil(numLines/(uselessLines+1)), 1);
% use a separate counting variable
m = 1;
% now, for each label, we add it to the dataLabels matrix
for i=1:(uselessLines+1):numLines
line = strsplit(dataLabelsFile{i}); % by default splits on whitespace
dataLabels(m) = line(2);
m = m + 1;
end
By the end of that loop you should have a variable called dataLabels that holds all of the labels. Now you can very easily work out which label goes with which set of data, provided they are still in the same order: the index of a label is the same as the index of its data.
This is a method you could try if the labels are evenly spaced.
However, if the labels are a random number of lines, then you probably want to do a check with a regular expression like the person below me has suggested. Then you just replace the last two lines of the loop with something like this.
...
if (regular expression matched)
dataLabels(m) = line(2);
m = m + 1;
end
...
That being said, while regular expressions are flexible, if you can get away with replacing it with literally one function call, it's usually better to do that. Regex efficiencies are determined by the skill of the programmer, while in-built functions have generally been tested by some of the better programmers in the world. Additionally, Regex's are harder to understand if you ever want to go back and change it.
Of course there are times when Regex's are amazing, I'm just not convinced this is one of those times.
An implementation of the approach in my earlier comment:
fid = fopen(filename);
varNames = cell(0);
proceed = true;
while proceed
line = fgetl(fid);
if ischar(line)
startIdx = regexp(line,'(?<=^[A-Z]*\d*:)\s');
if ~isempty(startIdx)
varNames{end+1} = strtrim(line(startIdx:end)); %#ok<SAGROW>
end
else
proceed = false;
end
end
fclose(fid);
I can't put the resulting varNames in a table for you, since I have a version of Matlab that does not support tables.
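On a MATLAB version that does support tables, the names could be attached roughly as follows. This is a sketch with made-up data; it assumes the numeric data is available as a matrix with one column per label, in the same order as varNames.

```matlab
% Sample labels, as the loop above would collect them from the example file
varNames = {'Age', 'weight'};
% Hypothetical binary data, one column per label
data = [1 0; 0 1; 1 1];
% Make sure every label is a valid MATLAB identifier
varNames = matlab.lang.makeValidName(varNames);
% Build the table with the real variable names
T = array2table(data, 'VariableNames', varNames);
```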

Converting a comma separated file to a matlab matrix

I have a comma separated file in the format:
Col1Name,Col1Val1,Col1Val2,Col1Val3,...Col1ValN,Col2Name,Col2Val1,...Col2ValN,...,ColMName,ColMVal1,...,ColMValN
My question is, how can I convert this file into something Matlab can treat as a matrix, and how would I go about using this matrix in a file? I suppose I could use some scripting language to format the file into Matlab matrix format and copy it, but the file is rather large (~7 MB).
Thanks!
Sorry for the edit:
The file format is:
Col1Name;Col2Name;Col3Name;...;ColNName
Col1Val1;Col2Val1;Col3Val1;...;ColNVal1
...
Col1ValM;Col2ValM;Col3ValM;...;ColNValM
Here is some actual data:
Press;Temp.;CondF;Cond20;O2%;O2ppm;pH;NO3;Chl(a);PhycoEr;PhycoCy;PAR;DATE;TIME;excel.date;date.time
0.96;20.011;432.1;431.9;125.1;11.34;8.999;134;9.2;2.53;1.85;16.302;08.06.2011;12:01:52;40702;40702.0.5
1;20.011;433;432.8;125;11.34;9;133.7;8.19;3.32;2.02;17.06;08.06.2011;12:01:54;40702;40702.0.5
1.1;20.012;432.7;432.4;125.1;11.34;9;133.8;8.35;2.13;2.2;19.007;08.06.2011;12:01:55;40702;40702.0.5
1.2;20.012;432.8;432.5;125.2;11.35;9.001;133.8;8.45;2.95;1.95;21.054;08.06.2011;12:01:56;40702;40702.0.5
1.3;20.012;432.7;432.4;125.4;11.37;9.002;133.7;8.62;3.17;1.87;22.934;08.06.2011;12:01:57;40702;40702.0.5
1.4;20.007;432.1;431.9;125.2;11.35;9.003;133.7;9.48;4.17;1.6;24.828;08.06.2011;12:01:58;40702;40702.0.5
1.5;19.997;432.3;432.2;124.9;11.33;9.003;133.8;8.5;3.84;1.79;27.327;08.06.2011;12:01:59;40702;40702.0.5
1.6;20;432.8;432.6;124.5;11.29;9.003;133.6;8.57;3.22;1.86;30.259;08.06.2011;12:02:00;40702;40702.0.5
1.7;19.99;431.9;431.9;124.4;11.28;9.002;133.6;8.79;3.7;1.81;35.152;08.06.2011;12:02:02;40702;40702.0.5
1.8;19.994;432.1;432.1;124.4;11.28;9.002;133.6;8.58;3.41;1.84;39.098;08.06.2011;12:02:03;40702;40702.0.5
1.9;19.993;433;432.9;124.6;11.3;9.002;133.6;8.59;3.45;5.53;45.488;08.06.2011;12:02:04;40702;40702.0.5
2;19.994;433;432.9;124.8;11.32;9.002;133.5;8.6;2.76;1.99;50.646;08.06.2011;12:02:05;40702;40702.0.5
If you don't know number of rows and columns up front, you can't use previous solution. Use this instead.
7 Mb is not large, it is small. This is the 21st century.
To read in to a matlab matrix:
text = fileread('file.name'); % a string with the entire file contents in it. 7 Mb is no big deal.
NAMES = {}; % we'll record column names here
VALUES = []; % this will be the matrix of values
while text(end) == ','
text(end)=[]; % eliminate any trailing commas
end
commas = find(text==','); % Index all the commas
commas = [0;commas(:);length(text)+1]; % put fake commas before and after text to simplify loop
col = 0; % which column are we in
I = 1;
while I<length(commas)
txt = text(commas(I)+1:commas(I+1)-1);
I = I+1;
num = str2double(txt);
if isnan(num) % this means it must be a column name
NAMES{end+1,1} = txt;
col = col+1; % can you believe Matlab doesn't support col++ ???
row = 1; % back to the top at each new column
continue % we have dealt with this txt, its not a num so ... next
end
% if we made it here we have a number
VALUES(row,col) = num;
row = row+1; % move down to the next row of this column
end
Then you can save the matrix VALUES, and the header names NAMES if you want them, to a MATLAB-format (.mat) file:
save('mymatrix.mat','VALUES','NAMES'); % saves matrix and column names to .mat file
You get stuff back in to matlab when you want it from the file by:
load mymatrix.mat; % loads VALUES and NAMES from .mat file
Some limitations:
You can't use commas in your column header names.
You cannot "name" a column something like "898.2" or anything which can be read as a double number, it will be read in as a number.
If your columns have different lengths, the shorter ones will be padded with zeros to the length of the longest column.
That's all I can think of.
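For the semicolon-separated layout shown in the edited question (one header line, then one row per line), readtable with an explicit delimiter may be simpler than a hand-written parser. This is a sketch on a tiny made-up file; the temporary file and the three simplified column names are assumptions.

```matlab
% Write a small sample file in the semicolon-separated layout
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'Press;Temp;pH\n0.96;20.011;8.999\n1;20.011;9\n');
fclose(fid);

% Read it back, telling readtable which delimiter to use
T = readtable(fname, 'Delimiter', ';');
% Column names come from the header line
NAMES = T.Properties.VariableNames;
% Numeric columns can be pulled out as an ordinary matrix
VALUES = table2array(T);
```

With the real data, text columns such as DATE and TIME would have to be excluded or converted before calling table2array.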