collapse cell array to text matlab - matlab

It's a basic question, I guess (I'm new to MATLAB), but given:
>> class(motifIndexAfterThresholds)
ans =
'double'
with :
16
8037
14340
21091
27903
34082
as the contents of that variable
I hoped to print to the same line on the matlab console the contents of that variable and some other output:
fprintf('With threshold set to %d, %d motifs found at positions %f.\n',threshold,length(motifIndexAfterThresholds), motifIndexAfterThresholds);
When I do this however, I'm getting more than one line of output:
With threshold set to 800, 6 motifs found at positions 16.000000.
With threshold set to 8037, 14340 motifs found at positions 21091.000000.
With threshold set to 27903, 34082 motifs found at positions
Can someone share the method for collapsing this double array to a single line of text that I can display on the Matlab console please?

What you need is the num2str function, which is built into MATLAB. Modify your code as below:
strThresholds = num2str(motifIndexAfterThresholds.', '%f, '); % Transpose used here since you need to make sure that motifIndexAfterThresholds is a row vector
fprintf('With threshold set to %d, %d motifs found at positions %s.\n',threshold,length(motifIndexAfterThresholds), strThresholds);
The num2str function will convert your vector to a string with the specified format. So for your given example,
strThresholds = '16.000000, 8037.000000, 14340.000000, 21091.000000, 27903.000000, 34082.000000,'
You can edit the format string passed to the num2str function to suit your needs. I would suggest using %d, since your vector contains integers.
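If the trailing comma is unwanted, one possible variant (a minimal sketch, not part of the answer above) builds the list with sprintf and trims the final separator. The values are copied from the question; strPositions is only an illustrative name.

```matlab
% Values copied from the question.
threshold = 800;
motifIndexAfterThresholds = [16; 8037; 14340; 21091; 27903; 34082];

% sprintf recycles the format for every element, giving '16, 8037, ..., 34082, '.
strPositions = sprintf('%d, ', motifIndexAfterThresholds);
% Drop the trailing ', ' (two characters) left after the last element.
strPositions = strPositions(1:end-2);

fprintf('With threshold set to %d, %d motifs found at positions %s.\n', ...
    threshold, numel(motifIndexAfterThresholds), strPositions);
```

Because the whole list is passed as a single %s argument, fprintf no longer recycles the format string and everything stays on one line.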

Related

The format type of reading a matrix from a file?

I saved a matrix in File.txt; the contents of the file are as follows:
file: 0.010993,0.21973,0.012142,0.49897,0.24634,0.01183
When I read the matrix using
Matrix = dlmread('File.txt');
MATLAB keeps only 4 digits after the decimal point:
Matrix: 0.0110 0.2197 0.0121 0.4990 0.2463 0.0118
I don't want the digits after the decimal point to change when the matrix is read.
Checking the Workspace:
Without changing anything, simply inspecting the variable Matrix in the Workspace window reveals that the full values are stored and that MATLAB is only choosing to display a limited number of digits:
Matrix = dlmread('File.txt')
Changing the Display Format:
If you configure the display format to long you will see that MATLAB does extract all the decimals within the text file and can display them in the command window. Also opening up array Matrix in the workspace panel and clicking the cells also verifies that all the decimal numbers are taken into account. For more details on display options: Matlab Documentation: Set Command Window Output Display Format.
Note: "Numeric formats affect only how numbers appear in Command Window output, not how MATLAB® computes or saves them".
format long
Matrix = dlmread('File.txt')
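A rough way to confirm this without touching the display format at all is sketched below. It assumes a temporary file created with tempname as a stand-in for File.txt; the names tmpFile, expected and maxDifference are illustrative only.

```matlab
% Write the values from the question to a temporary file (stand-in for File.txt).
tmpFile = [tempname '.txt'];
fid = fopen(tmpFile, 'wt');
fprintf(fid, '0.010993,0.21973,0.012142,0.49897,0.24634,0.01183\n');
fclose(fid);

Matrix = dlmread(tmpFile);   % same call as above, on the temporary file
delete(tmpFile);

% Print with an explicit format; this ignores the 'format' setting entirely.
fprintf('%.6g ', Matrix);
fprintf('\n');

% Compare against the same numbers typed in directly.
expected = [0.010993, 0.21973, 0.012142, 0.49897, 0.24634, 0.01183];
maxDifference = max(abs(Matrix - expected));
fprintf('Largest difference from the typed-in values: %g\n', maxDifference);
```

A difference at or near zero indicates that only the display was shortened, not the stored data.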
Extension: Checking Sum
format long
Matrix = dlmread( 'File.txt') ;
sum(Matrix)
sum([0.010993,0.21973,0.012142,0.49897,0.24634,0.01183])
Google Calculator Check
Taking VPA Sum:
format long
Matrix = dlmread('File.txt') ;
sum(vpa(Matrix))
sum(vpa([0.010993,0.21973,0.012142,0.49897,0.24634,0.01183]))
Playground Script:
Testing different display methods and approaches to taking the sum().
x = [0.010097,0.19957,0.011086,0.49413,0.27437,0.010745];
sum(vpa(x))
x = [0.010097000000000 0.199570000000000 0.011086000000000 0.494130000000000 0.274370000000000 0.010745000000000];
sum(vpa(x))
%%
format short
x = [0.010097,0.19957,0.011086,0.49413,0.27437,0.010745];
sum(x)
x = [0.010097000000000 0.199570000000000 0.011086000000000 0.494130000000000 0.274370000000000 0.010745000000000];
sum(x)
%%
format long
x = [0.010097,0.19957,0.011086,0.49413,0.27437,0.010745];
sum(x)
x = [0.010097000000000 0.199570000000000 0.011086000000000 0.494130000000000 0.274370000000000 0.010745000000000];
sum(x)
Ran using MATLAB R2019b

Optimizing reading the data in Matlab

I have a large data file formatted as a single column with n rows. Each row is either a real number or the string No Data. I have imported this text as an n-by-1 cell array named Data. Now I want to filter the data and create an n-by-1 numeric array from it, with NaN values in place of No Data. I have managed to do it using a simple loop (see below), but the problem is that it is quite slow.
z = zeros(n,1);
for i = 1:n
    if Data{i}(1)~='N'
        z(i) = str2double(Data{i});
    else
        z(i) = NaN;
    end
end
Is there a way to optimize it?
Actually, the whole parsing can be performed with a one-liner using a properly parametrized readtable function call (no iterations, no sanitization, no conversion, etc...):
data = readtable('data.txt','Delimiter','\n','Format','%f','ReadVariableNames',false,'TreatAsEmpty','No data');
Here is the content of the text file I used as a template for my test:
9.343410
11.54300
6.733000
-135.210
No data
34.23000
0.550001
No data
1.535000
-0.00012
7.244000
9.999999
34.00000
No data
And here is the output (which can be retrieved in the form of a vector of doubles using data.Var1):
ans =
9.34341
11.543
6.733
-135.21
NaN
34.23
0.550001
NaN
1.535
-0.00012
7.244
9.999999
34
NaN
Delimiter: specified as a line break since you are working with a single column... this prevents No data from being split into two columns because of the whitespace.
Format: you want numerical values.
TreatAsEmpty: this tells the function to treat a specific string as empty, and empty doubles are set to NaN by default.
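If the data has already been imported as a cell array, as in the question, a loop-free alternative to re-reading the file is str2double applied to the whole cell array. This is a different technique from the readtable call above, shown here only as a minimal sketch; the contents of Data are made-up stand-ins.

```matlab
% Small stand-in for the imported n-by-1 cell array (values are illustrative).
Data = {'9.343410'; '11.54300'; 'No Data'; '-0.00012'; 'No Data'};

% str2double accepts a cell array of character vectors and returns NaN
% for every entry that cannot be parsed as a number.
z = str2double(Data);

numMissing = nnz(isnan(z));
fprintf('%d of %d rows are missing.\n', numMissing, numel(z));
```

Whether this beats the file-based approaches on large inputs would need to be timed on the actual data.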
If you run this you can find out which approach is faster. It creates an 11MB text file and reads it with the various approaches.
filename = 'data.txt';
%% generate data
fid = fopen(filename,'wt');
N = 1E6;
for ct = 1:N
    val = rand(1);
    if val<0.01
        fwrite(fid,sprintf('%s\n','No Data'));
    else
        fwrite(fid,sprintf('%f\n',val*1000));
    end
end
fclose(fid)
%% Tommaso Belluzzo
tic
data = readtable(filename,'Delimiter','\n','Format','%f','ReadVariableNames',false,'TreatAsEmpty','No Data');
toc
%% Camilo Rada
tic
[txtMat, nLines]=txt2mat(filename);
NoData=txtMat(:,1)=='N';
z = zeros(nLines,1);
z(NoData)=nan;
toc
%% Gelliant
tic
fid = fopen(filename,'rt');
z= textscan(fid, '%f', 'Delimiter','\n', 'whitespace',' ', 'TreatAsEmpty','No Data', 'EndOfLine','\n','TextType','char');
z=z{1};
fclose(fid);
toc
result:
Elapsed time is 0.273248 seconds.
Elapsed time is 0.304987 seconds.
Elapsed time is 0.206315 seconds.
txt2mat is slow: even without converting the resulting char matrix to numbers, it is outperformed by readtable and textscan. textscan is slightly faster than readtable, probably because it skips some of the internal sanity checks and does not convert the resulting data to a table.
Depending on how big your files are and how often you read such files, you might want to go beyond readtable, which can be quite slow.
EDIT: After tests, with a file this simple the method below provides no advantage. The method was developed to read RINEX files, which are large and complex in the sense that they are alphanumeric, with different numbers of columns and different delimiters in different rows.
The most efficient way I've found is to read the whole file as a char matrix; then you can easily find your "No data" lines. And if your real numbers are formatted with a fixed width, you can transform them from char into numbers much more efficiently than with str2double or similar functions.
The function I wrote to read a text file into a char matrix is:
function [txtMat, nLines]=txt2mat(filename)
% txt2mat Read the content of a text file to a char matrix
% Read all the content of a text file to a matrix as wide as the longest
% line on the file. Shorter lines are padded with blank spaces. New lines
% are not included in the output.
% New lines are identified by new line \n characters.
% Reading the whole file in a string
fid=fopen(filename,'r');
fileData = char(fread(fid));
fclose(fid);
% Finding new lines positions
newLines= fileData==sprintf('\n');
linesEndPos=find(newLines)-1;
% Calculating number of lines
nLines=length(linesEndPos);
% Calculating the width (number of characters) of each line
linesWidth=diff([-1; linesEndPos])-1;
% Number of characters per row including new lines
charsPerRow=max(linesWidth)+1;
% Initializing output var with blank spaces
txtMat=char(zeros(charsPerRow,nLines,'uint8')+' ');
% Computing a logical index to all characters of the input string to
% their final positions
charIdx=false(charsPerRow,nLines);
% Indexes of all new lines
linearInd = sub2ind(size(txtMat), (linesWidth+1)', 1:nLines);
charIdx(linearInd)=true;
charIdx=cumsum(charIdx)==0;
% Filling output matrix
txtMat(charIdx)=fileData(~newLines);
% Cropping the last row corresponding to new line characters and transposing
txtMat=txtMat(1:end-1,:)';
end
Then, once you have all your data in a matrix (let's assume it is named txtMat), you can do:
NoData=txtMat(:,1)=='N';
And if your number fields have a fixed width, you can transform them to numbers far more efficiently than with str2num, using something like
values=((txtMat(:,1:10)-'0')*[1e6; 1e5; 1e4; 1e3; 1e2; 10; 1; 0; 1e-1; 1e-2]);
Here I've assumed that each number has seven integer digits, a decimal point (which gets weight 0) and two decimal places, but you can easily adapt it to your case.
And to finish you need to set the NaN values with:
values(NoData)=NaN;
This is more cumbersome than readtable or similar functions, but if you are looking to optimize the reading, this is way faster. And if you don't have fixed-width numbers, you can still do it this way by adding a couple of lines to count the number of digits and find the position of the decimal point before doing the conversion, but that will slow things down a little. However, I think it will still be faster.
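The fixed-width conversion described above can be tried on a small hand-made char matrix. This is only a sketch under stated assumptions: the rows are invented, every numeric field is zero-padded to exactly 10 characters (a blank pad would contribute ' ' - '0' = -16 and corrupt the result), and weights is just an illustrative name.

```matlab
% Hand-made stand-in for the output of txt2mat: every row is 10 characters wide.
txtMat = ['0001234.50'; ...
          '0000007.25'; ...
          'No Data   '];

% Place value of each character position; the decimal point gets weight 0.
weights = [1e6; 1e5; 1e4; 1e3; 1e2; 10; 1; 0; 1e-1; 1e-2];

% Rows starting with 'N' are the missing entries.
NoData = txtMat(:,1) == 'N';

% Subtracting '0' turns digit characters into the digits 0..9;
% the matrix product then sums digit * place value for each row.
values = (txtMat(:,1:10) - '0') * weights;

% The "No Data" rows produced meaningless numbers; overwrite them.
values(NoData) = NaN;
disp(values)
```

The design choice here is to trade generality for speed: one matrix product handles every row at once, but only because each digit is guaranteed to sit in the same column.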

Matlab mat to libsvm format (strange fprint behaviour)

I'm trying to write my own implementation of a mat2libsvm format converter (I don't want to use the original function because it wants a double matrix as input, but I am working with images and have uint8 matrices).
So here is an example that I don't understand:
a= zeros(2,256);
a(1,256)=1;
formatSpec = '%i:%d ';
row= a(1,:);id=find(row);fprintf(formatSpec,[id ; row(id)]);
>256:1
row= uint8(a(1,:));id=find(row);fprintf(formatSpec,[id ; row(id)]);
>255:1
Why does it cut off at 255? In any case, id is double in both the 1st and 2nd examples.
In the second line you are concatenating a uint8 with a double, which casts both to uint8. Since uint8 saturates at 255, the index 256 becomes 255. Minimal example:
[256;uint8(1)]
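To solve this, pass id and row(id) to fprintf as separate input arguments:
fprintf(formatSpec,id , row(id));
Note that fprintf consumes its arguments one after another, so this pairs indices and values correctly only when row has a single nonzero element; for several nonzeros, cast the values instead, as in [id ; double(row(id))].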
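The saturation can be reproduced in isolation. The sketch below uses an invented row with two nonzero entries and sprintf instead of fprintf so the results can be inspected; it also shows a cast-to-double variant (not from the answer above) that keeps the index:value pairs interleaved when there are several nonzeros.

```matlab
formatSpec = '%i:%d ';

% Illustrative uint8 row with two nonzero entries, one at index 256.
row = zeros(1, 256, 'uint8');
row([2 256]) = [5 1];
id = find(row);                 % find returns doubles: [2 256]

% Mixing double and uint8 in one array yields uint8, so 256 saturates to 255.
mixed = [id; row(id)];
badText = sprintf(formatSpec, mixed);

% Casting the values to double keeps the whole array double and the indices intact.
pairs = [id; double(row(id))];
goodText = sprintf(formatSpec, pairs);

disp(class(mixed))
disp(badText)
disp(goodText)
```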

Can't understand the warning

My code works, but every time it runs through line 13 it writes to the command window: "Warning: Integer operands are required for colon operator when used as index".
The relevant part of my code looks like this:
filename = uigetfile;
obj = mmreader(filename);
nFrames=obj.NumberOfFrames;
for k = 1 : nFrames
    this_frame = read(obj, k);
    thisfig = figure();
    thisax = axes('Parent', thisfig);
    image(this_frame, 'Parent', thisax);
    if k==1
        handle=imrect;
        pos=handle.getPosition;
    end
    partOf=this_frame(pos(2):pos(2)+pos(4),pos(1):pos(1)+pos(3));%this is line 13
    vector(k)=mean2(partOf);
    title(thisax, sprintf('Frame #%d', k));
end
Why does this warning appear, and can I ignore it?
It's probably because one or more of the following: pos(2), pos(2)+pos(4), pos(1) and pos(1)+pos(3) are not integers, which indices are supposed to be. You may want to use the round function to round them to integer values.
Maayan,
The problem seems to stem from the values in your pos vector (and the values computed from them).
This is the solution quoted from MathWorks(MATLAB) website:
http://www.mathworks.com/support/solutions/en/data/1-FA9A2S/?solution=1-FA9A2S
Modify the index computations using the FIX, FLOOR, CEIL, or ROUND functions to ensure that the indices are integers. You can test if a variable contains an integer by comparing the variable to the output of the ROUND function operating on that variable when MATLAB is in debug mode on the line containing the variable.
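As a minimal sketch of that advice, the rectangle can be rounded once, right after it is obtained, so that every later colon expression uses integers. The frame and the pos values below are invented stand-ins for this_frame and for the output of getPosition.

```matlab
% Stand-ins: a 100-by-100 frame and a fractional [x y width height] rectangle.
this_frame = rand(100, 100);
pos = [10.5 20.25 30.75 40.5];

% Round once so that all index computations below use integers.
pos = round(pos);

rowsIdx = pos(2):pos(2)+pos(4);   % y range
colsIdx = pos(1):pos(1)+pos(3);   % x range
partOf = this_frame(rowsIdx, colsIdx);

% Same test as suggested in the quoted solution: integers equal their rounded value.
allIntegers = all(pos == round(pos));
disp(size(partOf))
```

Rounding can still push an index outside the image if the rectangle touches the border, so clamping to the frame size may be worth adding in real code.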

Matlab rgb2hsv dimensions

I am reading in the matlab documentation that rgb2hsv will return an m-by-n-by-3 image array, yet when I call it, I get a 1-by-3 vector. Am I misunderstanding something?
Here is some sample code:
image_hsv = rgb2hsv('filepath')
and as output
image_hsv =
0.7108 0.3696 92.0000
You cannot call rgb2hsv on a filepath - it must be called on a MATLAB image matrix. Try:
image_rgb = imread('filepath'); % load the image array to MATLAB workspace
image_hsv = rgb2hsv(image_rgb); % convert this array to hsv
You can see these matrices with:
>> whos image* % display all variables whose name begins with 'image'
Name Size Bytes Class Attributes
image_hsv 480x640x3 7372800 double
image_rgb 480x640x3 921600 uint8
What your original code was doing was converting your filepath string to ascii numbers, taking the first three values of this array as RGB values and converting these to HSV.
NOTE: This example highlights dangers with MATLAB's weak typing system, where data types are silently converted from one type to another. Also maybe a lack of correct input checking to the rgb2hsv function.
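The character-to-number conversion can be seen directly, without calling rgb2hsv on a string at all. The sketch below uses a hypothetical file name that is never opened and a one-pixel image built in memory, so it does not depend on any file; how rgb2hsv reacts to a char input may differ between MATLAB releases.

```matlab
% A char vector is silently convertible to its character codes.
filepath = 'img.png';                    % hypothetical name, never opened here
charCodes = double(filepath(1:3));       % codes of 'i', 'm', 'g'
disp(charCodes)

% What rgb2hsv expects instead: a numeric m-by-n-by-3 array.
image_rgb = uint8(cat(3, 255, 0, 0));    % a single pure-red pixel
image_hsv = rgb2hsv(image_rgb);
disp(size(image_hsv))
disp(class(image_hsv))
```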