I have a ton of data that needs to be processed from lab work. I have a ton of .mat files that contain a signal matrix of dimensions 7 x w. I need to resize the matrix to 7 x N and w is larger and smaller than N to make the rest of the analysis easier (don't care about data past N). I have the psuedocode of how I want this to work but don’t know how to implement it. Any help would be great thanks!
Folder structure of all my data:
Main folder
Alpha 1
1111.mat
1321.mat
Alpha 2
1010.mat
1234.mat
1109.mat
933.mat
Alpha 3
1223.mat
etc.
Psudeocode:
Master_matrix = []
For all n *.mat
Load n'th *.mat from alpha 1
If w > N
Resize matrix down to N
Else
Zero pad to N
End if
Master_matrix = master_matrix .+ new resized matrix
End for
rest of my code...
First you need to generate the file list. I have my own function for that, but there is, for example, GETFILELIST or the excellent interactive UIPICKFILES to generate the list of files.
Once you have the file list (I'll assume it's a cell array containing the filenames), you can do the following:
nFiles = length(fileList);
Master_matrix = zeros(7,N);
for iFile = 1:nFiles
%# if all files contain a variable of the same name,
%# you can simplify the loading by not assigning an output
%# in the load command, and call the file by
%# its variable name (i.e. replace 'loadedData')
tmp = load(fileList{iFile});
fn = fieldnames(tmp);
loadedData = tmp.(fn{1});
%# find size
w = size(loadedData,2);
if w>=N
Master_matrix = Master_matrix + loadedData(:,1:N);
else
%# only adding to the first few columns is the same as zero-padding
Master_matrix(:,1:w) = Master_matrix(:,1:w) = loadedData;
end
end
Note: In case you don't actually want to add up the data, but simply store it in the master array, you can make Master_matrix into a 7-by-N-by-nFiles array, where the nth plane of Master_matrix is the content of the nth file. In this case, you'd initialize Master_matrix as
Master_matrix = zeros(7,N,nFiles);
and you'd write the if-clause as
if w>=N
Master_matrix(:,:,iFile) = Master_matrix(:,:,iFile) + loadedData(:,1:N);
else
%# only adding to the first few columns is the same as zero-padding
Master_matrix(:,1:w,iFile) = Master_matrix(:,1:w,iFile) = loadedData;
end
Also note that you might want to initialize Master_matrix as NaN instead of zeros, so that the zeros don't affect subsequent statistics (if that's what you want to do with the data).
Related
I have multiple images in a folder, and for each image, I want to store the data(pixel values) as a row vector. After I store them in a row vector I can combine these row vectors as one multi dimensional array. e.g. the data for the first image will be stored in row 1, the data for the second image will be stored in row 2 and so on. And any time I want to access a particular image data, let us say I want the third image, I can do something like this race(3,:).
I am currently getting the error:
Dimensions of matrices being concatenated are not consistent.
The error occurs here race = [race; imagevec] I am lost in how to correct this, unless imagevec = I(:)' is not converting the matrix to a row vector .
race = []; % to store all row vector
imagevec = []; % to store row vector
path = 'C:\Users\User_\somedir\'; % directory
pathfile = dir('C:\Users\User_\somedir\*.jpg'); % image file extension in directory
for i = 1 : length(path)
filename = strcat(path,pathfile(i).name); % get the file
I = imread(filename); % read file
imagevec = I(:)'; % convert image data to row vector
race = [race; imagevec]; % store row vector in matrix
end
Using a cell array instead of a matrix will allow you to index in this way even if your images are of different sizes.
You don't even have to turn them into a row vector to store them all in the same structure. You can do something like this:
path = 'C:\Users\User_\somedir\'; % directory
pathfile = dir([path,*.jpg']); % image file extension in directory
race = cell(length(pathfile),1);
for i = 1 : length(pathfile)
filename = strcat(path,pathfile(i).name); % get the file
I = imread(filename); % read file
race{i} = I; % store in cell array
end
Then when you want to perform some operation, you can simply index into the cell array. You could even turn it into a row vector, if you wanted to, as follows.
thisImage = race{3}(:)';
If you are using a matrix to store the results, all rows of a matrix must be the same length.
Cell arrays are similar to arrays except the elements need not be the same type / size.
You can accomplish what you are looking for using a cell array. First, initialize race to:
race = {};
Then try:
race = {race{:}, imagevec};
I am working with lung data sets in matlab, but I need to sort the slices correctly and show them.
I knew that can be done using the "instance number" parameter in Dicom header, but I did not manage to run the correct code.
How can I do that?
Here is my piece of code:
Dicom_directory = uigetdir();
sdir = strcat(Dicom_directory,'\*.dcm');
files = dir(sdir);
I = strcat(Dicom_directory, '\',files(i).name);
x = repmat(double(0), [512 512 1 ]);
x(:,:,1) = double(dicomread(I));
axes(handles.axes1);
imshow(x,[]);
First of all, to get the DICOM header, you need to use dicominfo which will return a struct containing each of the fields. If you want to use the InstanceNumber field to sort by, then you can do this in such a way.
%// Get all of the files
directory = uigetdir();
files = dir(fullfile(directory, '*.dcm'));
filenames = cellfun(#(x)fullfile(directory, x), {files.name}, 'uni', 0);
%// Ensure that they are actually DICOM files and remove the ones that aren't
notdicom = ~cellfun(#isdicom, filenames);
files(notdicom) = [];
%// Now load all the DICOM headers into an array of structs
infos = cellfun(#dicominfo, filenames);
%// Now sort these by the instance number
[~, inds] = sort([infos.InstanceNumber]);
infos = infos(inds);
%// Now you can loop through and display them
dcm = dicomread(infos(1));
him = imshow(dcm, []);
for k = 1:numel(infos)
set(him, 'CData', dicomread(infos(k)));
pause(0.1)
end
That being said, you have to be careful sorting DICOMs using the InstanceNumber. This is not a robust way of doing it because the "InstanceNumber" can refer to the same image acquired over time or different slices throughout a 3D volume. If you want one or the other, I would choose something more specific.
If you want to sort physical slices, I would recommend sorting by the SliceLocation field (if available). If sorting by time, you could use TriggerTime (if available).
Also you will need to consider that there could also potentially be multiple series in your folder so maybe consider using the SeriesNumber to differentiate these.
I am trying to copy the results to a matrix and want the output in a 32768*8 array. This is the code I am using, but it stops working after the last line.
As you can see for the first file ( i=1), the decimal data,T(32768*1) is converted to M(32768*8). Now I want this M to be stored for each iteration of i, without overwriting anything.
Files_list = getAllFiles('C:\Stellaris Measurements\Stellaris-LM4F120_all');
for i = 1:15000
B=num2str(cell2mat(Files_list(i)));
fid = fopen(B,'rb');
T= fread(fid,inf,'uint8','ieee-be');
total = numel(T);
%M=textread('C:\Users\admin\Workspace\STELLARIS-LM4F120_00_210214_104000_0001_temp_025.bin','%2c');
%M=dec2bin(M);
M= de2bi(T,8,'left-msb');
M = measure(i);
end
So, basically I want to create a martix for each of the measurement, which will store the converted binary results in a 32768*8 array.
Thanks!
BR,
\Kashif
I want to create a MATLAB function to import data from files in another directory and fit them to a given model, but because the data need to be filtered (there's "thrash" data in different places in the files, eg. measurements of nothing before the analyzed motion starts).
So the vectors that contain the data used to fit end up having different lengths and so I can't return them in a matrix (eg. x in my function below). How can I solve this?
I have a lot of datafiles so I don't want to use a "manual" method. My function is below. All and suggestions are welcome.
datafit.m
function [p, x, y_c, y_func] = datafit(pattern, xcol, ycol, xfilter, calib, p_calib, func, p_0, nhl)
datafiles = dir(pattern);
path = fileparts(pattern);
p = NaN(length(datafiles));
y_func = [];
for i = 1:length(datafiles)
exist(strcat(path, '/', datafiles(i).name));
filename = datafiles(i).name;
data = importdata(strcat(path, '/', datafiles(i).name), '\t', nhl);
filedata = data.data/1e3;
xdata = filedata(:,xcol);
ydata = filedata(:,ycol);
filter = filedata(:,xcol) > xfilter(i);
x(i,:) = xdata(filter);
y(i,:) = ydata(filter);
y_c(i,:) = calib(y(i,:), p_calib);
error = #(par) sum(power(y_c(i,:) - func(x(i,:), par),2));
p(i,:) = fminsearch(error, p_0);
y_func = [y_func; func(x(i,:), p(i,:))];
end
end
sample data: http://hastebin.com/mokocixeda.md
There are two strategies I can think of:
I would return the data in a vector of cells instead, where the individual cells store vectors of different lengths. You can access data the same way as arrays, but use curly braces: Say c{1}=[1 2 3], c{2}=[1 2 10 8 5] c{3} = [ ].
You can also filter the trash data upon reading a line, if that makes your vectors have the same length.
If memory is not an major issue, try filling up the vectors with distinct values, such as NaN or Inf - anything, that is not found in your measurements based on their physical context. You might need to identify the longest data-set before you allocate memory for your matrices (*). This way, you can use equally sized matrices and easily ignore the "empty data" later on.
(*) Idea ... allocate memory based on the size of the largest file first. Fill it up with e.g. NaN's
matrix = zeros(length(datafiles), longest_file_line_number) .* NaN;
Then run your function. Determine the length of the longest consecutive set of data.
new_max = length(xdata(filter));
if new_max > old_max
old_max = new_max;
end
matrix(i, length(xdata(filter))) = xdata(filter);
Crop your matrix accordingly, before the function returns it ...
matrix = matrix(:, 1:old_max);
I have a data file m.txt that looks something like this (with a lot more points):
286.842995
3.444398
3.707202
338.227797
3.597597
283.740414
3.514729
3.512116
3.744235
3.365461
3.384880
Some of the values (like 338.227797) are very different from the values I generally expect (smaller numbers).
So, I am thinking that
I will remove all the points that lie outside the 3-sigma range. How can I do that in MATLAB?
Also, the bigger problem is that this file has a separate file t.txt associated with it which stores the corresponding time values for these numbers. So, I'll have to remove the corresponding time values from the t.txt file also.
I am still learning MATLAB, and I know there would be some good way of doing this (better than storing indices of the elements that were removed from m.txt and then removing those elements from the t.txt file)
#Amro is close, but the FIND is unnecessary (look up logical subscripting) and you need to include the mean for a true +/-3 sigma range. I would go with the following:
%# load files
m = load('m.txt');
t = load('t.txt');
%# find values within range
z = 3;
meanM = mean(m);
sigmaM = std(m);
I = abs(m - meanM) <= z * sigmaM;
%# keep values within range
m = m(I);
t = t(I);
%# load files
m = load('m.txt');
t = load('t.txt');
%# find outliers indices
z = 3;
idx = find( abs(m-mean(m)) > z*std(m) );
%# remove them from both data and time values
m(idx) = [];
t(idx) = [];