Sorting DICOM images in MATLAB

I am working with lung data sets in MATLAB, and I need to sort the slices into the correct order before displaying them.
I know this can be done using the InstanceNumber field in the DICOM header, but I could not get my code to work.
How can I do that?
Here is my code:
Dicom_directory = uigetdir();
sdir = strcat(Dicom_directory,'\*.dcm');
files = dir(sdir);
I = strcat(Dicom_directory, '\',files(i).name);
x = repmat(double(0), [512 512 1 ]);
x(:,:,1) = double(dicomread(I));
axes(handles.axes1);
imshow(x,[]);

First of all, to get the DICOM header you need to use dicominfo, which returns a struct containing each of the header fields. If you want to sort by the InstanceNumber field, you can do it like this:
%// Get all of the files
directory = uigetdir();
files = dir(fullfile(directory, '*.dcm'));
filenames = cellfun(@(x)fullfile(directory, x), {files.name}, 'uni', 0);
%// Ensure that they are actually DICOM files and remove the ones that aren't
notdicom = ~cellfun(@isdicom, filenames);
files(notdicom) = [];
filenames(notdicom) = [];
%// Now load all the DICOM headers into an array of structs
infos = cellfun(@dicominfo, filenames);
%// Now sort these by the instance number
[~, inds] = sort([infos.InstanceNumber]);
infos = infos(inds);
%// Now you can loop through and display them
dcm = dicomread(infos(1));
him = imshow(dcm, []);
for k = 1:numel(infos)
set(him, 'CData', dicomread(infos(k)));
pause(0.1)
end
That being said, you have to be careful sorting DICOMs using the InstanceNumber. This is not a robust way of doing it because the "InstanceNumber" can refer to the same image acquired over time or different slices throughout a 3D volume. If you want one or the other, I would choose something more specific.
If you want to sort physical slices, I would recommend sorting by the SliceLocation field (if available). If sorting by time, you could use TriggerTime (if available).
Also keep in mind that your folder may contain multiple series, so consider using the SeriesNumber field to tell them apart.
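As a rough sketch of that advice, the headers can be restricted to a single series and then ordered by SliceLocation. The struct array below is made-up data standing in for what dicominfo would return; the file names, series numbers, slice locations, and the variable names wantedSeries and sortedNames are all assumptions for illustration.

```matlab
% Hypothetical headers; in practice these would come from dicominfo
infos = struct( ...
    'SeriesNumber',  {2, 1, 2, 2, 1}, ...
    'SliceLocation', {-10.0, 5.0, -30.0, -20.0, 0.0}, ...
    'Filename',      {'a.dcm', 'b.dcm', 'c.dcm', 'd.dcm', 'e.dcm'});

% Keep only the series of interest (assumed here to be series 2)
wantedSeries = 2;
inSeries = [infos.SeriesNumber] == wantedSeries;
series = infos(inSeries);

% Sort the remaining headers by physical slice position
[~, order] = sort([series.SliceLocation]);
series = series(order);

% File names in slice order, ready to pass to dicomread one by one
sortedNames = {series.Filename};
```

This assumes every header actually has a SliceLocation field; if some files lack it, the concatenation `[series.SliceLocation]` will fail and those files need to be handled separately.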

Related

Changing Matlab Nested Structures to Matrices Iteratively for histogram plotting

Pretty new to Matlab, so please forgive the poor coding. I have some data for different categories (9 categories) with a different number of data points in each category. I created a structure that holds the data points for the different categories. I believe the categories are themselves structures within the larger structure.
I want to plot a histogram for each category. The first thing I tried was just creating a for-loop and plotting a histogram for each category in the structure, but this failed because the histogram doesn't take in structures. The next thing I tried to do was create another for loop which would change the structure holding each category into a cell array, but this also failed with the error:
if isnumeric(c{1}) || ischar(c{1}) || islogical(c{1}) || isstruct(c{1})
I am able to individually change each category to a cell array and then to a matrix, which allowed me to create one histogram. Is there a way to do this using a loop? My code is below. Thanks.
data = readtable('');
data = table2array(data);
Trial = data(:,1);
dist = data(:,2);
Time = data(:,3);
intstim = data(:,4);
color = data(:,5);
UniqueDist = unique(dist);
for ii = 1.0:length(UniqueDist)
idx = find(dist == UniqueDist(ii));
distTime(ii).data = Time(idx);
distTime(ii).data = distTime(ii).data(distTime(ii).data ~= 0);
end
for jj =1.0:length(distTime)
distTime(jj).data = struct2cell(distTime(jj));
distTime(jj).data = cell2mat(distTime(jj));
end

How do I find out which structure my MATLAB variable is stored in?

I have raw data in a .mat format. The Data is comprised of a bunch of Structures titled 'TimeGroup_XX' where xx is just a random number. Each one of these structures contains a bunch of signals and their corresponding time steps.
It looks something like this
TimeGroup_45 =
time45: [34069x1 double]
Current_Shunt_5: [34069x1 double]
Voltage_Load_5: [34069x1 double]
This is simply unusable, as I do not know which of the hundreds of structures in the raw data contains the variable I am looking for. I just know that I am looking for 'Current_Shunt_3', for example!
There has to be a way that would allow me to do the following
for all variables in Work space
I_S3 = Find(Current_Shunt_3)
end for
Basically I do not want to manually click through every structure to find my variable and just want it to be saved in a normal time series instead of it being hidden in a random structure! Any suggestion on how to do it? There has to be a way!
I have tried using the 'whos' command, but did not get far, as it only returns a list of the structures stored in the workspace. I cannot convert that text to a variable and tell it to search through all the fields.
Thanks guys/girls!
This is a great example of why you shouldn't iterate variable names when there are plenty of adequate storage methods that don't require code gymnastics to get data back out of. If you can change this, do that and don't even bother reading the rest of this answer.
Since everything is apparently contained in one *.mat file, call load with an output argument so that the contents are returned as fields of a single structure, then use fieldnames to iterate over them.
Using the following data set, for example:
a1.Current_Shunt_1 = 1;
a2.Current_Shunt_2 = 2;
a5.Current_Shunt_5 = 5;
b1.Current_Shunt_5 = 10;
save('test.mat')
We can do:
% Load data
alldata = load('test.mat');
% Get all structure names
datastructs = fieldnames(alldata);
% Filter out all but aXX structures and iterate through
adata = regexpi(datastructs, 'a\d+', 'Match');
adata = [adata{:}];
queryfield = 'Current_Shunt_5';
querydata = [];
for ii = 1:numel(adata)
tmp = fieldnames(alldata.(adata{ii}));
% See if our query field is present
% If yes, output & break out of loop
test = intersect(queryfield, tmp);
if ~isempty(test)
querydata = alldata.(adata{ii}).(queryfield);
break
end
end
which gives us:
>> querydata
querydata =
5
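A possible variation on the loop above, shown only as a sketch: isfield can test each structure directly, which also reports every structure that holds the field rather than stopping at the first match. The structure alldata is built inline here to mirror the test data instead of being loaded from test.mat, and the names hasfield, owners, and values are made up for this example.

```matlab
% Same test data as above, built directly instead of loaded from a file
alldata.a1.Current_Shunt_1 = 1;
alldata.a2.Current_Shunt_2 = 2;
alldata.a5.Current_Shunt_5 = 5;
alldata.b1.Current_Shunt_5 = 10;

queryfield = 'Current_Shunt_5';

% Names of all top-level variables in the loaded structure
names = fieldnames(alldata);

% Logical mask: true where the variable is a struct containing the field
hasfield = cellfun(@(n) isstruct(alldata.(n)) && ...
                        isfield(alldata.(n), queryfield), names);

% Structures that own the field, and the data stored under it
owners = names(hasfield);
values = cellfun(@(n) alldata.(n).(queryfield), owners, ...
                 'UniformOutput', false);
```

Here owners lists both a5 and b1, so it becomes obvious when a signal name is not unique across the structures.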

MATLAB loop through excel files

My code is posted below. It does exactly what I need it to do.
It reads in a file and plots the data that I need. If I want to read in another file and have it go through the same code, without having to write the whole thing a second time with different variables, is that possible? I would like to store the matrices from each loop.
As you can see the file I get is called: Oxygen_1keV_300K.xlsx
I have another file called: Oxygen_1keV_600K.xlsx
and so on.
How can I loop through these files without having to re-code the whole thing? I then want to plot them all on the same graph. It would be nice to store the final matrix Y and Ymean for each file so they are not overwritten.
clear
clc
files = ['Oxygen_1keV_300K','Oxygen_1keV_300K','Oxygen_1keV_600K','Oxygen_1keV_900K'];
celldata = cellstr(file)
k = cell(1,24);
for k=1:24
data{k} = xlsread('C:\Users\Ben\Desktop\Oxygen_1keV_300K.xlsx',['PKA', num2str(k)]);
end
for i=1:24
xfinal{i}=data{1,i}(end,1);
xi{i}=0:0.001:xfinal{i};
xi{i}=transpose(xi{i});
x{i}=data{1,i}(:,1);
y{i}=data{1,i}(:,4);
yi{i} = interp1(x{i},y{i},xi{i});
end
Y = zeros(10001, numel(data));
for ii = 1 : numel(data)
Y(:, ii) = yi{ii}(1 : 10001);
end
Ymean = mean(Y, 2);
figure (1)
x=0:0.001:10;
semilogy(x,Ymean)
Cell arrays make it very easy to store a list of strings that you can access inside a for loop. In this case, I would suggest putting your file paths in a cell array and using it in place of the hard-coded string in your xlsread call.
For example,
%The first file is the same as in your example.
%I just made up file names for the next two.
%Use the full file path if the file is not in your current directory
filepath_list = {'C:\Users\Ben\Desktop\Oxygen_1keV_300K.xlsx', 'file2.xlsx', 'file3.xlsx'};
%To store separate results for each file, make Ymean a cell array
%(mean(Y, 2) returns a column vector, so it cannot be stored in a single numeric element)
Ymean = cell(length(filepath_list), 1);
%Now use a for loop to loop over the files
for ii=1:length(filepath_list)
%Here's where your existing code would go
%I only include the sections which change due to the loop
for k=1:24
%The change is that on this line you use the cell array variable to load the next file path
data{k} = xlsread(filepath_list{ii},['PKA', num2str(k)]);
end
% ... do the rest of your processing
%Index into Ymean to store the result for this file in the corresponding location
Ymean{ii} = mean(Y, 2);
end
Cell arrays are a basic MATLAB variable type. For an introduction, I recommend the documentation on creating and accessing data in cell arrays.
If all your files are in the same directory, you can also use functions like dir or ls to populate the cell array programmatically.
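A minimal sketch of the dir approach follows. To keep it self-contained it creates a temporary folder with empty placeholder files; that folder, the notes.txt file, and the wildcard pattern are assumptions for illustration, and in practice you would point folder at the directory that already holds your spreadsheets.

```matlab
% Temporary folder with placeholder files, only so the sketch runs anywhere
folder = tempname();
mkdir(folder);
names = {'Oxygen_1keV_300K.xlsx', 'Oxygen_1keV_600K.xlsx', 'notes.txt'};
for k = 1:numel(names)
    fid = fopen(fullfile(folder, names{k}), 'w');
    fclose(fid);
end

% List only the spreadsheets that match the naming pattern
files = dir(fullfile(folder, 'Oxygen_1keV_*.xlsx'));

% Build full paths, one per cell, ready to use as filepath_list{ii}
filepath_list = cellfun(@(n) fullfile(folder, n), {files.name}, ...
                        'UniformOutput', false);
```

The resulting filepath_list can be used in the loop above in place of the hand-written list.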

MATLAB: vectors of different length

I want to create a MATLAB function that imports data from files in another directory and fits them to a given model. The data need to be filtered first, because there is "trash" data in different places in the files (e.g. measurements of nothing before the analyzed motion starts).
As a result, the vectors that contain the data used for the fit end up having different lengths, so I can't return them in a matrix (e.g. x in my function below). How can I solve this?
I have a lot of data files, so I don't want to use a "manual" method. My function is below. Any and all suggestions are welcome.
datafit.m
function [p, x, y_c, y_func] = datafit(pattern, xcol, ycol, xfilter, calib, p_calib, func, p_0, nhl)
datafiles = dir(pattern);
path = fileparts(pattern);
p = NaN(length(datafiles));
y_func = [];
for i = 1:length(datafiles)
exist(strcat(path, '/', datafiles(i).name));
filename = datafiles(i).name;
data = importdata(strcat(path, '/', datafiles(i).name), '\t', nhl);
filedata = data.data/1e3;
xdata = filedata(:,xcol);
ydata = filedata(:,ycol);
filter = filedata(:,xcol) > xfilter(i);
x(i,:) = xdata(filter);
y(i,:) = ydata(filter);
y_c(i,:) = calib(y(i,:), p_calib);
error = @(par) sum(power(y_c(i,:) - func(x(i,:), par),2));
p(i,:) = fminsearch(error, p_0);
y_func = [y_func; func(x(i,:), p(i,:))];
end
end
sample data: http://hastebin.com/mokocixeda.md
There are two strategies I can think of:
I would return the data in a cell array instead, where the individual cells store vectors of different lengths. You can access the data the same way as with arrays, but using curly braces: say c{1}=[1 2 3], c{2}=[1 2 10 8 5], c{3} = [ ].
You can also filter the trash data upon reading a line, if that makes your vectors have the same length.
If memory is not a major issue, try filling up the vectors with distinct values, such as NaN or Inf: anything that cannot occur in your measurements given their physical context. You might need to identify the longest data set before you allocate memory for your matrices (*). This way, you can use equally sized matrices and easily ignore the "empty data" later on.
(*) Idea ... allocate memory based on the size of the largest file first. Fill it up with e.g. NaN's
matrix = NaN(length(datafiles), longest_file_line_number);
Then run your function. Determine the length of the longest consecutive set of data.
new_max = length(xdata(filter));
if new_max > old_max
old_max = new_max;
end
matrix(i, 1:length(xdata(filter))) = xdata(filter);
Crop your matrix accordingly, before the function returns it ...
matrix = matrix(:, 1:old_max);
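A minimal sketch of the first strategy (a cell array), using made-up numbers in place of the imported file data. The variable names xdata and xfilter follow the question, but their values here are assumptions for illustration only.

```matlab
% Hypothetical stand-ins for the x column of three data files
xdata = {[0.1 0.5 1.2 2.0], [0.2 0.9 1.5 2.2 3.1], [0.4 1.1 1.8]};
% One filter threshold per file, as in the question's xfilter(i)
xfilter = [0.3 0.5 1.0];

% Each cell holds the filtered vector for one file; lengths may differ
x = cell(numel(xdata), 1);
for i = 1:numel(xdata)
    keep = xdata{i} > xfilter(i);
    x{i} = xdata{i}(keep);
end

% Number of points that survived the filter in each file
lengths = cellfun(@numel, x);
```

Inside datafit the same idea would mean writing x{i} = xdata(filter); instead of x(i,:) = xdata(filter); and returning x as a cell array.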

How to efficiently find correlation and discard points outside 3-sigma range in MATLAB?

I have a data file m.txt that looks something like this (with a lot more points):
286.842995
3.444398
3.707202
338.227797
3.597597
283.740414
3.514729
3.512116
3.744235
3.365461
3.384880
Some of the values (like 338.227797) are very different from the values I generally expect (smaller numbers).
So I am thinking that I will remove all the points that lie outside the 3-sigma range. How can I do that in MATLAB?
Also, the bigger problem is that this file has a separate file t.txt associated with it which stores the corresponding time values for these numbers. So, I'll have to remove the corresponding time values from the t.txt file also.
I am still learning MATLAB, and I know there would be some good way of doing this (better than storing indices of the elements that were removed from m.txt and then removing those elements from the t.txt file)
@Amro is close, but the FIND is unnecessary (look up logical subscripting) and you need to include the mean for a true +/-3 sigma range. I would go with the following:
%# load files
m = load('m.txt');
t = load('t.txt');
%# find values within range
z = 3;
meanM = mean(m);
sigmaM = std(m);
I = abs(m - meanM) <= z * sigmaM;
%# keep values within range
m = m(I);
t = t(I);
%# load files
m = load('m.txt');
t = load('t.txt');
%# find outliers indices
z = 3;
idx = find( abs(m-mean(m)) > z*std(m) );
%# remove them from both data and time values
m(idx) = [];
t(idx) = [];