How to group all images in subfolders in groups of 8? - matlab

I'm not sure how to start other than by sharing the folder structure and its contents.
Here they are: Google Drive Link to the ZIP. The images to group were downloaded from PMD Sprites Collab.
Next, here is the MATLAB code I am working with to try to group the images. It is based on MATLAB: Create and Explore Datastore for Image Classification:
filepath='C:\portrait'; % above the unzipped folder, better full file path
dataFolder=fullfile(filepath);
PortraitIMDS=imageDatastore(filepath,...
'IncludeSubfolders',true,...
'LabelSource','foldernames');
numObs=length(PortraitIMDS.Labels);
numObsPerClass=countEachLabel(PortraitIMDS);
%numObs
numObsPerClass
%numObsToShow=8;
classes=compose('%04d', 0:905);
classSID=zeros(1,length(classes));
for i=1:length(classes)
classSID(i)=str2num(char(classes(i)));
end % this small part would be used to make a loop over 'class' variable below.
class = "0001"; % between "0000" and "0905"
% obtain all the images in groups of 8.
idxClass = find(PortraitIMDS.Labels == class);
size(idxClass)
%numObsPerClass(str2num(size(idxClass,1))+1,2)
%idx = randsample(idxClass,numObsToShow,false);
% I DO NOT want it randomised, I want everything in order
imshow(imtile(PortraitIMDS.Files(idxClass),'GridSize',[2 4],'ThumbnailSize',[144 144]));
As the code comments say, I do not want a randsample of idxClass. I know how to get a randomized sample of idxClass, but not how to retrieve its elements in order.
There are 906 classes in total, from 0000 to 0905. The number of images per class ranges from nearly 1500 PNGs in class 0001 alone down to none in some classes; others have 40, 25, 16, 8, etc.
I am asking because I don't know how to retrieve all the images class by class, group by group, in order. The layout I want is already obtained via 'GridSize',[2 4],'ThumbnailSize',[144 144], but I'd like to go through all the images, grouping them into 2x4 grids regardless of class and moving on to the next image after each grid is filled. I don't remember how many images there are in total, so the count will most likely not be a multiple of 8.
I thought a loop over 'classes' would be useful, but with the current setup it would only show the first 8 images of each class rather than all of them. (I haven't yet replaced class = "0001" with a value taken from classSID, but that would be the idea.)
Maybe this is not viable in MATLAB, so I am also open to answers in Python.
Remember, the point is to split the images in the ZIP into ordered groups of 8, with no image repeated.

Ok, so it was a shuffling problem that has already been solved via this code:
filepath = 'C:\portrait'; % full file path (not mine)
filelist = dir(fullfile(filepath, '**', '*.png')); % recursive search for PNGs
filelist = filelist(~[filelist.isdir]);
files = fullfile({filelist.folder}, {filelist.name}); % cell array of full paths
for k = 1:8:numel(files)
    last = min(k + 7, numel(files)); % the final group may hold fewer than 8 images
    fig = imtile(files(k:last), 'GridSize', [2 4], 'ThumbnailSize', [144 144]);
    imshow(fig)
    imwrite(fig, sprintf('Photosfrom%dto%d.png', k, last)) % e.g. Photosfrom1to8.png
end
So the problem was really about processing data that happen to be images, not about image processing itself, which was already solved.
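For reference, the chunking step on its own can be sketched as below. This is only an illustration: the `files` list is a made-up stand-in for `PortraitIMDS.Files` (or for the paths returned by `dir`), and the variable names `groupSize`, `numGroups` and `groups` are not from the original code. The last group simply ends up shorter when the total is not a multiple of 8.

```matlab
% Hypothetical stand-in for PortraitIMDS.Files: 21 made-up file names.
files = arrayfun(@(i) sprintf('img%03d.png', i), (1:21)', 'UniformOutput', false);

groupSize = 8;
numGroups = ceil(numel(files) / groupSize);
groups = cell(numGroups, 1);
for g = 1:numGroups
    first = (g - 1) * groupSize + 1;
    last = min(g * groupSize, numel(files)); % clamp the final, partial group
    groups{g} = files(first:last);
    % With real image files, each group could then be tiled, for example:
    % imshow(imtile(groups{g}, 'GridSize', [2 4], 'ThumbnailSize', [144 144]));
end
```

With 21 names this yields three groups of 8, 8 and 5 entries, in the original order.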

Related

Manipulating large sets of data in Matlab, asking for advice on a few things, cells and numeric array operations, with performance in mind

This is a cross-post from here:
Link to post in the Mathworks community
Currently I'm working with large data sets. I've saved them as MATLAB files, the two biggest being 9.5GB and 5.9GB.
These files each contain a 1x8 cell array (this is done for addressability and to prevent mixing up data from each of the 8 cells, and I specifically wanted to avoid eval).
Each cell then contains a 3D double matrix: for one it's 1001x2002x201 and for the other it is 2003x1001x201 (when I process it I chop off 1 row at the end to get it to 2002).
I'm already running my script on a server (64 cores and plenty of RAM; MATLAB crashed on my laptop, as I need more than 12GB of RAM on Windows). Nonetheless it still takes several hours to finish, and I still need to do some extra operations on the data, which is why I'm asking for advice.
For some of the large cell arrays, I need to find the maximum value over all 8 cells. Normally I would run a for loop to get the maximum of each cell, store each value in a temporary numeric array, and then call max again. This will work for sure; I'm just wondering if there's a better, more efficient way.
After I find the maximum I need to do a manipulation over all this data as well, normally I would do something like this for an array:
B=A./maxvaluefound;
A(B > a) = A(B > a)*constant;
Now I could put this in a for loop, address each cell and run this, but I'm not sure how efficient that would be. Do you think there's a better way than a for loop that's not extremely complicated/difficult to implement?
There's one more thing I need to do which is really important. Each cell, as I said before, is a slice (consider it time), while inside each slice is the value for a 3D matrix/plot. I need to interpolate the data so that I get more slices, because I need to create slices/frames/plots for a movie/gif. I'm planning on plotting the 3D data using scatter3, with the data represented by color, and on using alpha values to make it see-through so that one can actually see the intensity in this 3D plot. I understand how to use griddata, but apparently it's quite slow, and some of the other methods were hard to understand. So what would be the best way to interpolate these (time) slices efficiently over the different cells in the cell array? Please explain it if you can, preferably with an example.
I'm running it on a Linux server; note that I unfortunately cannot update the MATLAB version, which is R2016a.
I've also attached part of my code to give a better idea of what I'm doing:
if (or(L03==2,L04==2)) % check if this section needs to be executed based on parameters set at top of file
load('../loadfilewithpathnameonmypc.mat')
E_field_650nm_intAll=cell(1,8); %create empty cell array
parfor ee=1:8 %run for loop for cell array, changed this to a parfor to increase speed by approximately 8x
E_field_650nm_intAll{ee}=nan(szxit(1),szxit(2),xres); %create nan-filled matrix in cell 1-8
for qq=1:2:xres
tt=(qq+1)/2; %consecutive number instead of spacing 2
T1=griddata(Xsall{ee},Ysall{ee},EfieldsAll{ee}(:,:,qq)',XIT,ZIT,'natural'); %change data on non-uniform grid to uniform gridded data
E_field_650nm_intAll{ee}(:,:,tt)=T1; %fill up each cell with uniform data
end
end
clear T1
clear qq tt
clear ee
save('../savelargefile.mat', 'E_field_650nm_intAll', '-v7.3')
end
if (L05==2) % check if this section needs to be executed based on parameters set at top of file
if ~exist('E_field_650nm_intAll','var') % if variable not in workspace load it
load('../loadanotherfilewithpathnameonmypc.mat');
end
parfor tt=1:8 %run for loop for cell array, changed this to a parfor to increase speed by approximately 8x
CFxLight{tt}=nan(szxit(1),szxit(2),xres); %create nan-filled matrix in cells 1 to 8
for qq=1:xres
CFs=Cafluo3D{tt}(1:lxq2,:,qq)'; %get matrix slice and transpose matrix for point-wise multiplication
CFxLight{tt}(:,:,qq)=CFs.*E_field_650nm_intAll{tt}(:,:,qq); %point-wise multiply the two large matrices for each cell and put in new cell array
end
end
clear CFs
clear qq tt
save('../saveanotherlargefile.mat', 'CFxLight', '-v7.3')
end
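As an illustration of the maximum-and-rescale step described in the question, here is a minimal sketch. The cell array `A`, the threshold `a` and the factor `constant` are tiny made-up stand-ins (the real cells hold large 3D matrices), and a plain loop over the 8 cells is assumed to be acceptable, since the heavy work inside each iteration is already vectorised. It only uses functions that should be available in R2016a.

```matlab
% Hypothetical small stand-in for the 1x8 cell array of 3D matrices.
A = {[1 2; 3 4], [5 6; 7 8]};
a = 0.5;        % hypothetical threshold on the normalised values
constant = 10;  % hypothetical scaling factor

% Global maximum: one max per cell via cellfun, then the max of those.
maxvaluefound = max(cellfun(@(c) max(c(:)), A));

% Apply the thresholded scaling cell by cell with a logical mask.
for ee = 1:numel(A)
    mask = (A{ee} ./ maxvaluefound) > a;
    A{ee}(mask) = A{ee}(mask) * constant;
end
```

The same code works unchanged for 3D arrays, because `c(:)` and logical masks are independent of the number of dimensions.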

Applying (with as few loops as possible) a function to given elements/voxels (x,y,z) taken from subfields of multiple structs (nifti's) in MATLAB?

I have a dataset of n nifti (.nii) images. Ideally, I'd like to be able to get the value of the same voxel/element from each image, and apply a function to the n data points. I'd like to do this for each voxel/element across the whole image, so that I can reconvert the result back into .nii format.
I've used the Tools for NIfTI and ANALYZE image toolbox to load my images:
data(1)=load_nii('C:\file1.nii');
data(2)=load_nii('C:\file2.nii');
...
data(n)=load_nii('C:\filen.nii');
From which I obtain a struct object with each sub-field containing one loaded nifti. Each of these has a subfield 'img' corresponding to the image data I want to work on. The problem comes from trying to select a given xyz within each img field of data(1) to data(n). As I discovered, it isn't possible to select in this way:
data(:).img(x,y,z)
or
data(1:n).img(x,y,z)
because matlab doesn't support it. The contents of the first brackets have to be scalar for the call to work. The solution from googling around seems to be a loop that creates a temporary variable:
for z = 1:nz
for x = 1:nx
for y = 1:ny
for i=1:n;
points(i)=data(i).img(x,y,z);
end
[p1(x,y,z,:),~,p2(x,y,z)] = fit_data(a,points,b);
end
end
end
which works, but takes too long (several days) for a single set of images given the size of nx, ny, nz (several hundred each).
I've been looking for a way to speed up the code, which I believe depends on removing those loops by vectorisation, preselecting the img fields (via getfield?) and concatenating them, and applying something like arrayfun/cellfun/structfun, but I'm frankly a bit lost on how to do it. I can only think of ways to pre-select which themselves require loops, which seems to defeat the purpose of the exercise (though a solution with fewer loops, or at least fewer nested loops, might do it), or which run into the same problem that calls like data(:).img(x,y,z) don't work. Googling around again throws up ways to select and concatenate fields within a struct, or a given field across multiple structs, but I can't find anything for my problem: selecting an element from a non-scalar sub-field in a sub-struct of a struct object (with the minimum of loops). Finally, I need the output to be in the form of a matrix that the toolbox above can turn back into a nifti.
Any and all suggestions, clues, hints and help greatly appreciated!
You can concatenate images as a 4D array and use linear indexes to speed up calculations:
img = cat(4,data.img);
p1 = zeros(nx,ny,nz,n);
p2 = zeros(nx,ny,nz);
sz = ny*nx*nz;
for k = 1 : sz
points = img(k:sz:end);
[p1(k:sz:end),~,p2(k)] = fit_data(a,points,b);
end
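An equivalent way to see the linear-indexing trick is to reshape the 4D array so that each row holds the n values of one voxel. The sketch below uses random data as a stand-in for `cat(4,data.img)` and `mean` as a placeholder for `fit_data`, whose signature is only known from the question; the sizes are made up.

```matlab
% Hypothetical small stand-in for cat(4, data.img): 3x4x5 voxels, n = 6 images.
nx = 3; ny = 4; nz = 5; n = 6;
img = rand(nx, ny, nz, n);

vox = reshape(img, [], n);        % one row per voxel, one column per image
result = zeros(nx * ny * nz, 1);
for k = 1:size(vox, 1)
    points = vox(k, :);           % same values as img(k:sz:end) in the answer above
    result(k) = mean(points);     % placeholder for the per-voxel function
end
result = reshape(result, nx, ny, nz);  % back to image shape
```

The reshape does not copy or reorder the data, so row k of `vox` corresponds to linear voxel index k in each image.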

Faster way to load .csv files in folder and display them using imshow in MATLAB

I have a piece of MATLAB code that works fine, but I would like to know whether there is a faster way of performing the same task. Each .csv file is a 768x768 matrix.
Current code:
for k = 1:143
matFileName = sprintf('ang_thresholded%d.csv', k);
matData = load(matFileName);
imshow(matData)
end
Any help in this regard will be very helpful. Thank You!
In general, it's better to separate the loading, computational and graphical stuff.
If you have enough memory, you should try to change your code to:
n_files=143;
% If you know the size of your images a priori:
matData=zeros(768,768,n_files); % preallocate for speed.
for k = 1:n_files
matFileName = sprintf('ang_thresholded%d.csv', k);
matData(:,:,k) = load(matFileName);
end
seconds=0.01;
for k=1:n_files
%clf; %Not needed in your case, but needed if you want to plot more than one thing (hold on)
imshow(matData(:,:,k));
pause(seconds); % control "framerate"
end
Note the use of pause().
Here is another option using Matlab's data stores which are designed to work with large datasets or lots of smaller sets. The TabularTextDatastore is specifically for this kind of text based data.
Something like the following. However, note that since I don't have any test files it is a somewhat notional example ...
ttds = tabularTextDatastore('.\yourDirPath\*.csv','ReadSize','file','ReadVariableNames',false); %Create the data store; read one whole file per call, no header row
while ttds.hasdata %This turns false after reading the last file.
temp = read(ttds); %Returns a Matlab table class
imshow(temp.Variables)
end
Since it looks like your filenames' numbering is not zero-padded (e.g. 1 instead of 001), the file order might get messed up, so that may need to be addressed as well. Anyway, I thought this might be a good alternative approach worth considering, depending on what else you want to do with the data and how much of it there might be.
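One possible way to deal with the ordering issue is to extract the number from each file name and sort on it. This is only a sketch: the `names` list is made up, and the `ang_thresholded%d.csv` pattern is taken from the question.

```matlab
% Hypothetical file names in the order a plain alphabetical listing might give.
names = {'ang_thresholded1.csv', 'ang_thresholded10.csv', 'ang_thresholded2.csv'};

% Pull out the numeric part of each name, then sort numerically.
nums = cellfun(@(s) sscanf(s, 'ang_thresholded%d.csv'), names);
[~, order] = sort(nums);
sorted = names(order);
```

The `sorted` list could then be used to drive the loading loop, or to reorder the files of a datastore.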

Average of values from multiple matrices in Matlab

I have 50 matrices contained in one folder, all of dimension 181 x 360. How do I cycle through that folder and take the average of each corresponding data point across all 50 matrices?
If the matrices are contained within Matlab variables stored using save('filename','VariableName') then they can be opened using load('filename.mat').
As such, you can use the result of filesInDirectory = dir; to get a list of all your files, using a search pattern if appropriate, like files = dir('*.mat');
Next you can use your load command, and then whos to see which variables were loaded. You should consider storing these names to make clearing easier after each iteration of your loop.
Once you have your matrix loaded (one at a time), you can take averages as you need, probably summing a value across multiple loop iterations, then dividing by a total counter you've been measuring (using perhaps count = count + size(MatrixVar, dimension);).
If you need all of the matrices loaded at once, then you can modify the above idea, to load using a loop, then average outside of the loop. In this case, you may need to take care - but 50*181*360 isn't too bad I suspect.
A brief introduction to the load command can be found at this link. It talks mainly about opening one matrix, then plotting the values, but there are some comments about dealing with headers, if needed, and different ways in which you can open data, if load is insufficient. It doesn't talk about binary files, though.
Note on binary files, based on comment to OP's question:
If the file can be opened using
FID = fopen('filename.dat');
fread(FID, 'float');
then you can replace the steps referring to load above, and instead use a loop to find filenames using dir, open the matrices using fopen and fread, then average as needed, finally closing the files and clearing the matrices.
In this case, probably your file identifier is the only part you're likely to need to change during the loop (although your total will increase, if that's how you want to average your data)
Reshaping the matrix, or inverting it, might make the code clearer (which is good!), but might not be necessary depending on what you're trying to average - it may be that selecting only a subsection of the matrix is sufficient.
Possible example code?
Assuming that all of the files in the current directory are to be opened, and that no files are elsewhere, you could try something like:
listOfFiles = dir('*.dat');
Total = 0;   % running sum, initialised before the loop
Counter = 0; % running count of values read
for f = 1:numel(listOfFiles)
    FID = fopen(listOfFiles(f).name);
    Data = fread(FID, 'float'); % returns a column vector
    fclose(FID);
    % Reshape if needed, e.g. Data = reshape(Data, 181, 360);
    Total = Total + sum(Data(:)); % This might vary, depending on what you want to average etc.
    Counter = Counter + numel(Data); % 181*360 values per file, in this case
end
Av = Total/Counter;
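If what is wanted is the average of each corresponding data point (a 181 x 360 result rather than a single number), a running sum of whole matrices is one option. In this sketch the matrices are generated in memory as made-up stand-ins; in practice each one would come from load or fread inside the loop.

```matlab
% Hypothetical stand-in data: 50 matrices of size 181 x 360.
numMats = 50;
mats = cell(1, numMats);
for f = 1:numMats
    mats{f} = f * ones(181, 360);  % in practice: loaded from file f
end

% Element-wise average: accumulate whole matrices, then divide once.
runningSum = zeros(181, 360);
for f = 1:numMats
    runningSum = runningSum + mats{f};
end
avgMatrix = runningSum / numMats;
```

Only one matrix plus the running sum needs to be in memory at a time if the loading happens inside the second loop.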

Create a circular buffer for image acquisition

I'm new to programming with MATLAB and am trying to do the following:
I continuously capture an image (size 1024x1024) with a camera to have a live image using the getdata function.
To do a measurement I would like to store only 100 images using a circular buffer- more precisely I'm thinking of storing 100 images and erasing the oldest images if new data is acquired and do a measurement on the last 100 images.
Hope my concern is understandable...
Thanks for an answer!
This question has been answered here by a MathWorks employee: Create a buffer matrix for continuous measurements. (He also made a video of it: http://blogs.mathworks.com/videos/2009/05/08/implementing-a-simple-circular-buffer/)
The code :
buffSize = 10;
circBuff = nan(1,buffSize);
for newest = 1:1000;
circBuff = [newest circBuff(1:end-1)]
end
Check the update made by gnovice which applies the circular buffer to image processing.
What you call a "circular buffer" is known as a queue or FIFO (First In, First Out). Usually this would be stored in a linked list data structure, where every object (matrix, in your case) points to the next object. In MATLAB, however, there is no built-in linked list structure, but MATLAB arrays (vectors/matrices) are pretty flexible and efficient when it comes to manipulating them.
So you can simply store each image as a matrix inside an array of length 100, giving you a 3 dimensional matrix of dimensions 100x1024x1024. Then, when you get new data you simply remove the last matrix from the array and insert a new matrix at the beginning of the array. Hopefully this will be fast enough for you.
Good luck!
Maybe you can create an array of 100 1024x1024 matrices and refer to the following link to maintain the read and write positions.
logic of circular buffer
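A write-position approach could look roughly like the sketch below. It is an illustration under assumptions: the sizes are shrunk (for the question they would be 1024, 1024 and 100), the frames are synthetic stand-ins for what getdata would return, and the names `writePos`, `numStored` and `order` are made up. Images are stacked along the third dimension here so that each new frame overwrites one slot instead of shifting the whole array.

```matlab
% Hypothetical small sizes; for the question: imgH = 1024, imgW = 1024, buffSize = 100.
imgH = 4; imgW = 4; buffSize = 3;
circBuff = nan(imgH, imgW, buffSize);
writePos = 0;   % slot written most recently (0 = nothing written yet)
numStored = 0;  % number of valid frames currently in the buffer

for frameNo = 1:5
    newImg = frameNo * ones(imgH, imgW);      % stand-in for a captured frame
    writePos = mod(writePos, buffSize) + 1;   % advance and wrap around
    circBuff(:, :, writePos) = newImg;        % overwrite the oldest slot
    numStored = min(numStored + 1, buffSize);
end

% Slot indices ordered from oldest to newest frame.
order = mod(writePos - numStored + (0:numStored-1), buffSize) + 1;
oldestToNewest = squeeze(circBuff(1, 1, order))';
```

After five frames with three slots, the buffer holds frames 3, 4 and 5, and `circBuff(:, :, order)` returns them in acquisition order for the measurement.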