Import variables from file - matlab

Hi so I have a file called config.m that contains a list of variables along with some comments. I wanted to basically load that script up through another matlab script so that the variables would be recognized and used and could also be changed easily. Here is what my file with the variables looks like.
%~~~~~~~~~~~~~~~~~
%~~~[General]~~~~~
%~~~~~~~~~~~~~~~~~
%path to samtools executable
samtools_path = '/home/pubseq/BioSw/samtools/0.1.8/samtools';
%output_path should be to existing directory, script will then create tumour
%and normal folders and link the bam files inside respectively
output_path = '/projects/dmacmillanprj/testbams';
prefix = %prefix for output files
source_file = % from get_random_lines.pl, what is this?
% The window size
winSize = '200';
% Between 0 and 1, i.e. 0.7 for 70% tumour content
tumour_content = '1';
% Should be between 0 and 0.0001
gc_window = 0.005;
% Path to tumour bam file
sample_bam = '/projects/analysis/analysis5/HS2310/620GBAAXX_4/bwa/620GBAAXX_4_dupsFlagged.bam';
% Path to normal bam file
control_bam = '/projects/analysis/analysis5/HS2381/620GBAAXX_6/bwa/620GBAAXX_6_dupsFlagged.bam';
I have tried this:
load('configfile.m')
??? Error using ==> load
Number of columns on line 2 of ASCII file /home/you/CNV/branches/config_file/CopyNumber/configfile.m
must be the same as previous lines.

Just run the script config.m inside another script as
config
Remember config.m file should be in the working directory or in MATLAB path.
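If the file is in neither place, run also accepts a full path to the script. A minimal sketch, assuming a hypothetical demo_config.m written to a temporary folder (the file name and the values in it are made up for illustration, not taken from your config):

```matlab
% Create a throwaway folder holding a tiny config script (hypothetical content).
cfg_dir = tempname;
mkdir(cfg_dir);
fid = fopen(fullfile(cfg_dir, 'demo_config.m'), 'w');
fprintf(fid, '%% The window size\n');
fprintf(fid, 'winSize = 200;\n');
fprintf(fid, 'output_path = ''/tmp/demo_output'';\n');
fclose(fid);

% run executes the script by full path; its variables land in this workspace.
run(fullfile(cfg_dir, 'demo_config.m'));
disp(winSize)
disp(output_path)
```

The comment line inside the generated file is skipped because the file is executed as MATLAB code, which is exactly what load cannot do.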
However I would recommend you to create a function from this script and return a structure with all the parameters as fields. Then you will be more flexible in your main script since you can assign any name to this structure.
function param = config()
param.samtools_path = '/home/pubseq/BioSw/samtools/0.1.8/samtools';
param.output_path = '/projects/dmacmillanprj/testbams';
% ... define other parameters
In the main script:
P = config;
st_dir = P.samtools_path;
% ...etc...

Alternatively, you could define a class with constant properties in your config.m file:
classdef config
properties (Constant)
samtools_path = '/home/pubseq/BioSw/samtools/0.1.8/samtools';
output_path = '/projects/dmacmillanprj/testbams';
end
end
Thereby, you can access the class properties in another script:
config.samtools_path
config.output_path
To round it off, you could place your config.m file into a package (a + folder) and import it explicitly in your script. Assuming your package is called "foo" and the folder that contains "+foo" is on your MATLAB path, your script would look as follows:
import foo.config
foo.config.samtools_path
foo.config.output_path

load() is not suitable for files that contain text (even in the form of matlab comments.)
You should use textscan() or dlmread(), specifying to them that you want to skip two header lines or that you want to treat '%' as indicating a comment.

Related

How can I audioread all files in a folder

I have a folder with files named like this: "speaker1_001.wav". They go from speaker1_001 to speaker1_020. How can I write a for loop to "audioread" all the files and store the values in variables with different names?
This is what I got, but I only obtain one variable instead of 20.
mypath = fullfile('TrainVoices', 'speaker1');
for idx = 1:20
filename = fullfile(mypath, sprintf('speaker1_%d.wav', idx));
nameSpeaker = sprintf('speaker1_%d', idx);
[nameSpeaker, fs] = audioread(filename);
end
In your code, you try to dynamically create the name of the output variable nameSpeaker with the instruction nameSpeaker = sprintf('speaker1_%d', idx); in order to use it as the output variable in the call to audioread.
This is not correct, since you actually assign the string created with sprintf to the variable nameSpeaker rather than "change" its name. The call to audioread then overwrites nameSpeaker on every iteration, which is why you end up with only one variable.
Also, you have to handle the leading zeros included in the filename.
Apart from this error (which can be fixed), in general it is not good practice to use dynamically created variables.
A possible solution is to store the wav data in a struct, which allows you to create the field names dynamically.
Moreover, since according to the code you've posted you know in advance the path and the root name of the input files, you can create the file name by simply concatenating the different strings.
In the following you can find a possible implementation of the proposed solution.
The output will be a struct named nameSpeaker with a set of fields named speaker1_1, speaker1_2, speaker1_3, etc. after the name of the input file, in which, for simplicity, the leading zeros have been removed.
Each of these fields is a struct with the fields data and fs, containing the data of the wav file.
For example:
the data of speaker1_001.wav are stored in the struct
nameSpeaker.speaker1_1.data
nameSpeaker.speaker1_1.fs
the data of speaker1_002.wav are stored in the struct
nameSpeaker.speaker1_2.data
nameSpeaker.speaker1_2.fs
and so on.
% Define the path
mypath = fullfile('TrainVoices', 'speaker1');
% Define the file root name
f_root_name = 'speaker1_';
% Define the extension of the input files
ext = '.wav';
% Loop over the input files
for idx = 1:20
    % Add the proper number of "0" to the filename
    if (idx <= 9)
        f_name = [f_root_name '00' num2str(idx)];
    else
        f_name = [f_root_name '0' num2str(idx)];
    end
    % Build the filename
    filename = fullfile(mypath, [f_name ext]);
    % Read the wav file
    [data, fs] = audioread(filename);
    % Store the wav file data in a struct
    nameSpeaker.([f_root_name num2str(idx)]).data = data;
    nameSpeaker.([f_root_name num2str(idx)]).fs = fs;
end
You can access the data by simply specifying the "idx" of the file.
For example, to access the data of speaker1_003.wav, you can define the file "idx" and then build the names of the fields accordingly:
file_idx = 3;
data = nameSpeaker.([f_root_name num2str(file_idx)]).data;
fs = nameSpeaker.([f_root_name num2str(file_idx)]).fs;
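As a variation on the same idea, the zero padding can be left to sprintf with the %03d format, and the results can go into a struct array indexed by number instead of into dynamically named fields. This is only a sketch: to keep it self-contained it first writes three short silent wav files into a temporary folder (the folder, the 8000 Hz rate and the file count are assumptions for the demo), then reads them back.

```matlab
% Demo setup (hypothetical data): three short silent wav files in a temp folder.
demo_dir = tempname;
mkdir(demo_dir);
n_files = 3;
fs_demo = 8000;
for idx = 1:n_files
    demo_name = fullfile(demo_dir, sprintf('speaker1_%03d.wav', idx));
    audiowrite(demo_name, zeros(fs_demo / 10, 1), fs_demo);
end

% Preallocate a 1-by-n struct array with empty data and fs fields.
speaker = struct('data', cell(1, n_files), 'fs', cell(1, n_files));
for idx = 1:n_files
    % %03d pads the index with leading zeros: 1 -> 001, 12 -> 012.
    filename = fullfile(demo_dir, sprintf('speaker1_%03d.wav', idx));
    [speaker(idx).data, speaker(idx).fs] = audioread(filename);
end

disp(numel(speaker))
disp(speaker(2).fs)
```

With this layout, the samples of file number k are simply speaker(k).data, so no field name has to be rebuilt from strings.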

How can I run a MATLAB script on .csv files in two separate folders at the same time?

So I have an iterative loop that extracts data from .csv files in MATLAB's active folder and plots it. I would like to take it one step further and run the script on two folders, each with their own .csv files.
One folder is called stress and the other strain. As the name implies, they contain .csv files for stress and strain data for several samples, each of which is called E3-01, E3-02, E3-03, etc. In other words, both folders have the same number of files and the same names.
The way I see it, the process would have the following steps:
1. Look in the stress folder, look inside file E3-01, extract the data in the column labelled Stress
2. Look in the strain folder, look inside file E3-01, extract the data in the column labelled Strain
3. Combine the data together for sample E3-01 and plot it
4. Repeat steps 1-3 for all files in the folders
Like I said, I already have a script that can find the right column and extract the data. What I'm not sure about is how to tell MATLAB to alternate the folder that the script is being run on.
Instead of a script, would a function be better? Something that accepts 4 inputs: the names of the two folders and the columns to extract?
EDIT: Apologies, here's the code I have so far:
clearvars;
files = dir('*.csv');
prompt = {'Plot name:','x label:','y label:','x values:','y values:','Points to eliminate:'};
dlg_title = 'Input';
num_lines = 1;
defaultans = {'Title','x label','y label','Surface component 1.avg(epsY) [True strain]','Stress','0'};
answer = inputdlg(prompt,dlg_title,num_lines,defaultans);
name_plot = answer{1};
x_label = answer{2};
y_label = answer{3};
x_col = answer{4};
y_col = answer{5};
des_cols = {y_col,x_col};
smallest_n = 100000;
points_elim = str2double(answer{6}); % inputdlg returns text, so convert it to a number
avg_x_values = [];
avg_y_values = [];
for file = files'
M=xlsread(file.name);
[row,col]=size(M);
if smallest_n > row
smallest_n = row;
end
end
smallest_n=smallest_n-points_elim;
avg_x_values = zeros(smallest_n,size(files,1));
avg_y_values = zeros(smallest_n,size(files,1));
hold on;
set(groot, 'DefaultLegendInterpreter', 'none');
set(gca,'FontSize',20);
ii = 0;
for file = files'
ii = ii + 1;
[n,s,r] = xlsread(file.name);
colhdrs = s(1,:);
[row, col] = find(strcmpi(s,x_col));
x_values = n(1:end-points_elim,col);
[row, col] = find(strcmpi(s,y_col));
y_values = n(1:end-points_elim,col);
plot(x_values,y_values,'DisplayName',s{1,1});
avg_x_values(:,ii)=x_values(1:smallest_n);
avg_y_values(:,ii)=y_values(1:smallest_n);
end
ylabel({y_label});
xlabel({x_label});
title({name_plot});
colormap(gray);
hold off;
avg_x_values = mean(avg_x_values,2);
avg_y_values = mean(avg_y_values,2);
plot(avg_x_values,avg_y_values);
set(gca,'FontSize',20);
ylabel({y_label});
xlabel({x_label});
title({name_plot});
EDIT 2: @Adriaan I tried to write the following function to get a column from a file:
function [out_col] = getcolumn(col,file)
file = dir(file);
[n,s,r] = xlsread(file.name);
colhdrs = s(1,:);
[row, col] = find(strcmpi(s,col));
out_col = n(1:end,col);
end
but I get the error
Function 'subsindex' is not defined for values of class 'struct'.
Error in getcolumn (line 21)
y = x(:,n);
not sure why.
You can do both, of course, and it depends on preference mainly, provided you're the sole user of the script. If others are going to use it as well, use functions instead, as they can contain a proper help file and calling help functionname will then give you that help.
For instance:
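folders1 = dir('../strain/*.csv');
folders2 = dir('../stress/*.csv');
for ii = 1:numel(folders1)
    % dir returns a struct array, so index with () and use the name field
    operand1 = fullfile('../strain', folders1(ii).name);
    operand2 = fullfile('../stress', folders2(ii).name);
    %... rest of script
    %
    % Or function:
    data = YourFunction(operand1, operand2);
end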
So all in all you can use both, although from experience I find functions easier to use in the end, as you just pass parameters and don't need to trawl through the complete code to change the parameters each run.
Additionally, you can partition off small parts of your program which perform a fixed task. If you nest your functions, and finally call just a single function in your scripts, you don't have to look at hundreds of lines of code each time you run the script, but rather can just run a single function (which can also be inside a script or function, ad infinitum).
Finally, a function has its own scope, meaning that any variables that are in that function stay within that function unless you explicitly set them as output (apart from global variables, but those are problematic anyway). This can be a good thing or a bad thing, depending on the rest of your code. If your function would output ~20 variables for further processing, the function probably should include more steps. It'd be a good thing if you create lots of intermediate variables (I always do), because when the function has finished running, the scope of that function will be removed from memory, saving you clear tmpVar1 tmpVar2 tmpVar3 etc. every few lines in your script.
For the script, the argument in favour would be that it is easier to debug; you don't need dbstop on error and can step a bit more easily through the script, keeping track of all your variables. But after the debugging has been completed, this argument becomes moot, and thus in general I'd start with writing a script, and once it performs as desired, I rework it into a function at minimal extra effort.
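To make the pairing of the two folders concrete, here is a rough sketch that lists the files in one folder and opens the file with the same name in the other. It uses readtable rather than xlsread, which is my substitution and not what the question uses, and it generates two small demo files per folder so that it runs on its own. The temporary folders and the numbers in them are assumptions; only the E3-01 style names and the Stress and Strain column labels come from the question.

```matlab
% Demo setup (hypothetical data): stress/ and strain/ folders with matching files.
root = tempname;
stress_dir = fullfile(root, 'stress');
strain_dir = fullfile(root, 'strain');
mkdir(stress_dir);
mkdir(strain_dir);
for k = 1:2
    fname = sprintf('E3-%02d.csv', k);
    writetable(table((1:5)' * k, 'VariableNames', {'Stress'}), fullfile(stress_dir, fname));
    writetable(table((1:5)' * 2, 'VariableNames', {'Strain'}), fullfile(strain_dir, fname));
end

% List one folder, then read the file with the same name from both folders.
files = dir(fullfile(stress_dir, '*.csv'));
samples = struct('name', cell(1, numel(files)), 'stress', [], 'strain', []);
for k = 1:numel(files)
    fname = files(k).name;
    S = readtable(fullfile(stress_dir, fname));
    E = readtable(fullfile(strain_dir, fname));
    samples(k).name = fname;
    samples(k).stress = S.Stress;   % column labelled Stress
    samples(k).strain = E.Strain;   % column labelled Strain
    % plot(samples(k).strain, samples(k).stress) would draw this sample's curve
end
disp({samples.name})
```

Because the loop is driven by file name rather than by position, it does not matter which folder MATLAB is currently in, and wrapping the loop body in a function that takes the two folder names and the two column labels is a small step from here.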

Saving data to .mat file in MATLAB

I'm new to MATLAB, and I can't manage to make my function work in order to save my data into a .mat file.
The input:
A structure, with 5 fields:
data: 3D matrix of 19x1000x143
labels: 1x143 matrix with 1 or -1 in it
subject_number: an integer
sampling_rate: an integer, 500 Hz
channel_names: 1x19 matrix with text in it
name: a string for the name of the file
clean: a matrix 1x143 with 1 or 0 in it.
The idea is to save only the clean data, marked as 1 in the clean matrix.
If clean(i) is equal to 1:
save data(:,:,i) and labels(:,i)
This is the code I've tried to implement in the saving.m file:
function saving(EEG_struct, clean, name)
subject_number = EEG_struct.subject_number;
fs = EEG_struct.sampling_rate;
chan_names = EEG_struct.channel_names;
nb_epoch = size(EEG_struct.data, 3);
for j=1:nb_epoch
if clean(j) == 1
% Keep the epoch and label
data = cat(3, data, EEG_struct.data(:,:,j));
labels = cat(2, labels, EEG_struct.labels(:,j));
end
end
save(name, data, labels, subject_number, fs, chan_names)
As you can see, I would like to save the data as a structure with the same shape as the EEG_struct input.
Moreover, I would like to use a parfor instead of a for, but it raised me an error I didn't quite get:
An UndefinedFunction error was thrown on the workers for 'data'. This might be because the file containing 'data' is not accessible on the workers. Use addAttachedFiles(pool, files) to specify the required files to be attached. See the documentation for 'parallel.Pool/addAttachedFiles' for more details. Caused by: Undefined function or variable 'data'.
Thanks for the help !
You can use your clean variable as a logical index and parse out your data and labels at once. So there is no need for a loop.
Also the save command needs the "names" of the vars to save not the variables themselves. So I just put ' ' around each one.
function saving(EEG_struct, clean, name)
subject_number = EEG_struct.subject_number;
fs = EEG_struct.sampling_rate;
chan_names = EEG_struct.channel_names;
nb_epoch = size(EEG_struct.data, 3);
%No need for a loop at all
data = EEG_struct.data(:,:,logical(clean));
labels = EEG_struct.labels(logical(clean)); %This is a 1xN so I removed the extra colon operator
save(name, 'data', 'labels', 'subject_number', 'fs', 'chan_names');
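To see what the logical index does, here is a small self-contained sketch on made-up data (4 epochs instead of 143, random values, and a temporary file name are all assumptions for the demo):

```matlab
% Hypothetical miniature of the input: 19 channels, 10 samples, 4 epochs.
EEG_demo.data = rand(19, 10, 4);
EEG_demo.labels = [1 -1 1 -1];
clean = [1 0 1 1];

% Convert the 0/1 vector to logical so it selects epochs instead of indexing by position.
keep = logical(clean);
data = EEG_demo.data(:, :, keep);
labels = EEG_demo.labels(keep);

% save takes the variable names as strings.
out_file = [tempname '.mat'];
save(out_file, 'data', 'labels');

loaded = load(out_file);
disp(size(loaded.data))
disp(loaded.labels)
```

The second epoch is dropped from both data and labels in one step, so the two stay aligned without a loop.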
EDIT:
Per your comment, if you want to just leave everything in the structure, here are 2 options for how to save it.
function saving(EEG_struct, clean, name)
% Crop out the non-clean data
EEG_struct.data = EEG_struct.data(:,:,logical(clean));
EEG_struct.labels = EEG_struct.labels(logical(clean)); %This is a 1xN so I removed the extra colon operator
% Option 1
save(name, 'EEG_struct');
% Option2
save(name, '-struct', 'EEG_struct');
Option 1 will directly save the struct to the MAT file. So if you were to load the data back like this:
test = load(name);
test =
EEG_struct: [1x1 struct]
You would get your structure placed inside another structure, which might not be ideal or might require an extra line to de-nest it. On the other hand, just loading the MAT file with no outputs, load(name), would put EEG_struct into your current workspace. But if this happens inside a function, the variable sort of springs into existence without ever being declared, which makes the code a bit harder to follow.
Option 2 uses the '-struct' option which breaks out each field automatically into separate vars in the MAT file. So loading like this:
EEG_struct = load(name);
Will put all the fields back together again. To me at least this looks cleaner when done within a function, but that is probably just my preference.
So comment out whichever you don't want. Also, note that I did not include clean in the save. You could either append it to the MAT file or add it to your structure.
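A short sketch of Option 2 together with appending clean afterwards, on a made-up two-field struct (the field values and the temporary file name are assumptions for the demo):

```matlab
% Hypothetical small struct standing in for EEG_struct.
EEG_demo.subject_number = 7;
EEG_demo.sampling_rate = 500;
clean = [1 0 1];

mat_file = [tempname '.mat'];
save(mat_file, '-struct', 'EEG_demo');   % one variable per field
save(mat_file, 'clean', '-append');      % add clean next to them

% whos -file lists what is stored without loading it.
stored = whos('-file', mat_file);
disp({stored.name})

% Loading with an output collects every stored variable into one struct.
restored = load(mat_file);
disp(restored.sampling_rate)
```

Note that after the append, restored also carries a clean field, since load gathers every variable in the file.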
To get a structure the same as EEG_struct but with only the data/labels corresponding with the clean variable, you can simply make a copy of the existing structure and remove the rows where clean=0
function saving(EEG_struct, clean, name)
newstruct = EEG_struct;
newstruct.data(:,:,~logical(clean)) = [];
newstruct.labels(~logical(clean)) = [];
save(name,'newstruct');

Getting absolute file path from function handle

Is there possibility to retrieve the absolute path to the file containing a function represented by a function handle? For example:
%child folder containing test_fun.m file
handle = @test_fun
cd ..
%root folder - test_fun not available
path = GETPATHFROMHANDLE(handle)
Is there an equivalent of the GETPATHFROMHANDLE function in MATLAB? It seems to be simple functionality, but I can't work it out. I know about the func2str and which functions, but they don't work in this case.
Function handles (i.e. objects of class function_handle) have a method called functions, which will return information about the handle, including the full path of the associated file:
>> h = @bar;
>> fs = functions(h)
fs =
function: 'bar'
type: 'simple'
file: 'C:\Program Files\MATLAB\R2013b\toolbox\matlab\specgraph\bar.m'
>> fs.file
ans =
C:\Program Files\MATLAB\R2013b\toolbox\matlab\specgraph\bar.m
Since the output of functions is a struct, this can be done in a single command with getfield:
>> fName = getfield(functions(h),'file')
fName =
C:\Program Files\MATLAB\R2013b\toolbox\matlab\specgraph\bar.m
However, you can use func2str and which to get the file name if you string them together:
>> h = @bar;
>> fName = which(func2str(h))
fName =
C:\Program Files\MATLAB\R2013b\toolbox\matlab\specgraph\bar.m
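The difference between the two approaches matters for the situation in the question: functions reads the file recorded in the handle, while which(func2str(h)) looks the name up again on the current path, so it can come back empty once the folder is no longer reachable. A small sketch using bar as above, plus an anonymous function for contrast (the anonymous handle g is only an illustration):

```matlab
% Handle to a function that lives in an .m file.
h = @bar;
info = functions(h);
disp(info.type)   % kind of handle
disp(info.file)   % full path of the file behind the handle

% An anonymous function has no function name for which to look up.
g = @(x) x + 1;
ginfo = functions(g);
disp(ginfo.type)
```

As far as I know, the MATLAB documentation describes functions as a querying and debugging aid, so it is probably wise not to build too much logic on the exact layout of the struct it returns.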

MATLAB Reading several images from folder

I have a folder called BasePics within a folder called Images. Inside BasePics there are 30 JPEG images. I'm wondering if the following is possible: Can a script be written that reads all of these images using the imread() command. The names of the images are somewhat sequential: C1A_Base.jpg, C1B_Base.jpg, C1C_Base.jpg, C2A_Base.jpg, C2B_Base.jpg, C2C_Base.jpg, etc.... all the way up to C10C_Base.jpg
Can a loop be used somehow:
file = dir('Images\BasePics');
NF = length(file);
for k = 1:NF
images(k) = imread(fullfile('ImagesBasePics',file(k))
imagesc(images(k))
end
This is a rough idea of what I want to do, but I'm wondering if it can be done with the current naming format I have in the Images folder. I would also like to have each image being read be its own variable with the same or similar name as it is named in the folder Images\BasePics currently, rather than have an concatenated array of 30 images all under the one variable images. I would like to have 30 separate variables, with names such as A1, A2,A3,B1,B2,B3 etc...
Also when I just ask for:
dir images\BasePics
MATLAB outputs 33 files instead of 30. There are two extra entries at the beginning of the listing, '.' and '..', and one at the end, 'Thumbs.db'. These do not show up when I look at the folder separately. Is there a way to programmatically have MATLAB skip over these?
Thanks!!
Since you know the names of the files in advance, you can skip the dir and go ahead and read the files:
for l = 'ABC'
for n=1:10
nm = sprintf('C%d%c_Base.jpg', n, l );
fnm = sprintf('%c%d', l, n );
imgs.(fnm) = imread( fullfile('Images','BasePics', nm ) );
end
end
Now you have a struct imgs with fields A1...C10 for each image.
You are very close. I would just use dir('Images\BasePics\*.jpg') to get rid of the extraneous files.
The naming system you want will not lend itself to additional batch processing (do you really want to type all of A1, A2, etc?). I would either keep it sequential and store a list of the filenames to match, or use a struct with one field per image, like images.C1A_Base, etc.
dirlist = dir('Images\BasePics\*.jpg');
for k = 1:length(dirlist)
    fname = dirlist(k).name;
    [~,name] = fileparts(fname); % separate out base name of file
    images.(name) = imread(fullfile('Images\BasePics', fname));
end
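If you would rather list the whole folder and filter afterwards, the isdir field removes '.' and '..', and a check on the extension removes files such as Thumbs.db. A minimal sketch, assuming a temporary folder with two tiny generated JPEGs and one stray file (all made up for the demo):

```matlab
% Demo setup (hypothetical data): two tiny JPEGs and one non-image file.
pic_dir = tempname;
mkdir(pic_dir);
imwrite(uint8(zeros(4, 4, 3)), fullfile(pic_dir, 'C1A_Base.jpg'));
imwrite(uint8(255 * ones(4, 4, 3)), fullfile(pic_dir, 'C1B_Base.jpg'));
fid = fopen(fullfile(pic_dir, 'Thumbs.db'), 'w');
fclose(fid);

% List everything, then drop folders ('.' and '..') and non-JPEG files.
listing = dir(pic_dir);
listing = listing(~[listing.isdir]);
[~, ~, exts] = cellfun(@fileparts, {listing.name}, 'UniformOutput', false);
listing = listing(strcmpi(exts, '.jpg'));

% Read each remaining file into a field named after the file.
images = struct();
for k = 1:numel(listing)
    [~, name] = fileparts(listing(k).name);
    images.(name) = imread(fullfile(pic_dir, listing(k).name));
end
disp(fieldnames(images))
```

This only works as long as the base names are valid field names, which holds for names like C1A_Base.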