I get the following error when I try to run this code: "Attempted to access id(90); index out of bounds because numel(id)=89. Error in Untitled66 (line 26) person = find(id(fileNum)==ids);". Can someone help me spot the error?
% File Names reading and label generation
dataFolder= 'allcontent/';
fileNames = dir([dataFolder 'c*.*']);
lbl = sscanf(cat(1,fileNames.name)','co2%c%d.rd_%d.mat');
status = lbl(1:3:end);
id = lbl(2:3:end);
ids = unique(id);
trial = lbl(3:3:end);
%% File reading and Data Generation
%data = 256*channel*trial*stimulus*id
trData = zeros(256,64,10,3,20,'single');
label = zeros(10,3,20,'single');
trials = ones(3,20);
for fileNum = 1:numel(fileNames)
fin = fopen([dataFolder fileNames(fileNum).name]);
for i=1:4
line= fgetl(fin);
end
a= sscanf(line,'%S%d %s , trial %d');
stimulus = (3-numel(a));
person = find(id(fileNum)==ids);
trialNum = trials(stimulus, person);
label (trialNum, stimulus, person) = status(fileNum);
fprintf('%d %d %d\n', person,trialNum, stimulus);
for ch=1: 64
fgetl(fin);
curData = textscan(fin,'%d %s %d %f');
trData(:,ch,trialNum,stimulus,person) = curData{4};
end
The loop for fileNum = 1:numel(fileNames) iterates over all files, but you don't have an id for each file, because id = lbl(2:3:end); keeps only every third value of lbl.
It seems to me that you want to iterate over only 1/3 of the files?
for fileNum = 2:3:numel(fileNames)
It's hard to tell what you're trying to accomplish, though. Are files related in groups of 3? You're probably better off selecting files from their names before computing id and all the other support matrices.
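One way to follow that last suggestion is to parse each file name on its own and keep only the names that match the expected pattern, so that status, id and trial always have exactly one entry per file that is later opened. The sketch below is only an illustration: the file names are hypothetical, and it assumes the naming pattern from the question's sscanf call. Note that dir with 'c*.*' also returns files that merely start with "c", and a single non-matching name makes sscanf stop early on the concatenated names.

```matlab
% Hypothetical file names; the third one does not follow the pattern.
names = {'co2a0000364.rd_000.mat', 'co2c0000337.rd_001.mat', 'contents.txt'};
pattern = 'co2%c%d.rd_%d.mat';   % same format string as in the question

status  = zeros(numel(names), 1);
id      = zeros(numel(names), 1);
trial   = zeros(numel(names), 1);
matched = false(numel(names), 1);

for k = 1:numel(names)
    fields = sscanf(names{k}, pattern);   % numeric column, or shorter on mismatch
    if numel(fields) == 3
        status(k)  = fields(1);           % character code of the status letter
        id(k)      = fields(2);
        trial(k)   = fields(3);
        matched(k) = true;
    end
end

% Keep only the files whose names were parsed completely.
names  = names(matched);
status = status(matched);
id     = id(matched);
trial  = trial(matched);
```

After this step numel(id) equals numel(names), so indexing id with the file index inside the reading loop cannot run out of bounds.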
I have this script to count the number of true segments (column 3, "Segment Good") for each category ("go" and "nogo" in column 1), but it never returns the correct number :(
I would appreciate having another pair of eyes look at it! Thank you!
(This .txt file is converted from .log file from EGI netstation if it matters)
sample .txt file:
category Segment number Segment Good Eye movements
go 1 true true
go 2 false false
go 3 true true
go 4 false false
nogo 1 true true
nogo 2 false false
Files2 = dir(strcat('/Users/EGI/GoNogo2/log22/','*.txt'));
lengthFiles2 = length(Files2);
for ff = 1:lengthFiles2
ff
try
filename = Files2(ff).name;
idx=strfind(filename,'_');
name{ff} = filename(1:idx(1)-1);
path = strcat('/Users/EGI/GoNogo2/log22/',filename);
data = readtable(path);
category = data(:,1);
segmentgood = data(:,3);
record = zeros(size(category,1),2);
for i=1:size(category,1)
tt = category(i,1).Category{1};
if strcmp(tt,'go')
record(i,1)=1;
elseif strcmp(tt,'nogo')
record(i,2)=1;
end
end
segment = zeros(size(segmentgood,1),2);
for i=1:size(category,1)
tt = segmentgood(i,1).SegmentGood{1};
if strcmp(tt,'true')
segment(i,1)=1;
elseif strcmp(tt,'false')
segment(i,2)=1;
end
end
goTrue = 0;
nogoTrue=0;
for i=1:size(category,1)
if record(i,1)==1&segment(i,1)==1
goTrue=goTrue+1;
else record(i,2)==1&segment(i,1)==1
nogoTrue=nogoTrue+1;
end
end
result(ff,1:2) = [goTrue nogoTrue];
result2(ff,1:2) = [sum(record)];
end
end
results = {};
results2={};
for i=1:lengthFiles2
results {i,1} = name{i};
results {i,2} = result(i,1);
results {i,3} = result(i,2);
%results {i,4} = result(i,3);
results2 {i,1} = name{i};
results2 {i,2} = result2(i,1);
results2 {i,3} = result2(i,2);
% results2 {i,4} = result2(i,3);
end
xlswrite('/Users/EGI/Desktop/log2/.xls',results,'results');
Brief Explanation and Preface:
I'm not entirely sure of all the requirements of this task, but here is one method of reading .txt/.log files. It uses textscan() to read the file into MATLAB as a cell array, with each row parsed using the format %s %d %s %s (string, integer, string, string).
Category: string → %s
Segment Number: integer → %d
Segment Good: string → %s
Eye Movements: string → %s
After reading this data into a cell array (Data in the script below), we can split it into columns. We can then check which rows have the Category go or nogo by using the contains() function. contains() takes two arguments: the first is the string or array of strings to search in, and the second is the string to search for. It returns true (1) at every index where the search string is found:
Example:
Result = contains(["Apple", "Pear", "Grape", "Apple"],"Apple");
will return
Result = [1 0 0 1];
After finding the indices corresponding to go and nogo, we can use them to index the third column. Logical indexing lets us select all rows for which a logical array is true; here, the logical arrays mark the go and nogo rows. Applying contains(...,"true") to each subset finds where true occurs within each category. Finally, nnz() (number of nonzeros) counts how many times contains() returned true.
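As a small illustration of the counting step, here is a sketch that uses the six sample rows from the question typed in directly instead of read from a file. It compares with == for exact matches, because contains(Category,"go") would also match "nogo" (which is why the script derives the go rows by negating the nogo rows). For comparison, in the loop posted in the question the line starting with else followed by a condition is a plain else, so nogoTrue is incremented for every row that is not a go/true row; elseif was probably intended.

```matlab
% Sample rows from the question, typed in as string arrays.
Category     = ["go"; "go"; "go"; "go"; "nogo"; "nogo"];
Segment_Good = ["true"; "false"; "true"; "false"; "true"; "false"];

% Logical arrays marking each category (exact comparison).
Go_Indices    = Category == "go";
No_Go_Indices = Category == "nogo";

% Count rows that are in the category AND have Segment Good equal to true.
Go_True_Cases    = nnz(Go_Indices    & Segment_Good == "true");
No_Go_True_Cases = nnz(No_Go_Indices & Segment_Good == "true");

fprintf("go: %d true, nogo: %d true\n", Go_True_Cases, No_Go_True_Cases);
```

For these rows the counts are 2 for go and 1 for nogo.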
Script:
clear;
clc;
%Reading in the data as a cell array%
fileID = fopen('sample.log', 'r');
Header = string(fgetl(fileID));
Data = textscan(fileID,'%s %d %s %s');
fclose(fileID);
%Splitting the data into specific columns%
Category = Data(:,1);
Category = string(Category{1,1});
Segment_Number = Data(:,2);
Segment_Number = string(Segment_Number{1,1});
Segment_Good = Data(:,3);
Segment_Good = string(Segment_Good{1,1});
Eye_Movements = Data(:,4);
Eye_Movements = string(Eye_Movements{1,1});
%Finding the indices corresponding to "go" and "nogo"%
No_Go_Indices = contains(Category,"nogo");
Go_Indices = ~No_Go_Indices;
%Finding how many true cases in column 3 (Segment_Good) corresponding to "go" and "nogo"
Go_True_Cases = nnz(contains(Segment_Good(Go_Indices),"true"));
No_Go_True_Cases = nnz(contains(Segment_Good(No_Go_Indices),"true"));
%Counting the number of times true occurs in each column%
Column_3_True_Count = nnz(contains(Segment_Good,"true"));
Column_4_True_Count = nnz(contains(Eye_Movements,"true"));
%Printing the results to the command window%
fprintf("Category: go -> %d true\n",Go_True_Cases);
fprintf("Category: nogo -> %d true\n\n",No_Go_True_Cases);
fprintf("Total true cases: %d\n", Column_3_True_Count);
Extension: Looping Through Files in a Directory
clear;
clc;
%Full or relative directory path%
%Adding directory with the files to the be accessible%
Directory_Path = "Files";
addpath(Directory_Path);
%Reading the filenames within the directory and the number of files%
Text_Files = dir(Directory_Path+'/*.txt');
Number_Of_Files = length(Text_Files);
%Creating arrays that will hold the results%
All_No_Go_True_Cases = zeros(Number_Of_Files,1);
All_Go_True_Cases = zeros(Number_Of_Files,1);
%Looping through the files and running the function that will grab the
%results%
for File_Index = 1: Number_Of_Files
File_Name = string(Text_Files(File_Index).name);
[No_Go_True_Cases,Go_True_Cases] = Get_Data(File_Name);
All_No_Go_True_Cases(File_Index) = No_Go_True_Cases;
All_Go_True_Cases(File_Index) = Go_True_Cases;
end
All_No_Go_True_Cases
All_Go_True_Cases
%Local function definition%
function [No_Go_True_Cases,Go_True_Cases] = Get_Data(File_Name)
%Reading in the data as a cell array%
fileID = fopen(File_Name, 'r');
Header = string(fgetl(fileID));
Data = textscan(fileID,'%s %d %s %s');
fclose(fileID);
%Splitting the data into specific columns%
Category = Data(:,1);
Category = string(Category{1,1});
Segment_Number = Data(:,2);
Segment_Number = string(Segment_Number{1,1});
Segment_Good = Data(:,3);
Segment_Good = string(Segment_Good{1,1});
Eye_Movements = Data(:,4);
Eye_Movements = string(Eye_Movements{1,1});
%Finding the indices corresponding to "go" and "nogo"%
No_Go_Indices = contains(Category,"nogo");
Go_Indices = ~No_Go_Indices;
%Finding how many true cases in column 3 (Segment_Good) corresponding to "go" and "nogo"
Go_True_Cases = nnz(contains(Segment_Good(Go_Indices),"true"));
No_Go_True_Cases = nnz(contains(Segment_Good(No_Go_Indices),"true"));
%Counting the number of times true occurs in each column%
Column_3_True_Count = nnz(contains(Segment_Good,"true"));
Column_4_True_Count = nnz(contains(Eye_Movements,"true"));
%Printing the results to the command window%
fprintf("Category: go -> %d true\n",Go_True_Cases);
fprintf("Category: nogo -> %d true\n",No_Go_True_Cases);
fprintf("Total true cases: %d\n\n", Column_3_True_Count);
end
Ran using MATLAB R2019b
I am trying to write data to .txt files. Each of the files is around 170MB (after writing data to it).
I am using octave's fprintf function, with '%.8f' to write floating point values to a file. However, I am noticing a very weird error, in that a sub-set of entries in some of the files are getting corrupted. For example, one of the lines in a file is this:
0.43529412,0.}4313725,0.43137255,0.33233533,...
that "}" should have been "4". Now how did octave's fprintf write that "}" with '%.8f' option in the first place? What is going wrong?
Another example is,
0.73289\8B987,...
how did that "\8B" get there?
I have to process a very large data-set with 360 Million points in total. This error in a sub-set of rows in some files is becoming a big problem. What is causing this problem?
Also, this corruption doesn't occur at random. For example, if a file has 1.1 million rows, where each row is a vector representing one data instance, then the problem occurs in at most about 100 rows, and these rows are clustered together. For example, they may be spread over rows 8000 to 8150, but it is never the case that 50 of the corrupted rows are near row 10000 and the remaining 50 near row 20000. They always form a single cluster.
Note: The code below is the block responsible for extracting the data and writing it to files. Some variables in the code, like K_Cell, have been computed earlier and play virtually no role in the data-writing process.
mf = fspecial('gaussian',[5 5], 2);
fidM = fopen('14_01_2016_Go_AeossRight_ClustersM_wLAMRD.txt','w');
fidC = fopen('14_01_2016_Go_AeossRight_ClustersC_wLAMRD.txt','w');
fidW = fopen('14_01_2016_Go_AeossRight_ClustersW_wLAMRD.txt','w');
kIdx = 1;
featMat = [];
% - Generate file names to print the data to
featNo = 0;
fileNo = 1;
filePath = 'wLRD10_Data_Road/featMat_';
fileName = [filePath num2str(fileNo) '.txt'];
fidFeat = fopen(fileName, 'w');
% - Compute the global means and standard deviations
gMean = zeros(1,13); % - Global mean
gStds = zeros(1,13); % - Global variance
gNpts = 0; % - Total number of data points
fidStat = fopen('wLRD10_Data_Road/featStat.txt','w');
for i=1600:10:10000
if (featNo > 1000000)
% - If more than 1m points, close the file and open new one
fclose(fidFeat);
% - Get the new file name
fileNo = fileNo + 1;
fileName = [filePath num2str(fileNo) '.txt'];
fidFeat = fopen(fileName, 'w');
featNo = 0;
end
imgName = [fAddr num2str(i-1) '.jpg'];
img = imread(imgName);
Ir = im2double(img(:,:,1));
Ig = im2double(img(:,:,2));
Ib = im2double(img(:,:,3));
imgR = filter2(mf, Ir);
imgG = filter2(mf, Ig);
imgB = filter2(mf, Ib);
I = im2double(img);
I(:,:,1) = imgR;
I(:,:,2) = imgG;
I(:,:,3) = imgB;
I = im2uint8(I);
[Feat1, Feat2] = funcFeatures1(I);
[Feat3, Feat4] = funcFeatures2(I);
[Feat5, Feat6, Feat7] = funcFeatures3(I);
[Feat8, Feat9, Feat10] = funcFeatures4(I);
ids = K_Cell{kIdx};
pixVec = zeros(length(ids),13); % - Get the local image features
for s = 1:length(ids) % - Extract features
pixVec(s,:) = [Ir(ids(s,1),ids(s,2)) Ig(ids(s,1),ids(s,2)) Ib(ids(s,1),ids(s,2)) Feat1(ids(s,1),ids(s,2)) Feat2(ids(s,1),ids(s,2)) Feat3(ids(s,1),ids(s,2)) Feat4(ids(s,1),ids(s,2)) ...
Feat5(ids(s,1),ids(s,2)) Feat6(ids(s,1),ids(s,2)) Feat7(ids(s,1),ids(s,2)) Feat8(ids(s,1),ids(s,2))/100 Feat9(ids(s,1),ids(s,2))/500 Feat10(ids(s,1),ids(s,2))/200];
end
kIdx = kIdx + 1;
for s=1:length(ids)
featNo = featNo + 1;
fprintf(fidFeat,'%d,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f\n', featNo, pixVec(s,:));
end
% - Compute the mean and variances
for s = 1:length(ids)
gNpts = gNpts + 1;
delta = pixVec(s,:) - gMean;
gMean = gMean + delta./gNpts;
gStds = gStds*(gNpts-1)/gNpts + delta.*(pixVec(s,:) - gMean)/gNpts;
end
end
Note that the code block:
for s=1:length(ids)
featNo = featNo + 1;
fprintf(fidFeat,'%d,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f,%.8f\n', featNo, pixVec(s,:));
end
is the only part of the code that writes the data-points to the files.
The earlier code-block,
if (featNo > 1000000)
% - If more than 1m points, close the file and open new one
fclose(fidFeat);
% - Get the new file name
fileNo = fileNo + 1;
fileName = [filePath num2str(fileNo) '.txt'];
fidFeat = fopen(fileName, 'w');
featNo = 0;
end
opens a new file for writing the data to it, when the currently opened file exceeds the limit of 1 million data-points.
Furthermore, note that the pixVec variable cannot contain anything other than float/double values, or Octave will throw an error.
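To pin down exactly which rows are damaged, each written line can be checked against the shape the fprintf format should produce: an integer followed by 13 comma-separated values with 8 decimals. The sketch below only locates bad rows and does not explain their cause. The sample rows are hypothetical, and the pattern assumes finite values, so legitimate NaN or Inf entries would also be flagged.

```matlab
% Hypothetical rows: the second contains a stray "}" like the one in the question.
rows = { ...
    ['1' repmat(',0.43529412', 1, 13)], ...
    ['2,0.43529412,0.}4313725' repmat(',0.43137255', 1, 11)], ...
    ['3' repmat(',0.73289123', 1, 13)]};

% An integer, then exactly 13 fields of the form [-]digits.8digits.
pattern = '^\d+(,-?\d+\.\d{8}){13}$';

% regexp returns an empty match for rows that do not fit the pattern.
isValid = ~cellfun(@isempty, regexp(rows, pattern, 'match', 'once'));
badRows = find(~isValid);

fprintf('%d of %d rows are malformed\n', numel(badRows), numel(rows));
```

Running the same check over lines read back with fgetl gives the exact row range of each corrupted cluster.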
I got an error when I ran some MATLAB code that I found online. The error is:
Warning: Name is nonexistent or not a directory: ......\toolbox_misc
Undefined function 'relabel' for input arguments of type 'int32'.
Error in import_experiment_label (line 22)
runs = relabel(run);
I tried to download the MATLAB toolbox_misc online, but I still cannot fix the problem. Can anyone help me? Thank you so much!
The following is the original code:
% Load the text experiment-label file to cell
% Kinalizer: /share/Bot/Research/mvpa_data/Haxby_6_subjects
% mac: /Users/kittipat/Downloads/Research/Haxby_7_subjects
% subjID = 2;
inDir = ['/Users/kittipat/Downloads/Research/Haxby_7_subjects/subj',num2str(subjID),'/'];
inFile = 'labels.txt';
outDir = ['/Users/kittipat/Downloads/Research/Haxby_7_subjects/subj',num2str(subjID),'/matlab_format'];
fileID = fopen(fullfile(inDir,inFile));
% !!!!! Must remove "labels chunks" at the top of the txt file first
myCell = textscan(fileID, '%s %d');
fclose(fileID);
category_name = myCell{1};
run = myCell{2};
% Make sure the run numbers are well-ordered from 1 to R
addpath('../../../toolbox_misc/');
runs = relabel(run);
num_run = length(unique(runs));
num_time_stamp = length(runs);
% Make associate labels (needs input from user)
category_name_list = {'rest','face','house','cat','bottle','scissors','shoe','chair','scrambledpix'};
assoc_label_list = [0,1,2,3,4,5,6,7,8];
num_category = length(assoc_label_list); % including 'rest'
assoc_label = zeros(num_time_stamp,1);
regs = zeros(num_category,num_time_stamp);
for i = 1:num_time_stamp
assoc_label(i) = assoc_label_list(strcmp(category_name{i},category_name_list));
regs(assoc_label(i)+1,i) = 1; % 'rest' is column 1
end
regs_with_rest = regs;
regs = regs(2:end,:); % exclude "rest" in the 1-st column
num_category = num_category - 1; % exclude the "rest"
save(fullfile(outDir,'experiment_design'),...
'category_name',...% the category name for each time stamp
'assoc_label',...% the number label for each time stamp
'assoc_label_list',...% the mapping between category_name and assoc_label
'category_name_list',...% list of the category name
'num_category',...% number of categories excluding "rest"
'regs',...% the category matrix excluding "rest"
'num_run',...% number of runs in well-ordered integer
'runs'... % the run# for each time stamp
);
%% plot the figure
h1 = figure;
subplot(4,1,2); plot(assoc_label,'b.-');
xlim([1, num_time_stamp]);
set(gca,'YTick',0:max(assoc_label(:))); set(gca,'YTickLabel',category_name_list);
subplot(4,1,1); plot(runs,'r.-');
title('run number after relabeling --> runs'); xlim([1, num_time_stamp]);
subplot(4,1,3); imagesc(regs_with_rest);
title('original design matrix --> regs\_with\_rest');
set(gca,'YTick',1:(num_category+1)); set(gca,'YTickLabel',category_name_list);
subplot(4,1,4); imagesc(regs);
title('after "rest" is removed --> regs');
xlabel('time stamps');
set(gca,'YTick',1:num_category); set(gca,'YTickLabel',category_name_list(2:end));
print(h1,'-djpeg',fullfile(outDir,'experiment_design.jpg'));
I fixed the relabel problem. Then I changed part of the code as follows.
inDir = ['/D disk/MATLAB/R2014a/subjX/beta_extraction_for_Haxby_matlab_toolbox_v1_8/subj',num2str(subjID),'/'];
inFile = 'labels.txt';
outDir = ['/D disk/MATLAB/R2014a/subjX/beta_extraction_for_Haxby_matlab_toolbox_v1_8/subj',num2str(subjID),'/matlab_format'];
Then another error appears:
import_experiment_label
Undefined function or variable 'subjID'.
Error in import_experiment_label (line 7)
inDir = ['/D disk/MATLAB/R2014a/subjX/beta_extraction_for_Haxby_matlab_toolbox_v1_8/subj',num2str(subjID),'/'];
How do I fix this problem? I do not know what is wrong here. Thank you guys!
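In the script as posted, subjID appears only in a commented-out line (% subjID = 2;), so the variable does not exist when the path is built unless it has been defined in the workspace beforehand. A minimal sketch of defining it first is shown below. The value 2 is taken from that commented-out line, and baseDir is a hypothetical placeholder for the real data folder.

```matlab
% Assumption: subject 2 is the one to process (from the commented-out line).
subjID = 2;

% Hypothetical base folder; replace with the real data location.
baseDir = 'Haxby_7_subjects';

% Build the input and output folders the same way the script does.
inDir  = [baseDir '/subj' num2str(subjID) '/'];
outDir = [inDir 'matlab_format'];

fprintf('Reading from %s, writing to %s\n', inDir, outDir);
```

Alternatively, assigning subjID in the Command Window before running import_experiment_label has the same effect, since a script shares the base workspace.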
I have an index file (called runnumber_odour.txt) that looks like this:
run00001.txt ptol
run00002.txt cdeg
run00003.txt adef
run00004.txt adfg
I need some way of loading this in to a matrix in matlab, such that I can search through the second column to find one of those strings, load the corresponding file and do some data analysis with it. (i.e. if I search for "ptol", it should load run00001.txt and analyse the data in that file).
I've tried this:
clear; clc ;
% load index file - runnumber_odour.txt
runnumber_odour = fopen('Runnumber_odour.txt','r');
count = 1;
lines2skip = 0;
while ~feof(runnumber_odour)
runnumber_odourmat = zeros(817,2);
if count <= lines2skip
count = count+1;
[~] = fgets(runnumber_odour); % throw away unwanted line
continue;
else
line = strcat(fgets(runnumber_odour));
runnumber_odourmat = [runnumber_odourmat ;cell2mat(textscan(line, '%f')).'];
count = count +1;
end
end
runnumber_odourmat
But that just produces an 817-by-2 matrix of zeros (i.e. nothing is written to the matrix), and without the line runnumber_odourmat = zeros(817,2); I get the error "undefined function or variable 'runnumber_odourmat'".
I have also tried this with strtrim instead of strcat, but that doesn't work either and has the same problem.
So, how do I load that file into a matrix in MATLAB?
You can do all of this pretty easily using a Map object, so you will not have to do any searching. Your second column will be the key and your first column the value. The code is as follows:
clc; close all; clear all;
fid = fopen('fileList.txt','r'); %# open file for reading
count = 1;
content = {};
lines2skip = 0;
fileMap = containers.Map();
while ~feof(fid)
if count <= lines2skip
count = count+1;
[~] = fgets(fid); % throw away unwanted line
else
line = strtrim(fgets(fid));
parts = regexp(line,' ','split');
if numel(parts) >= 2
fileMap(parts{2}) = parts{1};
end
count = count +1;
end
end
fclose(fid);
fileName = fileMap('ptol')
% do what you need to do with this filename
This provides quick access to any element.
You can then do what was described in the previous question you had asked, with the answer I provided.
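As an alternative to the line-by-line loop, here is a sketch that reads both columns in one call with textscan and the format '%s %s', then looks up a file name with strcmp. Both columns are text, so they are kept in cell arrays instead of a numeric matrix, which cannot hold strings. To keep the sketch self-contained it first writes a temporary copy of the four sample lines from the question; the temporary file name is an assumption for illustration only.

```matlab
% Write a small copy of the index file so the sketch is self-contained.
indexFile = [tempname '.txt'];
fid = fopen(indexFile, 'w');
fprintf(fid, 'run00001.txt ptol\nrun00002.txt cdeg\nrun00003.txt adef\nrun00004.txt adfg\n');
fclose(fid);

% Read both whitespace-separated text columns at once.
fid = fopen(indexFile, 'r');
cols = textscan(fid, '%s %s');
fclose(fid);
delete(indexFile);

runFiles = cols{1};   % first column: file names
odours   = cols{2};   % second column: odour codes

% Find the file(s) whose odour code is 'ptol' and take the first hit.
hits = runFiles(strcmp(odours, 'ptol'));
matchFile = hits{1};
```

Here matchFile is 'run00001.txt', which can then be opened and analysed.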
I have a micro-array data set with 38 rows and 7130 columns. I am trying to read the data but keep getting the above error.
I debugged and found that when I read the data, I get a 1x7129 array instead of a 38x7130 one. I don't know why. My 7130th column contains letters, while the rest of the data are numbers. Any idea why this is happening?
My file is a tab-delimited text file, and here is my code for reading it:
clear;
fn=32;
col=fn+1;
cluster=2;
num_eachClass=3564;
row=num_eachClass*cluster;
fid1 = fopen('data.txt', 'r');
txt_format='';
for t=1:col txt_format=[txt_format '%g '];
end
data = fscanf(fid1,txt_format,[col row]);
data = data'; fclose(fid1);
Try this code to read the data:
filename = 'yourfilename.txt';
fid = fopen(filename,'r');
% If you have a line with column headers use those 3 lines. Comment if not.
colnames = fgetl(fid);
colnames = textscan(colnames, '%s','delimiter','\t');
colnames = colnames{:};
% Reading the data
tsformat = [repmat('%f ',1,7129) '%s'];
datafromfile = textscan(fid,tsformat,'delimiter','\t','CollectOutput',1);
fclose(fid);
% Get the data from the cell array
data = datafromfile{1};
labels = datafromfile{2};
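The same approach can be tried on a tiny file first to see what textscan returns. The sketch below assumes a hypothetical two-row, tab-delimited file with three numeric columns and one text column. With 'CollectOutput' set to 1, the consecutive numeric columns are merged into one matrix and the text column comes back as a cell array. By contrast, a purely numeric format such as the '%g' one in the question stops at the first field it cannot convert, which is consistent with only the 7129 numbers of the first row being read.

```matlab
% Hypothetical sample file: 3 numeric columns followed by a text label.
sampleFile = [tempname '.txt'];
fid = fopen(sampleFile, 'w');
fprintf(fid, '1.5\t2.5\t3.5\tclassA\n4.5\t5.5\t6.5\tclassB\n');
fclose(fid);

numCols = 3;                                  % 7129 for the real data set
fmt = [repmat('%f ', 1, numCols) '%s'];       % numeric columns, then a string

fid = fopen(sampleFile, 'r');
parsed = textscan(fid, fmt, 'delimiter', '\t', 'CollectOutput', 1);
fclose(fid);
delete(sampleFile);

values = parsed{1};   % 2-by-3 numeric matrix
labels = parsed{2};   % 2-by-1 cell array of labels
```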
EDIT
To separate your dataset to training and test, do something like this:
train_samp = 1:19;
test_samp = 20:38;
train_data = data(train_samp,:);
test_data = data(test_samp,:);
train_label = labels(train_samp);
test_label = labels(test_samp);
You can also separate samples randomly:
samp_num = size(data,1);
test_num = 19;
randorder = randperm(samp_num);
train_samp = randorder(test_num+1:samp_num);
test_samp = randorder(1:test_num);
Note that I haven't transposed the data (data = data';) as your code does.
If you need to, just switch the row and column indices in the above code:
train_data = data(:,train_samp);
test_data = data(:,test_samp);