Create a loop to generate a mean value of data which I got from TXT files? - matlab

I am trying to write a MATLAB script to evaluate my lab data. I have lots of txt files for many samples; each sample has 30 txt files. I wrote a function that reads the data from one of these files and stores it in a structure containing data and labels.
I would like to know whether it is possible to load all 30 files using a loop instead of line by line.
function s = try1(fn)
% adapt for other file format according to your needs
fid = fopen(fn);
s.data = [];
% skip first lines
for k=1:6
l = fgetl(fid);
end
% read header (the last line read by the loop above)
a = regexp(l,',','split');
s.labels = a;
l = fgetl(fid);
k=0;
while( (l~=-1)&(k<130) )
l = l(1:end-1);
a = regexp(l,', ','split');
a = regexprep(a, ',','.'); % decimal comma -> decimal point
s.data = [s.data; str2double(a(2:4))];
l = fgetl(fid);
k = k+1;
end
fclose(fid);
end

To process every file in a data directory, you can combine dir(...) and fullfile(...) with a loop. This will likely work:
dataDirectory = 'c:/myDirectory';
allFilesDir = dir(fullfile(dataDirectory , '*.txt'));
allFN = {allFilesDir.name}
% Or do this:
% allFN = {'file1.txt', 'file2.txt'};
result = [];
for ind = 1:length(allFN)
myFN = fullfile(dataDirectory, allFN{ind});
result(ind) = try1(myFN);
end
% Now get the mean:
myMean = mean([result.data])
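The same pattern can be tried out in isolation. The following is only a minimal sketch under stated assumptions: the temporary folder, the file names sample_1.txt to sample_3.txt, and the plain numeric format read with dlmread are made up for the demo and are not the asker's file format (for that, try1 would be called inside the loop instead).

```matlab
% Hypothetical demo data: three small numeric files in a temporary folder
dataDirectory = tempname;
mkdir(dataDirectory);
for k = 1:3
    dlmwrite(fullfile(dataDirectory, sprintf('sample_%d.txt', k)), k * ones(4, 3));
end

% Collect every .txt file in the folder and read the files in a loop
allFiles = dir(fullfile(dataDirectory, '*.txt'));
nFiles = numel(allFiles);
allData = cell(nFiles, 1);
for ind = 1:nFiles
    allData{ind} = dlmread(fullfile(dataDirectory, allFiles(ind).name));
end

% Stack the matrices along a third dimension and average across files
stacked = cat(3, allData{:});       % rows x columns x files
meanOverFiles = mean(stacked, 3);   % element-wise mean over all files

rmdir(dataDirectory, 's');          % remove the demo folder again
```

Stacking along the third dimension only works if every file yields a matrix of the same size; otherwise keep the cell array and compute the mean per file.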

Related

Reading multiple text files with two columns in MATLAB

I want to read multiple text files. Each text file has two columns, and the columns in all files have the same number of rows. I want to know how, in MATLAB, to read each text file, read its two columns, subtract one column from the other, and then move on to the next file, and so on. I have written the following code, but I am missing some step. I appreciate your support. Thank you all.
for k = 1:9
filename = sprintf('Data_F_Ind000%d.txt',k);
a(:,k) = load(filename);
x = a(:,1);
y = a(:,2);
z = x - y;
end
% Store each file's matrix and its column difference in cell arrays
data = cell(9,1);
diff_data = cell(9,1);
for k = 1:9
filename = sprintf('Data_F_Ind000%d.txt',k);
a = load(filename);
data{k} = a; % index with the loop variable k
x = a(:,1);
y = a(:,2);
diff_data{k} = x - y;
end
You can do this multiple ways. I imagine that you want to do something with z instead of just throwing it away every time. I would do this by taking advantage of an access pattern.
numFiles = 9;
numRows = ....; % not required but used to preallocate the a matrix
pattern = 1:2:numFiles * 2; % create a vector of 1 3 5 ...
a = zeros(numRows, numFiles * 2);
z = zeros(numRows, numFiles);
for k = 1:numFiles
fileName = sprintf('Data_F_Ind000%d.txt', k);
a(:,pattern(k):pattern(k) + 1) = load(fileName);
z(:,k) = a(:,pattern(k)) - a(:,pattern(k) + 1);
end
This is untested and is clearly missing some data but the intent should be clear. You don't need to preallocate variables but it helps speed calculations so I try to do it whenever possible.

string compare and remove elements

I have a csv file which consists of a N-by-M table.
In the first column, each row consists of 6 different numbers, and I need to detect whether any of the numbers are identical and then print an error message.
This is how I thought it should be written:
valid=true(height(Information),1);
for i=1:height(Information),1;
if Information{i, 1} == Information{:, 1}
fprintf('Invalid number in line %d', i);
valid(i)=false;
end
end
Use the third output of unique together with histcounts:
% generate two matrices, one with 2 identical elements
A1 = rand(3);
A1(end,1) = A1(1);
A2 = rand(3);
% check identical elements
[~,~,ic] = unique(A1(:,1),'stable');
identicalNumbers = any(histcounts(ic,max(ic)) > 1) % true
[~,~,ic] = unique(A2(:,1),'stable');
identicalNumbers = any(histcounts(ic,max(ic)) > 1) % false
Edit: it can be done even more simply:
identicalNumbers = numel(ic) > max(ic)
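To also report which rows hold a repeated value, the index vector ic can be counted with accumarray. This is only a sketch on made-up values (the vector col stands in for the first column of the table); the error message text is taken from the question.

```matlab
col = [4; 7; 4; 9; 7; 1];              % example first column (made-up values)
[~, ~, ic] = unique(col, 'stable');    % ic maps each row to its unique value
counts = accumarray(ic(:), 1);         % how often each unique value occurs
hasDuplicates = numel(ic) > max(ic);   % same test as in the edit above
dupRows = find(counts(ic) > 1);        % rows whose value appears more than once
for r = dupRows'
    fprintf('Invalid number in line %d\n', r);
end
```

A logical validity mask like the one in the question would then be valid = counts(ic) == 1.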
First read the csv file into a matrix called A, then try the following code:
uniqueVals = unique( A(:,1) );% find unique values of col 1
[r c] = size(uniqueVals);% r determines the number of unique values in A(:,1)
[rr cc] = size(A);% rr is total number of values in A(:,1)
if (r~=rr)
disp('identical numbers detected');
end
I have modified my code. The code below detects identical numbers in the first column and reports their row indices:
A = randi (8,6)
uniqueVals = unique( A(:,1) );
[c r] = size(uniqueVals);
for i=1:c
[m n]= size(find(A(:,1) == uniqueVals(i)));
if m>1
disp('same values detected in rows: ')
find(A(:,1) == uniqueVals(i))
end
end
Check the code and give me feedback.
I have downloaded your csv file to my local drive. Run the code and select the csv file using the dialog box.
clear
clc
[file_name, mach_path] = uigetfile( ...
{'*.csv', 'All CSV (*.csv)'}, ...
'Select File');
% If "Cancel" is selected then return
if isequal([file_name,mach_path],[0,0])
return
% Otherwise construct the fullfilename and Check and load the file
else
fileName = fullfile(mach_path,file_name);
end
fid = fopen(fileName,'r'); %# Open the file
lineArray = cell(100,1); %# Preallocate a cell array (ideally slightly
%# larger than is needed)
lineIndex = 1; %# Index of cell to place the next line in
nextLine = fgetl(fid); %# Read the first line from the file
while ~isequal(nextLine,-1) %# Loop while not at the end of the file
lineArray{lineIndex} = nextLine; %# Add the line to the cell array
lineIndex = lineIndex+1; %# Increment the line index
nextLine = fgetl(fid); %# Read the next line from the file
end
fclose(fid); %# Close the file
lineArray = lineArray(1:lineIndex-1); %# Remove empty cells, if needed
for iLine = 1:lineIndex-1 %# Loop over lines
lineData = textscan(lineArray{iLine},'%s',... %# Read strings
'Delimiter',',');
lineData = lineData{1}; %# Remove cell encapsulation
if strcmp(lineArray{iLine}(end),',') %# Account for when the line
lineData{end+1} = ''; %# ends with a delimiter
end
lineArray(iLine,1:numel(lineData)) = lineData; %# Overwrite line data
end
A = lineArray;
uniqueVals = unique( A(:,1) );
[cc ~] = size(uniqueVals);
for i=1:cc
[mm ~]= size(find(ismember(A(:,1),uniqueVals(i))));
if mm>1
second = find(ismember(A(:,1),uniqueVals(i)));
disp('same value detected in rows: ')
disp(second(2));
A(second(2),:) = [];
disp(A);
end
end

Matlab: how to rename bulk files if varying length

I am trying to rename over a thousand files of varying length. For example, my original file names are:
1-1.txt
1-13.txt
12-256.txt
...
I would like these files to appear as follows:
100000-1-1.txt
100000-1-13.txt
100000-12-256.txt
...
I have been trying to use the following script:
d = 'directoryname';
names = dir(d);
names = {names(~[names.isdir]).name};
len = cellfun('length',names);
mLen = max(len);
idx = len < mLen;
len = len(idx);
names = names(idx);
for n = 1:numel(names)
oldname = [d names{n}];
newname = sprintf('%s%100000-*s',d,mLen, names{n});
dos(['rename','oldname', 'newname']); % (1)
end
What am I doing wrong? Thanks in advance for your help!!
See if this works for you -
add_string = '100000-'; %//'# string to be concatenated to all filenames
pattern = fullfile(d,'*.txt') %// we are renaming only .txt files
files_info = dir(pattern)
org_paths = fullfile(d,{files_info.name})
new_paths = fullfile(d,strcat(add_string,{files_info.name}))
if ~isempty(org_paths{1}) %// Make sure we are processing something
cellfun(@(x1,x2) movefile(x1,x2), org_paths, new_paths); %// rename files
end
Edit
add_string = '100000-'; %//'# string to be concatenated to all files
pattern = fullfile(d,'*.txt') %// Rename .txt files
files_info = dir(pattern);
f1 = {files_info.name};
org_paths = cellfun(@(S) fullfile(d,S), f1, 'Uniform', 0);
f2 = strcat(add_string,f1);
new_paths = cellfun(@(S) fullfile(d,S), f2, 'Uniform', 0);
if numel(org_paths)>0
if ~isempty(org_paths{1}) %// Make sure we are processing something
cellfun(@(x1,x2) movefile(x1,x2), org_paths, new_paths); %// rename all files
end
end
I don't really get what you are doing with len and mLen, but assuming that's correct, you just need the following changes within the for loop:
for n = 1:numel(names)
oldname = [d filesep names{n}]; %// include filesep to build full filename
newname = sprintf('100000-%s', names{n}); %// prepend '100000-' to names{n}
%// or replace by the simpler: newname = ['100000-' names{n}];
dos(['rename ' oldname ' ' newname]); % (1) %// oldname and newname are variables; note the space after 'rename'
end
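If you would rather not depend on the Windows rename command, the same loop can be written with movefile. The sketch below is self-contained and runs in a temporary folder; the folder and the three empty files are created only for the demo, using the file names from the question.

```matlab
% Demo setup (assumption): a temporary folder with three empty files
d = tempname;
mkdir(d);
origNames = {'1-1.txt', '1-13.txt', '12-256.txt'};
for n = 1:numel(origNames)
    fclose(fopen(fullfile(d, origNames{n}), 'w'));   % create an empty file
end

prefix = '100000-';
files = dir(fullfile(d, '*.txt'));   % list once, before anything is renamed
for n = 1:numel(files)
    oldPath = fullfile(d, files(n).name);
    newPath = fullfile(d, [prefix files(n).name]);
    movefile(oldPath, newPath);      % rename by moving within the same folder
end

renamed = dir(fullfile(d, '*.txt'));
renamedNames = {renamed.name};
disp(renamedNames);
rmdir(d, 's');                       % remove the demo folder again
```

Because the file list is taken before the loop starts, a file that has already been renamed is not picked up and prefixed a second time.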

Importing data block with Matlab

I have a set of data in the following format, and I would like to import each block in order to analyze them with Matlab.
Emax=0.5/real
----------------------------------------------------------------------
4.9750557 14535
4.9825821 14522
4.990109 14511
4.9976354 14491
5.0051618 14481
5.0126886 14468
5.020215 14437
5.0277414 14418
5.0352678 14400
5.0427947 14372
5.0503211 14355
5.0578475 14339
5.0653744 14321
Emax=1/real
----------------------------------------------------------------------
24.965595 597544
24.973122 597543
24.980648 597543
24.988174 597542
24.995703 597542
25.003229 597542
I have modified this piece of code from MathWorks, but I think I have problems dealing with the spaces between the columns.
Each block of data consists of 3874 rows and is separated from the next by a text line (Emax=XX/real) and a line of ----. Unfortunately, this is the only way the software exports the data.
Here is one way to import the data:
% read file as a cell-array of lines
fid = fopen('file.dat', 'rt');
C = textscan(fid, '%s', 'Delimiter','');
C = C{1};
fclose(fid);
% remove separator lines
C(strncmp('---',C,3)) = [];
% location of section headers
headInd = [find(strncmp('Emax=', C, length('Emax='))) ; numel(C)+1];
% extract each section
num = numel(headInd)-1;
blocks = struct('header',cell(num,1), 'data',cell(num,1));
for i=1:num
% section header
blocks(i).header = C{headInd(i)};
% data
X = regexp(C(headInd(i)+1:headInd(i+1)-1), '\s+', 'split');
blocks(i).data = str2double(vertcat(X{:}));
end
The result is a structure array containing the data from each block:
>> blocks
blocks =
2x1 struct array with fields:
header
data
>> blocks(2)
ans =
header: 'Emax=1/real'
data: [6x2 double]
>> blocks(2).data(:,1)
ans =
24.9656
24.9731
24.9806
24.9882
24.9957
25.0032
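Once the blocks are in a structure array like this, they can be analyzed per block in a loop. The following is a small sketch that builds a two-element blocks array by hand from the first two rows of each block in the question; parsing the Emax value with sscanf and averaging the second column are just example analyses, not something the question asks for.

```matlab
% Hand-built stand-in for the blocks array produced above (first rows only)
blocks = struct( ...
    'header', {'Emax=0.5/real'; 'Emax=1/real'}, ...
    'data',   {[4.9750557 14535; 4.9825821 14522]; ...
               [24.965595 597544; 24.973122 597543]});

emax = zeros(1, numel(blocks));
meanCol2 = zeros(1, numel(blocks));
for i = 1:numel(blocks)
    % read the number that follows 'Emax=' in the header line
    emax(i) = sscanf(blocks(i).header, 'Emax=%f');
    % example analysis: mean of the second column of this block
    meanCol2(i) = mean(blocks(i).data(:, 2));
end
```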
This should work. I don't think textscan() will work with a file like this because of the breaks between blocks.
Essentially, this code loops through the lines between blocks until it finds a line that matches the data format. The code is naive and assumes that the file has exactly the number of blocks and lines per block that you specify. If there were a fixed number of lines between blocks, it would be a lot easier: you could remove the first inner loop and replace it with one fgets(fid) call (discarding the result) for each line to skip.
function block_data = readfile(in_file_name)
fid = fopen(in_file_name, 'r');
delimiter = ' ';
line_format = '%f %f';
n_cols = 2; % Number of numbers per line
block_length = 3874; % Number of lines per block
n_blocks = 2; % Total number of blocks in file
tline = fgets(fid);
line_data = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
block_n = 0;
block_data = zeros(n_blocks,block_length,n_cols);
while ischar(tline) && block_n < n_blocks
block_n = block_n+1;
tline = fgets(fid);
if ischar(tline)
line_data = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
end
while ischar(tline) && isempty(line_data)
tline = fgets(fid);
line_data = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
end
line_n = 1;
while line_n <= block_length
block_data(block_n,line_n,:) = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
tline = fgets(fid);
line_n = line_n+1;
end
end
fclose(fid)

how to dlmwrite a file from array

How can I write the cell array below to a text file (my_data.out)?
http_only = cell2mat(http_only)
dlmwrite('my_data.out',http_only)
I get an error (I have tried to solve it, but the error still occurs).
Here is my full code:
I want to generate a text file for each data item that stores only 'http_only',
and then check whether it matches a word in split_URL.
%data = importdata('DATA/URL/training_URL')
data = importdata('DATA/URL/testing_URL')
domain_URL = regexp(data,'\w*://[^/]*','match','once')
no_http_URL = regexp(domain_URL,'https?://(?:www\.)?(.*)','tokens','once');
no_http_URL = vertcat(no_http_URL{:});
split_URL = regexp(no_http_URL,'[:/.]*','split')
[sizeData b] = size(split_URL);
for i = 1:100
A7_data = split_URL{i};
data2=fopen(strcat('DATA\WEBPAGE_SOURCE\TESTING_DATA\',int2str(i),'.htm'),'r')
CharData = fread(data2, '*char')'; %read text file and store data in CharData
fclose(data2);
img_only = regexp(CharData, '<img src.*?>', 'match'); %checking
http_only = regexp(img_only, '"http.*?"', 'match');
http_only1 = horzcat(http_only{:});
fid = fopen('my_data.out',int2str(i),'w');
for col = 1:numel(http_only1)
fprintf(fid,'%s\n',http_only1{:,col});
end
fclose(fid);
feature7_data=(~cellfun('isempty', regexpi(CharData , A7_data, 'once')))
B7(i)=sum(feature7_data)
end
feature7(B7>=5)=-1;
feature7(B7<5&B7>2)=0;
feature7(B7<=2)=1;
feature7'
Write cell-by-cell using fprintf -
fid = fopen('my_data.out','w');
for col = 1:numel(http_only)
fprintf(fid,'%s\n',http_only{:,col});
end
fclose(fid);
Edit 1: If your input is a cell array of cell arrays, use this code instead.
Code
http_only1 = horzcat(http_only{:});
fid = fopen('my_data.out','w');
for col = 1:numel(http_only1)
fprintf(fid,'%s\n',http_only1{:,col});
end
fclose(fid);
Edit 2: For a number of inputs to be stored into separate files, use this demo -
data1 = {{'[]'} {'"http://google.com"'} {'"http://yahoo.com'}};
data2 = {{'[]'} {'"http://overflow.com"'} {'"http://meta.exchange.com'}};
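data = cat(1,data1,data2);
out_basename = 'my_data_'; % assumed base name for the output files (not defined in the original demo)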
for k = 1:size(data,1)
data_mat = horzcat(data{k,:});
out_filename = strcat(out_basename,num2str(k),'.out');
fid = fopen(out_filename,'w');
for col = 1:numel(data_mat)
fprintf(fid,'%s\n',data_mat{:,col});
end
fclose(fid);
end