Matlab: finding/writing data from mutliple files by column header text

Matlab: finding/writing data from mutliple files by column header text - matlab

I have an array which I read into Matlab with importdata. It has 5 header lines
file = 'aoao.csv';
s = importdata(file,',', 5);
Matlab automatically treats the last line as the column header. I can then call up the column number that I want with
s.data(:,n); %n is desired column number
I want to be able to load up many similar files at once, and then call up the columns in the different files which have the same column header name (which are not necessarily the same column number). I want to be able to write and export all of these columns together into a new matrix, preferably with each column labelled with its file name,
what should I do?

samp = 'len-c.mp3'; %# define desired sample/column header name
file = dir('*.csv');
have the directory ready in main screen current folder. This creates a detailed description of file,
for i=1:length(file)
set(i) = importdata(file(i).name,',', 5);
end
this imports the data from each of the files (comma delimited, 5 header lines) and transports it to a cell array called 'set'
for k = 1:14;
for i=1:length(set(k).colheaders)
TF = strcmp(set(k).colheaders(i),samp); %compares strings for match
if TF == 1; %if match is true
group(:,k) = set(k).data(:,i); %save matching column# to 'group'
end
end
end
this retrieves the data from the named colheader within each file

Related

Read first line of CSV file that contains text in cells

I have a deco.csv file and I only want to extract B1 to K1 (20 columns of the first rows), i.e. Deco_0001 to Deco_0020.
I first make a pre-allocation:
names = string(20,1);
and what I want is when calling S(1), it gives Deco_0001; when calling S(20), it gives Deco_0020.
I have read through textscan but I do not know how to specify the range is first row and running from column 2 to column 21 of the csv file.
Also, I want save the names individually but what I have tried just save the first line in only one cell:
fid=fopen('deco.csv');
C=textscan(fid, '%s',1);
fclose(fid);
Thanks!

It's not very elegant, but this should work for you:
fid=fopen('deco.csv');
C=textscan(fid, '%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s',1,'Delimiter',',');
fclose(fid);
S = cell(20,1);
for ii = 1:20
S{ii} = C{ii+1};
end

Matlab | How to load/use files with consecutive names (abc1, abc2, abc3) and then pass on to the next series (cba1, cba2, cba3)?

I have a folder containing a series of data with file names like this:
abc1
abc2
abc3
bca1
bca2
bca3
bca4
bca5
cba1
... etc
My goal is to load all the relevant files for each file name, so all the "abc" files, and plot them in one graph. Then move on to the next file name, and do the same, and so forth. Is there a way to do this?
This is what I currently have to load and run through all the files, grab the data in them and get their name (without the .mat extension) to be able to save the graph with the same filename.
dirName = 'C:\DataDirectory';
files = dir( fullfile(dirName,'*.mat') );
files = {files.name}';
data = cell(numel(files),1);
for i=1:numel(files)
fname = fullfile(dirName,files{i});
disp(fname);
files{i} = files{i}(1:length(files{i})-4);
disp(files{i});
[Rest of script]
end

You already found out about the cool features of dir, and have a cell array files, which contains all file names, e.g.
files =
'37abc1.mat'
'37abc2.mat'
'50bca1.mat'
'50bca2.mat'
'1cba1.mat'
'1cba2.mat'
The main task now is to find all prefixes, 37abc, 50bca, 1cba, ... which are present in files. This can be done using a regular expression (regexp). The Regexp Pattern can look like this:
'([\d]*[\D]*)[\d]*.mat'
i.e. take any number of numbers ([\d]*), then any number of non-numeric characters ([\D]*) and keep those (by putting that in brackets). Next, there will be any number of numeric characters ([\d]*), followed by the text .mat.
We call the regexp function with that pattern:
pre = regexp(files,'([\d]*[\D]*)[\d]*.mat','tokens');
resulting in a cell array (one cell for each entry in files), where each cell contains another cell array with the prefix of that file. To convert this to a simple not-nested cell array, we call
pre = [pre{:}];
pre = [pre{:}];
resulting in
pre =
'37abc' '37abc' '50bca' '50bca' '1cba' '1cba'
To remove duplicate entries, we use the unique function:
pre = unique(pre);
pre =
'37abc' '50bca' '1cba'
which leaves us with all prefixes, that are present. Now you can loop through each of these prefixes and apply your stuff. Everything put together is:
% Find all files
dirName = 'C:\DataDirectory';
files = dir( fullfile(dirName,'*.mat') );
files = {files.name}';
% Find unique prefixes
pre = regexp(files,'([\d]*[\D]*)[\d]*.mat','tokens');
pre = [pre{:}]; pre = [pre{:}];
pre = unique(pre);
% Loop through prefixes
for ii=1:numel(pre)
% Get files with this prefix
curFiles = dir(fullfile(dirName,[pre{ii},'*.mat']));
curFiles = {curFiles.name}';
% Loop through all files with this prefix
for jj=1:numel(curFiles)
% Here the magic happens
end
end

Sorry, I misunderstood your question, I found this solution:
file = dir('*.mat')
matching = regexp({file.name}, '^[a-zA-Z_]+[0-9]+\.mat$', 'match', 'once'); %// Match once on file name, must be a series of A-Z a-z chars followed by numbers.
matching = matching(~cellfun('isempty', matching));
str = unique(regexp(matching, '^[a-zA-Z_]*', 'match', 'once'));
str = str(~cellfun('isempty', str));
group = cell(size(str));
for is = 1:length(str)
ismatch = strncmp(str{is}, matching, length(str{is}));
group{is} = matching(ismatch);
end
Answer came from this source: Matlab Central

Renaming JPG images in matlab

I am new to matlab and image analysis. I would really appreciate some insight/help into the following problem. I am trying to rename images (jpg) in a folder that have a random name into specific (new) names. I made the an excel file with two columns the first column contains the old names and the second column the new names. I found the next code on stack overflow (Rename image file name in matlab):
dirData = dir('*.jpg'); %# Get the selected file data
fileNames = {dirData.name}; %# Create a cell array of file names
for iFile = 1:numel(fileNames) %# Loop over the file names
newName = sprintf('image%05d.jpg',iFile); %# Make the new name
movefile(fileNames{iFile},newName); %# Rename the file
end
The code gives all the photos a new name based on the old one but that is not what I want, the new names I use are not linked to the old ones. I tried the following code :
g= xlsread('names.xlsx') % names.xlsx the excel file with old and new names
for i=1:nrows(g)
image=open(g(i,1));
save(g(i,2),image);
end
It doesn't work. I get the error message :using open (line 68)
NAME must contain a single string. I don't think open is the right function to use.
Thank you !

How about this:
% Get the jpeg names
dir_data = dir('*.jpg');
jpg_names = {dir_data.name};
% Rename the files according to the .xlsx file
[~,g,~] = xlsread('names.xlsx'); % Thanks to the post of Sean
old_names = g(:,1);
new_names = g(:,2);
for k = 1:numel(old_names)
% check if the file exists
ix_file = strcmpi(old_names{k}, jpg_names);
if any(ix_file)
movefile(old_names{k}, new_names{k});
end;
end;

It seems like g = xlsread(file) reads only numeric data from file (see here). The third output argument of xlsread returns raw data including strings; does the following work? (i don't have excel and can't test it)
[~, ~, g] = xlsread('names.xlsx')
for i = 1:nrows(g)
movefile(g{i,1}, g{i,2})
end

Extract variables from a textfile in matlab

I have a large text file (~3500 lines) which contains output data from an instrument. This consists of repeated entries from different measurements which are all formatted as follows:
* LogFrame Start *
variablename1: value
variablename2: value
...
variablename35: value
* LogFrame End *
Each logframe contains 35 variables. From these I would like to extract two, 'VName' and 'EMGMARKER' and put the associated values in columns in a matlab array, (i.e. the array should be (VName,EMGMARKER) which I can then associate with data from another output file which I already have in a matlab array. I have no idea where to start with this in order to extract the variables from this file, so hence my searches on the internet so far have been unsuccessful. Any advice on this would be much appreciated.

You could use textscan:
C = textscan(file_id,'%s %f');
Then you extract the variables of interest like this:
VName_indices = strcmp(C{1},'VName:');
VName = C{2}(VName_indices);

If the variable names, so 'VName' and 'EMGMARKER' , are always the same you can just read through the file and search for lines containing them and extract data from there. I don't know what values they contain so you might have to use cells instead of arrays when extracting them.
fileID = fopen([target_path fname]); % open .txt file
% read the first line of the specified .txt file
current_line = fgetl(fileID);
counter = 1;
% go through the whole file
while(ischar(current_line))
if ~isempty(strfind(current_line, VName))
% now you find the location of the white space delimiter between name and value
d_loc = strfind(current_line,char(9));% if it's tab separated - so \t
d_loc = strfind(current_line,' ');% if it's a simple white space
variable_name{counter} = strtok(current_line);% strtok returns everything up until the first white space
variable_value{counter} = str2double(strtok(current_line(d_loc(1):end)));% returns everything form first to second white space and converts it into a double
end
% same block for EMGMARKER
counter = counter+1;
current_line = fgetl(fileID);% next line
end
You do the same for EMGMARKER and you have cells containing the data you need.

Import csv file in Matlab

I need your help to import some csv files into matlab. They have the following format
#CONTENT
Class,Category,Level,Form
xxxxx,xxxxx,1.0,1
#DATA_GENERATION
Date,Agency,Version,ScientificAuthority
2010-04-08,INME,1.0,XXX xxx xxxx
#PLATFORM
Type,ID,Name,Country,GAW_ID
STN,308,xxxx,xxx
#INSTRUMENT
Name,Model,Number
ECC,6A,6A23500
#LOCATION
Latitude,Longitude,Height
25,-3,631.0
#TIMESTAMP
UTCOffset,Date,Time
+00:00:00,2010-04-07,10:51:00
* SOFTWARE: SNDPRO 1.321
* TROPOPAUSE IN MB 184
*
#FLIGHT_SUMMARY
IntegratedO3,CorrectionCode,SondeTotalO3,CorrectionFactor,TotalO3,WLCode,ObsType,Instrument,Number
328.4,0,379.9
#AUXILIARY_DATA
MeteoSonde,ib1,ib2,PumpRate,BackgroundCorr,SampleTemperatureType,MinutesGroundO3
RS92-SGPW,,,,Pressure,Pump
#PUMP_CORRECTION
Pressure,Correction
2.0,1.171
3.0,1.131
5.0,1.092
10.0,1.055
20.0,1.032
30.0,1.022
50.0,1.015
100.0,1.011
200.0,1.008
300.0,1.006
500.0,1.004
1000.0,1.000
#PROFILE
Pressure,O3PartialPressure,Temperature,WindSpeed,WindDirection,LevelCode,Duration,GPHeight,RelativeHumidity,SampleTemperature
945.36,4.590,14.6,10.0,30,2,0,631,43,22.8
944.90,4.620,14.3,7.8,20,0,2,635,44,22.8
943.51,4.630,13.9,7.6,17,0,4,647,44,22.8
942.13,4.620,13.4,8.1,16,0,6,660,45,22.8
940.98,4.590,13.0,9.0,16,0,8,670,45,22.8
939.83,4.590,12.6,9.8,17,0,10,680,46,22.8
938.69,4.600,12.1,10.3,18,2,12,691,46,22.8
937.77,4.600,12.2,10.9,18,0,14,699,47,22.9
936.63,4.600,12.1,11.4,19,0,16,709,47,22.9
935.48,4.600,11.8,11.9,19,0,18,719,47,22.9
934.12,4.600,11.7,12.3,19,0,20,731,47,22.9
932.98,4.590,11.6,12.6,19,0,22,742,48,22.9
931.84,4.590,11.6,12.9,18,0,24,752,48,22.9
930.93,4.600,11.6,13.2,18,0,26,760,48,22.9
929.79,4.600,11.4,13.4,17,0,28,770,49,22.9
928.88,4.610,11.5,13.6,16,0,30,778,49,22.9
927.98,4.620,11.4,13.7,15,0,32,787,49,23.0
927.30,4.620,11.3,13.8,14,0,34,793,49,23.0
The first line of the file is empty and second line contains the #CONTENT. I would like to have in a matrix all data that are under the line Pressure,O3PartialPressure,Temperature,WindSpeed,WindDirection,LevelCode,Duration,GPHeight,RelativeHumidity,SampleTemperature

Use the csvread() function. From the documentation:
csvread Read a comma separated value file.
M = csvread('FILENAME') reads a comma separated value formatted file
FILENAME. The result is returned in M. The file can only contain
numeric values.
In your case, since you want to exclude all of the content up until the #PROFILE data, you would have to know the line number of the data you're interested in in advance, then use one of the following uses (again from the documentation):
M = csvread('FILENAME',R,C) reads data from the comma separated value
formatted file starting at row R and column C. R and C are zero-
based so that R=0 and C=0 specifies the first value in the file.
M = csvread('FILENAME',R,C,RNG) reads only the range specified
by RNG = [R1 C1 R2 C2] where (R1,C1) is the upper-left corner of
the data to be read and (R2,C2) is the lower-right corner. RNG
can also be specified using spreadsheet notation as in RNG = 'A1..B7'.