I have a text file with a data structure like this:
30,311.263671875,158.188034058,20.6887207031,17.4877929688,0.000297248129755,aeroplane
30,350.668334961,177.547393799,19.1939697266,18.3677368164,0.00026999923648,aeroplane
30,367.98135376,192.697219849,16.7747192383,23.0987548828,0.000186387239864,aeroplane
30,173.569274902,151.629364014,38.0069885254,37.5704650879,0.000172595537151,aeroplane
30,553.904602051,309.903320312,660.893981934,393.194030762,5.19620243722e-05,aeroplane
30,294.739196777,156.249740601,16.3522338867,19.8487548828,1.7795707663e-05,aeroplane
30,34.1946258545,63.4127349854,475.104492188,318.754821777,6.71026540999e-06,aeroplane
30,748.506652832,0.350944519043,59.9415283203,28.3256549835,3.52978979379e-06,aeroplane
30,498.747009277,14.3766479492,717.006652832,324.668731689,1.61551643174e-06,aeroplane
30,81.6389465332,498.784301758,430.23046875,210.294677734,4.16855394647e-07,aeroplane
30,251.932098389,216.641052246,19.8385009766,20.7131652832,3.52147743106,bicycle
30,237.536972046,226.656692505,24.0902862549,15.7586669922,1.8601918593,bicycle
30,529.673400879,322.511322021,25.1921386719,21.6920166016,0.751171214506,bicycle
30,255.900146484,196.583847046,17.1589355469,27.4430847168,0.268321367912,bicycle
30,177.663650513,114.458488464,18.7516174316,16.6759414673,0.233057001606,bicycle
30,436.679382324,273.383331299,17.4342041016,19.6081542969,0.128449092153,bicycle
I want to index this file using a label file, and the result should look something like this:
60,509.277435303,284.482452393,26.1684875488,31.7470092773,0.00807665128377,15
60,187.909835815,170.448471069,40.0388793945,58.8763122559,0.00763951029512,15
60,254.447280884,175.946624756,18.7212677002,21.9440612793,0.00442053096776,15
However, there might be some classes that are not in the label file, and I need to filter those lines out so that I can use load() to read the result. (load() cannot read a text file that contains characters.)
Here is my implementation:
function test(vName,meta)
    f_dt = fopen([vName '.txt'],'r');
    f_indexed = fopen([vName '_indexed.txt'], 'w');
    lbls = loadlbl()
    count = 1;
    while(true),
        if(f_dt == -1),
            break;
        end
        dt = fgets(f_dt);
        if(dt == -1),
            break
        else
            dt_cls = strsplit(dt,','){7};
            dt_cls = regexprep(dt_cls, '\s+', '');
            cls_idx = find(strcmp(lbls,dt_cls));
            if(~isempty(cls_idx))
                dt = strrep(dt,dt_cls,int2str(cls_idx));
                fprintf(f_indexed,dt);
            end
        end
    end
    fclose(f_indexed);
    if(f_dt ~= -1),
        fclose(f_dt);
    end
end
However, it runs very slowly because the text file contains around 100 thousand lines. Is there any way I could do this task smarter and faster?
You can use textscan to read the file and find the indices (line numbers) of the labels you want. Once you know the line numbers, you can extract the lines you need.
fid = fopen('data.txt') ;
S = textscan(fid,'%s','delimiter','\n') ;
S = S{1} ;
fclose(fid) ;
%% get bicycle lines
idx = strfind(S, 'bicycle');
idx = find(not(cellfun('isempty', idx)));
S_bicycle = S(idx)
%% write this to text file
fid2 = fopen('text.txt','wt') ;
fprintf(fid2,'%s\n',S_bicycle{:});
fclose(fid2) ;
From S_bicycle, you can extract your numbers.
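If the goal is to replace each class name with its index and drop unknown classes, a vectorized variant may avoid the per-line loop entirely. The following is only a sketch: the label list lbls stands in for the output of loadlbl(), and the sample file and its values are made up so that the snippet runs on its own.

```matlab
% Build a tiny sample file so the sketch is runnable on its own.
inFile  = [tempname '.txt'];
outFile = [tempname '.txt'];
fid = fopen(inFile, 'w');
fprintf(fid, '30,311.25,158.5,20.5,17.25,0.5,aeroplane\n');
fprintf(fid, '30,251.5,216.75,19.5,20.25,3.5,bicycle\n');
fprintf(fid, '30,10.5,20.5,30.5,40.5,0.25,unknownclass\n');
fclose(fid);

lbls = {'aeroplane', 'bicycle'};   % hypothetical stand-in for loadlbl()

% Read all columns at once: six numeric fields, then the class name.
fid = fopen(inFile, 'r');
C = textscan(fid, '%f%f%f%f%f%f%s', 'Delimiter', ',');
fclose(fid);

% clsIdx is the position of each class name in lbls, 0 if it is absent.
[known, clsIdx] = ismember(C{7}, lbls);

% Keep only rows with a known label and append the numeric index.
nums = [C{1:6}];
out  = [nums(known, :), clsIdx(known)];

% Write every row with one fprintf call (each column of out.' is one row).
fid = fopen(outFile, 'w');
fprintf(fid, '%d,%.12g,%.12g,%.12g,%.12g,%.12g,%d\n', out.');
fclose(fid);
```

The output file then contains only numbers, so load(outFile) should be able to read it. The %.12g format is an assumption about how many digits are worth keeping; adjust it to the precision you need.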
How do I read sequentially numbered .htm files using the correct file name and path?
path:DATA\WEBPAGE_SOURCE\train75_phish_data\1.htm
file:1.htm,2.htm,3.htm....etc
The files 1.htm, 2.htm, 3.htm, etc. contain the source code of web pages.
I tried the following, but got an error when i=21.
data2=fopen(strcat('DATA\WEBPAGE_SOURCE\train75_phish_data\',int2str(i),'.htm'),'r')
I have referred to the following documentation, but it still does not work. Any ideas?
http://www.mathworks.com/help/matlab/ref/fopen.html
Here is my code:
data = importdata('DATA/URL/trainURL')
domain_URL = regexp(data,'\w*://[^/]*','match','once')
[sizeData b] = size(domain_URL);
for i = 1:150
A7_data = domain_URL{i};
data2=fopen(strcat('DATA\WEBPAGE_SOURCE\train75_phish_data\',int2str(i),'.htm'),'r')
CharData = fread(data2, '*char')'; %read text file and store data in CharData
img_only = regexp(CharData, '<img.*?>', 'match');
feature7_data=(cellfun(@(n) isempty(n), strfind(img_only, A7_data)))
B7(i)=sum(feature7_data)
end
feature7(B7>=10)=1;
feature7(B7<10&B7>5)=0;
feature7(B7<=5)=-1;
feature7'
Here, data = importdata('DATA/URL/trainURL') loads a saved list of URLs.
The loop works up to i=20 but fails with an error at iteration 21. I want to loop up to 150, but 'data2' cannot be read for i=21.
I think you need to handle the exceptions that can come up in a more principled way. Try this:
data = importdata('DATA/URL/trainURL')
domain_URL = regexp(data,'\w*://[^/]*','match','once')
[sizeData b] = size(domain_URL);
for i = 1:150
A7_data = domain_URL{i};
filename = fullfile('DATA\WEBPAGE_SOURCE\train75_phish_data\',strcat(int2str(i),'.htm'));
if (exist(filename,'file')),
disp(sprintf('file %s exists, processing it',filename));
data2=fopen(filename,'r');
CharData = fread(data2, '*char')'; %read text file and store data in CharData
fclose(data2);
img_only = regexp(CharData, '<img.*?>', 'match');
feature7_data=(cellfun(@(n) isempty(n), strfind(img_only, A7_data)))
B7(i)=sum(feature7_data)
else,
disp(sprintf('file %s does not exist, skipping it!',filename));
end
end
feature7(B7>=10)=1;
feature7(B7<10&B7>5)=0;
feature7(B7<=5)=-1;
feature7'
Note that I also added a call to fclose(data2) after the line that does the fread.
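Another way to see why a particular file fails is to check the return values of fopen directly: it returns -1 as the file identifier when the file cannot be opened, and its second output carries a system error message. The sketch below uses a made-up path that is assumed not to exist, purely for illustration.

```matlab
% Hypothetical path that is assumed not to exist on this machine.
missing = fullfile(tempdir, 'no_such_dir_for_demo', '21.htm');

% The second output of fopen is a message describing why the open failed.
[fid, msg] = fopen(missing, 'r');
if fid == -1
    fprintf('Could not open %s: %s\n', missing, msg);
else
    charData = fread(fid, '*char')';   % read the whole file as text
    fclose(fid);                       % always release the file identifier
end
```

Printing the message inside the loop should show whether iteration 21 fails because the file is missing, because the name is wrong, or because too many files were left open.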
I am looking for MATLAB code that does some routine (updates a file.m), if file.csv is edited more recently than file.m.
Something that should look like:
% Write time extraction
tempC = GetFileTime('file.csv', [], 'Write');
tempdateC = tempC.date
tempM = GetFileTime('file.m', [], 'Write');
tempdateM = tempM.date
% Write time comparison
if numel(dir('file.m')) == 0 || tempdateC >= tempdateM
matDef = regexprep(fileread('file.csv'), '(\r\n|\r|\n)', ';\n');
f = fopen('file.m', 'w');
fwrite(f, ['Variable = [' matDef(1:end) '];']);
fclose(f);
end
The lines for timestamp extraction seem to be incorrect MATLAB code. The rest works (Evaluate variables in external file strings).
You can extract the modification time of a file using MATLAB's dir command. Something like:
function modTime = GetFileTime(fileName)
listing = dir(fileName);
% check we got a single entry corresponding to the file
assert(numel(listing) == 1, 'No such file: %s', fileName);
modTime = listing.datenum;
end
Note that the output is in MATLAB's datenum serial date format.
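Since datenum values are plain doubles, the comparison from the question can be written with ordinary relational operators. The following is a sketch only: it works in a temporary folder with a made-up two-line file.csv so that it can be run as is, and it reuses the regeneration step from the question unchanged.

```matlab
% Work in a scratch folder with a small sample CSV (illustrative data).
workDir = tempname;
mkdir(workDir);
csvFile = fullfile(workDir, 'file.csv');
mFile   = fullfile(workDir, 'file.m');
fid = fopen(csvFile, 'w');
fprintf(fid, '1,2\n3,4\n');
fclose(fid);

csvInfo = dir(csvFile);
mInfo   = dir(mFile);    % empty struct array when file.m does not exist yet

% Regenerate when file.m is missing or file.csv is at least as new.
needsUpdate = isempty(mInfo) || csvInfo.datenum >= mInfo.datenum;
if needsUpdate
    matDef = regexprep(fileread(csvFile), '(\r\n|\r|\n)', ';\n');
    fid = fopen(mFile, 'w');
    fwrite(fid, ['Variable = [' matDef '];']);
    fclose(fid);
end
```

The short-circuit || matters here: mInfo.datenum is only evaluated when file.m actually exists.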
My code has two parts. The first part opens files automatically and is programmed like this:
fichierref = 'H:\MATLAB\Archive_08112012';
files = dir(fullfile(fichierref, '*.txt'));
numberOfFiles = numel(files);
delimiterIn = ' ';
headerlinesIn = 11;
for d = 1:numberOfFiles
filenames(d) = cellstr(files(d).name);
end
for i=1:numberOfFiles
data = importdata(fullfile(fichierref,filenames{i}),delimiterIn,headerlinesIn);
end
Later on, I want the user to select the files for analysis. There's a problem with this, though. I typed the lines as follows:
reference = warndlg('Choose the files from which you want to know the magnetic field');
uiwait(reference);
filenames = cellstr(uigetfile('./*.txt','MultiSelect', 'on'));
numberOfFiles = numel(filenames);
delimiterIn = ' ';
headerlinesIn = 11;
It's giving me the following error, after I press OK on the prompt:
Error using cellstr (line 34)
Input must be a string.
Error in FreqVSChampB_no_spec (line 128)
filenames = cellstr(uigetfile('./*.txt','MultiSelect', 'on'));
Does anyone have an idea why it's doing that?
You do not need the cellstr command for the output of uigetfile in 'MultiSelect' mode: when several files are selected, the output is already a cell array (see the documentation of uigetfile).
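Keep in mind that uigetfile does not always return the same type: it returns 0 when the dialog is cancelled and a character vector when only one file is chosen, even with 'MultiSelect' on. A small normalization step, sketched below, can cover all three cases. The samples variable simulates the possible return values so that the snippet runs without a dialog; in real code selection would come from uigetfile.

```matlab
% Simulated return values of uigetfile: cancelled, one file, several files.
samples = {0, 'a.txt', {'a.txt', 'b.txt'}};
counts  = zeros(1, numel(samples));

for k = 1:numel(samples)
    selection = samples{k};          % in real code: the output of uigetfile
    if isequal(selection, 0)
        filenames = {};              % dialog was cancelled
    elseif ischar(selection)
        filenames = {selection};     % a single file: wrap it in a cell
    else
        filenames = selection;       % several files: already a cell array
    end
    counts(k) = numel(filenames);    % number of files to process
end
```

After this step filenames is always a cell array, so numel(filenames) and filenames{i} behave the same way in every case.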
I want to add a property-value pair to an existing file. At the same time, all the properties should be kept in alphabetical order. For example:
[Info] % property 1
value 1
[system] % property 2
value 2
How can I add an additional property such that all properties stay sorted in alphabetical order? I was able to add a property-value pair to the end of the file using
fh = fopen(filename,'a'), but I am not able to sort the properties alphabetically.
So far I have tried the following, but it keeps printing only the new property-value pair. I want it to print the remaining properties once it has printed the new one.
function [] = myfun(filename ,propName,propvalue)
rfh = fopen(filename,'r');
tname = tempname();
wfh = fopen(tname,'w');
line = fgetl(rfh);
while ischar(line)
if (line(1) == '[') && (line(end) == ']')
property = lower(line(2:end-1)) % from ini file
String2 = property;
String1 = propName;
[sat] = sor(String1,String2)% subfunction
if sat == -1
fprintf(wfh,'[%s]\r\n%s\r\n',propName,propvalue);
else
fprintf(wfh,'%s\r\n',line);
end
else
fprintf(wfh,'%s\r\n',line);
end
line = fgetl(rfh);
end
fclose(rfh);
fclose(wfh);
movefile(tname,filename,'f')
function [sat] = sor(String1,String2)
Index = 1;
while Index < length(String1) && Index < length(String2) && String1(Index) == String2(Index)
Index = Index + 1;
end
% Return the appropriate code
if String1(Index) < String2(Index)
sat= -1
elseif String1(Index) > String2(Index)
sat= +1
else % the characters at this position are equal -- the shorter of the two strings should be "less than"
if length(String1) == length(String2)
sat = 0
elseif length(String1) < length(String2)
sat = -1
else
sat = +1
end
end
Is this a .ini file? You might want to take a look at INIConfig from the MATLAB File Exchange, a set of routines for handling INI files arranged in a convenient class. I haven't used it, but perhaps it might do what you need.
If not, you can always:
Read in the file
Loop through it line by line
When you find a line starting with [ followed by a word alphabetically later than the property you'd like to insert, insert your property and value
Include the remainder of the file
Write the whole file back out again.
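The steps above can be sketched as follows. This is a minimal illustration rather than a complete solution: the file contents are held in a cell array named lines (one line per cell) instead of being read from disk, the existing properties are assumed to be sorted already, and the new pair 'net' / 'value 3' is made up.

```matlab
% File contents, one line per cell (stand-in for reading the real file).
lines = {'[info]'; 'value 1'; '[system]'; 'value 2'};
propName  = 'net';        % hypothetical new property
propValue = 'value 3';    % hypothetical new value

% Locate the header lines of the form [name] and extract their names.
isHeader  = ~cellfun('isempty', regexp(lines, '^\[.*\]$', 'once'));
headerIdx = find(isHeader);
names = lower(cellfun(@(s) s(2:end-1), lines(headerIdx), ...
    'UniformOutput', false));

% Rank of the new name among the existing ones (case-insensitive).
[~, order] = sort([names; {lower(propName)}]);
pos = find(order == numel(names) + 1);

% Insert before the first header that sorts later, or append at the end.
if pos <= numel(names)
    at = headerIdx(pos);
else
    at = numel(lines) + 1;
end
newLines = [lines(1:at-1); {['[' propName ']']; propValue}; lines(at:end)];
```

Writing the result back out can then be done with a single call such as fprintf(fid, '%s\r\n', newLines{:}).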
How about reading the file into a struct?
function fileData = readFileIntoStruct( fileName )
%
% read [property] value pairs file into struct
%
fileData = struct(); % start empty so a struct is returned even for an empty file
fh = fopen( fileName, 'r' ); % read handle
line = fgetl( fh );
while ischar( line )
    % property: capture the name between the square brackets
    tkn = regexp( line, '\[([^\]]+)\]', 'once', 'tokens' );
    % read next line for value
    val = fgetl( fh );
    fileData.(tkn{1}) = val;
    line = fgetl( fh ); % keep reading
end
fclose( fh ); % don't forget to close the file at the end.
Now you have all the data as a struct with properties as fieldnames and values as the field value.
Now you can update a property simply by:
function fileData = updateProperty( fileData, propName, newVal )
if isfield( fileData, propName )
fileData.(propName) = newVal;
else
warning( 'property %s does not exist - please add it first', propName );
end
You can add a property:
function fileData = addProperty( fileData, propName, newVal )
if ~isfield( fileData, propName )
fileData.(propName) = newVal;
else
warning ( 'property %s already exists, use update to change its value', propName );
end
You can sort the properties alphabetically using orderfields:
fileData = orderfields( fileData );
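One detail to be aware of: orderfields sorts names in ASCII order, so uppercase letters come before lowercase ones. The sketch below shows this with made-up property names and, as one possible workaround, passes a permutation vector computed from the lowercased names to get a case-insensitive order.

```matlab
% Made-up properties, added in unsorted order.
s = struct();
s.system = 'value 2';
s.Info   = 'value 1';
s.alpha  = 'value 3';

% Default behaviour: ASCII order, so 'Info' sorts before 'alpha'.
sorted = orderfields(s);
names  = fieldnames(sorted);

% Case-insensitive variant: sort the lowercased names and reuse the permutation.
[~, perm] = sort(lower(fieldnames(s)));
sortedCI  = orderfields(s, perm);
namesCI   = fieldnames(sortedCI);
```

Here names comes out as Info, alpha, system, while namesCI comes out as alpha, Info, system.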
You can write the struct back to file simply using:
function writeDataToFile( newFileName, fileData )
fh = fopen( newFileName , 'w' ); %write handle
propNames = fieldnames( fileData );
for ii = 1:numel( propNames )
    fprintf( fh, '[%s]\r\n%s\r\n', propNames{ii}, fileData.(propNames{ii}) );
end
fclose( fh );
Assumptions:
The properties' names are legitimate Matlab field names (see variable naming for details).
The value of each property is always a string.
I did not include any error-checking code in these examples (files not found, wrongly formatted strings, etc.)
I assume the input file is strictly "[prop] val" pairs without any additional comments etc.
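The first assumption can be checked before a name is used as a field name. The sketch below uses isvarname on a few made-up property names; matlab.lang.makeValidName (available in newer MATLAB releases) is shown as one possible way to repair a name, though the repaired name would no longer match the text in the file.

```matlab
% Made-up property names: the last two are not valid field names.
names = {'Info', 'system', '2fast', 'my-prop'};

% isvarname is true only for names that can be used as identifiers.
ok = cellfun(@isvarname, names);

% Report the names that would break fileData.(name).
bad = names(~ok);
for k = 1:numel(bad)
    fprintf('Not usable as a field name: %s\n', bad{k});
end

% Optional repair step; the result differs from the original name.
fixed = matlab.lang.makeValidName(bad);
```

If such names can occur in your files, a containers.Map keyed by the original strings may be a better fit than a struct.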