Read from txt file in Matlab - matlab

I'm having problems reading in from a txt file in MATLAB. The txt file is an online review, so the delimiter I want to use is just a single whitespace. I've tried using dlmread, textscan and textread but can't seem to get it to work. I want each word in the txt file to be in a separate cell in an array. How do I go about this?
Thanks
EDIT, this is the txt file
My husband and I satayed for two nights at the Hilton Chicago,and
enjoyed every minute of it! The bedrooms are immaculate,and the
linnens are very soft. We also appreciated the free wifi,as we could
stay in touch with friends while staying in Chicago. The bathroom was
quite spacious,and I loved the smell of the shampoo they provided-not
like most hotel shampoos. Their service was amazing,and we absolutely
loved the beautiful indoor pool. I would recommend staying here to
anyone.

textread('your_filename', '%s') should work.

If all else fails (other answers already seem good, but you specifically said the functions they proposed do not work), try something like this:
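fid = fopen('test.txt');
for i = 1:1000
    A{i} = fscanf(fid, '%s', 1);   % read one whitespace-delimited word
end
fclose(fid);
Just make sure the loop runs at least as many times as there are words in the file. Once the file is exhausted, fscanf returns an empty result, so any extra cells are simply empty.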
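If you would rather not guess the number of words, textscan with a '%s' format reads every whitespace-delimited token until the end of the file. Here is a minimal sketch; the sample text and the temporary file name are made up so that the snippet runs on its own, so substitute your own file name.

```matlab
% Write a small sample file so the sketch is self-contained (hypothetical content).
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'We stayed for two nights\nand enjoyed every minute.\n');
fclose(fid);

% Read every whitespace-delimited token into a cell array of strings.
fid = fopen(fname, 'r');
C = textscan(fid, '%s');
fclose(fid);
words = C{1};      % column cell array, one word per cell

delete(fname);     % remove the temporary file
```

Punctuation stays attached to the words (the last cell here is 'minute.'), so you may still want to clean the tokens afterwards.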

Related

Converting multiple text file to Mat file in Matlab

I have multiple text files like Symbol1010, Symbol1020...SymbolXXXX.
I want to know if there is an easy way to convert those files into mat files.
Specifications:
All the files have the same header (strings) in the first row.
All the files have the date in their first column
All the files have the same number of rows and columns.
I tried using importdata and it works well for a single file.
If "importdata" works well for your files I would strongly suggest using it in a loop. If you encounter problems while implementing that, please be more specific in your question. Below is a sample that might be a good starting point.
prefix = 'Symbol';
suffixes = (1010:10:1100);
for idx = 1 : length(suffixes)
filename = [prefix, num2str(suffixes(idx))];
A = importdata(filename);
save(filename,'A');
end
Your question is missing quite a lot of detail so I can only give you a general answer, but I'm going to assume that you already know that you should put the single-file code in a loop and that in your single-file example you currently hardcode the name of the file.
Your first problem would then be how to get the list of files. The functions you want are dir and possibly fullfile, you should check out the documentation by typing doc dir in the console. Matlab has extensive documentation and you can often find answers in there very quickly indeed.
If you need more specific answers you would need to post the code that you have so far, a description of what you want to happen and what is happening. I recommend the stackoverflow.com/tour as a good introduction to how to pose a good question.
Thanks michael and xenoclast for the help. I got this
d = dir('*.txt');
nfiles = length(d);
%Conversion of data in text format to Mat format
data = cell(1, nfiles);
for k = 1:nfiles
data{k} = importdata(d(k).name);
end
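Note that the loop above only loads the files into memory. If you also want one .mat file per text file, something along these lines might work. It is only a sketch: the temporary folder and the two tiny numeric files are invented so that the snippet runs on its own, and your real files (with a header row and a date column) will come back from importdata as a struct rather than a plain matrix.

```matlab
% Build a temporary folder with two tiny numeric text files (hypothetical data).
folder = tempname;
mkdir(folder);
for n = [1010 1020]
    fid = fopen(fullfile(folder, sprintf('Symbol%d.txt', n)), 'w');
    fprintf(fid, '1 2 3\n4 5 6\n');
    fclose(fid);
end

% Convert every .txt file in the folder to a .mat file with the same base name.
d = dir(fullfile(folder, '*.txt'));
for k = 1:numel(d)
    A = importdata(fullfile(folder, d(k).name));
    [~, base] = fileparts(d(k).name);          % strip the .txt extension
    save(fullfile(folder, [base, '.mat']), 'A');
end
```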

MATLAB - Stitch Together Multiple Files

I am new to MATLAB programming and some of the syntax escapes me. So I need a little help. Plus I need some complex looping ideas.
Here's the breakdown of what I have:
12 separate .dat files, each titled something like output_1_x.dat, output_2_x.dat, etc.
each file is actually one piece of a whole that was separated and processed
each .dat file is approx. 3.9 GB
Here's what I need to do:
create a single file containing all the data from each separate file, i.e. I need to recreate the original file.
call this complete output file something like output_final.dat
it has to be done in MATLAB, there are no other alternatives (actually there may be; see note below)
What is implied:
I will have to fread each 3.9 GB file in chunks or packets, probably 100 MB at a time (using an embedded loop?)
these packets will have to be read then written sequentially
after one file is read then written into output_final.dat, the next file is automatically read & written (the master loop).
Well, that's pretty much it. I did a search for 'merging multiple files' and found this. That isn't exactly what I need to do...I don't need to take part of a file, or data from files, and write it to a new one. I'm simply...concatenating...? This would be simple in Java or Perl, but I only have MATLAB as a tool.
Note: I am however running KDE in OpenSUSE on a pretty powerful box. Maybe someone who is also an expert in terminal knows a command/script to do this from the kernel?
So on this site we usually would point you to whathaveyoutried.com but this question is well phrased.
I won't write the full code, but I will describe how I would do it. First, I am a bit confused about why you need to fread the file. Are you just appending one file onto the end of another?
You can actually use unix commands to achieve what you want:
files = dir('*.dat');
for i = 1:length(files)
string = sprintf('cat %s >> output_final.dat.temp', files(i).name);
unix(string);
end
That code should loop through all the files and append the content of each one to output_final.dat.temp. The .temp suffix keeps the output from matching *.dat; rename the file once the loop has finished.
But if you really want to use fread because you want to parse the lines in some manner then you can use the same process:
files = dir('*.dat');
fidF = fopen('output_final.dat', 'w');
for i = 1:length(files)
    fid = fopen(files(i).name);
    while ~feof(fid)
        string = fgetl(fid);             % You may choose to parse the string in some manner here
        fprintf(fidF, '%s\n', string);   % fgetl strips the newline, so write it back
    end
    fclose(fid);
end
fclose(fidF);
Just remember, if you are not parsing the lines this will take much much longer.
Hope this helps.
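If the .dat files are binary, reading them line by line is not a good fit. Below is a minimal sketch of the chunked fread/fwrite loop the question describes. The file names, the tiny payloads and the 4-byte chunk size are assumptions chosen so the snippet runs on its own; for 3.9 GB pieces you would use something like 100e6 bytes per chunk. The list of parts is written out explicitly here because dir may return names in alphabetical order, which would put output_10 before output_2.

```matlab
% Create two small binary pieces so the sketch runs on its own (hypothetical names).
folder = tempname;
mkdir(folder);
parts = {fullfile(folder, 'output_1_x.dat'), fullfile(folder, 'output_2_x.dat')};
payload = {uint8(1:10), uint8(11:25)};
for k = 1:numel(parts)
    fid = fopen(parts{k}, 'w');
    fwrite(fid, payload{k}, 'uint8');
    fclose(fid);
end

% Concatenate the pieces in order, copying one fixed-size chunk at a time.
chunkSize = 4;   % bytes per read; tiny here, far larger for multi-GB files
outName = fullfile(folder, 'output_final.dat');
fout = fopen(outName, 'w');
for k = 1:numel(parts)
    fin = fopen(parts{k}, 'r');
    while true
        buf = fread(fin, chunkSize, '*uint8');   % returns fewer bytes near EOF
        if isempty(buf), break; end              % nothing left in this piece
        fwrite(fout, buf, 'uint8');
    end
    fclose(fin);
end
fclose(fout);
```

Only one chunk is held in memory at any time, so the size of the individual pieces does not matter.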
I suggest using matlab.io.MatFile objects on two of the files:
matObj1 = matfile('datafile1.mat', 'Writable', true);   % existing files open read-only by default
matObj2 = matfile('datafile2.mat');
This does not load any data into memory. Then you can use the objects to sequentially save a variable from one file to another.
matObj1.varName = matObj2.varName;
You can get all the variables in one file with fieldnames(matObj1) and loop through to copy contents from one file to another. You can then clear some space by removing the copied fields. Or you can use a slightly riskier procedure by directly moving the data:
matObj1.varName = rmfield(matObj2,'varName')
Just a disclaimer: haven't tried it, use at own risk.

Structure in Matlab (I can't find a proper title !)

In my scenario, I have 100 nodes. Each time, a random node among them generates some data. I wish to record the data in previously created files.
I have been using a switch-case to open the particular file associated with a node. However, it's already clumsy for 100 nodes, and I need to increase the number of nodes. I was looking for a straightforward way of opening a file based on the node number. I found a bit of a hint here:
Stackoverflow_a_year_ago
But I'm unable to pick and open a particular file, say if the random node is 125, I'll open n125.txt file. Any help is appreciated. Here goes the code:
number_of_nodes=100;
for i=1:number_of_nodes
rand_node=ceil(rand(1,1)*100);
rand_output=ceil(rand(1,1)*10);
switch(rand_node)
case{1}
f1=fopen('n1.txt', 'a+');
fprintf(f1, rand_output);
fclose(f1);
case{2}
f2=fopen('n2.txt', 'a+');
fprintf(f2, rand_output);
fclose(f2) ;
end
end
Also, tried,
%..........................................
Names = dir('myprog*.TXT');
Names.name; %returns all file names.
Maybe I'm misunderstanding your question but the answer seems obvious:
fid = fopen(sprintf('n%d.txt', rand_node), 'a+');
fprintf(fid, '%d\n', rand_output);   % explicit format; a bare number would be written as a character code
fclose(fid);
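Put together, the whole switch block collapses into a few lines. This is just a sketch: the temporary folder and the number of events are assumptions so that it runs without touching your existing files, and randi is used in place of ceil(rand*N).

```matlab
% Append each random output to the file named after its node (n<node>.txt).
folder = tempname;          % hypothetical scratch folder for the demo
mkdir(folder);
number_of_nodes = 100;
number_of_events = 20;      % hypothetical number of generated values
for i = 1:number_of_events
    rand_node = randi(number_of_nodes);    % node that produced the data
    rand_output = randi(10);               % value to record
    fname = fullfile(folder, sprintf('n%d.txt', rand_node));
    fid = fopen(fname, 'a');               % 'a' creates the file if needed
    fprintf(fid, '%d\n', rand_output);
    fclose(fid);
end
```

Increasing number_of_nodes needs no further changes, since the file name is built from the node number.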

Split large txt file into more txtfiles

I have a txt file that is approximately 1000 KB. Now I want to use Objective-C to split it into 10 txt files of 100 KB each.
I haven't really worked with NSRange. Well I know how it works, but then to read from a given location with the length: 'to end of file'... I've no idea how to do that.
Some code on how to split this into multiple 100kb txt file would really help me out here.
Thank you in advance.
HW:
Since you tagged this question with "iphone", I would suggest that your best approach is NOT to read in the big file first and then go about segmenting it. You don't want to be a memory bully.
Other than that, this question has already been asked and answered here.

Reading large csv files with strings containing commas as one field

I have a large .csv file (~26000 rows). I want to be able to read it into MATLAB. Another problem is that it contains a collection of strings delimited by commas in one of the fields.
I'm having trouble reading it. I tried stuff like tdfread, which won't work here. Any tricks with textscan I should be aware of?
Is there any other way?
I'm not sure what is generating your CSV file but that is your problem.
The point of a CSV file, is that the file itself designates separation of fields. If the text of the CSV contains commas, then nothing you can do will help you. How would ANY program know when the text in a single field contains commas, or when that comma is a field delimiter?
Proper CSV would have a text qualifier. Some generators/readers give you the option to use one. The standard text qualifier is a " (quote). It's changeable, though, because your text may contain those, too.
Again, it's all about generating proper CSV content.
There's a chance that xlsread won't give you the answer you expect -- do the strings always appear in the same columns, for example? I think (as everyone else seems to :-) that it would be more robust to just use
fid = fopen('yourfile.csv');
and then either textscan
t = textscan(fid, '%s', 'delimiter', sprintf('\n'));
t = t{1};
or just fgetl (the example in the help is perfect).
After that you can do some line-by-line processing -- using textscan again on the text content of each line, for example, is a nice, quick way to get a cell-array that will allow fast analysis of each line.
You have a problem because you're reading it in as a .csv, and you have commas within your data. You can open it in Excel and manipulate the data, possibly removing the unwanted commas with Excel formulas. I work with .csv files for DB imports quite a bit. I imagine MATLAB has similar rules, namely no commas in your data.
Can you tell us more about your data? Are there commas throughout, or just in one column? Maybe you can read it in as tab delimited?
Are you using a Unix system? The reason I am asking is that you could use a command-line function such as sed and regular expressions to clean those data files before you pass them into Matlab. Here is a link that explains how to do exactly what you are looking for.
Since, as others have observed, your file is CSV with commas inside what you think of as a single field, it's going to be hard to persuade Matlab that that really is only one field. I think your best strategy is going to be to read one line at a time, into a string acting as a buffer, and to translate it, field-by-field, into the variables or other data structures that you want. Since Matlab has in-built regular expression capabilities this shouldn't be too hard.
And, as others have already suggested, posting a sample of your data would help us to help you.
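As a rough illustration of the line-by-line idea, here is a sketch that splits one line with regexp. It assumes that the comma-containing field is wrapped in double quotes, which may not be true for your file, and it ignores empty fields and escaped quotes. The sample line is made up.

```matlab
% One CSV line whose second field contains commas inside double quotes (made-up data).
csvLine = '42,"red, green, blue",3.14';

% Match either a complete quoted field or a run of characters without commas.
fields = regexp(csvLine, '"[^"]*"|[^,]+', 'match');

% Strip the surrounding quotes from the quoted fields.
fields = regexprep(fields, '^"|"$', '');

% Convert the numeric fields as needed.
id = str2double(fields{1});
value = str2double(fields{3});
```

You would run this on each line returned by fgetl or by the textscan call above, then collect the results in whatever structure suits your analysis.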
One easy solution is:
folder = 'C:\folder1\folder2\';
file = 'data.csv';
data = dataset('xlsfile', fullfile(folder, file));
Of course you could also do the following:
[file, folder] = uigetfile('C:\folder1\folder2\*.csv');
data = dataset('xlsfile', fullfile(folder, file));
Now you will have loaded the data as a dataset. An easy way to get column 1, for example, is
double(data(:,1))