Read table from text file - MATLAB

I'm trying to import data from a text file into the workspace by using the readtable function.
The text file structure is pretty simple, consisting of 4 columns of types date, time, integer and float respectively, as shown in the following minimal example:
2013-07-07 05:15:19 8 213.0
2013-07-07 05:15:19 11 109.0
2013-07-07 05:15:20 14 33.5
2013-07-07 05:15:24 56 182.0
When I try to load the data like this:
data = readtable(filename,...
'Format','%{yyyy-MM-dd}D %{HH:mm:ss}D %d %f %*[^\n]',...
'ReadVariableNames',false);
I get the following error:
Error using textscan
Badly formed format string.
Error in table/readTextFile (line 160)
raw = textscan(fid,format,'delimiter',delimiter,'whitespace',whiteSpace, ...
Error in table.readFromFile (line 41)
t = table.readTextFile(filename,otherArgs);
Error in readtable (line 114)
t = table.readFromFile(filename,varargin);
If I try this instead:
data = readtable(filename,...
'Format','%{yyyy-MM-dd}D%{HH:mm:ss}D%d%f%*[^\n]',...
'Delimiter',' ',...
'ReadVariableNames',false);
I get exactly the same error.
I've checked the MathWorks online documentation, but I was unable to find a solution to my problem.
EDIT: Actually, the desired table format would be to have a datetime column replacing the date and time columns. What I'm doing is joining date and time manually after reading the table. If you know a way to import the table and merge those 2 variables straight away, that would be great.

Initially, if you do this with your data format:
data = readtable('data.txt','Delimiter',' ','ReadVariableNames',false)
You will get an Nx4 table that you can manipulate as much as you like.
You can read about how to manipulate data imported as a table here
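Regarding the edit about getting a single datetime column: a minimal sketch, assuming the four space-separated columns shown above and the default Var1/Var2 names that readtable assigns when ReadVariableNames is false, would be to read date and time as text and merge them after import:
data = readtable('data.txt','Delimiter',' ','ReadVariableNames',false,...
    'Format','%s %s %d %f');
% combine the date and time text columns into a single datetime variable
data.Timestamp = datetime(strcat(data.Var1,{' '},data.Var2),...
    'InputFormat','yyyy-MM-dd HH:mm:ss');
data.Var1 = []; % drop the original date column
data.Var2 = []; % drop the original time column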

Related

Read text file in MATLAB for data analysis

I have uploaded the file here. These are some lines from my txt file:
RSN1146_KOCAELI_AFY000 1.345178e-02
RSN1146_KOCAELI_AFY090 1.493577e-02
RSN1146_KOCAELI_AFYDWN 5.350641e-03
RSN4003_SANSIMEO_25862-UP 4.869095e-03
RSN4003_SANSIMEO_25862090 1.199087e-02
RSN4003_SANSIMEO_25862360 1.181286e-02
I would like to remove the entries containing DWN (3rd line) and -UP (4th line), so that the data only has:
RSN1146_KOCAELI_AFY000 1.345178e-02
RSN1146_KOCAELI_AFY090 1.493577e-02
RSN4003_SANSIMEO_25862090 1.199087e-02
RSN4003_SANSIMEO_25862360 1.181286e-02
Then, I want to obtain the maximum value for RSN1146 & RSN4003.
I tried to read the file with the code below:
Data=fopen('maxPGA.txt','r');
readfile=fscanf(Data,'%c %s')
It is weird, as I cannot perform further analysis because the data is not imported as 2 columns in MATLAB. Any solution for this?
I tried:
Data= importdata('maxPGA.txt')
as well, but the data are grouped into 2 different tables in this case.
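Not an answer from the original thread, but a minimal sketch of one way to read the two columns and apply the filtering described above (the file name maxPGA.txt is taken from the question):
fid = fopen('maxPGA.txt','r');
C = textscan(fid,'%s %f'); % column 1: record name, column 2: value
fclose(fid);
names = C{1};
values = C{2};
% keep only rows whose name contains neither 'DWN' nor '-UP'
keep = cellfun(@isempty,strfind(names,'DWN')) & ...
       cellfun(@isempty,strfind(names,'-UP'));
names = names(keep);
values = values(keep);
% maximum value per record group
maxRSN1146 = max(values(strncmp(names,'RSN1146',7)));
maxRSN4003 = max(values(strncmp(names,'RSN4003',7)));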

Write Matrix Data to Each Member of Datatype in HDF5 file via MATLAB

This is my first go at trying to create an HDF5 file from scratch using the Low-Level commands via MATLAB.
My issue is that I am having a hard time writing data to each specific member of the compound datatype in my dataset.
First, I create a new HDF5 file, and set the right layer of groups:
new_h5 = H5F.create('new_hdf5_file.h5','H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
new_h5 = H5G.create(new_h5,'first','H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');
new_h5 = H5G.create(new_h5,'second','H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');
Then, I create my datatype:
datatype = H5T.create('H5T_compound',20);
H5T.insert(datatype,'first_element',0,'H5T_NATIVE_INT');
H5T.insert(datatype,'second_element',4,'H5T_NATIVE_DOUBLE');
H5T.insert(datatype,'third_element',12,'H5T_NATIVE_DOUBLE');
Then, I format that into my dataset:
new_h5 = H5D.create(new_h5,'location',datatype,H5S.create('H5S_SCALAR'),'H5P_DEFAULT');
subset = H5D.get_type(H5D.open(new_h5,'/first/second/location'));
mem_type = H5T.get_member_type(subset,0);
I receive an error with the following command:
H5D.write(mem_type,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data);
Error using hdf5lib2
Unhandled HDF5 class (H5T_NO_CLASS) encountered. It is not possible to write to this attribute or dataset.
So, I try this method instead:
new_h5 = H5D.create(new_h5,'location',datatype,H5S.create_simple(2,dims,dims),'H5P_DEFAULT'); %where dims are the dimensions of all matrices of data structure
H5D.write(mem_type,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data); %where data is a structure
I receive an error with this following command:
H5D.write(mem_type,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data);
Error using hdf5lib2
Attempted to transfer too many values to or from the library buffer.
When looking here for the XML tags for the error messages, it describes the above error as "illegalArrayAccess." Apparently, according to this question, you can only write to 4 members without the buffer throwing an error?
Is this correct? How can I correctly write to each member? I am about to reach my mental limit trying to figure this one out.
EDIT:
References kept here for general information:
HDF5 Compound Datatypes Example
HDF5 Compound Datatypes
H5D.write MATLAB Command
I found out why I could not write the data, and I have solved the problem. I had my dimensions set incorrectly (code I forgot to include originally). My apologies. I had my dimensions like this:
dims = fliplr(size(data_matrix));
Where data_matrix was a 15x250 matrix, so dims came out as [250 15]. The error was that the buffer was unable to write a 250x15 matrix for each member, because it only had data for a 250x1 vector per member.
The following code will (generically) work for writing data to each member:
new_h5 = H5F.create('new_hdf5_file.h5','H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
new_h5 = H5G.create(new_h5,'first','H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');
new_h5 = H5G.create(new_h5,'second','H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');
datatype = H5T.create('H5T_compound',20);
H5T.insert(datatype,'first_element',0,'H5T_NATIVE_INT');
H5T.insert(datatype,'second_element',4,'H5T_NATIVE_DOUBLE');
H5T.insert(datatype,'third_element',12,'H5T_NATIVE_DOUBLE');
dims = fliplr(size(data_matrix)); dims = [1 dims(1,2)];
new_h5 = H5D.create(new_h5,'location',datatype,H5S.create_simple(2,dims,dims),'H5P_DEFAULT');
H5D.write(new_h5,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data_structure);
where data_matrix is a 15x250 matrix containing all the data, and data_structure is a structure containing 15 fields, each 250x1 in size.
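For reference, a hypothetical sketch of how such a structure could be assembled from the matrix, assuming each compound member corresponds to one row of data_matrix (only the three member names from the example datatype are shown; the real file had 15, and the integer member may additionally need an int32 cast to match H5T_NATIVE_INT):
member_names = {'first_element','second_element','third_element'};
data_structure = struct();
for k = 1:numel(member_names)
    % one 250x1 column of data per compound member
    data_structure.(member_names{k}) = data_matrix(k,:)';
end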

PySpark: ValueError

I have a dictionary of PySpark RDDs and am trying to convert them to data frames, save them as variable and then join them. When I attempt to convert one of my RDDs to a data frame I get the following error:
File "./spark-1.3.1/python/pyspark/sql/types.py",
line 986, in _verify_type
"length of fields (%d)" % (len(obj), len(dataType.fields)))
ValueError: Length of object (52) does not match with length of fields (7)
Does anyone know what this exactly means or can help me with a work around?
I agree - we need to see more code; obfuscated data is fine.
You are using Spark SQL, it seems (sql types) - mapped onto what? HDFS/text?
From the error, it would appear that your schema is incorrect, leading to an error when you create a DataFrame.
This was due to my passing an incorrect RDD, sorry everyone. I was passing an incorrect RDD which didn't fit the code I was using.

HDF format in MATLAB

I have a MODIS image in HDF format.
fileinfo = hdfinfo('MOD09GA.A2011288.hdf');
I'm trying to create a matrix, but I only need three bands that are stored in the attributes (I know this because I've checked in ERDAS). I've checked the structure of the attributes and there are 12 bands (fileinfo.Attributes = <1x12 struct>). How can I extract the three bands and create a matrix from them?
sds_info = fileinfo.SDS(2);
What I'm trying to do is the following...
data1 = hdfread(sds_info.Attributes)
But I get the following error:
??? Error using ==> hdfread>dataSetInfo at 418
HINFO must be a structure describing a specific data set in the file.
Checking the help, I know I have to use that structure. How can I find out the content of the attributes? How can I select it and create a matrix from that information?
data1 = hdfread(s.Vdata(1), 'Fields', {'Idx', 'Temp', 'Dewpt'})
PS) I'm using hdftool and importing every band. Is there another way to do it?
In the end, this is what I've done (I'm not deleting the post, just in case it could help someone):
sur_refl_b01_1 = hdfread('MOD09GA.A2011288.h17v05.005.2011293000105.hdf', '/MODIS_Grid_500m_2D/Data Fields/sur_refl_b01_1', 'Index', {[1 1],[1 1],[2400 2400]});
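For context, stacking three such bands into one matrix would look roughly like this (the sur_refl_b02_1 and sur_refl_b03_1 field names are assumptions based on the MOD09GA naming pattern, not verified against this particular file):
f = 'MOD09GA.A2011288.h17v05.005.2011293000105.hdf';
b1 = hdfread(f, '/MODIS_Grid_500m_2D/Data Fields/sur_refl_b01_1', 'Index', {[1 1],[1 1],[2400 2400]});
b2 = hdfread(f, '/MODIS_Grid_500m_2D/Data Fields/sur_refl_b02_1', 'Index', {[1 1],[1 1],[2400 2400]});
b3 = hdfread(f, '/MODIS_Grid_500m_2D/Data Fields/sur_refl_b03_1', 'Index', {[1 1],[1 1],[2400 2400]});
bands = cat(3, b1, b2, b3); % 2400x2400x3 matrix, one band per plane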

I need help editing this data file into SOM_PAK format

I am working on a Self-Organizing Map (SOM) implementation and I have a microarray dataset which I am trying to read in using the som_read_data function, but I keep getting errors when I edit it into the SOM_PAK form that SOM recognises for reading, such as:
??? Error using ==> somtoolbox\som_read_data.m
Only 69 vector components on input file data line 1 (dimension is 70)
Error in ==> SomMainFunction at 3
sD = som_read_data('B_r2.txt');
But when I try to read the data without editing, i.e. the original file (Editor: DEAD LINK!), it indicates "Data read OK", but I get the following error:
??? Error using ==> unknown
Out of memory. Type HELP MEMORY for your options.
Error in ==> somtoolbox\som_bmus.m at 189
Bmus = zeros(dlen,length(which_bmus));
Error in ==> somvis\somvis_p_matrix.m at 41
[dummy dists] = som_bmus (dat, dat, 2:datlen);
Error in ==> SomMainFunction at 16
[pheight rad_real perc] = somvis_p_matrix(sM,sD);
You can get the data file from here (Editor: DEAD LINK!).
You can also download the toolbox from here.
I need someone to help me correct this data and put it in SOM_PAK format. I have tried getting it into SOM_PAK format, but it still gives me errors.
In B_r2.txt, your first column is not data, just row numbers; delete it. The number in the first row should be the number of columns (vector components). Why is it 47?
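A hedged sketch of that fix in MATLAB, assuming B_r2.txt is purely numeric apart from the header line (the output file name is just illustrative):
raw = dlmread('B_r2.txt');         % read the whole numeric file
raw(1,:) = [];                     % drop the existing (wrong) header line
raw(:,1) = [];                     % remove the row-number column
fid = fopen('B_r2_sompak.txt','w');
fprintf(fid,'%d\n',size(raw,2));   % SOM_PAK header: number of vector components
fclose(fid);
dlmwrite('B_r2_sompak.txt',raw,'-append','delimiter',' ');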