How to read a .yml file in MATLAB

I have a sequence of .yml files generated by OpenCV that I am trying to read into MATLAB using yamlmatlab, but I get the following error:
y_data = ReadYaml(yaml_file);
Error using ReadYamlRaw>load_yaml (line 78)
while scanning a directive
in "<string>", line 1, column 1:
%YAML:1.0
^
expected alphabetic or numeric character, but found :(58)
in "<string>", line 1, column 6:
%YAML:1.0
^
My YAML files look like this:
%YAML:1.0
Vocabulary: !!opencv-matrix
rows: 100
cols: 78
dt: f
data: [ 1.00037329e-001, 8.75103176e-002, 1.09445646e-001,
1.05232671e-001, 6.78173527e-002, 9.65989158e-002,
1.62132218e-001, 1.56320035e-001, 1.12932988e-001,
1.27447948e-001, 1.88054979e-001, 1.88775390e-001,.....
And
%YAML:1.0
---
vocabulary: !!opencv-matrix
rows: 100
cols: 1
dt: f
data: [ 3.54101445e-04, 1.23916077e+02, 9.93522644e+01,
2.42377838e+02, 3.53855858e+01, 1.69853516e+02, 5.81151466e+01,
8.07454453e+01, 1.83035984e+01, 2.13557846e+02, 1.52394699e+02,
1.10933914e+02, ......
I have also tried YAMLMatlab but still get the same error. How can I read these files and convert them into .mat files?

You can use cvyamlParser, a parser I wrote and recently published on MATLAB Central and GitHub. It handles the header in the YAML file properly.
https://zenodo.org/record/2703498#.XNg20NMzafU
https://github.com/tmkhoyan/cvyamlParser
https://in.mathworks.com/matlabcentral/fileexchange/71508-cvyamlparser
It is a MEX-file compiled for Linux and macOS. You can use the source file and the accompanying instructions to compile a Windows version.
It takes a YAML file written by OpenCV and converts it to a structure with the same variable names as in the YAML file. The variable data type is inferred at runtime. Optionally, you can use sorting for variables that have a numerical index, like A1, A2, A4, A5, etc.
Use it like so:
s = readcvYaml('../data/test_data.yaml')
s =
struct with fields:
matA0: [1000×3 double]
matA1: [1000×3 double]
matA2: [1000×3 double]
Or with sorting:
s = readcvYaml('../data/test_data.yaml','sorted')
s =
struct with fields:
matA: [1×3 struct]

It appears that the linked library (which seems to use SnakeYAML under the hood) cannot parse the YAML 1.0 form of the YAML directive, which uses a colon (:) where later versions of the specification use a space.
%YAML:1.0
Became:
%YAML 1.2
The contents of the YAML file appear to be compatible with newer YAML formats, so you could try removing the directive from the file prior to parsing (delete the first line).
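A minimal sketch of that workaround, assuming the first-line directive is the only part the parser rejects. The sample text and variable names are illustrative, and the final ReadYaml call is left as a comment because it depends on how yamlmatlab is installed.

```matlab
% Sample text standing in for fileread(yaml_file); the values are illustrative.
txt = sprintf('%%YAML:1.0\nVocabulary: !!opencv-matrix\n   rows: 1\n   cols: 2\n   dt: f\n   data: [ 1.0, 2.0 ]\n');

% Drop a leading "%YAML:<version>" line together with its line ending.
cleaned = regexprep(txt, '^%YAML:[^\r\n]*\r?\n', '', 'once');

% Write the cleaned text to a temporary file for the parser to read.
cleaned_file = [tempname, '.yml'];
fid = fopen(cleaned_file, 'w');
fprintf(fid, '%s', cleaned);
fclose(fid);

% y_data = ReadYaml(cleaned_file);   % parse the cleaned copy with yamlmatlab
```

For a real file, replace the sprintf line with txt = fileread(yaml_file); the rest stays the same, and the original file is left untouched.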
As for converting once the data is loaded into MATLAB, you should be able to do something like:
% Read the YAML file (after removing the directive)
y_data = ReadYaml(yaml_file);
% Assuming the matrix fields are nested under the top-level key (Vocabulary in the first file)
v = y_data.Vocabulary;
% Reshape the row-major data into a rows-by-cols matrix
mat = reshape(v.data, v.cols, v.rows).';
% Save to .mat file
save('output.mat', 'mat')

Related

Trying to open a python file using power shell but it brings up a list 'index out of range' error... but the items are not out of range?

PS C:\OIDv4_ToolKit> python convert_annotations.py
Currently in subdirectory: train
Converting annotations for class: Vehicle registration plate
0%| | 0/400 [00:00<?, ?it/s]0317.44 497.91974400000004 413.44 526.08
0%| | 0/400 [00:00<?, ?it/s]
Traceback (most recent call last):
File "C:\OIDv4_ToolKit\convert_annotations.py", line 66, in <module>
coords = np.asarray([float(labels[1]), float(labels[2]), float(labels[3]), float(labels[4])])
IndexError: list index out of range
python file: this is the error it refers to as line 66 (Line 7 here)
with open(filename) as f:
    for line in f:
        for class_type in classes:
            line = line.replace(class_type, str(classes.get(class_type)))
        print(line)
        labels = line.split()
        coords = np.asarray([float(labels[1]), float(labels[2]), float(labels[3]), float(labels[4])])
        coords = convert(filename_str, coords)
This doesn't look like a PowerShell issue; the python interpreter looks like it is being run correctly. I suggest adding the python tag to your question to get the right people involved.
Having located the source, it seems as if some of the text files in the following directory aren't in the format expected by convert_annotations.py:
C:\OIDv4_ToolKit\OID\Dataset\train\Vehicle registration plate\Label\
You can verify this with:
print("labels length =", len(labels))
after the line.split() call. If you get a length of 1, the items on some line are probably separated by something other than whitespace, for example commas. You can also inspect the files manually to determine the format. To find them, you can use:
print(os.path.join(os.getcwd(), filename))
inside the for loop, which is on Line 54 in the source I linked above. Note also that the string split() method accepts a custom separator as its first argument, should the files be in a different format.
This issue occurs when the class name is missing from classes.txt.
The class name in classes.txt should be the same as the name of the downloaded class.

How do you call a function that takes in a MAT file, manipulate the data in that file, and create a new textfile with that same MAT file name?

The filename in question is a MAT file that contains elements in the form of "a - bi" where 'i' signifies an imaginary number. The objective is to separate the real, a, and imaginary, b, parts of these elements and put them into two arrays. Afterwards, a text file with the same name as the MAT file will be created to store the data of the newly created arrays.
Code:
function separate(filename)
realArray = real(filename)
imagArray = imag(filename)
fileIDname = strcat(filename, '.txt')
fileID = fopen(fileIDname, 'w')
% more code here - omitted for readability
end
I am trying to run the above code via command window. Here's what I've tried so far:
%attempt 1
separate testFileName
This does not work as the output does not contain the correct data from the MAT file. Instead, realArray and imagArray contains data based on the ascii characters of "testFileName".
e.g. first element of realArray corresponds to the integer value of 't', the second - 'e', third - 's', etc. So the array contains only the number of elements as the number of characters in the file name (12 in this case) instead of what is actually in the MAT file.
%attempt 2
load testFileName
separate(testFileName)
So I tried to load the testFileName MAT variable first. However this throws an error:
Complex values cannot be converted to chars
Error in strcat (line 87)
s(1:pos) = str;
Error in separate (line xx)
fileIDname = strcat(filename, '.txt')
Basically, you cannot concatenate the elements of an array to '.txt' (of course). But I am trying to concatenate the name of the MAT file to '.txt'.
So either I get the wrong output or I manage to successfully separate the elements but cannot save to a text file of the same name after I do so (an important feature to make this function re-usable for multiple MAT files).
Any ideas?
A function that reads complex data, modifies it, and saves it in a text file with the same name would look approximately like this:
function dosomestuff(fname)
% fname is the MAT file name as text, e.g. 'testFileName'
data=load(fname);
% get your data; you need to know the variable name, let's assume it is called "datacomplex"
data.datacomplex=data.datacomplex+sqrt(-2); % "modify the data"
% create the txt file and write one "real+imag j" pair per line
fid=fopen([fname,'.txt'],'w');
z=data.datacomplex(:).';
% a 2-by-N matrix keeps each real part next to its imaginary part
fprintf(fid, '%f%+fj\n', [real(z); imag(z)]);
fclose(fid);
This makes quite a few assumptions about the data and format, but I can't do more without extra information.
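If the variable name inside the MAT file is not known in advance, a sketch along these lines may help. It assumes the file holds a single variable and derives the text file name from the MAT file name with fileparts. The sample data and all variable names are made up for illustration.

```matlab
% Create a small MAT file so the example runs on its own (illustrative data).
z = [1-2i; 3+4i; -5-6i];
mat_name = [tempname, '.mat'];
save(mat_name, 'z');

% Load by file name, then look the variable up in the returned struct.
data = load(mat_name);
vars = fieldnames(data);
values = data.(vars{1});          % assumes the file holds a single variable

% Reuse the MAT file's folder and base name for the text file.
[folder, base] = fileparts(mat_name);
txt_name = fullfile(folder, [base, '.txt']);

% One "real imag" pair per line; the 2-by-N matrix keeps the pairs together.
fid = fopen(txt_name, 'w');
fprintf(fid, '%f %f\n', [real(values(:)).'; imag(values(:)).']);
fclose(fid);
```

The key point is that the function receives the file name as text and loads the data itself, so the name is still available when the output file is created.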

Matlab Readtable Invalid parameter name: Range

I am trying to read column C from an Excel CSV file (file is too large to load entire thing). I am trying the following code:
filename='AS-1704-CT-Data-(Jan4---Jan-7)_1.csv';
T=readtable(filename, 'Delimiter', ',', 'Range', 'C:C')
The error I get says Error in (line 2), Invalid parameter name: Range.
According to the Matlab doc for readtable, Range is a valid parameter. The Name is 'Range' and the Value is 'C:C' (I've also tried 'C2:C8' while troubleshooting).
Am I missing something here?
MATLAB interprets your file as text, and according to the documentation:
When reading text files, only these parameter names apply: FileType, ReadVariableNames, ReadRowNames, TreatAsEmpty, DatetimeType, Delimiter, HeaderLines, Format, EmptyValue, MultipleDelimsAsOne, CollectOutput, CommentStyle, ExpChars, EndOfLine, DateLocale, and Encoding.
So Range is not a valid parameter name for text files.
You could try saving your file as an Excel workbook (.xls) and reading from that.
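If converting the file is impractical, one alternative that stays with text import is textscan, whose format string can skip unwanted fields. This is a minimal sketch, not a drop-in replacement: it assumes a comma-delimited file with one header line and a numeric column C, and the sample file contents are made up.

```matlab
% Write a tiny CSV so the example runs on its own (illustrative contents).
csv_name = [tempname, '.csv'];
fid = fopen(csv_name, 'w');
fprintf(fid, 'A,B,C,D\n');
fprintf(fid, '1,x,10.5,foo\n');
fprintf(fid, '2,y,20.5,bar\n');
fclose(fid);

% %*s skips a field, %f keeps column C, %*[^\n] skips the rest of the line.
fid = fopen(csv_name, 'r');
cols = textscan(fid, '%*s %*s %f %*[^\n]', 'Delimiter', ',', 'HeaderLines', 1);
fclose(fid);

% textscan returns a cell array with one cell per kept field.
colC = cols{1};
```

Only the third field of each row is converted and stored, which keeps memory use down for a large file.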

How to read line with comma-separated fields from file?

I have a task to read a positional file. I can read the file with the data lengths hard-coded, but I need to read the data lengths from an external file.
val lengths = Seq(3,10,5,4) // <-- I'd like to read it from an external file
Say, you have a file with the following content (that corresponds to the positions):
$ cat positions.csv
3,10,5,4
In Scala, you could read the file as follows:
val lengths = scala.io.Source.
fromFile("positions.csv").
getLines.
take(1).
toArray.
head.
split(",").
map(_.toInt).
toSeq
scala> lengths.foreach(println)
3
10
5
4

MATLAB: extracting onsets in different files and saving them in different files

My MATLAB script is meant to:
1. Extract four different fMRI onsets from MATLAB files (the files are named 'subject 06 data', 'subject 05 data', etc.)
2. Put this information in a new file with two other variables named 'durations' and 'names'.
3. Save all this as a new MATLAB file.
I am facing two problems:
At the moment, the script below manages to do steps 1 through 3 for the first MATLAB file in the directory 'Gender_recogntion', but it does not do 1 through 3 for the other MATLAB files in the folder. It crashes in the loop at the line 'load(sub_name(i).name);'.
This is the error I get:
??? Improper index matrix reference.
Error in ==> Gender_onsets_script_2 at 16
load(sub_name(i).name);
In addition, I would like to name the new MATLAB files after the original MATLAB files. At the moment, the new MATLAB file is always named 'onsets.mat'.
clear all
close all
clc
cd 'C:\Program Files\MATLAB\R2007b\Data\Resilience\Real_data\Raw\Matlab_files\Gender_recogntion';
sub_name = dir('C:\Program Files\MATLAB\R2007b\Data\Resilience\Real_data\Raw\Matlab_files\Gender_recogntion\*.mat');
for i = 1:numel(sub_name);
load(sub_name(i).name);
names = {'sad' 'anger' 'neutral' 'rest'};
durations = {[18] [18] [18] [18]};
onsets=cell(1,4);
onsets{1} = data.time_since_scan_start(data.emotion==5)/1000; %Get the 36 onsets for sad.
onsets{2} = data.time_since_scan_start(data.emotion==4)/1000; %Get the 36 onsets for anger.
onsets{3} = data.time_since_scan_start(data.emotion==6)/1000;% Get the 36 onsets for calm.
onsets{4} = datarest.onset/1000; %Get the six onsets for the rest blocks.
onsets{1} = onsets{1}(1:6:36)'; %Get the first onset value of each of the six blocks.
onsets{2} = onsets{2}(1:6:36)';
onsets{3} = onsets{3}(1:6:36)';
onsets{4} = onsets{4}';
%cd Onsets folder, saves onsets, and then cd back to folder "Matlab_files"
cd 'C:\Program Files\MATLAB\R2007b\Data\Resilience\Real_data\Onsets';
save 'onsets.mat' names durations onsets
cd 'C:\Program Files\MATLAB\R2007b\Data\Resilience\Real_data\Raw\Matlab_files\Gender_recogntion';
end
For your second question about naming the output files the same as the input, you can use the function version of save and pass in the variable sub_name(i).name as the filename argument.
save(sub_name(i).name, 'names', 'durations', 'onsets')
This uses the exact same name for input and output (in different directories, in your script). When I save output files, I typically keep them in the same directory as the inputs, so I modify an input filename with regular expressions (see regexprep) or adding a prefix or suffix (strcat) to create a related but distinct output filename.
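A minimal sketch of building a related output name with fileparts and fullfile, which also avoids changing directories before saving. The output folder and the '_onsets' suffix are assumptions for illustration; the input name is one of the examples from the question.

```matlab
% Example input name from the question; in the loop this would be sub_name(i).name.
in_name = 'subject 06 data.mat';

% Hypothetical output folder; the question saves to its own Onsets folder.
out_dir = tempdir;

% Split off the extension and add a suffix to get a distinct output name.
[in_dir, base] = fileparts(in_name);
out_name = fullfile(out_dir, [base, '_onsets.mat']);

% Placeholder variables with the same names as in the question.
names = {'sad' 'anger' 'neutral' 'rest'};
durations = {18 18 18 18};
onsets = cell(1, 4);

% The function form of save accepts the file name as a variable.
save(out_name, 'names', 'durations', 'onsets');
```

Because out_name contains the full path, the two cd calls inside the loop are no longer needed.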
For future reference...the default filetype for save is MATLAB data format; you could pass in '-ASCII' as an argument to save as a text file if your data types were compatible. The cell arrays in this example aren't, but strings and numerical matrices would be, so if text output files were important, you could use alternate data structures from the beginning or convert cells with cell2mat. A generic example with the save() version: save(filename, '-ASCII', 'x', 'y','z') where x,y,z are ASCII-friendly variables and filename is a text file.
[additional response, added Jan 5, 2011]
About your first question on the error message:
??? Improper index matrix reference.
Is it possible that a saved .mat file contains a variable named dir, which would override the standard directory-listing function and cause that error? I read that tip on another site and wanted to pass it along in case it helps.
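One way to rule out this kind of name clash is to load each file into a struct rather than straight into the workspace, so that nothing in the file can replace a function or a loop variable. A minimal sketch; the file contents and variable names are made up for illustration.

```matlab
% Build a MAT file whose only variable is named "dir" (illustrative data).
contents.dir = 42;
mat_name = [tempname, '.mat'];
save(mat_name, '-struct', 'contents');

% Loading with an output argument puts the file's variables into fields of S.
S = load(mat_name);

% The dir function is still reachable because no variable named dir was created.
listing = dir(tempdir);
value_from_file = S.dir;
```

In the loop from the question this would be S = load(sub_name(i).name); followed by S.data and S.datarest in place of data and datarest, assuming those are the variable names stored in each file.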