Can table variable names start with a number character? - matlab

I am running something like this:
table = readtable(path,'ReadVariableNames',true);
In the .csv file all of the cells of the first row contain string identifiers like '99BM', '105CL', etc. Everything starts with a number. The command above gives a table with variable names like 'x99BM', 'x105CL', etc.
Is it possible to get rid of this 'x'? I need to compare these identifiers with a column of another table that is clear of this 'x'.

No, table variable names can't start with a number since they follow the same naming conventions as normal variables. The 'x' is automatically added by readtable to create a valid name. You should have noticed this warning when calling readtable:
Warning: Variable names were modified to make them valid MATLAB identifiers.
The original names are saved in the VariableDescriptions property.
So, you can't get rid of the 'x' within the table. But, if you need to do any comparisons, you can do them against the original values saved in the VariableDescriptions property, which will have this format:
>> T.Properties.VariableDescriptions
ans =
1×2 cell array
'Original column heading: '99BM'' 'Original column heading: '105CL''
You can parse these with a regular expression, for example:
originalNames = regexp(T.Properties.VariableDescriptions, '''(.+)''', 'tokens', 'once');
originalNames = vertcat(originalNames{:});
originalNames =
2×1 cell array
'99BM'
'105CL'
And then use these in any string comparisons you need to do.
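As a minimal sketch of that comparison, the snippet below hard-codes the two descriptions shown above instead of reading a file, and matches the recovered names against a made-up list of identifiers. The variable otherIds and its contents are assumptions for illustration only.

```matlab
% Descriptions in the format shown above (hard-coded here instead of
% reading them from T.Properties.VariableDescriptions).
descriptions = {'Original column heading: ''99BM''', ...
                'Original column heading: ''105CL'''};

% Pull out the text between the single quotes, as in the answer.
tokens = regexp(descriptions, '''(.+)''', 'tokens', 'once');
originalNames = vertcat(tokens{:});      % 2x1 cell array of char vectors

% Hypothetical identifier column from the other table.
otherIds = {'105CL'; '42ZZ'; '99BM'};

% isPresent(k) is true when otherIds{k} is one of the original headings;
% columnIndex(k) is the matching column position (0 if there is none).
[isPresent, columnIndex] = ismember(otherIds, originalNames);
```

Here columnIndex can be used to index T.Properties.VariableNames when the matching column itself is needed.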

No. As mentioned by gnovice, the readtable function automatically renames invalid names. It does this by calling matlab.lang.makeValidName and setting the output as the column name.
If I understand correctly, you're comparing the contents of a column from one table with the column names of another table. In that case ismember(['x' <contents of single row of column>], T.Properties.VariableNames) prepends x to the value in one row of the column and checks whether the result matches any variable name of the table T (with this implementation, you need to loop over every row of the column). Note that the variable names live in T.Properties.VariableNames, and that contains would do a substring match rather than an exact comparison. You can also do this in a single line with arrayfun or cellfun, but I am writing this from memory and can't recall the exact syntax.
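A loop-free version might look like the sketch below. The identifiers and variable names are made up for illustration, and the approach assumes every heading only needed the 'x' prefix to become valid; headings that were altered in other ways would not match.

```matlab
% Hypothetical contents of the identifier column in the other table.
ids = {'99BM'; '105CL'; '42ZZ'};

% Stand-in for T.Properties.VariableNames of the table read by readtable.
varNames = {'x99BM', 'x105CL'};

% Prepend 'x' to every identifier at once, then test for exact matches.
prefixed  = strcat('x', ids);
hasColumn = ismember(prefixed, varNames);
```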

Related

MATLAB Column headers not valid variable name

I am working in MATLAB and trying to add units to the column headers of a table of values, which I will then insert into an SQLite database. However, some of my column names contain German characters (e.g. 'ß', 'ä'), and these are rejected as invalid. Everything I've found so far says that column headers must be valid variable names, i.e. alphanumeric characters and "_" only.
But I cannot change my original database column names, so does anyone know of a workaround?
My code of building a table and sending into database is:
insertData = cell2table(full_matrix,'VariableNames',colnames);
insert(conn,tableName,colnames,insertData);
And some of my column names:
'maß','kapazität', 'räder'
Thank you very much for helping.
Do you have to create the table first? I would try just passing the cell array of data directly to insert like so:
insert(conn, tableName, colnames, full_matrix);
The above assumes that it is the cell2table call that is giving you an error related to the special characters. If it's the insert call, then I guess MATLAB won't let you create databases with column names that don't conform to its variable naming conventions. If that's the case, you'll have to convert the column names to something valid, which you can do with either genvarname (for older MATLAB versions) or matlab.lang.makeValidName (suggested for versions R2014a and newer):
colNames = {'maß','kapazität', 'räder'};
validNames = genvarname(colNames);
% or...
validNames = matlab.lang.makeValidName(colNames, 'ReplacementStyle', 'hex');
validNames =
1×3 cell array
'ma0xDF' 'kapazit0xE4t' 'r0xE4der'
Note that the above solutions replace the invalid characters with their hex equivalents. You could also change the 'ReplacementStyle' to replace them with underscores or delete them altogether. I would go with the hex values because it gives you the option of converting the column names back to their original string values if you need those for anything later. Here's how you could do that using regexprep, hex2dec, and char:
originalNames = regexprep(validNames, '0x([\dA-F]{2})', '${char(hex2dec($1))}');
originalNames =
1×3 cell array
'maß' 'kapazität' 'räder'

converting 1x1 matrix to a variable

I read the data from a CSV file that contains two columns: id, which is text/string, and cancer, which is 1/0. Please see the code below:
M = readtable('data.csv');
I try to access the very first value using
row = M(n,1); % It's from the ID column, which is text
But it comes back as a 1x1 table, and I am unable to put it in a single variable.
For example, after the line above runs I want row to contain a string, like row = 'patientID'. Is there any way to convert it into a single value?
Use row = M{n,1}. Note the curly braces.
The curly braces say "get the contents of the table", as opposed to the parentheses you had been using, which say "get me a portion of the table, as a table".
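The difference can be seen on a small table built by hand. This is only a sketch: the patient IDs are invented, and it assumes the text column is stored as a cell array of character vectors, in which case brace indexing still returns a 1x1 cell and one more level of indexing is needed to reach the character vector itself.

```matlab
% Small stand-in for the table returned by readtable (values are made up).
M = table({'p001'; 'p002'}, [1; 0], 'VariableNames', {'id', 'cancer'});
n = 1;

sub  = M(n, 1);    % parentheses: a 1x1 table
val  = M{n, 1};    % braces: the contents, here a 1x1 cell holding the text
name = M.id{n};    % dot plus braces: the character vector itself
flag = M{n, 2};    % for a numeric column, braces give the number directly
```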

MATLAB export multiple .csv files at one time

I have a matrix where I need to export each column into a separate .csv file.
I know the number of columns and I can achieve my desired result if I specifically select one column to export. I would do this by:
dlmwrite('1.csv',data(:,1), 'precision', 9)
Therefore if I want column 2 I would change the variable to data(:,2) and save this as 2.csv.
So I want a loop that will do all this automatically. I have tried
for i=1:Number_of_Columns
dlmwrite('(i).csv',csv_data(:,(i)), 'precision', 9)
end
which clearly won't work but I am unsure how to do it.
Any help or advice would be much appreciated
Your problem is the filename. If you put i between quotes, it is treated as a literal character rather than as the variable. (In your case the filename will always be "(i).csv".)
You can concatenate strings using [ ], and since i is an integer you have to convert it to a string using num2str().
Try:
for i=1:Number_of_Columns
dlmwrite([num2str(i) '.csv'], csv_data(:,i), 'precision', 9)
end
PS: Since you are storing each column (not each row) in a file, I'm not sure whether you want a file where each element is on a separate line, or whether you want the column to be stored as a row and separated by commas.
If you want the latter, transpose your column:
dlmwrite([num2str(i) '.csv'], csv_data(:,i).', 'precision', 9)
Note that the transpose operator is .' and not ', which is the complex conjugate transpose (mixing them up is a common mistake, since the results are the same as long as you only use real numbers).
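A self-contained variant of the loop, using sprintf to build the file name, could look like the sketch below. The matrix contents and the temporary output folder are assumptions made so the example can run on its own.

```matlab
% Hypothetical data: one output file per column.
csv_data = magic(3);

% Write into a fresh temporary folder so nothing existing is overwritten.
outDir = tempname;
mkdir(outDir);

for k = 1:size(csv_data, 2)
    % sprintf('%d.csv', k) produces '1.csv', '2.csv', ...
    fileName = fullfile(outDir, sprintf('%d.csv', k));
    dlmwrite(fileName, csv_data(:, k), 'precision', 9);
end
```

Using size(csv_data, 2) for the loop bound avoids keeping a separate Number_of_Columns variable in sync with the data.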

Matlab split text column in a table

I have a table object in MATLAB with a text column. This text column is a "tag" and contains underscores to split the tag.
I'd like to create a column with the second element of the tag. I used strsplit but it didn't work. I also tried regexp, but it gives me a cell object with 126 cell objects inside, and I don't know how to extract the second element of every cell.
Any suggestion?
Example:
a = {'a_b'; 'a_c';'a_n';'a_t'}
t = table(a)
I just want a vector with the second element.
Thanks.
How about
t=[t rowfun(@(x) x{1}(3),t)]
Here x{1} unwraps the cell that rowfun passes in for each row, and 3 is the position of the character you want. If the string parts have varying lengths, it gets a little bit more tricky:
t=[t rowfun(@(X) X{1}(strfind(X{1},'_')+1:end),t,'OutputFormat','cell')];
strfind() returns the position of the '_' character, so (position+1:end) is the rest of the string. Since the parts can have different lengths, the output has to be a cell, which is then added to the table. If the table has more than one column, select the text column with the 'InputVariables' option of rowfun.
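An alternative sketch that avoids rowfun splits each tag with strsplit and keeps the second piece. It assumes every tag contains at least one underscore; the new column name second is made up for illustration.

```matlab
% Example data from the question.
a = {'a_b'; 'a_c'; 'a_n'; 'a_t'};
t = table(a);

% Split every tag on '_' (each result is a cell array of pieces).
parts = cellfun(@(s) strsplit(s, '_'), t.a, 'UniformOutput', false);

% Keep the second piece of each tag as a new table column.
t.second = cellfun(@(p) p{2}, parts, 'UniformOutput', false);
```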

How to randomly select from a list of 47 names that are entered from a data file?

I have managed to input a number data file into a matrix but have been unable to do so for any data that is not a number.
I have a list of 47 names and am supposed to pick a random name from the list. I have tried to use the function textscan but was not getting anywhere. Also, how do I pick a random name from the list? All I have been able to do is generate a random number between 1 and 47.
Appreciate the replies. I should have said I need it in MATLAB sorry.
Here is a sample list of data in my data file
name01
name02
name03
and the code to read it:
fid = fopen('names.dat','rt');
headerChars = fgetl(fid);
data = fscanf(fid,'%f,%f,%f,%f',[4 47]).';
fclose(fid);
The above is what I have to read the data file into a matrix, but it is only reading the first line. (Yes, it was modified from a previous post on this forum :/)
Edit: As per the helpful comments from mtrw, and the fixed formatting of the sample data file, I've updated my answer with more detail.
With a single name (i.e. "Bob", "Bob Smith", or "Smith, Bob") on each line of the file, you can use the function TEXTSCAN by specifying '%s' as the format argument (to denote reading a string) and the newline character '\n' as the 'Delimiter' (the character that separates the strings in the file):
fid = fopen('namefile.txt','r');
names = textscan(fid,'%s','Delimiter','\n');
fclose(fid);
Then it's a matter of randomly picking one of the names. You can use the function RANDI to generate a random integer in the range from 1 to the number of names read from the file (found using the NUMEL function):
names = names{1}; %# Get the contents from the cell returned by TEXTSCAN
selectedName = names{randi(numel(names))};
Sounds like you're halfway home. Take that random number and use it as an index for the list.
For example, if you randomly generate the number 23 then fetch the 23rd entry in the list which gives you a random name draw.
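In MATLAB, that indexing step could be sketched as follows, with the three sample names from the question standing in for the full list of 47.

```matlab
% Sample names from the question, standing in for the full list.
names = {'name01'; 'name02'; 'name03'};

% Random integer between 1 and the number of names.
idx = randi(numel(names));

% Use it as an index into the list to draw the name.
selectedName = names{idx};
```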
Use the RANDBETWEEN function to get a random number within your range. Use INDEX to get the actual cell value. For instance:
=INDEX(A1:A47, RANDBETWEEN(1, 47))
The above will work for your specific case of 47 names, assuming they're in column A. In general, you'd want something like:
=INDEX(MyNames, RANDBETWEEN(1, ROWS(MyNames)))
This assumes you've named your range of cells "MyNames" (for example, by selecting all the cells in your range and setting a name in the naming box). INDEX takes a position relative to the start of the range, so the formula draws a random position between 1 and the total number of rows in MyNames, which the ROWS function returns.