Preventing MATLAB's readtable function from ignoring first row of delimited text data file - matlab

I have an issue very similar to the one in this previously asked question:
readtable on text file ignores first row which contains the column names
However, in my case, the file is consistently formatted correctly. All values are separated by a single space, including the first row, which consists of the column headers. I've tried switching the spaces to tabs, but this did not fix anything.
I am simply using the following code:
% Get list of file names from current directory and make file name variable
filelist = ls();
filename=filelist(3,1:97);
% create table object using file name
DE_genelst_raw_CntrlMvF = readtable(filename);
Where I should get a table with 6 rows and 5 columns with headers, I instead get a 6x5 table with the column headers missing. I used the readtable function on a more complex delimited dataset and it correctly included the headers, so I know it should be able to work; I'm just not sure what is wrong. If need be, I can provide a copy of the file. Thank you for the help.
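One thing worth trying, as a minimal sketch: readtable guesses whether the first row holds variable names, and that guess can fail, for instance when every column contains text (an assumption here, since the file itself is not shown). Passing 'ReadVariableNames', true together with an explicit 'Delimiter' removes the guesswork. The demo file name, column names, and values below are hypothetical stand-ins for the real file.

```matlab
% Hypothetical stand-in for the real file: space-delimited, header in row 1.
% (Reduced to 3 data rows; the column names are made up for illustration.)
filename = fullfile(tempdir, 'readtable_header_demo.txt');
fid = fopen(filename, 'w');
fprintf(fid, 'gene chrom strand status group\n');
fprintf(fid, 'Xist X minus up A\n');
fprintf(fid, 'Ddx3y Y plus down B\n');
fprintf(fid, 'Kdm5d Y plus down B\n');
fclose(fid);

% Force the first row to be used as variable names instead of relying on
% automatic header detection, and state the delimiter explicitly.
T = readtable(filename, 'Delimiter', ' ', 'ReadVariableNames', true);

% Show the column headers that were picked up.
disp(T.Properties.VariableNames)
```

For picking the file, dir('*.txt') returns each name as its own string, which avoids indexing into the character array that ls() produces.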

Related

Fileheader in double quotes using talend

I have exported a file with double quotes as the text qualifier, but I am unable to get the header row enclosed in double quotes as well.
tMysqlInput --- tMap --- tFileOutputDelimited
Thanks.
It isn't pretty. The solution is to handle the header row as just a normal row of data and turn off the header row checkbox.
1a. I like to create a schema defining the output columns. Then use a tFixedFlowInput (or similar) to build the first row of data (your header values) using the defined schema, and output it to the file. The file output component's header row checkbox should be turned off. The shared schema makes it easy to make sure everything matches up.
1b. Alternatively, just build a single string containing the header and write it out as raw data.
Afterward, output rows of actual data to the same file, in append mode, no header, again using the same schema.

How can I prevent MATLAB from automatically modifying .dat file variable names upon import using the dataset function?

So, I currently have a MATLAB script that does stuff with data and then, using a template .dat file, creates about 20 more .dat files with only a single column being changed (I've been using the dataset and export functions to read and write the files, respectively). The program that will use the .dat files, ExperimentBuilder, requires that the headers have names that start with dollar signs (for example: $image). However, when I use the dataset function in MATLAB to import the template file, I get this warning:
Warning: Variable names were modified to make them valid MATLAB identifiers.
It then replaces all the dollar signs in the variable names with x_ (for example, x_image), which would be fine if it would let me change them back to the $ format. But whenever I try to do so using set, it just gives me this warning again and reverts the names to x_, which ExperimentBuilder cannot read.
I know I could just do a quick copy and paste on each file with the original headings, but I would like to know if there's a way to fix this problem in the actual code.
Thanks!
The thing is, a MATLAB dataset uses the header names to provide access to the columns by name, which is why the header names must be valid identifiers (isvarname() requires that a name start with a letter and contain only letters, digits, and underscores [a-zA-Z0-9_]).
The easiest solution would be manually write the header line yourself (including names starting with $), while separately exporting the data without the headers:
export(ds, ..., 'WriteVarNames',false)
(Note that dataset.export overwrites files by default, so you'll have to export first and then prepend the header line at the beginning of the file. Or, if you're comfortable modifying MATLAB's own functions, edit dataset.export and change the fopen mode from overwrite 'wt' to append 'at'.)
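A minimal sketch of that export-then-prepend approach, assuming the Statistics Toolbox dataset class is available. The file names, column names, and values are hypothetical; 'WriteVarNames' is the export option that controls whether the variable-name row is written.

```matlab
% Hypothetical data; in practice ds comes from the imported template.
imgCol   = {'face01.png'; 'face02.png'};
duration = [500; 750];
ds = dataset(imgCol, duration, 'VarNames', {'x_image', 'x_duration'});

% Header exactly as the consuming program expects it (names are assumptions).
headerLine = sprintf('$image\t$duration');

% 1) Export only the data rows (no variable-name row) to a temporary file.
tmpFile = [tempname '.dat'];
export(ds, 'File', tmpFile, 'Delimiter', 'tab', 'WriteVarNames', false);

% 2) Write the custom header, then the exported rows, to the final file.
outFile = fullfile(tempdir, 'trial_demo.dat');
body = fileread(tmpFile);
fid = fopen(outFile, 'w');
fprintf(fid, '%s\n', headerLine);
fprintf(fid, '%s', body);       % '%s' keeps the exported text verbatim
fclose(fid);
delete(tmpFile);
```

Going through a temporary file sidesteps the overwrite behaviour of export without touching MATLAB's own code.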

Using the second row of a delimited text file as the header row when importing into Access 2010.

Is it possible to use the values of the second row of a delimited text file (e.g. a csv file) as the header row when importing into Access 2010?
No - the headers have to be in the first line of the imported file. You need to delete the empty first line of data.
If there are too many files for this to be practical, as you imply, you have a couple of options.
Presuming the headers are the same on all of your files to be imported, you could combine all of the text files into one file and import that.
If the headers are different, you could write some code to batch delete the first line from all your files, as is suggested here.
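As a rough sketch of such a batch clean-up, written here in MATLAB rather than in Access itself: the loop below removes the first line from every .csv file in a folder so that the header ends up on line 1. The folder name and demo file are hypothetical, and the files are rewritten in place, so work on copies.

```matlab
% Hypothetical folder holding the files to be imported.
inDir = fullfile(tempdir, 'access_import_demo');
if ~exist(inDir, 'dir'), mkdir(inDir); end

% Demo file: an empty first line followed by the real header row.
demoFile = fullfile(inDir, 'demo.csv');
fid = fopen(demoFile, 'w');
fprintf(fid, '\nid,name\n1,alpha\n2,beta\n');
fclose(fid);

files = dir(fullfile(inDir, '*.csv'));
for k = 1:numel(files)
    fpath = fullfile(inDir, files(k).name);
    txt = fileread(fpath);
    nl = find(txt == char(10), 1);   % position of the first line feed
    if isempty(nl), continue; end    % single-line file: leave untouched
    fid = fopen(fpath, 'w');
    fprintf(fid, '%s', txt(nl+1:end));   % everything after the first line
    fclose(fid);
end
```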

Talend tFileOutputdelimited component - problems with the split .csv files

I tried my luck on the Talend forum and no luck there, so I will try here as well.
I have a job that is reading a large table and then writing the data to .csv files in increments of 25000 rows. What I have noticed is that all .csv files created after the first .csv file have the data loaded all in one row versus the first .csv file that has the data loaded in 25000 rows (as I want it).
Is there a setting that needs to be set on the tFileOutputDelimited component that will allow the rows in all subsequent .csv files to be written as they are in the first (and 'good') .csv file? I am thinking it may be due to what is being used for the 'Escape char' value on the 'Advanced settings' tab but am not sure.
On the tFileOutputDelimited component's 'Basic settings' tab, the CSV Row Separator value is CRLF("\r\n") and the field separator is ",". On the component's 'Advanced settings' tab, the Escape char value is """ and the Text enclosure value also is """.
Also, this is being run in a Windows 7 environment.
Unfortunately, the documentation I found for the tFileOutputDelimited component's 'Advanced settings' tab is lacking with regard to the CSV options.
Below is an example of what is being encountered. As listed below, the first file looks great but all files that follow do not break on the line break and end up placing all of the data on one row versus individual rows.
File #1
header row
row 1
row 2
row 3
...
row 25000
File #2...
header rowrow1row2...row25000
File #3...
header rowrow1row2...row25000
If you need more details, let me know and I'll send them right off. Thank you in advance.
Figured it out. As mentioned in my initial post, the CSV Row Separator had been set to the CRLF("\r\n") option. I changed this to LF("\n") and that addressed the problem. I had looked at the generated Java code and noticed that it was not treating CRLF("\r\n") as one of the default options; only \n and \r were. This pointed me in the direction of trying the \n option.

In Matlab, how do I create a CSV file from a subset of the lines in a text file?

I need to open a text file and convert it into a CSV file in Matlab. The first 3 lines of the text file are sentences that need to be omitted. The next 28 lines are numbers that need to make up the first column of the CSV, and then the next 28 lines need to make up the second column.
The text file is called datanal.txt and the output file can be named anything. Any help would be appreciated.
I don't have Matlab available to test right now, but try this. Your input file should be in Matlab's current directory; otherwise, give the full path to the file.
A = csvread('datanal.txt',3,0);
A = reshape(A,28,2);
csvwrite('output.csv',A)
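On newer releases (R2019a and later), where csvread and csvwrite are discouraged in favor of readmatrix and writematrix, the same steps can be sketched as follows. The demo file written to tempdir is a hypothetical stand-in for datanal.txt, and the numbers in it are made up.

```matlab
% Build a small stand-in for datanal.txt (3 sentences, then 56 numbers).
inFile  = fullfile(tempdir, 'datanal_demo.txt');
outFile = fullfile(tempdir, 'datanal_demo.csv');
fid = fopen(inFile, 'w');
fprintf(fid, 'First sentence.\nSecond sentence.\nThird sentence.\n');
fprintf(fid, '%d\n', 1:56);          % one number per line
fclose(fid);

% Skip the 3 text lines and read the remaining numbers as a single column.
A = readmatrix(inFile, 'NumHeaderLines', 3);

% reshape fills column by column: the first 28 values become column 1,
% the next 28 become column 2.
A = reshape(A, 28, 2);

% Write the 28x2 matrix as comma-separated values.
writematrix(A, outFile);
```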
Alternatively, you can add a comment marker (% in MATLAB) in front of each of the first 3 lines, then use load and a reshape. Do you need a fully automated script, or is there only one file? If you're familiar with MATLAB at all, there are a bunch of ways to turn that large column vector into a matrix.