Converting columns to strings - matlab

I have many files, with different numbers of columns (.txt files). How to automaticaly change this %s%s%s... string to correspond numbers '%s' to amount of column?
data=textscan(fid,'%*s%*s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%','HeaderLines',skip_lines,'CollectOutput',1);
The files looks like:
I have algoritm which load many files and it will nice, when I could automatize this.

try this method:
importdata('path to file')
You can specify the delimiter as well. this method is adaptive and you do not need to take care about columns. This method return the header, text data and number data in separate variables.

Related

How to use numbers as attributes in PostgreSQL?

I have a .csv file that has numbers as column names. I want to import that file to a table in PostgreSQL, but it gives an error.
I have 1024 columns so I can't manually change it in my file. Is there a way around that?
This is the Excel file that I got:
If you want a table with 1024 columns you are doing something wrong.
You should choose a different data model.
But it is possible to use numbers as column names, as long as you surround them with double quotes.

Tallying unknown words across columns in Tableau (or from comma separated column)

I have an issue that I have been trying to solve for the better part of a week now. I have a large database (in Google sheets) representing casestudies. I have some columns with multiple categories listed (in this example 'species', 'genera', and 'morphologies'), and I want to be able to tally how many times each category occurs in the data set.
I use Tableau to visalise the data, and the final output will be a large publc tableau. I know I can do a "find" based on the specific string, but I'd like the dataset to be dynamic and be able to handle new data being added without having to update calculated fields? Is there a way of finding uniqe terms (either from a single column of comma separated values, or from multiple columns), and tallying them?
Things I have tried so far:
1 - A pivot table in Tableau. Works well, but messes with all the other data, since it repeats lines.
2 - A pivot table on its own data source in Tableau. Also works well, and avoids the problem of messing with the other data. However, now each figure is disconnected from the others so I can't do a large dashboard where everything is filtered by each other (ie filtering species and genera by country at the same time).
3 - An SQL query() in google sheets, which finds all unique terms and queries them, which can then be plotted in Tableau. Also works well, but similar problem of the data being disconnected from all the other terms in the dataset.
Any ideas of a field calculation that will find, list and tally unique terms in a single comma separated column (or across multiple columns), without changing the data structure?
I have placed a sample data set here (google sheets), which is a smaller version of what I'm actually working on. In it I have marked comma separated columns in grey, and they're followed by a bunch of columns with the values split into columns. I only need to analyse either of those (ie either a calculation to separate comma separate values or from multiple columns).
I've also added a sample Tableau workbook here.

Keep leading zeros when joining data sources in tableau

I am trying to create a data source in Tableau (10.0) where I am joining a table from SQL with an Excel file. The join happens on a site id but when reading the id from the excel source, Tableau strips the leading zeros (and SQL keeps leading zeros). I see this example
to add the leading zeros back as a new, calculated field. But the join is still dropping rows because the id is not properly formatted when making the join.
How do I get the excel data source to read the column with the leading zeros so I can do the join?
Launch Excel and choose to open a new blank workbook.
Click the Data tab and select From Text.
Browse to the saved CSV file and select Import.
Ensure that Delimited is selected and click Next.
Leave Tab as the delimiter and click Next.
Select the column containing the data with leading zeros and click
Text.
Repeat for each column which contains leading zeros.
Click Finish.
Click OK.
Never heard of or used tableau, but it sounds as though something (jet/ace database driver being used to read excel file?) is determining the column to be numeric and parsing the data as numbers, losing leading zeroes
If your attempts at putting them back are giving you grief, I'd recommend trying the other direction instead; get sqlserver to convert its strings to numbers. Number matching should be more reliable than String matching, so long as the two systems don't handle rounding differently :)
If your Excel file was read in from a CSV and the Site ID is showing "Number Stored as Text", I think you can solve your problem by telling Tableau on the Data Source entry that the field is actually a string. On the preview data source view, change the "#" (designating number) to string so that both the SQL source and the Excel source are both strings before doing the join.
This typically has to do with the way Excel stores values as mentioned above. I would play around with the number formatting for the Site ID column in Excel itself, not Tableau, and changed that two "Text" in Excel. You can verify if Tableau will read it properly with the leading 0s by exporting your excel file to csv and looking in the csv files to see if the leading 0s are still there.

Problems reading CSV in Octave

I have a .csv file and I can't read it on Octave. On R I just use the command below and everything is read alright:
myData <- read.csv("myData.csv", stringsAsFactors = FALSE)
However, when I go to Octave it doesn't do it properly with the below command:
myData = csvread('myData.csv',1,0);
When I open the file with Notepad, the data looks something like the below. Note there isn't a comma separating the last column name (i.e. Column3) from the first value (i.e. Value1) and the same thing happens with the last value of the first row (i.e. Value3) and the first value of the second row (i.e Value4)
Column1,Column2,Column3Value1,Value2,Value3Value4,Value5,Value6
The Column1 is meant for date values (with format yyyy-mm-dd hh:mm:ss), I don't know if that has anything to do with the problem.
Alex's answers already explains why csvread does not work for your case. That function only reads numeric data and returns an array. Since your fields are all strings, you need something that reads a csv file into a cell array.
That function is named csv2cell and is part of the io package.
As a separate note, if you plan to make operation with those dates, you may want to convert those dates as strings, into serial date numbers. This will allow you to put your dates in a numeric array which will allow for faster operations and reduced memory usage. Also, the financial package has many functions to deal with dates.
csvread only reads numeric data, so a date does not qualify unfortunately.
In Octave you might want to check out the dataframe package. In Matlab you would do readtable.
Otherwise there are also more primitive functions you can use like textscan.

Read csv file excluding first column and first line

I have a csv file containing 8 lines and 1777 columns.
I need to read all the contents in matlab, excluding the first line and first column. First line and first column contain strings and matlab can't parse them.
Do you have any idea?
data = csvread(filepath);
The code above reads all the contents
As suggested, csvread with a range will read in the numeric data. If you would like to read in the strings as well (which are presumably column headers), you can use readtable:
t = readtable(filepath);
This will create a table with the column headers in your file as variable names of the columns of the table. This way you can keep the strings associated with the data, if need be.