I would like to use cCSVParse (http://michael.stapelberg.de/cCSVParse) in my project. I receive CSV data from the internet and want to parse it and save it to Core Data. cCSVParse seems to be an appropriate class for this, but it can only read CSV data from a file. When I receive data from the internet, I would rather not save it to a file first. Is there any way to use it to parse data from an NSData or NSString?
One approach is to surround the CSV data (one row at a time) with [] chars and parse it with a JSON parser. (But note that this only works if the non-numeric data items are enclosed in quotes.)
Or you can simply use
NSArray *items = [csvRow componentsSeparatedByString:@","];
if the data items aren't enclosed in quotes.
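Both suggestions operate on a string that is already in memory, so no temporary file is needed. As a language-neutral illustration, here is a minimal Python sketch of the two ideas. The sample rows and the helper names parse_row_as_json and parse_row_naive are made up for this example and are not part of cCSVParse.

```python
import json


def parse_row_as_json(csv_row):
    """Wrap one CSV row in [] and let a JSON parser split it.

    Only works when every non-numeric item is enclosed in double quotes.
    """
    return json.loads("[" + csv_row + "]")


def parse_row_naive(csv_row):
    """Split on commas; only safe when no item is quoted or contains a comma."""
    return csv_row.split(",")


quoted_row = '1,"Smith, John",3.5'  # hypothetical row with a comma inside quotes
plain_row = "1,Smith,3.5"           # hypothetical row without quotes

json_items = parse_row_as_json(quoted_row)
naive_items = parse_row_naive(plain_row)
print(json_items)
print(naive_items)
```

Note that the JSON route turns unquoted numbers into numeric values, while the plain split leaves every item as a string.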
I have a CSV file that I want to read and store in a case class. As I understand it, a CSV is a comma-separated values file. But in my file some of the data already contains commas, and splitting creates a new column for every comma. So the problem is how to split the data correctly.
1st data
04/20/2021 16:20(1st column) Here a bunch of basic techniques that suit most businesses, and easy-to-follow steps that can help you create a strategy for your social media marketing goals.(2nd column)
2nd data
11-07-2021 12:15(1st column) Focus on attracting real followers who are genuinely interested in your content, and make the most of your social media marketing efforts.(2nd column)
import scala.io.Source

var i = 0
var length = 0
val data = Source.fromFile(file)
for (line <- data.getLines) {
  // Naive split: this also breaks on commas inside a field
  val cols = line.split(",").map(_.trim)
  length = cols.length
  while (i < length) {
    //println(cols(i))
    i = i + 1
  }
  i = 0
}
If you are reading a complex CSV file then the ideal solution is to use an existing library. Here is a link to the ScalaDex search results for CSV.
ScalaDex CSV Search
However, based on the comments, it appears that you might actually be wanting to read data stored in a Google Sheet. If that is the case, you can utilize the fact that you have some flexibility to save the data in a text file yourself. When I want to read data from a Google Sheet in Scala, the approach I use first is to save the file in a format that isn't hard to read. If the fields have embedded commas but no tabs, which is common, then I will save the file as a TSV and parse that with split("\t").
A simple bit of code that only uses the standard library might look like the following:
val source = scala.io.Source.fromFile("data.tsv")
val data = source.getLines.map(_.split("\t")).toArray
source.close
After this, data will be an Array[Array[String]] with your data in it that you can process as you desire.
Of course, if your data includes both tabs and commas then you'll really want to use one of those more robust external libraries.
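To make the difference concrete, here is a small Python sketch (used only as a neutral illustration, not as a replacement for the Scala code above). The sample text is invented, and the CSV variant assumes that fields with embedded commas are enclosed in double quotes, which may not be true of your file.

```python
import csv
import io

# Hypothetical two-column sample: a timestamp and a text field containing commas.
tsv_text = "04/20/2021 16:20\tBasic techniques, and easy steps.\n"
csv_text = '04/20/2021 16:20,"Basic techniques, and easy steps."\n'

# Tab-separated: a plain split is enough because the commas are not delimiters.
tsv_rows = [line.split("\t") for line in tsv_text.splitlines()]

# Comma-separated with quoted fields: a CSV parser keeps the quoted commas intact.
csv_rows = list(csv.reader(io.StringIO(csv_text)))

print(tsv_rows)
print(csv_rows)
```

Both approaches yield the same two columns here; the plain split would stop working as soon as a field contained a tab.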
You could use the univocity CSV parser if parsing speed matters.
It can also be used to write CSV files, not just read them.
Univocity parsers
Papa Parse seems wise, but I think he might be giving me null. I'm just:
Papa.parse(countries);
Where countries is a string containing the XMLHttpRequest response for the countries CSV file from the timezone database here:
https://timezonedb.com/download
But Papa Parse seems to have added an empty array to the end of its data array. So when I'm searching and sorting through the array, that one empty guy at the end is giving me trouble. I can write around it, but it's not ideal, and I thought Papa Parse was supposed to make those kinds of CSV parsing problems go away. Am I parsing wrong?
Here is the end of the PapaParsed Array in console:
You need to use skipEmptyLines: true in parse config. For example:
Papa.parse(this.csvData, {skipEmptyLines: true,})
It was adding an empty line to my iteration as well. I decided to skip it by stopping the loop one element early:
for (let i = 0; i < data.length - 1; i++) {
You can also filter out empty entries after parsing.
For example, to remove empty values from the header row, you can use the following snippet:
headers.filter(Boolean);
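A likely cause of the extra row (an assumption, since the console output is not shown) is that the file ends with a newline, so the text after the last line break is parsed as one more, empty, line. The following Python sketch illustrates that effect and the two kinds of filtering discussed above. It does not use Papa Parse, and the sample data is made up.

```python
# Hypothetical CSV text that ends with a newline, as many files do.
csv_text = "AD,Andorra\nAE,United Arab Emirates\n"

# Splitting on the newline leaves an empty string after the final line break,
# which becomes a row holding a single empty field.
rows = [line.split(",") for line in csv_text.split("\n")]

# Same idea as skipping empty lines: drop rows whose fields are all empty.
non_empty_rows = [row for row in rows if any(field != "" for field in row)]

# Same idea as headers.filter(Boolean): drop empty values from one row.
headers = ["code", "", "name"]
clean_headers = [h for h in headers if h]

print(rows[-1])
print(non_empty_rows)
print(clean_headers)
```

Filtering by content is safer than always dropping the last element, because it still works when the input has no trailing newline.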
I am working on an ingestion feature that will take a strongly formatted .xlsx file and import the records to a temp storage table and then process the rows to create db records.
One of the columns is strictly formatted as "Text", but the Open XML API seems to handle the column's cells differently on a row-by-row basis. Some of the values appear to be numeric but truly are not (which is why we format the column as Text).
Some examples are "211377", "211727.01", "209395.388", "209395.435".
What these values represent is not important. What happens is that some values (using the Open XML API v2.5 library) are read in properly as text, whether retrieved from the Shared Strings collection or simply from the InnerXML property, while others get pulled in as numbers with what appears to be extra rounding or precision appended.
For example, "211377", "211727.01" and "209395.435" all come in exactly as they are in the spreadsheet, but the "209395.388" value is pulled in as "209395.38800000001" (this happens to other values as well).
There seems to be no rhyme or reason to which values get messed up and which import fine. What is really frustrating is that if I use the native Import feature in SQL Server Management Studio and ingest the same spreadsheet into a temp table, this does not happen. So how is it that the SSMS import can handle these values purely as text for all rows but the Open XML API cannot?
To begin, your main problem seems to be this value:
"209395.388" value is being pulled in as "209395.38800000001"
Yes, in the .xlsx file the value is stored as 209395.38800000001 instead of 209395.388. That is a valid way to store a floating-point number; there is nothing wrong with it. You can confirm this with the following code snippet:
string val = "209395.38800000001"; // <= what we extract from Open XML
Console.WriteLine(double.Parse(val)); // <= parse it as a double and print it
The output is :
209395.388 // <= yes the expected value
So there's nothing wrong in the value you extract from .xlsx using Open Xml SDK.
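The same round trip can be checked outside .NET. Here is a minimal Python sketch (Python floats are IEEE 754 doubles); it assumes the stored text is simply the double written out with 17 significant digits, which is what the value above looks like.

```python
stored_text = "209395.38800000001"  # the text as it appears in the sheet XML
value = float(stored_text)          # parse it into a double

shortest = repr(value)              # shortest text that parses back to the same double
seventeen = format(value, ".17g")   # the same double with 17 significant digits

print(value == 209395.388)
print(shortest)
print(seventeen)
```

Both strings denote the same double; they differ only in how many digits are printed.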
Now to cells: yes, a cell can have a variety of formats: numbers, text, booleans or shared-string text. You can also apply styles to a cell, which format the value for display in Excel (for example date/time formats, forced strings, etc.). This is how Excel handles the vast variety of data. It needs this kind of formatting, and the .xlsx file format has to be a little complex to support it all.
My advice is to inspect each extracted value, identify what format it represents (for example, whether it is a number or text), and apply the matching parse method.
For example:
string val = "209395.38800000001";
Console.WriteLine(float.Parse(val)); // <= parsing as float yields a different value: 209395.4
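To see why single precision gives a different result, here is a Python sketch that rounds the double to the nearest 32-bit float by packing it into 4 bytes. This only illustrates the precision loss and makes no claim about how .NET parses internally.

```python
import struct

value = float("209395.38800000001")  # the double from the spreadsheet

# Pack into a 4-byte IEEE 754 single and unpack again: this rounds the
# value to the nearest number a 32-bit float can represent.
as_float32 = struct.unpack("<f", struct.pack("<f", value))[0]

# Seven significant digits is roughly what single precision can carry.
seven_digits = format(as_float32, ".7g")

print(as_float32)
print(seven_digits)
```

At this magnitude a 32-bit float can only resolve steps of 1/64, so the fractional part .388 cannot survive the conversion.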
Update :
Here's how value is saved in internal XML
Try it for yourself:
Make an .xlsx file with the value 209395.388 -> change the extension to .zip -> unzip it -> go to the worksheets folder -> open Sheet1
You will notice that the value is stored as 209395.38800000001, as seen in the attached image. So there is nothing wrong with the API extracting the stored number. It is your job to decide what format to apply.
But if you format the whole column as Text before adding data, you will see that the .xlsx holds the data as it is, that is, as a string.
In my app, the web services are written in .NET; I consume them and get a response. All the fields, such as company, type and location, are strings, and there is no problem with those. There is one more field called Exhibit number. It is actually an integer, but it is also delivered as a string. When I display it, it shows zero instead of the number. Here is my code:
//Storing into Array
[SurveyFilesArray addObject:[NSDictionary dictionaryWithObjectsAndKeys:[dic objectForKey:@"FileName"],@"FileName",[dic objectForKey:@"ExibhitNumber"],@"ExhibitNumber",[dic objectForKey:@"Description"],@"Description",[dic objectForKey:@"FileQuality"],@"FileQuality",nil]];
//Retrieving from Array..
NSLog(@"???%@???",[NSString stringWithFormat:@"%d",[[[SurveyFilesArray objectAtIndex:indexPath.row]objectForKey:@"ExibhitNumber"]intValue]]);
NSLog(@"%d",[[[SurveyFilesArray objectAtIndex:indexPath.row]objectForKey:@"ExibhitNumber"]intValue]);
You have typos in your code.
In the order of your code:
Ex-ib-hit-Number
Ex-hib-it-Number
These are 2 different strings.
In your example code you store the value under the correctly spelled key, Ex-hib-it, but afterwards you try to read it with Ex-ib-hit. This cannot work.
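The zero comes from the failed lookup: in Objective-C, objectForKey: returns nil for a key that is not in the dictionary, and sending intValue to nil yields 0. The Python sketch below mimics that behaviour for illustration only; the stored value "42" and the helper int_value are made up.

```python
record = {"ExhibitNumber": "42"}  # hypothetical value stored under the correct key


def int_value(obj):
    """Mimic messaging nil in Objective-C: a missing object yields 0."""
    return int(obj) if obj is not None else 0


wrong = record.get("ExibhitNumber")  # misspelled key: nothing is found
right = record.get("ExhibitNumber")  # correctly spelled key

print(int_value(wrong))
print(int_value(right))
```

Defining the key once as a constant and using it for both storing and reading avoids this kind of mismatch.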
I have an online XML file filled with items.
At startup I check my internet connection; if there is one, I parse the XML and compare the item objects to those in my SQLite database. One of the item values is 'lastupdated', which is a PHP-generated string.
If the lastupdated value from the XML item differs from the value in the database, the item needs to be updated in the database.
I seem to have parsing errors: the lastupdated value in my database has 10 characters, while the one in my XML file seems to have 11. When I output both I get the following:
2010-03-26 15:15:07.771 bbc_v1[97647:207] 1269429166
2010-03-26 15:15:07.771 bbc_v1[97647:207]
1269429166
2010-03-26 15:15:07.771 bbc_v1[97647:207] lenght xml item value: 11
2010-03-26 15:15:07.771 bbc_v1[97647:207] length db value: 10
It seems I'm having parsing problems with whitespace and newline characters. How should I clean the XML value?
Try the -stringByTrimmingCharactersInSet: method:
NSString* cleanString = [dirtyString stringByTrimmingCharactersInSet:
[NSCharacterSet whitespaceAndNewlineCharacterSet]];
What about running strip or trim on the string? This normally removes leading and trailing whitespace.
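The log output in the question is consistent with a stray newline in front of the XML value (the number is printed on the line after the timestamp). Here is a short Python sketch of the effect of trimming, using the values from that log; the variable names are only for illustration.

```python
xml_value = "\n1269429166"  # value as parsed from the XML, with a leading newline
db_value = "1269429166"     # value stored in the database

clean_value = xml_value.strip()  # remove leading and trailing whitespace

print(len(xml_value), len(clean_value))
print(clean_value == db_value)
```

After trimming, both strings have 10 characters and compare equal, so the item is no longer flagged as changed.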