I have a CSV file and I want to read it and store each row in a case class. As I know, a CSV is a comma-separated values file, but in my CSV some fields already contain commas themselves, and splitting on every comma creates an extra column for each one. So the problem is how to split the data correctly.
1st row:
04/20/2021 16:20 (1st column) | Here a bunch of basic techniques that suit most businesses, and easy-to-follow steps that can help you create a strategy for your social media marketing goals. (2nd column)
2nd row:
11-07-2021 12:15 (1st column) | Focus on attracting real followers who are genuinely interested in your content, and make the most of your social media marketing efforts. (2nd column)
import scala.io.Source

var i = 0
var length = 0
val data = Source.fromFile(file)
for (line <- data.getLines()) {
  // naive split on "," -- this is what breaks fields that themselves contain commas
  val cols = line.split(",").map(_.trim)
  length = cols.length
  while (i < length) {
    // println(cols(i))
    i = i + 1
  }
  i = 0
}
If you are reading a complex CSV file then the ideal solution is to use an existing library. Here is a link to the ScalaDex search results for CSV.
ScalaDex CSV Search
However, based on the comments, it appears that you might actually be wanting to read data stored in a Google Sheet. If that is the case, you can utilize the fact that you have some flexibility to save the data in a text file yourself. When I want to read data from a Google Sheet in Scala, the approach I use first is to save the file in a format that isn't hard to read. If the fields have embedded commas but no tabs, which is common, then I will save the file as a TSV and parse that with split("\t").
A simple bit of code that only uses the standard library might look like the following:
val source = scala.io.Source.fromFile("data.tsv")
// each line becomes an Array[String] of its tab-separated fields
val data = source.getLines().map(_.split("\t")).toArray
source.close()
After this, data will be an Array[Array[String]] with your data in it that you can process as you desire.
Of course, if your data includes both tabs and commas then you'll really want to use one of those more robust external libraries.
You could use the univocity CSV parser for better performance. You can also use it to write CSV files. (A minimal usage sketch follows the link below.)
Univocity parsers
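For example, here is a minimal sketch assuming the univocity-parsers dependency is on the classpath; the "data.csv" path is a placeholder:

import java.io.File
import com.univocity.parsers.csv.{CsvParser, CsvParserSettings}

val settings = new CsvParserSettings()
settings.getFormat.setLineSeparator("\n")
// quoted fields that contain commas are handled by the parser
val parser = new CsvParser(settings)

// "data.csv" is a placeholder path for this sketch
val rows = parser.parseAll(new File("data.csv")) // java.util.List[Array[String]]
rows.forEach(cols => println(cols.mkString(" | ")))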
I have a request payload (JSON format) which has an array with 1000 objects. Each object has 6 key-value pairs; 5 of them I'm reading from the CSV file using parameterization, and the 6th key has to be a unique future date for each object in the array.
I tried this with the time-shift function, which works for 1 iteration, but I want to execute it for n iterations.
I looked for Groovy code for this, but I have no knowledge of Groovy and have only just started learning it.
How can I achieve this in JMeter?
Also, when the time-shift function is read from HTTP Request Defaults parameters or from Test Plan User Defined Variables, it does not produce a different date for each object; it duplicates the same date from the first variable in every object.
{
  "deviceNumber": "XX",
  "array": [
    {
      "keyValue1": "${value1_ReadFromCSV}",
      "keyValue2": "${value2_ReadFromCSV}",
      "keyValue3": "${value3_ReadFromCSV}",
      "keyValue4": "${value4_ReadFromCSV}",
      "keyValue5": "${value5_ReadFromCSV}",
      "keyValue6": "2020-05-23" (Should be dynamically generated)
    },
    {
      "keyValue7": "value7_ReadFromCSV",
      "keyValue8": "value8_ReadFromCSV",
      "keyValue9": "value9_ReadFromCSV",
      "keyValue10": "value10_ReadFromCSV",
      "keyValue11": "value11_ReadFromCSV",
      "keyValue12": "2020-05-24" (Should be dynamically generated)
    },
    ...
    {
      "keyValue995": "value995_ReadFromCSV",
      "keyValue996": "value996_ReadFromCSV",
      "keyValue997": "value997_ReadFromCSV",
      "keyValue998": "value998_ReadFromCSV",
      "keyValue999": "value999_ReadFromCSV",
      "keyValue1000": "2025-12-31" (Should be dynamically generated)
    }
  ]
}
I have a partial solution: reading the CSV file line by line and storing each line in a variable using Groovy. However, I don't want to store the line directly in the variable; I want to build a JSON object like the one above from each line of the CSV file, with a unique future date for each object in the array.
The CSV file is (note: I have removed the date column from the CSV as I no longer need it):
deviceNumber,keyValue1,keyValue2,keyValue3,keyValue4,keyValue5,keyValue7,keyValue8,keyValue9,keyValue10,keyValue11,keyValue12,keyValue13,keyValue15,keyValue15,keyValue16
01,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring
02,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring
03,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring
.
.
.
1000,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring,somestring
Kindly suggest any reference/example to do this.
I can provide only generic instructions (a minimal Groovy sketch follows the links below):
You can dynamically construct the request body using a JSR223 PreProcessor
You can read the CSV file into memory using the File.readLines() function
You can build JSON out of the values from the CSV file using the JsonBuilder class
More information:
Apache Groovy - Parsing and producing JSON
Apache Groovy - Why and How You Should Use It
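Not a ready-made script, just a minimal Groovy sketch of those three steps in a JSR223 PreProcessor; the CSV path, the column-to-key mapping, and the requestBody variable name are assumptions you will need to adapt:

import groovy.json.JsonBuilder

// assumed path to the CSV shown above; the first line is the header row
def lines = new File('/path/to/data.csv').readLines()
def rows  = lines.drop(1)

def today   = new Date()
def payload = [
    deviceNumber: 'XX',
    array       : rows.withIndex().collect { line, idx ->
        def cols = line.split(',')
        [
            keyValue1: cols[1],
            keyValue2: cols[2],
            keyValue3: cols[3],
            keyValue4: cols[4],
            keyValue5: cols[5],
            // a unique future date per object: today + 1 day, + 2 days, ...
            keyValue6: (today + (idx + 1)).format('yyyy-MM-dd')
        ]
    }
]

// expose the generated body to the sampler as ${requestBody}
vars.put('requestBody', new JsonBuilder(payload).toPrettyString())

You would then reference ${requestBody} as the body data of the HTTP Request sampler. The key names here are illustrative; in your payload each object uses different key names, so adjust the map accordingly.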
I am writing Scala scripts. I need to perform row filter operations, such as greater-than and less-than comparisons, on a CSV file. I have tried using the filter option in the script but am unable to get the results. Please let me know how to perform a filter operation on the CSV file. The sample data has been attached here for reference. Thanks in advance.
for (line <- bufferedSource.getLines) {
  cols += line.split(",").filter(csv => csv(1).toInt > 10000)
}
Instead of resorting to a for loop, use map. This code snippet should work:
bufferedSource.getLines.map(row => row.split(",")).filter(cols => cols(1).toInt > 10000).toList
Also, it's a better approach to use a case class for the CSV you are filtering to make your code more readable.
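For instance, a sketch along those lines; the Record field names, the data.csv path, and the 10000 threshold are assumptions taken from the snippet above:

import scala.io.Source

// hypothetical row shape: an id plus a numeric amount in the second column
case class Record(id: String, amount: Int)

val source = Source.fromFile("data.csv") // placeholder path
val records =
  source.getLines()
    .map(_.split(",").map(_.trim))
    .collect { case Array(id, amount, _*) => Record(id, amount.toInt) } // assumes no header row
    .filter(_.amount > 10000)
    .toList
source.close()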
I have a data source which is stored as a large number of gzipped, csv files. The header info for this source is a separate file.
I'd like to load this data into Spark for manipulation - is there an easy way to get Spark to figure out the schema / load the headers? There are literally hundreds of columns, and they might change between runs, so I'd strongly prefer not to do this by hand.
This can easily be done in Spark:
If your header file is header.csv and it contains only the header, then first load this file with the header option set to true:
val headerCSV = spark.read.format("CSV").option("header","true").load("/home/shivansh/Desktop/header.csv")
Then get the columns out in the form of an Array:
val columns = headerCSV.columns
Then read the other file without the header information and pass these columns in as the header:
spark.read.format("CSV").load("/home/shivansh/Desktop/fileWithoutHeader.csv").toDF(columns:_*)
This will result in a DataFrame with the header applied to the data!
I am trying to feed the values of a feeder that supplies IDs into a .txt file. Is there any way to extract values directly from the feeder without having to extract the ID from each session?
I'm not sure exactly what you mean, but to extract values from a feeder you can do the following:
val creditCard = "creditCard"
feed(tsv("CreditCard.txt").random)
Inside the file "CreditCard.txt", the first line (the column name) must match the variable's value exactly: "creditCard".
You can then use it as "${creditCard}" in your script.
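For example, a minimal Gatling simulation sketch that wires the feeder into a request; the endpoint, base URL, and scenario name are placeholders:

import io.gatling.core.Predef._
import io.gatling.http.Predef._

class CreditCardSimulation extends Simulation {

  // first line of CreditCard.txt must be the column name: creditCard
  val creditCardFeeder = tsv("CreditCard.txt").random

  val scn = scenario("Payment")
    .feed(creditCardFeeder)
    .exec(
      http("submit payment")
        .post("/pay")                       // placeholder endpoint
        .formParam("card", "${creditCard}") // value injected by the feeder
    )

  setUp(scn.inject(atOnceUsers(1)))
    .protocols(http.baseUrl("https://example.org")) // placeholder base URL
}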