I would like to save a session attribute in a list in my Gatling simulation. What I am trying to do is get all the values of my JSON whose paths are defined in a CSV file and write them to a file. In my example below, "test" is always equal to the value of the first jsonPath.
Here is what I am doing:
val scn1 = scenario("[SCENARIO] GET")
  .repeat(Nbproduct-1, "counter") (
    feed(csv(CSV).circular)
      .exec(http("get JSON")
        .get(url_1)
        .check(jsonPath("""$.${meta_ref}""").find.saveAs("test")))
      .pause(1)
      .exec(session => {
        writer.write("\""+session("meta_cts").as[String]+"\":\"" + session("test").as[String]+"\",\n")
        session
      })
  )
I also tried this, but it gets the value of the counter...
.check(jsonPath("""$.${meta_ref}""").find.saveAs("""jdd_value("${counter}")""")))
Thanks for the help!
Feeders are shared data sources, so the first user pops the first record, the second user the second record, and so on.
Also, it's not possible to define checks at runtime (depending on some entries in a file). All DSL components are builders that are only resolved once, when the Simulation is loaded.
I am using Gatling to test a system which expects two sequential POST requests, say R1 and R2. These POST requests have different JSON request bodies but one common key, "ID". One user should execute R1 and then R2 in order, and a new random ID should be generated per user. The ID generated in R1 should be passed to R2 and added as the value of the ID key in its request body.
The random ID is generated inside a feeder at the R1 request:
val R1Id = Iterator.continually(Map("randId1" -> R1_requestBody.replace("0000000000", randomTokenGenerator.generateTokenID())))
val r1 =
scenario("R1Scenarios").feed(R1Id)
.exec(http("POST R1")
.....
.body(StringBody(session => """${randId1}""")).asJSON
Now, in R2, I want to feed the ID value that was generated inside the feeder of R1.
val R2Id = Iterator.continually(Map("randId2" -> R2_requestBody.replace("0000000000", ***Token generated in the first request***)))
val R2= {
scenario("R2 Scenarios")
.exec(R1.r1)
//calls the first scenario as R2 should be executed after R1
.feed(R2Id )
.exec(http("POST R2")
....
.body(StringBody(session => """${randId2}""")).asJSON
Finally executing the simulation:
val jsonScenario = R2.r2.inject(constantUsersPerSec(2) during (1 second))
setUp(jsonScenario)
.protocols(httpConf)
Instead of generating the whole body in the feeder, you can generate only the random ID; let's call it userToken:
val tokenFeeder = Iterator.continually(Map(
"userToken" -> randomTokenGenerator.generateTokenID()
))
and replace it while building request body:
.body(
StringBody(session => R1_requestBody.replace(
"0000000000",
session("userToken").as[String]
))
).asJSON
Or, even cleaner, use the fact that Gatling replaces every placeholder of the form ${sessionAttributeName} in a string with the string value of that session attribute. Instead of using "0000000000" in your body template, use a ${userToken} placeholder, for example:
val bodyTemplate ="""{
|"userName": "John Doe",
|"userToken": "${userToken}"
|}""".stripMargin
and then just use that template for body and Gatling expression language will do the magic:
.body(StringBody(bodyTemplate)).asJSON
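To make the idea concrete outside of Gatling, here is a minimal plain-Scala sketch of a feeder that carries only the token, with the same token then substituted into both request bodies. The generateTokenID implementation, the two templates, and the render helper are assumptions made for illustration; they are not part of the original code or of the Gatling API.

```scala
import java.util.UUID

object TokenFeederSketch {
  // Hypothetical stand-in for randomTokenGenerator.generateTokenID() from the question.
  def generateTokenID(): String = UUID.randomUUID().toString.take(10)

  // The feeder carries only the token. Iterator.continually takes its argument
  // by name, so a new token is generated on every call to next().
  val tokenFeeder: Iterator[Map[String, String]] =
    Iterator.continually(Map("userToken" -> generateTokenID()))

  // Hypothetical body templates sharing the "0000000000" marker used in the question.
  val r1Template: String = """{"request": "R1", "ID": "0000000000"}"""
  val r2Template: String = """{"request": "R2", "ID": "0000000000"}"""

  // Substitutes the token of one feeder record into a template.
  def render(template: String, record: Map[String, String]): String =
    template.replace("0000000000", record("userToken"))

  def main(args: Array[String]): Unit = {
    val user = tokenFeeder.next() // one record per virtual user
    println(render(r1Template, user))
    println(render(r2Template, user)) // same token as in R1
  }
}
```

Because the record is pulled once per user and reused for both bodies, R1 and R2 end up with the same ID, while the next user gets a different one.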
I'm trying to get gatling to create random data per POST request. I've followed a few posts on stackoverflow and other places. I came up with this scenario -
def randomUuid = UUID.randomUUID().toString
val feeder = Iterator.continually(Map("user" -> randomUuid))
def createPostRequest = {
http("createuser")
.post("http://jsonplaceholder.typicode.com/posts")
.body(StringBody("${user}"))
.check(status.is(201))
}
val scn = scenario("some load test")
.feed(feeder)
.forever(exec(createPostRequest))
setUp(scn.inject(atOnceUsers(1)))
.maxDuration(20 minutes)
However, when I run this code it just calls my feeder once to create a single UUID and just re-uses the same UUID throughout the load test.
I created the code above after following this thread. I'm using gatling 2.2.5. Here's my sbt config -
import sbt._
object Dependencies {
private val gatlingHighcharts = "io.gatling.highcharts" % "gatling-charts-highcharts" % "2.2.5" % "test"
private val gatlingTest = "io.gatling" % "gatling-test-framework" % gatlingHighcharts.revision % "test"
val gatlingDependencies = Seq(gatlingHighcharts, gatlingTest)
}
Since you don't call feed inside a loop (here, your forever loop), only a single value is generated per virtual user.
If what you want is to have unique values per loop iteration, move the feed call inside the loop.
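The difference between the two placements can be sketched in plain Scala, without Gatling. The two helper functions below are hypothetical and only mimic what happens to the feeder: one pulls a record once before the loop, the other pulls a record on every iteration.

```scala
import java.util.UUID

object FeedPlacementSketch {
  def randomUuid: String = UUID.randomUUID().toString

  // Same shape as the feeder in the question: a new UUID on every next().
  def newFeeder: Iterator[Map[String, String]] =
    Iterator.continually(Map("user" -> randomUuid))

  // Mimics feed(...) placed before the loop: one record is pulled, then reused.
  def feedOutsideLoop(iterations: Int): List[String] = {
    val record = newFeeder.next()
    List.fill(iterations)(record("user"))
  }

  // Mimics feed(...) placed inside the loop: one record is pulled per iteration.
  def feedInsideLoop(iterations: Int): List[String] = {
    val feeder = newFeeder
    List.fill(iterations)(feeder.next()("user"))
  }

  def main(args: Array[String]): Unit = {
    println(feedOutsideLoop(3).distinct.size) // a single distinct value
    println(feedInsideLoop(3).distinct.size)  // one distinct value per iteration
  }
}
```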
In your setUp, you're only creating one user, so your scenario is only executed once, meaning that feed is only called once before you start looping over your request.
Change your scenario to:
val scn = scenario("some load test")
.feed(feeder)
.exec(createPostRequest)
and make your setUp (replacing 100 with whatever number of users you want)
setUp(scn.inject(atOnceUsers(100)))
We are currently facing a performance issue in a Spark SQL application written in Scala. The application flow is described below.
1) The Spark application reads a text file from an input HDFS directory.
2) It creates a data frame on top of the file by programmatically specifying the schema. This data frame is an exact replica of the input file, kept in memory, and has around 18 columns.
var eqpDF = sqlContext.createDataFrame(eqpRowRdd, eqpSchema)
3) It creates a filtered data frame from the data frame constructed in step 2. This data frame contains the unique account numbers, obtained with the distinct keyword.
var distAccNrsDF = eqpDF.select("accountnumber").distinct().collect()
4) Using the two data frames constructed in steps 2 and 3, we get all the records which belong to one account number and apply some JSON parsing logic to the filtered data.
var filtrEqpDF =
eqpDF.where("accountnumber='" + data.getString(0) + "'").collect()
5) Finally, the parsed JSON data is put into an HBase table.
The performance issues arise when calling the collect method on the data frames, because collect fetches all the data onto a single node before processing it, thus losing the benefit of parallel processing.
Also, in the real scenario we can expect around 10 billion records. Collecting all those records on the driver node might crash the program itself due to memory or disk space limitations.
I don't think the take method, which fetches a limited number of records at a time, can be used in our case: we have to get all the unique account numbers from the whole data set, so I am not sure it would suit our requirements.
I would appreciate any help in avoiding collect, and any other best practices to follow. Code snippets, suggestions, or Git links would be very helpful if anyone has faced similar issues.
Code snippet
val eqpSchemaString = "accountnumber ....."
val eqpSchema = StructType(eqpSchemaString.split(" ").map(fieldName =>
  StructField(fieldName, StringType, true)))
val eqpRdd = sc.textFile(inputPath)
val eqpRowRdd = eqpRdd.map(_.split(",")).map(eqpRow => Row(eqpRow(0).trim, eqpRow(1).trim, ....))
var eqpDF = sqlContext.createDataFrame(eqpRowRdd, eqpSchema);
var distAccNrsDF = eqpDF.select("accountnumber").distinct().collect()
distAccNrsDF.foreach { data =>
var filtrEqpDF = eqpDF.where("accountnumber='" + data.getString(0) + "'").collect()
var result = new JSONObject()
result.put("jsonSchemaVersion", "1.0")
val firstRowAcc = filtrEqpDF(0)
//Json parsing logic
{
.....
.....
}
}
The approach usually taken in this kind of situation is:
Instead of collect, invoke foreachPartition: it applies a function to each partition (represented by an Iterator[Row]) of the underlying DataFrame separately (the partition being the atomic unit of parallelism in Spark).
The function opens a connection to HBase (thus making it one per partition) and sends all the contained values through this connection.
This means that every executor opens its own connection (which is not serializable, but lives within the boundaries of the function and so never needs to be sent across the network) and independently sends its contents to HBase, without any need to collect all the data on the driver (or on any one node, for that matter).
It looks like you are reading a CSV file, so probably something like the following will do the trick:
spark.read.csv(inputPath). // Using DataFrameReader but your way works too
foreachPartition { rows =>
val conn = ??? // Create HBase connection
for (row <- rows) { // Loop over the iterator
val data = parseJson(row) // Your parsing logic
??? // Use 'conn' to save 'data'
}
}
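Since the question needs all records of one account number to be processed together, one possible refinement is to make sure each account lands in a single partition (in Spark, for example, by repartitioning on the account number column) and then to group by account inside the per-partition function. The sketch below simulates that pattern with plain Scala collections so it can be run without a cluster. Record, FakeConnection, partitionByAccount, and the JSON layout are all hypothetical stand-ins, not Spark or HBase APIs.

```scala
object PartitionWriteSketch {
  final case class Record(accountNumber: String, payload: String)

  // Hypothetical stand-in for an HBase connection; it only remembers what was written.
  final class FakeConnection {
    val written = scala.collection.mutable.ListBuffer.empty[String]
    def put(json: String): Unit = written += json
    def close(): Unit = ()
  }

  // Mimics repartitioning by key: all records of an account end up in the same partition.
  def partitionByAccount(records: Seq[Record], numPartitions: Int): Seq[Seq[Record]] = {
    val byPartition = records.groupBy(r => math.abs(r.accountNumber.hashCode % numPartitions))
    (0 until numPartitions).map(i => byPartition.getOrElse(i, Seq.empty))
  }

  // Body of the per-partition function: one connection per partition,
  // one JSON document per account number found in that partition.
  // Note that toSeq materialises the partition, so partitions must fit in memory.
  def processPartition(rows: Iterator[Record]): List[String] = {
    val conn = new FakeConnection
    try {
      rows.toSeq.groupBy(_.accountNumber).foreach { case (account, recs) =>
        val payloads = recs.map(r => "\"" + r.payload + "\"").mkString("[", ",", "]")
        conn.put(s"""{"jsonSchemaVersion":"1.0","accountnumber":"$account","records":$payloads}""")
      }
      conn.written.toList
    } finally conn.close()
  }

  def main(args: Array[String]): Unit = {
    val records = Seq(Record("A1", "x"), Record("B2", "y"), Record("A1", "z"))
    partitionByAccount(records, 2).foreach(p => processPartition(p.iterator).foreach(println))
  }
}
```

The point of the design is that no step needs the full list of distinct account numbers on the driver: each partition discovers its own accounts and writes them independently.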
You should avoid collect in your code if you have a large data set.
collect returns all the elements of the dataset as an array to the driver program. It is usually useful after a filter or other operation that returns a sufficiently small subset of the data.
It can also cause the driver to run out of memory, because collect() fetches the entire RDD/DataFrame to a single machine.
I have just edited your code, which should work for you.
var distAccNrsDF = eqpDF.select("accountnumber").distinct()
distAccNrsDF.foreach { data =>
var filtrEqpDF = eqpDF.where("accountnumber='" + data.getString(0) + "'")
var result = new JSONObject()
result.put("jsonSchemaVersion", "1.0")
val firstRowAcc = filtrEqpDF(0)
//Json parsing logic
{
.....
.....
}
}
Scala is pretty new to me, and I run into problems as soon as I leave the Gatling DSL.
In my case I call an API (Mailhog) which responds with a lot of mails in JSON format. I can't grab all the values.
I need to extract them with jsonPath and with regex as well.
That leads to a map and a list which I need to iterate through, saving each value.
.check(jsonPath("$[*]").ofType[Map[String,Any]].findAll.saveAs("id_map"))
.check(regex("href=3D\\\\\"(.*?)\\\\\"").findAll.saveAs("url_list"))
At first I wanted to loop over the checks, but I didn't find any way to repeat them without repeating the GET request too. So I end up with a map and a list.
1) I need every value of the map and was able to solve the problem with the following foreach loop.
.foreach("${id_map}", "idx") {
  exec(session => {
    val idMap = session("idx").as[Map[String,Any]]
    val ID = idMap("ID")
    session.set("ID", ID)
  })
  .exec(http("Test")
    .get("/${ID}"))
}
2) I need every 3rd value of the list and make a get-request on them. Before I can do this, I need to replace a part of the string. I tried to replace parts of the string while checking for them. But it won’t work with findAll.
.check(regex("href=3D\\\\\"(.*?)\\\\\"").findAll.transform(raw => raw.replace("""=\r\n""","")).saveAs("url"))
How can I replace a part of every string in my list?
Also, how can I make a GET request on every 3rd element in the list?
I can't get it to work with the same foreach structure as above.
I was able to solve the problem myself. First I made a small change to my check(regex ...) part.
.check(regex("href=3D\\\\\"(.*?)\\\\\"").findAll.transform(_.map(raw => raw.replace("""=\r\n""",""))).saveAs("url_list"))
Then I wanted to make a Get-Request only on every third element of my list (because the URLs I extracted appeared three times per Mail).
.exec(session => {
val url_list =
session("url_list").as[List[Any]].grouped(3).map(_.head).toList
session.set("url_list", url_list)
})
At the end I iterate through my final list with a foreach-loop.
foreach("${url_list}", "urls") {
exec(http("Activate User")
.get("${urls}")
)
}
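The two list operations used above can be tried in isolation with plain Scala. This is only a sketch: the sample URLs are made up, and it assumes the extracted strings contain the escaped sequence \r\n literally (backslash characters, as the triple-quoted string in the check suggests) rather than real line breaks.

```scala
object UrlListSketch {
  // Removes the soft line break marker from every element.
  // In a triple-quoted string, \r and \n are literal backslash sequences, not control characters.
  def cleanAll(raw: Seq[String]): Seq[String] =
    raw.map(_.replace("""=\r\n""", ""))

  // Keeps the first element of every group of three, i.e. elements 0, 3, 6, ...
  def everyThird(urls: Seq[String]): List[String] =
    urls.grouped(3).map(_.head).toList

  def main(args: Array[String]): Unit = {
    // Hypothetical input: each URL appears three times, one of them with a soft line break.
    val raw = List.fill(3)("""http://example.test/a?to=\r\nken=1""") ++
      List.fill(3)("http://example.test/a?token=2")
    everyThird(cleanAll(raw)).foreach(println)
  }
}
```

Applying the replacement with map inside transform is what makes it work with findAll: the check yields a sequence of strings, so the function has to be applied to each element rather than to the sequence itself.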
I'm currently creating some Gatling simulations to test a REST API. I don't really understand Scala.
I've created a scenario with several exec and pause steps:
object MyScenario {
val ccData = ssv("cardcode_fr.csv").random
val nameData = ssv("name.csv").random
val mobileData = ssv("mobile.csv").random
val emailData = ssv("email.csv").random
val itemData = ssv("item_fr.csv").random
val scn = scenario("My use case")
.feed(ccData)
.feed(nameData)
.feed(mobileData)
.feed(emailData)
.feed(itemData)
.exec(
http("GetCustomer")
.get("/rest/customers/${CardCode}")
.headers(Headers.headers)
.check(
status.is(200)
)
)
.pause(3, 5)
.exec(
http("GetOffers")
.get("/rest/offers")
.queryParam("customercode", "${CardCode}")
.headers(Headers.headers)
.check(
status.is(200)
)
)
}
And I've a simple Simulation :
class MySimulation extends Simulation {
setUp(MyScenario.scn
.inject(
constantUsersPerSec (1 ) during (1)))
.protocols(EsbHttpProtocol.httpProtocol)
.assertions(
global.successfulRequests.percent.is(100))
}
The application I'm trying to simulate is a multi-location mobile app, so I've prepared a set of sample data for each locale (US, FR, IT...).
My REST API handles all the locales, therefore I want to make the simulation concurrently execute several instances of MyScenario, each with a different locale sample, to simulate the global load.
Is it possible to execute my simulation without having to create/duplicate the scenario and change the val ccData = ssv("cardcode_fr.csv").random for each one?
Also, each locale has its own load, how can I create a simulation that takes a single scenario and executes it several times concurrently with a different load and feeders?
Thanks in advance.
From what you've said, I think this may be a good approach:
Start by grouping your data in such a way that you can look up each item you want to send based on the current locale. For this, I would recommend using a Map that matches a locale string (such as "FR") to the item that matches that locale for the field you're looking to fill in. Then, at the start of each iteration of the scenario, you just pick which locale you want to use for the current iteration from a list. It would look something like this:
val locales = List("US", "FR", "IT")
val names = Map("US" -> "John", "FR" -> "Pierre", "IT" -> "Guillame")
// Random number generator used to pick a locale
val rand = new scala.util.Random
object MyScenario {
  // These two lines pick a random locale from your list
  val random_index = rand.nextInt(locales.length)
  val currentLocale = locales(random_index)
  // This line gets the name
  val name = names(currentLocale)
  // Do the rest of your logic here
}
This is a very simplified example - you'll have to figure out how you actually want to retrieve the data from files and put it into a Map structure, as I assume you don't want to hard code every item for every field into your code.
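If each locale should also get its own load, a possible variation is to describe every locale once (name plus load) and derive the feeder file names from it, so that a single scenario-building function can be called once per locale. The sketch below is plain Scala only. LocaleProfile, the load figures, and the helper names are assumptions; the file naming simply follows the cardcode_fr.csv and item_fr.csv pattern from the question.

```scala
object LocaleProfilesSketch {
  // One entry per locale: which data to use and how much load to inject (made-up numbers).
  final case class LocaleProfile(locale: String, usersPerSec: Double)

  val profiles: List[LocaleProfile] = List(
    LocaleProfile("US", 5.0),
    LocaleProfile("FR", 2.0),
    LocaleProfile("IT", 1.0)
  )

  // Only these two feeders carry a locale suffix in the question's file names.
  val localizedPrefixes: List[String] = List("cardcode", "item")

  // Builds a file name such as "cardcode_fr.csv" from a prefix and a locale.
  def feederFile(prefix: String, locale: String): String =
    s"${prefix}_${locale.toLowerCase}.csv"

  def feederFilesFor(profile: LocaleProfile): List[String] =
    localizedPrefixes.map(feederFile(_, profile.locale))

  def main(args: Array[String]): Unit =
    profiles.foreach { p =>
      println(s"${p.locale}: ${feederFilesFor(p).mkString(", ")} at ${p.usersPerSec} users/sec")
    }
}
```

In a simulation, a function taking such a profile could build the scenario with the matching feeders and inject it at the profile's rate, and the resulting populations could then be passed to setUp together; that wiring is left out here because it depends on the Gatling version in use.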