validating response outside of context manager? - locust

[EDITED: I realized after reading response that I oversimplified my question.]
I am new to Locust and not sure how to solve this problem.
I have a function (call it "get_doc") that is passed a locust.HttpSession() and uses it to issue an HTTP request. It gets the response, parses it, and returns the parsed document up several layers of calls. One of these higher-level callers looks at the parsed document to decide whether the response was what was expected. If not, I want Locust to mark the request/response as failed. A code sketch would be:
    class MyUser(HttpUser):
        @task
        def mytask(self):
            behavior1(self.client)

    def behavior1(session):
        doc = get_doc(session, url1)
        if not doc_ok(doc):
            # ??? how to register a failure with Locust here...
        doc2 = get_doc(session, url2)
        ...

    def get_doc(http_session, url):
        page = http_session.get(url)
        doc = parse(page)
        return doc
There may be several behavior[n] functions and several Locust users calling them.
A constraint is that I would like to keep Locust-specific stuff out of behavior1() so that I can call it with an ordinary Requests session. I have tried to do something like this in get_doc() (the catch_response parameter and the success/failure handling are actually conditional on 'session' being an HttpSession object):
    def get_doc(session, meth, url):
        resp = session.request(meth, url, catch_response=True)
        doc = parse(resp.content)
        doc.logfns = resp.success, resp.failure
        return doc
and then in behavior1() or some caller higher up the chain I can call
    doc.logfns[1]("Document not as expected")
or
    doc.logfns[0] # Looks good!
Unfortunately this is not working; the calls to them produce no errors but Locust doesn't seem to record any successes or failures either. I am not sure if it should work or I bungled something in my code. Is this feasible? Is there a better way?

You can make get_doc a context manager, call .get with catch_response=True and yield instead of return inside it. Similar to how it is done here: https://github.com/SvenskaSpel/locust-plugins/blob/2cbbdda9ae37b6cbb0a11cf69aca80b164198aec/locust_plugins/users/rest.py#L22
And then use it like this:
    def mytask(self):
        with get_doc(self.client, url) as doc:
            if not doc_ok(doc):
                doc.failure("doc was not ok :(")
If you want, you can add the parsed doc as a field on the response before yielding in your get_doc function, or call doc.failure() inside doc_ok.
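A minimal sketch of that context-manager version of get_doc, assuming a hypothetical parse() helper and an illustrative `doc` attribute on the response (neither name is Locust API). As far as I understand Locust's behaviour, with catch_response=True the outcome is only reported when the response's `with` block exits, which would explain why calling the saved success/failure functions later records nothing.

```python
from contextlib import contextmanager


def parse(content):
    # Hypothetical stand-in for the question's parse(); replace with real parsing.
    return content.decode("utf-8")


@contextmanager
def get_doc(session, url):
    # Assumes `session` is a Locust HttpSession (e.g. self.client).
    # With catch_response=True the returned response is a context manager;
    # its outcome is reported when this inner `with` block exits.
    with session.get(url, catch_response=True) as resp:
        resp.doc = parse(resp.content)  # `doc` is an illustrative attribute name
        yield resp  # the caller may call resp.failure(...) or resp.success()
```

Used as `with get_doc(self.client, url) as resp:`, the caller can inspect resp.doc and call resp.failure("...") before the block closes. A plain Requests session does not accept catch_response, so that case would still need the conditional handling mentioned in the question.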

Related

How to handle a basic form submission with http4s?

I can't believe this isn't in the http4s documentation, and the example code I was able to dig up online (after poking around long enough to discover the UrlForm class) is not working for me.
The relevant bit of code looks like this:
    case req @ POST -> Root / "compose" =>
      req.decode[UrlForm] { ps =>
        println("ps.values: " + ps.values)
        val content = ps.getFirstOrElse("content",
          throw new IllegalStateException("No content given!"))
        // Do something with `content`...
      }
When submitting the associated form, the IllegalStateException is thrown. ps.values is an empty map (Map()).
I can see (using println) that the Content-Type is application/x-www-form-urlencoded, as expected, and I can see from my browser's Network tab that the request "parameters" (the encoded form values) are being sent properly.
The problem is that I had a filter (javax.servlet.Filter) in place that was calling getParameterMap on the HttpServletRequest. This was draining the InputStream for the request, and it was happening before the request got passed off to the servlet (BlockingHttp4sServlet) instance.
It seems to me the BlockingHttp4sServlet should raise an IllegalStateException (or something more descriptive) when it receives an InputStream with isFinished returning true. (I've filed an issue with the http4s project on Github.)
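The underlying effect, a body stream that can only be read once, can be illustrated outside of Scala. This is a small Python sketch that uses an in-memory stream as a stand-in for the servlet InputStream; filter_reads_params and servlet_decodes_form are hypothetical names, not http4s or servlet API.

```python
import io
from urllib.parse import parse_qs


def filter_reads_params(body):
    # Stand-in for a filter that parses the form body, consuming the stream.
    return parse_qs(body.read().decode("ascii"))


def servlet_decodes_form(body):
    # Stand-in for the downstream handler decoding the same stream.
    return parse_qs(body.read().decode("ascii"))


body = io.BytesIO(b"content=hello")
seen_by_filter = filter_reads_params(body)
seen_by_servlet = servlet_decodes_form(body)  # the stream is already at EOF

print(seen_by_filter)   # {'content': ['hello']}
print(seen_by_servlet)  # {}
```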

how to keep redirection history of duplicates

Scrapy's duplication filter ignores already seen URLs/requests. So far, so good.
The Problem
Even if a request is dropped I still want to keep the redirection history.
Example:
Request 1 : B
Request 2 : A --301--> B
In this case request 2 is dropped without letting me know that it is a 'hidden' duplicate of request 1.
Attempts
I already tried to catch the signal request_dropped. This works, but I don't see a way to send an item to the pipeline from the handler.
Best regards and thanks for your help :)
Raphael
You are probably looking for DUPEFILTER_DEBUG
Set it to True in settings.py file and you will see all URLs that were ignored because of being duplicate
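For reference, the setting goes into the project's settings.py as a plain assignment. Note that it only logs the filtered requests; it does not by itself pass anything to an item pipeline.

```python
# settings.py (fragment)
# Log every request dropped by the duplicate filter, not only the first one.
DUPEFILTER_DEBUG = True
```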
I figured out a way to handle those 'hidden' redirects:
Catch the signal 'request_dropped' from 'from_crawler':
    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super(YourSpider, cls).from_crawler(crawler, *args, **kwargs)
        crawler.signals.connect(spider.on_request_dropped, signal=signals.request_dropped)
        return spider
Use 'self.crawler.engine.scraper.enqueue_scrape' to route the response to a callback which can yield items. enqueue_scrape expects a response so you can simply create a dummy response from the dropped request (I used TextResponse for this). With this response you can also define the callback.
    def on_request_dropped(self, request, spider):
        """ handle dropped request (duplicates) """
        request.callback = self.parse_redirection_from_dropped_request
        response = TextResponse(url=request.url, request=request)
        self.crawler.engine.scraper.enqueue_scrape(request=request,
                                                   response=response, spider=self)
Process the redirection history of the dropped request within the callback you defined. From here you can handle things exactly like within the regular parse callback.
    def parse_redirection_from_dropped_request(self, response):
        ...
        yield item
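Inside that callback, the redirection history should be available from the request's meta: Scrapy's redirect middleware records the URLs a request was redirected from under the redirect_urls key. A minimal sketch of reading it, written against a plain dict so that it runs without Scrapy; redirection_chain is a hypothetical helper name.

```python
def redirection_chain(meta, final_url):
    # `meta` is expected to look like response.meta / request.meta in Scrapy.
    # The `redirect_urls` key is absent when the request was never redirected.
    return list(meta.get("redirect_urls", [])) + [final_url]


# Example mirroring the question: request 2 went A --301--> B, and B was dropped.
chain = redirection_chain({"redirect_urls": ["http://example.com/A"]},
                          "http://example.com/B")
print(chain)  # ['http://example.com/A', 'http://example.com/B']
```

In the callback this would be called as redirection_chain(response.meta, response.url).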
I hope this might help you if you stumble upon the same problem.

How to make internal synchronous post request in Play framework and scala?

I'm new to Play and Scala, and I'm trying to build an application with them. I need to make a POST call internally to get data from my server, and this call should be synchronous. After getting the data from this POST request, I need to send it to the front end. I've looked at many resources, but all of them are asynchronous. Please help me.
I'm fetching data from DB and then should return the data as response.
DB is at remote server not in the hosted server.
I think you should not block anyway.
    def action = Action.async {
      WS.url("some url")
        .post(Json.toJson(Map("query" -> query)))
        .map { response =>
          val jsonResponse = response.json
          // in this place you have your response from your call
          // now just do whatever you need to do with it,
          // in this example I will return it as `Ok` result
          Ok(jsonResponse)
        }
    }
Just map the result of your call and modify it while staying in the context of the Future, and use Action.async, which takes a Future.
If you really want to block, use Await.result(future, 5 seconds), importing
    import scala.concurrent.duration._
    import scala.concurrent.Await
See docs for Await here
All requests are asynchronous, but nothing prevents you from waiting for the response with await in your code.
    val response = await(yourFutureRequest).body
The line written above will block until the future has finished.

Restangular - how to cancel/implement my own request

I found a few examples of using fullRequestInterceptor and httpConfig.timeout to allow canceling requests in restangular.
example 1 | example 2
this is how I'm adding the interceptor:
    app.run(function (Restangular, $q) {
        Restangular.addFullRequestInterceptor(function (element, operation, what, url, headers, params, httpConfig) {
I managed to abort the request by putting a resolved promise in timeout (results in an error being logged and the request goes out but is canceled), which is not what I want.
What I'm trying to do - I want to make the AJAX request myself with my own requests and pass the result back to whatever component that used Restangular. Is this possible?
I've been looking for a Restangular way to solve it, but I should have been looking for an Angular way :)
Overriding dependency at runtime in AngularJS
Looks like you can extend $http before it ever gets to Restangular. I haven't tried it yet, but it looks like it would fit my needs 100%.
I'm using requestInterceptor a lot, but only to change parameters and headers of my request.
Basically, addFullRequestInterceptor lets you change your request before it is sent. So why not change the URL you want to call?
There is the httpConfig object that you can modify and return, and if it is close to the config of $http (and I bet it is), you can change the URL and even the method, and so turn the original request into an entirely new one.
After that you don't need timeout; you only return an httpConfig customised to your needs.
    RestangularConfigurer.addFullRequestInterceptor(function (element, operation, route, url, headers, params, httpConfig) {
        httpConfig.url = "http://google.com";
        httpConfig.method = "GET";
        httpConfig.params = "";
        return {
            httpConfig: httpConfig
        };
    });
The modified config is passed on, and your service or controller won't know that anything changed. That is the principle of an interceptor: it lets you change things and hand them to the next step, a bit like middleware. The change is transparent to the caller, but the call is made to whatever you want.

How to use dispatch.json in lift project

I am confused about how to combine the JSON library in Dispatch with Lift to parse my JSON response.
I am a Scala newbie.
I have written this code :
    val status = {
      val httpPackage = http(Status(screenName).timeline)
      val json1 = httpPackage
      json1
    }
Now I am stuck on how to parse the Twitter JSON response.
I've tried to use the JsonParser:
    val status1 = JsonParser.parse(status)
but got this error:
    <console>:38: error: overloaded method value parse with alternatives:
    (s: java.io.Reader)net.liftweb.json.JsonAST.JValue <and>
    (s: String)net.liftweb.json.JsonAST.JValue
    cannot be applied to (http.HttpPackage[List[dispatch.json.JsObject]])
    val status1 = JsonParser.parse(status1)
I am unsure what to do next in order to iterate through the data, extract it and render it to my web page.
Here's another way to use Dispatch HTTP with Lift-JSON. This example fetches a JSON document from Google, extracts all "title" values from it and prints them.
    import dispatch._
    import net.liftweb.json.JsonParser
    import net.liftweb.json.JsonAST._

    object App extends Application {
      val http = new Http
      val req = :/("www.google.com") / "base" / "feeds" / "snippets" <<? Map("bq" -> "scala", "alt" -> "json")
      val json = http(req >- JsonParser.parse)
      val titles = for {
        JField("title", title) <- json
        JField("$t", JString(name)) <- title
      } yield name
      titles.foreach(println)
    }
The error that you are getting back is letting you know that the type of status is neither a String nor a java.io.Reader. Instead, what you have is a List of already parsed JSON objects, as Dispatch has already done all of the hard work of parsing the response. Dispatch has a very compact syntax, which is nice once you are used to it but can be very obtuse initially, especially when you are first approaching Scala. Often you'll find that you have to dive into the source code of the library to see what is going on. For instance, if you look into the dispatch-twitter source code, you can see that the timeline method actually performs a JSON extraction on the response:
    def timeline = this ># (list ! obj)
What this method defines is a Dispatch Handler which converts the Response object into a JsonResponse object, and then parses the response into a list of JSON objects. That's quite a bit going on in one line. You can see the definition of the operator ># in the JsHttp.scala file in the http+json Dispatch module. Dispatch defines lots of Handlers that do a conversion behind the scenes into different types of data, which you can then pass to a block to work with. Check out the StdOut Walkthrough and the Common Tasks pages for some of the handlers, but you'll need to dive into the various modules' source code or Scaladoc to see what else is there.
All of this is a long way to get to what you want, which I believe is essentially this:
    val statuses = http(Status(screenName).timeline)
    statuses.map(Status.text).foreach(println _)
Only instead of doing a println, you can push it out to your web page in whatever way you want. Check out the Status object for some of the various pre-built extractors to pull information out of the status response.