Searching key/value pairs in CloudWatch Logs Insights for WAF logs - aws-cloudwatch-log-insights

So, the AWS CloudFront WAF logs get sent to CloudWatch Logs Insights. How can I search the key/value pairs of the httpRequest headers array when their position in the array is random?
Example log looks like this:
httpRequest.headers.0.name host
httpRequest.headers.0.value www.somedomain.com
httpRequest.headers.1.name cache-control
httpRequest.headers.1.value no-cache
httpRequest.headers.2.name pragma
httpRequest.headers.2.value no-cache
httpRequest.headers.3.name accept
httpRequest.headers.3.value */*
httpRequest.headers.4.name accept-encoding
httpRequest.headers.4.value gzip, deflate
httpRequest.headers.5.name from
httpRequest.headers.5.value bingbot(at)microsoft.com
httpRequest.headers.6.name user-agent
httpRequest.headers.6.value Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
So it's a JSON array of hashes, each with two keys (name and value). The order within the array is random: sometimes user-agent will be at index 1, or 3, or X. How can I search the "value" field of the entry whose "name" field is "user-agent"? I.e., I want to search for "bingbot" but have the match be specific to the user-agent header. I know I can just filter on @message for bingbot, but that seems expensive and not specific / prone to false hits.

Okay, so I think the "easiest" way is to treat @message as a string and write your own parse rule: pull the value you want into your own column via a regex, and then you can search / do whatever on that.
If anyone has a better idea I'm all ears.
fields @timestamp, @message
| parse @message /(?i)"name":"user-agent","value":"(?<httpRequestUserAgent>[^"]+)/
| filter action == "BLOCK"
| stats count() as httpRequestUserAgentCount by httpRequestUserAgent
| sort httpRequestUserAgentCount desc
The (?i) marks it as case insensitive.
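For checking the pattern outside of Insights, here is a minimal Python sketch. It assumes the raw message is compact JSON (no spaces after colons or commas), which is also what the parse rule above relies on. The function names and the sample record are made up for illustration; the second function shows the alternative of decoding the JSON and looking the header up by name, which does not depend on key order or spacing.

```python
import json
import re
from typing import Optional

# Same pattern as the Insights parse rule, using Python's named-group syntax.
USER_AGENT_RE = re.compile(
    r'(?i)"name":"user-agent","value":"(?P<httpRequestUserAgent>[^"]+)'
)


def user_agent_from_regex(message: str) -> Optional[str]:
    """Extract the user-agent value by treating the message as a plain string."""
    match = USER_AGENT_RE.search(message)
    return match.group("httpRequestUserAgent") if match else None


def user_agent_from_json(message: str) -> Optional[str]:
    """Extract the user-agent value by decoding the JSON and scanning the headers."""
    record = json.loads(message)
    for header in record.get("httpRequest", {}).get("headers", []):
        if header.get("name", "").lower() == "user-agent":
            return header.get("value")
    return None


# Hypothetical sample record, shaped like the log in the question.
sample = json.dumps(
    {
        "action": "BLOCK",
        "httpRequest": {
            "headers": [
                {"name": "host", "value": "www.somedomain.com"},
                {"name": "accept", "value": "*/*"},
                {
                    "name": "User-Agent",
                    "value": "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
                },
            ]
        },
    },
    separators=(",", ":"),  # compact output, no spaces
)

print(user_agent_from_regex(sample))
print(user_agent_from_json(sample))
```

Both functions return the same string for this sample; a message with no user-agent header gives None from either one.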

Related

How to solve a problem with an individual hash?

I'm trying to automate querying phone numbers to verify the operator... there is an iOS application that does this, but it generates an individual hash for each number. I couldn't reverse the MD5 hash to understand the logic...
If you make the request from your local machine with the variables below, it works:
Number: 69984840187
Hash: 49cf2461cb595848c3c2e23e52cadda9
https://api.ftapps.com/operadora/consulta.php
POST /operadora/consulta.php HTTP/1.1
Host: api.ftapps.com
Content-Type: application/x-www-form-urlencoded
Connection: keep-alive
Accept: */*
User-Agent: Operator/20220922001 CFNetwork/1399 Darwin/22.1.0
Accept-Language: en-BR,pt;q=0.9
Content-Length: 57
Accept-Encoding: gzip, deflate, br
Post form: hash=49cf2461cb595848c3c2e23e52cadda9&numeros=69984840187
But if you change the Number, this hash will not work and it will give an error in the request... Does anyone have a solution for this problem? How do I get the hash for the number I want to query?
App: https://apps.apple.com/br/app/operadora-descobrir/id854075522

REST API - what to return when query for a GET does not find a result

I'm calling a backend REST endpoint that takes in a query param and searches for a matching result /people?name=joe, and I'm wondering what status code and return data I should be returning when no object is found in the DB matching name=joe.
Things I've considered:
If I was directly hitting an endpoint /people/joe and it was not found, then I would definitely return 404.
If I was hitting an endpoint that returned a list of results for a query like if /people?name=joe was supposed to return ALL people named joe, then I would just return 200 with an empty list as the body. But in my case, I can only have one object for each name, so I'm not returning a list, so this doesn't apply here.
So this is a different case where I'm hitting an endpoint and passing in a query param to "search" for some data. And it is expected that in many cases, the data won't exist yet.
This seems pretty similar to the first case above, but I don't like returning a 404 here since this is not necessarily an error.
Should I return a 200 but with an empty object {} as the body, and then my frontend should check if body == {} then take that to mean no data found?
Or should I still return a 404 here? Again, this is not really an error in my case which is why I don't want to use a 404, but if that makes most sense, then I could.
Easy parts first - status codes are metadata of the transfer-of-documents-over-a-network domain (Webber, 2011). In the context of a GET request (which asks for the current selected representation of a resource), a 200 response indicates that the response content is a representation of the resource (as opposed, for example, to being a representation of an error).
Furthermore, URIs are opaque: general purpose HTTP components do not make assumptions about the semantics of resources based on the spelling of their identifiers. In other words, the "rules" are exactly the same for all of
/people/joe
/people?joe
/people?name=joe
...
So at the HTTP level, the answers to your question are easy: if there's a current representation, then you reply to GET requests with a 200 status and copy the current representation into the response content.
The hard parts are deciding when there is a current representation, and what it looks like. REST and HTTP don't have anything to say about that, really. It's a resource design concern.
For example, this interaction "follows all the rules":
GET /people?name=dave HTTP/1.1
HTTP/1.1 200 OK
Content-Location: /people?name=dave
Content-Type: text/plain
Dave's not here, man
HTTP is a general purpose mechanism for asking for documents/transmitting documents over a network, but it is agnostic about what documents look like and what keys we use to identify documents in the store.
If you are dealing with representations that describe zero or exactly one thing, it can still be reasonable to use a list which is either empty or contains exactly one element (if you are familiar with Option/Optional/Maybe: same idea, we're presenting something with the semantics of an iterable collection):
HTTP/1.1 200 OK
Content-Location: /people?name=dave
Content-Type: application/json
[]
HTTP/1.1 200 OK
Content-Location: /people?name=bob
Content-Type: application/json
[{
...
}]
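The zero-or-one list idea can be sketched on the server side as follows. This is only an illustration in Python: the PEOPLE dict stands in for the database, and get_people is a hypothetical handler that returns a (status, headers, body) tuple rather than using any particular web framework.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical in-memory store standing in for the database.
PEOPLE: Dict[str, dict] = {"bob": {"name": "bob", "age": 42}}


def get_people(name: str) -> Tuple[int, Dict[str, str], str]:
    """Handle GET /people?name=<name>; always 200, body is a list of 0 or 1 items."""
    person = PEOPLE.get(name)
    items: List[dict] = [person] if person is not None else []
    headers = {
        "Content-Location": f"/people?name={name}",
        "Content-Type": "application/json",
    }
    return 200, headers, json.dumps(items)


for who in ("dave", "bob"):
    status, headers, body = get_people(who)
    print(status, headers["Content-Location"], body)
```

The client then iterates over the list (or checks its length) instead of comparing the body against a special empty value.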
I agree that 200 with an empty collection is better than 404 in your scenario. I don't like the idea of looking for {}; it is not explicit enough.
Possible ways of doing this:
200 ok
{
items:[]
}
200 ok
{
size:0//,
//items:undefined
}
200 ok
[]
206 Partial Content
Accept-Ranges: items
Content-Range: items 0-0/0
// []

CA LISA unable to create VS from Req/Rsp pairs

I have been trying to create a REST/Json Virtual service on CA LISA 7.5 (we can’t update), using request response pairs. The request response looks like the following:
Name-req
GET /cods_party_web/party/111700 HTTP/1.1
Pragma: no-cache
Cache-Control: no-cache
x-abc-outlet-id: 017879
x-abc-user-id: CTM
x-abc-consent-level: 2
x-abc-application-id: 00028
x-abc-outlet-id-type: OU_ID
x-abc-user-id-type: 1
x-IBM-Client-Id: XXX....
x-IBM-Client-Secret: XXX...
Name-rsp
HTTP/1.1 200 {"party":{"partyId":111700,"foreNames":["Julie","Pamela",""],"lastName":"Duncan","initials":["J"],"...lots of content......."type":"EMAIL"}],"associatedOU":null}
When I try to build the virtual service image, no matter what options I select, my VS image response is either in hex as shown below or it is blank.
I remember having this problem a year ago, and was able to get the response to look like below, but I can’t remember how I did it.
Success response
Not found response.
Many thanks in advance
It's not returning hex - those are just column numbers for an empty binary response. I think the problem is your response document is not properly formed HTTP - there's no reason phrase in the status line, and you need two line feeds after the status line. Try this:
HTTP/1.1 200 OK

{"party":{"partyId":111700,"foreNames":["Julie","Pamela",""],"lastName":"Duncan","initials":["J"],"...lots of content......."type":"EMAIL"}],"associatedOU":null}
I understand that you can't upgrade, so this doesn't really help you, but LISA 9.5 doesn't have this issue -- the response looks like it's supposed to.
On the other hand, CA has released a free, simpler version of LISA that also successfully generates a VS from your example. Check it out here:
http://educationcontent.ca.com/A01/index.html

How should REST API accept boolean values?

Resource 123 has a current configuration state and a default configuration state, and both of these configuration states can be represented by JSON.
A GET request to http://example.com/123/config will return the current configuration state, and a GET request to http://example.com/123/config?reset=true will return the default configuration state.
How should an API interpret boolean values? For instance:
http://example.com/123/config?reset=true
http://example.com/123/config?reset=blablabla
http://example.com/123/config?reset=false
http://example.com/123/config?reset=1
http://example.com/123/config?reset=0
http://example.com/123/config?reset=
http://example.com/123/config?reset
True and false
The true and false literals are just fine to represent boolean values. They are quite descriptive and, if your API supports JSON, true and false are definitively the obvious choices.
Enumerations
In a few situations, however, you may want to avoid boolean values because they cannot be expanded. You may want to consider enumerations instead.
It may be a poor comparison but it might help you to get the main idea of this approach: have a look at CSS properties such as overflow or visibility. They allow expandable values instead of only true or false. So new values can be easily added without changing the property names.
So, for the situation described in your question, to retrieve the default state of a resource, I would support a query parameter such as status, that could have values such as default and current.
The following would return the default state of the resource:
GET /config?status=default HTTP/1.1
Host: example.com
Accept: application/json
And the following would return the current state of the resource:
GET /config?status=current HTTP/1.1
Host: example.com
Accept: application/json
If no query parameter is provided, you could assume that the client wants the current state of the resource.
If you need to restore the resource state to its default state, consider using PUT, sending the new representation of the resource in the request payload. Something like:
PUT /config/status HTTP/1.1
Host: example.com
Content-Type: application/json
{
"value": "default"
}
Whichever way you want it to; it's completely up to you as the architect/designer. true/false is the most syntactically correct version, so make sure that one works and add the other options as sugar if you want.
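One possible way to implement "true/false first, the rest as sugar" is sketched below in Python. The policy itself is an assumption, not a standard: true/false are accepted case-insensitively, 1/0 are accepted as sugar, a missing parameter falls back to a default, and anything else (including reset= and a bare reset) raises an error that the caller could map to a 400 response. The function name parse_bool_param is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

TRUE_VALUES = {"true", "1"}    # "1" is optional sugar
FALSE_VALUES = {"false", "0"}  # "0" is optional sugar


def parse_bool_param(url: str, name: str, default: bool = False) -> bool:
    """Read a boolean query parameter; raise ValueError on unrecognised values."""
    # keep_blank_values=True makes "?reset=" and "?reset" show up as empty strings
    # instead of being silently dropped.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if name not in query:
        return default
    raw = query[name][-1].strip().lower()  # last occurrence wins
    if raw in TRUE_VALUES:
        return True
    if raw in FALSE_VALUES:
        return False
    raise ValueError(f"invalid boolean value for {name!r}: {raw!r}")


for url in (
    "http://example.com/123/config?reset=true",
    "http://example.com/123/config?reset=0",
    "http://example.com/123/config",
    "http://example.com/123/config?reset=blablabla",
):
    try:
        print(url, "->", parse_bool_param(url, "reset"))
    except ValueError as exc:
        print(url, "-> rejected:", exc)
```

Rejecting unknown values rather than treating them as false keeps typos such as reset=ture from being silently ignored.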

HTTP Accept-Encoding and sending unencoded data

I'm building a module for compressing HTTP output. Reading the spec, I haven't found a clear distinction on a couple of things:
Accept-Encoding:
Should this be treated the same as an Accept-Encoding: * or as if no header is present?
Or what if I don't support gzip, but I get a header like this:
Accept-Encoding: gzip
Should I return a 406 error or just return the data unencoded?
EDIT:
I've read over the spec a few times. It mentions my first case, but it doesn't define what the behavior of the server should be.
Should I treat this case as if the header is not present? Or should I return a 406 error because there's no way to encode something given the field value ('' isn't a valid encoding).
Everything is written in the spec (RFC 2616), section 14.3 Accept-Encoding:
The special "*" symbol in an Accept-Encoding field matches any
available content-coding not explicitly listed in the header
field.
If an Accept-Encoding field is present in a request, and if the server cannot send a response which is acceptable according to the Accept-Encoding header, then the server SHOULD send an error response with the 406 (Not Acceptable) status code.
edit:
If the Accept-Encoding field-value is empty, then only the "identity"
encoding is acceptable.
In this case, if "identity" is one of the available content-codings, then the server SHOULD use the "identity" content-coding, unless it has additional information that a different content-coding is meaningful to the client.
What is "identity"
identity
The default (identity) encoding; the use of no transformation whatsoever. This content-coding is used only in the Accept-Encoding header, and SHOULD NOT be used in the Content-Encoding header.
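Putting the quoted rules together, a server-side decision could be sketched like this in Python. The function names are hypothetical and the q-value parsing is deliberately simplified. The sketch also relies on one more rule from the same section 14.3: "identity" is always acceptable unless it is explicitly refused with identity;q=0, or with *;q=0 and no explicit identity entry. Under that rule, a request with Accept-Encoding: gzip to a server without gzip support gets the data unencoded rather than a 406. As a design choice, the sketch prefers any acceptable compressed coding the server supports and only falls back to identity.

```python
from typing import Dict, Optional, Sequence


def parse_accept_encoding(header: str) -> Dict[str, float]:
    """Parse an Accept-Encoding value into {coding: q}; simplified parsing."""
    prefs: Dict[str, float] = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        q = 1.0
        params = params.strip()
        if params.lower().startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[coding.strip().lower()] = q
    return prefs


def choose_encoding(header: Optional[str], supported: Sequence[str]) -> Optional[str]:
    """Return the coding to use, "identity" for unencoded, or None for a 406."""
    if header is None:
        # No header at all: the spec says the server SHOULD use identity.
        return "identity"
    prefs = parse_accept_encoding(header)  # an empty header yields {}
    best, best_q = None, 0.0
    for coding in supported:
        # Codings not listed explicitly fall under "*" if it is present.
        q = prefs.get(coding, prefs.get("*", 0.0))
        if q > best_q:
            best, best_q = coding, q
    if best is not None:
        return best
    # identity is acceptable unless refused via identity;q=0 or *;q=0.
    identity_q = prefs.get("identity", prefs.get("*", 1.0))
    return "identity" if identity_q > 0 else None


print(choose_encoding("", ["gzip"]))
print(choose_encoding("gzip", []))
print(choose_encoding("gzip, identity;q=0", ["deflate"]))
```

The three calls cover the cases from the question: an empty header, a gzip-only request to a server without gzip, and a request that explicitly refuses identity (the only case here that ends in a 406).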