How to convert a Postman request into a NiFi request? - rest

I don't mind if you use an example from another API that is not Adobe Analytics'. I just need to know the pattern that I have to follow in order to succesfully convert a Postman request into a NiFi request.
After successfully creating requests to pull reports from Adobe Analytics via Postman, I´m having difficulties to migrate these Postman requests to NiFi. I haven´t been able to find concrete use cases that explicity explain how to do this kind of task step-by-step.
I'm trying to build a backend on top of NiFi to handle multiple data extracts from Adobe Analytics in an efficient and robust way. That is instead of having to create all required scripts by myself. Yet, there is more documentation about REST APIs and Postman cases than there is about REST APIs and NiFi cases.
In the screenshot below we can see how the Postman request looks like. It takes 3 headers and 1 temporary header that includes the authorization value (Bearer token). This temporary header is generated automatically after filling in the OAuth 2.0 authorization form in the Authorization tab, as shown here.
Then, we have the body of the request. This json text is generated automatically by debugging Adobe Analytics' workspaces as shown here.
I'd like to know the following in a step-by-step manner with screenshots if possible:
Which processor(s) should I use in NiFi to obtain a similar response as the one I got in Postman?
Which properties should I add/remove from the processor to make this work?
How should I name these properties?
Is there a default property whose value/name I should modify?
As you can see, the question mainly refers to properties setup in NiFi, as well as Processor selection. I already tried to configure some processors but I don't seem to get the correct properties setup, or maybe I'm selecting the wrong processors.
I'm using NiFi v1.6.0 and Postman v7.8.0
This is most likely an easy task for users already familiar with NiFi and API requests, but it has proven challenging to me. Hopefully this will help other users looking to build more robust pipelines by using NiFi instead of doing it manually.
Thanks.

It only takes 3 NiFi processors to replicate a REST API request that works in Postman. In this solution we use a request that contains a nested JSON request. The advantage of this simple approach is that it reduces the amount of configuration required to obtain a successful response from the API. That is, even if you are using a complex JSON request. In this case the body of the JSON request is passed through the GenerateFlowFile processor, without the need of any other processor to parse/format the request.
Step #1. Create a processor called GenerateFlowFile. The only property that you will have to modify is the Custom Text. Paste in there your whole JSON request just as it was in Postman. In this case I'm using the very same JSON shown in the question above. It's a good idea to setup Yield Duration to 10 seconds or more.
Step #2. Create a processor called InvokeHTTP. Then modify the 6 properties shown in the screenshots below. Use the same Authorization details you've used in Postman. Make sure to copy the Bearer token from Postman after it has been tested. Also, don't forget to setup the HTTP Method, Remote URL and Content-Type as well.
Step #3. Finally, add a couple of LogAttribute processors to store the output of InvokeHTTP. One of these LogAttribute processors should store successful responses. The other one can be used for Failure, Original, Retry and No-Retry. Or you can create LogAttribute for each of these outputs.
Step #4. Now, connect the processors and Start your data flow! You should start seeing data populate the Successful LogAttribute. Then you can use the Data Provenance option to review the incoming data and confirm that this is exactly the same result you previously obtained from Postman.
Note: This is a simple, straightforward, "for starters" solution to replicate a Postman API request using a nested static JSON. There are more solutions in StackOverflow that tackle more complex cases, like dynamic JSON. Here's a list of some other posts:
nifi invokehttp post complex json
In NiFi processor 'InvokeHTTP' where do you write body of POST request?
Configuring HTTP POST request from Nifi

Related

How to retrieve user information after login with RESTful services

What should be the standard approach for getting user information after login ?
POST request to validate user/password and retrieve information on response
POST request to validate user/password followed by GET request to retrieve information?
As far as I understand, GET should be the preferred one to retrieve data, but it seems burdensome to performe two requests; at the same time, it feels weird to get data back on POST response. Which should be preferred?
My 2 cents:) if you really want to follow REST Paradigm then you should use standard http method as GET. Although an overloaded POST might do the job however it’s not following the standard.
In SOAP world everything is POST and you can do a lot of funky stuff however in REST world there is a standard on what method used for what purpose ideally.

Sending passwords over HTTPS: GET vs POST

I'm creating a headless API that's going to drive an Angular front end. I'm having a bit of trouble figuring out how I should handle user authentication though.
Obviously the API should run over SSL, but the question that's coming up is how should I send the request that contains the user's password: over GET or POST. It's a RESTFUL API, so what I'm doing is retrieving information meaning it should get a GET request. But sending the password over get means it's part of the URI, right? I know even a GET request is encrypted over HTTPS, but is that still the correct way? Or is this a case to break from RESTFUL and have the data in the body or something (can a GET request have data in the body?).
If you pass the credentials in a request header, you will be fine with either a GET or POST request. You have the option of using the established Authorization header with your choice of authentication scheme, or you can create custom headers that are specific to your API.
When using header fields as a means of communicating credentials, you do not need to fear the credentials being written to the access log as headers are not included in that log. Using header fields also conforms to REST standards, and should actually be utilized to communicate any meta-data relevant to the resource request/response. Such meta-data can include, but is not limited to, information like: collection size, pagination details, or locations of related resources.
In summary, always use header fields as a means of authentication/authorization.
mostly GET request will bind data in URL itself... so it is more redable than POST..
so if it is GET, there is a possibility to alive HISTORY LOG
Using ?user=myUsername&pass=MyPasswort is exactly like using a GET based form and, while the Referer issue can be contained, the problems regarding logs and history remain.
Sending any kind of sensitive data over GET is dangerous, even if it is HTTPS. These data might end up in log files at the server and will be included in the Referer header in links to or includes from other sides. They will also be saved in the history of the browser so an attacker might try to guess and verify the original contents of the link with an attack against the history.
You could send a data body with a get request too but this isn't supported by all libraries I guess.
Better to use POST or request headers. Look at other APIs and how they are handling it.
But you could still use GET with basic authentication like here: http://restcookbook.com/Basics/loggingin/

Pushing MailGun webhook onto AWS SQS

I need to process MailGun webhooks. I did implement a solution directly on our web servers to process the webhooks, but MailGun generates so many calls from a large campaign that it effectively becomes a DOS attack.
One solution I've been looking at is using AWS API Gateway to a Lambda function to then push onto an SQS queue. We can then poll the queue at a rate we can manage. Unfortunately we can't get this to work as AWS API Gateway does not support multipart/form-data content types (which some of the webhooks are). This means that our SQS messages are not well formatted / structured. The best we can do is use the $util.escapeJavaScript($input.body) function in the mapping template to create an SQS message that contains the raw string of the webhook content (with escaped javascript chars) that is effectively unparsable i.e. we can't get data out of it.
I've had a go at using Zapier to process the webhook and push directly on the SQS queue. This can parse the various content types effectively and create a nicely structured message for us, but the cost of the service is not viable.
Has anybody managed this problem in another way? Are there solutions to API Gateway not parsing the content properly? I've deliberately stayed away from MailGuns event polling API as it involves significant delays before the polled data can be 'trusted' (according to MailGun).
Basically, is there another way of getting a nicely parsed message from content types multipart/form-data and application/x-www-form-urlencoded onto the queue?
Any ideas would be much appreciated!
To add, this link higlights issues with APS Gateway and multipart\form-data content:
API Gateway - Post multipart\form-data
As you've mentioned you can base64 encode in api gateway and call base64decode in the lambda function to retrieve the original payload (There are standard libraries in every language).
Also, note you can that you can use multipart form data for non file bodies.
Get non file body from multipart/form-data using AWS API Gateway and Lambda
I had the same challenge when building Suet. I ended up switching to Google Cloud functions which I really recommend. Don't waste time on Amazon API Gateway. Use Google Cloud Functions and use a middleware like multer. (You can see the source of Suet's webhook handler here).
Not sure if you ever came to a solution, but I have this working with the following settings.
Setup your API Gateway method to use "Use Lambda Proxy integration"
In your lambda (I use node.js) use busboy to work through the multi-part submission from the mailgun webhook. (use this post for help with busboy Busboy help)
Make sure that any code you are going to execute after all busboy is complete is executed in the 'finish' portion of the busboy code.

Send batched data to Google Calendar with Scala/Spray

We are successfully sending data for new, changed, and removed events to Google Calendar from a Scala app using Spray HTTP. However, we are currently sending one event per request, and this becomes very inefficient when there are multiple events for the current user. In these cases we would like to send batched data, as described here:
https://developers.google.com/google-apps/calendar/batch
The documentation begins with:
A batch request is a single standard HTTP request containing multiple
Google Calendar API calls, using the multipart/mixed content type.
Within that main HTTP request, each of the parts contains a nested
HTTP request.
Since we are already using spray http we would like to use its support for multipart/mixed requests (spray.http.MultipartContent) but it isn't clear that this is possible since the parts must consist of one or more spray.http.BodyPart instances and there doesn't seem to be a way to turn a spray.http.HttpRequest into a BodyPart.
Has anyone successfully done this? We are also taking a look at the Google API Client for Java but would rather not go down that path if there is a more Scala-friendly way to do it.

specify an action to run via REST API

Just wondering what the best practice is for specifying an endpoint in rest to say "runSomeAction"? I'm aware of the uses for GET,POST,PUT,DELETE operations and using nouns to specify those endpoints, but what is the preferred method for exposing server functionality that isn't a CRUD type operation?
EDIT:
The result of the action will just kick off a process on the server and return a status of 200 immediately (before process is complete), no body. This process is specifically running some validation rules against saved items in a database.
What is the end result of the action? Typically, you do a PUT/POST to create the resulting resource. For instance, instead of POST /sendEmail, you'd do POST /email-notifications.
EDIT
In your case, I would consider your resource to be the results of the validation. I would suggest either POST /validations or POST /validations/{whateverTypeIsBeingValidated}. You could alternately go with validation-results. Even if you don't support the client viewing the validation results now, you have the option to do so later.
Also, per #MartinBroadhurst, a REST API may not be an ideal tool here.