How can REST API pass large JSON? - rest

I am building a REST API and have run into this issue: how can a REST API return a very large JSON payload?
Basically, I want to connect to a database and return the training data. The problem is that the database holds 400,000 records. If I wrap them all in one JSON document and return it from a GET method, the server throws a heap overflow exception.
What methods can we use to solve this problem?
DBTraining trainingdata = new DBTraining();
@GET
@Produces("application/json")
@Path("/{cat_id}")
public Response getAllDataById(@PathParam("cat_id") String cat_id) {
    List<TrainingData> list = new ArrayList<TrainingData>();
    try {
        list = trainingdata.getAllDataById(cat_id);
        Gson gson = new Gson();
        Type dataListType = new TypeToken<List<TrainingData>>() {
        }.getType();
        String jsonString = gson.toJson(list, dataListType);
        return Response.ok().entity(jsonString).header("Access-Control-Allow-Origin", "*").header("Access-Control-Allow-Methods", "GET").build();
    } catch (SQLException e) {
        logger.warn(e.getMessage());
    }
    return null;
}

The RESTful way of doing this is to create a paginated API. First, add query parameters for the page number and the page size, and enforce a maximum number of items per page. Use sensible defaults if any of these are missing or unrealistic. Second, modify the database query to retrieve only that subset of the data, convert it to JSON, and use it as the payload of your response. Finally, following HATEOAS principles, provide links to the next page (unless you're on the last page) and the previous page (unless you're on the first page). For bonus points, provide links to the first and last pages as well.
By designing your endpoint this way, you get very consistent performance characteristics and can handle data sets that continue to grow.
The GitHub API provides a good example of this.
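As a rough sketch of the paging arithmetic and link building described in this answer, in plain Java: the class name PageLinks, the page and size query parameter names, and the default and maximum sizes are all assumptions chosen for illustration, not part of any framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: clamps paging parameters and builds HATEOAS-style page links. */
final class PageLinks {
    static final int DEFAULT_SIZE = 100; // assumed default page size
    static final int MAX_SIZE = 1000;    // assumed upper bound per page

    /** Falls back to the default for missing or non-positive sizes and caps large ones. */
    static int clampSize(Integer requested) {
        if (requested == null || requested < 1) {
            return DEFAULT_SIZE;
        }
        return Math.min(requested, MAX_SIZE);
    }

    /** Number of the last page (1-based); an empty data set still has one (empty) page. */
    static int lastPage(long totalItems, int size) {
        return (int) Math.max(1, (totalItems + size - 1) / size);
    }

    /** Row offset to hand to the database query, e.g. LIMIT size OFFSET offset. */
    static long offset(int page, int size) {
        return (long) (page - 1) * size;
    }

    /** Builds first/prev/next/last links, omitting prev on the first page and next on the last. */
    static Map<String, String> links(String baseUrl, int page, int size, long totalItems) {
        int last = lastPage(totalItems, size);
        int current = Math.min(Math.max(page, 1), last);
        Map<String, String> links = new LinkedHashMap<>();
        links.put("first", url(baseUrl, 1, size));
        if (current > 1) {
            links.put("prev", url(baseUrl, current - 1, size));
        }
        if (current < last) {
            links.put("next", url(baseUrl, current + 1, size));
        }
        links.put("last", url(baseUrl, last, size));
        return links;
    }

    private static String url(String baseUrl, int page, int size) {
        return baseUrl + "?page=" + page + "&size=" + size;
    }
}
```

With 400,000 rows and a page size of 100, page 2 gets all four links, and PageLinks.offset(2, 100) gives the row offset 100 for the database query.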

My suggestion is not to pass the data as a single JSON document but as a file, using multipart/form-data. Each line of the file could be a JSON object representing one data record. It is then easy to receive the file with a FileOutputStream and process it line by line, which avoids memory problems.
A Grails example:
if (params.myFile) {
    if (params.myFile instanceof org.springframework.web.multipart.commons.CommonsMultipartFile) {
        def fileName = "/tmp/myReceivedFile.txt"
        new FileOutputStream(fileName).leftShift(params.myFile.getInputStream())
    } else {
        // print or signal error
    }
}
You can use curl to send your file:
curl -F "myFile=@/mySendigFile.txt" http://acme.com/my-service
More details on a similar solution on https://stackoverflow.com/a/13076550/2476435
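The line-by-line processing step could look roughly like this in Java. This is only a sketch: the class name JsonLines and the handler callback are made up for illustration, and parsing each line with a JSON library is left to the handler.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

/** Hypothetical helper: feeds each non-blank line (one JSON record) to a handler. */
final class JsonLines {
    /** Returns the number of records handled; only one line is held in memory at a time. */
    static long forEachRecord(Reader source, Consumer<String> handler) {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // tolerate blank lines between records
                }
                handler.accept(line); // e.g. parse the record and store it
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```

For the received file it might be called as JsonLines.forEachRecord(new FileReader("/tmp/myReceivedFile.txt"), line -> process(line)), where process is whatever the application does with a single record.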

HTTP has the notion of chunked encoding, which allows you to send an HTTP response body in smaller pieces so that the server does not have to hold the entire response in memory. You need to find out how your server framework supports chunked encoding.
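On the application side, the matching idea is to write the JSON array to the response stream one record at a time instead of building one huge string first. A minimal sketch in plain Java, assuming each record has already been serialized to a JSON string; the class name JsonArrayStreamer and the flushEvery parameter are made up for this example. In JAX-RS, a method like this could be called from a StreamingOutput; whether the bytes actually go out with chunked transfer encoding is decided by the server or container, not by this code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

/** Hypothetical helper: writes a JSON array one element at a time. */
final class JsonArrayStreamer {
    /**
     * Each element of records must already be a serialized JSON value.
     * Flushing every few records lets the container send what has been written so far.
     */
    static void write(Iterator<String> records, OutputStream out, int flushEvery) {
        int every = Math.max(1, flushEvery); // guard against zero or negative values
        try {
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            writer.write('[');
            long written = 0;
            while (records.hasNext()) {
                if (written > 0) {
                    writer.write(',');
                }
                writer.write(records.next());
                written++;
                if (written % every == 0) {
                    writer.flush();
                }
            }
            writer.write(']');
            writer.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The iterator would ideally be backed by a database cursor, so that neither the rows nor the JSON text are ever fully in memory.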

Related

Should a RESTful Web API return the modified entity on an Update operation (PUT)?

I'm creating a new Web API and I'm having a doubt regarding the Update operation (it's a basic CRUD). Should I return a DTO with the updated entity data? I want my API to be RESTful.
Have a read of RFC 7231:
https://www.rfc-editor.org/rfc/rfc7231
It says, and I quote:
For a state-changing request like PUT (Section 4.3.4) or POST
(Section 4.3.3), it implies that the server's response contains the
new representation of that resource, thereby distinguishing it from
representations that might only report about the action (e.g., "It
worked!"). This allows authoring applications to update their
local copies without the need for a subsequent GET request.
However, you do not need to be too rigid about this. Returning a 201 when you create something, for example, is perfectly OK as well, and you probably want to include the unique identifier of the created resource.
For updates, a 200 is fine. A 204 is also acceptable, as already mentioned.
The bottom line is: return only the data you need. If you need to see the whole updated object, return it. If you don't, don't. Keep in mind that some objects can be quite big and have a whole object graph below them, and there's no point sending too much data down the wire.
I guess the most important thing is to choose one way of doing things and then use it consistently everywhere.
First of all, returning a DTO has nothing to do with being RESTful.
It's true that DTO is a pattern created with the purpose of transferring data to remote interfaces (and web services can be a good fit for this pattern).
However using DTOs won't make your application more or less RESTful. Your application can use DTOs to have more control over the data exposed in the REST API. Just that.
If your update operation relies on the PUT HTTP method (which is designed to replace the state of a resource with a new representation), you may want to return 200 or 204 status code to indicate that the operation has succeeded.
If you go for 200, you can return a representation of the new state of the recently updated resource. If you go for 204, the response must not contain a representation.
By representation I mean a JSON document, an XML document or any other content that can be used to represent the state of a given resource.
We normally return NoContentResult after a successful update. For example,
[HttpPut("{id}", Name = "UpdateUser")]
public IActionResult UpdateUser(Guid id, [FromBody] UserUpdateDto user)
{
    if (user == null)
    {
        return BadRequest();
    }
    if (!_repository.UserExists(id))
    {
        return NotFound();
    }
    var entity = _repository.GetUser(id);
    Mapper.Map(user, entity);
    _repository.UpdateUser(entity);
    return NoContent();
}
NoContent basically returns status code 204. The following is the source code of NoContentResult.
public class NoContentResult : StatusCodeResult
{
    public NoContentResult()
        : base(204)
    {
    }
}
Returning data from a PUT operation is optional, not required. If there is anything calculated in the model that would be useful to the client, return it. Otherwise return a 204.

SmartGWT not able to parse data in the DataSource.transformResponse() method

I need some help please...
I am working with a GWT enabled web application. I am using the gwt-2.3.0 SDK.
I have a class that extends the DataSource class and overrides the transformResponse method:
public class DeathRecordXmlDS extends DataSource {
    protected void transformResponse(DSResponse response, DSRequest request, Object data) {
        super.transformResponse(response, request, data);
    }
}
As I understand, the transformResponse() method should get control and at this point, I will have access to the data that is being provided to the Client side of my application. I am trying to work with the Object data parameter (the third parameter) that is passed in.
I am expecting an XML formatted string to be passed in. The XML will contain data (a count field) that I need to access and use.
I don't seem to be getting an XML string. Here's what I know...
I do see the XML data being passed to my webapp (the client). I can see this because I inspect the webpage that I am working with and I see the Response data. Here's an example of something that I expect to receive:
XML data from Query:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Collection numRecords="0">
<DeathRecords/>
</Collection>
The above XML is valid (I checked it in a validator). This is a case where no data (no death records) was returned to my application, so the numRecords XML attribute is set to "0". Of course, if records are returned, numRecords will contain the number of records and I'll get that same number of DeathRecord nodes.
I am not getting the above data (or, I don't know how to work with it) in the transformResponse() method.
Here's what I've done to try to figure this out...
The Object data parameter... it is a JavaScriptObject. I know this because I did a .getClass().getName() on it:
DeathRecordXmlDS::transformResponse() data.getClass().getName(): com.google.gwt.core.client.JavaScriptObject$
Then, to try to work with it, I converted it to a String:
com.google.gwt.core.client.JavaScriptObject dataJS = (com.google.gwt.core.client.JavaScriptObject)data;
System.out.println("DeathRecordXmlDS::transformResponse() data as a JavaScriptObject: "+dataJS.toString());
The contents of 'data' formatted as a String look like:
DeathRecordXmlDS::transformResponse() data as a JavaScriptObject: [XMLDoc <Collection>]
So, it looks like I have something that has to do with my 'Collection' node, but not a String of XML data that I can parse and get to my numRecords attribute.
What do I need to do to gain access to the XML in the transformResponse() method?
Thanks!
I think your data object is already translated to a JavaScript collection.
Maybe you could use the utility class XMLTools to retrieve your numRecords information:
Integer numRecords = Integer.parseInt(XMLTools.selectString(data, "Collection/@numRecords"));
After working on this for a while longer, I was able to read the XML data that I am working with. I used the following piece of code:
try {
    JsArray<JavaScriptObject> nodes = ((JavaScriptObject) XMLTools.selectNodes(data, "/Collection/@numRecords")).cast();
    for (int i = 0; i < nodes.length(); i++) {
        com.google.gwt.dom.client.Element element = (com.google.gwt.dom.client.Element) nodes.get(i);
        numRecords = element.getNodeValue();
    }
} catch (Exception e) {
    // If parsing fails, capture the exception
    System.out.println("DeathRecordXmlDS::transformResponse() Not able to parse the XML");
}
I think the first step to solving this was understanding that the parameter 'data' of type Object was really a JavaScriptObject. I learned this after looking at the .getClass() and .getName(). This helped me understand what I was working with:
System.out.println("DeathRecordXmlDS::transformResponse() data.getClass().getName(): "+data.getClass().getName());
Once I knew it was a JavaScriptObject, I was able to do a more focused Google search for what I was trying to accomplish. I was a little surprised that the XMLTools.selectNodes() function worked the way it did, but the end result is that I was able to read the numRecords attribute.
Thanks for the suggestion!
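For reference, the @ in the XPath expression is what selects an attribute rather than a child element. The same expression can be tried outside the browser with the JDK's javax.xml.xpath. This is a standalone JVM sketch for experimenting with the expression, not something meant to run in GWT client code, and the class name NumRecordsXPath is made up for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical helper: reads the numRecords attribute of the root Collection element. */
final class NumRecordsXPath {
    static int numRecords(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // "@numRecords" addresses the attribute on the Collection element
            String value = xpath.evaluate("/Collection/@numRecords",
                    new InputSource(new StringReader(xml)));
            return Integer.parseInt(value);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Not able to parse the XML", e);
        }
    }
}
```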

How to close InputStream which fed into Response(jax.rs)

@GET
@Path("/{id}/content")
@Produces({ "application/octet-stream" })
public Response getDocumentContentById(@PathParam("id") String docId) {
    InputStream is = getDocumentStream(); // some method which gives stream
    ResponseBuilder responseBuilder = Response.ok(is);
    responseBuilder.header("Content-Disposition", "attachment; filename=" + fileName);
    return responseBuilder.build();
}
How can I close the InputStream is here? Or does something (JAX-RS) close it automatically? Please give me some information. Thank you.
When you're wanting to stream a custom response, the most reliable way I've found is to return an object that contains the InputStream (or which can obtain the stream in some other way at some point), and to define a MessageBodyWriter provider that will do the actual streaming at the right time.
For example, this code is part of Apache Taverna, and it streams back the zipped contents of a directory. All that the main code needs to do to use it is to return a ZipStream as the response (which can be packaged in a Response or not) and to ensure that it is dealing with returning the application/zip content type. The final point to note is that since this is dealing with CXF, you need to manually register the provider; unlike with Glassfish, they are not automatically picked up. This is a good thing in sophisticated scenarios, but it does mean that you need to do the registration.
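Whichever provider ends up doing the streaming, the core of it is a copy loop that closes the source stream once the body has been written, even if writing fails. A minimal sketch in plain Java; the class name ClosingCopy is hypothetical and this is not the Taverna code mentioned above. In JAX-RS, one possible wiring (an assumption, not tested against any particular container) is to return a StreamingOutput whose write method calls this helper.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper: copies a stream to the response and always closes the source. */
final class ClosingCopy {
    /** Copies everything from in to out, then closes in even if the copy fails. Returns bytes copied. */
    static long copyAndClose(InputStream in, OutputStream out) {
        try (InputStream source = in) { // try-with-resources closes the source stream
            byte[] buffer = new byte[8192];
            long total = 0;
            int read;
            while ((read = source.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
            out.flush(); // the output stream belongs to the container, so it is not closed here
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```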

With Spring Data REST, how to make custom queries use the HATEOAS output format?

I'm learning the Spring 4 stuff by converting an existing Spring 3 project. In that project I have a custom query. That query fetches data in a straightforward way, after which some heavy editing is done to the query results. Now the data is sent to the caller.
I plan on extending CrudRepository for most of my simple query needs. The data will be output in HATEOAS format.
For this custom query I think I should be adding custom behavior (spring.io, "Working with Spring Data Repositories", Section 1.3.1, "Adding custom behavior to single repositories").
As an example:
@Transactional(readOnly = true)
public List<Offer> getFiltered(List<Org> orgs, OfferSearch criteria) {
    List<Offer> filteredOffers = getDateTypeFiltered(criteria);
    filteredOffers = applyOrgInfo(orgs, filteredOffers);
    filteredOffers = applyFilterMatches(filteredOffers, criteria);
    return sortByFilterMatches(filteredOffers);
}
(The code merely illustrates that I don't have a simple value fetch going on.)
If I could use the raw results of getDateTypeFiltered(criteria) then I could put that into a CrudRepository interface and the output would be massaged into HATEOAS by the Spring libraries. But I must do my massaging in an actual Java object, and I don't know how to tell Spring to take my output and emit it in my desired output format.
Is there an easy way to get there from here? Or must I try things like doing my filtering in the browser?
Thanks,
Jerome.
To properly get HAL formatted results, your query controllers must return some form of Spring HATEOAS Resource type.
@RequestMapping(method = RequestMethod.GET, value = "/documents/search/findAll")
public ResponseEntity<?> findAll() {
    List<Resource<Document>> docs = new ArrayList<>();
    docs.add(new Resource<Document>(new Document("doc1"), new Link("localhost")));
    docs.add(new Resource<Document>(new Document("doc2"), new Link("localhost")));
    Resources<Resource<Document>> resources = new Resources<Resource<Document>>(docs);
    resources.add(linkTo(methodOn(ApplicationController.class).findAll()).withSelfRel());
    resources.add(entityLinks.linkToCollectionResource(Document.class).withRel("documents"));
    return ResponseEntity.ok(resources);
}
I have submitted a pull request to Spring Data REST to update its reference docs to specify this in http://docs.spring.io/spring-data/rest/docs/2.4.0.RELEASE/reference/html/#customizing-sdr.overriding-sdr-response-handlers
I am not sure I perfectly got your question. If I did this should be the answer: http://docs.spring.io/spring-data/jpa/docs/1.9.0.RELEASE/reference/html/#repositories.custom-implementations

http delete with REST

I am currently using the Jersey framework (a JAX-RS implementation) for building RESTful web services. The resource classes in the project implement the standard HTTP operations: GET, POST and DELETE. I am trying to figure out how to send request parameters from the client to these methods.
For GET they would be in the query string (extracted using @QueryParam), and for POST they would be a name/value pair list (extracted using @FormParam) sent in the request body. I tested them using HTTPClient and they worked fine. For the DELETE operation, I am not finding any conclusive answers on the parameter type/format. Does a DELETE operation receive parameters in the query string (extracted using @QueryParam) or in the body (extracted using @FormParam)?
In most DELETE examples on the web, I observe the use of the @PathParam annotation for parameter extraction (this would be from the query string again).
Is this the correct way of passing parameters to the DELETE method? I just want to be careful here so that I am not violating any REST principles.
Yes, it's up to you, but as I understand the REST ideology, a DELETE on a URL should delete whatever a GET on the same URL returns. For example, if
GET http://server/app/item/45678
returns the item with id 45678,
DELETE http://server/app/item/45678
should delete it.
Thus, I think it is better to use PathParam than QueryParam to identify the item, while QueryParam can be used to control some aspects of the operation:
DELETE http://server/app/item/45678?wipeData=true
The DELETE method should use the URL to identify the resource to delete. This means you can use either path parameters or query parameters.
Beyond that, there is no right or wrong way to construct a URL as far as REST is concerned.
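To make the distinction between the two parts of the URL concrete, here is a small plain-Java sketch that splits a DELETE target into its path and query components. The class name DeleteTarget is made up, and the wipeData flag is taken from the example URL earlier in this thread. In JAX-RS, the last path segment is what @PathParam would bind and the flag is what @QueryParam would bind.

```java
import java.net.URI;

/** Hypothetical helper: splits a DELETE target into the resource id and a query flag. */
final class DeleteTarget {
    final String id;        // last path segment: identifies the resource
    final boolean wipeData; // query parameter: modifies how the delete is carried out

    DeleteTarget(String url) {
        URI uri = URI.create(url);
        String path = uri.getPath();   // e.g. "/app/item/45678"
        this.id = path.substring(path.lastIndexOf('/') + 1);
        String query = uri.getQuery(); // e.g. "wipeData=true", or null when there is no '?'
        boolean wipe = false;
        if (query != null) {
            for (String pair : query.split("&")) {
                if (pair.equals("wipeData=true")) {
                    wipe = true;
                }
            }
        }
        this.wipeData = wipe;
    }
}
```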
You can do it like this.
The URL is http://yourapp/person/personid
@DELETE
@Path("/person/{id}")
@Produces(MediaType.APPLICATION_JSON)
public Response deletePerson(@PathParam("id") String id) {
    Result result = new Result();
    try {
        persenService.deletePerson(id);
        result.setResponce("success");
    } catch (Exception e) {
        result.setResponce("fail");
        e.printStackTrace();
    }
    return Response.status(200).entity(result).build();
}
@QueryParam would be the correct way. @PathParam is only for the part of the URL before the query string (the query string is the stuff after the '?'). And @FormParam is only for submitted web forms that have the form content type.