API Request Assistance - rest

I'm new to calling third-party REST APIs.
I have an API which requires an ID (/sites/{id}/). As I don't know the ID off the top of my head and would like to query multiple IDs, is there any way to wildcard this ID so that it runs through and checks, for instance, IDs 1 through 10? Or is this more of a Python integration?

As mentioned above, if an API happens to have a parameter "id", whether or not you can scan for all available IDs (or any ID between 1 and 10) depends entirely on the API.
In your case, the API (for help.rapid7.com) is well documented. It appears to have an endpoint to "list sites", which should give you what you're looking for:
https://help.rapid7.com/insightvm/en-us/api/index.html#operation/getSites
Sites
GET /api/3/sites
Server URL: https://help.rapid7.com/api/3/sites
Retrieves a paged resource of accessible sites.
Query parameters:
* page (integer <int32>, default: 0): The index of the page (zero-based) to retrieve.
* size (integer <int32>, default: 10): The number of records per page to retrieve.
* sort (multiple query params of string): The criteria to sort the records by, in the format property[,ASC|DESC]. The default sort order is ascending. Multiple sort criteria can be specified using multiple sort query parameters.
You would probably want to do the following:
1. Call /api/3/sites (with a filter) to get a list of sites you're interested in, then
2. Make successive calls to /sites/{id}/ for each site in the list you want detailed information about.
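The two steps above can be sketched in Python roughly as follows. This is only a sketch under assumptions: fetch_json is a hypothetical callable you supply (for example a thin wrapper around your HTTP client that adds the host and credentials), and the "resources" key for the listing is an assumption to check against the API documentation.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical type: any function that takes a path plus query parameters and
# returns the decoded JSON body of the response.
FetchJson = Callable[[str, dict], dict]


def iter_sites(fetch_json: FetchJson, page_size: int = 10) -> Iterator[dict]:
    """Yield every site summary by walking the paged /api/3/sites listing."""
    page = 0
    while True:
        body = fetch_json("/api/3/sites", {"page": page, "size": page_size})
        # Assumption: the listing keeps its records under "resources".
        resources = body.get("resources", [])
        yield from resources
        if len(resources) < page_size:
            break  # a short (or empty) page means nothing is left
        page += 1


def fetch_site_details(fetch_json: FetchJson, site_ids: Iterable[int]) -> dict:
    """Call /api/3/sites/{id} once per ID and collect the responses by ID."""
    return {site_id: fetch_json(f"/api/3/sites/{site_id}", {}) for site_id in site_ids}
```

Passing the fetch function in keeps the paging logic independent of whichever HTTP library and authentication scheme you end up using.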

Related

How to improve performance on nested graphql connections when using pagination

I'm trying to implement some kind of a basic social network project. It has Posts, Comments and Likes like any other.
A post can have many comments
A post can have many likes
A post can have one author
I have a /posts route on the client application. It lists the Posts by paginating and shows their title, image, authorName, commentCount and likesCount.
The graphql query is like this;
query {
  posts(first: 10, after: "123456") {
    totalCount
    edges {
      node {
        id
        title
        imageUrl
        author {
          id
          username
        }
        comments {
          totalCount
        }
        likes {
          totalCount
        }
      }
    }
  }
}
I'm using apollo-server, TypeORM, PostgreSQL and dataloader. I use dataloader to get the author of each post. I simply batch the requested authorIds with dataloader, get the authors from PostgreSQL with a where user.id in authorIds query, and map the query result back to each authorId. You know, the most basic type of usage of dataloader.
But when I try to query the comments or likes connection under each post, I get stuck. I could use the same technique and use postId for them if there was no pagination. But now I have to include filter parameters for the pagination. And there may be other filter parameters for some where condition as well.
I've found the cacheKeyFn option of dataloader. I simply create a string key for the filter object passed to the dataloader, and it doesn't duplicate them. It just passes the unique ones to the batchFn. But I can't create a SQL query with TypeORM to get the results for each set of first, after, orderBy arguments separately and map the results back to the function which called the dataloader.
I've searched the spectrum.chat source code and I think they don't allow users to query nested connections. I also tried the GitHub GraphQL Explorer, and it does let you query nested connections.
Is there any recommended way to achieve this? I understood how to pass an object to dataloader and batch them using cacheKeyFn, but I can't figure out how to get the results from PostgreSQL in one query and map the results to return from the loader.
Thanks!
So, if you restrict things a bit, this is doable. The restriction is to only allow batched connections on the first page of results, i.e. all the connections you're fetching in parallel are being requested with the same parameters. This is a reasonable constraint because it lets you do things like get the first 10 feed items and the first 3 comments for each of them, which represents a fairly typical use case. Trying to support independent pagination within a single query is unlikely to fulfil any real-world use cases for a UI, so it's likely an over-optimisation. With this in mind, you can support the "for each parent get the first N children" use case with PostgreSQL using window functions.
It's a bit fiddly, but there are answers floating around which will get you in the right direction: Grouped LIMIT in PostgreSQL: show the first N rows for each group?
So use dataloader the way you already are with cacheKeyFn, and let your loader function recognise whether you can perform the optimisation (e.g. after is null and all other arguments are the same). If you can optimise, use a windowing query; otherwise do unoptimised queries in parallel as you would normally.
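As a language-agnostic illustration of that loader logic, here is a minimal Python sketch. Everything named here is an assumption for illustration: the key shape (a dict with post_id, first, after), the table and column names in the SQL, and the psycopg-style placeholders. The same idea carries over to a dataloader batch function in JavaScript.

```python
from collections import defaultdict

# Illustrative SQL only: "comments", "post_id" and "created_at" are assumed
# names, not taken from the question's schema.
FIRST_N_PER_POST_SQL = """
SELECT * FROM (
    SELECT c.*,
           ROW_NUMBER() OVER (PARTITION BY c.post_id
                              ORDER BY c.created_at DESC) AS rn
    FROM comments c
    WHERE c.post_id = ANY(%(post_ids)s)
) ranked
WHERE ranked.rn <= %(first)s
"""


def _shared_args(key):
    """Everything in a key except the parent ID."""
    return {k: v for k, v in key.items() if k != "post_id"}


def can_batch(keys):
    """True when every key asks for a first page with identical arguments."""
    if not keys:
        return False
    expected = _shared_args(keys[0])
    return all(
        key.get("after") is None and _shared_args(key) == expected
        for key in keys
    )


def map_rows_to_keys(keys, rows):
    """Return one list of rows per key, in the same order as the keys."""
    by_post = defaultdict(list)
    for row in rows:
        by_post[row["post_id"]].append(row)
    return [by_post.get(key["post_id"], []) for key in keys]
```

When can_batch returns False, the loader would fall back to one query per key; when it returns True, a single run of the windowing query feeds map_rows_to_keys.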

RESTful API design: retrieve number of items in category

I want to create a REST API with items in categories, and list all categories alongside the number of items in each.
Schemas:
Category {id, name}
Item {id, name, categoryId}
Endpoints:
GET /categories/list
GET /categories/<id>
PUT /categories/<id>
GET /items/list[?category=<categoryId>]
GET /items/<id>
To update a category I take what I get from GET /categories/<id>, modify the JSON object and PUT it back.
So far so good.
My question is whether there are one or more best practices for retrieving the item count.
I can think of a few ways to do this:
1. Fire a GET /items/list?category=<categoryId> for each category, counting the resulting items.
   Taking the count from an X-Total-Count or Content-Range header, or from a total_count field returned by the endpoint, would avoid having to actually load all items.
2. Add an item_count field to the resulting category JSON objects.
   How should this read-only field be handled for PUTs? Make the backend ignore it? Manually unset it?
3. Create a dedicated endpoint /categories/item_counts that returns a list of categories, each with its number of items.
I like option number 2 (e.g. the WordPress API does it this way) because it does not need extra requests. But I really dislike the idea of having a different object structure for the GET and PUT requests.
REST is really about representation of objects. Category doesn't have a count as it's a single object. Item doesn't have a count either for the same reason. Count is more like RPC where you tell the service to compute something.
GET /items/list[?category=<categoryId>]
is like RPC, passing the category parameter to the list method. Staying in that idiom, you could "chain" the methods to get the total count of items in the specified category:
GET /items/list/count[?category=<categoryId>]
although I'd use path parameters instead:
GET /items/list/<category_id>/count
but list is implied so you could remove it:
GET /items/<category_id>/count
It's straying a bit from "pure" REST but it keeps your actual REST objects clean, as you say, keeping total_count out of your Category objects.
I'm assuming you sometimes need the count but not all the Items otherwise you wouldn't need to ask the API for the count, you'd just count them yourself in the client. That suggests another option:
GET /categories/<id>/count
{
"total_count": 10
}
which fits better with the use case of finding out how many Items are in a Category.
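To make the shape of that last option concrete, here is a small framework-free Python sketch of what such a handler might compute. The in-memory ITEMS list and the function name are hypothetical stand-ins for a real data store and route handler.

```python
# Hypothetical in-memory data standing in for the Item table.
ITEMS = [
    {"id": 1, "name": "hammer", "categoryId": 1},
    {"id": 2, "name": "wrench", "categoryId": 1},
    {"id": 3, "name": "apple", "categoryId": 2},
]


def get_category_count(category_id, items=ITEMS):
    """Body for GET /categories/<id>/count: only the count, no Category fields."""
    total = sum(1 for item in items if item["categoryId"] == category_id)
    return {"total_count": total}
```

Because the count lives in its own resource, the Category representation used for GET and PUT stays identical.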

Proper multi-id syntax when using the custom_field_ids[] query parameter for the CLIO API "contacts" endpoint

What is the correct API syntax for using the custom_field_ids[] query parameter to specify multiple fields (but not all) in the CLIO API contacts result set? I need to specify multiple custom fields. I can get it to work for a single field, but not multiple fields at the same time.
Specifically, how do I specify and delimit the multiple fields? I have tried the following:
custom_field_ids[]=1234567,2345678
custom_field_ids[]=[1234567,2345678]
custom_field_ids[]=(1234567,2345678)
custom_field_ids[]={1234567,2345678}
custom_field_ids[]=1234567:2345678
The API documentation at https://app.clio.com/api/v4/documentation is silent on the list syntax that it expects.
Below is one specific API call I tried (both the actual URL-encoded call, and a decoded one for clarity) using a simple comma-delimited list, but which only returns custom field data for the first ID in the list--not the second. If I enclose the ID list in any kind of brackets (per above), the endpoint returns a 404 error.
https://app.clio.com/api/v4/contacts?custom_field_ids[]=1234567%2C2345678&custom_field_values[4529224]=true&fields=id%2Cname%2Cprimary_address%2Cprimary_work_address%2Cis_client%2Ctype%2C%20primary_email_address%2Cprimary_phone_number%2Ccustom_field_values%7Bid%2Cfield_type%2Cfield_name%2Cvalue%2Ccustom_field%7D
https://app.clio.com/api/v4/contacts?custom_field_ids[]=1234567,2345678&custom_field_values[4529224]=true&fields=id,name,primary_address,primary_work_address,is_client,type,primary_email_address,primary_phone_number,custom_field_values{id,field_type,field_name,value,custom_field}
Try:
custom_field_ids[]=1234567&custom_field_ids[]=2345678
I was able to do this with Contacts Custom Fields by putting custom_field_ids[] on the URL as many times as you have IDs.
I hope this helps.
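If you build the query string in code, repeating the key is straightforward. Here is a minimal Python sketch using the standard library; the helper name contacts_query is made up for illustration, and note that urlencode percent-encodes the brackets in the key.

```python
from urllib.parse import urlencode


def contacts_query(custom_field_ids, fields):
    """Build a query string that repeats custom_field_ids[] once per ID."""
    params = [("custom_field_ids[]", field_id) for field_id in custom_field_ids]
    params.append(("fields", ",".join(fields)))
    return urlencode(params)


query = contacts_query([1234567, 2345678], ["id", "name"])
# query == "custom_field_ids%5B%5D=1234567&custom_field_ids%5B%5D=2345678&fields=id%2Cname"
```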

Get TFS work item and its links using REST API

I'm using TFS REST API and am trying to retrieve work items & their child items by title (parent's title is the parameter). I can't find a way to retrieve these linked items using TFS REST API.
This is what I've tried. First I query for the work items by title:
URI = http://[tfspath]/_apis/wit/wiql?api-version=1.0
query = SELECT * FROM WorkItem WHERE [System.Title] = 'some title'
The above returns a WorkItems object which has only the ID/URL of the matching work item. Then, I use the returned ID in the query below (let's say the ID is 1234):
URI = http://[tfspath]/_apis/wit/workitems/1234?fields=System.Title&api-version=1.0
This returns the title of the item & other fields I might include on the fields list. However, I cannot find a way to include the child items in the returns. I've tried including System.RelatedLinks but this does not change the returned fields. Example:
URI = http://[tfspath]/_apis/wit/workitems/1234?fields=System.Title,System.RelatedLinkCount,System.RelatedLinks&api-version=1.0
Returns
"fields":{"System.RelatedLinkCount":4,"System.Title":"some title"}
Which means there are 4 related links to the work item "some title", but they are not being returned.
What am I missing here? How do I get these related links/child items?
Append &$expand=relations to the querystring to fetch the links collection of a workitem:
$expand enum { all, relations, none } none
Gets work item relationships (work item links, hyperlinks, file attachments, etc.).
See: https://www.visualstudio.com/en-us/docs/integrate/api/wit/work-items#with-links-and-attachments
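As a rough Python sketch of how the expanded response might be consumed: the helper names are hypothetical, and the assumptions are that each entry in relations carries a rel link type and a url ending in the linked work item's ID, and that child links use the System.LinkTypes.Hierarchy-Forward link type. Verify both against a real response from your server.

```python
# Assumed link type for parent -> child links.
CHILD_LINK = "System.LinkTypes.Hierarchy-Forward"


def work_item_url(base_url, work_item_id, api_version="1.0"):
    """Build the work item URL with the relations collection expanded."""
    return (
        f"{base_url}/_apis/wit/workitems/{work_item_id}"
        f"?$expand=relations&api-version={api_version}"
    )


def child_ids(work_item):
    """Extract the IDs of child work items from an expanded work item dict."""
    ids = []
    for relation in work_item.get("relations") or []:
        if relation.get("rel") == CHILD_LINK:
            # The linked item's ID is the last path segment of the URL.
            ids.append(int(relation["url"].rstrip("/").rsplit("/", 1)[1]))
    return ids
```

Each returned ID can then be fetched with the same work item URL to read the child's fields.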
To get a work item with all details as well as the links with details, you'll need to use the APIs that are intended for reporting purposes. Due to the possible sheer size of the returned document, it will be chunked and you will be given a watermark. You may need to make multiple requests.
See: https://www.visualstudio.com/en-us/docs/integrate/api/wit/reporting-work-item-links

How to implement cursors for pagination in an api

This is similar to this question, which doesn't have any answers. I've read all about how to use cursors with the Twitter, Facebook, and Disqus APIs, and also this article about how Disqus generally built their cursors, but I still cannot seem to grok the concept of how they work and how to implement a similar solution in my own projects. Can someone explain specifically the different techniques and concepts behind them?
Let's first understand, with an example, why offset pagination fails for large data sets.
Clients provide two parameters: limit for the number of results and offset for the page offset.
For example, with offset = 40, limit = 20, we can tell the database to return the next 20 items, skipping the first 40.
Drawbacks:
Using LIMIT OFFSET doesn't scale well for large datasets. As the offset increases the farther you go within the dataset, the database still has to read up to offset + count rows from disk, before discarding the offset and only returning count rows.
If items are being written to the dataset at a high frequency, the page window becomes unreliable, potentially skipping or returning duplicate results.
How do cursors solve this?
Cursor-based pagination works by returning a pointer to a specific item in the dataset. On subsequent requests, the server returns results after the given pointer.
We will use next_cursor along with limit as the parameters provided by the client in this case.
Let's assume we want to paginate from the most recent user to the oldest user. When the client makes its first request, suppose we select the first page with this query:
SELECT * FROM users
WHERE team_id = %team_id
ORDER BY id DESC
LIMIT %limit
Here %limit is the client's limit plus one, so we fetch one more result than the count the client asked for. The extra result isn't returned in the result set, but we use its ID as the next_cursor.
The response from the server would be (next_cursor is the user id of the extra result):
{
  "users": [...],
  "next_cursor": "1234"
}
The client would then provide next_cursor as cursor in the second request.
SELECT * FROM users
WHERE team_id = %team_id
AND id <= %cursor
ORDER BY id DESC
LIMIT %limit
With this, we’ve addressed the drawbacks of offset based pagination:
Instead of the window being calculated from scratch on each request based on the total number of items, we’re always fetching the next count rows after a specific reference point. If items are being written to the dataset at a high frequency, the overall position of the cursor in the set might change, but the pagination window adjusts accordingly.
This will scale well for large datasets. We're using a WHERE clause to fetch rows with id values at or below the cursor, which is the id of the first row we have not yet returned. This lets us leverage the index on the column, and the database doesn't have to read any rows that we've already seen.
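The scheme above can be sketched end to end in Python with the standard sqlite3 module. This is a minimal sketch, not a production implementation: the users table with id, team_id and name columns is an assumption, and fetch_page is a made-up helper name.

```python
import sqlite3


def fetch_page(conn, team_id, limit, cursor=None):
    """Return (rows, next_cursor) for one page, newest users first."""
    sql = "SELECT id, name FROM users WHERE team_id = ?"
    params = [team_id]
    if cursor is not None:
        # The cursor is the id of the first row of this page, so use <=.
        sql += " AND id <= ?"
        params.append(cursor)
    sql += " ORDER BY id DESC LIMIT ?"
    params.append(limit + 1)  # fetch one extra row to learn the next cursor
    rows = conn.execute(sql, params).fetchall()
    # The extra row is not returned; only its id is exposed as next_cursor.
    next_cursor = rows[limit][0] if len(rows) > limit else None
    return rows[:limit], next_cursor
```

A client keeps calling fetch_page with the returned next_cursor until it comes back as None, which signals the last page.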
For a detailed explanation, you can read this wonderful engineering article from Slack!
Here is an article about pagination: paginating-real-time-data-cursor-based-pagination
Cursors – we need to have at least one column with unique sequential values to implement cursor based pagination. This can be similar to Twitter’s max_id parameter or Facebook’s after parameter.
In general you should pass the current item or page number in the request as a param. Other usual param is the batch size of the page. Then on the server side backend you select and return the proper dataset, with an SQL query for example.
Here's what I ended up with. The cursor works as a pointer to a specific index, and limit picks that many rows starting from that pointer. Let's say we are given id 10 and limit 5: the query goes to id 10 and picks 5 elements from there.
Some Graph API connections use cursors by default. You can use the 'limit' and 'before'/'after' parameters in your call. If you are still not clear, you can post your code here and I can explain with it.