Difference between recursive and iterative dns lookup - sockets

I am making a resolver and a nameserver program without using DNS libraries (such as netdb.h), by constructing and sending DNS messages directly, but I have a few problems. As far as I can tell, when we send a recursive request, the queried nameserver finds the records for us. Are the queries that this nameserver uses to query other servers essentially iterative queries? At least these images suggest so.
I am also confused about whether a client can do an iterative query, or whether only a nameserver can do iterative queries.
Recursive dns lookup:
Iterative dns lookup:

Any DNS client (or "resolver") may perform iterative queries.
By definition, though, a resolver that does perform iterative queries is a recursive resolver, and not a stub resolver.
Stub resolvers are usually implemented as libraries, linked directly into your executable.
However, it is also possible to build a complete recursive resolver as a standalone library; libunbound is a particularly good example.

A client can certainly do iterative queries on its own without needing to consult a recursive resolver, but there are many reasons not to:
reduce the complexity of the stub resolver libraries (e.g. libresolv, or the resolver built into libc) that must exist on every host
delegate iterative querying to a server in an ISP network or nearer to a "backbone" which will have a better Internet connection (most importantly, lower delay) and can complete the iterative query faster.
aggregate the DNS queries of many end users onto a small number of caching resolvers. Most of the time the resolvers won't have to do the full iterative query: they will have some or all of the results already cached.
reduce the number of places where the "hints" file (a list of root nameservers and their IP addresses), which is necessary to bootstrap a recursive resolver, has to be deployed.
DNSSEC throws a wrench in that: with DNSSEC, end users must perform the full iterative query themselves if they want to validate the result. It remains to be seen how the large-scale deployment of DNSSEC-enabled resolvers will play out.
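Since the question is about building a resolver by sending DNS messages directly, here is a minimal sketch of hand-encoding a DNS query per the RFC 1035 wire format (the transaction ID 0x1234 is an arbitrary example). The same message format is used for both lookup styles; the only difference on the wire is the RD (recursion desired) header bit, which a recursive client sets and an iterative client clears before following referrals itself.

```python
import struct

def build_query(qname, qtype=1, recursion_desired=False, txid=0x1234):
    """Encode a DNS query message by hand (RFC 1035 wire format).

    qtype=1 requests an A record; recursion_desired controls the RD bit:
    set it for a recursive query, clear it for an iterative one.
    """
    flags = 0x0100 if recursion_desired else 0x0000  # RD is bit 8 of the flags
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed; the name ends with a zero byte
    qname_wire = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname_wire + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    return header + question

# An iterative-style query for example.com (RD bit clear):
msg = build_query("example.com", recursion_desired=False)
```

The resulting bytes would then be sent over UDP to port 53 of the target nameserver (e.g. with socket.sendto); parsing the response, including referrals in the authority section, is the part an iterative client has to implement itself.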

In a recursive query, the DNS server may query other DNS servers on the internet on your behalf to obtain the answer, much like a proxy server forwarding your request to an upstream server.
In an iterative query, the name server will not go and fetch the complete answer for your query, but will instead give back a referral to other DNS servers which might have the answer.
In other words, with a recursive query the server gives you the answer itself, while with an iterative query it tells you where to search next.

Related

REST API retrieving many subresources efficiently

Let's assume I have a REST API for a bulletin board with threads and their comments as a subresource, e.g.
/threads
/threads/{threadId}/comments
/threads/{threadId}/comments/{commentId}
The user can retrieve all threads with /threads, but what is an efficient/good way to retrieve all comments?
I know that HAL can embed subresources directly into a parent resource, but that possibly means sending a lot of data over the network, even if the client does not need the subresource. Also, I guess paging is difficult to implement (let's say one thread contains many hundreds of posts).
Should there be a different endpoint representing the SQL query where threadId in (..., ..., ...)? I'm having a hard time naming this endpoint in a strictly resource-oriented fashion.
Or should I just let the client retrieve each subresource individually? I guess this boils down to the N+1 problem. But maybe it's not such a big deal, as the client could start to retrieve all subresources at once, and the responses should come back simultaneously? A drawback I can think of is that this more or less forces the API client to use non-blocking IO (as otherwise the client may need to open 20 threads for a page size of 20, or even more), which might not be so straightforward in some frameworks. Also, with HTTP/1.1, browsers typically allow only about 6 simultaneous connections per host, right?
I actually now tend toward the last option, with a focus on HTTP/2 and non-blocking IO (or even server push?), although some simpler clients may not support this. At least the API would stay clean and would not have to be changed just to work around technical difficulties.
Is there any other option I have missed?
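The batched-endpoint option from the question (the SQL-style "where threadId in (...)") can be sketched as a filterable collection resource such as GET /comments?threadId=10,11; the endpoint name and data shape here are assumptions for illustration, not a prescribed design:

```python
def get_comments(comments, thread_ids):
    """Return comments whose threadId is in thread_ids, grouped by thread.

    Mirrors SQL's `WHERE threadId IN (...)`, letting the client fetch the
    subresources for a whole page of threads in a single request instead
    of issuing N+1 individual calls.
    """
    wanted = set(thread_ids)
    grouped = {tid: [] for tid in wanted}
    for c in comments:
        if c["threadId"] in wanted:
            grouped[c["threadId"]].append(c)
    return grouped

# Hypothetical data, as a server handler for /comments?threadId=10,11 might see it:
comments = [
    {"id": 1, "threadId": 10, "body": "first"},
    {"id": 2, "threadId": 11, "body": "second"},
    {"id": 3, "threadId": 10, "body": "third"},
]
page = get_comments(comments, [10, 11])
```

Naming it as a query parameter on the existing /comments collection keeps the design resource-oriented: the resource is still "comments", and threadId is just a filter.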

Rest API vs GraphQL - Idempotent

This question/answer from another user was very informative as to what idempotent means:
What is an idempotent operation?
When it comes to REST APIs, since caching for GET requests can be enabled quickly if it isn't already, a user who wants to fetch, for example, /users/:id or /posts/:id can do so as many times as they'd like, and it shouldn't mutate any data.
If I'm understanding correctly, a GET request is idempotent in this case.
QUESTION
I believe Relay and DataLoader can help with caching GraphQL queries, but they don't address browser/mobile caching.
If we're talking about just the GET-request portion of GraphQL, which goes through a single endpoint, what tech/features or otherwise could I use to get the caching benefits that regular HTTP requests provide?
Idempotence vs Caching
First of all, caching and idempotence are different things and do not necessarily relate to one another. Caching may or may not be used to implement idempotent operations - it is certainly not a requirement.
Secondly, when speaking of HTTP requests, idempotence essentially concerns the state of the server rather than its responses. An idempotent operation will leave the server in the exact same state if performed multiple times. It does not mean that the responses returned by an idempotent operation will be the same (although they often might be).
GET requests in REST are expected to be idempotent by contract - i.e. a GET operation must not have any state-altering side-effects on the server (technically this may not always be true, but for the sake of the explanation let's assume it is). This does not mean that if you GET the same resource multiple times, you'll always get the same data back. Actually, that wouldn't make sense since resources change over time and may be deleted as well right? So caching GET responses can help with performance but has nothing to do with idempotence.
Another way to look at it is that you are querying for a resource rather than arbitrary data. A GET request is idempotent in that you'll always get the same resource and you won't modify the state of the system in any way.
Finally, a poorly (oddly?) developed GET operation on the server side might have side-effects, which would violate the REST contract and make the operation non idempotent. Simple response caching would not help in such a case.
GraphQL
I like to see GraphQL queries as equivalent to GETs in REST. This means that if you query for data in GraphQL, the resolvers must not perform any side-effects. This would ensure that performing the same query multiple times will leave the server in an unchanged state.
Just like in a simple GET you'd be querying for specific resources, although unlike in GET, GraphQL allows you to query for many instances of many different types of resources at once. Once again, that does not mean identical responses, first and foremost because the resources may change over time.
If some of your queries have side-effects (i.e. they alter the state of the resources on the server), they are not idempotent! You should probably use mutations instead of queries to achieve these side-effects. Using mutations would make it clear to the client/consumer that the operation is not idempotent and should be treated accordingly (mutation inputs may accept idempotence keys to ensure Stripe-like idempotency, but that's a separate topic).
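The query/mutation contract described above can be sketched with plain functions standing in for resolvers (the schema, field names, and in-memory "database" here are hypothetical and not tied to any particular GraphQL library):

```python
# In-memory stand-in for server state; shape is an assumption for the example.
db = {"users": {1: {"id": 1, "name": "ada", "loginCount": 0}}}

def resolve_user(db, user_id):
    """Query resolver: a pure read with no side effects, hence idempotent."""
    return db["users"].get(user_id)

def mutate_record_login(db, user_id):
    """Mutation resolver: alters server state, hence NOT idempotent."""
    db["users"][user_id]["loginCount"] += 1
    return db["users"][user_id]

before = dict(db["users"][1])
resolve_user(db, 1)
resolve_user(db, 1)
assert db["users"][1] == before  # repeated queries leave state unchanged

mutate_record_login(db, 1)
assert db["users"][1]["loginCount"] == 1  # the mutation changed state
```

A query resolver that incremented loginCount as a "convenience" would be exactly the kind of hidden side effect that breaks the contract: it would still return data, but repeating it would keep changing server state.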
Caching GraphQL responses
I hope that by now it's clear that caching is not required to ensure / is not what determines the idempotency of GraphQL queries. It's only used to improve performance.
If you are still interested in server-side caching options for GraphQL, there are plenty of resources. You could start by reading what the Apollo Server documentation has to say on the topic. Don't forget that you can also cache database/service/etc. responses. I am not going to provide any specific suggestions since, judging by your question, there's much greater confusion elsewhere.

Query node-label topology from Yarn via REST API [MapR 6.1/Hadoop-2.7]

There is a Java and CLI-interface to query Yarn RM for node-to-nodelabel (and inverse) mappings. Is there a way to do this via the REST-API as well?
An initial RM-API search revealed only node-label based job submissions as an option.
Sadly, that is actually broken in MapR-Hadoop (6.1 as of 6/6/19), so my code has to work around it by implementing the correct scheduling itself. This works (barely; more broken APIs here as well) using the YarnClient Java API.
But as I want to schedule jobs against different resource managers at the same time, behind firewalls, the REST-API is the most compelling option to achieve this, and the YarnClient API's RPC backend can't be easily transported.
My current worst-case solution would be to parse the YARN-WebUI in some way.
The only solution I found so far:
Request /ws/v1/cluster/nodes - this gets you all nodes.
FlatMap/Distinct on each node's nodeLabels, if you need just the list of node labels. Filter by nodeLabel, if you need all nodes for a specified label.
This does mean that you always have to query all nodes, then sort/filter/arrange by node labels, which is a lot of client-side magic. But apparently there's no GetNodesToLabel or even GetClusterNodeLabels to help us out.
I assume getLabelsToNodes is just a client-side implementation, as the protocol doesn't define the API, so that's right out the window for REST, unless implemented in the WebService.
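The flatten/filter step described above can be sketched like this. The {"nodes": {"node": [...]}} envelope and the nodeLabels field follow the usual shape of the /ws/v1/cluster/nodes response, but field names should be checked against your Hadoop version, and the hostnames here are made up:

```python
def labels_to_nodes(nodes_json):
    """Invert a /ws/v1/cluster/nodes response into label -> [node ids].

    Nodes that report no labels are grouped under "" (the default
    partition). The response shape is an assumption; verify it against
    your cluster's actual output.
    """
    mapping = {}
    for node in nodes_json.get("nodes", {}).get("node", []):
        labels = node.get("nodeLabels") or [""]
        for label in labels:
            mapping.setdefault(label, []).append(node["id"])
    return mapping

# Sample payload, shaped like a (truncated) RM nodes response:
sample = {"nodes": {"node": [
    {"id": "host1:8041", "nodeLabels": ["gpu"]},
    {"id": "host2:8041", "nodeLabels": ["gpu", "ssd"]},
    {"id": "host3:8041"},
]}}
inverse = labels_to_nodes(sample)
```

Filtering for a single label is then just inverse.get("gpu", []), which stands in for the missing GetLabelsToNodes REST call.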

Rest Security Ensure Resource Delete

Background: I'm a new developer fresh out of college at a company that uses an RPC architectural style for a lot of its internal services. They also seem to change which tool they use behind the scenes pretty frequently, so the tight coupling between the client and server implementations in RPC is problematic. I was tasked with rewriting one of the services, and I feel a RESTful API would be a good match because the backing technology can only deal with files anyway, but I have a few questions.
My understanding of REST so far is that you break operations up as much as possible and shift the focus to resources, so the client and the server together make a state machine, with the server mainly handling the transitions through hypermedia.
Example: say you have a service that takes a file and splits it in two byte-wise. I would design the sequence for this like so:
1. the client POSTs the file they want split; the server splits the file, writes both result pieces to a temp folder, and responds that the client should GET both pieces, with their URIs
2. the client sends a GET for a piece; the server returns the piece and indicates that the client should DELETE the URI
3. the client sends a DELETE for the URI
Steps 2 and 3 are done for both pieces.
My question is: how do you ensure that the pieces get deleted at the end?
a client could just not follow step 3
if you combine steps 2 and 3, a malicious (or negligent) client could just stop after step 1
but if you combine them all, isn't that just RPC over HTTP?
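To make the sequence in the question concrete, here is an in-memory stand-in for the split service (class name, URI scheme, and piece naming are all hypothetical, and the HTTP layer is elided):

```python
import uuid

class SplitService:
    """Simulates the server side of the POST / GET / DELETE sequence."""

    def __init__(self):
        self.pieces = {}  # URI -> bytes, standing in for the temp folder

    def post_file(self, data):
        """Step 1 (POST): split the file byte-wise and store both halves."""
        mid = len(data) // 2
        uris = []
        for half in (data[:mid], data[mid:]):
            uri = f"/pieces/{uuid.uuid4().hex}"
            self.pieces[uri] = half
            uris.append(uri)
        return uris  # the response would tell the client to GET these

    def get_piece(self, uri):
        """Step 2 (GET): return a piece; the response hints at DELETE."""
        return self.pieces[uri]

    def delete_piece(self, uri):
        """Step 3 (DELETE): the step a misbehaving client may skip --
        which is exactly what the question is worried about."""
        self.pieces.pop(uri, None)

svc = SplitService()
uri_a, uri_b = svc.post_file(b"abcdef")
```

Seen this way, the server's temp folder grows unboundedly whenever a client skips step 3, which is why the cleanup cannot safely be left to client cooperation.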
If the 2 pieces in question are inseparable, then they are in fact just properties of a single resource.
And yes, if a POST/PUT must be followed by a DELETE, then you're probably just trying to shoehorn RPC into a REST-style architecture.
There's no real definition of what "REST" actually is, but the one thing certain about it is that it MUST be stateless; i.e. every request must be self-sufficient: it cannot depend on a previous request, and it cannot mandate subsequent requests.

Spam IP blacklist feed

Where can I get a list of IP addresses from a spam blacklist database? I need something like the PhishTank API, where I can download their blacklist and integrate it with my app.
I have seen a spam-lookup website, http://www.mxtoolbox.com/, and I was hoping I could get access to a spam feed like theirs. CSV or SQL would be nice.
Thank you.
That site is using DNSBL lookups, not a local resource. Individual blacklists sometimes differ, but in general, to look up an IP address you reverse its octets, append the DNSBL zone, and resolve the resulting name; any answer indicates a match (typically 127.0.0.x, where x may be a result code of some sort). So, for example, to look up 123.45.67.89 in dnsbl.example.net, you'd perform an A query like this:
nslookup 89.67.45.123.dnsbl.example.net
(The program nslookup is not particularly good or convenient, but it is widely available, even on Windows.)
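The octet-reversal step can be sketched in a couple of lines of Python (the zone names here match the placeholder example above; they are not real blacklists):

```python
def dnsbl_name(ip, zone):
    """Build a DNSBL query name: reverse the IPv4 octets, append the zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone

name = dnsbl_name("123.45.67.89", "dnsbl.example.net")
# name == "89.67.45.123.dnsbl.example.net"
```

Resolving that name (e.g. with socket.gethostbyname) returns a 127.0.0.x address when the IP is listed and fails with NXDOMAIN when it is not, which is exactly the check mxtoolbox-style lookup sites perform.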
Read up on DNS in general and DNSBLs in particular. I have found Wikipedia to have good overviews for these topics.