ArangoDB Foxx as a REST back-end

I am working on an app that would greatly benefit from ArangoDB's multi-model capabilities. Considering the app's back-end needs, I have concluded that most, if not all, of it could be served through a REST API, which would keep the design clean for future development and integration with other systems. The API would then be consumed by several web and mobile front-end frameworks that handle the rest of the logic. The project will be developed in JavaScript for the whole stack, using the Node.js ecosystem.
The question itself:
Should, and could, one use ArangoDB + Foxx to create the complete back-end stack for serving a REST API, thus avoiding another layer/component in the stack (e.g. express/hapi/loopback)?
Major back-end requirements:
Authentication with roles
Sessions
Encryption
Complex querying (root of my initial thought, as to avoid multiple hops between DB and back-end)
Entry parsing, validation and sanitization
Scheduled tasks
Mainly looking for:
Known design advantages
Known design limitations
"Hidden" bottlenecks
Other possible future regrets
Side question (that might answer some of the above): Could Foxx utilise some of the node middleware available via npm?
Thanks in advance for your time!

You can use ArangoDB Foxx as the sole backend of your application. However, it is important to keep the limitations of Foxx (compared to a general-purpose JS environment like Node.js) in mind when doing this.
You mention encryption. While ArangoDB does support some cryptography (e.g. HMAC signing and PBKDF2 key derivation for passwords) the support is not as exhaustive and extensible as in Node.js. Also when using computationally expensive cryptography this will affect the performance of the database (because unlike Node.js Foxx is strictly synchronous and thus all operations should be considered blocking).
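To make the PBKDF2 point concrete, here is a minimal sketch of salted password hashing. It uses Node.js's built-in crypto module rather than ArangoDB's own crypto module, and the function names, iteration count and key length are assumptions chosen for illustration. Note that the derivation is deliberately slow and synchronous, which is exactly the cost that would block the database if done inside Foxx.

```javascript
const crypto = require('crypto');

// Assumed parameters for illustration; tune them for your own threat model.
const ITERATIONS = 100000;
const KEY_LENGTH = 32;
const DIGEST = 'sha256';

// Derive a salted hash suitable for storing in a user document.
function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, DIGEST);
  return { salt: salt.toString('hex'), hash: hash.toString('hex') };
}

// Re-derive the hash from the stored salt and compare in constant time.
function verifyPassword(password, stored) {
  const salt = Buffer.from(stored.salt, 'hex');
  const expected = Buffer.from(stored.hash, 'hex');
  const actual = crypto.pbkdf2Sync(password, salt, ITERATIONS, expected.length, DIGEST);
  return crypto.timingSafeEqual(actual, expected);
}
```

A stored record would then hold only the `salt` and `hash` fields, never the password itself.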
ArangoDB does not support role-based authentication out of the box but it is perfectly reasonable to implement it within ArangoDB using Foxx (just like you would implement it in Node.js, except you don't need to leave the database).
For sessions there are generally two possible approaches: you can either use a collection with session documents (using ArangoDB as your session backend) or you can keep your services stateless by using signed tokens (Foxx comes with JWT support out of the box).
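The stateless option can be sketched as follows. This is not Foxx's JWT API and not a real JWT either: it is a simplified, hypothetical token format (base64 payload plus an HMAC) written against Node.js's built-in crypto module, only to show why no session collection is needed when the server can verify its own signature.

```javascript
const crypto = require('crypto');

// Hypothetical helper: serialise the payload and append an HMAC over it.
function signToken(payload, secret) {
  const body = Buffer.from(JSON.stringify(payload)).toString('base64');
  const mac = crypto.createHmac('sha256', secret).update(body).digest('hex');
  return `${body}.${mac}`;
}

// Returns the payload if the signature is valid and the token has not
// expired, otherwise null. No server-side session state is consulted.
function verifyToken(token, secret) {
  const [body, mac] = token.split('.');
  if (!body || !mac) return null;
  const expected = crypto.createHmac('sha256', secret).update(body).digest('hex');
  const given = Buffer.from(mac);
  const wanted = Buffer.from(expected);
  if (given.length !== wanted.length || !crypto.timingSafeEqual(given, wanted)) {
    return null;
  }
  const payload = JSON.parse(Buffer.from(body, 'base64').toString('utf8'));
  if (typeof payload.exp === 'number' && payload.exp < Date.now()) return null;
  return payload;
}
```

The trade-off is the usual one: a signed token cannot be revoked before it expires unless you keep some server-side state after all, which is where the session-collection approach has the advantage.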
Complex/stored queries and input validation (using the joi schema library originally written for hapi) are actually some of the main use cases of Foxx so those shouldn't be any problem whatsoever.
Foxx comes with its own mechanism for queueing tasks, which can also be scheduled ahead or recur periodically. However depending on your requirements an external job or message queue may be a better fit. The good thing is you can get started with the built-in job queue right away and still move on to a dedicated solution if the need arises during development.
As for middleware and NPM packages: Foxx is not fully compatible with Node.js code. While we provide a lot of compatibility code and try to keep the core modules compatible where possible, a big difference is that Node.js is generally used to perform asynchronous operations while in ArangoDB all operations are synchronous.
If you have Node.js modules that don't use crypto, file or network I/O and don't use asynchronous APIs (e.g. setTimeout, promises) they may be compatible with Foxx. A lot of utility libraries like lodash work with no problems at all. Even if you find that a module doesn't work it may be possible to write an adapter for it like we have done with mocha (integrated into Foxx) and GraphQL (via the graphql-sync package on NPM).
In my experience it is a good approach to put your Foxx service behind a thin layer of Node.js (e.g. a simple express application that mostly just proxies to your Foxx API) and/or to delegate some parts of your backend to standalone Node.js microservices (e.g. integration with non-HTTP services like e-mail or LDAP) which can be integrated in Foxx via HTTP.
One more thing: while a lot of existing express middleware likely isn't compatible with Foxx because of Node-specific dependencies and async logic, ArangoDB 3 will bring a new version of Foxx with support for middleware using a functionally express-compatible API.

I'm just starting to port my sails application to a FOXX application so I can answer some of your questions.
Role-based authorization in ArangoDB probably operates at a higher level than you want. In our case, we use an external service to authorize various web and service-based applications at a very fine-grained level (much lower than a vertex or an edge). My feeling is that authorization at that level will require you to write it yourself in JavaScript. If it's just CRUD on a per-collection basis, then it shouldn't require much effort.
For authorization and sessions, I would look at the FOXX example found at: FOXX authorization-session example
It's not clear what you're asking about encryption. If you're talking about SSL connections, then that is natively supported (see ArangoDB end-points). As for internal encryption, there is a JavaScript crypto module (see ArangoDB crypto).
Entry validation, etc. is supported by the javascript joi package.
Complex querying... Absolutely and getting even better in ArangoDB version 3.x. Traversals can be chained (go down using one edge collection, then up using another).
You're right on the ball when thinking about efficiency. This is the main reason we're going from sails to FOXX. In our case, we filter query results based on permissions from our external service. This means that we can't use ArangoDB native skip and limit support if these attributes are specified by the client. In sails, we have to bring back results in chunks and collect until we hit the appropriate skip and limit values. By moving to FOXX, we save a lot of network and other resources. We tested this by having sails forward the request to our prototype FOXX implementation. This scaled much better than the sails post-processing setup.
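The chunked skip/limit collection described above can be sketched roughly like this. It is plain synchronous JavaScript, as Foxx code would be; `fetchChunk` and `isAllowed` are hypothetical stand-ins for the database query and the external permission check, and the default chunk size is an arbitrary assumption.

```javascript
// Collect one page of permitted documents, applying skip and limit
// only to documents that pass the permission filter.
function pageWithPermissions({ fetchChunk, isAllowed, skip, limit, chunkSize = 100 }) {
  const page = [];
  let offset = 0;
  let toSkip = skip;
  while (page.length < limit) {
    const chunk = fetchChunk(offset, chunkSize);
    if (chunk.length === 0) break; // no more data
    offset += chunk.length;
    for (const doc of chunk) {
      if (!isAllowed(doc)) continue; // filtered out: does not count
      if (toSkip > 0) {
        toSkip -= 1; // permitted, but before the requested window
        continue;
      }
      page.push(doc);
      if (page.length === limit) break;
    }
  }
  return page;
}
```

When this loop runs next to the data, each `fetchChunk` call is a local query rather than a network round trip, which is where the savings mentioned above come from.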
You can use NPM modules with restrictions. See Javascript Modules

Related

To run a GraphQL server in Python that allows queries and subscriptions, do I have to combine it with a web framework service?

Excuse my ignorance in this area: most of my programming has been in optimization and research. I am very new to GraphQL and client-server programming.
My organization is working on an automated scheduler in Python 3.9 for scheduling observations for a large-scale telescope.
We are relying on many different services to all communicate via GraphQL. At the moment, I am trying to implement a GraphQL server that can be queried or accept subscriptions to disseminate when a new schedule for the night is created (for any number of reasons such as changing weather conditions, instrument faults, modifications to observations). Eventually, we will need to allow mutations (e.g. to the priority of observations, or to fix an observation at a given time).
I am looking at both Strawberry and Graphene as my possible options, but what is unclear to me is if I require them to be combined with a web framework service like Django or Flask to achieve the functionality that I need.
I see that Strawberry has a built-in (possibly only debug) server, but it also discusses integration with Django, Flask, and others, and I am not certain if I need to go to that level. I have been working through examples and completed a JavaScript course using Apollo Server / Client, but I'm not sure how these compare to Python GraphQL server implementations.
I apologize for my lack of knowledge: I am trying to keep the project as simple as possible for now, and having played around with Graphene and Django, I'm not sure if I'm overcomplicating things or if this approach is necessary.
Statements like "Graphene is fully featured with integrations for the most popular web frameworks and ORMs" lead me to believe a web framework is required, but again, I am not sure and feel very out of my depth since my experience in this area is virtually nonexistent.
I'm the maintainer of Strawberry GraphQL 😊
For both Strawberry and Graphene you'd need a framework like Django or Flask.
Strawberry has support for subscriptions when using an ASGI framework like Starlette or FastAPI; there are some examples here: https://strawberry.rocks/docs/general/subscriptions#subscriptions
We also have an almost-done PR that adds support for subscriptions using django: https://github.com/strawberry-graphql/strawberry/pull/1407

Rest vs XCC in MarkLogic

In MarkLogic, is there a preferred way to make service calls: the REST API or XCC?
Which is better for performance, and why? Or which one is suitable in which scenario?
Assumption - Java layer is always present in the system.
In terms of performance, XCC will likely outperform REST API calls. It avoids the overhead of the REST rewriter and request/response processing.
However, it's also important to note that you can make HTTP calls to a MarkLogic HTTP server without configuring it to be a REST API instance. You could invoke an installed JavaScript or XQuery module directly via HTTP.
MarkLogic Data Services provide another means of creating services and generating the Java classes that will be used to invoke data services in an RPC manner. Similar to invoking an installed module, they avoid the overhead of the REST rewriter and parameter processing and can perform better than REST API calls.
The advantage of the MarkLogic REST API is that they provide some standard out of the box functionality that can be leveraged. The MarkLogic Java Client API sits on top of the REST API.
There are pros and cons to each of them. Which to use may depend a lot on performance requirements and on your preference for how much code to write versus leveraging APIs and provided functionality.
Also, note that things don't need to be exclusive. Use what makes sense when it makes sense, and mix-n-match if you need. For instance, maybe most calls work fine and would leverage the Java Client API, but a few particular calls either use XCC or Data Services for high volume and velocity requirements in which every millisecond counts.

Adapter Proxy for Restful APIs

This is a general 'what technologies are available' question.
My company provides a web application with a RESTful API. However, it is too slow for my needs and some of the results are in an awkward format.
I want to wrap their restful server with a proxy/adapter server, so when you connect to the proxy you get the RESTful API I wish the real one provides.
So it needs to do a few things:
passthrough most requests
cache some requests
do some extra requests on the original server to detect if a request is cacheable
for instance: there is a request for a field in a record, GET /records/id/field, which might be slow, but there is a fingerprint request, GET /records/id/fingerprint, which is always fast. If there exists a cache of GET /records/1/field2 for the fingerprint feedbeef, then I need to check that the original server still has the fingerprint feedbeef before serving the cached version.
fix headers for some responses - e.g. content-type, based upon the path
do stream processing on some large content, for instance
GET /records/id/attachments/1234
returns a 100Mb log file in text format
remove null characters from files
optionally recode the log to filter out irrelevant lines, reducing the load on the client
cache the filtered version for later requests.
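The fingerprint check in the list above could be sketched as follows. `fetchFingerprint` and `fetchField` are hypothetical stand-ins for the fast and slow requests to the original server; they are synchronous here only to keep the sketch short, and an HTTP-backed version would return promises instead.

```javascript
// Cache slow field lookups, but only serve a cached value while the
// record's (cheap) fingerprint still matches the one it was cached under.
function createFingerprintCache({ fetchFingerprint, fetchField }) {
  const cache = new Map(); // "recordId/field" -> { fingerprint, value }
  return function getField(recordId, field) {
    const key = `${recordId}/${field}`;
    const fingerprint = fetchFingerprint(recordId); // always fast
    const hit = cache.get(key);
    if (hit && hit.fingerprint === fingerprint) {
      return hit.value; // record unchanged: skip the slow request
    }
    const value = fetchField(recordId, field); // slow path
    cache.set(key, { fingerprint, value });
    return value;
  };
}
```

Every request still costs one fingerprint lookup, so this only pays off when that lookup really is much cheaper than the field request.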
While I could modify the client to achieve this functionality, such code would not be re-usable for other clients (different languages), and complicates the client logic.
I had a look at whether clojure/ring could do it, and while there is a nice little proxy middleware for it, it doesn't handle streaming content as far as I can tell - the whole 100Mb would have to be downloaded. Also it doesn't include any cache logic yet.
I took a look at whether squid could do it, but I'm not familiar with the technology, and it seems mostly concerned with passing through requests rather than modifying them on the fly.
I'm looking for hints where I might find the correct technology to implement this. I'm mostly language agnostic if learning a new language gets me access to a really simple way to do it.
I believe you should choose a platform that is easier for you to implement your custom business logic on. The following web application frameworks provide easy connectivity with REST APIs, and allow you to create a web application that could work as a REST proxy:
Play framework (Java + Scala)
express + Node.js (Javascript)
Sinatra (Ruby)
I'm more familiar with Play, of which I know it provides utilities for caching you could find useful, and is also extendable by a number of plugins.
If you are familiar with Scala, you could also have a look at Finagle. It is a framework built by Twitter's infrastructure team to provide protocol-agnostic connectivity. It might be overkill for a REST-to-REST proxy, but it provides abstractions you might find useful.
You could also look at some third-party services like Apitools, which allows you to create a proxy programmatically (in Lua). Apirise is a similar service (of which I'm a co-founder) that intends to provide similar functionality with a user-friendly UI.
Beeceptor does exactly what you want. It plugs in-between your web-app and original API to route requests.
For your use case of caching a few responses, you can create a rule. That way it will not hit the original endpoint.
The requests to the original APIs can be mocked, and you can inspect the responses.
You can simulate delays.
(Note: it is a shameless plug, I am the author of Beeceptor and thought it should help you and other developers.)
https://github.com/nodejitsu/node-http-proxy is looking useful - although I don't yet know if it can stream process for transcoding.
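Independently of which proxy library is used, the stream processing itself can be sketched with Node.js's built-in Transform stream. This is a sketch under assumptions, not a feature of node-http-proxy: the function names are made up, and the content is assumed to be UTF-8 text with newline-separated lines. It strips null characters and drops lines rejected by a caller-supplied predicate, without holding the whole file in memory.

```javascript
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

// Stateful helper: strips NUL characters and drops unwanted lines,
// buffering any partial line until the rest of it arrives.
function createLineFilter(keepLine) {
  let tail = '';
  return {
    write(text) {
      const lines = (tail + text.replace(/\0/g, '')).split('\n');
      tail = lines.pop(); // last element is an incomplete line (or '')
      return lines.filter(keepLine).map((line) => line + '\n').join('');
    },
    end() {
      const last = tail;
      tail = '';
      return last !== '' && keepLine(last) ? last : '';
    },
  };
}

// Wrap the helper in a Transform so it can sit in a pipe() chain.
function createFilterStream(keepLine) {
  const filter = createLineFilter(keepLine);
  const decoder = new StringDecoder('utf8'); // handles split multi-byte chars
  return new Transform({
    transform(chunk, _encoding, callback) {
      callback(null, filter.write(decoder.write(chunk)));
    },
    flush(callback) {
      callback(null, filter.write(decoder.end()) + filter.end());
    },
  });
}
```

In a proxy, the upstream response would be piped through `createFilterStream(...)` into the client response, and the same filtered output could be written to a cache file at the same time.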

REST API MongoDB Authentication

I am thinking of using MongoDB as my main database. However, my app is fully in JavaScript and I wanted to use the REST API client-side.
I still can't understand what security mechanisms I can use in order to make a JS call to the database without revealing all the data to all the users.
Please advise on this matter.
Regards,
Donald
First of all, you can enable database auth which will make the REST interface require authentication if connected to from a remote machine.
That said, it's a very bad idea to expose your database like you suggest. Build a persistence abstraction layer in a server technology you're comfortable with (Node.js, for example) and put all security constraints and authentication there. The advantages are numerous:
You can keep your API stable even if the MongoDB one changes. You can even replace it with another persistence solution if the need arises in most cases.
You can limit the load a single client can put on your database. If you expose the database directly there's very little you can do to avoid people doing expensive queries or even potentially corrupting writes.
You can often do smart app-side caching and optimization that is not possible if every client directly accesses the database (this depends a bit on the app in question though).
Check out Sleepy.Mongoose, it's a REST API interface for MongoDB. I haven't tried it, but it appears to support standard MongoDB authentication.
MongoLab has MongoDB database hosting with a REST API that can be accessed client-side; they even throw in some jQuery-based examples in their support documentation. That said, Remon is right that you sacrifice any security by doing so, because you're making your API key public.
RESTHeart is a Web API for MongoDB.
It provides application level authorization and authentication.
Check the security documentation section.
Also some example applications are available on github:
blog example (using AngularJS via the $http service)
notes example (using AngularJS via the Restangular service)

What is middleware exactly?

I have heard a lot of people talking recently about middleware, but what is the exact definition of middleware? When I look into middleware, I find a lot of information and some definitions, but while reading this information and these definitions, it seems that almost all 'wares' are in the middle of something. So, are all things middleware?
Or do you have an example of a ware that isn't middleware?
Let's say your company makes 4 different products, and your client has another 3 different products from another 3 different companies.
Someday the client thought, why don't we integrate all our systems into one huge system. Ten minutes later their IT department said that will take 2 years.
You (the wise developer) said, why don't we just integrate all the different systems and make them work together? The client manager staring at you... You continued, we will use a Middleware, we will study the Inputs/Outputs of all different systems, the resources they use and then choose an appropriate Middleware framework.
Still explaining to the non tech manager
With Middleware framework in the middle, the first system will produce X stuff, the system Y and Z would consume those outputs and so on.
Middleware is a terribly nebulous term. What is "middleware" in one case won't be in another. In general, you can expect something classed as middleware to have the following characteristics:
Primarily (usually exclusively) software; usually doesn't need any specialized hardware.
If it weren't there, applications that depend on it would have to incorporate it as part of their application and would experience a lot of duplication.
Almost certainly connects two applications and passes data between them.
You'll notice that this is pretty much the same definition as an operating system. So, for instance, a TCP/IP stack or caching could be considered middleware. But your OS could provide the same features, too. Indeed, middleware can be thought of like a special extension to an operating system, specific to a set of applications that depend on it. It just provides a higher-level service.
Some examples of middleware:
distributed cache
message queue
transaction monitor
packet rewriter
automated backup system
Wikipedia has a quite good explanation: http://en.wikipedia.org/wiki/Middleware
It starts with
Middleware is computer software that connects software components or applications. The software consists of a set of services that allows multiple processes running on one or more machines to interact.
What is Middleware gives a few examples.
There are (at least) three different definitions I'm aware of
in business computing, middleware is messaging and integration software between applications and services
in gaming, middleware is pretty well anything that is provided by a third-party
in (some) embedded software systems, middleware provides services that applications use, which are composed out of the functions provided by the hardware abstraction layer - it sits between the application layer and the hardware abstraction layer.
Simply put, middleware is a software component that provides services to integrate disparate systems.
In a complex enterprise environment, there are a number of challenges when you need to integrate two or more enterprise systems so that they can talk to each other. Normally these systems do not understand each other's language, as they are developed on different platforms using different languages (like C++, Java, COBOL, etc.).
This is where middleware software comes into the picture, providing services like
transformation of messages formats from one app to other,
routing and enriching messages besides taking care of security,
encryption,
validation and
applying different business rules to these messages.
A typical example of middleware is an ESB product such as IBM Message Broker (WMB/IIB), WESB, DataPower XI50, Oracle Fusion, Mule and many others.
Therefore, middleware mostly sits between the service-consuming apps and the service-providing apps and helps these apps talk to each other.
Middleware is about how our application responds to incoming requests. Middlewares look into the incoming request and make decisions based on it. We can build entire applications using only middlewares. For example, ASP.NET is a web framework comprising the following chief HTTP middleware components:
Exception/error handling
Static file server
Authentication
MVC
As shown in the above diagram, there are various middleware components in ASP.NET which receive the incoming request, and redirect it to a C# class (in this case a controller class).
Middleware is a general term for software that serves to "glue together" separate, often complex and already existing, programs. Some software components that are frequently connected with middleware include enterprise applications and Web services.
There is a common definition in web application development which is (and I'm making this wording up but it seems to fit): A component which is designed to modify an HTTP request and/or response but does not (usually) serve the response in its entirety, designed to be chained together to form a pipeline of behavioral changes during request processing.
Examples of tasks that are commonly implemented by middleware:
Gzip response compression
HTTP authentication
Request logging
The key point here is that none of these is fully responsible for responding to the client. Instead each changes the behavior in some way as part of the pipeline, leaving the actual response to come from something later in the sequence (pipeline).
Usually, the middlewares are run before some sort of "router", which examines the request (often the path) and calls the appropriate code to generate the response.
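This pipeline idea can be sketched in a few lines of plain JavaScript. It does not use any real framework's API: `createPipeline`, the three example middlewares and the request/response object shapes are all made up for illustration, and everything is synchronous to keep it short.

```javascript
// Build a handler that runs each middleware in order and finally the router.
// A middleware receives the request and a next() function; it may change
// the request, change the response returned by next(), or return early.
function createPipeline(middlewares, router) {
  return function handle(req) {
    let index = 0;
    function next() {
      if (index < middlewares.length) {
        const middleware = middlewares[index++];
        return middleware(req, next);
      }
      return router(req); // end of the chain: produce the actual response
    }
    return next();
  };
}

const log = [];

// Request logging: observes the request, then passes it on untouched.
function requestLogger(req, next) {
  log.push(`${req.method} ${req.path}`);
  return next();
}

// Authentication: the one case here that may answer without calling next().
function requireAuth(req, next) {
  if (!req.headers.authorization) return { status: 401, headers: {}, body: 'unauthorized' };
  return next();
}

// Response decoration: lets the rest of the chain respond, then adds a header.
function addServerHeader(req, next) {
  const res = next();
  return { ...res, headers: { ...res.headers, 'x-served-by': 'sketch' } };
}

const handle = createPipeline(
  [requestLogger, requireAuth, addServerHeader],
  (req) => ({ status: 200, headers: {}, body: `hello ${req.path}` })
);
```

Calling `handle({ method: 'GET', path: '/a', headers: { authorization: 'token' } })` runs all three middlewares before the router; without the authorization header, the chain stops at `requireAuth` and the router is never reached.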
Personally, I hate the term "middleware" for its genericity but it is in common use.
Here is an additional explanation specifically applicable to Ruby on Rails.
Middleware stands between web applications and web services that natively can't communicate and often are written in different languages/frameworks.
One such example is OWIN middleware for the .NET environment. Before OWIN, people were forced to host web apps in Microsoft's hosting software, IIS. After OWIN was developed, it became possible to host either in IIS or self-hosted; IIS simply added support for OWIN, which acts as an interface. It also became possible to host .NET web apps on Linux via Mono, which likewise added support for OWIN.
It also made it possible to create single-page applications, with OWIN handling the HTTP request/response context. On top of OWIN you can add authentication/authorization logic, for example via OAuth2: you can configure middleware to register a class that contains the user authentication logic (e.g. an OAuth2 implementation) or a class that contains the logic for managing HTTP request/response messages. That way you can make one application communicate with other applications/services via different data formats (JSON, XML, etc. if you are targeting the web).
Some examples of middleware: CORBA, Remote Method Invocation (RMI),...
The examples mentioned above are all pieces of software allowing you to take care of communication between different processes (either running on the same machine or distributed over e.g. the internet).
From my own experience with webwork, middleware was the stuff between users (the web browser) and the back-end database. It was the software that took what users put in (for example, orders for iPads), did some magical business logic (e.g. checked whether there were enough iPads available to fill the order), and updated the back-end database to reflect those changes.
It is just a piece of software or a tool on which your application executes and which provides your application with capabilities such as high availability, scalability, and integration with other software or systems, without you having to bother with application-level code changes.
For example: if the operating system on which your application runs requires an IP change, you do not have to worry about it in your code; you can simply update the configuration in the middleware stack.
Example 2: you experience problems with runtime memory allocation and feel that your application's usage has increased. You do not have to worry much about it unless you have a bug or bottleneck in your code; it is easily addressed by tuning the configuration of the middleware on which your application runs.
Example 3: you have multiple disparate software systems and you need them to talk to each other or send data in a common format that all the systems understand. This is where middleware systems come in handy.
Hope the information provided helps.
It is a software layer between the operating system and applications on each side of a distributed computing system in a network. In fact, it connects heterogeneous network and software systems.
If I am not wrong, in software application framework, based on the context, you can consider middleware for the following roles that can be combined in order to perform certain activities in between the user request and the application response.
Adapter
Sanitizer
Validator
I always thought of it as the oldest software I have had to install. The total app used a web server, a database server, and an application server. The web server being the middleware between the data and the app.