I'm putting together a REST-based API, but I'm not sure how I should deliver the response for collections vs. individual resources.
Does it make sense to have a slimmed-down representation for a collection compared with a single item in the world of REST?
Say I have something along these lines for a collection of albums:
{
"items": [
{
"id": 1,
"title": "Thriller"
},
...
]
}
But then for the actual individual item I have:
{
"id": 1,
"title": "Thriller",
"artist": "Michael Jackson",
"released": "1982",
"imageLinks": {
"smallThumbnail": "...",
"largeThumbnail": "..."
}
...
}
A resource's representation should be the same regardless of whether it is returned as part of a collection or as a single item. However, you can introduce a query parameter such as fields that lets clients request only the fields they need, which saves bandwidth.
/albums - This should return the list of objects, each with the same structure you would return from the individual-item API.
/albums?fields=id,title - This can return the list of objects with just the id and title.
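A minimal sketch of how such a fields parameter could be applied on the server side. The helper names pickFields and renderAlbums are hypothetical, and the sample album is taken from the question; a real API would also have to decide how to treat unknown field names, which this sketch simply ignores.

```javascript
// Hypothetical helper: keep only the requested fields of one resource.
// `fieldsParam` is the raw value of the assumed `fields` query parameter.
function pickFields(resource, fieldsParam) {
  // No parameter (or an empty one) means "return the full representation".
  if (!fieldsParam) return resource;
  const wanted = fieldsParam.split(",").map((f) => f.trim()).filter(Boolean);
  const slim = {};
  for (const field of wanted) {
    // Unknown field names are silently skipped in this sketch.
    if (Object.prototype.hasOwnProperty.call(resource, field)) {
      slim[field] = resource[field];
    }
  }
  return slim;
}

// Hypothetical collection renderer: same item structure, optionally trimmed.
function renderAlbums(albums, fieldsParam) {
  return { items: albums.map((album) => pickFields(album, fieldsParam)) };
}

const albums = [
  { id: 1, title: "Thriller", artist: "Michael Jackson", released: "1982" },
];

console.log(JSON.stringify(renderAlbums(albums, "id,title")));
console.log(JSON.stringify(renderAlbums(albums)));
```

With this approach the collection and the single item share one representation, and the slimmed-down form is something the client asks for rather than something the endpoint imposes.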
If /wallet returns a list of wallets and each wallet has a list of transactions, what is the standard OpenAPI/REST convention?
For example,
http://localhost:8000/api/wallets/ gives me
{
"count": 1,
"next": null,
"previous": null,
"results": [
{
"user": 1,
"address": "3E8ociqZa9mZUSwGdSmAEMAoAxBK3FNDcd",
"balance": "2627199.00000000"
}
]
}
http://localhost:8000/api/wallets/3E8ociqZa9mZUSwGdSmAEMAoAxBK3FNDcd/ gives me
{
"user": 1,
"address": "3E8ociqZa9mZUSwGdSmAEMAoAxBK3FNDcd",
"balance": "2627199.00000000"
}
If I wanted to add a transactions list, what is the standard way to form this?
http://localhost:8000/api/wallets/3E8ociqZa9mZUSwGdSmAEMAoAxBK3FNDcd/transactions ?
http://localhost:8000/api/wallets/3E8ociqZa9mZUSwGdSmAEMAoAxBK3FNDcd/transactions?offset=100 for pagination
REST doesn't care what spelling conventions you use for your resource identifiers. What it would instead expect is that your representations include links between resources, along with metadata describing the nature of each link.
So this schema is fine.
/api/wallets/3E8ociqZa9mZUSwGdSmAEMAoAxBK3FNDcd
/api/wallets/3E8ociqZa9mZUSwGdSmAEMAoAxBK3FNDcd/transactions
And this schema is also fine.
/api/wallets/3E8ociqZa9mZUSwGdSmAEMAoAxBK3FNDcd
/api/transactions/3E8ociqZa9mZUSwGdSmAEMAoAxBK3FNDcd
As far as I can tell, OpenAPI also gives you the freedom to design your resource model the way that works best for you (it just tells you one possible way to document the resource model you have selected).
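As a rough illustration of "links between resources", a wallet representation could carry a link to its transactions so that clients follow the link instead of building the URL themselves. The function name walletRepresentation and the links, rel, and href keys are assumptions made for this sketch; neither REST nor OpenAPI prescribes them.

```javascript
// Hypothetical representation builder; the `links`, `rel`, and `href` keys are
// illustrative naming choices, not something REST or OpenAPI mandates.
function walletRepresentation(wallet, baseUrl) {
  const self = `${baseUrl}/wallets/${encodeURIComponent(wallet.address)}`;
  return {
    ...wallet,
    links: [
      { rel: "self", href: self },
      // The client follows this link rather than constructing the URL, so the
      // server is free to pick either URL scheme without breaking clients.
      { rel: "transactions", href: `${self}/transactions` },
    ],
  };
}

const wallet = {
  user: 1,
  address: "3E8ociqZa9mZUSwGdSmAEMAoAxBK3FNDcd",
  balance: "2627199.00000000",
};

console.log(JSON.stringify(walletRepresentation(wallet, "/api"), null, 2));
```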
My MongoDB data structure looks like this:
{
"id": "7118592",
"passages": [
{
"infons": {
"article-id_pmid": "32292259",
"title": "Keywords",
"type": "front",
"section_type": "TITLE"
},
"text": "Understanding of guidance for acupuncture and moxibustion interventions on COVID-19 (Second edition) issued by CAAM"
},
{
"infons": {
"section_type": "ABSTRACT",
"type": "abstract",
"section": "Abstract"
},
"offset": 116,
"text": "At present, the situation of global fight against COVID-19 is serious. WHO (World Health Organization)-China Joint Mission fully confirms the success of \"China's model\" against COVID-19 in the report. In fact, one particular power in \"China's model\" is acupuncture and moxibustion of traditional Chinese medicine. To better apply \"non-pharmaceutic measures\":the external technique of traditional Chinese medicine, in the article, the main content of Guidance for acupuncture and moxibustion interventions on COVID-19 (Second edition) issued by China Association of Acupuncture-Moxibution is introduced and the discussion is stressed on the selection of moxibustion device and the duration of its exertion."
}
]
}
I want to retrieve the article-id_pmid and the text from the same subdocument, and also the text of the subdocument whose infons contains section_type: ABSTRACT and type: abstract.
I tried this query, but the result was not what I was looking for:
db.mydata.find({$and:[{"passages.infons.article-id_pmid$":{$exists: true}},{"passages.infons.section_type":"ABSTRACT"},{"passages.infons.type":"abstract"}]},{"passages.infons.article-id_pmid.$":1,_id:0, "passages.text":1})
Each of the top-level conditions is evaluated independently, so each one can be satisfied by a different element of the passages array.
Use $elemMatch (https://docs.mongodb.com/manual/reference/operator/query/elemMatch/) to specify multiple conditions on the same array element.
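The difference can be sketched as follows. The filter document uses the field names from the question; the two matches... functions are only a plain-JavaScript emulation of the two behaviours for illustration (they are not MongoDB APIs), and the mixed document is made up to show where the results diverge.

```javascript
// Filter document using $elemMatch: both conditions must hold for the SAME
// element of `passages`. In the mongo shell it would be passed as
// db.mydata.find(sameElementFilter).
const sameElementFilter = {
  passages: {
    $elemMatch: {
      "infons.section_type": "ABSTRACT",
      "infons.type": "abstract",
    },
  },
};

// Emulates independent top-level conditions with dot notation:
// each condition may be satisfied by a different array element.
function matchesIndependently(doc) {
  return (
    doc.passages.some((p) => p.infons.section_type === "ABSTRACT") &&
    doc.passages.some((p) => p.infons.type === "abstract")
  );
}

// Emulates $elemMatch: a single element has to satisfy every condition.
function matchesSameElement(doc) {
  return doc.passages.some(
    (p) => p.infons.section_type === "ABSTRACT" && p.infons.type === "abstract"
  );
}

// Made-up document in which the two values live in different passages.
const mixed = {
  passages: [
    { infons: { section_type: "ABSTRACT", type: "front" } },
    { infons: { section_type: "TITLE", type: "abstract" } },
  ],
};

console.log(JSON.stringify(sameElementFilter));
console.log(matchesIndependently(mixed)); // true
console.log(matchesSameElement(mixed)); // false
```

Note that the filter only decides which documents match. Returning just the matching passages, rather than the whole array, would need a projection or an aggregation step on top of it.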
I have a music app that has a job to find music recommendations based on a tag id.
There are two entities involved:
Song - a song record contains its name and a list of music tag ids (genres) this song belongs to
MusicTag - the music tag itself, includes id, name etc.
Data is currently stored in MongoDB.
The Songs collection in Mongo has millions of songs, and each song has an average of 7 tag ids.
The MusicTags collection has about 30K records.
The Songs collection looks like this:
[
{
name: "Metallica - one",
tags: [
"6018703624d8a5e8efa1b76e", // Rock
"601861cc8cef62ba86765017", // Heavy metal
"5fda07ac8db0615c1c503a46" // Hard Rock
]
},
{
name: "Metallica - unforgiven",
tags: [
"6018703624d8a5e8efa1b76e", // Rock
"5fda07ac8db0615c1c503a46", // Metal
]
},
{
name: "Lady Gaga - Bad Romance",
tags: [
"5fc7b9f95e38e17282896b64", // Pop
"5fc729be5e38e17282844eff", // Dance
]
}
]
Given the tag "6018703624d8a5e8efa1b76e" (Rock), I want to query the Songs collection and find all songs that have Rock tag in their tags array.
In Mongo, this is the query I'm running:
db.songs.find({ tags: { $in: [ObjectId("6018703624d8a5e8efa1b76e")] }});
The performance is very bad (between 10 and 40 seconds, and getting worse as the collection grows). I tried indexing the collection in various ways (it contains more fields that are involved in the search, such as score and duration, but that's not relevant for now), but my queries still take too long. I can't explain it (and I have read a lot of official and unofficial material), but I have a feeling that holding the data in this nested form makes the index worthless and somehow still causes a full scan of the collection each time. I can't prove it, though (the Mongo "explain" output didn't really explain anything to me :) ).
I'm thinking of using Elasticsearch for this: sync all the song data to it and query it instead of Mongo, which would stay as the data SSOT and handle other lightweight ops.
But then the question remains open and I want to make sure: in Elasticsearch, can I hold the data in that form (a nested array inside the song), or do I need to represent it differently (e.g. flatten it so that every record is a song_tag entry, etc.)?
Thanks.
Elasticsearch doesn't offer a dedicated array type so what you'd typically do is define the mapping based on the type of the individual array items -- in your case a keyword:
PUT songs
{
"mappings": {
"properties": {
"tags": {
"type": "keyword"
}
}
}
}
Then you'd index the docs:
POST songs/_doc
{
"name": "Metallica - one",
"tags": [
"6018703624d8a5e8efa1b76e",
"601861cc8cef62ba86765017",
"5fda07ac8db0615c1c503a46"
]
}
and query the tags:
POST songs/_search
{
"query": {
"bool": {
"must": [
{ ... other queries },
{
"terms": {
"tags": [
"6018703624d8a5e8efa1b76e" // one or more
]
}
}
]
}
}
}
The tags are unique keywords but are not human-readable so you'd need to keep the map of them vs. the actual genres somewhere. Since the genres are probably set once and rarely, if ever, updated, you could use nested fields too. But your tags would then become an array of key-value pairs:
POST songs/_doc
{
"name": "Metallica - one",
"tags": [
{
"tag": "6018703624d8a5e8efa1b76e",
"genre": "Rock"
}
...
]
}
The mapping would be slightly different and so would be the queries but now you wouldn't need the translation map, plus you could query or aggregate by human-readable values -- tags.genre.
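A sketch of what the nested variant might look like, written as JavaScript objects that serialize to the request bodies. The mapping below (tag and genre both as keyword) and the helper name songsByGenre are assumptions for illustration; with a keyword field the genre has to match exactly, including case.

```javascript
// Assumed body for PUT songs: `tags` becomes a nested field whose items carry
// both the opaque tag id and the human-readable genre.
const nestedMapping = {
  mappings: {
    properties: {
      tags: {
        type: "nested",
        properties: {
          tag: { type: "keyword" },
          genre: { type: "keyword" },
        },
      },
    },
  },
};

// Hypothetical helper that builds the body for POST songs/_search.
// Nested fields have to be queried through a `nested` query with a `path`.
function songsByGenre(genre) {
  return {
    query: {
      nested: {
        path: "tags",
        query: { term: { "tags.genre": genre } },
      },
    },
  };
}

console.log(JSON.stringify(nestedMapping, null, 2));
console.log(JSON.stringify(songsByGenre("Rock"), null, 2));
```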
After creating entities in an Orion server, I can search by ID, either as a literal value or using regular expressions:
http://<localhost>:1026/v1/queryContext
Content:
{
"entities": [
{
"type": "Sensor",
"isPattern": "true",
"id": "sensor_1.*"
}
],
"attributes": ["temperature","humidity"]
}
In the above example I'd get all objects of type "Sensor" whose ID starts with "sensor_1", along with their attributes "temperature" and "humidity". I wonder if there is any way to search by a specific attribute value, for example to get those sensors whose humidity is over "60.2", or whether this selection must be done on the data retrieved by ID.
Not in the current Orion version (0.19.0), but it will be possible in the future. Have a look at the attribute::<name> filter with the = operator in this document.
I have different types of data that would be difficult to model and scale with a relational database (e.g., a product type)
I'm interested in using MongoDB to solve this problem.
I am referencing the documentation on MongoDB's website:
http://docs.mongodb.org/manual/tutorial/model-referenced-one-to-many-relationships-between-documents/
For the data type that I am storing, I also need to maintain a list of related IDs where this particular product is available (e.g., store location IDs).
In their example regarding "one-to-many relationships with embedded documents", they have the following:
{
name: "O'Reilly Media",
founded: 1980,
location: "CA",
books: [12346789, 234567890, ...]
}
I am currently importing the data with a spreadsheet, and want to use a batchInsert.
To avoid duplicates, I assume that:
1) I need to do an ensure index on the ID, and ignore errors on the insert?
2) Do I then need to loop through all the ID's to insert a new related ID to the books?
Your question could possibly be defined a little better, but let's consider the case that you have rows in a spreadsheet or other source that are all de-normalized in some way. So in a JSON representation the rows would be something like this:
{
"publisher": "O'Reilly Media",
"founded": 1980,
"location": "CA",
"book": 12346789
},
{
"publisher": "O'Reilly Media",
"founded": 1980,
"location": "CA",
"book": 234567890
}
So in order to get those sorts of rows into the structure you want, one way would be to use the "upsert" functionality of the .update() method.
Assuming you have some way of looping over the input values, and that they are identified with some structure, an analog would be something like:
books.forEach(function(book) {
db.publishers.update(
{
"name": book.publisher
},
{
"$setOnInsert": {
"founded": book.founded,
"location": book.location,
},
"$addToSet": { "books": book.book }
},
{ "upsert": true }
);
})
This essentially simplifies the code so that MongoDB does all of the data collection work for you. Where the "name" of the publisher is considered to be unique, the statement first searches the collection for a document that matches the given query condition, the "name".
If that document is not found, a new document is inserted. Either the database or the driver takes care of creating the new _id value for this document, and your "condition" is also automatically inserted into the new document, since it is an implied value that should exist.
The $setOnInsert operator says that those fields will only be set when a new document is created. The final part uses $addToSet to "push" book values that are not already present into the "books" array (or set).
The reason for the separation is the case where a document with the specified "publisher" name already exists. In this case, all of the fields under $setOnInsert are ignored, as they should already be in the document. So only the $addToSet operation is applied, adding the new entry to the "books" array (set) if it does not already exist.
So that is simpler logic than aggregating the new records in code before sending a new insert operation. However, it is not very "batch"-like, as you are still sending an operation to the server for each row.
This is addressed in MongoDB version 2.6 and above, as there is now the ability to do "batch" updates. So with a similar analog:
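var batch = [];
books.forEach(function(book) {
    batch.push({
        "q": { "name": book.publisher },
        "u": {
            "$setOnInsert": {
                "founded": book.founded,
                "location": book.location
            },
            "$addToSet": { "books": book.book }
        },
        "upsert": true
    });
    // Flush every 500 statements to keep each command a sensible size
    if ( ( batch.length % 500 ) == 0 ) {
        db.runCommand({ "update": "publishers", "updates": batch });
        batch = [];
    }
});
// Send whatever is left over; an empty "updates" array is not accepted
if ( batch.length > 0 ) {
    db.runCommand({ "update": "publishers", "updates": batch });
}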
What this does is collect the constructed update statements and send them to the server in a single call, with a sensible number of operations in each batch, in this case one call for every 500 items processed. The actual limit is the BSON document maximum of 16MB, so this can be altered as appropriate to your data.
If your MongoDB version is lower than 2.6 then you either use the first form or do something similar to the second form using the existing batch insert functionality. But if you choose to insert then you need to do all the pre-aggregation work within your code.
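For the insert route, the pre-aggregation could look roughly like the sketch below. The function name aggregateRows is hypothetical and the rows follow the de-normalized shape shown earlier. It only removes duplicates within the data being imported; it does not check what is already stored in the collection.

```javascript
// Groups de-normalized rows by publisher so that each publisher becomes one
// document with a de-duplicated `books` array, ready for a batch insert.
function aggregateRows(rows) {
  const byPublisher = new Map();
  for (const row of rows) {
    let entry = byPublisher.get(row.publisher);
    if (!entry) {
      // First row seen for this publisher: plays the role of $setOnInsert.
      entry = {
        name: row.publisher,
        founded: row.founded,
        location: row.location,
        books: new Set(),
      };
      byPublisher.set(row.publisher, entry);
    }
    // A Set ignores repeated values, mirroring $addToSet.
    entry.books.add(row.book);
  }
  // Convert each Set back to a plain array so the documents can be stored.
  return Array.from(byPublisher.values(), (e) => ({ ...e, books: Array.from(e.books) }));
}

const rows = [
  { publisher: "O'Reilly Media", founded: 1980, location: "CA", book: 12346789 },
  { publisher: "O'Reilly Media", founded: 1980, location: "CA", book: 234567890 },
  { publisher: "O'Reilly Media", founded: 1980, location: "CA", book: 12346789 },
];

console.log(JSON.stringify(aggregateRows(rows)));
```

The resulting array holds one document per publisher and could then be handed to whatever batch insert call your driver provides.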
All of the methods are of course supported with the PHP driver, so it is just a matter of adapting this to your actual code and which course you want to take.