mLab REST API - Query by string - rest

I recently set up a new database/collection with a simple JSON document on mLab. I enabled the Data API and am experimenting with filtering the results via the query parameter. The complete document is too large to host within my app, unfortunately.
Here is what the document looks like:
{
"_id": {
"$oid": "5c59f496hv7ec06f4f560f4c"
},
"songs": [
{
"title": "title 1",
"artist": "musician 1",
"album": "fake album",
"minsec": "2:04",
"songid": "11100"
},
{
"title": "fake title",
"artist": "musician 1",
"album": "album 2",
"minsec": "2:57",
"songid": "11102"
},
{
"title": "title 3",
"artist": "musician 2",
"album": "album 3",
"minsec": "3:06",
"songid": "11078"
},
{
"title": "title 4",
"artist": "fake musician",
"album": "album 4",
"minsec": "2:28",
"songid": "11103"
}
]
}
I would love to be able to search the document with a string and get back every object in the array that has a value containing that string. For example, searching for 'fake' would return the first, second and fourth objects, using the following URL:
https://api.mlab.com/api/1/databases/<my-db>/collections/<my-collection>?q=fake&apiKey=<my-apikey>
It appears that mLab's Data API only processes queries ("q=") written in JSON notation, and even knowing that, I still can't figure out how to return anything other than an empty array or a
"Could not parse JSON parameter, please double-check syntax and encoding"
error.
Thanks for any help you guys can offer!
Update: I inserted each song object as an individual document, rather than keeping a single large document, and can filter them using specific queries. I'm still unsure how to implement a more wildcard-style solution that filters results by values that include/contain the query.
For anyone seeking assistance with mlab related issues, I would suggest reaching out to their very helpful support team.
Final Update: I figured out how to use regex to filter my results satisfactorily:
{$regex: '(?i).*<string>.*'}

If you are using the mLab Data API, use a regex to filter the results:
https://api.mlab.com/api/1/databases/<my-db>/collections/<my-collection>?apiKey=<my-apikey>&q={"<key>":{"$regex":"(?i).*<value>.*"}}
The (?i) flag makes the match case-insensitive.
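Since the "Could not parse JSON parameter" error mentions both syntax and encoding, here is a minimal Python sketch of one way to build such a URL so that the q value is strict JSON and percent-encoded. The function name and the placeholder database, collection and key values are made up for illustration; only the URL layout and the $regex filter come from the posts above. Escaping the search text with re.escape is my own addition and assumes the server's regex engine accepts backslash-escaped punctuation.

```python
import json
import re
from urllib.parse import urlencode, urlsplit, parse_qs

# URL layout taken from the question; the values filled in below are placeholders.
BASE = "https://api.mlab.com/api/1/databases/{db}/collections/{coll}"


def build_regex_query_url(db, coll, api_key, key, value):
    """Build a URL whose q parameter is a case-insensitive 'contains' filter."""
    # re.escape keeps characters such as '.' or '(' in the search text literal.
    pattern = "(?i).*" + re.escape(value) + ".*"
    query = {key: {"$regex": pattern}}
    # json.dumps emits strict JSON (double quotes); urlencode percent-encodes it.
    params = urlencode({"q": json.dumps(query), "apiKey": api_key})
    return BASE.format(db=db, coll=coll) + "?" + params


url = build_regex_query_url("my-db", "my-collection", "my-apikey", "title", "fake")

# Decode the q parameter again to confirm it survived the round trip as JSON.
decoded = json.loads(parse_qs(urlsplit(url).query)["q"][0])
print(url)
print(decoded)
```

Note that this filters on one key at a time; matching 'fake' in any field would need one such clause per field combined with $or.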

Related

Generate a JSON schema from an existing MongoDB collection

I have a MongoDB collection that contains a lot of documents. They are all roughly in the same format, though some of them are missing some properties while others are missing other properties. So for example:
[
{
"_id": "SKU14221",
"title": "Some Product",
"description": "Product Description",
"salesPrice": 19.99,
"specialPrice": 17.99,
"marketPrice": 22.99,
"puchasePrice": 12,
"currency": "USD",
"color": "red"
},
{
"_id": "SKU14222",
"title": "Another Product",
"description": "Product Description",
"salesPrice": 29.99,
"currency": "USD",
"size": "40"
}
]
I would like to automatically generate a schema from the collection. Ideally it would note which properties are present in all the documents and mark those as required. Detecting unique columns would also be nice, though not really all that necessary. In any event I would be modifying the schema after it's automatically generated.
I've noticed that there are tools that can do this for JSON. But short of downloading the entire collection as JSON, is it possible to do this using the MongoDB console or a CLI tool directly from the collection?
You could try this tool out. It appears to do exactly what you want.
Extract (and visualize) schema from Mongo database, including foreign
keys. Output is simple json file or html with dagre/d3.js diagram
(depending on command line options).
https://www.npmjs.com/package/extract-mongo-schema
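If you only need the "which properties are required" part, the idea is simple enough to sketch by hand. This is not how extract-mongo-schema works internally (I have not checked); it is just a minimal Python illustration that assumes the documents are already available as dicts, for example by iterating over a pymongo cursor. The function name infer_schema and the output layout are my own choices.

```python
def infer_schema(documents):
    """Infer a rough JSON-Schema-like description from a list of dict documents."""
    type_names = {
        str: "string", bool: "boolean", int: "integer", float: "number",
        list: "array", dict: "object", type(None): "null",
    }
    properties = {}
    required = None
    for doc in documents:
        keys = set(doc)
        # A key stays "required" only while every document seen so far has it.
        required = keys if required is None else required & keys
        for key, value in doc.items():
            json_type = type_names.get(type(value), "string")
            properties.setdefault(key, set()).add(json_type)
    return {
        "type": "object",
        "properties": {k: {"type": sorted(v)} for k, v in sorted(properties.items())},
        "required": sorted(required or []),
    }


# The two sample products from the question (keys kept exactly as posted).
products = [
    {"_id": "SKU14221", "title": "Some Product", "description": "Product Description",
     "salesPrice": 19.99, "specialPrice": 17.99, "marketPrice": 22.99,
     "puchasePrice": 12, "currency": "USD", "color": "red"},
    {"_id": "SKU14222", "title": "Another Product", "description": "Product Description",
     "salesPrice": 29.99, "currency": "USD", "size": "40"},
]

schema = infer_schema(products)
print(schema["required"])
```

On a large collection you would probably run this over a sample rather than every document, and then hand-edit the result as the question already plans to do.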

MongoDB - how to properly model relations

Let's assume we have the following collections:
Users
{
"id": MongoId,
"username": "jsloth",
"first_name": "John",
"last_name": "Sloth",
"display_name": "John Sloth"
}
Places
{
"id": MongoId,
"name": "Conference Room",
"description": "Some longer description of this place"
}
Meetings
{
"id": MongoId,
"name": "Very important meeting",
"place": <?>,
"timestamp": "1506493396",
"created_by": <?>
}
Later on, we want to return (e.g. from a REST web service) a list of upcoming meetings like this:
[
{
"id": MongoId(Meetings),
"name": "Very important meeting",
"created_by": {
"id": MongoId(Users),
"display_name": "John Sloth",
},
"place": {
"id": MongoId(Places),
"name": "Conference Room",
}
},
...
]
It's important to return the basic information that needs to be displayed on the main page of the web UI (so no additional calls are needed to render the table). That's why each entry contains the display_name of the user who created it and the name of the place. I think that's a pretty common scenario.
Now my question is: how should I store this information in the db (the question mark values in the Meeting document)? I see 2 options:
1) Store references to other collections:
place: MongoId(Places)
(+) data is always consistent
(-) additional calls to db have to be made in order to construct the response
2) Denormalize data:
"place": {
"id": MongoId(Places),
"name": "Conference room",
}
(+) no need for additional calls (response can be constructed based on one document)
(-) data must be updated each time related documents are modified
What is the proper way of dealing with such a scenario?
If I use option 1), how should I query other documents? Asking about each related document separately seems like overkill. How about getting the last 20 meetings, collecting the list of related document IDs and then performing a query like db.users.find({_id: { $in: <id list> }})?
If I go for option 2), how should I keep the data in sync?
Thanks in advance for any advice!
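To make the batching idea from option 1) concrete, here is a small Python sketch. It is an illustration only: the sample meetings and users are invented, the helper names are hypothetical, and the database call is simulated with a list comprehension so the snippet runs without a server. With pymongo, the filter would be passed to db.users.find(...) instead.

```python
# Hypothetical sample data standing in for the Meetings and Users collections.
meetings = [
    {"_id": "m1", "name": "Very important meeting", "place": "p1", "created_by": "u1"},
    {"_id": "m2", "name": "Retrospective", "place": "p1", "created_by": "u2"},
]
users = [
    {"_id": "u1", "display_name": "John Sloth"},
    {"_id": "u2", "display_name": "Jane Doe"},
    {"_id": "u3", "display_name": "Unused User"},
]


def in_filter(docs, field):
    """Build one {$in: [...]} filter from the distinct IDs stored under `field`."""
    return {"_id": {"$in": sorted({d[field] for d in docs})}}


def attach(docs, field, related, keep):
    """Replace each stored ID with a small sub-document built from the fetched docs."""
    by_id = {r["_id"]: r for r in related}
    out = []
    for d in docs:
        r = by_id[d[field]]
        out.append({**d, field: {"id": r["_id"], **{k: r[k] for k in keep}}})
    return out


user_filter = in_filter(meetings, "created_by")
# Simulated query; with a real database: fetched = list(db.users.find(user_filter))
fetched = [u for u in users if u["_id"] in user_filter["_id"]["$in"]]
result = attach(meetings, "created_by", fetched, ["display_name"])
print(result)
```

This costs one extra query per related collection regardless of how many meetings are on the page, which is usually acceptable for a list of 20.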
You can keep the DB model you already have and still do only a single query, as MongoDB introduced the $lookup aggregation stage in version 3.2. It is similar to a join in an RDBMS.
$lookup
Performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing. The $lookup stage does an equality match between a field from the input documents with a field from the documents of the “joined” collection.
So there is no need to copy data from the other collections into each meeting: just store the related document's ID and let $lookup resolve it at query time.
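A rough sketch of what such a pipeline could look like, written as plain Python data in the form pymongo's aggregate() accepts. The collection names (users, places), the assumption that place and created_by hold the related _id values, and the sort/limit choice are mine, not something the question confirms. The pipeline is only built here, not executed, because running it needs a live server.

```python
def lookup_one(collection, field):
    """Return a $lookup + $unwind pair that swaps the ID in `field` for the joined document."""
    return [
        # "as" reuses the field name, so the stored ID is overwritten by the match array.
        {"$lookup": {"from": collection, "localField": field,
                     "foreignField": "_id", "as": field}},
        # $unwind turns the one-element array into an object; meetings whose
        # reference matches nothing are dropped by this stage.
        {"$unwind": "$" + field},
    ]


pipeline = [
    {"$sort": {"timestamp": -1}},
    {"$limit": 20},
    *lookup_one("users", "created_by"),
    *lookup_one("places", "place"),
    # Keep only the fields the list view needs.
    {"$project": {
        "name": 1,
        "created_by._id": 1, "created_by.display_name": 1,
        "place._id": 1, "place.name": 1,
    }},
]

# With a real connection (assumed names): results = list(db.meetings.aggregate(pipeline))
print(len(pipeline))
```

Putting $sort and $limit first keeps the joins restricted to the 20 meetings that will actually be returned.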

Validate referential integrity of object arrays with Joi

I'm trying to validate that the data I am returned is sensible. Validating data types is done. Now I want to validate that I've received all of the data needed to perform a task.
Here's a representative example:
{
"things": [
{
"id": "00fb60c7-520e-4228-96c7-13a1f7a82749",
"name": "Thing 1",
"url": "https://lolagons.com"
},
{
"id": "709b85a3-98be-4c02-85a5-e3f007ce4bbf",
"name": "Thing 2",
"url": "https://lolfacts.com"
}
],
"layouts": {
"sections": [
{
"id": "34f10988-bb3d-4c38-86ce-ed819cb6daee",
"name": "Section 1",
"content": [
{
"type": 2,
"id": "00fb60c7-520e-4228-96c7-13a1f7a82749" //Ref to Thing 1
}
]
}
]
}
}
So every Section references 0+ Things, and I want to validate that every id value returned in the Content of Sections also exists as an id in Things.
The docs for Object.assert(..) imply that I need a concrete reference. Even if I do the validation within the Object.keys or Array.items, I can't resolve the reference at the other end.
Not that it matters, but my context is that I'm validating HTTP responses within IcedFrisby, a Frisby.js fork.
This wasn't really solvable in the way I asked (i.e. with Joi).
I solved this for my context by writing a plugin for icedfrisby (published on npm here) which uses jsonpath to fetch each id in Content and each id in Things. The plugin will then assert that all of the first set exist within the second.
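The check itself is small once it is done outside the schema validator. The following Python sketch is not the plugin's code and uses plain dict traversal instead of jsonpath; the function name missing_references is hypothetical. It collects every id under things, then reports each content id that has no match.

```python
def missing_references(payload):
    """Return (section id, content id) pairs whose content id is not a known Thing."""
    thing_ids = {t["id"] for t in payload.get("things", [])}
    missing = []
    for section in payload.get("layouts", {}).get("sections", []):
        for item in section.get("content", []):
            if item["id"] not in thing_ids:
                missing.append((section["id"], item["id"]))
    return missing


# Trimmed version of the example from the question.
payload = {
    "things": [
        {"id": "00fb60c7-520e-4228-96c7-13a1f7a82749", "name": "Thing 1"},
        {"id": "709b85a3-98be-4c02-85a5-e3f007ce4bbf", "name": "Thing 2"},
    ],
    "layouts": {
        "sections": [
            {
                "id": "34f10988-bb3d-4c38-86ce-ed819cb6daee",
                "name": "Section 1",
                "content": [
                    {"type": 2, "id": "00fb60c7-520e-4228-96c7-13a1f7a82749"},
                ],
            }
        ]
    },
}

problems = missing_references(payload)
print(problems)
```

An empty list means every reference resolves; a test can simply assert that.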

Getting poll results from Facebook Graph

How can I (if I can) get the results of a poll/question from Facebook graph? Currently I get back something similar to what is below:
"data": [
{
"id": "12345_12345",
"from": {
"name": "My Company Name",
"category": "Category",
"id": "12345"
},
"story": "This is my question",
"icon": "https://s-static.ak.facebook.com/rsrc.php/v1/yy/r/pz5wRf7MB0H.png",
"privacy": {
"description": "Public",
"value": "EVERYONE"
},
"type": "question",
"object_id": "12345",
"application": {
"name": "Questions",
"id": "12345"
},
"created_time": "2012-04-25T12:23:03+0000",
"updated_time": "2012-04-25T12:23:03+0000",
"comments": {
"count": 0
}
}
Can I get more information back about this question? I'm currently using PHP + CURL to get the feed.
Thanks!
From the looks of it, you have queried for the specific post (post-id:12345_12345) : https://graph.facebook.com/12345_12345 .
To get to the question's data we have to query for the question id that is given in this post's data itself:
"type": "question",
"object_id": "12345",
From here we have the question's id, i.e. object_id:12345. Using this id we can get the question's info, so the query URL is: https://graph.facebook.com/12345.
In the question's returned info, we'll also have the options of the poll, it'll be a field named options. Each option's info will be given within this field, and each option has a votes field, which will tell you the number of votes that option received. So you'll have the results of the poll.
Use the Graph API Explorer to test these things before you code them. And do read the documentation links to know more about questions.
In general the metadata=1 GET parameter tells you if there is more related data available.
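Once the question object has been fetched, tallying the results is a matter of reading the votes field of each option. The Python sketch below works on an already-decoded response and makes assumptions about its shape: that options is either a list or an object with a data list, and that each option carries a name next to votes. Only the options and votes field names come from the answer above; the sample values are invented.

```python
def poll_results(question):
    """Map each option's label to its vote count from a decoded question object."""
    options = question.get("options", [])
    # Assumption: connections may be wrapped as {"data": [...]}.
    if isinstance(options, dict):
        options = options.get("data", [])
    # "name" as the option label is an assumption; "votes" is named in the answer.
    return {opt["name"]: opt.get("votes", 0) for opt in options}


# Hypothetical response for https://graph.facebook.com/12345
sample_question = {
    "id": "12345",
    "question": "This is my question",
    "options": {"data": [
        {"name": "Yes", "votes": 3},
        {"name": "No", "votes": 1},
    ]},
}

results = poll_results(sample_question)
print(results)
```

Check the real field names in the Graph API Explorer before relying on this shape.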

mongodb best practice: nesting

Is this example of nesting generally accepted as good or bad practice (and why)?
A collection called users:
user
  basic
    name : value
    url : value
  contact
    email
      primary : value
      secondary : value
    address
      en-gb
        address : value
        city : value
        state : value
        postalcode : value
        country : value
      es
        address : value
        city : value
        state : value
        postalcode : value
        country : value
Edit: From the answers in this post I've updated the schema applying the following rules (the data is slightly different from above):
Nest, but only one level deep
Remove unnecessary keys
Make use of arrays to make objects more flexible
{
"_id": ObjectId("4d67965255541fa164000001"),
"name": {
"0": {
"name": "Joe Bloggs",
"il8n": "en"
}
},
"type": "musician",
"url": {
"0": {
"name": "joebloggs",
"il8n": "en"
}
},
"tags": {
"0": {
"name": "guitar",
"points": 3,
"il8n": "en"
}
},
"email": {
"0": {
"address": "joe.bloggs@example.com",
"name": "default",
"primary": 1,
"il8n": "en"
}
},
"updates": {
"0": {
"type": "news",
"il8n": "en"
}
},
"address": {
"0": {
"address": "1 Some street",
"city": "Somecity",
"state": "Somestate",
"postalcode": "SOM STR",
"country": "UK",
"lat": 49.4257641,
"lng": -0.0698241,
"primary": 1,
"il8n": "en"
}
},
"phone": {
"0": {
"number": "+44 (0)123 4567 890",
"name": "Home",
"primary": 1,
"il8n": "en"
},
"1": {
"number": "+44 (0)098 7654 321",
"name": "Mobile",
"il8n": "en"
}
}
}
Thanks!
In my opinion the above schema is not 'generally accepted', but it looks fine. I would still suggest some improvements that will help you query your documents in the future:
User
Name
Url
Emails {email, emailType(primary, secondary)}
Addresses{address, city, state, postalcode, country, language}
Nesting is good, but nesting two or three levels deep can cause additional trouble when querying and updating.
I hope my suggestions help you make the right choice of schema design.
You may want to take a look at schema design in MongoDB, and specifically the advice on embedding vs. references.
Embedding is preferred as "Data is then colocated on disk; client-server turnarounds to the database are eliminated". If the parent object is in RAM, then access to the nested objects will always be fast.
In my experience, I've never found any "best practices" for what a MongoDB record actually looks like. The question to really answer is, "Does this MongoDB schema allow me to do what I need to do?"
For example, if you had a list of addresses and needed to update one of them, it'd be a pain since you'd need to iterate through all of them or know at which position a particular address is located. You're safe from that since each address has its own key.
However, I'd say nix the basic and contact keys. What do these really give you? If you index name, it'd be basic.name rather than just name. AFAIK, there are some performance impacts to long vs. short key names.
Keep it simple enough to do what you need to do. Try something out and iterate on it...you won't get it right the first time, but the nice thing about mongo is that it's relatively easy to rework your schema as you go.
That is acceptable practice. There are some problems with nesting an array inside of an array. See SERVER-831 for one example. However, you don't seem to be using arrays in your collection at all.
Conversely, if you were to break this up into multiple collections, you would have to deal with a lack of transactions and the resulting race conditions in your data access code.