Error on querying FIWARE Orion with orderBy custom attribute - fiware-orion

I am facing the following issue while querying Orion with orderBy, asking for the results to be returned in chronological order. My data looks like this:
{
    "id": "some_unique_id",
    "type": "sensor",
    "someVariable": {
        "type": "String",
        "value": "1",
        "metadata": {}
    },
    "datetime": {
        "type": "DateTime",
        "value": "2018-09-21T00:38:57.00Z",
        "metadata": {}
    },
    "humidity": {
        "type": "Float",
        "value": 55.9,
        "metadata": {}
    },
    "someMoreVariables": {
        "type": "Float",
        "value": 6.29,
        "metadata": {}
    }
}
My call looks like this:
http://my.ip.com:1026/v2/entities?type=sensor&offset=SOMENUMBERMINUS30&limit=30&orderBy=datetime
Unfortunately, the response is the following:
{
    "error": "InternalServerError",
    "description": "Error at querying MongoDB"
}
Both tenant and subtenant are used in the call, the Orion version is 1.13.0-next, and the tenant has been indexed inside MongoDB. I am running Orion and MongoDB in separate Docker containers on the same server.
As always, any help will be highly appreciated.
EDIT1: After fgalan's recommendation, I am adding the relevant records from the log (I am sorry, I didn't do it from the beginning):
BadInput some.ip
time=2018-10-16T07:47:36.576Z | lvl=ERROR | corr=bf8461dc-d117-11e8-b0f1-0242ac110003 | trans=1539588749-153-00000013561 | from=some.ip | srv=some_tenant | subsrv=/some_subtenant | comp=Orion | op=AlarmManager.cpp[211]:dbError | msg=Raising alarm DatabaseError: nextSafe(): { $err: "Executor error: OperationFailed: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.", code: 17144 }
time=2018-10-16T07:47:36.576Z | lvl=ERROR | corr=bf8461dc-d117-11e8-b0f1-0242ac110003 | trans=1539588749-153-00000013561 | from=some.ip | srv=some_tenant | subsrv=/some_subtenant | comp=Orion | op=AlarmManager.cpp[235]:dbErrorReset | msg=Releasing alarm DatabaseError
From the above, it is clear that indexing is required. I have already done that according to fgalan's answer to another question I had in the past: Indexing Orion
EDIT2: The indexes on the entities collection after indexing (db.entities.getIndexes() output):
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "orion.entities"
    },
    {
        "v" : 2,
        "key" : {
            "location.coords" : "2dsphere"
        },
        "name" : "location.coords_2dsphere",
        "ns" : "orion.entities",
        "2dsphereIndexVersion" : 3
    },
    {
        "v" : 2,
        "key" : {
            "creDate" : 1
        },
        "name" : "creDate_1",
        "ns" : "orion.entities"
    }
]

You have an index {creDate: 1}, which is fine if you order by entity creation date using dateCreated (or don't specify the orderBy parameter, as creation date is the default ordering):
GET /v2/entities
GET /v2/entities?orderBy=dateCreated
However, if you plan to order by a different attribute defined by you (as I understand datetime is) and you get the OperationFailed: Sort operation used more than the maximum error, then you have to create an index on the value of that attribute. In particular, you have to create this index:
{ "attrs.datetime.value": 1 }
EDIT: as suggested by a comment to this answer, the command for creating the above index typically is:
db.entities.createIndex({"attrs.datetime.value": 1});
EDIT2: have a look at this section in the documentation for more detail on this kind of index.
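To double-check from the mongo shell that the sort actually picks up the new index, an explain() run can help (a sketch, assuming the datetime attribute from the question; run against Orion's database):

db.entities.find().sort({ "attrs.datetime.value": 1 }).explain("executionStats")
// The winningPlan should show an IXSCAN on attrs.datetime.value_1
// instead of an in-memory SORT stage.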

Related

I get an error trying to import the rcdb tutorial json data to MongoDB via Compass

I am trying to import JSON data into MongoDB via Compass and I get this error
Unexpected token ":" (0x3A) in JSON at position 10 while parsing near " \"_id\" : ObjectId(\"57efaead..." in C:\Users\Michael.Pares\source\repos\forge-rcdb.nodejs\resources\db\rcdb.models.json
> 1 | "_id" : ObjectId("57efaead77c8eb0a560ef465"),
| ^
2 | "name" : "Car Seat",
3 | "env" : "Local",
4 | "layout" : {
Here is what the JSON looks like
{
    "_id" : ObjectId("57efaead77c8eb0a560ef465"),
    "name" : "Car Seat",
    "env" : "Local",
    "layout" : {
        "type" : "flexLayoutRight",
        "rightFlex" : 0.35
    },
    "model" : {
        "urn" : "dXJuOmFkc2sub2JqZWN0czpvcy5vYmplY3Q6bGVlZnNtcC1mb3JnZS9zZWF0LmR3Zg",
        "path": "https://sbhehe.github.io/sb233/carseat/0.svf",
        "name" : "Car Seat"
    },
Any idea why this is happening?
In this case, you can use the option ADD DATA > Insert Document to verify that your schema is correct.
For that, you won't need the ObjectId() wrapper to pass the id of the document: ObjectId() is mongo shell syntax, not valid JSON, which is why the parser chokes on it.
The JSON below is in the proper format.
{
    "_id": "57efaead77c8eb0a560ef465",
    "name": "Car Seat",
    "env": "Local",
    "layout": {
        "type": "flexLayoutRight",
        "rightFlex": 0.35
    },
    "model": {
        "urn": "dXJuOmFkc2sub2JqZWN0czpvcy5vYmplY3Q6bGVlZnNtcC1mb3JnZS9zZWF0LmR3Zg",
        "path": "https://sbhehe.github.io/sb233/carseat/0.svf",
        "name": "Car Seat"
    }
}
If you compare it with the one you have (with a text-compare tool), you can see that yours is also missing one } at the end of the JSON.
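If you do want to keep a real ObjectId rather than a plain string, MongoDB's Extended JSON notation is also accepted by Compass and mongoimport (a sketch using the id from the question):

{
    "_id": { "$oid": "57efaead77c8eb0a560ef465" },
    "name": "Car Seat"
}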

What will be the mongodb query for getting array of objects values in separate rows?

This is my mongo record, where roles is an array of objects. I want the short code of each role in a separate row.
{
    "_id" : ObjectId("111111111111111111111111"),
    "roles" : [
        {
            "name" : "Computer Programme Manager",
            "shortCode" : "COMP"
        },
        {
            "name" : "Technical Manager",
            "shortCode" : "TEMR"
        },
        {
            "name" : "Technical-Civil",
            "shortCode" : "TEMR"
        }
    ],
    "deptDbValue" : "i_a",
    "deptDisplayValue" : "IA",
    "deptShortCode" : "gic"
}
I want all the roles row-wise. I tried this query:
db.departments.distinct("roles.shortCode");
which gives each role in a separate row, which is correct, but how can I get the other properties like deptShortCode, deptDbValue, etc.?
For example, I wanted like this:
id                | role_name         | role_shortcode
ObjectId("111..") | Computer Prog     | COMP
ObjectId("111..") | Technical Manager | TEMR
ObjectId("111..") | Technical-Civil   | TEMR
Any suggestions?
The output format of a MongoDB query is typically JSON, and the tabular output format you are suggesting is not possible.
But the data you intend to have in the output can be present in the response of your query. For example:
{ "_id" : ObjectId("111111111111111111111111"), "name" : "Computer Programme Manager", "shortCode" : "COMP" }
{ "_id" : ObjectId("111111111111111111111111"), "name" : "Technical Manager", "shortCode" : "TEMR" }
{ "_id" : ObjectId("111111111111111111111111"), "name" : "Technical-Civil", "shortCode" : "TEMR" }
This is quite close to the output you want, and it is correct JSON.
You can achieve this output with $unwind (which multiplies the documents, one per array element) and then $project (which reshapes each single document) in an aggregate query like this one:
db.departments.aggregate([
    {
        $unwind: {
            "path": "$roles"
        }
    },
    {
        $project: {
            "name": "$roles.name",
            "shortCode": "$roles.shortCode"
        }
    }
])
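If you also need the department fields in each row, it should be enough to carry them through the $project stage (a sketch based on the document above):

db.departments.aggregate([
    { $unwind: { "path": "$roles" } },
    {
        $project: {
            "name": "$roles.name",
            "shortCode": "$roles.shortCode",
            "deptShortCode": 1,  // keep the top-level department fields
            "deptDbValue": 1
        }
    }
])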
Here is some information on the unwind command:
https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/
and on the project command:
https://docs.mongodb.com/manual/reference/operator/aggregation/project/

Elasticsearch how to join publications and keywords

I have defined two indexes in Elasticsearch that are populated with two different queries coming from a Postgres database. I have many hundreds of documents with thousands of keywords, and I have used Logstash to populate the two indexes.
The first index is called publication and is defined as follows:
"mappings" : {
"doc" : {
"properties" : {
"external_id" : {"type": "text" },
"title" : {"type": "text", "analyzer":"english" },
"description" : { "type" : "text", "analyzer":"english" }
}
}
}
The second index is called keyword and is defined as follows:
"mappings" : {
"doc" : {
"properties" : {
"publication_id" : {"type": "keyword" },
"keyword" : {"type": "keyword" }
}
}
}
The relationship between the two indexes is based on the external_id <-> publication_id.
I am trying to define other indexes in a way that lets me locate all the publications that have a specific keyword, or all the keywords that are defined for a specific publication.
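Elasticsearch has no server-side join across separate indexes, so the usual pattern for this layout is an application-side join: one query per index, keyed on external_id <-> publication_id. A sketch under the mappings above (the keyword value "nosql" and the ids are only illustrations, not from the original post):

GET keyword/_search
{
    "query": { "term": { "keyword": "nosql" } },
    "_source": ["publication_id"]
}

GET publication/_search
{
    "query": { "terms": { "external_id": ["id-1", "id-2"] } }
}

The second request takes the publication_id values returned by the first. Note that external_id is mapped as text, so exact-id matching is more reliable if that field is remapped as keyword.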

MongoDB aggregation query to split and convert JSON?

I have a JSON file with a horrific data structure
{ "#timestamp" : "20160226T065604,39Z",
"#toplevelentries" : "941",
"viewentry" : [ { "#noteid" : "161E",
"#position" : "1",
"#siblings" : "941",
"entrydata" : [
and entrydata is a list of 941 entries, each of which look like this
{ "#columnnumber" : "0",
"#name" : "$Created",
"datetime" : { "0" : "20081027T114133,55+01" }
},
{ "#columnnumber" : "1",
"#name" : "WriteLog",
"textlist" : { "text" : [ { "0" : "2008.OCT.28 12:54:39 CET # EMI" },
{ "0" : "2008.OCT.28 12:56:13 CET # EMI" },
There are many more columns. The structure is always this:
{
    "#columnnumber": "17",
    "#name": "PublicDocument",
    "text": {
        "0": "TMI-1-2005.pdf"
    }
}
There's a column number, which we can throw away; a #name, which is the important part; then one of text, datetime or textlist fields, where the value is always this weird subdocument with a 0 key holding the actual value.
All 941 entries have the same number of these column entries, and each column entry always has the same structure. I.e. if "#columnnumber": "13" has a #name of foo, then it will always be foo, and if it has a datetime key then it will always have a datetime key, never a text or textlist. This monster was born out of a SQL or similar database somewhere at the very far end of the pipeline, but I have no access to the source beyond this. The goal is to revert the transformation and make it into something a SELECT statement would produce (except textlist, although I guess array_agg and similar could produce that too).
Is there a way to get 941 separate JSON entries out of MongoDB looking like:
{
    $Created: "20081027T114133,55+01",
    WriteLog: ["2008.OCT.28 12:54:39 CET # EMI", "2008.OCT.28 12:56:13 CET # EMI"],
    PublicDocument: "TMI-1-2005.pdf"
}
Is viewentry also a list?
If you do an aggregate on the collection and $unwind on viewentry.entrydata, you will get one document for every entrydata. It should then be possible to do a $project to reformat these documents and produce the output you need.
This is a nice challenge. To get output like this:
{
    "_id" : "161E",
    "field" : [
        {
            "name" : "$Created",
            "datetime" : {
                "0" : "20081027T114133,55+01"
            }
        },
        {
            "name" : "WriteLog",
            "textlist" : {
                "text" : [
                    {
                        "0" : "2008.OCT.28 12:54:39 CET# EMI"
                    },
                    {
                        "0" : "2008.OCT.28 12:56:13 CET# EMI"
                    }
                ]
            }
        }
    ]
}
use this aggregation pipeline:
db.chx.aggregate([
    { $unwind: "$viewentry" },
    { $unwind: "$viewentry.entrydata" },
    {
        $group: {
            "_id": "$viewentry.#noteid",
            field: {
                $push: {
                    "name": "$viewentry.entrydata.#name",
                    datetime: "$viewentry.entrydata.datetime",
                    textlist: "$viewentry.entrydata.textlist"
                }
            }
        }
    }
]).pretty()
The next step should be extracting the log entries, but I have no idea how, as my brain is already fried tonight - so I'll probably return later...
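In the meantime, a possible continuation for the WriteLog part (a sketch, untested; it assumes every textlist entry keeps the { "0": value } wrapper shown above):

db.chx.aggregate([
    { $unwind: "$viewentry" },
    { $unwind: "$viewentry.entrydata" },
    { $match: { "viewentry.entrydata.#name": "WriteLog" } },
    {
        $project: {
            _id: "$viewentry.#noteid",
            WriteLog: {
                $map: {
                    input: "$viewentry.entrydata.textlist.text",
                    as: "t",
                    in: "$$t.0"  // unwrap the { "0": <value> } subdocument
                }
            }
        }
    }
])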

Pivot rows to columns in MongoDB

The relevant question is Efficiently convert rows to columns in sql server. But the answer is specific to SQL.
I want the same result, i.e. pivot rows to columns without aggregating anything (as of now), in MongoDB.
The collection looks something as below. These are statistics of facebook page properties:
timestamp     | propName     | propValue
----------------------------------------
1371798000000 | page_fans    | 100
1371798000000 | page_posts   | 50
1371798000000 | page_stories | 25
I need an answer like:
timestamp     | page_fans | page_posts | page_stories
------------------------------------------------------
1371798000000 | 100       | 50         | 25
The column names are pre-determined. They don't have to be generated dynamically. But question is how to achieve this in MongoDB.
I believe aggregation is of no use for this purpose. Do I need to use MapReduce? But in that case I have nothing to reduce, I guess. Another option could be fetching these values in code and doing the manipulation in a programming language, e.g. Java.
Any insights would be helpful. Thanks in advance :)!!!
EDIT (Based on input from Schaliasos):
Input JSON:
{
    "_id" : ObjectId("51cd366644aeac654ecf8f75"),
    "name" : "page_storytellers",
    "pageId" : "512f993a44ae78b14a9adb85",
    "timestamp" : NumberLong("1371798000000"),
    "value" : NumberLong(30871),
    "provider" : "Facebook"
}
{
    "_id" : ObjectId("51cd366644aeac654ecf8f76"),
    "name" : "page_fans",
    "pageId" : "512f993a44ae78b14a9adb85",
    "timestamp" : NumberLong("1371798000000"),
    "value" : NumberLong(1291509),
    "provider" : "Facebook"
}
{
    "_id" : ObjectId("51cd366644aeac654ecf8f77"),
    "name" : "page_fan_adds",
    "pageId" : "512f993a44ae78b14a9adb85",
    "timestamp" : NumberLong("1371798000000"),
    "value" : NumberLong(2829),
    "provider" : "Facebook"
}
Expected Output JSON:
{
    "timestamp" : NumberLong("1371798000000"),
    "provider" : "Facebook",
    "page_storytellers" : NumberLong(30871),
    "page_fans" : NumberLong(1291509),
    "page_fan_adds" : NumberLong(2829)
}
Now, you can utilise the new aggregation operator $arrayToObject to pivot MongoDB keys. This operator is available in MongoDB v3.4.4+.
For example, given an example data of:
db.foo.insert({ provider: "Facebook", timestamp: '1371798000000', name: 'page_storytellers', value: 20871})
db.foo.insert({ provider: "Facebook", timestamp: '1371798000000', name: 'page_fans', value: 1291509})
db.foo.insert({ provider: "Facebook", timestamp: '1371798000000', name: 'page_fan_adds', value: 2829})
db.foo.insert({ provider: "Google", timestamp: '1371798000000', name: 'page_fan_adds', value: 1000})
You can utilise Aggregation Pipeline below:
db.foo.aggregate([
    {
        $group: {
            _id: { provider: "$provider", timestamp: "$timestamp" },
            items: { $addToSet: { name: "$name", value: "$value" } }
        }
    },
    {
        $project: {
            tmp: {
                $arrayToObject: {
                    $zip: { inputs: ["$items.name", "$items.value"] }
                }
            }
        }
    },
    {
        $addFields: {
            "tmp.provider": "$_id.provider",
            "tmp.timestamp": "$_id.timestamp"
        }
    },
    { $replaceRoot: { newRoot: "$tmp" } }
]);
The output would be:
{
    "page_fan_adds": 1000,
    "provider": "Google",
    "timestamp": "1371798000000"
},
{
    "page_fan_adds": 2829,
    "page_fans": 1291509,
    "page_storytellers": 20871,
    "provider": "Facebook",
    "timestamp": "1371798000000"
}
See also $group, $project, $addFields, $zip, and $replaceRoot.
I have done something like this using aggregation. Could this help?
db.foo.insert({ timestamp: '1371798000000', propName: 'page_fans', propValue: 100 })
db.foo.insert({ timestamp: '1371798000000', propName: 'page_posts', propValue: 50 })
db.foo.insert({ timestamp: '1371798000000', propName: 'page_stories', propValue: 25 })
db.foo.aggregate({ $group: { _id: '$timestamp', result: { $push: { 'propName': '$propName', 'propValue': '$propValue' } } } })
{
    "result" : [
        {
            "_id" : "1371798000000",
            "result" : [
                {
                    "propName" : "page_fans",
                    "propValue" : 100
                },
                {
                    "propName" : "page_posts",
                    "propValue" : 50
                },
                {
                    "propName" : "page_stories",
                    "propValue" : 25
                }
            ]
        }
    ],
    "ok" : 1
}
You may want to use the $sum operator along the way. See here
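Since the column names are pre-determined, one classic way to finish the pivot (a sketch, not from the answer above; it assumes the sample documents just inserted) is a conditional $sum per property:

db.foo.aggregate([
    {
        $group: {
            _id: "$timestamp",
            // each field sums the value only when propName matches, else adds 0
            page_fans:    { $sum: { $cond: [{ $eq: ["$propName", "page_fans"] },    "$propValue", 0] } },
            page_posts:   { $sum: { $cond: [{ $eq: ["$propName", "page_posts"] },   "$propValue", 0] } },
            page_stories: { $sum: { $cond: [{ $eq: ["$propName", "page_stories"] }, "$propValue", 0] } }
        }
    }
])

With the inserts above, this yields one document per timestamp, e.g. { "_id" : "1371798000000", "page_fans" : 100, "page_posts" : 50, "page_stories" : 25 }.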