MongoDB get last 10 activities

In my social network I want to build the feed for member A. Member A is following, let's say, 20 categories/members.
When a category/member (followed by member A) performs an activity, it is inserted into a collection called recent_activity:
{
    "content_id": "6",                 // id of the content member A is following
    "content_type_id": "6",            // content type (category, other member)
    "social_network_id": "2",          // the action the category did (add/like/follow)
    "member_id": "51758",              // member A
    "date_added": ISODate("2014-03-23T11:37:03.0Z"),
    "platform_id": NumberInt(2),
    "_id": ObjectId("532ec75f6b1f76fa2d8b457b"),
    "_type": {
        "0": "Altibbi_Mongo_RecentActivity"
    }
}
When member A logs into the system, I want to get the last 10 activities for the categories/members they follow.
My problem:
How do I get only 10 activities across all categories/members?
Is it better to do this in one query or in a for loop?

For this use case, I'd suggest inverting the logic and keeping a separate object with the last 10 activities for member A that is kept up to date all the time. While that solution is more write-heavy, it makes reading trivially simple and it can be extended very easily. I'd like to blatantly advertise a blog post I wrote a while ago about news feeds with MongoDB, which outlines this approach.
This 'fan-out' approach might seem overly complex at first, but once you think about importance filtering / ranking (a la Facebook), push messages for particularly important events (Facebook, Twitter) or regular digest emails (practically all of them), you get one location in your code to perform all of this logic.
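As a minimal sketch of that write path in the mongo shell (the member_feeds collection and its fields are made up here for illustration, they are not part of the original schema): every time a followed category/member does something, push the activity onto a per-member feed document and keep only the 10 newest entries.

// Hypothetical "materialized feed" collection: one document per member.
// $push with $each/$sort/$slice caps the array at the 10 newest activities.
db.member_feeds.update(
    { "member_id": "51758" },
    {
        "$push": {
            "activities": {
                "$each": [{
                    "content_id": "6",
                    "content_type_id": "6",
                    "social_network_id": "2",
                    "date_added": new Date()
                }],
                "$sort": { "date_added": -1 },
                "$slice": 10
            }
        }
    },
    { "upsert": true }
)

// Reading the feed on login is then a single lookup:
db.member_feeds.findOne({ "member_id": "51758" })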

I think I commented that I'm not really seeing the selection criteria. So if you are "outside" of a single collection, then you have problems. But if your indicated fields are the things you want to "filter" by, then just do this:
db.collection.find({
    "social_network_id": "2",
    "content_type_id": "6",
    "content_id": "6",
    "member_id": { "$ne": "51758" }
})
.sort({ "$natural": -1 })
.limit(10);
So what does that do? You match the various conditions in the data to do the "category match" (if I understood what was meant), then you make sure you are not matching entries by the same member.
The next part does the "natural" sort. This works because the ObjectId is monotonic, or math speak for "ever increasing". That means the "newest" entries always have the "highest" value, so descending order goes from "latest" to "oldest".
And the very final part is a basic "limit". So just return the last 10 entries.
As long as you can "filter" within the same collection in whatever way you want, then this should be fine.
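If you'd rather not depend on insertion order, an equivalent variant (just a sketch of the same query) leans on that monotonic property directly and sorts on _id instead:

db.collection.find({
    "social_network_id": "2",
    "content_type_id": "6",
    "content_id": "6",
    "member_id": { "$ne": "51758" }
})
.sort({ "_id": -1 })   // ObjectIds are roughly time-ordered, so this is newest-first
.limit(10);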

Related

REST API nesting endpoint

OK, I'm new to REST APIs so I'm not sure if I am using the correct terminology for what I want to ask, so bear with me. I believe what I am asking about is nested resources in a service, but I want to ask specifically about using them to separate a blob of "closely related" content. It may be easier to provide an example. Let's say I have a service that could output the following:
/Policy
"data": [ {
    "name": "PolicyName1",
    "description": "",
    "size": 25000,
    .... (bunch of other fields)
    "specialEnablement": true,
    "specialEnablementOptions": {    <-- options below valid only if specialEnablement is true
        "optionType": "TypeII",
        "optionFlagA": false,
        "optionFlagB": true,
        "optionFlagC": false,
        ...(bunch of other options here)
    }
},
{ . . . }],
The specialEnablementOptions are only used if specialEnablement is true. It is all part of this /Policy service, so it has no primary key other than the policy "name" (and it doesn't make sense to have to generate one), so it does not fall under some of the other questions I have been reading about nested resources.
It does make the payload more readable to separate this set of information, since there are 12 or so options, but this is REST, so maybe human readability does not weigh heavily here.
I am being told that, if we do it this way, it makes it more complex to work with during POST/PUT/PATCH commands. Specifically, it is being said in my group that if we do this, we should require two calls: one that creates the policy's main information, then the user must call a second time to PATCH the specialEnablementOptions (assuming specialEnablement is true). This seems kludgy to me.
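For concreteness, this is the kind of single-call body I have in mind (just a sketch reusing the field names above; the endpoint and payload are illustrative, not an agreed design):

POST /Policy
{
    "name": "PolicyName1",
    "description": "",
    "size": 25000,
    "specialEnablement": true,
    "specialEnablementOptions": {
        "optionType": "TypeII",
        "optionFlagA": false,
        "optionFlagB": true,
        "optionFlagC": false
    }
}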
I am looking for expert advice on what the best practice is.
My questions:
Does having the specialEnablementOptions nested in this way cause a lot of complexity? It seems to me that either way we have to verify that the settings are valid.
Does having the specialEnablementOptions nested in this way require two calls? In other words, can a user not do a POST/PATCH/PUT for all the fields, including those in the specialEnablementOptions, in one call? We are planning to provide a way for the user to PATCH just the specialEnablementOptions without changing any of the first level, for ease of use, but is there something that prevents them from creating or modifying all settings in one call?
Another option is to just get rid of the nested specialEnablementOptions and put everything at the same level. I don't have a problem with this, but I wasn't sure if it was just being lazy. I don't mind doing this if the consensus is that it is the best way to go... but I also have a second example that is similar to this scenario yet a bit more complex, where putting everything under the parent level is not really optimal (I will show it in the next example).
So, my second example is as follows:
/anotherPolicy
"data": [ {
    "name": "APolicyName1",
    "description": "",
    "count": 123,
    "lastModified": "2022-05-17-20.37.27.000000",
    [{
        "ownerId": 1,
        "ownerCount": 1818181,
        "specialFlags": "ABA"
    },
    { . . . }]
},
{ . . . }],
The above 'count' is the total number associated with that policy, and then there is a nested resource by owner where the count by owner can be seen, plus maybe other information specific to that owner. The SUM(ownerCount) would equal the "count" above it. Does this scenario change any of the answers to the questions above?
I appreciate your help. I found a ton of information and references on when to use or not use nested endpoints, but all the examples seem to revolve around subjects that could easily be separated into two resources, for instance whether to nest /employees under /departments or /comments under /posts. Also, they didn't deal with the complexities of having nested endpoints vs avoiding them. And last, whether using nesting purely for readability is unnecessary.

How do I perform aggregate queries using SumoLogic APIs

I am trying to perform aggregate queries using SumoLogic APIs as mentioned here.
Something like:
_view = <some_view> | where sourceCategory matches "something" | sum(field) by sourceCategory
This works just fine in the Sumo GUI. I get a field in result called "_sum" which gives me the desired result.
However the same doesn't work when I do it using the SUMO APIs. If I create a job with this body:
{
    "query": "_view = <some_view> | where sourceCategory matches \"something\" | sum(field) by sourceCategory",
    "from": "start_timestamp",
    "to": "end_timestamp",
    "timeZone": "some_timezone"
}
I call the "v1/search/jobs" POST method with the above body and I do GET "v1/search/jobs/{job_id}" till the state is "DONE GATHERING RESULTS". Then I do "v1/search/jobs/{job_id}/messages". I was expecting to see aggregated values in the result, but instead I see something similar to:
{
    "fields": [
        {
            "name": "_messageid",
            "fieldType": "long",
            "keyField": false
        }, ...
    ],
    "messages": [
        {
            "map": {
                "_receipttime": "1359407350899",
                "_size": "549",
                "_sourcecategory": "service",
                "_sourceid": "1640",
                "the_field_i_mentioned": "not-aggregated-value",
                "_messagecount": "2044"
            }
        }, ...
    ]
}
Thanks for going through my question. Any advice / workarounds are appreciated. I don't really want to iterate manually through all the items and calculate the sum; I'd prefer to do it on the SumoLogic side itself. Thanks again!
Explanation
As in the user interface, in the API for log searches you get both the raw results (also referred to as messages) and the aggregate results (also referred to as records).
(Obviously, the latter are only returned if there's any aggregation in the query. In your case there is.)
Actual suggestion
Then I do "v1/search/jobs/{job_id}/messages"
Try /records instead.
See the docs for "Paging through the records found by a Search Job"
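As a rough sketch (endpoint per the Search Job API docs; the response shape below is approximate, not copied from the docs), the call and the kind of record you should get back look like this:

GET /api/v1/search/jobs/{job_id}/records?offset=0&limit=10

{
    "fields": [
        { "name": "sourcecategory", "fieldType": "string", "keyField": true },
        { "name": "_sum", "fieldType": "long", "keyField": false }
    ],
    "records": [
        {
            "map": {
                "sourcecategory": "something",
                "_sum": "123456"
            }
        }
    ]
}

Each entry in "records" carries one aggregated row, so the value you saw as "_sum" in the GUI shows up inside the record's "map".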
Disclaimer: I am currently employed by Sumo Logic.

Can you list multiple features within the same Schema.org "LocationFeatureSpecification"?

I am working on Schema.org Resort schema for a ton of resorts on a travel website and am trying to find the most efficient ways of filling out the schema with regards to amenities.
The current code looks something like this:
"amenityFeature": [
{
"#type":"http://schema.org/LocationFeatureSpecification",
"name":"Spa",
"value":"true"
},
{
"#type":"http://schema.org/LocationFeatureSpecification",
"name":"Internet Access",
"value":"true"
},
{
"#type":"http://schema.org/LocationFeatureSpecification",
"name":"Tennis Courts",
"value":"true"
}
]
My question is, can I write it like this instead to shorten lines of code:
{
    "@type": "http://schema.org/LocationFeatureSpecification",
    "name": [
        "Spa", "Internet Access", "Tennis Courts"
    ],
    "value": "true"
}
When I test it in Google's Structured Data Testing Tool, it doesn't give any errors either way (screenshots of the short and the long form omitted here).
If I do it the short way, I want to make sure all those items are getting listed as separate amenities and not just as different names for the same amenity. Otherwise, I'll go the long route.
No, each LocationFeatureSpecification represents one feature:
Specifies a location feature by providing a structured value representing a feature of an accommodation as a property-value pair of varying degrees of formality.
Your second snippet would represent one feature with multiple names.

Update a given Mongo field under unknown parent fields

Let's say I have a document structured like this:
datas: {
    foo: {
        ...
        keytoupdate: [...]
    },
    whatever: {
        ...
        keytoupdate: [...]
    },
    anystring: {
        ...
        keytoupdate: [...]
    },
    ...: {
        ...
        keytoupdate: [...]
    }
}
I know that:
Each direct child property of the "datas" document has a "keytoupdate" field.
The direct child properties of the "datas" document vary from case to case: not necessarily the same names, nor the same number of them.
I want to update every "keytoupdate" field, no matter how many of them there are.
The question is: how can I do that? Is there any magic operator, like $, that does the same job for objects as it does for arrays?
Thank you!
I'll answer my own question: there is no way to do that, we can't play with dynamic keys, so just forget about it! But there are two workarounds:
The best solution, as suggested by @chridam, is to redesign the schema into an array of objects, where the keys become part of the array elements; you can see this question for more details.
If you can't, the other (but not good) solution is to make a request for each field that might be in your document, instead of trying to do it in one request. This is a very bad solution, especially if your document may have lots of fields, and you have to know in advance which fields could be in your documents. It is absolutely not optimized, but it has the merit of being simple to implement.
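A minimal sketch of that redesign in the mongo shell (collection and field names are made up for illustration; the $[] all-positional operator needs MongoDB 3.6+): turn the dynamic keys into an array of subdocuments, after which one update can reach every keytoupdate at once.

// Reshaped document: the dynamic keys become a "name" field inside an array
{
    datas: [
        { name: "foo",       keytoupdate: [...] },
        { name: "whatever",  keytoupdate: [...] },
        { name: "anystring", keytoupdate: [...] }
    ]
}

// One update now touches every keytoupdate, whatever the names are
db.collection.updateMany(
    {},
    { $set: { "datas.$[].keytoupdate": [] } }
)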

Is there any way to force a schema to be respected?

First, I'd like to say that I really love NoSQL & MongoDB but I've got some major concerns with its schema-less aspect.
Let's say I have 2 tables. Employees and Movies.
And... I have a very stupid data layer / framework that sometimes like to save objects in the wrong tables.
So one day, a Movie gets saved in the Employees table. Like this:
> use mongoTests;
switched to db mongoTests
> db.employees.insert({ name : "Max Power", sex : "Male" });
> db.employees.find();
{ "_id" : ObjectId("4fb25ce6420141116081ae57"), "name" : "Max Power", "sex" : "Male" }
> db.employees.insert({ title : "Fight Club", actors : [{ name : "Brad Pitt" }, { name : "Edward Norton" }]});
> db.employees.find();
{ "_id" : ObjectId("4fb25ce6420141116081ae57"), "name" : "Max Power", "sex" : "Male" }
{ "_id" : ObjectId("4fb25db834a31eb59101235b"), "title" : "Fight Club", "actors" : [ { "name" : "Brad Pitt" }, { "name" : "Edward Norton" } ] }
This is VERY wrong.
Let's switch the context and think about Movies and CreditCards (for whatever reason, in this context credit cards would be stored in clear text inside the DB). This would be super wrong:
The code would probably explode because it's trying to use one object structure and receives a totally unknown one.
Even worse, the code actually works and the webstore visitors actually see credit card information in the "Rent a movie" list.
Is there anything built in that would prevent such a threat from ever happening? Like some way to "force" a schema to be respected for only some tables?
Or is there any way to force MongoDB to make a schema mandatory (no new fields can be created in a table, etc.)?
EDIT: For those who think I'm trolling, I'm really not; this is an important question for me and my team, because it is a big factor in whether or not we're going to use NoSQL.
Thanks and have a nice day.
The schema-less aspect is one of the major positives.
A DB with a schema doesn't fully remove this kind of issue - e.g. there could be a bug in a system that uses an RDBMS and puts the wrong data in the wrong field/table.
IMHO, the bigger concern would be: how did that kind of bug make it through dev, testing and out into production?!
Having said that, you could set up a process that checks the "schema" of documents within a collection (e.g. look at newly added documents and check whether they have the fields you would expect to see in there), then flag them up for investigation. There is such a tool (node.js) here (I think; I've never used it):
http://dhendo.github.com/node-mongodb-schema-validator/
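A very rough sketch of that kind of check in the mongo shell (the expected fields here are just the ones from the employees example above, not a real rule set):

// Flag documents in "employees" that are missing fields every employee should have
db.employees.find({
    $or: [
        { name: { $exists: false } },
        { sex:  { $exists: false } }
    ]
}).forEach(function (doc) {
    print("Suspect document: " + doc._id);
});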
Edit:
For those finding this question in future, so the link in my comment doesn't go overlooked, there's a jira item for this kind of thing here:
http://jira.mongodb.org/browse/SERVER-3536
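(That ticket eventually led to server-side document validation. As a rough sketch of what it looks like nowadays, assuming MongoDB 3.6+ for $jsonSchema; the actual rules are up to you:)

db.createCollection("employees", {
    validator: {
        $jsonSchema: {
            bsonType: "object",
            required: ["name", "sex"],
            additionalProperties: false,
            properties: {
                _id:  { bsonType: "objectId" },
                name: { bsonType: "string" },
                sex:  { enum: ["Male", "Female"] }
            }
        }
    },
    validationAction: "error"   // reject inserts/updates that don't match
})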