What is the best model to save data in Elasticsearch? - mongodb

I have a Rails application that uses Elasticsearch as its search engine. The app collects data from mobile applications, and it can collect from any kind of mobile app. Each mobile app sends two types of data: user profile details and user action details. My app's admins can search this data with multiple conditions and operations, and the results they fetch are user profile details. The admins can then communicate with those profiles, for example by email, SMS, or online chat. I have two options for saving the user data. In the first, I save user profile details and user action details in separate documents with the following structure.
Profile doc:
POST profilee-2022-06-09/_doc
{
"profile": {
"app_id": "abbccddeeff",
"profile_id": "2faae1d6-5875-4b36-b119-74a14589c841",
"whatsapp_number": "whatsapp:+61478421940",
"phone": "+61478421940",
"email": "user#mail.com",
"first_name": "john",
"last_name": "doe"
}
}
User action details:
POST events_app_id_2022-05-17/_doc
{
"app_id": "9vlgwrr6rg",
"event": "Email_Sign_Up",
"profile_id": "2faae1d6-5875-4b36-b119-74a14589c840",
"media": "x1z1",
"date_time": "2022-05-17T11:48:02.511Z",
"device_id": "2faae1d6-5875-4b36-b119-74a14589c840",
"lib": "android",
"lib_version": "1.0.0",
"os": "Android",
"os_version": "12",
"manufacturer": "Google",
"brand": "google",
"model": "sdk_gphone64_arm64",
"google_play_services": "available",
"screen_dpi": 440,
"screen_height": 2296,
"screen_width": 1080,
"app_version_string": "1.0",
"app_build_number": 1,
"has_nfc": false,
"has_telephone": true,
"carrier": "T-Mobile",
"wifi": true,
"bluetooth_version": "ble",
"session_id": "b1ad31ab-d440-435f-ac12-3d03c30ac44f",
"insert_id": "1e285b51-abcf-46ae-8359-9a9d58970cdf"
}
As I said, app admins search these documents to fetch specific profiles and use the results to communicate with them. The problem is that a mobile user can create a profile and only generate actions a few days or a few months later, so the profile details and the action details are created on different days. If an admin writes a complex query to fetch specific results from this data, my application has to run at least two queries against Elasticsearch. That does not work for my business logic, because each query must be saved for later use by the admin. In some cases I would also need a join query, which according to the Elasticsearch documentation is costly, so that is not an option either. In the second scenario, I save both the user profile and the actions in one document, something like this:
POST profilee-2022-06-09/_doc
{
"profile": {
"app_id": "abbccddeeff",
"profile_id": "urm-2faae1d6-5875-4b36-b119-74a14589c841",
"whatsapp_number": "whatsapp:+61478421940",
"phone": "+61478421940",
"email": "user#mail.com",
"first_name": "john",
"last_name": "doe",
"events": [
{
"app_id": "abbccddeeff",
"event": "sign_in",
"profile_id": "urm-2faae1d6-5875-4b36-b119-74a14589c841",
"media": "x1z1",
"date_time": "2022-06-06T11:52:02.511Z"
},
{
"app_id": "abbccddeeff",
"event": "course_begin",
"profile_id": "urm-2faae1d6-5875-4b36-b119-74a14589c841",
"media": "x1z1",
"date_time": "2022-06-06T11:56:02.511Z"
},
{
"app_id": "abbccddeeff",
"event": "payment",
"profile_id": "urm-2faae1d6-5875-4b36-b119-74a14589c841",
"media": "x1z1",
"date_time": "2022-06-06T11:58:02.511Z"
}
]
}
}
In this case I run into the same situation as before: I have to generate a profile index per day and append user actions to it, which means continuous updates every day. Assume I have 100,000 profiles and each one has 50 actions: that is 100,000 * 50 = 5,000,000 updates per day, which puts a heavy load on my server, so this is not feasible either. Could you please help me find the best model for saving my data in Elasticsearch, based on my description?
Update: Is Elasticsearch suitable for my requirements? Would switching to another database such as MongoDB, or adding Hadoop, be more useful in my case?
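The "at least two queries" flow described for the first scenario can be sketched roughly as below. This is only an illustration, not a recommended design: the helper names are hypothetical, the field names are taken from the sample documents, and it assumes app_id, event and profile.profile_id are mapped as keyword fields so that exact-match term and terms queries behave as expected. The functions only build request bodies and read a search response; sending them is left to whatever client the application uses.

```python
from typing import Any


def build_event_query(app_id: str, event: str, size: int = 1000) -> dict[str, Any]:
    """Step 1: body for searching the events indices for one event type."""
    return {
        "size": size,
        "_source": ["profile_id"],  # only the profile id is needed for step 2
        "query": {
            "bool": {
                "filter": [
                    {"term": {"app_id": app_id}},
                    {"term": {"event": event}},
                ]
            }
        },
    }


def extract_profile_ids(search_response: dict[str, Any]) -> list[str]:
    """Collect distinct profile ids from a search response, keeping first-seen order."""
    seen: dict[str, None] = {}
    for hit in search_response["hits"]["hits"]:
        seen.setdefault(hit["_source"]["profile_id"], None)
    return list(seen)


def build_profile_query(profile_ids: list[str]) -> dict[str, Any]:
    """Step 2: body for fetching the matching documents from the profile indices."""
    return {"query": {"terms": {"profile.profile_id": profile_ids}}}
```

Saving such a search for later use would mean saving both bodies plus the rule that connects them, which is the extra complexity the question is worried about.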

Related

Which is the best design for a MongoDB database model?

I feel like the MVP of my current database needs some design changes. The number of users is growing quite fast and some requests perform badly. I also want to get rid of all the DBRefs we use.
Our current model can be summarized as follows:
A company can have multiple employees (thousands)
A company can have multiple teams (hundreds)
An employee can be part of a team
A company can have multiple devices (thousands)
An employee is assigned to multiple devices
Our application displays the following on different pages:
The company data
The users
The devices
The teams
I guess I have different options, but I'm not familiar enough with MongoDB to make the best decision.
Option 1
Do not embed; use lists of ids for one-to-many relationships.
// Company document
{
"companyName": "ACME",
"users": [ObjectId(user1), ObjectId(user2)],
"teams": [ObjectId(team1), ObjectId(team2)],
"devices": [ObjectId(device1), ObjectId(device2)]
}
// User Document
{
"userName": "Foo",
"devices": [ObjectId(device2)]
}
// Team Document
{
"teamName": "Foo",
"users": [ObjectId(user1)]
}
// Device Document
{
"deviceName": "Foo"
}
Option 2
Embed data and duplicate information.
// User Document
{
"companyName": "ACME",
"userName": "Foo",
"team": {
"teamName": "Foo"
},
"device": {
"deviceName": "Foo"
}
}
// Team Document
{
"teamName": "Foo",
"companyName": "ACME",
"users": [
{
"userName": "Foo"
}
]
}
// Device Document
{
"deviceName": "Foo",
"companyName": "ACME",
"user": {
"userName": "Foo"
}
}
Option 3
Do not embed; use a single id to reference the related document.
// Company document
{
"companyName": "ACME"
}
// User Document
{
"userName": "Foo",
"company": ObjectId(company),
"team": ObjectId(team1)
}
// Team Document
{
"teamName": "Foo",
"company": ObjectId(company)
}
// Device Document
{
"deviceName": "Foo",
"company": ObjectId(company),
"user": ObjectId(user1)
}
MongoDB recommends embedding data as much as possible, but I don't think it is possible to embed all the data in the company document. A company can have many devices or users, and I believe the document could grow too big.
I'm switching from SQL to NoSQL and I haven't figured this out by myself yet!
Thanks!
MongoDB is designed to handle unstructured data.
Every database can contain collections, which in turn contain documents.
Moreover, MongoDB does not offer SQL-style joins (the closest equivalent, the $lookup aggregation stage, comes at a cost). Storing the information in one company model is therefore a better choice, because you won't need a join in that scenario.
One more thing: you don't need to embed all the models. For example, you can get both users and devices from the company collection, so why embed users and devices as well?
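As a point of comparison, here is a minimal in-memory sketch of how the four pages could be served under Option 3 from the question, where each child document holds a reference to its parent. Plain Python lists stand in for collections and plain strings stand in for ObjectId values; find_by_ref is a hypothetical helper and the extra sample documents are made up for illustration. With a real driver, each call would presumably become one query on an indexed reference field.

```python
from typing import Any

Doc = dict[str, Any]

# Sample data shaped like Option 3 (strings stand in for ObjectId values).
companies: list[Doc] = [{"_id": "company", "companyName": "ACME"}]
teams: list[Doc] = [{"_id": "team1", "teamName": "Foo", "company": "company"}]
users: list[Doc] = [
    {"_id": "user1", "userName": "Foo", "company": "company", "team": "team1"},
    {"_id": "user2", "userName": "Bar", "company": "company", "team": "team1"},
]
devices: list[Doc] = [
    {"_id": "device1", "deviceName": "Phone", "company": "company", "user": "user1"},
    {"_id": "device2", "deviceName": "Tablet", "company": "company", "user": "user2"},
]


def find_by_ref(collection: list[Doc], field: str, ref_id: str) -> list[Doc]:
    """Return every document whose reference field points at ref_id."""
    return [doc for doc in collection if doc.get(field) == ref_id]


# One lookup per page, none of which grows the company document.
users_page = find_by_ref(users, "company", "company")
devices_page = find_by_ref(devices, "company", "company")
teams_page = find_by_ref(teams, "company", "company")
devices_of_user1 = find_by_ref(devices, "user", "user1")
```

The company document stays small no matter how many users or devices exist, at the price of one extra lookup whenever a page needs data from two collections.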

How to divide entities and share them in Clean Architecture

For my new Flutter project, I am trying to follow Clean Architecture and divide each feature into domain, data and presentation layers.
I have started with the Authentication feature and begun creating the entities. However, I am confused and stuck on how to modularize the code according to Clean Architecture practice.
For example,
the response of my login service is shown below.
Do I need to create entities and models for all the nested JSON objects, such as response, data, user, accountData and permissions?
Or is there a way to use an IResponse to hold the common elements (status, message and data) and then use only the part relevant to each feature?
I am also not sure whether it is allowed to share entities between features. For example, the user block below could belong to both the Authorization and the Employee features.
{
"status": "success",
"message": "successfully login",
"data": {
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZCI6NywiaXNMb2dnZWRJbiI6dHJ1ZSwidGVuYW50SWQiOjEyLCJpYXQiOjE2MjU5OTE5MTAsImV4cCI6MTYyNjA3ODMxMH0.I0z9OIDnQS-MI1ya6usqycoryZ1TBwj3K52BRfrpMuY",
"user": {
"user_id": 7,
"email": "jd#gmail.com",
"first_name": "John",
"last_name": "Doe",
"phone": "",
"date_of_birth": "2015-08-06T00:00:00.000Z"
},
"accountData": {
"name": "Amazon",
"account_id": 1
},
"permissions": [
"ViewAllEmployee",
"AddNewEmployee"
]
}
}
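One possible reading of the IResponse idea is a generic envelope that holds status, message and data once, with each feature supplying its own parser for data. The sketch below uses Python rather than Dart purely for illustration, and every class and function name in it (ApiResponse, LoginData, parse_response and so on) is hypothetical. It is not an established Clean Architecture rule, just one way the pieces could fit together.

```python
from dataclasses import dataclass
from typing import Any, Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ApiResponse(Generic[T]):
    """Common envelope shared by all features."""
    status: str
    message: str
    data: T


@dataclass(frozen=True)
class User:
    """Entity that several features could share."""
    user_id: int
    email: str
    first_name: str
    last_name: str


@dataclass(frozen=True)
class LoginData:
    """Payload specific to the authentication feature."""
    token: str
    user: User
    permissions: tuple[str, ...]


def parse_user(raw: dict[str, Any]) -> User:
    return User(raw["user_id"], raw["email"], raw["first_name"], raw["last_name"])


def parse_login_data(raw: dict[str, Any]) -> LoginData:
    return LoginData(raw["token"], parse_user(raw["user"]), tuple(raw["permissions"]))


def parse_response(
    raw: dict[str, Any], parse_data: Callable[[dict[str, Any]], T]
) -> ApiResponse[T]:
    """Parse the envelope once; delegate the payload to the feature's parser."""
    return ApiResponse(raw["status"], raw["message"], parse_data(raw["data"]))
```

A login call would then use parse_response(body, parse_login_data), while an employee feature could reuse parse_user and User without depending on anything else from authentication.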

User storage not saved between conversations but same user is recognized

In my Actions on Google project, I save some data (like a UUID userId) between conversations using conv.user.storage. With my own testing account, this works fine. On another testing account, the user.storage is cleared and the data is lost. The accounts have these differences:
A Google Home is connected to the account where user storage DOES work, voice match is set up, and personal results is enabled.
NO Google Home is connected to the account where user storage DOESN'T work, no voice match is set up, personal results isn't an option since no device other than iPhone is connected. This account is used on smartphone (iPhone) only.
I know the user storage will be cleared when:
Voice match is set up and there is no match.
The user disabled personal data.
But this isn't the case for either account. I know the user is recognized as the same account because of the lastSeen value and because the userId remains the same between conversations, as can be seen in the conv object:
at the end of the conversation:
"user": {
"raw": {
"userStorage": "{\"data\":{\"userId\":\"f581e751-ad81-4a6b-9519-00a57d5e30d4\"}}",
"lastSeen": "2019-03-13T11:58:39Z",
"locale": "nl-NL",
"userId": "ABwppHEOonglGmWakeizd_Stx_OpUhSNzx2K4JWETc73FW-KctZLM2vc4B7V6Fxk9OfL3RQ3n5jIgw"
},
"storage": {
"userId": "f581e751-ad81-4a6b-9519-00a57d5e30d4"
},
"_id": "ABwppHEOonglGmWakeizd_Stx_OpUhSNzx2K4JWETc73FW-KctZLM2vc4B7V6Fxk9OfL3RQ3n5jIgw",
"locale": "nl-NL",
"permissions": [],
"last": {
"seen": "2019-03-13T11:58:39.000Z"
},
"name": {},
"entitlements": [],
"access": {},
"profile": {}
},
at the beginning of a new conversation:
"user": {
"raw": {
"lastSeen": "2019-03-13T11:59:33Z",
"locale": "nl-NL",
"userId": "ABwppHEOonglGmWakeizd_Stx_OpUhSNzx2K4JWETc73FW-KctZLM2vc4B7V6Fxk9OfL3RQ3n5jIgw"
},
"storage": {},
"_id": "ABwppHEOonglGmWakeizd_Stx_OpUhSNzx2K4JWETc73FW-KctZLM2vc4B7V6Fxk9OfL3RQ3n5jIgw",
"locale": "nl-NL",
"permissions": [],
"last": {
"seen": "2019-03-13T11:59:33.000Z"
},
"name": {},
"entitlements": [],
"access": {},
"profile": {}
},
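The difference between the two dumps comes down to whether raw.userStorage is present. The sketch below mirrors only what these payloads show (a JSON string with a "data" key that ends up as storage); it is not the client library's actual implementation, and storage_from_raw is a hypothetical helper name.

```python
import json
from typing import Any


def storage_from_raw(raw_user: dict[str, Any]) -> dict[str, Any]:
    """Derive the storage object from the raw user payload, as seen in the dumps."""
    serialized = raw_user.get("userStorage")
    if not serialized:
        # No userStorage string arrived with the request, so storage starts empty.
        return {}
    return json.loads(serialized).get("data", {})


end_of_conversation = {
    "userStorage": "{\"data\":{\"userId\":\"f581e751-ad81-4a6b-9519-00a57d5e30d4\"}}",
    "lastSeen": "2019-03-13T11:58:39Z",
    "locale": "nl-NL",
}
start_of_next = {"lastSeen": "2019-03-13T11:59:33Z", "locale": "nl-NL"}

stored = storage_from_raw(end_of_conversation)
restored = storage_from_raw(start_of_next)
```

In other words, the platform did not send the stored string back on the next request, even though it still recognized the user.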
Does anyone know any other reason why user.storage might be cleared other than the ones stated above, or another way without using account linking?
I think I figured it out. In https://myaccount.google.com/u/3/activitycontrols?utm_source=google-account&utm_medium=web I forgot to toggle the Chrome-history and activity option.

REST: update a resource with different fields requiring different user permissions

I have an endpoint /groups
I can create a group by POSTing some info to /groups
A single group can be read by /groups/{id}
I can update some fields in the group by POSTing to /group/{id}
However, different fields need to be updated by users with different permissions. For instance, a group might have this structure:
{
"id": 1,
"name": "some name",
"members": [
{
"user_id": 456,
"known_as": "Name 1",
"user": { /* some user object */},
"status": "accepted",
"role": "admin",
"shared": "something"
},
{
"user_id": 999227,
"known_as": "Name 1",
"user": { /* some user object */},
"status": "accepted",
"role": "basic",
"shared": "something"
},
{
"user_id": 9883,
"known_as": "Name 1",
"user": { /* some user object */},
"status": "requested",
"role": "basic",
"shared": "something"
}
],
"link": "https://some-link"
}
As an example I have the following 3 operations for the /group/{id}/members/{id} endpoint:
I want only the user to be able to update his own known_as field
I want only group admins to be able to update each member's role and status fields.
I want both the user and the admin to be able to update the shared field
My options are this:
Should I allow all updates to be done by POSTing to /group/{id}/members/{id} with a subset of the fields for a member and throw an unauthorized error if they try to update a field that they aren't allowed to update?
Or should I break each operation into say /group/{id}/members/{id}/role, /group/{id}/members/{id}/shared and /group/{id}/members/{id}/status? The problem with this is that I don't want to have to make lots of requests to update all the fields (I imagine that there will end up being quite a lot of them).
So just for clarification my question is: Is it considered proper REST to do my option 1 where I can post updates to an endpoint that may fail if you try to change a field that you aren't allowed to?
In my opinion, option 1 is much better than option 2.
As you said, option 2 wastes bandwidth.
More importantly, with option 1 you can easily implement an atomic ("all-or-nothing") update: it either completes successfully or fails entirely, and there is never a partial update.
With option 2, it is quite possible that one request succeeds while another is rejected, even though the two requests together are meant to be a single operation.
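A minimal sketch of the all-or-nothing check for option 1 follows, using the three rules from the question (known_as only by the member, role and status only by group admins, shared by either). The function names and the is_self / is_admin flags are assumptions for illustration; how they are derived from the authenticated request is left out.

```python
from typing import Any

SELF_ONLY = {"known_as"}
ADMIN_ONLY = {"role", "status"}
SELF_OR_ADMIN = {"shared"}


def forbidden_fields(changes: dict[str, Any], is_self: bool, is_admin: bool) -> set[str]:
    """Return the requested fields that this caller may not update."""
    denied: set[str] = set()
    for field in changes:
        if field in SELF_ONLY:
            allowed = is_self
        elif field in ADMIN_ONLY:
            allowed = is_admin
        elif field in SELF_OR_ADMIN:
            allowed = is_self or is_admin
        else:
            allowed = False  # unknown or read-only field
        if not allowed:
            denied.add(field)
    return denied


def update_member(
    member: dict[str, Any], changes: dict[str, Any], *, is_self: bool, is_admin: bool
) -> dict[str, Any]:
    """Validate every field first, then apply all changes or none of them."""
    denied = forbidden_fields(changes, is_self, is_admin)
    if denied:
        raise PermissionError(f"not allowed to update: {sorted(denied)}")
    return {**member, **changes}  # new dict; the original is left untouched
```

Because validation happens before anything is written, a request that mixes allowed and forbidden fields is rejected as a whole, which is the behaviour the answer argues for.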

How to retrieve action ID with object ID and User ID? (Open Graph Custom Action)

The situation is as follows:
I want the user to perform a (unique) action on my page. I do this by presenting an interface and a set of objects on my page. This works flawlessly.
But to present a complete user interface I must also offer the option to delete the performed action, so I need to know whether a user has already interacted with a certain object.
I could run through all the objects the user has interacted with and check whether the requested object is among them, but that is not very resource-friendly when the user has interacted with a lot of items.
So the basic question is: is there an API method that lets me look up whether this user already has an action on the given object?
Thanks for your help!
https://graph.facebook.com/me/APP_NAMESPACE:ACTION_NAME
This will give you a list of all the actions of type ACTION_NAME that the current user has performed.
In the list you'll also find the connected objects.
Example:
https://graph.facebook.com/me/polarprint_forum:ask
{
"data": [
{
"id": "10150663311283415",
"from": {
"id": "549348414",
"name": "Joakim Syk"
},
"start_time": "2012-03-08T13:10:44+0000",
"end_time": "2012-03-08T13:10:44+0000",
"publish_time": "2012-03-08T13:10:44+0000",
"application": {
"id": "346637838687298",
"name": "Polar Print Forum"
},
"data": {
"question": {
"id": "10150604589861693",
"url": "http://www.polarprint.se/facebook_thread_tab/946/k\u0025C3\u0025A4ppteknik_vid_kullersten.html",
"type": "polarprint_forum:question",
"title": "käppteknik vid kullersten"
}
},
"likes": {
"count": 0
},
"comments": {
"count": 0
}
}
],
"paging": {
"next": "https://graph.facebook.com/me/polarprint_forum:ask?format=json&offset=25&limit=25"
}
}
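Given a response shaped like the example above, the action ID for a particular object could be found with a small scan like the one below. find_action_id is a hypothetical helper, and object_key is the property name under the action's inner data ("question" in the example). The sketch only looks at one page; following paging.next is left out.

```python
from typing import Any, Optional


def find_action_id(
    response: dict[str, Any], object_key: str, object_id: str
) -> Optional[str]:
    """Return the id of the first action on this page that references object_id."""
    for action in response.get("data", []):
        connected = action.get("data", {}).get(object_key, {})
        if connected.get("id") == object_id:
            return action["id"]
    return None
```

For the example response, find_action_id(response, "question", "10150604589861693") would return the action id "10150663311283415", which is the value a delete request needs.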
A more efficient approach would probably be to store this information on your end when creating the actions.
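Storing the information on your side could look roughly like this. It is an in-memory sketch only: ActionIndex and its methods are hypothetical names, and in practice the mapping would presumably live in the application's own database, keyed by user ID and object ID.

```python
from typing import Optional


class ActionIndex:
    """Remember which action a user created for which object."""

    def __init__(self) -> None:
        self._action_ids: dict[tuple[str, str], str] = {}

    def record(self, user_id: str, object_id: str, action_id: str) -> None:
        """Call this right after the action has been created."""
        self._action_ids[(user_id, object_id)] = action_id

    def lookup(self, user_id: str, object_id: str) -> Optional[str]:
        """Return the stored action id, or None if the user has not acted yet."""
        return self._action_ids.get((user_id, object_id))

    def forget(self, user_id: str, object_id: str) -> None:
        """Call this after the action has been deleted."""
        self._action_ids.pop((user_id, object_id), None)


index = ActionIndex()
index.record("549348414", "10150604589861693", "10150663311283415")
known = index.lookup("549348414", "10150604589861693")
```

The lookup is then a single keyed read, independent of how many objects the user has interacted with.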