API Design: Caching “partial” nested objects

Let's say we have schools with some data including a name and a list of students, and students with some data including courses they're enrolled in and a reference to their school. On the client:
I'd like to show a screen that shows information about a school, which includes a list of all of its students by name.
I'd like to show a screen that shows information about a student, including the name of their school and the names of courses they're taking.
I'd like to cache this information so that I can show the same screen without waiting on a new fetch. I should be able to go from school to student and back to school without fetching the school again.
I'd like to show each screen with only one fetch. Going from the school page to the student page can take a separate fetch, but I should be able to show a school with the full list of student names in one fetch.
I'd like to avoid duplicating data, so that if the school name changes, one fetch to update the school will lead to the correct name being shown both on the school page and the student pages.
Is there a good way to do all of this, or will some of the constraints have to be lifted?
A first approach would be to have an API that does something like this:
GET /school/1
{
id: 1,
name: "Jefferson High",
students: [
{
id: 1,
name: "Joel Kim"
},
{
id: 2,
name: "Chris Green"
}
...
]
}
GET /student/1
{
id: 1,
name: "Joel Kim",
school: {
id: 1,
name: "Jefferson High"
},
courses: [
{
id: 3,
name: "Algebra 1"
},
{
id: 5,
name: "World History"
}
...
]
}
An advantage of this approach is that, for each screen, we can just do a single fetch. On the client side, we could normalize schools and students so that they reference each other by ID, and then store the objects in different data stores. However, the student object nested inside of school isn't a full object -- it doesn't include the nested courses, or a reference back to the school. Likewise, the school object inside of student doesn't have a list of all attending students. Storing partial representations of objects in data stores would lead to a bunch of complicated logic on the client side.
Instead of normalizing these objects, we could store schools and students with their nested partial objects. However, this means data duplication -- each student at Jefferson High would have the name of the school nested. If the school name changed just before doing a fetch for a specific student, then we'd show the right school name for that student but the wrong name everywhere else, including on the "school details" page.
Another approach could be to design the API to just return the ids of nested objects:
GET /school/1
{
id: 1,
name: "Jefferson High",
students: [1, 2]
}
GET /student/1
{
id: 1,
name: "Joel Kim",
school: 1,
courses: [3, 5]
}
We'd always have "complete" representations of objects with all of their references, so it's pretty easy to store this information in data-stores client side. However, this would require multiple fetches to show each screen. To show information about a student, we'd have to fetch the student and then fetch their school, as well as their courses.
Is there a smarter approach that would allow us to cache just one copy of each object, and to prevent multiple fetches to show basic screens?

You might be mixing two concepts: storage and representations. You can return a non-normalized representation (the first option you suggested) without also storing those "partial" objects in your database.
So I would suggest returning non-normalized representations while storing the data normalized (if you are using a relational DB).
Also, an improvement suggestion: You may want to use proper URIs instead of Ids in your representations. You probably want the clients to know "where" to get that object from, it's easier therefore to just supply the URI. Otherwise the client needs to figure out how to produce a URI out of an Id, and that usually ends up being hard-coded in the client, which is a no-no in REST.
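As a rough illustration of the two suggestions above, here is a minimal sketch in which plain dicts stand in for normalized database tables and the response bodies are assembled from them on request. The "href" key, the school_id field and the helper names are assumptions made for this example; only the /school/1 and /student/1 paths come from the question.

```python
# Normalized storage: each school and each student is stored exactly once.
# These dicts are stand-ins for database tables (illustrative data only).
SCHOOLS = {1: {"id": 1, "name": "Jefferson High"}}
STUDENTS = {
    1: {"id": 1, "name": "Joel Kim", "school_id": 1},
    2: {"id": 2, "name": "Chris Green", "school_id": 1},
}


def school_uri(school_id):
    # Hypothetical URI scheme, following the paths used in the question.
    return f"/school/{school_id}"


def student_uri(student_id):
    return f"/student/{student_id}"


def school_representation(school_id):
    """Build the non-normalized body for GET /school/<id>."""
    school = SCHOOLS[school_id]
    students = [s for s in STUDENTS.values() if s["school_id"] == school_id]
    return {
        "href": school_uri(school_id),
        "name": school["name"],
        # Partial student objects: a link plus the name, nothing else.
        "students": [
            {"href": student_uri(s["id"]), "name": s["name"]} for s in students
        ],
    }


def student_representation(student_id):
    """Build the non-normalized body for GET /student/<id>."""
    student = STUDENTS[student_id]
    school = SCHOOLS[student["school_id"]]
    return {
        "href": student_uri(student_id),
        "name": student["name"],
        "school": {"href": school_uri(school["id"]), "name": school["name"]},
    }
```

Because the school name lives in one place on the server, renaming the school changes every representation built after that point. A client can also follow the "href" values directly instead of constructing URIs from ids.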

Related

Want to change a model ObjectID value

I am creating a Student and Course relationship
A student may have multiple courses. A one to many relationship.
This is made in Express and I'm using MongoDB. I have shortened the models to keep it simple.
Student Model
const studentSchema = new mongoose.Schema({
name: {type: String},
courses: [{
type: ObjectId,
ref: 'class'
}]})
Course Model
const classSchema = new mongoose.Schema({
ClassId: {type: String,},
Grade: {type: Number,}, })
Currently, when I update the grade, it updates the grade on the course itself and not on the course in that student's list of courses.
router.put(....)
const{username, courseId, grade} = req.params
const existingUser = await Student.findOne({username}).populate({
path: 'courses',
select:['ClassId','Grade']
})
const findCourse = existingUser.courses.find(
x => x.ClassId == courseId
)
findCourse.Grade = parseInt(grade)
await findCourse.save()
The problem is this will change the grade for the course itself. Meaning any student that adds this course will have that grade too.
I'll explain what I want to do in Java/OOP terms if that helps.
I want the student object to have its own course objects. At the moment, it seems like classes are static class objects.
I want to access that specific student courses and change that student grade of that specific course.
Please help, I already spent a couple of hours on this. In SQL, the student would have a reference key and be able to easily change their values, I'm having trouble in MongoDB.
Alright, I finally figured it out. In hindsight, it makes sense. Gave myself a break from coding and came back to see the problem.
Let's pretend we have two students and one course. This course is seeded with data.
When student A picks that course, they add it to their course array. When student B wants that course, they also get that exact course. Now they are sharing the course. Basically, they are sharing the same reference.
The solution is to still find the course, but then make a new course object and copy every value from the original into it. Save the copy to the database and add that copy to the student. Students can still register for courses from the seeded data, and they no longer share a course document.
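The copy-per-student idea can be sketched without a database. In the snippet below, two dicts stand in for the course and student collections; the function names and the "ALG1" class id in the usage note are made up for illustration, and this is not Mongoose code.

```python
import copy
import itertools

_next_id = itertools.count(1)
courses = {}   # stand-in for the course collection: _id -> document
students = {}  # stand-in for the student collection: name -> document


def seed_course(class_id):
    """Insert a seeded course document and return its _id."""
    course_id = next(_next_id)
    courses[course_id] = {"_id": course_id, "ClassId": class_id, "Grade": None}
    return course_id


def enroll(student_name, seeded_course_id):
    """Copy the seeded course, save the copy, and reference the copy."""
    clone = copy.deepcopy(courses[seeded_course_id])
    clone["_id"] = next(_next_id)  # the copy gets its own identity
    courses[clone["_id"]] = clone
    student = students.setdefault(
        student_name, {"name": student_name, "courses": []}
    )
    student["courses"].append(clone["_id"])
    return clone["_id"]


def set_grade(student_name, class_id, grade):
    """Update the grade on this student's own copy of the course."""
    for course_id in students[student_name]["courses"]:
        if courses[course_id]["ClassId"] == class_id:
            courses[course_id]["Grade"] = grade
            return
    raise KeyError(class_id)
```

After seeding a course with seed_course("ALG1") and enrolling two students in it, calling set_grade("A", "ALG1", 90) changes only student A's copy; student B's copy and the seeded original keep their own Grade value.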

Unsure how to model many-to-many relationship in MongoDB

I'm new to MongoDB and I'm trying to model a many-to-many relationship in at least 2 collections (I need two collections for the project). What I have is a collection of universities, faculties and specializations, and another collection for students and their gradebook (this was the middle entity between specializations and students in SQL; I'm not sure if it's still needed in Mongo). I tried to use This as an inspiration, but it limits me because I can only search students by university id (I want, for example, to search for students from a certain specialization or a certain faculty). I could put every row from university, faculty and specialization in the student collection and vice versa, but I really don't think that's ideal. Here's what I have so far:
db.students.insertOne({_id:1, firstname: 'John', lastname: 'Silas', ethnicity:'english', civilstatus:'single', residence:'London', email:'johnSilas#gmail.com', gradebook:[{ year:2018, registrationyear:2017, formofeducation:'traditional'}], universities:[1]})
db.universities.insertOne({_id:1, name:'University of London', city:'London', adress:'whatever', phone: 'whatever', email: 'whatever#gmail.com', faculty:[{name: 'Law', adress:'whatever', phone: 'whatever', email: 'whatever#gmail.com'}], specialization:[{name:'criminal rights', yearlytax:5000, duration: 3, level:'bachelordegree', language:'english'}], students: [1,2]})
I'm sorry if I don't understand basic NoSQL concepts, I am new to it. Thanks in advance.
Basic patterns for many to many association between A and B:
Inline references
On A, store the list of B ids in a field like b_ids
On B, store the list of A ids in a field like a_ids
This requires two writes whenever an association is created or destroyed, but requires either zero or one joins at query time to traverse the association (if you just want the ids of the Bs for a given A and you already have the A, no further queries are needed).
Join model
Create a model C which has two fields: a_id and b_id. Each association is represented by a single instance of C.
This requires one write whenever an association is created or destroyed, but requires joins on all queries involving association (potentially two joins per query).
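The trade-off between the two patterns can be sketched with in-memory stand-ins for the collections. The field names b_ids, a_ids, a_id and b_id follow the description above; the function names and sample ids are assumptions for illustration.

```python
# Pattern 1: inline references. Each side keeps a list of the other side's ids.
a_docs = {1: {"_id": 1, "b_ids": []}}
b_docs = {10: {"_id": 10, "a_ids": []}}


def link_inline(a_id, b_id):
    # Two writes: one on the A document, one on the B document.
    a_docs[a_id]["b_ids"].append(b_id)
    b_docs[b_id]["a_ids"].append(a_id)


def b_ids_for_a_inline(a_id):
    # No extra lookup: the ids are already on the A document.
    return list(a_docs[a_id]["b_ids"])


# Pattern 2: join model. One C record per association.
c_docs = []


def link_join(a_id, b_id):
    # One write: a single C record.
    c_docs.append({"a_id": a_id, "b_id": b_id})


def b_ids_for_a_join(a_id):
    # Every traversal has to go through the C collection.
    return [c["b_id"] for c in c_docs if c["a_id"] == a_id]
```

With inline references, removing an association also means updating both documents, while the join model only needs the one C record removed.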

Retrieving arbitrary data into nested object with ORM

I am attempting to develop an API in Go that allows the user to specify an arbitrary data structure and easily set up endpoints that perform CRUD operations on an auto-generated Postgres database, based on the structure that they define.
For now, I have been using gorm, and I am able to have a database automatically generated from a user-defined set of structs that support all types of relations (has one, one to many, etc.). I am also able to insert into the generated database when JSON is sent in through the endpoints.
The issue I have discovered arises when I try to retrieve the data. Where many of the Go ORMs seem to fall short is in mapping data from all tables back into the nested structs of the parent struct.
For example, if the user defines:
type Member struct {
ID string
FirstName string
Hometown Hometown `gorm:"ForeignKey:MemberRefer"`
}
type Hometown struct {
ID string
City string
Province string
MemberRefer string
}
The database creates the tables:
Members
id
first_name
Hometowns
id
city
province
member_refer
However, when retrieving the data, all that is mapped back is:
{
"id": "dc2bb591-506f-40a5-a141-bdc0c8410ba1",
"name": "Kevin Krishna",
"hometown": {
"id": "",
"city": "",
"province": ""
}
}
Does anyone know of a Go ORM that supports this kind of behaviour?
Thanks
A 5-second Google search showed me the answer:
Preloading associations
Now that you actually have them properly related, you can use .Preload() to get the nested object you want:
db.Preload("GoogleAccount").First(&user)
Get nested object in structure in gorm
https://www.google.com/search?q=gorm+nested+struct+golang

How to manage drafts and multiple versions in Mongo?

This is a follow-up to my previous question.
Suppose there is a product catalog stored as a collection in Mongo. User Alice is a catalog manager and may update, remove and add products to the catalog. User Bob is a customer and may view the catalog.
Currently, when Alice changes the catalog, Bob sees the changes immediately. Now we want the changes to be visible only if Alice explicitly publishes them. For example:
There is a catalog which consists of Product A, Product B, and Product C. Both Alice and Bob see the same products.
Alice changed the catalog. She modified Product A, removed Product C, and added Product D but did not publish the changes.
Now Alice sees Product A' (modified), Product B, and Product D but Bob still sees the previous version: Product A, Product B, and Product C.
Alice published the catalog. Now both Alice and Bob see the same products: Product A' (modified), Product B, and Product D
My questions are:
how to implement it with Mongo
how to manage versions/revisions of the catalog, so Alice will be able to undo/redo the changes she made in the catalog.
Ahh temporal data, the bane of database developers everywhere.
Fortunately, this is arguably easier in MongoDB than in relational databases.
If you can make the assumption that you'll have at most ONE unpublished version this problem is much simpler than if you can have different users editing unpublished versions.
Assuming you've got some standard things in your schema:
{
_id: ObjectId,
name: String,
createdDate: Date,
price: Number
}
You need to add a sub-document with a duplicate of any field editable by the user. It also will contain a flag for deletion.
{
_id: ObjectId,
name: String,
createdDate: Date,
price: Number,
revised: {
name: String,
createdDate: Date,
price: Number,
deleted: Boolean,
}
}
When a user goes to edit the product, you'll copy over the existing props into the 'revised' object. All edits go to that object. When you publish, you copy those items back to the base layer, and delete the 'revised' property.
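The edit-then-publish flow just described could look roughly like the sketch below, which works on a plain dict standing in for a product document. The helper names and the list of editable fields are assumptions; the root-level "deleted" flag is one possible way to handle a published deletion, not the only one.

```python
import copy

# Fields the catalog manager may edit (taken from the schema above).
EDITABLE = ("name", "createdDate", "price")


def start_edit(product):
    """Copy the editable fields into 'revised' unless a draft already exists."""
    if "revised" not in product:
        product["revised"] = {f: copy.deepcopy(product[f]) for f in EDITABLE}
        product["revised"]["deleted"] = False
    return product["revised"]


def edit(product, **changes):
    """Apply all edits to the draft only; the base fields stay untouched."""
    start_edit(product).update(changes)


def publish(product):
    """Copy the draft back to the base fields and drop 'revised'."""
    revised = product.pop("revised", None)
    if revised is None:
        return product  # nothing to publish
    if revised.pop("deleted"):
        product["deleted"] = True  # assumption: soft delete on the root
    else:
        product.update(revised)
    return product


def customer_view(product):
    """What Bob sees: published fields only."""
    return {f: product[f] for f in EDITABLE}


def manager_view(product):
    """What Alice sees: the draft if one exists, else the published fields."""
    draft = product.get("revised")
    return {f: draft[f] for f in EDITABLE} if draft else customer_view(product)
```

Until publish() runs, customer_view() keeps returning the old values while manager_view() returns the draft, which matches the Alice and Bob scenario in the question. In a real MongoDB deployment each of these steps would be an update on the stored document rather than an in-memory mutation.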
If you have multiple users editing the document, and they can't see each other's edits you could make your revised document a bit more complicated
{ revised: { U1234: { name : ... }, U2345 : { name: ... } } }
Where each user has a separate copy. Of course when one user publishes it could delete the entity entirely. I would of course recommend adding a 'deleted' flag to the root item instead of actually deleting it from the db, unless these objects are HUGE. (Index the deleted flag of course.)

design mongodb schema for a specific project: embed documents or use foreign key

In my project, it has 3 models:
City
Plaza
Store
a city has plazas and stores; a plaza has stores.
My initial design is to use "foreign keys" for the relationships. (I come from MySQL and have just started to pick up MongoDB.)
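from mongoengine import Document, StringField, ObjectIdField

class City(Document):
    name = StringField()

class Plaza(Document):
    name = StringField()
    city = ObjectIdField()  # id of the City this plaza belongs to

class Store(Document):
    name = StringField()
    city = ObjectIdField()   # id of the City this store is in
    plaza = ObjectIdField()  # id of the Plaza this store is in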
I feel this design is quite like a sql approach.
The scope of the project is like this: 5 cities; each city has 5 plazas; a plaza has 200 stores. a store has a number of products(haven't been modeled in above code)
I will query all stores in a city or in a plaza; all plazas in a city.
Should I embed all stores and plazas in the City collection? I have heard that you should not use references in MongoDB and should use embedded documents instead. In my specific project, which one is the better approach? I am comfortable with the "foreign key" design but am afraid of not taking advantage of MongoDB.
From the way you described your project, it seems like an embedded approach is probably not needed. If you use indices on city and plaza, you can perform the queries you mentioned very quickly. Embedding tends to be more helpful for caching, or when the embedded data doesn't make much sense on its own and is always accessed together with the parent data; addresses are a good example. That is not really the case here.
I think it makes sense to have a single collection of stores.
In each store document you could have an attribute called city, you could also have an attribute plaza. There are many other ways to structure its attributes, including more complex (subdocument) values.
If your document is:
{ storeName: "Books and Coffee",
location: "plaza 17",
city: "Anytown",
}
You can easily query for all stores in Anytown with
db.stores.find({"city":"Anytown"})
It doesn't make sense to store city and plaza in separate collections because then you will have to do multiple queries every time you need information that spans more than one collection, like store and the city it's in, or all stores in city "X".