Building models in NOSQL


We are trying out a NoSQL document database (RavenDB) and have some questions about how to model our data.
These are our models:
public class User
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}
public class Video
{
    public Guid Id { get; set; }
    public string Nom { get; set; }
    public DateTime PublishDate { get; set; }
    public User Publisher { get; set; }
    public Uri Adress { get; set; }
}
By default, nobody can view a video.
You can grant a user or a group of users the right to view a video.
You can recommend a video to a user or a group of users (the right to view it is then granted automatically).
What is the best way to design the models for a NoSQL document database, considering the following use cases:
A user publishes a video and chooses which group(s)/user(s) can view it and which user(s)/group(s) it is recommended to
A user withdraws the right to view a video from some user(s)/group(s)
Get the last N videos that a user has been authorized to view
Get the last N videos that have been recommended to a user
We are considering the following options:
Add two lists to each model (VideosReadable, VideosRecommended and UsersAllowedToRead, UserRecommended), where the first list of each pair contains all the elements of the second
Add a list of tuples to each model (List<Tuple<User, bool>> and List<Tuple<Video, bool>>), where the bool indicates whether the video is recommended
Add a UserVideoLink document
Which model would be the easiest to query? Are there better alternatives?

It all comes down to quantities. How many potential users total? How many potential videos total? How many recommendations and assignments? How often will the data change? There is no one best answer.
You may find, for example, that if you have a lot of everything that you are better off creating separate documents to model the active bits, such as a separate class and document to model a Recommendation and another to model an Assignment.
Then again, if one user only has access to a handful of videos, you may find it easier to embed a list of VideoIDs in each user, or a list of Video objects, which may be the full video document or just a small denormalized piece of its data.
You'll have to experiment and decide what works best for you.
However, I'd stay away from using Tuple. They get a bit messy. You'd do better with a class of your own creation for that purpose.
I would also avoid a name like UserVideoLink - that doesn't fit the DDD ideas very well. Think of it more as what you are modeling instead, such as a Recommendation.
Some of this may sound like very relational-database thinking, but it does have a place in document databases also. Just because a document can have structure doesn't mean that everything has to go in a single document. Try to model your domain first using DDD concepts. Then everything you've identified as an "Aggregate Root" entity, and all child entities thereof, (usually) belong in a single document.
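As a minimal sketch of the "separate documents" option described above, assuming one document per grant and one per recommendation: the class names Assignment and Recommendation come from the answer, but every property and the VideoQueries helper are hypothetical. The query runs over an in-memory collection with LINQ; it does not use any RavenDB API, so treat it only as an illustration of the shape of the data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document: one per "user U may view video V" grant.
public class Assignment
{
    public Guid Id { get; set; }
    public Guid VideoId { get; set; }
    public Guid UserId { get; set; }
    public DateTime GrantedOn { get; set; }
}

// Hypothetical document: one per "video V was recommended to user U".
public class Recommendation
{
    public Guid Id { get; set; }
    public Guid VideoId { get; set; }
    public Guid UserId { get; set; }
    public DateTime RecommendedOn { get; set; }
}

public static class VideoQueries
{
    // Ids of the last n videos recommended to a user, newest first.
    public static List<Guid> LastRecommended(
        IEnumerable<Recommendation> recommendations, Guid userId, int n)
    {
        return recommendations
            .Where(r => r.UserId == userId)
            .OrderByDescending(r => r.RecommendedOn)
            .Take(n)
            .Select(r => r.VideoId)
            .ToList();
    }
}
```

Withdrawing a right then amounts to deleting one Assignment document, and neither the User nor the Video document has to change. Whether this beats embedded lists still depends on the quantities discussed above.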

Related

When to use Core Data relationships in Swift?

I've read through a bunch of tutorials to the best of my ability, but I'm still stumped on how to handle my current application. I just can't quite grasp it.
My application is simply a read-only directory that lists employees by their company, department, or sorted in alphabetical order.
I am pulling down JSON data in the form of:
Employee
    Company name
    Department name
    First name
    Last name
    Job title
    Phone number
Company
    Company name
Department
    Company name
    Department name
As you can see, the information here is pretty redundant. I do not have control over the API and it will remain structured this way. I should also add that not every employee has a department, and not every company has departments.
I need to store this data, so that it persists. I have chosen Core Data to do this (which I'm assuming was the right move), but I do not know how to structure the model in this instance. I should add that I'm very new to databases.
This leads me to some questions:
Every example I've seen online uses relationships so that the information can be updated appropriately upon deletion of an object - this will not be the case here since this is read-only. Do I even need relationships for this case then? These 3 sets of objects are obviously related, so I am just assuming that I should structure it this way. If it is still advised to create relationships, then what do I gain out of creating those relationships in a read-only application? (For instance, does it make searching my data easier and cleaner? etc.)
The tutorials I've looked at don't seem to have all of this redundant data. As you can see, "company name" appears as a property in each set of objects. If it would be advised that I create relationships amongst my entities (which are Employee, Company, Department), can someone show me how this should look so that I may get an idea of what to do? (This is of course assuming that I should use relationships in my model.)
And I would imagine that this would be the set of rules:
Each company has many or no departments
Each department has 1 or many employees
Each employee has 1 company and 1 (or no) department
Please let me know if I'm on the right track here. If you need clarification, I will try my best.

Yes, use relationships. Make them bi-directional.
The redundant information in your feed doesn't matter, ignore it. If you received partial data it could be used to build the relationships, but you don't need to use it.
You say this data comes from an API, so it isn't read-only as far as the app is concerned. Worry more about how you're going to use the data in the app than how it comes from the server when designing your data model.

Best practices for REST-API models

I am working on a REST-API and have run into an architectural problem.
The model 'Book' represents a single book with properties and CRUD-based functions. It loads itself from a database via a read function.
However, what if I want to get all books in the database? The current book model does not cover this use case.
I have tried several approaches:
1.) A second model called 'Books'. It has a read function which returns a list of book objects.
2.) The model 'Book' itself has a readAll function which loads all books.
3.) The model 'Book' is non-functional, it only has properties. Instead a 'BookStorage' class loads the data and fills one/multiple models.
I am not satisfied with any of these approaches. Is there a best practice for this scenario?

1.) A second model called 'Books'. It has a read function which returns a list of book objects.
This is okay, but users and future developers on the project may find it unclear that you have both Book and Books, so be sure to document the API and comment the code. You also need to consider input filters to narrow down the results, or at least a way to page through them. Google once estimated there were 130 million books, so do you really want to fetch all of them at once?
e.g. SERVER/Books/?skipRecords=0&limit=100
2.) The model 'Book' itself has a readAll function which loads all books.
This is not ideal, as it violates the single responsibility principle. Mostly, it just doesn't make much sense in an OO design for a single book entity to be able to list all of its sibling entities.
e.g. SERVER/TheHobbit/readAll.... yuck.
3.) The model 'Book' is non-functional, it only has properties. Instead a 'BookStorage' class loads the data and fills one/multiple models.
If you can expose these functional extensions via the API, then that is a fine solution as well. It all comes down to documentation.
Maybe it ends up looking like this
e.g.
SERVER/BookStorage/GetAllBooks?skipRecords=0&limit=100
SERVER/BookStorage/GetBook?title=TheHobbit

3.) The model 'Book' is non-functional, it only has properties. Instead a 'BookStorage' class loads the data and fills one/multiple models.
This approach is similar to the Repository Pattern and is very common. The BookStorage would be a BookRepository with an interface like:
public interface IBookRepository
{
    Book GetById(int id);
    IEnumerable<Book> GetAll();
    // Other methods for creating, updating and deleting books.
}
In your controller you would then have:
public class BooksController : ApiController
{
    private readonly IBookRepository _bookRepository;

    public BooksController(IBookRepository bookRepository)
    {
        _bookRepository = bookRepository;
    }

    [Route("api/books/{id}")]
    public IHttpActionResult Get(int id)
    {
        var book = _bookRepository.GetById(id);
        // ...
    }

    [Route("api/books")]
    public IHttpActionResult Get()
    {
        var books = _bookRepository.GetAll();
        // ...
    }
}
If you want paging, you could add a parameter such as Get(int page = 0) and then, depending on your page size, use something like _bookRepository.GetAll().Skip(PAGE_SIZE * page).Take(PAGE_SIZE).
A model should not load itself, as that would violate the single responsibility principle.
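The Skip/Take paging mentioned above can be sketched as a small helper. The Paging class and the GetPage name are assumptions made for illustration; only Skip, Take and ToList are real LINQ calls. Pages are zero-based, matching Get(int page = 0).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Returns the zero-based page `page` of `source`, with `pageSize` items per page.
    public static List<T> GetPage<T>(IEnumerable<T> source, int page, int pageSize)
    {
        if (page < 0) throw new ArgumentOutOfRangeException(nameof(page));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        // Skip the items of all earlier pages, then take one page worth of items.
        return source.Skip(pageSize * page).Take(pageSize).ToList();
    }
}
```

For example, Paging.GetPage(Enumerable.Range(1, 10), 1, 3) yields 4, 5, 6. Note that paging an IEnumerable returned by GetAll() still loads every book into memory first; with a real data store you would probably want the repository to apply the skip and limit in the query itself.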

A great resource when designing RESTful webservices is Martin Fowler's article Richardson Maturity Model.
To summarise for your book case:
use POST /book to create a book
use PUT /book/:id to update a book
use GET /book?author=Shakespeare&year=1602&someOtherParam=someValue to find books
use GET /book/:id to retrieve the details of a particular book
use DELETE /book/:id to delete a certain book
You might also want to follow the HATEOAS principles: include all relevant links for a book in a links property, so that clients of your API do not need to build their own URLs when they want to add, edit, find or delete a book.
Although simple at first sight, designing a good RESTful web service is not that easy. HATEOAS helps, because URL schema changes on the server side don't affect clients that don't hardcode the URLs needed for the CRUD operations. All you need to do is provide a starting point, e.g. a base URL from which all content can be discovered, and clients can navigate through your web service from there.
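A minimal sketch of such a links property follows. The BookResource and BookResourceFactory classes, the relation names ("self", "update", "delete", "collection") and the URL layout are all assumptions chosen to match the /book routes listed above; they are not taken from any particular framework or standard.

```csharp
using System.Collections.Generic;

public class BookResource
{
    public int Id { get; set; }
    public string Title { get; set; }

    // Link relation name -> URL; clients follow these instead of building URLs.
    public Dictionary<string, string> Links { get; } = new Dictionary<string, string>();
}

public static class BookResourceFactory
{
    // Builds a book representation with links rooted at the given base URL.
    public static BookResource Create(int id, string title, string baseUrl)
    {
        var root = baseUrl.TrimEnd('/');
        var bookUrl = root + "/book/" + id;

        var resource = new BookResource { Id = id, Title = title };
        resource.Links["self"] = bookUrl;            // GET
        resource.Links["update"] = bookUrl;          // PUT
        resource.Links["delete"] = bookUrl;          // DELETE
        resource.Links["collection"] = root + "/book"; // GET (search) or POST (create)
        return resource;
    }
}
```

If the server later moves books to a different path, only the factory changes; a client that reads Links["self"] keeps working.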

Read the CRUD Operations section here. You will find REST best practices to be followed when designing a REST API

Document design with multiple embedded documents

I have a Schema question regarding MongoDB. I have a User table with 6 different related entities.
public class Profile
{
    public List<Entity1> Entity1Items { get; set; }
    public List<Entity2> Entity2Items { get; set; }
    public List<Entity3> Entity3Items { get; set; }
    public List<Entity4> Entity4Items { get; set; }
    public List<Entity5> Entity5Items { get; set; }
    public List<Entity6> Entity6Items { get; set; }
}
When I show the profile page, I have to show all the data related to the profile. After reading MongoDB tutorials, my initial design was to embed all six entity lists inside the Profile document, but I am concerned that it may exceed the maximum document size. So currently I have 6 separate collections, and each collection entity has an indexed ProfileId in it. On the Profile view, I make 6 different database calls based on ProfileId and show all the results.
public class Entity1
{
    public int ProfileId { get; set; }
    ......
    ........
}
Is this acceptable ?
Thanks !

As of MongoDB 2.4, the maximum document size is 16MB, which is quite a lot as long as you are not storing BLOBs or similar. So if you always want to retrieve the entire profile, embedding all the lists is definitely your first choice.
I don't know your use case, but in my experience apps built on top of MongoDB typically become slow because of too many queries, in particular if you're working with a remote database. Remember that MongoDB does not support joins, so accessing 7 collections really means 7 round-trips!
Hence, I would start with the embedded solution and do a bit of document size measuring from time to time in order to check the size. If 16MB is really not enough, you will probably have a single entity list growing too large - in that case, I would only extract this single list to its own collection.
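The occasional size check suggested above could look roughly like this. This is only a sketch under a loud assumption: it measures the UTF-8 JSON size of an object with System.Text.Json, which is not the same as its BSON size in MongoDB, so the number is an estimate. DocumentSizeCheck, its method names and the 80% warning threshold are hypothetical.

```csharp
using System.Text.Json;

public static class DocumentSizeCheck
{
    // The 16MB document limit mentioned above.
    public const int MaxDocumentBytes = 16 * 1024 * 1024;

    // Rough estimate: byte length of the UTF-8 JSON form (NOT the exact BSON size).
    public static int ApproximateSizeInBytes<T>(T document)
    {
        return JsonSerializer.SerializeToUtf8Bytes(document).Length;
    }

    // True when the estimate reaches the given fraction of the limit.
    public static bool IsNearLimit<T>(T document, double threshold = 0.8)
    {
        return ApproximateSizeInBytes(document) >= MaxDocumentBytes * threshold;
    }
}
```

Running such a check on the largest profiles from time to time would show which embedded list, if any, is growing toward the limit and is a candidate for its own collection.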
If you want to have maximum flexibility like being able to switch easily while you are evaluating your document sizes, you could additionally store your data to the 6 other entity collections as you are doing now, but without ever reading them. If you have to switch later on, you simply change the corresponding queries and delete the embedded fields from the Profile collection.

EF with Azure - Mixing SQL Server and Windows Azure Storage

I want to use two different data sources in my Azure project:
a SQL Server database that contains basic, partial info about an item (allows indexable data and spatial search)
a Windows Azure Storage account that contains the full remaining info about an item (retrieved by key)
This way I can combine the power of SQL Server with the easy scalability of Windows Azure Storage.
Imagine this domain POCO class:
class Person
{
    string Id { get; set; }
    string Name { get; set; }
    byte[] Picture { get; set; }
    string Biography { get; set; }
}
I would like to use Entity Framework with fluent mapping to let EF understand that the properties Picture and Biography must be loaded from Windows Azure Storage (table, blob) instead of SQL Server (possibly lazily loaded).
Is there a way to do this with EF (or NHibernate), or do I have to implement my own ORM strategy?
Thanks

I don't think you can let EF know about Azure storage but you can map only necessary properties to a specific table. For example,
modelBuilder.Entity<Person>().Ignore(p => p.Picture);
So, assuming that you have a repository class for your Person class, what you want can be achieved fairly easily by having the repository use both EF and the Azure storage API internally.
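A minimal sketch of such a repository, under stated assumptions: IPersonIndex stands in for the EF-backed SQL Server lookup and IBlobStore for the Azure Storage access. Both interfaces, their methods, and the use of Lazy<byte[]> for the picture are hypothetical; no real EF or Azure SDK call is shown.

```csharp
using System;

public class Person
{
    public string Id { get; set; }
    public string Name { get; set; }

    // Loaded from blob storage only when .Value is first read.
    public Lazy<byte[]> Picture { get; set; }
}

// Hypothetical abstraction over the EF/SQL Server side.
public interface IPersonIndex
{
    string FindName(string id);
}

// Hypothetical abstraction over the Azure Storage side.
public interface IBlobStore
{
    byte[] Load(string key);
}

public class PersonRepository
{
    private readonly IPersonIndex _index;
    private readonly IBlobStore _blobs;

    public PersonRepository(IPersonIndex index, IBlobStore blobs)
    {
        _index = index;
        _blobs = blobs;
    }

    public Person GetById(string id)
    {
        return new Person
        {
            Id = id,
            Name = _index.FindName(id),                    // cheap, indexed data
            Picture = new Lazy<byte[]>(() => _blobs.Load(id)) // deferred, large data
        };
    }
}
```

Callers that never touch Picture.Value never hit the blob store, which gives the lazy-loading behavior without EF knowing anything about Azure storage.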

You're trying to solve this problem too early (at the DAL) in my opinion. Look at the web, it fetches large data (e.g. pictures) in a separate call to the server. That has scaled very well. The picture data is not included in the document itself for a reason, it would just slow everything down and it would not be very fault tolerant. If you put them together in one entity you've got the fast entity retrieval that is slowed down by your picture server as they both have to come together before leaving towards your business layer and finally towards the presentation layer. And in the business layer this data is probably just wasting memory (that's why you want to lazy load it). So I think you're making the decision too early. What you describe as your domain object looks like a domain object of the presentation layer to me, similar to a ViewModel. I'm not too big into domain driven design, but while there is a general model of your application, I assume that each part of your application will require a slightly different implementation of that model.
Regarding lazy loading, if you have that enabled and you attempt to send your object over the wire, even if Picture was not loaded, it will get serialized since the data contract serializer (or any other) will call get on your property.
That's probably not the answer you wanted, but I felt that I had to say this. Of course I am open to comments and criticism.

How to do role-based access control for a franchise business?

I'm building the 2nd iteration of a web-based CRM+CMS for a franchise service business in ASP.NET MVC 2. I need to control access to each franchise's services based on the roles a user is assigned for that franchise.
4 examples:
Receptionist should be able to book service jobs in for her "Atlantic Seaboard" franchise, but not do any reporting.
Technician should be able to alter service jobs, but not modify invoices.
Managers should be able to apply discount to invoices for jobs within their stores.
Owner should be able to pull reports for any franchises he owns.
Where should franchise-level access control fit in between the Data - Services - Web layer?
If it belongs in my Controllers, how should I best implement it?
Partial Schema
Roles class
    int ID { get; set; } // primary key for Role
    string Name { get; set; }
Partial Franchises class
    short ID { get; set; } // primary key for Franchise
    string Slug { get; set; } // unique key for URL access, eg /{franchise}/{job}
    string Name { get; set; }
UserRoles mapping
    short FranchiseID; // related to franchises table
    Guid UserID; // related to Users table
    int RoleID; // related to Roles table
    DateTime ValidFrom;
    DateTime ValidUntil;
Controller Implementation
Access Control with [Authorize] attribute
If there was just one franchise involved, I could simply limit access to a controller action like so:
[Authorize(Roles="Receptionist, Technician, Manager, Owner")]
public ActionResult CreateJob(Job job)
{
    ...
}
And since franchises don't just pop up over night, perhaps this is a strong case to use the new Areas feature in ASP.NET MVC 2? Or would this lead to duplicate Views?
Controllers, URL Routing & Areas
Assuming Areas aren't used, what would be the best way to determine which franchise's data is being accessed? I thought of this:
{franchise}/{controller}/{action}/{id}
or is it better to determine a job's franchise in a Details(...) action and limit a user's action with [Authorize]:
{job}/{id}/{action}/{subaction}
{invoice}/{id}/{action}/{subaction}
which makes more sense if any user could potentially have access to more than one franchise without cluttering the URL with a {franchise} parameter.
Any input is appreciated.
Edit:
Background
I built the previous CRM in classic ASP and it runs the business well, but it's time for an upgrade to speed up workflow and leave less room for error. For the sake of proper testing and better separation between data and presentation, I decided to implement the repository pattern as seen in Rob Conery's MVC Storefront series.
How to arrange services and repositories?
It makes sense to have a JobService that retrieves any service jobs based on available filters, eg. IQueryable<Job> GetJobs();. But since a job can only belong to one franchise, a function like IQueryable<Job> GetJobs(int franchiseID); could belong in either FranchiseService or in JobService. Should FranchiseService act as a CatalogService (like in MVC Storefront)?

Let me take a stab at answering this. I am in the process of playing with a sample app that touches some of the aspects mentioned. This is not an authoritative answer, merely experience.
Where should franchise-level access control fit in between the Data - Services - Web layer?
These access restrictions should permeate your application at two levels: 1) the database and 2) the application layer. In an MVC context I would suggest creating a custom Authorization attribute, which handles the security between the Web and Services layers. I would have this attribute do two things:
Get the current roles allowed for the user (either from the DB, or they may be stored in the user session)
Check whether the user is part of the allowed list of roles.
With regard to the database, this depends on how you are storing the data: one database for all franchises, or a database per franchise. In the first case there are several ways to set up access restrictions that limit data to a particular franchise.
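The role check such a custom attribute would perform can be sketched independently of MVC. The UserRole class mirrors the UserRoles mapping from the question; FranchiseAccess, IsAllowed and its parameters are hypothetical, and the check is written against an in-memory collection rather than any particular data access API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Mirrors the UserRoles mapping from the question.
public class UserRole
{
    public short FranchiseID { get; set; }
    public Guid UserID { get; set; }
    public int RoleID { get; set; }
    public DateTime ValidFrom { get; set; }
    public DateTime ValidUntil { get; set; }
}

public static class FranchiseAccess
{
    // True if the user currently holds at least one of the allowed roles
    // for the given franchise.
    public static bool IsAllowed(
        IEnumerable<UserRole> userRoles,
        Guid userId,
        short franchiseId,
        ICollection<int> allowedRoleIds,
        DateTime now)
    {
        return userRoles.Any(r =>
            r.UserID == userId &&
            r.FranchiseID == franchiseId &&
            r.ValidFrom <= now && now <= r.ValidUntil &&
            allowedRoleIds.Contains(r.RoleID));
    }
}
```

A custom authorization attribute could load the user's roles (from the database or the session), resolve the franchise from the route, and deny the request when this check returns false.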
Since franchises don't just pop up over night, perhaps this is a strong case to use the new Areas feature in ASP.NET MVC 2? Or would this lead to duplicate Views?
I think that Areas should be used to split and group functionality. If you were to use Areas to split franchises, this is where I see a duplication of views, controllers etc. occurring. Duplicate views can be overcome by using a custom view engine to override the way MVC locates your views. Plug: See my answer to ASP.NET MVC: customized design per domain
Assuming Areas aren't used, what would be the best way to determine which franchise's data is being accessed?
As mentioned above, you could use the user's session to store basic information such as the franchise the user belongs to and the roles assigned. I think the rule I read somewhere goes along the lines of "Secure your actions, not your controllers".
Create your routes for the norm and not for the exception, e.g. is there currently a business case that says a user can have access to more than one franchise?
How to arrange services and repositories?
Have a set of base services or base classes that contain all the information required for a particular franchise, such as the franchiseId. The main issue this resolves is that your service methods are cleaner without a franchiseId argument. The repository, however, may need this value, since at some point you need to disambiguate the data you are requesting or storing (assuming one db for all franchises). However, you could overcome some of this using IoC.
The downside I see is that there will always be calls to the database every time your objects are created (i.e. if the franchise route were to be used, you would need to go to the database to obtain the corresponding franchiseId every time you create a service object). I might be mistaken on this one, since IoC containers do have some lifestyle options that may help prevent this. You could also have a list of franchises that is created on application start and that you could use to map your route values to the correct information. This part of the answer is scattered, but the main thing is that IoC will help you decouple a lot of dependencies.
Hope this helps.
