cassandra database design - nosql

Consider the following situation:
I have a page that will have the following fields:
pageid, title, content, like, follow, field1, field2..., field100, pagecomments, images
Like and follow are counter fields that increase on each click.
I am thinking of designing this in Cassandra in one of the following ways:
**TYPE A** page_table {
page_id,
title,
content,
like,
follow,
posted_by,
datetime,
image1,
image2,
field1,
field2...,
field100
}
page_comments {
commentid,
page_id,
text,
comment_like,
posted_by,
datetime
}
**TYPE B** page_table {
page_id,
title,
content,
posted_by,
datetime,
image1,
image2,
field1,
field2...,
field100
}
page_like {
page_id,
like
}
page_follow {
page_id,
follow
}
page_comments {
commentid,
page_id,
text,
comment_like,
posted_by,
datetime
}
Which one is the better design? Or can you suggest a good Cassandra database design for this, using CQL?

You may want to read up on some NoSQL patterns:
https://github.com/deanhiller/playorm/wiki/Patterns-Page
If you are going to get all the comments for a page, I don't see any FKs to the comments, which you will need in page_table. That brings to light that the patterns page is missing a pattern, which I will add: a toMany relationship in NoSQL is frequently embedded in the row rather than looked up through an index. So you see this in a lot of designs:
page_table {
page_id,
title,
content,
like,
follow,
posted_by,
datetime,
image1,
image2,
fktocomment1,
fktocomment2,
fktocomment3
}
What is typically done is that the fktocomment1 column name is prefixed with the word "comment", so you can find all the FKs by selecting the columns that start with "comment", stripping off that prefix, and using the key that remains (the column has NO value!).
It is the composite name pattern, which you can google.
EDIT: I added that pattern to the patterns page; it is so common that I never thought to add it before.
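The composite name pattern can be sketched in a few lines of Python. This is only an illustration, assuming a row is represented as a plain dict of column name to value. The "comment" prefix comes from the answer above, while the function name comment_ids and the sample ids are hypothetical.

```python
COMMENT_PREFIX = "comment"  # prefix described in the answer


def comment_ids(row):
    """Return the comment keys embedded in a wide row's column names."""
    return [
        name[len(COMMENT_PREFIX):]
        for name in row
        if name.startswith(COMMENT_PREFIX)
    ]


# Hypothetical page row: regular columns carry values,
# "comment<id>" columns carry no value at all.
page_row = {
    "title": "My page",
    "content": "Hello",
    "comment1001": None,
    "comment1002": None,
}

print(comment_ids(page_row))
```

The ids recovered this way would then be used to read the matching rows from page_comments.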


Implement search functionality with Entity Framework

I have three tables
SalesDetails with columns SalesId, ProductId, Qty, Price etc
SalesPersonDtls with columns SalesId, SalesPersonId, CommPercentage etc
SalesPerson with columns SalesPersonId, firstName, lastName etc
I have the second table because one sale can be made by more than one salesperson together, with a split commission.
I have various inputs on the search screen, such as product name, sales date, sales person name, etc.
I take the model class 'AsQueryable', add various where conditions, and finally put the result into a list.
I have sales person's name in search criteria but I don't know how to include this into the search. Can you please help?
Thanks
Peter
If I understand correctly, the relations in your business model look like this:
person (n) <-----> (1) Sale (1) <-----> (n) Details
You put the sale-person relation in "SalesPersonDtls" and the sale-detail relation in "SalesDetails". I think it's better to change your entities a little if you want better results as your project gets bigger and more complex.
Your entities should look like this:
Sale
{
List<SalesDetail> details;
List<Person> persons;
...
}
SalesDetail
{
Sale
...
}
Person
{
Sale
name
...
}
Now it's really simple if you want the sales related to a person's name:
sales.Where(sale => sale.persons.Any(person => person.name == "your input name"));
UPDATE:
If you can't or don't want to change your models:
First find the personId by its name, then search "SalesPersonDtls" for that personId to get the saleIds.
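The two-step lookup can be sketched as follows. This is a rough illustration in Python with an in-memory SQLite database rather than Entity Framework. The table and column names come from the question, while the sample rows and the helper sale_ids_for_person are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesPerson (SalesPersonId INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT);
    CREATE TABLE SalesPersonDtls (SalesId INTEGER, SalesPersonId INTEGER, CommPercentage REAL);
    -- Hypothetical sample data: sale 10 is split between two persons.
    INSERT INTO SalesPerson VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO SalesPersonDtls VALUES (10, 1, 50.0), (10, 2, 50.0), (11, 2, 100.0);
""")


def sale_ids_for_person(conn, first_name):
    # Step 1: resolve the name to person ids.
    person_ids = [r[0] for r in conn.execute(
        "SELECT SalesPersonId FROM SalesPerson WHERE firstName = ?", (first_name,))]
    if not person_ids:
        return []
    # Step 2: collect the sales those persons took part in.
    placeholders = ",".join("?" * len(person_ids))
    rows = conn.execute(
        "SELECT DISTINCT SalesId FROM SalesPersonDtls "
        f"WHERE SalesPersonId IN ({placeholders}) ORDER BY SalesId",
        person_ids)
    return [r[0] for r in rows]


print(sale_ids_for_person(conn, "Bob"))
```

The resulting sale ids could then serve as one more where condition on the SalesDetails query.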

Postgresql json select from values in second layer of containment of arrays

I have a jsonb column 'data' that contains a tree like json, example:
{
"libraries":[
{
"books":[
{
"name":"mybook",
"type":"fiction"
},
{
"name":"yourbook",
"type":"comedy"
},
{
"name":"hisbook",
"type":"fiction"
}
]
}
]
}
I want to run an index-backed query that selects a value from the nested "books" objects according to their type,
for example all book names whose type is fiction.
I was able to do this with jsonb_array_elements in a join query, but as I understand it, this would not be optimized by a GIN index.
My query is:
select books->'name'
from data,
jsonb_array_elements(data->'libraries') libraries,
jsonb_array_elements(libraries->'books') books
where books->>'type'='fiction'
If the example data you are showing is typical of your JSON, I would suggest that you may be setting things up wrong.
Why not make a library table and a book table and not use JSON at all? JSON does not seem to be the right choice here.
CREATE TABLE library
(
id serial,
name text
);
CREATE TABLE book
(
isbn BIGINT,
name text,
book_type text
);
CREATE TABLE library_books
(
library_id integer,
isbn BIGINT
);
select book.*
from library_books
join book on book.isbn = library_books.isbn
where library_books.library_id = 1;
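To show how the original "all fiction book names" question looks against these tables, here is a small sketch. It assumes SQLite driven from Python instead of PostgreSQL, and the ISBN values, the index name book_type_idx and the helper book_names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE library (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (isbn INTEGER PRIMARY KEY, name TEXT, book_type TEXT);
    CREATE TABLE library_books (library_id INTEGER, isbn INTEGER);
    -- Sample rows mirroring the JSON in the question; the ISBNs are made up.
    INSERT INTO library VALUES (1, 'main');
    INSERT INTO book VALUES (100, 'mybook', 'fiction'),
                            (101, 'yourbook', 'comedy'),
                            (102, 'hisbook', 'fiction');
    INSERT INTO library_books VALUES (1, 100), (1, 101), (1, 102);
    -- An ordinary index on the type column replaces the GIN index on JSON.
    CREATE INDEX book_type_idx ON book (book_type);
""")


def book_names(conn, library_id, book_type):
    """Names of the books of one type held by one library."""
    rows = conn.execute(
        """SELECT book.name
           FROM library_books
           JOIN book ON book.isbn = library_books.isbn
           WHERE library_books.library_id = ? AND book.book_type = ?
           ORDER BY book.isbn""",
        (library_id, book_type))
    return [r[0] for r in rows]


print(book_names(conn, 1, "fiction"))
```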

Querying Laravel Relationship

I have been trying to get one query to work since this morning without success. I have two tables, photographers and reviews. Please have a look at the structure; the question is at the bottom:
Reviews table :
id int(10) unsigned -> primary key
review text
user_id int(10) unsigned foreign key to users table
user_name varchar(64)
photographer_id int(10) unsigned foreign key to photographers table
Photographers table :
id int(10) unsigned -> primary key
name text
brand text
description text
photo text
logo text
featured varchar(255)
Photographers model :
class Photographer extends Model
{
public function reviews()
{
return $this->hasMany('\App\Review');
}
}
Reviews Model :
class Review extends Model
{
public function photographers()
{
return $this->belongsTo('\App\Photographer');
}
}
My logic to query the records
$response = Photographer::with(['reviews' => function($q)
{
$q->selectRaw('max(id) as id, review, user_id, user_name, photographer_id');
}])
->where('featured', '=', 'Yes')
->get();
The question is: I want to fetch all the photographers who have at least one review in the reviews table, and for each of them I want to fetch only one review, the latest one. A photographer may have more than one review, but I want only one.
I would add another relationship method to your Photographer class:
public function latestReview()
{
return $this->hasOne('App\Review')->latest();
}
Then you can call:
Photographer::has('latestReview')->with('latestReview')->get();
Notes:
The latest() method on the query builder is a shortcut for orderBy('created_at', 'desc'). You can override the column it uses by passing an argument - ->latest('updated_at')
The with method loads in the latest review.
The has method only queries photographers that have at least one item of the specified relationship
Have a look at Has Queries in Eloquent. If you want to customise the has query further, the whereHas method would be very useful
If you're interested
You can add query methods to the result of a relationship method. The relationship objects have a query builder object that they pass any methods that do not exist on themselves to, so you can use the relationships as a query builder for that relationship.
The advantage of adding query scopes / parameters within a relationship method on an Eloquent ORM model is that they are :
cacheable (see dynamic properties)
eager/lazy-loadable
has-queryable
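In plain SQL terms, combining has with a latest-review relationship gives roughly the result of an inner join restricted to each photographer's newest review. The sketch below is not what Eloquent executes internally. It assumes SQLite driven from Python, trimmed-down versions of the two tables, invented sample rows, and "latest" meaning the highest review id (as in the question's own max(id) attempt).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photographers (id INTEGER PRIMARY KEY, name TEXT, featured TEXT);
    CREATE TABLE reviews (id INTEGER PRIMARY KEY, review TEXT, photographer_id INTEGER);
    -- Hypothetical data: photographer 3 has no reviews at all.
    INSERT INTO photographers VALUES (1, 'Ann', 'Yes'), (2, 'Bob', 'Yes'), (3, 'Cy', 'Yes');
    INSERT INTO reviews VALUES (1, 'good', 1), (2, 'great', 1), (3, 'ok', 2);
""")

# The inner join drops photographers without reviews; the correlated
# subquery keeps only the newest review of each remaining photographer.
rows = conn.execute("""
    SELECT p.name, r.review
    FROM photographers p
    JOIN reviews r ON r.photographer_id = p.id
    WHERE r.id = (SELECT MAX(r2.id) FROM reviews r2 WHERE r2.photographer_id = p.id)
    ORDER BY p.id
""").fetchall()

print(rows)
```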
What you need is best accomplished by a scoped query on your reviews relation.
Add this to your Review model:
use Illuminate\Database\Eloquent\Builder;
use Illuminate\Database\Eloquent\Model;
class Review extends Model {
public function scopeLatest(Builder $query) {
// note: you can use the timestamp date for the last edited review,
// or use "id" instead. Both should work, but have different uses.
// A scope must return the builder, so only the ordering is added here.
return $query->orderBy("updated_at", "desc");
}
}
Then just query as such:
$photographers = Photographer::has("reviews")->get();
foreach ($photographers as $photographer) {
var_dump($photographer->reviews()->latest()->first());
}

How to model mongodb collections for Cassandra database (migration)?

I am new to Cassandra and am trying to migrate my app from MongoDB to Cassandra.
I have the following collections in MongoDB:
PhotoAlbums
[
{id: oid1, title:t1, auth: author1, tags: ['bob', 'fun'], photos: [pid1, pid2], views:200 }
{id: oid2, title:t2, auth: author2, tags: ['job', 'fun'], photos: [pid3, pid4], views: 300 }
{id: oid3, title:t3, auth: author3, tags: ['rob', 'fun'], photos: [pid2, pid4], views: 400 }
....
]
Photos
[
{id: pid1, cap:t1, auth: author1, path:p1, tags: ['bob','fun'], comments:40, views:2000, likes:0 }
{id: pid2, cap:t2, auth: author2, path:p2, tags: ['job','fun'], comments:50, views:50, likes:1, liker:[bob] }
{id: pid3, cap:t3, auth: author3, path:p3, tags: ['rob','fun'], comments:60, views: 6000, likes: 0 }
...
]
Comments
[
{id: oid1, photo_id: pid1, commenter: bob, text: photo is cool, likes: 1, likers: [john], replies: [{rep1}, {rep2}]}
{id: oid2, photo_id: pid1, commenter: bob, text: photo is nice, likes: 1, likers: [john], replies: [{rep1}, {rep2}]}
{id: oid3, photo_id: pid2, commenter: bob, text: photo is ok, likes: 2, likers: [john, bob], replies: [{rep1}]}
]
Queries:
Query 1: Show a list of popular albums (based on number of likes)
Query 2: Show a list of most discussed albums (based on number of comments)
Query 3: Show a list of all albums of a given author on user's page
Query 4: Show the album with all photos and all comments (pull album details, show photo thumbnails of all photos in the album, show all comments of selected photo)
Query 5: Show a list of related albums based on the tags of current album
Given the above schema and requirements, how should I model this in Cassandra?
As I have experience with both Cassandra and Mongo, I'll take a shot at this. The tricky part here is that MongoDB allows for very loose restrictions around indexing and querying. Cassandra has a stricter model in that respect, but one that should perform fast, at scale, if created correctly. Counting likes/views/comments on a photo or album can also get tricky, as you'll want to use Cassandra's counter type for that (which has its own challenges).
Disclaimer: Others may solve these problems differently. And I may choose to solve them differently if my first attempt didn't perform. But this is what I would start with.
To satisfy Query 3 I would create a query table called PhotoAlbumsByAuthor and query it like this:
CREATE TABLE PhotoAlbumsByAuthor (
photoalbumid uuid,
title text,
author text,
tags set<text>,
photos set<uuid>,
PRIMARY KEY(author,title,photoalbumid)
);
> SELECT * FROM photoalbumsbyauthor WHERE author='Malcolm Reynolds';
That will return all albums that the user Malcolm Reynolds has created, sorted by title (as title is the first clustering key).
For Query 4 I would create comments as a user defined type (UDT):
CREATE TYPE yourkeyspacename.comment (
commenter text,
commenttext text
);
Then I would create a query table called PhotosByAlbum and query it like this:
CREATE TABLE PhotosByAlbum (
photoalbumid uuid,
photoid uuid,
cap text,
auth text,
path text,
tags set<text>,
comments map<uuid,frozen <comment>>,
PRIMARY KEY(photoalbumid,photoid)
);
> SELECT * FROM PhotosByAlbum WHERE photoalbumid=a50aa80a-8714-44b4-9b97-43ec4b13daa6;
When you add a comment to this table, the uuid key of the map is the commentid. This way you can quickly grab all of the keys and/or values on your application side. In any case, this will return all photos for a given photoalbumid, along with any comments.
I would solve Query 5 in a similar way, by creating a query table (you should be noticing a pattern by now) called PhotoAlbumsByTag and query it like this:
CREATE TABLE PhotoAlbumsByTag (
tag text,
photoalbumid uuid,
title text,
author text,
photos set<uuid>,
PRIMARY KEY(tag,title,photoalbumid)
);
> SELECT * FROM PhotoAlbumsByTag WHERE tag='family';
This will return all photo albums with the "family" tag. Note, that this is a denormalized structure of the tags set<text> used above, which means that a photo album will have one entry in this table for each tag it contains. I thought about possibly reusing one of the prior query tables with a secondary index on tags set<text> (as Cassandra now allows indexes on collections) but secondary indexes don't typically perform well. And you would still have to execute a query for each tag in the current album anyway (using a SELECT with the IN keyword is known to not perform well, either).
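The "one entry per tag" write path can be sketched on the application side like this. It is a minimal illustration in Python that assumes albums are plain dicts. The column names follow the PhotoAlbumsByTag table, while rows_by_tag and the sample album are hypothetical, and the actual INSERT statements are left out.

```python
def rows_by_tag(album):
    """Expand one album into one PhotoAlbumsByTag row per tag."""
    return [
        {
            "tag": tag,
            "photoalbumid": album["photoalbumid"],
            "title": album["title"],
            "author": album["author"],
        }
        for tag in sorted(album["tags"])  # sorted only to make the output stable
    ]


# Hypothetical album carrying two tags.
album = {
    "photoalbumid": "album-1",
    "title": "Holiday",
    "author": "Malcolm Reynolds",
    "tags": {"family", "fun"},
}

for row in rows_by_tag(album):
    print(row["tag"], row["photoalbumid"])
```

Each resulting row would be written to PhotoAlbumsByTag, so an album with two tags occupies two partitions.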
As for the first two queries, I would create specific tables to store the likes/views/comments counts like this:
CREATE TABLE PhotoCounters (
photoid uuid,
views counter,
comments counter,
likes counter,
PRIMARY KEY (photoid)
);
When using the counter type, Cassandra requires that the primary key and counters be the only columns in that table (can't mix counters with non-counter columns). And I would also process queries/reports on those offline, in an OLAP fashion, using Hadoop or Spark. Hope this helps.

Laravel and Eloquent: Specifying columns in when retrieving related items

This is a followup post to: Laravel 4 and Eloquent: retrieving all records and all related records
The solution given works great:
$artists = Artist::with('instruments')->get();
return \View::make('artists')->withArtists($artists);
It also works with just:
$artists = Artist::get();
Now I'm trying to specify the exact columns to return for both tables. I've tried using select() in both the statement above and in my Class, like this:
ArtistController.php
$artists = Artist::select('firstname', 'lastname', 'instruments.name')->get();
or:
$artists = Artist::with(array('instruments' => function($query) {
$query->select('name');
}))->get();
(as suggested here; while this doesn't throw an error, it also doesn't limit the columns to only those specified)
or in Artist.php:
return $this->belongsToMany('App\Models\Instrument')->select(['name']);
How would I go about getting just the firstname and lastname column from the artists table and the name column from instruments table?
Not sure what I was thinking. I think working on this so long got me cross-eyed.
Anyhow, I looked into this a lot more and searched for answers and finally posted an issue on GitHub.
The bottom line is this is not possible as of Laravel v4.1.
https://github.com/laravel/laravel/issues/2679
This solved it:
Artists.php
public function instruments() {
return $this->hasMany('App\Models\Instrument', 'id');
}
Note that I changed this from a belongsToMany to a hasMany, which makes more sense to me, as a musician (or Artist) would have many Instruments they play and an Instrument could belong to many Artists (which I also alluded to in my previous questions referenced above). I also had to specify the 'id' column in my model, which tells the ORM that instrument.id matches artist_instrument.id. That part confuses me a bit because I thought the order for hasMany was foreign_key, primary_key, but maybe I'm thinking about it backwards. If someone can explain that a bit more, I'd appreciate it.
Anyhow, the second part of the solution...
In ArtistsController.php, I did this:
$artists = Artist::with(array(
'instruments' => function($q) {
$q->select('instruments.id', 'name');
})
)->get(array('id', 'firstname', 'lastname'));
That gives me exactly what I want, which is a collection of Artists that contains only the firstname and lastname columns from the artists table and the name column for each of the instruments they play from the instruments table.
$artists = Artist::with(array('instruments' => function ($query) {
$query->select('id', 'name');
}))->get(array('id', 'firstname', 'lastname'));