Person (p_no, p_name, p_addr)
Investment (inv_no, inv_name, inv_date, inv_amt)
Write a function that returns the name of the person with the largest total investment amount.
How can this query be solved using an RDBMS?
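A minimal sketch of one way to approach this, in Python with the standard sqlite3 module. The schema as posted has no column linking Investment to Person, so the p_no column on Investment is an assumption, as are the sample rows and the function name top_investor.

```python
import sqlite3

def top_investor(conn):
    """Return the name of the person with the largest total investment, or None."""
    row = conn.execute(
        """
        SELECT p.p_name
        FROM Person AS p
        JOIN Investment AS i ON i.p_no = p.p_no  -- assumed link column
        GROUP BY p.p_no, p.p_name
        ORDER BY SUM(i.inv_amt) DESC
        LIMIT 1
        """
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (p_no INTEGER PRIMARY KEY, p_name TEXT, p_addr TEXT);
CREATE TABLE Investment (
    inv_no   INTEGER PRIMARY KEY,
    p_no     INTEGER REFERENCES Person(p_no),  -- assumption, not in the posted schema
    inv_name TEXT, inv_date TEXT, inv_amt REAL
);
-- Hypothetical sample data
INSERT INTO Person VALUES (1, 'Asha', 'Pune'), (2, 'Ravi', 'Delhi');
INSERT INTO Investment VALUES
    (10, 1, 'Bonds',  '2020-01-01', 500),
    (11, 2, 'Shares', '2020-02-01', 300),
    (12, 2, 'Gold',   '2020-03-01', 400);
""")

print(top_investor(conn))
```

If two people tie for the largest total, this returns only one of them; returning all ties would need a comparison against the maximum total instead of LIMIT 1.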
I am developing an API and am unsure of the most efficient way to handle a join query.
I want to join data from two tables and return it in the response. I can either query the database with a join and return the result, or fire two separate queries, perform the join in the API on the fly, and return the response. Which approach is more efficient and correct?
Databases are generally much faster at joining than application code that queries the tables separately and joins the results as class instances. Always do joins in the database and map the results in code. Also consider lazy loading where possible, because in a situation like the one below:
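import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;
import javax.persistence.*; // jakarta.persistence.* on newer JPA versions

@Entity
@Table(name = "USER")
public class UserLazy implements Serializable {

    @Id
    @GeneratedValue
    @Column(name = "USER_ID")
    private Long userId;

    // Loaded only when first accessed, not with the initial query
    @OneToMany(fetch = FetchType.LAZY, mappedBy = "user")
    private Set<OrderDetail> orderDetail = new HashSet<>();

    // standard setters and getters
    // also override equals and hashcode
}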
you might not want the order details loaded along with the initial results.
Usually it's more efficient to do the join in the database, but there are some corner cases, mostly due to the fact that application CPU time is cheaper than database CPU time. Here are a few examples that come to mind, with a query like "table A join table B":
B is a small table that rarely changes.
In this case it can be profitable to cache the contents of this table in the application, and not query it at all.
Rows in A are quite large, and many rows of B are selected for each row of A.
This causes needless network traffic and load, because each row from A is repeated in every result row it joins to.
Rows in B are quite large, and there are few distinct b_id's in A
Same as above, except this time the same few rows from B are duplicated in the result set.
In the previous two examples, it could be useful to perform the query on table A, then gather a set of unique b_id's from the result, and SELECT FROM b WHERE b_id IN (list).
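The two-step approach described above can be sketched as follows, using Python's standard sqlite3 module. The table layout, the payload column, and the sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE b (b_id INTEGER PRIMARY KEY, payload TEXT);
CREATE TABLE a (a_id INTEGER PRIMARY KEY, b_id INTEGER REFERENCES b(b_id));
-- Hypothetical sample data: many rows of a share a few rows of b
INSERT INTO b VALUES (1, 'large row one'), (2, 'large row two'), (3, 'unused');
INSERT INTO a VALUES (10, 1), (11, 1), (12, 2), (13, 1);
""")

# Step 1: query table A on its own.
a_rows = conn.execute("SELECT a_id, b_id FROM a ORDER BY a_id").fetchall()

# Step 2: gather the set of unique b_id values from the result.
b_ids = sorted({b_id for _, b_id in a_rows})

# Step 3: fetch each needed row of B exactly once, with bound parameters.
placeholders = ",".join("?" * len(b_ids))
b_by_id = dict(
    conn.execute(
        f"SELECT b_id, payload FROM b WHERE b_id IN ({placeholders})", b_ids
    )
)

# Step 4: do the join in the application.
joined = [(a_id, b_by_id[b_id]) for a_id, b_id in a_rows]
print(joined)
```

Each large row of b crosses the wire once instead of once per matching row of a, at the cost of a second round trip.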
Data structure and ORMs
If each table contains a different object type, and they have a "belongs to" relationship (like category and product) and you use an ORM which will instantiate objects for each selected row, then perhaps you only want one instance of each category, and not one per selected product. In this case, you could select the products, gather a list of unique category_ids, and select the categories from there. The ORM may even do that for you behind the scenes.
Complicated aggregates
Sometimes you want some rows plus aggregates of other data related to those rows, but it just won't fit in a single neat GROUP BY, or you need several of them.
So, usually the join works better in the database, and that should be the default. If you do it in the application, you should know why you're doing it and decide whether it's a good reason. If it is, then fine. The reasons I gave, based on performance, data model, and SQL constraints, are only examples, of course.
The current scenario is this:
There is a category table [category] with just over 50 records; it almost never grows and is rarely modified.
There is a product table [product] that currently holds millions of records and will keep growing.
The two have a many-to-many relationship: one category has many products, and each product can belong to multiple categories.
The category list almost never changes. There are about 1000 products in a category, and the product list of a category changes infrequently.
Query requirements:
Query all categories (excluding the list of products under the category)
Query the category list by product_id
Query the product list by category_id
Operational requirements:
Modify the product list in a category (add a product to or delete a product from a category, and sort the product list within a category, so the product list in a category must be ordered).
Which many-to-many design suits this scenario best? Here are the options I have considered:
1. Follow the usual SQL database design and add a Category<-->Product relation table.
[Question] The order of the products in each category is hard to maintain. For example, if the front end makes a large-scale reordering of the products in a category and then submits it, the Category<-->Product relation table needs an additional index field to indicate the order, and many records have to be updated. That is not particularly friendly to the operational requirements. Is there anything that can be optimized?
2. The NoSQL way: add a products:[] array directly to the category to hold the list of products in that category.
[Evaluation] The query requirements include querying all categories (excluding the list of products under each category), and this design would pull out a lot of unnecessary data (the products) each time. Not applicable.
3. Add products:[] to the Category<-->Product association table.
[Question] This can meet the operational requirements, but how would query requirement 2 (query the category list by product_id) be done, and would there be performance problems?
You need a third table (a junction table) to complete the relationship. The two key columns together form its primary key, and each one is also a foreign key to its parent table.
tblProductCategories
PK product_id FK
PK category_id FK
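A rough sketch of that junction table using Python's standard sqlite3 module. The position column, the idx_cat_pos index, the helper function names, and the sample rows are assumptions added to illustrate the ordering requirement from the question; they are not part of the layout above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tblProductCategories (
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    category_id INTEGER NOT NULL REFERENCES category(category_id),
    position    INTEGER NOT NULL,  -- hypothetical ordering column
    PRIMARY KEY (product_id, category_id)
);
-- Supports "products of a category, in order"
CREATE INDEX idx_cat_pos ON tblProductCategories (category_id, position);

-- Hypothetical sample data
INSERT INTO category VALUES (1, 'toys'), (2, 'gifts');
INSERT INTO product  VALUES (100, 'kite'), (101, 'puzzle');
INSERT INTO tblProductCategories VALUES (100, 1, 2), (101, 1, 1), (101, 2, 1);
""")

def categories_of(product_id):
    """Query the category list by product_id (served by the primary key)."""
    rows = conn.execute(
        "SELECT c.name FROM category c "
        "JOIN tblProductCategories pc ON pc.category_id = c.category_id "
        "WHERE pc.product_id = ? ORDER BY c.category_id",
        (product_id,),
    )
    return [name for (name,) in rows]

def products_in(category_id):
    """Query the ordered product list by category_id."""
    rows = conn.execute(
        "SELECT p.name FROM product p "
        "JOIN tblProductCategories pc ON pc.product_id = p.product_id "
        "WHERE pc.category_id = ? ORDER BY pc.position",
        (category_id,),
    )
    return [name for (name,) in rows]

print(categories_of(101))
print(products_in(1))
```

Listing all categories without their products is then a plain SELECT on category, which never touches the junction table.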
I am trying to do a simple UML model about a car dealership.
The company has at least one store, and each store sells at least one type of car. Each store has a name and each car has a name, type, and price. Each outlet also keeps stock of every car it sells.
I have outlined the idea in this image:
In addition to this, at the end of every day the number of cars sold gets recorded in a database. How would I add this to the model? Also, is there a better way to model the number of cars in stock than as a separate class? If there is a better type of diagram for modelling this kind of scenario, I'd be interested in that too.
Thanks for any help!
There are many ways to model sales records. The simplest and most common is to have a sales ledger. It creates sales entries for items. The item is a separate (association) class that records the number of sold items, the price paid, the sales date, the sales person, and more. Pretty simple and straightforward, until you get to the gory details. Ask your nearest dealer...
You can model a sales record as a separate class. Let's call it DaySales. Each day, you have a new instance of DaySales, containing the date and the number of cars sold. I have given the attribute date the data type 'String', because UML does not define a Date type. If you define one yourself, Date would be a better choice than String.
I have removed the association between Car and Outlet, because it is already implicitly defined via Stock, but you can keep it as a redundant association, if you like.
I have altered the multiplicity of the association between Car and Stock, because there will be multiple cars in stock.
I have a question as follows:
A primary school class contains a number of children and a variety of books. Write a model which keeps track of the books that the children have read. It should maintain a relation hasread between children and books. It should also handle the following events:
record: adds the fact that the given child has read the given book
newbook: outputs a book that the given child has not already read
books_query: outputs the number of books the given child has read
Here is my model so far
CONTEXT
booksContext
SETS
STUDENTS
BOOKS
CONSTANTS
student
book
AXIOMS
axm1: partition(STUDENTS, {student})
axm2: partition(BOOKS,{book})
And my machine is as follows:
MACHINE
books
SEES
booksContext
VARIABLES
students
books
readBooks
INVARIANTS
students ⊆ STUDENTS
books ⊆ BOOKS
readBooks ∈ students → books
I have an event where I want to mark a book as read for a given student. It takes two parameters: the name of the student and the name of the book.
EVENTS
record
ANY
rbook
name
grd1: rbook ∈ books
grd2: name ∈ students
Now for the guards. I want to say
"If the student has not read the book already"
I had this, but it doesn't work and I don't know what to do now. Can anyone help me?
grd3: rbook(name) = ∅
rbook is just one book, but you are using it as if it were a function. Do you mean readBooks(name) = {}? If so, the statement would still be "Has the student never read a book?".
The first problem is probably in the definition of readBooks. You modelled it as a total function from students to books. That means that every student has read exactly one book. That is probably not what you wanted to express. To state that every student has read an arbitrary number of books you can map students to sets of books:
readBooks : students --> POW(books)
Then the guard would be rbook /: readBooks(name).
Personally I would prefer relations in such a case, they are usually easier to cope with. Here a pair s|->b would be in readBooks if student s has read book b:
readBooks : students <-> books
In that case the guard would be name|->rbook /: readBooks.
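To make the relational version concrete, here is a small Python stand-in (not Event-B) in which readBooks is a set of (student, book) pairs. The sample students and books and the helper can_record are assumptions for illustration. Event-B would pick an unread book nondeterministically in newbook; the sketch takes the smallest name only so that the result is repeatable.

```python
# Python stand-in for readBooks : students <-> books
students = {"ann", "bob"}        # hypothetical sample data
books = {"b1", "b2", "b3"}
read_books = set()               # set of (student, book) pairs

def can_record(name, rbook):
    """Guards grd1, grd2 and name|->rbook /: readBooks."""
    return (
        rbook in books
        and name in students
        and (name, rbook) not in read_books
    )

def record(name, rbook):
    """Add the fact that the given child has read the given book."""
    if not can_record(name, rbook):
        raise ValueError("guard not satisfied")
    read_books.add((name, rbook))

def newbook(name):
    """Return a book the child has not read yet, or None if there is none."""
    unread = books - {b for (s, b) in read_books if s == name}
    return min(unread) if unread else None

def books_query(name):
    """Number of books the given child has read."""
    return len({b for (s, b) in read_books if s == name})

record("ann", "b1")
print(can_record("ann", "b1"))            # the guard now blocks a repeat
print(newbook("ann"), books_query("ann"))
```

Because the relation is just a set of pairs, the "has not read it already" guard is a single membership test, which is the point made above about relations being easier to work with.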
I'm willing to give MongoDB and CouchDB a serious try. So far I've worked a bit with Mongo, but I'm also intrigued by Couch's RESTful approach.
Having worked for years with relational DBs, I still don't get what is the best way to get some things done with non relational databases.
For example, if I have 1000 car shops and 1000 car types, I want to specify what kind of cars each shop sells. Each car has 100 features. In a relational database I'd make a middle table to link each car shop with the car types it sells via IDs. What is the NoSQL approach? If every car shop sells 50 car types, storing all the features of every car type inside each car shop would mean replicating a huge amount of data!
Any help appreciated.
I can only speak to CouchDB.
The best way to stick your data in the db is to not normalize it at all beyond converting it to JSON. If that data is "cars" then stick all the data about every car in the database.
You then use map/reduce to create a normalized index of the data. So, if you want an index of every car, sorted first by shop, then by car-type you would emit each car with an index of [shop, car-type].
Map reduce seems a little scary at first, but you don't need to understand all the complicated stuff or even btrees, all you need to understand is how the key sorting works.
http://wiki.apache.org/couchdb/View_collation
With that alone you can create amazing normalized indexes over differing documents with the map reduce system in CouchDB.
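As a rough illustration of how sorting on a composite [shop, car-type] key gives you that index, here is a plain-Python simulation. It is not CouchDB code (real map functions are written in JavaScript and collation follows the rules on the page linked above); the documents and function names are made up for the example.

```python
from bisect import bisect_left
from itertools import takewhile

# Hypothetical car documents, stored denormalized
cars = [
    {"shop": "south", "car_type": "suv",   "name": "car-3"},
    {"shop": "north", "car_type": "suv",   "name": "car-2"},
    {"shop": "north", "car_type": "sedan", "name": "car-1"},
    {"shop": "south", "car_type": "sedan", "name": "car-4"},
]

def map_car(doc):
    # Plays the role of emitting each car under the key [shop, car-type]
    yield ([doc["shop"], doc["car_type"]], doc["name"])

# The "view": every emitted row, kept sorted by key
index = sorted(row for doc in cars for row in map_car(doc))
keys = [key for key, _ in index]

def cars_for_shop(shop):
    """Range scan: all rows whose key starts with the given shop."""
    start = bisect_left(keys, [shop])  # [shop] sorts before [shop, anything]
    rows = takewhile(lambda row: row[0][0] == shop, index[start:])
    return [name for _, name in rows]

print(cars_for_shop("north"))
print(cars_for_shop("south"))
```

All cars of one shop sit next to each other in the sorted index, already ordered by car type, so a lookup is one contiguous scan.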
In MongoDB a commonly used approach is to store a list of _ids of car types in each car shop. So there is no separate join table, but you are still basically doing a client-side join.
Embedded documents become more relevant for cases that aren't many-to-many like this.
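A minimal sketch of that layout, with plain Python dicts standing in for the two collections so it runs without a database. The field name car_type_ids, the helper names, and the sample documents are assumptions; with a real driver the second lookup would typically be a single query using the $in operator on _id.

```python
# Hypothetical "car_types" collection, keyed by _id
car_types = {
    "ct1": {"_id": "ct1", "name": "hatchback", "features": {"doors": 5}},
    "ct2": {"_id": "ct2", "name": "pickup",    "features": {"doors": 2}},
    "ct3": {"_id": "ct3", "name": "coupe",     "features": {"doors": 2}},
}

# Hypothetical "car_shops" collection: each shop stores only the _ids
car_shops = {
    "s1": {"_id": "s1", "name": "shop one", "car_type_ids": ["ct1", "ct3"]},
    "s2": {"_id": "s2", "name": "shop two", "car_type_ids": ["ct3"]},
}

def shop_with_car_types(shop_id):
    """Client-side join: load the shop, then resolve its car type _ids."""
    shop = car_shops[shop_id]
    resolved = [car_types[ct_id] for ct_id in shop["car_type_ids"]]
    return {**shop, "car_types": resolved}

def shops_selling(car_type_id):
    """Reverse lookup: which shops list this car type?"""
    return sorted(
        shop["_id"]
        for shop in car_shops.values()
        if car_type_id in shop["car_type_ids"]
    )

print([ct["name"] for ct in shop_with_car_types("s1")["car_types"]])
print(shops_selling("ct3"))
```

The 100 features of each car type are stored once, and each shop carries only a short list of identifiers.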
Coming from a HBase/BigTable point of view, typically you would completely denormalize your data, and use a "list" field, or multidimensional map column (see this link for a better description).
The word "column" is another loaded word like "table" and "base" which carries the emotional baggage of years of RDBMS experience.
Instead, I find it easier to think about this like a multidimensional map - a map of maps if you will.
For your example of a many-to-many relationship, you can still create two tables, and use your multidimensional map column to hold the relationship between the tables.
See the FAQ question 20 in the Hadoop/HBase FAQ:
Q: [Michael Dagaev] How would you design an Hbase table for many-to-many association between two entities, for example Student and Course?
I would define two tables:
Student: student id, student data (name, address, ...), courses (use course ids as column qualifiers here)
Course: course id, course data (name, syllabus, ...), students (use student ids as column qualifiers here)
Does it make sense?
A: [Jonathan Gray] Your design does make sense. As you said, you'd probably have two column-families in each of the Student and Course tables. One for the data, another with a column per student or course. For example, a student row might look like:
Student:
id/row/key = 1001
data:name = Student Name
data:address = 123 ABC St
courses:2001 = (If you need more information about this association, for example, if they are on the waiting list)
courses:2002 = ...
This schema gives you fast access to the queries, show all classes for a student (student table, courses family), or all students for a class (courses table, students family).
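The "map of maps" view of that schema can be sketched with nested Python dicts, indexed as table[row key][column family][column qualifier]. This is only an in-memory illustration, not HBase client code; the helper names, the course names, and the note values are made up.

```python
from collections import defaultdict

def new_table():
    # row key -> column family -> column qualifier -> value
    return defaultdict(lambda: defaultdict(dict))

student = new_table()
course = new_table()

# "data" family: ordinary attributes
student["1001"]["data"]["name"] = "Student Name"
student["1001"]["data"]["address"] = "123 ABC St"
course["2001"]["data"]["name"] = "Course 2001"   # hypothetical
course["2002"]["data"]["name"] = "Course 2002"   # hypothetical

def enroll(student_id, course_id, note=""):
    """Record the association on both sides, using ids as column qualifiers."""
    student[student_id]["courses"][course_id] = note
    course[course_id]["students"][student_id] = note

def courses_of(student_id):
    """All classes for a student: one row read from the student table."""
    return sorted(student[student_id]["courses"])

def students_in(course_id):
    """All students for a class: one row read from the course table."""
    return sorted(course[course_id]["students"])

enroll("1001", "2001", note="waiting list")
enroll("1001", "2002")

print(courses_of("1001"))
print(students_in("2001"))
```

The relationship is written twice, once per table, which is the denormalization trade-off: both lookups become single-row reads.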
In a relational database, the concept is very clear: one table for cars with columns like "car_id, car_type, car_name, car_price", and another table for shops with columns "shop_id, car_id, shop_name, sale_count"; the "car_id" column links the two tables together for data operations. All the columns must be well defined when the database is created.
NoSQL database systems do not require you to predefine these columns and tables. You just construct your records in a certain format, say JSON, like:
{"car": {"id": 1, "type": "auto", "name": "ford"}, "shop": {"id": 100, "name": "some_shop"}}
{"car": {"id": 2, "type": "auto", "name": "benz"}, "shop": {"id": 105, "name": "my_shop"}}
.....
After your system has gone online, you may find flaws in your database design and want to add an "employee" field to "shop" for future records. Your new records then look like:
{"car": {"id": 3, "type": "auto", "name": "RR"}, "shop": {"id": 108, "name": "other_shop", "employee": "Bill"}}
NoSQL systems let you do this without any schema change, whereas a relational database requires altering the table definition first (for example with ALTER TABLE ... ADD COLUMN).