I am trying to create a normalized set of tables for my books and then
to select them ordering by either book title or authors.
I want to be able to have 'n' books per author, and 'n' authors per book.
The problem I want to solve is how to display my books and authors
ordered by title or by lastname, firstname, middlename?
I started with a table like this with some 1441 entries.
create table books(
bookid serial,
title text,
firstname text,
lastname text);
I then created an authors table
create table authors(
authorid serial,
firstname text,
lastname text);
and populated it.
I then created a cross reference table
create table bookAuthor
(
bookId INTEGER NOT NULL REFERENCES books(bookId),
authorId INTEGER NOT NULL REFERENCES authors(authorId)
);
and
create unique index bookAuthor_unique_index on bookAuthor(bookId, authorId);
I then populated the bookauthor table with 1441 entries.
I am pretty sure the three tables are populated correctly. I managed to
do several inserts into the authors table and then insert the correct cross-references into the bookauthor table.
I am now stuck, I can't figure out how to display my books and authors
ordered by title or by authors names.
Am I going down the wrong path to support N titles per author and N authors per book?
I did multiple searches for foreign keys, and multiple tables with nothing that seemed to resolve my problem.
I'm in a PostgreSQL 9.x environment.
A join across the three tables will do it:
select * from bookAuthor
inner join books on bookAuthor.bookId = books.bookid
inner join authors on bookAuthor.authorId = authors.authorid
order by books.title;
Or you can order by authors.lastname, authors.firstname instead.
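To make the two orderings concrete, here is a minimal sketch of the same three-table join. It assumes an in-memory SQLite database as a stand-in for PostgreSQL (the join itself is plain SQL and should carry over), and the sample rows are made up purely for illustration.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books   (bookid   INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE authors (authorid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE bookAuthor (
        bookId   INTEGER NOT NULL REFERENCES books(bookid),
        authorId INTEGER NOT NULL REFERENCES authors(authorid),
        UNIQUE (bookId, authorId)
    );
    -- Made-up rows: one book with two authors, one author with two books.
    INSERT INTO books   VALUES (1, 'Zebra Tales'), (2, 'Apple Pie');
    INSERT INTO authors VALUES (1, 'Ann', 'Young'), (2, 'Bob', 'Adams');
    INSERT INTO bookAuthor VALUES (1, 1), (1, 2), (2, 1);
""")

# The join is identical for both listings; only ORDER BY changes.
JOIN = """
    SELECT b.title, a.lastname, a.firstname
    FROM bookAuthor ba
    JOIN books   b ON ba.bookId   = b.bookid
    JOIN authors a ON ba.authorId = a.authorid
"""

by_title = conn.execute(
    JOIN + " ORDER BY b.title, a.lastname, a.firstname").fetchall()
by_author = conn.execute(
    JOIN + " ORDER BY a.lastname, a.firstname, b.title").fetchall()

for row in by_title:
    print(row)
```

A book with several authors appears once per author in either listing, which is what the cross reference table implies.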
Related
Consider Amazon's product category architecture (one product may have 7 parent categories, another might have 2). I want to build the same thing using Postgres.
A: Is there any scalable, logical way to do this, or must I consider using a graph database?
P.S. The project will not be Amazon-big. It is a monolith project, not a microservice.
B: My thought is that the categories table should have a field named parent_categories, an array of the UUIDs of its parent categories, and the products table a field named category_id that points to the last category in the chain.
Something like this:
CREATE TABLE categories (
id UUID PRIMARY KEY NOT NULL DEFAULT gen_random_uuid (),
name VARCHAR NOT NULL,
parent_categories UUID[]
);
CREATE TABLE products (
id UUID PRIMARY KEY NOT NULL DEFAULT gen_random_uuid (),
name VARCHAR NOT NULL,
category_id UUID[],
CONSTRAINT fk_category FOREIGN KEY(category_id) REFERENCES categories(id)
);
The problem is joining the chained categories. When fetching categories (I'm using Node.js) I expect a result like the one below, and I don't know how to join every element of that array.
categories: [{
id: "id",
name: "name",
parent_categories: [{
id: "id",
name: "name"
}]
}]
This question is about relational theory.
You have a pair of tables containing id and name, that's lovely.
Discard the array attributes, and then
CREATE TABLE product_category (
product_id UUID REFERENCES products(id),
category_id UUID REFERENCES categories(id),
PRIMARY KEY (product_id, category_id)
);
Now you are perfectly set up for 3-way JOINs.
Consider adopting the "table names are singular" convention,
rather than the current plural-form names.
Add a parent_id column to categories,
so the table supports self-joins.
Then use WITH RECURSIVE to navigate
the hierarchical tree of categories.
(Classic example in the Oracle documentation
shows how manager can be used for emp
self-joins to produce a deeply nested org chart.)
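As a rough sketch of how the pieces above fit together, the following builds the nested shape the question asks for. Several things here are assumptions for illustration only: SQLite stands in for Postgres, integer ids replace the UUIDs, the sample rows are invented, and `categories_of` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for Postgres
conn.executescript("""
    CREATE TABLE categories (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES categories(id)  -- NULL for a root
    );
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE product_category (
        product_id  INTEGER REFERENCES products(id),
        category_id INTEGER REFERENCES categories(id),
        PRIMARY KEY (product_id, category_id)
    );
    -- Invented sample data.
    INSERT INTO categories VALUES
        (1, 'Electronics', NULL), (2, 'Audio', 1),
        (3, 'Headphones', 2),     (4, 'Gifts', NULL);
    INSERT INTO products VALUES (10, 'Studio headphones');
    INSERT INTO product_category VALUES (10, 3), (10, 4);
""")

# Walk from one category up to its root; depth 0 is the category itself.
ANCESTORS = """
    WITH RECURSIVE chain(id, name, parent_id, depth) AS (
        SELECT id, name, parent_id, 0 FROM categories WHERE id = ?
        UNION ALL
        SELECT c.id, c.name, c.parent_id, chain.depth + 1
        FROM categories c
        JOIN chain ON c.id = chain.parent_id
    )
    SELECT id, name FROM chain WHERE depth > 0 ORDER BY depth
"""

def categories_of(product_id):
    """Return the product's categories, each with its chain of parents."""
    rows = conn.execute(
        """SELECT c.id, c.name
           FROM product_category pc
           JOIN categories c ON c.id = pc.category_id
           WHERE pc.product_id = ?
           ORDER BY c.id""",
        (product_id,),
    ).fetchall()
    return [
        {
            "id": cid,
            "name": name,
            "parent_categories": [
                {"id": pid, "name": pname}
                for pid, pname in conn.execute(ANCESTORS, (cid,))
            ],
        }
        for cid, name in rows
    ]

result = categories_of(10)
print(result)
```

This runs one ancestor query per category, which keeps the sketch short. For many categories a single recursive query that carries the starting id along would likely be the better choice.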
I have a products table, and a separate table that I'd like to store related products in which consists of two fields: create table related_products(id1 int, id2 int) and indexes placed on each field. This would mean that I'd have to search both id1 and id2 for a product id, then pull out the other id field which seems quite messy. (Of course, one product could have many related products).
Is there a better table structure for storing the related products that I could use in PostgreSQL?
That is not messy from a database perspective, but exactly the way it should be, as long as only pairs of products can be related.
If you want to make sure that a relationship can be entered only once, you could use a unique index:
CREATE UNIQUE INDEX ON related_products(LEAST(id1, id2), GREATEST(id1, id2));
To search products related to product 42, you could query like this:
SELECT products.*
FROM products
JOIN (SELECT id2 AS id
FROM related_products
WHERE id1 = 42
UNION ALL
SELECT id1
FROM related_products
WHERE id2 = 42
) rel
USING (id);
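Here is a small sketch that exercises both the unique index and the lookup. It assumes SQLite as a stand-in for PostgreSQL, so `LEAST`/`GREATEST` are written as SQLite's two-argument `min()`/`max()`. The index name `related_pair`, the helper `related_to`, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for PostgreSQL
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE related_products (
        id1 INTEGER NOT NULL REFERENCES products(id),
        id2 INTEGER NOT NULL REFERENCES products(id)
    );
    -- SQLite spelling of LEAST(id1, id2), GREATEST(id1, id2).
    CREATE UNIQUE INDEX related_pair
        ON related_products (min(id1, id2), max(id1, id2));
    INSERT INTO products VALUES (7, 'case'), (42, 'phone'), (99, 'charger');
    -- 42 appears once on each side of a pair.
    INSERT INTO related_products VALUES (42, 7), (99, 42);
""")

def related_to(product_id):
    """Products paired with product_id, whichever column it sits in."""
    return conn.execute(
        """SELECT p.id, p.name
           FROM products p
           JOIN (SELECT id2 AS id FROM related_products WHERE id1 = ?
                 UNION ALL
                 SELECT id1 FROM related_products WHERE id2 = ?
                ) rel USING (id)
           ORDER BY p.id""",
        (product_id, product_id),
    ).fetchall()

related = related_to(42)
print(related)

# The pair (42, 7) already exists, so its mirror image must be rejected.
try:
    conn.execute("INSERT INTO related_products VALUES (7, 42)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

`UNION ALL` is safe here because the unique index guarantees that a given pair is stored only once, so the two branches cannot return the same relationship twice.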
I mainly focus on the query operation, not union or intersection.
Here is an example.
Let's say we have a multi-level category:
CATEGORY-TOP-LEVEL:
CATEGORY1:
CATEGORY1.1:
item1
CATEGORY2:
CATEGORY2.1:
item2
Here, item[N] is the data. Category is a tree structure to represent which category the item belongs to.
Now, suppose I'd like to query all data in category 1, the database should give me item1.
Suppose I'd like to query all data in category-top-level, the database should give me item1 and item2.
It's like set theory. Because item1 belongs to CATEGORY1.1, and CATEGORY1.1 belongs to CATEGORY1. Thus item1 belongs to CATEGORY1.
One solution is to use Materialized Paths: we put a field named path in item, with a value like ",CATEGORY-TOP-LEVEL,CATEGORY1,CATEGORY1.2". The problem is that it causes a lot of write operations when I change a category's name or the hierarchy of the categories.
Can MongoDB support that? If not, is there a database that can?
P.S. Let's take query performance into consideration.
Every modern relational database can support that.
There are different ways of modeling this in a relational database. The most common one is called the "adjacency list" model:
create table category
(
id integer primary key,
name varchar(100) not null,
parent_category_id integer references category
);
If an item can only ever belong to a single category, the item table would look like this:
create table item
(
id integer primary key,
name varchar(100) not null,
category_id integer not null references category
);
If an item can belong to more than one category, you need a many-to-many relationship (also very common in the relational world).
To get all categories below a certain category you can use a recursive query:
with recursive cat_tree as (
select id, name, parent_category_id
from category
where id = 42 -- this is the category where you want to start
union all
select c.id, c.name, c.parent_category_id
from category c
join cat_tree p on p.id = c.parent_category_id
)
select *
from cat_tree;
To get the items together with the categories, just join the above to the item table.
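As a sketch of that last step, the following joins the recursive `cat_tree` to `item`, using the category tree from the question as sample data. It assumes SQLite as a stand-in database so it can run without a server. The numeric ids and the helper name `items_under` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database
conn.executescript("""
    CREATE TABLE category (
        id INTEGER PRIMARY KEY,
        name VARCHAR(100) NOT NULL,
        parent_category_id INTEGER REFERENCES category(id)
    );
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        name VARCHAR(100) NOT NULL,
        category_id INTEGER NOT NULL REFERENCES category(id)
    );
    -- The tree from the question; the ids are arbitrary.
    INSERT INTO category VALUES
        (1, 'CATEGORY-TOP-LEVEL', NULL),
        (2, 'CATEGORY1', 1), (3, 'CATEGORY1.1', 2),
        (4, 'CATEGORY2', 1), (5, 'CATEGORY2.1', 4);
    INSERT INTO item VALUES (1, 'item1', 3), (2, 'item2', 5);
""")

# Collect the start category and everything below it, then join to item.
ITEMS_UNDER = """
    WITH RECURSIVE cat_tree AS (
        SELECT id, name, parent_category_id
        FROM category
        WHERE id = ?
        UNION ALL
        SELECT c.id, c.name, c.parent_category_id
        FROM category c
        JOIN cat_tree p ON p.id = c.parent_category_id
    )
    SELECT i.name
    FROM item i
    JOIN cat_tree t ON t.id = i.category_id
    ORDER BY i.name
"""

def items_under(category_id):
    """Names of all items in the category or any of its descendants."""
    return [name for (name,) in conn.execute(ITEMS_UNDER, (category_id,))]

print(items_under(2))  # CATEGORY1
print(items_under(1))  # CATEGORY-TOP-LEVEL
```

Renaming or moving a category touches a single row in `category`, which avoids the rewrite cost the question raises for materialized paths.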
The above query is standard ANSI SQL.
Other popular models are the nested set model, the materialized path (you mentioned that) and the closure table.
This gets asked a lot. See the tags recursive-query and hierarchical-data for many more examples.
I am thinking of three tables:
Employee with emp_id, emp_name and other related data.
and
Person table with person_id, person_name and other related data.
and
GroupList with grp_list_id, member_id (either emp_id or person_id) and is_employee (true for employee and false for person) and other related data.
This is the first time I am trying to do a table whose foreign key can come from two different tables. Can someone please suggest the best way to achieve it?
I'm struggling with PostgreSQL, as I don't know how to link one instance of type A to a set of instances of type B. I'll give a brief example:
Let's say we want to set up a DB containing music albums and people, each having a list of their favorite albums. We could define the types like that:
CREATE TYPE album_t AS (
Artist VARCHAR(50),
Title VARCHAR(50)
);
CREATE TYPE person_t AS (
FirstName VARCHAR(50),
LastName VARCHAR(50),
FavAlbums album_t ARRAY[5]
);
Now we want to create tables of those types:
CREATE TABLE Person of person_t WITH OIDS;
CREATE TABLE Album of album_t WITH OIDS;
Now, as I want to make my DB as object-relational as it gets, I don't want to nest album "objects" in the column FavAlbums of the table Person, but I want to "point" to the entries in the table Album, so that n Person records can refer to the same Album record without duplicating it over and over.
I read the manual, but it seems that it lacks some vital examples, as object-relational features aren't being used that often. I'm also familiar with the relational model, but I want to use extra tables for the relations.
Why do you create a new type in PostgreSQL to do what you need?
Why don't you use tables directly?
With n-n relation:
CREATE TABLE album (
idalbum integer primary key,
Artist VARCHAR(50),
Title VARCHAR(50)
);
CREATE TABLE person (
idperson integer primary key,
FirstName VARCHAR(50),
LastName VARCHAR(50)
);
CREATE TABLE person_album (
person_id integer,
album_id integer,
primary key (person_id, album_id),
FOREIGN KEY (person_id)
REFERENCES person (idperson),
FOREIGN KEY (album_id)
REFERENCES album (idalbum));
Or with a "pure" 1-n relation:
CREATE TABLE person (
idperson integer primary key,
FirstName VARCHAR(50),
LastName VARCHAR(50)
);
CREATE TABLE album (
idalbum integer primary key,
Artist VARCHAR(50),
Title VARCHAR(50),
person_id integer,
FOREIGN KEY (person_id)
REFERENCES person (idperson)
);
I hope this helps.
Now, as I want to make my DB as object-relational as it gets, I don't want to nest album "objects" in the column FavAlbums of the table Person, but I want to "point" to the entries in the table Album, so that n Person records can refer to the same Album record without duplicating it over and over.
Drop the array column, add an id primary key column (serial type) to each table, and drop the oids (note that the manual recommends against using them). Then add a FavoriteAlbum table with two columns (PersonId, AlbumId), which together form its primary key. (Your relation is n-n, not 1-n.)
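A minimal sketch of that layout follows, assuming SQLite as a stand-in for PostgreSQL (so `INTEGER PRIMARY KEY` plays the role of a serial id). The column name `Id` and the sample people and album are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for PostgreSQL
conn.executescript("""
    PRAGMA foreign_keys = ON;  -- SQLite only enforces REFERENCES when asked
    CREATE TABLE Person (
        Id INTEGER PRIMARY KEY, FirstName VARCHAR(50), LastName VARCHAR(50));
    CREATE TABLE Album (
        Id INTEGER PRIMARY KEY, Artist VARCHAR(50), Title VARCHAR(50));
    CREATE TABLE FavoriteAlbum (
        PersonId INTEGER NOT NULL REFERENCES Person(Id),
        AlbumId  INTEGER NOT NULL REFERENCES Album(Id),
        PRIMARY KEY (PersonId, AlbumId)
    );
    -- Invented rows: two people point at the same single album record.
    INSERT INTO Person VALUES (1, 'Ann', 'Young'), (2, 'Bob', 'Adams');
    INSERT INTO Album  VALUES (1, 'Artist A', 'First Album');
    INSERT INTO FavoriteAlbum VALUES (1, 1), (2, 1);
""")

fans = conn.execute("""
    SELECT p.FirstName, a.Title
    FROM FavoriteAlbum f
    JOIN Person p ON p.Id = f.PersonId
    JOIN Album  a ON a.Id = f.AlbumId
    ORDER BY p.FirstName
""").fetchall()
album_rows = conn.execute("SELECT count(*) FROM Album").fetchone()[0]
print(fans, album_rows)

# A favorite that points at a nonexistent album is rejected by the foreign key.
try:
    conn.execute("INSERT INTO FavoriteAlbum VALUES (1, 999)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
print(dangling_rejected)
```

The album is stored once and referenced by identity through its key, which is the "pointer" behaviour the question is after.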
Sorry for answering my own question, but I just wanted to give some pieces of information I gained by toying around with that example.
ARRAY Type
I found out that the ARRAY Type in PostgreSQL is useful if you want to associate a variable number of values with one attribute, but only if you can live with duplicate entries. So that technique is not suitable for referencing "objects" by their identity.
References to Objects/Records by identity
So if you want to, as in my example, create a table of albums and want to be able to reference one album by more than one person, you should use a separate table to establish these relationships (Maybe by using the OIDs as keys).
Another crazy thing one could do is referencing albums by using an ARRAY of OIDs in the person table. But that is very awkward and really does not improve on the classic relational style.