This may be a dumb question as I am a beginner in PostgreSQL, but here is what I'm trying to do.
I have a table called Products with 3 columns: Name, Price, Expiry Date. I have a second table called Orders with 4 columns: Product, purchasePrice, Amount, and CountryRecieved.
All I want is for the Product column to reference the Products table, so that it has all the information from the Products table.
Is this doable?
The key concepts you need to read up on are:
"normalisation": the process of breaking down data into multiple related entities
"foreign keys": pointers from one database table to another
"joins": the query construct used to follow that pointer and get the data back together
In your case:
You have correctly determined that the information from Products should not just be copied manually into each row of the Orders table. This is one of the most basic aspects of normalisation: each piece of data is in one place, so updates cannot make it inconsistent.
You have deduced that the Orders table needs some kind of Product column; this is your foreign key. The most common way to represent this is to give the Products table an ID column that uniquely identifies each row, and then have a ProductID column in the Orders table. You could also use the product's name as the key, but this means you can never rename a product, as other entities in the database might reference it; integer keys will generally be more efficient in storage and speed, as well.
To use that foreign key relationship, you use a JOIN in your SQL queries. For example, to get the name and quantity of products ordered, you could write:
SELECT
    P.Name,
    O.Amount
FROM
    Products AS P
INNER JOIN
    Orders AS O
    -- This "ON" clause tells the database how to look up the foreign key
    ON O.ProductId = P.ProductId
ORDER BY
    P.Name
Here I've used an "inner join"; there are also "left outer join" and "right outer join", which can be used when only some rows on one side will meet the condition. I recommend you find a tutorial that explains them better than I can in a single paragraph.
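To make the three concepts concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption, chosen so the example runs without a database server; the JOIN syntax is the same). The OrderId column and the sample rows are invented for illustration. It shows the foreign key rejecting an order for a product that does not exist, and the difference between an inner join and a left outer join.

```python
import sqlite3

# SQLite stands in for PostgreSQL here; table layout follows the question,
# with a hypothetical OrderId column added as the Orders primary key.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE Products (
        ProductId  INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        Price      NUMERIC NOT NULL,
        ExpiryDate TEXT
    );
    CREATE TABLE Orders (
        OrderId       INTEGER PRIMARY KEY,
        ProductId     INTEGER NOT NULL REFERENCES Products (ProductId),
        PurchasePrice NUMERIC,
        Amount        INTEGER
    );
    INSERT INTO Products VALUES (1, 'Apple', 0.50, '2030-01-01'),
                                (2, 'Bread', 2.00, '2030-01-02');
    INSERT INTO Orders VALUES (10, 1, 0.45, 3);
""")

# Inner join: only products that have at least one order.
inner_rows = conn.execute("""
    SELECT P.Name, O.Amount
    FROM Products AS P
    INNER JOIN Orders AS O ON O.ProductId = P.ProductId
    ORDER BY P.Name
""").fetchall()

# Left outer join: every product, with NULL (None) where no order exists.
left_rows = conn.execute("""
    SELECT P.Name, O.Amount
    FROM Products AS P
    LEFT OUTER JOIN Orders AS O ON O.ProductId = P.ProductId
    ORDER BY P.Name
""").fetchall()

# The foreign key refuses an order that points at a missing product.
try:
    conn.execute("INSERT INTO Orders VALUES (11, 99, 1.00, 1)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(inner_rows, left_rows, fk_rejected)
```

With this sample data the inner join returns only Apple, while the left outer join also returns Bread with an empty amount, because no order references it.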
Assuming the name column is the key in the Products table and the product column in the Orders table refers to it, you can join the two tables on the related column(s) and get all the information:
select
o.*, p.*
from orders o
join products p on o.product = p.name;
Related
I want to use a temporary table (let's call it temp_tbl) created in a PostgreSQL function in order to SELECT into it just some rows and columns from a table (let's call it tbl) as follows:
One of the columns that both temp_tbl and tbl have is order_date of type DATE and the function also takes a start_date DATE argument. I want to SELECT in temp_tbl just the rows from tbl that have an order_date later than the function's start_date.
My question is: if this function gets called concurrently 2 or more times in the same session, won't the calls use the same instance of the temporary table temp_tbl?
More specifically, when using psycopg2 in the backend of a webserver, different clients of the webserver might require calling our function at the same time. Will this generate a conflict over the temp_tbl temporary table declared inside the function?
EDIT: my actual context
I'm building (for education purposes) an online shop. I have 3 tables for 3 kinds of products that all use a common sequence for their ids. I have another table for orders that includes a column which is an array of product ids and a column which is an array of quantities (associated with the product ids of the ordered products).
I want to return a table of common product details (columns common to all 3 tables like id, name, price etc) and their associated number of sales.
My current method is to concatenate all the arrays of ids and quantities from all order entries, then create a temporary table out of the 2 arrays and sum the number of orders for each product id so I have a table with one entry for each ordered product.
Then, I create 3 temporary tables in order to join each product table with the temporary product orders figures table and SELECT only the columns that are common to all 3 tables.
Finally, I UNION the 3 temporary tables.
This is kind of complicated for me, so I think that maybe there were better design decisions I could have made?
I have a schema with one table with the majority of data, customer, and three other tables with foreign key references to customer.entry_id which is a BIGSERIAL field. The three other tables are called location, devices and urls where we store various data related to a specific entry in the customer table.
I want to partition the customer table into monthly child tables, and have that part worked out; customer will stay as-is, each month will have a table customer_YYYY_MM that inherits from the master table with the right CHECK constraint and indexes will be created on each individual child table. Data will be moved to the correct child tables while the master table stays empty.
My question is about the other three tables, as I want to partition them as well. However, they have no date information (at all), only the reference to the primary key from the master table. How can I set up the constraints on these tables? Is it even meaningful or possible without date information?
My application logic knows where to insert all the data (it's fairly trivial), but I expect to be able to do simple SELECT queries without specifying which child tables to get it from. So this should work as you would expect from non-partitioned tables:
SELECT l.*
FROM customer c
JOIN location l USING (entry_id)
WHERE c.date_field > '2015-01-01'
I would partition them by the reference key. The foreign key is used in join conditions and is not usually subject to change so it fulfills the following important points:
Partition by the information that is used mostly in the WHERE clauses of the queries or other parts where partitioning can be used to filter out tables that don't need to be scanned. As one guide puts it:
The objective when defining partitions should be to allow as many queries as possible to fetch data from as few partitions as possible - ideally one.
Partition by information that is not going to be changed so that rows don't constantly need to be thrown from one subtable to another
This all depends on the size of the tables too, of course. If the sizes stay small then there is no need to partition.
Read more about partitioning here.
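As a rough illustration of why partitioning by the reference key helps, here is a small Python sketch of the range check that CHECK constraints on entry_id would let the planner perform. The child table names and the entry_id ranges are hypothetical; it assumes each monthly child of location covers one contiguous range of customer.entry_id values.

```python
# Hypothetical child tables of "location". Each covers a half-open range of
# customer.entry_id, mirroring CHECK (entry_id >= lo AND entry_id < hi).
PARTITIONS = [
    ("location_2015_01", 1, 1000),
    ("location_2015_02", 1000, 2500),
    ("location_2015_03", 2500, 4000),
]


def partition_for(entry_id):
    """Return the child table a row with this entry_id belongs in."""
    for name, lo, hi in PARTITIONS:
        if lo <= entry_id < hi:
            return name
    raise ValueError(f"no partition covers entry_id {entry_id}")


def partitions_to_scan(min_id, max_id):
    """Return the child tables whose range overlaps [min_id, max_id]."""
    return [name for name, lo, hi in PARTITIONS if lo <= max_id and min_id < hi]


print(partition_for(1000))
print(partitions_to_scan(900, 1200))
```

A query restricted to a narrow range of entry_id values only has to touch the child tables whose ranges overlap it, which is the "as few partitions as possible" goal quoted above.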
Use views:
create view customer as
select * from customer_jan_15 union all
select * from customer_feb_15 union all
select * from customer_mar_15;
create view location as
select * from location_jan_15 union all
select * from location_feb_15 union all
select * from location_mar_15;
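A minimal runnable sketch of the view approach, using Python's sqlite3 module in place of PostgreSQL (an assumption for the sake of a self-contained example). Only two months are shown, and the city column and sample rows are invented. The query against the views looks the same as it would against unpartitioned tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Per-month tables (column layout is hypothetical).
    CREATE TABLE customer_jan_15 (entry_id INTEGER PRIMARY KEY, date_field TEXT);
    CREATE TABLE customer_feb_15 (entry_id INTEGER PRIMARY KEY, date_field TEXT);
    CREATE TABLE location_jan_15 (entry_id INTEGER, city TEXT);
    CREATE TABLE location_feb_15 (entry_id INTEGER, city TEXT);

    -- The views stitch the monthly tables back together.
    CREATE VIEW customer AS
        SELECT * FROM customer_jan_15 UNION ALL
        SELECT * FROM customer_feb_15;
    CREATE VIEW location AS
        SELECT * FROM location_jan_15 UNION ALL
        SELECT * FROM location_feb_15;

    INSERT INTO customer_jan_15 VALUES (1, '2015-01-10');
    INSERT INTO customer_feb_15 VALUES (2, '2015-02-05');
    INSERT INTO location_jan_15 VALUES (1, 'Oslo');
    INSERT INTO location_feb_15 VALUES (2, 'Lima');
""")

# The caller never names a monthly table.
rows = conn.execute("""
    SELECT l.entry_id, l.city
    FROM customer c
    JOIN location l USING (entry_id)
    WHERE c.date_field > '2015-01-31'
    ORDER BY l.entry_id
""").fetchall()

print(rows)
```

Note that a plain view like this has to be recreated whenever a new monthly table is added.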
I have a table 'users' with the columns:
user_id(PK), user_firstname, user_lastname
and another table 'room' with the columns:
event_id(PK), user_id(FK), user_firstname, user_lastname....(and more columns).
I want to know if it is possible to fill the user_firstname and user_lastname automatically just knowing the user_id column.
Like the default value of user_firstname would be like: "select users.user_firstname where users.user_id = user_id"
I don't know if I was clear enough... As you can see, my knowledge of databases is very narrow.
What you want to achieve can be done with JOINs. They will avoid those redundant user_firstname and user_lastname columns. So you'd just fetch from both tables when querying the room table and you get the extra columns of users into the result set:
SELECT * FROM room AS r INNER JOIN users AS u ON r.user_id = u.user_id;
The thing we did here is called normalization. Another important thing to take care of is foreign key constraints and their cascades; in your case room.user_id references users.user_id. A delete on users should most probably cascade to room, if you want to delete users instead of flagging them as deleted.
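Here is a small sketch of the foreign key with a cascading delete, run through Python's sqlite3 module rather than PostgreSQL (an assumption so the example is self-contained; the REFERENCES ... ON DELETE CASCADE clause is written the same way in both). The sample names are invented, and room keeps only event_id and user_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE users (
        user_id        INTEGER PRIMARY KEY,
        user_firstname TEXT,
        user_lastname  TEXT
    );
    -- No name columns here: user_id alone points at the users row.
    CREATE TABLE room (
        event_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL
                 REFERENCES users (user_id) ON DELETE CASCADE
    );
    INSERT INTO users VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
    INSERT INTO room VALUES (100, 1), (101, 2);
""")

# The names come back through the join instead of being stored twice.
joined = conn.execute("""
    SELECT r.event_id, u.user_firstname, u.user_lastname
    FROM room AS r
    INNER JOIN users AS u ON r.user_id = u.user_id
    ORDER BY r.event_id
""").fetchall()

# Deleting a user removes that user's room rows as well.
conn.execute("DELETE FROM users WHERE user_id = 1")
remaining = conn.execute("SELECT event_id FROM room ORDER BY event_id").fetchall()

print(joined, remaining)
```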
The columns user_firstname and user_lastname do not belong in your room table. The user_id column references the users table, that is all you need.
To select the data, you can use a JOIN statement, something like
SELECT R.event_id, R.user_id, U.user_firstname, U.user_lastname
FROM room AS R
JOIN users AS U ON R.user_id = U.user_id
The answer here is sideways to the question. You do not want a user_firstname and user_lastname column in the Event table. The user_id is a proxy for that row of the entire User table. When you need to access user_firstname, you do a JOIN of the two tables on the common column.
I need to create a report that shows, for each master row, two sets of child (detail) rows. In database terms you can think of this as table 'A' having a 1:M relationship to table 'B' and also a 1:M relationship to table 'C'. So, for each row from table 'A', I want to display a list of child rows from 'B' (preferably in one section) and a list of child rows from 'C' (preferably in another section). I would also prefer to avoid the use of sub-reports, if possible. How can I best accomplish this?
Thanks.
I think I understand your question correctly, i.e. for a given row in Table A, you want the details section(s) to show all connected rows in Table B and then all connected rows in Table C (for any number of rows in B or C, including zero). I only know of two solutions to this, neither of which is straightforward.
The first is, as you've guessed, the disliked subreport option.
The second involves some additional work in the database; specifically, creating a view that combines the entries in Table B and Table C into one table, which can then be used in the main report as a linkable object to report on and you can group on Table A as desired. How straightforward this is will depend on how similar the structures of B and C are. If they were effectively identical, the view could contain something simple like
SELECT 'B' AS DetailType, Field1, Field2, FieldLinkedToTableA
FROM TableB
UNION ALL
SELECT 'C' AS DetailType, Field1, Field2, FieldLinkedToTableA
FROM TableC
However, neither option will scale well to reports with lots of results (unless your server supports indexing the view).
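For the case where B and C have the same structure, the view can be tried out with the following sketch. It uses Python's sqlite3 module rather than the reporting database (an assumption), and the view name DetailRows, the TableA columns and the sample rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE TableB (Field1 TEXT, Field2 TEXT, FieldLinkedToTableA INTEGER);
    CREATE TABLE TableC (Field1 TEXT, Field2 TEXT, FieldLinkedToTableA INTEGER);

    -- One combined detail source, tagged with where each row came from.
    CREATE VIEW DetailRows AS
        SELECT 'B' AS DetailType, Field1, Field2, FieldLinkedToTableA
        FROM TableB
        UNION ALL
        SELECT 'C' AS DetailType, Field1, Field2, FieldLinkedToTableA
        FROM TableC;

    INSERT INTO TableA VALUES (1, 'first'), (2, 'second');
    INSERT INTO TableB VALUES ('b1', 'x', 1), ('b2', 'y', 1);
    INSERT INTO TableC VALUES ('c1', 'z', 1), ('c2', 'w', 2);
""")

# Master row, then its B details, then its C details.
rows = conn.execute("""
    SELECT a.Name, d.DetailType, d.Field1
    FROM TableA AS a
    LEFT JOIN DetailRows AS d ON d.FieldLinkedToTableA = a.Id
    ORDER BY a.Id, d.DetailType, d.Field1
""").fetchall()

for row in rows:
    print(row)
```

In the report, grouping on the TableA key and then on DetailType would give one section per detail table under each master row.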
This is exactly what Crystal was made for :)
Make a blank .rpt and connect to your data sources as you normally would.
Go to Report->Group Expert and choose your grouping field (aka Index field).
You will now see a Group Header section in your design view. This is your "Master row". The Details sections will be your "Child rows".
In the example image below, this file is grouped by {Client}. For client "ZZZZ", there are 2 records, so there are 2 details sections while all the other clients only have 1 details section.
Edit
Based on your response, how about this:
In your datasource (or perhaps using some kind of intermediary like MS Access), start SQLing as follows.
Make a subquery left joining the primary key of TblA and the foreign key of TblB. Add a third column containing a constant, e.g. "TblB"
Make a subquery left joining the primary key of TblA and the foreign key of TblC. Add a third column containing a different constant, e.g. "TblC"
Union those 2 queries together. That'll be your "index table" of your crystal report.
In Crystal, you can have multiple grouping levels. So group first by your constant column, then by the TblA primary key, then by the foreign key.
This way, all the results from TblB will be displayed first, then TblC. And with a little work, tables B & C won't even have to have the same field definitions.
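The steps above can be sketched outside Crystal as follows. This is only an approximation: Python's sqlite3 module plays the data source and itertools.groupby imitates Crystal's nested grouping levels. The column names (a_id, b_id, c_id, note, amount) and the sample rows are hypothetical; note that TblB and TblC deliberately have different fields.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TblA (a_id INTEGER PRIMARY KEY);
    CREATE TABLE TblB (b_id INTEGER PRIMARY KEY, a_id INTEGER, note TEXT);
    CREATE TABLE TblC (c_id INTEGER PRIMARY KEY, a_id INTEGER, amount REAL);
    INSERT INTO TblA VALUES (1), (2);
    INSERT INTO TblB VALUES (10, 1, 'n1'), (11, 1, 'n2');
    INSERT INTO TblC VALUES (20, 2, 5.0);
""")

# The "index table": two left joins, each tagged with a constant, unioned.
index_rows = conn.execute("""
    SELECT 'TblB' AS source, a.a_id AS a_id, b.b_id AS child_id
    FROM TblA AS a LEFT JOIN TblB AS b ON b.a_id = a.a_id
    UNION ALL
    SELECT 'TblC', a.a_id, c.c_id
    FROM TblA AS a LEFT JOIN TblC AS c ON c.a_id = a.a_id
    ORDER BY 1, 2, 3
""").fetchall()

# Group by the constant first, then by the TblA key, like nested report groups.
sections = {}
for source, by_source in groupby(index_rows, key=lambda r: r[0]):
    sections[source] = {}
    for a_id, by_parent in groupby(by_source, key=lambda r: r[1]):
        # A NULL child id means the master row has no children in this table.
        sections[source][a_id] = [c for _, _, c in by_parent if c is not None]

print(sections)
```

The index table carries only keys, so the report can look up the differing detail fields of TblB and TblC separately.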
You can use or create columns that are used for grouping, then group on the table A column, then the table B column, then C. (Crystal's group is not the same as t-sql "group by". In Crystal, it's more of a sorting than a grouping.)
Hopefully my description is a little better than the title, but basically I'm having an issue with one part of a new application schema and I'm stuck on what the most manageable and elegant table structure would be.
Bare bones table structure with only relevant fields showing would be as follows:
airline (id, name, ...)
hotel (id, name, ...)
supplier (id, name, ...)
event (id, name,...)
eventComponent (id,name) {e.g Food Catering, Room Hire, Audio/Visual...}
eventFlight (id, eventid, airlineid, ...)
eventHotel (id, eventid, hotelid, ...)
eventSupplier (id, eventid, supplierid, hotelid, eventcomponentid, ...)
So airline, hotel, supplier are all reference tables, and an Event is created with 1-to-many relationships to these reference tables. E.g. an Event may have 2 flight entries, 3 other component entries, and 2 hotel entries. But the issue is that in the EventSupplier table the supplier can be either a Supplier or an existing Hotel. So after the user has built their new event on the front-end, I need to store this in a fashion that doesn't make it a nightmare to then return this data later.
I've been doing a lot of reading on polymorphic relations and exclusive arcs, and I think my scenario is definitely more along the lines of an exclusive arc relationship.
I was thinking:
CREATE TABLE eventSupplier (
    id SERIAL PRIMARY KEY,
    eventid INT NOT NULL,
    hotelid INT,
    supplierid INT,
    UNIQUE (eventid, hotelid, supplierid), -- UNIQUE permits NULLs
    CHECK (hotelid IS NOT NULL OR supplierid IS NOT NULL),
    FOREIGN KEY (hotelid) REFERENCES hotel(id),
    FOREIGN KEY (supplierid) REFERENCES supplier(id)
);
And then for the retrieval of this data just use an outer join to both tables to work out which one is linked.
select es.eventid, coalesce(h.name, s.name) as supplier
from eventSupplier es
left outer join
    supplier s on s.id = es.supplierid
left outer join
    hotel h on h.id = es.hotelid
where h.id is not null or s.id is not null
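A minimal sketch of this exclusive-arc layout, run through Python's sqlite3 module rather than PostgreSQL (an assumption so it is self-contained). One deliberate variation, not part of the schema above: the CHECK is tightened so that exactly one of hotelid and supplierid is set, instead of at least one. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE hotel    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE eventSupplier (
        id         INTEGER PRIMARY KEY,
        eventid    INTEGER NOT NULL,
        hotelid    INTEGER REFERENCES hotel (id),
        supplierid INTEGER REFERENCES supplier (id),
        -- Variation: exactly one of the two references must be set.
        CHECK ((hotelid IS NULL) <> (supplierid IS NULL))
    );
    INSERT INTO hotel VALUES (1, 'Grand Hotel');
    INSERT INTO supplier VALUES (1, 'AV Rentals');
    INSERT INTO eventSupplier VALUES (1, 7, 1, NULL), (2, 7, NULL, 1);
""")

# Whichever side of the arc is populated supplies the name.
rows = conn.execute("""
    SELECT es.eventid, COALESCE(h.name, s.name) AS supplier
    FROM eventSupplier AS es
    LEFT OUTER JOIN supplier AS s ON s.id = es.supplierid
    LEFT OUTER JOIN hotel AS h ON h.id = es.hotelid
    ORDER BY es.id
""").fetchall()


def insert_is_rejected(hotelid, supplierid):
    """Try to add a row for event 7 and report whether the CHECK refused it."""
    try:
        conn.execute(
            "INSERT INTO eventSupplier (eventid, hotelid, supplierid) VALUES (7, ?, ?)",
            (hotelid, supplierid),
        )
        return False
    except sqlite3.IntegrityError:
        return True


both_rejected = insert_is_rejected(1, 1)           # both references set
neither_rejected = insert_is_rejected(None, None)  # no reference set

print(rows, both_rejected, neither_rejected)
```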
My other options were to have a single foreign key in the eventSupplier table with another field for the "type", which seems to be a harder solution to retrieve data from, though it does seem quite flexible if I want to extend this down the track without making schema changes. Or alternatively to store the hotelid in the Supplier table directly and just declare some suppliers as being a "hotel", though there would then be redundant data, which I don't want.
Any thoughts on this would be much appreciated!
Cheers
Phil
How about handling events one-by-one and using an EventGroup to group them together?
EDIT:
I have simply renamed entities to fit the latest comments. This is as close as I can get to this; admittedly I do not understand the problem properly.
A good way to test your solution is to think about what would happen if an airline became a supplier. Does your solution handle that, or does it start to get complicated?
Why do you explicitly need to find hotel data down the supplier route if you don't need that level of data for other types of supplier? I would suggest that a supplier is a supplier, whether it's a hotel or not, for these purposes.
If you want to flag a supplier as a hotel, then simply put hotelid on the supplier table or else wait and hook in the supplier later via whatever mechanism you use to get detail on other suppliers.