Indirectly generate sequence numbers for composite primary keys with JPA

I have a JPA entity class with a composite primary key (uid, lid) that should look like this in the database:
UID | LID | ...
---------------
1 | 1 | ...
1 | 2 | ...
1 | 3 | ...
2 | 1 | ...
2 | 2 | ...
2 | 3 | ...
How can I make EclipseLink/JPA generate sequence numbers on the fly, or how can I find out the highest number in the UID-column?
Or if I have a UID but want to add a new LID?
Apologies if this question is too easy. :)
Composite keys are quite a complex thing to me, but I think I am starting to understand them a bit.

No existing key generator can do that for you but you can write your own. See this answer for some pointers about getting started.
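The lookup that such a custom generator would have to perform can be sketched independently of JPA. What follows is only an illustration in Python against an in-memory SQLite database, not an EclipseLink generator; the table name `entry` and the helpers `next_lid`, `next_uid` and `insert_entry` are hypothetical. Note that `MAX(...) + 1` is not safe under concurrent inserts unless you lock the rows or retry on a primary-key violation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (uid INTEGER, lid INTEGER, PRIMARY KEY (uid, lid))"
)

def next_lid(conn, uid):
    """Return the next free LID for the given UID (1 if the UID is new)."""
    row = conn.execute(
        "SELECT COALESCE(MAX(lid), 0) + 1 FROM entry WHERE uid = ?", (uid,)
    ).fetchone()
    return row[0]

def next_uid(conn):
    """Return one more than the highest UID currently stored (1 if empty)."""
    row = conn.execute("SELECT COALESCE(MAX(uid), 0) + 1 FROM entry").fetchone()
    return row[0]

def insert_entry(conn, uid):
    """Insert a row for uid with a freshly computed LID and return that LID."""
    lid = next_lid(conn, uid)
    conn.execute("INSERT INTO entry (uid, lid) VALUES (?, ?)", (uid, lid))
    return lid

# Reproduce part of the table from the question: three rows for UID 1, two for UID 2.
for uid in (1, 1, 1, 2, 2):
    insert_entry(conn, uid)
```

In a JPA setting the same two queries could be issued from wherever the custom generator hooks in; how exactly that is wired up depends on the provider.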


Weird error when trying to query from database [duplicate]

As the title suggests, I have no clue why this doesn't work. If someone can point out what I am doing wrong it would be sweet.
Here's the current table rows and cols:
Makes table:
id | make
----+---------------
1 | Acura
Models Table:
id | model | makesId
-----+---------------------------------+---------
1 | CL | 1
2 | ILX | 1
3 | Integra | 1
4 | Legend | 1
5 | MDX | 1
6 | NSX | 1
7 | RDX | 1
8 | RL | 1
9 | RLX | 1
I am trying to query from both tables using a simple line with the WHERE clause with the following query:
SELECT models.model, makes.make
FROM models, makes
WHERE models.makesId = makes.id;
funprojectdb=# SELECT models.model, makes.make FROM models, makes WHERE models.makesId = makes.id;
ERROR: column models.makesid does not exist
LINE 1: ...models.model, makes.make FROM models, makes WHERE models.mak...
^
HINT: Perhaps you meant to reference the column "models.makesId".
The goal is basically to show all of the models associated with each make's id.
Thanks to the people that answered my question.
It was an issue with postgres case sensitivity which I completely forgot about.
I will be re-doing the database columns with proper field names.
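For readers hitting the same error: PostgreSQL folds unquoted identifiers to lower case, while double-quoted identifiers keep their case, so the original query works if the column is written as models."makesId". The rule can be illustrated with a small Python sketch; `fold_identifier` is a hypothetical helper that only mimics the rule for simple identifiers and is not part of any PostgreSQL client library.

```python
def fold_identifier(ident: str) -> str:
    """Mimic how PostgreSQL resolves a simple identifier.

    Quoted identifiers keep their exact case (a doubled quote inside
    stands for one literal quote); unquoted ones are folded to lower case.
    """
    if len(ident) >= 2 and ident[0] == '"' and ident[-1] == '"':
        return ident[1:-1].replace('""', '"')
    return ident.lower()

stored_column = "makesId"                  # created with quotes, so case was kept
unquoted = fold_identifier("makesId")      # what the failing query asked for
quoted = fold_identifier('"makesId"')      # what the hint suggests

matches_unquoted = unquoted == stored_column
matches_quoted = quoted == stored_column
```

Renaming the columns to all-lowercase names such as makes_id, as planned above, avoids the need for quoting altogether.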

PostgreSQL - Setting null values to missing rows in a join statement

SQL newbie here. I'm trying to write a query that generates a scoring table, setting null to a student's grades in a module for which they haven't yet taken their exams (on PostgreSQL).
So I start with tables that look something like this:
student_evaluation:
|student_id| module_id | course_id |grade |
|----------|-----------|-----------|-------|
| 1 | 1 | 1 |3 |
| 1 | 1 | 1 |7 |
| 1 | 2 | 1 |8 |
| 2 | 4 | 2 |9 |
course_module:
| module_id | course_id |
| ---------- | --------- |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
In our use case, a course is made up of several modules. Each module has a single exam, but a student who failed his exam may have a couple of retries. The same module may also be present in different courses, but an exam attempt only counts for one instance of the module (ie. student A passed module 1's exam on course 1. If course 2 also has module 1, student A has to retake the same exam for course 2 if he also has access to that course).
So the output should look like this:
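| student_id | module_id | course_id | grade |
|------------|-----------|-----------|-------|
| 1          | 1         | 1         | 3     |
| 1          | 1         | 1         | 7     |
| 1          | 2         | 1         | 8     |
| 1          | 3         | 1         | null  |
| 2          | 4         | 2         | 9     |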
I feel like this should have been a simple task, but I think I have a very flawed understanding of how outer and cross joins work. I have tried stuff like:
SELECT se.student_id, se.module_id, se.course_id, se.grade FROM student_evaluation se
RIGHT OUTER JOIN course_module ON course_module.course_id = se.course_id
AND course_module.module_id = se.module_id
or
SELECT se.student_id, se.module_id, se.course_id, se.grade FROM student_evaluation se
CROSS JOIN course_module WHERE course_module.course_id = se.course_id
Neither worked. These all feel wrong, but I'm lost as to what would be the proper way to go about this.
Thank you in advance.
I think you need both join types: first use a cross join to build a list of all combinations of students and course modules, then use an outer join to add the grades.
SELECT sc.student_id,
sc.module_id,
sc.course_id,
se.grade
FROM student_evaluation se
RIGHT JOIN (SELECT s.student_id,
c.module_id,
c.course_id
FROM (SELECT DISTINCT student_id
FROM student_evaluation) AS s
CROSS JOIN course_module AS c) AS sc
USING (student_id, module_id, course_id);
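Note that a plain cross join pairs every student with every course module, while the expected output in the question only lists modules of courses the student has already attempted. A variant that matches that output can be sketched as follows, under the assumption (not stated in the question) that a student's courses can be inferred from the rows in student_evaluation. The sketch uses Python's sqlite3 module as a stand-in for PostgreSQL and a LEFT JOIN instead of a RIGHT JOIN, because older SQLite versions lack the latter; the SQL itself should carry over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student_evaluation (student_id INT, module_id INT, course_id INT, grade INT);
CREATE TABLE course_module (module_id INT, course_id INT);
INSERT INTO student_evaluation VALUES (1, 1, 1, 3), (1, 1, 1, 7), (1, 2, 1, 8), (2, 4, 2, 9);
INSERT INTO course_module VALUES (1, 1), (2, 1), (3, 1), (4, 2);
""")

QUERY = """
SELECT sc.student_id, cm.module_id, cm.course_id, se.grade
FROM (SELECT DISTINCT student_id, course_id
      FROM student_evaluation) AS sc          -- assumed enrolment list
JOIN course_module AS cm
  ON cm.course_id = sc.course_id              -- every module of those courses
LEFT JOIN student_evaluation AS se            -- grades where they exist, else NULL
  ON se.student_id = sc.student_id
 AND se.course_id = cm.course_id
 AND se.module_id = cm.module_id
ORDER BY sc.student_id, cm.course_id, cm.module_id, se.grade
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

If enrolment is stored in its own table, that table would replace the `sc` subquery.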

Postgres database: how to model multiple attributes that can have also multiple value, and have relations to other two entities

I have three entities, Items, Categories, and Attributes.
An Item can be in one or multiple Categories, so there is N:M relation.
Item
id | name
1 | alfa
ItemCategories
item_id | category_id
1 | 1
1 | 2
Categories
id | name
1 | chipset
2 | interface
An Item can have multiple Attributes depending on the 'Categories' they are in.
For example, the items in Category 'chipset' can have as attributes: 'interface', 'memory' 'tech'.
These attributes have a set of predefined values that don't change often, but they can change.
For example: 'memory' can only be ddr2, ddr3, ddr4.
Attributes
id | name | values
1 | memory | {ddr2, ddr3, ddr4}
CategoryAttributes
category_id | attribute_id
1 | 1
An Item that is in the 'chipset' Category has access to the Attribute and can only have Null or the predefined value of the attribute.
I thought of using an enum or JSON for the Attribute values, but I have a few other conditions:
ItemAttributes
item_id attribute_id value
1 1 {ddr2, ddr4}
1) If an Attribute appears in 2 Categories, and an Item is in both categories, the attribute should be shown only once.
2) I need to use the values for ranking: if two matching attribute values appear for an item, the rank should be higher than if only one matches or the value doesn't exist.
3) Creating separate tables for Attributes is not an option, because their number is not fixed and can be big.
So I don't know what the best database design options are for constraining the values and using them for rank ordering.
The problem you are describing is a typical open schema or vertical database, which is a classic use case for some kind of EAV model.
EAV is a complex yet powerful paradigm that allows a potentially open schema while respecting the database normal forms, and it gives you what you need: a variable number of attributes depending on specific instances of the same entity.
That is what typically happens in e-commerce on relational databases, since different products have different attributes (e.g. a lipstick has a color, but for a hard drive you probably don't care about color but about capacity). It doesn't make sense to have one wide attribute table with a column per attribute, because the number of attributes is not fixed and can be big, and most rows would hold a lot of NULL values (the mathematical notion of a sparse matrix, which looks very ugly in a DB table).
You can take a look at the Magento DB model, a true reference for pure EAV at scale, or at Wikipedia, but you can probably do that later; for now, you just need the basics:
The basic idea is to store attributes and their corresponding values as rows, instead of columns, in a single table.
In the simplest implementation the table has at least three columns: entity (usually a foreign key to an entity, or an entity type/category), attribute (this can be a string, or a foreign key in more complex systems), and value.
In my previous example, oversimplifying, we could have a table like this, which lists attribute names and their values for each item:
Item table
+------+--------------+
| id   | name         |
+------+--------------+
| 1    | "hard drive" |
+------+--------------+
| 2    | "lipstick"   |
+------+--------------+
Attributes table
+---------+------------+-------+
| item_id | attribute  | value |
+---------+------------+-------+
| 2       | "color"    | "red" |
+---------+------------+-------+
| 2       | "price"    | 10    |
+---------+------------+-------+
| 1       | "capacity" | "1TB" |
+---------+------------+-------+
| 1       | "price"    | 200   |
+---------+------------+-------+
So for every item, you can have a list of attributes.
Since your model is more complex and has a few more constraints, we need to adapt this model:
Since you want to limit the possible values, you will need a table for values.
Since you will have a values table, each value has to refer to an attribute, so the attributes need an id, which means you will have an attribute table.
To make explicit and strict which categories have which attributes, you need a category-attribute table.
With this, you end up with something like:
Categories table
List of categories ids and names
+------+--------------+
| id | name |
+------+--------------+
| 1 | "chipset" |
+------+--------------+
| 2 | "interface" |
+------+--------------+
Attributes table
List of attribute ids and their name
+------+--------------+
| id | name |
+------+--------------+
| 1 | "interface" |
+------+--------------+
| 2 | "memory" |
+------+--------------+
| 3 | "tech" |
+------+--------------+
| 4 | "price" |
+------+--------------+
Category-Attribute table
Which category has which attributes. Note that one attribute (e.g. attribute 4) can belong to 2 categories
+--------------+--------------+
| attribute_id | category_id |
+--------------+--------------+
| 1 | 1 |
+--------------+--------------+
| 2 | 1 |
+--------------+--------------+
| 3 | 1 |
+--------------+--------------+
| 4 | 1 |
+--------------+--------------+
| 4 | 2 |
+--------------+--------------+
Value table
List of possible values for every attribute
+----------+--------------+----------+
| value_id | attribute_id | value    |
+----------+--------------+----------+
| 1        | 2            | "ddr2"   |
+----------+--------------+----------+
| 2        | 2            | "ddr3"   |
+----------+--------------+----------+
| 3        | 2            | "ddr4"   |
+----------+--------------+----------+
| 4        | 3            | "tech_1" |
+----------+--------------+----------+
| 5        | 3            | "tech_2" |
+----------+--------------+----------+
| 6        | ...          | ...      |
+----------+--------------+----------+
| 7        | ...          | ...      |
+----------+--------------+----------+
And finally, as you can imagine, the
Item-Attribute table lists one attribute value per row
+---------+--------------+-------+
| item_id | attribute_id | value |
+---------+--------------+-------+
| 1       | 2            | 1     |
+---------+--------------+-------+
| 1       | 2            | 3     |
+---------+--------------+-------+
Meaning that item 1, for attribute 2 (`memory`), has value ids 1 and 3 (`ddr2` and `ddr4`)
This will cover all your conditions:
The number of attributes is unlimited: as big as needed and not fixed.
You can define clearly which category has which attributes.
Two categories can have the same attribute.
If one item belongs to two categories that have the same attribute, you can show it only once (i.e. SELECT DISTINCT attribute_id FROM Category-Attribute WHERE category_id IN (SELECT category_id FROM ItemCategories WHERE item_id = ...) gives you the list of eligible attributes, each listed once even if two categories share it).
You can do a rank. I don't think I have enough info to write this query, but since this is a fully normalized model, you can definitely do it. You have pretty much the full model here, so surely you can figure out the query.
This is very similar to the model that Magento uses. It is very powerful, though it can get hard to manage; still, it is the best way if we want to keep the model strict, make sure it enforces the constraints, and keep all the SQL functions available.
For less strict systems, a NoSQL database with a much more flexible schema is always an option.
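The model described above can be tried out with a small script. This is a rough sketch using Python's sqlite3 module as a stand-in for PostgreSQL; the snake_case table and column names are my own adaptation of the tables in this answer, and the composite foreign key on (attribute_id, value_id) is one possible way, not the only one, to make sure an item can only receive a value that is predefined for that attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to
conn.executescript("""
CREATE TABLE attributes (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE category_attributes (
    attribute_id INTEGER NOT NULL REFERENCES attributes(id),
    category_id  INTEGER NOT NULL,
    PRIMARY KEY (attribute_id, category_id));
CREATE TABLE attribute_values (
    value_id     INTEGER PRIMARY KEY,
    attribute_id INTEGER NOT NULL REFERENCES attributes(id),
    value        TEXT NOT NULL,
    UNIQUE (attribute_id, value_id));          -- target of the composite FK below
CREATE TABLE item_categories (item_id INTEGER, category_id INTEGER);
CREATE TABLE item_attributes (
    item_id      INTEGER NOT NULL,
    attribute_id INTEGER NOT NULL,
    value_id     INTEGER NOT NULL,
    FOREIGN KEY (attribute_id, value_id)
        REFERENCES attribute_values (attribute_id, value_id));
INSERT INTO attributes VALUES (1, 'interface'), (2, 'memory'), (3, 'tech'), (4, 'price');
INSERT INTO category_attributes VALUES (1, 1), (2, 1), (3, 1), (4, 1), (4, 2);
INSERT INTO attribute_values VALUES
    (1, 2, 'ddr2'), (2, 2, 'ddr3'), (3, 2, 'ddr4'), (4, 3, 'tech_1');
INSERT INTO item_categories VALUES (1, 1), (1, 2);
INSERT INTO item_attributes VALUES (1, 2, 1), (1, 2, 3);
""")

# Attributes item 1 may use; attribute 4 is in both categories but listed once.
eligible = [r[0] for r in conn.execute("""
    SELECT DISTINCT attribute_id FROM category_attributes
    WHERE category_id IN (SELECT category_id FROM item_categories WHERE item_id = ?)
    ORDER BY attribute_id""", (1,))]

# Values actually set on item 1, resolved to their text.
memory_values = [r[0] for r in conn.execute("""
    SELECT v.value FROM item_attributes ia
    JOIN attribute_values v
      ON v.attribute_id = ia.attribute_id AND v.value_id = ia.value_id
    WHERE ia.item_id = ? ORDER BY v.value_id""", (1,))]

# value_id 4 belongs to attribute 3, so pairing it with attribute 2 is rejected.
try:
    conn.execute("INSERT INTO item_attributes VALUES (1, 2, 4)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The sketch does not check that an item's attribute is actually allowed by one of the item's categories; that rule would need an extra constraint or a check in application code.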

Database design with multiple units and multiple attributes

Supposing I have this database design which I have researched.
Table: Products
ProductId | Name | BaseUnitId
1 | Lab gown | 1
2 | Gloves | 1
FK: BaseUnitId references Units.UnitId
Table: Units
UnitId | Name
1 | Each / Pieces
2 | Dozen
3 | Box
Table: Unit Conversion
ProdID | BaseUnitID | Factor | ConvertToUnitID
1 | 1 | 12 | 2
2 | 1 | 100 | 3
FK: BaseUnitId references Units.UnitID
FK: ConvertToUnitId references Units.UnitID
Table: Product Attribute
AttribId | Prod_ID | Attribute | Value
1 | 1 | Color | Blue
2 | 1 | Size | Large
3 | 2 | Color | Violet
4 | 2 | Size | Small
5 | 2 | Size | Medium
6 | 2 | Size | Large
7 | 2 | Color | White
FK: Prod_ID references Product.ProductID
Table: Inventory
Prod_ID | Base Unit Qty | Expiry
1 | 12 | n/a
2 | 100 | 2020-01-01
2 | 100 | 2021-12-31
FK: Prod_ID references Product.ProductID
How can I break down the inventory per unit per attribute?
E.g. how can I get the inventory of SMALL VIOLET GLOVES? LARGE WHITE GLOVES?
Any suggestions? My idea is to create another table which will link product unit, product attribute and quantity.
But I don't know how to link the size attribute and color attribute to a unit.
Lastly, is there something wrong with this design?
I think it is quite wrong to split off the attributes of a product into a different table. I understand the desire to normalize, but it should be done differently.
I'd handle a product and its attributes like this:
CREATE TABLE product (
id bigint PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
name text NOT NULL,
baseunit_id bigint NOT NULL REFERENCES unit
);
CREATE TABLE inventory (
id bigint PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
product_id bigint NOT NULL REFERENCES product,
color integer REFERENCES product_color,
size integer REFERENCES product_size,
other_attributes jsonb
);
That also makes sense if you think about it in natural language terms: “How many dozens of large blue gloves do we have in store?”
Attributes that do not apply to a certain product can be left NULL.
I make a distinction between common and rare attributes. Common attributes have their own column. Rare attributes are bunched together in a jsonb column. I know that the latter is neither normalized nor pretty, but varying attributes are not well suited to a relational model. A GIN index on the column will allow searches to be efficient.
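To see how this answers the "small violet gloves" question, here is a rough sketch using Python's sqlite3 module as a stand-in for PostgreSQL. The lookup tables follow the REFERENCES clauses above, but the `base_unit_qty` and `expiry` columns on inventory, the helper `stock`, and all quantities are assumptions made up for illustration; the jsonb column is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product       (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product_color (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product_size  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE inventory (
    id            INTEGER PRIMARY KEY,
    product_id    INTEGER NOT NULL REFERENCES product(id),
    color         INTEGER REFERENCES product_color(id),
    size          INTEGER REFERENCES product_size(id),
    base_unit_qty INTEGER NOT NULL,   -- hypothetical: stock counted in base units
    expiry        TEXT                -- hypothetical: NULL when not applicable
);
INSERT INTO product VALUES (1, 'Lab gown'), (2, 'Gloves');
INSERT INTO product_color VALUES (1, 'Blue'), (2, 'Violet'), (3, 'White');
INSERT INTO product_size VALUES (1, 'Small'), (2, 'Medium'), (3, 'Large');
-- Made-up quantities, for illustration only.
INSERT INTO inventory (product_id, color, size, base_unit_qty, expiry) VALUES
    (2, 2, 1, 40, '2020-01-01'),
    (2, 2, 1, 60, '2021-12-31'),
    (2, 3, 3, 25, '2021-12-31'),
    (1, 1, 3, 12, NULL);
""")

def stock(product, color, size):
    """Total base-unit quantity for one product/color/size combination."""
    row = conn.execute("""
        SELECT COALESCE(SUM(i.base_unit_qty), 0)
        FROM inventory i
        JOIN product p       ON p.id = i.product_id
        JOIN product_color c ON c.id = i.color
        JOIN product_size s  ON s.id = i.size
        WHERE p.name = ? AND c.name = ? AND s.name = ?""",
        (product, color, size)).fetchone()
    return row[0]

small_violet_gloves = stock("Gloves", "Violet", "Small")
large_white_gloves = stock("Gloves", "White", "Large")
```

Dividing the result by a conversion factor from the Unit Conversion table would turn base units into dozens or boxes.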

Sane way to store different data types within same column in postgres?

I'm currently attempting to modify an existing API that interacts with a postgres database. Long story short, it essentially stores descriptors/metadata that determine where an actual 'asset' (typically a file of some sort) is stored on the server's hard disk.
Currently, it's possible to 'tag' these 'assets' with any number of undefined key-value pairs (e.g. uploadedBy, addedOn, assetType, etc.). These tags are stored in a separate table with a structure similar to the following:
+---------------+----------------+-------------+
|assetid (text) | tagid(integer) | value(text) |
|---------------+----------------+-------------|
|someStringValue| 1234 | someValue |
|---------------+----------------+-------------|
|aDiffStringKey | 1235 | a username |
|---------------+----------------+-------------|
|aDiffStrKey | 1236 | Nov 5, 1605 |
+---------------+----------------+-------------+
assetid and tagid are foreign keys from other tables. Think of the assetid representing a file and the tagid/value pair is a map of descriptors.
Right now, the API (which is in Java) creates all these key-value pairs as a Map object. This includes things like timestamps/dates. What we'd like to do is to somehow be able to store different types of data for the value in the key-value pair. Or at least, storing it differently within the database, so that if we needed to, we could run queries checking date-ranges and the like on these tags. However, if they're stored as text items in the db, then we'd have to a.) Know that this is actually a date/time/timestamp item, and b.) convert into something that we could actually run such a query on.
There is only one idea I could think of thus far that doesn't change the layout of the db too much.
It is to expand the assettag table (shown above) to have additional columns for various types (numeric, text, timestamp), allow them to be null, and then on insert, checking the corresponding 'key' to figure out what type of data it really is. However, I can see a lot of problems with that sort of implementation.
Can any PostgreSQL-Ninjas out there offer a suggestion on how to approach this problem? I'm only recently getting thrown back into the deep-end of database interactions, so I admit I'm a bit rusty.
You've basically got two choices:
Option 1: A sparse table
Have one column for each data type, but only use the column that matches the data type you want to store. Of course this leads to most columns being null - a waste of space, but the purists like it because of the strong typing. It's a bit clunky having to check each column for null to figure out which datatype applies. Also, too bad if you actually want to store a null - then you must choose a specific value that "means null" - more clunkiness.
Option 2: Two columns - one for content, one for type
Everything can be expressed as text, so have a text column for the value, and another column (int or text) for the type, so your app code can restore the correct value in the correct type object. Good things are you don't have lots of nulls, but importantly you can easily extend the types to something beyond SQL data types to application classes by storing their value as json and their type as the class name.
I have used option 2 several times in my career and it was always very successful.
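A minimal sketch of option 2, assuming a hypothetical `value_type` column next to the existing `value` column and using Python's sqlite3 module as a stand-in for PostgreSQL and the Java API. The helpers `put_tag`, `get_tag` and `assets_tagged_between` are illustrative names only. Timestamps are written in ISO 8601 form, which sorts chronologically as text, so a date-range query stays possible as long as it filters on the type column.

```python
import sqlite3
from datetime import datetime

# Maps the stored type name to a function that rebuilds the typed value.
DECODERS = {
    "text": str,
    "int": int,
    "float": float,
    "timestamp": datetime.fromisoformat,
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assettag (assetid TEXT, tagid INTEGER, value TEXT, value_type TEXT)"
)

def put_tag(assetid, tagid, value):
    """Store value as text together with a type name chosen from its Python type."""
    if isinstance(value, datetime):
        value_type, text = "timestamp", value.isoformat(sep=" ")
    elif isinstance(value, int):
        value_type, text = "int", str(value)
    elif isinstance(value, float):
        value_type, text = "float", repr(value)
    else:
        value_type, text = "text", str(value)
    conn.execute(
        "INSERT INTO assettag VALUES (?, ?, ?, ?)", (assetid, tagid, text, value_type)
    )

def get_tag(assetid, tagid):
    """Read a tag back and convert it to the type recorded for it."""
    text, value_type = conn.execute(
        "SELECT value, value_type FROM assettag WHERE assetid = ? AND tagid = ?",
        (assetid, tagid),
    ).fetchone()
    return DECODERS[value_type](text)

def assets_tagged_between(tagid, start, end):
    """Asset ids whose timestamp tag lies in [start, end]; compares ISO text."""
    rows = conn.execute(
        "SELECT assetid FROM assettag "
        "WHERE tagid = ? AND value_type = 'timestamp' AND value BETWEEN ? AND ? "
        "ORDER BY assetid",
        (tagid, start.isoformat(sep=" "), end.isoformat(sep=" ")),
    )
    return [r[0] for r in rows]

put_tag("assetA", 1236, datetime(1605, 11, 5))
put_tag("assetB", 1236, datetime(2019, 5, 31, 13, 51, 36))
put_tag("assetA", 1235, "a username")
put_tag("assetA", 1237, 42)
```

In PostgreSQL the range filter could instead cast the text column (for example `value::timestamp`) for rows whose type says it is a timestamp, which avoids relying on string ordering.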
Another option, depending on what you're doing, could be to just have one value column but store some JSON around the value...
This could look something like:
{
"type": "datetime",
"value": "2019-05-31 13:51:36"
}
That could even go a step further, using a Json or XML column.
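On the application side, wrapping and unwrapping such a value could look roughly like this. The sketch is in Python rather than the API's Java, and `wrap` and `unwrap` are hypothetical helper names; only the "datetime" type from the example above gets special handling, other values are stored as plain JSON.

```python
import json
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # matches the example value above

def wrap(value):
    """Serialize a value into the {"type": ..., "value": ...} envelope."""
    if isinstance(value, datetime):
        return json.dumps(
            {"type": "datetime", "value": value.strftime(TIMESTAMP_FORMAT)}
        )
    return json.dumps({"type": type(value).__name__, "value": value})

def unwrap(text):
    """Rebuild the original value from the stored envelope."""
    doc = json.loads(text)
    if doc["type"] == "datetime":
        return datetime.strptime(doc["value"], TIMESTAMP_FORMAT)
    return doc["value"]

stored = wrap(datetime(2019, 5, 31, 13, 51, 36))
restored = unwrap(stored)
```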
I'm not in any way a PostgreSQL ninja, but I think that instead of two columns (one for name and one for type) you could look at the hstore data type:
data type for storing sets of key/value pairs within a single
PostgreSQL value. This can be useful in various scenarios, such as
rows with many attributes that are rarely examined, or semi-structured
data. Keys and values are simply text strings.
Of course, you have to check how dates/timestamps convert into and from this type and see if it works for you.
You can use two different techniques:
If the type can vary for each tagid
Define a main table that records, for every tagid-assetid combination, the name of a typed data table and a row id in it, plus the actual data tables:
maintable:
+---------------+----------------+-----------------+---------------+
|assetid (text) | tagid(integer) | tablename(text) | table_id(int) |
|---------------+----------------+-----------------+---------------|
|someStringValue| 1234 | tablebool | 123 |
|---------------+----------------+-----------------+---------------|
|aDiffStringKey | 1235 | tablefloat | 123 |
|---------------+----------------+-----------------+---------------|
|aDiffStrKey | 1236 | tablestring | 123 |
+---------------+----------------+-----------------+---------------+
tablebool
+-------------+-------------+
| id(integer) | value(bool) |
|-------------+-------------|
| 123 | False |
+-------------+-------------+
tablefloat
+-------------+--------------+
| id(integer) | value(float) |
|-------------+--------------|
| 123 | 12.345 |
+-------------+--------------+
tablestring
+-------------+---------------+
| id(integer) | value(string) |
|-------------+---------------|
| 123 | 'text' |
+-------------+---------------+
If every tagid has a fixed type
Create a tagid description table
tag descriptors
+---------------+----------------+-----------------+
|assetid (text) | tagid(integer) | tablename(text) |
|---------------+----------------+-----------------|
|someStringValue| 1234 | tablebool |
|---------------+----------------+-----------------|
|aDiffStringKey | 1235 | tablefloat |
|---------------+----------------+-----------------|
|aDiffStrKey | 1236 | tablestring |
+---------------+----------------+-----------------+
and corresponding data tables
tablebool
+-------------+----------------+-------------+
| id(integer) | tagid(integer) | value(bool) |
|-------------+----------------+-------------|
| 123 | 1234 | False |
+-------------+----------------+-------------+
tablefloat
+-------------+----------------+--------------+
| id(integer) | tagid(integer) | value(float) |
|-------------+----------------+--------------|
| 123 | 1235 | 12.345 |
+-------------+----------------+--------------+
tablestring
+-------------+----------------+---------------+
| id(integer) | tagid(integer) | value(string) |
|-------------+----------------+---------------|
| 123 | 1236 | 'text' |
+-------------+----------------+---------------+
All this is just for general idea. You should adapt it for your needs.