JPA: how to ensure uniqueness over 2 fields, String and Boolean

I want to create an entity containing 2 fields that need to be unique together. One of the fields is a Boolean:
@Entity
public class SoldToCountry {
    private String countryId;
    private Boolean isInt;
}
For a given String there should never exist more than 2 entries: one with isInt:true and the other with isInt:false.
I read the doc about @Id, but it seems that Boolean is not supported. For me it would also be OK to have a unique constraint spanning both fields and to use a generated id.
What is the best way to get this constraint via JPA?

If your table really has only two fields, and you want them to be unique together, then they should be the composite PK of the table. Take a look at How to create and handle composite primary key in JPA
If, instead, you have another PK, consider Sebastian's comment.
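If you go the composite-PK route, the key class has to be Serializable, have a no-arg constructor, and define equals() and hashCode() over both fields. Here is a minimal sketch of such a class in plain Java; the name SoldToCountryKey is made up for illustration, and the JPA annotation it would carry is only mentioned in a comment so the snippet compiles without a JPA provider on the classpath.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical key class for SoldToCountry. In a real mapping it would be
// annotated @Embeddable and used as an @EmbeddedId (or named in @IdClass).
final class SoldToCountryKey implements Serializable {
    private static final long serialVersionUID = 1L;

    private String countryId;
    private boolean isInt;

    // No-arg constructor, needed by JPA providers to instantiate the key.
    SoldToCountryKey() {
    }

    SoldToCountryKey(String countryId, boolean isInt) {
        this.countryId = countryId;
        this.isInt = isInt;
    }

    // Two keys are equal only if BOTH fields match, which is exactly the
    // "unique together" rule from the question.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SoldToCountryKey)) {
            return false;
        }
        SoldToCountryKey other = (SoldToCountryKey) o;
        return isInt == other.isInt && Objects.equals(countryId, other.countryId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(countryId, isInt);
    }
}
```

With this, ("DE", true) and ("DE", false) are distinct keys, while a second ("DE", true) is treated as a duplicate of the first.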

Related

Spring Data JPA bulk identifier generation

I have an entity, for which apart from the primary key, an extra unique identifier should be generated:
@Entity
class MyEntity(
    val otherId: String // <- this id is unique as well
) {
    @Id
    @Generated
    lateinit var id: UUID // PK
}
The otherId property value is derived from a Postgres sequence value, by calling SELECT nextval(...) and adding a prefix string. When I do bulk inserts, I have to resort to a custom query defined in my JPA repository for the entity, which retrieves multiple sequence values at once, but I'd like to make this process automatic.
I tried to implement the IdentifierGenerator interface, but the best I could achieve was a single SELECT nextval query for each new entity inserted, which is totally unacceptable in my case since batches can consist of hundreds of entities. Digging into the Hibernate details didn't give me an answer on how to do that either.
Is there a way to generate a number of ids via some callback/hook for multiple entities at once? Or do I still have to do everything by hand?
There are hooks to implement this, see this article as an example: https://thorben-janssen.com/custom-sequence-based-idgenerator/
To improve performance, you will have to configure the increment size, which by default is 50. This means that it will increment the sequence by 50 and put these values in a pool from which the values are served for identity generation.
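To show why a larger increment size cuts down the number of nextval calls, here is a simplified sketch of block-based allocation in plain Java. It is not Hibernate's actual optimizer code: BlockIdAllocator and FakeSequence are made-up names, FakeSequence stands in for the database sequence, and the sketch assumes the sequence was created with INCREMENT BY equal to the increment size, so each call returns the start of a fresh block.

```java
import java.util.function.LongSupplier;

// Hands out ids from an in-memory block and only goes back to the
// sequence when the block is used up.
final class BlockIdAllocator {
    private final LongSupplier nextBlockStart; // stands in for SELECT nextval(...)
    private final int incrementSize;
    private long next;     // next id to hand out
    private long blockEnd; // exclusive end of the current block

    BlockIdAllocator(LongSupplier nextBlockStart, int incrementSize) {
        if (incrementSize < 1) {
            throw new IllegalArgumentException("incrementSize must be >= 1");
        }
        this.nextBlockStart = nextBlockStart;
        this.incrementSize = incrementSize;
        // next == blockEnd means "pool is empty", so the first call fetches a block.
        this.next = 0;
        this.blockEnd = 0;
    }

    synchronized long nextId() {
        if (next >= blockEnd) {
            next = nextBlockStart.getAsLong(); // one round trip per block
            blockEnd = next + incrementSize;
        }
        return next++;
    }
}

// In-memory stand-in for a sequence with START WITH / INCREMENT BY,
// counting how often it is called.
final class FakeSequence implements LongSupplier {
    private long value;
    private final int incrementBy;
    int calls;

    FakeSequence(long start, int incrementBy) {
        this.value = start;
        this.incrementBy = incrementBy;
    }

    @Override
    public long getAsLong() {
        calls++;
        long current = value;
        value += incrementBy;
        return current;
    }
}
```

With an increment size of 50, generating 120 ids touches the sequence three times instead of 120. The prefixed otherId from the question would then be built from each value, for example "PREFIX-" + allocator.nextId(), where the prefix is a placeholder.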

Using Spring-Data and mongodb, natural vs. artificial id?

I'm using Spring-data to map pojos to mongo json documents.
The mongo Object Id reference says "If your document has a natural primary key that is immutable we recommend you use that in _id instead of the automatically generated ids." My question is, if my document has a natural primary key but it is some combination of the object's attributes, should I combine them to create the natural primary key?
Assume that neither of the values can ever change, and when concatenated together the result is guaranteed to be unique. Note that whatever type you declare for id, Spring converts it to an ObjectId (unless they don't have a converter for that type, then they convert it to a String).
Here is an example:
@Document
public class HomeworkAssignment {
    @Id
    private String id;
    private final String yyyymmdd;
    private final String uniqueStudentName;
    private double homeworkGrade;

    public HomeworkAssignment(String yyyymmdd, String uniqueStudentName) {
        this.yyyymmdd = yyyymmdd;
        this.uniqueStudentName = uniqueStudentName;
        // can either set the 'id' here, or let Spring give me an artificial one.
    }
    // setter provided for the homeworkGrade
}
There is guaranteed to be no more than one homework assignment per student per day. Both yyyymmdd and uniqueStudentName are given to me as Strings.
For example, "20120601bobsmith" uniquely identifies Bob Smith's homework on June 1, 2012. (If there is more than one Bob Smith, it is already handled in the uniqueName I'm given).
Assume that I want to follow the mongo reference advice and use a natural primary key if there is one. There is one, but it is a combination of 2 fields. Is this a case where I should combine them like so?
this.id = yyyymmdd + uniqueStudentName.toLowerCase();
It is certainly reasonable to use a combination of attributes as a primary key. However, rather than concatenating them, it is probably more logically intuitive to place them into a subdocument with two fields (uniqueStudentName and yyyymmdd) that is used as the _id.
Take a look at this question, which involves using a compound primary key:
MongoDB Composite Key

Cassandra Schema Design

I'm continuing to explore Cassandra and I would like to create a Student <=> Course relation, which is similar to many-to-many in an RDBMS.
In terms of queries, I will use the following:
Retrieve all courses in which a student is enrolled.
Retrieve all students enrolled in a specific course.
Let's say that I create two column families, one for Course and another for Student.
CREATE COLUMN FAMILY student with comparator = UTF8Type AND key_validation_class=UTF8Type and column_metadata=[
    {column_name:firstname,validation_class:UTF8Type},
    {column_name:lastname,validation_class:UTF8Type},
    {column_name:gender,validation_class:UTF8Type}];
CREATE COLUMN FAMILY course with comparator = UTF8Type AND key_validation_class=UTF8Type and column_metadata=[
    {column_name:name,validation_class:UTF8Type},
    {column_name:description,validation_class:UTF8Type},
    {column_name:lecturer,validation_class:UTF8Type},
    {column_name:assistant,validation_class:UTF8Type}];
Now how should I move on?
Should I create a third column family with a courseID:studentId composite key? If yes, can I use Hector to query by only one (left or right) component of the composite key?
Please help.
Update:
Following the suggestion I created the following Schema:
For Student:
CREATE COLUMN FAMILY student with comparator = UTF8Type and key_validation_class=UTF8Type and default_validation_class=UTF8Type;
and then we will add some data:
set student['student.1']['firstName']='Danny';
set student['student.1']['lastName']='Lesnik';
set student['student.1']['course.1']='';
set student['student.1']['course.2']='';
Create column Family for Course:
CREATE COLUMN FAMILY course with comparator = UTF8Type and key_validation_class=UTF8Type and default_validation_class=UTF8Type;
add some data:
set course['course.1']['name'] ='History';
set course['course.1']['description'] ='History Course';
set course['course.2']['name'] ='Algebra';
set course['course.2']['description'] ='Algebra Course';
and Finally Student In Course:
CREATE COLUMN FAMILY StudentInCourse with comparator = UTF8Type and key_validation_class=UTF8Type and default_validation_class=UTF8Type;
add data:
set StudentInCourse['studentIncourse.1']['student.1'] ='';
set StudentInCourse['studentIncourse.2']['student.1'] ='';
I defined a data model below, but it is easier to describe the object model first and then dive into the row model, so from PlayOrm's perspective you would have
public class Student {
    @NoSqlId
    private String id;
    private String firstName;
    private String lastName;
    @ManyToMany
    private List<Course> courses = new ArrayList<>(); // constructing avoids nullpointers
}
public class Course {
    @NoSqlId
    private String id;
    private String name;
    private String description;
    @ManyToOne
    private Lecturer lecturer;
    @ManyToMany
    private CursorToMany students = new CursorToManyImpl();
}
I could have used List in course but I was concerned I may get OutOfMemory if too many students take a course over years and years and years. NOW, let's jump to what PlayOrm does and you can do something similar if you like
A single student row would look like so
rowKey(the id in above entity) = firstName='dean', lastName='hiller', courses.rowkey56=null, courses.78=null, courses.98=null, courses.101=null
This is the wide row where we have many columns with the name 'fieldname' and 'rowkey to actual course'
The Course row is a bit more interesting... because the user thinks loading all the Students for a single course could cause out of memory, he uses a cursor which only loads 500 at a time as you loop over it.
There are two rows backing the Course in this case that PlayOrm will have. Sooo, let's take our user row above and he was in course rowkey56 so let's describe that course
rowkey56 = name='coursename', description='somedesc', lecturer='rowkey89ToLecturer'
Then, there is another row in some index table for the students (it is a very wide row, so it supports up to millions of students)
indexrowForrowkey56InCourse = student34.56, student39.56, student.23.56....
into the millions of students
If you want a course to have more than millions of students though, then you need to think about partitioning whether you use playOrm or not. PlayOrm does partitioning for you if you need though.
NOTE: If you don't know hibernate or JPA, when you load the above Student, it loads a proxy list so if you start looping over the courses, it then goes back to the noSQL store and loads the Courses so you don't have to ;).
In the case of Course, it loads a proxy Lecturer that is not filled in until you access a property field like lecturer.getName(). If you call lecturer.getId(), it doesn't need to load the lecturer since it already has that from the Course row.
EDIT(more detail): PlayOrm has 3 index tables: Decimal (stores double, float, etc. and BigDecimal), Integer (long, short, etc. and BigInteger and boolean), and String index tables. When you use CursorToMany, it uses one of those tables depending on the FK type of key. It also uses those tables for its Scalable-SQL language. The reason it uses a separate row on CursorToMany is just so clients don't get OutOfMemory on reading a row in, as the toMany could have one million FK's in it in some cases. CursorToMany then reads in batches from that index row.
later,
Dean
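If you would rather do something similar by hand, the layout described above boils down to two wide rows per relation, one keyed by student and one keyed by course, with the course-side row read in batches. The sketch below is only an in-memory stand-in in plain Java to show the shape of the data and the paged read; EnrollmentIndex and its method names are made up, and it uses neither PlayOrm nor Hector.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Each map entry models one wide row: the map key is the row key and the
// sorted set holds the column names (the ids on the other side).
final class EnrollmentIndex {
    private final Map<String, SortedSet<String>> coursesByStudent = new HashMap<>();
    private final Map<String, SortedSet<String>> studentsByCourse = new HashMap<>();

    // Writes both directions, the way a denormalized model needs two inserts.
    void enroll(String studentId, String courseId) {
        coursesByStudent.computeIfAbsent(studentId, k -> new TreeSet<>()).add(courseId);
        studentsByCourse.computeIfAbsent(courseId, k -> new TreeSet<>()).add(studentId);
    }

    // Query 1: all courses a student is enrolled in.
    SortedSet<String> coursesOf(String studentId) {
        SortedSet<String> row = coursesByStudent.get(studentId);
        return row == null
                ? Collections.<String>emptySortedSet()
                : Collections.unmodifiableSortedSet(row);
    }

    // Query 2: students in a course, one page at a time. Pass null to start
    // at the beginning, then the last id of the previous page to continue.
    List<String> studentsIn(String courseId, String afterStudentId, int pageSize) {
        SortedSet<String> row = studentsByCourse.get(courseId);
        if (row == null) {
            return new ArrayList<>();
        }
        // Appending '\0' gives the smallest string strictly greater than the id,
        // so the previous page's last entry is not returned again.
        SortedSet<String> slice =
                afterStudentId == null ? row : row.tailSet(afterStudentId + "\0");
        List<String> page = new ArrayList<>();
        for (String studentId : slice) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(studentId);
        }
        return page;
    }
}
```

Reading a course then becomes a loop that asks for the next page until an empty list comes back, which keeps memory bounded no matter how wide the row grows.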

JPA 2.0 retrieve entity by business key

I know there have been a number of similar posts about this, but I couldn't find a clear answer to my problem.
To make it as simple as possible, say I have such an entity:
@Entity
public class Person implements Serializable {
    @Id
    private Long id; // PK
    private String name; // business key
    /* getters and setters */
    /*
       override equals() and hashCode()
       to use the **name** field
    */
}
So, id is the PK and name is the business key.
Say that I get a list of names, with possible duplicates, which I want to store.
If I simply create one object per name, and let JPA make it persistent, my final table will contain duplicate names - Not acceptable.
My question is what you think is the best approach, considering the alternatives I describe here below and (especially welcome) your own.
Possible solution 1: check the entity manager
Before creating a new person object, check if one with the same person name is already managed.
Problem: The entity manager can only be queried by PK. Is there any workaround I don't know about?
Possible solution 2: find objects by query
Query query = em.createQuery("SELECT p FROM Person p WHERE p.name = ...");
List<Person> list = query.getResultList();
Questions: Should the objects requested be already loaded in the em, will this still fetch from database? If so, I suppose it would still be not very efficient if done very frequently, due to parsing the query?
Possible solution 3: keep a separate dictionary
This is possible because equals() and hashCode() are overridden to use the field name.
Map<String,Person> personDict = new HashMap<String,Person>();
for (String n : incomingNames) {
    Person p = personDict.get(n);
    if (p == null) {
        p = new Person();
        p.setName(n);
        em.persist(p);
        personDict.put(n, p);
    }
    // do something with it
}
Problem 1: Wasting memory for large collections, as this is essentially what the entity manager does (not quite though!)
Problem 2: Suppose that I have a more complex schema, and that after the initial writing my application gets closed, started again, and needs to re-load the database. If all tables are loaded explicitly into the em, then I can easily re-populate the dictionaries (one per entity), but if I use lazy fetch and/or cascade read, then it's not so easy.
I started recently with JPA (I use EclipseLink), so perhaps I am missing something fundamental here, because this issue seems to boil down to a very common usage pattern.
Please enlighten me!
The best solution which I can think of is pretty simple, use a Unique Constraint
@Entity
@Table(uniqueConstraints = @UniqueConstraint(columnNames = "name"))
public class Person implements Serializable {
    @Id
    private Long id; // PK
    private String name; // business key
}
The only way to ensure that the field can be used (correctly) as a key is to create a unique constraint on it. You can do this using @Table(uniqueConstraints = @UniqueConstraint(columnNames = "name")) on the entity (@UniqueConstraint cannot be placed on the class directly) or using @Column(unique = true) on the field.
Upon trying to insert a duplicate key the EntityManager (actually, the DB) will throw an exception. This scenario is also true for a manually set primary key.
The only way to prevent the exception is to do a select on the key and check if it exists.
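For a whole list of incoming names, that check does not have to be one select per name. A rough sketch in plain Java, under the assumption that the names already in the table were loaded with a single query such as SELECT p.name FROM Person p WHERE p.name IN :names; the class and method names here are made up, and the unique constraint stays in place as the safety net against concurrent inserts.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class BusinessKeyFilter {
    private BusinessKeyFilter() {
    }

    // Returns the names that still need a new Person: duplicates inside the
    // incoming list are collapsed, and names already stored are dropped.
    // `alreadyStored` is assumed to come from one IN-query against the table.
    static List<String> namesToInsert(Collection<String> incoming, Set<String> alreadyStored) {
        Set<String> distinct = new LinkedHashSet<>(incoming); // keeps first-seen order
        distinct.removeAll(alreadyStored);
        return new ArrayList<>(distinct);
    }
}
```

Only the returned names would then be turned into Person objects and passed to em.persist().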

How to create a composite primary key which contains a @ManyToOne attribute as an @EmbeddedId in JPA?

I'm asking and answering my own question, but i'm not assuming i have the best answer. If you have a better one, please post it!
Related questions:
How to set a backreference from an @EmbeddedId in JPA
hibernate mapping where embeddedid (?)
JPA Compound key with @EmbeddedId
I have a pair of classes which are in a simple aggregation relationship: any instance of one owns some number of instances of the other. The owning class has some sort of primary key of its own, and the owned class has a many-to-one to this class via a corresponding foreign key. I would like the owned class to have a primary key comprising that foreign key plus some additional information.
For the sake of argument, let's use those perennial favourites, Order and OrderLine.
The SQL looks something like this:
-- 'order' may have been a poor choice of name, given that it's an SQL keyword!
create table Order_ (
orderId integer primary key
);
create table OrderLine (
orderId integer not null references Order_,
lineNo integer not null,
primary key (orderId, lineNo)
);
I would like to map this into Java using JPA. Order is trivial, and OrderLine can be handled with an @IdClass. Here's the code for that - the code is fairly conventional, and i hope you'll forgive my idiosyncrasies.
However, using @IdClass involves writing an ID class which duplicates the fields in the OrderLine. I would like to avoid that duplication, so i would like to use @EmbeddedId instead.
However, a naive attempt to do this fails:
@Embeddable
public class OrderLineKey {
    @ManyToOne
    private Order order;
    private int lineNo;
}
OpenJPA rejects the use of that as an @EmbeddedId. I haven't tried other providers, but i wouldn't expect them to succeed, because the JPA specification requires that the fields making up an ID class be basic, not relationships.
So, what can i do? How can i write a class whose key contains a @ManyToOne relationship, but is handled as an @EmbeddedId?
I don't know of a way to do this which doesn't involve duplicating any fields (sorry!). But it can be done in a straightforward and standard way that involves duplicating only the relationship fields. The key is the @MapsId annotation introduced in JPA 2.
The embeddable key class looks like this:
@Embeddable
public class OrderLineKey {
    private int orderId;
    private int lineNo;
}
And the embedding entity class looks like this:
@Entity
public class OrderLine {
    @EmbeddedId
    private OrderLineKey id;
    @ManyToOne
    @MapsId("orderId")
    private Order order;
}
The @MapsId annotation declares that the relationship field to which it is applied effectively re-maps a basic field from the embedded ID.
Here's the code for OrderId.