Sequelize many-to-many self reference - postgresql

I'm trying to get one of the association examples from sequelize working properly and it doesn't seem to be setting up the join table correctly. In the example we have one model called Person and then a many-to-many self-reference for a Person's children. Code:
var Sequelize = require('sequelize');
var sequelize = new Sequelize('postgres://root@localhost/database_bug');

var Person = sequelize.define('Person', {
  id: {
    type: Sequelize.INTEGER,
    primaryKey: true,
    autoIncrement: true
  },
  name: {
    type: Sequelize.STRING,
    allowNull: false
  }
});

Person.belongsToMany(Person, { as: 'Children', through: 'PersonChildren' });

Person.sequelize.sync({ force: true }).then(function() {
  Person.build({ name: 'John Doe' }).save().then(function(father) {
    Person.build({ name: 'Jane Doe' }).save().then(function(daughter) {
      father.addChild(daughter);
    });
  });
});
But when I look at my tables in postgres I feel like a column is missing on the auto-generated join table.
List of relations
Schema | Name | Type | Owner
--------+----------------+----------+-------
public | People | table | root
public | People_id_seq | sequence | root
public | PersonChildren | table | root
Table "public.People"
Column | Type | Modifiers
-----------+--------------------------+-------------------------------------------------------
id | integer | not null default nextval('"People_id_seq"'::regclass)
name | character varying(255) | not null
createdAt | timestamp with time zone | not null
updatedAt | timestamp with time zone | not null
Indexes:
"People_pkey" PRIMARY KEY, btree (id)
Referenced by:
TABLE ""PersonChildren"" CONSTRAINT "PersonChildren_PersonId_fkey" FOREIGN KEY ("PersonId") REFERENCES "People"(id)
Table "public.PersonChildren"
Column | Type | Modifiers
-----------+--------------------------+-----------
createdAt | timestamp with time zone | not null
updatedAt | timestamp with time zone | not null
PersonId | integer | not null
Indexes:
"PersonChildren_pkey" PRIMARY KEY, btree ("PersonId")
Foreign-key constraints:
"PersonChildren_PersonId_fkey" FOREIGN KEY ("PersonId") REFERENCES "People"(id)
PersonChildren needs a column called ChildId or something along those lines to link a Person to its Child.
People table:
database_bug=# SELECT * FROM "People";
id | name | createdAt | updatedAt
----+----------+----------------------------+----------------------------
1 | John Doe | 2015-02-06 09:36:44.975-08 | 2015-02-06 09:36:44.975-08
2 | Jane Doe | 2015-02-06 09:36:44.985-08 | 2015-02-06 09:36:44.985-08
Weirder still, I ran a select to make sure the daughter was added as a child of the father:
database_bug=# SELECT * from "PersonChildren";
createdAt | updatedAt | PersonId
----------------------------+----------------------------+----------
2015-02-06 09:36:44.997-08 | 2015-02-06 09:36:44.997-08 | 2
But PersonId is 2, not 1. The father was supposed to add the daughter, not the other way around.
Any ideas how to get this association working?

OK, it looks like the example in the documentation is wrong. To be fair, they did say you must use hasMany, but then showed an example using belongsToMany.
I changed belongsToMany to hasMany and it looks like we are good to go:
Table "public.PersonChildren"
Column | Type | Modifiers
-----------+--------------------------+-----------
createdAt | timestamp with time zone | not null
updatedAt | timestamp with time zone | not null
PersonId | integer | not null
ChildId | integer | not null
database_bug=# select * from "PersonChildren";
createdAt | updatedAt | PersonId | ChildId
----------------------------+----------------------------+----------+---------
2015-02-06 10:04:21.624-08 | 2015-02-06 10:04:21.624-08 | 1 | 2
Now I can do father.getChildren() and the promise will return a list of children.
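For reference, father.getChildren() now roughly corresponds to a join through the generated join table. A rough SQL sketch, using the table and column names shown above (the WHERE value of 1 is John Doe's id from the People output):
SELECT child.*
FROM "People" AS child
JOIN "PersonChildren" AS pc ON pc."ChildId" = child.id
WHERE pc."PersonId" = 1;  -- 1 = John Doe, the father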

Related

How to query across multiple rows in postgres

I'm saving dynamic objects (objects of which I do not know the type upfront) using the following 2 tables in Postgres:
CREATE TABLE IF NOT EXISTS objects(
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL,
    name TEXT NOT NULL,
    PRIMARY KEY(id)
);
CREATE TABLE IF NOT EXISTS object_values(
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    event_id UUID NOT NULL,
    param TEXT NOT NULL,
    value TEXT NOT NULL,
    PRIMARY KEY(id)
);
So for instance, if I have the following objects:
dog = [
    { breed: "poodle", age: 15, ... },
    { breed: "husky", age: 9, ... },
]
monitors = [
    { manufacturer: "dell", ... },
]
It will live in the DB as follows:
-- objects
| id | user_id | name |
|----|---------|---------|
| 1 | 1 | dog |
| 2 | 2 | dog |
| 3 | 1 | monitor |
-- object_values
| id | event_id | param | value |
|----|----------|--------------|--------|
| 1 | 1 | breed | poodle |
| 2 | 1 | age | 15 |
| 3 | 2 | breed | husky |
| 4 | 2 | age | 9 |
| 5 | 3 | manufacturer | dell |
Note: these tables are big (hundreds of millions of rows) and are generally optimised for writing.
What would be a good way of querying/filtering objects based on multiple object params? For instance: Select the number of all husky dogs above the age of 10 per unique user.
I also wonder whether it would have been better to denormalise the tables and collapse the params onto a JSON column (and use gin indexes).
Are there any standards I can use here?
"Select the number of all husky dogs above the age of 10 per unique user" - The following query would do it.
SELECT user_id, COUNT(*) AS num_husky_dogs_older_than_10
FROM (
    SELECT o.user_id, ov.event_id
    FROM objects o
    INNER JOIN object_values ov
        ON ov.event_id = o.id
    WHERE o.name = 'dog'
    GROUP BY o.user_id, ov.event_id
    HAVING MAX(CASE WHEN ov.param = 'age' THEN ov.value::integer END) > 10
       AND MAX(CASE WHEN ov.param = 'breed' AND ov.value = 'husky' THEN 1 END) = 1
) AS husky_dogs_over_10
GROUP BY user_id;
Since your queries will most likely always perform the same JOIN between these two tables on the same fields, it would be good to have indexes on:
the fields you join on ("objects.id", "object_values.event_id")
the fields you filter on ("objects.name", "object_values.param", "object_values.value")
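A minimal sketch of what those indexes could look like (index names are illustrative, and the composite index on (param, value) is just one option for the filter columns):
CREATE INDEX IF NOT EXISTS object_values_event_id_idx ON object_values (event_id);
CREATE INDEX IF NOT EXISTS objects_name_idx ON objects (name);
CREATE INDEX IF NOT EXISTS object_values_param_value_idx ON object_values (param, value);
-- objects.id is already covered by its primary key index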

unique constraint values in postgres

I applied a unique constraint over email on a Postgres USER table. The problem I am facing now is that the constraint seems to remember every value I insert (or try to insert), no matter whether a record with that value still exists or not.
I.e., table:
| id | user            |
|----|-----------------|
| 1  | mail@gmail.com  |
| 2  | mail2@gmail.com |
If I insert mail3@gmail.com, delete that row, and try to insert mail3@gmail.com again, it says:
sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "email"
My doubt is: does the unique constraint guarantee that the value has never been used before, or that there is only one record with that value in the column? The documentation says the second, but experience shows the first.
More details:
| Column | Type | Nullable |
|------------------|-----------------------------|----------|
| id | integer | not null |
| email | character varying(100) | |
| password | character varying(100) | |
| name | character varying(1000) | |
| lastname | character varying(1000) | |
| dni | character varying(20) | |
| cellphone | character varying(20) | |
| accepted_terms | boolean | |
| investor_test | boolean | |
| validated_email | boolean | |
| validated_cel | boolean | |
| last_login_at | timestamp without time zone | |
| current_login_at | timestamp without time zone | |
| last_login_ip | character varying(100) | |
| current_login_ip | character varying(100) | |
| login_count | integer | |
| active | boolean | |
| fs_uniquifier | character varying(255) | not null |
| confirmed_at | timestamp without time zone | |
Indexes:
"bondusers_pkey" PRIMARY KEY, btree (id)
"bondusers_email_key" UNIQUE CONSTRAINT, btree (email)
"bondusers_fs_uniquifier_key" UNIQUE CONSTRAINT, btree (fs_uniquifier)
Insert Statement:
INSERT INTO bondusers (email, password, name, lastname, dni, cellphone, accepted_terms, investor_test, validated_email, validated_cel, last_login_at, current_login_at, last_login_ip, current_login_ip, login_count, active, fs_uniquifier, confirmed_at) VALUES ('mail3@gmail.com', '$pbkdf2-sha256$29000$XyvlfI8x5vwfYwyhtBYi5A$Hhfrzvqs94MjTCmDOVmmnbUyf7ho4kLEY8UYUCdHPgM', 'mail', 'mail3', '123123123', '1139199196', false, false, false, false, NULL, NULL, NULL, NULL, NULL, true, '1c4e60b34a5641f4b560f8fd1d45872c', NULL);
ERROR: duplicate key value violates unique constraint "bondusers_fs_uniquifier_key"
DETAIL: Key (fs_uniquifier)=(1c4e60b34a5641f4b560f8fd1d45872c) already exists.
but when:
select * from bondusers where fs_uniquifier = '1c4e60b34a5641f4b560f8fd1d45872c';
result is 0 rows
I assume that if you run the INSERT, DELETE, INSERT directly within the Postgres command line, it works OK?
I noticed your error references SQLAlchemy (sqlalchemy.exc.IntegrityError), so I think it may be that and not PostgreSQL. Within a transaction, SQLAlchemy's Unit of Work pattern can re-order SQL statements for performance reasons.
The only ref I could find was here https://github.com/sqlalchemy/sqlalchemy/issues/5735#issuecomment-735939061 :
if there are no dependency cycles between the target tables, the flush proceeds as follows:
<snip/>
a. within a particular table, INSERT operations are processed in the order in which objects were add()'ed
b. within a particular table, UPDATE and DELETE operations are processed in primary key order
So if you have the following within a single transaction:
INSERT x
DELETE x
INSERT x
when you commit it, it's probably getting reordered as:
INSERT x
INSERT x
DELETE x
I have more experience with this problem in Java/Hibernate, but the SQLAlchemy docs do claim its unit of work pattern is "Modeled after Fowler's 'Unit of Work' pattern as well as Hibernate, Java's leading object-relational mapper", so it is probably relevant here too.
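In plain SQL terms, the reordered flush boils down to something like this (a sketch against a hypothetical some_users table with a unique email column, not the actual statements SQLAlchemy emits):
BEGIN;
INSERT INTO some_users (email) VALUES ('mail3@gmail.com');
INSERT INTO some_users (email) VALUES ('mail3@gmail.com');  -- fails here: duplicate key violates the unique constraint
DELETE FROM some_users WHERE email = 'mail3@gmail.com';     -- never reached
COMMIT;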
To supplement Ed Brook's insightful answer, you can work around the problem by flushing the session after deleting the record:
with Session() as s, s.begin():
    u = s.scalars(sa.select(User).where(User.user == 'a')).first()
    s.delete(u)
    s.flush()
    s.add(User(user='a'))
Another solution would be to use a deferred constraint, so that the state of the index is not evaluated until the end of the transaction:
class User(Base):
    ...
    __table_args__ = (
        sa.UniqueConstraint('user', deferrable=True, initially='deferred'),
    )
but note, from the PostgreSQL documentation:
deferrable constraints cannot be used as conflict arbitrators in an INSERT statement that includes an ON CONFLICT DO UPDATE clause.
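At the SQL level, the deferrable variant corresponds roughly to the following sketch, here written for the email constraint from the question (constraint and table names are taken from the \d output above; adjust as needed):
ALTER TABLE bondusers DROP CONSTRAINT bondusers_email_key;
ALTER TABLE bondusers
    ADD CONSTRAINT bondusers_email_key UNIQUE (email)
    DEFERRABLE INITIALLY DEFERRED;
-- With INITIALLY DEFERRED, uniqueness is only checked at COMMIT, so an
-- INSERT / DELETE / INSERT of the same value within one transaction succeeds.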

How to place a ForeignKey constraint on a JSON column with sqlalchemy?

I'm building an application where the user can add arbitrary fields to a piece of equipment.
These fields are modeled as a Field object table in the database. The Equipment table has a JSONB "data" column, where the key is the field name and the value is the property value. However, this currently isn't enforced by the database. I can't use a traditional extra link table (whatever they're actually called) because the property needs to have an actual value associated with it.
To clarify, the constraint that I want to enforce is that all of the keys to data must exist in field.name.
Field table:
| name (PrimaryKey) | type (enum) | extra (JSONB) |
-------------------------------------------------------------------
| ram | 'float' | {"unit": "GB"} |
| cpus | 'integer' | NULL |
| model | 'enum' | {"values": ["DL385", "R710"]} |
Equipment table:
| id | ... | data (JSONB) |
------------------------------------------------------------------------------------------
| 1e2bda05-2457-48df-8a8b-4e54a0176c57 | ... | {"ram": 100, "model": "DL385"} |
| b65e2547-a4b5-4121-bed3-10f1a026f88a | ... | {"ram": 512, "cpus": 2, "model": "R710"} |
Python code:
class Equipment(Base):
    __tablename__ = 'equipment'

    id = Column(UUID, primary_key=True, default=lambda: str(uuid.uuid4()))
    parent_rack = Column(UUID, ForeignKey('rack.id'), nullable=False)
    data = Column(JSONB, nullable=False)

    @validates('data')
    def validate_data(self, _, data):
        s = Session().session
        for key in data:
            if len(s.query(Field).filter_by(name=key).all()) != 1:
                raise ValueError('extra data must be a registered field')
        return data


class Field(Base):
    __tablename__ = 'field'

    name = Column(String, primary_key=True, nullable=False, unique=True)
    type = Column(Enum(FieldType), nullable=False)
    extra = Column(JSONB)

    @validates('extra')
    def validate_extra(self, _, extra):
        self.type.validate_extra(extra)
        return extra
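A plain foreign key cannot reference individual JSONB keys, so if the check has to live in the database, one possible sketch is a trigger that rejects unregistered keys (PostgreSQL 11+ syntax; this is only an illustration of the idea, and unlike a real foreign key it does not protect against later deletes from field):
-- Reject equipment rows whose data contains a key not registered in field.name
CREATE OR REPLACE FUNCTION check_equipment_data_keys() RETURNS trigger AS $$
BEGIN
    IF EXISTS (
        SELECT 1
        FROM jsonb_object_keys(NEW.data) AS k
        WHERE k NOT IN (SELECT name FROM field)
    ) THEN
        RAISE EXCEPTION 'equipment.data contains an unregistered field key';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER equipment_data_keys_check
    BEFORE INSERT OR UPDATE ON equipment
    FOR EACH ROW EXECUTE FUNCTION check_equipment_data_keys();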

generating nil id having id sequence defined, Why?

I have a table named async_data whose id column has auto-increment defined. But in production I am seeing some insert queries fail with:
PG::NotNullViolation: ERROR: null value in column "id" violates not-null constraint
Rails Migration file
class CreateAsyncData < ActiveRecord::Migration[5.0]
  def change
    create_table :async_data do |t|
      t.integer :request_id
      t.integer :sourced_contact_id
      t.integer :data_source_id
      t.boolean :is_enriched
      t.column :requested_params, :json
      t.text :q
      t.datetime :fetched_at
      t.timestamps
    end
  end
end
CREATE TABLE public.async_data (
    id integer NOT NULL,
    request_id integer,
    sourced_contact_id integer,
    data_source_id integer,
    is_enriched boolean DEFAULT false,
    requested_params json,
    fetched_at timestamp without time zone,
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL,
    q text,
    is_processed boolean DEFAULT false NOT NULL,
    is_data_pushed boolean DEFAULT false NOT NULL
);
\d async_data;
Table "public.async_data"
Column | Type | Collation | Nullable | Default
-------------------+-----------------------------+-----------+----------+----------------------------------------------------
id | integer | | not null | nextval('async_data_id_seq'::regclass)
request_id | integer | | |
source_company_id | integer | | |
data_source_id | integer | | |
is_enriched | boolean | | |
requested_params | json | | |
q | text | | |
fetched_at | timestamp without time zone | | |
created_at | timestamp without time zone | | not null |
updated_at | timestamp without time zone | | not null |
Indexes:
"async_data_pkey" PRIMARY KEY, btree (id)
--
-- Name: async_data_id_seq; Type: SEQUENCE; Schema: public; Owner: -
--
CREATE SEQUENCE public.async_data_id_seq
    AS integer
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;
I want to reproduce the same error in a dev environment and want to know why a nil id was generated.
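Two hedged sketches for reproducing that error locally (assumptions, not a diagnosis of the production cause): either the INSERT supplies an explicit NULL for id, which bypasses the column default, or the column has lost its default.
-- 1) An explicit NULL bypasses the column default entirely
--    (other NOT NULL columns filled in, the rest omitted for brevity):
INSERT INTO async_data (id, created_at, updated_at)
VALUES (NULL, now(), now());
-- ERROR: null value in column "id" violates not-null constraint

-- 2) If the default were missing (the CREATE TABLE dump above shows none),
--    an INSERT that omits id fails the same way; the default can be restored with:
ALTER TABLE public.async_data
    ALTER COLUMN id SET DEFAULT nextval('public.async_data_id_seq'::regclass);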

cassandra 2.0.7 cql SELECT Specific Value from map

ALTER TABLE users ADD todo map<text, text>;
UPDATE users SET todo = { '1':'1111', '2':'2222', '3':'3' ,.... } WHERE user_id = 'frodo';
Now I want to run the following CQL, but it failed. Is there any other method?
SELECT user_id, todo['1'] FROM users WHERE user_id = 'frodo';
PS:
The length of my map can change, for example: { '1':'1111', '2':'2222', '3':'3' } or { '1':'1111', '2':'2222', '3':'3', '4':'4444' } or { '1':'1111', '2':'2222', '3':'3', '4':'4444', ... }
If you want to use a map collection, you'll have the limitation that you can only select the collection as a whole (docs).
I think you could use the suggestion from the referenced question, even if the length of your map changes. If you store those key/value pairs for each user_id in separate fields, and make your primary key based on user_id and todo_k, you'll have access to them in the select query.
For example:
CREATE TABLE users (
    user_id text,
    todo_k text,
    todo_v text,
    PRIMARY KEY (user_id, todo_k)
);
-----------------------------
| user_id | todo_k | todo_v |
-----------------------------
| frodo | 1 | 1111 |
| frodo | 2 | 2222 |
| sam | 1 | 11 |
| sam | 2 | 22 |
| sam | 3 | 33 |
-----------------------------
Then you can do queries like:
select user_id, todo_k, todo_v from users where user_id = 'frodo';
select user_id, todo_k, todo_v from users where user_id = 'sam' and todo_k = '2';