Updating two tables with same data where there is no relationship (PK,FK) between those tables

update Claim
set first_name = random_name(7),
Last_name = random_name(6),
t2.first_name=random_name(7),
t2.last_name=random_name(6)
from Claim t1
inner join tbl_ecpremit t2
on t1.first_name = t2.first_name
I am getting the error below:
column "t2" of relation "claim" does not exist

You can do this with a so-called data-modifying CTE:
WITH c AS (
UPDATE claim SET first_name = random_name(7), last_name = random_name(6)
WHERE <put your condition here>
RETURNING *
)
UPDATE tbl_ecpremit SET last_name = c.last_name
FROM c
WHERE first_name = c.first_name;
This assumes that random_name() is a function you define; as far as I know, it is not part of PostgreSQL.
The nifty trick here is that the UPDATE in the WITH query returns the updated record in the first table using the RETURNING clause. You can then use that record in the main UPDATE statement to have exactly the same data in the second table.
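This is all very precarious though, because you are both joining on and overwriting the "first_name" column with a random value: RETURNING yields the new first_name, which the rows in tbl_ecpremit do not contain yet, so the second UPDATE will generally find nothing to match. In real life this will only work well if you have some more logic regarding the names and conditions, such as joining on a column that the update leaves unchanged.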
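One way around the caveat above is to generate each replacement once and apply it to both tables keyed on the old value. The sketch below is not the data-modifying CTE itself but a client-side alternative; it uses Python's sqlite3 module only so that it runs standalone, and the Python random_name() is a hypothetical stand-in for the function in the question.

```python
import random
import sqlite3
import string


def random_name(length, rng):
    """Hypothetical stand-in for the question's random_name() function."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def anonymize(conn, rng):
    # Collect the first names present in both tables before changing anything.
    old_names = [row[0] for row in conn.execute(
        "SELECT first_name FROM claim "
        "INTERSECT SELECT first_name FROM tbl_ecpremit")]
    # One replacement pair per old name, so both tables receive identical values.
    # Assumption: generated names do not collide with names still to be replaced.
    mapping = {old: (random_name(7, rng), random_name(6, rng)) for old in old_names}
    with conn:  # single transaction: both tables change or neither does
        for old, (first, last) in mapping.items():
            for table in ("claim", "tbl_ecpremit"):
                conn.execute(
                    f"UPDATE {table} SET first_name = ?, last_name = ? "
                    "WHERE first_name = ?", (first, last, old))
    return mapping


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claim (first_name TEXT, last_name TEXT);
CREATE TABLE tbl_ecpremit (first_name TEXT, last_name TEXT);
INSERT INTO claim VALUES ('Alice', 'Smith'), ('Bob', 'Jones');
INSERT INTO tbl_ecpremit VALUES ('Alice', 'Smith'), ('Carol', 'White');
""")

mapping = anonymize(conn, random.Random(0))
claim_rows = conn.execute(
    "SELECT first_name, last_name FROM claim ORDER BY rowid").fetchall()
remit_rows = conn.execute(
    "SELECT first_name, last_name FROM tbl_ecpremit ORDER BY rowid").fetchall()
```

Only the name that occurs in both tables is replaced, and both tables end up with the same pair for it; rows without a counterpart are left alone.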

Related

Using ANY with raw data work but not subquery

I just can't figure out why this query works:
SELECT id, name, organization_id
FROM facilities
WHERE organization_id = ANY(
'{abc-xyz-123,678-ght-nmp}'
)
But this query fails with the error operator does not exist: uuid = uuid[]
SELECT id, name, organization_id
FROM facilities
WHERE organization_id = ANY(
SELECT organization_ids
FROM admins
WHERE id = 'jkl-iop-345'
)
This happens even though the subquery
SELECT organization_ids
FROM admins
WHERE id = 'jkl-iop-345'
gives exactly the result {abc-xyz-123,678-ght-nmp}.
I'm using postgres (PostgreSQL) 13.3

The subquery produces one row that contains an array.
If you use = ANY (SELECT ...), organization_id is compared with each row the subquery returns rather than with each element of the array in that row, so it is as if you had written
{{abc-xyz-123,678-ght-nmp}}
which is an array of arrays.
You probably want
SELECT id, name, organization_id
FROM facilities
WHERE EXISTS (SELECT 1 FROM admins
WHERE admins.id = 'jkl-iop-345'
AND facilities.organization_id = ANY (admins.organization_ids)
);
Let me remark that storing references to other tables in an array, JSON or other composite data type is an exceptionally bad idea. A normalized schema with a junction table would serve you better.
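As a rough illustration of the junction-table suggestion, here is a minimal sketch. It uses Python's sqlite3 module so that it runs standalone, stores the ids as TEXT rather than uuid, and the table name admin_organizations and the facility names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admins (id TEXT PRIMARY KEY);
CREATE TABLE facilities (
    id INTEGER PRIMARY KEY,
    name TEXT,
    organization_id TEXT
);
-- Hypothetical junction table: one row per (admin, organization) pair
-- instead of an organization_ids array column on admins.
CREATE TABLE admin_organizations (
    admin_id TEXT NOT NULL REFERENCES admins(id),
    organization_id TEXT NOT NULL,
    PRIMARY KEY (admin_id, organization_id)
);
INSERT INTO admins VALUES ('jkl-iop-345');
INSERT INTO admin_organizations VALUES
    ('jkl-iop-345', 'abc-xyz-123'),
    ('jkl-iop-345', '678-ght-nmp');
INSERT INTO facilities VALUES
    (1, 'North Clinic', 'abc-xyz-123'),
    (2, 'South Clinic', '678-ght-nmp'),
    (3, 'Other Clinic', 'zzz-zzz-000');
""")

# With the junction table the lookup is an ordinary join; no array handling.
rows = conn.execute("""
    SELECT f.id, f.name, f.organization_id
    FROM facilities f
    JOIN admin_organizations ao ON ao.organization_id = f.organization_id
    WHERE ao.admin_id = ?
    ORDER BY f.id
""", ("jkl-iop-345",)).fetchall()
```

The same join works unchanged in PostgreSQL, where the junction table's columns can also carry real foreign keys and indexes.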

Multipart column names in update statement with select

I'm sure this could be a duplicate but I can't seem to find the right search phrase.
A table in a named schema (i.e. not dbo) requires you to include the schema name in the statement. So previously I'd have simply written it like so:
UPDATE [Schema].[Table1]
SET [AColumn] =
(
SELECT [SomeColumn]
FROM [Schema].[Table2]
WHERE [Schema].[Table2].[SameColumnName] = [Schema].[Table1].[SameColumnName]
);
But since column names with more than two parts are deprecated, I need to find a new, future-proof way to do this. I have come up with two options. The first uses an alias:
UPDATE [Alias1]
SET [AColumn] =
(
SELECT [SomeColumn]
FROM [Schema].[Table2] [Alias2]
WHERE [Alias2].[SameColumnName] = [Alias1].[SameColumnName]
)
FROM [Schema].[Table1] [Alias1];
The second way is the one where I'm really having trouble finding out whether it's truly valid T-SQL:
UPDATE [Schema].[Table1]
SET [AColumn] =
(
SELECT [SomeColumn]
FROM [Schema].[Table2]
WHERE [Table2].[SameColumnName] = [Table1].[SameColumnName]
);
I have tested both and they work. So my question is: is the second form completely valid, and is it normal to use just the table name without the schema here, or should I rather opt for the slightly more verbose alias?

As I said in my comment, alias your objects.
SELECT MT.MyColumn,
YT.MyColumn
FROM dbo.MyTable MT
JOIN so.YourTable YT ON MT.ID = YT.fID
WHERE YT.[name] = N'Jane';
If you're performing an UPDATE, then specify the alias of the object to update:
UPDATE MT
SET MyColumn = YT.MyColumn --Column on the left side of the SET will always reference the table being updated
FROM dbo.MyTable MT
JOIN so.YourTable YT ON MT.ID = YT.fID
WHERE YT.[name] = N'Jane';

I'm trying to insert tuples into a table A (from table B) if the primary key of the table B tuple doesn't exist in tuple A

Here is what I have so far:
INSERT INTO Tenants (LeaseStartDate, LeaseExpirationDate, Rent, LeaseTenantSSN, RentOverdue)
SELECT CURRENT_DATE, NULL, NewRentPayments.Rent, NewRentPayments.LeaseTenantSSN, FALSE from NewRentPayments
WHERE NOT EXISTS (SELECT * FROM Tenants, NewRentPayments WHERE NewRentPayments.HouseID = Tenants.HouseID AND
NewRentPayments.ApartmentNumber = Tenants.ApartmentNumber)
So, HouseID and ApartmentNumber together make up the primary key. If there is a tuple in table B (NewRentPayments) that doesn't exist in table A (Tenants) based on the primary key, then it needs to be inserted into Tenants.
The problem is, when I run my query, it doesn't insert anything (I know for a fact there should be 1 tuple inserted). I'm at a loss, because it looks like it should work.
Thanks.

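Your subquery is not correlated with the outer query. Because it lists NewRentPayments in its own FROM clause, it is a self-contained join that returns rows whenever any payment matches any tenant, so NOT EXISTS is false for every candidate row and nothing is inserted.
As per the description of your problem, you don't need this join.
Try this: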
insert into Tenants (LeaseStartDate, LeaseExpirationDate, Rent, LeaseTenantSSN, RentOverdue)
select current_date, null, p.Rent, p.LeaseTenantSSN, FALSE
from NewRentPayments p
where not exists (
select *
from Tenants t
where p.HouseID = t.HouseID
and p.ApartmentNumber = t.ApartmentNumber
)
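The difference between the two subqueries can be reproduced on a toy data set. This is a sketch with Python's sqlite3 module and a trimmed-down, assumed schema (only the key columns and Rent); it selects the rows that would be inserted instead of inserting them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tenants (
    HouseID INTEGER, ApartmentNumber INTEGER, Rent INTEGER,
    PRIMARY KEY (HouseID, ApartmentNumber)
);
CREATE TABLE NewRentPayments (
    HouseID INTEGER, ApartmentNumber INTEGER, Rent INTEGER
);
INSERT INTO Tenants VALUES (1, 1, 900);
INSERT INTO NewRentPayments VALUES (1, 1, 900), (1, 2, 950);
""")

# Original form: the inner FROM brings in its own NewRentPayments, so the
# subquery never looks at the outer row. It finds the (1, 1) match every
# time, and NOT EXISTS rejects every payment.
uncorrelated = conn.execute("""
    SELECT HouseID, ApartmentNumber FROM NewRentPayments
    WHERE NOT EXISTS (
        SELECT * FROM Tenants, NewRentPayments
        WHERE NewRentPayments.HouseID = Tenants.HouseID
          AND NewRentPayments.ApartmentNumber = Tenants.ApartmentNumber
    )
""").fetchall()

# Corrected form: p refers to the outer row, so the check is per payment.
correlated = conn.execute("""
    SELECT p.HouseID, p.ApartmentNumber FROM NewRentPayments p
    WHERE NOT EXISTS (
        SELECT * FROM Tenants t
        WHERE p.HouseID = t.HouseID
          AND p.ApartmentNumber = t.ApartmentNumber
    )
""").fetchall()
```

With this data the uncorrelated version selects nothing, while the correlated version selects only the payment for apartment (1, 2), which has no tenant row yet.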

SQL update statements updates wrong fields

I have the following code in Postgres
select op.url from identity.legal_entity le
join identity.profile op on le.legal_entity_id =op.legal_entity_id
where op.global_id = '8wyvr9wkd7kpg1n0q4klhkc4g'
which returns 1 row.
Then I try to update the url field with the following:
update identity.profile
set url = 'htpp:sam'
where identity.profile.url in (
select op.url from identity.legal_entity le
join identity.profile op on le.legal_entity_id =op.legal_entity_id
where global_id = '8wyvr9wkd7kpg1n0q4klhkc4g'
);
But the above ends up updating more than one row; in fact it updates all of the rows of the identity.profile table.
I would assume that since the first statement returns one row, at most one row can be updated, but all of the rows are being updated. Why? Please help a newbie fix the update statement above.

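Instead of using profile.url to identify the row you want to update, use the primary key. That is what it is there for. The url value is not necessarily unique, so every row that happens to share the url returned by the subquery also matches the IN condition.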
So if the primary key column is called id, the statement could be modified to:
UPDATE identity.profile
SET ...
WHERE identity.profile.id IN (SELECT op.id FROM ...);
But you can do this much more simply in PostgreSQL with
UPDATE identity.profile op
SET url = 'htpp:sam'
FROM identity.legal_entity le
WHERE le.legal_entity_id = op.legal_entity_id
AND op.global_id = '8wyvr9wkd7kpg1n0q4klhkc4g';
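To see why matching on url can touch more rows than the SELECT returned, here is a small sketch using Python's sqlite3 module. It assumes, as a guess at what happened in the question, that several profile rows share the same url; the schema prefix is dropped because SQLite has no schemas, and the id column and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE legal_entity (legal_entity_id INTEGER PRIMARY KEY);
CREATE TABLE profile (
    id INTEGER PRIMARY KEY,
    legal_entity_id INTEGER,
    global_id TEXT,
    url TEXT
);
INSERT INTO legal_entity VALUES (1), (2), (3);
INSERT INTO profile VALUES
    (1, 1, 'aaa', 'http://old'),
    (2, 2, 'bbb', 'http://old'),
    (3, 3, 'ccc', 'http://old');
""")

# Matching on url: the subquery returns one url, but three rows carry it.
by_url = conn.execute("""
    UPDATE profile SET url = 'http://new'
    WHERE url IN (
        SELECT op.url FROM legal_entity le
        JOIN profile op ON le.legal_entity_id = op.legal_entity_id
        WHERE op.global_id = 'aaa'
    )
""").rowcount
conn.rollback()  # undo the over-broad update before trying the fix

# Matching on the primary key: only the intended row is updated.
by_key = conn.execute("""
    UPDATE profile SET url = 'http://new'
    WHERE id IN (
        SELECT op.id FROM legal_entity le
        JOIN profile op ON le.legal_entity_id = op.legal_entity_id
        WHERE op.global_id = 'aaa'
    )
""").rowcount
conn.commit()

final_urls = conn.execute("SELECT id, url FROM profile ORDER BY id").fetchall()
```

Here by_url counts every row sharing the url, whereas by_key counts only the row whose global_id was selected.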

Select most reviewed courses starting from courses having at least 2 reviews

I'm using Flask-SQLAlchemy with PostgreSQL. I have the following two models:
class Course(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    course_name = db.Column(db.String(120))
    course_description = db.Column(db.Text)
    course_reviews = db.relationship('Review', backref='course', lazy='dynamic')

class Review(db.Model):
    __table_args__ = (db.UniqueConstraint('course_id', 'user_id'), {})
    id = db.Column(db.Integer, primary_key=True)
    review_date = db.Column(db.DateTime)  # default=db.func.now()
    review_comment = db.Column(db.Text)
    rating = db.Column(db.SmallInteger)
    course_id = db.Column(db.Integer, db.ForeignKey('course.id'))
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))
I want to select the most reviewed courses, starting with those that have at least two reviews. The following SQLAlchemy query worked fine with SQLite:
most_rated_courses = db.session.query(models.Review, func.count(models.Review.course_id)).\
    group_by(models.Review.course_id).\
    having(func.count(models.Review.course_id) > 1).\
    order_by(func.count(models.Review.course_id).desc()).all()
But when I switched to PostgreSQL in production it gives me the following error:
ProgrammingError: (ProgrammingError) column "review.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT review.id AS review_id, review.review_date AS review_...
^
'SELECT review.id AS review_id, review.review_date AS review_review_date, review.review_comment AS review_review_comment, review.rating AS review_rating, review.course_id AS review_course_id, review.user_id AS review_user_id, count(review.course_id) AS count_1 \nFROM review GROUP BY review.course_id \nHAVING count(review.course_id) > %(count_2)s ORDER BY count(review.course_id) DESC' {'count_2': 1}
I tried to fix the query by adding models.Review in the GROUP BY clause but it did not work:
most_rated_courses = db.session.query(models.Review, func.count(models.Review.course_id)).\
    group_by(models.Review.course_id).\
    having(func.count(models.Review.course_id) > 1).\
    order_by(func.count(models.Review.course_id).desc()).all()
Can anyone please help me with this issue? Thanks a lot.

SQLite and MySQL (unless its ONLY_FULL_GROUP_BY mode is enabled) both allow a query that has aggregates (like count()) without applying GROUP BY to all other columns. In terms of standard SQL this is invalid, because if more than one row is present in an aggregated group, the database has to pick the first one it sees for return, which is essentially random.
So your query for Review basically returns to you the first "Review" row for each distinct course id - like for course id 3, if you had seven "Review" rows, it's just choosing an essentially random "Review" row within the group of "course_id=3". I gather the answer you really want, "Course", is available here because you can take that semi-randomly selected Review object and just call ".course" on it, giving you the correct Course, but this is a backwards way to go.
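The permissive behavior described above is easy to observe with Python's built-in sqlite3 module. The following is a small sketch on an assumed, trimmed-down review table; it issues roughly the SQL that the original query generated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE review (
    id INTEGER PRIMARY KEY,
    course_id INTEGER,
    rating INTEGER
);
INSERT INTO review (course_id, rating) VALUES (3, 5), (3, 1), (3, 4), (7, 2);
""")

# id and rating are neither grouped nor aggregated. SQLite accepts this and
# fills them in from some row of each group; which row is not something to
# rely on.
rows = conn.execute("""
    SELECT id, rating, course_id, count(course_id)
    FROM review
    GROUP BY course_id
    HAVING count(course_id) > 1
""").fetchall()
```

The course_id and the count in each result row are well defined, while id and rating come from an arbitrary review of that course. PostgreSQL rejects the same statement with the "must appear in the GROUP BY clause" error shown in the question.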
But once you get on a proper database like PostgreSQL you need to use correct SQL. The data you need from the "review" table is just the course_id and the count, nothing else, so query just for that (for now, assume we don't need to display the counts; we'll get to that in a minute):
most_rated_course_ids = session.query(
    Review.course_id,
).\
    group_by(Review.course_id).\
    having(func.count(Review.course_id) > 1).\
    order_by(func.count(Review.course_id).desc()).\
    all()
but that's not your Course object - you want to take that list of ids and apply it to the course table. We first need to keep our list of course ids as a SQL construct, instead of loading the data - that is, turn it into a derived table by converting the query into a subquery (change the word .all() to .subquery()):
most_rated_course_id_subquery = session.query(
    Review.course_id,
).\
    group_by(Review.course_id).\
    having(func.count(Review.course_id) > 1).\
    order_by(func.count(Review.course_id).desc()).\
    subquery()
one simple way to link that to Course is to use an IN:
courses = session.query(Course).filter(
Course.id.in_(most_rated_course_id_subquery)).all()
but that's essentially going to throw away the "ORDER BY" you're looking for and also doesn't give us any nice way of actually reporting on those counts along with the course results. We need to have that count along with our Course so that we can report it and also order by it. For this we use a JOIN from the "course" table to our derived table. SQLAlchemy is smart enough to know to join on the "course_id" foreign key if we just call join():
courses = session.query(Course).join(most_rated_course_id_subquery).all()
then to get at the count, we need to add that to the columns returned by our subquery along with a label so we can refer to it:
most_rated_course_id_subquery = session.query(
    Review.course_id,
    func.count(Review.course_id).label("count")
).\
    group_by(Review.course_id).\
    having(func.count(Review.course_id) > 1).\
    subquery()

courses = session.query(
    Course, most_rated_course_id_subquery.c.count
).join(
    most_rated_course_id_subquery
).order_by(
    most_rated_course_id_subquery.c.count.desc()
).all()
A great article I like to point out to people about GROUP BY and this kind of query is "SQL GROUP BY techniques", which points out the common need for the "select from A join to (subquery of B with aggregate/GROUP BY)" pattern.
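Written out as plain SQL, the "select from A join to (subquery of B with aggregate/GROUP BY)" pattern looks roughly like the sketch below. It runs against an in-memory SQLite database through Python's sqlite3 module; the sample courses are made up and the SQL is hand-written to show the shape of the query, not the exact statement SQLAlchemy emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (id INTEGER PRIMARY KEY, course_name TEXT);
CREATE TABLE review (
    id INTEGER PRIMARY KEY,
    course_id INTEGER REFERENCES course(id),
    user_id INTEGER
);
INSERT INTO course VALUES (1, 'Python'), (2, 'SQL'), (3, 'Flask');
INSERT INTO review (course_id, user_id) VALUES
    (1, 10), (1, 11),
    (2, 10), (2, 11), (2, 12),
    (3, 10);
""")

# The derived table r aggregates reviews per course; the outer query joins
# it to course, so every selected column is either a course column or an
# aggregate computed in the subquery.
most_reviewed = conn.execute("""
    SELECT c.course_name, r.review_count
    FROM course c
    JOIN (
        SELECT course_id, count(*) AS review_count
        FROM review
        GROUP BY course_id
        HAVING count(*) > 1
    ) r ON r.course_id = c.id
    ORDER BY r.review_count DESC
""").fetchall()
```

Because the aggregate lives in the derived table, the outer query needs no GROUP BY at all, which is why the same statement is also acceptable to PostgreSQL.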
