How to run knex queries and insert in band - postgresql

I have an SQL table like this:
// Create table for albums - photos relation with ordering
await knex.schema.createTable('albums_photos', (table) => {
  table.integer('album')
    .notNullable()
    .references('pk')
    .inTable('albums')
    .onUpdate('CASCADE')
    .onDelete('CASCADE')
  table.integer('photo')
    .notNullable()
    .references('pk')
    .inTable('photos')
    .onUpdate('CASCADE')
    .onDelete('CASCADE')
  // Photo's order in its album
  table.integer('order').notNullable()
})
I have not added the DB constraint yet, but album and order must be unique together.
When I add a new entry, I want order to be equal to the max value of order for the given album, plus 1:
const createAlbumPhoto = async (albumPk, photoPk) => {
  console.log(`Work on photo ${photoPk}`)
  // We get the current max value of order for this album
  const maxData = await knex('albums_photos')
    .where({ album: albumPk })
    .max('order')
    .first()
  console.log(`Get max ${maxData.max} for photo ${photoPk}`)
  // We create the new entry with max + 1
  const newAlbumPhoto = await knex('albums_photos')
    .insert({
      album: albumPk,
      photo: photoPk,
      order: maxData.max + 1
    })
    .returning('*')
  console.log(`pk ${photoPk} done with order ${newAlbumPhoto[0].order}`)
  return newAlbumPhoto[0]
}
When I insert only one album photo once in a while, everything works fine.
But when the (web) client sends multiple requests one after another, the asynchronous DB requests interleave and I get bad results. A list of photo pks ['71', '72', '73', '74'] results in:
work on pk 71 (OK)
Get max null for photo 71 (OK)
pk 71 done with order 1 (OK)
work on pk 72
work on pk 73
Get max 1 for photo 72 (OK)
Get max 1 for photo 73 (Wrong)
work on pk 74
Get max 1 for photo 74 (Wrong)
pk 72 done with order 2 (OK)
pk 73 done with order 2 (Wrong)
pk 74 done with order 2 (Wrong)
Orders aren't unique…
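One way to close the gap between the two queries is to compute the max inside a single INSERT ... SELECT, issued for example through knex.raw. A sketch (untested; :album and :photo stand for bind parameters, and "order" must be quoted because it is a reserved word):

INSERT INTO albums_photos (album, photo, "order")
SELECT :album, :photo, COALESCE(MAX("order"), 0) + 1
FROM albums_photos
WHERE album = :album
RETURNING *;

Two fully concurrent transactions can still compute the same max, so the UNIQUE (album, "order") constraint mentioned above is worth adding anyway: conflicting inserts then fail loudly and can be retried instead of silently duplicating an order.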

Related

Postgres get related post with fallback

I wanted to get related posts from my table, so I tried some code like below:
"SELECT post,category FROM library WHERE title LIKE '%" + query + "%' LIMIT 20"
Now sometimes it returns 20 results, but sometimes fewer than 20. When the response has fewer than 20 rows, I need to fill it with random posts from a category I mention, for example something like below:
SELECT post,category FROM library WHERE category = 'php' OFFSET floor(random()*20) LIMIT 20;
For example, if my search query returns 5 results, it should get 15 random posts based on my 2nd query.
Perhaps you can use UNION ALL?
SELECT * FROM (
  (SELECT post, category
   FROM library
   WHERE title LIKE '%" + query + "%' LIMIT 20)
  UNION ALL
  (SELECT post, category
   FROM library
   WHERE category = 'php' OFFSET floor(random()*20) LIMIT 20)
) q LIMIT 20;
Note the alias q on the derived table; Postgres requires one.
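If the title matches should always come first (the outer query's row order is not guaranteed otherwise), here is a variant sketch with an explicit rank column, using a bind parameter $1 instead of string concatenation:

SELECT post, category FROM (
  (SELECT post, category, 1 AS rank
   FROM library
   WHERE title LIKE '%' || $1 || '%' LIMIT 20)
  UNION ALL
  (SELECT post, category, 2 AS rank
   FROM library
   WHERE category = 'php' ORDER BY random() LIMIT 20)
) q
ORDER BY rank
LIMIT 20;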

Is there a way to use `COUNT(DISTINCT)` in EF Core 3.1?

I have a table with the following data
ID DateColumn Amount
1 2021-01-25 50
2 2021-01-24 10
1 2021-01-25 100
I need the following output,
ID DayCount TotalAmount
1 1 150
2 1 10
I'm trying to write a lambda expression that would generate the following SQL query:
select ID, Count(distinct DateColumn) as DayCount, Sum(Amount) as TotalAmount
from test
group by id
I've written the following expression:
return await _context.Tests
    .GroupBy(g => g.id)
    .Select(s => new
    {
        Data = s.Key,
        Count = s.Select(t => t.DateColumn).Distinct().Count()
    }).ToListAsync();
and it throws an InvalidOperationException.
Note that in your sample data both rows for ID 1 share the same DateColumn, so a distinct day count of 1 is consistent; DayCount would only be 2 if the two dates differed.
The table design itself looks like it could use some rearchitecting, since the ID column contains duplicates.
Regardless of structure, you can use an inner GROUP BY to achieve similar results:
return await _context.Tests
    .GroupBy(g => g.id)
    .Select(s => new
    {
        Data = s.Key,
        Count = s.GroupBy(t => t.DateColumn).Select(g => g.Key).Count()
    })
    .ToListAsync();
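Relationally, the inner GroupBy corresponds to counting per-date groups, which is exactly what COUNT(DISTINCT) computes; roughly this sketch against the test table from the question:

select ID, count(*) as DayCount
from (select ID, DateColumn from test group by ID, DateColumn) d
group by ID;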

sql code to compare previous row in a series

So I have a table that has an ArticleID (GUID), RevisionNumber (integer), and StatusCode (text).
An article can have any number of revisions, but each time a new revision is created, the StatusCode of the previous revision should become "Revised", while the newest revision's StatusCode can be "Active", "Draft", or "Canceled". However, the data is messed up and I need to identify which records (out of hundreds of thousands) do not have the correct status.
Sample data:
Article ID RevisionNumber StatusCode
========== ============== ==========
xx-xxxx-xx 7 Active
xx-xxxx-xx 6 Revised
xx-xxxx-xx 5 Active
xx-xxxx-xx 4 Draft
xx-xxxx-xx 3 Revised
xx-xxxx-xx 2 Active
xx-xxxx-xx 1 Revised
xx-xxxx-xx 0 Revised
xx-yyyy-yy 1 Active
xx-yyyy-yy 0 Active
In the above scenario, I would need to know that xx-xxxx-xx revisions 5, 4, and 2 do not have the proper status and that xx-yyyy-yy revision 0 is incorrect. How could I get this information from a SQL query using SQL Server 2012?
You want to identify any revision that is not "Revised" when a higher-numbered revision exists.
Then it is just a matter of knowing what the latest revision is, and MAX() OVER can do that.
SELECT ArticleID, RevisionNumber, StatusCode
FROM
(
SELECT ArticleID, RevisionNumber, StatusCode
, MAX(RevisionNumber) OVER (PARTITION BY ArticleID) AS MaxRevisionNumber
FROM YourTable
) q
WHERE (RevisionNumber < MaxRevisionNumber AND StatusCode != 'Revised')
You can do this with a left join -- for each record we look for the next revision -- like this:
SELECT base.*
FROM table_you_did_not_name base
LEFT JOIN table_you_did_not_name next ON base.ArticleID = next.ArticleID AND base.RevisionNumber + 1 = next.RevisionNumber
WHERE base.StatusCode <> 'Revised' AND next.ArticleID IS NOT NULL
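SQL Server 2012 also supports LEAD, so the same check can be written with a window function; a sketch against the same YourTable placeholder:

SELECT ArticleID, RevisionNumber, StatusCode
FROM
(
    SELECT ArticleID, RevisionNumber, StatusCode
    , LEAD(RevisionNumber) OVER (PARTITION BY ArticleID ORDER BY RevisionNumber) AS NextRevisionNumber
    FROM YourTable
) q
WHERE NextRevisionNumber IS NOT NULL AND StatusCode <> 'Revised'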

Update rows returned by a complex SQL query with data from query result

I have a multi-table join and want to update a table based on the result of that join. The join table produces both the scope of the update (only those rows whose effort.id appears in the result should be updated) and the data for the update (a new column should be set to the value of a calculated column).
I've made progress but can't quite make it work. Here's my statement:
UPDATE
efforts
SET
dropped_int = jt.split
FROM
(
SELECT
ef.id,
s.id split,
s.kind,
s.distance_from_start,
s.sub_order,
max(s.distance_from_start + s.sub_order)
OVER (PARTITION BY ef.id) AS max_dist
FROM
split_times st
LEFT JOIN splits s ON s.id = st.split_id
LEFT JOIN efforts ef ON ef.id = st.effort_id
) jt
WHERE
((jt.distance_from_start + jt.sub_order) = max_dist)
AND
kind <> 1;
The SELECT produces the correct join table:
id split kind dfs sub max_dist dropped dropped_int
403 33 2 152404 1 152405 TRUE 33
404 33 2 152404 1 152405 TRUE 33
405 31 2 143392 1 143393 TRUE 33
406 31 2 143392 1 143393 TRUE 33
407 29 2 132127 1 132128 TRUE 33
408 29 2 132127 1 132128 TRUE 33
409 29 2 132127 1 132128 TRUE 33
and it does indeed update the efforts.dropped_int column, but there are two problems: first, it updates all efforts, not just those produced by the query, and second, it sets every row's dropped_int to the split value of the first row in the query result, whereas I need each effort set to its associated split value.
If this were non-SQL, it might look something like:
jt_rows.each do |jt_row|
efforts[jt_row].dropped_int = jt[jt_row].split
end
But I don't know how to do that in SQL. It seems like this should be a fairly common problem, but after a couple of hours of searching I'm coming up short.
How should I modify my statement to produce the described result? If it matters, this is Postgres 9.5. Thanks in advance for any suggestions.
EDIT:
I did not get a workable answer but ended up solving this with a mixture of SQL and native code (Ruby/Rails):
dropped_splits = SplitTime.joins(:split).joins(:effort)
.select('DISTINCT ON (efforts.id) split_times.effort_id, split_times.split_id')
.where(efforts: {dropped: true})
.order('efforts.id, splits.distance_from_start DESC, splits.sub_order DESC')
update_hash = Hash[dropped_splits.map { |x| [x.effort_id, {dropped_split_id: x.split_id, updated_at: Time.now}] }]
Effort.update(update_hash.keys, update_hash.values)
Use a condition in the WHERE clause that relates the efforts table to the subquery:
efforts.id = jt.id
that is:
WHERE
((jt.distance_from_start + jt.sub_order) = max_dist)
AND
kind <> 1
AND
efforts.id = jt.id
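Putting it together with the original statement, the full UPDATE would read (an untested sketch, just the query above plus the new condition):

UPDATE efforts
SET dropped_int = jt.split
FROM
(
    SELECT
        ef.id,
        s.id split,
        s.kind,
        s.distance_from_start,
        s.sub_order,
        max(s.distance_from_start + s.sub_order)
            OVER (PARTITION BY ef.id) AS max_dist
    FROM split_times st
    LEFT JOIN splits s ON s.id = st.split_id
    LEFT JOIN efforts ef ON ef.id = st.effort_id
) jt
WHERE (jt.distance_from_start + jt.sub_order) = jt.max_dist
AND jt.kind <> 1
AND efforts.id = jt.id;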

How to move pairs of columns to a new table, keeping a FK to the old table and creating a new column whose value is the name of the original column?

I downloaded a database with translations of countries and cities into 70 languages (some of the translations are ''), but the translations and technical information (population, flags, territory, phones, etc.) about the cities/countries are stored in the same table.
I mean every translation has its own columns (the translation itself + a description in the language of the translation) next to the other info which is not related to translation: about 190 columns in total, including 70*2 (translation + description).
I don't think this is the proper way, so I want to move all translations to a separate table keeping a FK to the main/technical-info table.
So, now I have a table "cities" with the structure like below:
id region_id countries_id phone population lang_1 description_1 lang_2 description_2 lang_3 description_3 .... lang_70 description_70
1 1 1 +7 123 Москва SomeDesc Moscow SomeDesc2 Moskwa SomeText3 Translation70 SomeDesc70
2 1 1 +7 123 Кубинка SomeDesc Kubinka SomeDesc2 Kubinka '' Translation70 SomeDesc70
with 2.5M rows (cities).
I want to move all "lang_(1-70)" and their descriptions to new table "cities_translated" which should look like that:
id cities_id name description lang
1 1 Москва SomeDesc lang_1
2 1 Moscow SomeDesc2 lang_2
3 1 Moskwa SomeText3 lang_3
...
70 1 Translation70 SomeDesc70 lang_70
71 2 Кубинка SomeDesc lang_1
72 2 Kubinka SomeDesc2 lang_2
73 2 Kubinka SomeDesc3 lang_3
...
140 2 Translation70 SomeDesc70 lang_70
Could anyone please help me with a proper query to do this transfer?
P.S. I already have a table "languages", and as a next step I will replace all values like 'lang_1', 'lang_2' and so on with proper FKs.
I hoped to get a raw SQL solution in order to improve my SQL knowledge, but since there were no answers, I decided to use Python (a raw-SQL sketch follows the script below).
initial_table = 'countries.city'
init_table_columns = [col for i in range(1, 71) for col in ('lang_%d' % i, 'description_%d' % i)]  # lang_1, description_1, ..., lang_70, description_70
table_translation = 'countries.city_translated'
import psycopg2
import re
conn = psycopg2.connect(database="countries", host='localhost', user="postgres", password="Password")
cur = conn.cursor()
new_cursor = conn.cursor()
cur.execute("""SELECT id FROM %s """ % initial_table)
rows = cur.fetchall()
print("%i rows retrieved" % cur.rowcount)
new_cursor.execute("""BEGIN""")
for row in rows:
    print('row:', row)
    get_id = row[0]
    cur.execute("""SELECT %s FROM %s WHERE id=%s """ % (",".join(init_table_columns), initial_table, get_id))
    row_w_info = cur.fetchall()
    # Columns come in (lang, description) pairs, so walk them two at a time
    for i in range(0, 140, 2):
        name = row_w_info[0][i]
        description = row_w_info[0][i + 1]
        lang_text = init_table_columns[i]
        lang_id = int(re.findall(r'\d+', lang_text)[0])
        # There are 70 translations, but there is no info on what languages 68, 69 and 70 are
        if lang_id >= 68:
            lang_id = None
        new_cursor.execute("INSERT INTO countries.city_translated (city_id, name, description, lang_id) VALUES (%s, %s, %s, %s)", (get_id, name, description, lang_id))
new_cursor.execute("""COMMIT""")
cur.close()
new_cursor.close()
conn.close()
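For the record, the raw-SQL version I was hoping for could look something like this sketch (untested; it assumes the cities / cities_translated names from the question, and the VALUES list has to spell out all 70 pairs):

INSERT INTO cities_translated (cities_id, name, description, lang)
SELECT c.id, t.name, t.description, t.lang
FROM cities c
CROSS JOIN LATERAL (VALUES
    (c.lang_1, c.description_1, 'lang_1'),
    (c.lang_2, c.description_2, 'lang_2'),
    -- ... one row per (lang_N, description_N) pair ...
    (c.lang_70, c.description_70, 'lang_70')
) AS t(name, description, lang);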