Insert last characters of a value in a Postgres table while inserting that same value

CONTEXT:
I'm currently building a custom import script in Python with psycopg2 that inserts values from a CSV file into a Postgres database. The CSV, however, provides a value that needs refining before it is stored.
PROBLEM: The example below shows what I want: the last 5 digits of the 15-digit value, stored in a separate last_5_digits column.
mytestdb=# select * from testtable;
 uid | first_name | last_name | age |    15-digit    | last_5_digits
-----+------------+-----------+-----+----------------+---------------
   1 | John       | Doe       |  42 | 99999999912345 | 12345
I know I could accomplish this by first inserting the supplied values (first_name, last_name, age and 15-digit) and then filling the last_5_digits field with an UPDATE statement using RIGHT("15-digit", 5).
I would however prefer to do this during the initial INSERT of the row, which would considerably lower the number of statements sent to the database.
Could anyone help me get this done?
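A minimal sketch of what that single INSERT could look like, assuming the column names from the example and that "15-digit" is stored as text (the column name must be double-quoted because it starts with a digit); with psycopg2 you simply pass the same CSV value for both of the last two placeholders:

INSERT INTO testtable (first_name, last_name, age, "15-digit", last_5_digits)
VALUES (%s, %s, %s, %s, RIGHT(%s, 5));
-- the same 15-digit value is bound to both the "15-digit" and RIGHT() placeholders

On PostgreSQL 12 or later, an alternative is to declare last_5_digits as a generated column, e.g. GENERATED ALWAYS AS (RIGHT("15-digit", 5)) STORED, so the value never has to be supplied at all.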

Related

Is it possible to create a graph in AGE using existing table in the database?

I have just started with the Apache AGE extension. I am exploring the functionality of graph databases. Is there a way to create a graph from existing tables/schema such that the table becomes the label and the attributes become the properties of the vertices?
The create_graph('graph name') function is used for creating graphs, but I can only create a new, empty graph with it.
It's not as simple as that. For a start, you have to understand this:
When deriving a graph model from a relational model, keep in mind some general guidelines.
A row is a node.
A table name is a label name.
A join or foreign key is a relationship.
Using those relationships, you can model out the data without losing anything in translation.
Without a concrete example to work from, here is a dynamic way of creating a graph from a relational model.
First, create a PostgreSQL function that takes the column values as arguments (for example, the name and title of a Person) and creates a node:
CREATE OR REPLACE FUNCTION public.create_person(name text, title text)
RETURNS void
LANGUAGE plpgsql
VOLATILE
AS $BODY$
BEGIN
    LOAD 'age';
    SET search_path TO ag_catalog;
    -- %L quotes each value as a string literal, which Cypher also accepts
    -- (quote_ident would leave lowercase values unquoted and break the query)
    EXECUTE format('SELECT * FROM cypher(''graph_name'', $$CREATE (:Person {name: %L, title: %L})$$) AS (a agtype);', name, title);
END
$BODY$;
Then use the function like so:
SELECT public.create_person(sql_person.name, sql_person.title)
FROM sql_schema.Person AS sql_person;
You'll have created a node for every row in sql_schema.Person.
To load data from a PostgreSQL table into an AGE graph, you can go through a CSV file. For example, suppose you have the following table called employees:
SELECT * from employees;
 id |          name          | manager_id |   title
----+------------------------+------------+------------
  1 | Gabriel Garcia Marquez |            | Boss
  2 | Dostoevsky             |          1 | Director
  3 | Victor Hugo            |          1 | Manager
  4 | Albert Camus           |          2 | Engineer
  5 | Haruki Murakami        |          3 | Analyst
  6 | Virginia Woolf         |          1 | Consultant
  7 | Liu Cixin              |          2 | Manager
  8 | Franz Kafka            |          4 | Intern
  9 | Daphne Du Maurier      |          7 | Engineer
First export a CSV using the following command:
\copy (SELECT * FROM employees) to '/home/username/employees.csv' with csv header
Now you can import this into AGE. Remember that for a graph database, the name of the table is the name of the vertex label. The columns of the table are the properties of the vertex.
First make sure you create a label for your graph. In this case, the label name will be 'employees', the same as the table name.
SELECT create_vlabel('graph_name','employees');
Now we load all the nodes of this label (each row from the original table is one node in the graph).
SELECT load_labels_from_file('graph_name','employees','/home/username/employees.csv');
Now your graph should have all the table data of the employees table.
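As a quick sanity check, a Cypher query like the following should now return one vertex per row of the original table (this assumes the AGE extension is loaded and ag_catalog is on the search_path, as in the steps above):

SELECT * FROM cypher('graph_name', $$
    MATCH (e:employees)
    RETURN e
$$) AS (e agtype);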
More information can be found in the documentation:
https://age.apache.org/age-manual/master/intro/agload.html
I don't think it's possible to create a graph directly from existing tables, because when we create a graph, the graph name becomes a schema name and the vertex and edge label names become table names under it. Create a sample graph and then run the command below to understand more about which schemas and table names are present in PostgreSQL:
SELECT * FROM pg_catalog.pg_tables

Is this INSERT statement containing SELECT subquery safe for multiple concurrent writes?

In Postgres, suppose I have the following table, used like a singly linked list, where each row has a reference to the previous row.
Table "node"
   Column   |           Type           | Collation | Nullable |      Default
------------+--------------------------+-----------+----------+-------------------
 id         | uuid                     |           | not null | gen_random_uuid()
 created_at | timestamp with time zone |           | not null | now()
 name       | text                     |           | not null |
 prev_id    | uuid                     |           |          |
I have the following INSERT statement, which includes a SELECT subquery to look up the last row and use its id for the new row.
INSERT INTO node(name, prev_id)
VALUES (
    :name,
    (
        SELECT id
        FROM node
        ORDER BY created_at DESC
        LIMIT 1
    )
)
RETURNING id;
I understand storing prev_id may seem redundant in this example (the ordering can be derived from created_at), but that is beside the point. My question: is the above INSERT statement safe for multiple concurrent writes, or is it necessary to explicitly use LOCK in some way?
For clarity, by "safe" I mean: is it possible that by the time the SELECT subquery has found the "last row", another concurrent transaction has just finished an insert, so the row found earlier is no longer the last row and this insert uses the wrong "last row" value? The effect would be multiple rows sharing the same prev_id value, which is invalid for a linked-list structure.
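For what it's worth, two concurrent transactions can indeed both see the same "last row", since each statement's SELECT runs against a snapshot that excludes uncommitted concurrent inserts. A hedged sketch of one way to serialize the inserts, if locking turns out to be needed, uses a transaction-scoped advisory lock (the lock key 42 is an arbitrary choice):

BEGIN;
-- Only one session at a time can hold this advisory lock, so the
-- SELECT-then-INSERT below runs without a concurrent insert racing it.
-- The lock is released automatically at COMMIT or ROLLBACK.
SELECT pg_advisory_xact_lock(42);
INSERT INTO node(name, prev_id)
VALUES (
    :name,
    (SELECT id FROM node ORDER BY created_at DESC LIMIT 1)
)
RETURNING id;
COMMIT;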

Postgresql tsvector not searching few strings

I am using PostgreSQL 11 and created a tsvector with a GIN index on the column search_fields.
Data in table test:
  id   |           name           |         search_fields
-------+--------------------------+--------------------------------
 19973 | Ongoing 10x consultation | '10x' 'Ongoing' 'consultation'
 19974 | 5x marketing             | '5x' 'marketing'
 19975 | Ongoing 15x consultation | '15x' 'Ongoing' 'consultation'
The default text search config is set as 'pg_catalog.english'.
Both queries below output 0 rows.
select id, name, search_fields from test where search_fields @@ to_tsquery('ongoing');
 id | name | search_fields
----+------+---------------
(0 rows)
select id, name, search_fields from test where search_fields @@ to_tsquery('simple','ongoing');
 id | name | search_fields
----+------+---------------
(0 rows)
But when I pass the string '10x' or 'consultation', it returns the correct output.
Any idea why it is not finding the word 'ongoing'?
Afterwards, I created a trigger using the function tsvector_update_trigger(), set default_text_search_config to 'pg_catalog.simple' in postgresql.conf, and repopulated search_fields. The result:
select id, name, search_fields from test where search_fields @@ to_tsquery('ongoing');
  id   |           name           |            search_fields
-------+--------------------------+---------------------------------------
 19973 | Ongoing 10x consultation | '10x':2 'consultation':3 'ongoing':1
This time, running the query with the string 'ongoing' gave the expected result:
select id, name, search_fields from test where search_fields @@ to_tsquery('ongoing');
  id   |           name           |            search_fields
-------+--------------------------+---------------------------------------
 19973 | Ongoing 10x consultation | '10x':2 'consultation':3 'ongoing':1
 19975 | Ongoing 15x consultation | '15x':2 'consultation':3 'ongoing':1
As per the above experiment, setting up the trigger and changing default_text_search_config to 'pg_catalog.simple' achieved the result.
What I don't understand is why it didn't work with default_text_search_config set to 'pg_catalog.english'.
Is a trigger always required when tsvector is used?
Any help in understanding the difference between the two would be appreciated.
You don't describe how you created search_fields initially, but it was evidently not constructed correctly; since we don't know what you did, we can't say exactly what you did wrong. If you rebuild it correctly, it will start working. When you changed default_text_search_config to 'simple', you appear to have repopulated search_fields correctly, which is why it worked. If you change back to 'english' and correctly repopulate search_fields, that will also work.
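For example, repopulating with an explicit configuration might look like this (column names follow the question; 'english' lowercases and stems the words, so the stored tsvector and a to_tsquery('english', ...) query agree):

UPDATE test SET search_fields = to_tsvector('english', name);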
You don't always need a trigger. A trigger is one way; another is to manually update the tsvector column every time you update the text column. My usual favorite is not to store the tsvector at all and just derive it on the fly:
select id, name, search_fields from test where
to_tsvector('english', name) @@ to_tsquery('english', 'ongoing');
If you want to do it this way, you need to specify the configuration rather than relying on default_text_search_config, otherwise the expression GIN index will not be used. Also, this way is not a good idea if you want to use phrase searching, as the rechecking will be slow.
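A minimal sketch of the expression index that query can use (the index name is an assumption):

CREATE INDEX test_name_tsv_idx ON test
USING gin (to_tsvector('english', name));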

Know which tables are affected by a connection

I want to know if there is a way to retrieve which tables are affected by requests made from a connection in PostgreSQL 9.5 or higher.
The purpose is to have the information in a form that lets me know which tables were affected, in which order, and in what way.
More precisely, something like this would suffice:
 id | datetime | id_conn | id_query |  table  | action
----+----------+---------+----------+---------+--------
  1 | ...      |    2256 |      125 | user    | select
  2 | ...      |    2256 |      125 | order   | select
  3 | ...      |    2256 |      125 | product | select
(this would be the result of a SELECT query from user JOIN order JOIN product).
I know I can retrieve id_conn through pg_stat_activity, and I can see if there is a running query, but I can't find a "history" of the queries.
The final purpose is to debug the database when incoherent data is inserted into a table (due to a lack of constraints). Knowing which connection did the insert will lead me to the faulty script (as I already have the script name linked to the connection id).
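One hedged way to get such a history is PostgreSQL's own statement logging rather than a queryable table. The settings below are standard (log_line_prefix's %c escape is the session id, and %p the backend pid, both of which can be correlated with pg_stat_activity); the exact prefix format is just one possible choice:

-- Requires superuser; changes take effect after a configuration reload.
ALTER SYSTEM SET log_statement = 'all';
ALTER SYSTEM SET log_line_prefix = '%m [%p] %c ';  -- timestamp, pid, session id
SELECT pg_reload_conf();

Every statement then appears in the server log tagged with its session, though parsing the affected tables and actions out of the logged SQL is left to you.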

Is it possible to use different forms and create one row of information in a table?

I have been searching for a way to combine two or more rows of one table in a database into one row.
I am currently creating multiple web-based forms that connect to one table in my database. Is there any way to write some MySQL and PHP code that will take separate form submissions and put them into one row of the database instead of multiple rows?
Here is an example of what is going into the database:
This is all in one table with three rows.
Form_ID represents the three different forms that I used to insert the data into the table.
 Form_ID | Lot_ID | F_Name | L_Name |    Date    | Age
---------+--------+--------+--------+------------+------
       1 |      1 | John   | Evans  | NULL       | NULL
       2 | NULL   | NULL   | NULL   | 2017-07-06 | NULL
       3 | NULL   | NULL   | NULL   | NULL       | 22
This is an example of three separate forms going into one table. Every time the submit button is hit, the data just inserts into the next row of the table.
I need some sort of join or update once the submit button is hit to replace the preceding NULL values.
Here is what I want to do after the submit button is hit:
I want it all combined into one row, but still in one table.
Form_ID still comes from the three separate forms, but now everything is in one row:
 Form_ID | Lot_ID | F_Name | L_Name |    Date    | Age
---------+--------+--------+--------+------------+-----
       1 |      1 | John   | Evans  | 2017-07-06 | 22
My goal is that once one form has been submitted, the next, different form submission replaces the NULL values in the existing row, and so on, to create a single row of information.
I found a way to solve this issue: I used UPDATE tablename SET columnname = newvalue WHERE Form_ID = matchingID.
This way, when I want to update rows that have blank spaces, it finds the matching IDs.
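A minimal sketch of what that could look like for the example above, assuming MySQL, a table named form_data, and that the later forms know which row the first form created (the table name and the WHERE condition are assumptions):

-- After the second form is submitted, fill in the Date on the
-- row created by the first form instead of inserting a new row.
UPDATE form_data
SET `Date` = '2017-07-06'
WHERE Lot_ID = 1;

-- Likewise for the third form's Age value.
UPDATE form_data
SET Age = 22
WHERE Lot_ID = 1;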