I want to change the size of a String column in my PostgreSQL database through alembic.
My first attempt on my local DB was the most straightforward one, and the one that seemed logical:
Change the size of the db.Column field I wanted to resize, and configure Alembic to look for type changes as documented: add the compare_type=True parameter to the context.configure() call in my alembic/env.py. Then run alembic revision --autogenerate, which correctly generated a file calling alter_column.
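For reference, here is roughly what the two pieces look like (a sketch based on the standard Alembic env.py template; item and details are the table and column from my model):

# alembic/env.py (excerpt)
from alembic import context
from sqlalchemy import engine_from_config, pool

config = context.config
target_metadata = None  # set this to your models' MetaData

def run_migrations_online():
    connectable = engine_from_config(
        config.get_section(config.config_ini_section),
        prefix='sqlalchemy.',
        poolclass=pool.NullPool,
    )
    with connectable.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=target_metadata,
            compare_type=True,  # make autogenerate detect column type changes
        )
        with context.begin_transaction():
            context.run_migrations()

The autogenerated revision then boils down to something like:

def upgrade():
    op.alter_column('item', 'details',
                    existing_type=sa.String(length=200),
                    type_=sa.String(length=1000))

def downgrade():
    op.alter_column('item', 'details',
                    existing_type=sa.String(length=1000),
                    type_=sa.String(length=200))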
That all seemed fine, but alembic upgrade head was taking so long on the ALTER COLUMN that I cancelled the execution and looked for other solutions. I figured that if it takes that long on my computer, it would take long on the Heroku server too, and I can't have my service paused until the operation finishes.
So I came up with a quite hacky solution that worked perfectly on my machine:
I created my update statement in an Alembic revision file, for both upgrade and downgrade:

def upgrade():
    connection = op.get_bind()
    # for varchar(n), pg_attribute stores atttypmod = n + 4
    connection.execute(
        "UPDATE pg_attribute SET atttypmod = 1000 + 4 "
        "WHERE attrelid = 'item'::regclass AND attname = 'details'")

def downgrade():
    connection = op.get_bind()
    connection.execute(
        "UPDATE pg_attribute SET atttypmod = 200 + 4 "
        "WHERE attrelid = 'item'::regclass AND attname = 'details'")
This ran really fast on my machine. But when I pushed it to my staging app on Heroku and ran the upgrade, it failed with ERROR: permission denied for relation pg_attribute. The same happens if I try to execute the update statement directly in psql. I guess this is intentional on Heroku's part: I'm not supposed to update system catalogs like that, since getting it wrong could break the database. So forcing that update on Heroku is not the way to go.
I have also tried creating a new temporary column, copying all the data from the old, smaller column into it, dropping the old column, creating a new one with the same name but the desired size, copying the data back from the temp column, and dropping the temp column:
def upgrade():
    connection = op.get_bind()
    op.add_column('item', sa.Column('temp_col', sa.String(200)))
    connection.execute("UPDATE item SET temp_col = details")
    op.drop_column('item', 'details')
    op.add_column('item', sa.Column('details', sa.String(1000)))
    connection.execute("UPDATE item SET details = temp_col")
    op.drop_column('item', 'temp_col')

def downgrade():
    connection = op.get_bind()
    op.add_column('item', sa.Column('temp_col', sa.String(1000)))
    connection.execute("UPDATE item SET temp_col = details")
    op.drop_column('item', 'details')
    op.add_column('item', sa.Column('details', sa.String(200)))
    connection.execute("UPDATE item SET details = temp_col")
    op.drop_column('item', 'temp_col')
But it also takes ages and doesn't seem like a very clean way to do it.
So my question is: what is the correct way to resize a string column in PostgreSQL on Heroku through Alembic, without having to wait ages for the ALTER COLUMN to execute?
I'm writing a Java application to automatically build and run SQL queries. For many tables my code works fine but on a certain table it gets stuck by throwing the following exception:
Exception in thread "main" org.postgresql.util.PSQLException: ERROR: column "continent" does not exist
Hint: Perhaps you meant to reference the column "countries.Continent".
Position: 8
The query that has been run is the following:
SELECT Continent
FROM network.countries
WHERE Continent IS NOT NULL
AND Continent <> ''
LIMIT 5
This essentially returns 5 non-empty values from the column.
I don't understand why I'm getting the "column does not exist" error when it clearly does in pgAdmin 4. I can see that there is a schema with the name Network which contains the table countries and that table has a column called Continent just as expected.
Since all column, schema and table names are retrieved by the application itself, I don't think there is a spelling or semantic error, so why does PostgreSQL complain regardless? Neither running the query in pgAdmin 4 nor using the suggested countries.Continent works.
My PostgreSQL version is the newest as of now:
$ psql --version
psql (PostgreSQL) 9.6.1
How can I successfully run the query?
Try putting it in double quotes - like "Continent" - in the query. Unquoted identifiers are folded to lower case in PostgreSQL, so Continent is looked up as continent; double quotes preserve the case the column was created with:
SELECT "Continent"
FROM network.countries
...
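Since the asker is building queries programmatically, it may also help to let the driver do the quoting. Here is a sketch in Python with psycopg2 (not the asker's Java stack, and the connection parameters are placeholders):

import psycopg2
from psycopg2 import sql

conn = psycopg2.connect(dbname='DB', user='me', password='secret', host='localhost')
with conn.cursor() as cur:
    # sql.Identifier double-quotes each part, preserving case:
    # SELECT "Continent" FROM "network"."countries" ...
    query = sql.SQL(
        "SELECT {col} FROM {schema}.{table} "
        "WHERE {col} IS NOT NULL AND {col} <> '' LIMIT 5"
    ).format(
        col=sql.Identifier('Continent'),
        schema=sql.Identifier('network'),
        table=sql.Identifier('countries'),
    )
    cur.execute(query)
    print(cur.fetchall())

In JDBC the equivalent is simply to emit the double quotes yourself when assembling the statement.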
While working in a SQLAlchemy environment, I got this error with SQL like this:

db.session.execute(
    text('SELECT name, type, ST_Area(geom) FROM buildings WHERE type == "plaza" '))

ERROR: column "plaza" does not exist

Well, I changed == to =, but the error persisted. Then I swapped the quotes, as follows, and it worked: in SQL, double quotes denote identifiers and single quotes denote string literals, so "plaza" was being parsed as a column name.
....
text("SELECT name,type,ST_Area(geom) FROM buildings WHERE type = 'plaza' "))
This problem also occurs in Postgres when the table name is not tablename but "tablename", i.e. it was created with quotes. For example, if user shows up as the table name, the name to use in queries is "user".
Such an error can also appear when you add a space to the name of a column by mistake (for example "users ").
QUICK FIX (TRICK)
If you recently added a field, deleted it, and are now trying to add the same field back, here is a simple trick that worked for me. Delete the app's migration folder entirely, then add the field again under a name you have never used in this app before (for example, if you were trying to add a title field, create it as heading). Run the migration process for the app and start the server, then go to the admin page and delete all objects of that model. Finally, rename the new field back to the name you originally wanted and run the migrations again.
This error occurs when objects already exist in the DB but the field you added wasn't there when those earlier objects were created; deleting those objects and making fresh ones clears it up.
I got the same error when doing a PIVOT in Redshift.
My code was similar to:
SELECT *
INTO output_table
FROM (
SELECT name, year_month, sales
FROM input_table
)
PIVOT
(
SUM(sales)
FOR year_month IN ('nov_2020', 'dec_2020', 'jan_2021', 'feb_2021', 'mar_2021', 'apr_2021', 'may_2021', 'jun_2021', 'jul_2021', 'aug_2021',
'sep_2021', 'oct_2021', 'nov_2021', 'dec_2021', 'jan_2022', 'feb_2022', 'mar_2022', 'apr_2022', 'may_2022', 'jun_2022',
'jul_2022', 'aug_2022', 'sep_2022', 'oct_2022', 'nov_2022')
)
I tried year_month without any quotes (got the error), with double quotes (got the error), and finally with single quotes (it worked this time). That makes sense in hindsight: the entries in a PIVOT's IN list are values rather than column references, so they need single quotes. This may help if someone is in the same situation as in my example.
I've created some views in my postgres database. I know they're there, because I can query them through the query tool in PGAdmin4 (and they are persistent between restarting the machine hosting the database), but they are neither visible in the schema browser nor queryable through psycopg2.
For larger context, I'm trying to extract some text from a large collection of documents which are stored in a database. (The database is a copy of the data received from a third party, and fully normalized, etc.) I'd like to do my NLP nonsense in Python, while defining a lot of document categorizations through SQL views so the categorizations are consistent, persistent, and broadly shareable to my team.
Googling has not turned up anything relevant here, so I'm wondering if there is a basic configuration issue that I've missed. (I am much more experienced with SQL Server than with Postgres.)
Example:
[Assume I'm connected to database DB, schema SC, which has tables T1, T2, T3.]
-- in PGAdmin4 window
CREATE VIEW v_my_view as
SELECT T1.field1, T2.field2
FROM T1
JOIN T2
on T1.field3 = T2.field3
After restarting the host machine (so definitely a new pgAdmin session), the following works:
-- in pgadmin4 window
SELECT *
FROM v_my_view
-- 123456 results returned
...but even though that works, the 'Views' folder in the pgAdmin 4 browser panel is empty (right underneath the Tables folder, which proudly shows T1 and T2).
Within psycopg2:
import psycopg2
import pandas as pd

sqluser = 'me'
sqlpwd = 'secret'
dbname = 'DB'
schema_name = 'SC'
pghost = 'localhost'

def q(query):
    cnxn = psycopg2.connect(dbname=dbname, user=sqluser, password=sqlpwd, host=pghost)
    cursor = cnxn.cursor()
    cursor.execute('SET search_path to ' + schema_name)
    return pd.read_sql_query(query, cnxn)

view_query = """select *
from v_my_view
limit 100;"""
table_query = """select *
from SC.T1
limit 100;"""

# This works
print(f"Result: {q(table_query)}")
# This does not; error is: relation 'v_my_view' does not exist
# (Same result if view is prefixed with schema name)
# print(f"Result: {q(view_query)}")
Software versions:
pgadmin 4.23
postgres: I'm connected to 10.13 (Ubuntu 10.13-1-pgdg18.04+1), though 12 is also installed.
psycopg2: 2.8.5
Turns out this was a noob mistake. Views are created in the first schema of the search path (which can be checked by executing show search_path; - in my case it was set to "$user", public despite my attempt to set it to the appropriate schema name). So the views were getting created in a different schema from the one I was working with / where the tables were defined.
Created views are all visible in the left-hand browser once I look under the correct schema.
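In other words (a sketch using the placeholder names from the example above; either step alone is enough): check what the search path actually is, and/or schema-qualify the view name at creation time so search_path doesn't matter:

import psycopg2

cnxn = psycopg2.connect(dbname='DB', user='me', password='secret', host='localhost')
cur = cnxn.cursor()

# Where will unqualified objects be created?
cur.execute('SHOW search_path;')
print(cur.fetchone())  # e.g. ('"$user", public',)

# Schema-qualifying the view name removes the dependence on search_path
cur.execute("""
    CREATE VIEW SC.v_my_view AS
    SELECT T1.field1, T2.field2
    FROM SC.T1
    JOIN SC.T2 ON T1.field3 = T2.field3
""")
cnxn.commit()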
The following modification to the psycopg2 code returns the expected results:
import psycopg2
import pandas as pd

sqluser = 'me'
sqlpwd = 'secret'
dbname = 'DB'
schema_name = 'SC'
pghost = 'localhost'

def q(query):
    cnxn = psycopg2.connect(dbname=dbname, user=sqluser, password=sqlpwd, host=pghost)
    cursor = cnxn.cursor()
    cursor.execute('SET search_path to ' + schema_name)
    return pd.read_sql_query(query, cnxn)

# NOTE I am explicitly indicating the 'public' schema here
view_query = """select *
from public.v_my_view
limit 100;"""
table_query = """select *
from SC.T1
limit 100;"""

# This works
print(f"Result: {q(table_query)}")
# This works too once I specify the right schema:
print(f"Result: {q(view_query)}")
Try refreshing the object from the pgAdmin toolbar. This should make the view show up in the browser.
I have an Alembic migration which creates a few DB indexes that were missing in a database. Example:
op.create_index(op.f('ix_some_index'), 'table_1', ['column_1'], unique=False)
However, the migration fails in other environments that already have the index:
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) relation "ix_some_index" already exists
PostgreSQL supports an IF NOT EXISTS option for cases like this, but I don't see any way of invoking it using either Alembic or SQLAlchemy options. Is there a canonical way of checking for an existing index?
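One way to reach that Postgres option from a migration is to drop to raw SQL; a sketch reusing the names from this question (CREATE INDEX IF NOT EXISTS requires PostgreSQL 9.5+):

from alembic import op

def upgrade():
    # Postgres itself skips creation if the index already exists
    op.execute("CREATE INDEX IF NOT EXISTS ix_some_index ON table_1 (column_1)")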
Here's a somewhat blunt solution that works for PostgreSQL. It simply checks whether there's an index with the same name before creating the new index.
Beware that it doesn't verify that the index is in the correct Postgres namespace or any other info that could be relevant. It works in my case because I know there's no other chance of name collision:
def index_exists(name):
    connection = op.get_bind()
    result = connection.execute(
        "SELECT exists(SELECT 1 from pg_indexes where indexname = '{}') as ix_exists;"
        .format(name)
    ).first()
    return result.ix_exists

def upgrade():
    if not index_exists('ix_some_index'):
        op.create_index(op.f('ix_some_index'), 'table_1', ['column_1'], unique=False)
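An alternative sketch that avoids querying pg_indexes by hand is to ask SQLAlchemy's inspector for the table's indexes (assumes a reasonably recent SQLAlchemy):

import sqlalchemy as sa
from alembic import op

def index_exists(table_name, index_name):
    # Reflect the indexes on the table through the migration connection
    inspector = sa.inspect(op.get_bind())
    return any(ix['name'] == index_name for ix in inspector.get_indexes(table_name))

def upgrade():
    if not index_exists('table_1', 'ix_some_index'):
        op.create_index(op.f('ix_some_index'), 'table_1', ['column_1'], unique=False)

Newer Alembic releases (1.12+) also accept op.create_index(..., if_not_exists=True), which pushes the existence check down to Postgres.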
Windows/NET/ODBC
I would like to get query results into a new table in some handy way that I can then see through a data adapter, but I can't find a way to do it.
There aren't many examples around that suit a beginner's level on this.
I don't know whether it should be temporary or not, but after seeing the results the table is no longer needed, so I can delete it by hand or it can be dropped automatically.
This is what I try:
mCmd = New OdbcCommand("CREATE TEMP TABLE temp1 ON COMMIT DROP AS " & _
"SELECT dtbl_id, name, mystr, myint, myouble FROM " & myTable & " " & _
"WHERE myFlag='1' ORDER BY dtbl_id", mCon)
n = mCmd.ExecuteNonQuery
This runs without error, and in n I get the correct number of matched rows!!
But in pgAdmin I can't see that table anywhere, no matter whether I look during the open transaction or after it is closed.
Second, should I define the columns of the temp1 table first, or can they be created automatically based on the query results (that would be nice!)?
Please give a minimal example, based on the code above, of how to get a new table filled with the query results.
A shorter way to do the same thing your current code does is with CREATE TEMPORARY TABLE AS SELECT ... . See the entry for CREATE TABLE AS in the manual.
Temporary tables are not visible outside the session ("connection") that created them, they're intended as a temporary location for data that the session will use in later queries. If you want a created table to be accessible from other sessions, don't use a TEMPORARY table.
Maybe you want UNLOGGED (9.2 or newer) for data that's generated and doesn't need to be durable, but must be visible to other sessions?
See related: Is there a way to access temporary tables of other sessions in PostgreSQL?
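To make the visibility rule concrete, here is a minimal sketch in Python with psycopg2 (not the asker's VB.NET/ODBC setup; the table contents are made up): the temp table is queryable on the connection that created it and invisible from any other.

import psycopg2

conn_a = psycopg2.connect(dbname='mydb', user='me', password='secret', host='localhost')
conn_b = psycopg2.connect(dbname='mydb', user='me', password='secret', host='localhost')

with conn_a.cursor() as cur:
    # Session-local table; ON COMMIT DROP would make it vanish even sooner
    cur.execute("CREATE TEMP TABLE temp1 AS SELECT generate_series(1, 5) AS n")
    cur.execute("SELECT count(*) FROM temp1")
    print(cur.fetchone())  # (5,) - visible on the creating connection

with conn_b.cursor() as cur:
    try:
        cur.execute("SELECT count(*) FROM temp1")
    except psycopg2.errors.UndefinedTable:
        print("not visible from another session")

And if other sessions (such as a data adapter on its own connection) do need to see the data, CREATE UNLOGGED TABLE output_table AS SELECT ... behaves like an ordinary table, minus the durability mentioned above.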