Athena doesn't read a column created as BIGINT - postgresql

We tried to upload a CSV file to AWS Athena with a column 'cpf'; the field cpf contains numbers like this: '372.088.989-03'
create external table my_table (  -- my_table is a placeholder name
    cpf bigint,
    name string,
    cell bigint
)
Athena doesn't read this field. How can I register it?
We tried registering it as a string, and that works, but it doesn't seem correct.

Ah! It's the CPF number - Wikipedia.
It does not follow the rules for numbers, and you won't be doing any mathematics on it, so I would recommend treating the CPF as a string.
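A minimal sketch of the DDL with cpf as a string (the table name and S3 location are illustrative, not from the question):
create external table people (
    cpf string,    -- values like '372.088.989-03' are not valid bigints
    name string,
    cell bigint
)
row format delimited
fields terminated by ','
location 's3://your-bucket/path/';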

Related

Creating a Postgres column which allows all data types

I want to create a logging table which tracks changes in a certain table, like so:
CREATE TABLE logging.zaak_history (
    event_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    tstamp timestamp DEFAULT NOW(),
    schemaname text,
    tabname text,
    columnname text,
    operation text,
    who text DEFAULT current_user,
    new_val <any_type>,
    old_val <any_type>
);
However, the column that I want to track can take different data types, such as text, boolean, and numeric. Is there a data type that supports this?
Currently I am thinking about storing it as jsonb, as the JSON formatting will deal with the data type, but I was wondering if there is a better way.
There is no Postgres data type that isn't strongly typed; the "any" pseudo-type that does exist cannot be used as a column type (it can only be used in function signatures and the like).
You could store the binary representation of your data, because every type does have a binary representation.
Your approach of using JSON seems more flexible, as you can also store metadata (such as type information).
However, I recommend looking at how other people have solved the same issue for alternative ideas. For example, most wikis store a copy of the entire record for history, which is easy to reconstruct, can be referenced independently, and has no typing issues.
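For reference, a minimal sketch of the jsonb variant; the trigger wiring is only hinted at in comments, and some_column is a hypothetical column name:
CREATE TABLE logging.zaak_history (
    event_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    tstamp timestamp DEFAULT now(),
    schemaname text,
    tabname text,
    columnname text,
    operation text,
    who text DEFAULT current_user,
    new_val jsonb,
    old_val jsonb
);

-- inside a trigger function, a value of any type can be wrapped with to_jsonb:
--   INSERT INTO logging.zaak_history (schemaname, tabname, columnname, operation, new_val, old_val)
--   VALUES (TG_TABLE_SCHEMA, TG_TABLE_NAME, 'some_column', TG_OP,
--           to_jsonb(NEW.some_column), to_jsonb(OLD.some_column));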

PostgreSQL: column does not exist

Right now I'm trying to create a filter that would give me every result from the start of the month. The query looks like this:
cur.execute('SELECT SUM(money_amount) '
            f'FROM expense WHERE created >= "{first_day_of_month}"')
But I'm getting this error: psycopg2.errors.UndefinedColumn: column "2022-08-01" does not exist
My createtable.sql:
CREATE TABLE budget(
    codename VARCHAR(255) PRIMARY KEY,
    daily_expense INTEGER
);
CREATE TABLE category(
    codename VARCHAR(255) PRIMARY KEY,
    name VARCHAR(255),
    is_basic_expense BOOLEAN,
    aliases TEXT
);
CREATE TABLE expense(
    id SERIAL PRIMARY KEY,
    money_amount INTEGER,
    created DATE,
    category_codename VARCHAR(255),
    raw_text TEXT,
    FOREIGN KEY(category_codename) REFERENCES category(codename)
);
What is wrong, and why does the column not exist when it clearly does?
This is probably the most common reason to get a "column does not exist" error: using double quotes. In PostgreSQL, double quotes aren't used for strings, but rather for identifiers. For example, if your column name had a space in it, you couldn't write WHERE the date > '2022-08-01', but you could write WHERE "the date" > '2022-08-01'.
Putting double quotes around a string (or a stringy thing like a date) is interpreted as an attempt to use an identifier, and since you're using it where a value should be, it will usually be read as a reference to a column. I make this mistake at least once a week. Instead, use single quotes or, better, placeholders.
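A minimal sketch of the placeholder version with psycopg2 (%s is the driver's parameter marker; the value is bound by the driver, so no quoting is involved):
cur.execute(
    'SELECT SUM(money_amount) FROM expense WHERE created >= %s',
    (first_day_of_month,),
)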

PostgreSQL error: value too long for type character varying(256)

I have looked into other solutions but could not work out the problem from the explanations. I am trying to run a Python script that loads data from an OLTP MySQL database (AWS RDS) into an OLAP database on AWS Redshift. I have defined my table in Redshift as below:
create_product = ("""CREATE TABLE IF NOT EXISTS product (
productCode varchar(15) NOT NULL PRIMARY KEY,
productName varchar(70) NOT NULL,
productLine varchar(50) NOT NULL,
productScale varchar(10) NOT NULL,
productVendor varchar(50) NOT NULL,
productDescription text NOT NULL,
buyPrice decimal(10,2) NOT NULL,
MSRP decimal(10,2) NOT NULL
)""")
I am using a Python script to load the data from RDS to Redshift. My function body for the load:
for query in dimension_etl_query:
    oltp_cur.execute(query[0])
    items = oltp_cur.fetchall()
    try:
        olap_cur.executemany(query[1], items)
        olap_cnx.commit()
        logger.info("Inserted data with: %s", query[1])
    except sqlconnector.Error as err:
        logger.error('Error %s Couldnt run query %s', err, query[1])
The script run throws this error:
olap_cur.executemany(query[1], items)
psycopg2.errors.StringDataRightTruncation: value too long for type character varying(256)
I have checked the length of each column in my SQL database, and only productDescription has values longer than 256 characters. However, I am using the text data type in Postgres for that column. I would appreciate any tips on how to find the root cause.
See here:
https://docs.aws.amazon.com/redshift/latest/dg/r_Character_types.html#r_Character_types-text-and-bpchar-types
TEXT and BPCHAR types
You can create an Amazon Redshift table with a TEXT column, but it is converted to a VARCHAR(256) column that accepts variable-length values with a maximum of 256 characters.
You can create an Amazon Redshift column with a BPCHAR (blank-padded character) type, which Amazon Redshift converts to a fixed-length CHAR(256) column.
Looks like you might need VARCHAR, I think. From the same link:
VARCHAR or CHARACTER VARYING
...
If used in an expression, the size of the output is determined using the input expression (up to 65535).
You will have to experiment to see if that works.
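For example, a minimal sketch of the table with an explicit VARCHAR length instead of TEXT; 65535 is the documented maximum, so pick whatever actually fits the data:
CREATE TABLE IF NOT EXISTS product (
    productCode varchar(15) NOT NULL PRIMARY KEY,
    productName varchar(70) NOT NULL,
    productLine varchar(50) NOT NULL,
    productScale varchar(10) NOT NULL,
    productVendor varchar(50) NOT NULL,
    productDescription varchar(65535) NOT NULL,  -- was text, which Redshift caps at 256
    buyPrice decimal(10,2) NOT NULL,
    MSRP decimal(10,2) NOT NULL
);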
Just try to keep everything under 256 characters, even if it is a TEXT column.

How to search for a specific value across all table columns in PostgreSQL

I have a table named demo.
create table demo (
    id int,
    name varchar,
    address varchar,
    designation varchar
);
I want to achieve the below scenario in one query:
If I pass an empty string like '' then the query should return all rows in the table.
If I pass 'av' then the query should match rows where the string 'av' is contained in any of the columns (name, address, designation).
You need a WHERE clause like this:
WHERE name LIKE '%av%'
   OR address LIKE '%av%'
   OR designation LIKE '%av%'
Note that this also covers the empty-string case: the pattern then becomes '%%', which matches any non-null value, so every row comes back.
Beware of SQL injection when you construct the search pattern.
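A minimal sketch with a bound parameter ($1 stands for the search term; binding the value rather than splicing it into the SQL also sidesteps the injection concern):
SELECT *
FROM demo
WHERE name LIKE '%' || $1 || '%'
   OR address LIKE '%' || $1 || '%'
   OR designation LIKE '%' || $1 || '%';
-- use ILIKE instead of LIKE if the match should be case-insensitive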

Column data type won't change to serial [duplicate]

In a Postgres 9.3 table I have an integer as primary key with automatic sequence to increment, but I have reached the maximum for integer. How to convert it from integer to serial?
I tried:
ALTER TABLE my_table ALTER COLUMN id SET DATA TYPE bigint;
But the same does not work with the data type serial instead of bigint. Seems like I cannot convert to serial?
serial is a pseudo data type, not an actual data type. It's an integer underneath with some additional DDL commands executed automatically:
Create a SEQUENCE (with matching name by default).
Set the column NOT NULL and the default to draw from that sequence.
Make the column "own" the sequence.
Details:
Safely rename tables using serial primary key columns
A bigserial is the same, built around a bigint column. You want bigint, but you already achieved that. To transform an existing serial column into a bigserial (or smallserial), all you need to do is ALTER the data type of the column. Sequences are generally based on bigint, so the same sequence can be used for any integer type.
To "change" a bigint into a bigserial or an integer into a serial, you just have to do the rest by hand:
Creating a PostgreSQL sequence to a field (which is not the ID of the record)
The actual data type is still integer / bigint. Some clients like pgAdmin will display the data type serial in the reverse engineered CREATE TABLE script, if all criteria for a serial are met.
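A minimal sketch of those manual steps, reusing the my_table / id names from the question:
-- create a sequence and tie its lifetime to the column
CREATE SEQUENCE my_table_id_seq OWNED BY my_table.id;

-- start the sequence after the current maximum id
SELECT setval('my_table_id_seq', COALESCE(max(id), 0) + 1, false) FROM my_table;

-- make the column behave like a bigserial
ALTER TABLE my_table
    ALTER COLUMN id SET NOT NULL,
    ALTER COLUMN id SET DEFAULT nextval('my_table_id_seq');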