Save a file (.pdf) in a database with Python 2.7 - PostgreSQL

Craig Ringer, I cannot work with the large object functions.
My database looks like this. This is my table:
-- Table: files
--
DROP TABLE files;
CREATE TABLE files
(
id serial NOT NULL,
orig_filename text NOT NULL,
file_data bytea NOT NULL,
CONSTRAINT files_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE files
I want to save a .pdf in my database. I saw your last answer about doing this with Python 2.7 (read the file and convert it to a buffer object, or use the large object functions). The code I wrote looks like this:
path="D:/me/A/Res.pdf"
listaderuta = path.split("/")
longitud=len(listaderuta)
f = open(path,'rb')
f.read().__str__()
cursor = con.cursor()
cursor.execute("INSERT INTO files(id, orig_filename, file_data) VALUES (DEFAULT,%s,%s) RETURNING id", (listaderuta[longitud-1], f.read()))
But when I download it, I save it like this:
fula = open("D:/INSTALL/pepe.pdf",'wb')
cursor.execute("SELECT file_data, orig_filename FROM files WHERE id = %s", (int(17),))
(file_data, orig_filename) = cursor.fetchone()
fula.write(file_data)
fula.close()
But when I download it, the file cannot be opened; it is damaged. I repeat, I cannot work with the large object functions.
I tried this and it failed. Can you help?

I am thinking that psycopg2's Binary function does not use the large object functions, thus I used:
path="salman.pdf"
f = open(path,'rb')
dat = f.read()
binary = psycopg2.Binary(dat)
cursor.execute("INSERT INTO files(id, file_data) VALUES ('1',%s)", (binary,))
conn.commit()

Just a correction to the INSERT statement: it will fail with the error: null value in column "orig_filename" violates not-null constraint, since orig_filename is defined as NOT NULL. Use this instead:
("INSERT INTO files(id, orig_filename,file_data) VALUES ('1','filename.pdf',%s)", (binary,))

Related

Kafka/KsqlDb : Why is PRIMARY KEY appending chars?

I intend to create a TABLE called WEB_TICKETS where the PRIMARY KEY is equal to the key->ID value. For some reason, when I run the CREATE TABLE statement, the PRIMARY KEY value ends up prefixed with the characters 'J0' - why is this happening?
KsqlDb Statements
These work as expected
CREATE STREAM STREAM_WEB_TICKETS (
ID_TICKET STRUCT<ID STRING> KEY
)
WITH (KAFKA_TOPIC='web.mongodb.tickets', FORMAT='AVRO');
CREATE STREAM WEB_TICKETS_REKEYED
WITH (KAFKA_TOPIC='web_tickets_by_id') AS
SELECT *
FROM STREAM_WEB_TICKETS
PARTITION BY ID_TICKET->ID;
PRINT 'web_tickets_by_id' FROM BEGINNING LIMIT 1;
key: 5d0c2416b326fe00515408b8
The following successfully creates the table but the PRIMARY KEY value isn't what I expect:
CREATE TABLE web_tickets (
id_pk STRING PRIMARY KEY
)
WITH (KAFKA_TOPIC = 'web_tickets_by_id', VALUE_FORMAT = 'AVRO');
select id_pk from web_tickets EMIT CHANGES LIMIT 1;
|ID_PK|
|J05d0c2416b326fe00515408b8
As you can see, the ID_PK value has the characters J0 prefixed to it. Why is this?
It appears as though I wasn't properly setting the KEY FORMAT: with only VALUE_FORMAT specified, the key presumably defaults to the KAFKA format, so the Avro-serialized key written by the rekeyed stream gets read as a raw string, serialization header bytes included. The following command produces the expected result.
CREATE TABLE web_tickets_test_2 (
id_pk VARCHAR PRIMARY KEY
)
WITH (KAFKA_TOPIC = 'web_tickets_by_id', FORMAT = 'AVRO');

Replace rows based on a modified timestamp

I am looking for an efficient method (which I can reuse for similar situations) to drop rows which have been updated.
My table has many columns, but the important ones are:
creation_timestamp, id, last_modified_timestamp
My primary key is the creation_timestamp and the id. However, after an id has been created, it can be modified by other users, which is indicated by the last_modified_timestamp. I want to:
1) Read a daily file and add any new rows (based on creation_timestamp and id)
2) Remove old rows which have a different last_modified_timestamp and replace them with the latest versions.
I typically do most of my operations with Pandas (Python library) and psycopg2, so I am not extremely familiar with PostgreSQL 9.6, which is the database I am using. My initial approach was to just add the last_modified_timestamp to the primary key and then use a view to SELECT DISTINCT based on the latest changes. However, that seems like 'cheating', and I would be wasting space since I do not need to retain previous versions.
EDIT:
def create_update_query(df, table=FACT_TABLE):
    # Build an INSERT ... ON CONFLICT ... DO UPDATE (upsert) statement.
    columns = ', '.join([f'{col}' for col in DATABASE_COLUMNS])
    constraint = ', '.join([f'{col}' for col in PRIMARY_KEY])
    placeholder = ', '.join([f'%({col})s' for col in DATABASE_COLUMNS])
    updates = ', '.join([f'{col} = EXCLUDED.{col}' for col in DATABASE_COLUMNS])
    query = f"""
    INSERT INTO {table} ({columns})
    VALUES ({placeholder})
    ON CONFLICT ({constraint})
    DO UPDATE SET {updates};"""
    # Collapse the whitespace into a single-line statement.
    query = ' '.join(query.split())
    return query

def load_updates(df, connection=DATABASE):
    conn = connection.get_conn()
    cursor = conn.cursor()
    # Replace NaN/NaT with None so they are inserted as NULLs.
    df1 = df.where((pd.notnull(df)), None)
    insert_values = df1.to_dict(orient='records')
    for row in insert_values:
        cursor.execute(create_update_query(df), row)
    conn.commit()
    cursor.close()
    del cursor
    conn.close()
This appears to work. I was running into some issues, so right now I am looping through each row of the DataFrame as a dictionary and then inserting that row. Also, I had to figure out a way to fill the NaN columns with None, because I was getting errors with Timestamp dtypes for blank values, etc.
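As a possible refinement (a sketch only, not from the original post): since only the newest version of each (creation_timestamp, id) pair needs to be kept, the ON CONFLICT clause can be restricted so it only overwrites a row when the incoming last_modified_timestamp is newer, and psycopg2's execute_batch can replace the per-row execute loop. The table and column names here are placeholders.

import psycopg2
from psycopg2.extras import execute_batch

# Placeholder schema details -- substitute the real FACT_TABLE / DATABASE_COLUMNS.
TABLE = 'fact_table'
COLUMNS = ['creation_timestamp', 'id', 'last_modified_timestamp', 'amount']
PRIMARY_KEY = ['creation_timestamp', 'id']

UPSERT_SQL = f"""
    INSERT INTO {TABLE} ({', '.join(COLUMNS)})
    VALUES ({', '.join(f'%({col})s' for col in COLUMNS)})
    ON CONFLICT ({', '.join(PRIMARY_KEY)})
    DO UPDATE SET {', '.join(f'{col} = EXCLUDED.{col}' for col in COLUMNS)}
    WHERE {TABLE}.last_modified_timestamp < EXCLUDED.last_modified_timestamp;
"""

def load_updates(records, conn):
    # records is a list of row dicts, e.g. df.where(pd.notnull(df), None).to_dict('records')
    with conn.cursor() as cursor:
        # Sends the rows in batches instead of one execute() round trip per row.
        execute_batch(cursor, UPSERT_SQL, records, page_size=500)
    conn.commit()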

Adding values to a newly inserted column in an existing table in PostgreSQL 9.3

I created a table named "collegetable":
create table collegetable (stid integer primary key not null,stname
varchar(50),department varchar(10),dateofjoin date);
I provided values for each column, then added a new column named "cgpa" and tried to add values for it in one shot using:
WITH col(stid, cgpa) as
( VALUES((1121,8.01),
(1131,7.12),
(1141,9.86))
)
UPDATE collegetable as colldata
SET cgpa = col.cgpa
FROM col
WHERE colldata.stid = col.stid;
and got this error:
ERROR: operator does not exist: integer = record
LINE 9: WHERE colldata.stid = col.stid;
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
Please help in solving this. Thanks in advance.
The WITH clause only defines the names of the columns, not their data types:
with col (stid, cgpa) as (
...
)
update ...;
For details, see the tutorial and the full reference in the PostgreSQL documentation.
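For illustration only (this corrected statement is my sketch, not part of the original answer): the integer = record error comes from the extra pair of parentheses around each VALUES row, which turns the whole row into a single record column. Written as plain (stid, cgpa) pairs and run through psycopg2, as used elsewhere on this page, the update looks like this:

import psycopg2

# Placeholder connection settings -- adjust to your environment.
conn = psycopg2.connect(dbname="college", user="postgres")
cur = conn.cursor()

# Each VALUES row is a plain (stid, cgpa) pair. If a row were wrapped in an
# extra pair of parentheses, col.stid would be a record rather than an integer,
# and the comparison below would fail with "operator does not exist: integer = record".
cur.execute("""
    WITH col (stid, cgpa) AS (
        VALUES (1121, 8.01),
               (1131, 7.12),
               (1141, 9.86)
    )
    UPDATE collegetable AS colldata
    SET cgpa = col.cgpa
    FROM col
    WHERE colldata.stid = col.stid
""")
conn.commit()

If the inferred types do not match the table (the WITH clause names the columns but does not declare their types), explicit casts such as 8.01::numeric can be added, as the HINT suggests.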

Cassandra - Get on CF with key returning 0 results, but key exists when retrieving whole table using pycassa

We have a table in Cassandra 1.2.0 that has a varint key. When we search for keys we can see that they exist.
Table description:
CREATE TABLE u (
key varint PRIMARY KEY,
) WITH COMPACT STORAGE AND
bloom_filter_fp_chance=0.010000 AND
caching='KEYS_ONLY' AND
comment='' AND
dclocal_read_repair_chance=0.000000 AND
gc_grace_seconds=864000 AND
read_repair_chance=1.000000 AND
replicate_on_write='true' AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'SnappyCompressor'};
Select key from u limit 10;
key
12040911
60619595
3220132
4602232
3997404
6312372
1128185
1507755
1778092
4701841
When I try to get the row for key 60619595, it works fine:
cqlsh:users> select key from u where key = 60619595;
key
60619595
cqlsh:users> select key from u where key = 3997404;
The same query for key 3997404 returns no rows. However, when I use pycassa to get the whole table, I can access the row:
import pycassa
from struct import *
from pycassa.types import *
from urlparse import urlparse
import operator
userspool = pycassa.ConnectionPool('users');
userscf = pycassa.ColumnFamily(userspool, 'u');
users = {}
u = list(userscf.get_range())
for r in u:
    users[r[0]] = r[1]
print users[3997404]
returns the correct result.
What am I doing wrong? I cannot see what the error is.
Any help would be appreciated,
Regards
Michael.
PS:
I should say that in pycassa, when I try:
userscf.get(3997404)
I get:
  File "test.py", line 10, in <module>
    userscf.get(3997404)
  File "/usr/local/lib/python2.7/dist-packages/pycassa/columnfamily.py", line 655, in get
    raise NotFoundException()
pycassa.cassandra.ttypes.NotFoundException: NotFoundException(_message=None)
It seems to happen with ints that are smaller than the average.
You are mixing CQL and Thrift-based queries, which do not always mix well. CQL abstracts the underlying storage rows, whereas Thrift deals directly with them.
This is a problem we are having in our project. I should have added that
select key from u where key = 3997404;
cqlsh:users>
returns 0 results, even though when we run select * from u in cqlsh, or get the whole table in pycassa, we see the row with the key 3997404.
Sorry for the confusion.
Regards
D.

Save stored procedure result as file

I have a stored procedure that PRINTs certain results, like:
PRINT '-- Start Transection--'
PRINT 'Transection No = ' + @TransectionId
...
...
PRINT 'Transection Success'
PRINT '-- End Transection--'
Is it possible to save the printed result to a file when calling the procedure from the UI? After that, we have to mail that file to the user and also let them download it.
PRINT is for logging and debugging purposes and shouldn't be used to return anything to the caller.
Here's a suggestion: instead of PRINTing, write into a logging table and return a loggingID. Then, query this table from your app and write to a file.
Example: create two tables
CREATE TABLE Logging
(
LoggingID int IDENTITY(1,1) PRIMARY KEY,
Created datetime
)
CREATE TABLE LoggingDetail
(
LoggingDetailID int IDENTITY(1,1) PRIMARY KEY,
LoggingID int FOREIGN KEY REFERENCES Logging,
LoggingText varchar(500)
)
At the beginning of a transaction, create a new loggingID:
INSERT INTO Logging (Created) VALUES (GETUTCDATE())
DECLARE @loggingID INT = @@IDENTITY
Instead of PRINTing the logging messages, do something like:
INSERT INTO LoggingDetail (LoggingID, LoggingText) VALUES (@loggingID, '-- Start Transection--')
At the end of your sproc, return @loggingID to the caller. You can now retrieve the log messages from the LoggingDetail table and write them to a file:
SELECT LoggingText FROM LoggingDetail WHERE LoggingID=<loggingID> ORDER BY LoggingDetailID
It might be a good idea to encapsulate the INSERTs in separate sprocs. Those sprocs can then write to the log tables and also PRINT the log messages.
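On the app side, the retrieval step could look roughly like this (a sketch only: the procedure name, connection string, and the assumption that the sproc SELECTs the new @loggingID back to the caller are illustrative, and pyodbc is just one way to reach SQL Server from Python):

import pyodbc

# Placeholder connection string -- adjust to your environment.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes")
cursor = conn.cursor()

# Hypothetical sproc: assumed to write its log lines to LoggingDetail and then
# SELECT the @loggingID it created (and to use SET NOCOUNT ON so that SELECT
# is the first result set returned).
cursor.execute("EXEC dbo.ProcessTransaction @TransectionId = ?", "TX-1001")
logging_id = cursor.fetchone()[0]
conn.commit()

# Fetch the log lines for that run and write them to a file that can then be
# mailed to the user or offered for download.
cursor.execute(
    "SELECT LoggingText FROM LoggingDetail WHERE LoggingID = ? ORDER BY LoggingDetailID",
    logging_id)
with open("transaction_%d.log" % logging_id, "w") as out:
    for (text,) in cursor.fetchall():
        out.write(text + "\n")

conn.close()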