Save Table from MySQL as dictionary in Python with MySQL Connector - mysql-connector

I'm trying to get values from a MySQL table and save them as a dictionary in Python, using MySQL Connector since I couldn't manage to install anything else...
My problem is that it prints all key-value pairs, but only the last one/last row gets saved in the dictionary f. How do I get it to save every key-value pair in the dictionary?
Thanks in advance!
import mysql.connector

conn = mysql.connector.connect(user="F", password="", host="localhost", db="f")
mycursor = conn.cursor(dictionary=True)
query = "SELECT * FROM test"
mycursor.execute(query)
f = {}
for row in mycursor:
    f = {row["JT"]: row["f"]}  # rebinds f to a brand-new one-entry dict on every row
    print(f)

I found the solution:
for row in mycursor:
    f[row['JT']] = row["f"]
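
For reference, the same mapping can also be built in a single pass with a dict comprehension; this is a minimal sketch assuming the same test table and the JT / f column names from the question:
import mysql.connector

conn = mysql.connector.connect(user="F", password="", host="localhost", db="f")
mycursor = conn.cursor(dictionary=True)
mycursor.execute("SELECT * FROM test")

# Build the whole mapping in one pass instead of reassigning inside the loop
f = {row["JT"]: row["f"] for row in mycursor}
print(f)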

Related

pymongo - Upsert collection to add object with nested objects and nested data

I have spent weeks researching and am finally asking the question.
Using python/pymongo I am stuck at the simple step of just adding a nested object with nested data to an existing object or nested object, so I have not even made it to the upsert yet. I have code that confirms whether the collection "serial_number" exists and creates it if not -- however I am stuck at adding nested objects to an existing or newly created collection. Looking for an example to build from, not for this forum to solve the full solution :-)
Attached is a screenshot of the desired layout. Use case: several cars are available to be rented.
MongoDB Database with a Collection that is a car's serial number and all activities related to that car
a. Need to capture details of each rental activity - nested data under each serial number.
b. Need to capture maintenance details for each car - nested data under each serial number.
When either "a" or "b" occurs, I need to check that the record is not already there; if not, add a new nested object that contains nested data.
I have tried for loops and numerous postings referring to collections and sub-collections -- all seem to require highly detailed code to parse the dictionary (which comes from semi-structured, inconsistent sources) and add it to a collection. I cannot find a starting point for adding to an existing object or nested object within an existing collection.
#Assume mongodb is structured like this:
#rent_activity(DB)>vehicle(document)>serial_number(object)>car_details(object under serial_number)
#rent_activity(DB)>vehicle(document)>serial_number(object)>rental_activity(object under serial_number)
#rent_activity(DB)>vehicle(document)>serial_number(object)>maintenance(object under serial_number)
#The example below is attempting to insert new objects with nested arrays to the car_details object
import os, sys
import pymongo
import csv
import boto3
import time
import psycopg2
import pandas as pd
import pandas.io.sql as sqlio
from datetime import datetime
from pymongo import errors

my_connection = psycopg2.connect(dbname='xxx', host='xxx', port=xxx, user='xxx', password='xxx')
client = pymongo.MongoClient("xxxxxxx")
mydb = client["rent_activity"]  # This is the mongodb database
collection = mydb["vehicle"]
car_details_sql = "SELECT serial_number, mileage, fuel_level, other FROM car_details"
data = sqlio.read_sql_query(car_details_sql, my_connection)  # Create dataframe from query results

try:
    # Goal is to add each line from the dataframe result returned from the SQL query as a record
    # under collection "vehicle", object "serial_number", nested object "car_details"
    for item in data.to_dict(orient="records"):  # For each row, transform to dict
        mycarkey = str(item.get("purchase_date")) + "_" + str(item.get("serial_number"))  # Create unique string
        car_number = "car_" + mycarkey
        finaldata = {}  # Create empty dict
        finaldata[mycarkey] = item  # Add outer unique key to existing dict for this set of data
        finaldata.update({"_id": car_number})  # Define custom object _id for this set of data
        collection.insert_one(finaldata)
        # This inserts to the collection "vehicle"
        # What is needed is to nest this to:
        #   rent_activity(DB)>vehicle(document)>serial_number(object)>car_details(object under serial_number)
        # NOTE: If I simply add another outer key "serial_number" I receive error key already exists
        # ..so the issue is how to add an object with nested values beneath an existing object??
except Exception as e:
    raise e
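
No answer is posted above, but for orientation, the usual MongoDB pattern for this shape is one document per car, written with update_one(..., upsert=True) and dot-notation/$set/$push so new nested objects land beneath the existing document instead of clashing with an existing outer key. A minimal sketch under that assumption -- the field values and the exact document shape below are illustrative guesses, not the poster's confirmed schema:
import pymongo

client = pymongo.MongoClient("xxxxxxx")
collection = client["rent_activity"]["vehicle"]

serial_number = "SN123"                              # hypothetical serial number
car_details = {"mileage": 42000, "fuel_level": 0.5}  # hypothetical nested data
rental = {"rented_on": "2023-01-01", "days": 3}      # hypothetical rental record

# One document per car: create it if missing, otherwise update it in place
collection.update_one(
    {"_id": serial_number},
    {
        # $set here replaces/creates the whole car_details object;
        # dot notation like "car_details.mileage" would update a single nested field
        "$set": {"car_details": car_details},
        # $push appends to a nested array, creating it on first use
        "$push": {"rental_activity": rental},
    },
    upsert=True,
)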

Tabpy and Postgres

I'm experimenting with Tabpy in Tableau and writing data back to a database via a user's selection in a dashboard.
My data connections work fine. However, I'm having trouble passing a variable back into the SQL query and getting an error saying the "variable name" doesn't exist in the table that I'm trying to update. Here is my code.
The error states that "dlr_status" does not exist on the table. This is the variable I'm trying to pass back to the query. The database is a Postgres database. Any help is greatly appreciated. I've been researching this for several days and can't find anything.
SCRIPT_STR("
import psycopg2
import numpy as np
from datetime import date
con = psycopg2.connect(
dbname='edw',
host='serverinfo',
port='5439',
user='username',
password='userpassword')
dlr_no_change = _arg1[0]
dlr_status = _arg2[0]
update_trigger = _arg3[0]
sql = '''update schema.table set status = dlr_status where dlr_no = dlr_no_change'''
if update_trigger == True:
cur = con.cursor()
cur.execute(sql)
cur.commit",
ATTR([Dlr No]), ATTR([dlr_status]), ATTR([Update_Now]))
Your commit is missing "()", and it needs to be called on the connection rather than the cursor: con.commit(). Or add con.autocommit = True after creating the connection if you don't want to commit each step.
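
On the original error: the variable names are embedded in the SQL string as literal identifiers, so Postgres tries to resolve dlr_status as a column of the table. One way around that is psycopg2's parameter binding; below is a minimal sketch of just the Python side, wrapped in a hypothetical helper so it can run outside Tableau (schema.table and the credentials are placeholders taken from the question):
import psycopg2

def update_status(dlr_no_change, dlr_status, update_trigger):
    # Connection details mirror the question; replace with real credentials
    con = psycopg2.connect(dbname='edw', host='serverinfo', port='5439',
                           user='username', password='userpassword')
    # %s placeholders let psycopg2 bind the Python values at execute time,
    # so Postgres no longer tries to resolve dlr_status as a column name
    sql = "UPDATE schema.table SET status = %s WHERE dlr_no = %s"
    if update_trigger:
        with con.cursor() as cur:
            cur.execute(sql, (dlr_status, dlr_no_change))
        con.commit()
    con.close()

# Inside the Tableau calculation this would be called with _arg1[0], _arg2[0], _arg3[0]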

How to use pyodbc to migrate tables from MS Access to Postgres?

I need to migrate tables from MS Access to Postgres. I'd like to use pyodbc to do this as it allows me to connect to the Access database using python and query the data.
The problem I have is that I'm not exactly sure how to programmatically create a table with the same schema, other than building a SQL statement with string formatting. pyodbc can list all of the fields, field types and field lengths, so I could create a long SQL statement with all of the relevant information, but how can I do this for a bunch of tables? Would I need to build SQL string statements for each table?
import pyodbc

access_conn_str = (r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; ' r'DBQ=C:\Users\bob\access_database.accdb;')
access_conn = pyodbc.connect(access_conn_str)
access_cursor = access_conn.cursor()

postgres_conn_str = ("DRIVER={PostgreSQL Unicode};" "DATABASE=access_database;" "UID=user;" "PWD=password;" "SERVER=localhost;" "PORT=5433;")
postgres_conn = pyodbc.connect(postgres_conn_str)
postgres_cursor = postgres_conn.cursor()

table_ditc = {}
row_dict = {}
for row in access_cursor.columns(table='table1'):
    row_dict[row.column_name] = [row.type_name, row.column_size]
table_ditc['table1'] = row_dict

for table, values in table_ditc.items():
    print(f"Creating table for {table}")
    access_cursor.execute(f'SELECT * FROM {table}')
    result = access_cursor.fetchall()
    postgres_cursor.execute(f'''CREATE TABLE {table} (Do I just put a bunch of string formatting in here?);''')
    postgres_cursor.executemany(f'INSERT INTO {table} (Do I just put a bunch of string formatting) VALUES (string formatting?)', result)
    postgres_conn.commit()
As you can see, with pyodbc I'm not exactly sure how to build the SQL statements. I know I could build a long string by hand, but if I were doing a bunch of different tables, with different fields etc. that would not be realistic. Is there a better, easier way to create the table and insert rows based off of the schema of the Access database?
I ultimately ended up using a combination of pyodbc and pywin32. pywin32 is "basically a very thin wrapper of python that allows us to interact with COM objects and automate Windows applications with python" (quoted from the second link below).
I was able to programmatically interact with Access and export the tables directly to Postgres with DoCmd.TransferDatabase
https://learn.microsoft.com/en-us/office/vba/api/access.docmd.transferdatabase
https://pbpython.com/windows-com.html
import win32com.client
import pyodbc
import logging
from pathlib import Path

# access_database_location, pg_user, pg_pwd and pg_port are assumed to be defined elsewhere
conn_str = (r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; ' rf'DBQ={access_database_location};')
conn = pyodbc.connect(conn_str)
cursor = conn.cursor()

a = win32com.client.Dispatch("Access.Application")
a.OpenCurrentDatabase(access_database_location)

table_list = []
for table_info in cursor.tables(tableType='TABLE'):
    table_list.append(table_info.table_name)

for table in table_list:
    logging.info(f"Exporting: {table}")
    acExport = 1
    acTable = 0
    db_name = Path(access_database_location).stem.lower()
    a.DoCmd.TransferDatabase(acExport, "ODBC Database", "ODBC;DRIVER={PostgreSQL Unicode};" f"DATABASE={db_name};" f"UID={pg_user};" f"PWD={pg_pwd};" "SERVER=localhost;" f"PORT={pg_port};", acTable, f"{table}", f"{table.lower()}_export_from_access")
    logging.info(f"Finished Export of Table: {table}")
    logging.info("Creating empty table in EGDB based off of this")
This approach seems to be working for me. I like how the creation of the table/fields as well as insertion of data is all handled automatically (which was the original problem I was having with pyodbc).
If anyone has better approaches I'm open to suggestions.
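
For completeness, the string-building route the question asks about can also be driven entirely from the metadata pyodbc returns. A minimal sketch, assuming a simple and deliberately incomplete mapping of Access/ODBC type names to Postgres types -- the TYPE_MAP and the copy_table function below are illustrative, not part of the original code:
import pyodbc

# Hypothetical, partial mapping of Access/ODBC type names to Postgres types
TYPE_MAP = {"VARCHAR": "varchar", "LONGCHAR": "text", "INTEGER": "integer",
            "COUNTER": "integer", "DOUBLE": "double precision",
            "DATETIME": "timestamp", "BIT": "boolean"}

def copy_table(access_cursor, postgres_cursor, table):
    cols = list(access_cursor.columns(table=table))
    # Build the column definitions from the Access metadata, defaulting unknown types to text
    col_defs = ", ".join(
        f'"{c.column_name}" {TYPE_MAP.get(c.type_name.upper(), "text")}' for c in cols
    )
    postgres_cursor.execute(f'CREATE TABLE "{table.lower()}" ({col_defs});')

    access_cursor.execute(f'SELECT * FROM {table}')
    rows = access_cursor.fetchall()
    placeholders = ", ".join("?" for _ in cols)  # pyodbc uses ? parameter markers
    postgres_cursor.executemany(
        f'INSERT INTO "{table.lower()}" VALUES ({placeholders})', rows
    )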

Import data (arrays) to Quicksight

How can I import array data into Quicksight from PostgreSQL? For example {1,2,3,4,5}. I tried to import all data into Quicksight but it doesn't recognize the arrays. However, if I download the data as CSV from PostgreSQL and then import the local CSV file into Quicksight, it recognizes the arrays as strings.
Have you tried converting the data type using a custom SQL command that will convert the array? https://docs.aws.amazon.com/quicksight/latest/user/adding-a-SQL-query.html
SELECT column_name->'name' AS name
FROM table_name
I achieved that with custom SQL + unnest() on the column with the array; the difference is that you will then have to keep in mind that some rows are not unique.
You can use array_join(array, ';'). This will show your array elements in QuickSight separated by ';', like [1;2;3;4;5].

PostgreSQL: Import columns into table, matching key/ID

I have a PostgreSQL database. I had to extend an existing, big table with a few more columns.
Now I need to fill those columns. I thought I could create a .csv file (out of Excel/Calc) which contains the IDs / primary keys of the existing rows - and the data for the new, empty fields. Is it possible to do so? If it is, how?
I remember doing exactly this pretty easily using Microsoft SQL Server Management Studio, but for PostgreSQL I am using pgAdmin (though I am of course willing to switch tools if that would be helpful). I tried the import function of pgAdmin, which uses PostgreSQL's COPY, but it seems COPY isn't suitable as it can only create whole new rows.
Edit: I guess I could write a script which loads the CSV and iterates over the rows, using UPDATE. But I don't want to reinvent the wheel.
Edit 2: I've found this question here on SO which provides an answer using a temp table. I guess I will use it - although it's more of a workaround than an actual solution.
PostgreSQL can import data directly from CSV files with COPY statements; this will, however, only work for new rows, as you stated.
Instead of creating a CSV file you could just generate the necessary SQL UPDATE statements.
Suppose this would be the CSV file
PK;ExtraCol1;ExtraCol2
1;"foo",42
4;"bar",21
Then just produce the following
UPDATE my_table SET ExtraCol1 = 'foo', ExtraCol2 = 42 WHERE PK = 1;
UPDATE my_table SET ExtraCol1 = 'bar', ExtraCol2 = 21 WHERE PK = 4;
You seem to work under Windows, so I don't really know how to accomplish this there (probably with PowerShell), but under Unix you could generate the SQL from a CSV easily with tools like awk or sed. An editor with regular expression support would probably suffice too.
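
Since the question already floats the idea of a small script that loads the CSV and issues UPDATEs, here is a minimal sketch of that route in Python with csv and psycopg2 -- the file name, connection details and table/column names are placeholders taken from the example above:
import csv
import psycopg2

conn = psycopg2.connect(dbname="mydb", user="user", password="password", host="localhost")

with conn, conn.cursor() as cur, open("extra_columns.csv", newline="") as f:
    reader = csv.DictReader(f, delimiter=";")
    for row in reader:
        # Parameterised UPDATE, one statement per CSV row, matched on the primary key
        cur.execute(
            "UPDATE my_table SET ExtraCol1 = %s, ExtraCol2 = %s WHERE PK = %s",
            (row["ExtraCol1"], row["ExtraCol2"], row["PK"]),
        )
# The connection context manager commits on success and rolls back on error
conn.close()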