Error when trying to import with CSV file format in Cloud SQL - google-cloud-storage

HTTPError 400: Unknow export file type was thrown when I tried to import a CSV file from my Cloud Storage bucket into my Cloud SQL database. Any idea what I missed?
Reference:
gcloud sql import csv

CSV import is not supported by Cloud SQL for SQL Server. As mentioned here:
In Cloud SQL, SQL Server currently supports importing databases using
SQL and BAK files.
CSV import is, however, supported by the MySQL and PostgreSQL versions of Cloud SQL.
You could try one of the following solutions:
Change the database engine to either PostgreSQL or MySQL (where CSV files are supported).
If the data in your CSV file came from an on-premises SQL Server table, you can create an SQL file from it, then import that file into Cloud SQL for SQL Server.
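As a rough illustration of the second option, here is a minimal Python sketch that turns CSV text into a file of INSERT statements. It is only one possible way to produce the SQL file and not necessarily what the answer had in mind. The function names and the table name dbo.people are hypothetical, and the sketch assumes that the first CSV row holds the column names, that the target table already exists, and that SQL Server can implicitly convert the quoted values to the column types.

```python
import csv
import io


def sql_literal(value):
    """Render one CSV field as a T-SQL literal: empty field -> NULL, else N'...'."""
    if value == "":
        return "NULL"
    # Single quotes inside a T-SQL string literal are escaped by doubling them.
    return "N'" + value.replace("'", "''") + "'"


def csv_to_insert_statements(csv_text, table):
    """Turn CSV text (first row = column names) into one INSERT per data row.

    `table` is used as given, so it must be a trusted name (hypothetical here).
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Bracket-quote column names; a literal ']' is escaped as ']]'.
    columns = ", ".join("[" + name.replace("]", "]]") + "]" for name in header)
    statements = []
    for row in reader:
        values = ", ".join(sql_literal(field) for field in row)
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return "\n".join(statements)
```

The returned text can be written to a .sql file, uploaded to the Cloud Storage bucket, and imported as an SQL file instead of a CSV file.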

Related

export Amazon RDS into S3 or locally

I am using Amazon RDS Aurora PostgreSQL 10.18, and I need to export specific tables with more than 50,000 rows to a CSV file (either locally or into an S3 bucket). I have tried several approaches, but all of them failed:
I tried the export-to-CSV button in the query editor after selecting all rows, but the API responded that the data was too large to return.
I tried aws_s3.query_export_to_s3, but got ERROR: (credentials stored with the database cluster can’t be accessed Hint: Has the IAM role Amazon Resource Name (ARN) been associated with the feature-name "s3Export")
I tried taking a snapshot of our instance and exporting it to an S3 bucket, but ended up with the error (The specified db snapshot engine mode isn’t supported and can’t be exported)

Importing Csv file from GCS to postgres Cloud SQL instance invalid input syntax error

When importing a CSV file from Cloud Storage into Cloud SQL for PostgreSQL using Cloud Composer (Airflow), I would like to remove the header or skip rows automatically (in my DAG operator, CloudSQLImportInstanceOperator), but I keep getting an error. It seems CloudSQLImportInstanceOperator doesn't support skipping rows. How can I resolve this?

How to read tables from synapse database tables using pyspark

I am a newbie to Azure Synapse, and I have to work in an Azure Spark notebook. One of my colleagues connected the on-premises database using an Azure linked service. I have written a test framework for comparing the on-premises data with the data lake (curated) data, but I don't understand how to read those tables using PySpark.
Here is my linked service data structure.
Here are my linked service names and database name.
You can read any file stored in a Synapse linked location as a table by using the Azure Synapse Dedicated SQL Pool Connector for Apache Spark.
First, read the file that you want to expose as a table in Synapse. Use the code below to read the file.
%%pyspark
df = spark.read.load('abfss://sampleadls2@sampleadls1.dfs.core.windows.net/business.csv', format='csv', header=True)
Then convert this file into a table using the code below:
%%pyspark
spark.sql("CREATE DATABASE IF NOT EXISTS business")
df.write.mode("overwrite").saveAsTable("business.data")
Refer to the image below.
Now you can run any Spark SQL command on this table as shown below:
%%pyspark
data = spark.sql("SELECT * FROM business.data")
display(data)
See the output in the image below.

o110.pyWriteDynamicFrame. null

I have created a visual job in AWS Glue where I extract data from Snowflake, and my target is a PostgreSQL database in AWS.
I have been able to connect to both Snowflake and PostgreSQL, and I can preview data from both.
I have also been able to get data from Snowflake, write it to S3 as CSV, and then upload that CSV to PostgreSQL.
However, when I try to get data from Snowflake and push it directly to PostgreSQL, I get the error below:
o110.pyWriteDynamicFrame. null
So this means that you can read the data from Snowflake into a DataFrame, but writing the data from this DataFrame to PostgreSQL fails.
You need to check the AWS Glue logs to better understand why the write to PostgreSQL is failing.
Please check that you have the right version of the JARs (needed by PostgreSQL), compatible with Scala (on the AWS Glue side).

loading one table from RDS / postgres into Redshift

We have a Redshift cluster that needs one table from one of our RDS / postgres databases. I'm not quite sure of the best way to export that data and bring it in, or what the exact steps should be.
In piecing together various blogs and articles, the consensus appears to be using pg_dump to copy the table to a CSV file, then copying it to an S3 bucket, and from there using the Redshift COPY command to bring it into a new table. That's my high-level understanding, but I am not sure what the command-line switches should be, or the actual details. Is anyone doing this currently, and if so, is what I have above the 'recommended' way to do a one-off import into Redshift?
It appears that you want to:
Export from Amazon RDS PostgreSQL
Import into Amazon Redshift
From Exporting data from an RDS for PostgreSQL DB instance to Amazon S3 - Amazon Relational Database Service:
You can query data from an RDS for PostgreSQL DB instance and export it directly into files stored in an Amazon S3 bucket. To do this, you use the aws_s3 PostgreSQL extension that Amazon RDS provides.
This will save a CSV file into Amazon S3.
You can then use the Amazon Redshift COPY command to load this CSV file into an existing Redshift table.
You will need some way to orchestrate these operations, which would involve running a command against the RDS database, waiting for it to finish, then running a command in the Redshift database. This could be done via a Python script that connects to each database in turn (e.g. via psycopg2) and runs the command.
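The orchestration described above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested production script: the helper names are hypothetical, the bucket, key, region and IAM role are placeholders supplied by the caller, the aws_s3 extension is assumed to be installed on the RDS instance with an IAM role that allows the S3 export, and the Redshift table is assumed to exist already. The connections are expected to be open DB-API connections, such as those returned by psycopg2.connect.

```python
import re

# Plain or schema-qualified table names only; anything else is rejected.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def _check_identifier(name):
    """Guard against SQL injection through the table name."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe table name: {name!r}")
    return name


def build_export(table, bucket, key, region):
    """Statement for RDS: export the table to S3 as CSV via the aws_s3 extension."""
    sql = (
        "SELECT * FROM aws_s3.query_export_to_s3("
        "%s, aws_commons.create_s3_uri(%s, %s, %s), options := 'format csv')"
    )
    query = f"SELECT * FROM {_check_identifier(table)}"
    return sql, (query, bucket, key, region)


def build_copy(table, bucket, key, iam_role):
    """Statement for Redshift: load the exported CSV into an existing table."""
    sql = f"COPY {_check_identifier(table)} FROM %s IAM_ROLE %s FORMAT AS CSV"
    return sql, (f"s3://{bucket}/{key}", iam_role)


def export_then_copy(rds_conn, redshift_conn, src_table, dst_table,
                     bucket, key, region, iam_role):
    """Run the export on RDS first, then run COPY on Redshift."""
    with rds_conn.cursor() as cur:
        cur.execute(*build_export(src_table, bucket, key, region))
        cur.fetchall()  # read the result row before moving on to Redshift
    rds_conn.commit()
    with redshift_conn.cursor() as cur:
        cur.execute(*build_copy(dst_table, bucket, key, iam_role))
    redshift_conn.commit()
```

A caller would open one connection to the RDS database and one to the Redshift cluster, pass both to export_then_copy, and close them afterwards. Keeping the statement builders separate from the connection handling makes it possible to inspect the generated SQL without touching either database.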