Google BigQuery stream to PostgreSQL

I'm using Google BigQuery for OLAP, and plan to provision Google Cloud SQL (Postgres) for OLTP.
My plan is to stream data directly from Google BigQuery to Postgres.
I tried googling for a solution, but the only option I found uses batch files.
Is a streaming solution from Google BigQuery to PostgreSQL possible?

Currently there is no streaming read mechanism for accessing BigQuery data, as mentioned in this Stack Overflow post.
Hence, you'll have to read the data through a batch process.
You can also set up a manual ETL process to integrate BigQuery with PostgreSQL using Cloud Data Fusion, as mentioned in this article.
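For illustration, here is a minimal Python sketch of that batch approach, assuming a hypothetical `my_project.sales.orders` source table, a matching `orders` table in Cloud SQL, and placeholder connection details; you would schedule it with cron, Cloud Composer, or Data Fusion rather than run it continuously:

```python
# A minimal sketch of the batch approach, not a stream: the source table,
# target table, and connection details below are hypothetical.
from google.cloud import bigquery
import psycopg2
from psycopg2.extras import execute_values

bq = bigquery.Client()
rows = bq.query(
    "SELECT id, customer, amount FROM `my_project.sales.orders`"
).result()

pg = psycopg2.connect(
    host="10.0.0.5", dbname="oltp", user="app", password="secret"
)
with pg, pg.cursor() as cur:
    # Bulk-insert the exported rows into the Cloud SQL (Postgres) table.
    execute_values(
        cur,
        "INSERT INTO orders (id, customer, amount) VALUES %s",
        [(r["id"], r["customer"], r["amount"]) for r in rows],
    )
pg.close()
```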

Related

Dump Kafka to GCS

Can someone help me with the best possible way to dump data from a Kafka topic to Google Cloud Storage?
I would like to build a near-real-time pipeline capable of creating multiple files based on time or size cuts.
Data in the Kafka topic is in JSON format.
Confluent offers a GCS Kafka Connect sink, but you could also try using Google Dataflow / Apache Beam to do the same.
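The GCS sink connector or a Beam pipeline is the more robust route, but purely to illustrate the time/size-cut logic, here is a hand-rolled consumer sketch in Python; the topic, bucket, and thresholds are hypothetical:

```python
# Hand-rolled illustration only; topic, bucket, and thresholds are hypothetical.
import time

from google.cloud import storage   # pip install google-cloud-storage
from kafka import KafkaConsumer    # pip install kafka-python

consumer = KafkaConsumer("events", bootstrap_servers="broker:9092")
bucket = storage.Client().bucket("my-gcs-bucket")

MAX_BYTES = 16 * 1024 * 1024   # size cut
MAX_SECONDS = 300              # time cut

buffer, size, started = [], 0, time.time()
for msg in consumer:
    buffer.append(msg.value)   # JSON bytes straight from the topic
    size += len(msg.value)
    # Note: the time cut is only evaluated when a message arrives.
    if size >= MAX_BYTES or time.time() - started >= MAX_SECONDS:
        # One object per cut, named by the time range it covers.
        name = f"events/{int(started)}-{int(time.time())}.json"
        bucket.blob(name).upload_from_string(b"\n".join(buffer))
        buffer, size, started = [], 0, time.time()
```

The Confluent GCS sink expresses the same cuts through its flush.size and rotate.interval.ms settings.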

Spring Batch Azure Databricks example

I'm looking to implement a Spring Batch example that reads data from Azure Databricks and writes it to Postgres or another system.
How will Spring Batch connect to Azure Databricks, and how will the data be populated?
I don't see a built-in ItemReader or ItemWriter available yet. Is one planned?
You need to connect to Azure Databricks via a JDBC connection; for reading from one database and writing to another, you can refer to https://www.yawintutor.com/spring-boot-batch-read-from-database-and-write-to-database-example/.
Let me know if this helps you.

How to stream data from AWS MSK (Kafka) to Snowflake using MSK Connect

I'm trying to set up an MSK connector for Snowflake, and I could hardly find any documentation on how to do it. Unfortunately, the AWS support person also referred me to the Snowflake documentation page.
By following that, I can create an EC2 instance and spin up a connector, but I wanted to go serverless and use MSK Connect connectors.
I'm having a hard time with the connector properties for Snowflake, and AWS doesn't provide much information about them.
As described on the plugins page, you'd need to upload the Snowflake ZIP/JAR plugin to S3, where it is downloaded before the connector starts:
https://docs.aws.amazon.com/msk/latest/developerguide/msk-connect-plugins.html
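For example, registering the ZIP with boto3 and sketching the sink configuration could look like this; the bucket, object key, and Snowflake account details are placeholders, and the property names come from the standard Snowflake Kafka connector:

```python
# Registers the Snowflake connector ZIP (already uploaded to S3) as an MSK
# Connect custom plugin. Bucket, key, and Snowflake values are placeholders.
import boto3

kc = boto3.client("kafkaconnect", region_name="us-east-1")

plugin = kc.create_custom_plugin(
    name="snowflake-kafka-connector",
    contentType="ZIP",
    location={"s3Location": {
        "bucketArn": "arn:aws:s3:::my-connector-plugins",
        "fileKey": "snowflake-kafka-connector.zip",
    }},
)
print(plugin["customPluginArn"])  # reference this ARN when creating the connector

# Connector configuration passed to create_connector (values are placeholders).
connector_config = {
    "connector.class": "com.snowflake.kafka.connector.SnowflakeSinkConnector",
    "topics": "events",
    "snowflake.url.name": "myaccount.snowflakecomputing.com:443",
    "snowflake.user.name": "kafka_connector",
    "snowflake.private.key": "<private key>",
    "snowflake.database.name": "RAW",
    "snowflake.schema.name": "KAFKA",
    "value.converter": "com.snowflake.kafka.connector.records.SnowflakeJsonConverter",
}
```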

How to connect Tableau/BI tools to Delta Lake? (Without Databricks)

I am trying to migrate a data warehouse to Delta Lake. One thing I am struggling to figure out is how to connect to the Delta Lake (silver and gold) tables outside a Spark session. I want to be able to connect to these tables using BI tools like Tableau. I am not using Databricks, and I was wondering whether storing these tables in the Hive metastore could help. If not, could someone suggest an alternative approach, or tell me whether this is feasible at all?
You can run a Hive metastore and a Spark Thrift Server with open-source Spark and open-source Delta Lake (delta.io), then connect Tableau Desktop, for instance.
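As a minimal sketch of that setup, assuming open-source Spark with the delta-spark package, an external Hive metastore, and hypothetical database/table names and storage paths:

```python
# Registers an existing gold-layer Delta table in the Hive metastore so the
# Spark Thrift Server (and Tableau via its Spark SQL connector) can query it.
# Database, table, and path names are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("delta-hive-metastore")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .enableHiveSupport()   # use the external Hive metastore as the catalog
    .getOrCreate()
)

spark.sql("CREATE DATABASE IF NOT EXISTS gold")
spark.sql("""
    CREATE TABLE IF NOT EXISTS gold.sales
    USING DELTA
    LOCATION 's3a://lake/gold/sales'
""")
```

Then start the Spark Thrift Server (sbin/start-thriftserver.sh) with the same Delta and metastore configuration and point Tableau's Spark SQL connector at it.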

Best way to stream/logically replicate RDS Postgres data to kinesis

Our primary datastore is an RDS Postgres database. It would be nice if we could stream all changes that happen in Postgres to some sink, whether that's Kinesis, Elasticsearch, or any other data store.
We use Postgres 9.5, which has support for 'logical replication'. However, all the extensions that tap into this stream are blocked on RDS. There's a tutorial for streaming the MySQL RDS flavor to Kinesis; the Postgres equivalent would be ideal. Is this possible currently?
Have a look at https://github.com/disneystreaming/pg2k4j. It takes all changes made to your database and streams them to Kinesis. See the README for an example of how to set this up with RDS. We've been using it in production and have found it very useful for solving this exact problem. Disclaimer: I wrote https://github.com/disneystreaming/pg2k4j
Integrate a central Amazon Relational Database Service (Amazon RDS) for PostgreSQL database with other systems by streaming its modifications into Amazon Kinesis Data Streams. An earlier post, Streaming Changes in a Database with Amazon Kinesis, described how to integrate a central RDS for MySQL database with other systems by streaming modifications through Kinesis. In this post, I take it a step further and explain how to use an AWS Lambda function to capture the changes in Amazon RDS for PostgreSQL and stream those changes to Kinesis Data Streams.
https://aws.amazon.com/blogs/database/stream-changes-from-amazon-rds-for-postgresql-using-amazon-kinesis-data-streams-and-aws-lambda/
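Outside of Lambda, the same change-capture pattern boils down to reading a logical replication slot and forwarding each payload to Kinesis. A minimal Python sketch, assuming rds.logical_replication is enabled and a wal2json slot named kinesis_slot already exists (endpoints, credentials, and the stream name are placeholders):

```python
# Consumes a wal2json logical replication slot and forwards each change to
# Kinesis. The slot, DSN, and stream name are placeholders.
import boto3
import psycopg2
from psycopg2.extras import LogicalReplicationConnection

kinesis = boto3.client("kinesis", region_name="us-east-1")

conn = psycopg2.connect(
    "host=mydb.xxxx.us-east-1.rds.amazonaws.com dbname=app user=replicator password=secret",
    connection_factory=LogicalReplicationConnection,
)
cur = conn.cursor()
cur.start_replication(slot_name="kinesis_slot", decode=True)

def forward(msg):
    # msg.payload is the wal2json description of a committed change.
    kinesis.put_record(
        StreamName="pg-changes",
        Data=msg.payload.encode("utf-8"),
        PartitionKey="pg",
    )
    # Acknowledge, so Postgres can recycle WAL behind the slot.
    msg.cursor.send_feedback(flush_lsn=msg.data_start)

cur.consume_stream(forward)
```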