continuosly stream data from aws postgres to aws elastic search - postgresql

I want to stream the data from aws postgres to aws elastic search.
Note: This is not for migration but for continuously data flow as the record gets inserted/updated/deleted in postgres database

Related

AWS data migration from mongodb to dynamodb

my database is currently present in mongodb atlas.Can i use AWS DMS to directly migrate from atlas to dynamo db or do i have to first make a EC2 instance and make a mongodb server there and dump my data. I can't find any videos nor blogs related to it

DebeziumIO read with PostgresSQL to connect to RDS is not streaming data in Apache Beam

I have created apache beam pipeline which uses DebeziumIO - PostgreSQLconnector class to connect to postgresql in rds. The setup is working for local postgresql database with replication set to logical. AWS is setup with replication to logical and rds_replication role set to the user.
But the data stream is not captured by the pipeline.
I have checked with the embedded debezium with the postgresql connector class and it captures the change data.

Is it possible to connect to a Postgres DB from SageMaker Data Wrangler?

I set up a regular Postgres DB in AWS using the Amazon Relational Database Service (RDS). I would like to ingest this data using data wrangler for inspection and further processing.
Is this possible? I only see S3, Athena, Redshift and SnowFlake as the data ingestion options. Does this mean I must move the data from Postgresql to one of these 4 options before I can use Data Wrangler?
If it's not possible through data wrangler, can I connect to my Postgres through a Jupyter notebook, using a connection string or some kind of option like this? I'm looking to use the data for the SageMaker Feature Store.
Using Amazon RDS with DataWrangler is not possible (yet).
That said, you can connect to Postgres in Jupyter using Python and then use the Feature Store API to ingest data and use it for subsequent tasks.
Source: AWS employee

Migration from AWS Postgresql to ElasticSearch using AWS DMS

To improve the read performance, we are synching RDS Postgresql data to AWS ElasticSearch. Used AWS DMS for it. It's working fine. But Joining tables is a challenge. We need a solution to denormalize the data and put it in elastic search using AWS DMS. For example, In customer and order tables in PostgreSQL should be one record in ElasticSearch.

Amazon Aurora PostgreSQL SELECT INTO OUTFILE S3

We are trying to export data from an Amazon Aurora PostgreSQL database to an S3 buckets. The code being used is like this:
SELECT * FROM analytics.my_test INTO OUTFILE S3
's3-us-east-2://myurl/sampledata'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
MANIFEST ON
OVERWRITE ON;
All permissions have been set up but we get the error
SQL Error [42601]: ERROR: syntax error at or near "INTO" Position: 55
Does this only work with a MySQL database?
It is fairly new feature on Aurora Postgres, but it is possible to export the query result into a file on s3: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/postgresql-s3-export.html#postgresql-s3-export-file
The syntax is not the same as for MySQL though. For Postgres it is:
SELECT * from aws_s3.query_export_to_s3('select * from sample_table',
aws_commons.create_s3_uri('sample-bucket', 'sample-filepath', 'us-west-2')
);
I believe saving SQL select output data in S3 ONLY works for Amazon Aurora MySQL DB. I don't see any reference in the official documentation that mentions the same for Amazon Aurora PostgresSQL.
Here are snippets from official documentation that I referred to
Integrating Amazon Aurora MySQL with Other AWS Services
Amazon Aurora MySQL integrates with other AWS services so that you can
extend your Aurora MySQL DB cluster to use additional capabilities in
the AWS Cloud. Your Aurora MySQL DB cluster can use AWS services to do
the following:
Synchronously or asynchronously invoke an AWS Lambda function using
the native functions lambda_sync or lambda_async. For more
information, see Invoking a Lambda Function with an Aurora MySQL
Native Function.
Load data from text or XML files stored in an Amazon Simple Storage
Service (Amazon S3) bucket into your DB cluster using the LOAD DATA
FROM S3 or LOAD XML FROM S3 command. For more information, see Loading
Data into an Amazon Aurora MySQL DB Cluster from Text Files in an
Amazon S3 Bucket.
Save data to text files stored in an Amazon S3 bucket from your DB
cluster using the SELECT INTO OUTFILE S3 command. For more
information, see Saving Data from an Amazon Aurora MySQL DB Cluster
into Text Files in an Amazon S3 Bucket.
Automatically add or remove Aurora Replicas with Application Auto
Scaling. For more information, see Using Amazon Aurora Auto Scaling
with Aurora Replicas.
Integrating Amazon Aurora PostgreSQL with Other AWS Services
Amazon Aurora integrates with other AWS services so that you can
extend your Aurora PostgreSQL DB cluster to use additional
capabilities in the AWS Cloud. Your Aurora PostgreSQL DB cluster can
use AWS services to do the following:
Quickly collect, view, and assess performance for your Aurora
PostgreSQL DB instances with Amazon RDS Performance Insights.
Performance Insights expands on existing Amazon RDS monitoring
features to illustrate your database's performance and help you
analyze any issues that affect it. With the Performance Insights
dashboard, you can visualize the database load and filter the load by
waits, SQL statements, hosts, or users.
For more information about Performance Insights, see Using Amazon RDS
Performance Insights.
Automatically add or remove Aurora Replicas with Aurora Auto Scaling.
For more information, see Using Amazon Aurora Auto Scaling with Aurora
Replicas.
Configure your Aurora PostgreSQL DB cluster to publish log data to
Amazon CloudWatch Logs. CloudWatch Logs provide highly durable storage
for your log records. With CloudWatch Logs, you can perform real-time
analysis of the log data, and use CloudWatch to create alarms and view
metrics. For more information, see Publishing Aurora PostgreSQL Logs
to Amazon CloudWatch Logs.
Ther is no mention of saving data to S3 for PostgresSQL