Using Fluentd to transfer data to DocumentDB

I can use Fluentd to transfer data to MongoDB running on AWS EC2, but I can't transfer data to Amazon DocumentDB, the managed service that is compatible with MongoDB.
The following is the td-agent.conf that transfers the JSON files saved in /var/log/test/bulk/ to MongoDB.
<source>
  @type tail
  path /var/log/test/bulk/*
  tag bulk.*
  format json
  time_key time
  time_format '%F %T.%N %z %Z'
  pos_file /var/log/test/run/log-json.pos
  read_from_head true
  refresh_interval 5s
</source>
<match bulk.**>
  @type record_reformer
  tag test.${tag_parts[-3]}.${tag_parts[-2]}
</match>
<match test.**>
  @type copy
  <store>
    @type forest
    subtype mongo_replset
    <template>
      host hostname1:27017,hostname2:27017,hostname3:27017
      replica_set rs0
      database ${tag_parts[-2]}
      collection ${tag_parts[-1]}
      user ********
      password ********
      replace_dot_in_key_with __dot__
      <buffer>
        @type file
        path /var/log/test/buffer-mongo/${tag_parts[-2..-1]}
        chunk_limit_size 8m
        queued_chunks_limit_size 64
        flush_interval 1s
      </buffer>
    </template>
  </store>
</match>
When transferring to DocumentDB, I changed the host in the conf file above to the cluster endpoint, but the following error occurred.
[warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2021-09-14 10:26:56 +0900 chunk="5cbea78155b58ec0810e9fde94aa2355" error_class=Mongo::Error::NoServerAvailable error="No server is available matching preference: #<Mongo::ServerSelector::Primary:0x70233136498300 tag_sets=[] max_staleness=nil> using server_selection_timeout=30 and local_threshold=0.015"
Since TLS is enabled on the DocumentDB cluster, I suspect I need to specify rds-combined-ca-bundle.pem so that the connection uses TLS, but I don't know how to do that in td-agent.conf.
(When I tested writing to DocumentDB from Python using the linked example, the same error occurred when TLS was disabled.)
Can you please tell me how to write data to DocumentDB with TLS enabled?

Amazon DocumentDB clusters are deployed within an Amazon Virtual Private Cloud (Amazon VPC). They can be accessed directly by Amazon EC2 instances or other AWS services that are deployed in the same Amazon VPC. Is the client application running in the same VPC and security group as the DocumentDB cluster?
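If the network path is open, the remaining suspect is the TLS handshake. fluent-plugin-mongo exposes SSL options that can point at the DocumentDB CA bundle; the following is only a sketch of the <template> block, assuming the mongo_replset output honours the same ssl_* parameters as the plain mongo output, and that the bundle has been downloaded to /etc/td-agent/rds-combined-ca-bundle.pem (the path is an assumption):
<template>
  host <cluster endpoint>:27017
  replica_set rs0
  database ${tag_parts[-2]}
  collection ${tag_parts[-1]}
  user ********
  password ********
  ssl true
  ssl_verify true
  ssl_ca_cert /etc/td-agent/rds-combined-ca-bundle.pem
</template>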

Related

How to pull mongodb logs with Wazuh agent?

I added the following settings to /var/ossec/etc/ossec.conf and restarted the agent, but the logs are not showing up on the Kibana dashboard:
<localfile>
<log_format>syslog</log_format>
<location>/var/log/mongodb/mongod.log</location>
I performed a basic installation of Wazuh + MongoDB on the agent side, with the following results:
MongoDB writes to the syslog file located at /var/log/syslog by default.
Inside /var/log/mongodb/mongod.log there are the internal mongod daemon logs, which are more specific.
We can monitor these logs on the Wazuh agent with:
<localfile>
<log_format>syslog</log_format>
<location>/var/log/syslog</location>
</localfile>
This entry is included on the agent by default, but it is good to keep in mind.
The other one, as you pointed out, is:
<localfile>
<log_format>syslog</log_format>
<location>/var/log/mongodb/mongod.log</location>
</localfile>
The only thing I see is that the closing </localfile> tag is missing, but that could just be a copy-paste mistake; either way it is worth taking a look at /var/ossec/logs/ossec.log for errors.
With that configuration we can receive alerts like this:
** Alert 1595929148.661787: - syslog,access_control,authentication_failed,pci_dss_10.2.4,pci_dss_10.2.5,gpg13_7.8,gdpr_IV_35.7.d,gdpr_IV_32.2,hipaa_164.312.b,nist_800_53_AU.14,nist_800_53_AC.7,tsc_CC6.1,tsc_CC6.8,tsc_CC7.2,tsc_CC7.3,
2020 Jul 28 09:39:08 (ubuntu-bionic) any->/var/log/mongodb/mongod.log
Rule: 2501 (level 5) -> 'syslog: User authentication failure.'
2020-07-28T09:39:07.431+0000 I ACCESS [conn38] SASL SCRAM-SHA-1 authentication failed for root on admin from client 127.0.0.1:52244 ; UserNotFound: Could not find user "root" for db "admin"
These alerts are generated if we run mongo -u root (with a bad password) on the agent side.
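For example, a hypothetical invocation that would produce the authentication failure shown above (the user and password are just placeholders):
mongo admin -u root -p wrongpassword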

ActiveMQ Artemis JDBC Store: Add connection pool for DB connections?

I use the latest version of ActiveMQ Artemis with a MySQL database as the message store. After 8 hours my MySQL server times out the client DB connections. The AbstractJDBCDriver implementation in Artemis does not recognize this and throws an exception.
What can I do? I see no way to configure a DB connection pool with this implementation. The exception is:
at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174) [mysql-connector-java.jar:8.0.19]
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64) [mysql-connector-java.jar:8.0.19]
at com.mysql.cj.jdbc.ConnectionImpl.setAutoCommit(ConnectionImpl.java:2056) [mysql-connector-java.jar:8.0.19]
at org.apache.activemq.artemis.jdbc.store.drivers.AbstractJDBCDriver.stop(AbstractJDBCDriver.java:108) [artemis-jdbc-store-2.10.0.redhat-00004.jar:2.10.0.redhat-00004]
Same question here.
At this point the best thing to do is to restart the broker when this happens. There currently is no DB connection pool implementation, as you note.
In the latest Artemis release (2.16.0) there is support for connection-pooling.
Basic example with MySQL:
Create a new broker instance:
$ARTEMIS_HOME/bin/artemis create hosts/host0 --name host0 --user admin --password admin --require-login
Start the MySQL database and create a dedicated database and user:
CREATE DATABASE activemq CHARACTER SET utf8mb4;
CREATE USER 'activemq'@'%' IDENTIFIED WITH mysql_native_password BY 'activemq';
GRANT CREATE,SELECT,INSERT,UPDATE,DELETE,INDEX ON activemq.* TO 'activemq'@'%';
FLUSH PRIVILEGES;
Download and copy the specific JDBC driver:
curl -sL https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.22.tar.gz -o driver.tar.gz
tar xf driver.tar.gz && cp mysql-connector-java-8.0.22/mysql-connector-java-8.0.22.jar hosts/host0/lib/
Add JDBC storage with connection-pooling configuration:
<store>
  <database-store>
    <data-source-properties>
      <!-- All configuration options: https://commons.apache.org/proper/commons-dbcp/configuration.html -->
      <data-source-property key="driverClassName" value="com.mysql.cj.jdbc.Driver" />
      <data-source-property key="url" value="jdbc:mysql://localhost:3306/activemq" />
      <data-source-property key="username" value="activemq" />
      <data-source-property key="password" value="activemq" />
      <data-source-property key="poolPreparedStatements" value="true" />
      <!-- Avoid MySQL killing long-lived idle connections (wait_timeout) by recycling pooled connections after 30 minutes -->
      <data-source-property key="maxConnLifetimeMillis" value="1800000" />
    </data-source-properties>
    <bindings-table-name>BINDINGS</bindings-table-name>
    <message-table-name>MESSAGES</message-table-name>
    <large-message-table-name>LARGE_MESSAGES</large-message-table-name>
    <page-store-table-name>PAGE_STORE</page-store-table-name>
    <node-manager-store-table-name>NODE_MANAGER_STORE</node-manager-store-table-name>
  </database-store>
</store>
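For context, and as an assumption about the instance created above rather than something stated in the original answer: this <store> element goes inside the <core> element of hosts/host0/etc/broker.xml, switching persistence from the default file journal to JDBC:
<core xmlns="urn:activemq:core">
  <!-- ... existing core configuration ... -->
  <store>
    <database-store>
      <!-- data-source-properties and table names as shown above -->
    </database-store>
  </store>
</core>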

"The $changeStream stage is only supported on replica sets" error while using mongodb-source-connect

I get an error when running the Kafka MongoDB source connector.
I was trying to run connect-standalone with connect-avro-standalone.properties and MongoSourceConnector.properties so that Connect writes data that is written to MongoDB into a Kafka topic.
This is the command I ran:
bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties share/confluent-hub-components/mongodb-kafka-connect-mongodb/etc/MongoSourceConnector.properties
connect-avro-standalone.properties
# Sample configuration for a standalone Kafka Connect worker that uses Avro serialization and
# integrates with the Schema Registry. This sample configuration assumes a local installation of
# Confluent Platform with all services running on their default ports.
# Bootstrap Kafka servers. If multiple servers are specified, they should be comma-separated.
bootstrap.servers=localhost:9092
# The converters specify the format of data in Kafka and how to translate it into Connect data.
# Every Connect user will need to configure these based on the format they want their data in
# when loaded from or stored into Kafka
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
# The internal converter used for offsets and config data is configurable and must be specified,
# but most users will always want to use the built-in default. Offset and config data is never
# visible outside of Connect in this format.
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
# Local storage file for offset data
offset.storage.file.filename=/tmp/connect.offsets
# Confluent Control Center Integration -- uncomment these lines to enable Kafka client interceptors
# that will report audit data that can be displayed and analyzed in Confluent Control Center
# producer.interceptor.classes=io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor
# consumer.interceptor.classes=io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor
# These are provided to inform the user about the presence of the REST host and port configs
# Hostname & Port for the REST API to listen on. If this is set, it will bind to the interface used to listen to requests.
#rest.host.name=
#rest.port=8083
# The Hostname & Port that will be given out to other workers to connect to i.e. URLs that are routable from other servers.
#rest.advertised.host.name=
#rest.advertised.port=
# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include
# any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
# Replace the relative path below with an absolute path if you are planning to start Kafka Connect from within a
# directory other than the home directory of Confluent Platform.
plugin.path=share/java,/Users/anton/Downloads/confluent-5.3.2/share/confluent-hub-components
MongoSourceConnector.properties
name=mongo-source
connector.class=com.mongodb.kafka.connect.MongoSourceConnector
tasks.max=1
# Connection and source configuration
connection.uri=mongodb://localhost:27017
database=test
collection=test
This is the error:
[2020-01-02 18:55:11,546] ERROR WorkerSourceTask{id=mongo-source-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)
com.mongodb.MongoCommandException: Command failed with error 40573 (Location40573): 'The $changeStream stage is only supported on replica sets' on server localhost:27017. The full response is {"ok": 0.0, "errmsg": "The $changeStream stage is only supported on replica sets", "code": 40573, "codeName": "Location40573"}
MongoDB change streams are available only in a replica set setup. However, you can convert your standalone installation to a single-node replica set by following the steps below.
Locate the mongod.conf file and add the replica set details
Add the following replica set details to the mongod.conf file:
replication:
  replSetName: "<replica-set name>"
Example:
replication:
  replSetName: "rs0"
Note: for a brew-installed MongoDB the config file is at /usr/local/etc/mongod.conf.
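After editing the config file, mongod has to be restarted so that the replSetName takes effect; with a brew-managed installation that is typically something like the following (the formula name is an assumption and may differ):
brew services restart mongodb-community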
Initiate the replica set using rs.initiate()
Log in to the MongoDB shell and run rs.initiate(); this will start your replica set. The output looks like the following on a successful start:
> rs.initiate()
{
    "info2" : "no configuration specified. Using a default configuration for the set",
    "me" : "127.0.0.1:27017",
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1577545731, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1577545731, 1)
}
That's all; with these two simple steps you are running a MongoDB replica set with only one node.
Reference: https://onecompiler.com/posts/3vchuyxuh/enabling-replica-set-in-mongodb-with-just-one-node
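Once the replica set is initiated, the connector can also name it explicitly in its connection URI; a sketch using the rs0 name from the example above:
connection.uri=mongodb://localhost:27017/?replicaSet=rs0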
"The $changeStream stage is only supported on replica sets"
You need to make your MongoDB deployment a replica set in order to read the oplog.
https://dba.stackexchange.com/questions/243780/converting-mongodb-instance-from-standalone-to-replica-set-and-backing-up
This is what helped in my situation (macOS environment):
I. Install run-rs, a zero-config MongoDB runner that starts a replica set with no non-Node dependencies, not even MongoDB itself:
npm install -g run-rs   # or: yarn global add run-rs
II. Use this connection string:
mongodb://localhost:27017,localhost:27018,localhost:27019/YOUR_DB_NAME?replicaSet=rs&retryWrites=false

SQL0206N error when trying to connect HammerDB to DB2 using db2dsdriver.cfg file

I'm trying to generate some data for DB2 10.5 LUW using HammerDB v3.1, which is running on a remote Windows host. It is not possible to run HammerDB on the same host as DB2.
According to the HammerDB documentation I need to set up the IBM Data Server Driver for ODBC and CLI.
What I did:
Downloaded and set up the driver on the HammerDB host (v10.5fp10_ntx64_odbc_cli.zip), as described here.
Configured the db2dsdriver.cfg file:
<configuration>
  <dsncollection>
    <dsn alias="TPCC" name="<my database name>" host="<my host name>" port="50000"/>
    <!-- Long aliases are supported -->
    <dsn alias="longaliasname2" name="name2" host="server2.net1.com" port="55551">
      <parameter name="Authentication" value="SERVER_ENCRYPT"/>
    </dsn>
  </dsncollection>
  <databases>
    <database name="<my database name>" host="<my host name>" port="50000">
      <parameter name="CurrentSchema" value="OWNER1"/>
      .......
Added the environment variable DB2DSDRIVER_CFG_PATH:
set DB2DSDRIVER_CFG_PATH=C:\ProgramData\IBM\DB2\C_IBMDB2_CLIDRIVER_clidriver\cfg
Ran the HammerDB GUI, tried to build a schema, and received:
Error in Virtual User 1: [IBM][CLI Driver][DB2/LINUXX8664] SQL0206N "GLOBAL_VAR1" is not valid in the context where it is used. SQLSTATE=42703
This error happens because db2dsdriver.cfg contains excess information for your DSN on the Db2 client node.
To recover, you can either rename and recreate your db2dsdriver.cfg/db2cli.ini files, or you can edit the db2dsdriver.cfg file and remove the following stanza where it occurs for your DSN / database (take a backup as a precaution):
<sessionglobalvariables>
<parameter name="global_var1" value="abc"/>
</sessionglobalvariables>
I usually discard the default db2dsdriver.cfg/db2cli.ini and use a script to populate them. This is possible with the command-line tool db2cli, which has a variety of command-line parameters that let you write the cfg file stanzas for both DSNs and databases. Documentation here.
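As a rough sketch of that approach (TPCC and port 50000 come from the question; MYDB, the host name, and the credentials are placeholders, and the option names should be checked against the db2cli documentation):
db2cli writecfg add -database MYDB -host myhost.example.com -port 50000
db2cli writecfg add -dsn TPCC -database MYDB -host myhost.example.com -port 50000
db2cli validate -dsn TPCC -connect -user db2user -passwd db2password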

Define a database entry in DB2 db2dsdriver.cfg file that can access multiple databases

I currently use the CLI method of registering a node and then a database for that node (with CATALOG NODE / CATALOG DATABASE) to configure my DB2 CLI client to access our database server.
With a single database registration I effectively register the default database, but when I connect from my application using SQLDriverConnect I can use the "DATABASE=" option to connect to other databases available on the server.
I would like to switch to the much easier-to-manage db2dsdriver.cfg configuration file; however, I have been unable to configure it to allow a single DSN to access multiple databases.
Some code to help clarify. My DB2 server instance has two databases defined like:
CREATE DATABASE DB_1 ON /opt/data/DB_1 USING CODESET UTF-8 TERRITORY US COLLATE USING SYSTEM
CREATE DATABASE DB_2 ON /opt/data/DB_2 USING CODESET UTF-8 TERRITORY US COLLATE USING SYSTEM
I register this server with my client CLI using these commands:
CATALOG TCPIP NODE DB_NODE remote example.server.com server 50000
CATALOG DB DB_1 as DB_1 at node DB_NODE
With that setup I can perform the following from my CLI application:
rc = SQLDriverConnect(hdbc, NULL, "DSN=DB_1;UID=dbtest1;PWD=zebco5;DATABASE=DB_1",
SQL_NTS, outStr, 128, &outSize, SQL_DRIVER_NOPROMPT);
or if I want to use the DB_2 database:
rc = SQLDriverConnect(hdbc, NULL, "DSN=DB_1;UID=dbtest1;PWD=zebco5;DATABASE=DB_2",
SQL_NTS, outStr, 128, &outSize, SQL_DRIVER_NOPROMPT);
Note I did not need to change the DSN, merely the "DATABASE" connection option.
Recently I found the db2dsdriver.cfg configuration file, which I would rather use. To that end I created the following and uncataloged my node and database from the CLI:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<configuration>
  <dsncollection>
    <dsn alias="DB_1" name="DB_1" host="server.example.com" port="50000"/>
  </dsncollection>
  <databases>
    <database name="DB_1" host="server.example.com" port="50000"/>
    <database name="DB_2" host="server.example.com" port="50000"/>
  </databases>
</configuration>
I can connect with this:
rc = SQLDriverConnect(hdbc, NULL, "DSN=DB_1;UID=dbtest1;PWD=zebco5;DATABASE=DB_1",
SQL_NTS, outStr, 128, &outSize, SQL_DRIVER_NOPROMPT);
just fine, but now using this to connect to DB_2:
rc = SQLDriverConnect(hdbc, NULL, "DSN=DB_1;UID=dbtest1;PWD=zebco5;DATABASE=DB_2",
SQL_NTS, outStr, 128, &outSize, SQL_DRIVER_NOPROMPT);
results in this error:
SQL0843N The server name does not specify an existing connection. SQLSTATE=08003
I understand the error, but it seems like a regression in functionality compared with the old node/database registration mechanism.
I'm trying to determine whether the functionality I was using is supported with the configuration file and I am doing something wrong, or whether it just doesn't work that way.
Thank you for your help.
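One hedged workaround, not confirmed in this thread: since the <databases> section alone does not seem to satisfy the driver, defining an explicit alias per database in <dsncollection> should at least restore access to DB_2, at the cost of one DSN per database:
<dsncollection>
  <dsn alias="DB_1" name="DB_1" host="server.example.com" port="50000"/>
  <dsn alias="DB_2" name="DB_2" host="server.example.com" port="50000"/>
</dsncollection>
The application would then connect with DSN=DB_2 instead of overriding DATABASE= on the DB_1 DSN.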