Where can I find a complete list of replication slot options in PostgreSQL?

I am working on PG logical replication in Java and found a demo in the JDBC driver docs:
PGReplicationStream stream =
replConnection.getReplicationAPI()
.replicationStream()
.logical()
.withSlotName("demo_logical_slot")
.withSlotOption("include-xids", false)
.withSlotOption("skip-empty-xacts", true)
.start();
Then I can parse messages from the stream.
This is enough for most day-to-day needs, but now I want to know the transaction commit time.
With help from a question on Stack Overflow, I added .withSlotOption("include-timestamp", "on") and it works.
My question is: where can I find a complete list of these slot options, so they can be looked up conveniently instead of searching Google or Stack Overflow?

The available options depend on the logical decoding plugin of the replication slot, which is specified when the replication slot is created.
The example must be using the test_decoding plugin, which is included with PostgreSQL as a contrib module for testing and experimentation.
The available options for that plugin are not documented, but can be found in the source code (a usage sketch follows the list):
include-xids: include the transaction number in BEGIN and COMMIT output
include-timestamp: include timestamp information with COMMIT output
force-binary: specifies that the output mode is binary
skip-empty-xacts: don't output anything for transactions that didn't modify the database
only-local: output only data whose replication origin is not set
include-rewrites: include information from table rewrites caused by DDL statements
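For illustration, here is a minimal sketch (using the pgjdbc replication API from the question; slot and option names are just the ones shown there) of where the plugin is chosen at slot creation and where its options are passed:
// Create the slot with the test_decoding output plugin; the plugin chosen
// here determines which slot options are valid later.
replConnection.getReplicationAPI()
    .createReplicationSlot()
    .logical()
    .withSlotName("demo_logical_slot")
    .withOutputPlugin("test_decoding")
    .make();
// Open the stream and pass plugin-specific options such as include-timestamp.
PGReplicationStream stream =
    replConnection.getReplicationAPI()
        .replicationStream()
        .logical()
        .withSlotName("demo_logical_slot")
        .withSlotOption("include-timestamp", "on")
        .start();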

Related

how to collect all information about the current Job in Talend data studio

When I run any job, I want to log information such as:
job name
source and destination details (file name/table name)
number of records input and number of records processed or saved.
I want to log all of the above information and insert it into MongoDB using Talend Open Studio components. Please also explain which components I need to perform that task.
You can use a tJava component: get the record counts for the source and destination together with the source and target names, then write those details out from tJava (see the sketch below).
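As a rough sketch of the tJava body (the component names tFileInputDelimited_1 and tDBOutput_1 are placeholders; substitute the components your job actually uses, which typically expose their row counts via the NB_LINE entry in globalMap):
// tJava body: gather job details from Talend's globalMap.
// Component names are placeholders; adjust them to your job.
String job = jobName; // jobName is exposed by the generated Talend code
Integer rowsIn = (Integer) globalMap.get("tFileInputDelimited_1_NB_LINE");
Integer rowsOut = (Integer) globalMap.get("tDBOutput_1_NB_LINE");
System.out.println(job + ";" + rowsIn + ";" + rowsOut);
From there the line can be appended to a file or routed on to a MongoDB output component.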
For more about logging functionality, go through the tutorial below:
https://www.youtube.com/watch?v=SSi8BC58v3k&list=PL2eC8CR2B2qfgDaQtUs4Wad5u-70ala35&index=2
I'd consider using log4j, which already has most of this information. Using the MDC you can enrich the log messages with custom attributes. Log4j has a JSON layout, and there seems to be a MongoDB appender as well.
It might take a bit more time to configure (I'd suggest adding the dependencies via a routine), but once configured it requires no extra setup in the job itself. Using log4j you can also create filters, etc. A minimal sketch of the MDC idea follows.
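This is a sketch only, assuming Log4j 1.x; the class name and MDC keys are illustrative, not Talend-specific:
import org.apache.log4j.Logger;
import org.apache.log4j.MDC;

public class JobRunLogger {
    private static final Logger LOG = Logger.getLogger(JobRunLogger.class);

    // Attach job metadata to the logging context, then emit one summary event.
    public static void logJobRun(String jobName, String source, String target,
                                 int rowsIn, int rowsOut) {
        MDC.put("jobName", jobName);
        MDC.put("source", source);
        MDC.put("target", target);
        MDC.put("rowsIn", rowsIn);
        MDC.put("rowsOut", rowsOut);
        LOG.info("Job finished"); // a JSON layout / MongoDB appender can pick up the MDC fields
        MDC.remove("jobName");
        MDC.remove("source");
        MDC.remove("target");
        MDC.remove("rowsIn");
        MDC.remove("rowsOut");
    }
}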

How to force a pipeline's status to Failed

I'm using the Copy Data activity.
When there is a data error, I export the bad rows to a blob.
But in that case the pipeline's status is still Succeeded. I want to set it to Failed. Is that possible?
When there is some data error.
It depends on what kind of error you mean here.
1. If you mean a common incompatibility or mismatch error, ADF has a built-in Copy activity feature named fault tolerance, which supports these 3 scenarios:
Incompatibility between the source data type and the sink native type.
Mismatch in the number of columns between the source and the sink.
Primary key violation when writing to SQL Server/Azure SQL Database/Azure Cosmos DB.
If you configure it to log the incompatible rows, you can find the log file at this path: https://[your-blob-account].blob.core.windows.net/[path-if-configured]/[copy-activity-run-id]/[auto-generated-GUID].csv.
If you want to abort the job as soon as any error occurs, you can configure the fault tolerance setting that way instead.
Please see this case: Fault tolerance and log the incompatible rows in Azure Blob storage
2. If you are talking about your own logic for the data error (for example, some business rule), I'm afraid ADF can't detect that for you, though it's a common requirement. However, you could follow this case (How to control data failures in Azure Data Factory Pipelines?) as a workaround. The main idea is to use a custom activity to divert the bad rows before the copy activity executes. In the custom activity you can upload the bad rows to Azure Blob Storage with the .NET SDK as you wish.
Update:
Since you want to log all incompatible rows and make the job fail at the same time, I'm afraid this cannot be implemented in the copy activity directly.
However, you could use an If Condition activity after the Copy activity to check whether the output contains rowsSkipped. If it does, output False; then you will know some data was skipped and you can inspect it in the blob storage (see the expression sketch below).
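As a sketch, the check could be written in ADF's pipeline expression language roughly like this (the activity name 'Copy data1' is a placeholder, and you should verify the behaviour against your pipeline's actual output JSON):
@contains(activity('Copy data1').output, 'rowsSkipped')
When this returns true, some rows were skipped, and the corresponding branch of the If Condition can handle that while you inspect the logged rows in blob storage.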

What is the role of Logstash Shipper and Logstash Indexer in ELK stack?

I have been reading about the ELK stack online for my new project.
Most of the tech blogs are about how to set ELK up, but I need more background to begin with.
What is Logstash? And further, what are the Logstash Shipper and Indexer?
What is Elasticsearch's role?
Any leads will be appreciated, even if not a complete answer.
I will try to explain the ELK stack to you with an example.
Applications generate logs which all have the same format ( timestamp | loglevel | message ) on any machine in our cluster and write those logs to some file.
Filebeat (a log shipper from the Elastic stack) tracks that file, gathers any updates to it periodically and forwards them to Logstash over the network. Unlike Logstash, Filebeat is a lightweight application that uses very few resources, so I don't mind running it on every machine in the cluster. It notices when Logstash is down and waits with transferring data until Logstash is running again (no logs are lost).
Logstash receives messages from all log shippers over the network and applies filters to the messages. In our case it splits each entry into timestamp, loglevel and message. These become separate fields that can later be searched easily. Any message that does not conform to that format will get a field marking it as an invalid log format. The messages with their fields are then forwarded to Elasticsearch at a speed that Elasticsearch can handle.
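As an illustrative sketch of that filter stage, in Logstash configuration syntax (assuming the "timestamp | loglevel | message" format above; the pattern names and the failure tag are my own choices):
filter {
  grok {
    # Split "timestamp | loglevel | message" into separate fields.
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \| %{LOGLEVEL:loglevel} \| %{GREEDYDATA:logmessage}" }
    # Non-conforming lines are tagged instead of being dropped.
    tag_on_failure => ["invalid_logformat"]
  }
}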
Elasticsearch stores all messages and indexes (prepares for quick search) all the fields in the messages. It is our database.
We then use Kibana (also from the Elastic stack) as a GUI for accessing the logs. In Kibana I can do something like: show me all logs from between 3-5 pm today with loglevel error whose message contains MyClass. Kibana asks Elasticsearch for the results and displays them.
I don't know if this helps, but... let's take a somewhat silly example: I want to do statistics about squirrels in my neighborhood. Every squirrel has a name and we know what they look like. Each neighbor makes a log entry whenever he sees a squirrel eating a nut.
ElasticSearch is a document database that structures data in so-called indices. It is able to save pieces (shards) of those indices redundantly on multiple servers and gives you great search functionality, so you can access huge amounts of data very quickly.
Here we might have finished events that look like this:
{
"_index": "squirrels-2018",
"_id": "zr7zejfhs7fzfud",
"_version": 1,
"_source": {
"squirrel": "Bethany",
"neighbor": "A",
"#timestamp": "2018-10-26T15:22:35.613Z",
"meal": "hazelnut",
}
}
Logstash is the data collector and transformer. It can accept data from many different sources (files, databases, transport protocols, ...) with its input plugins. After one of those input plugins has run, all the data is stored in an Event object that can be manipulated with filters (add data, remove data, load additional data from other sources). When the data has the desired format, it can be distributed to many different outputs.
If neighbor A provides a MySQL database with the columns 'squirrel', 'time' and 'ate', but neighbor B likes to write CSVs with the columns 'name', 'nut' and 'when', we can use Logstash to accept both inputs. Then we rename the fields and parse the different datetime formats those neighbors might be using. If one of them likes to call Bethany 'Beth', we can change the data here to make it consistent. Eventually we send the result to ElasticSearch (and maybe other outputs as well); a sketch of such a filter follows.
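Purely as an illustration of that renaming/parsing step, in Logstash configuration syntax (the field names come from the example above; the conditional and the date formats are assumptions):
filter {
  # Neighbor B uses different column names; align them with neighbor A's schema.
  if [neighbor] == "B" {
    mutate {
      rename => { "name" => "squirrel" "nut" => "meal" "when" => "time" }
    }
  }
  # Parse whatever datetime format was used into the canonical @timestamp field.
  date {
    match => ["time", "ISO8601", "dd.MM.yyyy HH:mm"]
    target => "@timestamp"
  }
}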
Kibana is a visualization tool. It allows you to get an overview of your index structures and server status, and to create diagrams from your ElasticSearch data.
Here we can build fun diagrams like 'Squirrel Sightings Per Minute' or 'Fattest Squirrel (based on nut intake)'.

Setting up MongoDB environment requirements for Parse Server

I have my instance running and am able to connect remotely; however, I'm stuck on where to set this parameter to false, since it states that the default is true:
failIndexKeyTooLong
Setting 'failIndexKeyTooLong' is a three-step process:
You need to go to the command console in the Tools menu item for the admin database of your database instance. This command will only work on the admin database.
Once there, pick any command from the list and it will give you a short JSON text for that command.
Erase the command they provide (I chose 'ping') and enter the following JSON:
{
"setParameter" : 1,
"failIndexKeyTooLong" : false
}
Note if you are using a free plan at MongoLab: This will NOT work if you have a free plan; it only works with paid plans. If you have the free plan, you will not even see the admin database. HOWEVER, I contacted MongoLab and here is what they suggest:
Hello,
First of all, welcome to MongoLab. We'd be happy to help.
The failIndexKeyTooLong=false option is only necessary when your data include indexed values that exceed the maximum key value length of 1024 bytes. This only occurs when Parse auto-indexes certain collections, which can actually lead to incorrect query results. Parse has updated their migration guide to include a bit more information about this, here:
https://parse.com/docs/server/guide#database-why-do-i-need-to-set-failindexkeytoolong-false-
Chances are high that your migration will succeed without this parameter being set. Can you please give that a try? If for any reason it does fail, please let us know and we can help you on potential next steps.
Our Dedicated and Shared Cluster plans (https://mongolab.com/plans/pricing/) do provide the ability to toggle this option, but because our free Sandbox plans are running on shared server processes, with other Sandbox users, this parameter is not configurable.
When launching your MongoDB server, you can set this parameter to false:
mongod --setParameter failIndexKeyTooLong=false
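If the server is already running and you have admin privileges, a mongo shell equivalent would presumably be (sketch; whether this parameter can be changed at runtime depends on your MongoDB version and hosting plan):
// Run against the admin database.
db.adminCommand({ setParameter: 1, failIndexKeyTooLong: false })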
I have written an article that helps you set up Parse Server and all its dependencies on your own server:
https://medium.com/@jcminarro/run-parse-server-on-your-own-server-using-digitalocean-b2a7d66e1205

How can I get my client application name to show up on z/OS from Java?

This page says I can put "clientProgramName" as one of the connection parameters and it will show up on db2 as the correlation ID.
And I quote:
In a java.util.Properties value in the info parameter of a
DriverManager.getConnection call.
We're using z/OS. The z/OS version of DB2 seems a lot more limited in terms of this kind of stuff.
Setting the client program name in the Properties passed to the connect call seems to have no effect, and when I put it at the end of the connection URL like this (which it also says I can do):
jdbc:db2://localhost:5036/DBNAME:clientProgramName=myprog
I get this error:
[jcc][10165][10051][4.11.77] Invalid database URL syntax:
jdbc:db2://localhost:5036/DBNAME:clientProgramName=myprog.
ERRORCODE=-4461, SQLSTATE=42815
Is there any way to send a custom user string to a z/OS db2 server so that connection can be identified on the server?
Depending on the method you use to connect to DB2, you can set it in one of these ways:
Class.forName
Class.forName("com.ibm.db2.jcc.DB2Driver");
Properties props = new Properties();
props.put("user", "scott");
props.put("password", "tiger");
props.put("clientProgramName", "My Program 1");
Connection conn = DriverManager.getConnection(
"jdbc:db2://localhost:50000/sample", props);
DataSource
Connection conn = null;
DB2SimpleDataSource ds = new com.ibm.db2.jcc.DB2SimpleDataSource();
ds.setDriverType(4);
ds.setServerName("localhost");
ds.setPortNumber(50000);
ds.setDatabaseName("sample");
ds.setUser("scott");
ds.setPassword("tiger");
ds.setClientProgramName("My Application 2");
conn = ds.getConnection();
I wrote a blog about that: http://angocadb2.blogspot.fr/2012/12/nombre-de-la-conexion-java-en-db2-java.html (Use your favorite translator because it is in Spanish)
According to this page on Info Center, there should be a function on the DB2Connection interface that allows you to change your application identifier, setDB2ClientApplicationInformation (I can't link directly, because there is no anchor, just search for that name).
You can pull the current application ID using the CURRENT CLIENT_APPLNAME special register:
SELECT CURRENT CLIENT_APPLNAME FROM SYSIBM.SYSDUMMY1
There are some other ways to set that register listed on the Info Center link listed above, including the WLM_SET_CLIENT_INFO function.
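As a sketch of the DB2Connection route mentioned above (assuming setDB2ClientApplicationInformation takes a single String; check the Info Center signature for your driver version):
import java.sql.Connection;
import java.sql.SQLException;
import com.ibm.db2.jcc.DB2Connection;

public class ClientInfoHelper {
    // Tag an existing JDBC connection with an application identifier.
    public static void tagConnection(Connection conn, String appName) throws SQLException {
        if (conn.isWrapperFor(DB2Connection.class)) {
            DB2Connection db2conn = conn.unwrap(DB2Connection.class);
            db2conn.setDB2ClientApplicationInformation(appName);
        }
    }
}
The value can then be checked with the CURRENT CLIENT_APPLNAME query shown above.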
I am no DB2 expert, but I am looking at a trace record, generated by DB2 for z/OS, that contains a "correlation ID" (field QWHCCV in the product section correlation header of the trace record) that matches the value I set using setClientProgramName (method of the DB2 data source in my Java application).
My Java application is similar to the "DataSource" example given by AngocA, which is similar to the code quoted in the IBM technote 'The name of a DB2 JDBC application appears as "db2jcc_application". How to change it?'. This Java application, running on my Windows PC, connects to DB2 for z/OS. It also - and this is important, depending on which DB2 traces you have started (discussed below) - actually does something after connecting. For example:
pstmt=conn.prepareStatement("SELECT ... ");
rset=pstmt.executeQuery();
When you say, regarding the first example given by AngocA, "it doesn't do anything": what did you hope to see? Exactly where are you looking, what are you looking for, and what method (or tool) are you using to look for it?
For example, if you are looking for SMF type 100, 101, or 102 records (generated by DB2 traces) containing QWHCCV field values that match your correlation ID, then (with apologies if this is the bleeding obvious, teaching you how to suck eggs), on DB2 for z/OS, you need to start the DB2 traces (using the DB2 command START TRACE) that generate those records. Otherwise, there will be nothing to see ("it doesn't do anything"). Note that not all DB2 trace records generated by an application (such as the Java application described above) will contain your correlation ID; prior to a certain point in processing, the correlation ID of such records will have a different value (but that is getting off-topic, and anyway is about as far as I am comfortable describing).
Warning: Experiment with starting DB2 traces on a "sandbox" (development or test) DB2 system, not a production DB2 system. DB2 traces can result in large volumes of data.
You will also see the correlation ID in the message text of some DB2 V10 messages (such as DSNL027I) after "THREAD-INFO=".
For me, I had to add a semicolon after each connection parameter.
For example, in your case:
jdbc:db2://localhost:5036/DBNAME:clientProgramName=myprog;
For example, with multiple parameters:
jdbc:db2://localhost:5036/DBNAME:clientProgramName=myprog;enableSysplexWLB=true;blah=true;