logstash out of memory reading a large postgres table - postgresql

I'm trying to index a large database table with more than 10,000,000 rows, and Logstash is running out of memory. :(
The error:
logstash_1 | Error: Your application used more memory than the safety cap of 1G.
logstash_1 | Specify -J-Xmx####m to increase it (#### = cap size in MB).
logstash_1 | Specify -w for full OutOfMemoryError stack trace
My logstash configuration:
input {
  jdbc {
    # Postgres jdbc connection string to our database, mydb
    jdbc_connection_string => "jdbc:postgresql://database:5432/predictiveparking"
    # The user we wish to execute our statement as
    jdbc_user => "predictiveparking"
    jdbc_password => "insecure"
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "/app/postgresql-9.4.1212.jar"
    # The name of the driver class for Postgresql
    jdbc_driver_class => "org.postgresql.Driver"
    # our query
    statement => "SELECT * from scans_scan limit 10"
  }
}
#output {
#  stdout { codec => json_lines }
#}
output {
  elasticsearch {
    index => "scans"
    sniffing => false
    document_type => "scan"
    document_id => "id"
    hosts => ["elasticsearch"]
  }
}

Just enabling paging solved it.
Added:
jdbc_paging_enabled => true
Now the data from the database gets fetched in chunks and we do not run out of memory. Make sure the SQL query is ORDERED!
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html#plugins-inputs-jdbc-jdbc_paging_enabled
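For illustration, a minimal sketch of the jdbc input with paging enabled. jdbc_paging_enabled and jdbc_page_size are documented options of the jdbc input plugin; the page size of 50000 and the ORDER BY column (id) are assumptions to adapt to your own table:
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://database:5432/predictiveparking"
    jdbc_user => "predictiveparking"
    jdbc_password => "insecure"
    jdbc_driver_library => "/app/postgresql-9.4.1212.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    # Fetch the table in pages instead of loading everything at once
    jdbc_paging_enabled => true
    # Rows per page; 50000 is an assumed value, tune it to your heap size
    jdbc_page_size => 50000
    # Paging wraps the statement in LIMIT/OFFSET, so the ordering must be stable
    statement => "SELECT * FROM scans_scan ORDER BY id"
  }
}
Alternatively, the heap cap mentioned in the error can be raised (depending on the Logstash version, via the jvm.options file or the LS_HEAP_SIZE / LS_JAVA_OPTS environment variables), but for a one-off bulk import paging is usually the better fix.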

Related

JDBC Logstash Elastic Kibana

I'm using the JDBC input plugin to ingest data from MongoDB into Elasticsearch.
My config is:
input {
  jdbc {
    jdbc_driver_class => "mongodb.jdbc.MongoDriver"
    jdbc_driver_library => "/usr/share/logstash/logstash-core/lib/jars/mongodb_unityjdbc_free.jar"
    jdbc_user => ""
    jdbc_password => ""
    jdbc_connection_string => "jdbc:mongodb://localhost:27017/pritunl"
    schedule => "* * * * *"
    jdbc_page_size => 100000
    jdbc_paging_enabled => true
    statement => "select * from servers_output"
  }
}
filter {
  mutate {
    copy => { "_id" => "[@metadata][id]" }
    remove_field => ["_id"]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "pritunl"
    document_id => "%{[@metadata][_id]}"
  }
  stdout {}
}
In Kibana I see only one hit, but in stdout I see many records from the MongoDB collection. What should I do to see them all?
The problem is that all your documents are saved with the same "_id", so even though you're sending different records to ES, one document is being overwritten internally again and again - thus you get 1 hit in Kibana.
There is a typo in your configuration that causes this issue.
You're copying "_id" into "[@metadata][id]",
but you're trying to read "[@metadata][_id]" with an underscore.
Removing the underscore when reading the value for document_id should fix your issue:
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "pritunl"
    document_id => "%{[@metadata][id]}"
  }
  stdout {}
}
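As an alternative sketch of the same idea from the other side, you could instead keep the underscore in the output and copy into "[@metadata][_id]" in the filter; the only requirement is that the metadata key written by the mutate filter matches the one read by document_id:
filter {
  mutate {
    # Copy into the same metadata key the output below references
    copy => { "_id" => "[@metadata][_id]" }
    remove_field => ["_id"]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "pritunl"
    document_id => "%{[@metadata][_id]}"
  }
}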

Not getting SQL Statement output using ELK logstash jdbc plugin

I'm trying to get output using Logstash with the jdbc plugin, but I'm not getting any data. I have tried different ways to get the pg_stat_replication data, but under Index Management I cannot find the index, whose name should be pg_stat_replication.
Config file path: /etc/logstash/conf.d/postgresql.conf
input {
  # pg_stat_replication;
  jdbc {
    jdbc_driver_library => "/usr/share/logstash/logstash-core/lib/jars/postgresql-jdbc.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://192.168.43.21:5432/postgres"
    jdbc_user => "postgres"
    jdbc_password => "p"
    statement => "SELECT * FROM pg_stat_replication"
    schedule => "* * * * *"
    type => "pg_stat_replication"
  }
}
output {
  elasticsearch {
    hosts => "http://192.168.43.100:9200"
    index => "%{type}"
    user => "elastic"
    password => "abc#123"
  }
}
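As a first debugging step (only a sketch, mirroring the commented-out stdout output used in the first question above), printing the jdbc events to stdout shows whether the statement returns any rows at all before looking at the Elasticsearch side:
output {
  # Temporary debug output: if nothing is printed here, the problem is on the
  # jdbc/Postgres side, not in the elasticsearch output or in Index Management
  stdout { codec => json_lines }
  elasticsearch {
    hosts => "http://192.168.43.100:9200"
    index => "%{type}"
    user => "elastic"
    password => "abc#123"
  }
}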

Connect to mongodb using logstash Jdbc_streaming filter plugin

I'm trying to fetch data from MongoDB using the jdbc_streaming filter plugin in Logstash on Windows.
I'm using mongo-java-driver-3.4.2.jar to connect to the database, but I'm getting an error like this:
JavaSql::SQLException: No suitable driver found for jdbc:mongo://localhost:27017/EmployeeDB
No luck with existing references so far. I'm using Logstash 7.8.0. This is my logstash config:
jdbc_streaming {
  jdbc_driver_library => "C:/Users/iTelaSoft-User/Downloads/logstash-7.8.0/mongo-java-driver-3.4.2.jar"
  jdbc_driver_class => "com.mongodb.MongoClient"
  jdbc_connection_string => "jdbc:mongo://localhost:27017/EmployeeDB"
  statement => "select * from Employee"
  target => "name"
}
You can also try the following:
Download https://dbschema.com/jdbc-drivers/MongoDbJdbcDriver.zip
Unzip it and copy all the files to the path ~/logstash-7.8.0/logstash-core/lib/jars/
Modify the .config file.
Example:
input {
  jdbc {
    jdbc_driver_class => "com.dbschema.MongoJdbcDriver"
    jdbc_driver_library => "mongojdbc2.1.jar"
    jdbc_user => "user"
    jdbc_password => "pwd"
    jdbc_connection_string => "jdbc:mongodb://localhost:27017/EmployeeDB"
    statement => "select * from Employee"
  }
}
output {
  stdout { }
}
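Since the original question uses the jdbc_streaming filter rather than the jdbc input, here is a sketch of the same DbSchema driver wired into that filter instead, assuming its jars were copied into logstash-core/lib/jars/ as above; the statement and the target name are placeholders to adjust:
filter {
  jdbc_streaming {
    jdbc_driver_class => "com.dbschema.MongoJdbcDriver"
    jdbc_driver_library => "mongojdbc2.1.jar"
    jdbc_connection_string => "jdbc:mongodb://localhost:27017/EmployeeDB"
    # Placeholder lookup; adapt the statement and target to your use case
    statement => "select * from Employee"
    target => "employee"
  }
}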

How to create a new field in elasticsearch and import data from postgres into it?

I have a big table in Postgres with 200 columns and more than a million rows. I want to migrate this data into Elasticsearch using Logstash. I am currently migrating around 50 columns.
What I want to know is: can I add the other columns later, mapping them to the same index in Elasticsearch? For example, say I have 10 columns in Postgres and I map 4 into Elasticsearch. Can I add the other 6 columns, along with their data, to the same index later?
My current logstash config file looks like this:
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/school"
    jdbc_user => "postgres"
    jdbc_password => "postgres"
    jdbc_driver_library => "/Users/karangupta/Downloads/postgresql-42.2.8.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_paging_enabled => true
    statement_filepath => "/usr/local/Cellar/logstash/7.3.2/conf/myQuery.sql"
  }
}
# output {
#   stdout { codec => json_lines }
# }
output {
  elasticsearch {
    index => "schoolupdated"
    hosts => "http://localhost:9200"
  }
}
The above config file works perfectly and creates the index. How can I add fields to this index later from Postgres?
I am using Postgres 11.4 and Elasticsearch 6.8.
Yes, you can, as long as Elasticsearch is provided with an ID for each of the rows.
Just add
document_id => "%{unique_identifier_column_name_in_your_result}" # column name is case sensitive
to the elasticsearch output plugin configuration.
If you execute the jdbc import again (now with the new columns), the new fields will be added to the existing documents by overwriting the old documents.
More details on this topic: https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-document_id
Have fun!
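For illustration, a sketch of the output section above with document_id added, assuming the query exposes a unique column named id (the column name here is a placeholder):
output {
  elasticsearch {
    index => "schoolupdated"
    hosts => "http://localhost:9200"
    # "id" is a placeholder for whatever unique column your query returns
    document_id => "%{id}"
  }
}
With a stable document_id, re-running the import with an extended SELECT re-indexes each row onto the same document, which is what lets the extra columns appear on the existing index.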

can't configure logstash with postgres

I'm not able to configure my logstash-2.3.2 with my postgresql-9.5.4-1-windows-x64.
Here is my log-config.conf file:
input {
  jdbc {
    # Postgres jdbc connection string to our database, mydb
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/ambule"
    # The user we wish to execute our statement as
    jdbc_user => "postgres"
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "C:\Users\Administrator\Downloads\postgresql-9.4-1201-jdbc41.jar"
    # The name of the driver class for Postgresql
    jdbc_driver_class => "org.postgresql.Driver"
    # our query
    statement => "SELECT * from table1"
  }
}
output {
  stdout { codec => json_lines }
}
I'm getting an error.
I guess the exception lies within the jdbc_connection_string. What if you write it like this:
jdbc_connection_string => "jdbc:postgresql://host:port/database?user=username"
(note the ?user=username part - try adding the user to the connection string)
Seems like it has been missed out from the doc. Hope it helps!
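A sketch of the corrected input for the config above, with the user appended to the connection string as suggested (the ?user= query parameter is supported by the PostgreSQL JDBC driver; a password parameter could be added the same way):
input {
  jdbc {
    # User passed directly in the JDBC URL, as suggested in the answer
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/ambule?user=postgres"
    jdbc_user => "postgres"
    jdbc_driver_library => "C:\Users\Administrator\Downloads\postgresql-9.4-1201-jdbc41.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "SELECT * from table1"
  }
}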