How to update data in Elasticsearch on a schedule? - postgresql

I have a table in a PostgreSQL database and I want to insert data from that table into an Elasticsearch index. I need to update the index data on a schedule, i.e. delete the old data and insert the new data. I have the following Logstash configuration file, but it doesn't update the data in the index: it inserts data, but at the same time I still see the old data, so duplicates appear. How do I correctly update data in Elasticsearch on a schedule?
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://host:port/postgres"
    jdbc_user => "postgres"
    jdbc_password => "postgres"
    jdbc_driver_library => "postgresql-42.2.9.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "SELECT * FROM layers;"
    schedule => "0 0 * * MON"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "layers"
  }
}

Your index name doesn't change, so every time you add new records, they are added to the same index.
Add a date postfix to the index name:
index => "layers-%{+YYYY.MM.dd}"
so that a new index is created for each date.
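With that change, the elasticsearch output from the question would look roughly like this (hosts value kept from the question):
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "layers-%{+YYYY.MM.dd}"
  }
}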
Now, for searching, create an alias so you can always use the same name in your application (for example layers/_search), by adding an alias like below:
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "layers-2019.12.11",
        "alias": "layers"
      }
    }
  ]
}
The above step can be done via Kibana (Dev Tools) or a plain HTTP POST. However, I'd recommend using Curator for alias operations. That way, once the Logstash run completes, you can run Curator to remove the current index from the alias and add the newly created one.
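Alternatively, the swap itself can be done with a single _aliases request, because all actions in one request are applied atomically; the index names below are only illustrative:
POST _aliases
{
  "actions": [
    { "remove": { "index": "layers-2019.12.04", "alias": "layers" } },
    { "add": { "index": "layers-2019.12.11", "alias": "layers" } }
  ]
}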

Related

logstash date filter add_field is not working

I"m connecting to postgres and writing a few rows to elastic via logstash.
same date read/write is working fine.
After I apply a date fileter, fetch a date field and assign it to newly created field, it's not working. Below is the filter
filter {
  date {
    locale => "en"
    match => ["old_date", "YYYY-MM-dd"]
    timezone => "Asia/Kolkata"
    add_field => { "newdate" => "2022-10-06" }
    target => "#newdate"
  }
}
I also tried with mutate, but the new field is not created and there is no error.
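One thing worth checking: the date filter writes the parsed timestamp into the field named by target, and add_field is only applied when the match succeeds. If the intent is simply to copy the parsed old_date into a new field called newdate, a minimal sketch (field names taken from the question, date pattern assumed to match how old_date is actually stored) would be:
filter {
  date {
    locale => "en"
    match => ["old_date", "yyyy-MM-dd"]
    timezone => "Asia/Kolkata"
    target => "newdate"   # the parsed value of old_date ends up here instead of @timestamp
  }
}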

Prisma fails on Postgres query with findMany

I have the following query in Prisma that basically returns all users whose campaign is one of the values in the array I provide and who were added to the system within the defined time range. I also have another entity, Click, related to each user, which should be included in the response.
const users = await this.prisma.user.findMany({
  where: {
    campaign: {
      in: [
        ...campaigns.map((campaign) => campaign.id),
        ...campaigns.map((campaign) => campaign.name),
      ],
    },
    createdAt: {
      gte: dateRange.since,
      lt: dateRange.until,
    },
  },
  include: {
    clicks: true,
  },
});
The problem is that this query runs fine on localhost, where I don't have much data, but the production database has nearly 500,000 users and 250,000 clicks in total. I am not sure if that is the root cause, but the query fails with the following exception:
Error:
Invalid `this.prisma.user.findMany()` invocation in
/usr/src/app/dist/xyx/xyx.service.js:135:58
132 }
133 async getUsers(campaigns, dateRange) {
134 try {
→ 135 const users = await this.prisma.user.findMany(
Can't reach database server at `xyz`:`25060`
Please make sure your database server is running at `xyz`:`25060`.
Prisma error code is P1001.
xyz has been substituted, for obvious reasons, in the paths and in the connection string to the DB.
The only solution we found was to check at what limit the query starts failing and then use the pagination params (skip, take) in a loop to download the data part by part and glue it back together. Not optimal, but it works. See this existing bug report for an example:
https://github.com/prisma/prisma/issues/8832
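A rough sketch of that workaround, reusing the where/include from the query above (the batch size is an arbitrary value to tune, and the orderBy assumes the User model has an id field):
const batchSize = 10000; // tune to whatever the connection handles reliably
const users = [];
let skip = 0;
while (true) {
  // fetch one page of users with the same filters as the original query
  const batch = await this.prisma.user.findMany({
    where: { /* same where clause as above */ },
    include: { clicks: true },
    orderBy: { id: 'asc' }, // stable ordering so pages don't overlap
    skip,
    take: batchSize,
  });
  users.push(...batch);
  if (batch.length < batchSize) break; // last page reached
  skip += batchSize;
}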

Is there a way to update document field with FieldValue.increment() using firestore security rules?

I have just started using a NoSQL database (Firestore) and want to find out how to increment a document field in one collection after creating a document in a second collection, using Firestore security rules.
JS code:
collection('some_collections').add({ name: 'name', anotherId: 'abc' })
  .then(() => {
    collection('another_collections').doc('abc')
      .update({ counter: FieldValue.increment(1) });
  });
And something like this in the security rules:
function isIncremented() {
  get(/databases/$(database)/documents/another_collections
    /$(request.resource.data.anotherId)).counter = FieldValue.increment(1);
  return true;
}
match /some_collections/{some_collectionId} {
  allow create: if signedIn() && isIncremented();
}
It's not possible to use security rules to modify any data in the database.
Security rules can only allow or deny operations. If you need to change one document automatically whenever another document is written, look into Cloud Functions and set up a trigger that executes after that document is created or updated.
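A minimal Cloud Functions sketch of that approach (Node.js Admin SDK, firebase-functions v1-style API; collection and field names are taken from the question):
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.incrementCounter = functions.firestore
  .document('some_collections/{docId}')
  .onCreate((snapshot) => {
    const anotherId = snapshot.data().anotherId;
    // atomically increment the counter on the referenced document
    return admin.firestore()
      .collection('another_collections')
      .doc(anotherId)
      .update({ counter: admin.firestore.FieldValue.increment(1) });
  });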

Two outputs in logstash. One for certain aggregations only

I'm trying to specify a second output in Logstash in order to save certain aggregated data only. I have no clue how to achieve it at the moment, and the documentation doesn't cover such a case.
At the moment I use a single input and a single output.
Input definition (logstash-udp.conf):
input {
  udp {
    port => 25000
    codec => json
    buffer_size => 5000
    workers => 2
  }
}
filter {
  grok {
    match => [ "message", "API call happened" ]
  }
  aggregate {
    task_id => "%{example_task}"
    code => "
      map['api_calls'] ||= 0
      map['api_calls'] += 1
      map['message'] ||= event.get('message')
      event.cancel()
    "
    timeout => 60
    push_previous_map_as_event => true
    timeout_code => "event.set('aggregated_calls', event.get('api_calls') > 0)"
    timeout_tags => ['_aggregation']
  }
}
Output definition (logstash-output.conf):
output {
  elasticsearch {
    hosts => ["localhost"]
    manage_template => false
    index => "%{[@metadata][udp]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
What do I want to achieve now? I need to add a second, different aggregation (different data and conditions) that keeps saving all the non-aggregated data to Elasticsearch as it does now, but saves the data aggregated by this new aggregation to Postgres instead. I'm pretty much stuck at the moment, and searching the web for docs/examples hasn't helped.
I'd suggest using multiple pipelines: https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html
This way you can have one pipeline for the aggregation and a second one for the pure data.
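A minimal sketch of what pipelines.yml could look like for that setup (pipeline ids and config paths are just examples):
- pipeline.id: aggregated
  path.config: "/etc/logstash/conf.d/aggregated.conf"
- pipeline.id: raw
  path.config: "/etc/logstash/conf.d/raw.conf"
Each pipeline gets its own input, filter, and output sections, so the aggregation pipeline can write to Postgres (for example via a JDBC output plugin) while the other one keeps writing the raw data to Elasticsearch.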

Filter data on call to getHyperCubeData

When I run the following, I get all records from my table object (assuming I have 100 records in all). Is there a way to send a selection/filter, for example to retrieve only the rows where department = 'procuring'?
table.getHyperCubeData('/qHyperCubeDef', [{
  qWidth: 8,
  qHeight: 100
}]).then(data => console.log(data));
I figured out the answer. Before getting the hypercube data, I need to get the field from the Doc class, then do the following:
.then(doc => doc.getField('department'))
.then(field => field.clear().then(() => field.select({qMatch: filter['procuring']})))
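Putting it together, the whole flow might look roughly like this (enigma.js-style promise chain; 'department' and 'procuring' come from the question, and filter['procuring'] is assumed to resolve to the match string):
doc.getField('department')
  .then(field => field.clear()
    .then(() => field.select({ qMatch: 'procuring' })))
  .then(() => table.getHyperCubeData('/qHyperCubeDef', [{
    qWidth: 8,
    qHeight: 100
  }]))
  .then(data => console.log(data)); // now only rows where department = 'procuring'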