I expose kube-state-metrics on an endpoint and scrape it using the prometheus_scrape input plugin (Fluent Bit 2.0). I want to select some of these metrics and send them to an Azure Log Analytics workspace as logs, but the scraped data does not seem to consist of records, neither the whole dump nor the individual entries. When I write a regex parser and apply it via a filter, it gets applied no matter what key I specify in the filter, which is weird. Yet they still do not behave like records: even a Lua script cannot operate on them, it cannot even print them to stdout.
2022-12-02T15:48:19.388264036Z kube_pod_container_status_restarts_total{namespace="kube-system",pod="ama-logs-t6smx",uid="99825b27-919d-4943-bc7d-b87b56081297",container="ama-logs"} = 0
2022-12-02T15:48:19.388264036Z kube_pod_container_status_restarts_total{namespace="kube-system",pod="ama-logs-t6smx",uid="16195b27-915d-3963-bc7d-b86b56557297",container="ama-logs-prometheus"} = 0
2022-12-02T15:48:19.388264036Z kube_pod_container_status_restarts_total{namespace="kube-system",pod="aks-secrets-store-csi-driver-mk47n",uid="d7924927-caf4-39f3-a28b-356af3144f50",container="liveness-probe"} = 0
I tried dropping or altering the records with a Lua script, but it simply does nothing to them; they still get printed to the screen as if I had done nothing to them with the script.
Is there any way to make them records? Why is this not working?
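For reference, the pipeline looks roughly like this (the kube-state-metrics address, scrape interval, tag and script names below are placeholders, not my exact values):

[INPUT]
    name              prometheus_scrape
    tag               ksm.metrics
    host              kube-state-metrics.kube-system.svc
    port              8080
    metrics_path      /metrics
    scrape_interval   10s

[FILTER]
    name     lua
    match    ksm.*
    script   transform.lua
    call     transform

[OUTPUT]
    name     stdout
    match    *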
I have a Loki instance connected as a data source to Grafana, collecting the log records from a web app. Only some of those logs, the ones related to the web server, contain "request_id=XYZ", and I would like to filter specific log records using a $request_id variable.
I can't parse request_id directly as a label, since not all of the logs contain this key-value pair.
In order to do filtering I make a query like this:
{compose_project="$var1",host="$var2"} | regexp `request=(?P<request_id>(\w+))` | request_id="$request_id"
It works nicely. However, when no request_id is passed by the user, i.e. the $request_id variable is empty, I can't see all of the logs: only the lines WITHOUT request_id in the text are listed.
In a perfect scenario, without $request_id being set, I would like to see all of the logs. Now I'm wondering whether it's possible to somehow conditionally apply the filter request_id=$request_id only when the regex matches an occurrence of the extracted label in the log? Or is there maybe another way to accomplish this?
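One idea I'm considering (I haven't confirmed it behaves this way, hence the question) is to use a regex label matcher so that an empty variable degrades to match-everything:

{compose_project="$var1", host="$var2"}
  | regexp `request=(?P<request_id>(\w+))`
  | request_id =~ ".*${request_id}.*"

When $request_id is empty, the matcher becomes ".*.*", which should also match lines where no request_id was extracted; when it is set, only the matching lines remain.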
I'm attempting to pull data from the Square Connect v1 API using ADF. I'm utilizing a Copy Activity with a REST source. I am successfully pulling back data; however, the results are unexpected.
The endpoint is /v1/{location_id}/payments. I have three parameters, shown below.
I can successfully pull this data via Postman.
The results are stored in a Blob and are as if I did not specify any parameters whatsoever.
Only when I hardcode the parameters into the relative path do I get correct results.
I feel I must be missing a setting somewhere, but which one?
You can try setting the values you want in a Set Variable activity, and then have your Copy Activity reference those variables. This will tell you whether or not it is an issue with the dynamic content. I have run into some unexpected behavior myself. The benefit of the intermediate Set Variable activity is twofold: first, it coerces the data type; second, it lets you see what the value actually is.
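For example (the variable and parameter names here are made up, just to illustrate the idea): the value of a Set Variable activity named beginTime could be

@addDays(utcNow(), -7)

and the relative URL of the REST source could then be built from the variables with dynamic content such as

@concat('/v1/', variables('locationId'), '/payments?begin_time=', variables('beginTime'))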
My apologies for not using comments. I do not yet have enough points to comment.
I have been studying the ELK stack online for my new project.
Most of the tech blogs are about how to set ELK up, but I need more background information to begin with.
What is Logstash? And further, what are the Logstash Shipper and Indexer?
What is Elasticsearch's role?
Any leads will be appreciated too, if not a proper answer.
I will try to explain the ELK stack to you with an example.
Applications on any machine in our cluster generate logs that all have the same format (timestamp | loglevel | message) and write those logs to some file.
Filebeat (a log shipper from the ELK ecosystem) tracks that file, gathers any updates to it periodically, and forwards them to Logstash over the network. Unlike Logstash, Filebeat is a lightweight application that uses very few resources, so I don't mind running it on every machine in the cluster. It notices when Logstash is down and holds off on transferring data until Logstash is running again (no logs are lost).
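A minimal filebeat.yml for that setup could look like this (the paths and the Logstash host are made up):

filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/*.log

output.logstash:
  hosts: ["logstash.internal:5044"]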
Logstash receives messages from all the log shippers over the network and applies filters to them. In our case it splits each entry into timestamp, loglevel and message. These become separate fields that can later be searched easily. Any messages that do not conform to that format will get a field: invalid logformat. These messages, with their fields, are then forwarded to Elasticsearch at a speed that Elasticsearch can handle.
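A sketch of the corresponding Logstash pipeline (the grok pattern and hosts are made up; here non-conforming messages get a tag rather than a field, which is close enough for the idea):

input {
  beats { port => 5044 }
}

filter {
  grok {
    match          => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \| %{LOGLEVEL:loglevel} \| %{GREEDYDATA:msg}" }
    tag_on_failure => ["invalid_logformat"]
  }
}

output {
  elasticsearch { hosts => ["http://elasticsearch:9200"] }
}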
Elasticsearch stores all messages and indexes (prepares for quick search) all the fields in the messages. It is our database.
We then use Kibana (also from ELK) as a GUI for accessing the logs. In Kibana I can do something like: show me all logs from between 3 and 5 pm today with loglevel error whose message contains MyClass. Kibana will ask Elasticsearch for the results and display them.
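In the Kibana search bar that could look roughly like this (Lucene syntax; field names are the ones from the example above), combined with the time picker set to 15:00-17:00 today:

loglevel:ERROR AND message:MyClass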
I don't know if this helps, but... whatever, let's take a really silly example: I want to do statistics about squirrels in my neighborhood. Every squirrel has a name and we know what they look like. Each neighbor makes a log entry whenever they see a squirrel eating a nut.
Elasticsearch is a document database that structures data in so-called indices. It is able to save pieces (shards) of those indices redundantly on multiple servers, and it gives you great search functionality, so you can access huge amounts of data very quickly.
Here we might have finished events that look like this:
{
  "_index": "squirrels-2018",
  "_id": "zr7zejfhs7fzfud",
  "_version": 1,
  "_source": {
    "squirrel": "Bethany",
    "neighbor": "A",
    "@timestamp": "2018-10-26T15:22:35.613Z",
    "meal": "hazelnut"
  }
}
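And just to show the search side (the query below is only a sketch of mine, using the index and fields from the event above), asking Elasticsearch for Bethany's sightings in October could look like:

GET /squirrels-2018/_search
{
  "query": {
    "bool": {
      "filter": [
        { "match": { "squirrel": "Bethany" } },
        { "range": { "@timestamp": { "gte": "2018-10-01", "lte": "2018-10-31" } } }
      ]
    }
  }
}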
Logstash is the data collector and transformer. It is able to accept data from many different sources (files, databases, transport protocols, ...) with its input plugins. After one of those input plugins has read the data, it is stored in an Event object that can be manipulated with filters (add data, remove data, load additional data from other sources). When the data has the desired format, it can be distributed to many different outputs.
If neighbor A provides a MySQL database with the columns 'squirrel', 'time' and 'ate', but neighbor B likes to write CSVs with the columns 'name', 'nut' and 'when', we can use Logstash to accept both inputs. Then we rename the fields and parse the different datetime formats those neighbors might be using. If one of them likes to call Bethany 'Beth', we can change the data here to make it consistent. Eventually we send the result to Elasticsearch (and maybe to other outputs as well).
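A sketch of that kind of filter section (the field names and date formats are made up):

filter {
  if [neighbor] == "B" {
    mutate { rename => { "name" => "squirrel"  "nut" => "meal"  "when" => "sighted_at" } }
  }
  date {
    match  => ["sighted_at", "ISO8601", "dd.MM.yyyy HH:mm"]
    target => "@timestamp"
  }
  if [squirrel] == "Beth" {
    mutate { replace => { "squirrel" => "Bethany" } }
  }
}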
Kibana is a visualization tool. It allows you to get an overview of your index structures and server status, and to create diagrams for your Elasticsearch data.
Here we can build fun diagrams like 'Squirrel Sightings Per Minute' or 'Fattest Squirrel (based on nut intake)'.
I want to use Tarantool database for logging user activity.
Are there any out-of-the-box solutions to create a web dashboard with nice charts based on the collected data?
A long time ago, using a very old version of Tarantool, I created a draft of tarbon, a time-series database with an interface identical to carbon-cache.
Since that time the protocol has changed, but the general idea is still the same: use spaces to store the data, a compact data layout and the right indexes to access spaces as time-series rows, and Lua to prepare the resulting JSON.
That solution performed very well (on both reads and writes), but that old version lacked disk storage, and without disk I was very limited in metrics capacity.
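The gist of it, as a rough Lua sketch (the space and field names are made up, not taken from tarbon):

box.cfg{}

local s = box.schema.space.create('metrics', { if_not_exists = true })
s:create_index('primary', {
    parts = { { field = 1, type = 'string' }, { field = 2, type = 'unsigned' } },
    if_not_exists = true,
})

-- one tuple per (metric, unix timestamp): name, time, value
s:insert{ 'user.logins', 1545123000, 42 }

-- read all rows of one metric and prepare the resulting json
local json = require('json')
print(json.encode(s:select{ 'user.logins' }))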
Tarantool has an embedded Lua language, so you could generate JSON from your data and use any charting library. For example, D3.js has a method to load JSON directly from a URL.
d3.json(url[, callback])
Creates a request for the JSON file at the specified url with the mime type "application/json". If a callback is specified, the request is immediately issued with the GET method, and the callback will be invoked asynchronously when the file is loaded or the request fails; the callback is invoked with two arguments: the error, if any, and the parsed JSON. The parsed JSON is undefined if an error occurs. If no callback is specified, the returned request can be issued using xhr.get or similar, and handled using xhr.on.
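For example (the URL and handling are just an illustration):

d3.json("/stats.json", function(error, data) {
  if (error) { console.error(error); return; }
  // data is the parsed JSON from the server, ready to feed into your chart code
  console.log(data.length + " rows loaded");
});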
You could also look at c3.js, a simple facade for d3.
Is there a command to know whether a kdb server is busy running a query? Even better, is there a way to know the percentage completion of the query being run?
So far I've been looking at the top screen on Linux to know which server to use...
Unfortunately, not directly. The reason is the single-threaded nature of a kdb process. In practice, this is easily worked around by adding some basic logging to your server: whenever a query comes in, just log to a file the time the query came in and the time the result was returned to the user.
Take a look at the .z.pg and .z.ps functions, which are called to handle synchronous and asynchronous requests, respectively. By default they are just set to "value", which means evaluate the string and return the result. Just replace this with your own function to log events to a file or a log server.
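A rough sketch of that idea for the synchronous handler (the log file name and format are made up):

/ wrap .z.pg so every synchronous query is logged with its arrival and completion times
.z.pg:{[q]
  st:.z.p;                                       / time the query came in
  res:value q;                                   / evaluate it, as the default handler would
  h:hopen `:query.log;                           / append to a plain-text log file
  neg[h] " | " sv (string st; string .z.p; $[10h=type q; q; .Q.s1 q]);
  hclose h;
  res }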
Besides the above solution, a simpler way is to keep checking the port.
Normally all queries run against a port, and a kdb server can be launched on multiple ports for different purposes.
Details:
Use the code below to query against a port; if the port is busy, a null res will be returned, and you can then kill the process on that port and restart it, or do whatever the requirement is.
The code tries to open a connection to the port with a 3-second timeout (you could also send a trivial query such as 1+1 over the handle to confirm the server is responding).
.server.testQuery:{[inPort]
  / try to open a handle to the port with a 3000 ms timeout;
  / protected evaluation (@) returns null if the connection fails or times out
  res:@[{hopen(x;3000)};`$":",":" sv string `,inPort;0N];
  if[not null res;hclose res];  / close the handle again if we did get one
  :res
 };
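For example (the port number is just an example), .server.testQuery 5001 returns the (already closed) handle number when the server on port 5001 answered within 3 seconds, and null when it did not respond, i.e. when it is most likely busy.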