Falco node info - kubernetes

How do we get node information from Falco threat event responses? According to the currently supported fields for conditions, we do not get any information regarding the node name as such.
https://falco.org/docs/rules/supported-fields/

For Falco to detect threats by using syscalls, it needs to run on the same host as the processes executing those syscalls. Therefore it doesn't make sense for Falco to return the hostname, since this information is only relevant once all the alerts are aggregated into some external service. In other words, it's the aggregator that adds the origin of the message when it receives it.
However, if what you need is to distinguish whether the syscall was executed from inside a container or from another process on the host, look at the container.id field. If it's set to host, the call didn't happen inside a container.
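As an illustration of that aggregator idea, here is a minimal sketch in Python: a small HTTP endpoint that receives Falco's JSON alerts (e.g. with json_output and http_output enabled in falco.yaml) and tags each one with its origin. The "node" field and the use of the peer address as the origin are assumptions of this sketch, not anything Falco provides.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        alert = json.loads(self.rfile.read(length))
        # Tag the alert with its origin; here the sender's address. A real
        # setup might use reverse DNS or a header set by the sending side.
        alert["node"] = self.client_address[0]
        print(json.dumps(alert))  # forward or store the enriched alert
        self.send_response(200)
        self.end_headers()

HTTPServer(("0.0.0.0", 8765), AlertHandler).serve_forever()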


Kogito - wait until data from multiple endpoints is received

I am using Kogito with Quarkus. I have set up one DRL rule and am using a BPMN configuration. As can be seen below, currently one endpoint is exposed, which starts the process. All needed data is received from the initial request; it is then evaluated and the process goes on.
I would like to extend the workflow to have two separate endpoints. One to provide the age of the person and another to provide the name. The process must wait until all needed data is gathered before it proceeds with evaluation.
Has anybody come across a similar solution?
Technically you could use a signal or message to add more data into a process instance before you execute the rules over the entire data, see https://docs.kogito.kie.org/latest/html_single/#ref-bpmn-intermediate-events_kogito-developing-process-services.
In order to do that you need some sort of correlation between these events; otherwise, how do you know that event "name 1" should be matched to event "age 1"? If you can keep the process instance id, then the second event can either hit a REST endpoint for that specific process instance or send it a message via a message broker.
You could also have your own custom logic to aggregate the events and only fire a new process instance once your criteria for complete data are met. There are also plans in Kogito to extend how correlation is done, allowing, for instance, process variables to be used as the identifier. For example, if you had person.id as the correlation, events carrying the name and the age for the same id would signal the same process instance. Hope this info helps.
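As a rough sketch of the process-instance-id approach, assuming Kogito's generated REST endpoints follow the usual /{processId} and /{processId}/{processInstanceId}/{eventName} pattern (check the OpenAPI spec Kogito generates for your service; the resource and event names below are made up):

import requests

BASE = "http://localhost:8080/persons"  # hypothetical generated endpoint

# The first call starts the process with the name; Kogito returns the
# new process instance id in the response body.
resp = requests.post(BASE, json={"person": {"name": "Alice"}})
instance_id = resp.json()["id"]

# The second call signals the intermediate event the instance is waiting
# on, carrying the missing data; "ageReceived" is a made-up event name.
requests.post(f"{BASE}/{instance_id}/ageReceived", json={"age": 42})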

Agent can't be in several flowcharts at the time. At least two flowchart blocks are in conflict:

Suppose I have the following supply chain model (see model1).
Agents communicate with each other through a defined network and send messages to each other through ports. For example, demand is generated for customers through their ports and sent as "orders" upstream to facilities. Upstream facilities send "shipments" to downstream facilities,
and stats are collected at each node.
The model seems to work for 2 echelons, but when one facility is connected to two downstream facilities as desired, I get the following error: "Agent can't be in several flowcharts at the time. At least two flowchart blocks are in conflict" (see error). Based on the description, it seems the agent "shipment" is sent to two facilities at the same time.
My question is how could I avoid this conflict?
more information about each node:
Agents' "orders" enter through each node's port and are capture as Enter. take(msg), follow a flowchart, and exit as Agent "shipment" to each destination. Each agent "order" has a double amount and port destination. see facility node
any suggestions please?
Correct, you must make sure that you do not send agents into a flowchart while they are still in another flowchart; that is bad model design.
One way to debug and find the root issue: before sending any message agent, check whether currentBlock() != null and traceln the agent and the block. Also pause the model.
You can then see where you are trying to (re)send an agent that is still in some other flowchart block.
You are probably sending out message agents that are still somewhere else.
PS: For messages, you probably do not want to use flowcharts at all but normal message passing. This avoids these pains, as you can easily send the same message to several agents. Check how message passing is done in the example agent models.

Is there any method to get mutual exclusion in a chef node?

For example, if a process updates a node while a chef-client run is in progress, the chef-client will overwrite the node data:
chef-client gets node data (state 1)
Process A gets node data (state 1)
Process A updates the node data locally (state 2)
Process A saves node data (state 2)
chef-client updates the node data locally (state 2*)
chef-client saves node data, and this node data does not contain the changes from process A (state 2). The chef-client overwrites the node data. (state 2*)
The same problem occurs if we have two processes saving node data at the same moment.
EDIT
We need external modification because we have a nice UI for the Chef server to manage a lot of computers remotely, displayed as a tree (similar to LDAP). An administrator can update the values used by the recipes from there. This project is open source: https://github.com/gecos-team/
Although we had a semaphore system, we have detected that with two or more simultaneous requests we can have a concurrency problem:
In the regular case, the system works,
but sometimes the system does not work.
EDIT 2
I have added a document with a lot of information about our problem.
Throwing what I would do for this case as an answer:
Have a distributed lock mechanism like this one (I'm not using it myself; it is just for the idea).
Build a start/report/error handler which will:
at start, acquire a lock on the node name in the DLM from point 1;
if it can't, abort the run or wait until the lock is free;
at the end (report or error), release the lock.
Modify the external system to do the same as the handler above: acquire a lock before modifying, and release it when done.
Pay attention to the lock lifetime! It should be longer than your Chef run plus a margin, and the UI should ensure its lock is still held before writing and abort if not. A sketch of the UI side follows below.
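Here is a minimal sketch of that UI side, using Consul (suggested in another answer below) as the DLM via its session/KV HTTP API; the key name, TTL, and Chef-server write step are assumptions of this sketch, and a long-running client should also renew the session periodically:

import requests

CONSUL = "http://localhost:8500/v1"
NODE = "web01.example.com"  # the node whose data we want to edit

# Create a session whose TTL outlives a Chef run plus a margin.
session = requests.put(f"{CONSUL}/session/create",
                       json={"TTL": "600s"}).json()["ID"]

# Try to acquire the lock; Consul answers true or false in the body.
if not requests.put(f"{CONSUL}/kv/locks/{NODE}?acquire={session}",
                    data="ui").json():
    raise SystemExit(f"{NODE} is converging, try again later")
try:
    # ... prepare the node data changes here ...
    # Re-check the lock is still ours before writing (the TTL may have
    # expired), then write to the Chef server.
    entry = requests.get(f"{CONSUL}/kv/locks/{NODE}").json()[0]
    if entry.get("Session") != session:
        raise SystemExit("lost the lock, aborting the write")
    # ... save the node data via the Chef server API here ...
finally:
    requests.put(f"{CONSUL}/kv/locks/{NODE}?release={session}")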
A way to get rid of the handler (but you still need a lock for the UI) is to take advantage of the reporting API (a premium feature of Chef 12, free under 25 nodes, license needed beyond that).
This turns out a bit convoluted and needs the node to do reporting (so the chef-server URL should end with organizations/ and the client version should be above 11.16, or use the backport).
Then you can ask for the runs of a node, check whether one is at "started" status for this node, and wait until it has ended.
Chef doesn't implement a transaction feature, and it also does not re-converge nodes on updates automatically by default. It is open to race conditions, which you can try to reduce by updating node attributes from within a chef-client run (right before you do something critical), but you will never end up with a reliable, working setup.
The longer the converge runs, the larger the gap and the risk of corruption.
Chef's node attributes are only useful for debugging, or for modification by the chef-client running on the node itself, and are pretty much useless in highly concurrent/dynamic environments.
I would use Consul.io to coordinate semaphores and key/value configuration data in real time. Access it from Chef recipes or LWRPs through one of the various interfaces Consul provides (HTTP, DNS, …).
You can implement a very simple push-job task to run chef-client (IMHO easier and more powerful than Chef's "push jobs" feature, though not integrated into Chef's ACL/user management), which is also guarded by a distributed semaphore or uses the "leader election" feature. Of course, you'll have to add this logic to your node update script too.
The chef-client will then acquire a lock on start and block you from manipulating data while it converges, and vice versa. A sketch of such a guarded run is below.
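A minimal sketch of that guarded run, under the same Consul assumptions as the earlier lock example (key names made up; a real implementation should also renew the session during long runs):

import socket
import subprocess
import time

import requests

CONSUL = "http://localhost:8500/v1"
NODE = socket.getfqdn()

session = requests.put(f"{CONSUL}/session/create",
                       json={"TTL": "600s"}).json()["ID"]
# Block until we hold this node's lock, then converge and release.
while not requests.put(f"{CONSUL}/kv/locks/{NODE}?acquire={session}",
                       data="chef-client").json():
    time.sleep(5)
try:
    subprocess.run(["chef-client"], check=True)
finally:
    requests.put(f"{CONSUL}/kv/locks/{NODE}?release={session}")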
I discovered this one in production and came to the conclusion that there is no safe way to edit node attributes directly. Leave it to the chef-client :-)
The good news is that there are other, more reliable ways to set node attributes. Chef roles and environments can both be edited safely while a client is running and only take effect during the next Chef run. Additionally, node attribute precedence rules ensure that any settings you make override those that might be set by a recipe.
I suggest avoiding Chef node data updates from your external app and moving that desired node configuration to a Chef data bag.
Nodes will then read both the Chef node data and the configuration data bag, but write only to the node data; your external app reads both but writes only to the data bag.
If you want to avoid a dependency on another external service, perhaps you could use some kind of time slicing.
Roughly: nodes only start a chef-client run on odd minutes, and the API only updates Chef data on even minutes (distribute these even minutes if you have more than one queue). A sketch of the node side is below.
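A minimal sketch of the node side of that scheme (the API side would check for even minutes instead; this also assumes a run fits within its slice):

import subprocess
import time
from datetime import datetime

while True:
    # Nodes converge only on odd minutes; the API writes only on even ones.
    if datetime.now().minute % 2 == 1:
        subprocess.run(["chef-client"])
    time.sleep(30)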

Mapping between a port and a process id

When a packet is routed to its destination, a port number is used to map it to the appropriate process on the server. However, I cannot find any documentation on how the (port, process) mapping is done. Please point me to some interesting links/references. Thanks.
The operating system knows which process has which ports open; that's about it in general terms. A specific answer would require specifying a specific operating system, but you can guess that there is something like a port control block for each port, and that it probably contains the PID of the process that owns it, or a pointer to its process control block, etc.
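On Linux, for instance, the kernel exposes enough to rebuild this mapping yourself: each socket listed in /proc/net/tcp carries an inode, and each process's /proc/<pid>/fd entries link to the sockets it owns; this is essentially what tools like ss -p and lsof -i do. A rough sketch (IPv4 only, and it needs root to inspect other users' processes):

import os

def pid_for_tcp_port(port):
    # Find the inode of the socket listening on the port.
    inode = None
    with open("/proc/net/tcp") as f:
        for line in f.readlines()[1:]:
            fields = line.split()
            local_port = int(fields[1].split(":")[1], 16)
            if local_port == port and fields[3] == "0A":  # 0A = LISTEN
                inode = fields[9]
                break
    if inode is None:
        return None
    # Find the process whose fd table holds that socket inode.
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            for fd in os.listdir(f"/proc/{pid}/fd"):
                if os.readlink(f"/proc/{pid}/fd/{fd}") == f"socket:[{inode}]":
                    return int(pid)
        except OSError:  # process exited or permission denied
            continue
    return None

print(pid_for_tcp_port(22))  # e.g. prints the PID of sshd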

RESTful Job Assignment

I have a collection of jobs that need processing, http://example.com/jobs. Each job has a status of "new", "assigned" or "finished".
I want slave processes to pick off one "new" job, set its status to "assigned", and then process it. I want to ensure each job is only processed by a single slave.
I considered having each slave do the following:
GET http://example.com/jobs
Pick one that's "new" and do an HTTP PUT to http://example.com/jobs/123 {"status=assigned"}.
Repeat
The problem is that another slave may have assigned the job to itself between the GET and the PUT. I could have the second PUT return a 409 (Conflict), which would signal the second slave to try a different job.
Am I on the right track, or should I do this differently?
I would have one process that picks "new" jobs and assigns them. Other processes would independently go in and look to see whether they've been assigned a job. You'd need some way to identify which process a job is assigned to, so some kind of slave process id would be called for.
(You could use POST too, as what you're trying to do isn't idempotent anyway.)
You could give each of your clients a unique ID (possibly a UUID) and have an "assignee/worker" field in your job resource.
GET http://example.com/jobs/
POST { "worker"=$myID } to http://example.com/jobs/123
GET http://example.com/jobs/123 and check that the worker ID is that of the client
You could combine this with conditional requests too.
On top of this, you could have a timeout feature: if the job queue doesn't hear back from a given client, it puts the job back in the queue.
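A sketch of that claim-and-verify loop in Python, following the question's URLs and field names; the If-Match/ETag handling is the "conditional requests" part and assumes the server emits ETags for job resources:

import uuid

import requests

MY_ID = str(uuid.uuid4())  # unique worker id, as suggested above
BASE = "http://example.com/jobs"

def process(job):  # placeholder for the real work
    print("working on", job["id"])

for job in requests.get(BASE).json():
    if job["status"] != "new":
        continue
    url = f"{BASE}/{job['id']}"
    current = requests.get(url)
    # Conditional update: succeeds only if nobody changed the job since we
    # read it; otherwise the server answers 412 Precondition Failed.
    resp = requests.put(url,
                        json={"status": "assigned", "worker": MY_ID},
                        headers={"If-Match": current.headers["ETag"]})
    if resp.ok and requests.get(url).json().get("worker") == MY_ID:
        process(job)
        break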
It looks like the statuses are an essential part of your job domain model, so I would expose them as dedicated sub-resources:
# 'idle' is what you called 'new'
GET /jobs/idle
GET /jobs/assigned
# start job
PUT /jobs/assigned/123
A slave is only allowed to gather jobs via GET /jobs/idle; this never includes jobs which are running. There could still be race conditions (two slaves get the set before one of them has started a job). I think 400 Bad Request, or the 409 Conflict you mentioned, is alright for that.
I prefer the above resource structure to working with payloads (which often looks more "procedural" to me).
I was a little too specific; I don't actually care that the slave gets to pick the job, just that it gets a unique one.
With that in mind, I think #manuel aldana was on the right track, but I've made a few modifications.
I'll keep the /jobs resource, but also expose a /jobs/assigned resource. A single job may exist in both collections.
The slave can POST to /jobs/assigned with no parameters. The server will choose one "new" job, move it to "assigned", and return its URL (/jobs/assigned/{jobid} or /jobs/{jobid}) in the Location header with a 201 status.
When the slave finishes the job, it will PUT to /jobs/{jobid} (status=finished).
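A sketch of a slave loop under this final design (same example host; the assignment semantics of POST /jobs/assigned are the ones described above, and treating any non-201 status as "nothing to do" is an assumption):

import time

import requests

BASE = "http://example.com"

while True:
    # Ask the server to assign us one "new" job; on success it answers 201
    # with the job's URL in the Location header.
    resp = requests.post(f"{BASE}/jobs/assigned")
    if resp.status_code != 201:  # assume non-201 means no jobs left
        time.sleep(10)
        continue
    job_url = resp.headers["Location"]
    job = requests.get(job_url).json()
    # ... do the work here ...
    requests.put(job_url, json={"status": "finished"})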