Fluentd persistence and reliability - persistence

Since fluentd does not use redis but supposedly has better built in reliability, how does that solve the problem of the instance going down before it has a chance to send the logs to elastic search? Is this something not significant enough to worry about, for example you could set the steaming of logs at a high frequency, so if you ever lose the instance, only a few lines would have no transferred over?

Fluentd uses a buffering mechanism, once it receive a set of events they are stored either in memory or in the file system, the latest is what is used for reliability. The events are stored in chunks, then upon a certain period of time it flush the chunks to the destination. If a chunk failed, it will retry later.
You can read more about buffering in the official documentation:
http://docs.fluentd.org/articles/buffer-plugin-overview

Related

Can Cassandra or ScyllaDB give incomplete data while reading with PySpark if either clusters are left un-repaired forever?

I use both Cassandra and ScyllaDB 3-node clusters and use PySpark to read data. I was wondering if any of them are not repaired forever, is there any challenge while reading data from either if there are inconsistencies in nodes. Will the correct data be read and if yes, then why do we need to repair them?
Yes you can get incorrect data if reapir is not done. It also depends on with what consistency you are reading or writing. Generally in production systems writes are done with (Local_one/Local_quorum) and read with Local_quorum.
If you are writing with weak consistency level, then repair becomes important as some of the nodes might not have got the mutations and while reading those nodes may get selected.
For example if you write with consistency level ONE on a table TABLE1 with a replication of 3. Now it may happen your write was written to NodeA only and NodeB and NodeC might have missed the mutation. Now if you are reading with Consistency level LOCAL_QUORUM, it may happen that NodeB and 'NodeC' get selected and they do not return the written data.
Repair is an important maintenance task for Cassandra which should be done periodically and continuously to keep data in healthy state.
As others have noted in other answers, different consistency levels make repair more or less important for different reasons. So I'll focus on the consistency level that you said in a comment you are using: LOCAL_ONE for reading and LOCAL_QUORUM for writing:
Successfully writing with LOCAL_QUORUM only guarantees that two replicas have been written. If the third replica is temporarily down, and will later come up - at that point one third of the read requests for this data, reads done from only one node (this is what LOCAL_ONE means) will miss the new data! Moreover, there isn't even a guarantee of so-called monotonic consistency - you can get new data in one read (from one node), and the old data in a later read (from another node).
However, it isn't completely accurate that only a repair can fix this problem. Another feature - enabled by default on both Cassandra and Scylla - is called Hinted Handoff - where when a node is down for relatively short time (up to three hours, but also depending on the amount of traffic in that period), other nodes which tried to send it updates remember those updates - and retry the send when the dead node comes back up. If you are faced only with such relatively short downtimes, repair isn't necessary and Hinted Handoff is actually enough.
That being said, Hinted Handoff isn't guaranteed perfect and might miss some inconsistencies. E.g., the node wishing to save a hint might itself be rebooted before it managed to save the hint, or replaced after saving it. So this mechanism isn't completely foolproof.
By the way, there another thing you need to be aware of: If you ever intend to do a repair (e.g., perhaps after some node was down for too long for Hinted Handoff to have worked, or perhaps because a QUORUM read causes a read repair), you must do it at least once every gc_grace_seconds (this defaults to 10 days).
The reason for this statement is the risk of data resurrection by repair which is too infrequent. The thing is, after gc_grace_seconds, the tombstones marking deleted items are removed forever ("garbage collected"). At that point, if you do a repair and one of the nodes happens to have an old version of this data (prior to the delete), the old data will be "resurrected" - copied to all replicas.
In addition to Manish's great answer, I'll just add that read operations run consistency levels higher than *_ONE have a (small...10% default) chance to invoke a read repair. I have seen that applications running at a higher consistency level for reads, will have less issues with inconsistent replicas.
Although, writing at *_QUORUM should ensure that the majority (quorum) of replicas are indeed consistent. Once it's written successfully, data should not "go bad" over time.
That all being said, running periodic (weekly) repairs is a good idea. I highly recommend using Cassandra Reaper to manage repairs, especially if you have multiple clusters.

JBoss HornetQ/ActiveMQ Artemis saving queue message to file system

Can someone please help me in understanding the impact of saving Hornetq/ActiveMQ Artemis messages to the file system and bypassing the queue every time?
The message is more than 2GB, and I run into Maximum size 2GB exceeded exception in HornetQ described here. So I was planning to not add the message to queue but write it manually to disk and pass the path of the file in header and read the message from the file. I really don't know the performance impact so asking if I do for all messages less than 2GB, will there be any performance impact?
Given the information you've provided I don't think anybody but you can determine the "performance impact" of manually writing the file to disk vs. sending the file to the broker.
Generally speaking, you will save the time required to send the file to the broker in the first place, but you don't indicate how fast the hard-drives are. If the hard-drive on the broker is much faster than the hard-drive on the client then it may take longer overall to write the file to disk manually.
Also, if the network is slow between the clients and the broker and fast between the clients and the shared drive where you're going to write the file then it may be overall faster to write the file to disk manually.
Ultimately it's going to be up to you to test the performance impact of your changes.

Use-case of non-persistence messages in message queue

I was reading about Messaging Queues and found that the messages can be of two types : Persistence and Non-Persistence .
Persistence Message are stored in disk/database so that they will survive a broker restart while the Non-Persistence Messages are stored in Memory which do not survive a broker restart.
Persistent messaging is usually slower than non-persistent delivery.
But I am unable to think of a specific use-case of non-persistent messages.
Can anyone give an example when a programmer should use non-persistent messages.
Generally speaking, when it doesn't matter much if you lose some messages.
For example, railroad signaling... the signals send their state every few seconds. If one or two get lost, there are more coming.
Or stock price display... if the display fails to update for a bit, it's not really a big deal. Not talking about trading activity here - just display in a public area or something.
Aside from specific "business" applications where persistence might not be required, there is another important reason why non-persistent messages might be preferred over persistent ones - performance. Sending and consuming non-persistent messages is almost always much, much faster than the same operations with persistent messages. When dealing with persistent messages the broker must interact with a storage device (e.g. local HDD, local SSD, network attached storage, etc.) which will usually be orders of magnitude slower than RAM (i.e. where non-persistent messages live).

Mongodb update guarantee using w=0

I have a large collection with more that half a million of docs, which I need to updated continuously. To achieve this, my first approach was to use w=1 to ensure write result, which causes a lot of delay.
collection.update(
{'_id': _id},
{'$set': data},
w=1
)
So I decided to use w=0 in my update method, now the performance got significantly faster.
Since my past bitter experience with mongodb, I'm not sure if all the update are guaranteed when w=0. My question is, is it guaranteed to update using w=0?
Edit: Also, I would like to know how does it work? Does it create an internal queue and perform update asynchronously one by one? I saw using mongostat, that some update is being processed even after the python script quits. Or the update is instant?
Edit 2: According to the answer of Sammaye, link, any error can cause silent failure. But what happens if a heavy load of updates are given? Does some updates fail then?
No, w=0 can fail, it is only:
http://docs.mongodb.org/manual/core/write-concern/#unacknowledged
Unacknowledged is similar to errors ignored; however, drivers will attempt to receive and handle network errors when possible.
Which means that the write can fail silently within MongoDB itself.
It is not reliable if you wish to specifically guarantee. At the end of the day if you wish to touch the database and get an acknowledgment from it then you must wait, laws of physics.
Does w:0 guarantee an update?
As Sammaye has written: No, since there might be a time where the data is only applied to the in memory data and is not written to the journal yet. So if there is an outage during this time, which, depending on the configuration, is somewhere between 10 (with j:1 and the journal and the datafiles living on separate block devices) and 100ms by default, your update may be lost.
Please keep in mind that illegal updates (such as changing the _id of a document) will silently fail.
How does the update work with w:0?
Assuming there are no network errors, the driver will return as soon it has send the operation to the mongod/mongos instance with w:0. But let's look a bit further to give you an idea on what happens under the hood.
Next, the update will be processed by the query optimizer and applied to the in memory data set. After sucessful application of the operation a write with write concern w:1 would return now. The operations applied will be synced to the journal every commitIntervalMs, which is divided by 3 with write concern j:1. If you have a write concern of {j:1}, the driver will return after the operations are stored in the journal successfully. Note that there are still edge cases in which data which made it to the journal won't be applied to replica set members in case a very "well" timed outage occurs now.
By default, every syncPeriodSecs, the data from the journal is applied to the actual data files.
Regarding what you saw in mongostat: It's granularity isn't very high, you might well we operations which took place in the past. As discussed, the update to the in memory data isn't instant, as the update first has to pass the query optimizer.
Will heavy load make updates silently fail with w:0?
In general, it is safe to say "No." And here is why:
For each connection, there is a certain amount of RAM allocated. If the load is so high that mongo can't allocate any further RAM, there would be a connection error – which is dealt with, regardless of the write concern, except for unacknowledged writes.
Furthermore, the application of updates to the in memory data is extremely fast - most likely still faster than they come in in case we are talking of load peaks. If mongod is totally overloaded (e.g. 150k updates a second on a standalone mongod with spinning disks), problems might occur, of course, though even that usually is leveraged from a durability point of view by the underlying OS.
However, updates still may silently disappear in case of an outage when the write concern is w:0,j:0 and the outage happens in the time the update is not synced to the journal.
Notes:
The optimal balance between maximum performance and minimal guaranteed durability is a write concern of j:1. With a proper setup, you can reduce the latency to slightly over 10ms.
To further reduce the latency/update, it might be worth having a look at bulk write operations, if those apply to your use case. In my experience, they do more often than not. Please read and try before dismissing the idea.
Doing write operations with w:0,j:0 is highly discouraged in case you expect any guarantee on data durability. Use a t your own risk. This write concern is only meant for "cheap" data, which is easy to reobtain or where speed concern exceeds the need for durability. Collecting real time weather data in a large scale would be an example – the system still works, even if one or two data points are missing here and there. For most applications, durability is a concern. Conclusion: use w:1,j:1 at least for durable writes.

NServiceBus: How to stop distributor acting as a processing bottleneck (reduces rate 65%)

We have an event processing system that will process events sent directly from the source to handler process at 200 eps (events per second). The queues and message sends are transactional. Adding the NSB distributor between the event generator and the handler process reduces this rate from 200 eps to 70 eps. The disk usage and CPU on the distributor box become significantly higher as well.
Seen with commercial build of NServiceBus, version 2.6.0.1505.
Has anyone else seen this behaviour or have any advice?
One thing you can play with is where MSDTC is located. You can have your workers use the same MSDTC as the distributor, therefore downgrading the level of the transaction and speeding up commits. I would recommend if you do this that you cluster MSDTC to protect against failures.
Assuming you are operating on a DB you could shard your databases to work on different sets of data. You could also move the DB(s) closer to the workers(to the same machine).
I would also check into the settings of your DB provider and MSMQ as there are a few things to tweak there in terms of timeouts and such. Note that there is a trade off when applying certain settings, but it sounds like you'd prefer the quickest throughput.
There are lots of other system level things to check, I'll assume you've been through all those items(network/disk/RAM/etc).