Druid segments not available after some moment

The ingestion task succeeded, but the segments have been unavailable since some point in time.
I can't restart the master node, because it is not my personal server.
Why are the segments unavailable, and how can I fix it?

For some reason the cluster has the segments registered in its metadata database and tries to download them, but cannot. Either your Historical node's disk is full, or there are connection issues preventing it from downloading the segments from your deep storage (if that is S3, for example; if your deep storage is a local disk or a mounted network drive, check why the Historical cannot access it).
Segments are available only once a Historical is able to download them; Historical nodes are instructed by the Coordinator to download segments.
The Historical may still be in the process of downloading them without any specific error having occurred; performance is simply poor and it takes time to download all of them and make them available.
Maybe it cannot download them at all (read the Historical's log).
Maybe the Coordinator has some issues (read the Coordinator's log).
In any case, if a segment is unavailable, the log entries for these possible errors can be found in either the Historical's log or the Coordinator's log.
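Before digging into the logs, it can help to confirm how much is actually loaded. Below is a minimal Python sketch (assuming the Coordinator listens on localhost:8081; adjust the address for your cluster) that queries the Coordinator's load-status endpoint; any datasource reporting less than 100% still has segments that no Historical is serving.

import json
import urllib.request

# Assumption: the Coordinator is reachable at this address.
COORDINATOR = "http://localhost:8081"

# Percentage of segments loaded per datasource; below 100 means some
# segments are not yet served by any Historical.
with urllib.request.urlopen(COORDINATOR + "/druid/coordinator/v1/loadstatus") as resp:
    load_status = json.load(resp)

for datasource, pct_loaded in load_status.items():
    print(datasource, pct_loaded)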

Related

ActiveMQ Artemis: Disk filling indefinitely without consumers or producers

We are testing ActiveMQ Artemis 2.22.0 with clients using the core protocol. The broker is configured to apply paging. We let producers fill up the broker with messages until max-disk-usage stopped all producers.
We then tried connecting consumers, which worked at first. However, the broker's disk kept filling until the broker crashed completely.
Now, even after we have manually disconnected all clients, we see that after a restart the broker keeps extending its message journal until the disk is full again. Right after the restart we see a lot of messages saying "deleting orphaned file", and disk usage goes down. After a few seconds, however, the journal starts growing again and the story repeats.
That's probably not enough information to clearly solve our issue. Thus, here are my questions:
What are possible causes to fill disk space if neither consumers nor producers are connected?
How can we debug such a situation?
In case the journal became corrupt (really hoping it hasn't), is there any way to first verify that and second to restore it?
What are possible causes to fill disk space if neither consumers nor producers are connected?
You may be hitting ARTEMIS-3868 in which case I strongly recommend you move to the latest release (i.e. 2.25.0 at this point).
How can we debug such a situation?
The first thing to do would be to use the artemis data print command to print details about exactly what is in the journal. Hopefully that will shed light on what is causing the journal growth.
In case the journal became corrupt (really hoping it hasn't), is there any way to first verify that and second to restore it?
Particular records in the journal may be corrupted and lost, but the broker should be able to read and recover everything that is still viable.

Implement exclusive access to an EFS/NFS directory

The goal is to have a shared EFS store used by multiple containers, where each container knows which directory it needs and only one container at a time can write inside that directory. If multiple containers are started with the same target directory, only one should acquire the lock; the others may fail or keep waiting.
The only requirement is that data in one directory is never modified by two containers.
Considerations:
NFS file locking is not applicable: an NFS client can lose connectivity, which releases the lock and allows another process to write to the directory, while the original client may still write to it once its connection is restored.
Using a consensus-based key-value store to decide whether a directory is available for access. This requires reliable fencing in case a container that was declared dead comes back. Is that possible for disk writes? (A sketch of this idea follows after this list.)
Using a fencing agent installed alongside the containers that kills a container if it detects connectivity issues. Are there reliable options here? How do we deal with detection/kill lag?
Is there a reasonable way to solve this? Are there good alternatives (which don't involve redesigning the app)?
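To make the fencing-token consideration above concrete, here is a minimal Python sketch. The lease client is hypothetical (a stand-in for etcd, ZooKeeper, or similar), and the token-file layout is an assumption; note that it only illustrates the token bookkeeping and does not by itself close the race described above, because the token check and the subsequent writes are not atomic over NFS.

import json
import os


def acquire_lease(kv_client, directory):
    # Hypothetical helper: ask a consensus store (etcd, ZooKeeper, ...) for an
    # exclusive lease on `directory` and return a monotonically increasing
    # fencing token. The real call depends on the store you choose.
    return kv_client.lock(directory)


def write_with_fencing(kv_client, directory, filename, data):
    token = acquire_lease(kv_client, directory)
    token_file = os.path.join(directory, ".fencing_token")

    # Record the highest token seen in the directory itself. A writer holding
    # a lower token has been fenced off (its lease was taken over while it was
    # partitioned) and must not touch the data.
    try:
        with open(token_file) as f:
            latest = json.load(f)["token"]
    except FileNotFoundError:
        latest = -1
    if token < latest:
        raise RuntimeError("stale lease: another container owns this directory")

    with open(token_file, "w") as f:
        json.dump({"token": token}, f)

    with open(os.path.join(directory, filename), "w") as f:
        f.write(data)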

FileMaker 15 Pro Server performance

I am creating a database in FileMaker; the database is about 1 GB and includes around 500 photos.
FileMaker Server is having performance issues: it crashes and is slow when searching through the database. My IT department recommended raising the cache memory.
I raised the cache to 252 MB, but the server is still struggling to deliver consistent performance. The database now shows peaks in CPU usage.
What can cause this problem?
Verify at FileMaker.com that your server meets the minimum requirements for your version.
For starters:
Increase the cache to 50% of the total memory available to FileMaker server.
Verify that the hard disk is unfragmented and has plenty of free space.
FM Server should be extremely stable.
FMS only does two things:
reads data from the disk and sends it to the network
takes data from the network and writes it to the disk
Performance bottlenecks are always disk and network. FMS is relatively easy on CPU and RAM unless Web Direct is being used.
Things to check:
Are users connecting through ethernet or wifi? (Wifi is slow and unreliable.)
Is FMS running in a virtual machine?
Is the machine running a supported operating system?
Is the database using Web Direct? (Use a 2-machine deployment for web direct.)
Is there anything else running on the machine? (Disable antivirus scanning and file indexing.)
Make sure users are accessing the live databases through FMP client and not through file sharing.
How are the databases being backed up? NEVER let anything other than FMS touch the live files. Only let OS-level backup processes see backup copies, never the live files.
Make sure all the energy saving options on the server are DISABLED. You do NOT want the CPU or disks sleeping or powering down.
Put the server onto an uninterruptible power supply (UPS). Bad power could be causing problems.

Real Time Streaming With Multiple Data Sources Using Kafka

We are planning to build a real-time monitoring system with Apache Kafka. The overall idea is to push data from multiple data sources to Kafka and perform data quality checks. I have a few questions about this architecture:
What are the best approaches for streaming data from multiple sources (mainly Java applications, Oracle databases, REST APIs, and log files) to Apache Kafka? Note that each client deployment includes each of these data sources, so the number of data sources pushing data to Kafka would equal the number of customers * x, where x is the number of data source types listed above. Ideally, a push approach would suit best instead of a pull approach: in a pull approach, the target system would have to be configured with the credentials of the various source systems, which would not be practical.
How do we handle failures?
How do we perform data quality checks on the incoming messages? For example, if a certain message does not have all the required attributes, the message could be discarded and an alert raised for the maintenance team to check.
Kindly let me know your expert inputs. Thanks!
I think the best approach here is to use Kafka Connect: link
But it's a pull approach:
Kafka Connect sources are pull-based for a few reasons. First, although connectors should generally run continuously, making them pull-based means that the connector/Kafka Connect decides when data is actually pulled, which allows for things like pausing connectors without losing data, brief periods of unavailability as connectors are moved, etc. Second, in distributed mode the tasks that pull data may need to be rebalanced across workers, which means they won't have a consistent location or address. While in standalone mode you could guarantee a fixed network endpoint to work with (and point other services at), this doesn't work in distributed mode where tasks can be moving around between workers. Ewen
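To make the data-quality check from the question concrete, here is a minimal Python sketch using the kafka-python client. The broker address, topic names, and required-attribute list are assumptions: messages missing required attributes are routed to a dead-letter topic where an alerting or maintenance process can pick them up.

import json

from kafka import KafkaConsumer, KafkaProducer

# Assumptions: broker at localhost:9092, source topic "events",
# dead-letter topic "events-dlq", and this set of required attributes.
REQUIRED_ATTRIBUTES = {"source", "timestamp", "payload"}

consumer = KafkaConsumer(
    "events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for message in consumer:
    record = message.value
    missing = REQUIRED_ATTRIBUTES - record.keys()
    if missing:
        # Quality check failed: send the record and the missing fields to the
        # dead-letter topic so the maintenance team can be alerted.
        producer.send("events-dlq", value={"record": record, "missing": sorted(missing)})
    else:
        # Quality check passed: hand the record to normal processing here.
        pass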

Sending a large file over network continuously

We need to write software that would continuously (i.e. new data is sent as it becomes available) send very large files (several TB) to several destinations simultaneously. Some destinations have a dedicated fiber connection to the source, while others do not.
Several questions arise:
We plan to use TCP sockets for this task. What failover procedure would you recommend in order to handle network outages and dropped connections?
What should happen upon upload completion: should the server close the socket? If so, then is it a good design decision to have another daemon provide file checksums on another port?
Could you recommend a method to handle corrupted files, aside from downloading them again? Perhaps I could break them into 10 MB chunks and calculate checksums for each chunk separately?
Thanks.
Since no answers have been given, I'm sharing our own decisions here:
There is a separate daemon for providing checksums for chunks and whole files.
We have decided to abandon the idea of using multicast over VPN for now; we use a multi-process server to distribute the files. The socket is closed and the worker process exits as soon as the file download is complete; any corrupted chunks need to be downloaded separately.
We use a filesystem monitor to capture new data as soon as it arrives to the tier 1 distribution server.
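As an illustration of the per-chunk checksum idea above, here is a minimal Python sketch (the 10 MB chunk size, SHA-256, and the file name are assumptions). A receiver can compare these per-chunk digests against its own and re-download only the chunks that differ, which is what the separate checksum daemon is meant to support.

import hashlib

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB, as suggested in the question


def chunk_checksums(path, chunk_size=CHUNK_SIZE):
    # Yield (chunk_index, sha256_hexdigest) for each fixed-size chunk of the file.
    with open(path, "rb") as f:
        index = 0
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield index, hashlib.sha256(chunk).hexdigest()
            index += 1


# Example: print a manifest the checksum daemon could serve alongside the file.
for index, digest in chunk_checksums("large_file.bin"):
    print(index, digest)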