Can I use Mirth Connect to store millions of HL7v2 messages (pipe delimited) and have our third-party software application query them programmatically at a later point in time?
What's the best way to do that? Is Mirth's REST API capable of querying its message store efficiently?
Unfortunately, according to the manual (page 368), I need a running Mirth Connect instance to browse the REST API documentation. (If browsing the REST API documentation didn't require a running instance of Mirth, I wouldn't have asked this question. Is there a Mirth Connect instance available on the internet to play with? Or would somebody be so kind as to post the relevant REST API documentation for this question?)
So far, these are the scenarios I have come up with:
Mirth is an integration engine, and its strength is processing messages. Browsing historical messages can at times be difficult or slow, depending on the storage settings for the channel and whether or not you take care to pull additional information out during processing to store in "custom metadata" fields. The custom metadata fields are not indexed by default, but you can add your own indexes (Mirth supports several back-end databases, including PostgreSQL, MySQL, Oracle, and MSSQL). Searching the message content basically involves doing a full-text search and scanning. Filter options to reduce scan time, apart from the custom metadata you create, are mostly related to the message properties (datetime received, status, etc.) and not the content.
So, I would not recommend it for the use-case you are suggesting.
However, Mirth could definitely be used to convert your messages (batched from files or live) to XML, which could then be put in a database designed to handle and query large volumes of XML documents. I assume when you say HL7 you mean the ER7 (pipe-delimited) format of HL7v2. Mirth automatically does the conversion to XML for those types of messages, as they are handled as XML during processing. You could easily create a new parent node that holds both the converted XML and the original message string as children.
If the database you choose has a JDBC driver, Java SDK, or HTTP/REST API, Mirth can likely insert the converted messages for you directly as it processes them.
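As a rough illustration of that last step, here is what such an external store might look like; this is not Mirth's own API, just a sketch assuming a PostgreSQL database reached from Python with psycopg2, and all table, column, and connection names are made up (inside Mirth you would do the equivalent insert from a Database Writer destination or a JDBC connection):

    # Sketch of an external message store that a third-party application can query later.
    # Hypothetical schema: one row per message, original ER7 plus Mirth's XML rendering.
    import psycopg2

    raw_er7_string = "MSH|^~\\&|SENDING_APP|..."            # placeholder ER7 content
    converted_xml_string = "<HL7Message>...</HL7Message>"   # placeholder XML rendering

    conn = psycopg2.connect("dbname=hl7store user=hl7 password=secret host=localhost")
    cur = conn.cursor()

    cur.execute("""
        CREATE TABLE IF NOT EXISTS hl7_messages (
            id            bigserial PRIMARY KEY,
            received_at   timestamptz NOT NULL DEFAULT now(),
            control_id    text,    -- MSH-10, pulled out during processing
            message_type  text,    -- e.g. ADT^A01
            raw_er7       text,    -- original pipe-delimited message
            msg_xml       text     -- converted XML (could also be an xml/jsonb column)
        )
    """)
    cur.execute("CREATE INDEX IF NOT EXISTS idx_hl7_control_id ON hl7_messages (control_id)")

    # One insert per processed message (values are placeholders).
    cur.execute(
        "INSERT INTO hl7_messages (control_id, message_type, raw_er7, msg_xml) "
        "VALUES (%s, %s, %s, %s)",
        ("MSG00001", "ADT^A01", raw_er7_string, converted_xml_string),
    )
    conn.commit()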
There are two misconceptions here:
An HL7v2 message is triggered by a real-world event, called the trigger event, on the placer (sender) side. It expects some activity to happen on the filler (receiver) side, such as confirming the message or replying with a query response. In other words, HL7v2 supports data flow among systems.
Mirth Connect is an HL7 interface engine aimed at transforming incoming feeds in one format (e.g., HL7v2 in ER7 format) into outgoing feeds in another format (which could be another HL7v2 message, XML, a database, etc.). It does not store anything except a configured portion of messages for audit purposes.
Now, to implement the solution you outlined, Mirth Connect or any other transformation mechanism has to implement two flows: receive, convert if needed, and store incoming messages; and provide an interface to query those messages.
This can obviously be done with Mirth Connect, but the premise of your initial question, that Mirth is capable of storing millions of records, is incorrect. In fact, it's recommended to keep as few messages as possible to speed up Mirth processing (each processed message is stored in the Mirth internal database several times, depending on configuration). Thus, all transformed messages should go into external public or private message storage, exactly as shown in your diagrams.
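For the query flow, the third-party application would then hit that external store directly rather than Mirth's internal tables; a minimal sketch against the hypothetical table described above:

    # The query side: select by indexed columns instead of scanning Mirth's internal DB.
    import psycopg2

    conn = psycopg2.connect("dbname=hl7store user=hl7 password=secret host=localhost")
    cur = conn.cursor()
    cur.execute(
        """
        SELECT control_id, message_type, received_at
        FROM hl7_messages
        WHERE message_type = %s
          AND received_at >= now() - interval '7 days'
        ORDER BY received_at
        """,
        ("ADT^A01",),
    )
    for control_id, message_type, received_at in cur.fetchall():
        print(control_id, message_type, received_at)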
Related
So I am currently working on a chat, and I wonder if I could use Redis to store the chat messages. The messages will only be on the web, and I want a chat history of at least 20 messages for each private chat. The chat subscribers will already be stored in MongoDB.
I mainly want to use Redis because I can get rid of the MongoDB layer and get more speed.
I already use Pub/Sub, but what about storing a copy in Redis lists? Also, what about read statuses, how could I implement those?
Redis only loses data in the case of a power outage; if the system is shut down properly, it will save its data, and in that case data won't be lost.
It is a good approach to dump data from Redis to MongoDB (or any other DB) when a size limit is reached, or on a date basis (weekly or monthly), so that your realtime chat database stays lightweight.
Many modern systems nowadays prepare for power outages: a UPS will kick in and the system will shut down properly.
See: https://hackernoon.com/how-to-shutdown-your-servers-in-case-of-power-failure-ups-nut-co-34d22a08e92
Also, what about read statuses, how could I implement those?
It depends on the protocol you are implementing; if you are using XMPP, see this.
Otherwise, you can use a property in the message model, e.g. "DeliveryStatus", and set it to your enum values (1. Sent, 2. Delivered, 3. Read). Mark a message as Sent as soon as it is received at the server. For Delivered and Read, your clients will send back packets indicating that the respective action has occurred.
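A small sketch of both ideas with redis-py; the key names, the 20-message cap, and the status values are just examples:

    # Keep the last 20 messages of a private chat in a Redis list, with one hash per
    # message so the delivery status can be updated later without rewriting the list.
    import time

    import redis

    r = redis.Redis()

    def store_message(chat_id, msg_id, sender, text):
        # Full message lives in a hash keyed by message id.
        r.hset("msg:" + msg_id, mapping={
            "sender": sender,
            "text": text,
            "ts": time.time(),
            "status": 1,            # 1 = Sent, 2 = Delivered, 3 = Read
        })
        # The chat history is an ordered list of message ids, capped at 20 entries.
        r.lpush("chat:" + chat_id + ":history", msg_id)
        r.ltrim("chat:" + chat_id + ":history", 0, 19)

    def mark_read(msg_id):
        r.hset("msg:" + msg_id, "status", 3)

    def history(chat_id):
        ids = r.lrange("chat:" + chat_id + ":history", 0, -1)
        return [r.hgetall("msg:" + i.decode()) for i in ids]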
As pointed out in the comment above, the important thing to consider here is the persistence model. Redis offers some persistence (with snapshots and AOF files). The important thing is to first understand what you need:
Can you afford to lose all the data? Can you afford to lose some of the data? If the answer is no, then perhaps you should not bother with Redis.
Here are the details of my use case:
What's my data..
There will be user experiences, error reports, state info, and so on. The data is fragmented and may change in the future. So I plan to use NoSQL, maybe MongoDB, to save the data on the server.
What are the clients..
They are clients written in different languages, like C#, C++, LabVIEW, and so on. Some don't even have access to a MongoDB driver, so communicating with the database directly is of course not an option. A framework like the one below is needed.
Clients -> (Some protocol) -> Broker -> Database.
As those clients are not web clients, a common web server using HTTP may not suit my case, right? Are there any suggestions for the protocol, broker, and database, or even a whole new framework?
My goal is to let the clients send data as conveniently as possible.
Thank you!
This is not really new; it is a message-driven application, which is a well-understood pattern.
I did this mostly in Java, so I will stick to this language here.
A broker alone would not be enough here. Let us say you use Apache ActiveMQ as your message broker; you would still need to get your data into the database, since MQ is... a message queue. So you need a part which gets the messages out of MQ, processes them according to your business rules, and stores them in the (correct) database instance and the correct collection/bucket/table. Of course you could write this part by hand, but that would be pretty much reinventing the wheel. There is a notion of a "message routing and mediation engine", and the one most commonly suggested here is Apache Camel, which has quite a few components to communicate with databases and other so-called consumers and producers.
And that is the key point: in general, if possible, your clients should send their data to the message broker directly. But if they can't, they can simply send text files or make REST calls – there are actually too many options to list here. This incoming data can be preprocessed and normalized to your standard format by a "route" in Apache Camel (a set of a consumer, conversion rules, and a producer, in its simplest form) and sent as an AMQP message to MQ. From there, another Camel route can process the AMQP messages, apply your business rules, and store the data in the database... or do whatever else may come to your mind (for example, sending an email).
So this solution supports a multitude of protocols for incoming and outgoing messages (as long as they are supported by Camel), and you have your business rules in a centralized and well-defined location.
To implement this, I'd strongly suggest using Apache ServiceMix, which is a distribution of ActiveMQ, Camel and a system to manage the components and business rules.
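Just to make the data flow concrete, here is roughly what the hand-rolled version of such a route looks like, i.e. the part a Camel route would otherwise give you. It is only a sketch in Python, assuming a broker that speaks AMQP 0-9-1 (e.g. RabbitMQ) with the pika client and MongoDB via pymongo; the queue name, rule, and collection names are invented:

    # Hand-rolled "route": consume from a queue, apply a business rule, store in MongoDB.
    import json

    import pika
    from pymongo import MongoClient

    db = MongoClient("mongodb://localhost:27017")["telemetry"]

    def handle(channel, method, properties, body):
        record = json.loads(body)                    # normalize the incoming payload
        if record.get("type") == "error_report":     # business rule: route by message type
            db.error_reports.insert_one(record)
        else:
            db.client_state.insert_one(record)
        channel.basic_ack(delivery_tag=method.delivery_tag)

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="client.data", durable=True)
    channel.basic_consume(queue="client.data", on_message_callback=handle)
    channel.start_consuming()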
In the end, a web server with the HTTP protocol could suit the use case, I think.
What I mostly want is a universal API for different kinds of clients to save data to the cloud. HTTP has the methods GET, POST, PUT, and DELETE, so a RESTful API is naturally suitable for operating on data, I think.
My final solution is Node.js (Express) + MongoDB (quite a common combination): a RESTful API is provided via the Express web server, and clients can use HTTP to operate on data conveniently. It is also quite lightweight and easy to get started with.
Here is a tutorial: http://cwbuecheler.com/web/tutorials/2013/node-express-mongo/
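For what it's worth, once such a REST API exists, any client with an HTTP library can push data with a plain POST; a minimal sketch in Python (the endpoint and payload fields are made up):

    # Client-side sketch: push one record to a hypothetical REST endpoint.
    import requests

    payload = {
        "client": "labview-rig-07",   # made-up identifiers
        "type": "state_info",
        "value": 42.5,
    }

    response = requests.post("http://example.com/api/records", json=payload, timeout=5)
    response.raise_for_status()
    print(response.json())            # e.g. the stored document, echoed back with its new id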
We use Mirth as our interface engine, with ActiveMQ and a Spring inbound listener to process messages.
Our customers reported that some of the messages are missing from the Mirth console but can be found in the ActiveMQ queue and the Spring listener application.
Initially we thought that someone might have removed them manually from Mirth, but when we checked the event logs there was no sign of a message being removed.
We found this happening with some of the messages but could not identify the cause of the issue or a pattern in the affected messages.
Has anyone faced an issue like this with the Mirth Administrator console?
We have the client DB as well, but we are unable to open it except through Mirth to check whether the data is available.
I would highly appreciate it if someone could help with this.
Thanks
I have found some channels don't display "filtered" messages properly. But I have never seen successful messages go "missing".
If you don't trust the Mirth Admin then I would recommend querying the Mirth DB.
This can be done outside the confines of Mirth, provided that Mirth is writing to an external DB such as MS-SQL Server.
The data you get from it is VERY rich, but if you are sending thousands of messages an hour (or more) you'll probably want to limit the time range you search. Free-text searching like
select * from message m where m.raw_data like ('%needle%')
is NOT recommended and will take a long time to execute.
Being able to search Mirth via the DB has opened up a ton of analysis for us that we don't have through the admin interface.
Just to chime in on this question: if you are running quite a number of channels, or if you have quite a high volume of messages, Mirth may have trouble keeping up with its database updates due to row/table locks and inefficient conversions or data types (this should be resolved now).
We do, however, at peak times every so often, see a message or two processed through the engine with log entries indicating it was unable to insert the message and it was rolled back. I would say we have around 10 per year like that. Hopefully this is a non-issue in Mirth 3 with the new backend...
I am looking for advice/guidance on how to achieve the following:
I have a circuit mounted and connected to an Arduino and I am able to easily retrieve data from it, using Python and the pySerial module. It allows me to determine the value of an analog input over time.
At the moment I am storing that data in a file, with a timestamp and the corresponding value, and I would love to hear opinions and thoughts on how I could 'share' this data with a web server and 'play' it live.
Is it possible to 'stream' the values into the dump file and retrieve data from it at the same time through an AJAX request, or should I look into event-driven web servers like Tornado or Twisted?
I am a bit lost here. Just for the record, I am comfortable with PHP and JavaScript for the final output; I just don't have a clue how to constantly 'stream' the data I need.
Thanks in advance.
If you don't plan to update the Arduino device too much, then it would make sense to have the Python component continue to collect the data over the serial port and publish it in a way that can easily be consumed by a service which can distribute the information in a more efficient, and probably more flexible, manner.
e.g.
1. Read the data from the serial port and publish messages onto a message queue. The message queue can then be read by any other component, and the data can then be distributed to other applications/clients.
2. Make a web call to a server that can process each update and distribute it to other applications/clients.
You could use something like Pusher (who I work for) and make a call to the REST API to deliver each message to any connected clients. Whilst this is a good way of distributing your data, you would be publishing it even if no clients are listening, so I think you are best off getting the data to a component like a web server first.
Assuming you go with 1 or 2, you can then use a realtime web solution to distribute the data to any number of clients. You could use Pusher here, or you could use a self-hosted solution.
So, the data flow as I see it would be:
Arduino -> small Python app -> Queue (or HTTP request to web server) -> Realtime Web Technology -> Many clients
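A minimal sketch of the "small Python app" step, assuming option 2 (an HTTP call per reading); the serial port, endpoint, and payload format are assumptions:

    # Read lines from the Arduino over serial and forward each reading over HTTP.
    import time

    import requests
    import serial

    ser = serial.Serial("/dev/ttyACM0", 9600, timeout=2)   # adjust port/baud rate for your board

    while True:
        line = ser.readline().decode("ascii", errors="ignore").strip()
        if not line:
            continue
        try:
            value = float(line)       # assumes the sketch prints one numeric value per line
        except ValueError:
            continue
        reading = {"timestamp": time.time(), "value": value}
        # The web server (or a realtime layer behind it) can then fan this out to browsers.
        requests.post("http://example.com/api/readings", json=reading, timeout=2)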
I'm about to add IMAP email integration to one of our web applications (ASP.NET / SQL Server). I'm already using a commercial library which exposes the most important IMAP functionality (get folder list, get message headers, get MIME message, etc.).
Getting email data "live" from the IMAP server works very well. But here comes the difficult task: I have to keep the email/folder caching SQL database synchronized with the IMAP server (I have to show data filtered by different criteria).
Our database schema essentially contains a "Folders" and an "Emails" table. The "Emails" table contains primarily header information like "FromAddress", "FromName", "IsRead", "IsAnswered", "IsForwarded", "HasAttachments" etc. (without the email content or attachments).
I have to consider two major scenarios:
Getting all messages the first time (or after a user re-organized the folders)
Getting new/recent messages
What would be a good synchronization strategy for keeping the mail server and database server up to date, considering that performance is a major design criterion? (I can't just query/compare thousands of messages every time I connect in order to find out whether the user moved or deleted some old emails.)
Thanks!
From your library's feature list:
Better UniqueId Support: We've added even more options for requesting a message's unique id. You can now return the UniqueId in a message's DataTable for return trips to the IMAP server.
And:
Retrieve only New Messages
Search Flagged Messages
Mark/Unmark Messages as Read
It looks to me as though your library has all the support you need to keep your SQL server synchronized. You can programmatically mark messages as read, and the library supports retrieval of only new messages. That takes care of your second item.
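Your commercial library hides the wire details, but the mechanism behind "retrieve only new messages" is UID-based fetching. Purely to illustrate the idea (not your library's API), here is a rough sketch with Python's standard imaplib; the host, credentials, and cached last-seen UID are placeholders:

    # UID-based incremental fetch: ask the server only for messages whose UID is
    # greater than the highest UID already stored in the SQL cache for this folder.
    import imaplib

    HOST, USER, PASSWORD = "imap.example.com", "user", "secret"   # placeholders
    last_seen_uid = 4711   # highest UID stored in the Emails table for this folder

    imap = imaplib.IMAP4_SSL(HOST)
    imap.login(USER, PASSWORD)
    imap.select("INBOX", readonly=True)

    # UIDs are ascending and stable within a folder while UIDVALIDITY is unchanged,
    # so "last_seen+1:*" returns only messages not yet cached.
    status, data = imap.uid("SEARCH", None, "UID %d:*" % (last_seen_uid + 1))
    new_uids = [u for u in data[0].split() if int(u) > last_seen_uid]

    for uid in new_uids:
        status, msg_data = imap.uid("FETCH", uid, "(FLAGS BODY.PEEK[HEADER])")
        # ...parse the header fields and insert/update the corresponding row in the Emails table...

    imap.logout()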
Your strategy will depend partly on how your solution works. If I read your question correctly, your users manage their email on the IMAP server, and your SQL Server is "subscribed" to the IMAP server, from a synchronization perspective.
If this is correct, then synchronization is effectively a background task. My approach would be to synchronize using an event model on a user-by-user basis. If possible, "notify" the synchronization program when there is activity (new/deleted emails) for a user. Add a synchronization "job" to a background process that batches synch jobs together. A notification model will ensure that the synch program only works on users that need a synch.
Small new/deleted email synch jobs go to one "processor" and larger jobs like total resynch and folder reorganization go to another. Really big resynch jobs may have to be split up in order to keep overall throughput high. The "small job" and "big job" processors could be two different services, or possibly two different threads depending on performance and design considerations.