We are looking for a solution for collecting data from different SCADA systems, and OPC UA seems like a good approach for that. Data collection will be done from a single system connecting to multiple SCADA systems over the internet (https). So we are planning to develop an OPC UA client that can connect to multiple OPC UA servers. Data will be collected at a given interval. The system should be able to cope with the connection between client and server being lost for a period of time. In that case, I assume we need to retrieve the missed data by looking into historical data. Hence, we need a server that supports HA (Historical Access).
Are there any servers supporting this or do we need to develop our own server implementation?
Or is there a better approach than the one described above?
Any help or hints on this would be appreciated.
How long would you expect the connection to be down for?
While leaning on HA is certainly one way to handle this, I think you'll have a hard time finding any products on the market right now that actually implement HA.
Luckily, you can probably deal with this scenario without HA. If you create your subscriptions with a generous lifetime, and create your monitored items with a queue size that, given the sampling interval, can hold enough data changes to span that lifetime, then upon reconnecting you should receive all of the data changes that happened while the connection between client and server was lost.
If the connection is expected to be down for days/weeks/months, then this won't work without support for durable subscriptions, introduced in UA 1.03, but then again you're limited by finding a server that supports durable subscriptions. (Durable subscriptions are basically just a way to make the lifetime of the target subscriptions much longer than normally allowed, and to instruct the server that it is expected to persist these subscriptions to disk and restore them in the case of e.g. a server restart.)
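To make the sizing concrete, here is a minimal back-of-the-envelope sketch in Python; the interval and outage figures are assumptions, not values from the question:

# Rough sizing for the "generous lifetime + deep queue" approach described above.
# All numbers are illustrative assumptions.
sampling_interval_ms = 1000            # how often the server samples the item
publishing_interval_ms = 1000          # subscription publishing interval
expected_outage_s = 2 * 60 * 60        # longest outage we want to bridge (2 hours)

# Each sampled change occupies one slot in the monitored item's queue, so the
# queue must hold one entry per sampling interval for the whole outage.
required_queue_size = (expected_outage_s * 1000) // sampling_interval_ms
print(required_queue_size)             # 7200 entries in this example

# The subscription lifetime is negotiated as LifetimeCount * PublishingInterval,
# so the lifetime count must also span the outage.
required_lifetime_count = (expected_outage_s * 1000) // publishing_interval_ms
print(required_lifetime_count)

In whichever client SDK you end up using, these two numbers become the requested queue size on the monitored items and the requested lifetime count on the subscription; the exact parameter names vary by SDK.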
Related
For a desktop app (ERP-like functionality) I'm wondering what would be wiser to do.
Assuming that both machines are equal in performance and the server has to deal with at most 5-10 clients and no other obligations: is it better to load all data initially (~20,000 objects) and do filtering, sorting etc. on the client (Electron), or is it better to do the processing on the backend (Golang + Postgres) via Axios? The user interface should be as snappy as possible, but should also get the data as fast as possible.
A costly operation is filtering 15,000 objects by a reference ID (e.g. a client can have several orders).
So objects that belong to a "parent object" are displayed by querying all those objects by a parentID.
Is there a general answer as to what would be more performant, or the better choice here? Assume, say, a latency of 5 ms in the network + 20 ms for the API + a couple more for filling the store.
At what data size will this operation be slower on the frontend than on the backend, or become completely unsustainable?
If it's not a performance problem, are there other reasons I would want to do this on the server?
Edit: Client and Server are on the same local network
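For a sense of scale, here is a toy sketch of the client-side variant, with Python standing in for the Electron/JS client and made-up field names; it builds a parentID index once after the initial load so that per-parent lookups no longer scan all objects:

# Toy illustration only: index ~15,000 objects by parentID once, then look up
# "all orders of a given parent" in O(1) instead of filtering the whole list.
from collections import defaultdict
import time

objects = [{"id": i, "parentID": i % 500, "name": f"order-{i}"} for i in range(15_000)]

by_parent = defaultdict(list)          # one-time index build after the initial load
for obj in objects:
    by_parent[obj["parentID"]].append(obj)

start = time.perf_counter()
children = by_parent[42]               # e.g. all orders belonging to parent 42
elapsed_ms = (time.perf_counter() - start) * 1000
print(len(children), f"{elapsed_ms:.4f} ms")   # microseconds in practice

At these sizes an indexed in-memory lookup is effectively free compared with the assumed 5 ms network + 20 ms API round trip, so the more interesting questions are the initial download time and consistency.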
You specifically mention ERP-like software. For such software you have to carefully consider the value of consistency:
Will your software need to show the same data for all clients?
If the answer to this is yes, then the simplest implementation is to do data processing on the server which informs all clients of changing data.
If the answer to this is no, then you should be fine doing most processing on the client software.
There are of course ways to do most of your processing on the client yet still have consistency, but they will add complexity to your overall design. One implementation is to broadcast changes on one client to all other clients. This is the architecture behind most multiplayer online games.
Another way to tackle this is implemented by git: the data on all clients are different from each other but there are ways to synchronize each client data with the server thus achieving eventual consistency.
Another consideration you have to think about is the size of your data:
Will downloading all the data from the server take more than a few seconds?
If downloading all data from the server takes too long then the UI will be essentially unresponsive when starting.
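To put a rough number on that "few seconds" threshold, a back-of-the-envelope calculation helps; the object size and throughput below are assumptions, since the question doesn't state them:

# Back-of-the-envelope check on the initial download, with assumed numbers.
object_count = 20_000
avg_object_size_bytes = 1_000                     # ~1 kB per serialized object (assumption)
payload_mb = object_count * avg_object_size_bytes / 1_000_000
print(payload_mb)                                 # ~20 MB in this example

lan_throughput_mb_per_s = 50                      # conservative figure for a local gigabit LAN
print(payload_mb / lan_throughput_mb_per_s)       # well under a second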
In other words, is it possible for a server or a client to receive partial data before the transmission is finished?
I am sure this information is written somewhere in the extensive documentation of the OPC Foundation, but I think this is essential.
I am using "atomic" in the sense used for database writes. When a programmer updates a table, the update is always atomic in the sense that it is either done or not done. We rely on the DB software to guarantee that the operation is never only partially successful.
In the case of a PLC acting as an OPC UA server: when the client writes, say, 1 KByte of data, can we be sure that the program running on the PLC will not, at any instant, read a part of this data while it is still being communicated and written? Since we are dealing with a very fast reading entity (the PLC) on the other side of the communication, is it possible that the PLC gets the first 100 bytes before the rest is received?
In the case of a PLC acting as an OPC UA server: when the client writes, say, 1 KByte of data, can we be sure that the program running on the PLC will not, at any instant, read a part of this data while it is still being communicated and written?
You can be certain that any server will not be processing any read or write until the request has been fully transmitted by the client. There is no facility in OPC UA for dealing with streaming/partial requests from clients.
That said, how the server handles that 1 KB of data it just received as a write is not covered by the spec. There is no guarantee that it is written atomically to whatever the backing/underlying data source may be (in memory, shared memory, a file, another device on the network, etc...).
I do believe that most OPC UA servers built into PLCs are probably doing the right thing to ensure atomicity, but there is nothing in the spec that guarantees it, and no way to be sure other than contacting the vendor or consulting its documentation.
The Beckhoff OPC-UA Server communicates with the Beckhoff PLC through the ADS protocol.
The default maximum size of the consistent data sent with the help of the ADS router is 16 kByte, although it can be changed if needed.
This is important to understand, because the OPC-UA Server is not part of the PLC runtime environment.
When an OPC-UA client writes a node of an OPC-UA Server, the OPC-UA Server sends this data to the PLC.
In the TwinCAT development environment you have the option to declare a special attribute for structured types:
{attribute 'OPC.UA.DA.StructuredType' := '1'}
This tells the Beckhoff OPC-UA Server to send the data concerning that specific data structure in a consistent way to the PLC when it receives it from a client.
The Beckhoff OPC-UA documentation states:
"StructuredTypes allow you to read or write structures without interpreting each byte, because the UA Server
returns the information type of each element of the structure. Based on complex functions in modern
OPC UA SDKs, OPC UA Clients can search and interpret this structural information."
Therefore, regarding data consistency, it is also important that your OPC-UA SDK (client) is modern enough to be able to "search and interpret this structural information".
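As an example of what a "modern enough" client looks like in practice, here is a hedged sketch assuming the python-opcua library; the endpoint URL and node id are made up, and the key step is asking the server for its type definitions so structured values are decoded as whole objects:

# Hedged sketch: read a structured value in a single call with python-opcua.
# The endpoint and node id below are placeholders, not real addresses.
from opcua import Client

client = Client("opc.tcp://plc.example.local:4840")
client.connect()
try:
    # Fetch the server's custom DataType definitions so structures are decoded
    # into Python objects instead of raw ExtensionObject bytes.
    client.load_type_definitions()
    node = client.get_node("ns=4;s=GVL.stOrder")   # hypothetical structured node
    value = node.get_value()                        # the whole structure in one read
    print(value)
finally:
    client.disconnect()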
The answer is provided by Flippo & Kevin in the above comments. So this is a recap for everyone to quickly spot:
OPC UA guarantees atomicity only for simple types (Integer, Double Integer, etc.)
The atomicity of larger structures depends on the specific manufacturer.
I'm trying to study high-frequency trading systems. What's the mechanism that HFT systems use to connect to the exchange, and what's the procedure? (Does it have to go through a broker or is it direct access? If it's direct access, what sort of connection information do I require?)
Thanks in advance for your answers.
Understand that there are two different "connections" in an HFT engine. The first is the connection to a market data source. The second is to a clearing resource. As mentioned in kpavlov's answer, a very expensive COLO (co-location) is needed to get as close to the data source/target as possible. Depending on their nominal latency these COLO resources cost thousands of dollars per month.
With both connections, your trading engine must be certified by the provider (ICE, CME, etc) to comply with their requirements. With CME the certification process is automated; with ICE it involves human review. In any case, the certification requires that your software demonstrate conformance to standards and freedom from undesirable network side effects.
You must also subscribe to your data source(s) and clearing service; neither is inexpensive, and pricing varies over a pretty wide range. During the subscription process you'll gain access to the service provider's technical data specification(s) -- a critical part of designing your trading engine. Using old data that you find on the Internet for design purposes is a recipe for problems later. Subscription also gets you access to the providers' test sites. It is on these test sites that you test and debug your engine.
After you think your engine is ready for deployment you begin connecting to the data/clearing production servers. This connection will get you into a place of shadows -- port roulette. Not every port at the provider's network edge has the same latency. Here you'll learn that you can have the shortest latency yet seldom have orders filled first. Traditional load balancing does little to help this, and CME has begun deployment of FPGA-based systems to ensure correct temporal sequencing of inbound orders, but it's still early in its deployment process.
Once you're running you then get to learn that mistakes can be very expensive. If you place an order prior to a market pre-open event, the order is automatically rejected. Do it too often and the clearing provider will charge you a very stiff penalty. Other things can also get you penalized or even kicked off the service if your systems are determined to be implementing strategies to block others from access, etc.
All the major exchanges' web sites have links to public data and educational resources to help you decide whether HFT is "for you" and how to go about it.
It usually requires approval from the exchange to grant access from outside. They protect their servers with firewalls, so your server/network needs to be authorized for access.
A special certification procedure with a technician (by phone) is usually required before they authorize you.
Most liquidity providers use the FIX protocol or custom APIs. You might consider starting to implement your connector with QuickFix, but it may become a bottleneck later, when your traffic grows.
The information you need to connect via FIX is (a sketch of how these fields map onto a session configuration follows the list):
Server IP
Server port
FIX protocol credentials:
SenderCompID
TargetCompID
Username
Password
Other fields
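As a hedged illustration of how those items typically map onto a session configuration when starting with QuickFix, the sketch below just writes out a QuickFIX-style settings file; every host, port and CompID is a placeholder, and the exact keys your counterparty expects may differ:

# Placeholder QuickFIX-style session settings covering the fields listed above.
QUICKFIX_SESSION_CFG = """
[DEFAULT]
ConnectionType=initiator
ReconnectInterval=5
HeartBtInt=30
FileStorePath=store
FileLogPath=log

[SESSION]
BeginString=FIX.4.4
SenderCompID=MYFIRM
TargetCompID=PROVIDER
SocketConnectHost=203.0.113.10
SocketConnectPort=9876
"""

with open("fix_session.cfg", "w") as f:
    f.write(QUICKFIX_SESSION_CFG)

# SocketConnectHost/Port are the "Server IP" and "Server port" from the list above.

Username/Password (and any other provider-specific fields) are usually not plain settings keys; they typically have to be injected into the Logon message (tags 553/554) from your application code, per the provider's FIX rules of engagement.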
We need to send XML messages between a point-of-sale system and a Java web service (outside of our network). The messages contain very sensitive data. The messaging has to be secure, transactional and highly available (24/7) with failover. The solution requires the development of a broker that does the following (sketched in code after the list):
Poll messages from the POS system (3 types of messages)
Do some transformation on the messages
Forward part of the message to the Java web service
Store part of the message in a database
Notify the POS system of the result
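For concreteness, here is a rough sketch of that flow as plain code, independent of whether BizTalk or a custom MSMQ/WCF service ends up implementing it; every function is a trivial stub with a hypothetical name, and the point is only the shape of poll -> transform -> forward -> store -> notify:

# Stub sketch of the broker flow described above; all names are hypothetical.
def poll_pos_messages():
    # would read from the POS system (file drop, queue, database table, ...)
    return [{"id": 1, "type": "sale", "payload": "<sale>...</sale>"}]

def transform(message):
    # would map the POS XML to the web-service schema and the DB schema
    return {"ws_part": message["payload"], "db_part": {"id": message["id"]}}

def forward_to_webservice(fragment):
    return {"status": "accepted", "fragment": fragment}

def store_in_database(row):
    pass

def notify_pos(message_id, result):
    print(f"POS notified: message {message_id} -> {result['status']}")

for message in poll_pos_messages():
    canonical = transform(message)
    result = forward_to_webservice(canonical["ws_part"])
    store_in_database(canonical["db_part"])
    notify_pos(message["id"], result)

The hard part is not this happy path but making each step transactional, retryable and monitorable, which is what the answers below weigh MSMQ/WCF against BizTalk for.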
Based on these somewhat simplified requirements, do you believe that BizTalk would be overkill? Would MSMQ/WCF do the trick here?
Thank you for your help
Amine
IMO if you have the ability to receive and deliver messages asynchronously, then MSMQ (or other Message Oriented Middleware) would be an obvious choice for reliable, transactional transport, irrespective of the rest of the solution. MSMQ's journalling can also be used for audit and debugging purposes (but you will need a strategy for archiving the journal).
For the Polling, Routing, Mapping / Broker and Auditing requirements you then have the choice of BizTalk, other ESB and EAI products, or a DIY solution.
As you've suggested, it is difficult to justify the cost and learning curve of BizTalk on a single message exchange scenario such as this - you could probably knock up a .NET Windows Service (e.g. using WCF, Workflow Foundation, Transaction Scopes, some XSLT for mapping and a data access layer) in a few days.
However, if this isn't a one-off integration scenario and the need for additional integration arises (more applications to integrate, more services, additional listeners, different communications technologies etc), then it would be advisable for your company to take a long term view on EAI and ESB technologies. IMO the main challenge in integration isn't the initial development work, but is instead the ongoing operational management requirements - e.g. security, auditing, failover, monitoring, handling of bad messages and other exceptions - where products such as BizTalk are really worth the outlay.
Do you want to and have the bandwidth to develop, monitor, and maintain your own custom solution? If you don't mind doing that, then going the route of a custom .net-based, MSMQ/WCF solution might work well.
BizTalk will also cover all of the requirements you have listed. There is a learning curve, but it is certainly not insurmountable. The initial ramp-up may be lengthier than it would be for a custom-code solution, but there are considerable benefits, particularly the benefit of having all your requirements reliably met:
secure
transactional
reliable (messages aren't lost)
highly available (24/7)
failover
adapter architecture (includes polling adapters)
transformations
working with external web services
returning correlated responses back to the source system (i.e., orchestrating the end-to-end process)
use a broker (you specifically listed this, and BizTalk is a broker; custom MSMQ and WCF means using no broker)
If BizTalk needs to poll the POS system, then you do not need to worry about using MSMQ. BizTalk can handle transferring messages reliably (they're persisted to SQL Server, while MSMQ persists messages to disk).
Note too that the only way to make MSMQ highly available is to cluster it. So either way you'll need to cluster something.
A BizTalk solution will be easier to maintain over time, particularly if you just want to update your transformations. With versioning you can do so in a way that doesn't require downtime. It'll be tough to update a custom solution without downtime.
Some people have had difficulty in the past with monitoring BizTalk for failed messages, but I have found it to be easier, especially with a tool like SCOM or BizTalk 360, than trying to monitor message queues, which often requires even more custom work to monitor. Just make sure to include monitoring in your cost estimates for the life of your solution.
If you do need auditing, then BizTalk also has you covered. MSMQ Journaling will keep a copy of each message for you, but without significant transaction details and with no out-of-the-box way to search through or archive the data.
Building your own .NET client code to work with a Java web service will likely take a good bit of work regardless of which way you go. With BizTalk that means running a wizard against the endpoint or against the WSDL. With WCF it means doing everything by hand or with the assistance of the svcutil tool.
You should go with MSMQ transport either way.
If you use MSMQ from .NET you should know its limitation: a 4 MB maximum message size.
BizTalk, on the other hand, has an MSMQ adapter which overcomes this limitation (if a second BizTalk server listens on the other side of the channel). On top of that, BizTalk gives you features like easily configurable message tracking and visual transformation maps. It can be set up in a cluster too (Enterprise edition only).
But the question is whether you can (or want to) afford BizTalk licenses and the hardware for its servers (it's slower than a custom .NET solution).
I'm in the planning stages of a .NET service which continually processes incoming messages, which involves various transformations, database inserts and updates, etc. As a whole, the service is huge and complicated, but the individual tasks it performs are small, simple, and well-defined.
For this reason, and in order to allow for easy expansion in the future, I want to split the service into several smaller services which basically perform part of the processing before passing it on to the next service in the chain.
In order to achieve this, I need some kind of intermediary messaging system that will pass messages from one service to another. I want this to happen in such a way that if a link in the chain crashes or is taken offline briefly, the messages will begin to queue up and get processed once the destination comes back online.
I've always used message queuing for this type of thing, but have recently been made aware of SQL Service Broker which appears to do something similar. Is SQLSB a viable alternative for this scenario and, if so, would I see any performance benefits by using that instead of standard Message Queuing?
Thanks
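As a toy illustration of that store-and-forward behaviour (not MSMQ or Service Broker, just the pattern, with SQLite standing in for a durable queue): each link writes to a durable queue, and if the next service is down the rows simply accumulate until it comes back and drains them.

# Toy durable queue between two links in the chain; table name and schema are made up.
import sqlite3

conn = sqlite3.connect("chain.db")
conn.execute("""CREATE TABLE IF NOT EXISTS queue (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    payload TEXT NOT NULL,
                    done INTEGER NOT NULL DEFAULT 0)""")

def enqueue(payload):
    with conn:                                    # transactional insert
        conn.execute("INSERT INTO queue (payload) VALUES (?)", (payload,))

def drain(handler):
    rows = conn.execute("SELECT id, payload FROM queue WHERE done = 0 ORDER BY id").fetchall()
    for row_id, payload in rows:
        handler(payload)                          # hand off to the next service
        with conn:                                # mark processed only after success
            conn.execute("UPDATE queue SET done = 1 WHERE id = ?", (row_id,))

enqueue("message produced while the next service was offline")
drain(lambda p: print("processed:", p))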
It sounds to me like you may be after a service bus architecture. This would provide you with the coordination and fault tolerance you are looking for. I'm most familiar with, and partial to, NServiceBus, but there are others, including Mass Transit and Rhino Service Bus.
If most of these steps initiate from a database state and end up in a database update, then merging your message storage with your data storage makes a lot of sense:
a single product to backup/restore
consistent state backups
a single high-availability/disaster recoverability solution (DB mirroring, clustering, log shipping etc)
database scale storage (IO capabilities, size and capacity limitations etc as per the database product characteristics, not the limits of message store products).
a single product to tune, troubleshoot, administer
In addition there are also serious performance considerations, as having your message store be the same as the data store means you are not required to do a two-phase commit on every message interaction. Using a separate message store requires you to enroll the message store and the data store in a distributed transaction (even if it is on the same machine), which requires a two-phase commit and is much slower than the single-phase commit of database-only transactions.
In addition, using a message store in the database as opposed to an external one has advantages like queryability (you can run SELECT over the message queues).
Now if we translate the abstract terms, with 'message store in the database' being Service Broker and 'non-database message store' being MSMQ, you can see my point about why SSB will run circles around MSMQ any time.
My recent experiences with both approaches (starting with SQL Server Service Broker) led me to a situation in which I am desperate to get my messages out of SQL Server. The problem is quasi-political, but you might want to consider it: SQL Server in my organisation is managed by a specialized DBA, while application servers (i.e. messaging such as NServiceBus) are managed by developers and the network team. Any change to the database servers requires a painful performance analysis from the DBA, and is steeped in the fear that our queuing engine living in the same space might drag down the standard SQL responsibilities.
SSB is pretty difficult to manage (not unlike messaging middleware), but the difference is that I am allowed more room to screw something up in the messaging world (the worst that may happen is a pile of messages building up somewhere and logs filling up), whereas I can't afford any mistakes in the SQL world, where customers' transactional data lives and is vital for the business (including data from legacy systems). I really don't want to get those 'unexpected database growth' or 'wait time alert' or 'why is my temp db growing without end' emails anymore.
I've learned that application servers are cheap. Just add message handlers, add machines... easy. Virtually no license costs. With SQL Server it is exactly the opposite. It now appears to me that using Service Broker for messaging is like using an expensive car to plow a potato field. It is much better suited to other things.