My proposed implementation for Tableau server is a two-server setup with multiple sites, which will allow me to segregate dashboards to different groups of users. Tableau provides all of this ability out of the box, which is good. My question is really about production: how do I ensure that a request for one dashboard does not consume 100% of the server's resources, causing other dashboard requests to be queued?
It is always good to have an example :-)
Imagine I have 3 'sites' defined in my Tableau server, let's give these names:
Sales
Marketing
Purchasing
Tableau server has users created to permit access to dashboards within each site. Within the Sales site is a dashboard that must run a complex query (I know I would refactor and use the facilities of Tableau to speed this up, but this is purely for discussion) which takes a significant amount of time, plus heavy aggregation within Tableau. During this period, how can I ensure that:
Other users within Sales can still access their dashboards?
Marketing and Purchasing are not impacted by the reduced resources on the server?
Does Tableau provide any way of governing the amount of resources on the box that is assigned to each site?
You can limit the number of users that can be added to a site and can define a storage quota for a site, but I don't think there is a way to reserve CPU or network bandwidth for one site at the expense of others. All the users on the server contend for the same resources, regardless of which site defines the view.
If this is a critical factor in practice, you could stand up a second server to hold the high priority visualizations, thus reserving resources for them alone -- at the cost of a second license.
We’re in the process of migrating our aging monolith to a more robust solution and landed on Kubernetes as the most appropriate platform to achieve what we’re looking for. At the same time, we’re looking to split out and isolate our client data for security and improved privacy.
What we’re considering is ultimately having one database per customer, and embedding those connection details into a deployment for each of them. We’d then build a routing service of some kind that would link a client’s request to their respective deployment/service.
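Roughly, the routing service would do something like the following sketch (purely illustrative; the service name, namespace convention, and endpoint are hypothetical, and it assumes each client's deployment is exposed as a Service named `app` in its own namespace and reachable via standard Kubernetes cluster DNS):

```python
# Illustrative sketch of per-tenant routing; all names are hypothetical.
import requests

def backend_url_for(customer_id: str) -> str:
    """Map a customer to its in-cluster service endpoint."""
    # e.g. customer "acme" -> http://app.acme.svc.cluster.local
    return f"http://app.{customer_id}.svc.cluster.local"

def forward(customer_id: str, path: str) -> requests.Response:
    """Proxy a read request to the customer's dedicated deployment."""
    return requests.get(f"{backend_url_for(customer_id)}{path}", timeout=5)

# Example: route a request for customer "acme" to its own backend.
# forward("acme", "/api/orders")
```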
Because our individual clients vary wildly in size (we have some that generate thousands of requests per minute, and others that generate hundreds per day), we like having the ability to scale them independently through ReplicaSets on the deployments.
However, I have some concerns regarding the upper limit of how many deployments can exist and be successfully managed within a cluster, as we'd be looking at potentially hundreds of different clients, and that number will continue to grow. I also have concerns about cost, and how having dedicated resources (essentially an entire VM) for our smaller clients might impact our budget.
So my questions are:
is this a good idea at all? Why or why not, and if not, are there alternative architectures we could look at to achieve the same thing?
is this solution more expensive than it needs to be?
I’d appreciate any insights you could offer, thank you!
I can think of a couple of options for this situation:
Deploying separate clusters for each customer. This also allows you to size your clusters properly for each customer and configure autoscaling accordingly for each of them. The drawback is that each cluster has a management fee of $0.10 per hour, but you get a full guarantee that everything is isolated, and you can use the cluster autoscaler to make sure that only the VMs that are actually needed for each customer are running. For smaller customers, you may want to use small (and cheap) machine types.
Another option would be, as mentioned in the comments, to use namespaces (see the sketch after this list). However, you would have to configure the cluster carefully, as there are ways of accessing services in different namespaces.
Implementing customer isolation in your own software running on a shared cluster. This would imply forcing your software to access only the database for a given customer, but I would not recommend going this route.
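As a rough illustration of the namespace option, here is a minimal sketch using the official Kubernetes Python client that creates a per-customer namespace with a ResourceQuota, so a single tenant's workloads cannot starve the others. The names and limits are made up, and you would still want NetworkPolicies to block cross-namespace traffic:

```python
# Hypothetical sketch: one namespace per customer, capped by a ResourceQuota.
from kubernetes import client, config

def provision_tenant(customer: str) -> None:
    config.load_kube_config()          # or config.load_incluster_config()
    core = client.CoreV1Api()

    ns_name = f"customer-{customer}"
    core.create_namespace(
        body=client.V1Namespace(metadata=client.V1ObjectMeta(name=ns_name))
    )

    # Cap what this tenant's workloads may request in total (example values).
    quota = client.V1ResourceQuota(
        metadata=client.V1ObjectMeta(name="tenant-quota"),
        spec=client.V1ResourceQuotaSpec(
            hard={"requests.cpu": "2", "requests.memory": "4Gi", "pods": "20"}
        ),
    )
    core.create_namespaced_resource_quota(namespace=ns_name, body=quota)

# provision_tenant("acme")
```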
I am designing a solution and would like to double-check whether it follows the microservices architecture.
We have clients, accounts and transactions like a normal bank account.
Clients have basic data like name and address.
Accounts might be for savings or current
Transactions are money transfers between 2 accounts
So I am designing the following way:
1 microservice to manage client data (will manage just clients' basic data and their addresses)
1 microservice to manage account data (will manage basic account data; the client id is part of the account data)
1 microservice to manage money data (will hold the accounts' balances and all transfers)
Please let me know whether this follows the microservices architecture, or if you have a different understanding.
As per my understanding, the main goal of a microservices architecture is to support faster and continuous releases of different parts of a big system without waiting on each other. There are two approaches to designing a new system: microservices first and microservices later. In the first approach, the system is designed from the ground up to follow a microservices architecture; it is broken down into services from the start, and the services talk to each other, typically over HTTP and REST. In the other approach, the system is initially not built as microservices and all the modules live in a single application. Both of these approaches have their pros and cons, which is a separate discussion.
In your case, you are taking the first approach, designing the new system with separate services for each functionality. I am not an expert in the banking domain, but from what I understand, the client (customer) system can definitely be a separate service, responsible for maintaining customer master data. The account service can be responsible for maintaining account data and serving out account-related information. However, account balance is an integral property of an account, and it should always be associated with an account. Finally, transfers can be handled by a separate service that records the transfers between accounts. Whenever there is a transfer, it can query the accounts for their current balance, and if the transfer is a valid one, it can record the transfer.
However, as this involves financial transactions, you would have to make sure that each transaction follows the ACID rules. Maintaining ACID properties across distributed systems is tricky, and there are several ways to mitigate this, such as only having ACID transactions in the most critical areas and having eventual consistency elsewhere. For example, banks do not immediately show all transactions to the customer as "completed"; instead they show a message saying "pending to be processed" (eventual consistency) so that the customer is aware of the exact status.
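As a purely illustrative sketch of that "pending, then completed" idea (the service interfaces and function names here are made up, not a prescribed design):

```python
# Illustrative only: record a transfer as "pending" first, settle it later,
# giving eventual consistency rather than one distributed ACID transaction.
import uuid

def request_transfer(transfers_db, accounts_api, src: str, dst: str, amount: float) -> str:
    """Validate against current balances, then record the transfer as pending."""
    if accounts_api.get_balance(src) < amount:
        raise ValueError("insufficient funds")
    transfer_id = str(uuid.uuid4())
    transfers_db.insert({"id": transfer_id, "from": src, "to": dst,
                         "amount": amount, "status": "pending"})
    return transfer_id          # the customer sees "pending to be processed"

def settle_transfer(transfers_db, accounts_api, transfer_id: str) -> None:
    """Run later (e.g. by a worker): apply the debit/credit, then mark completed."""
    t = transfers_db.get(transfer_id)
    accounts_api.debit(t["from"], t["amount"])
    accounts_api.credit(t["to"], t["amount"])
    transfers_db.update(transfer_id, status="completed")
```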
Imagine I wish to prototype a Bluemix application or simply wish to learn Bluemix. I understand that many of its services have free thresholds before any charges are accrued. Is there a way to set thresholds on my Bluemix account such that I am warned before exceeding free limits? Can I constrain my account such that it will disable services before charges are accrued or otherwise automate the mechanical constraint of my Bluemix utilization?
A potential example of such a need might be for the hobbyist who is self studying but does not want to incur charges or for the programmer who makes a logic error that results in excessive resource consumption or for the user who accidentally neglects to shutdown a resource consuming application after testing.
There are spending limits for paid accounts. According to the documentation, "you can set or edit notifications for total account, runtime, and service spending, as well as spending for individual services, excluding third-party services. You receive notifications when you reach 80%, 90%, and 100% of the spending thresholds you specify."
I'm trying to study high-frequency trading systems. What is the mechanism that HFT systems use to connect to the exchange, and what is the procedure? (Does it have to go through a broker, or is it direct access? If it's direct access, what sort of connection information do I require?)
Thanks in advance for your answers.
Understand that there are two different "connections" in an HFT engine. The first is the connection to a market data source. The second is to a clearing resource. As mentioned in kpavlov's answer, a very expensive COLO (co-location) is needed to get as close to the data source/target as possible. Depending on their nominal latency these COLO resources cost thousands of dollars per month.
With both connections, your trading engine must be certified by the provider (ICE, CME, etc) to comply with their requirements. With CME the certification process is automated, with ICE it employs human review. In any case, the certification requires that your software demonstrate conformance to standards and freedom from undesirable network side effects.
You must also subscribe to your data source(s) and clearing service; neither is inexpensive, and pricing varies over a pretty wide range. During the subscription process you'll gain access to the service provider's technical data specification(s) -- a critical part of designing your trading engine. Using old data that you find on the Internet for design purposes is a recipe for problems later. Subscription also gets you access to the providers' test sites. It is on these test sites that you test and debug your engine.
After you think your engine is ready for deployment you begin connecting to the data/clearing production servers. This connection will get you into a place of shadows -- port roulette. Not every port at the provider's network edge has the same latency. Here you'll learn that you can have the shortest latency yet seldom have orders filled first. Traditional load balancing does little to help this, and CME has begun deployment of FPGA-based systems to ensure correct temporal sequencing of inbound orders, but it's still early in its deployment process.
Once you're running you then get to learn that mistakes can be very expensive. If you place an order prior to a market pre-open event, the order is automatically rejected. Do it too often and the clearing provider will charge you a very stiff penalty. Other things can also get you penalized or even kicked off the service if your systems are determined to be implementing strategies to block others from access, etc.
All the major exchanges' web sites have links to public data and educational resources to help you decide whether HFT is "for you" and how to go about it.
It usually requires approval from the exchange to grant access from outside. They protect their servers with firewalls, so your server/network needs to be authorized for access.
A special certification procedure with a technician (by phone) is usually required before they authorize you.
Most liquidity providers use the FIX protocol or custom APIs. You might consider starting your connector implementation with QuickFIX, but it may become a bottleneck later when your traffic grows.
The information you need to connect over FIX is (a minimal configuration sketch follows this list):
Server IP
Server port
FIX protocol credentials:
SenderCompID
TargetCompID
Username
Password
Other fields
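To make that list concrete, here is a minimal sketch using the Python quickfix bindings. Every host, port, and CompID below is a placeholder, and your provider's FIX specification dictates the real values; Username/Password (tags 553/554) are typically stamped onto the Logon message in toAdmin rather than stored in the config:

```python
# Minimal QuickFIX initiator sketch; all connection details are placeholders.
import quickfix as fix

SESSION_CFG = """
[DEFAULT]
ConnectionType=initiator
ReconnectInterval=5
HeartBtInt=30
FileStorePath=store
FileLogPath=log
UseDataDictionary=N

[SESSION]
BeginString=FIX.4.4
SenderCompID=MYFIRM
TargetCompID=PROVIDER
SocketConnectHost=203.0.113.10
SocketConnectPort=9876
StartTime=00:00:00
EndTime=23:59:59
"""

class App(fix.Application):
    # Username (tag 553) / Password (554) would normally be set on the
    # Logon message inside toAdmin(), per the provider's FIX spec.
    def onCreate(self, sessionID): pass
    def onLogon(self, sessionID): print("logged on:", sessionID)
    def onLogout(self, sessionID): print("logged out:", sessionID)
    def toAdmin(self, message, sessionID): pass
    def fromAdmin(self, message, sessionID): pass
    def toApp(self, message, sessionID): pass
    def fromApp(self, message, sessionID): print("received:", message)

with open("fix_session.cfg", "w") as f:
    f.write(SESSION_CFG)

settings = fix.SessionSettings("fix_session.cfg")
app = App()
initiator = fix.SocketInitiator(app, fix.FileStoreFactory(settings),
                                settings, fix.FileLogFactory(settings))
initiator.start()   # connect and log on; call initiator.stop() to shut down
```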
I have to perform a load test of 1000 concurrent users' activity against REST web services. I have selected the NeoLoad tool for it. Is it required to have a license for 1000 virtual users to simulate 1000 users, or can I simulate 1000 users' activity from one virtual user?
Load testing tools (like NeoLoad) price their licenses based on concurrent virtual users. So, if you have a license for a single user, you can create many requests, but only one at a time. To generate load from 1000 concurrent users, you'll need a 1000-user license.
To simulate 1000 concurrent users you really need 1000 VUs, because if you use only one you would be:
getting artificially good response times due to caching effects
getting artificially bad response times due to contention on writes
probably not reproducing the production behaviour of the server regarding memory usage, caching, etc.
But note that for HTTP/REST load testing you could go with a number of free, production-ready open-source tools like Apache JMeter, Locust, etc. (a minimal Locust sketch follows the links below):
https://www.blazemeter.com/blog/rest-api-testing-how-to-do-it-right
https://www.redline13.com/blog/2018/05/test-rest-apis-authentication-using-jmeter/
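For example, a minimal Locust script for a REST API might look like this (the endpoint paths are placeholders for your own API):

```python
# loadtest.py - minimal Locust sketch for hitting a REST endpoint.
from locust import HttpUser, task, between

class ApiUser(HttpUser):
    wait_time = between(1, 3)   # think time between requests, in seconds

    @task
    def get_items(self):
        self.client.get("/api/items")    # hypothetical endpoint

    @task
    def create_item(self):
        self.client.post("/api/items", json={"name": "test"})
```

You could then run it headless with something like `locust -f loadtest.py --headless --users 1000 --spawn-rate 50 --host https://your-api.example.com`, with the host and numbers adjusted to your target and test plan.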