How to manage bandwidth per user in Squid

I want to manage bandwidth and traffic on a Squid proxy server based on user activity.
I did some research but couldn't find the solution I want.
For example, users who generate more than 256K of traffic should be restricted by the server.
Can you help me?
Thanks

I'm assuming Squid 3.x.
Delay pools provide a way to limit the bandwidth of certain requests based on any list of criteria.
class:
the class of a delay pool determines how the delay is applied, i.e., whether the different client IPs are treated separately or as a group (or both)
class 1:
a class 1 delay pool contains a single unified bucket which is used for all requests from hosts subject to the pool
class 2:
a class 2 delay pool contains one unified bucket and 255 buckets, one for each host on an 8-bit network (IPv4 class C)
class 3:
contains 255 buckets for the subnets in a 16-bit network, and individual buckets for every host on these networks (IPv4 class B )
class 4:
as class 3, but in addition has per-authenticated-user buckets, one per user.
class 5:
custom class based on tag values returned by external_acl_type helpers in http_access. One bucket per used tag value.
Delay pools allow you to limit traffic for clients or client groups, with various features:
You can specify peer hosts that aren't affected by delay pools, i.e., local peering or other 'free' traffic (with the no-delay peer option).
Delay behavior is selected by ACLs (low- and high-priority traffic, staff vs. students, student vs. authenticated student, and so on).
Each group of users has a number of buckets. A bucket has an amount coming into it each second and a maximum amount it can grow to; when it reaches zero, object reads are deferred until one of the object's clients has some traffic allowance.
Any number of pools can be configured with a given class, and any set of limits within the pools can be disabled; for example, you might only want to use the aggregate and per-host bucket groups of class 3, not the per-network one.
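To make the bucket behavior described above concrete, here is a minimal Python sketch of a single bucket with a restore rate and a maximum size. It is only an illustration of the idea, not Squid's implementation: the class name DelayBucket and its methods are hypothetical, and the bucket is assumed to start full (Squid's actual starting level is configurable).

```python
class DelayBucket:
    """Toy model of one delay-pool bucket (hypothetical, not Squid code)."""

    def __init__(self, restore, maximum):
        self.restore = restore    # bytes added to the bucket per second
        self.maximum = maximum    # largest number of bytes the bucket can hold
        self.level = maximum      # assumption: the bucket starts full

    def tick(self, seconds=1):
        # Refill the bucket, never exceeding its maximum size.
        self.level = min(self.maximum, self.level + self.restore * seconds)

    def take(self, wanted):
        # Grant as many bytes as the bucket currently allows.
        # A return value of 0 means the read has to be deferred.
        granted = max(0, min(wanted, self.level))
        self.level -= granted
        return granted


bucket = DelayBucket(restore=16000, maximum=16000)
first = bucket.take(20000)    # only the bytes in the bucket are granted
second = bucket.take(100)     # bucket is empty, nothing is granted
bucket.tick()                 # one second later the bucket has refilled
third = bucket.take(5000)
```

With a restore rate equal to the maximum, a client can never move more than that many bytes in any one-second window, which is how a restore/max pair turns into a bandwidth cap.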
In your case, you can use a class 4 delay pool, whose general form is:
delay_class pool 4
delay_parameters pool aggregate network individual user
A concrete class 4 pool can be configured on your Squid proxy server as follows.
For example, each user will be limited to 128 Kbit/s no matter how many workstations they are logged into:
# one pool, of class 4 (the class that adds per-user buckets)
delay_pools 1
delay_class 1 4
delay_access 1 allow all
# aggregate, network, individual (per host), user; each is restore (bytes/sec) / max (bytes)
delay_parameters 1 32000/32000 8000/8000 600/64000 16000/16000
Please read more:
http://wiki.squid-cache.org/Features/DelayPools
http://www.squid-cache.org/Doc/config/delay_parameters/

Related

Limiting the number of times an endpoint of Kubernetes pod can be accessed?

I have a machine learning model inside a Docker image. I pushed the Docker image to Google Container Registry and then deployed it inside a Kubernetes pod. There is a FastAPI application that runs on port 8000, and this FastAPI endpoint is public
(call it mymodel:8000).
The structure of the FastAPI app is:
from fastapi import FastAPI, Form

app = FastAPI()

@app.get("/homepage")
async def get_homepage(): ...
@app.get("/model")
async def get_modelpage(): ...
@app.post("/model")
async def get_results(query: str = Form(...)): ...
Users can enter a query, submit it, and get results from the machine learning model running inside the container. I want to limit the number of times a query can be made by all the users combined. So if the query limit is 100, all the users combined can make only 100 queries in total.
I thought of a way to do this:
Keep a database that records the number of times the GET and POST methods have been called. As soon as the total number of POST calls crosses the limit, stop accepting any more queries.
Is there an alternative way of doing this using Kubernetes limits? For example, could I define a limit_api_calls such that the total number of times mymodel:8000 is accessed is at most limit_api_calls?
I looked at the documentation and could only find settings for CPU limits, memory limits, and rateLimits.
There are several approaches that could satisfy your needs.
Custom implementation: As you mentioned, keep in a persistence layer the number of API calls received and deny requests after it has been reached.
Use a service mesh: Istio (for instance) will let you limit the number of requests received and act as a circuit breaker.
Use an external API manager: Apigee will also let you limit and even charge your users. However, if it is only for internal use (not pay-per-use), I definitely wouldn't recommend it.
The tricky part is what you want to happen after the limit has been reached. If it is just a pod, you may simply have the application exit so that the pod finishes and is cleared.
Otherwise, if you have a deployment with its replica set and several resources associated with it (like ConfigMaps), you probably want to use some kind of asynchronous alert or polling check to clean up everything related to your deployment. You may want to take a close look at orchestrators like Airflow (Composer) and use tools such as Helm to keep deployments easy.
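As a rough illustration of the custom-implementation option, the sketch below keeps the global call count in a persistence layer and refuses calls once the limit is reached. It uses SQLite from the Python standard library purely for the sake of a self-contained example; the class name QuotaStore, the table layout, and the limit value are all assumptions, and with several pods you would need a store that every replica shares.

```python
import sqlite3


class QuotaStore:
    """Hypothetical global call counter backed by SQLite."""

    def __init__(self, path=":memory:", limit=100):
        self.limit = limit
        # isolation_level=None puts the connection in autocommit mode.
        self.conn = sqlite3.connect(path, isolation_level=None)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS quota ("
            "id INTEGER PRIMARY KEY CHECK (id = 1), used INTEGER NOT NULL)"
        )
        self.conn.execute("INSERT OR IGNORE INTO quota (id, used) VALUES (1, 0)")

    def try_consume(self):
        # A single conditional UPDATE checks the limit and increments the
        # counter in one statement; it changes no row once the limit is hit.
        cur = self.conn.execute(
            "UPDATE quota SET used = used + 1 WHERE id = 1 AND used < ?",
            (self.limit,),
        )
        return cur.rowcount == 1

    def used(self):
        return self.conn.execute("SELECT used FROM quota WHERE id = 1").fetchone()[0]


store = QuotaStore(limit=3)
results = [store.try_consume() for _ in range(5)]
```

In the FastAPI app from the question, the POST handler could call try_consume() first and return an error response (for example HTTP 429) when it returns False.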

Akka Source, is there a way to throttle based on a global rate limit coming from an api call?

There is the throttle function on Source https://doc.akka.io/docs/akka/current/stream/operators/Source-or-Flow/throttle.html but this only works in a local context (1 server). If I wanted to share a rate limit (for 3rd party api calls) with other servers (say I have 2 servers instead of 1 for redundancy), I'd like the rate limit to efficiently be spread across the 2 servers (if one server dies from out of memory, the other server should pick up the freed up rate limit until the dead server restarts).
Is this possible somehow through akka's Source assuming I have something like Redis returning whether an action is allowed or disallowed + what the time until an action will be allowed?
Off the top of my head, you can dispense with Redis and use Akka Cluster to deal with failure detection: set up an actor to subscribe to the cluster events (member joined, member left/downed) and update the local throttle.
Local dynamic throttling can be implemented via a custom graph stage (materializing as a handle through which to change the throttle), or you can also do that via an actor (in which case an ask stage is nice). In the latter case, you can go further and have the throttling actors coordinate among themselves to reallocate unused request capacity between nodes.
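The idea of a local throttle whose rate is updated when cluster membership changes can be sketched independently of Akka. The following Python snippet is only a language-neutral illustration, not Akka code: AdjustableThrottle and local_share are hypothetical names, the even split of the global rate across live nodes is an assumption, and time is passed in explicitly so the behavior is easy to follow.

```python
class AdjustableThrottle:
    """Token bucket whose refill rate can be changed at runtime (hypothetical)."""

    def __init__(self, rate_per_sec, burst, now=0.0):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)
        self.last = now

    def _refill(self, now):
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def set_rate(self, rate_per_sec, now):
        # Settle tokens earned at the old rate before switching to the new one.
        self._refill(now)
        self.rate = rate_per_sec

    def try_acquire(self, now):
        self._refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def local_share(global_rate, live_nodes):
    # Assumption: the global limit is split evenly across live nodes.
    return global_rate / max(1, live_nodes)


# Two nodes share a global limit of 4 requests per second.
throttle = AdjustableThrottle(local_share(4, 2), burst=1, now=0.0)
a = throttle.try_acquire(0.0)     # allowed, uses the initial token
b = throttle.try_acquire(0.25)    # denied, only half a token has been earned
# The other node is reported down, so this node takes over the full rate.
throttle.set_rate(local_share(4, 1), now=0.25)
c = throttle.try_acquire(0.375)   # allowed sooner thanks to the higher rate
```

In an Akka setup, the call to set_rate would be triggered by the actor that subscribes to cluster membership events.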

How can I reach a specific replica of a stateless service

I've created a stateless service within Service Fabric. It has a SingletonPartition, but multiple instances (InstanceCount is -1 in my case).
I want to communicate with a specific replica of this service. To find all replicas I use:
var fabricClient = new FabricClient();
var serviceUri = new Uri(SERVICENAME);
Partition partition = (await fabricClient.QueryManager.GetPartitionListAsync(serviceUri)).First();
foreach (Replica replica in await fabricClient.QueryManager.GetReplicaListAsync(partition.PartitionInformation.Id))
{
    // communicate with this replica, but how to construct the proxy?
    //var eventHandlerServiceClient = ServiceProxy.Create<IService>(new Uri(replica.ReplicaAddress));
}
The problem is that there is no overload of the ServiceProxy to create one to the replica. Is there another way to communicate with a specific replica?
Edit
The scenario we are building is the following. We have different moving parts with counter information: 1 named partitioned stateful service (with a couple of hundred partitions), 1 int64 partitioned stateful service, and 1 actor with state. To aggregate the counter information, we need to reach out to all service-partitions and actor-instances.
We could of course reverse it and let everyone send their counts to a single (partitioned) service. But that would add a network call in the normal flow (and thus overhead).
Instead, we came up with the following. The mentioned services and actors are combined into one executable and one service manifest, so they run in the same process. We add a stateless service with instance count -1 alongside the mentioned services and actors. All counter information is stored inside a static variable, and the stateless service can read this counter information.
Now, we only need to reach out to the stateless service (which has an upper limit of the number of nodes).
Just to get some terminology out of the way first, "replica" only applies to stateful services where you have a unique replica set for each partition of a service and replicate state between them for HA. Stateless services just have instances, all of which are equal and identical.
Now to answer your actual question: ServiceProxy doesn't have an option to connect to a specific instance of a deployed stateless service. You have the following options:
Primary replica: connect to the primary replica of a stateful service partition.
Random instance: connect to a random instance of a stateless service.
Random replica: connect to a random replica - regardless of its role - of a stateful service partition.
Random secondary replica: connect to a random secondary replica of a stateful service partition.
E.g.:
ServiceProxy.Create<IMyService>(serviceUri, partitionKey, TargetReplicaSelector.RandomInstance)
So why no option to connect to a specific stateless service instance?
Well, I would turn this question around and ask why would you want to connect to a specific stateless service instance? By definition, each stateless instance should be identical. If you are keeping some state in there - like user sessions - then now you're stateful and should use stateful services.
You might think of intelligently deciding which instance to connect to for load balancing, but again since it's stateless, no instance should be doing more work than any other as long as requests are distributed evenly. And for that, Service Proxy has the random distribution option.
With that in mind, if you still have some reason to seek out specific stateless service instances, you can always use a different communication stack - like HTTP - and do whatever you want.
"Well, I would turn this question around and ask why would you want to connect to a specific stateless service instance?"
One example would be if you have multiple (3x) stateless service instances, all having WebSocket connections to different clients, let's say 500 each, and you want to notify all 1500 (500x3) users of the same message. If it were possible to connect directly to a specific instance (which I would expect to be possible, since I can query for those instances using the FabricClient), I could send a message to each instance, which would forward it to all of its connected clients.
Instead we have to come up with one of several workarounds:
Have all instances connect to some evented system that allows them to trigger on an incoming message, e.g. Azure Event Hubs, Azure Service Bus, RedisCache.
Host an additional endpoint, as mentioned here, which makes it 3 endpoints per service instance: WCF, WebSocket, HTTP.
Change to a stateful partitioned service which doesn't hold any state or any replicas, but simply allows calling partitions.
We are currently having some serious issues with RedisCache and are migrating away from it, and we would like to avoid external dependencies such as Event Hubs and Service Bus just for this scenario.
We send many messages each second, so calling over HTTP adds overhead, and each request then needs to transition over to the WebSocket context.
In order to target a specific instance of a stateless service you can use named partitions. You can have a single instance per partition and use multiple named partitions. For example, you can have 5 named partitions [0,1,2,3,4], each of which will have only one instance of the "service". Then you can call it like this:
ServiceProxy.Create<IMyService>(serviceUri, partitionKey, TargetReplicaSelector.RandomInstance)
where the partitionKey parameter will have one of the values [0,1,2,3,4].
A real example would be:
_proxyFactory.CreateServiceProxy<IMyService>(
_myServiceUri,
new ServicePartitionKey("0"), // One of "0,1,2,3,4"
TargetReplicaSelector.Default,
MyServiceEndpoints.ServiceV1);
This way you can choose one of the 5 instances. But all 5 instances may not always be available, for example during startup, or when the service dies and SF is recreating it, or while it is in the InBuild stage. For this reason you should run partition discovery.

Group Priority on a Subset of Nodes

I am using a recent build of Torque/Maui (w/ PBS) to schedule jobs on a cluster with heterogeneous hardware. The hardware consists of two sets of 10 nodes, and I would like two groups to each have elevated priority on one of the sets of nodes. For example:
Node set A of 10 nodes has elevated priority for User Group 1
Node set B of 10 nodes has elevated priority for User Group 2
I am familiar with how this is accomplished for all nodes, which is documented here:
http://docs.adaptivecomputing.com/maui/5.1.3priorityusage.php
However, I am unfamiliar with the best strategy to set this type of priority on a subset of the cluster. From what I can ascertain from the Maui docs, it may be done using node sets or partitions, but I am unsure whether either of these is correct or whether there is another strategy altogether.
Edit: I would prefer to have a single queue as it simplifies usability and would enable a user to potentially use the entire cluster, albeit with differing priority on node set A and B.
Thanks in advance for the help.
The way I understand the question, you've confused node allocation with job priority. Job priority determines how much more quickly Maui will run a job, as it accrues priority in the priority reservation queue. This will determine how soon a job can run, within the constraints placed on the job, relative to all other jobs in the eligible/idle queue.
That's separate from where Maui decides to place (schedule) jobs. The most natural way to handle this type of use case is with standing reservations. You can create reservations over each set of nodes (via host list, feature, or partition), and then give both groups (or everyone) access to both reservations, but apply negative affinity to everyone outside the group with preferential access.
Example:
SRCFG[rsvA] NODEFEATURES=setA
SRCFG[rsvA] GROUPLIST=group1,ALL-
SRCFG[rsvA] HOSTLIST=ALL
SRCFG[rsvB] NODEFEATURES=setB
SRCFG[rsvB] GROUPLIST=group2,ALL-
SRCFG[rsvB] HOSTLIST=ALL
With this configuration, Maui will create reservation rsvA to include only the nodes with the "setA" property/feature, and jobs from group1 will gravitate (i.e., have positive affinity) to the nodes in that reservation. Likewise, jobs from users in group2 will flow to the nodes in rsvB, with the "setB" property (as defined in the nodes file, or on NODECFG lines in the maui.cfg). This configuration works fine with a single queue, and is essentially user-transparent.

Is there a way to randomly assign routes or roles for a defined number of actors in Akka?

Suppose I want to implement a cluster system where some actors will be request dispatchers and others will be standard nodes. How can I randomly assign a specific route or even role to a predefined number of actors (the hostname and port do not matter)?
Explaining better:
Suppose I have these nodes:
1 - akka.tcp://ClusterSystem@192.168.0.1:2551/user/clusterListener
2 - akka.tcp://ClusterSystem@192.168.0.2:2552/user/clusterListener
3 - akka.tcp://ClusterSystem@192.168.0.3:2553/user/clusterListener
4 - akka.tcp://ClusterSystem@192.168.0.4:2554/user/clusterListener
Now I want 2 of them to have the sub route "dispatcher" (akka.tcp://ClusterSystem@xxx.xxx.xxx.xxx:xxxx/user/clusterListener/dispatcher)
You can use http://doc.akka.io/docs/akka/2.3.0/contrib/cluster-singleton.html for coordination.
Every actor without a role may send a "GetRole" message to the singleton, which will pick a role randomly (using some internal RoleMap). Note that you should listen for the memberDown message in the singleton to free a role when the node that obtained it has been removed.
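The bookkeeping such a singleton would need can be sketched as follows. This is plain Python rather than Akka code, and RoleRegistry, get_role, and member_down are hypothetical names standing in for the "RoleMap", the "GetRole" message, and the member-down handling mentioned above.

```python
import random


class RoleRegistry:
    """Hypothetical role map kept by a cluster singleton."""

    def __init__(self, roles, rng=None):
        self.rng = rng or random.Random()
        self.pool = list(roles)        # roles that are still unassigned
        self.rng.shuffle(self.pool)    # random order gives random assignment
        self.assigned = {}             # node address -> role

    def get_role(self, address):
        # Handles a "GetRole" request; a node keeps its role once assigned.
        if address not in self.assigned:
            if not self.pool:
                return None            # no role left to hand out
            self.assigned[address] = self.pool.pop()
        return self.assigned[address]

    def member_down(self, address):
        # Frees the role of a removed node so that another node can take it.
        role = self.assigned.pop(address, None)
        if role is not None:
            self.pool.append(role)
            self.rng.shuffle(self.pool)


registry = RoleRegistry(["dispatcher"] * 2 + ["node"] * 2)
roles = [registry.get_role("node-%d" % i) for i in range(4)]
```

Whatever the random order, exactly two of the four nodes end up as dispatchers, and a role only becomes available again after member_down is called for the node that held it.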