Maximum number of service accounts in Kubernetes

Is there a hard limit on the number of service accounts that can be created in Kubernetes?
I couldn't find any documentation where this is stated.

It depends on the storage behind the service account registry, as implemented in kubernetes/kubernetes/pkg/registry/core/serviceaccount:

// storage puts strong typing around storage calls
type storage struct {
    rest.StandardStorage
}

// NewRegistry returns a new Registry interface for the given Storage. Any mismatched
// types will panic.
func NewRegistry(s rest.StandardStorage) Registry {
    return &storage{s}
}

That storage wraps a REST store backed by, for instance, etcd:

func newStorage(t *testing.T) (*REST, *etcdtesting.EtcdTestServer) {
    etcdStorage, server := registrytest.NewEtcdStorage(t, "")

So the limit comes from etcd itself: not so much the number of entries, but rather the total storage size.
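In practice that ceiling is set by etcd's backend quota (2 GiB by default, tunable with etcd's --quota-backend-bytes flag) rather than by any per-object count. If you want to see how many service accounts a cluster currently holds, here is a minimal sketch using the fabric8 Kubernetes Java client (the choice of client library is my assumption, not something from the question):

import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

public class CountServiceAccounts {
    public static void main(String[] args) {
        // Uses the kubeconfig / in-cluster config resolved by the builder.
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            int count = client.serviceAccounts().inAnyNamespace().list().getItems().size();
            System.out.println("Service accounts in the cluster: " + count);
        }
    }
}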

Related

How are dead nodes handled in AWS OpenSearch?

Trying to understand the right approach to connect to AWS OpenSearch (single cluster, multiple data nodes).
To my understanding, as long as the data nodes sit behind a load balancer (according to this and other AWS docs: https://aws.amazon.com/blogs/database/set-access-control-for-amazon-elasticsearch-service/), we cannot use:
var pool = new StaticConnectionPool(nodes);
and we probably should not use CloudConnectionPool either, as it was originally dedicated to Elastic Cloud and was left in the OpenSearch client by mistake?
Hence we use SingleNodeConnectionPool and it works, but I've noticed several exceptions indicating that the node had DeadUntil set to a date one hour in the future. Is that expected behavior, given that from the client's perspective this is the only node it knows about?
What is the correct way to connect to an AWS OpenSearch domain that has multiple nodes, and should I be concerned about the DeadUntil property?

Querying Remote State Stores in Kubernetes (Interactive Queries)

Are there any recommendations on querying remote state stores between application instances that are deployed in Kubernetes? Our application instances are deployed with 2 or more replicas.
Based on the documentation (https://kafka.apache.org/10/documentation/streams/developer-guide/interactive-queries.html#id7):

streams.allMetadataForStore("word-count")
    .stream()
    .map(streamsMetadata -> {
        // Construct the (fictitious) full endpoint URL to query the current remote application instance
        String url = "http://" + streamsMetadata.host() + ":" + streamsMetadata.port() + "/word-count/alice";
        // Read and return the count for 'alice', if any.
        return http.getLong(url);
    })
    .filter(s -> s != null)
    .findFirst();
will streamsMetadata.host() return the pod IP? And if it does, will the call from this pod to another be allowed? Is this the correct approach?
streamsMetadata.host()
This method returns whatever you configured via the application.server configuration parameter. That is, each application instance (in your case, each pod) must set this config to advertise how it is reachable (e.g., its IP and port). Kafka Streams distributes this information to all application instances for you.
You also need to configure your pods to allow sending and receiving query requests via the specified port. This part is additional code you need to write yourself, i.e., some kind of "query routing layer". Kafka Streams only has built-in support for querying local state and for distributing the metadata about which state is hosted where; there is no built-in remote query support.
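For example, a minimal sketch of how each instance might set application.server on Kubernetes, assuming the pod IP is injected through the downward API as a POD_IP environment variable and the query endpoint listens on port 8080 (both of those names are assumptions, not something from the question):

import java.util.Properties;

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;

public class WordCountApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "word-count-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");
        // Advertise how this instance is reachable by the other instances;
        // POD_IP is assumed to be set via the downward API (fieldRef: status.podIP).
        props.put(StreamsConfig.APPLICATION_SERVER_CONFIG, System.getenv("POD_IP") + ":8080");

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("words"); // placeholder source topic; replace with your real word-count topology
        Topology topology = builder.build();

        KafkaStreams streams = new KafkaStreams(topology, props);
        streams.start();
    }
}

With plain pod IPs this works as long as your network policies allow pod-to-pod traffic on that port; a headless service is another common way to let the instances reach each other.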
An example implementation (WordCountInteractiveQueries) of a query routing layer can be found on Github: https://github.com/confluentinc/kafka-streams-examples
I would also recommend checking out the docs and the blog post:
https://docs.confluent.io/current/streams/developer-guide/interactive-queries.html
https://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/

Is it possible to copyObject from one Cloud Object Storage instance to another when the buckets are in different regions?

I would like to use the Node SDK to implement a backup and restore mechanism between two instances of Cloud Object Storage. I have added a service ID to the instances and granted that service ID permissions to access the buckets in the instance I want to write to. The buckets will be in different regions. I have tried a variety of endpoints, legacy and non-legacy, private and public, but I usually get Access Denied.
Is what I am trying to do possible with the SDK? If so, can someone point me in the right direction?
var config = {
    "apiKeyId": "xxxxxxxxxxxxxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxx",
    "endpoint": "s3.eu-gb.objectstorage.softlayer.net",
    "iam_apikey_description": "Auto generated apikey during resource-key operation for Instance - crn:v1:bluemix:public:cloud-object-storage:global:a/xxxxxxxxxxx:xxxxxxxxxxx::",
    "iam_apikey_name": "auto-generated-apikey-xxxxxxxxxxxxxxxxxxxxxx",
    "iam_role_crn": "crn:v1:bluemix:public:iam::::serviceRole:Writer",
    "iam_serviceid_crn": "crn:v1:bluemix:public:iam-identity::a/0xxxxxxxxxxxxxxxxxxxx::serviceid:ServiceIdxxxxxxxxxxxxxxxxxxxxxx",
    "serviceInstanceId": "crn:v1:bluemix:public:cloud-object-storage:global:a/xxxxxxxxxxxxxxxxxxx:xxxxxxxxxxxxxxxxxxxxxxxxxx::",
    "ibmAuthEndpoint": "iam.cloud.ibm.com/oidc/token"
};
This should work, as long as you are able to grant the requesting user access to read the source of the put-copy, and as long as you are not using Key Protect managed keys.
So the breakdown here is a bit confusing due to some unintuitive terminology.
A service instance is a collection of buckets. The primary reason for having multiple instances of COS is to have more granularity in your billing, as you'll get a separate line item for each instance. The term is a bit misleading, however, because COS is a true multi-tenant system - you aren't actually provisioning an instance of COS, you're provisioning a sort of sub-account within the existing system.
A bucket is used to segment your data into different storage locations or storage classes. Other behavior, like CORS, archiving, or retention, acts at the bucket level as well. You don't want to segment something that you expect to scale (like customer data) across separate buckets, as there's a limit of roughly 1,000 buckets per instance. IBM Cloud IAM treats buckets as 'resources', and they are subject to IAM policies.
Instead, data that doesn't need to be segregated by location or class, and that you expect to be subject to the same CORS, lifecycle, retention, or IAM policies, can be separated by prefix. This means a bunch of similar objects share a path: foo/bar and foo/bas have the same prefix, foo/. This helps with listing and organization but doesn't provide granular access control or any other policy-like functionality.
Now, to your question, the answer is both yes and no. If the buckets are in the same instance, there's no problem. Bucket names are unique, so as long as there isn't any secondary managed encryption (e.g. Key Protect), there's no problem copying across buckets, even if they span regions. Keep in mind, however, that large objects take time to copy, and COS's strong consistency means the operation may not return a response until the copy has completed. Copying across instances is not currently supported.
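To make the put-copy concrete: within one instance it is a single server-side copyObject call against the S3-compatible API. The question uses the Node SDK; purely as an illustration, here is a rough Java sketch against the S3-compatible endpoint, with the HMAC credentials, endpoint, bucket names, and keys all made up:

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class CrossBucketCopy {
    public static void main(String[] args) {
        // Placeholder HMAC credentials and endpoint; adjust for your region.
        AmazonS3 cos = AmazonS3ClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(
                        new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY")))
                .withEndpointConfiguration(
                        new EndpointConfiguration("s3.eu-gb.cloud-object-storage.appdomain.cloud", "eu-gb"))
                .build();

        // Server-side put-copy: the object data never leaves COS.
        cos.copyObject("source-bucket", "backups/2020-01-01.json",
                       "destination-bucket", "backups/2020-01-01.json");
    }
}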

What mechanism is there to prevent GCP bucket name squatting?

My company's name is mycomp.
GCP buckets live in a single global, public namespace and their names must be globally unique, so all my buckets are prefixed with mycomp.
So mycomp-production, mycomp-test, mycomp-stage, etc.
What is to prevent someone from grabbing mycomp-dev? Like cybersquatting on that bucket name. Something like that could potentially really screw up my organizational structure.
How can I stop or reserve a bucket prefix? Is this even possible? If I want to be an A-hole, what's to stop me from grabbing "Nordstrom" or "walmart" if I get there first?
GCS supports domain-named buckets. This would allow you to create buckets like production.mycomp.com and test.mycomp.com. Since the domain must be owned to create buckets with the domain suffix, it ensures that other people can't create buckets with that naming scheme.
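Once the domain is verified, creating such a bucket is just an ordinary bucket creation with the domain-suffixed name. A minimal sketch with the google-cloud-storage Java client (the project ID is a placeholder; the bucket name mirrors the answer):

import com.google.cloud.storage.BucketInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

public class CreateDomainBucket {
    public static void main(String[] args) {
        // Requires that ownership of mycomp.com has been verified for the caller
        // (e.g. via Search Console); otherwise the request is rejected.
        Storage storage = StorageOptions.newBuilder()
                .setProjectId("my-project-id") // placeholder
                .build()
                .getService();
        storage.create(BucketInfo.of("production.mycomp.com"));
    }
}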

Amazon DynamoDB findAll in Grails

I have a Grails application. I'm using Amazon DynamoDB for a specific requirement; the table is accessed and populated by a different application. Now I need to copy all the information from the DynamoDB table into a PostgreSQL table. There are over 10,000 records in DynamoDB, but the throughput is:
Read capacity units : 100
Write capacity units : 100
In BuildConfig.groovy I have defined the plugin
compile ":dynamodb:0.1.1"
In Config.groovy I have the following configuration:
grails {
    dynamodb {
        accessKey = '***'
        secretKey = '***'
        disableDrop = true
        dbCreate = 'create'
    }
}
The domain class I have looks something like this:

class Book {
    Long id
    String author
    String name
    Date publishedDate

    static constraints = {
    }

    static mapWith = "dynamodb"

    static mapping = {
        table 'book'
        throughput read: 100
    }
}
When I try something like Book.findAll() I get the following error:
AmazonClientException: Unable to unmarshall response (Connection reset)
And when I tried to reduce the number of records by trying something like Book.findAllByAuthor() (which would also return thousands of records), I get the following error:
Caused by ProvisionedThroughputExceededException: Status Code: 400, AWS Service: AmazonDynamoDB, AWS Request ID: ***, AWS Error Code: ProvisionedThroughputExceededException, AWS Error Message: The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API.
I need to get all the records out of DynamoDB despite the throughput restriction and save them in a Postgres table. Is there a way to do so?
I'm very new to this area, thanks in advance for the help.
After some research I came across Google Guava. But even to use Guava's RateLimiter, there isn't a fixed number of requests I would need to send, nor a known duration for how long it would take, so I'm looking for a solution that suits this requirement.
Your issue is probably not connected with Grails at all. The returned error message says: "The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API."
So you should either increase the provisioned throughput (you will pay more for this option) or adjust your queries to stay within the current limits.
Check out also this answer: https://stackoverflow.com/a/31484168/2166188
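If you just need a one-off export despite the 100 read capacity units, a common pattern is a paginated Scan with a small page size, throttled by the Guava RateLimiter you already found, bypassing the GORM plugin entirely. A rough sketch with the AWS SDK for Java (the 25 permits per second and the page size of 100 are assumptions you would tune against your actual item sizes):

import java.util.Map;

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;
import com.google.common.util.concurrent.RateLimiter;

public class BookExport {
    public static void main(String[] args) {
        AmazonDynamoDB dynamo = AmazonDynamoDBClientBuilder.defaultClient();
        RateLimiter limiter = RateLimiter.create(25.0); // requests per second, an assumption

        Map<String, AttributeValue> lastKey = null;
        do {
            limiter.acquire(); // block until the next request is allowed
            ScanRequest request = new ScanRequest()
                    .withTableName("book")
                    .withLimit(100)                  // small pages keep each request cheap
                    .withExclusiveStartKey(lastKey); // null on the first iteration
            ScanResult result = dynamo.scan(request);
            for (Map<String, AttributeValue> item : result.getItems()) {
                // map each item to a Postgres row and insert it here
            }
            lastKey = result.getLastEvaluatedKey();
        } while (lastKey != null);
    }
}

Because consumed read capacity depends on item size rather than item count, watch the consumed-capacity metrics and lower the rate or page size if you still see ProvisionedThroughputExceededException.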