I am trying to create a YAML file to deploy a GKE cluster in a custom network I created. I get an error:
JSON payload received. Unknown name "network": Cannot find field.
I have tried a few names for the resources, but I am still seeing the same issue.
resources:
- name: myclus
  type: container.v1.cluster
  properties:
    network: projects/project-251012/global/networks/dev-cloud
    zone: "us-east4-a"
    cluster:
      initialClusterVersion: "1.12.9-gke.13"
      currentMasterVersion: "1.12.9-gke.13"
      ## Initial NodePool config.
      nodePools:
      - name: "myclus-pool1"
        initialNodeCount: 3
        version: "1.12.9-gke.13"
        config:
          machineType: "n1-standard-1"
          oauthScopes:
          - https://www.googleapis.com/auth/logging.write
          - https://www.googleapis.com/auth/monitoring
          - https://www.googleapis.com/auth/ndev.clouddns.readwrite
          preemptible: true
## Duplicates node pool config from v1.cluster section, to get it explicitly managed.
- name: myclus-pool1
  type: container.v1.nodePool
  properties:
    zone: us-east4-a
    clusterId: $(ref.myclus.name)
    nodePool:
      name: "myclus-pool1"
I expect it to place the cluster nodes in this network.
The network field needs to be part of the cluster spec. The top level of properties should contain only zone and cluster; network belongs inside cluster, at the same indentation as initialClusterVersion. See the container.v1.cluster API reference page for more details.
Your manifest should look more like:
EDIT: there is some confusion in the API reference docs concerning deprecated fields. I offered a YAML that applies to the new API, not the one you are using. I've updated the answer with the correct syntax for the basic v1 API, and further down I've added the newer API (which currently relies on gcp-types to deploy).
resources:
- name: myclus
  type: container.v1.cluster
  properties:
    projectId: [project]
    zone: us-central1-f
    cluster:
      name: my-clus
      zone: us-central1-f
      network: [network_name]
      subnetwork: [subnet] ### leave this field blank if using the default network
      initialClusterVersion: "1.13"
      nodePools:
      - name: my-clus-pool1
        initialNodeCount: 0
        config:
          imageType: cos
- name: my-pool-1
  type: container.v1.nodePool
  properties:
    projectId: [project]
    zone: us-central1-f
    clusterId: $(ref.myclus.name)
    nodePool:
      name: my-clus-pool2
      initialNodeCount: 0
      version: "1.13"
      config:
        imageType: ubuntu
The newer API (which provides more functionality and lets you use more features, including the v1beta1 API and beta features) would look something like this:
resources:
- name: myclus
  type: gcp-types/container-v1:projects.locations.clusters
  properties:
    parent: projects/shared-vpc-231717/locations/us-central1-f
    cluster:
      name: my-clus
      zone: us-central1-f
      network: shared-vpc
      subnetwork: local-only ### leave this field blank if using the default network
      initialClusterVersion: "1.13"
      nodePools:
      - name: my-clus-pool1
        initialNodeCount: 0
        config:
          imageType: cos
- name: my-pool-2
  type: gcp-types/container-v1:projects.locations.clusters.nodePools
  properties:
    parent: projects/shared-vpc-231717/locations/us-central1-f/clusters/$(ref.myclus.name)
    nodePool:
      name: my-clus-separate-pool
      initialNodeCount: 0
      version: "1.13"
      config:
        imageType: ubuntu
Another note: you may want to modify your scopes. The current scopes will not allow you to pull images from gcr.io, so some system pods may not spin up properly, and if you are using Google's repository, you will be unable to pull those images.
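For example, adding the read-only Cloud Storage scope (gcr.io images are backed by Cloud Storage) next to the existing scopes should let the nodes pull those images; this is only a sketch of the node config, not part of the original manifest:
config:
  machineType: "n1-standard-1"
  oauthScopes:
  - https://www.googleapis.com/auth/devstorage.read_only ### allows nodes to pull images from gcr.io
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring
  preemptible: true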
Finally, you don't want to repeat the node pool resource in both the cluster spec and separately. Instead, create the cluster with a basic (default) node pool; for all additional node pools, create them as separate resources so you can manage them without going through the cluster. There are very few updates you can perform on a node pool, aside from resizing.
Having set up Kibana and a Fleet Server, I have now attempted to add APM.
When going through the general setup, I always get an error, no matter what is done:
failed to listen:listen tcp *.*.*.*:8200: bind: can't assign requested address
This happens when following the APM setup steps after creating the Fleet Server.
This is all being launched in Kubernetes, and the documentation has been gone through several times to no avail.
We did discover that we can hit the /intake/v2/events (and similar) endpoints when shelled into the container, but get a 404 for everything else. It's close but no cigar so far following the instructions.
As it turned out, the general walkthrough is soon to be deprecated in its current form.
Setup is far, far simpler in a Helm values file, where it's actually possible to configure Kibana with a package reference for your named APM service.
xpack.fleet.packages:
  - name: system
    version: latest
  - name: elastic_agent
    version: latest
  - name: fleet_server
    version: latest
  - name: apm
    version: latest
xpack.fleet.agentPolicies:
  - name: Fleet Server on ECK policy
    id: eck-fleet-server
    is_default_fleet_server: true
    namespace: default
    monitoring_enabled:
      - logs
      - metrics
    unenroll_timeout: 900
    package_policies:
      - name: fleet_server-1
        id: fleet_server-1
        package:
          name: fleet_server
  - name: Elastic Agent on ECK policy
    id: eck-agent
    namespace: default
    monitoring_enabled:
      - logs
      - metrics
    unenroll_timeout: 900
    is_default: true
    package_policies:
      - name: system-1
        id: system-1
        package:
          name: system
      - package:
          name: apm
        name: apm-1
        inputs:
          - type: apm
            enabled: true
            vars:
              - name: host
                value: 0.0.0.0:8200
Making sure these are set in the Kibana Helm file will allow any spun-up Fleet Server to automatically register as having APM.
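As an illustration of where these settings live: if Kibana is deployed with the elastic/kibana Helm chart, arbitrary kibana.yml settings can be passed through its kibanaConfig value, roughly like this (a sketch assuming that chart; adapt it to however you template Kibana's configuration):
kibanaConfig:
  kibana.yml: |
    xpack.fleet.packages:
      - name: apm
        version: latest
    ### ...plus the remaining xpack.fleet.packages and xpack.fleet.agentPolicies entries shown above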
The missing key in seemingly all the documentation is the need for an APM Service.
The simplest example of one is here:
Example yaml scripts
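For orientation, here is a minimal sketch of such a Service (not the linked example); the name and the selector label are assumptions, so match the selector to whatever labels your Fleet Server / Elastic Agent pods actually carry:
apiVersion: v1
kind: Service
metadata:
  name: apm-server ### hypothetical name; agents would then send data to http://apm-server:8200
spec:
  selector:
    app: fleet-server ### assumption: replace with the labels on your Fleet Server / Agent pods
  ports:
    - name: apm
      port: 8200
      targetPort: 8200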
I am trying to automate the deployment of a SageMaker multi-model endpoint with the AWS CDK using Python (I guess it would be the same if I wrote a CloudFormation template directly in JSON/YAML format), but when trying to deploy it, an error occurs at the creation of the SageMaker model.
Here is part of the CloudFormation template made with the cdk synth command:
Resources:
  smmodelexecutionrole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: sagemaker.amazonaws.com
        Version: "2012-10-17"
      Policies:
        - PolicyDocument:
            Statement:
              - Action: s3:GetObject
                Effect: Allow
                Resource:
                  Fn::Join:
                    - ""
                    - - "arn:"
                      - Ref: AWS::Partition
                      - :s3:::<bucket_name>/deploy_multi_model_artifact/*
            Version: "2012-10-17"
          PolicyName: policy_s3
        - PolicyDocument:
            Statement:
              - Action: ecr:*
                Effect: Allow
                Resource:
                  Fn::Join:
                    - ""
                    - - "arn:"
                      - Ref: AWS::Partition
                      - ":ecr:"
                      - Ref: AWS::Region
                      - ":"
                      - Ref: AWS::AccountId
                      - :repository/<my_ecr_repository>
            Version: "2012-10-17"
          PolicyName: policy_ecr
    Metadata:
      aws:cdk:path: <omitted>
  smmodel:
    Type: AWS::SageMaker::Model
    Properties:
      ExecutionRoleArn:
        Fn::GetAtt:
          - smmodelexecutionrole
          - Arn
      Containers:
        - Image: xxxxxxxxxxxx.dkr.ecr.<my_aws_region>.amazonaws.com/<my_ecr_repository>/multi-model:latest
          Mode: MultiModel
          ModelDataUrl: s3://<bucket_name>/deploy_multi_model_artifact/
      ModelName: MyModel
    Metadata:
      aws:cdk:path: <omitted>
When running cdk deploy in the terminal, the following error occurs:
3/6 | 7:56:58 PM | CREATE_FAILED | AWS::SageMaker::Model | sm_model (smmodel)
Could not access model data at s3://<bucket_name>/deploy_multi_model_artifact/.
Please ensure that the role "arn:aws:iam::xxxxxxxxxxxx:role/<my_role>" exists
and that its trust relationship policy allows the action "sts:AssumeRole" for the service principal "sagemaker.amazonaws.com".
Also ensure that the role has "s3:GetObject" permissions and that the object is located in <my_aws_region>.
(Service: AmazonSageMaker; Status Code: 400; Error Code: ValidationException; Request ID: xxxxx)
What I have:
An ECR repository containing the docker image
An S3 bucket containing the model artifacts (.tar.gz files) inside the "folder" "deploy_multi_model_artifact"
To test whether it was an IAM role issue, I tried replacing MultiModel with SingleModel and s3://<bucket_name>/deploy_multi_model_artifact/ with s3://<bucket_name>/deploy_multi_model_artifact/one_of_my_artifacts.tar.gz, and I could create the model successfully. I am therefore guessing that it is not an IAM problem, contrary to what the error message tells me (but I may be making a mistake!).
So I am wondering where the problem comes from. This is even more confusing as I have already deployed this multi-model endpoint using boto3 without problems.
Any help would be greatly appreciated !!
(About Multi-Model Endpoints deployment: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/multi_model_xgboost_home_value/xgboost_multi_model_endpoint_home_value.ipynb)
The problem was that I forgot to add SageMaker access permissions to the IAM role.
I can deploy the multi-model endpoint after adding the AmazonSageMakerFullAccess managed policy to the IAM role.
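In the synthesized template, that fix roughly amounts to attaching the managed policy to the execution role, for example (a sketch, not the exact cdk synth output):
smmodelexecutionrole:
  Type: AWS::IAM::Role
  Properties:
    AssumeRolePolicyDocument:
      Statement:
        - Action: sts:AssumeRole
          Effect: Allow
          Principal:
            Service: sagemaker.amazonaws.com
      Version: "2012-10-17"
    ManagedPolicyArns:
      - arn:aws:iam::aws:policy/AmazonSageMakerFullAccess # grants the SageMaker permissions the multi-model endpoint needs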
I mean, I have an application which is already dockerized; can I provide a CloudFormation template to deploy it on my client's EKS cluster?
I have been using CloudFormation for some time; however, I have never used it for deploying Kubernetes artifacts (and I've never heard of anybody else doing so). I think there is a way to do so (see the AWS Blog), but even that solution seems to be based on Helm.
I would definitely recommend using Helm charts for your use case. Helm charts are straightforward and easy to use, especially if you already know the Kubernetes objects you want to deploy.
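To give a sense of scale: a chart for an already dockerized application is little more than a Chart.yaml plus the Kubernetes manifests you would otherwise apply by hand, templated under templates/ (helm create my-app scaffolds this layout). The chart name and versions below are illustrative placeholders:
# Chart.yaml for a hypothetical chart named "my-app"
apiVersion: v2
name: my-app
description: Helm chart for an already dockerized application
version: 0.1.0 # chart version
appVersion: "1.0.0" # version of the application image
# The Deployment, Service, Ingress, etc. for the app live under templates/,
# with the configurable bits (image tag, replicas, ...) exposed through values.yaml.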
You can use cdk8s.io. Here are some examples: https://github.com/awslabs/cdk8s/tree/master/examples
Deploy an Amazon EKS cluster by using the Modular and Scalable Amazon EKS Architecture Quick Start. After the Amazon EKS cluster is deployed, on the Outputs tab, note the following outputs.
HelmLambdaArn
KubeClusterName
KubeConfigPath
KubeGetLambdaArn
The template below installs the WordPress Helm chart the same as if you logged in to the Kubernetes cluster and ran the following command.
helm install stable/wordpress
The following section of the template shows how Helm is used to deploy WordPress. It also retrieves the load balancer host name, so that you can access the WordPress site.
Resources:
  HelmExample:
    Type: "Custom::Helm"
    Version: '1.0'
    Description: 'This deploys the Helm Chart to deploy wordpress in to the EKS Cluster.'
    Properties:
      ServiceToken: !Ref HelmLambdaArn
      KubeConfigPath: !Ref KubeConfigPath
      KubeConfigKmsContext: !Ref KubeConfigKmsContext
      KubeClusterName: !Ref KubeClusterName
      Namespace: !Ref Namespace
      Chart: stable/wordpress
      Name: !Ref Name
      Values:
        wordpressUsername: !Ref wordpressUsername
        wordpressPassword: !Ref wordpressPassword
  WPElbHostName:
    DependsOn: HelmExample
    Type: "Custom::KubeGet"
    Version: '1.0'
    Properties:
      ServiceToken: !Ref KubeGetLambdaArn
      KubeConfigPath: !Ref KubeConfigPath
      KubeConfigKmsContext: !Ref KubeConfigKmsContext
      Namespace: !Ref Namespace
      Name: !Sub 'service/${Name}-wordpress'
      JsonPath: '{.status.loadBalancer.ingress[0].hostname}'
Modify the Helm chart to fit your application, and modify the CloudFormation template with the output values you noted previously.
These are the parameters you will have to fill in when deploying the CloudFormation template (declared roughly as in the sketch after this list):
HelmLambdaArn
KubeClusterName
KubeConfigPath
KubeGetLambdaArn
Namespace
Name
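A sketch of how these parameters would be declared in the template (types assumed; the Quick Start's actual template may add defaults and descriptions):
Parameters:
  HelmLambdaArn:
    Type: String
  KubeClusterName:
    Type: String
  KubeConfigPath:
    Type: String
  KubeGetLambdaArn:
    Type: String
  KubeConfigKmsContext: # also referenced by the resources above
    Type: String
  Namespace:
    Type: String
  Name:
    Type: String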
You can use the AWS Quick Start extensions to deploy a payload to EKS:
- AWSQS::Kubernetes::Resource
- AWSQS::Kubernetes::Helm
Before you can use the new types, activate them in a helper template:
EKSHelmExtension:
  Type: AWS::CloudFormation::TypeActivation
  Properties:
    AutoUpdate: false
    ExecutionRoleArn: !GetAtt DeployClusterRole.Arn
    PublicTypeArn: !Sub "arn:aws:cloudformation:${AWS::Region}::type/resource/408988dff9e863704bcc72e7e13f8d645cee8311/AWSQS-Kubernetes-Helm"
EKSResourceExtension:
  Type: AWS::CloudFormation::TypeActivation
  Properties:
    AutoUpdate: false
    ExecutionRoleArn: !GetAtt DeployClusterRole.Arn
    PublicTypeArn: !Sub "arn:aws:cloudformation:${AWS::Region}::type/resource/408988dff9e863704bcc72e7e13f8d645cee8311/AWSQS-Kubernetes-Resource"
Then, in the main template, use the new types as follows:
Resources:
  ExampleCm:
    Type: "AWSQS::Kubernetes::Resource"
    Properties:
      ClusterName: my-eks-cluster-name
      Namespace: default
      Manifest: |
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: example-cm
        data:
          example_key: example_value
Helm:
Resources:
  KubeStateMetrics:
    Type: "AWSQS::Kubernetes::Helm"
    Properties:
      ClusterID: my-cluster-name
      Name: kube-state-metrics
      Namespace: kube-state-metrics
      Repository: https://prometheus-community.github.io/helm-charts
      Chart: prometheus-community/kube-state-metrics
      ValueYaml: |
        prometheus:
          monitor:
            enabled: true
I have a very simple Jelastic installation manifest which installs a Kubernetes cluster:
jpsVersion: 1.3
jpsType: install
application:
  id: shopozor-k8s-cluster
  name: Shopozor k8s cluster
  version: 0.0
  settings:
    fields:
      - name: envName
        caption: Env Name
        type: string
        default: shopozor
      - name: topo
        type: radio-fieldset
        values:
          0-dev: '<b>Development:</b> one master (1) and one scalable worker (1+)'
          1-prod: '<b>Production:</b> multi master (3) with API balancers (2+) and scalable workers (2+)'
        default: 0-dev
      - name: k8s-version
        type: string
        caption: k8s manifest version
        default: v1.16.3
  onInstall:
    - installKubernetes
    - attachIpToWorkerNodes
  actions:
    installKubernetes:
      install:
        jps: https://github.com/jelastic-jps/kubernetes/blob/${settings.k8s-version}/manifest.jps
        envName: ${settings.envName}
        displayName: ${settings.envName}
        settings:
          deploy: cc
          topo: ${settings.topo}
          dashboard: version2
          ingress-controller: Nginx
          storage: true
          api: true
          monitoring: true
          version: ${settings.k8s-version}
          jaeger: false
    attachIpToWorkerNodes:
      - forEach(node:nodes.cp):
          - jelastic.env.binder.AttachExtIp:
              envName: ${settings.envName}
              nodeId: ${#node.id}
If I install that manifest, I get my cluster up and running, but the worker nodes do not get an IPv4 address attached. If, after installing that manifest, I additionally install the following update manifest, then it works:
jpsVersion: 1.3
jpsType: update
application:
  id: attach-ext-ip
  name: Attach external IP
  version: 0.0
  onInstall:
    - attachIpToWorkerNodes
  actions:
    attachIpToWorkerNodes:
      - forEach(node:nodes.cp):
          - jelastic.env.binder.AttachExtIp:
              nodeId: ${#node.id}
What am I doing wrong in the install manifest? Why aren't the IPs attached to my worker nodes, while they are if I perform that action after installation with an update manifest?
Please note that the "public IP binding" feature is not available in production yet. It's under active development and will be officially announced in one of our next releases.
In the current stable version, some of the functionality related to it may not work properly. Right now, it's not recommended for production use, but you can try it for test purposes only.
As for the "attachIpToWorkerNodes" action in the original manifest, the issue was that "nodes.cp" of the created environment wasn't declared in the scope where "forEach" was invoked. The correct version of the action is:
attachIpToWorkerNodes:
  install:
    envName: ${settings.envName}
    jps:
      type: update
      name: Attach IP To Worker Nodes
      onInstall: jelastic.env.binder.AttachExtIp [nodes.cp.join(id,)]
Please let us know if you have any further questions.
I have the following manifest:
jpsVersion: 1.3
jpsType: install
application:
  id: shopozor-k8s-cluster
  name: Shopozor k8s cluster
  version: 0.0
  baseUrl: https://raw.githubusercontent.com/shopozor/services/dev
  settings:
    fields:
      - name: envName
        caption: Env Name
        type: string
        default: shopozor
      - name: topo
        type: radio-fieldset
        values:
          0-dev: '<b>Development:</b> one master (1) and one scalable worker (1+)'
          1-prod: '<b>Production:</b> multi master (3) with API balancers (2+) and scalable workers (2+)'
        default: 0-dev
      - name: version
        type: string
        caption: Version
        default: v1.16.3
  onInstall:
    - installKubernetes
    - enableSubDomains
  actions:
    installKubernetes:
      install:
        jps: https://github.com/jelastic-jps/kubernetes/blob/${settings.version}/manifest.jps
        envName: ${settings.envName}
        displayName: ${settings.envName}
        settings:
          deploy: cmd
          cmd: |-
            curl -fsSL ${baseUrl}/scripts/install_k8s.sh | /bin/bash
          topo: ${settings.topo}
          dashboard: version2
          ingress-controller: Nginx
          storage: true
          api: true
          monitoring: true
          version: ${settings.version}
          jaeger: false
    enableSubDomains:
      - jelastic.env.binder.AddDomains[cp]:
          domains: staging,api-staging,assets-staging,api,assets
Unfortunately, when I run that manifest, the k8s cluster gets installed, but the subdomains cannot be created (yet), because:
[15:26:28 Shopozor.cluster:3]: enableSubDomains: {"action":"enableSubDomains","params":{}}
[15:26:29 Shopozor.cluster:4]: api [cp]: {"method":"jelastic.env.binder.AddDomains","params":{"domains":"staging,api-staging,assets-staging,api,assets"},"nodeGroup":"cp"}
[15:26:29 Shopozor.cluster:4]: ERROR: api.response: {"result":2303,"source":"JEL","error":"env for appid [5ce25f5a6988fbbaf34999b08dd1d47c] not created."}
What jelastic API methods can I use to perform the necessary waiting until subdomain creation is possible?
My current workaround is to split that manifest into two manifests: one cluster installation manifest and one update manifest creating the subdomains. However, I'd like to have everything in the same manifest.
Please change this:
enableSubDomains:
  - jelastic.env.binder.AddDomains[cp]:
      domains: staging,api-staging,assets-staging,api,assets
to:
enableSubDomains:
  - jelastic.env.binder.AddDomains[cp]:
      envName: ${settings.envName}
      domains: staging,api-staging,assets-staging,api,assets