I have a GitLab runner with this configuration:
runners:
  privileged: false
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "managed-ng-1"
        pod_labels_overwrite_allowed = ".*"
        [runners.kubernetes.pod_labels]
          "kubernetes.io/arch" = "amd64"
          "job_id" = "${CI_JOB_ID}"
          "job_name" = "${CI_JOB_NAME}"
          "pipeline_id" = "${CI_PIPELINE_ID}"
          "project" = "${CI_PROJECT_PATH}"
I have a .gitlab-ci.yml file with this variables section:
variables:
  KUBERNETES_POD_LABELS_1: "karpenter.k8s.aws/instance-local-nvme=256G"
  KUBERNETES_POD_LABELS_2: "kubernetes.io/arch=arm64"
When the job runs, the logs show this:
Preparing the "kubernetes" executor
"PodLabels" "karpenter.k8s.aws/instance-local-nvme" overwritten with "256G"
"PodLabels" "kubernetes.io/arch" overwritten with "arm64"
However, if I run kubectl describe pod against the pod, these labels are not there:
Labels: job_id=297
        job_name=job1
        kubernetes.io/arch=amd64
        pipeline_id=116
        pod=runner-awq3dkxf-project-5-concurrent-0
        project=root_simple-cicd-test
I explicitly added a default value for "kubernetes.io/arch" in case the label-overwriting mechanism only works when a label with a value already exists.
I don't know why this isn't working. Are there any other logs I should be looking at that might explain what is going on?
Thanks.
It turns out there is a bug in the Kubernetes executor: https://gitlab.com/gitlab-org/gitlab-runner/-/issues/29168
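Until a runner release with that fix is available, a quick way to verify what actually landed on the pod is to list the labels directly (the namespace below comes from the runner config above):

    # List the labels on the CI job pods in the runner's namespace
    kubectl get pods -n managed-ng-1 --show-labels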
Due to security concerns, I can't keep the credentials in values.yaml in the CI/CD system. I'm trying to pass the credentials directly to helm with the --set argument instead. The application requires some of its configs to be in nested formats, as shown below.
https://github.com/StackStorm/stackstorm-k8s/blob/master/values.yaml#L67
https://github.com/StackStorm/stackstorm-k8s/blob/master/values.yaml#L89
I'm unable to find a better way to pass these variables to helm during install/upgrade.
configs:
  core.yaml: |
    ---
    name: "CoreName"
    value: "CoreValue"
  element1.yaml: |
    ---
    element_url: "https://example.com/element/"
    invit_invite: "var1,var2,var3"
    element_token: "elementtokenhere"
    element_labels:
      name: "value"
      type: "value"
  element2.yaml: |
    ---
    name: "name"
    type: "value"
Is there a better way to handle this within helm arguments before I look for options to change the chart itself?
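If the goal is only to keep those blocks out of the repository, one option worth trying before changing the chart is --set-file, which reads a value from a local file at install time; dots inside a key name must be backslash-escaped so helm doesn't treat them as nesting. A sketch, where the release name and chart path are placeholders and the files exist only on the machine running helm:

    # core.yaml / element1.yaml live only where helm runs, not in the repo
    helm upgrade --install my-release ./stackstorm-k8s \
      --set-file 'configs.core\.yaml'=./core.yaml \
      --set-file 'configs.element1\.yaml'=./element1.yaml

Single scalar credentials can be passed the same way with plain --set and the same dot-escaping.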
I am running the latest version of the SCDF server on a Kubernetes cluster. Every time I try to run a composed task, it tries to fetch the application properties for the composed-task-runner application and fails to launch the composed task.
First of all, SCDF is trying to pull the properties (metadata) from the Spring Maven repo when I run the server on k8s. My server is behind a firewall and cannot connect to the Spring Maven repo. I already downloaded the composed-task-runner Docker image to my local registry and added the composed-task-runner application using the UI. Why does it still try to download metadata from the Spring Maven repo? How do I stop it?
Here is the log:
2020-11-21 15:49:07.591 INFO 1 --- [nio-8080-exec-4] o.s.c.d.s.k.DefaultContainerFactory : Using Docker entry point style: exec
2020-11-21 15:49:58.355 WARN 1 --- [nio-8080-exec-6] .s.c.d.s.s.i.TaskConfigurationProperties : org.springframework.cloud.dataflow.server.service.impl.TaskConfigurationProperties.logDeprecationWarning is deprecated. Please use org.springframework.cloud.dataflow.server.service.impl.ComposedTaskRunnerConfigurationProperties.logDeprecationWarning
2020-11-21 15:50:18.427 WARN 1 --- [nio-8080-exec-6] ApplicationConfigurationMetadataResolver : Failed to retrieve properties for resource org.springframework.cloud:spring-cloud-dataflow-composed-task-runner:jar:2.7.0-SNAPSHOT because of ConnectTimeoutException: Connect to repo.spring.io:443 timed out
2020-11-21 15:50:38.522 WARN 1 --- [nio-8080-exec-6] ApplicationConfigurationMetadataResolver : Failed to retrieve properties for resource org.springframework.cloud:spring-cloud-dataflow-composed-task-runner:jar:2.7.0-SNAPSHOT because of ConnectTimeoutException: Connect to repo.spring.io:443 timed out
2020-11-21 15:50:38.572 INFO 1 --- [nio-8080-exec-6] o.s.c.d.s.k.KubernetesTaskLauncher : Preparing to run a container from org.springframework.cloud:spring-cloud-dataflow-composed-task-runner:jar:2.7.0-SNAPSHOT. This may take some time if the image must be downloaded from a remote container registry.
2020-11-21 15:50:38.573 INFO 1 --- [nio-8080-exec-6] o.s.c.d.s.k.DefaultContainerFactory : Using Docker image: //org.springframework.cloud:spring-cloud-dataflow-composed-task-runner:jar:2.7.0-SNAPSHOT
It looks like the Composed Task Runner Docker image can now be set using this environment variable:
name: SPRING_CLOUD_DATAFLOW_TASK_COMPOSED_TASK_RUNNER_URI
value: 'docker://springcloud/spring-cloud-dataflow-composed-task-runner:2.6.0'
We were on SCDF server version 2.2.4 before this, and we had to manually add the composed task runner as an application using the dashboard UI.
Now all I had to do was download this image, push it to my local registry, and use it here.
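For reference, this is roughly how that variable is attached to the SCDF server pod; a sketch of the container env section of the server Deployment, with the registry host as a placeholder for your local registry:

    env:
      - name: SPRING_CLOUD_DATAFLOW_TASK_COMPOSED_TASK_RUNNER_URI
        value: 'docker://registry.example.local/spring-cloud-dataflow-composed-task-runner:2.6.0'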
My k8s.yaml inventory file is:
plugin: k8s
connections:
  - kubeconfig: '/Users/user1/Documents/Learning/ansible/kubeconfig.test.yaml'
    context: 'user1#testeks.us-east-1.eksctl.io'
My Ansible playbook, test_new.yml:
- hosts: localhost
  tasks:
    - name: Create a k8s namespace
      k8s:
        name: testing3
        api_version: v1
        kind: Namespace
        state: present
It looks like the ansible-playbook command is not picking up the k8s.yaml inventory. I am also not sure why I am getting the "invalid characters {'-'} in group name" warnings.
Please let me know if the above inventory and playbook files look good, or whether there is anything I am missing.
ansible-playbook -vvvv -i k8s.yaml -vvv ./test_new.yml
No config file found; using defaults
setting up inventory plugins
host_list declined parsing /Users/user1/Documents/Learning/ansible/k8s.yaml as it did not pass its verify_file() method
script declined parsing /Users/user1/Documents/Learning/ansible/k8s.yaml as it did not pass its verify_file() method
Not replacing invalid character(s) "{'-', '9'}" in group name (909676E2B4F81625BF5994625D3353C9-yl4-us-east-1-eks-amazonaws-com)
[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details
Not replacing invalid character(s) "{'-'}" in group name (namespace_add-ons)
Not replacing invalid character(s) "{'-'}" in group name (namespace_add-ons_pods)
Not replacing invalid character(s) "{'.', '/', '-'}" in group name (label_app.kubernetes.io/instance_aws-cluster-autoscaler)
I'm not sure where you got the idea that you need the Kubernetes connection parameters in your inventory file. If you look at the k8s module documentation, it says that kubeconfig and context are specified in the playbook or as environment variables.
Your inventory should look something like this:
all:
  hosts:
    host.where.can.access.the.kubeapiserver.com:
Then your playbook:
- name: Create a k8s namespace
  k8s:
    name: testing3
    api_version: v1
    kind: Namespace
    state: present
    kubeconfig: '/Users/user1/Documents/Learning/ansible/kubeconfig.test.yaml' 👈 this can be replaced by the K8S_AUTH_KUBECONFIG env variable
    context: 'user1#testeks.us-east-1.eksctl.io' 👈 this can be replaced by the K8S_AUTH_CONTEXT env variable
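For example, with those two parameters removed from the task, the same playbook can be run like this (a sketch using the environment variables mentioned above):

    export K8S_AUTH_KUBECONFIG=/Users/user1/Documents/Learning/ansible/kubeconfig.test.yaml
    export K8S_AUTH_CONTEXT='user1#testeks.us-east-1.eksctl.io'
    ansible-playbook ./test_new.yml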
Based on the formatting of your post, it looks like your inventory file contains improper syntax. It should look like this:
plugin: k8s
connections:
  - kubeconfig: '/Users/user1/Documents/Learning/ansible/kubeconfig.test.yaml'
    context: 'user1#testeks.us-east-1.eksctl.io'
Remember that spaces are important.
For deprecation warnings, be sure to read up on these issues:
https://github.com/ansible/ansible/issues/56930
https://github.com/kubernetes-sigs/kubespray/issues/4830
Usage of hyphens in inventory group names was deprecated in Ansible 2.8 due to Python parser errors when using dot syntax. The auto-transformation can be disabled by adding force_valid_group_names = never to your Ansible config file, as sketched below. Similarly, deprecation warnings can be suppressed by adding deprecation_warnings = False, though this is not recommended.
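Both settings live in the [defaults] section of ansible.cfg; a minimal sketch:

    [defaults]
    # keep group names as-is instead of rewriting the invalid characters
    force_valid_group_names = never
    # silence deprecation warnings entirely (not recommended)
    deprecation_warnings = False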
I'm quite new to Spinnaker and have to ask for some help, I guess. Does anyone know why I can't create any application and just keep seeing this screen?
My installation is through Halyard 1.5.0 and Ubuntu 14.04.
We don't use any cloud provider, but I did configure the Docker and Kubernetes parts.
And here is the error I see in the /var/log/spinnaker/echo/echo.log:
2017-11-16 13:52:29.901 INFO 13877 --- [ofit-/pipelines] c.n.s.echo.services.Front50Service : java.net.SocketTimeoutException: timeout
at okio.Okio$3.newTimeoutException(Okio.java:207)
at okio.AsyncTimeout.exit(AsyncTimeout.java:261)
at okio.AsyncTimeout$2.read(AsyncTimeout.java:215)
at okio.RealBufferedSource.indexOf(RealBufferedSource.java:306)
at okio.RealBufferedSource.indexOf(RealBufferedSource.java:300)
at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:196)
at com.squareup.okhttp.internal.http.Http1xStream.readResponse(Http1xStream.java:186)
at com.squareup.okhttp.internal.http.Http1xStream.readResponseHeaders(Http1xStream.java:127)
at com.squareup.okhttp.internal.http.HttpEngine.readNetworkResponse(HttpEngine.java:739)
at com.squareup.okhttp.internal.http.HttpEngine.access$200(HttpEngine.java:87)
at com.squareup.okhttp.internal.http.HttpEngine$NetworkInterceptorChain.proceed(HttpEngine.java:724)
at com.squareup.okhttp.internal.http.HttpEngine.readResponse(HttpEngine.java:578)
at com.squareup.okhttp.Call.getResponse(Call.java:287)
at com.squareup.okhttp.Call$ApplicationInterceptorChain.proceed(Call.java:243)
at com.squareup.okhttp.Call.getResponseWithInterceptorChain(Call.java:205)
at com.squareup.okhttp.Call.execute(Call.java:80)
at retrofit.client.OkClient.execute(OkClient.java:53)
at retrofit.RestAdapter$RestHandler.invokeRequest(RestAdapter.java:326)
at retrofit.RestAdapter$RestHandler.access$100(RestAdapter.java:220)
at retrofit.RestAdapter$RestHandler$1.invoke(RestAdapter.java:265)
at retrofit.RxSupport$2.run(RxSupport.java:55)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at retrofit.Platform$Base$2$1.run(Platform.java:94)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Socket closed
at java.net.SocketInputStream.read(SocketInputStream.java:204)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at okio.Okio$2.read(Okio.java:139)
at okio.AsyncTimeout$2.read(AsyncTimeout.java:211)
... 24 more
2017-11-16 13:52:29.901 INFO 13877 --- [ofit-/pipelines] c.n.s.echo.services.Front50Service : ---- END ERROR
#grizzthedj
Thanks again for the recommendations. However, it doesn't seem to have solved the issue. I wonder if it has something to do with my Docker registry or Kubernetes.
Here is what I have in my .hal/config:
dockerRegistry:
  enabled: true
  accounts:
    - name: <hidden-name>
      requiredGroupMembership: []
      address: https://docker-registry.<hidden-name>.net/
      cacheIntervalSeconds: 30
      repositories:
        - hellopod
        - demoapp
  primaryAccount: <hidden-name>
kubernetes:
  enabled: true
  accounts:
    - name: <username>
      requiredGroupMembership: []
      dockerRegistries:
        - accountName: <hidden-name>
          namespaces: []
      context: sre-os1-dev
      namespaces:
        - spinnaker
      omitNamespaces: []
      kubeconfigFile: /home/<username>/.kube/config
I suspect you may be using redis as the persistent storage type (I ran into the same issue).
If this is the case, persistent storage using redis doesn't seem to work properly out of the box, and it is not supported. I would try using an S3 target, if available.
More info here on support for redis
To configure S3 using Halyard, use the following commands:
echo <SECRET_ACCESS_KEY> | hal config storage s3 edit \
  --access-key-id <ACCESS_KEY_ID> \
  --endpoint <S3_ENDPOINT> \
  --bucket <BUCKET_NAME> \
  --root-folder spinnaker \
  --secret-access-key
hal config storage edit --type s3
hal deploy apply
#grizzthedj,
Here is what I've found inside front50.log (I wiped out the IDs, of course, for security reasons).
You may be right.
2017-11-20 12:40:29.151 INFO 682 --- [0.0-8080-exec-1] com.amazonaws.latency : ServiceName=[Amazon S3], AWSErrorCode=[NoSuchKey], StatusCode=[404], ServiceEndpoint=[https://s3-us-west-2.amazonaws.com], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: ...; S3 Extended Request ID: ...), S3 Extended Request ID: ...], RequestType=[GetObjectRequest], AWSRequestID=[...], HttpClientPoolPendingCount=0, RetryCapacityConsumed=0, HttpClientPoolAvailableCount=1, RequestCount=1, Exception=1, HttpClientPoolLeasedCount=0, ClientExecuteTime=[39.634], HttpClientSendRequestTime=[0.072], HttpRequestTime=[39.213], RequestSigningTime=[0.067], CredentialsRequestTime=[0.001, 0.0], HttpClientReceiveResponseTime=[39.059],
I had a similar issue on Kubernetes/AWS. When I opened the Chrome dev console, I saw lots of 404 errors trying to connect to localhost:8084, so I had to reconfigure the deck and gate base URLs. This is what I did using Halyard:
hal config security ui edit --override-base-url http://<deck-loadbalancer-dns-entry>:9000
hal config security api edit --override-base-url http://<gate-loadbalancer-dns-entry>:8084
I did hal deploy apply, and when it came back I noticed the developer console was throwing CORS errors, so I had to do the following:
echo "host: 0.0.0.0" | tee \ ~/.hal/default/service-settings/gate.yml \ ~/.hal/default/service-settings/deck.yml
You may note the lack of TLS and CORS config; this is a test system, so make better choices in production :)
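For reference, after that command each of those files should contain just this single setting (another hal deploy apply is needed for it to take effect):

    # ~/.hal/default/service-settings/gate.yml (deck.yml is identical)
    host: 0.0.0.0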