gcloud app deploy flexible environment hangs - gcloud

Title is self explanatory. I simply downloaded the flexible-hello-world app and deployed it without (almost) a single modification - I deployed it to a service called some-service using this app.yaml
# Copyright 2017, Google, Inc.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# [START gae_flex_quickstart_yaml]
runtime: nodejs
env: flex
service: some-service
# This sample incurs costs to run on the App Engine flexible environment.
# The settings below are to reduce costs during testing and are not appropriate
# for production use. For more information, see:
# https://cloud.google.com/appengine/docs/flexible/nodejs/configuring-your-app-with-app-yaml
manual_scaling:
instances: 1
resources:
cpu: 1
memory_gb: 0.5
disk_size_gb: 10
# [END gae_flex_quickstart_yaml]
I have a billing account enabled for my project.
It hangs at this line:
...
7104cb4c0c814fa53787009 size: 2385
Finished Step #1
PUSH
DONE
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Updating service [some-service] (this may take several minutes)...⠹
When I go to the app engine console I see it has deployed and can access the URL here: https://some-service-dot-ashored-cloud-dv.uk.r.appspot.com/
But the 502 never goes away.
Help!
EDIT:
Some more information, while it is still hung on the deploy command, I run this in another terminal:
gcloud app instances list --service some-service
and I get:
SERVICE VERSION ID VM_STATUS DEBUG_MODE
some-service 20201119t184828 aef-some--service-20201119t184828-fr7n TERMINATED
EDIT 2:
When I try to ssh to it I get more weirdness:
gcloud app instances ssh aef-some--service-20201119t184828-fr7n --service some-service --version 20201119t184828
WARNING: This instance is serving live application traffic. Any changes made could
result in downtime or unintended consequences.
Do you want to continue (Y/n)?
Sending public key to instance [apps/ashored-cloud-dv/services/some-service/versions/20201119t184828/instances/aef-some--service-20201119t184828-fr7n].
Waiting for operation [apps/ashored-cloud-dv/operations/9de7b298-f4e9-47a7-8a8e-11411e649d50] to complete...done.
ERROR: gcloud crashed (TypeError): can only concatenate str (not "NoneType") to str
EDIT 2:
glcloud --version output:
Google Cloud SDK 319.0.0
bq 2.0.62
cloud-build-local
core 2020.11.13
gsutil 4.55

tl;dr; permissions
I removed the editor permission from my default app engine service account as recommended in the IAM dashboard.
Nowhere in the docs (that I could find) does it tell you what permissions are needed to deploy a app engine flexible service.
Turns out, you need:
Logs Writer
Storage Object Viewer
Without Storage Object Viewer you'll get an error on deployment telling you the exact issue. Without Logs Writer you will not get an error, but the service will never come up.
What a long 10 days...
EDIT: I was wrong, it says here in the docs what permissions you need.
I asked Google Support to file an internal bug that the correct error message is not being returned if you do not have Logs Writer

Related

Azure Devops: Pipeline fails to deploy to Linux Web App

I have a pipeline deploying to my Azure web app, that most of the times errors out because it couldn't deploy to my web app. The task take around 25 mins :
...
Copying file: 'frontend/.gitignore'
Copying file: 'frontend/README.md'
Copying file: 'frontend/package.json'
Copying file: 'frontend/tsconfig.json'
Copying file: 'frontend/yarn.lock'
Omitting next output lines...
An error has occurred during web site deployment.
Kudu Sync failed
\n/opt/Kudu/Scripts/starter.sh "/home/site/deployments/tools/deploy.sh"
##[error]Failed to deploy web package to App Service.
##[error]To debug further please check Kudu stack trace URL : https://$someapp:***#someapp.scm.azurewebsites.net/api/vfs/LogFiles/kudu/trace
##[error]Error: Package deployment using ZIP Deploy failed. Refer logs for more details.
...
When i enable : system.debug = true , i see these logs repeated many time , before start copying the artifact files :
POLL URL RESULT: {"statusCode":202,"statusMessage":"Accepted","headers":{"transfer-encoding":"chunked","content-type":"application/json; charset=utf-8","location":"http://XXXXXXXXX.scm.azurewebsites.net:80/api/deployments/latest?deployer=VSTS_ZIP_DEPLOY&time=2021-07-09_09-01-41Z","server":"Kestrel","date":"Fri, 09 Jul 2021 09:23:37 GMT","connection":"close"},"body":{"id":"68a7a8811796416b993924437493ff87","status":0,"status_text":"Building and Deploying '68a7a8811796416b993924437493ff87'.","author_email":"N/A","author":"N/A","deployer":"VSTS_ZIP_DEPLOY","message":"Created via a push deployment","progress":"Running deployment command...","received_time":"2021-07-09T09:01:50.4159225Z","start_time":"2021-07-09T09:01:51.775357Z","end_time":null,"last_success_end_time":null,"complete":false,"active":false,"is_temp":false,"is_readonly":true,"url":null,"log_url":null,"site_name":"XXXXXXXXXXXXe"}}
Deployment status: 0 'Building and Deploying '68a7a8811796416b993924437493ff87'.'. retry after 5 seconds
setting affinity cookie ["ARRAffinity=c06e9bb74f52245b3695b3079a52f6acbc70c3ee812f67e4fa3f5f65088ff4f7;Path=/;HttpOnly;Secure;Domain=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX.scm.azurewebsites.net","ARRAffinitySameSite=c06e9bb74f52245b3695b3079a52f6acbc70c3ee812f67e4fa3f5f65088ff4f7;Path=/;HttpOnly;SameSite=None;Secure;Domain=XXXXXXXXXXXXXXX.scm.azurewebsites.net"]
[GET]https://XXXXXXXXXXX-test.scm.azurewebsites.net:443/api/deployments/latest?deployer=VSTS_ZIP_DEPLOY&time=2021-07-09_09-01-41Z
POLL URL RESULT: {"statusCode":202,"statusMessage":"Accepted","headers":{"transfer-encoding":"chunked","content-type":"application/json; charset=utf-8","location":"http://XXXXXXXXXXXXXXXXXXX.scm.azurewebsites.net:80/api/deployments/latest?deployer=VSTS_ZIP_DEPLOY&time=2021-07-09_09-01-41Z","server":"Kestrel","date":"Fri, 09 Jul 2021 09:23:45 GMT","connection":"close"},"body":{"id":"68a7a8811796416b993924437493ff87","status":0,"status_text":"Building and Deploying '68a7a8811796416b993924437493ff87'.","author_email":"N/A","author":"N/A","deployer":"VSTS_ZIP_DEPLOY","message":"Created via a push deployment","progress":"Running deployment command...","received_time":"2021-07-09T09:01:50.4159225Z","start_time":"2021-07-09T09:01:51.775357Z","end_time":null,"last_success_end_time":null,"complete":false,"active":false,"is_temp":false,"is_readonly":true,"url":null,"log_url":null,"site_name":"XXXXXXXXXXXX"}}
Deployment status: 0 'Building and Deploying '68a7a8811796416b993924437493ff87'.'. retry after 5 seconds
setting affinity cookie ["ARRAffinity=c06e9bb74f52245b3695b3079a52f6acbc70c3ee812f67e4fa3f5f65088ff4f7;Path=/;HttpOnly;Secure;Domain=XXXXXXXXXXXXXXXXXXXXXX.scm.azurewebsites.net","ARRAffinitySameSite=c06e9bb74f52245b3695b3079a52f6acbc70c3ee812f67e4fa3f5f65088ff4f7;Path=/;HttpOnly;SameSite=None;Secure;Domain=XXXXXXXXXXXXXXXXXX"]
This task fails only in specific slot in myweb app , authors slots and production slot works fine and the job take around 6 mins
Any ideas what could be wrong?
As per the discussion and troubleshooting performed here, I tried to setup a Linux App Service on Standard S1 pricing tier enabling 5 (max) slots with CI/CD configured via Azure Pipelines. Unfortunately, I wasn't able to reproduce the same error as yours despite multiple different trials.
I'd suggest you to try the following:
Kudu Sync failed in the deployment log resembles this open issue from about a year ago: ZipDelpoy on azure web app linux fails during kudu sync #2972. Please check the trace/deployment log files on kudu at https://<appname>.scm.azurewebsites.net/api/vfs/LogFiles/kudu/trace or /deployment or from Kudu's DebugConsole (/LogFiles/kudu/\*) and check if this is caused by deployment lock failures. In that case, check this wiki out for dealing with locked files during deployment.
Try a different deployment method like run from package (to avoid resource locking), using FTP/S, or local git deployment.
This should help you narrow down the issue further, whether it is caused in the App service/deployment method, or the ADO pipeline/task.
Scale up to the next higher tier and re-trigger your pipeline. If it succeeds, you may scale back down to the original tier. This would indirectly restart your SCM sites as well.
If the above workarounds don't help, you could check on the following:
Customize your deploy task with options like TakeAppOfflineFlag, DeploymentType or RenameFilesFlag to streamline your deployment.
Try restarting the app/slot just before the deployment in order to recycle the app pool.
Check if your app is running into any of the prescribed limits (ex: file system storage) for your tier.
Drill down into available metrics for your app to identify any CPU/Memory anomalies.
Try the Diagnose and solve problems tool for any additional insights about your app.
If your environment permits, try setting up and deploying to a new slot within your App Service, or try verifying if this happens to another app in a different region.

Kubeflow fails to deploy using both CLI and Console

I deleted my KF cluster last night to create a new one (using kubectl cluster command not Kfctl delete), and then when I tied to create a new one, it fails, it does not work with CLI not Console. I found other people have run into this issue before, for example (here and here)
"However, as I said even with CLI my deployment fails, the error from console is:
ailed to apply: (kubeflow.error): Code 500 with message: coordinator Apply failed for gcp: (kubeflow.error): Code 500 with message: gcp apply could not update deployment manager Error could not update storage-kubeflow.yaml; Insert deployment error: googleapi: Error 403: Request had insufficient authentication scopes.
More details:
Reason: insufficientPermissions, Message: Insufficient Permission"
and the error I get from Console is:
"Please enable APIs for your project and try again
Please enable cloud resource manager API: https://console.developers.google.com/apis/api/cloudresourcemanager.googleapis.com/ and iam API: https://console.developers.google.com/apis/api/iam.googleapis.com/"
Note that this error is wrong, all the apis are active already. I'm quite sure this is a bug of KF but not sure how to find a workaround, any thoughts?
With CLI, I'm using my own account which has "owner" privileges.
Thanks
It seems you have an issue with IAM and the installation of Kubeflow, a 3rd party product that itself is not supported by us; nevertheless I went ahead and dig some information about this Machine Learning product.
The main issues (and although it seems you already cover permissions) are permissions, number of projects and some fine grained points.
I was checking and found out the following things that may help
a) Troubleshooting Kubeflow 1
b) Deploying Kubeflow in GKE[2]
c) Kubleflow auto deployer for GKE[3]
There are also some discussion about a mismatch permissions setting in Kubeflow that may be worth reading [4]
Finally there is a group that, also on a best-effort basis due the nature of Kubeflow:"google-kubeflow-support#google.com" that may come in handy.
I trust this information will be useful for you to solve your issue

Cloud SQL API [sql-component.googleapis.com] not enabled on project

I am running a cloud build trigger on a cloudbuid.yaml file in which I build a docker container and then deploy it to cloud run. The error stacktrace is as follows:
API [sql-component.googleapis.com] not enabled on project
The problem is that I have enabled both SQL and SQL Admin APIs in both projects (one for the cloud build and one for the database), which was confirmed in the console and in gcloud.
Here is the yaml code for the step I am referring to:
- name: 'gcr.io/cloud-builders/gcloud'
args: [
'beta',
'run',
'deploy',
'MY_NAME',
'--image', 'gcr.io/MY_PROJECT/MY_IMAGE',
'--region', 'MY_REGION',
'--platform', 'managed',
'--set-cloudsql-instances', 'MY_CONNECTION_NAME',
'--set-env-vars', 'NODE_ENV=production,INSTANCE_CONNECTION_NAME=MY_CONNECTION_NAME,SQL_USER=MY_USER,SQL_PASSWORD=MY_PASSWORD,SQL_NAME=MY_SCHEMA,TOPIC_NAME=MY_TOPIC'
]
Any suggestions?
Thanks.
P.S.: As per Eespinola suggestion, I checked and confirmed I am running Google Cloud SDK 254.0.0.
P.S. 2: I have also tried to create a project from scratch but ended up with the same results.
Ok so as per the same thread eespinola posted (see above), the Cloud Build gcloud step will be updated according to Cloud SDK 254.0.0 update in a near future (the actual date may or may not be posted in the same thread in the future). Until then, the alternative is to use the YAML file without the --add-cloudsql-instances flag and add it manually in the UI (I still have not tried this but it should work as per Google's development team).

Node-red in IBM Bluemix crashes while starting after sleeping (lite account)

After sleeping (in lite account type) node-red, created by node-red starter kit, crashes while starting. It is possible to login in editor for a few seconds and then it crashes with error code "an instance of the app crashed: APP/PROC/WEB: Exite with status 1 (out of memory)". Dashboard (node-red-dashboard) was installed before sleeping and worked correctly.
I tried to restart Node-RED, Stop and Start.
I solved this problem. The problem may be due to the memory overflow in the container Garden. Taking into account that the content is stored in the cache, the application cannot start after the restart process, it issues an Exit status 1 (out of memory) error.
The cache is updated only by pushing the application into the cloud.
An option that was checked for application recovery:
View the name of the database for NodeRED (which stores all information about the Node-RED) in Cloudant, for example, "nodered."
Install to PC Cloud Foundry Command Line Interface - CLI https://docs.cloudfoundry.org/cf-cli/install-go-cli.html
Download from github and unarchive the application's code bluemix-starter https://github.com/knolleary/node-red-bluemix-starter (clone or download -> download zip)
In the downloaded folder add a record to a manifest file (manifest.yml) in the env section, in which set the database name (for example, nodered) in Cloudant to environment variable NODE_RED_STORAGE_DB_NAME. Four spaces must be made before NODE_RED_STORAGE_DB_NAME. It is better to make changes using the Notepad ++ editor.
---
applications:
- memory: 256M
env:
OPTIMIZE_MEMORY: true
NODE_RED_STORAGE_DB_NAME: nodered
command: node index.js --settings ./bluemix-settings.js –v
Save the file after changing.
Run the command line (cmd) and then:
a. go to a folder with a downloaded project, such as Windows
cd c:/node-red-bluemix-starter
b. specify the api endpoint where the application is located, in our case:
cf api https://api.eu-gb.bluemix.net
c. send a registration command in the cloud
cf login
d. specify the mail and password (password is entered without explicit character display)
e. pushing the project by specifying the name of your instance Node-RED, for example NameApp
cf push NameApp

Google cloud datalab deployment unsuccessful - sort of

This is a different scenario from other question on this topic. My deployment almost succeeded and I can see the following lines at the end of my log
[datalab].../#015Updating module [datalab]...done.
Jul 25 16:22:36 datalab-deploy-main-20160725-16-19-55 startupscript: Deployed module [datalab] to [https://main-dot-datalab-dot-.appspot.com]
Jul 25 16:22:36 datalab-deploy-main-20160725-16-19-55 startupscript: Step deploy datalab module succeeded.
Jul 25 16:22:36 datalab-deploy-main-20160725-16-19-55 startupscript: Deleting VM instance...
The landing page keeps showing a wait bar indicating the deployment is still in progress. I have tried deploying several times in last couple of days.
About additions described on the landing page -
An App Engine "datalab" module is added. - when I click on the pop-out url "https://datalab-dot-.appspot.com/" it throws an error page with "404 page not found"
A "datalab" Compute Engine network is added. - Under "Compute Engine > Operations" I can see a create instance for datalab deployment with my id and a delete instance operation with *******-ompute#developer.gserviceaccount.com id. not sure what it means.
Datalab branch is added to the git repo- Yes and with all the components.
I think the deployment is partially successful. When I visit the landing page again, the only option I see is to deploy the datalab again and not to start it. Can someone spot the problem ? Appreciate the help.
I read the other posts on this topic and tried to verify my deployment using - "https://console.developers.google.com/apis/api/source/overview?project=" I get the following message-
The API doesn't exist or you don't have permission to access it
You can try looking at the App Engine dashboard here, to verify that there is a "datalab" service deployed.
If that is missing, then you need to redeploy again (or switch to the new locally-run version).
If that is present, then you should also be able to see a "datalab" network here, and a VM instance named something like "gae-datalab-main-..." here. If either of those are missing, then try going back to the App Engine console, deleting the "datalab" service, and redeploying.