WSO2 ESB Deployment Synchronizer stuck (can't gracefully shutdown or deploy services)

We are facing some issues with the WSO2 ESB Deployment Synchronizer. Since we have a clustered configuration, we use svn to store the content of "repository/deployment/server". The carbon.xml configuration is the following:
<DeploymentSynchronizer>
<Enabled>true</Enabled>
<AutoCommit>false</AutoCommit><!--true for the mgt node-->
<AutoCheckout>true</AutoCheckout>
<RepositoryType>svn</RepositoryType>
<SvnUrl>https://svn/x/trunk/serverESB/desenv/</SvnUrl>
<SvnUser>user</SvnUser>
<SvnPassword>password</SvnPassword>
<SvnUrlAppendTenantId>false</SvnUrlAppendTenantId>
</DeploymentSynchronizer>
It works correctly for some time, but after some deploys and undeploys it stops working. It still logs that it is going to synchronize, and the svn update seems to be performed correctly, but the ESB does not load the newly deployed XMLs:
TID: [0] [ESB] INFO {org.wso2.carbon.core.deployment.SynchronizeRepositoryRequest} - Received [SynchronizeRepositoryRequest{tenantId=-1234, tenantDomain='carbon.super', messageId=f9b51e23-8a3c-4f08-acb0-5a1f0f4590b2}] {org.wso2.carbon.core.deployment.SynchronizeRepositoryRequest}
TID: [0] [ESB] INFO {org.wso2.carbon.core.deployment.SynchronizeRepositoryRequest} - Going to synchronse artefacts. {org.wso2.carbon.core.deployment.SynchronizeRepositoryRequest}
Normally, after this message it prints an INFO line saying that new services were deployed, but that does not occur.
When I try to shut down the server, it prints "Waiting for deployment completion..." and gets stuck (so I have to kill it using "kill -9"):
TID: [0] [ESB] INFO {org.wso2.carbon.core.ServerManagement} - Waiting for deployment completion... {org.wso2.carbon.core.ServerManagement}
If I restart it manually, all the deployments work fine, and the synchronizer starts working again (for some time).
P.S.: I've tried both the OS's svn client (SUSE) and the SVNKit module. Our svn repository version is 1.5.1.
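For anyone comparing nodes, here is a minimal Python sketch that reads the three boolean switches from the fragment above and labels the node. It relies only on the comment in the configuration (AutoCommit is true on the management node); the role labels and the helper names read_flags and node_role are made up for this sketch and are not part of WSO2.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the question (credentials and URL left out).
# A full carbon.xml declares an XML namespace on its root element, so
# lookups there would need a namespace prefix; this sketch only handles
# the bare fragment.
FRAGMENT = """
<DeploymentSynchronizer>
  <Enabled>true</Enabled>
  <AutoCommit>false</AutoCommit>
  <AutoCheckout>true</AutoCheckout>
  <RepositoryType>svn</RepositoryType>
  <SvnUrlAppendTenantId>false</SvnUrlAppendTenantId>
</DeploymentSynchronizer>
"""


def read_flags(xml_text):
    """Return the three boolean switches as a dict of name -> bool."""
    root = ET.fromstring(xml_text)

    def is_true(tag):
        node = root.find(tag)
        return node is not None and (node.text or "").strip().lower() == "true"

    return {tag: is_true(tag) for tag in ("Enabled", "AutoCommit", "AutoCheckout")}


def node_role(flags):
    """Classify a node from its flags (the labels are this sketch's own)."""
    if not flags["Enabled"]:
        return "disabled"
    if flags["AutoCommit"]:
        return "manager"  # commits local changes to svn
    if flags["AutoCheckout"]:
        return "worker"   # only pulls changes from svn
    return "idle"         # enabled, but neither commits nor checks out


if __name__ == "__main__":
    print(node_role(read_flags(FRAGMENT)))
```

For the fragment in the question this prints "worker", which matches a node that only checks out from svn.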

There are a few docs out there which are not up to date, which makes things more difficult. I have tried those myself and ended up with unexpected problems.
Have you tried the instructions provided in the latest WSO2 product clustering docs?
http://docs.wso2.org/wiki/display/Cluster/Creating+a+Cluster
http://docs.wso2.org/wiki/display/Cluster/Configuring+Deployment+Synchronizer
This information is up to date and well tested with a sample setup (an ESB cluster with one manager node and three worker nodes fronted by an Elastic Load Balancer). If you have followed those instructions, this should work fine. If you followed the same steps and still got stuck with this issue, please confirm whether or not you followed the instructions in these documents.
Thanks.

Related

Azure Devops: Pipeline fails to deploy to Linux Web App

I have a pipeline deploying to my Azure web app that errors out most of the time because it can't deploy to the web app. The task takes around 25 minutes:
...
Copying file: 'frontend/.gitignore'
Copying file: 'frontend/README.md'
Copying file: 'frontend/package.json'
Copying file: 'frontend/tsconfig.json'
Copying file: 'frontend/yarn.lock'
Omitting next output lines...
An error has occurred during web site deployment.
Kudu Sync failed
\n/opt/Kudu/Scripts/starter.sh "/home/site/deployments/tools/deploy.sh"
##[error]Failed to deploy web package to App Service.
##[error]To debug further please check Kudu stack trace URL : https://$someapp:***#someapp.scm.azurewebsites.net/api/vfs/LogFiles/kudu/trace
##[error]Error: Package deployment using ZIP Deploy failed. Refer logs for more details.
...
When I enable system.debug = true, I see these logs repeated many times before it starts copying the artifact files:
POLL URL RESULT: {"statusCode":202,"statusMessage":"Accepted","headers":{"transfer-encoding":"chunked","content-type":"application/json; charset=utf-8","location":"http://XXXXXXXXX.scm.azurewebsites.net:80/api/deployments/latest?deployer=VSTS_ZIP_DEPLOY&time=2021-07-09_09-01-41Z","server":"Kestrel","date":"Fri, 09 Jul 2021 09:23:37 GMT","connection":"close"},"body":{"id":"68a7a8811796416b993924437493ff87","status":0,"status_text":"Building and Deploying '68a7a8811796416b993924437493ff87'.","author_email":"N/A","author":"N/A","deployer":"VSTS_ZIP_DEPLOY","message":"Created via a push deployment","progress":"Running deployment command...","received_time":"2021-07-09T09:01:50.4159225Z","start_time":"2021-07-09T09:01:51.775357Z","end_time":null,"last_success_end_time":null,"complete":false,"active":false,"is_temp":false,"is_readonly":true,"url":null,"log_url":null,"site_name":"XXXXXXXXXXXXe"}}
Deployment status: 0 'Building and Deploying '68a7a8811796416b993924437493ff87'.'. retry after 5 seconds
setting affinity cookie ["ARRAffinity=c06e9bb74f52245b3695b3079a52f6acbc70c3ee812f67e4fa3f5f65088ff4f7;Path=/;HttpOnly;Secure;Domain=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX.scm.azurewebsites.net","ARRAffinitySameSite=c06e9bb74f52245b3695b3079a52f6acbc70c3ee812f67e4fa3f5f65088ff4f7;Path=/;HttpOnly;SameSite=None;Secure;Domain=XXXXXXXXXXXXXXX.scm.azurewebsites.net"]
[GET]https://XXXXXXXXXXX-test.scm.azurewebsites.net:443/api/deployments/latest?deployer=VSTS_ZIP_DEPLOY&time=2021-07-09_09-01-41Z
POLL URL RESULT: {"statusCode":202,"statusMessage":"Accepted","headers":{"transfer-encoding":"chunked","content-type":"application/json; charset=utf-8","location":"http://XXXXXXXXXXXXXXXXXXX.scm.azurewebsites.net:80/api/deployments/latest?deployer=VSTS_ZIP_DEPLOY&time=2021-07-09_09-01-41Z","server":"Kestrel","date":"Fri, 09 Jul 2021 09:23:45 GMT","connection":"close"},"body":{"id":"68a7a8811796416b993924437493ff87","status":0,"status_text":"Building and Deploying '68a7a8811796416b993924437493ff87'.","author_email":"N/A","author":"N/A","deployer":"VSTS_ZIP_DEPLOY","message":"Created via a push deployment","progress":"Running deployment command...","received_time":"2021-07-09T09:01:50.4159225Z","start_time":"2021-07-09T09:01:51.775357Z","end_time":null,"last_success_end_time":null,"complete":false,"active":false,"is_temp":false,"is_readonly":true,"url":null,"log_url":null,"site_name":"XXXXXXXXXXXX"}}
Deployment status: 0 'Building and Deploying '68a7a8811796416b993924437493ff87'.'. retry after 5 seconds
setting affinity cookie ["ARRAffinity=c06e9bb74f52245b3695b3079a52f6acbc70c3ee812f67e4fa3f5f65088ff4f7;Path=/;HttpOnly;Secure;Domain=XXXXXXXXXXXXXXXXXXXXXX.scm.azurewebsites.net","ARRAffinitySameSite=c06e9bb74f52245b3695b3079a52f6acbc70c3ee812f67e4fa3f5f65088ff4f7;Path=/;HttpOnly;SameSite=None;Secure;Domain=XXXXXXXXXXXXXXXXXX"]
This task fails only in one specific slot of my web app. The other slots and the production slot work fine, and there the job takes around 6 minutes.
Any ideas what could be wrong?

As per the discussion and troubleshooting performed here, I tried to setup a Linux App Service on Standard S1 pricing tier enabling 5 (max) slots with CI/CD configured via Azure Pipelines. Unfortunately, I wasn't able to reproduce the same error as yours despite multiple different trials.
I'd suggest trying the following:
Kudu Sync failed in the deployment log resembles this open issue from about a year ago: ZipDelpoy on azure web app linux fails during kudu sync #2972. Please check the trace/deployment log files on Kudu at https://<appname>.scm.azurewebsites.net/api/vfs/LogFiles/kudu/trace or /deployment, or from Kudu's DebugConsole (/LogFiles/kudu/*), and check whether this is caused by deployment lock failures. If so, check this wiki out for dealing with locked files during deployment.
Try a different deployment method like run from package (to avoid resource locking), using FTP/S, or local git deployment.
This should help you narrow down the issue further, whether it is caused in the App service/deployment method, or the ADO pipeline/task.
Scale up to the next higher tier and re-trigger your pipeline. If it succeeds, you may scale back down to the original tier. This would indirectly restart your SCM sites as well.
If the above workarounds don't help, you could check on the following:
Customize your deploy task with options like TakeAppOfflineFlag, DeploymentType or RenameFilesFlag to streamline your deployment.
Try restarting the app/slot just before the deployment in order to recycle the app pool.
Check if your app is running into any of the prescribed limits (ex: file system storage) for your tier.
Drill down into available metrics for your app to identify any CPU/Memory anomalies.
Try the Diagnose and solve problems tool for any additional insights about your app.
If your environment permits, try setting up and deploying to a new slot within your App Service, or try verifying if this happens to another app in a different region.
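The debug output in the question also shows how long the deployment had been in "Building and Deploying" at each poll: compare start_time in the response body with the date header of the poll response. A small Python sketch of that arithmetic, using the values from the first POLL URL RESULT entry (the helper name elapsed_seconds is just for this example):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Values copied from the first POLL URL RESULT entry in the question.
poll_date_header = "Fri, 09 Jul 2021 09:23:37 GMT"
start_time = "2021-07-09T09:01:51.775357Z"


def elapsed_seconds(start_iso, date_header):
    """Seconds between the deployment start_time and the poll response date."""
    # start_time is UTC ("Z" suffix) with microsecond precision.
    started = datetime.strptime(start_iso, "%Y-%m-%dT%H:%M:%S.%fZ")
    started = started.replace(tzinfo=timezone.utc)
    # HTTP date headers parse to a timezone-aware datetime.
    polled = parsedate_to_datetime(date_header)
    return (polled - started).total_seconds()


if __name__ == "__main__":
    seconds = elapsed_seconds(start_time, poll_date_header)
    print(f"still building after {seconds / 60:.1f} minutes")
```

With those values the deployment command had already been running for about 21.8 minutes, so most of the 25-minute task is spent in the build/sync step on the Kudu side rather than in uploading the package.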

Google cloud datalab deployment unsuccessful - sort of

This is a different scenario from the other questions on this topic. My deployment almost succeeded, and I can see the following lines at the end of my log:
[datalab].../#015Updating module [datalab]...done.
Jul 25 16:22:36 datalab-deploy-main-20160725-16-19-55 startupscript: Deployed module [datalab] to [https://main-dot-datalab-dot-.appspot.com]
Jul 25 16:22:36 datalab-deploy-main-20160725-16-19-55 startupscript: Step deploy datalab module succeeded.
Jul 25 16:22:36 datalab-deploy-main-20160725-16-19-55 startupscript: Deleting VM instance...
The landing page keeps showing a wait bar indicating the deployment is still in progress. I have tried deploying several times in the last couple of days.
About the additions described on the landing page:
An App Engine "datalab" module is added. - When I click on the pop-out URL "https://datalab-dot-.appspot.com/", it shows an error page with "404 page not found".
A "datalab" Compute Engine network is added. - Under "Compute Engine > Operations" I can see a create-instance operation for the datalab deployment with my ID, and a delete-instance operation with the *******-ompute#developer.gserviceaccount.com ID. I am not sure what that means.
A Datalab branch is added to the git repo. - Yes, with all the components.
I think the deployment is partially successful. When I visit the landing page again, the only option I see is to deploy Datalab again, not to start it. Can someone spot the problem? I appreciate the help.
I read the other posts on this topic and tried to verify my deployment using "https://console.developers.google.com/apis/api/source/overview?project=", but I get the following message:
The API doesn't exist or you don't have permission to access it

You can try looking at the App Engine dashboard here, to verify that there is a "datalab" service deployed.
If that is missing, then you need to redeploy (or switch to the new locally-run version).
If that is present, then you should also be able to see a "datalab" network here, and a VM instance named something like "gae-datalab-main-..." here. If either of those is missing, then try going back to the App Engine console, deleting the "datalab" service, and redeploying.

Clearing a faulty service in WSO2

I recently built a proxy service in WSO2 ESB that was not fully implemented. When I saved it, it generated a faulty service message, and a link to a faulty service group was shown at the top of the ESB console. Since then, I have corrected the service, and it has been transferring files as intended. However, the service does not show in the initial list of services, and I have to click on the link 7 deployed service group(s) to see it in the proxy list.
Here is what the link looks like:
6 active services. 7 deployed service group(s). 1 faulty service(s).
When I click on 1 faulty service(s), I see the following, but I cannot delete it to clear it:
Faulty Service Actions
RenZipExtractProxy proxy
Unable to configure the service RenZipExtractProxy for the VFS transport: Service doesn't have configuration information for transport vfs. This service is being marked as faulty and will not be available over the VFS transport.
How do I clear this faulty service issue? My updated service works fine but I continue to get the faulty service situation as stated above.

You can manually delete deployment artifacts in the directory <ESB>/repository/deployment/server/synapse-configs/default/proxy-services

I have noticed that when a WSO2 ESB proxy service is configured with the VFS transport but no VFS parameters are provided, it ends up under Faulty Services. I resolved this problem by specifying the VFS-related parameters.
Please refer to https://docs.wso2.com/display/ESB481/VFS+Transport
Note: If you try to delete the faulty proxy service from the console, it will not be deleted. Please delete it from the file system instead:
<ESB_HOME>/repository/deployment/server/synapse-configs/default/proxy-services
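If you prefer to script the cleanup, here is a minimal Python sketch. It assumes the artifact file is named after the proxy (for example RenZipExtractProxy.xml), which you should verify in your own directory first; the helper names and the dry-run default are choices made for this sketch, not anything provided by WSO2.

```python
from pathlib import Path


def find_proxy_artifacts(esb_home, proxy_name):
    """Return XML files in proxy-services whose file stem equals proxy_name."""
    proxy_dir = (Path(esb_home) / "repository" / "deployment" / "server"
                 / "synapse-configs" / "default" / "proxy-services")
    if not proxy_dir.is_dir():
        return []
    # Assumption: one file per proxy, named <proxy name>.xml.
    return sorted(p for p in proxy_dir.glob("*.xml") if p.stem == proxy_name)


def remove_artifacts(paths, dry_run=True):
    """Delete the given files; with dry_run=True only report what would go."""
    for path in paths:
        if dry_run:
            print("would delete", path)
        else:
            path.unlink()
            print("deleted", path)
```

Call find_proxy_artifacts with your ESB home directory and the proxy name, review the returned paths, and only then call remove_artifacts with dry_run=False.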

Load balancing MySQL ndbcluster

I have successfully set up ndbcluster version 7.1.26.
It contains 2 data nodes [NDBD], 2 MySQL [MYSQLD] nodes, and one management [MGMD] node.
Replication works successfully.
My web application is deployed in JBoss-5.0.1 and uses JNDI for its connection resources, which are specified in the application-specific ds.xml file in load-balanced URL form, e.g. jdbc:mysql:loadbalance://host1:port1,host2:port2/databaseName.
host1 : refers to first mysqld node and port1 refers the port it is running on.
host2 : refers to second mysqld node and port2 refers the port it is running on.
When both of the [MYSQLD] nodes are up and running, everything works fine: the cluster responds well, replicates data, and data retrieval operations also work properly.
But issues arise when either of the [MYSQLD] nodes goes down. Data still gets inserted/updated/replicated, but the application is unable to retrieve data from the cluster, and the web page stays busy waiting for the data. As soon as the node which was down comes back up, the application responds properly and shows the data retrieved from the cluster.
At JBoss 5.0.1 startup, a NullPointerException was reported at LoadBalancingConnectionProxy.invoke(LoadBalancingConnectionProxy.java:439). Please tell me whether this exception plays any role in the issues described above.
If anyone has faced issues like these and has a solution, please let me know.
Thanks and regards.

I have resolved the issue; it was a bug in the Connector/J version.
The project I am working on was using both the buggy jar, mysql-connector-java-5.0.8.jar, and the version in which the issue is already fixed, mysql-connector-java-5.1.13-bin.jar.
After a lot of searching, I removed mysql-connector-java-5.0.8.jar and my issues were resolved.
The whole problem was that the Connector/J driver was being loaded from the buggy jar.
The bug ID and URL for this issue:
http://bugs.mysql.com/bug.php?id=31053
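To check whether a deployment has the same problem, a short Python sketch like the following can list every Connector/J jar under a directory, grouped by version. The file-name pattern is inferred from the two jar names above, and the function name is just for this example; point it at whatever lib directories your server and application actually use.

```python
import re
from pathlib import Path

# Matches names such as mysql-connector-java-5.0.8.jar and
# mysql-connector-java-5.1.13-bin.jar (pattern inferred from those two names).
JAR_PATTERN = re.compile(r"^mysql-connector-java-(\d+(?:\.\d+)*)(?:-bin)?\.jar$")


def connector_versions(lib_dir):
    """Map each Connector/J version found under lib_dir to its jar paths."""
    found = {}
    for jar in Path(lib_dir).rglob("*.jar"):
        match = JAR_PATTERN.match(jar.name)
        if match:
            found.setdefault(match.group(1), []).append(jar)
    return found


def has_conflict(lib_dir):
    """True when more than one Connector/J version is present."""
    return len(connector_versions(lib_dir)) > 1
```

If has_conflict returns True, which driver version gets loaded depends on class loading order, which is how the old jar ended up being used here.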
Thanks for your consideration.

Are you using different userids and passwords for each of the hosts(host1, host2) specified in the tag ? (Either directly or using tag) ?

Trouble adding a new service

I have followed the instructions at https://github.com/cloudfoundry/oss-docs/tree/master/vcap/adding_a_system_service, copied the echo service, and created my new service. (That document is somewhat out of date in that "excluded components" no longer exists.)
In any case, my service shows up as running with a gateway and a node when I look at 'vcap status' on the server. However, when I look at 'vmc services' from the client my service is not in the list. Where is this list maintained and why is my service not on the list?
Various services, including blob, filesystem, mongodb, etc., are shown in the 'vmc services' list even though they have never been included in my config. Where is this maintained and why are other services on this list?
The cloud_controller.log file shows a "Create service request:" for echo every minute. This service is not in my config file (it was once but it was removed and I repeated the deployment). What is prompting this request for a service that was not defined in the config?
The _gateway.log for my service shows the following:
INFO -- Sending info to cloud controller: ...api.vcap.me/services/v1/offerings
INFO -- Fetching handles from cloud controller .../offerings/.../handles
ERROR -- Failed registering with cloud controller, status=400
DEBUG -- [GaaS-Provisioner] Connected to node mbus..
ERROR -- Failed fetching handles, status=404
Why does my gateway fail to register with the cloud controller? I have found some reports that suggest that the problem is with domain name mapping. I have verified that the server can find itself:
$ curl api.vcap.me
Welcome to VMware's Cloud Application Platform
What can I do to register my service?

You can also try asking your question on the vcap_dev google group.
https://groups.google.com/a/cloudfoundry.org/forum/?fromgroups#!forum/vcap-dev
They are focused on answering and discussing OSS subjects for Cloud Foundry!

If you follow the document correctly things should work just fine. I understand that the mechanism for maintaining the excluded list of components has changed and can be a point of confusion when following the steps mentioned in the article (just ignore that step totally).
ERROR -- Failed registering with cloud controller, status=400
Well, this is the worrying part. I recently followed the article step by step and was able to add a new service.
Is the echo service showing up in vmc services?
Have you copied the yml files for the node and gateway to ./cloudfoundry/.deployments/devbox/config?
Are the tokens for your gateway unique, and do they match in the two files, ./cloudfoundry/.deployments/devbox/config/cloud_controller.yml and ./cloudfoundry/.deployments/devbox/config/_gateway.yml?
I would recommend that you first concentrate on getting the echo service to be listed in the vmc services output. Once done with this you should replicate the steps (with absolute care to modify things like the token) to get your custom service working.
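A quick way to compare the tokens is a script along these lines. This is only a sketch: it assumes tokens appear in both files as plain "token: value" lines, and the sample YAML in the usage note below is hypothetical, so check it against your real files before trusting the result.

```python
import re

# Assumption: each token sits on its own "token: <value>" line.
TOKEN_LINE = re.compile(r"^[ \t]*token:[ \t]*['\"]?([^'\"\s#]+)", re.MULTILINE)


def tokens_in(yaml_text):
    """Return every token value found in the given YAML text, in order."""
    return TOKEN_LINE.findall(yaml_text)


def tokens_unique(cloud_controller_yml):
    """True when no token is listed twice in cloud_controller.yml."""
    tokens = tokens_in(cloud_controller_yml)
    return len(tokens) == len(set(tokens))


def gateway_token_known(cloud_controller_yml, gateway_yml):
    """True when the gateway's token also appears in cloud_controller.yml."""
    gateway_tokens = tokens_in(gateway_yml)
    if not gateway_tokens:
        return False
    return gateway_tokens[0] in tokens_in(cloud_controller_yml)
```

Read both files with open(...).read() and pass the text in. For a hypothetical controller file listing "token: changeechotoken" under the echo service and a gateway file with the same line, gateway_token_known returns True; any mismatch returns False and is worth fixing before looking further.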
Cheers,
Ankit

You should follow this guide.
It worked for me.
Regards.
