SageMaker certificate issue with Kubernetes

I have created a Docker container that uses SageMaker via the Java SDK. This container is deployed on a k8s cluster with several replicas.
The container makes simple requests to SageMaker to list some models that we have trained and deployed. However, we are now having issues with a Java certificate. I am quite new to k8s and certificates, so I would appreciate any help in fixing the issue.
Here are some traces from the log when it tries to list the endpoints:
org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:394)
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:353)
at com.amazonaws.http.conn.ssl.SdkTLSSocketFactory.connectSocket(SdkTLSSocketFactory.java:132)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:141)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
at com.amazonaws.http.conn.$Proxy67.connect(Unknown Source)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1236)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
... 70 common frames omitted
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:397)
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:302)
at sun.security.validator.Validator.validate(Validator.java:262)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1621)
... 97 common frames omitted
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:392)
... 103 common frames omitted

This most likely has to do with a custom SSL certificate chain added to your network by your admin. You can inspect the root certificate by opening any HTTPS website in your browser and clicking the Secure link to the left of the address bar (at least, that is how it works in Chrome). A popup shows the certificate and certification information. Go to its Certification Path and look at the root certificate; if it is a custom certificate, you will need to add it to your cacerts file. Read this link for more details

I think I have found the answer to my problem. I set up another k8s cluster and deployed the container there as well. The replicas there work fine and the certificate issue does not occur. Investigating further, I noticed that there were DNS resolution issues on the first k8s cluster. In fact, the containers with certificate issues could not even ping google.com.
I fixed the DNS issue by not relying on core-dns and setting the DNS configuration in the deployment.yaml file instead. I do not fully understand why, but this seems to have fixed the certificate issue.
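To check from inside a pod whether name resolution is the culprit, it can help to resolve the endpoint hostname from the same JVM that makes the SageMaker calls. The following is only a minimal sketch: the class name DnsCheck and the default hostname (which assumes the us-east-1 region) are placeholders, not something taken from the question. One plausible explanation for the PKIX error, which this thread does not confirm, is that the name resolved to some other host that presented a certificate the JVM does not trust.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

class DnsCheck {
    /** Returns the addresses the JVM resolves for host, or an empty list if resolution fails. */
    static List<String> resolve(String host) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Resolution failed; the empty list signals that to the caller.
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Hypothetical default; pass the real endpoint hostname as the first argument.
        String host = args.length > 0 ? args[0] : "api.sagemaker.us-east-1.amazonaws.com";
        List<String> addresses = resolve(host);
        if (addresses.isEmpty()) {
            System.out.println("DNS resolution FAILED for " + host);
        } else {
            System.out.println(host + " -> " + addresses);
        }
    }
}
```

If the addresses printed in the failing cluster differ from those printed in the working one, the problem is probably DNS rather than the truststore.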

The error message you're receiving occurs when Java does not know about the root certificate returned by a TLS endpoint. This often happens when the set of available root certificates has been changed.
Per https://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/JSSERefGuide.html#Customization:
"If a truststore named <java-home>/lib/security/jssecacerts is found, it is used.
If not, then a truststore named <java-home>/lib/security/cacerts is searched for and used (if it exists).
Finally, if a truststore is still not found, then the truststore managed by the TrustManager will be a new empty truststore."
OpenSSL is a good tool for debugging such certificate issues. You can use the following command to retrieve the certificates returned by an endpoint, which may help you see what the certificate chain looks like.
openssl s_client -showcerts -connect www.example.com:443 </dev/null
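You can view the list of certificates that Java knows about using keytool, a utility shipped with the JRE (the -cacerts option needs Java 9 or later; on Java 8, point -keystore at the cacerts file instead).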
keytool -list -cacerts
Some system administrators will override the default certificates by writing an alternative truststore file into the default location. Other times, teams may override the default using the javax.net.ssl.trustStore system property.
Finally, you can use the jps utility, shipped with the JDK, to see the JVM arguments (including any -D system properties) passed to a running Java process.
jps -v
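The same information can also be read from inside the application, which is handy when you cannot easily run keytool or jps in the container. This is a rough sketch rather than a definitive diagnostic; the class name TrustStoreInfo is made up for illustration. It prints the truststore override property, if any, and counts the CA certificates that the JVM's default trust manager accepts.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

class TrustStoreInfo {
    /** Counts the CA certificates accepted by the JVM's default trust manager. */
    static int defaultTrustedCount() {
        try {
            TrustManagerFactory factory =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            // A null KeyStore tells the factory to use the JVM's default truststore.
            factory.init((KeyStore) null);
            int count = 0;
            for (TrustManager manager : factory.getTrustManagers()) {
                if (manager instanceof X509TrustManager) {
                    count += ((X509TrustManager) manager).getAcceptedIssuers().length;
                }
            }
            return count;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not load the default truststore", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.home = " + System.getProperty("java.home"));
        // Prints null when the default lookup (jssecacerts, then cacerts) is in effect.
        System.out.println("javax.net.ssl.trustStore = "
            + System.getProperty("javax.net.ssl.trustStore"));
        System.out.println("trusted CA certificates: " + defaultTrustedCount());
    }
}
```

A count of zero, or a count much lower than in a working container, suggests the truststore was replaced or could not be found.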

Related

Getting cert error even after adding CA cert to trust store on CentOS 8

I am trying to set up OpenShift 4.9 and am running into issues configuring the mirror registry. I have narrowed the issue down to a certificate error with quay.io
$ wget "https://quay.io/openshift-release-dev/ocp-release:4.8.15-x86_64"
--2021-10-25 16:57:27-- https://quay.io/openshift-release-dev/ocp-release:4.8.15-x86_64
Resolving quay.io (quay.io)... 35.172.159.14, 34.224.196.162, 3.216.152.103, ...
Connecting to quay.io (quay.io)|35.172.159.14|:443... connected.
ERROR: The certificate of ‘quay.io’ is not trusted.
ERROR: The certificate of ‘quay.io’ has been revoked.
I have downloaded the cert chain from quay.io and copied it to
/etc/pki/ca-trust/source/anchors/
Then I ran update-ca-trust as well as update-ca-trust extract
I checked the bundle and the certs are present.
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt
However, I keep getting the error that the cert for quay.io is not trusted.
Any pointers on how to fix this would be appreciated.
Two things may help:
First of all, make sure you added the right CA file to the anchors folder:
DigiCert High Assurance EV Root CA Self-signed
Fingerprint SHA256: 7431e5f4c3c1ce4690774f0b61e05440883ba9a01ed00ba6abd7806ed3b118cf
Pin SHA256: WoiWRyIOVNa9ihaBciRSC7XHjliYS9VwUGOIud4PB18=
Then check the result in /etc/pki/tls/certs/ca-bundle.crt
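To confirm that the file placed in the anchors folder really is the CA listed above, you can compute its SHA-256 fingerprint and compare it with the value given. This is a small sketch under the assumption that the file holds a single PEM or DER encoded X.509 certificate; the class name CertFingerprint is illustrative only.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;

class CertFingerprint {
    /** Returns the SHA-256 digest of data as lowercase hex without separators. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java CertFingerprint.java <certificate file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            X509Certificate cert = (X509Certificate)
                CertificateFactory.getInstance("X.509").generateCertificate(in);
            System.out.println("Subject: " + cert.getSubjectX500Principal().getName());
            // The fingerprint is the hash of the certificate's DER encoding.
            System.out.println("SHA256:  " + sha256Hex(cert.getEncoded()));
        }
    }
}
```

If the printed fingerprint does not match the one above, the wrong certificate (for example the leaf or an intermediate) was probably copied into the anchors folder.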

How to pass certificate path with Postgres database url string for SSL connection

I am trying to secure the connection to an AWS RDS instance over SSL for my Spring Boot application. I have looked at several blogs and the official documentation, and accordingly modified my connection string to contain certain parameters related to the SSL connection.
I have placed my certificate inside a cert folder in resources. Below is how I have tried to pass the certificate path:
jdbc:postgresql://myamazondomain.rds.amazonaws.com:5432/db_name?sslmode=verify-full&sslrootcert=/cert/rds-ca-cert_name.p12&password=my_passwrord
Also I have tried:
jdbc:postgresql://myamazondomain.rds.amazonaws.com:5432/db_name?sslmode=verify-full&sslrootcert=/src/main/resources/cert/rds-ca-cert_name.p12&password=mypassword
However, when I try to connect to the RDS from my ECS container, I receive the following error:
ERROR com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Exception during pool initialization.
org.postgresql.util.PSQLException: Could not open SSL root certificate file /cert/rds-ca-cert_name.p12.
at org.postgresql.ssl.LibPQFactory.<init>(LibPQFactory.java:120)
at org.postgresql.core.SocketFactoryFactory.getSslSocketFactory(SocketFactoryFactory.java:61)
at org.postgresql.ssl.MakeSSL.convert(MakeSSL.java:33)
at org.postgresql.core.v3.ConnectionFactoryImpl.enableSSL(ConnectionFactoryImpl.java:435)
at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:94)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:192)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:195)
at org.postgresql.Driver.makeConnection(Driver.java:454)
Can someone suggest what the error is here? What is the correct way of passing a certificate stored on the classpath to the JDBC connection string?
We need to use the SingleCertValidatingFactory class to specify a certificate file on the classpath (or from the file system, an environment variable, etc.). This class takes the argument sslfactoryarg, where we can give the path to the certificate file.
Your URL should look like:
jdbc:postgresql://myamazondomain.rds.amazonaws.com:5432/db_name?sslmode=verify-full&sslfactory=org.postgresql.ssl.SingleCertValidatingFactory&sslfactoryarg=classpath:cert/rds-ca-cert_name.p12
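As a rough illustration, the URL above can be assembled in code, together with a check that the certificate resource is actually visible on the classpath before the pool starts. The class and method names here (RdsJdbcUrl, build, onClasspath) are hypothetical helpers, not part of the PostgreSQL driver. Note also that, as far as I know, SingleCertValidatingFactory expects a PEM-encoded X.509 certificate, so a .p12 file may need converting first; treat that as an assumption to verify against the driver documentation.

```java
class RdsJdbcUrl {
    /** Builds a JDBC URL that validates the server against one certificate on the classpath. */
    static String build(String host, int port, String database, String certResource) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database
            + "?sslmode=verify-full"
            + "&sslfactory=org.postgresql.ssl.SingleCertValidatingFactory"
            + "&sslfactoryarg=classpath:" + certResource;
    }

    /** Returns true if the resource can be found by this class's class loader. */
    static boolean onClasspath(String resource) {
        return RdsJdbcUrl.class.getClassLoader().getResource(resource) != null;
    }

    public static void main(String[] args) {
        // Resource path is relative to the classpath root, without a leading slash.
        String certResource = "cert/rds-ca-cert_name.p12";
        System.out.println("certificate on classpath: " + onClasspath(certResource));
        System.out.println(build("myamazondomain.rds.amazonaws.com", 5432, "db_name", certResource));
    }
}
```

The original error came from passing a file system path (/cert/...) that does not exist inside the ECS container; a classpath: reference avoids depending on where the JAR is unpacked.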

TLS connection to MongoDB in Quarkus

I am attempting to connect from Quarkus to a MongoDB instance in the cloud which requires TLS. I have the certificate file for the server but cannot see how to use it with Quarkus.
I currently have the following properties set
quarkus.mongodb.connection-string = mongodb://blah:blah#mydomain.com:27017
quarkus.mongodb.database=school
quarkus.mongodb.tls=true
There does not appear to be anywhere to set the certificate file.
I cannot get past this error
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Warren
There are no specific TLS settings for MongoDB with Quarkus.
If your certificate has a known root (which doesn't seem to be the case), there is nothing more to do.
If your certificate is not known by your JVM, you need to use your JVM's keytool to import it.
Be careful: if you deploy your application as a native executable, there are some extra steps to follow: https://quarkus.io/guides/native-and-ssl

ClickOnce VSTO solution signed with mage.exe - certificate not trusted error

I'm trying to deploy a VSTO solution, which consists of 2 add-ins for Word and Outlook, using ClickOnce. Due to our deployment infrastructure/practices, I cannot publish it using Visual Studio; instead it is built on a build server and deployed via a deployment server.
For local development, a self-signed certificate is used. The deployment worked with this self-signed certificate (if the self-signed certificate was installed on the machine), but now I want to add a real company certificate so that the application can be deployed to the users.
During deployment, after the configuration files are poked, they are updated and re-signed with the real certificate. However, this produces the following error during installation:
System.Security.SecurityException: Customized functionality in this application will not work because the certificate used to sign the deployment manifest for <app name> or its location is not trusted. Contact your administrator for further assistance.
at Microsoft.VisualStudio.Tools.Applications.Deployment.ClickOnceAddInTrustEvaluator.VerifyTrustPromptKeyInternal(ClickOnceTrustPromptKeyValue promptKeyValue, DeploymentSignatureInformation signatureInformation, String productName, TrustStatus status)
at Microsoft.VisualStudio.Tools.Applications.Deployment.ClickOnceAddInTrustEvaluator.VerifyTrustUsingPromptKey(Uri manifest, DeploymentSignatureInformation signatureInformation, String productName, TrustStatus status)
at Microsoft.VisualStudio.Tools.Applications.Deployment.ClickOnceAddInTrustEvaluator.VerifyTrustUsingPromptKey(Uri manifest, DeploymentSignatureInformation signatureInformation, String productName)
at Microsoft.VisualStudio.Tools.Applications.Deployment.ClickOnceAddInDeploymentManager.ProcessSHA1Manifest(ActivationContext context, DeploymentSignatureInformation signatureInformation, PermissionSet permissionsRequested, Uri manifest, ManifestSignatureInformationCollection signatures, AddInInstallationStatus installState)
at Microsoft.VisualStudio.Tools.Applications.Deployment.ClickOnceAddInDeploymentManager.VerifySecurity(ActivationContext context, Uri manifest, AddInInstallationStatus installState)
at Microsoft.VisualStudio.Tools.Applications.Deployment.ClickOnceAddInDeploymentManager.InstallAddIn()
The Zone of the assembly that failed was:
MyComputer
The only lead I have is that, after re-signing, the values in publisherIdentity element are not changed (both .vsto and .manifest), only the Signature element has values corresponding to the new certificate.
Following commands are used to sign the .vsto and .manifest files (as far as I can see from the deployment scripts):
mage.exe -Update "[path to .vsto/.manifest]"
mage.exe -Sign "[path to .vsto/.manifest]" -CertHash [certificateHash]
where [certificateHash] is the thumbprint of the real certificate and is used to look up the certificate in the certificate stores. I'm told this is a security measure so that the certificate file doesn't have to be distributed along with the deployment package.
After signing, the files have their Signature values changed, but the publisherIdentity still has the name and issuerKeyHash of the self-signed certificate.
I tried poking these two values prior to re-signing, but I don't know how to calculate the issuerKeyHash.
Any advice on how to proceed would be much appreciated!
Edit:
I was trying out other mage.exe parameters, like '-TrustLevel FullTrust' (which didn't have any effect) or '-UseManifestForTrust True' along with Name and Publisher parameters, which yielded this error message (which is different than the one mentioned above).
************** Exception Text **************
System.InvalidOperationException: You cannot specify a <useManifestForTrust> element for a ClickOnce application that specifies a custom host.
at Microsoft.VisualStudio.Tools.Applications.Deployment.ClickOnceAddInDeploymentManager.GetManifests(TimeSpan timeout)
at Microsoft.VisualStudio.Tools.Applications.Deployment.ClickOnceAddInDeploymentManager.InstallAddIn()
The certificate that the app is signed with isn't trusted by Windows. As a workaround,
Right click on setup.exe,
Select Properties, then the Digital Signatures tab
Select Vellaichamy/user then click Details
Click View Certificate and click Install Certificate.
Do not let it automatically choose where to store the cert; install the certificate in the Trusted Root Certification Authorities store. Once the cert is installed, the app should install...
Take a look at the Granting Trust to Office Solutions article which states the following:
If you sign the solution with a known and trusted certificate, the solution will automatically be installed without prompting the end user to make a trust decision. After a certificate is obtained, the certificate must be explicitly trusted by adding it to the Trusted Publishers list.
For more information, see How to: Add a Trusted Publisher to a Client Computer for ClickOnce Applications.
Also you may find the Deploying an Office Solution by Using ClickOnce article helpful.
We have found what the problem was. We were using a version of the mage.exe tool from the Windows SDK, from a folder named 7A (I don't remember the full paths, sorry). A colleague then found another folder with versions 7A, 8 and 8A. Once we took the .exe from the 8A folder, the installation worked as expected.
Try copying all the necessary files to the client computer then install. If you can avoid installing from the network drive you might be able to avoid this exception.

Not able to connect to cluster. Facing Certificate signed by unknown authority

I am not sure whether what I am trying to do is possible or the correct way.
One of my colleagues spun up a Kubernetes GCE cluster (with 1 master and 4 minions) in a project which is shared with me with owner access.
After setup he shared his ~/.kubernetes_auth keys along with .kubecfg.crt, .kubecfg.ca.crt and .kubecfg.key. I copied all of them to my home folder and set up the Kubernetes workspace.
I also set the project name as the default project in geconfig, and now I can connect to the master and slaves using 'gcutil ssh --zone us-central1-b kubernetes-master'
But when I try to list the existing pods using 'cluster/kubecfg.sh list pods'
I see
"F1017 21:05:31.037148 18021 kubecfg.go:422] Got request error: Get https://107.178.208.109/api/v1beta1/pods?namespace=default: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "ChangeMe")
I tried to debug from my side but failed to come to any conclusion. Any sort of clue would be helpful.
You can also copy the cert files off of the master again. They are located in /usr/share/nginx on the master.
It is probably due to a feature that is not implemented yet; see this issue:
https://github.com/GoogleCloudPlatform/kubernetes/issues/1886
You can copy the files from /usr/share/nginx/... on the master
into your home dir and try again.
I figured out a workaround: set the -insecure_skip_tls_verify option
In kubecfg.sh, change the code near the bottom to
else
auth_config=(
"-insecure_skip_tls_verify"
)
fi
Obviously this is insecure: with verification disabled, you are exposed to man-in-the-middle attacks, among other risks.