I’m testing Kubeflow Pipelines and would like to use it on AWS/on-prem, but I saw the comment below in the documentation. Should I wait before using it with AWS/on-prem?
Due to kubeflow/pipelines#345 and kubeflow/pipelines#337, Kubeflow Pipelines depends on Google Cloud Platform (GCP) services and some of the functionality is currently not supported by non-GKE clusters.
I'd suggest waiting. See the status updates on those issues and the activity on kubeflow/pipelines#1131 to enable support on AWS. Similar work is in progress for supporting on-prem as well.
On-prem is definitely possible.
KFP is not tied to GCP and can work even on Windows tablets.
AWS support is mostly provided by the community, but most of the issues have been fixed. There are samples that demonstrate KFP on AWS, and there are multiple AWS and SageMaker components created by people from AWS. https://github.com/kubeflow/pipelines/blob/091316b8bf3790e14e2418843ff67a3072cfadc0/samples/contrib/aws-samples/titanic-survival-prediction/titanic-survival-prediction.py
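To illustrate why KFP itself is cloud-agnostic, here is a minimal pipeline definition (a sketch using the KFP v1 SDK; the pipeline name, image, and command are placeholders, not taken from the linked sample):

```python
from kfp import compiler, dsl

@dsl.pipeline(name="hello-pipeline", description="Runs a single container step.")
def hello_pipeline():
    # ContainerOp runs an arbitrary container image, so the step itself is
    # cloud-agnostic: the same definition runs on GKE, EKS, or on-prem.
    dsl.ContainerOp(
        name="hello",
        image="python:3.7-slim",
        command=["python", "-c", "print('hello from KFP')"],
    )

if __name__ == "__main__":
    # Compile to an Argo workflow package that any KFP installation can run.
    compiler.Compiler().compile(hello_pipeline, "hello_pipeline.yaml")
```

Nothing in the definition references a specific cloud; platform-specific behavior only enters through the components you choose to use.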
Can somebody shed some light on IBM Cloud deployment tools / platforms?
I am new to it, so I am reading the docs and watching videos, and I am still confused.
What I want to achieve is the typical scenario: fetch code from a repo, build it, test it, deploy it to the cloud. I have found several strategies / platforms for achieving that, but I still can't see the differences and advantages / disadvantages between them.
So we have:
toolchain
cloud foundry
code engine
Continuous Delivery (service)
and maybe something more? :)
I am watching a "Cloud Foundry explained" video, and the presenter says that if you do not want to care about the bottom parts, like networking, security, and containers, you can choose to deploy using the K8s service. Wtf? So from a totally automatic thing, you can now handle some things in Cloud Foundry by yourself. For me it is a total mix of everything together, and I don't know which tool / platform / strategy to use.
Any comment is appreciated.
It all depends on your requirements.
IBM DevOps/Continuous Delivery/Toolchains is a set of services that can build and deploy your application to a given runtime. You can find various tutorials here - https://www.ibm.com/cloud/architecture/courses/toolchain-tutorials. These tutorials show you various things that you can embed in your build pipelines (like code scanning with CRA, image scanning, signing, etc.)
These runtimes can be different depending on your requirements:
Cloud Foundry, where you deploy an app using a buildpack, but this is a fading technology, so I wouldn't recommend it
as a Docker image in a K8s/OpenShift cluster - use this if your organization is planning on or already utilizing Docker/Kubernetes/OpenShift. You will need to create the K8s/OpenShift cluster first.
as a serverless app, using IBM Code Engine
If you are just starting and only want to deploy a simple, single app to the cloud, I'd consider using IBM Code Engine and not investigating Toolchains for now. Check out the basic demo here - How to deploy source code with IBM Cloud Code Engine
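For example, a minimal Python web app that Code Engine can build and deploy straight from a source repo might look like this (a sketch only; it assumes Flask is declared in requirements.txt, and that the platform injects the listening port via the PORT environment variable, as Knative-based platforms typically do):

```python
import os

from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello from Code Engine!"

if __name__ == "__main__":
    # Bind to the port the platform provides; fall back to 8080 for local runs.
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", "8080")))
```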
As I dive into the world of Cloud Composer, Airflow, Google Kubernetes Engine, and Kubernetes, I've not yet found a good answer to what exactly makes Cloud Composer better than Helm and GKE.
Here are some things I've found that could be unique to Composer but mostly seem like they could be handled by GKE.
On their homepage:
End-to-end integration with Google Cloud products including BigQuery, Dataflow, Dataproc, Datastore, Cloud Storage, Pub/Sub, and AI Platform gives users the freedom to fully orchestrate their pipeline.
On the features page:
Identity-Aware Proxy protects the interface
Cloud Composer associates a Cloud Storage bucket with the environment. The associated bucket stores the DAGs, logs, custom plugins, and data for the environment.
The downsides of Composer I've seen include:
It takes many hours to spin up a new instance
It doesn't support Kubernetes Executor
It is risky to change the underlying GKE config because it could be reverted by a Composer update
Errors often happen during auto-scaling, but they are documented as known issues
Upgrading environments is still in beta
To be clear, I'm not saying Cloud Composer is bad. I'm just having trouble seeing why people like it. When I've asked folks why it is better than Helm + GKE, they haven't had any compelling answers, even though they can tell many stories of Composer being unpredictable and having lots of issues.
Are you comparing the same things?
On one side, with GKE, you have a container orchestrator. You declare what you want, and it deploys and maintains the stability of the cluster according to the declared configuration. This configuration can be packaged with Helm to make it easier to write. Because you deploy containers, you can use whatever language you want in your services.
On the other side, you have a workflow manager, with a scheduler, retry policies, parallel tasks, and context forwarding. You write DAGs in Python (only!), and you have operators to interact with external products/services. It's mainly designed for data processing and is used a lot by data science and data engineering teams.
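For illustration, here is a minimal Airflow DAG of the kind Composer schedules (a sketch only: the DAG id, schedule, and task logic are made up, and the import paths assume Airflow 2.x):

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("fetching data")

def transform():
    print("processing data")

# The retry policy is handled by the workflow manager, not by your code.
default_args = {"retries": 2, "retry_delay": timedelta(minutes=5)}

with DAG(
    dag_id="example_etl",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    default_args=default_args,
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    # The >> operator declares the dependency: transform runs after extract.
    extract_task >> transform_task
```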
Note: Cloud Composer is deployed on top of GKE (scheduler and workers), Redis, App Engine, and Cloud SQL.
You are comparing two different worlds: the Ops world (GKE/Helm) and the App/Data world (Composer/Airflow). Have a look at this new video.
Update 1:
My bad, I didn't understand! Anyway, personally I don't want to manage things myself: a cluster, K8s updates, VM patching, replicas, snapshots, backup/restore, ...
If someone can do this for me, I prefer that, and managed services are perfect for me!
Do you ask yourself this question about Cloud SQL versus a database you manage yourself on a Compute Engine instance? If not (because Cloud SQL solves a lot of boring issues), my opinion is the same for Composer.
But it's an opinion; I haven't tested both and compared the performance, cost, and ease of use.
I am trying to use Kubeflow on GCP, and I am following this codelab, but "click-to-deploy" is no longer supported, so I followed the "kubectl and kpt" documentation instead. However, I keep getting this error: "You cannot perform this action because the Cloud SDK component manager is disabled for this installation," and none of the solutions I found worked. Two friends have told me they have been trying to make Kubeflow work since last year and it never worked, but I still see people posting questions about Kubeflow on Stack Overflow, so I want to ask: is it still working? If so, where can I find a decent guide to follow?
Thanks!
I finally got it working. As for that error message, it turned out that I just hadn't installed the Cloud SDK properly. There will be a lot of other issues down the road too, but at least the Kubeflow web UI is working for me now.
Yes. As the kubectl and kpt guide says, the first step in getting prepared to install the cluster is installing gcloud, the CLI that manages authentication, local configuration, developer workflow, and interactions with the Google Cloud APIs.
Without it you simply can't work with objects (in your case you need to enable kpt anthoscli beta) or perform tasks like
creating a Compute Engine VM instance, managing a Google Kubernetes Engine cluster, and deploying an App Engine application, either from the command line or in scripts and other automations.
Currently, I deploy Python scripts on Kubernetes using Codefresh. I'm looking to incorporate Kubeflow into the deployment plan to get all the Kubeflow goodies, such as the UI, but I'm a little clueless about how to start or where to look.
The docs for Kubeflow mainly cover setting up on Google Cloud Platform. Does anybody have any experience with this?
You can use these instructions to install Kubeflow on any existing Kubernetes cluster, regardless of whether you are running on GCP or any other platform.
These steps can be converted into a Codefresh pipeline like any other set of commands. If you need help with that, let me know; we use Codefresh, and I'm well versed with their pipeline files.
I have set up a Hyperledger Fabric V1.0 network by following the Hyperledger Fabric docs, and using the fabric-sdk-java client I am able to communicate with the network from my Java application. Everything is working fine in the development setup, but I am still not getting a clear picture of its production-level implementation. I am looking for some suggestions on the following points to make it production-ready.
Will it be possible to use this setup for production? If so, how can I build my network using this docker-compose setup? Which options are available for production hosting of the network?
If it is possible to set this up in production, should I run this docker-compose setup on each of the peer systems? If so, how do I configure the docker-compose.yaml to define peers/organisations that are on different systems?
I have found the Bluemix Blockchain Service as an alternative, but it has high monthly charges. Is there any alternative way to deploy my own Hyperledger Fabric V1.0 network, defining my own peers and organizations?
I think that for a production deployment, you'd likely want to use Swarm or Kubernetes. See Hyperledger Cello, for instance. You will also want a process and automation for managing the code going forward (updating images, chaincode, etc.). Further, you might want to automate more of the onboarding process, which at present is rather bare bones.
As noted above, the Docker Compose setup is designed for a single system. You'd likely want to use Swarm or Kubernetes to manage nodes on different systems, and you want decentralized operations when you are engaging multiple entities in a consortium where the members want to choose where they run their nodes.
There is a developer sandbox offering that you can deploy to IBM's Container Service (Kubernetes), but you won't get the benefits of the crypto acceleration, HSM, and added security of the LinuxONE platform on which IBM deploys the IBM Blockchain Platform. The good things in life may be free, but I would want the added value of a vendor-provided cloud offering like the IBM Blockchain Platform for my production system. YMMV.