Kubeflow Pipeline in serving model

I'm beginning to dig into Kubeflow Pipelines for a project and have a beginner's question. It seems like Kubeflow Pipelines work well for training, but how about serving in production?
I have a fairly intensive preprocessing pipeline for training and must apply that same pipeline for production predictions. Can I use something like Seldon Serving to create an endpoint that kicks off the preprocessing pipeline, applies the model, and then returns the prediction? Or is the better approach to just put everything in one Docker container?

Yes, you can definitely use Seldon for serving. In fact, the Kubeflow team offers an easy way to link training and serving: Fairing.
Fairing provides a programmatic way of deploying your prediction endpoint. You could also take a look at this example of how to deploy your Seldon endpoint with your training result.

KF Pipelines is designed for pipelines that run from start to finish. A serving process has no end, so although it is possible, serving itself should be handled outside of a pipeline.
Instead, the final step of the pipeline should push the trained model to a long-running serving service.
Serving can be performed by CMLE serving, Kubeflow's TFServe, Seldon, etc.
Can I use something like Seldon Serving to create an endpoint that kicks off the preprocessing pipeline, applies the model, and then returns the prediction?
Due to container starting overhead, Kubeflow Pipelines usually handle batch jobs. Of course you can run a pipeline for a single prediction, but the latency might not be acceptable. For serving it might be better to have a dedicated long-lived container/service that accepts requests, transforms data and makes predictions.
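As a minimal sketch of such a long-lived service, the following keeps preprocessing and inference in one process, so no container has to start per request. The preprocess and load_model functions are hypothetical stand-ins (not part of Kubeflow or Seldon) for the real training-time preprocessing and the model produced by the pipeline, and the request format is an assumption.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def preprocess(raw):
    """Hypothetical stand-in for the preprocessing shared with training."""
    values = [float(v) for v in raw["values"]]
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def load_model():
    """Hypothetical stand-in for loading the trained model once at startup."""
    return lambda features: sum(abs(f) for f in features)


MODEL = load_model()


def predict(raw):
    """Apply preprocessing and the model in-process for a single request."""
    return {"prediction": MODEL(preprocess(raw))}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body, run the full transform + predict path, reply as JSON.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        payload = json.dumps(predict(body)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Block forever, answering prediction requests on the given port."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Calling serve() starts the endpoint. The point is that the same preprocess function can be imported by the training pipeline, so training and serving cannot drift apart.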

Related

Reversing the flow of jobs in a workflow

I'm working on some Terraform logic and using GitHub workflows to deploy multiple components sequentially, for example job2 (ALB) depends on the completion of job1 (VPC creation). This works fine during the apply phase. However, if I delete the infrastructure using terraform destroy, the sequence of jobs fails, because job1 cannot succeed before job2 has run.
Is there a way to execute the workflow in bottom-up order based on an input?
I know that we can leverage Terraform to deploy these components and handle the dependencies at the Terraform level. This is just an example of the use case I'm working on.

You can control the flow of jobs by using the keyword “needs”. Read the docs here: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idneeds
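Independent of how it is wired into the workflow file, the reversal itself can be sketched as follows: compute the apply order from the dependency map, then walk it backwards for destroy. The job names and the NEEDS map are hypothetical, and this only illustrates the ordering logic; it is not a GitHub Actions feature.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each job lists the jobs it needs first.
NEEDS = {"vpc": [], "alb": ["vpc"], "service": ["alb"]}


def job_order(needs, action):
    """Return the job order for 'apply', or the reverse of it for 'destroy'."""
    apply_order = list(TopologicalSorter(needs).static_order())
    if action == "apply":
        return apply_order
    if action == "destroy":
        # Tear down dependents before the things they depend on.
        return list(reversed(apply_order))
    raise ValueError(f"unknown action: {action}")
```

A single job that receives the action as a workflow input could loop over this order and run terraform apply or terraform destroy per component, which avoids maintaining two mirrored needs chains.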

Sagemaker Pre-processing/Training Jobs vs ECS

We are considering using SageMaker jobs or ECS as the compute resource for a few of our ML jobs. Our jobs are based on a custom Dockerfile (no Spark, just basic ML Python libraries), so all we need is a resource to run the container.
Is there any specific advantage of using SageMaker over ECS here? Also, since our use case only requires a resource for running a Docker image, would a Processing Job and a Training Job serve the same purpose? Thanks!

Yeah, you could make use of either a Training Job or a Processing Job (assuming the ML jobs are for transient training and/or processing).
The benefit of using SageMaker over ECS is that SageMaker manages the infrastructure. The Jobs are also transient and as such will be killed after training/processing while your artifacts will be automatically saved to S3.
With SageMaker Training or Processing Jobs, all you need to do is bring your container (sitting in ECR) and kick off the Job with a single API call (CreateTrainingJob, CreateProcessingJob).
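As a rough sketch of what that single call looks like, the following builds a CreateTrainingJob request for a custom container. The instance type, volume size, and runtime limit are placeholder assumptions, and the helper names are hypothetical; only the request field names and the boto3 create_training_job call come from the SageMaker API.

```python
def build_training_job_request(job_name, image_uri, role_arn, output_s3_uri):
    """Assemble the parameters for a CreateTrainingJob call on a custom image."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,    # your container in ECR
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",  # placeholder assumption
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,            # placeholder assumption
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},  # placeholder
    }


def submit(request):
    """Send the request; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the sketch loads without boto3

    return boto3.client("sagemaker").create_training_job(**request)
```

Once the job finishes, the instance is released and the contents of the container's model output directory are uploaded under the given S3 path.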

Dependency among different ecs tasks

I have developed a backend server using multiple microservices, using spring cloud.
I have discovery service, config service, and different other services.
Right now for testing purposes, I use docker-compose to run them in the right order. Now I have decided to deploy my application on AWS.
I thought of running them on ECS with Fargate, but I don't understand how to define dependencies among my tasks.
I found this article https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#container_definition_dependson
It defines dependency among containers in the same task.
But I do not think that I can run all my services in just one task, as there will be complications in assigning vCPUs. Even if I use 4 vCPUs and a large amount of memory, I am not sure how well my containers will run, and scaling them afterwards will be another issue. Overall, provisioning that many vCPUs and that much memory will also incur a lot of cost.
Is there any way to define dependency among ECS tasks?

CloudFormation supports the DependsOn attribute which allows you to control the sequence of deployment (you basically trade off speed of parallelism for ordered deployments when you need them).
Assuming your tasks are started as part of ECS services you can set the DependsOn to a service that needs to start first.
E.g.
Resources:
  WebService:
    DependsOn:
      - AppService
    Properties:
      ....
  ....
Out of curiosity, how did you move from Compose to CloudFormation? FYI we have been working with Docker to add capabilities into the Docker toolset to deploy directly to ECS (basically converting docker compose files into CloudFormation IaC). See here for more background. BTW this mechanism honors the compose dependency chain. That is, if you set one service being dependent on the other in compose, the resulting CFN template uses the DependsOn attribute I described above.

Deploy REST API with dependencies

I want to deploy a trained machine learning model as a REST API. The API would take a file and first decompose it into features. The problem is that this step depends on other libraries (e.g., FFTW). The API would then query the model with the features from the previous step.
Theoretically I can spin up a virtual machine in the cloud, install all the dependencies there, and point the endpoint to that VM. But this won't scale if we have concurrent requests.
Ideally I'd love to put everything behind an API gateway and leverage the serverless paradigm so I don't have to worry about scalability.

First of all, you need to decompose your model into different steps. From your question I see preprocessing and model inference steps.
Your preprocessing includes dependencies such as FFTW.
You didn't specify what kind of model you have, but I assume that it also requires some sort of environment and/or dependencies.
Having said that, what you need to do is implement two services, one for each step.
It's better to pack them into Docker images so that each container stays isolated and you can deploy them easily.
Scalability at the Docker level can be achieved by deploying to a cloud provider and orchestrating the containers with AWS ECS or Kubernetes.
There is an open-source project hydro-serving that could help you with this task.
In this case you just need to focus on the models themselves. hydro-serving takes care of the infrastructure.
If the preprocessing stage is implemented as a Python script, we can deploy it with all the dependencies from requirements.txt in its own container.
The same is also true for the model: hydro-serving has out-of-the-box support for TensorFlow and Spark models.
Otherwise, it's easy to adapt the existing mechanism to satisfy your requirements (other language/toolkit).
Then, assuming that you already have a hydro-serving instance somewhere, you upload your steps with hs upload --host $HOST --port $PORT
and compose an application pipeline with your models.
You can access your application via the HTTP API, the gRPC API, or a Kafka topic.
It would help if you specified what kind of files you are trying to send to the REST API.
You will possibly need to encode them somehow in order to send them through the REST API. On the other hand, you could just send them as-is via the gRPC API.
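One common way to encode a binary file for a JSON-based REST API is base64. The sketch below is a generic illustration; the payload field names (filename, content_b64) are assumptions and not part of the hydro-serving API.

```python
import base64
import json


def encode_file_payload(filename, data):
    """Wrap raw file bytes in a JSON document that is safe to send over REST."""
    return json.dumps({
        "filename": filename,
        "content_b64": base64.b64encode(data).decode("ascii"),
    })


def decode_file_payload(payload):
    """Recover the original bytes on the service side."""
    return base64.b64decode(json.loads(payload)["content_b64"])
```

Base64 grows the payload by about a third, which is one reason a binary-friendly transport can be preferable for large files.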
Disclosure: I'm a developer of hydro-serving

How to run multiple Kubernetes jobs in sequence?

I would like to run a sequence of Kubernetes jobs one after another. It's okay if they are run on different nodes, but it's important that each one run to completion before the next one starts. Is there anything built into Kubernetes to facilitate this? Other architecture recommendations also welcome!

This requirement to add control flow, even if it's a simple sequential flow, is outside the scope of Kubernetes native entities as far as I know.
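For a purely sequential flow, one workaround is to drive the ordering from a client-side script: create each Job, then block until it reports completion before creating the next. The sketch below shells out to kubectl apply and kubectl wait; the manifest paths and job names are hypothetical, and the injectable run parameter is just a convenience for testing.

```python
import subprocess


def commands_for(manifest, job_name, timeout="600s"):
    """kubectl commands that create one Job and block until it completes."""
    return [
        ["kubectl", "apply", "-f", manifest],
        ["kubectl", "wait", "--for=condition=complete",
         f"--timeout={timeout}", f"job/{job_name}"],
    ]


def run_in_sequence(jobs, run=None):
    """Run (manifest, job_name) pairs one after another, stopping on the first error."""
    if run is None:
        # check=True raises CalledProcessError if kubectl exits non-zero.
        run = lambda cmd: subprocess.run(cmd, check=True)
    for manifest, job_name in jobs:
        for cmd in commands_for(manifest, job_name):
            run(cmd)
```

For example, run_in_sequence([("step1.yaml", "step1"), ("step2.yaml", "step2")]). Note that a failed Job never reaches the complete condition, so the wait only returns after the timeout; that limitation is part of why a workflow engine is usually the better fit.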
There are many workflow engine implementations for Kubernetes. Most of them focus on CI/CD but are generic enough to use however you want.
Argo: https://applatix.com/open-source/argo/
Adds a custom resource definition (CRD) to Kubernetes for a Workflow entity
Brigade: https://brigade.sh/
Takes a more serverless-like approach and is built on JavaScript, which is very flexible
Codefresh: https://codefresh.io
Has a unique approach where you can use the SaaS to easily get started without complicated installation and maintenance, and you can point Codefresh at your Kubernetes nodes to run the workflow on.
Feel free to Google for "Kubernetes Workflow", and discover the right platform for yourself.
Disclaimer: I work at Codefresh

I would try using CronJobs with the concurrency policy set to Forbid, so that jobs don't run concurrently.

I have worked with IBM TWS (Workload Automation), a scheduler similar to cron in which you can declare dependencies between jobs.
You can specify that a job runs only after its dependencies have run by using the follows keyword.