I am using the code examples in the MLRun documentation for running a Spark job on the Iguazio platform. The docs say I can use a default Spark docker image provided by the platform, but when I try to run the job, the pod hangs with an ImagePullBackOff error. Here is the function spec item I am using:
my_func.spec.use_default_image = True
How do I configure Iguazio to use the default spark image that is supposed to be included in the platform?
You need to deploy the default images to the cluster docker registry. There are two of them: one for remote Spark and one for Spark Operator, each containing all the necessary dependencies for its runtime.
See the code below.
# This section has to be invoked once per MLRun/Iguazio upgrade
from mlrun.runtimes import RemoteSparkRuntime
RemoteSparkRuntime.deploy_default_image()
from mlrun.runtimes import Spark3Runtime
Spark3Runtime.deploy_default_image()
Once these images are deployed to the cluster docker registry, a function whose spec sets use_default_image = True will be able to pull the image and deploy.
I am using Kafka Testcontainers with JUnit 5. Can someone let me know how I can delete data from the Kafka testcontainer after each test, so that I don't have to destroy and recreate the container every time?
Testcontainers version: 1.6.2
Docker Kafka image: confluentinc/cp-kafka:5.2.1
Make the container variable static
Containers declared as static fields will be shared between test methods. They will be started only once before any test method is executed and stopped after the last test method has executed.
https://www.testcontainers.org/test_framework_integration/junit_5/
Make sure that you don't share state between tests, though. For example, if you want to test creating a topic, producing to it, consuming from it, and then deleting it, all of those steps should live in one test (though you can factor them into separate non-test helper methods).
That being said, each test should ideally use a unique topic name, perhaps one that describes the test.
Also, as documented, you cannot use the parallel test runner.
How do you run docker-compose across different lifecycle environments (say dev, qa, staging, production)?
Sometimes a larger VM is shared by multiple developers, so I would like to start the containers with developer-specific suffixes (say dev1, dev2, dev3, ...). Should port customization be handled manually via the environment file (i.e. the .env file)?
This is an unusual use case for docker-compose, but I'll leave some tips anyway! :)
There are two different ways to name the things you start with docker-compose. One is to name the service that you specify under the main services: key of your docker-compose.yml file. By default, individual running containers are assigned names indicating which project they are from (by default, the name of the directory your docker-compose file is in), which service they run (the key under services:), and which instance of that service they are (this number changes if, e.g., you're using replicas). For example, the default container name for a service named myservice in a compose file at ~/my_project/docker/docker-compose.yml will look like docker_myservice_1 (or _2, _3, etc. if more than one container runs for that service).
You can use environment variables to substitute many values in docker-compose files, but you can't conditionally specify the service name: service names are restricted to alphanumeric characters (plus ., _ and -), and compose does not interpolate variables in key positions, so a compose file can't look like this:
version: "3"
services:
  ${ENVVAR}:
    image: ubuntu:20.04
However, you can override the container naming scheme by using the container_name field in your docker-compose file (see the Compose file documentation for usage). A solution you could use might look like this:
version: "3"
services:
  myservice:
    image: ubuntu:20.04
    container_name: ${DEVELOPER_ENVVAR?err}
This requires a developer to specify DEVELOPER_ENVVAR at runtime, either by exporting it in their shell or by running docker-compose like DEVELOPER_ENVVAR=myservice_dev1 docker-compose up. Note that container_name is incompatible with running multiple replicas of the same service: container names must be unique, so you'll either have to define a separate service for each name or give up on container_name.
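The ${DEVELOPER_ENVVAR?err} syntax follows POSIX-style parameter expansion, so you can preview the failure mode in a plain shell before wiring it into compose (the variable name and error text are just the ones from the example above):

```shell
# compose's ${DEVELOPER_ENVVAR?err} behaves like the shell expansion below:
# expansion fails (and compose aborts) when the variable is unset
unset DEVELOPER_ENVVAR
if ( : "${DEVELOPER_ENVVAR?err}" ) 2>/dev/null; then
  echo "variable is set"
else
  echo "unset variable aborts, as compose would"
fi

# supplying the variable inline is enough to satisfy the check
DEVELOPER_ENVVAR=myservice_dev1
( : "${DEVELOPER_ENVVAR?err}" ) && echo "ok: container would be named $DEVELOPER_ENVVAR"
```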
However, you're in a pickle if you expect multiple developers to run containers with different names from the same compose file in the same directory. The directory determines the compose project name, so docker-compose treats all of those containers as one project: its Recreating step will stop and replace any containers for that service that are already running when someone else brings it up. Ultimately, I think this is for the best: if multiple developers were trying to run the exact same compose project at once, should one developer have control over the others' running containers? Probably not, right?
If you want multiple developers to be able to run services at once in the same VM, I think you probably want to do two things:
First (and you may well have already done this, but it's still a good reminder): make sure this is a good idea. Are there resource-contention issues (e.g. port forwarding) that would make different running instances of your project conflict? For many Docker services there will be, but probably not for, e.g., images that are meant to run in a swarm.
Second, have different compose files checked out in different directories, so that there is a separate compose project for each developer. As for .env files, one obvious option is to maintain separate copies, one per developer directory. If maintaining one copy of .env per developer is unsatisfactory for your use case, you could instead create symlinks named .env (or whatever your env file is named) pointing to the same file somewhere else on the VM.
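As a sketch of the symlink variant (the directory names under /tmp are illustrative; in practice these would be the per-developer checkout directories):

```shell
# One shared env file on the VM, referenced from each developer's checkout
mkdir -p /tmp/compose-demo/dev1 /tmp/compose-demo/dev2
echo "DEVELOPER_ENVVAR=myservice_dev1" > /tmp/compose-demo/shared.env

# Each developer's project directory gets a .env symlink to the shared file
ln -sf /tmp/compose-demo/shared.env /tmp/compose-demo/dev1/.env
ln -sf /tmp/compose-demo/shared.env /tmp/compose-demo/dev2/.env

# docker-compose run from either directory now reads the same values
cat /tmp/compose-demo/dev1/.env
```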
After you've done this, you'll be able to tell from the container names who is running what.
If none of these are satisfactory, you might want to consider, e.g., using one VM per developer, or even a different container-management system than docker-compose.
I have done very similar automation, and I used Ansible to create the docker-compose config on the fly.
Based on the input environment, the Ansible playbook creates the relevant docker-compose file: I keep a docker-compose template with the dynamic values in my git repository, and the playbook populates them.
You can also use Ansible to trigger such creation and deployment steps one after another.
A similar sample is posted in the ansible_docker_splunk repository; the project automates an end-to-end docker cluster starting from a CSV file.
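A minimal shell sketch of the same render-then-run idea, using sed as a stand-in for Ansible's Jinja2 templating (the file paths, placeholder name, and service are illustrative):

```shell
# Template checked into git, with a placeholder for the dynamic value
cat > /tmp/docker-compose.yml.tmpl <<'EOF'
services:
  myservice:
    image: ubuntu:20.04
    container_name: __CONTAINER_NAME__
EOF

# The playbook (here: sed) fills in the environment-specific value
sed 's/__CONTAINER_NAME__/myservice_dev1/' /tmp/docker-compose.yml.tmpl \
  > /tmp/docker-compose.yml

grep container_name /tmp/docker-compose.yml
```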
For my integration tests I need to create and destroy resources around the test run (for example, starting and stopping docker images to test against).
Creating a resource can take time, so I'd like to do it once for all the tests that need it and destroy it when it's no longer needed.
For now I've done that by creating the resource lazily and adding a shutdown hook:
import scala.concurrent.Await
import scala.concurrent.duration._

object MyResource {
  // Started lazily on first access; stopped by a JVM shutdown hook
  lazy val singleton = {
    val docker = Await.result(startImage(), 1.minute)
    sys.addShutdownHook(Await.result(docker.stop(), 1.minute))
    docker
  }
}
I'm looking for a better way to handle this, such as a fixture, but I'm not sure I can start and stop the docker image just once that way.
My tests are spread across multiple classes, so I can't use beforeAndAfterAll (and I'd like to avoid Await.result).
So I built a REST API microservice which queries a local Elasticsearch instance and translates the results according to an internal protocol. I built it into a Docker image and I would like to run some unit tests on it during the build. Since ES is attached to a private Docker network, it isn't reachable by the microservice during the build, so the tests obviously fail. I was wondering, is there a way around this without resorting to a complicated testing framework for dependency injection? How do you test this kind of container in your work practice?
I would build the application image without any testing, and then test it using docker run, so you can take advantage of the docker network.
Roughly, this is more elegant than testing in the middle of the build:
docker build -t my_app:1.0-early . to obtain an image of your application.
docker run --network my_test_network my_app:1.0-early /run_test_cases.sh, which should exit with a proper code (or output) indicating success or failure.
Depending on the outcome of the tests, re-tag the image: docker tag my_app:1.0-early my_app:1.0.
You will need to have created the docker network beforehand (docker network create my_test_network), or better, use docker-compose.
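The re-tag step relies on docker run propagating the exit code of /run_test_cases.sh. Here is a sketch of that gating logic, with a stub function standing in for the actual docker run invocation (so it runs without a docker daemon):

```shell
# Stand-in for: docker run --rm --network my_test_network my_app:1.0-early /run_test_cases.sh
# docker run exits with the exit code of the command it ran, which is what we branch on.
run_tests_in_network() {
  return 0   # pretend the test suite passed
}

if run_tests_in_network; then
  echo "tests passed: docker tag my_app:1.0-early my_app:1.0"
else
  echo "tests failed: image stays at my_app:1.0-early" >&2
  exit 1
fi
```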