Private Python package: setup.cfg vs requirements.txt - python-packaging

Let's say we create a private Python package hosted on GitHub. This package is supposed to be deployed to AWS Lambda Layers, from where it can be imported and used by several of my AWS Python Lambdas.
This package has a dependency on 'psycopg2' for the local dev environment (Mac) and on 'aws-psycopg2' when deployed on AWS Lambda Layers.
Therefore, in this package, for my local dev, I have a requirements.txt file containing:
psycopg2==2.9.3
and for the deployment to AWS Lambda Layers, I have a setup.cfg containing:
...
install_requires =
    aws-psycopg2 == 1.2.1
...
So far so good.
Now, let's say I'm developing an AWS Lambda function on my local machine, and it has a dependency on the above package.
So I add this package (hosted on my private GitHub) to the AWS Lambda function's requirements.txt file as follows:
git+https://github.com/my_account/my_package.git@dev_branch#egg=my_package
When I run pip on this requirements file, it installs the package successfully, but it installs 'aws-psycopg2' as defined in setup.cfg. Since this is my local machine, I want it to install 'psycopg2' instead, as defined in the package's requirements file, so that I can run the Lambda on my local dev machine (Mac).
In short, I want this package to install psycopg2 locally and aws-psycopg2 when deployed to AWS Lambda Layers.
Is there any workaround/solution for this?
Note: aws-psycopg2 does not work on my local machine (so I need psycopg2 locally), and vice versa for AWS Lambda, i.e. it does not work with psycopg2 and needs aws-psycopg2.
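One possible workaround (not part of the original post, so treat it as a sketch): since the local dev machine is macOS and Lambda runs on Linux, PEP 508 environment markers in a single setup.cfg could choose the dependency per platform:
install_requires =
    psycopg2 == 2.9.3; sys_platform == "darwin"
    aws-psycopg2 == 1.2.1; sys_platform == "linux"
This assumes the platform split is reliable; it would pick the wrong package for local development on Linux.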

Related

Different Python version between Dataproc master and worker nodes

I created a Dataproc cluster with Anaconda as an optional component and created a virtual env in it. Now, when running a PySpark .py file on the master node, I get this error:
Exception: Python in worker has different version 2.7 than that in driver 3.6, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
I need the RDKit package inside the virtual env, and with it a Python 3.x version gets installed. I run the following commands on my master node, and then the Python version changes:
conda create -n my-venv -c rdkit rdkit=2019.*
conda activate my-venv
conda install -c conda-forge rdkit
How can I solve this?
There are a few things here:
The 1.3 (default) image uses conda with Python 2.7. I recommend switching to 1.4 (--image-version 1.4), which uses conda with Python 3.6.
If this library will be needed on the workers, you can use this initialization action to apply the change consistently to all nodes (see the sketch after these points).
PySpark does not currently support virtualenvs, but this support is coming. Currently you can run a PySpark program from within a virtualenv, but this will not mean the workers run inside the virtualenv. Is it possible to apply your changes to the base conda environment without virtualenv?
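As a sketch, creating a cluster with the newer image and an initialization action might look like this (the cluster name and the GCS path to the init-action script are placeholders, not the actual script linked above):
gcloud dataproc clusters create my-cluster \
    --image-version 1.4 \
    --initialization-actions gs://my-bucket/conda-install.sh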
Additional info can be found here: https://cloud.google.com/dataproc/docs/tutorials/python-configuration

Failed to build a Conda package: missing gtkdocize in conda-builder gitlab-ci environment

I am using an automatic package creation pipeline in GitLab CI to build Conda packages for software we use in my company.
One of the programs we use relies on gtkdocize and checks for it in the configure script. It is only needed for the build, not for execution. So, I am not able to build the package, because the conda-builder image does not contain this program.
I am new to Conda and GitLab CI, and I imagine conda-builder is a generic Docker image for building Conda packages in general. How can I add a package to "my" conda-builder image?
Or maybe there is a build dependency I am missing in my recipe? I cannot find where gtkdocize comes from.
Any help would be appreciated.
The gtkdocize binary is used to set up an Autotools-based project using gtk-doc for generating the API reference. You will need to install whatever package provides gtkdocize; on Debian/Ubuntu, the package is called gtk-doc-tools, whereas on Fedora it's called gtk-doc.
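If extending the CI image is the chosen route, a minimal sketch of a derived Docker image might look like this (the base image name and tag are assumptions; substitute the actual conda-builder image used by the pipeline):
FROM my-registry.example.com/conda-builder:latest
# Install the package that provides gtkdocize on Debian/Ubuntu bases.
RUN apt-get update && \
    apt-get install -y --no-install-recommends gtk-doc-tools && \
    rm -rf /var/lib/apt/lists/*
On a Fedora-based image, the install line would use dnf install gtk-doc instead.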

Add external library to an action

I'm developing an action in IBM Cloud Functions that is called from a Watson Assistant dialog. This action has to make a SOAP request to a web service. The problem is when I try to import the suds library, because it is not in the default Python libraries. How can I add the library?
Thanks in advance.
You can package Python dependencies by using a virtual environment (virtualenv). The virtual environment allows you to link additional packages that can be installed by using pip, for example.
To install dependencies, package them in a virtual environment, and create a compatible OpenWhisk action:
Create a requirements.txt file that contains the pip modules and versions to install.
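For instance, for the suds dependency from the question (the version pin and the choice of the suds-jurko fork are assumptions, not part of the original answer):
suds-jurko==0.6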
Install the dependencies and create a virtual environment. The virtual environment directory must be named virtualenv. To ensure compatibility with the OpenWhisk runtime container, package installations inside a virtual environment must use the image that corresponds to the kind.
For kind python:2 use the docker image openwhisk/python2action.
For kind python:3.6 use the docker image ibmfunctions/action-python-v3.6.
For kind python:3.7 use the docker image ibmfunctions/action-python-v3.7.
docker run --rm -v "$PWD:/tmp" ibmfunctions/action-python-v3 bash -c "cd /tmp && virtualenv virtualenv && source virtualenv/bin/activate && pip install -r requirements.txt"
Package the virtualenv directory and any additional Python files. The source file that contains the entry point must be named __main__.py.
zip -r helloPython.zip virtualenv __main__.py
Create the action helloPython.
ibmcloud fn action create helloPython --kind python-jessie:3 helloPython.zip
For more details, refer to this link.
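For reference, a minimal sketch of the __main__.py entry point (the function name main is the default entry point; the suds usage and the wsdl_url parameter are illustrative assumptions):
from suds.client import Client

def main(params):
    # Build a SOAP client from a WSDL URL passed in the action parameters.
    client = Client(params.get("wsdl_url", "http://example.com/service?wsdl"))
    # Return the service description so the result is JSON-serializable.
    return {"service": str(client)}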

Add Java in Python Flask Cloud Foundry

I need to run a java command from a Python Flask application which is deployed using cf. How can we make a Java runtime available to this Python Flask app?
I tried using a multi-buildpack setup, but java_buildpack expects some JAR or WAR to be executed while deploying the application.
Is there any approach that would make Java available to the Python Flask app?
The last buildpack in the buildpack chain is responsible for determining a command to start your application which is why the Java buildpack expects a JAR/WAR to execute.
The Java buildpack, as of writing this, does not ship a supply script, so it can only be run as the last buildpack when using multi buildpack support. It looks like at some point in the future the Java buildpack will provide a supply script, but this is still being worked out here.
For now, what you can do is use the apt-buildpack and install a JRE/JDK that way.
To do this, add a file called apt.yml to the root of your project folder. In that file, put the following:
---
packages:
- openjdk-8-jre
repos:
- deb http://ppa.launchpad.net/openjdk-r/ppa/ubuntu trusty main
keys:
- https://keyserver.ubuntu.com/pks/lookup?op=get&search=0xEB9B1D8886F44E2A
This will tell the apt buildpack to add a PPA for Ubuntu Trusty where we can get the latest openjdk8. This gets installed under /home/vcap/deps/0, which puts the java executable at /home/vcap/deps/0/lib/jvm/java-8-openjdk-amd64/bin/java.
Note: The java binary is unfortunately not on the path because of the way Ubuntu uses update-alternatives, and we can't use that tool to put it on the path in the CF app container because we don't have root access.
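Because java is not on the PATH, the Flask app has to call it by its full path. A minimal sketch (the deps/0 index depends on buildpack ordering, so treat the path as an assumption):
import subprocess

# Full path from the apt-buildpack install described above.
JAVA = "/home/vcap/deps/0/lib/jvm/java-8-openjdk-amd64/bin/java"

# `java -version` prints to stderr, not stdout.
result = subprocess.run([JAVA, "-version"], capture_output=True, text=True)
print(result.stderr)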
After setting that up, you'd follow the normal instructions for using multiple buildpacks.
$ cf push YOUR-APP --no-start -b binary_buildpack
$ cf v3-push YOUR-APP -b https://github.com/cloudfoundry/apt-buildpack#v0.1.1 -b python_buildpack
Note: The process to push using multiple buildpacks will likely change in the future and v3-push, which is currently experimental, will go away.
Note: The example above hard codes version v0.1.1 of the apt buildpack. You should use the latest stable release, which you can find here. Using the master branch is not recommended.
One way to achieve your goal of combining Java and Python would be context-based routing. I have an example that combines Python and Node.js, but the approach is the same.
Basically, you have a second app serving one or more paths of a domain / URI.

virtualenv can't execute on my system?

I'm trying to create a virtual environment to deploy a Flask app. However, when I try to create a virtual environment using virtualenv, I get this error:
Using base prefix '//anaconda'
New python executable in /Users/sydney/Desktop/ptproject/venv/bin/python
ERROR: The executable /Users/sydney/Desktop/ptproject/venv/bin/python is not functioning
ERROR: It thinks sys.prefix is '/Users/sydney/Desktop/ptproject' (should be '/Users/sydney/Desktop/ptproject/venv')
ERROR: virtualenv is not compatible with this system or executable
I think that I installed virtualenv using conda. When I use which virtualenv, I get this:
//anaconda/bin/virtualenv
Is this an incorrect location for virtualenv? I can't figure out what else the problem would be. I don't understand the error log at all.
It turns out that virtualenv just doesn't work correctly with conda. For example:
https://github.com/conda/conda/issues/1367
(A workaround is proposed at the end of that thread, but it looks like you may be seeing a slightly different error, so maybe it won't work for you.)
Instead of deploying your app with virtualenv, why not just use a proper conda environment? Conda environments are more general (and powerful) than those provided by virtualenv.
For example, to create a new environment with python-2.7 and flask in it:
conda create -n my-new-env flask python=2.7
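Then activate it before running the app (older conda releases use source activate, newer ones use conda activate):
source activate my-new-env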