save pyspark model with MLflow - pyspark

I'm working in a Dataiku notebook and trying to save a PySpark model from the notebook to a Dataiku managed folder with MLflow:
import mlflow
mlflow.spark.save_model(model, "/opt/dataiku/design/managed_folders/PROJET_TEST/9KeBcUKy/ML_SAVED_V26_01_2023")
I'm getting this warning
2023/01/26 15:11:44 WARNING mlflow.utils.environment: Encountered an unexpected error while inferring pip requirements (model URI: /opt/dataiku/design/managed_folders/PROJET_TEST/9KeBcUKy/ML_SAVED_V26_01_2023, flavor: spark), fall back to return ['pyspark==3.3.1']. Set logging level to DEBUG to see the full traceback.
And no part-00000 files are written.
requirements.txt
mlflow<3,>=2.1
pyspark==3.3.1
I tried changing the Python version from 3.7 to 3.8.
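The warning itself only concerns pip requirement inference, which MLflow skips when pip_requirements is passed explicitly. Below is a minimal sketch, assuming the pins from the requirements.txt above are what the model needs. The helper names save_with_explicit_requirements and list_part_files are hypothetical, and list_part_files only inspects a directory on the local filesystem.

```python
import os


def list_part_files(model_dir):
    """Return sorted relative paths of Spark part-* files found under model_dir."""
    found = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            if name.startswith("part-"):
                found.append(os.path.relpath(os.path.join(root, name), model_dir))
    return sorted(found)


def save_with_explicit_requirements(model, path):
    """Save a fitted Spark ML model, passing pip requirements instead of inferring them."""
    # Imported lazily so list_part_files can be used without MLflow installed.
    import mlflow.spark

    mlflow.spark.save_model(
        model,
        path,
        # Assumption: these pins mirror the requirements.txt shown in the question.
        pip_requirements=["mlflow<3,>=2.1", "pyspark==3.3.1"],
    )
    return list_part_files(path)
```

If list_part_files returns an empty list after saving, the Spark writer may have put its data somewhere other than the local path (for example on a distributed filesystem), which would need to be checked separately.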

Related

How can I fix the attribute error in VS code when the same code runs perfectly on colab?

I was using the PySINDy package and kept getting a ModuleNotFoundError when I ran this:
pip install pysindy
import pysindy as ps
I fixed this by downgrading the Python version in VS Code to 3.8.9, but I now get the following note:
note: This error originates from a subprocess, and is likely not a problem with pip.
Note: you may need to restart the kernel to use updated packages.
I am now getting the attribute error when I run the following:
differentiation_method = ps.FiniteDifference(order=2)
Here's the error:
AttributeError: module 'pysindy' has no attribute 'differentiation'
Can someone please help me with this? (I successfully ran the entire code earlier on Google Colab and the online Jupyter, but I am unable to do so locally. I use macOS and Jupyter via VS Code.)
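One possibility (an assumption, not something the question confirms) is that the VS Code kernel uses a different environment from the one pip installed into, or that a local file named pysindy.py shadows the package. A small diagnostic sketch, where describe_module is a hypothetical helper name, shows which interpreter is running and where the module would be loaded from:

```python
import importlib.util
import sys


def describe_module(name):
    """Return the running interpreter and the file a module would be imported from."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "python": "%d.%d.%d" % sys.version_info[:3],
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    # Run this in the same VS Code kernel that raises the AttributeError.
    for key, value in describe_module("pysindy").items():
        print(key, "=", value)
```

If the origin points into the project folder rather than into site-packages, or the interpreter is not the one pip installed into, that would be the first thing to fix.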

Haskell and postgresql - build error "The program pg_config is required but it could not be found."

I am currently learning Haskell and just tried using PostgreSQL as a database.
I generated my project with stack (stack new <name> -> stack setup -> stack build),
and then all I changed was adding the dependencies needed for persistent and PostgreSQL to the
package.yaml file (under "dependencies:").
These are:
persistent
persistent-postgresql
persistent-template
This however results in a failing build with the following message:
postgresql-libpq > setup.exe: The program 'pg_config' is required but it could not be found.
postgresql-libpq >
-- While building package postgresql-libpq-0.9.4.2 using:
C:\Users\\AppData\Local\Temp\stack14388\postgresql-libpq-0.9.4.2.stack-work\dist\e626a42b\setup\setup --builddir=.stack-work\dist\e626a42b configure --user --package-db=clear --package-db=global --package-db=C:\sr\snapshots\365a3dde\pkgdb --libdir=C:\sr\snapshots\365a3dde\lib --bindir=C:\sr\snapshots\365a3dde\bin --datadir=C:\sr\snapshots\365a3dde\share --libexecdir=C:\sr\snapshots\365a3dde\libexec --sysconfdir=C:\sr\snapshots\365a3dde\etc --docdir=C:\sr\snapshots\365a3dde\doc\postgresql-libpq-0.9.4.2 --htmldir=C:\sr\snapshots\365a3dde\doc\postgresql-libpq-0.9.4.2 --haddockdir=C:\sr\snapshots\365a3dde\doc\postgresql-libpq-0.9.4.2 --dependency=Cabal=Cabal-2.4.1.0-5rQrtDcYhR2LOcYye7obEr --dependency=Win32=Win32-2.6.1.0 --dependency=base=base-4.12.0.0 --dependency=bytestring=bytestring-0.10.8.2 -f-use-pkg-config --extra-include-dirs=C:\Users\\AppData\Local\Programs\stack\x86_64-windows\msys2-20180531\mingw64\include --extra-lib-dirs=C:\Users\\AppData\Local\Programs\stack\x86_64-windows\msys2-20180531\mingw64\lib --extra-lib-dirs=C:\Users\\AppData\Local\Programs\stack\x86_64-windows\msys2-20180531\mingw64\bin --exact-configuration --ghc-option=-fhide-source-paths
Process exited with code: ExitFailure 1
Does anyone know how to resolve this issue and why it even occurs?
Do I have to install PostgreSQL just to be able to build the project? If so, how would you
do this in production, when the database could be located anywhere?
It looks like the Haskell package is built against the PostgreSQL client shared library libpq.dll and uses pg_config at build time to determine where PostgreSQL is installed and how it was built.
That would mean that you have to install PostgreSQL on the machine where you build the Haskell project, including the header files (the build environment, or whatever the installer calls it).
For running the Haskell program you would only need libpq.dll and the shared libraries it depends on.
I solved the issue in Ubuntu with the following command:
apt install libpq-dev
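Before re-running the build, it can help to confirm that pg_config is actually visible on PATH. A minimal sketch in Python, where find_tool is a hypothetical helper name:

```python
import shutil


def find_tool(name):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("pg_config")
    if path is None:
        print("pg_config not found on PATH; install the PostgreSQL development files")
    else:
        print("pg_config found at", path)
```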

Google Cloud ML scipy.misc.imread returning <PIL.JpegImagePlugin.JpegImageFile>

I am running the following snippet:
import tensorflow as tf
import scipy.misc
from tensorflow.python.lib.io import file_io
file = file_io.FileIO('gs://BUCKET/data/celebA/000007.jpg', mode='r')
img = scipy.misc.imread(file)
If I run that snippet in Cloud Console, I get back a proper array. But when that same snippet runs in Cloud ML, the img object is
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=178x218 at 0x7F1F8F26DA10>
This stackoverflow answer suggests that libjpeg was not installed when PIL was installed. The Cloud ML Runtime Version list shows that for Tensorflow 0.12, libjpeg-dev is an installed debian package.
I was able to reproduce this issue on Cloud ML, and it seems to be an issue with the version of file_io in Tensorflow 0.12.1, and goes away if Tensorflow 1.0 is installed.
If you can, upgrade to the 1.0 build of TF.
If you need a 0.12 version, the Cloud ML "0.12" runtime uses the official 0.12.1 build of TF, but you can upload your own version to install if you like. I did not track down exactly when the issue was fixed but a Nightly Tensorflow build from Feb 2nd seemed to work.

Installing Tensorflow from source

I've been trying to install Tensorflow and get it working over the past few days. Whilst I have managed to install TF and get it working as tested by opening Python in the terminal and typing,
import tensorflow as tf
I have not been successful in attempting to retrain Inception v3. I managed to install it from source once by following the instructions laid out here; however, I am no longer able to do so. When I get to the section 'Create the pip package and install' and run bazel build -c opt //tensorflow/tools/pip_package:build_pip_package in the root of my Tensorflow directory, I get the following error.
kieran#kieranUbuntu:~/tensorflow$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
ERROR: /home/kieran/tensorflow/tensorflow/core/BUILD:1068:1: no such target '//tensorflow/tools/git:gen/spec.json': target 'gen/spec.json' not declared in package 'tensorflow/tools/git' defined by /home/kieran/tensorflow/tensorflow/tools/git/BUILD and referenced by '//tensorflow/core:version_info_gen'.
ERROR: /home/kieran/tensorflow/tensorflow/core/BUILD:1068:1: no such target '//tensorflow/tools/git:gen/head': target 'gen/head' not declared in package 'tensorflow/tools/git' defined by /home/kieran/tensorflow/tensorflow/tools/git/BUILD and referenced by '//tensorflow/core:version_info_gen'.
ERROR: /home/kieran/tensorflow/tensorflow/core/BUILD:1068:1: no such target '//tensorflow/tools/git:gen/branch_ref': target 'gen/branch_ref' not declared in package 'tensorflow/tools/git' defined by /home/kieran/tensorflow/tensorflow/tools/git/BUILD and referenced by '//tensorflow/core:version_info_gen'.
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted.
INFO: Elapsed time: 3.063s
This is the same error I ran into when I managed to install it and then attempted retraining the classifier following this tutorial, at the section that runs bazel build tensorflow/examples/image_retraining:retrain.
I just can't figure out what is going wrong and I have been trying for so long.
I'm using this pip package: Ubuntu/Linux 64-bit, CPU only, Python 2.7
I think you should search before asking. This link can probably solve your issue.
The issue lay in the incorrect use of ./configure. I currently have two versions of Python on my computer, stored in different locations, and when running ./configure I pointed it to the wrong one. After rectifying that, everything worked correctly.
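To see which Python interpreters are candidates before answering the ./configure prompt, a short sketch like the following can list them. The helper name find_interpreters and the default name list are assumptions for illustration:

```python
import os


def find_interpreters(path_value, names=("python", "python2", "python3")):
    """Return every executable in path_value's directories whose name is in names."""
    hits = []
    for directory in path_value.split(os.pathsep):
        for name in names:
            candidate = os.path.join(directory, name)
            # Keep only regular files that the current user may execute.
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    for exe in find_interpreters(os.environ.get("PATH", "")):
        print(exe)
```

The results come back in PATH order, so the first entry is the one a bare python command would resolve to.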

Hygieia - Creation of dashboard is failing

I'm working on setting up a Hygieia dashboard.
Hygieia: https://github.com/gigaaks/Hygieia
I have tried both approaches: setting it up locally and using the Docker-based installation.
I'm able to get MongoDB, its API and the UI modules up and running. The Hygieia main login screen comes up fine. I successfully created the login user and am able to log in.
At this point I have MongoDB, the API and the UI running, and it's time to create a CAP One / Split Dashboard (templates provided by Hygieia). When I provide the values for creating a new dashboard, the following error appears in the API logs on the server (a Vagrant/VirtualBox instance) or within the Docker container.
I found a lot of inconsistencies in this project's modules, e.g. the database name is dashboard in one module, dashboardb in another, and a third expects dashboarddb. I fixed those issues in my GitHub repo and opened a pull request, which has been approved and will be merged. The following error, though, tells me that Hygieia's UI is NOT sending a parameter that the API expects when creating a dashboard (in MongoDB). The parameter is "type". Because the UI (Hygieia GUI) does not send it (as of their latest code on GitHub), the API fails with an error saying that the value of type is null. I saw the same failure when I sent the same JSON request to the REST API as a POST using Postman.
Due to this, I'm currently not able to create a dashboard and start using the collectors that Hygieia provides out of the box (for Stash, GitHub, Jenkins, SonarQube etc.).
Has anyone faced this error or found a workaround for it?
2016-04-01 02:40:40,357 WARN c.c.d.rest.RestApiExceptionHandler - Bad Request - bind exception:
org.springframework.validation.BindException: org.springframework.validation.BeanPropertyBindingResult: 1 errors
Field error in object 'dashboardRequest' on field 'type': rejected value [null]; codes dashboardRequest.type,NotNull.type,NotNull.java.lang.String,NotNull]; arguments [org.springframework.context.support.DefaultMessageSourceResolvable: codes [dashboardRequest.type,type]; arguments []; default message [type]]; default message [may not be null]
at com.capitalone.dashboard.rest.RestApiExceptionHandler.handleMethodArgumentNotValid(RestApiExceptionHandler.java:55) [api.jar!/:2.0.0-SNAPSHOT]
at org.springframework.web.servlet.mvc.method.annotation.ResponseEntityExceptionHandler.handleException(ResponseEntityExceptionHandler.java:156) [spring-webmvc-4.1.7.RELEASE.jar!/:4.1.7.RELEASE]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_72-internal]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_72-internal]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_72-internal]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_72-internal]
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221) [spring-web-4.1.7.RELEASE.jar!/:4.1.7.RELEASE]
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:137) [spring-web-4.1.7.RELEASE.jar!/:4.1.7.RELEASE]
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:110) [spring-webmvc-4.1.7.RELEASE.jar!/:4.1.7.RELEASE]
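The log above shows the request being rejected because the "type" field is null. When reproducing the call outside the UI, a small pre-check like the following sketch can catch the missing field before the request is sent. Only "type" comes from the log; the helper names and any other keys used with them are hypothetical.

```python
import json


def missing_fields(payload, required=("type",)):
    """Return the required keys that are absent or null in the payload."""
    return [key for key in required if payload.get(key) is None]


def build_request_body(payload):
    """Serialize the payload, refusing to build a body the API would reject."""
    missing = missing_fields(payload)
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    return json.dumps(payload, sort_keys=True)
```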
The issue was caused by the fact that I used Hygieia module Docker images built locally with Maven (mvn clean install; mvn docker:build), but for the UI module, as I was getting an error message, I cherry-picked the capitalone/hygieia-ui image instead. For some reason that didn't work and produced the above error (possibly because of API-level changes).
I had to run the following to get Hygieia dashboard up and running in local machine:
git config --global --unset-all url.https
git config --global url."https://".insteadOf git://
npm config set prefix /usr/local
sudo npm install --color=false; sudo npm install -g bower gulp; sudo npm install bower install
mvn clean install; mvn docker:build
gulp serve
Now everything is working as expected for creating a dashboard (PS: you have to create the MongoDB database first using the mongo command line, as shown in Hygieia's documentation).
The npm -g option installs bower and gulp globally; without -g, they are installed locally in the project. The global commands actually refer to the local installations.
For the Docker-based solution, I just used the docker-compose file and got it up and running.
NodeJS (node) version : v5.10.0
NPM (Node pkg mgr) : 3.8.3
Bower version : 1.7.9
Gulp version : CLI version 3.9.1, Local version 3.9.1
Maven version : Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00)
Java version :
java version "1.8.0_77"
Java(TM) SE Runtime Environment (build 1.8.0_77-b03)
Java HotSpot(TM) 64-Bit Server VM (build 25.77-b03, mixed mode)