Cannot import sparknlp on Databricks - pyspark

I'm trying to do an
import sparknlp
on the Databricks platform and I'm getting a similar message to the one reported in "After installing sparknlp, cannot import sparknlp".
I can't figure out how to get the python wrapper installed... I can access the spark-nlp library via Scala but I can't get the python version working. Any tips would be greatly appreciated!

This error can occur when the sparknlp jars have been loaded correctly but the Python wrapper library cannot be imported. Make sure you have installed the wrappers correctly; check the sparknlp documentation site.
As the documentation says, after installing the Python sparknlp library:
pip install --index-url https://test.pypi.org/simple/ spark-nlp==1.5.4
make sure that your PYTHONPATH environment variable lets Python locate the sparknlp wrappers.
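As a quick way to check whether the interpreter can actually see the wrappers, here is a minimal sketch (the path in the comment is illustrative, not the real install location):

```python
import importlib.util
import os
import sys

def wrapper_importable(name):
    """Return True if `name` can be imported from the current sys.path."""
    return importlib.util.find_spec(name) is not None

# sys.path is assembled at startup from the script directory, PYTHONPATH,
# and site-packages; if the wrappers were installed somewhere else, either
# export PYTHONPATH before launching Python or append the directory at runtime:
# sys.path.append("/path/to/spark-nlp/python")  # hypothetical path

print("PYTHONPATH =", os.environ.get("PYTHONPATH"))
print("sparknlp importable:", wrapper_importable("sparknlp"))
```

If this prints False, the jars loading successfully in Scala is irrelevant: the Python side simply cannot find the package on its search path.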

Related

pyodbc has a .pyi file but mypy doesn't see the stub file

pyodbc has a .pyi file but when running pytest-mypy, I have this error:
__________________________________________________________________________________________________ connexion.py __________________________________________________________________________________________________
3: error: Cannot find implementation or library stub for module named "pyodbc"
3: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
_
This should only happen when the library doesn't have stub files but it appears there are stub files. What should I do?
I'm using Python 3.10.2 and I've updated pyodbc to the latest version (pyodbc==4.0.34).
Let's assume your python is installed in /usr. In that case, your python executable will be in /usr/bin, and any libraries you install with pip will be installed in /usr/lib/python3.10/site-packages. In this case, all the sources for pyodbc can be found in /usr/lib/python3.10/site-packages/pyodbc.
Following this pattern we would expect to find the type stubs in /usr/lib/python3.10/site-packages/pyodbc.pyi, but due to a packaging issue in pyodbc, the stub is actually installed in /usr/pyodbc.pyi.
In order to pick up this path, you will need to modify settings in your development environment. In Linux, try setting PYTHONPATH=/usr in your environment. The link mentioned in rogdham's comment includes others' comments on how to make this work in VS Code. Other development environments should support similar workarounds.
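To see where the stub actually landed on your machine, a small sketch (the /usr layout above is just an example; these calls report your real paths):

```python
import pathlib
import sys
import sysconfig

# "purelib" is the site-packages directory where pyodbc.pyi would normally
# be expected; sys.prefix is where the packaging issue described above
# actually puts it.
purelib = pathlib.Path(sysconfig.get_paths()["purelib"])
prefix = pathlib.Path(sys.prefix)

for base in (purelib, prefix):
    candidate = base / "pyodbc.pyi"
    print(candidate, "->", "found" if candidate.exists() else "not found")
```

Whichever directory reports "found" is the one to put on PYTHONPATH (or in mypy's search path) so the stub is picked up.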

Snowpark connection errors with 0.6.0 jar

I am trying to use Snowpark (0.6.0) via Jupyter notebooks (after installing the Scala Almond kernel). I am using a Windows laptop and had to change the examples a bit to work around Windows. I am following the documentation here:
https://docs.snowflake.com/en/developer-guide/snowpark/quickstart-jupyter.html
Ran into this error
java.lang.NoClassDefFoundError: Could not initialize class com.snowflake.snowpark.Session$
ammonite.$sess.cmd5$Helper.<init>(cmd5.sc:6)
ammonite.$sess.cmd5$.<init>(cmd5.sc:7)
ammonite.$sess.cmd5$.<clinit>(cmd5.sc:-1)
I also tried earlier with the IntelliJ IDE and got a bunch of errors about missing dependencies for log4j etc.
Can I get some help?
I have not set it up on Windows, only on Linux.
You have to do the setup steps for each notebook that is going to use Snowpark (apart from installing the kernel).
It's important to make sure you are using a unique folder for each notebook, as in step 2 in the guide.
What was the output of import $ivy.`com.snowflake:snowpark:0.6.0`?

Jupyter can't run shapely.geometry

Hey so I've managed to get shapely.geometry to run just fine on PyCharm.
But the difficulty here is in getting the import to run on Jupyter notebook.
I have done:
import geopandas as gpd
This returns an error saying shapely.geometry doesn't exist.
I think I know how to fix this by downloading the file
"Shapely-1.6.4.post1-cp37-cp37m-win_amd64.whl" and doing conda install (that)... but it returned that the channel didn't exist...
So I did:
conda install --add channels https://www.lfd.uci.edu/~gohlke/pythonlibs/
(which is where I got the file from) which worked just fine so I then again did "conda install Shapely-1.6.4.post1-cp37-cp37m-win_amd64.whl" but it returned:
CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://www.lfd.uci.edu/~gohlke/pythonlibs/win-64/repodata.json>
A simple retry will get you on your way...
Tried that, didn't work. Someone please help. Reminder that I successfully installed Shapely with all of its modules working through "pip install Shapely-1.6.4.post1-cp37-cp37m-win_amd64.whl" WITHIN PyCharm itself.
EDIT 1
I'm following the textbook "Mastering Geospatial Analysis with Python". It had me download the packages:
gdal
geos
shapely
fiona
pyshp
pyproj
rasterio
geopandas
EDIT 2
I don't know what I did, but somehow I fixed it... the thing is, I literally did nothing except take out a shapely file with a long name and keep the one just called "shapely".
If i have files like this
gdal-2.2.2-py36hcebd033_1
instead of this
gdal
Is that the problem? Because if it is, I don't know how to get files like that; they just either appear or they don't.
Shapely is a wrapper around a C++ library called GEOS, which is not installed with the wheel. You should go to that same page and install the library as well.
Or perhaps your PyCharm is using Python 2 and Jupyter Python 3 (or vice versa).
Running conda install -c conda-forge geos=3.7.1 worked for me.
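One way to test the two-interpreters theory is to run the same check in a PyCharm console and in a Jupyter cell; a minimal sketch:

```python
import sys

# If these two values differ between PyCharm and Jupyter, each tool is running
# its own interpreter with its own site-packages, so installing Shapely in one
# has no effect on the other.
print("interpreter:", sys.executable)
print("version    : %d.%d" % sys.version_info[:2])
```

If the paths differ, install the wheel with the pip that belongs to Jupyter's interpreter (e.g. run pip via that exact executable path) rather than PyCharm's.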

ImportError: No module named sympy

I am getting the following error while trying to run a sympy file in order to contribute to SymPy:
ImportError: No module named sympy
I installed the sympy module through pip for both python2.7 and python 3.
Also, isympy is working.
Strangely, when I try to import sympy in Python's interactive console in the main sympy directory, no import errors are shown, but in some other directory it shows import errors.
Please help me to download the sympy module in a way that I will be able to run the code.
Thanks.
[Screenshot: importing the module in a Python console from the main directory.]
[Screenshot: importing the module from some other directory.]
A likely cause here is that you are using two different Pythons. If you have Python installed multiple times (like Python 2 and Python 3), each has its own separate packages. You can check what Python you are using by printing sys.executable.
I should point out that for the purposes of contributing to SymPy, you generally want to run against the development version. That is, running Python from the SymPy directory and importing the development version from there, without actually installing it.
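To see which copy of a package you are actually importing, you can inspect its __file__ attribute; here is a sketch using the stdlib json module as a stand-in for sympy:

```python
import importlib
import os

# When Python is started inside a source checkout, the current directory comes
# first on sys.path, so the development tree shadows any installed copy.
# Replace "json" with "sympy" to check which SymPy you are getting.
mod = importlib.import_module("json")
print("imported from:", mod.__file__)
print("inside current directory:",
      os.path.dirname(os.path.abspath(mod.__file__)).startswith(os.getcwd()))
```

If the path points into site-packages rather than your checkout, you are testing the installed release, not your development changes.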
Thanks for the reply.
But I solved the problem. I realised that I hadn't installed sympy in the current conda environment. When I installed it with:
conda install sympy
it worked, and no error is shown any more.
Thanks.

needed python packages for ipython

I have recently installed IPython and, before that, installed:
curses jinja2 numpy pexpect pygments qt sqlite3 tornado zmq
I ran iptest just afterwards and got OK at the end. But near the end of the report there was the following:
Tools and libraries NOT available at test time:
azure cython matplotlib oct2py pymongo rpy2 wx wx.aui
My question is whether those listed as not available at test time need to be installed for ipython to run optimally?
Thanks.
This question has already been asked and answered on the mailing list by other users:
http://python.6.n6.nabble.com/IPython-User-question-about-need-td5006172.html
Here is the answer given by MinRK:
No, but some tests of compatibility or extensions, etc. use those, so it's just a note to tell which tests were run.
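In other words, the "NOT available" list is just a report of which optional libraries the test runner could import. You can reproduce that kind of check yourself with a small sketch (the module names are simply the top-level ones from the report):

```python
import importlib.util

# These libraries only gate optional test suites and extensions;
# IPython itself runs fine without them.
optional = ["azure", "cython", "matplotlib", "oct2py", "pymongo", "rpy2", "wx"]
for name in optional:
    available = importlib.util.find_spec(name) is not None
    print(name, "available" if available else "not available (its tests are skipped)")
```

Missing entries mean only that the corresponding compatibility tests were skipped, not that anything is wrong with the IPython installation.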