PySpark not starting - Windows 10 - pyspark

I am trying to set up Spark for Python on a Windows 10 Pro machine.
I followed these steps:
Installed Anaconda with Python 3.7
Installed JDK 8
Installed pre-built Spark 2.4.6 with Hadoop 2.7
Downloaded winutils.exe
Set up all environment variables, including the user PATH
Created a C:\tmp\hive folder
Ran the winutils.exe chmod -R 777 C:\tmp\hive command successfully
However, when I try to launch pyspark from the command prompt, the following text is output and nothing happens after that, not even an error:
(base) C:\Spark\bin>pyspark
Python 3.7.6 (default, Jan 8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
20/08/03 07:49:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Finally, more than an hour later, this error is printed:
Traceback (most recent call last):
File "C:\Program Files\Python37\lib\socket.py", line 589, in readinto
return self._sock.recv_into(b)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Spark\python\pyspark\shell.py", line 41, in <module>
spark = SparkSession._create_shell_session()
File "C:\Spark\python\pyspark\sql\session.py", line 573, in _create_shell_session
return SparkSession.builder\
File "C:\Spark\python\pyspark\sql\session.py", line 173, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "C:\Spark\python\pyspark\context.py", line 367, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "C:\Spark\python\pyspark\context.py", line 136, in __init__
conf, jsc, profiler_cls)
File "C:\Spark\python\pyspark\context.py", line 198, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "C:\Spark\python\pyspark\context.py", line 306, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "C:\Spark\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1523, in __call__
File "C:\Spark\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 985, in send_command
File "C:\Spark\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1152, in send_command
File "C:\Program Files\Python37\lib\socket.py", line 589, in readinto
return self._sock.recv_into(b)
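One way to rule out a setup problem is to check the environment from Python before launching the shell. The sketch below is only an illustration: the variable names JAVA_HOME, SPARK_HOME and HADOOP_HOME, and the bin\winutils.exe location under HADOOP_HOME, are assumptions about a typical Windows setup and are not taken from the post, and check_spark_env is a hypothetical helper. Note also that the banner reports an Anaconda interpreter while the traceback paths point to C:\Program Files\Python37, so it may be worth checking which Python installation Spark actually picks up.

```python
import os


def check_spark_env(env, exists=os.path.exists):
    """Return a list of human-readable problems found in an environment mapping.

    The variable names checked here are assumptions about a typical setup.
    """
    problems = []
    for name in ("JAVA_HOME", "SPARK_HOME", "HADOOP_HOME"):
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not exists(value):
            problems.append(f"{name} points to a missing path: {value}")

    # Assumed layout: winutils.exe sits in the bin folder under HADOOP_HOME.
    hadoop_home = env.get("HADOOP_HOME")
    if hadoop_home:
        winutils = os.path.join(hadoop_home, "bin", "winutils.exe")
        if not exists(winutils):
            problems.append(f"winutils.exe not found at {winutils}")
    return problems


if __name__ == "__main__":
    for problem in check_spark_env(os.environ):
        print(problem)
```

An empty result only means that the paths exist; it does not prove that the versions are compatible with each other.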


odoo.service.server: Failed to load server-wide module `web`

I get this error when starting the server after installing Odoo 11.
I am using Ubuntu 18.04.
My config file looks like this:
[options]
; This is the password that allows database operations:
;admin_passwd = admin
db_host = 127.0.0.1
db_port = 8069
db_user = odoo
db_password = 123
addons_path = /opt/odoo/addons,/opt/odoo/odoo/addons
logfile = /var/log/odoo/odoo-server.log
The full error looks like this:
2020-02-07 05:12:33,809 12706 CRITICAL ? odoo.modules.module: Couldn't load module web
2020-02-07 05:12:33,809 12706 CRITICAL ? odoo.modules.module: The 'odoo.addons.web' package was not installed in a way that PackageLoader understands.
2020-02-07 05:12:33,809 12706 ERROR ? odoo.service.server: Failed to load server-wide module `web`.
The `web` module is provided by the addons found in the `openerp-web` project.
Maybe you forgot to add those addons in your addons_path configuration.
Traceback (most recent call last):
File "/opt/odoo/odoo/service/server.py", line 984, in load_server_wide_modules
odoo.modules.module.load_openerp_module(m)
File "/opt/odoo/odoo/modules/module.py", line 368, in load_openerp_module
__import__('odoo.addons.' + module_name)
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 656, in _load_unlocked
File "<frozen importlib._bootstrap>", line 626, in _load_backward_compatible
File "/opt/odoo/odoo/modules/module.py", line 82, in load_module
exec(open(modfile, 'rb').read(), new_mod.__dict__)
File "<string>", line 4, in <module>
File "/opt/odoo/addons/web/controllers/__init__.py", line 4, in <module>
from . import main, pivot
File "/opt/odoo/addons/web/controllers/main.py", line 56, in <module>
loader = jinja2.PackageLoader('odoo.addons.web', "views")
File "/home/sandip/.local/lib/python3.6/site-packages/jinja2/loaders.py", line 290, in __init__
" PackageLoader understands." % package_name
ValueError: The 'odoo.addons.web' package was not installed in a way that PackageLoader understands.
It's actually a bug inside Jinja2. Update it to version 2.11.2
You might be blocked by a missing parameter in your config file.
Try adding the following parameter to the file:
server_wide_modules = web
It is required for the installation to go through cleanly.
Hope it helps.
Please check your Jinja2 version.
The installed version may be the wrong one.
Uninstall it:
sudo pip3 uninstall Jinja2
Then install the right version:
sudo pip3 install Jinja2==2.10.1
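Before reinstalling, it may help to confirm which copy of Jinja2 the server actually imports. The traceback above shows it being loaded from a per-user site-packages directory (/home/sandip/.local/...), and a pip command run under sudo may act on a different location. A minimal sketch, where describe_module is a hypothetical helper name; run it with the same interpreter and user that start Odoo:

```python
import importlib


def describe_module(name):
    """Return (version, file path) for an importable module, or None if it is missing."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", None)
    return version, location


# Shows the version and the file the interpreter would load for Jinja2.
print(describe_module("jinja2"))
```

If the reported path is under ~/.local, the uninstall and install commands may need to target that user's packages instead of the system-wide ones.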

Librabbitmq 2.0.0 with Python 3 gives TypeError: can't pickle memoryview objects

I am using the latest master branch of the git repo https://github.com/celery/librabbitmq and installing librabbitmq==2.0.0 for Python 3.6 by following the instructions in the README:
Using the development version
You can clone the repository by doing the following:
$ git clone git://github.com/celery/librabbitmq.git
Then install it by doing the following:
$ cd librabbitmq
$ make install # or make develop
This works fine (after installing certain binaries for C compilation in the OS), but when I then write a small a+b add task and call it with add.delay(2,2), it fails with the following error. I looked it up and saw that Celery 4 uses JSON as its serializer, so clearly this is not because of pickle serialization.
Changing the broker transport from librabbitmq to pyamqp works normally.
The exact same thing happens on both macOS and Ubuntu 16.
[2018-04-30 23:40:02,956: CRITICAL/MainProcess] Unrecoverable error: SystemError(' returned a result with an error set',)
Traceback (most recent call last):
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/kombu/messaging.py", line 624, in _receive_callback
return on_m(message) if on_m else self.receive(decoded, message)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 570, in on_task_received
callbacks,
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/strategy.py", line 145, in task_message_handler
handle(req)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/worker.py", line 221, in _process_task_sem
return self._quick_acquire(self._process_task, req)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/kombu/async/semaphore.py", line 62, in acquire
callback(*partial_args, **partial_kwargs)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/worker.py", line 226, in _process_task
req.execute_using_pool(self.pool)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/request.py", line 531, in execute_using_pool
correlation_id=task_id,
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/concurrency/base.py", line 155, in apply_async
**options)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/billiard/pool.py", line 1486, in apply_async
self._quick_put((TASK, (result._job, None, func, args, kwds)))
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/concurrency/asynpool.py", line 813, in send_job
body = dumps(tup, protocol=protocol)
TypeError: can't pickle memoryview objects
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/worker.py", line 203, in start
self.blueprint.start(self)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/bootsteps.py", line 370, in start
return self.obj.start()
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 320, in start
blueprint.start(self)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 596, in start
c.loop(*c.loop_args())
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/loops.py", line 88, in asynloop
next(loop)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/kombu/async/hub.py", line 354, in create_loop
cb(*cbargs)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/kombu/transport/base.py", line 236, in on_readable
reader(loop)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/kombu/transport/base.py", line 218, in _read
drain_events(timeout=0)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/librabbitmq-2.0.0-py3.6-macosx-10.6-intel.egg/librabbitmq/__init__.py", line 227, in drain_events
self._basic_recv(timeout)
SystemError: returned a result with an error set
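The first traceback ends in asynpool.send_job, which pickles the job before handing it to a pool process, so the JSON task serializer is not what fails at that point. The following sketch reproduces the same class of error with the standard library only. That the message body arrives as a memoryview is an assumption inferred from the error text, and the payload value is made up:

```python
import pickle

# Made-up payload standing in for a message body delivered as a memoryview.
payload = memoryview(b'{"args": [2, 2]}')

try:
    pickle.dumps(payload)
except TypeError as exc:
    # memoryview objects cannot be pickled; the exact wording varies by Python version.
    error = str(exc)
else:
    error = None

# Copying the buffer into a bytes object makes it picklable.
restored = pickle.loads(pickle.dumps(bytes(payload)))

print(error)
print(restored)
```

This only illustrates why the pickling step can fail; it is not a patch for librabbitmq or Celery.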
This library is not recommended for use as the RabbitMQ transport with Celery. Please try py-amqp instead; it is better maintained and less buggy.
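Switching transports usually comes down to changing the scheme of the broker URL. A minimal sketch, assuming the broker URL is held in a string: force_pyamqp is a hypothetical helper and the guest/localhost URL is a placeholder. The pyamqp:// scheme selects the py-amqp transport explicitly, whereas a plain amqp:// URL may pick librabbitmq when that package is installed.

```python
def force_pyamqp(broker_url):
    """Return broker_url with an amqp or librabbitmq scheme replaced by pyamqp."""
    scheme, separator, rest = broker_url.partition("://")
    if separator and scheme in ("amqp", "librabbitmq"):
        return "pyamqp://" + rest
    # Any other scheme (or a string without a scheme) is returned unchanged.
    return broker_url


# Example use with Celery (placeholder URL, not taken from the post):
#   from celery import Celery
#   app = Celery("tasks", broker=force_pyamqp("amqp://guest@localhost//"))
print(force_pyamqp("amqp://guest@localhost//"))
```

Uninstalling librabbitmq from the virtualenv would presumably have a similar effect for amqp:// URLs, but naming the transport explicitly keeps the choice visible in the configuration.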

"import matlab.engine" works on linux command line but not in Spyder

Matlab engine for python (r2016a) appears to be installed and working with python. I can do the following from a bash prompt:
$ python
Python 3.4.5 |Anaconda 4.3.1 (64-bit)| (default, Jul 2 2016, 17:47:47)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import matlab.engine
>>> eng = matlab.engine.start_matlab()
>>> eng.abs(-1)
1
>>> exit()
Next I start Spyder (typing "spyder &" from the same bash prompt) and this is what I get trying the same thing from within Spyder:
Python 3.4.5 |Anaconda 4.3.1 (64-bit)| (default, Jul 2 2016, 17:47:47)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import matlab.engine
Traceback (most recent call last):
File "/home/XXX/anaconda3/envs/mr2/lib/python3.4/site-packages/matlab/engine/__init__.py", line 42, in <module>
pythonengine = importlib.import_module("matlabengineforpython"+_PYTHONVERSION)
File "/home/XXX/anaconda3/envs/mr2/lib/python3.4/importlib/__init__.py", line 109, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 2254, in _gcd_import
File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
File "<frozen importlib._bootstrap>", line 2224, in _find_and_load_unlocked
ImportError: No module named 'matlabengineforpython3_4'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/XXX/anaconda3/envs/mr2/lib/python3.4/site-packages/matlab/engine/__init__.py", line 58, in <module>
pythonengine = importlib.import_module("matlabengineforpython"+_PYTHONVERSION)
File "/home/XXX/anaconda3/envs/mr2/lib/python3.4/importlib/__init__.py", line 109, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 2254, in _gcd_import
File "<frozen importlib._bootstrap>", line 2237, in _find_and_load
File "<frozen importlib._bootstrap>", line 2226, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1191, in _load_unlocked
File "<frozen importlib._bootstrap>", line 1161, in _load_backward_compatible
File "<frozen importlib._bootstrap>", line 539, in _check_name_wrapper
File "<frozen importlib._bootstrap>", line 1715, in load_module
File "<frozen importlib._bootstrap>", line 321, in _call_with_frames_removed
ImportError: /opt/local/matlab2016a/extern/engines/python/dist/matlab/engine/glnxa64/../../../../../../../bin/glnxa64/libicuio.so.54: undefined symbol: _ZN6icu_5413UnicodeString9doReplaceEiiPKDsii
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/XXX/anaconda3/envs/mr2/lib/python3.4/site-packages/matlab/engine/__init__.py", line 61, in <module>
'MathWorks Technical Support for assistance: %s' % e)
OSError: Please reinstall MATLAB Engine for Python or contact MathWorks Technical Support for assistance: /opt/local/matlab2016a/extern/engines/python/dist/matlab/engine/glnxa64/../../../../../../../bin/glnxa64/libicuio.so.54: undefined symbol: _ZN6icu_5413UnicodeString9doReplaceEiiPKDsii
>>>
Using IPython instead of python gives similar results but with a less informative error. It's clear that Spyder can't find the module matlabengineforpython3_4 but I don't know where to go from there.
How can I get the MATLAB engine to work correctly from within spyder?
This issue might be due to an incompatibility between the libstdc++ shipped with MATLAB and the system libstdc++ that Spyder was linked against.
A "CXXABI_1.3.9 not found" error message indicates that the libstdc++.so.6 library included with MATLAB is missing ABI versions the system needs to draw graphical content.
This happens because the version of this library packaged with MATLAB is older than the versions that come with newer Linux operating systems, which causes compatibility issues.
You can try the following workarounds:
Renaming the libstdc++.so.6 library file so that MATLAB cannot find it and is forced to use the system's version of the library. This file is located in matlabroot/sys/os/glnxa64 . Renaming it to libstdc++.so.6.old should suffice. (where "matlabroot" is root installation directory of MATLAB)
Forcing MATLAB to load the system version by setting the $LD_PRELOAD environment variable to the newer version of libstdc++.
Also, since you are trying to call an external program from MATLAB, this may be a third-party issue. You can still try setting the library paths correctly in the LD_LIBRARY_PATH environment variable. To locate the copies of the library on the system:
find / -name "libstdc++.so*"
I found that launching MATLAB with the command LD_PRELOAD=/usr/lib64/libstdc++.so.6 matlab -desktop seems to solve the issue.
To avoid typing this command every time, I also added an alias to my .bashrc file: alias matlab="LD_PRELOAD=/usr/lib64/libstdc++.so.6 /usr/local/bin/matlab -desktop"
https://cn.mathworks.com/matlabcentral/answers/329796-issue-with-libstdc-so-6
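The same LD_PRELOAD idea can be applied when starting a program from Python. A minimal sketch, assuming the library path from the answer above (/usr/lib64/libstdc++.so.6) exists on the machine; env_with_preload is a hypothetical helper. Whether preloading libstdc++ also resolves the libicuio symbol error shown in the question is not confirmed here.

```python
import os


def env_with_preload(library_path, base_env=None):
    """Return a copy of the environment with library_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # LD_PRELOAD entries may be separated by colons.
    env["LD_PRELOAD"] = library_path + ":" + existing if existing else library_path
    return env


# Example use (assumes a `spyder` executable on PATH):
#   import subprocess
#   subprocess.Popen(["spyder"], env=env_with_preload("/usr/lib64/libstdc++.so.6"))
print(env_with_preload("/usr/lib64/libstdc++.so.6", {})["LD_PRELOAD"])
```

Comparing os.environ between the working bash session and the Spyder console may also show which library-related variables differ.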

google-cloud-storage library from nosetests using testbed

I have google-cloud-storage pip installed into a lib directory and vendored in. It's running just fine locally during development of my python appengine app. However, when trying to run unit tests via nose and testbed I'm getting "The 'google-cloud-core' distribution was not found and is required by the application". Here is the stack:
Traceback (most recent call last):
File "/Users/jason/dev/gain-data/venv/lib/python2.7/site-packages/nose/loader.py", line 418, in loadTestsFromName
addr.filename, addr.module)
File "/Users/jason/dev/gain-data/venv/lib/python2.7/site-packages/nose/importer.py", line 47, in importFromPath
return self.importFromDir(dir_path, fqname)
File "/Users/jason/dev/gain-data/venv/lib/python2.7/site-packages/nose/importer.py", line 94, in importFromDir
mod = load_module(part_fqname, fh, filename, desc)
File "/Users/jason/dev/gain-data/data/storage/__init__.py", line 4, in <module>
from google.cloud.storage import Blob, Client
File "/Users/jason/dev/gain-data/lib/google/cloud/storage/__init__.py", line 42, in <module>
from google.cloud.storage.batch import Batch
File "/Users/jason/dev/gain-data/lib/google/cloud/storage/batch.py", line 30, in <module>
from google.cloud.storage.connection import Connection
File "/Users/jason/dev/gain-data/lib/google/cloud/storage/connection.py", line 17, in <module>
from google.cloud import connection as base_connection
File "/Users/jason/dev/gain-data/lib/google/cloud/connection.py", line 31, in <module>
get_distribution('google-cloud-core').version)
File "/Users/jason/dev/gain-data/venv/lib/python2.7/site-packages/pkg_resources/__init__.py", line 557, in get_distribution
dist = get_provider(dist)
File "/Users/jason/dev/gain-data/venv/lib/python2.7/site-packages/pkg_resources/__init__.py", line 431, in get_provider
return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
File "/Users/jason/dev/gain-data/venv/lib/python2.7/site-packages/pkg_resources/__init__.py", line 968, in require
needed = self.resolve(parse_requirements(requirements))
File "/Users/jason/dev/gain-data/venv/lib/python2.7/site-packages/pkg_resources/__init__.py", line 854, in resolve
raise DistributionNotFound(req, requirers)
DistributionNotFound: The 'google-cloud-core' distribution was not found and is required by the application
Any thoughts?
I had the same issue with google-cloud-translate, I was forced to also install the package "globally", i.e. pip install google-cloud-translate.
After struggling a lot with this same issue, I found out that the error occurred because the vendored pip lib directory wasn't on the PYTHONPATH when nosetests was called.
Try adding the vendor lib to the PYTHONPATH and then running the tests (the snippet below is written as a Makefile recipe, hence $(HOME) and $$PYTHONPATH):
export PYTHONPATH="$(HOME)/Projects/myproject/pip_lib:$$PYTHONPATH"; \
nosetests .
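An alternative to exporting PYTHONPATH is to put the vendored directory on sys.path from the test package itself. A minimal sketch under assumptions: add_vendor_dir is a hypothetical helper and "lib" stands for the vendored directory from the question. pkg_resources scans sys.path when it is first imported, so if it is already loaded (nose may import it early), the directory is also registered with its working set:

```python
import os
import sys


def add_vendor_dir(path):
    """Put a vendored package directory at the front of sys.path and return it."""
    path = os.path.abspath(path)
    if path not in sys.path:
        sys.path.insert(0, path)
    # If pkg_resources was imported earlier, tell it about the new entry as well
    # so that get_distribution() can see the vendored distributions.
    pkg_resources = sys.modules.get("pkg_resources")
    if pkg_resources is not None:
        pkg_resources.working_set.add_entry(path)
    return path


# Example use near the top of a test package's __init__.py
# ("lib" is the vendored directory name used in the question):
#   add_vendor_dir("lib")
```

This has not been checked against the App Engine testbed; setting PYTHONPATH as described above remains the approach reported to work.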

No handlers could be found for logger "mongo_connector.util"

I always get the error below when I try to run mongo-connector with the neo4j doc manager. I also tried with a config file like the one at https://github.com/mongodb-labs/mongo-connector/blob/master/config.json
What is the problem?
mongo-connector -m localhost:27017 -t http://localhost:7474/db/data -d neo4j_doc_manager
No handlers could be found for logger "mongo_connector.util"
Traceback (most recent call last):
File "/usr/bin/mongo-connector", line 11, in <module>
sys.exit(main())
File "/usr/lib/python2.6/site-packages/mongo_connector/util.py", line 85, in wrapped
func(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/mongo_connector/connector.py", line 1041, in main
conf.parse_args()
File "/usr/lib/python2.6/site-packages/mongo_connector/config.py", line 118, in parse_args
option, dict((k, values.get(k)) for k in option.cli_names))
File "/usr/lib/python2.6/site-packages/mongo_connector/connector.py", line 824, in apply_doc_managers
module = import_dm_by_name(dm['docManager'])
File "/usr/lib/python2.6/site-packages/mongo_connector/connector.py", line 803, in import_dm_by_name
module = __import__(full_name, fromlist=(name,))
File "/usr/lib/python2.6/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 16, in <module>
from py2neo import Graph, authenticate
File "/usr/lib/python2.6/site-packages/py2neo/__init__.py", line 28, in <module>
from py2neo.database import *
File "/usr/lib/python2.6/site-packages/py2neo/database/__init__.py", line 65
parameters = {k: v for k, v in parameters.items() if k not in presub_parameters}
It looks like you are using Python 2.6. I'm not sure if that version is officially supported for this project. I would suggest upgrading to Python 2.7 or preferably 3.4 and trying to reproduce.
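The traceback stops at a dict comprehension in py2neo, which is syntax that Python 2.6 does not accept (it was added in Python 2.7), and that is consistent with the diagnosis above. A small sketch of the difference, using made-up values for parameters and presub_parameters:

```python
# Made-up values; only the names come from the traceback line.
parameters = {"name": "Alice", "age": 30, "limit": 5}
presub_parameters = set(["limit"])

# Dict comprehension: a syntax error on Python 2.6, fine on 2.7 and 3.x.
filtered_new = {k: v for k, v in parameters.items() if k not in presub_parameters}

# Equivalent form that Python 2.6 can also parse.
filtered_old = dict((k, v) for k, v in parameters.items() if k not in presub_parameters)

print(filtered_new == filtered_old)
```

Since the comprehension lives inside the installed py2neo package, upgrading the interpreter is the practical fix; rewriting library code for 2.6 is shown here only to explain the error.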