Scrapy crawl on crontab under virtual environment - virtualenv

I am trying to run a scrapy crawl command from crontab inside a virtual environment, and I get the error below when the cron job runs the scrapy command:
UserWarning: Cannot import scrapy settings module myspider.settings
  warnings.warn("Cannot import scrapy settings module %s" % scrapy_module)
.....
raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: myspider'
Any help or suggestions?

PYTHONPATH was missing while running under crontab.
I added it before the following cron job:
*/40 * * * * source /home/water/.virtualenvs/water/bin/activate && cd $HOME/water2012/ && scrapy crawl water2012 >> $HOME/water2012/log/log_$(date +\%Y\%m\%d).log 2>&1
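If the warning still appears, a quick way to confirm it is a module-path problem is to check that the settings module resolves once the project directory is on the path. A minimal sketch, assuming $HOME is /home/water (as the virtualenv path suggests) and that the project package is myspider, as in the warning above:

# check_settings.py -- minimal sketch; adjust the project directory if it differs
import importlib
import sys

sys.path.insert(0, "/home/water/water2012")  # same effect as setting PYTHONPATH in the crontab

try:
    importlib.import_module("myspider.settings")  # the module Scrapy said it cannot import
    print("settings module found")
except ImportError as exc:
    print("still not importable: %s" % exc)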

Related

How to solve error on docker:layers_calculator to compute the Merkle tree on private tangle?

I want to set up a private tangle on my own virtual machine with Ubuntu 18.04, 4GB of RAM and 20GB of storage.
I have followed these instructions: https://docs.iota.org/docs/compass/0.1/how-to-guides/set-up-a-private-tangle. Every command works fine until I reach this one: bazel run //docker:layers_calculator.
It shows an error as follows:
Starting local Bazel server and connecting to it...
ERROR: /home/istabraq/compass/third-party/maven_deps.bzl:3:5: Traceback (most recent call last):
  File "/home/istabraq/compass/WORKSPACE", line 42
    maven_jars()
  File "/home/istabraq/compass/third-party/maven_deps.bzl", line 3, in maven_jars
    native.maven_jar(<4 more arguments>)
type 'struct' has no method maven_jar()
ERROR: error loading package '': Encountered error while reading extension file 'protobuf_deps.bzl': no such package '@com_google_protobuf_deps//': error loading package 'external': Could not load //external package
ERROR: error loading package '': Encountered error while reading extension file 'protobuf_deps.bzl': no such package '@com_google_protobuf_deps//': error loading package 'external': Could not load //external package
INFO: Elapsed time: 4.743s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
FAILED: Build did NOT complete successfully (0 packages loaded)
How can I solve this problem? What have I missed?
Read carefully the message given after running the Bazel installer:
Make sure you have "/home/yourusername/bin" in your path. You can also activate bash completion by adding the following line to your ~/.bashrc:
source /home/yourusername/.bazel/bin/bazel-complete.bash
You can check with: "bazel info" or "bazel version"
Unfortunately, there are further errors:
https://github.com/iotaledger/compass/issues/142
I have solved this issue by using these commands:
Step 3: Set up your environment
If you ran the Bazel installer with the --user flag as above, the Bazel executable is installed in your $HOME/bin directory. It’s a good idea to add this directory to your default paths, as follows:
export PATH="$PATH:$HOME/bin"
You can also add this command to your ~/.bashrc or ~/.zshrc file to make it permanent.
Reference:
https://docs.bazel.build/versions/master/install-ubuntu.html

Failed building wheel in Google ML engine

I followed this tutorial to submit a machine learning job to Google ML Engine. I first faced the error ImportError: No module named matplotlib.pyplot, which I solved by adding Matplotlib to REQUIRED_PACKAGES in setup.py. Then I faced another error: ImportError: No module named _tkinter, please install the python-tk package. I found this solution and this solution, but they do not help and give me another error.
Failed building wheel for my-package
Command "/usr/bin/python -u -c "import setuptools, tokenize;file='/tmp/pip-T6kjZl-build/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-4LpzWh-record/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-T6kjZl-build/
Command '['pip', 'install', '--user', '--upgrade', '--force-reinstall', '--no-deps', u'my-package-0.1.1.tar.gz']' returned non-zero exit status 1
My setup.py:
"""Setup script for object_detection."""
import logging
import subprocess
from setuptools import find_packages
from setuptools import setup
from setuptools.command.install import install
class CustomCommands(install):
def RunCustomCommand(self, command_list):
p = subprocess.Popen(
command_list,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
# Can use communicate(input='y\n'.encode()) if the command run requires
# some confirmation.
stdout_data, _ = p.communicate()
logging.info('Log command output: %s', stdout_data)
if p.returncode != 0:
raise RuntimeError('Command %s failed: exit code: %s', (command_list, p.returncode))
def run(self):
self.RunCustomCommand(['apt-get', 'update'])
self.RunCustomCommand(
['apt-get', 'install', '-y', 'python-tk'])
install.run(self)
REQUIRED_PACKAGES = ['Pillow>=1.0', 'Matplotlib>=2.1']
setup(
name='object_detection',
version='0.1',
install_requires=REQUIRED_PACKAGES,
include_package_data=True,
packages=[p for p in find_packages() if
p.startswith('object_detection')],
description='Tensorflow Object Detection Library',
cmdclass={
'install': CustomCommands,
})
I think the tutorial blog link that you are following is outdated. I followed this one.
You need to change the runtime version to 1.9 instead of 1.8 in the Cloud ML training command:
From tensorflow/models/research/
gcloud ml-engine jobs submit training `whoami`_object_detection_pets_`date +%m_%d_%Y_%H_%M_%S` \
--runtime-version 1.9 \
--job-dir=gs://${YOUR_GCS_BUCKET}/model_dir \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_main \
--region us-central1 \
--config object_detection/samples/cloud/cloud.yml \
-- \
--model_dir=gs://${YOUR_GCS_BUCKET}/model_dir \
--pipeline_config_path=gs://${YOUR_GCS_BUCKET}/data/faster_rcnn_resnet101_pets.config
After changing the runtime version everything worked perfectly. No need to change any setup file, and no library errors.
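As a side note on the earlier _tkinter error: that import usually comes from Matplotlib trying to load an interactive backend, and on headless training workers a non-interactive backend avoids the python-tk dependency altogether. This is a general Matplotlib pattern, not part of the tutorial; a minimal sketch:

# plot_headless.py -- sketch of rendering plots without python-tk or a display
import matplotlib
matplotlib.use('Agg')  # select the non-interactive Agg backend before importing pyplot
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
fig.savefig('plot.png')  # writes the figure to disk instead of opening a window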

'no module named setuptools' but it is contained in the DEPENDS variable

This problem regards OpenEmbedded/Yocto.
I have source code which needs to be compiled by a custom python3 script.
That means that some python3 script should run during the do_compile() process.
The script imports setuptools; therefore, I added DEPENDS += "python3-setuptools-native" to the recipe. As far as I understand the documentation, this should make the setuptools module available for the build process (native).
But when bitbake executes the do_compile() process, I get this error: no module named 'setuptools'.
Let me break it down to a minimal (non-)working example:
FILE: test.bb
LICENSE = "BSD"
LIC_FILES_CHKSUM = "file://test/LICENSE;md5=d41d8cd98f00b204e9800998ecf8427e"
DEPENDS += "python3-setuptools-native"
SRC_URI = "file://test.py \
file://LICENSE"
do_compile() {
python3 ${S}/../test.py
}
FILE: test.py
import setuptools
print("HELLO")
bitbaking:
$ bitbake test
ERROR: test-1.0-r0 do_compile: Function failed: do_compile (log file is located at /path/to/test/1.0-r0/temp/log.do_compile.8532)
ERROR: Logfile of failure stored in: /path/to/test/1.0-r0/temp/log.do_compile.8532
Log data follows:
| DEBUG: Executing shell function do_compile
| Traceback (most recent call last):
| File "/path/to/test-1.0/../test.py", line 1, in <module>
| import setuptools
| ImportError: No module named 'setuptools'
| WARNING: exit code 1 from a shell command.
| ERROR: Function failed: do_compile (log file is located at /path/to/test/1.0-r0/temp/log.do_compile.8532)
ERROR: Task (/path/to/test.bb:do_compile) failed with exit code '1'
NOTE: Tasks Summary: Attempted 400 tasks of which 398 didn't need to be rerun and 1 failed.
NOTE: Writing buildhistory
Summary: 1 task failed:
/path/to/test.bb:do_compile
Summary: There was 1 ERROR message shown, returning a non-zero exit code.
Is my expectation wrong that DEPENDS += "python3-setuptools-native" makes the python3 module 'setuptools' available to the python3 script in do_compile()? How can I accomplish this?
Under the hood, quite a bit more is needed to get working setuptools support. Luckily there's a class to handle that:
inherit setuptools3
This should be all that's needed to package a setuptools-based project with OE-Core. As long as your project has a standard setup.py, you don't need to write any do_compile() or do_install() functions.
If you do need to look at the details, meta/classes/setuptools3.bbclass and meta/classes/distutils3.bbclass should contain what you need (including the rather unobvious way to call native python from a recipe).
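For reference, the setuptools3 class expects the source tree to ship a conventional setup.py that it can drive; a minimal sketch (the names below are placeholders, not taken from the question):

# setup.py -- minimal sketch of the kind of script setuptools3 builds and installs
from setuptools import setup, find_packages

setup(
    name="test",
    version="1.0",
    packages=find_packages(),
)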

unregistered task type import errors in celery

I'm having headaches getting Celery to work with my folder structure. Note that I am using virtualenv, but it should not matter.
cive/
    celery_app.py
    __init__.py
    venv
    framework/
        tasks.py
        __init__.py
        civeAPI/
            files tasks.py needs
cive is my root project folder.
celery_app.py:
from __future__ import absolute_import
from celery import Celery

app = Celery('cive',
             broker='amqp://',
             backend='amqp://',
             include=['cive.framework.tasks'])

# Optional configuration, see the application user guide.
app.conf.update(
    CELERY_TASK_RESULT_EXPIRES=3600,
)

if __name__ == '__main__':
    app.start()
tasks.py (simplified)
from __future__ import absolute_import
# import other things
# append syspaths
from cive.celery_app import app

@app.task(ignore_result=False)
def start(X):
    # do things

def output(X):
    # output files

def main():
    for d in Ds:
        m = []
        m.append(start.delay(X))
        output([n.get() for n in m])

if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))
I then start workers via (outside the root cive dir):
celery -A cive worker --app=cive.celery_app:app -l info
which seems to work fine, loading the workers and showing
[tasks]
. cive.framework.tasks.start_sessions
But when I try to run my tasks.py via another terminal:
python tasks.py
I get the error:
Traceback (most recent call last):
  File "tasks.py", line 29, in <module>
    from cive.celery_app import app
ImportError: No module named cive.celery_app
If I rename the import to:
from celery_app import app #without the cive.celery_app
I can eventually start the script, but Celery returns an error:
Received unregistered task of type 'cive.start_sessions'
I think there's something wrong with my imports or config but I can't say what.
So this was a Python package problem, not particularly a Celery issue. I found the solution by looking at How to fix "Attempted relative import in non-package" even with __init__.py.
I'd never even thought about this before, but I wasn't running Python in package mode. The solution is cd'ing out of your root project directory, then running Python as a package (note there is no .py after tasks):
python -m cive.framework.tasks
Now when I run the celery task everything works.
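As a quick sanity check (a sketch assuming the layout above; run it from the directory above cive/ so the absolute imports resolve), you can confirm the task is registered under its fully qualified name before calling .delay():

# check_registry.py -- sketch; run from the parent directory of cive/
from cive.celery_app import app
import cive.framework.tasks  # importing the task module runs its @app.task decorators

# Task names are the dotted module path plus the function name; with the
# absolute import in tasks.py this should list names under 'cive.framework.tasks.'
print([name for name in app.tasks if name.startswith('cive.')])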

breaking on unhandled exceptions in pydev/gae

I am using PyDev to develop a Google App Engine application. I followed the steps mentioned here to configure the PyDev debugger to break on unhandled exceptions. I could get it to work on a sample PyDev project, but when I try the same steps in my PyDev GAE project, it doesn't work and gives the following error:
pydev debugger: warning: psyco not available for speedups (the debugger will still work correctly, but a bit slower)
pydev debugger: starting
...
Traceback (most recent call last):
  File "c:\program files\google\google_appengine\google\appengine\tools\dev_appserver.py", line 3858, in _HandleRequest
    self._Dispatch(dispatcher, self.rfile, outfile, env_dict)
  File "c:\program files\google\google_appengine\google\appengine\tools\dev_appserver.py", line 3792, in _Dispatch
    base_env_dict=env_dict)
  File "c:\program files\google\google_appengine\google\appengine\tools\dev_appserver.py", line 580, in Dispatch
    base_env_dict=base_env_dict)
  File "c:\program files\google\google_appengine\google\appengine\tools\dev_appserver.py", line 2918, in Dispatch
    self._module_dict)
  File "c:\program files\google\google_appengine\google\appengine\tools\dev_appserver.py", line 2822, in ExecuteCGI
    reset_modules = exec_script(handler_path, cgi_path, hook)
  File "c:\program files\google\google_appengine\google\appengine\tools\dev_appserver.py", line 2702, in ExecuteOrImportScript
    exec module_code in script_module.__dict__
  File "C:\Users\siddjain\workspace\rfad\src\main.py", line 1, in <module>
    import pydevd
ImportError: No module named pydevd
My debug configuration for the GAE project is like this:
The sample PyDev project where it works is like this, and I am following the same pattern in my GAE project:
import pydevd

def f(x, y):
    z = y / x
    return z

def main():
    pydevd.set_pm_excepthook()
    print f(0, 0)

if __name__ == '__main__':
    main()
The run config for the test project is like this:
The pydevd.py module is under C:\eclipse\plugins\org.python.pydev.debug_2.0.0.2011040403\pysrc. Although this path is not included in the PYTHONPATH for the test project, the breaking works there. I also tried including this path in the PYTHONPATH of the GAE project to see if that fixes my problem, but it didn't.
It's still not fixed, although the following steps got rid of the "No module named pydevd" error:
1. Create a symlink to C:\eclipse\plugins\org.python.pydev.debug_2.0.0.2011040403\pysrc:
src>mklink /d debugger C:\eclipse\plugins\org.python.pydev.debug_2.0.0.2011040403\pysrc
2. Put an empty __init__.py file in C:\eclipse\plugins\org.python.pydev.debug_2.0.0.2011040403\pysrc. I learned this from some Google link that I can't find now.
3. In source code:
import debugger.pydevd as pydevd
After these steps it is able to import pydevd, but it still doesn't break on uncaught exceptions.
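For completeness, after these steps the top of the GAE script looks roughly like this (a sketch; set_pm_excepthook is the call from the PyDev instructions quoted above, and debugger is the symlink from step 1):

# main.py -- sketch of the wiring after the symlink workaround above
import debugger.pydevd as pydevd  # 'debugger' is the symlink created with mklink in step 1

# ask the PyDev debugger to break post-mortem on unhandled exceptions
pydevd.set_pm_excepthook()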
The development server runs your application on your local computer for testing. The server simulates the App Engine datastore, services, and sandbox restrictions (hence step 1), although I don't understand why we need to import pydevd, since it is pydevd that is running our application in the first place!
In the Run->Debug Configurations->Interpreter tab, if I click on "see resulting command line...":
C:\Python25\python.exe -u C:\eclipse\plugins\org.python.pydev.debug_2.0.0.2011040403\pysrc\pydevd.py --vm_type python --client 127.0.0.1 --port 0 --file "c:\program files\google\google_appengine\dev_appserver.py"
The PYTHONPATH that will be used is:
C:\eclipse\plugins\org.python.pydev_2.0.0.2011040403\PySrc\pydev_si...