Celery task subprocess stdout to log - subprocess

I have a celery task which calls other python script external to Django application with subprocess. This program have some print's in it, and I want to have these print's in my celery log file or in my database. When I set CELERY_ALWAYS_EAGER = True in Django settings.py file, everything works fine. If I don't set this setting, celery task log subprocess stdout only when it exit. It seems like p.stdout.readline() is blocking.
run-test.py is a long process, couple of minutes, but it print what it's doing. I want to capture this.
#shared_task
def run_tests(scenario_path, vu):
basedir = os.path.abspath(os.path.dirname(__file__))
config_path = '%s/../../scripts/config.ini' % basedir
cmd = ['python', '%s/../../scripts/aws/run-test.py' % basedir, '%s' % config_path, scenario_path, str(vu), str(2)]
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
while True:
line = p.stdout.readline()
if line != '':
logger.info(line)
else:
return

I found this to be very useful, using select for polling instead of blocking on readline.
https://gist.github.com/bgreenlee/1402841
child = subprocess.Popen(popenargs, stdout=subprocess.PIPE,
stderr=subprocess.PIPE, **kwargs)
log_level = {child.stdout: stdout_log_level,
child.stderr: stderr_log_level}
def check_io():
ready_to_read = select.select([child.stdout, child.stderr], [], [], 1000)[0]
for io in ready_to_read:
line = io.readline()
logger.log(log_level[io], line[:-1])
# keep checking stdout/stderr until the child exits
while child.poll() is None:
check_io()
check_io() # check again to catch anything after the process exits

Related

Output file not created when running a R command in a Nextflow file?

I am trying to run a nextflow pipeline but the output file is not created.
The main.nf file looks like this:
#!/usr/bin/env nextflow
nextflow.enable.dsl=2
process my_script {
"""
Rscript script.R
"""
}
workflow {
my_script
}
In my nextflow.config I have:
process {
executor = 'k8s'
container = 'rocker/r-ver:4.1.3'
}
The script.R looks like this:
FUN <- readRDS("function.rds");
input = readRDS("input.rds");
output = FUN(
singleCell_data_input = input[[1]], savePath = input[[2]], tmpDirGC = input[[3]]
);
saveRDS(output, "output.rds")
After running nextflow run main.nf the output.rds is not created
Nextflow processes are run independently and isolated from each other from inside the working directory. For your script to be able to find the required input files, these must be localized inside the process working directory. This should be done by defining an input block and declaring the files using the path qualifier, for example:
params.function_rds = './function.rds'
params.input_rds = './input.rds'
process my_script {
input:
path my_function_rds
path my_input_rds
output:
path "output.rds"
"""
#!/usr/bin/env Rscript
FUN <- readRDS("${my_function_rds}");
input = readRDS("${my_input_rds}");
output = FUN(
singleCell_data_input=input[[1]], savePath=input[[2]], tmpDirGC=input[[3]]
);
saveRDS(output, "output.rds")
"""
}
workflow {
function_rds = file( params.function_rds )
input_rds = file( params.input_rds )
my_script( function_rds, input_rds )
my_script.out.view()
}
In the same way, the script itself would need to be localized inside the process working directory. To avoid specifying an absolute path to your R script (which would not make your workflow portable at all), it's possible to simply embed your code, making sure to specify the Rscript shebang. This works because process scripts are not limited to Bash1.
Another way, would be to make your Rscript executable and move it into a directory called bin in the the root directory of your project repository (i.e. the same directory as your 'main.nf' Nextflow script). Nextflow automatically adds this folder to the $PATH environment variable and your script would become automatically accessible to each of your pipeline processes. For this to work, you'd need some way to pass in the input files as command line arguments. For example:
params.function_rds = './function.rds'
params.input_rds = './input.rds'
process my_script {
input:
path my_function_rds
path my_input_rds
output:
path "output.rds"
"""
script.R "${my_function_rds}" "${my_input_rds}" output.rds
"""
}
workflow {
function_rds = file( params.function_rds )
input_rds = file( params.input_rds )
my_script( function_rds, input_rds )
my_script.out.view()
}
And your R script might look like:
#!/usr/bin/env Rscript
args <- commandArgs(trailingOnly = TRUE)
FUN <- readRDS(args[1]);
input = readRDS(args[2]);
output = FUN(
singleCell_data_input=input[[1]], savePath=input[[2]], tmpDirGC=input[[3]]
);
saveRDS(output, args[3])

Within a gimp python-fu plug-in can one create/invoke a modal dialog (and/or register a procedure that is ONLY to be added as a temp procedure?)

I am trying to add a procedure to pop-up a modal dialog inside a plug-in.
Its purpose is to query a response at designated steps within the control-flow of the plug-in (not just acquire parameters at its start).
I have tried using gtk - I get a dialog but it is asynchronous - the plugin continues execution. It needs to operate as a synchronous function.
I have tried registering a plugin in order to take advantage of the gimpfu start-up dialogue for same. By itself, it works; it shows up in the procedural db when queried. But I never seem to be able to actually invoke it from within another plug-in - its either an execution error or wrong number of arguments no matter how many permutations I try.
[Reason behind all of this nonsense: I have written a lot of extension Python scripts for PaintShopPro. I have written a App package (with App.Do, App.Constants, Environment and the like that lets me begin to port those scripts to GIMP -- yes it is perverse, and yes sometimes the code just has to be rewritten, but for a lot of what I actual use in the PSP.API it is sufficient.
However, debugging and writing the module rhymes with witch. So. I am trying to add emulation of psp's "SetExecutionMode" (ie interactive). If
set, the intended behavior is that the App.Do() method will "pause" after/before it runs the applicable psp emulation code by popping up a simple message dialog.]
A simple modal dialogue within a gimp python-fu plug-in can be implemented via gtk's Dialog interface, specifically gtk.MessageDialog.
A generic dialog can be created via
queryDialogue = gtk.MessageDialog(None, gtk.DIALOG_DESTROY_WITH_PARENT \
gtk.MESSAGE_QUESTION, \
gtk.BUTTONS_OK_CANCEL, "")
Once the dialog has been shown,
a synchronous response may be obtained from it
queryDialogue.show()
response = queryDialogue.run()
queryDialogue.hide()
The above assumes that the dialog is not created and thence destroyed after each use.
In the use case (mentioned in the question) of a modal dialog to manage single stepping through a pspScript in gimp via an App emulator package, the dialogue message contents need to be customized for each use. [Hence, the "" for the message argument in the Constructor. [more below]]
In addition, the emulator must be able to accept a [cancel] response to 'get out of Dodge' - ie quit the entire plug-in (gracefully). I could not find a gimpfu interface for the latter, (and do not want to kill the app entirely via gimp.exit()). Hence, this is accomplished by raising a custom Exception class [appTerminate] within the App pkg and catching the exception in the outer-most scope of the plugin. When caught, then, the plug-in returns (exits).[App.Do() can not return a value to indicate continue/exit/etc, because the pspScripts are to be included verbatim.]
The following is an abbreviated skeleton of the solution -
a plug-in incorporating (in part) a pspScript
the App.py pkg supplying the environment and App.Do() to support the pspScript
a Map.py pkg supporting how pspScripts use dot-notation for parameters
App.py demonstrates creation, customization and use of a modal dialog - App.doContinue() displays the dialogue illustrating how it can be customized on each use.
App._parse() parses the pspScript (excerpt showing how it determines to start/stop single-step via the dialogue)
App._exec() implements the pspScript commands (excerpt showing how it creates the dialogue, identifies the message widget for later customization, and starts/stops its use)
# App.py (abbreviated)
#
import gimp
import gtk
import Map # see https://stackoverflow.com/questions/2352181/how-to- use-a-dot-to-access-members-of-dictionary
from Map import *
pdb = gimp.pdb
isDialogueAvailable = False
queryDialogue = None
queryMessage = None
Environment = Map({'executionMode' : 1 })
_AutoActionMode = Map({'Match' : 0})
_ExecutionMode = Map({'Default' : 0}, Silent=1, Interactive=2)
Constants = Map({'AutoActionMode' : _AutoActionMode}, ExecutionMode=_ExecutionMode ) # etc...
class appTerminate(Exception): pass
def Do(eNvironment, procedureName, options = {}):
global appTerminate
img = gimp.image_list()[0]
lyr = pdb.gimp_image_get_active_layer(img)
parsed = _parse(img, lyr, procedureName, options)
if eNvironment.executionMode == Constants.ExecutionMode.Interactive:
resp = doContinue(procedureName, parsed.detail)
if resp == -5: # OK
print procedureName # log to stdout
if parsed.valid:
if parsed.isvalid:
_exec(img, lyr, procedureName, options, parsed, eNvironment)
else:
print "invalid args"
else:
print "invalid procedure"
elif resp == -6: # CANCEL
raise appTerminate, "script cancelled"
pass # terminate plugin
else:
print procedureName + " skipped"
pass # skip execution, continue
else:
_exec(img, lyr, procedureName, options, parsed, eNvironment)
return
def doContinue(procedureName, details):
global queryMessage, querySkip, queryDialogue
# - customize the dialog -
if details == "":
msg = "About to execute procedure \n "+procedureName+ "\n\nContinue?"
else:
msg = "About to execute procedure \n "+procedureName+ "\n\nDetails - \n" + details +"\n\nContinue?"
queryMessage.set_text(msg)
queryDialogue.show()
resp = queryDialogue.run() # get modal response
queryDialogue.hide()
return resp
def _parse(img, lyr, procedureName, options):
# validate and interpret App.Do options' semantics vz gimp
if procedureName == "Selection":
isValid=True
# ...
# parsed = Map({'valid' : True}, isvalid=True, start=Start, width=Width, height=Height, channelOP=ChannelOP ...
# /Selection
# ...
elif procedureName == "SetExecutionMode":
generalOptions = options['GeneralSettings']
newMode = generalOptions['ExecutionMode']
if newMode == Constants.ExecutionMode.Interactive:
msg = "set mode interactive/single-step"
else:
msg = "set mode silent/run"
parsed = Map({'valid' : True}, isvalid=True, detail=msg, mode=newMode)
# /SetExecutionMode
else:
parsed = Map({'valid' : False})
return parsed
def _exec(img, lyr, procedureName, options, o, eNvironment):
global isDialogueAvailable, queryMessage, queryDialogue
#
try:
# -------------------------------------------------------------------------------------------------------------------
if procedureName == "Selection":
# pdb.gimp_rect_select(img, o.start[0], o.start[1], o.width, o.height, o.channelOP, ...
# /Selection
# ...
elif procedureName == "SetExecutionMode":
generalOptions = options['GeneralSettings']
eNvironment.executionMode = generalOptions['ExecutionMode']
if eNvironment.executionMode == Constants.ExecutionMode.Interactive:
if isDialogueAvailable:
queryDialogue.destroy() # then clean-up and refresh
isDialogueAvailable = True
queryDialogue = gtk.MessageDialog(None, gtk.DIALOG_DESTROY_WITH_PARENT, gtk.MESSAGE_QUESTION, gtk.BUTTONS_OK_CANCEL, "")
queryDialogue.set_title("psp/APP.Do Emulator")
queryDialogue.set_size_request(450, 180)
aqdContent = queryDialogue.children()[0]
aqdHeader = aqdContent.children()[0]
aqdMsgBox = aqdHeader.children()[1]
aqdMessage = aqdMsgBox.children()[0]
queryMessage = aqdMessage
else:
if isDialogueAvailable:
queryDialogue.destroy()
isDialogueAvailable = False
# /SetExecutionMode
else: # should not get here (should have been screened by parse)
raise AssertionError, "unimplemented PSP procedure: " + procedureName
except:
raise AssertionError, "App.Do("+procedureName+") generated an exception:\n" + sys.exc_info()
return
A skeleton of the plug-in itself. This illustrates incorporating a pspScript which includes a request for single-step/interactive execution mode, and thus the dialogues. It catches the terminate exception raised via the dialogue, and then terminates.
def generateWebImageSet(dasImage, dasLayer, title, mode):
try:
img = dasImage.duplicate()
# ...
bkg = img.layers[-1]
frameWidth = 52
start = bkg.offsets
end = (start[0]+bkg.width, start[1]+frameWidth)
# pspScript: (snippet included verbatim)
# SetExecutionMode / begin interactive single-step through pspScript
App.Do( Environment, 'SetExecutionMode', {
'GeneralSettings': {
'ExecutionMode': App.Constants.ExecutionMode.Interactive
}
})
# Selection
App.Do( Environment, 'Selection', {
'General' : {
'Mode' : 'Replace',
'Antialias' : False,
'Feather' : 0
},
'Start': start,
'End': end
})
# Promote
App.Do( Environment, 'SelectPromote' )
# und_so_weiter ...
except App.appTerminate:
raise AssertionError, "script cancelled"
# /generateWebImageSet
# _generateFloatingCanvasSetWeb.register -----------------------------------------
#
def generateFloatingCanvasSetWeb(dasImage, dasLayer, title):
mode="FCSW"
generateWebImageSet(dasImage, dasLayer, title, mode)
register(
"generateFloatingCanvasSetWeb",
"Generate Floating- Frame GW Canvas Image Set for Web Page",
"Generate Floating- Frame GW Canvas Image Set for Web Page",
"C G",
"C G",
"2019",
"<Image>/Image/Generate Web Imagesets/Floating-Frame Gallery-Wrapped Canvas Imageset...",
"*",
[
( PF_STRING, "title", "title", "")
],
[],
generateFloatingCanvasSetWeb)
main()
I realize that this may seem like a lot of work just to be able to include some pspScripts in a gimp plug-in, and to be able to single-step through the emulation. But we are talking about maybe 10K lines of scripts (and multiple scripts).
However, if any of this helps anyone else with dialogues inside plug-ins, etc., so much the better.

An error in my code to be a simple ftp

I met an error when running codes at the bottom. It's like a simple ftp.
I use python2.6.6 and CentOS release 6.8
In most linux server, it gets right results like this:(I'm very sorry that I have just sign up and couldn't )
Clinet:
[root#Test ftp]# python client.py
path:put|/home/aaa.txt
Server:
[root#Test ftp]# python server.py
connected...
pre_data:put|aaa.txt|4
cmd: put
file_name: aaa.txt
file_size: 4
upload successed.
But I get errors in some server(such as my own VM in my PC). I have done lots of tests(python2.6/python2.7, Centos6.5/Centos6.7) and found this error is not because them. Here is the error imformation:
[root#Lewis-VM ftp]# python server.py
connected...
pre_data:put|aaa.txt|7sdfsdf ###Here gets the wrong result, "sdfsdf" is the content of /home/aaa.txt and it shouldn't be sent here to 'file_size' and so it cause the "ValueError" below
cmd: put
file_name: aaa.txt
file_size: 7sdfsdf
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 10699)
Traceback (most recent call last):
File "/usr/lib64/python2.6/SocketServer.py", line 570, in process_request_thread
self.finish_request(request, client_address)
File "/usr/lib64/python2.6/SocketServer.py", line 332, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib64/python2.6/SocketServer.py", line 627, in __init__
self.handle()
File "server.py", line 30, in handle
if int(file_size)>recv_size:
ValueError: invalid literal for int() with base 10: '7sdfsdf\n'
What's more, I found that if I insert a time.sleep(1) between sk.send(cmd+"|"+file_name+'|'+str(file_size)) and sk.send(data) in client.py, the error will disappear. I have said that I did tests in different system and python versions and the error is not because them. So I guess that is it because of some system configs? I have check about socket.send() and socket.recv() in python.org but fount nothing helpful. So could somebody help me to explain why this happend?
The code are here:
#!/usr/bin/env python
#coding:utf-8
################
#This is server#
################
import SocketServer
import os
class MyServer(SocketServer.BaseRequestHandler):
def handle(self):
base_path = '/home/ftp/file'
conn = self.request
print 'connected...'
while True:
#####receive pre_data: we should get data like 'put|/home/aaa|7'
pre_data = conn.recv(1024)
print 'pre_data:' + pre_data
cmd,file_name,file_size = pre_data.split('|')
print 'cmd: ' + cmd
print 'file_name: '+ file_name
print 'file_size: '+ file_size
recv_size = 0
file_dir = os.path.join(base_path,file_name)
f = file(file_dir,'wb')
Flag = True
####receive 1024bytes each time
while Flag:
if int(file_size)>recv_size:
data = conn.recv(1024)
recv_size+=len(data)
else:
recv_size = 0
Flag = False
continue
f.write(data)
print 'upload successed.'
f.close()
instance = SocketServer.ThreadingTCPServer(('127.0.0.1',9999),MyServer)
instance.serve_forever()
#!/usr/bin/env python
#coding:utf-8
################
#This is client#
################
import socket
import sys
import os
ip_port = ('127.0.0.1',9999)
sk = socket.socket()
sk.connect(ip_port)
while True:
input = raw_input('path:')
#####we should input like 'put|/home/aaa.txt'
cmd,path = input.split('|')
file_name = os.path.basename(path)
file_size=os.stat(path).st_size
sk.send(cmd+"|"+file_name+'|'+str(file_size))
send_size = 0
f= file(path,'rb')
Flag = True
#####read 1024 bytes and send it to server each time
while Flag:
if send_size + 1024 >file_size:
data = f.read(file_size-send_size)
Flag = False
else:
data = f.read(1024)
send_size+=1024
sk.send(data)
f.close()
sk.close()
The TCP is a stream of data. That is the problem. TCP do not need to keep message boundaries. So when a client calls something like
connection.send("0123456789")
connection.send("ABCDEFGHIJ")
then a naive server like
while True;
data = conn.recv(1024)
print data + "_"
may print any of:
0123456789_ABCDEFGHIJ_
0123456789ABCDEFGHIJ_
0_1_2_3_4_5_6_7_8_9_A_B_C_D_E_F_G_H_I_J_
The server has no chance to recognize how many sends client called because the TCP stack at client side just inserted data to a stream and the server must be able to process the data received in different number of buffers than the client used.
Your server must contain a logic to separate the header and the data. All of application protocols based on TCP use a mechanism to identify application level boundaries. For example HTTP separates headers and body by an empty line and it informs about the body length in a separate header.
Your program works correctly when server receives a header with the command, name and size in a separate buffer it it fails when client is fast enough and push the data into stream quickly and the server reads header and data in one chunk.

ERROR: Uncaught exception in luigi (TypeError: must be string or buffer, not None)

I am having trouble while calling /triggering Luigi Task from a python code.
Basically i need to trigger a luigi task just like we do on command line, but from a python code
I am using supbrocess.popen to call a luigi task using a shell
command
I have a test code named as test.py and have a test class in module
task_scheduler.py which contains my luigi task (both modules in same location/dir)
import luigi
class TestClass(luigi.Task):
# param = luigi.DictParameter(default=dict())
def requires(self):
print "I am TestClass req"
def run(self):
with open('myfile.txt', 'w') as f:
f.write("asasasas")
print "I am TestClass run"
import subprocess
p = subprocess.Popen("python -m luigi --module task_scheduler TestClass", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
print p.pid
(output, err) = p.communicate()
print "-------------O/P-------------"
print output
print "-------------error-------------"
print err
But I am getting the error as
52688
-------------O/P-------------
-------------error-------------
ERROR: Uncaught exception in luigi
Traceback (most recent call last):
File "/Library/Python/2.7/site-packages/luigi/retcodes.py", line 61, in run_with_retcodes
worker = luigi.interface._run(argv)['worker']
File "/Library/Python/2.7/site-packages/luigi/interface.py", line 238, in _run
return _schedule_and_run([cp.get_task_obj()], worker_scheduler_factory)
File "/Library/Python/2.7/site-packages/luigi/interface.py", line 172, in _schedule_and_run
not(lock.acquire_for(env_params.lock_pid_dir, env_params.lock_size, kill_signal))):
File "/Library/Python/2.7/site-packages/luigi/lock.py", line 82, in acquire_for
my_pid, my_cmd, pid_file = get_info(pid_dir)
File "/Library/Python/2.7/site-packages/luigi/lock.py", line 67, in get_info
pid_file = os.path.join(pid_dir, hashlib.md5(cmd_hash).hexdigest()) + '.pid'
TypeError: must be string or buffer, not None
Can anyone please suggest me what I am doing wrong here?
The command "python -m luigi --module task_scheduler TestClass" works perfectly if I use shell prompt
I just tried executing the test.py from command line.
It seems when I was using the IDE to run (PyCharm) then only it gives this issue
This is fixed in version 2.2.0. Check github issue #1654

django celery queue routing won't work

I have two tasks. the task "heavy_task" need a concurrency of 1 and the "lite_task" need a concurrency of 4
#task
def lite_task():
tabla = Proc_Carga()
sp = tabla.carga()
return None
#task()
def heavy_task(idprov,pfecha):
conci = Buscar_Conci()
spconc = conci.buscarcon(idprov,pfecha)
return None
I define the routes in my settings.py file:
BROKER_URL = 'redis://localhost:6379/0'
CELERY_IMPORTS = ("pc.tasks", )
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_RESULT_BACKEND='djcelery.backends.cache:CacheBackend'
CELERY_ROUTES = {"tasks.heavy_task": {"queue": "heavy"},"tasks.lite_task": {"queue": "lite"}}
I try to excute two workers specifying the concurrency in this way
celery multi start heavy lite -A provcon -c:heavy 1 -c:lite 3
When first call the tasks heavy_task work fine and the concurrency works,
but after call the task lite_task the concurency for the queue heavy change.
I try this:
celery -A provcon worker -Q heavy -c 1
And when I execute the task heavy_task, the routing won't work and the task is not executed.
but if use this:
celery -A provcon -c 1
everything works fine, but I can only execute one task at time, and I need to be able to execute the heavy_task with a concurrency of 1 and the lite_task with a concurrency of 3
Any advice
I try different settings to make the queues works and finally I did it.
In the tasks.py file I harcoded the the queue in the task decorator
#task(queue = 'heavy')
To run the workers I use this:
celery multi start lite_w heavy_w -A provcon -Q:heavy_w heavy -Q:carga_w lite -l info -c:heavy_w 1 -c:lite_w 3 -E
And i delete the routing settings from the settings.py file. My settings are this:
BROKER_URL = 'redis://localhost:6379/0'
CELERY_IMPORTS = ("pc.tasks", )
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_RESULT_BACKEND='djcelery.backends.cache:CacheBackend'