PytestUnhandledThreadExceptionWarning during uiautomator dump - pytest

In my pytest test I am not doing much: I just run a uiautomator dump and pull the created file to a local path, but it gives me the warning below:
copporcomm_test/access_try.py::test_unconfig_deamon_is_running_android
/usr/lib/python3/dist-packages/_pytest/threadexception.py:73: PytestUnhandledThreadExceptionWarning: Exception in thread Thread-1
...
warnings.warn(pytest.PytestUnhandledThreadExceptionWarning(msg))
When I comment out the line where I do the uiautomator dump, the warning doesn't occur.
Any clue what the problem could be?
My small test case:
def test_android(adb):
    adb_output = adb.shell(command="uiautomator dump", assert_ok=True, timeout=10)
    assert "UI hierchary dumped to" in adb_output
    adb.pull(on_device_path="/sdcard/window_dump.xml", local_path="ui.xml", timeout=10)
    print("Pulled file!")
Of course I could just suppress the warning, but I also want to understand why it happens.
After the comment from Teejay Bruno I modified the code to inspect the threads and realized that there are already 4 active threads at the start.
The printout below is from a run with the thread1 and thread2 lines commented out (in test_android()).
START: Current active thread count: 4
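For anyone debugging the same thing, a quick way to see which threads those are is to enumerate them around the adb call. This is only a minimal sketch (the test name is made up; adb is the same fixture used above):

import threading

def test_android_threads(adb):
    # The main thread plus whatever pytest/fixtures already started,
    # hence a count of 4 before the test body does anything.
    print("Before:", [t.name for t in threading.enumerate()])

    adb.shell(command="uiautomator dump", assert_ok=True, timeout=10)

    # Any extra thread listed here was spawned by the adb/uiautomator call;
    # if it raises after the test returns, pytest reports
    # PytestUnhandledThreadExceptionWarning.
    print("After:", [t.name for t in threading.enumerate()])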
Updated test case using threading:
import threading
import pytest

def thread1_sub(adb):
    print("Create xml file with current activity layout.")
    adb_output = adb.shell(command="uiautomator dump", assert_ok=True, timeout=300)
    assert "UI hierchary dumped to" in adb_output

def thread2_sub(adb):
    print("Pull created xml file to local path.")
    adb.pull(on_device_path="/sdcard/window_dump.xml", local_path="ui.xml", timeout=10)

def test_android(adb):
    thread1 = threading.Thread(target=thread1_sub, args=(adb,), name="Thread1")
    thread2 = threading.Thread(target=thread2_sub, args=(adb,), name="Thread2")
    print("START: Current active thread count: ", threading.active_count())
    thread1.start()
    thread2.start()
    thread2.join()
Now by using threading everything works just fine and all threads are handled.
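As an aside, if the goal were only to silence the warning rather than restructure the test, pytest's filterwarnings marker can do that per test. A minimal sketch (it hides the symptom, it does not fix the underlying thread exception):

import pytest

@pytest.mark.filterwarnings("ignore::pytest.PytestUnhandledThreadExceptionWarning")
def test_android(adb):
    adb_output = adb.shell(command="uiautomator dump", assert_ok=True, timeout=10)
    assert "UI hierchary dumped to" in adb_output
    adb.pull(on_device_path="/sdcard/window_dump.xml", local_path="ui.xml", timeout=10)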

Related

Logs Only Write to EMR stderr/stdout with non-zero exit code

I have a piece of code in Scala which uses log4j to do logging.
I have a few conditions upon which I want to log some information to the stderr/stdout of the EMR.
If I have a non-zero exit code it will log it:
if (args.length != 2) {
  log.setLevel(Level.ERROR)
  log.error("Expecting 2 args. Usage: ...")
  System.exit(4)
}
However if I just have regular logging it does not show:
log.setLevel(Level.INFO)
log.info("This is my log")
I thought that perhaps this was because it only logs the errors in the EMR and tried the following:
log.setLevel(Level.ERROR)
log.error("This is my log")
With no luck.
It seems the logs are only written with a non-zero exit code.
I believe they are probably being written somewhere outside the log file. Is there any way to make them write where I am expecting them to?

Retrieving Celery task kwargs from task-failed / uuid

Main Issue
I'm testing how to handle certain task failures, for example handling a 'TimeLimitExceeded' exception, which instantly kills the task and is not 'catchable' (yes, I'm aware of the existence of 'SoftTimeLimit', but it doesn't fit my needs).
First Approach
This is my tasks.py (The worker runs with a --time-limit flag):
import logging
import time

from celery import Celery

logger = logging.getLogger(__name__)

app = Celery('tasks', broker='pyamqp://guest@localhost//')

def my_fail(task, exc, req_id, req_args, req_kwargs, einfo, *ext_args, **kwargs):
    logger.info("args: %r", req_args)
    logger.info("kw: %r", req_kwargs)

@app.task(on_failure=my_fail)
def sum(x, y, delay=0, **kw):
    result = x + y
    if result == 4:
        raise Exception("Some Error")
    time.sleep(delay)
    return x + y
The main idea is, when a task fails, to be able to perform some handling based on the args/kwargs of the task.
For example if I run sum.delay(3, 1, foo="bar") the Exception("Some Error") is raised and the following is logged:
[2019-06-30 17:21:45,120: INFO/Worker-1] args: (3, 1)
[2019-06-30 17:21:45,121: INFO/Worker-1] kw: {'foo': 'bar'}
[2019-06-30 17:21:45,122: ERROR/MainProcess] Task tasks.sum[9e9de032-1469-44e7-8932-4c490fcee2e3] raised unexpected: Exception('Some Error',)
Traceback (most recent call last):
File "/home/apernin/.virtualenvs/dr/local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
R = retval = fun(*args, **kwargs)
File "/home/apernin/.virtualenvs/dr/local/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
return self.run(*args, **kwargs)
File "/home/apernin/test/tasks.py", line 89, in sum
raise Exception("Some Error")
Exception: Some Error
Note the args/kwargs are printed by my on-failure handler.
Now if I run sum.delay(3, 2, delay=7) the TimeLimit is triggered
[2019-06-30 17:23:15,244: INFO/MainProcess] Received task: tasks.sum[8c81398b-4378-401d-a674-a3bd3418ccde]
[2019-06-30 17:23:21,070: ERROR/MainProcess] Task tasks.sum[8c81398b-4378-401d-a674-a3bd3418ccde] raised unexpected: TimeLimitExceeded(5.0,)
Traceback (most recent call last):
File "/home/apernin/.virtualenvs/dr/local/lib/python2.7/site-packages/billiard/pool.py", line 645, in on_hard_timeout
raise TimeLimitExceeded(job._timeout)
TimeLimitExceeded: TimeLimitExceeded(5.0,)
[2019-06-30 17:23:21,071: ERROR/MainProcess] Hard time limit (5.0s) exceeded for tasks.sum[8c81398b-4378-401d-a674-a3bd3418ccde]
[2019-06-30 17:23:21,629: ERROR/MainProcess] Process 'Worker-1' pid:15472 exited with 'signal 15 (SIGTERM)'
Note the args/kwargs are not printed, because the on-failure handler is not executed. This is somewhat to be expected due to the nature of Celery's Hard Time Limit.
Second Approach
My second approach is to use an event listener.
from celery import Celery

def my_monitor(app):
    state = app.events.State()

    def announce_failed_tasks(event):
        state.event(event)
        # task name is sent only with -received event, and state
        # will keep track of this for us.
        task = state.tasks.get(event['uuid'])

    with app.connection() as connection:
        recv = app.events.Receiver(connection, handlers={
            'task-failed': announce_failed_tasks,
        })
        recv.capture(limit=None, timeout=None, wakeup=True)

if __name__ == '__main__':
    app = Celery(broker='amqp://guest@localhost//')
    my_monitor(app)
The only info I was able to retrieve was the task uuid; I wasn't able to retrieve the name, args, or kwargs of the task (the task object contains the attributes, but they are all None).
Question
Is there a way to either:
Make the on_failure handler run in the case of a Hard Time Limit?
Retrieve the args/kwargs of a task with a task-failed event listener?
Thanks in advance
First, the timeout is handled by the Worker (the MainProcess), and it is not treated the same as failures that happen INSIDE the task, such as exceptions being thrown, etc. This is why you see it as TimeLimitExceeded raised by the MainProcess in the log. So, unfortunately, you can't rely on the same logic...
However, your second approach will prove useful in tracking down what is going on.
I have developed (in-house) a Celery monitoring tool that grabs all the events and populates a database with them so that later we can do all sorts of analytics (average and worst running times, frequency of failures, etc.).
In order to grab the details you need from the data given by the task-failed event, you also need to record (store it in some dictionary, for example) the task-received event data. This information contains the args, the task name, and all sorts of other useful information you may need. You relate the two by the task UUID.
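To illustrate that approach, here is a rough sketch of a receiver that caches the task-received payloads and looks them up when a task-failed event arrives. The handler names and broker URL are placeholders, not a tested implementation:

from celery import Celery

def my_monitor(app):
    # Cache the metadata that only the task-received event carries
    # (name, args, kwargs), keyed by the task UUID.
    received = {}

    def on_task_received(event):
        received[event['uuid']] = {
            'name': event.get('name'),
            'args': event.get('args'),
            'kwargs': event.get('kwargs'),
        }

    def on_task_failed(event):
        # Look up what we stored when the task was received.
        info = received.pop(event['uuid'], {})
        print("task-failed:", event['uuid'], info)

    with app.connection() as connection:
        recv = app.events.Receiver(connection, handlers={
            'task-received': on_task_received,
            'task-failed': on_task_failed,
        })
        recv.capture(limit=None, timeout=None, wakeup=True)

if __name__ == '__main__':
    app = Celery(broker='amqp://guest@localhost//')
    my_monitor(app)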

How do I write messages to the output log on AWS Glue?

AWS Glue jobs log output and errors to two different CloudWatch logs, /aws-glue/jobs/error and /aws-glue/jobs/output by default. When I include print() statements in my scripts for debugging, they get written to the error log (/aws-glue/jobs/error).
I have tried using:
log4jLogger = sparkContext._jvm.org.apache.log4j
log = log4jLogger.LogManager.getLogger(__name__)
log.warn("Hello World!")
but "Hello World!" doesn't show up in either of the logs for the test job I ran.
Does anyone know how to go about writing debug log statements to the output log (/aws-glue/jobs/output)?
TIA!
EDIT:
It turns out the above actually does work. What was happening was that I was running the job in the AWS Glue Script editor window which captures Command-F key combinations and only searches in the current script. So when I tried to search within the page for the logging output it seemed as if it hadn't been logged.
NOTE: I did discover through testing the first responder's suggestion that AWS Glue scripts don't seem to output any log message with a level less than WARN!
Try to use the built-in Python logger from the logging module; by default it writes messages to the standard output stream.
import logging
MSG_FORMAT = '%(asctime)s %(levelname)s %(name)s: %(message)s'
DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'
logging.basicConfig(format=MSG_FORMAT, datefmt=DATETIME_FORMAT)
logger = logging.getLogger(<logger-name-here>)
logger.setLevel(logging.INFO)
...
logger.info("Test log message")
I know this question is not new, but maybe it could be helpful for someone:
For me logging in glue works with the following lines of code:
# create glue context
glueContext = GlueContext(sc)
# set custom logging on
logger = glueContext.get_logger()
...
#write into the log file with:
logger.info("s3_key:" + your_value)
I noticed the above answers are written in Python. For Scala you could do the following:
import com.amazonaws.services.glue.log.GlueLogger
object GlueApp {
  def main(sysArgs: Array[String]) {
    val logger = new GlueLogger
    logger.info("info message")
    logger.warn("warn message")
    logger.error("error message")
  }
}
You can find both the Python and Scala solutions in the official doc here.
Just in case this helps. This works to change the log level.
sc = SparkContext()
sc.setLogLevel('DEBUG')
glueContext = GlueContext(sc)
logger = glueContext.get_logger()
logger.info('Hello Glue')
This worked for INFO level in a Glue Python job:
import logging
import sys

root = logging.getLogger()
root.setLevel(logging.DEBUG)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
root.addHandler(handler)
root.info("check")
source
I faced the same problem. I resolved it by adding
logging.getLogger().addHandler(logging.StreamHandler(sys.stdout))
Before that there were no prints at all, not even at ERROR level.
The idea was taken from here
https://medium.com/tieto-developers/how-to-do-application-logging-in-aws-745114ac6eb7
Another option would be to log to stdout and wire AWS logging up to stdout (using stdout is actually one of the best practices in cloud logging).
Update: it only works for setLevel("WARNING") and when printing ERROR or WARNING messages. I didn't find out how to make it work for the INFO level :(
If you're just debugging, print() (Python) or println() (Scala) works just fine.

Return code of scheduled task prefixed with 0x8007000 in list view, registered as 0 in the event log

I am currently trying to setup monitoring of windows scheduled tasks in Zabbix. It seemed easy enough to just monitor the Microsoft-Windows-TaskScheduler/Operational event log filtered by 201 events and regexing on the return code, but when I started simulating errors to test the monitoring, nothing happened.
It turns out that all our windows 2012 servers always log "return code 0" in the event log, even though it actually, sort of, displays it correctly in the Task Scheduler list view. When I say "sort of", it's because the "Last Run Result" actually displays 0x80070001 if the exit code of the program run by the scheduled task is 1.
I have spent a lot of time tweaking the settings, like the user account, Run only when user is logged on, Run whether user is logged on or not, setting the path on the action, Run with highest privileges, Configure for Vista/7/2012, etc. Nothing helped.
Finally I did some testing on my local machine, Windows 7, and a 2008R2 server, both of which just worked as expected.
The specific task I was testing ran a PowerShell script, using -Command so that it properly propagates the exit code, but to rule out any PS issues I also tested with a batch file containing "exit 1" and finally with a small C# console program that just returns whatever you supply on the command line.
PS, batch and console program all work fine on 7 and 2008, but they all fail in the same manner on 2012.
I've googled this to death, but keep coming up short. Apparently 0x80070005 and other similar error codes have some meaning, but that's not what happens in my case. In my case it seems that my exit code is bitwise or'ed with 0x80070000.
I should note that in all the cases, even 2012, the program started by the task actually executes and runs to the end; it's just the exit code that is handled weirdly.
Following is the output from the test runs:
From Powershell (my shell writes :( if $LASTEXITCODE > 0 ):
54 :( .\ExitCodeTest.exe 1
55 :( $LASTEXITCODE
1
56 :) .\ExitCodeTest.exe 10
57 :( $LASTEXITCODE
10
Windows Server 2008 R2 Standard:
Last Run Result (from list view): 0xA
Event 201 from event log Microsoft-Windows-TaskScheduler/Operational:
Task Scheduler successfully completed task "\ErrorTest" ,
instance "{b67a26cf-7fd8-461a-93d9-a5e48e72e558}" ,
action "D:\Tasks\ExitCodeTest.exe" with return code 10.
Windows Server 2012 Datacenter (notice that the return code in the event log is 0):
Last Run Result (from list view): 0x8007000A
Event 201 from event log Microsoft-Windows-TaskScheduler/Operational:
Task Scheduler successfully completed task "\error test" ,
instance "{2bde46b8-2858-4772-a7ec-d66b29d893a6}" ,
action "D:\Tasks\ExitCodeTest.exe" with return code 0.
Source for ExitCodeTest.exe:
static void Main( string[] args )
{
    int exitCode = 0;
    if ( args.Length > 0 )
    {
        exitCode = Convert.ToInt32( args[0] );
    }
    Environment.Exit( exitCode );
}
Please help, I am at my wits end.
Thanks,
John
(this is NOT an answer, but StackOverflow is refusing to let me add comments - when I click 'add comment', browser scrolls to top of page :-/)
You may be misinterpreting the Last Run Result column. According to Wikipedia (http://en.wikipedia.org/wiki/Windows_Task_Scheduler), LRR values of 0, 1 and 10 are common. Ignore the 0x8007 prefix - this just indicates a WIN32 error code transformed into an HRESULT (http://msdn.microsoft.com/en-us/library/gg567305.aspx).
Try running the test and forcing an exit code of something other than 1 or 10 to see if this influences LRR.
This does not explain of course why action return code is 0 in 2012. Error code 10 is defined as 'environment is incorrect'. Could it be that 2012 server does not want to run 32bit executable?
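To make the HRESULT point above concrete: if the Last Run Result really is HRESULT_FROM_WIN32 of the exit code, the original code sits in the low 16 bits and can be recovered with a bit mask. A small sketch of the arithmetic (Python used here purely for illustration):

def decode_last_run_result(value):
    # HRESULT_FROM_WIN32(code) is 0x80070000 | code for small Win32 codes,
    # so the low 16 bits hold the original exit code.
    if (value & 0xFFFF0000) == 0x80070000:
        return value & 0xFFFF
    return value  # already a plain exit code (e.g. 0x0 or 0xA)

print(decode_last_run_result(0x8007000A))  # 10
print(decode_last_run_result(0x80070001))  # 1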
One other suggestion (and I'm a little out of my depth); according to (http://msdn.microsoft.com/en-us/library/system.environment.exit(v=vs.110).aspx): "Exit requires the caller to have permission to call unmanaged code. The return statement does not.". Might be worth re-compiling ExitCodeTest as follows:
static int Main(string[] args)
{
    int exitCode = 0;
    if ( args.Length > 0 )
    {
        exitCode = Convert.ToInt32( args[0] );
    }
    return exitCode;
}
I'm seeing a similar issue on Server 2012 with a batch file that looks like it succeeds, shows a return value of 0 in event log, but a Last Run Result of 0x80070001.
I see MSFT has a hotfix available for Server 2012 which might address this issue:
http://support.microsoft.com/kb/3003689
I had this problem and fixed it this way.
Instead of calling a batch file move the commands into the actions section of the scheduled task.
I realize this may not work for you as some batch files are long.
I suspect it has to do with circumventing security on a scheduled task -- if you can change the batch file then you could get a scheduled task to run as the identity without windows being the wiser.

How to make a celery task fail from within the task?

Under some conditions, I want to make a celery task fail from within that task. I tried the following:
from celery.task import task
from celery import states

@task()
def run_simulation():
    if some_condition:
        run_simulation.update_state(state=states.FAILURE)
        return False
However, the task still reports to have succeeded:
Task sim.tasks.run_simulation[9235e3a7-c6d2-4219-bbc7-acf65c816e65]
succeeded in 1.17847704887s: False
It seems that the state can only be modified while the task is running; once it is completed, Celery changes the state to whatever it deems the outcome to be (refer to this question). Is there any way, without failing the task by raising an exception, to make Celery report that the task has failed?
To mark a task as failed without raising an exception, update the task state to FAILURE and then raise an Ignore exception, because returning any value will record the task as successful. An example:
from celery import Celery, states
from celery.exceptions import Ignore

app = Celery('tasks', broker='amqp://guest@localhost//')

@app.task(bind=True)
def run_simulation(self):
    if some_condition:
        # manually update the task state
        self.update_state(
            state=states.FAILURE,
            meta='REASON FOR FAILURE'
        )
        # ignore the task so no other state is recorded
        raise Ignore()
But the best way is to raise an exception from your task; you can create a custom exception to track these failures:
class TaskFailure(Exception):
    pass
And raise this exception from your task:
if some_condition:
    raise TaskFailure('Failure reason')
I'd like to further expand on Pierre's answer as I've encountered some issues using the suggested solution.
To allow custom fields when updating a task's state to states.FAILURE, it is important to also mock some attributes that a FAILURE state would have (notice exc_type and exc_message below). Otherwise, while the solution will terminate the task, any attempt to query the state (for example, to fetch the 'REASON FOR FAILURE' value) will fail.
Below is a snippet for reference I took from:
https://www.distributedpython.com/2018/09/28/celery-task-states/
import traceback

from celery import states
from celery.exceptions import Ignore

@app.task(bind=True)
def task(self):
    try:
        raise ValueError('Some error')
    except Exception as ex:
        self.update_state(
            state=states.FAILURE,
            meta={
                'exc_type': type(ex).__name__,
                'exc_message': traceback.format_exc().split('\n'),
                'custom': '...'
            })
        raise Ignore()
I got an interesting reply on this question from Ask Solem, where he proposes an 'after_return' handler to solve the issue. This might be an interesting option for the future.
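For reference, such an after_return handler might look roughly like this. This is only a hedged sketch based on the documented Task.after_return signature; the task body, class name, and broker URL are placeholders:

from celery import Celery, Task, states

app = Celery('tasks', broker='amqp://guest@localhost//')

class SimulationTask(Task):
    def after_return(self, status, retval, task_id, args, kwargs, einfo):
        # Called after the task finishes, whatever the final state was.
        if status == states.FAILURE or retval == 'FAILURE':
            print("Task %s ended as a failure" % task_id)

@app.task(base=SimulationTask)
def run_simulation():
    # ... simulation body ...
    return 'FAILURE'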
In the meantime I solved the issue by simply returning a string 'FAILURE' from the task when I want to make it fail and then checking for that as follows:
from celery.result import AsyncResult

result = AsyncResult(task_id)
if result.state == 'FAILURE' or (result.state == 'SUCCESS' and result.get() == 'FAILURE'):
    # Failure: process the failed task here
    ...