I want to know the run time of each test case in the code itself using pytest in Databricks - pytest

I am using a Databricks notebook. I have created a test_notebook.py which consists of test cases.
I started executing it in another notebook using the code below:
import pytest
pyargs = ['dbfs/FileStore/test_notebook.py']
pytest.main(pyargs)
I am getting the total run time for all the test cases.
How can I see the run time for each test case in the Databricks notebook itself?
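A minimal sketch of one possible approach, assuming pytest's built-in --durations option (a value of 0 reports the timing of every test, not just the slowest ones) is enough here, and reusing the same path as above:
import pytest

# -vv lists each test and keeps even very short durations visible;
# --durations=0 asks pytest to report the run time of every test case.
pyargs = ['dbfs/FileStore/test_notebook.py', '-vv', '--durations=0']
retcode = pytest.main(pyargs)
print(f'pytest finished with exit code {retcode}')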

Related

Get test execution logs during a test run with the NUnit Test Engine

We are using the NUnit Test Engine to run tests programmatically.
It looks like after we add FrameworkPackageSettings.NumberOfTestWorkers to the runner code, the test run for our UI tests hangs during execution. I'm not able to see at what time or event the execution hangs because the test runner returns test result logs (in XML) only when the entire execution ends.
Is there a way to get test execution logs for each test?
I've added InternalTraceLevel and InternalTraceWriter, but these logs are something different (BTW, it looks like ParallelWorker#9 even hangs when writing to the console :) )
_package.AddSetting(FrameworkPackageSettings.InternalTraceLevel, "Debug");
var nunitInternalLogsPath = Path.GetDirectoryName(Uri.UnescapeDataString(new Uri(Assembly.GetExecutingAssembly().CodeBase).AbsolutePath)) + "\\NunitInternalLogs.txt";
Console.WriteLine("nunitInternalLogsPath: "+nunitInternalLogsPath);
StreamWriter writer = File.CreateText(nunitInternalLogsPath);
_package.AddSetting(FrameworkPackageSettings.InternalTraceWriter, writer);
The result file, with the default name TestResult.xml, is not a log. That is, it is not a file produced, line by line, as execution proceeds. Rather, it is a picture of the result of your entire run and is therefore only created at the end of the run.
InternalTrace logs are actual logs in that sense. They were created to allow us to debug the internal workings of NUnit. We often ask users to create them when an NUnit bug is being tracked. Up to four of them may be produced when running a test of a single assembly under nunit3-console...
A log of the console runner itself
A log of the engine
A log of the agent used to run tests (if an agent is used)
A log received from the test framework running the tests
In your case, #1 is not produced, of course. Based on the content of the trace log, we are seeing #4, triggered by the package setting passed to the framework. I have seen situations in the past where the log is incomplete, but not recently. The logs normally use auto-flush to ensure that all output is actually written.
If you want to see a complete log of #2 (the engine), set the WorkDirectory and InternalTrace properties of the engine when you create it.
However, as stated, these logs are all intended for debugging NUnit, not for debugging your tests. The console runner produces another "log" even though it isn't given that name. It's the output written to the console as the tests run, especially that produced when using the --labels option.
If you want some similar information from your own runner, I suggest producing it yourself. Create either console output or a log file of some kind, by processing the various events received from the tests as they execute. To get an idea of how to do this, I suggest examining the code of the NUnit3 console runner. In particular, take a look at the TestEventHandler class, found at https://github.com/nunit/nunit-console/blob/version3/src/NUnitConsole/nunit3-console/TestEventHandler.cs

Executing an Azure Batch service in Azure Data Factory using a Python script

Hi, I've been trying to execute a custom activity in ADF which receives a CSV file from a container (A); after further transformation on the data set, the transformed DataFrame is stored into another CSV file in the same container (A).
I've written the transformation logic in Python and have it stored in the same container (A).
The error arises here: when I execute the pipeline, it returns the error *can't find the specified file*.
Nothing is wrong with the connections. Is anything wrong in the Batch account or pools?
Can anyone tell me where to place the Python script?
Install Azure Batch Explorer and make sure to choose the proper configuration for the virtual machine (dsvm-windows), which will ensure Python is already in place on the virtual machine where your code is being run.
This video explains the steps
https://youtu.be/_3_eiHX3RKE
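As a rough illustration of the kind of transformation script involved, a hypothetical sketch follows; the file names and the pandas logic are made up, and it assumes the input CSV has already been made available in the Batch task's working directory alongside the script:
# Hypothetical transformation script for the custom activity.
# Assumes the input CSV sits next to this script in the Batch task's
# working directory and that pandas is installed on the pool's VMs.
import pandas as pd

df = pd.read_csv('input.csv')                        # hypothetical input file name
df['row_total'] = df.sum(axis=1, numeric_only=True)  # placeholder transformation
df.to_csv('output.csv', index=False)                 # hypothetical output file name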

Run Databricks notebook jobs via API in a shared context

In the REST documentation for Databricks, you can submit a notebook task as a job to a cluster using the 2.0 API, or you can submit a command or Python script using the 1.2 API.
The 1.2 API allows you to create a context, and then all subsequent commands or scripts can be submitted against this context. This allows you to maintain state (dataframes, variables, etc.), which is much more akin to running notebooks interactively in the browser.
What I want is to be able to submit my notebooks into the same context and get the same behaviour as with the 1.2 API, but this does not seem possible. Is there a reason for that? Or am I missing something, if it can be done?
My use case is that I want to be able to re-run a notebook from the API and have it remember its last state (in the most basic example, just knowing it has already loaded a dataframe), but more generally to have subsequent jobs only run what changed since the last run.
As far as I can tell, failing the ability to do this via the 2.0 API, I have 2 options:
Convert my notebook to a Python script and have a bootstrap script on the client side that invokes an entry point using the 1.2 API within the same context (a rough sketch of this follows below)
Create temp tables at checkpoints in my notebook and possibly maintain a special variables dataframe of state variables
Both of these seem unnecessarily complex. Any other ideas?
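For reference, a rough client-side sketch of option 1 against the 1.2 command-execution endpoints (/api/1.2/contexts/create, /api/1.2/commands/execute, /api/1.2/commands/status) might look like the following; the host, token, cluster ID, commands and polling details are placeholders, so treat it as an illustration rather than a tested implementation:
import time
import requests

HOST = 'https://<databricks-instance>'  # placeholder workspace URL
TOKEN = '<personal-access-token>'       # placeholder token
CLUSTER_ID = '<cluster-id>'             # placeholder cluster ID
HEADERS = {'Authorization': f'Bearer {TOKEN}'}

# Create a long-lived execution context so state (dataframes, variables)
# survives between command submissions.
ctx = requests.post(f'{HOST}/api/1.2/contexts/create', headers=HEADERS,
                    json={'language': 'python', 'clusterId': CLUSTER_ID}).json()
context_id = ctx['id']

def run_command(command):
    # Submit a command to the shared context and poll until it finishes.
    cmd = requests.post(f'{HOST}/api/1.2/commands/execute', headers=HEADERS,
                        json={'language': 'python', 'clusterId': CLUSTER_ID,
                              'contextId': context_id, 'command': command}).json()
    while True:
        status = requests.get(f'{HOST}/api/1.2/commands/status', headers=HEADERS,
                              params={'clusterId': CLUSTER_ID,
                                      'contextId': context_id,
                                      'commandId': cmd['id']}).json()
        if status['status'] in ('Finished', 'Error', 'Cancelled'):
            return status
        time.sleep(2)

# The first call loads a dataframe; a later call in the same context can reuse it.
run_command("df = spark.read.parquet('/mnt/data/rankings')")  # placeholder path
print(run_command("print(df.count())"))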

Why do Selenium tests behave differently on different machines?

I couldn't find much information on Google regarding this topic. Below, I have provided three results from the same Selenium tests. Why am I getting different results when running the tests from different places?
INFO:
So our architecture: Bitbucket, Bamboo Stage 1 (Build, Deploy to QA), Bamboo Stage 2 (start Amazon EC2 instance "Test", run tests from Test against recently deployed QA)
Using Chrome Webdriver.
For all three of the variations I am using the same QA URL that our application is deployed on.
I am running all tests Parallelizable per fixture
The EC2 instance is running Windows Server 2012 R2 with the Chrome browser installed
I have made sure that the test solution has been properly deployed to the EC2 "test" instance. It is indeed the exact same solution and builds correctly.
First, Local:
Second, from EC2 via an SSM script that invokes the tests:
Note that the PowerShell script calls nunit3-console.exe just as it would be used in my third example from the command line.
Lastly, RDP into the EC2 instance and run the tests from the command line:
This has me perplexed... Any reasons why Selenium is running differently on different machines?
This really should be a comment, but I can't comment yet so...
I don't know enough about the application you are testing to say for sure, but this seems like something I've seen testing the application I'm working on.
I have seen two issues. First, Selenium is checking for the element before it's created. Sometimes it works and sometimes it fails, it just depends on how quickly the page loads when the test runs. There's no rhyme or reason to it. Second, the app I'm testing is pretty dumb. When you touch a field, enter data and move on to the next, it, effectively, posts all editable fields back to the database and refreshes all the fields. So, Selenium enters the value, moves to the next field and pops either a stale element error or can't find element error depending on when in the post/refresh cycle it attempts to interact with the element.
The solution I have found is moderately ugly. I tried the wait-until approach, but because it's the same element name, it's already visible and is grabbed immediately, which returns a stale element. As a result, the only thing I have found is that by using explicit waits between calls, I can get it to run correctly and consistently. Below is an example of what I have to do with the app I'm testing. (I am aware that I can condense the code; I am working within the style manual for my company.)
// Hard wait to let the app's post/refresh cycle settle before touching the field
Thread.Sleep(2000);
By nBaseLocator = By.XPath("//*[@id='attr_seq_1240']");
IWebElement baseRate = driver.FindElement(nBaseLocator);
baseRate.SendKeys(Keys.Home + xBaseRate + Keys.Tab);
If this doesn't help, please tell us more about the app and how it's functioning so we can help you find a solution.
@Florent B. Thank you!
EDIT: This ended up not working...
The tests are still running differently when called remotely with a PowerShell script. But the tests are running correctly locally on both the EC2 instance and my machine.
So the headless command switch allowed me to replicate my failed tests locally.
Next I found out that a headless Chrome browser is used during the tests when running via script on an EC2 instance... That is automatic, so the tests were indeed running and the errors were valid.
Finally, I figured out that the screen size is indeed the culprit, as it was stuck at a size of 600x400 (600x400?).
So after many tries, the only usable screen-size option for Windows, C#, and ChromeDriver 2.32 is to set your WebDriver options when you initiate your driver:
ChromeOptions chromeOpt = new ChromeOptions();
chromeOpt.AddArguments("--headless");
chromeOpt.AddArgument("--window-size=1920,1080");
chromeOpt.AddArguments("--disable-gpu");
webDriver = new ChromeDriver(chromeOpt);
FINISH EDIT:
Just to update
Screen size is large enough.
Still attempting to solve the issue. Anyone else ran into this?
AWS SSM command -> PowerShell -> run Selenium tests with Start-Process -> any test that requires an element fails with ElementNotFound or ElementNotVisible exceptions.
Using the POM for the tests. The FindsBy attribute in C# is not finding elements.
Tests run locally on the EC2 instance work fine from cmd, PowerShell, and PowerShell ISE.
The tests do not work correctly when executed with the AWS SSM command. I cannot find any resources to fix the problem.

Where does 'normal' println output go in a Scala jar under Spark?

I'm running a simple jar through Spark; everything is working fine, but as a crude way to debug, I often find println pretty helpful, unless I really have to attach a debugger.
However, output from println statements is nowhere to be found when running under Spark.
The main class in the jar begins like this:
import ...
object SimpleApp {
  def main(args: Array[String]) {
    println("Starting up!")
    ...
Why does something as simple as this not show up in the driver process?
If it matters, I've tested this running Spark locally, as well as under Mesos.
Update:
As in Proper way to provide spark application a parameter/arg with spaces in spark-submit, I've dumbed down the question scenario; I was actually submitting the command (with spark-submit) through SSH.
The actual value parameter was a query from the BigDataBenchmark, namely:
"SELECT pageURL, pageRank FROM rankings WHERE pageRank > 1000"
Now that wasn't properly escaped on the remote ssh command:
ssh host spark-submit ... "$query"
Became, on the host:
spark-submit ... SELECT pageURL, pageRank FROM rankings WHERE pageRank > 1000
So there you have it: the unquoted > 1000 was treated by the remote shell as output redirection, so all my stdout was going to a file, whereas "normal" Spark output was still appearing because it is stderr, which I only now realise.
This would appear in the stdout of the driver. As an example, see SparkPi. I know that on YARN this appears locally in stdout when in client mode, or in the application master's stdout log when in cluster mode. In local mode it should appear just on the normal stdout (though likely mixed in with lots of logging noise).
I can't say for sure about Spark, but based on what Spark is, I would assume that it starts up child processes, and the standard output of those processes is not sent back to the main process for you to see. You can get around this in a number of ways, such as opening a file to write messages to, or a network connection over your localhost to another process that displays messages it receives. If you're just trying to learn the basics, this may be sufficient. However, if you're going to do a larger project, I'd strongly recommend doing some research into what the Spark community has already developed for that purpose, as it will benefit you in the long run to have a more robust setup for debugging.