Is it possible to auto-generate documentation for pytest tests? - pytest

I have a project which contains only pytest tests, without modules or classes, which test a remote project.
E.g. the structure:
.
├── __init__.py
├── test_basic_auth_app.py
├── test_basic_auth_user.py
├── test_advanced_app_id.py
├── test_advanced_user.py
└── test_oauth_auth.py
The tests look like this:
"""
Service requires credentials (app_id, app_key) to be passed using the Basic Auth
"""
import base64
import pytest
import authorising.auth
from authorising.resources import Service
@pytest.fixture(scope="module")
def service_settings(service_settings):
    "Set auth mode to app_id/app_key"
    service_settings.update({"backend_version": Service.Auth_app})
    return service_settings


def test_basic_auth_app_id_key(application):
    """Test client access with Basic HTTP Auth using app id and app key

    Configure Api/Service to use App ID / App Key Authentication
    and Basic HTTP Auth to pass the credentials.
    """
    credentials = application.authobj.credentials
    encoded = base64.b64encode(
        f"{credentials['app_id']}:{credentials['app_key']}".encode("utf-8")).decode("utf-8")
    response = application.test_request()
    assert response.status_code == 200
    assert response.request.headers["Auth"] == "Basic %s" % encoded
Is it possible to auto-generate documentation from the docstrings, e.g. using Sphinx?

You can use sphinx-apidoc to generate test documentation automatically from the Python docstrings.
For instance, if you have a directory structure like the one below:
.
├── docs
│   ├── rst
│   └── html
└── tests
    ├── __init__.py
    ├── test_basic_auth_app.py
    ├── test_basic_auth_user.py
    ├── test_advanced_app_id.py
    ├── test_advanced_user.py
    └── test_oauth_auth.py
sphinx-apidoc -o docs/rst tests
sphinx-build -a -b html docs/rst docs/html -j auto
All your HTML doc files will be under docs/html.
sphinx-apidoc supports multiple options; see the documentation for details: https://www.sphinx-doc.org/en/master/man/sphinx-apidoc.html

When using Sphinx, you should add your test folder to the Python path in the conf.py file:
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..', 'tests')))
Then in each rst file you can simply write:
.. automodule:: test_basic_auth_app
   :members:
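For reference, here is a minimal conf.py sketch putting both pieces together (the project name is a placeholder, and sphinx.ext.autodoc is required for the automodule directive to work):

# docs/conf.py - minimal sketch
import os
import sys

# make the test modules importable for autodoc
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..', 'tests')))

project = "my-test-suite"             # placeholder project name
extensions = ["sphinx.ext.autodoc"]   # needed for .. automodule::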
If you also want to document the test results, take a look at Sphinx-Test-Reports.

Related

TemplateNotFound in Airflow

I have the following dir structure
.
├── ConfigSpark.yaml
├── project1
│   ├── dags
│   │   └── dag_1.py
│   └── sparkjob
│       └── spark_1.py
└── sparkutils
I'm trying to import the ConfigSpark.yaml file in my SparkKubernetesOperator using:
job = SparkKubernetesOperator(
    task_id='job',
    params=dict(
        app_name='job',
        mainApplicationFile='/opt/airflow/dags/project1/sparkjob/spark_1.py',
        driverCores=1,
        driverCoreRequest='250m',
        driverCoreLimit='500m',
        driverMemory='2G',
        executorInstances=1,
        executorCores=2,
        executorCoreRequest='1000m',
        executorCoreLimit='1000m',
        executorMemory='2G'
    ),
    application_file='/opt/airflow/dags/ConfigSpark.yaml',
    kubernetes_conn_id='conn_prd_eks',
    do_xcom_push=True
)
My dag is returning the following error:
jinja2.exceptions.TemplateNotFound: /opt/airflow/dags/ConfigSpark.yaml
I've noticed that if the DAG is in the same directory as ConfigSpark.yaml, my tasks run perfectly, but why is my task not running when I place my DAG in a subfolder?
I've checked my values.yaml file and airflowHome is /opt/airflow and defaultAirflowRepository is apache/airflow.
What is happening?
Airflow searches for the template file (ConfigSpark.yaml in your case) from the directory in which the DAG file is stored. Therefore, it doesn't find it automatically with your code.
If you store the template file in the same folder your DAG file is stored in (/project1/dags), or in a nested folder inside /project1/dags, you can specify the path from there in your task:
job = SparkKubernetesOperator(
    ...,
    application_file='/path/to/ConfigSpark.yaml',
    ...
)
Which would read the template file from /project1/dags/path/to/ConfigSpark.yaml.
However, if the folder your template file is stored in is not a child of the folder your DAG file is stored in, the above won't work. In that case you can specify template_searchpath on the DAG-level:
with DAG(..., template_searchpath="/opt/airflow/dags/repo/dags") as dag:
    job = SparkKubernetesOperator(
        task_id='job',
        application_file='ConfigSpark.yaml',
        ...,
    )
This path (for example /opt/airflow/dags) is added to the Jinja searchpath and that way ConfigSpark.yaml will be found.
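For completeness, here is a self-contained sketch of the template_searchpath approach (the DAG id, dates and schedule are made-up values, and the import path assumes the apache-airflow-providers-cncf-kubernetes package is installed):

from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import SparkKubernetesOperator

with DAG(
    dag_id="spark_job_example",               # placeholder DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    template_searchpath="/opt/airflow/dags",  # folder that contains ConfigSpark.yaml
) as dag:
    job = SparkKubernetesOperator(
        task_id="job",
        application_file="ConfigSpark.yaml",  # resolved against template_searchpath
        kubernetes_conn_id="conn_prd_eks",
        do_xcom_push=True,
    )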

How to execute sql file with Slick 3.0.0

I have a structure like this
src
└── main
    ├── resources
    │   └── inserts.sql
    └── my.package
        └── Main.scala
In Main.scala I want to take the file inserts.sql and use Slick 3.0.0 to execute it on my db.
A SQL string can be executed directly by using the SQLActionBuilder class.
Also, since the BufferedSource object we get from Source.fromResource is Closeable, we should wrap it in a Using block.
import slick.jdbc.SetParameter.SetUnit
import slick.jdbc.SQLActionBuilder
import scala.io.Source
import scala.util.Using

// ...

Using(Source.fromResource("inserts.sql")) { insertsSqlSource =>
  val sqlActionBuilder = SQLActionBuilder(insertsSqlSource.mkString, SetUnit)
  database.run(sqlActionBuilder.asUpdate)
}
You can read file content:
val query = scala.io.Source.fromResource("inserts.sql").mkString
and then create a query using the sql or sqlu interpolators:
//https://scala-slick.org/doc/3.0.0/sql.html
sql"$query".as[ExpectedType]
and run it as always :)
PS: Not tested. I don't have an environment prepared right now.
It looks like there is no way to execute a SQL file with Slick other than loading it into memory as a String and executing it with sql, sqlu or tsql.
Beware that in this case the $ interpolation is meant to insert bind variables into the query. To splice literal values into the query you must use #$ instead. Since in this case the variable is the whole query, we have to do:
val query = Source.fromResource("inserts.sql").mkString
db.run(sqlu"#$query")

How does DVC store differences on the directory level into DVC cache?

Can someone explain how DVC stores differences on the directory level in the DVC cache?
I understand that the DVC-files (.dvc) are metafiles to track data, models and reproduce pipeline stages. However, it is not clear to me how the process of creating branches, committing them and switching back to master is saved exactly as differences.
Short version:
The .dvc file contains info (an md5) about a JSON file inside the cache that describes the current state of the directory.
When the directory gets updated, there is a new md5 in the .dvc file and a new JSON file is created with the updated state of the directory.
In git, you store the .dvc file, so that DVC knows (based on the md5) where to look for information about the directory.
Longer version:
Let me try to break down the particular steps of directory handling with DVC.
Let's assume we have some data directory that you want to put under DVC control:
data
├── 1
└── 2
You use dvc add data to make DVC track your directory. As a result, DVC produces a data.dvc file. As you noted, this file contains the metadata required to connect your git repository with your data storage. Inside this file (besides other things) you can see:
outs:
- md5: f437247ec66d73ba66b0ade0246fcb49.dir
  path: data
The md5 part is used to store information about the directory in the DVC cache (.dvc/cache):
(dvc3.7) ➜ repo$ tree .dvc/cache
.dvc/cache
├── 26
│   └── ab0db90d72e28ad0ba1e22ee510510
├── b0
│   └── 26324c6904b2a9cb4b88d6d61c81d1
└── f4
    └── 37247ec66d73ba66b0ade0246fcb49.dir
If you open the file with the .dir suffix, you will see that it contains a description of the current data state:
(dvc3.7) ➜ repo$ cat .dvc/cache/f4/37247ec66d73ba66b0ade0246fcb49.dir
[{"md5": "b026324c6904b2a9cb4b88d6d61c81d1", "relpath": "1"},
{"md5": "26ab0db90d72e28ad0ba1e22ee510510", "relpath": "2"}]
As you can see, the particular files (1 and 2) are described by entries in this file.
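To make the cache layout concrete, here is a small illustrative Python sketch (not DVC's actual code) of how a file's content hash maps to the cache path shown above:

import hashlib
import os

def cache_path(file_path, cache_dir=".dvc/cache"):
    """Return the cache location that corresponds to this file's MD5."""
    with open(file_path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    # the first two hex characters become the directory, the rest the file name
    return os.path.join(cache_dir, digest[:2], digest[2:])

print(cache_path("data/1"))  # -> .dvc/cache/b0/26324c6904b2a9cb4b88d6d61c81d1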
When you change your directory:
(dvc3.7) ➜ repo$ echo 3 >> data/3
(dvc3.7) ➜ repo$ dvc commit data.dvc
The content of data.dvc will be updated:
outs:
- md5: 12f4b7d54a32e58818e27fba28376fba.dir
  path: data
And there is a new file inside the cache:
├── 12
│   └── f4b7d54a32e58818e27fba28376fba.dir
...
(dvc3.7) ➜ repo$ cat .dvc/cache/12/f4b7d54a32e58818e27fba28376fba.dir
[{"md5": "b026324c6904b2a9cb4b88d6d61c81d1", "relpath": "1"},
{"md5": "26ab0db90d72e28ad0ba1e22ee510510", "relpath": "2"},
{"md5": "6d7fce9fee471194aa8b5b6e47267f03", "relpath": "3"}]
From the perspective of git, the only change is inside data.dvc
(assuming you did git commit after adding data with 1 and 2 inside):
diff --git a/data.dvc b/data.dvc
index 098aec5..88d1a90 100644
--- a/data.dvc
+++ b/data.dvc
@@ -1,6 +1,6 @@
-md5: a427c5bf8680fbf8d1951806b28b82fe
+md5: 1b674d61c195eea7a6b14f176c020b9c
 outs:
-- md5: f437247ec66d73ba66b0ade0246fcb49.dir
+- md5: 12f4b7d54a32e58818e27fba28376fba.dir
   path: data
   cache: true
   metric: false
NOTE: The first md5 corresponds to the md5 of this file itself, so it had to change along with the dir md5.

How to import data from cloud firestore to the local emulator?

I want to be able to run cloud functions locally and debug against a copy of the production data.
Is there a way to copy the data that is online to the local firestore emulator?
This can be accomplished through a set of commands in the terminal on the existing project:
1. Log in to Firebase and gcloud:
firebase login
gcloud auth login
2. See a list of your projects and connect to one:
firebase projects:list
firebase use your-project-name
gcloud projects list
gcloud config set project your-project-name
3. Export your production data to a gcloud bucket with a chosen name:
gcloud firestore export gs://your-project-name.appspot.com/your-chosen-folder-name
4. Now copy this folder to your local machine; I do that directly in the functions folder:
Note: Don't miss the dot (.) at the end of the command below.
cd functions
gsutil -m cp -r gs://your-project-name.appspot.com/your-chosen-folder-name .
5. Now we just want to import this folder. This should work with the basic command, thanks to the latest update from the Firebase team: https://github.com/firebase/firebase-tools/pull/2519.
firebase emulators:start --import ./your-chosen-folder-name
Check out my article on Medium about it, with a shorthand script to do the job for you: https://medium.com/firebase-developers/how-to-import-production-data-from-cloud-firestore-to-the-local-emulator-e82ae1c6ed8
Note: It's better to use a different bucket for this, as copying into your project bucket will result in the folder being created in your Firebase Storage.
If you are interested in gsutil arguments like -m, you can see them described by executing gsutil --help.
My method is somewhat manual but it does the trick. I've shared it in this useful GitHub thread, but I'll list the steps I did here in case you find them useful:
1. Go to my local Firebase project path.
2. Start the emulators using: firebase emulators:start
3. Manually create some mockup data using the GUI at http://localhost:4000/firestore, using the buttons provided: + Start Collection and + Add Document.
4. Export this data locally using: firebase emulators:export ./mydirectory
5. For the project data located in Firebase Database / Cloud Firestore, I exported a single collection like this: gcloud firestore export gs://my-project-bucket-id.appspot.com --collection-ids=myCollection. The export is now located under Firebase Storage in a folder with a timestamp as its name (I didn't use a prefix for my test).
6. Download this folder to the local drive with: gsutil cp -r gs://my-project-bucket-id.appspot.com/myCollection ./production_data_export
NOTE: I did this in a Windows environment... gsutil will throw the error "OSError: The filename, directory name, or volume label syntax is incorrect" if the folder name has characters that are invalid in Windows (i.e. colons), or "OSError: Invalid argument.9.0 B]" if an inner file in the folder has invalid characters too. To be able to download the export locally, rename these to valid Windows names (i.e. removing the colons), like this: gsutil mv gs://my-project-bucket-id.appspot.com/2020-05-22T02:01:06_86152 gs://my-project-bucket-id.appspot.com/myCollection
7. Once downloaded, imitate the local export structure by renaming the folder to firestore_export and copying the firebase-export-metadata.json file from the local export folder. Just to be visual, here's the structure I got:
$ tree .
.
├── local_data_export
│   ├── firebase-export-metadata.json
│   └── firestore_export
│       ├── all_namespaces
│       │   └── all_kinds
│       │       ├── all_namespaces_all_kinds.export_metadata
│       │       └── output-0
│       └── firestore_export.overall_export_metadata
└── production_data_export
    ├── firebase-export-metadata.json
    └── firestore_export
        ├── all_namespaces
        │   └── kind_myCollection
        │       ├── all_namespaces_kind_myCollection.export_metadata
        │       ├── output-0
        │       └── output-1
        └── firestore_export.overall_export_metadata
8 directories, 9 files
8. Finally, start the local emulator pointing to this production data to be imported: firebase emulators:start --import=./mock_up_data/production_data_export/
You should see the imported data at: http://localhost:4000/firestore/
This should assist readers for now, while we await a more robust solution from the Firebase folks.
You can use the firestore-backup-restore package to export and import your production data as JSON files.
I wrote a quick hack to allow importing these JSON files into the emulator's Firestore instance.
I proposed a pull request and made this npm module in the meantime.
You can use it this way:
const firestoreService = require('@crapougnax/firestore-export-import')
const path = require('path')

// list of JSON files generated with the export service
// Must be in the same folder as this script
const collections = ['languages', 'roles']

// Start your firestore emulator for (at least) firestore
// firebase emulators:start --only firestore

// Initiate Firebase Test App
const db = firestoreService.initializeTestApp('test', {
  uid: 'john',
  email: 'john@doe.com',
})

// Start importing your data
let promises = []
try {
  collections.map(collection =>
    promises.push(
      firestoreService.fixtures(
        path.resolve(__dirname, `./${collection}.json`),
        [],
        [],
        db,
      ),
    ),
  )
  Promise.all(promises).then(process.exit)
} catch (err) {
  console.error(err)
}
Obviously, since this data won't persist in the emulator, you'll typically inject it in the before() function of your test suite, or even before every test.
There is no built-in way to copy data from a cloud project to the local emulator. Since the emulator doesn't persist any data, you will have to re-generate the initial data set on every run.
I was able to make some npm scripts to import from remote to local emulator and vice-versa.
"serve": "yarn build && firebase emulators:start --only functions,firestore --import=./firestore_export",
"db:update-local-from-remote": "yarn db:backup-remote && gsutil -m cp -r gs://my-firebase-bucket.appspot.com/firestore_export .",
"db:update-remote-from-local": "yarn db:backup-local && yarn db:backup-remote && gsutil -m cp -r ./firestore_export gs://my-firebase-bucket.appspot.com && yarn run db:import-remote",
"db:import-remote": "gcloud firestore import gs://my-firebase-bucket.appspot.com/firestore_export",
"db:backup-local": "firebase emulators:export --force .",
"db:rename-remote-backup-folder": "gsutil mv gs://my-firebase-bucket.appspot.com/firestore_export gs://my-firebase-bucket.appspot.com/firestore_export_$(date +%d-%m-%Y-%H-%M)",
"db:backup-remote": "yarn db:rename-remote-backup-folder && gcloud firestore export gs://my-firebase-bucket.appspot.com/firestore_export"
So you can export the local Firestore data to remote with:
npm run db:update-remote-from-local
Or to update your local Firestore data with remote one, do:
npm run db:update-local-from-remote
These operations will backup the remote Firestore data, making a copy of it and storing it on Firebase Storage.
I was about to add a CLI option to firebase-tools, but I'm pretty happy with the node-firestore-import-export package.
yarn add -D node-firestore-import-export
"scripts": {
"db:export": "firestore-export -a ./serviceAccountKey.json -b ./data/firestore.json",
"db:import": "firestore-import -a ./serviceAccountKey.json -b ./data/firestore.json",
"db:emulator:export": "export FIRESTORE_EMULATOR_HOST=localhost:8080 && yarn db:export",
"db:emulator:import": "export FIRESTORE_EMULATOR_HOST=localhost:8080 && yarn db:import",
"db:backup": "cp ./data/firestore.json ./data/firestore-$(date +%d-%m-%Y-%H-%M).json",
"dev": "firebase emulators:start --import=./data --export-on-exit=./data",
},
You will need to create a service account in the firebase console.
You can replace the GCLOUD_PROJECT environment variable with hard coded values.
open https://console.firebase.google.com/project/$GCLOUD_PROJECT/settings/serviceaccounts/adminsdk
mv ~/Downloads/myProjectHecticKeyName.json ./serviceAccountKey.json
That being said, the gcloud tools are definitely the way to go in production, as you will need s3 backups anyway.
You can use the fire-import npm package for importing both Firestore and Firebase Storage data.
There is also a way to import data into the local emulator from Google Cloud Storage without any commands:
1. Export Firestore to a Google Cloud Storage bucket by clicking More in Google Cloud
2. Choose your desired file in the Google Cloud Storage bucket
3. Open the terminal (the Cloud Shell near the search bar)
4. In the terminal, click Open editor
5. Right-click on the desired file in the online VS Code editor and click Download.
You should start downloading a .tar file, which is in fact your exported data from Firestore.
6. Create a folder in your root (as an example you may call it 'firestore-local-data')
7. Copy (or unarchive) the data from the .tar file into this folder
8. Run firebase emulators:start --import ./firestore-local-data
This should do the trick.
I wrote a little script to be able to do that:

import * as admin from 'firebase-admin';
import * as fs from 'fs';

// assumes admin.initializeApp(...) has been called and
// FIRESTORE_EMULATOR_HOST points at the local emulator
const db = admin.firestore();
const collections = ['albums', 'artists'];
let rawData: any;
for (const i in collections) {
  // read the exported JSON file for this collection
  rawData = fs.readFileSync(`./${collections[i]}.json`);
  const arr = JSON.parse(rawData);
  // write every document into the emulator's Firestore
  for (const j in arr) {
    db.collection(collections[i]).add(arr[j])
      .then(val => console.log(val))
      .catch(err => console.log('ERROR: ', err));
  }
}

'length error in the tickerplant kdb+/q

When I start up tick.q with sym.q and feed.q, using the files provided, as follows:
q tick.q sym -p 5010
q feed.q
Github links: https://github.com/KxSystems/cookbook/tree/master/start/tick ,
https://github.com/KxSystems/kdb-tick
The tickerplant process prints a 'length error on every update, which usually occurs when an incorrect number of elements is passed: https://code.kx.com/wiki/Errors
I suspect that this happens when the feed process calls .u.upd
Any suggestions as to how to solve this problem?
Entering \e 1 into the command line will suspend execution and run the debugger, allowing you to see what failed and query the variables, which should help pinpoint what is causing the issue.
More about debugging here: https://code.kx.com/q/ref/debug/
If you are using the plain vanilla tick setup from KX there is no reason for that error to appear.
Also, I think you need to start the feed as feed.q -t 200, otherwise you will get no data coming through.
Usually the 'length error appears when the table schema does not match. So if you have the sym.q file (and it is loaded correctly) you should not have that issue.
Just to confirm this is the structure of your directory:
.
├── feed.q
├── README.md
├── tick
│   ├── r.q
│   ├── sym.q
│   └── u.q
└── tick.q
The sym.q file contains your table schema. If you change something in the feed handler, the table schema in sym.q must match that change (i.e. if you add a column in the feed, you must also add a holder in the table for that column).
Open a new q session on some port (e.g. 9999), add your schema definition there, and define .u.upd to do the insert, something like this:
.u.upd:{[t;d]
  .test.t:t;
  .test.d:d;
  t upsert d
  }
Now point your feed at this q session and stream some data; this will enable you to analyse the test variables in case of errors.