My Jupyter workflow for exploratory analysis looks like:
Fiddle with some parameters.
Run the notebook; generate output.
Eyeball outputs.
Repeat.
Can anyone suggest a command to make the notebook save a copy of itself (e.g. as an HTML file in the output folder), so that if I want to recreate a particular experiment (the results from a particular parameter set) I can do so?
Yes, you can. Just add a cell that uses the %%bash cell magic. After nbconvert runs, the mv renames the file to prepend a timestamp:
%%bash
jupyter nbconvert --to html MyNotebookName.ipynb
mv MyNotebookName.html $(date +"%m_%d_%Y-%H%M%S")_MyNotebookName.html
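If you want the saved copy to identify the experiment rather than just the time, here is the same idea sketched in Python, e.g. as the notebook's last cell. The params dict is hypothetical (put whatever parameters you fiddle with in it), and note that nbconvert converts the notebook as last saved on disk, so save before running this cell.
import os
import subprocess
from datetime import datetime

params = {"alpha": 0.1, "n_iter": 500}  # hypothetical parameters for this run
stamp = datetime.now().strftime("%m_%d_%Y-%H%M%S")
tag = "_".join(f"{k}{v}" for k, v in sorted(params.items()))

# Writes e.g. output/05_01_2024-101500_alpha0.1_n_iter500_MyNotebookName.html
os.makedirs("output", exist_ok=True)
subprocess.run(
    ["jupyter", "nbconvert", "--to", "html",
     "--output-dir", "output",
     "--output", f"{stamp}_{tag}_MyNotebookName.html",
     "MyNotebookName.ipynb"],
    check=True,
)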
I am using JasperStarter to create PDFs from several .jrprint files and then print them using JasperStarter functions.
I want to create one single pdf file with all the .jrprint files.
If I give command like:
jasperstarter pr a.jrprint b.jrprint -f pdf -o rep
It does not recognise the files after the first input file.
Can we create one single output file with many input jasper/jrprint files?
Please help.
Thanks,
Oshin
Looking at the documentation, this is not possible:
The command process (pr)
The command process is for processing a report.
In direct comparison to the command for compiling:
The command compile (cp)
The command compile is for compiling one report or all reports in a directory.
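Since process only handles one report at a time, a common workaround is to process each file to its own PDF and then merge the results. Here is a rough sketch in Python using the third-party pypdf package; pypdf and the file names are assumptions, and any PDF merger would do.
import subprocess
from pypdf import PdfWriter

inputs = ["a.jrprint", "b.jrprint"]  # hypothetical input files
pdfs = []
for jrprint in inputs:
    base = jrprint.rsplit(".", 1)[0]
    # Process each report on its own; jasperstarter writes <base>.pdf
    subprocess.run(["jasperstarter", "pr", jrprint, "-f", "pdf", "-o", base],
                   check=True)
    pdfs.append(base + ".pdf")

# Concatenate the single-report PDFs into one output file
merged = PdfWriter()
for pdf in pdfs:
    merged.append(pdf)
with open("rep.pdf", "wb") as fh:
    merged.write(fh)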
I am trying to export data to an existing CSV file.
I have been using these methods to export the data:
Microsoft.Jet.OLEDB.4.0
SQLCMD
Data Export Wizard
However, I cannot find any parameter or option to append the exported data to an existing file. Is there a way? Thanks.
Note: this answer is biased towards *nix operating systems; I'm not too familiar with Windows.
If you can run your SQL query via the command line, either
using a scripting language with a library that creates an MSSQL connection (one example is a node.js program I authored, https://github.com/skilbjo/aqtl, but any tool will do), or
using a Windows binary that runs something like sqlcmd from the command line,
then you can just pipe the output to the CSV file with the shell's append operator. For example:
$ node runquery.js myquery.sql >> existing_csv_file.csv
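The same idea works without the shell: run the query from a script and open the CSV in append mode. A minimal sketch in Python, assuming pyodbc; the connection string, query, and column names are placeholders for whatever you already use.
import csv
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=mydb;UID=user;PWD=secret"  # hypothetical
)
cursor = conn.cursor()
cursor.execute("SELECT col_a, col_b FROM my_table")  # hypothetical query

# Open in append mode ('a') so existing rows are preserved, and write no
# header row, since the existing file already has one.
with open("existing_csv_file.csv", "a", newline="") as fh:
    csv.writer(fh).writerows(cursor.fetchall())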
I have two different Jupyter notebooks running on the same server. What I would like to do is access some of the variables (only a few of them) of one notebook from the other notebook (basically, I have to check whether the two different versions of the algorithm give the same results). Is there a way to do this?
Thanks
Between two Jupyter notebooks, you can use the %store magic command.
In the first jupyter notebook:
data = 'string or data-table to pass'
%store data
del data # This will DELETE the data from the memory of the first notebook
In the second jupyter notebook:
%store -r data
data
You can find more information here.
If you only need something quick and dirty, you can use the pickle module to make the data persistent (save it to a file) and then have it picked up by your other notebook. For example:
import pickle
a = ['test value','test value 2','test value 3']
# Choose a file name
file_name = "sharedfile"
# Open the file for writing
with open(file_name, 'wb') as my_file_obj:
    pickle.dump(a, my_file_obj)
# The file you have just saved can be opened in a different session
# (or iPython notebook) and the contents will be preserved.
# Now select the (same) file to open (e.g. in another notebook)
file_name = "sharedfile"
# Open the file for reading (binary mode, to match the 'wb' used when writing)
with open(file_name, 'rb') as file_object:
    # Load the object from the file into var b
    b = pickle.load(file_object)
print(b)
>>> ['test value','test value 2','test value 3']
You can use a cell magic to do this. The %%cache cell magic in the IPython notebook can be used to cache the results and outputs of long-running computations in a persistent pickle file. It is useful when some computations in a notebook are long and you want to easily save the results in a file.
To use it in your notebook, you first need to install the ipycache module, since this cell magic is not built in.
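Assuming it is published on PyPI, installation is typically:
pip install ipycache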
Then load the extension in your notebook:
%load_ext ipycache
Then, create a cell with:
%%cache mycache.pkl var1 var2
var1 = 1  # You can put any code you want here;
var2 = 2  # just make sure this cell is not empty.
When you execute this cell the first time, the code is executed, and the variables var1 and var2 are saved in mycache.pkl in the current directory along with the outputs. Rich display outputs are only saved if you use the development version of IPython. When you execute this cell again, the code is skipped, the variables are loaded from the file and injected into the namespace, and the outputs are restored in the notebook.
Alternatively use $file_name instead of mycache.pkl, where file_name is a variable holding the path to the file used for caching.
Use the --force or -f option to force the cell's execution and overwrite the file.
Use the --read or -r option to prevent the cell's execution and always load the variables from the cache. An exception is raised if the file does not exist.
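For example, to recompute and overwrite an existing cache (a sketch; the computation is a stand-in for your own code):
%%cache --force mycache.pkl var1 var2
var1 = sum(range(10**7))  # stand-in for a long computation; --force re-runs it
var2 = var1 * 2           # and overwrites mycache.pkl with the new values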
Ref: the GitHub repository of ipycache and its example notebook.
I am trying to export the DataStage job designs with executables. Below are the screenshots I use to export from the GUI.
These are the two commands I use:
dsexport.exe /h=XX /U=XX /p=XX projectXXX /job=XXX jobname.dsx
dsexport.exe /h=XX /U=XX /p=XX projectXXX /job=XXX /EXEC /APPEND jobname.dsx
The file generated by the commands is bigger than the one from the GUI. Does anyone know how to use the dsexport command to export jobs with the same options as in the GUI screenshots? Much appreciated. I am using Designer V8.5.
JS
C:\IBM\InformationServer\Clients\Classic>dsexport /d={ip address of server} /u={user id} /p={password} /job={job to export} {Project where job is located in} {FileName.dsx}
Try this; it will export a single .dsx file with all the information.
P.S. I am using version 11.3.
As you can see, the GUI excludes some read-only files that the command line does not, which is why there is a difference in file size.
You have "Include Dependent Items" unchecked in the GUI. The command line will include dependent items by default (i.e. shared containers or routines). You can disable this behaviour on the command line by using the /NODEPENDENTS command switch.
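So, to match a GUI export that skips dependent items, the first command from the question would become something like this (untested sketch, with the same placeholders):
dsexport.exe /h=XX /U=XX /p=XX /NODEPENDENTS projectXXX /job=XXX jobname.dsx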
I am using the following command to get a brief history of the CVS repository.
cvs -d :pserver:*User*:*Password*@*Repo* rlog -N -d "*StartDate* < *EndDate*" *Module*
This works just fine except for one small problem: it lists all tags created on each file in the repository. I want the tag info, but only for tags created within the specified date range. How do I change the command to do that?
I don't see a way to do that natively with the rlog command. Faced with this problem, I would write a Perl script to parse the output of the command, correlate the tags to the date range that I want, and print them.
Another solution would be to parse the ,v files directly, but I haven't found any robust libraries for doing that. I prefer Perl for this type of task, and the parsing modules I have found don't seem to be of very high quality.
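The answer suggests Perl; purely as an illustration, here is the same idea sketched in Python instead. It assumes the usual rlog layout (a "symbolic names:" block of tab-indented "TAG: revision" lines per file, followed by revision/date pairs); depending on your CVS version you may need to drop -N from the rlog command so the tag list actually appears in the output.
import re
import sys
from datetime import datetime

START = datetime(2004, 1, 1)   # hypothetical range bounds; adjust to taste
END = datetime(2004, 12, 31)

def filter_tags(stream):
    rev_tags = {}        # revision -> [tags] for the current file
    current_rev = None
    in_symbols = False
    for line in stream:
        if line.startswith("RCS file:"):
            rev_tags, in_symbols, current_rev = {}, False, None
            print(line, end="")  # keep the per-file header
        elif line.startswith("symbolic names:"):
            in_symbols = True
        elif in_symbols and line.startswith("\t"):
            tag, _, rev = line.strip().partition(": ")
            rev_tags.setdefault(rev, []).append(tag)
        else:
            in_symbols = False
            m = re.match(r"revision (\S+)", line)
            if m:
                current_rev = m.group(1)
            m = re.match(r"date: (\d{4})[-/](\d{2})[-/](\d{2})", line)
            if m and current_rev in rev_tags:
                when = datetime(int(m.group(1)), int(m.group(2)), int(m.group(3)))
                if START <= when <= END:  # only report tags in the date range
                    for tag in rev_tags[current_rev]:
                        print(f"\ttag {tag}: revision {current_rev} ({when.date()})")

if __name__ == "__main__":
    filter_tags(sys.stdin)  # e.g.: cvs ... rlog ... Module | python3 filter_tags.py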