Training a new font in Arabic causes issues: "Compute CTC targets failed" - tesseract

I've been trying to train a new Arabic font using Tesseract. I was able to train it at first with the default training_text file that comes with the Tesseract installation, but I wanted to train it using my own generated data.
So I proceeded as follows:
First, I changed the ara.training_text file and inserted some of the data that I want to train my model on.
Then I generated the .tif files using this command:
!/content/tesstutorial/tesseract/src/training/tesstrain.sh --fonts_dir /content/fonts \
--fontlist 'Traditional Arabic' \
--lang ara \
--linedata_only \
--langdata_dir /content/tesstutorial/langdata \
--tessdata_dir /content/tesstutorial/tesseract/tessdata \
--save_box_tiff \
--maxpages 100 \
--output_dir /content/train
Then I extracted ara.lstm from the best traineddata for Arabic:
!combine_tessdata -e /content/tesstutorial/tesseract/tessdata/best/ara.traineddata ara.lstm
All good so far, but when I proceed to call lstmtraining, I get a "Compute CTC targets failed" error whenever I run training:
!OMP_THREAD_LIMIT=8 lstmtraining \
--continue_from /content/ara.lstm \
--model_output /content/output/araNewModel \
--old_traineddata /content/tesstutorial/tesseract/tessdata/best/ara.traineddata \
--traineddata /content/train/ara/ara.traineddata \
--train_listfile /content/train/ara.training_files.txt \
--max_iterations 200 \
--debug_level -1
I realized that this was only happening when I added Arabic numerals to my training text. When I pass in a training_text file with no Arabic numerals, it works fine.
Can someone tell me what this error is about and how to solve it?
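One way to start narrowing this down (this is an assumption, not a confirmed cause): "Compute CTC targets failed" can occur when the ground-truth text contains characters the network cannot encode, so it is worth checking that the Arabic-Indic digits actually appear in the unicharset of the traineddata being continued from. A minimal sketch, reusing the paths from above:
# Unpack the components of the best Arabic traineddata
# (combine_tessdata -u extracts them using the given output prefix)
!combine_tessdata -u /content/tesstutorial/tesseract/tessdata/best/ara.traineddata /tmp/ara.
# Count lines in the LSTM unicharset that contain Arabic-Indic digits
!grep -c '[٠١٢٣٤٥٦٧٨٩]' /tmp/ara.lstm-unicharset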

Related

How to make Powershell pass a list as argument to gcloud

I want to submit my neural network model to Google Cloud via the following command, as in the tutorial:
gcloud ai-platform jobs submit training ${JOB_NAME} \
--region=us-central1 \
--master-image-uri=gcr.io/cloud-ml-public/training/pytorch-gpu.1-10 \
--scale-tier=CUSTOM \
--master-machine-type=n1-standard-8 \
--master-accelerator=type=nvidia-tesla-p100,count=1 \
--job-dir=${JOB_DIR} \
--package-path=./trainer \
--module-name=trainer.task \
-- \
--train-files=gs://cloud-samples-data/ai-platform/chicago_taxi/training/small/taxi_trips_train.csv \
--eval-files=gs://cloud-samples-data/ai-platform/chicago_taxi/training/small/taxi_trips_eval.csv \
--num-epochs=10 \
--batch-size=100 \
--learning-rate=0.001
I am working in PowerShell and I have a problem with the master-accelerator argument, which should be a dictionary. I don't know how to pass one to gcloud. I have tried #{count=1; type=...}, but received a "Bad syntax for dict arg: [#]" error.
How can I pass a list of parameters in PowerShell such that the gcloud submit command accepts it?
I thank you in advance for your help.
EDIT
I have tried to use delimiters such as ^:^, but this still does not work (invalid delimiter).
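Not a confirmed answer, but one assumption worth testing: PowerShell splits an unquoted, comma-separated value into an array before gcloud ever sees it, so the type=nvidia-tesla-p100,count=1 pair may never reach gcloud intact. Quoting the whole flag keeps the comma literal; a minimal sketch with the same values as the question (the remaining flags stay unchanged):
gcloud ai-platform jobs submit training ${JOB_NAME} `
    --region=us-central1 `
    --scale-tier=CUSTOM `
    "--master-accelerator=type=nvidia-tesla-p100,count=1" `
    --master-machine-type=n1-standard-8 `
    --job-dir=${JOB_DIR}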

How to convert STL to rotating GIF using OpenSCAD?

Given an STL file, how can you convert it to an animated gif using the command line (bash)?
I've discovered a few articles that vaguely describe how to do this through the GUI. I've been able to generate the following; however, the animation is very rough and the shadows jump around.
for ((angle=0; angle <=360; angle+=5)); do
openscad /dev/null -o dump$angle.png -D "cube([2,3,4]);" --imgsize=250,250 --camera=0,0,0,45,0,$angle,25
done
# https://unix.stackexchange.com/a/489210/39263
ffmpeg \
-framerate 24 \
-pattern_type glob \
-i 'dump*.png' \
-r 8 \
-vf scale=512:-1 \
out.gif \
;
OpenSCAD has a built-in --animate X parameter; however, using that likely won't work when passing in the camera angle as a parameter.
Resources
https://github.com/openscad/openscad/issues/1632#issuecomment-219203658
https://blog.prusaprinters.org/how-to-animate-models-in-openscad_29523/
https://github.com/openscad/openscad/issues/1573
https://github.com/openscad/openscad/pull/1808
https://forum.openscad.org/Product-Video-produced-with-OpenSCAD-td15783.html
Bash + Docker
Converting an STL to a GIF requires several steps
Center the STL at the origin
Convert the STL into a collection of .PNG files from different angles
Combine those PNG files into a .gif file
Assuming you have Docker installed, you can run the following to convert an STL into an animated GIF.
(Note: a more up-to-date version of this script is available at spuder/CAD-scripts/stl2gif.)
This depends on three Docker containers:
spuder/stl2origin
openscad/openscad:2021.01
linuxserver/ffmpeg:version-4.4-cli
# 1. Use spuder/stl2origin:latest docker container to center the file at origin
# A file with the offsets will be saved to `${MYTMPDIR}/foo.sh`
file=/tmp/foo.stl
MYTMPDIR="$(mktemp -d)"
trap 'rm -rf -- "$MYTMPDIR"' EXIT
docker run \
-e OUTPUT_BASH_FILE=/output/foo.sh \
-v $(dirname "$file"):/input \
-v $MYTMPDIR:/output \
--rm spuder/stl2origin:latest \
"/input/$(basename "$file")"
cp "${file}" "$MYTMPDIR/foo.stl"
# 2. Read ${MYTMPDIR}/foo.sh and load the offset variables ($XTRANS, $XMID,$YTRANS,$YMID,$ZTRANS,$ZMID)
# Save the new centered STL to `$MYTMPDIR/foo-centered.stl`
source $MYTMPDIR/foo.sh
docker run \
-v "$MYTMPDIR:/input" \
-v "$MYTMPDIR:/output" \
openscad/openscad:2021.01 openscad /dev/null -D "translate([$XTRANS-$XMID,$YTRANS-$YMID,$ZTRANS-$ZMID])import(\"/input/foo.stl\");" -o "/output/foo-centered.stl"
# 3. Convert the STL into 60 .PNG images with the camera rotating around the object. Note: `$t` is a built-in OpenSCAD variable that is automatically set based on time when the --animate option is used
# OSX users will need to replace `openscad` with `/Applications/OpenSCAD.app/Contents/MacOS/OpenSCAD`
# Save all images to $MYTMPDIR/foo{0..60}.png
# This is not yet running in a docker container due to a bug: https://github.com/openscad/openscad/issues/4028
openscad /dev/null \
-D '$vpr = [60, 0, 360 * $t];' \
-o "${MYTMPDIR}/foo.png" \
-D "import(\"${MYTMPDIR}/foo-centered.stl\");" \
--imgsize=600,600 \
--animate 60 \
--colorscheme "Tomorrow Night" \
--viewall --autocenter
# 4. Use ffmpeg to combine all images into a .GIF file
# Tune framerate (15) and -r (60) to produce a faster/slower/smoother image
yes | ffmpeg \
-framerate 15 \
-pattern_type glob \
-i "$MYTMPDIR/*.png" \
-r 60 \
-vf scale=512:-1 \
"${file}.gif" \
;
rm -rf -- "$MYTMPDIR"
STL File
Gif without centering
Gif with centering

How to print debugging information on one/specific OpenAPI model?

According to the OpenAPI Generator docs, here is how one can print the generator's model data:
$ java -jar openapi-generator-cli.jar generate \
-g typescript-fetch \
-o out \
-i api.yaml \
-DdebugModels
which outputs 39,000 lines, making it a little difficult to find the model of interest.
How to output debug information on just one model?
Unfortunately, there's no way to generate the debug log for just one model or operation.
As a workaround, you can draft a new spec that contains the model you want to debug.
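As an illustration of that workaround (the file name one-model.yaml and the Pet schema below are made up for the sketch, not taken from the original spec), copy just the schema you care about into a stripped-down spec and run the generator against it:
# Write a minimal spec containing only the model of interest
cat > one-model.yaml <<'EOF'
openapi: 3.0.0
info:
  title: debug-one-model
  version: "1.0"
paths: {}
components:
  schemas:
    Pet:
      type: object
      properties:
        id:
          type: integer
        name:
          type: string
EOF
# Run the same generate command as before against the small spec
java -jar openapi-generator-cli.jar generate \
  -g typescript-fetch \
  -o out \
  -i one-model.yaml \
  -DdebugModels
The -DdebugModels output then only contains the handful of models defined in the stripped-down file.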

merge chromosomes in Plink

I have downloaded the 1000G dataset in VCF format. Using Plink 2.0, I have converted it into binary format.
Now I need to merge the 1-22 chromosomes.
I am using this script:
${BIN}plink2 \
--bfile /mnt/jw01-aruk-home01/projects/jia_mtx_gwas_2016/common_files/data/clean/thousand_genomes/from_1000G_web/chr1_1000Gv3 \
--make-bed \
--merge-list /mnt/jw01-aruk-home01/projects/jia_mtx_gwas_2016/common_files/data/clean/thousand_genomes/from_1000G_web/chromosomes_1000Gv3.txt \
--out /mnt/jw01-aruk-home01/projects/jia_mtx_gwas_2016/common_files/data/clean/thousand_genomes/from_1000G_web/all_chrs_1000G_v3 \
--noweb
But I get this error:
Error: --merge-list only accepts 1 parameter.
The chromosomes_1000Gv3.txt file lists the filesets for chromosomes 2-22 in this format:
chr2_1000Gv3.bed chr2_1000Gv3.bim chr2_1000Gv3.fam
chr3_1000Gv3.bed chr3_1000Gv3.bim chr3_1000Gv3.fam
....
Any suggestions on what might be the issue?
Thanks
--merge-list cannot be used in combination with --bfile. A single plink command can use either --bfile/--bmerge or --merge-list, but not both.
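Following that constraint, an illustrative sketch (assuming the chr1 fileset is added as its own line to chromosomes_1000Gv3.txt, so all 22 chromosomes are listed there) would drop --bfile and let --merge-list drive the merge:
# chromosomes_1000Gv3.txt now lists chr1_1000Gv3 through chr22_1000Gv3, one fileset per line
${BIN}plink2 \
    --merge-list /mnt/jw01-aruk-home01/projects/jia_mtx_gwas_2016/common_files/data/clean/thousand_genomes/from_1000G_web/chromosomes_1000Gv3.txt \
    --make-bed \
    --out /mnt/jw01-aruk-home01/projects/jia_mtx_gwas_2016/common_files/data/clean/thousand_genomes/from_1000G_web/all_chrs_1000G_v3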

Progress of simulations in headless mode

Is there any way to check the progress of simulations in headless mode, as opposed to the GUI?
Basic Code:
$ ~/netlogo-5.1.0/netlogo-headless.sh \
--model ~/myproject/MyModel.nlogo \
--experiment MyExperiment \
--table ~/myproject/MyNewOutputData.csv
I'd suggest doing tail -f ~/myproject/MyNewOutputData.csv. This will show you a live view of the output file as it is being written to.
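For example, a minimal sketch (same paths as in the question) that starts the run in the background and then follows the table output as rows are written:
~/netlogo-5.1.0/netlogo-headless.sh \
    --model ~/myproject/MyModel.nlogo \
    --experiment MyExperiment \
    --table ~/myproject/MyNewOutputData.csv &
# follow new rows as each run finishes
tail -f ~/myproject/MyNewOutputData.csv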