What is the difference between tessconfigs and configs in Tesseract-OCR?

In Tesseract-OCR, there are two folders of config files under tessdata - tessdata/configs and tessdata/tessconfigs.
Both seem to contain config files - for example, makebox is in configs while batch.nochop is in tessconfigs.
So what is the difference between them? Why split the config files across two folders? Do they contain different categories of config files?
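As far as I can tell, files from both folders are consumed the same way: config names are appended after the output base name, and Tesseract resolves them against its tessdata config folders. For example (assuming an input image named image.png), this combines batch.nochop from tessconfigs with makebox from configs:

tesseract image.png output batch.nochop makebox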

Related

Is there a build_extensions rule in build.yaml to output all generated Flutter models in a common directory?

I am trying to create a build_extensions rule in build.yaml for the freezed and json_serializable builders so that all generated models are output to the directory lib/generated/model, irrespective of their original location matching lib/**/*.dart.
What I have tried:
I would expect '^lib/**/{{}}.dart': 'lib/generated/model/{{}}.g.dart' to work, but it doesn't match any Dart files.
I also tried things like '^lib/{{path}}/{{file}}.dart': 'lib/generated/model/{{file}}.g.dart', but {{path}} needs to appear again in the destination, as per the documentation (why even enforce this?).
Example:
Base model location: lib/core/feature/profile/profile.dart
Generated outputs after calling flutter pub run build_runner build --delete-conflicting-outputs:
lib/generated/model/profile.g.dart
lib/generated/model/profile.freezed.dart
My current build.yaml (which generates .g.dart and .freezed.dart files in the child generated directory relative to the original model location) is as follows:
targets:
  $default:
    builders:
      source_gen|combining_builder:
        generate_for:
          - lib/**.dart
        options:
          build_extensions:
            # I want this line to "work":
            # '^lib/**/{{}}.dart': 'lib/generated/model/{{}}.g.dart'
            'lib/{{path}}/{{file}}.dart': 'lib/{{path}}/generated/{{file}}.g.dart'
      freezed:
        options:
          build_extensions:
            # I want this line to "work":
            # '^lib/**/{{}}.dart': 'lib/generated/model/{{}}.freezed.dart'
            'lib/{{path}}/{{file}}.dart': 'lib/{{path}}/generated/{{file}}.freezed.dart'
          field_rename: snake
          explicit_to_json: true
      json_serializable:
        options:
          field_rename: snake
          explicit_to_json: true
For anyone else looking for a similar solution: it turns out it's not possible to place multiple generated outputs into the same directory.
Build extensions need to be written in a way that prevents two different files from emitting the same output file (e.g. if you had lib/a/foo.dart and lib/b/foo.dart, there couldn't be a single foo.g.dart in lib/generated/model). ** is not something build extensions support; {{}} matches multiple characters and must then be used again in the destination. Apart from that, build extensions just match suffixes.
To avoid different source directories conflicting in the output directory, you'd have to use something like lib/generated/model/{{path}}/{{file}}.g.dart. There is no way to put everything in a single flat directory, even if you make sure that no two generated files would end up with the same name in the first place.
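Given those constraints, the closest workable setup keeps {{path}} in the destination so different source directories can't collide - a sketch, reusing the builder names from the question:

targets:
  $default:
    builders:
      source_gen|combining_builder:
        generate_for:
          - lib/**.dart
        options:
          build_extensions:
            # {{path}} reappears in the destination, so lib/a/foo.dart and
            # lib/b/foo.dart map to distinct outputs:
            'lib/{{path}}/{{file}}.dart': 'lib/generated/model/{{path}}/{{file}}.g.dart'
      freezed:
        options:
          build_extensions:
            'lib/{{path}}/{{file}}.dart': 'lib/generated/model/{{path}}/{{file}}.freezed.dart'

With the example model lib/core/feature/profile/profile.dart, that yields lib/generated/model/core/feature/profile/profile.g.dart rather than the flat lib/generated/model/profile.g.dart the question asked for.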

appService.zipIgnorePattern: How to ignore a file in a subfolder?

I am currently publishing a Python web application from VS Code to an MS Azure App Service.
I have a subfolder containing a rather large SQLite .db file that I would like to exclude when publishing to Azure: "\instance\Output.db"
I have tried the following - see the last entry, which is meant to exclude Output.db - but it does not exclude the file. How do I specify the subfolder and .db file in this zipIgnorePattern so the file is excluded?
"appService.zipIgnorePattern": [
"__pycache__{,/**}",
"*.py[cod]",
"*$py.class",
".Python{,/**}",
"build{,/**}",
"develop-eggs{,/**}",
"Output.db{,/**}"
],
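These entries look like glob patterns matched against forward-slash paths relative to the deployed folder, so "Output.db{,/**}" would only match a file at the root. A hedged, untested sketch: name the subfolder in the pattern (or use "**/Output.db" to match the file at any depth):

"appService.zipIgnorePattern": [
    "__pycache__{,/**}",
    "*.py[cod]",
    "*$py.class",
    ".Python{,/**}",
    "build{,/**}",
    "develop-eggs{,/**}",
    "instance/Output.db"
],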

Getting HASH of individual files within folder uploaded to IPFS

When I upload a folder of .jpg files to IPFS, I get the HASH of that folder - which is cool.
But is each individual file in that folder also getting hashed?
And if so, how do I get the hash of each file?
I basically want to be able to upload a whole bunch of files - like 500 images - and do it all at once, or programmatically, and have the hash of each file be returned to me.
Any way to do this?
Yes! From the command line you get back the CIDs (Content IDentifiers, a.k.a. IPFS hashes) for each file added when you run ipfs add -r <path to directory>:
$ ipfs add -r gifs
added QmfBAEYhJp9ZjGvv8utB3Yv8uuuxsDKjv9rurkHRsYU3ih gifs/martian-iron-man.gif
added QmRBHTH3p4W2xAzgLxvdh8VJvAmWBgchwCr9G98EprwetE gifs/needs-more-dogs.gif
added QmZbffnCcV598QxsUy7WphXCAMZJULZAzy94tuFZzbFcdK gifs/satisfied-with-your-care.gif
added QmTxnmk85ESr97j2xLNFeVZW2Kk9FquhdswofchF8iDGFg gifs/stone-of-triumph.gif
added QmcN71Qh56oSg2YXsEXuf8o6u5CrBXbyYYzgMyAkdkcxxK gifs/thanks-dog.gif
added QmTnuLaivKc1Aj8LBf2iWBHDXsmedip3zSPbQcGi6BFwTC gifs
The root CID for the directory is always the last item in the list.
You can limit the output of that command to just the CIDs using the --quiet flag:
$ ipfs add -r gifs --quiet
QmfBAEYhJp9ZjGvv8utB3Yv8uuuxsDKjv9rurkHRsYU3ih
QmRBHTH3p4W2xAzgLxvdh8VJvAmWBgchwCr9G98EprwetE
QmZbffnCcV598QxsUy7WphXCAMZJULZAzy94tuFZzbFcdK
QmTxnmk85ESr97j2xLNFeVZW2Kk9FquhdswofchF8iDGFg
QmcN71Qh56oSg2YXsEXuf8o6u5CrBXbyYYzgMyAkdkcxxK
QmTnuLaivKc1Aj8LBf2iWBHDXsmedip3zSPbQcGi6BFwTC
Or, if you know the CID for a directory, you can list out the files it contains and their individual CIDs with ipfs ls. Here I list out the contents of the gifs dir from the previous example:
$ ipfs ls QmTnuLaivKc1Aj8LBf2iWBHDXsmedip3zSPbQcGi6BFwTC
QmfBAEYhJp9ZjGvv8utB3Yv8uuuxsDKjv9rurkHRsYU3ih 2252675 martian-iron-man.gif
QmRBHTH3p4W2xAzgLxvdh8VJvAmWBgchwCr9G98EprwetE 1233669 needs-more-dogs.gif
QmZbffnCcV598QxsUy7WphXCAMZJULZAzy94tuFZzbFcdK 1395067 satisfied-with-your-care.gif
QmTxnmk85ESr97j2xLNFeVZW2Kk9FquhdswofchF8iDGFg 1154617 stone-of-triumph.gif
QmcN71Qh56oSg2YXsEXuf8o6u5CrBXbyYYzgMyAkdkcxxK 2322454 thanks-dog.gif
You can also do this programmatically with the core API in js-ipfs or go-ipfs. Here is an example of adding files from the local file system in Node.js using js-ipfs, from the docs for ipfs.addAll(files): https://github.com/ipfs/js-ipfs/blob/master/docs/core-api/FILES.md#importing-files-from-the-file-system
There is a super helpful video on how adding files to IPFS works over at https://www.youtube.com/watch?v=Z5zNPwMDYGg
And a walkthrough of js-ipfs here: https://github.com/ipfs/js-ipfs/tree/master/examples/ipfs-101
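If you'd rather script it in Python, here's a minimal sketch - assuming a running local IPFS daemon and the third-party py-ipfs-http-client package (both assumptions on my part, not part of the ipfs CLI):

import ipfshttpclient

# Connect to the local daemon's HTTP API (the library's default
# is localhost:5001).
client = ipfshttpclient.connect()

# Recursively add a directory; the result holds one entry per file,
# with the wrapping directory's CID last, just like `ipfs add -r`.
for entry in client.add('gifs', recursive=True):
    print(entry['Hash'], entry['Name'])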

How to associate hash-named .wsp files to my tagged graphite metrics?

I use Graphite tagged metrics with Grafana and Whisper, but http://graphite/tags/delSeries removes the series from the tag index while leaving the .wsp files on disk.
Untagged metrics create .wsp files with human-readable names in the whisper data folder, but tagged metrics create only hash-named folders and .wsp files in the _tagged directory.
Like so:
/whisper
    /data
        /Players
            registrations.wsp
            today_registrations.wsp
        /Gaming
            playing_count.wsp
    /_tagged
        /f58
            /010
                f58010d4cef67599a31f4daaab4a53c4d7fd85a9faea546282d2058c40c7e7b9.wsp
        /f56
            /031
                f56031052aec89dc9cc38e44dbe71b2eb08fb513a3e60d515eb1dc23f5b929d1.wsp
How can I find the .wsp file associated with a tagged metric?
I'm just running into that problem as well: how to map the actual path/tag metric to its corresponding hashed .wsp file.
I don't think you can compute the actual metric name from the hash, but you can go the other way around by using Graphite's encoding methods.
I've quickly written a Python script just for lab purposes: it can take several metric names (via stdin) and prints the mapping for each.
Just log into your Graphite host and create the following Python script (mapper.py, referenced below) in /opt/graphite/webapp/graphite/tags:
#!/opt/graphite/bin/python3
import sys

from utils import TaggedSeries

for line in sys.stdin:
    paths = line.split()
    for path in paths:
        # Normalize first (parsing sorts the tags into canonical order)
        parsed = TaggedSeries.parse(path)
        # Re-encode with hashing to get the _tagged/xxx/yyy/<hash> layout
        print(path + " -> /opt/graphite/storage/whisper/" + TaggedSeries.encode(parsed.path, '/', True) + ".wsp")
You can then pipe a list of metrics:
# echo "users.count;server=s1" |python mapper.py
users.count;server=s1 -> /opt/graphite/storage/whisper/_tagged/b6c/c91/b6cc916d608e4b145b318669606e79118cc41d316f96735dd43621db4fd2bcaf.wsp
You can also get all your tagged metrics and generate a file that you can later cat into the script. In this example I get all metrics associated with the tag 'server':
# curl -s "http://localhost/tags/findSeries?expr=server=~." | sed s/"\", \""/\\n/g > my_metrics
Then cat your metrics:
# cat my_metrics | python mapper.py
That's a starting point. From there you can easily do some simple scripting for deleting .wsp files - for example, ones not updated in the last month; see the sketch below.
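A hypothetical cleanup script along those lines (the name cleanup.py, the one-month cutoff, and the reliance on mapper.py's "metric -> path" output format are all my assumptions):

#!/opt/graphite/bin/python3
import os
import sys
import time

# Files whose data hasn't changed in roughly a month
CUTOFF = time.time() - 30 * 24 * 3600

for line in sys.stdin:
    # mapper.py prints lines like "users.count;server=s1 -> /path/to/hash.wsp"
    path = line.rsplit(' -> ', 1)[-1].strip()
    if os.path.isfile(path) and os.path.getmtime(path) < CUTOFF:
        os.remove(path)
        print("deleted " + path)

Usage: cat my_metrics | python mapper.py | python cleanup.py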

TFS Command line filter mask for nested directories

What is the TFS command line filter mask to exclude nested directories?
Consider the following example:
root
    dir1
        resources
        dir3
    dir2
    resources
I want to filter out only the root/dir1/resources folder.
Based on the official MS documentation on folder comparison filters, I should be able to write:
"!dir1\resources\" - Does not work; 'dir1\resources' is not filtered.
"!root\dir1\resources\" - Also does not work.
"!resources\" - This filters out 'root\resources' as well, and any other folder named 'resources'.
What am I missing?
According to the MSDN article you posted above:
If you want to exclude a subset of file or folder names, you must specify the filter for the file or folder name that you want to match first and then specify the exclusion filter.
So, you need to define the Filter to be:
*dir1\;*dir2\;*resources\;*dir3\;!resources\
Then, when you run a compare on the root folder, the root\dir1\resources leaf folder will be excluded.
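For reference, a hypothetical invocation from the command line (the server and local paths are placeholders; tf folderdiff accepts a /filter option):

tf folderdiff $/Project/root C:\src\root /recursive /filter:"*dir1\;*dir2\;*resources\;*dir3\;!resources\"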