Is it possible to merge raster bands from several folders using GDAL? - merge

I have two folders containing about 15 000 .tif files. Each file in the first folder is a raster with 5 bands, named AA_"number" meaning it looks like
AA_1.tif,
AA_2.tif,
...,
AA_15000.tif.
Each file in the second folder is a raster with 2 bands named BB_"number" and looks like
BB_1.tif,
BB_2.tif,
...,
BB_15000.tif.
My goal is to add bands 1-3 from first file from folder AA with band 1 from the first file in folder BB to create a 4 band raster, and make 15000 4 band rasters. After doing some research and testing things out in QGIS I believe the tool Merge from GDAL could solve this task, but I have not been able make it find the right files in different folders. And as I have 2x 15 000 files, it is not possible to do this selection manually. Is there anyone who know a smart solution to this, preferably using GDAL or QGIS?

There are many ways to do this, and it really depends on what the exact use case is. Like the type of analysis/visualization that needs to be done on the result.
With this many files, it could for example be nice to merge them using a VRT. That will avoid creating redundant data, but whether that's actually the best solution depends. Just stacking them in a new tiff-file would of course also work.
Unfortunately, creating a VRT using gdalbuildvrt / gdal.BuildVRT is not possible with multi-band inputs.
If your inputs are homogeneous in terms of properties, it should be fairly simple to set up a template where you fill in the file locations and write the VRT to disk. For more inputs with heterogeneous properties it might still be possible, but you'll have to be careful to take it all into account.
Conceptually such a VRT would look something like:
<VRTDataset rasterXSize="..." rasterYSize="...">
<SRS>...</SRS>
<GeoTransform>....</GeoTransform>
<VRTRasterBand dataType="..." band="1">
<ComplexSource>
<SourceFilename relativeToVRT="0">//some_drive/aa_folder/aa_file1.tif</SourceFilename>
<SourceBand>1</SourceBand>
...
</ComplexSource>
</VRTRasterBand>
<VRTRasterBand dataType="..." band="2">
<ComplexSource>
<SourceFilename relativeToVRT="0">//some_drive/aa_folder/aa_file1.tif</SourceFilename>
<SourceBand>2</SourceBand>
...
</ComplexSource>
</VRTRasterBand>
<VRTRasterBand dataType="..." band="3">
<ComplexSource>
<SourceFilename relativeToVRT="0">//some_drive/aa_folder/aa_file1.tif</SourceFilename>
<SourceBand>3</SourceBand>
...
</ComplexSource>
</VRTRasterBand>
<VRTRasterBand dataType="..." band="4">
<ComplexSource>
<SourceFilename relativeToVRT="0">//some_drive/bb_folder/bb_file1.tif</SourceFilename>
<SourceBand>1</SourceBand>
...
</ComplexSource>
</VRTRasterBand>
</VRTDataset>
You can first use gdalbuildvrt on some of your files to find all the properties that need to be filled in, like projection, pixel dimensions etc. That will work, but gdalbuildvrt will only be able to take the first band from the inputs. If all bands have homogeneous properties (like nodata value etc), that should be fine as a reference.

Related

Why FastText test of a model return only 1 exemple when my test file contains 135

I'm trying to test the model (model.bin) i've made with fastText on a test file (test.txt). In this test file, i have 135 labelised data. I'm expecting from fastText to test my model on this number of example, but instead, it only test it over 1 example. Where does come from this problem ?
I've already tried to do such a thing with another model and another testing file and all worked nicely.
this is how I test my model. model_baby.bin is the model, and test.data.txt is my testing file.
./fasttext test model_baby.bin test.data.txt
N 1
P#1 1
R#1 0.0164
Number of examples: 1
And here is an extract from my testing file
__label__4.0 I love the fact you can hide your stuff. Only down is that the straps to hold it at midpoint and bottom could be better designed for your car. It's got plenty of room which is great. __label__5.0 This hid our ipad wonderfully. Especially for those quick stops where we all had jump out and use the restroom. It zipped, folded and held all our stuff for the kids in the back seat. __label__3.0
As i have more than 1 labelised example in my testing file, I expect the output "Number of examples: " to be at least more than 1 but the actual one is "1"
From the official documentation (https://fasttext.cc/docs/en/supervised-tutorial.html): Each line of the text file contains a list of labels, followed by the corresponding document. All the labels start by the __label__ prefix, which is how fastText recognize what is a label or what is a word.
I don't understand very much your extract. I think it should be like this:
__label__4.0 I love the fact you can hide your stuff. Only down is that the straps to hold it at midpoint and bottom could be better designed for your car. It's got plenty of room which is great.
__label__5.0 This hid our ipad wonderfully. Especially for those quick stops where we all had jump out and use the restroom. It zipped, folded and held all our stuff for the kids in the back seat.
__label__3.0 ...

How to read info on voltage/beam energy, imaging mode, acquisition date/timestamp, etc. from image meta-data? (Tags)

DM scripting beginner here, almost no programming skills.
I would like to know the commands to access all the metadata of DM images/spectra.
I realized that all my STEM images at 80 kV taken between 2 dates (let's say 02.11.2017-05.04.2019) have the scale calibration wrong by the same factor (scale of all such images needs to be multiplied by 1.21).
I would like to write a script which multiplies the scale value by a factor only for images in scanning mode at 80 kV taken during a period for all images in a folder with subfolders or for all images opened in DM and save the new scale value.
I checked this website http://digitalmicrograph-scripting.tavernmaker.de/other%20resources/Old-DMHelp/AllFunctions.html but only found how to call the scale value (ImageGetDimensionCalibration). I have a general idea how to write the script based on other scripts if I find out how to call the metadata.
If anyone can write the whole script for me I would greatly appreciate your effort.
All general meta-data is organized in the image tag-structure
You can see this, if you open the Image Display Info of an image. (Via the menu, or by pressing CTRL + D) and then browse to the "Tags" section:
All info on the right are image tags and they are organized in a hierarchical tree.
How this tree looks like, and what information is written where, is totally open and will depend on what GMS version you are using, how the hardware is configured etc. Also custom scripts might alter this information.
So for a scripting start, open the data you want to modify and have a look in this tree.
Hint: The following min-script can be useful. It opens a tag-browsing window for the front-most image but as a modeless dialog (i.e. you can keep it open and interact with other parts):
GetFrontImage().ImageGetTagGroup().TagGroupOpenBrowserWindow(0)
The information you need to check against is most probably found in the Microscope Info sub-tree. Here, usually all information gathered from the microscope during acquisition is stored. What is there, will depend on your system and how it is set up.
The information of the STEM image acquisition - as far as the scanning engine and detector is concerned - is most probably in the DigiScan sub-tree.
The Data Bar sub-tree usually contains date and time of creation etc.
Calibration values are not stored in the image tag-structure
What you will not find in this tag-structure is the image calibration, i.e. the values actually used by DM to display calibrated values. These values are "one level up" so to speak here:
This is important to know in the following for your script, because you will need different commands for both the "meta-data" from the tags, and the "calibration" you want to change.
Accessing meta-data by script
The script-commands you need to read from the tags are all described in the F1 help documentation here:
Essentially, you need a command to get the "root" TagGroup of an image, which is ImageGetTagGroup() and then you traverse within this tree.
This might seem confusing - because there are a lot of slightly different commands for the different types of stored tags - but the essential bits are easy:
All "Paths" through the tree are just the individual names (typed exactly)
For each "branch" you have to use a single colon :
The commands to set/get a tag-value all require as input the "root" tagGroup object and the "path" as a string. The get commands require a variable of matching type to store the value in, the set commands need the value which should be written.
= The get commands themeselves return true or false depending on whether or not a tag-path could be found and the value could be read.
So the following script would read the "Imaging Mode" from the tags of the image shown as example above:
string mode
GetFrontImage().ImageGetTagGroup().TagGroupGetTagAsString( "Microscope Info:Imaging Mode", mode )
OKDialog( "Mode: " + mode )
and in a little more verbose form:
string mode // variable to hold the value
image img // variable for the image
string path // variable/constant to specify the where
TagGroup tg // variable to hold the "tagGroup" object
img := GetFrontImage() // Use the selected image
tg = img.ImageGetTagGroup() // From the image get the tags (root)
path = "Microscope Info:Imaging Mode" // specify the path
if ( tg.TagGroupGetTagAsString( path, mode ) )
OKDialog( "Mode: " + mode )
else
Throw( "Tag not found" )
If the tag is not a string but a value, you will need the according commands, i.e.
TagGroupGetTagAsNumber().

How do I generate a fixed sized list of facts (duplicates included)?

I'm new to ASP & Clingo and I need to work on a project for school. I thought about some basic music generator.
For now, I need to generate notes (I'm sticking with C major for now). I also want to generate them randomly and I don't know how to do that. How can I make the following code generate a random sequence of notes (duplicates too)?
note(c;d;e;f;g;a;b).
20 { play(X) : note(X)} 30.
#show play/1.
So far, the code won't allow for more than 7 as the upper bound, because it won't show duplicate notes.
Current output: play(b) play(g) play(e) play(c)
Wanted output: play(d) play(g) play(f) ...[20-30 randomly generated notes]
I want to be able to add constraints later (such as this note should not be followed by that note, and so on). I appreciate any tips since I know so little about this.
An answer set is a set. The atoms have no order and duplicates are not possible because it is a set.
You want to guess one note for each beat.
beat(1..8).
1 { play(N,B) : note(N) } 1 :- beat(B).

Google Sheets - Retrieve "A:File1" to "A:File2" where "Sheetname:File1" = "B:File2" if "C:File2" is between "E" and "F" in "File1"

Sorry for the somewhat long title, but I was told to be as specific as possible. :D
My problem will require some explantion.
So, I have 2 spreadsheets files ("Konverteringstabeller" and "Tee Posen").
In "Tee Posen" I have a sheet named "Scores MIK" (golf scorecard and my name).
In "Konverteringstabeller" I have sheets with conversion tables for multiple golf courses, but if one works, all should.
What I need is to find out what course handicap I would get if my golf handicap is "HCP 26,0" (as shown in File 2 Picture), and in this case that result should be 29 (not visible), but you should get the point.
(example: golf hcp 10 would result in course hcp 11, because 10 is between 9,9-10,7)
While I have been able to find the right result, it has only been in the "Konverteringstabeller" spreadsheet file and that is not the place I need it.
I want to have it written in E6 in the "Scores MIK" sheet in File 2.
I should mention that in "Scores MIK : File 2", cell C2 (Ikast Golf Klub) has data validation so I can easily change between the different courses in the "Konverteringstabeller" file once I add more.
What I have been messing with is something with vlookup and importrange with concatenate in it, but I can't figure out how to do it, so I ask for your help.
And I am by no means skilled in the art of Spreadsheets, so I would very much appreciate a detailed explanation.
Picture - Scores MIK (File 2)
Picture - Ikast Golf Klub (File 1)
Thanks in advance!
// Mikkel Christensen
OK so a couple notes - One is that to join a static cell where you keep the sheet name but allow it to chance you should add '$' around it, also if the rows for B8-E70 will always be the same position on the various sheets you also need to add $ around those as well.
here is an example of the whole formula
=IFERROR(ARRAYFORMULA(VLOOKUP(E5:E25;IMPORTRANGE("spreadsheet key";"'"&C2&"'!$B$8:$E$70");4;TRUE)))
And lastly - using the "&" operator to concatenate is better at least in my opinion because concatenate sometimes does not work as well with array formula - plus I find it personally quicker and easier to use that having wrap yet another function around my stuff.

Text classification using Weka

I'm a beginner to Weka and I'm trying to use it for text classification. I have seen how to StringToWordVector filter for classification. My question is, is there any way to add more features to the text I'm classifying? For example, if I wanted to add POS tags and named entity tags to the text, how would I use these features in a classifier?
It depends of the format of your dataset and the preprocessing steps you perform. For instance, let us suppose that you have pre-POS-tagged your texts, looking like:
The_det dog_n barks_v ._p
So you can build an specific tokenizer (see weka.core.tokenizers) to generate two tokens per word, one would be "The" and the other one would be "The_det" so you keep the tag information.
If you want only tagged words, then you can just ensure that "_" is not a delimiter in the weka.core.tokenizers.WordTokenizer.
My advice is to have both the words and tagged words, so a simpler way would be to write an script that joins the texts and the tagged texts. From a file containing "The dog barks" and another one cointaining "The_det dog_n barks_v ._p", it would generate a file with "The The_det dog dog_n barks barks_v . ._p". You may even forget about the order unless you are going to make use of n-grams.