Including allennlp predictor arguments in config.json - prediction

I'm training an allennlp crf_tagger. I'm using a predictor which is based on the
SentenceTaggerPredictor. The issue is the tokenizer argument - in the case of the SentenceTaggerPredictor there's a language argument.
Since SentenceTaggerPredictor has language="en_core_web_sm" as a defauly argument, when I do
Predictor.from_path("model.tar.gz", "sentence_tagger")
The tokenizer is created using the default language. But what happens if the training data was tokenized using a different language. How do I specify the arguments for the predictor in the model config.json such that Predictor.from_path will be constructed using a non-default language?

The Predictor.from_path() method has an overrides parameter that you could use in this case. For example, Predictor.from_path("model.tar.gz", "sentence_tagger", overrides={"dataset_reader.tokenizer.language": "en"}).

Related

Apache AGE - Creating Functions With Multiple Parameters

I was looking inside the create_vlabel function and noted that to get the graph_name and label_name it is used graph_name = PG_GETARG_NAME(0) and label_name = PG_GETARG_NAME(1). Since these two variables are also passed as parameters, I was thinking that, if I wanted to add one more parameter to this function, then I would need to use PG_GETARG_NAME(2) to get this parameter and use it in the function's logic. Is my assumption correct or do I need to do more tweaks to do this?
You are correct, but you also need to change the function signature in the "age--1.2.0.sql" file, updating the arguments:
CREATE FUNCTION ag_catalog.create_vlabel(graph_name name, label_name name, type new_argument)
RETURNS void
LANGUAGE c
AS 'MODULE_PATHNAME';
Note that all arguments come as a "Datum" struct, and PG_GETARG_NAME automatically converts it to a "Name" struct. If you need an argument as int32, for example, you should use PG_GETARG_INT32(index_of_the_argument), for strings, PG_GETARG_CSTRING(n), and so on.
Yes, your assumption is correct. If you want to add an additional parameter to the create_vlabel function in PostgreSQL, you can retrieve the value of the third argument using PG_GETARG_NAME(2). Keep in mind that you may need to make additional modifications to the function's logic to handle the new parameter correctly.
The answers given by Fahad Zaheer and Marco Souza are correct, but you can also create a Variadic function, with which you could have n number of arguments but one drawback is that you would have to check the type yourself. You can find more information here. You can also check many Apache Age functions made this way e.g agtype_to_int2.

What is the difference in creating uvm_reg_field with or without get_full_name()

What is the difference between
this.ModuleEn=uvm_reg_field::type_id::create("ModuleEn");
and
this.ModuleEn=uvm_reg_field::type_id::create("ModuleEn",,get_full_name());
I don't see difference in simulation results.
The 2nd and 3rd arguments to create() affect the lookup of factory overrides. If you have no overrides (which is typical for RAL models), these arguments will not make any difference.
The second argument would be used to set the context of the override if you were creating this inside a uvm_component. The third argument is used to set a context via a string path, which in this case is being set by the register's path.

Extract Types/Classnames from flat Modelica code

I was wondering if there already exists a possibility to extract from flat Modelica code all variables AND their corresponding types (classnames respectively).
For example:
Given an extract from a flattened Modelica model:
constant Integer nSurfaces = 8;
constant Integer construction1.nLayers(min = 1.0) = 2 "Number of layers of the construction";
parameter Modelica.SIunits.Length construction1.thickness[construction1.nLayers]= {0.2, 0.1} "Thickness of each construction layer";
Here, the wanted output would be something like:
nSurfaces, Integer, constant;
construction1.nLayers, Integer, constant;
construction1.thickness[construction1.nLayers], Modelica.SIunits.Length, parameter
Ideally, for construction1.thickness there would be two lines (=number of construction1.nLayers).
I know, that it is possible to get a list of used variables from the dsin.txt, which is produced while translating a model. But until now I did not find an already existing way to get the corresponding types. And I really would like to avoid writing an own parser :-).
You could try to generate the file modelDescription.xml as defined by the FMI standard. It contains a ton of information and XML should be easier to parse, e.g. python has a couple of xml parsing/reading packages.
If you are using Dymola you just set the flag Advanced.FMI.GenerateModelDescriptionInterface2 = true to generate the model description file.
The second idea could be to let the compiler/tool parse the Modelica file for you as they need to do that anyway, try searching for AST (abstract syntax tree). In Dymola, this is available through the ModelManagement library, and also through the Python interface.
Third idea could be to use one of the Modelica parsers available, e.g. have a look at:
https://github.com/lbl-srg/modelica-json
https://hackage.haskell.org/package/modelicaparser
https://github.com/xie-dongping/modparc
https://github.com/pymoca/pymoca
https://github.com/pymola/pymola/tree/master/src/pymola
Fourth, if all that did not work, you still do not have to write a full parser, you could use ANTLR, then use an existing grammar file (look for e.g. modelica.g4).

handing string to MATLAB function in Simulink

In my Simulink Model I have a MATLAB function, this_function, which uses as one parameter the name of the Simulink Model, modelname. The name is defined in an extra parameter file with all other parameters needed. Loading the parameter file loads modelname into the workspace. The problem is now, that this_function can't access modelname in the workspace and therefore the model doesn't run.
I tried to use modelname as a constant input source for this_function, which I used as a work-around previously, but Simulink doesn't accept chars/strings as signals. Furthermore does setting modelname to global not work as well.
Is there a way to keep modelname in the parameter file instead of writing it directly into this_function?
Simulink does not support strings. Like, anywhere. It really sucks and I don't know why this limitation exists - it seems like a pretty horrible design choice to me.
I've found the following workarounds for this limitation:
Dirty Casting
Let
function yourFun(num_param1, num_param2, ..., str_param);
be your MATLAB function inside the Simulink block, with str_param the parameter you want to be a string, and num_param[X] any other parameters. Then pass the string signal to the function like so:
yourFun(3, [4 5], ..., 'the_string'+0);
Note that '+0' at the end; that is shorthand for casting a string to an array of integers, corresponding to the ASCII codes of each character in the string. Then, inside the function, you could get the string back by doing the inverse:
model = char(str_param);
but usually that results in problems later on (strcmp not supported, upper/lower not supported, etc.). So, you could do string comparisons (and similar operations) in a similar fashion:
isequal(str_param, 'comparison'+0);
This has all the benefits of strings, without actually using strings.
Of course, the '+0' trick can be used inside constant blocks as well, model callbacks to convert workspace variables on preLoad, etc.
Note that support for variable size arrays must be enabled in the MATLAB function.
Fixed Option Set
Instead of passing in a string, you can pass a numeric scalar, which corresponds to a selection in a list of fixed, hardcoded options:
function yourFun(..., option)
...
switch (option)
case 1
model = 'model_idealised';
case 2
model = 'model_with_drag';
case 3
model = 'model_fullscale';
otherwise
error('Invalid option.');
end
...
end
Both alternatives are not ideal, both are contrived, and both are prone to error and/or have reusability and scalability problems. It's pretty hopeless.
Simulink should start supporting strings natively, I mean, come on.

Specification and correct use of (boolean) URI matrix parameters (and making them optional when using CXF/JAXB)?

I was wondering if the "proper" use of URI/URL matrix parameters was ever defined in a specification, such as an RFC or a W3 recommendation?
In particular, I just joined a project where we use matrix parameters and a Java framework to implement a REST service. One of the matrix parameters we have for our REST service is a boolean one, much like ;sortByDate=true
What bugged me about this one is that the Java framework we use apparently insists that boolean parameters are always passed in (i.e. you cannot make them optional/omit them; probably because they are converted to Java boolean type). I think that's a bit odd...
I have to doublecheck what framework we use tomorrow (I think it is JAXB), but in the meantime I wondered if matrix parameters were defined in an official specification somewhere, and if such a specification made any mention of boolean parameters.
So far I found a hint (though no mention of boolean matrix parameters) in Appendix B 2.2 of the W3's "HTML 4.01 Specification":
We recommend that HTTP server implementors, and in particular, CGI implementors support the use of ";" in place of "&" to save authors the trouble of escaping "&" characters in this manner.
And the "Web Application Description Language" specification specifies:
Boolean matrix parameters are represented as: ';' name when value is true and are omitted from identifier when value is false
What I haven't found is "the" specification for matrix parameters. Is there any? Does it mention how boolean matrix parameters should be used? If not, is there an established best practice?
And, as a bonus question: can you omit boolean URL matrix parameters when using CXF (JAXB), or do you always have to specify them?
Cheers! :)
Update: We're using CXF (which apparently uses JAXB under the hood...)
RFC3986 describes Matrix Parameters without explicitly naming them. Quoting https://www.rfc-editor.org/rfc/rfc3986#section-3.3:
For example, the semicolon (";") and equals ("=") reserved characters are often used to delimit parameters and parameter values applicable to that segment. The comma (",") reserved character is often used for similar purposes. For example, one URI producer might use a segment such as "name;v=1.1" to indicate a reference to version 1.1 of "name", whereas another might use a segment such as "name,1.1" to indicate the same.
I hope this helps.
I think that this answer does a good job of explaining the purpose of matrix parameters:
https://stackoverflow.com/a/5602678
You can use the Boolean wrapper class to support an optional boolean value. The values true and false will be mapped to the correct boolean values.
#MatrixParam("sortByDate") Boolean sortByDate
It will be null if the param is not present. Note that JAXB doesn't apply when dealing with JAX-RS parameters.