TOML multi-level table syntax inside array -- illegal or not

I'm seeing an error during parsing with the Python and JS parsers. I can't seem to find an example in the TOML repo about whether this should be accepted or not:
[[somearray]]
one.two = false # fails, '.' not allowed in key
I know I can express this as:
[[somearray]]
[somearray.one]
two = false # OK
The TOML readme offers this example (among others):
[fruit]
apple.color = "red"
apple.taste.sweet = true
which does not involve arrays but seems to legitimize this syntax.
I also observe that a plain file with this content:
apple.color = "red"
is rejected. My conclusion so far: until you enter "table mode" (so to speak) with an actual bracketed table, keys can only be single, undotted keys.
Now again, in the TOML readme it says (although in the tables section):
"Dotted keys define everything to the left of each dot as a table."
Obviously, if mainstream parsers choke on it, it's not a good idea to use it, but I'd like to understand if/why it's a known no-no. Is there some ambiguity I'm not seeing?

These cases are confirmed valid TOML, per the 1.0 spec, which adds many more examples: https://github.com/toml-lang/toml/blob/1.0.0/toml.md

The following is definitely valid TOML syntax:
[[somearray]]
one.two = false
It should result in a model that is equivalent to this JSON:
{
    "somearray": [
        {
            "one": {
                "two": false
            }
        }
    ]
}
It can be validated here.
Dotted keys should indeed define everything to the left of each dot as a table.
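For what it's worth, the snippet parses cleanly with Python's standard-library parser (tomllib, available since Python 3.11, which implements TOML 1.0); a minimal check:

import tomllib  # stdlib TOML 1.0 parser, Python 3.11+

doc = tomllib.loads("""
[[somearray]]
one.two = false
""")
print(doc)  # {'somearray': [{'one': {'two': False}}]}

An older or pre-1.0 parser may still reject it, which would explain the errors observed in the question.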


Match part of a string with regex

I have two arrays of strings and I want to check if a string of array a matches a string from array b. Those strings are phone numbers that might come in different formats. For example:
Array a might have a phone number with a prefix, like so: +44123123123 or 0044123123123
Array b has a standard format without prefixes, like so: 123123123
So I'm looking for a regex that can match a part of a string, like +44123123123 with 123123123.
Btw, I'm using Swift, but I don't think there's a native way to do it (or at least a more straightforward solution).
EDIT
I decided to reactivate the question after experimenting with the library @Larme mentioned, because of inconsistent results.
I'd prefer a simpler solution, as I've stated earlier.
SOLUTION
Thanks guys for the responses. I saw many comments saying that regex is not the right solution for this problem, and this is partly true. It could be true (or false) depending on my current setup/architecture (which, thinking about it now, I realise I should've explained better).
So I ended up using the native solution (hasSuffix/contains), but to do that I had to do some refactoring of the way the entire flow was structured. In the end I think it was the least complicated and the more performant solution of the two. I'll give the bounty to @Alexey Inkin for being the first to mention the native solution, and the right answer to @Ωmega for providing a more complete solution.
I believe regex is not the right approach for this task.
Instead, you should do something like this:
var c: [String] = b.filter { (short: String) -> Bool in
    var result = false
    for full in a {
        result = result || full.hasSuffix(short)
    }
    return result
}
Check this demo.
...or similar solution like this:
var c: [String] = b.filter { (short: String) -> Bool in
    for full in a {
        if full.hasSuffix(short) { return true }
    }
    return false
}
Check this demo.
As you do not mention requirements for prefixes, the simplest solution is to check whether a string in a ends with a string in b. For this, take a look at https://developer.apple.com/documentation/swift/string/1541149-hassuffix
Then, if you have to check if the prefix belongs to a country, you may replace ^00 with + and then run a whitelist check against known prefixes. And the prefix itself can be obtained as a substring by cutting b's length of characters. Not really a regex's job.
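A minimal Swift sketch of that normalization step (the variable names are just illustration):

let number = "0044123123123"
// Rewrite a leading 00 into + before whitelisting known country prefixes
let normalized = number.replacingOccurrences(
    of: "^00", with: "+", options: .regularExpression)
// normalized == "+44123123123"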
I agree with Alexey Inkin that this can also nicely be solved without regex. If you really want a regex, you can try something like the following:
(?:(\+|00)(93|355|213|1684|376))?(\d+)
where:
(\+|00) matches the "prefix-prefix" (+ or 00), captured as group 1
(93|355|213|1684|376) matches a country prefix, captured as group 2; add all your expected country prefixes here (see below)
(?: ... )? matches the whole country-prefix part if it exists, but doesn't give it a group number
(\d+) matches the local phone number, captured as group 3
Unfortunately, with this regex you have to provide all the expected country prefixes. But you can surely get this list online, e.g. here: https://www.countrycode.org
With this regex above you will get the local phone number in matching group 3 (and the "prefix-prefix" in group 1 and the country code in group 2).
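If you go that route in Swift, a hedged sketch with NSRegularExpression could look like the following (44 is added to the alternation so the example number matches; the prefix list is otherwise illustrative):

import Foundation

let pattern = "^(?:(\\+|00)(44|93|355|213|1684|376))?(\\d+)$"
let number = "+44123123123"
let regex = try! NSRegularExpression(pattern: pattern)
let range = NSRange(number.startIndex..., in: number)
if let match = regex.firstMatch(in: number, options: [], range: range),
   let localRange = Range(match.range(at: 3), in: number) {
    print(number[localRange]) // "123123123"
}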

Powershell circular dependencies

I have a scenario where I THINK a circular dependency is the right answer. In my actual code I have a class that contains some data that is used for token replacements. That class contains other classes for the data, and contains a method that does the lookup of token values and returns the value. However, those dependent classes need to be validated, and some of those values are dependent on lookups. So, I have mocked up this code to validate the approach.
class Tokens {
    [Collections.Specialized.OrderedDictionary]$primaryData
    [Collections.Specialized.OrderedDictionary]$secondaryData
    Tokens ([Secondary]$secondary) {
        $this.primaryData = [Ordered]@{
            '1' = 'One'
            '2' = 'Two'
            '3' = 'Three'
            '4' = 'Four'
            '5' = 'Five'
        }
        $this.secondaryData = $secondary.secondaryData
    }
    [String] GetToken ([String]$library, [String]$item) {
        return $this.$library.$item
    }
    [Void] SetToken ([String]$library, [String]$item, [String]$value) {
        $this.$library.$item = $value
    }
    [String] ToString () {
        [Collections.Generic.List[String]]$toString = [Collections.Generic.List[String]]::new()
        foreach ($key in $this.primaryData.Keys) {
            $toString.Add("$key : $($this.primaryData.$key)")
        }
        foreach ($key in $this.secondaryData.Keys) {
            $toString.Add("$key : $($this.secondaryData.$key)")
        }
        return [string]::Join("`n", $toString)
    }
}
class Secondary {
    [Collections.Specialized.OrderedDictionary]$secondaryData
    Secondary () {
        $this.secondaryData = [Ordered]@{
            'A' = 'a'
            'B' = 'b'
            'C' = 'c'
            'D' = 'd'
            'E' = 'e'
        }
    }
    [Void] Update ([Tokens]$tokensReference) {
        $tokensReference.SetToken('secondaryData', 'A', 'A')
        $tokensReference.SetToken('secondaryData', 'B', "$($tokensReference.GetToken('secondaryData', 'A')) and $($tokensReference.GetToken('secondaryData', 'B'))")
    }
    [String] ToString () {
        [Collections.Generic.List[String]]$toString = [Collections.Generic.List[String]]::new()
        foreach ($key in $this.secondaryData.Keys) {
            $toString.Add("$key : $($this.secondaryData.$key)")
        }
        return [string]::Join("`n", $toString)
    }
}
CLS
$secondary = [Secondary]::new()
$tokens = [Tokens]::new($secondary)
$secondary.Update($tokens)
Write-Host "$($tokens.ToString())"
This is working, exactly as expected. However, the idea of circular dependency injection has my hair standing on end. Like, it could be a real problem, or at least a code smell. So, my question is: am I on the right track, or is this a dead end and I just haven't found that end yet? Given that PowerShell isn't "fully" object oriented yet, I imagine there could be some uniquely PS-related issues, and everything I have found searching for PowerShell circular dependency talks about removing them. So far I haven't found anything about when it is appropriate and how to do it well.
And, assuming it is a valid approach, is there anything obvious in this simplified implementation that could lead to problems or could be better done some other way?
EDIT: OK, so perhaps I need to refine my vocabulary a bit too. I was thinking circular dependency since Secondary is a dependency, or member perhaps, of Tokens, and then I update Secondary from inside Secondary, while referencing data from Secondary, via the method in Tokens.
To clarify (I hope) the ultimate goal: these lookups are for program-specific data, which I have in XML files. So, for example, the data file for Autodesk Revit 2021 would include these three items:
<item id="GUID" type="literal">{7346B4A0-2100-0510-0000-705C0D862004}</item>
<item id="installDataKey" type="registryKeyPath">HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\[Revit 2021~GUID]</item>
<item id="displayName" type="registryPropertyValue" path="installDataKey">Revit 2021</item>
In actual use I would want to get the DisplayName property found in the key defined in <item id="installDataKey">, and if the value matches the value in <item id="displayName"> then I might also look for the value of the DisplayVersion property in the same key, and make decisions based on that. But because there are new versions every year, and 20+ different software packages to address, managing these data files is a pain. So I want to validate the data I have in the files against a machine that actually has the software installed, to be sure I have my data correct. Autodesk is famous for changing things for no good reason, and often for some very customer-hostile reasons.
So, things like referencing the GUID as data and reusing it as a token, i.e. the [Revit 2021~GUID] above, save effort. During the validation process only, I would want to set the GUID, then do the standard token replacement to convert HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\[Revit 2021~GUID] to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\{7346B4A0-2100-0510-0000-705C0D862004} and, should that key actually be found, use it to validate about 20 other registry and file paths as well as the actual value of DisplayName. If everything validates I will sign the XML, and in actual use signed XML will basically treat everything as a literal and no validation is done, or rather it was prevalidated.
But before use, a reference to [product~installDataKey] when the current product is Revit 2021 would resolve first to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\[Revit 2021~GUID], then to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\{7346B4A0-2100-0510-0000-705C0D862004}, at which point the code could use it as a registry path and see if Revit 2021 is in fact installed.
So, 99.9% of the time, I would instantiate Tokens with a constructor that just includes the full dataset and move on. But on the .1% of occasions where I am validating the data itself, I need to be able to read the xml, set the value for that GUID and immediately use the lookup to validate that Autodesk hasn't done something stupid like move some data out of the GUID key and into a secondary key. They have done that before.
I also might want to replace HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall with a token like [windows~uninstallKey] just to make life easier there too.
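To make the token-replacement step concrete, here is a minimal, hypothetical sketch of resolving [library~item] references with a regex plus the GetToken lookup from the mock-up above; the pattern and the library naming are assumptions, not the actual implementation:

$path = 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\[Revit 2021~GUID]'
# Replace every [library~item] token with the value returned by GetToken;
# recent PowerShell versions accept a scriptblock as the MatchEvaluator
$resolved = [regex]::Replace($path, '\[(?<lib>[^~\]]+)~(?<item>[^\]]+)\]', {
    param($match)
    $tokens.GetToken($match.Groups['lib'].Value, $match.Groups['item'].Value)
})

Repeating the replacement until no tokens remain would handle nested references like [product~installDataKey] resolving to a string that itself contains [Revit 2021~GUID].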
Hopefully that makes some sense. It's a mess to be sure, but anything to do with Autodesk is a mess.

How does resource.data.size() work in firestore rules (what is being counted)?

TLDR: What is request.resource.data.size() counting in the firestore rules when writing, say, some booleans and a nested Object to a document? Not sure what the docs mean by "entries in the map" (https://firebase.google.com/docs/reference/rules/rules.firestore.Resource#data, https://firebase.google.com/docs/reference/rules/rules.Map) and my assumptions appear to be wrong when testing in the rules simulator (similar problem with request.resource.data.keys().size()).
Longer version: I'm running into a problem in Firestore rules where I'm not able to update data as expected (despite similar tests working in the rules simulator). I have narrowed down the problem to the point where I can see that the culprit is a rule checking for request.resource.data.size() equaling a certain number.
An example of the data being passed to the firestore update function looks like
Object {
    "parentObj": Object {
        "nestedObj": Object {
            "key1": Timestamp {
                "nanoseconds": 998000000,
                "seconds": 1536498767,
            },
        },
    },
    "otherKey": true,
}
where the timestamp is generated via firebase.firestore.Timestamp.now().
This appears to work fine in the rules simulator, but not for the actual data when doing
let obj = {}
obj.otherKey = true
// since want to set object key name dynamically as nestedObj value,
// see https://stackoverflow.com/a/47296152/8236733
obj.parentObj = {} // needed for adding nested dynamic keys
obj.parentObj[nestedObj] = {
    key1: fb.firestore.Timestamp.now()
}
firebase.firestore().collection('mycollection')
    .doc('mydoc')
    .update(obj)
Among some other rules, I use the rule request.resource.data.size() == 2, and this appears to be the rule that causes a permission-denied error (since commenting out this rule gets things working again). I would think that since the object is being passed with 2 (top-level) keys, request.resource.data.size() would be 2, but this is apparently not the case (nor is it the total number of keys in the passed object) (similar problem with request.resource.data.keys().size()). So that's a long example for a short question. It would be very helpful if someone could clarify what is going wrong here.
From my last communications with firebase support around a month ago - there were issues with request.resource.data.size() and timestamp based security rules for queries.
I was also told that request.resource.data.size() is the size of the document AFTER a successful write. So if you're writing 2 additional keys to a document with 4 keys, that value you should be checking against is 6, not 2.
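For illustration, here is a minimal rules sketch of that interpretation (the collection path and key counts are assumptions for the example):

rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /mycollection/{doc} {
      // The doc already has 4 keys and the update writes 2 more,
      // so the post-write map size is 6, not 2.
      allow update: if request.resource.data.size() == 6;
    }
  }
}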
Having said all that - I am still having problems with request.resource.data.size() and any alternatives such as request.resource.size() which seems to be used in this documentation
https://firebase.google.com/docs/firestore/solutions/role-based-access
I also have some places in my security rules where it seems to work. I personally don't know why that is though.
Been struggling with that for a few hours and I see now that the doc on Firebase is clear: "the request.resource variable contains the future state of the document". So with ALL the fields, not only the ones being sent.
https://firebase.google.com/docs/firestore/security/rules-conditions#data_validation.
But there is actually another way to ONLY count the number of fields being sent with request.writeFields.size(). The property writeFields is a table with all the incoming fields.
Beware: writeFields is deprecated and may stop working anytime, but I have not found any replacement.
EDIT: writeFields apparently does not work in the simulator anymore...
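(For what it's worth, newer rules versions support Map.diff(), which compares the incoming state against the existing document; if it is available to you, a sketch like the following counts only the keys actually touched by the write:)

// hedged sketch: count only the fields changed by this update
allow update: if request.resource.data.diff(resource.data).affectedKeys().size() == 2;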

UIMA Ruta: Copy the feature value from a contained annotation to a containing annotation

Note: This seems heavily related to Setting feature value to the count of containing annotation in UIMA Ruta. But I cannot quite apply the answer to my situation.
I am analyzing plain text documents where the following structure is assumed:
Document (one, of course)
Section (many)
Heading (one per section)
I am being asked to identify sections by checking whether their headings satisfy conditions. A useful and obvious sort of condition would be: does the heading match a given regular expression? A less-useful but perhaps more achievable condition would be: does the heading contain a given text?
I could, and already have, achieved this by taking a list of tuples of regular expressions and section titles and, at design time, writing the following for each member of the list:
BLOCK(forEach) SECTION{} {
    ...
    HEADING{REGEXP(".*table.*contents.*", true) ->
        SETFEATURE("value", "Table of Contents")};
    ...
}
SECTION{ -> SETFEATURE("value", "Table of Contents")}
    <- { HEADING.headingValue == "Table of Contents"; };
This approach is fairly straightforward but has a few big drawbacks:
It heavily violates the DRY principle
Even when writing the rule for just one section to identify, the rule author must copy the section title twice (it should only need to be specified once)
It makes the script needlessly long and unwieldy
It puts a big burden on the rule author, who, in an ideal case, would only need to know regex, not Ruta
So I wanted to refactor to achieve the following goals:
A text file is used to store the regular expressions and corresponding titles, and the rule iterates over these pairs
Features, rather than types, are used to differentiate different sections/headings (i.e. like above, using SECTION.value=="Table of Contents" and not TableOfContentsSection)
After looking over the UIMA Ruta reference to see which options were available to achieve these goals, I settled on the following:
Use a WORDTABLE to store tuples of section title, words to find / regex if possible, lookup type - so for instance, Table of Contents,contents,sectiontitles
Use MARKTABLE to mark an intermediate annotation type LookupMatch whose hint feature contains the section title and whose lookup feature contains the type of lookup we are talking about
For each HEADING, see if a LookupMatch.lookup == "sectiontitle" is inside, and if it is, copy the LookupMatch.hint to the heading's value field.
For each SECTION, see if a HEADING with a value is inside; if so, copy the value to the SECTION.value field.
It was not quite a surprise to find that implementing steps 3 and 4 was not so easy. That's where I am at and why I am asking for help.
// STEP 1
WORDTABLE Structure_Heading_WordTable =
    '/uima/resource/structure/Structure_Heading_WordTable.csv';

// STEP 2
Document.docType == "Contract"{
    -> MARKTABLE(LookupMatch,            // annotation
        2,                               // lookup column #
        Structure_Heading_WordTable,     // word table to look up
        true,                            // case-insensitivity
        0,                               // length before case-insensitivity
        "",                              // characters to ignore
        0,                               // matches to ignore
        "hint" = 1, "lookup" = 3         // features
    )
};

// STEPS 3 AND 4 ... ???
BLOCK(ForEach) LookupMatch.lookup == "sectiontitle"{} {
    ???
}
HEADING{ -> SETFEATURE("value", ???)} <- {
    ???
};
Here is my first real stab at it:
HEADING{ -> SETFEATURE("value", lookupMatchHint)} <- {
    LookupMatch.lookup == "HeadingWords"{ -> GETFEATURE("hint", lookupMatchHint)};
};
TL;DR
How can I conditionally copy a feature value from one annotation to another? GETFEATURE kind of assumes that you only get 1...
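For what it's worth, one possible shape for steps 3 and 4 builds on the GETFEATURE attempt above by declaring the string variables and splitting the work across two rules per window. This is an untested sketch; the block names and window structure are assumptions:

STRING lookupMatchHint;
STRING headingValue;

// Step 3: within each heading, pull the hint out of a contained LookupMatch,
// then write it into the heading's value feature
BLOCK(perHeading) HEADING{} {
    LookupMatch.lookup == "sectiontitle"{ -> GETFEATURE("hint", lookupMatchHint)};
    HEADING{ -> SETFEATURE("value", lookupMatchHint)};
}

// Step 4: within each section, copy the heading's value up to the section
BLOCK(perSection) SECTION{} {
    HEADING{ -> GETFEATURE("value", headingValue)};
    SECTION{ -> SETFEATURE("value", headingValue)};
}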

How to escape some characters in postgresql

I have this data in one column in PostgreSQL:
{
    "geometry": {
        "status": "Point",
        "coordinates": [
            -122.421583,
            37.795027
        ]
    },
and I am using this query:
select * from students where data_json LIKE '%status%' ;
The above query returns results, but this one does not:
select * from students where data_json LIKE '%status:%' ;
How can I fix that?
Of course the 2nd one doesn't find a match, there's no status: text in the value. I think you wanted:
select * from students where data_json LIKE '%"status":%'
... however, like most cases where you attempt text pattern matching on structured data, this is in general a terrible idea that will bite you. Just a couple of problem examples:
{
    "somekey": "the value is \"status\": true"
}
... where "status": appears as part of the text value and will match even though it shouldn't, and:
{
    status : "blah"
}
where status has no quotes and a space between the quotes and colon. As far as JavaScript is concerned this is the same as "status": but it won't match.
If you're trying to find fields within json or extract fields from json, do it with a json parser. PL/V8 may be of interest, or the json libraries available for tools like pl/perl, pl/pythonu, etc. Future PostgreSQL versions will have functions to get a json key by path, test if a json value exists, etc, but 9.2 does not.
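For example, on PostgreSQL 9.3 or later, where JSON operators exist, the check can be expressed against the parsed structure instead of the raw text (a sketch, assuming data_json can be cast to json):

-- extract geometry -> status as text and compare it properly
select *
from students
where data_json::json #>> '{geometry,status}' = 'Point';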
At this point you might be thinking "why don't I use regular expressions?". Don't go there; you do not want to try to write a full JSON parser in regex. This blog entry is somewhat relevant.