What are allowed characters for a submodule name in registerModule? - typo3

registerModule() expects a submodule key as a third parameter.
I think it should probably not contain a space and only alphabetic characters (or alphanumeric?) and underscore ('_'), but I'm not really sure.
I could not find specific information for this.

The function uses \TYPO3\CMS\Core\Utility\GeneralUtility::underscoredToUpperCamelCase to generate the full module name, which combines the main module and the submodule, joined with an underscore ('_').
So you already guessed the right answer.

It's a bit complicated to answer!
The official API documentation does not provide exact information. I have worked on some extensions that have multiple submodules, and I'm quite sure special characters are not allowed in the submodule key.
e.g. web_TestTestbe123 (mainModulename_subModuleKey)
I have noticed the characteristics below for the key:
Key must be lowercase
No spaces allowed
Numeric values are fine
Does this make sense?

I found this in the documentation just now:
Backend modules
1. The modkey is made up of alphanumeric characters only. It does not contain underscores and starts with a letter.
https://docs.typo3.org/m/typo3/reference-coreapi/master/en-us/ExtensionArchitecture/NamingConventions/Index.html
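The naming convention quoted above can be sketched as a small check. This is only an illustration in Python, not part of TYPO3: the function name is_valid_modkey is hypothetical, and the pattern assumes "alphanumeric" means ASCII letters and digits.

```python
import re

# Hypothetical helper, not a TYPO3 API: checks a key against the quoted
# convention (alphanumeric only, no underscores, starts with a letter).
# Assumption: "alphanumeric" means ASCII letters and digits.
MODKEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_valid_modkey(key: str) -> bool:
    """Return True if the whole key matches the convention."""
    return MODKEY_PATTERN.fullmatch(key) is not None


if __name__ == "__main__":
    for candidate in ["testbe123", "test_be", "1test", "my module"]:
        print(candidate, is_valid_modkey(candidate))
```

Only the first candidate passes; the others contain an underscore, start with a digit, or contain a space.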


In Python (or any language) what does an "upper" function do to Hindi, Amharic and other non-Latin character sets?

Subject says it all. Been looking for an answer, but cannot seem to find it.
I am writing a web app that will store data in a database and also have language files translated into a wide variety of character sets. At various moments, the text will be presented. I want to control presentation such as spurious blank spaces at the beginning and end of strings. Also I want to ensure some letters are upper or lower case.
My question is: what happens in upper/lower case functions when the character set only has one case?
EDIT Sub question: Are there any unexpected side effects to be aware of?
My guess is that you simply get back the one and only character.
EDIT - Added Description
The main reason for asking this question is that I am writing a webapp that will be distributed and run on machines in remote areas with little or no chance to fix "on-the-spot" bugs. It's not a complicated webapp, but will run with many different language char sets. I want to be certain of my footing before releasing the server.
First of all, the upper() and lower() methods in Python can be applied to Hindi, Amharic and other non-Latin character sets.
The upper() method converts a lowercase character only if an uppercase equivalent of that character exists. If none exists, the character is left as it is.
Or, put differently: if there is nothing to convert, the string stays the same.
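A minimal sketch of this behaviour in Python 3, using sample strings chosen here for illustration. It also shows one side effect worth knowing about: some cased characters change length when converted, for example the German "ß". The helper name normalize is hypothetical.

```python
def normalize(text: str) -> str:
    """Strip leading/trailing whitespace, then uppercase where a mapping exists."""
    return text.strip().upper()


if __name__ == "__main__":
    hindi = "  नमस्ते  "    # Devanagari has no upper/lower case distinction
    amharic = " ሰላም "      # Ethiopic script has no case either
    german = "straße"      # "ß" uppercases to "SS", so the length changes

    print(normalize(hindi) == hindi.strip())      # unchanged apart from stripping
    print(normalize(amharic) == amharic.strip())  # unchanged apart from stripping
    print(normalize(german), len(german), len(normalize(german)))
```

Scripts without case come back unchanged, so the main thing to guard against is code that assumes the string length is the same before and after upper() or lower().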

Multiple regex in one command

Disclaimer: I have no engineering background whatsoever - please don't hold it against me ;)
What I'm trying to do:
Scan a bunch of text strings and find the ones that
are more than one word
contain title case (at least one capitalized word after the first one)
but exclude specific proper nouns that don't get checked for title case
and disregard any parameters in curly brackets
Example: Today, a Man walked his dogs named {FIDO} and {Fifi} down the Street.
Expectation: Flag the string for title capitalization because of Man and Street, not because of Today, {FIDO} or {Fifi}
Example: Don't post that video on TikTok.
Expectation: No flag because TikTok is a proper noun
I have bits and pieces, none of them error-free from what https://www.regextester.com/ keeps telling me so I'm really hoping for help from this community.
What I've tried (piecemeal, but not all together):
(?=([A-Z][a-z]+\s+[A-Z][a-z]+))
^(?!(WordA|WordB)$)
^((?!{*}))
I think your problem is not really solvable solely with regex...
My recommendation would be splitting the input via [\s\W]+ (e.g. with Python's re.split; if you really need strings with more than one word, you can check the length of the result), filtering each resulting word on whether its first character is uppercase (e.g. with Python's str.isupper) and finally filtering against a dictionary.
[\s\W]+ matches all whitespace and non-word characters, yielding words...
The reasoning behind this different approach: compiling all "proper nouns" into a regex is practically impossible, and using "isupper" also works with non-Latin letters (e.g. when your strings are Unicode, [A-Z] won't be sufficient to detect uppercase). Filtering with a dictionary is a much more straightforward approach and much easier to maintain (I would recommend using a set or another data type suited for fast lookups).
Maybe if you can define your use case more clearly we can work out a pure regex solution...
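The split-and-filter approach described above could look roughly like this. It is a sketch under a few assumptions: the function name title_case_words and the PROPER_NOUNS set are hypothetical, curly-bracket parameters are removed with a separate regex before splitting, and the first word is never flagged.

```python
import re

# Hypothetical allow-list; in practice this would be maintained separately.
PROPER_NOUNS = {"TikTok"}


def title_case_words(text, proper_nouns=PROPER_NOUNS):
    """Return capitalized words after the first word, ignoring {params}
    and known proper nouns. An empty list means nothing to flag."""
    # Drop parameters in curly brackets such as {FIDO} before splitting.
    text = re.sub(r"\{[^{}]*\}", " ", text)
    # Split on runs of whitespace and non-word characters, dropping empties.
    words = [w for w in re.split(r"[\s\W]+", text) if w]
    if len(words) < 2:
        return []  # single-word strings are not checked
    return [w for w in words[1:] if w[0].isupper() and w not in proper_nouns]


if __name__ == "__main__":
    print(title_case_words(
        "Today, a Man walked his dogs named {FIDO} and {Fifi} down the Street."))
    print(title_case_words("Don't post that video on TikTok."))
```

With the two example strings from the question, the first is flagged because of Man and Street, and the second returns an empty list. Note that splitting on non-word characters breaks "Don't" into "Don" and "t", which does not matter here because the first word is skipped.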

What exactly does 'Type Body Length' mean in Swiftlint?

We just added SwiftLint to our project and we want to follow all the rules, but I'm not sure what's meant by the 'type_body_length' warning. I'm not a native English speaker so I find it a bit confusing.
There is a rule for file length as well, so how do they differ? What falls under this definition?
A type_body_length violation means that the class has too many lines in it. I don't think it counts extensions, comments or whitespace.
Type name should only contain alphanumeric characters, start with an uppercase character and span between 3 and 40 characters in length.
The rules documentation linked here and above also gives examples of what would and wouldn't be accepted (Triggering & Non Triggering). - Edit suggested by #GoodSp33d, thanks

How does one go from a Unicode character to its description?

I came across this cute little symbol today:
🔮
I couldn't figure out what it was, so I searched for reverse lookup services and character maps that might be able to reveal a name to no avail. I know, however, that Windows' character map program knows the names of symbols:
How does Windows accomplish this? How might I, but a lowly programmer, divine this same knowledge? What encoding system does Unicode use to tie a symbol to its description?
This information comes from the Unicode Character Database.
Specifically, the code points and their names (and other info like the category of a code point) are defined in UnicodeData.txt.
A lot of programming languages have this information in the standard library, eg. the unicodedata module of Python.
If you just want to know the glyph name, head on over to CodePoints (or Graphemica or probably any one of a dozen other sites) and do a search on it. I'm not sure which lookup services you used "to no avail" but those two have no issues in locating it.
Doing so with 🔮 will lead you to code point U+1F52E, which will give you the descriptive name "CRYSTAL BALL", along with all sorts of other useful information about it.
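As a small illustration of the standard-library route mentioned above, Python's unicodedata module can do the lookup directly. The helper name describe is made up for this sketch; unicodedata.name and unicodedata.category are the actual library calls.

```python
import unicodedata


def describe(char: str) -> str:
    """Return 'U+XXXX NAME (category)' for a single character."""
    code_point = f"U+{ord(char):04X}"
    # name() raises ValueError for unnamed characters unless a default is given.
    name = unicodedata.name(char, "<no name>")
    category = unicodedata.category(char)
    return f"{code_point} {name} ({category})"


if __name__ == "__main__":
    print(describe("🔮"))
```

The name data shipped with Python comes from the Unicode Character Database version bundled with that Python release, so very recently added characters may fall back to the default.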

lex default token definition syntax

I guess this is a simple question, but I have found no reference. I have a small lex file defining some tokens from a string and altering them (actually converting them to uppercase).
Basically it is a list of commands like this:
word {setToUppercase(yytext);}
Where setToUppercase is a procedure to change case and store it.
I need to have the complete input string with the altered words. Is there a way to define a default token / rest of tokens so I can associate them with unaltered storage in an output string?
You can do that in one shot with:
.|\n {save_str(yytext);}
I said it was an easy one.
. {save_str(yytext);}
\n {save_str(yytext);}
This way all characters and newline are treated.