ICU custom transliteration - unicode

I am looking to utilize the ICU library for transliteration, but I would like to provide a custom transliteration file for a set of specific custom transliterations, to be incorporated into the ICU core at compile time for use in binary form elsewhere. I am working with the source of ICU 4.2 for compatibility reasons.
As I understand it, from the ICU Data page of their website, one way of going about this is to create the file trnslocal.mk within ICUHOME/source/data/translit/, and within this file have the single line TRANSLIT_SOURCE_LOCAL=custom.txt.
For the custom.txt file itself, I used the following format, based on the master file root.txt:
custom{
    RuleBasedTransliteratorIDs {
        Kanji-Romaji {
            file {
                resource:process(transliterator){"custom/Kanji_Romaji.txt"}
                direction{"FORWARD"}
            }
        }
    }
    TransliteratorNamePattern {
        // Format for the display name of a Transliterator.
        // This is the language-neutral form of this resource.
        "{0,choice,0#|1#{1}|2#{1}-{2}}" // Display name
    }
    // Transliterator display names
    // This is the English form of this resource.
    "%Translit%Hex" { "%Translit%Hex" }
    "%Translit%UnicodeName" { "%Translit%UnicodeName" }
    "%Translit%UnicodeChar" { "%Translit%UnicodeChar" }
    TransliterateLATIN{
        "",
        ""
    }
}
I then store the file Kanji_Romaji.txt, as found here, within the directory custom. Because it uses > instead of the → I have seen in other files, I converted each entry accordingly, so they now look like:
丁 → Tei ;
七 → Shichi ;
When I compile the ICU project, I am presented with no errors.
When I attempt to use this custom transliterator in a test file, however (a test file that works fine with the built-in transliterators), I am met with the output error: 65569:U_INVALID_ID.
I am using the following code to construct the transliterator and output the error:
UErrorCode status = U_ZERO_ERROR;
Transliterator *K_R = Transliterator::createInstance("Kanji-Romaji", UTRANS_FORWARD, status);
if (U_FAILURE(status))
{
    std::cout << "error: " << status << ":" << u_errorName(status) << std::endl;
    return 0;
}
Additionally, a loop up to Transliterator::countAvailableIDs(), calling Transliterator::getAvailableID(i), does not list my custom transliterator. I remember reading, with regard to custom converters, that they must be registered within /source/data/mappings/convrtrs.txt. Is there a similar file for transliterators?
It seems that my custom transliterator is either not being built into the appropriate packages (though there are no compile errors), is improperly formatted, or is somehow not being registered for use. Incidentally, I am aware of the RuleBasedTransliterator route at runtime, but I would prefer to be able to compile the custom transliterations for use in any produced binary.
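For reference, the runtime route I am referring to looks roughly like this (a sketch only, with the rule string shortened to two entries written as \u escapes):
UErrorCode status = U_ZERO_ERROR;
UParseError parseError;
// \u4E01 is 丁 and \u4E03 is 七; the full rule set would come from Kanji_Romaji.txt.
UnicodeString rules("\\u4E01 > Tei ; \\u4E03 > Shichi ;");
Transliterator *K_R = Transliterator::createFromRules(
    "Kanji-Romaji", rules, UTRANS_FORWARD, parseError, status);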
Let me know if any additional clarification is necessary. I know there is at least one ICU programmer on here, who has been quite helpful in other posts I have written and seen elsewhere as well. I would appreciate any help I can find. Thank you in advance!

Transliterators are sourced from CLDR - you could add your transliterator to CLDR (the crosswire directory contains it in XML format in the cldr/ directory) and rebuild the ICU data. ICU doesn't have a simple mechanism for adding transliterators the way you are trying to. What I would do is forget about trnslocal.mk and custom.txt, as you don't need to add any files, and simply modify root.txt - you might also file a bug if you have a suggested improvement.
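For illustration only - the added entry in root.txt would presumably mirror the fragment from your custom.txt, placed alongside the existing IDs (a sketch, not verified against a 4.2 build):
root{
    RuleBasedTransliteratorIDs {
        // ... existing entries ...
        Kanji-Romaji {
            file {
                resource:process(transliterator){"custom/Kanji_Romaji.txt"}
                direction{"FORWARD"}
            }
        }
    }
    // ... rest of root.txt unchanged ...
}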


Saving to yml file using Spigot

I'm attempting to produce a Message.yml file using Spigot's YamlConfiguration.
This is my code:
public static void create() {
    if (messagesFile.exists()) return;
    try {
        messagesFile.createNewFile();
        messages.options().copyDefaults(true);
        messages.addDefault("MESSAGES.PREFIX", "&c[YourServer] ");
        messages.addDefault("MESSAGES.DESIGN", "§8§l- ");
        messages.addDefault("MESSAGES.NOPERMS", "§c§lDazu hast du keine Rechte!");
        messages.addDefault("MESSAGES.ADDMAP.USAGE", "§c§lBitte nutze /addmap [mapname]!");
        messages.save(messagesFile);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
However, the config.yml file I received after running it read as follows:
MESSAGES:
  PREFIX: '&c[YourServer] '
  DESIGN: "\xa78\xa7l- "
  NOPERMS: "\xa7c\xa7lDazu hast du keine Rechte!"
  ADDMAP:
    USAGE: "\xa7c\xa7lBitte nutze /addmap [mapname]!"
Is there any way to fix it?
It thinks the text is a string and not a standalone character.
https://www.spigotmc.org/threads/special-characters-in-config.298138/
Yeah, you use a special character to save the color, but it's a String. Don't put your color codes there; just save the plain String. When you want to send the text from the config again, just do, for example:
player.sendMessage(ChatColor.RED + config.get("MESSAGES.PREFIX"));
This is just an example.
Like #Minecraft said in his answer, the issue is that Java is recognizing the § as part of your string and translating it to a Unicode escape.
What I would do is have your custom config file stored in your plugin resources directory with all the default values you want it to have already defined.
Then, when you want to use the custom message, get it from the config file using the methods on the value returned by getConfig(). If you want to support color codes, use message = ChatColor.translateAlternateColorCodes('&', yourMessage); or something along those lines. That should be plenty to get you going.
Also, be sure to use a unified symbol for these color codes (the default is &); you can set it in the aforementioned translateAlternateColorCodes() method. You seem to be using both & and §; I would stick to &.
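A minimal sketch of that pattern, assuming a YamlConfiguration like the one in the question (the class and method names here are illustrative, not from your plugin):
import org.bukkit.ChatColor;
import org.bukkit.configuration.file.YamlConfiguration;
import org.bukkit.entity.Player;

public class Messages {
    // Store plain '&' codes in the file and translate them only when sending.
    public static void send(Player player, YamlConfiguration messages) {
        String raw = messages.getString("MESSAGES.NOPERMS", "&c&lDazu hast du keine Rechte!");
        player.sendMessage(ChatColor.translateAlternateColorCodes('&', raw));
    }
}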
Sources:
https://www.spigotmc.org/wiki/config-files/#using-custom-configurations
https://hub.spigotmc.org/javadocs/bukkit/org/bukkit/ChatColor.html#translateAlternateColorCodes(char,java.lang.String)

How to access the child nodes in a device tree (DTS) in Zephyr using DT_FOREACH_CHILD

I'm developing an application for an nRF52 SoC to access some external devices, a kind of detector in this case, so I have defined a custom format (and its corresponding yaml file) for my device access description node. It is something like:
n: detectors {
    compatible = "foo-detectors";
    // Definition of first channel
    det0: det_0 {
        irq-pins = <13 (GPIO_PULL_UP | GPIO_ACTIVE_LOW)>;
        label = "Bar detector channel 1";
    };
    // Definition of second channel
    det1: det_1 {
        irq-pins = <17 (GPIO_PULL_UP | GPIO_ACTIVE_LOW)>;
        label = "Bar detector channel 2";
    };
};
Suppose I have a structure definition for holding data about the pins to which those devices are physically connected, say:
struct foo_detector_desc {
    int irqpin;
    int irqpin_flags;
};
Using Zephyr macros, I can access the individual values for those nodes from my code. For instance, DT_PROP_BY_IDX(DT_NODELABEL(det0), irq_pins, 0) expands to 13, and DT_PROP_BY_IDX(DT_NODELABEL(det0), irq_pins, 1) expands to the value of the OR'ed flags GPIO_PULL_UP | GPIO_ACTIVE_LOW.
But I don't want to create ten pages of code full of conditionals of the form DT_NODE_EXISTS(DT_ALIAS(det#)); I want something more compact, flexible and maintainable.
I came across the Zephyr macro DT_FOREACH_CHILD, which is intended for exactly this scenario and with which I can create a list of the labels, as shown in the example embedded in the documentation:
#define LABEL_AND_COMMA(node_id) DT_LABEL(node_id),
const char *child_labels[] = {
    DT_FOREACH_CHILD(DT_NODELABEL(n), LABEL_AND_COMMA)
};
Trying to use that to fill a static array of structures, I tried the following code; it expands to two elements in the array, but the structure fields are not initialised with the desired values.
#define PIN_INFO_AND_COMMA(node_id) \
    { \
        .pin = DT_PROP_BY_IDX(node_id, irq_pins, 0), \
        .flags = DT_PROP_BY_IDX(node_id, irq_pins, 1), \
    },

const struct detector_data _det_data[] = {
    DT_FOREACH_CHILD(DT_NODELABEL(n), PIN_INFO_AND_COMMA)
};
I'm using Visual Studio Code with the nRF Connect plugin.
Is there a way to generate/see how those macros expand when compiled?
What is the correct way of initialising the structure fields (pins, flags)?
BR
It is possible to expand the macros one level at a time in VS Code: click on the macro, select it by double-clicking it, and a light bulb will eventually show up. Its Insert Macro option replaces the macro with its expansion.
Regarding using the DTS in code: the code I wrote in my previous post is OK, but I think I found a bug in the debugger. If I don't include any code using the value of _det_data, the debugger doesn't say the constant was optimised out and displays wrong values when inspecting its contents. That made me waste a lot of time.
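For completeness, here is a consolidated sketch of the pattern (the struct and field names are chosen to match the initializer above, and the header path may be <devicetree.h> on older Zephyr trees):
#include <zephyr/devicetree.h>

struct detector_data {
    int pin;
    int flags;
};

/* One initializer per child node of the node labelled "n". */
#define PIN_INFO_AND_COMMA(node_id)                    \
    {                                                  \
        .pin = DT_PROP_BY_IDX(node_id, irq_pins, 0),   \
        .flags = DT_PROP_BY_IDX(node_id, irq_pins, 1), \
    },

static const struct detector_data _det_data[] = {
    DT_FOREACH_CHILD(DT_NODELABEL(n), PIN_INFO_AND_COMMA)
};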
For anyone interested in using DTS files for nRF devices in Zephyr, I posted all the details in a thread on Nordic's developer forum (here).
BR

FileManager: Attribute for when a file was last opened

I need a way of checking when a file was last opened. I tried by creating a custom FileAttributeKey and setting that to the current Date, but when I go to open the file again the attribute does not exist:
private let key = FileAttributeKey(rawValue: "lastOpenedAt")
do {
    try FileManager.default.setAttributes(
        [key: Date()],
        ofItemAtPath: videoNameDirectoryPath
    )
} catch {
    Log.error(error.localizedDescription)
}
So now I am resorting to using the modification date key to record when I last opened the file. It is not ideal, so I am wondering if there is a better way to do this.
setAttributes doesn't support custom attributes; you can only use the documented ones.
To set your own attributes, you may use xattr as described in this question:
Write extend file attributes swift example
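For example, a minimal sketch of the xattr route (the attribute name and the raw-timestamp encoding are arbitrary choices for this sketch):
import Foundation

// Stores the date as raw bytes in an extended attribute of the file at `path`.
func setLastOpened(_ date: Date, atPath path: String) throws {
    let timestamp = date.timeIntervalSince1970
    let result = withUnsafeBytes(of: timestamp) { buffer in
        setxattr(path, "lastOpenedAt", buffer.baseAddress, buffer.count, 0, 0)
    }
    if result == -1 {
        throw NSError(domain: NSPOSIXErrorDomain, code: Int(errno))
    }
}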
If you're lucky, you may be able to use kMDItemLastUsedDate from Spotlight (aka MDItem), as described in the File Metadata Attributes documentation archive.
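A sketch of the Spotlight route (macOS only; returns nil if Spotlight hasn't indexed the file):
import CoreServices
import Foundation

// Reads kMDItemLastUsedDate for the file at `path` via the Spotlight metadata API.
func lastUsedDate(forPath path: String) -> Date? {
    guard let item = MDItemCreate(kCFAllocatorDefault, path as CFString) else {
        return nil
    }
    return MDItemCopyAttribute(item, kMDItemLastUsedDate) as? Date
}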

SAP custom translations for standard SAPUI5 application

I am currently implementing an extension to a standard application from SAP Marketing.
The extension contains new texts that need to be translated into different languages. In my previous extensions I could use the translation key of the standard application for my extension as well. The first line in the i18n.properties file in this case was always structured as follows:
# SAPUI5 TRANSLATION-KEY XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
However, in the standard app that I'm currently editing, it looks like this:
# This is the resource bundle for Campaigns
# __ldi.translation.uuid = 8e965d5e-c905-4b60-ac2a-205abb14046
In transaction se63, the translation key (is it even a translation key?) is not found - either with hyphens or without. Furthermore, in the standard app, the translations are kept in a single file for each language (e.g., i18n_de.properties). That's why I'm not sure if there's even a translation key for this standard app.
I don't want to create a new translation key for my extension and use that one: if I did, all the translations of the standard app would have to be maintained for the new translation key as well.
Is anyone familiar with this type of translation? How can I maintain the translations for my extension?
Best Regards,
Christian
I found a solution to my problem:
I generated a new translation key by running the program /UI5/TEXT_FILE_GEN_TRANS_KEY in transaction se38.
I created a new folder i18n and added an i18nCustom.properties file to it. Then I added the translation key and the default translations to the file, just like for a regular i18n.properties file.
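For illustration, the i18nCustom.properties file then starts with the generated key, followed by the default texts (the key and property name below are placeholders):
# SAPUI5 TRANSLATION-KEY XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
myCustomTitle=My custom title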
In the next step, I added the following code to the sap.ui5 property of the extension's manifest.json file to declare the custom translation file:
"models": {
"i18nCustom": {
"type": "sap.ui.model.resource.ResourceModel",
"settings": {
"bundleName": "<Your Component>.i18n.i18nCustom"
}
}
}
Please note that you now have to use something like {i18nCustom>property} in your view instead of using the i18n model.
To enhance the standard translation file with the custom one, I added the following code to the BaseController. You could also add the code only in the controller whose view is using the custom translations.
onBeforeRendering: function() {
    var i18n = this.getModel("i18n"); // Get the standard i18n file
    var sBundleURL = this.getModel("i18nCustom").getResourceBundle().oUrlInfo.url;
    i18n.enhance({bundleUrl: sBundleURL}); // Merge the custom i18n file with the standard one
}
Hope this helps if someone has the same problem.

Extracting structure member names from C File

I am writing a WinDbg extension to print the contents of a structure using ExtRemoteData. I find that I need to keep changing my code whenever the structure changes.
Instead, I think it would be more flexible if I could directly read the C file and parse my structure to get the structure member names.
Is there a tool or function with which I can read a C file and enumerate the various elements of my structure?
The C compiler internally is doing that, but I am not sure how I can extract that info.
Something like:
Tool.exe
The name/value pairs would contain info such as:
{(membername1, type1), (membername2, type2), ..., (membernameN, typeN)}
More research is needed here, but as a shortcut you may want to consider a scripting language like Python to do your parsing. The following refers to a Python library which can do what you intend to do:
parsing C code using python
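As an illustration of that route - naming pycparser here is my assumption about what the linked question covers, and it needs a C preprocessor on the PATH - a sketch could look like:
from pycparser import parse_file, c_ast

# Parse the C file (preprocessing it with cpp first) and print the member
# names of every struct defined at file scope.
ast = parse_file('mystruct.c', use_cpp=True)
for ext in ast.ext:
    node = getattr(ext, 'type', None)
    if isinstance(node, c_ast.Struct) and node.decls:
        for member in node.decls:
            print(node.name, member.name)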
As for integrating Python with WinDbg as an extension, that's already available; check out http://pykd.codeplex.com/
Type information is typically included in PDB (program database) symbol files. There are public symbols and private symbols. You might need private symbols to get all information you want.
You can generate private PDBs not only for debug builds but also for release builds. It should only be a setting in your preferred IDE.
Once you have private symbols, you can read them with the DbgHelp API. Depending on what information you already have, SymFromName(), for example, sounds useful.
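To illustrate, a minimal standalone sketch of such a lookup by name (the symbol name is a placeholder; inside a WinDbg extension the engine has already initialized symbol handling for you):
#include <stdio.h>
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")

int main(void)
{
    HANDLE hProcess = GetCurrentProcess();
    SymInitialize(hProcess, NULL, TRUE);

    char buffer[sizeof(SYMBOL_INFO) + MAX_SYM_NAME] = {0};
    PSYMBOL_INFO pSymbol = (PSYMBOL_INFO)buffer;
    pSymbol->SizeOfStruct = sizeof(SYMBOL_INFO);
    pSymbol->MaxNameLen = MAX_SYM_NAME;

    if (SymFromName(hProcess, "MyModule!MyFunction", pSymbol)) {
        /* For a structure's members, SymGetTypeFromName() and SymGetTypeInfo()
         * (TI_GET_CHILDRENCOUNT / TI_FINDCHILDREN) are the usual follow-up calls. */
        printf("TypeIndex: %lu\n", pSymbol->TypeIndex);
    }

    SymCleanup(hProcess);
    return 0;
}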
Although parsing a C file might also be an option, be aware that the source code file might already have changed, but the compiled DLL and PDBs haven't.
I am not sure if it fits WinDbg, but someone provided a way to use pykd, so I am mentioning a way to extract the metadata and load it in Python.
I used the SWIG tool to extract the CSV metadata out of a C/C++ source file.
Suppose the C/C++ source contains a class like the following:
class Bike {
public:
    int color;     // color of the bike
    int gearCount; // number of configurable gear
    Bike() {
        // bla bla
    }
    ~Bike() {
        // bla bla
    }
    void operate() {
        // bla bla
    }
};
Then it will generate the following CSV metadata,
Bike|color|int|variable|public|
Bike|gearCount|int|variable|public|
Bike|operate|void|function|public|f().
Now it is easy to parse the CSV file with cut, awk, or Python if needed.
import csv

with open('bike.csv', newline='') as csvfile:
    bike_metadata = csv.reader(csvfile, delimiter='|')
    # do your thing
pykd has a built-in clang parser, so it can get symbol information from C code:
src = '''
class Bike {
public:
    int color;     // color of the bike
    int gearCount; // number of configurable gear
    Bike() {
        // bla bla
    }
    ~Bike() {
        // bla bla
    }
    void operate() {
        // bla bla
    }
};
'''
# the next two print statements produce equal output
print( getTypeFromSource(src, 'Bike') )
print( typeInfo('compiled_module!Bike') )