foundation 6 abide validation textarea unicode - unicode

Is it possible to validate unicode text input with Foundation6 Abide? I am already using the following custom pattern
Foundation.Abide.defaults.patterns['alpha_numeric_spaces'] = /^[\w\-\s]+$/;

After seaching I modyfied the patern to:
Foundation.Abide.defaults.patterns['alpha_numeric_spaces'] = ^[\w-\s-\u0000-\u007F\u0370-\u03FF\u0400-\u04FF\u0500-\u052F\u0530-\u058F]+$;
and appears that is working at least for the languages that I am interestedin. The source for the unicode regex has been http://kourge.net/projects/regexp-unicode-block

Related

Converting emoji from hex code to unicode

I want to use emojis in my iOS and Android app. I checked the list of emojis here and it lists out the hex code for the emojis. When I try to use the hex code such as U+1F600 directly, I don't see the emoji within the app. I found one other way of representing emoji which looks like \uD83D\uDE00. When using this notation, the emoji is seen within the app without any extra code. I think this is a Unicode string for the emoji. I think this is more of a general question that specific to emojis. How can I convert an emoji hex code to the Unicode string as shown above. I didn't find any list where the Unicode for the emojis is listed.
It seems that your question is really one of "how do I display a character, knowing its code point?"
This question turns out to be rather language-dependent! Modern languages have little trouble with this. In Swift, we do this:
$ swift
Welcome to Apple Swift version 3.0.2 (swiftlang-800.0.63 clang-800.0.42.1). Type :help for assistance.
1> "\u{1f600}"
$R0: String = "😀"
In JavaScript, it is the same:
$ node
> "\u{1f600}"
'😀'
In Java, you have to do a little more work. If you want to use the code point directly you can say:
new StringBuilder().appendCodePoint(0x1f600).toString();
The sequence "\uD83D\uDE00" also works in all three languages. This is because those "characters" are actually what Unicode calls surrogates and when they are combined together a certain way they stand for a single character. The details of how this all works can be found on the web in many places (look for UTF-16 encoding). The algorithm is there. In a nutshell you take the code point, subtract 10000 hex, and spread out the 20 bits of that difference like this: 110110xxxxxxxxxx110111xxxxxxxxxx.
But rather than worrying about this translation, you should use the code point directly if your language supports it well. You might also be able to copy-paste the emoji character into a good text editor (make sure the encoding is set to UTF-8). If you need to use the surrogates, your best best is to look up a Unicode chart that shows you something called the "UTF-16 encoding."
In Delphi XE #$1F600 is equivalent to #55357#56832 or D83D DE04 smile.
Within a program, I use it in the following way:
const smilepage : array [1..3] of WideString =(#$1F600,#$1F60A,#$2764);
JavaScript - two way
let hex = "😀".codePointAt(0).toString(16)
let emo = String.fromCodePoint("0x"+hex);
console.log(hex, emo);

Convert unicode to emoji

I'm looking for a way to represent an emoji 📄 in my code as unicode which is then displayed as an actual 'image' in output text. I'd like to use http://apps.timwhitlock.info/unicode/inspect/hex/1F4C4 to display the 'page facing up' in application, but I don't like the idea of having pictures in my code (though it is working fine) ;)
You can use arbitrary Unicode characters directly in your source code
let string = "📄"
or use the Swift Unicode escape sequence:
let string = "\u{1F4C4}"
More information in the section about "String Literals" in the Swift reference.

Howto make JRequest::getVar filter correctly accented characters?

I want to filter some variables with accented character on a component for joomla1.5 for example:
$name = JRequest::getVar('name', '', 'post','WORD');
but the getvar function filters áéíóú. I need this get well for a form in spanish language.
I'm new to joomla development, but for as far as I can see, it doesn't let me set any other parameter to config to get this.
Is there a way to do this with the advantage of filtering with JRequest::getVar or should I create a function myself which does so?
Do you mean JRequest::getVar() removes symbols like 'áéíóú'? It is very weird because I've worked on Joomla with danish and hebrew symbols. And they were passed through GET, POST, SESSION successfully. Because Joomla works with UTF8 and it understands such symbols. The problem could be only in your file encoding. They should be in UTF8. Is it so? If not try to change it. This should help.

UTF8 charset, diacritical elements, conversion problems - and Zend Framework form escaping

I am writing a webapp in ZF and am having serious issues with UTF8. It's using multi lingual content through Zend Form and it seems that ZF heavily escapes all of these characters and basically just won't show a field if there's diacritical elements 'é' and if I use the HTML entity equivalent e.g. é it gets escaped so that the user will see 'é'.
Zend Form allows for having non escaped data, but trying to use this is confusing, and it seems it'd need to be used all over the place.
So, I have been told that if the page and the text is in UTF8, no conversion to htmlentities is required. Is this true?
And if the last question is true, then how do I convert the source text to UTF8? I am comfortable setting up apache so that it sends a default UTF8 charset heading, and also adding the charset meta tag to the html, but doing this I am still getting messed up encoding. I have also tried opening the translation csv file in TextWrangler on OSX as UTF8, but it has done nothing.
Thanks!
L
'é' and if I use the HTML entity equivalent e.g. é it gets escaped so that the user will see 'é'.
This I don't understand. Can you show an example of how it is displayed, as opposed to how it should be displayed?
So, I have been told that if the page and the text is in UTF8, no conversion to htmlentities is required. Is this true?
Yup. In more detail: If the data you're displaying and the encoding of the HTML page are both UTF-8, the multi-byte special characters will be displayed correctly.
And if the last question is true, then how do I convert the source text to UTF8?
Advanced editors and IDEs enable you to define what encoding the source file is saved in. You would need to open the file in its current encoding (with special characters being displayed correctly) and save it as UTF-8.
If the content is messed up when you have the right content-type header and/or meta tag specified, then the content is not UTF-8 yet. If you don't get it sorted, post an example of what it looks like here.

RichTextBox use to retrieve Text property in C++

I am using a hidden RichTextBox to retrieve Text property from a RichEditCtrl.
rtb->Text; returns the text portion of either English of national languages – just great!
But I need this text in \u12232? \u32232? instead of national characters and symbols. to work with my db and RichEditCtrl. Any idea how to get from “пассажирским поездом Невский” to “\u12415?\u12395?\u23554?\u20219?\u30456?\u35527?\u21729? (where each national character is represented as “\u23232?”
If you have, that would be great.
I am using visual studio 2008 C++ combination of MFC and managed code.
Cheers and have a wonderful weekend
If you need a System::String as an output as well, then something like this would do it:
String^ s = rtb->Text;
StringBuilder^ sb = gcnew StringBuilder(s->Length);
for (int i = 0; i < s->Length; ++i) {
sb->AppendFormat("\u{0:D5}?", (int)s[i]);
}
String^ result = s->ToString();
By the way, are you sure the format is as described? \u is a traditional Escape sequence for a hexadecimal Unicode codepoint, exactly 4 hex digits long, e.g. \u0F3A. It's also not normally followed by ?. If you actually want that, format specifier {0:X4} should do the trick.
You don't need to use escaping to put formatted Unicode in a RichText control. You can use UTF-8. See my answer here: Unicode RTF text in RichEdit.
I'm not sure what your restrictions are on your database, but maybe you can use UTF-8 there too.