Issues with special characters in QBO API v3 .NET SDK - intuit-partner-platform

I'm using the .NET SDK to import customers and transactions from another system that accepts UTF-8 encoding in their data, and am having a lot of trouble with special characters. Is there a comprehensive list of (a) what characters need to be escaped (like apostrophe), and (b) what characters are simply not allowed in QBO (like colon)? All I can find in the online doc is "use backslash to escape special characters like apostrophe". OK, what about ampersand, em dash, en dash, grave accent, acute accent... you get the idea.
This affects both queries and inserts, which causes all kinds of problems. For example, if we query a customer by name and the query fails (maybe due to an invalid character), we try to insert the customer in QBO, which of course also fails, either because the customer already exists or because of invalid characters. True, we can usually figure out whether the query failed due to a bad character or because the record does not exist, but we need a design-time solution. Any suggestions?

If you use the Query endpoint, URL-encode the query parameter.
For example, for the following query:
select * from Customer where DisplayName='Aülleünte'
the request URL would be:
https://quickbooks.api.intuit.com/v3/company/<realmId>/query?query=select+*+from+Customer+where+DisplayName%3D%27A%C3%BClle%C3%BCnte%27
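As a rough illustration of the encoding step above, here is a minimal Python sketch using only the standard library. The helper name `build_query_url` and the `REALM_ID` placeholder are assumptions made for the example; they are not part of the SDK.

```python
from urllib.parse import quote_plus

# Hypothetical placeholders; substitute your own company (realm) ID.
BASE_URL = "https://quickbooks.api.intuit.com/v3/company"
REALM_ID = "<realmId>"


def build_query_url(query: str) -> str:
    """Return the query endpoint URL with the statement URL-encoded as UTF-8."""
    # quote_plus turns spaces into '+' and percent-encodes the remaining
    # reserved and non-ASCII characters; '*' is kept literal to match the
    # example URL above.
    encoded = quote_plus(query, safe="*", encoding="utf-8")
    return f"{BASE_URL}/{REALM_ID}/query?query={encoded}"


url = build_query_url("select * from Customer where DisplayName='Aülleünte'")
print(url)
```

The point of the sketch is that the non-ASCII characters are encoded as their UTF-8 bytes (ü becomes %C3%BC), not as Latin-1 (%FC).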
Note: some QBO text fields (for example, 'Description/Note' in the Customer window) allow control characters to be entered, and these are returned as part of the query response. Because some of those characters are not supported in XML, object deserialization fails or shows a warning.
You should either remove those characters in the UI or use a library or regex in client-side code to remove them programmatically. Ideally this should be handled on the server side.
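A client-side cleanup of this kind could look roughly like the sketch below. It assumes the payload is XML 1.0, whose specification allows tab, line feed, carriage return, and the ranges U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF. The function name is hypothetical.

```python
import re

# Matches every character that is NOT allowed by the XML 1.0 Char production.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Remove characters that cannot appear in an XML 1.0 document."""
    return _INVALID_XML_CHARS.sub("", text)


cleaned = strip_invalid_xml_chars("Note\x00 with\x0b bad\tchars")
print(repr(cleaned))
```

This would have to run on the raw response text before it is handed to the deserializer, or on field values before they are sent.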
The QBO Global UI definitely supports UTF-8 encoding, but the QBO US UI seems to behave differently when dealing with special characters.
For example, in the QBO US UI, if you enter '你好嗎', it is converted to '}Î' after saving.
Edit
Here is a list of accepted characters:
•Alpha-numeric (A-Z, a-z, 0-9)
•Comma (,)
•Dot or period (.)
•Question mark (?)
•At symbol (@)
•Ampersand (&)
•Exclamation point (!)
•Number/pound sign (#)
•Single quote (')
•Tilde (~)
•Asterisk (*)
•Space ( )
•Underscore (_)
•Minus sign/hyphen (-)
•Semi-colon (;)
•Plus sign (+)
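For a design-time check, the list above can be turned into a simple validator. This is only a sketch: the thread does not say which fields or QBO editions the list applies to, and the function name is made up for the example.

```python
import re

# Anything outside the accepted list: alphanumerics, comma, dot, question mark,
# at symbol, ampersand, exclamation point, number sign, single quote, tilde,
# asterisk, space, underscore, hyphen, semicolon, plus sign.
_UNSUPPORTED = re.compile(r"[^A-Za-z0-9,.?@&!#'~* _;+\-]")


def find_unsupported(value: str) -> list:
    """Return every character in value that is not on the accepted list."""
    return _UNSUPPORTED.findall(value)


print(find_unsupported("Bob & Jim's #1"))  # nothing outside the list
print(find_unsupported("Acme: West"))      # the colon is reported
```

Returning the offending characters, not just a boolean, makes it easier to decide per character whether to strip, replace, or reject before calling the API.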

Related

Safe delimiter for dsv of email addresses

I need to use/store a delimiter separated value string (not csv) of email addresses. I need to choose a delimiter that is safe.
E.g. bar@foo.com,baz@foo.com: the comma in this example is unsafe, as it's valid within an email address.
It seems that almost anything is allowed in an email address, especially now with internationalized email addresses.
What is a safe delimiter to use without jumping through hoops because of corner cases? I can't find a character in the RFC which is expressly invalid (but there are lots of email-related RFCs, so I'm not sure which to consult).
Where/how will you be storing the string and what will the delimiter be used for?
You could use a non-visible ascii character such as the CR (Ascii 13) or Tab (Ascii 9).
I originally used \ because that is an escape character, however it is allowed if escaped. @MatWalker's answer recommends stuff like CR or LF etc, but those are allowed too, if they are escaped.
Escaping and replacing and unescaping got a bit complicated. So right now I'm using control character STX (i.e. "Start of Text", decimal 2).
Although the RFC doesn't mention (from what I've seen) whether control characters are valid or invalid, there doesn't seem to be anything that makes it a bad choice. It does say that control chars are "discouraged", but not prohibited for header fields.
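The STX approach could be sketched like this in Python. The function names are illustrative, and the check that no address already contains STX is an extra safety net, since the RFC discourages control characters without strictly prohibiting them.

```python
STX = "\x02"  # "Start of Text" control character, decimal 2


def pack_addresses(addresses):
    """Join addresses into one STX-delimited string."""
    for address in addresses:
        if STX in address:
            raise ValueError(f"address contains the delimiter: {address!r}")
    return STX.join(addresses)


def unpack_addresses(packed):
    """Split an STX-delimited string back into a list of addresses."""
    return packed.split(STX) if packed else []


packed = pack_addresses(["bar@foo.com", '"a,b"@foo.com'])
print(unpack_addresses(packed))
```

No escaping or unescaping is needed, which is the main attraction compared with using a backslash, CR, or LF as the delimiter.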

Looking for a character that is allowed in Filenames but not allowed in email addresses... Any clue?

I am trying to create multiple HTML files that are associated with an email address. But since the "@" cannot be used in filenames, and in order to avoid confusion, I am trying to replace it with a character that won't normally exist in an email address.
Does anything come to mind?
Thanks!
Commas and semicolons are not allowed in email addresses but are allowed in filenames on most file systems.
I believe '~' is used for this purpose.
According to the link here, almost all ASCII characters are allowed in email addresses so long as the special characters aren't at the beginning or the end.
What characters are allowed in an email address?
Any of , (comma) ; (semi-colon) <> (angle brackets) [] (square brackets) or " (double quote) should work for most cases.
Since these characters are allowed in quoted strings, you could replace the "@" with a sequence that would be invalid such as three double quotes in a row.
According to the RFC
within a quoted string, any ASCII graphic or space is permitted without backslash-quoting except double-quote and the backslash itself.
You could have an email abc."~~~".def@rst.xyz. But you could not have abc.""".def@rst.xyz; the double quotes inside the quoted string would have to be backslash-escaped (\"). So you could safely use """ as a substitute for @ in the filename.
However, the RFC also says
While the above definition for Local-part is relatively permissive,
for maximum interoperability, a host that expects to receive mail
SHOULD avoid defining mailboxes where the Local-part requires (or
uses) the Quoted-string form or where the Local-part is case-
sensitive.
With SHOULD meaning "...that
there may exist valid reasons in particular circumstances when the
particular behavior is acceptable or even useful, but the full
implications should be understood and the case carefully weighed
before implementing..." RFC2119
So, although """ will work, are the chances you will see an email with quotes worth the trouble of designing for it? If not, then use one of the single characters.
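If one of the single characters is chosen, the mapping between addresses and filenames might look like the sketch below. The comma as the default substitute, the ".html" extension, and the function names are assumptions for the example; addresses that already contain the substitute (possible only in the quoted form) are rejected so the mapping stays reversible.

```python
def email_to_filename(email, substitute=",", extension=".html"):
    """Build a filename from an address by swapping the at sign for substitute."""
    if email.count("@") != 1 or substitute in email:
        raise ValueError(f"cannot map {email!r} to a filename unambiguously")
    return email.replace("@", substitute) + extension


def filename_to_email(filename, substitute=",", extension=".html"):
    """Recover the address from a filename produced by email_to_filename."""
    if extension and filename.endswith(extension):
        filename = filename[: -len(extension)]
    return filename.replace(substitute, "@")


name = email_to_filename("abc.def@rst.xyz")
print(name, filename_to_email(name))
```

Whether the chosen substitute is actually legal in a filename depends on the target file system, so that part needs checking separately.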

Differentiate properly escaped HTML metacharacters from improperly escaped ones

I'm working on a replacement for a desktop Java app, a single page app written in Scala and Lift.
I have this situation where some of the data in the database has properly used HTML metacharacters, such as Unicode escape sequences for accented characters in non-English names. At the same time, I have other data with improper HTML metacharacters, such as ampersands in the names of organizations.
Good (don't escape): Universita\u0027
Bad (needs escape): Bob & Jim
How do I determine whether or not the data needs to be fixed before I send it to the client?
There are two ways to approach this. One is a function that takes a string and returns the index of any improperly escaped HTML metacharacters (which I can fix myself). Alternately it could be a function that takes a string and returns a string with the improperly escaped metacharacters fixed, and leaves the proper ones alone.
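Both variants could be sketched roughly as follows. This is written in Python for brevity, not Scala/Lift, and it covers ampersands only. It assumes that an ampersand counts as "properly escaped" when it starts a named, decimal, or hexadecimal character reference. It does not check that a named reference actually exists, and it does not handle `<` or `>`.

```python
import re

# An ampersand that does NOT begin something shaped like a character reference
# (&name; or &#123; or &#x1F;).
_BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def find_bare_ampersands(text):
    """Return the index of every ampersand that still needs escaping."""
    return [match.start() for match in _BARE_AMPERSAND.finditer(text)]


def escape_bare_ampersands(text):
    """Escape only the bare ampersands, leaving existing references alone."""
    return _BARE_AMPERSAND.sub("&amp;", text)


print(find_bare_ampersands("Bob & Jim"))
print(escape_bare_ampersands("Bob & Jim"))
```

Strings such as Universita\u0027 pass through unchanged because they contain no ampersand at all.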

SharePoint 2013 REST API odata $filter ignores unicode characters such as German umlauts äöü

I'm trying to use SharePoint 2013 REST API (odata) with unicode characters such as umlauts (ä ö ü).
...?$select=Title%2CID&$filter=substringof%28%27hello%20w%F6rld%27%2C%20Title%29&$orderby=ID%20desc&$top=14
^^ should search for "hello w*ö*rld" using substringof('...', Field)
I'm escaping the URL correctly (and also single quotes with double quotes) and filtering works for all kinds of characters (even backslash and quotes). However, entering ä/ö/ü or any other Unicode character has no effect; it is as if those characters were simply filtered out on the server side (I can insert a lot of ääääääs without changing the results).
Any idea how to escape those? I tried the obvious (%ab { \u1234 \xab x1234) without success. Can't find anything on the web or in the specs either.
Thanks for suggestions.
UPDATE - SOLVED
I found that you can use the %uhhhh variant of escaping them:
?$filter=substringof('hello w%u00f6rld')
Of course one must only escape that once (i.e. not the whole thing again), but it seems that's the way to go.
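A helper that produces the %uhhhh form could look like the sketch below. It escapes only non-ASCII characters and leaves everything else untouched, so the usual URL-encoding of the ASCII part still has to happen separately. Emitting UTF-16 code units for characters outside the Basic Multilingual Plane mirrors what JavaScript's legacy escape() does; whether SharePoint accepts that is an assumption.

```python
def percent_u_escape(text):
    """Replace each non-ASCII character with its %uhhhh escape."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        # Encode to UTF-16 so characters beyond U+FFFF become two code units.
        units = ch.encode("utf-16-be")
        for i in range(0, len(units), 2):
            code_unit = int.from_bytes(units[i:i + 2], "big")
            out.append("%u{:04x}".format(code_unit))
    return "".join(out)


escaped = percent_u_escape("hello wörld")
print(escaped)
```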
(can't answer my own question now lol)

What constitutes a Regular Identifier

On page 34 of 70-461 Querying Microsoft SQL Server 2012 it says that an identifier is regular if:
The rules say that the first character must be a letter in the range
A through Z (lower or uppercase), underscore (_), at sign (@), or
number sign (#). Subsequent characters can include letters, decimal
numbers, at sign, dollar sign ($), number sign, or underscore.
However on pg 271 it says:
Even though you can embed special characters such as @, #, and $ in
an identifier for a schema, table, or column name, that action makes
the identifier delimited, no longer regular.
So to clarify: does having a special character like '$' in an identifier leave it regular or not?
Having $ after the first character is part of the specification that defines a regular identifier and will not require the use of a delimiter.
I found the definition in SQL Server 2008 R2 Identifiers to be clearer than the one from page 34. It is essentially the same as the one on page 271, but with more detail.
Either you have misquoted pg 271 of the book, or your version is different than mine and has an error:
If you embed special characters other than @, #, and $ in an
identifier for a schema, table, or column name, that action makes the
identifier delimited, no longer regular.
Here is a regular expression that will match a string that complies with the definition:
^[\p{Letter}_@#][\p{Letter}\p{Number}_@#$]*$
Regex for flavors without unicode support:
^[a-zA-Z_@#][a-zA-Z\d_@#$]*$
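To try the ASCII-only pattern quickly, here is a small Python sketch (Python's built-in re module has no \p{...} classes, so the Unicode-aware version is not shown). The function name is illustrative, and the check covers only the character rules quoted above (at sign, number sign, dollar sign, underscore, letters, digits). It does not test for reserved keywords.

```python
import re

# First character: letter, underscore, at sign, or number sign.
# Subsequent characters: letters, digits, underscore, at sign, number sign,
# or dollar sign.
_REGULAR_IDENTIFIER = re.compile(r"[a-zA-Z_@#][a-zA-Z0-9_@#$]*")


def is_regular_identifier(name):
    """Return True if name follows the character rules for a regular identifier."""
    return _REGULAR_IDENTIFIER.fullmatch(name) is not None


for candidate in ["Order$Total", "$Total", "@var", "#temp", "my table"]:
    print(candidate, is_regular_identifier(candidate))
```

Using fullmatch avoids the edge case where `$` in the pattern would also match just before a trailing newline.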