Is there any way to prevent or detect char(0) from texts?

On some occasions, especially when copy-pasting, we end up with text fields that have a character 0 (NUL) at the end of the string.
It doesn't show in any way when you display the data, but you do detect it when you export it.
We've tried to (at least) detect it by using the "Position" function.
However, Position ( text_field ; Char ( 0 ) ; 1 ; 1 ) won't find this character: it returns 0 even when the character is there.
I guess this is some kind of bug in FileMaker, but I'd like to know if anyone has found a way to work around it...
More info and a database sample at: https://community.claris.com/en/s/question/0D53w00005wrUMMCA2/character-0-0x0-in-text-fields

Unfortunately, the result of Char(0) is an empty string, not the expected control character.
You can generate the null character in a number of ways:
HexDecode ( "00" )
Base64Decode ( "AA==" )
ExecuteSQL ( "SELECT DISTINCT CHR(0) FROM SomeTable" ; "" ; "" )
or paste it into a global field and get it from there.
Once you have the character, it's easy to detect it or just substitute it out.
You may want to bypass the entire issue by allowing only printable characters - see, for example: https://www.briandunning.com/cf/1291

I run into this problem quite frequently when users copy-paste text from office programs into FileMaker fields on Windows (my guess is that FileMaker for some reason can't handle Microsoft Office line endings properly).
The most efficient solution I found is to use an auto-enter calculation or a script with the Filter() function to remove any unwanted characters.
Alternatively, if you have access to plug-ins, you can try the MBS ("Text.RemoveControlCharacters") function from the Monkeybread FileMaker plug-in, which is supposed to remove all characters with code 32 or lower.
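
Outside FileMaker, the same two ideas (build the NUL character from its hex or Base64 form, then detect it or filter it out) can be sketched in Python, for example to check exported data. This is only an illustration: the names find_nul and strip_control are hypothetical and are not part of FileMaker or the MBS plug-in.

```python
import base64

# Both encodings mentioned above decode to a single NUL byte.
NUL = bytes.fromhex("00").decode("ascii")
assert NUL == base64.b64decode("AA==").decode("ascii") == "\x00"


def find_nul(text: str) -> int:
    """Return the 0-based index of the first NUL character, or -1 if absent."""
    return text.find(NUL)


def strip_control(text: str, keep: str = "\t\n\r") -> str:
    """Drop characters with code below 32 (and DEL), except those listed in keep."""
    return "".join(
        ch for ch in text
        if ch in keep or (ord(ch) >= 32 and ord(ch) != 127)
    )


if __name__ == "__main__":
    pasted = "Invoice 42\x00"  # hypothetical pasted value with a trailing NUL
    print(find_nul(pasted))
    print(repr(strip_control(pasted)))
```

Keeping tab, line feed and carriage return by default is a design choice for multi-line text; pass keep="" to remove every control character.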

Related

crystal reports attempting to link two tables by matching string with no luck

As stated in the title, I have two tables I'm attempting to link. Both Strings appear to be a match, however Crystal Reports is not picking it up. The only thing I can think is that that length of the field is different, even though the strings are the same. could that cause a discrepancy? If so how can I correct for it? Thank you
A difference in length will prevent a match. The Trim(string) function only removes spaces found at the beginning or end of your string, so the two strings could still be of different lengths after using it. You will need another function to capture a substring of the original string: use the Left(string, length) function to ensure both strings are the same length.
If they still do not match then you may have non-printable characters in one or both of your strings. Carriage Return and Line Feed tend to be the most commonly found non-printable characters. A Carriage Return is represented as Chr(13), while a Line Feed is represented as Chr(10). These are Built In Constants similar to those found in VBA and Visual Basic.
You can use a find and replace to remove them with the following formula. It's not a bad idea to include the Trim and Left functions as well, to get the best match possible.
Replace(Replace(Left(Trim({YourStringField}), 10),Chr(10), ""),Chr(13), "")
There are a few additional Built In Constants you may need to check for if this doesn't work. A Tab is represented as Chr(9), for example. It's very rare for strings to contain the other Built In Constants though. In most cases Carriage Return and Line Feed are the only ones found in Plain Text. Tabs and the other constants should only be found in Rich Text and are very rare in string data.
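
The same cleanup can be checked outside Crystal Reports, for instance when comparing key values exported from the two tables. A minimal Python sketch that mirrors the formula above, assuming the same width of 10; normalize_key is a hypothetical helper name.

```python
def normalize_key(value: str, width: int = 10) -> str:
    """Trim spaces, keep the left `width` characters, then drop LF and CR."""
    trimmed = value.strip(" ")
    return trimmed[:width].replace("\n", "").replace("\r", "")


if __name__ == "__main__":
    left = "  ABC123\r\n "   # hypothetical value with hidden CR/LF and padding
    right = "ABC123"
    # repr() makes the hidden characters visible before normalizing.
    print(repr(left), repr(right))
    print(normalize_key(left) == normalize_key(right))
```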

PostgreSQL Trimming Leading and Trailing Characters: = and "

I'm working to build an import tool that utilizes a quoted CSV file. However, several of the fields in the CSV file are reported as such:
"=""38000"""
Where 38000 is the data I need. The data integration software I use (Talend 6.11) already strips the leading and trailing double quotes for me (so, "38000" becomes 38000), but I can't find a way to get rid of those others.
So, essentially, I need "=""38000""" to become "38000" where the leading "=" is removed and the trailing "" is removed.
Is there a TRIM function that can accomplish this for me? Perhaps there is a method in Talend that can do this?
As the other answer stated, you could do that operation in SQL. Or, you could do it in Java, Groovy, etc, within Talend. However, if there is an existing Talend component which does the job, my preference is to use it. That leads to faster development, potentially less testing, and easier maintenance. Having said that, it is important to review all the components which are available, so you know what's available to you.
You can use the Talend component tReplace to inspect each of the input columns you want to strip of quotes and equal signs. A single tReplace component can do search and replace operations on multiple input columns. If all of the replacements are related to each other, I would keep them within a single tReplace. When it gets to the point of doing unrelated replacements, I might place those in a new tReplace so that logical operations are organized and grouped together.
tReplace
For a given Input Column
search for "=", replace with ""
search for "\"", replace with ""
Something like this:
SELECT format( '"%s"', trim( both '"=' from '"=""38000"""' ) );
-[ RECORD 1 ]---
format | "38000"
First, trim() removes all leading and trailing " and = characters, leaving just 38000.
Second, format() adds the double quotes back to give the desired end result.
Alternatively, you can use a regular expression or other Postgres string functions.
See more:
https://www.postgresql.org/docs/current/static/functions-string.html
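
If the cleanup is done in code rather than in SQL, the same trim is short to write. A sketch in Python, assuming the field arrives as the raw string shown in the question; clean_field and clean_field_regex are hypothetical names.

```python
import re


def clean_field(raw: str) -> str:
    # Remove any run of double quotes and equal signs from both ends,
    # in the same spirit as trim(both '"=' from ...).
    return raw.strip('"=')


def clean_field_regex(raw: str) -> str:
    # Stricter variant: unwrap only an exact ="..." wrapper,
    # and return the value unchanged otherwise.
    match = re.fullmatch(r'="(.*)"', raw)
    return match.group(1) if match else raw


if __name__ == "__main__":
    print(clean_field('"=""38000"""'))
    print(clean_field_regex('="38000"'))
```

The plain strip also eats a legitimate leading = or quote in the data, so the regex variant is the safer choice when other columns may start with those characters.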

Identifying hidden characters in text

I have an ETL process that regularly extracts code from an ODBC data source, manipulates it, and inserts it into my postgres database. One of the columns from this data source regularly has odd characters in it.
For the most part I can catch and convert all of the characters appropriately, but I have one character that exists in the ODBC data source, cannot be brought into postgres (all of the text after that character gets truncated), and I'm having a hard time identifying what the character is.
I can't even insert an example of the character directly into this post because it gets stripped out :/ The closest I can get is a screen shot of the character in textmate (the only application I can actually see the character in):
The character is the diamond between the 1 and 0. When my data comes in, everything after the 0 is truncated.
Is there a good way of identifying what this character is so I can figure out a way of stripping it out?
Per tripleee's comment on the original question post:
To identify the offending character, I looked at the hex dump of the text.
There are a number of ways to do this, but the quickest way for me was to dump the text into a utility application I have called HexFiend. Once the text was in and I highlighted the character, it showed the hex value "00".
A bit more investigation pointed towards the hex null value being used as a string terminator in C applications (which makes sense given the context of my project, and explains why everything after it was truncated).
I've added handling for this null value to my ETL process so that it gets replaced with a new line, and now everything is sunshine and daisies.
Thanks again for the help!
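
For anyone without a hex editor at hand, the same identification can be scripted. A minimal sketch in Python, assuming the extracted value is available as a string; describe_chars and replace_nul are hypothetical helper names, not part of any ETL tool.

```python
import unicodedata


def describe_chars(text: str):
    """Return (index, code point, Unicode name) for every non-printable character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed control>"))
        for i, ch in enumerate(text)
        if not ch.isprintable()
    ]


def replace_nul(text: str, replacement: str = "\n") -> str:
    """Swap NUL characters for a replacement before loading the text."""
    return text.replace("\x00", replacement)


if __name__ == "__main__":
    sample = "1\x000"  # hypothetical value with a NUL between the 1 and the 0
    print(describe_chars(sample))
    print(repr(replace_nul(sample)))
```

Note that describe_chars also reports ordinary line breaks and tabs, since they are not printable either.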

Dollar sign in text when passing variable

I inherited some tricky, old PHP code, and I'm trying to clean it up, fix some bugs, and so on. The server also runs PHP 4.
The problem is the following:
I get some data back from the database, work with it, and display it. If the result contains a dollar sign, PHP tries to handle it as a variable.
For example :
$result = $this->sqlresult('SELECT * From Tablename where id=15');
$details = $result['description'];
echo $details;
Here is an example of what happens when $result['description'] contains text such as 'This book is available for $148':
It usually shows nothing or shows the wrong text, such as: This book is available for 48.
I have tried preg_replace on the details, looked for character substitutions, and tried htmlspecialchars too, but either nothing happened or the original text still did not come through.
preg_replace('/\$ /', '/&#36/;', $details);
I know that double quotes around variables cause a similar error. I checked that topic too, but it didn't solve my problem.
My current workaround is to add an extra space between the $ sign and the amount, but I am looking for a better one.
preg_replace('/\$/', '/\$ /', $details);
Have you tried to use escape characters? This book is \$148.

Apostrophe issue in RTF

I have a function within a custom CRM web application (old VB.Net circa 2003) that takes a set of fields from a database and merges them with placeholders in a set of RTF based template documents. These generate merged letters and documentation. The code essentially loops through each line of the RTF template file and replaces any instances of the placeholder values with text from a database record. The issue I'm having is that users have pasted a certain type of apostrophe into the web app (and therefore into the database) that is not rendering correctly in the resulting RTF file. It is rendering like this - ’.
I need a way to spot this invalid apostrophe in the code and replace it with a valid one. Unfortunately when I paste the invalid apostrophe into the Visual Studio editor it gets converted into the correct one. So I need another way to express this invalid apostrophe's value. Unfortunately I do not know a great deal about unicode and other encodings so I'm calling out for help with this.
Any ideas?
If you really just want to figure out what the character is, you might want to paste it into a text editor like UltraEdit. It has a hex mode that you can switch to in order to see the actual underlying bytes.
To do the replace once you've figured out the character, you'd do something like this in VB:
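' Assumes the character is U+2019 (right single quotation mark); substitute the code you actually find.
text = text.Replace(ChrW(&H2019), "'")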
Note that you might not be able to figure it out easily using the text editor, because the character might also get mangled when pasted from the clipboard. Instead, you might want to print the character codes from your code as debug output. You can use the AscW function to do that.
I can't help but think that it may actually simply be a case of specifying the correct encoding to use when you write out the stream though. Assuming you're using a StreamWriter you can specify it on the constructor. I'm guessing you actually want ASCII given your requirement.
oWriter = New System.IO.StreamWriter(path, False, System.Text.Encoding.ASCII)
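It looks like you probably need to escape characters outside the 8-bit range (code points above 255).
According to the Wikipedia article on RTF, you can do that with the \uN control word, where N is the decimal code of the character, followed by a fallback character for readers that don't support Unicode.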
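
To make the escaping concrete, here is a minimal sketch in Python rather than VB.Net. It assumes the RTF default of one fallback character per \u escape and uses "?" as that fallback; rtf_escape is a hypothetical helper name, not part of any library.

```python
def rtf_escape(text: str) -> str:
    """Escape text for insertion into an RTF document body."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \uN takes a signed 16-bit decimal, so emit one escape per
            # UTF-16 code unit, each followed by the "?" fallback.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append(f"\\u{unit}?")
    return "".join(out)


if __name__ == "__main__":
    print(rtf_escape("It\u2019s here"))
```

Applying the escape to the database values before merging them into the template keeps the file itself pure ASCII, so the writer's encoding no longer matters for these characters.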