How to determine if a base64-encoded value and a Unicode character are equivalent?

I'm trying to figure out if the two encodings of the character are the same:
SELECT
~b"1",
code_points_to_string([0x380]),
"\u0380"
The first is base64-encoded and the second is a string. How can I determine if these are equivalent? Is there some sort of function where I can, for example, extract the code point from a base64 string in BigQuery?

You mentioned ~b"1" as base64-encoded, but according to this Google Cloud documentation, ~X is the bitwise NOT operator. Here you are converting the string value to bytes and then performing a bitwise NOT on it (although the BigQuery UI displays the resulting bytes as base64). To convert a value to base64 you can use the TO_BASE64 function.
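For example (the expected outputs in the comments are my own reading of how BigQuery renders BYTES, so treat them as an assumption):
SELECT
  TO_BASE64(b"1") AS base64_of_bytes,   -- expected "MQ==", the actual base64 encoding of the byte 0x31
  ~b"1"           AS bitwise_not_bytes  -- expected NOT 0x31 = 0xCE, which the UI renders as "zg=="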
You can use the TO_CODE_POINTS function to extract the code points from a base64-encoded string, after decoding it with FROM_BASE64.
select *
,from_base64(d) decoded_val_bytes
,SAFE_CONVERT_BYTES_TO_STRING(from_base64(d)) string_value
,to_code_points(from_base64(d)) cp
from (
select
~b"foo" a,
code_points_to_string([102,111,111]) b
,to_base64(b"foo") d
,to_code_points("foo") e
)
The TO_CODE_POINTS function converts a STRING or BYTES value into an ARRAY<INT64> of code points, and CODE_POINTS_TO_STRING converts an ARRAY<INT64> back into a STRING.
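Putting those together, here is a minimal sketch of an equivalence check for the original example; it assumes "zoA=" is the base64 encoding of the UTF-8 bytes 0xCE 0x80 for U+0380:
SELECT
  TO_CODE_POINTS(SAFE_CONVERT_BYTES_TO_STRING(FROM_BASE64("zoA="))) AS cp_from_base64,  -- [896]
  TO_CODE_POINTS("\u0380") AS cp_from_string,                                           -- [896]
  SAFE_CONVERT_BYTES_TO_STRING(FROM_BASE64("zoA=")) = "\u0380" AS are_equivalent        -- true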

Related

atobin() and atohex() in systemverilog

Does anyone know about these 2 functions? Should the output of 'F'.atohex() be 0x16 or 0x46 (directly from the ASCII table)? I have googled this already, but some say the former is correct while others say the latter. Thanks.
Actually, the result is 0xF. These functions do not have the greatest names. What both do is convert an ASCII string in a particular radix to an integral value. atohex assumes the string is formatted in hexadecimal.
From the LRM:
— str.atoi() returns the integer corresponding to the ASCII decimal representation in str.
— atohex interprets the string as hexadecimal.
— atooct interprets the string as octal.
— atobin interprets the string as binary.
NOTE—These ASCII conversion functions return a 32-bit integer value
So, the result of the following:
string a = "F";
a.atohex();
is a 32-bit integer: 32'hF.
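A small sketch to illustrate the behaviour (the example values are mine, not from the LRM):
module tb;
  initial begin
    string s = "1A";
    string b = "1010";
    $display("%0h", s.atohex()); // 1a   -- string parsed as hexadecimal, i.e. 32'h1A
    $display("%0d", s.atoi());   // 1    -- atoi stops at the non-decimal character 'A'
    $display("%0b", b.atobin()); // 1010 -- string parsed as binary, i.e. 32'b1010
  end
endmodule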

how to remove # character from national data type in cobol

I am facing an issue while converting Unicode data into national characters.
When I convert the Unicode data to national using the NATIONAL-OF function, a junk character like # is appended after the string.
E.g.
Ws-unicode pic X(200)
Ws-national pic N(600)
-- let the value in Ws-unicode be これらの変更は, received from the Java side.
move function national-of ( Ws-unicode ,1208 ) to Ws-national.
-- after conversion the value looks like これらの変更は #.
I do not want the extra # character added after conversion.
Please help me find a possible solution. I have tried to replace N'#' with a space using the INSPECT clause; it worked well, but it fails in one specific scenario: if the user's input itself contains a #, that genuine # is also converted to a space.
Below is a snippet of code I used to convert EBCDIC to UTF-8. Before I was capturing string lengths, I was also getting # symbols:
STRING
    FUNCTION DISPLAY-OF (
        FUNCTION NATIONAL-OF (
            WS-EBCDIC-STRING(1:WS-XML-EBCDIC-LENGTH)
            WS-EBCDIC-CCSID
        )
        WS-UTF8-CCSID
    )
    DELIMITED BY SIZE
    INTO WS-UTF8-STRING
    WITH POINTER WS-XML-UTF8-LENGTH
END-STRING
SUBTRACT 1 FROM WS-XML-UTF8-LENGTH
What this code does is STRING the UTF-8 representation of the EBCDIC string into another variable. The WITH POINTER clause will capture the new length of the string + 1 (+1 because the pointer is positioned at the next position after the string ends).
Using this method, you should be able to know exactly how long the second string is and use it with that exact length.
That should remove the unwanted #s.
EDIT:
One thing I forgot to mention: in my case, the # signs were actually EBCDIC low-values when viewing the actual hex on the mainframe.
Use INSPECT with REVERSE and stop after the first occurrence of #.
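To make that concrete, here is a rough sketch of the INSPECT-with-REVERSE idea; the data name WS-TRAIL-CNT is illustrative (not from the original post) and it assumes the padding really does arrive as # characters:
*> count trailing # characters by inspecting the reversed string
MOVE ZERO TO WS-TRAIL-CNT
INSPECT FUNCTION REVERSE(WS-NATIONAL)
    TALLYING WS-TRAIL-CNT FOR LEADING N'#'
*> blank only that tail, so a genuine # inside the text is kept
IF WS-TRAIL-CNT > ZERO
    MOVE SPACES TO WS-NATIONAL
        (FUNCTION LENGTH(WS-NATIONAL) - WS-TRAIL-CNT + 1:)
END-IF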

In PostgreSQL, how can I unwrap a json string to text?

Suppose I have a value of type json, say y. One may obtain such a value through, for example, obj->'key', or any function that returns values of type json.
This value, when cast to text, includes quotation marks, i.e. "y" instead of y. In cases where using json types is unavoidable, this poses a problem, especially when we wish to compare the value with literal strings, e.g.
select foo(x)='bar';
The API Brainstorm page suggests a from_json function that will intelligently unwrap JSON strings, but I doubt that is available yet. In the meantime, how can one convert JSON strings to text without the quotation marks?
Text:
To extract a value as text, use #>>:
SELECT to_json('foo'::text) #>> '{}';
From: Postgres: How to convert a json string to text?
PostgreSQL doc page: https://www.postgresql.org/docs/11/functions-json.html
So it addresses your question specifically, but it doesn't work with other types, such as integer or float. The #> operator will not work for other types either.
Numbers:
Because JSON only has one numeric type, "number", and has no concept of int or float, there's no obvious way to cast a JSON type to a "correct" numeric type. It's best to know the schema of your JSON, extract the text and then cast to the correct type:
SELECT (('{"a":2.01}'::json)->'a'#>>'{}')::float
PostgreSQL does however have support for "arbitrary precision numbers" ("up to 131072 digits before the decimal point; up to 16383 digits after the decimal point") with its "numeric" type. JSON also supports 'e' notation for large numbers.
Try this to test them both out:
SELECT (('{"a":2e99999}'::json)->'a'#>>'{}')::numeric
The ->> operator unwraps quotation marks correctly. In order to take advantage of that operator, we wrap up our value inside an array, and then convert that to json.
CREATE OR REPLACE FUNCTION json2text(IN from_json JSON)
RETURNS TEXT AS $$
BEGIN
RETURN to_json(ARRAY[from_json])->>0;
END; $$
LANGUAGE plpgsql;
For completeness, we provide a CAST that makes use of the function above.
CREATE CAST (json AS text) WITH FUNCTION json2text(json) AS ASSIGNMENT;
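A quick sanity check of both approaches (the literal values are purely illustrative):
SELECT ('{"key":"bar"}'::json)->'key' #>> '{}' = 'bar';    -- true, using the #>> operator
SELECT json2text(('{"key":"bar"}'::json)->'key') = 'bar';  -- true, using the helper function above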

How to get hex-encoded md5 hash in Go

I'm trying to get the md5 hash of a file in Go, like thus:
running_hash := md5.New()     // type hash.Hash
running_hash.Write(data)      // data is []byte
sum := running_hash.Sum(nil)  // []uint8 according to the compiler
But when I try to get the string of the hash's 'sum' (http://golang.org/pkg/hash/), via
sumstring := string(sum) // returns 'Ӿ��]앿��N��' or similar
when the hash is supposed to be d3be9e835dec95bfbef34ebe1fbf03da. I get the same sort of nonsense, only with different characters, when I try to convert on a byte-by-byte basis.
How am I meant to get the hash's string?
Basically, you've got the binary data but it looks like you're expecting hex. Have a look at the hex package for conversion routines, especially EncodeToString. I'm not a Go programmer, but I think if you just pass sum into hex.EncodeToString, you'll get the answer you expected.
Alternatively, you can get the hex representation of a string or byte slice easily using fmt.Sprintf("%x", sum).
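Putting both suggestions into a minimal, self-contained sketch (the input bytes are made up; a real program would read the file instead):
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

func main() {
	data := []byte("hello world") // stand-in for the file contents

	runningHash := md5.New()
	runningHash.Write(data)
	sum := runningHash.Sum(nil) // raw 16-byte digest

	fmt.Println(hex.EncodeToString(sum)) // 5eb63bbbe01eeed093cb22bb8f5acdc3
	fmt.Printf("%x\n", sum)              // the same hex string via fmt
}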

Import hex-data from a file with matlab

This is a part of my data.
ªU€ÿ ÿ dô # #›ÿÿ;< …æ ³ 3m ...
It is saved in a file. When I look at it with a hex editor I can see the hex values. How can I read this "hex data" with MATLAB?
EDIT: I get this error:
??? Error using ==> hex2dec at 38
Input string found with characters other than 0-9, a-f, or A-F.
with this code:
a = fread(fid,1,'uint32','l');
fprintf('%X',a)
b = hex2dec(a);
hex2dec() expects a hexadecimal number string as input.
>> hex2dec('28')
With your fread statement I suspect that your 'a' variable will be an integer*4, hence the error message; my understanding is that the precision argument has already converted the hex string to the type you've declared. If you want to pass this value through hex2dec then you'll need to create a string input.
>> hex2dec(num2str(28));
Do you know the format of your binary file? i.e. is the first value of the data an integer*4?
EDIT: added hex output
In response to the comment: as you read the data in, MATLAB converts the binary data stream into the format you defined. If you want to get the stream of hexadecimal data, then the simplest way is to convert the values back into hexadecimal.
a=dec2hex(fread(fid))
'a' will be a list of all the values in hexadecimal format and should match what you see in your hex editor.
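For example, a small sketch of that approach (the file name is made up):
fid = fopen('data.bin', 'r');      % open the binary file
bytes = fread(fid, Inf, 'uint8');  % read every byte as a value 0-255
fclose(fid);
hexList = dec2hex(bytes, 2);       % one two-digit hex value per row
disp(hexList)                      % should match the hex editor view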