Import hex-data from a file with matlab - matlab

This is a part of my data.
ªU€ÿ ÿ dô # #›ÿÿ;< …æ ³ 3m ...
It is saved in a file. When I look at it with a hex-editor I can see the hex-values. How can I read this "hex-data" with matlab?
EDIT: I get this error:
??? Error using ==> hex2dec at 38
Input string found with characters other than 0-9, a-f, or A-F.
with this code:
a = fread(fid,1,'uint32','l');
fprintf('%X',a)
b = hex2dec(a);

hex2dec() expects a hexadecimal number string as input.
>> hex2dec('28')
With your fread statement, I suspect that your 'a' variable is an integer*4 (a 32-bit integer), hence the error message: the precision argument has already converted the raw bytes to the numeric type you declared, so 'a' is a number, not a hex string. If you want to pass this value through hex2dec, you'll need to create a string input.
>> hex2dec(num2str(28));
Do you know the format of your binary file? i.e. is the first value of the data an integer*4?
EDIT: added hex output
In response to the comment: as you read the data in, MATLAB converts the binary data stream into the format you defined. If you want the stream of hexadecimal data, the simplest way is to convert the values back into hexadecimal.
a=dec2hex(fread(fid))
'a' will be a list of all the values in hexadecimal format and should match what you see in your hex editor.
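The difference between the raw bytes, their hex display, and a 32-bit integer read from them can be sketched outside MATLAB as well. The following is a minimal Python illustration, not the answerer's code: the four sample bytes and the helper name describe_bytes are assumptions made for the example.

```python
import struct

def describe_bytes(raw: bytes):
    """Return the per-byte hex view and the first 4 bytes read as a little-endian uint32."""
    # What a hex editor shows: one two-digit hex pair per byte.
    hex_pairs = [format(b, "02X") for b in raw]
    # What a typed read (like fread with 'uint32', 'l') yields: a number, not a string.
    (first_word,) = struct.unpack("<I", raw[:4])
    # Formatting that number as hex gives a string again.
    return hex_pairs, first_word, format(first_word, "X")

# Hypothetical sample bytes, chosen for illustration only.
raw = bytes([0xAA, 0x55, 0x80, 0xFF])
hex_pairs, first_word, word_hex = describe_bytes(raw)
print(hex_pairs)   # per-byte view, in file order
print(word_hex)    # the same four bytes as one little-endian word
```

Note that the little-endian word lists the bytes in reverse order compared with the per-byte view, which is why the two displays can look different for the same file.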

q=dec2bin(hex2dec(num2str(p)))

Related

IW8ISO8859P8 to utf-8 conversion

My Perl script reads values extracted from an Oracle DB that uses the IW8ISO8859P8 code page into a string. I also have an input file (saved as UTF-8) from which I read another string.
I am trying to compare both strings. Printing the first string gives me gibberish, e.g. òøáä -îåñáåú, whereas the other string gives me the Hebrew letters, e.g. הסבות. How can I encode the first string to get the right Hebrew string?
Thanks
Shimon
I tried using $string1 = Encode (uft-8, $string1), but it did not help.

How to determine if base64 and unicode character are equivalent?

I'm trying to figure out if the two encodings of the character are the same:
SELECT
~b"1",
code_points_to_string([0x380]),
"\u0380"
The first is base64-encoded and the second is a string. How can I determine if these are equivalent? Is there some sort of function where I can, for example extract the code point from a base64-string in BigQuery?
You describe ~b"1" as base64-encoded, but according to the Google Cloud documentation, ~X is the bitwise NOT operator. Here you are converting the string value to bytes and then performing a bitwise NOT on it (the BigQuery UI merely displays bytes values base64-encoded). To base64-encode a value, use the TO_BASE64 function.
You can decode a base64-encoded string with FROM_BASE64 and then use the TO_CODE_POINTS function to extract the code points.
select *
  ,from_base64(d) decoded_val_bytes
  ,SAFE_CONVERT_BYTES_TO_STRING(from_base64(d)) string_value
  ,to_code_points(from_base64(d)) cp
from (
  select
    ~b"foo" a
    ,code_points_to_string([102,111,111]) b
    ,to_base64(b"foo") d
    ,to_code_points("foo") e
)
The to_code_points function converts a STRING or BYTES value into an array of INT64, and the code_points_to_string function converts an array of INT64 into a STRING.
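The same distinctions (base64 encoding, code points, and bitwise NOT on bytes) can be checked outside BigQuery. This is a small Python sketch under the assumption that strings are UTF-8 encoded; the helper name bitwise_not is hypothetical and only mirrors what the ~ operator does to each byte.

```python
import base64

def bitwise_not(data: bytes) -> bytes:
    """Invert every bit of every byte, analogous to ~ applied to a bytes value."""
    return bytes(~b & 0xFF for b in data)

text = "foo"
raw = text.encode("utf-8")

# Base64 is a text representation of bytes; decoding it returns the original bytes.
encoded = base64.b64encode(raw).decode("ascii")
decoded = base64.b64decode(encoded)

# Code points of the decoded string, comparable to to_code_points on a STRING.
code_points = [ord(ch) for ch in decoded.decode("utf-8")]

# The two values from the question: ~b"1" versus the UTF-8 bytes of U+0380.
inverted = bitwise_not(b"1")
u0380_bytes = "\u0380".encode("utf-8")
print(encoded, code_points, inverted.hex(), u0380_bytes.hex())
```

In this sketch the inverted byte and the UTF-8 encoding of U+0380 share their first byte but differ in length, so the two values from the question are not equivalent.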

how to remove # character from national data type in cobol

I am facing an issue while converting Unicode data into national characters.
When I convert the Unicode data into national using the national-of function, a junk character such as # is appended after the string.
E.g.
Ws-unicode pic X(200)
Ws-national pic N(600)
--suppose the value in Ws-unicode is これらの変更は, received from the Java side.
move function national-of ( Ws-unicode ,1208 ) to Ws-national.
--after conversion the value looks like これらの変更は #.
I do not want the extra # character added after conversion.
Please help me find a possible solution. I have tried replacing N'#' with a space using the INSPECT statement.
It worked well but failed in some specific scenarios, for example when the user's input contains a #. In that case the genuine # is also converted to a space.
Below is a snippet of code I used to convert EBCDIC to UTF-8. Before I started capturing string lengths, I was also getting # symbols:
STRING
FUNCTION DISPLAY-OF (
FUNCTION NATIONAL-OF (
WS-EBCDIC-STRING(1:WS-XML-EBCDIC-LENGTH)
WS-EBCDIC-CCSID
)
WS-UTF8-CCSID
)
DELIMITED BY SIZE
INTO WS-UTF8-STRING
WITH POINTER WS-XML-UTF8-LENGTH
END-STRING
SUBTRACT 1 FROM WS-XML-UTF8-LENGTH
This code strings the UTF-8 representation of the EBCDIC string into another variable. The WITH POINTER clause captures the new length of the string + 1 (+ 1 because the pointer is left at the position just after the end of the string), which is why 1 is subtracted afterwards.
Using this method, you should know exactly how long the second string is and can use it with that exact length.
That should remove the unwanted #s.
EDIT:
One thing I forgot to mention: in my case, the # signs were actually EBCDIC low values when viewing the actual hex on the mainframe.
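The idea of keeping only the real length of the converted value, instead of replacing every #, can be sketched in Python. This is an illustration only, assuming the padding is a low-value (NUL) character at the end of a fixed-width field; the field contents and function names are hypothetical.

```python
def trim_by_length(buffer: str, length: int) -> str:
    """Keep exactly the known number of characters, as with a captured string length."""
    return buffer[:length]

def trim_trailing_padding(buffer: str, pad: str = "\x00") -> str:
    """Remove padding only from the end, leaving any genuine characters inside untouched."""
    return buffer.rstrip(pad)

# Hypothetical fixed-width field: real data containing a genuine '#', then padding.
value = "item #7"
field = value.ljust(12, "\x00")

by_length = trim_by_length(field, len(value))
by_padding = trim_trailing_padding(field)
print(repr(by_length), repr(by_padding))
```

Both variants leave the genuine # in the middle of the value alone, which is the failure case of a global replace.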
Use INSPECT with FUNCTION REVERSE and stop after the first occurrence of #.

Matlab : Read a file name in string format from a .csv file

I have a .csv file which contains, let's say, 50 rows.
At the beginning of each row I have a file name in the format 001_02_03.bmp, followed by values separated by commas. Something like this:
001_02_03.bmp,20,30,45,10,40,20
Can someone tell me how can I read the first column from the data?
I know how to obtain the data from the second column onward. I am using the csvread function like this: X = csvread('filename.csv', 0, 1);. If I try to read the first column in the same manner, it outputs an error saying that csvread does not support string format.
Use textscan, i.e.:
fid1 = fopen(csvFileName);
X = textscan(fid1, '%s%f%f%f%f%f%f', 'Delimiter', ',');
fclose(fid1);
FirstCol = X{1, 1};
A little more detail? csvread only works with purely numeric data, so you can't use it to get in data with a .bmp, or underscores for that matter. Thus we use textscan. The funny looking format string input to textscan is just saying that the columns are, in order, of type string %s, then the next 6 columns are of type double %f%f%f%f%f%f (or you might choose to alter this to reflect an integer datatype - I personally rarely bother unless the quantity of data is huge or floating point precision is a problem).
Note, if you just want to get the first column and ignore the rest, you can replace the format string with %s%*[^\n]. A final point: if your csv file has a header line, you can skip it using the HeaderLines optional input to textscan.
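For comparison, the same split between a text first column and numeric remaining columns can be sketched in Python with the standard csv module. This is an illustration rather than a MATLAB replacement; the second sample row and the function name split_names_and_values are made up for the example.

```python
import csv
import io

# First row is taken from the question; the second row is a hypothetical extra line.
sample = "001_02_03.bmp,20,30,45,10,40,20\n002_01_07.bmp,5,6,7,8,9,10\n"

def split_names_and_values(text: str):
    """Return the first column as strings and the remaining columns as floats."""
    names, values = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        names.append(row[0])                        # text column, like %s
        values.append([float(x) for x in row[1:]])  # numeric columns, like %f
    return names, values

names, values = split_names_and_values(sample)
print(names)
print(values[0])
```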

decode hex in PostgreSQL - got error "odd number of digits"

I have a problem using this query:
select decode(to_hex(ascii('ل')::int),'hex')
When I execute it, I get:
ERROR: invalid hexadecimal data: odd number of digits
decode(..., 'hex') doesn't mean convert this hexadecimal number to something. Hex encoding is a particular encoding format for bytes, and it requires two hexadecimal digits per octet. On the other hand, to_hex converts an integer to a hexadecimal representation, and that could have an even or odd number of digits.
So the answer is, you can't do that (without some manual fixups). And it's not clear why you would want to, either. It looks like you could just do 'ل'::bytea, but that might not be what you wanted either.
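The mismatch between "hex representation of an integer" and "hex encoding of bytes" is easy to reproduce outside PostgreSQL. The Python sketch below is an illustration under the assumption of UTF-8 text; it is not how PostgreSQL implements decode or to_hex.

```python
ch = "ل"

# Hex representation of the code point as an integer: the digit count can be odd.
code_point_hex = format(ord(ch), "x")

# Hex encoding of bytes always uses two digits per byte.
utf8_hex = ch.encode("utf-8").hex()

# Decoding an odd-length hex string as bytes fails, much like decode(..., 'hex').
try:
    bytes.fromhex(code_point_hex)
    odd_length_ok = True
except ValueError:
    odd_length_ok = False

# A manual fixup: left-pad with a zero to reach an even number of digits.
padded = code_point_hex.zfill(len(code_point_hex) + len(code_point_hex) % 2)
padded_bytes = bytes.fromhex(padded)
print(code_point_hex, utf8_hex, odd_length_ok, padded_bytes.hex())
```

Even after padding, the resulting bytes are the code point's numeric value, not the character's UTF-8 encoding, so the fixup only makes sense if that numeric value is what is wanted.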
Maybe it's simpler to use something like this:
select encode('ل','escape');