CK_CHAR vs CK_BYTE in PKCS#11? - pkcs#11

Does PKCS#11 intend CK_CHAR and CK_BYTE to have identical semantics, or is CK_CHAR intended to imply printability?
The standard PKCS#11 type header defines CK_CHAR in terms of CK_BYTE and says "character" instead of "value":
/* an unsigned 8-bit value */
typedef unsigned char CK_BYTE;
/* an unsigned 8-bit character */
typedef CK_BYTE CK_CHAR;
Does this guarantee that every CK_CHAR (and array of CK_CHARs) is within the printable range?

Answering my own question: Section 1.3, Table 3 of PKCS#11 v2.40 establishes that CK_CHAR is always printable (in the ANSI C encoding, which is ASCII).
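So a CK_BYTE array may hold any 8-bit values, while data typed as CK_CHAR is expected to stay inside that character set. Below is a minimal sketch of a check along those lines, assuming the Table 3 set is essentially the printable ASCII range (0x20..0x7E); the helper name is made up for illustration:
#include <stddef.h>

typedef unsigned char CK_BYTE;   /* an unsigned 8-bit value     */
typedef CK_BYTE CK_CHAR;         /* an unsigned 8-bit character */

/* Hypothetical helper: returns 1 if every element lies in the printable
   ASCII range 0x20..0x7E, which is roughly what the Table 3 character
   set amounts to. */
static int ck_char_array_is_printable(const CK_CHAR *p, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (p[i] < 0x20 || p[i] > 0x7E)
            return 0;
    return 1;
}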

Related

Create Unicode from a hex number in C++

My objective is to take a character which represents the UK pound symbol and convert it to its Unicode equivalent in a string.
Here's my code and output so far from my test program:
#include <iostream>
#include <stdio.h>
int main()
{
char x = 163;
unsigned char ux = x;
const char *str = "\u00A3";
printf("x: %d\n", x);
printf("ux: %d %x\n", ux, ux);
printf("str: %s\n", str);
return 0;
}
Output
$ ./pound
x: -93
ux: 163 a3
str: £
My goal is to take the unsigned char 0xA3 and put it into a string holding the Unicode representation of the UK pound sign: "\u00A3"
What exactly is your question? Anyway, you say you're writing C++, but you're using char* and printf and stdio.h, so you're really writing C, and plain C has very little built-in Unicode support. Remember that a char in C is not a "character", it's just a byte, and a char* is not an array of characters, it's an array of bytes. When you printf the "\u00A3" string in your sample program you are not printing a single character: with the usual UTF-8 execution character set, the compiler has already turned the \u00A3 escape into the two bytes 0xC2 0xA3, and your terminal is helping you out by interpreting that byte sequence as the £ character. You can see this for yourself: strlen(str) is 2, and printing str[0] on its own gives you only the first byte of that sequence, not a complete character.
If you want to use Unicode correctly in C you'll need to use a library. There are many to choose from and I haven't used any of them enough to recommend one. Or you'll need to use C++11 or newer and use std::wstring, std::u32string and friends. But what you are doing now is shuffling bytes rather than handling real Unicode, and it will not work as you expect in the long run.
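For the original goal (turning the single byte 0xA3 into a string containing the UTF-8 encoding of U+00A3), here is a minimal C sketch; it assumes the input byte is Latin-1, where every code point fits in at most two UTF-8 bytes:
#include <stdio.h>

/* Encode one Latin-1 byte as UTF-8: code points below 0x80 stay a single
   byte, everything else becomes a two-byte sequence. */
static void latin1_to_utf8(unsigned char c, char out[3])
{
    if (c < 0x80) {
        out[0] = (char)c;
        out[1] = '\0';
    } else {
        out[0] = (char)(0xC0 | (c >> 6));    /* leading byte 110xxxxx */
        out[1] = (char)(0x80 | (c & 0x3F));  /* continuation 10xxxxxx */
        out[2] = '\0';
    }
}

int main(void)
{
    char buf[3];
    latin1_to_utf8(0xA3, buf);   /* 0xA3 is U+00A3, POUND SIGN */
    printf("%s\n", buf);         /* prints £ on a UTF-8 terminal */
    return 0;
}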

Iterate through alphabet in Swift explanation

I accidentally wrote this simple code to print the alphabet in the terminal:
var alpha:Int = 97
while (alpha <= 122) {
write(1, &alpha, 1)
alpha += 1
}
write(1, "\n", 1)
// I'm using the C write() function to avoid a newline after each symbol
And I've got this output:
abcdefghijklmnopqrstuvwxyz
Program ended with exit code: 0
So, here is the question: Why does it work?
In my logic, it should display a row of numbers, because an integer variable is being used. In C it would be a char variable, which would mean we refer to the character at some index of the ASCII table. Then:
char alpha = 97;
would be the code for the character 'a', and by incrementing the alpha variable in a loop we would display each ASCII character up to the 122nd.
In Swift, though, I couldn't assign an integer to a Character or String variable. I used an Int and then declared several more variables to assign a UnicodeScalar, but accidentally I found out that in the call to write I was pointing at my integer, not at the new UnicodeScalar variable, and it still works! The code is very short and readable, but I don't completely understand how it works, or why it works at all.
Has anyone run into this?
Why does it work?
This works “by chance” because the integer is stored in little-endian byte order.
The integer 97 is stored in memory as 8 bytes
0x61 0x00 0x00 0x00 0x00 0x00 0x00 0x00
and in write(1, &alpha, 1), the address of that memory location is
passed to the write system call. Since the last parameter (nbyte)
is 1, the first byte at that memory address is written to the
standard output: That is 0x61 or 97, the ASCII code of the letter
a.
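The same effect can be reproduced in C, which makes the byte-level behaviour easy to see (a small sketch that, like the Swift code, relies on a little-endian machine):
#include <unistd.h>

int main(void)
{
    int alpha = 97;        /* stored little-endian as 61 00 00 00 ... */
    write(1, &alpha, 1);   /* writes only the first byte, 0x61 = 'a'  */
    write(1, "\n", 1);
    return 0;
}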
In Swift though, I couldn't assign an integer to Character or String type variable.
The Swift equivalent of char is CChar, a type alias for Int8:
var alpha: CChar = 97
Here is a solution which does not rely on the memory layout and works for non-ASCII characters as well:
let first: UnicodeScalar = "α"
let last: UnicodeScalar = "ω"
for v in first.value...last.value {
if let c = UnicodeScalar(v) {
print(c, terminator: "")
}
}
print()
// αβγδεζηθικλμνξοπρςστυφχψω

How does WideCharToMultiByte deal with codepages?

When I execute the code below, why am I getting '?' for the first case? AFAIK, codepage 932 supports line-drawing characters.
How does this API deal with codepages? AFAIK, it searches for the character in the codepage, maps it, and then returns that character's position in the codepage.
typedef struct dbcs {
unsigned char HighByte;
unsigned char LowByte;
} DBCS;
static DBCS set[5] = {0x25,0x5D};
unsigned char array[2];
#include <windows.h>
#include <stdio.h>
int main()
{
// printf("hello world");
int str_size;
LPCWSTR charpntr;
LPSTR getcd;
LPBOOL flg;
int i ;
array[0] = set[0].LowByte;
array[1] = set[0].HighByte;
charpntr = &array;
str_size = WideCharToMultiByte(932, 0, charpntr, 1, getcd, 2, NULL, NULL);
printf(" value of %u", getcd);
printf("number of bytes %d character is %s", str_size, getcd);
printf("\n");
array[0] = set[0].LowByte;
array[1] = set[0].HighByte;
charpntr = &array;
str_size = WideCharToMultiByte(437, 0, charpntr, 1, getcd, 2, NULL, NULL);
printf(" value of %u", getcd);
printf("number of bytes %d character is %s", str_size, getcd);
printf("\n");
}
Result of execution in CodeBlocks:
Windows codepage 932 is not a simple thing, as it uses multibyte characters.
I have no Windows here, so I have been experimenting with the encoding of the character you are using in Python 3, in a UTF-8 terminal: it works fine with cp437 and UTF-8, but Python refuses to encode the character to what it calls "cp932", or to any of its aliases listed in the Wikipedia article:
https://en.wikipedia.org/wiki/Code_page_932_(Microsoft_Windows)
It may be a fault in Python's internal Unicode tables (fetched directly from the Unicode consortium), or possibly this codepage doesn't map this character at all.
Anyway, there are problems in your code: one is that you never initialize getcd. Reading the docs for WideCharToMultiByte(), one sees that it must not be NULL, so you have to have a proper output buffer allocated there.
So, try declaring getcd as:
char getcd[6] = {0};
That should give you enough space for even the widest character you experiment with, plus a terminating \x00.
And another thing: the "1" after charpntr is cchWideChar, the number of UTF-16 code units in the input, and 1 is correct here. But if these line-drawing characters are present in CP932 they are definitely multibyte, so the output buffer you pass must have room for at least two bytes (the cbMultiByte value of 2 in your call is just enough), and since cchWideChar is not -1 the function does not null-terminate the output, so terminate getcd yourself before printing it with %s. If no other error kicks in, and the character exists in cp932, fixing the buffer should sort out your issue.
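Putting those points together, a cleaned-up version of the call might look roughly like this (a sketch, not tested against cp932; the wide-character input is typed properly and the output goes into a real, zero-initialised buffer):
#include <windows.h>
#include <stdio.h>

int main(void)
{
    WCHAR wide[1] = { 0x255D };      /* one UTF-16 code unit: U+255D */
    char  narrow[8] = { 0 };         /* real output buffer, zero-initialised */

    int n = WideCharToMultiByte(437, 0,
                                wide, 1,                         /* cchWideChar: 1 code unit in  */
                                narrow, (int)sizeof(narrow) - 1, /* cbMultiByte: room for output */
                                NULL, NULL);
    if (n == 0)
        printf("conversion failed, error %lu\n", GetLastError());
    else
        printf("%d byte(s): %s\n", n, narrow);
    return 0;
}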

USART format data type

I would like to ask how I can send data via USART as an integer, i.e. a variable which stores a number. I am able to send a char variable, but the terminal shows me the ASCII representation of that number, and I need to see the number itself.
I edited the code as shown below, but it gives me the error: "conflicting types for 'USART_Transmit'"
#include <avr/io.h>
#include <util/delay.h>
#define FOSC 8000000// Clock Speed
#define BAUD 9600
#define MYUBRR FOSC/16/BAUD-1
void USART_Init( unsigned int ubrr );
void USART_Transmit( unsigned char data );
unsigned char USART_Receive( void );
int main( void )
{
unsigned char str[5] = "serus";
unsigned char strLenght = 5;
unsigned int i = 47;
USART_Init ( MYUBRR );
//USART_Transmit('S' );
while(1)
{
/*USART_Transmit( str[i++] );
if(i >= strLenght)
i = 0;*/
USART_Transmit(i);
_delay_ms(250);
}
return(0);
}
void USART_Init( unsigned int ubrr )
{
/* Set baud rate */
UBRR0H = (unsigned char)(ubrr>>8);
UBRR0L = (unsigned char)ubrr;
/* Enable receiver and transmitter */
UCSR0B = (1<<RXEN)|(1<<TXEN);
/* Set frame format: 8data, 2stop bit */
UCSR0C = (1<<USBS)|(3<<UCSZ0);
}
void USART_Transmit( unsigned int data )
{
/* Wait for empty transmit buffer */
while ( !( UCSR0A & (1<<UDRE)) )
;
/* Put data into buffer, sends the data */
UDR0 = data;
}
unsigned char USART_Receive( void )
{
/* Wait for data to be received */
while ( !(UCSR0A & (1<<RXC)) )
;
/* Get and return received data from buffer */
return UDR0;
}
Do you have any idea what is wrong?
PS: I hope you understand what I'm trying to explain.
The "conflicting types" error comes from the prototype declaring USART_Transmit(unsigned char data) while the definition takes unsigned int data; make the two match. As for seeing the number itself on the terminal, I like to use sprintf to format numbers for serial.
At the top of your file, put:
#include <stdio.h>
Then write some code in a function like this:
char buffer[16];
sprintf(buffer, "%d\n", number);
char * p = buffer;
while (*p) { USART_Transmit(*p++); }
The first two lines construct a null-terminated string in the buffer. The last two lines are a simple loop that sends all the characters in the buffer. I put a newline in the format string to make it easier to see where one number ends and the next begins.
Technically a UART serial connection is just a stream of bits divided into symbols of a certain length. It's perfectly possible to send the data in raw form, but this comes with a number of issues that must be addressed:
How to identify the start and end of a transmission unambiguously?
How to deal with endianess on either side of the connection?
How to serialize and deserialize the data in a robust way?
How to deal with transmission errors?
At the end of the day it turns out that you can never resolve all the ambiguities, and binary data must somehow be escaped or otherwise encoded to prevent misinterpretation.
As far as delimiting transmissions is concerned, that has been addressed by the creators of the ASCII standard through the set of nonprintable control characters. Of particular interest to you are the special control characters
STX / 0x02 / Start of Text
ETX / 0x03 / End of Text
There are also other control characters which form a pretty complete set for marking up data structures; you don't need JSON or XML for this. ASCII itself, however, does not support the transmission of arbitrary binary data; the standard staple for that task has long been, and still is, base64 encoding. Use that for transmitting arbitrary binary data.
Numbers you probably should not transmit in binary form at all; just send the digits. If you use octal or hexadecimal digits, parsing them back into integers is very simple (it boils down to a bit of masking and shifting).
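As a rough illustration of the digits-plus-delimiters idea, here is a sketch that reuses the USART_Transmit() from the question and frames a 16-bit value as four hex digits between STX and ETX (the framing convention and the helper name are just one possible choice, not a fixed protocol):
#include <stdint.h>

#define FRAME_STX 0x02  /* Start of Text */
#define FRAME_ETX 0x03  /* End of Text   */

void USART_Transmit(unsigned char data);  /* from the question's code */

/* Send a 16-bit value as four hexadecimal digits framed by STX/ETX. */
void USART_SendHex16(uint16_t value)
{
    static const char hex[] = "0123456789ABCDEF";
    int8_t shift;

    USART_Transmit(FRAME_STX);
    for (shift = 12; shift >= 0; shift -= 4)
        USART_Transmit(hex[(value >> shift) & 0x0F]);
    USART_Transmit(FRAME_ETX);
}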

searchMemory function in pykd

I'm trying to understand how to use the searchMemory() function in pykd extension for windbg.
The documentation says the following:
Function searchMemory
searchMemory( (long)arg1, (int)arg2, (list)arg3) -> int :
Search in virtual memory
C++ signature :
unsigned __int64 searchMemory(unsigned __int64,unsigned long,class boost::python::list)
searchMemory( (long)arg1, (int)arg2, (str)arg3) -> int :
Search in virtual memory
C++ signature :
unsigned __int64 searchMemory(unsigned __int64,unsigned long,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >)
Does someone know what the arguments are and how should I use this function?
First, note that there are 2 overloads of the same method:
searchMemory( (long)arg1, (int)arg2, (list)arg3) -> int
and
searchMemory( (long)arg1, (int)arg2, (str)arg3) -> int
arg1 is the start address or offset at which to start the search,
arg2 is the length or amount of memory to search and
arg3 is the search term, which can be
a string (std::string) or
a list (of char)
the return value is an offset again, presumably the offset of the first occurrence, so to find the next occurrence you have to search again
I have interpreted all of this from the sources in pymemaccess.cpp [Codeplex] and have not used it myself yet.
I'm not very familiar with either C++ or Python, and even less with the mapping between the two, but IMHO the std::string is a string of bytes rather than Unicode characters, so you can put arbitrary bytes in there. It should also be suitable for an ASCII search, but you might have to fiddle a bit for UTF-16 / UCS text. The same probably applies to the list of char, because it's not declared as wchar_t.