Unicode characters not shown correctly - email

I am making a C program that supports many languages. The program sends emails using the type WCHAR instead of char. The problem is that when I receive the email and read it, some characters are not shown correctly, even some English ones like e, m, ... This is an example:
curl_easy_setopt(hnd, CURLOPT_READFUNCTION, payload_source);
curl_easy_setopt(hnd, CURLOPT_READDATA, &upload_ctx);
static const WCHAR *payload_text[]={
L"To: <me@mail.com>\n",
L"From: <me@mail.com>(Example User)\n",
L"Subject: Hello!\n",
L"\n",
L"Message sent\n",
NULL
};
struct upload_status {
int lines_read;
};
static size_t payload_source(void *ptr, size_t size, size_t nmemb, void *userp){
struct upload_status *upload_ctx = (struct upload_status *)userp;
const WCHAR *data;
if ((size == 0) || (nmemb == 0) || ((size*nmemb) < 1)) {
return 0;
}
data = payload_text[upload_ctx->lines_read];
if (data) {
size_t len = wcslen(data);
memcpy(ptr, data, len);
upload_ctx->lines_read ++;
return len;
}
return 0;
}

memcpy() operates on bytes, not on characters. You are not taking into account that sizeof(wchar_t) > 1: it is 2 bytes on some systems and 4 bytes on others. This discrepancy makes wchar_t a bad choice when writing portable code. You should be using a Unicode library instead, such as ICU or iconv.
You need to take sizeof(wchar_t) into account when calling memcpy(). You also need to take into account that the destination buffer may be smaller than the size of the text bytes you are trying to copy. Keeping track of lines_read by itself is not enough; you also have to keep track of how many bytes of the current line you have copied, so you can handle cases where the current line of text straddles multiple destination buffers.
Try something more like this instead:
/* upload_status needs one more field so that progress inside a line survives between calls */
struct upload_status {
    int lines_read;
    size_t line_bytes_read;
};

static size_t payload_source(void *ptr, size_t size, size_t nmemb, void *userp)
{
    struct upload_status *upload_ctx = (struct upload_status *) userp;
    unsigned char *buf = (unsigned char *) ptr;
    size_t available = (size * nmemb);
    size_t total = 0;
    while (available > 0)
    {
        const wchar_t *data = payload_text[upload_ctx->lines_read];
        if (!data) break;
        /* skip the bytes of this line that an earlier call already copied */
        const unsigned char *rawdata = ((const unsigned char *) data) + upload_ctx->line_bytes_read;
        size_t remaining = (wcslen(data) * sizeof(wchar_t)) - upload_ctx->line_bytes_read;
        while ((remaining > 0) && (available > 0))
        {
            /* min() is not standard C, so pick the smaller value by hand */
            size_t bytes_to_copy = (remaining < available) ? remaining : available;
            memcpy(buf, rawdata, bytes_to_copy);
            buf += bytes_to_copy;
            available -= bytes_to_copy;
            total += bytes_to_copy; /* accumulate, do not overwrite */
            rawdata += bytes_to_copy;
            remaining -= bytes_to_copy;
            upload_ctx->line_bytes_read += bytes_to_copy;
        }
        if (remaining < 1)
        {
            upload_ctx->lines_read ++;
            upload_ctx->line_bytes_read = 0;
        }
    }
    return total;
}
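Copying raw wchar_t bytes still puts UTF-16 or UTF-32 on the wire, which most mail readers will not interpret as text. As a rough illustration of what a Unicode library does for you, here is a minimal sketch that encodes a single code point as UTF-8. It assumes each input value is already a full Unicode code point (true for a 4-byte wchar_t; with a 2-byte WCHAR, surrogate pairs would have to be combined first). The name utf8_encode is made up for this example, and the message would also need a header declaring the UTF-8 charset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes one Unicode code point as UTF-8.
   Writes up to 4 bytes into out and returns the number of bytes written,
   or 0 if cp is a surrogate or lies beyond U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* 1 byte: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {    /* surrogates are not code points */
        return 0;
    }
    if (cp < 0x10000) {                    /* 3 bytes */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {                   /* 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

ASCII letters such as e and m come out as single bytes, which is why they would survive intact in a UTF-8 message.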

Related

XXH64 function has different value in debug mode and release mode

XXH_PUBLIC_API unsigned long long XXH64(const void* input, size_t len, unsigned long long seed)
{
#if 0
/* Simple version, good for code maintenance, but unfortunately slow for small inputs */
XXH64_state_t state;
XXH64_reset(&state, seed);
XXH64_update(&state, input, len);
return XXH64_digest(&state);
#else
XXH_endianess endian_detected = (XXH_endianess)XXH_CPU_LITTLE_ENDIAN;
if (XXH_FORCE_ALIGN_CHECK) {
if ((((size_t)input) & 7) == 0) { /* Input is aligned, let's leverage the speed advantage */
if ((endian_detected == XXH_littleEndian) || XXH_FORCE_NATIVE_FORMAT)
return XXH64_endian_align(input, len, seed, XXH_littleEndian, XXH_aligned);
else
return XXH64_endian_align(input, len, seed, XXH_bigEndian, XXH_aligned);
}
}
if ((endian_detected == XXH_littleEndian) || XXH_FORCE_NATIVE_FORMAT)
return XXH64_endian_align(input, len, seed, XXH_littleEndian, XXH_unaligned);
else
return XXH64_endian_align(input, len, seed, XXH_bigEndian, XXH_unaligned);
#endif
}
This is the XXH64 hash function (http://www.opensource.org/licenses/bsd-license.php),
and when I run the code below in release mode and in debug mode:
char buf[65];
unsigned int hash2 = 0;
sprintf(buf, "%I64u", (unsigned long long)_message);
unsigned long long hash = XXH64(buf,sizeof(buf)-1,0);
hash = hash % _n;
hash2 = (unsigned int)hash;
printf("message's hash value : %u \n", hash2);
each mode gives a different hash value for the same code.
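One possible explanation, assuming buf is not initialised anywhere else: sprintf writes only the digits and a terminator, but sizeof(buf)-1 (64) bytes are hashed, so the input includes indeterminate bytes that debug and release builds typically fill differently. The sketch below hashes only the bytes that were actually written. It uses a small FNV-1a function purely as a stand-in for XXH64 so that it is self-contained; hash_decimal is a name invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in hash (64-bit FNV-1a), used here only so the example compiles
   without the xxHash sources. */
static uint64_t fnv1a64(const void *input, size_t len)
{
    const unsigned char *p = (const unsigned char *)input;
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;              /* FNV prime */
    }
    return h;
}

/* Hypothetical helper: formats value as decimal text and hashes exactly
   the characters written, never the unused tail of the buffer. */
static uint64_t hash_decimal(unsigned long long value)
{
    char buf[65];
    int written = snprintf(buf, sizeof buf, "%llu", value);
    if (written < 0 || (size_t)written >= sizeof buf) {
        return 0;                           /* formatting failed */
    }
    return fnv1a64(buf, (size_t)written);
}
```

With the real library the equivalent change would be to pass the formatted length (for example strlen(buf)) instead of sizeof(buf)-1, or to zero the buffer before formatting.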

read from socket part of data but only once

I would like to get data from a serial port on uClinux. How it works: I have a peripheral device that I want to put into bootloader mode. To do this I have to send data i2500$ and it means that it is in bootloader mode. Unfortunately I can read only >i2, and if I use my method it does not return data any more, or if I reset the device and repeat start jumping to bootloader by
int TEnforaUpdate::ReadFromComport (unsigned long timeouta, unsigned long size)
{
FD_SET(fdCom, &read_fds);
int retValue = 0;
// Set timeout to x microseconds
struct timeval timeout;
timeout.tv_sec = 0;
timeout.tv_usec = 1000 * timeouta;
// Wait for input to become ready or until the time out; the first parameter is
// 1 more than the largest file descriptor in any of the sets
if ((select (fdCom + 1, &read_fds, &write_fds, &except_fds, &timeout) == 1)
&& (FD_ISSET(fdCom,&read_fds)))
{
//read max
retValue = read (fdCom, RxBuffer, RX_BUFFER_SIZE);
printf ("Read %d bytes: ", retValue);
int i;
for (i = 0; i < retValue; i++)
printf ("[%02x]", RxBuffer[i]);
printf ("\n");
}
else
return 1;
if (retValue > 2)
{
strcpy (answer, (char*) RxBuffer);
//remove trashes from buffer
memset (RxBuffer, 0x00, RX_BUFFER_SIZE);
printf ("Comport answers: %s \n", answer);
}
FD_CLR(fdCom, &read_fds);
tcflush (fdCom, TCIFLUSH);
return retValue;
}
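A serial read() often returns only the first few bytes of a reply, and the tcflush() call above then discards whatever arrives afterwards, which may be why only part of the answer shows up. Here is a minimal sketch of reading in a loop until the reply is complete. It assumes the reply ends with a known terminator byte (the '$' in the question's example suggests one, but that is a guess); read_until and its parameters are names invented for this example.

```c
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: keeps reading from fd until `terminator` has been
   received, the buffer is full, or no data arrives within timeout_ms.
   Returns the number of bytes stored in buf, or -1 on error. */
static int read_until(int fd, unsigned char *buf, size_t cap,
                      unsigned char terminator, long timeout_ms)
{
    size_t used = 0;
    while (used < cap) {
        fd_set read_fds;
        struct timeval timeout;

        /* select() modifies both arguments, so rebuild them every pass */
        FD_ZERO(&read_fds);
        FD_SET(fd, &read_fds);
        timeout.tv_sec = timeout_ms / 1000;
        timeout.tv_usec = (timeout_ms % 1000) * 1000;

        int ready = select(fd + 1, &read_fds, NULL, NULL, &timeout);
        if (ready < 0) return -1;
        if (ready == 0) break;              /* timed out waiting for more */

        ssize_t n = read(fd, buf + used, cap - used);
        if (n < 0) return -1;
        if (n == 0) break;                  /* end of file */

        int done = memchr(buf + used, terminator, (size_t)n) != NULL;
        used += (size_t)n;
        if (done) break;                    /* complete reply received */
    }
    return (int)used;
}
```

Splitting the timeout into seconds and microseconds also keeps tv_usec below one million, which the multiplication in the original function does not guarantee for larger timeouts.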

parsing NSData object for information

I have an NSData object coming back from my server; its content varies, but it sticks to a particular structure.
I would like to know (hopefully with some example code) how to work through this object to get the data I need out of it.
The structure of the data inside the object is like this:
leading value (UInt16) - (tells me what section of the response it is)
Size of string (UInt32) or number - (UInt32)
String (not null terminated) i.e. followed by the next leading value.
I have been reading through the Binary Data Programming Guide; however, that only really shows me how to put my data into new NSData objects and how to access and compare the bytes.
The thing I am stuck on is how to grab the info dynamically: check the NSData object's first leading value, figure out whether it is a string or an int, then get the string or int and move on to the next leading value.
Any suggestions or example code would be really helpful. I am just stuck in a bit of a mind block, as I have never attempted anything like this in Objective-C.

Some of this depends on how your server is written to encode the data into what it is sending you. Assuming it is encoding the numeric values using standard network byte ordering (big-endian) you will want it converted to the correct byte-ordering for iOS (I believe that is always little-endian).
I would approach it something like this:
uint16_t typeWithNetworkOrdering, typeWithLocalOrdering;
uint32_t sizeWithNetworkOrdering, sizeWithLocalOrdering;
char *cstring = NULL;
uint32_t numberWithNetworkOrdering, numberWithLocalOrdering;
// Use a byte pointer: arithmetic on void * is not standard C
const uint8_t *bytes = (const uint8_t *)[myData bytes];
NSUInteger length = [myData length];
// Stop as soon as there is no room left for a complete header
while (length >= sizeof(uint16_t) + sizeof(uint32_t)) {
    memcpy(&typeWithNetworkOrdering, bytes, sizeof(uint16_t));
    bytes += sizeof(uint16_t);
    length -= sizeof(uint16_t);
    memcpy(&sizeWithNetworkOrdering, bytes, sizeof(uint32_t));
    bytes += sizeof(uint32_t);
    length -= sizeof(uint32_t);
    typeWithLocalOrdering = CFSwapInt16BigToHost(typeWithNetworkOrdering);
    sizeWithLocalOrdering = CFSwapInt32BigToHost(sizeWithNetworkOrdering);
    if (typeWithLocalOrdering == STRING_TYPE) { // STRING_TYPE is whatever type value corresponds to a string
        if (sizeWithLocalOrdering > length) break; // truncated data
        cstring = (char *) malloc(sizeWithLocalOrdering + 1);
        memcpy(cstring, bytes, sizeWithLocalOrdering);
        cstring[sizeWithLocalOrdering] = '\0';
        NSString *resultString = [NSString stringWithCString:cstring encoding:NSUTF8StringEncoding];
        NSLog(@"String = %@", resultString);
        free(cstring);
        bytes += sizeWithLocalOrdering;
        length -= sizeWithLocalOrdering;
        // Do whatever you need to with the string
    }
    else if (typeWithLocalOrdering == NUMBER_TYPE) { // NUMBER_TYPE is whatever type value corresponds to a number
        if (length < sizeof(uint32_t)) break; // truncated data
        memcpy(&numberWithNetworkOrdering, bytes, sizeof(uint32_t));
        numberWithLocalOrdering = CFSwapInt32BigToHost(numberWithNetworkOrdering);
        NSLog(@"Number = %u", numberWithLocalOrdering);
        bytes += sizeof(uint32_t);
        length -= sizeof(uint32_t);
        // Do whatever you need to with the number
    }
    else {
        break; // unknown type: no way to tell how many bytes to skip
    }
}
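The same walk can be sketched in plain C, decoding the big-endian fields byte by byte so that neither host byte order nor alignment matters. This is only a sketch under the same assumptions as the answer above: every record is a 2-byte type, a 4-byte size, then that many payload bytes. The type values 1 and 2 and the names read_be16, read_be32, struct record and next_record are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type codes; use whatever values the server really sends. */
enum { STRING_TYPE = 1, NUMBER_TYPE = 2 };

/* Decode big-endian integers one byte at a time. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

struct record {
    uint16_t type;           /* leading value */
    uint32_t size;           /* payload length in bytes */
    const uint8_t *payload;  /* points into the input, not null terminated */
};

/* Parses one record from the start of bytes. Returns the number of bytes
   consumed, or 0 if the input is too short to hold a complete record. */
static size_t next_record(const uint8_t *bytes, size_t length, struct record *out)
{
    const size_t header_len = 6;             /* 2-byte type + 4-byte size */
    if (length < header_len) return 0;
    out->type = read_be16(bytes);
    out->size = read_be32(bytes + 2);
    if (out->size > length - header_len) return 0;
    out->payload = bytes + header_len;
    return header_len + (size_t)out->size;
}
```

A caller would loop, advancing by the return value until it is 0, and use read_be32(rec.payload) for number records or copy rec.size bytes for string records.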

Define your own internal structs and cast the pointer to them:
NSData* data;
// packed, so the compiler inserts no padding between type and length
struct __attribute__((packed)) headerType
{
    uint16_t type;
    uint32_t length;
};
const struct headerType* header=(const struct headerType*)[data bytes]; // get the header of the response
if (header->type==1)
{
    const char* text=((const char*)header)+6; // skip the header (16bits+32bits=6 bytes offset)
}
EDIT:
If you need to read them in a loop:
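NSData* data;
const uint8_t* cursor=(const uint8_t*)[data bytes];
const uint8_t* end=cursor+[data length];
while (end-cursor>=2)
{
    uint16_t type;
    memcpy(&type, cursor, sizeof(type)); // memcpy avoids an unaligned read
    cursor+=2;
    if (type==1)
    {
        // string
        if (end-cursor<4) break;
        uint32_t length;
        memcpy(&length, cursor, sizeof(length));
        cursor+=4;
        if ((size_t)(end-cursor)<length) break; // truncated data
        const char* str=(const char*)cursor;
        cursor+=length;
    }
    else if (type==2)
    {
        // another type
    }
    else
        break;
}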

IPv6 raw socket programming with native C

I am working on IPv6 and need to craft an IPv6 packet from scratch and put it into a buffer. Unfortunately I do not have much experience with C. From a tutorial I have successfully done the same thing with IPv4 by defining
struct ipheader {
unsigned char iph_ihl:4, /* Little-endian */
iph_ver:4;
unsigned char iph_tos;
unsigned short int iph_len;
unsigned short int iph_ident;
unsigned char iph_flags;
unsigned short int iph_offset;
unsigned char iph_ttl;
unsigned char iph_protocol;
unsigned short int iph_chksum;
unsigned int iph_sourceip;
unsigned int iph_destip;
};
/* Structure of a TCP header */
struct tcpheader {
unsigned short int tcph_srcport;
unsigned short int tcph_destport;
unsigned int tcph_seqnum;
unsigned int tcph_acknum;
unsigned char tcph_reserved:4, tcph_offset:4;
// unsigned char tcph_flags;
unsigned int
tcp_res1:4, /*little-endian*/
tcph_hlen:4, /*length of tcp header in 32-bit words*/
tcph_fin:1, /*Finish flag "fin"*/
tcph_syn:1, /*Synchronize sequence numbers to start a connection*/
tcph_rst:1, /*Reset flag */
tcph_psh:1, /*Push, sends data to the application*/
tcph_ack:1, /*acknowledge*/
tcph_urg:1, /*urgent pointer*/
tcph_res2:2;
unsigned short int tcph_win;
unsigned short int tcph_chksum;
unsigned short int tcph_urgptr;
};
and fill the packet content in like this:
// IP structure
ip->iph_ihl = 5;
ip->iph_ver = 6;
ip->iph_tos = 16;
ip->iph_len = sizeof (struct ipheader) + sizeof (struct tcpheader);
ip->iph_ident = htons(54321);
ip->iph_offset = 0;
ip->iph_ttl = 64;
ip->iph_protocol = 6; // TCP
ip->iph_chksum = 0; // Done by kernel
// Source IP, modify as needed, spoofed, we accept through command line argument
ip->iph_sourceip = inet_addr("192.168.1.128");
// Destination IP, modify as needed, but here we accept through command line argument
ip->iph_destip = inet_addr("192.168.1.1");
// The TCP structure. The source port, spoofed, we accept through the command line
tcp->tcph_srcport = htons(atoi("1024"));
// The destination port, we accept through command line
tcp->tcph_destport = htons(atoi("4201"));
tcp->tcph_seqnum = htons(1);
tcp->tcph_acknum = 0;
tcp->tcph_offset = 5;
tcp->tcph_syn = 1;
tcp->tcph_ack = 0;
tcp->tcph_win = htons(32767);
tcp->tcph_chksum = 0; // Done by kernel
tcp->tcph_urgptr = 0;
// IP checksum calculation
ip->iph_chksum = csum((unsigned short *) buffer, (sizeof (struct ipheader) + sizeof (struct tcpheader)));
However, for IPv6 I have not found a similar way. What I have found is this struct from the IETF:
struct ip6_hdr {
union {
struct ip6_hdrctl {
uint32_t ip6_un1_flow; /* 4 bits version, 8 bits TC, 20 bits
flow-ID */
uint16_t ip6_un1_plen; /* payload length */
uint8_t ip6_un1_nxt; /* next header */
uint8_t ip6_un1_hlim; /* hop limit */
} ip6_un1;
uint8_t ip6_un2_vfc; /* 4 bits version, top 4 bits
tclass */
} ip6_ctlun;
struct in6_addr ip6_src; /* source address */
struct in6_addr ip6_dst; /* destination address */
};
But I do not know how to fill in the information. For example, how do I send a TCP SYN from 2001:220:806:22:aacc:ff:fe00:1 port 1024 to 2001:220:806:21::4 port 1025?
Could anybody help me, or are there any references?
Thank you very much.
This is what I have done so far; however, there are mismatches between the code and the real packet captured by Wireshark (as discussed in the comments below). I'm not sure it is possible to post long code in the comment section, so I have edited my question.
Can anyone help?
#define PCKT_LEN 2000
int main(void) {
unsigned char buffer[PCKT_LEN];
int s;
struct sockaddr_in6 din;
struct ipv6_header *ip = (struct ipv6_header *) buffer;
struct tcpheader *tcp = (struct tcpheader *) (buffer + sizeof (struct ipv6_header));
memset(buffer, 0, PCKT_LEN);
din.sin6_family = AF_INET6;
din.sin6_port = htons(0);
inet_pton(AF_INET6, "::1", &(din.sin6_addr)); // For routing
ip->version = 6;
ip->traffic_class = 0;
ip->flow_label = 0;
ip->length = 40;
ip->next_header = 6;
ip->hop_limit = 64;
inet_pton(AF_INET6, "::1", &(ip->dst)); // IPv6
inet_pton(AF_INET6, "::1", &(ip->src)); // IPv6
tcp->tcph_srcport = htons(atoi("11111"));
tcp->tcph_destport = htons(atoi("13"));
tcp->tcph_seqnum = htons(0);
tcp->tcph_acknum = 0;
tcp->tcph_offset = 5;
tcp->tcph_syn = 1;
tcp->tcph_ack = 0;
tcp->tcph_win = htons(32752);
tcp->tcph_chksum = 0; // Done by kernel
tcp->tcph_urgptr = 0;
s = socket(PF_INET6, SOCK_RAW, IPPROTO_RAW);
if (s < 0) {
perror("socket()");
return 1;
}
unsigned short int packet_len = sizeof (struct ipv6_header) + sizeof (struct tcpheader);
if (sendto(s, buffer, packet_len, 0, (struct sockaddr*) &din, sizeof (din)) == -1) {
perror("sendto()");
close(s);
return 1;
}
close(s);
return 0;
}

Maybe this article can help you get started?
Edit:
Using the Wikipedia article linked above, I made this structure (without knowing what some of the fields mean):
struct ipv6_header
{
unsigned int
version : 4,
traffic_class : 8,
flow_label : 20;
uint16_t length;
uint8_t next_header;
uint8_t hop_limit;
struct in6_addr src;
struct in6_addr dst;
};
It's no different than how the header-struct was made for IPv4 in your example. Just create a struct containing the fields, in the right order and in the right size, and fill it with the right values.
Just do the same for the TCP headers.
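The order in which a compiler lays out bit-fields is implementation-defined, so a struct like the one above can produce a first byte that Wireshark does not read as version 6. A more portable sketch is to write the 40-byte fixed header into a byte buffer directly. The function name build_ipv6_header and its parameter list are invented for this example. The payload length field counts only the bytes after the fixed header (20 for a bare TCP header) and must be in network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#define IPV6_HEADER_LEN 40

/* Hypothetical helper: fills out[] with a fixed IPv6 header.
   Returns 0 on success, -1 if either address string cannot be parsed. */
static int build_ipv6_header(uint8_t out[IPV6_HEADER_LEN],
                             uint8_t traffic_class, uint32_t flow_label,
                             uint16_t payload_len, uint8_t next_header,
                             uint8_t hop_limit,
                             const char *src, const char *dst)
{
    /* version (4 bits) | traffic class (8 bits) | flow label (20 bits) */
    uint32_t first_word = htonl((6u << 28) |
                                ((uint32_t)traffic_class << 20) |
                                (flow_label & 0xFFFFFu));
    uint16_t plen = htons(payload_len);

    memcpy(out, &first_word, 4);
    memcpy(out + 4, &plen, 2);
    out[6] = next_header;                   /* 6 = TCP */
    out[7] = hop_limit;

    /* inet_pton stores the 16 address bytes in network order */
    if (inet_pton(AF_INET6, src, out + 8) != 1) return -1;
    if (inet_pton(AF_INET6, dst, out + 24) != 1) return -1;
    return 0;
}
```

For the addresses in the question, the call would pass "2001:220:806:22:aacc:ff:fe00:1" as src and "2001:220:806:21::4" as dst; the ports belong in the TCP header that follows.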

Unfortunately the IPv6 RFCs don't provide the same raw socket interface that you get with IPv4. From what I've seen, to create IPv6 packets you have to go a level deeper and use an AF_PACKET socket to send an Ethernet frame that includes your IPv6 packet.

Transmission of float values over TCP/IP and data corruption

I have an extremely strange bug.
I have two applications that communicate over TCP/IP.
Application A is the server, and application B is the client.
Application A sends a bunch of float values to application B every 100 milliseconds.
The bug is the following: sometimes some of the float values received by application B are not the same as the values transmitted by application A.
Initially, I thought there was a problem with the Ethernet or TCP/IP drivers (some sort of data corruption). I then tested the code in other Windows machines, but the problem persisted.
I then tested the code on Linux (Ubuntu 10.04.1 LTS) and the problem is still there!!!
The values are logged just before they are sent and just after they are received.
The code is pretty straightforward: the message protocol has a 4 byte header like this:
//message header
struct MESSAGE_HEADER {
unsigned short type;
unsigned short length;
};
//orientation message
struct ORIENTATION_MESSAGE : MESSAGE_HEADER
{
float azimuth;
float elevation;
float speed_az;
float speed_elev;
};
//any message
struct MESSAGE : MESSAGE_HEADER {
char buffer[512];
};
//receive specific size of bytes from the socket
static int receive(SOCKET socket, void *buffer, size_t size) {
int r;
do {
r = recv(socket, (char *)buffer, size, 0);
if (r == 0 || r == SOCKET_ERROR) break;
buffer = (char *)buffer + r;
size -= r;
} while (size);
return r;
}
//send specific size of bytes to a socket
static int send(SOCKET socket, const void *buffer, size_t size) {
int r;
do {
r = send(socket, (const char *)buffer, size, 0);
if (r == 0 || r == SOCKET_ERROR) break;
buffer = (char *)buffer + r;
size -= r;
} while (size);
return r;
}
//get message from socket
static bool receive(SOCKET socket, MESSAGE &msg) {
int r = receive(socket, &msg, sizeof(MESSAGE_HEADER));
if (r == SOCKET_ERROR || r == 0) return false;
if (ntohs(msg.length) == 0) return true;
r = receive(socket, msg.buffer, ntohs(msg.length));
if (r == SOCKET_ERROR || r == 0) return false;
return true;
}
//send message
static bool send(SOCKET socket, const MESSAGE &msg) {
int r = send(socket, &msg, ntohs(msg.length) + sizeof(MESSAGE_HEADER));
if (r == SOCKET_ERROR || r == 0) return false;
return true;
}
When I receive the message 'orientation', sometimes the 'azimuth' value is different from the one sent by the server!
Shouldn't the data be the same all the time? Doesn't TCP/IP guarantee delivery of the data uncorrupted? Could it be that an exception in the math co-processor affects the TCP/IP stack? Is it a problem that I receive a small number of bytes first (4 bytes) and then the message body?
EDIT:
The problem is in the endianness swapping routine. The following code swaps the endianness of a specific float, then swaps it again and prints the bytes:
#include <iostream>
using namespace std;
float ntohf(float f)
{
float r;
unsigned char *s = (unsigned char *)&f;
unsigned char *d = (unsigned char *)&r;
d[0] = s[3];
d[1] = s[2];
d[2] = s[1];
d[3] = s[0];
return r;
}
int main() {
unsigned long l = 3206974079;
float f1 = (float &)l;
float f2 = ntohf(ntohf(f1));
unsigned char *c1 = (unsigned char *)&f1;
unsigned char *c2 = (unsigned char *)&f2;
printf("%02X %02X %02X %02X\n", c1[0], c1[1], c1[2], c1[3]);
printf("%02X %02X %02X %02X\n", c2[0], c2[1], c2[2], c2[3]);
getchar();
return 0;
}
The output is:
7F 8A 26 BF
7F CA 26 BF
I.e. the float assignment probably normalizes the value, producing a different value from the original.
Any input on this is welcomed.
EDIT2:
Thank you all for your replies. It seems the problem is that the swapped float, when returned via the 'return' statement, is pushed in the CPU's floating point stack. The caller then pops the value from the stack, the value is rounded, but it is the swapped float, and therefore the rounding messes up the value.
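A common way around the effect described in EDIT2 is never to hold the byte-swapped value in a float at all: copy the float's bits into a 32-bit integer, swap that, and convert back only after swapping on the receiving side. The sketch below assumes both ends use 4-byte IEEE 754 floats; float_to_net and net_to_float are names made up for this example.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "this sketch assumes 4-byte floats");

/* Hypothetical helper: returns the float's bit pattern in network byte
   order. The swapped value lives only in an integer, so the FPU never
   gets a chance to alter it. */
static uint32_t float_to_net(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return htonl(bits);
}

/* Hypothetical helper: inverse of float_to_net. */
static float net_to_float(uint32_t net)
{
    uint32_t bits = ntohl(net);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

In the message structs this would mean declaring the on-wire fields as uint32_t and converting to float only after the message has been received.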

TCP tries to deliver unaltered bytes, but unless the machines have similar CPUs and operating systems, there's no guarantee that the floating-point representation on one system is identical to that on the other. You need a mechanism for ensuring this, such as XDR or Google's protobuf.

You're sending binary data over the network, using implementation-defined padding for the struct layout, so this will only work if you're using the same hardware, OS and compiler for both application A and application B.
If that's OK, though, I can't see anything wrong with your code. One potential issue is that you're using ntohs to extract the length of the message, and that length is the total length minus the header length, so you need to make sure you are setting it properly. It needs to be done as
msg.length = htons(sizeof(ORIENTATION_MESSAGE) - sizeof(MESSAGE_HEADER));
but you don't show the code that sets up the message...
