I have this bit of Flutter code, which builds a large String. Its content is different every time, but the format stays the same since it's a template:
"William\nWilliam description here...\n$^170^ usd" + Uuid().v4()
I want to extract the 170 part and then convert it to an integer, so I can remove it from a list of ints. I have tried a lot of code, but it isn't working for two reasons: I can't extract the actual number between the ^ and ^ from the String, and I can't convert it to an integer. Here's my attempt (incomplete).
deleteSumItem(item) {
final regEx = RegExp(r'\^\d+(?:\^\d+)?'); //not sure if this is right regex for the String template
final priceValueMatch = regEx.firstMatch(item); //this doesn't return the particular number extracted
_totalPrice.remove(priceValueMatch); //I get an error here that it isn't an int
_counter = _counter - priceValueMatch; //then subtract it from the counter as an int
}
The function would take that String template ("William\nWilliam description here...\n$^170^ usd" + Uuid().v4()), where the number between the ^ ^ differs but the template stays the same, then convert the number to an integer and remove it from the list as an int.
Try the following:
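void main() {
  final regExp = RegExp(r'\^(\d+)\^');
  const input = r"William\nWilliam description here...\n$^170^ usd";
  // firstMatch returns null when nothing matches, so use ?. under null safety
  final match = regExp.firstMatch(input)?.group(1);
  if (match == null) {
    print('No number found between ^ and ^');
    return;
  }
  print(match); // 170
  final number = int.parse(match);
  print(number); // 170
}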
I have changed the RegExp so that it captures the number in its own capture group. It looks like you got a little confused while writing the RegExp, but I may also be missing some details about the problem.
I want to hash a string, but looking at the example on MSDN I got stuck on DATA_SIZE. What is this, and how do I know ahead of time what the size of the array is if the plaintext can vary in length?
I also need to return the result as a vector (the consuming method expects this).
Code from MSDN
array<Byte>^ data = gcnew array<Byte>( DATA_SIZE );
array<Byte>^ result;
SHA1^ sha = gcnew SHA1CryptoServiceProvider;
// This is one implementation of the abstract class SHA1.
result = sha->ComputeHash( data );
My method so far looks like
std::vector<byte> sha1(const std::string& plaintext)
{
//#define SHA1_BUFFER_SIZE ????
//array<System::Byte>^ data = gcnew array<System::Byte>(DATA_SIZE);
//convert plaintext string to byte array
array<System::Byte>^ result;
SHA1^ sha = gcnew SHA1CryptoServiceProvider;
result = sha->ComputeHash(data);
//return result as a vector<byte>
}
First, answering your questions:
On the referred MSDN page, it was stated before the example:
This example assumes that there is a predefined constant DATA_SIZE.
So it was assumed that a #define or an enum constant had been defined beforehand.
Of course you don't know the size of the plaintext in advance, but you'll see that with this approach you don't need the DATA_SIZE constant.
If you want your function to take a std::string as input and return a std::vector, there are several steps to be done. It would look something like this:
std::vector<unsigned char> sha1(const std::string& message)
{
// Steps #1 and #2: convert from std::string to managed string
// and convert from a string to bytes (UTF-8 as example)
array<Byte>^data = Encoding::UTF8->GetBytes(gcnew String(message.c_str() ));
// Steps #3 and #4: perform hash operation
// and store result in a byte array
SHA1^ sha = gcnew SHA1CryptoServiceProvider;
array<Byte>^ array_result = sha->ComputeHash(data);
// Step #5: convert from a managed array
// to unmanaged vector
std::vector<unsigned char> vector_return(array_result->Length);
Marshal::Copy(array_result, 0, IntPtr(&vector_return[0]), array_result->Length);
return vector_return;
}
Don't forget to declare the namespaces you're using:
using namespace System;
using namespace System::Text;
using namespace System::Security::Cryptography;
using namespace System::Runtime::InteropServices;
int main(array<System::String ^> ^args)
{
std::string str_test("This is just a test.");
std::vector<unsigned char> vec_digest( sha1(str_test) );
return 0;
}
Final thoughts:
It's a bit misleading to name the string you want to hash as plaintext. More accurate terms would be message (for the input data) and digest (for the hash value)
Avoid mixing the use of std::string and managed String^ if you can.
I am a beginner in OS development and have managed to write a bootloader and then a kernel. I successfully jumped to protected mode and transferred control to the kernel. I am able to write single characters, but printing a string is not working. This is my printString() function.
void printString(char * message[]){
int i;
for(i = 0; message[i] != '\0'; i++)
{
print(message[i]);
}
}
And my print-character function is here:
void print(char *character){
unsigned char *vidmem = (unsigned char *) VIDEO_ADDRESS;
int offset; //Variable which hold the offset where we want to print our character
offset = GetCursor(); //Setting our offset to current cursor position
vidmem[offset+1] = character;
vidmem[offset+2] = 0x0f;
SetCursor(offset+2);
}
And this is the call to the function:
printString("manoj");
Please help me; I am a beginner in OS development.
I would recommend keeping track of the X and Y coordinates as (static) globals, and using them for offsets into memory. Also, it shouldn't be offset+1 and offset+2, but rather offset and offset+1. This is in addition to what tangrs said in his answer.
A good tutorial for learning how to print to the screen can be found at http://www.jamesmolloy.co.uk/tutorial_html/3.-The%20Screen.html - he goes into great detail about how to print things. It also is a good place to start learning about OSDev, along with the OSDev forums at http://forum.osdev.org/index.php.
There are several things wrong with your functions.
Firstly, your print function takes a pointer to a character where it looks like you want the character itself.
Secondly, your printString function is really taking a pointer to pointer to char which isn't what you want if you're calling the printString function like printString("Hello World");.
Your compiler should have warned you about these.
Your code should look something like this:
void printString(char * message){
// ...
}
void print(char character){
// ...
vidmem[offset+1] = character;
// ...
}
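Putting the two answers together (pass the character by value, take a plain char pointer for the string, and write the character at offset and the attribute at offset+1), a minimal sketch could look like the following. It is not the asker's kernel code: the array fake_vidmem, the cursor variable and the constant names are assumptions made so the example can be compiled and run on an ordinary host. In a real kernel, vidmem would point at the text-mode buffer (VIDEO_ADDRESS in the question) and the cursor would come from GetCursor()/SetCursor().

```c
#include <stddef.h>
#include <stdint.h>

#define COLS 80
#define ROWS 25
#define WHITE_ON_BLACK 0x0f

/* Hypothetical stand-in for video memory so the sketch runs on a host.
   In a kernel this would be the text buffer at VIDEO_ADDRESS instead. */
static uint8_t fake_vidmem[COLS * ROWS * 2];
static uint8_t *vidmem = fake_vidmem;
static size_t cursor = 0; /* byte offset of the next character cell */

/* Takes the character itself, not a pointer to it. */
void print(char character) {
    if (cursor + 1 >= sizeof fake_vidmem) {
        return; /* screen full; scrolling is left out of this sketch */
    }
    vidmem[cursor] = (uint8_t)character; /* even byte: the character */
    vidmem[cursor + 1] = WHITE_ON_BLACK; /* odd byte: the attribute  */
    cursor += 2;                         /* advance one cell         */
}

/* Takes a pointer to the first char of a NUL-terminated string. */
void printString(const char *message) {
    for (size_t i = 0; message[i] != '\0'; i++) {
        print(message[i]);
    }
}
```

Calling printString("manoj") then fills the first five cells, each as a character byte followed by the attribute byte 0x0f.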
I have several characters that aren't recognized properly.
Characters like:
º
á
ó
(etc..)
This means that the character encoding is not UTF-8, right?
So, can you tell me what character encoding it could be, please?
We don't have nearly enough information to really answer this, but the gist of it is: you shouldn't just guess. You need to work out where the data is coming from, and find out what the encoding is. You haven't told us anything about the data source, so we're completely in the dark. You might want to try Encoding.Default if these are files saved with something like Notepad.
If you know what the characters are meant to be and how they're represented in binary, that should suggest an encoding... but again, we'd need to know more information.
Read this first: http://www.joelonsoftware.com/articles/Unicode.html
There are two encodings involved: the one that was used to encode the string and the one that is used to decode it. They must be the same to get the expected result. If they are different, some characters will be displayed incorrectly. We can try to guess if you post the actual and expected results.
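To illustrate the mismatch, here is a small sketch in C. It assumes (purely for illustration, since the question does not say) that the text was written as UTF-8 and then read as ISO-8859-1 (Latin-1). The helper name latin1_to_utf8 is made up for this example: it treats every input byte as one Latin-1 character and re-encodes it as UTF-8, which is what a program effectively does when it decodes with the wrong table.

```c
#include <stddef.h>

/* Decode `len` bytes as ISO-8859-1 and re-encode them as UTF-8.
   `out` must have room for 2 * len + 1 bytes.
   Returns the number of bytes written, not counting the final NUL. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            out[n++] = in[i]; /* ASCII maps to itself */
        } else {
            /* U+0080..U+00FF need a two-byte UTF-8 sequence */
            out[n++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[n++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

Feeding it the two UTF-8 bytes of "á" (0xC3 0xA1) produces the four bytes 0xC3 0x83 0xC2 0xA1, which display as "Ã¡". Seeing pairs like that in place of accented letters is a common sign of UTF-8 data being decoded as a single-byte encoding.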
I wrote a couple of methods to narrow down the possibilities a while back for situations just like this.
// Requires: using System.Collections.Generic; using System.Text;
static void Main(string[] args)
{
    Encoding[] matches = FindEncodingTable('Ÿ');
    Encoding[] enc2 = FindEncodingTable(159, 'Ÿ');
}
// Locates all encodings in which the byte at the given position decodes to the specified character
// "CharacterPosition": decimal position of the character in the unknown encoding table, e.g. 159 in the extended ASCII table
// "character": the character to locate in the encoding table, e.g. 'Ÿ' in the extended ASCII table
static Encoding[] FindEncodingTable(int CharacterPosition, char character)
{
    List<Encoding> matches = new List<Encoding>();
    byte myByte = (byte)CharacterPosition;
    byte[] bytes = { myByte };
    foreach (EncodingInfo encInfo in Encoding.GetEncodings())
    {
        Encoding thisEnc = Encoding.GetEncoding(encInfo.CodePage);
        char[] chars = thisEnc.GetChars(bytes);
        // No break after a match: the goal is to collect every matching encoding
        if (chars.Length > 0 && chars[0] == character)
        {
            matches.Add(thisEnc);
        }
    }
    return matches.ToArray();
}
// Locates all encodings that contain the specified character
static Encoding[] FindEncodingTable(char character)
{
    List<Encoding> matches = new List<Encoding>();
    foreach (EncodingInfo encInfo in Encoding.GetEncodings())
    {
        Encoding thisEnc = Encoding.GetEncoding(encInfo.CodePage);
        char[] chars = { character };
        byte[] temp = thisEnc.GetBytes(chars);
        // GetBytes never returns null; an unsupported character is substituted by the
        // encoder's fallback, so check that the bytes decode back to the same character
        if (thisEnc.GetString(temp) == character.ToString())
            matches.Add(thisEnc);
    }
    return matches.ToArray();
}
Encoding is a way of transforming existing content so that it can be parsed by the protocol at its destination.
An example of encoding can be seen when browsing the internet:
The URL you visit: www.example.com, may have the search facility to run custom searches via the URL address:
www.example.com?search=...
The values passed in the URL require URL encoding. If you were to write:
www.example.com?search=cat food cheap
The browser may not interpret your request correctly, because ' ' (a space) is not a valid character in a URL.
To correct this encoding error, replace each ' ' with '%20' to form this URL:
www.example.com?search=cat%20food%20cheap
Different systems use different forms of encoding. In this example I have used percent-encoding (URL encoding), which writes a byte as '%' followed by its two-digit hexadecimal value. In other applications you may need other types of encoding.
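As a rough sketch of the substitution described above, the following C function percent-encodes a query value. The name percent_encode and its buffer-based signature are made up for this example; it copies the characters that RFC 3986 lists as unreserved and writes every other byte as %XX. It assumes the default "C" locale, where isalnum() only accepts ASCII letters and digits.

```c
#include <ctype.h>
#include <stddef.h>

/* Percent-encode `in` into `out`, whose capacity `cap` includes the NUL.
   Letters, digits, '-', '.', '_' and '~' are copied unchanged;
   every other byte becomes '%' plus two uppercase hex digits.
   Returns 0 on success, -1 if `out` is too small. */
int percent_encode(const char *in, char *out, size_t cap) {
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;
    if (cap == 0) {
        return -1;
    }
    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        if (isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
            if (n + 1 >= cap) return -1; /* need room for c and the NUL */
            out[n++] = (char)c;
        } else {
            if (n + 3 >= cap) return -1; /* need room for %XX and the NUL */
            out[n++] = '%';
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 0x0F];
        }
    }
    out[n] = '\0';
    return 0;
}
```

With a large enough buffer, encoding "cat food cheap" yields "cat%20food%20cheap", matching the URL above.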
Good Luck!
I'm stuck on stoopid today as I can't convert a simple piece of ObjC code to its Cpp equivalent. I have this:
const UInt8 *myBuffer = [(NSString*)aRequest UTF8String];
And I'm trying to replace it with this:
const UInt8 *myBuffer = (const UInt8 *)CFStringGetCStringPtr(aRequest, kCFStringEncodingUTF8);
This is all in a tight unit test that writes an example HTTP request over a socket with CFNetwork APIs. I have working ObjC code that I'm trying to port to C++. I'm gradually replacing NS API calls with their toll-free bridged equivalents. Everything has been one-for-one so far until this last line. This is the last piece that needs to be completed.
This is one of those things where Cocoa does all the messy stuff behind the scenes, and you never really appreciate just how complicated things can be until you have to roll up your sleeves and do it yourself.
The simple answer for why it's not 'simple' is because NSString (and CFString) deal with all the complicated details of dealing with multiple character sets, Unicode, etc, etc, while presenting a simple, uniform API for manipulating strings. It's object oriented at its best- the details of 'how' (NS|CF)String deals with strings that have different string encodings (UTF8, MacRoman, UTF16, ISO 2022 Japanese, etc) is a private implementation detail. It all 'just works'.
It helps to understand how [@"..." UTF8String] works. This is a private implementation detail, so this isn't gospel, but based on observed behavior. When you send a string a UTF8String message, the string does something approximating (not actually tested, so consider it pseudo-code, and there are actually simpler ways to do the exact same thing, so this is overly verbose):
- (const char *)UTF8String
{
NSUInteger utf8Length = [self lengthOfBytesUsingEncoding:NSUTF8StringEncoding];
NSMutableData *utf8Data = [NSMutableData dataWithLength:utf8Length + 1UL];
char *utf8Bytes = [utf8Data mutableBytes];
[self getBytes:utf8Bytes
maxLength:utf8Length
usedLength:NULL
encoding:NSUTF8StringEncoding
options:0UL
range:NSMakeRange(0UL, [self length])
remainingRange:NULL];
return(utf8Bytes);
}
You don't have to worry about the memory management issues of dealing with the buffer that -UTF8String returns because the NSMutableData is autoreleased.
A string object is free to keep the contents of the string in whatever form it wants, so there's no guarantee that its internal representation is the one that would be most convenient for your needs (in this case, UTF8). If you're using just plain C, you're going to have to deal with managing some memory to hold any string conversions that might be required. What was once a simple -UTF8String method call is now much, much more complicated.
Most of NSString is actually implemented in/with CoreFoundation / CFString, so there's obviously a path from a CFStringRef -> -UTF8String. It's just not as neat and simple as NSString's -UTF8String. Most of the complication is with memory management. Here's how I've tackled it in the past:
void someFunction(void) {
CFStringRef cfString; // Assumes 'cfString' points to a (NS|CF)String.
const char *useUTF8StringPtr = NULL;
UInt8 *freeUTF8StringPtr = NULL;
CFIndex stringLength = CFStringGetLength(cfString), usedBytes = 0L;
if((useUTF8StringPtr = CFStringGetCStringPtr(cfString, kCFStringEncodingUTF8)) == NULL) {
if((freeUTF8StringPtr = malloc(stringLength + 1L)) != NULL) {
CFStringGetBytes(cfString, CFRangeMake(0L, stringLength), kCFStringEncodingUTF8, '?', false, freeUTF8StringPtr, stringLength, &usedBytes);
freeUTF8StringPtr[usedBytes] = 0;
useUTF8StringPtr = (const char *)freeUTF8StringPtr;
}
}
long utf8Length = (long)((freeUTF8StringPtr != NULL) ? usedBytes : stringLength);
if(useUTF8StringPtr != NULL) {
// useUTF8StringPtr points to a NULL terminated UTF8 encoded string.
// utf8Length contains the length of the UTF8 string.
// ... do something with useUTF8StringPtr ...
}
if(freeUTF8StringPtr != NULL) { free(freeUTF8StringPtr); freeUTF8StringPtr = NULL; }
}
NOTE: I haven't tested this code, but it is modified from working code. So, aside from obvious errors, I believe it should work.
The above tries to get the pointer to the buffer that CFString uses to store the contents of the string. If CFString happens to have the string contents encoded in UTF8 (or a suitably compatible encoding, such as ASCII), then it's likely CFStringGetCStringPtr() will return non-NULL. This is obviously the best, and fastest, case. If it can't get that pointer for some reason, say if CFString has its contents encoded in UTF16, then it allocates a buffer with malloc() that is large enough to contain the entire string when it is transcoded to UTF8. Then, at the end of the function, it checks to see if memory was allocated and free()'s it if necessary.
And now for a few tips and tricks... CFString 'tends to' (and this is a private implementation detail, so it can and does change between releases) keep 'simple' strings encoded as MacRoman, which is an 8-bit wide encoding. MacRoman, like UTF8, is a superset of ASCII, such that all characters < 128 are equivalent to their ASCII counterparts (or, in other words, any character < 128 is ASCII). In MacRoman, characters >= 128 are 'special' characters. They all have Unicode equivalents, and tend to be things like extra currency symbols and 'extended western' characters. See Wikipedia - MacRoman for more info. But just because a CFString says it's MacRoman (CFString encoding value of kCFStringEncodingMacRoman, NSString encoding value of NSMacOSRomanStringEncoding) doesn't mean that it has characters >= 128 in it. If a kCFStringEncodingMacRoman encoded string returned by CFStringGetCStringPtr() is composed entirely of characters < 128, then it is exactly equivalent to its ASCII (kCFStringEncodingASCII) encoded representation, which is also exactly equivalent to the string's UTF8 (kCFStringEncodingUTF8) encoded representation.
Depending on your requirements, you may be able to 'get by' using kCFStringEncodingMacRoman instead of kCFStringEncodingUTF8 when calling CFStringGetCStringPtr(). Things 'may' (probably) be faster if you require strict UTF8 encoding for your strings but use kCFStringEncodingMacRoman, then check to make sure the string returned by CFStringGetCStringPtr(string, kCFStringEncodingMacRoman) only contains characters that are < 128. If there are characters >= 128 in the string, then go the slow route by malloc()ing a buffer to hold the converted results. Example:
CFIndex stringLength = CFStringGetLength(cfString), usedBytes = 0L;
// Ask for the MacRoman buffer; it is only usable as UTF8 if every byte is < 128.
useUTF8StringPtr = CFStringGetCStringPtr(cfString, kCFStringEncodingMacRoman);
for(CFIndex idx = 0L; (useUTF8StringPtr != NULL) && (useUTF8StringPtr[idx] != 0); idx++) {
  // Cast to unsigned char: a plain char may be signed and then never compares >= 128.
  if((unsigned char)useUTF8StringPtr[idx] >= 128) { useUTF8StringPtr = NULL; }
}
if((useUTF8StringPtr == NULL) && ((freeUTF8StringPtr = malloc(stringLength + 1L)) != NULL)) {
  CFStringGetBytes(cfString, CFRangeMake(0L, stringLength), kCFStringEncodingUTF8, '?', false, freeUTF8StringPtr, stringLength, &usedBytes);
  freeUTF8StringPtr[usedBytes] = 0;
  useUTF8StringPtr = (const char *)freeUTF8StringPtr;
}
Like I said, you don't really appreciate just how much work Cocoa does for you automatically until you have to do it all yourself. :)
In the sample code above, the following appears:
CFIndex stringLength = CFStringGetLength(cfString)
stringLength is then being used to malloc() a temporary buffer of that many bytes, plus 1.
But the header file for CFStringGetLength() expressly says it returns the number of 16-bit Unicode characters, not bytes. So if some of those Unicode characters are outside the ASCII range, the malloc() buffer won't be long enough to hold the UTF-8 conversion of the string.
Perhaps I'm missing something, but to be absolutely safe, the number of bytes needed to hold N arbitrary Unicode characters is at most 4*N when they're all converted to UTF-8.
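To make the size mismatch concrete, here is a small sketch in plain C (no CoreFoundation) that counts how many UTF-8 bytes a run of 16-bit UTF-16 code units needs. The function name utf8_length_of_utf16 is made up for this example, and counting an unpaired surrogate as 3 bytes (as if it were replaced by U+FFFD) is an assumption of the sketch, not documented CFString behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of UTF-8 bytes needed to encode `n` UTF-16 code units.
   Assumption: an unpaired surrogate counts as 3 bytes. */
size_t utf8_length_of_utf16(const uint16_t *s, size_t n) {
    size_t bytes = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t u = s[i];
        if (u < 0x80) {
            bytes += 1; /* ASCII */
        } else if (u < 0x800) {
            bytes += 2; /* e.g. U+00E9 */
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < n &&
                   s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            bytes += 4; /* surrogate pair: two units, four bytes */
            i++;
        } else {
            bytes += 3; /* rest of the Basic Multilingual Plane */
        }
    }
    return bytes;
}
```

A single unit such as U+20AC (the euro sign) already needs 3 bytes, so a buffer of length + 1 bytes is too small for non-ASCII text. Since no single 16-bit unit needs more than 3 bytes in this count, 4*N plus one byte for the terminator is a safe upper bound.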
From the documentation:
Whether or not this function returns a valid pointer or NULL depends on many factors, all of which depend on how the string was created and its properties. In addition, the function result might change between different releases and on different platforms. So do not count on receiving a non-NULL result from this function under any circumstances.
You should use CFStringGetCString if CFStringGetCStringPtr returns NULL.
Here's some working code. I started with @johne's answer, replaced CFStringGetBytes with CFStringGetCString for simplicity, and made the correction suggested by @Doug.
const char *useUTF8StringPtr = NULL;
char *freeUTF8StringPtr = NULL;
if ((useUTF8StringPtr = CFStringGetCStringPtr(cfString, kCFStringEncodingUTF8)) == NULL)
{
CFIndex stringLength = CFStringGetLength(cfString);
CFIndex maxBytes = 4 * stringLength + 1;
freeUTF8StringPtr = malloc(maxBytes);
CFStringGetCString(cfString, freeUTF8StringPtr, maxBytes, kCFStringEncodingUTF8);
useUTF8StringPtr = freeUTF8StringPtr;
}
// ... do something with useUTF8StringPtr...
if (freeUTF8StringPtr != NULL)
free(freeUTF8StringPtr);
If it's destined for a socket, perhaps CFStringGetBytes() would be your best choice?
Also note that the documentation for CFStringGetCStringPtr() says:
This function either returns the requested pointer immediately, with no memory allocations and no copying, in constant time, or returns NULL. If the latter is the result, call an alternative function such as the CFStringGetCString function to extract the characters.
Here's a way to printf a CFStringRef which implies we get a '\0'-terminated string from a CFStringRef:
// from: http://lists.apple.com/archives/carbon-development/2001/Aug/msg01367.html
// by Ali Ozer
// gcc -Wall -O3 -x objective-c -fobjc-exceptions -framework Foundation test.c
#import <stdio.h>
#import <Foundation/Foundation.h>
/*
This function will print the provided arguments (printf style varargs) out to the console.
Note that the CFString formatting function accepts "%@" as a way to display CF types.
For types other than CFString and CFNumber, the result of %@ is mostly for debugging
and can differ between releases and different platforms. Cocoa apps (or any app which
links with the Foundation framework) can use NSLog() to get this functionality.
*/
void show(CFStringRef formatString, ...) {
CFStringRef resultString;
CFDataRef data;
va_list argList;
va_start(argList, formatString);
resultString = CFStringCreateWithFormatAndArguments(NULL, NULL, formatString, argList);
va_end(argList);
data = CFStringCreateExternalRepresentation(NULL, resultString,
CFStringGetSystemEncoding(), '?');
if (data != NULL) {
printf ("%.*s\n", (int)CFDataGetLength(data), CFDataGetBytePtr(data));
CFRelease(data);
}
CFRelease(resultString);
}
int main(void)
{
// To use:
int age = 25;
CFStringRef name = CFSTR("myname");
show(CFSTR("Name is %@, age is %d"), name, age);
return 0;
}