Dart function to convert full width ASCII characters into normal ASCII characters


I want to convert full-width ASCII characters to ordinary ASCII characters. Is there a built-in method or package that does this?

I'm not sure whether such a method exists.
If not, you could write your own String extension.
Extension
extension StringX on String {
  // Full-width forms U+FF01..U+FF5E and their ASCII counterparts U+0021..U+007E.
  static const fullWidthRegExp = r'([\uff01-\uff5e])';
  static const halfWidthRegExp = r'([\u0021-\u007e])';
  // Constant code-unit offset between the two ranges.
  static const halfFullWidthDelta = 0xfee0;

  // Shifts every character matched by the pattern by delta code units.
  String _convertWidth(String regExpPattern, int delta) {
    return replaceAllMapped(
      RegExp(regExpPattern),
      (m) => String.fromCharCode(m[1]!.codeUnits[0] + delta),
    );
  }

  String toFullWidth() => _convertWidth(halfWidthRegExp, halfFullWidthDelta);
  String toHalfWidth() => _convertWidth(fullWidthRegExp, -halfFullWidthDelta);
}
Usage
Usage with the range of characters I considered in the String Extension.
void main() {
  final myStr = '！＂＃＄％＆＇（）＊＋，－．／０１２３４５６７８９：；＜＝＞？＠ＡＢＣＤＥＦＧＨＩＪＫＬＭＮＯＰＱＲＳＴＵＶＷＸＹＺ［＼］＾＿｀ａｂｃｄｅｆｇｈｉｊｋｌｍｎｏｐｑｒｓｔｕｖｗｘｙｚ｛｜｝～';
  print(myStr);
  print(myStr.toHalfWidth());
  print(myStr.toHalfWidth().toFullWidth());
}
！＂＃＄％＆＇（）＊＋，－．／０１２３４５６７８９：；＜＝＞？＠ＡＢＣＤＥＦＧＨＩＪＫＬＭＮＯＰＱＲＳＴＵＶＷＸＹＺ［＼］＾＿｀ａｂｃｄｅｆｇｈｉｊｋｌｍｎｏｐｑｒｓｔｕｖｗｘｙｚ｛｜｝～
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
！＂＃＄％＆＇（）＊＋，－．／０１２３４５６７８９：；＜＝＞？＠ＡＢＣＤＥＦＧＨＩＪＫＬＭＮＯＰＱＲＳＴＵＶＷＸＹＺ［＼］＾＿｀ａｂｃｄｅｆｇｈｉｊｋｌｍｎｏｐｑｒｓｔｕｖｗｘｙｚ｛｜｝～
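For readers working outside Dart, the same offset arithmetic can be sketched in Java. This is only an illustration of the idea used by the extension above: the class name WidthConverter and its method names are made up for this example, and like the extension it covers only U+FF01..U+FF5E (the ideographic space U+3000 is not handled).

```java
final class WidthConverter {
    // Offset between a full-width form (U+FF01..U+FF5E) and its
    // ASCII counterpart (U+0021..U+007E).
    private static final int DELTA = 0xFEE0;

    // Shifts every char inside [low, high] by delta and copies the rest unchanged.
    private static String shift(String input, int low, int high, int delta) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            out.append(c >= low && c <= high ? (char) (c + delta) : c);
        }
        return out.toString();
    }

    static String toHalfWidth(String input) {
        return shift(input, 0xFF01, 0xFF5E, -DELTA);
    }

    static String toFullWidth(String input) {
        return shift(input, 0x0021, 0x007E, DELTA);
    }

    public static void main(String[] args) {
        // The escapes spell "Hello!" in full-width forms.
        String fullWidth = "\uFF28\uFF45\uFF4C\uFF4C\uFF4F\uFF01";
        System.out.println(toHalfWidth(fullWidth)); // prints Hello!
    }
}
```

A plain loop is used instead of a regular expression because the mapping is a fixed shift within one contiguous range.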

Related

I am new to programming and studying operator overloading: overloading "+" to add two strings

I am new to programming and am studying operator overloading. I want to overload "+" to add two strings. But when I try to combine the two strings using strcpy, the second string replaces the first one instead of being appended to it.
#include<string.h>
#include <iostream>
#include<conio.h>
using namespace std;
class String
{
char str[100];
public:
void operator +(String);
String()
{
strcpy(str,"");
}
String(char a[100])
{
strcpy(str,a);
}
};
void String::operator+ (String str1)
{ char temp[100];
strcpy(temp,str);
strcpy(temp,str1.str);
cout<<temp;
}
int main()
{
String s1=String("Hello");;
String s2=String("World");
s1+s2;
return 0;
}

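The error in your code is that the second strcpy call overwrites the contents of temp instead of appending to them.
In the operator overloading function you should use strcat (string concatenation) for the second call.
For more info, check out: String concatenation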

I think you forgot to assign the result of adding the two strings to a new string, like this (operator+ would then have to return a String instead of void):
String nString = s1 + s2;

using a method to resolve a chess tile colour

I have been looking for a while, but people seem to be way ahead of me on the chess front. All I want to do is have a method in a class that resolves the colour of a tile, but my colour keeps coming up as "null".
import java.util.Scanner;
public class ChessTileTest {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
String tileColour;
chessTile test = new chessTile();
System.out.print(" Enter chess move : ");
String move = in.next();
tileColour = test.setColour(move);
System.out.println(tileColour);
}
}
public class chessTile {
private String colour;
private String address;
public chessTile(){
}
public String setColour(String move){
char letter;
int number;
letter = move.charAt(0);
number = move.charAt(1);
if((letter=='a'||letter=='c'||letter=='e'||letter=='g')&&(number/2==1)){
colour = "Black";
}
else if((letter=='a'||letter=='c'||letter=='e'||letter=='g')&&(number/2==0)){
colour = "white";
}
else if((letter=='b'||letter=='d'||letter=='f'||letter=='h')&&(number/2==1)){
colour = "white";
}
else if((letter=='b'||letter=='d'||letter=='f'||letter=='h')&&(number/2==0)){
colour = "Black";
}
return colour;
}
}

The first lines of setColour(...) are
char letter = move.charAt(0); // gets the ASCII character at index 0
int number = move.charAt(1); // gets the int value of the ASCII character at index 1
So, for example, if your string is "a1", then letter = 'a' but number = 49, because the integer ASCII value of the character '1' is 49. See this ASCII chart for more: http://www.asciitable.com/index/asciifull.gif
You will need to convert the character into a proper int. You can do that with the following:
int number = Character.getNumericValue( move.charAt(1) );
Because number holds a value such as 49 rather than 1, none of the if-statements is satisfied, colour is never assigned, and null is returned. Note also that number/2 is an integer division, not a parity test; number % 2 is what checks for odd or even.
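Putting the conversion and a parity test together, a minimal sketch might look like the following. The class name TileColour and the method colourOf are invented for this example, and it assumes the usual board layout in which a1 is a dark square; it is not the asker's class, only one way to apply the fix described above.

```java
final class TileColour {
    // Returns "Black" or "White" for a square such as "a1" or "h8".
    static String colourOf(String move) {
        if (move == null || move.length() < 2) {
            throw new IllegalArgumentException("Expected a square like a1");
        }
        char file = Character.toLowerCase(move.charAt(0));
        // Convert the digit character to its numeric value ('1' -> 1, not 49).
        int rank = Character.getNumericValue(move.charAt(1));
        if (file < 'a' || file > 'h' || rank < 1 || rank > 8) {
            throw new IllegalArgumentException("Not a board square: " + move);
        }
        int fileIndex = file - 'a' + 1; // a -> 1, b -> 2, ..., h -> 8
        // Assumption: a1 is dark, so squares with an even file + rank sum are dark.
        return (fileIndex + rank) % 2 == 0 ? "Black" : "White";
    }

    public static void main(String[] args) {
        System.out.println(colourOf("a1")); // prints Black
        System.out.println(colourOf("h1")); // prints White
    }
}
```

Rejecting invalid input with an exception avoids the silent null result seen in the question.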

ITextSharp / PDFBox text extract fails for certain pdfs

The code below extracts the text from a PDF correctly via ITextSharp in many instances.
using (var pdfReader = new PdfReader(filename))
{
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
var currentText = PdfTextExtractor.GetTextFromPage(
pdfReader,
1,
strategy);
currentText =
Encoding.UTF8.GetString(Encoding.Convert(
Encoding.Default,
Encoding.UTF8,
Encoding.Default.GetBytes(currentText)));
Console.WriteLine(currentText);
}
However, in the case of this PDF I get the following instead of text: "\u0001\u0002\u0003\u0004\u0005\u0006\a\b\t\a\u0001\u0002\u0003\u0004\u0005\u0006\u0003"
I have tried different encodings and even PDFBox but still failed to decode the PDF correctly. Any ideas on how to solve the issue?

Extracting the text nonetheless
@Bruno's answer is the answer one should give here: the PDF clearly does not provide the information required to allow proper text extraction according to section 9.10, Extraction of Text Content, of the PDF specification ISO 32000-1...
But there actually is a slightly evil way to extract the text from the PDF at hand nonetheless!
Wrapping one's text extraction strategy in an instance of the following class, the garbled text is replaced by the correct text:
public class RemappingExtractionFilter : ITextExtractionStrategy
{
ITextExtractionStrategy strategy;
System.Reflection.FieldInfo stringField;
public RemappingExtractionFilter(ITextExtractionStrategy strategy)
{
this.strategy = strategy;
this.stringField = typeof(TextRenderInfo).GetField("text", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
}
public void RenderText(TextRenderInfo renderInfo)
{
DocumentFont font =renderInfo.GetFont();
PdfDictionary dict = font.FontDictionary;
PdfDictionary encoding = dict.GetAsDict(PdfName.ENCODING);
PdfArray diffs = encoding.GetAsArray(PdfName.DIFFERENCES);
;
StringBuilder builder = new StringBuilder();
foreach (byte b in renderInfo.PdfString.GetBytes())
{
PdfName name = diffs.GetAsName((char)b);
String s = name.ToString().Substring(2);
int i = Convert.ToInt32(s, 16);
builder.Append((char)i);
}
stringField.SetValue(renderInfo, builder.ToString());
strategy.RenderText(renderInfo);
}
public void BeginTextBlock()
{
strategy.BeginTextBlock();
}
public void EndTextBlock()
{
strategy.EndTextBlock();
}
public void RenderImage(ImageRenderInfo renderInfo)
{
strategy.RenderImage(renderInfo);
}
public String GetResultantText()
{
return strategy.GetResultantText();
}
}
It can be used like this:
ITextExtractionStrategy strategy = new RemappingExtractionFilter(new LocationTextExtractionStrategy());
string text = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
Beware, I had to use System.Reflection to access private members. Some environments may forbid this.
The same in Java
I initially coded this in Java for iText because that's my primary development environment. Thus, here is the initial Java version:
public class RemappingExtractionFilter implements TextExtractionStrategy
{
public RemappingExtractionFilter(TextExtractionStrategy strategy) throws NoSuchFieldException, SecurityException
{
this.strategy = strategy;
this.stringField = TextRenderInfo.class.getDeclaredField("text");
this.stringField.setAccessible(true);
}
@Override
public void renderText(TextRenderInfo renderInfo)
{
DocumentFont font =renderInfo.getFont();
PdfDictionary dict = font.getFontDictionary();
PdfDictionary encoding = dict.getAsDict(PdfName.ENCODING);
PdfArray diffs = encoding.getAsArray(PdfName.DIFFERENCES);
;
StringBuilder builder = new StringBuilder();
for (byte b : renderInfo.getPdfString().getBytes())
{
PdfName name = diffs.getAsName((char)b);
String s = name.toString().substring(2);
int i = Integer.parseUnsignedInt(s, 16);
builder.append((char)i);
}
try
{
stringField.set(renderInfo, builder.toString());
}
catch (IllegalArgumentException | IllegalAccessException e)
{
e.printStackTrace();
}
strategy.renderText(renderInfo);
}
@Override
public void beginTextBlock()
{
strategy.beginTextBlock();
}
@Override
public void endTextBlock()
{
strategy.endTextBlock();
}
@Override
public void renderImage(ImageRenderInfo renderInfo)
{
strategy.renderImage(renderInfo);
}
@Override
public String getResultantText()
{
return strategy.getResultantText();
}
final TextExtractionStrategy strategy;
final Field stringField;
}
(RemappingExtractionFilter.java)
It can be used like this:
String extractRemapped(PdfReader reader, int pageNo) throws IOException, NoSuchFieldException, SecurityException
{
TextExtractionStrategy strategy = new RemappingExtractionFilter(new LocationTextExtractionStrategy());
return PdfTextExtractor.getTextFromPage(reader, pageNo, strategy);
}
(from RemappedExtraction.java)
Why does this work?
First of all, this is not the solution to all extraction problems, merely for extracting text from PDFs like the OP has presented.
This method works because the names the PDF uses in its fonts' encoding differences arrays can be interpreted even though they are not standard. These names are built as /Gxx where xx is the hexadecimal representation of the ASCII code of the character this name represents.
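The name decoding on its own can be sketched without any iText dependency. The class GlyphNameDecoder and its methods are hypothetical helpers written for this illustration; they assume names of exactly the /Gxx form described above and would not make sense for fonts that use other naming schemes.

```java
final class GlyphNameDecoder {
    // Decodes a name of the form "/Gxx", where xx is the hexadecimal
    // character code, into the character it stands for.
    static char decode(String glyphName) {
        if (glyphName == null || glyphName.length() < 3 || !glyphName.startsWith("/G")) {
            throw new IllegalArgumentException("Not a /Gxx name: " + glyphName);
        }
        // Skip the leading "/G" and parse the rest as a hexadecimal number.
        return (char) Integer.parseInt(glyphName.substring(2), 16);
    }

    // Decodes a sequence of names into a string.
    static String decodeAll(String... glyphNames) {
        StringBuilder builder = new StringBuilder();
        for (String name : glyphNames) {
            builder.append(decode(name));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // prints ABSTRACT
        System.out.println(decodeAll(
            "/G41", "/G42", "/G53", "/G54", "/G52", "/G41", "/G43", "/G54"));
    }
}
```

This mirrors the substring(2) and base-16 parsing steps in the filter's renderText method; the lookup of each name in the font's differences array is left out.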

A good test to find out whether a PDF allows text to be extracted correctly is to open it in Adobe Reader and copy and paste the text.
For instance: I copied the word ABSTRACT and I pasted it in Notepad++:
Do you see the word ABSTRACT in Notepad++? No, you see %&SOH'"%GS. The A is represented as %, the B is represented as &, and so on.
This is a clear indication that the content of the PDF isn't accessible: there is no mapping between the encoding that was used (% = A, & = B,...) and the actual characters that humans can understand.
In short: the PDF doesn't allow you to extract text, not with iText, not with iTextSharp, not with PDFBox. You'll have to find an OCR tool instead and OCR the complete document.
For more info, you may want to watch the following videos:
https://www.youtube.com/watch?v=4ur9WRWVrbM (~5 minutes)
https://www.youtube.com/watch?v=wxGEEv7ibHE (~15 minutes)
https://www.youtube.com/watch?v=g-QcU9B4qMc (~45 minutes)

tostring method giving wrong output

Hi I'm writing a test program to reverse a string. When I convert the character array to a string using the toString() method, I get the wrong output. When I try to print the array manually using a for loop without converting it to a string the answer is correct. The code I've written is shown below:
import java.util.*;
public class stringManip {
/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
String str = "This is a string";
System.out.println("String=" +str);
//reverse(s);
char[] c = str.toCharArray();
int left = 0;
int right = str.length() - 1;
for (int i = 0; i < (str.length())/2; i++)
{
char temp = c[left];
c[left++] = c[right];
c[right--] = temp;
}
System.out.print("Reverse="+c.toString());
}
}
I should get the reverse of the string I entered; instead, the output I am getting is:
String=This is a string
Reverse=[C@45a1472d
Is there something I am doing wrong when using the toString() method? Any help is appreciated. Thank you.

Arrays don't override the toString() method. What you're seeing is thus the output of the default Object.toString() implementation, which contains the type of the object ([C means array of chars) followed by its hashCode.
To construct a String from a char array, use
new String(c)
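A short sketch of the difference, using the reversal from the question. The class name CharArrayToString is made up for this example; new String(char[]), String.valueOf(char[]) and Arrays.toString(char[]) are standard JDK calls.

```java
import java.util.Arrays;

final class CharArrayToString {
    // Reverses a string by swapping characters from both ends of a char array.
    static String reverse(String s) {
        char[] c = s.toCharArray();
        for (int left = 0, right = c.length - 1; left < right; left++, right--) {
            char temp = c[left];
            c[left] = c[right];
            c[right] = temp;
        }
        // Build the String from the array contents instead of calling c.toString().
        return new String(c);
    }

    public static void main(String[] args) {
        System.out.println("Reverse=" + reverse("This is a string")); // Reverse=gnirts a si sihT

        char[] c = {'a', 'b', 'c'};
        System.out.println(String.valueOf(c));  // abc
        System.out.println(Arrays.toString(c)); // [a, b, c]
        System.out.println(c.toString());       // type and hash code, e.g. [C@ followed by hex digits
    }
}
```

String.valueOf(c) is equivalent to new String(c) here, while Arrays.toString(c) is meant for debugging output rather than for rebuilding the text.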

XText: first and last character truncated in custom STRING terminals

I have redefined the STRING terminal this way
terminal STRING : ('.'|'+'|'('|')'|'a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
because I have to recognize STRINGs that are not delimited by " or '.
The problem is that, although the generated parser works, it truncates the first and the last character of the recognized string. What am I missing?

If you customize the STRING rule, you'll have to adapt the respective value converter, too.
Something like this has to be bound in your runtime module:
public class MyStringValueConverter extends STRINGValueConverter {
@Override
protected String toEscapedString(String value) {
return value;
}
public String toValue(String string, INode node) {
if (string == null)
return null;
return string;
}
}
See the docs for details.
