ASP ANSI Conversion for Unicode Based Text File - unicode

I have these ASP script for cache
fn = "caches/"&md5(url)&".html"
// Grab HTML file from site
set oXMLHTTP = CreateObject("MSXML2.ServerXMLHTTP.3.0")
oXMLHTTP.Open "GET", url, false
oXMLHTTP.Send
// Save it
set fs = Server.CreateObject("Scripting.FileSystemObject")
set file = fs.CreateTextfile(fn,false,true)
file.Write oXMLHTTP.responseText
file.close
// Open it and print to screen
set file = fs.OpenTextFile(fn,1)
response.write file.ReadAll
file.Close
response.end
Saved files are "Unicode BOM" encoded, and this causes char problems. When I am convert encoding to "ANSI", everything becomes look normal as expected.
How to covert "oXMLHTTP.responseText" to "ANSI" programaticly?

Best practice is saving binary instead of text always. You can access response binary by using oXMLHTTP.responseBody instead.
However, if you are sure that the response text is compatible with your locale, it is safe to save the response text in ANSI with no problem.
To do this, you need to pass False for third parameter of CreateTextFile method.
set file = fs.CreateTextfile(fn, false, false)
But I strongly recommend you to save the binary like the following:
Dim Stm
Set Stm = Server.CreateObject("Adodb.Stream")
Stm.Type = 1 'adTypeBinary, to set binary stream
Stm.Open
Stm.Write oXMLHTTP.responseBody ' bytes of http response
Stm.SaveToFile fn 'response saved as is
Stm.Close
Set Stm = Nothing

Related

Java iText - special characters not being displayed

I'm trying for quite some time now to generate a PDF in Java using itextpdf (com.itextpdf kernel,layout,form,pdfa) with text containing special characters (äöüß). I tried several things in different variations, like loading a TTF file and setting the encoding:
FontProgram fontProgram = FontProgramFactory.createFont( "font/FreeSans.ttf") ;
PdfFont font = PdfFontFactory.createFont( fontProgram, "UTF-8" ) ;
document.setFont( font );
This way it just doesn't display special characters at all.
This doesn't work either:
var font = PdfFontFactory.createFont(StandardFonts.HELVETICA, PdfEncodings.UTF8);
document.setFont( font );
I haven't found any solution to this and the official tutorials don't seem to have a solution.
Other encodings just render placeholder characters.
this is how I add the text:
PdfWriter writer = new PdfWriter(filename);
PdfDocument pdf = new PdfDocument(writer);
Document document = new Document(pdf);
Paragraph p = new Paragraph("äüöß");
document.add(p);
document.close();
edit: I just realized that it works when I load the text from elsewhere like an input field, instead of passing a normal string. How can I make this work with hardcoded strings?
I tried re-encoding the string as described here: https://www.baeldung.com/java-string-encode-utf-8
but none of these methods work either. It always shows wrong characters.
PdfFont freeUnicode = PdfFontFactory.createFont("font/FreeSans.ttf", PdfEncodings.IDENTITY_H);
String rawString = "äöüß1234'";
byte[] bytes = StringUtils.getBytesUtf8(rawString);
String utf8EncodedString = StringUtils.newStringUtf8(bytes);
document.add(new Paragraph().setFont(freeUnicode)
.add(utf8EncodedString));
edit: The encoding in the source code editor is UTF-8 and I passed UTF-8 to the createFont() method, but that didn't work. When I pass CP1252 and change the source code encoding to ISO-8859-1, it shows the correct characters. Really strange how I couldn't find much information about this problem.

Emoji and accent encoding in dart/flutter

I get the next String from my api
"à é í ó ú ü ñ \uD83D\uDE00\uD83D\uDE03\uD83D\uDE04\uD83D\uDE01\uD83D\uDE06\uD83D\uDE05"
from a response in a json format
{
'apiText': "à é í ó ú ü ñ \uD83D\uDE00\uD83D\uDE03\uD83D\uDE04\uD83D\uDE01\uD83D\uDE06\uD83D\uDE05",
'otherInfo': 'etc.',
.
.
.
}
it contains accents à é í ó ú ü ñ that are not correctly encoded and it contains emojis \uD83D\uDE00\uD83D\uDE03\uD83D\uDE04\uD83D\uDE01\uD83D\uDE06\uD83D\uDE05
so far i have tried
var json = jsonDecode(response.body)
String apiText = json['apiText'];
List<int> bytes = apiText.codeUnits;
comentario = utf8.decode(bytes);
but produces a
[ERROR:flutter/lib/ui/ui_dart_state.cc(166)] Unhandled Exception: FormatException: Invalid UTF-8 byte (at offset 21)
how can i get the correct text with accents and emoji?
Based on the fact you called response.body I assumes you are using the http package which does have the body property on Response objects.
You should note the following detail in the documentation:
This is converted from bodyBytes using the charset parameter of the Content-Type header field, if available. If it's unavailable or if the encoding name is unknown, latin1 is used by default, as per RFC 2616.
Well, it seems rather likely that it cannot figure out the charset and therefore defaults to latin1 which explains how your response got messed up.
A solution for this is to use the resonse.bodyBytes instead which contains the raw bytes from the response. You can then manually parse this with e.g. utf8.decode(resonse.bodyBytes) if you are sure the response should be parsed as UTF-8.
rewrite your function as
String utf8convert(String text) {
var bytes = text.codeUnits;
String decodedCode = utf8.decode(bytes, allowMalformed: true);
if (decodedCode.contains("�")) {
return text;
}
return decodedCode;
}

An error in the encoding of the Content-Disposition header (contains the Cyrillic alphabet) UWP

When I get the file name from the link "Ð\u0097акаÑ\u0082.jpg" instead "Закат.jpg".
Determined that the file name comes in the encoding "windows-1251". When trying to convert to UTF8, UTF16. The file name continues to be displayed incorrectly. Has anyone come across something like this?
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
Encoding wind1252 = Encoding.GetEncoding("windows-1252");
Encoding utf8 = Encoding.UTF8;
byte[] wind1252Bytes = wind1252.GetBytes(FileName);
byte[] utf8Bytes = Encoding.Convert(wind1252, utf8, wind1252Bytes);
string utf8String = Encoding.UTF8.GetString(utf8Bytes);

BrowserWindow.Launch NOT decode string

In a codedui, this line decodes the link before opening the browser:
myBrowser = BrowserWindow.Launch(New System.Uri("http://blablabla/main.aspx?sn=JFA%3d%3d"))
I want it to remain encoded in the url. In other words, when the browser opens, the %3d are already converted to "=" in the string.
http://mylinkhere/main.aspx?sn=JFA==
I want them to remain as %3d.
Thanks
Try this bro, this is in c# btw.
String URL = HttpUtility.UrlEncode("sn=JFA%3d%3d");
window = BrowserWindow.Launch(new Uri("http://blablabla/main.aspx?" + URL));

Custom clipboard data format accross RDC (.NET)

I'm trying to copy a custom object from a RDC window into host (my local) machine. It fails.
Here's the code that i'm using to 1) copy and 2) paste:
1) Remote (client running on Windows XP accessed via RDC):
//copy entry
IDataObject ido = new DataObject();
XmlSerializer x = new XmlSerializer(typeof(EntryForClipboard));
StringWriter sw = new StringWriter();
x.Serialize(sw, new EntryForClipboard(entry));
ido.SetData(typeof(EntryForClipboard).FullName, sw.ToString());
Clipboard.SetDataObject(ido, true);
2) Local (client running on local Windows XP x64 workstation):
//paste entry
IDataObject ido = Clipboard.GetDataObject();
DataFormats.Format cdf = DataFormats.GetFormat(typeof(EntryForClipboard).FullName);
if (ido.GetDataPresent(cdf.Name)) //<- this always returns false
{
//can never get here!
XmlSerializer x = new XmlSerializer(typeof(EntryForClipboard));
string xml = (string)ido.GetData(cdf.Name);
StringReader sr = new StringReader(xml);
EntryForClipboard data = (EntryForClipboard)x.Deserialize(sr);
}
It works perfectly on the same machine though.
Any hints?
There are a couple of things you could look into:
Are you sure the serialization of the object truely converts it into XML? Perhaps the outputted XML have references to your memory space? Try looking at the text of the XML to see.
If you really have a serialized XML version of the object, why not store the value as plain-vanilla text and not using typeof(EntryForClipboard) ? Something like:
XmlSerializer x = new XmlSerializer(typeof(EntryForClipboard));
StringWriter sw = new StringWriter();
x.Serialize(sw, new EntryForClipboard(entry));
Clipboard.SetText(sw.ToString(), TextDataFormat.UnicodeText);
And then, all you'd have to do in the client-program is check if the text in the clipboard can be de-serialized back into your object.
Ok, found what the issue was.
Custom format names get truncated to 16 characters when copying over RDC using custom format.
In the line
ido.SetData(typeof(EntryForClipboard).FullName, sw.ToString());
the format name was quite long.
When i was receiving the copied data on the host machine the formats available had my custom format, but truncated to 16 characters.
IDataObject ido = Clipboard.GetDataObject();
ido.GetFormats(); //used to see available formats.
So i just used a shorter format name:
//to copy
ido.SetData("MyFormat", sw.ToString());
...
//to paste
DataFormats.Format cdf = DataFormats.GetFormat("MyFormat");
if (ido.GetDataPresent(cdf.Name)) {
//this not works
...