Special characters in Sikuli script

I am trying to use some French special characters with Sikuli. I type this in the Sikuli IDE:
App.open('C:\\à table\\app.exe')
But I get this error:
[log] App.open C:\à table\NDC.exe(0)
[error] App.open failed: C:\à table\NDC.exe not found
It seems that Sikuli doesn't handle UTF-8 properly at the moment. All I could find on Google was the same problem with the type() function, where the suggested fix is to use paste() instead, which goes through the clipboard.
Is there a workaround in the case of App.open?
Thanks a lot.

You could make a .bat file and call App.open('path/to/bat/file.bat'); the batch file itself contains the path to the .exe, so Sikuli never has to deal with the accented path directly. A sketch follows.
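Here is a minimal, untested sketch of that idea; the launcher location C:\launch_app.bat and the codepage are assumptions, not from the question:
# Sikuli/Jython sketch: write an ASCII-named .bat that launches the
# accented path, then point App.open at the .bat instead.
bat_path = 'C:\\launch_app.bat'   # hypothetical ASCII-only location

f = open(bat_path, 'w')
# cmd.exe reads .bat files in the console codepage; on a French Windows
# that is usually cp850 (try 'cp1252' if the accent comes out wrong).
# \u00e0 is 'a grave'; the escape avoids the IDE's source-encoding issue.
f.write(u'start "" "C:\\\u00e0 table\\app.exe"\r\n'.encode('cp850'))
f.close()

App.open(bat_path)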

The reason for this problem seems to be that Python 2.5.x doesn't support character encodings properly. One has to use tricks like encode('cp1252'), encode('utf8'), and so on.
Since Sikuli is based on Jython, which is based on Python 2.5.2, we are stuck!
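A sketch of that encode() trick applied to the call from the question (untested; cp1252 is an assumption for a French Windows locale):
# Build the path as a unicode string; \u00e0 is 'a grave', written as an
# escape so the IDE's source encoding does not matter, then encode it
# to the local Windows codepage before handing it to App.open.
path = u'C:\\\u00e0 table\\app.exe'
App.open(path.encode('cp1252'))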
I wish I lived in a country that only used the standard ASCII table; I really hate all these problems related to codepages and encodings.

How do I generate media files with names that contain blanks?

I have this in my .install4j file:
<mediaSets>
  <windows ...
    mediaFileName="${compiler:product.name} ${compiler:edition.name} Win32 ${compiler:sys.version}" ...>
</mediaSets>
The issue is with the name of the generated media file.
I get: Our-Product_Enterprise_Win32_6.0.36-SNAPSHOT.exe
I want: Our-Product Enterprise Win32 6.0.36-SNAPSHOT.exe
I switched off the "Convert dots to underscores" setting, which helped clean up the version number, but how do I prevent Install4J from replacing all blanks with underscores?
I could live with using some other character, but:
_ (underscore) is confusing, since the name already has dashes in it (it should be possible to read the name out over the telephone).
- (dash) is even more confusing, since there are more dashes in it.
~ (tilde) looks weird; the installer is seen by non-technical people, and I don't want to confuse them.
en-space and em-space might work, but could cause character-set weirdness, particularly since we're building Linux and Windows installers (and maybe Mac installers in the future). That is, I'll want to stick to 7-bit ASCII if at all possible.
Currently this is not possible. For 8.0.3, I have now implemented an option "Convert spaces to underscores" so you can disable the replacement.
Please contact support#ej-technologies.com to get a build where this is already implemented.
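Until such a build is available, one possible stopgap (not an install4j feature; purely illustrative, with made-up paths) is to rename the generated media file in a post-build step:
import os

# Hypothetical post-build rename: turn the separator underscores back
# into spaces. Directory and file name are invented to match the
# example in the question.
build_dir = 'build/media'
old_name = 'Our-Product_Enterprise_Win32_6.0.36-SNAPSHOT.exe'
new_name = old_name.replace('_', ' ')
os.rename(os.path.join(build_dir, old_name),
          os.path.join(build_dir, new_name))
# -> 'Our-Product Enterprise Win32 6.0.36-SNAPSHOT.exe'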

Can you write Perl 6 scripts using an encoding that is not utf8?

Perl 5 has the encoding pragma and the Filter::Encoding module; however, I have not found anything similar in Perl 6. I guess source filters will eventually be created, but for the time being, can you use other encodings in Perl 6 scripts?
You cannot write your Perl 6 script in anything except UTF-8. I don't think there will ever be any other encoding you will be allowed to write your script in, as UTF-8 is basically the universal standard. Benefits like having no endianness issues and being backward compatible with ASCII are some of the reasons it has become the standard, rather than UTF-16 or UTF-32.
Maybe there was a time when such a thing would have been useful, but today I don't see that being the case. All text editors in common use that I know of default to UTF-8, and having files in multiple encodings makes it more difficult to share your Perl 6 programs with others. There are plenty of reasons to want to use other encodings external to Perl 6 (writing files, reading files, etc.), but I don't see adding source filters as a smart move.
Rakudo currently supports an --encoding= option, so in theory you might be able to write a script in a different character encoding and call it with perl6 --encoding=utf16 yourscript.p6. But in my experiments I haven't managed to get it working with anything except utf8, and even if it worked, having to specify --encoding on the command line would be a big no-go for me.
So the operational answer is: currently no.
(And I don't think anybody else has asked for it yet...)

How does a computer display a character on the screen with the correct encoding?

I'm interested in how characters are encoded in the computer.
When I open my xxx.c with Visual Studio Code, how does VS Code detect the encoding of my file and interpret the sequence of 0s and 1s? Furthermore, how does VS Code (or the computer system) display the characters on the screen according to the bits in my file and the character encoding?
Thank you!
I also use Chinese in my projects. Sometimes the file encoding really drives me crazy. For example, a correct UTF-8 file created by editor A was destroyed by some text editor B that interpreted it as a GBK file, and editor A could never get it back correct.
I searched a lot, but most of the answers seem too abstract or irrelevant. I want to figure out how the software and the computer system (or operating system) cooperate to get this simple but important job done!
First things first, "can never get it back": Always Use Source Code Control
"How the software and the computer system (or operating system) cooperate together to make this simple but important job done!": They don't that's the problem!
Short history: Many decades ago, people used small character sets. The idea was that a system would always use the same one. Simple. Every time a text file was transferred between systems, it would be immediately transcoded to the local character encoding. Then came the globalization of file exchanges, and systems needed to hold text files in different encodings. There was no general way of recording what the encoding was. In 1991 came the huge Unicode character set. Languages (VB4, Java), operating system APIs (Win32), file systems (NTFS), … began adopting it. However, its encodings (UTF-8, UTF-16) are just yet more possibilities for which encoding a text file uses. Many programs that read text files either rely on the old system of a system default encoding or guess ("detect").
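As a rough illustration of what that guessing looks like (a simplification for this answer, not how VS Code actually detects encodings; Python is used only for illustration): sniff for a byte-order mark, then fall back to trial decoding.
import codecs

# Simplified encoding "detection": check for a BOM, then try candidate
# encodings until one decodes cleanly. Real editors use smarter
# statistics; the candidate list below is arbitrary.
def guess_encoding(raw):
    if raw.startswith(codecs.BOM_UTF8):
        return 'utf-8-sig'
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        return 'utf-16'
    for candidate in ('utf-8', 'gbk', 'cp1252'):
        try:
            raw.decode(candidate)
            return candidate
        except UnicodeDecodeError:
            pass
    return None  # give up; a real editor would ask the user

f = open('xxx.c', 'rb')   # file name taken from the question
data = f.read()
f.close()
print(guess_encoding(data))
This also hints at why editor B in the question can mis-identify a file: a byte sequence that happens to be valid in more than one encoding decodes "successfully" either way, so guessing can never be fully reliable.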
In the programming world, some languages require source files to use a specific encoding (say, UTF-8); in others, the tools default to a specific encoding. In most cases, the toolset provided with a C or C++ implementation will have a consistent set of rules. If you also use an IDE or another form of project system, you can set the encoding for the entire project and, in some cases, for specific files.
So, the only solution is to only use tools that work for you and to properly configure them. If it hurts, stop doing it.
Aside: On the topic of programming and default character encodings, be careful not to get tricked by various language libraries' use of the system default character encoding, unless that is exactly what's needed. Otherwise, you are giving your users the same problem that you are encountering. (In Java, just avoid it with explicit arguments. In C and C++ libraries, encoding is folded into locales. But note that many systems initialize a program to use the default character encoding.)
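To make the "explicit arguments" point concrete, here is a small Python 3 sketch (Python only for illustration; the same idea applies to Java's Charset arguments and C/C++ locales). The file name is made up:
import locale

# The platform default is whatever the user's locale says, e.g. 'cp936'
# on Chinese Windows; relying on it makes behavior vary per machine.
print(locale.getpreferredencoding())

# Fragile: the encoding used depends on the system default.
f = open('notes.txt', 'w')
f.write('\u4f60\u597d')   # "ni hao"
f.close()

# Robust: state the encoding explicitly on every open().
f = open('notes.txt', 'w', encoding='utf-8')
f.write('\u4f60\u597d')
f.close()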

ANSI escape characters do not appear the way they should in the Eclipse console

I have a Scala project and I use the Scala Eclipse plugin along with sbt. So far so good. But the problem is that sbt writes some ANSI escape sequences to the output (I might be wrong about this?). They appear fine when I invoke sbt from a shell, but inside Eclipse they appear like this:
[0m[[0minfo[0m] [34m[0m
What's wrong?
See the discussion for "an eclipse console view that respects ansi color codes". I followed the suggestion of #thegreendroid and used ANSIConsole successfully.
The Eclipse console does not support ANSI escape sequences.
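If installing a plugin such as ANSIConsole is not an option, another workaround is to strip the escape sequences out of the output before it reaches the console. A generic sketch (Python only for illustration; the equivalent regex works in Scala or Java):
import re

# Color codes are CSI sequences: ESC '[' parameters final-byte. The
# ones in the question (ESC[0m, ESC[34m, ...) are the 'm' (SGR) kind;
# this pattern matches any CSI sequence.
ANSI_CSI = re.compile(r'\x1b\[[0-9;]*[A-Za-z]')

def strip_ansi(text):
    return ANSI_CSI.sub('', text)

print(strip_ansi('\x1b[0m[\x1b[0minfo\x1b[0m] \x1b[34mhello\x1b[0m'))
# -> [info] hello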

How to discover command line options (if any) for an undocumented executable of unknown origin?

Take an undocumented executable of unknown origin. Trying /?, -h, and --help from the command line yields nothing. Is it possible to discover whether the executable supports any command-line options by looking inside it? Possibly by reverse engineering? What would be the best way of doing this?
I'm talking about a Windows executable, but I would be interested to hear what different approaches would be needed on another OS.
On Linux, step one would be to run strings your_file, which dumps all the strings of printable characters in the file. Any constant strings will thus be shown, including any "usage" instructions.
A next step could be to run ltrace on the file. This shows all the library calls the program makes. If they include getopt (or similar), that is a sure sign it processes input parameters. In fact, you should be able to see exactly which options the program expects, since the option string is the third parameter to the getopt function.
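In a getopt optstring, each letter is an option and a trailing ':' means it takes an argument. The optstring "vf:o:" below is invented for illustration; decoding it (Python only as a demonstration language):
# Decode a hypothetical getopt optstring such as ltrace might reveal.
# "vf:o:" is made up: -v is a flag; -f and -o each take an argument.
optstring = "vf:o:"

i = 0
while i < len(optstring):
    letter = optstring[i]
    takes_arg = i + 1 < len(optstring) and optstring[i + 1] == ':'
    print('-%s%s' % (letter, ' <arg>' if takes_arg else ''))
    i += 2 if takes_arg else 1
# Output:
# -v
# -f <arg>
# -o <arg>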
For Windows, see this question about decompiling Windows executables. It should be relatively easy to at least discover the options (what they actually do is a different story).
If it's a .NET executable, try using Reflector. This will convert the MSIL code into the equivalent C# code, which may make it easier to understand. Unfortunately, private and local variable names will be lost, as they are not stored in the MSIL, but it should still be possible to follow what's going on.