OutputStream write(int b) method: special characters encoding - encoding

I'm not able to solve my problem with writing special characters ěščřžýáíé using method write(int b) from OutputStream class. I'm using this class to redirect console output to JTextArea.
class CustomOutputStream extends OutputStream {
private JTextArea textArea;
CustomOutputStream(JTextArea textArea) {
this.textArea = textArea;
}
@Override
public void write(int b) {
textArea.append(String.valueOf((char) b));
// scrolls the text area to the end of data
textArea.setCaretPosition(textArea.getDocument().getLength());
}
}
It works fine for standard characters. For characters ěščřžýáíé it prints unreadable characters.
I don't know how to set the character encoding or how to solve this some other way.
Thank you, J.

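You cast b, which is an int, to a char, so anything above 0xFFFF is truncated, and you are not interpreting b as a Unicode code point. If b holds a code point, convert it like this (append expects a String, so the char[] has to be wrapped):
textArea.append(new String(Character.toChars(b)));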
If your original text is not Unicode, you may decode it.
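A note on the decoding route: when a PrintStream writes to an OutputStream, write(int b) receives the encoded bytes one at a time, so a character such as ě arrives as two separate UTF-8 bytes. Below is a minimal sketch of decoding those bytes before appending them. The class name DecodingOutputStream and the sink callback are hypothetical names chosen for illustration, and the charset is assumed to match the one the PrintStream encodes with.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.util.function.Consumer;

// Hypothetical helper: collects bytes and hands complete decoded characters to a sink.
class DecodingOutputStream extends OutputStream {
    private final CharsetDecoder decoder;
    private final Consumer<String> sink;
    // Holds the bytes of a character that is not complete yet.
    private final ByteBuffer pending = ByteBuffer.allocate(8);
    private final CharBuffer decoded = CharBuffer.allocate(8);

    DecodingOutputStream(Charset charset, Consumer<String> sink) {
        this.decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        this.sink = sink;
    }

    @Override
    public void write(int b) {
        pending.put((byte) b);
        pending.flip();
        // Decodes only complete characters; a partial sequence stays in 'pending'.
        decoder.decode(pending, decoded, false);
        pending.compact();
        decoded.flip();
        if (decoded.hasRemaining()) {
            sink.accept(decoded.toString());
        }
        decoded.clear();
    }
}
```

With a JTextArea, the sink could be textArea::append, and the stream could be wrapped as new PrintStream(new DecodingOutputStream(StandardCharsets.UTF_8, textArea::append), true, "UTF-8") so that both sides agree on the encoding.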

Related

Eclipse formatter: How to move comment between method name and open brace to always have open brace on same line as method declaration

I have code that looks like this, that I'm trying to format
Original code:
public int doThing(int a) // -incredibly useful comment here
{
int ab = a+1;
return ab;
}
I want it to look like this
public int doThing(int a) { // -incredibly useful comment here
int ab = a+1;
return ab;
}
If I try to turn on the Brace position -> Method Declaration -> Same line option and run the formatter, any code with a comment in the position "breaks" the formatter, and I get an output for my example that looks the same as the original code, but methods without a comment have the correct formatting (meaning the results are inconsistent).
Is it possible with the eclipse formatter to get the style I want? I'm trying to run it against a large amount of code, and would prefer not to have to fix these all manually to get a consistent brace position.
The problem here is that this is not formatting but rewriting. A File Search with a regular expression and Replace can do it in bulk.
Try this regex
^(\s*(?:public|private|protected)\s+[^(]+\([^)]*\))(\s*\/\/[^/]+)\R\s*\{
in File Search (Ctrl + H).
Hit Replace and use $1 { $2\n as the replacement.
Code should compile after the refactoring.
UPDATE:
Fixed regex part that represents function arguments
\([^)]*\)
Full Regex matches these cases
public int doSmthg() // comment here
{
return 1;
}
private String doSmthgElse(String arg) // comment here
{
return arg;
}
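To try the pattern outside Eclipse, here is a minimal sketch using java.util.regex, which also supports \R. The class and method names are hypothetical. One deliberate difference from the replacement string above: the sketch uses $1 {$2 without the trailing \n, because the line break after the opening brace is not part of the match and so is already kept. Note also that [^/]+ can match across line breaks, so results on real code are worth reviewing before committing.

```java
import java.util.regex.Pattern;

// Hypothetical helper that moves "{" up onto the method declaration line.
class BraceCommentRewriter {
    // Same pattern as in the answer, with MULTILINE so that ^ matches at each line start.
    private static final Pattern DECLARATION = Pattern.compile(
            "^(\\s*(?:public|private|protected)\\s+[^(]+\\([^)]*\\))(\\s*//[^/]+)\\R\\s*\\{",
            Pattern.MULTILINE);

    static String rewrite(String source) {
        // $1 is the declaration, $2 is the comment including its leading whitespace.
        return DECLARATION.matcher(source).replaceAll("$1 {$2");
    }

    public static void main(String[] args) {
        String source =
                "public int doSmthg() // comment here\n"
              + "{\n"
              + "    return 1;\n"
              + "}\n";
        System.out.print(rewrite(source));
    }
}
```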

What is the most effective way in systemVerilog to know how many words a string has?

I have Strings in the following structure:
cmd, addr, data, data, data, data, ……., \n
For example:
"write,A0001000,00000000, \n"
I have to know how many words the String has.
I know that I can go over the String and search for the number of commas, but is there more effective way to do it?
UVM provides a facility to do regexp matching using the DPI, in case you're already using that. Have a look at the functions in uvm_svcmd_dpi.svh
Verilab also provides svlib, a package containing string matching functions.
A simpler option would be to change the commas(,) to a space, then you can use $sscanf (or $fscanf to skip the intermediate string and read directly from a file), assuming each command has a maximum number of words.
int code; // returns the number of words read
string str,word[5];
code = $sscanf(str,"%s %s %s %s %s", word[0],word[1],word[2],word[3],word[4]);
You can use %h if you know a word is in hex and translate it directly to a numeric value instead of a string.
The first step is to define very clearly what a word actually is, viz. what constitutes the start of a word and what constitutes the end of a word. Once you understand this, it should become obvious how to parse the string correctly.
In Java, StringTokenizer is one way to find the count of words in a string.
import java.util.StringTokenizer;
String sampleString = "cmd addr data data data data....";
StringTokenizer st = new StringTokenizer(sampleString);
int count = st.countTokens(); // number of whitespace-separated tokens
Hope this will help you :)
In Java, you can use the following code to count the words in a string:
public class WordCounts{
public static void main(String []args){
String text="cmd, addr, data, data, data, data";
String trimmed = text.trim();
int words = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
System.out.println(words);
}
}
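The Java snippet above splits on whitespace, so a line without spaces such as "write,A0001000,00000000, \n" would count as a single word. A minimal sketch that splits on commas instead, assuming a word is any non-blank field between commas (the class and method names are hypothetical):

```java
// Hypothetical helper: counts non-blank comma-separated fields.
class CommaWordCounter {
    static int countWords(String line) {
        int count = 0;
        for (String field : line.split(",")) {
            // Ignore blank fields such as the " \n" left after the last comma.
            if (!field.trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords("write,A0001000,00000000, \n"));
    }
}
```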

NTriplesParser extract textual value from string

I am using dotnetrdf and trying to parse some triples with NTriplesParser. I have my own handler RobHandler in which I process each triple in turn.
public class RobHandler : BaseRdfHandler
{
protected override bool HandleTripleInternal(Triple t)
{
string predicateUrl = ((BaseUriNode)(t.Predicate)).Uri.AbsoluteUri;
string value = t.Object.ToString();
}
}
This works fine, but I want to get the object minus the language. My objects look like "Lincoln"@en. I could obviously write some code to remove the @en bit, but I'd rather use library code than my own code that hard-codes strings like @en. To do this I think I need to create a LiteralNode, but there doesn't seem to be a way to get from a string, which is what I have (my variable value), to a LiteralNode.
How can I extract just the textual value from an object string?
Actually I think I have the answer myself:
if (t.Object.NodeType == NodeType.Literal)
{
var node = (ILiteralNode)t.Object;
string text = node.Value; // lexical value without the language tag
}

XElement & UTF-8 Issue

I have a .NET Web Service(.asmx, not .svc) that accepts a string via HTTP POST. The strings it accepts are xml infosets I then parse via XElement.Parse. Once parsed into an XElement instance, I add a node to one of the elements within the instance.
The problem I'm having is that if a string representing an XML infoset comes through with a character such as 0x06, then for some reason adding a node to the XElement throws an exception such as "' ', hexadecimal value 0x06, is an invalid character. Line 1, position 40.". I get a wide array of 0x(*) errors thrown. If I don't attempt to add nodes to the XElement, everything's fine. Here's how I'm adding the element:
var prospect = doc.Element("prospect");
var provider = prospect.Element("provider");
provider.Add(new XElement("id",
new XAttribute("reservation-code",
reservationCode)
));
Is there some sort of string conversion I ought to be doing somewhere?
XML does not allow some Unicode characters. See the XML 1.0 Specification. Unfortunately, there is no standard way to escape those characters in XML either. For example, you cannot escape such a character in valid XML using a character reference (such as &#x6;) because of the Well-formedness constraint: Legal Character (see character references).
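To make the restriction concrete, here is a minimal sketch of the XML 1.0 Char production (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF) as a check over code points. It is written in Java purely for illustration, the class and method names are hypothetical, and it only locates offending characters; it is not part of the .NET fix discussed in this answer.

```java
// Hypothetical helper based on the Char production of the XML 1.0 specification.
class XmlChars {
    static boolean isLegalXml10(int codePoint) {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
                || (codePoint >= 0x20 && codePoint <= 0xD7FF)
                || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
                || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    // Returns the index of the first character XML 1.0 does not allow, or -1 if there is none.
    static int firstIllegalIndex(String text) {
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (!isLegalXml10(codePoint)) {
                return i;
            }
            i += Character.charCount(codePoint);
        }
        return -1;
    }
}
```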
The XElement.ToString() has the check for those characters turned on. However, .NET does provide a way to turn character checking off. It is off by default in the System.Xml.XmlWriter instances. Therefore the following code will work:
/// <summary>
/// Returns the XML string of the <paramref name="xElement"/> WITHOUT CHARACTER CHECKING.
/// </summary>
/// <param name="xElement"></param>
/// <returns></returns>
public static string ToStringWithoutCharacterChecking(this XElement xElement)
{
using (System.IO.StringWriter stringWriter = new System.IO.StringWriter())
{
using (System.Xml.XmlTextWriter xmlTextWriter = new XmlTextWriter(stringWriter))
{
xElement.WriteTo(xmlTextWriter);
}
return stringWriter.ToString();
}
}
Notice however that if you create an System.Xml.XmlWriter instance using System.Xml.XmlWriterSettings, the default is true for character checking. Therefore if you use System.Xml.XmlWriterSettings and want to turn off character checking, use:
XmlWriterSettings s = new XmlWriterSettings();
s.CheckCharacters = false;
using(XmlWriter w = XmlWriter.Create(..., s))
{
//etc.
}
Thanks a lot, this solved my problem when using LINQ to XSD.
Here is my code:
//not using container.Save(new StreamWriter(toStream, new UTF8Encoding(false)));
Instead, I use:
using (XmlWriter w = XmlWriter.Create(new StreamWriter(toStream, new UTF8Encoding(false)), new XmlWriterSettings
{//http://stackoverflow.com/questions/5709831/xelement-utf-8-issue
//http://stackoverflow.com/questions/10057171/xdocument-prevent-invalid-charachters
Indent = true,
CheckCharacters = false
}))
{
XTypedServices.Save(w, container.Untyped);
}
toStream.Flush();

Eclipse getter/setter format

Does anyone know of an Eclipse plug-in or method to get Eclipse to generate getter/setters on one line like this:
public String getAbc() { return abc; }
Instead of
public String getAbc() {
return abc;
}
I'm on Eclipse v. 3.2.2.
Thanks.
I don't know how to make Eclipse generate them in the format you want, but you could do a search/replace using these regular expressions after the methods are generated:
Find:
(?m)((?:public |private |protected )?[\w$]+) (get|set|is)([\w$]+)\(([\w$]+(?:\[\])? [\w$]+)?\) \{\s+(return [\w$]+;|this.[\w$]+ = [\w$]+;)\s+\}
Replace by:
$1 $2$3($4) { $5 }
This expression will transform the generated getters and setters to be one line. Don't worry about running it with a mixture of transformed and newly generated methods; it will work just fine.
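The same find and replace pair can be checked outside Eclipse. Below is a minimal sketch with java.util.regex using the pattern and replacement from this answer unchanged. The class and method names are hypothetical. Running it twice gives the same text, which matches the remark about mixing already transformed and newly generated methods.

```java
import java.util.regex.Pattern;

// Hypothetical helper that collapses generated getters and setters onto one line.
class AccessorCollapser {
    private static final Pattern ACCESSOR = Pattern.compile(
            "(?m)((?:public |private |protected )?[\\w$]+) (get|set|is)([\\w$]+)"
          + "\\(([\\w$]+(?:\\[\\])? [\\w$]+)?\\) \\{\\s+"
          + "(return [\\w$]+;|this.[\\w$]+ = [\\w$]+;)\\s+\\}");

    static String collapse(String source) {
        // $4 is the optional parameter; an unmatched group is replaced by nothing.
        return ACCESSOR.matcher(source).replaceAll("$1 $2$3($4) { $5 }");
    }

    public static void main(String[] args) {
        String source =
                "public String getAbc() {\n"
              + "    return abc;\n"
              + "}\n";
        System.out.print(collapse(source));
    }
}
```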
I think matching generics is important as well, so the correct regexp is:
(?m)((?:public |private |protected )?[\w\<\>$]+) (get|set|is)([\w$]+)\(([\w\<\>$]+ [\w$]+)?\) \{\s+(return [\w$]+;|this.[\w$]+ = [\w$]+;)\s+\}
As a variation of the regexp replacement approach, the following reformats the whitespace so that setters are followed by a blank line, but getters are not.
Find:
(\s(?:get|is|set)\w+\([^)]*\))\s*\{\s*(?:([^=;}]+;)\s*\}\s*(\R)|([^=;}]+=[^=;}]+;)\s*\}\s*(\R))
Replace by:
$1 { $2$4 } \R$5
Results in:
int getTotal() { return total; }
void setTotal(int total) { this.total = total; }
List<String> getList() { return list; }
void setList(List<String> list) { this.list = list; }
Map.Entry<String, Integer> getEntry() { return entry; }
void setEntry(Map.Entry<String, Integer> entry) { this.entry = entry; }
It's a minor aesthetic thing, but I figured that if you're looking for an answer to this question, then you're probably (almost) as anal as me ;-)
I know my regexp conditions are not as strict as those of @Hosam, but I haven't experienced any "false positive" replacements.
Java code formatting in Eclipse does not differentiate between getters/setters and any other methods in a class. So this cannot be done by built-in eclipse formatting.
You will need to either:
run a search/replace with the aforementioned regex
get an external plugin like PMD or CheckStyle and enforce a regex rule based on the previous option
You can use the Fast Code plug-in to generate this kind of getters and setters. The details are given here: http://fast-code.sourceforge.net/documentation.htm#create-new-field.
I wanted to post as a comment to the designated answer, but I don't seem to be able to.
I modified Hosam Aly's answer to work with generic and inner types of the form:
List<X>
and
Map.Entry
The revised regular expression search string is:
(?m)((?:public |private |protected )?[\w\.\<\>$]+) (get|set|is)([\w$]+)\(([\w\.\<\>$]+ [\w$]+)?\) \{\s+(return [\w\.\<\>$]+;|this.[\w$]+ = [\w$]+;)\s+\}
This regular expression allows for angle brackets and a dot in the type.
For example:
public List<String> getStringList()
and
public void setStringList(List<String> list)
and
public Map.Entry getEntry ()
And the replace string is the same as before:
$1 $2$3($4) { $5 }