counting the number of character in a text using FileReader - character

I am new in this superb place. I got help several times from this site. I have seen many answers regarding my question that was previously discussed but i am facing problem to count the number of characters using FileReader. It's working using Scanner. This is what i tried:
class CountCharacter
{
public static void main(String args[]) throws IOException
{
File f = new File("hello.txt");
int charCount=0;
String c;
//int lineCount=0;
if(!f.exists())
{
f.createNewFile();
}
BufferedReader br = new BufferedReader(new FileReader(f));
while ( (c=br.readLine()) != null) {
String s = br.readLine();
charCount = s.length()-1;
charCount++;
}
System.out.println("NO OF LINE IN THE FILE, NAMED " +f.getName()+ " IS " +charCount);
}
}`

It looks to me that each time you go through the loop, you assign the charCount to be the length of the line that iteration of the loop is concerned with. i.e. instead of
charCount = s.Length() -1;
try
charCount = charCount + s.Length();
EDIT:
If you have say the document with the contents "onlyOneLine"
Then when you first hit the while check the br.readLine() will make the BufferredReader read the first line, during the while's code block however br.readLine() is called again which advances the BufferredReader to the second line of the document, which will return null. As null is assigned to s, and you call length(), then NPE is thrown.
try this for the while block
while ( (c=br.readLine()) != null) {
charCount = charCount + c.Length(); }

Related

MalformedInputException: Input length = 1 while reading text file with Files.readAllLines(Path.get("file").get(0);

Why am I getting this error? I'm trying to extract information from a bank statement PDF and tally different bills for the month. I write the data from a PDF to a text file so I can get specific data from the file (e.g. ASPEN HOME IMPRO, then iterate down to what the dollar amount is, then read that text line to a string)
When the Files.readAllLines(Path.get("bankData").get(0) code is run, I get the error. Any thoughts why? Encoding issue?
Here is the code:
public static void main(String[] args) throws IOException {
File file = new File("C:\\Users\\wmsai\\Desktop\\BankStatement.pdf");
PDFTextStripper stripper = new PDFTextStripper();
BufferedWriter bw = new BufferedWriter(new FileWriter("bankData"));
BufferedReader br = new BufferedReader(new FileReader("bankData"));
String pdfText = stripper.getText(Loader.loadPDF(file)).toUpperCase();
bw.write(pdfText);
bw.flush();
bw.close();
LineNumberReader lineNum = new LineNumberReader(new FileReader("bankData"));
String aspenHomeImpro = "PAYMENT: ACH: ASPEN HOME IMPRO";
String line;
while ((line = lineNum.readLine()) != null) {
if (line.contains(aspenHomeImpro)) {
int lineNumber = lineNum.getLineNumber();
int newLineNumber = lineNumber + 4;
String aspenData = Files.readAllLines(Paths.get("bankData")).get(0); //This is the code with the error
System.out.println(newLineNumber);
break;
} else if (!line.contains(aspenHomeImpro)) {
continue;
}
}
}
So I figured it out. I had to check the properties of the text file in question (I'm using Eclipse) to figure out what the actual encoding of the text file was.
Then, when creating the file in the program, encode the text file to UTF-8 so that Files.readAllLines could read and grab the data I wanted to get.

How do I replicate %Dictionary.ClassDefintionQuery's Summary() in SQL?

There is a procedure in %Dictionary.ClassDefinitionQuery which lists a class summary; in Java I call it like this:
public void readClasses(final Path dir)
throws SQLException
{
final String call
= "{ call %Dictionary.ClassDefinitionQuery_Summary() }";
try (
final CallableStatement statement = connection.prepareCall(call);
final ResultSet rs = statement.executeQuery();
) {
String className;
int count = 0;
while (rs.next()) {
// Skip if System is not 0
if (rs.getInt(5) != 0)
continue;
className = rs.getString(1);
// Also skip if the name starts with a %
if (className.charAt(0) == '%')
continue;
//System.out.println(className);
count++;
}
System.out.println("read: " + count);
}
}
In namespace SAMPLES this returns 491 rows.
I try and replicate it with a pure SQL query like this:
private void listClasses(final Path dir)
throws SQLException
{
final String query = "select id, super"
+ " from %Dictionary.ClassDefinition"
+ " where System = '0' and name not like '\\%%' escape '\\'";
try (
final PreparedStatement statement
= connection.prepareStatement(query);
final ResultSet rs = statement.executeQuery();
) {
int count = 0;
while (rs.next()) {
//System.out.println(rs.getString(1) + ';' + rs.getString(2));
count++;
}
System.out.println("list: " + count);
}
}
Yet when running the program I get this:
list: 555
read: 491
Why are the results different?
I have looked at the code of %Dictionary.ClassDefinitionQuery but I don't understand why it gives different results... All I know, if I store the names in sets and compare, is that:
nothing is missing from list that is in read;
most, but not all, classes returned by list which are not in read are CSP pages.
But that's it.
How I can I replicate the behaviour of the summary procedure in SQL?
Different is in one property. %Dictionary.ClassDefinitionQuery_Summary shows only classes with Deployed<>2. So, sql must be such.
select id,super from %Dictionary.ClassDefinition where deployed <> 2
But one more things is, why count may be different is, such sql requests may be compilled to temporary class, for example "%sqlcq.SAMPLES.cls22"

Need some eclipse search/replace regex help to speed things up

So I have had an issue for a while now and thought it was worth the time to ask the more experienced regex guys if there was a way to fix this issue with a quick search and replace.
So i use a tool which generates java code(not written in java or I would manually fix the cause directly), however, it has an issue calling variables before an object is created.
This always occurs only once per object, but not for every object, the object name is unknown, and the error is always the line directly before the constructor is called. This is the format the error is always in:
this.unknownObjectName.mirror = true;
this.unknownObjectName = new Model(unknown, parameter, values);
I know there should be a trick to fix this, as a simple string replace simply will not work since 'unknownObjectName' is unknown.
Would this even be possible with regex, if so, please enlighten me :)
This is how the code SHOULD read:
this.unknownObjectName = new Model(unknown, parameter, values);
this.unknownObjectName.mirror = true;
For complex models, this error may happen hundreds of times, so this will indeed save a lot of time. That and I would rather walk on hot coals then do mindless busy work like fixing all these manually :)
Edit:
I through together a java app that does the job.
public static void main(String args[]){
File file = new File(args[0]);
File file2 = new File(file.getParentFile(), "fixed-" + file.getName());
try {
if(file2.exists()) {
file2 = new File(file.getParentFile(), "fixed-" + System.currentTimeMillis() + "-" + file.getName());
}
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file2)));
String line, savedline = null, lastInitVar = "";
while((line = br.readLine()) != null){
if(line.contains("= new ")){
String varname = line.substring(0, line.indexOf("=")).trim();
lastInitVar = varname;
}else if(line.contains(".mirror")){
String varname = line.substring(0, line.indexOf(".mirror")).trim();
if(!lastInitVar.equals(varname)){
savedline = line;
continue;
}
}else if(savedline != null && savedline.contains(lastInitVar)){
bw.write(savedline + "\n");
savedline = null;
}
bw.write(line + "\n");
}
bw.flush();
bw.close();
br.close();
} catch (Exception e) {
e.printStackTrace();
}
}
Over thinking it
Write a program to read line by line and when you see a object access before a constructor don't write it out, write out the next line and then write out the buffered line, rinse repeat.
Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems. - Jamie Zawinski
Regular Expressions are for matching patterns not state based logic.

' ', hexadecimal value 0x1F, is an invalid character. Line 1, position 1

I am trying to read a xml file from the web and parse it out using XDocument. It normally works fine but sometimes it gives me this error for day:
**' ', hexadecimal value 0x1F, is an invalid character. Line 1, position 1**
I have tried some solutions from Google but they aren't working for VS 2010 Express Windows Phone 7.
There is a solution which replace the 0x1F character to string.empty but my code return a stream which doesn't have replace method.
s = s.Replace(Convert.ToString((byte)0x1F), string.Empty);
Here is my code:
void webClient_OpenReadCompleted(object sender, OpenReadCompletedEventArgs e)
{
using (var reader = new StreamReader(e.Result))
{
int[] counter = { 1 };
string s = reader.ReadToEnd();
Stream str = e.Result;
// s = s.Replace(Convert.ToString((byte)0x1F), string.Empty);
// byte[] str = Convert.FromBase64String(s);
// Stream memStream = new MemoryStream(str);
str.Position = 0;
XDocument xdoc = XDocument.Load(str);
var data = from query in xdoc.Descendants("user")
select new mobion
{
index = counter[0]++,
avlink = (string)query.Element("user_info").Element("avlink"),
nickname = (string)query.Element("user_info").Element("nickname"),
track = (string)query.Element("track"),
artist = (string)query.Element("artist"),
};
listBox.ItemsSource = data;
}
}
XML file:
http://music.mobion.vn/api/v1/music/userstop?devid=
0x1f is a Windows control character. It is not valid XML. Your best bet is to replace it.
Instead of using reader.ReadToEnd() (which by the way - for a large file - can use up a lot of memory.. though you can definitely use it) why not try something like:
string input;
while ((input = sr.ReadLine()) != null)
{
string = string + input.Replace((char)(0x1F), ' ');
}
you can re-convert into a stream if you'd like, to then use as you please.
byte[] byteArray = Encoding.ASCII.GetBytes( input );
MemoryStream stream = new MemoryStream( byteArray );
Or else you could keep doing readToEnd() and then clean that string of illegal characters, and convert back to a stream.
Here's a good resource for cleaning illegal characters in your xml - chances are, youll have others as well...
https://seattlesoftware.wordpress.com/tag/hexadecimal-value-0x-is-an-invalid-character/
What could be happening is that the content is compressed in which case you need to decompress it.
With HttpHandler you can do this the following way:
var client = new HttpClient(new HttpClientHandler
{
AutomaticDecompression = DecompressionMethods.GZip
| DecompressionMethods.Deflate
});
With the "old" WebClient you have to derive your own class to achieve the similar effect:
class MyWebClient : WebClient
{
protected override WebRequest GetWebRequest(Uri address)
{
HttpWebRequest request = base.GetWebRequest(address) as HttpWebRequest;
request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
return request;
}
}
Above taken from here
To use the two you would do something like this:
HttpClient
using (var client = new HttpClient(new HttpClientHandler { AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate }))
{
using (var stream = client.GetStreamAsync(url))
{
using (var sr = new StreamReader(stream.Result))
{
using (var reader = XmlReader.Create(sr))
{
var feed = System.ServiceModel.Syndication.SyndicationFeed.Load(reader);
foreach (var item in feed.Items)
{
Console.WriteLine(item.Title.Text);
}
}
}
}
}
WebClient
using (var stream = new MyWebClient().OpenRead("http://myrss.url"))
{
using (var sr = new StreamReader(stream))
{
using (var reader = XmlReader.Create(sr))
{
var feed = System.ServiceModel.Syndication.SyndicationFeed.Load(reader);
foreach (var item in feed.Items)
{
Console.WriteLine(item.Title.Text);
}
}
}
}
This way you also recieve the benefit of not having to .ReadToEnd() since you are working with the stream instead.
Consider using System.Web.HttpUtility.HtmlDecode if you're decoding content read from the web.
If you are having issues replacing the character
For me there were some issues if you try to replace using the string instead of the char. I suggest trying some testing values using both to see what they turn up. Also how you reference it has some effect.
var a = x.IndexOf('\u001f'); // 513
var b = x.IndexOf(Convert.ToString((byte)0x1F)); // -1
x = x.Replace(Convert.ToChar((byte)0x1F), ' '); // Works
x = x.Replace(Convert.ToString((byte)0x1F), " "); // Fails
I blagged this
I had the same issue and found that the problem was a  embedded in the xml.
The solution was:
s = s.Replace("", " ")
I'd guess it's probably an encoding issue but without seeing the XML I can't say for sure.
In terms of your plan to simply replace the character but not being able to, because you have a stream rather than a text, simply read the stream into a string and then remove the characters you don't want.
Works for me.........
string.Replace(Chr(31), "")
I used XmlSerializer to parse XML and faced the same exception.
The problem is that the XML string contains HTML codes of invalid characters
This method removes all invalid HTML codes from string (based on this thread - https://forums.asp.net/t/1483793.aspx?Need+a+method+that+removes+illegal+XML+characters+from+a+String):
public static string RemoveInvalidXmlSubstrs(string xmlStr)
{
string pattern = "&#((\\d+)|(x\\S+));";
Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
if (regex.IsMatch(xmlStr))
{
xmlStr = regex.Replace(xmlStr, new MatchEvaluator(m =>
{
string s = m.Value;
string unicodeNumStr = s.Substring(2, s.Length - 3);
int unicodeNum = unicodeNumStr.StartsWith("x") ?
Convert.ToInt32(unicodeNumStr.Substring(1), 16)
: Convert.ToInt32(unicodeNumStr);
//according to https://www.w3.org/TR/xml/#charsets
if ((unicodeNum == 0x9 || unicodeNum == 0xA || unicodeNum == 0xD) ||
((unicodeNum >= 0x20) && (unicodeNum <= 0xD7FF)) ||
((unicodeNum >= 0xE000) && (unicodeNum <= 0xFFFD)) ||
((unicodeNum >= 0x10000) && (unicodeNum <= 0x10FFFF)))
{
return s;
}
else
{
return String.Empty;
}
})
);
}
return xmlStr;
}
Nobody can answer if you don't show relevant info - I mean the Xml content.
As a general advice I would put a breakpoint after ReadToEnd() call. Now you can do a couple of things:
Reveal Xml content to this forum.
Test it using VS Xml visualizer.
Copy-paste the string into a txt file and investigate it offline.

Eclipe PDE: Jump to line X and highlight it

A qustion about Eclipse PDE development: I write a small plugin for Eclipse and have the following
* an org.eclipse.ui.texteditor.ITextEditor
* a line number
How can I automatically jump to that line and mark it? It's a pity that the API seems only to support offsets (see: ITextEditor.selectAndReveal()) within the document but no line numbers.
The best would be - although this doesn't work:
ITextEditor editor = (ITextEditor)IDE.openEditor(PlatformUI.getWorkbench().getActiveWorkbenchWindow().getActivePage(), file, true );
editor.goto(line);
editor.markLine(line);
It this possible in some way? I did not find a solution
on the class DetailsView I found the following method.
private static void goToLine(IEditorPart editorPart, int lineNumber) {
if (!(editorPart instanceof ITextEditor) || lineNumber <= 0) {
return;
}
ITextEditor editor = (ITextEditor) editorPart;
IDocument document = editor.getDocumentProvider().getDocument(
editor.getEditorInput());
if (document != null) {
IRegion lineInfo = null;
try {
// line count internaly starts with 0, and not with 1 like in
// GUI
lineInfo = document.getLineInformation(lineNumber - 1);
} catch (BadLocationException e) {
// ignored because line number may not really exist in document,
// we guess this...
}
if (lineInfo != null) {
editor.selectAndReveal(lineInfo.getOffset(), lineInfo.getLength());
}
}
}
Even though org.eclipse.ui.texteditor.ITextEditor deals wiith offset, it should be able to take your line number with the selectAndReveal() method.
See this thread and this thread.
Try something along the line of:
((ITextEditor)org.eclipse.jdt.ui.JavaUI.openInEditor(compilationUnit)).selectAndReveal(int, int);