dotNetRDF not parsing LinkedMovie.nt (VDS.RDF.Parsing.RdfParseException) - dotnetrdf

I have tested linkedmdb-18-05-2009-dump.nt with Apache Jena on Java, but dotNetRDF throws the following exception:
VDS.RDF.Parsing.RdfParseException
HResult=0x80131500
Message=Invalid URI encountered, see inner exception for details
Source=dotNetRDF
StackTrace:
at VDS.RDF.Parsing.NTriplesParser.TryParseUri(TokenisingParserContext context, String uri)
at VDS.RDF.Parsing.NTriplesParser.TryParseTriple(TokenisingParserContext context)
at VDS.RDF.Parsing.NTriplesParser.Parse(TokenisingParserContext context)
at VDS.RDF.Parsing.NTriplesParser.Load(IRdfHandler handler, TextReader input)
at ConsoleApp2_RDFWALKTHROUGH.Program.Main(String[] args) in
This exception was originally thrown at this call stack:
[External Code]
Inner Exception 1:
UriFormatException: Invalid URI: The hostname could not be parsed.
My C# code is as follows:
String inputFile = "D:/linkedmdb-18-05-2009-dump.nt";
IGraph g = new Graph();
NTriplesParser parser = new NTriplesParser(NTriplesSyntax.Original);
Console.WriteLine("RDF DS-1 Loading Started:");
// Measure how long the load takes
var stopwatch = System.Diagnostics.Stopwatch.StartNew();
parser.Load(g, new StreamReader(inputFile));
stopwatch.Stop();
Console.WriteLine("RDF DS-1 Loading Finished:");
Console.WriteLine(stopwatch.Elapsed);
Console.ReadLine();
Please tell me where I am going wrong; it is very confusing that the same file parses fine in Java but fails in dotNetRDF.

The problem is that the dump contains an invalid IRI. At line 3104575 in the dump I downloaded from https://www.cs.toronto.edu/~oktie/linkedmdb/ there is the following:
<http://data.linkedmdb.org/film/9995> <http://xmlns.com/foaf/0.1/page> <http://?> .
The last IRI on that line is the one that makes the parser choke: ? is not a valid character at that position in an IRI, because the parser expects a host name there.
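One way to locate such lines before handing the dump to a parser is a simple pre-scan. The following is a minimal sketch in Java, not part of dotNetRDF or Jena; the class and method names are hypothetical, and it only flags http/https IRIs whose host is empty, such as <http://?>. It uses a plain regular expression, so it may also flag text inside literals that happens to look like an IRI.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: flags N-Triples lines that contain an http(s) IRI with an empty host.
class NTriplesIriCheck {
    // Matches every <...> reference on a line.
    private static final Pattern IRI_REF = Pattern.compile("<([^>]*)>");
    // Captures the authority (host) part of an http/https IRI.
    private static final Pattern HTTP_AUTHORITY =
            Pattern.compile("^https?://([^/?#]*)", Pattern.CASE_INSENSITIVE);

    /** True when nothing sits between "//" and the next '/', '?' or '#'. */
    static boolean hasEmptyHost(String iri) {
        Matcher m = HTTP_AUTHORITY.matcher(iri);
        return m.find() && m.group(1).isEmpty();
    }

    /** Returns the 1-based numbers of lines containing at least one suspect IRI. */
    static List<Integer> suspectLines(List<String> lines) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            Matcher m = IRI_REF.matcher(lines.get(i));
            while (m.find()) {
                if (hasEmptyHost(m.group(1))) {
                    result.add(i + 1);
                    break; // one report per line is enough
                }
            }
        }
        return result;
    }
}
```

The reported lines can then be corrected or removed by hand before parsing; for a file of this size the lines would be streamed rather than held in a list.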


BizTalk - Setting email contentType causes error: There is an error in XML document (1, 1)

When I try to set the Microsoft.XLANGs.BaseTypes.ContentType in a BizTalk orchestration, I get an XML error (regardless of whether I use "text/plain" or "text/xml"). I'm using a dynamic send port with the PassThru pipeline.
msg_Email.BodyPart = new Ledger6002.Component.RawString("See attached email. Method 2");
msg_Email.AttachmentPart = msg_Ledger6002_File_XmlDoc;
// Set the filename as it should display on the attachment in the email
// (drop the path, just the filename/extension)
attachmentName = System.IO.Path.GetFileName(
msg_Ledger6002_File_XmlDoc(FILE.ReceivedFileName));
msg_Email.AttachmentPart(MIME.FileName) = attachmentName;
msg_Email.BodyPart(Microsoft.XLANGs.BaseTypes.ContentType) = "text/plain";
msg_Email.AttachmentPart(Microsoft.XLANGs.BaseTypes.ContentType) = "text/plain";
Causes this error:
xlang/s engine event log entry: Uncaught exception (see the 'inner exception' below) has suspended an instance of service 'Ledger6002.Logic.Ledger6002_Process_File(9eb6993c-87d0-7bf0-b0bf-e1f684000af2)'.
The service instance will remain suspended until administratively resumed or terminated.
If resumed the instance will continue from its last persisted state and may re-throw the same unexpected exception.
InstanceId: 216e4bac-0f22-4e06-9e61-2ef46051c991
Shape name: msg_Email
ShapeId: e7ce3f54-0558-4756-a7e4-e3800721178f
Exception thrown from: segment 1, progress 55
Inner exception: There is an error in XML document (1, 1).
Exception type: InvalidOperationException
Source: System.Xml
Target Site: System.Object Deserialize(System.Xml.XmlReader, System.String, System.Xml.Serialization.XmlDeserializationEvents)
The following is a stack trace that identifies the location where the exception occured
at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
at System.Xml.Serialization.XmlSerializer.Deserialize(TextReader textReader)
at Microsoft.XLANGs.Core.Value.GetObject(Type t)
at Microsoft.XLANGs.Core.Value._prepareForWrite(PreferredValueRepresentation pvr)
at Microsoft.XLANGs.Core.ValueTable.PrepareForWrite(ValueToken& vt, PreferredValueRepresentation pvr)
at Microsoft.XLANGs.Core.Part.PrepareForWrite(PreferredValueRepresentation pvr)
at Microsoft.XLANGs.Core.Part.SetPartProperty(Type propType, Object val)
at Microsoft.XLANGs.Core.Part.SetPropertyValue(Type propType, Object val)
at Ledger6002.Logic.Ledger6002_Process_File.segment1(StopConditions stopOn)
at Microsoft.XLANGs.Core.SegmentScheduler.RunASegment(Segment s, StopConditions stopCond, Exception& exp)
Additional error information:
Data at the root level is invalid. Line 1, position 1.
Exception type: XmlException
Source: System.Xml
Target Site: Void Throw(System.Exception)
The following is a stack trace that identifies the location where the exception occured
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.XmlTextReader.Read()
at System.Xml.XmlReader.MoveToContent()
at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderObject.Read2_anyType()
msg_Email is a multipart message with two parts, BodyPart and AttachmentPart, both defined as a custom RawString class.
What could be causing this error? Do I need a pipeline that uses the MIME/Encoder?
I'm now at a different client, trying out an answer to a 2017 question (but using an orchestration instead of a pipeline): How Set Attachment Name to Show Properly in Outlook
When I comment out the two lines that set the ContentType, I get an email, but with the same problem as described in the 2017 post above.
You're missing at least two things:
Set SMTP.MessagePartsAttachments to 2.
Set SMTP.EmailBodyTextCharset to "UTF-8".

DxlImporter inside a loop throws error "DXL importer operation failed"

I have a Java agent that loops through a view and gets the attachment from each document. Each attachment is a .dxl file containing the document's XML data. I extract the file to a temp directory and try to import the extracted .dxl as soon as it has been extracted.
The problem is that the import only works for the first document's attachment in the loop; after that, the following error appears in the Java debug console:
NotesException: DXL importer operation failed
at lotus.domino.local.DxlImporter.importDxl(Unknown Source)
at JavaAgent.NotesMain(Unknown Source)
at lotus.domino.AgentBase.runNotes(Unknown Source)
at lotus.domino.NotesThread.run(Unknown Source)
My Java agent code is:
public class JavaAgent extends AgentBase {
static DxlImporter importer = null;
public void NotesMain() {
try {
Session session = getSession();
AgentContext agentContext = session.getAgentContext();
// (Your code goes here)
// Get current database
Database db = agentContext.getCurrentDatabase();
View v = db.getView("DXLProcessing_mails");
DocumentCollection dxl_tranfered_mail = v.getAllDocumentsByKey("dxl_tranfered_mail");
Document dxlDoc = dxl_tranfered_mail.getFirstDocument();
while(dxlDoc!=null){
RichTextItem rt = (RichTextItem) dxlDoc.getFirstItem("body");
Vector allObjects= rt.getEmbeddedObjects();
System.out.println("File name is "+ allObjects.get(0));
EmbeddedObject eo = dxlDoc.getAttachment(allObjects.get(0).toString());
if(eo.getFileSize()>0){
eo.extractFile(System.getProperty("java.io.tmpdir") + eo.getName());
System.out.println("Extracted File to "+System.getProperty("java.io.tmpdir") + eo.getName());
String filePath = System.getProperty("java.io.tmpdir") + eo.getName();
Stream stream = session.createStream();
if (stream.open(filePath) & (stream.getBytes() >0)) {
System.out.println("In If"+System.getProperty("java.io.tmpdir"));
importer = session.createDxlImporter();
importer.setDocumentImportOption(DxlImporter.DXLIMPORTOPTION_CREATE);
System.out.println("Break Point");
importer.importDxl(stream,db);
System.out.println("Imported Sucessfully");
}else{
System.out.println("In else"+stream.getBytes());
}
}
dxlDoc = dxl_tranfered_mail.getNextDocument();
}
} catch(Exception e) {
e.printStackTrace();
}
}
}
The code runs until it prints "Break Point" and then throws the error, although the attachment is imported the first time.
On the other hand, if I hard-code filePath to a specific .dxl file on the file system, the DXL is imported as a document in the database with no errors.
I am wondering whether the stream passed to the importer has not finished being processed when the next loop iteration starts.
Any suggestions would be helpful.
I can't see any part where your while loop would move on from the first document.
Usually you would have something like:
Document nextDoc = dxl_tranfered_mail.getNextDocument(dxlDoc);
dxlDoc.recycle();
dxlDoc = nextDoc;
near the end of the loop to advance it to the next document. As your code currently stands, it looks like it would never advance and would always stay on the first document.
If you do not know about the need to 'recycle' Domino objects, I suggest you search for some blog posts or articles that explain why it is needed.
It is a little complicated, but basically the Java objects are just a 'wrapper' for the objects in the C API.
Whenever you create a Domino object (such as a Document, View, DocumentCollection etc.), a memory handle is allocated in the underlying 'C' layer. This handle needs to be released (recycled). That happens eventually when the session is recycled, but when you are processing in a loop it is much more important to recycle explicitly, as you can easily exhaust the available memory handles and cause a crash.
You may also need to close (and recycle) each Stream after you have finished importing each file.
Lastly, double-check that the extracted file that causes the exception is definitely a valid DXL file. It could simply be that some of the attachments are not valid DXL and will always throw an exception.
You could put a try/catch within the loop to handle that scenario (and report the problem files), which will allow the agent to continue without halting.
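The per-file try/catch suggested above can be sketched in plain Java as follows. This is a minimal illustration, not Domino API code: FileImporter is a hypothetical stand-in for the real import step (open the Stream, call importDxl, close and recycle the Stream), so that the pattern can be shown without the Notes libraries.

```java
import java.util.ArrayList;
import java.util.List;

class ImportLoopSketch {
    /** Hypothetical stand-in for the real import step of one extracted file. */
    interface FileImporter {
        void importFile(String path) throws Exception;
    }

    /** Tries every path; a failure is recorded and the loop carries on. */
    static List<String> importAll(List<String> paths, FileImporter importer) {
        List<String> failed = new ArrayList<>();
        for (String path : paths) {
            try {
                importer.importFile(path);
            } catch (Exception e) {
                // Report the problem file and continue with the next one.
                System.err.println("Import failed for " + path + ": " + e.getMessage());
                failed.add(path);
            }
        }
        return failed;
    }
}
```

In the agent, the body of the try block would hold the stream handling and the importDxl call, and the returned list gives the files to inspect afterwards.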

Entity framework extended throws DynamicProxy exception

When trying to do bulk updates using EntityFramework.Extended I get one of two exceptions.
Looking at the example I tried:
context.ProcessJobs.Where(job => true).Update(job => new ProcessJob
{
Status = ProcessJobStatus.Processing,
StatusTime = DateTime.Now,
LogString = "Processing"
});
I got the following exception:
'EntityFramework.Reflection.DynamicProxy' does not contain a definition for 'InternalQuery'
...
System.Core.dll!System.Dynamic.UpdateDelegates.UpdateAndExecute1(System.Runtime.CompilerServices.CallSite site, object arg0) + 0x153 bytes
EntityFramework.Extended.dll!EntityFramework.Extensions.ObjectQueryExtensions.ToObjectQuery(System.Linq.IQueryable query) + 0x2db bytes
EntityFramework.Extended.dll!EntityFramework.Extensions.BatchExtensions.Update(System.Linq.IQueryable source, System.Linq.Expressions.Expression> updateExpression) + 0xe9 bytes
EntityFramework.Extended.dll!EntityFramework.Extensions.BatchExtensions.Update(System.Linq.IQueryable source, System.Linq.Expressions.Expression> updateExpression) + 0xe9 bytes
Based on a GitHub issue, I tried:
var c = ((IObjectContextAdapter) context).ObjectContext.CreateObjectSet<ProcessJob>();
c.Update(job => new ProcessJob
{
Status = ProcessJobStatus.Processing,
StatusTime = DateTime.Now,
LogString = "Processing"
});
Which results in the exception (probably same error as reported here)
'EntityFramework.Reflection.DynamicProxy' does not contain a definition for 'EnsureMetadata'
...
EntityFramework.Extended.dll!EntityFramework.Mapping.ReflectionMappingProvider.FindMappingFragment(System.Collections.Generic.IEnumerable itemCollection, System.Data.Entity.Core.Metadata.Edm.EntitySet entitySet) + 0xc1e bytes
EntityFramework.Extended.dll!EntityFramework.Mapping.ReflectionMappingProvider.CreateEntityMap(System.Data.Entity.Core.Objects.ObjectQuery query) + 0x401 bytes
EntityFramework.Extended.dll!EntityFramework.Mapping.ReflectionMappingProvider.GetEntityMap(System.Data.Entity.Core.Objects.ObjectQuery query) + 0x58 bytes
EntityFramework.Extended.dll!EntityFramework.Mapping.MappingResolver.GetEntityMap(System.Data.Entity.Core.Objects.ObjectQuery query) + 0x9f bytes
EntityFramework.Extended.dll!EntityFramework.Extensions.BatchExtensions.Update(System.Linq.IQueryable source, System.Linq.Expressions.Expression> updateExpression) + 0x1c8 bytes
I tried the latest version for EF5, and I upgraded to EF6 to see if the latest version works, but I get the same problem. We use Code First.
I am not sure how to proceed. I've started trying to understand how the EntityFramework.Extensions code works, but I am wondering whether I will have to fall back to a stored procedure or raw SQL, neither of which is ideal for our setup.
Does anyone know what these problems are, or have any ideas about how to work out what is going on?
It turns out that you can ignore this error. I had the debugger option to break on CLR runtime exceptions turned on. I followed through the source code, and then downloaded it and started debugging.
It seems that the exception thrown initially is expected, and the library retries with some other options. Unfortunately I didn't have time to look into the exact problem because I ran into another one, but that's the subject of a different question.

Eclipse warning

I am getting the following warning:
Null passed for nonnull parameter of new java.util.Scanner(Readable) in
model.WordCount.getFile(File).
Why am I getting this and how do I get rid of this warning? Here is the method:
/**
* Receives and parses input file.
*
* @param the_file The file to be processed.
*/
public void getFile(final File the_file) {
FileReader fr = null;
try {
fr = new FileReader(the_file);
} catch (final FileNotFoundException e) {
e.printStackTrace();
}
Scanner input = null;
String word;
input = new Scanner(fr);
while (input.hasNext()) {
word = input.next();
word = word.toLowerCase().
replaceAll("\\.|\\!|\\,|\\'|\\\"|\\?|\\-|\\(|\\)|\\*|\\$|\\#|\\&|\\~|\\;|\\:", "");
my_first.add(word);
setCounter(getCounter() + 1);
}
input.close();
}
I had to initialize the FileReader to null to avoid an error. This is what triggered the warning though.
If the line
fr = new FileReader(the_file);
throws an exception, then fr remains null and will definitely not work in the Scanner. That's what the warning is about.
It's basically telling you that printing the stack trace of an exception is not proper error handling. Instead, you should think about returning from the method when that early exception occurs. Alternatively, you can put the exception handling block around all the code of the method, not just around that single line. The warning will then also vanish, because an exception would prevent any further code in the method from running.
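The second option can be sketched as follows. This is a simplified, hypothetical variant of the method (it returns the words instead of updating my_first and the counter, and it skips the punctuation stripping) that wraps all the work in one try block, so the Scanner is never built from a null reader.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class WordReaderSketch {
    /** Reads the whitespace-separated words of a file, lower-cased. */
    static List<String> readWords(final File file) {
        final List<String> words = new ArrayList<>();
        // try-with-resources closes the Scanner; nothing below runs if the file is missing.
        try (Scanner input = new Scanner(file)) {
            while (input.hasNext()) {
                words.add(input.next().toLowerCase());
            }
        } catch (final FileNotFoundException e) {
            // Handle the failure here instead of continuing with a null reader.
            System.err.println("Cannot open " + file + ": " + e.getMessage());
        }
        return words;
    }
}
```

With this shape there is no variable that has to be initialized to null, so the warning has nothing to attach to.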

JbossTextMessage Unicode convert failed in Linux

I'm trying to upload an XML (UTF-8) file and post it to a JBoss MQ queue. When the listener reads the file, UTF-8 characters are garbled, but ONLY on the JBoss (jboss-5.1.0.GA-3) instance running on Linux.
For instance, BORÅS becomes BOR¿S on the Linux JBoss instance.
When I copy the same JBoss instance and configure it to run on Windows (SP3), it works perfectly.
I have also changed the default setting on Linux by adding JAVA_OPTS=-Dfile.encoding=UTF-8 to the .bashrc and run.sh files.
Inside the listener, JbossTextMessage.getText() returns the incorrectly decoded character.
Any suggestions or workarounds?
I was finally able to find a solution, BUT it is still a black box to me. If anyone knows WHY it failed or succeeded, please update the thread.
Solution at a glance:
1. Capture the file contents as a byte array and write it to an XML file in the JBoss tmp folder using FileOutputStream.
2. When posting to the JBoss message queue, read the XML file written in step 1 through a FileInputStream into a byte array and pass that as the message body.
Code example:
View: JSP page with a FormFile
Controller class: UploadAction.java
public ActionForward execute(ActionMapping mapping, ActionForm form, HttpServletRequest request, HttpServletResponse response){
...........
writeInitFile(theForm.getFile().getFileData()); // Obtain the uploaded file
Message msg = messageHelper.createMessage( readInitFile() ); // messageHelper is a customized factory to create Message objects; pass the newly written file's byte array
messageHelper.sendMsg(msg); // posting in the queue
...........
}
private void writeInitFile(byte[] fileData) throws Exception{
File someFile = new File("/jboss-5.1.0.GA-3/test/server/default/tmp/UploadTmp.xml"); // Write the uploaded file into a temporary file in jboss/tmp folder
FileOutputStream fos = new FileOutputStream(someFile);
fos.write( fileData );
fos.flush();
fos.close();
}
private byte[] readInitFile() throws Exception{
StringBuilder buyteArray=new StringBuilder();
File someFile = new File("/jboss-5.1.0.GA-3/test/server/default/tmp/UploadTmp.xml"); // Read the Newly created file in jboss/tmp folder
FileInputStream fstream = new FileInputStream(someFile);
int ch;
while( (ch = fstream.read()) != -1){
buyteArray.append((char)ch);
}
fstream.close();
return buyteArray.toString().getBytes(); // return the byte []
}
Footnote: I think it has something to do with the default file encoding on Linux versus Windows, e.g. the Windows default is ANSI.
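One possible explanation, which the answer above does not confirm, is that a conversion between bytes and String somewhere in the chain relies on the platform default charset. In readInitFile(), for example, getBytes() is called without a charset argument, so its result depends on the JVM default. The following minimal sketch (the class name is hypothetical) shows how the same UTF-8 bytes decode correctly or incorrectly depending on the charset used, which is why naming the charset explicitly removes the platform dependence.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetSketch {
    /** Decodes bytes with an explicitly named charset instead of the platform default. */
    static String decode(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    /** Encodes text with an explicitly named charset instead of the platform default. */
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }
}
```

For example, encoding "BORÅS" with StandardCharsets.UTF_8 and decoding it with the same charset gives the original text back, whereas decoding those bytes with StandardCharsets.ISO_8859_1 does not.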