I'm getting [#document=null] while parsing an XML file in Java (Eclipse)

I'm getting [#document=null] while parsing an XML file in Java.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
File file=new File("./src/test/resources/testdata_input/TMS/mapping/OutputXMLFromLogicApp.xml");
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.parse(new FileInputStream(file));
document.getDocumentElement().normalize();
I expected the document variable to contain my XML data.
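A likely reading of this symptom: the text is just how the Document object prints itself, not a sign that parsing failed. By the DOM specification a Document node is named #document and its node value is null, and the string shown by println or a debugger is built from those two fields. The sketch below is a minimal check using only the JDK; the class name DocumentNullCheck and the sample XML are made up for illustration, and it parses an in-memory string instead of the file from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DocumentNullCheck {

    // Parses an XML string the same way the question parses a file stream.
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample data, standing in for OutputXMLFromLogicApp.xml.
        Document document = parse("<orders><order id=\"1\"/></orders>");
        document.getDocumentElement().normalize();

        // The exact toString() text depends on the parser implementation.
        System.out.println(document);

        // The node value of a Document node is null by definition.
        System.out.println(document.getNodeValue());

        // The parsed content is reachable through the root element.
        System.out.println(document.getDocumentElement().getNodeName());
        System.out.println(document.getElementsByTagName("order").getLength());
    }
}
```

If the root element name and the element count come back as expected, the data is in the document, and the next step is to read it through getDocumentElement() rather than by printing the Document itself.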

Related

How to retrieve Metadata stored on file in MongoDB GridFS (VB.net)

I can upload a file to GridFS and store metadata related to the file:
Dim fs As Stream = New FileStream(file, FileMode.Open)
Dim gridFSBucketOptions As GridFSBucketOptions = New GridFSBucketOptions()
gridFSBucketOptions.BucketName = "Datamodules"
Dim bucket = New GridFSBucket(mongoDB, gridFSBucketOptions)
Dim UploadOptions = New GridFSUploadOptions With {
.ChunkSizeBytes = 64512,
.Metadata = New BsonDocument From {
{"DMC", info.Dmc},
{"ISSNO", info.Issno},
{"Inwork", info.Inwork}}
}
Dim id As Object = Nothing
id = bucket.UploadFromStream(Path.GetFileName(file), fs, UploadOptions)
fs.Close()
It creates a new record that looks like this:
_id: 6357db2d0120ab34fa67045d
length: 31136
chunkSize: 64512
uploadDate: 2022-10-25T12:48:45.343+00:00
md5: "ada7a4595a04ff20cdae10b383d2cf32"
filename: "DMC-JA-A-00-00-00-00A-005A-A_00101.SGM"
metadata: Object
    DMC: "JA-A-00-00-00-00A-005A-A"
    ISSNO: "001"
    Inwork: "01"
This is exactly what I want, but I cannot load the metadata. I can download the file using
bucket.DownloadToStreamByName(_file, fs, downloadOptions)
but I cannot access the metadata part (I don't know how). Please help. I'm using the GridFS component of the official MongoDB .NET Driver, version 2.17.1.

Apache Tika - Parsing and extracting only metadata without reading content

Is there a way to configure Apache Tika so that it only extracts the metadata properties from a file and does not access the file's content? We need this to avoid reading the entire content of larger files.
The code to extract we are using is as follows:
var tikaConfig = TikaConfig.getDefaultConfig();
var metadata = new Metadata();
AutoDetectParser parser = new AutoDetectParser(tikaConfig);
BodyContentHandler handler = new BodyContentHandler();
using (TikaInputStream stream = TikaInputStream.get(new File(filename), metadata))
{
parser.parse(stream, handler, metadata, new ParseContext());
Array metadataKeys = metadata.names();
Array.Sort(metadataKeys);
}
With the code above, the content is read as well when we extract the metadata. We need a way to avoid that.

iText 5.5.3 + XFAWorker: Repeated section not rendering

We are using iText Java 5.5.3 with XFAWorker to inject XML into PDF templates and flatten those to simple PDFs.
To integrate into a system that has parts built in various technologies, we have wrapped it in an HTTP server using the Simple framework.
The injection stage works fine, but after flattening some sections are lost.
Detailed information:
The PDF template we inject into is: CLAIM.Plan.pdf.
Sample data we try to inject is: plan_v4.xml.
Without flattening we get: CLAIM_plan_unflattened.pdf.
With flattening the result is CLAIM_plan_flattened.pdf.
plan_v4.xml has multiple instances of //form/PlanItems/PlanItem, and CLAIM_plan_v3.pdf repeats the section to display all of them. This is the OK case. CLAIM_plan_v4.pdf does not show the repeated sections for the individual PlanItem instances. This is the broken case.
The main bit of code that uses iText looks like this:
// load the XML document that contains the contents of the data element.
// NOT the <data> element itself or anything else.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document newdoc = db.parse(new InputSource(new InputStreamReader(xfaStream, "UTF-8")));
Node newdata = newdoc.getDocumentElement();
// Open the PDF and extract the complete xfa.
PdfReader reader = new PdfReader(pdfStream);
XfaForm xfa = new XfaForm(reader);
Document doc = xfa.getDomDocument();
// Remove all contents of the <data> element
NodeList list = doc.getElementsByTagNameNS("http://www.xfa.org/schema/xfa-data/1.0/", "data");
Node child;
do {
child = list.item(0).getFirstChild();
if (child!=null) {
list.item(0).removeChild(child);
}
} while (child!=null);
// Replace <data> with the new xml to inject.
list.item(0).appendChild(doc.importNode(newdata,true));
ByteArrayOutputStream os = new ByteArrayOutputStream();
PdfStamper stamper = new PdfStamper(reader, os, '\0', false);
xfa.setDomDocument(doc);
xfa.setChanged(true);
XfaForm.setXfa(xfa, stamper.getReader(), stamper.getWriter());
stamper.close();
If I now just finish quickly:
response.getOutputStream().write(os.toByteArray());
I get the good (but unflattened) result.
If I add the flattening stage like this:
com.itextpdf.text.Document document = new com.itextpdf.text.Document();
PdfWriter writer = PdfWriter.getInstance(document, response.getOutputStream());
XFAFlattener xfaf = new XFAFlattener(document, writer);
xfaf.flatten(new PdfReader(os.toByteArray()));
document.close();
The result is flattened but missing the repeated sections.
Can someone tell me what I am doing wrong so the PlanItems don't get rendered in the final PDF?
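This does not address the flattening itself, but the DOM step in the code above (emptying the data element and importing the new root) can be checked in isolation, without iText, to rule it out as the cause. The sketch below uses only the JDK; the class name ReplaceDataChildren and the two tiny XML strings are hypothetical stand-ins for the XFA document and plan_v4.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ReplaceDataChildren {

    public static final String XFA_DATA_NS = "http://www.xfa.org/schema/xfa-data/1.0/";

    // Namespace-aware parse, matching the factory settings in the question.
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    // Removes every child of the first <data> element, then appends a deep
    // copy of the replacement document's root element.
    public static void replaceData(Document target, Document replacement) {
        Node data = target.getElementsByTagNameNS(XFA_DATA_NS, "data").item(0);
        while (data.hasChildNodes()) {
            data.removeChild(data.getFirstChild());
        }
        data.appendChild(target.importNode(replacement.getDocumentElement(), true));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical miniature versions of the real documents.
        Document target = parse(
            "<xfa:datasets xmlns:xfa=\"" + XFA_DATA_NS + "\">"
            + "<xfa:data><old/></xfa:data></xfa:datasets>");
        Document replacement = parse(
            "<form><PlanItems><PlanItem/><PlanItem/></PlanItems></form>");

        replaceData(target, replacement);

        // Count the PlanItem elements that ended up under <data>.
        System.out.println(target.getElementsByTagName("PlanItem").getLength());
    }
}
```

If the count matches the number of PlanItem instances in the input, the injected data is complete and the missing sections are more likely related to the template or the flattening stage than to this DOM manipulation.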

OpenXml ChangeDocumentType

I need to convert a PowerPoint template from potx to pptx. Following http://www.codeproject.com/Tips/366463/Create-PowerPoint-presentation-using-PowerPoint-te I have tried the code below. However, the resulting pptx document is invalid and can't be opened by PowerPoint. If I skip the newdoc.ChangeDocumentType line, the resulting document is valid but not converted to pptx.
templateContentBytes is a byte array containing the content of the potx document,
and tempPath points to its local copy.
using (var stream = new MemoryStream())
{
stream.Write(templateContentBytes, 0, templateContentBytes.Length);
using (var newdoc = PresentationDocument.Open(stream, true))
{
newdoc.ChangeDocumentType(PresentationDocumentType.Presentation);
PresentationPart presentationPart = newdoc.PresentationPart;
presentationPart.PresentationPropertiesPart.AddExternalRelationship(
"http://schemas.openxmlformats.org/officeDocument/2006/" + "relationships/attachedTemplate",
new Uri(tempPath, UriKind.Absolute));
presentationPart.Presentation.Save();
File.WriteAllBytes(tempPathResult, stream.ToArray());
}
}
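I had the same problem. Just move
File.WriteAllBytes(tempPathResult, stream.ToArray());
outside of the using block that opens newdoc, so that the document is closed and its changes are written to the stream before the bytes are copied.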
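The same ordering issue can be seen in a small Java analogy. This is not the Open XML SDK; it only illustrates the general pattern with java.util.zip, on the assumption that the comparison is fair because pptx files are zip packages. The class name CopyAfterClose and the entry content are made up. A writer layered over an in-memory buffer finishes the archive only when it is closed, so bytes copied before the close are an incomplete package.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class CopyAfterClose {

    // Returns two snapshots of the buffer: index 0 taken before the zip
    // writer is closed, index 1 taken after it is closed.
    public static byte[][] writePackage() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] early;
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry("ppt/presentation.xml"));
            zip.write("<presentation/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            // Copied inside the block: the archive's closing records
            // have not been written yet.
            early = buffer.toByteArray();
        }
        // Copied after the block: close() has finished the archive.
        byte[] complete = buffer.toByteArray();
        return new byte[][] { early, complete };
    }

    // Reads the name of the first entry from a zip held in memory.
    public static String firstEntryName(byte[] zipBytes) throws IOException {
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry = in.getNextEntry();
            return entry == null ? null : entry.getName();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[][] snapshots = writePackage();
        System.out.println("before close: " + snapshots[0].length + " bytes");
        System.out.println("after close:  " + snapshots[1].length + " bytes");
        System.out.println("first entry:  " + firstEntryName(snapshots[1]));
    }
}
```

The snapshot taken after the block is longer because closing the writer appends the archive's directory records, which is the part a reader such as PowerPoint needs in order to open the file.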

How can I set XFA data in a static XFA form in iTextSharp and get it to save?

I'm having a very strange issue with XFA Forms in iText / iTextSharp (iTextSharp 5.3.3 via NuGet). I am trying to fill out a static XFA styled form, however my changes are not taking.
I have both editions of iText in Action and have been consulting the second edition as well as the iTextSharp code sample conversions from the book.
Background: I have an XFA Form that I can fill out manually using Adobe Acrobat on my computer. Using iTextSharp I can read what the Xfa XML data is and see the structure of the data. I am essentially trying to mimic that with iText.
What the data looks like when I add data and save in Acrobat (note: this is only the specific section for datasets)
Here is the XML file I am trying to read in to replace the existing data (note: this is the entire contents of that file):
However, when I pass in the path to the replacement XML file and try to set the data, the new file (a copy of the original with the data replaced) is created without any errors being thrown, but the data is not updated. I can see that the new file is created and I can open it, but there is no data in the file.
Here is the code being utilized to replace the data or populate for the first time, which is a variation of http://sourceforge.net/p/itextsharp/code/HEAD/tree/trunk/book/iTextExamplesWeb/iTextExamplesWeb/iTextInAction2Ed/Chapter08/XfaMovie.cs
public void Generate(string sourceFilePath, string destinationtFilePath, string replacementXmlFilePath)
{
PdfReader pdfReader = new PdfReader(sourceFilePath);
using (MemoryStream ms = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(pdfReader, ms))
{
XfaForm xfaForm = new XfaForm(pdfReader);
XmlDocument doc = new XmlDocument();
doc.Load(replacementXmlFilePath);
xfaForm.DomDocument = doc;
xfaForm.Changed = true;
XfaForm.SetXfa(xfaForm, stamper.Reader, stamper.Writer);
}
var bytes = ms.ToArray();
File.WriteAllBytes(destinationtFilePath, bytes);
}
}
Any help would be very much appreciated.
I found the issue. The replacement DomDocument needs to be the entire merged XFA XML of the new document, not just the data or datasets portion.
I upvoted your answer, because it's not incorrect (I'm happy my reference to the demo led you to have another look at your code), but now that I have a second look at your original code, I think it's better to use the book example:
public byte[] ManipulatePdf(String src, String xml) {
PdfReader reader = new PdfReader(src);
using (MemoryStream ms = new MemoryStream()) {
using (PdfStamper stamper = new PdfStamper(reader, ms)) {
AcroFields form = stamper.AcroFields;
XfaForm xfa = form.Xfa;
xfa.FillXfaForm(XmlReader.Create(new StringReader(xml)));
}
return ms.ToArray();
}
}
As you can see, it's not necessary to replace the whole XFA XML. If you use the FillXfaForm method, the data is sufficient.
Note: for the C# version of the examples, see http://tinyurl.com/iiacsCH08 (change the 08 into a number from 01 to 16 for the examples of the other chapters).