How to merge two .pptx files with Apache POI

I want to merge multiple .pptx files. Using POI I can get most of the way there, but there are still some problems: some elements are not copied correctly. I tested several groups of presentations.
Case 1: If the presentation contains only one slide, the result is correct. If it contains multiple slides, an exception is thrown.
Below is the exception stack trace:
java.lang.ClassCastException: org.apache.poi.ooxml.POIXMLDocumentPart cannot be cast to org.apache.poi.xslf.usermodel.XSLFPictureData
at org.apache.poi.xslf.usermodel.XSLFSheet.importBlip(XSLFSheet.java:649)
at org.apache.poi.xslf.usermodel.XSLFPictureShape.copy(XSLFPictureShape.java:378)
at org.apache.poi.xslf.usermodel.XSLFSheet.wipeAndReinitialize(XSLFSheet.java:454)
at org.apache.poi.xslf.usermodel.XSLFSheet.importContent(XSLFSheet.java:433)
at org.apache.poi.xslf.usermodel.XSLFSlide.importContent(XSLFSlide.java:294)
at com.office.MergingMultiplePresentations.main(MergingMultiplePresentations.java:38)
Case 2: I tested another presentation, and when I opened the merged file, it prompted "There is a problem with the content; you can try to repair it." When I clicked Repair, some slides of the presentation were deleted. Is there something that isn't being copied?
Here is my code:
XMLSlideShow ppt = new XMLSlideShow();
// taking the two presentations that are to be merged
String path = "E:\\prj\\test\\";
String file1 = "1.pptx";
String file2 = "2.pptx";
String[] inputs = {file1, file2};
for (String arg : inputs) {
    FileInputStream inputstream = new FileInputStream(path + arg);
    XMLSlideShow src = new XMLSlideShow(inputstream);
    for (XSLFSlide srcSlide : src.getSlides()) {
        try {
            XSLFSlideLayout srcLayout = srcSlide.getSlideLayout();
            XSLFSlideMaster srcMaster = srcSlide.getSlideMaster();
            XSLFSlide slide = ppt.createSlide();
            XSLFSlideLayout layout = slide.getSlideLayout();
            XSLFSlideMaster master = slide.getSlideMaster();
            layout.importContent(srcLayout);
            master.importContent(srcMaster);
            slide.importContent(srcSlide);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
String file3 = "3.pptx";
// creating the output file object
FileOutputStream out = new FileOutputStream(path + file3);
// saving the changes to a file
ppt.write(out);
out.close();
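A likely cause of both symptoms is that every slide created with ppt.createSlide() shares the destination presentation's one default layout and master, so layout.importContent(srcLayout) and master.importContent(srcMaster) overwrite the same objects on every iteration; only the last-imported content survives, which may explain the repair prompt. A common workaround is to import each distinct source master/layout only once and reuse it. The bookkeeping can be sketched with stdlib maps; plain strings stand in for the POI master objects here, so this is illustrative only:

```java
import java.util.HashMap;
import java.util.Map;

public class MasterDedup {
    private final Map<String, String> imported = new HashMap<>();

    /** Import a source master only the first time its name is seen.
     *  With POI, the lambda body would perform the one-time importContent
     *  call and return the imported master (hypothetical mapping here). */
    public String importOnce(String srcMasterName) {
        return imported.computeIfAbsent(srcMasterName, n -> "imported-" + n);
    }

    /** Number of distinct masters actually imported. */
    public int importedCount() {
        return imported.size();
    }

    public static void main(String[] args) {
        MasterDedup dedup = new MasterDedup();
        // three slides, two distinct masters -> only two imports happen
        dedup.importOnce("Office Theme");
        dedup.importOnce("Office Theme");
        dedup.importOnce("Custom Master");
        System.out.println(dedup.importedCount()); // prints 2
    }
}
```

The same computeIfAbsent pattern would key on the source XSLFSlideMaster (or its name) and map it to the master created once in the destination presentation.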

Merging presentations with POI looks a bit cumbersome because you have to take care of the layouts and masters yourself. It's easier to use Aspose.Slides for Java for this. The following code example shows how to merge presentations using that library; slide layouts and slide masters are merged automatically.
String file1 = "1.pptx";
String file2 = "2.pptx";
String[] inputs = {file1, file2};
// Prepare a new empty presentation.
Presentation ppt = new Presentation();
ppt.getSlides().removeAt(0); // removes the first empty slide
ppt.getSlideSize().setSize(SlideSizeType.Widescreen, SlideSizeScaleType.Maximize);
// Merge the input presentations.
for (String file : inputs) {
    Presentation source = new Presentation(file);
    for (ISlide slide : source.getSlides()) {
        ppt.getSlides().addClone(slide);
    }
    source.dispose();
}
ppt.save("3.pptx", SaveFormat.Pptx);
ppt.dispose();
This is a paid product, but you can get a temporary license to try it out.
Alternatively, you could use Aspose.Slides Cloud SDK for Java. This product provides a REST-based API that allows you to make 150 free API calls per month for API learning and presentation processing. The following code example shows you how to do the same using Aspose.Slides Cloud:
SlidesApi slidesApi = new SlidesApi("my_client_id", "my_client_secret");
String file1 = "1.pptx";
String file2 = "2.pptx";
String outFile = "3.pptx";
// Prepare a new empty presentation.
slidesApi.createPresentation(outFile, null, null, null, null, null);
slidesApi.deleteSlide(outFile, 1, null, null, null); // removes the first empty slide
SlideProperties slideProperties = new SlideProperties();
slideProperties.setSizeType(SlideProperties.SizeTypeEnum.WIDESCREEN);
slideProperties.setScaleType(SlideProperties.ScaleTypeEnum.MAXIMIZE);
slidesApi.setSlideProperties(outFile, slideProperties, null, null, null);
// Merge the input presentations.
PresentationsMergeRequest mergeRequest = new PresentationsMergeRequest();
mergeRequest.setPresentationPaths(Arrays.asList(file1, file2));
slidesApi.merge(outFile, mergeRequest, null, null, null);
Sometimes it is necessary to merge presentations without any code. For such cases, you can use the free Aspose Online Merger.
I work as a Support Developer at Aspose.

Related

CsvReader not loading lines starting with #

I'm trying to load a text file (.csv) into a SQL Server database table. Each line in the file is supposed to be loaded into a single column in the table. I find that lines starting with "#" are skipped, with no error. For example, the first two of the following four lines are loaded fine, but the last two are not. Does anybody know why?
ThisLineShouldBeLoaded
This one as well
#ThisIsATestLine
#This is another test line
Here's the segment of my code:
var sqlConn = connection.StoreConnection as SqlConnection;
sqlConn.Open();
CsvReader reader = new CsvReader(new StreamReader(f), false);
using (var bulkCopy = new SqlBulkCopy(sqlConn))
{
    bulkCopy.DestinationTableName = "dbo.TestTable";
    try
    {
        reader.SkipEmptyLines = true;
        bulkCopy.BulkCopyTimeout = 300;
        bulkCopy.WriteToServer(reader);
        reader.Dispose();
        reader = null;
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
        System.Diagnostics.Debug.WriteLine(ex.Message);
        throw;
    }
}
# is the default comment character for CsvReader. You can change the comment character through the Comment property of the Configuration object, or disable comment processing altogether by setting the AllowComments property to false, e.g.:
reader.Configuration.AllowComments = false;
SqlBulkCopy doesn't deal with CSV files at all; it sends any data that's passed to WriteToServer to the database. It doesn't care where the data came from or what it contains, as long as the column mappings match.
Update
Assuming LumenWorks.Framework.IO.Csv refers to this project, the comment character can be specified in the constructor. One could set it to something that wouldn't appear in a normal file, perhaps even the NUL character, the default char value:
CsvReader reader = new CsvReader(new StreamReader(f), false, comment: default);
or
CsvReader reader = new CsvReader(new StreamReader(f), false, comment: '\0');

Chapters in iText 7

I'm looking to create a pdf file with chapters and sub chapters with iText 7. I've found examples for previous versions of iText using the Chapter class. However this class does not seem to be included in iText 7.
How is that functionality implemented in iText7?
The Chapter and Section classes in iText 5 were problematic. Already with iText 5, we advised people to use PdfOutline.
For an example on how to create chapters, and more specifically, the corresponding outlines in the bookmarks panel, please take a look at the iText 7: Building Blocks tutorial. This tutorial has a recurring theme: the novel "The Strange Case of Dr. Jekyll and Mr. Hyde."
We use that text and a database with movies based on this novel to explain how iText 7 works. If you don't have the time to read it, please jump to Chapter 6.
In this chapter, we create a document whose chapter titles also appear as outlines in the bookmarks panel.
You can download the full sample code here: TOC_OutlinesDestinations
BufferedReader br = new BufferedReader(new FileReader(SRC));
String name, line;
Paragraph p;
boolean title = true;
int counter = 0;
PdfOutline outline = null;
while ((line = br.readLine()) != null) {
    p = new Paragraph(line);
    p.setKeepTogether(true);
    if (title) {
        name = String.format("title%02d", counter++);
        outline = createOutline(outline, pdf, line, name);
        p.setFont(bold).setFontSize(12)
         .setKeepWithNext(true)
         .setDestination(name);
        title = false;
        document.add(p);
    } else {
        p.setFirstLineIndent(36);
        if (line.isEmpty()) {
            p.setMarginBottom(12);
            title = true;
        } else {
            p.setMarginBottom(0);
        }
        document.add(p);
    }
}
In this example, we loop over a text file that contains titles and chapters. Every time we encounter a title, we create a name (title00, title01, and so on) and use that name as the named destination of the title paragraph: setDestination(name).
We create the outlines using the PdfOutline object, for which we define a named destination like this: PdfDestination.makeDestination(new PdfString(name))
public PdfOutline createOutline(PdfOutline outline, PdfDocument pdf, String title, String name) {
    if (outline == null) {
        outline = pdf.getOutlines(false);
        outline = outline.addOutline(title);
        outline.addDestination(PdfDestination.makeDestination(new PdfString(name)));
        return outline;
    }
    PdfOutline kid = outline.addOutline(title);
    kid.addDestination(PdfDestination.makeDestination(new PdfString(name)));
    return outline;
}
There are other ways to achieve this result, but using named destinations is the simplest. Just try the example; you'll discover that most of its complexity comes from turning a simple text file into a document with chapter titles and chapter content.
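The only non-iText logic in the loop above is the title detection: the first line, and every line that follows a blank line, is treated as a chapter title and gets a generated destination name. That convention can be isolated and checked with stdlib code alone; the iText calls are omitted and only the %02d naming from the example is reused:

```java
import java.util.ArrayList;
import java.util.List;

public class TitleScanner {
    /** Applies the example's convention: the first line is a chapter title,
     *  and so is every line that follows a blank line. Returns the
     *  destination names that would be generated for the titles. */
    public static List<String> destinationNames(String... lines) {
        List<String> names = new ArrayList<>();
        boolean title = true;   // the first line is always a title
        int counter = 0;
        for (String line : lines) {
            if (title) {
                names.add(String.format("title%02d", counter++));
                title = false;  // subsequent lines are chapter body
            } else if (line.isEmpty()) {
                title = true;   // a blank line means the next line starts a new chapter
            }
        }
        return names;
    }
}
```

For example, destinationNames("CHAPTER I", "text", "", "CHAPTER II", "more") yields [title00, title01], matching the names the loop would pass to setDestination and createOutline.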

How to edit pasted content using the Open XML SDK

I have a custom template in which I'd like to control (as best I can) the types of content that can exist in a document. To that end, I disable controls, and I also intercept pastes to remove some of those content types, e.g. charts. I am aware that this content can also be drag-and-dropped, so I also check for it later, but I'd prefer to stop or warn the user as soon as possible.
I have tried a few strategies:
RTF manipulation
Open XML manipulation
RTF manipulation is so far working fairly well, but I'd really prefer to use Open XML as I expect it to be more useful in the future. I just can't get it working.
Open XML Manipulation
The wonderfully-undocumented (as far as I can tell) "Embed Source" appears to contain a compound document object, which I can use to modify the copied content using the Open XML SDK. But I have been unable to put the modified content back into an object that lets it be pasted correctly.
The modification part seems to work fine. I can see, if I save the modified content to a temporary .docx file, that the changes are being made correctly. It's the return to the clipboard that seems to be giving me trouble.
I have tried assigning just the Embed Source object back to the clipboard (so that the other types such as RTF get wiped out), and in this case nothing at all gets pasted. I've also tried re-assigning the Embed Source object back to the clipboard's data object, so that the remaining data types are still there (but with mismatched content, probably), which results in an empty embedded document getting pasted.
Here's a sample of what I'm doing with Open XML:
using OpenMcdf;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
...
object dataObj = Forms.Clipboard.GetDataObject();
object embedSrcObj = dataObj.GetData("Embed Source");
if (embedSrcObj is Stream)
{
    // read it with OpenMCDF
    Stream stream = embedSrcObj as Stream;
    CompoundFile cf = new CompoundFile(stream);
    CFStream cfs = cf.RootStorage.GetStream("package");
    byte[] bytes = cfs.GetData();
    string savedDoc = Path.GetTempFileName() + ".docx";
    File.WriteAllBytes(savedDoc, bytes);

    // And then use the Open XML SDK to read/edit the document:
    using (WordprocessingDocument openDoc = WordprocessingDocument.Open(savedDoc, true))
    {
        OpenXmlElement body = openDoc.MainDocumentPart.RootElement.ChildElements[0];
        foreach (OpenXmlElement ele in body.ChildElements)
        {
            if (ele is Paragraph)
            {
                Paragraph para = (Paragraph)ele;
                if (para.ParagraphProperties != null && para.ParagraphProperties.ParagraphStyleId != null)
                {
                    string styleName = para.ParagraphProperties.ParagraphStyleId.Val;
                    Run run = para.LastChild as Run; // I know I'm assuming things here but it's sufficient for a test case
                    run.RunProperties = new RunProperties();
                    run.RunProperties.AppendChild(new DocumentFormat.OpenXml.Wordprocessing.Text("test"));
                }
            }
            // etc.
        }
        openDoc.MainDocumentPart.Document.Save(); // I think this is redundant in later versions than what I'm using
    }

    // repackage the document
    bytes = File.ReadAllBytes(savedDoc);
    cf.RootStorage.Delete("Package");
    cfs = cf.RootStorage.AddStream("Package");
    cfs.Append(bytes);
    MemoryStream ms = new MemoryStream();
    cf.Save(ms);
    ms.Position = 0;
    dataObj.SetData("Embed Source", ms);
    // or,
    // Clipboard.SetData("Embed Source", ms);
}
Question
What am I doing wrong? Is this just a bad/unworkable approach?

How to store and compare annotation (with Gold Standard) in GATE

I am very comfortable with UIMA, but my new work requires me to use GATE, so I started learning it. My question is about how to calculate the performance of my (Java-based) tagging engines.
With UIMA, I generally dump all my system annotations into an XMI file and then use Java code to compare them with human-annotated (gold standard) annotations to calculate precision/recall and F-score. I am still struggling to find something similar in GATE.
After going through GATE's Annotation-Diff and other info on that page, I feel there has to be an easy way to do it in Java, but I am not able to figure out how. I thought I'd put the question here, as someone might have already figured this out.
1. How do I store system annotations in XMI, or any other file format, programmatically?
2. How do I create one-time gold standard data (i.e. human-annotated data) for performance calculation?
Let me know if you need more specifics or details.
This code seems helpful for writing the annotations to an XML file:
http://gate.ac.uk/wiki/code-repository/src/sheffield/examples/BatchProcessApp.java
String docXMLString = null;
// if we want to just write out specific annotation types, we must
// extract the annotations into a Set
if (annotTypesToWrite != null) {
    // Create a temporary Set to hold the annotations we wish to write out
    Set annotationsToWrite = new HashSet();
    // we only extract annotations from the default (unnamed) AnnotationSet
    // in this example
    AnnotationSet defaultAnnots = doc.getAnnotations();
    Iterator annotTypesIt = annotTypesToWrite.iterator();
    while (annotTypesIt.hasNext()) {
        // extract all the annotations of each requested type and add them to
        // the temporary set
        AnnotationSet annotsOfThisType =
            defaultAnnots.get((String) annotTypesIt.next());
        if (annotsOfThisType != null) {
            annotationsToWrite.addAll(annotsOfThisType);
        }
    }
    // create the XML string using these annotations
    docXMLString = doc.toXml(annotationsToWrite);
}
// otherwise, just write out the whole document as GateXML
else {
    docXMLString = doc.toXml();
}
// Release the document, as it is no longer needed
Factory.deleteResource(doc);
// output the XML to <inputFile>.out.xml
String outputFileName = docFile.getName() + ".out.xml";
File outputFile = new File(docFile.getParentFile(), outputFileName);
// Write output files using the same encoding as the original
FileOutputStream fos = new FileOutputStream(outputFile);
BufferedOutputStream bos = new BufferedOutputStream(fos);
OutputStreamWriter out;
if (encoding == null) {
    out = new OutputStreamWriter(bos);
} else {
    out = new OutputStreamWriter(bos, encoding);
}
out.write(docXMLString);
out.close();
System.out.println("done");
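For the scoring step the question asks about, once both the system and the gold annotations are available as (start, end, type) triples, exact-match precision/recall/F1 is just set arithmetic. This is a minimal stdlib sketch, not GATE's own AnnotationDiffer; the Span record and the exact boundary-plus-type match criterion are assumptions made for illustration:

```java
import java.util.HashSet;
import java.util.Set;

public class AnnotationScorer {
    /** An annotation reduced to what exact-match scoring needs;
     *  this record is an assumption, not a GATE class. */
    public record Span(int start, int end, String type) {}

    /** Returns {precision, recall, f1} for system vs. gold annotations,
     *  counting a system span as correct only on an exact boundary and type match. */
    public static double[] score(Span[] system, Span[] gold) {
        Set<Span> goldSet = new HashSet<>();
        for (Span s : gold) goldSet.add(s);
        int correct = 0;                     // true positives
        for (Span s : system) if (goldSet.contains(s)) correct++;
        double p = system.length == 0 ? 0 : (double) correct / system.length;
        double r = gold.length == 0 ? 0 : (double) correct / gold.length;
        double f = (p + r == 0) ? 0 : 2 * p * r / (p + r);
        return new double[] { p, r, f };
    }
}
```

With real GATE documents you would fill the two arrays from AnnotationSet contents (start/end offsets and type), or use GATE's AnnotationDiffer, which additionally handles partial matches.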

merge word documents to a single document

I used the code in the link below to merge Word files into a single file:
http://devpinoy.org/blogs/keithrull/archive/2007/06/09/updated-how-to-merge-multiple-microsoft-word-documents.aspx
However, looking at the output file, I realized that it was unable to copy the header image from the first document. How do we merge documents while preserving format and content?
I suggest using GroupDocs.Merger Cloud for merging multiple Word documents into a single document; it keeps the formatting and contents of the source documents. It is a platform-independent REST API solution that does not depend on any third-party tool or software.
Sample C# code:
var configuration = new GroupDocs.Merger.Cloud.Sdk.Client.Configuration(MyAppSid, MyAppKey);
var apiInstance_Document = new GroupDocs.Merger.Cloud.Sdk.Api.DocumentApi(configuration);
var apiInstance_File = new GroupDocs.Merger.Cloud.Sdk.Api.FileApi(configuration);
var pathToSourceFiles = @"C:/Temp/input/";
var remoteFolder = "Temp/";
var joinItem_list = new List<JoinItem>();
try
{
    DirectoryInfo dir = new DirectoryInfo(pathToSourceFiles);
    System.IO.FileInfo[] files = dir.GetFiles();
    foreach (System.IO.FileInfo file in files)
    {
        var request_upload = new GroupDocs.Merger.Cloud.Sdk.Model.Requests.UploadFileRequest(remoteFolder + file.Name, File.Open(file.FullName, FileMode.Open));
        var response_upload = apiInstance_File.UploadFile(request_upload);
        var item = new JoinItem
        {
            FileInfo = new GroupDocs.Merger.Cloud.Sdk.Model.FileInfo { FilePath = remoteFolder + file.Name }
        };
        joinItem_list.Add(item);
    }
    var options = new JoinOptions
    {
        JoinItems = joinItem_list,
        OutputPath = remoteFolder + "Merged_Document.docx"
    };
    var request = new JoinRequest(options);
    var response = apiInstance_Document.Join(request);
    Console.WriteLine("Output file path: " + response.Path);
}
catch (Exception e)
{
    Console.WriteLine("Exception while Merging Documents: " + e.Message);
}
That code is inserting a page break after each file.
Since sections control headers, if a second or subsequent document has a header, you'll probably want to keep the original section properties and insert them after your first document.
If you look inside your original document as a docx, you'll probably see that your section is a document-level section properties (sectPr) element.
The easiest way around your problem may be to create a second section properties element inside the last paragraph (which contains the header information). Then this should just stay there when the documents are merged (i.e. other paragraphs are added after it).
That's the theory; see also http://www.pcreview.co.uk/forums/thread-898133.php
But I haven't tried it, and it assumes InsertFile behaves as I expect it should.