Referencing other documents' content - ms-word

We need to create a matrix from 2 other documents' contents. For example:
doc1 has fields like:
4.2 Requirements A
Blah
doc2 has fields like:
2.1 Analysis A
Blah Blah
and we want to create another document (called Traceability Matrix) which is like:
Col1 Col2 Col3
4.2 2.1 Blah Blah Blah
4.2 and 2.1 should be dynamically updated in doc3.
We tried hyperlinks and cross-references, but neither seems to work across different documents. Is there any way to do this?
EDIT:
Here is an example:
Technical Specification Num Requirement Num Requirement
4.2 2.1 A sentence that explains the relationship between the two columns: Technical Specification Num and Requirement Num

I have now created a working example of how this can be implemented using MS Word Interop and C#.
The code contains comments that should explain the most interesting parts.
The sample is implemented as a C# console application using:
.NET 4.5
Microsoft Office Object Library version 15.0, and
Microsoft Word Object Library version 15.0
... that is, the MS Word Interop API that ships with MS Office 2013 Preview.
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Office.Interop.Word;
using Application = Microsoft.Office.Interop.Word.Application;
namespace WordDocStats
{
internal class Program
{
private static void Main()
{
// Open word
var wordApplication = new Application() { Visible = true };
// Open document A, get its headings, and close it again
var documentA = wordApplication.Documents.Open(@"C:\Users\MyUserName\Documents\documentA.docx", Visible: true);
var headingsA = GetHeadingsInDocument(documentA);
documentA.Close();
// Same procedure for document B
var documentB = wordApplication.Documents.Open(@"C:\Users\MyUserName\Documents\documentB.docx", Visible: true);
var headingsB = GetHeadingsInDocument(documentB);
documentB.Close();
// Open the target document (document C)
var documentC = wordApplication.Documents.Open(@"C:\Users\MyUserName\Documents\documentC.docx", Visible: true);
// Add a table to it (the traceability matrix)
// The number of rows is the number of headings + one row reserved for a table header
documentC.Tables.Add(documentC.Range(0, 0), headingsA.Count+1, 3);
// Get the traceability matrix
var traceabilityMatrix = documentC.Tables[1];
// Add a table header and border
AddTableHeaderAndBorder(traceabilityMatrix, "Headings from document A", "Headings from document B", "My Description");
// Insert headings from doc A and doc B into doc C's traceability matrix
for (var i = 0; i < headingsA.Count; i++)
{
// Insert headings from doc A
var insertRangeColOne = traceabilityMatrix.Cell(i + 2, 1).Range;
insertRangeColOne.Text = headingsA[i].Trim();
// Insert headings from doc B
var insertRangeColTwo = traceabilityMatrix.Cell(i + 2, 2).Range;
insertRangeColTwo.Text = headingsB[i].Trim();
}
documentC.Save();
documentC.Close();
wordApplication.Quit();
}
// Based on:
// -> http://csharpfeeds.com/post/5048/Csharp_and_Word_Interop_Part_4_-_Tables.aspx
// -> http://stackoverflow.com/a/1817041/700926
private static void AddTableHeaderAndBorder(Table table, params string[] columnTitles)
{
const int headerRowIndex = 1;
for (var i = 0; i < columnTitles.Length; i++)
{
var tableHeaderRange = table.Cell(headerRowIndex, i+1).Range;
tableHeaderRange.Text = columnTitles[i];
tableHeaderRange.Font.Bold = 1;
tableHeaderRange.Font.Italic = 1;
}
// Repeat header on each page
table.Rows[headerRowIndex].HeadingFormat = -1;
// Enable borders
table.Borders.Enable = 1;
}
// Based on:
// -> http://stackoverflow.com/q/7084270/700926
// -> http://stackoverflow.com/a/7084442/700926
private static List<string> GetHeadingsInDocument(Document document)
{
object headingsAtmp = document.GetCrossReferenceItems(WdReferenceType.wdRefTypeHeading);
return ((Array)(headingsAtmp)).Cast<string>().ToList();
}
}
}
Basically, the code first loads all headings from the two given documents and stores them in memory. Then it opens the target document, creates and styles the traceability matrix, and finally, it inserts the headings into the matrix.
The code is based on the assumptions that:
A target document (documentC.docx) exists.
The two input documents (documentA.docx and documentB.docx) contain the same number of headings. This assumption is based on your comment about not wanting a Cartesian product.
I hope this meets your requirements :)
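The question asks for only the section numbers (4.2, 2.1) in the matrix, whereas the sample above inserts the full heading text. A minimal sketch of one way to bridge that gap follows. It assumes the heading strings look like "4.2 Requirements A", as in the question. HeadingNumbers, Extract and Pair are hypothetical names that are not part of the original sample, and the code uses only the .NET base class library, not Word Interop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original sample.
public static class HeadingNumbers
{
    // Matches a leading outline number such as "4", "4.2" or "4.2.1".
    private static readonly Regex LeadingNumber =
        new Regex(@"^\s*(\d+(\.\d+)*)", RegexOptions.Compiled);

    // Returns the outline number at the start of a heading,
    // or the trimmed heading itself when no number is found.
    public static string Extract(string heading)
    {
        var match = LeadingNumber.Match(heading);
        return match.Success ? match.Groups[1].Value : heading.Trim();
    }

    // Pairs headings by position and stops at the shorter list,
    // so a mismatch in heading counts cannot cause an index error.
    public static List<Tuple<string, string>> Pair(IList<string> headingsA, IList<string> headingsB)
    {
        return headingsA
            .Zip(headingsB, (a, b) => Tuple.Create(Extract(a), Extract(b)))
            .ToList();
    }
}
```

In the loop that fills the matrix, the Item1 and Item2 values of each pair could then be written into columns one and two in place of headingsA[i].Trim() and headingsB[i].Trim().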


Dynamic number of columns bound to a UWP datagrid (Windows Community Toolkit)

Has anyone a sample project with a dynamic number of columns bound to a datagrid in UWP? In WPF I can get it to work with an observable collection of dynamic objects with Telerik Datagrid. But in UWP Telerik does not support dynamic objects. I have tried with the Windows Community Toolkit datagrid but failed with it too.
This builds on Samed Bejtovic's reply. A dynamic number of columns can be created in code-behind with the Windows Community Toolkit DataGrid. Before filling the DataGrid, convert the collection to a DataTable. For example, the following code loads a CSV file and inserts its data into a DataTable.
var dt = new DataTable();
bool firstLine = true;
var sr = new StreamReader("Assets\\Archive.csv");
while (sr.Peek() >= 0)
{
if (firstLine)
{
firstLine = false;
var cols = sr.ReadLine().Split(',');
foreach (string col in cols)
dt.Columns.Add(new DataColumn(col, typeof(string)));
}
else
{
var data = sr.ReadLine().Split(',');
dt.Rows.Add(data);
}
}
Then we call FillDataGrid(dt, MyDataGrid), which adds the columns to the DataGrid based on the DataTable above.
public static void FillDataGrid(DataTable table, DataGrid grid)
{
grid.Columns.Clear();
for (int i = 0; i < table.Columns.Count; i++)
{
grid.Columns.Add(new DataGridTextColumn()
{
Header = table.Columns[i].ColumnName,
Binding = new Binding { Path = new PropertyPath("[" + i.ToString() + "]") }
});
}
var collection = new ObservableCollection<object>();
foreach (DataRow row in table.Rows)
{
collection.Add(row.ItemArray);
}
grid.ItemsSource = collection;
}
DataCollection = new ObservableCollection<dynamic>();
while (reader.Read())
{
string project = reader.GetString(0);
decimal number = reader.GetDecimal(1);
DataCollection.Add(new DataSummary(project, number));
}
Datagrid supports auto generated columns. Set the items source to the collection.
https://www.reddit.com/r/UWP/comments/djzcqj/display_datatable_as_datagrid/?utm_source=share&utm_medium=web2x

Concatenate multiple PDF/A with different conformance levels

Is it possible to concatenate a number of PDF/A documents (with possibly different conformance levels: some PDF/A-1b, some PDF/A-3b, etc.) into a single PDF/A?
I was thinking that using the latest level (3a or 3b) would be OK, but I get errors when validating with VeraPDF:
Here is my code:
public static byte[] CreateConformantCopy(List<byte[]> sourcePdfs)
{
var version = PdfVersion.PDF_1_7;
var type = PdfAType.PDF_A_3B;
WriterProperties wp = new WriterProperties();
wp.UseSmartMode();
wp.SetPdfVersion(version.ToPdfVersion());
PdfOutputIntent oi = new PdfOutputIntent("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", Assembly.GetExecutingAssembly().GetManifestResourceStream("xxx.Resources.sRGB_CS_profile.icm"));
using (var mergedPdf = new MemoryStream())
{
var writer = new PdfWriter(mergedPdf, wp);
using (PdfADocument newDoc = new PdfADocument(writer, type.ToPdfAConformanceLevel(), oi, new DocumentProperties() { }))
{
Document document = new Document(newDoc, PageSize.A4.Rotate());
newDoc.SetTagged();
newDoc.GetCatalog().SetLang(new PdfString(Thread.CurrentThread.CurrentUICulture.Name));
newDoc.GetCatalog().SetViewerPreferences(
new PdfViewerPreferences()
.SetDisplayDocTitle(true)
.SetCenterWindow(true)
);
PdfMerger merger = new PdfMerger(newDoc);
for (int k = 0; k < sourcePdfs.Count; k++)
{
using (var inDoc = PdfHelper.GetDocument(sourcePdfs[k]))
{
var numberOfPages = inDoc.GetNumberOfPages();
merger.Merge(inDoc, 1, numberOfPages);
}
}
newDoc.Close();
}
return mergedPdf.ToArray();
}
}
PDF/A-1 and PDF/A-2 differ in several of their requirements, so merging them might not be possible. Looking at your validation errors, I think this is exactly the case. For example, the very first one is about XMP metadata. PDF/A-2 is stricter here, and you get this error because your first file (which is probably a valid PDF/A-1) does not actually satisfy the PDF/A-2 rules.
What is possible, however, is to attach a PDF/A-1 document to a PDF/A-2 one. This does not even require PDF/A-3, which allows arbitrary attachments: the PDF/A-2 standard already allows attaching valid PDF/A-1 (as well as PDF/A-2) documents.

Sort/Order an Undetermined Number of Columns (LINQ\Entity Framework)

I need to sort a list of data by an undetermined number of columns (one or more).
What I'm trying to do is loop through the desired columns and add an OrderBy or ThenBy to the queried list depending on each column's position, but without success.
I tried this, but it doesn't compile:
var query = GetAllItems(); //returns an IQueryable list of items
//for each selected column
for (int i = 0; i < param.Columns.Length; i++)
{
if (i == 0)
{
query = query.OrderBy(x => x.GetType().GetProperty(param.Columns[i].Name));
}
else
{
//ERROR: IQueryable does not contain a definition for "ThenBy" and no extension method "ThenBy"...
query = query.ThenBy(x => x.GetType().GetProperty(param.Columns[i].Data));
}
}
How can I resolve this issue? Or is there an alternative way to meet this requirement?
SOLUTION: @Dave-Kidder's solution is well thought out and resolves the compile errors I had. There is just one problem: OrderBy only executes (actually sorts the results) after a ToList() call. This is an issue because I can't convert the resulting List back to an IOrderedQueryable.
So, after some research, I came across a solution that resolves all my issues.
Microsoft assembly for the .Net 4.0 Dynamic language functionality: https://github.com/kahanu/System.Linq.Dynamic
using System.Linq.Dynamic; //need to install this package
Updated Code:
var query = GetAllItems(); //returns an IQueryable list of items
List<string> orderByColumnList = new List<string>(); //list of columns to sort
for (int i = 0; i < param.Columns.Length; i++)
{
    string column = param.Columns[i].Name;
    string direction = param.Columns[i].Dir;
    //ex.: "columnA ASC"
    string orderByColumn = column + " " + direction;
    //add column to list
    orderByColumnList.Add(orderByColumn);
}
//convert list to comma delimited string
string orderBy = String.Join(",", orderByColumnList.ToArray());
//sort by all columns and keep the sorted result, yay! :-D
var sortedItems = query.OrderBy(orderBy).ToList();
The problem is that ThenBy is not defined on IQueryable, but on the IOrderedQueryable interface (which is what IQueryable.OrderBy returns). So you need to define a new variable for the IOrderedQueryable in order to do subsequent ThenBy calls. I changed the original code a bit to use System.Data.DataTable (to get a similar structure to your "param" object). The code also assumes that there is at least one column in the DataTable.
// using System.Data.DataTable to provide similar object structure as OP
DataTable param = new DataTable();
IQueryable<DataTable> query = new List<DataTable>().AsQueryable();
// OrderBy returns IOrderedQueryable<TSource>, which is the interface that defines
// "ThenBy" so we need to assign it to a different variable if we wish to make subsequent
// calls to ThenBy
// GetValue reads the property's value; sorting on the PropertyInfo itself would fail at run time
string firstColumn = param.Columns[0].ColumnName;
var orderedQuery = query.OrderBy(x => x.GetType().GetProperty(firstColumn).GetValue(x, null));
//for each other selected column
for (int i = 1; i < param.Columns.Count; i++)
{
    // copy the name to a local so the deferred lambda does not capture the loop variable
    string columnName = param.Columns[i].ColumnName;
    orderedQuery = orderedQuery.ThenBy(x => x.GetType().GetProperty(columnName).GetValue(x, null));
}
You should write ThenBy after OrderBy, like this:
query = query
    .OrderBy(t => /* your condition */)
    .ThenBy(t => /* next condition */);
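For in-memory sequences, the OrderBy-then-ThenBy pattern described in the answers above can be sketched with reflection as follows. This is only an illustration under stated assumptions: DynamicSort and SortBy are hypothetical names, the columns are assumed to map to public properties of the item type, and the code targets LINQ to Objects. A query provider such as Entity Framework may not be able to translate reflection calls, which is one reason the System.Linq.Dynamic approach above can be the better fit for database queries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for in-memory sequences (LINQ to Objects).
public static class DynamicSort
{
    // Sorts by each named public property in turn.
    // descending[i] flips the direction used for propertyNames[i].
    public static IEnumerable<T> SortBy<T>(
        IEnumerable<T> source, IList<string> propertyNames, IList<bool> descending)
    {
        IOrderedEnumerable<T> ordered = null;
        for (int i = 0; i < propertyNames.Count; i++)
        {
            // Resolve the property once per column; the lambda captures
            // these locals rather than the loop variable.
            var property = typeof(T).GetProperty(propertyNames[i]);
            if (property == null)
                throw new ArgumentException("Unknown property: " + propertyNames[i]);
            Func<T, object> key = item => property.GetValue(item, null);
            bool desc = descending[i];

            // The first column starts the ordering; later columns refine it.
            if (ordered == null)
                ordered = desc ? source.OrderByDescending(key) : source.OrderBy(key);
            else
                ordered = desc ? ordered.ThenByDescending(key) : ordered.ThenBy(key);
        }

        // With no columns requested, the input is returned unchanged.
        if (ordered == null)
            return source;
        return ordered;
    }
}
```

A call such as DynamicSort.SortBy(items, new[] { "Name", "Rank" }, new[] { false, true }) would sort by Name ascending and then by Rank descending. The values of each property need to be mutually comparable (for example all strings or all integers).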

Support for basic datatypes in H5Attributes?

I am trying out the beta HDF5 toolkit of ILNumerics.
Currently I see that H5Attributes support only ILNumerics arrays. Is there any plan to extend them to basic datatypes (such as string) as part of the final release?
Do the ILNumerics H5 wrappers provide a way to extend functionality to a particular datatype?
ILNumerics internally uses the official HDF5 libraries from the HDF Group, of course. H5Attributes in HDF5 correspond to datasets with the limitation of being not capable of partial I/O. Besides that, H5Attributes are plain arrays! Support for basic (scalar) element types is given by assuming the array stored to be scalar.
Strings are a completely different story: strings in general are variable-length datatypes. In terms of HDF5, strings are arrays of element type Char. The number of characters in the string determines the length of the array. In order to store a string into a dataset or attribute, you will have to store its individual characters as elements of the array. In ILNumerics, you can convert your string into ILArray<Char> or ILArray<Byte> (for ASCII data) and store that into the dataset/attribute.
Please consult the following test case which stores a string as value into an attribute and reads the content back into a string.
Disclaimer: This is part of our internal test suite. You will not be able to compile the example directly, since it depends on several functions which may not be available. However, you will be able to understand how to store strings into datasets and attributes:
public void StringASCIAttribute() {
string file = "deleteA0001.h5";
string val = "This is a long string to be stored into an attribute.\r\n";
// transfer string into ILArray<Char>
ILArray<Char> A = ILMath.array<Char>(' ', 1, val.Length);
for (int i = 0; i < val.Length; i++) {
A.SetValue(val[i], 0, i);
}
// store the string as attribute of a group
using (var f = new H5File(file)) {
f.Add(new H5Group("grp1") {
Attributes = {
{ "title", A }
}
});
}
// check by reading back
// read back
using (var f = new H5File(file)) {
// must exist in the file
Assert.IsTrue(f.Get<H5Group>("grp1").Attributes.ContainsKey("title"));
// check size
var attr = f.Get<H5Group>("grp1").Attributes["title"];
Assert.IsTrue(attr.Size == ILMath.size(1, val.Length));
// read back
ILArray<Char> titleChar = attr.Get<Char>();
ILArray<byte> titleByte = attr.Get<byte>();
// compare byte values (sum)
int origsum = 0;
foreach (var c in val) origsum += (Byte)c;
Assert.IsTrue(ILMath.sumall(ILMath.toint32(titleByte)) == origsum);
StringBuilder title = new StringBuilder(attr.Size[1]);
for (int i = 0; i < titleChar.Length; i++) {
title.Append(titleChar.GetValue(i));
}
Assert.IsTrue(title.ToString() == val);
}
}
This stores arbitrary strings as 'Char-array' into HDF5 attributes and would work just the same for H5Dataset.
As an alternative solution you may use HDF5DotNet (http://hdf5.net/default.aspx) wrapper to write attributes as strings:
H5.open();
Uri destination = new Uri(@"C:\yourFileLocation\FileName.h5");
//Create an HDF5 file
H5FileId fileId = H5F.create(destination.LocalPath, H5F.CreateMode.ACC_TRUNC);
//Add a group to the file
H5GroupId groupId = H5G.create(fileId, "groupName");
string myString = "String attribute";
byte[] attrData = Encoding.ASCII.GetBytes(myString);
//Create an attribute of type STRING attached to the group
H5AttributeId attributeId = H5A.create(groupId, "attributeName", H5T.create(H5T.CreateClass.STRING, attrData.Length),
    H5S.create(H5S.H5SClass.SCALAR));
//Write the string into the attribute
H5A.write(attributeId, H5T.create(H5T.CreateClass.STRING, attrData.Length), new H5Array<byte>(attrData));
//Release the handles in reverse order of creation
H5A.close(attributeId);
H5G.close(groupId);
H5F.close(fileId);
H5.close();

Export MS Word Document pages to Images

I want to export the pages of an MS Word (docx/doc) document to images (jpeg/png).
I am doing the same for presentations (pptx/ppt) using the Office Interop export API for each slide, but I didn't find a corresponding API for Word.
I need suggestions for an API or an alternative approach to achieve this.
Based on this similar question: "Saving a word document as an image" you could do something like this:
const string basePath = @"C:\Users\SomeUser\SomePath\";
var docPath = Path.Combine(basePath, "documentA.docx");
var app = new Application()
{
Visible = true
};
var doc = app.Documents.Open(docPath);
foreach (Window window in doc.Windows)
{
foreach (Pane pane in window.Panes)
{
for (var i = 1; i <= pane.Pages.Count; i++)
{
var page = pane.Pages[i];
var bits = page.EnhMetaFileBits;
var target = Path.Combine(basePath, string.Format("page-no-{0}", i));
using (var ms = new MemoryStream(bits))
{
var image = Image.FromStream(ms);
var pngTarget = Path.ChangeExtension(target, "png");
image.Save(pngTarget, ImageFormat.Png);
}
}
}
}
app.Quit();
Basically, I'm using the Page.EnhMetaFileBits property which, according to the documentation:
Returns a Object that represents a picture representation of how a
page of text appears.
... and based on that, I create an image and save it to the disk.