How to find a word in multiple files - find

Regards for all of you. I have a question about Borland Builder code.
I have Table1 with DataSource conected on paradox.DB (inside that base are topics of my eBox saved in some folder).
Now I am using EditBox and Button to execute filter in DBGrid and I wanna make search to find a word (from EditBox) in multiple filles whereever is thtat word mentioned in my eBox colection.
All my eBox are in pdf format.
Is it possible??

Related

manipulating Microsoft Word DOCX files that have links and track changes using Python

I have been using the excellent python-docx package to read, modify, and write Microsoft Word files. The package supports extracting the text from each paragraph. It also allows accessing a paragraph a "run" at a time, where the run is a set of characters that have the same font information. Unfortunately, when you access a paragraph by runs, you lose the links, because the package does not support links. The package also does not support accessing change tracking information.
My problem is that I need to access change tracking information. Or, more specifically, I need to copy paragraphs that have change tracking indicated from one document to another.
I've tried doing this at the XML level. For example, this code snippet appends the contents of file1.docx to file2.docx:
from docx import Document
doc1 = Document("file1.docx")
doc2 = Document("file2.docx")
doc2.element.body.append(doc1.element.body)
doc2.save("file2-appended.docx")
When I try to open the file on my Mac for complicated files, I get this error:
But if I click OK, the contents are there. The manipulation also works without problem for very simple files.
What am I missing?
The .element attribute is really an "internal" interface and should be named ._element. In most other places I have named it that. What you're getting there is the root element of the document part. You can see what it is by calling:
print(doc2.element.xml)
That element has one and only one w:body element below it, which is what you get when with doc2.element.body (.xml will work on that too, btw, if you want to inspect that element).
What your code is doing is appending one body element at the end of another w:body element and thereby forming invalid XML. The WordprocessingML vocabulary is quite strict about what element can follow another and how many and so forth. The only surprise for me is that it actually sometimes works for you, I take it :)
If you want to manipulate the XML directly, which is what the ._element attribute is there for, you need to do it carefully, in view of the (complex) WordprocessingML XML Schema.
Unlike when you stick to the published API, there's no safety net once ._element (or .element) appears in your code.
Inside the body XML can be relationships to external document parts, like images and hyperlinks. These will only be valid within the document in which they appear. This might explain why some files can be repaired.

How to generate Confluence page based on tabular data

I have some tabular data about users. I would like to have a Confluence page generated based on it. But I don't want to show the data as it is but instead have a nice table made of it.
For example data includes user identifier. But on the page I would like to have it used for few things. For example make an anchor to the user entry/row, show the identifier in a column and generate link (in another column) to some other tools where the identifier is an argument in URL.
This goes in obvious direction of data vs. presentation separation with all its benefits.
Now the problem is that I don't know how to do that while I feel that it should be somehow possible with all that Confluence offers.
There are various reporting macros. But the problem is how to get the initial tabular data. I tried using Excel (or CSV) attachment. But I failed to extract data from it (otherwise than just showing a simple table based on it).
Any advice? I'm using Confluence 5.4.
I have asked about it previously on Atlassian Answers in question Reporting on spreadsheet data from attachment but there are no answers so far and I think there will be none. While I think Stack Overflow is more popular so I hope that maybe here someone will have any advices.
For the 'display table information on the page' part: This could be achieved with a user macro. The CSV macro and HTML macro can be used to pull data in from an attachment or other locations to display on a wiki page.
There are other ways to display this kind of data. This be done with information extracted from a database using the SQL macro. Confluence can read in from its own database or from external databases.
For example, let's say you wanted to list all pages in a space with hyperlinks using the key page information to edit, view, delete the target page. The information being extracted in this example is in the Confluence table.
{sql-query:dataSource=wiki|output=wiki}
SELECT
'['||B.spacename||'|'||B.spacekey||':]' "Space Name",
'['||A.parentid ||'|///pages/viewpage.action?pageId='||A.parentid||']' "Page Parent",
'['||A.contentid||'|///pages/viewpage.action?pageId='||A.contentid||']' "Page Id",
'['||A.title ||'|///viewpage.action?pageId='||A.contentid||']' "Page Title",
'[View Page |///pages/viewpage.action?pageId='||A.contentid||']' "View Page",
'[Edit Page |///pages/editpage.action?pageId='||A.contentid||']' "Edit Page",
'[Delete Page |///pages/removepage.action?pageId='||A.contentid||']' "Delete Page"
FROM wiki.CONTENT A, SPACES B
WHERE B.SPACEKEY = 'sp' -- Put the spacekey here
AND B.SPACEID = A.SPACEID
AND A.TITLE like '%this%' -- Optionally, only return results for pages with the word 'this' in them
-- AND A.CONTENTID = 125999877 -- optionally, only return results for a single page by id
ORDER BY A.TITLE
{sql-query}
Once you have the content on the page it is possible to post-render wiki content using a JavaScript via the html macro.
{html}
<script type="text/javascript">
AJS.$(document).ready(function() {
AJS.$('#tableid').find('tr > td').contents().html('Hello world'); // or whatever to find and change the html or text
});
</script>
{html}
I presume your Confluence is version 4 or later. The default editor is WISIWYG, but you can also enable Source Editor (read Confluence doc on how to do this).
You can create source of a page in external editor and then copy/paste it in to Confluence Source Editor (or use Confluence REST API if you need to import multiple files).
Create a page with sample table, then view source of this page. Copy/paste elements of this page to your tabular data. Use search and replace patterns to insert tags in right places.
For example, if you have CSV file:
- replace commas with </th><th>
- put <tr><th> at the start of each line
- put </th><tr> at the end of each line
This should create nice table in Confluence.

Purpose for Word Open XML and content controls binding

For word report generation, I am looking at binding XML to content controls to see if it is any easier than to use Word Interop and hardcode index reference to content controls to assign values to them.
However, I don't really understand how to do it.
My work flow is entering information in Excel and then generate an XML file to have content controls populated by XML, however, what I read is the other way round: Word Control Control Toolkit and descriptions where the XML is populated by user entering information in Word, and then programmer to unzip docx file to retrieve the XML file.
How can I populate content controls with XML?
There are samples on generating Word documents from Word templates, XML and data bound content controls # http://worddocgenerator.codeplex.com/
Set up the mapped content controls in the 'template' docx using the content control toolkit or similar. Do this using a sample XML file containing your Excel data.
Now you have that template document, at run time you can inject your XML file into it (ie replace the custom xml part it contains, with your instance data), in C# or Java or whatever.
When the user opens the document in Word 2007/2010, the information in the custom XML part will automatically be copied into the bound controls, and visible to the user.
Note that content control data binding doesn't easily support repeating data (eg populating table rows) in Word 2007/2010, though there are ways to do it.

Exporting data in enhanced Grid to csv or xml format using dojo

In my project we are using dojo framework in UI. We are having a functionality to exporting the data in the enhanced grid into excel/csv files. In the dojo toolkit, they are binding the id in the textarea but i need those values in the excel/csv file...can any one help in this issue...? if possible pls tell me how to export the enhanced grid data to excel/csv files...
If you are already using the Enhanced Data Grid, you should be able to include the exporter plugin - dojox.grid.enhanced.plugins.exporter.CSVWriter - to get the CSV text.
This will give you access to two main functions exportGrid and exportSelected that will take the contents and export them as CSV text.
Unfortunately that doesn't get them as a separate file (click to download), just the formatted text in a textarea (or whatever).
To get a "click to download CSV function), you could write a servlet/jsp proxy, which would take a POST from your page with the CSV text (from the plugin above) as part of the form and simply copy it back out with the correct headers to make it appear as an attachment.
response.setContentType("text/csv"); response.setHeader("Content-Disposition","attatchment;filename=name.csv")
This would require something server side though.. and at that point, you may want to consider having a servlet simply produce the CSV text directly.
http://dojotoolkit.org/reference-guide/dojox/grid/EnhancedGrid/plugins/Exporter.html

How to add content control in a Word 2007 document using OpenXML

I want to create a word 2007 document without using object model. So I would prefer to create it using open xml format. So far I have been able to create the document. Now I want to add a content control in it and map it to xml. Can anybody guide me regarding the same???
Anoop,
You said that you are able to creat the document using OpenXmlSdk. With that assumption, you can use the following code to create the content control to add to the Wordprocessing.Body element of your Document.
//praragraph to be added to the rich text content control
Run run = new Run(new Text("Insert any text Here") { Space = StaticTextConstants.Preserve });
Paragraph paragraph = new Paragraph(run);
SdtProperties sdtPr = new SdtProperties(
new Alias { Val = "MyContentCotrol" },
new Tag { Val = "_myContentControl" });
SdtContentBlock sdtCBlock = new SdtContentBlock(paragraph);
SdtBlock sdtBlock = new SdtBlock(sdtPr, sdtCBlock);
//add this content control to the body of the word document
WordprocessingDocument wDoc = WordprocessingDocument.Open(path, true); //path is where your word 2007 file is
Body mBody = wDoc.MainDocumentPart.Document.Body;
mBody.AppendChild(sdtBlock);
wDoc.MainDocumentPart.Document.Save();
wDoc.Dispose();
I hope this answers a part of your question. I did not understand what you ment by "Map it to XML". Did you mean to say you want to create CustomXmlBlock and add the ContentControl to it?
Have a look for the Word Content Control Toolkit on www.codeplex.com.
Here is a very brief explanation on how to do what you are attempting.
You need to have access to the developer tab on the Word ribbon. To get this working click on the Office (Round thingy) in the top left hand corner and Select Word Options at the bottom of the menu. On the first options page there is a checkbox to show the developer toolbar.
Use the developer toolbar to add the Content controls you want on the page. Click the properties button in the Content controls section of the developer bar and set the name and tag properties (I stick to naming the name and tag fields with the same name).
Save and close the word document.
Open the Content control toolkit and then open your document with the toolkit. Use the left hand pain to create some custom xml to link to your controls.
Now use the bind view to drag and drop the mappings between your custom xml and the custom controls that are displayed in the right panel of the toolkit.
You can use the openxml sdk 1.0 or 2.0 (still in ctp) to open your word document in code and access the custom xml file that is contained as part of the word document.
If you want to have a look at how your word document looks as xml. Make a copy of your word document and then rename it to say "a.zip". Double click on the zip file and then navigate the folder structure. The main content of the word document is held under the word folder in a file called "document.xml". The custom xml part of the document is held under the customXml folder and is generally found in the file named "item1.xml".
I hope this brief explanation get you up and running.