I am an absolute newbie as far as PowerShell goes. I found a script here - https://www.litigationsupporttipofthenight.com/single-post/2020/04/19/powershell-script-to-count-words-lines-and-characters-in-multiple-pdfs - that I thought was exactly what I was looking for, but when I run it, it seems to treat my image-based PDFs as plain text files and reports thousands of words/characters in them. I have a feeling that I am missing something. I see various forum postings on the web about iTextSharp and searching for words in PDFs (no idea if that is relevant or not).
Hoping someone can point me in the right direction; a specific example of this somewhere on the web would be very much appreciated.
Gully
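A minimal sketch of why the counts come out in the thousands, assuming the linked script reads each PDF as if it were a plain text file (that is the asker's suspicion, not something confirmed here). It is written in Python rather than PowerShell, and `fake_pdf`, `naive_word_count` and `looks_like_pdf` are hypothetical names made up for illustration. The fake file is just a PDF header followed by arbitrary binary bytes standing in for compressed image data.

```python
def naive_word_count(data: bytes) -> int:
    """Count whitespace-separated tokens, treating raw bytes as text."""
    # latin-1 maps every byte to a character, so decoding never fails;
    # this mimics a tool that blindly reads a binary file as text.
    return len(data.decode("latin-1").split())


def looks_like_pdf(data: bytes) -> bool:
    """PDF files start with the marker %PDF- followed by a version."""
    return data.startswith(b"%PDF-")


# Hypothetical stand-in for an image-only PDF: header plus binary bytes.
fake_image_bytes = bytes((i * 37 + 11) % 256 for i in range(5000))
fake_pdf = b"%PDF-1.4\n" + fake_image_bytes

print(looks_like_pdf(fake_pdf))    # the file is a PDF, not a text file
print(naive_word_count(fake_pdf))  # yet a naive count still finds many "words"
```

The binary data happens to contain bytes that look like spaces and newlines, so a text-style count reports many words even though there is no readable text. A meaningful count needs a PDF library to extract the text layer, and an image-only PDF has no text layer until it has been through OCR.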
So I'm brand spanking new to iTextSharp and I know I have quite a bit of reading ahead of me, but in an attempt to shave a bunch of time off a relatively trivial task I thought I'd reach out to the Stack brain trust.
I have a very simple goal: starting with a template PDF, I need to create a new PDF with a few of the characters changed. We're talking single characters on each page. I don't need a detailed answer complete with code (although that'd be awesome) so much as a general list of the tools and APIs I'm going to need.
The data I need will already be in a DB, which I could output to XML files if need be.
So far it looks like my template will need the "editable" characters tagged somehow (not sure how to do that yet), and then I can modify the copy using PdfStamper. Is that the right path or is there a better way?
Thanks for any insight.
I'm not sure how to apply SEO best practices in a new project.
I'm building a CMS that will be used by a group of writers to post news articles to a website. I'm developing the site using Perl and Template-Toolkit (TT2). I've also embedded an open source editor (TinyMCE) in the system that will be used for content creation.
I was planning to save the news article content to the DB as text - though I could also save it to flat files and then save the corresponding file paths to the DB.
From an SEO standpoint, I think it would be very helpful if this content could be exposed to search engines. There will be lots of links and images that could help to improve rankings.
If I put this content in the DB, it won't be discoverable ... right?
If I save this content in template files (content.tt) will the .tt files be recognized by search engines?
Note that the template files (.tt) will be displayed as content via a TT2 wrapper.
I'm also planning to generate a Google XML Sitemap using the Sitemap 0.90 standard. Perhaps this is sufficient? Or should I try to make the actual content discoverable?
Thanks ... just not sure how the google dance deals with .tt files and such.
If I put this content in the DB, it won't be discoverable ... right?
The database is part of your backend. Google cares about what you expose to the front end.
If I save this content in template files (content.tt) will the .tt files be recognized by search engines?
Your template files are also part of your backend.
Note that the template files (.tt) will be displayed as content via a TT2 wrapper.
The wrapper takes the template files and the data in the database and produces HTML pages. The HTML pages are what Google sees.
Link to those pages.
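The wrapper step described above can be sketched roughly as follows. This is an illustration only: it uses Python's `string.Template` as a stand-in for TT2, and the `PAGE` template, the `render_article` function and the article row are all hypothetical. The point is that the template and the database row never leave the server; only the generated HTML does.

```python
from html import escape
from string import Template

# Hypothetical page template, standing in for a TT2 wrapper plus content.tt.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1>$body</body></html>"
)


def render_article(row: dict) -> str:
    """Combine a template with a database row to produce the HTML page."""
    # The title is plain text, so escape it; the body is assumed to be
    # HTML already (for example, produced by the TinyMCE editor).
    return PAGE.substitute(title=escape(row["title"]), body=row["body_html"])


# Hypothetical row as it might come back from the articles table.
article = {
    "title": "Budget & Taxes",
    "body_html": '<p>See <a href="/news/2">part two</a>.</p>',
}

html = render_article(article)
print(html)
```

Whether the article body lives in a database column or in a flat file makes no difference to the output, so it makes no difference to a search engine either.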
just not sure how the google dance deals with .tt files and such
Google doesn't care at all about .tt files and the like. Google cares about URLs and the resources that they represent.
When Google is given the URL of the front page of your site, it will visit that URL. Your site will respond to that request by generating the front page, presumably in HTML. Google will then parse that HTML and extract any URLs it finds. It will then visit all of those URLs and the process will repeat. Many times.
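The visit, parse, extract, repeat loop can be sketched as a small breadth-first crawl. This is a simplified illustration, not how Google's crawler is actually implemented: the `SITE` dictionary is a hypothetical stand-in for a real web server, and the names `LinkExtractor` and `crawl` are made up for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical site: each URL maps to the HTML the server generates for it.
SITE = {
    "http://example.com/": '<a href="/news/1">One</a> <a href="/news/2">Two</a>',
    "http://example.com/news/1": '<a href="/">Home</a> <a href="/news/2">Next</a>',
    "http://example.com/news/2": '<a href="/">Home</a>',
}


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Visit start_url, then every URL reachable from it by following links."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        html = fetch(url)  # the crawler only ever sees the generated HTML
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


visited = crawl("http://example.com/", SITE.get)
print(visited)
```

Nothing in the loop knows or cares whether a page came from a database, a .tt file, or a static file. A page that no other page links to would never be reached this way, which is why the links between pages matter.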
The back-end technologies don't matter at all. What matters is that your site is made up of well-constructed HTML pages with meaningful links between them.
I have a requirement to create a PDF file from HTML. The resulting PDF needs to have iTextSharp TextField or something similar. I need to update the PDF document with appropriate text in the text field.
Points to note:
1. The PDF length (page numbers) may vary.
2. Because of this, I may only know the name of the field whose value I need to set.
OR
I could create a PDF from HTML. As the content of the PDF may vary, I do not know the exact location of a block that I need to edit. I need to stamp text exactly over the block irrespective of the location of the block (i.e. the block may exist in any page).
Example Scenario:
Create a PDF from HTML.
It is sent through an approval process. Once it is approved, the name of the approver is printed at a specific place (however, the signature area, referred to as the block above, may land on any page, as it depends on the content of the HTML).
Two resources which may help you:
This article details how to use ASP.NET and iTextSharp to create a PDF.
The whole article is pretty useful for a beginner like me, but the section detailing how to create a PDF from HTML may be a useful starting point for your problem.
https://web.archive.org/web/20211020001758/https://www.4guysfromrolla.com/articles/030911-1.aspx
I would pay special attention to how he replaces placeholders in the HTML template he is loading, as it seems you may want to head in that direction.
Now to answer your question more directly, have you thought about using a fillable form?
Here is a related Stack Overflow post.
Creating a fillable PDF form with ITextSharp
As I said I am a beginner, so I can't do much to help you from here. But with any luck you can put those together and accomplish what you are looking for.
Let me know if that helps. Good luck!
All OSs that exist right now work with files and folders. I was thinking that there are many other ways of storing files. Would it be better to store files by tags? For example:
A file called "music1" can have a tag "2013", if the music was made in 2013. The same file can have another tag called "Music", to say that the file is music, another file called "video1" could have the "2013" tag, but also have the "Video" tag instead of the "Music" one. This would be useful, because you could search for tags and generate nice-looking maps of all the files you have.
Here is an example:
In this example, files are in green. Each file has some tags (blue) and some special tags (red). Special tags contain things like the user (only the user named in the tag can see files tagged USER:Username) and the file type (instead of a file extension). Tags in yellow are system file types that do not require a program to run them (like .exe in Windows).
Black lines link tags to files
Red lines link special tags to files
Blue lines link a file type (or file) to whatever opens it. For example, the music is an ogg file. It is opened by OggViewer, which is a jar file opened by Java. Java is opened by the system.
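The tagging scheme described above can be sketched as a pair of lookup tables, one from file to tags and one from tag to files. This is only a toy in-memory model under my own assumptions, not a real file system: the `TagStore` class and its methods are hypothetical names, and special tags such as USER: or file-type tags are treated as ordinary strings.

```python
from collections import defaultdict


class TagStore:
    """Toy model of tag-based storage: files are found by tags, not paths."""

    def __init__(self):
        self._tags_by_file = defaultdict(set)
        self._files_by_tag = defaultdict(set)

    def tag(self, filename, *tags):
        """Attach one or more tags to a file."""
        for t in tags:
            self._tags_by_file[filename].add(t)
            self._files_by_tag[t].add(filename)

    def tags_of(self, filename):
        """Return the set of tags attached to a file."""
        return set(self._tags_by_file.get(filename, set()))

    def find(self, *tags):
        """Return the files that carry every one of the given tags."""
        if not tags:
            return set(self._tags_by_file)
        matches = [self._files_by_tag.get(t, set()) for t in tags]
        return set.intersection(*matches)


store = TagStore()
store.tag("music1", "2013", "Music")
store.tag("video1", "2013", "Video")

print(store.find("2013"))           # both files share the 2013 tag
print(store.find("2013", "Music"))  # narrowing by a second tag
```

Searching by several tags is a set intersection, which plays the role that walking down nested folders plays in a hierarchical file system.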
As far as I know, there is a nice file-system-level solution for your need called NHFS (nonhierarchical file system). There is also TMSU, a FUSE-based mountable file system, which may suit you.
It could have merit. For example, I'm utterly disinterested in the file names/paths of my tens of thousands of music files; I only really care about the artist, title, album, year, etc. of them, which is the way my music player (quodlibet) displays them. Choosing a set of music to put on another device or to send to someone could then be as easy as selecting an album (instead of browsing to /home/me/music/who/knows/what/someartist - somealbum).
There is TagsForAll for Windows. It is a file manager based on tags. Tags can have a hierarchical structure. The user interface is very simple but nice. The free version is fully functional and saves tags in a database; the Pro version also saves tags in an NTFS stream attached to the file.
Microsoft tried to do something like that with WinFS but gave up on it. It would be great if they could get it to work.
There are some other (old, archived) projects that implement this idea:
http://nascent.freeshell.org/programming/TagFS/
https://code.google.com/p/dhtfs/
https://code.google.com/p/tagfilesystem/
http://www.tagsistant.net/
Only the last seems to be releasing recent versions.
I think the idea has a future. I've pondered this same idea before. And tags fundamentally work better for most content than folders do; however, I wonder if the hierarchical structure of folders isn't actually better suited for files. In other words, though I like the idea of using tags on many levels I wonder if it would actually increase the overall complexity. For example, consider how tags could be used successfully to manage versioned software libraries. I'm afraid we won't know the answer until someone starts using the concept instead of folders for an entire OS. It'll be interesting to see/try.