PDFCreator to convert PDF to TXT in PowerShell

Updated Comment: I'm attempting to use PDFCreator to convert PDF files into TXT files via PowerShell, but it still doesn't seem to be working.
Any help is appreciated!
$PDFCreator = New-Object -ComObject PDFCreator.JobQueue
$PDF = 'C:\Users\userName\Downloads\SampleACORD.pdf'
$TXT = 'C:\Users\userName\Downloads\SampleACORD.txt'
try {
    $PDFCreator.Initialize()
    if ($PDFCreator.WaitForJob(5)) {
        $PJ = $PDFCreator.NextJob
    }
    if ($PJ) {
        $PJ.PrintFile($PDF)
        $PJ.ConvertTo($TXT)
    }
} catch {
    $_
    break
} finally {
    if ($PDFCreator) {
        $PDFCreator.ReleaseCom()
    }
}

You are getting that because $PJ is $null. NextJob isn't returning anything.
To guard against this, WaitForJob(int) returns a bool, $true if a job arrived and $false if not, so you should know after WaitForJob completes whether there is a job to get or not:
if ($PDFCreator.WaitForJob(5)) {
    $PJ = $PDFCreator.NextJob
    $PJ.allowDefaultPrinterSwitch('C:\Users\userName\Downloads\SampleACORD.txt', $true)
    $PJ.ConvertTo($TXT)
} else {
    # Handle the no-jobs case here
}
You could also do a null check against $PJ before trying to call $PJ.allowDefaultPrinterSwitch:
if ($PJ) {
    $PJ.allowDefaultPrinterSwitch('C:\Users\userName\Downloads\SampleACORD.txt', $true)
    $PJ.ConvertTo($TXT)
}
Here is some more information on the PDFCreator.JobQueue API, which you may find useful.
To address your issue in the comments, where the file is not being produced, this page of the documentation explains the logical flow of how the conversion process should work:
1. Call the Initialize() method on your COM object.
2. Call WaitForJob(timeOut) if you are waiting for one print job to reach the queue. The timeOut parameter specifies the maximum time the queue waits for the print job to arrive.
3. Get the next job in the queue via the NextJob property.
4. Set up the job's profile with the SetProfileByGuid(guid) method. The guid parameter selects the conversion profile to use.
5. Start the conversion of your print job with ConvertTo(path). The path parameter is the full path to the location where the converted file should be saved, including its full name.
6. The IsFinished property reports the conversion state; it returns true once the print job is done.
7. If you want to know whether the job completed successfully, check the IsSuccessful property. It returns true if the job was converted successfully, false otherwise.
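Putting those documented steps together, a minimal sketch of the flow might look like the following. The profile GUID here is a placeholder (substitute a GUID from your own PDFCreator installation), and this assumes a document gets printed to the PDFCreator printer so that a job actually arrives in the queue:
$PDFCreator = New-Object -ComObject PDFCreator.JobQueue
$PDFCreator.Initialize()
if ($PDFCreator.WaitForJob(5)) {
    $PJ = $PDFCreator.NextJob
    $PJ.SetProfileByGuid('<profile-guid-here>')  # placeholder: use a GUID from your installation
    $PJ.ConvertTo('C:\Users\userName\Downloads\SampleACORD.txt')
    # Then poll IsFinished and check IsSuccessful, as shown below
} else {
    # No print job arrived within the timeout
}
$PDFCreator.ReleaseCom()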
In your case, I'm not sure how essential the profile would be, but it does look like your code fails to wait for completion. The following code will wait for the conversion job to finish (and check for success if you need to):
# Wait for completion
while (-not $PJ.IsFinished) {
    Start-Sleep -Seconds 5
}

# Check for success
if ($PJ.IsSuccessful) {
    # success case
} else {
    # failure case
}
Unrelated, but as good practice, wrap your code in a try/finally block and put your COM release in the finally block. This way your COM connection closes cleanly even in the event of a terminating error:
$PDFCreator = New-Object -ComObject PDFCreator.JobQueue
try {
    # Handle PDF creator calls
} finally {
    if ($PDFCreator) {
        $PDFCreator.ReleaseCom()
    }
}
The finally block is guaranteed to execute before returning to a parent scope, so whether the code succeeds or fails, the finally block will be run.


FastCGI, Perl and Exit

I've been honing the performance of a large, decades-old codebase I use for projects over the last few weeks, and it was suggested to me on here that I should look at something like FastCGI or HTTP::Engine. I've found it impressively straightforward to make use of FastCGI, but there's one nagging question I've found mixed answers on.
Some documents I've read say you should never call exit on a script being run through FastCGI, since that harms the whole concept of keeping it loaded persistently. Others say it doesn’t matter. My code uses exit in a lot of places where it is important to make sure nothing keeps executing. For example, I have restricted access components that call an authorization check:
use MyCode::Authorization;

our $authorization = MyCode::Authorization->new();

sub administration {
    $authorization->checkCredentials();
    # ...Do restricted access stuff.
}
To make it as unlikely as possible that a coding error lets someone access those functions when they shouldn't, checkCredentials generates a user-friendly response with a login page and then ends the process with exit() if the user does not have the appropriate credentials. E.g.:
sub checkCredentials {
    # Logic to check credentials
    if ($validCredential) {
        return 1;
    }
    else {
        # Build web response.
        # Then:
        exit;
    }
}
I've done it this way so that I don't accidentally let execution continue and open a security hole. At present, the calling routine can safely assume it only gets control back from checkCredentials if the right credentials were provided.
However, I’m wondering if I need to remove those calls to make good use of FastCGI. Is FCGI's $req->Finish() (or the equivalent in PSGI for HTTP::Engine) an adequate replacement?
Some documents I've read say you should never call exit on a script being run through FastCGI.
You don't want the process to exit since the point of using FastCGI is to use a single process to handle multiple requests (to avoid load times, etc).
So what you want to do is override exit so that it ends your request-specific code, but not the FastCGI request loop.
You can override exit, but the override must be in place at compile time, so use a flag to signal whether the override is active or not.
our $override_exit = 0;

BEGIN {
    *CORE::GLOBAL::exit = sub(;$) {
        die "EXIT_OVERRIDE\n" if $override_exit;
        CORE::exit($_[0] // 0);
    };
}

while (get_request()) {
    # Other setup...
    eval {
        local $override_exit = 1;
        handle_request();
    };
    my $exit_was_called = $@ eq "EXIT_OVERRIDE\n";
    log_error($@) if $@ && !$exit_was_called;
    log_error("Exit called\n") if $exit_was_called;
    # Other cleanup...
}
But that creates an exception that might be caught unintentionally. So let's use last instead.
our $override_exit = 0;

BEGIN {
    *CORE::GLOBAL::exit = sub(;$) {
        no warnings qw( exiting );
        last EXIT_OVERRIDE if $override_exit;
        CORE::exit($_[0] // 0);
    };
}

while (get_request()) {
    # Other setup...
    my $exit_was_called = 1;
    EXIT_OVERRIDE: {
        local $override_exit = 1;
        eval { handle_request() };
        log_error($@) if $@;
        $exit_was_called = 0;
    }
    log_error("Exit called\n") if $exit_was_called;
    # Other cleanup...
}

foreach continue is not working when inside RunWithElevatedPrivileges

The code below keeps executing when an error arises:
foreach ($url in Get-Content $urlsDir)
{
    try
    {
        # do something
        # declare X
    }
    catch
    {
        # write host or something with the exception
        continue
    }
    finally
    {
        # dispose X
    }
}
but when I put this code inside RunWithElevatedPrivileges, it stops completely on the first error and won't continue execution:
[Microsoft.SharePoint.SPSecurity]::RunWithElevatedPrivileges({
    # Iterate through all webs in a text file
    foreach ($url in Get-Content $urlsDir)
    {
        try
        {
            # do something
            # declare X
        }
        catch
        {
            # write host or something with the exception
            continue
        }
        finally
        {
            # dispose X
        }
    }
});
It could be related to whether or not you've specified an ErrorAction; I'm not sure how this interacts with try/catch, though. I have done something similar where my foreach would not stop unless explicitly told to.
Basically, you need to specify what should happen when an error occurs on each call, like so:
Call-Something $SomeParam -ErrorAction Stop
and unless you specify it as above for each call, or at the start of your script as below, errors can be silently ignored:
# At the start of your script
$ErrorActionPreference = "Stop"
For more information, you can read about ErrorAction in PowerShell here, for instance:
https://blogs.msdn.microsoft.com/kebab/2013/06/09/an-introduction-to-error-handling-in-powershell/
And according to MSDN, these are the valid values:
Stop: Displays the error message and stops executing. Writes an error to the console.
Inquire: Displays the error message and asks you whether you want to continue.
Continue: (Default) Displays the error message and continues executing.
SilentlyContinue: No effect. The error message is not displayed and execution continues without interruption.
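Applied to the loop from the question, a minimal sketch might look like this (Get-Item is only a hypothetical stand-in for whatever the real per-URL call is):
foreach ($url in Get-Content $urlsDir)
{
    try
    {
        # -ErrorAction Stop promotes non-terminating errors to terminating
        # ones, so the catch block actually runs (Get-Item is a stand-in)
        $item = Get-Item $url -ErrorAction Stop
        # do something with $item
    }
    catch
    {
        Write-Host "Failed on ${url}: $_"
        continue
    }
}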

Increasing stack size with WinRM for ScriptMethods

We are currently refactoring our administration scripts.
It turns out that a combination of WinRM, error handling, and ScriptMethod dramatically decreases the available recursion depth.
See the following example:
Invoke-Command -ComputerName . -ScriptBlock {
    $object = New-Object psobject
    $object | Add-Member ScriptMethod foo {
        param($depth)
        if ($depth -eq 0) {
            throw "error"
        }
        else {
            $this.foo($depth - 1)
        }
    }

    try {
        $object.foo(5) # Works fine, the error gets caught
    } catch {
        Write-Host $_.Exception
    }

    try {
        $object.foo(6) # Failure due to call stack overflow
    } catch {
        Write-Host $_.Exception
    }
}
Just six nested calls are enough to overflow the call stack!
By contrast, more than 200 nested calls work fine locally, and without the try/catch the available depth doubles. Regular functions are also not nearly as limited in recursion depth.
Note: I used recursion only to reproduce the problem; the real code contains many different functions on different objects in different modules, so trivial optimizations such as "use functions, not ScriptMethod" would require architectural changes.
Is there a way to increase the available stack size? (I have an administrative account.)
You have two problems that conspire to make this difficult. Neither is most effectively solved by increasing your stack size, if such a thing is possible (I don't know if it is).
First, as you've experienced, remoting adds overhead to calls that reduces the available stack. I don't know why, but it's easily demonstrated that it does. This could be due to the way runspaces are configured, or how the interpreter is invoked, or due to increased bookkeeping -- I don't know the ultimate cause(s).
Second and far more damningly, your method produces a bunch of nested exceptions, rather than just one. This happens because the script method is, in effect, a script block wrapped in another exception handler that rethrows the exception as a MethodInvocationException. As a result, when you call foo(N), a block of nested exception handlers is set up (paraphrased, it's not actually PowerShell code that does this):
try {
    try {
        ...
        try {
            throw "error"
        } catch {
            throw [System.Management.Automation.MethodInvocationException]::new(
                "Exception calling ""foo"" with ""1"" argument(s): ""$($_.Exception.Message)""",
                $_.Exception
            )
        }
        ...
    } catch {
        throw [System.Management.Automation.MethodInvocationException]::new(
            "Exception calling ""foo"" with ""1"" argument(s): ""$($_.Exception.Message)""",
            $_.Exception
        )
    }
} catch {
    throw [System.Management.Automation.MethodInvocationException]::new(
        "Exception calling ""foo"" with ""1"" argument(s): ""$($_.Exception.Message)""",
        $_.Exception
    )
}
This produces a massive stack trace that eventually overflows all reasonable boundaries. When you use remoting, the problem is exacerbated by the fact that even if the script executes and produces this huge exception, it (and any results the function does produce) can't be successfully remoted -- on my machine, using PowerShell 5, I don't get a stack overflow error but a remoting error when I call foo(10).
The solution here is to avoid this particular deadly combination of recursive script methods and exceptions. Assuming you don't want to get rid of either recursion or exceptions, this is most easily done by wrapping a regular function:
$object = New-Object PSObject
$object | Add-Member ScriptMethod foo {
    param($depth)
    function foo($depth) {
        if ($depth -eq 0) {
            throw "error"
        }
        else {
            foo ($depth - 1)
        }
    }
    foo $depth
}
While this produces much more agreeable exceptions, even this can quite quickly run out of stack when you're remoting. On my machine, this works up to foo(200); beyond that I get a call depth overflow. Locally, the limit is far higher, though PowerShell gets unreasonably slow with large arguments.
As a scripting language, PowerShell wasn't exactly designed to handle recursion efficiently. Should you need more than foo(200), my recommendation is to bite the bullet and rewrite the function so it's not recursive. Classes like Stack<T> can help here:
$object = New-Object PSObject
$object | Add-Member ScriptMethod foo {
    param($depth)
    $stack = New-Object System.Collections.Generic.Stack[int]
    $stack.Push($depth)
    while ($stack.Count -gt 0) {
        $item = $stack.Pop()
        if ($item -eq 0) {
            throw "error"
        } else {
            $stack.Push($item - 1)
        }
    }
}
Obviously foo is trivially tail recursive and this is overkill, but it illustrates the idea. Iterations could push more than one item on the stack.
This not only eliminates any problems with limited stack depth but is a lot faster as well.
It might be worth checking this out if you are overrunning the available memory in your remote session: Running Java remotely using PowerShell.
I know it's about running a Java app, but the solution updates the maximum memory available to a remote WinRM session.
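For reference, the setting involved is the WinRM per-shell memory quota. A minimal sketch, assuming you can run an elevated prompt on the remote machine (2048 is only an example value):
# Per-shell memory quota for remote WinRM sessions, in MB (2048 is an example value)
Set-Item WSMan:\localhost\Shell\MaxMemoryPerShellMB 2048
# Restart the service so the new quota takes effect
Restart-Service WinRM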

Why doesn't Pester catch errors using a trap

I'm wondering why I get the following behaviour when running this script. I have the script loaded in PowerShell ISE (v4 host) and have the Pester module loaded. I run the script by pressing F5.
function Test-Pester {
    throw ("An error")
}

Describe "what happens when a function throws an error" {

    Context "we test with Should Throw" {
        It "Throws an error" {
            { Test-Pester } | Should Throw
        }
    }

    Context "we test using a try-catch construct" {
        $ErrorSeen = $false
        try {
            Test-Pester
        }
        catch {
            $ErrorSeen = $true
        }

        It "is handled by try-catch" {
            $ErrorSeen | Should Be $true
        }
    }

    Context "we test using trap" {
        trap {
            $ErrorSeen = $true
        }

        $ErrorSeen = $false
        Test-Pester

        It "is handled by trap" {
            $ErrorSeen | Should Be $true
        }
    }
}
I then get the following output:
Describing what happens when a function throws an error
Context we test with Should Throw
[+] Throws an error 536ms
Context we test using a try-catch construct
[+] is handled by try-catch 246ms
Context we test using trap
An error
At C:\Test-Pester.ps1:2 char:7
+ throw("An error")
+ ~~~~~~~~~~~~~~~~~
+ CategoryInfo : OperationStopped: (An error:String) [], RuntimeException
+ FullyQualifiedErrorId : An error
[-] is handled by trap 702ms
Expected: {True}
But was: {False}
at line: 40 in C:\Test-Pester.ps1
40: $ErrorSeen | Should Be $true
Question
Why is the trap{} apparently not running in the final test?
Here are two solutions to the problem, based on the comments/suggested answers from @PetSerAl and @Eris. I have also read and appreciated the answer given to this question:
Why are variable assignments within a Trap block not visible outside it?
Solution 1
Although variables set in the script can be read within the trap, anything you assign within the trap happens to a copy of the variable that is local to the trap's scope. In this solution we take a reference to $ErrorSeen, so that setting the reference's Value updates the variable that exists in the parent scope.
Adding a continue to the trap suppresses the ErrorRecord details, cleaning up the output of the test.
Describe "what happens when a function throws an error" {
Context "we test using trap" {
$ErrorSeen = $false
trap {
Write-Warning "Error trapped"
([Ref]$ErrorSeen).Value = $true
continue
}
Test-Pester
It "is handled by trap" {
$ErrorSeen | Should Be $true
}
}
}
Solution 2
Along the same lines as the first solution, the problem can be solved by explicitly setting the scope of the $ErrorSeen variable (1) when it is first created and (2) when used within the trap {}. I have used the Script scope here, but Global also seems to work.
The same principle applies here: we need to avoid the problem where changes to the variable within the trap only affect a local copy of the variable.
Describe "what happens when a function throws an error" {
Context "we test using trap" {
$Script:ErrorSeen = $false
trap {
Write-Warning "Error trapped"
$Script:ErrorSeen = $true
continue
}
Test-Pester
It "is handled by trap" {
$ErrorSeen | Should Be $true
}
}
}
According to this blog, you need to tell your trap to do something to the control flow:
The [...] thing you notice is that when you run this as script, you will receive both your error message and the red PowerShell error message.
. 'C:\Scripts\test.ps1'
Something terrible happened!
Attempted to divide by zero.
At C:\Scripts\test.ps1:2 Char:3
+ 1/ <<<< $null
This is because your Trap did not really handle the exception. To handle an exception, you need to add the "Continue" statement to your trap:
trap { 'Something terrible happened!'; continue }
1/$null
Now, the trap works as expected. It does whatever you specified in the trap script block, and PowerShell does not get to see the exception anymore. You no longer get the red error message.

Switching on SqlException.Number in Powershell

I have this catch block in my PowerShell script:
catch [System.Data.SqlClient.SqlException]
{
    Write-Host "$_"
    Exit 2
}
I would really like to be able to switch on the error number.
I know that in C#, at least, there's a property on SqlException called Number. Isn't that also available in PowerShell?
If the property is there, how do I access it?
Thank you very much in advance.
You should be able to access it in your catch block using:
$_.Exception.Number
i.e.
catch [System.Data.SqlClient.SqlException]
{
    Write-Host $_.Exception.Number
    Exit 2
}