Delete files after conversion in PowerShell

I'm very inexperienced in Powershell - but through trial and error I have managed to get a .doc/.docx to .pdf conversion working well for a specified folder and all subfolders.
$wdFormatPDF = 17
$word = New-Object -ComObject word.application
$word.visible = $false
$fileTypes = "*.docx","*.doc"
Get-ChildItem -Recurse -path "C:\test-acrobat" -include $fileTypes |
foreach-object `
{
$path = ($_.fullname).substring(0,($_.FullName).lastindexOf("."))
"Converting $path to pdf ..."
$doc = $word.documents.open($_.fullname)
$doc.saveas( $path, $wdFormatPDF)
$doc.close()
}
$word.Quit()
Now I'd like to be able to delete the original .doc/.docx files once they've been converted. On doing some searching I've found what I think would work:
{
remove-item $fileTypes # delete file from file-system
}
But I'd rather check than throw in a command to delete files...
Any help is greatly appreciated.
Philip

I would add the delete inside the foreach loop.
So you would get:
$wdFormatPDF = 17
$word = New-Object -ComObject word.application
$word.visible = $false
$fileTypes = "*.docx","*.doc"
Get-ChildItem -Recurse -path "C:\test-acrobat" -include $fileTypes |
foreach-object `
{
$path = ($_.fullname).substring(0,($_.FullName).lastindexOf("."))
Write-Host "Converting $path to pdf ..."
$doc = $word.documents.open($_.fullname)
$doc.saveas( $path, $wdFormatPDF)
$doc.close()
Remove-Item -LiteralPath $_.FullName  # delete the original once the PDF has been saved
}
$word.Quit()
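One caveat with deleting inside the loop: if Word fails to write the PDF, the original is removed anyway. Below is a minimal sketch of a guard. It assumes a hypothetical helper named Remove-IfConverted (not part of the answer above) and deletes the source only when a non-empty PDF with the same base name exists next to it.

```powershell
# Hypothetical helper: delete the source document only if its PDF exists.
function Remove-IfConverted {
    param([Parameter(Mandatory)][string]$SourcePath)

    # Same folder and base name as the source, with a .pdf extension.
    $pdfPath = [System.IO.Path]::ChangeExtension($SourcePath, '.pdf')

    if ((Test-Path -LiteralPath $pdfPath -PathType Leaf) -and
        ((Get-Item -LiteralPath $pdfPath).Length -gt 0)) {
        # -LiteralPath avoids wildcard expansion for names containing [ or ].
        Remove-Item -LiteralPath $SourcePath
        return $true
    }

    Write-Warning "No PDF found for $SourcePath; keeping the original."
    return $false
}
```

Inside the loop, the delete line could then become `Remove-IfConverted -SourcePath $_.FullName | Out-Null`, called after `$doc.close()`.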

Related

Create multiple shortcuts to subfolders with the same name

Beginner here. I put this together but can't make it work.
I have multiple folders, each of which contains its own subfolder 'zero'. I'm trying to create a shortcut to each of those subfolders, named after its parent folder.
...
$app = New-Object -ComObject "WScript.Shell"
$container = "C:\Users\Desktop\skty"
$path = "C:\Users\Desktop\fold"
if (!(Test-Path $container)) {
New-Item -Type Directory -Path $container | Out-Null
}
Get-Childitem -Path $path -Recurse -Include "zero*" | Foreach-Object {
{ $ShortcutFile = "$container\$_.Directory.Parent.Name + $_.Name"
$app.CreateShortcut($ShortcutFile)
$Shortcut.TargetPath = $_.FullName
$Shortcut.Save()
}
...
This should help you get started:
This puts a shortcut on everyone's Desktop, pointing to that user's Documents folder.
$container = Get-ChildItem -Path C:\Users
ForEach ($upath in $container){
$sfilepath = $upath.FullName + "\Documents"
$scutpath = $upath.FullName + "\Desktop"
$SourceFilePath = $sfilepath
$ShortcutPath = "$scutpath" + "\Documents.lnk"
$WScriptObj = New-Object -ComObject ("WScript.Shell")
$shortcut = $WscriptObj.CreateShortcut($ShortcutPath)
$shortcut.TargetPath = $SourceFilePath
$shortcut.Save()
}
If you already know the locations you need, you could list them in a text file and read that with Get-Content instead of enumerating C:\Users.
Adjust the paths until you get the output you need.
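Applied to the folder layout in the question, the same CreateShortcut pattern might look like the sketch below. It assumes each 'zero' subfolder sits directly under the folder whose name the shortcut should carry. Get-ZeroShortcutPlan is a hypothetical helper name, and the two paths are the ones from the question.

```powershell
# Hypothetical helper: pair every 'zero*' subfolder with a .lnk path named after its parent.
function Get-ZeroShortcutPlan {
    param([string]$Root, [string]$Container)

    Get-ChildItem -Path $Root -Recurse -Directory -Filter 'zero*' | ForEach-Object {
        [pscustomobject]@{
            TargetPath   = $_.FullName
            # $( ) is needed so the property, not just the variable, is expanded.
            ShortcutPath = Join-Path $Container "$($_.Parent.Name).lnk"
        }
    }
}

$container = "C:\Users\Desktop\skty"
$path      = "C:\Users\Desktop\fold"

# The WScript.Shell COM object exists on Windows only, so guard the call.
if ($env:OS -eq 'Windows_NT' -and (Test-Path $path)) {
    if (!(Test-Path $container)) {
        New-Item -Type Directory -Path $container | Out-Null
    }
    $app = New-Object -ComObject "WScript.Shell"
    foreach ($item in Get-ZeroShortcutPlan -Root $path -Container $container) {
        # CreateShortcut returns the shortcut object; keep it in a variable.
        $shortcut = $app.CreateShortcut($item.ShortcutPath)
        $shortcut.TargetPath = $item.TargetPath
        $shortcut.Save()
    }
}
```

Two things differ from the script in the question: the object returned by CreateShortcut is stored before its TargetPath is set, and the shortcut path ends in .lnk.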

How to fix corrupted files when downloading from Outlook using PowerShell

I have a PowerShell script that automatically downloads attachments from Outlook and saves them in a folder I have already set. The script works fine, but I then realised that some of the downloaded attachments are corrupted. Here is the script that I use.
Function saveattachmentexcel
{
$Null = Add-type -Assembly "Microsoft.Office.Interop.Outlook"
#olFolders = "Microsoft.Office.Interop.Outlook.olDefaultFolders" -as [type]
#olFolderInbox = 6
$outlook = new-object -comobject outlook.application
$namespace = $outlook.GetNameSpace("MAPI")
$folder = $nameSpace.GetDefaultFolder([Microsoft.Office.Interop.Outlook.OlDefaultFolders]::olFolderInbox)
$filepath = "D:\DMR Folder\"
$folder.Items | Where {$_.UnRead -eq $True -and $($_.attachments).filename -match '.xlsm'} | ForEach-object {
$filename = $($_.attachments | where filename -match '.xlsm').filename
foreach($file in $filename)
{
$outpath = join-path $filepath $file
$($_.attachments).saveasfile($outpath)
}
$_.UnRead = $False
}
}
saveattachmentexcel
I do not know why this is happening. Could anyone please help me?
This is most likely because the statement $($_.attachments).saveasfile($outpath) calls SaveAsFile on every attachment of the message, so every attachment is written to the same file name on disk.
Change this:
$filename = $($_.attachments | where filename -match '.xlsm').filename
foreach($file in $filename)
{
$outpath = join-path $filepath $file
$($_.attachments).saveasfile($outpath)
}
to:
foreach($attachment in $_.attachments)
{
if($attachment.Filename -like '*.xlsm'){
$outpath = Join-Path $filepath $attachment.Filename
# Only save this particular attachment to disk - not all of them
$attachment.SaveAsFile($outpath)
}
}
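A side effect of the change is that the filter is stricter: -match '.xlsm' is a regular expression in which the dot matches any character, while -like '*.xlsm' must match the whole file name. A small sketch of the difference, using made-up file names:

```powershell
# Sample attachment names (made up for illustration).
$names = 'budget.xlsm', 'report_xlsm_notes.txt', 'budget.xlsm.bak'

# -match treats '.xlsm' as a regex: '.' matches any character, anywhere in the name.
$byMatch = @($names | Where-Object { $_ -match '.xlsm' })

# -like uses wildcards and has to match the whole name.
$byLike = @($names | Where-Object { $_ -like '*.xlsm' })

# An escaped, anchored regex selects the same names as the -like filter.
$byRegex = @($names | Where-Object { $_ -match '\.xlsm$' })

"match: $($byMatch.Count), like: $($byLike.Count), anchored regex: $($byRegex.Count)"
```

With these three names, -match selects all of them, while -like and the anchored regex select only budget.xlsm.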

Powershell script not iterating through child folders

I found this script online and it works fine for converting the files in the parent folder. However, it does not iterate through the child folders. I do not get any errors, and I have verified that all the folder permissions are correct. Additionally, I have similarly coded scripts for *.docx and *.pptx files, and they run successfully. This one, however, is not working as expected. Any ideas?
$path = "c:\converted\"
$xlFixedFormat = "Microsoft.Office.Interop.Excel.xlFixedFormatType" -as [type]
$excelFiles = Get-ChildItem -Path $path -include *.xls, *.xlsx -recurse
$objExcel = New-Object -ComObject excel.application
$objExcel.visible = $false
foreach($wb in $excelFiles)
{
$filepath = Join-Path -Path $path -ChildPath ($wb.BaseName + ".pdf")
$workbook = $objExcel.workbooks.open($wb.fullname, 3)
$workbook.Saved = $true
"converted $wb.fullname"
$workbook.ExportAsFixedFormat($xlFixedFormat::xlTypePDF, $filepath)
$objExcel.Workbooks.close()
#get rid of conversion copy
#Remove-Item $wb.fullname
}
$objExcel.Quit()
$excelFiles will contain files from subfolders, but $filepath is built only from the original $path and the current $wb.BaseName, so every PDF is written to the top-level folder no matter where the workbook lives.
Replace
$filepath = Join-Path -Path $path -ChildPath ($wb.BaseName + ".pdf")
with
$filepath = $wb.fullname -replace $wb.extension,".pdf"
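One thing to watch with -replace: it treats the extension as a regular expression, so the dot matches any character and the pattern can also hit text earlier in the path. The sketch below uses a made-up path to show the effect, alongside two alternatives that touch only the trailing extension.

```powershell
# Made-up path whose folder name happens to contain "xls".
$fullName  = 'C:\converted\my_xls_files\q1.xls'
$extension = '.xls'

# -replace treats '.xls' as a regex, so "_xls" in the folder name is rewritten too.
$viaReplace = $fullName -replace $extension, '.pdf'

# Escaping and anchoring the pattern limits the match to the real extension.
$viaEscaped = $fullName -replace ([regex]::Escape($extension) + '$'), '.pdf'

# .NET helper that swaps only the trailing extension.
$viaChangeExtension = [System.IO.Path]::ChangeExtension($fullName, '.pdf')

$viaReplace          # C:\converted\my.pdf_files\q1.pdf
$viaEscaped          # C:\converted\my_xls_files\q1.pdf
$viaChangeExtension  # C:\converted\my_xls_files\q1.pdf
```

For paths without such a collision all three give the same result, so the one-line replacement above works in the common case.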

Removing hidden data and "personal information" from '.doc' , '.docx' and '.pptx' documents

Hi, I am trying to remove the 'hidden data' and personal information from .doc, .docx and .pptx documents through PowerShell.
Here is the PowerShell script I have written for this:
$path = "C:\Users\anisjain\Documents\GRR Production\HiddenProrerties"
Add-Type -AssemblyName Microsoft.Office.Interop.Word
$xlRemoveDocType = "Microsoft.Office.Interop.xlRDIRemovePersonalInformation" -as [type]
$wordFiles = Get-ChildItem -Path $path -include *.doc, *.docx -recurse
$objword = New-Object -ComObject word.application
foreach($obj in $wordFiles)
{
$documents = $MSWord.Documents.Open($obj.fullname)
"Removing document information from $obj"
$documents.RemoveDocumentInformation($xlRemoveDocType::xlRDIRemovePersonalInformation)
$documents.Save()
$objword.documents.close()
}
$objword.Quit()
This, however, doesn't work. Can someone please tell me where I am going wrong, and whether there is some other way of doing it? I have around 2000 records from which I wish to remove the 'hidden document information'. Thanks in advance.
Here's the script that works for me, after some googling, copying and modifying:
$path = "d:\rubbish\myfolder\"
Add-Type -AssemblyName Microsoft.Office.Interop.Word
$WdRemoveDocType = "Microsoft.Office.Interop.Word.WdRemoveDocInfoType" -as [type]
$wordFiles = Get-ChildItem -Path $path -include *.doc, *.docx -recurse
$objword = New-Object -ComObject word.application
$objword.visible = $false
foreach($obj in $wordFiles)
{
$documents = $objword.Documents.Open($obj.fullname)
"Removing document information from $obj"
# WdRemoveDocInfoType Enumeration Reference
# http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.wdremovedocinfotype(v=office.14).aspx
# 99 = WdRDIAll
#$documents.RemoveDocumentInformation(99)
$documents.RemoveDocumentInformation($WdRemoveDocType::wdRDIAll)
$documents.Save()
$objword.documents.close()
}
$objword.Quit()

How can I use PowerShell to `save as...` a different file extension?

I'm using the following powershell script to open a few thousand HTML files and "save as..." Word documents.
param([string]$htmpath,[string]$docpath = $docpath)
$srcfiles = Get-ChildItem $htmPath -filter "*.htm*"
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatDocument");
$word = new-object -comobject word.application
$word.Visible = $False
function saveas-document
{
$opendoc = $word.documents.open($doc.FullName);
$opendoc.saveas([ref]"$docpath\$doc.FullName.doc", [ref]$saveFormat);
$opendoc.close();
}
ForEach ($doc in $srcfiles)
{
Write-Host "Processing :" $doc.FullName
saveas-document
$doc = $null
}
$word.quit();
The content converts splendidly, but my filename is not as expected.
$opendoc.saveas([ref]"$docpath\$doc.FullName.doc", [ref]$saveFormat); results in foo.htm saving as foo.htm.FullName.doc instead of foo.doc.
$opendoc.saveas([ref]"$docpath\$doc.BaseName.doc", [ref]$saveFormat); yields foo.htm.BaseName.doc
How do I set up a Save As... filename variable equal to a concatenation of BaseName and .doc?
Based on our comments above, it seems that moving the files is all you want to accomplish. The following works for me. In the current directory, it replaces .txt extensions with .py extensions. I found the command here.
PS C:\testing> dir *.txt | Move-Item -Destination {[IO.Path]::ChangeExtension( $_.Name, "py")}
You can also change *.txt to C:\path\to\file\*.txt so you don't need to execute this line from the location of the files. You should be able to define a destination in a similar manner, so I'll report back if I find a simple way to do that.
Also, I found Microsoft's TechNet Library while I was searching. It has many tutorials on scripting using PowerShell. Files and Folders, Part 3: Windows PowerShell should help you to find additional info on copying and moving files.
I was having problems just converting the filename from .html to .docx. I took your code above and changed it to this:
function Convert-HTMLtoDocx {
param([string]$htmpath)
$srcfiles = Get-ChildItem $htmPath -filter "*.htm*"
$saveFormat = [Microsoft.Office.Interop.Word.WdSaveFormat]::wdFormatXMLDocument
$word = new-object -comobject word.application
$word.Visible = $False
ForEach ($doc in $srcfiles) {
Write-Host "Processing :" $doc.fullname
$name = Join-Path -Path $doc.DirectoryName -ChildPath $($doc.BaseName + ".docx")
$opendoc = $word.documents.open($doc.FullName)
$opendoc.saveas([ref]$name,[ref]$saveFormat)
$opendoc.close()
$doc = $null
} #End ForEach
$word.quit()
} #End Function
The problem was the save format. To save a document as .docx you need to specify wdFormatXMLDocument rather than wdFormatDocument, which is the older binary .doc format.
This does a recursive walk of a root folder and saves each .doc as filtered .htm:
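On the file name itself: inside a double-quoted string PowerShell expands only the variable, not a property access that follows it, which is why the question ends up with foo.htm.BaseName.doc. A minimal sketch, using a stand-in FileInfo object and a hypothetical output folder:

```powershell
# Stand-in for one item from Get-ChildItem (file name taken from the question).
$doc     = [System.IO.FileInfo]'foo.htm'
$docpath = 'C:\docs'   # hypothetical output folder

# Only $doc is expanded; ".BaseName.doc" is appended as literal text.
$unexpected = "$docpath\$doc.BaseName.doc"

# $( ) evaluates the property access before the string is built.
$expected = "$docpath\$($doc.BaseName).doc"

# Same result with the format operator instead of interpolation.
$formatted = '{0}\{1}.doc' -f $docpath, $doc.BaseName

$unexpected
$expected
```

The subexpression form can be passed straight to SaveAs in place of the original string.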
$docpath = "\\sf-xyz-serverabc01\ChangeTheseDocuments"
$WdTypes = Add-Type -AssemblyName 'Microsoft.Office.Interop.Word, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c' -Passthru
$srcfiles = get-childitem $docpath -filter "*.doc" -rec | where {!$_.PSIsContainer} | select-object FullName
$saveFormat = $WdTypes | Where {$_.Name -eq 'WdSaveFormat'}
$word = new-object -comobject word.application
$word.Visible = $False
function saveas-filteredhtml
{
$opendoc = $word.documents.open($doc.FullName);
$Name = [System.IO.Path]::ChangeExtension($doc.FullName, ".htm")  # swap only the extension, not every "doc" in the path
$opendoc.saveas([ref]$Name, [ref]$saveFormat::wdFormatFilteredHTML);
$opendoc.close();
}
ForEach ($doc in $srcfiles)
{
Write-Host "Processing :" $doc.FullName
saveas-filteredhtml
$doc = $null
}
$word.quit();
I know this is an older post, but I am posting this code here so that I can find it in the future.
This does a recursive walk of a root folder and converts .doc and .docx files to .txt.
The WdSaveFormat enumeration reference lists the different formats you can save to.
$docpath = "C:\Temp"
$WdTypes = Add-Type -AssemblyName 'Microsoft.Office.Interop.Word, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c' -Passthru
$srcfiles = get-childitem $docpath -filter "*.doc" -rec | where {!$_.PSIsContainer} | select-object FullName
$saveFormat = $WdTypes | Where {$_.Name -eq 'WdSaveFormat'}
$word = new-object -comobject word.application
$word.Visible = $False
function saveas-filteredhtml
{
$opendoc = $word.documents.open($doc.FullName);
$Name=($doc.Fullname).replace(".docx",".txt").replace(".doc",".txt")
$opendoc.saveas([ref]$Name, [ref]$saveFormat::wdFormatDOSText); ##wdFormatDocument
$opendoc.close();
}
ForEach ($doc in $srcfiles)
{
Write-Host "Processing :" $doc.FullName
saveas-filteredhtml
$doc = $null
}
$word.quit();