Powershell downloading all possible files for a given day - powershell

I'm using Powershell for the first time to download the previous day's files from a webpage for a client. The web page is from a data logger than is on a vendor skid. The data logger always saves the files in the format yyMMdd##.CSV, where ## is the sequential number file for that given day (starting at 00). When viewing the webpage I have only seen the maximum number of CSV files for a given day as 1 (so, 8/31/17's file would be 17083100.CSV). I have got the Powershell code written to give me yesterday's file assuming that 00 is the only file for that day, but I was hoping there was a way I could either use a wildcard or for loop to download any additional files that may exist for the previous day. See the code below for what I currently have:
$a = "http://10.109.120.101/logs/Log1/"
$b = (get-date).AddDays(-1).ToString("yyMMdd") + "00.CSV"
$c = "C:\"
$url = "$a$b"
$WebClient = New-Object net.webclient
$path = "$c$b"
$WebClient.DownloadFile($url, $path)

try Something like this:
$Date=(get-date).AddDays(-1).ToString("yyMMdd")
$URLFormat ='http://10.109.120.101/logs/Log1/{0}{1:D2}.CSV'
$WebClient = New-Object net.webclient
#build destination path
$PathDest="C:\Temp\$Date"
New-Item -Path $PathDest -ItemType Directory -ErrorAction SilentlyContinue
1..99 | %{
$Path="$PathDest\{0:D2}.CSV" -f $_
$URL=$URLFormat -f $Date, $_
try
{
Write-Host ("Try to download '{0}' file to '{1}'" -f $URL, $Path)
$WebClient.DownloadFile($Path, $URL)
}
catch
{
}
}
$WebClient.Dispose()

Related

Powershell: Converting Headers from .msg file to .txt - Current directory doesn't pull header information, but specific directory does

So I am trying to make a script to take a batch of .msg files, pull their header information and then throw that header information into a .txt file. This is all working totally fine when I use this code:
$directory = "C:\Users\IT\Documents\msg\"
$ol = New-Object -ComObject Outlook.Application
$files = Get-ChildItem $directory -Recurse
foreach ($file in $files)
{
$msg = $ol.CreateItemFromTemplate($directory + $file)
$headers = $msg.PropertyAccessor.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x007D001E")
$headers > ($file.name +".txt")
}
But when I change the directory to use the active directory where the PS script is being run from $directory = ".\msg\", it will make all the files into text documents but they will be completely blank with no header information. I have tried different variations of things like:
$directory = -Path ".\msg\"
$files = Get-ChildItem -Path $directory
$files = Get-ChildItem -Path ".\msg\"
If anyone could share some ideas on how I could run the script from the active directory without needing to edit the code to specify the path each location. I'm trying to set this up so it can be done by simply putting it into a folder and running it.
Thanks! Any help is very appreciated!
Note: I do have outlook installed, so its not an issue of not being able to pull the headers, as it works when specifying a directory in the code
The easiest way might actually be to do it this way
$msg = $ol.CreateItemFromTemplate($file.FullName)
So, the complete script would then look something like this
$directory = ".\msg\"
$ol = New-Object -ComObject Outlook.Application
$files = Get-ChildItem $directory
foreach ($file in $files)
{
$msg = $ol.CreateItemFromTemplate($file.FullName)
$headers = $msg.PropertyAccessor.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x007D001E")
$headers > ($file.name +".txt")
}
All that said, it could be worthwhile reading up on automatic variables (Get-Help about_Automatic_Variables) - for instance the sections about $PWD, $PSScriptRoot and $PSCommandPath might be useful.
Alternative ways - even though they seem unnecessarily complicated.
$msg = $ol.CreateItemFromTemplate((Get-Item $directory).FullName + $file)
Or something like this
$msg = $ol.CreateItemFromTemplate($file.DirectoryName + "\" $file)

Creating a new txt file from a PowerShell script and writing to it

I have a script to run a simple test against my system. I have it using stream reader to read the output in the csv when I run it and erase the CSV it is running from at the end of the test. I am trying to tweak it to create a new file when a test is run that is titled "date-time.txt" and output that specific test into the time stamped .txt file
Here is what I have so far, I am not sure if it is easier to piggyback off this code or make a separate function.
$returnValue = Invoke-Expression "C:\example-test.exe -s $serverName -u $username -p $password -c"
#--------------------------------------------------------------------------------
# Process the output from the .exe and build our output
#--------------------------------------------------------------------------------
$dateColumn = 0
$failColumn = 4
$stream_reader = New-Object System.IO.StreamReader{.\example.csv}
$current_line =$stream_reader.ReadLine()
$charArray = $current_line.Split(",")
$testDate = $charArray[$dateColumn]
$failCode = $charArray[$failColumn]
if ($failCode -eq $expectedFailCode) {
$testResult = "PASS"
}
#build our own csv - test#,pass/fail, date,server,username,password, failcode, expectedFailCode
$outputString = "$testNumber,$testResult,$testDate,$serverName,$username,$password,$failCode,$expectedFailCode"
if ($testResult -eq "FAIL"){
write-host "$outputString" -ForegroundColor red
} else {
write-host "$outputString"
}
#must close file
$stream_reader.close()
#must delete npf-audit-csv otherwise we only read the top line every time
#we want it to build a fresh file every time
Remove-Item -Path .\npf-audit.csv -Force
I'm assuming your question is about creating the file with a date as part of the filename ?
I would simply build the file name using Get-Date -Format (which returns a string and not a datetime object like regular Get-Date), create the new file with New-Item and the generated name, and pass the path of the file object returned by New-Item to the StreamReader :
$FileName = (Get-Date -Format "yyMMdd-HHmmss") + ".txt"
$File = New-Item -Type File -Path "." -Name $FileName
$stream_reader = New-Object System.IO.StreamReader{$File.FullName}
I'm assuming a "year month day - hour minute second" format (the yyMMdd-HHmmss part), if you want to tweak it I'll let you read the manual for the -Format parameter of Get-Date and the page on custom datetime formats.

Extracting text from word documents

TASK
Extract text from .doc, .docx and .pdf files and upload content to an Azure SQL database. Needs to be fast as its running over millions of documents.
ISSUES
Script starts to fail if one of the documents has an issue. Some that i have come across are:
This file failed to open last time you tried - Open readonly
File is corrupt
SCRIPT
Firstly i generate a list of files containing 100 file paths. This is so i can continue execution if i need to stop it and / or it errors out:
## Word object
if (!($continue)) {
$files = (Get-ChildItem -force -recurse $documentFolder -include *.doc, *.docx).fullname
$files | Out-File (Join-Path $PSScriptRoot "\documents.txt")
$i=0; Get-Content $documentFile -ReadCount 100 | %{$i++; $_ | Out-File (Join-Path $PSScriptRoot "\FileLists\documents_$i.txt")}
}
Then i create the ComObject with the DisplayAlerts flag set to 0 (i thought this would fix it. It didnt)
$word = New-Object -ComObject word.application
$word.Visible = $false
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatText")
$word.DisplayAlerts = 0
After this, I loop through each file in each list, save the file as .txt to the temp folder, extract the text and generate an SQL INSERT statemnt:
foreach ($file in (Get-Content $list)) {
Try {
if ($file -like "*-*") {
Write-Output "Processing: $($file)"
$doc = $word.Documents.Open($file)
$fileName = [io.path]::GetFileNameWithoutExtension($file)
$fileName = $filename + ".txt"
$doc.SaveAs("$env:TEMP\$fileName", [ref]$saveFormat)
$doc.Close()
$4ID = $fileName.split('-')[-1].replace(' ', '').replace(".txt", "")
$text = Get-Content -raw "$env:TEMP\$fileName"
$text = $text.replace("'", "")
$query += "
('$text', $4ID),"
Remove-Item -Force "$env:TEMP\$fileName"
}
<# Upload to azure #>
$query = $query.Substring(0,$query.Length-1)
$query += ";"
$params = #{
'Database' = $TRIS5DATABASENAME
'ServerInstance' = $($AzureServerInstance.FullyQualifiedDomainName)
'Username' = $AdminLogin
'Password' = $InsecurePassword
'query' = $query
}
Invoke-Sqlcmd #params -ErrorAction "continue"
$query = "INSERT INTO tmp_CachedText (tCachedText, tOID)
VALUES "
}
Catch {
Write-Host "$($file) failed to process" -ForegroundColor RED;
}
}
Remove-Item -Force $list.FullName
ISSUES
As stated above, if something is wrong with one of the files or the document failed to open properly off a previous run the script starts failing. Everything in the loop throws errors, starting with:
You cannot call a method on a null-valued expression.
At D:\OneDrive\Scripts\Microsoft Cloud\CachedText-Extraction\CachedText-Extraction.ps1:226 char:13
+ $doc = $word.Documents.Open($file)
Basically what i want is a way to stop those errors from appearing by simply skipping the file if it has an error with the document. Alternatively, if there is a better way to extract text from document files using PowerShell and not using word that would be good too.
An example of one of the error messages:
This causes the file to be locked and execution to pause. The only way to get around it is to kill word, which then causes the rest of the script to fail.

WinSCP XML log with PowerShell to confirm multiple file upload

With my script, I am attempting to scan a directory for a subdirectory that is automatically created each day that contains the date in the directory name. Once it finds yesterdays date (since I need to upload previous day), it looks for another subdirectory, then any files that contain "JONES". Once it finds those files, it does a foreach loop to upload them using winscp.com.
My issue is that I'm trying to use the .xml log created from winscp to send to a user to confirm uploads. The problem is that the .xml file contains only the last file uploaded.
Here's my script:
# Set yesterday's date (since uploads will happen at 2am)
$YDate = (Get-Date).AddDays(-1).ToString('MM-dd-yyyy')
# Find Directory w/ Yesterday's Date in name
$YesterdayFolder = Get-ChildItem -Path "\\Path\to\server" | Where-Object {$_.FullName.contains($YDate)}
If ($YesterdayFolder) {
#we specify the directory where all files that we want to upload are contained
$Dir= $YesterdayFolder
#list every sql server trace file
$FilesToUpload = Get-ChildItem -Path (Join-Path $YesterdayFolder.FullName "Report") | Where-Object {$_.Name.StartsWith("JONES","CurrentCultureIgnoreCase")}
foreach($item in ($FilesToUpload))
{
$PutCommand = '& "C:\Program Files (x86)\WinSCP\winscp.com" /command "open ftp://USERNAME:PASSWORD#ftps.hostname.com:21/dropoff/ -explicitssl" "put """"' + $Item.FullName + '""""" "exit"'
Invoke-Expression $PutCommand
}
} Else {
#Something Else will go here
}
I feel that it's my $PutCommand line all being contained within the ForEach loop, and it just overwrites the xml file each time it connects/exits, but I haven't had any luck breaking that script up.
You are running WinSCP again and again for each file. Each run overwrites a log of the previous run.
Call WinSCP once instead only. It's even better as you avoid re-connecting for each file.
$FilesToUpload = Get-ChildItem -Path (Join-Path $YesterdayFolder.FullName "Report") |
Where-Object {$_.Name.StartsWith("JONES","CurrentCultureIgnoreCase")}
$PutCommand = '& "C:\Program Files (x86)\WinSCP\winscp.com" /command "open ftp://USERNAME:PASSWORD#ftps.hostname.com:21/dropoff/ -explicitssl" '
foreach($item in ($FilesToUpload))
{
$PutCommand += '"put """"' + $Item.FullName + '""""" '
}
$PutCommand += '"exit"'
Invoke-Expression $PutCommand
Though all you really need to do is checking WinSCP exit code. If it is 0, all went fine. No need to have the XML log as a proof.
And even better, use the WinSCP .NET assembly from PowerShell script, instead of driving WinSCP from command-line. It does all error checking for you (you get an exception if anything goes wrong). And you avoid all nasty stuff of command-line (like escaping special symbols in credentials and filenames).
try
{
# Load WinSCP .NET assembly
Add-Type -Path "WinSCPnet.dll"
# Setup session options
$sessionOptions = New-Object WinSCP.SessionOptions -Property #{
Protocol = [WinSCP.Protocol]::Ftp
FtpSecure = [WinSCP.FtpSecure]::Explicit
TlsHostCertificateFingerprint = "xx:xx:xx:xx:xx:xx..."
HostName = "ftps.hostname.com"
UserName = "username"
Password = "password"
}
$session = New-Object WinSCP.Session
try
{
# Connect
$session.Open($sessionOptions)
# Upload files
foreach ($item in ($FilesToUpload))
{
$session.PutFiles($Item.FullName, "/dropoff/").Check()
Write-Host "Upload of $($Item.FullName) succeeded"
}
}
finally
{
# Disconnect, clean up
$session.Dispose()
}
exit 0
}
catch [Exception]
{
Write-Host "Error: $($_.Exception.Message)"
exit 1
}

Powershell automated deletion of specified SharePoint documents

We have a csv file with approximately 8,000 SharePoint document file URLs - the files in question they refer to have to be downloaded to a file share location, then deleted from the SharePoint. The files are not located in the same sites, but across several hundred in a server farm. We are looking to remove only the specified files - NOT the entire library.
We have the following script to effect the download, which creates the folder structure so that the downloaded files are separated.
param (
[Parameter(Mandatory=$True)]
[string]$base = "C:\Export\",
[Parameter(Mandatory=$True)]
[string]$csvFile = "c:\export.csv"
)
write-host "Commencing Download"
$date = Get-Date
add-content C:\Export\Log.txt "Commencing Download at $date":
$webclient = New-Object System.Net.WebClient
$webclient.UseDefaultCredentials = $true
$files = (import-csv $csvFile | Where-Object {$_.Name -ne ""})
$line=1
Foreach ($file in $files) {
$line = $line + 1
if (($file.SpURL -ne "") -and ($file.path -ne "")) {
$lastBackslash = $file.SpURL.LastIndexOf("/")
if ($lastBackslash -ne -1) {
$fileName = $file.SpURL.substring(1 + $lastBackslash)
$filePath = $base + $file.path.replace("/", "\")
New-Item -ItemType Directory -Force -Path $filePath.substring(0, $filePath.length - 1)
$webclient.DownloadFile($file.SpURL, $filePath + $fileName)
$url=$file.SpURL
add-content C:\Export\Log.txt "INFO: Processing line $line in $csvFile, writing $url to $filePath$fileName"
} else {
$host.ui.WriteErrorLine("Exception: URL has no backslash on $line for filename $csvFile")
}
} else {
$host.ui.WriteErrorLine("Exception: URL or Path is empty on line $line for filename $csvFile")
}
}
write-Host "Download Complete"
Is there a way we could get the versions for each file?
I have been looking for a means to carry out the deletion, using the same csv file as reference - all of the code I have seen refers to deleting entire libraries, which is not desired.
I am very new to PowerShell and am getting lost. Can anyone shed some light?
Many thanks.
This looks like it might be useful. It's a different approach and would need to be modified to pull in the file list from your CSV but it looks like it generally accomplishes what you are looking to do.
https://sharepoint.stackexchange.com/questions/6511/download-and-delete-documents-using-powershell