Downloading website files in powershell - powershell

I'm trying to get a script to query files on an IIS website, then download those files automatically. So far, I have this:
$webclient = New-Object System.Net.webclient
$source = "http://testsite:8005/"
$destination = "C:\users\administrator\desktop\testfolder\"
#The following line returns the links in the webpage
$testcode1 = $webclient.downloadstring($source) -split "<a\s+" | %{ [void]($_ -match "^href=[`'`"]([^`'`">\s]*)"); $matches[1] }
foreach ($line in $test2) {
$webclient.downloadfile($source + $line, $destination + $line)
}
I'm not that good at PowerShell yet, and I get some errors, but I manage to get a couple of test files I threw into my wwwroot folder (the web.config file seems undownloadable, so I'd imagine that's one of my errors). When I changed my $source value to a subfolder on my site that had some test text files (for example, http://testsite:8005/subfolder/), I got errors and no downloads at all. Running my $testcode1 gives me the following links in my subfolder:
/subfolder/test2/txt
/
/subfolder/test1.txt
/subfolder/test2.txt
I don't know why it lists the test2 file twice. I figured my problem was that since it was returning the subfolder/file format, that I was getting errors because I was trying to download $source + $line, which would essentially be http://testsite:8005/subfolder/subfolder/test1.txt, but when I tried to remedy that by adding in a $root value that was the root directory of my site and do a foreach($line in $testcode1) { $webclient.downloadfile($root + $line, $destination + $line) }, I still get errors.
If some of you high speed gurus can help show me the error of my ways, I'd be grateful. I am looking to download all the files in each subfolder on my site, which I know would involve use of some recursive action, but again, I currently do not have the skill level myself to do that. Thank you in advance on helping me out!

The best way to download files from a website is to use
Invoke-WebRequest -Uri $url
Once you have the HTML, you can parse the content for the links.
$result = (Invoke-WebRequest -Uri $url).Links | Where-Object { $_.href -like "http*" } | Select-Object -ExpandProperty href
Give it a try. It's simpler than $webclient = New-Object System.Net.webclient
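The doubled http://testsite:8005/subfolder/subfolder/test1.txt path described in the question comes from gluing a root-relative href onto the page URL with string concatenation. A minimal sketch of one way around that, assuming the hrefs are ordinary relative or absolute links: let System.Uri resolve each href against the page it came from. Resolve-Link is a hypothetical helper name, not part of any module.

```powershell
# Hypothetical helper: resolve an href found on a page against that page's URL.
function Resolve-Link {
    param(
        [string]$BaseUrl,  # URL of the page the link was found on
        [string]$Href      # raw href value, relative or absolute
    )
    $base = New-Object System.Uri($BaseUrl)
    # The two-argument Uri constructor applies the standard relative-reference rules
    (New-Object System.Uri($base, $Href)).AbsoluteUri
}

# Both a root-relative and a page-relative href resolve to a single clean URL
Resolve-Link "http://testsite:8005/subfolder/" "/subfolder/test1.txt"
Resolve-Link "http://testsite:8005/subfolder/" "test2.txt"
```

The resolved URL can then be passed to DownloadFile or Invoke-WebRequest -OutFile in place of $source + $line.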

This is to augment A_N's answer with two examples.
Download this Stackoverflow question to C:/temp/question.htm.
Invoke-RestMethod -Uri stackoverflow.com/q/19572091/1108891 -OutFile C:/temp/question.htm
Download a simple text document to C:/temp/rfc2616.txt.
Invoke-RestMethod -Uri tools.ietf.org/html/rfc2616 -OutFile C:/temp/rfc2616.txt

I made a simple PowerShell script to clone an OpenBSD package repo. It could probably be adapted to similar use cases.
# Quick and dirty script to clone a package repo. Only tested against OpenBSD.
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
$share = "\\172.16.10.99\wmfbshare\obsd_repo\"
$url = "https://ftp3.usa.openbsd.org/pub/OpenBSD/snapshots/packages/amd64/"
cd $share
# Fetch the directory listing; the URL is passed once, via -Uri
$packages = Invoke-WebRequest -Uri $url -UseBasicParsing
$dlfolder = "\\172.16.10.99\wmfbshare\obsd_repo\"
foreach ($package in $packages.links.href){
if ((get-item $package -ErrorAction SilentlyContinue)){
write-host "$package already downloaded"
} else {
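write-host "Downloading $package"
# $url and $dlfolder already end with a separator, so append the name directly
Invoke-WebRequest -Uri "$url$package" -OutFile "$dlfolder$package" -UseBasicParsing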
}
}

I would try this:
$webclient = New-Object System.Net.webclient
$source = "http://testsite:8005/"
$destination = "C:\users\administrator\desktop\testfolder\"
#The following line returns the links in the webpage
$testcode1 = $webclient.downloadstring($source) -split "<a\s+" | %{ [void]($_ -match "^href=[`'`"]([^`'`">\s]*)"); $matches[1] }
foreach ($line in $testcode1) {
#Build the local path in its own variable (variable names are case-insensitive, so $Destination would overwrite $destination)
$target = Join-Path $destination $line
#Create the parent directory if it doesn't exist
$parent = Split-Path $target -Parent
if (!(Test-Path $parent)){
New-Item $parent -type directory -Force | Out-Null
}
$webclient.downloadfile($source + $line, $target)
}
I think your only issue here is that you were grabbing a new file from a new directory, and putting it into a folder that didn't exist yet (I could be mistaken).
You can do some additional troubleshooting if that doesn't fix your problem:
Copy each line individually into your powershell window and run them up to the foreach loop. Then type out your variable holding all the gold:
$testcode1
When you enter that into the console, it should spit out exactly what's in there. Then you can do additional troubleshooting like this:
"Attempting to copy $Source$line to $Destination$line"
And see if it looks the way it should all the way on down. You might have to adjust my code a bit.
-Dale Harris

Related

Powershell: Converting Headers from .msg file to .txt - Current directory doesn't pull header information, but specific directory does

So I am trying to make a script to take a batch of .msg files, pull their header information and then throw that header information into a .txt file. This is all working totally fine when I use this code:
$directory = "C:\Users\IT\Documents\msg\"
$ol = New-Object -ComObject Outlook.Application
$files = Get-ChildItem $directory -Recurse
foreach ($file in $files)
{
$msg = $ol.CreateItemFromTemplate($directory + $file)
$headers = $msg.PropertyAccessor.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x007D001E")
$headers > ($file.name +".txt")
}
But when I change the directory to a path relative to the current directory where the PS script is being run from ($directory = ".\msg\"), it creates all the text files, but they are completely blank, with no header information. I have tried different variations of things like:
$directory = -Path ".\msg\"
$files = Get-ChildItem -Path $directory
$files = Get-ChildItem -Path ".\msg\"
Could anyone share some ideas on how I could run the script from the current directory without needing to edit the code to specify the path for each location? I'm trying to set this up so it can be used by simply putting it into a folder and running it.
Thanks! Any help is very appreciated!
Note: I do have Outlook installed, so it's not an issue of not being able to pull the headers, as it works when specifying a directory in the code
The easiest way might actually be to do it this way
$msg = $ol.CreateItemFromTemplate($file.FullName)
So, the complete script would then look something like this
$directory = ".\msg\"
$ol = New-Object -ComObject Outlook.Application
$files = Get-ChildItem $directory
foreach ($file in $files)
{
$msg = $ol.CreateItemFromTemplate($file.FullName)
$headers = $msg.PropertyAccessor.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x007D001E")
$headers > ($file.name +".txt")
}
All that said, it could be worthwhile reading up on automatic variables (Get-Help about_Automatic_Variables) - for instance the sections about $PWD, $PSScriptRoot and $PSCommandPath might be useful.
Alternative ways - even though they seem unnecessarily complicated.
$msg = $ol.CreateItemFromTemplate((Get-Item $directory).FullName + $file)
Or something like this
$msg = $ol.CreateItemFromTemplate($file.DirectoryName + "\" + $file.Name)
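A likely reason the relative path produces blank output is that COM objects such as Outlook resolve relative paths against the process working directory, which does not necessarily follow PowerShell's current location. As a rough sketch of the $PSScriptRoot idea mentioned above, the folder can be anchored to the script's own location so that only absolute paths are handed over. Get-SiblingFolder is a hypothetical helper name, and the fallback to the current location is an assumption for interactive use.

```powershell
# Hypothetical helper: return the full path of a folder that sits next to the script.
function Get-SiblingFolder {
    param([string]$Name = "msg")
    # $PSScriptRoot is empty when code is pasted into the console (and in scripts
    # before PowerShell 3.0), so fall back to the current location in that case.
    $root = if ($PSScriptRoot) { $PSScriptRoot } else { (Get-Location).Path }
    Join-Path $root $Name
}

$directory = Get-SiblingFolder "msg"
# FullName is always absolute, so it is safe to pass to CreateItemFromTemplate
Get-ChildItem $directory -ErrorAction SilentlyContinue | ForEach-Object { $_.FullName }
```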

Powershell downloading all possible files for a given day

I'm using PowerShell for the first time to download the previous day's files from a webpage for a client. The web page is from a data logger that is on a vendor skid. The data logger always saves the files in the format yyMMdd##.CSV, where ## is the sequential file number for that given day (starting at 00). When viewing the webpage, the most CSV files I have seen for a given day is 1 (so, 8/31/17's file would be 17083100.CSV). I have written PowerShell code that gives me yesterday's file assuming that 00 is the only file for that day, but I was hoping there was a way I could use either a wildcard or a for loop to download any additional files that may exist for the previous day. See the code below for what I currently have:
$a = "http://10.109.120.101/logs/Log1/"
$b = (get-date).AddDays(-1).ToString("yyMMdd") + "00.CSV"
$c = "C:\"
$url = "$a$b"
$WebClient = New-Object net.webclient
$path = "$c$b"
$WebClient.DownloadFile($url, $path)
Try something like this:
$Date=(get-date).AddDays(-1).ToString("yyMMdd")
$URLFormat ='http://10.109.120.101/logs/Log1/{0}{1:D2}.CSV'
$WebClient = New-Object net.webclient
#build destination path
$PathDest="C:\Temp\$Date"
New-Item -Path $PathDest -ItemType Directory -ErrorAction SilentlyContinue
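#file numbers start at 00, so the range starts at 0
0..99 | %{
$Path="$PathDest\{0:D2}.CSV" -f $_
$URL=$URLFormat -f $Date, $_
try
{
Write-Host ("Try to download '{0}' file to '{1}'" -f $URL, $Path)
#DownloadFile takes the source address first, then the local file name
$WebClient.DownloadFile($URL, $Path)
}
catch
{
#file number does not exist on the server; ignore and continue
}
}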
$WebClient.Dispose()
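As a small illustration of the yyMMdd##.CSV naming scheme from the question, the candidate file names for one day can be generated up front and then fed to the download loop. Get-LogFileNames and its -Max parameter are hypothetical names used only for this sketch.

```powershell
# Hypothetical helper: list the candidate file names for one day, 00 through $Max.
function Get-LogFileNames {
    param(
        [datetime]$Day,
        [int]$Max = 99
    )
    $stamp = $Day.ToString("yyMMdd")
    # {1:D2} pads the sequence number to two digits: 00, 01, 02, ...
    0..$Max | ForEach-Object { "{0}{1:D2}.CSV" -f $stamp, $_ }
}

# First four candidates for yesterday
Get-LogFileNames -Day (Get-Date).AddDays(-1) -Max 3
```

If the logger never skips a sequence number (an assumption worth checking against the device), the download loop could stop at the first name that fails instead of trying all 100.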

Powershell not sending the right path for a file as argument

I'm trying to apply a hash function to all the files inside a folder as a kind of version control. The idea is to make a text file that lists the name of each file and its generated checksum. Digging online, I found some code that should do the trick (in theory):
$list = Get-ChildItem 'C:\users\public\documents\folder' -Filter *.cab
$sha1 = New-Object System.Security.Cryptography.SHA1CryptoServiceProvider
foreach ($file in $list) {
$return = "" | Select Name, Hash
$returnname = $file.Name
$returnhash = [System.BitConverter]::ToString($sha1.ComputeHash([System.IO.File]::ReadAllBytes($file.Name)))
$return = "$returnname,$returnhash"
Out-File -FilePath .\mylist.txt -Encoding Default -InputObject ($return) -Append
}
When I run it however, I get an error because it tries to read the files from c:\users\me\, the folder where I'm running the script. And the file c:\users\me\aa.cab does not exist and hence can't be reached.
I've tried everything that I could think of, but no luck. I'm using Windows 7 with Powershell 2.0, if that helps in any way.
Try with .FullName instead of just .Name.
$returnhash = [System.BitConverter]::ToString($sha1.ComputeHash([System.IO.File]::ReadAllBytes($file.FullName)))
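The difference matters because .Name holds only the file name, which .NET then resolves against the process working directory, while .FullName is the absolute path. A minimal sketch of the same idea wrapped in a function, assuming SHA-1 is still the desired algorithm: Get-Sha1Line is a hypothetical helper name, and [System.Security.Cryptography.SHA1]::Create() is swapped in for the SHA1CryptoServiceProvider used in the question.

```powershell
# Hypothetical helper: return "name,hash" for one file, read via its absolute path.
function Get-Sha1Line {
    param([System.IO.FileInfo]$File)
    $sha1 = [System.Security.Cryptography.SHA1]::Create()
    try {
        # FullName is absolute, so the current directory no longer matters
        $bytes = [System.IO.File]::ReadAllBytes($File.FullName)
        $hash  = [System.BitConverter]::ToString($sha1.ComputeHash($bytes))
        "{0},{1}" -f $File.Name, $hash
    } finally {
        $sha1.Clear()   # release the hash object's resources
    }
}

# Demo on a temporary file so the sketch runs anywhere
$tmp = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllBytes($tmp, [System.Text.Encoding]::ASCII.GetBytes("abc"))
Get-Sha1Line (Get-Item $tmp)
Remove-Item $tmp
```

In the original loop, each line could then be appended with Out-File -Append as before.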

Powershell automated deletion of specified SharePoint documents

We have a csv file with approximately 8,000 SharePoint document file URLs - the files in question they refer to have to be downloaded to a file share location, then deleted from the SharePoint. The files are not located in the same sites, but across several hundred in a server farm. We are looking to remove only the specified files - NOT the entire library.
We have the following script to effect the download, which creates the folder structure so that the downloaded files are separated.
param (
[Parameter(Mandatory=$True)]
[string]$base = "C:\Export\",
[Parameter(Mandatory=$True)]
[string]$csvFile = "c:\export.csv"
)
write-host "Commencing Download"
$date = Get-Date
add-content C:\Export\Log.txt "Commencing Download at ${date}:"
$webclient = New-Object System.Net.WebClient
$webclient.UseDefaultCredentials = $true
$files = (import-csv $csvFile | Where-Object {$_.Name -ne ""})
$line=1
Foreach ($file in $files) {
$line = $line + 1
if (($file.SpURL -ne "") -and ($file.path -ne "")) {
$lastBackslash = $file.SpURL.LastIndexOf("/")
if ($lastBackslash -ne -1) {
$fileName = $file.SpURL.substring(1 + $lastBackslash)
$filePath = $base + $file.path.replace("/", "\")
New-Item -ItemType Directory -Force -Path $filePath.substring(0, $filePath.length - 1)
$webclient.DownloadFile($file.SpURL, $filePath + $fileName)
$url=$file.SpURL
add-content C:\Export\Log.txt "INFO: Processing line $line in $csvFile, writing $url to $filePath$fileName"
} else {
$host.ui.WriteErrorLine("Exception: URL has no backslash on $line for filename $csvFile")
}
} else {
$host.ui.WriteErrorLine("Exception: URL or Path is empty on line $line for filename $csvFile")
}
}
write-Host "Download Complete"
Is there a way we could get the versions for each file?
I have been looking for a means to carry out the deletion, using the same csv file as reference - all of the code I have seen refers to deleting entire libraries, which is not desired.
I am very new to PowerShell and am getting lost. Can anyone shed some light?
Many thanks.
This looks like it might be useful. It's a different approach and would need to be modified to pull in the file list from your CSV but it looks like it generally accomplishes what you are looking to do.
https://sharepoint.stackexchange.com/questions/6511/download-and-delete-documents-using-powershell

(Powershell) Loop to delete files from an FTP Location

Good morning!
I have made it to the last (and rather pivotal) stage in my script, which is looping to delete files from a directory. I'm not going to pretend I'm knowledgeable at Powershell (far from it), so I'm sort-of chopping up blocks of code I find on the net, improvising and hoping it works.
I'm hoping someone can decipher what I'm trying to do here and see what I'm doing wrong!
# Clear FTP Directory
$DelLoop=1
$server = "www.newsbase.com"
$dir = "/usr/local/tomcat/webapps/newsbasearchive/monitors/asiaelec/"
"open $server
user Canttell Youthis
binary
cd $dir
" +(
For ($DelLoop=1; $DelLoop -le 5; 5)
{
$FileList[$DelLoop] | %{ "delete ""$_""`n" }
$DelLoop++
})| ftp -i -in
I know that the 'Open Connection' portion works, it's just the loop. It just keeps complaining about misplaced operators, and when I fix those, it doesn't throw up any errors - but it doesn't do anything either.
I spent the best part of 4 hours researching this yesterday, and I'm hoping one of you guys can help me.
Thanks in advance!
ADDENDUM:
Here is more of the code, as requested:
# Clear existing .htm file to avoid duplication
Get-ChildItem -path ".\" -recurse -include index.jsp | ForEach-Object {
Clear-Content "index.jsp"
}
# Set first part of .JSP Body
$HTMLPart1="</br><tr><td colspan=9 align=center><p style=""font-family:Arial"">Here are the links to the last 3 AsiaElec PDFs:</br><ul>"
# Recurse through directory, looking for 3 most recent .PDF files 3 times
$Directory="C:\PDFs"
$HTMLLinePrefix="<li><a style=""font-family:Arial""href="""
$HTMLLineSuffix="</a></li>"
$HTMLLine=@(1,2,3,4)
$Loop=1
$PDF=@(1,2,3,4)
Get-ChildItem -path $Directory -recurse -include *.pdf | sort-object -Property LastWriteTime -Descending | select-object -First 3 | ForEach-Object {
$PDF[$Loop]=$_.name
$HTMLLine[$Loop]=$HTMLLinePrefix + $_.name + """>" + $_.name + $HTMLLineSuffix
$Loop++
}
# Final .JSP File Assembly
Get-Content "header.html" >> "index.jsp"
$HTMLPart1 >> "index.jsp"
$LineParse=""
$Loop2=1
For ($Loop2=1; $Loop2 -le 3; 3)
{
$HTMLLine[$Loop2] >> "index.jsp"
$Loop2++
}
Get-Content "tail.html" >> "index.jsp"
# Prepare File List
$FileList=@(1,2,3,4,5)
$FileList[2]=$PDF[2]
$FileList[3]=$PDF[3]
$FileList[4]="index.jsp"
# Clear FTP Directory
$DelLoop=1
$server = "www.newsbase.com"
$dir = "/usr/local/tomcat/webapps/newsbasearchive/monitors/asiaelec/"
"open $server
user derek bland1ne
binary
cd $dir
" +(
For ($DelLoop=1; $DelLoop -le 5; 5)
{
$FileList[$DelLoop] | %{ "delete ""$_""`n" }
$DelLoop++
})| ftp -i -in
This isn't all of it, but I believe it contains all the relevant info.
Your $dir path looks like you're on a unix system so this may be a little different, but all you need to do is change your final loop a little bit:
For ($DelLoop=1; $DelLoop -le 5; $DelLoop++)
{
$FileList[$DelLoop] | % { rm $FileList[$DelLoop] }
}
This is assuming that $FileList contains the files you want to delete and not only (what I'm guessing are dummy) numbers. I also suggest that you download the module that @Graimer mentions, put it in WindowsPowerShell > Modules > %ModuleFolder% > %Module.psm1%, and import it from your profile.
You can then just use PS> Remove-FTPItem -Path "/myFolder" -Recurse to remove your FTP stuff. Making your life easier.
Tweaking the solution from this post may also help: Upload files with FTP using PowerShell
e.g:
Using $ftp.Method = [System.Net.WebRequestMethods+Ftp]::DeleteFile to delete the file,
and $response = $ftp.GetResponse() to find out if things went smoothly.
EDIT
Wrote this function after doing a little bit of research from here http://social.msdn.microsoft.com/forums/en-US/netfxnetcom/thread/17a3abbc-6144-433b-aadd-1f776c042bd5 and adapting the code from the Accepted Answer in the above link as well as the module @Graimer talked about.
function deleteFTPSide
{
Param(
[String] $ftpUserName = "myUserName",
[String] $ftpDomain = "ftp.place.com", # Normal domains begin with "ftp" here
[String] $ftpPassword = "myPassword",
[String] $ftpPort = 21, # Leave as the default FTP port
[String] $fileToDelete = "folder.domain.com/subfolder/file.txt"
)
# Create the direct path to the file you want to delete
[String] $ftpPath = "ftp://"+"$ftpUserName"+":"+"$ftpPassword@$ftpDomain"+":"+"$ftpPort/$fileToDelete"
# create the FtpWebRequest and configure it
$ftp = [System.Net.FtpWebRequest]::Create($ftpPath)
$ftp.Method = [System.Net.WebRequestMethods+Ftp]::DeleteFile
$ftp.Credentials = new-object System.Net.NetworkCredential($ftpUserName,$ftpPassword)
$ftp.UseBinary = $true
$ftp.UsePassive = $true
$response = [System.Net.FtpWebResponse]$ftp.GetResponse()
$response.Close()
}
While, admittedly, not one of the most elegant solutions written, I've tested it and it works at deleting a specified file off an FTP server.
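To cover the original question's loop over several files, one option is to build one ftp:// URI per file name and issue a DeleteFile request for each. The sketch below only builds and prints the URIs, so it can be run without a server. Get-FtpFileUri, the server name, the directory, and the file names are all placeholders, not values from the question.

```powershell
# Hypothetical helper: build the ftp:// URI for one remote file.
function Get-FtpFileUri {
    param(
        [string]$Server,     # e.g. "ftp.place.com"
        [string]$RemoteDir,  # e.g. "/subfolder/"
        [string]$FileName,
        [int]$Port = 21
    )
    $dir = $RemoteDir.Trim("/")
    # Escape the file name so spaces and similar characters survive in the URI
    $name = [System.Uri]::EscapeDataString($FileName)
    if ($dir) { "ftp://{0}:{1}/{2}/{3}" -f $Server, $Port, $dir, $name }
    else      { "ftp://{0}:{1}/{2}" -f $Server, $Port, $name }
}

$files = "report one.pdf", "index.jsp"   # placeholder file names
foreach ($f in $files) {
    $uri = Get-FtpFileUri -Server "ftp.place.com" -RemoteDir "/subfolder/" -FileName $f
    "Would delete $uri"
    # To delete for real, create the request the same way deleteFTPSide does:
    # $ftp = [System.Net.FtpWebRequest]::Create($uri)
    # $ftp.Method = [System.Net.WebRequestMethods+Ftp]::DeleteFile
}
```

Passing credentials through $ftp.Credentials, as the function above already does, keeps the user name and password out of the URI string.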