I have an Excel workbook with 1000 worksheets in it.
I want to create an index page with links to all of these worksheets, so that clicking a link navigates to that specific worksheet.
I need some ideas on how to do this.
I don't have Excel installed on the server where I run the code below to generate the Excel file.
$path = "C:\Scripts\"
$csvs = Get-ChildItem $path -Filter *.csv
$excelFileName = Join-Path $path -ChildPath "$(Get-Date -f yyyyMMdd)_combineddata.xlsx"
foreach ($csv in $csvs)
{
$content = Import-Csv -Path $csv
$props = @{
WorksheetName = $csv.BaseName
PassThru = $true
InputObject = $content
}
if(-not $xlsx)
{
$props.Path = $excelFileName
$xlsx = Export-Excel @props
continue
}
$props.ExcelPackage = $xlsx
Export-Excel @props > $null
}
Close-ExcelPackage $xlsx
Update
I managed to create the final Excel file with an Index sheet first, listing all the other sheet names.
I am using the ImportExcel module, as I don't have Excel installed on the server.
How can I convert those sheet names into hyperlinks?
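One possible approach, sketched under assumptions: Excel's HYPERLINK worksheet function accepts a target of the form #'Sheet name'!A1 for a location inside the same workbook, so the Index sheet could hold one such formula per worksheet. The helper name New-SheetLinkFormula is hypothetical, and the commented Export-Excel call assumes that ImportExcel writes strings starting with = as formulas, which should be verified against the installed version.

```powershell
# Hypothetical helper: builds a HYPERLINK formula that jumps to cell A1 of a sheet.
function New-SheetLinkFormula {
    param(
        [Parameter(Mandatory)]
        [string]$SheetName
    )
    # Inside the quoted sheet reference a single quote is doubled;
    # inside the formula's string literals a double quote is doubled.
    $target  = $SheetName.Replace("'", "''").Replace('"', '""')
    $display = $SheetName.Replace('"', '""')
    '=HYPERLINK("#''{0}''!A1","{1}")' -f $target, $display
}

# Placeholder sheet names; in the script above they would come from $csvs.BaseName.
$sheetNames = 'Sales', 'Q1 Report', "Bob's data"

$indexRows = foreach ($name in $sheetNames) {
    [PSCustomObject]@{ Worksheet = New-SheetLinkFormula -SheetName $name }
}
$indexRows.Worksheet

# Assumed usage with ImportExcel (not run here):
# $indexRows | Export-Excel -Path $excelFileName -WorksheetName 'Index' -MoveToStart
```

Building the formulas as plain strings keeps this step independent of Excel itself, which fits a server where Excel is not installed.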
Related
I am working on a script that merges multiple CSV files into a single Excel file, with each CSV on its own sheet named after the CSV file.
All the csv files have the same number of columns and name.
I don't have Excel installed on my server, so I've written this code using ImportExcel:
#Install-Module ImportExcel -scope CurrentUser
$path="C:\Scripts" #target folder
cd $path;
$csvs = Get-ChildItem .\* -Include *.csv
$csvCount = $csvs.Count
Write-Host "Detected the following CSV files: ($csvCount)"
foreach ($csv in $csvs) {
Write-Host " -"$csv.Name
}
$excelFileName = $path + "\" + $(get-date -f yyyyMMdd) + "_combined-data.xlsx"
Write-Host "Creating: $excelFileName"
foreach ($csv in $csvs) {
    $csvPath = $path + $csv.Name
    $worksheetName = $csv.Name.Replace(".csv","")
    Write-Host " - Adding $worksheetName to $excelFileName"
    Import-Csv -Path $csvPath | Export-Excel -Path $excelFileName -WorkSheetname $worksheetName
}
The script is taking time and executing without any issue but the Excel sheet is not being generated.
Please can you help me find and fix the issue in this script?
If you want to add a worksheet per CSV to your Excel file the code should look something like this:
$path = "D:\Testing"
$csvs = Get-ChildItem $path -Filter *.csv
$excelFileName = Join-Path $path -ChildPath "$(Get-Date -f yyyyMMdd)_combined-data.xlsx"
foreach ($csv in $csvs)
{
    $props = @{
        WorksheetName = $csv.BaseName
        Path = $excelFileName
    }
    # use FullName so the file is found regardless of the current directory
    Import-Csv -Path $csv.FullName |
        Export-Excel @props
}
If you want to do all in memory (more efficient than the example above), it's a bit more complicated. You need to use -PassThru.
Note: this works on ImportExcel 7.1.2, so make sure you have an up-to-date version.
$path = "D:\Testing"
$csvs = Get-ChildItem $path -Filter *.csv
$excelFileName = Join-Path $path -ChildPath "$(Get-Date -f yyyyMMdd)_combined-data.xlsx"
# start without a package so the first iteration creates the file
$xlsx = $null
foreach ($csv in $csvs)
{
    $content = Import-Csv -Path $csv.FullName
    $props = @{
        WorksheetName = $csv.BaseName
        PassThru = $true
        InputObject = $content
    }
    if(-not $xlsx)
    {
        # first CSV: create the workbook and keep the package open
        $props.Path = $excelFileName
        $xlsx = Export-Excel @props
        continue
    }
    # remaining CSVs: add a worksheet to the open package
    $props.ExcelPackage = $xlsx
    Export-Excel @props > $null
}
# write the package to disk
Close-ExcelPackage $xlsx
If you want all CSV files on the same worksheet, assuming they all have the same headers:
$path = "D:\Testing"
$csvs = Get-ChildItem $path -Filter *.csv
$excelFileName = Join-Path $path -ChildPath "$(Get-Date -f yyyyMMdd)_combined-data.xlsx"
# Import-Csv accepts several paths at once; pass full paths to be safe
Import-Csv -Path $csvs.FullName |
    Export-Excel -WorksheetName 'Merged' -Path $excelFileName
I am trying to copy multiple excel workbooks to a single excel workbook with the below, but it is only copying 6 columns when I have 35.
#Get a list of files to copy from
$Files = GCI 'C:\Users\bob\Desktop\Und' | ?{$_.Extension -Match "xlsx?"} | select -ExpandProperty FullName
#Launch Excel, and make it do as its told (supress confirmations)
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $True
$Excel.DisplayAlerts = $False
#Open up a new workbook
$Dest = $Excel.Workbooks.Add()
ForEach($File in $Files[0..4]){
$Source = $Excel.Workbooks.Open($File,$true,$true)
If(($Dest.ActiveSheet.UsedRange.Count -eq 1) -and ([String]::IsNullOrEmpty ($Dest.ActiveSheet.Range("A1").Value2))){ #If there is only 1 used cell and it is blank select A1
[void]$source.ActiveSheet.Range("A1","F$(($Source.ActiveSheet.UsedRange.Rows|Select -Last 1).Row)").Copy()
[void]$Dest.Activate()
[void]$Dest.ActiveSheet.Range("A1").Select()}
Else{ #If there is data go to the next empty row and select Column A
[void]$source.ActiveSheet.Range("A2","F$(($Source.ActiveSheet.UsedRange.Rows|Select -Last 1).Row)").Copy()
[void]$Dest.Activate()
[void]$Dest.ActiveSheet.Range ("A$(($Dest.ActiveSheet.UsedRange.Rows|Select -last 1).row+1)").Select()}
[void]$Dest.ActiveSheet.Paste()
$Source.Close()}
$Dest.SaveAs("C:\Users\bob\Desktop\Und\combo\Combined.xlsx",51)
$Dest.close()
$Excel.Quit()
A solution with the excellent PowerShell module ImportExcel (requires PowerShell 5 or later).
First, install the module:
in an Administrator PowerShell console: Install-Module -Name ImportExcel
in a non Administrator PowerShell console: Install-Module -Name ImportExcel -Scope CurrentUser
Then, use the following code:
$source = 'C:\Users\bob\Desktop\Und'
$destination = 'C:\Users\bob\Desktop\Und\combo\Combined.xlsx'
$fileList = Get-ChildItem -Path $source -Filter '*.xlsx'
foreach ($file in $fileList) {
    $fileContent = Import-Excel -Path $file.FullName
    $excelParameters = @{
        Path = $destination
        WorkSheetname = 'Combined'
    }
    if ((Test-Path -Path $destination) -and (Import-Excel @excelParameters)) {
        $excelParameters.Append = $true
    }
    $fileContent | Export-Excel @excelParameters
}
This code assumes that all your Excel source files have the same headers and that you want all your data in the same worksheet. It can be adapted to support other scenarios.
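The same-headers assumption can be checked before appending. The sketch below compares the column names of one sample row per source; Test-SameHeader is a hypothetical helper, and the sample objects stand in for the first row that Import-Excel would return for each file.

```powershell
# Hypothetical helper: $true when every sample row has the same column names in the same order.
function Test-SameHeader {
    param(
        [Parameter(Mandatory)]
        [object[]]$FirstRows   # one sample row (object) per source file
    )
    $reference = $FirstRows[0].PSObject.Properties.Name -join '|'
    foreach ($row in $FirstRows) {
        if (($row.PSObject.Properties.Name -join '|') -ne $reference) {
            return $false
        }
    }
    return $true
}

# Stand-ins for the first row of each imported workbook
$a = [PSCustomObject]@{ Name = 'x'; Size = 1 }
$b = [PSCustomObject]@{ Name = 'y'; Size = 2 }
$c = [PSCustomObject]@{ Name = 'z'; Length = 3 }

Test-SameHeader -FirstRows $a, $b   # headers agree
Test-SameHeader -FirstRows $a, $c   # headers differ
```

Files that fail the check could be skipped or written to a separate worksheet instead of being appended.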
I have drafted a PowerShell script that searches for a string among a large number of Word files. The script is working fine, but I have around 1 GB of data to search through and it is taking around 15 minutes.
Can anyone suggest any modifications I can do to make it run faster?
Set-StrictMode -Version latest
$path = "c:\Tester1"
$output = "c:\Scripts\ResultMatch1.csv"
$application = New-Object -comobject word.application
$application.visible = $False
$findtext = "Roaming"
$charactersAround = 30
$results = @()
Function getStringMatch
{
For ($i=1; $i -le 4; $i++) {
$j="D"+$i
$finalpath=$path+"\"+$j
$files = Get-Childitem $finalpath -Include *.docx,*.doc -Recurse | Where-Object { !($_.psiscontainer) }
# Loop through all *.doc files in the $path directory
Foreach ($file In $files)
{
$document = $application.documents.open($file.FullName,$false,$true)
$range = $document.content
If($range.Text -match ".{$($charactersAround)}$($findtext).{$($charactersAround)}"){
$properties = @{
File = $file.FullName
Match = $findtext
TextAround = $Matches[0]
}
$results += New-Object -TypeName PsCustomObject -Property $properties
$document.close()
}
}
}
If($results){
$results | Export-Csv $output -NoTypeInformation
}
$application.quit()
}
getStringMatch
import-csv $output
As mentioned in comments, you might want to consider using the OpenXML SDK library (you can also get the newest version of the SDK on GitHub), since it's way less overhead than spinning up an instance of Word.
Below I've turned your current function into a more generic one, using the SDK and with no dependencies on the caller/parent scope:
function Get-WordStringMatch
{
param(
[Parameter(Mandatory,ValueFromPipeline)]
[System.IO.FileInfo[]]$Files,
[string]$FindText,
[int]$CharactersAround
)
begin {
# import the OpenXML library
Add-Type -Path 'C:\Program Files (x86)\Open XML SDK\V2.5\lib\DocumentFormat.OpenXml.dll' |Out-Null
# make a "shorthand" reference to the word document type
$WordDoc = [DocumentFormat.OpenXml.Packaging.WordprocessingDocument] -as [type]
# construct the regex pattern
$Pattern = ".{$CharactersAround}$([regex]::Escape($FindText)).{$CharactersAround}"
}
process {
    # loop through the files (the Open XML SDK reads .docx only, not legacy .doc)
    foreach ($File in $Files)
    {
        $Document = $DocumentStream = $DocumentReader = $null
        try {
            # open document read-only, wrap content stream in streamreader
            $Document = $WordDoc::Open($File.FullName, $false)
            $DocumentStream = $Document.MainDocumentPart.GetStream()
            $DocumentReader = New-Object System.IO.StreamReader $DocumentStream
            # read entire document
            if($DocumentReader.ReadToEnd() -match $Pattern)
            {
                # got a match? output our custom object
                New-Object psobject -Property @{
                    File = $File.FullName
                    Match = $FindText
                    TextAround = $Matches[0]
                }
            }
        }
        finally {
            # clean up after every file, not only after the last one
            if ($DocumentReader) { $DocumentReader.Dispose() }
            if ($DocumentStream) { $DocumentStream.Dispose() }
            if ($Document) { $Document.Dispose() }
        }
    }
}
}
Now that you have a nice function that supports pipeline input, all you need to do is gather your documents and pipe them to it!
# variables
$path = "c:\Tester1"
$output = "c:\Scripts\ResultMatch1.csv"
$findtext = "Roaming"
$charactersAround = 30
# gather the files
$files = 1..4|ForEach-Object {
    # $_ is the current number, giving D1..D4
    $finalpath = Join-Path $path "D$_"
    Get-Childitem $finalpath -Recurse | Where-Object { !($_.PsIsContainer) -and @('.docx','.doc') -contains $_.Extension }
}
# run them through our new function
$results = $files |Get-WordStringMatch -FindText $findtext -CharactersAround $charactersAround
# got any results? export it all to CSV
if($results){
$results |Export-Csv -Path $output -NoTypeInformation
}
Since all of our components now support pipelining, you could do it all in one go:
1..4|ForEach-Object {
    $finalpath = Join-Path $path "D$_"
    Get-Childitem $finalpath -Recurse | Where-Object { !($_.PsIsContainer) -and @('.docx','.doc') -contains $_.Extension }
} |Get-WordStringMatch -FindText $findtext -CharactersAround $charactersAround |Export-Csv -Path $output -NoTypeInformation
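The pattern built in the function's begin block can be tried on a plain string, with no Word or Open XML dependency. The sample text below is made up. Note that .{N} demands exactly N characters on each side, so a hit within N characters of the start or end of the text is not reported; .{0,N} is a more tolerant variant.

```powershell
$findText = 'Roaming'
$charactersAround = 5
$text = 'AppData\Roaming\Microsoft'

# Strict pattern: exactly 5 characters required before and after the search text
$strict = ".{$charactersAround}$([regex]::Escape($findText)).{$charactersAround}"
# Tolerant pattern: up to 5 characters before and after
$tolerant = ".{0,$charactersAround}$([regex]::Escape($findText)).{0,$charactersAround}"

if ($text -match $strict) { "strict:   $($Matches[0])" }

if ('Roaming profile' -match $strict) {
    'strict matched at the start of the text'
} else {
    'strict missed a hit at the start of the text'
}

if ('Roaming profile' -match $tolerant) { "tolerant: $($Matches[0])" }
```

[regex]::Escape matters when the search text contains regex metacharacters such as a dot or a backslash.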
I am new to Powershell and am struggling a bit. I have obtained an example of the sort of function I want to use and adapted it partially. What I want is for it to loop through each subdirectory of C:\Test\, and combine just the PDFs in each subdirectory together (leaving the resulting PDF in each subdirectory).
At the moment I can get it to comb through the subdirectories, but it then combines the contents of all subdirectories into one giant PDF in the top level directory, which is not what I want. I feel like maybe I need to use an array of sorts but I don't know Powershell well enough yet.
BTW this uses PDFSharp - a .Net library.
Function PDFCombine {
$filepath = 'C:\Test\'
$filename = '.\Combined' #<--- ???
$output = New-Object PdfSharp.Pdf.PdfDocument
$PdfReader = [PdfSharp.Pdf.IO.PdfReader]
$PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
foreach($i in (gci $filepath *.pdf -Recurse)) {
$input = New-Object PdfSharp.Pdf.PdfDocument
$input = $PdfReader::Open($i.fullname, $PdfDocumentOpenMode::Import)
$input.Pages | %{$output.AddPage($_)}
}
$output.Save($filename)
}
Your question was unclear about how many levels you need to go down. You can try this (untested). It goes one level down from $filepath, gets all PDF files in that folder and its subfolders, and combines them into Subfoldername-Combined.pdf:
Function PDFCombine {
    $filepath = 'C:\Test\'
    $PdfReader = [PdfSharp.Pdf.IO.PdfReader]
    $PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
    #Foreach subfolder (FIRST LEVEL ONLY!)
    Get-ChildItem $filepath | Where-Object { $_.PSIsContainer } | Foreach-Object {
        #Create new output pdf-file
        $output = New-Object PdfSharp.Pdf.PdfDocument
        $outfilepath = Join-Path $_.FullName "$($_.Name)-Combined.pdf"
        #Find and add pdf files in this folder and its subfolders
        Get-ChildItem -Path $_.FullName -Filter *.pdf -Recurse | ForEach-Object {
            #Use $pdf rather than $input, which is an automatic variable
            $pdf = $PdfReader::Open($_.FullName, $PdfDocumentOpenMode::Import)
            $pdf.Pages | %{ [void]$output.AddPage($_) }
        }
        #Save
        $output.Save($outfilepath)
    }
}
So you should get this:
c:\Test\Folder1\Folder1-Combined.pdf #should include all pages in Folder1 and ANY subfolders below
c:\Test\Folder2\Folder2-Combined.pdf #should include all pages in Folder2 and ANY subfolders below
#etc.
If you need it to create a combined PDF for every subfolder (not only the first level), then you could try this (untested):
Function PDFCombine {
    $filepath = 'C:\Test\'
    $PdfReader = [PdfSharp.Pdf.IO.PdfReader]
    $PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
    #Foreach folder that contains pdf files
    Get-ChildItem -Path $filepath -Filter *.pdf -Recurse | Group-Object DirectoryName | ForEach-Object {
        #Create new output pdf-file
        $output = New-Object PdfSharp.Pdf.PdfDocument
        $outfilepath = Join-Path $_.Name "Combined.pdf"
        #Add the pdf files that sit directly in this folder
        $_.Group | ForEach-Object {
            #Use $pdf rather than $input, which is an automatic variable
            $pdf = $PdfReader::Open($_.FullName, $PdfDocumentOpenMode::Import)
            $pdf.Pages | %{ [void]$output.AddPage($_) }
        }
        #Save
        $output.Save($outfilepath)
        #Remove output-object
        Remove-Variable output
    }
}
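How Group-Object DirectoryName splits a recursive listing into one group per folder can be seen without PDFsharp. The sketch below builds a throwaway folder tree under the temp directory with empty placeholder files (all names are made up) and lists what each group contains.

```powershell
# Build a throwaway tree: root\A\1.pdf, root\A\2.pdf, root\B\Sub\3.pdf
$root   = Join-Path ([System.IO.Path]::GetTempPath()) ("pdfgroup_" + [guid]::NewGuid().ToString('N'))
$dirA   = Join-Path $root 'A'
$dirSub = Join-Path (Join-Path $root 'B') 'Sub'
$null = New-Item -ItemType Directory -Path $dirA, $dirSub
$null = New-Item -ItemType File -Path (Join-Path $dirA '1.pdf'), (Join-Path $dirA '2.pdf'), (Join-Path $dirSub '3.pdf')

# One group per directory that directly contains PDF files
$groups = Get-ChildItem -Path $root -Filter *.pdf -Recurse | Group-Object DirectoryName

foreach ($g in $groups) {
    # $g.Name is the directory path, $g.Group holds that directory's files
    '{0}: {1}' -f (Split-Path $g.Name -Leaf), ($g.Group.Name -join ', ')
}

# Clean up the demonstration tree
Remove-Item -Path $root -Recurse -Force
```

Folder B gets no group of its own because it holds no PDF directly, which is why this variant writes a Combined.pdf only into folders that actually contain PDF files.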
not tested ...
Function PDFCombine {
    $filepath = 'C:\Test\'
    $filename = 'Combined.pdf'
    $PdfReader = [PdfSharp.Pdf.IO.PdfReader]
    $PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
    $output = $null
    $lastdir = ""
    # sort so that files from the same directory are processed together
    foreach($i in (gci $filepath *.pdf -Recurse | sort DirectoryName, Name)) {
        if ($lastdir -ne $i.DirectoryName) {
            # directory changed: save the previous directory's document, start a new one
            if ($output) { $output.Save((Join-Path $lastdir $filename)) }
            $lastdir = $i.DirectoryName
            $output = New-Object PdfSharp.Pdf.PdfDocument
        }
        $pdf = $PdfReader::Open($i.FullName, $PdfDocumentOpenMode::Import)
        $pdf.Pages | %{ [void]$output.AddPage($_) }
    }
    # save the document for the last directory
    if ($output) { $output.Save((Join-Path $lastdir $filename)) }
}
Basically what I'm trying to do is gather users folder size from their network folder then export that to a .csv, directory structure looks something like this: network:\Department\user...User's-stuff
The script I have right now gets the department file name and the user's folder size, but not the user's name (folder name in the department). As for the TimeStamp, I'm not sure it's working correctly. It's meant to make a timestamp when it starts on the users in the next department so basically, all users in the same department will have the same timestamp.
This is what I have so far:
$root = "network"
$container= @()
$place = "C:\temp\"
$file = "DirectoryReport.csv"
Function Get-FolderSize
{
BEGIN{$fso = New-Object -comobject Scripting.FileSystemObject}
PROCESS
{
$prevDept = (Split-Path $path -leaf)
$path = $input.fullname
$folder = $fso.GetFolder($path)
$Volume = $prevDept + "-users"
$user = $folder.name #can't figure this part out...
$size = $folder."size(MB)"
if ( (Split-Path $path -leaf) -ne $prevDept)
{
$time = Get-Date -format M/d/yyy" "HH:mm #Probably wrong too..
}
return $current = [PSCustomObject]@{'Path' = $path; 'Users' = $user; 'Size(MB)' = ($size /1MB ); 'Volume' = $Volume; 'TimeStamp' = $time;}
}
}
$container += gci $root -Force -Directory -EA 0 | Get-FolderSize
$container
#Creating the .csv path
$placeCSV = $place + $file
#Checks if the file already exists
if ((test-path ($placeCSV)) -eq $true)
{
$file = "DirectoryReport" + [string](Get-Date -format MM.d.yyy.#h.mm.sstt) + ".csv"
rename-item -path $placeCSV -newname $file
$placeCSV = $place + $file
}
#Exports the CSV file to desired folder
$container | epcsv $placeCSV -NoTypeInformation -NoClobber
But in the CSV file the user and the timestamp are wrong. Thanks for any/all help
This really seems to be doing it the hard way. Not using Get-ChildItem for this makes the script seem a little masochistic to me, so I'm going to use that cmdlet instead of creating a COM object.
I am a little confused as to why you wouldn't want to recurse for size, but ok, we'll go that route. This will get you your folders sizes, in MB.
#Get a listing of department folders
$Depts = GCI $root -Force -Directory
#Collect the users of all departments (initialised once, outside the loop)
$Users = @()
#Loop through them
ForEach($Dept in $Depts){
    #One timestamp per department, shared by all of its users
    $Timestamp = Get-Date -Format "M/d/yyyy HH:mm"
    #Loop through each user for the current department
    GCI $Dept.FullName -Directory |%{
        $Users += [PSCustomObject]@{
            User=$_.Name
            Path=$_.FullName
            "Size(MB)"=(GCI $_.FullName -File|Measure-Object -Sum Length).Sum/1MB
            Volume="$($Dept.Name)-Users"
            TimeStamp=$Timestamp
        }
    }
}
#Rename output file if it exists
If(Test-Path "C:\Temp\DirectoryReport.csv"){
Rename-Item "C:\Temp\DirectoryReport.csv" "DirectoryReport.$(Get-Date -format MM.d.yyy.#h.mm.sstt).csv"
}
#Output file
$Users | Export-Csv "C:\Temp\DirectoryReport.csv" -NoTypeInformation
If you want the total size of all files within each user's folder, including files in subfolders, make the "Size(MB)" calculation recursive: "Size(MB)"=(GCI $_.FullName -File -Recurse|Measure-Object -Sum Length).Sum/1MB and that should have you good to go. Note that the total is read from the .Sum property of the Measure-Object result; dividing the output of Select Sum directly fails, because that output is an object rather than a number.
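The difference between the top-level and the recursive size can be checked on a small throwaway folder. Get-FolderSizeMB is a hypothetical helper and the files are created only for the demonstration; the sketch assumes write access to the temp directory.

```powershell
# Hypothetical helper: total size of the files in a folder, in MB
function Get-FolderSizeMB {
    param(
        [Parameter(Mandatory)][string]$Path,
        [switch]$Recurse
    )
    $sum = (Get-ChildItem -Path $Path -File -Recurse:$Recurse |
            Measure-Object -Property Length -Sum).Sum
    # A folder without files yields no sum; report it as 0
    if ($null -eq $sum) { $sum = 0 }
    $sum / 1MB
}

# Throwaway folder with 1 MB at the top level and 2 MB in a subfolder
$demo = Join-Path ([System.IO.Path]::GetTempPath()) ("sizedemo_" + [guid]::NewGuid().ToString('N'))
$sub  = Join-Path $demo 'Sub'
$null = New-Item -ItemType Directory -Path $sub
[System.IO.File]::WriteAllBytes((Join-Path $demo 'a.bin'), (New-Object byte[] (1MB)))
[System.IO.File]::WriteAllBytes((Join-Path $sub 'b.bin'), (New-Object byte[] (2MB)))

$top = Get-FolderSizeMB -Path $demo            # top-level files only
$all = Get-FolderSizeMB -Path $demo -Recurse   # includes subfolders
"Top level: $top MB, recursive: $all MB"

# Clean up the demonstration folder
Remove-Item -Path $demo -Recurse -Force
```

On large network shares the recursive variant reads every file entry, so it can take noticeably longer than the top-level one.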