Extract Pages from a PDF using itextsharp in Powershell - powershell

I have been researching this for weeks and can't seem to make much headway. I have a large PDF (900+ pages) that is the result of a mail merge: 900+ copies of the same one-page document, the only difference being someone's name at the bottom. What I am trying to do is have a PowerShell script read the document using iTextSharp and save the pages that contain a specific string (the person's name) into that person's folder.
This is what I have managed so far.
Add-Type -Path C:\scripts\itextsharp.dll
$reader = New-Object iTextSharp.text.pdf.PdfReader -ArgumentList "$pwd\downloads\TMs.pdf"
for($page = 1; $page -le $reader.NumberOfPages; $page++) {
$pageText = [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader,$page).Split([char]0x000A)
if($PageText -match 'DAN KAGAN'){
Write-Host "DAN FOUND"
}
}
As you can see, I am only using one name for now for testing. The script finds the name properly 10 times. What I cannot seem to find any information on is how to extract the pages that this string appears on.
I hope this was clear. If I can clarify anything, please let me know.
Thanks!

I actually just finished writing a very similar script. With my script, I need to scan a PDF of report cards, find a student's name and ID number, and then extract that page and name it appropriately. However, each report card can span multiple pages.
It looks like you're using iTextSharp 5, which is good because so am I. The syntax of iText 7 (the successor to iTextSharp) is wildly different and I haven't learned it yet.
Here's the logic that does the page extraction, roughly:
$Document = [iTextSharp.text.Document]::new($PdfReader.GetPageSizeWithRotation($StartPage))
$TargetMemoryStream = [System.IO.MemoryStream]::new()
$PdfCopy = [iTextSharp.text.pdf.PdfSmartCopy]::new($Document, $TargetMemoryStream)
$Document.Open()
foreach ($Page in $StartPage..$EndPage) {
$PdfCopy.AddPage($PdfCopy.GetImportedPage($PdfReader, $Page));
}
$Document.Close()
$NewFileName = 'Elementary Student Record - {0}.pdf' -f $Current.Student_Id
$NewFileFullName = [System.IO.Path]::Combine($OutputFolder, $NewFileName)
[System.IO.File]::WriteAllBytes($NewFileFullName, $TargetMemoryStream.ToArray())
Here is the complete working script. I've removed as little as possible to provide you a near working example:
Import-Module -Name SqlServer -Cmdlet Invoke-Sqlcmd
Add-Type -Path 'C:\...\itextsharp.dll'
# Get table of valid student IDs
$ServerInstance = '...'
$Database = '...'
$Query = @'
select student_id, student_name from student
'@
$ValidStudents = @{}
Invoke-Sqlcmd -Query $Query -ServerInstance $ServerInstance -Database $Database -OutputAs DataRows | ForEach-Object {
[void]$ValidStudents.Add($_.student_id.trim(), $_.student_name)
}
$PdfFiles = Get-ChildItem "G:\....\*.pdf" -File |
Select-Object -ExpandProperty FullName
$OutputFolder = 'G:\...'
$StudentIDSearchPattern = '(?mn)^(?<Student_Id>\d{6,7}) - (?<Student_Name>.*)$'
foreach ($PdfFile in $PdfFiles) {
$PdfReader = [iTextSharp.text.pdf.PdfReader]::new($PdfFile)
$StudentStack = [System.Collections.Stack]::new()
# Map out the PDF file.
foreach ($Page in 1..($PdfReader.NumberOfPages)) {
[iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($PdfReader, $Page) |
Where-Object { $_ -match $StudentIDSearchPattern } |
ForEach-Object {
$StudentStack.Push([PSCustomObject]@{
Student_Id = $Matches['Student_Id']
Student_Name = $Matches['Student_Name']
StartPage = $Page
IsValid = $ValidStudents.ContainsKey($Matches['Student_Id'])
})
}
}
# Extract the pages and save the files
$LastPage = $PdfReader.NumberOfPages
while ($StudentStack.Count -gt 0) {
$Current = $StudentStack.Pop()
$StartPage = $Current.StartPage
$EndPage = $LastPage
$Document = [iTextSharp.text.Document]::new($PdfReader.GetPageSizeWithRotation($StartPage))
$TargetMemoryStream = [System.IO.MemoryStream]::new()
$PdfCopy = [iTextSharp.text.pdf.PdfSmartCopy]::new($Document, $TargetMemoryStream)
$Document.Open()
foreach ($Page in $StartPage..$EndPage) {
$PdfCopy.AddPage($PdfCopy.GetImportedPage($PdfReader, $Page));
}
$Document.Close()
$NewFileName = 'Elementary Student Record - {0}.pdf' -f $Current.Student_Id
$NewFileFullName = [System.IO.Path]::Combine($OutputFolder, $NewFileName)
[System.IO.File]::WriteAllBytes($NewFileFullName, $TargetMemoryStream.ToArray())
$LastPage = $Current.StartPage - 1
}
}
In my test environment this processes about 500 students across 5 source PDFs in about 15 seconds.
I tend to use constructors instead of New-Object, but there's no real difference between them. I just find them easier to read.
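For the single-page case in the question (one page per person, one folder per name), the same copy calls can be wrapped in a small function. This is only a sketch under some assumptions: Get-PageNumbersByName and Export-PdfPage are hypothetical helper names, the output layout in the usage comment is made up, and the only iTextSharp 5 calls used are ones that already appear above.

```powershell
# Hypothetical helper: given the text of each page (index 0 = page 1) and a
# list of names, return a hashtable of name -> page numbers containing it.
function Get-PageNumbersByName {
    param(
        [string[]]$PageText,
        [string[]]$Name
    )
    $map = @{}
    for ($i = 0; $i -lt $PageText.Count; $i++) {
        foreach ($n in $Name) {
            # Literal, case-insensitive substring test (no regex involved).
            if ($PageText[$i].IndexOf($n, [StringComparison]::OrdinalIgnoreCase) -ge 0) {
                if (-not $map.ContainsKey($n)) {
                    $map[$n] = [System.Collections.Generic.List[int]]::new()
                }
                $map[$n].Add($i + 1)
            }
        }
    }
    $map
}

# Hypothetical helper: copy one page of an open PdfReader into its own file.
# Assumes itextsharp.dll (version 5) was loaded with Add-Type beforehand.
function Export-PdfPage {
    param(
        $PdfReader,
        [int]$Page,
        [string]$Destination
    )
    $document = [iTextSharp.text.Document]::new($PdfReader.GetPageSizeWithRotation($Page))
    $stream   = [System.IO.MemoryStream]::new()
    $copy     = [iTextSharp.text.pdf.PdfCopy]::new($document, $stream)
    $document.Open()
    $copy.AddPage($copy.GetImportedPage($PdfReader, $Page))
    $document.Close()
    [System.IO.File]::WriteAllBytes($Destination, $stream.ToArray())
}

# Usage sketch ($OutputRoot is a placeholder for your target folder):
#   Add-Type -Path C:\scripts\itextsharp.dll
#   $reader = [iTextSharp.text.pdf.PdfReader]::new("$pwd\downloads\TMs.pdf")
#   $text = foreach ($p in 1..$reader.NumberOfPages) {
#       [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader, $p)
#   }
#   $pages = Get-PageNumbersByName -PageText $text -Name 'DAN KAGAN'
#   foreach ($name in $pages.Keys) {
#       $folder = New-Item -ItemType Directory -Force -Path (Join-Path $OutputRoot $name)
#       foreach ($p in $pages[$name]) {
#           Export-PdfPage $reader $p (Join-Path $folder.FullName "$name - page $p.pdf")
#       }
#   }
#   $reader.Close()
```

Splitting the name lookup from the file writing means the matching can be checked against plain strings before any PDF is touched.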

Related

SMLETS: Powershell

We want to generate an SR per row of a CSV file that looks like this:
SR template
The additional criteria:
If the SLO countdown is less than 7 days, the ticket is always due in 7 days. Otherwise it is due in the number of days given by SLO_Countdown.
The support group is always servicedesk,
unless the host_name does not contain "RES"; in that case the support group is EITS_HW_Notes and it will be assigned to "custodian".
An SR is generated no matter what, even if a value is null.
My difficulty is my lack of familiarity with SMLets. I am happy to consider generating tickets via email as well, but would like help on how best to do that via PowerShell. The code I came up with is below:
#Prod
#$GLOBAL:smdefaultcomputer = "prodserver"
#Test
$GLOBAL:smdefaultcomputer = "testserver"
Import-Module SMlets
$path = "C:\Temp\Test.csv"
$csv = Import-csv -path $path
#Variable / Class Setup
$srClass = Get-SCSMClass -name System.WorkItem.ServiceRequest
$srprior = Get-SCSMEnumeration -Name ServiceRequestPriorityEnum.Medium
$srurg = Get-SCSMEnumeration -Name ServiceRequestUrgencyEnum.Medium
#$ararea = get-SCSMEnumeration -Name ServiceRequestAreaEnum.Other
$ararea = get-SCSMEnumeration -Name Enum.add3768303064ec18890170ba33cffda
$title = "Title Goes Here"
$descrip = "Description info goes here"
#Service Request Arguments
$srargs = @{
Title = $title;
Urgency = $srurg;
Priority = $srprior;
ID = "SR{0}";
Area = $ararea;
SupportGroup = "ServiceDesk";
Description = $descrip
}
#Create Service Request
$newServiceRequest = New-SCSMOBject -Class $srClass -PropertyHashtable $srargs -PassThru
#get SR ID of the new object
$SRId = $newServiceRequest.id
#Get Projection & Object for Created Service Request
$srTypeProjection = Get-SCSMTypeProjection -name System.WorkItem.ServiceRequestProjection$
$SRProj = Get-scsmobjectprojection -ProjectionName $srTypeProjection.Name -filter "Id -eq $SRId"
#Set Affected User
$userClass = Get-SCSMClass -Name Microsoft.AD.UserBase$
$cType = "Microsoft.EnterpriseManagement.Common.EnterpriseManagementObjectCriteria"
$cString = "UserName = 'itservicenotifications' and Domain = 'SHERMAN'"
$crit = new-object $cType $cString,$userClass
$user = Get-SCSMObject -criteria $crit
$AffectedUserRel = get-scsmrelationshipclass -name System.WorkItemAffectedUser$
New-SCSMRelationshipObject -RelationShip $AffectedUserRel -Source $newServiceRequest -Target $user -Bulk
I tried the above code but am running into issues recognizing the column name in the CSV file and am unfamiliar with SMLETS + powershell if statements.
Columns are:
CSV Columns
CSV text with examples is: Columns with examples
Could you paste the CSV columns as text, please? Or, better, a sample CSV with one or two rows (redact any sensitive data).
I would expect a CSV to contain multiple rows - even if yours does not, it's good defensive programming to act as if it does. So the first modification I suggest is:
$path = "C:\Temp\Test.csv"
$csv = Import-csv -path $path
foreach ($Row in $csv)
{
# the rest of your code goes in here
}
I find it helpful while debugging to go step-by-step. If I understand your problem right, it's about building the right hashtable in $srargs to pass to New-SCSMOBject. So the next modification is:
foreach ($Row in $csv)
{
$srClass = Get-SCSMClass -name System.WorkItem.ServiceRequest
# etc
$srargs = @{
Title = $title
Urgency = $srurg
Priority = $srprior
ID = "SR{0}"
Area = $ararea
SupportGroup = "ServiceDesk"
Description = $descrip
}
$srargs # write the hashtable so you can inspect it
# skip the rest of the code for now
}
I understand your question as "how to express the logic of":
support group is always servicedesk
Unless the host_name does not contain "RES"
then the support group is contents of EITS_HW_Notes cell in CSV
and it will be assigned to "custodian"
I can't help you with setting the assignee. But we can rejig the rest of the statement:
if host_name contains "RES"
SupportGroup = servicedesk
else
SupportGroup = contents of EITS_HW_Notes cell
You can code that like this:
foreach ($Row in $csv)
{
$srClass = Get-SCSMClass -name System.WorkItem.ServiceRequest
# etc
if ($Row.host_name -like "*RES*")
{
$SupportGroup = "ServiceDesk"
}
else
{
$SupportGroup = $Row.EITS_HW_Notes
}
$srargs = @{
Title = $title
# etc
SupportGroup = $SupportGroup
Description = $descrip
}
}
Does that get you any closer to your solution?
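The 7-day rule from the question can be worked out per row in the same way. The following is a rough sketch, not tested against SMLets: the column names host_name, EITS_HW_Notes and SLO_Countdown are guesses (the real columns were only posted as an image), and the sample rows are made up. Which service request property should receive the due date depends on your template, so that part is left out.

```powershell
# Made-up sample data; the column names are assumptions, not the real CSV.
$csv = @'
host_name,EITS_HW_Notes,SLO_Countdown
RES-WEB01,Hardware Team,3
APP-DB02,Hardware Team,21
RES-APP03,Hardware Team,
'@ | ConvertFrom-Csv

$plan = foreach ($Row in $csv) {
    # An empty or non-numeric countdown parses as 0, so the 7-day floor applies.
    $days = 0
    [void][int]::TryParse($Row.SLO_Countdown, [ref]$days)
    if ($days -lt 7) { $days = 7 }

    if ($Row.host_name -like '*RES*') { $group = 'ServiceDesk' }
    else { $group = $Row.EITS_HW_Notes }

    # Emit one object per row so the result can be inspected before creating SRs.
    [PSCustomObject]@{
        HostName     = $Row.host_name
        SupportGroup = $group
        DaysUntilDue = $days
        DueDate      = (Get-Date).Date.AddDays($days)
    }
}
$plan | Format-Table
```

Inspecting $plan first, as with writing out $srargs above, shows whether every row gets sensible values before anything is sent to New-SCSMOBject.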

Powershell For Loop for multiple servers - to get SSAS connection string details

I am very new to powershell script. i am trying to get SSAS Tabular model connection string details for multiple servers. i have code which will return only for single server. How to modify the code to pass multiple servers?
$servername = "servername1"
# Connect SSAS Server
$server = New-Object Microsoft.AnalysisServices.Server
$server.connect($servername)
$DSTable = @();
foreach ( $db in $server.databases)
{
$dbname = $db.Name
$Srver = $db.ParentServer
foreach ( $ds in $db.Model.DataSources)
{
$hash = @{
"Server" = $Srver;
"Model_Name" = $dbname ;
"Datasource_Name" = $ds.Name ;
"ConnectionString" = $ds.ConnectionString ;
"ImpersonationMode" = $ds.ImpersonationMode;
"Impersonation_Account" = $ds.Account;
}
$row = New-Object psobject -Property $hash
$DSTable += $row
}
}
As commented, you can surround the code you have in another foreach loop.
Using array concatenation with += is a bad idea, because on each addition, the entire array needs to be recreated in memory, so that is both time and memory consuming.
Best thing is to let PowerShell do the heavy lifting of collecting the data:
$allServers = 'server01','server02','server03' # etc. an array of servernames
# loop through the servers array and collect the output in variable $result
$result = foreach($servername in $allServers) {
# Connect SSAS Server
$server = New-Object Microsoft.AnalysisServices.Server
$server.Connect($servername)
foreach ( $db in $server.databases) {
foreach ( $ds in $db.Model.DataSources) {
# output an object with the desired properties
[PsCustomObject]@{
Server = $db.ParentServer
Model_Name = $db.Name
Datasource_Name = $ds.Name
ConnectionString = $ds.ConnectionString
ImpersonationMode = $ds.ImpersonationMode
Impersonation_Account = $ds.Account
}
}
}
}
# output on screen
$result | Out-GridView -Title 'SSAS connection string details'
# output to a CSV file (change the path and filename here of course..)
$result | Export-Csv -Path 'D:\Test\MySSAS_Connections.csv' -UseCulture -NoTypeInformation
The above uses the -UseCulture parameter so that the CSV delimiter is your machine's list separator, which is what Excel expects when you open the file by double-clicking. Without it, the default comma is used.
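If the loop output cannot simply be assigned to a variable (for example, rows are added from several places), a generic list is one alternative that avoids the copying += does. A small sketch with made-up server and datasource names, with no SSAS connection involved:

```powershell
# Made-up input; nothing here connects to SSAS.
$allServers = 'server01', 'server02', 'server03'

# List[T].Add appends in place instead of rebuilding the whole array.
$rows = [System.Collections.Generic.List[object]]::new()
foreach ($servername in $allServers) {
    foreach ($n in 1..2) {
        $rows.Add([PSCustomObject]@{
            Server          = $servername
            Datasource_Name = "ds$n"
        })
    }
}
$rows.Count
```

The list can be piped to Out-GridView or Export-Csv just like $result above.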

Return all SSRS reports in a folder with the data source name and ConnectString

This is what I have so far. However, I want to list every report with its connection string. I don't see a unique identifier in the GetDataSourceContents() method to join the report and data source lists.
$ReportServerUri = "YOUR_SERVER";
$rs = New-WebServiceProxy -Uri $ReportServerUri -UseDefaultCredential -Namespace "SSRS"
$rs.Url = "YOUR_SERVER"
$rs.Credentials = [System.Net.CredentialCache]::DefaultNetworkCredentials;
$BICItems = $rs.ListChildren("/", $true);
$BICFolders = $BICItems | Where { $_.TypeName -eq "Folder"}
$BICDataSources = $BICItems | Where {$_.typename -eq "DataSource"}
$BICDataSourceFolders = $BICFolders | Where {$_.path -like "*Data Source*"}
$BICReports = $BICItems | Where {$_.typename -eq "Report"}
foreach ($DataSource in $BICDataSources)
{
$BICDataSourceContents = $rs.GetDataSourceContents($DataSource.Path)
$MyConnectStrings = $BICDataSourceContents | Where {$_.ConnectString -like "*MY_CONNECT_STRING*"}
$MyConnectStrings
}
I don't see a unique identifier in the GetDataSourceContents method to join the report and data source lists.
Nope, neither do I. However, when we are querying for those details we already know something unique enough: the path to the datasource itself. That path is also what a report uses to reference it, so it should make a good connector.
There is a series of functions that I made to serve this purpose. Find-SSRSEntities, Get-SSRSReportDataSources and Get-SSRSDatasourceDetails are what I will try and showcase here. The last one I just made since I had no reason for those details but it was easy enough to integrate into my module.
Find-SSRSEntities
Return items from a SSRS connection. Supports loads of filtering options.
function Find-SSRSEntities{
[CmdletBinding()]
param(
[Parameter(Position=0,Mandatory=$true)]
[Alias("Proxy")]
[Web.Services.Protocols.SoapHttpClientProtocol]$ReportService,
[Parameter(Position=1)]
[Alias("Path")]
[string]$SearchPath="/",
[Parameter(Position=2)]
[ValidateSet("All", "Folder", "Report", "Resource", "LinkedReport", "DataSource", "Model")]
[Alias("Type")]
[String]$EntityType = "All",
[Parameter(Position=3)]
[String]$Match,
[Parameter(Position=4)]
[Switch]$Partial=$false
)
# Get all of the catalog items that match the criteria passed
# https://msdn.microsoft.com/en-us/library/reportservice2005.reportingservice2005.listchildren.aspx
$recursive = $true
$catalogItems = $ReportService.ListChildren($SearchPath,$recursive)
Write-Verbose "$($catalogItems.Count) item(s) located in the root path $SearchPath"
# Limit the results to the catalog types requested
if($EntityType -ne "All"){$catalogItems = $catalogItems | Where-Object{$_.Type -eq $EntityType}}
Write-Verbose "$($catalogItems.Count) item(s) found matching the type $EntityType"
# Set the match string based on parameters
if(-not $Partial.isPresent -and $Match){$Match = "^$Match$"}
Write-Verbose "Returning all items matching: '$Match'"
# If the regex is an empty string all object will be returned.
return $catalogItems | Where-Object{$_.Name -match $Match}
}
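One thing to be aware of with the filter above: -Match is used as a regular expression, so a name containing characters such as parentheses will not match itself literally. If literal names matter more than regex support, escaping the value first is one option. The sketch below uses made-up item names, and Select-ByName is a hypothetical stand-in for the last lines of Find-SSRSEntities, not part of the module.

```powershell
# Hypothetical stand-in for the name filter at the end of Find-SSRSEntities.
function Select-ByName {
    param(
        [object[]]$Items,
        [string]$Match,
        [switch]$Partial
    )
    # Escape regex metacharacters so the name is compared literally.
    $pattern = [regex]::Escape($Match)
    if (-not $Partial -and $Match) { $pattern = '^' + $pattern + '$' }
    # An empty pattern still matches every item, as in the original function.
    $Items | Where-Object { $_.Name -match $pattern }
}

# Made-up catalog items for illustration.
$items = 'Sales (2016)', 'Sales (2016) Draft', 'Sales 2016' |
    ForEach-Object { [PSCustomObject]@{ Name = $_ } }

(Select-ByName -Items $items -Match 'Sales (2016)').Name           # exact name only
(Select-ByName -Items $items -Match 'Sales (2016)' -Partial).Name  # names containing it
```

The trade-off is that callers can no longer pass a regex pattern in -Match, so this only makes sense if you search by literal names.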
Get-SSRSReportDataSources
When given a valid report path it will return all associated datasources of that report.
function Get-SSRSReportDataSources{
[CmdletBinding()]
param(
[Parameter(Position=0,Mandatory=$true)]
[Alias("Proxy","SSRSService")]
[Web.Services.Protocols.SoapHttpClientProtocol]$ReportService,
[Parameter(Position=1,Mandatory=$true)]
[Alias("Path")]
[string]$ReportPath
)
# Test the report path to be sure it is for a valid report
if(Test-SSRSPath -ReportService $ReportService -EntityPath $ReportPath -EntityType Report){
$ReportService.GetItemDataSources($reportPath) | ForEach-Object{
[pscustomobject][ordered]@{
ReportPath = $reportPath
DataSourceName = $_.name
Reference = $_.item.reference
}
}
} else {
Write-Error "$ReportPath is not a valid report path"
}
}
Get-SSRSDatasourceDetails
When given a valid datasource path it will return all detail of that datasource. Also attaches an extra path property.
function Get-SSRSDatasourceDetails{
[CmdletBinding()]
param(
[Parameter(Position=0,Mandatory=$true)]
[Alias("Proxy")]
[Web.Services.Protocols.SoapHttpClientProtocol]$ReportService,
[Parameter(Position=1,Mandatory=$true,ValueFromPipelineByPropertyName)]
[Alias("Path")]
[string]$EntityPath
)
process{
# Split the path into its folder and entity parts
$SearchPath = Split-SSRSPath $EntityPath -Parent
$EntityName = Split-Path $EntityPath -Leaf
# Verify the path provided is to a valid datasource
if((Find-SSRSEntities -ReportService $ReportService -SearchPath $SearchPath -EntityType DataSource -Match $EntityName -Partial:$false) -as [boolean]){
Add-Member -InputObject ($ReportService.GetDataSourceContents($EntityPath)) -MemberType NoteProperty -Name "Path" -Value $EntityPath -PassThru
} else {
Write-Warning "Could not find a datasource at path: $EntityPath"
}
}
}
So, armed with those, let's match up all reports in a folder to their datasource connection strings. I would note that all of these functions rely on an active connection to work. Something like this:
$ssrsservice = Connect-SSRSService "http://ssrsreports/ReportServer/ReportService2005.asmx" -Credential $credentials
$PSDefaultParameterValues.Add("*SSRS*:ReportService",$ssrsservice)
That will automatically apply -ReportService $ssrsservice to all of the SSRS functions above.
Otherwise you could just add -ReportService $rs to each call in the code below (e.g. Find-SSRSEntities -ReportService $rs) and it would work.
# Lets get all of the Marketing Datasources
$datasources = Find-SSRSEntities -SearchPath "/data sources/marketing" -EntityType DataSource | Get-SSRSDatasourceDetails
# Now gather all of their reports
Find-SSRSEntities -SearchPath "/Marketing" -EntityType Report |
# Get the report datasources
Get-SSRSReportDataSources | ForEach-Object{
# Attach the connection strings to each object
$reportDataSourceDetail = $_
# Filter the datasource for the individual datasource mapping of this report
$matchingDatasource = $datasources | Where-Object{$_.path -eq $reportDataSourceDetail.Reference}
Add-Member -InputObject $_ -MemberType NoteProperty -Name ConnectionString -Value $matchingDatasource.ConnectString -PassThru
}
This would net me results that look like this:
ReportPath : /Marketing/OandD Class Summary By Month
DataSourceName : Marketing
Reference : /Data Sources/Marketing/Marketing
ConnectionString : Data Source=SQL08R2VM; Initial Catalog=Marketing;
ReportPath : /Marketing/OandD Class YTD Summary
DataSourceName : Marketing
Reference : /Data Sources/Marketing/Marketing
ConnectionString : Data Source=SQL08R2VM; Initial Catalog=Marketing;
These, and other functions, suit me just fine. I have not really had anyone else using them, so you might have issues that I have never encountered. They work fine connecting to my SSRS 2008R2 server using PowerShell v5.
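The join itself only needs the datasource path as the key. With many reports, indexing the datasources in a hashtable once saves rescanning $datasources for every report. Here is a sketch using stand-in objects with the same property names as above; the values are placeholders modeled on the sample output, and the "Orphan" report is invented to show what an unmatched reference looks like.

```powershell
# Stand-ins for the output of Get-SSRSDatasourceDetails and Get-SSRSReportDataSources.
$datasources = @(
    [PSCustomObject]@{
        Path          = '/Data Sources/Marketing/Marketing'
        ConnectString = 'Data Source=SQL08R2VM; Initial Catalog=Marketing;'
    }
)
$reportSources = @(
    [PSCustomObject]@{
        ReportPath     = '/Marketing/OandD Class Summary By Month'
        DataSourceName = 'Marketing'
        Reference      = '/Data Sources/Marketing/Marketing'
    }
    [PSCustomObject]@{
        ReportPath     = '/Marketing/Orphan'       # invented, has no matching datasource
        DataSourceName = 'Gone'
        Reference      = '/Data Sources/Missing'
    }
)

# Index the datasources once, keyed by their path.
$byPath = @{}
foreach ($ds in $datasources) { $byPath[$ds.Path] = $ds }

# Look each report's reference up in the index.
$joined = foreach ($r in $reportSources) {
    $match = $byPath[$r.Reference]
    [PSCustomObject]@{
        ReportPath       = $r.ReportPath
        DataSourceName   = $r.DataSourceName
        Reference        = $r.Reference
        ConnectionString = if ($match) { $match.ConnectString } else { $null }
    }
}
$joined | Format-List
```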
Here's a T-SQL statement that will return the data source name, path & connection string with the report name and path.
;WITH
XMLNAMESPACES -- XML namespace def must be the first in with clause.
(
DEFAULT 'http://schemas.microsoft.com/sqlserver/reporting/2006/03/reportdatasource'
,'http://schemas.microsoft.com/SQLServer/reporting/reportdesigner'
AS rd
)
,
shared_datasource
AS
(
SELECT
DsnSharedName = sds.[Name]
, DsnPath = sds.[Path]
, DEF = CONVERT(xml, CONVERT(varbinary(max), content))
FROM
dbo.[Catalog] AS sds
WHERE sds.[Type] = 5) --> 5 = Shared Datasource
,
data_source_name (DsnPath, DsnSharedName, DsnConnString)
AS
(
SELECT
cn.DsnPath
, cn.DsnSharedName
, cn.DsnConnString
FROM
(SELECT
sd.DsnPath
, sd.DsnSharedName
, DsnConnString = dsn.value('ConnectString[1]', 'varchar(150)')
FROM
shared_datasource AS sd
CROSS APPLY sd.DEF.nodes('/DataSourceDefinition') AS R(dsn)
) AS cn
)
SELECT
DataSourceName = lk.[Name]
, dsn.DsnPath
, dsn.DsnConnString
, ReportName = c.[Name]
, ReportFolder = c.[Path]
FROM
dbo.[Catalog] c
INNER JOIN dbo.DataSource ds ON c.ItemID = ds.ItemID
INNER JOIN dbo.[Catalog] lk ON ds.Link = lk.ItemID
INNER JOIN data_source_name dsn ON dsn.DsnSharedName = lk.[Name]
WHERE
c.[Type] = 2 --> 2 = Reports
--AND dsn.DsnConnString LIKE '%Initial Catalog%=%DatabaseNameHere%'
Then you can run the T-SQL script file in PowerShell with this (from the original post):
<# Function to Check whether Server is Ping Status of the Server #>
Function Check-Ping()
{
param
(
[string]$HostName
)
$PingStatus=Get-WmiObject -Query "Select * from Win32_PingStatus where Address='$HostName'"
Return $PingStatus
}
<# Function to Check Instance name Present in the Server #>
Function Get-SQLInstances()
{
param
(
[string]$SQLServerName
)
$Status=Check-Ping($SQLServerName)
if($Status.StatusCode -ne 0)
{
Return "The Server Is Not Reachable"
}
elseif($Status.StatusCode -eq 0)
{
$Reg = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey('LocalMachine', $SQLServerName)
$RegKey = $Reg.OpenSubKey("SOFTWARE\\Microsoft\\Microsoft SQL Server")
$Instances=$regKey.GetValue("installedinstances")
Return $Instances
}
}
<# Function To Run TSQL and Return Results within HTML Table Tag #>
Function Run-TSQL()
{
Param
(
[string]$MachineName,
[string]$TSQLfilePath
)
$Assembly=[reflection.assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo")
$Instances=Get-SQLInstances($MachineName)
$TSQL=Get-Content $TSQLfilePath
foreach($Instance in $Instances)
{
$SQLServiceStatus=Get-Service -ComputerName $MachineName | Where-Object {$_.displayname -like "SQL Server ("+$Instance+")"}
if($SQLServiceStatus.Status -eq "Running")
{
if($Instance -eq "MSSQLSERVER")
{
$SQLServer=$MachineName
}
Else
{
$SQLServer=$MachineName+"\"+$Instance
}
$SQLServerObject = new-Object Microsoft.SqlServer.Management.Smo.Server($SQLServer)
$DatabaseObject = New-Object Microsoft.SqlServer.Management.Smo.Database
$DatabaseObject = $SQLServerObject.Databases.Item("Master")##The TSQL Script Runs in Master Database
$OutPutDataSet = $DatabaseObject.ExecuteWithResults($TSQL)
for($t=0;$t -lt $OutPutDataSet.Tables.Count;$t++)
{
$OutString+="<Table Border=2>"
$OutString+="<Tr>"
foreach($Column in $OutPutDataSet.Tables[$t].Columns)
{
$OutString+="<Th>"
$OutString+=$Column.ColumnName
$OutString+="</Th>"
}
$OutString+="</Tr>"
for($i=0;$i -lt $OutPutDataSet.Tables[$t].Rows.Count;$i++)
{
$OutString+="<Tr>"
for($j=0;$j -lt $OutPutDataSet.Tables[$t].Columns.Count;$j++)
{
$OutString+="<Td>"
$OutString+=$($OutPutDataSet.Tables[$t].Rows[$i][$j])
$OutString+="</Td>"
}
$OutString+="</Tr>"
}
$OutString+="</Table>"
$OutString+="</Br>"
$OutString+="</Br>"
}
}
}
Return $OutString
}
<# Function To Add Table Tag Within HTML Tags
Modify Title and Subject as per your requirement
#>
Function Get-HTMLOut()
{
Param
(
[String]$InputFile,
[String]$OutputFile,
[String]$TSQL
)
$Out+="<Html>"
$Out+="<Title>Run TSQL and Return HTML FIle</Title>" ## Modify 'TiTle' Tag as per your Required
$Out+="<Head><style>body {background-color:lightgray} H3{color:blue}H1{color:green}table, td, th {border: 1px solid green;}th {background-color: green;color: white;}</style></Head>" ## Modify 'Head' Tag as per your Required
$Out+="<Body><H1 Align='Center'>Run TSQL and Return HTML File</H1></Br></Br>" ## Modify 'Body' Tag as per your Required
ForEach($ServerName in Get-Content $InputFile)
{
$Out+="<H3 align='center'>--------------$ServerName--------------</H3>" ## Modify 'header Text' Tag as per your Required
$Out+="</Br>"
$Out+=Run-TSQL -MachineName $ServerName -TSQLfilePath $TSQL
}
$Out+="</Body></Html>"
Set-Content -Value $Out -Path $OutputFile
}
<# Call Get-HTMLOut Function
It Accepts 3 parameter
a. -InputFile (.txt file each server in a List withOut Instance Name)
b. -OutputFile (.Html File to which Output need to be sent)
c. -TSQL (.sql file which Contains the Script to Run)
#>
Get-HTMLOut -InputFile ".\Servers.txt" -OutputFile .\Status.Html -TSQL '.\TSQL Script.sql'
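As an aside, the table markup that Run-TSQL assembles by hand can also be produced with the built-in ConvertTo-Html cmdlet. This is a different technique from the script above, shown only as a sketch: the DataTable below is built locally as a stand-in for $OutPutDataSet.Tables[0], and its column names and values are placeholders.

```powershell
# Stand-in for one result table; no SQL Server is contacted.
$table = [System.Data.DataTable]::new('Result')
[void]$table.Columns.Add('ReportName')
[void]$table.Columns.Add('DsnPath')
[void]$table.Rows.Add('Report A', '/Data Sources/Marketing')
[void]$table.Rows.Add('Report B', '/Data Sources/Sales')

# Restrict output to the real columns; DataRow also exposes bookkeeping
# properties such as RowState that would otherwise become table columns.
$columns  = $table.Columns | ForEach-Object { $_.ColumnName }
$fragment = $table.Rows | ConvertTo-Html -Fragment -Property $columns

# ConvertTo-Html returns an array of lines; join them into one string.
$html = $fragment -join "`n"
$html
```

The resulting string could be appended to $Out in Get-HTMLOut in place of the output of the nested loops.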

Using -eq across two datasets in powershell

I'm in a juncture here. I have two datasets in powershell. Dataset 1($table) is received via an sql query (varies from 12 to 17 rows and has 8 columns) and Dataset 2($team) is hard coded in the script (has 18 rows and 2 columns). Both of these have a common column, Contest. Now the script I have to get working is - for each Contest in $table.contest, get other corresponding parameters from $table and match the Contest in $team.contest and get the corresponding $team.column2 value into play.
I'm able to get the data individually from each table, but when I use "-eq" condition across $table.contest & $team.contest, nothing happens.
This is the snippet from the code where I'm facing the problem.
$Contests = ($DataSet.Contest)
$Team = ($Team.cont)
foreach($Contest in $Contests)
{
$ContestName = $Contest
$stats = $DataSet | where {$_.contest -eq $contest}
$signups = $stats.SignUps
$newbies = $stats.Newbies
$uploads = $stats.Uploads
$views = $stats.Views
$eviews = $stats.EViews
$votes = $stats.Votes
$date = $stats.EndDate
$teamx = $team | where {$_ -eq $stats.contest}
$contest
$teamx
}
$contest shows the contest name, but $teamx is blank
The following is the code changed to use a hash table. I tried to convert the object array to strings, but in vain.
$team = @{
"Short Film" = "Member4";
"Student Photography" = "Member0";
"Student Art" = "Member1";
"Macro Photography" = "Member2";
"Landscape Photography" = "Member3";
}
$Contests = ($DataSet.Contest)
$Contests = $Contests | where {$_ -ne "" -and $_ -ne $null -and $_ -ne [dbnull]::value}
foreach($Contest in $Contests)
{
$ContestName = $Contest
$stats = $DataSet | where {$_.contest -eq $contest}
$signups = $stats.SignUps
$datatemp = $stats.Contest
if ($team.ContainsKey($datatemp)) {write-output "Exists"}
else {write-output "Doesn't Exist"}
$datatemp
$team.count
}
I tried directly feeding $Contest, $ContestName, and $stats.Contest inside ContainsKey, but the output is the same every time -
Doesn't Exist
Short Film
5
Doesn't Exist
Student Photography
5
Doesn't Exist
Student Art
5
Doesn't Exist
Macro Photography
5
Doesn't Exist
Landscape Photography
5
What am I doing wrong?
I can't tell for sure without knowing exactly what's in $Dataset, but your symptoms all point to trailing whitespace in the Contest value that's causing your tests to fail.
Try this and see if you get different results:
foreach($Contest in $Contests)
{
$ContestName = $Contest
$stats = $DataSet | where {$_.contest -eq $contest}
$signups = $stats.SignUps
$datatemp = $stats.Contest.trim()
if ($team.ContainsKey($datatemp)) {write-output "Exists"}
else {write-output "Doesn't Exist"}
$datatemp
$team.count
}
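The effect is easy to reproduce without a database. A small sketch, assuming the value comes back padded the way a fixed-width CHAR column would return it; the padded string is made up for the demonstration:

```powershell
$team = @{
    'Short Film'          = 'Member4'
    'Student Photography' = 'Member0'
}

# Made-up value padded with trailing spaces to a width of 15 characters.
$fromDb = 'Short Film'.PadRight(15)

$team.ContainsKey($fromDb)          # False: the padding is part of the key
$team.ContainsKey($fromDb.Trim())   # True
"[$fromDb]"                         # brackets make the padding visible
$fromDb.Length                      # 15
```

Printing the value between brackets, or checking its Length, is a quick way to confirm whether padding is really what you are dealing with.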

PowerShell unable to get file metadata from Comments field when it's too long

I want to extract some xml data from the Comments metadata field in .WMA files.
I'm using a script from Technet's Scripting Guy column to get all metadata, and it lists every attribute except the Comments field!
Some research by my colleague showed that when we shortened the data in the Comments field to < 1024 bytes, the data from the Comments field lists out fine.
It seems to me that the limitation is in the Shell.Application object; it just returns an empty Comments field when the contents is more than 1024 characters. Also, instead of listing every attribute, I just get the Comments, which is number 24.
The sample file I have contains 1188 bytes, and I expect other files to be around that size, so it's not over by much.
Here is the script I'm currently running (removed comments for brevity):
Function Get-FileMetaData
{
Param([string[]]$folder)
foreach($sFolder in $folder)
{
$a = 0
$objShell = New-Object -ComObject Shell.Application
$objFolder = $objShell.namespace($sFolder)
foreach ($File in $objFolder.items())
{
$FileMetaData = New-Object PSOBJECT
$hash += @{"Filename" = $($objFolder.getDetailsOf($File, 0)) }
$hash += @{"My Comment field" = $($objFolder.getDetailsOf($File, 24)) }
$hash += @{"Length" = $($objFolder.getDetailsOf($File, 24)).Length }
$FileMetaData | Add-Member $hash
$hash.clear()
} #end foreach
$a=0
$FileMetaData
} #end foreach $sfolder
}
Get-FileMetaData -folder "C:\DATA\wma" | fl
Is there another approach I can use that will allow me to extract the full XML data?
You can try the taglib-sharp DLL from http://taglib.org/
Here I copy the content of a 156 KB file into the comment:
[system.reflection.assembly]::loadfile("c:\temp\taglib-sharp.dll")
$data=[taglib.file]::create('c:\mp3\01. Stromae - Alors On Danse.mp3')
$data.Tag.Comment = (gc c:\temp\IMP_ERR.LOG)
$data.Save()
verification :
PS>$data=[taglib.file]::create('c:\mp3\01. Stromae - Alors On Danse.mp3')
PS>$data.tag.Comment.length / 1KB
155,2197265625
edit
I was able to use the same code for a WMA file.
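Reading the field back and parsing it as XML, which is what the question is after, might look like the following. This is an untested sketch: ConvertTo-CommentXml and Get-MediaCommentXml are hypothetical names, the paths in the usage comment are placeholders, and it assumes the comment holds a single well-formed XML document.

```powershell
# Hypothetical helper: turn the comment text into an XML document.
function ConvertTo-CommentXml {
    param([string]$Comment)
    if ([string]::IsNullOrWhiteSpace($Comment)) { return $null }
    # The [xml] cast throws if the text is not well-formed XML.
    [xml]$Comment
}

# Hypothetical helper: read the Comment tag with taglib-sharp and parse it.
# Assumes $TagLibPath points at taglib-sharp.dll.
function Get-MediaCommentXml {
    param(
        [string]$Path,
        [string]$TagLibPath
    )
    Add-Type -Path $TagLibPath
    $file = [TagLib.File]::Create($Path)
    ConvertTo-CommentXml -Comment $file.Tag.Comment
}

# Usage sketch (placeholder paths):
#   $xml = Get-MediaCommentXml -Path 'C:\DATA\wma\sample.wma' -TagLibPath 'c:\temp\taglib-sharp.dll'
#   $xml.DocumentElement.Name
```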