search csv for multiple strings in column a in powershell - powershell

I have a sample.csv file
#Period,Account,Entity,Year,Version,Currency,HSP_Rates,Scenario,Data
Apr,1,9,FY22,F,L,H,And,2
Apr,1,9,FY22,F,L,H,And,2
Apr,1,9,FY22,F,L,H,OR,2
Here I want to get an output CSV file containing only the rows where Scenario equals And:
#Period,Account,Entity,Year,Version,Currency,HSP_Rates,Scenario,Data
Apr,1,9,FY22,F,L,H,And,2
Apr,1,9,FY22,F,L,H,And,2
I have written the following code, which does not give the required output:
$csvRaw = Get-Content -Path 'D:\sample.csv' -Raw
$Csv = $CsvRaw.TrimStart('#') | ConvertFrom-Csv
$NewCsv = $csv | ForEach-Object {
[PsCustomObject]@{
Period = $_.Period
Account = $_.Account
Entity = $_.Entity
Year = $_.Year
Version = $_.Version
Currency = $_.Currency
HSP_Rates = $_.HSP_Rates
Scenario = $_.Scenario | where {$_.Scenario -match "And" }
Data = $_.Data
} }
$OutCsv = ($NewCsv | ConvertTo-Csv -NoTypeInformation).TrimStart('"#')
# Converting back to CSV surround everything with double quotes.
# We need to insert back the hash sign for the #period header
$OutCsv[0] = """#" + $OutCsv[0]
$OutCsv | Out-File 'D:\sample_1.csv'

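The problem is the placement of the filter inside the [PSCustomObject] construction: there, where operates on the Scenario value rather than on the rows, so every row is still output.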
Try replacing:
$NewCsv = $csv | ForEach-Object {
[PsCustomObject]@{
Period = $_.Period
Account = $_.Account
Entity = $_.Entity
Year = $_.Year
Version = $_.Version
Currency = $_.Currency
HSP_Rates = $_.HSP_Rates
Scenario = $_.Scenario | where {$_.Scenario -match "And" }
Data = $_.Data
} }
With:
$NewCsv = $csv | ForEach-Object {
[PsCustomObject]@{
Period = $_.Period
Account = $_.Account
Entity = $_.Entity
Year = $_.Year
Version = $_.Version
Currency = $_.Currency
HSP_Rates = $_.HSP_Rates
Scenario = $_.Scenario
Data = $_.Data
} } | Where { $_.Scenario -match "And" }
Which - as Theo commented - is basically the same as:
$NewCsv = $csv | Where { $_.Scenario -match "And" }
Also, if you're using PowerShell 7+, you can use the -UseQuotes parameter of the ConvertTo-Csv cmdlet to remove the quotes from the output.
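One thing to keep in mind with the filter above: -match is a regular-expression test, so it also accepts values that merely contain "And". The following is a minimal sketch on made-up in-memory rows (the header is shown without the leading #, and the Android row is invented for illustration) comparing -match with -eq and -in:

```powershell
# Hypothetical sample rows, loosely modelled on sample.csv from the question.
$rows = @'
Period,Account,Scenario,Data
Apr,1,And,2
Apr,1,OR,2
Apr,1,Android,2
'@ | ConvertFrom-Csv

# -match is a regex test, so 'Android' is accepted as well as 'And'.
$byMatch = @($rows | Where-Object { $_.Scenario -match 'And' })

# -eq compares the whole value (case-insensitively by default).
$byEq = @($rows | Where-Object { $_.Scenario -eq 'And' })

# -in tests the value against several whole strings at once.
$byIn = @($rows | Where-Object { $_.Scenario -in 'And', 'Or' })
```

If only exact values should pass, -eq (one value) or -in (several values) is probably the safer choice; anchoring the pattern as '^And$' would be another option.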

Related

How can I add string and create new column in my csv file using PowerShell

In my existing CSV file I have a column called "SharePoint ID" that looks like this:
1.ylkbq
2.KlMNO
3.
4.MSTeam
6.
7.MSTEAM
8.LMNO83
I'm wondering how I can create a new column in my CSV called "SharePoint Email" and add "@gmail.com" only to actual IDs such as "ylkbq", "KlMNO" and "LMNO83", rather than to every row including the blank ones. Ideally "MSTEAM" should not be carried over to the new column either, since it's not an ID.
$file = "C:\AuditLogSearch\New folder\OriginalFile.csv"
$file2 = "C:\AuditLogSearch\New folder\newFile23.csv"
$add = "@GMAIL.COM"
$properties = @{
Name = 'Sharepoint Email'
Expression = {
switch -Regex ($_.'SharePoint ID') {
#Not sure what to do here
}
}
}, '*'
Import-Csv -Path $file |
Select-Object $properties |
Export-Csv $file2 -NoTypeInformation
Using calculated properties with Select-Object this is how it could look:
$add = "@GMAIL.COM"
$expression = {
switch($_.'SharePoint ID')
{
{[string]::IsNullOrWhiteSpace($_) -or $_ -match 'MSTeam'}
{
# Null value or matches MSTeam, leave this Null
break
}
Default # We can assume these are IDs, append $add
{
$_.Trim() + $add
}
}
}
Import-Csv $file | Select-Object *, @{
Name = 'SharePoint Email'
Expression = $expression
} | Export-Csv $file2 -NoTypeInformation
Sample Output
Index SharePoint ID SharePoint Email
----- ------------- ----------------
1     ylkbq         ylkbq@GMAIL.COM
2     KlMNO         KlMNO@GMAIL.COM
3
4     MSTeam
5
6     MSTEAM
7     LMNO83        LMNO83@GMAIL.COM
A more concise expression: since I misread the point, it can be reduced to just one if statement:
$expression = {
if(-not [string]::IsNullOrWhiteSpace($_.'SharePoint ID') -and $_.'SharePoint ID' -notmatch 'MSTeam')
{
$_.'SharePoint ID'.Trim() + $add
}
}
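To try the single-if expression without touching any files, here is a small self-contained sketch. The four rows and the $rows and $result names are made up for illustration; the logic is the same as in the expression above.

```powershell
$add = '@GMAIL.COM'

# Hypothetical rows: two real IDs, one blank value, one MSTeam entry.
$rows = @'
Index,SharePoint ID
1,ylkbq
2,
3,MSTeam
4,LMNO83
'@ | ConvertFrom-Csv

$result = $rows | Select-Object *, @{
    Name       = 'SharePoint Email'
    Expression = {
        $id = $_.'SharePoint ID'
        # Emit a value only for non-blank IDs that are not MSTeam;
        # otherwise nothing is output and the new column stays empty.
        if (-not [string]::IsNullOrWhiteSpace($id) -and $id -notmatch 'MSTeam') {
            $id.Trim() + $add
        }
    }
}
```

Piping $result to Export-Csv -NoTypeInformation would then write the extended rows to a file, as in the answer.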

Powershell: Selecting parts of an array with WHERE

I have two arrays $list_CloudUsers and $list_Active. Both have a column called Alias.
I want to filter $list_CloudUsers so that the list does not have any of the ALIASes that are contained in $list_Active.
I was able to do it with:
$list_arc = Import-CSV $Arc_LastAccess
$list_Active = $list_arc | Where { [int]$_.InactivityDays -le 30}
$NewList = @()
ForEach ($User_Cloud in $list_CloudUsers)
{
$Alias = $User_Cloud.Alias
if ($list_Active -match $alias) {continue}
$NewList += $User_Cloud
}
But it does not work with the following. Any ideas how I can get the Where to work correctly?
$list_arc = Import-CSV $Arc_LastAccess
$list_Active = $list_arc | Where { [int]$_.InactivityDays -le 30}
$NewList1 = $list_CloudUsers | Where {$list_Active -NotMatch $_.alias}
Try
$NewList1 = $list_CloudUsers | Where-Object {$list_Active.Alias -notcontains $_.Alias}
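The reason the -NotMatch version lets everything through is that, with an array on the left-hand side, -notmatch acts as a filter: it returns the elements that do not match, and a non-empty result counts as true. A minimal sketch with made-up aliases (anna, bob and carl are purely illustrative):

```powershell
# Hypothetical stand-ins for the two lists in the question.
$list_Active     = @([pscustomobject]@{ Alias = 'anna' }, [pscustomobject]@{ Alias = 'bob' })
$list_CloudUsers = @([pscustomobject]@{ Alias = 'anna' }, [pscustomobject]@{ Alias = 'carl' })

# Array -notmatch returns the non-matching elements; for 'anna' that is
# still the 'bob' object, so the condition is true and 'anna' is kept.
$viaNotMatch = @($list_CloudUsers | Where-Object { $list_Active -notmatch $_.Alias })

# -notcontains tests membership in the list of Alias values and yields a single boolean.
$viaNotContains = @($list_CloudUsers | Where-Object { $list_Active.Alias -notcontains $_.Alias })
```

Here $viaNotMatch still holds both users, while $viaNotContains holds only carl.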

Remove certain duplicate values from csv file

I am trying to import a CSV file and create an XLSX file from the data afterwards. My goal is to show each value of Column1 only once rather than in every row. The CSV file is already sorted, so checking whether the previous/next row has the same value would be possible.
CSV
"Column1";"Column2";"Column3"
"Value1A";"Value1B";"Value1C"
"Value1A";"Value2B";"Value2C"
"Value1A";"Value3B";"Value3C"
"Value2A";"Value4B";"Value4C"
Expected Outcome
"Column1";"Column2";"Column3"
"Value1A";"Value1B";"Value1C"
"";"Value2B";"Value2C"
"";"Value3B";"Value3C"
"Value2A";"Value4B";"Value4C"
Actual Outcome
"Column1";"Column2";"Column3"
"Value1A";"Value1B";"Value1C"
"Value1A";"Value2B";"Value2C"
"Value1A";"Value3B";"Value3C"
"Value2A";"Value4B";"Value4C"
Only column1 duplicate cells should be empty.
My Code to import and add to Excel
$csv = "C:\path\to\file.csv"
$i = 1
Import-Csv $csv | Select-Object -Property Column1,Column2,Column3 | ForEach-Object {
$j = 1
foreach ($prop in $_.PSObject.Properties) {
if ($i -eq 1) {
$serverInfoSheet.Cells.Item($i, $j++).Value = $prop.Name
} else {
$serverInfoSheet.Cells.Item($i, $j++).Value = $prop.Value
}
}
$i++
}
To provide further context, imagine Column1 as a date and Column2 and Column3 as employees.
Example of expected outcome
"12/01/2020";"Mark";"Tony"
"";"Mark";"Andrew"
"";"Tony";"Vanessa"
"12/02/2020";"Tony";"Michael"
I don't want the date repeated on every row because it makes the Excel sheet harder to read.
$Csv = @'
"Column1";"Column2";"Column3"
"Value1A";"Value1B";"Value1C"
"Value1A";"Value2B";"Value2C"
"Value1A";"Value3B";"Value3C"
"Value2A";"Value4B";"Value4C"
'@
$Csv | ConvertFrom-Csv -Delimiter ';' |
Foreach-Object -Begin { $Last1 = $Null } {
if ( $_.Column1 -eq $Last1 ) { $_.Column1 = '' }
else { $Last1 = $_.Column1 }
$_
} | ConvertTo-Csv -Delimiter ';' -NoTypeInformation
Output:
"Column1";"Column2";"Column3"
"Value1A";"Value1B";"Value1C"
"";"Value2B";"Value2C"
"";"Value3B";"Value3C"
"Value2A";"Value4B";"Value4C"

Find out Text data in CSV File Numeric Columns in Powershell

I am very new to PowerShell.
I am trying to validate my CSV file by finding out whether there are any text values in my numeric fields. I can define which columns are numeric.
My source data looks like this:
ColA ColB ColC ColD
23 23 ff 100
2.30E+01 34 2.40E+01 23
df 33 ss df
34 35 36 37
I need output something like this (only text values if found in any column)
ColA      ColC      ColD
2.30E+01  ff        df
df        2.40E+01
          ss
I have tried some code but am not getting any results, only output like the following:
System.Object[]
---------------
xxx fff' ddd 3.54E+03
...
This is what I was trying
#
cls
function Is-Numeric ($Value) {
return $Value -match "^[\d\.]+$"
}
$arrResult = @()
$arraycol = @()
$FileCol = @("ColA","ColB","ColC","ColD")
$dif_file_path = "C:\Users\$env:username\desktop\f2.csv"
#Importing CSVs
$dif_file = Import-Csv -Path $dif_file_path -Delimiter ","
############## Test Datatype (Is-Numeric)##########
foreach($col in $FileCol)
{
foreach ($line in $dif_file) {
$val = $line.$col
$isnum = Is-Numeric($val)
if ($isnum -eq $false) {
$arrResult += $line.$col
$arraycol += $col
}
}
}
[pscustomobject]@{$arraycol = "$arrResult"}| out-file "C:\Users\$env:username\Desktop\Errors1.csv"
####################
Can someone guide me in the right direction?
Thanks
You can try something like this,
function Is-Numeric ($Value) {
return $Value -match "^[\d\.]+$"
}
$dif_file_path = "C:\Users\$env:username\desktop\f2.csv"
#Importing CSVs
$dif_file = Import-Csv -Path $dif_file_path -Delimiter ","
#$columns = $dif_file | Get-member -MemberType 'NoteProperty' | Select-Object -ExpandProperty 'Name'
# Use this to specify certain columns
$columns = "ColB", "ColC", "ColD"
foreach($row in $dif_file) {
foreach ($col in $columns) {
if ($col -in $columns) {
if (!(Is-Numeric $row.$col)) {
$row.$col = ""
}
}
}
}
$dif_file | Export-Csv C:\temp\formatted.txt
Look up the names of the columns as you go.
Look up the value of each column in each row and, if it is not numeric, change it to "".
Export the updated file.
I think not displaying columns that have no data creates the challenge here. You can do the following:
$csv = Import-Csv "C:\Users\$env:username\desktop\f2.csv"
$finalprops = [collections.generic.list[string]]@()
$out = foreach ($line in $csv) {
$props = $line.psobject.properties | Where {$_.Value -notmatch '^[\d\.]+$'} |
Select-Object -Expand Name
$props | Where {$_ -notin $finalprops} | Foreach-Object { $finalprops.add($_) }
if ($props) {
$line | Select $props
}
}
$out | Select-Object ($finalprops | Sort)
Given the nature of Format-Table or tabular output, you only see the properties of the first object in the collection. So if object1 has ColA only, but object2 has ColA and ColB, you only see ColA.
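To illustrate the point about tabular output, here is a small sketch with made-up objects ($objects, $allProps and $normalized are hypothetical names). It collects the union of property names and re-selects every object with that full list, so that each object exposes the same columns:

```powershell
# Two hypothetical objects with different property sets.
$objects = @(
    [pscustomobject]@{ ColA = 'df' }
    [pscustomobject]@{ ColA = 'x'; ColB = 'y' }
)

# Union of all property names, in order of first appearance.
$allProps = $objects |
    ForEach-Object { $_.PSObject.Properties.Name } |
    Select-Object -Unique

# Select-Object adds any missing property with an empty value,
# so every object now carries both ColA and ColB.
$normalized = $objects | Select-Object -Property $allProps
```

This is the same idea as the $finalprops list above: the final Select-Object is what makes the later columns visible.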
The output order you want is quite different than the input CSV; you're tracking bad text data not by first occurrence, but by column order, which requires some extra steps.
test.csv file contents:
ColA,ColB,ColC,ColD
23,23,ff,100
2.30E+01,34,2.40E+01,23
df,33,ss,df
34,35,36,37
Sample code tested to meet your description:
$csvIn = Import-Csv "$PSScriptRoot\test.csv";
# create working data set with headers in same order as input file
$data = [ordered]@{};
$csvIn[0].PSObject.Properties | foreach {
$data.Add($_.Name, (New-Object System.Collections.ArrayList));
};
# add fields with text data
$csvIn | foreach {
$_.PSObject.Properties | foreach {
if ($_.Value -notmatch '^-?[\d\.]+$') {
$null = $data[$_.Name].Add($_.Value);
}
}
}
$removes = @(); # remove `good` columns with numeric data
$rowCount = 0; # column with most bad values
$data.GetEnumerator() | foreach {
$badCount = $_.Value.Count;
if ($badCount -eq 0) { $removes += $_.Key; }
if ($badCount -gt $rowCount) { $rowCount = $badCount; }
}
$removes | foreach { $data.Remove($_); }
0..($rowCount - 1) | foreach {
$h = [ordered]@{};
foreach ($key in $data.Keys) {
$h.Add($key, $data[$key][$_]);
}
[PSCustomObject]$h;
} |
Export-Csv -NoTypeInformation -Path "$PSScriptRoot\text-data.csv";
output file contents:
"ColA","ColC","ColD"
"2.30E+01","ff","df"
"df","2.40E+01",
,"ss",
@Jawad, finally I have tried:
function Is-Numeric ($Value) {
return $Value -match "^[\d\.]+$"
}
$arrResult = @()
$columns = "ColA","ColB","ColC","ColD"
$dif_file_path = "C:\Users\$env:username\desktop\f1.csv"
$dif_file = Import-Csv -Path $dif_file_path -Delimiter "," |select $columns
$columns = $dif_file | Get-member -MemberType 'NoteProperty' | Select-Object -ExpandProperty 'Name'
foreach($row in $dif_file) {
foreach ($col in $columns) {
$val = $row.$col
$isnum = Is-Numeric($val)
if ($isnum -eq $false) {
$arrResult += $col+ " " +$row.$col
}}}
$arrResult | out-file "C:\Users\$env:username\desktop\Errordata.csv"
I get the correct result in my output file, but the order is confusing, like this:
ColA ss
ColB 5.74E+03
ColA ss
ColC rrr
ColB 3.54E+03
ColD ss
ColB 8.31E+03
ColD cc
Any idea how to get a proper format? Thanks.
Note: with your suggested code, I get the complete source file with all data, not only the error data.

How to export two variables into same CSV as joined via PowerShell?

I have a PowerShell script that uses the poshwsus module, shown below:
$FileOutput = "C:\WSUSReport\WSUSReport.csv"
$ProcessLog = "C:\WSUSReport\QueryLog2.txt"
$WSUSServers = "C:\WSUSReport\Computers.txt"
$WSUSPort = "8530"
import-module poshwsus
ForEach ($Server in Get-Content $WSUSServers)
{
& connect-poshwsusserver $Server -port $WSUSPort | out-file $ProcessLog -append
$r1 = & Get-PoshWSUSClient | select @{name="Computer";expression={$_.FullDomainName}},@{name="LastUpdated";expression={if ([datetime]$_.LastReportedStatusTime -gt [datetime]"1/1/0001 12:00:00 AM") {$_.LastReportedStatusTime} else {$_.LastSyncTime}}}
$r2 = & Get-PoshWSUSUpdateSummaryPerClient -UpdateScope (new-poshwsusupdatescope) -ComputerScope (new-poshwsuscomputerscope) | Select Computer,NeededCount,DownloadedCount,NotApplicableCount,NotInstalledCount,InstalledCount,FailedCount
}
What I need to do is export CSV output that includes the results with these columns (like an "inner join"):
Computer, NeededCount, DownloadedCount, NotApplicableCount, NotInstalledCount, InstalledCount, FailedCount, LastUpdated
I have tried to use the line below in foreach, but it didn't work as I expected.
$r1 + $r2 | export-csv -NoTypeInformation -append $FileOutput
I would appreciate any help or advice.
EDIT --> The output I've got:
ComputerName LastUpdate
X A
Y B
X
Y
So no error, first two rows from $r2, last two rows from $r1, it is not joining the tables as I expected.
Thanks!
I've found my guidance in this post: Inner Join in PowerShell (without SQL)
Modified my query accordingly like below, works like a charm.
$FileOutput = "C:\WSUSReport\WSUSReport.csv"
$ProcessLog = "C:\WSUSReport\QueryLog.txt"
$WSUSServers = "C:\WSUSReport\Computers.txt"
$WSUSPort = "8530"
import-module poshwsus
function Join-Records($tab1, $tab2){
$prop1 = $tab1 | select -First 1 | % {$_.PSObject.Properties.Name} #properties from t1
$prop2 = $tab2 | select -First 1 | % {$_.PSObject.Properties.Name} #properties from t2
$join = $prop1 | ? {$prop2 -Contains $_}
$unique1 = $prop1 | ?{ $join -notcontains $_}
$unique2 = $prop2 | ?{ $join -notcontains $_}
if ($join) {
$tab1 | % {
$t1 = $_
$tab2 | % {
$t2 = $_
foreach ($prop in $join) {
if (!$t1.$prop.Equals($t2.$prop)) { return; }
}
$result = @{}
$join | % { $result.Add($_,$t1.$_) }
$unique1 | % { $result.Add($_,$t1.$_) }
$unique2 | % { $result.Add($_,$t2.$_) }
[PSCustomObject]$result
}
}
}
}
ForEach ($Server in Get-Content $WSUSServers)
{
& connect-poshwsusserver $Server -port $WSUSPort | out-file $ProcessLog -append
$r1 = & Get-PoshWSUSClient | select @{name="Computer";expression={$_.FullDomainName}},@{name="LastUpdated";expression={if ([datetime]$_.LastReportedStatusTime -gt [datetime]"1/1/0001 12:00:00 AM") {$_.LastReportedStatusTime} else {$_.LastSyncTime}}}
$r2 = & Get-PoshWSUSUpdateSummaryPerClient -UpdateScope (new-poshwsusupdatescope) -ComputerScope (new-poshwsuscomputerscope) | Select Computer,NeededCount,DownloadedCount,NotApplicableCount,NotInstalledCount,InstalledCount,FailedCount
Join-Records $r1 $r2 | Select Computer,NeededCount,DownloadedCount,NotApplicableCount,NotInstalledCount,InstalledCount,FailedCount, LastUpdated | export-csv -NoTypeInformation -append $FileOutput
}
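Join-Records compares every row of one table with every row of the other. When the join is on a single known column, a hashtable lookup is a common alternative. The following is a rough sketch on made-up data, not the poshwsus output itself: the X/Y rows and the $lastUpdatedByComputer and $joined names are assumptions for illustration, and it assumes Computer values are unique in $r1.

```powershell
# Hypothetical stand-ins for $r1 and $r2 from the script above.
$r1 = @(
    [pscustomobject]@{ Computer = 'X'; LastUpdated = 'A' }
    [pscustomobject]@{ Computer = 'Y'; LastUpdated = 'B' }
)
$r2 = @(
    [pscustomobject]@{ Computer = 'X'; NeededCount = 3 }
    [pscustomobject]@{ Computer = 'Y'; NeededCount = 0 }
)

# Index the first table by the join key.
$lastUpdatedByComputer = @{}
foreach ($row in $r1) { $lastUpdatedByComputer[$row.Computer] = $row.LastUpdated }

# Keep only rows of $r2 that have a partner in $r1 (inner join)
# and append the LastUpdated value as a calculated property.
$joined = foreach ($row in $r2) {
    if ($lastUpdatedByComputer.ContainsKey($row.Computer)) {
        $row | Select-Object *, @{
            Name       = 'LastUpdated'
            Expression = { $lastUpdatedByComputer[$_.Computer] }
        }
    }
}
```

$joined could then be piped to Export-Csv in the same way as the output of Join-Records.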
I think this could be made simpler. Since Select-Object's -Property parameter accepts an array of values, you can create an array of the properties you want to display. The array can be constructed by comparing your two objects' properties and outputting a unique list of those properties.
$selectProperties = $r1.psobject.properties.name | Compare-Object $r2.psobject.properties.name -IncludeEqual -PassThru
$r1,$r2 | Select-Object -Property $selectProperties
Compare-Object by default will output only differences between a reference object and a difference object. Adding the -IncludeEqual switch displays different and equal comparisons. Adding the -PassThru parameter outputs the actual objects that are compared rather than the default PSCustomObject output.
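A small sketch of what that Compare-Object call produces, using plain lists of property names instead of real objects ($names1, $names2 and $union are hypothetical names):

```powershell
# Hypothetical property-name lists for the two result sets.
$names1 = 'Computer', 'LastUpdated'
$names2 = 'Computer', 'NeededCount'

# -IncludeEqual keeps names present on both sides;
# -PassThru returns the names themselves instead of wrapper objects.
$union = @($names1 | Compare-Object $names2 -IncludeEqual -PassThru)
```

The result contains each of the three names once. Note that this only builds the combined list of property names; it does not by itself match rows from one result set against rows of the other.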