I have a short script that recursively searches for a string and writes out some results. However, I have hundreds of strings to search for, so I would like to read each value from a CSV file, use it as my search string, and then move on to the next row.
Here is what I have:
function searchNum {
#I would like to go from manual number input to auto assign from CSV
$num = Read-Host 'Please input the number'
Get-ChildItem "C:\Users\user\Desktop\SearchFolder\input" -Recurse | Select-String -Pattern "$num" -Context 2 | Out-File "C:\Users\user\Desktop\SearchFolder\output\output.txt" -Width 300 -Append -NoClobber
}
searchNum
How can I run through a CSV to assign the $num value for each line?
Do you have a CSV with several columns, one of which you want to use as search values? Or do you have a "regular" text file with one search pattern per line?
In case of the former, you could read the file with Import-Csv:
$filename = 'C:\path\to\your.csv'
$searchRoot = 'C:\Users\user\Desktop\SearchFolder\input'
foreach ($pattern in (Import-Csv $filename | % {$_.colname})) {
Get-ChildItem $searchRoot -Recurse | Select-String $pattern -Context 2 | ...
}
In case of the latter a simple Get-Content should suffice:
$filename = 'C:\path\to\your.txt'
$searchRoot = 'C:\Users\user\Desktop\SearchFolder\input'
foreach ($pattern in (Get-Content $filename)) {
Get-ChildItem $searchRoot -Recurse | Select-String $pattern -Context 2 | ...
}
I assume you need something like this
$csvFile = Get-Content -Path "myCSVfile.csv"
foreach($line in $csvFile)
{
$lineArray = $line.Split(",")
if ($lineArray -and $lineArray.Count -gt 1)
{
#Do a search num with the value from the csv file
searchNum -num $lineArray[1]
}
}
This will read a CSV file and call your function for each line. The parameter passed will be the value from the CSV file (the second item on each line).
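Calling `searchNum -num ...` only works if the function declares a `$num` parameter, which the original version does not. A minimal sketch of a parameterized version follows; the `$searchRoot` and `$outFile` parameters and their defaults are assumptions taken from the paths in the question, not part of the original function.

```powershell
# Hypothetical parameterized version of searchNum (parameter names are assumptions).
function searchNum {
    param(
        [Parameter(Mandatory = $true)] [string] $num,
        [string] $searchRoot = 'C:\Users\user\Desktop\SearchFolder\input',
        [string] $outFile    = 'C:\Users\user\Desktop\SearchFolder\output\output.txt'
    )
    # Search files only (skip directories) and append each match with 2 lines of context.
    Get-ChildItem $searchRoot -Recurse |
        Where-Object { -not $_.PSIsContainer } |
        Select-String -Pattern $num -Context 2 |
        Out-File $outFile -Width 300 -Append
}
```

With that in place, a loop such as `Import-Csv $filename | ForEach-Object { searchNum -num $_.colname }` would run one search per CSV row, where `colname` stands for whatever the search column is actually called.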
I have two CSV files containing data. I need to check whether a value from CSV 1 exists in CSV 2 and, if so, replace that value in file 2 with the data from file 1; if not, just skip to the next row.
File1.csv
NO;Description
L001;DREAM
L002;CAR
L003;PHONE
L004;HOUSE
L005;PLANE
File2.csv
ID;Name;Status*;Scheduled Start Date;Actual Start Date;Actual End Date;Scheduled End Date;SLA
144862;DREAM;Scheduled;1524031200;;;1524033000;
149137;CAR;Implementation In Progress;1528588800;;;1548968400;
150564;PHONE;Scheduled;1569456000;;;1569542400;
150564;HOUSE;Scheduled;1569456000;;;1569542400;
150564;PLANE;;;;;;
I tried something like that but it is not working for me:
$file1 = Import-Csv "C:\Users\file1.csv" |Select-Object -ExpandProperty Description
$file2 = Import-Csv "C:\Users\file1.csv" |Select-Object -ExpandProperty NO
Import-Csv "C:\Users\file3.csv" |Where-Object {$file1 -like $_.Name} |ForEach-Object {
$_.Name = $file2($_.NO)
} |Out-File "C:\Users\File4.csv"
File4.csv should look like this:
ID;Name;Status*;Scheduled Start Date;Actual Start Date;Actual End Date;Scheduled End Date;SLA
144862;L001;Scheduled;1524031200;;;1524033000;
149137;L002;Implementation In Progress;1528588800;;;1548968400;
150564;L003;Scheduled;1569456000;;;1569542400;
150564;L004;Scheduled;1569456000;;;1569542400;
150564;L005;;;;;;
Maybe there is another way to achieve my goal? Thank you.
Here's one approach you can take:
Import both CSV files with Import-Csv
Create a lookup hash table from the first CSV file, where the Description values you want to replace are the keys and the NO values are the values.
Go through the second CSV file, and replace any values from the Name column from the hash table, if the key exists. We can use System.Collections.Hashtable.ContainsKey to check if the key exists. This is a constant time O(1) operation, so lookups are fast.
Then we can export the final CSV with Export-Csv. I used -UseQuotes Never to put no " quotes in your output file. This feature is only available in PowerShell 7. For lower PowerShell versions, you can have a look at How to remove all quotations mark in the csv file using powershell script? for other alternatives to removing quotes from a CSV file.
Demo:
$csvFile1 = Import-Csv -Path .\File1.csv -Delimiter ";"
$csvFile2 = Import-Csv -Path .\File2.csv -Delimiter ";"
$ht = @{}
foreach ($item in $csvFile1) {
if (-not [string]::IsNullOrEmpty($item.Description)) {
$ht[$item.Description] = $item.NO
}
}
& {
foreach ($line in $csvFile2) {
if ($ht.ContainsKey($line.Name)) {
$line.Name = $ht[$line.Name]
}
$line
}
} | Export-Csv -Path File4.csv -Delimiter ";" -NoTypeInformation -UseQuotes Never
Or instead of wrapping the foreach loop inside a script block using the Call Operator &, we can use Foreach-Object. You can have a look at about_script_blocks for more information about script blocks.
$csvFile2 | ForEach-Object {
if ($ht.ContainsKey($_.Name)) {
$_.Name = $ht[$_.Name]
}
$_
} | Export-Csv -Path File4.csv -Delimiter ";" -NoTypeInformation -UseQuotes Never
File4.csv
ID;Name;Status*;Scheduled Start Date;Actual Start Date;Actual End Date;Scheduled End Date;SLA
144862;L001;Scheduled;1524031200;;;1524033000;
149137;L002;Implementation In Progress;1528588800;;;1548968400;
150564;L003;Scheduled;1569456000;;;1569542400;
150564;L004;Scheduled;1569456000;;;1569542400;
150564;L005;;;;;;
Update
For handling multiple values with the same Name, we can transform the above to use a hash table of System.Management.Automation.PSCustomObject, where we have two properties Count to keep track of the current item we're seeing and NO which is an array of numbers:
$csvFile1 = Import-Csv -Path .\File1.csv -Delimiter ";"
$csvFile2 = Import-Csv -Path .\File2.csv -Delimiter ";"
$ht = @{}
foreach ($row in $csvFile1) {
# Skip rows without a Description; they cannot be used as keys
if ([string]::IsNullOrEmpty($row.Description)) {
continue
}
if (-not $ht.ContainsKey($row.Description)) {
$ht[$row.Description] = [PSCustomObject]@{
Count = 0
NO = @()
}
}
$ht[$row.Description].NO += $row.NO
}
& {
foreach ($line in $csvFile2) {
if ($ht.ContainsKey($line.Name)) {
$name = $line.Name
$pos = $ht[$name].Count
$line.Name = $ht[$name].NO[$pos]
$ht[$name].Count += 1
}
$line
}
} | Export-Csv -Path File4.csv -Delimiter ";" -NoTypeInformation -UseQuotes Never
If your files aren't too big, you could do this with a simple ForEach-Object loop:
$csv1 = Import-Csv -Path 'D:\Test\File1.csv' -Delimiter ';'
$result = Import-Csv -Path 'D:\Test\File2.csv' -Delimiter ';' |
ForEach-Object {
$name = $_.Name
$item = $csv1 | Where-Object { $_.Description -eq $name } | Select-Object -First 1
# update the Name property and output the item
if ($item) {
$_.Name = $item.NO
# if you output the row here, the result will NOT contain rows that did not match
# $_
}
# if on the other hand, you would like to retain the items that didn't match unaltered,
# then output the current row here
$_
}
# output on screen
$result | Format-Table -AutoSize
#output to new CSV file
$result | Export-Csv -Path 'D:\Test\File4.csv' -Delimiter ';' -NoTypeInformation
Result on screen:
ID Name Status* Scheduled Start Date Actual Start Date Actual End Date Scheduled End Date SLA
-- ---- ------- -------------------- ----------------- --------------- ------------------ ---
144862 L001 Scheduled 1524031200 1524033000
149137 L002 Implementation In Progress 1528588800 1548968400
150564 L003 Scheduled 1569456000 1569542400
150564 L004 Scheduled 1569456000 1569542400
150564 L005
I have a CSV file which is structured like this:
"SA1";"21020180123155514000000000000000002"
"SA2";"21020180123155514000000000000000002";"210"
"SA4";"21020180123155514000000000000000002";"210";"200000001"
"SA5";"21020180123155514000000000000000002";"210";"200000001";"140000001";"ZZ"
"SA1";"21020180123155522000000000000000002"
"SA2";"21020180123155522000000000000000002";"210"
"SA4";"21020180123155522000000000000000002";"210";"200000001"
"SA5";"21020180123155522000000000000000002";"210";"200000001";"140000671";"ZZ"
"SA1";"21020180123155567000000000000000002"
"SA2";"21020180123155567000000000000000002";"210"
"SA4";"21020180123155567000000000000000002";"210";"200000001"
"SA5";"21020180123155567000000000000000002";"210";"200000001";"140000001";"ZZ"
So the Value in the second field (separator ';') marks the data which belongs together and value 140000001 or 140000671 is the trigger.
So the result should be:
1st file: 140000001.txt
"SA1";"21020180123155514000000000000000002"
"SA2";"21020180123155514000000000000000002";"210"
"SA4";"21020180123155514000000000000000002";"210";"200000001"
"SA5";"21020180123155514000000000000000002";"210";"200000001";"140000001";"ZZ"
"SA1";"21020180123155567000000000000000002"
"SA2";"21020180123155567000000000000000002";"210"
"SA4";"21020180123155567000000000000000002";"210";"200000001"
"SA5";"21020180123155567000000000000000002";"210";"200000001";"140000001";"ZZ"
2nd file: 140000671.txt
"SA1";"21020180123155522000000000000000002"
"SA2";"21020180123155522000000000000000002";"210"
"SA4";"21020180123155522000000000000000002";"210";"200000001"
"SA5";"21020180123155522000000000000000002";"210";"200000001";"140000671";"ZZ"
For now I found a snippet which splits the big file by the second field:
$src = "C:\temp\ORD001.txt"
$dstDir = "C:\temp\files\"
Remove-Item -Path "$dstDir\\*"
$header = Get-Content -Path $src | select -First 1
Get-Content -Path $src | select -Skip 1 | foreach {
$file = "$(($_ -split ";")[1]).txt"
Write-Verbose "Writing to $file"
$file = $file.Replace('"',"")
if (-not (Test-Path -Path $dstDir\$file))
{
Out-File -FilePath $dstDir\$file -InputObject $header -Encoding ascii
}
$file -replace '"', ""
Out-File -FilePath $dstDir\$file -InputObject $_ -Encoding ascii -Append
}
For the rest I'm standing in the dark.
Please help.
The Import-CSV cmdlet will work here, if you don't already know about it. I would use that, as it returns all the rows as different objects in an array, with the properties being the column values. And you don't have to manually remove the quotes and such. Assuming the second column is a date time value, and should be unique for each group of 4 consecutive rows, then this will work:
$src = "C:\temp\ORD001.txt"
$dstDir = "C:\temp\files\"
Remove-Item -Path "$dstDir\*"
$csv = Import-CSV $src -Delimiter ';'
$DateTimeGroups = $csv | Group-Object -Property 'ColumnTwoHeader'
foreach ($group in $DateTimeGroups) {
$filename = $group.Group.'ColumnFiveHeader' | select -Unique
$group.Group | Export-CSV "$dstDir\$filename.txt" -Append -NoTypeInformation
}
However, this will break if two of those "groups of 4 consecutive rows" have the same value for the second column and the fifth column. There isn't a way to fix this unless you are certain that there will always be 4 consecutive rows in each time group. In which case:
$src = "C:\temp\ORD001.txt"
$dstDir = "C:\temp\files\"
Remove-Item -Path "$dstDir\*"
$csv = Import-CSV $src -Delimiter ';'
if ($csv.count % 4 -ne 0) {
Write-Error "CSV does not have a proper number of rows. Attempting to continue will be bad :)"
return
}
for ($i = 0 ; $i -lt $csv.Count ; $i=$i+4) {
$group = $csv[$i..($i+3)]
$group | Export-Csv "$dstDir\$($group[3].'ColumnFiveHeader').txt" -Append -NoTypeInformation
}
Just be sure to replace ColumnTwoHeader and ColumnFiveHeader with the appropriate values.
If performance is not a concern, combining Import-Csv / Export-Csv with Group-Object allows the most concise, direct expression of your intent, using PowerShell's ability to convert CSV to objects and back:
$src = "C:\temp\ORD001.txt" # Input CSV file
$dstDir = "C:\temp\files" # Output directory
# Delete previous output files, if necessary.
Remove-Item -Path "$dstDir\*" -WhatIf
# Import the source CSV into custom objects with properties named for the columns.
# Note: The assumption is that your CSV header line defines columns "Col1", "Col2", ...
Import-Csv $src -Delimiter ';' |
# Group the resulting objects by column 2
Group-Object -Property Col2 |
ForEach-Object { # Process each resulting group.
# Determine the output filename via the group's last row's column 5 value.
$outFile = '{0}\{1}.txt' -f $dstDir, $_.Group[-1].Col5
# Append the group at hand to the target file.
$_.Group | Export-Csv -Append -Encoding Ascii $outFile -Delimiter ';' -NoTypeInformation
}
Note:
The assumption - in line with your sample data - is that it is always the last row in a group of lines sharing the same column-2 value whose column 5 contains the root of the output filename (e.g., 140000001)
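If the input has no header line, one alternative is to group the raw lines instead of converting them to objects, which also leaves the original quoting untouched. This is only a sketch under assumptions: the sample lines are shortened stand-ins for the real file, and `$dstDir` is a placeholder output directory.

```powershell
# Hypothetical sample lines standing in for Get-Content on the headerless input file.
$lines = @(
    '"SA1";"A1"'
    '"SA5";"A1";"210";"200000001";"140000001";"ZZ"'
    '"SA1";"B2"'
    '"SA5";"B2";"210";"200000001";"140000671";"ZZ"'
    '"SA1";"C3"'
    '"SA5";"C3";"210";"200000001";"140000001";"ZZ"'
)

# Collect the raw lines per trigger value (field 5 of each group's last line).
$byTrigger = @{}
$lines | Group-Object { ($_ -split ';')[1] } | ForEach-Object {
    $lastLine = $_.Group | Select-Object -Last 1
    $trigger  = ($lastLine -split ';')[4].Trim('"')
    if (-not $byTrigger.ContainsKey($trigger)) { $byTrigger[$trigger] = @() }
    $byTrigger[$trigger] += @($_.Group)
}

# Write one file per trigger; $dstDir is a placeholder output directory.
$dstDir = [System.IO.Path]::GetTempPath()
foreach ($trigger in $byTrigger.Keys) {
    $byTrigger[$trigger] | Set-Content -Path (Join-Path $dstDir "$trigger.txt") -Encoding Ascii
}
```

For the real file, `$lines` would come from `Get-Content $src`. As in the answer above, this assumes the last line of each group carries the trigger in its fifth field.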
Sorry, but I don't have a header row. It's a semicolon-separated text file for an interface.
You can simply read the file with Get-Content, and then search for the trigger in the line.
I hope this small example can help:
$file = Get-Content CSV_File.txt
$140000001 = @()
$140000671 = @()
$bTrig = @()
foreach($line in $file){
$bTrig += $line
if($line -match ';"140000001";'){
$140000001 += $bTrig
$bTrig = @()
}
elseif($line -match ';"140000671";'){
$140000671 += $bTrig
$bTrig = @()
}
}
if($bTrig.Count -ne 0){Write-Warning "No trigger for $bTrig"}
$140000001 | Out-File 140000001.txt -Encoding ascii
$140000671 | Out-File 140000671.txt -Encoding ascii
I am using the following script that iterates through hundreds of text files looking for specific instances of the regex expression within. I need to add a second data point to the array, which tells me the object the pattern matched in.
In the below script the [Regex]::Matches($str, $Pattern) | % { $_.Value } piece returns multiple rows per file, which cannot be easily output to a file.
What I would like to know is, how would I output a 2 column CSV file, one column with the file name (which should be $_.FullName), and one column with the regex results? The code of where I am at now is below.
$FolderPath = "C:\Test"
$Pattern = "(?i)(?<=\b^test\b)\s+(\w+)\S+"
$Lines = @()
Get-ChildItem -Recurse $FolderPath -File | ForEach-Object {
$_.FullName
$str = Get-Content $_.FullName
$Lines += [Regex]::Matches($str, $Pattern) |
% { $_.Value } |
Sort-Object |
Get-Unique
}
$Lines = $Lines.Trim().ToUpper() -replace '[\r\n]+', ' ' -replace ";", '' |
Sort-Object |
Get-Unique # Cleaning up data in array
I can think of two ways but the simplest way is to use a hashtable (dict). Another way is create psobjects to fill your Lines variable. I am going to go with the simple way so you can only use one variable, the hashtable.
$FolderPath = "C:\Test"
$Pattern = "(?i)(?<=\b^test\b)\s+(\w+)\S+"
$Results = @{}
Get-ChildItem -Recurse $FolderPath -File |
ForEach-Object {
$str = Get-Content $_.FullName
$Line = [regex]::matches($str,$Pattern) | % { $_.Value } | Sort-Object | Get-Unique
$Line = $Line.Trim().ToUpper() -Replace '[\r\n]+', ' ' -Replace ";",'' | Sort-Object | Get-Unique # Cleaning up data in array
$Results[$_.FullName] = $Line
}
$Results.GetEnumerator() | Select @{L="Folder";E={$_.Key}}, @{L="Matches";E={$_.Value}} | Export-Csv -NoType -Path <Path to save CSV>
Your results will be in $Results. $Results.Keys contains the file paths and $Results.Values has the results from the expression. You can reference the results for a particular file by its key, $Results["File path"]. Note that indexing with a key that does not exist returns $null rather than any results.
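The other approach mentioned above, building objects instead of a hashtable, yields one CSV row per match, which is closer to the two-column output asked for. The following is a rough sketch; the function name `Get-FileMatch` and its parameters are hypothetical, and it reads each file with `-Raw`, so a pattern anchored with `^` only matches at the start of the file unless `(?m)` is added.

```powershell
# Hypothetical helper: emits one object per unique regex match, paired with its file path.
function Get-FileMatch {
    param(
        [Parameter(Mandatory = $true)] [string] $FolderPath,
        [Parameter(Mandatory = $true)] [string] $Pattern
    )
    Get-ChildItem -Path $FolderPath -Recurse -File | ForEach-Object {
        $file = $_.FullName
        # -Raw reads the whole file as a single string.
        $text = Get-Content -Path $file -Raw
        if ([string]::IsNullOrEmpty($text)) { return }   # skip empty files
        [regex]::Matches($text, $Pattern) |
            ForEach-Object { $_.Value.Trim().ToUpper() } |
            Sort-Object -Unique |
            ForEach-Object { [pscustomobject]@{ File = $file; Match = $_ } }
    }
}
```

Piping the result to `Export-Csv -NoTypeInformation` then produces a file with a File column and a Match column.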
I have a problem that I am trying to solve, however, due to my non existing PowerShell knowledge it is proving to be harder than I hoped. So any help would be appreciated.
The problem can be simplified as:
Find a string in a txtfile
Extract the information on the row after that string
Store the information in a handle
Find a second string in the txtfile and repeat the procedure
Store both strings in a new file or delete everything else in the txt file.
I am then trying to do this for approx 20k files. I would love to have the information under their keyword and comma delimited so that I can import them in other systems.
My files look somewhat like the following
random words
that are unimportant
Keyword
FirstlineofNumbersthatIwanttoExtract
random words again that are unimportant
Secondkeyword
SecondLineOfNumbersThatIWantToExtract
end of the file
However, the lines I want to extract are not on the same row in every file. I would like the output to be something like
Keyword, SecondKeyword
FirstLineOfNumbersThatIWantToExtract, SecondLineOfNumbersThatIWantToExtract
And done. I got this far
[System.IO.DirectoryInfo]$folder = 'C:\users\xx\Desktop\mappcent3'
foreach ($file in ($folder.EnumerateFiles())) {
if ($file.Extension -eq '.txt') {
$content = Get-Content $file
$FirstRegex = 'KeyWordOne
(.+)$'
$First_output = "\1"
$test = Select-String -Path $file.FullName -Pattern $FirstRegex
}
}
This would do something similar to what you are asking. This requires PowerShell 3.0+
$path = 'C:\users\xx\Desktop\mappcent3'
$firstKeyword = "Keyword"
$secondKeyword = "Secondkeyword"
$resultsPath = "C:\Temp\results.csv"
Get-ChildItem $path -Filter "*.txt" | ForEach-Object{
# Read the file in
$fileContents = Get-Content $_.FullName
# Find the first keyword data
$firstKeywordData = ($fileContents | Select-String -Pattern $firstKeyword -Context 0,1 -SimpleMatch).Context.PostContext[0]
# Find the second keyword data
$secondKeywordData = ($fileContents | Select-String -Pattern $secondKeyword -Context 0,1 -SimpleMatch).Context.PostContext[0]
# Create a new object with details gathered.
[pscustomobject][ordered]@{
File = $_.FullName
FirstKeywordData = $firstKeywordData
SecondKeywordData = $secondKeywordData
}
} | Export-CSV $resultsPath -NoTypeInformation
Select-String is what does most of the magic here. We take advantage of -Context which consumes lines before and after the match. We want the one following so that is why we use 0,1. Wrap that up in a custom object and then we can export it to a CSV file.
Keyword Overlap
Beware that your keywords can overlap and create odd results in your output files. In your sample Keyword matches multiple lines so the result set would reflect that.
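One way to sidestep the overlap, sketched here under the assumption that each keyword sits alone on its line, is to drop -SimpleMatch and anchor the pattern with ^ and $. The sample lines are made up to mirror the layout in the question.

```powershell
# Made-up sample mirroring the file layout from the question.
$sample = 'random words', 'Keyword', '12345', 'more words', 'Secondkeyword', '67890'

# Anchored regex: matches the line "Keyword" but not "Secondkeyword".
$first  = $sample | Select-String -Pattern '^Keyword$' -Context 0,1
$second = $sample | Select-String -Pattern '^Secondkeyword$' -Context 0,1

# The line after each match is the first (and only) post-context line.
$firstData  = $first.Context.PostContext[0]
$secondData = $second.Context.PostContext[0]
"$firstData,$secondData"
```

If the keywords can contain regex metacharacters, wrapping them in `[regex]::Escape()` before adding the anchors keeps them literal.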
If you did just want to write back to the original file you could easily do that as well
"$firstKeywordData,$secondKeywordData" | Set-Content $_.FullName
Or something similar.
The Select-String cmdlet has a -Context parameter that makes it easy to extract lines before or after the line on which there's a match.
You can use Export-Csv to export to the format you require (although with 20K files you may want to write directly to the output files)
foreach($file in Get-ChildItem C:\users\xx\Desktop\mappcent3 |Where {-not $_.PsIsContainer})
{
$FirstKeyword = 'FirstKeyword'
$FirstLine = Select-String -Path $file.FullName -Pattern $FirstKeyword -Context 0,1 |Select -Expand Context -First 1 |Select -Expand PostContext
$SecondKeyword = 'SecondKeyword'
$SecondLine = Select-String -Path $file.FullName -Pattern $SecondKeyword -Context 0,1 |Select -Expand Context -First 1 |Select -Expand PostContext
New-Object psobject -Property @{$FirstKeyword=$FirstLine;$SecondKeyword=$SecondLine} |Export-Csv (Join-Path $file.DirectoryName ($file.BaseName + '_keywords.txt'))
}
I have a list of strings in a CSV file. The format is:
OldValue,NewValue
223134,875621
321321,876330
....
and the file contains a few hundred rows (each OldValue is unique). I need to process changes over a number of text files in a number of folders & subfolders. My best guess of the number of folders, files, and lines of text are - 15 folders, around 150 text files in each folder, with approximately 65,000 lines of text in each folder (between 400-500 lines per text file).
I will make 2 passes at the data, unless I can do it in one. First pass is to generate a text file I will use as a check list to review my changes. Second pass is to actually make the change in the file. Also, I only want to change the text files where the string occurs (not every file).
I'm using the following Powershell script to go through the files & produce a list of the changes needed. The script runs, but is beyond slow. I haven't worked on the replace logic yet, but I assume it will be similar to what I've got.
# replace a string in a file with powershell
[reflection.assembly]::loadwithpartialname("Microsoft.VisualBasic") | Out-Null
Function Search {
# Parameters $Path and $SearchString
param ([Parameter(Mandatory=$true, ValueFromPipeline = $true)][string]$Path,
[Parameter(Mandatory=$true)][string]$SearchString
)
try {
#.NET FindInFiles Method to Look for file
[Microsoft.VisualBasic.FileIO.FileSystem]::GetFiles(
$Path,
[Microsoft.VisualBasic.FileIO.SearchOption]::SearchAllSubDirectories,
$SearchString
)
} catch { $_ }
}
if (Test-Path "C:\Work\ListofAllFilenamesToSearch.txt") { # if file exists
Remove-Item "C:\Work\ListofAllFilenamesToSearch.txt"
}
if (Test-Path "C:\Work\FilesThatNeedToBeChanged.txt") { # if file exists
Remove-Item "C:\Work\FilesThatNeedToBeChanged.txt"
}
$filefolder1 = "C:\TestFolder\WorkFiles"
$ftype = "*.txt"
$filenames1 = Search $filefolder1 $ftype
$filenames1 | Out-File "C:\Work\ListofAllFilenamesToSearch.txt" -Width 2000
if (Test-Path "C:\Work\FilesThatNeedToBeChanged.txt") { # if file exists
Remove-Item "C:\Work\FilesThatNeedToBeChanged.txt"
}
(Get-Content "C:\Work\NumberXrefList.CSV" |where {$_.readcount -gt 1}) | foreach{
$OldFieldValue, $NewFieldValue = $_.Split("|")
$filenamelist = (Get-Content "C:\Work\ListofAllFilenamesToSearch.txt" -ReadCount 5) #|
foreach ($j in $filenamelist) {
#$testvar = (Get-Content $j )
#$testvar = (Get-Content $j -ReadCount 100)
$testvar = (Get-Content $j -Delimiter "\n")
Foreach ($i in $testvar)
{
if ($i -imatch $OldFieldValue) {
$j + "|" + $OldFieldValue + "|" + $NewFieldValue | Out-File "C:\Work\FilesThatNeedToBeChanged.txt" -Width 2000 -Append
}
}
}
}
$FileFolder = (Get-Content "C:\Work\FilesThatNeedToBeChanged.txt" -ReadCount 5)
Get-ChildItem $FileFolder -Recurse |
select -ExpandProperty fullname |
foreach {
if (Select-String -Path $_ -SimpleMatch $OldFieldValue -Debug -Quiet) {
(Get-Content $_) |
ForEach-Object {$_ -replace $OldFieldValue, $NewFieldValue }|
Set-Content $_ -WhatIf
}
}
In the code above, I've tried several things with Get-Content - default, with -ReadCount, and -Delimiter - in an attempt to avoid an out of memory error.
The only thing I have control over is the length of the old & new replacement strings file. Is there a way to do this in Powershell? Is there a better option/solution? I'm running Windows 7, Powershell version 3.0.
Your main problem is that you're reading the file over and over again to change each of the terms. You need to invert the looping of the replace terms and looping of the files. Also, pre-load the csv. Something like:
$filefolder1 = "C:\TestFolder\WorkFiles"
$ftype = "*.txt"
$filenames = gci -Path $filefolder1 -Filter $ftype -Recurse
$replaceValues = Import-Csv -Path "C:\Work\NumberXrefList.CSV"
foreach ($file in $filenames) {
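# Use FullName so the path resolves regardless of the current directory
$contents = Get-Content -Path $file.FullName
foreach ($replaceValue in $replaceValues) {
$contents = $contents -replace $replaceValue.OldValue, $replaceValue.NewValue
}
# Keep a backup copy before overwriting the file
Copy-Item $file.FullName "$($file.FullName).old"
Set-Content -Path $file.FullName -Value $contents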
}
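Keep in mind that -replace treats its first operand as a regular expression. The sample values in the question are plain digits, so that makes no difference there, but if an OldValue could ever contain characters such as `.` it may be safer to escape it first. A small illustration with made-up values:

```powershell
# Made-up values: the dot in $old would match any character if left unescaped.
$old  = '22.3134'
$new  = '875621'
$text = 'id 22.3134 and 22x3134'

# Unescaped: both "22.3134" and "22x3134" are replaced.
$unescaped = $text -replace $old, $new

# Escaped: only the literal "22.3134" is replaced.
$escaped = $text -replace [regex]::Escape($old), $new
```

In the loop above, that would mean passing `[regex]::Escape($replaceValue.OldValue)` as the pattern.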