I have a scenario where every day I receive two CSV files named like CMCS_{Timestamp}, for example CMCS_02012016100101 and CMCS_02012016100102. The two files have different structures, but both land in the same folder, where my ETL tool picks them up and processes them. So I wrote a script that uses the structure of each file to decide whether it is file A or file B.
The script looks at the first line of the file: if the line starts with 'Name,Emp(Date).' it copies the file to folderA; if the line starts with 'Name,Group.' it copies the file to folderB; otherwise it copies the file to folderC.
Here is the code that I wrote. PowerShell does not report any errors, but it does not produce any results either. I wonder what is wrong with my script.
$fileDirectory = "D:\Data";
$output_path = "D:\Output\FileA";
$output_path2 = "D:\Output\FileB";
$output_path2 = "D:\Output\FileC";
foreach($file in Get-ChildItem $fileDirectory)
{
# the full file path.
$filePath = $fileDirectory + "\" + $file;
$getdata = Get-Content -path $filePath
$searchresults = $getdata | Select -Index 1 | Where-Object { $_ -like 'Name,Emp(Date).*' }
$searchresults2 = $getdata | Select -Index 1 | Where-Object { $_ -like 'Name,Group.*' }
if ($searchresults -ne $null) {
Copy-Item $filePath $output_path
}
if ($searchresults2 -ne $null) {
Copy-Item $filePath $output_path2
}
}
Your issue may be caused by Select -Index 1. PowerShell uses zero-based indexing, so this actually selects the second line of the file. If you change it to 0, it should correctly get the header row.
On a separate note, instead of building the path with $filePath = $fileDirectory + "\" + $file; you can just use $file.FullName.
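A quick way to see the zero-based indexing is to run the same pipeline against an in-memory array instead of a real file. This is only a sketch; the sample lines below are made up for illustration and are not taken from the asker's data.

```powershell
# Hypothetical sample content standing in for the output of Get-Content.
$lines = 'Name,Emp(Date).', 'Alice,01/02/2016'

$first  = $lines | Select-Object -Index 0   # header row
$second = $lines | Select-Object -Index 1   # first data row

# -like treats ( ) and . literally; only *, ? and [ ] are wildcards.
$first  -like 'Name,Emp(Date).*'   # True
$second -like 'Name,Emp(Date).*'   # False
```

With -Index 1 the comparison runs against the data row, so neither branch of the original script ever fires.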
EDIT:
I think this should do what you're after:
[string] $FileDirectory = "D:\Data";
[string] $OutputPath = "D:\Output\FileA";
[string] $OutputPath2 = "D:\Output\FileB";
[string] $OutputPath3 = "D:\Output\FileC";
foreach ($FilePath in Get-ChildItem $FileDirectory | Select-Object -ExpandProperty FullName)
{
[string] $Header = Get-Content $FilePath -First 1
if ($Header -match 'Name,Emp.*') {
Copy-Item $FilePath $OutputPath
}
elseif ($Header -match 'Name,Group.*') {
Copy-Item $FilePath $OutputPath2
}
else {
Copy-Item $FilePath $OutputPath3
}
}
In a directory, there are files with the following filenames:
ExampleFile.mp3
ExampleFile_pn.mp3
ExampleFile2.mp3
ExampleFile2_pn.mp3
ExampleFile3.mp3
I want to iterate through the directory, and IF there is a filename that contains the string '_pn.mp3', I want to test if there is a similarly named file without the '_pn.mp3' in the same directory. If that file exists, I want to remove it.
In the above example, I'd want to remove:
ExampleFile.mp3
ExampleFile2.mp3
and I'd want to keep ExampleFile3.mp3
Here's what I have so far:
$pattern = "_pn.mp3"
$files = Get-ChildItem -Path '$path' | Where-Object {! $_.PSIsContainer}
Foreach ($file in $files) {
If($file.Name -match $pattern){
# filename with _pn.mp3 exists
Write-Host $file.Name
# search in the current directory for the same filename without _pn
<# If(Test-Path $currentdir $filename without _pn.mp3) {
Remove-Item -Force}
#>
}
You could use Group-Object to group all files by their BaseName (with the pattern removed), and then loop over the groups that contain more than one file. The result of grouping the files and filtering by count would look like this:
$files | Group-Object { $_.BaseName.Replace($pattern,'') } |
Where-Object Count -GT 1
Count Name         Group
----- ----         -----
    2 ExampleFile  {ExampleFile.mp3, ExampleFile_pn.mp3}
    2 ExampleFile2 {ExampleFile2.mp3, ExampleFile2_pn.mp3}
Then if we loop over these groups we can search for the files that do not end with the $pattern:
@'
ExampleFile.mp3
ExampleFile_pn.mp3
ExampleFile2.mp3
ExampleFile2_pn.mp3
ExampleFile3.mp3
'@ -split '\r?\n' -as [System.IO.FileInfo[]] | Set-Variable files
$pattern = "_pn"
$files | Group-Object { $_.BaseName.Replace($pattern,'') } |
Where-Object Count -GT 1 | ForEach-Object {
$_.Group.Where({-not $_.BaseName.Endswith($pattern)})
}
This is how your code would look. Remove the -WhatIf switch once you are satisfied the code does what you want.
# BaseName has no extension, so the pattern must not include ".mp3"
$pattern = "_pn"
# $path is the directory variable from the question
$files = Get-ChildItem -Path $path -Filter *.mp3 -File
$files | Group-Object { $_.BaseName.Replace($pattern,'') } |
Where-Object Count -GT 1 | ForEach-Object {
$toRemove = $_.Group.Where({-not $_.BaseName.Endswith($pattern)})
# Pipe the FileInfo objects so Remove-Item receives their full paths
$toRemove | Remove-Item -WhatIf
}
I think you can get by here by adding file names into a hash map as you go. If you encounter a file with the ending you are interested in, check if a similar file name was added. If so, remove both the file and the similar match.
$ending = "_pn.mp3"
$files = Get-ChildItem -Path $path -File | Where-Object { ! $_.PSIsContainer }
$hash = @{}
Foreach ($file in $files) {
# Check if file has an ending we are interested in
If ($file.Name.EndsWith($ending)) {
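# Strip the ending by length; in Windows PowerShell, Split($ending) would split on each character of $ending
$similar = $file.Name.Substring(0, $file.Name.Length - $ending.Length) + ".mp3"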
# Check if we have seen the similar file in the hashmap
If ($hash.Contains($similar)) {
Write-Host $file.Name
Write-Host $similar
Remove-Item -Force $file
Remove-Item -Force $hash[$similar]
# Remove similar from hashmap as it is removed and no longer of interest
$hash.Remove($similar)
}
}
else {
# Add entry for file name and reference to the file
$hash.Add($file.Name, $file)
}
}
Just get a list of the files with the _pn then process against the rest.
$pattern = "*_pn.mp3"
$files = Get-ChildItem -Path "$path" -File -filter "$pattern"
Foreach ($file in $files) {
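$TestFN = $file.Name -replace "_pn", ""
$TestPath = Join-Path -Path $path -ChildPath $TestFN
If (Test-Path -Path $TestPath) {
# Remove the counterpart without _pn, which is the file the question wants deleted
Remove-Item -Path $TestPath -Force
}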
} #End Foreach
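To try the idea safely, the same logic can be run against a throwaway folder first. The following is a minimal sketch, assuming a temporary directory and the file names from the question; the variable names ($dir, $suffix, $plainPath) are illustrative and not part of any answer above.

```powershell
# Build a scratch folder with the file names from the question (nothing real is touched).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("pn_demo_" + [guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $dir -Force | Out-Null
'ExampleFile.mp3', 'ExampleFile_pn.mp3', 'ExampleFile2.mp3', 'ExampleFile2_pn.mp3', 'ExampleFile3.mp3' |
    ForEach-Object { New-Item -ItemType File -Path (Join-Path $dir $_) | Out-Null }

$suffix = '_pn.mp3'
Get-ChildItem -Path $dir -File -Filter "*$suffix" | ForEach-Object {
    # Strip the suffix by length and rebuild the name of the plain counterpart.
    $plainName = $_.Name.Substring(0, $_.Name.Length - $suffix.Length) + '.mp3'
    $plainPath = Join-Path $dir $plainName
    if (Test-Path -LiteralPath $plainPath) {
        Remove-Item -LiteralPath $plainPath
    }
}

# List what is left, then clean up the scratch folder.
$remaining = Get-ChildItem -Path $dir -File | Select-Object -ExpandProperty Name
$remaining
Remove-Item -LiteralPath $dir -Recurse -Force
```

Only the files that have a _pn twin are deleted; ExampleFile3.mp3 has no twin and stays.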
I am trying to construct a script that moves through specific folders and the log files in it, and filters the error codes. After that it passes them into a new file.
I'm not really sure how to do that with for loops, so I'll leave my code below.
If someone could tell me what I'm doing wrong, that would be greatly appreciated.
$file_name = Read-Host -Prompt 'Name of the new file: '
$path = 'C:\Users\user\Power\log_script\logs'
Add-Type -AssemblyName System.IO.Compression.FileSystem
function Unzip
{
param([string]$zipfile, [string]$outpath)
[System.IO.Compression.ZipFile]::ExtractToDirectory($zipfile, $outpath)
}
if ([System.IO.File]::Exists($path)) {
Remove-Item $path
Unzip 'C:\Users\user\Power\log_script\logs.zip' 'C:\Users\user\Power\log_script'
} else {
Unzip 'C:\Users\user\Power\log_script\logs.zip' 'C:\Users\user\Power\log_script'
}
$folder = Get-ChildItem -Path 'C:\Users\user\Power\log_script\logs\LogFiles'
$files = foreach($logfolder in $folder) {
$content = foreach($line in $files) {
if ($line -match '([ ][4-5][0-5][0-9][ ])') {
echo $line
}
}
}
$content | Out-File $file_name -Force -Encoding ascii
Inside the LogFiles folder are three more folders each containing log files.
Thanks
Expanding on a comment above about recursing the folder structure, and then actually retrieving the content of the files, you could try something like this:
# -File skips the subfolders themselves so only log files are read
$allFiles = Get-ChildItem -Path 'C:\Users\user\Power\log_script\logs\LogFiles' -Recurse -File
# iterate the files
$allFiles | ForEach-Object {
# iterate the content of each file, line by line
Get-Content $_.FullName | ForEach-Object {
if ($_ -match '([ ][4-5][0-5][0-9][ ])') {
echo $_
}
}
}
It looks like your inner loop iterates over a collection ($files) that doesn't exist yet. You assign $files to the output of a ForEach(...) loop and then try to nest another loop over $files inside it. At that point $files isn't available to be looped.
Regardless, the issue is you are never reading the content of your log files. Even if you managed to loop through the output of Get-ChildItem, you need to look at each line to perform the match.
Obviously I cannot completely test this, but I see a few issues and have rewritten as below:
$file_name = Read-Host -Prompt 'Name of the new file'
$path = 'C:\Users\user\Power\log_script\logs'
$Pattern = '([ ][4-5][0-5][0-9][ ])'
if ( [System.IO.File]::Exists( $path ) ) { Remove-Item $path }
Expand-Archive 'C:\Users\user\Power\log_script\logs.zip' 'C:\Users\user\Power\log_script'
Select-String -Path 'C:\Users\user\Power\log_script\logs\LogFiles\*' -Pattern $Pattern |
Select-Object -ExpandProperty line |
Out-File $file_name -Force -Encoding ascii
Note: Select-String cannot recurse on its own.
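One way around that limitation is to let Get-ChildItem do the recursion and pipe the files into Select-String. The sketch below builds a small scratch folder so it can run anywhere; the folder name W3SVC1 and the log lines are invented for the demonstration.

```powershell
# Scratch data: one subfolder with one log file (all names and lines are made up).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("log_demo_" + [guid]::NewGuid().ToString('N'))
$sub  = Join-Path $root 'W3SVC1'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
Set-Content -Path (Join-Path $sub 'a.log') -Value 'GET /index.html 200 0', 'GET /missing 404 0', 'GET /boom 500 0'

$Pattern = '([ ][4-5][0-5][0-9][ ])'

# Get-ChildItem walks the subfolders; Select-String searches each file it receives.
$matched = Get-ChildItem -Path $root -Recurse -File -Filter *.log |
    Select-String -Pattern $Pattern |
    Select-Object -ExpandProperty Line
$matched

Remove-Item -LiteralPath $root -Recurse -Force
```

Here only the 404 and 500 lines are emitted; the result could be piped to Out-File exactly as in the rewrite above.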
I'm not sure you need to write your own UnZip function. PowerShell has the Expand-Archive cmdlet which can at least match the functionality thus far:
Expand-Archive -Path <SourceZipPath> -DestinationPath <DestinationFolder>
Note: The -Force parameter allows it to overwrite the destination files if they are already present, which may be a substitute for testing whether the file exists and deleting it if it does.
If you are going to test for the file that section of code can be simplified as:
if ( [System.IO.File]::Exists( $path ) ) { Remove-Item $path }
Unzip 'C:\Users\user\Power\log_script\logs.zip' 'C:\Users\user\Power\log_script'
This is because you were going to run the UnZip command regardless...
Note: You could also use Test-Path for this.
Also, there are innumerable ways to get the matching lines. Here are a couple of extra samples:
Get-ChildItem -Path 'C:\Users\user\Power\log_script\logs\LogFiles' |
ForEach-Object{
( Get-Content $_.FullName ) -match $Pattern
# Using match in this way will echo the lines that matched from each run of
# Get-Content. If nothing matched nothing will output on that iteration.
} |
Out-File $file_name -Force -Encoding ascii
This approach reads the entire file into an array before running the match on it. For large files that may pose a memory issue, but it enables the clever use of -match on a whole array.
OR:
Get-ChildItem -Path 'C:\Users\user\Power\log_script\logs\LogFiles' |
Get-Content |
ForEach-Object{ If( $_ -match $Pattern ) { $_ } } |
Out-File $file_name -Force -Encoding ascii
Note: You don't need the alias echo or its real cmdlet Write-Output
UPDATE: After fussing around a bit and trying different things, I finally got it to work.
I'll include the code below just for demonstration purposes.
Thanks everyone
$start = Get-Date
"`n$start`n"
$file_name = Read-Host -Prompt 'Name of the new file: '
Out-File $file_name -Force -Encoding ascii
Expand-Archive -Path 'C:\Users\User\Power\log_script\logs.zip' -Force
$i = 1
$folders = Get-ChildItem -Path 'C:\Users\User\Power\log_script\logs\logs\LogFiles' -Name -Recurse -Include *.log
foreach($item in $folders) {
$files = 'C:\Users\User\Power\log_script\logs\logs\LogFiles\' + $item
foreach($file in $files){
$content = Get-Content $file
Write-Progress -Activity "Filtering..." -Status "File $i of $($folders.Count)" -PercentComplete (($i / $folders.Count) * 100)
$i++
$output = foreach($line in $content) {
if ($line -match '([ ][4-5][0-5][0-9][ ])') {
Add-Content -Path $file_name -Value $line
}
}
}
}
$end = Get-Date
$time = [int]($end - $start).TotalSeconds
Write-Output ("Runtime: " + $time + " Seconds" -join ' ')
I have a directory that our work order program dumps xml files into. I need to search those files for a specific string and then copy them to another location based on that string. I modified the below code from another post and while I don't get any errors it also doesn't work. I'm very much a scripting newbie so any help would be greatly appreciated.
[string] $FileDirectory = "D:\Temp";
[string] $OutputPath = "D:\Temp\Temp_NY";
[string] $OutputPath2 = "D:\Temp\TEMP_FL";
foreach ($FilePath in Get-ChildItem $FileDirectory | Select-Object -ExpandProperty FullName)
{
[string] $Header = Get-Content $FilePath -First 0
if ($Header -match 'PARTNER |TEST_NY') {
Copy-Item $FilePath $OutputPath
}
elseif ($Header -match 'PARTNER |TEST_FL*') {
Copy-Item $FilePath $OutputPath2
}
}
The header would be -First 1 (first line only). -First 0 returns nothing. Try:
$FileDirectory = "D:\Temp";
$OutputPath = "D:\Temp\Temp_NY";
$OutputPath2 = "D:\Temp\TEMP_FL";
Get-ChildItem $FileDirectory | ? { !$_.PSIsContainer } | ForEach-Object {
$FilePath = $_.FullName
$Header = Get-Content $FilePath -First 1
if ($Header -match 'PARTNER |TEST_NY') {
Copy-Item $FilePath $OutputPath
}
elseif ($Header -match 'PARTNER |TEST_FL*') {
Copy-Item $FilePath $OutputPath2
}
}
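One thing to watch in both versions: -match uses regular expressions, where | means "or". If the header is meant to contain the literal text PARTNER |TEST_NY (an assumption, since the actual file contents are not shown), the unescaped pattern matches any line containing "PARTNER ", so every file would take the first branch. A small sketch of the difference, using a made-up header line:

```powershell
# Hypothetical header line; the real first line of the file may differ.
$Header = 'PARTNER |TEST_FL'

# Unescaped: "PARTNER " OR "TEST_NY", so this matches the FL header too.
$loose = $Header -match 'PARTNER |TEST_NY'

# Escaped: the whole literal text must be present.
$ny = $Header -match [regex]::Escape('PARTNER |TEST_NY')
$fl = $Header -match [regex]::Escape('PARTNER |TEST_FL')

$loose   # True
$ny      # False
$fl      # True
```

[regex]::Escape is a standard .NET method; writing the pattern by hand as 'PARTNER \|TEST_NY' would work as well.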
I am trying to:
Create a CD_TMP file in each WE*.MS directory
Set its content by processing the AHD*.TPL and ADT*.TPL files
Rename AHD*.TPL to AHD*.TPL.Done and ADT*.TPL to ADT*.TPL.Done.
When there is only one WE.20150408.MS directory, the script works fine,
but when there is more than one directory (i.e. WE.20150408.MS, WE.20151416.MS, WE.20140902.MS), it does not work and gives this error message:
Get-Content : An object at the specified path AHD*TPL does not exist, or has been filtered by the -Include or -Exclude parameter.
At C:\Temp\Script\Script.ps1:24 char:14
+ $content = Get-Content -path $AHD
+ CategoryInfo : ObjectNotFound: (System.String[]:String[]) [Get-Content], Exception
+ FullyQualifiedErrorId : ItemNotFound,Microsoft.PowerShell.Commands.GetContentCommand
SCRIPT:
$SOURCE_DIR = "C:\Work"
$Work_DIR = "WE*MS"
$WE_DIR = "$SOURCE_DIR\$Work_DIR"
$AHD = "AHD*TPL"
$ADT = "ADT*TPL"
$AHD_FILES = $SOURCE_DIR
$CD_TMP = "CD_TMP"
$Str1 = "TEMP"
##############
Set-Location $WE_DIR
New-Item -Path "CD_TMP" -type file -force
#############
foreach ( $File in ( get-childitem -name $WE_DIR))
{
$content = Get-Content -path $AHD
$content | foreach {
If ($_.substring(0,4) -NotLike $Str1)
{
'0011' + '|' + 'HD' + '|' + 'AHD' + $_
}
} | Set-Content $CD_TMP
}
Get-ChildItem AHD*.TPL| ForEach {Move-Item $_ ($_.Name -replace ".TPL$",
".TPL.Done")}
##############
foreach ( $File in ( get-childitem -name $WE_DIR))
{
$content = Get-Content -path $ADT
$content | foreach {
If ($_.substring(0,4) -NotLike $Str1)
{
'0022' + '|' + 'DT' + '|' + 'ADT' + $_
}
} | Set-Content $CD_TMP
}
Get-ChildItem ADT*TPL| ForEach {Move-Item $_ ($_.Name -replace ".TPL$",
".TPL.Done")}
PAUSE
Is it first giving the error Set-Location : Cannot set the location because path 'C:\Work\WE*MS' resolved to multiple containers. ? That's what I expect it to say when it fails.
Then, because it can't change into the folder, it can't find any AHD files.
Does it work properly for one folder? It writes the CD_TMP file for AHD files, then overwrites it for ADT files. That doesn't seem right.
You can also make it a bit more direct by:
not putting lots of things in $CAPITAL variables at the start that are then used once, or never
changing the .substring() -notlike test to .StartsWith()
replacing the string building with + + + + by a single string
doing the renaming with Rename-Item and a -NewName scriptblock
I'm thinking this:
$folders = Get-ChildItem "C:\Work\WE*MS" -Directory
foreach ($folder in $folders) {
# AHD files
$content = Get-Content "$folder\AHD*.TPL"
$content = $content | where { -not $_.StartsWith('TEMP') }
$content | foreach {"0011|HD|AHD$_"} | Set-Content "$folder\CD_TMP" -Force
Get-ChildItem "$folder\AHD*.TPL" | Rename-Item -NewName {$_.Name + '.Done'}
# ADT files
$content = Get-Content "$folder\ADT*.TPL"
$content = $content | where { -not $_.StartsWith('TEMP') }
$content | foreach {"0022|DT|ADT$_"} | Add-Content "$folder\CD_TMP"
Get-ChildItem "$folder\ADT*.TPL" | Rename-Item -NewName {$_.Name + '.Done'}
}
Although I don't know what the input or output should be, so I can't test it. NB. it now does Add-Content to append to the CD_TMP file, instead of overwriting it.
There's still a lot of redundancy with $content, but the lines mostly stand alone like this.
I have a list of strings in a CSV file. The format is:
OldValue,NewValue
223134,875621
321321,876330
....
and the file contains a few hundred rows (each OldValue is unique). I need to process changes over a number of text files in a number of folders & subfolders. My best guess of the number of folders, files, and lines of text are - 15 folders, around 150 text files in each folder, with approximately 65,000 lines of text in each folder (between 400-500 lines per text file).
I will make 2 passes at the data, unless I can do it in one. First pass is to generate a text file I will use as a check list to review my changes. Second pass is to actually make the change in the file. Also, I only want to change the text files where the string occurs (not every file).
I'm using the following Powershell script to go through the files & produce a list of the changes needed. The script runs, but is beyond slow. I haven't worked on the replace logic yet, but I assume it will be similar to what I've got.
# replace a string in a file with powershell
[reflection.assembly]::loadwithpartialname("Microsoft.VisualBasic") | Out-Null
Function Search {
# Parameters $Path and $SearchString
param ([Parameter(Mandatory=$true, ValueFromPipeline = $true)][string]$Path,
[Parameter(Mandatory=$true)][string]$SearchString
)
try {
#.NET FindInFiles Method to Look for file
[Microsoft.VisualBasic.FileIO.FileSystem]::GetFiles(
$Path,
[Microsoft.VisualBasic.FileIO.SearchOption]::SearchAllSubDirectories,
$SearchString
)
} catch { $_ }
}
if (Test-Path "C:\Work\ListofAllFilenamesToSearch.txt") { # if file exists
Remove-Item "C:\Work\ListofAllFilenamesToSearch.txt"
}
if (Test-Path "C:\Work\FilesThatNeedToBeChanged.txt") { # if file exists
Remove-Item "C:\Work\FilesThatNeedToBeChanged.txt"
}
$filefolder1 = "C:\TestFolder\WorkFiles"
$ftype = "*.txt"
$filenames1 = Search $filefolder1 $ftype
$filenames1 | Out-File "C:\Work\ListofAllFilenamesToSearch.txt" -Width 2000
if (Test-Path "C:\Work\FilesThatNeedToBeChanged.txt") { # if file exists
Remove-Item "C:\Work\FilesThatNeedToBeChanged.txt"
}
(Get-Content "C:\Work\NumberXrefList.CSV" |where {$_.readcount -gt 1}) | foreach{
$OldFieldValue, $NewFieldValue = $_.Split("|")
$filenamelist = (Get-Content "C:\Work\ListofAllFilenamesToSearch.txt" -ReadCount 5) #|
foreach ($j in $filenamelist) {
#$testvar = (Get-Content $j )
#$testvar = (Get-Content $j -ReadCount 100)
$testvar = (Get-Content $j -Delimiter "\n")
Foreach ($i in $testvar)
{
if ($i -imatch $OldFieldValue) {
$j + "|" + $OldFieldValue + "|" + $NewFieldValue | Out-File "C:\Work\FilesThatNeedToBeChanged.txt" -Width 2000 -Append
}
}
}
}
$FileFolder = (Get-Content "C:\Work\FilesThatNeedToBeChanged.txt" -ReadCount 5)
Get-ChildItem $FileFolder -Recurse |
select -ExpandProperty fullname |
foreach {
if (Select-String -Path $_ -SimpleMatch $OldFieldValue -Debug -Quiet) {
(Get-Content $_) |
ForEach-Object {$_ -replace $OldFieldValue, $NewFieldValue }|
Set-Content $_ -WhatIf
}
}
In the code above, I've tried several things with Get-Content - default, with -ReadCount, and -Delimiter - in an attempt to avoid an out of memory error.
The only thing I have control over is the length of the old & new replacement strings file. Is there a way to do this in Powershell? Is there a better option/solution? I'm running Windows 7, Powershell version 3.0.
Your main problem is that you're reading the file over and over again to change each of the terms. You need to invert the looping of the replace terms and looping of the files. Also, pre-load the csv. Something like:
$filefolder1 = "C:\TestFolder\WorkFiles"
$ftype = "*.txt"
$filenames = gci -Path $filefolder1 -Filter $ftype -Recurse
$replaceValues = Import-Csv -Path "C:\Work\NumberXrefList.CSV"
foreach ($file in $filenames) {
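# Use FullName so the path resolves regardless of the current directory
$contents = Get-Content -Path $file.FullName
foreach ($replaceValue in $replaceValues) {
$contents = $contents -replace $replaceValue.OldValue, $replaceValue.NewValue
}
# Keep a backup of the original next to it before overwriting
Copy-Item $file.FullName "$($file.FullName).old"
Set-Content -Path $file.FullName -Value $contents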
}
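The question also asks to touch only the files where a string actually occurs and to produce a checklist of changes. A minimal sketch of that variation follows, run against scratch data so it is safe to try. It assumes PowerShell 5 or later (for Set-Content -NoNewline), and the file names and the $changed variable are invented for the example.

```powershell
# Scratch data (all paths and values here are made up for illustration).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("xref_demo_" + [guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $root -Force | Out-Null
Set-Content -Path (Join-Path $root 'hit.txt')  -Value 'id 223134 here', 'nothing'
Set-Content -Path (Join-Path $root 'miss.txt') -Value 'no ids at all'
$replaceValues = "OldValue,NewValue`n223134,875621`n321321,876330" -split "`n" | ConvertFrom-Csv

$changed = foreach ($file in Get-ChildItem -Path $root -Filter *.txt -Recurse -File) {
    # Read each file once, as a single string.
    $original = Get-Content -LiteralPath $file.FullName -Raw
    if ($null -eq $original) { continue }   # empty file, nothing to replace
    $updated = $original
    foreach ($pair in $replaceValues) {
        # String.Replace is literal, so no regex escaping is needed.
        $updated = $updated.Replace($pair.OldValue, $pair.NewValue)
    }
    # Write back only when something changed, and emit the name as a checklist entry.
    if ($updated -cne $original) {
        Set-Content -LiteralPath $file.FullName -Value $updated -NoNewline
        $file.Name
    }
}
$changed

$firstLine = Get-Content -LiteralPath (Join-Path $root 'hit.txt') -TotalCount 1
Remove-Item -LiteralPath $root -Recurse -Force
```

For a dry run, the Set-Content line can be commented out so that only the checklist in $changed is produced.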