Merge CSV Files in Powershell traverse subfolders - archiving & deleting old files use folder name for Target-CSV

I want to merge many CSV-files into one (a few hundred files) removing the header row of the added CSVs.
As the files sit in several subfolders I need to start from the root traversing all the subfolders and process all CSVs in there. Before merging I want to archive them with zip deleting old CSVs. The new merged CSV-file and the zip-archive should be named like their parent folder.
In case the script is started again for the same folder, none of the already processed files should be damaged or removed accidentally.
I am not a PowerShell guy, so I have been copying and pasting from several resources on the web and came up with the following solution (sorry, I don't remember the resources; feel free to put references in the comments if you know them).
This patch-work code does the job but it doesn't feel very bulletproof. For now it is processing the CSV files in the subfolders only. Processing the files within the given $targDir as well would also be nice.
I am wondering if it could be more compact. Suggestions for improvement are appreciated.
$targDir = "\\Servername\folder\"; #path
Get-ChildItem "$targDir" -Recurse -Directory |
ForEach-Object { #walkinthrough all subfolder-paths
#
Set-Location -Path $_.FullName
#remove existing AllInOne.csv (target name for the merged file) in case it has been left over from a previous execution.
$FileName = ".\AllInOne.csv"
if (Test-Path $FileName) {
Remove-Item $FileName
}
#remove existing AllInOne.zip (target name for the archived files) in case it has been left over from a previous execution.
$FileName = ".\AllInOne.zip"
if (Test-Path $FileName) {
Remove-Item $FileName
}
#compressing all csv files in the current path, temporarily named AllInOne.zip. Doing that for each file adding it to the archive (with -Update)
# I wonder if there is a more efficient way to do that.
dir $_.FullName | where { $_.Extension -eq ".csv"} | foreach { Compress-Archive $_.FullName -DestinationPath "AllInOne.zip" -Update}
##########################################################
# This code is basically merging all the CSV files
# skipping the header of added files
##########################################################
$getFirstLine = $true
get-childItem ".\*.csv" | foreach {
$filePath = $_
$lines = Get-Content $filePath
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content ".\AllInOne.csv" $linesToWrite
# Output file is named AllInOne.csv temporarily - this is not a requirement
# It was simply easier for me to come up with this temp file in the first place (symptomatic for copy&paste).
}
#########################################################
#deleting old csv files
dir $_.FullName | where { $_.Extension -eq ".csv" -and $_.Name -ne "AllInOne.csv"} | foreach { Remove-Item $_.FullName}
# Temporarily rename AllinOne files with parent folder name
Get-ChildItem -Path $_.FullName -Filter *.csv | Rename-Item -NewName {$_.Basename.Replace("AllInOne",$_.Directory.Name) + $_.extension}
Get-ChildItem -Path $_.FullName -Filter *.zip | Rename-Item -NewName {$_.Basename.Replace("AllInOne",$_.Directory.Name) + $_.extension}
}
I have been executing it in the Powershell ISE. The Script is for a house keeping only, executed casually and not on a regular base - so performance doesn't matter so much.
I prefer to stick with a script that doesn't depend on additional libraries if possible (e.g. for Zip).

It may not be bulletproof, but I have seen worse cobbled together scripts. It'll definitely do the job you want it to, but here are some small changes that will make it a bit shorter and harder to break.
Since all your files are CSVs and all would have the same headers, you can use Import-CSV to compile all of the files into an array. You won't have to worry about stripping the headers or accidentally removing a row.
Get-ChildItem "*.csv" | Foreach-Object {
$csvArray += Import-CSV $_
}
Then you can just use Export-CSV -Path $_.FullName -NoTypeInformation to output it all in to a new CSV file.
To have it check the root folder and all the subfolders, I would throw all of the lines in the main ForEach loop into a function and then call it once for the root folder and keep the existing loop for all the subfolders.
function CompileCompressCSV {
param (
[string] $Path
)
# Code from inside the ForEach Loop
}
# Main Script ($targDir is the root path defined in the question's script)
CompileCompressCSV -Path $targDir
Get-ChildItem -Path $targDir -Recurse -Directory | ForEach-Object {
CompileCompressCSV -Path $_.FullName
}
This is more of a stylistic choice, but I would do the steps of this script in a slightly different order:
Get Parent Folder Name
Remove old compiled CSVs and ZIPs
Compile CSVs into an array and output with Parent Folder Name
ZIP together CSVs into a file with the Parent Folder Name
Remove all CSV files
Personally, I'd rather name the created files properly the first time instead of having to go back and rename them unless there is absolutely no way around it. That doesn't seem the case for your situation so you should be able to create them with the right name on the first go.
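The steps above can be sketched roughly as follows. This is a minimal sketch, not a tested replacement for the original script: the function name Merge-AndArchiveCsv is made up for illustration, and skipping a folder that already holds output from an earlier run (instead of removing that output) is an assumption made to meet the "don't damage already processed files" requirement. It also passes all CSVs to Compress-Archive in one call rather than adding them one by one with -Update.

```powershell
# Hypothetical helper; the name and the skip-on-rerun behaviour are assumptions.
function Merge-AndArchiveCsv {
    param (
        [Parameter(Mandatory)]
        [string] $Path
    )

    # 1. The parent folder name is used for both output files.
    $folderName = Split-Path -Path $Path -Leaf
    $mergedCsv  = Join-Path $Path "$folderName.csv"
    $archive    = Join-Path $Path "$folderName.zip"

    # 2. Leave folders alone that already contain output from an earlier run.
    if ((Test-Path -LiteralPath $mergedCsv) -or (Test-Path -LiteralPath $archive)) {
        Write-Warning "Skipping '$Path': output from an earlier run already exists."
        return
    }

    # Collect the source files first so the merged file is never treated as input.
    $sources = @(Get-ChildItem -LiteralPath $Path -Filter '*.csv' -File)
    if ($sources.Count -eq 0) { return }

    # 3. Import every CSV and export once; Import-Csv takes care of the header rows.
    $sources |
        ForEach-Object { Import-Csv -LiteralPath $_.FullName } |
        Export-Csv -LiteralPath $mergedCsv -NoTypeInformation

    # 4. Zip all source CSVs in a single call.
    Compress-Archive -LiteralPath $sources.FullName -DestinationPath $archive

    # 5. Remove the source CSVs only after the merge and the archive were written.
    Remove-Item -LiteralPath $sources.FullName
}
```

It would be called once for the root folder and once per subfolder, in the same way as CompileCompressCSV above. Note that Export-Csv quotes every value, so the merged file will not be byte-identical to a plain text concatenation.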

Related

Powershell: Find Folders and Run Command in Those Folders

so trying to find a way to combine a couple of things the Stack Overflow crowd has helped me do in the past. So I know how to find folders with a specific name and move them where I want them to go:
$source_regex = [regex]::escape($sourceDir)
(gci $sourceDir -recurse | where {-not ($_.psiscontainer)} | select -expand fullname) -match "\\$search\\" |
foreach {
$file_dest = ($_ | split-path -parent) -replace $source_regex,$targetDir
if (-not (test-path $file_dest)){mkdir $file_dest}
move-item $_ -Destination $file_dest -force -verbose
}
And I also know how to find and delete files of a specific file extension within a preset directory:
Get-ChildItem $source -Include $searchfile -Recurse -Force | foreach{ "Removing file $($_.FullName)"; Remove-Item -force -recurse $_}
What I'm trying to do now is combine the two. Basically, I'm looking for a way to tell Powershell:
"Look for all folders named 'Draft Materials.' When you find a folder with that name, get its full path ($source), then run a command to delete files of a given file extension ($searchfile) from that folder."
What I'm trying to do is create a script I can use to clean up an archive drive when and if space starts to get tight. The idea is that as I develop things, a lot of times I go through a ton of incremental non-final drafts (hence folder name "Draft Materials"), and I want to get rid of the exported products (the PDFs, the BMPs, the AVIs, the MOVs, etc.) and just leave the master files that created them (the INDDs, the PRPROJs, the AEPs, etc.) so I can reconstruct them down the line if I ever need to. I can tell the script what drive and folder to search (and I'd assign that to a variable since the backup location may change and I'd like to just change it once), but I need help with the rest.
I'm stuck because I'm not quite sure how to combine the two pieces of code that I have to get Powershell to do this.
If what you want is to
"Look for all folders named 'Draft Materials.' When you find a folder with that name, get its full path ($source), then run a command to delete files of a given file extension ($searchfile) from that folder."
then you could do something like:
$rootPath = 'X:\Path\To\Start\Searching\From' # the starting point for the search
$searchFolder = 'Draft Materials' # the folder name to search for
$deleteThese = '*.PDF', '*.BMP', '*.AVI', '*.MOV' # an array of file patterns to delete
# get a list of all folders called 'Draft Materials'
Get-ChildItem -Path $rootPath -Directory -Filter $searchFolder -Recurse | ForEach-Object {
# inside each of these folders, get the files you want to delete and remove them
Get-ChildItem -Path $_.FullName -File -Recurse -Include $deleteThese |
Remove-Item -WhatIf
}
Or use Get-ChildItem only once, having it search for files. Then test if their fullnames contain the folder called 'Draft Materials'
$rootPath = 'X:\Path\To\Start\Searching\From'
$searchFolder = 'Draft Materials'
$deleteThese = '*.PDF', '*.BMP', '*.AVI', '*.MOV'
# get a list of all files with extensions from the $deleteThese array
Get-ChildItem -Path $rootPath -File -Recurse -Include $deleteThese |
# if in their full path names the folder 'Draft Materials' is present, delete them
Where-Object { $_.FullName -match "\\$searchFolder\\" } |
Remove-Item -WhatIf
In both cases I have added safety switch -WhatIf so when you run this, nothing gets deleted and in the console is written what would happen.
If that info shows the correct files are being removed, take off (or comment out) -Whatif and run the code again.

How to rename large number of files using Powershell and a CSV

Ultimately, I need a solid PowerShell script that will take a folder with several hundred video files, import the existing file names into the program, look up the new file name in a CSV, and rename the file. The old filenames are simple (i.e. File1.mp4, File2.mp4, etc.). I would like to prepend a date to the file name in the format YYYY-MM-DD.
For testing, I created a folder on my desktop with (10) text files, each with a unique file name.
My CSV file appears as follows:
Image of CSV
The "newfilename" column, was created by using the Concatenate command in Excel.
=CONCATENATE(TEXT(A2, "yyyy-mm-dd")," ", B2)
As much as I would just like PowerShell to handle everything, I feel using Excel for most of this might be the best way.
In my testing, everything was in one folder. However, at work, I will have video files on one drive, and the script will have to be in a folder on my desktop. Because I am in a corporate network, I need a special batch file to run my scripts, which is nothing new. I just modify the script name, and away it goes!
So what commands do I need to do in order to have the script separate from the video files AND the CSV file?
Here is the code that I have so far. Everything works when it's in one folder.
PS C:\Users\ceran\Desktop\Rename Project> Import-Csv -Path .\MyFileList.csv | ForEach-Object {
>> $Src = Join-Path -Path $TargetDir -ChildPath $_.filename
>> $Dst = Join-Path -Path $TargetDir -ChildPath $_.newfilename
>> Rename-Item -Path $Src -NewName $Dst
>> }
Thanks in advance for the help!
Chris
I'm not sure what the date column is in your Excel file and if you want to rename all files in the folder, but if that is the case, you don't need a csv file at all and can do this:
$sourceFolder = 'X:\Path\to\the\video\files' # change this to the real path
Get-ChildItem -Path $sourceFolder -Filter '*.mp4' -File | # iterate through the files in the folder
Where-Object {$_.Name -notmatch '^\d{4}-\d{2}-\d{2}'} | # don't rename files that already start with the date
Rename-Item -NewName { '{0:yyyy-MM-dd} {1}' -f $_.LastWriteTime, $_.Name } -WhatIf
This uses parameter -Filter '*.mp4', to get only files with an .mp4 extension. For the files in your testfolder (Desktop\Rename Project), change this to -Filter '*.txt'.
If you want all files renamed, no matter what the extension, simply remove the Filter from the cmdlet.
Because of the -WhatIf switch, no file is actually renamed and the code just shows in the console what would happen. Once satisfied that this is OK, remove the -WhatIf
Hope that helps.
$targetdir="C:\path\to\where\our\file\directory\is"
$pathtocsv="c:\path\to\csv.csv"
Import-Csv -Path $pathtocsv | ForEach-Object {
$Src = Join-Path -Path $TargetDir -ChildPath $_.filename
$Dst = Join-Path -Path $TargetDir -ChildPath $_.newfilename
Rename-Item -Path $Src -NewName $Dst
}
Why would this not work in any situation?
By the way, if the csv had the columns path and newname, it could be piped directly to rename-item:
path,newname
file.txt,file2.txt
import-csv ren.csv | Rename-Item -whatif
What if: Performing the operation "Rename File" on target "Item: /Users/js/foo/file.txt Destination: /Users/js/foo/file2.txt".

Copying files to specific folder declared in a CSV file using Powershell Script

I am quite new to PowerShell and I am trying to make a script that copies files to certain folders that are declared in a CSV file. So far I am getting errors from everywhere and can't find anything to resolve this issue.
I have these folders and .txt files created in the same folder as the script.
So far I could only do this:
$files = Import-Csv .\files.csv
$files
foreach ($file in $files) {
$name = $file.name
$final = $file.destination
Copy-Item $name -Destination $final
}
This is my CSV
name;destination
file1.txt;folderX
file2.txt;folderY
file3.txt;folderZ
As the comments indicate, if you are not using default system delimiters, you should make sure to specify them.
I also recommend typically to use quotes for your csv to ensure no problems with accidentally including an entry that includes the delimiter in the name.
@"
"taco1.txt";"C:\temp\taco2;.txt"
"@ | ConvertFrom-CSV -Delimiter ';' -Header @('file','destination')
will output
file      destination
----      -----------
taco1.txt C:\temp\taco2;.txt
The quotes make sure the values are correctly interpreted. And yes... you can name a file foobar;test..txt. Never underestimate what users might do. 😁
If you take the command Get-ChildItem | Select-Object BaseName,Directory | ConvertTo-CSV -NoTypeInformation and review the output, you should see it quoted like this.
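Applied to the loop from the question, specifying the delimiter might look like the sketch below. The function name Copy-FilesFromList and its two parameters are hypothetical and only there to make the example self-contained; the sketch assumes the list is semicolon-delimited with the header name;destination, and that the destination folders already exist next to the files, as described in the question.

```powershell
# Hypothetical wrapper around the question's loop; names are illustrative only.
function Copy-FilesFromList {
    param (
        [Parameter(Mandatory)] [string] $CsvPath,  # semicolon-delimited list with columns name;destination
        [Parameter(Mandatory)] [string] $BaseDir   # folder that holds both the files and the target folders
    )

    # -Delimiter ';' is the important part: without it Import-Csv uses a comma
    # and the 'name' and 'destination' properties are never created.
    Import-Csv -LiteralPath $CsvPath -Delimiter ';' | ForEach-Object {
        $source = Join-Path $BaseDir $_.name
        $target = Join-Path $BaseDir $_.destination

        # Assumes $target is an existing folder, so the file is copied into it.
        Copy-Item -LiteralPath $source -Destination $target
    }
}
```

For the layout in the question this could be called as Copy-FilesFromList -CsvPath .\files.csv -BaseDir . from the folder that contains the script.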
Sourcing Your File List
One last tip. Most of the time I've come across a CSV used as a file input list, the CSV wasn't needed. Consider gathering the files in your script itself.
For example, if you have a folder and need to filter the list down, you can do this on the fly very easily in PowerShell by using Get-ChildItem.
For example:
$Directory = 'C:\temp'
$Destination = $ENV:TEMP
Get-ChildItem -Path $Directory -Filter *.txt -Recurse | Copy-Item -Destination $Destination
If you need to have more granular matching control, consider using the Where-Object cmdlet and doing something like this:
Get-ChildItem -Path $Directory -Filter *.txt -Recurse | Where-Object Name -match '(taco)|(burrito)' | Copy-Item -Destination $Destination
Often you'll find that you can easily use this type of filtering to keep CSV and input files out of the solution.
example
Using techniques like this, you might be able to get files from 2 directories, filter the match, and copy all in a short statement like this:
Get-ChildItem -Path 'C:\temp' -Filter '*.xlsx' -Recurse | Where-Object Name -match 'taco' | Copy-Item -Destination $ENV:TEMP -Verbose
Hope that gives you some other ideas! Welcome to Stack Overflow. 👋

PowerShell copy and rename multiple .csv files from 10+ subfolders

I'm searching for a way to copy multiple .csv files, all named exactly the same and located in different folders (all of them in the same directory), and merge them into one .csv file. I would like to skip the first line, which is the header, in every file except the first. There is no rule for how many lines each .csv file contains, so the script should recognize the written lines to know how many and which ones to merge, and avoid blank lines.
This is what I tried so far:
$src = "C:\Users\E\Desktop\Merge\Input\Files*.csv"
$dst = "C:\Users\E\Desktop\Merge\Output"
Get-ChildItem -Path $src -Recurse -File | Copy-Item -Destination $dst
and this one:
Get-ChildItem -Path $src -Recurse -File | Copy-Item -Destination $dst |
ForEach-Object {
$NewName = $_.Name
$Destination = Join-Path -Path $_.Directory.FullName -ChildPath $NewName
Move-Item -Path $_.FullName -Destination $Destination -Force
}
any help please? :)
Since you are looking to merge the files you may as well read them all into PowerShell, and then output the whole thing at once. You could do something like:
$Data = Get-ChildItem -Path $src -Recurse -File | Import-Csv
$Data | Export-Csv $dst\Output.csv -NoTypeInformation
That may not be feasible if your CSV files are extremely large, but it is a simple way to merge CSV files if the header row is the same in all files.
Another method would be to just treat it as text, which is much less memory intensive. For that you would want to get a list of files, copy the first one intact, and then copy the rest of them skipping the header row.
$Files = Get-ChildItem $src -Recurse
$TargetFile = Join-Path $dst $Files[0].Name
$Files[0] | Copy-Item -Dest $TargetFile
#Skip the first file, and loop through the rest
$Files | Select -Skip 1 | ForEach-Object{
#Get the contents of the file, and skip the header row, then append the rest to the target
Get-Content $_ | Select -Skip 1 | Add-Content $TargetFile
}
Edit: Ok, I wanted to replicate the process so that I could figure out what was giving you errors. To do that I created 3 folders, and copied a .csv file with 4 entries into each folder, with all of the files named 'Files 06202018.csv'. I ran my code above, and it did what it should, but there was some file corruption where the second file would be appended directly to the end of the first file without a new line being created for it, so I changed things from just copying the first file, to reading it and creating a new file in the destination. The below code worked flawlessly for me:
$src = "C:\Temp\Test\Files*.csv"
$dst = "C:\Temp\Test\Output"
$Files = Get-ChildItem $src -Recurse
$TargetFile = Join-Path $dst $Files[0].Name
GC $Files[0] | Set-Content $TargetFile
#Skip the first file, and loop through the rest
$Files | Select -Skip 1 | ForEach-Object{
#Get the contents of the file, and skip the header row, then append the rest to the target
Get-Content $_ | Select -Skip 1 | Add-Content $TargetFile
}
That took the files:
C:\Temp\Test\Lapis\Files 06202018.csv
C:\Temp\Test\Malachite\Files 06202018.csv
C:\Temp\Test\Opal\Files 06202018.csv
And it combined those three files into a correctly merged file at:
C:\Temp\Test\Output\Files 06202018.csv
The only time that I had any issues is if I forgot to delete the target file before running this. Depending on how large these files are, and how much memory you have available, you could probably speed this up by changing the last two lines to this:
Get-Content $_ | Select -Skip 1
} | Add-Content $TargetFile
That would read all of the files in (other than the first one) and only write to the destination once, instead of having to get file lock, open the file for writing, write, and close the destination for each file.
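Put together, the single-write variant could look like the sketch below. The function name Merge-CsvAsText and its parameters are made up for illustration; it assumes every file has exactly one header row and that the target file does not exist yet or may be overwritten.

```powershell
# Hypothetical helper that merges CSV files as plain text; names are illustrative only.
function Merge-CsvAsText {
    param (
        [Parameter(Mandatory)] [string[]] $Files,      # full paths of the CSV files to merge
        [Parameter(Mandatory)] [string]   $TargetFile  # path of the merged output file
    )

    # Write the first file including its header; Set-Content replaces any existing target.
    Get-Content -LiteralPath $Files[0] | Set-Content -LiteralPath $TargetFile

    # Read the remaining files without their header rows and append everything
    # through one Add-Content call, so the target is opened only once for appending.
    $Files | Select-Object -Skip 1 | ForEach-Object {
        Get-Content -LiteralPath $_ | Select-Object -Skip 1
    } | Add-Content -LiteralPath $TargetFile
}
```

With the variables from the answer it could be called as Merge-CsvAsText -Files $Files.FullName -TargetFile $TargetFile. Because Set-Content overwrites the target, a leftover target file from an earlier run no longer needs to be deleted by hand.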

PowerShell to copy files to destination's subfolders while excluding certain folders in the destination

So I have danced with this off and on throughout the day and the timeless phrase "There's more than one way to skin a cat" keeps coming to mind so I decided to take to the community.
Scenario:
Source folder "C:\Updates" has 100 files of various extensions. All need to be copied to the sub-folders only of "C:\Prod\" overwriting any duplicates that it may find.
The Caveats:
The sub-folder names (destinations) in "C:\Prod" are quite dynamic and change frequently.
A naming convention is used to determine which sub-folders in the destination need to be excluded when the source files are being copied (to retain the original versions). For ease of explanation lets say any folder names starting with "!stop" should be excluded from the copy process. (!stop* if wildcards considered)
So, here I am wanting the input of those greater than I to tackle this in PS if I'm lucky. I've tinkered with Copy-Item and xcopy today so I'm excited to hear other's input.
Thanks!
-Chris
Give this a shot:
Get-ChildItem -Path C:\Prod -Exclude !stop* -Directory `
| ForEach-Object { Copy-Item -Path C:\Updates\* -Destination $_ -Force }
This grabs each folder (the -Directory switch ensures we only grab folders) in C:\Prod that does not match the filter and pipes it to the ForEach-Object command where we are running the Copy-Item command to copy the files to the directory.
The -Directory switch is not available in every version of PowerShell; it was introduced in PowerShell 3.0. If you have an older version of PowerShell that does not support -Directory then you can use this script:
Get-ChildItem -Path C:\Prod -Exclude !stop* `
| Where-Object { $_.PSIsContainer } `
| ForEach-Object { Copy-Item -Path C:\Updates\* -Destination $_ -Force }
To select only sub folders which do not begin with "!stop" do this
$Source = "C:\Updates\*"
$Dest = "C:\Prod"
$Stop = "^!stop"
$Destinations = GCI -Path $Dest |?{$_.PSIsContainer -and $_.Name -notmatch $Stop }
ForEach ($Destination in $Destinations) {
Copy-Item -Path $Source -Destination $Destination.FullName -Force
}
Edited: Now copies all files from C:\Updates to the subfolders of $Dest not beginning with "!stop". The -whatif switch shows what would happen; to arm the script, remove the -whatif.
Edit2: Streamlined the script. If sub-subfolders of C:\Prod should also receive copies, add a -rec option to the gci just in front of the pipe.