PowerShell copy and rename multiple .csv files from 10+ subfolders - powershell

I'm looking for a way to merge multiple .csv files into one .csv file. The files all have exactly the same name but sit in different subfolders of the same directory. The header (the first line) should be kept only from the first file and skipped for all the others. The number of lines differs from file to file, so the script should pick up only the lines that actually contain data and avoid writing blank lines.
This is what I tried so far:
$src = "C:\Users\E\Desktop\Merge\Input\Files*.csv"
$dst = "C:\Users\E\Desktop\Merge\Output"
Get-ChildItem -Path $src -Recurse -File | Copy-Item -Destination $dst
and this one:
Get-ChildItem -Path $src -Recurse -File | Copy-Item -Destination $dst |
ForEach-Object {
$NewName = $_.Name
$Destination = Join-Path -Path $_.Directory.FullName -ChildPath $NewName
Move-Item -Path $_.FullName -Destination $Destination -Force
}
any help please? :)

Since you are looking to merge the files you may as well read them all into PowerShell, and then output the whole thing at once. You could do something like:
$Data = Get-ChildItem -Path $src -Recurse -File | Import-Csv
$Data | Export-Csv $dst\Output.csv -NoTypeInformation
That may not be feasible if your CSV files are extremely large, but it is a simple way to merge CSV files if the header row is the same in all files.
Another method would be to just treat it as text, which is much less memory intensive. For that you would want to get a list of files, copy the first one intact, and then copy the rest of them skipping the header row.
$Files = Get-ChildItem $src -Recurse
$TargetFile = Join-Path $dst $Files[0].Name
$Files[0] | Copy-Item -Dest $TargetFile
#Skip the first file, and loop through the rest
$Files | Select -Skip 1 | ForEach-Object{
#Get the contents of the file, and skip the header row, then append the rest to the target
Get-Content $_ | Select -Skip 1 | Add-Content $TargetFile
}
Edit: Ok, I wanted to replicate the process so that I could figure out what was giving you errors. To do that I created 3 folders, and copied a .csv file with 4 entries into each folder, with all of the files named 'Files 06202018.csv'. I ran my code above, and it did what it should, but there was some file corruption where the second file would be appended directly to the end of the first file without a new line being created for it, so I changed things from just copying the first file, to reading it and creating a new file in the destination. The below code worked flawlessly for me:
$src = "C:\Temp\Test\Files*.csv"
$dst = "C:\Temp\Test\Output"
$Files = Get-ChildItem $src -Recurse
$TargetFile = Join-Path $dst $Files[0].Name
GC $Files[0] | Set-Content $TargetFile
#Skip the first file, and loop through the rest
$Files | Select -Skip 1 | ForEach-Object{
#Get the contents of the file, and skip the header row, then append the rest to the target
Get-Content $_ | Select -Skip 1 | Add-Content $TargetFile
}
That took the files:
C:\Temp\Test\Lapis\Files 06202018.csv
C:\Temp\Test\Malachite\Files 06202018.csv
C:\Temp\Test\Opal\Files 06202018.csv
And it combined those three files into a correctly merged file at:
C:\Temp\Test\Output\Files 06202018.csv
The only time that I had any issues is if I forgot to delete the target file before running this. Depending on how large these files are, and how much memory you have available, you could probably speed this up by changing the last two lines to this:
Get-Content $_ | Select -Skip 1
} | Add-Content $TargetFile
That would read all of the files in (other than the first one) and only write to the destination once, instead of having to get file lock, open the file for writing, write, and close the destination for each file.
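Neither variant above deals with the blank lines mentioned in the question. A minimal sketch of the same text-based merge that also drops empty lines could look like the following. The function name Merge-CsvText and its parameters are made up for illustration, and it assumes every file starts with the same single header line:

```powershell
# Hypothetical helper: merges CSV files as plain text, keeping only the first header.
function Merge-CsvText {
    param(
        [Parameter(Mandatory)] [string[]] $Path,        # full paths of the files to merge
        [Parameter(Mandatory)] [string]   $Destination  # file to create (overwritten if present)
    )

    & {
        $first = $true
        foreach ($file in $Path) {
            # Keep the header of the first file only
            $skip  = if ($first) { 0 } else { 1 }
            $first = $false

            Get-Content -LiteralPath $file |
                Select-Object -Skip $skip |
                Where-Object { $_.Trim() -ne '' }   # drop blank or whitespace-only lines
        }
    } | Set-Content -LiteralPath $Destination       # the destination is opened only once
}

# Possible usage with the variables from the answer above:
# $Files = Get-ChildItem $src -Recurse -File
# Merge-CsvText -Path $Files.FullName -Destination (Join-Path $dst 'Merged.csv')
```

Because Set-Content replaces the target, a leftover file from an earlier run should not end up duplicated in the result.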

Related

Powershell: Find Folders with (Name) and Foreach Copy to Location Preserve Directory Structure

Got another multi-step process I'm looking to streamline. Basically, I'm looking to build a Powershell script to do three things:
Get-Childitem to look for folders with a specific name (we'll call it NAME1 as a placeholder)
For each folder it finds that has the name, I want it to output the full directory to a TXT file (so that in the end I wind up with a text file that has a list of the results it found, with their full paths; so if it finds folders with "NAME1" in five different subdirectories of the folder I give it, I want the full path beginning with the drive letter and ending with "NAME1")
Then I want it to take the list from the TXT file, and copy each file path to another drive and preserve directory structure
So basically, if it searches and finds this:
D:\TEST1\NAME1
D:\TEST7\NAME1
D:\TEST8\NAME1\
That's what I want to appear in the text file.
Then what I want it to do is to go through each line in the text file and plug the value into a Copy-Item (I'm thinking the source directory would get assigned to a variable), so that when it's all said and done, on the second drive I wind up with this:
E:\BACKUP\TEST1\NAME1
E:\BACKUP\TEST7\NAME1
E:\BACKUP\TEST8\NAME1\
So in short, I'm looking for a Get-Childitem that can define a series of paths, which Copy-Item can then use to back them up elsewhere.
I already have one way to do this, but the problem is it seems to copy everything every time, and since one of these drives is an SSD I only want to copy what's new/changed each time (not to mention that would save time when I need to run a backup):
$source = "C:\"
$target = "E:\BACKUP\"
$search = "NAME1"
$source_regex = [regex]::escape($source)
(gci $source -recurse | where {-not ($_.psiscontainer)} | select -expand fullname) -match "\\$search\\" |
foreach {
$file_dest = ($_ | split-path -parent) -replace $source_regex,$target
if (-not (test-path $file_dest)){mkdir $file_dest}
copy-item $_ -Destination $file_dest -force -verbose
}
If there's a way to do this that wouldn't require writing out a TXT file each time I'd be all for that, but I don't know a way to do this the way I'm looking for except a Copy-Item.
I'd be very grateful for any help I can get with this. Thanks all!
If I understand correctly, you want to copy all folders with a certain name, keeping the original folder structure in the destination path and copy only files that are newer than what is in the destination already.
Try
$source = 'C:\'
$target = 'E:\BACKUP\'
$search = 'NAME1'
# -ErrorAction SilentlyContinue because in the C:\ disk you are bound to get Access Denied on some paths
Get-ChildItem -Path $source -Directory -Recurse -Filter $search -ErrorAction SilentlyContinue | ForEach-Object {
# construct the destination folder path
$dest = Join-Path -Path $target -ChildPath $_.FullName.Substring($source.Length)
# copy the folder including its files and subfolders (but not empty subfolders)
# for more switches see https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/robocopy
robocopy $_.FullName $dest  /XO /S /R:0
}
If you don't want console output of robocopy you can silence it by appending > $null 2>&1, which discards stdout and sends stderr the same way, so neither is echoed.
If you want to keep a file after this with both the source paths and the destinations, I'd suggest doing
$source = 'C:\'
$target = 'E:\BACKUP\'
$search = 'NAME1'
$output = [System.Collections.Generic.List[object]]::new()
# -ErrorAction SilentlyContinue because in the C:\ disk you are bound to get Access Denied on some paths
Get-ChildItem -Path $source -Directory -Recurse -Filter $search -ErrorAction SilentlyContinue | ForEach-Object {
# construct the destination folder path
$dest = Join-Path -Path $target -ChildPath $_.FullName.Substring($source.Length)
# add an object to the output list
$output.Add([PsCustomObject]@{Source = $_.FullName; Destination = $dest })
# copy the folder including its files and subfolders (but not empty subfolders)
# for more switches see https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/robocopy
robocopy $_.FullName $dest  /XO /S /R:0
}
# write the output to csv file
$output | Export-Csv -Path 'E:\backup.csv' -NoTypeInformation
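The destination path in both scripts is built by cutting the source root off the front of each folder's full path and joining the remainder onto the target. A small sketch of just that step, without robocopy, is shown below. Get-BackupPath is a hypothetical name used only here, and trimming a leading separator is my addition so that a $source without a trailing backslash behaves the same way:

```powershell
# Hypothetical helper: maps a path under $SourceRoot to the same relative path under $TargetRoot.
function Get-BackupPath {
    param(
        [Parameter(Mandatory)] [string] $SourceRoot,
        [Parameter(Mandatory)] [string] $TargetRoot,
        [Parameter(Mandatory)] [string] $FullName
    )

    # Part of the path that follows the source root, without a leading separator
    $relative = $FullName.Substring($SourceRoot.Length).TrimStart([char[]]'\/')

    Join-Path -Path $TargetRoot -ChildPath $relative
}

# Possible usage inside the ForEach-Object loop above:
# $dest = Get-BackupPath -SourceRoot $source -TargetRoot $target -FullName $_.FullName
```

Note that the Substring approach only makes sense when $FullName really lies under $SourceRoot, which is the case for items returned by Get-ChildItem -Path $source.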

PowerShell: Export Data from Specific Row Number from multiple files and export to one csv

Beginner user...
I have 30+ .dat files in one folder and need to export row 10 from each file and compile into one csv file.
I know I am on the right lines but am not sure of the middle section - this is where I'm at...
Get-ChildItem -Path C:\Users\pitters\Folder\* -Include *.dat -Recurse
ForEach-Object {
Select-Object -Skip 9 -First 1 }
Export-CSV -Path Users\pitters\Folder\output.csv
Am I missing Get-Content and can anyone help with what needs correcting?
Thanks in advance.
Matt
As you mentioned yourself, you need to invoke Get-Content to actually read the file contents. In addition, you also need to construct an object with appropriate properties corresponding to the columns you want in your CSV file - something we can do with Select-Object directly:
Get-ChildItem -Path C:\Users\pitters\Folder\* -Include *.dat -Recurse |ForEach-Object {
$file = $_
# Skip the first 9 lines, take line 10, and wrap it in an object together with the file path
$file |Get-Content |Select-Object @{Name='Line10';Expression={$_}},@{Name='File';Expression={$file.FullName}} -Skip 9 -First 1
} |Export-CSV -Path C:\Users\pitters\Folder\output.csv -NoTypeInformation

Merge CSV Files in Powershell traverse subfolders - archiving & deleting old files use folder name for Target-CSV

I want to merge many CSV-files into one (a few hundred files) removing the header row of the added CSVs.
As the files sit in several subfolders I need to start from the root traversing all the subfolders and process all CSVs in there. Before merging I want to archive them with zip deleting old CSVs. The new merged CSV-file and the zip-archive should be named like their parent folder.
In case the Script is started again for the same folder none of already processed files should be damaged or removed accidentally.
I am not a Powershell guy so I have been copying pasting from several resources in the web and came up with the following solution (Sorry don't remember the resources feel free to put references in the comment if you know).
This patch-work code does the job but it doesn't feel very bulletproof. For now it is processing the CSV files in the subfolders only. Processing the files within the given $targDir as well would also be nice.
I am wondering if it could be more compact. Suggestions for improvement are appreciated.
$targDir = "\\Servername\folder\"; #path
Get-ChildItem "$targDir" -Recurse -Directory |
ForEach-Object { #walkinthrough all subfolder-paths
#
Set-Location -Path $_.FullName
#remove existing AllInOne.csv (target name for the merged file) in case it has been left over from a previous execution.
$FileName = ".\AllInOne.csv"
if (Test-Path $FileName) {
Remove-Item $FileName
}
#remove existing AllInOne.zip (target name for the archived files) in case it has been left over from a previous execution.
$FileName = ".\AllInOne.zip"
if (Test-Path $FileName) {
Remove-Item $FileName
}
#compressing all csv files in the current path, temporarily named AllInOne.zip. Doing that for each file adding it to the archive (with -Update)
# I wonder if there is a more efficient way to do that.
dir $_.FullName | where { $_.Extension -eq ".csv"} | foreach { Compress-Archive $_.FullName -DestinationPath "AllInOne.zip" -Update}
##########################################################
# This code is basically merging all the CSV files
# skipping the header of added files
##########################################################
$getFirstLine = $true
get-childItem ".\*.csv" | foreach {
$filePath = $_
$lines = Get-Content $filePath
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content ".\AllInOne.csv" $linesToWrite
# Output file is named AllInOne.csv temporarily - this is not a requirement
# It was simply easier for me to come up with this temp file in the first place (symptomatic for copy&paste).
}
#########################################################
#deleting old csv files
dir $_.FullName | where { $_.Extension -eq ".csv" -and $_ -notlike "AllInOne.csv"} | foreach { Remove-Item $_.FullName}
# Temporarily rename AllinOne files with parent folder name
Get-ChildItem -Path $_.FullName -Filter *.csv | Rename-Item -NewName {$_.Basename.Replace("AllInOne",$_.Directory.Name) + $_.extension}
Get-ChildItem -Path $_.FullName -Filter *.zip | Rename-Item -NewName {$_.Basename.Replace("AllInOne",$_.Directory.Name) + $_.extension}
}
I have been executing it in the Powershell ISE. The Script is for a house keeping only, executed casually and not on a regular base - so performance doesn't matter so much.
I prefer to stick with a script that doesn't depend on additional libraries if possible (e.g. for Zip).
It may not be bulletproof, but I have seen worse cobbled together scripts. It'll definitely do the job you want it to, but here are some small changes that will make it a bit shorter and harder to break.
Since all your files are CSVs and all would have the same headers, you can use Import-CSV to compile all of the files into an array. You won't have to worry about stripping the headers or accidentally removing a row.
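# Start with an empty array so that += also works when the first file has only one row
$csvArray = @()
Get-ChildItem "*.csv" | Foreach-Object {
$csvArray += Import-CSV $_.FullName
}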
Then you can just use Export-CSV -Path $_.FullName -NoTypeInformation to output it all in to a new CSV file.
To have it check the root folder and all the subfolders, I would throw all of the lines in the main ForEach loop into a function and then call it once for the root folder and keep the existing loop for all the subfolders.
function CompileCompressCSV {
param (
[string] $Path
)
# Code from inside the ForEach Loop
}
# Main Script ($targDir is the root path defined at the top of the original script)
CompileCompressCSV -Path $targDir
Get-ChildItem -Path $targDir -Recurse -Directory | ForEach-Object {
CompileCompressCSV -Path $_.FullName
}
This is more of a stylistic choice, but I would do the steps of this script in a slightly different order:
Get Parent Folder Name
Remove old compiled CSVs and ZIPs
Compile CSVs into an array and output with Parent Folder Name
ZIP together CSVs into a file with the Parent Folder Name
Remove all CSV files
Personally, I'd rather name the created files properly the first time instead of having to go back and rename them unless there is absolutely no way around it. That doesn't seem the case for your situation so you should be able to create them with the right name on the first go.
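As a rough illustration of that order, here is a sketch of what the body of such a function might look like. The name Merge-AndArchiveCsv is hypothetical, and there is one deliberate deviation from the list above: instead of removing earlier outputs it skips any folder that already contains a merged CSV or ZIP, on the assumption that this is the safer reading of "already processed files should not be damaged". It relies only on the built-in Import-Csv, Export-Csv and Compress-Archive cmdlets (PowerShell 5 or later) and assumes all CSVs in a folder share the same header.

```powershell
# Hypothetical sketch: merge and archive the CSV files of one folder.
function Merge-AndArchiveCsv {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)] [string] $Path
    )

    # Stop on the first error so source files are never deleted after a failed step
    $ErrorActionPreference = 'Stop'

    # 1. Get the parent folder name and derive both output names from it
    $folder    = Get-Item -LiteralPath $Path
    $mergedCsv = Join-Path $folder.FullName ($folder.Name + '.csv')
    $zipFile   = Join-Path $folder.FullName ($folder.Name + '.zip')

    # 2. Deviation (assumption): skip folders that were already processed
    if ((Test-Path -LiteralPath $mergedCsv) -or (Test-Path -LiteralPath $zipFile)) {
        Write-Warning "Skipping '$($folder.FullName)': output already exists."
        return
    }

    $sources = @(Get-ChildItem -LiteralPath $folder.FullName -Filter '*.csv' -File | Sort-Object Name)
    if ($sources.Count -eq 0) { return }

    # 3. Import-Csv consumes each header row, so nothing has to be stripped by hand
    $sources | ForEach-Object { Import-Csv -LiteralPath $_.FullName } |
        Export-Csv -LiteralPath $mergedCsv -NoTypeInformation

    # 4. Archive the original files in a single call
    Compress-Archive -LiteralPath $sources.FullName -DestinationPath $zipFile

    # 5. Remove the originals only after both outputs were written
    $sources | Remove-Item
}
```

Calling it once for the root folder and once per subfolder, as in the main script above, would cover both cases. If the merged file name could collide with one of the source CSVs, the naming scheme would need adjusting.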

How to rename large number of files using Powershell and a CSV

Ultimately, I need a solid PowerShell script that will take a folder with several hundred video files, import the existing file names into the program, look up the new file name in a CSV, and rename each file. The old file names are simple (e.g. File1.mp4, File2.mp4, etc.). I would like to prepend a date to each file name in the format YYYY-MM-DD.
For testing, I created a folder on my desktop with (10) text files, each with a unique file name.
My CSV file appears as follows:
Image of CSV
The "newfilename" column, was created by using the Concatenate command in Excel.
`=CONCATENATE(TEXT(A2, "yyyy-mm-dd")," ", B2)`
As much as I would just like PowerShell to handle everything, I feel using Excel for most of this might be the best way.
In my testing, everything was in one folder. However, at work, I will have video files on one drive, and the script will have to be in a folder on my desktop. Because I am in a corporate network, I need a special batch file to run my scripts, which is nothing new. I just modify the script name, and away it goes!
So what commands do I need to do in order to have the script separate from the video files AND the CSV file?
Here is the code that I have so far. Everything works when it's in one folder.
PS C:\Users\ceran\Desktop\Rename Project> Import-Csv -Path .\MyFileList.csv | ForEach-Object {
>> $Src = Join-Path -Path $TargetDir -ChildPath $_.filename
>> $Dst = Join-Path -Path $TargetDir -ChildPath $_.newfilename
>> Rename-Item -Path $Src -NewName $Dst
>> }
Thanks in advance for the help!
Chris
I'm not sure what the date column is in your Excel file and if you want to rename all files in the folder, but if that is the case, you don't need a csv file at all and can do this:
$sourceFolder = 'X:\Path\to\the\video\files' # change this to the real path
Get-ChildItem -Path $sourceFolder -Filter '*.mp4' -File | # iterate through the files in the folder
Where-Object {$_.Name -notmatch '^\d{4}-\d{2}-\d{2}'} | # don't rename files that already start with the date
Rename-Item -NewName { '{0:yyyy-MM-dd} {1}' -f $_.LastWriteTime, $_.Name } -WhatIf
This uses parameter -Filter '*.mp4', to get only files with an .mp4 extension. For the files in your testfolder (Desktop\Rename Project), change this to -Filter '*.txt'.
If you want all files renamed, no matter what the extension, simply remove the Filter from the cmdlet.
Because of the -WhatIf switch, no file is actually renamed and the code just shows in the console what would happen. Once satisfied that this is OK, remove the -WhatIf
Hope that helps.
$targetdir="C:\path\to\where\our\file\directory\is"
$pathtocsv="c:\path\to\csv.csv"
Import-Csv -Path $pathtocsv | ForEach-Object {
$Src = Join-Path -Path $TargetDir -ChildPath $_.filename
$Dst = Join-Path -Path $TargetDir -ChildPath $_.newfilename
Rename-Item -Path $Src -NewName $Dst
}
Why would this not work in any situation?
By the way, if the csv had the columns path and newname, it could be piped directly to rename-item:
path,newname
file.txt,file2.txt
import-csv ren.csv | Rename-Item -whatif
What if: Performing the operation "Rename File" on target "Item: /Users/js/foo/file.txt Destination: /Users/js/foo/file2.txt".
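For the original question of keeping the script, the CSV and the video files in three different places, the pieces above can be combined into one small function. This is only a sketch: Rename-FromCsv and its parameter names are invented for illustration, the column names filename and newfilename are taken from the question, and the existence check is an extra safeguard that was not part of any answer.

```powershell
# Hypothetical helper: renames files in $TargetDir according to a CSV stored elsewhere.
function Rename-FromCsv {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)] [string] $TargetDir,  # folder that holds the files
        [Parameter(Mandatory)] [string] $CsvPath     # CSV with columns filename,newfilename
    )

    Import-Csv -LiteralPath $CsvPath | ForEach-Object {
        $src = Join-Path -Path $TargetDir -ChildPath $_.filename

        if (-not (Test-Path -LiteralPath $src)) {
            Write-Warning "Not found, skipped: $src"
            return   # inside ForEach-Object this moves on to the next CSV row
        }

        # -NewName only needs the new file name, not a full path
        Rename-Item -LiteralPath $src -NewName $_.newfilename
    }
}

# Possible usage (dry run first, then for real):
# Rename-FromCsv -TargetDir 'X:\Path\to\the\video\files' -CsvPath 'C:\path\to\csv.csv' -WhatIf
# Rename-FromCsv -TargetDir 'X:\Path\to\the\video\files' -CsvPath 'C:\path\to\csv.csv'
```

Because the function declares SupportsShouldProcess, passing -WhatIf is handed down to Rename-Item, so the dry run prints what would be renamed without touching any file.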

Copying files to specific folder declared in a CSV file using Powershell Script

I am quite new to PowerShell and I am trying to write a script that copies files to certain folders that are declared in a CSV file. So far I am getting errors everywhere and can't find anything that resolves the issue.
I have these folders and .txt files created in the same folder as the script.
Till now i could only do this:
$files = Import-Csv .\files.csv
$files
foreach ($file in $files) {
$name = $file.name
$final = $file.destination
Copy-Item $name -Destination $final
}
This is my CSV
name;destination
file1.txt;folderX
file2.txt;folderY
file3.txt;folderZ
As the comments indicate, if you are not using default system delimiters, you should make sure to specify them.
I also recommend typically to use quotes for your csv to ensure no problems with accidentally including an entry that includes the delimiter in the name.
@"
"taco1.txt";"C:\temp\taco2;.txt"
"@ | ConvertFrom-CSV -Delimiter ';' -Header @('file','destination')
will output
file destination
---- -----------
taco1.txt C:\temp\taco2;.txt
The quotes make sure the values are correctly interpreted. And yes... you can name a file foobar;test..txt. Never underestimate what users might do. 😁
If you take the command Get-ChildItem | Select-Object BaseName,Directory | ConvertTo-CSV -NoTypeInformation and review the output, you should see it quoted like this.
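Applied to the script from the question, specifying the delimiter might look like the sketch below. Copy-FromCsvList and its parameters are hypothetical names; the column names name and destination come from the question's CSV, and creating a missing destination folder is an assumption about the desired behaviour.

```powershell
# Hypothetical helper: copies each file listed in a ';'-separated CSV to its destination folder.
function Copy-FromCsvList {
    param(
        [Parameter(Mandatory)] [string] $CsvPath,  # CSV with columns name;destination
        [Parameter(Mandatory)] [string] $BaseDir   # folder that contains the files and target folders
    )

    # Without -Delimiter ';' each line would be read as a single column
    Import-Csv -LiteralPath $CsvPath -Delimiter ';' | ForEach-Object {
        $source = Join-Path -Path $BaseDir -ChildPath $_.name
        $target = Join-Path -Path $BaseDir -ChildPath $_.destination

        # Assumption: create the destination folder if it does not exist yet
        if (-not (Test-Path -LiteralPath $target)) {
            New-Item -ItemType Directory -Path $target | Out-Null
        }

        Copy-Item -LiteralPath $source -Destination $target
    }
}

# Possible usage from a script that sits next to the files:
# Copy-FromCsvList -CsvPath (Join-Path $PSScriptRoot 'files.csv') -BaseDir $PSScriptRoot
```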
Sourcing Your File List
One last tip. Most of the time when I've come across a CSV used as a file input list, the CSV wasn't actually needed. Consider gathering the files in your script itself.
For example, if you have a folder and need to filter the list down, you can do this on the fly very easily in PowerShell by using Get-ChildItem.
For example:
$Directory = 'C:\temp'
$Destination = $ENV:TEMP
Get-ChildItem -Path $Directory -Filter *.txt -Recurse | Copy-Item -Destination $Destination
If you need to have more granular matching control, consider using the Where-Object cmdlet and doing something like this:
Get-ChildItem -Path $Directory -Filter *.txt -Recurse | Where-Object Name -match '(taco)|(burrito)' | Copy-Item -Destination $Destination
Often you'll find that you can easily use this type of filtering to keep CSV and input files out of the solution.
Using techniques like this, you might be able to get files from 2 directories, filter the match, and copy all in a short statement like this:
Get-ChildItem -Path 'C:\temp' -Filter '*.xlsx' -Recurse | Where-Object Name -match 'taco' | Copy-Item -Destination $ENV:TEMP -Verbose
Hope that gives you some other ideas! Welcome to Stack Overflow. 👋