PowerShell - Pass list of directory paths to FOR loop - Output results to CSV

The code below works. Rather than specifying the path manually, I would like to pass a list of values from a CSV file (E:\Data\paths.csv) and then output an individual CSV file for each path processed, showing the $Depth for that directory.
$StartLevel = 0 # 0 = include base folder, 1 = sub-folders only, 2 = start at 2nd level
$Depth = 10 # How many levels deep to scan
$Path = "E:\Data\MyPath" # starting path
For ($i = $StartLevel; $i -le $Depth; $i++) {
    $Levels = "\*" * $i
    (Resolve-Path $Path$Levels).ProviderPath | Get-Item | Where PsIsContainer |
        Select FullName
}
Thanks,
Phil

Get-Help Import-Csv will help you in this regard.
regards,
kvprasoon
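For illustration, a minimal sketch of how Import-Csv could feed the loop, assuming E:\Data\paths.csv has a header row with a Path column (the column name is an assumption):
# Each row becomes an object; its .Path property can replace the hard-coded $Path
foreach ($row in Import-Csv -Path 'E:\Data\paths.csv') {
    $row.Path
}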

I assume you want something like the following:
# Create sample input CSV
@"
Path,StartLevel,Depth
"E:\Data\MyPath",0,10
"@ > PathSpecs.csv
# Loop over each input CSV row (object with properties
# .Path, .StartLevel, and .Depth)
foreach ($pathSpec in Import-Csv PathSpecs.csv) {
    & { For ([int] $i = $pathSpec.StartLevel; $i -le $pathSpec.Depth; $i++) {
            $Levels = "\*" * $i
            Resolve-Path "$($pathSpec.Path)$Levels" | Get-Item | Where PsIsContainer |
                Select FullName
        } } | # Export paths to a CSV file named "Path-<input-path-with-punct-stripped>.csv"
        Export-Csv -NoTypeInformation "Path-$($pathSpec.Path -replace '[^\p{L}0-9]+', '_').csv"
}
Note that your approach to breadth-first enumeration of subdirectories in the subtree works, but will be quite slow with large subtrees.
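As an aside, on PowerShell 5.0 and later a single depth-limited Get-ChildItem call is usually much faster than resolving a wildcard pattern per level. A minimal sketch (it does not reproduce the $StartLevel offset; the parameter values are illustrative):
# One pass over the subtree, limited to the requested number of recursion levels
Get-ChildItem -Path $pathSpec.Path -Directory -Recurse -Depth ([int]$pathSpec.Depth) |
    Select-Object FullName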

Related

Export CSV: file structure with folders as columns

My question is quite similar to one posted here: Export CSV. Folder, subfolder and file into separate column
I have a file and folder structure, possibly up to 10 folders deep, and I want to run PowerShell to create a hash table that writes each file into a row, with each of its folders in a separate column and the filename in a dedicated column.
I start off with
gci -path C:\test -file -recurse | export-csv C:\temp\out.csv -notypeinformation
But this produces the standard table with some of the info I need; the directory is, of course, presented as one long string.
I'd like output where each folder and subfolder that houses the file is presented as a separate column.
C:\Test\Folder1\Folder2\Folder3\file.txt
to be presented as
Name        Parent1  Parent2  Parent3  Parent4  Parent5  Parent6  Filename
file.txt    Folder1  Folder2  Folder3                             file.txt
image1.png  Folder1                                               image1.png
Doc1.docx   Folder1  Folder2  Folder3  Folder4  Folder5  Folder6  Doc1.docx
table3.csv  Folder1  Folder2                                      table3.csv
As you can see, some files have just one parent folder, whereas others could be stored several folders deep.
I need to keep this consistent, as I want to use Power Automate and the File System connector to read the file paths from the Excel table and then create each file in SharePoint, using the parent/folder levels as metadata columns in the document library.
I took zett42's code from the linked question and modified it.
$allItems = Get-ChildItem C:\Test -File -Recurse | ForEach-Object {
    # Split on directory separator (typically '\' for Windows and '/' for Unix-like OS)
    $FullNameSplit = $_.FullName.Split( [IO.Path]::DirectorySeparatorChar )
    # Create an object that contains the splitted path and the path depth.
    # This is implicit output that PowerShell captures and adds to $allItems.
    [PSCustomObject] @{
        FullNameSplit = $FullNameSplit
        PathDepth     = $FullNameSplit.Count
        Filename      = $_.Name
    }
}
# Determine highest column index from maximum depth of all paths.
# Minus one, because we'll skip root path component.
$maxColumnIndex = ( $allItems | Measure-Object -Maximum PathDepth ).Maximum - 1
$allRows = foreach( $item in $allItems ) {
    # Create an ordered hashtable
    $row = [ordered]@{}
    # Add all path components to hashtable. Make sure all rows have same number of columns.
    foreach( $i in 1..$maxColumnIndex ) {
        $row[ "Filename" ] = $item.Filename
        $row[ "Column$i" ] = if( $i -lt $item.FullNameSplit.Count ) { $item.FullNameSplit[ $i ] } else { $null }
    }
    # Convert hashtable to object suitable for output to CSV.
    # This is implicit output that PowerShell captures and adds to $allRows.
    [PSCustomObject] $row
}
I can get the filename to show as a separate column, but I don't want the script to also add the filename as the last column.
[Screenshot: PowerShell $allRows output]
Thanks
I've answered my own question.
I modified zett42's script: I split just the directory portion of Get-ChildItem's FullName (via Split-Path) rather than the full path, and added a fixed column holding just the filename to the hash table.
$allItems = Get-ChildItem C:\Test -File -Recurse | ForEach-Object {
    # Split on directory separator (typically '\' for Windows and '/' for Unix-like OS)
    # $FullNameSplit = $_.FullName.Split( [IO.Path]::DirectorySeparatorChar )
    $FullNameSplit = Split-Path -Path $_.FullName
    $DirNameSplit = $FullNameSplit.Split( [IO.Path]::DirectorySeparatorChar )
    # Create an object that contains the splitted path and the path depth.
    # This is implicit output that PowerShell captures and adds to $allItems.
    [PSCustomObject] @{
        #FullNameSplit = $FullNameSplit
        #PathDepth = $FullNameSplit.Count
        DirNameSplit = $DirNameSplit
        PathDepth = $DirNameSplit.Count
        Filename = $_.Name
    }
}
# Determine highest column index from maximum depth of all paths.
# Minus one, because we'll skip root path component.
$maxColumnIndex = ( $allItems | Measure-Object -Maximum PathDepth ).Maximum - 1
$allRows = foreach( $item in $allItems ) {
    # Create an ordered hashtable
    $row = [ordered]@{}
    # Add all path components to hashtable. Make sure all rows have same number of columns.
    foreach( $i in 1..$maxColumnIndex ) {
        $row[ "Filename" ] = $item.Filename
        #$row[ "Column$i" ] = if( $i -lt $item.FullNameSplit.Count ) { $item.FullNameSplit[ $i ] } else { $null }
        $row[ "Parent$i" ] = if( $i -lt $item.DirNameSplit.Count ) { $item.DirNameSplit[ $i ] } else { $null }
        # $row[ "Column$i" ] = $item.DirNameSplit[$i]
    }
    # Convert hashtable to object suitable for output to CSV.
    # This is implicit output that PowerShell captures and adds to $allRows.
    [PSCustomObject] $row
}
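The self-answer stops at building $allRows; to get the Excel-readable table the question asks for, the rows still need to be written out, along the lines of the original Export-Csv attempt (the output path here is hypothetical):
# Write the flattened rows to CSV so Excel / Power Automate can consume them
$allRows | Export-Csv -Path 'C:\temp\allrows.csv' -NoTypeInformation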

How to add text to existing text in a csv file using PowerShell

I have a CSV file that contains one column of cells (column A); each row/cell contains a single file name. The CSV file has no header.
Something like this -
6_2021-05-10_02-00-36.mp4
6_2021-05-10_05-04-01.mp4
6_2021-05-10_05-28-59.mp4
6_2021-05-10_05-35-05.mp4
6_2021-05-10_05-35-34.mp4
6_2021-05-10_05-39-36.mp4
6_2021-05-10_05-39-41.mp4
6_2021-05-10_05-39-52.mp4
The number of rows in this csv file is variable.
I need to add a URL to the beginning of the text in each cell so that a valid URL is created, and the resulting CSV content looks exactly like this:
https:\\www.url.com\6_2021-05-10_02-00-36.mp4
https:\\www.url.com\6_2021-05-10_05-04-01.mp4
https:\\www.url.com\6_2021-05-10_05-28-59.mp4
https:\\www.url.com\6_2021-05-10_05-35-05.mp4
https:\\www.url.com\6_2021-05-10_05-35-34.mp4
https:\\www.url.com\6_2021-05-10_05-39-36.mp4
https:\\www.url.com\6_2021-05-10_05-39-41.mp4
https:\\www.url.com\6_2021-05-10_05-39-52.mp4
So, this is what I've come up with, but it does not work:
Param($File)
$csvObjects = C:\_TEMP\file_list_names.csv $file
$NewCSVObject = "https:\\www.url.com\"
foreach ($item in $csvObjects)
{
    $item = ($NewCSVObject += $item)
}
$csvObjects | export-csv "C:\_TEMP\file_list_names_output.csv" -noType
But it's not working, and my PowerShell skills are not so sharp.
I'd be so very grateful for some assistance on this.
Thanks in advance-
Gregg
Sierra Vista, AZ
Just concatenate the URL with what you want:
$file2 = "C:\fic2.csv"
$x = Get-Content $file2
for ($i = 0; $i -lt $x.Count; $i++) {
    $x[$i] = "https:\\www.url.com\" + $x[$i]
}
$x
Technically speaking, your input file can serve as a CSV, but because it contains only one column of data and has no headers, it is best treated with Get-Content instead of Import-Csv.
Here are two alternatives for you to try.
$result = foreach ($fileName in (Get-Content -Path 'C:\_TEMP\file_list_names.csv')) {
    'https:\\www.url.com\{0}' -f $fileName
}
# next save the file
$result | Set-Content -Path 'C:\_TEMP\file_urls.csv'
OR something like:
Get-Content -Path 'C:\_TEMP\file_list_names.csv' | ForEach-Object {
    "https:\\www.url.com\$_"
} | Set-Content -Path 'C:\_TEMP\file_urls.csv'
URLs usually use forward slashes (/), not backslashes (\). I left these in, so you can replace them yourself if needed.
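If forward slashes are wanted, the expression inside either loop can be adjusted; a small sketch of that change ($url stands for an already-built string and is hypothetical):
# Emit the URL with forward slashes instead of backslashes
"https://www.url.com/$_"
# ...or fix up an existing value after the fact
$url -replace '\\', '/'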
With the help of Frenchy, the complete answer is as follows (URL changed for security reasons, obviously):
# opens list of file names
$file2 = "C:\_TEMP\file_list_names.csv"
$x = Get-Content $file2
# appends URL to the beginning of each file name
for ($i = 0; $i -lt $x.Count; $i++) {
    $x[$i] = "https://bizops-my.sharepoint.com/:f:/g/personal/gpowell_bizops_onmicrosoft_com/Ei4lFpZHTe=Jkq1fZ\" + $x[$i]
}
$x
# remove all files in target directory prior to saving new list
Get-ChildItem -Path C:\_TEMP\file_list_names_url.csv | Remove-Item
Add-Content -Path C:\_TEMP\file_list_names_url.csv -Value $x

Export CSV. Folder, subfolder and file into separate column

I created a script that lists all the folders, subfolders and files and exports them to CSV:
$path = "C:\tools"
Get-ChildItem $path -Recurse | select fullname | export-csv -Path "C:\temp\output.csv" -NoTypeInformation
But I would like each folder, subfolder and file in the path to be written into a separate column in the CSV.
Something like this:
c:\tools\test\1.jpg
Column1  Column2  Column3
tools    test     1.jpg
I will be grateful for any help.
Thank you.
You can split the Fullname property using the Split() method. The tricky part is that you need to know the maximum path depth in advance, as the CSV format requires that all rows have the same number of columns (even if some columns are empty).
# Process directory $path recursively
$allItems = Get-ChildItem $path -Recurse | ForEach-Object {
    # Split on directory separator (typically '\' for Windows and '/' for Unix-like OS)
    $FullNameSplit = $_.FullName.Split( [IO.Path]::DirectorySeparatorChar )
    # Create an object that contains the splitted path and the path depth.
    # This is implicit output that PowerShell captures and adds to $allItems.
    [PSCustomObject] @{
        FullNameSplit = $FullNameSplit
        PathDepth     = $FullNameSplit.Count
    }
}
# Determine highest column index from maximum depth of all paths.
# Minus one, because we'll skip root path component.
$maxColumnIndex = ( $allItems | Measure-Object -Maximum PathDepth ).Maximum - 1
$allRows = foreach( $item in $allItems ) {
    # Create an ordered hashtable
    $row = [ordered]@{}
    # Add all path components to hashtable. Make sure all rows have same number of columns.
    foreach( $i in 1..$maxColumnIndex ) {
        $row[ "Column$i" ] = if( $i -lt $item.FullNameSplit.Count ) { $item.FullNameSplit[ $i ] } else { $null }
    }
    # Convert hashtable to object suitable for output to CSV.
    # This is implicit output that PowerShell captures and adds to $allRows.
    [PSCustomObject] $row
}
# Finally output to CSV file
$allRows | Export-Csv -Path "C:\temp\output.csv" -NoTypeInformation
Notes:
The syntax Select-Object @{ Name = ...; Expression = ... } creates a calculated property.
$allRows = foreach captures and assigns all output of the foreach loop to the variable $allRows, which will be an array if the loop outputs more than one object. This works with most other control statements as well, e.g. if and switch (see the short example after these notes).
Within the loop I could have created a [PSCustomObject] directly (and used Add-Member to add properties to it) instead of first creating a hashtable and then converting to [PSCustomObject]. The chosen way should be faster, as no additional overhead for calling cmdlets is required.
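A minimal illustration of that capture behavior (the variable names are made up):
# Loop output is collected directly into a variable - no += or Add-Member needed
$squares = foreach ($n in 1..5) { $n * $n }   # 1, 4, 9, 16, 25
# The same works for switch (and if)
$parity = switch (3) { { $_ % 2 } { 'odd' } default { 'even' } }   # 'odd'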
While a file with rows containing a variable number of items is not actually a CSV file, you can roll your own and Microsoft Excel can read it.
=== Get-DirCsv.ps1
Get-Childitem -File |
    ForEach-Object {
        $NameParts = $_.FullName -split '\\'
        $QuotedParts = [System.Collections.ArrayList]::new()
        foreach ($NamePart in $NameParts) {
            $QuotedParts.Add('"' + $NamePart + '"') | Out-Null
        }
        Write-Output $($QuotedParts -join ',')
    }
Use this to capture the output to a file with:
.\Get-DirCsv.ps1 | Out-File -FilePath '.\dir.csv' -Encoding ascii

Split a large csv file into multiple csv files according to the size in powershell

I have a large CSV file and I want to split it with respect to size and the header should be in every file.
For example, I have a 1.6 MB file and I want each child file to be no more than 512 KB, so the parent file should produce roughly 4 child files.
I tried the simple program below, but it splits the file into blank child files.
function csvSplitter {
    $csvFile = "D:\Test\PTest\Dummy.csv";
    $split = 10;
    $content = Import-Csv $csvFile;
    $start = 1;
    $end = 0;
    $records_per_file = [int][Math]::Ceiling($content.Count / $split);
    for ($i = 1; $i -le $split; $i++) {
        $end += $records_per_file;
        $content | Where-Object { [int]$_.Id -ge $start -and [int]$_.Id -le $end } | Export-Csv -Path "D:\Test\PTest\Destination\file$i.csv" -NoTypeInformation;
        $start = $end + 1;
    }
}
csvSplitter
The logic for splitting by file size is yet to be written.
I tried to attach both files, but I guess there is no option to attach files.
this takes a slightly different path to a solution. [grin]
it ...
- loads the CSV as a plain text file
- saves the 1st line as a header line
- calcs the batch size from the total line count & the batch count
- uses array index ranges to grab the lines for each batch
- combines the header line with the current batch of lines
- writes that out to a text file
the reason for such a roundabout method is to save RAM. one drawback to loading the file as a CSV is the sheer amount of RAM needed. just loading the lines of text requires noticeably less RAM.
$SourceDir = $env:TEMP
$InFileName = 'LargeFile.csv'
$InFullFileName = Join-Path -Path $SourceDir -ChildPath $InFileName
$BatchCount = 4
$DestDir = $env:TEMP
$OutFileName = 'LF_Batch_.csv'
$OutFullFileName = Join-Path -Path $DestDir -ChildPath $OutFileName
#region >>> build file to work with
# remove this region when you are ready to do this with your test data OR to do this with real data
if (-not (Test-Path -LiteralPath $InFullFileName))
{
    Get-ChildItem -LiteralPath $env:APPDATA -Recurse -File |
        Sort-Object -Property Name |
        Select-Object Name, Length, LastWriteTime, Directory |
        Export-Csv -LiteralPath $InFullFileName -NoTypeInformation
}
#endregion >>> build file to work with
$CsvAsText = Get-Content -LiteralPath $InFullFileName
[array]$HeaderLine = $CsvAsText[0]
$BatchSize = [int]($CsvAsText.Count / $BatchCount) + 1
$StartLine = 1
foreach ($B_Index in 1..$BatchCount)
{
    if ($B_Index -ne 1)
    {
        $StartLine = $StartLine + $BatchSize + 1
    }
    $CurrentOutFullFileName = $OutFullFileName.Replace('_.', ('_{0}.' -f $B_Index))
    $HeaderLine + $CsvAsText[$StartLine..($StartLine + $BatchSize)] |
        Set-Content -LiteralPath $CurrentOutFullFileName
}
there is no output on screen, but i got 4 files named LF_Batch_1.csv thru LF_Batch_4.csv that contained the four parts of the source file as expected. the last file has a slightly smaller number of rows, but that is what happens when the row count is not evenly divisible by the batch count. [grin]
Try this:
Add-Type -AssemblyName System.Collections
function Split-Csv {
    param (
        [string]$filePath,
        [int]$partsNum
    )
    # Use generic lists for import/export
    [System.Collections.Generic.List[object]]$contentImport = @()
    [System.Collections.Generic.List[object]]$contentExport = @()
    # import csv-file
    $contentImport = Import-Csv $filePath
    # how many lines per export file
    $linesPerFile = [Math]::Max( [int]($contentImport.Count / $partsNum), 1 )
    # start pointer for source list
    $startPointer = 0
    # counter for file name
    $counter = 1
    # main loop
    while( $startPointer -lt $contentImport.Count ) {
        # clear export list
        [void]$contentExport.Clear()
        # determine from-to from source list to export
        $endPointer = [Math]::Min( $startPointer + $linesPerFile, $contentImport.Count )
        # move lines to export to export list
        [void]$contentExport.AddRange( $contentImport.GetRange( $startPointer, $endPointer - $startPointer ) )
        # export
        $contentExport | Export-Csv -Path ($filePath.Replace('.', $counter.ToString() + '.' ) ) -NoTypeInformation -Force
        # move pointer
        $startPointer = $endPointer
        # increase counter for filename
        $counter++
    }
}
Split-Csv -filePath 'test.csv' -partsNum 7
try running this script:
$sw = new-object System.Diagnostics.Stopwatch
$sw.Start()
$FilePath = $HOME +'\Documents\Projects\ADOPT\Data8277.csv'
$SplitDir = $HOME +'\Documents\Projects\ADOPT\Split\'
CSV-FileSplitter -Path $FilePath -PartSizeBytes 35MB -SplitDir $SplitDir #-Verbose
$sw.Stop()
Write-Host "Split complete in " $sw.Elapsed.TotalSeconds "seconds"
I created this for files larger than 50 GB.
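The CSV-FileSplitter function itself is not shown in that answer. A minimal sketch of what such a function might look like, splitting on an approximate byte size and repeating the header in every part (the parameter names mirror the call above, but the body is an assumption, not the original implementation):
function CSV-FileSplitter {
    param (
        [string]$Path,
        [long]$PartSizeBytes,
        [string]$SplitDir
    )
    $reader = New-Object System.IO.StreamReader $Path
    try {
        $header = $reader.ReadLine()   # keep the header line for every part
        $part   = 0
        $writer = $null
        while (-not $reader.EndOfStream) {
            # open a new part when none is open or the current one reached the size limit
            if (-not $writer -or $writer.BaseStream.Length -ge $PartSizeBytes) {
                if ($writer) { $writer.Close() }
                $part++
                $writer = New-Object System.IO.StreamWriter (Join-Path $SplitDir ("part{0}.csv" -f $part))
                $writer.AutoFlush = $true   # keep BaseStream.Length close to the bytes written
                $writer.WriteLine($header)
            }
            $writer.WriteLine($reader.ReadLine())
        }
        if ($writer) { $writer.Close() }
    }
    finally {
        $reader.Close()
    }
}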

Remove Top Line of Text File with PowerShell

I am trying to just remove the first line of about 5000 text files before importing them.
I am still very new to PowerShell so not sure what to search for or how to approach this. My current concept using pseudo-code:
set-content file (get-content unless line contains amount)
However, I can't seem to figure out how to do something like contains.
While I really admire the answer from @hoge both for a very concise technique and a wrapper function to generalize it and I encourage upvotes for it, I am compelled to comment on the other two answers that use temp files (it gnaws at me like fingernails on a chalkboard!).
Assuming the file is not huge, you can force the pipeline to operate in discrete sections--thereby obviating the need for a temp file--with judicious use of parentheses:
(Get-Content $file | Select-Object -Skip 1) | Set-Content $file
... or in short form:
(gc $file | select -Skip 1) | sc $file
It is not the most efficient in the world, but this should work:
get-content $file |
    select -Skip 1 |
    set-content "$file-temp"
move "$file-temp" $file -Force
Using variable notation, you can do it without a temporary file:
${C:\file.txt} = ${C:\file.txt} | select -skip 1
function Remove-Topline ( [string[]]$path, [int]$skip = 1 ) {
    if ( -not (Test-Path $path -PathType Leaf) ) {
        throw "invalid filename"
    }
    ls $path |
        % { iex "`${$($_.fullname)} = `${$($_.fullname)} | select -skip $skip" }
}
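Usage might look like this (the paths are hypothetical):
# Strip the first line from a single file, or the first 2 lines from a set of logs
Remove-Topline -path 'C:\data\report.csv'
Remove-Topline -path (ls C:\logs\*.log).FullName -skip 2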
I just had to do the same task, and gc | select ... | sc took over 4 GB of RAM on my machine while reading a 1.6 GB file. It didn't finish for at least 20 minutes after reading the whole file in (as reported by Read Bytes in Process Explorer), at which point I had to kill it.
My solution was to use a more .NET approach: StreamReader + StreamWriter.
See this answer for a great answer discussing the perf: In Powershell, what's the most efficient way to split a large text file by record type?
Below is my solution. Yes, it uses a temporary file, but in my case, it didn't matter (it was a freaking huge SQL table creation and insert statements file):
PS> (measure-command{
    $i = 0
    $ins = New-Object System.IO.StreamReader "in/file/pa.th"
    $outs = New-Object System.IO.StreamWriter "out/file/pa.th"
    while( !$ins.EndOfStream ) {
        $line = $ins.ReadLine();
        if( $i -ne 0 ) {
            $outs.WriteLine($line);
        }
        $i = $i+1;
    }
    $outs.Close();
    $ins.Close();
}).TotalSeconds
It returned:
188.1224443
Inspired by AASoft's answer, I went out to improve it a bit more:
- Avoid the loop variable $i and the comparison with 0 in every loop
- Wrap the execution into a try..finally block to always close the files in use
- Make the solution work for an arbitrary number of lines to remove from the beginning of the file
- Use a variable $p to reference the current directory
These changes lead to the following code:
$p = (Get-Location).Path
(Measure-Command {
    # Number of lines to skip
    $skip = 1
    $ins = New-Object System.IO.StreamReader ($p + "\test.log")
    $outs = New-Object System.IO.StreamWriter ($p + "\test-1.log")
    try {
        # Skip the first N lines, but allow for fewer than N, as well
        for( $s = 1; $s -le $skip -and !$ins.EndOfStream; $s++ ) {
            $ins.ReadLine()
        }
        while( !$ins.EndOfStream ) {
            $outs.WriteLine( $ins.ReadLine() )
        }
    }
    finally {
        $outs.Close()
        $ins.Close()
    }
}).TotalSeconds
The first change brought the processing time for my 60 MB file down from 5.3 s to 4 s. The rest of the changes are more cosmetic.
$x = get-content $file
$x[1..$x.count] | set-content $file
Just that much. Long boring explanation follows. Get-content returns an array. We can "index into" array variables, as demonstrated in this and other Scripting Guys posts.
For example, if we define an array variable like this,
$array = @("first item","second item","third item")
so $array returns
first item
second item
third item
then we can "index into" that array to retrieve only its 1st element
$array[0]
or only its 2nd
$array[1]
or a range of index values from the 2nd through the last.
$array[1..$array.count]
I just learned from a website:
Get-ChildItem *.txt | ForEach-Object { (get-Content $_) | Where-Object {(1) -notcontains $_.ReadCount } | Set-Content -path $_ }
Or you can use the aliases to make it short, like:
gci *.txt | % { (gc $_) | ? { (1) -notcontains $_.ReadCount } | sc -path $_ }
Another approach to remove the first line from a file, using the multiple assignment technique:
$firstLine, $restOfDocument = Get-Content -Path $filename
$modifiedContent = $restOfDocument
$modifiedContent | Out-String | Set-Content $filename
-Skip didn't work for me, so my workaround is:
$LinesCount = $(get-content $file).Count
get-content $file |
    select -Last $($LinesCount-1) |
    set-content "$file-temp"
move "$file-temp" $file -Force
Following on from Michael Soren's answer.
If you want to edit all .txt files in the current directory and remove the first line from each.
Get-ChildItem (Get-Location).Path -Filter *.txt |
    Foreach-Object {
        (Get-Content $_.FullName | Select-Object -Skip 1) | Set-Content $_.FullName
    }
For smaller files you could use this:
& C:\windows\system32\more +1 oldfile.csv > newfile.csv | out-null
... but it's not very effective at processing my example file of 16MB. It doesn't seem to terminate and release the lock on newfile.csv.