I am trying to append binary AFP files into one file. With my code below, the same file gets written three times instead of the three files being appended into one. Why would the value of $bytes not change? (Get-Content didn't work: it produced a broken AFP file without raising any errors, so I am reading the bytes directly.)
$dira = "D:\User1\Desktop\AFPTest\"
$list = get-childitem $dira -filter *.afp -recurse | % { $_.FullName } | Sort-Object
foreach($afpFile in $list){
    $bytes = [System.IO.File]::ReadAllBytes($afpFile)
    [io.file]::WriteAllBytes("D:\User1\Desktop\AFPTest\Content.afp",$bytes)
}
The script below is after I made a change to store the $bytes in a $data variable and then write out $data.
$dira = "D:\User1\Desktop\AFPTest\"
$list = get-childitem $dira -filter *.afp -recurse | % { $_.FullName } | Sort-Object -descending
foreach($afpFile in $list){
    Write-Host $afpFile
    $bytes = [System.IO.File]::ReadAllBytes($afpFile)
    $data += $bytes
}
[io.file]::WriteAllBytes("D:\User1\Desktop\AFPTest\Content.afp",$bytes)
I attempted to combine them manually by reading each of the three files into a variable and then adding them to the $data array, but the same issue of the repeated image happens. The code is below.
$dira = "D:\User1\Desktop\AFPTest\"
$list = get-childitem $dira -filter *.afp -recurse | % { $_.FullName } | Sort-Object
$file3 = [System.IO.File]::ReadAllBytes("D:\User1\Desktop\AFPTest\000001.afp")
$file2 = [System.IO.File]::ReadAllBytes("D:\User1\Desktop\AFPTest\000002.afp")
$file1 = [System.IO.File]::ReadAllBytes("D:\User1\Desktop\AFPTest\000003.afp")
$data = $file1 + $file2
[io.file]::WriteAllBytes("D:\User1\Desktop\AFPTest\AFP.afp",$data)
WriteAllBytes() always creates a new file; you want to append. Try this:
...
$bytes = @()
foreach($afpFile in $list) {
    $bytes += [System.IO.File]::ReadAllBytes($afpFile)
}
[io.file]::WriteAllBytes("D:\User1\Desktop\AFPTest\Content.afp",$bytes)
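For larger batches, note that $bytes += copies the whole array on every iteration. A leaner variant (a sketch using the same paths; remove any previous Content.afp first so the *.afp filter doesn't pick it up) streams each file into a single FileStream:

$out = [System.IO.File]::Open("D:\User1\Desktop\AFPTest\Content.afp", [System.IO.FileMode]::Create)
try {
    foreach ($afpFile in $list) {
        $bytes = [System.IO.File]::ReadAllBytes($afpFile)
        $out.Write($bytes, 0, $bytes.Length) # append this file's bytes to the open stream
    }
}
finally {
    $out.Close() # always release the file handle
}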
I'm getting a memory exception while running this code. Is there a way to filter one file at a time, write the output, and append after processing each file? It seems the code below loads everything into memory.
$inputFolder = "C:\Change\2019\October"
$outputFile = "C:\Change\2019\output.csv"
Get-ChildItem $inputFolder -File -Filter '*.csv' |
    ForEach-Object { Import-Csv $_.FullName } |
    Where-Object { $_.machine_type -eq 'workstations' } |
    Export-Csv $outputFile -NoType
Maybe you can export and filter your files one by one and append the results to your output file, like this:
$inputFolder = "C:\Change\2019\October"
$outputFile = "C:\Change\2019\output.csv"
Remove-Item $outputFile -Force -ErrorAction SilentlyContinue
Get-ChildItem $inputFolder -Filter "*.csv" -file | %{
    import-csv $_.FullName |
        where machine_type -eq 'workstations' |
        export-csv $outputFile -Append -notype
}
Note: The reason for not using Get-ChildItem ... | Import-Csv ... - i.e., for not piping Get-ChildItem directly to Import-Csv and instead calling Import-Csv from the script block ({ ... }) of an auxiliary ForEach-Object call - is a bug in Windows PowerShell that has since been fixed in PowerShell Core; see the bottom section for a more concise workaround.
However, even output from ForEach-Object script blocks should stream to the remaining pipeline commands, so you shouldn't run out of memory - after all, a salient feature of the PowerShell pipeline is object-by-object processing, which keeps memory use constant, irrespective of the size of the (streaming) input collection.
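A quick illustration of that streaming behavior (my own sketch, not from the original thread; Select-Object's early-stop behavior requires PowerShell 3.0+): the command below returns almost immediately even on a huge directory tree, because items flow through one at a time and -First 3 stops the upstream cmdlet after the third object:

# returns as soon as 3 items have streamed through - no full recursion needed
Get-ChildItem C:\Windows -Recurse -ErrorAction SilentlyContinue | Select-Object -First 3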
You've since confirmed that avoiding the aux. ForEach-Object call does not solve the problem, so we still don't know what causes your out-of-memory exception.
Update:
This GitHub issue contains clues as to the reason for excessive memory use, especially with many properties that contain small amounts of data.
This GitHub feature request proposes using strongly typed output objects to help the issue.
The following workaround, which uses the switch statement to process the files as text files, may help:
$header = ''
Get-ChildItem $inputFolder -Filter *.csv | ForEach-Object {
    $i = 0
    switch -Wildcard -File $_.FullName {
        '*workstations*' {
            # NOTE: If no other columns contain the word `workstations`, you can
            #       simplify and speed up the command by omitting the `ConvertFrom-Csv` call
            #       (you can make the wildcard matching more robust with something
            #       like '*,workstations,*')
            if ((ConvertFrom-Csv "$header`n$_").machine_type -ne 'workstations') { continue }
            $_ # row whose 'machine_type' column value equals 'workstations'
        }
        default {
            if ($i++ -eq 0) {
                if ($header) { continue }   # header already written
                else { $header = $_; $_ }   # header row of 1st file
            }
        }
    }
} | Set-Content $outputFile
Here's a workaround for the bug of not being able to pipe Get-ChildItem output directly to Import-Csv, by passing it as an argument instead:
Import-Csv -LiteralPath (Get-ChildItem $inputFolder -File -Filter *.csv) |
    Where-Object { $_.machine_type -eq 'workstations' } |
    Export-Csv $outputFile -NoType
Note that in PowerShell Core you could more naturally write:
Get-ChildItem $inputFolder -File -Filter *.csv | Import-Csv |
    Where-Object { $_.machine_type -eq 'workstations' } |
    Export-Csv $outputFile -NoType
Solution 2:
$inputFolder = "C:\Change\2019\October"
$outputFile = "C:\Change\2019\output.csv"
$encoding = [System.Text.Encoding]::UTF8 # modify encoding if necessary
$Delimiter = ','

# find the header for your files => take the first row of the first file that has data
$Header = Get-ChildItem -Path $inputFolder -Filter *.csv | Where length -gt 0 | select -First 1 | Get-Content -TotalCount 1

# if no header was found, then there is no file with size > 0 => we quit
if (!$Header) { return }

# create an array for the header
$HeaderArray = $Header -split $Delimiter -replace '"', ''

# open the output file
$w = New-Object System.IO.StreamWriter($outputfile, $true, $encoding)

# write the header we found
$w.WriteLine($Header)

# loop over the csv files
Get-ChildItem $inputFolder -File -Filter "*.csv" | %{
    # open the file for reading
    $r = New-Object System.IO.StreamReader($_.fullname, $encoding)
    $skiprow = $true
    # compare to $null so a blank line mid-file doesn't end the loop
    while ($null -ne ($line = $r.ReadLine()))
    {
        # skip the header row of each file
        if ($skiprow)
        {
            $skiprow = $false
            continue
        }
        # build an object for the current row using the header we found
        $Object = $line | ConvertFrom-Csv -Header $HeaderArray -Delimiter $Delimiter
        # write the row to the output file if it matches the requested condition
        if ($Object.machine_type -eq 'workstations') { $w.WriteLine($line) }
    }
    $r.Close()
    $r.Dispose()
}
$w.close()
$w.Dispose()
You have to read and write to the .csv files one row at a time, using StreamReader and StreamWriter:
$filepath = "C:\Change\2019\October"
$outputfile = "C:\Change\2019\output.csv"
$encoding = [System.Text.Encoding]::UTF8
$files = Get-ChildItem -Path $filePath -Filter *.csv |
Where-Object { $_.machine_type -eq 'workstations' }
$w = New-Object System.IO.StreamWriter($outputfile, $true, $encoding)
$skiprow = $false
foreach ($file in $files)
{
$r = New-Object System.IO.StreamReader($file.fullname, $encoding)
while (($line = $r.ReadLine()) -ne $null)
{
if (!$skiprow)
{
$w.WriteLine($line)
}
$skiprow = $false
}
$r.Close()
$r.Dispose()
$skiprow = $true
}
$w.close()
$w.Dispose()
get-content *.csv | add-content combined.csv
Make sure combined.csv doesn't exist when you run this, or it's going to go full Ouroboros.
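If you'd rather not rely on remembering that, a small guard (a sketch, same filenames assumed) clears any previous output first:

# delete a stale combined.csv so Get-Content *.csv doesn't read it back in
Remove-Item combined.csv -ErrorAction SilentlyContinue
Get-Content *.csv | Add-Content combined.csv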
This code displays the ImageName and FolderName properly, but the Dimension remains blank, and the data which is displayed properly is not getting saved to a csv file.
Also, the If condition based on the dimension is not working.
#-----------powershell script------------
foreach ($folder in $Folders) {
    $Images = Get-ChildItem -Path $Folder -Filter *.png
    $Results = @()
    foreach ($image in $Images) {
        $dimensions = $image.Dimensions
        # $dimensions = "$($image.Width) x $($image.Height)"
        If ($dimensions -ne '1000 x 1000') {
            $Results += [pscustomobject]@{
                ImageName = $image
                FolderName = $Folder
                Dimension = $dimensions
            }
        }
    }
    $Results | FT -auto
    # $ExcelData = Format-Table -property @{n="$image";e='$Folder'}
    $Results | Export-csv "C:\Users\M1036098\Documents\Imagelessthan1000pi.txt" -NoTypeInformation
}
Dimension property in output remains blank
So let's go over why this doesn't work. Get-ChildItem brings back System.IO.FileInfo objects, and looking at System.IO.FileInfo in Microsoft's documentation we can see there is no method or property named Dimensions.
So let's get those dimensions...
First we are going to load the image into memory and get the size.
# make sure System.Drawing is available before using Bitmap
Add-Type -AssemblyName System.Drawing

$Folders = @("C:\Test")
$Folders | %{
    $Folder = $_
    Get-ChildItem -Path $_ -Filter *.png | %{
        try{
            $Image = New-Object System.Drawing.Bitmap "$($_.FullName)"
            [pscustomobject]@{
                ImageName  = $_.Name
                FolderName = $Folder
                Dimension  = "$($Image.Height) x $($Image.Width)"
            }
            $Image.Dispose() # release the file handle
        }catch{
        }
    } | ?{
        $_.Dimension -ne "1000 x 1000"
    }
}
Output looks like
ImageName FolderName Dimension
--------- ---------- ---------
Test1.png C:\Test    1440 x 2560
Test2.png C:\Test    1200 x 1200
Edit: Adding a function for Sonam, based on an answer that was posted.
function Get-ImageDimension([string]$Path, [array]$ImageExtensions, [array]$ExcludeDimensions){
    Add-Type -AssemblyName System.Drawing
    Get-ChildItem -Path $Path -Include $ImageExtensions -Recurse | ForEach-Object{
        try{
            $image = [Drawing.Image]::FromFile($_)
            [pscustomobject]@{
                ImageName  = $_.Name
                FolderName = $_.DirectoryName
                Dimension  = "$($image.Width) x $($image.Height)"
            }
            $image.Dispose() # release the file handle
        }catch{
        }
    } | ?{
        $ExcludeDimensions -notcontains $_.Dimension
    }
}
Get-ImageDimension C:\Test\ -ImageExtensions *.png -ExcludeDimensions "1000 x 1000" | export-csv C:\Test\Test.csv
Thanks for the solution. I rewrote the code using System.Drawing.
Code Snippet:
try{
    $FolderBase = "E:\D\Demo\Demo\"
    $Folders = Get-ChildItem "$FolderBase*"
    foreach ($folder in $Folders){
        $Images = Get-ChildItem -Path $Folder -Filter *.png
        $Results = @()
        #$image = (Get-ChildItem -Path $Folder *.png)
        $Images | ForEach-Object {
            $img = [Drawing.Image]::FromFile($_)
            $dimensions = "$($img.Width) x $($img.Height)"
            If ($dimensions -ne "1000 x 1000") {
                $Results += [pscustomobject]@{
                    ImageName = $img
                    FolderName = $Folder
                    Dimension = $dimensions
                }
            }
        }
    }
    $Results | FT -auto
    # $ExcelData = Format-Table -property @{n="$image";e='$Folder'}
    $Results | Export-csv "C:\Users\****\Documents\Imagelessthan1000pi.txt" -NoTypeInformation
}catch{
}
Now I get the desired results, but:
I am not able to write the data to a file (Excel preferred, csv, or text). I tried Out-File and Export-Csv.
One of the folders contains a large number of big images (700 kB max), and while executing the loop on that particular folder a memory exception is thrown and it does not loop through all the images.
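Two hedged suggestions based on the snippet above: dispose each image as soon as its dimensions are read (GDI+ image objects hold unmanaged memory and file handles until disposed, which would explain the memory exception), and collect all results before a single Export-Csv outside the loop (in the snippet, $Results is reset per folder and exported only after the loop ends). A sketch combining both; the recursive listing and the output filename are my assumptions:

Add-Type -AssemblyName System.Drawing

$Results = foreach ($file in Get-ChildItem "E:\D\Demo\Demo" -Filter *.png -Recurse) {
    $img = [System.Drawing.Image]::FromFile($file.FullName)
    try {
        $dimensions = "$($img.Width) x $($img.Height)"
        if ($dimensions -ne "1000 x 1000") {
            [pscustomobject]@{
                ImageName  = $file.Name
                FolderName = $file.DirectoryName
                Dimension  = $dimensions
            }
        }
    }
    finally {
        $img.Dispose() # release the image so memory doesn't accumulate across iterations
    }
}
$Results | Export-Csv "Imagelessthan1000pi.csv" -NoTypeInformation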
I have 6 files which are created dynamically (so I don't know their contents). I need to compare these 6 files (strictly speaking, compare one file with the 5 others) and see which entries in file 1 match the other 5. The matching entries should be saved; the others need to be deleted.
I coded something like the below, but it is deleting everything (the matching entries too).
$lines = Get-Content "C:\snaps.txt"
$check1 = Get-Content "C:\Previous_day_latest.txt"
$check2 = Get-Content "C:\this_week_saved_snaps.txt"
$check3 = Get-Content "C:\all_week_latest_snapshots.txt"
$check4 = Get-Content "C:\each_month_latest.txt"
$check5 = Get-Content "C:\exclusions.txt"
foreach($l in $lines)
{
    if(($l -notmatch $check1) -and ($l -notmatch $check2) -and ($l -notmatch $check3) -and ($l -notmatch $check4))
    {
        Remove-Item -Path "C:\$l.txt"
    }
    else
    {
        #nothing
    }
}
foreach($ch in $check5)
{
    Remove-Item -Path "C:\$ch.txt"
}
Contents of 6 files will be as shown below:
$lines
testinstance-01-07-15-08-00
testinstance-10-07-15-23-00
testinstance-13-02-15-13-00
testinstance-15-06-15-23-00
testinstance-19-01-15-23-00
testinstance-23-05-15-20-00
testinstance-27-03-15-23-00
testinstance-28-02-15-23-00
testinstance-29-07-15-08-00
testinstance-30-04-15-23-00
testinstance-30-06-15-23-00
testinstance-31-01-15-23-00
testinstance-31-12-14-23-00
$check1
testinstance-29-07-15-08-00
$check2
testinstance-23-05-15-20-00
testinstance-27-03-15-23-00
$check3
testinstance-01-07-15-23-00
testinstance-13-02-15-13-00
testinstance-19-01-15-23-00
$check4
testinstance-28-02-15-23-00
testinstance-30-04-15-23-00
testinstance-30-06-15-23-00
testinstance-31-01-15-23-00
$check5
testinstance-31-12-14-23-00
I've read about Compare-Object, but I'm not sure how it can be applied in my case, since the contents of all 5 files will be different and all of those contents should be saved from deletion. Can someone please guide me to achieve what I described? Any help would be really appreciated.
I would create an array of the files to check, so you can simply add new files without modifying other parts of your script.
I use the where cmdlet to keep only the lines that are also in the reference file (via the -in operator), and finally overwrite the file:
$referenceFile = 'C:\snaps.txt'
$compareFiles = @(
    'C:\Previous_day_latest.txt',
    'C:\this_week_saved_snaps.txt',
    'C:\all_week_latest_snapshots.txt',
    'C:\each_month_latest.txt',
    'C:\exclusions.txt'
)
# get the content of the reference file
$referenceContent = (gc $referenceFile)
foreach ($file in $compareFiles)
{
    # get the content of the file to check
    $content = (gc $file)
    # keep only the lines that are also in the reference file and save the result
    $content | where { $_ -in $referenceContent } | sc $file
}
You can use the -contains operator to compare array contents. If you read all the files you want to check and store their lines in a single array, you can compare that with the reference file:
$lines = Get-Content "C:\snaps.txt"
$check1 = "C:\Previous_day_latest.txt"
$check2 = "C:\this_week_saved_snaps.txt"
$check3 = "C:\all_week_latest_snapshots.txt"
$check4 = "C:\each_month_latest.txt"
$check5 = "C:\exclusions.txt"

$checklines = @()
(1..5) | ForEach-Object {
    $comp = Get-Content $(Get-Variable check$_).Value
    $checklines += $comp
}
# $matches is an automatic variable populated by -match, so use a different name
$matchedLines = $lines | ? { $checklines -contains $_ }
If you switch the -contains to -notcontains you'll see the three lines that don't match
The other answers here are great, but I wanted to show that Compare-Object could still work; you just need to use it in a loop. To show something else as well, I included a simple use of Join-Path for building the array of checks. Basically we save some typing for when you move your files to a production area: update one path instead of many.
$rootPath = "C:\"
$fileNames = "Previous_day_latest.txt", "this_week_saved_snaps.txt", "all_week_latest_snapshots.txt", "each_month_latest.txt", "exclusions.txt"
$lines = Get-Content (Join-path $rootPath "snaps.txt")
$checks = $fileNames | ForEach-Object{Join-Path $rootPath $_}
ForEach($check in $checks){
    Compare-Object -ReferenceObject $lines -DifferenceObject (Get-Content $check) -IncludeEqual |
        Where-Object{ $_.SideIndicator -eq "==" } |
        Select-Object -ExpandProperty InputObject |
        Set-Content $check
}
So we take each file path and use Compare-Object in a loop comparing each to the $lines array. Using -IncludeEqual we find the lines that both files share and write those back to the file.
Depending on how many checks you have and where they are, it might be easier to build the $checks array with this line:
$checks = Get-ChildItem "C:\" -Filter "*.txt" | Select-Object -Expand FullName
So far I have a hash table with 2 values in it. Right now the code below exports all the unique lines and gives me a count of how many times each line was referenced across hundreds of xml files. This is one part.
I now need to find out which subfolder holds the xml file containing each unique line referenced in the hash table. Is this possible?
$ht = @{}
Get-ChildItem -recurse -Filter *.xml | Get-Content | %{$ht[$_] = $ht[$_]+1}
$ht
# To export to CSV:
$ht.GetEnumerator() | select key, value | Export-Csv D:\output.csv
To get the file path into your output, you need to capture it in a variable in the first pipeline element.
Is this something similar to what you need?
$ht = @{}
Get-ChildItem -recurse -Filter *.xml | %{ $path = $_.FullName; Get-Content $path } | % { $ht[$_] = $ht[$_] + $path + ";" }
The code above will return a hash table in "config line" = "list of paths" format.
EDIT:
If you need to return three elements (unique line, count, and an array of paths where it was found), it gets more complicated. Here is code that returns an array of PSObjects, each containing the info for one unique line across the XML files.
$ht = @()
$files = Get-ChildItem -recurse -Filter *.xml
foreach ($file in $files) {
    $path = $file.FullName
    $lines = Get-Content $path
    foreach ($line in $lines) {
        if ($match = $ht | where {$_.Line -EQ $line}) {
            $match.Count = $match.Count + 1
            $match.Paths += $path
        } else {
            $ht += new-object PSObject -Property @{
                Count = 1
                Paths = @(,$path)
                Line  = $line }
        }
    }
}
$ht
I'm sure it can be shortened and optimized, but hopefully it is enough to get you started.
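For comparison, a shorter route to the same three elements (my own sketch, untested against your data; the [pscustomobject] and member-enumeration syntax need PowerShell 3.0+) lets Group-Object do the counting and path collection:

Get-ChildItem -Recurse -Filter *.xml | ForEach-Object {
    $path = $_.FullName
    # emit one object per line, tagged with the file it came from
    Get-Content $path | ForEach-Object { [pscustomobject]@{ Line = $_; Path = $path } }
} | Group-Object Line | ForEach-Object {
    [pscustomobject]@{
        Line  = $_.Name
        Count = $_.Count
        Paths = ($_.Group.Path | Sort-Object -Unique) -join ';'
    }
} | Export-Csv D:\output.csv -NoTypeInformation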
I am new to PowerShell and am struggling a bit. I have obtained an example of the sort of function I want and have adapted it partially. What I want is for it to loop through each subdirectory of C:\Test\ and combine just the PDFs in each subdirectory (leaving the resulting combined PDF in that subdirectory).
At the moment I can get it to comb through the subdirectories, but it then combines the contents of all subdirectories into one giant PDF in the top-level directory, which is not what I want. I feel like maybe I need to use an array of sorts, but I don't know PowerShell well enough yet.
BTW this uses PDFsharp, a .NET library.
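One note up front: for any of these snippets to run, the PDFsharp assembly has to be loaded into the session first; a minimal sketch (the DLL path is a placeholder for wherever your copy lives):

# load PDFsharp so the PdfSharp.Pdf.* types below resolve
Add-Type -Path 'C:\path\to\PdfSharp.dll'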
Function PDFCombine {
    $filepath = 'C:\Test\'
    $filename = '.\Combined' #<--- ???
    $output = New-Object PdfSharp.Pdf.PdfDocument
    $PdfReader = [PdfSharp.Pdf.IO.PdfReader]
    $PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
    foreach($i in (gci $filepath *.pdf -Recurse)) {
        $input = New-Object PdfSharp.Pdf.PdfDocument
        $input = $PdfReader::Open($i.fullname, $PdfDocumentOpenMode::Import)
        $input.Pages | %{ $output.AddPage($_) }
    }
    $output.Save($filename)
}
Your question was unclear about how many levels down you need to go. You can try this (untested). It goes one level down from $filepath, gets all pdf files in that folder and its subfolders, and combines them into Subfoldername-Combined.pdf:
Function PDFCombine {
    $filepath = 'C:\Test\'
    $PdfReader = [PdfSharp.Pdf.IO.PdfReader]
    $PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
    #Foreach subfolder (FIRST LEVEL ONLY!)
    Get-ChildItem $filepath | Where-Object { $_.PSIsContainer } | ForEach-Object {
        #Create a new output pdf-file
        $output = New-Object PdfSharp.Pdf.PdfDocument
        $outfilepath = Join-Path $_.FullName "$($_.Name)-Combined.pdf"
        #Find and add pdf files in this folder and its subfolders
        Get-ChildItem -Path $_.FullName -Filter *.pdf -Recurse | ForEach-Object {
            #$input = New-Object PdfSharp.Pdf.PdfDocument #Don't think this one's necessary
            $input = $PdfReader::Open($_.FullName, $PdfDocumentOpenMode::Import)
            $input.Pages | %{ $output.AddPage($_) }
        }
        #Save
        $output.Save($outfilepath)
    }
}
So you should get this:
c:\Test\Folder1\Folder1-Combined.pdf #should include all pages in Folder1 and ANY subfolders below)
c:\Test\Folder2\Folder2-Combined.pdf #should include all pages in Folder2 and ANY subfolders below)
#etc.
If you need it to create a combined pdf for every subfolder (not only the first level), then you could try this (untested):
Function PDFCombine {
    $filepath = 'C:\Test\'
    $PdfReader = [PdfSharp.Pdf.IO.PdfReader]
    $PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
    #Foreach subfolder that contains pdf files
    Get-ChildItem -Path $filepath -Filter *.pdf -Recurse | Group-Object DirectoryName | ForEach-Object {
        #Create a new output pdf-file
        $output = New-Object PdfSharp.Pdf.PdfDocument
        $outfilepath = Join-Path $_.Name "Combined.pdf"
        #Add the pdf files found in the current folder
        $_.Group | ForEach-Object {
            #$input = New-Object PdfSharp.Pdf.PdfDocument #I don't think you need this
            $input = $PdfReader::Open($_.FullName, $PdfDocumentOpenMode::Import)
            $input.Pages | %{ $output.AddPage($_) }
        }
        #Save
        $output.Save($outfilepath)
        #Remove output-object
        Remove-Variable output
    }
}
not tested ...
Function PDFCombine {
    $filepath = 'C:\Test\'
    $filename = 'Combined.pdf'
    $PdfReader = [PdfSharp.Pdf.IO.PdfReader]
    $PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
    $output = New-Object PdfSharp.Pdf.PdfDocument
    $lastdir = ''
    #sort by directory so each folder's files are processed together
    foreach($i in (gci $filepath *.pdf -Recurse | Sort-Object DirectoryName)) {
        #when the directory changes, save the collected pages and start a new document
        if ($lastdir -and $lastdir -ne $i.DirectoryName){
            $output.Save("$lastdir\$filename")
            $output = New-Object PdfSharp.Pdf.PdfDocument
        }
        $lastdir = $i.DirectoryName
        $input = $PdfReader::Open($i.FullName, $PdfDocumentOpenMode::Import)
        $input.Pages | %{ $output.AddPage($_) }
    }
    #save the last folder's combined document
    if ($lastdir) { $output.Save("$lastdir\$filename") }
}