PowerShell reading and writing compressed files with byte arrays - powershell

Final update: it turns out I didn't need BinaryWriter. I could just copy memory streams from one archive to another.
I'm re-writing a PowerShell script which works with archives. I'm using two functions from here
Expand-Archive without Importing and Exporting files
and can successfully read and write files to the archive. I've posted the whole program just in case it makes things clearer for someone to help me.
However, there are three issues (besides the fact that I don't really know what I'm doing).
1.) Most files produce this error when trying to run
Add-ZipEntry -ZipFilePath ($OriginalArchivePath + $PartFileDirectoryName) -EntryPath $entry.FullName -Content $fileBytes}
Cannot convert value "507" to type "System.Byte". Error: "Value was either too large or too small for an unsigned byte." (replace 507 with whatever number from the byte array is there)
2.) When it reads a file and adds it to the zip archive (*.imscc) it adds a character "a" to the beginning of the file contents.
3.) The only files it doesn't error on are text files, when I really want it to handle any file
Thank you for any assistance!
Update: I've tried using System.IO.BinaryWriter, with the same errors.
Add-Type -AssemblyName 'System.Windows.Forms'
Add-Type -AssemblyName 'System.IO.Compression'
Add-Type -AssemblyName 'System.IO.Compression.FileSystem'
function Folder-SuffixGenerator($SplitFileCounter)
{
return ' ('+$usrSuffix+' '+$SplitFileCounter+')'
}
function Get-ZipEntryContent(#returns the bytes of the first matching entry
[string] $ZipFilePath, #optional - specify a ZipStream or path
[IO.Stream] $ZipStream = (New-Object IO.FileStream($ZipFilePath, [IO.FileMode]::Open)),
[string] $EntryPath){
$ZipArchive = New-Object IO.Compression.ZipArchive($ZipStream, [IO.Compression.ZipArchiveMode]::Read)
$buf = New-Object byte[] (0) #return an empty byte array if not found
$ZipArchive.GetEntry($EntryPath) | ?{$_} | %{ #GetEntry returns first matching entry or null if there is no match
$buf = New-Object byte[] ($_.Length)
Write-Verbose " reading: $($_.Name)"
$_.Open().Read($buf,0,$buf.Length)
}
$ZipArchive.Dispose()
$ZipStream.Close()
$ZipStream.Dispose()
return ,$buf
}
function Add-ZipEntry(#Adds an entry to the $ZipStream. Sample call: Add-ZipEntry -ZipFilePath "$PSScriptRoot\temp.zip" -EntryPath Test.xml -Content ([text.encoding]::UTF8.GetBytes("Testing"))
[string] $ZipFilePath, #optional - specify a ZipStream or path
[IO.Stream] $ZipStream = (New-Object IO.FileStream($ZipFilePath, [IO.FileMode]::OpenOrCreate)),
[string] $EntryPath,
[byte[]] $Content,
[switch] $OverWrite, #if specified, will not create a second copy of an existing entry
[switch] $PassThru ){#return a copy of $ZipStream
$ZipArchive = New-Object IO.Compression.ZipArchive($ZipStream, [IO.Compression.ZipArchiveMode]::Update, $true)
$ExistingEntry = $ZipArchive.GetEntry($EntryPath) | ?{$_}
If($OverWrite -and $ExistingEntry){
Write-Verbose " deleting existing $($ExistingEntry.FullName)"
$ExistingEntry.Delete()
}
$Entry = $ZipArchive.CreateEntry($EntryPath)
$WriteStream = New-Object System.IO.StreamWriter($Entry.Open())
$WriteStream.Write($Content,0,$Content.Length)
$WriteStream.Flush()
$WriteStream.Dispose()
$ZipArchive.Dispose()
If($PassThru){
$OutStream = New-Object System.IO.MemoryStream
$ZipStream.Seek(0, 'Begin') | Out-Null
$ZipStream.CopyTo($OutStream)
}
$ZipStream.Close()
$ZipStream.Dispose()
If($PassThru){$OutStream}
}
$NoDeleteFiles = @('files_meta.xml', 'course_settings.xml', 'assignment_groups.xml', 'canvas_export.txt', 'imsmanifest.xml')
Set-Variable usrSuffix -Option ReadOnly -Value 'part' -Force
$MaxImportFileSize = 1000
$compressionLevel = [System.IO.Compression.CompressionLevel]::Optimal
$SplitFileCounter = 1
$FileBrowser = New-Object System.Windows.Forms.OpenFileDialog
$FileBrowser.filter = "Canvas Export Files (*.imscc)| *.imscc"
[void]$FileBrowser.ShowDialog()
$FileBrowser.FileName
$FilePath = $FileBrowser.FileName
$OriginalArchivePath = $FilePath.Substring(0,$FilePath.Length-6)
$PartFileDirectoryName = $OriginalArchive + (Folder-SuffixGenerator($SplitFileCounter)) + '.imscc'
$CourseZip = [IO.Compression.ZipFile]::OpenRead($FilePath)
$CourseZipFiles = $CourseZip.Entries | Sort Length -Descending
$CourseZip.Dispose()
<#
$SortingTable = $CourseZip.entries | Select Fullname,
@{Name="Size";Expression={$_.length}},
@{Name="CompressedSize";Expression={$_.Compressedlength}},
@{Name="PctZip";Expression={[math]::Round(($_.compressedlength/$_.length)*100,2)}}|
Sort Size -Descending | Format-Table -AutoSize
#>
# Add mandatory files
ForEach($entry in $CourseZipFiles)
{
if ($NoDeleteFiles.Contains($entry.Name)){
Write-Output "Adding to Zip" + $entry.FullName
# Add to Zip
$fileBytes = Get-ZipEntryContent -ZipFilePath $FilePath -EntryPath $entry.FullName
Add-ZipEntry -ZipFilePath ($OriginalArchivePath + $PartFileDirectoryName) -EntryPath $entry.FullName -Content $fileBytes
}
}

System.IO.StreamWriter is a text writer, and therefore not suitable for writing raw bytes. Cannot convert value "507" to type "System.Byte" indicates that an inappropriate attempt was made to convert text - a .NET string composed of [char] instances which are in effect [uint16] code points (range 0x0 - 0xffff) - to [byte] instances (0x0 - 0xff). Therefore, any Unicode character whose code point is greater than 255 (0xff) will cause this error.
The solution is to use a .NET API that allows writing raw bytes, namely System.IO.BinaryWriter:
$WriteStream = [System.IO.BinaryWriter]::new($Entry.Open())
$WriteStream.Write($Content)
$WriteStream.Flush()
$WriteStream.Dispose()
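If the end goal is simply to move entries from one archive to another (as the final update at the top suggests), the bytes never need to pass through a text writer at all; the entry streams can be copied directly. A rough sketch, assuming the two .imscc paths below are placeholders for your own files:
$srcZip = [IO.Compression.ZipFile]::Open('C:\temp\source.imscc', 'Read')
$dstZip = [IO.Compression.ZipFile]::Open('C:\temp\dest.imscc', 'Update')   # opened, or created if missing
foreach ($entry in $srcZip.Entries) {
    $newEntry = $dstZip.CreateEntry($entry.FullName)
    $src = $entry.Open()      # decompressed read stream for the source entry
    $dst = $newEntry.Open()   # write stream into the new archive
    $src.CopyTo($dst)         # raw byte copy - no text encoding involved
    $dst.Dispose(); $src.Dispose()
}
$dstZip.Dispose(); $srcZip.Dispose()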

Related

powershell -match defaulting to garbage

I have a heavily used method that reads a file and returns the contents between two given markers. It works for every other file (about 15 of them), but for the one I just added, it's returning garbage text that isn't even in the file being read. I've tried reusing the filename being fed to the method, as well as using a different filename. I've tried using different from/to strings in the area of the file that returns the garbage.
This is the method called:
#Function to look for method content with to parse and return contents in method as string
#Note that $followingMethodName is where we parse to, as an end string. $methodNameToReturn is where we start getting data to return.
Function Get-MethodContents{
[cmdletbinding()]
Param ( [string]$codePath, [string]$methodNameToReturn, [string]$followingMethodName)
Process
{
$contents = ""
Write-Host "In GetMethodContents method File:$codePath method:$methodNameToReturn followingMethod:$followingMethodName" -ForegroundColor Green
$contents = Get-Content $codePath -Raw #raw gives content as single string instead of a list of strings
$null = $contents -match "($methodNameToReturn[\s\S]*)$followingMethodName" ###############?? wrong for just the last one added
Write-Host "File Contents Found: $($Matches.Item(1))" -ForegroundColor DarkYellow
Write-Host "File Contents Found: $($Matches.Item(0))" -ForegroundColor Cyan
Write-Host "File Contents Found: $($Matches[0])" -ForegroundColor Cyan
Write-Host "File Contents Found: $($Matches[1])" -ForegroundColor Cyan
return $Matches.Item(1)
}#End of Process
}#End of Function
This is the calling code. The Get-MethodContents call for FileHandler2 is defaulting to $currentVersion (6000) when it returns, which isn't even in the file being provided.
elseif($currentVersion -Match '^(6000)') #6000
{
$HopResultMap = [ordered]@{}
$HopResultMap2 = [ordered]@{}
#call method to return basePathFull cppFile method contents
$matchFound = Get-MethodContents -codePath $File[0] -methodNameToReturn "Build the HOP error map" -followingMethodName "CHop2Windows::getHOPError" #correct match
#call method to get what is like case info but is map in 6000 case....it's 2 files so 2 maps for 6000
$HopResultMap = (Get-Contents60 -fileContent $matchFound) #error map of ex: seJam to HOP_JAM
$FileHandler = Join-Path -Path $basePathFull -ChildPath "Hop2Windows\XXHandler.cpp"
$matchFound2 = Get-MethodContents -codePath $FileHandler -methodNameToReturn "XXHandler::populateVec" -followingMethodName "m_Warnings" #matches correctly
$HopResultMap2 = (Get-Contents60_b -fileContent $matchFound2) #used in foreach
#sdkErr uses Handler file too but it Get-methodContents is returning 6000 so try diff filename
$FileHandler2 = Join-Path -Path $basePathFull -ChildPath "Hop2Windows\XXHandler.cpp"
$matchFound3 = Get-MethodContents -codePath $FileHandler2 -methodNameToReturn "feed from No. 0" -followingMethodName "class CHop2Windows;" # returns 6000-wrong######################??
$HopResultMap3 = (Get-Contents60_c -fileContent $matchFound3) #used in foreach
#next need to put these 2 maps together
#need to test I got matches correct in 6000 map still################
#combine the maps so can reuse 7000's case/design. $hopResultMap key is the common tie with $hopResultMap2 and thrown away
$resultCase = foreach ($key in $HopResultMap.Keys){
[PSCustomObject][ordered]@{
sdkErr = "HopResultMap3[$key]" # 0x04
sdkDesc = "HopResultMap2[$key]" # Fatal Error
sdkOutErr = "$($HopResultMap[$key])"
}
}
} #else
This is with powershell 5.1 and VSCode.
Update (as requested):
$pathViewBase = 'C:\Data\CC_SnapViews\EndToEnd_view\'
$HopBase = '\Output\HOP\'
$basePathFull = Join-Path -Path $pathViewBase -ChildPath $HopBase
$Hop2PrinterVersionDirs = @('Hop2Windowsxx\Hop2Windowsxx.cpp')
...
foreach($cppFile in $Hop2VersionDirs) #right now there is only one
{
$File = @(Get-ChildItem -Path (Join-Path -Path $basePathFull -ChildPath $cppFile))
Update2:
I tried escaping like this with the problematic content returned:
$matchFound3 = Get-MethodContents -codePath $FileHandler2 -methodNameToReturn [regex]::Escape("feed from No. 0") -followingMethodName regex[Escape("class CHop2Windowsxx;")
and see this error:
Get-MethodContents : A positional parameter cannot be found that
accepts argument 'Roll feed from No. 0'.
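The escaping itself isn't the problem; the positional-parameter error most likely comes from passing the method call without enclosing parentheses, so PowerShell doesn't evaluate it as a single expression and its pieces spill over as extra arguments. A hedged sketch of the corrected call, using the same parameters as the function above:
$matchFound3 = Get-MethodContents -codePath $FileHandler2 `
    -methodNameToReturn ([regex]::Escape("feed from No. 0")) `
    -followingMethodName ([regex]::Escape("class CHop2Windowsxx;"))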
I figured it out. I was giving it the .cpp filename, but the content I was looking for was in the .h file. :\

is there a simple way to output to xlsx?

I am trying to output a query from a DB to an xlsx file, but it takes so much time because there are about 20,000 records to process. Is there a simpler way to do this?
I know there is a way to do it for CSV, but I'm trying to avoid that, because if the records have any commas they get treated as another column and that would mess with the info.
This is my code:
$xlsObj = New-Object -ComObject Excel.Application
$xlsObj.DisplayAlerts = $false
$xlsWb = $xlsobj.Workbooks.Add(1)
$xlsObj.Visible = 0 #(visible = 1 / 0 no visible)
$xlsSh = $xlsWb.Worksheets.Add([System.Reflection.Missing]::Value, $xlsWb.Worksheets.Item($xlsWb.Worksheets.Count))
$xlsSh.Name = "QueryResults"
$DataSetTable= $ds.Tables[0]
Write-Output "DATA SET TABLE" $DataSetTable
[Array] $getColumnNames = $DataSetTable.Columns | SELECT *
Write-Output "COLUMN NAMES" $DataSetTable.Rows[0]
[Int] $RowHeader = 1
foreach ($ColH in $getColumnNames)
{
$xlsSh.Cells.item(1, $RowHeader).font.bold = $true
$xlsSh.Cells.item(1, $RowHeader) = $ColH.ColumnName
Write-Output "Nombre de Columna"$ColH.ColumnName
$RowHeader++
}
[Int] $rowData = 2
[Int] $colData = 1
foreach ($rec in $DataSetTable.Rows)
{
foreach ($Coln in $getColumnNames)
{
$xlsSh.Cells.NumberFormat = "#"
$xlsSh.Cells.Item($rowData, $colData) = $rec.$($Coln.ColumnName).ToString()
$ColData++
}
$rowData++; $ColData = 1
}
$xlsRng = $xlsSH.usedRange
[void] $xlsRng.EntireColumn.AutoFit()
# Delete the default Sheet1/Hoja1 tab.
$xlsWb.Sheets(1).Delete() # Version 02
$xlsFile = "directory of the file"
[void] $xlsObj.ActiveWorkbook.SaveAs($xlsFile)
$xlsObj.Quit()
Start-Sleep -Milliseconds 700
While ([System.Runtime.Interopservices.Marshal]::ReleaseComObject($xlsRng)) {''}
While ([System.Runtime.Interopservices.Marshal]::ReleaseComObject($xlsSh)) {''}
While ([System.Runtime.Interopservices.Marshal]::ReleaseComObject($xlsWb)) {''}
While ([System.Runtime.Interopservices.Marshal]::ReleaseComObject($xlsObj)) {''}
[gc]::collect() | Out-Null
[gc]::WaitForPendingFinalizers() | Out-Null
$oraConn.Close()
I'm trying to avoid [CSV files], because if the records have any commas they get treated as another column and that would mess with the info
That's only the case if you try to construct the output format manually. Built-in commands like Export-Csv and ConvertTo-Csv will automatically quote the values as necessary:
PS C:\> $customObject = [pscustomobject]@{ID = 1; Name = "Solis, Heber"}
PS C:\> $customObject
ID Name
-- ----
1 Solis, Heber
PS C:\> $customObject |ConvertTo-Csv -NoTypeInformation
"ID","Name"
"1","Solis, Heber"
Notice, in the example above, how:
The string value assigned to $customObject.Name does not contain any quotation marks, but
In the output from ConvertTo-Csv we see values and headers clearly enclosed in quotation marks
PowerShell automatically enumerates the row data when you pipe a [DataTable] instance, so creating a CSV might (depending on the contents) be as simple as:
$ds.Tables[0] |Export-Csv table_out.csv -NoTypeInformation
What if you want TAB-separated values (or any other non-comma separator)?
The *-Csv commands come with a -Delimiter parameter to which you can pass a user-defined separator:
# This produces semicolon-separated values
$data |Export-Csv -Path output.csv -Delimiter ';'
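The same parameter takes a literal tab for genuinely TAB-separated output (the file name here is just an example):
# This produces tab-separated values
$data | Export-Csv -Path output.tsv -Delimiter "`t" -NoTypeInformation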
I usually try to refrain from recommending specific modules/libraries, but if you insist on writing to XLSX I'd suggest checking out ImportExcel (don't let the name fool you, it does more than import from Excel, including exporting and formatting data from PowerShell -> XLSX).
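As a rough sketch of that route (assuming the ImportExcel module is installed from the PowerShell Gallery and $ds is the DataSet from the question):
Install-Module ImportExcel -Scope CurrentUser    # one-time install
$ds.Tables[0] | Export-Excel -Path .\QueryResults.xlsx -WorksheetName 'QueryResults' -AutoSize -BoldTopRow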

delete some sequence of bytes in Powershell [duplicate]

This question already has answers here:
Methods to hex edit binary files via Powershell
(4 answers)
Closed 3 years ago.
I have a *.bin file. How can I delete some parts of the bytes with PowerShell (29 bytes, marked yellow), each ending with a repeating sequence of bytes (12 bytes, marked in red)? Thanks a lot!!
Using a very helpful article and ditto function I found here, it seems it is possible to read a binary file and convert it to a string without altering any of the bytes, by using code page 28591.
With that (I slightly changed the function), you can do this to delete the bytes in your *.bin file:
function ConvertTo-BinaryString {
# converts the bytes of a file to a string that has a
# 1-to-1 mapping back to the file's original bytes.
# Useful for performing binary regular expressions.
[OutputType([String])]
Param (
[Parameter(Mandatory = $True, ValueFromPipeline = $True, Position = 0)]
[ValidateScript( { Test-Path $_ -PathType Leaf } )]
[String]$Path
)
$Stream = New-Object System.IO.FileStream -ArgumentList $Path, 'Open', 'Read'
# Note: Codepage 28591 returns a 1-to-1 char to byte mapping
$Encoding = [Text.Encoding]::GetEncoding(28591)
$StreamReader = New-Object System.IO.StreamReader -ArgumentList $Stream, $Encoding
$BinaryText = $StreamReader.ReadToEnd()
$StreamReader.Close()
$Stream.Close()
return $BinaryText
}
$inputFile = 'D:\test.bin'
$outputFile = 'D:\test2.bin'
$fileBytes = [System.IO.File]::ReadAllBytes($inputFile)
$binString = ConvertTo-BinaryString -Path $inputFile
# create your regex: 17 bytes in range of \x00 to \xFF followed by 12 bytes specific range
$re = [Regex]'[\x00-\xFF]{17}\xEB\x6F\xD3\x01\x18\x00{3}\xFF{3}\xFE'
# use a MemoryStream object to store the result
$ms = New-Object System.IO.MemoryStream
$pos = $replacements = 0
$re.Matches($binString) | ForEach-Object {
# write the part of the byte array between the previous position and this match to the MemoryStream
$ms.Write($fileBytes, $pos, $_.Index - $pos)
# update the 'cursor' position for the next match
$pos += ($_.Index + $_.Length)
# and count the number of replacements done
$replacements++
}
# write the remainder of the bytes to the stream
$ms.Write($fileBytes, $pos, $fileBytes.Count - $pos)
# save the updated bytes to a new file (will overwrite existing file)
[System.IO.File]::WriteAllBytes($outputFile, $ms.ToArray())
$ms.Dispose()
if ($replacements) {
Write-Host "$replacements replacement(s) made."
}
else {
Write-Host "Byte sequence not found. No replacements made."
}
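A quick way to sanity-check the result (a hedged sketch; each removed block is 29 bytes, so the size difference should be a multiple of that):
$removed = (Get-Item $inputFile).Length - (Get-Item $outputFile).Length
Write-Host "Output is $removed bytes smaller ($($removed / 29) blocks of 29 bytes removed)"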

How to count files in FTP directory

I have this script. I'm trying to count how many files are in the directory.
clear
$ftp_uri = "ftp://ftp.domain.net:"
$user = "username"
$pass = "password"
$subfolder = "/test/out/"
$ftp_urix = $ftp_uri + $subfolder
$uri=[system.URI] $ftp_urix
$ftp=[system.net.ftpwebrequest]::Create($uri)
$ftp.Credentials=New-Object System.Net.NetworkCredential($user,$pass)
#Get a list of files in the current directory.
$ftp.Method=[system.net.WebRequestMethods+ftp]::ListDirectorydetails
$ftp.UseBinary = $true
$ftp.KeepAlive = $false
$ftp.EnableSsl = $true
$ftp.Timeout = 30000
$ftp.UsePassive=$true
try
{
$ftpresponse=$ftp.GetResponse()
$strm=$ftpresponse.GetResponseStream()
$ftpreader=New-Object System.IO.StreamReader($strm,'UTF-8')
$list=$ftpreader.ReadToEnd()
$lines=$list.Split("`n")
$lines
$lines.Count
$ftpReader.Close()
$ftpresponse.Close()
}
catch{
$_|fl * -Force
$ftpReader.Close()
$ftpresponse.Close()
}
In the directory I have three files, but $lines.Count returns 4. $lines has 4 rows: three files and an empty line. Can somebody explain the mystery to me?
The $list contains:
file1`nfile2`nfile3`n
If you split the string by "`n", you (correctly) get four parts, with the last one being empty.
You can use an overload of String.Split that takes StringSplitOptions and use RemoveEmptyEntries:
$list.Split("`n", [System.StringSplitOptions]::RemoveEmptyEntries)
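With that overload (plus trimming the carriage returns an FTP listing typically includes), the count comes out as expected; a small sketch against the variables in the script above:
$lines = $list.Split("`n", [System.StringSplitOptions]::RemoveEmptyEntries) |
    ForEach-Object { $_.TrimEnd("`r") } |
    Where-Object { $_ }
$lines.Count    # 3 - one per file in the directory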

PowerShell unable to get file metadata from Comments field when it's too long

I want to extract some xml data from the Comments metadata field in .WMA files.
I'm using a script from Technet's Scripting Guy column to get all metadata, and it lists every attribute except the Comments field!
Some research by my colleague showed that when we shortened the data in the Comments field to < 1024 bytes, the data from the Comments field lists out fine.
It seems to me that the limitation is in the Shell.Application object; it just returns an empty Comments field when the contents are longer than 1024 characters. Also, instead of listing every attribute, I just get the Comments, which is number 24.
The sample file I have contains 1188 bytes, and I think most files will be around that size, so it's not over by much.
Here is the script I'm currently running (removed comments for brevity):
Function Get-FileMetaData
{
Param([string[]]$folder)
foreach($sFolder in $folder)
{
$a = 0
$objShell = New-Object -ComObject Shell.Application
$objFolder = $objShell.namespace($sFolder)
foreach ($File in $objFolder.items())
{
$FileMetaData = New-Object PSOBJECT
$hash += #{"Filename" = $($objFolder.getDetailsOf($File, 0)) }
$hash += #{"My Comment field" = $($objFolder.getDetailsOf($File, 24)) }
$hash += #{"Length" = $($objFolder.getDetailsOf($File, 24)).Length }
$FileMetaData | Add-Member $hash
$hash.clear()
} #end foreach
$a=0
$FileMetaData
} #end foreach $sfolder
}
Get-FileMetaData -folder "C:\DATA\wma" | fl
Is there another approach I can use that will allow me to extract the full XML data?
You can try the taglib-sharp DLL from http://taglib.org/.
Here I copy the content of a 156 KB file to the comment:
[system.reflection.assembly]::loadfile("c:\temp\taglib-sharp.dll")
$data=[taglib.file]::create('c:\mp3\01. Stromae - Alors On Danse.mp3')
$data.Tag.Comment = (gc c:\temp\IMP_ERR.LOG)
$data.Save()
Verification:
PS> $data = [taglib.file]::create('c:\mp3\01. Stromae - Alors On Danse.mp3')
PS> $data.tag.Comment.length / 1KB
155,2197265625
Edit: I was able to use the same code for a WMA file.
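Reading the field back out of a .wma file works the same way; a small sketch, with the DLL path and file path as placeholders for your own:
[System.Reflection.Assembly]::LoadFile('c:\temp\taglib-sharp.dll')
$wma = [TagLib.File]::Create('C:\DATA\wma\sample.wma')   # substitute your own file
$wma.Tag.Comment            # full comment text, not truncated at 1024 characters
$wma.Tag.Comment.Length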