PFX encoded and back to PFX in PowerShell

Is it possible, when you convert a PFX to, let's say, Base64, to then convert it back to PFX?
$PFX_FILE = get-content 'dummy.pfx' -Encoding Byte
[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes($PFX_FILE)) | Out-File 'dummy.txt'
$BASE64_STR = get-content 'dummy.txt' -Encoding utf8
[Text.Encoding]::Utf8.GetString([Convert]::FromBase64String($BASE64_STR)) | Out-File 'dummy-2.pfx'
The output of line four (the final conversion) is, unsurprisingly, invalid, but I am not sure how to go about fixing it.

I created a PFX cert at C:\temp\PowerShellGraphCert.pfx and ran the following. I believe this is what you are looking for.
I converted PowerShellGraphCert.pfx to PowerShellGraphCertbase64.txt and then back to dummy-3.pfx.
Now PowerShellGraphCert.pfx = dummy-3.pfx.
$PFX_FILE = Get-Content 'C:\temp\PowerShellGraphCert.pfx' -Encoding Byte
[System.Convert]::ToBase64String($PFX_FILE) | Out-File 'C:\temp\PowerShellGraphCertbase64.txt'
$BASE64_STR = Get-Content 'C:\temp\PowerShellGraphCertbase64.txt'
$filename = 'C:\temp\dummy-3.pfx'
$bytes = [Convert]::FromBase64String($BASE64_STR)
[IO.File]::WriteAllBytes($filename, $bytes)
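On PowerShell 6 and later, note that Get-Content -Encoding Byte was removed; a minimal sketch of the same round trip using its replacement, -AsByteStream (same example paths as above):
$bytes = Get-Content 'C:\temp\PowerShellGraphCert.pfx' -AsByteStream -Raw
[Convert]::ToBase64String($bytes) | Out-File 'C:\temp\PowerShellGraphCertbase64.txt'
$BASE64_STR = Get-Content 'C:\temp\PowerShellGraphCertbase64.txt' -Raw
# FromBase64String ignores any trailing whitespace that -Raw may preserve.
[IO.File]::WriteAllBytes('C:\temp\dummy-3.pfx', [Convert]::FromBase64String($BASE64_STR))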

Log File Format Getting Fowled Up [duplicate]

I am using Sandcastle Helpfile Builder to produce a helpfile (.chm). The project is a .shfbproj file, which is in XML format and works with msbuild.
I want to automatically update the Footer text that appears in the generated .chm file. I use this snippet:
$newFooter = "<FooterText>MyProduct v1.2.3.4</FooterText>";
get-content -Encoding ASCII $projFile.FullName |
%{$_ -replace '<FooterText>(.+)</FooterText>', $newFooter } > $TmpFile
move-item $TmpFile $projFile.FullName -force
The output directed to the $TmpFile is always a multi-byte string. But I don't want that. How do I set the encoding of the output to ASCII?
You could change the $OutputEncoding variable before writing to the file. The other option is not to use the > operator, but instead pipe directly to Out-File and use the -Encoding parameter.
The > redirection operator is a "shortcut" to Out-File. Out-File's default encoding is Unicode but you can change it to ASCII, so pipe to Out-File instead:
Get-Content -Encoding ASCII $projFile.FullName |
% { $_ -replace '<FooterText>(.+)</FooterText>', $newFooter } |
Out-File $tmpfile -Encoding ASCII
Piping to sc filename does the trick (sc being an alias for Set-Content).
For >> filename, piping to ac filename does the trick (ac being an alias for Add-Content), as sketched below.
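A minimal sketch of both, reusing $projFile, $newFooter, and $TmpFile from the question (this assumes Windows PowerShell, where the sc alias exists and Set-Content writes the system's "ANSI" default encoding, so ASCII-range text stays single-byte):
Get-Content -Encoding ASCII $projFile.FullName |
    % { $_ -replace '<FooterText>(.+)</FooterText>', $newFooter } |
    sc $TmpFile
# Appending works the same way (the appended text here is a hypothetical example):
"<!-- footer updated -->" | ac $TmpFile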
I found I had to use the following:
write-output "First line" | out-file -encoding ascii OutputFileName
write-output "Next line" | out-file -encoding ascii -append OutputFileName
....
Changing the output encoding using:
$OutputEncoding = New-Object -TypeName System.Text.ASCIIEncoding
did not work ($OutputEncoding only governs how PowerShell encodes data piped to native programs; it does not affect Out-File).
You can set the default encoding of Out-File to be ASCII:
$PSDefaultParameterValues = @{'Out-File:Encoding' = 'ascii'}
Then something like this will result in an ascii file:
echo hi > out
In PowerShell 6 and 7, the default encoding of Out-File was changed to UTF-8 with no BOM.
Just a little example using streams, although I realize this wasn't the original question.
C:\temp\ConfirmWrapper.ps1 -Force -Verbose 4>&1 6>&1 | Out-File -Encoding default -FilePath C:\temp\confirmLog.txt -Append
Will output the information (6) and verbose (4) streams to the output (1) stream and redirect all that to the out-file with ANSI (default) encoding.

to retrieve a Japanese or localized string from a file

I have the below code to retrieve a Japanese or localized string from an mht file. I have used almost all the encoding params listed here (unknown, string, unicode, bigendianunicode, utf8, utf7, utf32, ascii, default, oem) and verified; it always prints junk characters instead of the original Chinese or Japanese name.
$log = "c:\scripts\meta.mht"
$patt = 'title'
$indx = Select-String $patt $log | ForEach-Object {$_.LineNumber}
write-host (Get-Content $log)[$indx] | out-file -encoding string c:\scripts\temp1.xml
Can someone help me with how to print the localized string? Which encoding param should I use?
I have tried all the listed params but no luck (unknown, string, unicode, bigendianunicode, utf8, utf7, utf32, ascii, default, oem).
Thanks in advance.
Try changing your encoding for Get-Content like this:
# Note: Write-Host is dropped - its output goes to the host, not the pipeline, so Out-File would receive nothing.
(Get-Content -Path $log -Encoding UTF8)[$indx] | Out-File -Encoding UTF8 c:\scripts\temp1.xml
I'm not sure if you'll need UTF8 or UNICODE, try both for Get-Content and Out-File.
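In case it helps, here is the same idea wired together end to end - a sketch, assuming the file really is UTF-8 and that (as in the question's arithmetic) the wanted string is on the line after the match, since Select-String's LineNumber is 1-based while array indexing is 0-based:
$log  = 'c:\scripts\meta.mht'
$indx = (Select-String -Path $log -Pattern 'title' -Encoding UTF8 | Select-Object -First 1).LineNumber
(Get-Content -Path $log -Encoding UTF8)[$indx] | Out-File -Encoding UTF8 'c:\scripts\temp1.xml'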

Using PowerShell to write a file in UTF-8 without the BOM

Out-File seems to force the BOM when using UTF-8:
$MyFile = Get-Content $MyPath
$MyFile | Out-File -Encoding "UTF8" $MyPath
How can I write a file in UTF-8 with no BOM using PowerShell?
Update 2021
PowerShell has changed a bit since I wrote this question 10 years ago. Check multiple answers below, they have a lot of good information!
Using .NET's UTF8Encoding class and passing $False to the constructor seems to work:
$MyRawString = Get-Content -Raw $MyPath
$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
[System.IO.File]::WriteAllLines($MyPath, $MyRawString, $Utf8NoBomEncoding)
The proper way as of now is to use the solution recommended by @Roman Kuzmin in the comments to @M. Dudley's answer:
[IO.File]::WriteAllLines($filename, $content)
(I've also shortened it a bit by stripping the unnecessary System namespace qualifier - it will be resolved automatically by default.)
I figured this wouldn't be UTF, but I just found a pretty simple solution that seems to work...
Get-Content path/to/file.ext | out-file -encoding ASCII targetFile.ext
For me this results in a UTF-8 (without BOM) file regardless of the source format.
Note: This answer applies to Windows PowerShell; by contrast, in the cross-platform PowerShell Core edition (v6+), UTF-8 without BOM is the default encoding, across all cmdlets.
In other words: If you're using PowerShell [Core] version 6 or higher, you get BOM-less UTF-8 files by default (which you can also explicitly request with -Encoding utf8 / -Encoding utf8NoBOM, whereas you get with-BOM encoding with -Encoding utf8BOM).
If you're running Windows 10 and you're willing to switch to BOM-less UTF-8 encoding system-wide - which can have side effects - even Windows PowerShell can be made to use BOM-less UTF-8 consistently - see this answer.
To complement M. Dudley's own simple and pragmatic answer (and ForNeVeR's more concise reformulation):
A simple, (non-streaming) PowerShell-native alternative is to use New-Item, which (curiously) creates BOM-less UTF-8 files by default even in Windows PowerShell:
# Note the use of -Raw to read the file as a whole.
# Unlike with Set-Content / Out-File *no* trailing newline is appended.
$null = New-Item -Force $MyPath -Value (Get-Content -Raw $MyPath)
Note: To save the output from arbitrary commands in the same format as Out-File would, pipe to Out-String first; e.g.:
$null = New-Item -Force Out.txt -Value (Get-ChildItem | Out-String)
For convenience, below is advanced function Out-FileUtf8NoBom, a pipeline-based alternative that mimics Out-File, which means:
you can use it just like Out-File in a pipeline.
input objects that aren't strings are formatted as they would be if you sent them to the console, just like with Out-File.
an additional -UseLF switch allows you to use Unix-format LF-only newlines ("`n") instead of the Windows-format CRLF newlines ("`r`n") you normally get.
Example:
(Get-Content $MyPath) | Out-FileUtf8NoBom $MyPath # Add -UseLF for Unix newlines
Note how (Get-Content $MyPath) is enclosed in (...), which ensures that the entire file is opened, read in full, and closed before sending the result through the pipeline. This is necessary in order to be able to write back to the same file (update it in place).
Generally, though, this technique is not advisable for 2 reasons: (a) the whole file must fit into memory and (b) if the command is interrupted, data will be lost.
A note on memory use: M. Dudley's own answer and the New-Item alternative above require that the entire file contents be built up in memory first, which can be problematic with large input sets.
The function below does not require this, because it is implemented as a proxy (wrapper) function (for a concise summary of how to define such functions, see this answer).
Source code of function Out-FileUtf8NoBom:
Note: The function is also available as an MIT-licensed Gist, and only it will be maintained going forward.
You can install it directly with the following command (while I can personally assure you that doing so is safe, you should always check the content of a script before directly executing it this way):
# Download and define the function.
irm https://gist.github.com/mklement0/8689b9b5123a9ba11df7214f82a673be/raw/Out-FileUtf8NoBom.ps1 | iex
function Out-FileUtf8NoBom {
  <#
  .SYNOPSIS
    Outputs to a UTF-8-encoded file *without a BOM* (byte-order mark).
  .DESCRIPTION
    Mimics the most important aspects of Out-File:
      * Input objects are sent to Out-String first.
      * -Append allows you to append to an existing file, -NoClobber prevents
        overwriting of an existing file.
      * -Width allows you to specify the line width for the text representations
        of input objects that aren't strings.
    However, it is not a complete implementation of all Out-File parameters:
      * Only a literal output path is supported, and only as a parameter.
      * -Force is not supported.
      * Conversely, an extra -UseLF switch is supported for using LF-only newlines.
  .NOTES
    The raison d'ĂȘtre for this advanced function is that Windows PowerShell
    lacks the ability to write UTF-8 files without a BOM: using -Encoding UTF8
    invariably prepends a BOM.
    Copyright (c) 2017, 2022 Michael Klement <mklement0@gmail.com> (http://same2u.net),
    released under the [MIT license](https://spdx.org/licenses/MIT#licenseText).
  #>
  [CmdletBinding(PositionalBinding=$false)]
  param(
    [Parameter(Mandatory, Position = 0)] [string] $LiteralPath,
    [switch] $Append,
    [switch] $NoClobber,
    [AllowNull()] [int] $Width,
    [switch] $UseLF,
    [Parameter(ValueFromPipeline)] $InputObject
  )
  begin {
    # Convert the input path to a full one, since .NET's working dir. usually
    # differs from PowerShell's.
    $dir = Split-Path -LiteralPath $LiteralPath
    if ($dir) { $dir = Convert-Path -ErrorAction Stop -LiteralPath $dir } else { $dir = $pwd.ProviderPath }
    $LiteralPath = [IO.Path]::Combine($dir, [IO.Path]::GetFileName($LiteralPath))
    # If -NoClobber was specified, throw an exception if the target file already
    # exists.
    if ($NoClobber -and (Test-Path $LiteralPath)) {
      throw [IO.IOException] "The file '$LiteralPath' already exists."
    }
    # Create a StreamWriter object.
    # Note that we take advantage of the fact that the StreamWriter class by default:
    # - uses UTF-8 encoding
    # - without a BOM.
    $sw = New-Object System.IO.StreamWriter $LiteralPath, $Append
    $htOutStringArgs = @{}
    if ($Width) { $htOutStringArgs += @{ Width = $Width } }
    try {
      # Create the script block with the command to use in the steppable pipeline.
      $scriptCmd = {
        & Microsoft.PowerShell.Utility\Out-String -Stream @htOutStringArgs |
          . { process { if ($UseLF) { $sw.Write(($_ + "`n")) } else { $sw.WriteLine($_) } } }
      }
      $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)
      $steppablePipeline.Begin($PSCmdlet)
    }
    catch { throw }
  }
  process {
    $steppablePipeline.Process($_)
  }
  end {
    $steppablePipeline.End()
    $sw.Dispose()
  }
}
Starting with version 6, PowerShell supports the UTF8NoBOM encoding for both Set-Content and Out-File, and even uses it as the default encoding.
So in the above example it should simply be like this:
$MyFile | Out-File -Encoding UTF8NoBOM $MyPath
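A quick way to confirm the absence of a BOM on PowerShell 6+ (a sketch; the file name is arbitrary, and the bytes 239 187 191 at the start would indicate a BOM):
'hello' | Out-File out.txt                  # utf8NoBOM is the default here
[byte[]] $first = Get-Content out.txt -AsByteStream -Raw
$first[0..2] -join ','                      # 104,101,108 - the text itself, no BOM prefix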
When using Set-Content instead of Out-File, you can specify the encoding Byte (Windows PowerShell only; a PowerShell 6+ variant follows below), which can be used to write a byte array to a file. This, in combination with a custom UTF8 encoding which does not emit the BOM, gives the desired result:
# This variable can be reused
$utf8 = New-Object System.Text.UTF8Encoding $false
$MyFile = Get-Content $MyPath -Raw
Set-Content -Value $utf8.GetBytes($MyFile) -Encoding Byte -Path $MyPath
The difference from using [IO.File]::WriteAllLines() or similar is that it should work fine with any type of item and path, not only actual file paths.
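On PowerShell 6+, -Encoding Byte no longer exists; the presumed equivalent uses -AsByteStream instead (although there the utf8NoBOM default shown above makes this workaround unnecessary):
$utf8 = New-Object System.Text.UTF8Encoding $false
$MyFile = Get-Content $MyPath -Raw
Set-Content -Value $utf8.GetBytes($MyFile) -AsByteStream -Path $MyPath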
This script will convert, to UTF-8 without BOM, all .txt files in DIRECTORY1 and output them to DIRECTORY2:
foreach ($i in ls -name DIRECTORY1\*.txt)
{
    $file_content = Get-Content "DIRECTORY1\$i"
    # Pass an absolute path: .NET resolves relative paths against its own working directory, not PowerShell's.
    [System.IO.File]::WriteAllLines("$PWD\DIRECTORY2\$i", $file_content)
}
Important: this only works if an extra space or newline at the start is not a problem for your use case of the file
(e.g. if it is an SQL file, Java file or human-readable text file).
One could use a combination of creating an empty ASCII file (ASCII bytes are valid UTF-8) and appending to it (replace $str with gc $src if the source is a file):
" " | out-file -encoding ASCII -noNewline $dest
$str | out-file -encoding UTF8 -append $dest
as one-liner
replace $dest and $str according to your use case:
$_ofdst = $dest ; " " | out-file -encoding ASCII -noNewline $_ofdst ; $src | out-file -encoding UTF8 -append $_ofdst
as simple function
function Out-File-UTF8-noBOM { param( $str, $dest )
    " " | out-file -encoding ASCII -noNewline $dest
    $str | out-file -encoding UTF8 -append $dest
}
using it with a source file (note: PowerShell arguments are space-separated; a comma would pass a single array):
Out-File-UTF8-noBOM (gc $src) $dest
using it with a string:
Out-File-UTF8-noBOM $str $dest
optionally: continue appending with Out-File:
"more foo bar" | Out-File -encoding UTF8 -append $dest
Old question, new answer:
While the "old" powershell writes a BOM, the new platform-agnostic variant does behave differently: The default is "no BOM" and it can be configured via switch:
-Encoding
Specifies the type of encoding for the target file. The default value is utf8NoBOM.
The acceptable values for this parameter are as follows:
ascii: Uses the encoding for the ASCII (7-bit) character set.
bigendianunicode: Encodes in UTF-16 format using the big-endian byte order.
oem: Uses the default encoding for MS-DOS and console programs.
unicode: Encodes in UTF-16 format using the little-endian byte order.
utf7: Encodes in UTF-7 format.
utf8: Encodes in UTF-8 format.
utf8BOM: Encodes in UTF-8 format with Byte Order Mark (BOM)
utf8NoBOM: Encodes in UTF-8 format without Byte Order Mark (BOM)
utf32: Encodes in UTF-32 format.
Source: https://learn.microsoft.com/de-de/powershell/module/Microsoft.PowerShell.Utility/Out-File?view=powershell-7
Emphasis mine
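For example, to request the with-BOM variant explicitly under PowerShell 6+ (a one-line sketch with an arbitrary file name):
'hello' | Out-File -Encoding utf8BOM out.txt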
For PowerShell 5.1, enable this setting:
Control Panel > Region > Administrative > Change system locale > "Use Unicode UTF-8 for worldwide language support"
Then enter this into PowerShell:
$PSDefaultParameterValues['*:Encoding'] = 'Default'
Alternatively, you can upgrade to PowerShell 6 or higher.
https://github.com/PowerShell/PowerShell
I would say to just use the Set-Content command; nothing else is needed.
The PowerShell version on my system is:
PS C:\Users\XXXXX> $PSVersionTable.PSVersion | fl
Major : 5
Minor : 1
Build : 19041
Revision : 1682
MajorRevision : 0
MinorRevision : 1682
PS C:\Users\XXXXX>
So you would need something like the following.
PS C:\Users\XXXXX> Get-Content .\Downloads\finddate.txt
Thursday, June 23, 2022 5:57:59 PM
PS C:\Users\XXXXX> Get-Content .\Downloads\finddate.txt | Set-Content .\Downloads\anotherfile.txt
PS C:\Users\XXXXX> Get-Content .\Downloads\anotherfile.txt
Thursday, June 23, 2022 5:57:59 PM
PS C:\Users\XXXXX>
Now when we check the file (anotherfile.txt), it is UTF-8.
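Since the screenshot is not reproduced here, one way to check the result yourself in Windows PowerShell is to read the first three bytes; 239 187 191 would mean a UTF-8 BOM is present:
Get-Content .\Downloads\anotherfile.txt -Encoding Byte -TotalCount 3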
Change multiple files by extension to UTF-8 without BOM:
$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding($False)
foreach($i in ls -recurse -filter "*.java") {
$MyFile = Get-Content $i.fullname
[System.IO.File]::WriteAllLines($i.fullname, $MyFile, $Utf8NoBomEncoding)
}
[System.IO.FileInfo] $file = Get-Item -Path $FilePath
$sequenceBOM = New-Object System.Byte[] 3
$reader = $file.OpenRead()
$bytesRead = $reader.Read($sequenceBOM, 0, 3)
$reader.Dispose()
# A UTF-8+BOM file starts with the following three bytes. Hex: 0xEF 0xBB 0xBF; Decimal: 239 187 191
if ($bytesRead -eq 3 -and $sequenceBOM[0] -eq 239 -and $sequenceBOM[1] -eq 187 -and $sequenceBOM[2] -eq 191)
{
    $utf8NoBomEncoding = New-Object System.Text.UTF8Encoding($False)
    [System.IO.File]::WriteAllLines($FilePath, (Get-Content $FilePath), $utf8NoBomEncoding)
    Write-Host "UTF-8 BOM removed successfully"
}
else
{
    Write-Warning "Not a UTF-8 BOM file"
}
Source: How to remove UTF8 Byte Order Mark (BOM) from a file using PowerShell
If you want to use [System.IO.File]::WriteAllLines(), you should cast the second parameter to String[] (if the type of $MyFile is Object[]), and also specify an absolute path with $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($MyPath), like:
$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
Get-ChildItem | ConvertTo-Csv | Set-Variable MyFile
[System.IO.File]::WriteAllLines($ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($MyPath), [String[]]$MyFile, $Utf8NoBomEncoding)
If you want to use [System.IO.File]::WriteAllText(), sometimes you should pipe the second parameter through Out-String to add CRLFs to the end of each line explicitly (especially when you use them with ConvertTo-Csv):
$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
Get-ChildItem | ConvertTo-Csv | Out-String | Set-Variable tmp
[System.IO.File]::WriteAllText("/absolute/path/to/foobar.csv", $tmp, $Utf8NoBomEncoding)
Or you can use [Text.Encoding]::UTF8.GetBytes() with Set-Content -Encoding Byte:
$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
Get-ChildItem | ConvertTo-Csv | Out-String | % { [Text.Encoding]::UTF8.GetBytes($_) } | Set-Content -Encoding Byte -Path "/absolute/path/to/foobar.csv"
see: How to write result of ConvertTo-Csv to a file in UTF-8 without BOM
I had the same problem in PowerShell and fixed it with this setting:
$PSDefaultParameterValues['*:Encoding'] = 'utf8'
One technique I utilize is to redirect output to an ASCII file using the Out-File cmdlet.
For example, I often run SQL scripts that create another SQL script to execute in Oracle. With simple redirection (>), the output will be in UTF-16, which is not recognized by SQLPlus. To work around this:
sqlplus -s / as sysdba "@create_sql_script.sql" |
Out-File -FilePath new_script.sql -Encoding ASCII -Force
The generated script can then be executed via another SQLPlus session without any Unicode worries:
sqlplus / as sysdba "@new_script.sql" |
tee new_script.log
Update: As others have pointed out, this will drop non-ASCII characters. Since the user asked for a way to "force" conversion, I assume they do not care about that, perhaps because their data does not contain any.
If you care about the preservation of non-ASCII characters, this is not the answer for you.
I used this method to edit a UTF-8-no-BOM file, and it generated a file with the correct encoding:
$fileD = "file.xml"
(Get-Content $fileD) | ForEach-Object { $_ -replace 'replace text',"new text" } | out-file "file.xml" -encoding ASCII
I was skeptical of this method at first, but it surprised me and worked!
Tested with PowerShell version 5.1.
You could use the following to get UTF-8 without a BOM (note that, like the ASCII approaches above, it drops non-ASCII characters):
$MyFile | Out-File -Encoding ASCII
This one works for me (use "Default" instead of "UTF8"):
$MyFile = Get-Content $MyPath
$MyFile | Out-File -Encoding "Default" $MyPath
The result is the system's "ANSI" default encoding without a BOM (byte-identical to ASCII for ASCII-range text).

script to save file as unicode

Do you know any way that I could programmatically, or via script, transform a set of text files saved in ANSI character encoding to Unicode encoding?
I would like to do the same as when I open the file with Notepad and choose to save it as a Unicode file.
This could work for you, but notice that it'll grab every file in the current folder:
Get-ChildItem | Foreach-Object { $c = (Get-Content $_); `
Set-Content -Encoding UTF8 $c -Path ($_.name + "u") }
Same thing using aliases for brevity:
gci | %{ $c = (gc $_); sc -Encoding UTF8 $c -Path ($_.name + "u") }
Steven Murawski suggests using Out-File instead. The differences between both cmdlets are the following:
Out-File will attempt to format the input it receives.
Out-File's default encoding is Unicode-based, whereas Set-Content uses the system's default.
Here's an example assuming the file test.txt doesn't exist in either case:
PS> [system.string] | Out-File test.txt
PS> Get-Content test.txt
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True String System.Object
# test.txt encoding is Unicode-based with BOM
PS> [system.string] | Set-Content test.txt
PS> Get-Content test.txt
System.String
# test.txt encoding is "ANSI" (Windows character set)
In fact, if you don't need any specific Unicode encoding, you could as well do the following to convert a text file to Unicode:
PS> Get-Content sourceASCII.txt > targetUnicode.txt
Out-File is a "redirection operator with optional parameters" of sorts.
The easiest way would be Get-Content 'path/to/text/file' | out-file 'name/of/file'.
Out-File has an -encoding parameter, the default of which is Unicode.
If you wanted to script a batch of them, you could do something like
$files = Get-ChildItem 'directory/of/text/files'
foreach ($file in $files)
{
    # Parentheses force the file to be read fully (and closed) before Out-File rewrites it in place.
    (Get-Content $file) | Out-File $file.FullName
}
Use the System.IO.StreamReader class (to read the file contents) together with the System.Text.Encoding.Encoding base class (to create the encoder object which does the encoding).
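A minimal sketch of that approach in PowerShell, with hypothetical in.txt/out.txt paths; in Windows PowerShell, [Text.Encoding]::Default is the system's ANSI code page and [Text.Encoding]::Unicode is little-endian UTF-16:
$reader = New-Object System.IO.StreamReader 'C:\temp\in.txt', ([Text.Encoding]::Default)
$writer = New-Object System.IO.StreamWriter 'C:\temp\out.txt', $false, ([Text.Encoding]::Unicode)
$writer.Write($reader.ReadToEnd())
$reader.Dispose()
$writer.Dispose()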
You could create a new text file and write the bytes from the original file into the new one, placing a '\0' after each original byte (assuming the original text file was ASCII-only English; Windows "Unicode" is little-endian UTF-16, so the low byte comes first).
You can use iconv. On Windows you can use it under Cygwin.
iconv -f from_encoding -t to_encoding file
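For example, to convert a Windows-1252 ("ANSI") file to the little-endian UTF-16 that Notepad calls "Unicode" (the file names here are hypothetical):
iconv -f CP1252 -t UTF-16LE text1.txt > text1-unicode.txt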
Pseudo code (VBScript)...
Dim system, file, contents, newFile, oldFile
Const ForReading = 1, ForWriting = 2, ForAppending = 3
Const AnsiFile = -2, UnicodeFile = -1
Set system = CreateObject("Scripting.FileSystemObject")
Set file = system.GetFile("text1.txt")
Set oldFile = file.OpenAsTextStream(ForReading, AnsiFile)
contents = oldFile.ReadAll()
oldFile.Close
system.CreateTextFile "text1.txt", True   ' True = overwrite the existing file
Set file = system.GetFile("text1.txt")
Set newFile = file.OpenAsTextStream(ForWriting, UnicodeFile)
newFile.Write contents
newFile.Close
Hope this approach works.