The PowerShell cmdlet Out-File has an -Encoding switch which you can set to Default. That value uses the encoding of the system's current ANSI code page.
My question is: how can I get the name of the default encoding that Out-File will use in PowerShell?
Take a look at [System.Text.Encoding]::Default; I believe that is what Default maps to.
E.g. in my case:
[System.Text.Encoding]::Default.EncodingName
gets
Cyrillic (Windows)
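For completeness, a quick sketch you can run to inspect that default; the output depends on the system's locale, so the values shown are only illustrative:

```powershell
# The name of the system's default ANSI encoding (locale-dependent)
[System.Text.Encoding]::Default.EncodingName

# The underlying ANSI code page number, e.g. 1251 for Cyrillic (Windows)
[System.Text.Encoding]::Default.CodePage
```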
Related
I have a script that processes data from files and, based on a condition, writes the result to a .txt file. The data are strings containing words like "Distribución" or "México". After processing, special characters like "é" and "ó" come out broken (the typical white square or question mark).
How can I encode the output file so it handles those characters? I tried UTF8 and UTF8 without BOM; neither works. Here is the file-writing line:
...| Out-File -Encoding XXX .\result.txt
For XXX I tried ASCII and UTF8; nothing works :/
Out-File will always add a BOM. It's a particularly annoying "feature" of that cmdlet. Unfortunately, to my knowledge, there is no quick way to save a file as UTF-8 WITHOUT a BOM in PowerShell. You can, however, leverage .NET to do this. This isn't really production ready, but here's a quick example:
$outputPath = "D:\temp.txt"
$data = "Distribución or México"
[System.IO.File]::WriteAllLines($outputPath, $data)
Wrap it in a function, cmdlet, and/or module to make it reusable. Of course, you can take more control over the file encoding with .NET too.
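If you'd rather be explicit than rely on WriteAllLines' default of BOM-less UTF-8, one sketch is to pass a UTF8Encoding constructed with $false, which suppresses the BOM:

```powershell
$outputPath = "D:\temp.txt"
$data = "Distribución or México"

# UTF8Encoding($false) means "UTF-8, no BOM"
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllLines($outputPath, [string[]]$data, $utf8NoBom)
```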
I'm running a PowerShell script on XML files recursively to search and replace text. The code works fine for the search and replace. However, in certain files there is text in other languages, like "fréquentes", which gets mangled (e.g. into "frÃ©quentes") after running the script. I've been using UTF8 encoding in the script. Any pointers on how to retain the encoding?
$content | ForEach-Object { $_ -replace 'test1', 'testing' `
                               -replace 'test2', 'testing' } | Out-File $file.FullName -Encoding utf8
You seem to be ignoring the XML file's encoding, which seems to be Latin 1. XML files specify their encoding at the start (or, if they don't, they will be autodetected as UTF-8, UTF-16, or UTF-32):
<?xml version='1.0' encoding='utf-8'?>
So it seems to me like you read the content with the correct encoding, but write the file in UTF-8 which doesn't match the declared one.
You could use the XML APIs to change the file, which may be preferable, or simply change your Out-File to
Out-File -Encoding Default
However, that can cause the encoding to differ between different computers, so careful with that. I pretty much only use it for files I know are in the system's legacy codepage, or for quick one-off scripts.
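If you go the XML API route, a minimal sketch (assuming $file is a FileInfo object from your recursive loop) is to let XmlDocument do the reading and writing, since its Save method honors the encoding declared in the XML prolog:

```powershell
# Load respecting the declared encoding, replace, and save back
$xml = New-Object System.Xml.XmlDocument
$xml.Load($file.FullName)
$xml.InnerXml = $xml.InnerXml -replace 'test1', 'testing' -replace 'test2', 'testing'
$xml.Save($file.FullName)   # writes using the encoding from the XML declaration
```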
I was trying to write help text to a file with
Set-Content -path "help.txt" -Value $(help -Full "help")
Then I found that help cmdlet generates an object rather than text.
But simply appending .ToString() does not work either.
So how can I get clean text from help command and write it to file using Set-Content?
To capture output as it would print on the screen, use either the output redirection operator >, or pipe to the Out-File cmdlet, which is required if you want an output character encoding other than the default, UTF-16 LE:
help -full help > help.txt # invariably creates a UTF-16 LE file
help -full help | Out-File help.txt # equivalent, but supports -Encoding <name>
By contrast, Set-Content:
does not use PowerShell's default output formatting; instead, it applies (at least conceptually) a .ToString() call to each input object, which may or may not give a meaningful representation.
creates ASCII files by default but, like Out-File, supports different encodings via the -Encoding parameter.
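If you do want to stick with Set-Content, one workaround (a sketch) is to render the help object to its on-screen text first with Out-String:

```powershell
# Out-String applies PowerShell's default formatting, so Set-Content
# receives plain text instead of a help object
help -Full help | Out-String | Set-Content -Path help.txt -Encoding UTF8
```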
In PowerShell, using > is the same as using | Out-File, so I can write
"something" > file.txt and it will write 'something' into file.txt. This is what I expect of a shell. Unfortunately, PowerShell uses Unicode (UTF-16) when writing file.txt. The only way to change it to UTF-8 is to write the rather long command:
"something" | Out-File file.txt -Encoding UTF8
I want to override the > shortcut, so that it adds the UTF-8 encoding by default. Is there a way to do that?
NOT A DUPLICATE CLARIFICATION:
This is not a duplicate. As is explained clearly here, Out-File has a hard-coded default. I don't want to change Out-File's behavior, I want to change >'s behavior.
No, can't be done
Even the documentation alludes to this.
From the last paragraph of Get-Help about_Redirection:
When you are writing to files, the redirection operators use Unicode encoding. If the file has a different encoding, the output might not be formatted correctly. To redirect content to non-Unicode files, use the Out-File cmdlet with its Encoding parameter.
(emphasis added)
The output encoding can be overridden by changing the $OutputEncoding variable. However, that only works for piping output into executables; it doesn't affect the redirection operators. If you need a specific encoding for file output, you must use Out-File or Set-Content with the -Encoding parameter (or a StreamWriter).
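Since > itself can't be changed, a common workaround is a small wrapper function; this is only a sketch, and the name Out-FileUtf8 is invented here:

```powershell
function Out-FileUtf8 {
    param([Parameter(Mandatory)][string]$Path)
    # $input holds the collected pipeline input
    $input | Out-File -FilePath $Path -Encoding UTF8
}

"something" | Out-FileUtf8 file.txt   # shorter than spelling out Out-File -Encoding UTF8
```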
I'm currently working with PowerShell to create a .bat script.
I add text to the .bat script with >>.
For example:
Write "start program xxx" >> script.bat
But when I try to execute this script.bat with cmd, it says:
"■s" is not recognized ... etc.
And in PowerShell it says: 'þp' is not recognized ...
So I guess that doing >> script puts special characters at the beginning of the line. Does anyone have information on this, and on what those "■s" and 'þp' characters are?
The file redirection operators (>> etc.) write text encoded as UTF-16. If the file already contains text in a different encoding, everything will be confused (and I'm not sure cmd.exe understands UTF-16 at all).
Easier to use Out-File with the -encoding parameter to specify something consistent. Use the -append switch parameter to append rather than overwriting.
E.g.:
"Some text" | Out-File -Encoding ASCII -Append -FilePath 'script.bat'
(If you find yourself writing the same Out-File call and parameters repeatedly, put it in a helper advanced function that reads pipeline input to encapsulate the Out-File.)
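Such a helper might look like the following sketch (the name Add-BatLine is made up for illustration):

```powershell
function Add-BatLine {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)][string]$Path,
        [Parameter(ValueFromPipeline)][string]$Line
    )
    process {
        # Append each pipeline line as ASCII so cmd.exe can read it
        $Line | Out-File -FilePath $Path -Encoding ASCII -Append
    }
}

"start program xxx" | Add-BatLine -Path script.bat
```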