I'm trying to get the output of a command in PowerShell and encode it and then decode it again to receive the results of the said command as shown.
$enc = [system.Text.Encoding]::UTF8
$bytes = $enc.GetBytes((Invoke-Expression "net users"))
$enc.GetString($bytes)
However, the result comes out malformed as opposed to the original net users command. I've tried changing the encodings to ASCII and Unicode and still the result is malformed.
Any ideas on how to maintain the formatting?
The problem isn't caused by the encoding but by the fact that PowerShell mangles the command output unless you force it into a single string:
$bytes = $enc.GetBytes((Invoke-Expression "net users" | Out-String))
You don't need Invoke-Expression BTW. This will work as well:
$bytes = $enc.GetBytes((net users | Out-String))
To complement Ansgar Wiechers' helpful answer:
Invoking an external command returns the output lines as an array of strings.
Because Encoding.GetBytes() expects a single string as input, the array was implicitly coerced to a single string, in which case PowerShell joins the array elements using the space character, so the original line formatting was lost; e.g.:
PS> [string] (1, 2, 3)
1 2 3 # single string containing the array elements joined with spaces
Piping to Out-String, as in Ansgar's answer, prevents creation of the array and returns the external command output as a single string.
PS> (1, 2, 3 | Out-String | Measure-Object).Count
1 # Out-String output a single string
Another option would be to join the array elements with newlines on demand (you won't see the difference in the console, but you do get a single, multi-line output string with this technique):
PS> (net users) -join "`n" # or, more robustly: [environment]::NewLine
Note: With this technique, the output string will not have a trailing newline, unlike when you use Out-String.
Out-String always appends a trailing newline, which can be undesired.
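To compare the implicit space-joined coercion with an explicit newline join without depending on net users, here is a minimal sketch. The $lines array is a made-up stand-in for the string array an external command would return, and the sketch assumes $OFS has its default value.

```powershell
# Made-up stand-in for the array of lines an external command returns.
$lines = 'User accounts', '-------------', 'Administrator  Guest'
$enc = [System.Text.Encoding]::UTF8

# Casting the array to [string] joins the elements with spaces: line breaks are lost.
$flattened = $enc.GetString($enc.GetBytes([string] $lines))

# Joining with newlines first preserves the line structure through the round trip.
$joined = $lines -join [Environment]::NewLine
$roundTripped = $enc.GetString($enc.GetBytes($joined))

$flattened      # a single line
$roundTripped   # three lines, without a trailing newline
```

The encode/decode step itself is lossless in both cases; only the way the array becomes a string differs.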
Alternatively, you can tell PowerShell what encoding to expect from an external command by setting [Console]::OutputEncoding (temporarily):
However, that is only necessary if you know the external utility to use an output encoding other than the default (your system's active OEM code page) - and I doubt that that's necessary for net users; that said, here's how it would work:
$prevEnc = [Console]::OutputEncoding
[Console]::OutputEncoding = New-Object System.Text.UTF8Encoding
$str = net users | Out-String # `net users` output is now properly decoded as UTF-8
[Console]::OutputEncoding = $prevEnc
I'm using an AHK script to dump the current clipboard contents to a file (which contains a copy of a part of Microsoft OneNote page to a file).
I would like to modify this binary file to search for a specific string and be able to import it back into AHK.
I tried the following but it looks like powershell is doing something additional to the file (like changing the encoding) and the import of the file into the clipboard is failing.
$ThisFile = 'B:\Users\Desktop\onenote-new-entry.txt'
$data = Get-Content $ThisFile
$data = $data.Replace('asdf','TESTREPLACE!')
$data | Out-File -encoding utf8 $ThisFile
Any suggestions on doing a string replace to the file without changing existing encoding?
I tried manually modifying in a text editor and it works fine. Obviously though I would like to have the modifications be done in mass and automatically which is why I need a script.
The text copied from OneNote and then dumped to file via AHK looks like this:
However, note the clipboard dump file has a lot of other meta-data as shown below when opened in an editor. To download for testing with PS, click here:
Since your file is a mix of binary data and UTF-8 text, you cannot use text processing (as you tried with Out-File -Encoding utf8), because the binary data would invariably be interpreted as text too, resulting in its corruption.
PowerShell offers no simple method for editing binary files, but you can solve your problem via an auxiliary "hex string" representation of the file's bytes:
# To compensate for a difference between Windows PowerShell and PowerShell (Core) 7+
# with respect to how byte processing is requested: -Encoding Byte vs. -AsByteStream
$byteEncParam =
if ($IsCoreCLR) { @{ AsByteStream = $true } }
else { @{ Encoding = 'Byte' } }
# Read the file *as a byte array*.
$ThisFile = 'B:\Users\Desktop\onenote-new-entry.txt'
$data = Get-Content @byteEncParam -ReadCount 0 $ThisFile
# Convert the array to a "hex string" in the form "nn-nn-nn-...",
# where nn represents a two-digit hex representation of each byte,
# e.g. '41-42' for 0x41, 0x42, which, if interpreted as a
# single-byte encoding (ASCII), is 'AB'.
$dataAsHexString = [BitConverter]::ToString($data)
# Define the search and replace strings, and convert them into
# "hex strings" too, using their UTF-8 byte representation.
$search = 'asdf'
$replacement = 'TESTREPLACE!'
$searchAsHexString = [BitConverter]::ToString([Text.Encoding]::UTF8.GetBytes($search))
$replaceAsHexString = [BitConverter]::ToString([Text.Encoding]::UTF8.GetBytes($replacement))
# Perform the replacement.
$dataAsHexString = $dataAsHexString.Replace($searchAsHexString, $replaceAsHexString)
# Convert the modified "hex string" back to a byte[] array.
$modifiedData = [byte[]] ($dataAsHexString -split '-' -replace '^', '0x')
# Save the byte array back to the file.
Set-Content @byteEncParam $ThisFile -Value $modifiedData
Note:
As discussed in the comments, in the case at hand this can only be expected to work if the search and replacement strings are of the same length, because the file also contains metadata denoting the position and length of the embedded text parts. A replacement string of different length would require adjusting that metadata accordingly.
The string replacement performed is (a) literal, (b) case-sensitive, and (c) - for accented characters such as é - only works if the input - like string literals in .NET - uses the composed Unicode normalization form (NFC), where é is a single code point and encoded as such (resulting in a multi-byte UTF-8 sequence).
More sophisticated replacements, such as regex-based ones, would only be possible if you knew how to split the file data into binary and textual parts, allowing you to operate on the textual parts directly.
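The hex-string technique can be tried on an in-memory byte array before touching a real file. The following is a minimal sketch: the sample bytes are invented for illustration, and the replacement 'ASDF' is deliberately the same length as the search string, in line with the first note above.

```powershell
# Invented sample: non-text bytes surrounding the UTF-8 text 'asdf' (61-73-64-66).
[byte[]] $data = 0x00, 0xFF, 0x61, 0x73, 0x64, 0x66, 0x80, 0x01

# Hex-string representation of the data and of the search / replacement strings.
$hex        = [BitConverter]::ToString($data)
$searchHex  = [BitConverter]::ToString([Text.Encoding]::UTF8.GetBytes('asdf'))
$replaceHex = [BitConverter]::ToString([Text.Encoding]::UTF8.GetBytes('ASDF'))

# Replace on the hex string, then convert back to bytes.
$hex = $hex.Replace($searchHex, $replaceHex)
[byte[]] $modified = $hex -split '-' -replace '^', '0x'

[BitConverter]::ToString($modified)
```

The non-text bytes (0x00, 0xFF, 0x80, 0x01) pass through untouched because they are never decoded as characters.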
Optional reading: Modifying a UTF-8 file without incidental alterations:
Note:
The following applies to text-only files that are UTF-8-encoded.
Unless extra steps are taken, reading and re-saving such files in PowerShell can result in unwanted incidental changes to the file. Avoiding them is discussed below.
PowerShell never preserves information about the character encoding of an input file, such as one read with Get-Content. Also, unless you use -Raw, information about the specific newline format is lost, as well as whether the file had a trailing newline or not.
Assuming that you know the encoding:
Read the file with Get-Content -Raw and specify the encoding with -Encoding (if necessary). You'll receive the file's content as a single, multi-line .NET string.
Use Set-Content -NoNewLine to save the modified string back to the file, using -Encoding with the original encoding.
Caveat: In Windows PowerShell, -Encoding utf8 invariably creates a UTF-8 file with BOM, unlike in PowerShell (Core) 7+, which defaults to BOM-less UTF-8 and requires you to use -Encoding utf8BOM if you want a BOM.
If you're using Windows PowerShell and do not want a UTF-8 BOM, use $null = New-Item -Force ... as a workaround, and pass the modified string to the -Value parameter.
Therefore:
$ThisFile = 'B:\Users\Desktop\onenote-new-entry.txt'
$data = Get-Content -Raw -Encoding utf8 $ThisFile
$data = $data.Replace('asdf','TESTREPLACE!')
# !! Note the caveat re BOM mentioned above.
$data | Set-Content -NoNewLine -Encoding utf8 $ThisFile
Streamlined reformulation, in a single pipeline:
(Get-Content -Raw -Encoding utf8 $ThisFile) |
ForEach-Object Replace 'asdf', 'TESTREPLACE!' |
Set-Content -NoNewLine -Encoding utf8 $ThisFile
With the New-Item workaround, if the output file mustn't have a BOM:
(Get-Content -Raw -Encoding utf8 $ThisFile) |
ForEach-Object Replace 'asdf', 'TESTREPLACE!' |
New-Item -Force $ThisFile |
Out-Null # suppress New-Item's output (a file-info object)
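A different way to sidestep the BOM question, not taken from the answer above, is to call the .NET file APIs directly with an explicit BOM-less UTF-8 encoder. This is a minimal sketch under that assumption; the temp-file name is hypothetical and the file is created by the sketch itself.

```powershell
# Hypothetical demo file in the temp directory.
$path = Join-Path ([IO.Path]::GetTempPath()) 'utf8-nobom-demo.txt'

# $false = do not emit a BOM when writing.
$utf8NoBom = New-Object System.Text.UTF8Encoding $false

# Create a sample file, then read, modify, and rewrite it.
[IO.File]::WriteAllText($path, "asdf`r`nline 2", $utf8NoBom)
$text = [IO.File]::ReadAllText($path, $utf8NoBom)
[IO.File]::WriteAllText($path, $text.Replace('asdf', 'TESTREPLACE!'), $utf8NoBom)

# Inspect the first three bytes; a UTF-8 BOM would be EF-BB-BF.
$firstBytes = [IO.File]::ReadAllBytes($path)[0..2]
[BitConverter]::ToString($firstBytes)
```

Because the whole file is handled as one string, the existing newline format and the absence of a trailing newline are preserved as well. Note that the .NET methods resolve relative paths against the process working directory, which can differ from PowerShell's current location, so full paths are the safer choice.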
I have the following powershell:
# Find all .csproj files
$csProjFiles = get-childitem ./ -include *.csproj -recurse
# Remove the packages.config include from the csproj files.
$csProjFiles | foreach ($_) {(get-content $_) |
select-string -pattern '<None Include="packages.config" />' -notmatch |
Out-File $_ -force}
And it seems to work fine. The line with the packages.config is not in the file after I run.
But after I run there is an extra newline at the TOP of the file. (Not the bottom.)
I am confused as to how that is getting there. What can I do to get rid of the extra newline char that this generates at the top of the file?
UPDATE:
I swapped out to a different way of doing this:
$csProjFiles | foreach ($_) {$currentFile = $_; (get-content $_) |
Where-Object {$_ -notmatch '<None Include="packages.config" />'} |
Set-Content $currentFile -force}
It works fine and does not have the extra line at the top of the file. But I wouldn't mind knowing why the top example was adding the extra line.
Out-File and redirection operators > / >> take arbitrary input objects and convert them to string representations as they would present in the console - that is, PowerShell's default output formatting is applied - and send those string representations to the output file.
These string representations often have leading and/or trailing newlines for readability.
See Get-Help about_Format.ps1xml to learn more.
Set-Content is for input objects that are already strings or should be treated as strings.
PowerShell calls .psobject.ToString() on all input objects to obtain the string representation, which in most cases defers to the underlying .NET type's .ToString() method.
The resulting representations are typically not the same, and it's important to know when to choose which cmdlet / operator.
Additionally, the default character encodings differ:
Out-File and > / >> default to UTF-16 LE, which PowerShell calls Unicode in the context of the optional -Encoding parameter.
Set-Content defaults to your system's legacy "ANSI" code page (a single-byte, extended-ASCII code page), which PowerShell calls Default.
Note that the docs as of PSv5.1 mistakenly claim that the default is ASCII.[1]
To change the encoding:
Ad-hoc change: Use the -Encoding parameter with Out-File or Set-Content to control the output character encoding explicitly.
You cannot change the encoding used by > / >> ad-hoc, but see below.
[PSv3+] Changing the default (use with caution): Use the $PSDefaultParameterValues mechanism (see Get-Help about_Parameters_DefaultValues), which enables setting default values for parameters:
Changing the default encoding for Out-File also changes it for > / >> in PSv5.1 or above[2].
To change it to UTF-8, for instance, use:
$PSDefaultParameterValues['Out-File:Encoding']='UTF8'
Note that in PSv5.0 or below you cannot change what encoding > and >> use.
If you change the default for Set-Content, be sure to change it for Add-Content too:
$PSDefaultParameterValues['Set-Content:Encoding'] = $PSDefaultParameterValues['Add-Content:Encoding'] ='UTF8'
You can also use wildcard patterns to represent the cmdlet / advanced function name to apply the default parameter value to; for instance, if you used $PSDefaultParameterValues['*:Encoding']='UTF8', then all cmdlets that have an -Encoding parameter would default to that value, but that is ill-advised, because in some cmdlets the -Encoding refers to the input encoding.
There is no single shared prefix among cmdlets that write to files that allows you to target all output cmdlets, but you can define a pattern for each of the verbs:
$enc = 'UTF8'; $PSDefaultParameterValues += @{ 'Out-*:Encoding'=$enc; 'Set-*:Encoding'=$enc; 'Add-*:Encoding'=$enc; 'Export-*:Encoding'=$enc }
Caveat: $PSDefaultParameterValues is defined in the global scope, so any modifications you make to it take effect globally, and affect subsequent commands.
To limit changes to a script / function's scope and its descendent scopes, use a local $PSDefaultParameterValues variable, which you can either initialize to an empty hashtable to start from scratch ($PSDefaultParameterValues = @{}), or initialize to a clone of the global value ($PSDefaultParameterValues = $PSDefaultParameterValues.Clone()).
Caveats:
Using the utf8 encoding in Windows PowerShell invariably creates UTF-8 files with a BOM. (Commendably, in PowerShell [Core] v6+ it does not, and this edition even consistently defaults to BOM-less UTF-8; however, you can create a BOM on demand with utf8BOM.)
However, if you're running Windows 10 and you're willing to switch to BOM-less UTF-8 encoding system-wide - which can have side effects - even Windows PowerShell can be made to use BOM-less UTF-8 consistently - see this answer.
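The scoping advice above (a local clone of $PSDefaultParameterValues) could look roughly like this. It is a minimal sketch: Write-Utf8Log is a hypothetical helper name, and the temp-file path is made up for the demonstration.

```powershell
function Write-Utf8Log {   # hypothetical helper, for illustration only
    param([string] $Path, [string] $Text)

    # Local copy: changes apply to this function and its descendant scopes only.
    $PSDefaultParameterValues = $PSDefaultParameterValues.Clone()
    $PSDefaultParameterValues['Out-File:Encoding'] = 'UTF8'

    # Out-File picks up the locally defined default encoding.
    $Text | Out-File $Path
}

$logPath = Join-Path ([IO.Path]::GetTempPath()) 'scoped-defaults-demo.txt'
Write-Utf8Log -Path $logPath -Text 'hello'

# The global table is untouched by the function call.
$PSDefaultParameterValues.ContainsKey('Out-File:Encoding')
```

Keeping the override inside the function avoids surprising other code in the session that relies on the stock defaults.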
In the case at hand, the output objects are [Microsoft.PowerShell.Commands.MatchInfo] instances output by Select-String:
Using default formatting, as happens with Out-File, they output an empty line above, and two empty lines below (with multiple instances printing in a contiguous block between a single set of the empty lines above and below).
If you call .psobject.ToString() on them, as happens with Set-Content, they evaluate to just the matching lines (with no origin-path prefix, given that input was provided via the pipeline rather than as filenames via the -Path / -LiteralPath parameters), with no leading or trailing empty lines.
That said, had you piped to | Select-Object -ExpandProperty Line or simply | ForEach-Object Line in order to explicitly output just the matching lines as strings, both Out-File and Set-Content would have yielded the same result (except for their default encoding).
P.S.: LotPing's observation is correct: You seem to be confusing the foreach statement with the ForEach-Object cmdlet (which, regrettably, is also known by built-in alias foreach, causing confusion).
The ForEach-Object cmdlet doesn't need an explicit definition for $_: in the (implied -Process) script block you pass to it, $_ is automatically defined to be the input object at hand.
Your ($_) argument to foreach (ForEach-Object) is effectively ignored because it evaluates to $null: the automatic variable $_, when used outside of special contexts such as script blocks in the pipeline, evaluates to $null, and putting (...) around it makes no difference, so you're effectively passing $null, which is ignored.
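Putting the points above together, the task from the question could be written as follows. This is a minimal sketch run against a throwaway project file created in a hypothetical temp directory, so it does not touch real .csproj files; the dot in the pattern is escaped because -notmatch takes a regex.

```powershell
# Hypothetical sandbox directory with one sample project file.
$dir = Join-Path ([IO.Path]::GetTempPath()) 'csproj-demo'
$null = New-Item -ItemType Directory -Force $dir
$proj = Join-Path $dir 'Demo.csproj'
'<Project>', '  <None Include="packages.config" />', '</Project>' | Set-Content $proj

# ForEach-Object defines $_ automatically; no ($_) argument is needed.
Get-ChildItem $dir -Filter *.csproj -Recurse | ForEach-Object {
    $path = $_.FullName
    # Where-Object passes plain strings through, so Set-Content adds no blank lines.
    (Get-Content $path) |
        Where-Object { $_ -notmatch '<None Include="packages\.config" />' } |
        Set-Content $path
}

$result = Get-Content $proj
$result
```

The parentheses around Get-Content ensure the file is read completely before Set-Content reopens it for writing.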
[1] Verify that ASCII is not the default as follows: '0x{0:x}' -f $('ä' | Set-Content t.txt; $b=[System.IO.File]::ReadAllBytes("$PWD\t.txt")[0]; ri t.txt; $b) yields 0xe4 on an en-US system, which is the Windows-1252 code point for ä (which coincides with the Unicode codepoint, but the output is a single-byte-encoded file with no BOM).
If you use -Encoding ASCII explicitly, you get 0x3f, the code point for literal ?, because that's what using ASCII converts all non-ASCII chars. to.
[2] PetSerAl found the source-code location that shows that > and >> are effective aliases for Out-File [-Append], and he points out that redefining Out-File therefore also redefines > / >>; similarly, specifying a default encoding via $PSDefaultParameterValues for Out-File also takes effect for > / >>.
Windows PowerShell v5.1 is the minimum version that works this way.
Tip of the hat to PetSerAl for his help.
When using the Write-Output command, a new line is automatically appended. How can I write strings to stdout (the standard output) without a newline?
For example:
powershell -command "write-output A; write-output B"
Outputs:
A
B
(Write-Host is no good - it writes data to the console itself, not to the stdout stream)
Write-Output writes objects down the pipeline, not text as in *nix for example. It doesn't do any kind of text formatting such as appending newlines, hence no need for newline handling options. I see people very often not coming to grips with this.
If you are referring to the newlines printed on the console output, it's because the pipeline is always eventually terminated by Out-Default, which forwards to a default output target (usually Out-Host), which in turn, if it doesn't receive a formatted input, runs the objects through an appropriate default formatter (usually Format-List or Format-Table). The formatter here is the only one in the process responsible for formatting the output, e.g. printing each object on a new line for the console output.
You can override this default behavior by specifying the formatter of your liking at the end of the pipeline, including your own using Format-Custom.
Write-Output is not appending the newlines.
Try this:
filter intchar {[int[]][char[]]$_}
'123' | Write-Output | intchar
49
50
51
The filter is converting the string to the ASCII int representation of each character. There is no newline being appended.
Adding a couple of explicit newlines for comparison:
"1`n2`n3" | write-output | intchar
49
10
50
10
51
Now we see the additional newlines between the characters, but still no newline appended to the string.
Not sure what your application is, but if you're getting unwanted newlines in your output, I don't think it's Write-Output that's doing it.
mjolinor and famousgarkin explain why the output has a newline that is not itself generated by Write-Output. A simple approach to deal with this is to build up your output string from the Write-Output results and emit it once:
$text = ("This","is","some","words") -join " ";
$string = Write-Output $text
$string += Write-Output $text
$string
Output
This is some wordsThis is some words
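The claim made in these answers, that Write-Output passes strings through unchanged, can be checked directly. A minimal sketch:

```powershell
# Capture what Write-Output sends down the pipeline.
$out = Write-Output 'A'

$out.Length           # length of the captured string
$out -match '\r|\n'   # whether it contains a CR or LF character

# Concatenate several outputs into one string before it reaches the formatter.
$combined = -join @((Write-Output 'A'), (Write-Output 'B'))
$combined
```

Since the line breaks are added only when the objects are rendered, combining the pieces into a single string first leaves just one line break at the very end of the rendered output.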
I'm trying to do some processing logic - running some commands in parallel based on the tree configuration CSV file:
Operation;Parent;Enabled;Propagated;Job_ID;Status;Started;Finished
CA1;n/a;Y;N;;;;
PROD1;n/a;Y;N;;;Y;
CON1;CA1;N;N;;;Y;
CON2;CON1;N;N;;;Y;
I load the file into the variable and then I'm trying to find the next step which needs to be processed:
$Data = Import-Csv -delimiter ";" .\config.csv
$NextStep = $Data | Select-Object -first 1 | Where-Object {$_.Started -eq ""}
$NextStepText = $NextStep.Operation | ft -autosize | out-string
The problem is that it seems like $NextStep.Operation contains new line character. When I display it I get:
PS C:\temp\SalesForce> $NextStep.operation
CA1
PS C:\temp\SalesForce> $NextStep.Operation.Contains("`n")
False
Do you know what I'm doing wrong? I would like to display the content without the "dummy" new line character which is there even if contains method is saying it is not there.
Or please advise how to do it better. I'm still learning PowerShell; so far I just google the commands, and I'm trying to put it together.
The newline isn't in your data, it's being added by Out-String. Observe the output of the following (in particular, where you do and don't get the newline after CA1):
$Data = import-csv -delimiter ";" .\config.csv
$NextStep = $Data | select-object -first 1 | where-object {$_.Started -eq ""}
$NextStepText = $NextStep.Operation | ft -autosize | out-string
"hi"
$NextStepText
"hi"
$NextStep.Operation;
"hi"
$NextStep.Operation | ft -autosize
"hi"
You shouldn't be using Format-Table at that step (and Out-String is unnecessary in this script) if you intend to use $NextStepText for anything other than direct output later on. Consider Format-Table (or any of the Format-* cmdlets) the end of the line for usable data.
Why do you think that there is a new line character of some sort in there? If you are using the ISE then what you posted doesn't look like there is. It is normal to have a blank line between commands (in the v2/v3 ISE, not sure about v4), so what you posted would not indicate that it contains any new line characters.
You can always check the $NextStep.Operation.Length to see if it says 3 or 4. If there is a `n in there it'll show up in the length. For example (copied and pasted out of my v3 PS ISE):
PS C:\> $test = "Test`nTest2"
PS C:\> $test
Test
Test2
PS C:\> $test.Length
10
PS C:\>
That was to show that there is a new line character injected by following it with text, without any text following the new line character it looks like this:
PS C:\> $test = "Test`n"
PS C:\> $test
Test
PS C:\> $test.Length
5
PS C:\>
You'll notice that there are 2 blank lines after the text "Test" on the second command. The first is the line injected into the variable, and the second is the obligatory line that PS puts in to show separation between commands.
Out-String unexpectedly appends a trailing newline to the string it outputs.
This problematic behavior is discussed in GitHub issue #14444.
A simple demonstration:
# -> '42<newline>'
(42 | Out-String) -replace '\r?\n', '<newline>'
However, you neither need Format-Table nor Out-String in your code:
Format-* cmdlets output objects whose sole purpose is to provide formatting instructions to PowerShell's for-display output-formatting system. In short: only ever use Format-* cmdlets to format data for display, never for subsequent programmatic processing - see this answer for more information.
Out-String is capable of interpreting these formatting instructions, i.e. it does produce data - in the form of a single, multi-line string by default - that is the string representation of what would print to the display.
As such, the resulting string contains a representation for the human observer, not a structured text format suitable for programmatic processing.
In your case, Format-Table is applied to a string, which is pointless, because strings always render as themselves, in full (-AutoSize has no effect); piping to Out-String then in effect returns the original string with an (undesired) newline appended.
Therefore, use a simple variable assignment to store the property value of interest in a separate variable:
$NextStepText = $NextStep.Operation
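The effect can be reproduced without the config file by feeding the CSV lines in directly. This is a minimal sketch: the data is copied from the question, and the filter is applied before Select-Object -First 1 on the assumption that the intent is "the first row that has not been started".

```powershell
# CSV lines from the question, supplied in memory instead of via config.csv.
$csvLines = 'Operation;Parent;Enabled;Propagated;Job_ID;Status;Started;Finished',
            'CA1;n/a;Y;N;;;;',
            'PROD1;n/a;Y;N;;;Y;'
$Data = $csvLines | ConvertFrom-Csv -Delimiter ';'

# Assumed intent: filter first, then take the first match.
$NextStep = $Data | Where-Object { $_.Started -eq '' } | Select-Object -First 1

$plain        = $NextStep.Operation               # plain property value
$viaOutString = $NextStep.Operation | Out-String  # same text plus a trailing newline

$plain.Length
$viaOutString.Length
```

Comparing the two lengths shows the extra newline characters that Out-String appends.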
I'm running a PowerShell script against many servers, and it is logging output to a text file.
I'd like to capture the server the script is currently running on. So far I have:
$file = "\\server\share\file.txt"
$computername = $env:computername
$computername | Add-Content -Path $file
This last line adds question marks in the output file. Oops.
How do I output a variable to a text file in PowerShell?
The simplest Hello World example...
$hello = "Hello World"
$hello | Out-File c:\debug.txt
Note: The answer below is written from the perspective of Windows PowerShell.
However, it applies to the cross-platform PowerShell (Core) v6+ as well, except that the latter - commendably - consistently defaults to BOM-less UTF-8 as the character encoding, which is the most widely compatible one across platforms and cultures.
To complement bigtv's helpful answer with a more concise alternative and background information:
# > $file is effectively the same as | Out-File $file
# Objects are written the same way they display in the console.
# Default character encoding is UTF-16LE (mostly 2 bytes per char.), with BOM.
# Use Out-File -Encoding <name> to change the encoding.
$env:computername > $file
# Set-Content calls .ToString() on each object to output.
# Default character encoding is "ANSI" (culture-specific, single-byte).
# Use Set-Content -Encoding <name> to change the encoding.
# Use Set-Content rather than Add-Content; the latter is for *appending* to a file.
$env:computername | Set-Content $file
When outputting to a text file, you have 2 fundamental choices that use different object representations and, in Windows PowerShell (as opposed to PowerShell Core), also employ different default character encodings:
Out-File (or >) / Out-File -Append (or >>):
Suitable for output objects of any type, because PowerShell's default output formatting is applied to the output objects.
In other words: you get the same output as when printing to the console.
The default encoding, which can be changed with the -Encoding parameter, is Unicode, which is UTF-16LE in which most characters are encoded as 2 bytes. The advantage of a Unicode encoding such as UTF-16LE is that it is a global alphabet, capable of encoding all characters from all human languages.
In PSv5.1+, you can change the encoding used by > and >>, via the $PSDefaultParameterValues preference variable, taking advantage of the fact that > and >> are now effectively aliases of Out-File and Out-File -Append. To change to UTF-8 (invariably with a BOM, in Windows PowerShell), for instance, use:
$PSDefaultParameterValues['Out-File:Encoding']='UTF8'
Set-Content / Add-Content:
For writing strings and instances of types known to have meaningful string representations, such as the .NET primitive data types (Booleans, integers, ...).
The .psobject.ToString() method is called on each output object, which results in meaningless representations for types that don't explicitly implement a meaningful representation; [hashtable] instances are an example:
@{ one = 1 } | Set-Content t.txt writes literal System.Collections.Hashtable to t.txt, which is the result of @{ one = 1 }.ToString().
The default encoding, which can be changed with the -Encoding parameter, is Default, which is the system's active ANSI code page, i.e. the single-byte culture-specific legacy encoding for non-Unicode applications, which is most commonly Windows-1252.
Note that the documentation currently incorrectly claims that ASCII is the default encoding.
Note that Add-Content's purpose is to append content to an existing file, and it is only equivalent to Set-Content if the target file doesn't exist yet.
If the file exists and is nonempty, Add-Content tries to match the existing encoding.
Out-File / > / Set-Content / Add-Content all act culture-sensitively, i.e., they produce representations suitable for the current culture (locale), if available (though custom formatting data is free to define its own, culture-invariant representation - see Get-Help about_format.ps1xml).
This contrasts with PowerShell's string expansion (string interpolation in double-quoted strings), which is culture-invariant - see this answer of mine.
As for performance:
Since Set-Content doesn't have to apply default formatting to its input, it performs better, and therefore is the preferred choice if your input is composed of strings and/or of objects whose default stringification via the standard .NET .ToString() method is sufficient.
As for the OP's symptom with Add-Content:
Since $env:COMPUTERNAME cannot contain non-ASCII characters (or verbatim ? characters), Add-Content's addition to the file should not result in ? characters, and the likeliest explanation is that the ? instances were part of the preexisting content in output file $file, which Add-Content appended to.
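The difference between the two cmdlet families described above can be seen by writing the same hashtable with each. A minimal sketch; the temp-file names are made up for the demonstration.

```powershell
# Made-up demo files in the temp directory.
$dir = [IO.Path]::GetTempPath()
$viaSetContent = Join-Path $dir 'demo-set-content.txt'
$viaOutFile    = Join-Path $dir 'demo-out-file.txt'

# Set-Content stringifies via .ToString(); Out-File applies default output formatting.
@{ one = 1 } | Set-Content $viaSetContent
@{ one = 1 } | Out-File $viaOutFile

$setContentText = Get-Content $viaSetContent
$outFileText    = (Get-Content $viaOutFile) -join "`n"

$setContentText   # the type name only
$outFileText      # the Name / Value table as shown in the console
```

For plain strings such as $env:COMPUTERNAME both approaches write the same text, so the choice then mostly comes down to default encoding and performance.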
After some trial and error, I found that
$computername = $env:computername
works to get a computer name, but sending $computername to a file via Add-Content doesn't work.
I also tried $computername.Value.
Instead, if I use
$computername = get-content env:computername
I can send it to a text file using
$computername | Out-File $file
Your sample code seems to be OK. Thus, the root problem needs to be dug up somehow. Let's eliminate chance for typos in the script. First off, make sure you put Set-Strictmode -Version 2.0 in the beginning of your script. This will help you to catch misspelled variable names. Like so,
# Test.ps1
set-strictmode -version 2.0 # Comment this line and no error will be reported.
$foo = "bar"
set-content -path ./test.txt -value $fo # Error! Should be "$foo"
PS C:\temp> .\test.ps1
The variable '$fo' cannot be retrieved because it has not been set.
At C:\temp\test.ps1:3 char:40
+ set-content -path ./test.txt -value $fo <<<<
+ CategoryInfo : InvalidOperation: (fo:Token) [], RuntimeException
+ FullyQualifiedErrorId : VariableIsUndefined
The next part about question marks sounds like you have a problem with Unicode. What's the output when you type the file with Powershell like so,
$file = "\\server\share\file.txt"
cat $file
Here is an easy one:
$myVar > "c:\myfilepath\myfilename.myextension"
You can also try:
Get-content "c:\someOtherPath\someOtherFile.myextension" > "c:\myfilepath\myfilename.myextension"