I have an array of objects in PowerShell. It was working, but now when I do an Export-Csv on the array, the property and value names are transformed like:
Account_No -> +ACI-Account+AF8-No+ACI-
Does anyone know why it is doing this?
Thanks
I am using PS 5.1, and the command is:
$rowsWithErrs | Export-Csv -Path $rowErrCsvPath -NoTypeInformation -Encoding UTF7
It looks like there isn't anything wrong with what you are doing. Everything is getting sent out in the format that you are expecting.
The only problem is that the application that you are using to view your data is not using the same encoding that was used to write the data.
The extra characters are what you see when text that was encoded as UTF-7 is interpreted as UTF-8 (or something similar or compatible with UTF-8, which is the standard for most systems).
Example:
> "Account_No" | Out-File -FilePath test.txt -Encoding UTF7
> Get-Content test.txt -Encoding UTF8
Account+AF8-No
> Get-Content test.txt -Encoding UTF7
Account_No
If reading the CSV data in PowerShell, you can do the following:
> $csv = Import-Csv -Path $filepath -Encoding UTF7
If reading the CSV data in Excel: on the Data tab, select From Text/CSV, then at the top of the import window set File Origin to 65000: Unicode (UTF-7).
For other applications like VS Code or Notepad++ you may be out of luck if you want to view the data there because it looks like they do not support UTF-7 encoding.
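If you need those editors to be able to open the data, one option (a sketch, not part of the original answer; the '.utf8.csv' suffix is just a hypothetical output path) is to re-save the file as UTF-8:
# Read the UTF-7 file and write a UTF-8 copy alongside it (hypothetical output path)
Import-Csv -Path $rowErrCsvPath -Encoding UTF7 |
    Export-Csv -Path ($rowErrCsvPath + '.utf8.csv') -NoTypeInformation -Encoding UTF8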
I am using Get-WinEvent to convert an .evtx log to a .json file, which I then send to ELK:
Get-WinEvent -Path .\log.evtx | ConvertTo-Json | Format-List | Out-File log.json
The file looks like a normal text file on Windows, but when I take it to Linux it contains binary data and cannot be parsed by ELK.
Even if I use Out-String, nothing changes:
$result = Get-WinEvent -Path .\user-creation-1log.evtx | ConvertTo-Json | Format-List
$result | Out-String | Out-File log.json
This also appears to be a binary file on Linux. (I remember Export-Csv with Get-WinEvent created a complete text file, but that makes a really ugly formatted CSV file.) I really like the way ConvertTo-Json formats and values the JSON data and would prefer it. (If someone can provide a different way to convert the evtx data in its fullest form to JSON, I am happy to take it.)
I've tried the evtx2csv Python module, but that doesn't write output to a file.
First, don't use Format-List if you intend to export JSON. This is only for formatting objects as a nice visual representation in the console.
Also, I don't use Linux, but I guess it's safest to specify utf8 as encoding explicitly to make sure it's compatible:
Get-WinEvent -Path .\log.evtx | ConvertTo-Json | Out-File log.json -Encoding utf8
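If nested event properties come out truncated, you may also want to raise -Depth (ConvertTo-Json defaults to a depth of 2). A sketch, with 5 chosen arbitrarily:
Get-WinEvent -Path .\log.evtx | ConvertTo-Json -Depth 5 | Out-File log.json -Encoding utf8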
I am trying to output all the file names in a directory to a file. It seems simple, but in the future I will be creating useful information based on the file names and outputting it to a file for another system.
When I output the information to a file, it shows as gibberish when I open it in Notepad. Outputting to the screen looks fine.
Here is my code:
$files = Get-ChildItem "s:\centmobile\rates\currentrates\forupload\"
$outfile = "s:\centmobile\rates\currentrates\test.txt"
"New File"|Out-File $outfile -Encoding ascii
foreach ($f in $files) {
    Get-Content $f.FullName | Add-Content $outfile -Encoding Ascii
    Write-Output $f.FullName
}
Screen output looks good:
PS C:\Windows\System32\WindowsPowerShell\v1.0> S:\CentMobile\Software\Dev\cre8hdrinfo.ps1
S:\centmobile\rates\currentrates\forupload\2019406BICS_BC_Rates_ForUpload.xlsx
S:\centmobile\rates\currentrates\forupload\2019406BICS_FC_Rates_ForUpload.xlsx
The file output does not look so good:
New File
PK ! –~íGq % [Content_Types].xml ¢( ¬”ËNÃ0E÷HüCä-Jܲ#5í‚Ç
A%ʘxÒXqlËã–öU(4BÍ&–ã™{O&3žÌ6NÖàQY“³q6b ˜ÂJe–9û\¼¤÷,Á ŒÚÈÙͦ×W“ÅÖ&”m0gUîs,*hfÖ¡“ÒúFÚú%w¢¨ÅøíhtÇk˜†VƒM'OPŠ•Éó†^ïH<hdÉã.°õÊ™pN«B"åk#¹¤{‡Œ2cVÊá
a0ÞéОüm°Ï{§Òx%!™ÞDC|£ù·õõ—µuv^¤ƒÒ–¥*#ÚbÕP2t„Ä
4:‹kÖeÜgüc0ò¸Œi¿/
÷p ý0o„~U¦F~ºšèT»*PÏË)¢L!†º¢hŸs%<Èài\8Õ>ÇAÍ<÷Ö!µ‡ÿWá0·mvêH|PpœÜ® 8:Ò•pqÙÛÎ2d‡7—Üô ÿÿ PK ! µU0#ô L _rels/.rels ¢( ¬’MOÃ0†ïHü‡È÷ÕÝBKwAH»!T~€Iܵ£$Ý¿'TƒG½~üÊÛÝ<êÈ!öâ4¬‹;#¶w†—úqu*&r–Fq¬áÄvÕõÕö™GJy(v½*«¸¨¡KÉß#FÓñD±Ï.W ¥†=™ZÆMYÞbø®ÕBS톰·7 ê“Ï›×–¦é
?ˆ9LìÒ™ÈsbgÙ®|Èl!õùUSh9i°bžr:"y_dlÀóD›¿ý|-NœÈR"4ø2ÏGÇ% õZ´4ñËyÄ7 ëÈðÉ‚‹¨Þ ÿÿ PK ! …ë — † xl/workbook.xml¬Umo›:þ~¥ýÆýLÁ¼ƒ’LI ÝJÛT¥]÷¥Òä€S¬æÓ¤ªößwlBš.ÕÔu‹ˆß?çœç&vu¥ÝÞQÖLutféirVÐævª¹ÊŒP×:›W¬!Sýtú‡Ù»&[ÆïÖŒÝi ÐtS½¢M³ËKRã¤•
ã50ä·f×r‚‹®$DÔ•i[–oÖ˜6ú€ó×`°Í†æ$ay_“F œTX ý®¤m7¢ÕùkàjÌïúÖÈYÝÄšVT<(P]«óøü¶a¯+0{‡<mÇáñá,hìñ&X:¹ª¦9gÛˆ3€6Ò'ö#ËDè™v§>x’krrOe¬¸ÿFVþËCÖ£!–ÒJÎ{#šwàfë³É†Väz®†Ûö3®e¤*]«p'Ò‚
RLõ †lKžMð¾]ô´‚U¹¶§›³ƒœ/¸V
...
The reason your screen output and your file look very different is that you're not outputting the same content to the screen and to the file at all.
With:
Get-Content $f.FullName | Add-Content $outfile -Encoding Ascii
you are, as the command implies, getting the content of every file and outputting to $outfile.
While with:
Write-Output $f.FullName
You are just outputting the list of file names to screen.
As your question says it's the filenames you're after, just change:
Get-Content $f.FullName | Add-Content $outfile -Encoding Ascii
to:
$f.FullName | Add-Content $outfile -Encoding Ascii
and it should output the same thing to screen as to the file.
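Putting it together, a minimal sketch of the corrected script (same paths as in your question):
$files = Get-ChildItem "s:\centmobile\rates\currentrates\forupload\"
$outfile = "s:\centmobile\rates\currentrates\test.txt"
"New File" | Out-File $outfile -Encoding ascii
foreach ($f in $files) {
    # Write the file *name*, not the file's contents, to both the output file and the screen
    $f.FullName | Add-Content $outfile -Encoding Ascii
    Write-Output $f.FullName
}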
A good way to check/troubleshoot here would've been to just remove everything after:
Get-Content $f.FullName
and look at the output, which will look very similar to the file and give you a hint that something's wrong there.
The Get-Content cmdlet returns strings or bytes (strings in your case). The gibberish you are getting comes from interpreting the binary content of .xlsx files as ASCII strings (an unsolvable mojibake case).
Resources (required reading; the list is incomplete):
From fileformatcommons.com:
xlsx files are actually zip files in disguise…
xlsx file character encoding / charset is binary
From .ZIP File Format Specification
local file header signature (4 bytes) 0x04034b50
From Zip (file format) at Wikipedia:
Most of the signatures end with the short integer 0x4b50, which is stored in little-endian ordering. Viewed as an ASCII string this reads "PK", the initials of the inventor Phil Katz. Thus, when a ZIP file is viewed in a text editor the first two bytes of the file are usually "PK".
I came across a little issue when dealing with CSV exports that contain mutated vowels like ä, ö, ü (German umlauts).
I simply export with:
Get-WinEvent -FilterHashtable @{Path=$_;ID=4627} -ErrorAction SilentlyContinue | Export-Csv -NoTypeInformation -Encoding Default -Force ("c:\temp\CSV_temp\" + $_.basename + ".csv")
which works fine. I have the ä, ö, ü in my CSV file correctly.
after that i do a little sorting with:
Get-ChildItem 'C:\temp\*.csv' |
ForEach-Object { Import-Csv $_.FullName } |
Sort-Object { [DateTime]::ParseExact($_.TimeCreated, $pattern, $culture) } |
Export-Csv 'C:\temp\merged.csv' -Encoding Default -NoTypeInformation -Force
I played around with all the encodings (ASCII, BigEndianUnicode, the Unicode variants) with no success.
How can I preserve the special characters ä, ö, ü and others when exporting and sorting?
Mathias R. Jessen provides the crucial pointer in a comment on the question:
It is the Import-Csv call, not Export-Csv, that is the cause of the problem in your case:
Like Export-Csv, Import-Csv too needs to be passed -Encoding Default in order to properly process text files encoded with the system's active "ANSI" legacy code page, which is an 8-bit, single-byte character encoding such as Windows-1252.
In Windows PowerShell, even though the generic text-file processing Get-Content / Set-Content cmdlet pair defaults to Default encoding (as the name suggests), regrettably and surprisingly, Import-Csv and Export-Csv do not.
Note that, on reading, a given encoding is only assumed if the input file has no BOM (byte-order mark, a.k.a. Unicode signature, a magic byte sequence at the start of the file that unambiguously identifies the file's encoding).
Not only do Import-Csv and Export-Csv have defaults that differ from Get-Content / Set-Content, they individually have different defaults:
Import-Csv defaults to UTF-8.
Export-Csv defaults to ASCII(!), which means that any non-ASCII characters - such as ä, ö, ü - are transliterated to literal ? characters, resulting in loss of data.
By contrast, in PowerShell Core, the cross-platform edition built on .NET Core, the default encoding is (BOM-less) UTF-8, consistently, across all cmdlets, which greatly simplifies matters and makes it much easier to determine when you do need to use the -Encoding parameter.
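Applied to the pipeline in the question, that means adding -Encoding Default to the Import-Csv call as well (a sketch):
Get-ChildItem 'C:\temp\*.csv' |
    ForEach-Object { Import-Csv $_.FullName -Encoding Default } |
    Sort-Object { [DateTime]::ParseExact($_.TimeCreated, $pattern, $culture) } |
    Export-Csv 'C:\temp\merged.csv' -Encoding Default -NoTypeInformation -Force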
Demonstration of the Windows PowerShell Import-Csv / Export-Csv behavior
Import-Csv - defaults to UTF-8:
# Sample CSV content.
$str = @'
Column1
aäöü
'@
# Write sample CSV file 't.csv' using UTF-8 encoding *without a BOM*
# (Note that this cannot be done with standard PowerShell cmdlets.)
$null = new-item -type file t.csv -Force
[io.file]::WriteAllLines((Convert-Path t.csv), $str)
# Use Import-Csv to read the file, which correctly preserves the UTF-8-encoded
# umlauts
Import-Csv .\t.csv
The above yields:
Column1
-------
aäöü
As you can see, the umlauts were correctly preserved.
By contrast, had the file been "ANSI"-encoded ($str | Set-Content t.csv; -Encoding Default implied), the umlauts would have gotten corrupted.
Export-Csv - defaults to ASCII - risk of data loss:
Building on the above example:
Import-Csv .\t.csv | Export-Csv .\t.new.csv
Get-Content .\t.new.csv
yields:
"Column1"
"a???"
As you can see, the umlauts were replaced by literal question marks (?).
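To round out the demonstration, a sketch of the fix: specify a Unicode encoding on export so the umlauts survive the round trip.
Import-Csv .\t.csv | Export-Csv .\t.new.csv -Encoding UTF8
Get-Content .\t.new.csv -Encoding UTF8   # the data row now reads "aäöü" instead of "a???"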
I have the following code:
$databaseContents = "col1,col2,col3,col4"
$theDatabaseFile = "C:\NewFolder\Database.csv"
$databaseContents | Out-File $theDatabaseFile
However when I open the csv file in Excel, it has col1,col2,col3,col4 in cell A1 rather than col1 in cell A1, col2 in cell B1 etc.
Something strange I've noticed:
If I open the file in Notepad and copy the text into another Notepad instance and save it as Database1.csv, then open it in Excel, it displays as expected.
How can I get the Out-File commandlet to save it as a .csv file with the contents in 4 columns as expected?
EDIT:
I've noticed if I use Set-Content rather than Out-File, the csv file is then displayed correctly in Excel.
Does anyone know why this is?
Why it makes a difference to Excel I am not clear on, but it comes down to the encoding of the resulting output file - Unicode (in the cases that do not work) vs. ASCII (in the cases that do).
@JPBlanc's alternate approach works because it sets the encoding of the output file to ASCII, where your original example (implicitly) set the encoding of the output file to Unicode.
Just adding -Encoding ascii to your original example works fine too:
$databaseContents = "col1,col2,col3,col4"
$theDatabaseFile = "C:\NewFolder\Database.csv"
$databaseContents | Out-File $theDatabaseFile -Encoding ascii
And adding an explicit -Encoding unicode to your original example yields the same broken result you encountered:
$databaseContents = "col1,col2,col3,col4"
$theDatabaseFile = "C:\NewFolder\Database.csv"
$databaseContents | Out-File $theDatabaseFile -Encoding unicode
This is basically what was happening implicitly.
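A quick way to see the difference for yourself (a sketch; Format-Hex ships with PowerShell 5.0+, and the file names are just examples):
"col1,col2,col3,col4" | Out-File C:\NewFolder\unicode.csv                  # implicit default: UTF-16 LE ("Unicode")
"col1,col2,col3,col4" | Out-File C:\NewFolder\ascii.csv -Encoding ascii
Format-Hex C:\NewFolder\unicode.csv | Select-Object -First 1   # starts with FF FE (BOM), two bytes per character
Format-Hex C:\NewFolder\ascii.csv | Select-Object -First 1     # plain one-byte-per-character text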
This also works:
$databaseContents = "col1;col2;col3;col4"
$theDatabaseFile = "C:\temp\Database.csv"
$databaseContents | Out-File $theDatabaseFile -Encoding ascii
By default the CSV separator in Excel seems to be ';', and Out-File saves as Unicode, so forcing ASCII seems to give the result you are looking for.
I was having the same problem as Backwards_Dave, and just like him, using Set-Content instead of Out-File worked for me:
#Build $temp String
$temp = "$scriptApplication;$applicationName;$type;$description;$maxSessions;$enabled;$tempParams`n"
#Write $temp String to $fichierCsv file
"$temp" | Set-Content $fichierCsv
I tried using JPBlanc's and J0e3gan's solution (the -Encoding ascii option), but it did not work; I wonder why.
Background
I have a pipe delimited csv file that looks like this:
ColA|ColB|ColC|ColD|ColE|ColF|ColG|ColH|ColI|ColJ|ColK
00000000|000000507|0000000|STATUS|0000|000000|000|0000|00|0000|00000
00000000|000000500|0000000|STATUS|0000|000000|000|0000|00|0000|00000
00000000|000007077|0000000|STATUS|0000|000000|000|0000|00|0000|00000
I want to take ColB on lines with a certain status and put it in a headless csv file like this:
000000507,0000000001,0,0
000000500,0000000001,0,0
000007077,0000000001,0,0
The values 0000000001,0,0 on the end of the line are identical for every item.
The Script
The trimmed down/generic version of the script looks like this:
$infile = Import-Csv ".\incsv.txt" -Delimiter '|'
$outfile = ".\outcsv.txt"
Foreach($inline in $infile){
    If($inline.Status -ne 'Some Status'){
        $outline = $inline.'ColB' + ',0000000001,0,0'
        Add-Content $outfile $outline -Encoding ASCII
    }
}
The Problem
The problem is that the new file that is created is about twice the size it should be, and I know it has to do with the encoding. Unfortunately, I can't find an encoding that works. I've tried -Encoding ASCII, -Encoding Default, and -Encoding UTF8, but the file is too large with all of them.
This wouldn't be an issue, but the program that reads the created text file won't read it correctly.
What I can do is copy the text from the new file into Notepad and save it as ANSI, and then it works fine. ANSI isn't an option for the -Encoding parameter, though.
How do I get Powershell to output the correct file type? Is there a better way to approach this?
I've found this Google Groups conversation and this Social TechNet post, but neither one actually worked for me.
If the output file already exists and is in Unicode format the parameter -Encoding ASCII is ignored when appending to the file. Change your loop to this:
$infile | % {
    if ($_.Status -ne 'Some Status') {
        $_.'ColB' + ',0000000001,0,0'
    }
} | Set-Content $outfile -Encoding ASCII
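An optional sanity check (a sketch, reusing $outfile from the question): with one byte per character plus line endings, the resulting ASCII file should be roughly half the size of the old one, and it should no longer start with a byte-order mark.
(Get-Item $outfile).Length                    # roughly half the previous size
Format-Hex $outfile | Select-Object -First 1  # no FF FE byte-order mark at the start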