strange characters when opening a properties file - encoding

I need to update a properties file for a very old project. The file is supposed to contain Arabic text, but when I open it, it displays something like "Êã ÊÓÌíá ØáÈßã". I wrote a simple program that reads the correct Arabic values from the file:
Reader r = new InputStreamReader(new FileInputStream("C:\\Labels_ar.properties"), "Windows-1256");
BufferedReader buffered = new BufferedReader(r);
String line;
while ((line = buffered.readLine()) != null) {
System.out.println("line" + line);
}
But do you have any idea how I can open the file, edit it, and save the changes?

If, as you seem to think, the encoding is Windows-1256, there are editors that will do the job, such as EditPad Lite.
If it's not that, the first thing you need to find out is the encoding. Given it's a properties file, it may well be UTF-8 but the easiest way to find out is to get a hex dump of the file and post it here. Under Linux, I'd normally suggest using:
od -xcb Labels_ar.properties
but, given you're on Windows, that's not going to work so well (unless you have Cygwin installed).
So, if you have your own favourite hex dump program, just use that. Otherwise you can use the following PowerShell one:
function Pf-Dump-Hex-Item([byte[]] $data) {
$left = "+0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +A +B +C +D +E +F"
$right = "0123456789ABCDEF"
Write-Output "======== $left +$right"
$addr = 0
$left = "{0:X8} " -f $addr
$right = ""
# Now go through the input bytes
foreach ($byte in $data) {
# Add 2-digit hex number then filtered character.
$left += "{0:x2} " -f $byte
if (($byte -lt 0x20) -or ($byte -gt 0x7e)) { $byte = "." }
$right += [char] $byte
# Increment address and start new line if needed.
$addr++;
if (($addr % 16) -eq 0) {
Write-Output "$left $right"
$left = "{0:X8} " -f $addr
$right = "";
}
}
# Flush last line if needed.
$lastLine = "{0:X8}" -f $addr
if (($addr % 16) -ne 0) {
while (($addr % 16) -ne 0) {
$left += " "
$addr++;
}
Write-Output "$left $right"
}
Write-Output $lastLine
Write-Output ""
}
function Pf-Dump-Hex {
param(
[Parameter (Mandatory = $false, Position = 0)]
[string] $Path,
[Parameter (Mandatory = $false, ValueFromPipeline = $true)]
[Object] $Object
)
begin {
Set-StrictMode -Version Latest
# Create the array to hold content then do path if given.
[byte[]] $bytes = $null
if ($Path) {
$bytes = [IO.File]::ReadAllBytes((Resolve-Path $Path))
Pf-Dump-Hex-Item $bytes
}
}
process {
# Process each object (input/pipe).
if ($object) {
foreach ($obj in $object) {
if ($obj -is [Byte]) {
$bytes = $obj
} else {
$inpStr = [string] $obj
$bytes = [Text.Encoding]::Unicode.GetBytes($inpStr)
}
Pf-Dump-Hex-Item $bytes
}
}
}
}
If you load that into a PowerShell session then run:
pf-dump-hex Labels_ar.properties
that should allow you to evaluate the file encoding.
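If the hex dump confirms Windows-1256, as the Java snippet in the question suggests, one way to edit the file without a special editor is to read and write it from Java with that charset named explicitly. This is only a minimal sketch under that assumption: the class name `PropertiesRecoder`, the method `replaceValue` and the key `greeting` are hypothetical names used for illustration, and the `windows-1256` charset has to be available in your JRE.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PropertiesRecoder {
    // Assumption: the file really is Windows-1256, as the question's reader code suggests.
    static final Charset ARABIC = Charset.forName("windows-1256");

    // Hypothetical helper: replaces the value of one key and leaves every other line as it was.
    static void replaceValue(Path file, String key, String newValue) throws IOException {
        List<String> updated = new ArrayList<>();
        for (String line : Files.readAllLines(file, ARABIC)) {
            if (line.startsWith(key + "=")) {
                updated.add(key + "=" + newValue);
            } else {
                updated.add(line);
            }
        }
        // Write back with the same charset so the bytes stay in the original encoding.
        Files.write(file, updated, ARABIC);
    }

    public static void main(String[] args) throws IOException {
        // Demo on a temporary file so the real Labels_ar.properties is not touched.
        Path tmp = Files.createTempFile("Labels_ar", ".properties");
        Files.write(tmp, Arrays.asList("greeting=old", "other=unchanged"), ARABIC);
        // The new value is the Arabic word "marhaba" (hello), written as Unicode escapes.
        replaceValue(tmp, "greeting", "\u0645\u0631\u062D\u0628\u0627");
        System.out.println(Files.size(tmp) + " bytes written to " + tmp);
    }
}
```

Note that `Files.write` ends every line with the platform line separator, so line endings may change if the file came from another system. Work on a copy first.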

I think there are two possible problems:
1- I'm not sure whether System.out.println() can print Arabic characters, so try another output method, such as a message box (for example JOptionPane.showMessageDialog() in Swing), to make sure the problem really is in reading the file.
2- If the message box shows the same result, the problem is probably the charset; you can try UTF-8 or something else.
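The charset theory can also be checked without touching the file. The garbled text in the question looks like Windows-1256 bytes that were decoded as Windows-1252. The sketch below assumes that is what happened: it turns the displayed characters back into bytes and decodes them again as Windows-1256. The class name `MojibakeCheck` and the method `redecode` are made up for illustration, and both charsets must be available in your JRE.

```java
import java.nio.charset.Charset;

public class MojibakeCheck {
    // Re-encode the wrongly decoded text with the charset that was wrongly assumed,
    // then decode the resulting bytes with the charset we suspect is the real one.
    static String redecode(String garbled, String wrongCharset, String suspectedCharset) {
        byte[] raw = garbled.getBytes(Charset.forName(wrongCharset));
        return new String(raw, Charset.forName(suspectedCharset));
    }

    public static void main(String[] args) {
        // The garbled sample from the question, written as Unicode escapes so that
        // the encoding of this source file cannot interfere.
        String garbled = "\u00CA\u00E3 \u00CA\u00D3\u00CC\u00ED\u00E1 \u00D8\u00E1\u00C8\u00DF\u00E3";
        String fixed = redecode(garbled, "windows-1252", "windows-1256");
        // Print code points rather than the text, because the console may not show Arabic.
        for (char c : fixed.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

For this sample, every decoded letter falls in the Arabic block (U+0600 to U+06FF), which supports the Windows-1256 theory. If the result had been a mix of unrelated symbols, another charset pair would be worth trying.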

Related

How to use powershell to reorder a string to obfuscate a hidden message?

Just for fun, a friend and I are trying to find a creative way to send coded messages to each other using steganography. I stumbled upon something like what's shown below, and I have been struggling to write a function that automates the process.
this is a secret message
can be turned into:
("{2}{1}{0}{3}"-f'ecret m','is a s','this ','essage')
Splitting the string and reordering the pieces seems to be the way to go:
The string needs to be split at random into pieces of 5-10 characters.
The index of each piece's original position needs to be saved.
The pieces need to be swapped around.
The new indexes need to be sorted so that the message is reordered properly.
I've just really been struggling; help is appreciated.
Just for fun .... 😉🤡
$InputMessage = 'this is a secret message'
$SplittedString = $InputMessage -split '' | Select-Object -Skip 1 | Select-Object -SkipLast 1
[array]::Reverse($SplittedString)
foreach ($Character in $SplittedString) {
if ($Character -notin $CharacterList) {
[array]$CharacterList += $Character
}
}
foreach ($Character in ($InputMessage -split '' | Select-Object -Skip 1 | Select-Object -SkipLast 1)) {
$Index = [array]::indexof($CharacterList, $Character)
$Output += "{$Index}"
}
$Result = "'$Output' -f $(($CharacterList | ForEach-Object {"'$_'"}) -join ',')"
$Result
And the output of this would be:
'{6}{10}{9}{3}{5}{9}{3}{5}{2}{5}{3}{0}{8}{7}{0}{6}{5}{4}{0}{3}{3}{2}{1}{0}' -f 'e','g','a','s','m',' ','t','r','c','i','h'
And evaluating that expression gives back:
this is a secret message
And now if you want to go fancy with it you remove the curly braces and the quotes and the commas and the -f and add only the numbers and characters to the data. ;-)
Not exactly what you're looking for but this might give you something to start with:
class Encode {
[string] $EncodedMessage
[int[]] $Map
[int] $EncodingComplexity = 3
Encode ([string] $Value) {
$this.Shuffle($Value)
}
Encode ([string] $Value, [int] $Complexity) {
$this.EncodingComplexity = $Complexity
$this.Shuffle($Value)
}
[void] Shuffle([string] $Value) {
$set = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890!@#$%^&*()_-+=[{]};:<>|./?'
$ref = [Collections.Generic.HashSet[int]]::new()
$ran = [random]::new()
$enc = [char[]]::new($Value.Length * $this.EncodingComplexity)
for($i = 0; $i -lt $enc.Length; $i++) {
$enc[$i] = $set[$ran.Next($set.Length)]
}
for($i = 0; $i -lt $Value.Length; $i++) {
do {
$x = $ran.Next($enc.Length)
} until($ref.Add($x))
$enc[$x] = $Value[$i]
}
$this.EncodedMessage = [string]::new($enc)
$this.Map = $ref
}
}
class Decode {
static [string] DecodeMessage ([Encode] $Object) {
return [Decode]::DecodeMessage($Object.EncodedMessage, $Object.Map, $Object.EncodingComplexity)
}
static [string] DecodeMessage ([string] $EncodedMessage, [int[]] $Map) {
return [Decode]::DecodeMessage($EncodedMessage, $Map, 3)
}
static [string] DecodeMessage ([string] $EncodedMessage, [int[]] $Map, [int] $Complexity) {
$decoded = [char[]]::new($EncodedMessage.Length / $Complexity)
for($i = 0; $i -lt $decoded.Length; $i++) {
$decoded[$i] = $EncodedMessage[$Map[$i]]
}
return [string]::new($decoded)
}
}
Encoding a message:
PS /> $message = 'this is a secret message'
PS /> $encoded = [Encode] $message
PS /> $encoded
EncodingComplexity EncodedMessage Map
------------------ -------------- ---
3 B$h^elu2w#CeeHH^qa siQJ)t}es:.a3 ema=eN(GiIcsO;tst1 .fsg}eSUk7ms4 N>rfe# {49, 2, 41, 27…}
For decoding the message you can either use the object of the type Encode or you can give your friend the Encoded Message and the Map to decode it ;)
PS /> [Decode]::DecodeMessage($encoded)
this is a secret message
PS /> [Decode]::DecodeMessage('B$h^elu2w#CeeHH^qa siQJ)t}es:.a3 ema=eN(GiIcsO;tst1 .fsg}eSUk7ms4 N>rfe#', $encoded.Map)
this is a secret message

Using Powershell to output characters (not lines) after a match in a large file

I use PowerShell to parse huge files and quickly look at the small part of a file where a certain string occurs, like this:
Select-String P120300420059211107104259.txt -Pattern "<ID>9671510841" -Context 0,300
This gives me 300 lines of the file after the occurrence of that ID number.
But I've come across a file that has no carriage returns. Now I would like to do the same thing, but instead of lines being returned, I guess I need characters.
How would I do this?
I have never created scripts in powershell - just ran simple commands like the above.
I would like to see maybe 1000 characters after the matched string, within a huge file.
Thanks!
The problem with using Select-String or [Regex]::Matches() (or -match) to test for the presence of a substring in a single-line file is that you first need to read the whole file into memory at once.
The good news is that you don't need regular expressions to find a substring in a huge single-line text file - instead, you can read the file contents into memory in smaller chunks and then search through those - this way you don't need to store the entire file in memory at once.
Reading buffered text from a file is fairly straightforward:
Open a readable file stream
Create a StreamReader to read from the file stream
Start reading!
Then you just need to check whether:
The target substring is found in each chunk, or
The start of the target substring is partially found at the tail end of the current chunk
And then repeat until you find the substring, at which point you read the following 1000 characters.
Here's an example of how you could implement it as a script function (I've tried to explain the code in more detail in inline comments):
function Find-SubstringWithPostContext {
[CmdletBinding(DefaultParameterSetName = 'wp')]
param(
[Alias('PSPath')]
[Parameter(Mandatory = $true, ParameterSetName = 'lp', ValueFromPipelineByPropertyName = $true, ValueFromPipeline = $true)]
[string[]]$LiteralPath,
[Parameter(Mandatory = $true, ParameterSetName = 'wp', Position = 0)]
[string[]]$Path,
[Parameter(Mandatory = $true)]
[ValidateLength(1, 5000)]
[string]$Substring,
[ValidateRange(2, 25000)]
[int]$PostContext = 1000,
[switch]$All,
[System.Text.Encoding]
$Encoding
)
begin {
# start by ensuring we'll be using a buffer that's at least 4 times larger than the
# target substring to avoid too many tail searches
$bufferSize = 2000
while ($Substring.Length -gt $bufferSize / 4) {
$bufferSize *= 2
}
$buffer = [char[]]::new($bufferSize)
}
process {
if ($PSCmdlet.ParameterSetName -eq 'wp') {
# resolve input paths if necessary
$LiteralPath = $Path | Convert-Path
}
:fileLoop
foreach ($lp in $LiteralPath) {
$file = Get-Item -LiteralPath $lp
# skip directories
if ($file -isnot [System.IO.FileInfo]) { continue }
try {
$fileStream = $file.OpenRead()
$scanner = [System.IO.StreamReader]::new($fileStream, $true)
do {
# remember the current offset in the file, we'll need this later
$baseOffset = $fileStream.Position
# read a chunk from the file, convert to string
$readCount = $scanner.ReadBlock($buffer, 0, $bufferSize)
$string = [string]::new($buffer, 0, $readCount)
$eof = $readCount -lt $bufferSize
# test if target substring is found in the chunk we just read
$indexOfTarget = $string.IndexOf($Substring)
if ($indexOfTarget -ge 0) {
Write-Verbose "Substring found in chunk at local index ${indexOfTarget}"
# we found a match, ensure we've read enough post-context ahead of the given index
$tail = ''
if ($string.Length - $indexOfTarget -lt $PostContext -and $readCount -eq $bufferSize) {
# just like above, we read another chunk from the file and convert it to a proper string
$tailBuffer = [char[]]::new($PostContext - ($string.Length - $indexOfTarget))
$tailCount = $scanner.ReadBlock($tailBuffer, 0, $tailBuffer.Length)
$tail = [string]::new($tailBuffer, 0, $tailCount)
}
# construct and output the full post-context
$substringWithPostContext = $string.Substring($indexOfTarget) + $tail
if($substringWithPostContext.Length -gt $PostContext){
$substringWithPostContext = $substringWithPostContext.Remove($PostContext)
}
Write-Verbose "Writing output object ..."
Write-Output $([PSCustomObject]@{
FilePath = $file.FullName
Offset = $baseOffset + $indexOfTarget
Value = $substringWithPostContext
})
if (-not $All) {
# no need to search this file any further unless `-All` was specified
continue fileLoop
}
else {
# rewind to position after this match before next iteration
$rewindOffset = $indexOfTarget - $readCount
$null = $scanner.BaseStream.Seek($rewindOffset, [System.IO.SeekOrigin]::Current)
}
}
else {
# target was not found, but we may have "clipped" it in half,
# so figure out if target string could start at the end of current string chunk
for ($i = [Math]::Max(0, $string.Length - $Substring.Length); $i -lt $string.Length; $i++) {
# if the first character of the target substring isn't found then
# we might as well skip it immediately
if ($string[$i] -ne $Substring[0]) { continue }
if ($Substring.StartsWith($string.Substring($i))) {
# rewind file stream to this position so it'll get re-tested on
# the next iteration, then break out of tail search
$rewindOffset = $i - $string.Length
$null = $scanner.BaseStream.Seek($rewindOffset, [System.IO.SeekOrigin]::Current)
break
}
}
}
} until ($eof)
}
finally {
# remember to clean up after searching each file
$scanner, $fileStream |Where-Object { $_ -is [System.IDisposable] } |ForEach-Object Dispose
}
}
}
}
Now you can extract exactly 1000 characters after a substring is found with minimal memory allocation:
Get-ChildItem P*.txt |Find-SubstringWithPostContext -Substring '<ID>9671510841'
I haven't tested this enough to tell you if it works properly but it definitely was something fun to code. -Context here will give you the context based on characters before and after instead of lines. You can give it a try and let me know if it worked :)
Usage:
Get-ChildItem *.txt | Find-String -Pattern 'mypattern'
Get-ChildItem *.txt | Find-String -Pattern 'mypattern' -Context 20, 20
Get-ChildItem *.txt | Find-String -Pattern 'mypattern' -AllMatches
using namespace System.Text.RegularExpressions
using namespace System.IO
function Find-String {
param(
[parameter(ValueFromPipeline, Mandatory)]
[Alias('PSPath')]
[FileInfo]$Path,
[parameter(Mandatory, Position = 0)]
[string]$Pattern,
[RegexOptions]$Options = 'IgnoreCase',
[switch]$AllMatches,
[int[]]$Context
)
process
{
$re = [regex]::new($Pattern, $Options)
$content = [File]::ReadAllText($Path)
$match = if($AllMatches.IsPresent)
{
$re.Matches($content)
}
else
{
$re.Match($content)
}
if($match.Success -notcontains $true) { return }
foreach($m in $match)
{
$out = [ordered]@{
Path = $path.FullName
Value = $m.Value
Index = $m.Index
Length = $m.Length
}
if($PSBoundParameters.ContainsKey('Context'))
{
$before = $m.Index
$after = $m.Index + $m.Length
$contextBefore = $Context[0]
$contextAfter = $Context[1]
while($contextBefore-- -and $before)
{
$before--
}
while($contextAfter-- -and $after -lt $content.Length)
{
$after++
}
$out.Context = (-join $content[$before..$after]).Trim()
}
[pscustomobject]$out
}
}
}

Powershell - F5 iRules -- Extracting iRules

I received a config file from an F5 load balancer and was asked to parse it with PowerShell so that it creates a .txt file for every iRule it finds. I'm very new to parsing and I can't seem to wrap my head around it.
I managed to extract the name of every rule and create a separate .txt file, but I am unable to write the content of the rule to it. Since not all rules are identical, I can't seem to use Regex.
Extract from config file:
ltm rule /Common/irule_name1 {
SOME CONTENT
}
ltm rule /Common/irule_name2 {
SOME OTHER CONTENT
}
What I have for now:
$infile = "F5\config_F5"
$ruleslist = Get-Content $infile
foreach($cursor in $ruleslist)
{
if($cursor -like "*ltm rule /*") #new object started
{
#reset all variables to be sure
$content=""
#get rulenames
$rulenameString = $cursor.SubString(17)
$rulename = $rulenameString.Substring(0, $rulenameString.Length -2)
$outfile = $rulename + ".irule"
Write-Host $outfile
Write-Host "END Rule"
#$content | Out-File -FilePath "F5/irules/" + $outfile
}
}
How can I make my powershell script read out what's between the brackets of each rule? (In this case "SOME CONTENT" & "SOME OTHER CONTENT")
Generally parsing involves converting a specific input ("string") into an "object" which PowerShell can understand (such as HTML, JSON, XML, etc.) and traverse by "dotting" through each object.
If you are unable to convert it into any known formats (I am unfamiliar with F5 config files...), and need to only find out the content between braces, you can use the below code.
Please note that this code should only be used if you cannot find any other alternative: it only works reliably when the source file is syntactically correct (every opening brace has a matching closing brace); otherwise you may not get the expected output.
# You can Get-Content FileName as well.
$string = @'
ltm rule /Common/irule_name1 {
SOME CONTENT
}
ltm rule /Common/irule_name2 {
SOME OTHER CONTENT
}
'@
function fcn-get-content {
Param (
[ Parameter( Mandatory = $true ) ]
$START,
[ Parameter( Mandatory = $true ) ]
$END,
[ Parameter( Mandatory = $true ) ]
$STRING
)
$found_content = $string[ ( $START + 1 ) .. ( $END - 1 ) ]
$complete_content = $found_content -join ""
return $complete_content
}
for( $i = 0; $i -lt $string.Length; $i++ ) {
# Find opening brace
if( $string[ $i ] -eq '{' ) {
$start = $i
}
# Find ending brace
elseif( $string[ $i ] -eq '}' ) {
$end = $i
fcn-get-content -START $start -END $end -STRING $string
}
}
For getting everything encompassed within braces (even nested braces):
$string | Select-String '[^{\}]+(?=})' -AllMatches | % { $_.Matches } | % { $_.Value }
To parse data with flexible structure, one can use a state machine. That is, read data line by line and save the state in which you are. Is it a start of a rule? Actual rule? End of rule? By knowing the current state, one can perform actions to the data. Like so,
# Sample data
$data = @()
$data += "ltm rule /Common/irule_name1 {"
$data += "SOME CONTENT"
$data += "}"
$data += "ltm rule /Common/irule_withLongName2 {"
$data += "SOME OTHER CONTENT"
$data += "SOME OTHER CONTENT2"
$data += "}"
$data += ""
$data += "ltm rule /Common/irule_name3 {"
$data += "SOME DIFFERENT CONTENT"
$data += "{"
$data += "WELL,"
$data += "THIS ESCALATED QUICKLY"
$data += "}"
$data += "}"
# enum is used for state tracking
enum rulestate {
start
stop
content
}
# hashtable for results
$ht = @{}
# counter for nested rules
$nestedItems = 0
# Loop through data
foreach($l in $data){
# skip empty lines
if([string]::isNullOrEmpty($l)){ continue }
# Pick the right state and keep count of nested constructs
if($l -match "^ltm rule (/.+)\{") {
# Start new rule
$state = [rulestate]::start
} else {
# Process rule contents
if($l -match "^\s*\{") {
# nested construct found
$state = [rulestate]::content
++$nestedItems
} elseif ($l -match "^\s*\}") {
# closing bracket. Is it
# a) closing nested
if($nestedItems -gt 0) {
$state = [rulestate]::content
--$nestedItems
} else {
# b) closing rule
$state = [rulestate]::stop
}
} else {
# ordinary rule data
$state = [rulestate]::content
}
}
# Handle rule contents based on state
switch($state){
start {
$currentRule = $matches[1].trim()
$ruledata = @()
break
}
content {
$ruledata += $l
break
}
stop {
$ht.add($currentRule, $ruledata)
break
}
default { write-host "oops! $state" }
}
write-host "$state => $l"
}
$ht
Output rules
SOME CONTENT
SOME OTHER CONTENT
SOME OTHER CONTENT2
SOME DIFFERENT CONTENT
{
WELL,
THIS ESCALATED QUICKLY
}

How can I increase the maximum number of characters read by Read-Host?

I need to get a very long string input (around 9,000 characters), but Read-Host will truncate after around 8,000 characters. How can I extend this limit?
The following are possible workarounds.
Workaround 1 has the advantage that it will work with PowerShell background jobs that require keyboard input. Note that if you are trying to paste clipboard content containing new lines, Read-HostLine will only read the first line, but Read-Host has this same behavior.
Workaround 1:
<#
.SYNOPSIS
Read a line of input from the host.
.DESCRIPTION
Read a line of input from the host.
.EXAMPLE
$s = Read-HostLine -prompt "Enter something"
.NOTES
Read-Host has a limitation of 1022 characters.
This approach is safe to use with background jobs that require input.
If pasting content with embedded newlines, only the first line will be read.
A downside to the ReadKey approach is that it is not possible to easily edit the input string before pressing Enter as with Read-Host.
#>
function Read-HostLine ($prompt = $null) {
if ($prompt) {
"${prompt}: " | Write-Host
}
$str = ""
while ($true) {
$key = $host.UI.RawUI.ReadKey("NoEcho, IncludeKeyDown");
# Paste the clipboard on CTRL-V
if (($key.VirtualKeyCode -eq 0x56) -and # 0x56 is V
(([int]$key.ControlKeyState -band [System.Management.Automation.Host.ControlKeyStates]::LeftCtrlPressed) -or
([int]$key.ControlKeyState -band [System.Management.Automation.Host.ControlKeyStates]::RightCtrlPressed))) {
$clipboard = Get-Clipboard
$str += $clipboard
Write-Host $clipboard -NoNewline
continue
}
elseif ($key.VirtualKeyCode -eq 0x08) { # 0x08 is Backspace
if ($str.Length -gt 0) {
$str = $str.Substring(0, $str.Length - 1)
Write-Host "`b `b" -NoNewline
}
}
elseif ($key.VirtualKeyCode -eq 13) { # 13 is Enter
Write-Host
break
}
elseif ($key.Character -ne 0) {
$str += $key.Character
Write-Host $key.Character -NoNewline
}
}
return $str
}
Workaround 2:
$maxLength = 65536
[System.Console]::SetIn([System.IO.StreamReader]::new([System.Console]::OpenStandardInput($maxLength), [System.Console]::InputEncoding, $false, $maxLength))
$s = [System.Console]::ReadLine()
Workaround 3:
function Read-Line($maxLength = 65536) {
$str = ""
$inputStream = [System.Console]::OpenStandardInput($maxLength);
$bytes = [byte[]]::new($maxLength);
while ($true) {
$len = $inputStream.Read($bytes, 0, $maxLength);
$str += [System.Console]::InputEncoding.GetString($bytes, 0, $len)
if ($str.EndsWith("`r`n")) {
$str = $str.Substring(0, $str.Length - 2)
return $str
}
}
}
$s = Read-Line
More discussion here:
Console.ReadLine() max length?
Why does Console.Readline() have a limit on the length of text it allows?
https://github.com/PowerShell/PowerShell/issues/16555

How to check if a Word file has a password?

I built a script that converts .doc files to .docx.
I have a problem that when the .doc file is password-protected, I can't access it and then the script hangs.
I am looking for a way to check if the file has a password before I open it.
I am using the Documents.Open method to open the file.
If your script hangs on opening the document, the approach outlined in this question might help, only that in PowerShell you'd use a try..catch block instead of On Error Resume Next:
$filename = "C:\path\to\your.doc"
$wd = New-Object -COM "Word.Application"
try {
$doc = $wd.Documents.Open($filename, $null, $null, $null, "")
} catch {
Write-Host "$filename is password-protected!"
}
If you can open the file, but the content is protected, you can determine it like this:
if ( $doc.ProtectionType -ne -1 ) {
Write-Host ($doc.Name + " is password-protected.")
$doc.Close()
}
If none of these work you may have to resort to the method described in this answer. Rough translation to PowerShell (of those parts that detect encrypted documents):
$bytes = [System.IO.File]::ReadAllBytes($filename)
$prefix = [System.Text.Encoding]::Default.GetString($bytes[0..1]);
if ($prefix -eq "ÐÏ") {
# DOC 2005
if ($bytes[0x20c] -eq 0x13) { $encrypted = $true }
# DOC/XLS 2007+
$start = [System.Text.Encoding]::Default.GetString($bytes[0..2000]).Replace("`0", " ")
if ($start -like "*E n c r y p t e d P a c k a g e*") { $encrypted = $true }
}
There is a technique outlined here. Essentially, you supply a fake password which files without a password will ignore; then you error-trap the ones that do require a password, and can skip them.