Powershell: Convert unique string to unique int - powershell

Is there a method for converting unique strings to unique integers in PowerShell?
I'm using a PowerShell function as a service bus between two API's,
the first API produces unique codes e.g. HG44X10999 (varchars)- but the second API which will consume the first as input, will only accept integers. I only care about keeping them unique.
I have looked at $string.gethashcode() but this produces negative integers and also changes between builds. Get-hash | $string -encoding ASCII obviously outputs varchars too.
Other examples on SO are referring to converting a string of numeric characters to integers i.e. $string = 123 - but I can't find a way of quickly computing an int from a string of alphanumeric

The Fowler-Noll-Vo hash function seems well-suited for your purpose, as it can produce a 32-bit hash output.
Here's a simple implementation in PowerShell (the offset basis and initial prime is taken from the wikipedia reference table for 32-bit outputs):
function Get-FNVHash {
param(
[string]$InputString
)
# Initial prime and offset chosen for 32-bit output
# See https://en.wikipedia.org/wiki/Fowler–Noll–Vo_hash_function
[uint32]$FNVPrime = 16777619
[uint32]$offset = 2166136261
# Convert string to byte array, may want to change based on input collation
$bytes = [System.Text.Encoding]::UTF8.GetBytes($InputString)
# Copy offset as initial hash value
[uint32]$hash = $offset
foreach($octet in $bytes)
{
# Apply XOR, multiply by prime and mod with max output size
$hash = $hash -bxor $octet
$hash = $hash * $FNVPrime % [System.Math]::Pow(2,32)
}
return $hash
}
Now you can repeatably produce distinct integers from the input strings:
PS C:\> Get-FNVHash HG44X10999
1174154724
If the target API only accepts positive signed 32-bit integers you can change the modulus to [System.Math]::Pow(2,31) (doubling the chance of collisions, to
approx. 1 in 4300 for 1000 distinct inputs)
For further insight into this simple approach, see this page on FNV and have a look at this article exploring short string hashing

Related

Convert byte array (hex) to signed Int

I am trying to convert a (variable length) Hex String to Signed Integer (I need either positive or negative values).
[Int16] [int 32] and [int64] seem to work fine with 2,4+ byte length Hex Strings but I'm stuck with 3 byte strings [int24] (no such command in powershell).
Here's what I have now (snippet):
$start = $mftdatarnbh.Substring($DataRunStringsOffset+$LengthBytes*2+2,$StartBytes*2) -split "(..)"
[array]::reverse($start)
$start = -join $start
if($StartBytes*8 -le 16){$startd =[int16]"0x$($start)"}
elseif($StartBytes*8 -in (17..48)){$startd =[int32]"0x$($start)"}
else{$startd =[int64]"0x$($start)"}
With the above code, a $start value of "D35A71" gives '13851249' instead of '-2925967'. I tried to figure out a way to implement two's complement but got lost. Any easy way to do this right?
Thank you in advance
Edit: Basically, I think I need to implement something like this:
int num = (sbyte)array[0] << 16 | array[1] << 8 | array[2];
as seen here.
Just tried this:
$start = "D35A71"
[sbyte]"0x$($start.Substring(0,2))" -shl 16 -bor "0x$($start.Substring(2,2))" -shl 8 -bor "0x$($start.Substring(4,2))"
but doesn't seem to get the correct result :-/
To parse your hex.-number string as a negative number you can use [bigint] (System.Numerics.BigInteger):
# Since the most significant hex digit has a 1 as its most significant bit
# (is >= 0x8), it is parsed as a NEGATIVE number.
# To force unconditional interpretation as a positive number, prepend '0'
# to the input hex string.
PS> [bigint]::Parse('D35A71', 'AllowHexSpecifier')
-2925967
You can cast the resulting [bigint] instance back to an [int] (System.Int32).
Note:
The result is a negative number, because the most significant hex digit of the hex input string is >= 0x8, i.e. has its high bit set.
To force [bigint] to unconditionally interpret a hex. input string as a positive number, prepend 0.
The internal two's complement representation of a resulting negative number is performed at byte boundaries, so that a given hex number with an odd number of digits (i.e. if the first hex digit is a "half byte") has the missing half byte filled with 1 bits.
Therefore, a hex-number string whose most significant digit is >= 0x8 (parses as a negative number) results in the same number as prepending one or more Fs (0xF == 1111) to it; e.g., the following calls all result in -2048:
[bigint]::Parse('800', 'AllowHexSpecifier'),
[bigint]::Parse('F800', 'AllowHexSpecifier'),
[bigint]::Parse('FF800', 'AllowHexSpecifier'), ...
See the docs for details about the parsing logic.
Examples:
# First digit (7) is < 8 (high bit NOT set) -> positive number
[bigint]::Parse('7FF', 'AllowHexSpecifier') # -> 2047
# First digit (8) is >= 8 (high bit IS SET) -> negative number
[bigint]::Parse('800', 'AllowHexSpecifier') # -> -2048
# Prepending additional 'F's to a number that parses as
# a negative number yields the *same* result
[bigint]::Parse('F800', 'AllowHexSpecifier') # -> -2048
[bigint]::Parse('FF800', 'AllowHexSpecifier') # -> -2048
# ...
# Starting the hex-number string with '0'
# *unconditionally* makes the result a *positive* number
[bigint]::Parse('0800', 'AllowHexSpecifier') # -> 2048

Count the scale of a given decimal

How can I count the scale of a given decimal in Powershell?
$a = 0.0001
$b = 0.000001
Casting $a to a string and returning $a.Length gives a result of 6...I need 4.
I thought there'd be a decimal or math function but I haven't found it and messing with a string seems inelegant.
There's probably a better mathematic way but I'd find the decimal places like this:
$a = 0.0001
$decimalPlaces = ("$a" -split '\.')[-1].TrimEnd('0').Length
Basically, split the string on the . character and get the length of the last string in the array. Wrapping $a in double-quotes implicitly calls .ToString() with an invariant culture (you could expand this as $a.ToString([CultureInfo]::InvariantCulture)), making this method to determine the number of decimal places culture-invariant.
.TrimEnd('0') is used in case $a were sourced from a string, not a proper number type, it's possible that trailing zeroes could be included that should not count as decimal places. However, if you want the scale and not just the used decimal places, leave .TrimEnd('0') off like so:
$decimalPlaces = ("$a" -split '\.')[-1].Length
mclayton helpfully linked to this answer to a related C# question in a comment, and the solution there can indeed be adapted to PowerShell, if working with or conversion to type [decimal] is acceptable:
# Define $a as a [decimal] literal (suffix 'd')
# This internally records the scale (number of decimal places) as specified.
$a = 0.0001d
# [decimal]::GetBits() allows extraction of the scale from the
# the internal representation:
[decimal]::GetBits($a)[-1] -shr 16 -band 0xFF # -> 4, the number of decimal places
The System.Decimal.GetBits method returns an array of internal bit fields whose last element contains the scale in bits 16 - 23 (8 bits, even though the max. scale allowed is 28), which is what the above extracts.
Note: A PowerShell number literal that is a fractional number without the d suffix - e.g., 0.0001 becomes a [double] instance, i.e. a double-precision binary floating-point number.
PowerShell automatically converts [double] to [decimal] values on demand, but do note that there can be rounding errors due to the differing internal representations, and that [double] can store larger numbers than [decimal] can (although not accurately).
A [decimal] literal - one with suffix d (note that C# uses suffix m) - is parsed with a scale exactly as specified, so that applying the above to 0.000d and 0.010d yields 3 in both cases; that is, the trailing zeros are meaningful.
This does not apply if you (implicitly) convert from [double] instances such as 0.000 and 0.010, for which the above yields 0 and 2, respectively.
A string-based solution:
To offer a more concise (also culture-invariant) alternative to Bender The Greatest's helpful answer:
$a = 0.0001
("$a" -replace '.+\.').Length # -> 4, the number of decimal places
Caveat: This solution relies on the default string representation of a [double] number, which need not match the original input format; for instance, .0100, when stringified later, becomes '0.01'; however, as discussed above, you can preserve trailing zeros if you start with a [decimal] literal: .0100d stringifies to '0.0100' (input number of decimals preserved).
"$a", uses an expandable string (PowerShell's string interpolation) to create a culture-invariant string representation of the number so as to ensure that the string representation uses . as the decimal mark.
In effect, PowerShell calls $a.ToString([cultureinfo]::InvariantCulture) behind the scenes.[1].
By contrast, .ToString() (argument-less) applies the rules of the current culture, and in some cultures it is , - not . - that is used as the decimal mark.
Caveat: If you use just $a as the LHS of -replace, $a is implicitly stringified, in which case you - curiously - get culture-sensitive behavior, as with .ToString() - see this GitHub issue.
-replace '.+\.' effectively removes all characters up to and including the decimal point from the input string, and .Length counts the characters in the resulting string - the number of decimal places.
[1] Note that casts from strings in PowerShell too use the invariant culture (effectively, ::Parse($value, [cultureinfo]::InvariantCulture) is called) so that in order to parse a a culture-local string representation you'll need to use the ::Parse() method explicitly; e.g., [double]::Parse('1,2'), not [double] '1,2'.

PowerShell New-ADComputer binary attribute

I am new to PowerShell and I have a one shot task to perform which must be done with PowerShell. This involves Active Directory as well. I need to add a new computer object into our AD and one of the attributes I must set at creation time is a 16 bytes binary value. I am getting as input a string which is an hexadecimal representation of the value I must set for the attribute.
I tried to input the value asis and it doesn't work. I tried escaping each byte with a backslash, it doesn't work neither.
How should I format the input for this to work with the New-ADComputer command? I am setting a bunch of other attributes successfully. When I remove this binary entry from my hashtable passed to the -OtherAttributes option it works fine. So, obviously a format problem. I found nothing about the expected format for such attributes.
Any hints? TIA.
EDIT 2018-06-05 19:44 EDT:
I tried converting the string to a byte array as follow:
Function Convert-Hex2ByteArray {
[cmdletbinding()]
param(
[parameter(Mandatory=$true)]
[String]
$HexString
)
[byte[]] $Bytes = #(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
For($i=0; $i -lt $HexString.Length; $i+=2) {
$Bytes[$i/2] = [convert]::ToByte($HexString.Substring($i, 2), 16)
}
$Bytes
}
(...)
$netbootGUID = Convert-Hex2ByteArray($args[$indiceArgs])
$otherAttributes.add( "netbootGUID", $netbootGUID )
(...)
New-ADComputer -Credential $cred -Server $ADhost -Path "CN=Computers,$baseDN" -SAMAccountName $sAMAccountName -Name $name-Instance 4 -OtherAttributes $otherAttributes
This leads to the following error (I apologize for my own translation since the original message is shown in French):
Many values were specified for an attribut which can only have one
Problem solved:
$netbootGUID = New-Object Guid $args[$indiceArgs]
$otherAttributs.add( "netbootGUID", $netbootGUID )
Did the trick.
Typically for binary storage you need to convert the string to a byte array:
$String = '3c6ef75eaa2c4b23992bbd65ac891917'
$ByteArray = [byte[]]$(for ($i = 0; $i -lt $String.Length; $i+=2) { [Convert]::ToByte($String.Substring($i,2), 16) })
To convert it back:
$NewString = -join $(foreach($Byte in $ByteArray) { $Byte.ToString('x2') })
If you want the characters upper case, specify 'X2' instead of 'x2'.
Since you're storing 16 byte values, I'll note that if you're storing GUIDs you may need to change the storage order since the order of bytes in a string representation of a GUID does not match the order of bytes in a byte representation of a GUID on an x86 system. Fortunately, there are built in functions for handling this conversion with the built-in System.Guid data type:
$GUID = 'f8d89eb2b49c4bfeab44a85ccdc4191a'
$ByteArray = [Guid]::new($GUID).ToByteArray()
And a constructor for converting back:
$NewGUID = [Guid]::new($ByteArray)
Whether or not you should use this method depends on exactly what property you're updating and whether or not the application(s) that will be using the property in question will correctly be handling the GUIDs or if they're just storing the GUID as raw bytes (which is incorrect but not surprising). You'll have to test by seeing what GUID your application sees and comparing it to the byte array in Active Directory to verify that it's correct.
For specifics on the byte ordering, see the documentation for Guid.ToByteArray():
Note that the order of bytes in the returned byte array is different from the string representation of a Guid value. The order of the beginning four-byte group and the next two two-byte groups is reversed, whereas the order of the last two-byte group and the closing six-byte group is the same. The example provides an illustration.
The reason for this is that a GUID is partially constructed from a series of integers of varying sizes, and the UUID standard specifies big endianness for those numbers. x86 computers are little-endian systems.

Convert Byte Array (from legacy program data file) to Powershell object

I have a "structured" file (logical fixed-length records) from a legacy program on a legacy (non-MS) operating system. I know how the records were structured in the original program, but the original O/S handled structured data as a sequence of bytes for file I/O, so a hex dump won't show you anything more than what the record length is (there are marker bytes and other record overhead imposed by the access method API used to generate the file originally).
Once I have the sequence of bytes in a Powershell variable, with the overhead bytes "cut away", how can I convert this into a structured object? Some of the "fields" are 16-bit integers, some are strings of the form [s]data (where [s] is a byte giving the length of the "real" data in that field), some are BCD coded fixed-point numbers, some are IEEE floats.
(I haven't been specific about the structure, either on the Powershell side or on the legacy side, because I am seeking a more-or-less 'generic' solution/technique, as I actually have several different files with different record structures to process.)
Initially, I tried to do it by creating a type that could take the buffer and overwrite a struct so that all the fields were nicely filled in. However, certain issues arose (regarding struct layout, fixed buffers and mixing fixed and managed members) and I also realised that there was no guarantee that the data in the buffer would be properly (or even legally) aligned. Decided to try a more programmatic path.
"Manual" parsing is out, so how about automatic parsing? You're going to need to define the members of your PSobject at some point, why not do it in a way that can help programmatically parse the data. This method does not require the data in the buffer to be correctly aligned or even contiguous. You can also have fields overlap to separate raw unions into the individual members (though, typically, only one will contain a "correct" value).
First step, build a hash table to identify the members, the offset in the buffer, their data types and, if an array, the number of elements :
$struct = #{
field1 = 0,[int],0; # 0 means not an array
field2 = 4,[byte],16; # a C string maybe
field3 = 24,[char],32; # wchar_t[32] ? note: skipped over bytes 20-23
field4 = 56,[double],0
}
# the names field1/2/3/4 are arbitrary, any valid member name may be used (but not
# necessarily any valid hash key if you want a PSObject as the end result).
# also, the values could be hash tables instead of arrays. that would allow
# descriptive names for the values but doesn't affect the end result.
Next, use [BitConverter] to extract the required data. The problem here is that we need to call the correct method for all the varying types. Just use a (big) switch statement. The basic principle is the same for most values, get the type indicator and initial offset from the $struct definition then call the correct [BitConverter] method and supply the buffer and initial offset, update the offset to where the next element of an array would be and then repeat for as many array elements as are required. The only trap here is that the data in the buffer must have the same format as expected by [BitConverter], so for the [double] example, the bytes in the buffer must conform to IEEE-754 floating point format (assuming that [BitConverter]::ToDouble() is used). Thus, for example, raw data from a Paradox database will need some tweeking because it flips the high bit to simplify sorting.
$struct.keys | foreach {
# key order is undefined but that won't affect the final object's members
$hashobject = #{}
} {
$fieldoffs = $struct[$_][0]
$fieldtype = $struct[$_][1]
if (($arraysize = $struct[$_][2]) -ne 0) { # yes, I'm a C programmer from way back
$array = #()
} else {
$array = $null
}
:w while ($arraysize-- -ge 0) {
switch($fieldtype) {
([int]) {
$value = [bitconverter]::toint32($buffer, $fieldoffs)
$fieldoffs += 4
}
([byte]) {
$value = $buffer[$fieldoffs++]
}
([char]) {
$value = [bitconverter]::tochar($buffer, $fieldoffs)
$fieldoffs += 2
}
([string]) { # ANSI string, 1 byte per character
$array = new-object string (,[char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)])
# $arraysize has already been decremented so don't need to subtract 1
break w # "array size" was actually string length so don't loop
#
# description:
# first, get a slice of the buffer as a byte[] (assume single byte characters)
# next, convert each byte to a char in a char[]
# then, invoke the constructor String(Char[])
# finally, put the String into $array ready for insertion into $hashobject
#
# Note the convoluted syntax - New-Object expects the second argument to be
# an array of the constructor parameters but String(Char[]) requires only
# one argument that is itself an array. By itself,
# [char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)]
# is treated by PowerShell as an argument list of individual chars, corrupting the
# constructor call. The normal trick is to prepend a single comma to create an array
# of one element which is itself an array
# ,[char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)]
# but this won't work because of the way PowerShell parses the command line. The
# space before the comma is ignored so that instead of getting 2 arguments (a string
# "String" and the array of an array of char), there is only one argument, an array
# of 2 elements ("String" and array of array of char) thereby totally confusing
# New-Object. To make it work you need to ALSO isolate the single element array into
# its own expression. Hence the parentheses
# (,[char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)])
#
}
}
if ($null -ne $array) {
# must be in this order* to stop the -ne from enumerating $array to compare against
# $null. this would result in the condition being considered false if $array were
# empty ( (#() -ne $null) -> $null -> $false ) or contained only one element with
# the value 0 ( (#(0) -ne $null) -> (scalar) 0 -> $false ).
$array += $value
# $array is not $null so must be an array to which $value is appended
} else {
# $array is $null only if $arraysize -eq 0 before the loop (and is now -1)
$array = $value
# so the loop won't repeat thus leaving this one scalar in $array
}
}
$hashobject[$_] = $array
}
#*could have reversed it as
# if ($array -eq $null) { scalar } else { collect array }
# since the condition will only be true if $array is actually $null or contains at
# least 2 $null elements (but no valid conversion will produce $null)
At this point there is a hash table, $hashobject, with keys equal to the field names and values containing the bytes from the buffer arranged into single (or arrays of) numeric (inc. char/boolean) values or (ANSI) strings. To create a (proper) object, just invoke New-Object -TypeName PSObject -Property $hashobject or use [PSCustomObject]$hashobject.
Of course, if the buffer actually contained structured data then the process would be more complicated but the basic procedure would be the same. Note also that the "types" used in the $struct hash table have no direct effect on the resultant types of the object members, they are only convenient selectors for the switch statement. It would work just as well with strings or numbers. In fact, the parentheses around the case labels are because switch parses them the same as command arguments. Without the parentheses, the labels would be treated as literal strings. With them, the labels are evaluated as a type object. Both the label and the switch value are then converted to strings (that's what switch does for values other than script blocks or $null) but each type has a distinct string representation so the case labels will still match up correctly. (Not really on point but still interesting, I think.)
Several optimisations are possible but increase the complexity slightly. E.g.
([byte]) { # already have a byte[] so why collect bytes one at a time
if ($arraysize -ge 0) { # was originally -gt 0 so want a byte[]
$array = [byte[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)]
# slicing the byte array produces an object array (of bytes) so cast it back
} else { # $arraysize was 0 so just a single byte
$array = $buffer[$fieldoffs]
}
break w # $array ready for insertion into $hashobject, don't need to loop
}
But what if my strings are actually Unicode?, you say. Easy, just use existing methods from the [Text.Encoding] class,
[string] { # Unicode string, 2 (LE) bytes per character
$array = [text.encoding]::unicode.getstring([byte[]]$buffer[$fieldoffs..($fieldoffs+$arraysize*2+1)])
# $arraysize should be the string length so, initially, $arraysize*2 is the byte
# count and $arraysize*2-1 is the end index (relative to $fieldoffs) but $arraysize
# was decremented so the end index is now $arraysize*2+1, i.e. length*2-1 = (length-1)*2+1
break w # got $array, no loop
}
You could also have both ANSI and Unicode by utilising a different type indicator for the ANSI string, maybe [char[]]. Remember, the type indicators do not affect the result, they just have to be distinct (and hopefully meaningful) identifiers.
I realise that this is not quite the "just dump the bytes into a union or variant record" solution mentioned in the OPs comment but PowerShell is based in .NET and uses managed objects where this sort of thing is largely prohibited (or difficult to get working, as I found). For example, assuming you could just dump raw chars (not bytes) into a String, how would the Length property get updated? This method also allows some useful preprocessing such as splitting up unions as noted above or converting raw byte or char arrays into the Strings they represent.

How to convert hexadecimal encoded string to hexadecimal integer

So basically what I'm trying to achieve is to get a MAC address from a text file and increment the value by one.
Been bashing my head against the Google/StackOverflow wall for a couple of hours, think there's a concept I'm just not getting.
PowerShell:
$Last_MAC_Address = (Get-Content -LiteralPath "\\UNC\Path\Last MAC Address.txt")
Write-Host ($Last_MAC_Address)
# Output: 00155DE10B73
$Next_MAC_Address = (($Last_MAC_Address | Format-Hex) + 1)
This is a 3 step process, and although PetSerAl answered it in the comments as a one liner, I'll break it down slightly for posterity (and use a different class).
The first step is to get the Hex number as a decimal (mathematical base 10, not type).
The Second step is the incrementation of the decimal.
And the final step is converting it back to hexadecimal.
broken down and not a one liner this will accomplish the task at hand:
$asDecimal = [System.Convert]::ToInt64("00155DE10B73", 16)
$asDecimal++
$asHex = [System.Convert]::ToString($asDecimal, 16)
Another option is to prefix the value with 0x and cast it to an int64:
$Next_MAC_Address = ([int64]"0x$Last_MAC_Address"+1).ToString('X12')
You could also use the format operator (-f) instead of the ToString() method:
$Next_MAC_Address = '{0:X12}' -f ([int64]"0x$Last_MAC_Address"+1)
There is, however, one thing that may be worth noting. MAC addresses aren't just random 6-byte numbers without any inner structure. They actually consist of two parts. The first 3 bytes form the Organizationally Unique Identifier (OUI), a vendor-specific prefix (00-15-5D is one of the OUIs belonging to Microsoft). Only the last 3 bytes are a random number, a unique identifier for each card from the vendor identified by the OUI.
Taking that into consideration you may want to split the MAC address accordingly, e.g. like this:
$oui, $nid = $Last_MAC_Address -split '(?<=^[0-9a-f]{6})(?=[0-9a-f]{6}$)'
or like this:
$oui = $Last_MAC_Address.Substring(0, 6)
$nid = $Last_MAC_Address.Substring(6, 6)
and increment only the NIC identifier, and only if it wouldn't overflow:
if ($nid -ne 'ffffff') {
$Next_MAC_Address = "{0}{1:X6}" -f $oui, ([int64]"0x$nid"+1)
} else {
Write-Error 'MAC address overflow.'
}