I have a "structured" file (logical fixed-length records) from a legacy program on a legacy (non-MS) operating system. I know how the records were structured in the original program, but the original O/S handled structured data as a sequence of bytes for file I/O, so a hex dump won't show you anything more than what the record length is (there are marker bytes and other record overhead imposed by the access method API used to generate the file originally).
Once I have the sequence of bytes in a Powershell variable, with the overhead bytes "cut away", how can I convert this into a structured object? Some of the "fields" are 16-bit integers, some are strings of the form [s]data (where [s] is a byte giving the length of the "real" data in that field), some are BCD coded fixed-point numbers, some are IEEE floats.
(I haven't been specific about the structure, either on the Powershell side or on the legacy side, because I am seeking a more-or-less 'generic' solution/technique, as I actually have several different files with different record structures to process.)
Initially, I tried to do it by creating a type that could take the buffer and overwrite a struct so that all the fields were nicely filled in. However, certain issues arose (regarding struct layout, fixed buffers and mixing fixed and managed members) and I also realised that there was no guarantee that the data in the buffer would be properly (or even legally) aligned. So I decided to try a more programmatic path.
"Manual" parsing is out, so how about automatic parsing? You're going to need to define the members of your PSObject at some point, so why not do it in a way that also helps to parse the data programmatically? This method does not require the data in the buffer to be correctly aligned or even contiguous. You can also have fields overlap to separate raw unions into the individual members (though, typically, only one will contain a "correct" value).
The first step is to build a hash table that identifies the members, their offsets in the buffer, their data types and, if a member is an array, the number of elements:
$struct = @{
field1 = 0,[int],0; # 0 means not an array
field2 = 4,[byte],16; # a C string maybe
field3 = 24,[char],32; # wchar_t[32] ? note: skipped over bytes 20-23
field4 = 56,[double],0
}
# the names field1/2/3/4 are arbitrary, any valid member name may be used (but not
# necessarily any valid hash key if you want a PSObject as the end result).
# also, the values could be hash tables instead of arrays. that would allow
# descriptive names for the values but doesn't affect the end result.
Next, use [BitConverter] to extract the required data. The problem here is that we need to call the correct method for each of the varying types, so just use a (big) switch statement. The basic principle is the same for most values: get the type indicator and initial offset from the $struct definition, call the correct [BitConverter] method with the buffer and the offset, update the offset to where the next element of an array would be, and repeat for as many array elements as are required. The only trap is that the data in the buffer must have the format that [BitConverter] expects, so for the [double] example the bytes in the buffer must conform to the IEEE-754 floating-point format (assuming that [BitConverter]::ToDouble() is used). Thus, for example, raw data from a Paradox database will need some tweaking because it flips the high bit to simplify sorting.
$struct.keys | foreach {
# key order is undefined but that won't affect the final object's members
$hashobject = @{}
} {
$fieldoffs = $struct[$_][0]
$fieldtype = $struct[$_][1]
if (($arraysize = $struct[$_][2]) -ne 0) { # yes, I'm a C programmer from way back
$array = @()
} else {
$array = $null
}
:w while ($arraysize-- -ge 0) {
switch($fieldtype) {
([int]) {
$value = [bitconverter]::toint32($buffer, $fieldoffs)
$fieldoffs += 4
}
([byte]) {
$value = $buffer[$fieldoffs++]
}
([char]) {
$value = [bitconverter]::tochar($buffer, $fieldoffs)
$fieldoffs += 2
}
([string]) { # ANSI string, 1 byte per character
$array = new-object string (,[char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)])
# $arraysize has already been decremented so don't need to subtract 1
break w # "array size" was actually string length so don't loop
#
# description:
# first, get a slice of the buffer as a byte[] (assume single byte characters)
# next, convert each byte to a char in a char[]
# then, invoke the constructor String(Char[])
# finally, put the String into $array ready for insertion into $hashobject
#
# Note the convoluted syntax - New-Object expects the second argument to be
# an array of the constructor parameters but String(Char[]) requires only
# one argument that is itself an array. By itself,
# [char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)]
# is treated by PowerShell as an argument list of individual chars, corrupting the
# constructor call. The normal trick is to prepend a single comma to create an array
# of one element which is itself an array
# ,[char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)]
# but this won't work because of the way PowerShell parses the command line. The
# space before the comma is ignored so that instead of getting 2 arguments (a string
# "String" and the array of an array of char), there is only one argument, an array
# of 2 elements ("String" and array of array of char) thereby totally confusing
# New-Object. To make it work you need to ALSO isolate the single element array into
# its own expression. Hence the parentheses
# (,[char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)])
#
}
}
if ($null -ne $array) {
# must be in this order* to stop the -ne from enumerating $array to compare against
# $null. this would result in the condition being considered false if $array were
# empty ( (@() -ne $null) -> $null -> $false ) or contained only one element with
# the value 0 ( (@(0) -ne $null) -> (scalar) 0 -> $false ).
$array += $value
# $array is not $null so must be an array to which $value is appended
} else {
# $array is $null only if $arraysize -eq 0 before the loop (and is now -1)
$array = $value
# so the loop won't repeat thus leaving this one scalar in $array
}
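# all requested elements have been read once $arraysize reaches 0; without this
# check an array of N elements would collect N+1 values
if ($arraysize -le 0) { break }
}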
$hashobject[$_] = $array
}
#*could have reversed it as
# if ($array -eq $null) { scalar } else { collect array }
# since the condition will only be true if $array is actually $null or contains at
# least 2 $null elements (but no valid conversion will produce $null)
At this point there is a hash table, $hashobject, with keys equal to the field names and values containing the bytes from the buffer arranged into single (or arrays of) numeric (inc. char/boolean) values or (ANSI) strings. To create a (proper) object, just invoke New-Object -TypeName PSObject -Property $hashobject or use [PSCustomObject]$hashobject.
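As a minimal sketch of that last step, the snippet below builds an object from a small hand-written hash table. The field names and values are made up for illustration and merely stand in for whatever the parsing loop produced.

```powershell
# Stand-in for the hash table produced by the parsing loop (illustrative values).
$hashobject = @{
    field1 = 42
    field4 = 3.5
}

# Either form yields an object whose properties are the hash table's keys.
$record  = [PSCustomObject]$hashobject
$record2 = New-Object -TypeName PSObject -Property $hashobject

$record.field1     # 42
$record2.field4    # 3.5
```

Member access then works with ordinary dot notation, and the objects can be piped to cmdlets such as Select-Object or Export-Csv.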
Of course, if the buffer actually contained structured data then the process would be more complicated but the basic procedure would be the same. Note also that the "types" used in the $struct hash table have no direct effect on the resultant types of the object members, they are only convenient selectors for the switch statement. It would work just as well with strings or numbers. In fact, the parentheses around the case labels are because switch parses them the same as command arguments. Without the parentheses, the labels would be treated as literal strings. With them, the labels are evaluated as a type object. Both the label and the switch value are then converted to strings (that's what switch does for values other than script blocks or $null) but each type has a distinct string representation so the case labels will still match up correctly. (Not really on point but still interesting, I think.)
Several optimisations are possible but increase the complexity slightly. E.g.
([byte]) { # already have a byte[] so why collect bytes one at a time
if ($arraysize -ge 0) { # was originally -gt 0 so want a byte[]
$array = [byte[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)]
# slicing the byte array produces an object array (of bytes) so cast it back
} else { # $arraysize was 0 so just a single byte
$array = $buffer[$fieldoffs]
}
break w # $array ready for insertion into $hashobject, don't need to loop
}
But what if my strings are actually Unicode, you say? Easy, just use the existing methods of the [Text.Encoding] class:
([string]) { # Unicode string, 2 (LE) bytes per character
$array = [text.encoding]::unicode.getstring([byte[]]$buffer[$fieldoffs..($fieldoffs+$arraysize*2+1)])
# $arraysize should be the string length so, initially, $arraysize*2 is the byte
# count and $arraysize*2-1 is the end index (relative to $fieldoffs) but $arraysize
# was decremented so the end index is now $arraysize*2+1, i.e. length*2-1 = (length-1)*2+1
break w # got $array, no loop
}
You could also have both ANSI and Unicode by utilising a different type indicator for the ANSI string, maybe [char[]]. Remember, the type indicators do not affect the result, they just have to be distinct (and hopefully meaningful) identifiers.
I realise that this is not quite the "just dump the bytes into a union or variant record" solution mentioned in the OP's comment, but PowerShell is based on .NET and uses managed objects, where this sort of thing is largely prohibited (or difficult to get working, as I found). For example, assuming you could just dump raw chars (not bytes) into a String, how would the Length property get updated? This method also allows some useful preprocessing, such as splitting up unions as noted above or converting raw byte or char arrays into the Strings they represent.
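The question also mentions length-prefixed strings and BCD fixed-point numbers, which [BitConverter] does not handle. The following is only a sketch of how two extra helpers might look. The function names are hypothetical, and it assumes single-byte (ASCII) characters and unsigned packed BCD (two digits per byte, high nibble first, no sign nibble); the real legacy format may differ.

```powershell
# Hypothetical helper: decode a [s]data field (one length byte, then that many characters).
function ConvertFrom-LengthPrefixedString {
    param([byte[]] $Buffer, [int] $Offset)
    [int] $length = $Buffer[$Offset]
    if ($length -eq 0) { return '' }
    # Assumption: one byte per character, ASCII.
    [System.Text.Encoding]::ASCII.GetString($Buffer, $Offset + 1, $length)
}

# Hypothetical helper: decode unsigned packed BCD with an implied decimal point.
function ConvertFrom-PackedBcd {
    param([byte[]] $Buffer, [int] $Offset, [int] $ByteCount, [int] $DecimalPlaces = 0)
    [decimal] $value = 0
    for ($i = $Offset; $i -lt $Offset + $ByteCount; $i++) {
        $high = $Buffer[$i] -shr 4        # first digit of the byte
        $low  = $Buffer[$i] -band 0x0F    # second digit of the byte
        $value = $value * 100 + $high * 10 + $low
    }
    $value / [decimal][Math]::Pow(10, $DecimalPlaces)
}

# Sample buffer: 05 'H' 'E' 'L' 'L' 'O', then BCD digits 12 34 56.
$buffer = [byte[]](0x05, 0x48, 0x45, 0x4C, 0x4C, 0x4F, 0x12, 0x34, 0x56)
$name  = ConvertFrom-LengthPrefixedString -Buffer $buffer -Offset 0
$price = ConvertFrom-PackedBcd -Buffer $buffer -Offset 6 -ByteCount 3 -DecimalPlaces 2
$name     # HELLO
$price    # 1234.56
```

Helpers like these could be called from additional switch cases, selected by whatever distinct type indicators you choose for those fields.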
<updated, added Santiago Squarzon suggest information>
I have two lists that I pull from CSV files; each of the two lists has only one column.
Here is how I read the lists in my script:
$orginal_list = Get-Content -Path .\random-word-350k-wo-quotes.txt
$filter_words = Get-Content -Path .\no_go_words.txt
However, I will use a typed list for simplicity in the code example below.
In this example, the $original_list can have some words repeated.
I want to filter out all of the words in $original_list that are in the $filter_words list.
Then add the filtered list to the variable $filtered_list.
In this example, $filtered_list would only have "dirt","turtle" in it.
I know the line below where I subtract the two lists won't work; it's there as a placeholder, as I don't know what to use to get the result.
Of note, the CSV file that feeds $original_list could have 300,000 or more rows, and $filter_words could have hundreds of rows, so I want this to be as efficient as possible.
The filtering is case insensitive.
$orginal_list = "yellow","blue","yellow","dirt","blue","yellow","turtle","dirt"
$filter_words = "yellow","blue","green","harsh"
$filtered_list = $orginal_list - $filter_words
$filtered_list
dirt
turtle
Use System.Collections.Generic.HashSet`1 and its .ExceptWith() method:
# Note: if possible, declare the lists as [string[]] arrays to begin with.
# Otherwise, use a [string[]] cast in the method calls below, which,
# however, creates a duplicate array on the fly.
[string[]] $orginal_list = "yellow","blue","yellow","dirt","blue","yellow","turtle","dirt"
[string[]] $filter_words = "yellow","blue","green","harsh"
# Create a hash set based on the strings in $orginal_list,
# with case-insensitive lookups.
$hsOrig = [System.Collections.Generic.HashSet[string]]::new(
$orginal_list,
[System.StringComparer]::CurrentCultureIgnoreCase
)
# Reduce it to those strings not present in $filter_words, in-place.
$hsOrig.ExceptWith($filter_words)
# Convert the filtered hash set to an array.
[string[]] $filtered_list = [string[]]::new($hsOrig.Count)
$hsOrig.CopyTo($filtered_list)
# Output the result
$filtered_list
The above yields:
dirt
turtle
To also speed up reading your input files, use the following:
# Note: [System.IO.File]::ReadAllLines() returns a [string[]] instance.
$orginal_list = [System.IO.File]::ReadAllLines((Convert-Path .\random-word-350k-wo-quotes.txt))
$filter_words = [System.IO.File]::ReadAllLines((Convert-Path .\no_go_words.txt))
Note:
.NET generally defaults to (BOM-less) UTF-8; pass a [System.Text.Encoding] instance as a second argument, if needed.
.NET's working dir. usually differs from PowerShell's, so the use of full paths is always advisable in .NET API calls, and that is what the Convert-Path calls ensure.
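Note that a hash set also removes duplicates from the result. If repeated words and the original order should be kept, one possible variation (a sketch, not part of the approach above) is to put only the filter words into a case-insensitive hash set and test each word against it. The variable names $filterSet and $kept are illustrative.

```powershell
[string[]] $orginal_list = "yellow","blue","yellow","dirt","blue","yellow","turtle","dirt"
[string[]] $filter_words = "yellow","blue","green","harsh"

# Build the lookup set from the (small) filter list only.
$filterSet = [System.Collections.Generic.HashSet[string]]::new(
    $filter_words,
    [System.StringComparer]::OrdinalIgnoreCase
)

# Keep every word that is not in the set; duplicates and order are preserved.
[string[]] $kept = foreach ($word in $orginal_list) {
    if (-not $filterSet.Contains($word)) { $word }
}

$kept    # dirt, turtle, dirt
```

Each Contains() call is a hash lookup, so the cost grows with the size of the word list rather than with the product of the two list sizes.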
I have found that using Linq to filter one list out from another is incredibly easy and incredibly fast (especially for large lists)
# ARRAY OF 1000 STRINGS LOWERCASE (item1 - item1000)
[string[]]$ThousandItems = 1..1000 | %{"item$_"};
# ARRAY OF 100 STRINGS UPPERCASE (ITEM901 - ITEM1000)
[string[]]$HundredItems = 901..1000 | %{"ITEM$_"};
# SUBTRACT THE SECOND ARRAY FROM THE FIRST ONE (CASE INSENSITIVELY)
[string[]]$NineHundred = [Linq.Enumerable]::Except($ThousandItems, $HundredItems, [System.StringComparer]::OrdinalIgnoreCase);
$NineHundred;
Which returns the list of 1000 items minus Item901-Item1000
item1
item2
...
item899
item900
As for speed, removing 100 items from a list...
1,000 Items = 1ms
10,000 Items = 2ms
100,000 Items = 12ms
1,000,000 Items = 259ms
10,000,000 Items = 3,008ms
Note: These times are just on the [Linq.Enumerable]::Except() line. So it's just measuring the time taken to subtract one array from the other. It does not measure the time taken to fill the array.
So to apply this to the original poster's example
$original_list = [System.IO.File]::ReadAllLines((Convert-Path .\random-word-350k-wo-quotes.txt));
$filter_words = [System.IO.File]::ReadAllLines((Convert-Path .\no_go_words.txt));
[string[]]$filtered_list = [Linq.Enumerable]::Except($original_list,$filter_words,[System.StringComparer]::OrdinalIgnoreCase);
For this, I literally inserted 350K strings (the MD5 hash of the numbers 1 - 350K) into the original list (uppercase), inserted 10K strings (the MD5 hash of the numbers 1-10K) into the filter words list (lowercase) and ran that code.
There were 340K words in the filtered list, and it only took 260ms to read both files, filter and return the list
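To reproduce a timing like this on your own machine, something along the following lines could be used. It is only a sketch with generated data; the variable names are illustrative and the elapsed time will vary. Except() is lazily evaluated, so the [string[]] cast is what forces the work to happen inside the timed region.

```powershell
# Generated test data: 10,000 lowercase items, the last 100 of which are to be removed.
[string[]] $big    = 1..10000 | ForEach-Object { "item$_" }
[string[]] $remove = 9901..10000 | ForEach-Object { "ITEM$_" }

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
# The cast enumerates the lazy sequence, so the subtraction is really measured.
[string[]] $kept = [Linq.Enumerable]::Except($big, $remove, [System.StringComparer]::OrdinalIgnoreCase)
$stopwatch.Stop()

"{0} items kept in {1} ms" -f $kept.Count, $stopwatch.ElapsedMilliseconds
```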
Can an array be used as the key in a hashtable? How can I reference the hashtable item with an array key?
PS C:\> $h = @{}
PS C:\> $h[@(1,2)] = 'a'
PS C:\> $h
Name Value
---- -----
{1, 2} a # looks like the key is a hash
PS C:\> $h[@(1,2)] # no hash entry
PS C:\> $h.Keys #
1
2
PS C:\> $h[@(1,2)] -eq 'a'
PS C:\> $h[@(1,2)] -eq 'b'
PS C:\> foreach ($key in $h.Keys) { $key.GetType() } # this is promising
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True Object[] System.Array
PS C:\> $PSVersionTable.PSVersion.ToString()
7.1.4
While you can use arrays as hashtable keys, doing so is impractical:
Update: There is a way to make arrays work as hashtable keys, but it requires nontrivial effort during construction of the hashtable - see this answer.
You'll have to use the very same array instances both as the keys and for later lookups.
The reason is that arrays, which are instances of .NET reference types (as opposed to value types such as integers), use the default implementation of the .GetHashCode() method to return a hash code (as used in hashtables), and this default implementation returns a different code for each instance - even for two array instances that one would intuitively think of as "the same".
In other words: you'll run into the same problem trying to use instances of any such .NET reference type as hashtable keys, including other collection types - unless a given type happens to have a custom .GetHashCode() implementation that explicitly considers distinct instances equal based on their content.
Additionally, it makes use of PowerShell's indexer syntax ([...]) awkward, because the array instance must be nested, with the unary form of ,, the array constructor operator. However, dot notation (property access) works as usual.
$h = @{}
# The array-valued key.
$key = 1, 2
$h[$key] = 'a'
# IMPORTANT:
# The following lookups work, but only because
# the *very same array instance* is used for the lookup.
# Nesting required so that PowerShell doesn't think that
# *multiple* keys are being looked up.
$h[, $key]
# Dot notation works normally.
$h.$key
# Does NOT work, because a *different array instance* is used.
$h.@(1,2)
A simple test for whether a given expression results in the same hashtable lookup every time and is therefore suitable as a key is to call the .GetHashCode() method on it repeatedly; only if the same number is returned every time (in a given session) can the expression be used:
# Returns *different* numbers.
@(1, 2).GetHashCode()
@(1, 2).GetHashCode()
To inspect a given object or type for whether it is (an instance of) a .NET reference type vs. value type:
# $false is returned in both cases, confirming that the .NET array
# type is a *reference type*
@(1, 2).GetType().IsValueType
[Array].IsValueType
Workaround:
A workaround would be to use string representations of arrays, though coming up with unique (enough) ones may be a challenge.
In the simplest case, use PowerShell's string interpolation, which represents arrays as a space-separated list of the elements' (stringified) values; e.g. "$(1, 2)" yields verbatim 1 2:
$h = @{}
# The array to base the key on.
$array = 1, 2
# Use the *stringified* version as the key.
$h["$array"] = 'a'
# Works, because even different array instances with equal-valued
# instances of .NET primitive types stringify the same.
# '1 2'
$h["$(1, 2)"]
iRon points out that this simplistic approach can lead to ambiguity (e.g., a single '1 2' string would result in the same key as array 1, 2) and recommends the following instead:
a more advanced/explicit way for array keys would be:
joining their elements with a non-printable character; e.g.
$key = $array -join [char]27
or, for complex object array elements, serializing the array:
$key = [System.Management.Automation.PSSerializer]::Serialize($array)
Note that even the XML (string)-based serialization provided by the System.Management.Automation.PSSerializer class (used in PowerShell remoting and background jobs for cross-process marshaling) has its limits with respect to reliably distinguishing instances, because its recursion depth is limited - see this answer for more information; you can increase the depth on demand, but doing so can result in very large string representations.
A concrete example:
using namespace System.Management.Automation
$ht = @{}
# Use serialization on an array-valued key.
$ht[[PSSerializer]::Serialize(@(1, 2))] = 'a'
# Despite using a different array instance, this
# lookup succeeds, because the serialized representation is the same.
$ht[[PSSerializer]::Serialize(@(1, 2))] # -> 'a'
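Another possibility, offered here only as a sketch under the assumption that the keys are arrays of simple values such as numbers or strings, is to construct the hashtable with .NET's structural equality comparer, so that arrays are hashed and compared by content rather than by reference. The variable names are illustrative.

```powershell
# Hashtable whose keys are compared element by element instead of by reference.
$h = [hashtable]::new(
    [System.Collections.StructuralComparisons]::StructuralEqualityComparer
)

$key1 = 1, 2
$key2 = 1, 2          # a different array instance with the same content

$h.Add($key1, 'a')

# .Item() avoids the indexer's enumeration of array arguments.
$h.Item($key2)        # a
$h.ContainsKey($key2) # True
```

The hashtable must be created this way up front; a hashtable literal cannot be switched to a different comparer afterwards.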
The primary cause of your problems here is that PowerShell's index access operator [] supports multi-index access by enumerating any array values passed.
To understand why, let's have a look at how the index accessor [...] actually works in PowerShell. Let's start with a simple hashtable, with 2 entries using scalar keys:
$ht = @{}
$ht['a'] = 'This is value A'
$ht['b'] = 'This is value B'
Now, let's inspect how it behaves!
Passing a scalar argument resolves to the value associated with the key represented by said argument, so far so good:
PS ~> $ht['a']
This is value A
But we can also pass an array argument, and all of a sudden PowerShell will try to resolve all items as individual keys:
PS ~> $ht[@('a', 'b')]
This is value A
This is value B
PS ~> $ht[@('b', 'a')] # let's try in reverse!
This is value B
This is value A
Now, to understand what happens in your example, let's try to add an entry with an array reference as the key, along with two other entries whose keys are the individual values found in the array:
$ht = @{}
$keys = 1,2
$ht[$keys[0]] = 'Value 1'
$ht[$keys[1]] = 'Value 2'
$ht[$keys] = 'Value 1,2'
And when we subsequently try to resolve the last entry using our array reference:
PS ~> $ht[$keys]
Value 1
Value 2
Oops! PowerShell unraveled the $keys array, and never actually attempted to resolve the entry associated with the key corresponding to the array reference in $keys.
In other words: the index accessor cannot be used to resolve dictionary entries by key if the key type is enumerable.
So, how does one access an entry by array reference without having PowerShell unravel the array?
Use the IDictionary.Item() parameterized property instead:
PS ~> $ht.Item($keys)
Value 1,2
I thought I had read enough examples here and elsewhere, but I still fail at creating arrays in PowerShell.
With this code I hoped to create slices of paired values from an array.
$values = @('hello','world','bonjour','moon','ola','mars')
function slice_array {
param (
[String[]]$Items
)
[int16] $size = 2
$pair = [string[]]::new($size) # size is 2
$returns = [System.Collections.ArrayList]@()
[int16] $_i = 0
foreach($item in $Items){
$pair[$_i] = $Item
$_i++;
if($_i -gt $size - 1){
$_i = 0
[void]$returns.Add($pair)
}
}
return $returns
}
slice_array($values)
the output is
ola
mars
ola
mars
ola
mars
I would hope for
'hello','world'
'bonjour','moon'
'ola','mars'
Is it possible to slice that array into an array of arrays of length 2?
Any explanation why it doesn't work as expected?
How should the code be changed?
Thanks for any hint to properly understand arrays in PowerShell!
Here's a PowerShell-idiomatic solution (the fix required for your code is in the bottom section):
The function is named Get-Slices to adhere to PowerShell's verb-noun naming convention (see the docs for more information).
Note: Often, the singular form of the noun is used, e.g. Get-Item rather than Get-Items, given that you situationally may get one or multiple output values; however, since the express purpose here is to slice a single object into multiple parts, I've chosen the plural.
The slice size (count of elements per slice) is passed as a parameter.
The function uses .., the range operator, to extract a single slice from an array.
It uses PowerShell's implicit output behavior (no need for return, no need to build up a list of return values explicitly; see this answer for more information).
It shows how to output an array as a whole from a function, which requires wrapping it in an auxiliary single-element array using the unary form of ,, the array constructor operator. Without this auxiliary array, the array's elements would be output individually to the pipeline (which is also used for function / script output; see this answer for more information).
# Note: For brevity, argument validation, pipeline support, error handling, ...
# have been omitted.
function Get-Slices {
param (
[String[]] $Items
,
[int] $Size # The slice size (element count)
)
$sliceCount = [Math]::Ceiling($Items.Count / $Size)
if ($sliceCount -le 1) {
# array is empty or as large as or smaller than a slice? ->
# wrap it *twice* to ensure that the output is *always* an
# *array of arrays*, in this case containing just *one* element
# containing the original array.
,, $Items
}
else {
foreach ($offset in 0..($sliceCount-1)) {
, $Items[($offset * $Size)..(($offset+1) * $Size - 1)] # output this slice
}
}
}
To slice an array into pairs and collect the output in an array of arrays (jagged array):
$arrayOfPairs =
Get-Slices -Items 'hello','world','bonjour','moon','ola','mars' -Size 2
Note:
Shell-like syntax is required when you call functions (commands in general) in PowerShell: arguments are whitespace-separated and not enclosed in (...) (see this answer for more information)
Since a function's declared parameters are positional by default, naming the arguments as I've done above (-Items ..., -Size ...) isn't strictly necessary, but helps readability.
Two sample calls:
"`n-- Get pairs (slice count 2):"
Get-Slices -Items 'hello','world','bonjour','moon','ola','mars' -Size 2 |
ForEach-Object { $_ -join ', ' }
"`n-- Get slices of 3:"
Get-Slices -Items 'hello','world','bonjour','moon','ola','mars' -Size 3 |
ForEach-Object { $_ -join ', ' }
The above yields:
-- Get pairs (slice count 2):
hello, world
bonjour, moon
ola, mars
-- Get slices of 3:
hello, world, bonjour
moon, ola, mars
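The effect of the unary , operator on function output can be seen in isolation with two throwaway functions. This is a sketch; the function names Get-Enumerated and Get-Wrapped are made up for the demonstration.

```powershell
# Outputs the three elements one by one to the pipeline.
function Get-Enumerated { 1, 2, 3 }

# Wraps the array in an auxiliary one-element array, so the pipeline
# unwraps only the wrapper and the inner array arrives as a single object.
function Get-Wrapped { , (1, 2, 3) }

$enumeratedCount = @(Get-Enumerated).Count   # 3 separate output objects
$wrappedCount    = @(Get-Wrapped).Count      # 1 output object (an array)

$enumeratedCount
$wrappedCount
```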
As for what you tried:
The only problem with your code was that you kept reusing the very same auxiliary array for collecting a pair of elements. Each iteration therefore overwrote the elements of the previous one, and in the end your array list contained multiple references to the same pair array, reflecting the last pair only.
This behavior occurs because arrays are instances of reference types rather than value types - see this answer for background information.
The simplest solution is to add a (shallow) clone of your $pair array to your list, which ensures that each list entry is a distinct array:
[void]$returns.Add($pair.Clone())
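Put together, the original function with that one change might look as follows. This is a sketch: the -Size parameter and the final , $returns (which emits the list as a single object) are additions for the demonstration, not part of the original code.

```powershell
function slice_array {
    param (
        [string[]] $Items,
        [int] $Size = 2
    )
    $pair = [string[]]::new($Size)
    $returns = [System.Collections.ArrayList]@()
    $i = 0
    foreach ($item in $Items) {
        $pair[$i] = $item
        $i++
        if ($i -ge $Size) {
            $i = 0
            # Store a copy, so later writes to $pair cannot change this entry.
            [void]$returns.Add($pair.Clone())
        }
    }
    # Emit the list as one object rather than letting the pipeline unroll it.
    , $returns
}

$slices = slice_array -Items 'hello','world','bonjour','moon','ola','mars'
$slices | ForEach-Object { $_ -join ', ' }
```

Note the call syntax: the argument is passed without parentheses, as described above.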
Why you got 3 equal pairs instead of different pairs:
.NET (which PowerShell is built on) is an object-oriented platform with the concept of reference types and value types. Many types, including arrays, are reference types.
What happens in your code:
1. You create a [string[]] object and assign it to $pair. The $pair variable actually stores the memory address of (a reference to) that [string[]] object, because arrays are reference types.
2. You fill the $pair array with values.
3. You add $pair to $returns. Remember that $pair is a reference to a memory block, so what gets added to $returns is the memory address of the [string[]] you wrote the values to.
4. You repeat step 2: you fill the $pair array with different values, but the address of this array in memory stays the same. In doing so, you replace the values from step 2 with new values in the same $pair object.
5. Repeat step 3.
6. Repeat step 4.
7. Repeat step 3.
As a result, $returns holds the same memory address three times: [[reference to $pair], [reference to $pair], [reference to $pair]]. And the values in $pair were overwritten with those of the last pair.
On output it works like this:
PowerShell looks at $returns, which is an array.
PowerShell looks at $returns[0], which references $pair
PowerShell outputs $pair[0]
PowerShell outputs $pair[1]
PowerShell looks at $returns[1], which references $pair
PowerShell outputs $pair[0]
PowerShell outputs $pair[1]
PowerShell looks at $returns[2], which references $pair
PowerShell outputs $pair[0]
PowerShell outputs $pair[1]
So you see, you output the object at the same memory address three times. You overwrote it 3 times in slice_array, and now it stores only the values of the last pair.
To fix it in your code, you should create a new $pair in memory: add $pair = [string[]]::new($size) just after $returns.Add($pair)
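The reference behaviour can be observed directly with a few lines. This is a sketch with made-up variable names: the first half shows two list entries pointing at the same array, the second half allocates a new array per entry.

```powershell
$pair = [string[]]::new(2)
$list = [System.Collections.ArrayList]@()

$pair[0] = 'hello'; $pair[1] = 'world'
[void]$list.Add($pair)

$pair[0] = 'ola'; $pair[1] = 'mars'      # overwrites the array already in the list
[void]$list.Add($pair)

$sameObject = [object]::ReferenceEquals($list[0], $list[1])   # True
$firstEntry = $list[0] -join ','                              # ola,mars

# Allocating a new array for every entry keeps the entries independent.
$fresh = [System.Collections.ArrayList]@()
foreach ($words in @('hello','world'), @('ola','mars')) {
    $p = [string[]]::new(2)
    $p[0] = $words[0]; $p[1] = $words[1]
    [void]$fresh.Add($p)
}
$freshFirst = $fresh[0] -join ','                             # hello,world

$sameObject; $firstEntry; $freshFirst
```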
I am new to PowerShell and I have a one-shot task to perform which must be done with PowerShell. This involves Active Directory as well. I need to add a new computer object to our AD, and one of the attributes I must set at creation time is a 16-byte binary value. As input I am getting a string which is a hexadecimal representation of the value I must set for the attribute.
I tried to input the value as is and it doesn't work. I tried escaping each byte with a backslash; that doesn't work either.
How should I format the input for this to work with the New-ADComputer command? I am setting a bunch of other attributes successfully. When I remove this binary entry from my hashtable passed to the -OtherAttributes option it works fine. So, obviously a format problem. I found nothing about the expected format for such attributes.
Any hints? TIA.
EDIT 2018-06-05 19:44 EDT:
I tried converting the string to a byte array as follows:
Function Convert-Hex2ByteArray {
[cmdletbinding()]
param(
[parameter(Mandatory=$true)]
[String]
$HexString
)
[byte[]] $Bytes = @(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
For($i=0; $i -lt $HexString.Length; $i+=2) {
$Bytes[$i/2] = [convert]::ToByte($HexString.Substring($i, 2), 16)
}
$Bytes
}
(...)
$netbootGUID = Convert-Hex2ByteArray($args[$indiceArgs])
$otherAttributes.add( "netbootGUID", $netbootGUID )
(...)
New-ADComputer -Credential $cred -Server $ADhost -Path "CN=Computers,$baseDN" -SAMAccountName $sAMAccountName -Name $name -Instance 4 -OtherAttributes $otherAttributes
This leads to the following error (I apologize for my own translation since the original message is shown in French):
Multiple values were specified for an attribute that can have only one value
Problem solved:
$netbootGUID = New-Object Guid $args[$indiceArgs]
$otherAttributes.add( "netbootGUID", $netbootGUID )
Did the trick.
Typically for binary storage you need to convert the string to a byte array:
$String = '3c6ef75eaa2c4b23992bbd65ac891917'
$ByteArray = [byte[]]$(for ($i = 0; $i -lt $String.Length; $i+=2) { [Convert]::ToByte($String.Substring($i,2), 16) })
To convert it back:
$NewString = -join $(foreach($Byte in $ByteArray) { $Byte.ToString('x2') })
If you want the characters upper case, specify 'X2' instead of 'x2'.
Since you're storing 16 byte values, I'll note that if you're storing GUIDs you may need to change the storage order since the order of bytes in a string representation of a GUID does not match the order of bytes in a byte representation of a GUID on an x86 system. Fortunately, there are built in functions for handling this conversion with the built-in System.Guid data type:
$GUID = 'f8d89eb2b49c4bfeab44a85ccdc4191a'
$ByteArray = [Guid]::new($GUID).ToByteArray()
And a constructor for converting back:
$NewGUID = [Guid]::new($ByteArray)
Whether or not you should use this method depends on exactly what property you're updating and whether or not the application(s) that will be using the property in question will correctly be handling the GUIDs or if they're just storing the GUID as raw bytes (which is incorrect but not surprising). You'll have to test by seeing what GUID your application sees and comparing it to the byte array in Active Directory to verify that it's correct.
For specifics on the byte ordering, see the documentation for Guid.ToByteArray():
Note that the order of bytes in the returned byte array is different from the string representation of a Guid value. The order of the beginning four-byte group and the next two two-byte groups is reversed, whereas the order of the last two-byte group and the closing six-byte group is the same. The example provides an illustration.
The reason for this is that a GUID is partially constructed from a series of integers of varying sizes, and the UUID standard specifies big endianness for those numbers. x86 computers are little-endian systems.
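The reordering is easy to see with a GUID whose digits simply count upwards. The GUID value below is an arbitrary illustration, not one from the question.

```powershell
# A recognisable pattern makes the byte swaps visible.
$guid  = [Guid]::new('00112233-4455-6677-8899-aabbccddeeff')
$bytes = $guid.ToByteArray()

# Hex dump of the byte array, in storage order.
$hex = -join ($bytes | ForEach-Object { $_.ToString('x2') })
$hex    # 33221100554477668899aabbccddeeff

# Round trip: the byte-array constructor reverses the reordering.
$roundTrip = [Guid]::new($bytes)
$roundTrip -eq $guid    # True
```

The first three groups come out byte-reversed, while the last two groups keep their order, matching the documentation quoted above.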
Is there a method for converting unique strings to unique integers in PowerShell?
I'm using a PowerShell function as a service bus between two API's,
the first API produces unique codes e.g. HG44X10999 (varchars)- but the second API which will consume the first as input, will only accept integers. I only care about keeping them unique.
I have looked at $string.gethashcode() but this produces negative integers and also changes between builds. Get-hash | $string -encoding ASCII obviously outputs varchars too.
Other examples on SO refer to converting a string of numeric characters to integers, i.e. $string = 123, but I can't find a way of quickly computing an int from a string of alphanumeric characters.
The Fowler-Noll-Vo hash function seems well-suited for your purpose, as it can produce a 32-bit hash output.
Here's a simple implementation in PowerShell (the offset basis and the prime are taken from the Wikipedia reference table for 32-bit outputs):
function Get-FNVHash {
param(
[string]$InputString
)
# Initial prime and offset chosen for 32-bit output
# See https://en.wikipedia.org/wiki/Fowler–Noll–Vo_hash_function
[uint32]$FNVPrime = 16777619
[uint32]$offset = 2166136261
# Convert string to byte array, may want to change based on input collation
$bytes = [System.Text.Encoding]::UTF8.GetBytes($InputString)
# Copy offset as initial hash value
[uint32]$hash = $offset
foreach($octet in $bytes)
{
# Apply XOR, multiply by prime and mod with max output size
$hash = $hash -bxor $octet
$hash = $hash * $FNVPrime % [System.Math]::Pow(2,32)
}
return $hash
}
Now you can repeatably produce distinct integers from the input strings:
PS C:\> Get-FNVHash HG44X10999
1174154724
If the target API only accepts positive signed 32-bit integers, you can change the modulus to [System.Math]::Pow(2,31) (doubling the chance of collisions, to approx. 1 in 4300 for 1000 distinct inputs).
For further insight into this simple approach, see this page on FNV and have a look at this article exploring short string hashing.
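As a cross-check, here is a sketch of the same XOR-then-multiply scheme (the FNV-1a variant) that keeps all intermediate values in 64-bit unsigned integers and masks down to 32 bits after each step. The function name Get-Fnv1aHash32 is made up; the values for the empty string and for 'a' are the published FNV-1a 32-bit test vectors.

```powershell
function Get-Fnv1aHash32 {
    param([string] $InputString)

    [uint64] $prime = 16777619        # 32-bit FNV prime
    [uint64] $mask  = 4294967295      # 2^32 - 1
    [uint64] $hash  = 2166136261      # 32-bit FNV offset basis

    foreach ($octet in [System.Text.Encoding]::UTF8.GetBytes($InputString)) {
        $hash = $hash -bxor [uint64] $octet      # XOR in the next byte
        $hash = ($hash * $prime) -band $mask     # multiply, keep the low 32 bits
    }
    [uint32] $hash
}

(Get-Fnv1aHash32 '').ToString('x8')     # 811c9dc5
(Get-Fnv1aHash32 'a').ToString('x8')    # e40c292c
```

The products stay below 2^64, so the arithmetic remains in exact integers throughout.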