Replace string until its length is less than limit with PowerShell - powershell

I am trying to update users' AD account properties with values imported from a CSV file.
The problem is that some of the properties, like department, allow strings of at most 64 characters, which is less than what the file provides, where values can be up to 110 characters long.
I have found and adapted the solution provided by TroyBramley in this thread - How to replace multiple strings in a file using PowerShell (thank you, Troy).
It works fine, but... well. After all the replacements have taken place, the text is less meaningful than it was originally.
For example, the original text First Department of something1 something2 something3 something4 would result in 1st Dept of sth1 sth2 sth3 sth4.
I'd like to have control over the process so I can stop it as soon as the length of the string drops just under the limit allowed by the AD property.
By the way, I'd also like a choice over which replacement happens first, second, and so on.
I put the elements into a hashtable alphabetically, but it seems they are not processed in that order. I can't figure out the pattern.
I can see a solution in replacing the strings one by one, checking the length after each replacement, but with almost 70 strings that leads to a huge amount of code. Maybe there is a simpler way?

You can iterate over the replacement list until the string reaches the defined $MaxLength.
## Q:\Test\2018\06\26\SO_51042611.ps1
$Original = "First Department of something1 something2 something3 something4"

# An ordered dictionary preserves insertion order, so the replacements
# are applied in exactly the order they are added here.
$list = New-Object System.Collections.Specialized.OrderedDictionary
$list.Add("First","1st")
$list.Add("Department","Dept")
$list.Add("something1","sth1")
$list.Add("something2","sth2")
$list.Add("something3","sth3")
$list.Add("something4","sth4")

$MaxLength = 40

ForEach ($Item in $list.GetEnumerator()){
    $Original = $Original -replace $Item.Key,$Item.Value
    If ($Original.Length -le $MaxLength){Break}
}
"{0}: {1}" -f $Original.Length,$Original
Sample output with $MaxLength set to 40:
37: 1st Dept of sth1 sth2 sth3 something4
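As an aside on the ordering question: a plain @{} hashtable makes no ordering guarantee, which is why the alphabetical entries were not processed alphabetically. If you prefer literal syntax over New-Object, the [ordered] accelerator produces the same insertion-ordered dictionary; a minimal sketch (my addition, not part of the original answer):

# [ordered]@{} creates a System.Collections.Specialized.OrderedDictionary,
# so entries are enumerated in the order they are written here.
$list = [ordered]@{
    First      = '1st'
    Department = 'Dept'
    something1 = 'sth1'
}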


Powershell filtering one list out of another list

(Update: added information suggested by Santiago Squarzon.)
I have two lists. I pull them from CSV files, but there is only one column in each of the two lists.
Here is how I pull in the lists in my script:
$orginal_list = Get-Content -Path .\random-word-350k-wo-quotes.txt
$filter_words = Get-Content -Path .\no_go_words.txt
However, I will use a typed list for simplicity in the code example below.
In this example, the $original_list can have some words repeated.
I want to filter out all of the words in $original_list that are in the $filter_words list,
then assign the filtered result to the variable $filtered_list.
In this example, $filtered_list would only have "dirt","turtle" in it.
I know the line below where I subtract the two lists won't work; it's there as a placeholder, as I don't know what to use to get the result.
Of note: the CSV file that feeds $original_list could have 300,000 or more rows, and $filter_words could have hundreds of rows, so I want this to be as efficient as possible.
The filtering is case-insensitive.
$orginal_list = "yellow","blue","yellow","dirt","blue","yellow","turtle","dirt"
$filter_words = "yellow","blue","green","harsh"
$filtered_list = $orginal_list - $filter_words
$filtered_list
dirt
turtle
Use System.Collections.Generic.HashSet`1 and its .ExceptWith() method:
# Note: if possible, declare the lists as [string[]] arrays to begin with.
# Otherwise, use a [string[]] cast in the method calls below, which,
# however, creates a duplicate array on the fly.
[string[]] $orginal_list = "yellow","blue","yellow","dirt","blue","yellow","turtle","dirt"
[string[]] $filter_words = "yellow","blue","green","harsh"

# Create a hash set based on the strings in $orginal_list,
# with case-insensitive lookups.
$hsOrig = [System.Collections.Generic.HashSet[string]]::new(
    $orginal_list,
    [System.StringComparer]::CurrentCultureIgnoreCase
)

# Reduce it to those strings not present in $filter_words, in-place.
$hsOrig.ExceptWith($filter_words)

# Convert the filtered hash set to an array.
[string[]] $filtered_list = [string[]]::new($hsOrig.Count)
$hsOrig.CopyTo($filtered_list)

# Output the result.
$filtered_list
The above yields:
dirt
turtle
To also speed up reading your input files, use the following:
# Note: [System.IO.File]::ReadAllLines() returns a [string[]] instance.
$orginal_list = [System.IO.File]::ReadAllLines((Convert-Path .\random-word-350k-wo-quotes.txt))
$filter_words = [System.IO.File]::ReadAllLines((Convert-Path .\no_go_words.txt))
Note:
.NET generally defaults to (BOM-less) UTF-8; pass a [System.Text.Encoding] instance as the second argument, if needed.
.NET's working directory usually differs from PowerShell's, so the use of full paths is advisable in .NET API calls; that is what the Convert-Path calls ensure.
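For instance, if an input file happened to be saved as UTF-16 ("Unicode"), a minimal sketch of that second argument (the file name is the one from the question; the encoding choice is an assumption):

# File.ReadAllLines(String, Encoding) overload with an explicit encoding.
$orginal_list = [System.IO.File]::ReadAllLines(
    (Convert-Path .\random-word-350k-wo-quotes.txt),
    [System.Text.Encoding]::Unicode
)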
I have found that using LINQ to filter one list out of another is incredibly easy and incredibly fast (especially for large lists).
# Array of 1000 lowercase strings (item1 - item1000).
[string[]]$ThousandItems = 1..1000 | ForEach-Object { "item$_" }

# Array of 100 uppercase strings (ITEM901 - ITEM1000).
[string[]]$HundredItems = 901..1000 | ForEach-Object { "ITEM$_" }

# Subtract the second array from the first one (case-insensitively).
[string[]]$NineHundred = [Linq.Enumerable]::Except($ThousandItems, $HundredItems, [System.StringComparer]::OrdinalIgnoreCase)

$NineHundred
This returns the list of 1000 items minus item901-item1000:
item1
item2
...
item899
item900
As for speed, removing 100 items from a list...
1,000 Items = 1ms
10,000 Items = 2ms
100,000 Items = 12ms
1,000,000 Items = 259ms
10,000,000 Items = 3,008ms
Note: These times are just on the [Linq.Enumerable]::Except() line. So it's just measuring the time taken to subtract one array from the other. It does not measure the time taken to fill the array.
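A minimal sketch of how such a timing can be taken with Measure-Command, wrapping only the subtraction (the figures above are the answerer's; yours will vary by machine):

# Measure just the Except() call, mirroring how the times above were scoped.
$elapsed = Measure-Command {
    [string[]]$NineHundred = [Linq.Enumerable]::Except(
        $ThousandItems, $HundredItems,
        [System.StringComparer]::OrdinalIgnoreCase
    )
}
"{0:N0} ms" -f $elapsed.TotalMilliseconds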
So, to apply this to the original poster's example:
$original_list = [System.IO.File]::ReadAllLines((Convert-Path .\random-word-350k-wo-quotes.txt));
$filter_words = [System.IO.File]::ReadAllLines((Convert-Path .\no_go_words.txt));
[string[]]$filtered_list = [Linq.Enumerable]::Except($original_list,$filter_words,[System.StringComparer]::OrdinalIgnoreCase);
For this, I literally inserted 350K strings (the MD5 hashes of the numbers 1-350K) into the original list (uppercase), inserted 10K strings (the MD5 hashes of the numbers 1-10K) into the filter-words list (lowercase), and ran that code.
There were 340K words in the filtered list, and it took only 260ms to read both files, filter, and return the list.

powershell extracting data from strings or other suggestions

I have a script I am writing that essentially reads data from an Excel document generated by another tool. It lists file ages in the formats shown below. My issue is that I would like to process each cell value and change the cell color based on that value, so anything older than 1 year gets changed to red, and anything 90+ days gets yellow/orange.
After a bit of research, I elected to use an if statement to determine when the age is greater than 0 years, which seems to work fine. However, when I reach the days portion, I'm not sure how to extract just the digits to the left of the d in each cell (stopping at the y if it's there), or alternatively to read the left digits only if the value contains a d; I could then further process whether that value is -gt 90. I am unsure how to extract a variable-length run of digits to the left of a character. I considered using a combination of the method below for finding a character position and returning everything up to the y, or something else.
Find character position and update file name
Possible Age Formats:
13y170d
3y249d
8h7m
1y109d
1y109d
1y109d
5d22h
3y281d
3y184d
11y263d
7m25s
1h14m
[regex]$years   = "\d{1,3}[0-9]y"
[regex]$days_90 = "\d{0,3}[0-9]d"

# conditionally formatting/coloring row based on age (years)
if ( $( A$_ -match "$years") -eq $True ) {
    $($test_home).$("Last Accessed") | ForEach-Object { $( $($_.Contains("y") -eq $True ) { New-ConditionalText -Text Red } }
}

# conditionally formatting/coloring row based on age (90+ days)
if ( $( A$_ -match "$days_90") -eq $True ) { New-ConditionalText -Text Yellow }
What you are after is a positive lookahead and lookbehind. Effectively, it gets the text between two characters or sets. Really handy if you have a consistently formatted set of data to work with.
[regex]$days_90 = '(?<=y).*?(?=d)'
.  Matches any character except line breaks.
*  Matches 0 or more of the preceding token.
?  Makes the regex lazy, so it matches as few characters as possible.
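A quick sketch of putting that pattern to work on the sample ages (my illustration; it tightens .*? to \d+ so only digits qualify, and values without a y/d pair, such as 8h7m, simply don't match):

$age = '13y170d'
# Capture the digits between 'y' and 'd', then compare numerically.
if ($age -match '(?<=y)\d+(?=d)') {
    $days = [int]$Matches[0]  # 170
    if ($days -gt 90) { 'yellow' }  # e.g. trigger the conditional-formatting call here
}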

Convert Byte Array (from legacy program data file) to Powershell object

I have a "structured" file (logical fixed-length records) from a legacy program on a legacy (non-MS) operating system. I know how the records were structured in the original program, but the original O/S handled structured data as a sequence of bytes for file I/O, so a hex dump won't show you anything more than what the record length is (there are marker bytes and other record overhead imposed by the access method API used to generate the file originally).
Once I have the sequence of bytes in a Powershell variable, with the overhead bytes "cut away", how can I convert this into a structured object? Some of the "fields" are 16-bit integers, some are strings of the form [s]data (where [s] is a byte giving the length of the "real" data in that field), some are BCD coded fixed-point numbers, some are IEEE floats.
(I haven't been specific about the structure, either on the Powershell side or on the legacy side, because I am seeking a more-or-less 'generic' solution/technique, as I actually have several different files with different record structures to process.)
Initially, I tried to do it by creating a type that could take the buffer and overwrite a struct so that all the fields were nicely filled in. However, certain issues arose (regarding struct layout, fixed buffers and mixing fixed and managed members) and I also realised that there was no guarantee that the data in the buffer would be properly (or even legally) aligned. I decided to try a more programmatic path.
"Manual" parsing is out, so how about automatic parsing? You're going to need to define the members of your PSObject at some point; why not do it in a way that can help programmatically parse the data? This method does not require the data in the buffer to be correctly aligned or even contiguous. You can also have fields overlap to separate raw unions into their individual members (though, typically, only one will contain a "correct" value).
First step: build a hash table that identifies the members, their offsets in the buffer, their data types and, if an array, the number of elements:
$struct = @{
    field1 = 0,[int],0;    # 0 means not an array
    field2 = 4,[byte],16;  # a C string maybe
    field3 = 24,[char],32; # wchar_t[32]? note: skipped over bytes 20-23
    field4 = 56,[double],0
}
# The names field1/2/3/4 are arbitrary; any valid member name may be used (but not
# necessarily any valid hash key if you want a PSObject as the end result).
# Also, the values could be hash tables instead of arrays. That would allow
# descriptive names for the values but doesn't affect the end result.
Next, use [BitConverter] to extract the required data. The problem here is that we need to call the correct method for all the varying types. Just use a (big) switch statement. The basic principle is the same for most values: get the type indicator and initial offset from the $struct definition, then call the correct [BitConverter] method, supplying the buffer and initial offset; update the offset to where the next element of an array would be, and then repeat for as many array elements as are required. The only trap here is that the data in the buffer must have the same format as expected by [BitConverter], so for the [double] example, the bytes in the buffer must conform to IEEE-754 floating-point format (assuming that [BitConverter]::ToDouble() is used). Thus, for example, raw data from a Paradox database will need some tweaking because it flips the high bit to simplify sorting.
$struct.keys | foreach {
    # key order is undefined but that won't affect the final object's members
    $hashobject = @{}
} {
    $fieldoffs = $struct[$_][0]
    $fieldtype = $struct[$_][1]
    if (($arraysize = $struct[$_][2]) -ne 0) { # yes, I'm a C programmer from way back
        $array = @()
    } else {
        $array = $null
    }
    :w while ($arraysize-- -ge 0) {
        switch ($fieldtype) {
            ([int]) {
                $value = [bitconverter]::toint32($buffer, $fieldoffs)
                $fieldoffs += 4
            }
            ([byte]) {
                $value = $buffer[$fieldoffs++]
            }
            ([char]) {
                $value = [bitconverter]::tochar($buffer, $fieldoffs)
                $fieldoffs += 2
            }
            ([string]) { # ANSI string, 1 byte per character
                $array = new-object string (,[char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)])
                # $arraysize has already been decremented so don't need to subtract 1
                break w # "array size" was actually string length so don't loop
                #
                # description:
                # first, get a slice of the buffer as a byte[] (assume single byte characters)
                # next, convert each byte to a char in a char[]
                # then, invoke the constructor String(Char[])
                # finally, put the String into $array ready for insertion into $hashobject
                #
                # Note the convoluted syntax - New-Object expects the second argument to be
                # an array of the constructor parameters but String(Char[]) requires only
                # one argument that is itself an array. By itself,
                #     [char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)]
                # is treated by PowerShell as an argument list of individual chars, corrupting the
                # constructor call. The normal trick is to prepend a single comma to create an array
                # of one element which is itself an array
                #     ,[char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)]
                # but this won't work because of the way PowerShell parses the command line. The
                # space before the comma is ignored so that instead of getting 2 arguments (a string
                # "String" and the array of an array of char), there is only one argument, an array
                # of 2 elements ("String" and array of array of char) thereby totally confusing
                # New-Object. To make it work you need to ALSO isolate the single element array into
                # its own expression. Hence the parentheses
                #     (,[char[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)])
                #
            }
        }
        if ($null -ne $array) {
            # must be in this order* to stop the -ne from enumerating $array to compare against
            # $null. this would result in the condition being considered false if $array were
            # empty ( (@() -ne $null) -> $null -> $false ) or contained only one element with
            # the value 0 ( (@(0) -ne $null) -> (scalar) 0 -> $false ).
            $array += $value
            # $array is not $null so must be an array to which $value is appended
        } else {
            # $array is $null only if $arraysize -eq 0 before the loop (and is now -1)
            $array = $value
            # so the loop won't repeat thus leaving this one scalar in $array
        }
    }
    $hashobject[$_] = $array
}
#* could have reversed it as
#     if ($array -eq $null) { scalar } else { collect array }
# since the condition will only be true if $array is actually $null or contains at
# least 2 $null elements (but no valid conversion will produce $null)
At this point there is a hash table, $hashobject, with keys equal to the field names and values containing the bytes from the buffer arranged into single (or arrays of) numeric (inc. char/boolean) values or (ANSI) strings. To create a (proper) object, just invoke New-Object -TypeName PSObject -Property $hashobject or use [PSCustomObject]$hashobject.
Of course, if the buffer actually contained structured data then the process would be more complicated but the basic procedure would be the same. Note also that the "types" used in the $struct hash table have no direct effect on the resultant types of the object members, they are only convenient selectors for the switch statement. It would work just as well with strings or numbers. In fact, the parentheses around the case labels are because switch parses them the same as command arguments. Without the parentheses, the labels would be treated as literal strings. With them, the labels are evaluated as a type object. Both the label and the switch value are then converted to strings (that's what switch does for values other than script blocks or $null) but each type has a distinct string representation so the case labels will still match up correctly. (Not really on point but still interesting, I think.)
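To make that last point concrete, here is a tiny standalone sketch (my illustration, not part of the original answer) showing type literals working as switch case labels:

$fieldtype = [int]
switch ($fieldtype) {
    ([int])  { 'four-byte integer' }  # label evaluates to a type object,
    ([char]) { 'two-byte character' } # then both sides compare as strings
}
# Output: four-byte integer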
Several optimisations are possible but increase the complexity slightly. E.g.
([byte]) { # already have a byte[] so why collect bytes one at a time
    if ($arraysize -ge 0) { # was originally -gt 0 so want a byte[]
        $array = [byte[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)]
        # slicing the byte array produces an object array (of bytes) so cast it back
    } else { # $arraysize was 0 so just a single byte
        $array = $buffer[$fieldoffs]
    }
    break w # $array ready for insertion into $hashobject, don't need to loop
}
But what if my strings are actually Unicode, you say? Easy: just use existing methods from the [Text.Encoding] class:
([string]) { # Unicode string, 2 (LE) bytes per character
    $array = [text.encoding]::unicode.getstring([byte[]]$buffer[$fieldoffs..($fieldoffs+$arraysize*2+1)])
    # $arraysize should be the string length so, initially, $arraysize*2 is the byte
    # count and $arraysize*2-1 is the end index (relative to $fieldoffs) but $arraysize
    # was decremented so the end index is now $arraysize*2+1, i.e. length*2-1 = (length-1)*2+1
    break w # got $array, no loop
}
You could also have both ANSI and Unicode by utilising a different type indicator for the ANSI string, maybe [char[]]. Remember, the type indicators do not affect the result, they just have to be distinct (and hopefully meaningful) identifiers.
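A hedged sketch of what such an extra case might look like, slotted into the same switch (the [char[]] indicator and the encoding choice are my assumptions, not the original answer's code):

([char[]]) { # ANSI string decoded via an explicit single-byte encoding
    # ASCII is an assumption here; substitute the legacy code page if known,
    # e.g. [Text.Encoding]::GetEncoding(437)
    $array = [Text.Encoding]::ASCII.GetString([byte[]]$buffer[$fieldoffs..($fieldoffs+$arraysize)])
    break w # "array size" was actually the string length, so don't loop
}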
I realise that this is not quite the "just dump the bytes into a union or variant record" solution mentioned in the OP's comment, but PowerShell is based on .NET and uses managed objects, where this sort of thing is largely prohibited (or difficult to get working, as I found). For example, assuming you could just dump raw chars (not bytes) into a String, how would the Length property get updated? This method also allows some useful preprocessing, such as splitting up unions as noted above or converting raw byte or char arrays into the Strings they represent.

How to convert hexadecimal encoded string to hexadecimal integer

So basically what I'm trying to achieve is to get a MAC address from a text file and increment the value by one.
I've been bashing my head against the Google/StackOverflow wall for a couple of hours; I think there's a concept I'm just not getting.
PowerShell:
$Last_MAC_Address = (Get-Content -LiteralPath "\\UNC\Path\Last MAC Address.txt")
Write-Host ($Last_MAC_Address)
# Output: 00155DE10B73
$Next_MAC_Address = (($Last_MAC_Address | Format-Hex) + 1)
This is a 3-step process, and although PetSerAl answered it in the comments as a one-liner, I'll break it down slightly for posterity (and use a different class).
The first step is to get the hex number as a decimal (mathematical base 10, not the decimal type).
The second step is incrementing the decimal.
And the final step is converting it back to hexadecimal.
Broken down, and not a one-liner, this will accomplish the task at hand:
$asDecimal = [System.Convert]::ToInt64("00155DE10B73", 16)
$asDecimal++
$asHex = [System.Convert]::ToString($asDecimal, 16)
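One caveat worth noting: [System.Convert]::ToString($asDecimal, 16) returns lowercase hex with no leading zeros (here 155de10b74, only 10 digits), so to get back to the canonical 12-digit MAC form you could pad and upcase it, e.g.:

# Restore the fixed-width, uppercase 12-digit representation.
$asHex = [System.Convert]::ToString($asDecimal, 16).PadLeft(12, '0').ToUpper()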
Another option is to prefix the value with 0x and cast it to [int64]:
$Next_MAC_Address = ([int64]"0x$Last_MAC_Address"+1).ToString('X12')
You could also use the format operator (-f) instead of the ToString() method:
$Next_MAC_Address = '{0:X12}' -f ([int64]"0x$Last_MAC_Address"+1)
There is, however, one thing that may be worth noting. MAC addresses aren't just random 6-byte numbers without any inner structure. They actually consist of two parts. The first 3 bytes form the Organizationally Unique Identifier (OUI), a vendor-specific prefix (00-15-5D is one of the OUIs belonging to Microsoft). Only the last 3 bytes are a random number, a unique identifier for each card from the vendor identified by the OUI.
Taking that into consideration you may want to split the MAC address accordingly, e.g. like this:
$oui, $nid = $Last_MAC_Address -split '(?<=^[0-9a-f]{6})(?=[0-9a-f]{6}$)'
or like this:
$oui = $Last_MAC_Address.Substring(0, 6)
$nid = $Last_MAC_Address.Substring(6, 6)
and increment only the NIC identifier, and only if it wouldn't overflow:
if ($nid -ne 'ffffff') {
    $Next_MAC_Address = "{0}{1:X6}" -f $oui, ([int64]"0x$nid"+1)
} else {
    Write-Error 'MAC address overflow.'
}
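And if the stored value ever needs to be displayed with separators rather than as a bare 12-digit string, a small sketch (the colon format is an assumption about your desired output):

# Insert a colon after every hex pair except the last.
$pretty = $Next_MAC_Address -replace '(..)(?!$)', '$1:'
# 00155DE10B74 -> 00:15:5D:E1:0B:74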

Issues importing csv column and replacing it from hash value

Please note that this data has been cleaned to remove identifying information, and considerable white space has been removed from between the commas in order to aid readability. Lastly, at the end of the TYPE column there is an additional line saying how many lines were exported, which hopefully will be ignored by the script.
TYPE ,DATE ,TIME ,STREET ,CROSS-STREET ,X-COORD ,Y-COORD
459 ,2015-05-03 00:00:00.000,00:58:35,FOO DR ,A RD/B CT , 0.0, 0.0
488 ,2015-05-03 00:00:00.000,02:31:54,BAR AV ,C ST/D ST , 0.0, 0.0
I am attempting to import this CSV using Import-Csv and convert the TYPE numeric codes into different strings. An example would be: 459 becomes Apple, 488 becomes Banana, and so forth. I have created a hash with the TYPE numbers as the keys and the values being what I want them changed to.
So my issue is really two-fold: I have so far been unable to get the TYPE CSV column to import into the script (I've been trying an array for the most part), and I am not sure of the best way to build the logic that checks the array data against my hash keys and replaces it with the appropriate value.
# declare filename to modify
$strFileName = "test.csv"
# import the type data into its own array
$imported_CSV = Import-Csv $strFileName
# populate hash
$conversion_Hash = @{
    187 = Homicide;
    211 = Robbery;
    245 = Assault;
    451 = Arson;
    459 = Burglary;
    484 = Larceny;
    487 = Grand Theft;
    488 = Petty Theft;
    10851 = Stolen Vehicle;
    HS = Drug;
}
# perform the conversion
foreach ($record in $imported_CSV)
{
    $conversion_Hash[$record.Type]
}
This has no logic yet and just contains the code that was presented in the answer below. Note that I addressed in the comments below that it doesn't work.
I think this is an example of what you are looking for:
$hashTable = @{459 = Apple; 488 = Banana;}
$csv = Import-Csv <file>
foreach ($record in $csv)
{
    $hashTable[$record.Type] # returns hash value
}
Output:
Apple
Banana
So we have several little issues here. The two big ones are your source file and the fact that your hashtable keys are integers, not strings.
# declare filename to modify
$strFileName = "c:\temp\point.csv"
# import the type data into its own array, stripping the whitespace around the commas
$imported_CSV = (Get-Content $strFileName) -replace "\s*,\s*","," | ConvertFrom-Csv
# populate hash (string keys, quoted values)
$conversion_Hash = @{
    "187"   = "Homicide";
    "211"   = "Robbery";
    "245"   = "Assault";
    "451"   = "Arson";
    "459"   = "Burglary";
    "484"   = "Larceny";
    "487"   = "Grand Theft";
    "488"   = "Petty Theft";
    "10851" = "Stolen Vehicle";
    "HS"    = "Drug";
}
# perform the conversion
foreach ($record in $imported_CSV)
{
    $conversion_Hash[$record.Type]
}
Output from naughty people
Burglary
Petty Theft
I don't know if your source file looks exactly like it does in your question, but there is a bunch of whitespace there that will give you a hassle. Namely, you don't have a TYPE column but a "TYPE " column (note the trailing spaces). The same goes for the other columns. The data is affected as well: it's not 459 but "459 " (again, trailing spaces).
To fix that, I read the file and replace all whitespace surrounding the commas with just the comma:
TYPE,DATE,TIME,STREET,CROSS-STREET,X-COORD,Y-COORD
459,2015-05-03 00:00:00.000,00:58:35,FOO DR,A RD/B CT,0.0,0.0
488,2015-05-03 00:00:00.000,02:31:54,BAR AV,C ST/D ST,0.0,0.0
If your data already looks like that, then you need to be careful posting this stuff in your question. On to the other issue, with your comparison.
You will see I have quoted almost everything in that hashtable. I had to for the values, as they were being taken as commands otherwise. I also quoted the keys, as the CSV data contains strings, not integers. I would have just cast to [int] to avoid the whole issue, but one of your keys is called "HS", which does not look like a number to me :).
What I might have done
Just to play a little, I might have added another note property to the list, called TypeAsString, which would add a column.
# perform the conversion
$imported_CSV | ForEach-Object {
    $_ | Add-Member -MemberType NoteProperty -Name "TypeAsString" -Value $conversion_Hash[$_.Type] -PassThru
}
So the output for one item would look like this:
TYPE : 459
DATE : 2015-05-03 00:00:00.000
TIME : 00:58:35
STREET : FOO DR
CROSS-STREET : A RD/B CT
X-COORD : 0.0
Y-COORD : 0.0
TypeAsString : Burglary
I could have made a more dynamic property, like a script property, so that changes in $conversion_Hash are picked up instantly, but this should suffice for what you need.
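For completeness, a hedged sketch of that script-property variant (my illustration, not the answerer's code); the getter script block runs on every read, so later edits to $conversion_Hash show up automatically:

# A ScriptProperty getter re-evaluates on each access; $this is the CSV row.
$imported_CSV | ForEach-Object {
    $_ | Add-Member -MemberType ScriptProperty -Name "TypeAsString" -Value { $conversion_Hash[$this.TYPE] } -PassThru
}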