Flatten folder structure and create index - powershell

Some people like to use long folder names and deep folder structures. This is especially painful when you use OneDrive/SharePoint Online and sync long paths with Windows.
Hence I am looking for an approach to shorten those paths and keep the meaning, especially when archiving files and folders.
Basically trying to transform:
VeryLongPath\AnotherLongFolderPath\AndAnotherOne\VeryLongFilename.xlsx
Into:
1\2\3\4.xlsx
And generating the following index file:
1 VeryLongPath
2 AnotherLongFolderPath
3 AndAnotherOne
4 VeryLongFilename.xlsx

You could create a helper class to track the individual labels:
class PathLabelIndexer
{
    # This will hold the label translations, [string] -> [int]
    [hashtable]$_terms = @{}
    # This will keep track of how many distinct labels we've encountered
    [int]$_length = 0

    # Transforms a path into index values
    [string] Transform([string]$Path){
        return @($Path.Split('\') |%{
            if($this._terms.ContainsKey($_)){
                # If we already have a translation for $_, use that!
                $this._terms[$_]
            }
            else {
                # No existing translation found, add a new one
                ($this._terms[$_] = $this._length++)
            }
        }) -join '\'
    }

    # Produces the index needed to translate them back to labels
    [string[]] GetIndex(){
        # Since $_length starts at 0, a sorted array of the index values will give us the correct mapping
        return [string[]]@($this._terms.GetEnumerator() |Sort-Object Value |ForEach-Object Key)
    }
}
Now you can do:
PS ~> $indexer = [PathLabelIndexer]::new()
PS ~> $indexer.Transform('VeryLongPath\AnotherLongFolderPath\AndAnotherOne\VeryLongFilename.xlsx')
0\1\2\3
To get the reverse, simply produce the resulting index and look up the individual labels again:
PS ~> $index = $indexer.GetIndex()
PS ~> $index['0\1\2\3'.Split('\')] -join '\'
VeryLongPath\AnotherLongFolderPath\AndAnotherOne\VeryLongFilename.xlsx
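The question also asks for an index file numbered from 1, while the class above counts from 0. A minimal sketch of one way to produce such a file, assuming the labels are already available in first-seen order (as GetIndex() returns them). The $labels values and the index.txt file name are made up for illustration:

```powershell
# Hypothetical labels, in the order they were first encountered
$labels = 'VeryLongPath', 'AnotherLongFolderPath', 'AndAnotherOne', 'VeryLongFilename.xlsx'

# Build one "<number> <label>" line per label, numbering from 1 as in the question
$indexLines = for ($i = 0; $i -lt $labels.Count; $i++) {
    '{0} {1}' -f ($i + 1), $labels[$i]
}

# Write the lines to an index file in the temp folder (location chosen only for this sketch)
$indexFile = Join-Path ([System.IO.Path]::GetTempPath()) 'index.txt'
Set-Content -Path $indexFile -Value $indexLines

$indexLines
```

Shifting the numbers by one only affects the written file; translating a shortened path back still has to subtract 1 before indexing into the label array.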

Related

Getting error when adding nested hashtable to array in Powershell

I have a nested hashtable with an array and I want to loop through the contents of another array and add that to the nested hashtable. I'm trying to build a Slack message block.
Here's the nested hashtable I want to add to:
$msgdata = @{
    blocks = @(
        @{
            type = 'section'
            text = @{
                type = 'mrkdwn'
                text = '*Services Being Used This Month*'
            }
        }
        @{
            type = 'divider'
        }
    )
}
$rows = @( @('azure vm', 'centralus'), @('azure sql', 'eastus'), @('azure functions', 'centralus'), @('azure monitor', 'eastus2') )
$serviceitems = @()
foreach ($r in $rows) {
    $servicetext = "*{0}* - {1}" -f $r[1], $r[0]
    $serviceitems += @{'type'='section'}
    $serviceitems += @{'text'= ''}
    $serviceitems.text.Add('type'='mrkdwn')
    $serviceitems.text.Add('text'=$servicetext)
    $serviceitems += @{'type'='divider'}
}
$msgdata.blocks += $serviceitems
The code is partially working. The hashtables @{'type'='section'} and @{'type'='divider'} get added successfully. Trying to add the nested hashtable @{'text' = @{ 'type'='mrkdwn'; 'text'=$servicetext }} fails with this error:
Line |
24 | $serviceitems.text.Add('type'='mrkdwn')
| ~
| Missing ')' in method call.
I tried looking through various Powershell posts and couldn't find one that applies to my specific situation. I'm brand new to using hashtables in Powershell.
Complementing mklement0's helpful answer, which solves the problem with your existing code, I suggest the following refactoring, using inline hashtables:
$serviceitems = foreach ($r in $rows) {
    @{
        type = 'section'
        text = @{
            type = 'mrkdwn'
            text = "*{0}* - {1}" -f $r[1], $r[0]
        }
    }
    @{
        type = 'divider'
    }
}
$msgdata.blocks += $serviceitems
This looks much cleaner and thus easier to maintain in my opinion.
Explanations:
$serviceitems = foreach ... captures all output (to the success stream) of the foreach loop in variable $serviceitems. PowerShell automatically creates an array from the output, which is more efficient than manually adding to an array using the += operator. Using += PowerShell has to recreate an array of the new size for each addition, because arrays are actually of fixed size. When PowerShell automatically creates an array, it uses a more efficient data structure internally.
By writing out an inline hash table, without assigning it to a variable, PowerShell implicitly outputs the data, in effect adding it to the $serviceitems array.
We output two hash tables per loop iteration, so PowerShell adds two array elements to $serviceitems per loop iteration.
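To check the points above, here is a self-contained sketch that runs the refactored loop on two sample rows and then serializes the result. The shortened $rows data and the -Depth value are assumptions made for this example only:

```powershell
# Two sample rows (service name, region); shortened data for illustration
$rows = @( @('azure vm', 'centralus'), @('azure sql', 'eastus') )

$msgdata = @{
    blocks = @(
        @{ type = 'section'; text = @{ type = 'mrkdwn'; text = '*Services Being Used This Month*' } }
        @{ type = 'divider' }
    )
}

# Each iteration outputs two hashtables; PowerShell collects them into an array
$serviceitems = foreach ($r in $rows) {
    @{ type = 'section'; text = @{ type = 'mrkdwn'; text = '*{0}* - {1}' -f $r[1], $r[0] } }
    @{ type = 'divider' }
}

$msgdata.blocks += $serviceitems

# -Depth must be large enough to reach the nested 'text' hashtables
$json = $msgdata | ConvertTo-Json -Depth 5
$json
```

With two rows, $serviceitems ends up with four elements, and $msgdata.blocks with six.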
Note:
This answer addresses your question as asked, specifically its syntax problems.
For a superior solution that bypasses the original problems in favor of streamlined code, see zett42's helpful answer.
$serviceitems.text.Add('type'='mrkdwn') causes a syntax error.
Generally speaking, IF $serviceitems.text referred to a hashtable (dictionary), you need either:
method syntax with distinct, ,-separated arguments:
$serviceitems.text.Add('type', 'mrkdwn')
or index syntax (which would quietly overwrite an existing entry, if present):
$serviceitems.text['type'] = 'mrkdwn'
PowerShell even lets you access hashtable (dictionary) entries with member-access syntax (dot notation):
$serviceitems.text.type = 'mrkdwn'
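The three syntax forms above can be compared side by side on a standalone hashtable. This is only a sketch; the $text variable and the key names are chosen for illustration:

```powershell
$text = @{}

# Method syntax: .Add() throws if the key already exists
$text.Add('type', 'mrkdwn')

# Index syntax: creates the entry, or quietly overwrites an existing one
$text['text'] = 'first'
$text['text'] = 'second'

# Member-access syntax (dot notation) behaves like index syntax
$text.format = 'plain'

# A second .Add() with the same key fails instead of overwriting
$addFailed = $false
try { $text.Add('type', 'plain_text') } catch { $addFailed = $true }

$text
```

After this runs, 'type' is still 'mrkdwn', because the failed .Add() call leaves the existing entry untouched.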
In your specific case, additional considerations come into play:
You're accessing a hashtable via an array, instead of directly.
The text entry you're trying to target isn't originally a nested hashtable, so you cannot call .Add() on it; instead, you must assign a new hashtable to it.
Therefore:
# Define an empty array
$serviceItems = @()
# "Extend" the array by adding a hashtable.
# Note: Except with small arrays, growing them with +=
# should be avoided, because a *new* array must be allocated
# every time.
$serviceItems += @{ text = '' }
# Refer to the hashtable via the array's last element (-1),
# and assign a nested hashtable to it.
$serviceItems[-1].text = @{ 'type' = 'mrkdwn' }
# Output the result.
$serviceItems

How do I add a line feed (new line) in a csv file (powershell)

I have large csv file that should contain many records. However, for some reason, there are no line feeds or new record delimiters so as to be able to treat the various records separately (example by importing them to excel)*. Is there any way (eg with windows powershell) that I can add a line feed before a given field in the csv file? For example suppose we have an input csv file with contents :
data1,data2,data3,data4,data5,data6,data7,data8,data9,data10,data11,data12
The request is to get an output csv like this (so every record should contain 3 cells / fields... however, this should be configurable):
data1,data2,data3
data4,data5,data6
data7,data8,data9
data10,data11,data12
The above example is for illustration only. Consider that my real case contains a huge amount of data fields that I somehow need to organize.
Thank you very much in advance for every response
*Actually I have deliberately eliminated every new line feed from my source data. I did this to get rid of some unwanted newlines and other formatting characters (\t etc) that existed inside specific cells and totally messed up the structure of the data set. However, this way I lost the required newlines \n as well. Now I want to add them back, selecting the proper position they should be.
p.s. Since I am very new to powershell or scripting in general, sorry if I am making an obvious or trivial question.....
You can "wrap around" an arbitrary number of values in X columns by calculating the relative column offset with: $index % $columnWidth
I'd suggest writing a small function for this, something like:
function ConvertTo-TabularCollection {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string[]]$Data,
        [Parameter(Mandatory)]
        [string[]]$ColumnNames
    )
    begin {
        # Calculate table width and prepare list to collect input data
        $width = $ColumnNames.Length
        $values = [System.Collections.Generic.List[string]]::new()
    }
    process {
        # Copy any input to our `$values` list
        $values.AddRange($Data)
    }
    end {
        # Time to process all the input values we've collected
        for($i = 0; $i -lt $values.Count; $i++){
            # Use the modulo operator to calculate the relative column offset
            $offset = $i % $width
            if($offset -eq 0){
                # We're about to process the first column of a new row, create an empty dictionary to hold the column values
                $properties = [ordered]@{}
            }
            # Pick the next available value and store it in the appropriate column
            $properties[$ColumnNames[$offset]] = $values[$i]
            if($offset -eq ($width - 1)){
                # We've reached the last column, output object and clear previous column values collected
                [pscustomobject]$properties
                $properties = $null
            }
        }
        # Test if there is a (partial) row trailing, output object
        if($properties){
            [pscustomobject]$properties
        }
    }
}
Now you can transform your data as needed:
PS ~> $data = 'data1,data2,data3,data4,data5,data6,data7,data8,data9,data10,data11,data12' -split ','
PS ~> $data |ConvertTo-TabularCollection -ColumnNames col1,col2,col3
col1   col2   col3
----   ----   ----
data1  data2  data3
data4  data5  data6
data7  data8  data9
data10 data11 data12
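If the goal is only to get the line breaks back, without naming columns, the same wrap-around idea can be sketched with a plain loop that steps through the values in chunks. This is a simplified alternative, not the function above. It assumes no field contains an embedded comma or quote, and $width is a made-up name for the configurable record size:

```powershell
$data  = 'data1,data2,data3,data4,data5,data6,data7,data8,data9,data10,data11,data12' -split ','
$width = 3   # fields per record (configurable)

# Step through the values $width at a time and join each chunk back into one CSV line
$lines = for ($i = 0; $i -lt $data.Count; $i += $width) {
    # Clamp the end index so that a trailing partial record is handled too
    $last = [Math]::Min($i + $width, $data.Count) - 1
    $data[$i..$last] -join ','
}

$lines
```

The resulting lines can then be written to a file with Set-Content.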

PowerShell create array failed in a loop

I thought I had read enough examples here and elsewhere, but I still fail at creating arrays in PowerShell.
With the following code I hoped to create slices of value pairs from an array.
$values = @('hello','world','bonjour','moon','ola','mars')
function slice_array {
    param (
        [String[]]$Items
    )
    [int16] $size = 2
    $pair = [string[]]::new($size) # size is 2
    $returns = [System.Collections.ArrayList]@()
    [int16] $_i = 0
    foreach($item in $Items){
        $pair[$_i] = $Item
        $_i++;
        if($_i -gt $size - 1){
            $_i = 0
            [void]$returns.Add($pair)
        }
    }
    return $returns
}
slice_array($values)
the output is
ola
mars
ola
mars
ola
mars
I would hope for
'hello','world'
'bonjour','moon'
'ola','mars'
Is it possible to slice that array into an array of arrays with length 2?
Any explanation why it doesn't work as expected?
How should the code be changed ?
Thanks for any hint to properly understand Arrays in PowerShell !
Here's a PowerShell-idiomatic solution (the fix required for your code is in the bottom section):
The function is named Get-Slices to adhere to PowerShell's verb-noun naming convention (see the docs for more information).
Note: Often, the singular form of the noun is used, e.g. Get-Item rather than Get-Items, given that you situationally may get one or multiple output values; however, since the express purpose here is to slice a single object into multiple parts, I've chosen the plural.
The slice size (count of elements per slice) is passed as a parameter.
The function uses .., the range operator, to extract a single slice from an array.
It uses PowerShell's implicit output behavior (no need for return, no need to build up a list of return values explicitly; see this answer for more information).
It shows how to output an array as a whole from a function, which requires wrapping it in an auxiliary single-element array using the unary form of ,, the array constructor operator. Without this auxiliary array, the array's elements would be output individually to the pipeline (which is also used for function / script output; see this answer for more information).
# Note: For brevity, argument validation, pipeline support, error handling, ...
# have been omitted.
function Get-Slices {
    param (
        [String[]] $Items
        ,
        [int] $Size # The slice size (element count)
    )
    $sliceCount = [Math]::Ceiling($Items.Count / $Size)
    if ($sliceCount -le 1) {
        # array is empty or as large as or smaller than a slice? ->
        # wrap it *twice* to ensure that the output is *always* an
        # *array of arrays*, in this case containing just *one* element
        # containing the original array.
        ,, $Items
    }
    else {
        foreach ($offset in 0..($sliceCount-1)) {
            , $Items[($offset * $Size)..(($offset+1) * $Size - 1)] # output this slice
        }
    }
}
To slice an array into pairs and collect the output in an array of arrays (jagged array):
$arrayOfPairs =
Get-Slices -Items 'hello','world','bonjour','moon','ola','mars' -Size 2
Note:
Shell-like syntax is required when you call functions (commands in general) in PowerShell: arguments are whitespace-separated and not enclosed in (...) (see this answer for more information)
Since a function's declared parameters are positional by default, naming the arguments as I've done above (-Items ..., -Size ...) isn't strictly necessary, but helps readability.
Two sample calls:
"`n-- Get pairs (slice count 2):"
Get-Slices -Items 'hello','world','bonjour','moon','ola','mars' -Size 2 |
ForEach-Object { $_ -join ', ' }
"`n-- Get slices of 3:"
Get-Slices -Items 'hello','world','bonjour','moon','ola','mars' -Size 3 |
ForEach-Object { $_ -join ', ' }
The above yields:
-- Get pairs (slice count 2):
hello, world
bonjour, moon
ola, mars
-- Get slices of 3:
hello, world, bonjour
moon, ola, mars
As for what you tried:
The only problem with your code was that you kept reusing the very same auxiliary array for collecting a pair of elements, so that subsequent iterations replaced the elements of the previous ones, so that, in the end, your array list contained multiple references to the same pair array, reflecting the last pair only.
This behavior occurs because arrays are instances of reference types rather than value types - see this answer for background information.
The simplest solution is to add a (shallow) clone of your $pair array to your list, which ensures that each list entry is a distinct array:
[void]$returns.Add($pair.Clone())
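The difference between adding the array itself and adding a clone can be seen in isolation with the following sketch (the variable names are illustrative):

```powershell
$pair = [string[]]::new(2)
$list = [System.Collections.ArrayList]::new()

$pair[0] = 'hello'; $pair[1] = 'world'
[void]$list.Add($pair)           # stores a reference to the *same* array
[void]$list.Add($pair.Clone())   # stores an independent (shallow) copy

# Overwrite the contents of the original array
$pair[0] = 'ola'; $pair[1] = 'mars'

$first  = $list[0] -join ', '    # reflects the overwrite: ola, mars
$second = $list[1] -join ', '    # the clone kept its values: hello, world
$first
$second
```

A shallow clone is sufficient here because the elements are strings, which are immutable.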
Why you got three identical pairs instead of three different ones:
.NET, which PowerShell is built on, distinguishes between reference types and value types. Most types, including arrays, are reference types.
What happens in your code:
1. You create a [string[]] object and assign it to $pair. The $pair variable actually stores a reference to (the memory address of) that object, because arrays are reference types.
2. You fill the $pair array with values.
3. You add $pair to $returns. Remember that $pair is a reference to a memory block, so what gets added to $returns is the memory address of the [string[]] you wrote the values to.
4. You repeat step 2: you fill the $pair array with different values, but the address of this array in memory stays the same. In doing so, you replace the values from step 2 with new values in the same $pair object.
5. Same as step 3.
6. Same as step 4.
7. Same as step 3.
As a result, $returns contains the same memory address three times: [[reference to $pair], [reference to $pair], [reference to $pair]]. The $pair values were overwritten along the way and now hold the last pair.
On output it works like this:
PowerShell looks at $returns, which is an array.
PowerShell looks at $returns[0], which references $pair
PowerShell outputs $pair[0]
PowerShell outputs $pair[1]
PowerShell looks at $returns[1], which references $pair
PowerShell outputs $pair[0]
PowerShell outputs $pair[1]
PowerShell looks at $returns[2], which references $pair
PowerShell outputs $pair[0]
PowerShell outputs $pair[1]
So you see, you output the object at the same memory address three times. You overwrote it three times in slice_array, and now it holds only the values of the last pair.
To fix it in your code, create a new $pair in memory: add $pair = [string[]]::new($size) just after $returns.Add($pair)

finding index of key in an ordered dictionary in powershell

I am having a little bit of trouble with hashtables/dictionaries in powershell. The most recent roadblock is the ability to find the index of a key in an ordered dictionary.
I am looking for a solution that isn't simply iterating through the object.
(I already know how to do that)
Consider the following example:
$dictionary = [Ordered]@{
    'a' = 'blue';
    'b'='green';
    'c'='red'
}
If this were a normal array I'd be able to look up the index of an entry by using IndexOf().
[array]::IndexOf($dictionary,'c').
That would return 2 under normal circumstances.
If I try that with an ordered dictionary, though, I get -1.
Any solutions?
Edit:
In case anyone reading over this is wondering what I'm talking about. What I was trying to use this for was to create an object to normalize property entries in a way that also has a numerical order.
I was trying to use this for the status of a process, for example:
$_processState = [Ordered]@{
    'error' = 'error'
    'none' = 'none'
    'started' = 'started'
    'paused' = 'paused'
    'cleanup' = 'cleanup'
    'complete' = 'complete'
}
If you were able to easily do this, the above object would give $_processState.error an index value of 0 and ascend through each entry, finally giving $_processState.complete an index value of 5. Then if you compared two properties, by "index value", you could see which one is further along by simple operators. For instance:
$thisObject.Status = $_processState.complete
If ($thisObject.Status -ge $_processState.cleanup) {Write-Host 'All done!'}
PS > All done!
^^that doesn't work as is, but that's the idea. It's what I was aiming for. Or maybe to find something like $_processState.complete.IndexNumber()
Having an object like this also lets you assign values by the index name, itself, while standardizing the options...
$thisObject.Status = $_processState.paused
$thisObject.Status
PS > paused
Not really sure this was the best approach at the time or if it still is the best approach with all the custom class options there are available in PS v5.
It can be simpler
It may not be any more efficient than the answer from Frode F., but perhaps more concise (inline) would be simply putting the hash table's keys collection in a sub expression ($()) then calling indexOf on the result.
For your hash table...
Your particular expression would be simply:
$($dictionary.keys).indexOf('c')
...which gives the value 2 as you expected. This also works just as well on a regular hashtable... unless the hashtable is modified in pretty much any way, of course... so it's probably not very useful in that case.
In other words
Using this hash table (which also shows many of the ways to encode 4...):
$hashtable = [ordered]@{
    sample = 'hash table'
    0 = 'hello'
    1 = 'goodbye'
    [char]'4' = 'the ansi character 4 (code 52)'
    [char]4 = 'the ansi character code 4'
    [int]4 = 'the integer 4'
    '4' = 'a string containing only the character 4'
    5 = "nothing of importance"
}
would yield the following expression/results pairs:
# Expression                                  Result
#-------------------------------------------  ------
$($hashtable.keys).indexof('5')               -1
$($hashtable.keys).indexof(5)                 7
$($hashtable.keys).indexof('4')               6
$($hashtable.keys).indexof([char]4)           4
$($hashtable.keys).indexof([int]4)            5
$($hashtable.keys).indexof([char]'4')         3
$($hashtable.keys).indexof([int][char]'4')    -1
$($hashtable.keys).indexof('sample')          0
by the way:
[int][char]'4' equals [int]52
[char]'4' has a "value" (magnitude?) of 52, but is a character, so it's used as such
...gotta love the typing system, which, while flexible, can get really really bad at times, if you're not careful.
Dictionaries use keys, not indexes. OrderedDictionary combines a hashtable and an ArrayList to give you order/index support in a dictionary; however, it's still a dictionary (key-based) collection.
If you need to get the index of an object in an OrderedDictionary (or a hashtable), you need to use a foreach loop and a counter. Example (should be created as a function):
$hashTable = [Ordered]@{
    'a' = 'blue';
    'b'='green';
    'c'='red'
}
$i = 0
foreach($key in $hashTable.Keys) {
    if($key -eq "c") { $i; break }
    else { $i++ }
}
That's how it works internally too. You can verify this by reading the source code for OrderedDictionary's IndexOfKey method in .NET Reference Source
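If the index is needed repeatedly, one option is to walk the keys once and cache each key's position in a second hashtable. The following is only a sketch of that idea, with $indexOf as a made-up name. The cache goes stale as soon as entries are inserted into or removed from the dictionary:

```powershell
$dictionary = [ordered]@{
    'a' = 'blue'
    'b' = 'green'
    'c' = 'red'
}

# One pass over the keys to build a key -> position lookup table
$indexOf = @{}
$i = 0
foreach ($key in $dictionary.Keys) {
    $indexOf[$key] = $i++
}

$cached = $indexOf['c']                                  # 2

# One-off alternative: copy the keys into an array and search that
$oneOff  = [array]::IndexOf(@($dictionary.Keys), 'c')    # 2
$missing = [array]::IndexOf(@($dictionary.Keys), 'z')    # -1 when the key does not exist
```

Both variants still iterate over the keys; the lookup table just moves that cost to a single up-front pass.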
For the initial problem I was attempting to solve, a comparable process state, you can now use Enumerations starting with PowerShell v5.
You use the Enum keyword, set the Enumerators by name, and give them an integer value. The value can be anything, but I'm using ascending values starting with 0 in this example:
Enum _ProcessState{
    Error = 0
    None = 1
    Started = 2
    Paused = 3
    Cleanup = 4
    Complete = 5
    Verified = 6
}
#the leading _ for the Enum is just cosmetic & not required
Once you've created the Enum, you can assign it to variables. The contents of the variable will return the text name of the Enum, and you can compare them as if they were integers.
$Item1_State = [_ProcessState]::Started
$Item2_State = [_ProcessState]::Cleanup
#return state of second variable
$Item2_state
#comparison
$Item1_State -gt $Item2_State
Will return:
Cleanup
False
If you wanted to compare and return the highest:
#sort the two objects, then return the first result (should return the item with the largest enum int)
$results = ($Item1_State,$Item2_State | Sort-Object -Descending)
$results[0]
Fun fact, you can also use arithmetic on them, for example:
$Item1_State + 1
$Item1_State + $Item2_State
Will return:
Paused
Verified
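Applied to the status check from the question, this could look like the sketch below. The enum name ProcessState is hypothetical, and the casts are included only to show how values convert to and from integers and names:

```powershell
enum ProcessState {
    Error    = 0
    None     = 1
    Started  = 2
    Paused   = 3
    Cleanup  = 4
    Complete = 5
}

$status = [ProcessState]::Complete

# Comparison operators work on the underlying integer values
$isDone = $status -ge [ProcessState]::Cleanup    # True

# Explicit conversions between the enum, its integer value, and its name
$asInt    = [int]$status                         # 5
$fromInt  = [ProcessState]3                      # Paused
$fromName = [ProcessState]'paused'               # Paused (name matching ignores case)

if ($isDone) { 'All done!' }
```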
More info on Enum here:
https://blogs.technet.microsoft.com/heyscriptingguy/2015/08/26/new-powershell-5-feature-enumerations/
https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_enum?view=powershell-6
https://psdevopsug.scot/post/working-with-enums-in-powershell/

Powershell - Use multidimention arrays for storing data - Need suggestion

I've been googling for a while, but I haven't found a solution to my problem.
I have to say I'm a newbie in PowerShell.
I would like to create the following array
$a = (A,B,C,D) where
A = 1 string (always)
B = 1 string (always)
C = undefined number of strings. I need to be able to add elements dynamically
D = undefined number of strings. I need to be able to add elements dynamically (same number as C)
Is this possible?
Example of 2 elements of the array
("WSTM0123456", "192.168.10.155",("WSTM8765421","WSTM9856454","WSTM1289765"),("192.36.36.36", "187.25.25.25","192.69.89.65"))
("WLDN1251254", "156.25.36.54", ("WLDN1234512", "WLDN9865323"), ("187.154.12.12","163.136.25.98"))
I don't know a priori how many elements will be in C and D and I'll have to append strings in position C and D with a for cycle.
Scope: group many strings (C & D) under the same string (A/B) which are in common.
Any help would be appreciated
Thanks,
Marco
You can do this, but it's probably quite painful as dealing with arrays is sometimes cumbersome in PowerShell due to lots of implicit flattening.
I'd suggest creating a custom type for this. Then you can also give the individual parts useful names (I don't know the purpose of what you're doing here, so I'm making up names here. Feel free to change):
$properties = @{
    Name = 'WSTM0123456';
    IP = [ipaddress]'192.168.10.155';
    ListOfNames = @("WSTM8765421","WSTM9856454","WSTM1289765");
    ListOfIPs = [ipaddress[]]@("192.36.36.36", "187.25.25.25","192.69.89.65")
}
$foo = New-Object PSObject -Property $properties
Then you can simply append new items like so:
$foo.ListOfNames += 'AnotherName'
I think this is pretty much the same idea. Use a hash table, and make two of the elements arrays. This is how you would create the arrays "on the fly" at runtime, without knowing what any of the contents were going to be in advance, taking $x and putting any item that starts with "t" in "C" , and everything else in "D":
$a = @{A = "Some string";B = "Some other string"}
$x = "one","two","three","four","five"
$x |% {
    if ($_ -match "^t"){$a["C"] += @($_)}
    else {$a["D"] += @($_)}
}
$a.a
Some string
$a.b
Some other string
$a.c
two
three
$a.d
one
four
five
$obj = new-object psobject -property $a
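Because C and D grow inside a loop, a resizable list avoids re-creating an array on every += addition. A minimal sketch of that variant, assuming PowerShell 5 or later. The property names RelatedNames and RelatedIPs and the $pairs sample data are invented for illustration:

```powershell
# One entry: A and B are single strings, C and D are lists that can grow
$entry = [pscustomobject]@{
    Name         = 'WSTM0123456'
    IP           = '192.168.10.155'
    RelatedNames = [System.Collections.Generic.List[string]]::new()
    RelatedIPs   = [System.Collections.Generic.List[string]]::new()
}

# Hypothetical input: name/IP pairs that belong to the entry above
$pairs = @(
    @{ Name = 'WSTM8765421'; IP = '192.36.36.36' }
    @{ Name = 'WSTM9856454'; IP = '187.25.25.25' }
    @{ Name = 'WSTM1289765'; IP = '192.69.89.65' }
)

# Append to both lists in the same iteration so they stay the same length
foreach ($p in $pairs) {
    $entry.RelatedNames.Add($p.Name)
    $entry.RelatedIPs.Add($p.IP)
}

$entry.RelatedNames
$entry.RelatedIPs
```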