How to parse this (inner) XML? - PowerShell

I'm very new to PowerShell, and I'm a bit stuck.
I have this innerXML:
<sl-test.protocol>HTTP</sl-test.protocol>
<sl-test.responseTimeout>14000</sl-test.responseTimeout>
<env>${myenv}</env>
<http.port>8081</http.port>
And I want to convert it into a .properties file in this format:
sl-test.protocol=HTTP
sl-test.responseTimeout=14000
env=${myenv}
http.port=8081
I have the part that creates the .properties file (hardcoded values right now), which works:
$test = New-Item -Name "mule-app.properties" -ItemType "file" -Value "test.prop=testprop`ntest2.prop=test2prop"
So basically I need to go from the innerXML to one big string of key/value pairs separated by `n.
But I also need to escape any $ with a backtick.
Desired string:
sl-test.protocol=HTTP`nsl-test.responseTimeout=14000`nenv=`${myenv}`nhttp.port=8081
But right now I can't even seem to iterate over all the keys and values.
Note: the keys and values will be dynamic; it will not always be those 4.
Any help will be greatly appreciated.

The .ChildNodes property of the nodes in an [xml] (System.Xml.XmlDocument) instance allows you to loop over a given XML element's (System.Xml.XmlElement) child elements.
# Sample XML input.
[xml] $xml = @'
<el>
<sl-test.protocol>HTTP</sl-test.protocol>
<sl-test.responseTimeout>14000</sl-test.responseTimeout>
<env>${myenv}</env>
<http.port>8081</http.port>
</el>
'@
# Loop over all child elements of the document element.
$xml.el.ChildNodes |
  ForEach-Object {
    # Create and output a line for the output file, based on the
    # element's name and inner text, with "$" escaped as "`$".
    '{0}={1}' -f $_.Name, $_.InnerText.Replace('$', '`$')
  } # | Set-Content out.properties -Encoding utf8
Uncomment and adapt the Set-Content call as needed.
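If you'd rather keep the New-Item approach from the question, here's a minimal sketch (assuming the same $xml variable as above). Note that escaping $ as `$ only matters when the text is embedded in a double-quoted string literal; when the value is built programmatically and passed as a variable, no escaping is needed.
# Build one `n-separated string of key=value lines from the child elements.
$content = (
    $xml.el.ChildNodes | ForEach-Object { '{0}={1}' -f $_.Name, $_.InnerText }
) -join "`n"
# Create the .properties file; -Force overwrites an existing file.
$null = New-Item -Name 'mule-app.properties' -ItemType File -Value $content -Force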

Powershell - randomize same string in huge file using all random strings from array

I am looking for a way to randomize a specific string in a huge file by using predefined strings from an array, without having to write a temporary file to disk.
There is a file which contains the same string, e.g. "ABC123456789", in many places:
<Id>ABC123456789</Id><tag1>some data</tag1><Id>ABC123456789</Id><Id>ABC123456789</Id><tag2>some data</tag2><Id>ABC123456789</Id><tag1>some data</tag1><tag3>some data</tag3><Id>ABC123456789</Id><Id>ABC123456789</Id>
I am trying to randomize that "ABC123456789" string using an array or list of defined strings, e.g. @('foo','bar','baz','foo-1','bar-1'). Each ABC123456789 should be replaced by a randomly picked string from the array/list.
I have ended up with the following solution, which is working "fine". But it definitely is not the right approach, as it writes to disk many times - once for each replaced string - and is therefore very slow:
$inputFile = Get-Content 'c:\temp\randomize.xml' -raw
$checkString = Get-Content -Path 'c:\temp\randomize.xml' -Raw | Select-String -Pattern '<Id>ABC123456789'
[regex]$pattern = "<Id>ABC123456789"
while($checkString -ne $null) {
$pattern.replace($inputFile, "<Id>$(Get-Random -InputObject @('foo','bar','baz','foo-1','bar-1'))", 1) | Set-Content 'c:\temp\randomize.xml' -NoNewline
$inputFile = Get-Content 'c:\temp\randomize.xml' -raw
$checkString = Get-Content -Path 'c:\temp\randomize.xml' -Raw | Select-String -Pattern '<Id>ABC123456789'
}
Write-Host All finished
The output is randomized, e.g.:
<Id>foo
<Id>bar
<Id>foo
<Id>foo-1
However, I would like to achieve this kind of output without having to write the file to disk at each step. For thousands of occurrences of the string it takes a lot of time. Any idea how to do it?
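For reference, here is a minimal single-pass, in-memory sketch (assuming the same file path and replacement list as above): [regex]::Replace accepts a script block as the match evaluator, so each occurrence gets its own random pick and the file is written only once. That said, the XML-parser approach shown in the answer below is the more robust route.
$inputFile    = Get-Content 'c:\temp\randomize.xml' -Raw
$replacements = @('foo','bar','baz','foo-1','bar-1')
# The script block runs once per match, so every <Id> gets an independent random value.
$result = [regex]::Replace($inputFile, '<Id>ABC123456789</Id>', {
        param($match)
        '<Id>{0}</Id>' -f (Get-Random -InputObject $replacements)
    })
# One single write back to disk.
Set-Content 'c:\temp\randomize.xml' -Value $result -NoNewline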
=========================
Edit 2023-02-16
I tried the solution from zett42 and it works fine with a simple XML structure. In my case there is a complication which was not important in my text-processing approach.
The root and some other element names in the structure of the processed XML file contain a colon, and there must be some special setting for "-XPath" for this situation. Or maybe the solution is outside of PowerShell's scope.
<?xml version='1.0' encoding='UTF-8'?>
<C23A:SC777a xmlns="urn:C23A:xsd:$SC777a" xmlns:C23A="urn:C23A:xsd:$SC777a" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:C23A:xsd:$SC777a SC777a.xsd">
<C23A:FIToDDD xmlns="urn:iso:std:iso:20022:tech:xsd:pacs.008.001.02">
<CxAAA>
<DxBBB>
<ABC>
<Id>ZZZZZZ999999</Id>
</ABC>
</DxBBB>
<CxxCCC>
<ABC>
<Id>ABC123456789</Id>
</ABC>
</CxxCCC>
</CxAAA>
<CxAAA>
<DxBBB>
<ABC>
<Id>ZZZZZZ999999</Id>
</ABC>
</DxBBB>
<CxxCCC>
<ABC>
<Id>ABC123456789</Id>
</ABC>
</CxxCCC>
</CxAAA>
</C23A:FIToDDD>
<C23A:PmtRtr xmlns="urn:iso:std:iso:20022:tech:xsd:pacs.004.001.02">
<GrpHdr>
<TtREEE Abc="XV">123.45</TtREEE>
<SttlmInf>
<STTm>ABCA</STTm>
<CLss>
<PRta>SIII</PRta>
</CLss>
</SttlmInf>
</GrpHdr>
<TxInf>
<OrgnlTxRef>
<DxBBB>
<ABC>
<Id>YYYYYY888888</Id>
</ABC>
</DxBBB>
<CxxCCC>
<ABC>
<Id>ABC123456789</Id>
</ABC>
</CxxCCC>
</OrgnlTxRef>
</TxInf>
</C23A:PmtRtr>
</C23A:SC777a>
As commented, it is not recommended to process XML like a text file. This is a brittle approach that depends too much on the formatting of the XML. Instead, use a proper XML parser to load the XML and then process its elements in an object-oriented way.
# Use XmlDocument (alias [xml]) to load the XML
$xml = [xml]::new(); $xml.Load(( Convert-Path -LiteralPath input.xml ))
# Define the ID replacements
$searchString = 'ABC123456789'
$replacements = 'foo','bar','baz','foo-1','bar-1'
# Process the text of all ID elements that match the search string, regardless how deeply nested they are.
$xml | Select-Xml -XPath '//Id/text()' | ForEach-Object Node |
Where-Object Value -eq $searchString | ForEach-Object {
# Replace the text of the current element by a randomly chosen string
$_.Value = Get-Random $replacements
}
# Save the modified document to a file
$xml.Save( (New-Item output.xml -Force).Fullname )
$xml | Select-Xml -XPath '//Id/text()' selects the text nodes of all Id elements, regardless of how deeply nested they are in the XML DOM, using the versatile Select-Xml command. The XML nodes are selected by specifying an XPath expression.
Regarding your edit, when you have to deal with XML namespaces, use the -Namespace parameter to specify a namespace prefix to use in the XPath expression for the given namespace URI. In this example I've simply chosen a as the namespace prefix:
$xml | Select-Xml -XPath '//a:Id/text()' -Namespace @{a = 'urn:iso:std:iso:20022:tech:xsd:pacs.008.001.02'}
ForEach-Object Node selects the Node property from each result of Select-Xml. This simplifies the following code.
Where-Object Value -eq $searchString selects the text nodes that match the search string.
Within ForEach-Object, the variable $_ stands for the current text node. Assign to its Value property to change the text.
The Convert-Path and New-Item calls make it possible to use a relative PowerShell path (PSPath) with the .NET XmlDocument class. In general, .NET APIs don't know anything about PowerShell's current directory, so we have to convert the paths before passing them to the .NET API.
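Putting the pieces together for the namespaced document from the edit, a sketch (the prefix names p8 and p4 are arbitrary; the Id elements in that sample sit under two different default namespaces, so both URIs are mapped and queried):
$xml = [xml]::new(); $xml.Load(( Convert-Path -LiteralPath input.xml ))
$searchString = 'ABC123456789'
$replacements = 'foo','bar','baz','foo-1','bar-1'
# Map a prefix to each default namespace the <Id> elements live in.
$ns = @{
    p8 = 'urn:iso:std:iso:20022:tech:xsd:pacs.008.001.02'
    p4 = 'urn:iso:std:iso:20022:tech:xsd:pacs.004.001.02'
}
$xml | Select-Xml -XPath '//p8:Id/text() | //p4:Id/text()' -Namespace $ns |
    ForEach-Object Node |
    Where-Object Value -eq $searchString |
    ForEach-Object { $_.Value = Get-Random $replacements }
$xml.Save( (New-Item output.xml -Force).FullName )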

Powershell: storing variables to a file [duplicate]

I would like to write out a hash table to a file with an array as one of the hash table items. My array item is written out, but it contains files=System.Object[]
Note - Once this works, I will want to reverse the process and read the hash table back in again.
clear-host
$resumeFile="c:\users\paul\resume.log"
$files = Get-ChildItem *.txt
$files.GetType()
write-host
$types="txt"
$in="c:\users\paul"
Remove-Item $resumeFile -ErrorAction SilentlyContinue
$resumeParms=@{}
$resumeParms['types']=$types
$resumeParms['in']=($in)
$resumeParms['files']=($files)
$resumeParms.GetEnumerator() | ForEach-Object {"{0}={1}" -f $_.Name,$_.Value} | Set-Content $resumeFile
write-host "Contents of $resumefile"
get-content $resumeFile
Results
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True Object[] System.Array
Contents of c:\users\paul\resume.log
files=System.Object[]
types=txt
in=c:\users\paul
The immediate fix is to create your own array representation, by enumerating the elements and separating them with ",", enclosing string values in '...':
# Sample input hashtable. [ordered] preserves the entry order.
$resumeParms = [ordered] @{ foo = 42; bar = 'baz'; arr = (Get-ChildItem *.txt) }
$resumeParms.GetEnumerator() |
ForEach-Object {
"{0}={1}" -f $_.Name, (
$_.Value.ForEach({
(("'{0}'" -f ($_ -replace "'", "''")), $_)[$_.GetType().IsPrimitive]
}) -join ','
)
}
Note that this represents all non-primitive .NET types as strings, by their .ToString() representation, which may or may not be good enough.
The above outputs something like:
foo=42
bar='baz'
arr='C:\Users\jdoe\file1.txt','C:\Users\jdoe\file2.txt','C:\Users\jdoe\file3.txt'
See the bottom section for a variation that creates a *.psd1 file that can later be read back into a hashtable instance with Import-PowerShellDataFile.
Alternatives for saving settings / configuration data in text files:
If you don't mind taking on a dependency on a third-party module:
Consider using the PSIni module, which uses the Windows initialization file (*.ini) file format; see this answer for a usage example.
Adding support for initialization files to PowerShell itself (not present as of 7.0) is being proposed in GitHub issue #9035.
Consider using YAML as the file format; e.g., via the FXPSYaml module.
Adding support for YAML files to PowerShell itself (not present as of 7.0) is being proposed in GitHub issue #3607.
The Configuration module provides commands to write to and read from *.psd1 files, based on persisted PowerShell hashtable literals, as you would declare them in source code.
Alternatively, you could modify the output format in the code at the top to produce such files yourself, which allows you to read them back in via
Import-PowerShellDataFile, as shown in the bottom section.
As of PowerShell 7.0 there's no built-in support for writing such a representation; that is, there is no complementary Export-PowerShellDataFile cmdlet.
However, adding this ability is being proposed in GitHub issue #11300.
If creating a (mostly) plain-text file is not a must:
The solution that provides the most flexibility with respect to the data types it supports is the XML-based CLIXML format that Export-Clixml creates, as Lee Dailey suggests, whose output can later be read with Import-Clixml.
However, this format too has limitations with respect to type fidelity, as explained in this answer.
Saving a JSON representation of the data, as Lee also suggests, via ConvertTo-Json / ConvertFrom-Json, is another option, which makes for human-friendlier output than XML, but is still not as friendly as a plain-text representation; notably, all \ chars. in file paths must be escaped as \\ in JSON.
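As a quick illustration of the JSON route, a sketch with simple values (rich objects such as FileInfo serialize with all of their properties, which is usually more than you want, so convert them to strings first):
# Persist a hashtable with simple values as JSON ...
$settings = @{ types = 'txt'; in = 'c:\users\paul'; files = @('a.txt', 'b.txt') }
$settings | ConvertTo-Json | Set-Content resume.json
# ... and read it back later. Note: ConvertFrom-Json returns a [pscustomobject];
# on PowerShell 6+ you can add -AsHashtable to get a hashtable instead.
$restored = Get-Content -Raw resume.json | ConvertFrom-Json
$restored.files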
Writing a *.psd1 file that can be read with Import-PowerShellDataFile
Within the stated constraints regarding data types - in essence, anything that isn't a number or a string becomes a string - it is fairly easy to modify the code at the top to write a PowerShell hashtable-literal representation to a *.psd1 file so that it can be read back in as a [hashtable] instance via Import-PowerShellDataFile:
As noted, if you don't mind installing a module, consider the Configuration module, which has this functionality built in.
# Sample input hashtable.
$resumeParms = [ordered] @{ foo = 42; bar = 'baz'; arr = (Get-ChildItem *.txt) }
# Create a hashtable-literal representation and save it to file settings.psd1
#"
#{
$(
($resumeParms.GetEnumerator() |
ForEach-Object {
" {0}={1}" -f $_.Name, (
$_.Value.ForEach({
(("'{0}'" -f ($_ -replace "'", "''")), $_)[$_.GetType().IsPrimitive]
}) -join ','
)
}
) -join "`n"
)
}
"# > settings.psd1
If you read settings.psd1 with Import-PowerShellDataFile settings.psd1 later, you'll get a [hashtable] instance whose entries you can access as usual and which produces the following display output:
Name Value
---- -----
bar baz
arr {C:\Users\jdoe\file1.txt, C:\Users\jdoe\file2.txt, C:\Users\jdoe\file3.txt}
foo 42
Note how the order of entries (keys) was not preserved, because hashtable entries are inherently unordered.
On writing the *.psd1 file you can preserve the key(-creation) order by declaring the input hashtable (System.Collections.Hashtable) as [ordered], as shown above (which creates a System.Collections.Specialized.OrderedDictionary instance), but the order is, unfortunately, lost on reading the *.psd1 file.
As of PowerShell 7.0, even if you place [ordered] before the opening @{ in the *.psd1 file, Import-PowerShellDataFile quietly ignores it and creates an unordered hashtable nonetheless.
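A quick read-back sketch, assuming the settings.psd1 file created above:
# Import-PowerShellDataFile returns an (unordered) [hashtable].
$settings = Import-PowerShellDataFile settings.psd1
$settings.foo    # -> 42
$settings.arr    # -> array of file-path strings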
This is a problem I deal with all the time and it drives me mad. I really think that there should be a function specifically for this action... so I wrote one.
function ConvertHashTo-CSV
{
Param (
[Parameter(Mandatory=$true)]
$hashtable,
[Parameter(Mandatory=$true)]
$OutputFileLocation
)
$hastableAverage = $NULL #This will only work for hashtables where each entry is consistent. This checks for consistency.
foreach ($hashtabl in $hashtable)
{
$hastableAverage = $hastableAverage + $hashtabl.count #Counts the amount of headings.
}
$Paritycheck = $hastableAverage / $hashtable.count #Gets the average amount of headings
if ( ($parity = $Paritycheck -is [int]) -eq $False) #if the average is not an int the hashtable is not consistent
{
write-host "Error. Hashtable is inconsistent" -ForegroundColor red
Start-Sleep -Seconds 5
return
}
$HashTableHeadings = $hashtable[0].GetEnumerator().name #Get the hashtable headings
$HashTableCount = ($hashtable[0].GetEnumerator().name).count #Count the headings
$HashTableString = $null # String to hold the CSV
foreach ($HashTableHeading in $HashTableHeadings) #Creates the first row containing the column headings
{
$HashTableString += $HashTableHeading
$HashTableString += ", "
}
$HashTableString = $HashTableString -replace ".{2}$" #Removes the trailing ", " added by the loop above
$HashTableString += "`n"
foreach ($hashtabl in $hashtable) #Adds the data
{
for($i=0;$i -lt $HashTableCount;$i++)
{
$HashTableString += $hashtabl[$i]
if ($i -lt ($HashTableCount - 1))
{
$HashTableString += ", "
}
}
$HashTableString += "`n"
}
$HashTableString | Out-File -FilePath $OutputFileLocation #writes the CSV to a file
}
To use this, copy the function into your script, run it, and then:
ConvertHashTo-CSV -hashtable $Hasharray -OutputFileLocation c:\temp\data.CSV
The code is annotated, but here is a brief explanation of what it does: it steps through the array of hashtables and adds them to a string, applying the required formatting to make the string a CSV file, then outputs that to a file.
The main limitation of this is that the hashtables in the array all have to contain the same number of fields. To get around this, if a hashtable has a field that doesn't contain data, ensure it contains at least a space.
More on this can be found here : https://grumpy.tech/powershell-convert-hashtable-to-csv/
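For comparison, a sketch of a built-in route: converting each hashtable to a [pscustomobject] lets Export-Csv take care of the headers and quoting (this, too, assumes every hashtable has the same keys):
$Hasharray = @(
    @{ Name = 'foo'; Size = 1 },
    @{ Name = 'bar'; Size = 2 }
)
# Each hashtable becomes an object whose properties turn into CSV columns.
$Hasharray | ForEach-Object { [pscustomobject] $_ } |
    Export-Csv -Path c:\temp\data.csv -NoTypeInformation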

Using .TXT file as input variables - PowerShell

I have the following script, which I currently have to provide inputs to manually.
However, I would like to provide a text file along with the script and read the inputs from there.
For example, I have the two variables below:
$RedirectURIs = $CustomerWebUIurl
$appURI = $CustomerSamlIssuerID
where I need to pull the following values from a text file:
$CustomerWebUIurl
$CustomerSamlIssuerID
So when I add
$configFile = Get-Content -Path .\config.conf
What else can I use/add to map those two lines in the text file to the two variables I have in the script?
Thanks for the help!
Your config file is in a format that the ConvertFrom-StringData cmdlet can process; it converts lines of =-separated key-value pairs into a hashtable:
# Create a sample config.conf file
@'
$CustomerWebUIurl = http://example.org&1
$CustomerSamlIssuerID = http://example.org&2
'@ > ./config.conf
# Load the key-value pairs from ./config.conf into a hashtable:
$config = ConvertFrom-StringData (Get-Content -Raw ./config.conf)
# Output the resulting hashtable
$config
The above yields:
Name Value
---- -----
$CustomerWebUIurl http://example.org&1
$CustomerSamlIssuerID http://example.org&2
That is, $config now contains a hashtable with entries whose key names are - verbatim - $CustomerWebUIurl and $CustomerSamlIssuerID, which you can access as follows: $config.'$CustomerWebUIurl' and $config.'$CustomerSamlIssuerID'
The need to quote the keys on access is somewhat cumbersome, and the fact that the key names start with $ can be confusing, so I suggest defining your config-file entries without a leading $.
If you have no control over the config file, you can work around the issue as follows:
# Trim the leading '$' from the key names before converting to a hashtable:
$config = ConvertFrom-StringData ((Get-Content -Raw .\config.conf) -replace '(?m)^\$')
Now you can access the entries more conveniently as $config.CustomerWebUIurl and $config.CustomerSamlIssuerID
I just figured out how, using a simple array.
Here is the answer script:
$configFile = Get-Content -Path .\config.conf
$configFile.GetType()
$configFile.Count
$configFile
$CustomerWebUIurl = $configFile[0]
$CustomerWebUIurl
$CustomerSamlIssuerID = $configFile[1]
$CustomerSamlIssuerID

Remove list of phrases if they are present in a text file using Powershell

I'm trying to use a list of phrases (over 100) which I want removed from a text file (products.txt) that has lines of text inside it (tab-separated / one per line), so that the lines which do not match the list of phrases are re-written to the current file.
#cd .\Desktop\
$productlist = @(
'example',
'juicebox',
'telephone',
'keyboard',
'manymore')
foreach ($product in $productlist) {
get-childitem products.txt | Select-String -Pattern $product -NotMatch | foreach {$_.line} | Out-File -FilePath .\products.txt
}
The above code does not remove the words listed in $productlist; it simply outputs all lines in products.txt again.
The lines inside of products.txt file are these:
productcatalog
product1example
juicebox038
telephoneiphone
telephoneandroid
randomitem
logitech
coffeetable
razer
Thank you for your help.
Here's my solution. You need the parentheses, otherwise the input file will still be in use when you try to write to it. Select-String accepts an array of patterns. I wish I could pipe 'Path' to Set-Content but it doesn't work.
$productlist = 'example', 'juicebox', 'telephone', 'keyboard', 'manymore'
(Select-String $productlist products.txt -NotMatch) | % line |
set-content products.txt
here's one way to do what you want. it's somewhat more direct than what you used. [grin] it uses the way that PoSh can act on an entire collection when it is on the LEFT side of an operator.
what it does ...
fakes reading in a text file
when ready to do this in real life, replace the whole #region/#endregion block with a call to Get-Content.
builds the exclude list
converts that into a regex OR pattern
filters out the items that match the unwanted list
shows that resulting list
the code ...
#region >>> fake reading in a text file
# when ready to do this for real, replace the whole "#region/#endregion" block with a call to Get-Content
$ProductList = @'
productcatalog
product1example
juicebox038
telephoneiphone
telephoneandroid
randomitem
logitech
coffeetable
razer
'@ -split [System.Environment]::NewLine
#endregion >>> fake reading in a text file
$ExcludedProductList = @(
'example'
'juicebox'
'telephone'
'keyboard'
'manymore'
)
$EPL_Regex = $ExcludedProductList -join '|'
$RemainingProductList = $ProductList -notmatch $EPL_Regex
$RemainingProductList
output ...
productcatalog
randomitem
logitech
coffeetable
razer
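One caveat with the -join '|' approach: if any excluded phrase contains regex metacharacters (., +, (, and so on), escape the entries first so they are matched literally, for example:
# Escape each phrase before building the OR pattern.
$EPL_Regex = ($ExcludedProductList | ForEach-Object { [regex]::Escape($_) }) -join '|'
$RemainingProductList = $ProductList -notmatch $EPL_Regex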

Select String From Text File and Create variable

I have a text file containing a string I need to make a variable. I need the value for "file" to be retained as a variable. How can I capture this and make it a variable: "\\APPSRV\I\Run\OPTI\CLIENT\20171031\25490175\Data\brtctybv\"? This data will change per file, but it will retain the same format: it will start with \\ and end with \.
Example Text File
order_id = 25490175-brtctybv
file = \\APPSRV\I\Run\OPTI\CLIENT\20171031\25490175\Data\brtctybv\
copies = 1
volume = 20171031-brtctybv
label = \\domain.com\prodmaster\jobs\OPTI\CLIENT\Cdlab\somefile.file
merge = \\APPSRV\I\Run\OPTI\CLIENT\20171031\25490175\mrg\25490175-brtctybv.MRG
FIXATE = NOAPPEND
$file = ((Get-Content -path file.txt) | Select-String -pattern "^file\s*=\s*(\\\\.*\\)").matches.groups[1].value
$file
See Regex Demo to see the regex in action. The .matches.groups[1].value is grabbing the value of capture group 1. The capture group is created by the () within the pattern. See Select-String for more information about the cmdlet.
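For comparison, a sketch of the same extraction using the -match operator and the automatic $Matches hashtable (it only looks at the first matching line):
# Filter to the line(s) that start with 'file =' ...
$fileLine = @(Get-Content -Path file.txt) -match '^file\s*='
# ... then capture the value; -match populates the automatic $Matches hashtable.
if ($fileLine[0] -match '^file\s*=\s*(\\\\.*\\)') {
    $file = $Matches[1]
}
$file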
Regexes are powerful, but complex; sometimes there are conceptually simpler alternatives:
PS> ((Get-Content -Raw file.txt).Replace('\', '\\') | ConvertFrom-StringData).file
\\APPSRV\I\Run\OPTI\CLIENT\20171031\25490175\Data\brtctybv\
The ConvertFrom-StringData cmdlet is built for parsing key-value pairs separated by =
\ in the values is interpreted as an escape character, however, hence the doubling of \ in the input file with .Replace('\', '\\').
The result is a hash table (type [hashtable]); Get-Content -Raw - the input file read as a single string - is used to ensure that a single hash table is output; accessing its file key retrieves the associated value.
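And a short sketch of pulling several of the values into variables at once with the ConvertFrom-StringData approach (assuming the file layout shown above; every key in the file becomes a hashtable entry):
$settings = (Get-Content -Raw file.txt).Replace('\', '\\') | ConvertFrom-StringData
$file   = $settings.file      # \\APPSRV\...\Data\brtctybv\
$copies = $settings.copies    # '1' (note: values come back as strings)
$merge  = $settings.merge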