Grabbing specific sections of a txt file via Powershell - powershell

I am new to Powershell scripting, but I feel I am overlooking a simple answer, hopefully some of you can help.
My company exports files from all of our computers with a section around the middle of Mapped Network Printers. It looks like this:
-------------------------------------------------------------------------
Mapped Network Printers:
NetworkAddress\HP425DN [DEFAULT PRINTER]
-------------------------------------------------------------------------
Local Printers:
What I have been asked to do is copy just the Mapped Network Printers to a new text file.
I tried using Select-String with a context parameter, but I have no way of knowing how many network printers there are, so I can't guess.
I also tried using the following code which I found on this site, but it returns nothing:
$MapPrint = gc C:\Users\User1\Documents\Config.txt
$from = ($MapPrint | Select-String -pattern "Mapped Network Printers:" |
Select-Object LineNumber).LineNumber
$to = ($MapPrint | Select-String -pattern "-------------------------------
--------------------------------------------" | Select-Object
LineNumber).LineNumber
$i = 0
$array = @()
foreach ($line in $MapPrint)
{
foreach-object { $i++ }
if (($i -gt $from) -and ($i -lt $to))
{
$array += $line
}
}
$array
I basically want to start the search at "Mapped Network Printers" and end it at the next row of "------"
Any help would be greatly appreciated.

Select-String has no feature for extracting a range of lines based on content.
The simplest approach is to read the file as a whole and use the -replace operator to extract the range via a regular expression (regex):
$file = 'C:\Users\User1\Documents\Config.txt'
$regex = '(?sm).*^Mapped Network Printers:\r?\n(.*?)\r?\n---------------------.*'
(Get-Content -Raw $file) -replace $regex, '$1'
Reading an input file as a whole can be problematic with files too large to fit into memory, but that's probably not a concern for you.
On the plus side, this approach is much faster than processing the lines in a loop.
Get-Content -Raw (PSv3+) reads the input file as a whole.
Inline regex options (?sm) turn on both the multi-line and the single-line option:
m means that ^ and $ match the start and end of each line rather than the input string as a whole.
s means that metacharacter . matches \n characters too, so that an expression such as .* can be used to match across lines.
\r?\n matches a single line break, both the CRLF and the LF variety.
(.*?) is the capture group that (non-greedily) captures everything between the bounding lines.
Note that the regex matches the entire input string, and then replaces it with just the substring (range) of interest, captured in the 1st (and only) capture group ($1).
Assuming that $file contains:
-------------------------------------------------------------------------
Mapped Network Printers:
NetworkAddress\HP425DN [DEFAULT PRINTER]
NetworkAddress\HP426DN
-------------------------------------------------------------------------
Local Printers:
the above yields:
NetworkAddress\HP425DN [DEFAULT PRINTER]
NetworkAddress\HP426DN
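If the goal is to write the extracted lines to a new file, a minimal sketch along the same lines could use [regex]::Match instead of -replace. When nothing matches, -replace returns the input unchanged, whereas a failed Match can be detected and skipped. The inlined sample text and the output path are assumptions for illustration only.

```powershell
# Sample input inlined for illustration; in practice use: $text = Get-Content -Raw $file
$text = @'
-------------------------------------------------------------------------
Mapped Network Printers:
NetworkAddress\HP425DN [DEFAULT PRINTER]
NetworkAddress\HP426DN
-------------------------------------------------------------------------
Local Printers:
'@

# Capture everything between the heading line and the next line that starts with dashes.
$m = [regex]::Match($text, '(?sm)^Mapped Network Printers:\r?\n(.*?)\r?\n-{5,}')

# Unlike -replace, a failed match yields nothing rather than the unmodified input.
$printers = if ($m.Success) { $m.Groups[1].Value -split '\r?\n' } else { @() }

# Hypothetical output path; adjust as needed.
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'MappedPrinters.txt'
$printers | Set-Content $outFile
```

Splitting the captured block on line breaks gives one array element per printer, which is convenient if the list needs further filtering before it is saved.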

You could use Select-String or Where-Object to look for words with a \. Taking that even further you could look for just the server\printer values with a RegEx like this:
Get-Content C:\Users\User1\Documents\Config.txt -Raw |
Select-String '[A-Z0-9]+\\[A-Z0-9]+' -AllMatches |
ForEach-Object {$_.Matches.Value}
Note that this makes the assumption the Server Names and Printers use only A-Z and 0-9, you may need to look for more characters if that is not a valid assumption.
Here is an example of using Where-Object to keep only the lines that contain a \:
Get-Content 'C:\Users\User1\Documents\Config.txt' | Where-Object {$_ -like '*\*'}

$Doc= "C:\temp\test.txt"
$Doc_end ="C:\temp\testfiltered.txt"
$reader = [System.IO.File]::OpenText($Doc)
$cdata=""
while($null -ne ($line = $reader.ReadLine()))
{
if ($line -like ('---*') ) {$Read = 0 }
if ($Read -eq 1) {$cdata+= $line + "`r`n"}
if ($line -like ('Mapped Network Printers:*')) {$Read = 1}
}
$cdata | Out-File $Doc_end -Force

You can do what you are attempting with the foreach-object command and a few additional test conditions. Simply setting a flag when you encounter the Mapped Network Printers: line and then terminating output on the next line -like "---*" will work, e.g.
## positional parameters
param(
[Parameter(Mandatory=$true)][string]$infile
)
$beginprn = 0
get-content $infile | foreach-object {
# terminate condition
# (break is not scoped to ForEach-Object: with no enclosing loop it ends the whole script)
if ([int]$beginprn -eq 1 -and $_ -like "---*") {
break
}
# output Mapped printers
if ([int]$beginprn -eq 1) {
write-host $_
}
# begin condition
if ($_ -eq "Mapped Network Printers:") {
$beginprn = 1
}
}
Example Input File
-------------------------------------------------------------------------
Mapped Network Printers:
NetworkAddress\HP425DN [DEFAULT PRINTER]
NetworkAddress\HP4100N
-------------------------------------------------------------------------
Local Printers:
Example Use/Output
PS> parseprn.ps1 .\tmp\prnfile.txt
NetworkAddress\HP425DN [DEFAULT PRINTER]
NetworkAddress\HP4100N

Related

Remove additional commas in CSV file using Powershell

I have a CSV file that I'd like to import into SQL, but it isn't properly formatted. I am not able to format the generated file (Excel file), so I'm looking to do this with the CSV file using PowerShell. I want to remove the extra commas and also replace the empty department field (,,,,,,) with the correct department, as seen in the example below. Thank you in advance.
Example:
Current Format:
Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled
,,,,,,Jeff,,,,Dane,,,,,,,2019,Enrolled
,,,,,,Tate,,,,Anderson,,,,,,,2019,Not Enrolled
,,,,,,Daphne,,,,Miller,,,,,,,2021,Enrolled
,,,,,,Cora,,,,Dame,,,,,,,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Dora,,,,Explorer,,,,,,,2022,Not Enrolled
,,,,,,Peppa,,,,Diggs,,,,,,,2020,Enrolled
,,,,,,Conrad,,,,Strat,,,,,,,2020,Enrolled
,,,,,,Kat,Noir,,,,2019,,,,,,,Enrolled
,,,,,,Lance,,,,Bug,2018,,,,,,,Enrolled
Ideal format:
Department,First Name,Last Name,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
Psychology,Jane,Doe,2022,Enrolled
Psychology,Jeff,Dane,2019,Enrolled
Psychology,Tate,Anderson,2019,Not Enrolled
Psychology,Daphne,Miller,2021,Enrolled
Psychology,Cora,Dame,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
Computer Science,Dora,Explorer,2022,Not Enrolled
Computer Science,Peppa,Diggs,2020,Enrolled
Computer Science,Conrad,Strat,2020,Enrolled
Computer Science,Kat,Noir,2019,Enrolled
Computer Science,Lance,Bug,2018,Enrolled
here you go:
$csvArray = new-object System.Collections.Generic.List[string]
#Import the file
$text = (gc "C:\tmp\testdata.txt") -replace ",{2,}",","
$arrayEnd = $text.count -1
$text[1..$arrayEnd] | %{
If ($_ -notmatch "^(,)"){
$department = $_ -replace ","
}
Else {
$csvArray.add($department + $_)
}
}
$csvArray.Insert(0,$text[0])
$csvArray | set-content 'C:\tmp\my.csv'
Using the Csv cmdlets:
$Csv = @'
Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled
,,,,,,Jeff,,,,Dane,,,,,,,2019,Enrolled
,,,,,,Tate,,,,Anderson,,,,,,,2019,Not Enrolled
,,,,,,Daphne,,,,Miller,,,,,,,2021,Enrolled
,,,,,,Cora,,,,Dame,,,,,,,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Dora,,,,Explorer,,,,,,,2022,Not Enrolled
,,,,,,Peppa,,,,Diggs,,,,,,,2020,Enrolled
,,,,,,Conrad,,,,Strat,,,,,,,2020,Enrolled
,,,,,,Kat,Noir,,,,2019,,,,,,,Enrolled
,,,,,,Lance,,,,Bug,2018,,,,,,,Enrolled
'@
$List = ConvertFrom-Csv $Csv -Header @(1..20) # |Import-Csv .\Your.Csv -Header @(1..20)
$Columns = $List[0].PSObject.Properties.Where{ $_.Value -and $_.Value -ne 'Department' }.Name
$List |Select-Object -Property $Columns |Where-Object { $_.$($Columns[0]) } |
ConvertTo-Csv -UseQuote Never |Select-Object -Skip 1 # |Set-Content -Encoding utf8 out.csv
First Name,Last Name,School Year,Enrolment Status
Jane,Doe,2022,Enrolled
Jeff,Dane,2019,Enrolled
Tate,Anderson,2019,Not Enrolled
Daphne,Miller,2021,Enrolled
Cora,Dame,2022,Enrolled
Dora,Explorer,2022,Not Enrolled
Peppa,Diggs,2020,Enrolled
Conrad,Strat,2020,Enrolled
Kat,,,Enrolled
Lance,Bug,,Enrolled
Use a switch statement:
& {
$first = $true
switch -Wildcard -File in.csv { # Loop over all lines in file in.csv
',*' { # intra-department line
# Prepend the department name, eliminate empty fields and output.
$dept + (($_ -split ',' -ne '') -join ',')
}
default {
if ($first) { # header line
# Eliminate empty fields and output.
($_ -split ',' -ne '') -join ','
$first = $false
}
else { # department-only line
$dept = ($_ -split ',')[0] # save department name
}
}
}
} | Set-Content -Encoding utf8 out.csv
Note:
$_ -split ',' splits each line into fields by ,, and -ne '' filters out empty fields from the resulting array; applying -join ',' rejoins the nonempty fields with ,, which in effect removes multiple adjacent , and thereby eliminates empty fields.
If you don't mind the complexity of a regex, you can perform the above more simply with a single -replace operation, as shown in Toni's helpful answer.
Using switch -File is an efficient way to read files line by line and perform conditional processing based on sophisticated matching (as an alternative to -Wildcard you can use -Regex for regex matching, and you can even use script blocks ({ ... }) as conditionals).
As a language statement, switch cannot be used directly in a pipeline.
This limitation can be overcome by enclosing it in a script block ({ ... }) invoked with &, which enables the usual, memory-friendly streaming behavior in the pipeline; that is, the lines are processed one by one, as are the modified output lines relayed to Set-Content, so that the input file needn't be read into memory as a whole.
In your case, plain-text processing of your CSV file enabled a simple solution, but in general it is better to parse CSV files into objects whose properties you can work with, using the Import-Csv cmdlet, and, for later re-exporting to a CSV file, Export-Csv.
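As a rough sketch of that object-based route: the code below assumes the column positions seen in the question's regular rows (first name in field 7, last name in field 11, school year in field 18, status in field 19) and uses hypothetical positional column names C1..C23. The two irregular rows at the end of the question's sample (Kat, Lance) do not follow that layout and would need separate handling.

```powershell
# Inline sample for illustration; with a real file, pipe Get-Content in.csv instead of $lines.
$lines = @(
    'Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status'
    'Psychology ,,,,,,,,,,,,,,,,,,,,,,,'
    ',,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled'
    'Computer Science ,,,,,,,,,,,,,,,,,,,,,,,'
    ',,,,,,Dora,,,,Explorer,,,,,,,2022,Not Enrolled'
)

# Hypothetical positional column names so that every field is addressable.
$header = 1..23 | ForEach-Object { "C$_" }

$dept = $null
$rows = $lines | Select-Object -Skip 1 | ConvertFrom-Csv -Header $header | ForEach-Object {
    if ($_.C1) {
        # Department-only row: remember the name, emit nothing.
        $dept = $_.C1.Trim()
    }
    else {
        # Data row: build a clean object with only the meaningful columns.
        [pscustomobject]@{
            'Department'       = $dept
            'First Name'       = $_.C7
            'Last Name'        = $_.C11
            'School Year'      = $_.C18
            'Enrolment Status' = $_.C19
        }
    }
}

# Render as CSV text; Export-Csv would write the same data to a file.
$rows | ConvertTo-Csv -NoTypeInformation
```

Working with objects means the department name becomes an ordinary property, so later steps such as filtering or sorting no longer depend on comma positions.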

Powershell Get-Content specific content inside text

I receive a text file with multiple lists, as shown below (edit: more accurate example dataset included)
# SYSTEM X
# SINGULAR
192.168.1.3
# SUB-SYSTEM V
192.168.1.4
192.168.1.5
192.168.1.6
# SYSTEM Y
# MANDATORY
192.168.1.7
192.168.1.8
192.168.1.9
192.168.1.7
192.168.1.8
192.168.1.9
Each "SYSTEM comment" means a new set starts after it.
I want to read each block of content separately so each set should be assigned to an object discarding the embedded comments. I just need the IPs.
Something like:
$ipX = get-content -path [file.txt] [set X]
$ipY = get-content -path [file.txt] [set Y]
$ipZ = get-content -path [file.txt] [set Z]
But I'm not sure how to actually assign these sets separately.
Help please.
Here's one possible solution. The result will be a hashtable, with each key holding an array of IPs for that set:
$result = @{}
get-content file.txt | foreach {
if ($_ -match "#\s*SET\s+(\w+)") {
$result[($key = $matches.1)] = @()
}
elseif ($_ -notlike "#*") {
$result[$key] += $_
}
}
Contents of $result:
Name Value
---- -----
Y {[ip], [ip], [more ips]}
Z {[ip], [ip], [more ips]}
X {[ip], [ip], [more ips]}
Here's another approach. We will take advantage of Foreach-Object's -End block to output the final set as a [PSCustomObject].
Get-Content $file | Foreach-Object {
if($_ -match 'SET (.+?)'){
if($ht){[PSCustomObject]$ht}
$ht = [ordered]@{Set = $Matches.1}
}
if($_ -match '^[^#]'){
$ht["IPs"] += $_
}
} -End {if($ht){[PSCustomObject]$ht}}
Output
Set IPs
--- ---
X [ip][ip][more ips]
Y [ip][ip][more ips]
Z [ip][ip][more ips]
If you want to also ensure $ht is empty to start with you could use the -Begin block.
Get-Content $file | Foreach-Object -Begin{$ht=$null}{
if($_ -match 'SET (.+?)'){
if($ht){[PSCustomObject]$ht}
$ht = [ordered]@{Set = $Matches.1}
}
if($_ -match '^[^#]'){
$ht["IPs"] += $_
}
} -End {if($ht){[PSCustomObject]$ht}}
You can use Select-String to extract a specific section of text:
# Update $section to be the set you want to target
$section = 'Set Y'
Get-Content a.txt -Raw |
Select-String -Pattern "# $section.*\r?\n(?s)(.*?)(?=\r?\n# Set|$)" |
Foreach-Object {$_.Matches.Groups[1].Value}
Using Get-Content with -Raw reads in the file as a single string making multi-line matching easier. With PowerShell 7, Select-String includes a -Raw switch making this process a bit simpler.
This outputs capture group 1 results, which match the (.*?). If you want to capture between comments rather than between Set <something> and Set <something>, you can edit the -Pattern value at the end to only be # rather than # Set.
Regex Breakdown:
# matches the characters # literally
$section is replaced by your variable's value, which is matched literally provided it contains no regex metacharacters
.* matches any character (except for line terminators)
\r matches a carriage return
? Quantifier: matches between zero and one times, as many times as possible, giving back as needed (greedy)
\n matches a line-feed (newline) character
(?s) modifier: single line. Dot matches newline characters
1st Capturing Group (.*?)
.*? matches any characters lazily
Positive Lookahead (?=\r?\n# Set)
\r? matches a carriage return zero or one time
\n matches a line-feed (newline) character
# Set matches the characters # Set literally
$ matches the end of the string
If I understand the question with the new example correctly, you want to parse the file and create separate variables, each holding an array of IP addresses.
If that is the case, you could do:
# loop through the file line-by-line
$obj = $null
$result = @(
switch -Regex -File 'D:\Test\thefile.txt' {
'#\sSYSTEM\s(\w+)' {
# start a new object, output the earlier object if available
if ($obj) { $obj }
$obj = [PsCustomObject]@{ 'System' = $Matches[1]; 'Ip' = @() }
}
'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}' {
# looks like an IPv4 address. Add it to the Ip property array of the object
$obj.Ip += $_
}
default {}
}
# output the last object, which the switch never emits on its own
if ($obj) { $obj }
)
Now you have an array of objects in $result:
System Ip
------ --
Y {192.168.1.7, 192.168.1.8, 192.168.1.9, 192.168.1.7...}
X {192.168.1.3, 192.168.1.4, 192.168.1.5, 192.168.1.6}
To make separate variables of that is as easy as:
$ipX = ($result | Where-Object { $_.System -eq 'X' }).Ip
$ipY = ($result | Where-Object { $_.System -eq 'Y' }).Ip
$ipZ = ($result | Where-Object { $_.System -eq 'Z' }).Ip
Your example has duplicate IP addresses. If you don't want these do
$ipX = ($result | Where-Object { $_.System -eq 'X' }).Ip | Select-Object -Unique (same for the others)

Splitting in Powershell

I want to be able to split some text out of a txtfile:
For example:
Brackets#Release 1.11.6#Path-to-Brackets
Atom#v1.4#Path-to-Atom
I just want to have the "Release 1.11.6" part. I am doing a where-object starts with Brackets but I don't know the full syntax. Here is my code:
"Get-Content -Path thisfile.txt | Where-Object{$_ < IM STUCK HERE > !
You could do this:
((Get-Content thisfile.txt | Where-Object { $_ -match '^Brackets' }) -Split '#')[1]
This uses the -match operator to filter out any lines that don't start with Brackets (the ^ special regex character indicates that what follows must be at the beginning of the line). Then it uses the -Split operator to split those lines on # and then it uses the array index [1] to get the second element of the split (arrays start at 0).
Note that this assumes the text you want is always the second element of the split. If the split on # returns fewer than two elements, the index [1] yields $null (or throws an error under Set-StrictMode -Version 3 or later).
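A minimal sketch of a more defensive variant, which checks the number of fields before indexing; the sample lines are inlined here as an assumption, in place of reading thisfile.txt:

```powershell
# Sample lines inlined for illustration; with a real file use: Get-Content thisfile.txt
$lines = 'Brackets#Release 1.11.6#Path-to-Brackets', 'Atom#v1.4#Path-to-Atom'

$release = $lines | Where-Object { $_ -match '^Brackets#' } | ForEach-Object {
    $fields = $_ -split '#'
    # Only emit the second field when the line really has one.
    if ($fields.Count -ge 2) { $fields[1] }
}
$release
```

Lines that start with Brackets but lack a second field simply produce no output, so $release stays empty instead of silently holding $null entries.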
$bracketsRelease = Get-Content -path thisfile.txt | foreach-object {
if ( $_ -match 'Brackets#(Release [^#]+)#' )
{
$Matches[1]
}
}
or
(select-string -Path file.txt -Pattern 'Brackets#(Release [^#]+)#').Matches[0].Groups[1].value

PowerShell read text file line by line and find missing file in folders

I am a novice looking for some assistance. I have a text file containing two columns of data. One column is the Vendor and one is the Invoice.
I need to scan that text file, line by line, and see if there is a match on Vendor and Invoice in a path. In the path, $Location, the first wildcard is the Vendor number and the second wildcard is the Invoice
I want the non-matches output to a text file.
$Location = "I:\\Vendors\*\Invoices\*"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output ="I:\\Vendors\Missing\Missing.txt"
foreach ($line in Get-Content $txt) {
if (-not($line -match $location)){$line}
}
set-content $Output -value $Line
Sample Data from txt or csv file.
kvendnum wapinvoice
000953 90269211
000953 90238674
001072 11012016
002317 448668
002419 06123711
002419 06137343
002419 06134382
002419 759208
002419 753087
002419 753069
002419 762614
003138 N6009348
003138 N6009552
003138 N6009569
003138 N6009612
003182 770016
003182 768995
003182 06133429
In above data the only match is on the second line: 000953 90238674
and the 6th line: 002419 06137343
Untested, but here's how I'd approach it:
$Location = "I:\\Vendors\\.+\\Invoices\\.+"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output ="I:\\Vendors\Missing\Missing.txt"
select-string -path $txt -pattern $Location -notMatch |
set-content $Output
There's no need to pick through the file line-by-line; PowerShell can do this for you using select-string. The -notMatch parameter simply inverts the search and sends through any lines that don't match the pattern.
select-string sends out a stream of matchinfo objects that contain the lines that met the search conditions. These objects actually contain far more information than just the matching line, but fortunately PowerShell is smart enough to know how to send the relevant item through to set-content.
Regular expressions can be tricky to get right, but are worth getting your head around if you're going to do tasks like this.
EDIT
$Location = "I:\Vendors\{0}\Invoices\{1}.pdf"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output = "I:\Vendors\Missing\Missing.txt"
get-content -path $txt |
% {
# extract fields from the line
$lineItems = $_ -split " "
# construct path based on fields from the line
$testPath = $Location -f $lineItems[0], $lineItems[1]
# for debugging purposes
write-host ( "Line:'{0}' Path:'{1}'" -f $_, $testPath )
# test for existence of the path; ignore errors
if ( -not ( get-item -path $testPath -ErrorAction SilentlyContinue ) ) {
# path does not exist, so write the line to pipeline
write-output $_
}
} |
Set-Content -Path $Output
I guess we will have to pick through the file line-by-line after all. If there is a more idiomatic way to do this, it eludes me.
Code above assumes a consistent format in the input file, and uses -split to break the line into an array.
EDIT - version 3
$Location = "I:\Vendors\{0}\Invoices\{1}.pdf"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output = "I:\Vendors\Missing\Missing.txt"
get-content -path $txt |
select-string "(\S+)\s+(\S+)" |
%{
# pull vendor and invoice numbers from matchinfo
$vendor = $_.matches[0].groups[1]
$invoice = $_.matches[0].groups[2]
# construct path
$testPath = $Location -f $vendor, $invoice
# for debugging purposes
write-host ( "Line:'{0}' Path:'{1}'" -f $_.line, $testPath )
# test for existence of the path; ignore errors
if ( -not ( get-item -path $testPath -ErrorAction SilentlyContinue ) ) {
# path does not exist, so write the line to pipeline
write-output $_
}
} |
Set-Content -Path $Output
It seemed that the -split " " behaved differently in a running script to how it behaves on the command line. Weird. Anyway, this version uses a regular expression to parse the input line. I tested it against the example data in the original post and it seemed to work.
The regex is broken down as follows
( Start the first matching group
\S+ Greedily match one or more non-white-space characters
) End the first matching group
\s+ Greedily match one or more white-space characters
( Start the second matching group
\S+ Greedily match one or more non-white-space characters
) End the second matching group

Powershell - reading ahead and While

I have a text file in the following format:
.....
ENTRY,PartNumber1,,,
FIELD,IntCode,123456
...
FIELD,MFRPartNumber,ABC123,,,
...
FIELD,XPARTNUMBER,ABC123
...
FIELD,InternalPartNumber,3214567
...
ENTRY,PartNumber2,,,
...
...
the ... indicates there is other data between these fields. The ONLY thing I can be certain of is that the field starting with ENTRY is a new set of records. The rows starting with FIELD can be in any order, and not all of them may be present in each group of data.
I need to:
Read in a chunk of data.
Search for any field matching the string ABC123.
If ABC123 is found, search for the existence of the InternalPartNumber field and return that row of data.
I have not seen a way to use Get-Content that can read in a variable number of rows as a set & be able to search it.
Here is the code I currently have, which will read a file, searching for a string & replacing it with another. I hope this can be modified to be used in this case.
$ftype = "*.txt"
$fnames = gci -Path $filefolder1 -Filter $ftype -Recurse|% {$_.FullName}
$mfgPartlist = Import-Csv -Path "C:\test\mfrPartList.csv"
foreach ($file in $fnames) {
$contents = Get-Content -Path $file
foreach ($partnbr in $mfgPartlist) {
$oldString = $mfgPartlist.OldValue
$newString = $mfgPartlist.NewValue
if (Select-String -Path $file -SimpleMatch $oldString -Debug -Quiet) {
$stringData = $contents -imatch $oldString
$stringData = $stringData -replace "[\n\r]","|"
foreach ($dataline in $stringData) {
$file +"|"+$stringData+"|"+$oldString+"|"+$newString|Out-File "C:\test\Datachanges.txt" -Width 2000 -Append
}
$contents = $contents -replace $oldString $newString
Set-Content -Path $file -Value $contents
}
}
}
Is there a way to read & search a text file in "chunks" using Powershell? Or to do a Read-ahead & determine what to search?
Assuming your file isn't too big to read into memory all at once:
$Text = Get-Content testfile.txt -Raw
($Text -split '(?ms)^(?=ENTRY)') |
foreach {
if ($_ -match '(?ms)^FIELD\S+ABC123')
{$_ -replace '(?ms).+(^Field\S+InternalPartNumber.+?$).+','$1'}
}
FIELD,InternalPartNumber,3214567
That reads the entire file in as a single multiline string, and then splits it at the beginning of any line that starts with 'ENTRY'. Then it tests each segment for a FIELD line that contains 'ABC123', and if it does, removes everything except the FIELD line for the InternalPartNumber.
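To see the chunking on its own, here is a small sketch that uses the same zero-width split but then examines each chunk line by line instead of with -replace. The sample data is inlined and the elided lines from the question are left out; both are assumptions for illustration.

```powershell
# Inline sample; in practice use: $Text = Get-Content testfile.txt -Raw
$Text = @'
ENTRY,PartNumber1,,,
FIELD,IntCode,123456
FIELD,MFRPartNumber,ABC123,,,
FIELD,InternalPartNumber,3214567
ENTRY,PartNumber2,,,
FIELD,IntCode,654321
'@

# Zero-width split before every line that starts with ENTRY, so each chunk keeps its ENTRY line.
# The split produces an empty leading element, which Where-Object drops.
$chunks = $Text -split '(?m)^(?=ENTRY)' | Where-Object { $_.Trim() }

$hits = foreach ($chunk in $chunks) {
    $rows = $chunk -split '\r?\n'
    # -match applied to an array returns the matching elements.
    if ($rows -match '^FIELD,.*\bABC123\b') {
        $rows -match '^FIELD,InternalPartNumber,'
    }
}
$hits
```

Because each chunk is turned back into an array of rows, a chunk that contains ABC123 but no InternalPartNumber row simply contributes nothing to $hits.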
This is not my best work as I have just got back from vacation. You could use a while loop reading the text and set an entry flag to gobble up the text in chunks. However if your files are not too big then you could just read up the text file at once and use regex to split up the chunks and then process accordingly.
$pattern = "ABC123"
$matchedRowToReturn = "InternalPartNumber"
$fileData = Get-Content "d:\temp\test.txt" | Where-Object{$_ -match '^(entry|field)'} | Out-String
$parts = $fileData | Select-String '(?smi)(^Entry).*?(?=^Entry|\Z)' -AllMatches | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
$parts | Where-Object{$_ -match $pattern} | Select-String "$matchedRowToReturn.*$" | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
What this will do is read in the text file, drop any lines that are not entry or field related, join the rest into one long string, and split that into chunks that start with lines beginning with the word "Entry".
Then we drop those "parts" that do not contain the $pattern. Of the remaining that match extract the InternalPartNumber line and present.