Use of parentheses in PowerShell scripting? - powershell

Can anyone explain to me the difference between the following two statements?
gc -ReadCount 2 .\input.txt| % {"##" + $_}
(gc -ReadCount 2 .\input.txt)| % {"##" + $_}
I am using below file as input for above commands.
input.txt
1
2
Output
gc -ReadCount 2 .\input.txt| % {"##" + $_}
##1 2
(gc -ReadCount 2 .\input.txt)| % {"##" + $_}
##1
##2
If the input file contains more than 2 records, both give the same output.
I can modify my code to achieve what I want, but I am just wondering why these two give different outputs.
I googled for the information but didn't find an answer.
Edit 1
Isn't the output of command 2 wrong? When I specify "-ReadCount 2" it should pipe two lines at a time. That means the foreach loop should iterate only once (as the input contains only 2 lines) with $_[0]=1, $_[1]=2, so that when I print "##"+$_ it should print "##1 2" as command 1 did.

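gc -ReadCount 2 .\input.txt| % {"##" + $_}
With -ReadCount 2, each object sent down the pipeline is an [Array] of up to two lines. Concatenating "##" with that array joins its elements with a space, so the foreach loop runs once and prints the whole two-line file after "##".
(gc -ReadCount 2 .\input.txt)| % {"##" + $_}
The parentheses collect the output first. Here that output is a single two-element [Array], and piping it enumerates its elements, so each line is evaluated on its own and "##" is added in front of each one (the foreach loop runs twice). With more than two lines, the collected output is an array of chunks, so piping it sends the chunks themselves, just as in the first command.
The -ReadCount parameter splits the data into arrays of lines, each sent as one chunk, and is mostly used for performance. For a seven-line file containing 1 through 7, -ReadCount 3 will show: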
##1 2 3
##4 5 6
##7
and -ReadCount 4 will show:
##1 2 3 4
##5 6 7
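To see the difference for yourself, here is a minimal sketch that recreates the two-line input in a temporary file (the path and variable names are made up for this illustration) and runs both forms of the command:

```powershell
# Hypothetical temp file standing in for input.txt from the question.
$path = Join-Path ([IO.Path]::GetTempPath()) "readcount-demo-$PID.txt"
Set-Content -LiteralPath $path -Value '1', '2'

# Without parentheses: the single two-line chunk reaches ForEach-Object as one array.
$direct = Get-Content -ReadCount 2 -LiteralPath $path |
    ForEach-Object { '##' + $_ }

# With parentheses: the lone chunk is collected first, then enumerated line by line.
$wrapped = (Get-Content -ReadCount 2 -LiteralPath $path) |
    ForEach-Object { '##' + $_ }

"Direct : $(@($direct).Count) item(s)"
"Wrapped: $(@($wrapped).Count) item(s)"

Remove-Item -LiteralPath $path
```

Repeating the experiment with four or more lines in the file should make both counts agree again, because the parenthesized output is then an array of chunks rather than a single chunk.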

For everyone to see, here is the output of Get-Help Get-Content -Parameter ReadCount:
-ReadCount <Int64>
Specifies how many lines of content are sent through the pipeline at a time.
The default value is 1. A value of 0 (zero) sends all of the content at one time.
This is what breaks the lines into groups (I assumed it would instead limit the number of lines read in the file).
Still no clue about the less-than-three-lines behavior, though.

Related

Editing a specific column of data in a text file with powershell

So I’ve had a request to edit a csv file by replacing column values with a set of unique numbers. Below is a sample of the original input file with a header line followed by a couple of rows. Note that the rows have NO column headers.
HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|IE|USD|20200605|EUR200717||
DD|GZFD|IE|USD|20200605|EUR200717||
What I’m looking to do is change, say, the values in column 3 to a unique number.
So far I have the following …
$i=0
$txtin = Get-Content "C:\Temp\InFile.txt" | ForEach {"$($_.split('|'))"-replace $_[2],$i++} |Out-File C:\Temp\csvout.txt
… but this isn’t working as it removes the delimiter and adds numbers in the wrong places …
HH0###0000000SLH30400110000100000002000000202006060202006050011100
1D1D1 1G1Z1F1D1 1I1E1 1U1S1D1 12101210101610151 1E1U1R1210101711171 1 1
2D2D2 2G2Z2F2D2 2I2E2 2U2S2D2 22202220202620252 2E2U2R2220202721272 2 2
Ideally I want it to look like this, whereby the values of 'IE' have been replaced by '01' and '02' in each row ...
HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|01|USD|20200605|EUR200717||
DD|GZFD|02|USD|20200605|EUR200717||
Any ideas on how to resolve would be much appreciated.
I think spreading this out into multiline code makes it easier:
$txtin = Get-Content 'C:\Temp\InFile.txt'
# loop through the lines, skipping the first line
for ($i = 1; $i -lt $txtin.Count; $i++) {
    $parts = $txtin[$i].Split('|')      # or use regex -split '\|'
    if ($parts.Count -ge 3) {
        $parts[2] = '{0:00}' -f $i      # update the 3rd column
        $txtin[$i] = $parts -join '|'   # rejoin the parts with '|'
    }
}
$txtin | Out-File -FilePath 'C:\Temp\csvout.txt'
Output will be:
HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|01|USD|20200605|EUR200717||
DD|GZFD|02|USD|20200605|EUR200717||
Updated to use the more robust check suggested by mklement0. This avoids errors when a line has fewer than three parts after the split.
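If the same replacement is needed for other columns or delimiters, the loop can be wrapped in a small function. The sketch below is only an illustration: the function name Set-ColumnCounter and its parameters are invented here, and it numbers only the lines that actually contain the target column, so a header line without delimiters passes through unchanged.

```powershell
# Hypothetical helper: replaces one delimited column with a running two-digit counter.
function Set-ColumnCounter {
    param(
        [string[]] $Lines,
        [int] $ColumnIndex = 2,      # zero-based index of the column to overwrite
        [string] $Delimiter = '|'    # assumed to be a single character
    )
    $counter = 0
    foreach ($line in $Lines) {
        $parts = $line.Split($Delimiter)
        if ($parts.Count -gt $ColumnIndex) {
            $counter++
            $parts[$ColumnIndex] = '{0:00}' -f $counter
            $parts -join $Delimiter   # emit the rebuilt line
        }
        else {
            $line                     # too few columns: emit unchanged
        }
    }
}

$sampleLines = @(
    'HH ### SLH304 01100001 2 20200606 20200605 011100',
    'DD|GZFD|IE|USD|20200605|EUR200717||',
    'DD|GZFD|IE|USD|20200605|EUR200717||'
)
$updated = Set-ColumnCounter -Lines $sampleLines
$updated
```

To work on the real file, the sample array could be replaced with the result of Get-Content and the output piped to Out-File, as in the answer above.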

PowerShell to add the odd numbers in the odd lines and the even numbers in the even lines

I need PowerShell to add up the odd numbers in the odd lines and the even numbers in the even lines; the result for each line should go to standard output. Each line contains at least 2 numbers. The filename is given by a parameter.
For example, for this input:
1 2 3
4 5 6
5 6 7
7 8 9
4 6 0
the output should be
4
10
12
8
0
You can read the file content with Get-Content (alias cat or gc). It already returns one string per line, so no extra split on newlines is needed:
$lines = @(Get-Content $filepath)
You can then iterate over the lines, split each one on ' ', and sum up the values whose parity matches that of the line number:
for ($i = 0; $i -lt $lines.Length; $i++) {
    $numsInLine = $lines[$i].Split(' ')
    $lineSum = 0
    for ($j = 0; $j -lt $numsInLine.Length; $j++) {
        # $i is a zero-based index, so $i + 1 is the line number;
        # $numsInLine[$j] is a number in that line
        if ($numsInLine[$j] % 2 -eq ($i + 1) % 2) {
            $lineSum += [int]$numsInLine[$j]
        }
    }
    Write-Host $lineSum
}
Edit: In response to Lee_Dailey's comment, % in this context is the modulo operator. x % 2 returns the parity of x, which is exactly what you need here.
Also note that % in PowerShell is an alias for ForEach-Object.
% or ForEach-Object in PowerShell is a simple for-each loop: it iterates over all values in the array with a simpler syntax, though without an index.
In our case I would recommend using it only in place of the second for loop, as we need the index variable in the first (note that $i is in use). In that case, the second loop would look like:
$numsInLine | % {   # or $numsInLine | ForEach-Object
    if ($_ % 2 -eq ($i + 1) % 2) {   # $_ refers to the current pipeline item
        $lineSum += [int]$_
    }
}
Note two things here:
1. % or ForEach-Object is always used after a pipeline, as it must receive an InputObject, which is normally something that can be iterated over. That's how to tell % as modulo from % as foreach.
2. $_ is used here as the current iterated item. Our iterated item does not have a named variable referencing it. $_ is commonly used in pipelines, and with ForEach-Object in particular.
Another logically equivalent syntax is:
foreach ($num in $numsInLine) {
    if ($num % 2 -eq ($i + 1) % 2) {
        $lineSum += [int]$num
    }
}
Note that here no pipeline or $_ is needed.
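Putting the pieces together, a self-contained sketch might look like the following. The function name Get-ParityLineSum is made up for this example, and it assumes every token on a line is an integer separated by whitespace.

```powershell
# Hypothetical helper: for each line, sum the numbers whose parity matches
# the parity of the (one-based) line number.
function Get-ParityLineSum {
    param([string[]] $Lines)

    for ($i = 0; $i -lt $Lines.Count; $i++) {
        $lineParity = ($i + 1) % 2          # 1 for odd lines, 0 for even lines
        $sum = 0
        foreach ($token in ($Lines[$i] -split '\s+')) {
            if ($token -eq '') { continue } # ignore leading/trailing blanks
            $num = [int]$token
            if ([math]::Abs($num % 2) -eq $lineParity) {
                $sum += $num
            }
        }
        $sum                                # emit one result per line
    }
}

$sample = '1 2 3', '4 5 6', '5 6 7', '7 8 9', '4 6 0'
$sums = Get-ParityLineSum -Lines $sample
$sums
```

With a file name passed in as a parameter, the call would become something like Get-ParityLineSum -Lines (Get-Content $filepath).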

Use Get content or Import-CSV to read 1st column in 2nd line in a csv

So I have a csv file which is 25MB.
I only need to get the value stored in 2nd line in first column and use it later in powershell script.
e.g data
File_name,INVNUM,ID,XXX....850 columns
ABCD,123,090,xxxx.....850 columns
ABCD,120,091,xxxx.....850 columns
xxxxxx5000+ rows
So my first column data is always the same and I just need to get this filename from the first column, 2nd row.
Should I try to use Get-Content or Import-Csv for this use case?
Thanks,
Mickey
TessellatingHeckler's helpful answer contains a pragmatic, easy-to-understand solution that is most likely fast enough in practice; the same goes for Robert Cotterman's helpful answer which is concise (and also faster).
If performance is really paramount, you can try the following, which uses the .NET framework directly to read the lines - but given that you only need to read 2 lines, it's probably not worth it:
$inputFile = "$PWD/some.csv" # be sure to specify a *full* path
$isFirstLine = $true
$fname = foreach ($line in [IO.File]::ReadLines($inputFile)) {
    if ($isFirstLine) { $isFirstLine = $false; continue } # skip header line
    $line -replace '^([^,]*),.*', '$1' # extract 1st field from 2nd line and exit
    break # exit
}
Note: A conceptually simpler way to extract the 1st field is to use ($line -split ',')[0], but with a large number of columns the above -replace-based approach is measurably faster.
Update: TessellatingHeckler offers 2 ways to speed up the above:
Use of $line.Substring(0, $line.IndexOf(',')) in lieu of $line -replace '^([^,]*),.*', '$1' in order to avoid relatively costly regex processing.
To lesser gain, use of a [System.IO.StreamReader] instance's .ReadLine() method twice in a row rather than [IO.File]::ReadLines() in a loop.
Here's a performance comparison of the approaches across all answers on this page (as of this writing).
To run it yourself, you must download functions New-CsvSampleData and Time-Command first.
For more representative results, the timings are averaged across 1,000 runs:
# Create sample CSV file 'test.csv' with 850 columns and 100 rows.
$testFileName = "test-$PID.csv"
New-CsvSampleData -Columns 850 -Count 100 | Set-Content $testFileName
# Compare the execution speed of the various approaches:
Time-Command -Count 1000 {
    # Import-Csv
    Import-Csv -LiteralPath $testFileName |
        Select-Object -Skip 1 -First 1 -ExpandProperty 'col1'
}, {
    # ReadLines(), -replace
    $inputFile = $PWD.ProviderPath + "/$testFileName"
    $isFirstLine = $true
    foreach ($line in [IO.File]::ReadLines($inputFile)) {
        if ($isFirstLine) { $isFirstLine = $false; continue } # skip header line
        $line -replace '^([^,]*),.*', '$1' # extract 1st field from 2nd line and exit
        break # exit
    }
}, {
    # ReadLines(), .Substring / IndexOf
    $inputFile = $PWD.ProviderPath + "/$testFileName"
    $isFirstLine = $true
    foreach ($line in [IO.File]::ReadLines($inputFile)) {
        if ($isFirstLine) { $isFirstLine = $false; continue } # skip header line
        $line.Substring(0, $line.IndexOf(',')) # extract 1st field from 2nd line and exit
        break # exit
    }
}, {
    # ReadLine() x 2, .Substring / IndexOf
    $inputFile = $PWD.ProviderPath + "/$testFileName"
    $f = [System.IO.StreamReader]::new($inputFile, $true)
    $null = $f.ReadLine(); $line = $f.ReadLine()
    $line.Substring(0, $line.IndexOf(','))
    $f.Close()
}, {
    # Get-Content -Head, .Split()
    ((Get-Content $testFileName -Head 2)[1]).split(',')[1]
} |
    Format-Table Factor, Timespan, Command
Remove-Item $testFileName
Sample output from a single-core Windows 10 VM running Windows PowerShell v5.1 / PowerShell Core 6.1.0-preview.4 on a recent-model MacBook Pro:
Windows PowerShell v5.1:
Factor TimeSpan Command
------ -------- -------
1.00 00:00:00.0001922 # ReadLine() x 2, .Substring / IndexOf...
1.04 00:00:00.0002004 # ReadLines(), .Substring / IndexOf...
1.57 00:00:00.0003024 # ReadLines(), -replace...
3.25 00:00:00.0006245 # Get-Content -Head, .Split()...
25.83 00:00:00.0049661 # Import-Csv...
PowerShell Core 6.1.0-preview.4:
Factor TimeSpan Command
------ -------- -------
1.00 00:00:00.0001858 # ReadLine() x 2, .Substring / IndexOf...
1.03 00:00:00.0001911 # ReadLines(), .Substring / IndexOf...
1.60 00:00:00.0002977 # ReadLines(), -replace...
3.30 00:00:00.0006132 # Get-Content -Head, .Split()...
27.54 00:00:00.0051174 # Import-Csv...
Conclusions:
Calling .ReadLine() twice is marginally faster than the ::ReadLines() loop.
Using -replace instead of Substring() / IndexOf() adds about 60% execution time.
Using Get-Content is more than 3 times slower.
Using Import-Csv | Select-Object is close to 30 times(!) slower, presumably due to the large number of columns; that said, in absolute terms we're still only talking about around 5 milliseconds.
As a side note: execution on macOS seems to be noticeably slower overall, with the regex solution and the cmdlet calls also being slower in relative terms.
Depends what you want to prioritize.
$data = Import-Csv -LiteralPath 'c:\temp\data.csv' |
Select-Object -Skip 1 -First 1 -ExpandProperty 'File_Name'
This is short and convenient. (Does "2nd line" mean the 2nd line of the file or the 2nd line of the data? Don't skip any if it's the first line of data.)
Select-Object with something like -First 1 will stop the whole pipeline when it's done, so it won't wait to read the rest of the 25MB in the background before returning.
You could likely speed it up, or reduce memory use, by a minuscule amount if you opened the file, sought past two newlines, then a comma, then read to the next comma, or wrote some other long, detailed code, but I very much doubt it would be worth it.
The same goes for Get-Content: the way it adds NoteProperties to the output strings means it's likely no easier on memory and not usefully faster than Import-Csv.
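The early-exit behaviour of Select-Object -First can be observed without any file at all. The following is a small sketch with invented variable names: it counts how many items an upstream ForEach-Object stage processes before the pipeline is stopped.

```powershell
# Count how many items the upstream stage handles before -First 1 ends the pipeline.
$processed = 0
$firstItem = 1..100000 |
    ForEach-Object { $processed++; $_ } |
    Select-Object -First 1

"First item: $firstItem"
"Items processed upstream: $processed"
```

On PowerShell 3.0 or later the processed count should stay far below 100000, which is the same mechanism that lets Import-Csv stop after the first requested row instead of parsing the whole file.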
You could really shorten it with
(gc c:\file.txt -head 2)[1]
This reads only 2 lines and then grabs index 1 (the second line).
You could then split it and grab index 1 of the split line. That is the second field; use [0] instead if you want the first column.
((gc c:\file.txt -head 2)[1]).split(',')[1]
Update: After seeing the new post with timings, I was inspired to run some tests myself (thanks, mklement0). This was the fastest I could get to work:
$check = 0
foreach ($i in [IO.File]::ReadLines("$filePath")) {
    if ($check -eq 2) { break }
    if ($check -eq 1) { $value = $i.split(',')[1] } # $value = your answer
    $check++
}
Just thought of this: remove the -eq 2 check and put break after a semicolon once the -eq 1 branch has run. That should be about 5 ticks faster; I haven't tested it.
here were my results over 40000 tests:
GC split avg was 1.11307622 Milliseconds
GC split Min was 0.3076 Milliseconds
GC split Max was 18.1514 Milliseconds
ReadLines split avg was 0.3836625825 Milliseconds
ReadLines split Min was 0.2309 Milliseconds
ReadLines split Max was 31.7407 Milliseconds
Stream Reader avg was 0.4464924825 Milliseconds
Stream Reader MIN was 0.2703 Milliseconds
Stream Reader Max was 31.4991 Milliseconds
Import-CSV avg was 1.32440485 Milliseconds
Import-CSV MIN was 0.2875 Milliseconds
Import-CSV Max was 103.1694 Milliseconds
I was able to run about 3000 tests a second with the 2nd and 3rd approaches, and about 1000 tests a second with the first and last. StreamReader was the fastest one in mklement0's tests. Import-Csv wasn't bad; I wonder whether mklement0 didn't have a column named "file_name" in his test CSV? Anyhow, I'd personally use the GC command because it's concise and easy to remember. But this is up to you, and I wish you luck on your scripting adventures.
I'm certain we could start multithreading this and get insane results, but when you're talking thousandths of a second, is it really a big deal? Especially to get one variable? :D
Here's the StreamReader code I used, for transparency:
$inputFile = "$filePath"
$f = [System.IO.StreamReader]::new($inputFile,$true);
$null = $f.ReadLine(); $line = $f.ReadLine()
$line.Substring(0, $line.IndexOf(','))
$f.Close()
I also noticed that this pulls the 1st value of the second line, and I wasn't sure how to switch it to the 2nd value. Substring(start, length) takes a start position and a length; here the length is the position of the first comma. If you change the start from 0 to, say, 5, it still takes that many characters but begins at the 6th character.
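One way to get the 2nd value with the same Substring / IndexOf technique is to locate the second comma as well. This is a small sketch on a made-up sample line, not code from any of the answers:

```powershell
# Sample data line modelled on the question; the variable names are illustrative.
$line = 'ABCD,123,090,xxxx'

$firstComma  = $line.IndexOf(',')                    # position of the 1st comma
$secondComma = $line.IndexOf(',', $firstComma + 1)   # search resumes after the 1st comma

# Substring(start, length): the 1st field runs from 0 up to the 1st comma.
$field1 = $line.Substring(0, $firstComma)

# The 2nd field starts just after the 1st comma and ends before the 2nd comma.
$field2 = $line.Substring($firstComma + 1, $secondComma - $firstComma - 1)

"Field 1: $field1"
"Field 2: $field2"
```

If a line might contain fewer than two commas, IndexOf returns -1 and the Substring call would throw, so a real script would want to check for that first.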
The import-csv I used was :
$data = Import-Csv -LiteralPath "$filePath" |
Select-Object -Skip 1 -First 1 -ExpandProperty 'FileName'
I tested these on a 90 MB CSV with 21 columns and 284k rows; "FileName" was the second column.

Compare Text Files. Record Line #s that Match

I have two text files.
$File1 = "C:\Content1.txt"
$File2 = "C:\Content2.txt"
I'd like to compare these to see if they have the same number of lines and then I'd like to record the line number of each line that matches. I realize that sounds ridiculous but this is what I've been asked to do at my work.
I can compare them a lot of ways. I decided to do the following:
$File1Lines = Get-Content $File1 | Measure-Object -Line
$File2Lines = Get-Content $File2 | Measure-Object -Line
I'd like to test it with an if statement so that if they don't match, then I can start an earlier process over again.
if ($file1lines.lines -eq $file2lines.lines)
{ Get the Line #s that match and proceed to the next step}
else {Start Over}
I'm unsure how to record the line #s that match. Any thoughts on how to do this?
This is really pretty simple since Get-Content reads the file in as an array of strings, and you can index that array simply enough.
Do {
    <stuff to generate files>
} While (($File1 = GC $PathToFile1).Count -ne ($File2 = GC $PathToFile2).Count)
$MatchingLineNumbers = 0..($File1.Count - 1) | Where { $File1[$_] -eq $File2[$_] }
Since arrays in PowerShell use a 0 based index we want to start at 0 and go for however many lines the files have. Since .count starts at 1 not 0 we need to subtract 1 from the total count. So if your file has 27 lines $File1.count will equal 27. The index for those lines ranges from 0 (first line) to 26 (last line). The code ($File1.count - 1) would effectively come out to 26, so 0..26 starts at 0, and counts to 26.
Then each number goes to a Where statement that checks that specific line in each file to see if they are equal. If they are then it passes the number along, and that gets collected in $MatchingLineNumbers. If the lines don't match the number isn't passed along.
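The index-based comparison can be tried on in-memory arrays before pointing it at real files. In this sketch the two arrays stand in for the Get-Content results, and 1 is added to each index on the assumption that human-readable, one-based line numbers are wanted:

```powershell
# Stand-ins for (Get-Content $File1) and (Get-Content $File2); contents are invented.
$File1 = 'alpha', 'beta', 'gamma', 'delta'
$File2 = 'alpha', 'BETA?', 'gamma', 'omega'

$MatchingLineNumbers = @()
if ($File1.Count -eq $File2.Count) {
    $MatchingLineNumbers = @(
        0..($File1.Count - 1) |
            Where-Object { $File1[$_] -eq $File2[$_] } |  # same text at the same position
            ForEach-Object { $_ + 1 }                     # zero-based index -> line number
    )
}

"Matching line numbers: $($MatchingLineNumbers -join ', ')"
```

Keep in mind that -eq compares strings case-insensitively by default; -ceq could be used instead if case matters.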
You'll need to get an intersection first, then find the index.
file1.txt
Line1
Line2
Line3
Line11
Line21
Line31
Line12
Line22
Line32
file2.txt
Line1
Line11
Line21
Line31
Line12
Line222
Line323
Line214
Line315
Line12
Line22
Line32
test.ps1
$file1 = Get-Content file1.txt
$file2 = Get-Content file2.txt
$matchingLines = $file1 | ? { $file2 -contains $_ }
# Emit the indexes to the pipeline (not via Write-Host) so they are captured in the variables
$file1Lines = $matchingLines | % { [array]::IndexOf($file1, $_) }
$file2Lines = $matchingLines | % { [array]::IndexOf($file2, $_) }
Output
$file1Lines
0
3
4
5
6
7
8
$file2Lines
0
1
2
3
4
10
11

Extract table data from file using Select-String

Here is my sample text file, I'm trying to parse out the results
2017-08-26 22:31:10,769 - Recv: T:150.01 /150.00 B:59.77 /60.00 #:23 B#:127
2017-08-26 22:31:12,559 - Recv: echo:busy: processing
2017-08-26 22:31:12,768 - Recv: T:150.04 /150.00 B:59.93 /60.00 #:22 B#:127
2017-08-26 22:31:13,660 - Recv: Bilinear Leveling Grid:
2017-08-26 22:31:13,665 - Recv: 0 1 2 3 4
2017-08-26 22:31:13,669 - Recv: 0 +0.203 +0.105 -0.020 -0.182 -0.275
2017-08-26 22:31:13,672 - Recv: 1 +0.192 +0.100 -0.028 -0.192 -0.310
2017-08-26 22:31:13,675 - Recv: 2 +0.138 +0.018 -0.090 -0.257 -0.340
2017-08-26 22:31:13,678 - Recv: 3 +0.117 +0.018 -0.087 -0.247 -0.362
2017-08-26 22:31:13,681 - Recv: 4 +0.105 -0.020 -0.122 -0.285 -0.385
I need to search and split the content so it looks like this
0 1 2 3 4
0 +0.203 +0.105 -0.020 -0.182 -0.275
1 +0.192 +0.100 -0.028 -0.192 -0.310
2 +0.138 +0.018 -0.090 -0.257 -0.340
3 +0.117 +0.018 -0.087 -0.247 -0.362
4 +0.105 -0.020 -0.122 -0.285 -0.385
Here is my attempt
Get-Content \\192.168.1.41\octoprint\logs\serial.log | Select-String "Bilinear Leveling Grid:" -Context 0,6 -SimpleMatch | %{$_.Line.Split("Recv: ")}
And here is my output
2017-08-26
22
31
13,660
-
Bilin
ar
L
ling
Grid
Any ideas?
How about just using the -replace operator
Regex is certainly useful here, but Select-String would not have been my tool of choice. Using the -replace operator we can get what you want in two operations.
# Read in the file as one string.
$textData = Get-Content "\\192.168.1.41\octoprint\logs\serial.log" | Out-String
# Remove everything up to and including the line with Bilinear Leveling Grid:
# and strip the prefix from the remaining lines by deleting everything up to and including Recv: .
$textData -replace "(?s)\A.*?Bilinear Leveling Grid:\s+" -replace "(?m)^.*?Recv: "
The first pattern uses the single-line modifier (?s) so that . also matches newlines, which lets .*? swallow everything from the start of the text (\A) through the title line. The second uses the multiline modifier (?m) so that the anchor ^ matches the start of each line and not just the start of the text itself. This makes it easier to remove globs of text.
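The two-step -replace idea can be checked against a few lines of the sample log held in a here-string. This sketch uses `(?s)\A` in the first pattern, so that everything before the table title is dropped, and `(?m)^` in the second; the variable names are invented for the illustration.

```powershell
# A shortened copy of the sample log from the question.
$textData = @'
2017-08-26 22:31:12,768 - Recv: T:150.04 /150.00 B:59.93 /60.00 #:22 B#:127
2017-08-26 22:31:13,660 - Recv: Bilinear Leveling Grid:
2017-08-26 22:31:13,665 - Recv: 0 1 2 3 4
2017-08-26 22:31:13,669 - Recv: 0 +0.203 +0.105 -0.020 -0.182 -0.275
'@

# Step 1: drop everything through the title line (and the line break after it).
# Step 2: strip the timestamp/"Recv: " prefix from each remaining line.
$table = $textData -replace '(?s)\A.*?Bilinear Leveling Grid:\s+' -replace '(?m)^.*?Recv: '

# Split into rows for further processing; tolerate both LF and CRLF line endings.
$rows = $table -split '\r?\n'
$rows
```

Note that `\s+` also swallows leading whitespace on the first table row, which may matter if the header row of the real log is indented.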
Select-String
While I feel my approach is a little more robust, I wanted to show how you could have gotten the results you wanted with Select-String. Note that Select-String needs the file line by line here, not as the single string read above:
$result = Get-Content "\\192.168.1.41\octoprint\logs\serial.log" | Select-String "Bilinear Leveling Grid:" -Context 0,6 -SimpleMatch
$result.Context.PostContext -replace "^.*?Recv: "
So if you knew it would always be 6 lines, that would work as well. Again, in this example we use the -replace operator to remove the timestamps etc. from the remaining lines.
Why your approach was not working
When you use context the resulting MatchInfo object stores it in a property called Context like you saw in my above solution. From the docs about Select-String and -Context
This parameter does not change the number of objects generated by Select-String. Select-String generates one MatchInfo (Microsoft.PowerShell.Commands.MatchInfo) object for each match. The context is stored as an array of strings in the Context property of the object.
You were not referring to it, which is part of the reason your results are the way they are. So we get the context after the match, and that gets you the 6 lines following the "table title".
Your other issue is that you were using Split() on the one line you matched. In Windows PowerShell, Split() carves the line on any of the characters you pass it and not on the whole string. Consider "abcde".Split("bd"), which will make a 3-element array there.
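The difference between splitting on a set of characters and splitting on a whole string can be shown side by side. This sketch passes an explicit [char[]] so that the character-wise behaviour is the same regardless of PowerShell version, and uses the -split operator for the whole-string case:

```powershell
# Character-wise split: every 'b' and every 'd' is a separator.
$byChars = 'abcde'.Split([char[]]'bd')

# Whole-string split with the -split operator (the pattern is a regex;
# 'Recv: ' contains no special characters). The 2 limits the result to two parts.
$line = '2017-08-26 22:31:13,665 - Recv: 0 1 2 3 4'
$payload = ($line -split 'Recv: ', 2)[1]

"Split on characters: $($byChars -join ' | ')"
"Text after 'Recv: ': $payload"
```

This is why the attempt in the question broke the line at every R, e, c, v, colon, and space instead of at the literal text "Recv: ".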
try this:
$file="\\192.168.1.41\octoprint\logs\serial.log"
$delimiterrow="Bilinear Leveling Grid:"
Get-Content $file | Select-String $delimiterrow -Context 0,6 |%{ $_.Context.PostContext | %{ ($_ -split 'Recv: ', 2)[1]}}
#short version
gc $file | sls $delimiterrow -Co 0,6 |%{ $_.Context.PostContext | %{ ($_ -split 'Recv: ', 2)[1]}}