Compare Text Files. Record Line #s that Match - powershell

I have two text files.
$File1 = "C:\Content1.txt"
$File2 = "C:\Content2.txt"
I'd like to compare these to see if they have the same number of lines and then I'd like to record the line number of each line that matches. I realize that sounds ridiculous but this is what I've been asked to do at my work.
I can compare them a lot of ways. I decided to do the following:
$File1Lines = Get-Content $File1 | Measure-Object -Line
$File2Lines = Get-Content $File2 | Measure-Object -Line
I'd like to test it with an if statement so that if they don't match, then I can start an earlier process over again.
if ($file1lines.lines -eq $file2lines.lines)
{ Get the Line #s that match and proceed to the next step}
else {Start Over}
I'm unsure how to record the line #s that match. Any thoughts on how to do this?

This is really pretty simple since Get-Content reads the file in as an array of strings, and you can index that array simply enough.
Do{
<stuff to generate files>
}While(($File1 = GC $PathToFile1).Count -ne ($File2 = GC $PathToFile2).count)
$MatchingLineNumbers = 0..($File1.count -1) | Where{$File1[$_] -eq $File2[$_]}
Since arrays in PowerShell use a 0 based index we want to start at 0 and go for however many lines the files have. Since .count starts at 1 not 0 we need to subtract 1 from the total count. So if your file has 27 lines $File1.count will equal 27. The index for those lines ranges from 0 (first line) to 26 (last line). The code ($File1.count - 1) would effectively come out to 26, so 0..26 starts at 0, and counts to 26.
Then each number goes to a Where statement that checks that specific line in each file to see if they are equal. If they are then it passes the number along, and that gets collected in $MatchingLineNumbers. If the lines don't match the number isn't passed along.
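As a minimal sketch of the same idea, the snippet below runs the comparison on two small in-memory arrays instead of real files. The sample data and the variable names `$MatchingIndexes` and `$MatchingLineNumbers` are assumptions made for illustration; the last step converts the 0-based indexes into the 1-based line numbers a text editor would show.

```powershell
# Hypothetical stand-ins for the lines of the two files
$File1 = 'alpha', 'beta', 'gamma', 'delta'
$File2 = 'alpha', 'bravo', 'gamma', 'omega'

# 0-based indexes at which both files hold the same text
$MatchingIndexes = @(0..($File1.Count - 1) | Where-Object { $File1[$_] -eq $File2[$_] })

# Convert the indexes to 1-based line numbers
$MatchingLineNumbers = @($MatchingIndexes | ForEach-Object { $_ + 1 })
```

Note that -eq compares strings case-insensitively; -ceq can be used instead if case matters.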

You'll need to get an intersection first, then find the index.
file1.txt
Line1
Line2
Line3
Line11
Line21
Line31
Line12
Line22
Line32
file2.txt
Line1
Line11
Line21
Line31
Line12
Line222
Line323
Line214
Line315
Line12
Line22
Line32
test.ps1
$file1 = Get-Content file1.txt
$file2 = Get-Content file2.txt
$matchingLines = $file1 | ? { $file2 -contains $_ }
$file1Lines = $matchingLines | % { [array]::IndexOf($file1, $_) }
$file2Lines = $matchingLines | % { [array]::IndexOf($file2, $_) }
Output
$file1Lines
0
3
4
5
6
7
8
$file2Lines
0
1
2
3
4
10
11
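One thing to keep in mind with this approach is that [array]::IndexOf returns only the first position of a line, so a line that occurs more than once (such as Line12 in file2.txt) is reported once. If every position matters, a lookup table is one possible alternative. The sketch below is an assumption-laden illustration, not part of the answer above: the sample arrays, `$positions`, and `$report` are all hypothetical names.

```powershell
# Hypothetical sample data; 'Line12' appears twice in $file2
$file1 = 'Line1', 'Line12', 'Line3'
$file2 = 'Line1', 'Line11', 'Line12', 'Line9', 'Line12'

# Map each distinct line of $file2 to every index where it occurs
$positions = @{}
for ($i = 0; $i -lt $file2.Count; $i++) {
    if (-not $positions.ContainsKey($file2[$i])) {
        $positions[$file2[$i]] = [System.Collections.Generic.List[int]]::new()
    }
    $positions[$file2[$i]].Add($i)
}

# For each line of $file1 that also exists in $file2, record both positions
$report = @(for ($i = 0; $i -lt $file1.Count; $i++) {
    if ($positions.ContainsKey($file1[$i])) {
        [pscustomobject]@{
            Text         = $file1[$i]
            File1Index   = $i
            File2Indexes = $positions[$file1[$i]] -join ','
        }
    }
})
```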

Related

Windows PowerShell: How to parse the log file?

I have an input file with below contents:
27/08/2020 02:47:37.365 (-0516) hostname12 ult_licesrv ULT 5 LiceSrv Main[108 00000 Session 'session1' (from 'vmpms1\app1#pmc21app20.pm.com') request for 1 additional licenses for module 'SA-XT' - 1 licenses have been allocated by concurrent usage category 'Unlimited' (session module usage now 1, session category usage now 1, total module concurrent usage now 1, total category usage now 1)
27/08/2020 02:47:37.600 (-0516) hostname13 ult_licesrv ULT 5 LiceSrv Main[108 00000 Session 'sssion2' (from 'vmpms2\app1#pmc21app20.pm.com') request for 1 additional licenses for module 'SA-XT-Read' - 1 licenses have been allocated by concurrent usage category 'Floating' (session module usage now 2, session category usage now 2, total module concurrent usage now 1, total category usage now 1)
27/08/2020 02:47:37.115 (-0516) hostname141 ult_licesrv CMN 5 Logging Housekee 00000 Deleting old log file 'C:\Program Files\PMCOM Global\License Server\diag_ult_licesrv_20200824_011130.log.gz' as it exceeds the purge threashold of 72 hours
27/08/2020 02:47:37.115 (-0516) hostname141 ult_licesrv CMN 5 Logging Housekee 00000 Deleting old log file 'C:\Program Files\PMCOM Global\License Server\diag_ult_licesrv_20200824_021310.log.gz' as it exceeds the purge threashold of 72 hours
27/08/2020 02:47:37.625 (-0516) hostname150 ult_licesrv ULT 5 LiceSrv Main[108 00000 Session 'session1' (from 'vmpms1\app1#pmc21app20.pm.com') request for 1 additional licenses for module 'SA-XT' - 1 licenses have been allocated by concurrent usage category 'Unlimited' (session module usage now 2, session category usage now 1, total module concurrent usage now 2, total category usage now 1)
I need to generate an output file like below:
Date,time,hostname,session_module_usage,session_category_usage,module_concurrent_usage,total_category_usage
27/08/2020,02:47:37.365 (-0516),hostname12,1,1,1,1
27/08/2020,02:47:37.600 (-0516),hostname13,2,2,1,1
27/08/2020,02:47:37.115 (-0516),hostname141,0,0,0,0
27/08/2020,02:47:37.115 (-0516),hostname141,0,0,0,0
27/08/2020,02:47:37.625 (-0516),hostname150,2,1,2,1
The output data order is: Date,time,hostname,session_module_usage,session_category_usage,module_concurrent_usage,total_category_usage.
Put 0,0,0,0 if no entry for session_module_usage,session_category_usage,module_concurrent_usage,total_category_usage
I need to get content from the input file and write the output to another file.
Update
I have created a file input.txt in F drive and pasted the log details into it.
Then I form an array by splitting the file content when a new line occurs like below.
$myList = (Get-Content -Path F:\input.txt) -split '\n'
Now I got 5 items in my array myList. Then I replace the multiple blank spaces with a single blank space and formed a new array by splitting each element by blank space. Then I print the 0 to 3 array elements. Now I need to add the end values (session_module_usage,session_category_usage,module_concurrent_usage,total_category_usage).
PS C:\Users\user> $myList = (Get-Content -Path F:\input.txt) -split '\n'
PS C:\Users\user> $myList.Length
5
PS C:\Users\user> for ($i = 0; $i -le ($myList.length - 1); $i += 1) {
>> $newList = ($myList[$i] -replace '\s+', ' ') -split ' '
>> $newList[0]+','+$newList[1]+' '+$newList[2]+','+$newList[3]
>> }
27/08/2020,02:47:37.365 (-0516),hostname12
27/08/2020,02:47:37.600 (-0516),hostname13
27/08/2020,02:47:37.115 (-0516),hostname141
27/08/2020,02:47:37.115 (-0516),hostname141
27/08/2020,02:47:37.625 (-0516),hostname150
If you really need to filter on the granularity that you're looking for, then you may need to use regex to filter the lines.
This would assume that the rows have similarly labeled lines before the values you're looking for, so keep that in mind.
[System.Collections.ArrayList]$filteredRows = @()
$log = Get-Content -Path C:\logfile.log
foreach ($row in $log) {
    # Date, time (with UTC offset), and host name from the start of the row
    $date = ([regex]::Match($row,'^\d+\/\d+\/\d+')).value
    $time = ([regex]::Match($row,'\d+:\d+:\d+\.\d+\s\(\S+\)')).value
    $hostname = ([regex]::Match($row,'(?<=\d\d\d\d\) )\w+')).value
    # Usage counters: \d+ also captures multi-digit values; a missing value becomes 0
    $sessionModuleUsage = ([regex]::Match($row,'(?<=session module usage now )\d+')).value
    if (!$sessionModuleUsage) {
        $sessionModuleUsage = 0
    }
    $sessionCategoryUsage = ([regex]::Match($row,'(?<=session category usage now )\d+')).value
    if (!$sessionCategoryUsage) {
        $sessionCategoryUsage = 0
    }
    $moduleConcurrentUsage = ([regex]::Match($row,'(?<=total module concurrent usage now )\d+')).value
    if (!$moduleConcurrentUsage) {
        $moduleConcurrentUsage = 0
    }
    $totalCategoryUsage = ([regex]::Match($row,'(?<=total category usage now )\d+')).value
    if (!$totalCategoryUsage) {
        $totalCategoryUsage = 0
    }
    # Ordered hashtable so the CSV columns keep this order
    $hash = [ordered]@{
        Date = $date
        time = $time
        hostname = $hostname
        session_module_usage = $sessionModuleUsage
        session_category_usage = $sessionCategoryUsage
        module_concurrent_usage = $moduleConcurrentUsage
        total_category_usage = $totalCategoryUsage
    }
    $rowData = New-Object -TypeName 'psobject' -Property $hash
    $filteredRows.Add($rowData) > $null
}
# Convert to CSV text and strip the quotes that ConvertTo-Csv adds
$csv = $filteredRows | convertto-csv -NoTypeInformation -Delimiter "," | foreach {$_ -replace '"',''}
$csv | Out-File C:\results.csv
What essentially needs to happen is that we need to get-content of the log, which returns an array with each item terminated on a newline.
Once we have the rows, we need to grab the values via regex
Since you want zeroes in some of the items if those values don't exist, I have if statements that assign '0' if the regex returns nothing
Finally, we add each filtered item to a PSObject and append that object to an array of objects in each iteration.
Then export to a CSV.
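The extraction step can also be tried in isolation on a single line. The following is only a sketch under stated assumptions: `$row` is a shortened, made-up line modeled on the log format in the question, and the named groups `date`, `time`, and `host` are illustrative names, not something the log format defines.

```powershell
# Shortened sample row modeled on the log format above (hypothetical data)
$row = "27/08/2020 02:47:37.365 (-0516) hostname12 ult_licesrv ULT 5 request (session module usage now 12, session category usage now 1)"

# One pattern with named groups for the three leading fields
$pattern = '^(?<date>\d+/\d+/\d+)\s+(?<time>\d+:\d+:\d+\.\d+\s+\(\S+\))\s+(?<host>\S+)'
if ($row -match $pattern) {
    $date     = $Matches['date']
    $time     = $Matches['time']
    $hostname = $Matches['host']
}

# Optional field: fall back to 0 when the text is absent
$sessionModuleUsage = if ($row -match 'session module usage now (\d+)') { [int]$Matches[1] } else { 0 }
$totalCategoryUsage = if ($row -match 'total category usage now (\d+)') { [int]$Matches[1] } else { 0 }
```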
You can probably pick apart the lines with a regex and substrings easily enough. Basically something like the following:
# Iterate over the lines of the input file
Get-Content F:\input.txt |
ForEach-Object {
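# Extract the individual fields
# These fixed offsets assume the original log's column layout
# (date at 0, time at 12, host name at 34); adjust them if your spacing differs
$Date = $_.Substring(0, 10)
$Time = $_.Substring(12, $_.IndexOf(')') - 11)
$Hostname = $_.Substring(34, $_.IndexOf(' ', 34) - 34)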
$session_module_usage = 0
$session_category_usage = 0
$module_concurrent_usage = 0
$total_category_usage = 0
if ($_ -match 'session module usage now (\d+), session category usage now (\d+), total module concurrent usage now (\d+), total category usage now (\d+)') {
$session_module_usage = $Matches[1]
$session_category_usage = $Matches[2]
$module_concurrent_usage = $Matches[3]
$total_category_usage = $Matches[4]
}
# Create custom object with those properties
New-Object PSObject -Property @{
Date = $Date
time = $Time
hostname = $Hostname
session_module_usage = $session_module_usage
session_category_usage = $session_category_usage
module_concurrent_usage = $module_concurrent_usage
total_category_usage = $total_category_usage
}
} |
# Ensure column order in output
Select-Object Date,time,hostname,session_module_usage,session_category_usage,module_concurrent_usage,total_category_usage |
# Write as CSV - without quotes
ConvertTo-Csv -NoTypeInformation |
ForEach-Object { $_ -replace '"' } |
Out-File F:\output.csv
Whether to pull the date, time, and host name from the line with substrings or regex is probably a matter of taste. Same goes for how strict the format must be matched, but that to me mostly depends on how rigid the format is. For more free-form things where different lines would match different regexes, or multiple lines makes up a single record, I also quite like switch -Regex to iterate over the lines.
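For the switch -Regex style mentioned above, a minimal sketch might look like the following. The two sample lines are shortened, made-up variants of the log rows, and the labels `usage:` and `housekeeping` are purely illustrative.

```powershell
# Hypothetical, shortened log lines of two different kinds
$lines = @(
    "27/08/2020 02:47:37.365 (-0516) hostname12 ult_licesrv ULT 5 request (session module usage now 1, session category usage now 1)"
    "27/08/2020 02:47:37.115 (-0516) hostname141 ult_licesrv CMN 5 Deleting old log file 'x.log.gz'"
)

# switch iterates over the array; each line runs the first branch whose regex matches
$kinds = @(switch -Regex ($lines) {
    'session module usage now (\d+)' { "usage:$($Matches[1])"; continue }
    'Deleting old log file'          { 'housekeeping'; continue }
    default                          { 'other' }
})
```

To process a file directly, switch also accepts a -File parameter followed by a path in place of the parenthesized array.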

Editing a specific column of data in a text file with powershell

So I’ve had a request to edit a csv file by replacing column values with a set of unique numbers. Below is a sample of the original input file with a header line followed by a couple of rows. Note that the rows have NO column headers.
HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|IE|USD|20200605|EUR200717||
DD|GZFD|IE|USD|20200605|EUR200717||
What I’m looking to do is change say the values in column 3 with a unique number.
So far I have the following …
$i=0
$txtin = Get-Content "C:\Temp\InFile.txt" | ForEach {"$($_.split('|'))"-replace $_[2],$i++} |Out-File C:\Temp\csvout.txt
… but this isn’t working as it removes the delimiter and adds numbers in the wrong places …
HH0###0000000SLH30400110000100000002000000202006060202006050011100
1D1D1 1G1Z1F1D1 1I1E1 1U1S1D1 12101210101610151 1E1U1R1210101711171 1 1
2D2D2 2G2Z2F2D2 2I2E2 2U2S2D2 22202220202620252 2E2U2R2220202721272 2 2
Ideally I want it to look like this, whereby the values of 'IE' have been replaced by '01' and '02' in each row ...
HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|01|USD|20200605|EUR200717||
DD|GZFD|02|USD|20200605|EUR200717||
Any ideas on how to resolve would be much appreciated.
I think spreading this out over multiple lines will make it easier:
$txtin = Get-Content 'C:\Temp\InFile.txt'
# loop through the lines, skipping the first line
for ($i = 1; $i -lt $txtin.Count; $i++){
$parts = $txtin[$i].Split('|') # or use regex -split '\|'
if ($parts.Count -ge 3) {
$parts[2] = '{0:00}' -f $i # update the 3rd column
$txtin[$i] = $parts -join '|' # rejoin the parts with '|'
}
}
$txtin | Out-File -FilePath 'C:\Temp\csvout.txt'
Output will be:
HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|01|USD|20200605|EUR200717||
DD|GZFD|02|USD|20200605|EUR200717||
Updated to use the more robust check suggested by mklement0. This avoids errors when the line does not have at least three parts in it after the split
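The same split, replace, and rejoin steps can be wrapped in a small function and tried on in-memory lines. This is a sketch only: `Set-ColumnSequence` and its parameters are hypothetical names, and unlike the loop above it numbers only the rows it actually changes (using its own counter rather than the line index).

```powershell
function Set-ColumnSequence {   # hypothetical helper name
    param(
        [string[]]$Lines,
        [int]$ColumnIndex = 2,      # 0-based, so 2 is the third column
        [char]$Delimiter = '|'
    )
    $counter = 0
    foreach ($line in $Lines) {
        $parts = $line.Split($Delimiter)
        if ($parts.Count -gt $ColumnIndex) {
            $counter++
            $parts[$ColumnIndex] = '{0:00}' -f $counter   # 01, 02, ...
            $parts -join $Delimiter
        } else {
            $line   # header or short line: pass through unchanged
        }
    }
}

$sample = @(
    'HH ### SLH304 01100001 2 20200606 20200605 011100'
    'DD|GZFD|IE|USD|20200605|EUR200717||'
    'DD|GZFD|IE|USD|20200605|EUR200717||'
)
$result = @(Set-ColumnSequence -Lines $sample)
```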

PowerShell to add the odd numbers in the odd lines and the even numbers in the even lines

I need a PowerShell script that adds the odd numbers in the odd lines and the even numbers in the even lines. The result for each line should be written to standard output. Each line contains at least 2 numbers. The filename is given by a parameter.
for example this:
1 2 3
4 5 6
5 6 7
7 8 9
4 6 0
the output should be
4
10
12
8
0
You can read the file content with Get-Content (or its alias cat), which already returns an array with one string per line, so no extra splitting is needed
$lines = @(Get-Content $filepath)
You can then iterate over the lines, split each one on ' ', and sum the numbers whose parity matches the parity of the line number
for($i=0;$i -lt $lines.length;$i++){
    $numsInLine = $lines[$i].split(' ')
    $lineSum = 0
    for($j=0;$j -lt $numsInLine.length;$j++){
        $num = [int]$numsInLine[$j]
        if($num % 2 -eq ($i + 1) % 2){ #$i + 1 is the 1-based line number, $num is a number in that line
            $lineSum += $num #add the number itself rather than counting matches
        }
    }
    Write-Host $lineSum
}
Edit: In response to Lee_Dailey's comment, % in this context is the modulo operator. x % 2 returns the parity of x, which is exactly what you need here.
Also note that % in PowerShell is an alias for ForEach-Object.
% or ForEach-Object is a simple for-each loop: it iterates over all values in the array with a simpler syntax, but without an index.
In our case I would recommend using it only in place of the second for loop, since the first loop needs its index variable (note that $i is still in use). In that case, the second loop would look like:
$numsInLine | %{ #or $numsInLine | ForEach-Object
    if([int]$_ % 2 -eq ($i + 1) % 2){ #$_ refers to the current pipeline object
        $lineSum += [int]$_
    }
}
Note two things here:
1. % or ForEach-Object is always used after a pipeline, as it must receive an InputObject, and the InputObject must be enumerable. That is how to tell % as modulo apart from % as ForEach-Object.
2. $_ is used here as the current item of the iteration, because that item has no named variable referencing it. $_ is commonly used in pipelines, and in ForEach-Object script blocks in particular (it also appears in other contexts, such as Where-Object filters).
Another logically-equal syntax is:
foreach($num in $numsInLine){
    if([int]$num % 2 -eq ($i + 1) % 2){
        $lineSum += [int]$num
    }
}
Note that here no pipeline or $_ is needed.
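Putting the pieces together, the whole task can be sketched as one function and checked against the sample from the question. `Get-ParitySum` is a hypothetical name, and the sketch assumes every token on a line is an integer; it emits each sum to the output stream instead of calling Write-Host so the results can be captured.

```powershell
function Get-ParitySum {   # hypothetical helper name
    param([string[]]$Lines)
    $lineNumber = 0
    foreach ($line in $Lines) {
        $lineNumber++          # 1-based line number
        $sum = 0
        # Split on whitespace and skip empty tokens
        foreach ($token in ($line -split '\s+' | Where-Object { $_ })) {
            $num = [int]$token
            # Abs() keeps the parity test valid for negative numbers
            if ([math]::Abs($num % 2) -eq $lineNumber % 2) { $sum += $num }
        }
        $sum
    }
}

$sums = @(Get-ParitySum -Lines '1 2 3', '4 5 6', '5 6 7', '7 8 9', '4 6 0')
```

With a file, the call would look like Get-ParitySum -Lines (Get-Content $FilePath), where $FilePath comes from the script's parameter.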

Use Get content or Import-CSV to read 1st column in 2nd line in a csv

So I have a csv file which is 25MB.
I only need to get the value stored in 2nd line in first column and use it later in powershell script.
e.g data
File_name,INVNUM,ID,XXX....850 columns
ABCD,123,090,xxxx.....850 columns
ABCD,120,091,xxxx.....850 columns
xxxxxx5000+ rows
So my first column data is always the same and I just need to get this filename from the first column, 2nd row.
Should I try to use Get-content or Import-csv for this use case ?
Thanks,
Mickey
TessellatingHeckler's helpful answer contains a pragmatic, easy-to-understand solution that is most likely fast enough in practice; the same goes for Robert Cotterman's helpful answer which is concise (and also faster).
If performance is really paramount, you can try the following, which uses the .NET framework directly to read the lines - but given that you only need to read 2 lines, it's probably not worth it:
$inputFile = "$PWD/some.csv" # be sure to specify a *full* path
$isFirstLine=$true
$fname = foreach ($line in [IO.File]::ReadLines($inputFile)) {
if ($isFirstLine) { $isFirstLine = $false; continue } # skip header line
$line -replace '^([^,]*),.*', '$1' # extract 1st field from 2nd line and exit
break # exit
}
Note: A conceptually simpler way to extract the 1st field is to use ($line -split ',')[0], but with a large number of columns the above -replace-based approach is measurably faster.
Update: TessellatingHeckler offers 2 ways to speed up the above:
Use of $line.Substring(0, $line.IndexOf(',')) in lieu of $line -replace '^([^,]*),.*', '$1' in order to avoid relatively costly regex processing.
To lesser gain, use of a [System.IO.StreamReader] instance's .ReadLine() method twice in a row rather than [IO.File]::ReadLines() in a loop.
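Combining both suggestions, a minimal self-contained sketch could look like this. It writes a tiny temporary CSV so that it can run anywhere; the file contents and the variable names `$path`, `$reader`, and `$firstField` are assumptions for illustration only.

```powershell
# Create a small temporary CSV to read from (hypothetical sample data)
$path = [IO.Path]::GetTempFileName()
[IO.File]::WriteAllLines($path, [string[]]('File_name,INVNUM,ID', 'ABCD,123,090', 'ABCD,120,091'))

$reader = [IO.StreamReader]::new($path)
try {
    $null = $reader.ReadLine()   # skip the header line
    $line = $reader.ReadLine()   # second line of the file
} finally {
    $reader.Dispose()            # release the file handle even on error
}

# Everything before the first comma is the first field
$firstField = $line.Substring(0, $line.IndexOf(','))

Remove-Item $path
```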
Here's a performance comparison of the approaches across all answers on this page (as of this writing).
To run it yourself, you must download functions New-CsvSampleData and Time-Command first.
For more representative results, the timings are averaged across 1,000 runs:
# Create sample CSV file 'test.csv' with 850 columns and 100 rows.
$testFileName = "test-$PID.csv"
New-CsvSampleData -Columns 850 -Count 100 | Set-Content $testFileName
# Compare the execution speed of the various approaches:
Time-Command -Count 1000 {
# Import-Csv
Import-Csv -LiteralPath $testFileName |
Select-Object -Skip 1 -First 1 -ExpandProperty 'col1'
}, {
# ReadLines(), -replace
$inputFile = $PWD.ProviderPath + "/$testFileName"
$isFirstLine=$true
foreach ($line in [IO.File]::ReadLines($inputFile)) {
if ($isFirstLine) { $isFirstLine = $false; continue } # skip header line
$line -replace '^([^,]*),.*', '$1' # extract 1st field from 2nd line and exit
break # exit
}
}, {
# ReadLines(), .Substring / IndexOf
$inputFile = $PWD.ProviderPath + "/$testFileName"
$isFirstLine=$true
foreach ($line in [IO.File]::ReadLines($inputFile)) {
if ($isFirstLine) { $isFirstLine = $false; continue } # skip header line
$line.Substring(0, $line.IndexOf(',')) # extract 1st field from 2nd line and exit
break # exit
}
}, {
# ReadLine() x 2, .Substring / IndexOf
$inputFile = $PWD.ProviderPath + "/$testFileName"
$f = [System.IO.StreamReader]::new($inputFile,$true);
$null = $f.ReadLine(); $line = $f.ReadLine()
$line.Substring(0, $line.IndexOf(','))
$f.Close()
}, {
# Get-Content -Head, .Split()
((Get-Content $testFileName -Head 2)[1]).split(',')[1]
} |
Format-Table Factor, Timespan, Command
Remove-Item $testFileName
Sample output from a single-core Windows 10 VM running Windows PowerShell v5.1 / PowerShell Core 6.1.0-preview.4 on a recent-model MacBook Pro:
Windows PowerShell v5.1:
Factor TimeSpan Command
------ -------- -------
1.00 00:00:00.0001922 # ReadLine() x 2, .Substring / IndexOf...
1.04 00:00:00.0002004 # ReadLines(), .Substring / IndexOf...
1.57 00:00:00.0003024 # ReadLines(), -replace...
3.25 00:00:00.0006245 # Get-Content -Head, .Split()...
25.83 00:00:00.0049661 # Import-Csv...
PowerShell Core 6.1.0-preview.4:
Factor TimeSpan Command
------ -------- -------
1.00 00:00:00.0001858 # ReadLine() x 2, .Substring / IndexOf...
1.03 00:00:00.0001911 # ReadLines(), .Substring / IndexOf...
1.60 00:00:00.0002977 # ReadLines(), -replace...
3.30 00:00:00.0006132 # Get-Content -Head, .Split()...
27.54 00:00:00.0051174 # Import-Csv...
Conclusions:
Calling .ReadLine() twice is marginally faster than the ::ReadLines() loop.
Using -replace instead of Substring() / IndexOf() adds about 60% execution time.
Using Get-Content is more than 3 times slower.
Using Import-Csv | Select-Object is close to 30 times(!) slower, presumably due to the large number of columns; that said, in absolute terms we're still only talking about around 5 milliseconds.
As a side note: execution on macOS seems to be noticeably slower overall, with the regex solution and the cmdlet calls also being slower in relative terms.
Depends what you want to prioritize.
$data = Import-Csv -LiteralPath 'c:\temp\data.csv' |
Select-Object -Skip 1 -First 1 -ExpandProperty 'File_Name'
Is short and convenient. (2nd line meaning 2nd line of the file, or 2nd line of the data? Don't skip any if it's the first line of data).
Select-Object with something like -First 1 will break the whole pipeline when it's done, so it won't wait to read the rest of the 25MB in the background before returning.
You could likely speed it up, or reduce memory use, a minuscule amount if you opened the file, sought past two newlines, then a comma, then read to another comma, or wrote some other long detailed code, but I very much doubt it would be worth it.
Same with Get-Content, the way it adds NoteProperties to the output strings will mean it's likely no easier on memory and not usefully faster than Import-Csv
You could really shorten it with
(gc c:\file.txt -head 2)[1]
Only reads 2 lines and then grabs index 1 (second line)
You could then split it and grab the field you need from the split-up line. Index 0 is the first column; the example below grabs index 1, the second column.
((gc c:\file.txt -head 2)[1]).split(',')[1]
UPDATE: After seeing the new post with times, I was inspired to do some tests myself (thanks, mklement0). This was the fastest I could get to work:
$check = 0
foreach ($i in [IO.FILE]::ReadLines("$filePath")){
if ($check -eq 2){break}
if ($check -eq 1){$value = $i.split(',')[1]} #$value = your answer
$check++
}
Just thought of this: remove if -eq 2 and put break after a semi colon after the check 1 is performed. 5 ticks faster. Haven't tested.
here were my results over 40000 tests:
GC split avg was 1.11307622 Milliseconds
GC split Min was 0.3076 Milliseconds
GC split Max was 18.1514 Milliseconds
ReadLines split avg was 0.3836625825 Milliseconds
ReadLines split Min was 0.2309 Milliseconds
ReadLines split Max was 31.7407 Milliseconds
Stream Reader avg was 0.4464924825 Milliseconds
Stream Reader MIN was 0.2703 Milliseconds
Stream Reader Max was 31.4991 Milliseconds
Import-CSV avg was 1.32440485 Milliseconds
Import-CSV MIN was 0.2875 Milliseconds
Import-CSV Max was 103.1694 Milliseconds
I was able to run 3000 tests a second on the 2nd and 3rd, and 1000 tests a second on the first and last. Stream Reader was HIS fastest one. And Import-Csv wasn't bad; I wonder if mklement0 didn't have a column named "file_name" in his test CSV? Anyhow, I'd personally use the GC command because it's concise and easy to remember. But this is up to you, and I wish you luck on your scripting adventures.
I'm certain we could start hyperthreading this and get insane results, but when you're talking thousandths of a second is it really a big deal? Especially to get one variable? :D
here's the streamreader code I used for transparency reasons...
$inputFile = "$filePath"
$f = [System.IO.StreamReader]::new($inputFile,$true);
$null = $f.ReadLine(); $line = $f.ReadLine()
$line.Substring(0, $line.IndexOf(','))
$f.Close()
I also noticed this pulls the 1st value of the second line, and I have no idea how to switch it to the 2nd value... it seems to be measuring the width from point 0 to the first comma, and then cutting that. if you change substring from 0 to say 5, it still measures the length of 0 to comma, but then moves where to start grabbing... to the 6th character.
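For reference, Substring takes a start index and a length, not a start and an end, which explains the behavior described above. A small sketch of how the second field could be cut out with the same two methods follows; the sample line and variable names are made up for illustration.

```powershell
$line = 'ABCD,123,090,xxxx'   # hypothetical sample row

$firstComma  = $line.IndexOf(',')                    # position of the first comma
$secondComma = $line.IndexOf(',', $firstComma + 1)   # search again, starting after it

# Substring(startIndex, length)
$first  = $line.Substring(0, $firstComma)
$second = $line.Substring($firstComma + 1, $secondComma - $firstComma - 1)
```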
The import-csv I used was :
$data = Import-Csv -LiteralPath "$filePath" |
Select-Object -Skip 1 -First 1 -ExpandProperty 'FileName'
I tested these on a 90 meg csv, with 21 columns, and 284k rows. and "FileName" was the second column

Use of parentheses in PowerShell scripting?

Can anyone explain to me the difference between the following two statements?
gc -ReadCount 2 .\input.txt| % {"##" + $_}
(gc -ReadCount 2 .\input.txt)| % {"##" + $_}
I am using below file as input for above commands.
input.txt
1
2
Output
gc -ReadCount 2 .\input.txt| % {"##" + $_}
##1 2
(gc -ReadCount 2 .\input.txt)| % {"##" + $_}
##1
##2
If the input file contains more than 2 records both are giving same output.
I can modify my code to achieve what i want but i am just wondering why these 2 are giving different outputs.
I googled for the information but didn't find any answer.
Edit 1
Isn't the output of command 2 wrong? When I specify "-ReadCount 2" it should pipe two lines at a time, which means the foreach loop should iterate only once (as the input contains only 2 lines) with $_[0]=1, $_[1]=2, so that when I print "##"+$_ it should print "##1 2" as command 1 did.
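gc -ReadCount 2 .\input.txt| % {"##" + $_}
With -ReadCount 2, Get-Content sends the lines down the pipeline in chunks, each chunk being an array of up to two lines. The file has only two lines, so the foreach loop runs once, with $_ holding the array 1, 2. Concatenating that array to "##" converts it to a string whose elements are joined with spaces, which gives "##1 2".
(gc -ReadCount 2 .\input.txt)| % {"##" + $_}
The parentheses make PowerShell collect all of the command's output before piping it. Only one chunk was emitted, so the collected result is that two-element array itself, and piping an array enumerates its elements. The foreach loop therefore runs twice, once per line. With more than two lines there are several chunks; the parentheses collect them into an array of arrays, and piping that enumerates the chunks, which is why both commands then give the same output.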
The -ReadCount parameter splits the data into chunks, each an array of lines, and is mostly used for performance. For a file with the lines 1 to 7, -ReadCount 3 will show
##1 2 3
##4 5 6
##7
and -ReadCount 4 will show:
##1 2 3 4
##5 6 7
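The two-line case from the question can be reproduced with a throwaway file. This is a sketch for experimentation only: the temporary file and the variable names `$piped` and `$grouped` are assumptions, and the @( ) wrappers are there just to make the results easy to count.

```powershell
# Two-line input file, as in the question (created in the temp folder)
$path = [IO.Path]::GetTempFileName()
Set-Content -Path $path -Value '1', '2'

# Without parentheses: one chunk (an array of two lines) reaches the loop
$piped = @(Get-Content -ReadCount 2 $path | ForEach-Object { "##" + $_ })

# With parentheses: the single collected chunk is enumerated line by line
$grouped = @((Get-Content -ReadCount 2 $path) | ForEach-Object { "##" + $_ })

Remove-Item $path
```

Inspecting $piped.Count and $grouped.Count afterwards shows how many times each loop ran.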
For everyone to see, here is the output of Get-Help Get-Content -Parameter ReadCount:
-ReadCount <Int64>
Specifies how many lines of content are sent through the pipeline at a time.
The default value is 1. A value of 0 (zero) sends all of the content at one time.
This is what breaks the lines into groups (I assumed it would instead limit the number of lines read in the file).
Still no clue about the less-than-three-lines behavior, though.