Count repetitions Powershell - powershell

I have a file like this:
CONTOSO-A\AAA
CONTOSO-B\BBB
CONTOSO-B\CCC
CONTOSO-A\AAA
....
....
How can I count the occurrences of each line to get:
CONTOSO-A\AAA - 2
CONTOSO-B\BBB - 1
CONTOSO-B\CCC - 1

Get-Content .\file.txt | Group-Object | Select-Object name, count

I'd use a hash table:
$counts = @{}
Get-Content c:\somedir\somefile.txt |
foreach { $counts[$_]++ }
$counts
Name Value
---- -----
CONTOSO-A\AAA 2
CONTOSO-B\CCC 1
CONTOSO-B\BBB 1
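A hash table enumerates in no guaranteed order, as the output above shows. The following is a minimal sketch of sorting the entries and printing them in the "name - count" format from the question; the inline `$lines` array is an assumption that stands in for the Get-Content call.

```powershell
# Sample lines stand in for: Get-Content c:\somedir\somefile.txt (assumption)
$lines = 'CONTOSO-A\AAA', 'CONTOSO-B\BBB', 'CONTOSO-B\CCC', 'CONTOSO-A\AAA'

# Count each distinct line; a missing key starts at $null, which ++ turns into 1.
$counts = @{}
foreach ($line in $lines) { $counts[$line]++ }

# GetEnumerator() exposes the Key/Value pairs so they can be sorted and formatted.
$report = $counts.GetEnumerator() |
    Sort-Object -Property Key |
    ForEach-Object { '{0} - {1}' -f $_.Key, $_.Value }

$report
```

Sorting by Value instead of Key would order the lines by how often they occur.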

The simplest way to do this is probably:
PS C:\temp> @"
CONTOSO-A\AAA
CONTOSO-B\BBB
CONTOSO-B\CCC
CONTOSO-A\AAA
"@ | set-content test.txt
get-content test.txt | group -NoElement
Count Name
----- ----
2 CONTOSO-A\AAA
1 CONTOSO-B\BBB
1 CONTOSO-B\CCC
Using the -NoElement option to group or Group-Object means you don't have to do a separate select to extract just name and count.
To get the exact format you asked for:
PS C:\temp> get-content test.txt | group -NoElement | % { $_.Name +" - "+$_.Count }
CONTOSO-A\AAA - 2
CONTOSO-B\BBB - 1
CONTOSO-B\CCC - 1

$stat = @{};
cat file.txt | % { $stat["$_"] = $stat["$_"] + 1; }
$stat;

Related

Powershell - Group and count from CSV file

In my CSV file, I have two columns with header Start_date and Status. I am trying to find out the success percentage for each Start_date
Start_date Status
------------------------------------------
02-03-2022 Completed
02-03-2022 Completed
03-03-2022 Failed
03-03-2022 Completed
I am looking for final output like the one below, which I will export to CSV:
Start_date Total Completed Failed Success %
02-03-2022 2 2 0 100
03-03-2022 2 1 1 50
As a first step, I am trying to get the count of jobs for each day using the code below.
$data = Import-Csv "C:\file.csv"
$data | group {$_.Start_date} | Sort-Object {$_.Start_date} | Select-Object {$_.Status}, Count
The code above gives me output like:
$_.Status Count
--------- -----
1
1
It is not showing the date value. What would be the correct approach for this issue?
You can use Group-Object to group the objects by the date column, then it's just math:
$csv = @'
Start_date,Status
02-03-2022,Completed
02-03-2022,Completed
03-03-2022,Failed
03-03-2022,Completed
'@ | ConvertFrom-Csv
# This from your side should be:
# Import-Csv path/to/csv.csv | Group-Object ....
$csv | Group-Object Start_date | ForEach-Object {
$completed, $failed = $_.Group.Status.where({ $_ -eq 'Completed' }, 'Split')
$totalc = $_.Group.Count
$complc = $completed.Count
$failc = $failed.Count
$success = $complc / $totalc
[pscustomobject]@{
Start_Date = $_.Name
Total = $totalc
Completed = $complc
Failed = $failc
Success = $success.ToString('P0')
}
}
Here's another one:
$csv = Import-Csv C:\Temp\tmp.csv
$Results = @()
foreach ($group in $csv | Group Start_date)
{
$Completed = ($group.Group | group status | ? Name -eq Completed).Count
$Failed = ($group.Group | group status | ? Name -eq Failed).Count
$row = "" | Select Start_date,Total,Completed,Failed,"Success %"
$row.Start_date = $group.Name
$row.Total = $group.Count
$row.Completed = $Completed
$row.Failed = $Failed
# Divide by the group's own total rather than a hard-coded 2
$row."Success %" = $Completed / $group.Count * 100
$Results += $row
}
$results
Start_date Total Completed Failed Success %
---------- ----- --------- ------ ---------
02-03-2022 2 2 0 100
03-03-2022 2 1 1 50

Add up the data if the reference from another file is correct

I have two CSV Files which look like this:
test.csv:
"Col1","Col2"
"1111","1"
"1122","2"
"1111","3"
"1121","2"
"1121","2"
"1133","2"
"1133","2"
The second looks like this:
test2.csv:
"Number","signs"
"1111","ABC"
"1122","DEF"
"1111","ABC"
"1121","ABC"
"1133","GHI"
Now the goal is to get a summary of all points from test.csv assigned to the "signs" of test2.csv. The numbers are the reference, as you can see.
Should be something like this:
ABC = 8
DEF = 2
GHI = 4
I have tried this out but cannot reach that goal. What I have so far is:
$var = "C:\PathToCSV"
$csv1 = Import-Csv "$var\test.csv"
$csv2 = Import-Csv "$var\test2.csv"
# Process: group by 'Item' then sum 'Average' for each group
# and create output objects on the fly
$test1 = $csv1 | Group-Object Col1 | ForEach-Object {
New-Object psobject -Property @{
Col1 = $_.Name
Sum = ($_.Group | Measure-Object Col2 -Sum).Sum
}
}
But this gives me back the following output:
Ps> $test1
Sum Col1
--- ----
4 1111
2 1122
4 1121
4 1133
I am not able to get the summary and the mapping of the signs.
Not sure if I understand your question correctly, but I'm going to assume that for each value from the column "signs" you want to lookup the values from the column "Number" in the second CSV and then calculate the sum of the column "Col2" for all matches.
For that I'd build a hashtable with the pre-calculated sums for the unique values from "Col1":
$h1 = @{}
$csv1 | ForEach-Object {
$h1[$_.Col1] += [int]$_.Col2
}
and then build a second hashtable to sum up the lookup results for the values from the second CSV:
$h2 = @{}
$csv2 | ForEach-Object {
$h2[$_.signs] += $h1[$_.Number]
}
However, that produced a different value for "ABC" than what you stated as the desired result in your question when I processed your sample data:
Name Value
---- -----
ABC 12
GHI 4
DEF 2
Or did you mean you want to sum up the corresponding values for the unique numbers for each sign? For that you'd change the second code snippet to something like this:
$h2 = @{}
$csv2 | Group-Object signs | ForEach-Object {
$name = $_.Name
$_.Group | Select-Object -Unique -Expand Number | ForEach-Object {
$h2[$name] += $h1[$_]
}
}
That would produce the desired result from your question:
Name Value
---- -----
ABC 8
GHI 4
DEF 2

Import-CSV does not preserve line indents

I am using Import-CSV to get the data from a csv file that looks like:
P1,1,3,4
P2,4,5,6
P3,1,2,3
P4,8.7,6,3
I would like to keep the white-space in front of the text as it indicates the hierarchy. Import-CSV returns:
P1,1,3,4
P2,4,5,6
P3,1,2,3
P4,8.7,6,3
Is there a way to keep the white space?
CSV fields don't have to be quoted, but Import-CSV discards leading whitespace in unquoted fields. Quoting the items in each row preserves it:
"P1","1","3","4"
" P2","4","5","6"
" P3","1","2","3"
" P4","8.7","6","3"
You can take a shortcut and only wrap the entries with leading spaces in quotes:
P1,1,3,4
" P2",4,5,6
" P3",1,2,3
" P4",8.7,6,3
Then Import-CSV will function as you're expecting, headers added for demonstration:
Import-CSV leading_spaces.csv -Header "Field1","Field2","Field3","Field4"
Gives you your desired output:
Field1 Field2 Field3 Field4
------ ------ ------ ------
P1 1 3 4
P2 4 5 6
P3 1 2 3
P4 8.7 6 3
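A quick way to check the quoting behaviour without touching a file is to feed similar rows to ConvertFrom-Csv. This is only a sketch: the inline here-string stands in for leading_spaces.csv, the indentation depths are made up for illustration, and it assumes ConvertFrom-Csv treats quoted fields the same way Import-CSV does.

```powershell
# Inline sample data stands in for leading_spaces.csv (assumption);
# only the fields that carry leading spaces are quoted.
$rows = @'
P1,1,3,4
" P2",4,5,6
"  P3",1,2,3
'@ | ConvertFrom-Csv -Header 'Field1', 'Field2', 'Field3', 'Field4'

# Brackets make the preserved leading spaces visible in the console.
$rows | ForEach-Object { '[{0}]' -f $_.Field1 }
```

This should print [P1], [ P2] and [  P3], showing that the spaces inside the quotes survive the import.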
As per James C's comment, you can do this with Get-Content:
$myData = Get-Content .\test2.txt
foreach($line in ($myData | Select-Object -Skip 1)){
[array]$results += [pscustomobject]@{
$myData[0].Split(",")[0] = $line.Split(",")[0]
$myData[0].Split(",")[1] = $line.Split(",")[1]
$myData[0].Split(",")[2] = $line.Split(",")[2]
$myData[0].Split(",")[3] = $line.Split(",")[3]
}
}
Maybe this will help, it will create a new object for each row:
get-content test.csv | % {
$row = New-Object PSObject
$i = 0
$_ -split "," | %{
$row | add-member Noteproperty "column$i" $_
$i++
}
$row
}
Output will look like this:
column0 column1 column2 column3
------- ------- ------- -------
P1 1 3 4
P2 4 5 6
P3 1 2 3
P4 8.7 6 3

How do I get filename and line count per file using powershell

I have the following powershell script to count lines per file in a given directory:
dir -Include *.csv -Recurse | foreach{get-content $_ | measure-object -line}
This is giving me the following output:
Lines Words Characters Property
----- ----- ---------- --------
27
90
11
95
449
...
The counts-per-file is fine (I don't require words, characters, or property), but I don't know what filename the count is for.
The ideal output would be something like:
Filename Lines
-------- -----
Filename1.txt 27
Filename1.txt 90
Filename1.txt 11
Filename1.txt 95
Filename1.txt 449
...
How do I add the filename to the output?
Try this:
dir -Include *.csv -Recurse |
% { $_ | select name, @{n="lines";e={
get-content $_ |
measure-object -line |
select -expa lines }
}
} | ft -AutoSize
I can offer another solution:
Get-ChildItem $testPath | % {
$_ | Select-Object -Property 'Name', @{
label = 'Lines'; expression = {
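# @(...) forces an array, so a one-line file counts as 1 line instead of reporting the string's length
@($_ | Get-Content).Count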
}
}
}
I ran this against .txt files, and the output looks like this:
Name Lines
---- ----
1.txt 1
2.txt 2
3.txt 3
4.txt 4
5.txt 5
6.txt 6
7.txt 7
8.txt 8
9.txt 9
I want to sort the output like this because I am rewriting a UNIX shell command (from The Pragmatic Programmer: Your Journey to Mastery, page 145).
The purpose of that command is to find the five files with the largest number of lines.
The code above is my progress so far, and I am close to a solution.
However, it is far more complicated than the UNIX shell command.
I believe there should be a simpler way, and I am still looking for it.
find . -type f | xargs wc -l | sort -n | tail -5
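One possible PowerShell counterpart to that pipeline is sketched below. The function name Get-LargestFilesByLineCount and its parameters are hypothetical, and the -File switch assumes PowerShell 3.0 or later. Unlike the UNIX version, it sorts in descending order and takes the first five.

```powershell
# Hypothetical helper: returns the files with the most lines under $Path.
function Get-LargestFilesByLineCount {
    param(
        [string]$Path = '.',
        [int]$Top = 5
    )

    Get-ChildItem -Path $Path -File -Recurse |
        Select-Object FullName, @{
            Name       = 'Lines'
            # @(...) forces an array, so a one-line file reports 1
            # rather than the length of its single string.
            Expression = { @(Get-Content -LiteralPath $_.FullName).Count }
        } |
        Sort-Object -Property Lines -Descending |
        Select-Object -First $Top
}
```

Calling Get-LargestFilesByLineCount -Path . -Top 5 lists the five largest files by line count, largest first.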
I have used the following script, which gives me the line counts of files in all subdirectories of the folder c:\temp\A. The output goes to the lines1.txt file. I have applied a filter to select only files with the ".txt" extension.
Get-ChildItem c:\temp\A -recurse | where {$_.extension -eq ".txt"} | % {
$_ | Select-Object -Property 'Name', @{
label = 'Lines'; expression = {
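# @(...) forces an array, so a one-line file counts as 1 line instead of reporting the string's length
@($_ | Get-Content).Count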
}
}
} | out-file C:\temp\lines1.txt

Count the comma in each line and show the line numbers in a text file

I'm using the following script to get the comma counts.
Get-Content .\myFile |
% { ($_ | Select-String `, -all).matches | measure | select count } |
group -Property count
It returns,
Count Name Group
----- ---- -----
131 85 {@{Count=85}, @{Count=85}, @{Count=85}, @{Count=85}...}
3 86 {@{Count=86}, @{Count=86}, @{Count=86}}
Can I show the line number in the Group column instead of @{Count=86}, ...?
The files will have a lot of lines, and the majority of the lines have the same number of commas. I want to group them so the output is shorter.
Can you use something like this?
$s = @"
this,is,a
test,,
with,
multiple, commas, to, count,
"@
#convert to string-array(like you normally have with multiline strings)
$s = $s -split "`n"
$s | Select-String `, -AllMatches | Select-Object LineNumber, @{n="Count"; e={$_.Matches.Count}} | Group-Object Count
Count Name Group
----- ---- -----
2 2 {@{LineNumber=1; Count=2}, @{LineNumber=2; Count=2}}
1 1 {@{LineNumber=3; Count=1}}
1 4 {@{LineNumber=4; Count=4}}
If you don't want the "count" property multiple times in the group, you need custom objects. Like this:
$s | Select-String `, -AllMatches | Select-Object LineNumber, @{n="Count"; e={$_.Matches.Count}} | Group-Object Count | % {
New-Object psobject -Property @{
"Count" = $_.Name
"LineNumbers" = ($_.Group | Select-Object -ExpandProperty LineNumber)
}
}
Output:
Count LineNumbers
----- -----------
2 {1, 2}
1 3
4 4