I'm currently using PowerShell's Get-Random cmdlet to randomly pull a set number of rows from a CSV file. I need to add a constraint: if one ID is pulled, find the other rows with the same ID and pull them as well.
Here is what I currently have:
$chosenOnes = Import-CSV C:\Temp\pk2.csv | sort{Get-Random} | Select -first 6
$i = 1
$count = $chosenOnes | Group-Object householdID
foreach ($row in $count)
{
if ($row.count -gt 1)
{
$students = $row.Group.Student
foreach ($student in $students)
{
$name = $student.tostring()
#...do something
$i = $i + 1
}
}
else
{
$name = $row.Group.Student
if($i -le 5)
{
#...do something
}
else
{
#...do something
}
$i = $i + 1
}
}
Example dataset
ID,name
165,Ernest Hemingway
1204,Mark Twain
1578,Stephen King
1634,Charles Dickens
1726,George Orwell
7751,John Doe
7751,Tim Doe
In this example, there are 7 rows but I'm randomly selecting 6 in my code. What needs to happen is that when a row with ID=7751 is selected, I must return both rows where ID=7751. The IDs cannot be statically set in the code.
Use Get-Random directly, with -Count, to extract a given number of random elements from a collection.
$allRows = Import-CSV C:\Temp\pk2.csv
$chosenHouseholdIDs = ($allRows | Get-Random -Count 6).householdID
Then filter all rows by whether their householdID column contains one of the 6 randomly selected rows' householdID values (PSv3+ syntax), using the -in array-containment operator:
$allRows | Where-Object householdID -in $chosenHouseholdIDs
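The two steps above can be sketched end to end as follows. This is a minimal sketch, assuming the column is named ID as in the question's example dataset (the answer's code uses householdID) and using an in-memory copy of that dataset instead of the CSV file; a count of 3 is chosen only to keep the output short.

```powershell
# Example dataset from the question, parsed in memory (assumption: column names ID and name)
$allRows = @'
ID,name
165,Ernest Hemingway
1204,Mark Twain
1578,Stephen King
1634,Charles Dickens
1726,George Orwell
7751,John Doe
7751,Tim Doe
'@ | ConvertFrom-Csv

# Pick 3 random rows and keep only their distinct IDs
$chosenIDs = ($allRows | Get-Random -Count 3).ID | Select-Object -Unique

# Single pass over all rows: keep every row whose ID was chosen
$result = $allRows | Where-Object ID -in $chosenIDs
$result
```

If either Doe row is picked, both rows with ID 7751 end up in $result, so the result can contain more rows than were randomly selected.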
Optional reading: performance considerations:
$allRows | Get-Random -Count 6 is not only conceptually simpler, but also much faster than $allRows | Sort-Object { Get-Random } | Select-Object -First 6
Using the Time-Command function to compare the performance of two approaches, using a 1000-row test file with 10 columns yields the following sample timings on my Windows 10 VM in Windows PowerShell - note that the Sort-Object { Get-Random }-based solution is more than 15(!) times slower:
Factor Secs (100-run avg.) Command                                                        TimeSpan
------ ------------------- -------                                                        --------
1.00   0.007               $allRows | Get-Random -Count 6                                 00:00:00.0072520
15.65  0.113               $allRows | Sort-Object { Get-Random } | Select-Object -First 6 00:00:00.1134909
Similarly, a single pass through all rows to find matching IDs via array-containment operator -in performs much better than looping over the randomly selected IDs and searching all rows for each.
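A rough way to reproduce this kind of comparison without the Time-Command function is the built-in Measure-Command cmdlet. The sketch below assumes a synthetic 1000-row data set (the property names and values are placeholders, not the original test file) and times a single run of each approach, so the numbers will be noisier than the 100-run averages shown above.

```powershell
# Synthetic 1000-row data set; property names and values are placeholders
$allRows = 1..1000 | ForEach-Object {
    [PSCustomObject]@{ householdID = Get-Random -Maximum 500; Student = "Student$_" }
}

# Time one run of each approach (Measure-Command discards the pipeline output)
$direct = Measure-Command { $allRows | Get-Random -Count 6 }
$sorted = Measure-Command { $allRows | Sort-Object { Get-Random } | Select-Object -First 6 }

'{0:N1} ms (Get-Random -Count) vs {1:N1} ms (Sort-Object + Select-Object)' -f `
    $direct.TotalMilliseconds, $sorted.TotalMilliseconds
```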
I tried sticking with your beginning and came up with this.
$Array = Import-CSV C:\test\StudtentTest.csv
$Array | Sort{Get-Random} | select -first 2 | %{
$id = $_.id
$Array | ?{$_.id -eq $id} | %{
$_
}
}
$Array will be your parsed CSV.
We pipe it in, sort it randomly, and select the first 2 rows (in this case).
We save the ID of each selected object in $id, then search the array for that ID and display each row that matches.
If a selected ID has more than one matching row, you end up with something like
ID name
-- ----
7751 John Doe
7751 Tim Doe
1634 Charles Dickens
Related
I am having below PowerShell script which does not result in the sorting order I want.
$string = @("Project-a1-1", "Project-a1-10", "Project-a1-2", "Project-a1-5", "Project-a1-6", "Project-a1-8")
$myobjecttosort=@()
$string | ForEach{
$myobjecttosort+=New-Object PSObject -Property @{
'String'=$_
'Numeric'=[int]([regex]::Match($_,'\d+')).Value
}
}
$myobjecttosort | Sort-Object Numeric | Select Numeric,String | Format-Table -AutoSize
The output of the above script:
Numeric String
1 Project-a1-5
1 Project-a1-6
1 Project-a1-8
1 Project-a1-1
1 Project-a1-10
1 Project-a1-2
Required Output
1 Project-a1-1
2 Project-a1-2
3 Project-a1-5
4 Project-a1-6
5 Project-a1-8
6 Project-a1-10
Also, I always want the last value to be returned as the output, so here the output would be Project-a1-10
Sort-Object accepts a script block, allowing for a more robust sort. Just like any other object in the pipeline, the input objects are accessible there via $PSItem, or $_. So a quick way to go about this is to strip everything before the trailing digits, then cast the result to [int] and sort by that.
$string = "Project-a1-1", "Project-a1-10", "Project-a1-2", "Project-a1-5", "Project-a1-6", "Project-a1-8"
$string |
Sort-Object -Property { [int]($_ -replace '^.*?(?=\d+$)') } |
% { $i = 1 } {
'{0} {1}' -f $i++, $_
}
The above yields:
1 Project-a1-1
2 Project-a1-2
3 Project-a1-5
4 Project-a1-6
5 Project-a1-8
6 Project-a1-10
Passing the sorted items to % (an alias of ForEach-Object), we can then format a new string for each item, prefixed with an index starting at 1.
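The question also asks for only the last (highest-numbered) value. A minimal sketch, reusing the same sort expression and adding Select-Object -Last 1:

```powershell
$string = "Project-a1-1", "Project-a1-10", "Project-a1-2", "Project-a1-5", "Project-a1-6", "Project-a1-8"

# Sort numerically by the trailing digits, then keep only the final element
$last = $string |
    Sort-Object -Property { [int]($_ -replace '^.*?(?=\d+$)') } |
    Select-Object -Last 1

$last   # Project-a1-10
```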
I have a list of IP addresses. They all start with 10.10. I want all the unique values of the third octet. This way I can count how many of that unique value there are.
10.10.26.251
10.10.27.221
10.10.26.55
10.10.31.12
10.10.12.31
10.10.31.11
10.10.27.15
10.10.26.5
When I am done I want to know that I have 3 .26 network devices, 2 .27 devices, and so on. Other than breaking down the octet with a split and looping through each one, I can't think of any one-liners. Any suggestions?
here's a small variant. [grin] i already had this before noticing the other answers - and it is a tad different.
what it does ...
creates a collection of IPv4 address objects to work with
groups them by a calculated property [the 3rd octet]
creates a [PSCustomObject] for each resulting group
sends it to the $Octet3_Report variable
shows it on screen
output to a CSV file would be easy at that point. here's the code ...
$IP_List = @(
[ipaddress]'10.10.26.251'
[ipaddress]'10.10.27.221'
[ipaddress]'10.10.26.55'
[ipaddress]'10.10.31.12'
[ipaddress]'10.10.12.31'
[ipaddress]'10.10.31.11'
[ipaddress]'10.10.27.15'
[ipaddress]'10.10.26.5'
)
$Octet3_Report = $IP_List |
Group-Object -Property {$_.ToString().Split('.')[2]} |
ForEach-Object {
[PSCustomObject]@{
Octet_3 = $_.Name
Count = $_.Count
}
}
$Octet3_Report
on screen output ...
Octet_3 Count
------- -----
26 3
27 2
31 2
12 1
It's like me to figure it out after the fact.
$Return contains the DNS records. The IP addresses are stored in recorddata. I strip the last octet off each IP address, then loop through the unique ranges with a foreach loop, counting the records in each range, to make it cleaner.
$DNSRecordCounts = @()
$Ranges = ($Return | where-object {$_.recorddata -like "10.10.*"}).recorddata -replace "\.\d{1,3}$" | select -Unique
foreach ($range in $Ranges) {
$DNSRecordCounts += [pscustomobject][ordered]@{
IPRange = $range
Count = ($Return | Where-Object {$_.recorddata -like "$($range).*"}).Count
}
}
Based on your question and what I can infer from your own answer, if you are looking for something a little more like "idiomatic" PowerShell you want the following:
$Return `
| Select-Object -ExpandProperty recorddata `
| ForEach-Object {
$_ -match "\d+\.\d+\.(?<octet>\d+)\.\d+" | Out-Null
$Matches.octet
} `
| Group-Object `
| ForEach-Object {
[PSCustomObject]@{
Octet = $_.Name
Count = $_.Count
}
}
I have a system that currently reads data from a CSV file produced by a separate system that is going to be replaced.
The imported CSV file looks like this
PS> Import-Csv .\SalesValues.csv
Sale Values AA BB
----------- -- --
10 6 5
5 3 4
3 1 9
To replace this process I hope to produce an object that looks identical to the CSV above, but I do not want to continue to use a CSV file.
I already have a script that reads data in from our database and extracts the data that I need to use. I'll not detail the fairly long script that precedes this point, but in effect it looks like this:
$SQLData = Custom-SQLFunction "SELECT * FROM SALES_DATA WHERE LIST_ID = $LISTID"
$SQLData will contain ~5000+ DataRow objects that I need to query.
One of those DataRow objects looks something like this:
lead_id : 123456789
entry_date : 26/10/2018 16:51:16
modify_date : 01/11/2018 01:00:02
status : WRONG
user : mrexample
vendor_lead_code : TH1S15L0NGC0D3
source_id : A543212
list_id : 333004
list_name : AA Some Text
gmt_offset_now : 0.00
SaleValue : 10
list_name is going to be prefixed with AA or BB.
SaleValue can be any integer 3 and up, however realistically extremely unlikely to be higher than 100 (as this is a monthly donation) and will be one of 3,5,10 in the vast majority of occurrences.
I already have a script that takes the content of list_name, then creates and populates two separate psobjects ($AASalesValues and $BBSalesValues) that collate the total number of occurrences of each 'SaleValue' across the data set.
Because I cannot reliably anticipate the value of any SaleValue I have to dynamically create the psobjects properties like this
foreach ($record in $SQLData) {
if ($record.list_name -match "BB") {
if ($record.SaleValue -gt 0) {
if ($BBSalesValues | Get-Member -Name $($record.SaleValue) -MemberType Properties) {
$BBSalesValues.$($record.SaleValue) = $BBSalesValues.$($record.SaleValue)+1
} else {
$BBSalesValues | Add-Member -Name $($record.SaleValue) -MemberType NoteProperty -Value 1
}
}
}
}
The two resultant objects look like this:
PS> $AASalesValues
10 5 3 50
-- - - --
17 14 3 1
PS> $BBSalesvalues
3 10 5 4
- -- - -
36 12 11 1
I now have the data that I need, however I need to format it in a way that replicates the format of the CSV so I can pass it directly to another existing powershell script that is configured to expect the data in the format that the CSV is in, but I do not want to write the data to a file.
I'd prefer to pass this directly to the next part of the script.
Ultimately what I want to do is to produce a new object/some output that looks like the output from Import-Csv command at the top of this post.
I'd like a new object, say $OverallSalesValues, to look like this:
PS>$overallSalesValues
Sale Values AA BB
50 1 0
10 17 12
5 14 11
4 0 1
3 3 36
In the above example the values from $AASalesValues are listed under the AA column, the values from $BBSalesValues are listed under the BB column, with the rows matching the headers of the two original objects.
I did try this with hashtables but I was unable to work out how to both create them from dynamic values and format them to how I needed them to look.
Finally got there.
$TotalList = @()
foreach($n in 3..200){
if($AASalesValues.$n -or $BBSalesValues.$n){
$AACount = $AASalesValues.$n
$BBcount = $BBSalesValues.$n
$values = [PSCustomObject]@{
'Sale Value'= $n
AA = $AACount
BB = $BBcount
}
$TotalList += $values
}
}
$TotalList
produces an output of
Sale Value AA BB
---------- -- --
3 3 36
4 2
5 14 11
10 18 12
50 1
Just need to add a bit to include '0' values instead of $null.
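One possible way to get 0 instead of $null is to cast each looked-up value to [int], since [int]$null evaluates to 0. The sketch below is an assumption-laden variation, not the original script: the two sample objects are hypothetical stand-ins built from the example values in the question, and instead of probing 3..200 it walks the property names that actually exist on either object.

```powershell
# Hypothetical stand-ins for the two objects described in the question
$AASalesValues = [PSCustomObject]@{ '10' = 17; '5' = 14; '3' = 3; '50' = 1 }
$BBSalesValues = [PSCustomObject]@{ '3' = 36; '10' = 12; '5' = 11; '4' = 1 }

# Union of the sale values present on either object, highest first
$saleValues = @($AASalesValues.PSObject.Properties.Name) + @($BBSalesValues.PSObject.Properties.Name) |
    Select-Object -Unique |
    Sort-Object { [int]$_ } -Descending

$overallSalesValues = foreach ($n in $saleValues) {
    [PSCustomObject]@{
        'Sale Values' = [int]$n
        AA            = [int]$AASalesValues."$n"   # missing property -> $null -> 0
        BB            = [int]$BBSalesValues."$n"
    }
}
$overallSalesValues
```

With the sample values above, the first row is 50 / 1 / 0 and the row for 4 is 4 / 0 / 1, matching the layout requested in the question. Note that the missing-property lookup relies on strict mode being off.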
I'm going to assume that $record contains a list of the database results for either $AASalesValues or $BBSalesValues, not both, otherwise you'd need some kind of selector to avoid counting records of one group with the other group.
Group the records by their SaleValue property as LotPings suggested:
$BBSalesValues = $record | Group-Object SaleValue -NoElement
That will give you a list of the SaleValue values with their respective count.
PS> $BBSalesValues
Count Name
----- ----
36 3
12 10
11 5
1 4
You can then update your CSV data with these values like this:
$file = 'C:\path\to\data.csv'
# read CSV into a hashtable mapping the sale value to the complete record
# (so that we can lookup the record by sale value)
$csv = @{}
Import-Csv $file | ForEach-Object {
$csv[$_.'Sale Values'] = $_
}
# Add records for missing sale values
$($AASalesValues; $BBSalesValues) | Select-Object -Expand Name -Unique | ForEach-Object {
if (-not $csv.ContainsKey($_)) {
$csv[$_] = New-Object -Type PSObject -Property @{
'Sale Values' = $_
'AA' = 0
'BB' = 0
}
}
}
# update records with values from $AASalesValues
$AASalesValues | ForEach-Object {
[int]$csv[$_.Name].AA += $_.Count
}
# update records with values from $BBSalesValues
$BBSalesValues | ForEach-Object {
[int]$csv[$_.Name].BB += $_.Count
}
# write updated records back to file
$csv.Values | Export-Csv $file -NoType
Even with your updated question the approach would be pretty much the same, you'd just add another level of grouping for collecting the sales numbers:
$sales = @{}
$record | Group-Object {$_.list_name.Split()[0]} | ForEach-Object {
$sales[$_.Name] = $_.Group | Group-Object SaleValue -NoElement
}
and then adjust the merging to something like this:
$file = 'C:\path\to\data.csv'
# read CSV into a hashtable mapping the sale value to the complete record
# (so that we can lookup the record by sale value)
$csv = @{}
Import-Csv $file | ForEach-Object {
$csv[$_.'Sale Values'] = $_
}
# Add records for missing sale values
$sales.Values | Select-Object -Expand Name -Unique | ForEach-Object {
if (-not $csv.ContainsKey($_)) {
$prop = @{'Sale Values' = $_}
$sales.Keys | ForEach-Object {
$prop[$_] = 0
}
$csv[$_] = New-Object -Type PSObject -Property $prop
}
}
# update records with values from $sales
$sales.GetEnumerator() | ForEach-Object {
$name = $_.Key
$_.Value | ForEach-Object {
[int]$csv[$_.Name].$name += $_.Count
}
}
# write updated records back to file
$csv.Values | Export-Csv $file -NoType
Hello PowerShell Scriptwriters,
My objective is to count rows based on multiple matching criteria. My PowerShell script produces the correct end result, but it takes too much time (the more rows there are, the longer it takes). Is there a way to optimize my existing code? I've shared my code for reference.
$csvfile = Import-csv "D:\file\filename.csv"
$name_unique = $csvfile | ForEach-Object {$_.Name} | Select-Object -Unique
$region_unique = $csvfile | ForEach-Object {$_."Region Location"} | Select-Object -Unique
$cost_unique = $csvfile | ForEach-Object {$_."Product Cost"} | Select-Object -Unique
Write-host "Save Time on Report" $csvfile.Length
foreach($nu in $name_unique)
{
$inc = 1
foreach($au in $region_unique)
{
foreach($tu in $cost_unique)
{
foreach ($mainfile in $csvfile)
{
if (($mainfile."Region Location" -eq $au) -and ($mainfile.'Product Cost' -eq $tu) -and ($mainfile.Name -eq $nu))
{
$inc++ #Matching Counter
}
}
}
}
$inc #expected to display Row values with the total count.And export the result as csv
}
You can do this quite simply with Group-Object (alias Group), grouping on all three properties at once.
$csvfile = Import-csv "D:\file\filename.csv"
$csvfile | Group Name,"Region Location","Product Cost" | Select Name, Count
This gives output something like the below
Name Count
---- ------
f1, syd, 10 2
f2, syd, 10 1
f3, syd, 20 1
f4, melb, 10 2
f2, syd, 40 1
P.S. The code you provided above does not actually count per combination of fields: because the counter is reset only once per Name, it effectively counts rows per Name (looping through the other values needlessly).
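Since the question also asks for the result to be exported as CSV, the grouped output can be reshaped so that each grouping field gets its own column. This is a minimal sketch with made-up sample rows in place of the real file; the export path in the final comment is a hypothetical placeholder.

```powershell
# Made-up sample rows standing in for D:\file\filename.csv
$csvfile = @'
Name,Region Location,Product Cost
f1,syd,10
f1,syd,10
f2,syd,10
f4,melb,10
f4,melb,10
'@ | ConvertFrom-Csv

$report = $csvfile |
    Group-Object Name, 'Region Location', 'Product Cost' |
    ForEach-Object {
        # Every row in a group shares the same three values, so read them from the first row
        [PSCustomObject]@{
            Name              = $_.Group[0].Name
            'Region Location' = $_.Group[0].'Region Location'
            'Product Cost'    = $_.Group[0].'Product Cost'
            Count             = $_.Count
        }
    }
$report

# To write the report to a file (hypothetical path):
# $report | Export-Csv -Path 'C:\path\to\report.csv' -NoTypeInformation
```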
With all of the examples out there you would think I could have found my solution. :-)
Anyway, I have two csv files; one with two columns, one with 4. I need to compare one column from each one using powershell. I thought I had it figured out but when I did a compare of my results, it comes back as false when I know it should be true. Here's what I have so far:
$newemp = Import-Csv -Path "C:\Temp\newemp.csv" -Header login_id, lastname, firstname, other | Select-Object "login_id"
$ps = Import-Csv -Path "C:\Temp\Emplid_LoginID.csv" | Select-Object "login id"
If ($newemp -eq $ps)
{
write-host "IDs match" -foregroundcolor green
}
Else
{
write-host "Not all IDs match" -backgroundcolor yellow -foregroundcolor black
}
I had to specify headers for the first file because it doesn't have any. What's weird is that I can call each variable to see what it holds and they end up with the same info, but for some reason the comparison still comes up as false. This occurs even if there is only one row (not counting the header row).
I started to parse them as arrays but wasn't quite sure that was the right thing. What's important is that I compare row1 of the first file with row1 of the second file. I can't just do a simple -match or -contains.
EDIT: One annoying thing is that the variables seem to hold the header row as well. When I call each one, the header is shown. But if I call both variables, I only see one header but two rows.
I just added the following check but getting the same results (False for everything):
$results = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps -PassThru | ForEach-Object { $_.InputObject }
Using latkin's answer from here I think this would give you the result set you're looking for. As per latkin's comment, the property comparison is redundant for your purposes but I left it in as it's good to know. Additionally the header is specified even for the csv with headers to prevent the header row being included in the comparison.
$newemp = Import-Csv -Path "C:\Temp\_sotemp\Book1.csv" -Header loginid |
Select-Object "loginid"
$ps = Import-Csv -Path "C:\Temp\_sotemp\Book2.csv" -Header loginid |
Select-Object "loginid"
#get list of (imported) CSV properties
$props1 = $newemp | gm -MemberType NoteProperty | select -expand Name | sort
$props2 = $ps | gm -MemberType NoteProperty | select -expand Name | sort
#first check that properties match
#omit this step if you know for sure they will be
if(Compare-Object $props1 $props2){
throw "Properties are not the same! [$props1] [$props2]"
}
#pass properties list to Compare-Object
else{
Compare-Object $newemp $ps -Property $props1
}
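For the row-by-row requirement in the question (row 1 against row 1, and so on), an index-based loop over the two ID columns is one option. A note on why the original test fails: with a collection on the left-hand side, -eq filters the collection instead of returning a single Boolean, and custom objects are not compared property by property. The sketch below uses hypothetical in-memory data in place of the two files and the header names from the question (login_id and "login id").

```powershell
# Hypothetical sample data standing in for the two files
$newemp = 'abc123,Patel,John,x', 'zxy321,Smith,Kohn,y', 'sdf120,Scott,Maun,z' |
    ConvertFrom-Csv -Header login_id, lastname, firstname, other
$ps = 'login id', 'abc123', 'zxy999', 'sdf120' | ConvertFrom-Csv

# Reduce each side to a plain array of strings
$newempIds = @($newemp.login_id)
$psIds     = @($ps.'login id')

# Compare row 1 with row 1, row 2 with row 2, and so on
$rowCount = [Math]::Max($newempIds.Count, $psIds.Count)
$mismatches = for ($i = 0; $i -lt $rowCount; $i++) {
    if ($newempIds[$i] -ne $psIds[$i]) {
        [PSCustomObject]@{ Row = $i + 1; NewEmp = $newempIds[$i]; PS = $psIds[$i] }
    }
}

if ($mismatches) { Write-Host "Not all IDs match" } else { Write-Host "IDs match" }
$mismatches
```

With this sample data only row 2 differs (zxy321 vs. zxy999). If one file has more rows than the other, the extra rows are reported as mismatches against an empty value.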
In the second line I see a space in "login id", which the first line doesn't have. Could that be the issue? Try using the same header name in the .csv files themselves. It also works without providing headers or Select-Object statements. Below is my experiment based on your input.
emp.csv
loginid firstname lastname
------------------------------
abc123 John patel
zxy321 Kohn smith
sdf120 Maun scott
tiy123 Dham rye
k2340 Naam mason
lk10j5 Shaan kelso
303sk Doug smith
empids.csv
loginid
-------
abc123
zxy321
sdf120
tiy123
PS C:\>$newemp = Import-csv C:\scripts\emp.csv
PS C:\>$ps = Import-CSV C:\scripts\empids.csv
PS C:\>$results = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps | foreach { $_.InputObject}
Shows the difference objects that are not in $ps
loginid firstname lastname SideIndicator
------- --------- -------- -------------
k2340 Naam mason <=
lk10j5 Shaan kelso <=
303sk Doug smith <=
I am not sure if this is what you are looking for, but I have used PowerShell to do some CSV formatting for myself.
$test = Import-Csv .\Desktop\Vmtools-compare.csv
foreach ($i in $test) {
foreach ($n in $i.name) {
foreach ($m in $test) {
$check = "yes"
if ($n -eq $m.prod) {
$check = "no"
break
}
}
if ($check -ne "no") {$n}
}
}
This is what my Excel CSV file looks like:
prod name
1 3
2 5
3 8
4 2
5 0
and script outputs this:
8
0
So basically the script takes each number in the name column and checks it against the prod column. If the number is found there, it is not displayed; otherwise it is displayed.
I have also done it the opposite way:
$test = Import-Csv c:\test.csv
foreach ($i in $test) {
foreach ($n in $i.name) {
foreach ($m in $test) {
$check = "yes"
if ($n -eq $m.prod) {echo $n}
}
}
}
This is what my Excel CSV file looks like:
prod name
1 3
2 5
3 8
4 2
5 0
and script outputs this:
3
5
2
So this script shows the matching entries only.
You can play around with the code to look at different columns.
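The same two checks can also be sketched without nested loops, using the -in and -notin containment operators (PSv3+). This assumes the same prod/name sample data shown above, parsed in memory instead of read from a file.

```powershell
# Same sample data as above, parsed in memory
$test = @'
prod,name
1,3
2,5
3,8
4,2
5,0
'@ | ConvertFrom-Csv

# Values in the name column that do NOT appear in the prod column
$missing = $test.name | Where-Object { $_ -notin $test.prod }

# Values in the name column that DO appear in the prod column
$matching = $test.name | Where-Object { $_ -in $test.prod }

"Missing:  $($missing -join ', ')"    # Missing:  8, 0
"Matching: $($matching -join ', ')"   # Matching: 3, 5, 2
```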