Importing CSV rows into PowerShell, with a date specified as greater than a value

I have two CSV files. One is a report from AD containing the accounts created during the last month; the second is a manually maintained database that should in theory contain the same information, but covering the whole history of our company and with some additional data needed for accounting. I have imported the AD report into PowerShell; now I need to import only specific rows of the database. The rows I need are defined by the value in the "Date added" column: I need only the rows where that date exceeds a specific value. I have this code:
$Report = Read-Host "File name" #AD report, last ten chars are date of report creation, in format yyyy-MM-dd
$Date_text = $Report.Substring($Report.get_Length()-10)
$Date = Get-Date -Date $Date_text
$Date_limit = (($Date).AddDays(-$Date.Day)).Date
$Date_start = $Date_limit.AddMonths(-1)
$CSVlicence = Import-Csv $Database -Encoding UTF8 |
where {(Where-Object {![string]::IsNullOrWhiteSpace($_.'Date added')} |
ForEach-Object{$_.'Date added' = $_.'datum Pridani' -as [datetime] $_}) -gt $Date_start}
When run like this, nothing is imported. Without the condition the database imports successfully, but it is extremely large and the rest of the script takes forever, so I need to work only with the relevant data. I don't care that when Date_limit is 30th Sep, Date_start would be 30th Aug instead of 31st Aug; that's just a few more rows. But all those ten years or so of data really do take forever if everything is imported.

Based on your current logic, you filter the already-imported PSCustomObjects against your constraints, which is the wrong way to handle it, since as written every row ends up filtered out. You want to filter the source instead.
$Report = Read-Host -Prompt 'Filename'
## Grabs the datestamp at the end
$Date = Get-Date -Date $Report.Substring($Report.Length - 10)
## Grabs last day of previous month
$Limit = $Date.AddDays(-$Date.Day)
## Grabs last day of two months ago, inaccuracy of one day
$Start = $Limit.AddMonths(-1)
Get-Content -Path $Database -TotalCount 1 | Set-Content -Path 'tempfile'
Get-Content -Path $Database -Encoding 'UTF8' |
    ## Checks that the entry has a valid date entry within limits
    ForEach-Object {
        ## For m/d/yy or d/m/yy variants, try
        ## '\d{1,2}\/\d{1,2}\/\d{2,4}'
        If ($_ -match '\d{4}-\d{2}-\d{2}')
        {
            [DateTime]$Date = $Matches[0]
            If ($Date -gt $Start -and $Date -lt $Limit) { $_ }
        }
    } |
    Add-Content -Path 'tempfile'
$CSV = Import-Csv -Path 'tempfile' -Encoding 'UTF8'
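For completeness, the filter the question attempted can also be written directly over Import-Csv; that is only practical when the file is small enough to import wholesale. A minimal sketch, assuming the column is literally named 'Date added' and holds yyyy-MM-dd values:
## Sketch only: import everything, then keep rows with a usable date newer than $Start
$CSVlicence = Import-Csv $Database -Encoding UTF8 |
    Where-Object {
        -not [string]::IsNullOrWhiteSpace($_.'Date added') -and
        ([datetime]$_.'Date added') -gt $Start
    }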

Related

PowerShell script efficiency advice

I have a telephony .csv with compiled data from January 2020 and a few days of February. Each row has the date and the time spent in each status; since people use different statuses over the day, the file has one row per status. My script is supposed to go through the file, find the minimum date, and then save all the data for the same day to new files, so I end up with one file for 01-01-2020, one for 02-01-2020, and so on. But it has been running for 15 hours and is still at 1/22.
The column I'm using for the dates is called "DateFull", and this is the script:
write-host "opening file"
$AT= import-csv “C:\Users\xxxxxx\Desktop\SignOnOff_20200101_20200204.csv”
write-host "parsing and sorting file"
$go= $AT| ForEach-Object {
$_.DateFull= (Get-Date $_.DateFull).ToString("M/d/yyyy")
$_
}
Write-Host "prep day"
$min = $AT | Measure-Object -Property Datefull -Minimum
Write-Host $min
$dateString = [datetime] $min.Minimum
Write-host $datestring
write-host "Setup dates"
$start = $DateString - $today
$start = $start.Days
For ($i=$start; $i -lt 0; $i++) {
$date = get-date
$loaddate = $date.AddDays($i)
$DateStr = $loadDate.ToString("M/d/yyyy")
$now = Get-Date -Format HH:mm:ss
write-host $datestr " " $now
#Install-Module ImportExcel #optional import if you dont have the module already
$Check = $at | where {$_.'DateFull' -eq $datestr}
write-host $check.count
if ($check.count -eq 0 ){}
else {$AT | where {$_.'DateFull' -eq $datestr} | Export-Csv "C:\Users\xxxxx\Desktop\signonoff\SignOnOff_$(get-date (get-date).addDays($i) -f yyyyMMdd).csv" -NoTypeInformation}
}
$at = ''
The first loop doesn't make much sense. It loops through the CSV contents and converts each row's date into a different format. Afterwards, $go is never used.
$go = $AT | ForEach-Object {
    $_.DateFull = (Get-Date $_.DateFull).ToString("M/d/yyyy")
    $_
}
Later, there is an attempt to calculate a value from an uninitialized variable; $today is never defined.
$start = $DateString - $today
It looks, however, like you'd like to calculate, in days, how old the oldest record is.
Then there's a loop that counts from negative days to zero. During each iteration, the whole CSV is searched:
$Check = $at | where {$_.'DateFull' -eq $datestr}
If there are 30 days and 15,000 rows, that's 30 * 15,000 = 450,000 iterations. This has O(n^2) complexity, which means the runtime goes sky high for even a relatively small number of days and rows.
The next part is that the same array is processed again:
else {$AT | where {$_.'DateFull' -eq $datestr
The search condition is exactly the same, but now the results are sent to a file. This has the side effect of doubling your work. Still, O(2n^2) => O(n^2), so at least the runtime isn't growing cubically or worse.
As for how to fix this, there are a few things. If you sort the CSV based on date, it can be processed afterwards in just a single run.
$at = $at | sort -Property datefull
Then, iterate over each row. Since the rows are in ascending order, the first is the oldest. For each row, check whether the date has changed. If not, add the row to a buffer. If it has, save the old buffer and start a new one.
The sample doesn't convert the file names to yyyyMMdd format, and it assumes there are only two columns, foo and datefull, like so:
$sb = new-object text.stringbuilder
# What's the first date?
$current = $at[0]
# Loop through sorted data
for($i = 0; $i -lt $at.Count; ++$i) {
    # Are we on next date?
    if ($at[$i].DateFull -gt $current.datefull) {
        # Save the buffer
        $file = $("c:\temp\OnOff_{0}.csv" -f ($current.datefull -replace '/', '.') )
        set-content $file $sb.tostring()
        # Pick the current date
        $current = $at[$i]
        # Create new buffer and save data there
        $sb = new-object text.stringbuilder
        [void]$sb.AppendLine(("{0},{1}" -f $at[$i].foo, $at[$i].datefull))
    } else {
        [void]$sb.AppendLine(("{0},{1}" -f $at[$i].foo, $at[$i].datefull))
    }
}
# Save the final buffer
$file = $("c:\temp\OnOff_{0}.csv" -f ($current.datefull -replace '/', '.') )
set-content $file $sb.tostring()
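As an aside, if the whole file fits in memory anyway, Group-Object gives a shorter one-pass alternative and lets Export-Csv take care of headers and quoting. A sketch, assuming the same DateFull column and output folder:
$at | Group-Object -Property DateFull | ForEach-Object {
    # one group per distinct date; reuse the date (with / swapped for .) in the file name
    $file = "c:\temp\OnOff_{0}.csv" -f ($_.Name -replace '/', '.')
    $_.Group | Export-Csv -Path $file -NoTypeInformation
}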

How can I append data in powershell to a different file depending on what month the data came from?

I am pulling data on adoption of Office 365 products on a daily basis. I don't know how to convert my current logic, which starts a new file based on file size, to one that starts a new file based on the report date.
My original thought process was to use an if statement to split the data out by month and have 12 files already ready to append to (depending on the month of data) but this seems inefficient.
$name = "O365SPSiteActivity.csv"
$auth=Get-AuthCode
$accesstoken=$auth[1]
### data pulling process has been omitted ###
if ($report -ne $null)
{
###New section for making the new files
#Get current file
$source = "D:\O365Data\"+ $name
$File = Get-Item $source
If (((Get-Item $file).Length/1MB) -ge 700)
{
$date = (get-date -Format dd-MM-yyyy)
$RenamedFileName = "O365SPSiteActivity-$date.csv"
Rename-Item $file.FullName -NewName $RenamedFileName
$FileName = "D:\temp\" + $name
Send-MailMessage -From svc_sps10@kbslp.com -To shelby.cundiff@kbslp.com -Subject "New File Has been Created" -Body "New File Name: $RenamedFileName" -SmtpServer kbslp-com.mail.protection.outlook.com -Port 25
}
Else
{
$FileName = "D:\temp\" + $name
Copy $File $FileName
}
#########################################################################
$Data=@()
$c=1
foreach ($row in $report)
{ Write-Progress -Activity $row.'User Principal Name' -PercentComplete (($c/$report.count)*100) -ID 4
$string = "" | Select "???Report Refresh Date","User Principal Name","Is Deleted","Deleted Date","Last Activity Date","Viewed or Edited File Count",
"Synced File Count","Shared Internally File Count","Shared Externally File Count","Visited Page Count","Report Period"
$string.'???Report Refresh Date' = Get-Date($row.'Report Refresh Date') -Format "yyyy-MM-dd"
$string.'User Principal Name' = $row.'User Principal Name'
$string.'Is Deleted' = $row.'Is Deleted'
$string.'Deleted Date' = $row.'Deleted Date'
$string.'Last Activity Date' = $row.'Last Activity Date'
$string.'Viewed or Edited File Count' = $row.'Viewed or Edited File Count'
$string.'Synced File Count' = $row.'Synced File Count'
$string.'Shared Internally File Count' = $row.'Shared Internally File Count'
$string.'Shared Externally File Count' = $row.'Shared Externally File Count'
$string.'Visited Page Count' = $row.'Visited Page Count'
$string.'Report Period' = $row.'Report Period'
$Data += $string
$c++
}
$Data | Export-Csv -Append -Path $FileName -NoTypeInformation -Force
#$FolderUrl = $teamSitePath + "/" + $ListName
#$UploadFileInfo = New-Object System.IO.FileInfo($FileName)
#Upload-SPOFile -WebUrl $teamSiteUrl -spCredentials $SPOCreds -FolderUrl $FolderUrl -FileInfo $UploadFileInfo
$newFile = Get-Item $FileName
Copy $newFile $File.FullName
}
$report = $null
$Data = $null
Ideally, I'd like to change this script to write to a file like O365SPSiteActivity-2019-Oct.csv during October, then O365SPSiteActivity-2019-Nov.csv during November, etc., depending on when the data is from.
Why would you write to a temporary file first and copy it over a (possibly existing) file when done?
If I understand the question, you would like to create a new report csv each month.
Then, why not simply do something like this:
# create a filename for this month
$currentReport = 'O365SPSiteActivity-{0:yyyy-MMM}.csv' -f (Get-Date)
and Export-Csv -Append your data into it? If the file does not already exist it will be created; otherwise the new data will be appended to it.
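A minimal sketch of that, reusing the $Data array and the D:\O365Data folder from the question:
# one file per calendar month; Export-Csv -Append creates it on first use
$currentReport = 'O365SPSiteActivity-{0:yyyy-MMM}.csv' -f (Get-Date)
$Data | Export-Csv -Path (Join-Path 'D:\O365Data' $currentReport) -Append -NoTypeInformation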
Are you looking for the current date, or the date from the spreadsheet? Your
$date = (get-date -Format dd-MM-yyyy)
uses the current date; formatted as yyyy-MMM instead, it would give you O365SPSiteActivity-2019-Oct.csv if you run it today.
If you're looking to chunk up the CSV based on dates within the data, I might move the export-CSV inside the foreach loop, and modify it to use a name similar to above instead of appending it to the $Data array.
$Data = "2019-May-31"
Get-Date -Format yyyy-MMM -Date ([datetime]::parseexact($Data, 'yyyy-MMM-dd', $null))
$FileName = "O365SPSiteActivity-$date.csv"
$Data | Export-Csv -Append -Path $FileName -NoTypeInformation -Force
That'll write each line to the appropriate CSV file; you'll end up with a different CSV file for every month's worth of data.
Note that the 'yyyy-MMM-dd' in ParseExact will need to match the format of the date that you're feeding it. For example, if you're sorting based on $row.'Report Period' and that has the date as 12/31/19, it would be 'MM/dd/yy' instead (notice I changed from - to / as the separator). The .NET custom date and time format string documentation lists what each letter means.
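Put together, a per-row sketch of that idea might look like this (assuming the month is keyed off each row's 'Report Refresh Date' and that $string is built from $row exactly as in the original loop):
foreach ($row in $report) {
    # derive the month from the data itself rather than from today's date
    $month = Get-Date -Format 'yyyy-MMM' -Date ([datetime]$row.'Report Refresh Date')
    $FileName = "D:\O365Data\O365SPSiteActivity-$month.csv"
    # ... build $string from $row as in the original loop ...
    $string | Export-Csv -Append -Path $FileName -NoTypeInformation
}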

Finding value in a CSV file and comparing with today's date

I'm trying to get a value from a CSV file.
If today's date = DateInCSVFile give the "key" value.
Keys.csv
Guest,Key
1-Jun,OIOMY-ZFILZ
2-Jun,LSSJC-PDEUL
3-Jun,MQNVJ-TETLV
4-Jun,HCJIJ-ECVPY
5-Jun,SPACR-AJSLU
6-Jun,MEURS-UQTVX
Code:
$today = Get-Date -format dd-MMM
$keys = import-csv c:\office\keys.csv -Header @(1..2)
$data = $keys | ? { $_.1 -match $today}
Write-Host $data.2
I tried the foreach and if commands. Nothing worked.
I can think of a couple of options. If you want something quick and dirty, try:
$stuff = Import-Csv -Path .\stuff.csv
foreach ($thing in $stuff) {
    if ( $thing.Guest -eq $(Get-date -Format 'd-MMM') ) {
        Write-Output $thing.Key
    }
}
I import the CSV file's contents into a variable and iterate over each line. If the day in Guest matches the current day, I output the key.
The only problem with your code is your date format, dd-MMM, as LotPings observes:
It creates 0-left-padded numbers for single-digit days such as 6, whereas the dates in the CSV have no such padding.
Thus, changing Get-Date -format dd-MMM to Get-Date -format d-MMM (just d instead of dd) should fix your problem.
However, given that you're reading the entire file into memory anyway, you can optimize the command to (PSv4+):
$today = Get-Date -Format d-MMM
(Import-Csv c:\office\keys.csv).Where({ $_.Guest -eq $today }).Key
Also note that the purpose of -match is to perform regular-expression-based matching, not (case-insensitive) string equality; use -eq for the latter.
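For example, -match can produce a false positive here that -eq cannot:
'11-Jun' -match '1-Jun'   # True  - the pattern is found inside the string
'11-Jun' -eq    '1-Jun'   # False - exact (case-insensitive) string comparison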

Compare current date to date string in a file using powershell

I am writing some PS scripts to log times into a text file, login.txt, using the following code:
$logdir = "C:\FOLDER"
$logfile = "$logdir\LastLogin.txt"
$user = $env:USERNAME
$date = Get-Date -Format "dd-MM-yyyy"
if (!(Test-Path $logdir)){New-Item -ItemType Directory $logdir}else{}
if (!(Test-Path $logfile)){New-Item $logfile}else{}
if (Get-Content $logfile | Select-String $user -Quiet){write-host "exists"}else{"$user - $date" | Add-Content -path $logfile}
(Get-Content $logfile) | Foreach-Object {$_ -replace "$user.+$", "$user - $date"; } | Set-Content $logfile
This creates an entry in the text file like:
UserName - 01-01-1999
Using PowerShell, I want to read the text file, compare the date in it (e.g. 01-01-1999) to the current date, and, if there is more than 30 days' difference, extract the UserName into a variable to be used later in the script.
I would really appreciate any hints as to how I could do the following:
Compare the date in the text file to the current date.
If difference is more than 30 days, pick up UserName as a variable.
I would really appreciate any advice.
Checking all dates in the file with the help of a RegEx with named capture groups.
$logdir = "C:\FOLDER"
$logfile = Join-Path $logdir "LastLogin.txt"
$Days = -30
$Expires = (Get-Date).AddDays($Days)
Get-Content $logfile | ForEach-Object {
    if ($_ -match "(?<User>[^ ]+) - (?<LastLogin>[0-9\-]+)") {
        $LastLogin = [datetime]::ParseExact($Matches.LastLogin,"dd-MM-yyyy",$Null)
        if ( $Expires -gt $LastLogin ) {
            "{0} last login {1} is {2:0} days ago" -F $Matches.User, $Matches.LastLogin,
                (New-TimeSpan -Start $LastLogin -End (Get-Date) ).TotalDays
        }
    }
}
Sample output
username last login 31-12-1999 is 6690 days ago
There is a way of doing that using regex (regular expressions). I will assume that the usernames in your text file are dot-separated, for example john.doe or jason.smith, so the entries look like john.doe - 01-01-1999 or jason.smith - 02-02-1999. Keeping these things in mind, our approach would be:
Using a regex, we get the username and date entry into a single variable.
Next, we split the pattern we got in step 1 into two parts, i.e. the username part and the date part.
Next, we take the date part, and if the difference from today is more than 30 days, we take the other part (the username) and store it in a variable.
So the code would look something like this:
$arr = @() #defining an array to store the username with date
$pattern = "[a-z]*[.][a-z]*\s[-]\s[\d]{2}[-][\d]{2}[-][\d]{4}" #Regex pattern to match entries like "john.doe - 01-01-1999"
Get-Content $logfile | Foreach {
    if ([Regex]::IsMatch($_, $pattern)) {
        $arr += [Regex]::Match($_, $pattern)
    }
}
$arr | Foreach {$_.Value} #Storing the matched pattern in $arr
$UserNamewithDate = $arr.value -split ('\s[-]\s') #step 2 - Storing the username and date into a variable.
$array = @() #Defining the array that would store the final usernames based on the time difference.
for ($i = 1; $i -lt $UserNamewithDate.Length;)
{
    $datepart = [Datetime]$UserNamewithDate[$i] #Casting the date part to [datetime] format
    $CurrentDate = Get-Date
    $diff = $CurrentDate - $datepart
    if ($diff.Days -gt 30)
    {
        $array += $UserNamewithDate[$i - 1] #If the difference between the current date and the date from the log is greater than 30 days, store the corresponding username in $array
    }
    $i = $i + 2
}
Now you can access the usernames like $array[0], $array[1] and so on. Hope that helps!
NOTE - The regex pattern will change depending on the format in which your usernames are defined. An online regex reference might turn out to be helpful.

Piping error with an extra line and no header

0001;Third Week;Every Monday 12am-2am
002;Third Week;Every Tuesday 8pm-10pm
003;Third Week;Every Monday 12am-2am
#Get the number of lines in a CSV file
$Lines = (Import-Csv "C:\MM1.csv").count
#Import the CSV file
$a = @(Import-CSV "C:\MM1.csv")
$month = Get-Date -Format MMMM
#loop around the end of the file
for ($i=0; $i -le $lines; $i++) {
    $Servername = $a[$i].ServerName
    $week = $a[$i].Week
    $dayweekString = [String]$a[$i].DayTime
    # This will help in getting the Day of the WeekDay String
    $dayweekString = ($dayweekString -split "\s+",(3))[(1)]
    #This will find the time Ex 2am or 8pm, it can be any time
    $DayNew = if ($Day -match "\d{1,2}[ap]m") {$Matches[0]}
    #Format for Maintenance mode which can be fed into the SCOM MM Script.
    $MaintenanceTime = get-date "$DayNew $month.$($_.$dayweekString).$year"
    write-host $Servername, $MaintenanceTime
    #Store all my data while in a for each loop
    New-Object -TypeName PSObject -Property @{
        ServerNameNew = $Servername
        TimeStamp = $MaintenanceTime
    } | Select-Object ServerNameNew,TimeStamp
} | Export-Csv -Path "C:\MM3.csv" -Append
# Error as extra Pipe
I am not able to pipe the output of the loop to a file with a header. It complains about an extra pipe and writes an extra line with the TimeStamp variable.
You are trying to pipe the result of the for statement to another command. That's not supported. If you want to do that, you need to use a subexpression around the statement:
$( for(){} ) | ...
(The reason is that pipelines need an expression as their first element. And for is not an expression, it's a statement.)
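For example, wrapping the existing loop in a subexpression would look roughly like this (a sketch; the loop body stays as in the question):
$( for ($i = 0; $i -le $lines; $i++) {
    # ... build and emit the PSObject for row $i, as in the question ...
} ) | Export-Csv -Path "C:\MM3.csv" -Append -NoTypeInformation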
However, in your case I'd replace the for with a simple pipeline iterating over the array, like this:
Import-CSV "C:\MM1.csv" | ForEach-Object {
$Servername = $_.ServerName
$week = $_.Week
$dayweekString = [String]$_.DayTime
...
}
Generally there is very rarely a reason to use explicit looping constructs in PowerShell. It leads to code that's awkward at best (because it resembles converted C# or VBScript code) and horrible at worst. Just don't do it.
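A minimal end-to-end sketch of that pipeline shape, assuming the sample data is semicolon-delimited with a ServerName;Week;DayTime header row (the TimeStamp value here is a placeholder for whatever maintenance-time calculation you settle on):
Import-Csv "C:\MM1.csv" -Delimiter ';' | ForEach-Object {
    # emit one object per row; the pipeline carries it straight to Export-Csv,
    # so the header is written once and there is no stray extra line
    [pscustomobject]@{
        ServerNameNew = $_.ServerName
        TimeStamp     = $_.DayTime   # placeholder for the computed maintenance time
    }
} | Export-Csv -Path "C:\MM3.csv" -NoTypeInformation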