Is there a way to merge similar lines using PowerShell?

Suppose I have two csv files. One is
id_number,location_code,category,animal,quantity
12212,3,4,cat,2
29889,7,6,dog,2
98900,
33221,1,8,squirrel,1
the second one is:
98900,2,1,gerbil,1
The second file may have a newline or something at the end (maybe or maybe not, I haven't checked), but only the one line of content. There may be three or four or more different varieties of the "second" file, but each one will have a first element (98900 in this example) that corresponds to an incomplete line in the first file similar to what is in this example.
Is there a way using powershell to automatically merge the line in the second (plus any additional similar) csv file into the matching line(s) of the first file, so that the resulting file is:
12212,3,4,cat,2
29889,7,6,dog,2
98900,2,1,gerbil,1
33221,1,8,squirrel,1

main.csv
id_number,location_code,category,animal,quantity
12212,3,4,cat,2
29889,7,6,dog,2
98900,
33221,1,8,squirrel,1
correction_001.csv
98900,2,1,gerbil,1
merge code used at the commandline, or in the .ps1 file of your choice
$myHeader = @('id_number','location_code','category','animal','quantity')
#Stage all the correction files: last correction in the most recent file wins
$ToFix = @{}
filter Plumbing_Import-Csv($Header){import-csv -LiteralPath $_ -Header $Header}
ls correction*.csv | sort -Property LastWriteTime | Plumbing_Import-Csv $myHeader | %{$ToFix[$_.id_number]=$_}
function myObjPipe($Header){
    begin{
        function TextTo-CsvField([String]$text){
            #text fields which contain comma, double quotes, or new-line are a special case for CSV fields and need to be accounted for
            if($text -match '"|,|\n'){return '"'+($text -replace '"','""')+'"'}
            return $text
        }
        function myObjTo-CsvRecord($obj){
            return ''+
                $obj.id_number +','+
                $obj.location_code +','+
                $obj.category +','+
                (TextTo-CsvField $obj.animal)+','+
                $obj.quantity
        }
        $Header -join ','
    }
    process{
        if($ToFix.Contains($_.id_number)){
            $out = $ToFix[$_.id_number]
            $ToFix.Remove($_.id_number)
        }else{$out = $_}
        myObjTo-CsvRecord $out
    }
    end{
        #I assume you'd append any leftover fixes that weren't used
        foreach($out in $ToFix.Values){
            myObjTo-CsvRecord $out
        }
    }
}
import-csv main.csv | myObjPipe $myHeader | sc combined.csv -encoding ascii
You could also use ConvertTo-Csv, but my preference is to not have all the extra " cruft.
Edit 1: reduced code redundancy, accounted for \n, fixed appends, and used @OwlsSleeping's suggestion about the -Header cmdlet parameter
also works with these files:
correction_002.csv
98900,2,1,I Win,1
correction_new.csv
98901,2,1,godzilla,1
correction_too.csv
98902,2,1,gamera,1
98903,2,1,mothra,1
Edit 2: convert gc | ConvertTo-Csv over to Import-Csv to fix the front-end \n issues. Now also works with:
correction_003.csv
29889,7,6,"""bad""
monkey",2

This is a simple solution assuming there's always exactly one match, and you don't care about output order. Change the output path to csv1 to overwrite.
I added headers manually in both input files, but you can specify them in Import-Csv instead if you'd rather avoid changing your files.
[array]$MissingLine = Import-Csv -Path "C:\Users\me\Documents\csv2.csv"
[string]$MissingId = $MissingLine[0].id_number
[array]$BigCsv = Import-Csv -Path "C:\Users\me\Documents\csv1.csv" |
    Where-Object {$_.id_number -ne $MissingId}
# -NoTypeInformation keeps Windows PowerShell from writing a #TYPE line first
($BigCsv + $MissingLine) |
    Export-Csv -Path "C:\Users\me\Documents\Combined.csv" -NoTypeInformation
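If output order matters, or there are several correction files, one possible variation is to index the corrections by id_number and substitute them while walking the main file. This is only a sketch: it uses inline sample data (via ConvertFrom-Csv) in place of the real files, and the names $main, $corrections, $fixes and $merged are illustrative.

```powershell
# Sample data standing in for the main file and a correction file (assumption).
$header = 'id_number','location_code','category','animal','quantity'
$main = @'
id_number,location_code,category,animal,quantity
12212,3,4,cat,2
29889,7,6,dog,2
98900,
33221,1,8,squirrel,1
'@ | ConvertFrom-Csv

# The correction file has no header row, so the header is supplied explicitly.
$corrections = '98900,2,1,gerbil,1' | ConvertFrom-Csv -Header $header

# Index the corrections by id_number for direct lookup.
$fixes = @{}
foreach ($row in $corrections) { $fixes[$row.id_number] = $row }

# Emit the corrected row where one exists, otherwise the original row,
# so the row order of the main file is preserved.
$merged = foreach ($row in $main) {
    if ($fixes.ContainsKey($row.id_number)) { $fixes[$row.id_number] } else { $row }
}

$merged | ConvertTo-Csv -NoTypeInformation
```

With real files, the two sample assignments would be replaced by Import-Csv calls, and the last line by Export-Csv.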

Related

Powershell replace text once per line

I have a PowerShell script, and I am trying to work out one part of it. The text input lists each user and the group they are part of. The script is supposed to replace each group with the groups that I am assigning them in Active Directory (I am limited to only changing groups in Active Directory). My issue is that when it reaches HR and replaces it, it then continues and also replaces the HR inside CHRL in the newly inserted text, so my groups look nuts right now. Looking it over, it doesn't do this on every line, but for gilchrist it puts something in place of the hr in the name. Is there anything I can do to keep it from changing those, or am I going to have to change my HR to Human Resources? Thanks for the help.
$lookupTable = @{
'Admin' = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
'HR' = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'
}
$original_file = 'c:\tmp\test.txt'
$destination_file = 'c:\tmp\test2.txt'
Get-Content -Path $original_file | ForEach-Object {
$line = $_
$lookupTable.GetEnumerator() | ForEach-Object {
if ($line -match $_.Key)
{
$line = $line -replace $_.Key, $_.Value
}
}
$line
} | Set-Content -Path $destination_file
Get-Content $destination_file
test.txt:
user,group
john.smith,Admin
joanha.smith,HR
john.gilchrist,security
aaron.r.smith,admin
abby.doe,secuity
abigail.doe,admin
Your input appears to be in CSV format (though note that your sample rows have trailing spaces, which you'd have to deal with, if they're part of your actual data).
Therefore, use Import-Csv and Export-Csv to read / rewrite your data, which allows a more concise and convenient solution:
Import-Csv test.txt |
    Select-Object user, @{ Name='group'; Expression = { $lookupTable[$_.group] } } |
    Export-Csv -NoTypeInformation -Encoding Utf8 test2.txt
Import-Csv reads the CSV file as a collection of custom objects whose properties correspond to the CSV column values; that is, each object has a .user and a .group property in your case.
$_.group therefore robustly reports the abstract group name only, which you can directly pass to your lookup hashtable; Select-Object is used to pass the original .user value through, and to replace the original .group value with the lookup result, using a calculated property.
Export-Csv re-converts the custom objects to a CSV file:
-NoTypeInformation suppresses the (usually useless) data-type-information line at the top of the output file
-Encoding Utf8 was added to prevent potential data loss, because Export-Csv in Windows PowerShell uses ASCII encoding by default.
Note that Export-Csv blindly double-quotes all field values, whether they need it or not; that said, CSV readers should be able to deal with that (and Import-Csv certainly does).
As for what you tried:
The -replace operator replaces all occurrences of a given regex (regular expression) in the input.
Your regexes amount to looking for (case-insensitive) substrings, which explains why HR matches both the HR group name and the substring hr in the username gilchrist.
A simple workaround would be to add assertions to your regex so that the substrings only match where you want them; e.g.: ,HR$ would only match after a , at the end of a line ($).
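As a small illustration of the difference the anchor makes, here is a sketch using one sample row; the replacement text X and the variable names are placeholders, not part of the original script.

```powershell
$line = 'john.gilchrist,HR'

# Unanchored: -replace is case-insensitive and replaces every occurrence,
# so the "hr" inside "gilchrist" is rewritten as well.
$unanchored = $line -replace 'HR', 'X'

# Anchored: only an HR field that follows a comma at the end of the line matches.
$anchored = $line -replace ',HR$', ',X'

$unanchored   # john.gilcXist,X
$anchored     # john.gilchrist,X
```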
However, your approach of enumerating the hashtable keys for each input CSV row is inefficient, and you're better off splitting off the group name and doing a straight lookup based on it:
# Split the row into fields.
$fields = $line -split ','
# Update the group value (last field)
$fields[-1] = $lookupTable[$fields[-1]]
# Rebuild the line
$line = $fields -join ','
Note that you'd have to make an exception for the header row (e.g., test if the lookup result is empty and refrain from updating, if so).
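Put together, the split-and-lookup approach with the header exception might look like the following sketch. It assumes the lookup table from the question and a few in-memory sample lines instead of Get-Content; $lines, $key and $result are illustrative names. Because a hashtable literal compares string keys case-insensitively, security and Security hit the same entry.

```powershell
$lookupTable = @{
    'Admin'    = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
    'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
    'HR'       = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'
}

# Sample lines standing in for Get-Content output (assumption).
$lines = 'user,group', 'john.smith,Admin', 'joanha.smith,HR', 'john.gilchrist,security'

$result = foreach ($line in $lines) {
    $fields = $line -split ','
    # Trim guards against trailing spaces in the data.
    $key = $fields[-1].Trim()
    # Only update when the lookup succeeds; this leaves the header row
    # (and any unknown group) untouched.
    if ($lookupTable.ContainsKey($key)) {
        $fields[-1] = $lookupTable[$key]
    }
    $fields -join ','
}

$result
```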
Why not load your text file as a CSV file, using Import-Csv with "," as the delimiter?
This gives you PowerShell objects you can work on and then export as text or CSV. Using your file and lookup table, this code may help you:
$file = Import-Csv -Delimiter "," -Path "c:\ps\test.txt"
$lookupTable = @{
'Admin' = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
'HR' = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'}
foreach ($i in $file) {
#Compare and replace
...
}
# Export-Csv takes its objects from the pipeline; the output path here is only a placeholder
$file | Export-Csv -Path "c:\ps\test2.txt" -Delimiter "," -NoTypeInformation
You can then iterate over $file and compare and replace. You can also Export-Csv after you're done.

Powershell import-csv conditional for column

I'm trying to find the best way using Powershell to modify a row in CSV based on the following condition:
IF column TEST contains a WORD,WORD
THEN do the following:
1) copy the entire row once but keep only the FIRST WORD in column TEST
2) copy the entire row again but keep only the SECOND WORD in column TEST
3) delete the original row that had WORD,WORD in column TEST
Example:
subject~school~TEST~code~year
math~PADF~true,false~0943~2016
I'd like:
subject~school~TEST~code~year
math~PADF~true~0943~2016
math~PADF~false~0943~2016
I'm no expert in Powershell, but I was playing around with using the import-csv, and then using a Get-Content Foreach-object, but it's not working. If someone knows of an easier way or the solution for the case above, that would be fantastic!
Thank You
knows of an easier way or the solution for the case above
With programming, there's generally no such thing as 'the' solution. Write what you want to do (good steps 1,2,3 in your question) and then make up some code to do that:
Import-Csv D:\data.csv -Delimiter '~' | ForEach-Object {
    if ($_.TEST -match ',')                    # IF row.TEST contains a comma
    {
        $first, $second = $_.TEST.Split(',')   # Get first and second words ready
        $_.TEST = $first                       # Output the record once with
        $_                                     # first word
        $_.TEST = $second                      # and again with second word
        $_                                     #
    }
    else
    {
        $_                                     # otherwise output it unchanged
    }
} | Export-CSV out.csv -Delimiter '~' -NoTypeInformation
Not entirely sure what you mean by delete, but because this pipes into Export-CSV out.csv -Delimiter '~' -NoTypeInformation, it writes a new file that no longer contains the original WORD,WORD row.
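The same idea can be generalized to any number of comma-separated words by emitting one copy of the row per word. The sketch below is one way to do that, using inline sample data (the art row is made up for illustration) and PSObject.Copy() so that each output row is a separate object, which also makes the result safe to collect in a variable before exporting.

```powershell
# Sample data standing in for D:\data.csv; the "art" row is an added example.
$rows = @'
subject~school~TEST~code~year
math~PADF~true,false~0943~2016
art~PADF~true~0944~2016
'@ | ConvertFrom-Csv -Delimiter '~'

$expanded = foreach ($row in $rows) {
    # -split returns a single-element array when there is no comma,
    # so rows without WORD,WORD pass through once, unchanged.
    foreach ($word in ($row.TEST -split ',')) {
        $copy = $row.PSObject.Copy()   # shallow copy, so rows do not share state
        $copy.TEST = $word
        $copy
    }
}

$expanded | ConvertTo-Csv -Delimiter '~' -NoTypeInformation
```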

Append text to certain values in text file with PowerShell

I have a CSV text file separated with ; and it's in the format as:
USER_EMPLOYEE_ID;SYSTEM1;USERNAME1
The first column is an identity and the following pairs of columns are user's account on different active directories. I have placed garbage data but the idea is there.
ay7suve0001;ADDPWN;ay7suve0001
AAXMR3E0001;ADDPWN;AAXMR3E0001
ABABIL;ADDPWN;ABABIL
ABDF17;ADDPWN;ABDF17;
ABKMPPE0001;ADDPWN;ABKMPPE0001
ABL1FL;ADDPWN;ABL1FL
AB6JG8E0004;ADDPWN;AB6JG8E0004;
ACB4YB;ADDPWN;ACB4YB
ACK7J9;ADDPWN;ACK7J9
ACLZFS;ADDPWN;ACLZFS;
ACQXZ3;ADDPWN;ACQXZ3
Now there is a requirement that I have to append a fixed string like @ADDPWN.com to all the USERNAME1 values. Some records have a trailing ; and some don't.
Is there a quick way to append the @ADDPWN.com to each line taking care of:
any ;
any already @ADDPWN.com
From PowerShell?
Import-Csv is your friend. The following should get you on the right track.
Import-Csv "import.csv" -Delimiter ';' |
    foreach {
        if ($_.username1 -notlike '*@ADDPWN.com') { $_.username1 += '@ADDPWN.com' }
        $_
    } |
    Export-Csv "export.csv" -Delimiter ';'
This assumes the first line of your csv file is your header line. If it's not, you can pass -Header 'USER_EMPLOYEE_ID','SYSTEM1','USERNAME1' as another parameter to Import-Csv.
Export-Csv adds some extra stuff like quotes around values, so you may need to play with the output format if you don't want that.
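If the quotes are unwanted, one option is to rebuild the lines by hand after the objects have been updated. The following is a sketch under a few assumptions: the suffix is meant to be @ADDPWN.com, the sample rows (including the already-suffixed one) are illustrative, and $suffix, $header and $lines are made-up names. A trailing ; simply produces an extra empty field, which is dropped because the header only names three columns.

```powershell
# Sample data; the last row is an invented example of an already-suffixed value.
$rows = @'
USER_EMPLOYEE_ID;SYSTEM1;USERNAME1
ABABIL;ADDPWN;ABABIL
ABDF17;ADDPWN;ABDF17;
ACB4YB;ADDPWN;ACB4YB@ADDPWN.com
'@ | ConvertFrom-Csv -Delimiter ';'

$suffix = '@ADDPWN.com'
$header = 'USER_EMPLOYEE_ID;SYSTEM1;USERNAME1'

$lines = @($header) + @(foreach ($row in $rows) {
    # Append the suffix only when it is not already present.
    if ($row.USERNAME1 -notlike "*$suffix") { $row.USERNAME1 += $suffix }
    # Join the fields manually to avoid the quoting that Export-Csv adds.
    ($row.USER_EMPLOYEE_ID, $row.SYSTEM1, $row.USERNAME1) -join ';'
})

$lines
```

The result can then be written with Set-Content. Manual joining is only safe while no field contains the delimiter or a quote.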
For another explanation how this works, check out Changes last name, first name to first name, last name in last column CSV powershell
This was a solution that worked for me.
#opens list of file names
$file2 ="F:\OneDrive_Biz\PowerApps\SecurityCameraVideoApp\file_list_names.csv"
$x = Get-Content $file2
#prepends URL to each entry in the file name list
for($i=0; $i -lt $x.Count; $i++){
$x[$i] = "https://analytics-my.sharepoint.com/personal/gpowell_analytics_onmicrosoft_com/Documents/PowerApps/SecurityCameraVideoApp/Video_Files/" + $x[$i]
}
$x
#remove all files in target directory prior to saving new list
get-childitem -path C:\_TEMP\file_list_names.csv | remove-item
Add-Content -Path C:\_TEMP\file_list_names_url.csv -Value $x

Prepend / append data to one column in csv in powershell

I'm really liking what I have seen of Powershell. But I'm really confused by some things, as I have so much to learn. I've been reading everything on the site here, but I've not been able to figure this out. Hopefully this is simple. I have a csv like this:
Title,Name,Office,Phone
Boss,Bob,101,323.555-1212
Office-Manager-Level-2,Helen,202,5-1213
Time-Waster-Level-5,Nemo,105,5-1214
Widget-Maker,Zack,10,5-1215
Temp,Larry,102,5-1000
I have been trying to figure out an easy way to prepend & append data to the first column, "Title", that will eventually become a static webpage with the user's information. I'm trying this so far:
$file = ("\\web\users.csv")
$urlbase="<a href`=`"file:///web/users/info/"
$urlend="_info.html`">"
$data = import-csv ($file) -header ("Title","Name","Office","Phone")
$data | select -Skip 1 | % { $_.Title -replace '$_.Title', "'$urlbase'$_.Title'$urlend'`">'$_.Title'</a>"} | Export-CSV -Path "links_output.csv" -NoTypeInformation
However, it appears that all I'm matching or replacing is the length of the string (??) of the first column of data. My output file is this:
"Length"
"4"
"23"
"19"
"12"
"4"
What I would desire as my output would be:
<a href="file:///web/users/info/Boss_info.html"Boss</a>"
Office-Manager-Level-2"
Time-Waster-Level-5"
Widget-Maker"
Temp"
Also, besides my basic issue, if I could use set-content I'd be happy because I'd really like this to be like a sed -i type of action/function, on the original file, but a new file with the same contents as the old with the updated first column will satisfy if I cannot set-content on the original.
This section of my script will become an html file later and because of issues with regex find and replacing with tags, I'm trying to add the html tags before I use ConvertTo-Html, because that is all working already. Thanks in advance!!
Here's one solution:
$file = ("\\web\users.csv")
$urlbase='<a href="file:///web/users/info/'
$urlend='_info.html">'
get-content $file |
select -Skip 1 |
foreach {
"$Urlbase{0}$urlend" -f $_.split(',')[0]
}
The split(',') is an object method of [string] that will split the string at the commas, producing an array. The trailing [0] takes the first element of that array, which will be the Title. That gets inserted at {0} in the format string between the other two variables by the format (-f) operator.
You can also use the -replace operator, but PS variables are not expanded in a single-quoted replacement string. You can include the literal text instead:
(get-content $file | select -Skip 1) -replace '^([^,]+)(.+)','<a href="file:///web/users/info/$1_info.html">$2'
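For comparison, a double-quoted replacement string does let the variables expand, as long as the regex group references are escaped with a backtick so PowerShell leaves them for the regex engine. This is a sketch on a single sample line; $replacement and $linked are illustrative names, and the choice to keep the remaining columns after the link is an assumption about the desired output.

```powershell
$urlbase = '<a href="file:///web/users/info/'
$urlend  = '_info.html">'
$line    = 'Boss,Bob,101,323.555-1212'

# Double quotes let $urlbase and $urlend expand; the backtick keeps ${1} and
# ${2} literal so that the regex engine, not PowerShell, fills them in.
$replacement = "$urlbase`${1}$urlend`${1}</a>`${2}"

# Group 1 is the first field (Title), group 2 is the rest of the line.
$linked = $line -replace '^([^,]+)(.*)$', $replacement

$linked
# <a href="file:///web/users/info/Boss_info.html">Boss</a>,Bob,101,323.555-1212
```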

Export-CSV cmdlet rewriting entire CSV during each iteration of a FOREACH statement

I'm working with some code that is going to take a series of performance counters, and then put the counters in a .csv file that rolls over every time it hits 1MB.
$Folder="C:\Perflogs\BBCRMLogs" # Change the bit in the quotation marks to whatever directory you want the log file stored in
$Computer = $env:COMPUTERNAME
$1GBInBytes = 1GB
$p = LOTS OF COUNTERS;
# If you want to change the performance counters, change the above list. However, these are the recommended counters for a client machine.
$num = 0
$file = "$Folder\SQL_log_${num}.csv"
if( !(test-path $folder)) {New-Item $Folder -type directory}
Get-Counter -counter $p -SampleInterval 2 -Continuous | Foreach {
if ((Get-Item $file -ErrorAction SilentlyContinue ).Length -gt 1mb)
{
$num +=1
$file = "$Folder\SQL_log_${num}.csv"
}
$_
} | Foreach-Object { $_ | Export-Csv $file -Force -Append}
Right now, it's working quite well. The iteration works fine, and it does create a new file each time the .csv reaches 1MB. However, each .CSV after the first is being created after 2 minutes already at 1MB, causing a new file to be created. I'm not quite sure why this is occurring, although I believe it's because Powershell is just rewriting the entirety of the .csv each time it creates it.
[I'm posting this as a new answer rather than editing the original because it's completely different. Replacing or appending to the original answer would make the ensuing discussion confusing.]
What you need to do is use a regex to extract the values from the Readings property of the output of Get-Counter, and manually construct CSV output from the timestamp and those values. Change the last line to this (format according to your preferred style):
| %{'"' + (Get-Date $_.Timestamp -f 's') + '","' + (([regex]::matches($_.Readings, '(?<=\\\\.+?:\n)(.+?)(?=\n)') | select -ExpandProperty Value) -join '","') + '"'} | Out-File $file -Append -Encoding ASCII
To break that down:
(Get-Date $_.Timestamp -f 's') This part is not strictly necessary, though I think it will make your results easier to follow. The 's' format puts the date in an ISO 8601 sortable pattern. You could substitute 'u' for another sortable format, or use your favorite custom format string. Or just replace it with $_.Timestamp to retain the original format.
[regex]::matches($_.Readings, '(?<=\\\\.+?:\n)(.+?)(?=\n)') The regex matches the contents of any line that is preceded by a line that begins with \\ and ends with : (those pesky counter names you wanted to get rid of). Note that I'm using [regex]::matches, which performs a global match, as opposed to [regex]::match or -match, which will just give you the first match for each string (the Readings property is a single string, so only the first counter reading would be returned).
| select -ExpandProperty Value Produces an array of all the matches, which you can then join with "," and surround with "'s to produce CSV output.
Since you're not using a conversion function, you also need to construct a header row. Add this line right above the pipeline:
'"Timestamp","' + ($p -join '","') + '"' | Out-File $file -Append -Encoding ASCII
That's assuming that $p is an array (which it should be). If it's a string, then depending on the format you can either use it as-is, or -split it and rejoin it in CSV format.
Change the last line to this, to convert each line to CSV format and then append it to the output file:
} | Foreach-Object {($_ | ConvertTo-Csv -NoTypeInformation)[1] | Out-File $file -Append -Encoding ASCII}
A few notes:
The -Encoding ASCII is not strictly necessary, but you might have trouble with a Unicode CSV file in some applications (Excel, for example, won't open it as a CSV file by default, and everything will be in Column A)
The reason for the index in ($_ | ConvertTo-Csv -NoTypeInformation)[1] is that ConvertTo-Csv -NoTypeInformation still outputs the header row each time, so you want to grab the second line of the two-line output (($_ | ConvertTo-Csv -NoTypeInformation)[0] is the header row)
Since you're not outputting a header row, you'll need to output one to $file before the loop
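The notes above can be combined into one helper that writes the header whenever a new file is started, so every rolled-over file is a complete CSV. This is a rough sketch, not the original script: Write-RollingCsv and its parameters are hypothetical names, and it assumes the incoming objects are flat (simple property values) so that ConvertTo-Csv produces one header line and one data line per object.

```powershell
# Hypothetical helper: appends objects as CSV rows, moving on to a new numbered
# file (with its own header row) once the current file exceeds $MaxBytes.
function Write-RollingCsv {
    param(
        [Parameter(ValueFromPipeline)] $InputObject,
        [string] $Folder,
        [string] $Prefix = 'log',
        [long]   $MaxBytes = 1MB
    )
    begin { $num = 0 }
    process {
        $file = Join-Path $Folder "${Prefix}_$num.csv"
        # Roll over when the current file has grown past the limit.
        if ((Test-Path $file) -and (Get-Item $file).Length -gt $MaxBytes) {
            $num++
            $file = Join-Path $Folder "${Prefix}_$num.csv"
        }
        # Index 0 is the header row, index 1 is the data row.
        $csv = $InputObject | ConvertTo-Csv -NoTypeInformation
        # Write the header only when the file does not exist yet.
        if (-not (Test-Path $file)) { $csv[0] | Out-File $file -Encoding ASCII }
        $csv[1] | Out-File $file -Append -Encoding ASCII
    }
}
```

The size check runs before each write, so a file can end up slightly larger than $MaxBytes. Numbering always starts at 0, so files left over from an earlier run would be appended to.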