How to extract specific fields from CSV file with data using PowerShell

I am trying to extract specific fields, together with their data, from a CSV file, but when I run the following script I get only the field names without any data.
I tried this PowerShell script and got an output file, but it contains no data.
$folderPath = 'D:\Data\Orignal\'
$folderPathDest = 'D:\Data\New\'
$desiredColumns = 'ID','TC', 'POINTS_AT','POINTS_DESC','SURVEYSENT'
Get-ChildItem $folderPath -Name | ForEach-Object {
    $filePath = $folderPath + $_
    $filePathdest = $folderPathDest + $_
    Import-Csv $filePath |
        Select $desiredColumns |
        Export-Csv -Path $filePathDest -NoTypeInformation
}
I expect the data to be exported along with the fields!
Here is some example data from which I am trying to extract the fields. It's a sample file; the real dump is much bigger.
ID;TC;POINTS_AT;POINTS_DESC;SURVEYSENT;STATUSCHECK;MEM_REQ;EXPIRY_DAT
1;true;5;SUBS;true;08.08.2018;true;20.12.2020
2;true;3;SUBS;true;08.08.2018;true;20.12.2020
3;false;2;SUBS;true;08.08.2018;false;20.12.2020
4;true;5;UNSUBS;true;08.08.2018;false;20.12.2020
5;true;1;UNSUBS;true;08.08.2018;true;20.12.2020
6;false;1;UNSUBS;true;08.08.2018;true;20.12.2020

Specify ; as the delimiter when calling Import-Csv. The default delimiter is a comma (,), so the cmdlet cannot split the line into separate columns. As a result, Select-Object doesn't find the requested properties and returns objects with empty properties.
Import-Csv:
Delimiter
Specifies the delimiter that separates the property values in the CSV file. The default is a comma (,).
Enter a character, such as a colon (:). To specify a semicolon (;) enclose it in single quotation marks.
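A minimal sketch of the fix, assuming the sample rows from the question. ConvertFrom-Csv stands in for Import-Csv here so the snippet runs without touching any files; both cmdlets accept the same -Delimiter parameter. The variable names $sample, $wrong and $right are illustrative only.

```powershell
# Sample rows from the question, held in memory instead of a file.
$sample = @'
ID;TC;POINTS_AT;POINTS_DESC;SURVEYSENT;STATUSCHECK;MEM_REQ;EXPIRY_DAT
1;true;5;SUBS;true;08.08.2018;true;20.12.2020
2;true;3;SUBS;true;08.08.2018;true;20.12.2020
'@

$desiredColumns = 'ID', 'TC', 'POINTS_AT', 'POINTS_DESC', 'SURVEYSENT'

# Without -Delimiter, each line is parsed as one single column,
# so none of the requested properties exist and they come back empty.
$wrong = $sample | ConvertFrom-Csv | Select-Object $desiredColumns

# With -Delimiter ';', every column becomes its own property.
$right = $sample | ConvertFrom-Csv -Delimiter ';' | Select-Object $desiredColumns

# Show the selected columns as semicolon-separated text.
$right | ConvertTo-Csv -Delimiter ';' -NoTypeInformation
```

In the script from the question, the equivalent change would be Import-Csv $filePath -Delimiter ';', plus -Delimiter ';' on Export-Csv if the output should also be semicolon-separated.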

Related

Remove commas from a csv column using powershell

I have the following code which removes all the commas in my csv file but there is an issue:
I only want to remove the commas in column P; the remaining data should be left untouched. Currently the script appends the modified data underneath the existing CSV data.
$inform = Get-Content C:\Users\bmd\Desktop\report.csv
$inform.ToString()
$y=$inform -replace ',' -replace ''
$y | Out-file C:\Users\bmd\Desktop\report.csv -Append
Using Import-Csv and Export-Csv is usually going to be easier than trying to manipulate strings in a CSV. With no sample of the CSV file, we can only make assumptions. Assuming it contains valid CSV data, with fields quoted where needed, you can try the following:
$csv = Import-Csv C:\Users\bmd\Desktop\report.csv
foreach ($row in $csv) {
    $row.UNSTRUCTURED_VARS = $row.UNSTRUCTURED_VARS -replace ','
}
$csv | Export-Csv C:\Users\bmd\Desktop\report.csv -NoType
Note that if you are on PowerShell 7 or later, you can simply use Export-Csv with -UseQuotes AsNeeded, and only the fields that require it will be quoted.
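To illustrate why the object-based approach leaves the other columns alone, here is a small sketch on made-up data. The NAME column and both rows are hypothetical, and UNSTRUCTURED_VARS is simply the column name the answer assumed; ConvertFrom-Csv and ConvertTo-Csv are used in place of the file cmdlets so nothing is written to disk.

```powershell
# Hypothetical sample: two quoted columns that both contain commas.
$sample = @'
NAME,UNSTRUCTURED_VARS
"Smith, John","a,b,c"
"Doe, Jane","x,y"
'@

$csv = $sample | ConvertFrom-Csv

foreach ($row in $csv) {
    # -replace with no replacement operand deletes every match,
    # and only this one property is touched.
    $row.UNSTRUCTURED_VARS = $row.UNSTRUCTURED_VARS -replace ','
}

# NAME keeps its comma; UNSTRUCTURED_VARS no longer has any.
$csv | ConvertTo-Csv -NoTypeInformation
```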

Powershell - How to search (using wildcard) and replace values in a CSV file?

I have a CSV file (one column/field only) with thousands of records in it.
I need a way in PowerShell to search for a value using a few characters followed by a wildcard and, where found, replace that value with a ".
I have searched around on how to do this, but everything I have found so far either doesn't cover CSV files or doesn't explain how I might do the search using a wildcard.
Example of values in CSV file:
<#
RanDom.Texto 1.yellow [ Table - wood ] "gibberishcode1.moreRandomText11.xyz123+456"
R#ndomEq.Textolo 2.blue [Chair - steel ] "gibberishcode2.moreRandomText222.xyz19283+4567+89
randomi.Textpel 3.green [ counter - granite] "gibberishcode3.moreRandomText3333.xyz17243+3210+987+654"
#>
You will note above that the only values in common across the records are the .xyz in each record.
I want to replace the .xyz (and everything that follows) with a " value.
E.g. Desired result as follows:
<#
RanDom.Texto 1.yellow [ Table - wood ] "gibberishcode1.moreRandomText11"
R#ndomEq.Textolo 2.blue [Chair - steel ] "gibberishcode2.moreRandomText222"
Randomi.Textpel 3.green [ counter - granite] "gibberishcode3.moreRandomText3333"
#>
Here is some code I tried. It doesn't work: it does not replace the values, although it does successfully export to a new CSV file.
# Create function that gets the current file path (of where this script is located)
function Get-ScriptDirectory {Split-Path -parent $PSCommandPath}
# Create function that gets the current date and time in format of 1990-07-01_19h15m59
function Get-TimeStamp {return "{0:yyyy-MM-dd}_{0:HH}h{0:mm}m{0:ss}" -f (Get-Date)}
# Set current file path. Also used in both FOR loops below as primary source directory.
${sourceDirPath} = Get-ScriptDirectory
# Import CSV look-up file
${csvFile} = (Import-Csv -Path ${sourceDirPath}\SourceCSVFile.csv)
# for each row, replace the values of .xyz and all that follows with "
foreach(${row} in ${csvFile})
{
${row} = ${row} -replace '.xyz*','"'
}
# Set modified CSV's name and path
${newCSVFile} = ${sourceDirPath} + '\' + $(Get-TimeStamp) + '_SourceCSVFile_Modified.csv'
# export the modified CSV
${csvFile} | Export-Csv ${newCSVFile} -NoTypeInformation
I also tried this as an alternative, but no luck either (I think the code below may only work for .txt files?):
((Get-Content -path C:\TEMP\TEST\SourceCSVFile.csv -Raw) -replace '.xyz'*,'"') | Export-Csv -Path C:\TEMP\TEST\ReplacementFile.csv
I'm new to Powershell and don't have a proper understanding of regex yet so please be gentle.
UPDATE and SOLUTION:
For those who are interested in my final solution: I used the code provided by Thomas (thank you!!), but my .csv file was left with some records that had a triple quotation mark (""") at the end of the string.
So I modified the code to use variables and run a second cleaning pass that replaces every triple quotation mark (""") with a single quotation mark ("), and then pipes the result to a file.
# Create function that gets the current file path (of where this script is located and running from)
function Get-ScriptDirectory {Split-Path -parent $PSCommandPath}
# Set current file path
${sourceDirPath} = Get-ScriptDirectory
# Assign source .csv file name to variable
$origNameSource = 'AllNames.csv'
# Assign desired .csv file name post cleaning
$origNameCLEAN = 'AllNames_CLEAN.csv'
# First pass clean to replace .xyz* with " and assign result to tempCsvText variable
${tempCsvText} = ((Get-Content -Path ${sourceDirPath}\$origNameSource) | % {$_ -replace '\.xyz.*$', '"'})
# Second pass clean to replace """ with " and write result to a new .csv file
${tempCsvText} -replace '"""', '"' | Set-Content -Path ${sourceDirPath}\$origNameCLEAN
# Import records from new .csv file and remove duplicates by using Sort-Object * -Unique
${csvFile} = (Import-Csv -Path ${sourceDirPath}\$origNameCLEAN) | Sort-Object * -Unique
First, a .csv file is nothing more than a regular text file that follows some rules about how its content is laid out (one line per row, columns delimited by a defined ASCII character, an optional header). Your last line is close. You have to use a regular expression that reaches to the end of the line. This will do it:
Get-Content -Path C:\TEMP\TEST\SourceCSVFile.csv | % {$_ -replace '\.xyz.*$', '"'} | Set-Content -Path C:\TEMP\TEST\ReplacementFile.csv
Differences:
I removed the -Raw parameter to get each line as one string.
I used the pipe to process each string (line)
I adjusted your regex to match from .xyz until the end of each line
I piped the result to Set-Content as I only did text replacement and did not read any objects that would then have to be retranslated back to csv text by Export-Csv
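The regex can be tried out on two of the sample lines from the question without reading or writing any file. This is only a sketch; the variable names $lines and $cleaned are illustrative.

```powershell
# Two sample lines from the question (the second has no closing quote).
$lines = @(
    'RanDom.Texto 1.yellow [ Table - wood ] "gibberishcode1.moreRandomText11.xyz123+456"'
    'R#ndomEq.Textolo 2.blue [Chair - steel ] "gibberishcode2.moreRandomText222.xyz19283+4567+89'
)

# \.   matches a literal dot (an unescaped . would match any character)
# xyz  matches the marker text
# .*$  matches everything that follows, up to the end of the line
$cleaned = $lines | ForEach-Object { $_ -replace '\.xyz.*$', '"' }

$cleaned
```

By contrast, the pattern '.xyz*' from the question means "any character, then xy, then zero or more z", so it would replace only the .xyz marker and leave the trailing digits in place.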

Is there a way to merge similar lines using Powershell?

Suppose I have two csv files. One is
id_number,location_code,category,animal,quantity
12212,3,4,cat,2
29889,7,6,dog,2
98900,
33221,1,8,squirrel,1
the second one is:
98900,2,1,gerbil,1
The second file may have a newline or something at the end (maybe or maybe not, I haven't checked), but only the one line of content. There may be three or four or more different varieties of the "second" file, but each one will have a first element (98900 in this example) that corresponds to an incomplete line in the first file similar to what is in this example.
Is there a way using powershell to automatically merge the line in the second (plus any additional similar) csv file into the matching line(s) of the first file, so that the resulting file is:
12212,3,4,cat,2
29889,7,6,dog,2
98900,2,1,gerbil,1
33221,1,8,squirrel,1
main.csv
id_number,location_code,category,animal,quantity
12212,3,4,cat,2
29889,7,6,dog,2
98900,
33221,1,8,squirrel,1
correction_001.csv
98900,2,1,gerbil,1
merge code used at the commandline, or in the .ps1 file of your choice
$myHeader = @('id_number','location_code','category','animal','quantity')
#Stage all the correction files: last correction in the most recent file wins
$ToFix = @{}
filter Plumbing_Import-Csv($Header){import-csv -LiteralPath $_ -Header $Header}
ls correction*.csv | sort -Property LastWriteTime | Plumbing_Import-Csv $myHeader | %{$ToFix[$_.id_number]=$_}
function myObjPipe($Header){
    begin{
        function TextTo-CsvField([String]$text){
            #text fields which contain comma, double quotes, or new-line are a special case for CSV fields and need to be accounted for
            if($text -match '"|,|\n'){return '"'+($text -replace '"','""')+'"'}
            return $text
        }
        function myObjTo-CsvRecord($obj){
            return ''+
                $obj.id_number +','+
                $obj.location_code +','+
                $obj.category +','+
                (TextTo-CsvField $obj.animal)+','+
                $obj.quantity
        }
        $Header -join ','
    }
    process{
        if($ToFix.Contains($_.id_number)){
            $out = $ToFix[$_.id_number]
            $ToFix.Remove($_.id_number)
        }else{$out = $_}
        myObjTo-CsvRecord $out
    }
    end{
        #I assume you'd append any leftover fixes that weren't used
        foreach($out in $ToFix.Values){
            myObjTo-CsvRecord $out
        }
    }
}
import-csv main.csv | myObjPipe $myHeader | sc combined.csv -encoding ascii
You could also use ConvertTo-Csv, but my preference is to not have all the extra " cruft.
Edit 1: reduced code redundancy, accounted for \n, fixed appends, and used @OwlsSleeping's suggestion about the -Header cmdlet parameter
also works with these files:
correction_002.csv
98900,2,1,I Win,1
correction_new.csv
98901,2,1,godzilla,1
correction_too.csv
98902,2,1,gamera,1
98903,2,1,mothra,1
Edit 2: convert gc | ConvertTo-Csv over to Import-Csv to fix the front-end \n issues. Now also works with:
correction_003.csv
29889,7,6,"""bad""
monkey",2
This is a simple solution assuming there's always exactly one match, and you don't care about output order. Change the output path to csv1 to overwrite.
I added headers manually in both input files, but you can specify them in Import-Csv instead if you'd rather avoid changing your files.
[array]$MissingLine = Import-Csv -Path "C:\Users\me\Documents\csv2.csv"
[string]$MissingId = $MissingLine[0].id_number
[array]$BigCsv = Import-Csv -Path "C:\Users\me\Documents\csv1.csv" |
Where-Object {$_.id_number -ne $MissingId}
($BigCsv + $MissingLine) |
Export-Csv -Path "C:\Users\me\Documents\Combined.csv"
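If the output order matters, one possible variation is to index the correction rows by id_number and substitute them in place while walking the main file. The sketch below uses the sample data from the question held in memory; $fix and $merged are illustrative names, and it assumes, as in the question, that the correction file has no header row.

```powershell
# Main file content from the question (row 98900 is incomplete).
$main = @'
id_number,location_code,category,animal,quantity
12212,3,4,cat,2
29889,7,6,dog,2
98900,
33221,1,8,squirrel,1
'@ | ConvertFrom-Csv

# The correction file has no header row, so the header is supplied explicitly.
$header = 'id_number', 'location_code', 'category', 'animal', 'quantity'
$corrections = '98900,2,1,gerbil,1' | ConvertFrom-Csv -Header $header

# Index the corrections by id_number for direct lookup.
$fix = @{}
foreach ($c in $corrections) { $fix[$c.id_number] = $c }

# Emit the corrected row where one exists, otherwise the original row.
$merged = foreach ($row in $main) {
    if ($fix.ContainsKey($row.id_number)) { $fix[$row.id_number] } else { $row }
}

$merged | ConvertTo-Csv -NoTypeInformation
```

Unlike the longer answer above, this sketch does not append corrections whose id_number is absent from the main file.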

Powershell replace text once per line

I have a PowerShell script, and I am trying to work out one part of it. The text input lists each user and the group they are part of. The script is supposed to replace each group with the groups that I am assigning them in Active Directory (I am limited to only changing groups in Active Directory). My issue is that when it reaches HR and replaces it, it then continues and also replaces the HR inside CHRL in the new text, so my groups look nuts right now. Looking it over, it doesn't happen on every line, but for gilchrist it puts something in for the hr in the name. Is there anything I can do to keep it from changing those, or am I going to have to change my HR to Human Resources? Thanks for the help.
$lookupTable = @{
    'Admin' = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
    'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
    'HR' = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'
}
$original_file = 'c:\tmp\test.txt'
$destination_file = 'c:\tmp\test2.txt'
Get-Content -Path $original_file | ForEach-Object {
    $line = $_
    $lookupTable.GetEnumerator() | ForEach-Object {
        if ($line -match $_.Key)
        {
            $line = $line -replace $_.Key, $_.Value
        }
    }
    $line
} | Set-Content -Path $destination_file
Get-Content $destination_file
test.txt:
user,group
john.smith,Admin
joanha.smith,HR
john.gilchrist,security
aaron.r.smith,admin
abby.doe,secuity
abigail.doe,admin
Your input appears to be in CSV format (though note that your sample rows have trailing spaces, which you'd have to deal with, if they're part of your actual data).
Therefore, use Import-Csv and Export-Csv to read / rewrite your data, which allows a more concise and convenient solution:
Import-Csv test.txt |
Select-Object user, @{ Name='group'; Expression = { $lookupTable[$_.group] } } |
Export-Csv -NoTypeInformation -Encoding Utf8 test2.txt
Import-Csv reads the CSV file as a collection of custom objects whose properties correspond to the CSV column values; that is, each object has a .user and a .group property in your case.
$_.group therefore robustly reports the abstract group name only, which you can directly pass to your lookup hashtable; Select-Object is used to pass the original .user value through, and to replace the original .group value with the lookup result, using a calculated property.
Export-Csv re-converts the custom objects to a CSV file:
-NoTypeInformation suppresses the (usually useless) data-type-information line at the top of the output file
-Encoding Utf8 was added to prevent potential data loss, because Windows PowerShell uses ASCII encoding by default (PowerShell 6 and later default to UTF-8).
Note that Export-Csv blindly double-quotes all field values, whether they need it or not; that said, CSV readers should be able to deal with that (and Import-Csv certainly does).
As for what you tried:
The -replace operator replaces all occurrences of a given regex (regular expression) in the input.
Your regexes amount to looking for (case-insensitive) substrings, which explains why HR matches both the HR group name and the substring hr in the username gilchrist.
A simple workaround would be to add assertions to your regex so that the substrings only match where you want them; e.g.: ,HR$ would only match after a , at the end of a line ($).
However, your approach of enumerating the hashtable keys for each input CSV row is inefficient, and you're better off splitting off the group name and doing a straight lookup based on it:
# Split the row into fields.
$fields = $line -split ','
# Update the group value (last field)
$fields[-1] = $lookupTable[$fields[-1]]
# Rebuild the line
$line = $fields -join ','
Note that you'd have to make an exception for the header row (e.g., test if the lookup result is empty and refrain from updating, if so).
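Put together, the split-and-lookup approach with the header exception might look like the sketch below. It runs on a few of the sample lines held in memory; the Trim() call is an added assumption to cope with the trailing spaces mentioned earlier, and $lines and $result are illustrative names.

```powershell
$lookupTable = @{
    'Admin'    = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
    'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
    'HR'       = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'
}

# A few lines from the sample test.txt, including the header row.
$lines = 'user,group', 'john.smith,Admin', 'joanha.smith,HR', 'john.gilchrist,security'

$result = foreach ($line in $lines) {
    # Split the row into fields.
    $fields = $line -split ','
    # Look up the group (last field); hashtable literals compare keys case-insensitively.
    $replacement = $lookupTable[$fields[-1].Trim()]
    # Only update when the lookup succeeded, which leaves the header
    # (and any unknown group name) untouched.
    if ($replacement) { $fields[-1] = $replacement }
    # Rebuild the line.
    $fields -join ','
}

$result
```

Because the lookup is done on the whole group field rather than by substring, the hr inside gilchrist is never considered.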
Why don't you load your text file as a CSV file, using Import-Csv with "," as the delimiter?
This gives you a PowerShell object you can work on and then export as text or CSV. Using your file and lookup table, this code may help you:
$file = Import-Csv -Delimiter "," -Path "c:\ps\test.txt"
$lookupTable = @{
    'Admin' = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
    'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
    'HR' = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'
}
foreach ($i in $file) {
    #Compare and replace
    ...
}
# The output path below is an assumption; the original answer did not name one.
$file | Export-Csv -Path "c:\ps\test2.txt" -Delimiter "," -NoTypeInformation
You can then iterate over $file to compare and replace, and call Export-Csv once you're done.

Append text to certain values in text file with PowerShell

I have a CSV text file separated with ; and it's in the format as:
USER_EMPLOYEE_ID;SYSTEM1;USERNAME1
The first column is an identity and the following pairs of columns are user's account on different active directories. I have placed garbage data but the idea is there.
ay7suve0001;ADDPWN;ay7suve0001
AAXMR3E0001;ADDPWN;AAXMR3E0001
ABABIL;ADDPWN;ABABIL
ABDF17;ADDPWN;ABDF17;
ABKMPPE0001;ADDPWN;ABKMPPE0001
ABL1FL;ADDPWN;ABL1FL
AB6JG8E0004;ADDPWN;AB6JG8E0004;
ACB4YB;ADDPWN;ACB4YB
ACK7J9;ADDPWN;ACK7J9
ACLZFS;ADDPWN;ACLZFS;
ACQXZ3;ADDPWN;ACQXZ3
Now there is a requirement that I have to append a fixed string like @ADDPWN.com to all the USERNAME1 values. Some records have a trailing ; and some don't.
Is there a quick way to append @ADDPWN.com to each line, taking care of:
any trailing ;
any @ADDPWN.com that is already present
From PowerShell?
Import-Csv is your friend. The following should get you on the right track.
Import-Csv "import.csv" -Delimiter ';' |
    foreach {
        if ($_.username1 -notlike '*@ADDPWN.com') { $_.username1 += '@ADDPWN.com' }
        $_
    } |
    Export-Csv "export.csv" -Delimiter ';'
This assumes the first line of your csv file is your header line. If it's not, you can pass -Header 'USER_EMPLOYEE_ID','SYSTEM1','USERNAME1' as another parameter to Import-Csv.
Export-Csv adds some extra stuff like quotes around parameters, so you may need to play with the output format if you don't want that.
For another explanation how this works, check out Changes last name, first name to first name, last name in last column CSV powershell
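The approach can be checked on a few in-memory rows before running it against the real file. This is only a sketch: it assumes the intended suffix is @ADDPWN.com, uses ConvertFrom-Csv in place of Import-Csv, and the third row (which already carries the suffix) is made up to exercise that case.

```powershell
# Header plus three rows: plain, with a trailing ';', and already suffixed.
$sample = @'
USER_EMPLOYEE_ID;SYSTEM1;USERNAME1
ABABIL;ADDPWN;ABABIL
ABDF17;ADDPWN;ABDF17;
ACB4YB;ADDPWN;ACB4YB@ADDPWN.com
'@

$result = $sample | ConvertFrom-Csv -Delimiter ';' | ForEach-Object {
    # Append the suffix only when it is not there yet.
    if ($_.USERNAME1 -notlike '*@ADDPWN.com') { $_.USERNAME1 += '@ADDPWN.com' }
    $_
}

# The empty extra field created by a trailing ';' is not part of the
# three named columns, so it does not appear in the output.
$result | ConvertTo-Csv -Delimiter ';' -NoTypeInformation
```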
This was a solution that worked for me:
#opens list of file names
$file2 ="F:\OneDrive_Biz\PowerApps\SecurityCameraVideoApp\file_list_names.csv"
$x = Get-Content $file2
#prepends the URL to each file name in the list
for($i=0; $i -lt $x.Count; $i++){
$x[$i] = "https://analytics-my.sharepoint.com/personal/gpowell_analytics_onmicrosoft_com/Documents/PowerApps/SecurityCameraVideoApp/Video_Files/" + $x[$i]
}
$x
#remove all files in target directory prior to saving new list
get-childitem -path C:\_TEMP\file_list_names.csv | remove-item
Add-Content -Path C:\_TEMP\file_list_names_url.csv -Value $x