Handling a CSV with line feed characters in a column in PowerShell

Currently, I have a system which creates a delimited file like the one below, in which I've mocked up the extra line feeds that appear sporadically within the columns.
Column1,Column2,Column3,Column4
Text1,Text2[LF],text3[LF],text4[CR][LF]
Text1,Text2[LF][LF],text3,text4[CR][LF]
Text1,Text2,text3[LF][LF],text4[CR][LF]
Text1,Text2,text3[LF],text4[LF][LF][CR][LF]
I've been able to remove the line feeds causing me concern in Notepad++, using the following regex to ignore the valid carriage return/line feed combinations:
(?<![\r])[\n]
I am unable, however, to find a solution using PowerShell. I think that when I run Get-Content on the CSV file, the line feeds within the text fields are ignored, and each value is stored as a separate object in the variable assigned to the Get-Content result. My question is: how can I apply the regex to the CSV file with a replace operation if the cmdlet ignores the line feeds when loading the data?
I've also tried the method below to load the content of my CSV, which doesn't work either, as it just results in one long string, similar to using -join (Get-Content).
[STRING]$test = [io.file]::ReadAllLines('C:\CONV\DataOutput.csv')
$test.replace("(?<![\r])[\n]","")
$test | out-file .\DataOutput_2.csv

Nearly there, may I suggest just 3 changes:
use ReadAllText(…) instead of ReadAllLines(…)
use -replace … instead of .Replace(…), only then will the first argument be treated as a regex
do something with the replacement result (e.g. assign it back to $test)
Sample code:
[STRING]$test = [io.file]::ReadAllText('C:\CONV\DataOutput.csv')
$test = $test -replace '(?<![\r])[\n]',''
$test | out-file .\DataOutput_2.csv

Related

Use PowerShell to check whether a column is empty and then delete the entire row from a CSV file

I have a CSV file, with no header row, that looks like this:
"88212526";"Starter";"PowerMax";"4543";"5713852369748";"146,79";"EUR";"6"
"88212527";"Starter";"PowerMax";"4543";"5713852369755";"66,88";"EUR";"20"
"88212530";"Starter";"PowerMax";"4543";"5713852369786";"143,27";"EUR";"0"
"88212532";"Starter";"PowerMax";"4543";"5713852369809";"80,98";"EUR";"6"
"88212536";"Starter";"PowerMax";"4543";"5713852369847";"";"EUR";"0"
"88212542";"Starter";"PowerMax";"4543";"5713852369908";"77,16";"EUR";"9"
"88212543";"Starter";"PowerMax";"4543";"5713852369915";"77,46";"EUR";"52"
I need a script in PowerShell that deletes the entire row if column 6 is empty.
I have tried this
Foreach ($line in Get-Content .\POWERMAX_DK_1.csv) {
$linearray = $line.split(";")
if($linearray[6] -ne "") {
Add-Content .\myTempFile.csv $line
}
}
But it doesn't work; the line with the empty column is not removed.
Please help.
/Kim
Your immediate problem is twofold:
As Mauro Takeda's answer points out, to access the 6th element, you must use index 5, given that array indices are 0-based.
Since you're reading your CSV file as plain text, the field you're looking for has the verbatim content "", i.e. including the double quotes, so you'd have to use -ne '""' instead of -ne "" (applied to $linearray[5]).
However, it's worth changing your approach:
Use Import-Csv to import your CSV file, which in your case requires manually supplying headers (column names) with the -Header parameter.
This outputs objects whose properties are named for the columns, and whose property values have the syntactic " delimiters removed.
These properties can then be used to robustly filter the input with the Where-Object cmdlet.
In order to convert the results back to a CSV file, use a single call to Export-Csv, as shown below (see next point).
Using Add-Content in a loop body is ill-advised for performance reasons, because the file has to be opened and closed in every iteration; instead, pipe to a single call of a file-writing cmdlet - see this answer for background information.
Therefore:
# Note: The assumption is that there are 8 columns, as shown in the sample data.
# Adjust as needed.
Import-Csv .\POWERMAX_DK_1.csv -Delimiter ';' -Header (1..8) |
Where-Object 6 -ne '' |
Export-Csv -NoTypeInformation .\myTempFile.csv
Character-encoding caveat: In Windows PowerShell, Export-Csv uses ASCII(!) by default; PowerShell (Core) 7+ commendably uses BOM-less UTF-8. Use the -Encoding parameter as needed.
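For instance, to force UTF-8 output, the same pipeline as above can take an explicit encoding (note that in Windows PowerShell, -Encoding UTF8 writes a UTF-8 file with a BOM):
Import-Csv .\POWERMAX_DK_1.csv -Delimiter ';' -Header (1..8) |
Where-Object 6 -ne '' |
Export-Csv -NoTypeInformation -Encoding UTF8 .\myTempFile.csv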
If you need to check column 6, you have to use $linearray[5], because arrays start counting at zero ($linearray[0] is the first element).

Powershell: how to retrieve powershell commands from a csv and execute one by one, then output the result to the new csv

I have a Commands.csv file like:
| Command |
| -----------------------------------------------|
|(Get-FileHash C:\Users\UserA\Desktop\File1).Hash|
|(Get-FileHash C:\Users\UserA\Desktop\File2).Hash|
|(Get-FileHash C:\Users\UserA\Desktop\File3).Hash|
Header name is "Command"
My idea is to:
Use ForEach ($line in Get-Content C:\Users\UserA\Desktop\Commands.csv ) {echo $line}
Execute $line one by one via powershell.exe, then output a result to a new .csv file - "result.csv"
Can you give me some directions and suggestions to implement this idea? Thanks!
Important:
Only use the technique below with input files you either fully control or implicitly trust to not contain malicious commands.
To execute arbitrary PowerShell statements stored in strings, you can use Invoke-Expression, but note that it should typically be avoided, as there are usually better alternatives - see this answer.
There are advanced techniques that let you analyze the statements before executing them and/or let you use a separate runspace with a restrictive language mode that limits what kinds of statements are allowed to execute, but that is beyond the scope of this answer.
Given that your input file is a .csv file with a Command column, import it with Import-Csv and access the .Command property on the resulting objects.
Use Get-Content only if your input file is a plain-text file without a header row, in which case the extension should really be .txt. (If it has a header row but there's only one column, you could get away with Get-Content Commands.csv | Select-Object -Skip 1 | ...). If that is the case, use $_ instead of $_.Command below.
To also use the CSV format for the output file, all commands must produce objects of the same type or at least with the same set of properties. The sample commands in your question output strings (the value of the .Hash property), which cannot meaningfully be passed to Export-Csv directly, so a [pscustomobject] wrapper with a Result property is used, which will result in a CSV file with a single column named Result.
Import-Csv Commands.csv |
ForEach-Object {
[pscustomobject] @{
# !! SEE CAVEAT AT THE TOP.
Result = Invoke-Expression $_.Command
}
} |
Export-Csv -NoTypeInformation Results.csv
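If your input were instead a headerless plain-text file, the Get-Content variant mentioned above would look like the following sketch (Commands.txt is an assumed file name):
Get-Content Commands.txt |
ForEach-Object {
[pscustomobject] @{
# !! SEE CAVEAT AT THE TOP.
Result = Invoke-Expression $_
}
} |
Export-Csv -NoTypeInformation Results.csv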

PS: Define multiple variables in File1, read File2, and replace all tags in File1 with the variables in File2

I have a file1 with several variables defined:
'$GateWay_HostName'='Biscuits'
'$DomainName'='AND'
'$DC_Name'='Gravy'
I have another file (File2) with line-by-line commands to send to Cisco devices. I do not want to read the file, replace the variables, then save the file, because the passwords would then be stored in cleartext. I can't seem to figure out how to import the variables and replace any matching string with the corresponding value.
I can pull in the variables and call them in the script and reference $GateWay_HostName for example with:
$ReplaceVars = Get-Content "C:\folder\file1.csv" | ConvertFrom-StringData
But I can't seem to find anything about going through the imported string to replace all of the variables (some appear once, some appear many, some don't exist).
$CommandstoSend = (Get-Content -Path "C:\folder\File2" -raw)
Because the commands are passed as one raw string, the variables aren't expanded on the fly. I need to import the raw data, otherwise plink won't pass the commands, but I don't care at what point it gets converted to "raw". Also, if I'm going to end up using a search and replace, I know I don't need to keep the $VariableName format.
Thanks
You can do the following:
$ReplaceVars = Get-Content "C:\folder\file1.csv" -Raw | ConvertFrom-StringData
$replaceString = ($ReplaceVars.Keys |% {[regex]::Escape($_)}) -join '|'
[regex]::Replace((Get-Content file2 -raw),$replaceString,{param($m) $ReplaceVars[$m.Value]})
Using -Raw on Get-Content creates a single string. This is important for returning a single hash table after piping into ConvertFrom-StringData. Without -Raw, ConvertFrom-StringData outputs an array of hash tables, making lookups more complex.
[regex]::Escape() is used to escape special regex characters, like the $ that could be in your variable names. Joining on | creates a regex alternation (equivalent to a logical OR).
The [regex]::Replace() method allows for a script block to be used in the replacement string parameter. Using a parameter $m, we can reference the object that contains the matched text (stored in the Value property) and manipulate it. Here we use it as a key name ($m.Value) for the $replaceVars hash table.
Effectively, this solution looks for text that matches a hash table key name and then replaces that name with the corresponding hash table value.
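As a self-contained sketch of that technique (the variable values come from file1 above; the inline command text is made up for illustration):
$ReplaceVars = @{ '$GateWay_HostName' = 'Biscuits'; '$DomainName' = 'AND'; '$DC_Name' = 'Gravy' }
$replaceString = ($ReplaceVars.Keys | ForEach-Object { [regex]::Escape($_) }) -join '|'
$commands = "hostname `$GateWay_HostName`nip domain-name `$DomainName"
[regex]::Replace($commands, $replaceString, { param($m) $ReplaceVars[$m.Value] })
# hostname Biscuits
# ip domain-name Gravy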
IMO, the better solution is to configure file2 to be a template file. You can use string format placeholders where the substitutions need to happen. Then just use the string format operator -f. See below for an example:
# file2 contents
$file2 = @'
My DC Name = {0}
Domain Name = {1}
I don't want to edit this line.
Gateway Host = {2}
'@
$file2 -f 'Gravy','AND','Biscuits'
Output:
My DC Name = Gravy
Domain Name = AND
I don't want to edit this line.
Gateway Host = Biscuits

Filtering specific value from csv file using PowerShell

I am having a problem filtering a specific value from my CSV file. The CSV file looks as follows:
"1","19/Oct/2016","15:03:58","19/Oct/2014","15:03:58","0:00:00","---","---","nice_meme#help.com","---","sip","1232Kbps","---","Out","1140","1","---","---","---","user:---","---","---","---","---","---","---","---","Failed Attempt; ""Your call could not be completedOver.""","3","---","---","---","---","---","---","---","---","---","---","---","---","---"
As you can see, there are multiple values consisting of '---'. I have tried a lot of ways to remove these three dashes, but I do not know how to filter them out using PowerShell. I want to keep only the values that are not equal to the three dashes.
Something like this:
$a = Import-Csv -Path "C:\Transformed\test.csv" | Where-Object {$_.Header -ne "---"}
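Since the sample file has no header row, Import-Csv needs explicit column names via -Header. A sketch along these lines (the column count of 41 and the output file name are assumptions based on the sample row) blanks out the '---' placeholders in every field:
$rows = Import-Csv -Path 'C:\Transformed\test.csv' -Header (1..41)
$rows | ForEach-Object {
foreach ($p in $_.PSObject.Properties) {
if ($p.Value -eq '---') { $p.Value = '' }
}
$_
} | Export-Csv -NoTypeInformation 'C:\Transformed\test_clean.csv'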

Count characters in string then insert delimiter using PowerShell

I have a Linux server that will generate several files throughout the day that need to be inserted into a database; using PuTTY I can SFTP them off to a server running SQL Server 2008. The problem is the structure of the file itself: it has a string of text whose pieces belong in different columns, but BULK INSERT in SQL tries to put it all into one column instead of six. PowerShell may not be the best method, but I have seen on several sites how it can find and replace or append to the end of a line; can it count characters and insert?
So the file looks like this: '18240087A +17135555555 3333333333', where 18, 24, 00, 87, A are different columns; then there is a blank space between the A and the +; characters 10-19 are another column, characters 20-30 are a column, characters 31-36 are another column, and so on. I want to insert a '|' or a ',' so that SQL understands where the columns end. Can PowerShell count out character positions like this?
This may not be the proper way to respond to all who answered; I apologize in advance. As this is my first PowerShell script, I appreciate the input from each of you. This is an Avaya SIP server generating CDR records, which I must pull from the server and insert into SQL for later reports. The exported file looks like this:
18:47 10/15
18470214A +14434444444 3013777777 CME-SBC HHHH-CM 4 M00 0
At first I just thought to delete the first line and run a script against the output, which I modified from Kieranties' post:
$test = Get-Content C:\Share\CDR\testCDR.txt
$pattern = "^(.{2})(.{2})(.{1})(.{2})(.{1})(.{1})\s*(.{15})(.{10})\s*(.{7})\s*(.{7})\s*(.{1})\s*(.{1})(.{1})(.{1})\s*(.*)$"
if($test -match $pattern){
$result = $matches.Values | select -first ($matches.Count-1)
[array]::Reverse($result, 0, $result.Length)
$result = $result -join "|"
$result | Out-File c:\Share\CDR\results1.txt
}
But then I realized I need that first line, as it contains the date. I can try to work that out another way, though.
I also now see that there are times when the file contains 2 or more lines of CDR info, such as:
18:24 10/15
18240087A +14434444444 3013777777 CME-SBC HRSA-CM 4 M00 0
18240096A +14434444445 3013777778 CME-SBC HRSA-CM 4 M00 0
Whereas the .ps1 file I made does not give the second string, so I tried adding in this:
foreach ($Data in $test)
{
$Data = $Data -split(',')
and it fails to run. How can I handle multiple lines (and possibly that first line)? If you know of a tutorial that can help, that's greatly appreciated as well!
PowerShell is a great tool that I love, and it can do many things. I see that you are using SQL Server 2008. Depending on the edition running on the server, it most likely includes SQL Server Integration Services (SSIS), an Extract, Transform, and Load (ETL) tool designed to help migrate data in many scenarios such as yours. The file you describe sounds like a fixed-width file, which SSIS can easily import, and SQL Server has great ways to automate the loads if this is a recurring need (which it sounds like), including automating the SFTP task and even running PowerShell scripts as part of the ETL (I've done that several times).
If your file truly is fixed-width and you want to use PowerShell to transform it into a delimited file, the regex approach you have in your answer works well, or there are several approaches using the System.String methods, like .Insert(), which lets you insert a delimiter character at a given character index in your line (use Get-Content to read the file, creating one String object per line, then loop through them with a foreach loop or ForEach-Object and the pipeline). A slightly more involved approach is the .Substring() method: build your new line by extracting each column with Substring and concatenating those values with a delimiter. That's probably a lot for someone new to PowerShell, but one of the best ways to learn and gain proficiency is to practice writing the same script multiple ways; you may learn techniques that solve other problems you encounter in the future.
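As a minimal sketch of that .Substring() approach (the column widths here are taken from the sample line in the question; adjust the offsets and lengths to your real layout):
$line = '18240087A +17135555555 3333333333'
# Extract each fixed-width column, then rejoin with a delimiter
$fields = $line.Substring(0,2), $line.Substring(2,2), $line.Substring(4,2),
$line.Substring(6,2), $line.Substring(8,1),
$line.Substring(10,12), $line.Substring(23,10)
$fields -join '|'
# 18|24|00|87|A|+17135555555|3333333333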
This is one way (really ugly IMO; I think it can be done better):
$a = '18240087A +17135555555 3333333333'
$b = @( ($a[0..1] -join ''), ($a[2..3] -join ''), ($a[4..5] -join ''),
($a[6..7] -join ''), ($a[8] -join ''), ($a[10..19] -join ''),
($a[20..30] -join ''), ($a[31..36] -join ''))
$c = $b -join '|'
$c
18|24|00|87|A|+171355555|55 33333333|33
I don't know if this is the right splitting for your needs, but by changing the values in each [x..y] range you can make it fit. Remember that character arrays are 0-based, so the first char is index 0, and so on.
I don't quite follow the splitting rules. What kind of software writes the text file anyway? Maybe it can be instructed to change the structure?
That being said, inserting pipes is easy enough with .Insert()
$a= '18240087A +17135555555 3333333333'
$a.Substring(0, $a.IndexOf('+')).Insert(2, '|').insert(5,'|').insert(8, '|').insert(11, '|').insert(13, '|')
# Output: 18|24|00|87|A|
# Rest of the line:
$a.Substring($a.IndexOf('+')+1)
# Output: 17135555555 3333333333
From there you can proceed to splitting the rest of the row data.
I've improved my answer based on your response (note, it's probably best you update your actual question to include that information!)
The nice thing about Get-Content in Powershell is that it returns the content as an array split on the end of line characters. Couple that with allowing multiple assignment from an array and you end up with some neat code.
The following has a function to process each line based on your modified version of my original answer. It's then wrapped by a function which processes the file.
This reads the given file, setting the first line to $date and the rest of the content to $content. It then creates an output file, adds the date line, and loops over the rest of the content, performing the regex check and appending the parsed version of each line when the check succeeds.
Function Parse-CDRFileLine {
Param(
[string]$line
)
$pattern = "^(.{2})(.{2})(.{1})(.{2})(.{1})(.{1})\s*(.{15})(.{10})\s*(.{7})\s*(.{7})\s*(.{1})\s*(.{1})(.{1})(.{1})\s*(.*)$"
if($line -match $pattern){
$result = $matches.Values | select -first ($matches.Count-1)
[array]::Reverse($result, 0, $result.Length)
$result = $result -join "|"
$result
}
}
Function Parse-CDRFile{
Param(
[string]$filepath
)
# Read content, setting first line to $date, the rest to $content
$date,$content = Get-Content $filepath
# Create the output file, overwrite if neccessary
$outputFile = New-Item "$filepath.out" -ItemType file -Force
# Add the date line
Set-Content $outputFile $date
# Process the rest of the content
$content |
? { -not([string]::IsNullOrEmpty($_)) } |
% { Add-Content $outputFile (Parse-CDRFileLine $_) }
}
Parse-CDRFile "C:\input.txt"
I used your sample input and the result I get is:
18:24 10/15
18|24|0|08|7|A|+14434444444 30|13777777 C|ME-SBC |HRSA-CM|4|M|0|0|0
18|24|0|09|6|A|+14434444445 30|13777778 C|ME-SBC |HRSA-CM|4|M|0|0|0
There are an incredible number of resources out there, but one I particularly suggest is Doug Finke's PowerShell for Developers. It's short, concise, and full of great info that will get you thinking in the right mindset with PowerShell.