PowerShell: Converting a Tab-Delimited CSV to a Comma-Delimited CSV without Quotes

We get a tab-delimited CSV from an external COGNOS system in a public folder. Uploading it to Salesforce via the Data Loader CLI fails with:
com.salesforce.dataloader.exception.DataAccessRowException: Error
reading row #0: the number of data columns (98) exceeds the number of
columns in the header (97)
But if you open the CSV in MS Excel, save it as a new CSV (UTF-8), and then pass that to the Data Loader CLI, it works without any issue.
The difference in the Excel-converted file seems to be that it is comma-separated instead of tab-separated.
So I tried to convert the original tab-delimited CSV to a comma-separated CSV using the command below:
import-csv source.csv -delimiter "`t" | export-csv target.csv -notype
But the output of this has quotes. Data Loader now runs with the file but imports nothing into Salesforce; it seems unable to identify the field names properly.
Then I tried the commands below to remove the double quotes:
import-csv source.csv -delimiter "`t" | export-csv target.csv -notype
(Get-Content target.csv) | ForEach-Object { $_ -replace '"', '' } | Out-File target.csv
But this resulted in an 'Index out of range' error, and it's not clear to me why.
What would be the best approach to do this conversion for the Data Loader CLI?
What would make this conversion match Excel's?
Any suggestions, thoughts, or help are highly appreciated.
Thanks!

Salesforce has strict rules for CSV files, and its documentation says that no more than 50,000 records can be imported at one time.
The main thing here is that the file MUST be in UTF-8 format.
The quotes around the values are needed.
This should do it (provided you do not have more than 50,000 records in the CSV):
Import-Csv -Path 'source.csv' -Delimiter "`t" | Export-Csv -Path 'target.csv' -Encoding UTF8 -NoTypeInformation
(source.csv is the TAB-delimited file you receive from COGNOS)
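If it turns out Data Loader really does choke on the quotes, PowerShell 7+ (not Windows PowerShell 5.1) adds a -UseQuotes parameter to Export-Csv; a minimal sketch, assuming PowerShell 7 is available:
# PowerShell 7.0+ only: quote values only when they actually need it
Import-Csv -Path 'source.csv' -Delimiter "`t" |
    Export-Csv -Path 'target.csv' -Encoding UTF8 -NoTypeInformation -UseQuotes AsNeeded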

Related

PowerShell and CSV: Stop CSV from turning text data into Scientific Notation

I have a CSV with alphanumeric combinations in one column.
I am later going to use this CSV file in a PowerShell script by importing the data.
Examples: 1A01, 1C66, 1E53.
Now before putting these values in, I made sure to format the column as text.
Now at first it works: I input the data and save. I test importing it in PowerShell and
all data shows up valid, including 1E53. But let's say I edit the file again later to add data, then save and close. When I re-import into PowerShell, 1E53 comes in as 1.00E+53. How can I prevent this permanently? Note that the column is filled with codes and there are lots of #E## values.
Your issue is not with PowerShell, it's with Excel. For a demonstration, enter 1E53 into Excel and then save that Excel file as a CSV file. You will see that the value has changed to 1.00E+53.
How to fix this?
There are a few ways of disabling scientific notation:
https://superuser.com/questions/452832/turn-off-scientific-notation-in-excel
https://www.logicbroker.com/excel-scientific-notation-disable-prevent/
I hope some of them work for you.
I think you can rename the file to .txt instead of .csv, and Excel may treat it differently.
Good Luck
As commented:
You will probably load the CSV from a file:
$csv = Import-Csv -Path 'X:\original.csv' -UseCulture
The code below uses a dummy CSV in a here-string instead:
$csv = @'
"Column1","Column2","ValueThatMatters"
"Something","SomethingElse","1E53"
"AnotherItem","Whatever","4E12"
'@ | ConvertFrom-Csv
# in order to make Excel see the values as Text and not convert them into scientific numbers
$csv | ForEach-Object {
    # add a TAB character in front of the value in the column
    $_.ValueThatMatters = "`t{0}" -f $_.ValueThatMatters
}
$csv | Export-Csv -Path 'X:\ExcelFriendly.csv' -UseCulture -NoTypeInformation
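If you later re-import that Excel-friendly file in PowerShell, you may want to strip the leading TAB again; a small sketch of that round trip (the path matches the example above):
$csv = Import-Csv -Path 'X:\ExcelFriendly.csv' -UseCulture
$csv | ForEach-Object {
    # remove the TAB that was prepended for Excel's benefit
    $_.ValueThatMatters = $_.ValueThatMatters.TrimStart("`t")
}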

Remove commas from numbers in a CSV

I have folder info for all user folders. It is dumped out to a CSV file as follows:
Servername, F:\Users\user, 9,355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29
We are unable to work with the data as-is due to the thousands separator in the third column. I could run the report scripts again, but we have a lot of file servers and a large number of users on one in particular, so running it again would be very time consuming. The reason the commas are there is that the data was written as a string, not a number.
I can import and convert; the only problem is that any number over 1,000 will be wrong, and then all other data is one column off. I would like to remove any comma between two digits. It doesn't seem like it would be that hard to do with PowerShell, but I am not having any luck finding anything.
If you assume that columns of data are comma plus space separated and your numbers have no spaces, you can use the -replace operator for this.
$line = 'Servername, F:\Users\user, 9,355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29'
$line -replace '(?<=\d),(?=\d)'
If you are reading the data from a file, you can read the data with Get-Content, replace your data, and update the file with Set-Content.
(Get-Content file.csv) -replace '(?<=\d),(?=\d)' | Set-Content file.csv
If the file is large, you can utilize the faster switch statement.
$data = switch -regex -file file.csv {
    '(?<=\d),(?=\d)' { $_ -replace '(?<=\d),(?=\d)' }
    default { $_ }
}
$data | Set-Content file.csv
Explanation:
(?<=\d) uses a positive lookbehind assertion (?<=) that matches a single digit \d.
(?=\d) uses a positive lookahead assertion (?=) that matches a single digit. You could replace this with (?=\d{3}) to match 3 consecutive digits after the comma.
Since you want to replace the target comma with empty string, you do not need a replacement string.
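To illustrate the stricter lookahead variant (a quick demo, not from the original answer):
$line = 'Servername, F:\Users\user, 9,355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29'
# only removes a comma sitting between a digit and three following digits (a thousands separator)
$line -replace '(?<=\d),(?=\d{3})'
# Servername, F:\Users\user, 9355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29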
Typically, it would be best to stick with commands that work with CSV data or files. However, if your data contains commas and you aren't qualifying your text, it may be difficult to distinguish between data and delimiters. If you have a clear way of making that distinction, you are better off using ConvertFrom-Csv for already read data or Import-Csv for files. You will need to define headers either in the files or in the command.
EDIT
It was my oversight that the , in the dataset is not delimited, which causes this answer not to work as expected, since the comma is seen as a column separator when parsing the CSV. I'm going to leave it, as it does explain how to generally manipulate the data as you'd expect, if the column data were escaped properly. However, @AdminOfThings' answer below should work for your specific case here, and will fix the erroneously defined column without relying on parsing the CSV content as a CSV first.
Import the data using Import-Csv, then remove any , in the third column. This assumes that you have no values where , is the decimal separator:
If you have headers in the CSV, you won't need to define header names or get fancy with writing the CSV back out:
# parentheses force the file to be read completely before Export-Csv rewrites the same path
(Import-Csv -Path \path\to\file.csv) | ForEach-Object {
    $_.ColumnName = $_.ColumnName -replace ','
    $_    # pass the modified row on, or nothing reaches Export-Csv
} | Export-Csv -NoTypeInformation -Path \path\to\file.csv
The way this works is that we import the CSV as an operable PSCustomObject, then for each line we take whatever the column name with the size is and remove the , from it. Finally, we export the modified PSCustomObject back out to the original CSV.
If you don't have headers, it gets a little trickier since we have to define temporary headers, but Export-Csv doesn't have an option to skip writing out headers:
(Import-Csv -Path \path\to\file.csv -Header Col1, Col2, Col3, Col4, Col5, Col6, Col7) |
    ForEach-Object {
        $_.Col3 = $_.Col3 -replace ','
        $_    # pass the modified row on
    } | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1 |
    Set-Content -Path \path\to\file.csv
This does the same thing as the first block of code, but since we don't want to export the temporary headers, we have to get creative. First, note we reference the target column with the temporary header name. Instead of piping the modified CSV object right to Export-Csv, first we want to convert the object to CSV using ConvertTo-Csv. We then use Select-Object to skip the first line of the converted CSV text, which is the header, so we just have the row data and column values. Finally, we use Set-Content to write the CSV text without the header back to the original file.

Filtering data from CSV file with PowerShell

I have a huge CSV file where the first line contains the headers of the data. Because of the file size, I can't open it with Excel or similar tools. I need to filter out just the rows I need: I want to create a new CSV file that contains only the data where Header3 = "TextHere". Everything else is filtered away.
I have tried Get-Content | Select-String | Out-File 'newfile.csv' in PowerShell, but it lost the header row and also messed up the data, putting values into the wrong fields. There are empty fields included in the data, and I believe that is what's messing it up. When I tried Get-Content with -First or -Last, the data seemed to be in order.
I have no prior experience handling big data files or PowerShell. Options other than PowerShell are also possible, as long as they are free for non-commercial use.
Try it like this (modify the delimiter if necessary):
import-csv "c:\temp\yourfile.csv" -delimiter ";" | where Header3 -eq "TextHere" | export-csv "c:\temp\result.csv" -delimiter ";" -notype
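If the simplified Where syntax isn't available (it requires PowerShell 3.0+), the script-block form does the same thing; either way, Import-Csv streams row by row, so even a huge file shouldn't exhaust memory:
import-csv "c:\temp\yourfile.csv" -delimiter ";" |
    Where-Object { $_.Header3 -eq "TextHere" } |
    export-csv "c:\temp\result.csv" -delimiter ";" -notype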

Powershell import CSV with multiple delimiters

I've got a CSV that has multiple delimiters and the following format:
groupname;user1,user2;user3
groupname;user1,user2;user3,users4
How can I add the users to the AD groups? All the group names in the CSV end with a ";" separator, and the users use the "," separator.
That's going to make a jagged array if you try to split it at the ; and then split the member list at the commas during the import. It will be difficult to import as CSV because you won't have a consistent number of elements in each record.
I'd do the Import-Csv using the ; as the delimiter, and then split the member list at the commas when it's time to add the members; see the sketch below.
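A minimal sketch of that idea; the file name 'groups.csv' and the use of Add-ADGroupMember (ActiveDirectory module) are assumptions:
Get-Content 'groups.csv' | ForEach-Object {
    $parts = $_ -split ';'                 # first element is the group name
    $group = $parts[0]
    # everything after the group name is member lists; flatten the comma-separated pieces
    $users = ($parts[1..($parts.Count - 1)] -join ',') -split ',' | Where-Object { $_ }
    Add-ADGroupMember -Identity $group -Members $users
}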
Do the header and data still match up, even with different delimiters? If so, I'd replace the odd delimiter with a comma and then do the import. Unless the file is enormous, I'd simply make a clean second copy and import that.
What about this:
Import-Csv -Path $inputfile -Delimiter ';' | export-csv $outfile -Delimiter ',' -NoClobber -NoTypeInformation
Import-Csv -Path $outfile -Delimiter ','
First take it in with ;, export that with ,, and then take the whole bunch back in as a comma-separated file.

PowerShell: Read text, regex sort, write output to file and formatting

I am a PowerShell novice and have run into a challenge in reading, sorting, and outputting a CSV file. The input CSV has no headers; the data is as follows:
05/25/2010,18:48:33,Stop,a1usak,10.128.212.212
05/25/2010,18:48:36,Start,q2uhal,10.136.198.231
05/25/2010,18:48:09,Stop,s0upxb,10.136.198.231
I use the following piping construct to read the file, sort and output to a file:
(Get-Content d:\vpnData\u62gvpn2.csv) | %{ ,[regex]::Split($_, ",") } | sort @{Expression={$_[3]}}, @{Expression={$_[1]}} | Out-File d:\vpnData\u62gvpn3.csv
The new file is written with the following format:
05/25/2010
07:41:57
Stop
a0uaar
10.128.196.160
05/25/2010
12:24:24
Start
a0uaar
10.136.199.51
05/25/2010
20:00:56
Stop
a0uaar
10.136.199.51
What I would like to see in the output file is a format similar to the original input file, with comma delimiters:
05/25/2010,07:41:57,Stop,a0uaar,10.128.196.160
05/25/2010,12:24:24,Start,a0uaar,10.136.199.51
05/25/2010,20:00:56,Stop,a0uaar,10.136.199.51
But I can't quite seem to get there. I'm almost of the mind that I'll have to write another segment to read the newly produced file and reset its contents to the preferred format for further processing.
Thoughts?
So you want to sort on the fourth and second columns, then write out a CSV file?
You can use Import-Csv to pull the file into memory, specifying the column names with the -Header argument. The Export-Csv command, however, will write a header row out to the destination file and wrap the values in double quotes, which you probably don't want.
This works, though:
import-csv -header d,t,s,n,a test.csv |
    sort n,t |
    %{ write-output ($_.d + "," + $_.t + "," + $_.s + "," + $_.n + "," + $_.a) }
(I've wrapped it onto multiple lines for readability.)
If you redirect the output of that back to a file, it should do what you want.
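For instance, by appending Set-Content (the target file name comes from the question):
import-csv -header d,t,s,n,a test.csv |
    sort n,t |
    %{ write-output ($_.d + "," + $_.t + "," + $_.s + "," + $_.n + "," + $_.a) } |
    Set-Content d:\vpnData\u62gvpn3.csv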
You can also use ConvertFrom-Csv in a similar way:
ConvertFrom-Csv -Header date, time, status, user, ip @"
05/25/2010,18:48:33,Stop,a1usak,10.128.212.212
05/25/2010,18:48:36,Start,q2uhal,10.136.198.231
05/25/2010,18:48:09,Stop,s0upxb,10.136.198.231
"@