Export comma-separated text file to csv and maintain leading zeros - powershell

I have 3 .txt files that each need to be converted into .csv files. Each file has 12 columns and some of these columns have data with leading zeroes. These zeroes need to remain. Is there a way through PowerShell to write a loop that will export each of these to a .csv and maintain the leading zeros?
The closest thing I could do was to export them one at a time, but this doesn't maintain the leading zeros that I need.
Import-Csv C:\AcctsLog.txt -Delimiter ";" | Export-Csv C:\AcctsLog.csv
A sample line would be something like:
Joe Smith;1933 Test Lane;Apt 34;Los Angeles;CA;90003-3444;0000000023;0002;New Car;SmithJoe@yahoo.com;00934200034006700213;0000666666

See if this works with your data:
Import-Csv C:\AcctsLog.txt -Delimiter ';' -Header (1..12) |
ConvertTo-Csv -NoTypeInformation | select -Skip 1 |
Set-Content C:\AcctsLog.csv
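The question asked for a loop over several files. A minimal sketch of how the pipeline above could be wrapped for that is shown here; the function name Convert-DelimitedTextToCsv and its parameters are hypothetical, and the 12-column count and ';' delimiter are taken from the question.

```powershell
# Hypothetical helper (not from the original answer): converts each
# semicolon-delimited text file to a header-less CSV next to it.
function Convert-DelimitedTextToCsv {
    param(
        [Parameter(Mandatory)] [string[]] $Path,
        [int] $ColumnCount = 12   # assumption: 12 columns, as in the question
    )
    foreach ($file in $Path) {
        $target = [System.IO.Path]::ChangeExtension($file, '.csv')
        # Temporary numeric headers let Import-Csv parse a file without a header row;
        # every value is read as a string, so leading zeros are kept.
        Import-Csv -Path $file -Delimiter ';' -Header (1..$ColumnCount) |
            ConvertTo-Csv -NoTypeInformation |
            Select-Object -Skip 1 |          # drop the temporary header line
            Set-Content -Path $target
        $target                              # emit the path of the file written
    }
}
```

It could then be called as Convert-DelimitedTextToCsv -Path 'C:\AcctsLog.txt', 'C:\SecondFile.txt' (the second path is a placeholder).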

If you explicitly want Excel to show the leading zeros, you have to save the data as an Excel file (otherwise Excel strips leading zeros from values that it interprets as numbers when opening a CSV). You could paste the data into Excel after formatting the cells as Text, then save the files as Excel files. But if you want CSV files, go with mjolinor's answer, since it produces CSV files with the leading zeros, exactly as you asked.
To work with Excel you have to create an Excel ComObject. Then you can get the content of your file, replace the semicolons with tabs, pipe to Clip, and paste right into Excel (after creating a workbook and formatting the 12 columns that you need). Should be pretty simple:
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $true
$FileList = @("C:\Temp\AcctsLog.txt","C:\Temp\SecondFile.txt","C:\Temp\ThirdFile.txt")
ForEach($File in $FileList){
[void]$Excel.Workbooks.Add()
$Excel.ActiveSheet.Range("A:L").NumberFormat = '@' # '@' is Excel's Text format
(Get-Content $File) -replace ';', "`t" | Clip
$Excel.ActiveSheet.Paste()
$Excel.ActiveWorkbook.SaveAs(($File -replace "txt$","xlsx"))
$Excel.ActiveWorkbook.Close($false)
}
$Excel.Quit()

There is a simple way to maintain the leading zeroes in Excel.
Prefix the value with an apostrophe (') when you type it into the cell, and the zeroes will be retained.
For example, if I want 0000000023:
Type '0000000023 into the cell.
The ' symbol makes Excel treat the entry as text, so the zeroes are kept as long as you type it before the value.

Related

PowerShell and CSV: Stop CSV from turning text data into Scientific Notation

I have a CSV column with alpha numerical combinations in a column.
I am later going to use this csv file in a PowerShell script by importing the data.
Examples: 1A01, 1C66, 1E53.
Now before putting these values in, I made sure to format the column as text.
At first it works: I input the data and save, then test the import in PowerShell, and all data shows up valid, including 1E53. But let's say I edit the file again later to add data and then save and close. When I re-import into PowerShell, 1E53 comes in as 1.00E+53. How can I prevent this permanently? Note that the column is filled with codes and there are lots of #E##.
Your issue is not with PowerShell, it's with Excel. For a demonstration, enter 1E53 into Excel and then save that Excel file as a CSV file. You will see that the value is now changed to 1.00E+53.
How to fix this?
There are a few ways of disabling scientific notation:
https://superuser.com/questions/452832/turn-off-scientific-notation-in-excel
https://www.logicbroker.com/excel-scientific-notation-disable-prevent/
I hope some of them work for you.
I think you can rename the file to .txt instead of .csv, and Excel may then treat it differently.
Good Luck
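To check the claim that PowerShell itself leaves such codes alone, here is a small sketch using made-up sample data (the column names Code and Label are hypothetical). The CSV cmdlets read every field as a string, so a value like 1E53 should survive a round trip as long as Excel does not re-save the file in between.

```powershell
# Sample data is invented for illustration only.
$rows = @'
Code,Label
1E53,first
0007,second
'@ | ConvertFrom-Csv

# Show each code together with the .NET type PowerShell assigned to it.
$rows | ForEach-Object { '{0} ({1})' -f $_.Code, $_.Code.GetType().Name }

# Round trip through CSV text and back without Excel involved.
$roundTrip = $rows | ConvertTo-Csv -NoTypeInformation | ConvertFrom-Csv
```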
As commented:
You will probably load the csv from file:
$csv = Import-Csv -Path 'X:\original.csv' -UseCulture
The code below uses a dummy csv in a Here-String here:
$csv = @'
"Column1","Column2","ValueThatMatters"
"Something","SomethingElse","1E53"
"AnotherItem","Whatever","4E12"
'@ | ConvertFrom-Csv
# in order to make Excel see the values as Text and not convert them into scientific numbers
$csv | ForEach-Object {
# add a TAB character in front of the values in the column
$_.ValueThatMatters = "`t{0}" -f $_.ValueThatMatters
}
$csv | Export-Csv -Path 'X:\ExcelFriendly.csv' -UseCulture -NoTypeInformation
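One consequence worth keeping in mind: if the Excel-friendly file is later read back into PowerShell, the TAB prefix is likely still part of each value. A hedged sketch of stripping it again on import follows; the in-memory CSV text stands in for the exported file, and the column names are reused from the example above.

```powershell
# Build CSV text equivalent to what the export above would write (in memory, for illustration).
$excelFriendly = @(
    [pscustomobject]@{ Column1 = 'Something';   ValueThatMatters = "`t1E53" }
    [pscustomobject]@{ Column1 = 'AnotherItem'; ValueThatMatters = "`t4E12" }
) | ConvertTo-Csv -NoTypeInformation

# On re-import, remove any leading TAB (character code 9) before using the value.
$rows = $excelFriendly | ConvertFrom-Csv | ForEach-Object {
    $_.ValueThatMatters = $_.ValueThatMatters.TrimStart([char]9)
    $_   # emit the cleaned row
}
```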

Using Powershell to write out two header rows without deleting existing data

I have a need to generate two header rows to an existing csv file because the system where the csv will be uploaded needs the two header rows. The csv file will contain data that I want to keep.
I have been testing a powershell script to do this, and I can write a single row of headers, but am struggling to write two rows.
Below is the powershell script I am currently trying to build out.
$file = "C:\Users\_svcamerarcgis\Desktop\Test.csv"
$filedata = import-csv $file -Header WorkorderETL 'n ICFAORNonICFA, WONUmber, Origin
$filedata | export-csv $file -NoTypeInformation
The end result I'm looking for should be as follows:
WorkorderETL
ICFAORNonICFA, WONUmber, Origin
xxx,yyy,zzz
The sole purpose of Import-Csv's -Header parameter is to provide an array of column names to serve as the property names of the custom objects that the CSV rows are parsed into - you cannot repurpose that for special output formatting for later exporting.
You can use the following approach instead, bypassing the need for Import-Csv and Export-Csv altogether (PSv5+):
$file = 'C:\Users\User\OneDrive\Scripts\StackTesting\Test.csv'
# Prepend the 2-line header to the existing file content
# and save it back to the same file
# Adjust the encoding as needed.
@'
WorkorderETL
ICFAORNonICFA,WONUmber,Origin
'@ + (Get-Content -Raw $file) | Set-Content $file -NoNewline -Encoding utf8
To be safe, be sure to create a backup of the original file first.
Since the file is being read (in full) and rewritten in the same pipeline, there's a hypothetical chance of data loss if writing back to the input file gets interrupted.
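The backup advice can be folded into a small function. This is only a sketch: the name Add-CsvHeaderLines, its parameters, and the ".bak" suffix are assumptions made for illustration, not part of the original answer.

```powershell
# Hypothetical helper: copies the file to "<path>.bak" first, then prepends the header lines.
function Add-CsvHeaderLines {
    param(
        [Parameter(Mandatory)] [string]   $Path,
        [Parameter(Mandatory)] [string[]] $HeaderLines
    )
    # Keep a copy of the original in case the rewrite is interrupted.
    Copy-Item -LiteralPath $Path -Destination "$Path.bak" -Force

    # Read the whole file as one string, then write header + original content back.
    $body   = Get-Content -LiteralPath $Path -Raw
    $header = ($HeaderLines -join [Environment]::NewLine) + [Environment]::NewLine
    Set-Content -LiteralPath $Path -Value ($header + $body) -NoNewline -Encoding utf8
}
```

For the file in the question, the call would look like Add-CsvHeaderLines -Path $file -HeaderLines 'WorkorderETL', 'ICFAORNonICFA,WONUmber,Origin'.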
You may be better off handling this as a text file, considering you are just trying to add a single line at the top of the CSV:
$file = "C:\Users\User\OneDrive\Scripts\StackTesting\Test.csv"
$CSV = "c1r1, c2r1, c3r1 `nc1r2, c2r2, c3r2"
$filedata = Get-Content $file
$filedata = "WorkorderETL`n" + $CSV
$filedata | Out-File $file
This will result in the CSV file holding:
WorkorderETL
c1r1, c2r1, c3r1
c1r2, c2r2, c3r2
Which looks to be what you want.

Remove commas from numbers in a CSV

I have folder info for all user folders. It is dumped out to a CSV file as follows:
Servername, F:\Users\user, 9,355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29
We are unable to work with the data as is due to the thousands separator in the 3rd column. I could run the report scripts again, but we have a lot of file servers and a large number of users on one in particular, so running it again is very time consuming. The reason the commas are there is that the data was written as a string not a number.
I can import and convert, the only problem is that any number over 1000 will be wrong and then all other data is 1 column off. I would like to replace any comma between 2 numbers. It doesn't seem it would be that hard to do with PowerShell, but I am not having any luck finding anything.
If you assume that columns of data are comma plus space separated and your numbers have no spaces, you can use the -replace operator for this.
$line = 'Servername, F:\Users\user, 9,355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29'
$line -replace '(?<=\d),(?=\d)'
If you are reading the data from a file, you can read the data with Get-Content, replace your data, and update the file with Set-Content.
(Get-Content file.csv) -replace '(?<=\d),(?=\d)' | Set-Content file.csv
If the file is large, you can utilize the faster switch statement.
$data = switch -regex -file file.csv {
'(?<=\d),(?=\d)' { $_ -replace '(?<=\d),(?=\d)' }
default {$_}
}
$data | Set-Content file.csv
Explanation:
(?<=\d) uses a positive lookbehind assertion (?<=) that matches a single digit \d.
(?=\d) uses a positive lookahead assertion (?=) that matches a single digit. You could replace this with (?=\d{3}) to match 3 consecutive digits after the comma.
Since you want to replace the target comma with empty string, you do not need a replacement string.
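The two lookahead variants described above can be compared side by side. The sketch below reuses the sample line from the question; the value '7,12' is an invented example of a comma between digits that is not a thousands separator.

```powershell
$line = 'Servername, F:\Users\user, 9,355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29'

# Loose form: removes any comma that sits directly between two digits.
$loose  = $line -replace '(?<=\d),(?=\d)'

# Stricter form: the comma must be followed by three digits.
$strict = $line -replace '(?<=\d),(?=\d{3})'

# With the thousands separator gone, splitting on comma plus optional space gives 7 fields.
$fields = $strict -split ',\s*'

# Invented value: the strict pattern leaves it alone, the loose one would not.
$untouched = '7,12' -replace '(?<=\d),(?=\d{3})'
$collapsed = '7,12' -replace '(?<=\d),(?=\d)'
```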
Typically, it would be best to stick with commands that work with CSV data or files. However, if your data contains commas and you aren't qualifying your text, it may be difficult to distinguish between data and delimiters. If you have a clear way of making that distinction, you are better off using ConvertFrom-Csv for already read data or Import-Csv for files. You will need to define headers either in the files or in the command.
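Once the stray commas are removed, the cleaned text can be handed to ConvertFrom-Csv with headers supplied in the command. A sketch under assumptions: the header names below are invented (the question does not give any), and the size is parsed with the invariant culture because it uses '.' as the decimal separator.

```powershell
$raw = 'Servername, F:\Users\user, 9,355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29'

# Hypothetical column names; adjust to whatever the report columns really mean.
$headers = 'Server', 'Folder', 'Size', 'Files', 'Folders', 'Created', 'Modified'

# Remove the thousands separator first, then parse the line as CSV.
$row = ($raw -replace '(?<=\d),(?=\d)') | ConvertFrom-Csv -Header $headers

# Strip everything except digits and the decimal point, then convert to a number.
$sizeMB = [double]::Parse(($row.Size -replace '[^\d.]'), [cultureinfo]::InvariantCulture)
```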
EDIT
It was my oversight that the , in the dataset is not quoted, which causes this answer to not work as expected, because the comma is seen as a column separator when parsing the CSV. I'm going to leave it up, as it does explain how to generally manipulate the data as you'd expect if the column data were escaped properly. However, @AdminOfThings' answer should work for your specific case here, and will fix the erroneously defined column without relying on parsing the CSV content as a CSV first.
Import the data using Import-Csv, then remove any , in the third column. This assumes that you have no values where , is the decimal separator:
If you have headers in the CSV, you won't need to define header names or get fancy with writing the CSV back out:
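# Parentheses make Import-Csv finish reading before Export-Csv overwrites the same file
(Import-Csv -Path \path\to\file.csv) | ForEach-Object {
$_.ColumnName = $_.ColumnName -replace ','
$_ # emit the modified row so it reaches Export-Csv
} | Export-Csv -NoTypeInformation -Path \path\to\file.csv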
The way this works is that we import the CSV as an operable PSCustomObject, then for each line we take whatever the column name with the size is and remove the , from it. Finally, we export the modified PSCustomObject back out to the original CSV.
If you don't have headers, it gets a little trickier since we have to define temporary headers, but Export-Csv doesn't have an option to skip writing out headers:
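# Parentheses make Import-Csv finish reading before Set-Content overwrites the same file
(Import-Csv -Path \path\to\file.csv -Header Col1, Col2, Col3, Col4, Col5, Col6, Col7) |
ForEach-Object {
$_.Col3 = $_.Col3 -replace ','
$_ # emit the modified row
} | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1 |
Set-Content -Path \path\to\file.csv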
This does the same thing as the first block of code, but since we don't want to export the temporary headers, we have to get creative. First, note we reference the target column with the temporary header name. Instead of piping the modified CSV object right to Export-Csv, first we want to convert the object to CSV using ConvertTo-Csv. We then use Select-Object to skip the first line of the converted CSV text, which is the header, so we just have the row data and column values. Finally, we use Set-Content to write the CSV text without the header back to the original file.

Powershell .txt to CSV Formatting Irregularities

I have a large number of .txt files pulled from pdf and formatted with comma delimiters.
I'm trying to append these text files to one another with a new line between each. Earlier in the formatting process I took multi-line input and formatted it into one line with entries separated by commas.
Yet when appending one txt file to another in a csv the previous formatting with many line breaks returns. So my final output is valid csv, but not representative of each text file being one line of csv entries. How can I ensure the transition from txt to csv retains the formatting of the txt files?
I've used Export-CSV, Add-Content, and the >> operator with similar outcomes.
To summarize, individual .txt files with the following format:
,927,Dance like Misty"," shine like Lupita"," slay like Serena. speak like Viola"," fight like Rosa! ,United States ,16 - 65+
Turn into the following when appended together in a csv file:
,927
,Dance like Misty"," shine like Lupita"," slay like Serena. speak like Viola"," fight like Rosa!
,United States
,16 - 65+
How the data was prepped:
Removing new lines
Foreach($f in $FILES){(Get-Content $f -Raw).Replace("`n","") | Set-Content $f -Force}
Adding one new line to the end of each txt file
foreach($f in $FILES){Add-Content -Path $f -value "`n" |Set-Content $f -Force}
Trying to Convert to CSV, one text file per line with comma delimiter:
cat $FILES | sc csv.csv
Or
foreach($f in $FILES){import-csv $f -delimiter "," | export-csv $f}
Or
foreach($f in $FILES){ Export-Csv -InputObject $f -append -path "test.csv"}
All of these return a CSV with each comma-separated value on a new line, instead of each txt file as one line.
This was resolved by realizing that even though Notepad was showing no newlines, there were still hidden carriage return characters. On loading the apparently one-line csv files into Notepad++ and toggling "show hidden characters", this oversight was evident.
Replacing both the carriage return (\r) and line feed (\n) characters before converting to CSV fixed it:
Foreach($f in $FILES){(Get-Content $f -Raw).Replace("`n","").Replace("`r","") | Set-Content $f -Force}
The CSV conversion process then worked as planned using the following:
cat $FILES | sc final.csv
Final verdict --
The text file that appeared to be a one line entry ready to become CSV
,927,Dance like Misty"," shine like Lupita"," slay like Serena. speak like Viola"," fight like Rosa! ,United States ,16 - 65+
still had carriage return characters between each value. This became evident when I tried another text editor with a "show hidden characters" feature.
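The flatten-then-combine steps can also be done in one pass without rewriting the source files. This is a sketch only; the function name Merge-FilesAsCsvLines and its parameters are hypothetical. It uses the regex-based -replace operator, where \r and \n do mean carriage return and line feed.

```powershell
# Hypothetical helper: writes one output line per input file.
function Merge-FilesAsCsvLines {
    param(
        [Parameter(Mandatory)] [string[]] $Path,
        [Parameter(Mandatory)] [string]   $Destination
    )
    $lines = foreach ($f in $Path) {
        # Remove CRLF, lone CR, and lone LF so each file collapses to a single line.
        (Get-Content -LiteralPath $f -Raw) -replace '\r\n?|\n', ''
    }
    # Set-Content writes each string in $lines as its own line.
    Set-Content -LiteralPath $Destination -Value $lines
}
```

With the variables from the question, the call would be Merge-FilesAsCsvLines -Path $FILES -Destination final.csv.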

How can I alternate column headers in a tab delimited file?

I have a tab-delimited txt file and I need to switch the first and second column names (without switching the columns' data). In other words, I need to rename A(Id) to B(ExternalId) and B(ExternalId) to A(Id). Other columns in the file (other data) should stay unchanged. I'm very new to PowerShell, please advise. As I understand it, I need to use the import/export csv cmdlets.
I tried this, but it's not working the right way...
Import-Csv 'C:\original_users.txt' |
Select-Object Id, @{Name="ExternalId";Expression={$_."Id"}}; Select-Object ExternalId, @{Name="Id";Expression={$_."ExternalId"}} |
Export-Csv 'C:\changed_users.txt'
The Import-CSV and Export-CSV cmdlets have their strengths but this might not be one of them. The latter cmdlet would introduce quoting that might not be in your original file and that might not be desired.
Either way, why not just do some text manipulation on the first line? Let's read in the file and output the edited first line, followed by the remainder of the file. This sample uses a new location, but you could easily write it back to the same file.
# Get the full file into a variable
$fullFile = Get-Content "c:\temp\mockdata.csv"
# Parse the first line into a column array
$columns = $fullFile[0].Split("`t")
# Rebuild the header by switching the columns order as desired.
$newHeader = ($columns[1],$columns[0] + ($columns | Select-Object -Skip 2)) -join "`t"
# Write the header back to file then the rest of the data.
$outputPath = "C:\somepath.txt"
$newHeader | Set-Content $outputPath
$fullFile | Select-Object -Skip 1 | Add-Content $outputPath
This also preserves the presence of other columns and their data.
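The same header swap can be expressed as a function that works on lines already in memory, which makes it easy to try without touching a file. This is a sketch; the name Switch-FirstTwoHeaders and its parameters are hypothetical.

```powershell
# Hypothetical helper: swaps the first two column names in the header line only.
function Switch-FirstTwoHeaders {
    param(
        [Parameter(Mandatory)] [string[]] $Lines,
        [string] $Delimiter = "`t"
    )
    # -split takes a regex, so escape the delimiter to treat it literally.
    $columns = $Lines[0] -split [regex]::Escape($Delimiter)
    if ($columns.Count -ge 2) {
        $first       = $columns[0]
        $columns[0]  = $columns[1]
        $columns[1]  = $first
    }
    # Emit the rebuilt header followed by the untouched data lines.
    @($columns -join $Delimiter) + @($Lines | Select-Object -Skip 1)
}
```

It could be combined with the file handling above as Switch-FirstTwoHeaders -Lines (Get-Content 'C:\original_users.txt') | Set-Content $outputPath.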