How to export to "non-standard" CSV with PowerShell

I need to convert a file with this format:
2015.03.27,09:00,1.08764,1.08827,1.08535,1.08747,8941
2015.03.27,10:00,1.08745,1.08893,1.08604,1.08762,7558
to this format
2015.03.27,1.08764,1.08827,1.08535,1.08747,1
2015.03.27,1.08745,1.08893,1.08604,1.08762,1
I started with this code but can't see how to achieve the full transformation:
Import-Csv in.csv -Header Date,Time,O,H,L,C,V | Select-Object Date,O,H,L,C,V | Export-Csv -path out.csv -NoTypeInformation
(Get-Content out.csv) | % {$_ -replace '"', ""} | out-file -FilePath out.csv -Force -Encoding ascii
which outputs
Date,O,H,L,C,V
2015.03.27,1.08745,1.08893,1.08604,1.08762,8941
2015.03.27,1.08763,1.08911,1.08542,1.08901,7558
After that I need to
remove the header (I tried -NoHeader which is not recognized)
replace last column with 1.
How can I do that as simply as possible (ideally without looping through each row)?
Update: I have since simplified the requirement. I just need to replace the last column with a constant.

Ok, this could be one massive one-liner... I'm going to do line breaks at the pipes for sanity reasons though.
Import-Csv in.csv -header Date,Time,O,H,L,C,V|select * -ExcludeProperty time|
%{$_.date = [datetime]::ParseExact($_.date,"yyyy.MM.dd",$null).tostring("yyMMdd");$_}|
ConvertTo-Csv -NoTypeInformation|
select -skip 1|
%{$_ -replace '"'}|
Set-Content out.csv -encoding ascii
Basically, I import the CSV, exclude the Time column, parse the Date column into an actual [datetime] object, and then convert it back to a string in the desired format. I then pass the modified object (with the newly formatted date) down the pipe to ConvertTo-Csv, skip the first line (the headers that you don't want), remove the quotes, and finally write the result to a file with Set-Content (faster than Out-File).
Edit: Just saw your update... to do that we'll just change the last column to 1 at the same time we modify the date column by adding $_.v=1;...
%{$_.date = [datetime]::ParseExact($_.date,"yyyy.MM.dd",$null).tostring("yyMMdd");$_.v=1;$_}|
Whole script modified:
Import-Csv in.csv -header Date,Time,O,H,L,C,V|select * -ExcludeProperty time|
%{$_.date = [datetime]::ParseExact($_.date,"yyyy.MM.dd",$null).tostring("yyMMdd");$_.v=1;$_}|
ConvertTo-Csv -NoTypeInformation|
select -skip 1|
%{$_ -replace '"'}|
Set-Content out.csv -encoding ascii
Oh, and this has the added benefit of not having to read the file in, write the file to the drive, read that file in, and then write the file to the drive again.
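For the simplified requirement in the update (date left unchanged, time column dropped, last column replaced with a constant), one possible alternative is to skip CSV parsing and apply a single regex to the raw lines. This is only a sketch: it assumes no field contains a quoted comma, and the temp-file paths are placeholders standing in for in.csv and out.csv.

```powershell
# Hypothetical sample files in the temp directory (stand-ins for in.csv / out.csv).
$inPath  = Join-Path ([IO.Path]::GetTempPath()) 'in.csv'
$outPath = Join-Path ([IO.Path]::GetTempPath()) 'out.csv'
@(
    '2015.03.27,09:00,1.08764,1.08827,1.08535,1.08747,8941'
    '2015.03.27,10:00,1.08745,1.08893,1.08604,1.08762,7558'
) | Set-Content $inPath -Encoding ascii

# -replace is applied to every element of the array, so no explicit loop is needed.
# $1 = first field (date); the second field (time) is dropped;
# $2 = the middle fields; the last field is replaced by the constant 1.
(Get-Content $inPath) -replace '^([^,]+),[^,]+,(.*),[^,]+$', '$1,$2,1' |
    Set-Content $outPath -Encoding ascii

Get-Content $outPath
```

Because the lines are never turned into objects, there is no header to skip and there are no quotes to strip.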

Related

Powershell Replace Regex Import CSV File

I have a CSV file named test.csv (C:\testing\test.csv) in this format:
File Name,Location,Added (GMT),Created (GMT),Last Modified (GMT),File Size (Bytes),File Size,Extension,Incident Type
10-MB-Test (1).docx,\\blah\Test 3,10/8/2020 21:13,10/8/2020 19:33,10/8/2020 16:26,10723331,10.23 (MB),docx,low_data_discover
10-MB-Test (1).xlsx,\\blah2\Test 3\,10/8/2020 21:14,10/8/2020 19:33,10/8/2020 16:25,9566567,9.12 (MB),xlsx,high_data_discover
1-MB-Test.docx,\\blah3\Test 3\,10/8/2020 21:13,10/8/2020 19:33,10/8/2020 16:37,1045970,1021.46 (KB),docx,medium_data_discover
I'm trying to replace trailing "\" characters (if they exist) for values in the Location column with nothing using this Powershell code:
$file1 = import-csv -path "C:\testing\test.csv" | % {$_."Location" -replace "\\$",""} | Select-Object * | export-csv -NoTypeInformation "C:\testing\blah.csv"
However, when I run the code, the only output I get is a column named "Length" with a numerical value. Can you assist?
You're only sending the new string (updated location) down the pipeline. You can update each location and then export it at the end.
$file1 = import-csv -path "C:\testing\test.csv"
$file1 | ForEach-Object {$_.location = $_.location -replace '\\$'}
$file1 | export-csv -NoTypeInformation "C:\testing\blah.csv"
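The "Length" column can be reproduced on a small scale. The sketch below uses inline sample rows (a reduced, hypothetical version of the file in the question) and ConvertTo-Csv instead of Export-Csv so that nothing is written to disk.

```powershell
# Reduced sample data; single-quoted here-string keeps the backslashes literal.
$rows = @'
File Name,Location
a.docx,\\blah\Test 3
b.xlsx,\\blah2\Test 3\
'@ | ConvertFrom-Csv

# Emitting only the replaced value sends [string] objects down the pipeline.
# The CSV cmdlets serialize an object's properties, and for a string that is Length.
$stringsOnly = $rows |
    ForEach-Object { $_.Location -replace '\\$' } |
    ConvertTo-Csv -NoTypeInformation

# Assigning the new value and then emitting $_ keeps the whole row.
$fixed = $rows |
    ForEach-Object {
        $_.Location = $_.Location -replace '\\$'
        $_
    } |
    ConvertTo-Csv -NoTypeInformation

$stringsOnly   # header is "Length"
$fixed         # header is "File Name","Location"
```

Emitting `$_` inside the block is an alternative to the two-step version above; both end up exporting the full objects rather than bare strings.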

Remove String from Character from column in CSV using Powershell

I have a CSV file containing two columns: server name with domain, and date
servername.domain.domain.com,10/15/2018 6:28
servername1.domain.domain.com,10/13/2018 7:28
I need to strip the domain from the fully qualified name so that only the short name remains, while keeping the second column untouched, either by writing to a new CSV or by removing the domain in place. The result should look like this:
servername,10/15/2018 6:28
servername1,10/13/2018 7:28
I have this:
Import-Csv "filename.csv" -Header b1,b2 |
% {$_.b1.Split('.')[0]} |
Set-Content "filename1.csv"
This works great, but the problem is the new CSV is missing the 2nd column. I need to send the second column to the new CSV file as well.
Use a calculated property to replace the property you want changed, but leave everything else untouched:
Import-Csv 'input.csv' -Header 'b1', 'b2' |
Select-Object -Property @{n='b1';e={$_.b1.Split('.')[0]}}, * -Exclude b1 |
Export-Csv 'output.csv' -NoType
Note that you only need to use the parameter -Header if your CSV data doesn't already have a header line. Otherwise you should remove the parameter.
If your input file doesn't have headers and you want to create the output file also without headers you can't use Export-Csv, though. Use ConvertTo-Csv to create the CSV text output, then skip over the first line (to remove the headers) and write the rest to the output file with Set-Content.
Import-Csv 'input.csv' -Header 'b1', 'b2' |
Select-Object -Property @{n='b1';e={$_.b1.Split('.')[0]}}, * -Exclude b1 |
ConvertTo-Csv -NoType |
Select-Object -Skip 1 |
Set-Content 'output.csv'
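As a quick check of the headerless variant, here is a sketch that runs the same calculated property over the two sample rows from the question, held in memory instead of in a file. The final quote-stripping step is an addition of mine to match the unquoted output the question asks for, and it assumes no value contains a comma or a quote.

```powershell
$raw = @(
    'servername.domain.domain.com,10/15/2018 6:28'
    'servername1.domain.domain.com,10/13/2018 7:28'
)

$lines = $raw |
    ConvertFrom-Csv -Header 'b1', 'b2' |
    # Replace b1 with its first dot-separated part; keep every other property as-is.
    Select-Object -Property @{n='b1'; e={ $_.b1.Split('.')[0] }}, * -ExcludeProperty b1 |
    ConvertTo-Csv -NoTypeInformation |
    Select-Object -Skip 1 |                  # drop the header line
    ForEach-Object { $_ -replace '"' }       # drop the quotes ConvertTo-Csv adds

$lines
```

Listing the calculated property before `*` is what keeps b1 as the first column.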

I use -NoTypeInformation so why do I get header back when using Out-File?

I filtered by date this file data1.csv
2017.11.1,09:55,1.1,1.2,1.3,1.4,1
2017.11.2,09:55,1.5,1.6,1.7,1.8,2
I don't get a header with -NoTypeInformation:
$CutOff = (Get-Date).AddDays(-2)
$filePath = "data1.csv"
$Data = Import-Csv $filePath -Header Date,Time,A,B,C,D,E
$Data2 = $Data | Where-Object {$_.Date -as [datetime] -gt $Cutoff} | convertto-csv -NoTypeInformation -Delimiter "," | % {$_ -replace '"',''}
But when rewriting with Out-File
$Data2 | Out-File "data2.csv" -Encoding utf8 -Force
I get header back as data2.csv contains:
Date,Time,A,B,C,D,E
2017.11.2,09:55,1.5,1.6,1.7,1.8,2
Why do I have Date,Time,A,B,C,D,E ?
-NoTypeInformation is not about the header but the data type of the rows in the file. Remove it to see what shows up. From Microsoft
Omits the type information header from the output. By default, the string in the output contains #TYPE followed by the fully-qualified name of the object type.
Emphasis mine.
CSVs need headers. That is why it is making one. If you don't want to see the header in the output use Select-Object -Skip 1 to remove it.
$Data |
Where-Object {$_.Date -as [datetime] -gt $Cutoff} |
ConvertTo-CSV -NoTypeInformation -Delimiter "," |
Select-Object -Skip 1 |
% {$_ -replace '"'}
I would not pipe Out-File to itself. You could pipe to Set-Content here just as well.
I am guessing this whole process is to keep the source file in the same state just with some lines filtered out based on date. You could skip most of this just by parsing the date out in each line.
$threshold = (Get-Date).AddDays(-2)
$filePath = "c:\temp\bagel.txt"
(Get-Content $filePath) | Where-Object{
$date,$null=$_.Split(",",2)
[datetime]$date -gt $threshold
} | Set-Content $filePath
Now you don't have to worry about PowerShell CSV object structure or output since we act on the raw data of the file itself.
That will take each line of the input file and filter it out if the parsed date is not later than the threshold. Change the encoding on the input and output cmdlets as you see necessary. What $date,$null=$_.Split(",",2) is doing is splitting the line on the comma into 2 parts. The first part becomes $date and, since this is just a filtering condition, we dump the rest of the line into $null.
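The raw-line filter can be tried on the two sample rows from the question. In this sketch the threshold is a fixed date so the result is reproducible (the answer uses a relative one), and the date is parsed with an explicit format string rather than a cast; both choices are mine, not part of the answer.

```powershell
$lines = @(
    '2017.11.1,09:55,1.1,1.2,1.3,1.4,1'
    '2017.11.2,09:55,1.5,1.6,1.7,1.8,2'
)

# Fixed threshold for the example; the answer computes (Get-Date).AddDays(-2).
$threshold = [datetime]'2017-11-01'
$culture   = [Globalization.CultureInfo]::InvariantCulture

$kept = $lines | Where-Object {
    # First field goes to $date, the remainder of the line is discarded.
    $date, $null = $_.Split(',', 2)
    # 'yyyy.M.d' accepts one- or two-digit months and days, as in the sample data.
    [datetime]::ParseExact($date, 'yyyy.M.d', $culture) -gt $threshold
}

$kept
```

Only the second row is later than the threshold, so it is the only line that survives the filter.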
Properly-formed CSV files must have column headers. Your use of -NoTypeInformation in generating the CSV does not affect column headers; instead, it affects whether the PowerShell object type information is included. If you Export-CSV without -NoTypeInformation, the first line of your CSV file will have a line that looks like #TYPE System.PSCustomObject, which you don't want if you're going to open the CSV in a spreadsheet program.
If you subsequently Import-CSV, the headers (Date, Time, A, B, C, D, E) are used to create the properties of a PSObject, so that you can refer to them using the standard dot notation (e.g., $CSV[$line].Date).
The ability to specify -Header on Import-CSV is essentially a "hack" to allow the cmdlet to handle files that are comma-separated, but which did not include column headers.
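The difference between the type-information line and the column header can be seen with a few in-memory rows. This is a minimal sketch with a shortened, hypothetical column list. As far as I know, PowerShell 6 and later omit the #TYPE line by default, so the switch only changes the output in Windows PowerShell.

```powershell
$rows = @(
    '2017.11.1,09:55,1.1'
    '2017.11.2,09:55,1.5'
) | ConvertFrom-Csv -Header Date, Time, A

# -NoTypeInformation suppresses the "#TYPE ..." line, not the column header.
$withHeader = $rows | ConvertTo-Csv -NoTypeInformation

# Skipping the first line is what removes the column header.
$noHeader = $withHeader | Select-Object -Skip 1

$withHeader[0]   # the column header line
$noHeader        # data rows only
```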

Convert all values in a csv column to integer (or remove leading zeroes) in PowerShell

I have a csv file with an ID_code column and the IDs have leading zeroes that I want to remove. I found out that if I convert the value to an integer, the leading zeroes should disappear but I don't know how to apply that to all values throughout the csv. This is what I tried but the resulting csv comes out blank:
Import-Csv C:\folder\myFile.csv |
ForEach-Object {
$_.ID_code = [convert]::ToInt32($_.ID_code, 10) } |
convertto-csv -NoTypeInformation | %{$_-replace '"', ""} |
out-file C:\folder\myFile2.csv
You could use a calculated property for that. You don't need to parse the value, though. Simply casting it to int should suffice:
Import-Csv 'C:\path\to\input.csv' |
select -Property @{n='ID_code';e={[int]$_.ID_code}},* -Exclude 'ID_code' |
Export-Csv 'C:\path\to\output.csv' -NoTypeInformation
Another option (since you're exporting the data back to a text file anyway) would be to just remove leading zeroes from the string value:
Import-Csv 'C:\path\to\input.csv' |
select -Property @{n='ID_code';e={$_.ID_code -replace '^0+'}},* -Exclude 'ID_code' |
Export-Csv 'C:\path\to\output.csv' -NoTypeInformation
If you know the position of the ID_code column you don't even need to import the CSV. If for instance the column is the first column in the CSV you could do the replacement like this:
(Get-Content 'C:\path\to\input.csv') -replace '^0+' |
Set-Content 'C:\path\to\output.csv'
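A likely reason the attempt in the question produced an empty file: an assignment statement writes nothing to the pipeline, so the ForEach-Object block there never emits any objects for ConvertTo-Csv to receive. The sketch below shows the in-place variant with the object emitted explicitly; the sample rows and the Name column are made up for illustration.

```powershell
$rows = @'
ID_code,Name
00042,alpha
00100,beta
'@ | ConvertFrom-Csv

$csvLines = $rows |
    ForEach-Object {
        $_.ID_code = [int]$_.ID_code   # the cast drops the leading zeroes
        $_                             # emit the modified row; the assignment alone outputs nothing
    } |
    ConvertTo-Csv -NoTypeInformation

$csvLines
```

Unlike the calculated-property version, this keeps the original column order without having to list ID_code first.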

Convert date format in CSV using PowerShell

I have two CSV files with 50 columns and more than 10K rows. One column holds a date-time value. In some records it looks like 01/18/2013 18:16:32 and in others like 16/01/2014 17:32.
I want to convert the column data to this form: 01/18/2013 18:16. In other words, I want to remove the seconds value, and I want to do it with a PowerShell script.
Sample data:
10/1/2014 13:18
10/1/2014 13:21
15/01/2014 12:03:19
15/01/2014 17:39:27
15/01/2014 18:29:44
17/01/2014 13:33:59
Since you're not going to convert to a sane date format anyway, you can just do a regex replace of that column:
Import-Csv foo.csv |
ForEach-Object {
$_.Date = $_.Date -replace '(\d+:\d+):\d+', '$1'
$_   # emit the modified row so Export-Csv receives it
} |
Export-Csv -NoTypeInformation foo-new.csv
You could also go the route of using date parsing, but that's probably a bit slower:
Import-Csv foo.csv |
ForEach-Object {
$_.Date = [datetime]::Parse($_.Date).ToString('MM/dd/yyyy HH:mm')
$_   # emit the modified row so Export-Csv receives it
} |
Export-Csv -NoTypeInformation foo-new.csv
If you're sure that there are no other timestamps or anything that could look like it elsewhere, you can also just replace everything that looks like it without even parsing as CSV:
$csv = Get-Content foo.csv -ReadCount 0
$csv = $csv -replace '(\d{2}/\d{2}/\d{4} \d{2}:\d{2}):\d{2}', '$1'
$csv | Out-File foo-new.csv
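The column-based regex can be checked against a few of the values from the question. In this sketch the column name Date is an assumption (the question never names the column), and the rows are inline rather than read from a file. The sample mixes month-first and day-first dates, which is one reason a plain text replacement may be safer here than culture-dependent date parsing.

```powershell
$rows = @'
Date
10/1/2014 13:18
15/01/2014 12:03:19
01/18/2013 18:16:32
'@ | ConvertFrom-Csv

$trimmed = $rows | ForEach-Object {
    # Keep hours:minutes and drop a trailing :seconds part if one is present.
    $_.Date = $_.Date -replace '(\d+:\d+):\d+', '$1'
    $_
}

$trimmed.Date
```

Values that already lack seconds do not match the pattern and pass through unchanged.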