Convert all values in a csv column to integer (or remove leading zeroes) in PowerShell

I have a csv file with an ID_code column and the IDs have leading zeroes that I want to remove. I found out that if I convert the value to an integer, the leading zeroes should disappear but I don't know how to apply that to all values throughout the csv. This is what I tried but the resulting csv comes out blank:
Import-Csv C:\folder\myFile.csv |
ForEach-Object {
$_.ID_code = [convert]::ToInt32($_.ID_code, 10) } |
convertto-csv -NoTypeInformation | %{$_-replace '"', ""} |
out-file C:\folder\myFile2.csv

You could use a calculated property for that. You don't need to parse the value, though. Simply casting it to int should suffice:
Import-Csv 'C:\path\to\input.csv' |
select -Property @{n='ID_code';e={[int]$_.ID_code}},* -Exclude 'ID_code' |
Export-Csv 'C:\path\to\output.csv' -NoTypeInformation
Another option (since you're exporting the data back to a text file anyway) would be to just remove leading zeroes from the string value:
Import-Csv 'C:\path\to\input.csv' |
select -Property @{n='ID_code';e={$_.ID_code -replace '^0+'}},* -Exclude 'ID_code' |
Export-Csv 'C:\path\to\output.csv' -NoTypeInformation
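The cast and the string replacement behave differently on edge cases, which may matter depending on your data. Here is a minimal sketch using made-up sample IDs (they are assumptions, not values from the question); the lookahead pattern is one possible way to keep a single zero for all-zero IDs:

```powershell
# Hypothetical sample IDs, not taken from the question's file.
$ids = '00123', '000', '00A1'

$results = foreach ($id in $ids) {
    # Pure string operation: never fails, but '000' collapses to ''.
    $stripped = $id -replace '^0+'

    # The lookahead (?=.) leaves the last character alone, so '000' becomes '0'.
    $keepLast = $id -replace '^0+(?=.)'

    # -as returns $null instead of throwing when the ID is not numeric.
    $asInt = $id -as [int]

    [pscustomobject]@{
        Original = $id
        Stripped = $stripped
        KeepLast = $keepLast
        AsInt    = $asInt
    }
}

$results | Format-Table -AutoSize
```

If every ID is guaranteed to be numeric, the plain [int] cast is the simplest choice.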
If you know the position of the ID_code column you don't even need to import the CSV. If for instance the column is the first column in the CSV you could do the replacement like this:
(Get-Content 'C:\path\to\input.csv') -replace '^0+' |
Set-Content 'C:\path\to\output.csv'
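As for why the attempt in the question produced an empty file: an assignment statement produces no pipeline output, so the ForEach-Object block changed each object but passed nothing on to ConvertTo-Csv. A small sketch of the difference, using inline sample data as a stand-in for the real file (the values are assumptions):

```powershell
# Hypothetical data standing in for myFile.csv.
$rows = @'
ID_code,Name
00123,Alice
00045,Bob
'@ | ConvertFrom-Csv

# Assignment only: nothing is emitted, so downstream cmdlets receive no input.
$broken = @($rows | ForEach-Object { $_.ID_code = [int]$_.ID_code })

# Same assignment, but the modified object is emitted with a trailing $_.
$fixed = @($rows | ForEach-Object {
    $_.ID_code = [int]$_.ID_code
    $_
})

"Objects emitted without `$_: $($broken.Count)"
"Objects emitted with `$_: $($fixed.Count)"
```

Adding the trailing $_ to the original code would therefore also work, although the calculated property keeps the input objects untouched.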

Related

Sort CSV powershell script delete duplicate, keep the one with a special value in 3rd column

How do I delete duplicate entries in a CSV by one column and keep the one with a special value in another column?
Example: I have a CSV with
Name;Employeenumber;Accessrights
Max;123456;ReadOnly
Berta;133556;Write
Jhonny;161771;ReadOnly
Max;123456;Write
I want to end up with:
Name;Employeenumber;Accessrights
Max;123456;Write
Berta;133556;Write
Jhonny;161771;ReadOnly
I tried Get-Content with Select-Object -Unique, but that does not solve the problem: it should only keep the rows with the value "Write" in the Accessrights property.
I have no idea how to proceed.
You can use a combination of sorting and grouping:
@'
Name;Employeenumber;Accessrights
Max;123456;ReadOnly
Berta;133556;Write
Jhonny;161771;ReadOnly
Max;123456;Write
'@ |
ConvertFrom-Csv -Delimiter ';' |
Sort-Object -Property Name, Accessrights -Descending |
Group-Object -Property Name |
ForEach-Object {
$_.Group[0]
}
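The sort-based version works because "Write" sorts after "ReadOnly" alphabetically, so a descending sort puts the "Write" row first in each group. If you don't want to depend on sort order, here is a sketch of a variant that picks the row explicitly. Grouping by Employeenumber instead of Name is my assumption about what identifies a duplicate:

```powershell
$data = @'
Name;Employeenumber;Accessrights
Max;123456;ReadOnly
Berta;133556;Write
Jhonny;161771;ReadOnly
Max;123456;Write
'@ | ConvertFrom-Csv -Delimiter ';'

$result = $data | Group-Object -Property Employeenumber | ForEach-Object {
    # Prefer a row with Accessrights 'Write'; fall back to the first row of the group.
    $write = $_.Group | Where-Object { $_.Accessrights -eq 'Write' } | Select-Object -First 1
    if ($write) { $write } else { $_.Group[0] }
}

$result | ConvertTo-Csv -Delimiter ';' -NoTypeInformation
```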

Remove String from Character from column in CSV using Powershell

I have a CSV file containing two columns: server name with domain, and date
servername.domain.domain.com,10/15/2018 6:28
servername1.domain.domain.com,10/13/2018 7:28
I need to strip the domain from the fully qualified name so that only the short name remains, and keep the second column untouched. The result can go to a new CSV or replace the original file in place, as long as the second column is included alongside the altered first column, like this:
servername,10/15/2018 6:28
servername1,10/13/2018 7:28
I have this:
Import-Csv "filename.csv" -Header b1,b2 |
% {$_.b1.Split('.')[0]} |
Set-Content "filename1.csv"
This works great, but the problem is the new CSV is missing the 2nd column. I need to send the second column to the new CSV file as well.
Use a calculated property to replace the property you want changed, but leave everything else untouched:
Import-Csv 'input.csv' -Header 'b1', 'b2' |
Select-Object -Property @{n='b1';e={$_.b1.Split('.')[0]}}, * -Exclude b1 |
Export-Csv 'output.csv' -NoType
Note that you only need to use the parameter -Header if your CSV data doesn't already have a header line. Otherwise you should remove the parameter.
If your input file doesn't have headers and you want to create the output file also without headers you can't use Export-Csv, though. Use ConvertTo-Csv to create the CSV text output, then skip over the first line (to remove the headers) and write the rest to the output file with Set-Content.
Import-Csv 'input.csv' -Header 'b1', 'b2' |
Select-Object -Property @{n='b1';e={$_.b1.Split('.')[0]}}, * -Exclude b1 |
ConvertTo-Csv -NoType |
Select-Object -Skip 1 |
Set-Content 'output.csv'
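To see what the headerless variant produces without touching any files, here is a sketch that runs the same pipeline on the two sample lines from the question, fed in as inline text instead of 'input.csv':

```powershell
$rows = @'
servername.domain.domain.com,10/15/2018 6:28
servername1.domain.domain.com,10/13/2018 7:28
'@ | ConvertFrom-Csv -Header 'b1', 'b2'

$lines = $rows |
    Select-Object -Property @{n='b1'; e={ $_.b1.Split('.')[0] }}, * -ExcludeProperty b1 |
    ConvertTo-Csv -NoTypeInformation |
    Select-Object -Skip 1   # drop the generated header line

$lines
# "servername","10/15/2018 6:28"
# "servername1","10/13/2018 7:28"
```

Note that the values are quoted in the output. You could strip the quotes with -replace '"', but that is only safe when no field contains a comma or a quote.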

Count unique numbers in CSV (PowerShell or Notepad++)

How to find the count of unique numbers in a CSV file? When I use the following command in PowerShell ISE
1,2,3,4,2 | Sort-Object | Get-Unique
I can get the unique numbers but I'm not able to get this to work with CSV files. If for example I use
$A = Import-Csv C:\test.csv | Sort-Object | Get-Unique
$A.Count
it returns 0. I would like to count unique numbers for all the files in a given folder.
My data looks similar to this:
Col1,Col2,Col3,Col4
5,,7,4
0,,9,
3,,5,4
And the result should be 6 unique values (preferably written inside the same CSV file).
Or would it be easier to do it with Notepad++? So far I have found examples only on how to count the unique rows.
You can try the following (PSv3+):
PS> (Import-CSV C:\test.csv |
ForEach-Object { $_.psobject.properties.value -ne '' } |
Sort-Object -Unique).Count
6
The key is to extract all property (column) values from each input object (CSV row), which is what $_.psobject.properties.value does; -ne '' filters out empty values.
Note that, given that Sort-Object has a -Unique switch, you don't need Get-Unique (you need Get-Unique only if your input already is sorted).
That said, if your CSV file is structured as simply as yours, you can speed up processing by reading it as a text file (PSv2+):
PS> (Get-Content C:\test.csv | Select-Object -Skip 1 |
ForEach-Object { $_ -split ',' -ne '' } |
Sort-Object -Unique).Count
6
Get-Content reads the CSV file line by line, emitting each line as a string.
Select-Object -Skip 1 skips the header line.
$_ -split ',' -ne '' splits each line into values by commas and weeds out empty values.
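Both variants can be checked against the sample data from the question without a file on disk. In this sketch the CSV text is held in a variable (my substitution for C:\test.csv), and the pipelines are wrapped in @() so that .Count also works for a single result:

```powershell
$csvText = @'
Col1,Col2,Col3,Col4
5,,7,4
0,,9,
3,,5,4
'@

# Object-based variant: collect every non-empty property value of every row.
$viaObjects = @($csvText | ConvertFrom-Csv |
    ForEach-Object { $_.psobject.Properties.Value -ne '' } |
    Sort-Object -Unique)

# Text-based variant: split into lines, skip the header, split on commas.
$viaText = @($csvText -split '\r?\n' | Select-Object -Skip 1 |
    ForEach-Object { $_ -split ',' -ne '' } |
    Sort-Object -Unique)

$viaObjects.Count   # 6
$viaText.Count      # 6
```

Keep in mind that the values are compared as strings here, so 5 and 05 would count as two different values.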
As for what you tried:
Import-CSV C:\test.csv | Sort-Object | Get-Unique:
Fundamentally, Sort-Object emits the input objects as a whole (just in sorted order), it doesn't extract property values, yet that is what you need.
Because no -Property argument is passed to Sort-Object to base the sorting on, it compares the custom objects that Import-Csv emits as a whole, by their .ToString() values, which happen to be empty[1], so they all compare the same, and in effect no sorting happens.
Similarly, Get-Unique also determines uniqueness by .ToString() here, so that, again, all objects are considered the same and only the very first one is output.
[1] This may be surprising, given that using a custom object in an expandable string does yield a value: compare $obj = [pscustomobject] @{ foo ='bar' }; $obj.ToString(); '---'; "$obj". This inconsistency is discussed in this GitHub issue.

I use -NoTypeInformation so why do I get header back when using Out-File?

I filtered by date this file data1.csv
2017.11.1,09:55,1.1,1.2,1.3,1.4,1
2017.11.2,09:55,1.5,1.6,1.7,1.8,2
I don't get a header with -NoTypeInformation:
$CutOff = (Get-Date).AddDays(-2)
$filePath = "data1.csv"
$Data = Import-Csv $filePath -Header Date,Time,A,B,C,D,E
$Data2 = $Data | Where-Object {$_.Date -as [datetime] -gt $Cutoff} | convertto-csv -NoTypeInformation -Delimiter "," | % {$_ -replace '"',''}
But when rewriting with Out-File
$Data2 | Out-File "data2.csv" -Encoding utf8 -Force
I get header back as data2.csv contains:
Date,Time,A,B,C,D,E
2017.11.2,09:55,1.5,1.6,1.7,1.8,2
Why do I have Date,Time,A,B,C,D,E ?
-NoTypeInformation does not control the column header; it only suppresses the type information line that would otherwise precede it. Remove it to see what shows up. From Microsoft's documentation:
Omits the type information header from the output. By default, the string in the output contains #TYPE followed by the fully-qualified name of the object type.
Emphasis mine.
CSVs need headers. That is why it is making one. If you don't want to see the header in the output use Select-Object -Skip 1 to remove it.
$Data |
Where-Object {$_.Date -as [datetime] -gt $Cutoff} |
ConvertTo-CSV -NoTypeInformation -Delimiter "," |
Select-Object -Skip 1 |
% {$_ -replace '"'}
I would not pipe Out-File to itself. You could pipe to Set-Content here just as well.
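Here is a small sketch of where the header line comes from and what Select-Object -Skip 1 does to it. It uses the two sample lines from the question as inline text rather than data1.csv:

```powershell
$rows = @'
2017.11.1,09:55,1.1,1.2,1.3,1.4,1
2017.11.2,09:55,1.5,1.6,1.7,1.8,2
'@ | ConvertFrom-Csv -Header Date,Time,A,B,C,D,E

# ConvertTo-Csv always emits the property names as the first line.
$withHeader = @($rows | ConvertTo-Csv -NoTypeInformation)
$withHeader[0]   # "Date","Time","A","B","C","D","E"

# Skipping that first line leaves only the data rows.
$noHeader = @($withHeader | Select-Object -Skip 1 | ForEach-Object { $_ -replace '"' })
$noHeader[0]     # 2017.11.1,09:55,1.1,1.2,1.3,1.4,1
```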
I am guessing this whole process is to keep the source file in the same state just with some lines filtered out based on date. You could skip most of this just by parsing the date out in each line.
$threshold = (Get-Date).AddDays(-2)
$filePath = "c:\temp\bagel.txt"
(Get-Content $filePath) | Where-Object{
$date,$null=$_.Split(",",2)
[datetime]$date -gt $threshold
} | Set-Content $filePath
Now you don't have to worry about PowerShell CSV object structure or output since we act on the raw data of the file itself.
That will take each line of the input file and keep it only if the parsed date is later than the threshold. Change the encoding on the input and output cmdlets as you see necessary. What $date,$null=$_.Split(",",2) does is split the line on the first comma into two parts. The first becomes $date, and since this is just a filtering condition, the rest of the line is discarded by assigning it to $null.
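To make the split step concrete, here is a sketch on a single sample line. The fixed threshold date and the explicit ParseExact format are my assumptions, chosen so the result does not depend on today's date or the current culture:

```powershell
$line = '2017.11.2,09:55,1.5,1.6,1.7,1.8,2'

# Limit the split to 2 parts: everything after the first comma stays together.
$date, $rest = $line.Split(',', 2)
$date   # 2017.11.2
$rest   # 09:55,1.5,1.6,1.7,1.8,2

# Parse with an explicit format instead of relying on culture-dependent casting.
$parsed = [datetime]::ParseExact($date, 'yyyy.M.d', [cultureinfo]::InvariantCulture)

# Hypothetical fixed threshold, used here instead of (Get-Date).AddDays(-2).
$threshold = [datetime]'2017-11-01'
$keep = $parsed -gt $threshold
$keep   # True
```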
Properly-formed CSV files must have column headers. Your use of -NoTypeInformation in generating the CSV does not affect column headers; instead, it affects whether the PowerShell object type information is included. If you Export-CSV without -NoTypeInformation in Windows PowerShell, the first line of your CSV file will look like #TYPE System.PSCustomObject, which you don't want if you're going to open the CSV in a spreadsheet program. (PowerShell 6 and later omit this line by default.)
If you subsequently Import-CSV, the headers (Date, Time, A, B, C, D, E) are used to create the fields of a PSObject, so that you can refer to them using the standard dot notation (e.g., $CSV[$line].Date).
The ability to specify -Header on Import-CSV is essentially a "hack" to allow the cmdlet to handle files that are comma-separated, but which did not include column headers.

How to export to "non-standard" CSV with Powershell

I need to convert a file with this format:
2015.03.27,09:00,1.08764,1.08827,1.08535,1.08747,8941
2015.03.27,10:00,1.08745,1.08893,1.08604,1.08762,7558
to this format
2015.03.27,1.08764,1.08827,1.08535,1.08747,1
2015.03.27,1.08745,1.08893,1.08604,1.08762,1
I started with this code but can't see how to achieve the full transformation:
Import-Csv in.csv -Header Date,Time,O,H,L,C,V | Select-Object Date,O,H,L,C,V | Export-Csv -path out.csv -NoTypeInformation
(Get-Content out.csv) | % {$_ -replace '"', ""} | out-file -FilePath out.csv -Force -Encoding ascii
which outputs
Date,O,H,L,C,V
2015.03.27,1.08745,1.08893,1.08604,1.08762,8941
2015.03.27,1.08763,1.08911,1.08542,1.08901,7558
After that I need to
remove the header (I tried -NoHeader which is not recognized)
replace last column with 1.
How can I do that as simply as possible (if possible without looping through each row)?
Update: I have since simplified the requirement. I just need to replace the last column with a constant.
Ok, this could be one massive one-liner... I'm going to do line breaks at the pipes for sanity reasons though.
Import-Csv in.csv -header Date,Time,O,H,L,C,V|select * -ExcludeProperty time|
%{$_.date = [datetime]::ParseExact($_.date,"yyyy.MM.dd",$null).tostring("yyMMdd");$_}|
ConvertTo-Csv -NoTypeInformation|
select -skip 1|
%{$_ -replace '"'}|
Set-Content out.csv -encoding ascii
Basically I import the CSV, exclude the time column, convert the date column to an actual [datetime] object and then convert it back to a string in the desired format. Then I pass the modified object (with the newly formatted date) down the pipe to ConvertTo-CSV, skip the first line (the headers that you don't want), remove the quotes, and lastly write the result to file with Set-Content (faster than Out-File).
Edit: Just saw your update... to do that we'll just change the last column to 1 at the same time we modify the date column by adding $_.v=1;...
%{$_.date = [datetime]::ParseExact($_.date,"yyyy.MM.dd",$null).tostring("yyMMdd");$_.v=1;$_}|
Whole script modified:
Import-Csv in.csv -header Date,Time,O,H,L,C,V|select * -ExcludeProperty time|
%{$_.date = [datetime]::ParseExact($_.date,"yyyy.MM.dd",$null).tostring("yyMMdd");$_.v=1;$_}|
ConvertTo-Csv -NoTypeInformation|
select -skip 1|
%{$_ -replace '"'}|
Set-Content out.csv -encoding ascii
Oh, and this has the added benefit of not having to read the file in, write the file to the drive, read that file in, and then write the file to the drive again.
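If you would rather avoid object conversion altogether, the simplified requirement can also be handled on the raw text. This is a different technique from the pipeline above (plain regex replacement), shown here as a sketch on the two sample lines from the question. It leaves the date as it is, matching the target format in the question, and assumes that no field contains embedded commas or quotes:

```powershell
$lines = @(
    '2015.03.27,09:00,1.08764,1.08827,1.08535,1.08747,8941'
    '2015.03.27,10:00,1.08745,1.08893,1.08604,1.08762,7558'
)

$converted = $lines | ForEach-Object {
    # First -replace: drop the second field (the time column).
    # Second -replace: swap everything after the last comma for the constant 1.
    $_ -replace '^([^,]*),[^,]*,', '$1,' -replace ',[^,]*$', ',1'
}

$converted
# 2015.03.27,1.08764,1.08827,1.08535,1.08747,1
# 2015.03.27,1.08745,1.08893,1.08604,1.08762,1
```

With real files you would feed the lines in via Get-Content in.csv and write them out with Set-Content out.csv.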