Combining multiple csv files in Powershell changes format - powershell

I'm attempting to combine numerous csv files into a single outfile. The files do get combined, but the format of the outfile changes: every line from the input files becomes a single value that just happens to contain commas. In other words, I get "1,2,3" in a single cell instead of "1" "2" "3" each in their own cell.
Get-Content C:\RollingLogs\*.csv | Out-File C:\CombinedLogs\outfile.csv
Here is a snip from the output file, if that helps to explain what's happening. The desired output is for "1" to be in column A, "2" in column B, etc. Instead we wind up with "1,2,3" all in column A.
I've tried specifying the comma as a delimiter with Out-File, but that just puts everything on a new line instead of keeping the values in proper rows, which is worse. I also tried gc *.csv | Export-Csv outfile.csv but it just hangs without doing anything. Export-Csv also changes the contents (1,2,3 from the input becomes "#TYPE System.Int32" and I don't know why). I have wondered if Import-Csv file1.csv | Export-Csv -Append outfile.csv might work, but since there are multiple input files it would require a foreach loop (I think?) and I have no idea how to do that.
Edit: Contents of the input file as seen in Notepad, to see delimiters
TCP,svchost.exe,1404,LISTENING
TCP,System,4,LISTENING
TCP,System,4,LISTENING
There are line breaks after the G, but they don't show up in StackOverflow without editing it more.
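The foreach idea mentioned in the question can be sketched as follows. This is only a minimal illustration, not a confirmed fix for the Excel display issue: the column names in $columns are hypothetical (the sample rows have no header row), and the temporary folder and file names are made up so the sketch is self-contained.

```powershell
# Hypothetical column names; the sample rows in the question have no header row.
$columns = 'Protocol', 'Process', 'PID', 'State'

# Build a small throwaway input set so the sketch can run on its own.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'a.csv') -Value 'TCP,svchost.exe,1404,LISTENING'
Set-Content -Path (Join-Path $dir 'b.csv') -Value 'TCP,System,4,LISTENING'

# Write the result under a different extension so it is not picked up as an input file.
$outFile = Join-Path $dir 'combined.out'

# Parse every input file as CSV (one object per row), in file-name order.
$rows = Get-ChildItem -Path $dir -Filter '*.csv' |
    Sort-Object Name |
    ForEach-Object { Import-Csv -Path $_.FullName -Header $columns }

# Export the parsed rows as one CSV, without the "#TYPE" line.
$rows | Export-Csv -Path $outFile -NoTypeInformation
```

Because the rows are parsed into objects before being exported, each value ends up in its own named column, and -NoTypeInformation suppresses the "#TYPE" line in Windows PowerShell.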

Related

Create Record using Headers from a .csv

<EDIT: I kind of have it working, but in order to get it to work, my template csv has to have a blank line for every line I am going to be adding to it. So, if I could figure out how to add lines to the imported empty (just a header row) csv file, I could then use export-csv at the end. (It would be somewhat slower, but it would at least work.)>
I am creating a .csv file in PowerShell. The output file has 140 columns. Many of them are null.
I started out just doing
$out = 'S-'+$Snum+',,,,,TRUE,,,,,'+'S-'+$Snum+',"'
$out = $out + '{0:d9}' -f $item.SupplierCode2
until I had filled all the columns with the correct value. But, the system that is reading the output keeps changing the column locations. So, I wanted to take the header row from the template for the system and use that to name the columns. Then, if the columns change location, it won't matter because I will be referring to it by name.
Because there are so many columns, I'm trying to avoid a solution that has me enter all the column names. By using a blank .csv with just the headers, I can just paste that into the csv whenever it changes and I won't have to change my code.
So, I started by reading my csv file in so I can use the headers.
$TempA = Import-Csv -Path $Pathta -Encoding Default
Then I was hoping I could do something like this:
$TempA.'Supplier Key' = "S-$Snum"
$TempA.'Auto Complete' = "TRUE"
$TempA.'Supplier ID' = "S-$Snum"
$tempA.'Supplier Data - Supplier Reference ID' = '{0:d9}' -f $item.SupplierCode2
I would only need to fill in the fields that have values, everything else would be null.
Then I was thinking I could write out this record to a file. My old write looked like this
$writer2.WriteLine($out)
I wanted to write the line from the new csv line instead
$writer2.WriteLine($TempA)
I'd rather use streams if I can because the files are large and using add-Content really slows things down.
I know I need to do something to add a line to $TempA and I would like each loop to start with a new line (with all nulls) because there are times when certain lines only have a small subset of the values populated.
Clearly, I'm not taking the correct approach here. I'd really appreciate any advice anyone can give me.
Thank you.
If you only want to fill in certain fields, and don't mind using Export-Csv you can use the -append and -force switches, and it will put the properties in the right places. For example, if you had the template CSV file with only the column names in it you could do:
$Output = ForEach($item in $allItems){
    [PSCustomObject]@{
        'Supplier Key' = "S-$Snum"
        'Auto Complete' = "TRUE"
        'Supplier ID' = "S-$Snum"
        'Supplier Data - Supplier Reference ID' = '{0:d9}' -f $item.SupplierCode2
    }
}
$Output | Export-Csv -Path $Pathta -Append -Force
That would create objects with only the four properties that you are interested in, and then output them to the CSV in the correct columns, adding commas as needed to create blank values for all other columns.
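If the rows still need to go through an existing StreamWriter, as the question prefers, one possible approach is to build a blank row from the template header and convert it to a single CSV line. The following is a minimal sketch under stated assumptions: the four-column $headerLine and the helper New-BlankRow are hypothetical, standing in for the real 140-column template.

```powershell
# Hypothetical header line; in practice it could be read from the template file,
# for example with: Get-Content $Pathta -TotalCount 1
$headerLine = '"Supplier Key","Auto Complete","Supplier ID","Notes"'

# Parse the header twice (once as header, once as a data row) to recover the column names.
$columns = @($headerLine, $headerLine | ConvertFrom-Csv)[0].PSObject.Properties.Name

# Hypothetical helper: returns an object with every column present and empty.
function New-BlankRow {
    param([string[]]$ColumnNames)
    $row = [ordered]@{}
    foreach ($name in $ColumnNames) { $row[$name] = '' }
    [PSCustomObject]$row
}

# Start each loop iteration from a fresh blank row and fill in only what is known.
$row = New-BlankRow -ColumnNames $columns
$row.'Supplier Key'  = 'S-001'
$row.'Auto Complete' = 'TRUE'
$row.'Supplier ID'   = 'S-001'

# ConvertTo-Csv returns the header line followed by the data line; keep only the data line.
$line = @($row | ConvertTo-Csv -NoTypeInformation)[1]
# $line can then be handed to the existing writer, e.g. $writer2.WriteLine($line)
```

Since the column order comes from the template header, pasting in a new header row changes the output layout without touching the code that fills in the values.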

PowerShell and CSV: Stop CSV from turning text data into Scientific Notation

I have a CSV column with alpha numerical combinations in a column.
I am later going to use this csv file in a PowerShell script by importing the data.
Examples: 1A01, 1C66, 1E53.
Now before putting these values in, I made sure to format the column as text.
Now at first it works. I input the data and save. I test in PowerShell by importing it and all data shows up valid, including 1E53. But let's say I edit the file again later to add data, then save and close. I re-import into PowerShell and 1E53 comes in as 1.00E+53. How can I prevent this permanently? Note that the column is filled with codes and there are lots of #E##.
Your issue is not with PowerShell, it's with Excel. For a demonstration, take 1E53, enter it into Excel, and then save that Excel file as a CSV file. You will see that the value is now changed to 1.00E+53.
How to fix this?
There are a few ways of disabling scientific notation:
https://superuser.com/questions/452832/turn-off-scientific-notation-in-excel
https://www.logicbroker.com/excel-scientific-notation-disable-prevent/
I hope some of them work for you.
I think you can rename the file to .txt instead of .csv, and Excel may treat it differently.
Good Luck
As commented:
You will probably load the csv from file:
$csv = Import-Csv -Path 'X:\original.csv' -UseCulture
The code below uses a dummy csv in a Here-String here:
$csv = @'
"Column1","Column2","ValueThatMatters"
"Something","SomethingElse","1E53"
"AnotherItem","Whatever","4E12"
'@ | ConvertFrom-Csv
# in order to make Excel see the values as Text and not convert them into scientific numbers
$csv | ForEach-Object {
    # add a TAB character in front of the values in the column
    $_.ValueThatMatters = "`t{0}" -f $_.ValueThatMatters
}
$csv | Export-Csv -Path 'X:\ExcelFriendly.csv' -UseCulture -NoTypeInformation

Remove commas from numbers in a CSV

I have folder info for all user folders. It is dumped out to a CSV file as follows:
Servername, F:\Users\user, 9,355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29
We are unable to work with the data as is due to the thousands separator in the 3rd column. I could run the report scripts again, but we have a lot of file servers and a large number of users on one in particular, so running it again is very time consuming. The reason the commas are there is that the data was written as a string not a number.
I can import and convert, the only problem is that any number over 1000 will be wrong and then all other data is 1 column off. I would like to replace any comma between 2 numbers. It doesn't seem it would be that hard to do with PowerShell, but I am not having any luck finding anything.
If you assume that columns of data are comma plus space separated and your numbers have no spaces, you can use the -replace operator for this.
$line = 'Servername, F:\Users\user, 9,355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29'
$line -replace '(?<=\d),(?=\d)'
If you are reading the data from a file, you can read the data with Get-Content, replace your data, and update the file with Set-Content.
(Get-Content file.csv) -replace '(?<=\d),(?=\d)' | Set-Content file.csv
If the file is large, you can utilize the faster switch statement.
$data = switch -regex -file file.csv {
'(?<=\d),(?=\d)' { $_ -replace '(?<=\d),(?=\d)' }
default {$_}
}
$data | Set-Content file.csv
Explanation:
(?<=\d) uses a positive lookbehind assertion (?<=) that matches a single digit \d.
(?=\d) uses a positive lookahead assertion (?=) that matches a single digit. You could replace this with (?=\d{3}) to match 3 consecutive digits after the comma.
Since you want to replace the target comma with empty string, you do not need a replacement string.
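As a quick check of the explanation above, here is a small sketch that applies the stricter (?=\d{3}) variant to the sample line and then splits the result on comma plus space. The variable names $fixed and $fields are only illustrative.

```powershell
$line = 'Servername, F:\Users\user, 9,355.7602 MB, 264, 3054, 03/15/2000 13:28:48, 12/10/2018 11:58:29'

# Remove only commas that sit between a digit and a run of three digits.
$fixed = $line -replace '(?<=\d),(?=\d{3})'

# With the thousands separator gone, comma plus space separates the columns cleanly.
$fields = $fixed -split ', '

$fields.Count    # number of columns after the fix
$fields[2]       # the size column
```

The delimiters between columns are untouched because each of them is followed by a space rather than a digit, so the lookahead does not match there.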
Typically, it would be best to stick with commands that work with CSV data or files. However, if your data contains commas and you aren't qualifying your text, it may be difficult to distinguish between data and delimiters. If you have a clear way of making that distinction, you are better off using ConvertFrom-Csv for already read data or Import-Csv for files. You will need to define headers either in the files or in the command.
EDIT
It was my oversight that the , in the dataset is not escaped, which causes this answer to not work as expected, as the comma is seen as a column separator when parsing the CSV. I'm going to leave it because it does explain how to generally manipulate the data as you'd expect, if the column data were escaped properly. However, @AdminOfThings' answer should work for your specific case here, and will fix the erroneously defined column without relying on parsing the CSV content as a CSV first.
Import the data using Import-Csv, then remove any , in the third column. This assumes that you have no values where , is the decimal separator:
If you have headers in the CSV, you won't need to define header names or get fancy with writing the CSV back out:
(Import-Csv -Path \path\to\file.csv) | ForEach-Object {
    $_.ColumnName = $_.ColumnName -replace ','
    $_   # emit the modified row so it reaches Export-Csv
} | Export-Csv -NoTypeInformation -Path \path\to\file.csv
The way this works is that we import the CSV as an operable PSCustomObject, then for each line we take whatever the column name with the size is and remove the , from it. Finally, we export the modified PSCustomObject back out to the original CSV.
If you don't have headers, it gets a little trickier since we have to define temporary headers, but Export-Csv doesn't have an option to skip writing out headers:
(Import-Csv -Path \path\to\file.csv -Header Col1, Col2, Col3, Col4, Col5, Col6, Col7) |
ForEach-Object {
    $_.Col3 = $_.Col3 -replace ','
    $_   # emit the modified row so it reaches ConvertTo-Csv
} | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1 |
Set-Content -Path \path\to\file.csv
This does the same thing as the first block of code, but since we don't want to export the temporary headers, we have to get creative. First, note we reference the target column with the temporary header name. Instead of piping the modified CSV object right to Export-Csv, first we want to convert the object to CSV using ConvertTo-Csv. We then use Select-Object to skip the first line of the converted CSV text, which is the header, so we just have the row data and column values. Finally, we use Set-Content to write the CSV text without the header back to the original file.

Powershell .txt to CSV Formatting Irregularities

I have a large number of .txt files pulled from pdf and formatted with comma delimiters.
I'm trying to append these text files to one another with a new line between each. Earlier in the formatting process I took multi-line input and formatted it into one line with entries separated by commas.
Yet when appending one txt file to another in a csv the previous formatting with many line breaks returns. So my final output is valid csv, but not representative of each text file being one line of csv entries. How can I ensure the transition from txt to csv retains the formatting of the txt files?
I've used Export-CSV, Add-Content, and the >> operator with similar outcomes.
To summarize, individual .txt files with the following format:
,927,Dance like Misty"," shine like Lupita"," slay like Serena. speak like Viola"," fight like Rosa! ,United States ,16 - 65+
Turn into the following when appended together in a csv file:
,927
,Dance like Misty"," shine like Lupita"," slay like Serena. speak like Viola"," fight like Rosa!
,United States
,16 - 65+
How the data was prepped:
Removing new lines
Foreach($f in $FILES){(Get-Content $f -Raw).Replace("`n","") | Set-Content $f -Force}
Adding one new line to the end of each txt file
foreach($f in $FILES){Add-Content -Path $f -value "`n" |Set-Content $f -Force}
Trying to Convert to CSV, one text file per line with comma delimiter:
cat $FILES | sc csv.csv
Or
foreach($f in $FILES){import-csv $f -delimiter "," | export-csv $f}
Or
foreach($f in $FILES){ Export-Csv -InputObject $f -append -path "test.csv"}
All of these return a csv with each comma-separated value on a new line, instead of each txt file as one line.
This was resolved by realizing that even though Notepad was showing no newlines, there were still hidden carriage return characters. On loading the apparently one-line csv files into Notepad++ and toggling "show hidden characters", this oversight was evident.
By replacing both \r and \n characters before converting to CSV,
Foreach($f in $FILES){(Get-Content $f -Raw).Replace("`n","").Replace("`r","") |
Set-Content $f -Force}
The CSV conversion process worked as planned using the following
cat $FILES | sc final.csv
Final verdict --
The text file that appeared to be a one line entry ready to become CSV
,927,Dance like Misty"," shine like Lupita"," slay like Serena. speak like Viola"," fight like Rosa! ,United States ,16 - 65+
Still had carriage return characters between each value. This was made evident by trying another text editor with the feature "show hidden characters."
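The hidden characters can also be checked from PowerShell itself, without a second editor. A minimal sketch, assuming a shortened made-up sample string in $text that stands in for the output of Get-Content $f -Raw:

```powershell
# Made-up sample with carriage returns hidden between the values.
$text = ",927`r,Dance`r,United States`r,16 - 65+"

# Count the carriage returns and line feeds still present in the text.
$crCount = [regex]::Matches($text, '\r').Count
$lfCount = [regex]::Matches($text, '\n').Count

# Strip both kinds of line-break character in one pass.
$clean = $text -replace '[\r\n]', ''
```

A non-zero $crCount on a file that looks like a single line in Notepad points to the same problem described above.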

PowerShell: Read text, regex sort, write output to file and formatting

I am a Powershell novice and have run into a challenge in reading, sorting, and outputting a csv file. The input csv has no headers, the data is as follows:
05/25/2010,18:48:33,Stop,a1usak,10.128.212.212
05/25/2010,18:48:36,Start,q2uhal,10.136.198.231
05/25/2010,18:48:09,Stop,s0upxb,10.136.198.231
I use the following piping construct to read the file, sort and output to a file:
(Get-Content d:\vpnData\u62gvpn2.csv) | %{,[regex]::Split($_, ",")} | sort @{Expression={$_[3]}},@{Expression={$_[1]}} | out-file d:\vpnData\u62gvpn3.csv
The new file is written with the following format:
05/25/2010
07:41:57
Stop
a0uaar
10.128.196.160
05/25/2010
12:24:24
Start
a0uaar
10.136.199.51
05/25/2010
20:00:56
Stop
a0uaar
10.136.199.51
What I would like to see in the output file is a similar format to the original input file with comma delimiters:
05/25/2010,07:41:57,Stop,a0uaar,10.128.196.160
05/25/2010,12:24:24,Start,a0uaar,10.136.199.51
05/25/2010,20:00:56,Stop,a0uaar,10.136.199.51
But I can't quite seem to get there. I'm almost of the mind that I'll have to write another segment to read the newly produced file and reset its contents to the preferred format for further processing.
Thoughts?
So you want to sort on the fourth and second columns, then write out a csv file?
You can use import-csv to suck the file into memory, specifying the column names with the -header argument. The export-csv command, however, will write a header row out to the destination file and wrap the values in double-quotes, which you probably don't want.
This works, though:
import-csv -header d,t,s,n,a test.csv |
sort n,t |
%{write-output ($_.d + "," + $_.t + "," + $_.s + "," + $_.n + "," + $_.a) }
(I've wrapped it onto multiple lines for readability.)
If you redirect the output of that back to a file, it should do what you want.
You can also use the ConvertFrom-CSV in a similar way
ConvertFrom-Csv -Header date, time, status,user,ip @"
05/25/2010,18:48:33,Stop,a1usak,10.128.212.212
05/25/2010,18:48:36,Start,q2uhal,10.136.198.231
05/25/2010,18:48:09,Stop,s0upxb,10.136.198.231
"@
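Putting the pieces together, the ConvertFrom-Csv route can be sketched end to end: parse, sort on user then time, and join the fields back into header-less, unquoted lines. The variable names $data and $sorted are only illustrative, and the sample rows are the ones from the question.

```powershell
$data = @"
05/25/2010,18:48:33,Stop,a1usak,10.128.212.212
05/25/2010,18:48:36,Start,q2uhal,10.136.198.231
05/25/2010,18:48:09,Stop,s0upxb,10.136.198.231
"@

# Split the text into lines, parse them with explicit column names,
# sort by user and then time, and rebuild each row as a plain comma-separated line.
$sorted = $data -split '\r?\n' |
    ConvertFrom-Csv -Header date, time, status, user, ip |
    Sort-Object user, time |
    ForEach-Object { $_.date, $_.time, $_.status, $_.user, $_.ip -join ',' }

$sorted
```

Piping $sorted to Set-Content would then write the lines to a file in the same layout as the input.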