Can I convert a row of comma delimited values to a column - powershell

I have one row of temperature data in a text file that I would like to convert to a single column and save as a CSV file using a PowerShell script. The temperatures are separated by commas and look like this:
21,22,22,22,22,22,22,20,19,18,17,16,15,14,13,12,11,10,9,9,9,8,8,9,8,8,8,9,9,8,8,8,9,9,9,8,8,8,8,8,9,10,12,14,15,17,19,20,21,21,21,21,21,21,21,21,21,21,21,20,20,20,20,20,22,24,25,26,27,27,27,28,28,28,29,29,29,28,28,28,28,28,28,27,27,27,27,27,29,30,32,32,32,32,33,34,35,35,34,33,32,32,31,31,30,30,29,29,28,28,27,28,29,31,33,34,35,35,35,36,36,36,36,36,36,36,36,36,37,37,37,37,37,37,38,39,40,42,43,43,43,43,43,42,42,42,41,41,41,41,40,39,37,36,35,34,33,32,31,31,31,31,31,31,31,31,31,31,
I have tried several methods based on searches in this forum. I thought this might work, but it returns an error: Transpose rows to columns in PowerShell
This is the modified code I tried; it returns the error "Input string was not in a correct format.":
$txt = Get-Content 'C:\myfile.txt' | Out-String
$txt -split '(?m)^,\r?\n' | ForEach-Object {
    # create empty array
    $row = @()
    $arr = $_ -split '\r?\n'
    $k = 0
    for ($n = 0; $n -lt $arr.Count; $n += 2) {
        $i = [int]$arr[$n]
        # if index from record ($i) is greater than current index ($k) append
        # required number of empty fields
        for ($j = $k; $j -lt $i-1; $j++) { $row += $null }
        $row += $arr[$n+1]
        $k = $i
    }
    $row -join '|'
}
This seems like it should be simple to do with only one row of data. Are there any suggestions on how to convert this single row of numbers to one column?

Try this:
# convert row to column data
$header = 'TEMPERATURE'
$values = $(Get-Content input.dat) -split ','
$header, $values | Out-File result.csv
#now test the result
Import-Csv result.csv
The header is the first line (or record) in the CSV file. In this case it's a single word, because there is only one column.
The values are the items between commas in the input. In this case, the -split on commas generates an array of strings. Note that, if comma is a separator, there will be no comma after the last temperature. Your data doesn't look like that, but I have assumed that the real data does.
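You can check the trailing-comma behavior directly: splitting a string that ends in a comma produces a trailing empty element, which would become an empty line in the output column.

```powershell
# A trailing ',' yields an extra, empty element after the split.
('21,22,' -split ',').Count             # 3 (the last element is '')
('21,22' -split ',').Count              # 2
# Filtering with -ne '' drops the empty element.
(('21,22,' -split ',') -ne '').Count    # 2
```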
Then, we just write the header and the array out to a file. But what happened to all the commas? It turns out that, for a single column CSV file, there are no commas separating fields. So the result is a simple CSV file.
Last, there is a test of the output using Import-Csv to read the result and display it in table format.
This isn't necessarily the best way to code it, but it might help a beginner get used to PowerShell.

Assuming I'm understanding your intent correctly, based on your verbal description (not your own coding attempt):
# Create simplified sample input file
@'
21,22,23,
'@ > myfile.txt
# Read the line, split it into tokens by ",", filter out empty elements
# with `-ne ''` (to ignore empty elements, such as would
# result from the trailing "," in your sample input),
# and write to an output CSV file with a column name prepended.
(Get-Content myfile.txt) -split ',' -ne '' |
    ForEach-Object -Begin { 'Temperatures' } -Process { $_ } |
    Set-Content out.csv
More concise alternative, using an expandable (interpolating) here-string:
# Note: .TrimEnd(',') removes any trailing "," from the input.
# Your sample input suggests that this is necessary.
# If there are no trailing "," chars., you can omit this call.
@"
Temperatures
$((Get-Content myfile.txt).TrimEnd(',') -split ',' -join [Environment]::NewLine)
"@ > out.csv
out.csv then contains:
Temperatures
21
22
23

Related

Powershell - Count number of carriage returns line feed in .txt file

I have a large text file (output from SQL db) and I need to determine the row count. However, since the source SQL data itself contains carriage returns \r and line feeds \n (NEVER appearing together), the data for some rows spans multiple lines in the output .txt file. The Powershell I'm using below gives me the file line count which is greater than the actual SQL row count. So I need to modify the script to ignore the additional lines - one way of doing it might be just counting the number of times CRLF or \r\n occurs (TOGETHER) in the file and that should be the actual number of rows but I'm not sure how to do it.
Get-ChildItem "." |% {$n = $_; $c = 0; Get-Content -Path $_ -ReadCount 1000 |% { $c += $_.Count }; "$n; $c"} > row_count.txt
I just learned that Get-Content splits and streams each line in a file by CR, CRLF, and LF, so that it can read data between operating systems interchangeably:
"1`r2`n3`r`n4" | Out-File .\Test.txt
(Get-Content .\Test.txt).Count
4
Reading the question again, I might have misunderstood your question.
In any case, if you want to split (count) on only a specific character combination:
CR
((Get-Content -Raw .\Test.txt).Trim() -Split '\r').Count
3
LF
((Get-Content -Raw .\Test.txt).Trim() -Split '\n').Count
3
CRLF
((Get-Content -Raw .\Test.txt).Trim() -Split '\r\n').Count # or: -Split [Environment]::NewLine
2
Note the .Trim() method, which removes the trailing newline (white space) at the end of the file that is retained when Get-Content is used with the -Raw parameter.
Addendum
(Update based on the comment on the memory exception)
I am afraid that there is currently no other option than building your own StreamReader using the ReadBlock method and specifically splitting lines on a CRLF. I have opened a feature request for this issue: -NewLine Parameter to customize line separator for Get-Content
Get-Lines
A possible way to workaround the memory exception errors:
function Get-Lines {
    [CmdletBinding()][OutputType([string])] param(
        [Parameter(ValueFromPipeLine = $True)][string] $Filename,
        [String] $NewLine = [Environment]::NewLine
    )
    Begin {
        [Char[]] $Buffer = New-Object Char[] 10
        $Reader = New-Object -TypeName System.IO.StreamReader -ArgumentList (Get-Item($Filename))
        $Rest = '' # Note that a multiple-character newline (as CRLF) could be split at the end of the buffer
    }
    Process {
        While ($True) {
            $Length = $Reader.ReadBlock($Buffer, 0, $Buffer.Length)
            if (!$Length) { Break }
            $Split = ($Rest + [string]::new($Buffer[0..($Length - 1)])) -Split $NewLine
            If ($Split.Count -gt 1) { $Split[0..($Split.Count - 2)] }
            $Rest = $Split[-1]
        }
    }
    End {
        $Rest
    }
}
Usage
To prevent the memory exceptions it is important that you do not assign the results to a variable or use brackets, as this will stall the PowerShell pipeline and store everything in memory.
$Count = 0
Get-Lines .\Test.txt | ForEach-Object { $Count++ }
$Count
The System.IO.StreamReader.ReadBlock solution that reads the file in fixed-size blocks and performs custom splitting into lines in iRon's helpful answer is the best choice, because it both avoids out-of-memory problems and performs well (by PowerShell standards).
If performance in terms of execution speed isn't paramount, you can take advantage of
Get-Content's -Delimiter parameter, which accepts a custom string to split the file content by:
# Outputs the count of CRLF-terminated lines.
(Get-Content largeFile.txt -Delimiter "`r`n" | Measure-Object).Count
Note that -Delimiter employs optional-terminator logic when splitting: that is, if the file content ends in the given delimiter string, no extra, empty element is reported at the end.
This is consistent with the default behavior, where a trailing newline in a file is considered an optional terminator that does not result in an additional, empty line getting reported.
However, in case a -Delimiter string that is unrelated to newline characters is used, a trailing newline is considered a final "line" (element).
A quick example:
# Create a test file without a trailing newline.
# Note the CR-only newline (`r) after 'line1'
"line1`rrest of line1`r`nline2" | Set-Content -NoNewLine test1.txt
# Create another test file with the same content plus
# a trailing CRLF newline.
"line1`rrest of line1`r`nline2`r`n" | Set-Content -NoNewLine test2.txt
'test1.txt', 'test2.txt' | ForEach-Object {
"--- $_"
# Split by CRLF only and enclose the resulting lines in [...]
Get-Content $_ -Delimiter "`r`n" |
ForEach-Object { "[{0}]" -f ($_ -replace "`r", '`r') }
}
This yields:
--- test1.txt
[line1`rrest of line1]
[line2]
--- test2.txt
[line1`rrest of line1]
[line2]
As you can see, the two test files were processed identically, because the trailing CRLF newline was considered an optional terminator for the last line.

How can I convert CSV files with a meta data header row into flat tables using Powershell?

I have a few thousand CSV files with a format similar to this (i.e. a table with a meta data row at the top):
dinosaur.csv,water,Benjamin.Field.12.Location53.Readings,
DATE,VALUE,QUALITY,STATE
2018-06-01,73.83,Good,0
2018-06-02,45.53,Good,0
2018-06-03,89.123,Good,0
Is it possible to use PowerShell to convert these CSV files into a simple table format such as this?
DATE,VALUE,QUALITY,STATE,FILENAME,PRODUCT,TAG
2018-06-01,73.83,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
2018-06-02,45.53,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
2018-06-03,89.123,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
Or is there a better alternative for preparing these CSVs into a straightforward format to be ingested?
I have used PS to process simple CSVs before, but not with a metadata row that was important.
Thanks
Note: This is a faster alternative to thepip3r's helpful answer, and also covers the aspect of saving the modified content back to CSV files:
By using the switch statement to efficiently loop over the lines of the files as text, the costly calls to ConvertFrom-Csv, Select-Object and Export-Csv can be avoided.
Note that the switch statement is enclosed in $(), the subexpression operator, so as to enable writing back to the same file in a single pipeline; however, doing so requires keeping the entire (modified) file in memory; if that's not an option, enclose the switch statement in & { ... } and pipe it to Set-Content to a temporary file, which you can later use to replace the original file.
# Create a sample CSV file in the current dir.
@'
dinosaur.csv,water,Benjamin.Field.12.Location53.Readings,
DATE,VALUE,QUALITY,STATE
2018-06-01,73.83,Good,0
2018-06-02,45.53,Good,0
2018-06-03,89.123,Good,0
'@ > sample.csv
# Loop over all *.csv files in the current dir.
foreach ($csvFile in Get-Item *.csv) {
    $ndx = 0
    $(
        switch -File $csvFile.FullName {
            default {
                if ($ndx -eq 0) { # 1st line
                    $suffix = $_ -replace ',$' # save the suffix to append to data rows later
                } elseif ($ndx -eq 1) { # header row
                    $_ + ',FILENAME,PRODUCT,TAG' # add additional column headers
                } else { # data rows
                    $_ + ',' + $suffix # append suffix
                }
                ++$ndx
            }
        }
    ) # | Set-Content $csvFile.FullName # <- activate this to write back to the same file.
      #   Use -Encoding as needed.
}
The above yields the following:
DATE,VALUE,QUALITY,STATE,FILENAME,PRODUCT,TAG
2018-06-01,73.83,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
2018-06-02,45.53,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
2018-06-03,89.123,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
## If your inital block is an accurate representation
$s = get-content .\test.txt
## Get the 'metadata' line
$metaline = $s[0]
## Remove the metadata line from the original and turn it into a custom powershell object
$n = $s | where-object { $_ -ne $metaline } | ConvertFrom-Csv
## Split the metadata line by a comma to get the different parts for appending to the other content
$m = $metaline.Split(',')
## Loop through each item and append the metadata information to each entry
for ($i=0; $i -lt $n.Count; $i++) {
    $n[$i] = $n[$i] | Select-Object -Property *,FILENAME,PRODUCT,TAG ## This is a cheap way to create new properties on an object
    $n[$i].Filename = $m[0]
    $n[$i].Product = $m[1]
    $n[$i].Tag = $m[2]
}
## Display that the new objects reports as the desired output
$n | format-table

Powershell replace text once per line

I have a PowerShell script that I am trying to work out part of. The text input to it lists the user group each user is part of, and the script is supposed to replace that group with the groups I am assigning them in Active Directory (I am limited to only changing groups in Active Directory). My issue is that when it reaches HR and replaces it, it also replaces the HR inside CHRL, so my groups look nuts right now. Looking it over, it doesn't do this with every line, but for gilchrist it will put something in there for the HR in the name. Is there anything I can do to keep it from changing, or am I going to have to change my HR to Human Resources? Thanks for the help.
$lookupTable = @{
    'Admin'    = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
    'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
    'HR'       = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'
}
$original_file = 'c:\tmp\test.txt'
$destination_file = 'c:\tmp\test2.txt'
Get-Content -Path $original_file | ForEach-Object {
    $line = $_
    $lookupTable.GetEnumerator() | ForEach-Object {
        if ($line -match $_.Key)
        {
            $line = $line -replace $_.Key, $_.Value
        }
    }
    $line
} | Set-Content -Path $destination_file
Get-Content $destination_file
test.txt:
user,group
john.smith,Admin
joanha.smith,HR
john.gilchrist,security
aaron.r.smith,admin
abby.doe,secuity
abigail.doe,admin
Your input appears to be in CSV format (though note that your sample rows have trailing spaces, which you'd have to deal with, if they're part of your actual data).
Therefore, use Import-Csv and Export-Csv to read / rewrite your data, which allows a more concise and convenient solution:
Import-Csv test.txt |
    Select-Object user, @{ Name='group'; Expression = { $lookupTable[$_.group] } } |
    Export-Csv -NoTypeInformation -Encoding Utf8 test2.txt
Import-Csv reads the CSV file as a collection of custom objects whose properties correspond to the CSV column values; that is, each object has a .user and a .group property in your case.
$_.group therefore robustly reports the abstract group name only, which you can directly pass to your lookup hashtable; Select-Object is used to pass the original .user value through, and to replace the original .group value with the lookup result, using a calculated property.
Export-Csv re-converts the custom objects to a CSV file:
-NoTypeInformation suppresses the (usually useless) data-type-information line at the top of the output file
-Encoding Utf8 was added to prevent potential data loss, because ASCII encoding is used by default.
Note that Export-Csv blindly double-quotes all field values, whether they need it or not; that said, CSV readers should be able to deal with that (and Import-Csv certainly does).
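You can see that blanket quoting with ConvertTo-Csv, which produces the same text Export-Csv would write (the quoting shown is Windows PowerShell behavior; PowerShell 7+ quotes only when necessary unless told otherwise):

```powershell
# Every field is double-quoted in Windows PowerShell, needed or not.
[pscustomobject]@{ user = 'john.smith'; group = 'Admin' } |
    ConvertTo-Csv -NoTypeInformation
# Windows PowerShell output:
#   "user","group"
#   "john.smith","Admin"
```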
As for what you tried:
The -replace operator replaces all occurrences of a given regex (regular expression) in the input.
Your regexes amount to looking for (case-insensitive) substrings, which explains why HR matches both the HR group name and the substring hr in the username gilchrist.
A simple workaround would be to add assertions to your regex so that the substrings only match where you want them; e.g.: ,HR$ would only match after a , at the end of a line ($).
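A quick illustration of the difference the anchor makes (the row below is hypothetical sample data in the question's format):

```powershell
# Unanchored: 'HR' also matches the 'hr' inside 'gilchrist' (case-insensitively).
'john.gilchrist,HR' -replace 'HR', 'NEW'     # -> john.gilcNEWist,NEW
# Anchored: ',HR$' matches only when HR is the entire last field.
'john.gilchrist,HR' -replace ',HR$', ',NEW'  # -> john.gilchrist,NEW
```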
However, your approach of enumerating the hashtable keys for each input CSV row is inefficient, and you're better off splitting off the group name and doing a straight lookup based on it:
# Split the row into fields.
$fields = $line -split ','
# Update the group value (last field)
$fields[-1] = $lookupTable[$fields[-1]]
# Rebuild the line
$line = $fields -join ','
Note that you'd have to make an exception for the header row (e.g., test if the lookup result is empty and refrain from updating, if so).
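Putting those pieces together, a sketch of the full single-lookup loop might look like this (the empty-lookup test for the header row and the .Trim() for the trailing spaces are my additions; file paths are taken from the question):

```powershell
$lookupTable = @{
    'Admin'    = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
    'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
    'HR'       = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'
}
Get-Content 'c:\tmp\test.txt' | ForEach-Object {
    $fields = $_ -split ','
    # Hashtable lookups are case-insensitive by default, so 'admin' finds 'Admin'.
    $mapped = $lookupTable[$fields[-1].Trim()]
    # Leave the header row ('group') and unmatched names untouched.
    if ($mapped) { $fields[-1] = $mapped }
    $fields -join ','
} | Set-Content 'c:\tmp\test2.txt'
```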
Why don't you load your text file as a CSV file, using Import-CSV and use "," as a delimiter?
This will give you a PowerShell object you can work on, and then export as text or CSV. If I use your file and lookup table, this code may help you:
$file = Import-Csv -Delimiter "," -Path "c:\ps\test.txt"
$lookupTable = @{
    'Admin'    = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
    'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
    'HR'       = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'}
foreach ($i in $file) {
    # Compare and replace
    ...
}
$file | Export-Csv -Path "c:\ps\test2.txt" -Delimiter "," -NoTypeInformation
You can then iterate over $file and compare and replace. You can also Export-Csv after you're done.

Retrieving second part of a line when first part matches exactly

I used the below steps to retrieve a string from file
$variable = 'abc#yahoo.com'
$test = $variable.split('#')[0];
$file = Get-Content C:\Temp\file1.txt | Where-Object { $_.Contains($test) }
$postPipePortion = $file | Foreach-Object {$_.Substring($_.IndexOf("|") + 1)}
This results in all lines that contain $test as a substring. I just want the result to contain only the lines that exactly matches $test.
For example, If a file contains
abc_def|hf#23$
abc|ohgvtre
I just want the text ohgvtre
If I understand the question correctly you probably want to use Import-Csv instead of Get-Content:
Import-Csv 'C:\Temp\file1.txt' -Delimiter '|' -Header 'foo', 'bar' |
    Where-Object { $_.foo -eq $test } |
    Select-Object -Expand bar
To address the exact matching, you should be testing for equality (-eq) rather than substring containment (.Contains()). Also, there is no need to parse the data multiple times. Here is your code, rewritten to operate in one pass over the data using the -split operator.
$variable = 'abc#yahoo.com'
$test = $variable.split('#')[0];
$postPipePortion = (
    # Iterate once over the lines in file1.txt
    Get-Content C:\Temp\file1.txt | foreach {
        # Split the string, keeping both parts in separate variables.
        # Note the backslash - the argument to the -split operator is a regex
        $first, $second = ($_ -split '\|')
        # When the first half matches, output the second half.
        if ($first -eq $test) {
            $second
        }
    }
)

How to read multiple data sets from one .csv file in powershell

I have a temp recorder that (daily) reads multiple sensors and saves the data to a single .csv with a whole bunch of header information before each set of date/time and temperature. the file looks something like this:
"readerinfo","onlylistedonce"
"downloadinfo",YYYY/MM/DD 00:00:00
"timezone",-8
"headerstuff","headersuff"
"sensor1","sensorstuff"
"serial#","0000001"
"about15lines","ofthisstuff"
"header1","header2"
datetime,temp
datetime,temp
datetime,temp
"sensor2","sensorstuff"
"serial#","0000002"
"about15lines","ofthisstuff"
"header1","header2"
datetime,temp
datetime,temp
datetime,temp
"downloadcomplete"
My aim is to pull out the date/time and temp data for each sensor and save it as a new file so that I can run some basic stats(hi/lo/avg temp)on it. (It would be beautiful if I could somehow identify which sensor the data came from based on a serial number listed in the header info, but that's less important than separating out the data into sets) The lengths of the date/time lists change from sensor to sensor based on how long they've been recording and the number of sensors changes daily also. Even if I could just split the sensor data, header info and all, into however many files there are sensors, that would be a good start.
This isn't exactly a CSV file in the traditional sense. I imagine you already know this, given your description of the file contents.
If the lines with datetime,temp truly do not have any double quotes in them, per your example data, then the following script should work. This script is self-containing, since it declares the example data in-line.
IMPORTANT: You will need to modify the line containing the declaration of the $SensorList variable. You will have to populate this variable with the names of the sensors, or you can parameterize the script to accept an array of sensor names.
UPDATE: I changed the script to be parameterized.
Results
The results of the script are as follows:
sensor1.csv (with corresponding data)
sensor2.csv (with corresponding data)
Some green text will be written to the PowerShell host, indicating which sensor is currently detected
Script
The contents of the script should appear as follows. Save the script file to a folder, such as c:\test\test.ps1, and then execute it.
# Declare text as a PowerShell here-string
$Text = @"
"readerinfo","onlylistedonce"
"downloadinfo",YYYY/MM/DD 00:00:00
"timezone",-8
"headerstuff","headersuff"
"sensor1","sensorstuff"
"serial#","0000001"
"about15lines","ofthisstuff"
"header1","header2"
datetime,tempfromsensor1
datetime,tempfromsensor1
datetime,tempfromsensor1
"sensor2","sensorstuff"
"serial#","0000002"
"about15lines","ofthisstuff"
"header1","header2"
datetime,tempfromsensor2
datetime,tempfromsensor2
datetime,tempfromsensor2
"downloadcomplete"
"@.Split("`n");
# Declare the list of sensor names
$SensorList = @('sensor1', 'sensor2');
$CurrentSensor = $null;
# WARNING: Clean up all CSV files in the same directory as the script
Remove-Item -Path $PSScriptRoot\*.csv;
# Iterate over each line in the text file
foreach ($Line in $Text) {
    #region Line matches double quote
    if ($Line -match '"') {
        # Parse the property/value pairs (where double quotes are present)
        if ($Line -match '"(.*?)",("(?<value>.*)"|(?<value>.*))') {
            $Entry = [PSCustomObject]@{
                Property = $matches[1];
                Value = $matches['value'];
            };
            if ($matches[1] -in $SensorList) {
                $CurrentSensor = $matches[1];
                Write-Host -ForegroundColor Green -Object ('Current sensor is: {0}' -f $CurrentSensor);
            }
        }
    }
    #endregion Line matches double quote
    #region Line does not match double quote
    else {
        # Parse the datetime/temp pairs
        if ($Line -match '(.*?),(.*)') {
            $Entry = [PSCustomObject]@{
                DateTime = $matches[1];
                Temp = $matches[2];
            };
            # Write the sensor's datetime/temp to its file
            Add-Content -Path ('{0}\{1}.csv' -f $PSScriptRoot, $CurrentSensor) -Value $Line;
        }
    }
    #endregion Line does not match double quote
}
Using the data sample you provided, the output of this script would be as follows:
C:\sensoroutput_20140204.csv
sensor1,datetime,temp
sensor1,datetime,temp
sensor1,datetime,temp
sensor2,datetime,temp
sensor2,datetime,temp
sensor2,datetime,temp
I believe this is what you are looking for. The assumption here concerns the newline characters. The get-content line reads the data and breaks it into "sets" by using two newline characters as the delimiter to split on. I chose to use the environment's (Windows) newline character. Your source file may have different newline characters; you can use Notepad++ to see which characters they are, e.g. \r\n, \n, etc.
$newline = [Environment]::NewLine
$srcfile = "C:\sensordata.log"
$dstpath = 'C:\sensoroutput_{0}.csv' -f (get-date -f 'yyyyMMdd')
# Reads file as a single string with out-string
# then splits with a delimiter of two new line chars
$datasets = get-content $srcfile -delimiter ($newline * 2)
foreach ($ds in $datasets) {
    $lines = ($ds -split $newline) # Split dataset into lines
    $setname = $lines[0] -replace '\"(\w+).*', '$1' # Get the set or sensor name
    $lines | % {
        if ($_ -and $_ -notmatch '"') { # No empty lines and no lines with quotes
            $data = ($setname, ',', $_ -join '') # Concats set name, datetime, and temp
            Out-File -filepath $dstpath -inputObject $data -encoding 'ascii' -append
        }
    }
}