I have one row of temperature data in a text file that I would like to convert to a single column and save as a CSV file using a PowerShell script. The temperatures are separated by commas and look like this:
21,22,22,22,22,22,22,20,19,18,17,16,15,14,13,12,11,10,9,9,9,8,8,9,8,8,8,9,9,8,8,8,9,9,9,8,8,8,8,8,9,10,12,14,15,17,19,20,21,21,21,21,21,21,21,21,21,21,21,20,20,20,20,20,22,24,25,26,27,27,27,28,28,28,29,29,29,28,28,28,28,28,28,27,27,27,27,27,29,30,32,32,32,32,33,34,35,35,34,33,32,32,31,31,30,30,29,29,28,28,27,28,29,31,33,34,35,35,35,36,36,36,36,36,36,36,36,36,37,37,37,37,37,37,38,39,40,42,43,43,43,43,43,42,42,42,41,41,41,41,40,39,37,36,35,34,33,32,31,31,31,31,31,31,31,31,31,31,
I have tried several methods based on searches in this forum. I thought this approach might work, but it returns an error: Transpose rows to columns in PowerShell
This is the modified code I tried; it returns the error "Input string was not in a correct format.":
$txt = Get-Content 'C:\myfile.txt' | Out-String
$txt -split '(?m)^,\r?\n' | ForEach-Object {
# create empty array
$row = @()
$arr = $_ -split '\r?\n'
$k = 0
for ($n = 0; $n -lt $arr.Count; $n += 2) {
$i = [int]$arr[$n]
# if index from record ($i) is greater than current index ($k) append
# required number of empty fields
for ($j = $k; $j -lt $i-1; $j++) { $row += $null }
$row += $arr[$n+1]
$k = $i
}
$row -join '|'
}
This seems like it should be simple to do with only one row of data. Are there any suggestions on how to convert this single row of numbers to one column?
Try this:
# convert row to column data
$header = 'TEMPERATURE'
$values = $(Get-Content input.dat) -split ','
$header, $values | Out-File result.csv
#now test the result
Import-Csv result.csv
The header is the first line (or record) in the CSV file. In this case it's a single word, because there is only one column.
The values are the items between the commas in the input; the -split on commas generates an array of strings. Note that if the comma is a separator, there should be no comma after the last temperature. Your sample data ends with a trailing comma, but I have assumed that the real data does not.
Then we just write the header and the array out to a file. But what happened to all the commas? It turns out that, for a single-column CSV file, there are no commas separating fields, so the result is a simple CSV file.
Last, there is a test of the output using Import-Csv to read the result and display it in table format.
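For example, if input.dat contained just the single line 21,22,23 (no trailing comma), result.csv would look like this:
TEMPERATURE
21
22
23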
This isn't necessarily the best way to code it, but it might help a beginner get used to PowerShell.
Assuming I'm understanding your intent correctly, based on your verbal description (not your own coding attempt):
# Create simplified sample input file
@'
21,22,23,
'@ > myfile.txt
# Read the line, split it into tokens by ",", filter out empty elements
# with `-ne ''` (to ignore empty elements, such as would
# result from the trailing "," in your sample input),
# and write to an output CSV file with a column name prepended.
(Get-Content myfile.txt) -split ',' -ne '' |
ForEach-Object -Begin { 'Temperatures' } -Process { $_ } |
Set-Content out.csv
More concise alternative, using an expandable (interpolating) here-string:
# Note: .TrimEnd(',') removes any trailing "," from the input.
# Your sample input suggests that this is necessary.
# If there are no trailing "," chars., you can omit this call.
#"
Temperatures
$((Get-Content myfile.txt).TrimEnd(',') -split ',' -join [Environment]::NewLine)
"# > out.csv
out.csv then contains:
Temperatures
21
22
23
I have a PowerShell script that scans log files and replaces text when a match is found. The list is currently 500 lines, and I plan to double or triple this. The log files can range from 400KB to 800MB in size.
Currently, when using the code below, a 42MB file takes 29 minutes, and I'm looking for help if anyone can see any way to make this faster.
I tried replacing ForEach-Object with ForEach-ObjectFast, but that caused the script to take significantly longer. I also tried changing the first ForEach-Object to a for loop, but it still took ~29 minutes.
$lookupTable = @{
'aaa:bbb:123' = 'WORDA:WORDB:NUMBER1'
'bbb:ccc:456' = 'WORDB:WORDBC:NUMBER456'
}
Get-Content -Path $inputfile | ForEach-Object {
$line = $_
$lookupTable.GetEnumerator() | ForEach-Object {
if ($line -match $_.Key)
{
$line = $line -replace $_.Key, $_.Value
}
}
$line
} | Set-Content -Path $outputfile
Since you say your input file could be 800MB in size, reading and updating the entire content in memory may not be feasible.
The way to go, then, is to use a fast line-by-line method, and the fastest I know of is the switch statement.
# hardcoded here for demo purposes.
# In real life you get/construct these from the Get-ChildItem
# cmdlet you use to iterate the log files in the root folder.
$inputfile = 'D:\Test\test.txt'
$outputfile = 'D:\Test\test_new.txt' # absolute full file path because we use .Net here
# because we are going to Append to the output file, make sure it doesn't exist yet
if (Test-Path -Path $outputfile -PathType Leaf) { Remove-Item -Path $outputfile -Force }
$lookupTable = @{
'aaa:bbb:123'='WORDA:WORDB:NUMBER1'
}
# create a regex string from the Keys of your lookup table,
# merging the strings with a pipe symbol (the regex 'OR').
# your Keys could contain characters that have special meaning in regex, so we need to escape those
$regexLookup = '({0})' -f (($lookupTable.Keys | ForEach-Object { [regex]::Escape($_) }) -join '|')
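# (With the single demo key above, $regexLookup becomes '(aaa:bbb:123)', since [regex]::Escape leaves colons untouched.)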
# create a StreamWriter object to write the lines to the new output file
# Note: use an ABSOLUTE full file path for this
$streamWriter = [System.IO.StreamWriter]::new($outputfile, $true) # $true for Append
switch -Regex -File $inputfile {
$regexLookup {
# do the replacement using the value in the lookup table.
# because in one line there may be multiple matches to replace
# get a System.Text.RegularExpressions.Match object to loop through all matches
$line = $_
$match = [regex]::Match($line, $regexLookup)
while ($match.Success) {
# because we escaped the keys, to find the correct entry we now need to unescape
$line = $line -replace $match.Value, $lookupTable[[regex]::Unescape($match.Value)]
$match = $match.NextMatch()
}
$streamWriter.WriteLine($line)
}
default { $streamWriter.WriteLine($_) } # write unchanged
}
# dispose of the StreamWriter object
$streamWriter.Dispose()
I'm having trouble re-assembling certain filenames (and discarding the rest) from a text file. The filenames are split up (usually on three lines) and there is always a blank line after each filename. I only want to keep filenames that begin with OPEN or FOUR. An example is:
OPEN.492820.EXTR
A.STANDARD.38383
333
FOUR.383838.282.
STAND.848484.NOR
MAL.3939
CLOSE.3480384.ST
ANDARD.39393939.
838383
The output I'd like would be:
OPEN.492820.EXTRA.STANDARD.38383333
FOUR.383838.282.STAND.848484.NORMAL.3939
Thanks for any suggestions!
The following worked for me; you can give it a try.
See https://regex101.com/r/JuzXOb/1 for the Regex explanation.
$source = 'fullpath/to/inputfile.txt'
$destination = 'fullpath/to/resultfile.txt'
[regex]::Matches(
(Get-Content $source -Raw),
'(?msi)^(OPEN|FOUR)(.*?|\s*?)+([\r\n]$|\z)'
).Value.ForEach({ -join($_ -split '\r?\n').ForEach('Trim') }) |
Out-File $destination
For testing:
$txt = @'
OPEN.492820.EXTR
A.STANDARD.38383
333
FOUR.383838.282.
STAND.848484.NOR
MAL.3939
CLOSE.3480384.ST
ANDARD.39393939.
838383
OPEN.492820.EXTR
A.EXAMPLE123
FOUR.383838.282.
STAND.848484.123
ZXC
'@
[regex]::Matches(
$txt,
'(?msi)^(OPEN|FOUR)(.*?|\s*?)+([\r\n]$|\z)'
).Value.ForEach({ -join($_ -split '\r?\n').ForEach('Trim') })
Output:
OPEN.492820.EXTRA.STANDARD.38383333
FOUR.383838.282.STAND.848484.NORMAL.3939
OPEN.492820.EXTRA.EXAMPLE123
FOUR.383838.282.STAND.848484.123ZXC
Read the file one line at a time and keep concatenating them until you encounter a blank line, at which point you output the concatenated string and repeat until you reach the end of the file:
# this variable will keep track of the partial file names
$fileName = ''
# use a switch to read the file and process each line
switch -Regex -File ('path\to\file.txt') {
# when we see a blank line...
'^\s*$' {
# ... we output it if it starts with the right word
if ($fileName -cmatch '^(OPEN|FOUR)') { $fileName }
# and then start over
$fileName = ''
}
default {
# must be a non-blank line, concatenate it to the previous ones
$fileName += $_
}
}
# remember to check and output the last one
if ($fileName -cmatch '^(OPEN|FOUR)') {
$fileName
}
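If you want the reassembled names written to a file rather than displayed on screen, one option is to wrap the whole thing in a script block and pipe its output to Set-Content (the output path here is just a placeholder):
& {
    $fileName = ''
    switch -Regex -File ('path\to\file.txt') {
        '^\s*$' {
            if ($fileName -cmatch '^(OPEN|FOUR)') { $fileName }
            $fileName = ''
        }
        default { $fileName += $_ }
    }
    # don't forget the last accumulated name
    if ($fileName -cmatch '^(OPEN|FOUR)') { $fileName }
} | Set-Content 'path\to\output.txt'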
I have a few thousand CSV files with a format similar to this (i.e. a table with a metadata row at the top):
dinosaur.csv,water,Benjamin.Field.12.Location53.Readings,
DATE,VALUE,QUALITY,STATE
2018-06-01,73.83,Good,0
2018-06-02,45.53,Good,0
2018-06-03,89.123,Good,0
Is it possible to use PowerShell to convert these CSV files into a simple table format such as this?
DATE,VALUE,QUALITY,STATE,FILENAME,PRODUCT,TAG
2018-06-01,73.83,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
2018-06-02,45.53,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
2018-06-03,89.123,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
Or is there a better alternative for getting these CSVs into a straightforward format to be ingested?
I have used PS to process simple CSVs before, but not ones where a metadata row mattered.
Thanks
Note: This is a faster alternative to thepip3r's helpful answer, and also covers the aspect of saving the modified content back to CSV files:
By using the switch statement to efficiently loop over the lines of the files as text, the costly calls to ConvertFrom-Csv, Select-Object and Export-Csv can be avoided.
Note that the switch statement is enclosed in $(), the subexpression operator, so as to enable writing back to the same file in a single pipeline. However, doing so requires keeping the entire (modified) file in memory. If that's not an option, enclose the switch statement in & { ... } and pipe it to Set-Content to write to a temporary file, which you can later use to replace the original file (a sketch of that variant follows the sample output below).
# Create a sample CSV file in the current dir.
@'
dinosaur.csv,water,Benjamin.Field.12.Location53.Readings,
DATE,VALUE,QUALITY,STATE
2018-06-01,73.83,Good,0
2018-06-02,45.53,Good,0
2018-06-03,89.123,Good,0
'@ > sample.csv
# Loop over all *.csv files in the current dir.
foreach ($csvFile in Get-Item *.csv) {
$ndx = 0
$(
switch -File $csvFile.FullName {
default {
if ($ndx -eq 0) { # 1st line
$suffix = $_ -replace ',$' # save the suffix to append to data rows later
} elseif ($ndx -eq 1) { # header row
$_ + ',FILENAME,PRODUCT,TAG' # add additional column headers
} else { # data rows
$_ + ',' + $suffix # append suffix
}
++$ndx
}
}
) # | Set-Content $csvFile.FullName # <- activate this to write back to the same file.
# Use -Encoding as needed.
}
The above yields the following:
DATE,VALUE,QUALITY,STATE,FILENAME,PRODUCT,TAG
2018-06-01,73.83,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
2018-06-02,45.53,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
2018-06-03,89.123,Good,0,dinosaur.csv,water,Benjamin.Field.12.Location53.Readings
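And, as mentioned at the top, if keeping the whole modified file in memory isn't an option, here is a minimal sketch of the temporary-file variant (the .tmp suffix is just an illustrative choice):
foreach ($csvFile in Get-Item *.csv) {
    $tmpFile = "$($csvFile.FullName).tmp"   # hypothetical temporary-file name
    & {
        $ndx = 0
        switch -File $csvFile.FullName {
            default {
                if ($ndx -eq 0) {            # 1st line: save the suffix to append later
                    $suffix = $_ -replace ',$'
                } elseif ($ndx -eq 1) {      # header row: add the extra column headers
                    $_ + ',FILENAME,PRODUCT,TAG'
                } else {                     # data rows: append the suffix
                    $_ + ',' + $suffix
                }
                ++$ndx
            }
        }
    } | Set-Content $tmpFile # use -Encoding as needed
    # replace the original file with the rewritten temporary file
    Move-Item -LiteralPath $tmpFile -Destination $csvFile.FullName -Force
}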
## If your initial block is an accurate representation
$s = get-content .\test.txt
## Get the 'metadata' line
$metaline = $s[0]
## Remove the metadata line from the original and turn it into a custom powershell object
$n = $s | where-object { $_ -ne $metaline } | ConvertFrom-Csv
## Split the metadata line by a comma to get the different parts for appending to the other content
$m = $metaline.Split(',')
## Loop through each item and append the metadata information to each entry
for ($i=0; $i -lt $n.Count; $i++) {
$n[$i] = $n[$i] | Select-Object -Property *,FILENAME,PRODUCT,TAG ## This is a cheap way to create new properties on an object
$n[$i].Filename = $m[0]
$n[$i].Product = $m[1]
$n[$i].Tag = $m[2]
}
## Display that the new objects reports as the desired output
$n | format-table
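If you then want to persist the reshaped data instead of just displaying it, an Export-Csv call along these lines should do (the output path is only an example):
$n | Export-Csv -Path .\test_new.csv -NoTypeInformation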
Within the content of a couple large text files, I am aiming to replace all occurrences of a specific character string with a new character string, simultaneously for 300 different character strings.
Is there any way I can do this using a comma- or tab-separated search-and-replace matrix such as this? (The actual character strings vary widely in their length and type of characters, but they do not contain commas or tabs.)
currentstring1,newstring1
currentstring2,newstring2
currentstring3,newstring3
aB9_./cdef,newstring4
...
currentstring300,newstring300
Here is something to get you started. If the replacement file is ~300 lines, then Import-Csv should be fine. However, if the file in which to replace strings is large, Get-Content will be a problem: it will try to read the entire file into memory, so you will need to iterate over the file line by line (see the sketch after the starter code below).
[cmdletbinding()]
Param()
$thefile = './largetextfile.txt'
$replfile = './repl.txt'
$reps = Import-Csv -Path $replfile -Header orgstring,repstring
foreach ($rep in $reps) {
Write-Verbose $rep
}
$lines = Get-Content -Path $thefile
foreach ($line in $lines) {
Write-Verbose $line
$newline = $line
foreach ($rep in $reps) {
$newline = $newline -replace $rep.orgstring,$rep.repstring
}
Write-Verbose $newline
}
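To follow that advice for a genuinely large input file, here is a hedged sketch of a line-by-line variant using .NET stream readers/writers (largetextfile_new.txt is just a placeholder output name, and the search strings are regex-escaped so they are matched literally):
$reps = Import-Csv -Path './repl.txt' -Header orgstring, repstring
# .NET wants absolute paths, so resolve them first
$inPath  = Convert-Path './largetextfile.txt'
$outPath = Join-Path (Get-Location) 'largetextfile_new.txt'
$reader = [System.IO.StreamReader]::new($inPath)
$writer = [System.IO.StreamWriter]::new($outPath)
try {
    while ($null -ne ($line = $reader.ReadLine())) {
        foreach ($rep in $reps) {
            # escape the search string so it is treated literally, not as a regex
            $line = $line -replace [regex]::Escape($rep.orgstring), $rep.repstring
        }
        $writer.WriteLine($line)
    }
}
finally {
    $reader.Dispose()
    $writer.Dispose()
}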
On the server (unix):
1. Make the rename matrix as below in a text editor, then copy it.
2. In the server directory where the files are located, paste the multi-line rename matrix as is.
3. Press Enter.
4. Some characters (like slashes) may need to be escaped if present in the string, and the * at the end may be replaced to specify particular files.
perl -pi -e 's/FINDTEXT1/REPLACETEXT1/g' *
perl -pi -e 's/FINDTEXT2/REPLACETEXT2/g' *
perl -pi -e 's/FINDTEXT3/REPLACETEXT3/g' *