Remove particular characters from lines and concatenate them - powershell

I have a problem where I need to cut specific characters from a line and then concatenate the line with the next lines, separated by commas.
Consider a text file abc.txt from which I need the last 3 lines. The last 3 lines are in this format:
11/7/2000 17:22:54 - Hello world.
19/7/2002 8:23:54 - Welcome to the new technology.
24/7/2000 9:00:13 - Eco earth
I need to remove the starting time stamp from each line and then concatenate the lines as
Hello world.,Welcome to the new technology,Eco earth.
The time stamp is not static and I want to make use of a regex
I tried the following:
$Words = (Get-Content -Path .\abc.txt|Select-Object -last 3|Out-String)
$Words = $Words -split('-')
$regex = "[0-9]{1,2}/[0-9]{1,2}/[0-9]{1,4} [0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}):[0-9]{1,3}"
The output I used to get is like
11/7/2000 17:22:54
Hello world
19/7/2002 8:23:54
Welcome to the new technology.
24/7/2000 9:00:13
Eco earth

There is no need to create a Regex that tries to figure out the timestamp part, because you want to skip that anyway.
This should work:
# read the file and get the last three lines as string array
$txt = Get-Content -Path 'D:\abc.txt' -Tail 3
# loop through the array and change the lines as you go
for ($i = 0; $i -lt $txt.Count; $i++) {
    $txt[$i] = ($txt[$i] -split '-', 2)[-1].Trim()
}
# finally, join the array with commas
$txt -join ','
Output:
Hello world.,Welcome to the new technology.,Eco earth
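Since the question asked for a regex, the same result can also be sketched with -replace. This is only a minimal sketch: it assumes every line starts with a d/M/yyyy H:mm:ss timestamp followed by ' - ', the sample lines are inlined in place of Get-Content, and the variable names are illustrative.

```powershell
# Sample lines inlined for illustration; in practice they would come from
# Get-Content -Path 'D:\abc.txt' -Tail 3
$lines = @(
    '11/7/2000 17:22:54 - Hello world.'
    '19/7/2002 8:23:54 - Welcome to the new technology.'
    '24/7/2000 9:00:13 - Eco earth'
)

# Assumed timestamp shape: d/M/yyyy H:mm:ss followed by ' - '
$timestamp = '^\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} - '

# -replace is applied to each element of the array; -join then builds one string
$joined = ($lines -replace $timestamp, '') -join ','
$joined
```

Anchoring the pattern with ^ means a hyphen inside the message text itself is left alone.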

Try this:
Get-Content "C:\temp\example.txt" | %{
    $array=$_ -split "-", 2
    $array[1].Trim()
}

When you have, for example, "DATE - blablabla" and call .Split("-") on it, you get:
DATE
blablabla
You can then pick the part you need by index with $string.Split("-")[index], so:
$string="12/15/18 08:05:10 - Hello World."
$string=$string.Split("-")[1]
returns " Hello World." (with a leading space).
You can then apply the Trim() method to the string; it removes whitespace before and after your string:
$string=$string.Trim()
which gives you "Hello World."
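For your case, where you always need the last 3 lines, split each line separately rather than the combined Out-String text (splitting the combined text leaves the next line's timestamp attached to each message):
$Words = Get-Content -Path .\abc.txt | Select-Object -Last 3
$end = ($Words | ForEach-Object { $_.Split("-", 2)[1].Trim() }) -join ","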

Related

Editing a specific column of data in a text file with powershell

So I've had a request to edit a CSV file by replacing column values with a set of unique numbers. Below is a sample of the original input file with a header line followed by a couple of rows. Note that the rows have NO column headers.
HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|IE|USD|20200605|EUR200717||
DD|GZFD|IE|USD|20200605|EUR200717||
What I’m looking to do is change say the values in column 3 with a unique number.
So far I have the following …
$i=0
$txtin = Get-Content "C:\Temp\InFile.txt" | ForEach {"$($_.split('|'))"-replace $_[2],$i++} |Out-File C:\Temp\csvout.txt
… but this isn’t working as it removes the delimiter and adds numbers in the wrong places …
HH0###0000000SLH30400110000100000002000000202006060202006050011100
1D1D1 1G1Z1F1D1 1I1E1 1U1S1D1 12101210101610151 1E1U1R1210101711171 1 1
2D2D2 2G2Z2F2D2 2I2E2 2U2S2D2 22202220202620252 2E2U2R2220202721272 2 2
Ideally I want it to look like this, whereby the values of 'IE' have been replaced by '01' and '02' in each row ...
HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|01|USD|20200605|EUR200717||
DD|GZFD|02|USD|20200605|EUR200717||
Any ideas on how to resolve would be much appreciated.
I think spreading this out over multiple lines of code makes it easier:
$txtin = Get-Content 'C:\Temp\InFile.txt'
# loop through the lines, skipping the first line
for ($i = 1; $i -lt $txtin.Count; $i++){
    $parts = $txtin[$i].Split('|')  # or use regex -split '\|'
    if ($parts.Count -ge 3) {
        $parts[2] = '{0:00}' -f $i      # update the 3rd column
        $txtin[$i] = $parts -join '|'   # rejoin the parts with '|'
    }
}
$txtin | Out-File -FilePath 'C:\Temp\csvout.txt'
Output will be:
HH ### SLH304 01100001 2 20200606 20200605 011100
DD|GZFD|01|USD|20200605|EUR200717||
DD|GZFD|02|USD|20200605|EUR200717||
Updated to use the more robust check suggested by mklement0. This avoids errors when the line does not have at least three parts in it after the split
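A variation on the loop above, as a rough sketch: it keeps a separate counter instead of reusing the line index, so the numbering stays sequential even if other lines without three '|'-separated parts appear in the file. The sample lines are inlined and the names $counter and $result are illustrative, not from the original answer.

```powershell
# Sample data inlined for illustration; in practice: Get-Content 'C:\Temp\InFile.txt'
$lines = @(
    'HH ### SLH304 01100001 2 20200606 20200605 011100'
    'DD|GZFD|IE|USD|20200605|EUR200717||'
    'DD|GZFD|IE|USD|20200605|EUR200717||'
)

$counter = 0
$result = foreach ($line in $lines) {
    $parts = $line.Split('|')
    if ($parts.Count -ge 3) {
        # Only lines that really have a 3rd column consume a number
        $counter++
        $parts[2] = '{0:00}' -f $counter
        $parts -join '|'
    }
    else {
        # Header (or any other short line) passes through unchanged
        $line
    }
}
$result
```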

Powershell replace last two occurrences of a '/' in file path with '.'

I have a file path, and I'm trying to replace the last two occurrences of the \ character with . and also completely remove the '{}' via PowerShell, then store the result in a variable.
So, turn this:
xxx-xxx-xx\xxxxxxx\x\{xxxx-xxxxx-xxxx}\xxxxx\xxxxx
Into this:
xxx-xxx-xx\xxxxxxx\x\xxxx-xxxxx-xxxx.xxxxx.xxxxx
I've tried to get this working with the replace cmdlet, but this seems to focus more on replacing all occurrences or the first/last occurrence, which isn't my issue. Any guidance would be appreciated!
Edit:
So, I have an Excel file and I'm creating a PowerShell script that uses a foreach loop over every row, which amounts to thousands of entries. For each of those entries, I want to create a secondary variable that takes the full path and saves that path minus the last two slashes. Here's the portion of the script that I'm working on:
Foreach($script in $roboSource)
{
$logFileName = "$($script.a).txt".Replace('(?<=^[^\]+-[^\]+)-','.')
}
$script.a will output thousands of entries in this format:
xxx-xxx-xx\xxxxxxx\x{xxxx-xxxxx-xxxx}\xxxxx\xxxxx
Which is expected.
I want $logFileName to output this:
xxx-xxx-xx\xxxxxxx\x\xxxx-xxxxx-xxxx.xxxxx.xxxxx
I'm just starting to understand regex, and I believe the capture group between the parenthesis should be catching at least one of the '\', but testing attempts show no changes after adding the replace+regex.
Please let me know if I can provide more info.
Thanks!
You can do this in two fairly simple -replace operations:
Remove { and }
Replace the last two \:
$str = 'xxx-xxx-xx\xxxxxxx\x\{xxxx-xxxxx-xxxx}\xxxxx\xxxxx'
$str -replace '[{}]' -replace '\\([^\\]*)\\([^\\]*)$','.$1.$2'
The second pattern matches:
\\ # 1 literal '\'
( # open first capture group
[^\\]* # 0 or more non-'\' characters
) # close first capture group
\\ # 1 literal '\'
( # open second capture group
[^\\]* # 0 or more non-'\' characters
) # close second capture group
$ # end of string
Which we replace with the first and second capture group values, but with . before, instead of \: .$1.$2
If you're using PowerShell 7 or newer (where -split accepts a negative limit), you can also take advantage of right-to-left -split:
($str -replace '[{}]' -split '\\',-3) -join '.'
-split '\\',-3 has the same effect as -split '\\',3, but splitting from the right rather than the left.
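As a side note on why the attempt in the question showed no change: the .Replace() string method performs a literal search, whereas the -replace operator treats its first argument as a regex. A small sketch with a made-up path illustrates the difference:

```powershell
# Hypothetical short path, used only to show the difference
$path = 'a\b\c'

# String.Replace() looks for the literal text '\\' (two backslashes),
# which does not occur in $path, so nothing is replaced
$viaMethod = $path.Replace('\\', '.')

# -replace interprets '\\' as a regex that matches one literal backslash
$viaOperator = $path -replace '\\', '.'

$viaMethod      # a\b\c
$viaOperator    # a.b.c
```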
A 2-step approach is simplest in this case:
# Input string.
$str = 'xxx-xxx-xx\xxxxxxx\x\{xxxx-xxxxx-xxxx}\xxxxx\xxxxx'
# Get everything before the "{"
$prefix = $str -replace '\{.+'
# Get everything starting with the "{", remove "{ and "}",
# and replace "\" with "."
$suffix = $str.Substring($prefix.Length) -replace '[{}]' -replace '\\', '.'
# Output the combined result (or assign to $logFileName)
$prefix + $suffix
If you wanted to do it with a single -replace operation (with nesting), things get more complicated:
Note: This solution requires PowerShell Core (v6.1+)
$str -replace '(.+)\{(.+)\}(.+)',
{ $_.Groups[1].Value + $_.Groups[2].Value + ($_.Groups[3].Value -replace '\\', '.') }
Also see the elegant PS-Core-only -split based solution with a negative index (to split only a fixed number of tokens off the end) in Mathias R. Jessen's helpful answer.
Try this:
$str='xxx-xxx-xx\xxxxxxx\x\{xxxx-xxxxx-xxxx}\xxxxx\xxxxx'
# remove the braces and split the path into an array
$Array=$str -replace '[{}]' -split '\\'
# join all elements except the last two with '\', then append the last two with '.'
"{0}.{1}.{2}" -f ($Array[0..($Array.Length -3)] -join '\'), $Array[-2], $Array[-1]

How can I loop through each record of a text file to replace a string of characters

I have a large .txt file containing records where a date string in each record needs to be incremented by 2 days which will then update the field to the right of it which contains dashes --------- with that date. For example, a record contains the following record data:
1440149049845_20191121000000 11/22/2019 -------- 0.000 0.013
I am replacing the -------- dashes with 11/24/2019 (2 days added to the date 11/22/2019) so that it shows as:
1440149049845_20191121000000 11/22/2019 11/24/2019 0.000 0.013
I have the replace working on a single record but need to loop through the entire .txt file to update all of the records. Here is what I tried:
$inputRecords = get-content '\\10.12.7.13\vipsvr\Rancho\MRDF_Report\_Report.txt'
foreach ($line in $inputRecords)
{
$item -match '\d{2}/\d{2}/\d{4}'
$inputRecords -replace '-{2,}',([datetime]$matches.0).adddays(2).tostring('MM/dd/yyyy') -replace '\b0\.000\b','0.412'
}
I get a PowerShell error stating: "Cannot convert null to type "System.DateTime""
I'm sorry but why are we using RegEx for something this simple?
I can see it if there are differently formatted lines in the file, you'd want to make sure you aren't manipulating unintended lines, but that's not indicated in the question. Even still, it doesn't seem like you need to match anything within the line itself. It seems like it's delimited on spaces which would make a simple split a lot easier.
Example:
$File = "C:\temp\Test.txt"
$Output =
ForEach ($Line in Get-Content $File)
{
    $TmpArray = $Line.Split(' ')
    $TmpArray[2] = (Get-Date $TmpArray[1]).AddDays(2).ToString('MM/dd/yyyy')
    $TmpArray -join ' '
}
The loop takes the date in the 2nd element of the array, adds two days, and assigns the result to the 3rd element, which held the dashes.
Notice there's no use of the += operator, which is very slow compared to simply assigning the loop's output to a variable. I wouldn't make a thing out of it, but considering we don't know how big the file is... Also note that in a .NET format string 'mm' means minutes, so a format of 'mm/dd/yyyy' puts the minutes (for example '00/24/2019') where the month should be; the format here uses 'MM' for a two-digit month.
You can still add logic to skip unnecessary lines if it's needed...
You can send $Output to a file with something like $Output | Out-File <FilePath>
Or this can be converted to a single pipeline that outputs directly to a file using | ForEach{...} instead of ForEach(.. in ..) If the file is truly huge and holding $Output in memory is an issue this is a good alternative.
Let me know if that helps.
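The single-pipeline variant mentioned above could look roughly like this. It is a sketch under stated assumptions: fields are separated by single spaces, the date is the 2nd field in MM/dd/yyyy form, and only lines whose 3rd field consists of dashes are changed. The sample records are inlined (the second one is made up to show a month rollover), and in practice the pipeline would start with Get-Content and end with Out-File.

```powershell
# Sample records inlined for illustration
$records = @(
    '1440149049845_20191121000000 11/22/2019 -------- 0.000 0.013'
    '1440149049846_20191129000000 11/30/2019 -------- 0.000 0.013'
)

$culture = [cultureinfo]::InvariantCulture

$updated = $records | ForEach-Object {
    $fields = $_.Split(' ')
    # Only touch lines whose 3rd field is the dash placeholder
    if ($fields.Count -ge 3 -and $fields[2] -match '^-+$') {
        $date = [datetime]::ParseExact($fields[1], 'MM/dd/yyyy', $culture)
        $fields[2] = $date.AddDays(2).ToString('MM/dd/yyyy', $culture)
    }
    $fields -join ' '
}
$updated
```

Parsing with ParseExact and the invariant culture avoids depending on the regional settings of the machine that runs the script.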
You mostly had the right idea, but here are a few suggested changes, in no particular order:
Use a new file instead of trying to replace the old file.
Iterate a line at a time, replace the ------, write to the new file.
Use '-match' instead of '-replace', because as you will see below that you need to manipulate the capture more than a simple '-replace' allows.
Use [datetime]::parseexact instead of trying to just force cast the captured text.
[string[]]$inputRecords = Get-Content ".\linesource.txt"
[string]$outputRecords = ""
foreach ($line in $inputRecords) {
    [string]$newLine = ""
    [regex]$logPattern = "^([\d_]+) ([\d/]+) (-+) (.*)$"
    if ($line -match $logPattern) {
        # 'MM' is the month; lowercase 'mm' would be parsed as minutes
        $origDate = [datetime]::ParseExact($Matches[2], 'MM/dd/yyyy', $null)
        $replacementDate = $origDate.AddDays(2)
        $newLine = $Matches[1]
        $newLine += " " + $origDate.ToString('MM/dd/yyyy')
        $newLine += " " + $replacementDate.ToString('MM/dd/yyyy')
        $newLine += " " + $Matches[4]
    } else {
        # lines that do not match the pattern are copied unchanged
        $newLine = $line
    }
    $outputRecords += "$newLine`n"
}
$outputRecords.ToString()
Even if you don't use the whole solution, hopefully at least parts of it will be helpful to you.
Using the suggested code from adamt8 and Steven, I added two echo statements to show what gets displayed in the variables $logPattern and $line, since it is not recognizing the pattern of characters to be updated. This is what displays from the echo:
Options MatchTimeout RightToLeft
CalNOD01 1440151020208_20191205000000 12/06/2019 12/10/2019
None -00:00:00.0010000 False
CalNOD01 1440151020314_20191205000000 12/06/2019 --------
None -00:00:00.0010000 False
this is the rendered output:
CalNOD01 1440151020208_20191205000000 12/06/2019 12/10/2019
CalNOD01 1440151020314_20191205000000 12/06/2019 --------
The code that was used was attached as an image, which is not reproduced here.
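The echoed lines start with an extra field (CalNOD01), so a pattern anchored with ^([\d_]+) at the start of the line would not match them. One possible way around that, sketched here under the assumption that the dashes always directly follow an MM/dd/yyyy date, is to match only the date and the dashes and compute the replacement in a match evaluator. The variable names are illustrative.

```powershell
# Sample lines taken from the echoed output above
$lines = @(
    'CalNOD01 1440151020208_20191205000000 12/06/2019 12/10/2019'
    'CalNOD01 1440151020314_20191205000000 12/06/2019 --------'
)

$culture = [cultureinfo]::InvariantCulture
$pattern = '(\d{2}/\d{2}/\d{4}) -{2,}'

# Script block cast to a .NET MatchEvaluator; called once per match
[System.Text.RegularExpressions.MatchEvaluator]$evaluator = {
    param($m)
    $dateText = $m.Groups[1].Value
    $date = [datetime]::ParseExact($dateText, 'MM/dd/yyyy', $culture)
    # Keep the original date and put the date + 2 days where the dashes were
    '{0} {1}' -f $dateText, $date.AddDays(2).ToString('MM/dd/yyyy', $culture)
}

$fixed = foreach ($line in $lines) {
    [regex]::Replace($line, $pattern, $evaluator)
}
$fixed
```

Lines without a dash placeholder, such as the first sample line, pass through unchanged.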

How to import first two values for each line in CSV file | PowerShell

I have a CSV file that is generated every day, with data such as:
windows:NT:v:n:n:d:n:n:n:n:m:n:n
I should also mention that that example is one of 3,900+ lines, and not every line of data has the same number of "columns". What I'm trying to do is import just the first two "columns" of data into a variable. For this example, it would be "Windows" and "NT", nothing else.
How would I go about doing this? I've tried using -delimiter ':', and not much luck.
The number of lines shouldn't matter.
My approach from the comment (on your previous question) should work:
if there is no header and you only want the first two columns,
just specify -Header 1,2
> import-csv .\strange.csv -delim ':' -Header (1..2) |Where 2 -eq 'NT'
1 2
- -
windows NT
Example for building the entire array
# use an ArrayList so that .Add() works (a plain @() array has a fixed size)
$Splitted_List = [System.Collections.ArrayList]@()
foreach($Line in Get-Content '.\myfilewithuseragents.txt'){
    $Splitted = $Line -split ":"
    $Splitted_Object = [PSCustomObject]@{
        part1 = $Splitted[0]
        part2 = $Splitted[1]
    }
    $Splitted_List.Add($Splitted_Object) | Out-Null
}
For every line you'll just read the line and with the string from that line, you're easily able to split it
$useragent = "windows:NT:v:n:n:d:n:n:n:n:m:n:n"
Then the first part will be referenced to as $useragent.Split(":")[0], the second as $useragent.Split(":")[1], etc.
Including the for-loop that would be something like
foreach($useragent in Get-Content '.\myfilewithuseragents.txt') {
    $splitted = $useragent.Split(":")
    $part1 = $splitted[0]
}
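Putting the pieces together, a compact sketch that keeps only the first two fields of each line might look like this. The sample lines are inlined (the second one is made up to show a shorter line), and the property names Platform and Version are hypothetical, chosen only for illustration.

```powershell
# Sample lines inlined for illustration; in practice: Get-Content '.\myfilewithuseragents.txt'
$lines = @(
    'windows:NT:v:n:n:d:n:n:n:n:m:n:n'
    'linux:x86'
)

$pairs = foreach ($line in $lines) {
    # Split into at most 3 parts; everything after the 2nd ':' is discarded via $null
    $first, $second, $null = $line -split ':', 3
    [PSCustomObject]@{
        Platform = $first    # hypothetical property name
        Version  = $second   # hypothetical property name
    }
}
$pairs
```

Limiting the split to three parts means lines with many extra fields are not split further than needed.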

Entering a string with new line characters into excel using powershell

I have a string str.
$str="abcd_1
abcd_2
abcd_3"
First, in the for loop, I concatenate the values into one full string of IDs, each followed by a carriage return + newline.
When splitting, I use just the newline character.
The data entered from the second cell onward gets a leading space.
for($intRow = $trow ; $intRow -le $maxRow ; $intRow++){
    $codeName = $currentCode
    $fin = $codeName + "_" + $i + "`r`n"
    $finCode=$finCode+$fin
    $i= $i + 1
}
$currentSheet.Cells.Item($fRow,$currentCol).Value2 = $finCode
$clipboardData = $finCode.Split("`n").TrimStart()
$newClipboardData = $clipboardData.Where({$_.TrimStart() -ne ""}).ForEach({$_.TrimStart()})
[System.Windows.Forms.Clipboard]::SetText($newClipboardData)
$currentSheet.Cells.Item($fRow,$currentCol).Select() | Out-Null
$currentSheet.Paste() | Out-Null
Just to be more clear and precise. This example shows how to split the string by new line character. Just increase the $intRow every time to write in next row:
ForEach($strValue In $str.Split("`n"))
{
    $currentSheet.Cells.Item($intRow,$currentCol).Value2 = $strValue
    $intRow++
}
New Code:
[System.Windows.Forms.Clipboard]::SetText($str.Split("`n"))
$currentSheet.Cells.Item($intRow,$currentCol).Select() | Out-Null
$currentSheet.Paste() | Out-Null
You can eliminate spaces using Trim. But splitting on `r`n is not giving me the correct data; it returns a single line:
$clipboardData = $str.Split("`n").TrimStart()
$newClipboardData = $clipboardData.Where({$_.TrimStart() -ne ""}).ForEach({$_.TrimStart()})
[System.Windows.Forms.Clipboard]::SetText($newClipboardData)
When I checked the contents of the values, they do not have any leading spaces, but a single space shows up when pasted into Excel.
$newClipboardData | %{ "-$_-" }
-Some Text-
-Some text With spaces -
-Some text again-
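One likely explanation for the extra space: SetText() expects a single string, and when PowerShell converts an array to a string it joins the elements with $OFS, which defaults to a space. The sketch below (without the clipboard or Excel calls, and assuming $OFS has not been changed) shows the implicit conversion next to an explicit join:

```powershell
$str = "abcd_1`r`nabcd_2`r`nabcd_3"

# Split on CRLF or LF, trim each piece and drop empty ones
$cells = $str -split '\r?\n' |
    ForEach-Object { $_.Trim() } |
    Where-Object { $_ -ne '' }

# What an array turns into when a single string is expected:
# the elements are joined with a space (the default $OFS)
$implicit = [string]$cells

# Joining explicitly keeps one value per line with no extra spaces
$clipboardText = $cells -join "`r`n"

$implicit    # abcd_1 abcd_2 abcd_3
```

Passing $clipboardText, rather than the array itself, to [System.Windows.Forms.Clipboard]::SetText() should then avoid the leading space in the pasted cells.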