Escaping forward slash problems - powershell

In the following semi-pseudo code, the forward-slash of the first element in the array $system is always read as a back-slash.
I have tried the various escape characters such as ` and \ but to no avail. Is this a known problem in PowerShell? How to solve?
$system = @("Something/Anything", "Super Development", "Quality Assurance")
# the following is looped with $y
$string | ConvertTo-Json | FT | Out-File -Append C:\Test\Results\$($system[$y])_All.csv
# error:
Message : Could not find a part of the path 'C:\Test\Results\Something\Anything_All.csv'

As @autosvet already mentioned in the comments to your question, there are several reserved characters that can't be used in file names/paths on Windows, namely:
Use any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255), except for the following:
The following reserved characters:
< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)
These characters can't be escaped, only replaced. You can use the GetInvalidFileNameChars() method for programmatically determining the characters that need to be replaced:
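# Join the invalid characters into one string, escape it, and wrap it in a character class
$invalid = '[{0}]' -f [regex]::Escape(([IO.Path]::GetInvalidFileNameChars() -join ''))
$string | ConvertTo-Json | FT |
Out-File -Append "C:\Test\Results\$($system[$y] -replace $invalid, '_')_All.csv"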
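The replacement step can also be wrapped in a small helper. What follows is a minimal sketch rather than part of the answer above; the function name ConvertTo-SafeFileName and its parameters are made up for illustration:

```powershell
# Hypothetical helper: replaces every character that is invalid in a file name.
function ConvertTo-SafeFileName {
    param(
        [Parameter(Mandatory)] [string] $Name,
        [string] $Replacement = '_'
    )
    # Join the invalid characters into one string, escape them for use in a
    # regex, and wrap them in a character class so each one matches on its own.
    $invalidChars = [IO.Path]::GetInvalidFileNameChars() -join ''
    $pattern = '[{0}]' -f [regex]::Escape($invalidChars)
    $Name -replace $pattern, $Replacement
}

$system = @('Something/Anything', 'Super Development', 'Quality Assurance')
$safeNames = $system | ForEach-Object { ConvertTo-SafeFileName -Name $_ }
$safeNames
```

Only the first entry contains an invalid character (the forward slash), so the other two names should pass through unchanged.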


question about powershell text manipulation

I apologise for asking a very basic question, as I am a beginner in scripting.
I was wondering why I am getting different results from two different sources with the same formatting. Below are my samples:
file1.txt
Id Name Members
122 RCP_VMWARE-DMZ-NONPROD DMZ_NPROD01_111
DMZ_NPROD01_113
123 RCP_VMWARE-DMZ-PROD DMZ_PROD01_110
DMZ_PROD01_112
124 RCP_VMWARE-DMZ-INT.r87351 DMZ_TEMPL_210.r
DMZ_DECOM_211.r
125 RCP_VMWARE-LAN-NONPROD NPROD02_20
NPROD03_21
NPROD04_22
NPROD06_24
file2.txt
Id Name Members
4 HPUX_PROD HPUX_PROD.3
HPUX_PROD.4
HPUX_PROD.5
I'm trying to display the Name column, and with this code I'm able to display file1.txt correctly:
PS C:\Share> gc file1.txt |Select-Object -skip 1 | foreach-object { $_.split(" ")[1]} | ? {$_.trim() -ne "" }
RCP_VMWARE-DMZ-NONPROD
RCP_VMWARE-DMZ-PROD
RCP_VMWARE-DMZ-INT.r87351
RCP_VMWARE-LAN-NONPROD
However, with file2.txt I'm getting a different output:
PS C:\Share> gc .\file2.txt |Select-Object -skip 1 | foreach-object { $_.split(" ")[1]} | ? {$_.trim() -ne "" }
4
Changing the code to $_.split(" ")[2] displays the output correctly.
However, I would like to have a single piece of code that can be applied to both situations. I'd appreciate it if you could help me sort this out. Thank you in advance.
This happens because the latter file has a different format.
When examined carefully, one notices there are two spaces between 4 and HPUX_PROD strings:
Id Name Members
4 HPUX_PROD HPUX_PROD.3
^^^^
On the first file, there is a single space between number and string:
Id Name Members
122 RCP_VMWARE-DMZ-NONPROD DMZ_NPROD01_111
^^^
How to fix the issue depends on whether you need to match both file formats, or whether the second file simply contains a typing error.
The existing answers are helpful, but let me try to break it down conceptually:
.Split(" ") splits the input string by each individual space character, whereas what you're looking for is to split by runs of (one or more) spaces, given that your column values can be separated by more than one space.
For instance, 'a  b'.Split(' ') (note the two spaces) results in 3 array elements - 'a', '', 'b' - because the empty string between the two spaces is considered an element too.
The .NET [string] type's .Split() method is based on verbatim strings or character sets and therefore doesn't allow you to express the concept of "one or more spaces" as a split criterion, whereas PowerShell's regex-based -split operator does.
Conveniently, -split's unary form (see below) has this logic built in: it splits each input string by any nonempty run of whitespace, while also ignoring leading and trailing whitespace, which in your case obviates the need for a regex altogether.
This answer compares and contrasts the -split operator with the [string] type's .Split() method, and makes the case for routinely using the former.
Therefore, a working solution (for both input files) is:
Get-Content .\file2.txt | Select-Object -Skip 1 |
Foreach-Object { if ($value = (-split $_)[1]) { $value } }
Note:
If the column of interest contains a value (at least one non-whitespace character), so must all preceding columns in order for the approach to work. Also, column values themselves must not have embedded whitespace (which is true for your sample input).
The if conditional both extracts the 2nd column value ((-split $_)[1]) and assigns it to a variable ($value = ), whose value then implicitly serves as a Boolean:
Any nonempty string is implicitly $true, in which case the extracted value is output in the associated block ({ $value }); conversely, an empty string results in no output.
For a general overview of PowerShell's implicit to-Boolean conversions, see the bottom section of this answer.
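To make the difference between the two splitting behaviours concrete, here is a small self-contained sketch. The sample lines are made up (modelled on the files in the question, with runs of spaces between the columns), and the variable names are illustrative only:

```powershell
# Hypothetical sample lines: columns separated by runs of spaces, plus a
# continuation line that carries only a Members value.
$lines = @(
    'Id  Name       Members'
    '4   HPUX_PROD  HPUX_PROD.3'
    '               HPUX_PROD.4'
    '122 RCP_VMWARE-DMZ-PROD DMZ_PROD01_110'
)

# .Split(' ') yields an empty element for every extra space ...
$viaMethod = 'a  b'.Split(' ')
# ... whereas unary -split treats each run of whitespace as one separator.
$viaOperator = -split 'a  b'

$names = $lines | Select-Object -Skip 1 | ForEach-Object {
    $fields = -split $_
    # Only lines with at least two fields carry both an Id and a Name.
    if ($fields.Count -ge 2) { $fields[1] }
}
$names
```

Checking the field count is just one way to skip the continuation lines; the assignment-inside-if form shown above should behave the same way for this kind of input.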
Since this sort-of looks like csv output with spaces as delimiter (but not quite), I think you could use ConvertFrom-Csv on this:
# read the file as string array, trim each line and filter only the lines that
# when split on 1 or more whitespace characters has more than one field
# then replace the spaces by a comma and treat it as CSV
# return the 'Name' column only
(((Get-Content -Path 'D:\Test\file1.txt').Trim() |
Where-Object { @($_ -split '\s+').Count -gt 1 }) -replace '\s+', ',' |
ConvertFrom-Csv).Name
Or shorter; because you are only after the Name column, this works too:
((Get-Content -Path 'D:\Test\file2.txt').Trim() -replace '\s+', ',' | ConvertFrom-Csv).Name -ne ''
Output for file1
RCP_VMWARE-DMZ-NONPROD
RCP_VMWARE-DMZ-PROD
RCP_VMWARE-DMZ-INT.r87351
RCP_VMWARE-LAN-NONPROD
Output for file2
HPUX_PROD

Writing output of Class to text file adds blank lines

I created a class to gather some data in a script, although not sure if this is the appropriate use of this. When I output class to text file it adds 2 blank lines each time it writes to the file. Is there a way to remove this?
[int] $numOut = 0
[int] $numIn = 0
[int] $numNone = 0
[int] $numCPE = 0
[int] $numSQR = 0
[int] $numEGX = 0
[int] $numCQA = 0
Various parts of the code do a self-addition like this; these are the only types of manipulation applied to these variables:
$script:numOut += 1
$cLength = $randString.Length #this is a random string
$numSQR = $numCPE + $cLength #add CPE + length of random string
$total = $numOut + $numIn + $numNone + $numCPE + $numSQR + $numEGX + $numCQA
class Logging {
[string]$DateTime
[string]$User
[string]$numOut
[string]$numIn
[string]$numNone
[string]$numCPE
[string]$numSQR
[string]$numEGX
[string]$numCQA
[string]$total
}
$Logging = [Logging]::new()
$Logging.DateTime = Get-Date
$Logging.User = $env:username
$logging.NumOut = $numOut
$logging.NumIn = $numIn
$logging.NumNone = $numNone
$logging.NumCPE = $numCPE
$logging.NumSQR = $numSQR
$logging.NumEGX = $numEGX
$logging.NumCQA = $numCQA
$logging.Total = $total
write-output $logging | Format-Table -AutoSize -HideTableHeaders >> $CWD\log.txt
It writes to the file like this:
arealhobo 10/24/2020 19:47:24 1 0 1 1 1 0 1
arealhobo 10/24/2020 19:50:37 1 0 1 1 1 0 1
arealhobo 10/24/2020 19:53:15 1 0 1 1 1 0 1
You can replace the newlines first:
(Write-Output $logging | Format-Table -AutoSize -HideTableHeaders | Out-String) -replace '\r?\n', '' >> $CWD\log.txt
You could also implement a method to handle outputting to a file. Here's an example.
class Logging {
[string]$DateTime
[string]$User
[string]$numOut
[string]$numIn
[string]$numNone
[string]$numCPE
[string]$numSQR
[string]$numEGX
[string]$numCQA
[string]$total
Log($file){
$this | Export-Csv -Path $file -Delimiter "`t" -Append -NoTypeInformation
}
}
$Logging = [Logging]::new()
$Logging.DateTime = Get-Date
$Logging.User = $env:username
$logging.NumOut = $numOut
$logging.NumIn = $numIn
$logging.NumNone = $numNone
$logging.NumCPE = $numCPE
$logging.NumSQR = $numSQR
$logging.NumEGX = $numEGX
$logging.NumCQA = $numCQA
$logging.Total = $total
Now you can simply call $logging.log("path\to\logfile") specifying where to write.
$Logging.log("c:\Some\Path\logging.log")
Note: The scenario described below may not match the OP's. The answer may still be of interest if you find that file content prints as follows to the console after having used >> to append to a preexisting file in Windows PowerShell; note what appears to be extra spacing and extra empty lines:
To avoid your problem, which most likely stems from an unintended mix of different character encodings in the output file produced by >>, you have two options:
If you do know the character encoding used for the preexisting content in the output file, use Out-File -Append and match that encoding via the -Encoding parameter:
# Using UTF-8 in this example.
$logging | Format-Table -AutoSize -HideTableHeaders |
Out-File -Append -Encoding Utf8 $CWD\log.txt
Note that > / >> are in effect like calling Out-File / Out-File -Append, except that you don't get to control the character encoding.
In the unlikely event that you don't know the preexisting character encoding, you can use Add-Content, which matches it automatically - unlike >> / Out-File -Append - but that requires extra work:
An additional Out-String -Stream call is needed beforehand, to provide the formatting that >> (and > / Out-File) implicitly provide; without it, Add-Content (and Set-Content) apply simple .ToString() stringification of the output objects, and in the case of the objects output by Format-* cmdlets that results in useless representations, namely their type names only (e.g., Microsoft.PowerShell.Commands.Internal.Format.FormatStartData):
# Add-Content, unlike >>, matches the character encoding of the existing file.
# Since Add-Content, unlike > / >> / Out-File, uses simple .ToString()
# stringification you first need a call to `Out-String`, which provides
# the same formatting that > / >> / Out-File implicitly does.
$logging | Format-Table -AutoSize -HideTableHeaders |
Out-String -Stream | Add-Content $CWD\log.txt
Read on for background information.
Assuming you're using Windows PowerShell rather than PowerShell [Core] v6+[1]:
The most likely cause (the explanation doesn't fully match the output in your question, but I suspect that is a posting artifact):
You had a preexisting log.txt file with a single-byte character encoding[2], most likely either the legacy encoding based on your system's active ANSI code page or a UTF-8 encoded file (with or without a BOM).
When you appended content with >>, PowerShell blindly used its default character encoding for > / >>, which in Windows PowerShell[1] is "Unicode" (UTF-16LE), which is a double-byte encoding[2] - in effect (but not technically) these redirection operators are aliases for Out-File [-Append].
The result is that the newly appended text is misinterpreted when the file is later read, because the UTF-16LE characters are read byte by byte instead of being interpreted as the two-byte sequences that they are.
Since characters in the ASCII range have a NUL byte as the 2nd byte in their 2-byte representation, reading the file byte by byte sees an extra NUL ("`0") character after every original character.
On Windows[3], this has two effects when you print the file's content to the console with Get-Content:
What appears to be a space character is inserted between ASCII-range characters so that, say, foo prints as f o o - in reality, these are the extra NUL characters.
An extra, (apparently) empty line is inserted after every line, which is a side effect of PowerShell accepting different newline styles interchangeably (CRLF, LF, CR):
Due to the extra NULs, the original CRLF sequence ("`r`n") is read as "`r`0`n`0", which causes PowerShell to treat "`r" and "`n" individually as newlines (line breaks), resulting in the extra line.
Note that the extra line effectively contains a single NUL, and that the subsequent line then starts with a NUL (the trailing one from the "`n"), so among the misinterpreted lines all but the first one appear to start with a space.
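The effect can be reproduced in memory without touching any file. This is only an illustrative sketch: it encodes one short line as UTF-16LE and then deliberately decodes the bytes with a single-byte encoding (ASCII is used here as a stand-in for whatever single-byte encoding the reader assumes):

```powershell
# Encode a short line the way Windows PowerShell's >> does (UTF-16LE) ...
$bytes = [Text.Encoding]::Unicode.GetBytes("foo`r`n")

# ... then misread those bytes one byte per character.
$misread = [Text.Encoding]::ASCII.GetString($bytes)

# Every original character is now followed by a NUL, and CR and LF are no
# longer adjacent, so they are treated as two separate line breaks.
$nulCount = ($misread.ToCharArray() | Where-Object { $_ -eq [char]0 }).Count
$crlfStillAdjacent = $misread.Contains("`r`n")

"bytes: $($bytes.Count), characters read: $($misread.Length), NULs: $nulCount, CRLF intact: $crlfStillAdjacent"
```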
[1] PowerShell [Core] v6+ now consistently defaults to BOM-less UTF-8 across all cmdlets. While >> (Out-File -Append) still doesn't match an existing encoding, the prevalence of UTF-8 files makes this less of a problem. See this answer for more information about character encoding in PowerShell.
[2] Strictly speaking, UTF-8 and UTF-16 are variable-length encodings, because not every byte in UTF-8 is necessarily its own character (that only applies to chars. in the ASCII range), and, similarly, certain (exotic) characters require two 2-byte sequences in UTF-16. However, it is fair to say that UTF-8 / UTF-16 are single/double-byte-based.
[3] On Unix-like platforms (Linux, macOS) you may not even notice the problem when printing to the terminal, because their terminal emulators typically ignore NULs, and, due to LF ("`n") alone being used as newlines, no extra lines appear. Yet, the extra NULs are still present.

How to compare two sequential strings in a file

I have a big file consisting of "before" and "after" cases for every item, as follows:
case1 (BEF) ACT
(AFT) BLK
case2 (BEF) ACT
(AFT) ACT
case3 (BEF) ACT
(AFT) CLC
...
I need to select all of the strings which have (BEF) ACT on the "first" string and (AFT) BLK on the "second" and place the result to a file.
The idea is to create a clause like
IF (stringX.LineNumber consists of "(BEF) ACT" AND stringX+1.LineNumber consists of (AFT) BLK)
{OutFile $stringX+$stringX+1}
Sorry for the syntax, I've just started to work with PS :)
$logfile = 'c:\temp\file.txt'
$matchphrase = '\(BEF\) ACT'
$linenum=Get-Content $logfile | Select-String $matchphrase | ForEach-Object {$_.LineNumber+1}
$linenum
#I've worked out how to get a line number after the line with first required phrase
Create a new file with a result as follows:
string with "(BEF) ACT" following with a string with "(AFT) BLK"
Select-String -SimpleMatch -CaseSensitive '(BEF) ACT' c:\temp\file.txt -Context 0,1 |
ForEach-Object {
$lineAfter = $_.Context.PostContext[0]
if ($lineAfter.Contains('(AFT) BLK')) {
$_.Line, $lineAfter # output
}
} # | Set-Content ...
-SimpleMatch performs string-literal substring matching, which means you can pass the search string as-is, without needing to escape it.
However, if you needed to further constrain the search, such as to ensure that it only occurs at the end of a line ($), you would indeed need a regular expression with the (implied) -Pattern parameter: '\(BEF\) ACT$'
Also note PowerShell is generally case-insensitive by default, which is why switch -CaseSensitive is used.
Note how Select-String can accept file paths directly - no need for a preceding Get-Content call.
-Context 0,1 captures 0 lines before and 1 line after each match, and includes them in the [Microsoft.PowerShell.Commands.MatchInfo] instances that Select-String outputs.
Inside the ForEach-Object script block, $_.Context.PostContext[0] retrieves the line after the match and .Contains() performs a literal substring search in it.
Note that .Contains() is a method of the .NET System.String type, and such methods - unlike PowerShell - are case-sensitive by default, but you can use an optional parameter to change that.
If the substring is found on the subsequent line, both the line at hand and the subsequent one are output.
The above looks for all matching pairs in the input file; if you only wanted to find the first pair, append | Select-Object -First 2 to the end of the pipeline.
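For experimenting with this approach, the following self-contained sketch writes a few made-up sample lines to a temporary file and runs the same kind of pipeline against it. The file name and sample data are assumptions, and the extra check on $lineAfter is an addition that guards against a match on the very last line, where no following line exists:

```powershell
# Hypothetical sample data, written to a temporary file.
$path = Join-Path ([IO.Path]::GetTempPath()) 'bef-aft-sample.txt'
@(
    'case1 (BEF) ACT'
    '      (AFT) BLK'
    'case2 (BEF) ACT'
    '      (AFT) ACT'
    'case3 (BEF) ACT'
) | Set-Content -Path $path

$pairs = Select-String -SimpleMatch -CaseSensitive '(BEF) ACT' -Path $path -Context 0,1 |
    ForEach-Object {
        # PostContext is empty when the match is the last line of the file.
        $lineAfter = $_.Context.PostContext | Select-Object -First 1
        if ($lineAfter -and $lineAfter.Contains('(AFT) BLK')) {
            $_.Line, $lineAfter   # output the matching pair
        }
    }
$pairs

Remove-Item -Path $path
```

With this sample, only the case1 pair should be output: case2 is followed by (AFT) ACT, and case3 has no following line at all.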
Another way of doing this is to read the $logFile in as a single string and use a RegEx match to get the parts you want:
$logFile = 'c:\temp\file.txt'
$outFile = 'c:\temp\file2.txt'
# read the content of the logfile as a single string
$content = Get-Content -Path $logFile -Raw
$regex = [regex] '(case\d+\s+\(BEF\)\s+ACT\s+\(AFT\)\s+BLK)'
$match = $regex.Match($content)
($output = while ($match.Success) {
$match.Value
$match = $match.NextMatch()
}) | Set-Content -Path $outFile -Force
When used the result is:
case1 (BEF) ACT
(AFT) BLK
case7 (BEF) ACT
(AFT) BLK
Regex details:
( Match the regular expression below and capture its match into backreference number 1
case Match the characters “case” literally
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\( Match the character “(” literally
BEF Match the characters “BEF” literally
\) Match the character “)” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
ACT Match the characters “ACT” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\( Match the character “(” literally
AFT Match the characters “AFT” literally
\) Match the character “)” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
BLK Match the characters “BLK” literally
)
My other answer completes your own Select-String-based solution attempt. Select-String is versatile, but slow, though it is appropriate for processing files too large to fit into memory as a whole, given that it processes files line by line.
However, PowerShell offers a much faster line-by-line processing alternative: switch -File - see the solution below.
Theo's helpful answer, which reads the entire file into memory first, will probably perform best overall, depending on file size, but it comes at the cost of increased complexity, due to relying heavily on direct use of .NET functionality.
$(
$firstLine = ''
switch -CaseSensitive -Regex -File t.txt {
'\(BEF\) ACT' { $firstLine = $_; continue }
'\(AFT\) BLK' {
# Pair found, output it.
# If you don't want to look for further pairs,
# append `; break` inside the block.
if ($firstLine) { $firstLine, $_ }
# Look for further pairs.
$firstLine = ''; continue
}
default { $firstLine = '' }
}
) # | Set-Content ...
Note: The enclosing $(...) is only needed if you want to send the output directly through the pipeline to a cmdlet such as Set-Content; it is not needed for capturing the output in a variable: $pair = switch ...
-Regex interprets the branch conditionals as regular expressions.
$_ inside a branch's action script block ({ ... }) refers to the line at hand.
The overall approach is:
$firstLine stores the 1st line of interest once found, and when the 2nd line's pattern is found and $firstLine is set (is nonempty), the pair is output.
The default handler resets $firstLine, to ensure that only two consecutive lines that contain the strings of interest are considered.

How to remove special characters from a text file with PowerShell?

I have a text file and have to remove all weird characters from it. I've already tried the following:
(get-content C:\Users\JuanMa\Desktop\UNB\test.txt) -replace ('.','') | out-file C:\Users\JuanMa\Desktop\UNB\test2.txt
But this leads to an empty output - the file test2.txt remains empty.
This is my text file:
.!..p.ÿÿ.!..! .!. PESCATORE
.!. LEMON SPICE S.R.L.
600 SUR DE MULTIPLAZA ESCAZU
3-102-599284
TEL: 2289-8010 FAX: 2289-5129
INFO@PESCATORECR.COM
.!..! Terminal POS: BARRA
.!.
.! ------------FACTURA-----------
.! .!0 Mesa: B07
.!..! NUMERO : 0068371
.!.Mesa # : B07 Fecha: 25/09/2018
Mesero : CARLOS
Cajero : JOHN Hora : 22:35:06
# Pers : 1 Comandas: 1
Apertura: 22:34 Tiempo/E: 1 Min
.! .!..! .! CANT DESCRIPCION MONTOS
.!.---------------------------------------
1.00 LIMONADA HIERBABUE 2,033.00
.! SubTotal : 2,033.00
%IVA : 264.00
%SER : 203.00
.! .!. TOTALES : 2,501.00
.!..! (COLONES)
En Dolares : 4.55
.!.>> Pago: EFECTIVO> 2,555.00
>> Recibe: 2,555.00
>> Cambio: 54.00
.!
www.gruposinertech.com Vers.15.09A
.!.
AUTORIZADO MEDIANTE RESOLUCION
11-97 DE LA D.G.T.D
.i
.#
Thanks for your help!
Try:
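(get-content -Raw C:\Users\JuanMa\Desktop\UNB\test.txt).Replace('.','') | out-file C:\Users\JuanMa\Desktop\UNB\test2.txt
Get-Content returns an array of lines by default, but if you specify -Raw it returns a single string. Note that there must be no space between the method name and the opening parenthesis, and that .Replace() performs a literal (non-regex) replacement, so this removes every period in the file.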
howdy Juan Manuel Sanchez,
the following will trim the unwanted chars from the beginning of each line in the array of lines you get from Get-Content. it acts on each line in the array without needing to iterate thru the array explicitly.
it's VERY fragile since it hard codes the items. also, it removes all the left hand padding spaces.
$GC_Array -creplace '^[.! pÿ]{1,}' -replace '^0 {2,}'
-creplace is the case-sensitive version of replace
^ means start at the beginning of the line
[] is the character set to replace
char list = dot, exclamation point, space, lowercase p, accented y
{1,} means one or more
the 2nd replace targets start-of-line, a zero digit, & two or more spaces
hope that helps,
lee
The -replace operator uses regular expressions, in which a period denotes ANY character, so this strips out everything. If you want to remove literal periods, then prefix the period with a backslash:
(get-content C:\Users\JuanMa\Desktop\UNB\test.txt) -replace ('\.','') | out-file C:\Users\JuanMa\Desktop\UNB\test2.txt
Unfortunately this removes ALL periods, so the periods you may want to keep, e.g. in numbers are lost.
To clean out multiple bad characters, include them in square brackets. This removes 'ÿ','!'
(get-content C:\Users\JuanMa\Desktop\UNB\test.txt) -replace ('[ÿ!]','') | out-file C:\Users\JuanMa\Desktop\UNB\test2.txt
You can chain up these -replace operators to do multiple substitutions:
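# Replace a literal .! at the start of the line with blank
# (done first, and with the period escaped, so that it can still match)
# Then remove any remaining ÿ or ! characters
(get-content C:\Users\JuanMa\Desktop\UNB\test.txt) `
-replace ('^\.!','') `
-replace ('[ÿ!]','') |
out-file C:\Users\JuanMa\Desktop\UNB\test2.txt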
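The difference between these patterns is easy to check on a single string. The line below is modelled on the sample file; the variable names and the anchored pattern in the last step are illustrative assumptions, not a complete cleanup for the whole file:

```powershell
$line = '.!. TOTALES : 2,501.00'   # modelled on a line from the sample file

# Unescaped period: matches every character, so nothing is left.
$allGone = $line -replace '.', ''

# Escaped period: only literal periods go, including the decimal point.
$noPeriods = $line -replace '\.', ''

# Anchored character class: only the leading run of . and ! (plus the
# whitespace after it) is removed, so the decimal point survives.
$prefixOnly = $line -replace '^[.!]+\s*', ''

"[$allGone]", "[$noPeriods]", "[$prefixOnly]"
```

Anchoring the pattern to the start of the line is one way to keep periods that belong to numbers, at the cost of not touching junk characters that appear mid-line.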

replace exception in powershell

I'm a beginner in PowerShell and know C# pretty well. I have this command-line tool, http://www.f2ko.de/programs.php?lang=en&pid=cmd, that downloads files. I want to download all the SGF Go games from this URL, http://www.gogameworld.com/gophp/pg_samplegames.php, and was trying to write a PowerShell script to do it for me. So I wrote this script:
Get-Content test.txt|
ForEach-Object
{
if($_ -eq "=`"javascript:viewdemogame(`'*.sgf`')`" tit")
{
$filename = $_ -replace '=`"javascript:viewdemogame(`''
$filename = $filename -replace '`')`" tit'
&"(Path)/download.exe" ("http://www.gogameworld.com/webclient/qipu/" + $filename)
}
}
However, when I run the script, I keep getting this error:
Unexpected token '`'' in expression or statement.
At (PATH)\test.ps1:7 char:37
+ $filename = $filename -replace '`' <<<< )'
+ CategoryInfo : ParserError: (`':String) [], ParentContainsErrorRecordException
+ FullyQualifiedErrorId : UnexpectedToken
I've looked at the script lots of times and still can't figure out what's wrong. Thanks.
Try this, read the content of the file as one string and then use the Regex.Matches to get all occurrences of the text contained in the parenthesis:
$content = Get-Content test.txt | Out-String
$baseUrl = 'http://www.gogameworld.com/webclient/qipu/'
[regex]::matches($content,"javascript:viewdemogame\('([^\']+)'\)") | Foreach-Object{
$url = '{0}{1}' -f $baseUrl,$_.Groups[1].Value
& "(Path)/download.exe" $url
}
here's an explanation of the regex pattern (created with RegexBuddy):
javascript:viewdemogame\('([^\']+)'\)
Match the characters “javascript:viewdemogame” literally «javascript:viewdemogame»
Match the character “(” literally «\(»
Match the character “'” literally «'»
Match the regular expression below and capture its match into backreference number 1 «([^\']+)»
Match any character that is NOT a ' character «[^\']+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “'” literally «'»
Match the character “)” literally «\)»
Match the character “"” literally «"»
'{0}{1}' is used with the -f operator to create a string. {0} maps to the first value on the right-hand side of the operator (e.g. $baseUrl) and {1} is mapped to the second value. Under the hood, PowerShell is using the .NET String.Format method. You can read more about it here: http://devcentral.f5.com/weblogs/Joe/archive/2008/12/19/powershell-abcs---f-is-for-format-operator.aspx
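The extraction step can be tried without the download tool or the real page. In the sketch below the HTML fragment and the file names game1.sgf and game2.sgf are made up for illustration; only the regex and the base URL come from the answer above:

```powershell
# Hypothetical page fragment; a single-quoted here-string keeps all quotes literal.
$content = @'
<a href="javascript:viewdemogame('game1.sgf')" title="first">
<a href="javascript:viewdemogame('game2.sgf')" title="second">
'@

$baseUrl = 'http://www.gogameworld.com/webclient/qipu/'

# Capture group 1 holds the text between the single quotes.
$urls = [regex]::Matches($content, "javascript:viewdemogame\('([^']+)'\)") |
    ForEach-Object { '{0}{1}' -f $baseUrl, $_.Groups[1].Value }

$urls
```

Building the list of URLs first and passing them to the download tool in a separate step makes it easier to check what would be fetched before anything is downloaded.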
'')" tit'
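The -replace operator takes up to two operands, comma separated. The first is a regular expression that matches what you want replaced. The second is the string you want to replace that with; if you omit it, the matched text is simply removed. Note also that inside single-quoted strings the backtick is not an escape character, so '`' ends the string early, which is what the parser is complaining about; to embed a single quote in a single-quoted string, double it ('').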