PowerShell Select-String output to file

I'm using PowerShell to locate an email address in an .htm file and write it out to a text file.
I'm using select-string which finds the string OK, but writes the line number as well as the email address to the file.
All I want is the email address! It seems simple enough, but I can't crack it.
Here's my code:
$List_htm = Get-ChildItem -Filter *.htm
# Loop:
foreach ($htm in $List_htm)
{
    # Locate recipient email address to send to:
    # Regex pattern to match:
    $pattern = '(^\W*.*@.*\.{1,}\w*$)'
    $sel = select-string -list $htm -pattern $pattern | Select-Object Line
    If ($Sel -eq $null)
    {
        write-host "FAILS - $htm does NOT contain $pattern"
    }
    Else
    {
        write-host "WORKS! $pattern `n$sel"
    }
    Write-host "end"
    $EmailAddressee = $PDFFolder + "EmailAddressee.txt"
    $sel | Out-File $EmailAddressee
}
However emailaddressee.txt looks like this:
Line
----
fred.bloggs@helpmeplease.com
All I want is a single line with the email address in:
fred.bloggs@helpmeplease.com
I could obviously further process this results file in powershell to get this, but I'm hoping someone can come up with a simple one stage result.
Thanks
Ian

Change the following line:
$sel = select-string -list $htm -pattern $pattern | Select-Object Line
To:
$sel = select-string -list $htm -pattern $pattern | Select-Object -ExpandProperty Line
That ensures you write the value of the Line property rather than the formatted text representation of the object itself.
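A minimal sketch of the difference, assuming a throwaway sample file (the file name, its contents and the simplified pattern are made up for illustration):

```powershell
# Hypothetical sample file in the temp folder; not part of the original question.
$tmp = Join-Path ([IO.Path]::GetTempPath()) 'sample-expand.htm'
Set-Content -Path $tmp -Value '<p>hello</p>', 'fred.bloggs@example.com'

# Simplified illustrative pattern, not the one from the question.
$pattern = '\w+@\w+\.\w+'

# Select-Object Line wraps the value in a new object with a single Line property.
$asObject = Select-String -Path $tmp -Pattern $pattern -List | Select-Object Line

# -ExpandProperty returns the property's value itself, here a plain string.
$asString = Select-String -Path $tmp -Pattern $pattern -List |
    Select-Object -ExpandProperty Line

$asObject.GetType().Name   # PSCustomObject
$asString.GetType().Name   # String

Remove-Item $tmp
```

Piping $asString to Out-File writes only the matched line, because there is no object left for the formatter to render as a table with a header.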

Related

Powershell use one liner instead of 3 lines

I have a data file:
$data=gc c:\blabla.cfg
It contains all kinds of lines, and I want to extract values from it. For example, the file contains:
SMTP_SERVER = smtp.bla.local
To create a variable in the script, say $smtp, that holds this value, this is how I do it:
$tmp = $data | Select-string SMTP_SERVER
$smtp=($tmp -split ' ')[2]
It may look kind of dumb, but I have lots of configuration values to take from the file and I'd like to take each one in a one-liner with a pipe.
If I do this:
$smtp = $data | Select-String SMTP_SERVER | $_.Split(' ')[2]
I get an error saying "Expressions are only allowed as the first element of a pipeline".
Use ForEach-Object to run the expression on each match:
Select-String -Path "c:\blabla.cfg" -SimpleMatch "SMTP_SERVER" | ForEach-Object { $_.Line.Split(' ')[2] }
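Since there are many settings to read, another option is to parse the whole file once. This is only a sketch, assuming every non-blank line has the form KEY = value; SMTP_PORT is a hypothetical second setting added for illustration. Note that ConvertFrom-StringData treats backslashes in values as escape characters.

```powershell
# Stand-in for the file contents; with a real file (PSv3+) this could be:
#   $text = Get-Content c:\blabla.cfg -Raw
$text = @'
SMTP_SERVER = smtp.bla.local
SMTP_PORT = 25
'@

# Each "key = value" line becomes one entry in a hashtable.
$config = ConvertFrom-StringData -StringData $text

$smtp = $config['SMTP_SERVER']
$smtp   # smtp.bla.local
```

Every other setting is then a single lookup such as $config['SMTP_PORT'], with no extra pipeline per value.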

Powershell function returning an array instead of string

I'm importing a CSV and would like to add a column to it (with the value based on the existing columns).
My data looks like this:
host address,host prefix,site
10.1.1.0,24,400-01
I would like to add a column called "sub site".
So I wrote this function, but the problem is that the resulting value is an array instead of a string:
function site {
    Param($s)
    $s -match '(\d\d\d)'
    return $Matches[0]
}
$csv = import-csv $file | select-object *,@{Name='Sub Site';expression= {site $_.site}}
if I run the command
PS C:\>$csv[0]
Host Address :10.1.1.0
host prefix :24
site :400-01
sub site : {True,400}
when it should look like
PS C:\>$csv[0]
Host Address :10.1.1.0
host prefix :24
site :400-01
sub site : 400
EDIT: I found the solution but the question is now WHY.
If I change my function to $s -match "\d\d\d" |out-null I get back the expected 400
Good, you found the answer. I was typing this up as you found it. The reason is that -match returns a Boolean value, which is written to the function's output stream; everything written to that stream is "returned" from the function, not just the value passed to return.
For example, run this one line and see what it does:
"Hello" -match 'h'
It prints True.
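A small sketch that makes the effect visible side by side (Get-SiteLeaky and Get-SiteClean are hypothetical names used only for this illustration):

```powershell
# The Boolean result of -match is emitted as function output.
function Get-SiteLeaky {
    param($s)
    $s -match '(\d\d\d)'
    return $Matches[0]
}

# Assigning the result to $null discards it, so only the match is emitted.
function Get-SiteClean {
    param($s)
    $null = $s -match '(\d\d\d)'
    return $Matches[0]
}

$leaky = Get-SiteLeaky '400-01'
$clean = Get-SiteClean '400-01'

$leaky.Count   # 2, the array holds True and 400
$clean         # 400
```

Casting to [void] or piping to Out-Null, as in the question's edit, discards the value in the same way.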
Since I had this typed up, here is another way to phrase your question with the fix...
function site {
    Param($s)
    $null = $s -match '(\d\d\d)'
    $ret = $Matches[0]
    return $ret
}
$csv = @"
host address,host prefix,site
10.1.1.1,24,400-01
10.1.1.2,24,500-02
10.1.1.3,24,600-03
"@
$data = $csv | ConvertFrom-Csv
'1 =============='
$data | ft -AutoSize
$data2 = $data | select-object *,@{Name='Sub Site';expression= {site $_.site}}
'2 =============='
$data2 | ft -AutoSize

Count number of comments over multiple files, including multi-line comments

I'm trying to write a script that counts all comments in multiple files, including both single line (//) and multi-line (/* */) comments and prints out the total. So, the following file would return 4
// Foo
var text = "hello world";
/*
Bar
*/
alert(text);
There's a requirement to include specific file types and exclude certain file types and folders, which I already have working in my code.
My current code is:
( gci -include *.cs,*.aspx,*.js,*.css,*.master,*.html -exclude *.designer.cs,jquery* -recurse `
| ? { $_.FullName -inotmatch '\\obj' } `
| ? { $_.FullName -inotmatch '\\packages' } `
| ? { $_.FullName -inotmatch '\\release' } `
| ? { $_.FullName -inotmatch '\\debug' } `
| ? { $_.FullName -inotmatch '\\plugin-.*' } `
| select-string "^\s*//" `
).Count
How do I change this to get multi-line comments as well?
UPDATE: My final solution (slightly more robust than what I was asking for) is as follows:
$CodeFiles = Get-ChildItem -include *.cs,*.aspx,*.js,*.css,*.master,*.html -exclude *.designer.cs,jquery* -recurse |
    Where-Object { $_.FullName -notmatch '\\(obj|packages|release|debug|plugin-.*)\\' }
$TotalFiles = $CodeFiles.Count
$IndividualResults = @()
$CommentLines = ($CodeFiles | ForEach-Object{
    #Get the comments via regex
    $Comments = ([regex]::matches(
        [IO.File]::ReadAllText($_.FullName),
        '(?sm)^[ \t]*(//[^\n]*|/[*].*?[*]/)'
    ).Value -split '\r?\n') | Where-Object { $_.length -gt 0 }
    #Get the total lines
    $Total = ($_ | select-string .).Count
    #Add to the results table
    $IndividualResults += @{
        File = $_.FullName | Resolve-Path -Relative;
        Comments = $Comments.Count;
        Code = ($Total - $Comments.Count)
        Total = $Total
    }
    Write-Output $Comments
}).Count
$TotalLines = ($CodeFiles | select-string .).Count
$TotalResults = New-Object PSObject -Property @{
    Files = $TotalFiles
    Code = $TotalLines - $CommentLines
    Comments = $CommentLines
    Total = $TotalLines
}
Write-Output (Get-Location)
Write-Output $IndividualResults | % { new-object PSObject -Property $_} | Format-Table File,Code,Comments,Total
Write-Output $TotalResults | Format-Table Files,Code,Comments,Total
To be clear: Using string matching / regular expressions is not a fully robust way to detect comments in JavaScript / C# code, because there can be false positives (e.g., var s = "/* hi */";); for robust parsing you'd need a language parser.
If that is not a concern, and it is sufficient to detect comments (that start) on their own line, optionally preceded by whitespace, here's a concise solution (PSv3+):
(Get-ChildItem -include *.cs,*.aspx,*.js,*.css,*.master,*.html -exclude *.designer.cs,jquery* -recurse |
    Where-Object { $_.FullName -notmatch '\\(obj|packages|release|debug|plugin-.*)' } |
    ForEach-Object {
        [regex]::matches(
            [IO.File]::ReadAllText($_.FullName),
            '(?sm)^[ \t]*(//[^\n]*|/[*].*?[*]/)'
        ).Value -split '\r?\n'
    }
).Count
With the sample input, the ForEach-Object command yields 4.
Remove the ^[ \t]* part to match comments starting anywhere on a line.
The solution reads each input file as a single string with [IO.File]::ReadAllText() and then uses the [regex]::Matches() method to extract all (potentially line-spanning) comments.
Note: You could use Get-Content -Raw instead to read the file as a single string, but that is much slower, especially when processing multiple files.
The regex uses in-line options s and m ((?sm)) to respectively make . match newlines too and to make anchors ^ and $ match line-individually.
^[ \t]* matches any mix of spaces and tabs, if any, at the start of a line.
//[^\n]* matches a comment that starts with // and runs through the end of the line.
/[*].*?[*]/ matches a block comment across multiple lines; note the lazy quantifier, *?, which ensures that the very next instance of the closing */ delimiter is matched.
The matched comments (.Value) are then split into individual lines (-split '\r?\n'), which are output.
The resulting lines across all files are then counted (.Count).
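To see the regex at work without touching the file system, here is a sketch that applies it to the sample from the question held in a here-string (an in-memory stand-in for one file's contents):

```powershell
# The sample file from the question, as a single multi-line string.
$sample = @'
// Foo
var text = "hello world";
/*
Bar
*/
alert(text);
'@

# Extract every comment, then split block comments into their individual lines.
$commentLines = [regex]::Matches(
    $sample,
    '(?sm)^[ \t]*(//[^\n]*|/[*].*?[*]/)'
).Value -split '\r?\n'

$commentLines.Count   # 4
```

The regex produces two matches (the // line and the three-line block comment); the split turns them into four lines.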
As for what you tried:
The fundamental problem with your approach is that Select-String with file-info object input (such as provided by Get-ChildItem) invariably processes the input files line by line.
While this could be remedied by calling Select-String inside a ForEach-Object script block in which you pass each file's content as a single string to Select-String, direct use of the underlying regex .NET types, as shown above, is more efficient.
An IMO better approach is to count net code lines by removing single-line and multi-line comments.
To start, here is a script that handles a single file; for your sample.cs above it returns 5:
((Get-Content sample.cs -raw) -replace "(?sm)^\s*\/\/.*?$" `
-replace "(?sm)\/\*.*?\*\/.*`n" | Measure-Object -Line).Lines
EDIT: without removing empty lines, build the difference from total lines
## Q:\Test\2018\10\31\SO_53092258.ps1
$Data = Get-ChildItem *.cs | ForEach-Object {
    $Content = Get-Content $_.FullName -Raw
    $TotalLines = (Measure-Object -Input $Content -Line).Lines
    $CodeLines = ($Content -replace "(?sm)^\s*\/\/.*?$" `
        -replace "(?sm)\/\*.*?\*\/.*`n" | Measure-Object -Line).Lines
    $Comments = $TotalLines - $CodeLines
    [PSCustomObject]@{
        File = $_.FullName
        Lines = $TotalLines
        Comments= $Comments
    }
}
$Data
"="*40
"TotalLines={0} TotalCommentLines={1}" -f (
    $Data | Measure-Object -Property Lines,Comments -Sum).Sum
Sample output:
> Q:\Test\2018\10\31\SO_53092258.ps1
File                          Lines Comments
----                          ----- --------
Q:\Test\2018\10\31\example.cs    10        5
Q:\Test\2018\10\31\sample.cs      9        4
============================================
TotalLines=19 TotalCommentLines=9

powershell find string in csv get associated cell

Alright, I have a CSV that I import into the variable $csv:
name  description  system redundant
----  -----------  ------ ---------
hi    don't settle sight  dumb
hello why not      settle settle
this  just fails   why?   settle
I want to find a specific string in either $csv.description or $csv.system. If that string is found, I want to return the associated cell value under $csv.name.
Select-String must not look at anything in $csv.redundant.
This is what I have so far:
$csv = import-csv -path c:\hi
$find = $csv | select-string "settle"
Output of $find:
@{name=hi; description=don't settle; system=sight; redundant=dumb}
@{name=hello; description=why not; system=settle; redundant=settle}
@{name=this; description=just fails; system=why?; redundant=settle}
However, nothing is returned if I do $find.name, even though $find.GetType() shows that this is an array. Also, I don't know how to get Select-String to avoid $csv.redundant.
I need the output to be only the $find.name of the first 2 objects from the array.
Thanks
Don't use Select-String, use Where-Object instead:
$searchTerm = 'settle'
$csv |Where-Object {$_.description -match $searchTerm -or $_.system -match $searchTerm} |Select-Object -Expand Name
ipcsv C:\temp\test.csv | ? {$_.description, $_.system -like "*settle*"} | select Name
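A self-contained sketch of the Where-Object approach, using the question's sample rows as in-memory data instead of a file (a comma-separated layout is assumed here):

```powershell
# The sample data from the question, converted in memory.
$csv = @'
name,description,system,redundant
hi,don't settle,sight,dumb
hello,why not,settle,settle
this,just fails,why?,settle
'@ | ConvertFrom-Csv

$searchTerm = 'settle'

# Only description and system are tested, so the redundant column is ignored.
$names = $csv |
    Where-Object { $_.description -match $searchTerm -or $_.system -match $searchTerm } |
    Select-Object -ExpandProperty name

$names   # hi, hello
```

The third row is skipped because "settle" appears only in its redundant column. Because -match takes a regular expression, a search term containing regex metacharacters would need [regex]::Escape first.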

Powershell import-csv with empty headers

I'm using PowerShell to import a TAB-separated file with headers. The generated file has a few empty strings "" at the end of the first line of headers. PowerShell fails with an error:
"Cannot process argument because the
value of argument "name" is invalid.
Change the value of the "name"
argument and run the operation again"
because each header requires a name.
I'm wondering if anyone has any ideas on how to manipulate the file to either remove the double quotes or enumerate them with a "1" "2" "3" ... "10" etc.
Ideally I would like to not modify my original file. I was thinking something like this
$fileContents = Get-Content -Path $tsvFileName
$firstLine = $fileContents[0].ToString().Replace("`t`"`"","")
$fileContents[0] = $firstLine
Import-Csv $fileContents -Delimiter "`t"
But Import-Csv is expecting $fileContents to be a path. Can I get it to use Content as a source?
You can either provide your own headers and ignore the first line of the CSV, or you can use ConvertFrom-Csv at the end, as Keith says.
ps> import-csv -Header a,b,c,d,e foo.csv
Now the invalid header line in the file is just a data row that you can skip.
-Oisin
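A minimal sketch of this approach, assuming a small throwaway file whose header line ends in an empty name (the file name, contents and header names a, b, c are made up for illustration):

```powershell
# Hypothetical file with an unnamed third column in its header line.
$tmp = Join-Path ([IO.Path]::GetTempPath()) 'headers-demo.csv'
Set-Content -Path $tmp -Value 'x,y,', '1,2,', '3,4,'

# With -Header, the file's own header line is read as an ordinary data row,
# so Select-Object -Skip 1 drops it again.
$rows = Import-Csv -Path $tmp -Header a, b, c | Select-Object -Skip 1

$rows.Count   # 2
$rows[0].a    # 1

Remove-Item $tmp
```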
If you want to work with strings instead use ConvertFrom-Csv e.g.:
PS> 'FName,LName','John,Doe','Jane,Doe' | ConvertFrom-Csv | Format-Table -Auto
FName LName
----- -----
John  Doe
Jane  Doe
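Applied to the question, this means the header line can be repaired in memory and the lines piped straight to ConvertFrom-Csv, leaving the original file untouched. A sketch with made-up sample lines (in real use they would come from Get-Content -Path $tsvFileName):

```powershell
$tab = "`t"

# Hypothetical tab-separated content whose header ends in two empty "" names.
$header = 'Name', 'Age', '""', '""' -join $tab
$lines  = $header, "John${tab}42", "Jane${tab}37"

# Strip every tab-plus-empty-quotes pair from the header line only.
$lines[0] = $lines[0].Replace($tab + '""', '')

# ConvertFrom-Csv parses strings from the pipeline; no file path is needed.
$rows = $lines | ConvertFrom-Csv -Delimiter $tab

$rows[0].Name   # John
$rows[1].Age    # 37
```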
I ended up needing to handle multiple instances of this issue. Rather than using -Header and manually setting up each import, I wrote a more generic method that applies to all of them. I strip every `t"" instance from the first line, save the result to a file named $tsvFileName + _bak, and import that file instead.
$fileContents = Get-Content -Path $tsvFileName
if( ([string]$fileContents[0]).Contains('""') )
{
    # Double-quoted so that `t expands to a tab; `" is a literal double quote
    [string]$fixedFirstLine = $fileContents[0].ToString().Replace("`t`"`"","")
    $fileContents[0] = $fixedFirstLine
    $tsvFileName = [string]::Format("{0}_bak",$tsvFileName)
    $fileContents | Out-File -FilePath $tsvFileName
}
Import-Csv $tsvFileName -Delimiter "`t"
My solution if you have many columns:
$index=0
$ColumnsName=(Get-Content "C:\temp\yourCSCFile.csv" | select -First 1) -split ";" | %{
    $index++
    "Header_{0:d5}" -f $index
}
import-csv "C:\temp\yourCSCFile.csv" -Header $ColumnsName -Delimiter ";"