Replace method is not working on my string variables - powershell

I wrote a script to extract the URL and Revision Number from svn info command of a svn repository and save the result in a .txt file.
The $revision and $url are both strings, so the replace method should work on them but it doesn't. Is there possibly something wrong in my code causing this?
$TheFilePath = "C:\Users\MyPC\REPOSITORY\NewProject\OUTPUT.txt"
echo "#- Automatic Package Update `n----------"| Out-File -FilePath $TheFilePath
$url = svn info C:\Users\MyPC\REPOSITORY\NewProject\trunk | Select-String -Pattern 'URL' -CaseSensitive -SimpleMatch | select-object -First 1
$url | Add-Content -path $TheFilePath
$revision = svn info C:\Users\MyPC\REPOSITORY\NewProject\trunk | Select-String -Pattern 'Revision' -CaseSensitive -SimpleMatch | $revision.Replace('Revision','srcrev')
$revision | Add-Content -path $TheFilePath
Here is the output of svn info (irrelevant output has been omitted):
Path: .
Working Copy Root Path: C:\Users\MyPC\REPOSITORY\NewProject\trunk
URL: https://svn.mycompany.de/svn/NewProject/trunk
Relative URL: ^/trunk
Repository Root: https://svn.mycompany.de/svn/NewProject
Revision: 5884
And here is what I get inside the .txt file when running the code:
#- Automatic Package Update
----------
URL: https://svn.mycompany.de/svn/NewProject/trunk
Revision: 5884
----------

Looking at the example output of svn info here and the example you just provided, you should be able to get the info you need more easily with ConvertFrom-StringData than with Select-String.
In PowerShell < 7.x you can use ConvertFrom-StringData on the output of svn info after changing the colon (:) delimiter into an equal sign (=) to get a Hashtable with all properties and values.
Then, using calculated properties you can extract the items you're interested in and save as CSV file for instance like this:
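$svnInfo = svn info 'C:\Users\MyPC\REPOSITORY\NewProject\trunk'
# Double the backslashes (ConvertFrom-StringData treats '\' as an escape character),
# turn the first colon on each line into '=', and join the lines so a single Hashtable is returned.
$result = ($svnInfo -replace '\\', '\\' -replace '(?<!:.*):', '=') -join "`n" | ConvertFrom-StringData |
    Select-Object @{Name = 'URL'; Expression = {$_['URL']}},
                  @{Name = 'srcrev'; Expression = {$_['Revision']}}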
# output on screen
$result | Format-Table -AutoSize
# output to CSV file
$result | Export-Csv -Path 'C:\Users\MyPC\REPOSITORIES\NewProject\OUTPUT.csv' -NoTypeInformation
Regex details on the -replace to replace only the first occurrence of the colon:
(?<! Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind)
: Match the character “:” literally
. Match any single character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
: Match the character “:” literally
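To see the lookbehind at work, here is a small sketch on a single sample line taken from the svn info output in the question (the variable names are only for illustration). Only the first colon is replaced; the one inside https:// is left alone because a colon already precedes it:

```powershell
# Sample line copied from the svn info output shown in the question.
$line = 'URL: https://svn.mycompany.de/svn/NewProject/trunk'

# Replace a colon only when no other colon appears anywhere before it on the line.
$converted = $line -replace '(?<!:.*):', '='

$converted   # URL= https://svn.mycompany.de/svn/NewProject/trunk
```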
If you're using PowerShell 7 or higher, things get easier because then you have an extra -Delimiter parameter, so the colon no longer needs to be replaced:
$svnInfo = svn info 'C:\Users\MyPC\REPOSITORY\NewProject\trunk'
# Only the backslashes still need doubling; -Delimiter splits each line at its first colon.
$result = ($svnInfo -replace '\\', '\\') -join "`n" | ConvertFrom-StringData -Delimiter ':' |
    Select-Object @{Name = 'URL'; Expression = {$_['URL']}},
                  @{Name = 'srcrev'; Expression = {$_['Revision']}}
# output on screen
$result | Format-Table -AutoSize
# output to CSV file
$result | Export-Csv -Path 'C:\Users\MyPC\REPOSITORIES\NewProject\OUTPUT.csv' -NoTypeInformation

$revision doesn't exist until after the svn command is done.
Use the ForEach-Object cmdlet and refer to the current match as $_ to modify the output object inline - the matched line in the output from Select-String is stored in a property called Line:
$revision = svn info C:\Users\MyPC\REPOSITORY\NewProject\trunk |Select-String -Pattern 'Revision' -CaseSensitive -SimpleMatch |ForEach-Object { $_.Line.Replace('Revision', 'srcrev') }
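As a quick way to try this without a working copy at hand, here is a minimal sketch that feeds two sample lines (copied from the question's svn info output) through the same pipeline. Select-String emits MatchInfo objects rather than plain strings, which is why the text has to be reached through .Line before calling .Replace():

```powershell
# Sample lines standing in for real `svn info` output (assumption: taken from the question).
$sampleInfo = @(
    'URL: https://svn.mycompany.de/svn/NewProject/trunk'
    'Revision: 5884'
)

# Select-String returns MatchInfo objects; .Line holds the matched text as a [string].
$revision = $sampleInfo |
    Select-String -Pattern 'Revision' -CaseSensitive -SimpleMatch |
    ForEach-Object { $_.Line.Replace('Revision', 'srcrev') }

$revision   # srcrev: 5884
```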

Related

Multiple Select Strings in a for loop to separate files

I wrote this script to search a lot of text files (~100,000) for 4 different search criteria and export the results to 4 separate files. I thought it would be more efficient to perform all 4 searches on each file as it is loaded, rather than doing 4 full searches like the first iteration below does. I may be missing some other major inefficiencies as I am pretty new to powershell.
I have rewritten this script from the first version to the second, but can't figure out how to get the path and data to display together like the first version did. I am struggling to reference the object within the loop, and have pieced this second version together, which is working, but not giving me the path to the file, which is necessary.
It seems like I am just missing one or two little things to get me going in the right direction. Thanks in advance for your help
1st version:
Get-ChildItem -Filter *.txt -Path "\\file\to\search" -Recurse | Select-String -Pattern "abc123" -Context 0,3 | Out-File -FilePath "\\c:\out.txt"
Get-ChildItem -Filter *.txt -Path "\\file\to\search2" -Recurse | Select-String -Pattern "abc124" -Context 0,3 | Out-File -FilePath "\\c:\out2.txt"
Get-ChildItem -Filter *.txt -Path "\\file\to\search3" -Recurse | Select-String -Pattern "abc125" -Context 0,3 | Out-File -FilePath "\\c:\out3.txt"
Get-ChildItem -Filter *.txt -Path "\\file\to\search4" -Recurse | Select-String -Pattern "abc126" -Context 0,3 | Out-File -FilePath "\\c:\out4.txt"
Output:
\\file\that\was\found\example.txt:84: abc123
\\file\that\was\found\example.txt:90: abc123
\\file\that\was\found\example.txt:91: abc123
2nd version:
##$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ Configuration $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
############################################ Global Parameters #############################################
$SearchPath="\\file\to\search"
$ProgressFile="\\progress\file\ResultsCount.txt"
$records = 105325
##----------------------------------------- End Global Parameters -----------------------------------------
########################################### Search Parameters ##############################################
##Search Pattern 1
$Pattern1="abc123"
$SaveFile1="\\c:\out.txt"
##Search Pattern 2
$Pattern2="abc124"
$SaveFile2="\\c:\out2.txt"
##Search Pattern 3
$Pattern3= "abc125"
$SaveFile3= "\\c:\out3.txt"
##Search Pattern 4
$Pattern4= "abc126"
$SaveFile4="\\c:\out4.txt"
##Search Pattern 5
$Pattern5= ""
$SaveFile5=""
##----------------------------------------- End Search Parameters ------------------------------------------
##$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ End of Config $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
############################### SCRIPT #####################################################################
## NOTES
## ------
##$files=Get-ChildItem -Filter *.txt -Path $SearchPath -Recurse ## Set all files to variable #### Long running, needs to be a better way #######
##$records=$files.count ## Set record #
Get-ChildItem -Filter *.txt -Path $SearchPath -Recurse | Foreach-Object { ## loop through search folder
$i=$i+1 ## increment record
##
Get-Content $_.FullName | Select-String -Pattern $Pattern1 -Context 0,3 | Out-File -FilePath $SaveFile1 ## pattern1 search
Get-Content $_.FullName | Select-String -Pattern $Pattern2 | Out-File -FilePath $SaveFile2 ## pattern2 search
Get-Content $_.FullName | Select-String -Pattern $Pattern3 -Context 0,1 | Out-File -FilePath $SaveFile3 ## pattern3 search
Get-Content $_.FullName | Select-String -Pattern $Pattern4 -Context 0,1 | Out-File -FilePath $SaveFile4 ## pattern4 search
##Get-Content $_.FullName | Select-String -Pattern $Pattern5 -Context 0,1 | Out-File -FilePath $SaveFile5 ## pattern5 search (Comment out unneeded search lines like this one)
$progress ="Record $($i) of $($records)" ## set progress
Write-Host "Record $($i) of $($records)" ## Writes progress to window
$progress | Out-File -FilePath $ProgressFile ## progress file
} ##
############################################################################################################
Output:
abc123
abc123
abc123
Edit: Also I am trying to figure out a good way to not have to hard code in the number of records for a decent progress readout, I commented out the way I thought would work (1st & 2nd line of the script), but there needs to be a more efficient way than rerunning the same search twice, one for a count and one for the for loop.
I would be very interested in any runtime efficiency information you could provide.
[edit - thanks to mklement0 for pointing out the errors about speed and the -SimpleMatch switch. [grin]]
the Select-String cmdlet will accept a -Path parameter ... and it is FAR [i was thinking of Get-Content, not Get-ChildItem] faster than using Get-ChildItem to feed the files to S-S. [grin]
also, the -Pattern parameter accepts a regex OR pattern like Thing|OtherThing|YetAnotherThing - and it accepts simple string patterns if you use the -SimpleMatch switch parameter.
what the code does ...
defines the source dir
defines the file spec
joins those two into a wildcard file path
builds an array of string patterns to use
calls Select-String with a path and an array of strings to search for
uses Group-Object and a calculated property to group the matches by the last part of .Line property from the S-S call
saves that to a $Var
shows that on screen
at that point, you can use the .Name property of each GroupInfo to select the items to send out to each file AND to build your file names.
the code ...
$SourceDir = 'D:\Temp\zzz - Copy'
$FileSpec = '*.log'
$SD_FileSpec = Join-Path -Path $SourceDir -ChildPath $FileSpec
$TargetPatternList = @(
'Accordion Cajun Zydeco'
'better-not-be-there'
'Piano Rockabilly Rowdy'
)
$GO_Results = Select-String -Path $SD_FileSpec -SimpleMatch $TargetPatternList |
Group-Object -Property {$_.Line.Split(':')[-1]}
$GO_Results
output ...
Count Name Group
----- ---- -----
6 Accordion Cajun Zydeco {D:\Temp\zzz - Copy\Grouping-List_08-02.log:11:Accordion Cajun Zydeco, D:\Temp\zzz - Copy\Grouping-List_08-09.log:11:Accordion Cajun Zy...
6 Bawdy Dupe Piano Rocka... {D:\Temp\zzz - Copy\Grouping-List_08-02.log:108:Bawdy Dupe Piano Rockabilly Rowdy, D:\Temp\zzz - Copy\Grouping-List_08-09.log:108:Bawdy...
6 Bawdy Piano Rockabilly... {D:\Temp\zzz - Copy\Grouping-List_08-02.log:138:Bawdy Piano Rockabilly Rowdy, D:\Temp\zzz - Copy\Grouping-List_08-09.log:138:Bawdy Pian...
6 Dupe Piano Rockabilly ... {D:\Temp\zzz - Copy\Grouping-List_08-02.log:948:Dupe Piano Rockabilly Rowdy, D:\Temp\zzz - Copy\Grouping-List_08-09.log:948:Dupe Piano ...
6 Instrumental Piano Roc... {D:\Temp\zzz - Copy\Grouping-List_08-02.log:1563:Instrumental Piano Rockabilly Rowdy, D:\Temp\zzz - Copy\Grouping-List_08-09.log:1563:I...
6 Piano Rockabilly Rowdy {D:\Temp\zzz - Copy\Grouping-List_08-02.log:1781:Piano Rockabilly Rowdy, D:\Temp\zzz - Copy\Grouping-List_08-09.log:1781:Piano Rockabil...
note that the .Group contains an array of lines from the matches sent out by the S-S call. you can send that to your output file.
Here is my take at solving this problem, very similar to Lee_Dailey's nice answer but with a foreach loop. I would recommend investing some time into researching the multi-threading options available on PowerShell in case you need to increase the performance of the script, you can look specifically at the ThreadJob module by Microsoft which is really easy to use or if you can't install modules due to some work policy, you can use Runspace.
It is worth adding that you can use the -List switch on Select-String, this way the performance of the script would be increased even more:
-List
Only the first instance of matching text is returned from each input file. This is the most efficient way to retrieve a list of files that have contents matching the regular expression.
$map = @{
abc123 = 'C:\out_abc123.txt'
abc124 = 'C:\out_abc124.txt'
abc125 = 'C:\out_abc125.txt'
}
$pattern = $map.Keys -join '|'
$match = foreach($file in Get-ChildItem *.txt)
{
Select-String -LiteralPath $file.FullName -Pattern $pattern
}
$match | Group-Object { $_.Matches.Value } | ForEach-Object {
$_.Group | Select-Object Path, LineNumber, Line | Out-File $map[$_.Name]
}
To complement the answers from @Santiago Squarzon and Lee_Dailey: I think you were actually on the right track yourself, given that the Group-Object cmdlet is pretty expensive, especially in memory usage, as it chokes the PowerShell pipeline, causing all the search results to pile up in memory.
Besides, the Select-String cmdlet supports multiple (-SimpleMatch) patterns, whereas concatenating the search patterns with an | (-join '|') forces you to use an (escaped) regular expression.
To continue on your approach:
(note that in the example, I am using my own settings to search through my script files)
$ProgressFile = '.\ResultsCount.txt'
$SearchRoot = '..\'
$Filter = '*.ps1'
$Searches = @{
'Null' = '.\Null.txt'
'Test' = '.\Test.txt'
'Object' = '.\Object.txt'
}
$Files = Get-ChildItem -Filter $Filter -Path $SearchRoot -Recurse
$Total = $Files.count
$Searches.Values |ForEach-Object { Set-Content -LiteralPath $_ -Value '' }
$i = 0
ForEach ($File in $Files) {
Get-Content -LiteralPath $File.FullName |
Select-String @($Searches.Keys) -AllMatches |ForEach-Object {
$Value = '{0}:{1}:{2}' -f $File.FullName, $_.LineNumber, $_
Add-Content -LiteralPath $Searches[$_.Pattern] -Value $Value
}
'Record {0} of {1}' -f ++$i, $Total |Tee-Object -Append .\ProgressFile.txt
}
Explanations
$Searches = @{ ...
Maps the search patterns to the files. You might also use a PSObject list to specify each search (where you could add columns with e.g. context start/end values, etc.)
$Searches.Values |ForEach-Object { Set-Content -LiteralPath $_ -Value '' }
Empties the result files (knowing that they are not part of the main stream you can't use Add-Content)
$i = 0
Unfortunately there is no automatic index that initializes with a foreach loop (yet, see: #13772 Automatic variable for the pipeline index)
Get-Content -LiteralPath $File.FullName
Load the content once into memory
Note1: this is a string array.
Note2: the $Content will be reused each iteration and therefore overwrites the previous one and unloads it from memory
Select-String @($Searches.Keys) -AllMatches |ForEach-Object {
Searches the string array using your (multiple) defined patterns. (You might consider using the -SimpleMatch parameter if your search strings contain special characters.)
Note: Unfortunately you need to embed the $Searches.Keys in an array subexpression operator @( ), for details see .Net issue: #56835 Make OrderedDictionaryKeyValueCollection implement IList
$Value = '{0}:{1}:{2}' -f $File.FullName, $_.LineNumber, $_
Builds a result output string.
Note: the result of the Select-String does have a (hidden) LineNumber and (matched) Pattern property.
Add-Content -LiteralPath $Searches[$_.Pattern] -Value $Value
Add the result string to the specific mapped output file.
'Record {0} of {1}' -f ++$i, $Total |Tee-Object -Append .\ProgressFile.txt
Tee-Object will write the progress to the standard output (display) and also to the specific file.
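To make the role of the Pattern property more concrete, here is a minimal sketch that runs on in-memory strings instead of files. The sample lines, the $routes table and its file names are made up for illustration; the point is only that each match carries the pattern that produced it, which can then be used as a lookup key for the output file:

```powershell
# Hypothetical sample input and pattern-to-file mapping (not from the original script).
$lines  = 'abc123 first', 'nothing here', 'abc124 second'
$routes = @{ abc123 = 'out.txt'; abc124 = 'out2.txt' }

# One Select-String call with several literal patterns.
$hits = $lines | Select-String -Pattern @($routes.Keys) -SimpleMatch

# Each MatchInfo remembers which pattern matched, so it can select the target file.
$hits | ForEach-Object { '{0} <- {1}' -f $routes[$_.Pattern], $_.Line }
```

In a real script the last line would call Add-Content with the looked-up path rather than just printing it.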

Powershell Replace Regex Import CSV File

I have a CSV file named test.csv (C:\testing\test.csv) in this format:
File Name,Location,Added (GMT),Created (GMT),Last Modified (GMT),File Size (Bytes),File Size,Extension,Incident Type
10-MB-Test (1).docx,\\blah\Test 3,10/8/2020 21:13,10/8/2020 19:33,10/8/2020 16:26,10723331,10.23 (MB),docx,low_data_discover
10-MB-Test (1).xlsx,\\blah2\Test 3\,10/8/2020 21:14,10/8/2020 19:33,10/8/2020 16:25,9566567,9.12 (MB),xlsx,high_data_discover
1-MB-Test.docx,\\blah3\Test 3\,10/8/2020 21:13,10/8/2020 19:33,10/8/2020 16:37,1045970,1021.46 (KB),docx,medium_data_discover
I'm trying to replace trailing "\" characters (if they exist) for values in the Location column with nothing using this Powershell code:
$file1 = import-csv -path "C:\testing\test.csv" | % {$_."Location" -replace "\\$",""} | Select-Object * | export-csv -NoTypeInformation "C:\testing\blah.csv"
However, when I run the code, the only output I get is a column named "Length" with a numerical value. Can you assist?
You're only sending the new string (updated location) down the pipeline. You can update each location and then export it at the end.
$file1 = import-csv -path "C:\testing\test.csv"
$file1 | ForEach-Object {$_.location = $_.location -replace '\\$'}
$file1 | export-csv -NoTypeInformation "C:\testing\blah.csv"
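The "Length" column in the original output appears because the pipeline was carrying plain strings, and Length is the only property a string exposes to Export-Csv. Here is a minimal sketch of the fix on in-memory data (two shortened sample rows based on the question's file), so it can be tried without touching any files:

```powershell
# Shortened sample rows based on the CSV in the question (only two columns kept).
$rows = @'
File Name,Location
10-MB-Test (1).docx,\\blah\Test 3
10-MB-Test (1).xlsx,\\blah2\Test 3\
'@ | ConvertFrom-Csv

# Update the property in place; the objects (not just the strings) stay in $rows.
$rows | ForEach-Object { $_.Location = $_.Location -replace '\\$' }

# Show the result as CSV text; Export-Csv would write the same content to a file.
$rows | ConvertTo-Csv -NoTypeInformation
```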

create file index manually using powershell, tab delimited

Sorry in advance for the probably trivial question, I'm a powershell noob, please bear with me and give me advice on how to get better.
I want to achieve a file index index.txt that contains the list of all files in current dir and subdirs in this format:
./dir1/file1.txt 07.05.2020 16:16 1959281
where
dirs listed are relative (i.e. this will be run remotely and to save space, the relative path is good enough)
the delimiter is a tab \t
the date format is day.month.fullyear hours:minutes:seconds, last written (this is the case for me, but I'm guessing this would be different on system setting and should be enforced)
(the last number is the size in bytes)
I almost get there using this command in powershell (maybe that's useful to someone else as well):
get-childitem . -recurse | select fullname,LastWriteTime,Length | Out-File index.txt
with this result
FullName LastWriteTime Length
-------- ------------- ------
C:\Users\user1\Downloads\test\asdf.txt 07.05.2020 16:19:29 1490
C:\Users\user1\Downloads\test\dirtree.txt 07.05.2020 16:08:44 0
C:\Users\user1\Downloads\test\index.txt 07.05.2020 16:29:01 0
C:\Users\user1\Downloads\test\test.txt 07.05.2020 16:01:23 814
C:\Users\user1\Downloads\test\text2.txt 07.05.2020 15:55:45 1346
So the questions that remain are: How to...
get rid of the headers?
enforce this date format?
tab delimit everything?
get control of what newline character is used (\n or \r or both)?
Another approach could be this:
$StartDirectory = Get-Location
Get-ChildItem -Path $StartDirectory -recurse |
Select-Object -Property @{Name='RelPath';Expression={$_.FullName.toString() -replace [REGEX]::Escape($StartDirectory.ToString()),'.'}},
@{Name='LastWriteTime';Expression={$_.LastWriteTime.toString('dd.MM.yyyy HH:mm:ss')}},
Length |
Export-Csv -Path Result.csv -NoTypeInformation -Delimiter "`t"
I recommend using proper CSV files if you have structured data like this. The resulting CSV file will be saved in the current working directory.
If the path you are running this from is NOT the current script path, do:
$path = 'D:\Downloads' # 'X:\SomeFolder\SomeWhere'
Set-Location $path
first.
Next, this ought to do it:
Get-ChildItem . -Recurse -File | ForEach-Object {
"{0}`t{1:dd.MM.yyyy HH:mm}`t{2}" -f ($_ | Resolve-Path -Relative), $_.LastWriteTime, $_.Length
} | Out-File 'index.txt'
On Windows the newline will be \r\n (CRLF)
If you want control over that, this should do:
$newline = "`n" # for example
# capture the lines as string array in variable $lines
$lines = Get-ChildItem . -Recurse -File | ForEach-Object {
"{0}`t{1:dd.MM.yyyy HH:mm}`t{2}" -f ($_ | Resolve-Path -Relative), $_.LastWriteTime, $_.Length
}
# join the array with the chosen newline and save to file
$lines -join $newline | Out-File 'index.txt' -NoNewline
Because your requirement is to NOT have column headers in the output file, I'm using Out-File here instead of Export-Csv
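If the date format must not depend on the system's regional settings, one option is to format the timestamp explicitly with the invariant culture before building the line. This is only a sketch with a made-up path, date and size, and it includes the seconds mentioned in the question:

```powershell
# Hypothetical sample values standing in for a real file's properties.
$stamp = [datetime]::new(2020, 5, 7, 16, 16, 0)

# Format with the invariant culture so ':' is always used as the time separator.
$when = $stamp.ToString('dd.MM.yyyy HH:mm:ss', [cultureinfo]::InvariantCulture)

# Tab-delimited line: relative path, timestamp, size in bytes.
$line = "{0}`t{1}`t{2}" -f './dir1/file1.txt', $when, 1959281
$line
```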

Powershell Remove spaces in the header only of a csv

The first line of the CSV looks like this (there are spaces after Path as well):
author ,Revision ,Date ,SVNFolder ,Rev,Status,Path
I am trying to remove only the spaces; the rest of the content should stay the same:
author,Revision,Date,SVNFolder,Rev,Status,Path
I tried below
Import-CSV .\script.csv | ForEach-Object {$_.Trimend()}
expanding on the comment with an example since it looks like you may be new:
$text = get-content .\script.csv
$text[0] = $text[0] -replace " ", ""
$csv = $text | ConvertFrom-CSV
Note: The solutions below avoid loading the entire CSV file into memory.
First, get the header row and fix it by removing all whitespace from it:
$header = (Get-Content -TotalCount 1 .\script.csv) -replace '\s+'
If you want to rewrite the CSV file to fix its header problem:
# Write the corrected header and the remaining lines to the output file.
# Note: I'm outputting to a *new* file, to be safe.
# If the file fits into memory as a whole, you can enclose
# Get-Content ... | Select-Object ... in (...) and write back to the
# input file, but note that there's a small risk of data loss, if
# writing back gets interrupted.
& { $header; Get-Content .\script.csv | Select-Object -Skip 1 } |
Set-content -Encoding utf8 .\fixed.csv
Note: I've chosen -Encoding utf8 as the example output character encoding; adjust as needed; note that the default is ASCII(!), which can result in data loss.
If you just want to import the CSV using the fixed headers:
& { $header; Get-Content .\script.csv | Select-Object -Skip 1 } | ConvertFrom-Csv
As for what you tried:
Import-Csv uses the column names in the header as property names of the custom objects it constructs from the input rows.
These property names are locked in at the time of reading the file, and cannot be changed later - unless you explicitly construct new custom objects from the old ones with the property names trimmed.
Import-Csv ... | ForEach-Object {$_.Trimend()}
Since Import-Csv outputs [pscustomobject] instances, reflected one by one in $_ in the ForEach-Object block, your code tries to call .TrimEnd() directly on them, which will fail (because it is only [string] instances that have such a method).
Aside from that, as stated, your goal is to trim the property names of these objects, and that cannot be done without constructing new objects.
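For completeness, constructing new objects with trimmed property names could look roughly like the sketch below. It uses a shortened, made-up header and one sample row in place of script.csv, and fixing the header line before parsing (as shown above) is usually the simpler route:

```powershell
# Sample data standing in for script.csv (assumption); note the spaces after the column names.
$rows = 'author ,Revision ,Path', 'alice,12,/trunk' | ConvertFrom-Csv

# Build a new object per row, copying each value under its trimmed property name.
$fixed = foreach ($row in $rows) {
    $props = [ordered]@{}
    foreach ($p in $row.PSObject.Properties) {
        $props[$p.Name.Trim()] = $p.Value
    }
    [pscustomobject]$props
}

$fixed.author   # alice
```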
Read the whole file into an array:
$a = Get-Content test.txt
Replace the spaces in the first array element ([0]) with empty strings:
$a[0] = $a[0] -replace " ", ""
Write over the original file: (Don't forget backups!)
$a | Set-Content test.txt
$inFilePath = "C:\temp\headerwithspaces.csv"
$content = Get-Content $inFilePath
$csvColumnNames = ($content | Select-Object -First 1) -Replace '\s',''
$csvColumnNames = $csvColumnNames -Replace '\s',''
$remainingFile = ($content | Select-Object -Skip 1)

Powershell import-csv with empty headers

I'm using PowerShell To import a TAB separated file with headers. The generated file has a few empty strings "" on the end of first line of headers. PowerShell fails with an error:
"Cannot process argument because the
value of argument "name" is invalid.
Change the value of the "name"
argument and run the operation again"
because the header's require a name.
I'm wondering if anyone has any ideas on how to manipulate the file to either remove the double quotes or enumerate them with a "1" "2" "3" ... "10" etc.
Ideally I would like to not modify my original file. I was thinking something like this
$fileContents = Get-Content -Path = $tsvFileName
$firstLine = $fileContents[0].ToString().Replace('`t""',"")
$fileContents[0] = $firstLine
Import-Csv $fileContents -Delimiter "`t"
But Import-Csv is expecting $fileContents to be a path. Can I get it to use Content as a source?
You can either provide your own headers and ignore the first line of the csv, or you can use convertfrom-csv on the end like Keith says.
ps> import-csv -Header a,b,c,d,e foo.csv
Now the invalid headers in the file are just a row that you can skip.
-Oisin
If you want to work with strings instead use ConvertFrom-Csv e.g.:
PS> 'FName,LName','John,Doe','Jane,Doe' | ConvertFrom-Csv | Format-Table -Auto
FName LName
----- -----
John Doe
Jane Doe
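Applied to the question, this means the first line can be fixed in memory and the lines piped to ConvertFrom-Csv, leaving the original file untouched. A minimal sketch with made-up tab-separated sample lines (in practice they would come from Get-Content); note that the search string must be double-quoted so that `t becomes a real tab:

```powershell
# Hypothetical sample lines; the header ends with two empty, quoted column names.
$lines = @(
    "Name`tAge`t`"`"`t`"`""
    "John`t42"
)

# Strip every <tab>"" sequence from the header line only.
$lines[0] = $lines[0].Replace("`t`"`"", '')

# Parse the corrected lines directly, without writing a temporary file.
$data = $lines | ConvertFrom-Csv -Delimiter "`t"
$data.Name   # John
```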
I ended up needing to handle multiple instances of this issue. Rather than use the -Header and manually setting up each import instance I wrote a more generic method to apply to all of them. I cull out all of the `t"" instances of the first line and save the file to open as a $filename + _bak and import that one.
$fileContents = Get-Content -Path $tsvFileName
if( ([string]$fileContents[0]).Contains('""') )
{
    # The search string is double-quoted so that `t is expanded to a tab character.
    [string]$fixedFirstLine = $fileContents[0].ToString().Replace("`t`"`"","")
    $fileContents[0] = $fixedFirstLine
    $tsvFileName = [string]::Format("{0}_bak",$tsvFileName)
    $fileContents | Out-File -FilePath $tsvFileName
}
Import-Csv $tsvFileName -Delimiter "`t"
My solution if you have many columns:
$index=0
$ColumnsName=(Get-Content "C:\temp\yourCSCFile.csv" | select -First 1) -split ";" | %{
    $index++
    "Header_{0:d5}" -f $index
}
# The header was split on ";", so the same delimiter is passed to Import-Csv.
import-csv "C:\temp\yourCSCFile.csv" -Delimiter ";" -Header $ColumnsName
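The header generation can be tried on its own with a sample first line instead of a file. In this sketch the input string is made up, with two empty column names at the end; every column gets a numbered placeholder name regardless of what the file contained:

```powershell
# Hypothetical first line of a semicolon-separated file with two empty column names.
$firstLine = 'a;b;;'

$index = 0
$names = $firstLine -split ';' | ForEach-Object {
    $index++                        # counts the columns; emits nothing by itself
    'Header_{0:d5}' -f $index       # zero-padded to five digits
}

$names   # Header_00001 ... Header_00004
```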