Powershell - Getting a directory to output a file at a time

I'm super new at all of this so please excuse my lack of technical elegance and all around idiocy.
dir c:\Users\me\desktop\Test\*.txt | %{ $sourceFile = $_; get-content $_} | Out-File "$sourceFile.results"
How can I modify this command so that, instead of one file containing the contents of all the text files, I get a one-to-one ratio - one output file for each input text file?
I realize that this objective is ridiculous in terms of practical application, but I'm conceptually trying to piece this together bit by bit so I can really understand it.
P.S. What's with the %? Haha, another ridiculous question that doesn't seem worth a separate post - what does it do?

dir | % { Out-File -FilePath "new_$($_.Name)" -InputObject (gc $_.FullName) }
Only one pipeline is needed. This command prepends "new_" to the filename because I was writing to the same directory I was reading from; you can remove the prefix if it's not needed.
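As for the P.S.: % is just a built-in alias for ForEach-Object, which runs the script block once per pipeline item, with $_ holding the current item (likewise gc is an alias for Get-Content and dir for Get-ChildItem). A minimal illustration:
1..3 | % { $_ * 2 }              # outputs 2 4 6
1..3 | ForEach-Object { $_ * 2 } # identical; % is shorthand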


Get-Content Measure-Object Command: Additional rows are added to the actual row count

This is my first post here - my apologies in advance if I didn't follow a certain etiquette for posting. I'm a newbie to powershell, but I'm hoping someone can help me figure something out.
I'm using the following powershell script to tell me the total count of rows in a CSV file, minus the header. The result is written to a text file.
$x = (Get-Content -Path "C:\mysql\out_data\18*.csv" | Measure-Object -Line).Lines
$logfile = "C:\temp\MyLog.txt"
$files = get-childitem "C:\mysql\out_data\18*.csv"
foreach($file in $files)
{
$x--
"File: $($file.name) Count: $x" | out-file $logfile -Append
}
I am doing this for 10 individual files. But there is just ONE file that keeps adding exactly 807 more rows to the actual count. For example, for the code above, the actual row count (minus the header) in the file is 25,083. But my script above generates 25,890 as the count. I've tried running this for different iterations of the same type of file (same data, different days), but it keeps adding exactly 807 to the row count.
Even when running only (Get-Content -Path "C:\mysql\out_data\18*.csv" | Measure-Object -Line).Lines, I still see the wrong record count in the powershell window.
I'm suspicious that there may be a problem with the csv file itself. I'm coming to that conclusion since 9 out of 10 files generate the correct row count. Thank you in advance for your time.
To measure the items in a csv you should use Import-Csv rather than Get-Content. This way you don't have to worry about headers or empty lines.
(Import-Csv -Path $csvfile | Measure-Object).Count
It's definitely possible there's a problem with that csv file. Note that if the csv has cells that include line breaks, those will confuse Get-Content, so try Import-Csv as well.
I'd start with this
$PathToQuestionableFile = "c:\somefile.csv"
$TestContents = Get-Content -Path $PathToQuestionableFile
Write-Host "`n-------`nUsing Get-Content:"
$TestContents.count
$TestContents[0..10]
$TestCsv = Import-CSV -Path $PathToQuestionableFile
Write-Host "`n-------`nUsing Import-CSV:"
$TestCsv.count
$TestCsv[0..10] | Format-Table
That will let you see what Get-Content is pulling so you can narrow down where the problem is.
If it is in the file itself and using Import-CSV doesn't fix it, I'd use Notepad++ to check both the encoding and the line endings:
Encoding is a drop-down menu; compare it against the other csv files.
Line endings can be seen via View > Show Symbol > Show All Characters. They should be consistent across the file, and should be one of these:
CR (typically if it came from a mac)
LF (typically if it came from *nix or the internet)
CRLF (typically if it came from windows)
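If you'd rather check from PowerShell itself, here's a minimal sketch (the file path is a placeholder) that counts each line-ending style in the raw text:
$raw  = [System.IO.File]::ReadAllText('C:\mysql\out_data\suspect.csv')
$crlf = [regex]::Matches($raw, "`r`n").Count
$lf   = [regex]::Matches($raw, "(?<!`r)`n").Count   # LF not preceded by CR
$cr   = [regex]::Matches($raw, "`r(?!`n)").Count    # CR not followed by LF
"CRLF: $crlf  LF-only: $lf  CR-only: $cr"
A mix of styles in one file, or stray line breaks inside quoted cells, would explain Get-Content counting 807 extra lines.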

Powershell Set-Content is duplicating lines in .ini file

Go easy on me, first time posting.
I'm trying to use Powershell to Get-Content from an INI file. Particularly, I need to change two separate lines in the file. It runs, but instead of just replacing those 2 lines, it duplicates everything. It also doesn't replace the line I'm trying to tell it to replace, but instead it just adds my new line leaving the original in as well.
$FilePath = "C:\Users\folder\*.ini"
(Get-Content $FilePath) |
ForEach-Object {
$_ -replace "MailBell=0","MailBell=1"
$_ -replace "MailWindow=0","MailWindow=1"
} |
Set-Content $FilePath
That's a look at the code. Any help is greatly appreciated.
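For what it's worth, the duplication is most likely because the script block emits two strings for every input line - each -replace statement produces its own output. Chaining the operators so the block emits exactly one string per line should fix it; a sketch against the code above:
$FilePath = "C:\Users\folder\*.ini"
(Get-Content $FilePath) |
ForEach-Object {
    # Chain the replacements so each input line yields one output line
    $_ -replace "MailBell=0","MailBell=1" -replace "MailWindow=0","MailWindow=1"
} |
Set-Content $FilePath
Note also that with a wildcard path, Get-Content concatenates every matching .ini file and Set-Content writes that combined result back to each of them, so a concrete file name is safer here.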

Copying files defined in a list from network location

I'm trying to teach myself enough powershell or batch programming to achieve the following. I've had a search and looked through a couple of hours of Youtube tutorials, but can't quite piece it all together to figure out what I need (I don't get tokens, for example, but they seem necessary in the for loop). I'm also not sure if the below is best achieved by robocopy or xcopy.
Task:
Define a list of files to retrieve in a csv (each file name is a 13-digit number; the extension is UNKNOWN, usually .jpg but occasionally .png - could this be achieved with a wildcard?)
The list would read something like:
9780761189931
9780761189988
9781579657159
For each line in this text file, do:
Search a network folder and all subfolders
If exact filename is found, copy to an arbitrary target (say a new folder created on desktop)
(Not 100% necessary, but nice to have) Once the For loop has completed, output a list of files copied into a text file in the newly created destination folder
I gather that I'll maybe need to do a couple of things first, like define variables for the source and destination folders? I found the below elsewhere but couldn't quite get my head around it.
set src_folder=O:\2017\By_Month\Covers
set dst_folder=c:\Users\%USERNAME%\Desktop\GetCovers
for /f "tokens=*" %%i in (ISBN.txt) DO (
xcopy /K "%src_folder%\%%i" "%dst_folder%"
)
Thanks in advance!
This solution is in powershell, by the way.
To get all subfiles of a folder, use Get-ChildItem and the pipeline; you can then compare each name against the contents of your CSV (which you can read using Import-Csv, by the way).
Get-ChildItem -path $src_folder -recurse | foreach{$_.fullname}
I'd personally then use a function to edit the name as a string, but I know this probably isn't the best way to do it. Create a function outside of the pipeline, and have it return a modified path in such a way that you can continue the previous line like this:
Get-ChildItem -path $src_folder -recurse | foreach{ $_.CopyTo((edit-path $_.fullname)) }
Where edit-path is your function that takes in the path and modifies it to return your destination path. (Note the extra parentheses: a method argument that is itself a command call has to be wrapped in parentheses, and PowerShell doesn't allow a space between a method name and its opening parenthesis.) Alternatively, you can use robocopy or xcopy instead of CopyTo, but Copy-Item is a powershell native and doesn't require much string manipulation (and in my experience, the less, the better).
Edit: Here's a function that could do the trick:
function edit-path{
    Param([string] $path)
    # Swap the source root for the destination root; the substring
    # keeps the leading backslash from the original path
    $modified_path = $dst_folder + $path.Substring($src_folder.Length)
    return $modified_path
}
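For example, with $src_folder = 'O:\2017\By_Month\Covers' and $dst_folder = 'C:\Users\me\Desktop\GetCovers' (placeholder values), edit-path 'O:\2017\By_Month\Covers\sub\9780761189931.jpg' returns 'C:\Users\me\Desktop\GetCovers\sub\9780761189931.jpg'.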
Edit: Here's how to integrate the importing from CSV, so that the copy only happens to files that are written in the CSV (which I had left out, oops):
$csv = import-csv $CSV_path
Get-ChildItem -path $src_folder -recurse | where-object{ $csv -contains $_.name } | foreach{ $_.CopyTo((edit-path $_.fullname)) }
Note that you have to put the whole CSV path in the $CSV_path variable, and depending on how the contents of that file are written, you may have to use $_.fullname, or other parameters.
This seems like an average enough problem:
$Arr = Import-CSV -Path $CSVPath
Get-ChildItem -Path $Folder -Recurse |
    Where-Object -FilterScript { $Arr -contains $PSItem.Name.Substring(0, ($PSItem.Name.Length - 4)) } |
    ForEach-Object -Process {
        Copy-Item -Path $PSItem.FullName -Destination $env:UserProfile\Desktop
        $PSItem.Name | Out-File -FilePath $env:UserProfile\Desktop\Results.txt -Append
    }
I'm not great with string manipulation so the string bit is a bit confusing, but here's everything spelled out.
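Since the extension is unknown, another minimal sketch (paths and file names are placeholders) is to drive the loop from the list itself and let -Filter match any extension; this also produces the log of copied files the question asked for:
$src = 'O:\2017\By_Month\Covers'
$dst = "$env:USERPROFILE\Desktop\GetCovers"
New-Item -ItemType Directory -Path $dst -Force | Out-Null
Get-Content 'C:\temp\ISBN.txt' | ForEach-Object {
    # "$_.*" matches 9780761189931.jpg, 9780761189931.png, etc.
    Get-ChildItem -Path $src -Recurse -Filter "$_.*" | ForEach-Object {
        Copy-Item -Path $_.FullName -Destination $dst
        Add-Content -Path "$dst\copied.txt" -Value $_.Name   # log each copied file
    }
}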

Powershell script write back to sources from drag and drop

I need to create a powershell script that removes quotes from CSV files in a user friendly drag and drop way. I have the basics of the script down courtesy of this page:
http://blogs.technet.com/b/heyscriptingguy/archive/2011/11/02/remove-unwanted-quotation-marks-from-csv-files-by-using-powershell.aspx
And I've already successfully made .ps1 files drag-and-droppable courtesy of this Stack Overflow question:
Drag and Drop to a Powershell script
The author of the answer implies that it's just as easy to drop a single file, many files, or folders with lots of files in them. However, I have yet to figure this out in a way that can also write back to the source file. Here's my current code:
Param([string[]]$file)
(gc $file) | % {$_ -replace '"', ""} | out-file C:\Users\pfoster\Desktop\Output\test.txt -Fo -En ascii
Currently, this will only accept a single file, and it outputs the result as a txt to a specified file regardless of the source file type (I can change that to CSV easily, but I'd like the script to mirror the source). Ideally, I'd like it to accept files and folders, and to rewrite the source file. I have a feeling this would involve Get-ChildItem, but I'm not sure how to implement that in the current scenario. I've also tried out-file $file and that didn't work either.
Thanks for the help!
For writing the modified content back to the original files try something like this:
foreach ($file in $ARGS) {
(Get-Content $file) -replace '"', '' | Out-File $file -Encoding ASCII -Force
}
Use a foreach loop, because you need the file name in more than one place in the pipeline. Reading the content inside the grouping parentheses and then piping the modified content into the Out-File cmdlet makes sure that the output file is only written after the content has already been read in full.
Don't use a redirection operator ((Get-Content $file) >$file), because that would first open the file for writing (effectively truncating it) and afterwards read the content from the now empty file.
Beware that this approach may cause problems with large files, because each file is read completely into RAM before it is processed and written back to disk. If a file doesn't fit into the available RAM, the computer will start swapping, causing significant performance degradation.
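To also accept dropped folders, which the question asked about, a sketch along the same lines - the *.csv filter is an assumption, adjust as needed:
foreach ($item in $ARGS) {
    # Expand a dropped folder into its CSV files; pass plain files through
    if (Test-Path $item -PathType Container) {
        $targets = Get-ChildItem -Path $item -Filter *.csv -Recurse
    } else {
        $targets = Get-Item $item
    }
    foreach ($file in $targets) {
        (Get-Content $file.FullName) -replace '"', '' |
            Out-File $file.FullName -Encoding ASCII -Force
    }
}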

Find and Replace in a Large File

I want to find a piece of text in a large xml file and replace it with some other text. The file is around 50GB. I want to do this from the command line; I am looking at PowerShell and want to know if it can handle that size.
Currently I am trying something like this, but it does not like it:
Get-Content C:\File1.xml | Foreach-Object {$_ -replace "xmlns:xsi=\"http:\/\/www\.w3\.org\/2001\/XMLSchema-instance\"", ""} | Set-Content C:\File1.xml
The text I want to replace is xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" with an empty string "".
Questions:
Can PowerShell handle large files?
I don't want the replace to happen in memory and prefer streaming, assuming that will not bring the server to its knees.
Are there any other approaches I can take (different tools/strategy)?
Thanks
I had a similar need (and similar lack of powershell experience) but cobbled together a complete answer from the other answers on this page plus a bit more research.
I also wanted to avoid the regex processing, since I didn't need it either -- just a simple string replace -- but on a large file, so I didn't want it loaded into memory.
Here's the command I used (adding linebreaks for readability):
Get-Content sourcefile.txt
| Foreach-Object {$_.Replace('http://example.com', 'http://another.example.com')}
| Set-Content result.txt
Worked perfectly! Never sucked up much memory (it very obviously didn't load the whole file into memory), and just chugged along for a few minutes then finished.
Aside from worrying about reading the file in chunks to avoid loading it into memory, you need to dump to disk often enough that you aren't storing the entire contents of the resulting file in memory.
Get-Content sourcefile.txt -ReadCount 10000 |
Foreach-Object {
$line = $_.Replace('http://example.com', 'http://another.example.com')
Add-Content -Path result.txt -Value $line
}
The -ReadCount <number> sets the number of lines to read at a time, so $_ is a batch of lines and the ForEach-Object writes each batch as it is read. For a 30GB file filled with SQL inserts, I topped out around 200MB of memory and 8% CPU, while piping it all into Set-Content hit 3GB of memory before I killed it.
It does not like it because you can't read from a file and write back to it at the same time using Get-Content/Set-Content. I recommend using a temp file and then at the end, rename file1.xml to file1.xml.bak and rename the temp file to file1.xml.
Yes, as long as you don't try to load the whole file at once. Line-by-line will work but is going to be a bit slow. Use the -ReadCount parameter and set it to 1000 to improve performance.
Which command line? PowerShell? If so then you can invoke your script like so .\myscript.ps1 and if it takes parameters then c:\users\joe\myscript.ps1 c:\temp\file1.xml.
In general for regexes I would use single quotes if you don't need to reference PowerShell variables. Then you only need to worry about regex escaping and not PowerShell escaping as well. If you need to use double-quotes then the back-tick character is the escape char in double-quotes e.g. "`$p1 is set to $ps1". In your example single quoting simplifies your regex to (note: forward slashes aren't metacharacters in regex):
'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
Absolutely you want to stream this since 50GB won't fit into memory. However, this poses an issue if you process line-by-line. What if the text you want to replace is split across multiple lines?
If you don't have the split line issue then I think PowerShell can handle this.
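If line-by-line processing is acceptable, a minimal streaming sketch using .NET readers and writers directly (the .new.xml name is a placeholder) keeps memory flat and is considerably faster than Get-Content:
$reader = New-Object System.IO.StreamReader('C:\File1.xml')
$writer = New-Object System.IO.StreamWriter('C:\File1.new.xml')
try {
    while ($null -ne ($line = $reader.ReadLine())) {
        # Plain string replace: no regex engine, no whole-file buffering
        $writer.WriteLine($line.Replace('xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"', ''))
    }
}
finally {
    $reader.Close()
    $writer.Close()
}
Once you've verified the output, swap C:\File1.new.xml into place.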
This is my take on it, building on some of the other answers here:
Function ReplaceTextIn-File{
    Param(
        $infile,
        $outfile,
        $find,
        $replace
    )
    if( -Not $outfile)
    {
        $outfile = $infile
    }
    $temp_out_file = "$outfile.temp"
    Get-Content $infile | Foreach-Object {$_.Replace($find, $replace)} | Set-Content $temp_out_file
    if( Test-Path $outfile)
    {
        Remove-Item $outfile
    }
    Move-Item $temp_out_file $outfile
}
And called like so:
ReplaceTextIn-File -infile "c:\input.txt" -find 'http://example.com' -replace 'http://another.example.com'
The escape character in powershell strings is the backtick ( ` ), not backslash ( \ ). I'd give an example, but the backtick is also used by the wiki markup. :(
The only thing you should have to escape is the quotes - the periods and such should be fine without.
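Rendering it as a code block sidesteps the markup problem, so here are a couple of minimal examples of backtick escaping:
$x = 42
"The value is `"$x`""   # embedded double quotes -> The value is "42"
"col1`tcol2"            # `t is a tab character
'no `escaping` here'    # single quotes print the backticks literally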