Hebrew characters can't be used when renaming files - powershell

I'm trying to use the following code:
ls *.xml | Foreach {$i=1} {
    $nonParsedXML = $_
    [xml]$parsedXML = Get-Content $nonParsedXML -Encoding utf8
    $title = $parsedXML.title
    $nonParsedXMLwithExtension = "$($title).xml"
    Rename-Item $nonParsedXML -NewName $nonParsedXMLwithExtension
}
The code renames each file, using the content of a tag inside the file as the new name. It works when the tag content is in English, but not when it is in Hebrew: the file is renamed, but instead of Hebrew characters I see block characters.
In case you wonder, the problem occurs when I'm using PowerShell ISE.
When I'm using PowerShell in the command prompt, the code can't be run at all: the console can't handle the Hebrew characters and produces errors.
Could you please clarify this issue? Is there a solution?

It seems like a default file charset problem; try changing the default charset to a Unicode one.
http://answers.microsoft.com/en-us/windows/forum/windows_7-windows_programs/default-utf-8-encoding-for-new-notepad-documents/525f0ae7-121e-4eac-a6c2-cfe6b498712c
I hope this article helps you.
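If the problem is only how the characters are displayed in the console, switching the session to UTF-8 sometimes helps. A minimal sketch of that workaround (my own assumption, not from the linked article):
# Hedged sketch: make the current console session use UTF-8
[Console]::OutputEncoding = [System.Text.Encoding]::UTF8   # how console output is rendered
$OutputEncoding = [System.Text.Encoding]::UTF8             # what PowerShell sends to native programs
Note this only affects display and piping to native programs, not how Rename-Item writes the file name itself.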

A very nice and helpful guy gave me the solution. Here it is:
dir -Path C:\Temp\Test -Filter *.xml | ForEach-Object {
    $xml = [xml](Get-Content -Path $_.FullName -Encoding UTF8)
    Rename-Item -Path $_.FullName -NewName "$($xml.title).xml"
}
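If a title can contain characters that are illegal in Windows file names, a defensive variant might look like this; the underscore substitution is my own assumption, not part of the answer above:
# Hedged sketch: swap out characters that are invalid in file names
dir -Path C:\Temp\Test -Filter *.xml | ForEach-Object {
    $xml = [xml](Get-Content -Path $_.FullName -Encoding UTF8)
    $invalid = [regex]::Escape(-join [IO.Path]::GetInvalidFileNameChars())
    $safeTitle = $xml.title -replace "[$invalid]", '_'
    Rename-Item -Path $_.FullName -NewName "$safeTitle.xml"
}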

How to change a character in multiple .txt-files and save/ overwrite the existing file in Powershell

I'm really new to PowerShell (2 days) and I'm not good at it yet. :(
My question is:
How do I change a character in multiple .txt files and save/overwrite the existing files in PowerShell?
My goal is to copy multiple RAW files into a new folder, change the file extension from .tsv to .txt, and finally change one character in these files from % to percent.
What I've got so far: the first two steps are working, but I'm losing my mind over the third step (the replacement).
Copy-Item -Path "C:\Users\user\Desktop\RAW\*.tsv" -Destination "C:\Users\user\Desktop\TXT" -Recurse
Set-Location "C:\Users\User\Desktop\TXT"
Get-ChildItem *.tsv | Rename-Item -NewName { $_.Name -replace '\.tsv$','.txt' }
This works fine for me, but I can't get any further.
I am able to replace the "%" in one specific file and save the result to a new file, but this doesn't work as batch processing across changing file names.
$file = "A.txt"
Get-Content $file | Foreach {$_ -replace "%", "percent"} | Set-Content A_1.txt
It would be perfect if "$file = "A.txt"" were "all the files in this path with .txt" and "Set-Content A_1.txt" were "overwrite the existing file".
I hope someone can help me, thank you! <3 <3 <3
You already have most of the solution in your first code snippet; you just need to iterate over the files again to perform the replace and save.
$txtFiles = Get-ChildItem -Name *.txt
ForEach ($file in $txtFiles) {
    (Get-Content $file) | ForEach-Object {
        $_ -replace '%','percent'
    } | Set-Content $file
}
The first line collects all the text files into an array. The foreach loop iterates over the files in the array, and the parenthesized (Get-Content $file) reads each file completely and releases it before anything is written; that's the reason for the parentheses. ForEach-Object then iterates over the content of the file, and Set-Content saves it back to the same file name as before.
If you skip the parentheses around Get-Content $file, the file is still open for reading when Set-Content tries to write to it, and you get an error about not being able to save the file.
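A more compact variant reads each file as a single string with -Raw; this is my own sketch, not part of the answer above:
# Hedged sketch: one -replace per file instead of per line
Get-ChildItem *.txt | ForEach-Object {
    (Get-Content $_.FullName -Raw) -replace '%','percent' | Set-Content $_.FullName
}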

PowerShell Get-ChildItem: filtering csv files and -Recurse not working

I created a short powershell script to convert csv files from Unicode to UTF-8 encoding. My script outputs new files with the original file name preceded by UTF8. I'm running into two issues:
I'm trying to run the powershell script only on csv files. Currently the script runs on every file in the directory, including the powershell script itself (it outputs a new file called UTF8pshell_script if the powershell script was called pshell_script, for example). The other approaches I've tried for restricting the script to csv files just end up making it do nothing.
I'm trying to run the script on sub-directories. The first issue is that output files created from csv files in subdirectories have no content whatsoever; this doesn't happen when the script is run in the same directory as the csv file. It's not crucial, but I'm also unsure how to get output files created from csv files in subdirectories written to those same subdirectories (currently they all end up in the main directory, where the powershell script is).
The conversion itself is done as
Get-Content -Encoding Unicode $_ | Out-File -Encoding UTF8
and the full pipeline is
Get-ChildItem -Recurse | ForEach-Object {Get-Content -Encoding Unicode $_ | Out-File -Encoding UTF8 "UTF8$_"}
The desired behavior is for the script to run only on csv files, and for each output file to be written to the same subdirectory as the file it was created from.
Get-ChildItem takes a -Filter parameter, which for files is the simple wildcard pattern. This will allow you to restrict your cmdlet to CSV files only:
Get-ChildItem -Filter *.csv
To process subdirectories, you may also use the -Recurse switch
Get-ChildItem -Filter *.csv -Recurse
Now, I'm never quite sure how $_ changes as you pass different objects through the pipe, so I'm probably not doing the next steps the most efficient way - but it will be clear what I'm trying to do:
Each file object that we find needs to be processed as follows:
Dissect it into a path and a filename: $filepath = $_.PSParentPath; $filename = $_.PSChildName
Load up the CSV: Import-CSV -Path $_
Output the new CSV with the proper encoding: Export-CSV -Path ("{0}\UTF8{1}" -f $filepath,$filename) -Encoding UTF8
So, we put it all together:
Get-ChildItem -Filter *.csv -Recurse -Exclude UTF8* | ForEach-Object {
    $filepath = $_.PSParentPath
    $filename = $_.PSChildName
    Import-CSV -Path $_ |
        Export-CSV -Encoding UTF8 -Path ("{0}\UTF8{1}" -f $filepath,$filename) -NoTypeInformation
}
The -Exclude UTF8* in the Get-ChildItem ensures that when you create a file, it doesn't get picked up later and re-processed. The -NoTypeInformation on the Export-CSV compensates for a stupidity built in to the cmdlet that causes an extra line with a meaningless object type name at the beginning of the file.
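For reference, without -NoTypeInformation the first line of the output looks something like this in Windows PowerShell (the column names here are invented for illustration):
#TYPE System.Management.Automation.PSCustomObject
"Name","Value"
"example","42"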
Depending on the original encoding (and presence of a BOM) you might have to specify an encoding also on the input side.
ForEach($Csv in (Get-ChildItem -Filter *.csv -Recurse -Exclude UTF8*)){
    (Get-Content $Csv.FullName -Raw) |
        Set-Content -Path {Join-Path $Csv.Directory ("UTF8"+$Csv.Name)} -Encoding UTF8
}
LotPings beat me to this by 10 minutes with a virtually identical answer, but I'm leaving this for the 'passing an empty file to the pipeline' bit that I have. I also realize after the fact that you don't need a pipeline variable for that same reason, as you only need it if you pass things through the pipeline within the loop.
If all you want to do is change the encoding, I would use a ForEach($x in $y){} loop, or a ForEach-Object{} loop with a PipelineVariable on the Get-ChildItem. I'll show the latter, since I think pipeline variables are underused. I would also not read the file and pipe it to something, because if the file is empty nothing is passed down the pipeline and no new file gets created.
Get-ChildItem *.csv -Recurse -PipelineVariable File | ForEach-Object{
    Set-Content -Value (Get-Content $File.FullName -Encoding Unicode) -Path (Join-Path $File.Directory "UTF8$($File.Name)") -Encoding UTF8
}
If you specify the file extension at the end of the path given to Get-ChildItem, it will get only the files with the .csv extension.
By building the file path passed to Out-File, each new file is sent to the specified directory.
Get-ChildItem -Path C:\folder\*.csv -Recurse | ForEach-Object {Get-Content -Encoding Unicode $_ | Out-File -Encoding UTF8 -FilePath "C:\folder\UTF8$($_.Name)"}

Keep Same Encoding With Set-Content Multiple Files in PowerShell

I'm attempting to write a script to be used to migrate an application from server to server and/or from one drive letter to another drive letter. My goal is to copy the directory from one location, move it to another, and then run a script to edit all instances of the old hostname, IP address, and drive letter to reflect the new hostname, IP address, and drive letter on the new server. This appears to do exactly that:
ForEach($File in (Get-ChildItem $path\* -Include *.xml,*.config -Recurse)){
    (Get-Content $File.FullName -Raw) -replace [RegEx]::Escape($oldhost),$newhost `
        -replace [RegEx]::Escape($oldip),$newip `
        -replace "$olddriveletter(?=:\\Application)",$newDriveLetter |
        Set-Content $File.FullName -NoNewLine
}
The one problem I am having is that the files all have different types of encoding. Some ANSI, some UTF-8, some Unicode, etc. When I run the script, it saves everything as ANSI and then my application fails to work. I know how to add the encoding parameter, but is there any way to keep the same encoding on each individual file, without writing out a script specifying each individual file in the directory and the encoding that each individual file has?
That would be difficult. It's too bad that Get-Content doesn't pass along an encoding property.
Here's a script that tries to get the encoding if there's a BOM signature. Maybe you can just run it first and check them all. Keep in mind that some Windows files are Unicode with no BOM.
At least xml files can declare their own encoding: get-childitem *.xml | select-string encoding
There might be a better way to load xml files, see the bottom answer: Powershell: Setting Encoding for Get-Content Pipeline
# encoding.ps1
# https://stackoverflow.com/questions/3825390/effective-way-to-find-any-files-encoding
param([Parameter(ValueFromPipeline=$True)] $filename)
process {
    # StreamReader sniffs the BOM when the third argument is $true
    $reader = [IO.StreamReader]::new($filename, [Text.Encoding]::Default, $true)
    $peek = $reader.Peek()    # a read is needed before CurrentEncoding is updated
    $encoding = $reader.CurrentEncoding
    $reader.Close()
    [pscustomobject]@{Name = Split-Path $filename -Leaf
                      BodyName = $encoding.BodyName
                      EncodingName = $encoding.EncodingName}
}
# end encoding.ps1
PS C:\users\me> get-childitem chinese16.txt | encoding
Name BodyName EncodingName
---- -------- ------------
chinese16.txt utf-16 Unicode
Something like this will use the encoding indicated in the xml file, even if it didn't truly match beforehand. (This also makes the xml pretty.)
PS C:\users\me> [xml]$xml = get-content file.xml
PS C:\users\me> $xml.save("$pwd\file.xml")
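To see which encoding an XML file declares for itself, you can inspect the prolog. A small sketch, assuming file.xml actually starts with an XML declaration:
# Hedged check: print the encoding declared in the XML prolog, if any
[xml]$xml = Get-Content .\file.xml
if ($xml.FirstChild -is [System.Xml.XmlDeclaration]) {
    $xml.FirstChild.Encoding    # e.g. "utf-8"
}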
Use the file.exe from the git binaries to find out the encoding.
Then, add the encoding parameter to the Set-Content line, with if/else statements to meet your needs.
ForEach($File in (Get-ChildItem $path\*)){
    $Content = (Get-Content $File.FullName -Raw) -replace [RegEx]::Escape($oldhost),$newhost `
        -replace [RegEx]::Escape($oldip),$newip `
        -replace "$olddriveletter(?=:\\Application)",$newDriveLetter
    $Encoding = file --mime-encoding $File.FullName
    $FullName = $File.FullName
    Write-Host "$FullName - $Encoding"
    if($Encoding -like "*utf*"){
        Set-Content -Path $FullName -Value $Content -NoNewLine -Encoding UTF8
    }
    else {
        Set-Content -Path $FullName -Value $Content -NoNewLine
    }
}
Reference:
https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/set-content
http://gnuwin32.sourceforge.net/packages/file.htm

INI editing with PowerShell

My problem is that I want to change paths in INI files which are saved in a folder and its subfolders.
The path of the folder is C:\New\Path\.
Example INI file:
notAIniText = C:\A\Path\notAIniText
maybeAIniText = C:\A\Path\maybeAIniText
AIniText = C:\A\Path\AIniText
I read some other questions about PsIni, but I don't want to use it because I want to run the script on multiple PCs and I don't want to install PsIni every time.
I tried:
$mabyIni = "C:\New\Path"
$AiniFile = Get-ChildItem -Path "C:\New\Path\*" -Include *.ini -Recurse
foreach ($file in $AiniFile) {
    Select-String -Path $file -AllMatches "C:\A\Path\" | ForEach-Object {
        $file -replace [regex]::Escape('C:\A\Path'), ('$mabyIni')
    } | Set-Content $mabyIni -Include *.ini
But this doesn't work. I tried it with Get-Content too, but that also doesn't work.
Is there any way without PsIni?
The code in your comment is close, but just has a few syntax issues. It starts out strong:
$mabyIni = "C:\New\Path"
$AiniFile = Get-ChildItem -Path "C:\New\Path\*" -Include *.ini -Recurse
ForEach($file in $AiniFile) {
So far, so good. You define the new path, and you get a list of the .ini files in the folder, then you start to loop through those files. This is all good code so far. Then things start to go astray.
I see that you are trying to get the contents of each .ini file, replace the string in question, and then output that file to the new path with this:
(Get-Content $AiniFile.PSPath) | ForEach-Object {
    $file -replace [regex]::Escape('C:\A\Path'),('$mabyIni')
} | Set-Content $mabyIni -Include *.ini
Unfortunately you're using the wrong variables, and adding in an extra ForEach loop in there as well. Let's start with the Get-Content line. At this point in the script you are looping through files, with each current file being represented by $file. So what you really want to get the contents of is $file, and not $AiniFile.PSPath.
(Get-Content $file)
Ok, that got us the contents of that file as an array of strings. Now, I'm guessing you weren't aware, but the -replace operator works on arrays of strings. Perfect, we just so happen to have gotten an array of strings! Since the Get-Content command is wrapped in parentheses, it completes first, so we can just tack the -replace command on right after it.
(Get-Content $file) -replace [regex]::Escape('C:\A\Path'),$mabyIni
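To illustrate that elementwise behavior with a toy example of my own (not from the thread):
PS> 'sale: 5%','tax: 7%' -replace '%',' percent'
sale: 5 percent
tax: 7 percent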
Your -replace command was super close! In fact, I have to give you props for using [regex]::Escape() in there. That's totally a pro move, well done! The only issue is that the replacement string didn't need to be in parentheses, and it was single quoted, so the string would not have been expanded and your .ini files would have all had a line like:
AIniText = $mabyIni\AIniText
Not exactly what you wanted, I'm guessing, so I removed the parentheses (they weren't hurting anything, but weren't helping either, so for cleanliness and simplicity I got rid of them), and I got rid of the single quotes ' as well, since we really just want the string that's stored in that variable.
So now we're looping through files, reading the contents, replacing the old path with the new path, all that's left is to output the new .ini file. It looks like they're already in place, so we just use the existing path for the file, and set the content to the updated data.
(Get-Content $file) -replace [regex]::Escape('C:\A\Path'),$mabyIni | Set-Content -Path $File.FullName
Ok, done! You just have to close the ForEach loop, and run it.
$mabyIni = "C:\New\Path"
$AiniFile = Get-ChildItem -Path "C:\New\Path\*" -Include *.ini -Recurse
ForEach($file in $AiniFile) {
    (Get-Content $file) -replace [regex]::Escape('C:\A\Path'),$mabyIni | Set-Content -Path $file.FullName
}
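Since the original attempt used Select-String, a hedged variant of the same loop that only rewrites files actually containing the old path could look like this (my own sketch, reusing $mabyIni from above):
# Only touch .ini files that still contain the old path
$old = [regex]::Escape('C:\A\Path')
Get-ChildItem -Path "C:\New\Path\*" -Include *.ini -Recurse |
    Where-Object { Select-String -Path $_.FullName -Pattern $old -Quiet } |
    ForEach-Object {
        (Get-Content $_.FullName) -replace $old, $mabyIni | Set-Content -Path $_.FullName
    }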

Using PowerShell, read multiple known file names, append text of all files, create and write to one output file

I have five .sql files and know the name of each file. For this example, call them one.sql, two.sql, three.sql, four.sql and five.sql. I want to append the text of all files and create one file called master.sql. How do I do this in PowerShell? Feel free to post multiple answers to this problem because I am sure there are several ways to do this.
My attempt does not work and creates a file with several hundred thousand lines.
PS C:\sql> get-content '.\one.sql' | get-content '.\two.sql' | get-content '.\three.sql' | get-content '.\four.sql' | get-content '.\five.sql' | out-file -encoding UNICODE master.sql
Get-Content one.sql,two.sql,three.sql,four.sql,five.sql > master.sql
Note that > is equivalent to Out-File -Encoding Unicode. I only tend to use Out-File when I need to specify a different encoding.
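For example, to get UTF-8 output instead of the Unicode default:
Get-Content one.sql,two.sql,three.sql,four.sql,five.sql | Out-File -Encoding UTF8 master.sql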
There are some good answers here, but if you have a whole lot of files, or maybe you don't know all of the names, this is what I came up with:
$vara = get-childitem -name "path"
$varb = foreach ($a in $vara) {gc "path\$a"}
example
$vara = get-childitem -name "c:\users\test"
$varb = foreach ($a in $vara) {gc "c:\users\test\$a"}
You can obviously pipe this directly into | add-content or whatever but I like to capture in variables so I can manipulate later on.
See if this works better
get-childitem "one.sql","two.sql","three.sql","four.sql","five.sql" | get-content | out-file -encoding UNICODE master.sql
I needed something similar, Chris Berry's post helped, but I think this is more efficient:
gci -name "*PathToFiles*" | gc > master.sql
The first part gci -name "*PathToFiles*" gets you your file list. This can be done with wildcards to just get your .sql files i.e. gci -name "\\share\folder\*.sql"
Then it pipes to Get-Content and redirects the output to your master.sql file. As noted by Keith Hill, you can use Out-File in place of > to better control your output if needed.
I think the logical way of solving this is to use Add-Content:
$files = Get-ChildItem '.\one.sql', '.\two.sql', '.\three.sql', '.\four.sql', '.\five.sql'
$files | foreach { Get-Content $_ | Add-Content '.\master.sql' -Encoding UNICODE }
However, Get-Content is usually very slow when reading multiple very large files. If that's your case, this article could help: http://keithhill.spaces.live.com/blog/cns!5A8D2641E0963A97!756.entry
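If speed does become an issue, here is a hedged sketch of a byte-level alternative using .NET streams; note that it blindly concatenates bytes, so mixed encodings or BOMs are not reconciled:
# Concatenate files at the byte level; fast, but encoding-unaware
$out = [IO.File]::Create("$pwd\master.sql")
foreach ($f in 'one.sql','two.sql','three.sql','four.sql','five.sql') {
    $in = [IO.File]::OpenRead("$pwd\$f")
    $in.CopyTo($out)
    $in.Dispose()
}
$out.Dispose()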
What about:
Get-Content .\one.sql,.\two.sql,.\three.sql,.\four.sql,.\five.sql | Set-Content .\master.sql
Here is how I concatenate the sql files from the Sql folder:
# Set the current location of the script to use relative paths
Set-Location $PSScriptRoot
# Concatenate all the sql files
$concatSql = Get-Content -Path .\Sql\*.sql
# Write/overwrite the sql to a single file
Set-Content -Path concatFile.sql -Value $concatSql
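One assumption worth checking: wildcard expansion usually returns the files in name order, but if the execution order of the scripts matters, it is safer to sort explicitly. A small sketch:
# Concatenate in an explicit, predictable order
$concatSql = Get-ChildItem -Path .\Sql\*.sql | Sort-Object Name | Get-Content
Set-Content -Path concatFile.sql -Value $concatSql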