Replace MKS with PowerShell

can anybody please translate these two scripts from MKS into Powershell? i would like to remove MKS from our ETL tools and accomplish this with Powershell but do not have the chops
1)
FileSize=`ls -l $1 | awk '{print $5}'`
if [ $FileSize -ge 100000000 ]; then
split -b 60000000 $1 $1
fi
2)
find $1 -type f -name *.txt -mtime +30 -exec rm {} \;
thanks very much
drew

Avoiding using even standard aliases here (eg. can use dir or ls rather than Get-ChildItem):
1) FileSize=`ls -l $1 | awk '{print $5}'`
$filesize = (Get-ChildItem $name).Length
if [ $FileSize -ge 100000000 ]; then split -b 60000000 $1 $1 fi
if ($filesize -ge 100000000) { ... }
(can't recall function of split)
2) find $1 -type f -name *.txt -mtime +30 -exec rm {} \;
$t = [datetime]::Now.AddDays(-30)
Get-ChildItem -path . -recurse -filter *.txt |
Where-Object { $_.LastWriteTime -lt $t -and -not $_.PSIsContainer } |
Remove-Item
(Add a -whatif to the Remove-Item to list what would be deleted without deleting them.)

1) Get the size of the file named by $1. If the size is more than 100 megabytes, split it into parts of 60 megabytes each.
MKS
FileSize=`ls -l $1 | awk '{print $5}'`
if [ $FileSize -ge 100000000 ]; then
split -b 60000000 $1 $1
fi
PowerShell
function split( [string]$path, [int]$byteCount ) {
# Find how many splits will be made.
$file = Get-ChildItem $path
[int]$splitCount = [Math]::Ceiling( $file.Length / $byteCount )
$numberFormat = '0' * "$splitCount".Length
$nameFormat = $file.BaseName + "{0:$numberFormat}" + $file.Extension
$pathFormat = Join-Path $file.DirectoryName $nameFormat
# Read the file in $byteCount chunks, sending each chunk to a numbered split file.
Get-Content $file.FullName -Encoding Byte -ReadCount $byteCount |
Foreach-Object { $i = 1 } {
$splitPath = $pathFormat -f $i
Set-Content $splitPath $_ -Encoding Byte
++$i
}
}
$FileSize = (Get-ChildItem $name).Length
if( $FileSize -gt 100MB ) {
split -b 60MB $name
}
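Notes: Only the split functionality needed by the question was implemented, and it was tested just on small file sizes. 100MB and 60MB are binary units (104857600 and 62914560 bytes), so the thresholds are slightly larger than the decimal values in the MKS script. -Encoding Byte works in Windows PowerShell; PowerShell 6 and later use -AsByteStream instead. You may want to look into StreamReader and StreamWriter to perform more efficient buffered IO.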
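As a rough illustration of the buffered IO mentioned in the notes, here is a minimal sketch that copies the file through a fixed-size buffer using .NET file streams, so a whole 60 MB chunk never has to sit in the pipeline. The function name Split-FileBuffered, its parameters, and the .001/.002 suffix scheme are assumptions made for this example, not part of the answer above.

```powershell
# Hypothetical helper: the name, parameters and the ".001" suffix scheme are
# illustrative assumptions, not taken from the original answer.
function Split-FileBuffered {
    param(
        [Parameter(Mandatory)] [string] $Path,       # file to split
        [Parameter(Mandatory)] [long]   $ChunkSize,  # maximum bytes per part
        [int] $BufferSize = 1MB                      # bytes copied per read
    )
    $file   = Get-Item -LiteralPath $Path
    $buffer = New-Object byte[] $BufferSize
    $source = [System.IO.File]::OpenRead($file.FullName)
    try {
        $part = 0
        while ($source.Position -lt $source.Length) {
            $part++
            $partPath = '{0}.{1:000}' -f $file.FullName, $part
            $target = [System.IO.File]::Create($partPath)
            try {
                $remaining = $ChunkSize
                while ($remaining -gt 0) {
                    # Never read past the end of the current part.
                    $toRead = [int][Math]::Min([long]$buffer.Length, $remaining)
                    $read = $source.Read($buffer, 0, $toRead)
                    if ($read -le 0) { break }       # end of the source file
                    $target.Write($buffer, 0, $read)
                    $remaining -= $read
                }
            }
            finally { $target.Dispose() }
            $partPath                                # emit the path of the finished part
        }
    }
    finally { $source.Dispose() }
}

# Example call (commented out so that loading the sketch has no side effects):
# if ((Get-Item $name).Length -ge 100000000) { Split-FileBuffered -Path $name -ChunkSize 60000000 }
```

The parts are written next to the source file and their paths are returned, which makes it easy to pipe them into a later step.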
2) In the directory named by $1, find all regular files with a .txt extension that were modified over thirty days ago, and remove them.
MKS
find $1 -type f -name *.txt -mtime +30 -exec rm {} \;
PowerShell
$modifiedTime = (Get-Date).AddDays( -30 )
Get-ChildItem $name -Filter *.txt -Recurse |
Where-Object { $_.LastWriteTime -lt $modifiedTime } |
Remove-Item -WhatIf
Notes: Take off the -WhatIf switch to actually perform the remove operation, rather than previewing it.

Recreating a linux md5 checksum on Windows

I'm performing some pretty straightforward checksums on our linux boxes, but I now need to recreate something similar for our windows users. To give me a single checksum, I just run:
md5sum *.txt | awk '{ print $1 }' | md5sum
I'm struggling to recreate this in Windows, either with a batch file or Powershell. The closest I've got is:
Get-ChildItem $path -Filter *.txt |
Foreach-Object {
$hash = Get-FileHash -Algorithm MD5 -Path ($path + "\" + $_) | Select -ExpandProperty "Hash"
$hash = $hash.tolower() #Get-FileHash returns checksums in uppercase, linux in lower case (!)
Write-host $hash
}
This will print the same checksum results for each file to the console as the linux command, but piping that back to Get-FileHash to get a single output that matches the linux equivalent is eluding me. Writing to a file gets me stuck with carriage return differences
Streaming as a string back to Get-FileHash doesn't return the same checksum:
$String = Get-FileHash -Algorithm MD5 -Path (Get-ChildItem -path $files -Recurse) | Select -ExpandProperty "Hash"
$stringAsStream = [System.IO.MemoryStream]::new()
$writer = [System.IO.StreamWriter]::new($stringAsStream)
$writer.write($stringAsStream)
Get-FileHash -Algorithm MD5 -InputStream $stringAsStream
Am I over-engineering this? I'm sure this shouldn't be this complicated! TIA
You need to reference the .Hash property on the object returned by Get-FileHash. If you want a view similar to md5sum, you can also use Select-Object to curate this:
# Get filehashes in $path with similar output to md5sum
$fileHashes = Get-ChildItem $path -File | Get-FileHash -Algorithm MD5
# Once you have the hashes, you can reference the properties as follows
# .Algorithm is the hashing algo
# .Hash is the actual file hash
# .Path is the full path to the file
foreach( $hash in $fileHashes ){
"$($hash.Algorithm):$($hash.Hash) ($($hash.Path))"
}
For each file in $path, the above foreach loop will produce a line similar to:
MD5:B4976887F256A26B59A9D97656BF2078 (C:\Users\username\dl\installer.msi)
The algorithm, hash, and filenames will obviously differ based on your selected hashing algorithm and filesystem.
The devil is in the details:
(known already) Get-FileHash returns checksums in uppercase, while Linux md5sum returns them in lowercase (!);
The FileSystem provider's filter *.txt is not case-sensitive in PowerShell, while in Linux it depends on the nocaseglob option. If set (shopt -s nocaseglob), Bash matches filenames case-insensitively when performing filename expansion. Otherwise (shopt -u nocaseglob), filename matching is case-sensitive;
Order: Get-ChildItem output is ordered according to the Unicode collation algorithm, while in Linux the *.txt glob is expanded in the order of the LC_COLLATE category (LC_COLLATE="C.UTF-8" on my system).
In the following (partially commented) script, three # Test blocks demonstrate my debugging steps to the final solution:
Function Get-StringHash {
[OutputType([System.String])]
param(
# named or positional: a string
[Parameter(Position=0)]
[string]$InputObject
)
$stringAsStream = [System.IO.MemoryStream]::new()
$writer = [System.IO.StreamWriter]::new($stringAsStream)
$writer.write( $InputObject)
$writer.Flush()
$stringAsStream.Position = 0
Get-FileHash -Algorithm MD5 -InputStream $stringAsStream |
Select-Object -ExpandProperty Hash
$writer.Close()
$writer.Dispose()
$stringAsStream.Close()
$stringAsStream.Dispose()
}
function ConvertTo-Utf8String {
[OutputType([System.String])]
param(
# named or positional: a string
[Parameter(Position=0, Mandatory = $false)]
[string]$InputObject = ''
)
begin {
$InChars = [char[]]$InputObject
$InChLen = $InChars.Count
$AuxU_8 = [System.Collections.ArrayList]::new()
}
process {
for ($ii= 0; $ii -lt $InChLen; $ii++) {
if ( [char]::IsHighSurrogate( $InChars[$ii]) -and
( 1 + $ii) -lt $InChLen -and
[char]::IsLowSurrogate( $InChars[1 + $ii]) ) {
$s = [char]::ConvertFromUtf32(
[char]::ConvertToUtf32( $InChars[$ii], $InChars[1 + $ii]))
$ii ++
} else {
$s = $InChars[$ii]
}
[void]$AuxU_8.Add(
([System.Text.Encoding]::UTF8.GetBytes($s) |
ForEach-Object { '{0:X2}' -f $_}) -join ''
)
}
}
end { $AuxU_8 -join '' }
}
# Set variables
$hashUbuntu = '5d944e44149fece685d3eb71fb94e71b'
$hashUbuntu <# copied from 'Ubuntu 20.04 LTS' in Wsl2:
cd `wslpath -a 'D:\\bat'`
md5sum *.txt | awk '{ print $1 }' | md5sum | awk '{ print $1 }'
<##>
$LF = [char]0x0A # Line Feed (LF)
$path = 'D:\Bat' # testing directory
$filenames = 'D:\bat\md5sum_Ubuntu_awk.lst'
<# obtained from 'Ubuntu 20.04 LTS' in Wsl2:
cd `wslpath -a 'D:\\bat'`
md5sum *.txt | awk '{ print $1 }' > md5sum_Ubuntu_awk.lst
md5sum md5sum_Ubuntu_awk.lst | awk '{ print $1 }' # for reference
<##>
# Test #1: is `Get-FileHash` the same (beyond character case)?
$hashFile = Get-FileHash -Algorithm MD5 -Path $filenames |
Select-Object -ExpandProperty Hash
$hashFile.ToLower() -ceq $hashUbuntu
# Test #2: is `$stringToHash` well-defined? is `Get-StringHash` the same?
$hashArray = Get-Content $filenames -Encoding UTF8
$stringToHash = ($hashArray -join $LF) + $LF
(Get-StringHash -InputObject $stringToHash) -eq $hashUbuntu
# Test #3: another check: is `Get-StringHash` the same?
Push-Location -Path $path
$filesInBashOrder = bash.exe -c "ls -1 *.txt"
$hashArray = $filesInBashOrder |
Foreach-Object {
$hash = Get-FileHash -Algorithm MD5 -Path (
Join-Path -Path $path -ChildPath $_) |
Select-Object -ExpandProperty "Hash"
$hash.tolower()
}
$stringToHash = ($hashArray -join $LF) + $LF
(Get-StringHash -InputObject $stringToHash) -eq $hashUbuntu
Pop-Location
# Solution - ordinal order assuming `LC_COLLATE="C.UTF-8"` in Linux
Push-Location -Path $path
$hashArray = Get-ChildItem -Filter *.txt -Force -ErrorAction SilentlyContinue |
Where-Object {$_.Name -clike "*.txt"} | # only if `shopt -u nocaseglob`
Sort-Object -Property { (ConvertTo-Utf8String -InputObject $_.Name) } |
Get-FileHash -Algorithm MD5 |
Select-Object -ExpandProperty "Hash" |
Foreach-Object {
$_.ToLower()
}
$stringToHash = ($hashArray -join $LF) + $LF
(Get-StringHash -InputObject $stringToHash).ToLower() -ceq $hashUbuntu
Pop-Location
Output (tested on 278 files): .\SO\69181414.ps1
5d944e44149fece685d3eb71fb94e71b
True
True
True
True
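For the final "hash of hashes" step, a shorter route than a MemoryStream is to hash the UTF-8 bytes of the joined string directly with the .NET MD5 class. This is only a sketch: the function name Get-Md5OfString is made up for illustration, and it assumes the text is hashed as UTF-8 without a byte order mark and with LF line endings, as in the script above.

```powershell
# Hypothetical helper (the name is an assumption for this example).
# Hashes the UTF-8 bytes of a string and returns lowercase hex, like md5sum.
function Get-Md5OfString {
    param([string] $Text)
    $md5 = [System.Security.Cryptography.MD5]::Create()
    try {
        $bytes = [System.Text.Encoding]::UTF8.GetBytes($Text)
        $hash  = $md5.ComputeHash($bytes)
        # Format each byte as two lowercase hex digits and concatenate.
        -join ($hash | ForEach-Object { $_.ToString('x2') })
    }
    finally { $md5.Dispose() }
}

# Usage sketch with the variables from the script above (commented out):
# Get-Md5OfString -Text (($hashArray -join $LF) + $LF)
```

Because the bytes are built in memory, no file is written and the carriage-return differences mentioned in the question do not come into play.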

Count characters for each line

I am new to Windows PowerShell. Could you please give me some code or pointers on how to write a program that does the following for all *.txt files in a folder:
1. Count the characters in each line of the file
2. If the length of a line exceeds 1024 characters, create a subfolder within that folder and move the file there (that is how I will know which files have lines over 1024 characters)
I've tried though VB and VBA (this is more familiar to me), but I want to learn some new cool stuff!
Many thanks!
Edit: I found some part of a code that is beginning
$fileDirectory = "E:\files";
foreach($file in Get-ChildItem $fileDirectory)
{
# Processing code goes here
}
OR
$fileDirectory = "E:\files";
foreach($line in Get-ChildItem $fileDirectory)
{
if($line.length -gt 1023){
# mkdir and mv to subfolder!
}
}
If you are willing to learn, why not start here.
You can use the Get-Content command in PS to get some information of your files. http://blogs.technet.com/b/heyscriptingguy/archive/2013/07/06/powertip-counting-characters-with-powershell.aspx and Getting character count for each row in text doc
With your second edit I did see some effort so I would like to help you.
$path = "D:\temp"
$lengthToNotExceed = 1024
$longFiles = Get-ChildItem -Path $path -Filter *.txt -File |
Where-Object {(Get-Content($_.Fullname) | Measure-Object -Maximum Length | Select-Object -ExpandProperty Maximum) -ge $lengthToNotExceed}
$longFiles | ForEach-Object{
$target = "$($_.Directory)\$lengthToNotExceed\"
If(!(Test-Path $target)){New-Item $target -ItemType Directory -Force | Out-Null}
Move-Item $_.FullName -Destination $target
}
You could make this a one-liner, but it would be unnecessarily complicated. Use Measure-Object on the array returned by Get-Content, which is, more or less, a string array. In PowerShell, strings have a Length property, which is what we query.
That returns the length of the longest line in the file. We use Where-Object to keep only those files whose longest line reaches the length we are checking for.
Then we attempt to move each matching file to a subdirectory in the same location as the file. If the subfolder does not exist, we create it.
Caveats:
You need at least PowerShell 3.0 for the -File switch. In place of that you can update the Where-Object to have another clause: -not $_.PSIsContainer
This would perform poorly on files with a large number of lines.
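One way to soften the second caveat is to stop reading a file as soon as the first over-long line is found, instead of loading every line and measuring the maximum. The sketch below does that with a .NET StreamReader. The function name Test-HasLongLine and its parameters are assumptions for illustration, and it expects a full path because .NET does not follow the PowerShell current location.

```powershell
# Hypothetical helper (name and parameters are assumptions for this example).
# Returns $true as soon as one line reaches $Limit characters, else $false.
function Test-HasLongLine {
    param(
        [Parameter(Mandatory)] [string] $Path,   # full path to a text file
        [int] $Limit = 1024
    )
    $reader = [System.IO.File]::OpenText($Path)
    try {
        # ReadLine returns $null at end of file.
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($line.Length -ge $Limit) { return $true }   # stop at the first hit
        }
        return $false
    }
    finally { $reader.Dispose() }
}

# Usage sketch (commented out):
# Get-ChildItem -Path $path -Filter *.txt |
#     Where-Object { -not $_.PSIsContainer -and (Test-HasLongLine -Path $_.FullName) }
```

The comparison uses -ge to mirror the code above; switch to -gt if only lines strictly longer than the limit should count.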
Here's my comment above indented and line broken in .ps1 script form.
$long = @()
foreach ($file in gci *.txt) {
$f=0
gc $file | %{
if ($_.length -ge 1024) {
if (-not($f)) {
$f=1
$long += $file
}
}
}
}
$long | %{
$dest = @($_.DirectoryName, '\test') -join ''
[void](ni -type dir $dest -force)
mv $_ -dest (@($dest, '\', $_.Name) -join '') -force
}
I was also mentioning labels and breaks there. Rather than $f=0 and if (-not($f)), you can break out of the inner loop with break like this:
$long = @()
foreach ($file in gci *.txt) {
:inner foreach ($line in gc $file) {
if ($line.length -ge 1024) {
$long += $file
break inner
}
}
}
$long | %{
$dest = @($_.DirectoryName, '\test') -join ''
[void](ni -type dir $dest -force)
mv $_ -dest (@($dest, '\', $_.Name) -join '') -force
}
Did you happen to notice the two different ways of calling foreach? There's the verbose foreach command, and then there's command | %{} where the iterative item is represented by $_.

flatten files including foldername in filename

i'd like to flatten folder-structure and in one way include each parent directory name to filename. i've tried this, but get an error:
Missing ')' in method call
I don't quite see what the problem is
(ls -r -include *.ext) | % { mv -literal $_ $_.Name.Insert(0, [String]::Format("{0} - ", $_.Directory.Name))}
Try this:
ls . -r *.ext -name | ?{!$_.PSIsContainer} | mi -dest {$_ -replace '\\','_'} -whatif
Or if on V3:
ls . -r *.ext -name -file | mi -dest {$_ -replace '\\','_'} -whatif
Remove the -whatif to actually perform the move.
Do you want to flatten the folder structure and move all of the renamed files to the root directory? For example:
$rootPath = 'C:\TempPath'
(ls $rootPath -r -include *.ext) | %{
[string]$newFilename = $_.Name.Insert(0, [String]::Format("{0} - ", $_.Directory.Name))
#write-host $newFilename
mv -literal $_.FullName (Join-Path $rootPath $newFilename)
}
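As for the "Missing ')' in method call" error in the question: the likely cause is that the method call is written as a bare command argument, where the space after the comma ends the argument. Wrapping the expression in parentheses, or computing the name first, avoids that. The sketch below isolates the name-building step; the function name Get-FlattenedName is an assumption for illustration.

```powershell
# Hypothetical helper (the name is an assumption for this example).
# Builds "<parent folder> - <file name>" for a file, as the question asks.
function Get-FlattenedName {
    param([Parameter(Mandatory)] [System.IO.FileInfo] $File)
    '{0} - {1}' -f $File.Directory.Name, $File.Name
}

# Usage sketch (commented out; remove -WhatIf to actually move the files):
# Get-ChildItem $rootPath -Recurse -Filter *.ext |
#     Where-Object { -not $_.PSIsContainer } |
#     ForEach-Object {
#         Move-Item -LiteralPath $_.FullName -Destination (Join-Path $rootPath (Get-FlattenedName $_)) -WhatIf
#     }
```

Only the immediate parent folder is included in the new name; deeper structures with repeated folder names could still collide.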

loop through subfolders

Hi I would like to create a batch file to get all the sub-folders.Can anyone please help me on this?
This is trivial in both batch files and PowerShell:
Batch:
for /r %%x in (*.sql) do (
rem "(or whatever)"
sqlcmd "%%x"
)
PowerShell:
Get-ChildItem -Recurse -Filter *.sql |
ForEach-Object {
sqlcmd $_.FullName # or whatever
}
Here is a powershell script I use to get the size of the "School" folder within the Subfolder of each users' home folders. IE. N:\UserName\UserNameOSXProfile\School
SizeOfSchoolFolder.ps1
$schoolFolderTotalSize=0
$foundChildren = get-childitem *\*OSXProfile\School
foreach($file in $foundChildren){
$schoolFolderTotalSize = $schoolFolderTotalSize + [long](ls -r $file.FullName | measure -s Length).Sum
}
switch($schoolFolderTotalSize) {
{$_ -gt 1GB} {
'{0:0.0} GiB' -f ($_/1GB)
break
}
{$_ -gt 1MB} {
'{0:0.0} MiB' -f ($_/1MB)
break
}
{$_ -gt 1KB} {
'{0:0.0} KiB' -f ($_/1KB)
break
}
default { "$_ bytes" }
}
I pulled the number-adding from: Adding up numbers in PowerShell
The folder-size formatting is from: Get Folder Size from Windows Command Line
The *\*OSXProfile\School pattern, used to search for a specific subfolder, is from: Limit Get-ChildItem recursion depth
To replace 'xyz' with 'abc' in all the filenames:
Get-ChildItem -Recurse -Filter *.mp3 | Rename-Item -NewName { $_.Name -replace 'xyz','abc' }

How to loop through files (full path) in PowerShell

How do I do the equivalent in PowerShell? Note that I require the full path to each file.
# ksh
for f in $(find /app/foo -type f -name "*.txt" -mtime +7); do
mv ${f} ${f}.old
done
I played around with Get-ChildItem for a bit and I am sure the answer is there someplace.
I'm not sure what mtime does, but here is the code to do everything else:
gci -re -in *.txt "some\path\to\search" |
?{ -not $_.PSIsContainer } |
%{ mv $_.FullName "$($_.FullName).old" }
This seems to get me close to what I need. I was able to combine some of the information from Jared's answer with this question to figure it out.
foreach($f in $(gci -re -in hoot.txt "C:\temp")) {
mv $f.FullName "$($f.FullName).old"
}
In the interest of sharing the wealth here is my function to simulate *nix find.
function unix-find (
$path,
$name="*.*",
$mtime=0)
{
gci -recurse -include "$name" "$path" |
where-object { -not $_.PSIsContainer -and ($_.LastWriteTime -le (Get-Date).AddDays(-$mtime)) } |
foreach { $_.FullName }
}
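To tie this back to the original ksh loop, here is a minimal sketch that returns the file objects rather than path strings, so the full path stays available as .FullName for the rename. The function name Get-StaleFile and its parameters are assumptions for illustration, and the age test (last write time older than N days) is only an approximation of find's -mtime +N.

```powershell
# Hypothetical helper (name and parameters are assumptions for this example).
# Returns files under $Path matching $Filter that were last written more
# than $Days days ago.
function Get-StaleFile {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string] $Filter = '*.txt',
        [int] $Days = 7
    )
    $cutoff = (Get-Date).AddDays(-$Days)
    Get-ChildItem -Path $Path -Filter $Filter -Recurse |
        Where-Object { -not $_.PSIsContainer -and $_.LastWriteTime -lt $cutoff }
}

# Usage sketch (commented out; remove -WhatIf to actually rename):
# Get-StaleFile -Path 'C:\app\foo' |
#     ForEach-Object { Move-Item -LiteralPath $_.FullName -Destination "$($_.FullName).old" -WhatIf }
```

Keeping the objects in the pipeline means other properties, such as Length or LastWriteTime, remain available to later steps.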