How to convert multiple RTF files to TXT files - PowerShell

Looking here:
Is it possible to change an .rtf file to .txt file using some sort of batch script on Windows?
I saw that it is possible to use PowerShell to do this. A full example was given, but the link no longer works.
Can anyone tell me how to solve this? Thanks.

You can do this very easily in PowerShell with .NET by creating a System.Windows.Forms.RichTextBox control, loading the RTF file into it, then pulling the plain-text version out. This is by far the easiest and quickest way I have found to do this.
My function for doing exactly this is here: https://github.com/Asnivor/PowerShell-Misc-Functions/blob/master/translate-rtf-to-txt.ps1
To explain the basic steps:
# RichTextBox lives in System.Windows.Forms, which is not loaded by default
Add-Type -AssemblyName System.Windows.Forms
$rtfFile = [System.IO.FileInfo]"path/to/some/rtf/file"
$txtFile = "path/to/the/destination/txt/file"
# Load *.rtf file into a hidden .NET RichTextBox
$rtBox = New-Object System.Windows.Forms.RichTextBox
$rtfText = [System.IO.File]::ReadAllText($rtfFile.FullName)
$rtBox.Rtf = $rtfText
# Get plain text version
$plainText = $rtBox.Text
# Write the plain text out to the destination file
[System.IO.File]::WriteAllText($txtFile, $plainText)
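Since the question asks about multiple files, the same steps can be wrapped in a loop. The following is only a sketch: the function name Convert-RtfToTxt and the folder path are made up for illustration, and it assumes each .txt file should be written next to its .rtf source.

```powershell
# Sketch: convert every *.rtf file in a folder to a *.txt file next to it.
# Convert-RtfToTxt and the folder path are hypothetical names for illustration.
function Convert-RtfToTxt {
    param(
        [Parameter(Mandatory)] [string] $Folder
    )

    # RichTextBox lives in System.Windows.Forms (Windows only)
    Add-Type -AssemblyName System.Windows.Forms
    $rtBox = New-Object System.Windows.Forms.RichTextBox
    try {
        foreach ($rtfFile in Get-ChildItem -Path $Folder -Filter *.rtf -File) {
            # Same directory and base name, .txt extension
            $txtFile = [System.IO.Path]::ChangeExtension($rtfFile.FullName, '.txt')

            # Load the RTF markup, then read back the plain-text view
            $rtBox.Rtf = [System.IO.File]::ReadAllText($rtfFile.FullName)
            [System.IO.File]::WriteAllText($txtFile, $rtBox.Text)
        }
    }
    finally {
        $rtBox.Dispose()
    }
}

# Example call (the path is a placeholder):
# Convert-RtfToTxt -Folder 'C:\path\to\rtf\files'
```

One RichTextBox is reused for all files, because assigning to its Rtf property replaces the previous content.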

Related

How to use custom PowerShell functions?

Question about running a custom function in Powershell.
I'm on Windows 10 and I'd like to somehow print my monorepository's directory tree structure excluding node_modules. This is not supported out of the box but requires a custom function to be defined. I found one solution on StackOverflow (https://stackoverflow.com/a/43810460/9654273), which would enable using a command like:
tree -Exclude node_modules -Ascii > tree.txt
The problem is I don't know what to do with the provided source code :D The answer says "add to your $PROFILE, for instance", so I ran notepad $PROFILE in PowerShell, pasted the code snippet there, saved it and tried running the command. It didn't work because I did something wrong. According to the StackOverflow post's comments from anand_v.singh and mklement0 I was still running some other tree command, not the one I just attempted to define.
So how do I use a custom function in PowerShell? Starting point is that source code is on StackOverflow and I don't know where to paste it. Or do you know some other, easier way to print a directory tree on Windows 10 excluding node_modules?
I had the same problem with that function. The issue is the special characters in the hashtable at line 106:
$chars = @{
interior = ('├', '+')[$ndx]
last = ('└', '\')[$ndx] #'
hline = ('─', '-')[$ndx]
vline = ('│', '|')[$ndx]
space = ' '
}
I changed the special characters to ascii as follows:
$chars = @{
interior = ('+', '+')[$ndx]
last = ('\', '\')[$ndx] #'
hline = ('-', '-')[$ndx]
vline = ('|', '|')[$ndx]
space = ' '
}
The only downside is that you lose the option of using the special graphics characters (the -Ascii switch is still there, but no longer changes anything). Maybe someone could tell us how to embed them properly.
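One way to keep the graphics characters, offered as a sketch rather than a tested fix for that function, is to build them from their Unicode code points so the profile file contains only ASCII. Windows PowerShell reads a script file without a byte-order mark using the system's ANSI code page, which is a likely reason the literal characters break. Saving the profile as UTF-8 with BOM should work too. Here $ndx is set by hand; it is assumed to be 0 for graphics and 1 for ASCII, as the indexing in the original hashtable suggests.

```powershell
# Sketch: build the box-drawing characters from code points so that the
# script file itself is pure ASCII. $ndx is an assumption: 0 selects the
# graphics characters, 1 selects the ASCII fallbacks.
$ndx = 0

$chars = @{
    interior = (([string][char]0x251C), '+')[$ndx]   # U+251C, vertical and right
    last     = (([string][char]0x2514), '\')[$ndx]   # U+2514, up and right
    hline    = (([string][char]0x2500), '-')[$ndx]   # U+2500, horizontal
    vline    = (([string][char]0x2502), '|')[$ndx]   # U+2502, vertical
    space    = ' '
}
```

With this form the -Ascii switch keeps its meaning, because both alternatives are still present in each pair.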

Determine whether files are empty and list them in separate files

The goal of my code is to look into a certain folder and write the names of all the files that aren't empty to one new text file, and the names of all the empty files (no text) to another file. My current code can only create a single text file listing the names of all the files, regardless of their content. I want to know how to set up an if statement based on the content of each file in the array.
function ListFile
dirName = '';
files = dir(fullfile(dirName,'*.txt'));
files = {files.name};
[fid,msg] = fopen(sprintf('output.txt'),'w+t');
assert(fid>=0,msg)
fprintf(fid,'%s\n',files{:});
fclose(fid);
EDIT: The linked solution in Stewie Griffin's comment is way better. Use this!
A simple approach would be to iterate over all files, open them, and check their content. Caveat: fileread loads each file completely, so this approach can be memory intensive for large files.
Code for that could look like this:
function ListFile
dirName = '';
files = dir(fullfile(dirName, '*.txt'));
files = {files.name};
fidEmpty = fopen(sprintf('output_empty_files.txt'), 'w+t');
fidNonempty = fopen(sprintf('output_nonempty_files.txt'), 'w+t');
for iFile = 1:numel(files)
content = fileread(fullfile(dirName, files{iFile}))  % no semicolon: prints the content for debugging
if (isempty(content))
fprintf(fidEmpty, '%s\n', files{iFile});
else
fprintf(fidNonempty, '%s\n', files{iFile});
end
end
fclose(fidEmpty);
fclose(fidNonempty);
I have two non-empty files nonempty1.txt and nonempty2.txt as well as two empty files empty1.txt and empty2.txt. Running this code, I get the following outputs.
Debugging output from fileread:
content =
content =
content = Test
content = Another test
Content of output_empty_files.txt:
empty1.txt
empty2.txt
Content of output_nonempty_files.txt:
nonempty1.txt
nonempty2.txt
MATLAB isn't really the optimal tool for this task (although it is capable). To generate the files you're looking for, a command-line tool would be much more efficient.
For example, using GNU find you could do
find . -type f -not -empty -ls > notemptyfiles.txt
find . -type f -empty -ls > emptyfiles.txt
to create the text files you desire. Here's a link for doing something comparable using the Windows command line. You could also call these commands from within MATLAB using the system function. This would be much faster than iterating over the files from within MATLAB.
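On Windows, a comparable result can be sketched in PowerShell by checking each file's size instead of reading it. This is an illustration only: the function name Split-FilesByEmptiness and its parameters are hypothetical, and "empty" is taken to mean a size of 0 bytes.

```powershell
# Sketch: list empty and non-empty *.txt files in two separate output files.
# Split-FilesByEmptiness and its parameter names are hypothetical.
function Split-FilesByEmptiness {
    param(
        [string] $Folder = '.',
        [string] $EmptyList = 'emptyfiles.txt',
        [string] $NonEmptyList = 'notemptyfiles.txt'
    )

    # Length is the size in bytes, so no file has to be opened or read
    $files = @(Get-ChildItem -Path $Folder -Filter *.txt -File)

    $empty    = @($files | Where-Object { $_.Length -eq 0 } | ForEach-Object { $_.Name })
    $nonEmpty = @($files | Where-Object { $_.Length -gt 0 } | ForEach-Object { $_.Name })

    # One file name per line in each list
    Set-Content -Path $EmptyList -Value $empty
    Set-Content -Path $NonEmptyList -Value $nonEmpty
}
```

If the two list files are written into the folder being scanned, a later run will pick them up as well, so it may be better to point $EmptyList and $NonEmptyList at a different location.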

Reading part of a file into a Stream in Powershell

I have some files which are 'offsetted' Zip files, in that they have 4 extra bytes at the beginning which must be ignored when extracting them.
I've been using ReadAllBytes/WriteAllBytes (with an offset of 4). That works, but obviously I have to read/write/read the file, which is slow.
I'd prefer to use System.IO.Compression.ZipArchive to read from a Stream loaded from the file (sans the first 4 bytes), but I cannot figure out the steps required to do that.
I tried 'Seek', but ZipArchive ignores the position.
I cannot seem to get byte arrays to pass into System.IO.Compression at all...
Ideas?
Finally!
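After trying all manner of hoop-jumping, it seems the simplest answer was the right one:
# Windows PowerShell does not load the compression assembly by default
Add-Type -AssemblyName System.IO.Compression
$bytes = [System.IO.File]::ReadAllBytes("file.zip4")
# Expose the array from index 4 onwards as a stream that starts at position 0
$ms = New-Object System.IO.MemoryStream -ArgumentList $bytes,4,($bytes.Length-4)
$arch = New-Object System.IO.Compression.ZipArchive($ms)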
I can then process $arch.Entries and extract things just fine - reading the file once and processing it instead of reading it, writing 'most' of it back to disc, reading that file back again!!
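Processing $arch.Entries could look roughly like the sketch below. The MemoryStream constructor that takes an index and a count exposes only that slice of the array as a stream beginning at position 0, which is why ZipArchive accepts it while a plain Seek did not help. The function name Expand-OffsetZip, the default offset, and the paths are hypothetical. Absolute paths are assumed, because .NET file methods do not follow PowerShell's current location. Entry names are trusted as-is, which is not safe for archives from unknown sources.

```powershell
# Sketch: extract every entry of a ZIP archive that has extra leading bytes.
# Expand-OffsetZip and its parameters are hypothetical; absolute paths assumed.
Add-Type -AssemblyName System.IO.Compression

function Expand-OffsetZip {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Destination,
        [int] $Offset = 4
    )

    $bytes = [System.IO.File]::ReadAllBytes($Path)
    # The stream starts at $Offset, so ZipArchive sees an ordinary ZIP file
    $ms = New-Object System.IO.MemoryStream -ArgumentList $bytes, $Offset, ($bytes.Length - $Offset)
    $arch = New-Object System.IO.Compression.ZipArchive -ArgumentList $ms
    try {
        foreach ($entry in $arch.Entries) {
            if ($entry.Name -eq '') { continue }   # directory entry, nothing to copy

            $target = Join-Path $Destination $entry.FullName
            New-Item -ItemType Directory -Path (Split-Path $target -Parent) -Force | Out-Null

            # Copy the decompressed entry stream to the target file
            $in  = $entry.Open()
            $out = [System.IO.File]::Create($target)
            try { $in.CopyTo($out) } finally { $out.Dispose(); $in.Dispose() }
        }
    }
    finally {
        $arch.Dispose()
        $ms.Dispose()
    }
}

# Example call (paths are placeholders):
# Expand-OffsetZip -Path 'C:\data\file.zip4' -Destination 'C:\data\out'
```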

Edit a Text Form Field in MS Word with Powershell

I'm trying to write a PowerShell script that edits Text Form Fields in Microsoft Office Word 2007. It should find a Form Field via the bookmark I configured beforehand and write a text into it. The default text I put into the field for test purposes is "Something".
This is what I have so far:
$document = 'D:\Powershell\Test.docx'
$Word = New-Object -Com Word.Application
$Word.Visible = $True
$doc = $word.Documents.Open($document)
$text = "Hello"
$bookmark = "server1"
$doc.Bookmarks.Item($bookmark).Range.Text.Replace("Something", $text)
It appears to work in the console, since the output is:
FORMTEXT Hello
However, Word still displays the string I inserted manually before.
When I type in:
$doc.Bookmarks.Item($bookmark).Range.Text
The output is:
FORMTEXT Something
I already tried:
$Word.ActiveDocument.Reload()
$Word.ActiveDocument.Fields.Update()
$doc.PrintPreview()
$doc.ClosePrintPreview()
$doc.Bookmarks.Item($bookmark).Range.Fields.Update()
But nothing seems to work.
Does anybody have an idea how to write something into that Text Form Field permanently?
Alternatively, if that's easier, I could use a (rich) Text Content Control (which seems to be newer). Those don't use a bookmark but a tag and title.
Thanks in advance for your help.
PS: It doesn't work with MS Word 2016 either.
When you have a legacy text form field, the bookmark is really there to identify the field. If you try to replace the text of the bookmark in VBA (say), you'll probably get Error 6028 - "The range cannot be deleted".
I don't know PowerShell well enough to do this without checking, but the equivalent VBA would be
doc.FormFields($bookmark).Result = "Something"
so I would guess the PowerShell is something like
$doc.FormFields.Item($bookmark).Result = "Something"
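Put together with the script from the question, that guess might look like the following untested sketch. The console output in the question showed "Hello" only because String.Replace returns a new string without touching the document, so the value has to be assigned to the field's Result property instead. The function name Set-WordFormField is hypothetical, and saving and closing the document at the end is an assumption about what "permanently" should mean.

```powershell
# Sketch: set a legacy text form field by its bookmark name and save.
# Set-WordFormField is a hypothetical helper; it needs Word on Windows.
function Set-WordFormField {
    param(
        [Parameter(Mandatory)] [string] $DocumentPath,
        [Parameter(Mandatory)] [string] $Bookmark,
        [Parameter(Mandatory)] [string] $Text
    )

    $word = New-Object -ComObject Word.Application
    try {
        $doc = $word.Documents.Open($DocumentPath)
        # Assign to Result; calling Replace on Range.Text only returns a new string
        $doc.FormFields.Item($Bookmark).Result = $Text
        $doc.Save()
        $doc.Close()
    }
    finally {
        $word.Quit()
    }
}

# Example call with the values from the question:
# Set-WordFormField -DocumentPath 'D:\Powershell\Test.docx' -Bookmark 'server1' -Text 'Hello'
```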

Loop through files with specific extension

I need to open many files in a loop, with the same extension.
Example file names are: c1_p1_t_r.mat,c1_p3_t_r.mat,c1_p6_t_r.mat,c1_p7_t_r.mat,c1_p10_t_r.mat,etc.
So basically, the first and last part of the file names are the same, but something in the middle changes.
I tried with:
Ext = 'c1_*t_r*.mat';
files = dir(Ext);
but it doesn't work. Any suggestion would be greatly appreciated.
Looking at the file names you shared, you should use c1*t_r.mat rather than c1*t_r*.mat.
Use files = dir('*.Ext'); you need the apostrophes to pass the pattern as a string, and the asterisk is the wildcard for file names. I think passing multiple asterisks here is the problem. Since the names are so similar, you could instead build each file name as a full string:
for ii = 1:NumberOfFiles
filename = sprintf('c1_p%d_t_r.mat',ii);
%//load file with created name
end