MSWord macro: search and highlight formatted text patterns - ms-word

I'm trying to write a MSWord macro that will find, and then highlight (in yellow), certain kinds of text strings in a MSWord file.
For example:
1) An italicized comma, followed by whitespace, and then a non-italicized text. Thus, for example:
The second comma in this sentence, which is italicized, should be highlighted by the desired macro. But the comma in this sentence should not be highlighted, because the entire sentence is in italics.
2) A bolded character (of any kind, even whitespace), both preceded and followed by non-bolded characters. Thus, for example:
This sentence ends in a bolded punctuation mark. That first period should be highlighted.
I know that first period might look normal, but it's not. It's bold.
3) Any word that is in SmallCaps, and is >4 letters long, but is not capitalized. I don't know how to do smallcaps in markdown... but imagine for a moment that the following text is in smallcaps in MSWord:
Imagine All of This Is in Small Caps. . . the Word "under" Should Be Highlighted Because It Is More Than Four Characters Long but is not Capitalized
Does anyone know whether this is possible? I know it's quite easy to find text-patterns using regular expressions, but adding changes in formatting to those patterns seems to be tricky.

run cmd,
cscript //Nologo regexp02.vbs
regexp02.vbs:
Dim objRegExp : Set objRegExp = CreateObject("VBScript.RegExp")
objRegExp.Global = True
Dim input
input="Imagine All of This Is in Small Caps. . . the Word under Should Be Highlighted Because It Is More Than Four Characters Long but is not Capitalized"
WScript.Echo input
WScript.Echo
Dim Pattern1 : Pattern1 = "\b[a-z]{5,}\s"
WScript.Echo "Pattern1 : " & Pattern1
WScript.Echo
objRegExp.Pattern = Pattern1
Set objMatches = objRegExp.Execute(input)
For i=0 To objMatches.Count-1
Set objMatch = objMatches.Item(i)
WScript.Echo objMatch.Value
Next
WScript.Echo
Dim Pattern2 : Pattern2 = "\b[A-Z]([a-z]{4,})\s"
WScript.Echo "Pattern2 : " & Pattern2
WScript.Echo
objRegExp.Pattern = Pattern2
Set objMatches = objRegExp.Execute(input)
For i=0 To objMatches.Count-1
Set objMatch = objMatches.Item(i)
WScript.Echo objMatch.Value
WScript.Echo Left(objMatch.Value, 1)
'TODO test bold sumbol Left(objMatch.Value, 1)
'
' TODO Highlight Code
'
Next
Output:
Imagine All of This Is in Small Caps. . . the Word under Should Be Highlighted Because It Is More Than Four Characters Long but is not Capitalized
Pattern1 : \b[a-z]{5,}\s
under
Pattern2 : \b[A-Z]([a-z]{4,})\s
Imagine
I
Small
S
Should
S
Highlighted
H
Because
B
Characters
C
Regex at VBA:
Open reference
Select COM-server Microsoft VBScript Regular Expressions 5.5
VBA code:
Dim objRegExp As New VBScript_RegExp_55.RegExp
objRegExp.IgnoreCase = False
objRegExp.Global = True
objRegExp.Pattern = Pattern1
Record macro
Press Ctrl+F, open search dialog
select font property
select font style
Press Find Next
Stop macro record, open VBA editor
Edit macro SearchItalic
Run macro SearchItalic
Search italic text:
Sub SearchItalic()
Selection.Find.ClearFormatting
Selection.Find.Font.Italic = True
With Selection.Find
.Text = ""
.Forward = True
.Wrap = wdFindContinue
End With
Selection.Find.Execute
End Sub

Related

Excel VBA to find specific character in bold and replace with another character in bold

thanks for your help. I need to find any occurrence of the charcter "&" in bold text format and replace it with a comma in bold text format. I can't do find/replace because I only want to change the bold occurrences of the character. I thought I was getting close with the code below but I get a runtime error.
Sub ReplaceWord()
Worksheets("Sheet1").Activate
With ActiveDocument.Content.Find
.ClearFormatting
.Font.Bold = True
With .Replacement
.ClearFormatting
.Font.Bold = True
End With
.Execute findText:="&", ReplaceWith:=",", _
Format:=True, Replace:=wdReplaceAll
End With
End Sub

Finding text AND fields with variable content in Word

I need to find and delete every occurrence of the following pattern in a Word 2010 document:
RPDIS→ text {INCLUDEPICTURE c:\xxx\xxx.png" \*MERGEFORMAT} text ←RPDIS
Where:
RPDIS→ and ←RPDIS are start and end delimiters
Between the start and end delimiters there can be just text or text and fields with variable content
The * wildcard in the Word Find and Replace dialog box will find the pattern if it contains text only but it will ignore patterns where text is combined with fields. And ^19 will find the field but not the rest of the pattern until the end delimiter.
Can anyone help, please?
Here's a VBA solution. It wildcard searches for RPDIS→*←RPDIS. If the found text contains ^19 (assuming field codes visible; if objects are visible instead of field codes, then the appropriate test is text contains ^01), the found text is deleted. Note that this DOES NOT care about the type of embedded field --- it will delete ANY AND ALL embedded fields that occur between RPDIS→ and ←RPDIS, so use at your own risk. Also, the code has ChrW(8594) and ChrW(8592) to match right-arrow and left-arrow respectively. You may need to change that if your arrows are encoded differently.
Sub test()
Dim wdDoc As Word.Document
Dim r As Word.Range
Dim s As String
' Const c As Integer = 19 ' Works when field codes are visible
Const c As Integer = 1 ' Works when objects are visible
Set wdDoc = ActiveDocument
Set r = wdDoc.Content
With r.Find
.Text = "RPDIS" & ChrW(8594) & "*" & ChrW(8592) & "RPDIS"
.MatchWildcards = True
While .Execute
s = r.Text
If InStr(1, s, chr(c), vbTextCompare) > 0 Then
Debug.Print "Delete: " & s
' r.Delete ' This line commented out for testing; remove comments to actively delete
Else
Debug.Print "Keep: " & s
End If
Wend
End With
End Sub
Hope that helps.

How to count the visible number of lines of text in a text box on an MS Access form

OK, here's what I am trying to achieve. I have an MS Access 2016 DB with a form on it - one of the fields is a text field (max 255 chars), that users can enter "notes", by date.
The form is a continuous form, and there are a LOT of notes. And as most notes are only a single sentence, not the full 255 chars, to save screen space, the text box is sized to only allow show two lines of text (users can double click on the note to see the full text in the rare instances that the text is up to 255 chars).
The problem with this approach is that it is not always clear if a note goes beyond the two lines.
So I am trying to find a way to tell how many lines of text the note uses in the text box, and then I'll highlight the text box if this is the case.
Note what I am talking about here is text wrapping within a text box, not (necessarily) text with line breaks (although there may be line breaks also). Given the wrapping changes dependent upon the text (eg long words will "wrap early" to a new line), so using a simple char count doesn't work, even with a monospace font.
I have searched a lot online and found nothing, except a ref to a possible solution here:
http://www.lebans.com/textwidth-height.htm
But the download is an old Access file type I can no longer open.
Does anyone have any ideas (except for a form redesign - which is my last option hopefully!)
To count the number of lines in a string, or text box, you can use this expression:
UBound(Split(str, vbCrLf))
So
UBound(Split([textBoxName], vbCrLf))
OK, I have come up with a "solution" to this - it's neither neat nor fast, but it appears to work in my situation. I have posted the VBA code for anyone for whom it might interest.
This function is then used on a continuous form's textbox conditional highlighting, so I can highlight those instances where the text has wrapped beyond "n" lines (in my case, two lines)
FYI it's only partially tested, with no error handling!
' Returns TRUE if the text in a textbox wraps/breaks beyond the number of visible lines in the text box (before scrolling)
' THIS ONLY WORKS FOR MONOSPACE FONTS IN A TEXTBOX WHERE WE KNOW THE WidthInMonospaceCharacters
' WidthInMonospaceCharacters = number of MONOSPACE characters to EXACTLY fill one line in your text box (needs to be counted manually
' VisibleLinesInTextBox = number of lines your text box shows on screen (without scrolling)
Function UnseenLinesInTextBox(YourText As String, WidthInMonospaceCharacters As Long, VisibleLinesInTextBox As Long) As Boolean
Dim LineBreakTexts() As String
Dim CleanText As String
Dim LineCount As Long
Dim LineBreaks As Long
Dim i As Long
' Doesn't matter if we can't see invisible end spaces/line breaks, so lose them
' NB advise cleaning text whenver data updated then no need to run this line
CleanText = ClearEndSpacesAndLineBreaks(YourText)
' Check for any line breaks
LineBreakTexts = Split(CleanText, vbCrLf)
' Too many line breaks means we can't be all in the textbox, so report and GTFOOD
LineBreaks = UBound(LineBreakTexts)
If LineBreaks >= VisibleLinesInTextBox Then
UnseenLinesInTextBox = True
GoTo CleanExit
End If
' No line breaks, and text too short to wrap, so exit
If LineBreaks = 0 And Len(CleanText) <= WidthInMonospaceCharacters Then GoTo CleanExit
' Loop thorough the line break text, and check word wrapping for each
For i = 0 To LineBreaks
LineCount = LineCount + CountWrappedLines(LineBreakTexts(i), WidthInMonospaceCharacters, VisibleLinesInTextBox)
If LineCount > VisibleLinesInTextBox Then
UnseenLinesInTextBox = True
GoTo CleanExit
End If
Next i
CleanExit:
Erase LineBreakTexts
End Function
' Add BugOutLineCount if we are using this simply to see if we are exceeding X number of lines in a textbox
' Put this number of lines here (eg if we have a two line text box, enter 2)
Function CountWrappedLines(YourText As String, WidthInMonospaceCharacters As Long, Optional BugOutLineCount As Long) As Long
Dim SpaceBreakTexts() As String
Dim LineCount As Long, RollingCount As Long, SpaceBreaks As Long, i As Long
Dim WidthAdjust As Long
Dim CheckBugOut As Boolean
Dim tmpLng1 As Long, tmpLng2 As Long
If BugOutLineCount > 0 Then CheckBugOut = True
' Check for space breaks
SpaceBreakTexts = Split(YourText, " ")
SpaceBreaks = UBound(SpaceBreakTexts)
If SpaceBreaks = 0 Then
' No spaces, so text will wrap simply based on the number of characters per line
CountWrappedLines = NoSpacesWrap(YourText, WidthInMonospaceCharacters)
GoTo CleanExit
End If
' Need to count the wrapped line breaks manually
' We must start with at least one line!
LineCount = 1
For i = 0 To SpaceBreaks
tmpLng1 = Len(SpaceBreakTexts(i))
If i = 0 Then
' Do not count spaces in the first word...
RollingCount = RollingCount + tmpLng1
Else
' ... but add spaces to the count for the next texts
RollingCount = 1 + RollingCount + tmpLng1
End If
' Need this adjustment as wrapping works slightly differently between mid and
' end of text
If i = SpaceBreaks Then
WidthAdjust = WidthInMonospaceCharacters
Else
WidthAdjust = WidthInMonospaceCharacters - 1
End If
' Check when we get a wrapped line
If RollingCount > WidthAdjust Then
' Check the the length of the word itself doesn't warp over more than one line
If tmpLng1 > WidthInMonospaceCharacters Then
tmpLng2 = NoSpacesWrap(SpaceBreakTexts(i), WidthInMonospaceCharacters)
If i <> 0 Then
LineCount = LineCount + tmpLng2
Else
LineCount = tmpLng2
End If
' As we have wrapped, then we already have a word on the next line to count in the rolling count
RollingCount = tmpLng1 - ((tmpLng2 - 1) * WidthInMonospaceCharacters)
Else
' New line reached
LineCount = LineCount + 1
' As we have wrapped, then we already have a word on the next line to count in the rolling count
RollingCount = Len(SpaceBreakTexts(i))
End If
End If
If CheckBugOut Then If LineCount > BugOutLineCount Then Exit For
Next i
CountWrappedLines = LineCount
CleanExit:
Erase SpaceBreakTexts
End Function
' Work out how many lines text will wrap if it has NO spaces
Function NoSpacesWrap(YourText As String, WidthInMonospaceCharacters) As Long
Dim WordLines As Double
Dim MyInt As Integer
WordLines = (Len(YourText) / WidthInMonospaceCharacters)
MyInt = Int(WordLines)
' Line(s) are exact width we are looking at
If WordLines - MyInt = 0 Then
NoSpacesWrap = MyInt
Else
NoSpacesWrap = MyInt + 1
End If
End Function
Function ClearEndSpacesAndLineBreaks(YourText As String) As String
Dim str As String
Dim CurrentLength As Long
str = YourText
' Need to loop this in case we have a string of line breaks and spaces invisibly at end of text
Do
CurrentLength = Len(str)
' Clear end spaces
str = RTrim(str)
' Clear end line break(s) whihc are TWO characters long
Do
If Right(str, 2) <> vbCrLf Then Exit Do
str = Left(str, Len(str) - 2)
Loop
If Len(str) = CurrentLength Then Exit Do
Loop
ClearEndSpacesAndLineBreaks = str
End Function
Do please provide any feedback and comments!

Applescript to resort piles of numbers

I'm trying to resort a bunch of numbers with Applescript. I'm very new to the language and I thought I'd ask you for help.
I have a group of numbers which looks like this in my TextEdit file:
v 0.186472 0.578063 1.566364
v -0.186472 0.578063 1.566364
v 0.335649 0.578063 1.771483
What i need is a script that resorts these numbers, making it appear like this:
(0.186472, 0.578063, 1.566364),
(-0.186472, 0.578063, 1.566364),
(0.335649, 0.578063, 1.771483),
So after each number, there has to be a comma, and always the three numbers on one line have to be put into brackets (). finally there has to be another comma after every bracketed group of three and the v before every line has to be deleted.
I've only so far managed to get rid of every "v" using:
set stringToFind to "v"
set stringToReplace to ""
But now im stuck and I'm hoping for help.
To find and replace strings in AppleScript the native way is using text item delimiters. There are a fixed number of values separated by spaces (or tabs) on each line, using text item delimiters, text itemsand string concatenation we can solve your problem.
I've added an addition linefeed in front and at the back of the string to show that lines that doesn't contain 4 words are ignored.
set theString to "
v 0.186472 0.578063 1.566364
v -0.186472 0.578063 1.566364
v 0.335649 0.578063 1.771483
"
set theLines to paragraphs of theString
set oldTIDs to AppleScript's text item delimiters
repeat with i from 1 to count theLines
set AppleScript's text item delimiters to {space, tab}
if (count of text items of item i of theLines) = 4 then
set theNumbers to text items 2 thru -1 of item i of theLines
set AppleScript's text item delimiters to ", "
set item i of theLines to "(" & (theNumbers as string) & "),"
else
set item i of theLines to missing value
end if
end repeat
set theLines to text of theLines
set AppleScript's text item delimiters to linefeed
set newString to theLines as string
set AppleScript's text item delimiters to oldTIDs
return newString

EDIFACT macro (readable message structure)

I´m working within the EDI area and would like some help with a EDIFACT macro to make the EDIFACT files more readable.
The message looks like this:
data'data'data'data'
I would like to have the macro converting the structure to:
data'
data'
data'
data'
Pls let me know how to do this.
Thanks in advance!
BR
Jonas
If you merely want to view the files in a more readable format, try downloading the Softshare EDI Notepad. It's a fairly good tool just for that purpose, it supports X12, EDIFACT and TRADACOMS standards, and it's free.
Replacing in VIM (assuming that the standard EDIFACT separators/escape characters for UNOA character set are in use):
:s/\([^?]'\)\(.\)/\1\r\2/g
Breaking down the regex:
\([^?]'\) - search for ' which occurs after any character except ? (the standard escape character) and capture these two characters as the first atom. These are the last two characters of each segment.
\(.\) - Capture any single character following the segment terminator (ie. don't match if the segment terminator is already on the end of a line)
Then replace all matches on this line with a new line between the segment terminator and the beginning of the next segment.
Otherwise you could end up with this:
...
FTX+AAR+++FORWARDING?: Freight under Vendor?'
s care.'
NAD+BY+9312345123452'
CTA+PD+0001:Terence Trent D?'
Arby'
...
instead of this:
...
FTX+AAR+++FORWARDING?: Freight under Vendor?'s care .'
NAD+BY+9312345123452'
CTA+PD+0001:Terence Trent D?'Arby'
...
Is this what you are looking for?
Option Explicit
Dim stmOutput: Set stmOutput = CreateObject("ADODB.Stream")
stmOutput.Open
stmOutput.Type = 2 'adTypeText
stmOutput.Charset = "us-ascii"
Dim stm: Set stm = CreateObject("ADODB.Stream")
stm.Type = 1 'adTypeBinary
stm.Open
stm.LoadFromFile "EDIFACT.txt"
stm.Position = 0
stm.Type = 2 'adTypeText
stm.Charset = "us-ascii"
Dim c: c = ""
Do Until stm.EOS
c = stm.ReadText(1)
Select Case c
Case Chr(39)
stmOutput.WriteText c & vbCrLf
Case Else
stmOutput.WriteText c
End Select
Loop
stm.Close
Set stm = Nothing
stmOutput.SaveToFile "EDIFACT.with-CRLF.txt"
stmOutput.Close
Set stmOutput = Nothing
WScript.Echo "Done."