How to save web page content to a text file - dom

I use an automation script that tests a browser-based application. I'd like to save the visible text of each page I load as a text file. This needs to work for the current open browser window. I've come across some solutions that use InternetExplorer.Application but this won't work for me as it has to be the current open page.
Ideally, I'd like to achieve this using VBScript.
Any ideas how to do this?

You can attach to an already running IE instance like this:
Set app = CreateObject("Shell.Application")
For Each window In app.Windows()
    If InStr(1, window.FullName, "iexplore", vbTextCompare) > 0 Then
        Set ie = window
        Exit For
    End If
Next
Then save the document body text like this:
Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.OpenTextFile("output.txt", 2, True)
f.Write ie.document.body.innerText
f.Close
If the page contains non-ASCII characters you may need to create the output file with Unicode encoding:
Set f = fso.OpenTextFile("output.txt", 2, True, -1)
or save it as UTF-8:
Set stream = CreateObject("ADODB.Stream")
stream.Open
stream.Type = 2 'text
stream.Position = 0
stream.Charset = "utf-8"
stream.WriteText ie.document.body.innerText
stream.SaveToFile "output.txt", 2
stream.Close
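To see what the choice of encoding changes on disk, here is a small sketch in Python (used only for illustration; `save_text` and `load_text` are hypothetical helpers, not part of the answer above). It writes the same non-ASCII string once as UTF-16 and once as UTF-8, then reads each file back with the matching encoding.

```python
import os
import tempfile


def save_text(path, text, encoding):
    """Write text to path with the given encoding, overwriting any existing file."""
    with open(path, "w", encoding=encoding, newline="") as handle:
        handle.write(text)


def load_text(path, encoding):
    """Read the whole file back using the given encoding."""
    with open(path, "r", encoding=encoding, newline="") as handle:
        return handle.read()


if __name__ == "__main__":
    sample = "Grüße, 世界"
    folder = tempfile.mkdtemp()
    for encoding in ("utf-16", "utf-8"):
        path = os.path.join(folder, "output-" + encoding + ".txt")
        save_text(path, sample, encoding)
        # Print the encoding, the file size in bytes, and whether the round trip succeeded.
        print(encoding, os.path.getsize(path), load_text(path, encoding) == sample)
```

Whichever encoding is used for writing, the program that later reads the file has to be told the same one, otherwise the non-ASCII characters come back garbled.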
Edit: Something like this may help to get rid of script code in the document body:
Set re = New RegExp
re.Pattern = "<script[\s\S]*?</script>"
re.IgnoreCase = True
re.Global = True
ie.document.body.innerHtml = re.Replace(ie.document.body.innerHtml, "")
WScript.Echo ie.document.body.innerText
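The pattern itself can be tried outside the browser. The following is a minimal sketch in Python (used only for illustration; `strip_scripts` is a hypothetical helper name) that applies the same lazy, case-insensitive pattern to an HTML string.

```python
import re

# Same pattern as the VBScript RegExp above: a lazy match that can span
# newlines, compiled case-insensitively so <SCRIPT> is matched as well.
SCRIPT_BLOCK = re.compile(r"<script[\s\S]*?</script>", re.IGNORECASE)


def strip_scripts(html):
    """Return html with every <script>...</script> block removed."""
    return SCRIPT_BLOCK.sub("", html)


if __name__ == "__main__":
    page = "<p>Hello</p><SCRIPT type='text/javascript'>\nalert(1);\n</script><p>World</p>"
    print(strip_scripts(page))
```

The lazy quantifier `*?` matters here: a greedy `*` would delete everything between the first opening tag and the last closing tag, including any visible text in between.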

Related

Capture Macro in Catia 5

I need to write a macro in Catia 5. My aim is to convert CGM files to PNG with a chosen background color and resolution. Manually, I do this via Capture -> Image -> Options (setting the resolution and background color) -> Save As.
I need to do the same from a macro.
I can open the Capture window with CATIA.StartCommand "Capture"
but cannot proceed any further. How can I proceed?
Thanks in advance.
How can we use the commands listed in the Object Browser in a macro? I am writing them directly, but it does not work.
Unfortunately, the Capture command does not seem to be available through the macro API. I've successfully used this workaround, however:
Sub CaptureViewport(strFileName As String, Optional intWidth As Integer = 1024, Optional intHeight As Integer = 1024)
    Dim objWindow As SpecsAndGeomWindow
    Dim objViewer As Variant ' Viewer3D
    Dim objCamera As Camera3D
    Dim objViewpoint As Variant ' Viewpoint3D
    Dim arrOldBackgroundColor(2) As Variant
    Dim intOldRenderingMode As CatRenderingMode
    Dim intOldLayout As CatSpecsAndGeomWindowLayout

    Set objWindow = CATIA.ActiveWindow
    Set objCamera = CATIA.ActiveDocument.Cameras.Item(1)
    Set objViewer = objWindow.ActiveViewer
    Set objViewpoint = objViewer.Viewpoint3D

    objViewer.GetBackgroundColor arrOldBackgroundColor
    intOldRenderingMode = objViewer.RenderingMode
    intOldLayout = objWindow.Layout
    ' This might be extended to record the old window dimensions as well

    objViewer.FullScreen = False
    objViewer.PutBackgroundColor Array(1, 1, 1) ' White
    objViewer.RenderingMode = catRenderShadingWithEdges
    objWindow.Layout = catWindowGeomOnly
    objWindow.Width = intWidth
    objWindow.Height = intHeight

    objViewpoint.PutSightDirection Array(-1, -1, -1) ' Isometric
    objViewpoint.PutUpDirection Array(0, 0, 1)
    objViewpoint.ProjectionMode = catProjectionCylindric ' Parallel projection
    objViewer.Reframe

    ' Without this, the picture is not always sized correctly
    CATIA.RefreshDisplay = True
    objViewer.Update
    objViewer.CaptureToFile catCaptureFormatBMP, strFileName
    CATIA.RefreshDisplay = False

    objViewer.PutBackgroundColor arrOldBackgroundColor
    objViewer.RenderingMode = intOldRenderingMode
    objWindow.Layout = intOldLayout
    ' This might be extended to restore the old window dimensions as well
End Sub
It works by temporarily changing the background color (along with other settings, such as specification tree visibility, rendering mode and camera settings) and then calling the CaptureToFile method. Changing the window size also changes the dimensions of the captured image. Unfortunately, CaptureToFile cannot capture to PNG (even though the interactive Capture tool can), so this version captures to BMP instead. The JPEG mode compresses the picture beyond reason and is unusable. If the compass is enabled in the interactive session, it will be visible in the pictures captured with this macro.
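For a background color other than white, the three components passed to PutBackgroundColor have to be worked out first. The sketch below, in Python for illustration only, assumes (as `Array(1, 1, 1)` being white suggests) that each component is a fraction between 0 and 1; `rgb255_to_unit` is a hypothetical helper name.

```python
def rgb255_to_unit(red, green, blue):
    """Convert 0-255 channel values to fractions in the 0.0-1.0 range."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("channel out of range: %r" % (value,))
    return (red / 255, green / 255, blue / 255)


if __name__ == "__main__":
    # A light grey given in the usual 0-255 notation.
    print(rgb255_to_unit(204, 204, 204))
```

The three resulting numbers would take the place of `1, 1, 1` in the macro's `Array(...)` call.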

Using powershell to change an image in MS office publisher

I am creating a PowerShell script that will auto-generate Publisher files for wristbands. Each wristband carries a QR code and a few other details to personally identify the wearer. I currently have a template file set up, and a script that copies it, renames it, and edits some of the text on the page.
What I need is for the script to change the placeholder image in the template to a QR code image. The QR image will only ever be one of a fixed set of 1800 images, all of which have been generated and named to match the names used in PowerShell.
Has anyone changed an image in MS Publisher using PowerShell before? Below is the code I currently have.
$CurrentMember = "M001S001"
$CurrentDocument = "C:\Users\Rob\Documents\DistrictCamp2017\GeneratedFiles\" + $CurrentMember + ".pub"
Copy-Item "C:\Users\Rob\Documents\DistrictCamp2017\TemplateWristband.pub" "C:\Users\Rob\Documents\DistrictCamp2017\GeneratedFiles"
Rename-Item "C:\Users\Rob\Documents\DistrictCamp2017\GeneratedFiles\TemplateWristband.pub" "$CurrentMember.pub"
Add-Type -AssemblyName Microsoft.Office.Interop.Publisher
$Publisher = New-Object Microsoft.Office.Interop.Publisher.ApplicationClass
$OpenDoc = $Publisher.Open($CurrentDocument)
### Replace barcode and text
$pbReplaceScopeAll = 2
$OpenDoc.Find.Clear()
$OpenDoc.Find.FindText = "DEFAULT"
$OpenDoc.Find.ReplaceWithText = $CurrentMember
$OpenDoc.Find.ReplaceScope = $pbReplaceScopeAll
$OpenDoc.Find.Execute()
$OpenDoc.Save()
$OpenDoc.Close()
$Publisher.Quit()
The image in the template document is currently a blank 145x145 pixel square, to be replaced by the appropriate QR code image, depending on the value of $CurrentMember. I haven't yet written anything to try to change the image, as I cannot find anything online; everything I search for seems to return results about Azure publisher server images.
Many thanks,
Rob
The easiest way is probably to get the shape by index, then add a new picture in its place, then remove the original shape:
Sub ReplaceFirstShapeWithImage()
    Dim oPage As Page
    Dim oShape As Shape
    Dim newImage As Shape
    Set oPage = Application.ActiveDocument.ActiveView.ActivePage
    Set oShape = oPage.Shapes(1)
    ''https://msdn.microsoft.com/en-us/library/office/ff940072.aspx
    Set newImage = oPage.Shapes.AddPicture("C:\Users\johanb\Pictures\X.png", msoFalse, msoTrue, oShape.Left, oShape.Top, oShape.Width, oShape.Height)
    oShape.Delete
End Sub
This should help you find the right index:
Sub GetIndexOfSelectedShape()
    If Application.Selection.ShapeRange.Count = 0 Then
        MsgBox "Please select a shape first"
        Exit Sub
    End If
    Dim oShape As Shape
    Dim oLoopShape As Shape
    Dim i As Long
    Set oShape = Application.Selection.ShapeRange(1)
    For i = 1 To oShape.Parent.Shapes.Count
        Set oLoopShape = oShape.Parent.Shapes(i)
        If oLoopShape Is oShape Then
            MsgBox oShape.Name & " has index " & i
        End If
    Next i
End Sub
Unfortunately I can't use PowerShell right now, but this VBA code should help you with the object model.

LibreOffice Draw -add hyperlinks based on query table

I am using Draw to mark up an index map in PDF format, so that in grid cell 99, for example, the text hyperlinks to map99.pdf.
There are thousands of grid cells. Is there a way for a macro to scan for text listed in a sheet like this:
Text in File | Link to add
99|file:///c:/maps/map99.pdf
100|file:///c:/maps/map100.pdf
and add a link to the relevant file wherever the text (99, 100, etc.) is found?
I don't use LibreOffice much, but I'm happy to implement any programmatic solution.
Ok, after using xray to drill through enumerated content, I finally have the answer. The code needs to create a text field using a cursor. Here is a complete working solution:
Sub AddLinks
    Dim oDocument As Object
    Dim vDescriptor, vFound
    Dim numText As String, tryNumText As Integer
    Dim oDrawPages, oDrawPage
    Dim oField, oCurs
    Dim numChanged As Integer

    oDocument = ThisComponent
    oDrawPages = oDocument.getDrawPages()
    oDrawPage = oDrawPages.getByIndex(0)
    numChanged = 0
    For tryNumText = 1 to 1000
        vDescriptor = oDrawPage.createSearchDescriptor
        With vDescriptor
            '.SearchString = "[:digit:]+" 'Patterns work in search box but not here?
            .SearchString = tryNumText
        End With
        vFound = oDrawPage.findFirst(vDescriptor)
        If Not IsNull(vFound) Then
            numText = vFound.getString()
            oField = ThisComponent.createInstance("com.sun.star.text.TextField.URL")
            oField.Representation = numText
            oField.URL = numText & ".pdf"
            vFound.setString("")
            oCurs = vFound.getText().createTextCursorByRange(vFound)
            oCurs.getText().insertTextContent(oCurs, oField, False)
            numChanged = numChanged + 1
        End If
    Next tryNumText
    MsgBox("Added " & numChanged & " links.")
End Sub
To save relative links, go to File -> Export as PDF -> Links and check Export URLs relative to file system.
I uploaded an example file here that works. For some reason your example file is hanging on my system -- maybe it's too large.
Replacing text with links is much easier in Writer than in Draw. However Writer does not open PDF files.
There is some related code at https://forum.openoffice.org/en/forum/viewtopic.php?f=20&t=1401.
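The macro above derives each URL from the number itself, whereas the question describes a lookup sheet with one "text|link" row per cell. As a rough sketch of that lookup step only, here in Python for illustration (`parse_link_table` is a hypothetical helper, and the rows are the ones quoted in the question), the table can be read into a mapping from grid text to URL:

```python
def parse_link_table(lines, has_header=True):
    """Map each 'text|url' row to {text: url}, skipping blank or malformed rows."""
    rows = list(lines)
    if has_header:
        rows = rows[1:]  # drop the "Text in File | Link to add" header row
    links = {}
    for line in rows:
        # Split on the first pipe only; the URL is everything after it.
        text, separator, url = line.partition("|")
        text, url = text.strip(), url.strip()
        if separator and text and url:
            links[text] = url
    return links


if __name__ == "__main__":
    table = [
        "Text in File | Link to add",
        "99|file:///c:/maps/map99.pdf",
        "100|file:///c:/maps/map100.pdf",
    ]
    for text, url in parse_link_table(table).items():
        print(text, "->", url)
```

With such a mapping, the URL for a found string would be looked up by its text rather than built by appending ".pdf".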

Form manipulation via VBS - need form to update after 1st selection

I created a script to BEGIN to do what I need. This script interacts with the initial form.
I can fill out the form and submit, but it errors.
Upon further inspection, after a selection is made in the first dropdown field, the form is supposed to refresh with updated dropdown options.
With my code, it doesn't allow the form to refresh after the first selection is made. Here is my somewhat-working code:
set ie = createobject("internetexplorer.application")
Set objShell = CreateObject("WScript.Shell")
ie.navigate "https://www.myfloridalicense.com/wl11.asp?Mode=1&SID=&brd=&typ="
ie.Visible = True
do until ie.readystate = 4 : wscript.sleep 10: loop
IE.Document.getElementsByTagName("select")("Board").Value = "25"
do until ie.readystate = 4 : wscript.sleep 10: loop
IE.Document.getElementsByTagName("select")("County").Value = "11"
IE.Document.getElementsByTagName("select")("RecsPerPage").Value = "50"
For Each btn In IE.Document.getElementsByTagName("input")
    If btn.type = "image" Then btn.Click()
Next
My question is - how do I allow the form to update after the first selection? First selection being:
IE.Document.getElementsByTagName("select")("Board").Value = "25"
I tried adding this after, but no dice:
do until ie.readystate = 4 : wscript.sleep 10: loop
The page at the link you gave contains a function DDChange(), and the first and second form fields have their onchange event bound to it. However, this function is not called when the script changes the field, because assigning Value from script does not fire the onchange event. So I've added to the script the same actions that the function performs, plus an extra loop to wait for the document to reach the complete state. Here is the part of the code that updates the form after the first field is changed; it works OK for me:
Set IE = CreateObject("internetexplorer.application")
IE.Visible = True
IE.Navigate "https://www.myfloridalicense.com/wl11.asp?Mode=1&SID=&brd=&typ="
Do Until IE.ReadyState = 4: WScript.sleep 10: Loop
IE.Document.getElementsByTagName("select")("Board").Value = "25"
' see function DDChange() {} for details
Set Form = IE.Document.Forms("reportForm")
Form.hDDChange.Value = "Y"
Form.Submit
Do Until IE.ReadyState = 4: WScript.sleep 10: Loop
' additional loop to check doc state
Do Until IE.Document.ReadyState = "complete": WScript.sleep 10: Loop
WScript.Echo IE.Document.getElementsByTagName("select")("LicenseType").OuterHtml
' the rest part of your code here ...
BTW, there are two methods of web scraping: interacting with IE (the one you are implementing), or getting the data via HTTP requests and parsing it. The first is straightforward but not reliable (IE is heavy and slow, and user settings and old cookies sometimes affect the workflow). The second is more complex to set up and requires more skill, but it does not need IE, so it is free of IE's disadvantages.
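As a small taste of the HTTP-request method, the sketch below (Python, for illustration only) builds the URL-encoded body that a form post would carry. The field names Board and hDDChange are taken from the scripts above; it is an assumption that the real form needs more fields than these, and actually sending the request and parsing the response is not shown. `build_form_body` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_form_body(fields):
    """Encode form fields as an application/x-www-form-urlencoded string."""
    return urlencode(fields)


if __name__ == "__main__":
    # Mirrors what the IE script does by hand: pick the board, flag the change.
    print(build_form_body({"Board": "25", "hDDChange": "Y"}))
```

A complete scraper would post this body to the form's action URL and then parse the returned HTML for the updated dropdown options.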

How to insert entire file path into text box, not just filename? Access 2010

I am currently using the following code to select a file and add its path to a text box.
Dim objDialog As Object
Set objDialog = Application.FileDialog(3)
With objDialog
    .AllowMultiSelect = False
    .Show
    If .SelectedItems.Count = 0 Then
        MsgBox "No file selected."
    Else
        Me.FileNameTextBox = Dir(.SelectedItems(1))
    End If
End With
Set objDialog = Nothing
How do I make it so the entire file path is inserted, not just the file name?
.SelectedItems(n) already contains the full path and filename. If what you need is just to separate the name of the file from its path, instead of using the Dir function you could use something like this:
Me.FileNameTextBox = Mid$(.SelectedItems(1), InStrRev(.SelectedItems(1), "\") + 1)
Me.PathTextBox = Left$(.SelectedItems(1), InStrRev(.SelectedItems(1), "\"))
Hope this helps!
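The split at the last backslash can be checked on sample paths with a short sketch, written here in Python for illustration only (`split_path` is a hypothetical helper that mirrors the InStrRev logic above):

```python
def split_path(full_path):
    """Split at the last backslash: returns (folder with trailing backslash, file name)."""
    cut = full_path.rfind("\\") + 1  # 0 when the path contains no backslash
    return full_path[:cut], full_path[cut:]


if __name__ == "__main__":
    folder, name = split_path("C:\\docs\\report.txt")
    print(folder)
    print(name)
```

As with the VBA version, a string without any backslash yields an empty folder part and the whole string as the file name.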
You need to remove the Dir() part, e.g.:
Me.FileNameTextBox = .SelectedItems(1)