I have a RichTextBox whose text contains URLs.
For example:
Det er vigtigt at du læser vor Praktiske Information grundigt igennem
før afrejse. På vores hjemmeside går du ind under ”Praktisk info” og
vælger dit aktuelle rejsemål.
https://www.example.com/watch?v=IjGNTqAW58E Her finder du mange
nyttige informationer omkring turen, så som visum, vaccinationer,
bagage, transport med mere. link til Praktisk Info
When I paste this text into the report, I find all URLs and wrap each one in hyperlink field tags:
richTextBox.RTF = richTextBox.RTF.Replace(url, @"{\field{\*\fldinst{ HYPERLINK " + $"\"{url}\"" + @"} }{\fldrslt{"+ $"{url}" +@" } } }\ul0\cf0");
In the previewer it looks good: the link is clickable and works correctly.
But when I convert the report to PDF, the URL is split in two where the line wraps, and the part that falls on the second line is dropped from the link target.
How can I generate a PDF with a valid URL?
Try to use the following format:
"{\\field{\\*\\fldinst{ HYPERLINK "+"\"https://www.example.com/watch?v=IjGNTqAW58E\"" + "}}{\\fldrslt https://www.example.com/watch?v=IjGNTqAW58E}}"
This keeps the url in one piece.
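If you have many URLs to wrap, the format above can be produced by a small helper so every link gets identical field markup. The following is only a minimal sketch in C++ rather than the C# from the question. The names makeRtfHyperlink and escapeRtf are hypothetical, and escaping backslashes and braces is an assumption about the input that ordinary URLs may never need.

```cpp
#include <string>

// Hypothetical helper: prefixes the characters that are special in RTF
// (backslash and braces) with a backslash.
std::string escapeRtf(const std::string& text) {
    std::string out;
    for (char c : text) {
        if (c == '\\' || c == '{' || c == '}') {
            out += '\\';
        }
        out += c;
    }
    return out;
}

// Hypothetical helper: builds the field markup shown in the answer above,
// using the URL both as the link target and as the visible text.
std::string makeRtfHyperlink(const std::string& url) {
    const std::string safe = escapeRtf(url);
    return "{\\field{\\*\\fldinst{ HYPERLINK \"" + safe + "\"}}"
           "{\\fldrslt " + safe + "}}";
}
```

Calling makeRtfHyperlink with the URL from the question returns the same string as the literal in the answer, which you can then pass as the replacement text.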
I am using an AHK script to replace some text in an .xml file (Result.xml in this case). The script then saves a file as Result_copy.xml. It changes exactly what I need, but when I try to open the new xml file, it won't open, giving me the error:
This page contains the following errors:
error on line 4 at column 16: Encoding error
Below is a rendering of the page up to the first error.
I only replaced text at line 38 using:
#Include TF.ahk
path = %1%
text = %2%
TF_ReplaceLine(path, 38, 38, text)
%1% and %2% are supplied by another program and are working as they should.
I also see that the original Result.xml is 123 KB and Result_copy.xml is 62 KB, even though I only add text. When I take Result.xml, manually add the text and save it, it is 123 KB and still opens. So both files now appear to contain exactly the same characters, but one won't open as XML. I think something happens during saving/copying that I don't understand.
Could someone help me out on this one? I don't have a lot of experience in AHK scripting and do not have a programming background.
Thank you in advance!
Michel
TF.ahk contains this:
/*
Name : TF: Textfile & String Library for AutoHotkey
Version : 3.8
Documentation : https://github.com/hi5/TF
AutoHotkey.com: https://www.autohotkey.com/boards/viewtopic.php?f=6&t=576
AutoHotkey.com: http://www.autohotkey.com/forum/topic46195.html (Also for examples)
License : see license.txt (GPL 2.0)
Credits & History: See documentation at GH above.
*/
TF_ReplaceLine(Text, StartLine = 1, Endline = 0, ReplaceText = "")
{
TF_GetData(OW, Text, FileName)
TF_MatchList:=_MakeMatchList(Text, StartLine, EndLine, 0, A_ThisFunc) ; create MatchList
Loop, Parse, Text, `n, `r
{
If A_Index in %TF_MatchList%
Output .= ReplaceText "`n"
Else
Output .= A_LoopField "`n"
}
Return TF_ReturnOutPut(OW, OutPut, FileName)
}
I am trying to use PowerShell to programmatically update the notes on PowerPoint slides. Being able to do this will save a tremendous amount of time. The code below lets me edit the notes field with PowerShell, but it messes up the formatting each time.
$PowerpointFile = "C:\Users\username\Documents\test.pptx"
$Powerpoint = New-Object -ComObject powerpoint.application
$ppt = $Powerpoint.presentations.open($PowerpointFile, 2, $True, $False)
foreach($slide in $ppt.slides){
if($slide.NotesPage.Shapes[2].TextFrame.TextRange.Text -match "string"){
$slide.NotesPage.Shapes[2].TextFrame.TextRange.Text = $slide.NotesPage.Shapes[2].TextFrame.TextRange.Text -replace "string","stringreplaced"
}
}
Sleep -Seconds 3
$ppt.Save()
$Powerpoint.Quit()
For example, right now it iterates through each slide's notes and updates the word "string" to "stringreplaced", but then the entire notes text becomes bold. In my notes I have a single bold word or title at the top, followed by regular text below it. For example, a note on a slide might look like this (only "Note Title" is bold):
Note Title
Help me with this string.
After PowerShell updates the notes field it saves it to a new .pptx file, but the note now looks like this (everything is bold):
Note Title
Help me with this stringreplaced.
Any ideas on how to update slide notes without messing up any formatting found in the notes? It only messes up formatting for slides the script updates.
When you change the entire text content of a textrange in PPT, as your code's doing, the changed textrange will pick up the formatting of the first character in the range. I'm not sure how you'd do this in PowerShell, but here's an example in PPT VBA that demonstrates the same problem and shows how to use PPT's own Replace method instead to solve the problem:
Sub ExampleTextReplace()
' Assumes two shapes with text on Slide 1 of the current presentation
' Each has the text "This is some sample text"
' The first character of each is bolded
' Demonstrates the difference between different methods of replacing text
' within a string
Dim oSh As Shape
' First shape: change the text
Set oSh = ActivePresentation.Slides(1).Shapes(1)
With oSh.TextFrame.TextRange
.Text = Replace(.Text, "sample text", "example text")
End With
' Result: the entire text string is bolded
' Second shape: Use PowerPoint's Replace method instead
Set oSh = ActivePresentation.Slides(1).Shapes(2)
With oSh.TextFrame.TextRange
.Replace "sample text", "example text"
End With
' Result: only the first character of the text is bolded
' as it was originally
End Sub
I am using LibreOffice Draw to mark up an index map in PDF format. So in grid 99, the text hyperlinks to map99.pdf.
There are thousands of grid cells. Is there a way for a macro to scan for text using a sheet like this:
Text in File | Link to add
99|file:///c:/maps/map99.pdf
100|file:///c:/maps/map100.pdf
and add a link to the relevant file wherever the text is found (99, 100, etc.)?
I don't use LibreOffice much, but I'm happy to implement any programmatic solution.
Ok, after using xray to drill through enumerated content, I finally have the answer. The code needs to create a text field using a cursor. Here is a complete working solution:
Sub AddLinks
Dim oDocument As Object
Dim vDescriptor, vFound
Dim numText As String, tryNumText As Integer
Dim oDrawPages, oDrawPage
Dim oField, oCurs
Dim numChanged As Integer
oDocument = ThisComponent
oDrawPages = oDocument.getDrawPages()
oDrawPage = oDrawPages.getByIndex(0)
numChanged = 0
For tryNumText = 1 to 1000
vDescriptor = oDrawPage.createSearchDescriptor
With vDescriptor
'.SearchString = "[:digit:]+" 'Patterns work in search box but not here?
.SearchString = tryNumText
End With
vFound = oDrawPage.findFirst(vDescriptor)
If Not IsNull(vFound) Then
numText = vFound.getString()
oField = ThisComponent.createInstance("com.sun.star.text.TextField.URL")
oField.Representation = numText
oField.URL = numText & ".pdf"
vFound.setString("")
oCurs = vFound.getText().createTextCursorByRange(vFound)
oCurs.getText().insertTextContent(oCurs, oField, False)
numChanged = numChanged + 1
End If
Next tryNumText
MsgBox("Added " & numChanged & " links.")
End Sub
To save relative links, go to File -> Export as PDF -> Links and check Export URLs relative to file system.
I uploaded an example file here that works. For some reason your example file is hanging on my system -- maybe it's too large.
Replacing text with links is much easier in Writer than in Draw. However, Writer does not open PDF files.
There is some related code at https://forum.openoffice.org/en/forum/viewtopic.php?f=20&t=1401.
I have a couple of PDF files from which I am not able to extract text. These PDF files were created by converting Word files to PDF.
The main purpose of extracting the text is to index it and make it searchable.
PdfReader reader = new PdfReader(inFileName);
for (int page = 1; page <= reader.NumberOfPages; page++)
{
// where strPDFText is string builder
strPDFText.Append(iTextSharp.text.pdf.parser.PdfTextExtractor.GetTextFromPage(reader, page) + " ");
}
string str = strPDFText.ToString();
I get an empty string. What could be the reason for the same. I am using iTextSharp 5.5.
While the sample PDF provided by the OP indeed indicates that it is a MS Word export, it simply does not contain any text, only an image (which incidentally shows text).
The content of the PDF is this:
/P <</MCID 0>> BDC BT
/F1 11.04 Tf
1 0 0 1 540.1 500.95 Tm
/GS7 gs
0 g
0 G
[( )] TJ
ET
EMC /P <</MCID 1>> BDC q
0.000000071 488.88 612 231.12 re
W* n
468 0 0 219.05 72 500.95 cm
/Image8 Do Q
EMC
As you see the only actual text displayed is a single space ([( )] TJ), and the only remaining content is a bitmap image (/Image8 Do).
Thus,
I get an empty string. What could be the reason for the same.
The reason is that there is no text in your document.
I am trying to extract a substring out of some html code in wxWidgets but I can't get my method working properly.
content of to_parse:
[HTML CODE]
<html><head></head><body><font face="Segue UI" size=2 .....<font face="Segoe UI"size="2" color="#000FFF"><font face="#DFKai-SB" ... <b><u> the text </u></b></font></font></font></body></html>
[/HTML CODE] (sorry about the format)
wxString to_parse = SOStream.GetString();
size_t spos = to_parse.find_last_of("<font face=",wxString::npos);
size_t epos = to_parse.find_first_of("</font>",wxString::npos);
wxString retstring(to_parse.Mid(spos,epos));
wxMessageBox(retstring); // Output: always ---> tml>
As there are several font face tags in the HTML in the to_parse variable, I would like to find the position of the last "<font face=" and the position of the first "</font>" closing tag.
For some reason, I always get the same output, which is unexpected to me: tml>
Can anyone spot the reason why?
The methods find_{last,first}_of() don't do what you seem to think they do. They behave in the same way as the std::basic_string<> methods of the same name: they find the first (or last) occurrence of any single character from the string you pass to them, not of the string as a whole. See the documentation.
If you want to search for a substring, use find().
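To make the difference concrete, here is a minimal sketch using std::string, whose methods of the same name wxString follows. The helper name betweenLastOpenAndClose is hypothetical and not part of wxWidgets or the standard library. It uses rfind() for the last occurrence of the opening text and find() for the first closing text after it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text from the last occurrence of `open`
// up to (not including) the first occurrence of `close` that follows it.
// Returns an empty string if either one is not found.
std::string betweenLastOpenAndClose(const std::string& s,
                                    const std::string& open,
                                    const std::string& close) {
    // rfind() searches for the whole substring, starting from the end.
    const std::size_t start = s.rfind(open);
    if (start == std::string::npos) {
        return "";
    }
    // find() searches for the whole substring, starting at `start`.
    const std::size_t end = s.find(close, start);
    if (end == std::string::npos) {
        return "";
    }
    // substr() takes a start position and a length, not an end position.
    return s.substr(start, end - start);
}
```

By contrast, calling find_last_of("<font face=") on a string that ends in "</html>" stops at the 't' of "html", because 't' is the last character that also appears in the argument. That would explain the "tml>" output in the question. Note also that Mid(), like substr(), takes a start position and a count rather than two positions.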
Thank you for the answer. Yes you were right, I must have somehow been under the impression that Substring() / substr() / Mid() takes two wxStrings as parameters, which isn't the case.
wxString to_parse = SOStream.GetString();
to_parse = to_parse.Mid(to_parse.find("<p ")); // discards everything before "<p "
to_parse = to_parse.Remove(to_parse.find("</p>")); // removes everything from "</p>" onwards
wxMessageBox(to_parse); // so we are left with everything between "<p" and "</p>"