Remove html tags in subreport - ssrs-2008

I have a subreport in SSRS which returns text embedded in HTML tags. I would like to know if there is a way of stripping these HTML tags so that only the text remains. I am using VS 2008.
I have tried using a regex function as below to strip the HTML tags but this does not work:
Shared FUNCTION RemoveHtml(ByVal Text As String) AS String
IF Text IsNot Nothing Then
Dim mRemoveTagRegex AS NEW System.Text.RegularExpressions.Regex(“<(.|\n)+?>”)
Return mRemoveTagRegex.Replace(text, "")
End If
end function

You can probably do this with SSRS's built-in functions alone. I'd recommend combining Mid with InStr and InStrRev. The following expression returns the text between the end of the first opening tag (the first ">") and the start of the last closing tag (the last "</"), so it assumes the value is wrapped in a single pair of tags.
=MID(Fields!Field.Value,
    InStr(Fields!Field.Value, ">") + 1,
    InStrRev(Fields!Field.Value, "</")
        - Len(Left(Fields!Field.Value,
            InStr(Fields!Field.Value, ">") + 1)))
Edit: It got a little more complex than I thought, but this should do the trick.
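To make the difference between the two approaches concrete, here is a minimal Python sketch (not SSRS code). The function names `remove_html` and `between_outer_tags` are hypothetical, chosen for illustration: the first mirrors the regex idea from the question, the second mirrors the Mid/InStr/InStrRev expression.

```python
import html
import re

# Matches anything that looks like a tag: "<", then non-">" characters, then ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def remove_html(text):
    """Strip every tag and decode entities such as &amp; (regex approach)."""
    if text is None:
        return ""
    return html.unescape(TAG_PATTERN.sub("", text))


def between_outer_tags(text):
    """Return what lies between the first '>' and the last '</'.

    This is the same slice the Mid/InStr/InStrRev expression takes, so
    nested tags inside the outer pair are left untouched.
    """
    start = text.find(">") + 1
    end = text.rfind("</")
    if end < start:  # no wrapping tag pair found: return the input unchanged
        return text
    return text[start:end]


print(remove_html("<p>Hello <b>world</b></p>"))
print(between_outer_tags("<p>Hello <b>world</b></p>"))
```

The slice-based version is only suitable when the value is wrapped in exactly one tag pair; the regex version handles nested or repeated tags but is still a heuristic, not an HTML parser.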

Related

How do I format a string for use in an inline script rendered as HTML?

I have the following properties string
GET 50% OFF ANY M'EDIUM OR L"AR"GE PIZZA!
I am using it in an HTML onclick markup like so
onclick="trackPromoCta(encodeURI(${properties.ctaTwoTextRight @ context='text'}));"
However, this outputs invalid HTML. I tried @context='scriptString', and that escapes the string, but only for use inside JavaScript, not inside HTML markup. I tried all of the other options as well, and none of them actually escape special characters for rendering HTML.
I once saw someone use @format to search the string for these characters and escape them for HTML, but I can't find out how to use @format to do this.
The expected output should be
onclick="trackPromoCta(encodeURI('GET 50% OFF ANY M'EDIUM OR L"AR"GE PIZZA!'));"
Take a look at the HTL spec for display context: https://github.com/Adobe-Marketing-Cloud/htl-spec/blob/master/SPECIFICATION.md#121-display-context
What you need is scriptString since your string property will eventually be used as a javascript string literal.
${properties.jcr:title @ context='scriptString'} <!--/* Applies JavaScript string escaping */-->
Also, you need to enclose your HTL expression in single quotes, for example:
var str = '${'this is a js string literal' @ context='scriptString'}'
The HTL code for your specific example would be:
onclick="trackPromoCta(encodeURI('${properties.ctaTwoTextRight @ context='scriptString'}'));"
The @context values "text", "html", and "attribute" return HTML-encoded values in the resulting markup. Per the documentation, text encodes all HTML special characters.
If you look at the page's HTML using "View Page Source" rather than the developer tools' "Inspect element", you will see the expected outcome.
onclick="trackPromoCta(encodeURI('GET 50% OFF ANY M'EDIUM OR L"AR"GE PIZZA!'));"
Reference:
https://helpx.adobe.com/experience-manager/htl/using/expression-language.html
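The answers above touch on two separate escaping layers: the text must first be a valid JavaScript string literal, and the resulting script must then be safe inside an HTML attribute. The following Python sketch illustrates that layering only; it is not how HTL implements its contexts. `build_onclick` is a hypothetical helper, with `json.dumps` standing in for JavaScript string escaping and `html.escape` standing in for attribute encoding.

```python
import html
import json


def build_onclick(text):
    """Build an onclick attribute that passes `text` to trackPromoCta."""
    # Layer 1: JavaScript string escaping. json.dumps yields a double-quoted
    # literal with inner double quotes and backslashes escaped.
    js_literal = json.dumps(text)
    script = f"trackPromoCta(encodeURI({js_literal}));"
    # Layer 2: HTML attribute escaping. Quotes and ampersands become entities
    # so they cannot terminate the attribute value early.
    return f'onclick="{html.escape(script, quote=True)}"'


print(build_onclick("GET 50% OFF ANY M'EDIUM OR L\"AR\"GE PIZZA!"))
```

The browser decodes the entities in the attribute before running the script, which is why the page source and the developer tools can show different text for the same attribute.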

Superscript within code block in Github Markdown

The <sup></sup> tag is used for superscripts. Creating a code block is done with backticks. The issue I have is when I try to create a superscript within a code block, it prints out the <sup></sup> tag instead of formatting the text between the tag.
How do I have superscript text formatted correctly when it's between backticks?
Post solution edit
Desired output:
A² instead of A<sup>2</sup>
This is not possible unless you use raw HTML.
The rules specifically state:
With a code span, ampersands and angle brackets are encoded as HTML entities automatically, which makes it easy to include example HTML tags.
In other words, it is not possible to use HTML to format text in a code span. In fact, a code span is plain, unformatted text. Having any of that text appear as a superscript would mean it is not plain, unformatted text. Thus, this is not possible by design.
However, the rules also state:
Markdown is not a replacement for HTML, or even close to it. Its
syntax is very small, corresponding only to a very small subset of
HTML tags. The idea is not to create a syntax that makes it easier
to insert HTML tags. In my opinion, HTML tags are already easy to
insert. The idea for Markdown is to make it easy to read, write, and
edit prose. HTML is a publishing format; Markdown is a writing
format. Thus, Markdown's formatting syntax only addresses issues that
can be conveyed in plain text.
For any markup that is not covered by Markdown's syntax, you simply
use HTML itself. ...
So, if you really need some text in a code span to be in superscript, then use raw HTML for the entire span (be sure to escape things manually as required):
<code>A code span with <sup>superscript</sup> text and escaped characters: "&lt;&amp;&gt;".</code>
Which renders as:
A code span with superscript text and escaped characters: "<&>".
This is expected behaviour:
Markdown wraps a code block in both <pre> and <code> tags.
You can use Unicode superscript and subscript characters within code blocks:
class SomeClass¹ {
}
Inputting these characters will depend on your operating system and configuration. I like to use compose key sequences on my Linux machines. As a last resort you should be able to copy and paste them from something like the Wikipedia page mentioned above.
¹Some interesting footnote, e.g. referencing MDN on <pre> and <code> tags.
If you're lucky, the characters you want to superscript (or subscript) may have dedicated code points in Unicode. These will work inside code blocks, as demonstrated in your question, where you include A² in backticks. E.g.:
Water (chemical formula H₂O) is transparent, tasteless and odourless.
I've listed out the super and subscript Unicode characters in this Gist. You should be able to copy and paste any you need from there.
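If typing these characters by hand is inconvenient, a small translation table can generate them. The sketch below is Python; `to_superscript` and `to_subscript` are hypothetical helper names, and the tables cover only digits and a few symbols that have dedicated Unicode code points. Characters without such a code point are left unchanged.

```python
# Translation tables from ordinary characters to their Unicode
# superscript and subscript counterparts.
SUPERSCRIPT = str.maketrans("0123456789+-=()n", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾ⁿ")
SUBSCRIPT = str.maketrans("0123456789+-=()", "₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎")


def to_superscript(text):
    """Replace characters that have a Unicode superscript form."""
    return text.translate(SUPERSCRIPT)


def to_subscript(text):
    """Replace characters that have a Unicode subscript form."""
    return text.translate(SUBSCRIPT)


# The results are plain text, so they survive inside a Markdown code span.
print("`A" + to_superscript("2") + "`")
print("`H" + to_subscript("2") + "O`")
```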

Formatting PHP code for Epub in MS Word

I'm trying to format the PHP code sections of a 700+ page book for Epub conversion. If I use soft returns at the end of the code lines, they get eaten. If I use hard returns (making each line a paragraph), I either get too much space between the lines, or not enough before and after the code section. If I add an empty line before and after the code section, it gets eaten.
There are thousands of lines of code in the book. Is there some way to handle this without manually editing the html file?
Is there some common format for these code sections, such as being wrapped in PHP tags?
If they are wrapped in PHP tags, you can use this, which will wrap each tag set in a <p class="php_code"> element:
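// Wrap each matched <?php ... ?> block in a paragraph with its own class.
function fixPHPcode($matches)
{
    return '<p class="php_code">' . $matches[0] . '</p>';
}
// The "s" modifier lets "." match newlines, so multi-line blocks are matched
// without the slow (.|\s) alternation.
$data = preg_replace_callback('/<\?php.+?\?>/is', 'fixPHPcode', $data);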
I did try some fairly complex regex transforms, but I've found an easier method that actually works fairly well.
The secret is to create a style based on Word's "HTML Preformatted" style, or if you don't have that style, a style based on Normal that specifies Arial Unicode MS or Courier New as the font, with no proofing, left justified.
Indent with spaces and use soft returns (shift-enter) at the end of each line.
Calibre will produce acceptable Epub and Mobi versions of this. Courier is a crap font for code, but at least it's monospaced so the indents will line up, and people are used to seeing it as a code font.

getting post variables in asp with non-alphanumeric fieldnames

I am working with a legacy form builder system that can have non-alphanumeric field names, for example:
<input type="text" name="5-Teléfono">
In ASP, simply doing the following will not output the posted value:
response.write (request('5-Teléfono'))
I understand this isn't the best design decision (should probably name the fields text_123 ... etc), however updating the whole system to use this structure would take time.
Is there a way in ASP for me to read form field names that contain non-alphanumeric characters?
Try
response.write (request("5-Teléfono"))
(i.e. with double quotes around 5-Teléfono).
In VBScript, a single quote starts a comment, so everything after it is ignored.
Don't use such names; it's like using non-English variable names and has the same impact.
If you can't change the name, you can still iterate the posted values and look for the proper one:
For Each key In Request.Form
    If (InStr(key, "fono") > 0) And (InStr(key, "5-Tel") > 0) Then
        Response.Write("Found, value is: " & Request.Form(key))
    End If
Next

regex_replace to replace certain html tags

Is there a way to convert BR tags and/or DIV tags to new lines so the text will be formatted correctly when I use it in a mailto? I was thinking I should look for any P, DIV, and BR tags and replace them with a newline character: wherever there is a closing tag, put the newline character, and remove the opening tag. After that I will remove the rest of the HTML with remove_html="1", but I want to keep the paragraph format.
I thought it could be done using regex_replace, but I'm not sure how to write it. Does anyone know?
Do not parse HTML with regex. Use an HTML parser module (HTML::TreeBuilder or something similar that can make in-place changes) or, in this case, even better, an XSLT transformation.
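As an illustration of the parser-based approach, here is a minimal sketch in Python using the standard library's `html.parser`, rather than the Perl module named above. The names `TextWithBreaks`, `html_to_text`, and `mailto_body` are hypothetical. It emits a newline for each BR and after each closing P or DIV, drops all other tags, and percent-encodes the result for a mailto body.

```python
from html.parser import HTMLParser
from urllib.parse import quote


class TextWithBreaks(HTMLParser):
    """Collect text, adding a newline for <br> and after </p> and </div>."""

    BLOCK_TAGS = {"p", "div"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(markup):
    """Return the text of `markup` with paragraph breaks kept as newlines."""
    parser = TextWithBreaks()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts).strip()


def mailto_body(markup):
    """Percent-encode the text for a mailto body, using CRLF line breaks."""
    return quote(html_to_text(markup).replace("\n", "\r\n"))


print("mailto:?body=" + mailto_body("<p>Hello</p><div>World<br>again</div>"))
```

Line breaks are encoded as %0D%0A here because mail clients generally expect CRLF in a mailto body.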