Strip \n within paragraphs of text in a file - perl

Say I have a file with multiple paragraphs similar to
Lorem ipsum dolor sit amet. Velit et ornare feugiat ve fringilla adipiscing, non
augue risus, eleifend. Laoreet a, taciti porttitor mus. Erat leo metus
venenatis. Natoque eni, nunc quis elit est. Nec enim dui. Sem parturient lectus,
sed, egestas. Amet nascetur quisque, nonummy amet ut odio proin hymenaeos sit,
consequat proin hymenaeos vestibulum. Duis ad penatibus natoque, fames nec amet
eni inceptos. Ligula orci scelerisque laoreet, massa leo dictumst feugiat
praesent varius netus suspendisse. Et et quis volutpat quam, aenean sit, magnis
integer ad luctus hendrerit per. Lectus adipiscing nascetur quisque consectetuer
feugiat etiam eros. Natoque massa. Semper ut nam tortor. Odio ut nullam mus,
sociis at, luctus aliquet at odio habitant fames.
Penatibus ipsum lacus blandit ad dis ante dolor. Cursus porta penatibus
facilisi. Nisl erat rutrum primis dis elit dolor penatibus pretium duis
sollicitudin ut. Sed urna leo massa cubilia eget, elementum mus. Ve metus ac
vitae at litora tincidunt id, ac hac. Dis justo nullam. Fames sollicitudin,
augue ve at. Tristique. Primis convallis praesent, eget. Nullam, penatibus ut,
proin non mus id nascetur dis, lorem arcu. Magna urna nascetur ornare, nunc
proin quisque cum, pharetra. Quisque, litora eu lobortis diam eros. Vel mi
hymenaeos ipsum in. Ligula curabitur ve, magnis hymenaeos euismod.
The file was generated by processing a markdown file, which, as you can see, has lines broken at around 80 characters. Using Perl, sed, or awk (I'm running Linux so I could use any solution, but I'm not much of a Python or Ruby user), how can I undo the breaking of lines within paragraphs?
I know how to strip \n from an entire file, but that would run the two paragraphs shown into a single unbroken line. I don't want that. I just want to operate a paragraph at a time, so any solution should skip lines where \n is the only content.
The file I have uses Unix/Linux file-endings, i.e. line feeds, hence only \n are present. I do need to preserve the spaces between paragraphs.

Line breaks within each paragraph are replaced with a space character:
perl -00 -lpe 's|\r?\n| |g' file
Here is a brief explanation of the switches, and the deparsed source:
perl -MO=Deparse -00 -lpe 's|\r?\n| |g' file
BEGIN { $/ = ""; $\ = "\n\n"; } # see below
LINE: while (defined($_ = <ARGV>)) { # -p switch
chomp $_; # also -l switch
s/\r?\n/ /g;
}
continue {
print $_; # -p switch
}
-00 => $/ = ""; # input record separator set to paragraph mode
-l => $\ = "\n\n"; # output record separator set to $/
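For readers more comfortable outside Perl, the paragraph-mode idea can be sketched in Python. This is only an illustration, not part of the answer above: the function name unwrap_paragraphs is made up here, and the sketch assumes paragraphs are separated by at least one blank line.

```python
import re

def unwrap_paragraphs(text):
    """Join the lines of each paragraph with a single space.

    Paragraphs are taken to be blocks separated by one or more blank
    lines, which is what perl's -00 treats as a record boundary.
    """
    # Normalize Windows line endings so only \n has to be handled.
    text = text.replace("\r\n", "\n")
    # Split into paragraphs on runs of blank lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, each remaining line break becomes one space.
    unwrapped = [re.sub(r"\s*\n\s*", " ", p) for p in paragraphs]
    # Paragraphs are written back with exactly one blank line between them.
    return "\n\n".join(unwrapped) + "\n"
```

Like the one-liner, this reads the whole input before writing anything.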

Try chomp() to strip the trailing newline from every line that contains a non-blank character:
perl -pe 'chomp if m/\S/' infile
EDIT: To keep a blank line between paragraphs and a final newline character, try the following:
perl -pe 'm/\S/ ? chomp() : print "\n"; END { print "\n" }' infile

Without having to read the whole file into memory:
$ cat file
Lorem ipsum dolor sit amet. Velit et ornare feugiat ve fringilla adipiscing, non
augue risus, eleifend. Laoreet a, taciti porttitor mus. Erat leo metus
venenatis. Natoque eni, nunc quis elit est.
Penatibus ipsum lacus blandit ad dis ante dolor. Cursus porta penatibus
facilisi. Nisl erat rutrum primis dis elit dolor penatibus pretium duis
sollicitudin ut. Sed urna leo massa cubilia eget, elementum mus. Ve metus ac
vitae at litora tincidunt id, ac hac. Dis justo nullam.
$ awk -v RS= -v ORS='\n\n' -F'\n' '{$1=$1}1' file
Lorem ipsum dolor sit amet. Velit et ornare feugiat ve fringilla adipiscing, non augue risus, eleifend. Laoreet a, taciti porttitor mus. Erat leo metus venenatis. Natoque eni, nunc quis elit est.
Penatibus ipsum lacus blandit ad dis ante dolor. Cursus porta penatibus facilisi. Nisl erat rutrum primis dis elit dolor penatibus pretium duis sollicitudin ut. Sed urna leo massa cubilia eget, elementum mus. Ve metus ac vitae at litora tincidunt id, ac hac. Dis justo nullam.
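The same streaming idea, holding only one paragraph in memory at a time, can be sketched in Python with a generator. The name unwrap_stream is hypothetical, and the sketch assumes blank (or whitespace-only) lines mark paragraph boundaries.

```python
def unwrap_stream(lines):
    """Yield one unwrapped paragraph per group of non-blank lines.

    `lines` can be any iterable of strings, such as an open file, so
    only the current paragraph is kept in memory.
    """
    current = []
    for line in lines:
        stripped = line.strip()
        if stripped:
            # Still inside a paragraph: collect the line.
            current.append(stripped)
        elif current:
            # A blank line closes the paragraph collected so far.
            yield " ".join(current)
            current = []
    if current:
        # The last paragraph may not be followed by a blank line.
        yield " ".join(current)
```

A caller would typically pass an open file object and print each yielded paragraph followed by a blank line.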

"lines where \n is the only content" means at least two consecutive newline characters.
You can do it easily with a regex. Lookarounds keep the characters on either side of the newline from being consumed by the match: (?<=[^\r\n])\n(?=[^\r\n])
A sample Python file:
import re
mystring = """sjdfkj
adlfklk
dlkfl """
print(re.sub(r"(?<=[^\r\n])\n(?=[^\r\n])", " ", mystring))

Multiline Regex replacement with Autohotkey

Can't seem to wrap my head around the proper regex!
MY GOAL
add 2 spaces to each line of a selected block of text
MY CONTEXT
some markdown tools I used need 2 spaces at the end of each line to properly manage lists, etc.
if a file is edited multiple times, I do not want to end up with lines ending with 4+ spaces
a block of text can be a line, a paragraph, the whole file content as shown in the editor
I have some kind of macro in Notepad++ that does the trick but I want to do the same with Autohotkey to be editor-independent
MY EXAMPLE
----
# 2020-03-17
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu.
In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a,
MY SNIPPET SO FAR
; CTL+SHIFT+F12
^+F12::
Clipboard = ; Empty the clipboard so that ClipWait has something to detect
SendInput, ^c ; Copy selected text
ClipWait
OutputText := ""
Loop, parse, Clipboard, `n, `r
{
OutputText .= RegExReplace(A_LoopField,"m)^(.*) *$","$1 `r`n")
}
SendRaw % OutputText
return
MY PROBLEM
Between the characters ignored when looping, what I am trying to match, and what I replace the group with, I end up with far more lines and spaces than needed.
CURRENT OUTPUT
----
# 2020-03-17
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu.
In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a,
DESIRED OUTPUT
----
# 2020-03-17
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu.
In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a,
You're getting too many lines in the output because the send command is messing up due to the carriage returns, which aren't needed in there anyway. I don't really know why that is happening, and I can't be bothered to find out why since the approach isn't good anyway.
And also your indentation is getting messed up because your text editor automatically adds indentation based on the previous line.
But anyway, sending such (long) input is never a good idea.
Make use of the clipboard and just send a ctrl+v to instantly and reliably paste in the text.
Here's an example of that, along with another way to add the spaces at the end:
inp := "
(
----
# 2020-03-17
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu.
In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a,
)"
Loop, Parse, inp, `n, `r
OutputText .= (A_LoopField = "" ? "" : RTrim(A_LoopField) "  ") "`n"
Clipboard := OutputText
SendInput, ^v
The ternary A_LoopField = "" ? "" : RTrim(A_LoopField) "  " evaluates to an empty string if the line was empty, so the two spaces aren't added to blank lines.
I think that's the behavior you were going for.
And RTrim is used to trim any trailing spaces (or tabs) off the end, so we're sure to end up with just the two we want.
And, of course, at the end of every line we add one line feed `n.
Also, your Regex approach was just fine as well; at first it just seemed off to me. But well, here's another way. And I guess this would be more efficient, though you'd have to work with seriously large inputs and/or slow hardware for that to make any meaningful difference haha.
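The line transformation itself does not depend on AutoHotkey. As a rough sketch only, here is the same logic in Python; add_markdown_breaks is a made-up name. Unlike the AutoHotkey loop, it also treats whitespace-only lines as empty.

```python
def add_markdown_breaks(text):
    """End every non-empty line with exactly two spaces.

    Existing trailing spaces and tabs are removed first, so running
    the function on its own output changes nothing.
    """
    out = []
    for line in text.splitlines():
        trimmed = line.rstrip(" \t")
        # Blank lines stay blank; other lines get exactly two spaces.
        out.append(trimmed + "  " if trimmed else "")
    # Lines are joined with plain line feeds, no carriage returns.
    return "\n".join(out) + "\n"
```

Because trailing whitespace is trimmed before the two spaces are appended, editing a file several times never leaves lines ending in four or more spaces.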

Extract first 50 words from a string (Perl) [closed]

I write in Perl.
I need to split a string into the first 50 words (or the entire text if there are fewer than 50 words in total) and the remaining words (an empty string if the text has no more than 50 words).
In the first part (first 50 words) and the second part (the rest) the word separators should be preserved: newline should remain newline and space should remain space.
Assuming that by word you mean just a sequence of non-whitespace characters, this can be done simply using a single regex. The one below looks for N-1 consecutive sequences of non-whitespace characters followed by whitespace characters, and then a further stretch of non-whitespace characters. This is the first part of the string. Any following whitespace is skipped, and then the rest of the string forms the second part.
I have used the /s modifier so that a dot . within the regex matches any characters, including newlines. The /x modifier allows for insignificant whitespace within the regex to make it more readable.
Thanks to @knarf for the data.
use strict;
use warnings;
my $text = 'Lorem ipsum dolor sit amet, consectetuer adipiscing
elit. Donec hendrerit tempor tellus. Donec pretium posuere
tellus. Proin quam nisl, tincidunt et, mattis eget, convallis nec,
purus. Cum sociis natoque penatibus et magnis dis parturient montes,
nascetur ridiculus mus. Nulla posuere. Donec vitae dolor. Nullam
tristique diam non turpis. Cras placerat accumsan nulla. Nullam
rutrum. Nam vestibulum accumsan nisl.';
my ($first, $rest) = wsplit($text, 50);
print $first, "\n\n";
print $rest, "\n";
sub wsplit {
    my ($s, $n) = @_;
    --$n;
    # In list context the match returns its two captures: ($first, $rest)
    $s =~ / ( (?: \S+ \s+ ){0,$n} \S+ ) \s* (.*) /xs;
}
output
Lorem ipsum dolor sit amet, consectetuer adipiscing
elit. Donec hendrerit tempor tellus. Donec pretium posuere
tellus. Proin quam nisl, tincidunt et, mattis eget, convallis nec,
purus. Cum sociis natoque penatibus et magnis dis parturient montes,
nascetur ridiculus mus. Nulla posuere. Donec vitae dolor. Nullam
tristique diam non turpis. Cras placerat
accumsan nulla. Nullam
rutrum. Nam vestibulum accumsan nisl.
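The same single-regex idea carries over to other regex engines. A minimal sketch in Python, assuming as above that a word is any run of non-whitespace characters (the handling of n < 1 and of word-free input is my own choice, not taken from the Perl version):

```python
import re

def wsplit(text, n):
    """Split text into (first n words, remainder).

    Whitespace inside each part is preserved; the whitespace between
    the two parts is dropped, as in the Perl regex.
    """
    if n < 1:
        return "", text
    # Up to n-1 "word + whitespace" groups, then one more word.
    pattern = re.compile(r"((?:\S+\s+){0,%d}\S+)\s*(.*)" % (n - 1), re.DOTALL)
    match = pattern.search(text)
    if match is None:
        # The text contains no words at all.
        return "", ""
    return match.group(1), match.group(2)
```

re.DOTALL plays the role of Perl's /s modifier, so the trailing (.*) also captures newlines in the remainder.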
I came up with this naive way but I guess there's a better one using a single regex.
use strict;
use warnings;
use Data::Dumper;
my $text = 'Lorem ipsum dolor sit amet, consectetuer adipiscing
elit. Donec hendrerit tempor tellus. Donec pretium posuere
tellus. Proin quam nisl, tincidunt et, mattis eget, convallis nec,
purus. Cum sociis natoque penatibus et magnis dis parturient montes,
nascetur ridiculus mus. Nulla posuere. Donec vitae dolor. Nullam
tristique diam non turpis. Cras placerat accumsan nulla. Nullam
rutrum. Nam vestibulum accumsan nisl.';
sub wsplit {
    my ($s, $words) = @_;
    my $pos = length $s;
    my $n = 0;
    while ($s =~ /\S+/g) {
        $n++;
        if ($n == $words) {
            $pos = pos $s;
            last;
        }
    }
    return [substr($s, 0, $pos), substr($s, $pos)];
}
print Dumper(wsplit($text, 8));
Output:
$VAR1 = [
'Lorem ipsum dolor sit amet, consectetuer adipiscing
elit.',
' Donec hendrerit tempor tellus. Donec pretium posuere
tellus. Proin quam nisl, tincidunt et, mattis eget, convallis nec,
purus. Cum sociis natoque penatibus et magnis dis parturient montes,
nascetur ridiculus mus. Nulla posuere. Donec vitae dolor. Nullam
tristique diam non turpis. Cras placerat accumsan nulla. Nullam
rutrum. Nam vestibulum accumsan nisl.'
];

Script which takes a file from user, encode it in base64 and forward the file in email

I am trying to create a script which takes file from the user, encode it in base64 and then forward it to an email as attachment with encoded file.
This is the code till now :
Client side :
<form action="send.php" method="post" enctype="multipart/form-data">
<input type="file" name="uploaded_file">
<input type="submit" value="Submit" name='submit'>
</form>
PHP Script (send.php)
<?php
$uploaded_file = basename($_FILES['uploaded_file']['name']);
$file_size = filesize($uploaded_file);
$handle = fopen($uploaded_file, "r");
$content = fread($handle, $file_size);
fclose($handle);
$content1 = base64_encode($content);
$my_file = "'$encoded_file' . '_encoded' . '.html'";
$handle1 = fopen($my_file, 'w') or die('Cannot open file: '.$my_file);
fwrite($handle1, $content1);
$from="tes@gmail.com";
$to="myemail@gmail.com";
$subject="New file";
$message="you got a new file";
mail($from,$to,$subject,$message);
unlink($my_file);
?>
I don't know how to attach $my_file in the email.
You can just use PHPMailer
$mail = new PHPMailer(); // defaults to using php "mail()"
$body = file_get_contents('contents.html');
$body = preg_replace('/\\\\/', '', $body); // strip backslashes
$mail->SetFrom('tes@gmail.com', 'First Last');
$mail->AddReplyTo("tes@gmail.com", "First Last");
$mail->AddAddress("myemail@gmail.com", "John Doe");
$mail->Subject = "New File";
$mail->AltBody = "you got a new file";
$mail->MsgHTML($body);
$my_file = "tmp.html";
file_put_contents($my_file, base64_encode($content)); // returns a byte count, not a path
$mail->AddAttachment($my_file); // attachment
if (! $mail->Send()) {
echo "Mailer Error: " . $mail->ErrorInfo;
} else {
echo "Message sent!";
}
If you're not using an external mail library such as PEAR::SMTP or Swift, creating attachments is not simple.
You will have to check the mail standards, which describe how to separate the headers from the message content and the message content from the attachments.
Lines are separated by \r\n, and each section must be preceded by a boundary line.
Check this link out: http://webcheatsheet.com/php/send_email_text_html_attachment.php
And my code example:
$uid = md5(uniqid(time()));
$header = implode("\r\n",
array(
'MIME-Version: 1.0',
'From: Address <address@example.com>',
'Reply-To: Address <address@example.com>',
'Cc: Address <address@example.com>',
'Content-Type: multipart/mixed;',
' boundary="'.$uid.'"',
'',
'--'.$uid,
'Content-Type: text/plain;',
' charset="us-ascii"',
'Content-Transfer-Encoding: 7bit',
'',
mb_convert_encoding(
implode("\r\n",
array_map('wordwrap',
array(
'Hello,',
'',
'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla laoreet varius augue non pretium. Nullam dignissim iaculis velit, ut cursus neque mollis et. Fusce faucibus dictum nunc, sit amet cursus tellus aliquam ut. Praesent id libero orci, ut vehicula orci. Proin nulla sem, dapibus in lacinia vel, pharetra eget eros. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Cras rhoncus, enim sed mollis suscipit, sapien mauris suscipit mauris, nec placerat nisi sapien ac lectus. Sed vehicula faucibus iaculis. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Integer placerat commodo nibh. Morbi sit amet augue vitae urna mattis pulvinar ac eget tellus. Donec mattis cursus posuere. Donec dolor velit, aliquam eu pretium vel, accumsan eu est. Proin at ante nisi, ut malesuada diam.',
'',
'Nulla ac urna lacus, in luctus ipsum. Nunc luctus risus non dui tincidunt in ultricies tellus rhoncus. Aliquam pellentesque vehicula adipiscing. Phasellus porttitor sodales quam et cursus. Pellentesque iaculis consectetur lorem, et consectetur sem tristique sed. Duis vehicula eleifend nisl, a semper augue lobortis cursus. Fusce ut ligula diam. Quisque fringilla sodales massa vitae rhoncus. Pellentesque a libero sed enim lacinia commodo quis et dolor. Praesent facilisis, orci nec lacinia ultrices, enim nunc imperdiet augue, at aliquet dolor augue nec nulla.',
'',
'Ut venenatis commodo pulvinar. Vestibulum quam odio, malesuada posuere porttitor facilisis, molestie vel risus. Mauris auctor fringilla ornare. Morbi eu libero orci, nec semper lorem. Quisque blandit, ante nec pharetra bibendum, felis felis tincidunt felis, sed mattis sapien lorem et risus. In sed interdum quam. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Maecenas vitae metus id purus ultrices fermentum. In dui eros, feugiat sit amet feugiat vel, accumsan vitae ligula. Nam at consectetur elit. Maecenas eget metus id nibh luctus mattis. Quisque mollis, sapien semper vulputate venenatis, purus arcu ornare orci, sit amet pellentesque enim ipsum non urna. Quisque at augue nec quam vehicula suscipit eu a dui. Morbi at molestie nunc. Mauris ultricies rutrum euismod. Sed non sem nisi. '
)
)
),
'iso-8859-1'
),
'',
'--'.$uid,
'Content-Type: application/pdf;',
' name="file.pdf"',
'Content-Transfer-Encoding: base64',
'Content-Disposition: attachment;',
' filename="file.pdf"',
'',
chunk_split(base64_encode(file_get_contents('file.pdf'))),
'--'.$uid.'--'
)
);
mail($contacts, 'Subject', '', $header);
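For comparison, the structure the hand-built header describes (a multipart/mixed message with a text part and a base64-encoded attachment) is what a mail library generates for you. A minimal sketch using Python's standard email package; the addresses, file name, and payload are placeholders, and actually sending the message is left out.

```python
from email.message import EmailMessage

def build_message(sender, recipient, subject, body, filename, payload):
    """Build a multipart/mixed message with one binary attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # The plain-text body becomes the first MIME part.
    msg.set_content(body)
    # Bytes payloads are base64-encoded by the library; the boundary
    # lines and part headers are generated automatically.
    msg.add_attachment(payload, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg

# Placeholder values, for illustration only.
msg = build_message("sender@example.com", "recipient@example.com",
                    "New file", "You got a new file.",
                    "upload.bin", b"\x00\x01 raw bytes")
```

Printing msg.as_string() shows the generated boundaries and the base64 body, which is a useful reference when assembling the parts by hand.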

Kentico CMS: Text area input length calculation and determining max length - specifically when there are line breaks in the text

How does Kentico calculate the length of inputted content in a text area on a form and how much value does it give to a line break? A line break is 2 characters according to my JavaScript calculation but seems like Kentico calculates it as being more than 2 characters.
Summary of problem:
I have a maximum length of 2500 set on a text area input on a form on my Kentico site.
I have entered some text into this text area, and by my JavaScript calculation (used to show how many characters the user has left) the character length is exactly 2500 (including line breaks and spaces), so it should validate and send. However, Kentico rejects my input, saying that the max length has been exceeded.
If I remove the line break and type some extra characters to bring my character calculation back up to 2500, the form sends without failing.
Test used that fails:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque vitae
augue ac enim molestie scelerisque a id metus. Suspendisse purus
justo, iaculis quis accumsan ut, congue vitae mauris. Nunc luctus
vulputate scelerisque. Nullam ullamcorper porta elit, sed ornare lorem
placerat dictum. Sed quis enim quis nibh convallis sagittis nec vitae
felis. Sed porttitor, nibh et volutpat posuere, neque dui sollicitudin
sapien, at scelerisque lacus elit quis enim. Donec at metus lectus.
Sed quis enim quis nibh convallis sagittis nec vitae felis. Sed
porttitor, nibh et volutpat posuere, neque dui sollicitudin sapien, at
scelerisque lacus elit quis enim. Donec at metus lectus. Lorem ipsum
dolor sit amet, consectetur adipiscing elit. Quisque vitae augue ac
enim molestie scelerisque a id metus. Suspendisse purus justo, iaculis
quis accumsan ut, congue vitae mauris. Nunc luctus vulputate
scelerisque. Nullam ullamcorper porta elit, sed ornare lorem placerat
dictum. Sed quis enim quis nibh convallis sagittis nec vitae felis.
Sed porttitor, nibh et volutpat posuere, neque dui sollicitudin
sapien, at scelerisque lacus elit quis enim. Donec at metus
lectus.Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Quisque vitae augue ac enim molestie scelerisque a id metus.
Suspendisse purus jus
to, iaculis quis accumsan ut, congue vitae mauris. Nunc luctus
vulputate scelerisque. Nullam ullamcorper porta elit, sed ornare lorem
placerat dictum. Sed quis enim quis nibh convallis sagittis nec vitae
felis. Sed porttitor, nibh et volutpat posuere, neque dui sollicitudin
sapien, at scelerisque lacus elit quis enim. Donec at metus lectus.
Sed quis enim quis nibh convallis sagittis nec vitae felis. Sed
porttitor, nibh et volutpat posuere, neque dui sollicitudin sapien, at
scelerisque lacus elit quis enim. Donec at metus lectus. Lorem ipsum
dolor sit amet, consectetur adipiscing elit. Quisque vitae augue ac
enim molestie scelerisque a id metus. Suspendisse purus justo, iaculis
quis accumsan ut, congue vitae mauris. Nunc luctus vulputate
scelerisque. Nullam ullamcorper porta elit, sed ornare lorem placerat
dictum. Sed quis enim quis nibh convallis sagittis nec vitae felis.
Sed porttitor, nibh et volutpat posuere, neque dui sollicitudin
sapien, at scelerisque lacus elit quis enim. Donec at metus
lectus.Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Quisque vitae augue ac enim molestie scelerisque a id metus.
Suspendisse purus justo, iaculis quis accumsan ut, congue vitae maur d
Test used that passes: Notice that the line break has been removed and 2 extra characters added to the end to bring it back up to 2500 characters
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque vitae
augue ac enim molestie scelerisque a id metus. Suspendisse purus
justo, iaculis quis accumsan ut, congue vitae mauris. Nunc luctus
vulputate scelerisque. Nullam ullamcorper porta elit, sed ornare lorem
placerat dictum. Sed quis enim quis nibh convallis sagittis nec vitae
felis. Sed porttitor, nibh et volutpat posuere, neque dui sollicitudin
sapien, at scelerisque lacus elit quis enim. Donec at metus lectus.
Sed quis enim quis nibh convallis sagittis nec vitae felis. Sed
porttitor, nibh et volutpat posuere, neque dui sollicitudin sapien, at
scelerisque lacus elit quis enim. Donec at metus lectus. Lorem ipsum
dolor sit amet, consectetur adipiscing elit. Quisque vitae augue ac
enim molestie scelerisque a id metus. Suspendisse purus justo, iaculis
quis accumsan ut, congue vitae mauris. Nunc luctus vulputate
scelerisque. Nullam ullamcorper porta elit, sed ornare lorem placerat
dictum. Sed quis enim quis nibh convallis sagittis nec vitae felis.
Sed porttitor, nibh et volutpat posuere, neque dui sollicitudin
sapien, at scelerisque lacus elit quis enim. Donec at metus
lectus.Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Quisque vitae augue ac enim molestie scelerisque a id metus.
Suspendisse purus justo, iaculis quis accumsan ut, congue vitae
mauris. Nunc luctus vulputate scelerisque. Nullam ullamcorper porta
elit, sed ornare lorem placerat dictum. Sed quis enim quis nibh
convallis sagittis nec vitae felis. Sed porttitor, nibh et volutpat
posuere, neque dui sollicitudin sapien, at scelerisque lacus elit quis
enim. Donec at metus lectus. Sed quis enim quis nibh convallis
sagittis nec vitae felis. Sed porttitor, nibh et volutpat posuere,
neque dui sollicitudin sapien, at scelerisque lacus elit quis enim.
Donec at metus lectus. Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Quisque vitae augue ac enim molestie scelerisque a id
metus. Suspendisse purus justo, iaculis quis accumsan ut, congue vitae
mauris. Nunc luctus vulputate scelerisque. Nullam ullamcorper porta
elit, sed ornare lorem placerat dictum. Sed quis enim quis nibh
convallis sagittis nec vitae felis. Sed porttitor, nibh et volutpat
posuere, neque dui sollicitudin sapien, at scelerisque lacus elit quis
enim. Donec at metus lectus.Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Quisque vitae augue ac enim molestie scelerisque a id
metus. Suspendisse purus justo, iaculis quis accumsan ut, congue vitae
maur dee
The problem lay in the fact that my JavaScript calculation gave a line break a length of 1, whereas Kentico's calculation gives a line break a length of 2. So they weren't matching up. Hence my character counter said that the length of the entered text was OK, but Kentico's check deemed it over the max length.
This is what I had previously:
enteredText = textareaVariableName.val();
characterCount = enteredText.length; //one line break entered returned 1
This is what I have changed it to:
enteredText = textareaVariableName.val();
enteredTextEncoded = escape(enteredText);
//next I match any line break characters - %0A - after encoding the text area text
linebreaks = enteredTextEncoded.match(/%0A/g);
(linebreaks != null) ? linebreaksLength = linebreaks.length : linebreaksLength = 0;
characterCount = enteredText.length + linebreaksLength; //one line break entered now returns 2
Is there a better way I could check for line breaks in the text, rather than to encode the text and then check for the substring %0A ?
EDIT/UPDATE: I believe the following is a better solution as opposed to what I was doing above.
var limit = 2500; //for example
enteredText = textareaVariableName.val();
numberOfLineBreaks = (enteredText.match(/\n/g)||[]).length;
left = limit - enteredText.length - numberOfLineBreaks;
if (left < 0) {
//character count over code here
} else {
//character count within limits code here
}
This is basically a JavaScript problem related to the browser. In Firefox, Chrome, or any WebKit-based browser, textareaVariableName.val().length counts only 1 character for a new line (\n). The same goes for the jQuery implementation. But in IE, document.getElementById('textareaVariableName').value.length counts 2 for a new line (\r\n).
In Kentico, the text is validated against the actual character count, and therefore the validation fails.
A quick fix is a simple regular expression that normalizes line breaks before counting the actual length:
function getTextLength(elementId) {
    if (elementId) {
        var elem = document.getElementById(elementId);
        if (elem) {
            var str = elem.value;
            if (str) {
                str = str.replace(/(\r\n|\r|\n)/g, '\r\n');
                return str.length;
            }
        }
    }
    return 0;
}
This should help you to count characters correctly independently of the browser used by the customer.
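The normalization is easy to check outside the browser. A small sketch in Python, assuming (as described above) that the server counts every line break as the two characters \r\n; crlf_length is a hypothetical helper name.

```python
import re

def crlf_length(text):
    """Length of text with every line break counted as \\r\\n."""
    # \r\n must come first in the alternation so an existing CRLF
    # pair is matched as one unit and not doubled.
    normalized = re.sub(r"\r\n|\r|\n", "\r\n", text)
    return len(normalized)
```

For "ab\ncd" this gives 6 where a plain length gives 5, which is the kind of mismatch that makes a 2500-character limit fail on the server.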

Can a pixbuf inserted into a GTK+ text buffer be set as "floating"?

I'm writing an application [a Pidgin plugin, actually], which inserts an image embedded into a GtkTextBuffer. Currently, I add it using:
gtk_text_buffer_insert_pixbuf(textBuffer, &iter, pixbuf);
However, this just puts the image "inline" with the text. What I'm looking for is something similar to HTML's "float". For example, assuming my image is about twice the size of a line of text, I currently get this [where X is the image]:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam gravida
XXXX
XXXX ante in massa dignissim aliquam. Nullam tempus quam luctus eros volutpat laoreet.
XXXX
XXXX sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
Mauris semper, nunc quis gravida molestie,
leo neque imperdiet nulla, vel consectetur nisi nisl non metus. Maecenas pharetra
magna nec magna mattis faucibus convallis nibh
Ideally, I'd like to have:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam gravida
XXXX ante in massa dignissim aliquam. Nullam tempus quam luctus eros volutpat laoreet.
XXXX
XXXX sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
XXXX Mauris semper, nunc quis gravida molestie,
leo neque imperdiet nulla, vel consectetur nisi nisl non metus. Maecenas pharetra
magna nec magna mattis faucibus convallis nibh
Note that there are four paragraphs, where the second and third have an image in the beginning.
Is this possible?
The short answer is no; images in TextView are just treated as a character (which may be a lot bigger than a usual character). There isn't any layout engine in the HTML sense. (Layout is limited to what PangoLayout can do.)
You could probably hack something together, using an approach such as:
leave a margin the size of the image on your paragraph
add an expose event handler to paint the image to the window (see the "border windows" examples which are I think in gtk-demo or the docs somewhere, but draw to the main window not border windows)
Some amount of work, but it would probably get the job done.