Drupal 7 Feeds importer doesn't turn line breaks into paragraphs on import

I have a D7 site with CKEditor installed and a Text Format that allows <p> tags and has "convert line breaks into HTML" selected. I'm importing a UTF-8 CSV file made from an Excel spreadsheet in which some cells contain several "paragraphs". Semantically, I suppose these are just line breaks. I can see the text broken up into what look like paragraphs in the CSV.
I want this text to be real paragraphs, though. When I do the import and look at a node it created, it looks fine, and when I inspect the text I can see that <p> tags wrap the paragraphs. But if I go to edit the node, CKEditor shows all the text as one big paragraph. How can I get all the paragraphs to show?

In the Feeds importer settings you have the option to change the text format (for example, Filtered HTML) applied to the imported content. The text format determines which HTML tags are filtered out of the content.

pandoc markdown to docx - keep list on one page

I have a markdown list like so:
* Question A
    - Answer 1
    - Answer 2
    - Answer 3
I need to ensure that all the answers (1 - 3) appear on the same page as Question A when I convert the markdown document to docx using pandoc. How can I do this?
Use custom styles in your Markdown and then define those styles in a custom docx template.
It's important to note that Pandoc's documentation states:
Because pandoc’s intermediate representation of a document is less
expressive than many of the formats it converts between, one should
not expect perfect conversions between every format and every other.
Pandoc attempts to preserve the structural elements of a document, but
not formatting details...
Of course, Markdown has no concept of "pages" or "page breaks," so that is not something Pandoc can handle by default. However, Pandoc is aware of docx styles. As the documentation explains:
By default, pandoc’s docx output applies a predefined set of styles
for blocks such as paragraphs and block quotes, and uses largely
default formatting (italics, bold) for inlines. This will work for
most purposes, especially alongside a reference.docx file. However, if
you need to apply your own styles to blocks, or match a preexisting
set of styles, pandoc allows you to define custom styles for blocks
and text using divs and spans, respectively.
If you define a div or span with the attribute custom-style, pandoc
will apply your specified style to the contained elements. So, for
example using the bracketed_spans syntax,
[Get out]{custom-style="Emphatically"}, he said.
would produce a docx file with “Get out” styled with character style
Emphatically. Similarly, using the fenced_divs syntax,
Dickinson starts the poem simply:
::: {custom-style="Poetry"}
| A Bird came down the Walk---
| He did not know I saw---
:::
would style the two contained lines with the Poetry paragraph style.
If the styles are not yet in your reference.docx, they will be defined
in the output file as inheriting from normal text. If they are already
defined, pandoc will not alter the definition.
If you don't want to define the style manually, but would like it applied to every list automatically (or perhaps to every list which follows a specific pattern), you could define a custom filter which applied the style(s) to every matching element in the document.
Of course, that only adds the style names to the output. You still need to define the styles (tell Word how to display elements assigned those styles). As the documentation for the --reference-doc option explains:
For best results, the reference docx should be a modified version of a
docx file produced using pandoc. The contents of the reference docx
are ignored, but its stylesheets and document properties (including
margins, page size, header, and footer) are used in the new docx. If
no reference docx is specified on the command line, pandoc will look
for a file reference.docx in the user data directory (see --data-dir).
If this is not found either, sensible defaults will be used.
To produce a custom reference.docx, first get a copy of the default
reference.docx: pandoc --print-default-data-file reference.docx >
custom-reference.docx. Then open custom-reference.docx in Word, modify
the styles as you wish, and save the file.
When modifying custom-reference.docx in Word, you can add the new custom styles you have used in your Markdown. As @CindyMeister points out in a comment:
Word would handle this using styles, where the Question style would
have the paragraph setting "Keep with Next". the Answer style would
have this as well. A third style, for the last entry, would NOT have
the setting activated. In addition, all three styles would have the
paragraph setting "Keep together" activated.
Finally, when using pandoc to convert your Markdown to a Word docx file, use the option --reference-doc=custom-reference.docx and your custom style definitions will be included in the generated docx file. As long as you also properly identify which elements in the Markdown document get which styles, you should have a list which doesn't get broken across a page break, as long as the entire list fits on one page.
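To make the markup side of this concrete, here is a minimal Python sketch that generates the fenced divs for one question and its answers. It is only an illustration under stated assumptions: the style names Question, Answer and LastAnswer are hypothetical (they mirror the three-style scheme from the comment above) and must be defined in your custom-reference.docx. Note that the answers are emitted as styled paragraphs rather than list items.

```python
from typing import Sequence


def styled_div(text: str, style: str) -> str:
    """Wrap one paragraph in a pandoc fenced div with a custom-style attribute."""
    return f'::: {{custom-style="{style}"}}\n{text}\n:::'


def question_block(question: str, answers: Sequence[str]) -> str:
    """Build Markdown for a question followed by its answers.

    Style names are hypothetical: "Question" and "Answer" are meant to have
    "Keep with Next" set in Word, "LastAnswer" is meant not to.
    """
    blocks = [styled_div(question, "Question")]
    for index, answer in enumerate(answers):
        is_last = index == len(answers) - 1
        blocks.append(styled_div(answer, "LastAnswer" if is_last else "Answer"))
    # Fenced divs are separated from neighbouring blocks by blank lines.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(question_block("Question A", ["Answer 1", "Answer 2", "Answer 3"]))
```

The generated Markdown can then be converted with something like pandoc input.md --reference-doc=custom-reference.docx -o output.docx, where input.md and output.docx are placeholder file names.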

Copy formatted text to clipboard

A simple HTML page has FORMATTED text, nothing fancy: line breaks and italics.
I want to have a button that takes this formatted text and copies it to the clipboard with its formatting intact (the plan is to paste it into a LibreOffice document later).
I couldn't find out how to do it.
I tried ZeroClipboard, along with a suggestion to parse the text, replacing each "<br>" with "\r\n". That does the trick for line breaks, but what about italics? Is there any way to get this functionality?
When you write an italic tag, it is the browser that formats the document and displays the text properly. To copy the text, you should either take the text as already parsed and rendered by the browser, or parse it yourself as you did with the line breaks. For italics, when you find a ... tag you must produce the corresponding italic text yourself. How to do that depends on the language you are using, but I'm sure it can be done.
OK,
It turns out that ZeroClipboard used to have this functionality (rendering the HTML text upon paste), but it has since been disabled.
However, a version that supports it can be found at: https://github.com/botcheddevil/ZeroClipboard
Note:
You may find that in this version, creating the client, binding the Flash object to a component, and handling the events are rather different from what the documentation of the current version of ZeroClipboard describes (https://github.com/zeroclipboard/zeroclipboard/blob/master/docs/instructions.md).

Formatting PHP code for Epub in MS Word

I'm trying to format the PHP code sections of a 700+ page book for Epub conversion. If I use soft returns at the end of the code lines, they get eaten. If I use hard returns (making each line a paragraph), I either get too much space between the lines, or not enough before and after the code section. If I add an empty line before and after the code section, it gets eaten.
There are thousands of lines of code in the book. Is there some way to handle this without manually editing the html file?
Is there some common format for these code sections, such as being wrapped in PHP tags?
If they are wrapped in PHP tags, you can use this, which will wrap each tag set in a <p class="php_code"> element:
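// Wrap one matched <?php ... ?> block in a paragraph with a dedicated class.
function fixPHPcode($matches)
{
    return '<p class="php_code">' . $matches[0] . '</p>';
}
// The "s" modifier lets "." match newlines, so multi-line blocks are matched too.
$data = preg_replace_callback('/<\?php.+?\?>/is', 'fixPHPcode', $data);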
I did try some fairly complex regex transforms, but I've found an easier method that actually works fairly well.
The secret is to create a style based on Word's "HTML Preformatted" style, or if you don't have that style, a style based on Normal that specifies Arial Unicode MS or Courier New as the font, with no proofing, left justified.
Indent with spaces and use soft returns (shift-enter) at the end of each line.
Calibre will produce acceptable Epub and Mobi versions of this. Courier is a crap font for code, but at least it's monospaced so the indents will line up, and people are used to seeing it as a code font.

Eclipse BIRT Report Designer paragraph style different

Let's say I have a bunch of paragraphs coming from a Word file. These paragraphs have different styles applied to them (some are bold, some have a smaller or bigger font size, some are italicized, and they differ in color, font family and so on). Is it possible to add all of these paragraphs to the same Text element in BIRT and apply the style that corresponds to each paragraph, or do I really have to put each paragraph into its own Text element and then apply the style to each Text element in BIRT? Obviously the second approach is more tedious, so I would love to find a solution similar to the first approach.
You can set the text element content as RTF and apply Paragraph Formatting Tags.
Take a look at this document for more information.

Only display one paragraph of text

You can set what the Facebook Share preview says. I would like it to be the first paragraph of my Movable Type entry. The people who make entries sometimes use <p> tags, or they use the rich editor, which puts in two <br /><br /> tags to separate paragraphs.
Is there a way I can have Movable Type detect where the first paragraph ends and only display the first paragraph? I would like to add that to my entry template so it will add some information to my head.
EntryBody has a lot of attributes to help format the output of the tag. You can use those to change the content so it shows up correctly in HTML, JavaScript, PHP, XML or other forms of output.
If you understand how to use regular expressions, you can use that and an additional language, say PHP, to break the body up into an array and only output the first paragraph or element of the array.
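The regular-expression approach can be sketched roughly as follows, in Python rather than PHP. This is an illustration only, not Movable Type functionality: the function name first_paragraph is made up, and it assumes the entry body is available as an HTML string in which paragraphs end either with a closing </p> tag or with two or more consecutive <br> tags, as described in the question.

```python
import re

# A paragraph ends at </p> or at a run of two or more <br> tags.
_BOUNDARY = re.compile(r"</p\s*>|(?:<br\s*/?>\s*){2,}", re.IGNORECASE)
# A single remaining <br> is a line break inside a paragraph.
_BR = re.compile(r"<br\s*/?>", re.IGNORECASE)
# Any other tag is dropped, keeping only its text content.
_TAG = re.compile(r"<[^>]+>")


def first_paragraph(body: str) -> str:
    """Return the plain text of the first non-empty paragraph of an HTML body."""
    for chunk in _BOUNDARY.split(body):
        text = _BR.sub(" ", chunk)       # keep words apart across line breaks
        text = _TAG.sub("", text)        # strip <p>, <em>, and similar tags
        text = " ".join(text.split())    # collapse runs of whitespace
        if text:
            return text
    return ""


if __name__ == "__main__":
    print(first_paragraph("<p>First one.</p><p>Second one.</p>"))
    print(first_paragraph("Intro text.<br /><br />More text."))
```

Regular expressions are a rough tool for HTML, so this is only suitable for the simple markup described in the question; entities in the body are left untouched.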
The simplest thing, though, I would think, would be to do something like
<mt:EntryBody words="100">
That will cut off the entry body after the first 100 words. You could also require users to upload an excerpt with the entry and use the entry excerpt for Facebook, instead.