How to make the LibreOffice build as small as possible? - libreoffice

I'm planning to use LibreOffice only to convert files from one format to another, e.g.:
soffice --headless --convert-to doc --outdir D:\output D:\input\filename.docx
So, for example, I don't need GUI support at all.
What flags/options should I pass to ./autogen and make to keep the build as small as possible?
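For reference, the only use case the build has to support can be sketched as a small Python wrapper around the command above. This is a minimal sketch, not part of the build itself: the function names are hypothetical, and it assumes soffice is on the PATH.

```python
import subprocess


def build_convert_command(input_file, target_format, outdir, soffice="soffice"):
    """Return the argv list for a headless LibreOffice conversion.

    `soffice` is assumed to be on the PATH; pass a full path otherwise.
    """
    return [
        soffice,
        "--headless",
        "--convert-to", target_format,
        "--outdir", str(outdir),
        str(input_file),
    ]


def convert(input_file, target_format, outdir, soffice="soffice"):
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    cmd = build_convert_command(input_file, target_format, outdir, soffice)
    subprocess.run(cmd, check=True)
```

Calling convert(r"D:\input\filename.docx", "doc", r"D:\output") would run the same command as shown above.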

Related

Is there a way to use Microsoft Word Save as -> html from command line or python?

I am trying to automate an existing process that uses html files produced from docx for post-processing.
I know Word produces a lot of garbage when converting to html, and I have already tested Mammoth https://github.com/mwilliamson/mammoth.js/, but the process that takes the html file as input is designed for the html format produced by Word, and I cannot touch that process.
My question is: is there a way to call Save As -> html in Word from the command line, or possibly from Python? Something similar to what exists in OpenOffice with:
soffice --convert-to html:HTML
I also know about win32com, but that package exists only in Python for Windows, not for Mac or other platforms such as Linux.

Is there any way to convert lammps_file.data to Gromacs files (top and gro) or, failing that, to CHARMM files (psf and pdb)?

I have a lammps_file.data and I need to convert it to Gromacs files (gro and top) to run my simulations.
Does anyone know how to do this?
Another choice is to convert from lammps to charmm files (psf and pdb). Once I get the charmm files I can just use Topotools to get the gromacs files I need.
Thanks
Indeed, I am now trying to do the same myself.
So far, you can use InterMol, which should work fine for converting LAMMPS data files to Gromacs files. Once you have installed InterMol and set up a path to the InterMol converter ($conv below), you can use a command like:
python2.7 $conv/convert.py --lmp_in topology.data --gromacs -v
Check the format of your data file; I am still having problems converting mine.
If you wish to create the psf file, you will need VMD (google it). Open its Tcl terminal and type:
topo readlammpsdata topology.data full
animate write psf topology.psf
The first line loads your LAMMPS data file, assuming you are in the folder where that file is located.
The second line writes the data out as a CHARMM psf file.
Also, you could try this. In this paper, they provide a tool to convert CHARMM topologies to Gromacs here. Thus, you convert to psf, then to gro and top.

Scale down PDF page size with Perl

I process PDF files in an existing Perl framework. Some of the incoming files have very large page sizes. What I would like to do:
Check if the input file's page size is larger than a specific format f (A4/Letter)
If it is larger than f: scale it down to f.
This is the corresponding command in Unix using Ghostscript:
gs -sPAPERSIZE=a4 -dFIXEDMEDIA -dPSFitPage -o <outputFile> -sDEVICE=pdfwrite <inputFile>
Is there a way to do this within Perl, i.e. without calling external tools?
I checked the modules PDF and CAM::PDF, but the documentation does not really cover my issue and I couldn't find a straightforward solution.
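To make the check in the two steps above concrete, here is the arithmetic as a minimal sketch. It is written in Python only for illustration and ports directly to Perl; it does not read or write PDF files. The function name is hypothetical, the page sizes are the usual values in PostScript points, and rotating the target to match the page orientation is my assumption about the desired behaviour.

```python
# Page sizes in PostScript points (1 pt = 1/72 inch), portrait orientation.
A4 = (595.0, 842.0)      # 210 x 297 mm, rounded to whole points
LETTER = (612.0, 792.0)  # 8.5 x 11 in


def fit_scale(page, target=A4):
    """Return the uniform scale factor that fits `page` inside `target`.

    The result is 1.0 when the page already fits, so pages are never enlarged.
    The target is rotated first so its orientation matches the page's.
    """
    width, height = page
    t_width, t_height = target
    if (width > height) != (t_width > t_height):
        t_width, t_height = t_height, t_width
    return min(1.0, t_width / width, t_height / height)
```

A page needs scaling exactly when fit_scale returns a value below 1.0; that factor would then be applied to the page content and to the page box.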

Convert Impress ODP Presentation to several JPG images from command line

I would like to use OpenOffice or LibreOffice to convert a presentation made with Impress (an odp file, but it might be a PowerPoint ppt, too) to jpg images.
For example, given an odp presentation with 10 slides, I want to get 10 jpeg images, one for each slide.
I tried with :
soffice --headless --convert-to jpg presentation.odp
This works perfectly, but I only get the very first slide of my presentation, not all of them. I need all of them.
I don't know if there's an option to tell soffice to convert all the slides instead of just the first one.
I know there are other ways, like converting to pdf and then using ImageMagick (IM), but I want to solve this using soffice. I'm doing everything under Ubuntu Linux.
Thanks in advance.
Juan
I'm going to answer my own question.
To convert .odp files to images in bulk from the command line under Linux, I do:
soffice --headless --convert-to pdf presentation.odp
Then:
convert -density 400 presentation.pdf -resize 800x600 my_filename%d.jpg
This solution works, but it needs some improvement to make it faster and to keep it from failing when hardware resources run short.
But if your odp is not that big, this converts odp/ppt/pptx/whatever to images in bulk, it is scriptable, and it uses only the CLI.
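Since it is scriptable, the two steps can be chained. The following is only a sketch under assumptions: the function names and the default output prefix are hypothetical, soffice and ImageMagick's convert are assumed to be on the PATH, and the intermediate PDF is assumed to land in the current directory under the input's base name.

```python
import subprocess
from pathlib import Path


def build_pipeline(presentation, prefix="slide", density=400, size="800x600"):
    """Return the two argv lists: presentation -> PDF, then PDF -> JPG pages."""
    # Assumption: without --outdir, soffice writes <basename>.pdf to the
    # current working directory.
    pdf = Path(presentation).with_suffix(".pdf").name
    to_pdf = ["soffice", "--headless", "--convert-to", "pdf", str(presentation)]
    # %d is expanded by ImageMagick to the page index (one image per slide).
    to_jpg = ["convert", "-density", str(density), pdf,
              "-resize", size, f"{prefix}%d.jpg"]
    return to_pdf, to_jpg


def run_pipeline(presentation, **options):
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_pipeline(presentation, **options):
        subprocess.run(cmd, check=True)
```

Lowering density is the obvious knob if the ImageMagick step runs out of memory on large presentations.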

How to generate Microsoft Word documents using Sphinx

Sphinx supports a few output formats:
Multiple HTML files (with html or dirhtml)
LaTeX, which is useful for creating .pdf or .ps
text
How can I obtain output in a Microsoft Word file instead?
With another doc generator I managed to generate a single html output file and then convert it to Microsoft Word format using the Word application.
Unfortunately I don't know a way to generate either Word or the HTML single-page format.
The solution I use is the singlehtml builder, as andho mentioned in the comments, followed by converting the html to docx using pandoc.
The following sample assumes the generated html is located at _build/singlehtml/index.html:
make singlehtml
cd _build/singlehtml/
pandoc -o index.docx index.html
There is a Sphinx extension for generating docx format (which I haven't tested) and a newer one (which I also haven't tested, but looks like it is more actively maintained)
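To convert reStructuredText files to MS Word doc, I use rst2odt and then unoconv. See the following script:
#!/bin/sh
# Usage: pass the reStructuredText file as the first argument.
# Convert it to an intermediate ODT file.
rst2odt "$1" "$1.odt"
# Convert the ODT file to MS Word doc.
unoconv -f doc "$1.odt"
# Remove the intermediate ODT file.
rm "$1.odt"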
With rst2odt you can use your own stylesheet. unoconv comes with OpenOffice and also lets you apply an OpenOffice style (template) during the conversion. Simply edit a converted document, change styles, add headers and footers, save that as an ODF Text Document Template (OTT) and use it as part of the conversion, like:
unoconv -f doc -t template.ott "$1.odt"
You can then reuse that template for other conversions later on.
I realize this is an old question, but I found that LibreOffice supports the following way of doing conversion (assuming soffice.exe is in your path):
soffice.exe --invisible --convert-to doc myInputFile.odt
Some things I have read say to use the --headless option rather than --invisible. Both seem to work on Windows.
You can start with the rst2odt.py script and then do the above to convert to an MS Word document.
Here is a link with additional start up options for LibreOffice:
http://help.libreoffice.org/Common/Starting_the_Software_With_Parameters
Here is a link with file types supported by OpenOffice which, I believe, LibreOffice should also support:
http://wiki.services.openoffice.org/wiki/Framework/Article/Filter/FilterList_OOo_3_0
This answer is not a command line solution and it is obviously not the best, but it simply works for me and saves me time. After generating the html file, you can open it in a browser, copy the entire page (Ctrl+A and Ctrl+C), then run Microsoft Office (or use the live version if you don't have Microsoft Windows, like me) and paste (Ctrl+V) into it.
The best option might be rst -> odt -> doc:
Convert the Sphinx documents into OpenOffice format.
Then open the odt with OpenOffice and save it as Word. But I don't know how to do this automatically.
This is a workaround using Calibre (https://calibre-ebook.com), which includes a powerful converter. This worked well and most of the formatting is preserved:
Generate epub output in Sphinx: make epub
Import the epub output into Calibre and then convert epub to docx using the inbuilt ebook converter.
This answer is too late for the original question, but people facing the same problem may find it useful.
I don't know what Sphinx is, but you could create an rtf file or html file or something similar.
See the following blogpost for more information/approaches : OFFICE AUTOMATION
and from there : How to use ASP to generate a Rich Text Format (RTF) document to stream to Microsoft Word
This article describes how you can generate Rich Text Format (RTF) files with ASP script and then stream those files to Microsoft Word. This technique provides an alternative to server-side Automation of Microsoft Word for run-time document generation.
You don't use ASP script (who does? :-) ), but it illustrates the idea.