Examples in pdfScript.7z.
Updates February 2025:
Here are two alternatives to docto and pdftk. (docto and pdftk are still working well for me, so I see no need to switch if they are working well for you.)
1. LibreOffice supports exporting to PDF from the command line. See
https://help.libreoffice.org/latest/en-US/text/shared/guide/pdf_params.html
A simple example is provided in simple-compile-libreOffice.sh in pdfscript.7z.
2. Ghostscript can combine PDFs. For examples, see
https://askubuntu.com/a/257170
or
https://gist.github.com/brenopolanski/2ae13095ed7e865b60f5
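The two alternatives above might look roughly like this. This is only a sketch, not the contents of simple-compile-libreOffice.sh: the function names word_to_pdf and gs_combine are made up for illustration, and it assumes the LibreOffice launcher is on your PATH as soffice and Ghostscript as gs (on Windows the console executable is often named gswin64c instead).

```shell
#!/bin/bash
# Sketch: LibreOffice in place of docto, Ghostscript in place of pdftk.
# Assumptions: "soffice" and "gs" are the executable names on your system.

# Export one office document to PDF; the PDF is written into outdir.
word_to_pdf() {
    local src="$1" outdir="${2:-.}"
    if ! command -v soffice >/dev/null 2>&1; then
        echo "soffice not found on PATH" >&2
        return 127
    fi
    # --headless: no GUI; --convert-to pdf: write <basename>.pdf to --outdir
    soffice --headless --convert-to pdf --outdir "$outdir" "$src"
}

# Merge PDFs in the order given: gs_combine OUTPUT.pdf INPUT.pdf...
gs_combine() {
    if [ "$#" -lt 2 ]; then
        echo "usage: gs_combine OUTPUT.pdf INPUT.pdf..." >&2
        return 2
    fi
    local out="$1"
    shift
    # -dBATCH -dNOPAUSE: run non-interactively; pdfwrite: produce a PDF
    gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile="$out" "$@"
}

# Usage (file names borrowed from the example further down):
#   word_to_pdf MyWord.docx
#   gs_combine combined.pdf MyWord.pdf myTeX.pdf existingPDF.pdf
```

I have wrapped the commands in functions only so the script can complain clearly when a tool or an argument is missing; the one-line soffice and gs commands work just as well on their own.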
Original:
Suppose you want to create one PDF from several source files, including Word, LaTeX and existing PDFs. Exporting each to PDF separately and then combining in Acrobat is error-prone and a time-consuming pain even when it works.
Here is how you can do this -- replicably, with one command -- in a bash script. You will need two simple, free tools:
docto, to export Word and Excel files to PDF: http://tobya.github.io/DocTo/.
pdftk server, to compile multiple PDFs into one: https://www.pdflabs.com/tools/pdftk-server/.
You will also need latexmk, which is included in texlive-collection-binextra from texlive (as well as other sources).
An example script: simple-compile.sh.
#!/bin/bash
docto -F 'MyWord.docx' -O 'MyWord.pdf' -T wdFormatPDF
latexmk -pdf -silent myTeX.tex
pdftk A='MyWord.pdf' B='myTeX.pdf' C='existingPDF.pdf' cat A B C output 'combined.pdf'
Here's how it works:
Step 1: export the individual documents to PDF:
Step 1.a: use docto to export MyWord.docx to MyWord.pdf:
docto -F 'MyWord.docx' -O 'MyWord.pdf' -T wdFormatPDF
Step 1.b: create any other pdfs you want, for example from a .tex file using latexmk:
latexmk -pdf -silent myTeX.tex
Step 2: use pdftk to combine the PDFs.
pdftk A='MyWord.pdf' B='myTeX.pdf' C='existingPDF.pdf' cat A B C output 'combined.pdf'
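If you want the script to stop as soon as one step fails, rather than combining stale PDFs, one possible variation is below. It is a sketch, not part of simple-compile.sh: require_files and build are names I made up, and it assumes each tool reports failure through a non-zero exit status, which I have not verified for docto.

```shell
#!/bin/bash
# Sketch: the same three steps, chained with && so a failure stops the build.
# Assumption: docto, latexmk and pdftk each exit non-zero when they fail.

# Return 1 (with a message) if any listed input file is missing.
require_files() {
    local f
    for f in "$@"; do
        [ -f "$f" ] || { echo "missing input: $f" >&2; return 1; }
    done
}

build() {
    require_files MyWord.docx myTeX.tex existingPDF.pdf &&
    docto -F 'MyWord.docx' -O 'MyWord.pdf' -T wdFormatPDF &&
    latexmk -pdf -silent myTeX.tex &&
    pdftk A='MyWord.pdf' B='myTeX.pdf' C='existingPDF.pdf' cat A B C output 'combined.pdf'
}

# Usage: run "build" from the directory that holds the source files.
```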
This is the basic idea; a bunch of caveats and details follow. Comments, improvements, corrections and suggestions are all welcome.
You may need to edit your PATH variable so that your shell can find docto and pdftk.
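For example, something like the following near the top of your script (or in your ~/.bashrc) adds the tools to PATH. Both directories are hypothetical Cygwin-style paths; substitute wherever docto and pdftk actually live on your machine.

```shell
# Sketch: append the tool directories to PATH for the current shell session.
# Both directories are hypothetical; use your real install locations.
DOCTO_DIR="/cygdrive/c/tools/docto"
PDFTK_DIR="/cygdrive/c/Program Files (x86)/PDFtk Server/bin"
export PATH="$PATH:$DOCTO_DIR:$PDFTK_DIR"

# Report any tool the shell still cannot find.
for tool in docto pdftk; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool: still not on PATH" >&2
done
```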
I use Windows 11 with docto v1.8, pdftk server 2.02, Microsoft 365 Version 2312, Cygwin 3.4.9, and texlive 2023 (with latexmk from texlive-collection-binextra).
You do not need to have Acrobat installed to use docto, but you do need to have Word (or Excel, or PowerPoint) installed. Very old versions of Office may not work. (See update above for an option using LibreOffice.)
The docto switches (-F, -O, -T) are described in the docto help (http://tobya.github.io/DocTo/ or docto -h at your prompt).
wdFormatPDF tells the Microsoft Office API that you want to convert Word to PDF. See https://learn.microsoft.com/en-us/dotnet/api/microsoft.office.interop.word.wdsaveformat?view=word-pia. There are some shortcuts available, e.g., "17" is equivalent to "wdFormatPDF".
I have only tried this with Word documents. The docto help describes how to convert Excel or PowerPoint to PDF. See http://tobya.github.io/DocTo/, or docto -h XL and docto -h PP at the command prompt. I'm not sure how docto handles things like custom page breaks in Excel.
I believe that docto uses Word's built-in PDF exporter. This may not be able to do everything the Acrobat plugin can do.
pdftk help is at https://www.pdflabs.com/docs/pdftk-man-page/ or pdftk -h at the command line.
The docto and pdftk websites have additional documentation, tips and examples.
All credit for docto and pdftk goes to their creators.
docto: github.com/tobya/DocTo
pdftk: www.pdflabs.com/tools/pdftk-...