Attaching files to a PDF
New in version 3.0.
You can attach (or if you prefer, embed) any file to a PDF, including other PDFs. As a quick example, let’s attach pikepdf’s README.md file to one of its test files.
In [1]: from pikepdf import Pdf, AttachedFileSpec
In [2]: from pathlib import Path
In [3]: pdf = Pdf.open('../tests/resources/fourpages.pdf')
In [4]: filespec = AttachedFileSpec.from_filepath(pdf, Path('../README.md'))
In [5]: pdf.attachments['README.md'] = filespec
In [6]: pdf.attachments
Out[6]: <pikepdf._qpdf.Attachments with 1 attached files>
This creates an attached file named README.md, which holds the data in filespec.
Now we can retrieve the data.
In [7]: pdf.attachments['README.md']
Out[7]: <pikepdf._qpdf.AttachedFileSpec for '../README.md', description ''>
In [8]: pdf.attachments['README.md'].get_file()
Out[8]: <pikepdf._qpdf.AttachedFile objid=(15, 0) size=8037 mime_type=text/markdown creation_date=2021-11-19 20:18:33 mod_date=2021-10-18 20:23:57>
In [9]: pdf.attachments['README.md'].get_file().read_bytes()[:50]
Out[9]: b'pikepdf\n=======\n\n**pikepdf** is a Python library f'
General notes on attached files
If the main PDF is encrypted, any embedded files will be encrypted with the same encryption settings.
PDF viewers tend to display attachment filenames in alphabetical order. Use prefixes if you want to control the display order.
The AttachedFileSpec will capture all of the data when created, so the file object used to create the data can be closed.
Each attachment is a pikepdf.AttachedFileSpec. An attachment usually contains only one pikepdf.AttachedFile, but might contain multiple objects of this type. Usually, multiple versions are used to provide different versions of the same file for alternate platforms, such as Windows and macOS versions of a file. Newer PDFs rarely provide multiple versions.