Take potentially dangerous PDFs, office documents, or images and convert them to a safe PDF
Dangerzone works like this: You give it a document that you don’t know if you can trust (for example, an email attachment). Inside a sandbox, dangerzone converts the document to a PDF (if it isn’t already one), and then converts the PDF into raw pixel data: a huge list of RGB color values for each page. Then, in a separate sandbox, dangerzone takes this pixel data and converts it back into a PDF.
❓ How can a document be dangerous?
For example, if an attacker knows about a security bug in Microsoft Word, they can carefully craft a Word document that, when opened using a vulnerable version of Word, will hack your computer. All they have to do is trick you into opening it, perhaps by sending you a convincing enough phishing email.
This is exactly what Russian military intelligence did during the 2016 US election. First, they hacked a US election vendor known as VR Systems and got their client list. Then they sent 122 emails to VR Systems’ clients (election workers in swing states) from the email address firstname.lastname@example.org, with the attachment New EViD User Guides.docm.
Screenshot of spearphishing email provided to The Intercept from a North Carolina public records request
If any of the election workers who got this email opened the attachment using a vulnerable version of Word in Windows, the malware would have created a backdoor into their computer for the Russian hackers. (We don’t know if anyone opened the document or not, but they might have.)
If you got this email today and opened New EViD User Guides.docm using dangerzone, it would convert it into a safe PDF (New EViD User Guides-safe.pdf), and you could safely open this document in a PDF viewer, without risking getting hacked.
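The naming in this example can be sketched in Python. The helper name `safe_pdf_name` is hypothetical, and the rule it applies (drop the final extension, append `-safe.pdf`) is inferred only from the single example above, not taken from dangerzone's source code.

```python
from pathlib import PurePath


def safe_pdf_name(original: str) -> str:
    """Return an output file name for a converted document.

    Hypothetical helper: the "-safe.pdf" suffix rule is inferred from the
    example in the text, not from dangerzone's actual implementation.
    """
    # .stem drops only the final extension: "report.v2.docx" -> "report.v2"
    return f"{PurePath(original).stem}-safe.pdf"


print(safe_pdf_name("New EViD User Guides.docm"))
```

Whatever the input type is, the output name always ends in `.pdf`, which matches the idea that every supported document comes out as a PDF.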
You can also install dangerzone for Mac using Homebrew:
brew cask install dangerzone
🌠 Some features
- Sandboxes don’t have network access, so if a malicious document can compromise one, it can’t phone home
- Dangerzone can optionally OCR the safe PDFs it creates, so they have a text layer again
- Dangerzone compresses the safe PDF to reduce file size
- After converting, dangerzone lets you open the safe PDF in the PDF viewer of your choice. This means you can make dangerzone the default application for PDFs and office docs, so you never accidentally open a dangerous document
❓ How does dangerzone work?
Dangerzone uses Linux containers (two of them), which are sort of like quick, lightweight virtual machines that share the Linux kernel with their host. The easiest way to get containers running on Mac and Windows is by using Docker Desktop. So when you first install dangerzone, if you don’t already have Docker Desktop installed, it helps you download and install it.
When dangerzone starts containers, it disables networking, and the only file it mounts is the suspicious document itself. So if a malicious document hacks the container, it doesn’t have access to your data and it can’t use the internet, so there’s not much it could do.
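As a rough illustration of those two restrictions, the sketch below builds a `docker run` command line with networking disabled and a single file mounted. The `--network none` and `-v` options are standard Docker flags, but the image name `example/converter`, the in-container path `/tmp/input_file`, and the function name are assumptions made for this example; they are not dangerzone's actual values.

```python
import os
from typing import List


def sandbox_command(document: str, image: str = "example/converter") -> List[str]:
    """Build an argv list for running a converter container.

    Assumptions (not from dangerzone): the image name and the path the
    document is mounted at inside the container.
    """
    host_path = os.path.abspath(document)
    return [
        "docker", "run",
        "--rm",                               # remove the container when it exits
        "--network", "none",                  # no network access inside the sandbox
        "-v", f"{host_path}:/tmp/input_file", # the only mount: the suspicious document
        image,
    ]


print(" ".join(sandbox_command("suspicious.docx")))
```

Building the command as a list rather than a single string keeps a file name with spaces or shell metacharacters from being reinterpreted when the list is handed to something like `subprocess.run`.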
Here’s how it works. The first container:
- Mounts a volume with the original document
- Uses LibreOffice or GraphicsMagick to convert original document to a PDF
- Uses poppler to split PDF into individual pages, and to convert those to PNGs
- Uses GraphicsMagick to convert PNG pages to RGB pixel data
- Stores RGB pixel data in separate volume
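To make "RGB pixel data" concrete: a page stored this way is nothing but color values, so fonts, macros, scripts, and embedded objects have nowhere to live. The sketch below builds a tiny fake page in pure Python. The function name and the layout (8 bits per channel, R then G then B, rows from top to bottom) are assumptions for illustration, not dangerzone's documented format.

```python
from typing import Tuple


def solid_page(width: int, height: int, rgb: Tuple[int, int, int]) -> bytes:
    """Return raw pixel data for a page filled with a single color.

    Assumed layout: one byte each for R, G, B per pixel, rows top to bottom.
    """
    r, g, b = rgb
    return bytes([r, g, b]) * (width * height)


page = solid_page(4, 3, (255, 255, 255))  # a 4x3 white "page"
print(len(page))  # width * height pixels, 3 bytes each
```

Note that the byte string carries no width or height of its own, so in a layout like this the dimensions have to be recorded separately for anything that reads the data back.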
Then that container quits. A second container starts and:
- Mounts a volume with the RGB pixel data
- If OCR is enabled, uses GraphicsMagick to convert RGB pixel data into PNGs, and Tesseract to convert PNGs into searchable PDFs
- Otherwise uses GraphicsMagick to convert RGB pixel data into flat PDFs
- Uses poppler to merge PDF pages into a single multipage PDF
- Uses ghostscript to compress final safe PDF
- Stores safe PDF in separate volume
Then that container quits, and the user can open the newly created safe PDF.
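The second container starts from raw pixels only, so whatever rebuilds the pages needs the page dimensions and should check that they match the data. As a minimal sketch of that idea (the text says dangerzone itself uses GraphicsMagick for this step, so this is a stand-in, not its code), the function below wraps raw RGB bytes in a binary PPM header, one of the simplest image formats. The function name is hypothetical.

```python
def rgb_to_ppm(pixels: bytes, width: int, height: int) -> bytes:
    """Wrap raw 8-bit RGB pixel data in a binary PPM (P6) image.

    Illustrative stand-in for the pixel-to-image step; not dangerzone's code.
    """
    expected = width * height * 3
    if len(pixels) != expected:
        # Refuse data whose size disagrees with the claimed dimensions.
        raise ValueError(f"expected {expected} bytes, got {len(pixels)}")
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + pixels


ppm = rgb_to_ppm(bytes(12), 2, 2)  # a 2x2 black image
print(len(ppm))
```

The length check matters because the input comes from a sandbox that may have been compromised: the rebuilt image should contain exactly the pixels the dimensions call for and nothing extra.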
Dangerzone can convert these types of document into safe PDFs:
- PDF
- Microsoft Word
- Microsoft Excel
- Microsoft PowerPoint
- ODF Text
- ODF Spreadsheet
- ODF Presentation
- ODF Graphics
- JPEG
- GIF
- PNG
- TIFF
Dangerzone was inspired by Qubes trusted PDF, but it works in non-Qubes operating systems. It uses containers as sandboxes instead of virtual machines (using Docker for macOS, Windows, and Debian/Ubuntu, and podman for Fedora).
Set up a development environment by following these instructions.
The git repository for the container is called dangerzone-converter.