Are you looking to automate your document conversion tasks using GitHub Actions? Then look no further! In this article, we will guide you through the process of using Pandoc, the universal markup converter, seamlessly with GitHub Actions. Let’s dive into some practical examples that will make your automation flow as smooth as silk.
What is GitHub Actions?
GitHub Actions is GitHub’s built-in automation and CI/CD platform. It lets developers run code automatically on GitHub-hosted runners whenever certain events occur, such as code pushes or pull requests. By harnessing this capability, you can automate tasks like converting Markdown files to PDF or HTML using Pandoc.
Using Docker Pandoc Images Directly
In a GitHub Actions workflow, a step can reference a Docker image directly, so you do not need a separate, dedicated Pandoc action. If your conversion requires LaTeX (for PDF output), use the `docker://pandoc/latex` image. If you only need basic functionality, the `docker://pandoc/core` image will suffice.
It is recommended to pin the Pandoc version explicitly, such as `docker://pandoc/core:2.9`, to avoid potential breaking changes in the future. You can check the available versions on Docker Hub.
Simple Usage
Using Pandoc inside GitHub Actions is as easy as pie! You can run it just like you would from the command line. Here’s how:
```yaml
name: Simple Usage

on: push

jobs:
  convert_via_pandoc:
    runs-on: ubuntu-22.04
    steps:
      - uses: docker://pandoc/core:2.9
        with:
          args: "--help" # gets appended to pandoc command
```
Long Pandoc Calls
Pandoc commands can become quite lengthy. However, the `args` value passed to a step must be a single string. To keep long commands readable, use a YAML folded block scalar with the strip chomping indicator (`>-`), which joins the lines with single spaces and drops the trailing newline:
```yaml
name: Long Usage

on: push

jobs:
  convert_via_pandoc:
    runs-on: ubuntu-22.04
    steps:
      - run: echo foo > input.txt # create an example file
      - uses: docker://pandoc/core:2.9
        with:
          args: >- # folds the lines below into a single string
            --standalone
            --output=index.html
            input.txt
```
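To try the same conversion outside of CI, the multi-line arguments map onto an ordinary command line. The following is a minimal sketch, assuming Pandoc is installed locally; the scratch directory and the `workdir` variable are only there for illustration. Backslash continuations do the job that line folding does in the workflow file.

```shell
set -eu

# Work in a scratch directory so nothing in the current folder is touched.
workdir="$(mktemp -d)"
cd "$workdir"

echo foo > input.txt # same example file as in the workflow

# The shell still sees one command with three arguments,
# just as the workflow step receives one args string.
if command -v pandoc >/dev/null 2>&1; then
  pandoc \
    --standalone \
    --output=index.html \
    input.txt
  ls -l index.html
else
  echo "pandoc not found; skipping the conversion" >&2
fi
```

If Pandoc is not installed locally, the same arguments can be handed to the `pandoc/core` container instead.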
Advanced Usage
A more advanced workflow can:
- Create an output directory to streamline deployment.
- Upload the output directory to GitHub’s artifact storage for easy access later.
Here is an example of how to create an output directory and upload it:
```yaml
name: Advanced Usage

on: push

jobs:
  convert_via_pandoc:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
      - name: create file list
        id: files_list
        run: |
          echo "Lorem ipsum" > lorem_1.md # create two example files
          echo "dolor sit amet" > lorem_2.md
          mkdir output # create output dir
          # quote each name and separate with spaces;
          # the glob also picks up other *.md files in the repository, such as README.md
          echo "files=$(printf '"%s" ' *.md)" >> "$GITHUB_OUTPUT"
      - uses: docker://pandoc/latex:2.9
        with:
          args: --output=output/result.pdf ${{ steps.files_list.outputs.files }}
      - uses: actions/upload-artifact@v4
        with:
          name: output
          path: output
```
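The file list step depends on how `printf` repeats its format string for every argument: a format without a separator would run the file names together into one word. The sketch below reproduces that step locally so the value can be inspected before it reaches Pandoc. It assumes a scratch directory, and `GITHUB_OUTPUT` is pointed at a temporary file as a stand-in for the file GitHub provides on a runner.

```shell
set -eu

# Scratch directory, so the glob only sees the two example files.
workdir="$(mktemp -d)"
cd "$workdir"

# Stand-in for the output file GitHub provides to each step (assumption for local runs).
GITHUB_OUTPUT="$workdir/github_output.txt"

echo "Lorem ipsum" > lorem_1.md
echo "dolor sit amet" > lorem_2.md

# printf applies the format once per file name: each name is quoted and followed by a space.
files="$(printf '"%s" ' *.md)"
echo "files=$files" >> "$GITHUB_OUTPUT"

cat "$GITHUB_OUTPUT" # prints: files="lorem_1.md" "lorem_2.md"
```

Quoting each name keeps file names that contain spaces intact when the list is later expanded into the `args` string.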
Troubleshooting Common Issues
If you encounter errors such as “Could not find data file template” while using GitHub Actions, it is because GitHub overrides the value of `$HOME` inside the container, so Pandoc no longer looks in the directory where the image keeps its templates. A workaround is to pass the full path of the template in your workflow:
```yaml
- uses: docker://pandoc/extra:3.1.1.0
  with:
    args: content_cv.md --output=content_cv.pdf --template /.pandoc/templates/eisvogel.latex --listings -V block-headings
```
Moreover, if you need Pandoc available globally for subprocess usage, consider using:
- pandoc/actions/setup (supports Linux and macOS)
- r-lib/actions/setup-pandoc (supports Linux, macOS, and Windows)