Letting an LLM Review Our Pull Requests (So You Don’t Have To)

created: Friday, Aug 8, 2025

We love automation. We use it to power our infrastructure, to scale workloads down to zero, and—increasingly—to shrink the amount of human attention needed to ship high-quality code. One place that still felt stubbornly manual was pull-request reviews. Between Cursor as our IDE, ChatGPT/Codex for prototyping, and gemini-cli for quick checks, our local workflows were fast—but CI still waited for a human.

So we asked a simple question: could we let a large language model read the diff, spot issues, and comment directly on the PR?

Turns out: yes. It took just a few lines of GitHub Actions glue to get helpful, structured reviews on every pull request.


The goal

We weren’t trying to replace humans. We wanted a fast first pass that reads the diff, flags potential issues, and points at the exact files and lines involved. If a change is fine, we want the bot to simply say so and get out of the way.


The tools in our stack

- Cursor as our IDE
- ChatGPT/Codex for prototyping
- gemini-cli for quick local checks, and now for CI reviews
- GitHub Actions (plus the gh CLI) to run it on every pull request

The workflow, end to end

Here’s the full Action we’re running. Drop it into .github/workflows/gemini-pr.yml:

name: gemini-pr
on:
  workflow_dispatch:
  pull_request:
jobs:
  build:
    permissions: write-all
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
      with:
        submodules: 'true'
        fetch-depth: 0
    - uses: actions-rust-lang/setup-rust-toolchain@v1
      with:
        components: rustfmt, clippy
        cache: false
    - uses: actions/setup-node@v4
      with:
        node-version: 20
    - name: install gemini
      run: |
        npm install -g @google/gemini-cli
    - name: gemini
      run: |
        echo "merging into ${{ github.base_ref }}"
        git diff origin/${{ github.base_ref }} > pr.diff
        echo "$PROMPT" | gemini -a > review.md
        cat review.md >> $GITHUB_STEP_SUMMARY
        gh pr comment ${{ github.event.pull_request.number }} --body-file review.md
      env:
        GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
        GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        PROMPT: >
          please review the changes of @pr.diff (this pull request) and suggest improvements or provide insights into potential issues. 
          do not document or comment on existing changes; if everything looks good, just say so.
          can you categorise the changes and improvements into low, medium and high priority?
          whenever you find an issue, please always provide a file and line number as reference information. if multiple files are affected, please provide a list of files and line numbers.
          provide the output in markdown format and do not include any other text.
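The diff step only works because the checkout fetches full history. Here is a minimal, self-contained sketch of what it produces — the temp repo, branch name, and file contents are made up for illustration, and in CI the base would be `origin/${{ github.base_ref }}` rather than a local `main`:

```shell
#!/bin/sh
# Stand-in repo so the sketch runs anywhere; in the real workflow the
# repository comes from actions/checkout with fetch-depth: 0.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q -b main
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "base"
git checkout -q -b feature
echo "fn main() {}" > main.rs
git add main.rs
git -c user.email=ci@example.com -c user.name=ci commit -q -m "add main.rs"

# The part the workflow runs: diff the PR branch against its base.
# In CI this is `git diff origin/${{ github.base_ref }} > pr.diff`.
git diff main > pr.diff
grep '^+fn main' pr.diff
```

Without `fetch-depth: 0`, the base branch’s objects aren’t available in the shallow clone and the diff step fails, so that checkout option isn’t optional.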

What each part does

- actions/checkout with fetch-depth: 0 pulls full history, so git diff origin/<base> has something to diff against.
- setup-rust-toolchain and setup-node prepare the environment; the Node step is what lets us npm install -g @google/gemini-cli.
- The gemini step writes the diff to pr.diff, pipes the prompt through gemini-cli, appends the review to the step summary, and posts it on the PR with gh pr comment.

The prompt that makes it useful

LLM outputs are only as good as the instructions, so ours keeps things practical: review only what changed, categorise findings by priority, cite files and line numbers, and answer in plain markdown. We iterated a bit to reach this version; the most impactful tweaks were insisting on file/line references and forbidding extra prose.


What the review looks like

[Screenshot: a GitHub Actions bot comment on a pull request, showing various flagged errors]

On a typical PR, the comment groups its findings into high-, medium-, and low-priority sections, each item carrying the file and line references the prompt asks for.

If everything’s fine, we get a one-liner: “Looks good.” Perfect—that’s exactly what we want.


Gotchas and practical notes
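One easy-to-miss shell detail in steps like this: always expand `$PROMPT` in double quotes. Unquoted, the shell word-splits the value (collapsing runs of whitespace) and would expand any glob characters before the text ever reaches gemini. A quick demonstration with a made-up prompt value:

```shell
#!/bin/sh
# Unquoted expansion word-splits the value and collapses whitespace;
# quoting passes it through byte-for-byte.
PROMPT='review   @pr.diff   carefully'

unquoted=$(echo $PROMPT)
quoted=$(echo "$PROMPT")

echo "unquoted: $unquoted"   # whitespace runs collapsed
echo "quoted:   $quoted"     # preserved as written
```

With the YAML `>` folded scalar the newlines are already folded to spaces, so the damage here is mild, but `echo "$PROMPT"` (or `printf '%s' "$PROMPT"`) is the safe habit either way.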


Why this matters (beyond convenience)

Automated reviews make humans more selective with their attention: we spend less time on “rename this variable” and more time on architecture, data flows, and security boundaries.

It’s also surprisingly good at consistency. An LLM won’t forget the agreed-upon error-handling pattern between services or our preferred log structure; it applies those checks uniformly on every PR.


Variations you might try

This pattern works with almost any model or CLI: swap gemini-cli for your tool of choice and tune the prompt to your codebase.
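As one example, a hypothetical pre-flight guard (not part of the workflow above) could skip the model call when the diff is too large to fit a context window, rather than sending a truncated review request:

```shell
#!/bin/sh
# Hypothetical guard before the gemini step: bail out early on huge
# diffs instead of overflowing the model's context window.
seq 5000 > pr.diff        # stand-in for `git diff origin/main > pr.diff`

MAX_LINES=4000
lines=$(wc -l < pr.diff)
if [ "$lines" -gt "$MAX_LINES" ]; then
  echo "diff too large ($lines lines); skipping automated review"
else
  echo "$PROMPT" | gemini -a > review.md
fi
```

The 4000-line threshold is an arbitrary placeholder; pick one based on the model and how much of the repository `-a` already pulls into context.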


Results so far

None of this replaces a human approving a merge. It’s a lightweight filter that pays for itself on day one.