Scheduling an R Script - GitHub Actions

Main references:

Scheduling an R script with GitHub Actions

I am working on a project where I want to run an R function every week. With GitHub Actions, this job can be scheduled to run remotely. This is easiest to do if your function lives within the directory structure of an R package.

To set up a scheduled job with GitHub Actions, you need to create a workflow, which is a .yaml file stored in the .github/workflows/ directory of your repository.

The .yaml file I used is given below:

on:
  schedule:
    - cron: "40 8 * * TUE"

name: schedule-train-model

jobs:
  ctvsuggesttrain-train-model:
    runs-on: ubuntu-latest
    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
    steps:
      - name: Checkout repo
        uses: actions/checkout@v3

      - uses: r-lib/actions/setup-r@v2

      - uses: r-lib/actions/setup-r-dependencies@v2

      - name: Run Train_model
        run: CTVsuggestTrain::Train_model(save_output = TRUE, save_path = "OUTPUT/")
        shell: Rscript {0}

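      - name: Commit and Push model output
        run: |
          git config --local user.email "actions@github.com"
          git config --local user.name "GitHub Actions"
          git add --all
          # Commit only when something is staged, so a run with no new output does not fail
          git diff --cached --quiet || git commit -m "add data"
          git push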
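
The final step pushes a commit back to the repository, which only works if the workflow's GITHUB_TOKEN has write access. Depending on the repository's settings, the token may be read-only by default, so one option is to grant the permission explicitly on the job. The fragment below is a sketch of that addition and is not part of the workflow above:

```yaml
jobs:
  ctvsuggesttrain-train-model:
    runs-on: ubuntu-latest
    # Assumption: only needed if the default token is read-only in this repository
    permissions:
      contents: write
```

The rest of the job (env, steps) would stay as it is.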

The first part:

on:
  schedule:
    - cron: "40 8 * * TUE"

Tells GitHub Actions to run this workflow every Tuesday at 8:40. Note that scheduled workflows use UTC, not your local time zone. This website converts a schedule time into cron schedule expressions.
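
As a rough guide to reading the expression, the five fields are minute, hour, day of month, month, and day of week. The sketch below annotates the same schedule. The workflow_dispatch trigger is an addition that is not in my original workflow, included on the assumption that you may want to start a run by hand from the Actions tab while testing:

```yaml
on:
  schedule:
    # minute  hour  day-of-month  month  day-of-week
    # 40      8     *             *      TUE          -> every Tuesday at 08:40 UTC
    - cron: "40 8 * * TUE"
  # Assumption: optional manual trigger, not part of the original workflow
  workflow_dispatch:
```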

For the rest of the .yaml file I made use of the templates provided by the r-lib/actions repo. The r-lib/actions repository stores actions that make it easy to set up an R environment for GitHub Actions.

For example:

- uses: r-lib/actions/setup-r-dependencies@v2

Installs all of the dependencies listed in the DESCRIPTION file of your repository. I noticed that this part of the job took a substantial amount of time in my case (about 15 minutes). However, the setup-r-dependencies action has a cache input, true by default, that caches packages across runs, so later runs should not need to reinstall everything.
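
To make the caching behaviour visible, the inputs can be spelled out in the step. This is only a minimal sketch: cache: true restates the default, and cache-version is an input the action provides for discarding an old cache by changing the number. The value shown is illustrative.

```yaml
- uses: r-lib/actions/setup-r-dependencies@v2
  with:
    # Default behaviour, written out: reuse installed packages across runs
    cache: true
    # Assumption: bump this number if you ever need to force a fresh install
    cache-version: 1
```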

For documentation on setting up workflows with GitHub Actions, see the About workflows section of the GitHub documentation.

This post is licensed under CC BY 4.0 by the author.