GitHub repositories
This section provides information on the git repositories for each OpenSAFELY research project.
The repository, or repo, contains all the analysis scripts, codelists, released outputs, and other research objects needed to understand and run the project.
Changes to the repo are audited using git, a version control system for recording, sharing and collaborating on code.
The repo's canonical location is on GitHub, a website that makes it easier to use git, and adds extra collaboration and security tools on top.
You can download a copy of the repo ("clone"), create a development "branch", make changes ("commit") on that branch, then upload these changes ("push") back to the remote repo on GitHub — for more details see the "How to use Git effectively" page.
GitHub is the means by which code in the repository is passed to the server to be run against the OpenSAFELY database — it is the only entry point between the secure server and the outside world. GitHub is also the means by which approved disclosure-safe outputs are released from the secure server to researchers.
Repository visibility
In accordance with the Principles of OpenSAFELY, we expect all code from all users to be made public.
By default, repositories are public and visible to anyone. On request, a repository may be temporarily set to private, so that it is visible only to members of the OpenSAFELY GitHub organisation (see How to make your repository private, below).
How to make your repository private
You can contact Tech Support at any time to request that your repository be made private.
If your request is approved, Tech Support will make your repository private and you will be notified once this has been completed.
Private repositories will be made public after 12 months. Ahead of that, you can make a new request to extend the private visibility period for another 12 months.
Info
A repository must be made public if it forms part of a publication.
Refer to our information on when you need to make your code public.
How to make your code public
You can request that a private repository is made public at any time by following our process for publishing a repo.
Publishing older repositories that contain results as well as code
In earlier versions of OpenSAFELY, all results released from the secure server after disclosivity checking went directly to the GitHub repository containing the code for the project. For these older repositories, you and the OpenSAFELY team must therefore check, before the repository is made public, that it contains no outstanding results that still require approval from NHS England. If you are unsure whether this applies to your repository, contact publications@opensafely.org.
When you need to make your code public
A repository must be made public if it forms part of a publication. We have a guide to publishing repositories that you must read.
During the development stage of a project, a repository may be kept private, so that only members of the OpenSAFELY GitHub organisation are able to view it. We welcome people sharing code in public while they are developing, where they wish to do so, but we recognise that for many this would be a little like drafting a paper entirely in public, so it is not a requirement.
Even when there is no publication, we expect all repositories to become public within twelve months of first code execution. If, during the pilot phase of OpenSAFELY Users, we encounter edge cases suggesting that a particular repo should be exempted from this policy, we will develop an open and structured Exceptions Process.
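As a rough illustration of the twelve-month expectation, the sketch below computes the latest date by which a repository would be expected to be public, given the date of first code execution. It assumes that "twelve months" means twelve calendar months; the function names `add_months` and `public_deadline` are hypothetical and not part of any OpenSAFELY tooling.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    If the target month has fewer days than `start.day` (for example
    31 January plus one month), the result is clamped to the last day
    of the target month.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def public_deadline(first_code_execution: date) -> date:
    """Latest date by which the repo is expected to be public.

    Assumption: "within twelve months" is read as twelve calendar months.
    """
    return add_months(first_code_execution, 12)


# Example: code first run on 15 March 2023.
deadline = public_deadline(date(2023, 3, 15))
print(deadline.isoformat())
```

If in doubt about how the period applies to your project, ask Tech Support rather than relying on a calculation like this.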
Warning
Never commit to the repository any files or content that should not be made public, such as patient- or commercially-sensitive data from other sources, internal institutional documentation or forms, and incomplete manuscript drafts. All committed files, whether on the main branch or on development branches, will remain in the git history of the repository even after they have been deleted.
Creating a repository for a project
New researchers and projects
When you are approved to start working on an OpenSAFELY research project, you will be added to the OpenSAFELY GitHub organisation. Within the OpenSAFELY GitHub organisation, you'll be added to the researchers team.
Contact Tech Support and ask them to create a new repository for your research, or transfer a repository from your personal GitHub account into the OpenSAFELY GitHub organisation (depending on your preference, and whether you have an existing repository to transfer).
New repositories will be created with the settings listed in Default opensafely repository settings below. A transferred repository's settings will also be updated to match these.
Repositories will initially be public, but may be (temporarily) set to private at your request. See Repository visibility to make the right choice for your study.
Established researchers and projects
Contact Tech Support to request the creation of any additional repositories you require. Please provide a name for the repository when you make a request. Your repository name should be short but informative — browse existing repo names for inspiration.
All repositories will be created using the OpenSAFELY research-template repo. You can see a detailed breakdown of this repository's structure in Repository structure.
Transferring your own repository to the OpenSAFELY GitHub organisation
You may want to start work on a project before approval by creating a repository in your own GitHub account (see instructions for how to do this below).
Warning
Creating a repository owned by your GitHub user account will enable you to:
- work on your OpenSAFELY research code in Codespaces
- check that your research code works with the OpenSAFELY platform
It will not allow you to run code on OpenSAFELY's platform. For that, you would have to request that your repository is transferred to the opensafely organization.
How to transfer an existing repository to the opensafely organization
To transfer a repository from your personal GitHub account to the OpenSAFELY organisation, follow the instructions here. You will then need to contact Tech Support to request approval for the repository transfer.
If your request is approved, Tech Support will notify you once the transfer has been completed. You will also be able to see the repository on the OpenSAFELY organisation Repositories page.
The settings of any transferred repositories will be updated to match the default opensafely repository settings, listed below:
- Deletion of branches on merge: enabled
- Branch protection for master and main branches: enabled
- Require a pull request before merging: disabled
Creating a research repository in your own GitHub account so that you can transfer it later
For ease of use, we have created a research template that you should use for your study. Go to the OpenSAFELY research template repo on GitHub. Click the green button that says Use this template.
Fill in the details:
- owner: Select your personal GitHub account for testing/experimenting.
- repository name: It needs to be short but informative — browse existing repo names for inspiration.
- description: This will appear at the top of the repo on GitHub. No more than a sentence is needed as the repo should be explained fully in the README.md.
- public / private: See Repository visibility to make the right choice for your study.
- Include all branches: Leave unchecked.
Then submit the form. You will now be at the GitHub landing page for the repo.
If you are unsure of what to do, refer to GitHub's step-by-step instructions for creating a new repository from a template.
You should also download a copy of this repo to your machine so you can work on it locally. This lets you:
- develop your code using familiar editing tools
- test and run code without disturbing other contributors
To clone your new repository to your machine, follow these instructions, which explain cloning via either GitHub Desktop or the command line. When this is done, you should have a folder on your machine with the same name as the repo.
Note that if someone else wants to commit to your recently created OpenSAFELY repo, they may need to wait up to an hour for the necessary write permissions to be granted.
Repository structure
README.md
This file contains a disclaimer that your code (and any outputs if you used the older method of releasing them to GitHub) should not be taken as the whole project.
A link points viewers to the Jobs site which will redirect them to the relevant project once it has been created there.
project.yaml
This file defines a "pipeline": how all the components of your analysis can run together, efficiently, either on the server or locally on your computer. See the pipeline documentation for more information.
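To give a feel for what "running components together" involves, here is a minimal sketch of ordering steps so that each one runs only after the steps it depends on. It is an illustration of the general idea, not the project.yaml format or how OpenSAFELY schedules jobs: the action names and the `needs` key used here are assumptions made for the example, so check the pipeline documentation for the real syntax.

```python
from graphlib import TopologicalSorter

# Hypothetical actions: each one lists the actions whose outputs it needs.
# The names and structure are illustrative only.
actions = {
    "generate_dataset": {"needs": []},
    "describe_dataset": {"needs": ["generate_dataset"]},
    "fit_model": {"needs": ["generate_dataset"]},
    "make_report": {"needs": ["describe_dataset", "fit_model"]},
}


def run_order(actions: dict) -> list:
    """Return action names in an order that respects every `needs` entry."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(spec["needs"]) for name, spec in actions.items()}
    # static_order() raises graphlib.CycleError if the dependencies form a loop.
    return list(TopologicalSorter(graph).static_order())


order = run_order(actions)
print(order)
```

The two middle actions depend only on the first, so either may come second; the sketch guarantees only that dependencies come before the actions that need them.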
.github/
This is an important folder, used internally by GitHub, that you can happily ignore. Do not delete.
analysis/
By convention, this folder contains:
- Any dataset_definition.py script containing the dataset definition
- Analysis scripts in R, Python or Stata
codelists/
This contains a .txt document listing the codelists that you want to retrieve from OpenCodelists, and the .csv files of the retrieved codelists themselves. You should not edit the CSV files directly; see the codelists documentation for more on how to update the codelists.
output/
This folder contains:
- the input.csv.gz file containing the (dummy or real) dataset. You will only have access to the dummy version of this dataset when working locally.
- By convention, any other files outputted by the analysis scripts that convert input.csv.gz into study results, tables, figures, etc.
Be aware that input.csv.gz is included in the .gitignore file (see below), which means it can't be (easily) committed and uploaded to GitHub.
You don't have to store things in these locations, but that's the convention we suggest.
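Because input.csv.gz is a gzip-compressed CSV file, an analysis script can read it without decompressing it first. The sketch below uses only the Python standard library and builds a tiny stand-in file in a temporary directory so that it is self-contained; the column names and values are made up, and `read_rows` is a hypothetical helper rather than part of OpenSAFELY.

```python
import csv
import gzip
import tempfile
from pathlib import Path


def read_rows(path: Path) -> list:
    """Read a gzip-compressed CSV file into a list of dicts, one per row."""
    # "rt" opens the compressed file in text mode so csv can parse it directly.
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Build a small stand-in for output/input.csv.gz (made-up columns and values).
with tempfile.TemporaryDirectory() as tmp:
    dummy_path = Path(tmp) / "output" / "input.csv.gz"
    dummy_path.parent.mkdir(parents=True)
    with gzip.open(dummy_path, "wt", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["patient_id", "age"])
        writer.writerow(["1", "42"])
        writer.writerow(["2", "67"])

    rows = read_rows(dummy_path)

print(len(rows), rows[0]["age"])
```

All values come back as strings with this approach, so numeric columns need converting before analysis. In a real script you would pass the path to the file in your own output/ folder.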
released_outputs/
Outputs that have been reviewed (and possibly edited) to ensure they are not disclosive are stored here.
docs/
Used for documentation.
(other folders)/
Feel free to add more folders to the repo and organise your project as you wish.
However, we recommend including all active scripts and codelists in the analysis/ and codelists/ folders.
If you don't want any additional files or folders to be accidentally pushed to the remote repo, use .gitignore.
.gitignore
This is a text file, used by git, which lists the files and folders that you don't want git to track, so that they are not uploaded to the remote repo on GitHub when you push changes from your local repo. As a way of keeping private files private, it is vulnerable to human error, so don't rely on it for this purpose.
Instructions for how to list ignored files properly in .gitignore.
If you need to create an empty folder to save files in, put a file in the folder that is tracked by git — by convention this is a .gitkeep file.
If you want to create an empty folder to save files in, but you never want its contents to be committed to the repo, you can add a .gitignore file to that folder with the following contents:
# Ignore all files in this folder
*
# Apart from this very file
!/.gitignore
This can be useful if you want to, for example, add an output/plots/ subfolder to put your analysis plots into without having to check for and create that folder explicitly every time in the analysis script. This is necessary because the contents of the output/ folder are ignored by the default .gitignore in the root (the top level) of the repository.
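As a one-off setup step, the folder and its .gitignore could be created with a short script like the sketch below. It writes the same four lines shown above; `ensure_ignored_folder` is a hypothetical helper name, and the example runs in a temporary directory standing in for the repository root.

```python
import tempfile
from pathlib import Path

# Same contents as the .gitignore shown above.
GITIGNORE_CONTENTS = """\
# Ignore all files in this folder
*
# Apart from this very file
!/.gitignore
"""


def ensure_ignored_folder(folder: Path) -> Path:
    """Create `folder` (and any parents) with a .gitignore ignoring its contents.

    An existing .gitignore is left untouched. Returns the .gitignore path.
    """
    folder.mkdir(parents=True, exist_ok=True)
    gitignore = folder / ".gitignore"
    if not gitignore.exists():
        gitignore.write_text(GITIGNORE_CONTENTS, encoding="utf-8")
    return gitignore


# Demonstrate in a temporary directory standing in for the repo root.
with tempfile.TemporaryDirectory() as repo_root:
    created = ensure_ignored_folder(Path(repo_root) / "output" / "plots")
    contents = created.read_text(encoding="utf-8")
    folder_exists = created.parent.is_dir()
```

In a real repository you would call the function with a path such as `Path("output/plots")` from the repository root.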
Searching existing repositories for sample code
Often when writing study code, it can be useful to see how others have solved certain problems or used ehrQL features. To search all the public code in the OpenSAFELY GitHub organisation, see instructions in our How to Get Help page.