Executors

Executors are Sourcegraph's solution for running untrusted code in a secure and controllable way. Executors provide a sandbox that can run resource-intensive or untrusted tasks on behalf of the Sourcegraph instance, such as auto-indexing repositories for precise code navigation and running batch changes server-side.

Installation

To deploy executors for your Sourcegraph instance, follow our executor deployment guide.

Why use executors?

Running untrusted code is a core requirement of features such as precise code navigation auto-indexing and server-side batch changes.

Auto-indexing jobs, in particular, require the invocation of arbitrary and untrusted code to support the resolution of project dependencies. Invocation of post-install hooks, use of insecure package management tools, and package manager proxy attacks can create opportunities for an adversary to gain unlimited use of compute or to exfiltrate data. The latter outcome is particularly dangerous for on-premises installations of Sourcegraph, which are the chosen option for companies that want to maintain strict privacy of their code.

Instead of performing this work within the Sourcegraph instance, where code is available on disk and unprotected internal services are available over the local network, we move untrusted compute into a sandboxed environment, the executor, that has access only to the clone of a single repository on disk (its workspace) and to the public internet.

Sandboxing Model

Executors can be deployed with Firecracker isolation in accordance with our sandboxing model to isolate jobs from each other and the host. This requires executors to be run on machines capable of running Linux KVM extensions. On the most popular cloud providers, this means running executors either on bare-metal machines (AWS) or on machines capable of nested virtualization (GCP).

Optionally, executors can be run without using KVM-based isolation, which is less secure but might be easier to run on common machines.
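As a rough way to see whether a machine meets the KVM requirement described above, the sketch below checks for the KVM device node. It assumes a Linux host where KVM is exposed at `/dev/kvm`; the function names `kvm_available` and `suggested_isolation` are illustrative and are not part of the executor.

```python
import os


def kvm_available(device_path: str = "/dev/kvm") -> bool:
    """Return True if the KVM device node exists and is readable and writable."""
    return os.path.exists(device_path) and os.access(device_path, os.R_OK | os.W_OK)


def suggested_isolation(device_path: str = "/dev/kvm") -> str:
    """Suggest Firecracker isolation only when KVM appears usable (illustrative labels)."""
    return "firecracker" if kvm_available(device_path) else "no-kvm-isolation"


if __name__ == "__main__":
    print(suggested_isolation())
```

A positive result only indicates that the device is present and accessible to the current user; it does not verify that Firecracker itself is installed or correctly configured.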

Deciding which executor deployment method to use

Deciding how to deploy the executor depends on your use case. For users that wish to process their untrusted compute in the most secure manner, we recommend leveraging the Firecracker isolation method. For users that have constraints around running nested virtualization, the following flowchart can help you decide which deployment option is best for your environment:

How it works

Executor instances can be deployed in a variety of ways, and each runtime executes jobs differently.

Locally with src-cli

Executors architecture - local with src-cli

1. The user runs the src command (e.g. src batch) from the command line.
2. src calls the Sourcegraph API to clone a repository.
   1. The repositories are written to a directory.
3. A Docker container is created for each "step."
   1. The directory containing the repository is mounted to the container.
   2. "Steps" are run in sequential order.
4. The container runs a defined command against the repository.
5. Logs from the container are sent back to src.
6. At the end of processing all repositories, the result is sent to a Sourcegraph API.
   1. e.g. Batch Changes sends a git diff to a Sourcegraph API (and invokes other APIs).
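To make the "one container per step, with the repository directory mounted" idea concrete, here is a minimal sketch that builds one `docker run` command line per step. It is not how src assembles its commands; the `Step` type, the `/work` mount point, and the `alpine:3` image are assumptions chosen for illustration. Only standard `docker run` flags (`--rm`, `-v`, `-w`) are used.

```python
import shlex
from typing import List, NamedTuple


class Step(NamedTuple):
    image: str    # container image for this step (hypothetical field)
    command: str  # shell command to run against the repository


def docker_argv(step: Step, repo_dir: str, mount_point: str = "/work") -> List[str]:
    """Build the argv for one step: mount the repository and run the command in it."""
    return [
        "docker", "run", "--rm",
        "-v", f"{repo_dir}:{mount_point}",  # mount the repository directory
        "-w", mount_point,                  # run the command inside the mount
        step.image,
        "sh", "-c", step.command,
    ]


def plan(steps: List[Step], repo_dir: str) -> List[List[str]]:
    """Return the commands in the order they would run (sequentially)."""
    return [docker_argv(step, repo_dir) for step in steps]


if __name__ == "__main__":
    steps = [Step("alpine:3", "echo hello > out.txt"), Step("alpine:3", "cat out.txt")]
    for argv in plan(steps, "/tmp/repo"):
        print(shlex.join(argv))
```

Because every step mounts the same directory, a file written by one step is visible to the next, which is why the steps must run in order.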

Binary

1. The executor binary is installed on a machine.
   1. Additional executables (e.g. Docker, src) are installed as well.
2. The executor instance polls a Sourcegraph API for available Jobs.
3. A user initiates a process that creates executor Jobs.
4. The executor instance "dequeues" a Job.
5. The executor calls the Sourcegraph API to clone a repository.
   1. The repositories are written to a directory.
6. A Docker container is created for each "step."
   1. If the Job is batches (non-native execution), src is invoked.
   2. Docker is invoked directly for other Jobs (codeintel and native execution batches).
   3. The directory containing the repository is mounted to the container.
   4. "Steps" are run in sequential order.
7. The container runs a defined command against the repository.
8. Logs from the container are sent back to the executor.
9. Logs are streamed from the executor to a Sourcegraph API.
10. The executor calls a Sourcegraph API to "complete" the Job.
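The poll, dequeue, run, log, and complete cycle described in this list can be sketched with an in-memory stand-in for the Sourcegraph API. Everything here is hypothetical: `FakeQueueAPI`, its method names, and the job dictionary shape are invented for illustration and do not reflect the real API or the executor's implementation.

```python
from collections import deque
from typing import Callable, Dict, List, Optional


class FakeQueueAPI:
    """In-memory stand-in for the Sourcegraph API (hypothetical interface)."""

    def __init__(self, jobs: List[dict]) -> None:
        self._pending = deque(jobs)
        self.logs: Dict[int, List[str]] = {}
        self.completed: List[int] = []

    def dequeue(self) -> Optional[dict]:
        """Hand out the next available Job, or None if the queue is empty."""
        return self._pending.popleft() if self._pending else None

    def add_log(self, job_id: int, line: str) -> None:
        self.logs.setdefault(job_id, []).append(line)

    def mark_complete(self, job_id: int) -> None:
        self.completed.append(job_id)


def run_executor(api: FakeQueueAPI, run_step: Callable[[str], str], max_polls: int) -> None:
    """Poll for Jobs, run each step in order, stream logs, then complete the Job."""
    for _ in range(max_polls):
        job = api.dequeue()
        if job is None:
            continue  # a real executor would wait before polling again
        for step in job["steps"]:
            api.add_log(job["id"], run_step(step))
        api.mark_complete(job["id"])


if __name__ == "__main__":
    api = FakeQueueAPI([{"id": 1, "steps": ["index", "upload"]}])
    run_executor(api, lambda step: f"ran {step}", max_polls=3)
    print(api.completed, api.logs)
```

The sketch leaves out cloning, container creation, and error handling so that the control flow of the loop stays visible.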

Firecracker

NOTE: See "What the heck is firecracker, anyway??" for background on Firecracker.

1. The executor binary is installed on a machine.
   1. Additional executables (e.g. Docker, src) are installed as well.
2. The executor instance polls a Sourcegraph API for available Jobs.
3. A user initiates a process that creates executor Jobs.
4. The executor instance "dequeues" a Job.
5. The executor calls the Sourcegraph API to clone a repository.
   1. The repositories are written to a directory.
6. ignite starts up a Docker container that spawns a single Firecracker VM inside it.
   1. The directory containing the repository is mounted to the VM.
7. A Docker container is created in the Firecracker VM for each "step."
   1. If the Job is batches (non-native execution), src is invoked.
   2. Docker is invoked directly for other Jobs (codeintel and native execution batches).
   3. "Steps" are run in sequential order.
8. Within each Firecracker VM, a single Docker container is created.
9. The container runs a defined command against the repository.
10. Logs from the container are sent back to the executor.
11. Logs are streamed from the executor to a Sourcegraph API.
12. The executor calls a Sourcegraph API to "complete" the Job.

Docker

1. The executor image is started as a Docker container on a machine.
2. The executor polls a Sourcegraph API for available Jobs.
3. A user initiates a process that creates executor Jobs.
4. The executor instance "dequeues" a Job.
5. The executor calls the Sourcegraph API to clone a repository.
   1. The repositories are written to a directory.
6. A Docker container is created for each "step."
   1. If the Job is batches (non-native execution), src is invoked.
   2. Docker is invoked directly for other Jobs (codeintel and native execution batches).
   3. The directory containing the repository is mounted to the container.
   4. "Steps" are run in sequential order.
7. The container runs a defined command against the repository.
8. Logs from the container are sent back to the executor.
9. Logs are streamed from the executor to a Sourcegraph API.
10. The executor calls a Sourcegraph API to "complete" the Job.
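The rule for choosing between src and a direct Docker invocation, stated in the sub-steps above, amounts to a small decision function. The sketch below is illustrative only; the `runner_for` name and the string labels for job kinds are assumptions rather than identifiers taken from the executor.

```python
def runner_for(job_kind: str, native_execution: bool = False) -> str:
    """Pick how a Job's steps are launched, following the rule described above.

    job_kind is an illustrative label such as "batches" or "codeintel".
    """
    if job_kind == "batches" and not native_execution:
        return "src"     # batches without native execution go through src
    return "docker"      # codeintel and native execution batches call Docker directly


if __name__ == "__main__":
    for kind, native in [("batches", False), ("batches", True), ("codeintel", False)]:
        print(kind, native, "->", runner_for(kind, native))
```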

Native Kubernetes

NOTE: This is an experimental feature.

1. The executor image is started as a pod on a Kubernetes node.
2. The executor polls a Sourcegraph API for available Jobs.
3. A user initiates a process that creates executor Jobs.
4. The executor instance "dequeues" a Job.
5. The executor calls the Sourcegraph API to clone a repository.
   1. The repositories are written to a directory.
6. A Kubernetes Job is created for each "step."
   1. The directory containing the repository is mounted to the container.
   2. "Steps" are run in sequential order.
7. The container runs a defined command against the repository.
8. Logs from the container are sent back to the executor.
9. Logs are streamed from the executor to a Sourcegraph API.
10. The executor calls a Sourcegraph API to "complete" the Job.
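To illustrate what "a Kubernetes Job per step" with a mounted repository directory might look like, the sketch below builds a `batch/v1` Job manifest as a plain dictionary. The manifest fields are standard Kubernetes Job fields, but the way the workspace is shared here (a PersistentVolumeClaim), the `/job` mount path, and all names are assumptions; the executor may construct its Jobs differently.

```python
import json


def step_job_manifest(name: str, image: str, command: str,
                      workspace_claim: str, mount_path: str = "/job") -> dict:
    """Build a batch/v1 Job that runs one step with the workspace mounted."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed step automatically
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "step",
                        "image": image,
                        "command": ["sh", "-c", command],
                        "workingDir": mount_path,
                        "volumeMounts": [{"name": "workspace", "mountPath": mount_path}],
                    }],
                    # Assumption: the cloned repository lives on a PVC.
                    "volumes": [{
                        "name": "workspace",
                        "persistentVolumeClaim": {"claimName": workspace_claim},
                    }],
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = step_job_manifest("step-0", "alpine:3", "ls", "executor-workspace")
    print(json.dumps(manifest, indent=2))
```

Running steps sequentially would then mean creating one such Job, waiting for it to finish, and only then creating the next.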

Native execution

Docker-in-Docker Kubernetes

NOTE: This is an experimental feature.

1. The executor image is started as a container in a Kubernetes Pod.
   1. The dind image is started as a sidecar container in the same Kubernetes Pod.
2. The executor polls a Sourcegraph API for available Jobs.
3. A user initiates a process that creates executor Jobs.
4. The executor instance "dequeues" a Job.
5. The executor calls the Sourcegraph API to clone a repository.
   1. The repositories are written to a directory.
6. A Docker container is created for each "step."
   1. If the Job is batches (non-native execution), src is invoked.
   2. Docker is invoked directly for other Jobs (codeintel and native execution batches).
   3. The directory containing the repository is mounted to the container.
   4. "Steps" are run in sequential order.
7. The container runs a defined command against the repository.
8. Logs from the container are sent back to the executor.
9. Logs are streamed from the executor to a Sourcegraph API.
10. The executor calls a Sourcegraph API to "complete" the Job.

Troubleshooting

Refer to the Troubleshooting Executors document for common debugging operations.