
Analysis Examples#

Running#

There are three ways to run the notebooks:
1. Try in Browser (JupyterLite): runs directly in your browser, no installation needed.
2. Launch Binder (Binder): a Jupyter notebook server running in the cloud, using a precompiled image.
3. Run Locally: clone the SpacerDB repository and run the notebooks locally; see Running Locally for more details.

Alternatively, you can view the pre-rendered notebooks as static html files (see "view|html" links).

Finally, you should also be able to run these notebooks in Google Colab:

Running in Google Colab

To run these notebooks in Google Colab:
1. Download the notebook you want to use
2. Go to Google Colab
3. Choose 'Upload' and select the downloaded notebook
4. Follow the setup instructions in the notebook
5. You might need to install some dependencies manually by running the following command in a notebook cell:

%%capture
%pip install polars duckdb itables pyarrow
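Before installing anything, it can help to check which of these packages are already present in the Colab session. The following is a minimal sketch using only the Python standard library; the helper name `missing_packages` is hypothetical and not part of the notebooks.

```python
import importlib.util

# Packages named in the install command above.
REQUIRED = ["polars", "duckdb", "itables", "pyarrow"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are available.")
```

If the list is empty, the manual install step can likely be skipped.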

Data size considerations!#

Warning: The full SpacerDB database is currently 651G in size. The notebooks exemplify accessing the schema or slices of the database through specific queries, rather than loading the entire database. These "slices" may still cause substantial I/O and network traffic, for which your institution, cloud, or network provider may charge you.
As such, we strongly recommend that if you plan on using the database routinely, or intend to analyse the complete database, you download the database to your local machine/cloud storage and access your local copy.
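The slice-based access described above amounts to pushing the column selection and row filter into the query, so that only the requested rows are materialized. The sketch below illustrates that pattern with Python's built-in `sqlite3` module as a stand-in engine, so that it runs without extra dependencies. The table `spacers`, its columns, and its rows are hypothetical and do not reflect the actual SpacerDB schema.

```python
import sqlite3

# Stand-in database: an in-memory table with made-up rows.
# Table and column names are hypothetical, not the SpacerDB schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spacers (spacer_id TEXT, length INTEGER, sequence TEXT)")
conn.executemany(
    "INSERT INTO spacers VALUES (?, ?, ?)",
    [
        ("sp1", 4, "ACGT"),
        ("sp2", 6, "ACGTAC"),
        ("sp3", 8, "ACGTACGT"),
    ],
)


def fetch_slice(connection, min_length, limit):
    """Fetch only the needed columns and rows instead of the whole table."""
    query = (
        "SELECT spacer_id, length FROM spacers "
        "WHERE length >= ? ORDER BY spacer_id LIMIT ?"
    )
    return connection.execute(query, (min_length, limit)).fetchall()


rows = fetch_slice(conn, min_length=6, limit=10)
print(rows)
```

Selecting two columns with a filter and a `LIMIT` keeps the result small regardless of the table size; the same idea applies to whichever query engine the notebooks use.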

Database Overview#

spacerdb_overview.ipynb

Purpose: Explore the SpacerDB database structure and content

Key demonstrations:
- Database schema and relationships
- Table statistics and content analysis
- Basic query patterns

Try in Browser | Binder | View HTML | Download notebook

spacer_sequences.ipynb

Purpose: Extract specific sets of spacers from the SpacerDB database

Key demonstrations:
- Filtering spacers by attributes
- Exporting spacer sequences

Try in Browser | Binder | View HTML | Download notebook
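As a rough illustration of filtering spacers by an attribute and exporting their sequences, the sketch below writes a filtered subset in FASTA format. FASTA as the export format, the record fields (`spacer_id`, `sequence`, `length`), and the helper `write_fasta` are all assumptions made for this example; the notebook itself may use different names and formats.

```python
import io

# Hypothetical records; the notebook would obtain these from SpacerDB.
spacers = [
    {"spacer_id": "sp1", "sequence": "ACGT", "length": 4},
    {"spacer_id": "sp2", "sequence": "ACGTAC", "length": 6},
]


def write_fasta(records, handle, min_length=0):
    """Write records with length >= min_length as FASTA; return the count written."""
    written = 0
    for record in records:
        if record["length"] < min_length:
            continue  # filtered out by the length attribute
        handle.write(f">{record['spacer_id']}\n{record['sequence']}\n")
        written += 1
    return written


buffer = io.StringIO()
count = write_fasta(spacers, buffer, min_length=5)
print(buffer.getvalue(), end="")
```

Passing an open file instead of `io.StringIO()` would write the same output to disk.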

hits_info.ipynb

Purpose: Extract information about spacers targeting a given sequence

Key demonstrations:
- Load example file listing CRISPR hits
- Link spacers with hits to their repeat cluster
- Extract information about these repeats (taxonomy, ecosystem, CRISPR type)

Try in Browser | Binder | View HTML | Download notebook
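The three steps listed for this notebook form a chain of lookups: hit to spacer, spacer to repeat cluster, repeat cluster to metadata. A minimal sketch with plain Python structures follows; every identifier, field name, and value here is a hypothetical placeholder, not real SpacerDB data or schema.

```python
# Hypothetical stand-ins for the hits file and the database tables.
hits = [
    {"spacer_id": "sp1", "target": "contig_A"},
    {"spacer_id": "sp3", "target": "contig_A"},
]
spacer_to_repeat = {"sp1": "rep1", "sp2": "rep1", "sp3": "rep2"}
repeat_info = {
    "rep1": {"taxonomy": "taxon_X", "ecosystem": "eco_1", "crispr_type": "type_a"},
    "rep2": {"taxonomy": "taxon_Y", "ecosystem": "eco_2", "crispr_type": "type_b"},
}


def annotate_hits(hit_rows, spacer_map, repeat_table):
    """Attach repeat-cluster metadata to each hit via its spacer."""
    annotated = []
    for hit in hit_rows:
        repeat_id = spacer_map.get(hit["spacer_id"])
        if repeat_id is None:
            continue  # spacer without a known repeat cluster
        metadata = repeat_table.get(repeat_id, {})
        annotated.append({**hit, "repeat_id": repeat_id, **metadata})
    return annotated


annotated = annotate_hits(hits, spacer_to_repeat, repeat_info)
for row in annotated:
    print(row)
```

In a database setting the same chain would typically be expressed as two joins rather than dictionary lookups.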

Running Locally#

To run these notebooks on your local machine:

Local Setup

  1. Clone the SpacerDB notebooks repository
  2. Navigate to docs/notebooks/
  3. Install dependencies using pixi:
    pixi install
    
  4. Launch Jupyter Lab:
    pixi run jupyter lab