1 Analysis Examples
1.1 Running
There are 3 different ways to run the notebooks:
1. Try in Browser (JupyterLite): runs directly in your browser, no installation needed.
2. Launch Binder (Binder): a Jupyter notebook server running in the cloud, using a precompiled image.
3. Run Locally: clone the SpacerDB repository and run the notebooks locally; see Running Locally for more details.
Alternatively, you can view the pre-rendered notebooks as static html files (see “view|html” links).
Finally, you should also be able to run these notebooks in Google Colab:

!!! note "Running in Google Colab"
    To run these notebooks in Google Colab:

    1. Download the notebook you want to use
    2. Go to Google Colab
    3. Choose ‘Upload’ and select the downloaded notebook
    4. Follow the setup instructions in the notebook
    5. You might need to install some dependencies manually by running `%pip install polars duckdb itables pyarrow` in a notebook cell; put `%%capture` on the first line of that cell to suppress the install output
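Before installing anything, it can help to check which of the listed packages are already importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the check assumes each package's import name matches its pip name, which holds for these four.

```python
import importlib.util

# Packages named in the install command; import names equal the pip names here.
REQUIRED = ["polars", "duckdb", "itables", "pyarrow"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    # Only these packages need to be passed to %pip install.
    print("Missing:", " ".join(missing))
else:
    print("All notebook dependencies are available.")
```

If the list is empty, the manual install step can be skipped.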
1.2 Data size considerations!
Warning: The full SpacerDB database currently occupies 651G. The notebooks demonstrate how to access the schema, or slices of the database selected by specific queries, rather than loading the entire database. These “slices” can still generate substantial I/O and network traffic, for which your institution, cloud, or network provider may charge you.
As such, we strongly recommend that if you plan on using the database routinely, or intend to analyse the complete database, you download the database to your local machine/cloud storage and access your local copy.
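To put the database size in perspective, a rough transfer-time estimate can be worked out before committing to a full download. This is a back-of-the-envelope sketch, not a measurement: it assumes that "651G" means decimal gigabytes and that the link speed is sustained for the whole transfer; the speeds shown are hypothetical.

```python
def transfer_hours(size_gb: float, link_mbit_per_s: float) -> float:
    """Hours needed to move `size_gb` decimal gigabytes over a sustained link."""
    size_bits = size_gb * 1e9 * 8                   # GB -> bytes -> bits
    seconds = size_bits / (link_mbit_per_s * 1e6)   # Mbit/s -> bit/s
    return seconds / 3600


FULL_DB_GB = 651  # size quoted above, read as decimal GB (assumption)

for speed in (100, 1000):  # hypothetical sustained link speeds in Mbit/s
    print(f"{speed:>5} Mbit/s: {transfer_hours(FULL_DB_GB, speed):.1f} h")
```

Under these assumptions the full download takes roughly 14.5 hours at 100 Mbit/s, which is one reason a one-time local copy is preferable to repeatedly pulling large slices over the network.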
1.2.1 Database Overview
!!! example "spacerdb_overview.ipynb"
    Purpose: Explore the SpacerDB database structure and content

    **Key demonstrations**:

    - Database schema and relationships
    - Table statistics and content analysis
    - Basic query patterns

    [Try in Browser](https://jupyterlite.rtfd.io/en/latest/try/notebooks/?path=spacerdb_overview.ipynb)
    [Launch Binder](https://mybinder.org/v2/git/https%3A%2F%2Fcode.jgi.doe.gov%2Fspacersdb%2Fdocs.git/HEAD?urlpath=%2Fdoc%2Ftree%2Fdocs%2Fnotebooks%2Fspacerdb_overview.ipynb)
    [view|html](https://spacers.jgi.doe.gov//site/notebooks/spacerdb_overview.html){:target="_blank"}
    [download](https://code.jgi.doe.gov/spacersdb/notebooks/-/raw/main/spacerdb_overview.ipynb "download"){:target="_blank"}
!!! example "spacer_sequences.ipynb"
    Purpose: Extract specific sets of spacers from the SpacerDB database

    **Key demonstrations**:

    - Filtering spacers by attributes
    - Exporting spacer sequences

    [Try in Browser](https://jupyterlite.rtfd.io/en/latest/try/notebooks/?path=spacer_sequences.ipynb)
    [Launch Binder](https://mybinder.org/v2/git/https%3A%2F%2Fcode.jgi.doe.gov%2Fspacersdb%2Fdocs.git/HEAD?urlpath=%2Fdoc%2Ftree%2Fdocs%2Fnotebooks%2Fspacer_sequences.ipynb)
    [view|html](https://spacers.jgi.doe.gov//site/notebooks/spacer_sequences.html){:target="_blank"}
    [download](https://code.jgi.doe.gov/spacersdb/notebooks/-/raw/main/spacer_sequences.ipynb "download"){:target="_blank"}
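The filter-then-export pattern listed for this notebook can be sketched in plain Python. This is an illustration only, not the notebook's own code: the records, the field names (`spacer_id`, `sequence`, `crispr_type`), and the helper functions are hypothetical, and the real SpacerDB columns may differ.

```python
import io

# Toy records with hypothetical field names; not real SpacerDB data.
spacers = [
    {"spacer_id": "sp1", "sequence": "ACGTACGTACGTACGTACGTACGTACGTAC", "crispr_type": "I-E"},
    {"spacer_id": "sp2", "sequence": "TTGACCGGTAACGTTAGCCATGCAATGCCA", "crispr_type": "II-A"},
    {"spacer_id": "sp3", "sequence": "GGCATTACGGATCCATGCAAGTCCGATTAG", "crispr_type": "I-E"},
]


def filter_spacers(records, **criteria):
    """Keep records whose fields equal every key=value pair in `criteria`."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


def write_fasta(records, handle):
    """Write records as FASTA to a text handle and return how many were written."""
    for r in records:
        handle.write(f">{r['spacer_id']}\n{r['sequence']}\n")
    return len(records)


# An in-memory buffer stands in for an output file opened with open(path, "w").
buffer = io.StringIO()
n = write_fasta(filter_spacers(spacers, crispr_type="I-E"), buffer)
print(f"wrote {n} spacers")
print(buffer.getvalue(), end="")
```

Filtering before export keeps the output limited to the slice of interest, in line with the data size considerations above.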
!!! example "hits_info.ipynb"
    Purpose: Extract information about spacers targeting a given sequence

    **Key demonstrations**:

    - Load an example file listing CRISPR hits
    - Link spacers with hits to their repeat clusters
    - Extract information about these repeats (taxonomy, ecosystem, CRISPR type)

    [Try in Browser](https://jupyterlite.rtfd.io/en/latest/try/notebooks/?path=hits_info.ipynb)
    [Launch Binder](https://mybinder.org/v2/git/https%3A%2F%2Fcode.jgi.doe.gov%2Fspacersdb%2Fdocs.git/HEAD?urlpath=%2Fdoc%2Ftree%2Fdocs%2Fnotebooks%2Fhits_info.ipynb)
    [view|html](https://spacers.jgi.doe.gov//site/notebooks/hits_info.html){:target="_blank"}
    [download](https://code.jgi.doe.gov/spacersdb/docs/-/raw/main/docs/notebooks/hits_info.ipynb "download"){:target="_blank"}
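The linking steps described for this notebook amount to two lookups: hit to repeat cluster, then repeat cluster to metadata. The sketch below shows that shape with plain dictionaries. Every identifier, field name, and value is a hypothetical placeholder rather than the actual SpacerDB schema or data, and the notebook itself may well do this as a table join.

```python
# Hypothetical stand-ins for a hits file and two lookup tables.
hits = [
    {"spacer_id": "sp1", "target": "contig_A"},
    {"spacer_id": "sp3", "target": "contig_A"},
]
spacer_to_cluster = {"sp1": "rc10", "sp2": "rc10", "sp3": "rc42"}
cluster_info = {
    "rc10": {"taxonomy": "taxon_A", "ecosystem": "ecosystem_X", "crispr_type": "I-E"},
    "rc42": {"taxonomy": "taxon_B", "ecosystem": "ecosystem_Y", "crispr_type": "II-A"},
}


def annotate_hits(hits, spacer_to_cluster, cluster_info):
    """Attach the repeat cluster and its metadata to each hit."""
    annotated = []
    for hit in hits:
        cluster = spacer_to_cluster.get(hit["spacer_id"])  # None if unmapped
        info = cluster_info.get(cluster, {})               # empty if no metadata
        annotated.append({**hit, "repeat_cluster": cluster, **info})
    return annotated


rows = annotate_hits(hits, spacer_to_cluster, cluster_info)
for row in rows:
    print(row)
```

Hits whose spacer has no known cluster are kept with `repeat_cluster` set to `None`, so unmatched entries remain visible instead of being silently dropped.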
1.3 Running Locally
To run these notebooks on your local machine:
!!! tip "Local Setup"
    1. Clone the SpacerDB notebooks repository
    2. Navigate to `docs/notebooks/`
    3. Install dependencies using pixi: `pixi install`
    4. Launch Jupyter Lab: `pixi run jupyter lab`