Session 2: Track Alignment and Data Loading#

Welcome to the second part of the hands-on tutorial! In this notebook, we:

  • Introduce additional building blocks from the Gosling visualization grammar (track composition primitives)

  • Showcase the transparent data utilities from gos that help users visualize their own local (and in-memory) genomics datasets

To get started, make sure you have gosling installed.

!pip install gosling[all]==0.0.9
import gosling as gos

Note: be sure to include the [all] extra when installing gos from PyPI via pip for this notebook. The special data utilities in gos are opt-in, and running the install command as shown above ensures that you have the correct dependencies.
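If you want to confirm that the optional dependencies are importable before continuing, a small check like the one below can help. This is only a sketch: the module names in ASSUMED_EXTRAS are an assumption for illustration, not an authoritative list of what the [all] extra installs, and missing_modules is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed (not authoritative) modules that the optional data utilities may rely on.
ASSUMED_EXTRAS = ["clodius", "uvicorn", "starlette", "portpicker"]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(ASSUMED_EXTRAS)
if missing:
    print("Missing optional modules:", ", ".join(missing))
else:
    print("All assumed optional modules are importable.")
```

If anything is reported as missing, re-running the install command with the [all] extra is the first thing to try.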

Track Composition#

A gos.Track is a unit building block which can be represented as a bar chart, a line chart, ideogram, or other chart types as we’ve seen in the first session. For the concurrent analysis of multiple datasets, multiple Tracks can be grouped into one or more Views and navigated synchronously. Each gos.View defines the genomic location for every gos.Track it contains, and each gos.Track binds and maps data to be visualized.

Changing the following View-level properties modifies how Tracks and Views compose together:

  • layout - Whether to view genomic positions in Cartesian (linear) or in polar (circular) coordinate systems.

  • alignment - Whether multiple tracks should overlay or stack within a single View.

  • arrangement - How to juxtapose multiple Views together (parallel, serial, vertical, horizontal).

This section will walk through each of these properties in more detail with examples.
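As a rough mental model, the three properties can be pictured as keys on nested dictionaries: layout and alignment sit on a View that holds tracks, while arrangement sits on a View that holds other Views. The sketch below uses plain Python dictionaries rather than the gos API. The helpers make_view and arrange are hypothetical, the track dictionaries are placeholders (they omit data and encodings), and the "linear" default for layout is an assumption.

```python
import json


def make_view(tracks, layout="linear", alignment="stack"):
    """Group tracks into one view; layout and alignment apply to all of them."""
    return {"layout": layout, "alignment": alignment, "tracks": list(tracks)}


def arrange(views, arrangement="vertical"):
    """Juxtapose several views; arrangement decides how they are placed."""
    return {"arrangement": arrangement, "views": list(views)}


# Placeholder tracks: a real track also needs data and encodings.
bar = {"mark": "bar", "width": 400, "height": 50}
line = {"mark": "line", "width": 400, "height": 50}

overlaid = make_view([bar, line], layout="circular", alignment="overlay")
stacked = make_view([bar, line])  # defaults: linear layout, stacked tracks
spec = arrange([overlaid, stacked], arrangement="horizontal")

print(json.dumps(spec, indent=2))
```

Reading the printed structure from the outside in mirrors the order of the list above: arrangement places the views, and each view then decides its own layout and alignment.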

layout - Linear and Circular#

The View layout property specifies whether genomic coordinates are represented in either a circular or linear layout.

The following figure displays the upper Track with a linear layout and the bottom with a circular layout.

Here we create a bar chart visualization displaying a “pseudobulk” excitatory neuron scATAC-seq track from Corces et al. (Nature Genetics, 2020), and apply a linear and a circular layout.

data = gos.bigwig(
    url="https://s3.amazonaws.com/gosling-lang.org/data/ExcitatoryNeurons-insertions_bin100_RIPnorm.bw",
    column="position", 
    value="peak",
)

track = gos.Track(data, height=50).mark_bar().encode(
    x=gos.X("position:G"),
    y=gos.Y("peak:Q", axis="right"),
)

track.view()

Linear view#

track.view(layout="linear")

Circular view#

track.properties(width=200).view(layout="circular")

alignment - Multiple Tracks in One View#

The View alignment property allows users to either "overlay" or "stack" several tracks.

When alignment is set to "overlay", multiple tracks are layered on top of one another. When alignment is set to "stack", multiple tracks are vertically concatenated. The default value of alignment is "stack".

line = track.mark_line().encode(color=gos.value("#f3b285"))

bar = track.encode(color=gos.value("#4472c4"))

print(line, bar)
Track({
  color: ColorValue({
    value: '#f3b285'
  }),
  data: {'type': 'bigwig', 'url': 'https://s3.amazonaws.com/gosling-lang.org/data/ExcitatoryNeurons-insertions_bin100_RIPnorm.bw', 'column': 'position', 'value': 'peak'},
  height: 50,
  mark: 'line',
  width: 800,
  x: X({
    field: 'position',
    type: 'genomic'
  }),
  y: Y({
    axis: 'right',
    field: 'peak',
    type: 'quantitative'
  })
}) Track({
  color: ColorValue({
    value: '#4472c4'
  }),
  data: {'type': 'bigwig', 'url': 'https://s3.amazonaws.com/gosling-lang.org/data/ExcitatoryNeurons-insertions_bin100_RIPnorm.bw', 'column': 'position', 'value': 'peak'},
  height: 50,
  mark: 'bar',
  width: 800,
  x: X({
    field: 'position',
    type: 'genomic'
  }),
  y: Y({
    axis: 'right',
    field: 'peak',
    type: 'quantitative'
  })
})

Stacked tracks in single view#

gos.stack(bar, line, line).properties(spacing=1)
gos.stack(
    bar.properties(width=400),
    line.properties(width=400),
).properties(spacing=1, layout="circular")

Overlay tracks in single view#

gos.overlay(bar, line)
gos.overlay(
    bar.properties(width=400),
    line.properties(width=400),
).properties(spacing=1, layout="circular")

Example: Gene Annotation#

The overlay alignment is an essential feature in Gosling and allows much more complex visual encodings. We leverage this building block in the code example below to build a gene annotation track by combining several primitive tracks.

Note that it's not essential that you understand every detail of the code below. It is provided to demonstrate more complex usage of the gos API and can be used for reference later.

import gosling as gos

# HiGlass-based annotation dataset
# http://gosling-lang.org/docs/data#beddb-require-higlass-server
genes = gos.beddb(
    url="https://server.gosling-lang.org/api/v1/tileset_info/?d=gene-annotation",
    genomicFields=[
      {"index": 1, "name": "start"},
      {"index": 2, "name": "end"}
    ],
    valueFields=[
      {"index": 5, "name": "strand", "type": "nominal"},
      {"index": 3, "name": "name", "type": "nominal"}
    ],
    exonIntervalFields=[
      {"index": 12, "name": "start"},
      {"index": 13, "name": "end"}
    ]
)

# Primitive Tracks

base = gos.Track(genes).encode(
    row=gos.Row("strand:N", domain=["+", "-"]),
    color=gos.Color("strand:N", domain=["+", "-"], range=["#7585FF", "#FF8A85"]),
).properties(
    height=100,
    title="Genes | hg38",
)

plusGeneHead = base.mark_triangleRight(align="left").encode(
    x=gos.X("end:G", axis="top"),
    size=gos.value(15)
).transform_filter(
    field="type", oneOf=["gene"]
).transform_filter(
    field="strand", oneOf=["+"]
)

minusGeneHead = base.mark_triangleLeft(align="right").encode(
    x=gos.X("start:G", axis="top"),
    size=gos.value(15)
).transform_filter(
    field="type", oneOf=["gene"]
).transform_filter(
    field="strand", oneOf=["-"]
)

geneLabel = base.mark_text(dy=15).encode(
    x=gos.X("start:G", axis="top"),
    xe="end:G",
    text="name:N",
    size=gos.value(15)
).transform_filter(
    field="type", oneOf=["gene"]
).visibility_lt(
    measure="width",
    threshold="|xe-x|",
    transitionPadding=10,
    target="mark",
)

exon = base.mark_rect().encode(
    x=gos.X("start:G", axis="top"),
    xe="end:G",
    size=gos.value(15)
).transform_filter(
    field="type", oneOf=["exon"]
)

plusGeneRange = base.mark_rule(linePattern={"type": "triangleRight", "size": 5}).encode(
    x=gos.X("start:G", axis="top"),
    xe="end:G",
    strokeWidth=gos.value(3),
).transform_filter(
    field="type", oneOf=["gene"]
).transform_filter(
    field="strand", oneOf=["+"]
)

minusGeneRange = base.mark_rule(linePattern={"type": "triangleLeft", "size": 5}).encode(
    x=gos.X("start:G", axis="top"),
    xe="end:G",
    strokeWidth=gos.value(3),
).transform_filter(
    field="type", oneOf=["gene"]
).transform_filter(
    field="strand", oneOf=["-"]
)

# Combine tracks with overlay alignment

gene_annotation = gos.overlay(
    plusGeneRange, minusGeneRange, exon, plusGeneHead, minusGeneHead, geneLabel
)

gene_annotation.properties(
    xDomain=gos.GenomicDomain(chromosome="1", interval=[103400000, 103700000]),
)

We can then “stack” our custom gene_annotation view with the scATAC-seq track from earlier to provide context.

gos.stack(
    gene_annotation,
    bar.properties(title="Excitatory Neuron ATAC-seq"),
).properties(
    xDomain=gos.GenomicDomain(chromosome="1", interval=[103400000, 103700000]),
)

arrangement - Arrange Multiple Views#

Gosling supports multi-view visualizations. Users specify the visualizations under the views property and control how those views are placed relative to each other through the arrangement property.

gos provides several top-level utilities to conveniently group multiple views together in each of the arrangement options:

  • gos.horizontal

  • gos.vertical

  • gos.serial

  • gos.parallel

Note the arrangement pairs (gos.serial, gos.horizontal) and (gos.parallel, gos.vertical) are equivalent when combining Views with linear layouts. Behavior only differs when arranging circular layout Views (left-most column).

gos.horizontal & gos.vertical#

We can compose gos.horizontal and gos.vertical to arrange multiple tracks into separate views. Notice how interactions are no longer synchronized since each Track is within a unique View.

gos.horizontal(
    # left
    gos.vertical(
        track.properties(width=300, height=100),
        track.properties(width=300, height=100),
    ),
    # right
    track.encode(color=gos.value("#4472c4")).properties(width=600, height=240)
)

We can selectively “group” tracks within the same view using the alignment features from the previous section. We replace gos.vertical with gos.stack, which combines the tracks on the left into the same View.

gos.horizontal(
    # left
    gos.stack(
        gene_annotation.properties(width=300, height=100),
        track.properties(width=300, height=100)
    ),
    # right
    track.encode(color=gos.value("#4472c4")).properties(width=600, height=240)
)

gos.serial & gos.parallel#

The serial and parallel arrangements differ in behavior from horizontal and vertical when combining Views with circular layouts.

gos.serial(
    track.properties(width=300).view(layout="circular"), 
    gene_annotation.properties(width=300, layout="circular"),
)

Loading data#

This section illustrates a key (optional) feature of gos which makes hosting data for your Gosling visualizations a breeze.

Normally, a Gosling visualization requires administering a web server to host both the client and the genomics datasets for the visualization. In gos, we provide further integration with Python to hide this complexity and allow remote, local, and in-memory data to be visualized seamlessly through an identical API.

In this notebook, we will visualize the same BED file containing hg38 cytoband information as a:

  • remote dataset (via URL)

  • local dataset (via local path)

  • in memory (from a pd.DataFrame)

The visualization#

The ideogram function generates an ideogram visualization for a given Gosling data source. You do not need to understand the details of this block to follow along in this notebook. The important bit is that ideogram takes data as input and returns a Gosling visualization created with the gos API.

We will show how this function can be reused for various data definitions (genomic data sources).

def ideogram(data) -> gos.View:
    track = gos.Track(data) # bind data to track
    
    arms = track.mark_rect().encode(
        color=gos.Color("stain:N",
            domain=["gneg", "gpos25", "gpos50", "gpos75", "gpos100", "gvar"],
            range=["white", "#D9D9D9", "#979797", "#636363", "black", "#A0A0F2"],
        ),
        x=gos.X("chromStart:G", axis="none"),
        xe="chromEnd:G",
        stroke=gos.value("black"),
        strokeWidth=gos.value(0.5),
    ).transform_filter_not(
        field="stain",
        oneOf=["acen"],
    )

    labels = track.mark_text().encode(
        text="name:N",
        color=gos.Color("stain:N",
            domain=["gneg", "gpos25", "gpos50", "gpos75", "gpos100", "gvar"],
            range=["black", "#636363", "black", "#D9D9D9", "white", "black"],
        ),
        strokeWidth=gos.value(0)
    ).visibility_lt(
        target='mark',
        measure='width',
        threshold='|xe-x|',
        transitionPadding=10
    )

    centromere = track.encode(
        x=gos.X("chromStart:G"),
        xe="chromEnd:G",
        color=gos.value('red'),
    ).transform_filter(
        "stain", oneOf=["acen"]
    )

    centromere_left = centromere.mark_triangleLeft().transform_filter(
        "name", include="p"
    )

    centromere_right = centromere.mark_triangleRight().transform_filter(
        "name", include="q"
    )

    return gos.overlay(arms, labels, centromere_left, centromere_right).properties(height=20)
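The filters used in ideogram can be read as simple row predicates. The sketch below mimics them in plain Python on a few illustrative rows (the rows are made up for the example and carry only the two fields the filters look at). It is meant as a reading aid and does not reflect how Gosling evaluates filters internally. The helpers one_of, include, and apply_filters are hypothetical names, and treating include as a substring match is an assumption.

```python
# Illustrative rows only: band names and stains, no coordinates.
rows = [
    {"name": "p36.33", "stain": "gneg"},
    {"name": "p11.1", "stain": "acen"},
    {"name": "q11", "stain": "acen"},
    {"name": "q12", "stain": "gvar"},
]


def one_of(field, values):
    """Predicate: keep rows whose field value is one of the given values."""
    return lambda row: row[field] in values


def include(field, substring):
    """Predicate: keep rows whose field value contains the substring."""
    return lambda row: substring in row[field]


def apply_filters(rows, *predicates):
    """Keep rows that satisfy every predicate, like chained filters."""
    return [row for row in rows if all(p(row) for p in predicates)]


is_acen = one_of("stain", ["acen"])

# arms: everything that is NOT a centromere band (the transform_filter_not case)
arms = [row for row in rows if not is_acen(row)]
# centromere halves: acen bands on the p arm and on the q arm
centromere_left = apply_filters(rows, is_acen, include("name", "p"))
centromere_right = apply_filters(rows, is_acen, include("name", "q"))

print([row["name"] for row in arms])
print([row["name"] for row in centromere_left])
print([row["name"] for row in centromere_right])
```

Seen this way, the overlay returned by ideogram draws disjoint subsets of the same rows with different marks: rectangles for the arms and triangles for the two centromere halves.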

The dataset#

The url below links to a BED4+1 file containing UCSC hg38 cytoband information. This dataset is hosted on GitHub and is available via URL.

url = "https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.bed"
!curl -s {url}  | head | column -t
# chrom  chromStart  chromEnd  name  stain
chr1  0         2300000   p36.33  gneg
chr1  2300000   5300000   p36.32  gpos25
chr1  5300000   7100000   p36.31  gneg
chr1  7100000   9100000   p36.23  gpos25
chr1  9100000   12500000  p36.22  gneg
chr1  12500000  15900000  p36.21  gpos50
chr1  15900000  20100000  p36.13  gneg
chr1  20100000  23600000  p36.12  gpos25
chr1  23600000  27600000  p36.11  gneg
chr1  27600000  29900000  p35.3   gpos25
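
As a minimal sketch of what this BED4+1 layout looks like to a program, the snippet below parses the first three rows shown above with Python's standard csv module. The names COLUMNS, SAMPLE, and parse_bed are illustrative and not part of gos; the sample text assumes the real file is tab-separated and has no header row.

```python
import csv
import io

COLUMNS = ["chrom", "chromStart", "chromEnd", "name", "stain"]

# First three rows of the preview above, written with tab separators.
SAMPLE = (
    "chr1\t0\t2300000\tp36.33\tgneg\n"
    "chr1\t2300000\t5300000\tp36.32\tgpos25\n"
    "chr1\t5300000\t7100000\tp36.31\tgneg\n"
)


def parse_bed(text):
    """Parse headerless, tab-separated BED4+1 text into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=COLUMNS, delimiter="\t")
    records = []
    for row in reader:
        # Genomic coordinates become integers; other fields stay strings.
        row["chromStart"] = int(row["chromStart"])
        row["chromEnd"] = int(row["chromEnd"])
        records.append(row)
    return records


bands = parse_bed(SAMPLE)
print(bands[0])
```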

Remote dataset (via URL)#

We can reference this URL directly in gos by creating a CSV data source via gos.csv(...). This function returns a Python dictionary that describes our dataset to Gosling. We use the gos.csv utility because the resource is a delimited, column-based text file (tab-separated in this case).

# specify BED4+1 format
data = gos.csv(
    url=url,
    headerNames=['chrom', 'chromStart', 'chromEnd', 'name', 'stain'], # the +1 field is stain
    chromosomeField="chrom", # the column containing chrom names
    genomicFields=["chromStart", "chromEnd"], # fields with genomic coordinates
    separator="\t",
)

data
{'type': 'csv',
 'url': 'https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.bed',
 'headerNames': ['chrom', 'chromStart', 'chromEnd', 'name', 'stain'],
 'chromosomeField': 'chrom',
 'genomicFields': ['chromStart', 'chromEnd'],
 'separator': '\t'}
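
Because the data source is an ordinary dictionary, it can be inspected or serialized like any other Python object. The sketch below rebuilds the dictionary shown above by hand, without calling gos, and prints it as JSON. The csv_source helper is a hypothetical stand-in that only mimics the printed shape; it is not how gos.csv is implemented.

```python
import json


def csv_source(url, **options):
    """Hypothetical stand-in that mimics the dictionary shape printed above."""
    return {"type": "csv", "url": url, **options}


url = "https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.bed"

data_dict = csv_source(
    url,
    headerNames=["chrom", "chromStart", "chromEnd", "name", "stain"],
    chromosomeField="chrom",
    genomicFields=["chromStart", "chromEnd"],
    separator="\t",
)

# The dictionary contains only JSON-friendly values.
print(json.dumps(data_dict, indent=2))
```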

We can now pass this dataset directly to the ideogram function which binds data to gos.Track and creates our custom visualization.

ideogram(data)

This visualization is a bit crowded since we are viewing the data genome-wide. We can set the initial genomic domain for the visualization to Chromosome 2 by specifying xDomain as a property.

ideogram(data).properties(xDomain=gos.GenomicDomain(chromosome="chr2"))

Local Dataset (via local filepath)#

Data are not always publicly available via URL as above, and we often want to visualize local data files. To visualize local data, simply change the URL to a local file path.

data = gos.csv(
-  url=url,
+  url="./UCSC.HG38.Human.CytoBandIdeogram.bed",
   ... 
)

Below we download the file from GitHub and load the visualization from our local filesystem.

!wget {url} # download file
--2022-07-06 12:56:01--  https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.bed
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 30748 (30K) [text/plain]
Saving to: ‘UCSC.HG38.Human.CytoBandIdeogram.bed’



2022-07-06 12:56:01 (94.1 MB/s) - ‘UCSC.HG38.Human.CytoBandIdeogram.bed’ saved [30748/30748]
!cat UCSC.HG38.Human.CytoBandIdeogram.bed | head | column -t # print local file contents
chr1  0         2300000   p36.33  gneg
chr1  2300000   5300000   p36.32  gpos25
chr1  5300000   7100000   p36.31  gneg
chr1  7100000   9100000   p36.23  gpos25
chr1  9100000   12500000  p36.22  gneg
chr1  12500000  15900000  p36.21  gpos50
chr1  15900000  20100000  p36.13  gneg
chr1  20100000  23600000  p36.12  gpos25
chr1  23600000  27600000  p36.11  gneg
chr1  27600000  29900000  p35.3   gpos25
!ls
01-Single-Track.ipynb			   04-Coordinated-Multiple-Views.ipynb
02-Track-Alignment-and-Data-Loading.ipynb  Index.ipynb
03-Semantic-Zooming.ipynb		   UCSC.HG38.Human.CytoBandIdeogram.bed
data = gos.csv(
    url="./UCSC.HG38.Human.CytoBandIdeogram.bed",
    # url=url
    headerNames=['chrom', 'chromStart', 'chromEnd', 'name', 'stain'],
    chromosomeField="chrom",
    genomicFields=["chromStart", "chromEnd"],
    separator="\t",
)

# reuse the same visualization
ideogram(data).properties(xDomain=gos.GenomicDomain(chromosome="chr2"))
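
The download step above uses the !wget shell command. As a hedged alternative for environments without wget, the sketch below does the same from Python with the standard library. The helper names local_name and download_if_missing are our own, not part of gos, and the download itself is not triggered here.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_name(url):
    """Return the file name at the end of a URL path."""
    return Path(urlparse(url).path).name


def download_if_missing(url, directory="."):
    """Download url into directory unless the file is already there."""
    target = Path(directory) / local_name(url)
    if not target.exists():
        urlretrieve(url, target)  # network access happens only here
    return target


url = "https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.bed"
print(local_name(url))
```

Calling download_if_missing(url) would fetch the file once and return its local path, which can then be used in place of the remote URL.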

In memory (via pd.DataFrame)#

While loading remote and local genomics data files is useful, we often want to visualize intermediate or derived results during an analysis. Rather than writing these results to disk, gos supports visualizing in-memory data directly from pandas DataFrames (pd.DataFrame).

In order to use this feature, we first load our dataset as a pd.DataFrame.

import pandas as pd

df = pd.read_csv(
    './UCSC.HG38.Human.CytoBandIdeogram.bed', 
    names=['chrom', 'chromStart', 'chromEnd', 'name', 'stain'],
    sep="\t"
)

df.head()
chrom chromStart chromEnd name stain
0 chr1 0 2300000 p36.33 gneg
1 chr1 2300000 5300000 p36.32 gpos25
2 chr1 5300000 7100000 p36.31 gneg
3 chr1 7100000 9100000 p36.23 gpos25
4 chr1 9100000 12500000 p36.22 gneg

Let's filter df in Python so that our dataset only contains entries for Chromosome 2.

df = df[df.chrom == "chr2"]
df.head()
chrom chromStart chromEnd name stain
63 chr2 0 4400000 p25.3 gneg
64 chr2 4400000 6900000 p25.2 gpos50
65 chr2 6900000 12000000 p25.1 gneg
66 chr2 12000000 16500000 p24.3 gpos75
67 chr2 16500000 19000000 p24.2 gneg

We can now create a data source for our visualization using the df.gos.csv(...) method, and visualize directly! Note how the resulting visualization only renders for chromosome 2.

df.gos.csv(    # we only need to specify these fields since the rest are inferred from dataframe
    chromosomeField="chrom",
    genomicFields=["chromStart", "chromEnd"], 
)
{'type': 'csv',
 'url': 'http://localhost:41573/582136adca397a8e7c5e78380456fc5a.csv',
 'chromosomeField': 'chrom',
 'genomicFields': ['chromStart', 'chromEnd']}
data = df.gos.csv(
    # we only need to specify these fields since the rest are inferred from dataframe
    chromosomeField="chrom",
    genomicFields=["chromStart", "chromEnd"], 
)

ideogram(data) # view in context of full assembly
ideogram(data).properties(xDomain=gos.GenomicDomain(chromosome="chr2")) # view just chrom 2

(BONUS) Exercise: local bigwig track#

In the previous notebook we visualized several scATAC-seq “pseudobulk” tracks from human brain cells using a simple barplot.

The bigwig_url below points to a BigWig file (161 MB) on a remote server containing the normalized aggregate signal from excitatory neurons in the human brain.

bigwig_url = "https://s3.amazonaws.com/gosling-lang.org/data/ExcitatoryNeurons-insertions_bin100_RIPnorm.bw"

We can combine the gene_annotation View from earlier with a track for our scATAC-seq dataset to provide context when visualizing the peaks.

data_bw = gos.bigwig(url=bigwig_url, column="position", value="peak")

excitatory_neurons = gos.Track(data_bw).mark_bar().encode(
    x="position:G",
    y="peak:Q",
).properties(height=50)

gos.stack(gene_annotation, excitatory_neurons).properties(
    xDomain=gos.GenomicDomain(chromosome="1", interval=[103400000, 103700000]),
)

Data is not always available via a public URL, and it is often stored locally during an analysis.

Task:

  1. download the scATAC-seq dataset locally (hint: use !wget in the cell below)

  2. change the code snippet above to load the local ExcitatoryNeurons-insertions_bin100_RIPnorm.bw