Session 2: Track Alignment and Data Loading
Welcome to the second part of the hands-on tutorial! In this notebook, we:
Introduce additional building blocks from the Gosling visualization grammar (track composition primitives)
Showcase the transparent data utilities in gos that help users visualize their own local (and in-memory) genomics datasets
To get started, make sure you have gosling installed.
!pip install gosling[all]==0.0.9
import gosling as gos
Note: be sure to include the [all] extra when installing gos from PyPI via pip for this notebook. The special data utilities in gos are opt-in, and running the install command shown above ensures that you have the correct dependencies.
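Before running the data-loading cells, it can help to confirm that the optional dependencies are importable. The sketch below uses only the standard library. The module names in the usage line (clodius, uvicorn, starlette, portpicker) are an assumption about what the [all] extra installs, and missing_modules is a hypothetical helper, not part of gos.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list of optional dependencies; adjust it to match your installation.
print(missing_modules(["gosling", "clodius", "uvicorn", "starlette", "portpicker"]))
```

An empty list means every listed module can be imported.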
Track Composition#
A gos.Track is a unit building block which can be represented as a bar chart, a line chart, an ideogram, or other chart types, as we’ve seen in the first session. For the concurrent analysis of multiple datasets, multiple Tracks can be grouped into one or more Views and navigated synchronously. Each gos.View defines the genomic location for every gos.Track it contains, and each gos.Track binds and maps data to be visualized.
Changing the following View-level properties modifies how Tracks and Views compose together:
layout - Whether to view genomic positions in Cartesian (linear) or in polar (circular) coordinate systems.
alignment - Whether multiple tracks should overlay or stack within a single View.
arrangement - How to juxtapose multiple Views together (parallel, serial, vertical, horizontal).
This section will walk through each of these properties in more detail with examples.
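As a rough map of where these three properties live, the sketch below builds a plain Python dictionary shaped like a Gosling JSON specification. It assumes that arrangement sits on the container of Views while layout and alignment sit on each View. make_view and compose are hypothetical helpers that perform only minimal validation; they are not part of gos.

```python
LAYOUTS = {"linear", "circular"}
ALIGNMENTS = {"stack", "overlay"}
ARRANGEMENTS = {"parallel", "serial", "vertical", "horizontal"}


def make_view(tracks, layout="linear", alignment="stack"):
    """Group track dicts into one View-like dict."""
    if layout not in LAYOUTS:
        raise ValueError(f"unknown layout: {layout!r}")
    if alignment not in ALIGNMENTS:
        raise ValueError(f"unknown alignment: {alignment!r}")
    return {"layout": layout, "alignment": alignment, "tracks": list(tracks)}


def compose(views, arrangement="vertical"):
    """Juxtapose several View-like dicts under one arrangement."""
    if arrangement not in ARRANGEMENTS:
        raise ValueError(f"unknown arrangement: {arrangement!r}")
    return {"arrangement": arrangement, "views": list(views)}


bar = {"mark": "bar"}
line = {"mark": "line"}
spec = compose(
    [
        make_view([bar, line], alignment="overlay"),
        make_view([bar], layout="circular"),
    ],
    arrangement="horizontal",
)
print(spec["arrangement"], [view["layout"] for view in spec["views"]])
```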
layout - Linear and Circular#
The View layout property specifies whether genomic coordinates are represented in a circular or a linear layout.
The following figure displays the upper Track with a linear layout and the bottom Track with a circular layout.
Here we create a bar chart visualization displaying a “pseudobulk” excitatory neuron scATAC-seq track from Corces et al. (Nature Genetics, 2020), and apply both a linear and a circular layout.
data = gos.bigwig(
    url="https://s3.amazonaws.com/gosling-lang.org/data/ExcitatoryNeurons-insertions_bin100_RIPnorm.bw",
    column="position",
    value="peak",
)
track = gos.Track(data, height=50).mark_bar().encode(
    x=gos.X("position:G"),
    y=gos.Y("peak:Q", axis="right"),
)
track.view()
Linear view#
track.view(layout="linear")
Circular view#
track.properties(width=200).view(layout="circular")
alignment - Multiple Tracks in One View#
The View alignment property allows users to either "overlay" or "stack" several tracks.
When alignment is set to "overlay", multiple tracks are layered on top of one another. When alignment is set to "stack", multiple tracks are vertically concatenated. The default value of alignment is "stack".
line = track.mark_line().encode(color=gos.value("#f3b285"))
bar = track.encode(color=gos.value("#4472c4"))
print(line, bar)
Track({
color: ColorValue({
value: '#f3b285'
}),
data: {'type': 'bigwig', 'url': 'https://s3.amazonaws.com/gosling-lang.org/data/ExcitatoryNeurons-insertions_bin100_RIPnorm.bw', 'column': 'position', 'value': 'peak'},
height: 50,
mark: 'line',
width: 800,
x: X({
field: 'position',
type: 'genomic'
}),
y: Y({
axis: 'right',
field: 'peak',
type: 'quantitative'
})
}) Track({
color: ColorValue({
value: '#4472c4'
}),
data: {'type': 'bigwig', 'url': 'https://s3.amazonaws.com/gosling-lang.org/data/ExcitatoryNeurons-insertions_bin100_RIPnorm.bw', 'column': 'position', 'value': 'peak'},
height: 50,
mark: 'bar',
width: 800,
x: X({
field: 'position',
type: 'genomic'
}),
y: Y({
axis: 'right',
field: 'peak',
type: 'quantitative'
})
})
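The printed tracks above show that a shorthand string such as "position:G" is stored as a field name plus a full type name. The sketch below mimics that expansion for the three type codes used in this notebook (G, Q, N). It illustrates the convention only; parse_shorthand is a hypothetical helper and not the parser that gos uses.

```python
TYPE_CODES = {"G": "genomic", "Q": "quantitative", "N": "nominal"}


def parse_shorthand(shorthand):
    """Split 'field:CODE' into a dict with the field name and the full type name."""
    field, sep, code = shorthand.rpartition(":")
    if not sep or not field or code not in TYPE_CODES:
        raise ValueError(f"expected 'field:G', 'field:Q' or 'field:N', got {shorthand!r}")
    return {"field": field, "type": TYPE_CODES[code]}


print(parse_shorthand("position:G"))  # {'field': 'position', 'type': 'genomic'}
print(parse_shorthand("peak:Q"))      # {'field': 'peak', 'type': 'quantitative'}
```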
Stacked tracks in single view#
gos.stack(bar, line, line).properties(spacing=1)
gos.stack(
    bar.properties(width=400),
    line.properties(width=400),
).properties(spacing=1, layout="circular")
Overlay tracks in single view#
gos.overlay(bar, line)
gos.overlay(
    bar.properties(width=400),
    line.properties(width=400),
).properties(spacing=1, layout="circular")
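To make the difference between the two alignments concrete, the sketch below computes where each track would start vertically inside a single View. It is a simplified model with a hypothetical helper (track_offsets), not Gosling's actual renderer: stacked tracks are placed one below another with optional spacing, while overlaid tracks all share the same origin.

```python
def track_offsets(heights, alignment="stack", spacing=0):
    """Return the top offset of each track inside one View (simplified model)."""
    if alignment == "overlay":
        # Every track is drawn in the same area, on top of the others.
        return [0 for _ in heights]
    if alignment != "stack":
        raise ValueError(f"unknown alignment: {alignment!r}")
    offsets = []
    top = 0
    for height in heights:
        offsets.append(top)
        top += height + spacing  # the next track starts below this one
    return offsets


print(track_offsets([50, 50, 50], "stack", spacing=1))  # [0, 51, 102]
print(track_offsets([50, 50], "overlay"))               # [0, 0]
```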
Example: Gene Annotation#
The overlay alignment is an essential feature in Gosling and allows much more complex visual encodings. We leverage this building block in the code example below to build a gene annotation track by combining several primitive tracks.
Note: it’s not essential that you understand every detail of the code below. It is provided to demonstrate more complex usage of the gos API and can be used for reference later.
import gosling as gos

# HiGlass-based annotation dataset
# http://gosling-lang.org/docs/data#beddb-require-higlass-server
genes = gos.beddb(
    url="https://server.gosling-lang.org/api/v1/tileset_info/?d=gene-annotation",
    genomicFields=[
        {"index": 1, "name": "start"},
        {"index": 2, "name": "end"}
    ],
    valueFields=[
        {"index": 5, "name": "strand", "type": "nominal"},
        {"index": 3, "name": "name", "type": "nominal"}
    ],
    exonIntervalFields=[
        {"index": 12, "name": "start"},
        {"index": 13, "name": "end"}
    ]
)

# Primitive Tracks
base = gos.Track(genes).encode(
    row=gos.Row("strand:N", domain=["+", "-"]),
    color=gos.Color("strand:N", domain=["+", "-"], range=["#7585FF", "#FF8A85"]),
).properties(
    height=100,
    title="Genes | hg38",
)

plusGeneHead = base.mark_triangleRight(align="left").encode(
    x=gos.X("end:G", axis="top"),
    size=gos.value(15)
).transform_filter(
    field="type", oneOf=["gene"]
).transform_filter(
    field="strand", oneOf=["+"]
)

minusGeneHead = base.mark_triangleLeft(align="right").encode(
    x=gos.X("start:G", axis="top"),
    size=gos.value(15)
).transform_filter(
    field="type", oneOf=["gene"]
).transform_filter(
    field="strand", oneOf=["-"]
)

geneLabel = base.mark_text(dy=15).encode(
    x=gos.X("start:G", axis="top"),
    xe="end:G",
    text="name:N",
    size=gos.value(15)
).transform_filter(
    field="type", oneOf=["gene"]
).visibility_lt(
    measure="width",
    threshold="|xe-x|",
    transitionPadding=10,
    target="mark",
)

exon = base.mark_rect().encode(
    x=gos.X("start:G", axis="top"),
    xe="end:G",
    size=gos.value(15)
).transform_filter(
    field="type", oneOf=["exon"]
)

plusGeneRange = base.mark_rule(linePattern={"type": "triangleRight", "size": 5}).encode(
    x=gos.X("start:G", axis="top"),
    xe="end:G",
    strokeWidth=gos.value(3),
).transform_filter(
    field="type", oneOf=["gene"]
).transform_filter(
    field="strand", oneOf=["+"]
)

minusGeneRange = base.mark_rule(linePattern={"type": "triangleLeft", "size": 5}).encode(
    x=gos.X("start:G", axis="top"),
    xe="end:G",
    strokeWidth=gos.value(3),
).transform_filter(
    field="type", oneOf=["gene"]
).transform_filter(
    field="strand", oneOf=["-"]
)

# Combine tracks with overlay alignment
gene_annotation = gos.overlay(
    plusGeneRange, minusGeneRange, exon, plusGeneHead, minusGeneHead, geneLabel
)

gene_annotation.properties(
    xDomain=gos.GenomicDomain(chromosome="1", interval=[103400000, 103700000]),
)
We can then “stack” our custom gene_annotation view with the scATAC-seq track from earlier to provide context.
gos.stack(
    gene_annotation,
    bar.properties(title="Excitatory Neuron ATAC-seq"),
).properties(
    xDomain=gos.GenomicDomain(chromosome="1", interval=[103400000, 103700000]),
)
arrangement - Arrange Multiple Views#
Gosling supports multi-view visualizations. Users specify the visualizations under the views property and modify the arrangement of these views through the arrangement property.
gos provides several top-level utilities to conveniently group multiple views together in each of the arrangement options:
gos.horizontal
gos.vertical
gos.serial
gos.parallel
Note: the arrangement pairs (gos.serial, gos.horizontal) and (gos.parallel, gos.vertical) are equivalent when combining Views with linear layouts. Behavior only differs when arranging circular layout Views (left-most column).
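For Views with linear layouts, this equivalence can be pictured with a simplified bounding-box model: serial and horizontal place Views side by side, while parallel and vertical place them one below another. The helper below is a hypothetical sketch under that assumption. It ignores titles, axes, and whatever default spacing Gosling applies, and it does not model circular layouts.

```python
def arranged_size(view_sizes, arrangement, spacing=0):
    """Return the (width, height) spanned by linear Views given as (width, height) pairs."""
    if not view_sizes:
        return (0, 0)
    widths = [width for width, _ in view_sizes]
    heights = [height for _, height in view_sizes]
    gaps = spacing * (len(view_sizes) - 1)
    if arrangement in ("horizontal", "serial"):
        # Side by side: widths add up, the tallest View sets the height.
        return (sum(widths) + gaps, max(heights))
    if arrangement in ("vertical", "parallel"):
        # One below another: heights add up, the widest View sets the width.
        return (max(widths), sum(heights) + gaps)
    raise ValueError(f"unknown arrangement: {arrangement!r}")


left, right = (300, 200), (600, 240)
print(arranged_size([left, right], "horizontal", spacing=10))  # (910, 240)
print(arranged_size([left, right], "parallel", spacing=10))    # (600, 450)
```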
gos.horizontal & gos.vertical#
We can compose gos.horizontal and gos.vertical to arrange multiple tracks into separate views. Notice how interactions are no longer synchronized since each Track is within a unique View.
gos.horizontal(
    # left
    gos.vertical(
        track.properties(width=300, height=100),
        track.properties(width=300, height=100),
    ),
    # right
    track.encode(color=gos.value("#4472c4")).properties(width=600, height=240)
)
We can selectively “group” tracks within the same view using the alignment features from the previous section. Here we use gos.stack instead of gos.vertical, which combines the tracks on the left into the same View.
gos.horizontal(
    # left
    gos.stack(
        gene_annotation.properties(width=300, height=100),
        track.properties(width=300, height=100)
    ),
    # right
    track.encode(color=gos.value("#4472c4")).properties(width=600, height=240)
)
gos.serial & gos.parallel#
The serial and parallel arrangements differ in behavior from horizontal and vertical when combining Views with circular layouts.
gos.serial(
    track.properties(width=300).view(layout="circular"),
    gene_annotation.properties(width=300, layout="circular"),
)
Loading data#
This section illustrates a key (optional) feature of gos which makes hosting data for your Gosling visualizations a breeze.
Normally a Gosling visualization requires the administration of a web server to host both the client and the genomics datasets for the visualization. In gos, we provide further integration with Python to hide this complexity and allow remote, local, and in-memory data to be visualized seamlessly through an identical API.
In this notebook, we will visualize the same BED file containing hg38 cytoband information as a:
remote dataset (via URL)
local dataset (via local path)
in-memory dataset (from a pd.DataFrame)
The visualization#
The ideogram function generates an ideogram visualization for a given Gosling data source. It is not important that you understand the details of this block to follow along in this notebook. Rather, the important bit is to understand that ideogram takes data as input and returns a Gosling visualization created with the gos API.
We will show how this function can be reused for various data definitions (genomic data sources).
def ideogram(data) -> gos.View:
    track = gos.Track(data)  # bind data to track

    arms = track.mark_rect().encode(
        color=gos.Color("stain:N",
            domain=["gneg", "gpos25", "gpos50", "gpos75", "gpos100", "gvar"],
            range=["white", "#D9D9D9", "#979797", "#636363", "black", "#A0A0F2"],
        ),
        x=gos.X("chromStart:G", axis="none"),
        xe="chromEnd:G",
        stroke=gos.value("black"),
        strokeWidth=gos.value(0.5),
    ).transform_filter_not(
        field="stain",
        oneOf=["acen"],
    )

    labels = track.mark_text().encode(
        text="name:N",
        color=gos.Color("stain:N",
            domain=["gneg", "gpos25", "gpos50", "gpos75", "gpos100", "gvar"],
            range=["black", "#636363", "black", "#D9D9D9", "white", "black"],
        ),
        strokeWidth=gos.value(0)
    ).visibility_lt(
        target='mark',
        measure='width',
        threshold='|xe-x|',
        transitionPadding=10
    )

    centromere = track.encode(
        x=gos.X("chromStart:G"),
        xe="chromEnd:G",
        color=gos.value('red'),
    ).transform_filter(
        "stain", oneOf=["acen"]
    )

    centromere_left = centromere.mark_triangleLeft().transform_filter(
        "name", include="p"
    )

    centromere_right = centromere.mark_triangleRight().transform_filter(
        "name", include="q"
    )

    return gos.overlay(arms, labels, centromere_left, centromere_right).properties(height=20)
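The filters inside ideogram split the cytoband rows into groups that are drawn with different marks. The plain-Python sketch below mirrors that logic so the grouping is easier to see. It assumes that include means a substring match, and the example rows are illustrative values in the same name/stain format; classify_band is a hypothetical helper, not part of gos.

```python
def classify_band(name, stain):
    """Mimic how ideogram() routes a cytoband row to the arms or centromere tracks."""
    if stain != "acen":
        # Kept by transform_filter_not(field="stain", oneOf=["acen"]).
        return "arm"
    if "p" in name:
        # acen band whose name includes "p": drawn with mark_triangleLeft.
        return "centromere_left"
    if "q" in name:
        # acen band whose name includes "q": drawn with mark_triangleRight.
        return "centromere_right"
    return "unassigned"


# Illustrative rows only: (name, stain).
rows = [("p36.33", "gneg"), ("p11.1", "acen"), ("q11", "acen")]
for name, stain in rows:
    print(name, stain, classify_band(name, stain))
```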
The dataset#
The url below links to a BED4+1 file containing UCSC hg38 cytoband information. This dataset is hosted on GitHub and is available via URL.
url = "https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.bed"
!curl -s {url} | head | column -t
# chrom chromStart chromEnd name stain
chr1 0 2300000 p36.33 gneg
chr1 2300000 5300000 p36.32 gpos25
chr1 5300000 7100000 p36.31 gneg
chr1 7100000 9100000 p36.23 gpos25
chr1 9100000 12500000 p36.22 gneg
chr1 12500000 15900000 p36.21 gpos50
chr1 15900000 20100000 p36.13 gneg
chr1 20100000 23600000 p36.12 gpos25
chr1 23600000 27600000 p36.11 gneg
chr1 27600000 29900000 p35.3 gpos25
Remote dataset (via URL)#
We can reference this URL directly in gos by creating a CSV data source via gos.csv(...). This function returns a Python dictionary that describes our dataset to Gosling. We use the gos.csv utility since the resource is a columnar text dataset.
# specify BED4+1 format
data = gos.csv(
url=url,
headerNames=['chrom', 'chromStart', 'chromEnd', 'name', 'stain'], # the +1 field is stain
chromosomeField="chrom", # the column containing chrom names
genomicFields=["chromStart", "chromEnd"], # fields with genomic coordinates
separator="\t",
)
data
{'type': 'csv',
'url': 'https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.bed',
'headerNames': ['chrom', 'chromStart', 'chromEnd', 'name', 'stain'],
'chromosomeField': 'chrom',
'genomicFields': ['chromStart', 'chromEnd'],
'separator': '\t'}
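To make the options concrete, here is a minimal sketch of what headerNames, separator, and genomicFields tell a reader about the file. It uses only the standard library on two rows copied from the output above, and it illustrates the meaning of the options rather than how Gosling parses the file. The function name parse_rows is hypothetical.

```python
import csv
import io

# Two rows of the cytoband file as printed above (tab-separated, no header row).
sample = (
    "chr1\t0\t2300000\tp36.33\tgneg\n"
    "chr1\t2300000\t5300000\tp36.32\tgpos25\n"
)

header_names = ["chrom", "chromStart", "chromEnd", "name", "stain"]
genomic_fields = ["chromStart", "chromEnd"]


def parse_rows(text, header_names, separator, genomic_fields):
    """Parse delimited text into dicts and convert genomic fields to integers."""
    reader = csv.DictReader(
        io.StringIO(text), fieldnames=header_names, delimiter=separator
    )
    rows = []
    for row in reader:
        for field in genomic_fields:
            row[field] = int(row[field])  # genomic coordinates are base positions
        rows.append(row)
    return rows


rows = parse_rows(sample, header_names, "\t", genomic_fields)
print(rows[0])
```

Because the file has no header row, the column names must be supplied explicitly, which is the role headerNames plays in the data source.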
We can now pass this dataset directly to the ideogram function which binds data to gos.Track and creates our custom visualization.
ideogram(data)
This visualization is a bit crowded since we are viewing the data genome-wide. We can set the initial genomic domain for the visualization to Chromosome 2 by specifying xDomain as a property.
ideogram(data).properties(xDomain=gos.GenomicDomain(chromosome="chr2"))
Local Dataset (via local filepath)#
Data are not always publicly available via URL as above, and often we’d like to visualize local data files. To visualize local data, simply change the URL to a local file path.
data = gos.csv(
- url=url,
+ url="./UCSC.HG38.Human.CytoBandIdeogram.bed",
...
)
Below we download the file from GitHub and load the visualization from our local filesystem.
!wget {url} # download file
--2022-07-06 12:56:01-- https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.bed
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 30748 (30K) [text/plain]
Saving to: ‘UCSC.HG38.Human.CytoBandIdeogram.bed’
UCSC.HG38 0%[ ] 0 --.-KB/s
UCSC.HG38.Human.Cyt 100%[===================>] 30.03K --.-KB/s in 0s
2022-07-06 12:56:01 (94.1 MB/s) - ‘UCSC.HG38.Human.CytoBandIdeogram.bed’ saved [30748/30748]
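Where wget is not available, the same download can be done from Python. The following is a minimal standard-library sketch; the helper name download_if_missing is hypothetical, and it does not clean up a partially written file if the transfer fails.

```python
import pathlib
import shutil
import urllib.request


def download_if_missing(url: str, dest: str) -> pathlib.Path:
    """Download `url` to `dest` unless a file already exists at `dest`."""
    path = pathlib.Path(dest)
    if path.exists():
        return path  # reuse the earlier download
    path.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response to disk rather than holding it all in memory.
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
    return path


# Example call (performs a network request, so it is left commented out):
# local_path = download_if_missing(url, "./UCSC.HG38.Human.CytoBandIdeogram.bed")
```

Skipping the download when the file exists keeps repeated notebook runs from fetching the same data again.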
!cat UCSC.HG38.Human.CytoBandIdeogram.bed | head | column -t # print local file contents
chr1 0 2300000 p36.33 gneg
chr1 2300000 5300000 p36.32 gpos25
chr1 5300000 7100000 p36.31 gneg
chr1 7100000 9100000 p36.23 gpos25
chr1 9100000 12500000 p36.22 gneg
chr1 12500000 15900000 p36.21 gpos50
chr1 15900000 20100000 p36.13 gneg
chr1 20100000 23600000 p36.12 gpos25
chr1 23600000 27600000 p36.11 gneg
chr1 27600000 29900000 p35.3 gpos25
!ls
01-Single-Track.ipynb 04-Coordinated-Multiple-Views.ipynb
02-Track-Alignment-and-Data-Loading.ipynb Index.ipynb
03-Semantic-Zooming.ipynb UCSC.HG38.Human.CytoBandIdeogram.bed
data = gos.csv(
url="./UCSC.HG38.Human.CytoBandIdeogram.bed",
# url=url
headerNames=['chrom', 'chromStart', 'chromEnd', 'name', 'stain'],
chromosomeField="chrom",
genomicFields=["chromStart", "chromEnd"],
separator="\t",
)
# reuse the same visualization
ideogram(data).properties(xDomain=gos.GenomicDomain(chromosome="chr2"))
In memory (via pd.DataFrame)#
While loading remote and local genomics data files is useful, often we want to visualize intermediate or derived information during analysis. Rather than writing these results to disk, gos supports visualizing in-memory data directly from pandas dataframes (pd.DataFrame).
In order to use this feature, we first load our dataset as a pd.DataFrame.
import pandas as pd
df = pd.read_csv(
'./UCSC.HG38.Human.CytoBandIdeogram.bed',
names=['chrom', 'chromStart', 'chromEnd', 'name', 'stain'],
sep="\t"
)
df.head()
| | chrom | chromStart | chromEnd | name | stain |
|---|---|---|---|---|---|
| 0 | chr1 | 0 | 2300000 | p36.33 | gneg |
| 1 | chr1 | 2300000 | 5300000 | p36.32 | gpos25 |
| 2 | chr1 | 5300000 | 7100000 | p36.31 | gneg |
| 3 | chr1 | 7100000 | 9100000 | p36.23 | gpos25 |
| 4 | chr1 | 9100000 | 12500000 | p36.22 | gneg |
Let's filter df in Python so that our dataset only contains entries for Chromosome 2.
df = df[df.chrom == "chr2"]
df.head()
| | chrom | chromStart | chromEnd | name | stain |
|---|---|---|---|---|---|
| 63 | chr2 | 0 | 4400000 | p25.3 | gneg |
| 64 | chr2 | 4400000 | 6900000 | p25.2 | gpos50 |
| 65 | chr2 | 6900000 | 12000000 | p25.1 | gneg |
| 66 | chr2 | 12000000 | 16500000 | p24.3 | gpos75 |
| 67 | chr2 | 16500000 | 19000000 | p24.2 | gneg |
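In-memory dataframes are most useful for derived values that never exist on disk. As a small sketch of that workflow, the snippet below repeats the chromosome filter on three rows copied from the tables above and adds a band-width column. The width column and the names sample and df_sample are illustrative assumptions, not part of the tutorial's dataset.

```python
import io

import pandas as pd

# Three rows copied from the tables above (tab-separated, no header row).
sample = (
    "chr1\t0\t2300000\tp36.33\tgneg\n"
    "chr2\t0\t4400000\tp25.3\tgneg\n"
    "chr2\t4400000\t6900000\tp25.2\tgpos50\n"
)
columns = ["chrom", "chromStart", "chromEnd", "name", "stain"]
df_sample = pd.read_csv(io.StringIO(sample), names=columns, sep="\t")

# Keep only Chromosome 2; copy so the new column does not touch df_sample.
chr2 = df_sample[df_sample["chrom"] == "chr2"].copy()

# Derived value: length of each cytoband in base pairs.
chr2["width"] = chr2["chromEnd"] - chr2["chromStart"]
print(chr2[["name", "width"]])
```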
We can now create a data source for our visualization using the df.gos.csv(...) method, and visualize directly! Note how the resulting visualization only renders for chromosome 2.
df.gos.csv( # we only need to specify these fields since the rest are inferred from dataframe
chromosomeField="chrom",
genomicFields=["chromStart", "chromEnd"],
)
{'type': 'csv',
'url': 'http://localhost:41573/582136adca397a8e7c5e78380456fc5a.csv',
'chromosomeField': 'chrom',
'genomicFields': ['chromStart', 'chromEnd']}
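The localhost URL in this output suggests that the dataframe is exposed to the browser through a local HTTP server rather than embedded in the specification. The sketch below shows the general idea with the standard library only. It is not how gos implements the feature, and the names serve_text and local_url are hypothetical.

```python
import http.server
import threading
import urllib.request


def serve_text(text, content_type="text/csv"):
    """Serve `text` for every GET request from a background thread.

    Returns (server, url); call server.shutdown() when finished.
    """
    payload = text.encode("utf-8")

    class Handler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(payload)))
            # Let a renderer loaded from another origin fetch the data.
            self.send_header("Access-Control-Allow-Origin", "*")
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, *args):
            pass  # silence per-request logging

    # Port 0 asks the operating system for any free port.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    return server, f"http://{host}:{port}/data.csv"


csv_text = "chrom,chromStart,chromEnd,name,stain\nchr2,0,4400000,p25.3,gneg\n"
server, local_url = serve_text(csv_text)

# Fetch the data back, ignoring any proxy settings for the loopback request.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(local_url) as response:
    fetched = response.read().decode("utf-8")

server.shutdown()
server.server_close()
```

Since the data lives only as long as the serving process, a visualization built this way is tied to the running Python session.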
data = df.gos.csv(
# we only need to specify these fields since the rest are inferred from dataframe
chromosomeField="chrom",
genomicFields=["chromStart", "chromEnd"],
)
ideogram(data) # view in context of full assembly
ideogram(data).properties(xDomain=gos.GenomicDomain(chromosome="chr2")) # view just chrom 2
(BONUS) Exercise: local bigwig track#
In the previous notebook we visualized several scATAC-seq “pseudobulk” tracks from human brain cells using a simple barplot.
The bigwig_url below points to a BigWig file (161 MB) on a remote server containing the normalized aggregate signal from excitatory neurons in the human brain.
bigwig_url = "https://s3.amazonaws.com/gosling-lang.org/data/ExcitatoryNeurons-insertions_bin100_RIPnorm.bw"
We can combine the gene_annotation View from earlier with a track for our scATAC-seq dataset to provide context when visualizing the peaks.
data_bw = gos.bigwig(url=bigwig_url, column="position", value="peak")
excitatory_neurons = gos.Track(data_bw).mark_bar().encode(
x="position:G",
y="peak:Q",
).properties(height=50)
gos.stack(gene_annotation, excitatory_neurons).properties(
xDomain=gos.GenomicDomain(chromosome="1", interval=[103400000, 103700000]),
)
Data is not always available via public URL, and often will be stored locally during an analysis.
Task:
download the scATAC-seq dataset locally (hint: use !wget in the cell below)
change the code snippet above to load the local
ExcitatoryNeurons-insertions_bin100_RIPnorm.bw