pyopencga - Python Library
If you're a Python user and you're keen on running OpenCGA through Python, you're in the right place!
Overview
OpenCGA implements a Python REST client library called PyOpenCGA to execute any query or operation through the REST web services API. PyOpenCGA gives programmatic access to all the implemented REST web services and is an easy, lightweight, fast and intuitive way to access OpenCGA data. The library offers the convenience of an object-oriented scripting language and lets you integrate the results into other Python applications.
Some of the main features include:
The full RESTful web service API is implemented: all endpoints are supported, including the new alignment and clinical functionality.
Data is returned in a new RestResponse object, which contains the metadata and the results and provides some handy methods and iterators.
It uses the OpenCGA client-configuration.yml file.
Several Jupyter Notebooks are provided.
PyOpenCGA has been implemented by Daniel Perez, Pablo Marin and David Gomez, and it is based on a previous library called pyCGA implemented by Antonio Rueda and Daniel Perez from Genomics England. The code is open-source and can be found at https://github.com/opencb/opencga/tree/develop/opencga-client/src/main/python/pyOpenCGA. It can be installed from PyPI or from source code. More details on how to use the Python library can be found in Using the Python client.
Installation
The Python client requires Python 3, although most of the code is also compatible with Python 2.7. You can install PyOpenCGA either from the PyPI repository or from source code.
PyPI
The PyOpenCGA client is available from the PyPI repository at https://pypi.org/project/pyopencga/. Installation is as simple as running the following command:
## Latest stable version
pip install pyopencga
Source Code
From OpenCGA v2.0.0 onwards, the Python client source code is included in the GitHub releases at https://github.com/opencb/opencga/releases. You can install PyOpenCGA from it using the setup.py file.
Getting started
Client Configuration
Configuration is handled by the ClientConfiguration class. You can create a ClientConfiguration either from the conf/client-configuration.yml file or from a dictionary.
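As a minimal sketch of the dictionary option, the helper below builds a configuration dictionary and passes it to ClientConfiguration. The `rest` / `host` layout is an assumption that mirrors the corresponding entry in client-configuration.yml, so check it against your pyopencga version. The helper names `build_client_config` and `create_configuration` and the localhost URL are hypothetical and used only for illustration.

```python
from urllib.parse import urlparse


def build_client_config(host):
    """Return a configuration dictionary for a single OpenCGA REST host.

    Assumption: the dictionary mirrors the rest.host entry of
    client-configuration.yml; verify the layout for your pyopencga version.
    """
    parsed = urlparse(host)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("host must be a full http(s) URL, got %r" % host)
    # Drop any trailing slash so the host is stored in one canonical form.
    return {"rest": {"host": host.rstrip("/")}}


def create_configuration(source):
    """Build a ClientConfiguration from a YAML file path or a dictionary.

    pyopencga is imported lazily so this sketch loads without it installed.
    """
    from pyopencga.opencga_config import ClientConfiguration
    return ClientConfiguration(source)


# Hypothetical host, for illustration only.
config_dict = build_client_config("http://localhost:8080/opencga/")
```

With pyopencga installed, `create_configuration(config_dict)` or `create_configuration("conf/client-configuration.yml")` would then produce the object expected by OpencgaClient.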
OpenCGA Client
OpencgaClient is the main class in PyOpenCGA. It manages login/logout authentication and REST client initialisation, and it provides a set of other utilities.
To create an OpencgaClient instance, a ClientConfiguration instance must be passed as an argument. You can authenticate in two different ways. First, you can log in by providing the user and, optionally, the password. Second, you can provide a valid token when creating the OpencgaClient. Remember that tokens are only valid for a limited period of time.
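The two authentication routes can be sketched as a small helper. This is an illustration under assumptions rather than the library's reference usage: it assumes OpencgaClient accepts a `token` keyword argument and exposes a `login(user, password)` method, and the `connect` helper itself is hypothetical.

```python
def connect(configuration, user=None, password=None, token=None):
    """Return an authenticated OpencgaClient (sketch, see assumptions above).

    Exactly one of `user` or `token` must be supplied. The password is
    optional, as described in the text.
    """
    if (user is None) == (token is None):
        raise ValueError("provide either a user or a token, but not both")

    # Lazy import so the argument check above works without pyopencga.
    from pyopencga.opencga_client import OpencgaClient

    if token is not None:
        # Route 2: reuse an existing token (assumed keyword argument).
        return OpencgaClient(configuration, token=token)

    # Route 1: create the client, then log in with user and password.
    client = OpencgaClient(configuration)
    client.login(user, password)
    return client
```

Because tokens expire, a long-running script may prefer the user-based route so that it can log in again when needed.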
The OpencgaClient class works as a client factory containing all the different clients, one per REST resource, that are necessary to call any REST web service. Below is a list of available clients:
Client API
The clients implement all available REST API endpoints: one method has been implemented for each REST web service. The list of actions that can be performed with these clients can be checked in Swagger, as explained in RESTful Web Services#Swagger. Each client defines one method per web service available for its resource. For instance, the whole list of actions available for the Sample resource is shown below.

For all those actions, there is a method available in the sample client. For instance, to search for samples using the /search web service, you need to execute:
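A hedged sketch of such a call follows. It assumes the sample client is reachable as the `samples` attribute of an OpencgaClient and that `search` takes the web service parameters as keyword arguments. To keep the sketch runnable without a server, a stand-in client records the call; `search_samples`, `fake_client` and the study name are hypothetical.

```python
from types import SimpleNamespace


def search_samples(client, study, **filters):
    """Call the sample client's search method, i.e. the /search web service."""
    return client.samples.search(study=study, **filters)


# Stand-in for a real OpencgaClient: it only records the parameters it gets.
calls = []


def _fake_search(**params):
    calls.append(params)
    return {"results": []}  # a real client would return a RestResponse


fake_client = SimpleNamespace(samples=SimpleNamespace(search=_fake_search))

# With a real, logged-in client, pass that client instead of fake_client.
sample_result = search_samples(fake_client, study="demo_study", limit=5)
```

The same pattern would apply to the other resource clients, each exposing one method per web service.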
Working with the RestResponse
As described in RESTful Web Services#RESTResponse, all REST web services return a RestResponse object containing some metadata and a list of OpenCGAResults. Each of these OpenCGAResults contains some further metadata and the actual data results.
To make these REST responses easier to work with, a RestResponse class has been implemented to wrap the web service RestResponse object and to offer some useful methods to process the results. For instance, the sample_result variable from the example above is a RestResponse instance. This object defines several methods to navigate through the data.
The implemented RestResponse methods are:
To explore the data in an easier way, a method named print_results has also been implemented to show the response in a more human-readable format.
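To make the nesting concrete, the sketch below walks a plain dictionary shaped like a REST response: a list of OpenCGAResults, each holding its own metadata and results. It is a stand-in for illustration, not the RestResponse class itself, and the field names `responses`, `results` and `numMatches` are assumptions about the JSON layout.

```python
def iter_results(rest_response):
    """Yield every result from every OpenCGAResult in a response body."""
    for opencga_result in rest_response.get("responses", []):
        for item in opencga_result.get("results", []):
            yield item


def count_matches(rest_response):
    """Sum the number of matches reported by each OpenCGAResult."""
    return sum(
        opencga_result.get("numMatches", 0)
        for opencga_result in rest_response.get("responses", [])
    )


# Hypothetical response body with one OpenCGAResult holding two samples.
example = {
    "responses": [
        {"numResults": 2, "numMatches": 2, "results": [{"id": "s1"}, {"id": "s2"}]},
    ],
}

sample_ids = [sample["id"] for sample in iter_results(example)]
```

The RestResponse class spares you from writing this kind of loop by hand, which is why its own methods are preferable in real scripts.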
Examples and tutorials
Setting up OpencgaClient and logging in
Getting IDs for available projects, studies, families and samples
Getting gene variants for individuals with a particular disorder
Getting sample variant IDs
Getting all samples containing a variant
Additionally, there are several notebooks with more real-world examples at https://github.com/opencb/opencga/tree/develop/opencga-client/src/main/python/notebooks.