Server Configuration
Required Dependencies
Java
OpenCGA needs Java 8 to compile and run. We recommend Oracle JDK 1.8.0_60+ since it is the one we use and is therefore fully supported. There are two main ways of installing Oracle JDK on Linux: via the package manager or manually.
Installing via the package manager can be more or less easy depending on the Linux distribution you use. For example, on Ubuntu 16.04 you can execute the following commands:
To install Java manually, download the latest Java SDK for Linux x64 file (linux-x64.tar.gz) from Oracle at http://www.oracle.com/technetwork/es/java/javase/downloads/index.html
Then uncompress it, move it to an installation folder such as /opt, and create a symbolic link to the uncompressed folder.
Finally, set the JAVA_HOME variable in the system or user bashrc file.
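The manual steps above can be sketched as a small shell helper. This is only an illustration: the helper name install_jdk, the archive name in the comments and the link location /opt/java are assumptions, so adapt them to the file you actually downloaded.

```shell
# Unpack a JDK archive into an installation folder and point a stable
# "java" symlink at the unpacked folder. Both arguments are caller-chosen.
install_jdk() {
    tarball="$1"        # e.g. jdk-8u60-linux-x64.tar.gz (hypothetical name)
    install_dir="$2"    # e.g. /opt
    # The first path component inside the archive is the JDK folder name.
    jdk_dir="$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)"
    tar -xzf "$tarball" -C "$install_dir"
    ln -sfn "$install_dir/$jdk_dir" "$install_dir/java"
}

# Lines to append to /etc/bash.bashrc or ~/.bashrc, assuming the symlink
# created above lives at /opt/java.
export JAVA_HOME=/opt/java
export PATH="$JAVA_HOME/bin:$PATH"
```

Writing to /opt normally requires root, so a call such as install_jdk jdk-8u60-linux-x64.tar.gz /opt would be run from a root shell. Keeping a version-independent symlink means JAVA_HOME does not change when the JDK is upgraded.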
Whichever installation procedure you followed, you can check that Java is properly installed by executing java -version, which should report a 1.8 version.
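A minimal sketch of that check, assuming the version string has the usual version "1.8.x" form. The helper name is_java8 is hypothetical.

```shell
# Return success when the given "java -version" output reports a 1.8 runtime.
is_java8() {
    printf '%s\n' "$1" | grep -q 'version "1\.8\.'
}

# java -version prints to stderr, hence the 2>&1 redirection.
if command -v java >/dev/null 2>&1; then
    if is_java8 "$(java -version 2>&1)"; then
        echo "Java 8 detected"
    else
        echo "A different Java version is on the PATH"
    fi
else
    echo "java not found on the PATH"
fi
```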
Tomcat
OpenCGA needs Tomcat 8 to run RESTful web services. We recommend Apache Tomcat 8.x since it is the one we use in production, although we have also used Jetty without any problem. You can install Apache Tomcat on Linux either via the package manager or manually.
Installing via the package manager can be more or less easy depending on the Linux distribution you use. Keep in mind that Tomcat should run as the same user as OpenCGA. We will assume that OpenCGA is installed with user opencga, so Tomcat must be set to run as opencga too. For example, on Ubuntu 16.04 you can install Tomcat from the package manager and then set it to run as user opencga.
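On Ubuntu 16.04 the packaged Tomcat 8 reads its run-as user and group from /etc/default/tomcat8. The following is a minimal sketch, assuming the tomcat8 package and an existing opencga user and group. The helper name set_tomcat_user is hypothetical.

```shell
# Rewrite TOMCAT8_USER and TOMCAT8_GROUP in a Tomcat defaults file.
set_tomcat_user() {
    defaults_file="$1"   # /etc/default/tomcat8 on Ubuntu 16.04 (assumption)
    run_as="$2"          # user and group that Tomcat should run as
    sed -i \
        -e "s/^TOMCAT8_USER=.*/TOMCAT8_USER=$run_as/" \
        -e "s/^TOMCAT8_GROUP=.*/TOMCAT8_GROUP=$run_as/" \
        "$defaults_file"
}

# Typical use, as root or with sudo:
#   apt-get install -y tomcat8
#   set_tomcat_user /etc/default/tomcat8 opencga
#   chown -R opencga:opencga /var/lib/tomcat8 /var/log/tomcat8 /var/cache/tomcat8
#   service tomcat8 restart
```

The chown step is there because the package creates its working folders for the default tomcat8 user. Check the folder list against your own installation before running it.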
You can install Tomcat manually by downloading it from http://tomcat.apache.org/download-80.cgi and executing the following commands as the OpenCGA installation user:
We recommend making some changes to the Tomcat configuration:
Increase Tomcat memory: you can do this by adding JAVA_OPTS=" $JAVA_OPTS -Xms1024m -Xmx12288m " to the file bin/catalina.sh. In this example the initial heap is 1 GB and the maximum heap is increased to 12 GB.
Enable compression to save bandwidth: you can ask Tomcat to compress the HTML and JSON output. Open conf/server.xml, look for the connector on port 8080 and turn compression on for it.
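The two changes above can be sketched as follows. The JAVA_OPTS line is the one from the text. The Connector element is only a reference to compare with your own conf/server.xml: the timeout, the minimum size and the MIME type list are assumptions, and the helper name print_compressed_connector is hypothetical. Tomcat 8.0 spells the attribute compressableMimeType, while later releases prefer compressibleMimeType.

```shell
# Line for bin/catalina.sh: 1 GB initial heap, 12 GB maximum heap.
JAVA_OPTS="$JAVA_OPTS -Xms1024m -Xmx12288m"

# Print a reference Connector element with compression enabled, to compare
# against the port 8080 connector in conf/server.xml.
print_compressed_connector() {
    cat <<'EOF'
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           compression="on"
           compressionMinSize="1024"
           compressableMimeType="text/html,text/xml,text/plain,application/json"/>
EOF
}
```

Including application/json in the MIME type list matters here because the REST web services return JSON, which Tomcat does not compress by default.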
MongoDB
OpenCGA uses MongoDB 4.2.x+ to store the Catalog database and also as a possible backend for the Variant Storage engine. The best way to install MongoDB is from the Linux package managers; you can follow the official MongoDB installation tutorials for Ubuntu and RedHat/CentOS.
You can customise the server configuration at /etc/mongodb.conf or the client at ~/.mongorc.js. Some recommended settings for MongoDB:
Always use WiredTiger engine
Make sure the journal is enabled
Make sure to have one directory per db with: directoryperdb=true
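As an illustration, the three recommendations map onto the following options when the server uses the YAML configuration format. This is a sketch under that assumption. It writes to a scratch file (a hypothetical path) so the settings can be reviewed and merged by hand rather than overwriting the live configuration.

```shell
# Write the recommended storage settings to a scratch file for review.
MONGO_SNIPPET="${MONGO_SNIPPET:-${TMPDIR:-/tmp}/mongod-storage.yaml}"
cat > "$MONGO_SNIPPET" <<'EOF'
storage:
  engine: wiredTiger
  journal:
    enabled: true
  directoryPerDB: true
EOF
echo "Settings written to $MONGO_SNIPPET"
```

Here directoryPerDB is the YAML spelling of the directoryperdb=true setting mentioned above.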
For a cluster installation we recommend setting up a Replica Set, which improves performance and makes the installation more robust. OpenCGA versions 2.x+ use MongoDB transactions, so a Replica Set is required for them.
Apache Solr
Apache Solr 6.x is used in OpenCGA as a complementary search engine to improve the performance of some queries and aggregations, full-text search and faceted queries on the Variant database. Solr is a highly reliable, scalable and fault-tolerant NoSQL database. It provides distributed indexing, replication, load-balanced querying, automated failover, recovery, centralised configuration and more. You can learn how to install Solr at https://cwiki.apache.org/confluence/display/solr/Installing+Solr.
OpenCGA needs to be able to create new Solr collections and provide the Variant search schema. To do this you need to download the Server Configuration if you are using OpenCGA version 1.2.x or 1.3.x; for version 1.4.0 you must use OpenCGAConfSet-1.4.0.tar.gz. Then uncompress it and copy it into the configsets folder of your Solr installation (i.e., $SOLR_HOME/server/solr/configsets).
For a Solr standalone installation, you have to copy the OpenCGAConfSet-1.4.0 configuration into the folder server/solr/configsets. You can then create new cores or collections dynamically from the Solr command line or from the REST web services, e.g.:
http://localhost:8983/solr/admin/cores?action=CREATE&name=my-new-core&configSet=OpenCGAConfSet-1.4.0
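The same request can be issued from a script. A minimal sketch, assuming Solr listens on localhost:8983. The helper name solr_create_core_url is hypothetical and only assembles the URL shown above.

```shell
SOLR_HOST="${SOLR_HOST:-localhost:8983}"   # host:port of the Solr server

# Build the CoreAdmin URL that creates a core from a named configset.
solr_create_core_url() {
    core_name="$1"
    config_set="$2"
    printf 'http://%s/solr/admin/cores?action=CREATE&name=%s&configSet=%s\n' \
        "$SOLR_HOST" "$core_name" "$config_set"
}

# With Solr running, the request can be sent with curl:
#   curl "$(solr_create_core_url my-new-core OpenCGAConfSet-1.4.0)"
```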
For a Solr cloud installation, you have to upload the OpenCGAConfSet configuration to your cluster by running the Solr ZooKeeper command line. In the following example command line, the OpenCGAConfSet configuration is located in the folder server/solr/configsets:
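A sketch of such an upload, assuming a Solr release whose bin/solr script provides the zk upconfig subcommand (older releases expose the same operation through zkcli.sh). The default values for SOLR_HOME and ZK_HOST and the helper name upload_configset are assumptions.

```shell
SOLR_HOME="${SOLR_HOME:-/opt/solr}"      # Solr installation folder (assumption)
ZK_HOST="${ZK_HOST:-localhost:9983}"     # ZooKeeper address (assumption)

# Upload a configset folder to ZooKeeper under the same name.
upload_configset() {
    config_name="$1"
    "$SOLR_HOME/bin/solr" zk upconfig \
        -n "$config_name" \
        -d "$SOLR_HOME/server/solr/configsets/$config_name" \
        -z "$ZK_HOST"
}

# Example:
#   upload_configset OpenCGAConfSet-1.4.0
```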
Now we can create collections from the Solr command line or from the REST services, e.g.:
For more information, see https://cwiki.apache.org/confluence/display/solr/Using+ZooKeeper+to+Manage+Configuration+File
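For the REST route, a collection can be created through the Solr Collections API. A minimal sketch, assuming Solr listens on localhost:8983 and the configuration was uploaded under the name OpenCGAConfSet-1.4.0. The collection name, the shard count and the helper name solr_create_collection_url are hypothetical.

```shell
SOLR_HOST="${SOLR_HOST:-localhost:8983}"   # host:port of any Solr node

# Build the Collections API URL that creates a collection from an
# already uploaded configuration.
solr_create_collection_url() {
    collection="$1"
    config_name="$2"
    num_shards="$3"
    printf 'http://%s/solr/admin/collections?action=CREATE&name=%s&numShards=%s&collection.configName=%s\n' \
        "$SOLR_HOST" "$collection" "$num_shards" "$config_name"
}

# With the cluster running:
#   curl "$(solr_create_collection_url my-collection OpenCGAConfSet-1.4.0 2)"
```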
Optional Dependencies
Apache Maven
OpenCGA uses Apache Maven 3.x as its build tool: we use Maven to compile, build, install and run the tests of OpenCGA. You therefore do not need Maven unless you want to compile and build the source code yourself, in which case you can follow the Installation Guide > Building from Source Code instructions. Remember that you can always get stable OpenCGA binaries from Installation#GettingOpenCGA.
Maven can be easily installed on Linux via the package manager, running the install command with sudo or as root on Ubuntu or CentOS.
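A minimal sketch of the package-manager route, assuming the package is called maven in the distribution repositories (as it is on Ubuntu and CentOS). The script only prints the command so that it can be reviewed before being run with root privileges.

```shell
# Choose the Maven install command for the package manager on this host.
if command -v apt-get >/dev/null 2>&1; then
    MAVEN_INSTALL_CMD="apt-get install -y maven"   # Ubuntu
elif command -v yum >/dev/null 2>&1; then
    MAVEN_INSTALL_CMD="yum install -y maven"       # CentOS
else
    MAVEN_INSTALL_CMD=""
fi

if [ -n "$MAVEN_INSTALL_CMD" ]; then
    echo "Run as root or with sudo: $MAVEN_INSTALL_CMD"
else
    echo "No apt-get or yum found; install Maven manually instead"
fi
```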
You can also install Maven manually by following the tutorial at http://maven.apache.org/install.html
You can check that Maven is installed by executing mvn -v or mvn -version, which prints the Maven version together with the Java version it runs on.
You need to add a Maven Profile to set up some variables that are injected during the build. You can learn more about this at Installation Guide > Building from Source Code.
Hadoop
This is an optional backend plugin for OpenCGA storage. We have used both Hortonworks and Cloudera as Hadoop distributions. At the moment we recommend Hortonworks 3.1+.
Cluster Dependencies
HAProxy
OpenCGA uses HAProxy 1.5+ in the cluster installation to balance all REST web service calls across the deployed Tomcat instances. You can easily install HAProxy via the package manager.
A very simple configuration is enough to get started; go to HAProxy for more documentation.
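A very simple configuration of this kind can be sketched as follows. It is an illustration, not the configuration used in production: the backend addresses come from the documentation-only 192.0.2.0/24 range, and the section names, timeouts and output path are hypothetical. The file is written to a scratch location so it can be reviewed before being copied to the HAProxy configuration folder.

```shell
# Write a minimal round-robin configuration for two Tomcat instances.
HAPROXY_SKETCH="${HAPROXY_SKETCH:-${TMPDIR:-/tmp}/haproxy-opencga.cfg}"
cat > "$HAPROXY_SKETCH" <<'EOF'
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5s
    timeout client  60s
    timeout server  60s

frontend opencga_rest
    bind *:80
    default_backend opencga_tomcats

backend opencga_tomcats
    balance roundrobin
    server tomcat1 192.0.2.11:8080 check
    server tomcat2 192.0.2.12:8080 check
EOF

# Validate the syntax before using it, if haproxy is installed:
#   haproxy -c -f "$HAPROXY_SKETCH"
```

Round-robin balancing with health checks (the check keyword) spreads REST calls evenly and takes a Tomcat out of rotation when it stops responding.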