OpenCGA
v2.1

Create Projects and Studies


Last updated 3 years ago


Introduction

How data is organised into projects and studies largely determines how usable it is in OpenCGA.

Projects provide physical separation of data into different database tables. Studies provide logical separation of data within a Project.

Guidelines

  • You MUST store your data in different projects when they are on different genome assemblies (e.g. create one project for GRCh37 data and another for GRCh38 data).

  • You CAN store your data in different projects when there is no foreseeable need to process them jointly.

  • You MAY divide your data into studies, each corresponding to an independent dataset that might be analysed together with the others, so that every study holds a homogeneous dataset.
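
As an illustration of the guidelines above, the following sketch maps a few datasets to one project per genome assembly and one study per dataset. The dataset names and the derived project ids are hypothetical placeholders, and the script only prints a proposed layout; it does not contact OpenCGA.

```shell
#!/usr/bin/env bash
# Hypothetical datasets as "name:assembly" pairs (illustrative names only).
datasets="cohortA:GRCh37 cohortB:GRCh38 trioC:GRCh38"

# Print one line per dataset: the project (one per assembly) and its study.
plan_layout() {
  for entry in $1; do
    name="${entry%%:*}"      # text before the colon
    assembly="${entry##*:}"  # text after the colon
    # Derive a hypothetical project id by lower-casing the assembly name.
    project="$(printf '%s' "$assembly" | tr '[:upper:]' '[:lower:]')"
    printf 'project=%s study=%s\n' "$project" "$name"
  done
}

plan_layout "$datasets"
```

Datasets on the same assembly end up as separate studies in one project, while datasets on different assemblies never share a project.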

Owner user

The owner is the user who creates the project/study where the new data will be loaded. Only the owner, and other users to whom the owner has granted admin privileges for a specific study, can ingest data into that study.

Once the structure has been decided, the new projects and studies may need to be created. This step must be performed by the user who will own the newly created elements.

Creating new projects

The first step is to log in to OpenCGA with a FULL account (see Data Management). Then, a Project can be created using the following command:

$ ./opencga.sh projects create --id <short-project-id> \
                                -n <full-project-name> \
                                --organism-scientific-name hsapiens \
                                --organism-assembly <GRCh37|GRCh38>

Optionally, you can add other parameters such as --description. You can get the full list of parameters by adding the help option to the command.
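
As a concrete illustration, the sketch below fills in the placeholders and adds --description. It is a dry run: it builds and prints the command rather than executing it, so it can be checked before running against a real installation. The project id, name and description are hypothetical, and the assembly check simply mirrors the two values shown in the command above; it is an assumption that no other values are wanted here.

```shell
#!/usr/bin/env bash
# Dry run: print the project creation command instead of executing it.
project_create_cmd() {
  id="$1"; name="$2"; assembly="$3"; description="$4"
  # Assumption: accept only the two assemblies shown in this page.
  case "$assembly" in
    GRCh37|GRCh38) ;;
    *) echo "unsupported assembly: $assembly" >&2; return 1 ;;
  esac
  printf './opencga.sh projects create --id %s -n "%s" --organism-scientific-name hsapiens --organism-assembly %s --description "%s"\n' \
    "$id" "$name" "$assembly" "$description"
}

# Hypothetical values, for illustration only.
project_create_cmd rd38 "Rare Disease GRCh38" GRCh38 "Example project"
```

Copying the printed line into a terminal (after logging in) would run the actual creation.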

Creating new studies

As with project creation, studies are created with the following command:

$ ./opencga.sh studies create --project <project-id> \
                              --id <short-study-id> \
                              -n <full-study-name>

You don’t need to provide the organism assembly again, as it’s inherited from the project. Remember that all studies from the same project will share the same assembly.
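
When a project holds several studies, the same command is repeated once per study. The sketch below prints one studies create command for each entry in a list; it is a dry run and does not call OpenCGA. The project id and the study ids and names are hypothetical placeholders.

```shell
#!/usr/bin/env bash
project="rd38"   # hypothetical project id
# Hypothetical studies as "id=name" pairs, one per line.
studies="exomes=Exome Cohort
genomes=Genome Cohort"

# $1 = project id, $2 = newline-separated "id=name" pairs
study_create_cmds() {
  printf '%s\n' "$2" | while IFS='=' read -r id name; do
    printf './opencga.sh studies create --project %s --id %s -n "%s"\n' \
      "$1" "$id" "$name"
  done
}

study_create_cmds "$project" "$studies"
```

No assembly appears in the generated commands, since every study takes it from its project.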

To get the list of all projects and studies belonging to one specific user, run:

$ ./opencga.sh users info
