# Catalog

## Overview <a href="#catalogandsecurity-overview" id="catalogandsecurity-overview"></a>

A genomic data analysis platform needs to keep track of different resources, such as file metadata, sample annotations and jobs. OpenCGA Catalog aims to collect and integrate all the information needed to execute genomic analyses. This information is organized into nine main entities: *users*, *studies*, *files*, *samples*, *datasets*, *cohorts*, *individuals*, *disease panels* and *jobs*.

## Main Features <a href="#catalogandsecurity-mainfeatures" id="catalogandsecurity-mainfeatures"></a>

The main tasks of Catalog are to provide:

* Authentication and authorization to the different resources.
* A collaborative environment.
* File audit to keep track of files and metadata.
* Analysis and jobs.
* Sample, individual and cohort annotation.
* Security.
* Versioning.

## Data Models <a href="#catalogandsecurity-datamodels" id="catalogandsecurity-datamodels"></a>

This section describes the most relevant entities. For more detailed information about the data models, such as the Java source code, examples or the JSON Schemas, visit the [OpenCGA Catalog Data Models](https://github.com/opencb/opencga/wiki/OpenCGA-Catalog-Data-Models) page. The following picture gives an overview of the data model:

![](http://docs.opencb.org/download/attachments/327907/catalog_data_models_v13.png?version=1\&modificationDate=1560245879990\&api=v2)

The most relevant entities in OpenCGA Catalog are:

* User: Contains the data related to the user account.
* Project: Contains information about a project, covering as many related studies as necessary.
* Study: The main working environment. Contains files, samples, individuals, jobs, etc.
* File: Information regarding a submitted or generated file.
* Sample: Information regarding the sample. Closely related to the File entity.
* Individual: Contains the information regarding the individual from whom the sample was taken.
* Cohort: Groups sets of samples that share some common feature(s).
* Disease panel: Defines a disease panel containing the variants, genes and/or regions of interest.
* Job: An analysis job launched using any of the files or samples.
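
To make the relationships between these entities more concrete, the following is a minimal Python sketch of how they could nest. It is an illustration only: the class layout, field names (`individual_id`, `sample_ids`, `input_file_ids`) and the helper `samples_of_individual` are assumptions made for this example and do not reproduce the actual OpenCGA Java classes or JSON Schemas.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical, simplified model: field names are illustrative,
# not the real OpenCGA Catalog schema.


@dataclass
class Individual:
    id: str


@dataclass
class Sample:
    id: str
    individual_id: Optional[str] = None  # individual the sample was taken from


@dataclass
class File:
    id: str
    sample_ids: List[str] = field(default_factory=list)  # samples related to the file


@dataclass
class Cohort:
    id: str
    sample_ids: List[str] = field(default_factory=list)  # samples grouped by a common feature


@dataclass
class Job:
    id: str
    input_file_ids: List[str] = field(default_factory=list)  # files the job was launched on


@dataclass
class Study:
    id: str
    files: List[File] = field(default_factory=list)
    samples: List[Sample] = field(default_factory=list)
    individuals: List[Individual] = field(default_factory=list)
    cohorts: List[Cohort] = field(default_factory=list)
    jobs: List[Job] = field(default_factory=list)


@dataclass
class Project:
    id: str
    studies: List[Study] = field(default_factory=list)  # a project covers related studies


def samples_of_individual(study: Study, individual_id: str) -> List[str]:
    """Return the ids of the samples in a study taken from one individual."""
    return [s.id for s in study.samples if s.individual_id == individual_id]


study = Study(
    id="demo_study",
    individuals=[Individual("ind1")],
    samples=[Sample("s1", "ind1"), Sample("s2", "ind1"), Sample("s3")],
    files=[File("calls.vcf", ["s1", "s2"])],
    cohorts=[Cohort("all", ["s1", "s2", "s3"])],
)
project = Project(id="demo_project", studies=[study])
```

In this sketch a project owns its studies and a study owns the remaining entities, while samples, files, cohorts and jobs refer to each other by id. The real data models linked above carry many more fields.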

## RESTful web services <a href="#catalogandsecurity-restfulwebservices" id="catalogandsecurity-restfulwebservices"></a>

All this information can be stored and retrieved through the Java API and the RESTful web services.


