Class: Dataset (Dataset)
Information about a specific grouping of data files
URI: include:Dataset
classDiagram
class Dataset
click Dataset href "../Dataset"
Thing <|-- Dataset
click Thing href "../Thing"
Dataset : accessLimitations
Dataset : accessRequirements
Dataset : dataCategory
Dataset --> "1..*" EnumDataCategory : dataCategory
click EnumDataCategory href "../EnumDataCategory"
Dataset : dataCollectionEndYear
Dataset : dataCollectionStartYear
Dataset : datasetDescription
Dataset : datasetExternalId
Dataset : datasetGlobalId
Dataset : datasetName
Dataset : dataType
Dataset : dbgap
Dataset : expectedNumberOfFiles
Dataset : expectedNumberOfParticipants
Dataset : experimentalPlatform
Dataset : experimentalStrategy
Dataset : isHarmonized
Dataset : otherAccessAuthority
Dataset : otherRepository
Dataset : publication
Dataset : studyCode
Dataset --> "1" EnumStudyCode : studyCode
click EnumStudyCode href "../EnumStudyCode"
Inheritance
- Thing
- Dataset
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
studyCode | 1 EnumStudyCode | Unique identifier for the study (generally a short acronym) | direct |
datasetName | 1 String | Full name of the dataset, provided by contributor | direct |
datasetDescription | 0..1 String | Brief additional notes about the dataset (1-3 sentences) that are not already... | direct |
datasetGlobalId | 0..1 String | Unique Global ID for dataset, generated by DCC | direct |
datasetExternalId | 0..1 String | Unique identifier or code for dataset, if provided by contributor | direct |
expectedNumberOfParticipants | 1 Integer | Expected number of participants in this Dataset (or actual number, if data ha... | direct |
expectedNumberOfFiles | 0..1 Integer | Expected number of files associated with this dataset, including dictionaries | direct |
dataCollectionStartYear | 0..1 String | Year that data collection started | direct |
dataCollectionEndYear | 0..1 String | Year that data collection ended | direct |
dataCategory | 1..* EnumDataCategory | General category of data in Dataset; pipe-separated if multiple | direct |
dataType | * String | Specific type of data contained in Dataset; pipe-separated if multiple (e.g. ...) | direct |
experimentalStrategy | * String | Experimental method used to obtain data in Dataset; pipe-separated if multipl... | direct |
experimentalPlatform | * String | Specific platform used to perform experiment; pipe-separated if multiple (e.g. ...) | direct |
publication | * Uri | URL for publication(s) describing the Dataset's rationale and methodology (Pu... | direct |
accessLimitations | 0..1 String | Data access limitations, as defined in the GA4GH Data Use Ontology (DUO; can ... | direct |
accessRequirements | 0..1 String | Data access requirements, as defined in the GA4GH Data Use Ontology (DUO; can... | direct |
dbgap | * String | dbGaP "phs" accession code(s) required to access the files in this Dataset, i... | direct |
otherRepository | 0..1 Uri | URL if dataset is already deposited in a public repository other than dbGaP (... | direct |
otherAccessAuthority | 0..1 String | Email or URL for dataset's Access Authority, if not dbGaP | direct |
isHarmonized | 0..1 Boolean | For omics datasets, is this Dataset already harmonized and available in the I... | direct |
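The cardinalities in the Slots table can be checked mechanically. The following is a minimal sketch, not the INCLUDE DCC's validator: `SLOT_CARDINALITY` and `check_cardinality` are hypothetical names, a record is assumed to be a plain dict keyed by slot name, and enum membership (EnumStudyCode, EnumDataCategory) is not checked because the permissible values are not listed on this page.

```python
# Cardinalities copied from the Slots table; all names here are illustrative.
SLOT_CARDINALITY = {
    "studyCode": "1",
    "datasetName": "1",
    "datasetDescription": "0..1",
    "datasetGlobalId": "0..1",
    "datasetExternalId": "0..1",
    "expectedNumberOfParticipants": "1",
    "expectedNumberOfFiles": "0..1",
    "dataCollectionStartYear": "0..1",
    "dataCollectionEndYear": "0..1",
    "dataCategory": "1..*",
    "dataType": "*",
    "experimentalStrategy": "*",
    "experimentalPlatform": "*",
    "publication": "*",
    "accessLimitations": "0..1",
    "accessRequirements": "0..1",
    "dbgap": "*",
    "otherRepository": "0..1",
    "otherAccessAuthority": "0..1",
    "isHarmonized": "0..1",
}


def check_cardinality(record):
    """Return a list of human-readable cardinality violations for one record."""
    problems = []
    # Flag keys that are not slots of Dataset.
    for slot in record:
        if slot not in SLOT_CARDINALITY:
            problems.append(f"{slot}: not a Dataset slot")
    for slot, card in SLOT_CARDINALITY.items():
        value = record.get(slot)
        multivalued = card in ("*", "1..*")
        required = card in ("1", "1..*")
        if multivalued:
            # A missing multivalued slot is treated as an empty list.
            values = [] if value is None else value
            if not isinstance(values, list):
                problems.append(f"{slot}: expected a list")
            elif required and not values:
                problems.append(f"{slot}: at least one value is required")
        else:
            if isinstance(value, list):
                problems.append(f"{slot}: expected a single value")
            elif required and value is None:
                problems.append(f"{slot}: value is required")
    return problems


# "EXAMPLE" is a placeholder, not a real EnumStudyCode value.
example = {
    "studyCode": "EXAMPLE",
    "datasetName": "Example dataset",
    "expectedNumberOfParticipants": 100,
    "dataCategory": [],
}
print(check_cardinality(example))
```

The empty `dataCategory` list is the only violation reported for this record, since `1..*` requires at least one value while the `*` and `0..1` slots may be omitted.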
Identifier and Mapping Information
Annotations
property | value |
---|---|
required | False |
Schema Source
- from schema: https://w3id.org/include
Mappings
Mapping Type | Mapped Value |
---|---|
self | include:Dataset |
native | include:Dataset |
LinkML Source
Direct
name: Dataset
definition_uri: include:Dataset
annotations:
  required:
    tag: required
    value: 'False'
description: Information about a specific grouping of data files
title: Dataset
from_schema: https://w3id.org/include
is_a: Thing
slots:
- studyCode
- datasetName
- datasetDescription
- datasetGlobalId
- datasetExternalId
- expectedNumberOfParticipants
- expectedNumberOfFiles
- dataCollectionStartYear
- dataCollectionEndYear
- dataCategory
- dataType
- experimentalStrategy
- experimentalPlatform
- publication
- accessLimitations
- accessRequirements
- dbgap
- otherRepository
- otherAccessAuthority
- isHarmonized
slot_usage:
  dataCategory:
    name: dataCategory
    description: General category of data in Dataset; pipe-separated if multiple
    multivalued: true
  dbgap:
    name: dbgap
    description: dbGaP "phs" accession code(s) required to access the files in this
      Dataset, if applicable (pipe-separated if multiple)
  publication:
    name: publication
    description: URL for publication(s) describing the Dataset's rationale and methodology
      (PubMed Central preferred but not required; pipe-separated if multiple)
  expectedNumberOfParticipants:
    name: expectedNumberOfParticipants
    description: Expected number of participants in this Dataset (or actual number,
      if data has been submitted to INCLUDE DCC). If additional explanation is needed,
      please add to Dataset Description field.
  dataType:
    name: dataType
    description: Specific type of data contained in Dataset; pipe-separated if multiple
      (e.g. Preprocessed metabolite relative abundance, Absolute protein concentration,
      Aligned reads, Simple nucleotide variations, GVCF, Gene expression quantifications,
      Gene fusions, Somatic copy number variations, Somatic structural variations)
    multivalued: true
  experimentalStrategy:
    name: experimentalStrategy
    description: Experimental method used to obtain data in Dataset; pipe-separated
      if multiple (e.g. Whole genome sequencing, RNAseq, Multiplex immunoassay, Mass
      spec metabolomics)
    multivalued: true
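Several of the slot descriptions above say that multiple values are "pipe-separated". A minimal sketch of how one manifest cell might be split into a list for a multivalued slot, assuming a plain `|` delimiter with optional surrounding whitespace; `split_pipe_separated` is a hypothetical helper, not part of the INCLUDE tooling.

```python
def split_pipe_separated(cell):
    """Split a pipe-separated cell into a list of trimmed, non-empty values."""
    if cell is None:
        return []
    # Drop empty fragments so a trailing or doubled pipe does not yield "".
    return [part.strip() for part in cell.split("|") if part.strip()]


print(split_pipe_separated("Whole genome sequencing | RNAseq"))
```

An empty or missing cell yields an empty list, which matches the `*` cardinality of slots such as `experimentalStrategy`.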
Induced
name: Dataset
definition_uri: include:Dataset
annotations:
  required:
    tag: required
    value: 'False'
description: Information about a specific grouping of data files
title: Dataset
from_schema: https://w3id.org/include
is_a: Thing
slot_usage:
  dataCategory:
    name: dataCategory
    description: General category of data in Dataset; pipe-separated if multiple
    multivalued: true
  dbgap:
    name: dbgap
    description: dbGaP "phs" accession code(s) required to access the files in this
      Dataset, if applicable (pipe-separated if multiple)
  publication:
    name: publication
    description: URL for publication(s) describing the Dataset's rationale and methodology
      (PubMed Central preferred but not required; pipe-separated if multiple)
  expectedNumberOfParticipants:
    name: expectedNumberOfParticipants
    description: Expected number of participants in this Dataset (or actual number,
      if data has been submitted to INCLUDE DCC). If additional explanation is needed,
      please add to Dataset Description field.
  dataType:
    name: dataType
    description: Specific type of data contained in Dataset; pipe-separated if multiple
      (e.g. Preprocessed metabolite relative abundance, Absolute protein concentration,
      Aligned reads, Simple nucleotide variations, GVCF, Gene expression quantifications,
      Gene fusions, Somatic copy number variations, Somatic structural variations)
    multivalued: true
  experimentalStrategy:
    name: experimentalStrategy
    description: Experimental method used to obtain data in Dataset; pipe-separated
      if multiple (e.g. Whole genome sequencing, RNAseq, Multiplex immunoassay, Mass
      spec metabolomics)
    multivalued: true
attributes:
  studyCode:
    name: studyCode
    definition_uri: include:studyCode
    description: Unique identifier for the study (generally a short acronym)
    title: Study Code
    from_schema: https://w3id.org/include
    rank: 1000
    alias: studyCode
    owner: Dataset
    domain_of:
    - Biospecimen
    - DataFile
    - Participant
    - Condition
    - Study
    - Dataset
    - DatasetManifest
    range: enum_studyCode
    required: true
  datasetName:
    name: datasetName
    definition_uri: include:datasetName
    description: Full name of the dataset, provided by contributor
    title: Dataset Name
    from_schema: https://w3id.org/include
    rank: 1000
    alias: datasetName
    owner: Dataset
    domain_of:
    - Dataset
    - DatasetManifest
    range: string
    required: true
  datasetDescription:
    name: datasetDescription
    definition_uri: include:datasetDescription
    description: Brief additional notes about the dataset (1-3 sentences) that are
      not already captured in the other fields
    title: Dataset Description
    from_schema: https://w3id.org/include
    rank: 1000
    alias: datasetDescription
    owner: Dataset
    domain_of:
    - Dataset
    range: string
  datasetGlobalId:
    name: datasetGlobalId
    definition_uri: include:datasetGlobalId
    description: Unique Global ID for dataset, generated by DCC
    title: Dataset Global ID
    from_schema: https://w3id.org/include
    rank: 1000
    alias: datasetGlobalId
    owner: Dataset
    domain_of:
    - Dataset
    - DatasetManifest
    range: string
    required: false
  datasetExternalId:
    name: datasetExternalId
    definition_uri: include:datasetExternalId
    description: Unique identifier or code for dataset, if provided by contributor
    title: Dataset External ID
    from_schema: https://w3id.org/include
    rank: 1000
    alias: datasetExternalId
    owner: Dataset
    domain_of:
    - Dataset
    - DatasetManifest
    range: string
  expectedNumberOfParticipants:
    name: expectedNumberOfParticipants
    definition_uri: include:expectedNumberOfParticipants
    description: Expected number of participants in this Dataset (or actual number,
      if data has been submitted to INCLUDE DCC). If additional explanation is needed,
      please add to Dataset Description field.
    title: Expected Number of Participants
    from_schema: https://w3id.org/include
    rank: 1000
    alias: expectedNumberOfParticipants
    owner: Dataset
    domain_of:
    - Study
    - Dataset
    range: integer
    required: true
  expectedNumberOfFiles:
    name: expectedNumberOfFiles
    definition_uri: include:expectedNumberOfFiles
    description: Expected number of files associated with this dataset, including
      dictionaries. If additional explanation is needed, please add to Dataset Description
      field.
    title: Expected Number of Files
    from_schema: https://w3id.org/include
    rank: 1000
    alias: expectedNumberOfFiles
    owner: Dataset
    domain_of:
    - Dataset
    range: integer
    required: false
  dataCollectionStartYear:
    name: dataCollectionStartYear
    definition_uri: include:dataCollectionStartYear
    description: Year that data collection started
    title: Data Collection Start Year
    from_schema: https://w3id.org/include
    rank: 1000
    alias: dataCollectionStartYear
    owner: Dataset
    domain_of:
    - Dataset
    range: string
    required: false
  dataCollectionEndYear:
    name: dataCollectionEndYear
    definition_uri: include:dataCollectionEndYear
    description: Year that data collection ended
    title: Data Collection End Year
    from_schema: https://w3id.org/include
    rank: 1000
    alias: dataCollectionEndYear
    owner: Dataset
    domain_of:
    - Dataset
    range: string
    required: false
  dataCategory:
    name: dataCategory
    definition_uri: include:dataCategory
    description: General category of data in Dataset; pipe-separated if multiple
    title: Data Category
    from_schema: https://w3id.org/include
    rank: 1000
    alias: dataCategory
    owner: Dataset
    domain_of:
    - DataFile
    - Study
    - Dataset
    range: enum_dataCategory
    required: true
    multivalued: true
  dataType:
    name: dataType
    definition_uri: include:dataType
    description: Specific type of data contained in Dataset; pipe-separated if multiple
      (e.g. Preprocessed metabolite relative abundance, Absolute protein concentration,
      Aligned reads, Simple nucleotide variations, GVCF, Gene expression quantifications,
      Gene fusions, Somatic copy number variations, Somatic structural variations)
    title: Data Type
    from_schema: https://w3id.org/include
    rank: 1000
    alias: dataType
    owner: Dataset
    domain_of:
    - DataFile
    - Dataset
    range: string
    multivalued: true
  experimentalStrategy:
    name: experimentalStrategy
    definition_uri: include:experimentalStrategy
    description: Experimental method used to obtain data in Dataset; pipe-separated
      if multiple (e.g. Whole genome sequencing, RNAseq, Multiplex immunoassay, Mass
      spec metabolomics)
    title: Experimental Strategy
    from_schema: https://w3id.org/include
    rank: 1000
    alias: experimentalStrategy
    owner: Dataset
    domain_of:
    - DataFile
    - Dataset
    range: string
    multivalued: true
  experimentalPlatform:
    name: experimentalPlatform
    definition_uri: include:experimentalPlatform
    description: Specific platform used to perform experiment; pipe-separated if multiple
      (e.g. SOMAscan, MSD, Luminex, Illumina)
    title: Experimental Platform
    from_schema: https://w3id.org/include
    rank: 1000
    alias: experimentalPlatform
    owner: Dataset
    domain_of:
    - DataFile
    - Dataset
    range: string
    multivalued: true
  publication:
    name: publication
    definition_uri: include:publication
    description: URL for publication(s) describing the Dataset's rationale and methodology
      (PubMed Central preferred but not required; pipe-separated if multiple)
    title: Publication
    from_schema: https://w3id.org/include
    rank: 1000
    alias: publication
    owner: Dataset
    domain_of:
    - Study
    - Dataset
    range: uri
    multivalued: true
  accessLimitations:
    name: accessLimitations
    definition_uri: include:accessLimitations
    description: Data access limitations, as defined in the GA4GH Data Use Ontology
      (DUO; can list more than one, pipe separated)
    title: Access Limitations
    from_schema: https://w3id.org/include
    rank: 1000
    alias: accessLimitations
    owner: Dataset
    domain_of:
    - Dataset
    range: string
    required: false
  accessRequirements:
    name: accessRequirements
    definition_uri: include:accessRequirements
    description: Data access requirements, as defined in the GA4GH Data Use Ontology
      (DUO; can list more than one, pipe separated)
    title: Access Requirements
    from_schema: https://w3id.org/include
    rank: 1000
    alias: accessRequirements
    owner: Dataset
    domain_of:
    - Dataset
    range: string
    required: false
  dbgap:
    name: dbgap
    definition_uri: include:dbgap
    description: dbGaP "phs" accession code(s) required to access the files in this
      Dataset, if applicable (pipe-separated if multiple)
    title: dbGaP
    from_schema: https://w3id.org/include
    rank: 1000
    alias: dbgap
    owner: Dataset
    domain_of:
    - Study
    - Dataset
    range: string
    multivalued: true
  otherRepository:
    name: otherRepository
    definition_uri: include:otherRepository
    description: URL if dataset is already deposited in a public repository other
      than dbGaP (e.g. LONI, Metabolomics Workbench, etc.)
    title: Other Repository
    from_schema: https://w3id.org/include
    rank: 1000
    alias: otherRepository
    owner: Dataset
    domain_of:
    - Dataset
    range: uri
  otherAccessAuthority:
    name: otherAccessAuthority
    definition_uri: include:otherAccessAuthority
    description: Email or URL for dataset's Access Authority, if not dbGaP
    title: Other Access Authority
    from_schema: https://w3id.org/include
    rank: 1000
    alias: otherAccessAuthority
    owner: Dataset
    domain_of:
    - Dataset
    range: string
  isHarmonized:
    name: isHarmonized
    definition_uri: include:isHarmonized
    description: For omics datasets, is this Dataset already harmonized and available
      in the INCLUDE Data Hub?
    title: Is Harmonized?
    from_schema: https://w3id.org/include
    rank: 1000
    alias: isHarmonized
    owner: Dataset
    domain_of:
    - Dataset
    range: boolean
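The "Cardinality and Range" column of the Slots table follows from the `required` and `multivalued` flags of the induced attributes. A small sketch of that mapping, where `cardinality` is a hypothetical helper name and an absent flag is treated as false:

```python
def cardinality(required=False, multivalued=False):
    """Map required/multivalued flags to the cardinality shown in the Slots table."""
    if multivalued:
        return "1..*" if required else "*"
    return "1" if required else "0..1"


# Flags taken from the induced attributes listed above.
print(cardinality(required=True, multivalued=True))  # dataCategory
print(cardinality(multivalued=True))                 # dbgap
print(cardinality(required=True))                    # studyCode
print(cardinality())                                 # isHarmonized
```

Note that `required: false` and an omitted `required` key give the same result, which is why `datasetGlobalId` and `datasetExternalId` both appear as `0..1`.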