OSTrails Documentation
Welcome to the technical documentation for OSTrails, the Open Science Plan-Track-Assess Pathways project.
Documentation Structure
The documentation is structured as follows:
Architecture
Commons
- Introduction
- Governance
- DMP Commons
- DMP Common Standard for maDMPs
- OSTrails Application Profile for maDMPs
- OSTrails maDMP API Specification
- maDMP mappings
- DMP Evaluation Metrics
- Metric 1: Reused Dataset Declared in the DMP
- Metric 2: Reused Dataset Has a Persistent Identifier
- Metric 3: Reused Dataset Has a Declared License
- Metric 4: Reused Dataset Has Distribution Information
- Metric 5: Reused Dataset Has Declared Access Conditions
- Metric 6: Reused Dataset Contains Personal Data
- Metric 7: Reused Dataset Contains Sensitive Data
- Metric 8: Reused Dataset Has an Access URL
- Metric 9: Reused Dataset PID Resolves in the Repository
- Metric 10: Reused Dataset Access Conditions Match the Repository
- Metric 11: Reused Dataset License Matches the Repository
- Metric 12: New Dataset Declared in the DMP
- Metric 13: New Dataset Collection or Creation Method Declared
- Metric 14: New Dataset Has Declared Access Conditions
- Metric 15: New Dataset Has Sufficient Metadata
- Metric 16: New Dataset Persistent Identifier Resolves Successfully
- Metric 17: New Dataset Access Conditions Match the Repository
- Metric 18: New Dataset License Matches the Repository
- Metric 19: Dataset Type Specified
- Metric 20: Dataset File Format Specified
- Metric 21: Dataset Size Specified
- Metric 22: Dataset Type Matches the Repository
- Metric 23: Dataset File Format Matches the Repository
- Metric 24: Dataset Size Matches the Repository
- Metric 25: DMP Common Standard Field Compliance
- Metric 26: Controlled Vocabularies Used in Methodology
- Metric 27: Electronic Lab Notebook Referenced as a Technical Resource
- Metric 28: ReadMe File Reference
- Metric 29: Metadata Standards Used
- Metric 30: Dataset Distributions Use Open File Formats
- Metric 31: Electronic Lab Notebook Linked
- Metric 32: Existence of Dataset Documentation
- Metric 33: Quality Control Methods Stated
- Metric 34: Data Storage Location Mentioned in the DMP
- Metric 35: Use of Secure Storage for the Dataset in a Trusted Repository
- Metric 36: Alignment of Storage and Backup with Information Sensitivity
- Metric 37: Backup Responsibility
- Metric 38: Backup Frequency
- Metric 39: Version Control Practices for Software
- Metric 40: Stored Dataset Location Confirmed
- Metric 41: Security Measures Implementation
- Metric 42: Sensitive Data Protection Description
- Metric 43: Authorised Access Control
- Metric 44: Access Control and User Management
- Metric 45: Required Access Procedures
- Metric 46: GDPR and Ethics Compliance
- Metric 47: Final Security Measures Implementation
- Metric 48: Sensitive Data Using Method
- Metric 49: Provision of Anonymised Synthetic Data
- Metric 50: Statement of No Data Restrictions
- Metric 51: Dataset License Declared
- Metric 52: Software Dataset Has a Standardised Machine-Readable License
- Metric 53: Data Access Agreements
- Metric 54: Data Ownership Role Declared
- Metric 55: Software Dataset Author Declared
- Metric 56: Ethical Issues Status Declared
- Metric 57: Ethical Issues and Review
- Metric 58: Justification for Absence of Ethical Issues
- Metric 59: Data Access Status Open for the Dataset
- Metric 60: Data License is Present
- Metric 61: Data Restrictions Reference
- Metric 62: Dataset License Complies with Funder Requirements
- Metric 63: Repository Access Rights Consistency Aligned
- Metric 64: Repository Data License Aligned with the DMP
- Metric 65: Embargo Implementation Alignment
- Metric 66: Repository Data Restrictions
- Metric 67: Embargo Declared in the DMP or Repository
- Metric 68: Thematic Data Repositories Referenced
- Metric 69: Repository Conforms with FAIR Data Principles
- Metric 70: Trusted Repository is Used
- Metric 71: Verification of Back-up Strategy
- Metric 72: Certification of Repository
- Metric 73: Used Resources for Preservation
- Metric 74: Repository Policy is Present
- Metric 75: Repository Identifier Accuracy
- Metric 76: Long-Term Preservation Dataset
- Metric 77: Dataset Characteristics Are Compatible with the Repository
- Metric 78: Data External Resources Included in the DMP
- Metric 79: Metadata Standard Specified in the DMP
- Metric 80: Resolvable External Resources
- Metric 81: OpenAIRE Mentioned Dataset Validation
- Metric 82: Contributor Roles Follow CRediT Taxonomy
- Metric 83: Repository Supports Persistent Identifiers for Datasets
- Metric 84: Trusted Repository Referenced
- Metric 85: Dataset PID System in the DMP Matches the Repository
- Metric 86: Dataset Persistent Identifier Resolves Successfully
- Metric 87: Research Data Management Roles Declared
- Metric 88: DMP Validation by Data Steward
- Metric 89: Contributors and Organisations PIDs
- Metric 90: Referenced RDM Roles
- Metric 91: Data Steward Contribution Reflected in the Destination Repository
- Metric 92: Contributor and Organisation PIDs Match the Destination Repository
- Metric 93: DMP Includes a Budget for Personnel and Monetary Resources
- Metric 94: DMP States No Additional RDM Resources Are Required
- DMP Catalogue of Tests
- Test 1: Check for reused dataset declaration
- Test 2: Check License for Reused Datasets
- Test 3: Check for reused dataset PID
- Test 4: Check Distribution Entry is Present
- Test 5: Check Distribution Access Information is Present
- Test 6: Check Distribution Title is Present
- Test 7: Check Access Rights for Reused Datasets
- Test 8: Check Personal Data Flag for Reused Datasets
- Test 9: Check Sensitive Data Flag for Reused Datasets
- Test 10: Check Distribution URL is Present
- Test 11: Check Access URL is Present and Non-empty
- Test 12: Check PID Matches Destination Repository Record
- Test 13: Check PID Resolves Successfully
- Test 14: Check Reused Data Access Matches Destination
- Test 15: Check Reused Data License Matches Destination
- Test 16: Check for new data (no is_reused)
- Test 17: Check technical_resource for new data collection/creation
- Test 18: Check data_access for new datasets
- Test 19: Check rights of new dataset
- Test 20: Check metadata for new dataset
- Test 21: Check dataset_id exists
- Test 22: Check PID resolves for dataset_id
- Test 23: Check new data access matches destination
- Test 24: Check new data license matches destination
- Test 25: Check dataset.type is specified
- Test 26: Check distribution.format is specified
- Test 27: Check distribution.byte_size is specified
- Test 28: Check dataset.type matches destination type
- Test 29: Check dataset.type aligns with destination subtype
- Test 30: Check final dataset format matches destination files
- Test 31: Check final dataset size matches destination size
- Test 32: Check maDMP JSON Validates Against DMP Common Standard Schema
- Test 33: Check dataset_methodology for controlled vocabularies
- Test 34: Check technical_resource.name for electronic lab notebook reference
- Test 35: Check related_identifier resource_type for ReadMe file
- Test 36: Check metadata_standard_id is registered in metadata registries
- Test 37: Check distribution format is open
- Test 38: Check ELN dataset linked via related_ids
- Test 39: Check technical_resource for dataset documentation
- Test 40: Check data_quality_assurance for quality control methods
- Test 41: Check host.title and host.url for storage location
- Test 42: Check host for trusted repository storage
- Test 43: Check sensitive_data classification is assigned
- Test 44: Check host security and backup reflect sensitivity level
- Test 45: Check contributor.role for backup responsibility
- Test 46: Check backup_frequency is declared
- Test 47: Check host.id matches Zenodo deposit location
- Test 48: Check security_and_privacy.title for security measures
- Test 49: Check security_and_privacy.description for access rights management
- Test 50: Check security_and_privacy.description for authorised access controls
- Test 51: Check security_and_privacy.description for access control and user permissions
- Test 52: Check security_and_privacy.description for access procedures
- Test 53: Check security_and_privacy.description and ethical_issues_report for GDPR and ethics compliance
- Test 54: Check security_and_privacy.title for implemented security measures at destination
- Test 55: Check security_and_privacy.description for data protection method when sensitive_data is true
- Test 56: Check security_and_privacy for anonymised synthetic data provision
- Test 57: Check rights for statement of no data restrictions
- Test 58: Check license_ref for dataset licence
- Test 59: Check license_ref against SPDX for software datasets
- Test 60: Check data_access and rights for access agreements or MoUs
- Test 61: Check contributor.role for data owner
- Test 62: Check contributor for author role when dataset type is software
- Test 63: Check ethical_issues_exist for valid value
- Test 64: Check ethical_issues_description is present when ethical_issues_exist is no
- Test 65: Check data_access for open status
- Test 66: Check distribution is present for dataset
- Test 67: Check license_ref is present within distribution
- Test 68: Check rights for data restrictions reference
- Test 69: Check distribution license_ref for Horizon Europe CC-BY compliance
- Test 70: Check data_access matches destination host access policy
- Test 71: Check distribution license_ref matches destination host licence policy
- Test 72: Check distribution license.start_date matches destination embargo policy
- Test 73: Check rights matches destination host restriction policy
- Test 74: Check repository host for absence of embargo date
- Test 75: Check distribution.license.start_date for absence in maDMP
- Test 76: Check host.title and host.url against thematic repository registries
- Test 77: Check host against OpenAIRE and FAIRsharing FAIR benchmarks
- Test 78: Check host against trusted repository registry benchmark
- Test 79: Check host.backup_frequency and host.backup_type for back-up strategy
- Test 80: Check certified_with exists in host
- Test 81: Check cost title or description for preservation reference
- Test 82: Check host_id against FAIRsharing for repository policy
- Test 83: Check dataset_id resolves to declared destination via DOI URL
- Test 84: Check preservation_statement and host for long-term storage intention
- Test 85: Check dataset.keyword against Zenodo keywords
- Test 86: Check dataset.language against Zenodo language support
- Test 87: Check host_id against Zenodo and FAIRsharing for policy compliance
- Test 88: Check related_identifier.identifier for external resources
- Test 89: Check related_identifier for metadata standard fields
- Test 90: Check URLs in maDMP are valid and resolvable
- Test 91: Check dataset fields against OpenAIRE SKG-IF API
- Test 92: Check contributor roles against CRediT taxonomy
- Test 93: Check host.pid_system for PID declaration
- Test 94: Check certified_with against trusted registry
- Test 95: Check host_id.identifier and host_id.type for valid repository link
- Test 96: Check host.pid_system matches destination PID system in Zenodo
- Test 97: Check dmp.contributor name, role, and contact
- Test 98: Check dmp.contributor.role for Data Steward
- Test 99: Check contributor_id and affiliation.affiliation_id for PIDs
- Test 100: Check dmp.contributor fields against destination contributors
- Test 101: Check Data Steward role in maDMP against contributors.type Other in destination
- Test 102: Check contributor PIDs in maDMP against Zenodo contributors
- Test 103: Check cost in maDMP against repository cost
- Test 104: Check cost fields for budget specification
- Test 105: Check cost in maDMP for no additional resources statement
- SKG Commons
- FAIR Commons