Manage FASTQ Datasets and Projects using ReadStore (Tutorial Part 3)

Effective management of FASTQ datasets and projects is at the core of ReadStore, enabling you and your team to streamline NGS workflows. Let’s walk through the key features and operations of ReadStore, starting with its data structure.

Understanding the ReadStore Data Structure

At the center of ReadStore are Datasets, which can be grouped into Projects:

  • Datasets: These contain the FASTQ files and are organized based on your NGS setup. For instance, a dataset can include two FASTQ files if paired-end sequencing was performed, or just one for single-end. Each FASTQ file links to the raw sequencing reads and provides additional information, such as quality scores and read length. Datasets can also store:
    • Metadata: Key-value pairs (e.g., replicate: 1, patient_id: A125) that help you organize and access datasets. Metadata is accessible programmatically and can be directly imported into analyses.
    • Attachments: Supplementary files you can upload, such as library prep details or downstream analysis results.
  • Projects: These help organize multiple datasets for better management. Like datasets, projects can store Metadata and Attachments. Additionally, projects allow you to set Dataset Keys, which serve as metadata templates that are inherited by all datasets within the project. This ensures consistency in metadata structure across datasets. Each dataset can be assigned to multiple projects.
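The relationship between Datasets, Projects, and Dataset Keys can be pictured with a small Python model. This is only an illustrative sketch: the class names, fields, and the `add_dataset` method are hypothetical and are not ReadStore's API. It assumes that inherited keys start out with empty values and that existing values are never overwritten.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Dataset:
    """FASTQ files plus free-form metadata and attachments (hypothetical model)."""
    name: str
    fastq_files: list[str]  # one file for single-end, two for paired-end
    metadata: dict[str, str] = field(default_factory=dict)
    attachments: list[str] = field(default_factory=list)


@dataclass
class Project:
    """Groups datasets and defines the metadata keys they should share."""
    name: str
    metadata: dict[str, str] = field(default_factory=dict)
    attachments: list[str] = field(default_factory=list)
    dataset_keys: list[str] = field(default_factory=list)
    datasets: list[Dataset] = field(default_factory=list)

    def add_dataset(self, dataset: Dataset) -> None:
        # Assumption: a template key is added with an empty value
        # unless the dataset already has a value for it.
        for key in self.dataset_keys:
            dataset.metadata.setdefault(key, "")
        self.datasets.append(dataset)


sample = Dataset(
    "sample_A",
    ["sample_A_R1.fastq.gz", "sample_A_R2.fastq.gz"],  # paired-end
    metadata={"patient_id": "A125"},
)
rna = Project("rnaseq_pilot", dataset_keys=["replicate", "patient_id"])
cohort = Project("cohort_2024", dataset_keys=["tissue"])

# A dataset can be assigned to more than one project.
rna.add_dataset(sample)
cohort.add_dataset(sample)

print(sample.metadata)
```

After both assignments, the dataset carries the keys of both projects while its existing `patient_id` value is kept.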

Navigating the Projects Overview

We start in the Projects Overview section, which lists all projects you have access to. Use the Search and Filter bars to narrow down the visible Projects. Switch on the Metadata toggle to show metadata keys in the overview, which makes it easier to group projects.

Selecting a Project via its checkbox opens a detailed view in the lower part of the page (the Detail toggle must be active). There you can inspect the Project's metadata, Datasets, Attachments (which you can download from this tab), and Collaborators.

Creating a Project

To create a project:

  • Navigate to the Project section and click Create.
  • Provide a name and description, then define any Metadata key-value pairs in the lower table.
  • Use the Dataset Keys Tab to set template values that will be applied to all datasets attached to this project.
  • Select the Datasets you want to include and upload any necessary Attachments.

Once created, the project will appear in the overview. You can view detailed information about the project by selecting it from the list. To organize the display, toggle the Metadata switch, which will arrange the table by key-value pairs for easier navigation.

Updating and Deleting Projects

To modify a project, click the Update button after selecting the project. This will bring up the same form used during creation, where you can adjust any settings. If you need to delete a project, click the delete icon; this will remove the project but retain all associated datasets.

Sharing Projects

To enable users from other groups to access Projects and associated Datasets belonging to your group, you can add collaborators using the sharing feature. Collaborators have read access to shared Projects and Datasets.

Click Share and select the collaborators you want to grant access to. They will then have read access to the project and its datasets. You can revoke access at any time by removing collaborators from the list.
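The sharing rules can be summarized as a toy access model. The sketch below is not how ReadStore implements permissions; the class and method names are hypothetical, and it assumes that only members of the owning group may modify a project while collaborators are limited to reading.

```python
class SharedProject:
    """Toy model of a project shared with read-only collaborators."""

    def __init__(self, name, owner_group):
        self.name = name
        self.owner_group = owner_group
        self.collaborators = set()

    def share(self, user):
        self.collaborators.add(user)

    def revoke(self, user):
        # discard() does nothing if the user was never a collaborator
        self.collaborators.discard(user)

    def can_read(self, user, group):
        return group == self.owner_group or user in self.collaborators

    def can_edit(self, group):
        # Assumption: only the owning group may modify the project.
        return group == self.owner_group


project = SharedProject("rnaseq_pilot", owner_group="genomics")
project.share("alice")  # alice belongs to another group

read_while_shared = project.can_read("alice", "proteomics")
edit_while_shared = project.can_edit("proteomics")

project.revoke("alice")
read_after_revoke = project.can_read("alice", "proteomics")

print(read_while_shared, edit_while_shared, read_after_revoke)
```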

Exporting Project Data

The Export button allows you to download a CSV file containing the project overview, including all metadata. This makes it easy to review your project inventory in tools like Excel.
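An exported CSV can also be processed in a script rather than a spreadsheet. The following sketch uses Python's standard `csv` module on an inline stand-in for the downloaded file. The column names (`name`, `description`, `tissue`, `assay`) and the rows are made up for illustration; the actual export layout may differ.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the downloaded export; real column names may differ.
exported = io.StringIO(
    "name,description,tissue,assay\n"
    "rnaseq_pilot,Pilot run,liver,RNA-seq\n"
    "cohort_2024,Main cohort,liver,WGS\n"
    "atac_test,Protocol test,brain,ATAC-seq\n"
)

# Group project names by the value of one metadata column.
projects_by_tissue = defaultdict(list)
for row in csv.DictReader(exported):
    projects_by_tissue[row["tissue"]].append(row["name"])

for tissue, names in sorted(projects_by_tissue.items()):
    print(f"{tissue}: {', '.join(names)}")
```

To read a real export, replace the `io.StringIO` object with `open(path, newline="")`.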

Navigating the Datasets Overview

The Datasets Overview section functions similarly to the Projects view. Here, you can see datasets that are part of specific projects, and you can update, manage, or export them. Switch the Metadata toggle to show metadata keys in the overview to group datasets.

Selecting a Dataset via its checkbox opens a detailed view in the lower part of the page (the Detail toggle must be active). There you can inspect the metadata, associated Projects, and Attachments, which you can download.

Via the Features Details section, you can open the Read FASTQ files, then download them or get their file paths.

Figure: Metadata toggle

Figure: FASTQ file access

Updating Datasets

  • Datasets automatically inherit metadata keys from the parent project’s Dataset Keys. You can fill in values for these keys and add additional metadata as needed.
  • Attach files to datasets, and confirm changes by clicking Update. The update view also allows you to delete datasets.
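Because inherited keys still need values, it can be useful to check which datasets have gaps. The helper below is a hypothetical sketch that works on plain dictionaries, not on ReadStore objects. It assumes that a missing key and an empty or whitespace-only value both count as "not filled in".

```python
def missing_values(metadata, dataset_keys):
    """Return the template keys that have no usable value yet."""
    return [
        key for key in dataset_keys
        if not str(metadata.get(key, "")).strip()
    ]


# Hypothetical Dataset Keys of the parent project.
dataset_keys = ["replicate", "patient_id", "tissue"]

# Hypothetical datasets, mapped from name to metadata.
datasets = {
    "sample_A": {"replicate": "1", "patient_id": "A125", "tissue": "liver"},
    "sample_B": {"replicate": "2", "patient_id": "", "tissue": "liver"},
    "sample_C": {"replicate": "1"},
}

# Keep only the datasets that still have unfilled keys.
incomplete = {}
for name, metadata in datasets.items():
    gaps = missing_values(metadata, dataset_keys)
    if gaps:
        incomplete[name] = gaps

for name, gaps in incomplete.items():
    print(f"{name}: missing {', '.join(gaps)}")
```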

Exporting Datasets

  • Click Export to download either a dataset overview including metadata, or a FASTQ summary with QC metrics. You can export everything, including file paths, or use filters to restrict the export to specific datasets or projects.
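The idea of filtering before exporting can be illustrated on plain Python records. This sketch does not call ReadStore. The record layout, the `select` helper, and the file paths are all hypothetical and only show how a metadata filter narrows a set of datasets down to the FASTQ paths of interest.

```python
def select(datasets, **criteria):
    """Keep datasets whose metadata matches every key=value criterion."""
    return [
        d for d in datasets
        if all(d["metadata"].get(k) == v for k, v in criteria.items())
    ]


# Hypothetical dataset records with metadata and FASTQ paths.
datasets = [
    {
        "name": "sample_A",
        "metadata": {"tissue": "liver", "replicate": "1"},
        "fastq": ["/data/sample_A_R1.fastq.gz", "/data/sample_A_R2.fastq.gz"],
    },
    {
        "name": "sample_B",
        "metadata": {"tissue": "brain", "replicate": "1"},
        "fastq": ["/data/sample_B_R1.fastq.gz"],
    },
    {
        "name": "sample_C",
        "metadata": {"tissue": "liver", "replicate": "2"},
        "fastq": ["/data/sample_C_R1.fastq.gz"],
    },
]

liver = select(datasets, tissue="liver")
liver_paths = [path for d in liver for path in d["fastq"]]

for path in liver_paths:
    print(path)
```

Passing several criteria, for example `select(datasets, tissue="liver", replicate="2")`, narrows the subset further because every criterion has to match.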

Getting Help

That’s a quick overview of how to manage Datasets and Projects in ReadStore. If you have further questions or need assistance, please reach out.
