From 38fc756e9243e9a52d0695155954bd9de6f62d7a Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 2 Oct 2023 21:11:01 +0100 Subject: [PATCH 01/46] tutorial layout + copied galaxy alevin --- .../tutorials/alevin-commandline/tutorial.md | 317 ++++++++++++++++++ 1 file changed, 317 insertions(+) create mode 100644 topics/single-cell/tutorials/alevin-commandline/tutorial.md diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md new file mode 100644 index 00000000000000..0ecf6245ea5047 --- /dev/null +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -0,0 +1,317 @@ +--- +layout: tutorial_hands_on + +title: 'Generating a single cell matrix using Alevin (bash + R)' +subtopic: single-cell-CS-code +priority: 1 +zenodo_link: + +questions: + - I have some single cell FASTQ files I want to analyse. Where do I start? + - How to generate a single cell matrix using command line? 
+ +objectives: + - Generate a cellxgene matrix for droplet-based single cell sequencing data + - Interpret quality control (QC) plots to make informed decisions on cell thresholds + - Find relevant information in GTF files for the particulars of their study, and include this in data matrix metadata + +time_estimation: 1H + +key_points: + - Create a scanpy-accessible AnnData object from FASTQ files, including relevant gene metadata + +requirements: +- + type: "internal" + topic_name: single-cell + tutorials: + - scrna-case_alevin + +follow_up_training: + - + type: "internal" + topic_name: single-cell + tutorials: + - scrna-case_alevin-combine-datasets + +tags: +- single-cell +- 10x +- paper-replication +- jupyter-notebook +- interactive-tools + + +contributions: + authorship: + - wee-snufkin + - nomadscientist + + funding: + - + +notebook: + language: + snippet: +--- + +# Introduction + +This tutorial is the part of [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using [Alevin]( https://salmon.readthedocs.io/en/latest/alevin.html) in bash command line. It is a replication of the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) and will guide you through the same steps that you followed in the previous tutorial and will give you more understanding of what is happening ‘behind the scenes’ or ‘inside the tools’ if you will. +We will work on the case study data from a mouse model of fetal growth restriction {% cite Bacon2018 %} (see [the study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and [the project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). +As a recap, we fill go from raw FASTQ files to a cell x gene data matrix in AnnData format. After completing the previous tutorial you should already know what is a data matrix and AnnData format. 
We will perform the following steps: +1. Getting the appropriate files +2. Making a transcript-to-gene ID mapping +3. + + +## Launching JupyterLab + +> Data uploads & JupyterLab +> There are a few ways of importing and uploading data into JupyterLab. You might find yourself accidentally doing this differently than the tutorial, and that's ok. There are a few key steps where you will call files from a location - if these don't work from you, check that the file location is correct and change accordingly! +{: .warning} + +> {% snippet faqs/galaxy/interactive_tools_jupyter_launch.md %} + +Welcome to JupyterLab! + +> Danger: You can lose data! +> Do NOT delete or close this notebook dataset in your history. YOU WILL LOSE IT! +{: .warning} + +## Open the notebook + +You have two options for how to proceed with this JupyterLab tutorial - you can run the tutorial from a pre-populated notebook, or you can copy and paste the code for each step into a fresh notebook and run it. The initial instructions for both options are below. + +> Option 1: Open the notebook directly in JupyterLab +> +> 1. Open a `Terminal` in JupyterLab with File -> New -> Terminal +> +> ![Screenshot of the Launcher tab with an arrow indicating where to find Terminal.](../../images/scrna-casestudy-monocle/terminal_choose.jpg "This is how the Launcher tab looks like and where you can find Terminal.") +> +> 2. Run +> ``` +> wget {{ ipynbpath }} +> ``` +> +> 3. Select the notebook that appears in the list of files on the left. +> +> +> Remember that you can also download this {% icon notebook %} [Jupyter Notebook]({{ ipynbpath }}) from the {% icon galaxy_instance %} Supporting Materials in the Overview box at the beginning of this tutorial. +{: .hands_on} + +> Option 2: Creating a notebook +> +> 1. Select the **Bash** icon under **Notebook** +> +> ![Bash icon](../../images/AAA "Bash Notebook Button") +> +> 2. 
Save your file (**File**: **Save**, or click the {% icon galaxy-save %} Save icon at the top left) +> +> 3. If you right click on the file in the folder window at the left, you can rename your file `whateveryoulike.ipynb` +> +{: .hands_on} + +> You should Save frequently! +> This is both for good practice and to protect you in case you accidentally close the browser. Your environment will still run, so it will contain the last saved notebook you have. You might eventually stop your environment after this tutorial, but ONLY once you have saved and exported your notebook (more on that at the end!) Note that you can have multiple notebooks going at the same time within this JupyterLab, so if you do, you will need to save and export each individual notebook. You can also download them at any time. +{: .warning} + +Let's crack on! + +> +> +> In this tutorial, we will cover: +> +> 1. TOC +> {:toc} +> +{: .agenda} + +# Generating a matrix + +In this section, we will show you the principles of the initial phase of single-cell RNA-seq analysis: generating expression measures in a matrix. We'll concentrate on droplet-based (rather than plate-based) methodology, since this is the process with most differences with respect to conventional approaches developed for bulk RNA-seq. + +Droplet-based data consists of three components: cell barcodes, unique molecular identifiers (UMIs) and cDNA reads. To generate cell-wise quantifications we need to: + + * Process cell barcodes, working out which ones correspond to 'real' cells, which to sequencing artefacts, and possibly correct any barcodes likely to be the product of sequencing errors by comparison to more frequent sequences. + * Map biological sequences to the reference genome or transcriptome. + * 'De-duplicate' using the UMIs. 
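To make the last of these steps more concrete, here is a minimal sketch of UMI de-duplication on a toy table. The three-column layout (cell barcode, UMI, gene) and all values are invented for illustration, and exact-match collapsing is a simplification: Alevin's own de-duplication is more sophisticated than this.

```bash
# Toy input: one mapped read per line as "cell_barcode UMI gene" (hypothetical values).
reads='AAAC TTG GeneA
AAAC TTG GeneA
AAAC CCA GeneA
GGGT TTG GeneB'

# Collapse reads that share barcode, UMI and gene (sort -u),
# then count the remaining distinct UMIs per barcode/gene pair.
umi_counts=$(printf '%s\n' "$reads" \
    | sort -u \
    | awk '{ n[$1 " " $3]++ } END { for (k in n) print k, n[k] }' \
    | sort)

echo "$umi_counts"
```

The two identical `AAAC TTG GeneA` reads are treated as PCR duplicates of one molecule, so that barcode/gene pair ends up with a count of 2 rather than 3.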
+ +This used to be a complex process involving multiple algorithms, or was performed with technology-specific methods (such as 10X's 'Cellranger' tool) but is now much simpler thanks to the advent of a few new methods. When selecting methodology for your own work you should consider: + + * [STARsolo](https://github.com/alexdobin/STAR) - a droplet-based scRNA-seq-specific variant of the popular genome alignment method STAR. Produces results very close to those of Cellranger (which itself uses STAR under the hood). + * [Kallisto/ bustools](https://www.kallistobus.tools/) - developed by the originators of the transcriptome quantification method, Kallisto. + * [Alevin](https://salmon.readthedocs.io/en/latest/alevin.html) - another transcriptome analysis method developed by the authors of the Salmon tool. + +We're going to use Alevin {% cite article-Alevin %} for demonstration purposes, but we do not endorse one method over another. + +## Get Data + +We've provided you with some example data to play with, a small subset of the reads in a mouse dataset of fetal growth restriction {% cite Bacon2018 %} (see the [study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and the [project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). This is a study using the Drop-seq chemistry, however this tutorial is almost identical to a 10x chemistry. We will point out the one tool parameter change you will need to run 10x samples. This data is not carefully curated, standard tutorial data - it's real, it's messy, it desperately needs filtering, it has background RNA running around, and most of all it will give you a chance to practice your analysis as if this data were yours. + +Down-sampled reads and some associated annotation can be imported below. How did I downsample these FASTQ files? 
Check out [this history](https://humancellatlas.usegalaxy.eu/u/wendi.bacon.training/h/pre-processing-with-alevin---part-1---how-to-downsample) to find out! + +Additionally, to map your reads, you will need a transcriptome to align against (a FASTA file) as well as the gene information for each transcript (a GTF file). You can download these for your species of interest [from Ensembl](https://www.ensembl.org/info/data/ftp/index.html). These files are included in the data import step below. Keep in mind, these are big files, so the fastest way to get these into your Galaxy account is through importing them by history. + +## Generate a transcript to gene map + +Gene-level, rather than transcript-level, quantification is standard in scRNA-seq, which means that the expression levels of alternatively spliced RNA molecules are combined to create gene-level values. Droplet-based scRNA-seq techniques only sample one end of each transcript, so they lack the full-molecule coverage that would be required to accurately quantify different transcript isoforms. + +To generate gene-level quantifications based on transcriptome quantification, Alevin and similar tools require a conversion between transcript and gene identifiers. We can derive a transcript-gene conversion from the gene annotations available in genome resources such as Ensembl. The transcripts in such a list need to match the ones we will use later to build a binary transcriptome index. If you were using spike-ins, you'd need to add these to the transcriptome and the transcript-gene mapping. + +In your example data you will see the murine reference annotation as retrieved from Ensembl in GTF format. This annotation contains gene, exon, transcript and all sorts of other information on the sequences. We will use these to generate the transcript/ gene mapping by passing that information to a tool that extracts just the transcript identifiers we need. 
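To show where those identifiers live, the following sketch pulls a transcript-to-gene table out of the attributes column with `awk`. It is only an illustration: the two-transcript GTF is made up, the identifiers `G1`, `T1` and `T2` are hypothetical, and it assumes Ensembl-style attributes of the form `gene_id "..."; transcript_id "...";`.

```bash
# Three toy GTF records (tab-separated, 9 columns); identifiers are invented for illustration.
gtf=$(printf '1\ttoy\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "Abc";\n1\ttoy\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; exon_number "1";\n1\ttoy\ttranscript\t100\t700\t.\t+\t.\tgene_id "G1"; transcript_id "T2"; gene_name "Abc";\n')

# Keep only "transcript" features and read gene_id / transcript_id from column 9.
tx2gene=$(printf '%s\n' "$gtf" | awk -F '\t' '$3 == "transcript" {
    gene = ""; tx = ""
    n = split($9, attrs, ";")
    for (i = 1; i <= n; i++) {
        a = attrs[i]
        gsub(/^ +| +$/, "", a)    # trim surrounding spaces
        if (a ~ /^gene_id /)       { gene = a; gsub(/^gene_id |"/, "", gene) }
        if (a ~ /^transcript_id /) { tx = a;   gsub(/^transcript_id |"/, "", tx) }
    }
    if (gene != "" && tx != "") print tx "\t" gene
}')

echo "$tx2gene"
```

Each output line pairs one transcript identifier with its gene identifier, which is the shape of table that a transcript-to-gene map needs. A real annotation may also carry version fields that have to match the identifiers in your FASTA, which this sketch ignores.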
+ +> +> +> Which of the 'attributes' in the last column of the GTF files contains the transcript and gene identifiers? +> +> +> > Hint +> > +> > The file is organised such that the last column (headed 'Group') contains a wealth of information in the format: attribute1 "information associated with attribute 1";attribute2 "information associated with attribute 2" etc. +> {: .tip} +> +> > +> > *gene_id* and *transcript_id* are each followed by "ensembl gene_id" and "ensembl transcript_id" +> {: .solution} +{: .question} + +It's now time to parse the GTF file using the [rtracklayer](https://bioconductor.org/packages/release/bioc/html/rtracklayer.html) package in R. This parsing will give us a conversion table with a list of transcript identifiers and their corresponding gene identifiers for counting. Additionally, because we will be generating our own binary index (more later!), we also need to input our FASTA so that it can be filtered to only contain transcriptome information found in the GTF. + +## Generate a transcriptome index & quantify! + +Alevin collapses the steps involved in dealing with dscRNA-seq into a single process. Such tools need to compare the sequences in your sample to a reference containing all the likely transcript sequences (a 'transcriptome'). This will contain the biological transcript sequences known for a given species, and perhaps also technical sequences such as 'spike ins' if you have those. + +> How does Alevin work? +> +> To be able to search a transcriptome quickly, Alevin needs to convert the text (FASTA) format sequences into something it can search quickly, called an 'index'. The index is in a binary rather than human-readable format, but allows fast lookup by Alevin. Because the types of biological and technical sequences we need to include in the index can vary between experiments, and because we often want to use the most up-to-date reference sequences from Ensembl or NCBI, we can end up re-making the indices quite often. 
Making these indices is time-consuming! Have a look at the uncompressed FASTA to see what it starts with. +> +{: .details} + +We now have: + +* Barcode/ UMI reads +* cDNA reads +* transcript/ gene mapping +* filtered FASTA + +We can now run Alevin. In some public instances, Alevin won't show up if you search for it. Instead, you may have to click the Single Cell tab at the left and scroll down to the Alevin tool. Alternatively, use Tutorial Mode as described above and you'll easily navigate to all the tools, and their versions will all be the tried and tested ones of this tutorial. It's often a good idea to check your tool versions. To identify which version of a tool you are using, select {% icon tool-versions %} 'Versions' and choose the appropriate version. In this case the tutorial was built with Alevin Galaxy Version 1.9.0+galaxy2. + +> What if I'm running a 10x sample? +> +> The main parameter that needs changing for a 10X Chromium sample is the 'Protocol' parameter of Alevin. Just select the correct 10x Chemistry there instead. +{: .comment} + +> Alevin file names +> +> You will notice that the names of the output files of Alevin are written in a certain convention, mentioning which tool was used and on which files, for example: *"Alevin on data X, data Y, and others: whitelist"*. Remember that you can always rename the files if you wish! For simplicity, when we refer to those files in the tutorial, we skip the information about tool and only use the second part of the name - in this case it would be simply *"whitelist"*. +{: .comment} + +This tool will take a while to run. Alevin produces many file outputs, not all of which we'll use. 
You can refer to the [Alevin documentation](https://salmon.readthedocs.io/en/latest/alevin.html) if you're curious what they all are, but what we're most interested in is: + +* the matrix itself (*per-cell gene-count matrix (MTX)* - the count by gene and cell) +* the row (cell/ barcode) identifiers (*row index (CB-ids)*) and +* the column (gene) labels (*column headers (gene-ids)*). + + +> +> +> After you've run Alevin, {% icon galaxy-eye %} look through all the different files. Can you find: +> 1. The Mapping Rate? +> 2. How many cells are present in the matrix output? +> +> > +> > +> > 1. Inspect {% icon galaxy-eye %} the file {% icon param-file %} *Salmon log file*. You can see the mapping rate is a paltry `25.45%`. This is a terrible mapping rate. Why might this be? Remember this was downsampled, and specifically by taking only the last 400,000 reads of the FASTQ file. The overall mapping rate of the file is more like 50%, which is still quite poor, but for early Drop-Seq samples and single-cell data in general, you might expect a slightly poorer mapping rate. 10x samples are much better these days! This is real data, not test data, after all! +> > 2. Inspect {% icon galaxy-eye %} the file {% icon param-file %} *row index (CB-ids)*, and you can see it has `2163` lines. The rows refer to the cells in the cell x gene matrix. According to this (rough) estimate, your sample has 2163 cells in it! +> > +> {: .solution} +> +{: .question} + +{% icon congratulations %} Congratulations - you've made an expression matrix! We could almost stop here. But it's sensible to do some basic QC, and one of the things we can do is look at a barcode rank plot. + +# Basic QC + +The question we're looking to answer here is: "do we mostly have a single cell per droplet"? That's what experimenters are normally aiming for, but it's not entirely straightforward to get exactly one cell per droplet. 
Sometimes almost no cells make it into droplets, other times we have too many cells in each droplet. At a minimum, we should easily be able to distinguish droplets with cells from those without. + +Now, the image generated here (400k) isn't the most informative - but you are dealing with a fraction of the reads! If you run the total sample (so identical steps above, but with significantly more time!) you'd get the image below. + +![raw droplet barcode plots-total](../../images/scrna-casestudy/wab-raw_barcodes-total.png "Total sample - 32,579,453 reads - raw") + +This is our own formulation of the barcode plot based on a [discussion](https://github.com/COMBINE-lab/salmon/issues/362#issuecomment-490160480) we had with community members. The left hand plots with the smooth lines are the main plots, showing the UMI counts for individual cell barcodes ranked from high to low. We expect a sharp drop-off between cell-containing droplets and ones that are empty or contain only cell debris. Now, this data is not an ideal dataset, so for perspective, in an ideal world with a very clean 10x run, data will look a bit more like the following taken from the lung atlas (see the [study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6653/results/tsne) and the [project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6653/)). + +![raw droplet barcode plots - lung atlas](../../images/scrna-casestudy/wab-lung-atlas-barcodes-raw.png "Pretty data - raw") + +In that plot, you can see the clearer 'knee' bend, showing the cut-off between empty droplets and cell-containing droplets. + +The right hand plots are density plots from the first one, and the thresholds are generated either using [dropletUtils](https://bioconductor.org/packages/release/bioc/html/DropletUtils.html) or by the method described in the discussion mentioned above. 
We could use any of these thresholds to select cells, assuming that anything with fewer counts is not a valid cell. By default, Alevin does something similar, and we can learn something about that by plotting just the barcodes Alevin retains. + +In experiments with relatively simple characteristics, this 'knee detection' method works relatively well. But some populations (such as our sample!) present difficulties. For instance, sub-populations of small cells may not be distinguished from empty droplets based purely on counts by barcode. Some libraries produce multiple 'knees' for multiple sub-populations. The [emptyDrops](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1662-y) method has become a popular way of dealing with this. emptyDrops still retains barcodes with very high counts, but also adds in barcodes that can be statistically distinguished from the ambient profiles, even if total counts are similar. In order to ultimately run emptyDrops (or indeed, whatever tool you like that accomplishes biologically relevant thresholding), we first need to re-run Alevin, but prevent it from applying its own less than ideal thresholds. + +To use emptyDrops effectively, we need to go back and re-run Alevin, stopping it from applying its own thresholds. Click the re-run icon {% icon galaxy-refresh %} on any Alevin output in your history, because almost every parameter is the same as before, except you need to change the following: + +Alevin outputs MTX format, which we can pass to the dropletUtils package and run emptyDrops. Unfortunately the matrix is in the wrong orientation for tools expecting files like those produced by 10X software (which dropletUtils does). We need to 'transform' the matrix such that cells are in columns and genes are in rows. + +Alevin outputs MTX format, which we can pass to the dropletUtils package and run emptyDrops. 
Unfortunately the matrix is in the wrong orientation for tools expecting files like those produced by 10X software (which dropletUtils does). We need to 'transform' the matrix such that cells are in columns and genes are in rows. + + +> Generate gene information +> +> 1. {% tool [GTF2GeneList](toolshed.g2.bx.psu.edu/repos/ebi-gxa/gtf2gene_list/_ensembl_gtf2gene_list/1.52.0+galaxy0) %} with the following parameters: +> - *"Feature type for which to derive annotation"*: `gene` +> - *"Field to place first in output table"*: `gene_id` +> - *"Suppress header line in output?"*: `Yes` +> - *"Comma-separated list of field names to extract from the GTF (default: use all fields)"*: `gene_id,gene_name,mito` +> - *"Append version to transcript identifiers?"*: `Yes` +> - *"Flag mitochondrial features?"*: `Yes` - note, this will auto-fill a bunch of acronyms for searching in the GTF for mitochondrial associated genes. This is good! +> - *"Filter a FASTA-format cDNA file to match annotations?"*: `No` - we don't need to, we're done with the FASTA! +> 2. Check that the output file type is `tabular`. If not, change the file type by clicking the 'Edit attributes'{% icon galaxy-pencil %} on the dataset in the history (as if you were renaming the file.) Then click `Datatypes` and type in `tabular`. Click `Change datatype`.) +> 2. Rename {% icon galaxy-pencil %} the annotation table to `Gene Information` +{: .hands_on} + +Inspect {% icon galaxy-eye %} the **Gene Information** object in the history. Now you have made a new key for gene_id, with gene name and a column of mitochondrial information (false = not mitochondrial, true = mitochondrial). We need to add this information into the salmonKallistoMtxTo10x output 'Gene table'. But we need to keep 'Gene table' in the same order, since it is referenced in the 'Matrix table' by row. + +> Combine MTX Gene Table with Gene Information +> +> 1. 
{% tool [Join two Datasets](join1) %} with the following parameters: +> - *"Join"*: `Gene Table` +> - *"Using column"*: `Column: 1` +> - *"with"*: `Gene Information` +> - *"and column"*: `Column: 1` +> - *"Keep lines of first input that do not join with second input"*: `Yes` +> - *"Keep lines of first input that are incomplete"*: `Yes` +> - *"Fill empty columns"*: `No` +> - *"Keep the header lines"*: `No` +> +> +> If you inspect {% icon galaxy-eye %} the object, you'll see we have joined these tables and now have quite a few gene_id repeats. Let's take those out, while keeping the order of the original 'Gene Table'. +> +> +> 2. {% tool [Cut columns from a table](Cut1) %} with the following parameters: +> - *"Cut columns"*: `c1,c4,c5` +> - *"Delimited by"*: `Tab` +> - *"From"*: output of **Join two Datasets** {% icon tool %} +> +> 3. Rename output `Annotated Gene Table` +{: .hands_on} + +Inspect {% icon galaxy-eye %} your `Annotated Gene Table`. That's more like it! You now have `gene_id`, `gene_name`, and `mito`. Now let's get back to your journey to emptyDrops and sophisticated thresholding of empty droplets! + +# emptyDrops + +emptyDrops {% cite article-emptyDrops %} works with a specific form of R object called a SingleCellExperiment. We need to convert our transformed MTX files into that form, using the DropletUtils Read10x tool: + +> Converting to SingleCellExperiment format +> +> 1. {% tool [DropletUtils Read10x](toolshed.g2.bx.psu.edu/repos/ebi-gxa/dropletutils_read_10x/dropletutils_read_10x/1.0.4+galaxy0) %} with the following parameters: +> - {% icon param-file %} *"Expression matrix in sparse matrix format (.mtx)"*: `Matrix table` +> - {% icon param-file %} *"Gene Table"*: `Annotated Gene Table` +> - {% icon param-file %} *"Barcode/cell table"*: `Barcode table` +> - *"Should metadata file be added?"*: `No` +> +> 2. Rename {% icon galaxy-pencil %} output: `SCE Object` +{: .hands_on} + +Fantastic! 
Now that our matrix is combined into an object, specifically the SingleCellExperiment format, we can now run emptyDrops! Let's get rid of those background droplets containing no cells! + From 040a5a474162abec20c10ba1c86f992a9bdf1e52 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 5 Nov 2023 15:32:42 +0000 Subject: [PATCH 02/46] separate to preamble --- .../tutorials/alevin-commandline/tutorial.md | 77 +------------------ 1 file changed, 3 insertions(+), 74 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 0ecf6245ea5047..8448e1db4b4a25 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -51,83 +51,12 @@ contributions: - notebook: - language: - snippet: + language: bash + snippet: topics/single-cell/tutorials/alevin-commandline/preamble.md --- -# Introduction -This tutorial is the part of [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using [Alevin]( https://salmon.readthedocs.io/en/latest/alevin.html) in bash command line. It is a replication of the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) and will guide you through the same steps that you followed in the previous tutorial and will give you more understanding of what is happening ‘behind the scenes’ or ‘inside the tools’ if you will. -We will work on the case study data from a mouse model of fetal growth restriction {% cite Bacon2018 %} (see [the study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and [the project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). -As a recap, we fill go from raw FASTQ files to a cell x gene data matrix in AnnData format. 
After completing the previous tutorial you should already know what is a data matrix and AnnData format. We will perform the following steps: -1. Getting the appropriate files -2. Making a transcript-to-gene ID mapping -3. - - -## Launching JupyterLab - -> Data uploads & JupyterLab -> There are a few ways of importing and uploading data into JupyterLab. You might find yourself accidentally doing this differently than the tutorial, and that's ok. There are a few key steps where you will call files from a location - if these don't work from you, check that the file location is correct and change accordingly! -{: .warning} - -> {% snippet faqs/galaxy/interactive_tools_jupyter_launch.md %} - -Welcome to JupyterLab! - -> Danger: You can lose data! -> Do NOT delete or close this notebook dataset in your history. YOU WILL LOSE IT! -{: .warning} - -## Open the notebook - -You have two options for how to proceed with this JupyterLab tutorial - you can run the tutorial from a pre-populated notebook, or you can copy and paste the code for each step into a fresh notebook and run it. The initial instructions for both options are below. - -> Option 1: Open the notebook directly in JupyterLab -> -> 1. Open a `Terminal` in JupyterLab with File -> New -> Terminal -> -> ![Screenshot of the Launcher tab with an arrow indicating where to find Terminal.](../../images/scrna-casestudy-monocle/terminal_choose.jpg "This is how the Launcher tab looks like and where you can find Terminal.") -> -> 2. Run -> ``` -> wget {{ ipynbpath }} -> ``` -> -> 3. Select the notebook that appears in the list of files on the left. -> -> -> Remember that you can also download this {% icon notebook %} [Jupyter Notebook]({{ ipynbpath }}) from the {% icon galaxy_instance %} Supporting Materials in the Overview box at the beginning of this tutorial. -{: .hands_on} - -> Option 2: Creating a notebook -> -> 1. 
Select the **Bash** icon under **Notebook** -> -> ![Bash icon](../../images/AAA "Bash Notebook Button") -> -> 2. Save your file (**File**: **Save**, or click the {% icon galaxy-save %} Save icon at the top left) -> -> 3. If you right click on the file in the folder window at the left, you can rename your file `whateveryoulike.ipynb` -> -{: .hands_on} - -> You should Save frequently! -> This is both for good practice and to protect you in case you accidentally close the browser. Your environment will still run, so it will contain the last saved notebook you have. You might eventually stop your environment after this tutorial, but ONLY once you have saved and exported your notebook (more on that at the end!) Note that you can have multiple notebooks going at the same time within this JupyterLab, so if you do, you will need to save and export each individual notebook. You can also download them at any time. -{: .warning} - -Let's crack on! - -> -> -> In this tutorial, we will cover: -> -> 1. TOC -> {:toc} -> -{: .agenda} - -# Generating a matrix +# Setting up the environment In this section, we will show you the principles of the initial phase of single-cell RNA-seq analysis: generating expression measures in a matrix. We'll concentrate on droplet-based (rather than plate-based) methodology, since this is the process with most differences with respect to conventional approaches developed for bulk RNA-seq. 
From ab94a93a2c96ae39a3cc9a80fce1005f6fb69eb5 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 5 Nov 2023 15:33:28 +0000 Subject: [PATCH 03/46] Create preamble.md --- .../tutorials/alevin-commandline/preamble.md | 99 +++++++++++++++++++ 1 file changed, 99 insertions(+) create mode 100644 topics/single-cell/tutorials/alevin-commandline/preamble.md diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md new file mode 100644 index 00000000000000..db9818a607ee67 --- /dev/null +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -0,0 +1,99 @@ +# Introduction + +This tutorial is part of the [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using [Alevin](https://salmon.readthedocs.io/en/latest/alevin.html) on the bash command line. It replicates the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}), guiding you through the same steps while giving you a better understanding of what is happening ‘behind the scenes’ or ‘inside the tools’, if you will. +We will work on the case study data from a mouse model of fetal growth restriction {% cite Bacon2018 %} (see [the study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and [the project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). +As a recap, we will go from raw FASTQ files to a cell x gene data matrix in AnnData format. After completing the previous tutorial you should already know what a data matrix and the AnnData format are. We will perform the following steps: +1. Getting the appropriate files +2. Making a transcript-to-gene ID mapping +3. Creating a Salmon index +4. Quantification of transcript expression using Alevin +5. 
(Quality control using Alevin) +6. Creating Summarized Experiment from the Alevin output +7. Adding metadata +8. Combining sample data + + +## Launching JupyterLab + +> Data uploads & JupyterLab +> There are a few ways of importing and uploading data into JupyterLab. You might find yourself accidentally doing this differently than the tutorial, and that's ok. There are a few key steps where you will call files from a location - if these don't work for you, check that the file location is correct and change accordingly! +{: .warning} + +> {% snippet faqs/galaxy/interactive_tools_jupyter_launch.md %} + +Welcome to JupyterLab! + +> Danger: You can lose data! +> Do NOT delete or close this notebook dataset in your history. YOU WILL LOSE IT! +{: .warning} + +## Open the notebook + +You have two options for how to proceed with this JupyterLab tutorial - you can run the tutorial from a pre-populated notebook, or you can copy and paste the code for each step into a fresh notebook and run it. The initial instructions for both options are below. + +> Option 1: Open the notebook directly in JupyterLab +> +> 1. Open a `Terminal` in JupyterLab with File -> New -> Terminal +> +> ![Screenshot of the Launcher tab with an arrow indicating where to find Terminal.](../../images/scrna-casestudy-monocle/terminal_choose.jpg "This is what the Launcher tab looks like and where you can find Terminal.") +> +> 2. Run +> ``` +> wget {{ ipynbpath }} +> ``` +> +> 3. Select the notebook that appears in the list of files on the left. +> +> +> Remember that you can also download this {% icon notebook %} [Jupyter Notebook]({{ ipynbpath }}) from the {% icon galaxy_instance %} Supporting Materials in the Overview box at the beginning of this tutorial. +{: .hands_on} + +> Option 2: Creating a notebook +> +> 1. Select the **Bash** icon under **Notebook** +> +> ![Bash icon](../../images/bash.png "Bash Notebook Button") +> +> 2. 
Save your file (**File**: **Save**, or click the {% icon galaxy-save %} Save icon at the top left) +> +> 3. If you right click on the file in the folder window at the left, you can rename your file `whateveryoulike.ipynb` +> +{: .hands_on} + +> You should Save frequently! +> This is both for good practice and to protect you in case you accidentally close the browser. Your environment will still run, so it will contain the last saved notebook you have. You might eventually stop your environment after this tutorial, but ONLY once you have saved and exported your notebook (more on that at the end!) Note that you can have multiple notebooks going at the same time within this JupyterLab, so if you do, you will need to save and export each individual notebook. You can also download them at any time. +{: .warning} + +Let's crack on! + +> +> +> In this tutorial, we will cover: +> +> 1. TOC +> {:toc} +> +{: .agenda} + +{% snippet topics/single-cell/faqs/notebook_warning.md %} + + +## Installation + +Before we start working on the tutorial notebook, we need to install required packages. + +>Installing the packages +> +> 1. Navigate back to the `Terminal` (see Option 1 in the box above) +> 2. In the Terminal tab open, write the following, one line at a time: +> ``` +>conda install -y -c bioconda bioconductor-tximeta +>``` +>``` +>conda install -y -c bioconda bioconductor-dropletutils +>``` +> +{: .hands_on} + + +Installation will take a while, so in the meantime, when it's running, you can open the notebook and follow the rest of this tutorial there! 
From 3413bb8948df4ff468b6c9f17bbc58bf271a3084 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 5 Nov 2023 17:52:27 +0000 Subject: [PATCH 04/46] set up, input data, map, filtered FASTA --- .../tutorials/alevin-commandline/tutorial.md | 119 +++++++++++++----- 1 file changed, 89 insertions(+), 30 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 8448e1db4b4a25..bda1cc96b181a9 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -58,57 +58,116 @@ notebook: # Setting up the environment -In this section, we will show you the principles of the initial phase of single-cell RNA-seq analysis: generating expression measures in a matrix. We'll concentrate on droplet-based (rather than plate-based) methodology, since this is the process with most differences with respect to conventional approaches developed for bulk RNA-seq. +Alevin is a tool integrated with the salmon software, so first we need to get salmon. You can install salmon using bioconda, but in this tutorial we will show an alternative method - downloading the pre-compiled binaries from the [releases page](https://github.com/COMBINE-lab/salmon/releases). -Droplet-based data consists of three components: cell barcodes, unique molecular identifiers (UMIs) and cDNA reads. To generate cell-wise quantifications we need to: +```bash +wget -nv https://github.com/COMBINE-lab/salmon/releases/download/v1.10.0/salmon-1.10.0_linux_x86_64.tar.gz +``` - * Process cell barcodes, working out which ones correspond to 'real' cells, which to sequencing artefacts, and possibly correct any barcodes likely to be the product of sequencing errors by comparison to more frequent sequences. - * Map biological sequences to the reference genome or transcriptome. - * 'De-duplicate' using the UMIs. 
+Once you've downloaded a specific binary (here we're using version 1.10.0), just extract it like so: -This used to be a complex process involving multiple algorithms, or was performed with technology-specific methods (such as 10X's 'Cellranger' tool) but is now much simpler thanks to the advent of a few new methods. When selecting methodology for your own work you should consider: +```bash +tar -xvzf salmon-1.10.0_linux_x86_64.tar.gz +``` - * [STARsolo](https://github.com/alexdobin/STAR) - a droplet-based scRNA-seq-specific variant of the popular genome alignment method STAR. Produces results very close to those of Cellranger (which itself uses STAR under the hood). - * [Kallisto/ bustools](https://www.kallistobus.tools/) - developed by the originators of the transcriptome quantification method, Kallisto. - * [Alevin](https://salmon.readthedocs.io/en/latest/alevin.html) - another transcriptome analysis method developed by the authors of the Salmon tool. We're going to use Alevin {% cite article-Alevin %} for demonstration purposes, but we do not endorse one method over another. ## Get Data -We've provided you with some example data to play with, a small subset of the reads in a mouse dataset of fetal growth restriction {% cite Bacon2018 %} (see the [study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and the [project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). This is a study using the Drop-seq chemistry, however this tutorial is almost identical to a 10x chemistry. We will point out the one tool parameter change you will need to run 10x samples. This data is not carefully curated, standard tutorial data - it's real, it's messy, it desperately needs filtering, it has background RNA running around, and most of all it will give you a chance to practice your analysis as if this data were yours. 
+We continue working on the same example data - a very small subset of the reads in a mouse dataset of fetal growth restriction {% cite Bacon2018 %} (see the [study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and the [project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). For the purposes of this tutorial, the datasets have been subsampled to only 50k reads (around 1% of the original files). There are two FASTQ files - one with the transcript reads and the other with the cell barcodes. You can download the files by running the code below: -Down-sampled reads and some associated annotation can be imported below. How did I downsample these FASTQ files? Check out [this history](https://humancellatlas.usegalaxy.eu/u/wendi.bacon.training/h/pre-processing-with-alevin---part-1---how-to-downsample) to find out! +```bash +wget -nv https://zenodo.org/ +wget -nv https://zenodo.org/ +``` -Additionally, to map your reads, you will need a transcriptome to align against (a FASTA) as well as the gene information for each transcript (a gtf) file. You can download these for your species of interest [from Ensembl](https://www.ensembl.org/info/data/ftp/index.html). These files are included in the data import step below. Keep in mind, these are big files, so the fastest way to get these into your Galaxy account is through importing them by history. + -## Generate a transcript to gene map - -Gene-level, rather than transcript-level, quantification is standard in scRNA-seq, which means that the expression level of alternatively spliced RNA molecules are combined to create gene-level values. Droplet-based scRNA-seq techniques only sample one end each transcript, so lack the full-molecule coverage that would be required to accurately quantify different transcript isoforms. 
- -To generate gene-level quantifications based on transcriptome quantification, Alevin and similar tools require a conversion between transcript and gene identifiers. We can derive a transcript-gene conversion from the gene annotations available in genome resources such as Ensembl. The transcripts in such a list need to match the ones we will use later to build a binary transcriptome index. If you were using spike-ins, you'd need to add these to the transcriptome and the transcript-gene mapping. - -In your example data you will see the murine reference annotation as retrieved from Ensembl in GTF format. This annotation contains gene, exon, transcript and all sorts of other information on the sequences. We will use these to generate the transcript/ gene mapping by passing that information to a tool that extracts just the transcript identifiers we need. - -> + > > -> Which of the 'attributes' in the last column of the GTF files contains the transcript and gene identifiers? -> -> -> > Hint -> > -> > The file is organised such that the last column (headed 'Group') contains a wealth of information in the format: attribute1 "information associated with attribute 1";attribute2 "information associated with attribute 2" etc. -> {: .tip} +> Test rendering > > > -> > *gene_id* and *transcript_id* are each followed by "ensembl gene_id" and "ensembl transcript_id" +> > +> > is it ok? +> > > {: .solution} +> {: .question} -It's now time to parse the GTF file using the [rtracklayer](https://bioconductor.org/packages/release/bioc/html/rtracklayer.html) package in R. This parsing will give us a conversion table with a list of transcript identifiers and their corresponding gene identifiers for counting. Additionally, because we will be generating our own binary index (more later!), we also need to input our FASTA so that it can be filtered to only contain transcriptome information found in the GTF. 
+ + +Additionally, to map your reads, you will need a transcriptome to align against (a FASTA) as well as the gene information for each transcript (a gtf) file. These files are included in the data import step below. + +```bash +wget -nv https://zenodo.org/ +wget -nv https://zenodo.org/ +``` + + + +You can also download these for your species of interest [from Ensembl](https://www.ensembl.org/info/data/ftp/index.html). Once you find the cDNA FASTA file you are interested in, right click on the link and choose "Copy link address" and paste it along the command `wget -nv`, then extract it using `tar`. Here is the example how to do it: + +```bash +# Getting FASTA file +wget -nv https://ftp.ensembl.org/pub/release-110/fasta/mus_musculus/cdna/ +tar +``` +Do exactly the same to get the GTF file: + +```bash +# Getting GTF file +wget -nv https://ftp.ensembl.org/pub/release-110/gtf/mus_musculus +tar +``` + + + +Why do we need FASTA and GTF files? +To generate gene-level quantifications based on transcriptome quantification, Alevin and similar tools require a conversion between transcript and gene identifiers. We can derive a transcript-gene conversion from the gene annotations available in genome resources such as Ensembl. The transcripts in such a list need to match the ones we will use later to build a binary transcriptome index. If you were using spike-ins, you'd need to add these to the transcriptome and the transcript-gene mapping. + +We will use the murine reference annotation as retrieved from Ensembl in GTF format. This annotation contains gene, exon, transcript and all sorts of other information on the sequences. We will use these to generate the transcript-gene mapping by passing that information to a tool that extracts just the transcript identifiers we need. + + +## Generate a transcript to gene map and filtered FASTA + +You can have a look at the Terminal tab again. Has the package `atlas-gene-annotation-manipulation` been installed yet? 
If yes, you can execute the code cell below and while it's running, I'll explain all the parameters we set here. + +```bash +gtf2featureAnnotation.R -g gtf.gff -c fasta.fasta -d "transcript_id" -t "transcript" -f "transcript_id" -o map_code -l "transcript_id,gene_id" -r -e filtered_fasta_code +``` + +In essence, the [gtf2featureAnnotation.R script](https://github.com/ebi-gene-expression-group/atlas-gene-annotation-manipulation) takes a GTF annotation file and creates a table of annotation by feature, optionally filtering a cDNA file supplied at the same time. The first parameter, `-g`, stands for "gtf-file" and requires a path to a valid GTF file. Then `-c` takes a cDNA file for extracting meta info and/or filtering - that's our FASTA! Where `--parse-cdnas` (that's our `-c`) is given, we need to specify, using `-d`, which field should be compared to identifiers from the FASTA. We set that to "transcript_id" - feel free to inspect the GTF file to explore other attributes. We pass the same value in `-f`, meaning first-field, i.e. the name of the field to place first in the output table. To specify which other fields to retain in the output table, we provide a comma-separated list of those fields, and since we're only interested in the transcript-to-gene map, we put those two names ("transcript_id,gene_id") into `-l`. `-t` stands for the feature type to use, and in our case we're using "transcript". Guess what `-o` is! Indeed, that's the output annotation table - here we specify the file path of our transcript-to-gene map. We will also have another output denoted by `-e`, and that's the path to a filtered FASTA. Finally, we also pass `-r`, which is there only to suppress the header in the output. To summarise, the output will be an annotation table and a FASTA-format cDNA file with unannotated transcripts removed. + +Why filtered FASTA? +Sometimes it's important that there are no transcripts in a FASTA-format transcriptome that cannot be matched to a transcript/gene mapping. 
Salmon, for example, used to produce errors when this mismatch was present. We can synchronise the cDNA file by removing mismatches as we have done above. + ## Generate a transcriptome index & quantify! +> Process stopping +> +> +{: .warning} + +This is a study using the Drop-seq chemistry, however this tutorial is almost identical to a 10x chemistry. We will point out the one tool parameter change you will need to run 10x samples. This data is not carefully curated, standard tutorial data - it's real, it's messy, it desperately needs filtering, it has background RNA running around, and most of all it will give you a chance to practice your analysis as if this data were yours. + + Alevin collapses the steps involved in dealing with dscRNA-seq into a single process. Such tools need to compare the sequences in your sample to a reference containing all the likely transcript sequences (a 'transcriptome'). This will contain the biological transcript sequences known for a given species, and perhaps also technical sequences such as 'spike ins' if you have those. > How does Alevin work? 
From 8eb76231cc47a1c07fcd0e3cf1590f64eacf0fcd Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 5 Nov 2023 17:53:33 +0000 Subject: [PATCH 05/46] add gene manipulation package --- topics/single-cell/tutorials/alevin-commandline/preamble.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index db9818a607ee67..0202542a252820 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -92,6 +92,9 @@ Before we start working on the tutorial notebook, we need to install required pa >``` >conda install -y -c bioconda bioconductor-dropletutils >``` +>``` +>conda install -y -c bioconda atlas-gene-annotation-manipulation +>``` > {: .hands_on} From a187b39f5fb67c87f1fbc3bac92235459b73716c Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 5 Nov 2023 23:06:55 +0000 Subject: [PATCH 06/46] salmon index --- .../tutorials/alevin-commandline/tutorial.md | 23 ++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index bda1cc96b181a9..9d639a7d09f3ee 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -158,7 +158,28 @@ Why filtered FASTA? Sometimes it's important that there are no transcripts in a FASTA-format transcriptome that cannot be matched to a transcript/gene mapping. Salmon, for example, used to produce errors when this mismatch was present. We can synchronise the cDNA file by removing mismatches as we have done above. -## Generate a transcriptome index & quantify! 
+## Generate a transcriptome index + +We will use Salmon in mapping-based mode, so first we have to build a salmon index for our transcriptome. We will run the salmon indexer like so: + +```bash +salmon-latest_linux_x86_64/bin/salmon index -t filtered_fasta_code -i salmon_index_code -k 31 +``` + +Where `-t` stands for our filtered FASTA file, and `-i` is the output path for the mapping-based index. To build it, the function uses an auxiliary k-mer hash over k-mers of length 31. While the mapping algorithms will make use of arbitrarily long matches between the query and reference, the k size selected here will act as the minimum acceptable length for a valid match. Thus, a smaller value of k may slightly improve sensitivity. We find that a k of 31 seems to work well for reads of 75bp or longer, but you might consider a smaller k if you plan to deal with shorter reads. Also, a shorter value of k may improve sensitivity even more when using selective alignment (enabled via the `--validateMappings` flag). So, if you are seeing a smaller mapping rate than you might expect, consider building the index with a slightly smaller k. + + + + + + +## Use Alevin + + > Process stopping > From 7cb3f0795fd4503c0880c12aeb4f1d03230a0eba Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Tue, 7 Nov 2023 08:23:43 +0000 Subject: [PATCH 07/46] alevin added --- .../tutorials/alevin-commandline/tutorial.md | 50 ++++++++++++++++++- 1 file changed, 48 insertions(+), 2 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 9d639a7d09f3ee..f46b66de680b11 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -174,18 +174,64 @@ reference salmon ## Use Alevin +Time to use Alevin now! 
Alevin works under the same indexing scheme (as salmon) for the reference, and consumes the set of FASTA/Q files(s) containing the Cellular Barcode(CB) + Unique Molecule identifier (UMI) in one read file and the read sequence in the other. Given just the transcriptome and the raw read files, alevin generates a cell-by-gene count matrix (in a fraction of the time compared to other tools). +> +> +> How does Alevin work in detail? +> +> > +> > +> > Alevin works in two phases. In the first phase it quickly parses the read file containing the CB and UMI information to generate the frequency distribution of all the observed CBs, and creates a lightweight data-structure for fast-look up and correction of the CB. In the second round, alevin utilizes the read-sequences contained in the files to map the reads to the transcriptome, identify potential PCR/sequencing errors in the UMIs, and performs hybrid de-duplication while accounting for UMI collisions. Finally, a post-abundance estimation CB whitelisting procedure is done and a cell-by-gene count matrix is generated. +> > +> {: .solution} +> +{: .question} + +All the required input parameters are described in [the documentation](https://salmon.readthedocs.io/en/latest/alevin.html), but for the ease of use, they are presented below as well: + +- `-l`: library type (same as salmon), we recommend using ISR for both Drop-seq and 10x-v2 chemistry. + +- `-1`: CB+UMI file(s), alevin requires the path to the FASTQ file containing CB+UMI raw sequences to be given under this command line flag. Alevin also supports parsing of data from multiple files as long as the order is the same as in -2 flag. + +- `-2`: Read-sequence file(s), alevin requires the path to the FASTQ file containing raw read-sequences to be given under this command line flag. Alevin also supports parsing of data from multiple files as long as the order is the same as in -1 flag. 
+ +- `--dropseq` / `--chromium` / `--chromiumV3`: the protocol, this flag tells the type of single-cell protocol of the input sequencing-library. + +- `-i`: index, file containing the salmon index of the reference transcriptome, as generated by salmon index command. + +- `-p`: number of threads, the number of threads which can be used by alevin to perform the quantification, by default alevin utilizes all the available threads in the system, although we recommend using ~10 threads which in our testing gave the best memory-time trade-off. + +- `-o`: output, path to folder where the output gene-count matrix (along with other meta-data) would be dumped. + +- `--tgMap`: transcript to gene map file, a tsv (tab-separated) file — with no header, containing two columns mapping of each transcript present in the reference to the corresponding gene (the first column is a transcript and the second is the corresponding gene). + +- `--freqThreshold` + +- `--keepCBFraction` + +- `--dumpFeatures` + +We have also added some additional parameters, and their values are derived from the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) after QC. + +Once all the above requirement are satisfied, Alevin can be run using the following command: + +```bash +salmon-latest_linux_x86_64/bin/salmon alevin -l ISR -1 barcodes_701.fastq -2 transcript_701.fastq --dropseq -i salmon_index_code -p 10 -o alevin_output_code --tgMap map_code --freqThreshold 3 --keepCBFraction 1 --dumpFeatures +``` > Process stopping > -> +> The command above will display the log of the process and will say "Analyzed X cells (Y% of all)". For some reason, running Alevin may sometimes cause problems in Jupyter Notebook and this process will stop and not go to completion. This is the reason why we use hugely subsampled dataset here - bigger ones couldn't be fully analysed (they worked fine locally though). 
The dataset used in this tutorial shouldn't make any issues when you're using Jupyter notebook through galaxy.eu, however might not work properly on galaxy.org. If you're accessing Jupyter notebook via galaxy.eu and alevin process stopped, just restart the kernel and that should help. +> {: .warning} + This is a study using the Drop-seq chemistry, however this tutorial is almost identical to a 10x chemistry. We will point out the one tool parameter change you will need to run 10x samples. This data is not carefully curated, standard tutorial data - it's real, it's messy, it desperately needs filtering, it has background RNA running around, and most of all it will give you a chance to practice your analysis as if this data were yours. From b58514f554c7b29dc9739c6f2240bd6e4aa96361 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Wed, 8 Nov 2023 16:15:56 +0000 Subject: [PATCH 08/46] details on alevin flags --- .../tutorials/alevin-commandline/tutorial.md | 38 +++++++++---------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index f46b66de680b11..cfe443be75aadd 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -168,11 +168,17 @@ salmon-latest_linux_x86_64/bin/salmon index -t filtered_fasta_code -i salmon_ind Where `-t` stands for our filtered FASTA file, and `-i` is the output path for the mapping-based index. To build it, the function uses an auxiliary k-mer hash over k-mers of length 31. While the mapping algorithms will make use of arbitrarily long matches between the query and reference, the k size selected here will act as the minimum acceptable length for a valid match. Thus, a smaller value of k may slightly improve sensitivity. 
We find that a k of 31 seems to work well for reads of 75bp or longer, but you might consider a smaller k if you plan to deal with shorter reads. Also, a shorter value of k may improve sensitivity even more when using selective alignment (enabled via the `--validateMappings` flag). So, if you are seeing a smaller mapping rate than you might expect, consider building the index with a slightly smaller k. + +> What is the index? +> +> To be able to search a transcriptome quickly, salmon needs to convert the text (FASTA) format sequences into something it can search quickly, called an 'index'. The index is in a binary rather than human-readable format, but allows fast lookup by Alevin. Because the types of biological and technical sequences we need to include in the index can vary between experiments, and because we often want to use the most up-to-date reference sequences from Ensembl or NCBI, we can end up re-making the indices quite often. +> +{: .details} + - @@ -197,27 +203,27 @@ All the required input parameters are described in [the documentation](https://s - `-l`: library type (same as salmon), we recommend using ISR for both Drop-seq and 10x-v2 chemistry. -- `-1`: CB+UMI file(s), alevin requires the path to the FASTQ file containing CB+UMI raw sequences to be given under this command line flag. Alevin also supports parsing of data from multiple files as long as the order is the same as in -2 flag. +- `-1`: CB+UMI file(s), alevin requires the path to the FASTQ file containing CB+UMI raw sequences to be given under this command line flag. Alevin also supports parsing of data from multiple files as long as the order is the same as in -2 flag. That's our barcodes_701.fastq file. -- `-2`: Read-sequence file(s), alevin requires the path to the FASTQ file containing raw read-sequences to be given under this command line flag. Alevin also supports parsing of data from multiple files as long as the order is the same as in -1 flag. 
+- `-2`: Read-sequence file(s), alevin requires the path to the FASTQ file containing raw read-sequences to be given under this command line flag. Alevin also supports parsing of data from multiple files as long as the order is the same as in -1 flag. That's our transcript_701.fastq file. -- `--dropseq` / `--chromium` / `--chromiumV3`: the protocol, this flag tells the type of single-cell protocol of the input sequencing-library. +- `--dropseq` / `--chromium` / `--chromiumV3`: the protocol, this flag tells the type of single-cell protocol of the input sequencing-library. This is a study using the Drop-seq chemistry, so we specify that in the flag. - `-i`: index, file containing the salmon index of the reference transcriptome, as generated by salmon index command. - `-p`: number of threads, the number of threads which can be used by alevin to perform the quantification, by default alevin utilizes all the available threads in the system, although we recommend using ~10 threads which in our testing gave the best memory-time trade-off. -- `-o`: output, path to folder where the output gene-count matrix (along with other meta-data) would be dumped. +- `-o`: output, path to folder where the output gene-count matrix (along with other meta-data) would be dumped. We simply call it alevin_output_code -- `--tgMap`: transcript to gene map file, a tsv (tab-separated) file — with no header, containing two columns mapping of each transcript present in the reference to the corresponding gene (the first column is a transcript and the second is the corresponding gene). +- `--tgMap`: transcript to gene map file, a tsv (tab-separated) file — with no header, containing two columns mapping of each transcript present in the reference to the corresponding gene (the first column is a transcript and the second is the corresponding gene). In our case, that's map_code generated by using gtf2featureAnnotation.R function. 
-- `--freqThreshold` +- `--freqThreshold` - minimum frequency for a barcode to be considered. We've chosen 3 as this will only remove cell barcodes with a frequency of less than 3, a low bar to pass but useful way of avoiding processing a bunch of almost certainly empty barcodes. -- `--keepCBFraction` +- `--keepCBFraction` - fraction of cellular barcodes to keep. We're using 1 to quantify all! -- `--dumpFeatures` +- `--dumpFeatures` - if activated, alevin dumps all the features used by the CB classification and their counts at each cell level. It’s generally used in pair with other command line flags. -We have also added some additional parameters, and their values are derived from the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) after QC. +We have also added some additional parameters (`--freqThreshold`, `--keepCBFraction`) and their values are derived from the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) after QC. Once all the above requirement are satisfied, Alevin can be run using the following command: @@ -232,16 +238,10 @@ salmon-latest_linux_x86_64/bin/salmon alevin -l ISR -1 barcodes_701.fastq -2 tra {: .warning} -This is a study using the Drop-seq chemistry, however this tutorial is almost identical to a 10x chemistry. We will point out the one tool parameter change you will need to run 10x samples. This data is not carefully curated, standard tutorial data - it's real, it's messy, it desperately needs filtering, it has background RNA running around, and most of all it will give you a chance to practice your analysis as if this data were yours. - - -Alevin collapses the steps involved in dealing with dscRNA-seq into a single process. Such tools need to compare the sequences in your sample to a reference containing all the likely transcript sequences (a 'transcriptome'). 
This will contain the biological transcript sequences known for a given species, and perhaps also technical sequences such as 'spike ins' if you have those. + -> How does Alevin work? -> -> To be able to search a transcriptome quickly, Alevin needs to convert the text (FASTA) format sequences into something it can search quickly, called an 'index'. The index is in a binary rather than human-readable format, but allows fast lookup by Alevin. Because the types of biological and technical sequences we need to include in the index can vary between experiments, and because we often want to use the most up-to-date reference sequences from Ensembl or NCBI, we can end up re-making the indices quite often. Making these indices is time-consuming! Have a look at the uncompressed FASTA to see what it starts with. -> -{: .details} We now have: From a8b2d99116c24ad18ec54c053c07f9e6d959b8b3 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Wed, 8 Nov 2023 16:39:17 +0000 Subject: [PATCH 09/46] remove stuff --- .../tutorials/alevin-commandline/tutorial.md | 49 ++----------------- 1 file changed, 5 insertions(+), 44 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index cfe443be75aadd..ab941b98d5c882 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -231,6 +231,10 @@ Once all the above requirement are satisfied, Alevin can be run using the follow salmon-latest_linux_x86_64/bin/salmon alevin -l ISR -1 barcodes_701.fastq -2 transcript_701.fastq --dropseq -i salmon_index_code -p 10 -o alevin_output_code --tgMap map_code --freqThreshold 3 --keepCBFraction 1 --dumpFeatures ``` + +This tool will take a while to run. Alevin produces many file outputs, not all of which we'll use. 
You can refer to the [Alevin documentation](https://salmon.readthedocs.io/en/latest/alevin.html) if you're curious what they all are, you can look through all the different files to find different parameters such as the mapping rate, but we'll just pass the whole output folder directory for downstream analysis. + + > Process stopping > > The command above will display the log of the process and will say "Analyzed X cells (Y% of all)". For some reason, running Alevin may sometimes cause problems in Jupyter Notebook and this process will stop and not go to completion. This is the reason why we use hugely subsampled dataset here - bigger ones couldn't be fully analysed (they worked fine locally though). The dataset used in this tutorial shouldn't make any issues when you're using Jupyter notebook through galaxy.eu, however might not work properly on galaxy.org. If you're accessing Jupyter notebook via galaxy.eu and alevin process stopped, just restart the kernel and that should help. @@ -239,53 +243,10 @@ salmon-latest_linux_x86_64/bin/salmon alevin -l ISR -1 barcodes_701.fastq -2 tra -We now have: - -* Barcode/ UMI reads -* cDNA reads -* transcript/ gene mapping -* filtered FASTA - -We can now run Alevin. In some public instances, Alevin won't show up if you search for it. Instead, you may have to click the Single Cell tab at the left and scroll down to the Alevin tool. Alternatively, use Tutorial Mode as described above and you'll easily navigate to all the tools, and their versions will all be the tried and tested ones of this tutorial. It's often a good idea to check your tool versions. To identify which version of a tool you are using, select {% icon tool-versions %} 'Versions' and choose the appropriate version. In this case the tutorial was built with Alevin Galaxy Version 1.9.0+galaxy2. - -> What if I'm running a 10x sample? -> -> The main parameter that needs changing for a 10X Chromium sample is the 'Protocol' parameter of Alevin. 
Just select the correct 10x Chemistry there instead. -{: .comment} - -> Alevin file names -> -> You will notice that the names of the output files of Alevin are written in a certain convention, mentioning which tool was used and on which files, for example: *"Alevin on data X, data Y, and others: whitelist"*. Remember that you can always rename the files if you wish! For simplicity, when we refer to those files in the tutorial, we skip the information about tool and only use the second part of the name - in this case it would be simply *"whitelist"*. -{: .comment} - -This tool will take a while to run. Alevin produces many file outputs, not all of which we'll use. You can refer to the [Alevin documentation](https://salmon.readthedocs.io/en/latest/alevin.html) if you're curious what they all are, but we're most interested in is: - -* the matrix itself (*per-cell gene-count matrix (MTX)* - the count by gene and cell) -* the row (cell/ barcode) identifiers (*row index (CB-ids)*) and -* the column (gene) labels (*column headers (gene-ids)*). - - -> -> -> After you've run Alevin, {% icon galaxy-eye %} look through all the different files. Can you find: -> 1. The Mapping Rate? -> 2. How many cells are present in the matrix output? -> -> > -> > -> > 1. Inspect {% icon galaxy-eye %} the file {% icon param-file %} *Salmon log file*. You can see the mapping rate is a paltry `25.45%`. This is a terrible mapping rate. Why might this be? Remember this was downsampled, and specifically by taking only the last 400,000 reads of the FASTQ file. The overall mapping rate of the file is more like 50%, which is still quite poor, but for early Drop-Seq samples and single-cell data in general, you might expect a slightly poorer mapping rate. 10x samples are much better these days! This is real data, not test data, after all! -> > 2. Inspect {% icon galaxy-eye %} the file {% icon param-file %} *row index (CB-ids)*, and you can see it has `2163` lines. 
The rows refer to the cells in the cell x gene matrix. According to this (rough) estimate, your sample has 2163 cells in it! -> > -> {: .solution} -> -{: .question} - -{% icon congratulations %} Congratulations - you've made an expression matrix! We could almost stop here. But it's sensible to do some basic QC, and one of the things we can do is look at a barcode rank plot. - # Basic QC The question we're looking to answer here, is: "do we mostly have a single cell per droplet"? That's what experimenters are normally aiming for, but it's not entirely straightforward to get exactly one cell per droplet. Sometimes almost no cells make it into droplets, other times we have too many cells in each droplet. At a minimum, we should easily be able to distinguish droplets with cells from those without. From 16fc9a1fec4d8dcae686466f21f543ca634a9d4d Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 11 Nov 2023 12:05:51 +0000 Subject: [PATCH 10/46] start gene metadata --- .../tutorials/alevin-commandline/tutorial.md | 71 +++++++++++-------- 1 file changed, 40 insertions(+), 31 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index ab941b98d5c882..4149f466b8279c 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -73,7 +73,7 @@ tar -xvzf salmon-1.10.0_linux_x86_64.tar.gz We're going to use Alevin {% cite article-Alevin %} for demonstration purposes, but we do not endorse one method over another. 
-## Get Data +# Get Data We continue working on the same example data - a very small subset of the reads in a mouse dataset of fetal growth restriction {% cite Bacon2018 %} (see the [study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and the [project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). For the purposes of this tutorial, the datasets have been subsampled to only 50k reads (around 1% of the original files). Those are two fastq files - one with transcripts and the onther one with cell barcodes. You can download the files by running the code below: @@ -144,7 +144,7 @@ To generate gene-level quantifications based on transcriptome quantification, Al We will use the murine reference annotation as retrieved from Ensembl in GTF format. This annotation contains gene, exon, transcript and all sorts of other information on the sequences. We will use these to generate the transcript-gene mapping by passing that information to a tool that extracts just the transcript identifiers we need. -## Generate a transcript to gene map and filtered FASTA +# Generate a transcript to gene map and filtered FASTA You can have a look at the Terminal tab again. Has the package `atlas-gene-annotation-manipulation` been installed yet? If yes, you can execute the code cell below and while it's running, I'll explain all the parameters we set here. @@ -158,7 +158,7 @@ Why filtered FASTA? Sometimes it's important that there are no transcripts in a FASTA-format transcriptome that cannot be matched to a transcript/gene mapping. Salmon, for example, used to produce errors when this mismatch was present. We can synchronise the cDNA file by removing mismatches as we have done above. -## Generate a transcriptome index +# Generate a transcriptome index We will use Salmon in mapping-based mode, so first we have to build a salmon index for our transcriptome. 
We will run the salmon indexer as so: @@ -183,7 +183,7 @@ reference salmon check if we need decoy --decoys decoys.txt --> -## Use Alevin +# Use Alevin Time to use Alevin now! Alevin works under the same indexing scheme (as salmon) for the reference, and consumes the set of FASTA/Q files(s) containing the Cellular Barcode(CB) + Unique Molecule identifier (UMI) in one read file and the read sequence in the other. Given just the transcriptome and the raw read files, alevin generates a cell-by-gene count matrix (in a fraction of the time compared to other tools). @@ -223,7 +223,7 @@ All the required input parameters are described in [the documentation](https://s - `--dumpFeatures` - if activated, alevin dumps all the features used by the CB classification and their counts at each cell level. It’s generally used in pair with other command line flags. -We have also added some additional parameters (`--freqThreshold`, `--keepCBFraction`) and their values are derived from the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) after QC. +We have also added some additional parameters (`--freqThreshold`, `--keepCBFraction`) and their values are derived from the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) after QC to stop Alevin from applying its own thresholds. Once all the above requirement are satisfied, Alevin can be run using the following command: @@ -232,7 +232,7 @@ salmon-latest_linux_x86_64/bin/salmon alevin -l ISR -1 barcodes_701.fastq -2 tra ``` -This tool will take a while to run. Alevin produces many file outputs, not all of which we'll use. You can refer to the [Alevin documentation](https://salmon.readthedocs.io/en/latest/alevin.html) if you're curious what they all are, you can look through all the different files to find different parameters such as the mapping rate, but we'll just pass the whole output folder directory for downstream analysis. 
+This tool will take a while to run. Alevin produces many file outputs, not all of which we'll use. You can refer to the [Alevin documentation](https://salmon.readthedocs.io/en/latest/alevin.html) if you're curious what they all are, you can look through all the different files to find information such as the mapping rate, but we'll just pass the whole output folder directory for downstream analysis. > Process stopping @@ -246,45 +246,54 @@ This tool will take a while to run. Alevin produces many file outputs, not all o check if we can get alevinQC to work - paste the info from the other tutorial? --> +# Alevin output to SummarizedExperiment -# Basic QC +Let's change gear a little bit. We've done the work in bash, and now we're switching to R to complete the processing. To do so, you have to change Kernel to R (either click on `Kernel` -> `Change Kernel...` in the upper left corner of your JupyterLab or click on the displayed current kernel in the upper right corner and change it). +![Figure showing the JupyterLab interface with an arrow pointing to the left corner, showing the option `Kernel` -> `Change Kernel...` and another arrow pointing to the right corner, showing the icon of the current kernel. The pop-up window asks which kernel should be chosen instead.](../../images//switch_kernel.jpg "Two ways of switching kernel.") -The question we're looking to answer here, is: "do we mostly have a single cell per droplet"? That's what experimenters are normally aiming for, but it's not entirely straightforward to get exactly one cell per droplet. Sometimes almost no cells make it into droplets, other times we have too many cells in each droplet. At a minimum, we should easily be able to distinguish droplets with cells from those without. +Now load the library that we have previously installed in terminal: -Now, the image generated here (400k) isn't the most informative - but you are dealing with a fraction of the reads! 
If you run the total sample (so identical steps above, but with significantly more time!) you'd get the image below. +```R +library(tximeta) +``` + +The [tximeta package](https://bioconductor.org/packages/devel/bioc/vignettes/tximeta/inst/doc/tximeta.html) REF (Love et al. 2020) is used for import of transcript-level quantification data into R/Bioconductor and requires that the entire output of alevin is present and unmodified. -![raw droplet barcode plots-total](../../images/scrna-casestudy/wab-raw_barcodes-total.png "Total sample - 32,579,453 reads - raw") +First, let's specify the path to the quants_mat.gz file: -This is our own formulation of the barcode plot based on a [discussion](https://github.com/COMBINE-lab/salmon/issues/362#issuecomment-490160480) we had with community members. The left hand plots with the smooth lines are the main plots, showing the UMI counts for individual cell barcodes ranked from high to low. We expect a sharp drop-off between cell-containing droplets and ones that are empty or contain only cell debris. Now, this data is not an ideal dataset, so for perspective, in an ideal world with a very clean 10x run, data will look a bit more like the following taken from the lung atlas (see the [study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6653/results/tsne) and the [project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6653/)). +```R +path <- 'alevin_output/alevin/quants_mat.gz' +``` +We will specify the following arguments when running *tximeta*: +- 'coldata' a data.frame with at least two columns: + - files - character, paths of quantification files + - names - character, sample names +- 'type' - what quantifier was used (can be 'salmon', 'alevin', etc.) -![raw droplet barcode plots - lung atlas](../../images/scrna-casestudy/wab-lung-atlas-barcodes-raw.png "Pretty data - raw") +With that we can create a dataframe and pass it to tximeta to create SummarizedExperiment object.
-In that plot, you can see the clearer 'knee' bend, showing the cut-off between empty droplets and cell-containing droplets. +```R +coldata <- data.frame(files = path, names="sample701") +alevin_se <- tximeta(coldata, type = "alevin") +``` -The right hand plots are density plots from the first one, and the thresholds are generated either using [dropletUtils](https://bioconductor.org/packages/release/bioc/html/DropletUtils.html) or by the method described in the discussion mentioned above. We could use any of these thresholds to select cells, assuming that anything with fewer counts is not a valid cell. By default, Alevin does something similar, and we can learn something about that by plotting just the barcodes Alevin retains. +Inspect the created object: +```R +alevin_se +``` -In experiments with relatively simple characteristics, this 'knee detection' method works relatively well. But some populations (such as our sample!) present difficulties. For instance, sub-populations of small cells may not be distinguished from empty droplets based purely on counts by barcode. Some libraries produce multiple 'knees' for multiple sub-populations. The [emptyDrops](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1662-y) method has become a popular way of dealing with this. emptyDrops still retains barcodes with very high counts, but also adds in barcodes that can be statistically distinguished from the ambient profiles, even if total counts are similar. In order to ultimately run emptyDrops (or indeed, whatever tool you like that accomplishes biologically relevant thresholding), we first need to re-run Alevin, but prevent it from applying its own less than ideal thresholds. +As you can see, *rowData names* and *colData names* are still empty. Let's add some metadata! -To use emptyDrops effectively, we need to go back and re-run Alevin, stopping it from applying it's own thresholds. 
Click the re-run icon {% icon galaxy-refresh %} on any Alevin output in your history, because almost every parameter is the same as before, except you need to change the following: +# Adding in metadata -Alevin outputs MTX format, which we can pass to the dropletUtils package and run emptyDrops. Unfortunately the matrix is in the wrong orientation for tools expecting files like those produced by 10X software (which dropletUtils does). We need to 'transform' the matrix such that cells are in columns and genes are in rows. +## Gene metadata -Alevin outputs MTX format, which we can pass to the dropletUtils package and run emptyDrops. Unfortunately the matrix is in the wrong orientation for tools expecting files like those produced by 10X software (which dropletUtils does). We need to 'transform' the matrix such that cells are in columns and genes are in rows. +As you saw above, the genes IDs are stored in *rownames*. Let's exctract them into a separate object: +```R +gene_ID <- rownames(alevin_se) +``` -> Generate gene information -> -> 1. {% tool [GTF2GeneList](toolshed.g2.bx.psu.edu/repos/ebi-gxa/gtf2gene_list/_ensembl_gtf2gene_list/1.52.0+galaxy0) %} with the following parameters: -> - *"Feature type for which to derive annotation"*: `gene` -> - *"Field to place first in output table"*: `gene_id` -> - *"Suppress header line in output?"*: `Yes` -> - *"Comma-separated list of field names to extract from the GTF (default: use all fields)"*: `gene_id,gene_name,mito` -> - *"Append version to transcript identifiers?"*: `Yes` -> - *"Flag mitochondrial features?"*: `Yes` - note, this will auto-fill a bunch of acronyms for searching in the GTF for mitochondrial associated genes. This is good! -> - *"Filter a FASTA-format cDNA file to match annotations?"*: `No` - we don't need to, we're done with the FASTA! -> 2. Check that the output file type is `tabular`. 
If not, change the file type by clicking the 'Edit attributes'{% icon galaxy-pencil %} on the dataset in the history (as if you were renaming the file.) Then click `Datatypes` and type in `tabular`. Click `Change datatype`.) -> 2. Rename {% icon galaxy-pencil %} the annotation table to `Gene Information` -{: .hands_on} Inspect {% icon galaxy-eye %} the **Gene Information** object in the history. Now you have made a new key for gene_id, with gene name and a column of mitochondrial information (false = not mitochondrial, true = mitochondrial). We need to add this information into the salmonKallistoMtxTo10x output 'Gene table'. But we need to keep 'Gene table' in the same order, since it is referenced in the 'Matrix table' by row. From 973f835ce948344e7bd1481b64764db98a4f6207 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 11 Nov 2023 23:08:14 +0000 Subject: [PATCH 11/46] emptyDrops, metadata --- .../tutorials/alevin-commandline/tutorial.md | 162 +++++++++++++++++- 1 file changed, 154 insertions(+), 8 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 4149f466b8279c..0d2b294d4a5bb7 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -253,7 +253,7 @@ Let's change gear a little bit. 
We've done the work in bash, and now we're switc Now load the library that we have previously installed in terminal: -```R +```r library(tximeta) ``` @@ -261,7 +261,7 @@ The [tximeta package](https://bioconductor.org/packages/devel/bioc/vignettes/txi First, let's specify the path to the quants_mat.gz file: -```R +```r path <- 'alevin_output/alevin/quants_mat.gz' ``` We will specify the following arguments when running *tximeta*: @@ -272,28 +272,174 @@ We will specify the following arguments when running *tximeta*: With that we can create a dataframe and pass it to tximeta to create SummarizedExperiment object. -```R +```r coldata <- data.frame(files = path, names="sample701") alevin_se <- tximeta(coldata, type = "alevin") ``` Inspect the created object: -```R +```r alevin_se ``` -As you can see, *rowData names* and *colData names* are still empty. Let's add some metadata! +As you can see, *rowData names* and *colData names* are still empty. Before we add some metadata, we will first identify barcodes that correspond to non-empty droplets. -# Adding in metadata +# Identify barcodes that correspond to non-empty droplets -## Gene metadata +Some sub-populations of small cells may not be distinguished from empty droplets based purely on counts by barcode. Some libraries produce multiple ‘knees’ (see the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %})) for multiple sub-populations. The [emptyDrops](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1662-y) method has become a popular way of dealing with this. emptyDrops still retains barcodes with very high counts, but also adds in barcodes that can be statistically distinguished from the ambient profiles, even if total counts are similar. + +```r +library(DropletUtils) # load the library and required packages +``` + +emptyDrops takes multiple arguments that you can read about in the [documentation](https://rdrr.io/github/MarioniLab/DropletUtils/man/emptyDrops.html).
However, in this case, we will only specify the following arguments: + +- `m` - A numeric matrix-like object - usually a dgTMatrix or dgCMatrix - containing droplet data prior to any filtering or cell calling. Columns represent barcoded droplets, rows represent genes. +- `lower` - A numeric scalar specifying the lower bound on the total UMI count, at or below which all barcodes are assumed to correspond to empty droplets. +- `niters` - An integer scalar specifying the number of iterations to use for the Monte Carlo p-value calculations. +- `retain` - A numeric scalar specifying the threshold for the total UMI count above which all barcodes are assumed to contain cells. + +Let's then extract the matrix from our `alevin_se` object. It's stored in *assays* -> *counts*. + +```r +matrix_alevin <- assays(alevin_se)$counts +``` + +And now run emptyDrops: +```r +# Identify likely cell-containing droplets +out <- emptyDrops(matrix_alevin, lower = 100, niters = 1000, retain = 20) +out +``` + + +False discovery rate - ??? +```r +is.cell <- out$FDR <= 0.01 +sum(is.cell, na.rm=TRUE) +``` + +We got rid of the background droplets containing no cells, so now we will filter the matrix that we passed on to emptyDrops, so that it corresponds to the remaining cells. + +```r +emptied_matrix <- matrix_alevin[,which(is.cell),drop=FALSE] # filter the matrix +dim(emptied_matrix) # check the dimension of the filtered matrix +``` + +From here, we can move on to adding cell metadata. + +# Adding cell metadata + +The genes IDs are stored in *colnames*. Let's exctract them into a separate object: +```r +barcode <- colnames(alevin_se) +``` + +Now, we can simply add those barcodes into *colData names* which stores cell metadata. To do this, we will create a column called `barcode` in *colData* and pass the stored values into there. + +```r +colData(alevin_se)$barcode <- barcode +``` + +As we saw above, the dimension of the filtered matrix is A x B. It means that there are X cells and Y genes. 
We will now extract those cells from the filtered matrix. + +```r +retained_cells <- colnames(emptied_matrix) +retained_cells +``` + +Now, we can simply add those barcodes into *rowData names* which stores gene metadata. To do this, we will create a column called `gene_ID` in *rowData* and pass the stored values into there. + + +# Adding gene metadata As you saw above, the genes IDs are stored in *rownames*. Let's extract them into a separate object: -```R +```r gene_ID <- rownames(alevin_se) ``` +Now, we can simply add those genes IDs into *rowData names* which stores gene metadata. To do this, we will create a column called `gene_ID` in *rowData* and pass the stored values into there. + +```r +rowData(alevin_se)$gene_ID <- gene_ID +``` + +## Adding genes symbols based on their IDs + +Since gene symbols are much more informative than only gene IDs, we will add them to our metadata. We will base this annotation on Ensembl - the genome database – with the use of the library BioMart. We will use the archive Genome assembly GRCm38 to get the gene names. Please note that the updated version (GRCm39) is available, but some of the gene IDs are not in that Ensembl database. The code below is written in a way that it will work for the updated dataset too, but will produce ‘NA’ where the corresponding gene name couldn’t be found. + +```r +# get relevant gene names +library("biomaRt") # load the BioMart library +ensembl.ids <- gene_ID +mart <- useEnsembl(biomart = "ENSEMBL_MART_ENSEMBL") # connect to a specified BioMart database and dataset hosted by Ensembl +ensembl_m = useMart("ensembl", dataset="mmusculus_gene_ensembl", host='https://nov2020.archive.ensembl.org') + +# The line above connects to a specified BioMart database and dataset within this database. +# In our case we choose the mus musculus database and to get the desired Genome assembly GRCm38, +# we specify the host with this archive.
If you want to use the most recent version of the dataset, just run: +# ensembl_m = useMart("ensembl", dataset="mmusculus_gene_ensembl") +``` +```r +genes <- getBM(attributes=c('ensembl_gene_id','external_gene_name'), + filters = 'ensembl_gene_id', + values = ensembl.ids, + mart = ensembl_m) + +# The line above retrieves the specified attributes from the connected BioMart database; +# 'ensembl_gene_id' are genes IDs, +# 'external_gene_name' are the genes symbols that we want to get for our values stored in ‘ensembl.ids’. +``` +```r +# see the resulting data +head(genes) +``` +```r +# replace IDs for gene names +gene_names <- ensembl.ids +count = 1 +for (geneID in gene_names) +{ + index <- which(genes==geneID) # finds an index of geneID in the genes object created by getBM() + if (length(index)==0) # condition in case if there is no corresponding gene name in the chosen dataset + { + gene_names[count] <- 'NA' + } + else + { + gene_names[count] <- genes$external_gene_name[index] # replaces gene ID by the corresponding gene name based on the found geneID’s index + } + count = count + 1 # increased count so that every element in gene_names is replaced +} +``` +```r +# add the gene names into rowData in a new column gene_name +rowData(alevin_se)$gene_name <- gene_names +``` +```r +# see the changes +rowData(alevin_se) +``` + +If you are working on your own data and it’s not mouse data, you can check available datasets for other species and just use relevant dataset in `useMart()` function. +```r +listDatasets(mart) # available datasets +``` + +> Ensembl connection problems +> Sometimes you may encounter some connection issues with Ensembl. To improve performance Ensembl provides several mirrors of their site distributed around the globe. When you use the default settings for useEnsembl() your queries will be directed to your closest mirror geographically. In theory this should give you the best performance, however this is not always the case in practice. 
For example, if the nearest mirror is experiencing many queries from other users it may perform poorly for you. In such cases, the other mirrors should be chosen automatically. +> +{: .warning} + + + + + Inspect {% icon galaxy-eye %} the **Gene Information** object in the history. Now you have made a new key for gene_id, with gene name and a column of mitochondrial information (false = not mitochondrial, true = mitochondrial). We need to add this information into the salmonKallistoMtxTo10x output 'Gene table'. But we need to keep 'Gene table' in the same order, since it is referenced in the 'Matrix table' by row. From 4bdf653b47b338da08b8b5372cda674cde299c92 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 12 Nov 2023 00:25:47 +0000 Subject: [PATCH 12/46] cell metadata to be finished --- .../single-cell/tutorials/alevin-commandline/tutorial.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 0d2b294d4a5bb7..426c72d0cbe27f 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -328,21 +328,24 @@ emptied_matrix <- matrix_alevin[,which(is.cell),drop=FALSE] # filter th dim(emptied_matrix) # check the dimension of the filtered matrix ``` -From here, we can move on to adding cell metadata. +From here, we can move on to adding cell metadata and we'll return to `emptied_matrix` soon. # Adding cell metadata -The genes IDs are stored in *colnames*. Let's exctract them into a separate object: +The cells barcodes are stored in *colnames*. Let's exctract them into a separate object: ```r barcode <- colnames(alevin_se) ``` -Now, we can simply add those barcodes into *colData names* which stores cell metadata. 
To do this, we will create a column called `barcode` in *colData* and pass the stored values into there. +Now, we can simply add those barcodes into *colData names* where we will keep the cell metadata. To do this, we will create a column called `barcode` in *colData* and pass the stored values into there. ```r colData(alevin_se)$barcode <- barcode +colData(alevin) ``` + + As we saw above, the dimension of the filtered matrix is A x B. It means that there are X cells and Y genes. We will now extract those cells from the filtered matrix. ```r From 01e271b18ae95bfa8b81bba60605efd5de36333d Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 12 Nov 2023 15:50:13 +0000 Subject: [PATCH 13/46] sce, saving left --- .../tutorials/alevin-commandline/tutorial.md | 247 ++++++++++++++---- 1 file changed, 198 insertions(+), 49 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 426c72d0cbe27f..b36f83e5f336e5 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -143,6 +143,11 @@ To generate gene-level quantifications based on transcriptome quantification, Al We will use the murine reference annotation as retrieved from Ensembl in GTF format. This annotation contains gene, exon, transcript and all sorts of other information on the sequences. We will use these to generate the transcript-gene mapping by passing that information to a tool that extracts just the transcript identifiers we need. +There is also one folder that we will use later on, but since currently we're using bash, it's easier to unzip that folder now. Don't worry, we will come back to it later! 
+ +```bash +unzip +``` # Generate a transcript to gene map and filtered FASTA @@ -325,10 +330,9 @@ We got rid of the background droplets containing no cells, so now we will filter ```r emptied_matrix <- matrix_alevin[,which(is.cell),drop=FALSE] # filter the matrix -dim(emptied_matrix) # check the dimension of the filtered matrix ``` -From here, we can move on to adding cell metadata and we'll return to `emptied_matrix` soon. +From here, we can move on to adding metadata and we'll return to `emptied_matrix` soon. # Adding cell metadata @@ -341,30 +345,28 @@ Now, we can simply add those barcodes into *colData names* where we will keep th ```r colData(alevin_se)$barcode <- barcode -colData(alevin) +colData(alevin_se) ``` - - -As we saw above, the dimension of the filtered matrix is A x B. It means that there are X cells and Y genes. We will now extract those cells from the filtered matrix. +That's only cell barcodes for now! However, after running *emptyDrops*, we generated lots of cell information that is currently stored in `out` object (Total, LogProb, PValue, Limited, FDR). Let's add those values to cell metadata! Since we already have *barcodes* in there, we will simply bind the emptyDrops output to the existing dataframe: ```r -retained_cells <- colnames(emptied_matrix) -retained_cells +colData(alevin_se) <- cbind(colData(alevin_se),out) +colData(alevin_se) ``` -Now, we can simply add those barcodes into *rowData names* which stores gene metadata. To do this, we will create a column called `gene_ID` in *rowData* and pass the stored values into there. +As you can see, the new columns were appended successfully and now the dataframe has 6 columns. # Adding gene metadata -As you saw above, the genes IDs are stored in *rownames*. Let's exctract them into a separate object: +The genes IDs are stored in *rownames*. 
Let's extract them into a separate object: ```r gene_ID <- rownames(alevin_se) ``` -Now, we can simply add those genes IDs into *rowData names* which stores gene metadata. To do this, we will create a column called `gene_ID` in *rowData* and pass the stored values into there. +Similarly, we will add those gene IDs into *rowData names* which stores gene metadata. To do this, we will create a column called `gene_ID` in *rowData* and pass the stored values into there. ```r rowData(alevin_se)$gene_ID <- gene_ID @@ -442,50 +444,197 @@ listDatasets(mart) # available datasets add mito annotation --> +# Subsetting the object +Let's go back to the `emptied_matrix` object. Do you remember how many cells were left after filtering? We can check that by looking at the matrix's dimensions: -Inspect {% icon galaxy-eye %} the **Gene Information** object in the history. Now you have made a new key for gene_id, with gene name and a column of mitochondrial information (false = not mitochondrial, true = mitochondrial). We need to add this information into the salmonKallistoMtxTo10x output 'Gene table'. But we need to keep 'Gene table' in the same order, since it is referenced in the 'Matrix table' by row. +```r +dim(matrix_alevin) # check the dimension of the unfiltered matrix +dim(emptied_matrix) # check the dimension of the filtered matrix +``` -> Combine MTX Gene Table with Gene Information -> -> 1. {% tool [Join two Datasets](join1) %} with the following parameters: -> - *"Join"*: `Gene Table` -> - *"Using column"*: `Column: 1` -> - *"with"*: `Gene Information` -> - *"and column"*: `Column: 1` -> - *"Keep lines of first input that do not join with second input"*: `Yes` -> - *"Keep lines of first input that are incomplete"*: `Yes` -> - *"Fill empty columns"*: `No` -> - *"Keep the header lines"*: `No` -> -> -> If you inspect {% icon galaxy-eye %} the object, you'll see we have joined these tables and now have quite a few gene_id repeats.
Let's take those out, while keeping the order of the original 'Gene Table'. -> -> -> 2. {% tool [Cut columns from a table](Cut1) %} with the following parameters: -> - *"Cut columns"*: `c1,c4,c5` -> - *"Delimited by"*: `Tab` -> - *"From"*: output of **Join two Datasets** {% icon tool %} -> -> 3. Rename output `Annotated Gene Table` -{: .hands_on} +We've gone from X to Y cells. We've filtered the matrix, but not our SummarizedExperiment. We can subset `alevin_se` based on the cells that were left after filtering. We will store them in a separate list, as we did with the barcodes: + +```r +retained_cells <- colnames(emptied_matrix) +retained_cells +``` -Inspect {% icon galaxy-eye %} your `Annotated Gene Table`. That's more like it! You now have `gene_id`, `gene_name`, and `mito`. Now let's get back to your journey to emptyDrops and sophisticated thresholding of empty droplets! +And now we can subset our SummarizedExperiment based on the barcodes that are in the `retained_cells` list: -# emptyDrops +```r +alevin_subset <- alevin_se[, colData(alevin_se)$barcode %in% retained_cells] +alevin_subset +``` -emptyDrops {% cite article-emptyDrops %} works with a specific form of R object called a SingleCellExperiment. We need to convert our transformed MTX files into that form, using the DropletUtils Read10x tool: +And that's our subset! We now have a filtered matrix and some gene and cell metadata... but we can do more! + +# Adding more metadata + +If you have a look at the Experimental Design from that study, you might notice that there is actually more information about the cells. The most important for us would be batch, genotype and sex, summarised in the small table below.
+ +| Index | Batch | Genotype | Sex | +|-------|-------|----------|--------| +| N701 | 0 | wildtype | male | +| N702 | 1 | knockout | male | +| N703 | 2 | knockout | female | +| N704 | 3 | wildtype | male | +| N705 | 4 | wildtype | male | +| N706 | 5 | wildtype | male | +| N707 | 6 | knockout | male | + +We are currently analysing sample N701, so let's finish it off by adding the information from the table. + +## Batch + +We will label batch with a number - "0" for sample N701, "1" for N702 etc. The way to do it is to create a list of zeros whose length corresponds to the number of cells that we have in our SummarizedExperiment object, like so: + +```r +batch <- rep("0", length(colnames(alevin_subset))) +``` + +And now create a batch slot in the *colData names* and append the `batch` list in the same way as we did with barcodes: + +```r +colData(alevin_subset)$batch <- batch +colData(alevin_subset) +``` + +A new column appeared, full of zeros - as expected! + +## Genotype + +It's all the same for genotype, but instead of creating a list with zeros, we'll create a list with the string "wildtype" and append it into the genotype slot: + +```r +genotype <- rep("wildtype", length(colnames(alevin_subset))) +colData(alevin_subset)$genotype <- genotype +``` + +## Sex + +You already know what to do, right? A list with the string "male" and then adding it into a new slot! +```r +sex <- rep("male", length(colnames(alevin_subset))) +colData(alevin_subset)$sex <- sex +``` + +Check if all looks fine: +```r +colData(alevin_subset) +``` + +3 new columns appeared with the information that we've just added - perfect! You can add any information you need in this way, as long as it's the same for all the cells from one sample. + + +# More datasets + +We've done the analysis for one sample. But there are 7 samples in this experiment and it would be very handy to have all the information in one place.
Therefore, you would need to repeat all the steps for the subsequent samples (that's when you'll appreciate wrapped tools and automation in Galaxy workflows!). To make your life easier, we will show you how to combine the datasets on smaller scale. Also, to save you some time, we've already run alevin on sample 702 (also subsampled to 50k reads). Let's quickly repeat the steps we performed in R to complete the analysis of sample 702 in the same way as we did with 701. + +At the very beginnig of the tutorial we unzipped the folder with the alevin output of sample 702, remember that? Normally, you would switch kernel to bash to run alevin, and then back to R to complete the analysis, but for the purpose of this tutorial we've done it for you so that you can continue in R. + +> Switching kernels & losing variables +> +> Be aware that every time when you switch kernel, you will lose variables you store in the objects that you've created, unless you save them. Therefore, if you want to switch from R to bash, make sure you save your R objects! The last section of this tutorial will show you how to do it. +> +{: .warning} + + + +## Analysis of sample 702 + +Above we described all the steps and explained what each bit of code does. Below all those steps are in one block of code, so read carefully and make sure you understand everything! 
+ +```r +path2 <- 'alevin_output_702/alevin/quants_mat.gz' +alevin2 <- tximeta(coldata = data.frame(files = path2, names = "sample702"), type = "alevin") + +matrix_alevin2 <- assays(alevin2)$counts + +out2 <- emptyDrops(matrix_alevin2, lower = 100, niters = 1000, retain = 20) + +is.cell2 <- out2$FDR <= 0.01 +sum(is.cell2, na.rm=TRUE) + +emptied_matrix2 <- matrix_alevin2[,which(is.cell2),drop=FALSE] + +barcode2 <- colnames(alevin2) +colData(alevin2)$barcode <- barcode2 +colData(alevin2) <- cbind(colData(alevin2),out2) + +gene_ID2 <- rownames(alevin2) +rowData(alevin2)$gene_ID <- gene_ID2 + +# get relevant gene names +ensembl.ids2 <- gene_ID2 # fData() allows to access cds rowData table +mart <- useEnsembl(biomart = "ENSEMBL_MART_ENSEMBL") # connect to a specified BioMart database and dataset hosted by Ensembl +ensembl_m2 = useMart("ensembl", dataset="mmusculus_gene_ensembl", host='https://nov2020.archive.ensembl.org') + +genes2 <- getBM(attributes=c('ensembl_gene_id','external_gene_name'), + filters = 'ensembl_gene_id', + values = ensembl.ids2, + mart = ensembl_m2) + +# replace IDs for gene names +gene_names2 <- ensembl.ids2 +count = 1 +for (geneID in gene_names2) +{ + index <- which(genes2==geneID) # finds an index of geneID in the genes object created by getBM() + if (length(index)==0) # condition in case if there is no corresponding gene name in the chosen dataset + { + gene_names2[count] <- 'NA' + } + else + { + gene_names2[count] <- genes2$external_gene_name[index] # replaces gene ID by the corresponding gene name based on the found geneID’s index + } + count = count + 1 # increased count so that every element in gene_names is replaced +} + +rowData(alevin2)$gene_name <- gene_names2 + +retained_cells2 <- colnames(emptied_matrix2) +alevin_subset2 <- alevin2[, colData(alevin2)$barcode %in% retained_cells2] + +batch2 <- rep("1", length(colnames(alevin_subset2))) +colData(alevin_subset2)$batch <- batch2 + +genotype2 <- rep("knockout", 
length(colnames(alevin_subset2))) +colData(alevin_subset2)$genotype <- genotype2 + +sex2 <- rep("male", length(colnames(alevin_subset2))) +colData(alevin_subset2)$sex <- sex2 + +alevin_subset2 +``` + +Alright, another sample pre-processed! + + +# Combining datasets + +Now we can combine those two objects into one using one simple command: + +```r +alevin_combined <- cbind(alevin_subset, alevin_subset2) +alevin_combined +``` + +If you have more samples, just append them in the same way. We won't process another sample here, but pretending that we have third sample, we would combine it like this: + +```r +alevin_subset3 <- alevin_subset2 # copy dataset for demonstration purposes +alevin_combined_demo <- cbind(alevin_combined, alevin_subset3) +alevin_combined_demo +``` + +You get the point, right? It's important though that the rowData names and colData names are the same in each sample. + + +# Saving and exporting the files -> Converting to SingleCellExperiment format -> -> 1. {% tool [DropletUtils Read10x](toolshed.g2.bx.psu.edu/repos/ebi-gxa/dropletutils_read_10x/dropletutils_read_10x/1.0.4+galaxy0) %} with the following parameters: -> - {% icon param-file %} *"Expression matrix in sparse matrix format (.mtx)"*: `Matrix table` -> - {% icon param-file %} *"Gene Table"*: `Annotated Gene Table` -> - {% icon param-file %} *"Barcode/cell table"*: `Barcode table` -> - *"Should metadata file be added?"*: `No` -> -> 2. Rename {% icon galaxy-pencil %} output: `SCE Object` -{: .hands_on} -Fantastic! Now that our matrix is combined into an object, specifically the SingleCellExperiment format, we can now run emptyDrops! Let's get rid of those background droplets containing no cells! From 981616eb4e7798741db8be2cd09adb76ff0cb8f2 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 12 Nov 2023 16:48:56 +0000 Subject: [PATCH 14/46] done!
--- .../tutorials/alevin-commandline/tutorial.md | 86 +++++++++++++++---- 1 file changed, 69 insertions(+), 17 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index b36f83e5f336e5..9d775ae4fcbbd2 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -143,11 +143,6 @@ To generate gene-level quantifications based on transcriptome quantification, Al We will use the murine reference annotation as retrieved from Ensembl in GTF format. This annotation contains gene, exon, transcript and all sorts of other information on the sequences. We will use these to generate the transcript-gene mapping by passing that information to a tool that extracts just the transcript identifiers we need. -There is also one folder that we will use later on, but since currently we're using bash, it's easier to unzip that folder now. Don't worry, we will come back to it later! - -```bash -unzip -``` # Generate a transcript to gene map and filtered FASTA @@ -228,7 +223,7 @@ All the required input parameters are described in [the documentation](https://s - `--dumpFeatures` - if activated, alevin dumps all the features used by the CB classification and their counts at each cell level. It’s generally used in pair with other command line flags. -We have also added some additional parameters (`--freqThreshold`, `--keepCBFraction`) and their values are derived from the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) after QC to stop Alevin from applying its own thresholds. +We have also added some additional parameters (`--freqThreshold`, `--keepCBFraction`) and their values are derived from the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) after QC to stop Alevin from applying its own thresholds. 
However, if you're not sure what value to pick, you can simply allow Alevin to make its own calls on what constitutes empty droplets. Once all the above requirement are satisfied, Alevin can be run using the following command: @@ -531,21 +526,38 @@ colData(alevin_subset) We've done the analysis for one sample. But there are 7 samples in this experiment and it would be very handy to have all the information in one place. Therefore, you would need to repeat all the steps for the subsequent samples (that's when you'll appreciate wrapped tools and automation in Galaxy workflows!). To make your life easier, we will show you how to combine the datasets on smaller scale. Also, to save you some time, we've already run alevin on sample 702 (also subsampled to 50k reads). Let's quickly repeat the steps we performed in R to complete the analysis of sample 702 in the same way as we did with 701. -At the very beginnig of the tutorial we unzipped the folder with the alevin output of sample 702, remember that? Normally, you would switch kernel to bash to run alevin, and then back to R to complete the analysis, but for the purpose of this tutorial we've done it for you so that you can continue in R. +But first, we have to save the results of our hard work on sample 701! + +## Saving sample 701 data + +Saving files is quite straightforward. Just specify which object you want to save and how you want the file to be named. Don't forget the extension! Keep in mind that the object is stored under its current name, `alevin_subset`, whatever the file is called. + +```r +save(alevin_subset, file = "alevin_701.RData") +``` + +You will see the new file in the panel on the left. + + +## Analysis of sample 702 + +Normally, at this point you would switch kernel to bash to run alevin, and then back to R to complete the analysis of another sample. Here, we are providing you with the alevin output for the next sample, but to give you some practice in switching kernels and saving data, we will use bash to unzip the folder with that output data.
> Switching kernels & losing variables > -> Be aware that every time when you switch kernel, you will lose variables you store in the objects that you've created, unless you save them. Therefore, if you want to switch from R to bash, make sure you save your R objects! The last section of this tutorial will show you how to do it. +> Be aware that every time when you switch kernel, you will lose variables you store in the objects that you've created, unless you save them. Therefore, if you want to switch from R to bash, make sure you save your R objects! You can then load them anytime. > {: .warning} - +Let's switch the kernel back to bash and run the following code to unzip the alevin output for sample 702: -## Analysis of sample 702 +```bash +unzip +``` -Above we described all the steps and explained what each bit of code does. Below all those steps are in one block of code, so read carefully and make sure you understand everything! +The files are there! Now back to R - switch kernel again. + +Above we described all the steps done in R and explained what each bit of code does. Below all those steps are in one block of code, so read carefully and make sure you understand everything! ```r path2 <- 'alevin_output_702/alevin/quants_mat.gz' @@ -602,13 +614,14 @@ alevin_subset2 <- alevin2[, colData(alevin2)$barcode %in% retained_cells2] batch2 <- rep("1", length(colnames(alevin_subset2))) colData(alevin_subset2)$batch <- batch2 -genotype2 <- rep("knockout", length(colnames(alevin_subset2))) +genotype2 <- rep("wildtype", length(colnames(alevin_subset2))) colData(alevin_subset2)$genotype <- genotype2 sex2 <- rep("male", length(colnames(alevin_subset2))) colData(alevin_subset2)$sex <- sex2 -alevin_subset2 +alevin_702 <- alevin_subset2 +alevin_702 ``` Alright, another sample pre-processed! @@ -616,17 +629,28 @@ Alright, another sample pre-processed! 
# Combining datasets +Pre-processed sample 702 is there, but we still need to load sample 701 that we saved before switching kernels. It's just as easy as saving the object: + +```r +alevin_701 <- get(load("alevin_701.RData")) # load() returns the name of the restored object, get() fetches it so we can call it alevin_701 +``` + +Check if it was loaded ok: +```r +alevin_701 +``` + Now we can combine those two objects into one using one simple command: ```r -alevin_combined <- cbind(alevin_subset, alevin_subset2) +alevin_combined <- cbind(alevin_701, alevin_702) alevin_combined ``` If you have more samples, just append them in the same way. We won't process another sample here, but pretending that we have third sample, we would combine it like this: ```r -alevin_subset3 <- alevin_subset2 # copy dataset for demonstration purposes +alevin_subset3 <- alevin_702 # copy dataset for demonstration purposes alevin_combined_demo <- cbind(alevin_combined, alevin_subset3) alevin_combined_demo ``` @@ -636,5 +660,33 @@ You get the point, right? It's imporant though that the rowData names and colDat # Saving and exporting the files +It is generally more common to use SingleCellExperiment format rather than SummarizedExperiment. The conversion is quick and easy, and goes like this: + +```r +alevin_sce <- as(alevin_combined, "SingleCellExperiment") +alevin_sce +``` +As you can see, all the assays and metadata have been successfully transferred during this conversion and, believe me, the SCE object will be more useful for you! + +You've already learned how to save and load objects in a Jupyter notebook, so let's save the SCE object: + +```r +save(alevin_sce, file = "alevin_combined_sce.rdata") +``` + +The last thing that might be useful is exporting the files into your Galaxy history. To do it... guess what! Yes - switching kernels again! But this time we choose the Python kernel and run the following command: + +```python +put("alevin_combined_sce.rdata") +``` + +# Conclusion +Well done!
In this tutorial we have: +- examined raw read data, annotations and necessary input files for quantification +- created an index in salmon and run Alevin +- identified barcodes that correspond to non-empty droplets +- added gene and cell metadata +- applied the necessary conversion to pass these data to downstream processes. +As you might now appreciate, some tasks are much quicker and easier when run in the code, but the reproducibility and automation of Galaxy workflows is a huge advantage that helps in dealing with many samples more quickly and efficiently. From bad7f28cc71e55917c1f3642206fad017e1d2d89 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 12 Nov 2023 18:18:13 +0000 Subject: [PATCH 15/46] add images for alevin tutorial --- .../images/scrna-pre-processing/bash.png | Bin 0 -> 10041 bytes .../scrna-pre-processing/switch_kernel.jpg | Bin 0 -> 136118 bytes 2 files changed, 0 insertions(+), 0 deletions(-) create mode 100644 topics/single-cell/images/scrna-pre-processing/bash.png create mode 100644 topics/single-cell/images/scrna-pre-processing/switch_kernel.jpg diff --git a/topics/single-cell/images/scrna-pre-processing/bash.png b/topics/single-cell/images/scrna-pre-processing/bash.png new file mode 100644 index 0000000000000000000000000000000000000000..1dcdbe726cabfa70bad788512b519e7ab2876b5b GIT binary patch literal 10041 zcmeHtXEdDM`|c!&h#&~wC?SOCf*`txJ`rtnMhFo-QARfj5-n;N5xtD)j50bAogf)~ zv>^J7(MNY4?{BSh{$I}5v({Pb_hHto+0VVNy`R0W`@XMz&yLW2u1`349CqSDk* z(FcL9^aGz?*RKIBgY-|>fv+nb`s&J{vO(50;0LMw6YVD;5Gt1P)S3+VP4Py<)B^;f z^Su0AnSI6X0|N03YN|Ye_(1V9=m~9`N54s64q1`hfTde~^jf+H>#BKTURja9co8L;`dF ztJ2hCIuaF(vO_PQ*6f^4jbamIBh#vU{o}~AIjEEt%z-IFMmpt%#TX9Aixf;dz1H5M z1zB+-Ku!0Saxvy7h1=uy1?$^}MOv?x=xCH`X~CfPW^N|F#d{Oai|W@zORd%kSN_~X zfXqyI10QKedk^Z4l51X62cbx@ ze*woxuYTrt5>^zL|AU6-7B|pwMPgP!-W}CkyWJ$73G|s`?jFN?+ZV!A!6{*x`XfR#h@Xs#K}-5%|oWgH2_W` zw8S8=+$<(}(9DM4n2Jup$8Wg&^>3dJ#?)z{XO)|&rIl45JU8C#jpdbGb0iR&l8C0N 
zv$+pO{Vp1HdYKgTw7JHuI2!>5gbL>Qc8W5@vXw!*E%Se{x0D{fmjZX1=P6`&2XHn^ zSV|Cr&IXIFuBiyVkE?pKoXk-4dR>#h2^ZK&%-Ez7>qaU}(676GhJI~F-tTejpzEQ7 z)k(Hi>mrARb!Rn+;HRxicod&jiV^pKzlK+)%PF6_QvNM4NB@LI)tjHB44J!GbmUM1 zUL#K-PScAz3OJJid54tTs{T7QX&?E@PM6{X*s5BR*-Rojg6egvSk6<-zmO? z^>s!@%ktN7ajoa+%o?%Ps4#LOlvM8D02{tp{h$wpRy-gXYsvdqf9_7EWONUBbVzyU z=B!?XrH(efClu?eC2Tt63$|qg;KJE|%8SHB`hSDvnYTMTbZSY6aWk5JG`ULqc?w=f z_f(1ojd6*oewEa^B^#$5&V5VtAZKU>e>$SZBj~F~Rr36u;8Vf(*ZU^dQzN709#3dh z9ZPG`fo<;q*t#VZwRBt$$jsNqhHGz-h{Xb^iQ&Gv{qgi6rJjo}WRqb8M~)nK^62M= z^zVT6;N*u+&#^%1z**wJNQT?Woj0OFs6cGME7&e+*oB0wfMk>f-}gic7)c8HP-k&< z(1ll+wDGi6dD_^S*s=p-n!G|Fo&KbgnTEMJaP*0Z-jTGgu#s-_-~a;~*4LLzrh_SX z&5KeQyTmjse!C5B+kM18^$ex5#o9v$;1$`1K&;rE^ZH+5zGp|$|L0ANa&Loo&6Nfr z4qKpElQ^%#jBA`l)-&DumS077d)Sc^f0X}sb5N%`RCG81udywubu-C5M4Q7L7&^El zen0akB^txcvwx>p<;klujVqEomy1+cWI?qH1)kU?Ru8;UOX|@AW|7rZ%iTa1@`sL>0T1OWpti zDyYt{7-qsL3sBI;ak;gmOWKXp+#zAYVEgwuo7nr3OZ!6qT_RLe005d$`x@|Z$B!&k z>5@pHBcwIhHqK%Ii*tR0#qex37KvSdp(Ar!(_F`(4K({K zZszd(8fVTj@{Xg=8R~_A_KvWgbEXX((%W&@RvdUqrGX;uqLXfI{Ytrs&(!K#VHR1Z z=b?0y03%F=3)Kwgydtap^eP^(Mdr1w2zZxYT_W>;=d339UhtLEcZ8y6O9D8P`XEp9 zJO%UQbK(jWpD+EB+gpLuwtT4sv;A7iku6+wA)A9Z6uWza!Kd{h38UTFYQ z5&ZagB&PZfZ*+@Jy#d%+!h}OCoI6hS6p2$P*E?#SGicv!)xbYvabuPKqt+RFr#xU* z`j%)wKZ$8hn?GbIrL@Ruv!H9-Rr6mP(|{JwLwI2Ydq4J`AAm#VsBi{ZB3fpDoqP7# zsF?|;%|vj?S=H)%-l*2ONWq;GVRH4JU6_T_2C|R%s&E}~mXsD)18Ei9Z>#Q2?#mZ| zozg%m>>QP8#vz=M0Pi^&HDssF3?>UArL1(9_Rb9br*1p!H=cJ73+V7aHbA``GT26o z_Ve5_?V!7UC3VKEIsZ9>TT|)2IJ~GRa}#--5~h_ovHuzI%X4>@S*uJBx3>X#{9KFF z|9t_lz2KT9EkCb5m0%0ONUz!4F&2ff5e>tQgd&Ryiw*V`ol)LI-h1daYC4&e(c0@qp-F|YrYLve~0YgPG; zW_W8fF9o-fAg%kI2xKJklmt0I+*UI|>W4({V}v%FkDK29ZSbAtee30DjCXhAJxU?Q znr#FQNEZgm=Rf0rtc5)>TVk6SZ;eTKTK+xxGU17%KGCsAxeB=Wp>2kr8_n1KLzzqB z@+Xb^mMN4ssvHsHt9g{1mP4mZQ5|J+HD3Hj)ZboAOjXk33w;J{VWXhn_A(`PU`Hrm zb)zg&UDQdiAl(SU95RQnS%5h&oGfdi1+guraVtlAAcT=nDw_B%*5Zkd=0x!Z6<=Kvl>tFyOBlz_-2&u;zMV3=bQB@M6S1reBN;oNuzbC=gg0dw(sTMo@*+n 
zn+@A>PnjmEy9EwqM&ZgLppR%0u4zM@e?XR^`;C7@uhosbK1X0cwd!C|*I@uy+nx*f;KU)?WEZEFNFyNt%tA1gqoh zH5;)N-^OZD(%-c0Nm8K*^qE&sL`?jF5A;Z#N;jw?>q=bof2fhhp?z9!L(d~%uN0FY z0n=37%6=9(xlp>VRc4v{x4}@W%xdwBq}jwsSl@ke1GDojuK_vxAZTDYuj>E-Yk6AV z{Yf{TSVcB&%L!!qP%|}~6nOeVa$5O93yb#v#NQfQSJ~#|)xPd1QF3Gv?bIAzW?uLg zvy!J;g)CZvDjEA_;?5fMxQ?kD<{Fcvys~aMPd2>}i4*#dRKt0+F08P9gEoiYqf6Gv zuw#@pEnQAa!JS2!Crt@%1BpfhmCc1T2X%UwvLXJ9snp`{*LvHa94uQT9KZDC=Gke_ z8Y)^N<#CvH^cpa99KvK+mkr&fb+FF$dJFzkPu_nZOx`KadP5B@{hV z&Gn0N=-muA3(LW|Iw<*BuFw2PiDbT`iFkd8ty|Yp zxZBjlkQROytfLABst@BC0N=bP9q}ddTSjiM2Q=y0=0CWB#Fe}Uu0{ix5qCuAT9FwNyv!5hNHb2M8nXV|LD|avVl6SRprtoXXiLpcbogd$obFK3mu-HNwaBX zSegj#AJKqfL6}4Gehg#a=oZDd(Wt7ghT+J^sNz399|@;)V|v z{t;Wi*&}h>JMo4N+=3s+g43)u&NHQibfZi{Z=*>6^s^{vlFen@bw*B+V^V1ff?S5C zzmsW5*-(79kNW4Y&f_Pj_leGq@HJVrdbDFBHSQSF?aFxx}-FNQWcLFf6~Y{-Z}4Ocq!EyXHIp?f!oX7po>>PrJb`4MMj7 z0jSi?>=g3AVY&I_7dVs*I02S@?5lCKXqBAM%Iz@e1mfJkbNXyKrbpc7bU6Du6647a zgl&f{*6UX;2k$NN0`yNX%8l>A1gY}7%Xe7Znbhy>ZZm3d=G&ivY|Bx%bIJjG z3%fsB_s6Ym*ONc~_~zD2ihqYh|Mimy+S96te~yF-f)hXgp}IcbM&y4dDG{WgQ#8=A zUYg1;$=k&T4!r?PFp|4+IUSxSi4jsz-hPr+t^(u%Q0a(qiQWv04w<>bn3L1Rn3d9( ze_MwFHG0X0u9>eTicsAtecY}Udh?P;XgcQ^aYf>}TS#jSv`FPK;GGN3SAvU7i3Rgi=H!y}0ta8%^5oIR!uzdYqrP~rvp{%~Y!$}2L@*WBL-k-u|G) z1b7_mT>pOkLwWH!OeB#(KwnmzQwO`Czx3?W+xqLum(FiMkT0^T(-{>%I8Z)vaJ29G z6wiY%309tUzU3Nlag|oTvCM*@wGrY4mzgPkmn zZ99u&i`={Pv|7YwrcBr1g1I{)5L{1h-^!stCFt@T2tWJ#)!p!XUu<*+Ua&VwKYJH! 
zpHG$0b)mRKV-JbIDFJF0GR|Qb)TF$@ewb2JHe(q*bSikLp~Jgp<{Py@PO~M#V0}%n z@(O?!8qMV7l)w+)^)j4Rw5+LpuD7$)-{jcx`8YI#PWJin@;v`~cko)zT|htHktLtp z+ak!X8CJUYY8fg7#PY+@QS0`b_nU*V_m1QUx$=0>+Mhf8tnKs0w#gG^@dZQe4ru&!ad@84p0zX4Q@sTvKzFaxF8nFytF{YRO!L}ywfTqeTLC*+U7)d z99d2(Z&HW9?iq^SO@DjvEhDSaeIS72w-WwQRK}iAEkV7IT4?JofgC`jzHZlb z=y<7RaS#`oPi1I*MI1`&0^o|b=Ruh?$zuhfZxUxgjxTE&IU{M0Z!3_(o9{1>L zUNB2~#+JWeGt2Rgr-|{M#*{w4oEr0f#e%&;Ua&*^+0!zZ=Jn4$??h0Ng38e~){X7@ zk0G;LdS87a=^9in zG5@DOf@U?|eu1D>!8nj~t&fZ?D>hQ9(vi1M^q=I~{J#H_U%z+51jHQ+-soQP#M3Mu1y2;wr&gvMr92m3GHRL-4Yr-4$|w zP!`~**@h-21unHSMxS}K9d}X0S^pDM^&17^w8vLT7qczd*1dx%w~aP(c8H6MO9eT| zj0@4@hcbEgG>m#wCuoPB4Su@aHY?J6xDa)GxELEE1bQ89!YLLR8R^+|$Nu%Wr8m_< z-o#Nq86_jBIp`NNN+r*^VJ*MZ#$kCNl}V1|-B>s`XD4+p^&PSM6UX|2fq`2$F9-R4 ztzzSmahXM-`dwcY#nXA3v(aGZ(X4!i(1VGWH0c35z31ysJyr${U%q^~B`fvJ%c};S zUT{`_urf97Q1AG=W*k~)t=cO2?3G_ipmQaCvr_jYaGA3Nu*E#?;x*bh@YeIjl@xF-_0R$`zdZ@7}~ zow2jK8zNs?b4qY3W}Ad%Dg>UonD4I+$`VEbi8J3KBX7X1{Bai#fJvjN<+hZTmVlDe z`z;&Wh#u1#N5d^aWL@1vrqI+KNwMn5s%}<~PlIy1KZQSFOkT9^rW00wtFz-V>rW2q z-CBAnS!3GWv^4;w**$zwp2C zc=wPC5q{HuvDV|w01a2rBFo&bpc6ckv)91k_M+GBfYeDC1RsZ!oN`{Y>Yu!SH})jC zXhP)+Fh}Dlx7NU|_{49SL1yKx=YAEV=#L+-o$Z&y_l=nyH^~(b-@CS+!SDdX_=SY1 z4)Q1FtpE1JpWkPvOi~T#2&ZjRmmT5}o`FrdoR6cgL;co}=h9db>c@Ws6fOQPFFOKD zfBZW(=x-)*I$(DoBsOmXPV8ofPhZr6tW^c$_i=KIC8kxUPa_6$UvA4>obJ~2@}!db z%M9`SoCc^ObNDC3P*?XB&L#81h%4<_Li%n=aWN&@%p7I&;c~PW0-Xqa)oxgN%Hx-9 z=<+~}<^!*F!ekm7tj42+u}`3^?#Ge+L?{wq@bHBjHX(XdRy^`rOIbK~WkR%#;3L>$ zj2a9UUU4K{%LbchvpOGCdtqt4jZ*9vj(_d@dbE@A=@&U)_ll$)AS`(LxM`nAO>lD# z;L%gA{Cs>AXfcBM+%bY$7Wa^wF5-pGiQpHQ$&?yyU^0g$Gw{H)f-IBdbmf^R~cfsi^ zTmEXCU)Z!VuUo|7A32{lTO}UH2Y8_L0Hdz@X4I4N&|0}x?Ke8^Nxze^ss1f%H9=Zw z*qu;IcTul(h$Z}NKfR$R=;fV9ujAlDgUfgQR`0ZJMZH z?LptABxBEuGvdicHMU+cRdtHt?0A27+{(q_cyH|-$(q8-Hg!LzY%q(gk^xw721dr? 
z+%nD2JT)g3rfF#2ca~0v0NQY@VSoEq*poo-WV@<kmYM?wiLvyf`Peo)nLHcGx&_sj!G*`DKKR z$JWBh8kYDK((T0c@o^|sl)PlsLAT;L>R`s(3T~GjQ#MEXO9J47?Gx$MtdQW++mv${ zBsipWOfV5$Jwt3EHp7@6)owf%>VjpA3>siL7bN{h8QFd2KQMVt+9i`^PI^gNFMo9^ z`yE1~FtMIuUa+06z6dW)zbB=_tBEao~l-4LD?Cq(XtWLLcO2s;3mz}LWQ_mVu_Mfm4(P(JtOnL`(E4pSO zCK`s?d;#Sp2I9sltozpbPhyvNBf#Lqi5Ru%hgUe+RAl;l?bwNpe`N z0|tJmmaIg)1y(^={+de=5*bgJQ7&F;$d~l1@`$&T@ZOyMW4b)iEC8fMoKbT3gm$*S zV~QNd6wo6fJ+*q&>uZ%qoJ%^f>bFV19To>oNUzVd^^knOdPbb1L5uafb~Wt|%95M& zNfz1FOK1550-;;@3QBbpUjdlO!g}KFwHX}`WGY&Tv>e)0B$>i|>XxKUS?*PRk^Uur znb7)&Cu_oB6dGVyW*;I7UfWGQkNxo{-7ZYgWqUW;m&6uPPSOy2Nz_LU@2kLqgVHN$ zM&nvo+w`Y2@*`IB-?&pB5AU4L<)}vO;i`sUjz2yH#t410cdCbt0-?D78y$sxUFK#CO4-F^taN~J5!e0&i8_2 z#y;0)_S~N+(tMqLcQ!>-{}`266i3rjh)Fej~b*KSr<#H26vKcs`Q+O zd50Gql93ZvR}Q@E*78$N&(Jra-iSsgvz9?u3m3{F^DI2VW1`C~<(`7Ia2Tq_ zr{QdyE@n^1f-)@!t))CRg`<`;&-Wg{MM+cFuzUMB>o60SJ8~!Ue}J?}8KfS`W`iyt zIMT`jL@TF#Un1Xmq$KRMXVrit`m2Xo^G`0%#3Jd#k zrH&SC+e|^vb4X0DZb!n_x_+jJpFPd7wCo<`xtNgWBGj+kz33WDx6U1JB(zMm(t4SH z4zs`!CtstA-=oQ<2@M;<7sA}>>GPCXfhb$`Dtb_L?;sPF^s9Y_Z&&MZ_<&r^(@M$s zE*_FecMllLLl(pQmv6mNdH3eS*pF*ZD$5&vwrwuTIJI%)S7lF~!fYPsWZ+`I_&e%( sbKLm%iqYxne{UiEe}CV~Q{z~1c-oAfCK?9=uDgLWRiCSrDZdE*U)-LLJOBUy literal 0 HcmV?d00001 diff --git a/topics/single-cell/images/scrna-pre-processing/switch_kernel.jpg b/topics/single-cell/images/scrna-pre-processing/switch_kernel.jpg new file mode 100644 index 0000000000000000000000000000000000000000..78b4c1b3f6b1495b0b703ce0a1e5d73449033959 GIT binary patch literal 136118 zcmeFa2UJtdw2{k~1bs9Kw(ZJXM zprN4w5a16$#Q?|k5$?AEz|<6w1pt5ve8&aQf`8F~KYv`*aX=3^L__oI_pcW%J?*cH zfsT%ro{@o(@z>47%*xEf#KOeL$il|L!g?6I7@65Q*bcM*`ukTSzdrx<75ICYiIM47 zjsM<7{RD6_9lF3UO-pkeIK)Xq%Sl6R10W!*^dLsRkoxOIbBLCXo&f}jg%zw&bp!;T zmi7<`EIl0^SUVhi572Saa~(T%j)B|UiSf7(kMh0bY$oyZWzD=-`|%Q|Z~ES6X5r%( z5EPO;Atfy%tD>r=uA!-=cR}C4(8&1Wwd)p^R@OGQ&MvoZySll1`1uC}1|fn&q8>y) zjCm9rmy-H4E&bW^7a2LZZ}RdB-WI+quc)l5uBrV{*V5Y7-qHEF>&w94(D&hyAERR! 
zEN*6YZhm2LiLkM`wY{@T+}r!E(|9^zp-x2!< zUSq&vS{iWhXgL8WKtAwbE`{ReDgrR;_r*>&mS@`8b?XSNAx=EPU<0D#%28L&suFf1 zb;XJLO*fLn5C!H9#AWQF0*OXpt;rpGo4}q14`Hj&&iOXCL-**6Q)J3Sb>BIWo3AxK zsKb>yDFUTdE(i4yu(@!|a9W+!=V!3W$@l|Bk|7a;g&%P=guijmZM94{pIS2Z@=odX zSQKj5c%3YBv;K`Yqr}`t)}1rt+AZW*>vaNOnkO`>R26ynx#OG}3o#$UwRp8tT9U<~ zQv|8ed^l0)iJ_p#kp9n6{|`@H5i%b$J3pH1^XexawA}lIPDHxFA6Y6#Ozl#H2~v-A zRjd~RCs_v8@7!tNa4-FNzUJ!G(y4BRPlLx&Z_mUX74ph@prI|aZ59iw;~J_;)LJ@lNEsXCR7)?{)1phW`8gZ= zAkR80F_KObodb1($7QMzzaJu&QKWME8>CPWRiLTHdQAjPFhFqRHBZXquIx>qI?pPAk_?i02SKuvfbZo z6FZ4#FOtyWAA55s+(K0Woo7(o`Kn86{17tk#8&=9m{T9rdDZ!99ie@$sBk%g8Bd>( zQS%61JtKzVzuj`w98hG{7I%mXjqHxnJNN(V&hA{kn}+_g_aAM*N2xJt)f#8$Ekhx z4cr-9U!?;13z=HP7ui&xMYYg~3UIXN*qYA21B-g{9jeJkuYM4_X3BJX@1D}CtnoeZSM&k83X6KTplJE#Dh3J7|@IFL7c zV>5=6k#F5?(_duS_rim3eyvek<6k$hnwvc1HLoN0nF_c<;-I5VJPi)WX)16e%!nX$ zU;7m1gSYAdhp8RJRw5q_sh9p9Jz`El&{F(dB0v4_MFe<>XeGnCNtBIsc zHbd80gr@I#e)XaP?!;0mfVoZu+A_~k0oHRisbpznTfJ9d?ffix-n6N8dDYQ%8a=*g zPq>Z3aBzA+*$rZ+Vf>yO2ycbUrU=t;wjcEw!%_#wo2yD@QY2gVnmb6(@yaHX5xpg= zn}g@=bI=hPncG;o^@>m54X1BSjo%^ZtM69%?c0tgEnXtLnz8NLdN)-Ox{lweo7(0_ zlBMpyFw8Y~R5%C=Gy6zXKTutAlqjqFEhfh9m2h^^i>mJjSGqYAN=4p3`uVvn_;F9Q{0bw> z*G~{rbJz)@5`_hoMMO7eYBW5YYHpPH-XNTR0=;8Kj~Ezl3mmyyd5`Y)tayUHuGz$` z&fMglz>N7-E8Iu*b_y@P0dlz5JM0>ERmprkZAC!t#uv6ncVo^SZ)2#fc$(EHi@@6XR&&Ouh{bL$cDkE zx1C=skM|^Rpf%d@yQ|jyO?+Oii*d`7bG03=KNZ~}^cI?)i`&GtUo7(xpLr2($Kci1R!+o79<6jxY zB=OjL+9DtEw`KTS)zqLk8>K{LY>`Jb?)Q{BzYIgi=vr@#+}rfMD`!)aS3;x$t;(;& z(wC7fQ*%k>0!g+Syo8eVRZ%qfW~UL)7^^En^o8@-AqFvbGJ0O;=Yc%sGCJO_(DSBH zvh~4>v>mlRE~&l8%;uX`cHvZjp6%JyGQ7j%7>t8Lm=kt+Rqv7({85-eooQOhY`myU7?aP%#Nr$ya&ZA~X1 zq$~MlxQ~}ejatOrmn-GU_X+C)9F-~R?Chtl8VF0WVvoOa*VRe7=BoCF8P*QRL5fkEK&EP0AO@5z?7lv%)E2Ob3)(g$jrwf(%kN+Z_c7 z;pDC(0Bv^woDNl&0({0uRw?kYPt166f30eam7@ZV=|h2mmP;$)4z=9IAI_qabh(ku z)+(^XOqDA9P7D=bS)&5i2?nU3U`)4{c;|54$EC`XDp;M%OvjyXcBpE~dh^=zOk8H3 z_d?!5C#6z)kxUI|8~O0s8J!8*mKmm#>s)ntD|Gf4Gm^>Y_!EjY=frO;H{83dpi)0h z2!uUy@9CbI`dOqq$I(t-t{>KfgN)gBP8-JRo+PF~7?&~%9;w8GvhRn7TKZUXT7aOp 
z0+usAAo;xhd()^-n=CW9>Y=9U)!j(5gkn54cy2n6*poK^)R;uUe_%o7y1bXfyKL#zP*K z`xfgDt!(hTYx+pC?DYXULn$V~;02AEitRPI?+?r*E>zPbnJ_Xezy2yI2ff`uMI)O*{>t z0$s_FTgCx{^JMYGpDl?k7229u6BrLMqVsA6QhDS`jmdN2gU?HXW;6Qso1Je9D?UCi z&8U-_ke-m`&<=^yCM-!G3J1T-0$=9#iseN9|MV)4Yc5hMC!34*Dq|L-`Pn=DXX$y& zf8YNb8VEo*wM81Cc=oEShSfWDJqgvO%B8p?$t>Qq;u_z4GB6Fn2PgG#$RD`t#Peef z)jMO)9zt5BuJK`B^OlfMXSye7bavKAp0!lq!Pq) zQ{Ndsk$n|d4pKT5zzwBTtqy%Vbx~ICR#{)aJry9G*hOvqQ{|VF|5TYL7};hvTGR22 z3RIfzwwcirHx2L~hYPc8-PF@Vo;`eNY0Nc-UGh&B^v;qsuDfxLaV;*XjS7@E#g#U| z-z{fCh_T>0yWJTtf2^FKg8Y%Qs3h_Dv_Neon+9LGm-BAJ>|&b6{7@4{wVn!OyS730 zK*85pXqdPTU9cIU0vmQzAnE$sG3}44Br5RL+qC<|dI2U!J08QhqTq_pfyF=x<_;dQ zcq;J8DwB?8)lzLpyf&qMZ1$0JS5j`-*07xW~rC3`sxYTZWI7IqD z?S*_ua0Nnv|4~lrZh7$IqDa{6&!jrJEM2vRUQ)ak6`+_P2}|ZU&ju>+`ELaL{$}Z- z_r?_L+dAQ9Ty^|@zj$`e2P!}`{RJ_B@sCfS7 zM56+*kx5p=0CIq~8D@UP6svR?(TZ8!sUjSA@G~JgOSUYGg=rnP*?VF_6K#n#)BKup zJUQk{1rD!@#)3W*9c<&WiBUWrs4H;mZLGuuxUlzY3V7*X?)v_%DuYl=sB@0#^5d`%Kj^OD%V z4y!oU`%ty_fVGM8=>A(G`ufoN>d$Mf8GTeBxFm-1y@(1_SbS_Rc(~nGTaz*jUky^= zPLqF86Pd6o;oljt#pg~6Bj?^{vO;=l)m^#>4dswFMkz30wlntwrf9jEnzriNmvN%MV z$f&SeI8gRw9>+H&HXv`6zV@K2zCYL6+G=KJVP!e-+4bq?dAOAC z2_oGtmLuSL4gGv0UG%r@VygX)-N9R}d~#+y)2nS2tPdbAOqI2-MddCIBAh$ea)>Cy z@9W!VbPU2zI*dQ9_Z?_fXv)aExpDGBobTyX2kj=BZ5WpaILo%-T!_~JrkzZKl3uM?{vqS=^lBOSV&{9S+CEh|KeAY60RDbG*>l9p{bmHphs#P@bd4*G* zwcTB@LxNMl;B^rZ-a&9vZ*93kRJ!`1wl9T(H4zvc56G>2r*P|pfkavwvvrqV1>?x7 zHXHWQsnjuYQV!Sz~6d3S2`BcZ+C$wNC~9xdY^1JCM|L%jt$yjr52(;7k`WIWj4~e zRrKQz{n|~M%)sR529aT37&sU{Y7k*|K!NFwSt7$Fz_ikOIHqsP>Vs)HTH zRokEJ{L`+@>q23Q6*c`;y^s2dMJ;s>v4dL&R(;W>SN{pXyj0$AI$Rs2__n*f7x^EtSfeK7p z2Ua**l0zd{d`8f5DSzW!YZ^<3_Y&y)NPm`>gbpYgO zr+kF(<(j$6)r7ASd?b>w}GLyBQ(2ps*pT)(MxcwIT-Vm`9P-3NnaM8tRcbr!m#vx`UghpB*5*4)b=!(`oFYjD-onT`p#RG7SE-lO0!}S(v156Ay+(OqWWF0ceS163 z;Wg;{YPx2nS5t+bi+3*=efw~rgQw=m|r(EhNDo+3JBX zq=avyNsML`#YmlyxD5!=%Z!N%^frPkNF*p4+xw{iTOkIrw+kx9e*tOfH%Py!^m{J- z=D6Q->9?r-y`cPFc7H1;|6P5_vQlYmq_t7>0(q=JZy{?ia)B0&oKgdQ%YjKm zyKYcT6%D2Afi?)Pl4iu8l9pCToHFcJOi<6rM|qwUJk{g-s1+}z`VOJDg>Ii3$C;z9 
zkkO>z`5uYJ%lQPukFh@@4@++yY8<>*Ye;tLFk@K?RTbED99c@Map--7%Tx-}dGPRr z%HBj?x94~s=>YP63SU7wLj}&Qg7DCSa{>DnG+$IPuC*ifj_mh9U}D%sWX`;9fEn zAXpGj>q7{K3je7Q2ti~C*(VD&iQ0oS{3no1fq)Q4rh9mhZLFXYeGW9HXker#$TmN) zxwD`n?+|h(ViVcYiw9wRzLWTuE)yYt!%g_FxE0}V$3e05pt7C?(l`=LF@j<>hrp5r z_yu$W8qCqQ&W@w#jX^04hM@XvQKHllVo>8ZGrE)!KJ=sa8C#3dkA2z<5+~XnboK^& zoVKQnI?^BMkuXu+)IVz3v3;uX`u=0N-kt{j#~k!l{36%9Ielvs(6>UpgKkBrnDHm( zMI&6E&g4dTl@2s^uKeUt`pIK8J->cVpXUnAUvFCb?{@hu8Rb+Uq|kmsKD%10dcfkn zU)17R5h@V7eo1Y`kk@cpvblb+;oPx#tq){{l6I+tiIQBDN7Rm&*QCgZ8Cidd3K&mk zaTw&pNdq}?GZghO)1>Ln7PBM7a!-TC_|_feDV;ha&U~gftB22ZL_DP#Hn%s6ZE{1= zB3R9;P;HYZ@#u%Pi^t@|3XAQ=PP!YIBN5Bpf!m$DiCCqv1I)++ita)aTL6U5O)XQA zz>vA0nD(YFu}?vQLSFVTF~ z1uy+_THl@BkFL8Lq)pS%F7o)|?%U%;!gTN&T4AIkg4;7uOJlY^F(A};r%)pEs!CME zb(P30DM6=uy*xBO%;h-m4opZ3ePKb%#T2)w`B)}D`tdG^Wai%$%i)%QJWLps9{a)28jPMo}?|9}j=FZ>!Zah3=9Z9OP5jtCPWNpcaw~ypb>~hBkO_x^S z&T;ItKTdU1;yV)@$uE8JUe%t!nTkv|+UlH!^_D47Bu0x^Ux*0RA8={xCqV^G?d;W$ zDXe<+Cr@oFX!vEzJBzi_62ucowlbQM$;Vyf&`et4FUGA{*_o9=Tp}*NS5S! 
z=*&9hDDsD`*^Jyy-rhA>+*FM#33RNraU8H3lcFQ~r(;NoMaYpRa-9^mQ62fYGzLuY zF$h-WF^3H?ZZcUG+;4z{6KLD{LUc3Uf85@Dj^X)Et|bh#qF zS%VXElIs4s{&;)SDlzO>=rn|T(ry1yG`!elWA4b;ilcXETMK1Fc2xV-+n^H_a{Xz{Kf@n%hX+D%|E@&~x+xPL^lCxopP)OS z8)yvf2;o6~M)86vKC^qLCorfzVN6pege(K8WZq>QNKe+C-oHA6e?cA*O`6}`=_X=u z#!a3uF=b*w8&kh$jT~1e^0@q2qrhm3qtc?00_Z$#zXQXFQP8oeOS=%E3Wm~)R1*?q zC&wHRH#@%$=ES>MW#}3eUD8-hzg3uK#5YuVCm8dOYU2mP z19--k5-uJ4PQGk;p}dt8?Te$+>VkqQG+RNBh55?EWtB!z6#Ya$O3OOPGDi!w1Hhna z$rh+=-G=Y+{VhrvsdM}n=QMQhbmG5daOQ%*ysOcQ*$ioDD(2BWtXDi;WDE=hHhqJ4ZUczsaGp!VfX}VvvP7`*H z2r2TI6dlRyyax|E9iXk{Uvt$xXrS9!0_Hx(9Ie(WwGLrnY8F4O;_^8 zDJ)}*Ka$TACi1Nc#Ii@3mg74c;3zZ!@AF+mmgrt@af}S2(tps24CEt_Y7c#@1>>`JNAW@bNL;JWWoc z2?)C7ZmUxt^W67z#?JbhJzi*PmVU%lz>bkweO$5gBDr`mSQ z`(Pdy+`8YcDi6)3YUj$fAVFzm+`Ez6Y-NgmyxzXxM`v1|kxF$eYl4#<@(c64>f97p zBLkw|Y*0~2;TT~!D#xvf6TNN2-zsGBdUb!So6~5`Yt_4?)cJ*8*=Hl0Mqev135=JA zWVj!`mq738wbMLmsJ~LGdnH?gwl+ugpokDDhe?NnzT^Eta(@?>ti; zQJ)hBNdFIRx#z^L{0UDOQ^Pe^_ruTlt!(v&PPPlpz6p%J|5mh1f@N_bi$j|Hyuofq zi4Fm8R=4%e^(;cbTk7nV*i@XZGK(F#&;4!vq`9#%eP_0h6R8rX-c&e$4LB=w$-isKCw1@pUvTIp zSb@1NN#qdFPRkM^HQ_bh3P*=9wByK z#!pu7CpZcw^Hx@JmkA>Fn1v2?-W9$0m_~(Ww}*Ex&HW&NP zz4BDQu)5IG8R0<9WmrbDL)pyZU~tEtI2dqnj>O(avI)+eaI?}X)}I`>CGJj_P#P>7 z5;bv!u`d*XPWTlaBubc~J0bD=y>TUBK5Fsgs#CS;!>01~t53GaCOv#QFGZ;CIC)F+ zF1vi#fyx#W1j8>ItGecdgsAl6=Ix=8J%zG@4>Yl_hf6#++)=0|Ja!ITYciUMW+;kr zE=9zCYU>(kyEB8zf0yb$bKEJI8EC{qW1ALazzFt9DllUP&fZ6`R@fFB6-WhNra*PU z=x94=-0Q0%@hB)M)sUp|p=lu>P2rvgUwdwG^V~#{E>KQ@98myT)UBW|075!X5dzC_ zQU;QgytfOcmng#bWk4(a!5-L02J~MVgk?R?1Ow|~I!3yYg^!2cUh|KoA{2X3h=BalsybDy-^ zRMs~0(!dXAQ&Opb;sCPy6nPLP9i~K1*48W@&1w=H^l zO=+MqH4g~pAf4imMlKw~avXpe{!~CHmm(0~(zJDiz(}S7e|+D$DtBOoFxtPJhYV{fRGSrbllIU4o`qNNhUY>MpM z>RbZf3(nML2C~G^<{VBan|2B5V}ShIo~_DqyXL4YPy(!ZKB? zxm<7`9wUe9$MY2Raxzjv%LqNaA-0zReL-^s-_eLDxx8lE=q0pC(UM z5TmAhlpF?0z3-R#$|mjh;;Tp>c$8-XHIA2UHzkZI#ZZ(M8kuW7(W^ZnUqTPgmfaQ? 
zkzkw{ohG%?^3Z@uzpb1IE6V+U5EA;QZbKz`XmH3s^_@h#sl)<$Pxaa$yA%jT)v*iA z_nZ3n*q5816;r1IhxIYAeI`T{1?-**bbyw}!{c}~nNIs0n6iceJd0{s!*v zGuG-i<$lk(-wg2&5&oMYJbp98Z-)5I5dWkR{lCc&R^o(u977tO17lf(WAjqV8`xsK zU03^f;F$bcZn=!OeCXjg0E_&S+V*E{F&E#v`Z|B9QKrB0r)M+jNElz^(WiFSj>}_r zPLA1Mvhr5i92LxoC;j{2!zWx8XtIvF%cM7#SgTX8aeU9bty`H&pVhIYdT@uYAl? z@(n?DZ{gwHmt`Ny9)G||u#gXtQyV##k%uTW3t|EU_UZt#)ujh-MjXt8KPNH+Pt7Bh zPq&AFUeSfL_Qn0qIBmniC8X1bKChBHXo;#s*NO~dCdZlB>k(;VR!aDBitewSRZd`B z|9KW*L;T9I$AoYNcjFF%md-vKD&aF|WyL&1;vJw=08+AQXHnC))rD5fv-a|O(j}kR z&*N)c3N<0;YV(&k+KrDbC!4Ac7T=A&51TwKa#-~Fyo!QDGvVi%GneKLC!S#}# z+%=p(D*$ySpg=wTG#KSr1amgXkUy0Jc&;!i?LY;8*$%Bxq9O|1DA@n!aslZEezKw% zbo!4i^_lt)R+Z-zIn7wymK7GzVCsw-tFBZC9Bj+D>1f?6+EW>Q|MN#?OIhS`CX)0f zE&oac*m>69sgwJsaw+Hlo~3l+prHNncarM4rkm!rXeg1lb;T(YV#K>Y^85FAU(T2W+l9F z-970Hq!q7k!!)v(fISNA2FM@Tmu#&pM?|(*p$0Hzy_Os+AL8 zGeqH%j)Z9wR1+x=$L4pEV*HCEnWJrKbCxYP;8ZKD8?UXY&AyX6!j|ZGm-y5q8B3#&mW?26^C@B zVh&y_4$og9%+8_VMV0T~5)63;YHJ>!jq(vc%1fJ_#omg2M&g0DM9>kT^ZmnZ{2)wW8^`g79x1s|bwaM?b1@YaN29>HR3AbLC=DF4M zp2X)D$cNa=`^-6(+ajqXL2A#pZB!`#k z_C1oEcib%vW8rEu1_x)&j|JNM`m|e+i;=EMu~U^sBPDBLX9C{Z5Tjb-z24X&TW;)o zJ$tAW_JR+lsqjT-a#yBH{yFWe$PDci$>QKTQN6|SjF>>mvh`5ISugP+`t;PC$)(E* zQ-NJcqu9Gg(!_mm5+QNf2a*xr)gV;`tT9>`LY2izS)ttMeb?Xb|q23kxBRDYk}6gH5SDBW)%lucXp< zsV2noUbtVuV&vo7b?Fl6XQY$l+1euM>DG>NJ6gS?e2$REceW=zf^lmfR~d(hPI>xF zx<}8oY0cgrYZe~;c|?Q0|4V903|)ZgHX}?#CRgf5RF;jZSU=yXd|6++ zvYKZsrkR&6J*I(k;>Ns){7kEPR0zh3^n%7T35J;!xZ5%m7TTK>+v!Zc4A6}a^>lt` z@O)|~^u+j5cT% zOfBVJdHL;e#~RL{p=MB3^Cn%L?PM3%xP3XMyv(R`##{GVJfs}jrelXK_~a#SH=?LN znHM10(k&h9e3;8sfA2a?ANzL1^!har`lOwR;V5cG&+J8J`r{Wjtm zI+}C0qDPHBk5^q@@rm0`8pSp1Y7vz{w_dBRP^xkNnu1f*m1kFK)u*(%Uzm*NNDJJ$ z(dk1GVF7l&$UVd}e8iO)fD1f~6M@tp^jcFf*L}NFGZ$DnRHNvH6A;FZ*{|cWa2acE zVDj0o1hoIwO&d?#Bix4GN3v9zaSFp~Om7#7R+WXL+%gTAh4`AsSWk8}f9mCJC5aIW z2((7TFOAH^B=rZXtgFrjUZKVfc*<*^f=({Oz3zt@UWi5eDUK`wU4kwpU>x$a~pXl!`z6fZ`}f)9T{6naH^{c>P?r59oA=T9Iz7HE%n1F-4AoX zDJ3BBuMvZKRp5yOFLOoNJ1;*ZE15ieYAnC90MNu`Su7oTwVlP`0*(VMv>mn~zXaW0 
zR-#0DVw#AOZ76|T{oB>Vdpk&?0cG@LU*nM3&Woh4cf18U(-OQUfRcyadG1mdfxRFKOVD&YGS+!|H)bu_CCZvm8JJM4 ztQ&eJnd~|H5sp>ZUFE>Lo^2dz=n}?2VkD*8o|@bxQGr2~M-i&~!S9HE zW+Q8NW*f^YKBk&Iw;yZ3%O`qC(=i`;68N4jAwkU4JQahEaUAgU%W}(~#Ns_pU$}E~ zRN?cn2PR%R| zRz%G$D!&i|5Q!-0DSN(kmZ>}h3SdKzImhdMh+e?v*XW34NUbBxVD zksXsa<~-A-j5;;a)M3#K?VKhx508(6G%6iDGUbWm_qb@JhP)Uo*f+qWAxEl+bv z1;0aZV|tsBW4WzQF@QBrVbIZC5Y`MTanK zt?{k@ytx>n`temlZ+DqnAD7*S^~p$GK|Fse`jG7cq++cEKXv8dawmk_d#ts+rq7|O zXFXPB&;KB%Dg0ddSIYGwVL77+V>Q5Fwcy(f`HdX{Xl8z)i4H59jJl01N$Gql;jc|}IZJ%M2W%cYad_C2e z#-sgebCbKrOsNIM+I#D*UDLxYL4|F*GGr*Xd*9QB55hhwQQl(bs_w3)ZahY0al|z+ zABZ75JE3CQIv5DVx(6GtAcv7xzqYvl^xIB_qbX9nazn4)`CkJV;o52yw(^6YU6OMjVy$xzccvfXG%)V!R9&TQwM7ggD;rx~$vJ(j zdI2*^uV?OC8#W17yQJ1=bmg{E#NOjG)5K7DKSmd;)%G@4SDmbHCP}15Uje!#@LH*7 z#qfiz(l|ei-oeXj*wm zIE`goWh`0B)E+(tKTKr2VkX|I3(5S!y8P%~?|7-mMqcO71QD98?$F$z4HmUOp_=tS-s^1jWzIenNtl6{DV_{)rXi7!QC*2lZs zMzm5)cf2g674L}N8|&M1x?3Mv1GD-eDBLf>v$Z{L4KqyQ-rui zVK&PtEv_)!S#?;KA2@r?E-*}qu}rot+_UgsvNo2dpdLMnQx}Nfk(V0I)m2rQlLAhO z=ieAL$VO@aG>(jT4$8%0a8rO+)0Rme5xUx*|aqGxNu&9^~z_)vRsOLEVM zUZA!8>+FyBmEfT^R}O`FLI`doV=xnv8@X`egISbTZ(M1m8GCc^^hd>BlLL{XQzzb+ ze|{$}Z1-j3To*2J!gcH;cU#?IUD-Y*m2oiT)U`1-JJqSx%PA`Aa=(A=mV($o2iv6S zD`UFQ$95i(3}&VD(hKA3@K%b5mgMrjm?%bWIf>Jg!hc}V-r7WxXgqq)Wh&;vHQ#qM ztC0ZYVh6s8QZl_xhX}gRKi(D+g@~hU7x3*8O1`|g9ATXqWN~R;A}W(zZcGIkRc8<# z?5qv(*i<<<_-LRoa7cSRW$qGTa5wSZ_~R-ND_1g~siFzA(xY3TL4BvR=8VhWtY%kq z+_C%)|NTaK~H=shqQfJFPP!LSs8YejNYEqL?8`n}RO&4j#ulF3y)OTnq{E=Feu zeZkT3&#Z}O>?EFUfsID+zckfKFY%iziV9HeR&7~ZgS_p$<6jycLF-na*HB*j@_Iw+ zmD-1MOBv0?)Dn9}M`8HR zgAB}@t9-%<0LQr}>^i}P{$~81daYty__;LC5_`k2nq4bele+t3*t5?R55-FQvuJjW z`8n3-l6d>K>^L~aTShWr4+}D|z5ZD#14TvNxa3%2(KFZ8p@%B%P#lXpr@G#bHKt0W zj82~ODW9Vi=ewhJ@|-;4Xk4KR1+Sqq#c9{#H`Y}^j5}lEV8LXzDtEfr-uij$EgP=$ z^~3f7H@-Hws*Z*+5QLWbgwoL9d2#`g3Dd$3w@0&^-dhSXAA>H_G#@LE$S;OOW=QR5 zjqi=+OokV&PK}*kAlG~wuZSQyuaRmyCIU-PbYa?Mc#$B3Nt5pSqInL#pdh6e+B1jQ zI)NMPk$`OI)cy~MiJ1EvgwT516gTDggg%zd%J@KM_Cu_E9_PKIm5yF0gF0 
zvul!3v@pMNOyHo_@rQP%r#iMX5{Xd&6KjH6hBM^*7XyQiOjM0I9j%H}+Pkg!?K04r zCz`SWdx+jF%2p`@$B6TAuruC1Jz&FUzI1gYHBu1`i<4GLLb_CMb8!9 zEeV+O*zvKR;KGsZJ-*c9q8@6hgrm|k^I)*$I_*wy1U}@tH_{R$J3bUHaU-I?s_&#| z9Gpgv<~_ZWa9EC3V8++i&zizSuu39XU{)b*!?tE|rEpa}-tHHwCiDZ7{F>75U(`sm z+-2|E6WMg^^2FDeR1a#Oxt%}Xh5wr4sa@OUB;2`4_&hd0SmpAA(yPTAz43hRdsD^D z+JoyMHuWbuG}XO7Uvd@xF#fvnyVWil)bH*IjrCSH76iPymmKvtGwt%hn#IV$_lpN> z*AuX$|ImN*<}aB=x#Sm3Gu9)}6-6qrf1Cx_iu)to@LfiJp=Y*xg(J-VoU8kthsFh& z=R|y@XDUv80zxu?8hyV>W?~Tr8S8lpKOd_CITHT#-l@R=rczSz*3a3pPMibVA2Rxv zoaV7~p~w!btilUkUM+jbCXd}PtkpNU1WYwZU^I?yK4vNkRPS2~ygOlMD%#$#B&K=% z?&9&xb98slZD+Cm(ZDr{n*+G(~^xigF>gc4WXJ{ndu5;(z!b@FpL3&y#X<9E-)A>q+RkG8A}4%-03I{x;CE&44m|5a!{c}>_dYt)#;y2#C8Zi9Z^Av4hP-k469(miCk$;q$Y%DqHO~n zBi0CdQ2%Iv2ianS?cL|y0zJCIHB>;Lv@Q;qL1N?yu4I;dB!vwOBP5uENsF%W&;={d z7mxRzBHgS5H7!tmcu>)f+LHU_^HzsbP8pIUGv&dvjzGD<-g{`0h%753Tc)vfh9$LG>c-1sGsSP=9 zhJmzjA19__t?v8B<{k`hdY%l=#XmabM^tdo@_RY^z%IfzRW>=U;q~JrP)p(kPka5l zvL+2GSLlCMu3$QzD7?-3{YGbfOmLVdfdzZIvK@?${+w{xXH0h1+h$7&nIJz!PUl)S zRVj5kSbaiIEAb*u>~~)fBG>FJ#%ZoAG;BwrWLUjgPcC&XI{1De+>I*IP)Xk%#D~0E z7CwLVNz_>HEHtP>BwM8Q7Cd zp#mdcVP8j{gwcXKW2)XD$KhnIEt;*ozn-8i7VC(_^zutRsB-XSEm3#8{p@P`b%aV$ zY;FMfU%tFOM_TTa&SJNt@b#d`he)6fd;biYavVGrgb%sJN`O+B)|!ZMOKyKuX@3D` zslZ}EZP*Yr1=~Q$&U&+#axX#()O(aZQ3y{y0^Z~vOQ0TU)&EAs@ssj=`_t)PDI z7)qBNq&)piM(6#2_(!=!Hyz;dzLv?+Ps|&Y5>){Vq1zbWn&q73r z!W~*MExn?ijVFe~STRk9h*8a%bSXRC2K`F^i@i4ihq`_Hho@|nWujy^l{G>LDGUiA zOOiFFqHG~W#F!D;w-Ab$kdT;UEZN2`Bs)dM7P8M+#$aZ8ukQPI|9w8`2_qxvg{G411<1O;s>~n!RofQ(Dw+?*lmF@dV&@+yo&}+er zFf{YmY^exylLCj53n~69bUvzarcfE~{lMNFn7cL$M_xldZsWBls zHEAdfcn1_+(FRNgAP@(84~Ou?)0W4w^<`%`(bYAz%ZvJ2>oGuPY=bFMqp zr{sQgp~^mQGM!yxOYfn_LEhfJLrl$#Yws_tg;X$WuOC=}yLHFcsQ^^paa%u$W??mw zaxHQ9f{~tW^rfEDAx_66`$QHF5l-*p9^|-L;;CewBBgQ+>(PsLgh8C`S%zxPRvG66 z=20F?4u}nuRnyMzOvp)U46AkD&Dz%y`s9vU@%}d&r)c3Qex#OE)lhhWt7-X*(}mA& zlbEw_3@isLNZISnEmt42nk;6FIde3a33AWhG7jlOiY5nD_AVi<2s}g@HWWv5$tvwH z%F8s2j@?Z{ex$_nF2Bg1D+YFUup0$!-v@WH@=zM*btRk(s9sFkUsq8cF(UP8;p!#% 
ztm_xMyAu^3J!%Pnh0}SQH4c&3(#kP;MRi#+F=yRAi9Qe~2yVLsU);X;u+`+AE4B$8 z0%)(J@sXQm7BXqt|1>y92q@Tl3nq{*|y<$2F|v5B^n!xVe!ipPOq)!BDQtzBNEI!U%DkMPZ6w#``ReIsC*9{B zddvmSbVmUP09&=x6f>pnG;=Pa-e5THjX>Uzd(+wd~8O3UDSlh(m!KxJguuSsN=-$ zC~*Mb9Ur1*cc?o+h-xnsuhUm#uk8FCmqH+IvfKz|y!1GuD>-ID?j5S3!s`pFgI}<( zs*O#~Z)M5vDi2qpk+;TvS`uV!< z9Hfe%im<{^9gM0Ow_~JeHn|R4=Uo1% zOIV-Fpy>8{<*<+E0|omQ5%0Pi$2ln}>*lq@Pf7ZLepOG)ovzmMJ6Cg^#Jd_EJ5?SR zxWL-%6=_N>o|8KLHE3c|6ST9*38B z?k<+ykne{vA8>cSdG~6{K3Q4ODCpDVPX@0}NS!*9N2wrdP|(u?>>Pw?y@5OUNr5t3 zc8x%ZYxut6ta4LGXkJ;Gs$=}4wR5LdqaY-B92|#kE<{6qkX%T9{YI479E!jC{brOf zap-vtq3uSE#!$_Ny64xeqAyJ7xywJB{MZo_msAhqt$9dlo3;#IT1IMvY(IE`pLll* z4|%BkRM+V(uCup~t`)jRgdLsmm=BijL;CE(km{6#e5-E@BSKBJ89{wD)$TUrKGsgH zjCa>s*b0OO9PZz6vSbKw5Qy_P*-zymc0cm5m0p%GG(1E3oV9axp#~OcXPx#pP6#NpFG|0)ddo22Om0T`=&KiR z1QliPD!4Tx&yg(k2holu(I|$#BC1y27mE0TNY2ROS&ppiG1+vr-u8nB?VAo)b5fgG zZl8Ly2V1HH4fof;PLIho4=bY{lx2^aINiG2uOT!2BI_yRSgFiR-Y3hukRVM_K|DW| zg@|dkb6xlevFc-*8QhHM_Ymk#kyomGa*;FHiXq%ot5vU<9rAE^Y49gRLX3on0w@nh(ZW!NF)48jY@~gGUP0yjY>nzDy&nwO?FRv@rA{XWEP&y=DGvI;s@HsQhcIb1ai?HQ&-k_NZ z2L`xgJxi+kaPtScBt?3XfFT{9?HHPCh<&qgb|6jZY1NDVx1=|xueW`d>o9mHf9fca z0XgU}fms#+C^mkD4OPNr1y%$DAKLwr?Bvd;atL{bqIxMYV#;lP_k|;Ew3M7?=$u(U#qH)vWZ? 
zdp(~iA>FC=?@>x|s`h>QBLw~At7v~xYDd^u<@j?Qvt^0jpMxNNs|}d#JtM$Qlc(l0 z4B@&f@dkay$?1QDF!ulN?(*l7{{R2H4))Sy7m{SILsLMGB9j_LnwZ|kj|o>AKVszc z)_IG0WM$7QdxPUod;}7Y=Pm5#js?Zrk7$@S>gQn?MFbBbJW2~!KHir~xDvV_M@C|) zaVKtDS=Cva$I>ou3~4-dm`5n0o5n%n!uXC*F`TG}21d6tDEP+HyYU<6uAwL5eu9FK zm?6vp8SDs6kQ`r+*k;|m!UAv%9JX@iUj$0F_0`?}Uw_dHW`d?+&_5mk@Ph_`BnkRl z=FsfB;6H@X6xN@Rqx69WfOP(a_G8BhW+1QU7}r@}xG zml~M{T+J86xo*dA2Al>FwTB>5CPxkV7cmn3R`p||wx6OK5O^md{QIAlj=fG{+12%f zlhcTI2uF#=uXoCmJs*%2E)BgOO-{ZllzQh)(Uzn33kQtrY|E`P-%hK)KK?XIctu98 zh5vHfjfKv{rvn(r{&O9Y*_9kGv!i}VAyAQHJw0&lU*3TIB@3G=fCZ%K;npi z%UH1wJiMa+PTy+#Oj@fpup*O$JLzvpTjxVHhOP)Qj*Fi-x8eJK1N7IZP+M|q&h65f zpD)n1qE(|^JOvqOkrU8R0z;U|4u|X3*=ADw4Udv6&w`eFBV_r!rQF;D9#^9yjJL0-9Rm z&dSZYef&36z)b*T=SJ6e#BZ+itSfzo{l_}7e}JE|WLrL^e}R{IMDGe9`BGbL1~XQ& zTr0e+^*miQA37#^31&hXOS;Sa%aFE!rOhzmZ``Vv5I$Zi5r?VH43b730ULFZdW;fY zFxD`oGpGg?ln)P^T9c@bo0wxl-;L=ASd$-4+hG7jW^F_)Ig`XblnNTs+PICxzJNQY zjJbQrj5&N(*>ln*;e``ls}5T2>J|(2;4yt#>85zYbW^J8nfcu>&J1LOS#wDido&k}Zc(IL(>$9IY$J>O>P>bo6D~BZL1;28kLw!| z6V&6Eyb61g<^Dh-$8grASLZO>-$|ms2A%m2sUb;i2-3p1K|C3--vkHXD1BG`6Y?WM zlA-VSVJqHaSM2AIqE7UI=1M|o-AB0Ki^|E7k=f)p*kCMZb7=CP6Y6O-8wF+2F2~+R5D4a zxG3)nLM!jn$u(=yu}~ zvTIA>cMoW^4D)mM5FVM5;7w$)IO?(dk&jm$06qI~Pk`=UIB^__CmN}e>xWXeZ5gqs ztP+mS%_VTWZXMwpd1&jhcGKka${ppsTN2!3eud;jVtixaGE5(s*f=E07!YvG~R#Ak5c?G|D9`j#%CIag0NLY!rl>GC0)rxne@$ZgOGt zGo)NhI1bI{>>pQ>FH&3f_Rxw;BNxo2bw8tE%v?wd#9xVfVjP#cTrr6WP2t%w4xdur zyV#ndl2X3<`m>Euw*GW%Lj!}K<$U!~ka60*tVBZXP9Pd$uQaE5q`MWm9=T`R;OTb% z-Vm`1ZYQp7?e1*Li&HUyyg&*noVVXQg44aQ=s1GAmNQ(MIe7M}{z$0+ZzzW?V zM|{Za8Y-FQ-3$%GL?V3*fENUC;>G6UPhcT1N7N{(@F zSpcg*VW*#*lCC2zbd>$BxySXKI#834t8?!xr~pN~n+@d{3@F zHVpW1VBf?|KaSQ55HwDbVo4fpgbBE>iV;BzU7DBoJTg7YW6m4;qeD|dQJ9oP7|b@v*{I6qA}r7Cpp%M#aQ z?>B+S;F1f?j>!B^R{00?LBeO0e?<^o2m$z74zc1fq4j2wY{nb?04Gr^CM&fk&AXAJ z?q4&{e0FB&`GFN6XW+liA_yL>59S1X(WNhTO<8<&9uhw6np4$?tMQ|aTbU>O zZVXRb+HHnjuK@nE4AQHSc0yVDb$T=ceNus>cPtxTTP7gwwrb%BIIKCr1?w zNHIGrUq5KJH*ru+SxPTolL=pzm@qP#d`edcX*qi-legkys_DACprfGs(b&=25{&B% 
ziV-!Sj2G5W@#{D2S;f`bHtq8JBR~OTZSpeBR(81?cSvA6yywYgdW_sV~m ziTP?aONu^G2uRWtn)!>auY7a>)Slo3Vog|U^ag|dIieF?^8tSI6IzNa$Bq72nhy-C zxA>hsSRslc(9h2yc4pzp5dWd)TRihN`hBI|zuQ|+_W)4e|Gcm6T{LdC)EBJ%(ay|F z>|jgsfQD}S%O`EeZ%>`N+o8qB7@t+VRx?d5cgL5`MZlLwe?kmyg1UwDX;?J+hYN{^ z&h{Eq#=cPoS@>WSMazhAS~E~$rwTLPl~xY<>~ohnmN+##(|lQvf5nCO1%#zQX$(}q ziBq+`5+=W@&ZT!Efi&!ZG{F1(`7Oh5rS*Hbx85ddSfoq#SylSJsDZT}nsPpnqE0ZbFi2t3gV4p8~^sIud{ z=$^qTIF-)?VAWf6Qx)Wz<9Ch#GliiDPnnURD!_jI4JgCl4*rHXfVKq7q;ML${V$RF z6exICVgDEv5eX;WQURmd!iXQyT<{;Zo>ca7VE#QtHyxv11lEf?o~R5)iZf+~Q)Ros zjfba9sC-7?#$ZRVmbSt+lqmn8(A5&)XbAr`eK5jr6wv#tsVx1P3QF|XR6t$q|IFmq z0Q4>dK*`mIPopQ?D}F)*bAO=+07XJ!^dE1$WJ{(DPEG$(k^UIP1QA1*fdiEXXo|m7 zo~v*1K*hN++yqpUN9E(xdztw6mcLXUaPQw$9*85h)dYB6ditm2e%A^t>rl-o`N0}n zkh`x!eqB)Dig4PMOD0#RrgwH{WK>7C-c^jNi?i$b{ib+5hGvBkr{tqwFAFXo!dDWWNVabJ{pg@~AkN-}li=)5X z^PIO=v2O=Xq62VJI^qy~NmXS7ivsSUV*sn7yG1U7pcdUs5bCMX5_+9+hbWtSgS)S> z)q*iU!MN|PV!Q!8EzbGK1s$jqy9;XfStTCJyLaptI+3q$GfDb^=Z~K)yv2x{4lE#}%;wsQbTBlg~0u8`3(i>Mc~JYA2A2uKb}+JK>zUX2S~5I zKz)k@WucLnCa^0dir|zn7>T<8-E$78bE(@bJE@j*6t4KfBd z^L|1Ea_D@ACiQ5C;Nd_QHnBmIs^C-7X29JIT;Gr6)}WiRSSk(SMPWb`jT6s@fRYdN zTlAMfpP!IdL&+QAzo>!P5l4ZA|2~`H%MDF<`{`^Q(6@pUP(vfB4RXvbnJh%LNQoABM9UKHx+=Y`f?;dtY0mrR52Kmb80G_w z<&Beh_HT08$2ZXl8Lqv#EUI_nGw^r19CT>dpAe7Qw)aH8c|mK;gQ)j3L>FxoCVfJ+eIVdnc9c})fpyPO)&{T*GU%Ok=lX+J6L1e# zjw4p}4{U#gujaS-8)>5*j^sHg;TD7iB1TRe{t4lF*8oc>%#+qX%{8dFDEBoUSPZhT zubuTMsqtBOL=D^_|Kt4^YUMvxpp`eZwz$VQc!xXTL|r^r=NnW5O0X9+D3_!1Z3amCw)Iwz#}WlF*Xmqs-yXdIZ!<=eVA^G|PZvgbc@FAJ`<&S6mf zh&4koU_L&>KgS;ZZ{|>1)u|Ghqe&`F@8^Y&_APC?E8pWcbjxo}C=ud-df8w+!|4Iq zKOuop8sktZIh?+Gc+U))*obGr|9C+`0CchzZ#+BR3C(~M9FV9c;zMzP2c{;~KHS1p zaF$}P9h$d8kAicX*#mlx-;A#S2S7awbZu*aTwg%5pRRSAGESN*%(^e0mu+*+`fH&9 zGUD+JYu)F}+ad=9@-1#qb*l^>k2k_+aEPrhFAna8N*L-0ATqxxKeCR!^ zTW^xLz7;~z%;;gAU)0M9$k!^Rd85H(IQ0J7O?;wG6(2N$U(KGUk@S{2k0v$ld^O`m zyWx)l_>O(RFOd&ow$fc_w{eCA_#yOds1OBxvDA%bXw_TnE>o1;$T!v#^fD3hIUvD4 zSqFklD4@6Qsy#hFf#8gq3@@E z*1aeztEM&K8MBp^8V;IO<%kL&C4p9Gu?wF|UX_!Ni94)sYqLwn$nwM_#g~G15ixh_ 
zj!-XwfIewz?C1{-=_G@Gr+!BwA$*Jr_IzGj8GP`K%A+*)$!)deuugIPX1l9C>0jYYNWaD13Rg!$U@3?tzM zKelX+EE2^ZR-;UH8iC9GluNNQBzND9ARm=T}TCoOdWlj!sV>+x z6qFPxUP{8c|B9o{o>oJXJTWCQZuR)uv>) z-(kmutLeD7cshoS-AEjwt7JR~RTax#N10vZc1E-_BB`_*Lmwub?|)Wb*}r?C_Y+dSr@c%9I{m8eK}7hq~()8Ht{B6X(qAb z!qUS}p{SLI4O`e;K>o;|kXz`l2(w?P1?oI4_Atr^@uCU1^2IXgan8~1$w;St^K-Vk zVv4X+^5=xb8g4VbaHKKqiuoP7MseFcGhd~V=R_8A{W7MKAYI6KJ5Bdevgq|-C`+w> z+2K7TSR*0?z?CCSNOO~`k!nU+emg?i=lNqaj;3B2xHGI1WqtB{tZ{}LTWmQq64ET< zEvBF^6#fW+eywu)IZZefsPc7`&Ru}=DA4as9+Xu&7cJ&?>xr&oq{rk#)JOETdN;ItLw)H_F&=e`)b}bM21!y*y>z7_nKiN}r zE}!zw&z^p}rvLO@sHa$zRg`D+=fpGZS7O3^CWhaE4YVH#ilC#4>e&6#%}H{Pkk4xj zNlE7iHfUT6D}3zB6{9Cp)R$C_fB14Q@93%4i`#``C?*&C<3P(>=AME^oW+DF#qXI< zbS63v4cWX+8TS|va#T3oCsvb~{f1b$+h}b2Xnd%~i@2yqNxtFYQfy5tA2%70DKqq5 zw};El5_D2{I!;Nr<)2X#ydHd}XaPdR2>MAkDNSjc7R+B41- z2Rpqa1A^90rdwUgy5u@{R~2&q8Y~FO0`jE}`oSmhO{4g*o$xR*if^Es`gnwR+90ko zwEUxs4BI{2(t6;OXsrB)_a^AK`7fa=wAcIubU5bvfQAsz^0;ud2A7!#&tDf68gWg$ z?_g>!kz-?#ntb|1&tMzdJYI@#X1AP}%@93n5`Uuy37={n^p0n5DsJJ4Jc%0@Jr?Pt zaN7Q+fnB1ayue+xZj%bhm(V1n#O@iw=nvsT1Kyq-l(oKvxi)ed{axnC!RyrnDUuT@ z;%Vm-&+ge`9(W+3x><~vtXN8arGv+si|onqJvCa&_K;y5P>y9dn`F7PMXG2K7VWPQDx$15(G`!Z)$j|ZG=g3}m66Iq z?OrD@_P(F^uKxu6c(WKvjweE+QPLpg3c)D=zH|8Poh9!sZKLX*8pWcKuZt-%Y9*)5 z>TX*J9*zjV`fQnS$~JAM51cU126XAwtq>o4xKY-RvzwXmdCnulBmI^GB2~3znX7J6 zpC%(@yIw4dX{{2f<-UHRG)-*PuQF0B78_KCdHCTWLvGJ4c9`}D9=u>H=tx~*Jvgx> zz+s}aVs6SJGJz3TBpe@ds2He`@XZ?4f34*=tU%CF$OZb!+p>|GmH6tw{t-|m{jZjN zzty!%z67|Kf9tCc;Q8BH@V@~0O8du%zclK<{a-Uot^p2ud<6b?2R+sRt4Mn{>bW#G zB?2&9#&@o=Ly#4_ikcbRF@>nFqny(?llZvNkBGx^_X zP5GbxX3yFj^{^1xFR4l6^$41_U%CakZ+Ej3y|W6VhuQ;oJ=5XTPLfLKPsrl$AFJ(= zU4Af3uR9Km<9vnB2^0$b#uRmi^;IO?#9i@&bs?+hOUk z{@da71BcPm%5}e6xCj$;!yfea{C|Cc9@LHh4m|M{$3P1uOzI4N4a6BMFbk-x%R1at zI_l-$AqE`2taS!-r=X%J^FRasS~@|mo%TFz7Z|V8Ba=3DQ-0Q9j7zeObaCbCVc(O| z6VTu)xd;sr%J4Nx_4E%UR={sUGORsPdU8zO*nLT)`x2WXQAPE;BbN_Kg9`9L!|1f! 
z$I6McoCzR41GQ3ySx))oT{GtQ)C(RLyHYAIagu|@3*q&20##_3i59F_v*YRZUDNBD z_R=8aoka1{Tu(SuD-?4hh(@$;XqN$WD~x zU54j!IFn^;G>m0}DedG#f(Q|(JsMv&n!j>t;cQT0Qn%m(?jb)Pxre@A>)N+$-KtOT z)ZSvc+BSwp7wToB8^o1BnGi~X-ebRCf;ErwdbByAV5KTKyOeXe>f7A7Mj5$e5%B_a zi0qE3f=1vca_%h%3k|ya$goW51+j|0s~E2>slHP3bfWO&A!SgYB&3S%ZJn*Njc~c= zV2gHCe<9%<2Bm5=)Zy0An4ZSDZ zs8yrr@!X$~1I)2b7|JG4NbrVqZkM;c2qbt#u3VJRIwDqFTk? zV@ z$3%t0lL;5kOg(ihLZ#sioxfO*uOI^723+o0S_cVh<@l}TUo$y(2}JI4D`>k2A{K(6 zNQF13DxF(oBq?W!ZRS4qe`QgSZI9bYF5F1#Wwb3e^*t3fI?mni3aiL#+=V};3z1%F z97{6mMcVNGAkCL65(oJQQdDoJhi zAAJh*C*u|^Ql#5c4qk9wvmQOi#F!KV{%vNt1$b9X2cR^^-eN#`D1J8HCHpr#)ivm7 zPcjX3G@azv1>QIR)$e9|&|{gZ^>c(EbyO6^VS=M@a`h4F6iFiTG(7Y@^;k#)okL{@ zAC`E|&xs^`c$~XYEUDA&cxnf%DN@GEZ( z!&}OY)1wi|`~oLrvNe3<(mi&xjs7-lTT`PvyH0=-yL$m?MoHb3cioALB}vyG z(Gcn1bXPc67yxHk@PCk%_1sC5oNBS=dft>8|LiH@C4JdWzresUElZ)yECQy*1p zQvLol;^&yE*R{FU6dsGg*R_-Bq9bDox#uT&PnObNI!W-J=Ve$}kglJg^pjYpJljsx zoSS@D1AF=70YTvXuw&K0Ms=+2Oy)Ft<*LJnN1D7WD3kNbFTTTw#>-dQs(=Gi5=(tJ zl_AlhV#!mnx6jQ^cIva0gSFj*j4B|@Gc;6EzTGR{Unx=7%r+@B11 z?hO)~?>2eTp0hNKIbGwQ-j$Q0gjl|f&SA|VFhJqkngBFC7mFNaXKk~sEuOn50_KGeMh@Y%0 zns%75=+O|f=_)nu#VzKqdOw_gl*i`bSm|A=){Y`Or zu{9;h#I{yrZD?oc{COi!sdj;h31FGr)q;g0Hpqy)cI?6Ws>Fdu8ffvY!Qm*ZC}XvyAuf}!S#OC=IPOrLOj zmQP!Dz@c#WRZcfNE+u_(sHPWbIEmn9U0tePR&31NCUVs~xj7Lrf zs+T!`&ummFv3p*3B>-}Y`d#Ale#EoT?GHymz8xWok~$Gr=s>gp$sThNH9SP}@Zg9J*W%H4+)gT#s#N6^s~GW#;#Q4 zGY%!Vu9u2DKVo&6@BBf^K10FphgKvRhVrq=zy}7r_1sGNbx=xlD~uQf^r~|tYkcXb zexEnk0l4%e7nx7@1B(q);+(%|PPCA{Z>{riY4lfD(Q4_+^LXYuMtyF2iBlFns(7kr zs`|r+-Lv(jf`;Ps(VK+(i%}4=jFU|%k&vuT&hqheaB>aPFP0Uv$_~}DxP22M{C(ho z{qg=yA%lQ(0gHyw)#m!*!K%rxLJ?xDlB#>&-8H#p*)#%eMX*t@^!*?v)>)q^KoY}5 zE};aizJ2*ZXz-R*+ueVIYgFFjRsQXmq`_?0EyUMFx^eZCCG8OoPUBI61k7-+*XA{O zLg|Abc2y?=A}j}SweS`jcnG!#*QKaAkhQCe@!%g(9hrZ5*WqCX$W}a>e0Aj2tUAefS^6sK zMEf_o49=a4J>NlCxF|exEYZiBU$p+=-lW9B5BJ8Hb^QmbEQ;K*^W|IRD#}|azFu#8 zu4x@jJdqG~H91WC1mi(H<~lWiyS-W&Q%_^O}9eMTlcG#b+4z~Z7EkHc^lRKu= zDH2KxK+l^iEpX4^%hTOytWZi4^K08HJL>BfvmG=iv<2?q0VaEBd{fqswM2AQVf 
z@cAtmW!myb?tdBi?CYPd`Vd07$ zz8;$JD4dqLVZUa0PX^a@R%DlEzjG-<#z8Rg^VnUELAEcAX_WX0o~BwiefkI~H8#&k zdrBo+2r{}> z-e*f%Y^$2G`+;I0p45Lx0uYYr?81s2R(dgAfpXV$4gemL$il*C`3?U*eyKN$-i~0{ zKr=a?dmY!nR#DLh9bc&M{4}hPsqgFYPy!+qci0jG+IAM`_EKgl5`k|}XKUl^J3l9z z*Z2eD-JR5Ko$|7{M#7n!u?|s6`y!&tF6%)cXCMcj#q{bR`8L`V6oqX&q^NxgG1*WK#m*kM}HMpu>$77XA7kRzy`4Y9e~4R}P~-EZUcV?1}S> z^iTvB`a6iF3g^&|_za=HUj^Bc20&~qWZHQyQVLO*&f4}9UFhK;&3_+$0(c1O_POGP zd2~n*GL6e7X)3_=1@(DetX~_|*SjKorWkdlY+Aqhyz#w*%8FJM7^0D9RGnn_w9wLy zNBnC7Rh!}f1_DXxl*B*-?fjJ_nUn7s9-n*E5yb3oNzj|}8Kxbm6QvfRDVxVA>E7sB zFrOK4^4bc{h`-%cntS6$s@yKCT0_}bW7~yKyrQ?Gj^Sv<8oZP~!$9d>9$-mu$f?QQ zkJL-&{*HCE?7Vg_{pJ&okBj|j+iP%&7(O0=-h;dc0h3;;PLZGZmTlasCeX8j=Pre< z4DBp+2`1QSUQN^irRX6q9$>41}#VMhgw65yLy5ky--ACKLR>^<-2}v1rqaXP*0W^htxWW64|GpGpB4_!4aj=o_ zHA`SPnz8_W?>AhTx$&vk&wtv&xd9}NIZtZTuUeLNPrACMu>0-l#e_y2|TgkMH9WJ}VB_3MdcTq6AQPfEUA%u8jX4ZY@Bu~;{>YZHi-!%6*mFBOl7IB#{o{t5Yu+k78Fmw1MmgdlZtUHvPf zYP5W&ybcgtcanG7^B?UW+o#XXZQ1EzIkRAV-<*1Xm(|a0il5My?-$Z0Y+ROGoK)1p z{N|XQzgTRxWZl9d&`u8h$`VD9*$u=FG*Sngz&wW$%q>@hbuFuUq=3}=N zKa|JjCfzu?Id6}p6+)>SU@I8GM&1EekqOnihvg)m5qE}aG8Jpyco)-+TVJvOGb;X&}r`TyM7F$l@rP8q+~a ziUY2IAE1u9i}T%dw%PaODD8`G{Wq)FoXijNr-2VUzKP%IF;XV0O=#;25?1gN;^tES z?G}(AHM?^#1^Oms$KTSIhJ|j$)7G5N;zTIFt_&?c-CtN5{J(FRw&(j*$w9xlN*CsZ zPYz!4w=4RR;k2F;BReB*YE?XiQAE}Z=L4sQQ!dq>=-wLg_GFiX*XcuN!i?gpPMDd` z(Y{~Xwdftpf;FR8leGSE)f7a;p`8$HRBh6Wv$MHOZKIYKD|_-d_1~XcRDE~pimb-n zw<_YOd6~F8weBYTmz*XzwC1Wqk&r{n&Pwj<`BG={eb=455sfaVqmagzQZa07T&(^{ z_P9St^~aS3noI9O@%7f6n^2cMT(>LNl%8DO+hmerY5UN0jC$eyW;Cle zp4$(wGLq2Th-e*g>eI<5h7p%@_gHsKr_|Sjuk2a~{>~EMWs`IermZuA+(jG$(Q_Cp z)oRG6ns(VU2h(iB>UaP0?rFN|hcX2|9$%KZ^WA0DES@%FL~NtFNJHJ3L8_LcEF9<4 z8Q^A{y|!}x2{Y$}0|4aSEDqVVhpmDK4jhkv9C1zIha^3+V=rBlQn>5iZ$r|1tTY#4 z`)xr(sP0G-;sV}B95^%&Z=Cbxc;3d1nKJo$NRK3p$=XJVl3u44IiG6xozc5Ybcyr?TJ7{1HPx5%LsbZ}sLit@{Ev^+?`!^Hnxe** zeCxCHF#0dfIEW}uVm6Xf?=9Ginks$VoQN)HFT16MJI_J|*fBf&V?P$1GNmcxl_x_U|YaVfaGyWH6)%q8k z|6f`5?Wm%Ep67qDi~mbQ7t)HLTlilh0tYAjVn`61zn%rl-vuIT-I{ma%1jn%dZ>-+ 
zQzdp^*Q8E#Q-hY#i_hrhdlqf}(SfL|Ut@|Ut$)^(ao{hYvH#rw)V;6C^boQ9eiVIM z?hwokL6ZurHLJ-Z6VA|5YcBt0qgsL>rRVMmeawhMBLf|SChKB*7pSAOBn(~QK2Y>F zspt(BY@Y$-_qy2{GQNy#l7s;8_CagpsstI%Z=v^zP-zUR9n^!#k=|Pi=`#uq5 zvA<&|gzMtrJxhlz8>-OM!=2>f^yZPkmboS9v=O|1YnW5SwTJ|}h%qmF^i?axNI~@B zqpR+p9!0Zp4%MvzqfvSm&yD&*m*@e`skf+u6i)WIU;-zJr72dgWs4`CEziubI$$xt z`-0cemWIbO<0Cs#K($L7qb%Tuk-GGC9J8G zTeN`qvZFLJ+pW0tu4YuR;=!N<-|`DnrZ>{{`0I$#@;34SE!rC!Gm;tWIiaq+wcPFF z=JkfP^6l}mOsh=2O8I8x<%kbr5L2(uiq_-rpv#N-KzY$*iLSYv?Z-T)MbXb&V81a5 z=SH~dEeoHv`ux>n^8Ff5d)c>r`ur9O4#zZKLUVqlp8ftT!1e7o-=YG=w_&F|>p*M2 z)z>aB{z@^!<3}J9Au(~ymLLYyAdPoj zfRh*A7Z;6?2PA@{Jp=HgAYWSJR?m$dZ!}vvcYn{P+aJ$hQDi(gg0P+;S32nhb>Q3Ue-AbG4;Q!1<<}i zKtii&4nX)QBPr6GlF$Y_0rY*WK&#iDlAr}Iqc_}!Kf#Hn*ll&}pW@S=&Q2XG`gC=GtXG6;1ru zU?z4s@skMzvR`yCz8V4I@eK~EHGvXK%MQ5|H@zV_wL^vWk4ZBt(Ys^D-3E5KeUg5+ z^;Nmf&bgjoGAL=KsKG0W_$|u_OKqH%UmHFTmrR{qZT^b)#EzV!A5m(14q~Ij_3K7A zCTvv?hYsYi+}SBDpp)!MoU)7WzTK~paIG6~l*1H63!cNfI-c3i%Fge>87@|M31+M@ z;^jW_rdbZ85TfU0r-X{;_sZioIM5pxiMI1Eoq6~o{YA#;fnO)Pk@B1b{5X0O>L$(p zWA){I8_(F*;UtyoQRhmz4s(2eWLwrk?}G1#PuAgGOqQd6P*=e-*uTj+Q^?jZmEonz z(7U|+e*e;E&F!ZzbPS&48VwcV(1mi!<_;&?uY|4c;6IXK6mid`hC*lMs?=7Kh$BXx zB$hVfn-QxHRpRG_j2`QJwYTxji5uslg7;W-4@=#3W?+xHvgNkEbX&?Vt@!O0I`~Iw z)+VpNd@j?zXs{zx-+>*mr~O|cY66Hm9E-#g=*eb5YP;k#VDwqb(U2|Fu7@o3@NH=u zJz6$yxk!XQivE%M5}NzRf8FvbLJ~Z17yJw?6TH9{!fArIJwwP}Ss58fZpzry ziPNnU&kR|`|d*B zE0IG{beF}&Rmw5irzdp6S(~?Uh2p6XgtY2!gsSzhEVrL|DLAY|x<7RoaR9})m?@gZ zQWdZBK5?nJ2-k^olr_1OsW?(&(u5Jzko7Z}N^Q!3w}M!2G0ro>@c!w@+!4ExG3V?v zM{Yf?N$N=D9SeK9FZ};u@4dsC{I+#b6afK|E>fe?RGLx*DT$3HB8Un|iwaT&L|TxL zAV@D!5D*k1(v;poz=V!8=^&jX6s1T47$YS4o#|S;tiA5t&vVb-`<%b-KSsVM-%Oix zly{8x9TJB9hU!5j-)S4a*+Ggq9yK&@qBFkl_U)kVg?+cH`)x+-M@S8o^>s}(ubj_k zKQ|VSCq;^KynxHoG^vZ@y7wrny{#qhU7@GR_;{rn{gE0T@z#t_5qZhB&D$q2x9eRh zM@I{_&(>~zzL#0gW~c|cI;LQjG?hI`e{Oy{9qcfF zMfQqZS1y_2&zKwQ_`Y#k#_7{;?_H&R*=~bzBlW;`12GmaQd0#}V$QiW_gLQTG?GHa_v*uqR>I5m;L? 
zI<|zafF_!9*bOV$S`2Tv)KuUr@!MyJbH=#{=gWd&hbm9?WGx=r^;N3=JvUD8m*v!7 zZDJ_-??OynT;KY@6_`^1^W#V;hC-Nw8N3l1JSuHqc(K|+Jrx$hoX0OEmd;_zesgxsT5O_Q=a)mAInL)5+Nf{YyHYE!QnyTGPX3@S zzRHX~@KQb#mo@h2k=w@G)IuT>zGmtt_%d5J0nO&vTVA8TSJe9{nlU?@Y~MsOYAfE~`ToQqliSXI&c3g_#ZxT4 zsd~pt7j5KEW`HJe}BjKI!|Mt{P?G# zr|CjP>h7OgUhW)Q6q5!S!nozsqes5s7kJ;r_SaLOnQp{`>gg)$q|qP8J)%9u9tie| zJr^BQOM3g9vy)w)#qEYQj-N6YN7>)AO1dkR`W(fXgvrN8lKk)v3&_{#R|}2q!5#v zRYN}$bS#<%e>e;?=QzjmD$|YGWUl?V`C`U*6aaC@CJ0d*J|u=O0kg405w|A)0OqVA zuP*vWpOxj!YE_lZ>jt;t!)K&;pX{1Y0ZlDz$(;$=gQih#8Z)PBj+$+hsA|M+a1@sV z$)P1KK8{&Oct|`r&=Te49l+Wl@`Qc|BsJhNAuEeW93=Lnc4ZEH)~yDGCG(EZeT}R< zePC>8S2P=QNJ>JiARKwQU@i46@7a4uQ!g|9B)PWEkaYck_KB9!9E+Ek+An;PX7xmO zMpq4dmctt)6uLp?A&8mu?SN3%{$j~{G*N7ojGBpJp7rQ}nZOwek-0g<%}FIZNxliN zy5RGB(i7irquiL?@<6uYPfo)4i=|Qzm1_PsiX2=XxHUk>xepgZeT8#0i4-DLM|VPC z#K2bGBU1=cE5(BaG=95VS@x`Yx+wG%$4H;_>XL+#cU!?6xBy*D^<_tUlMiUmxoOA$UKNK?p zwuLur{0Z1)N1-ZuGqnw=AI1CgX|^WS<`XvYL&tg4l$VT1pAW68HJ`w#fIC5m&~g#V z5B_xppGF+^3|8fpOvJ}w1#5WutW9&*EOC6J50!NHsaqY~a@Q0|dvvAzx9d^9ViOm) zwE=rj*BfCx13U>tzt@$$3M)%!Zwg+3lSE( zn@cqrkjl?P175=k@s;~mMNy7nW)~@fOw02$x4&4NbiigDsmVlQ_aMwZ2-$c(5nc@m zUq>CJ>rpi0bv?ZcF9x~Ey~1g@s-MsH)^KS`UDmIQaCB`^P;g4hahut<+eVb97yFV& zn4becRkFZCo5iIaGbK&x+92$~=~rlGWR5N|-WEJNQaoe+Iv8xGOxJ-#YbY>mT%XO~ zt*psRowYbQc`eDf{)a!yql@3b#(_;)5jWfL)A9g4tMcx`@FRk|aB}73MW_n7Es`NZXZ$>CGPG>Ogt)$$)a$bEY@%vOA5!8-8bY@)OYqW82R7&BI5b^vKU zMx>~O<4)M%8m9E4lVnXYRoC{EZ!BG_I_2*st&_qe@5fbvpH+w1hb=Fu>*VD_Y*TO0 z&_oaM4NYV12VnH0G)3`={fvmfekygd(0J+k@28^iM$-oKuh?5y@Psl9wmOO~tV1-z z$k4bhEuo6%Ns}4uj?L1d$AW-> zpwFH=&6au4*M)?BI$!-v8_*t+dLBfEnH||B(dyzS{2&qt;MnU8ce>QZnw?&je~~hL zfrXTnzFS4&?etV&a$N*Mfp}Wod-|y=3VV?Y(VNG8*D~6lM{337wRQ4(Um>~eQOwzd z$=V`SW#VD>Ge#VlV9Rsj4j^3tQ$`NK>p*)-Sz{h?3H!Op=H^Kl;6}F!I_98TPv^_t z#Xo%7S7#j`y?soy^G81JE2WMelYwUhnF5NY2GSe6G?dZE6lD%UDy^R4L}|X(qjYUcWjTa#Pos>EUQZ6-62Pr#+-(i0Xb{9>~ z5aR^{MTlp3w!c^$=Y{O&kvg<{?`_=YPLCI4-1PRLT-T~G9-!*S$EEN|BPUqOU!(*? 
zOsB#=X30+Is_khY!IbRWx*9OvcfnK2B%G)E;)%+t+Gn++7n`|aLOI@+^_kJ+&DJ?` zQS)f>C3NJj`Dm%G0%@MxeX6!U>WbHAADp0F5J3GPp0W?;*{uBswBt!wsPl_vz>rCKzp zUDiAtrYmu_b-@`&V7$YzB1LIDfW^Ivd=j-_7zX%q5Efc^YesE}cY%k;`>r5`@~yrN zsch*PgGxpBp0c{7(*a>dZ+XG57KR+yGe!-i;t!)#5!*zsl(#fiJ93{IBSUP) zN9ydMz+X>N8AI0w8sI19G-7VIE=9{Vu~NBcD|-%LM;b$q(r;7^qm+O(=JrI2Q$!pF zOPbt#7p5gWZr~%iQ%!!jN;7%DsIu-{& zxIdG77AK%e=t*ZvATuq4Wev9Q3If{TI(UWXX0$P~A*nWp!0;uCH23xzpas7N#NFeK zQE#lbu??=PvACjoPvQ(bP5No4*&F%^8g}m%y2Z+m;X*WQ#l}lP7@1Ih+toT7?`OxM z2Wu(kiin>}@?C9qH%c=t4$8a`zki8!W3gOIZeTqoM+G!tH=t{&SZP(m5Q54<+p5Eb zQsW!O43F*Jg|u}IaDM(`VZ^Ez0VJt5trg8r(OW$o|1%9Vf7V300B>OU5YgK*|MR;KC=ll%yjwvwAM>+}fzKXvvQrYtZ@K%cA41l*H{uSPOZ`x`2 zqviL$^3uZf0aJVD<~l0dhXw{vFQy+`_fSGaUmWHWn5sWzHqHBmM&#n=y)W<__K)1v-jx!Vh8i2eLdi4gPCdXjfgPx@%ymYYP7MBYvH1Y@sh z^b?r*`**R-$1tnEp>Kb~)&UClua5`&h0HlV!>U*4f=b>^R>2@^3~NZhhSL2XU7OhT z$Y!=s&aUeBDKHB!4{YX$cFTlWmxDwj3NJPFt8*Ix>6}tp3B^fUm%0r86gV5u#M~Xe zw`F(bk$-xH1#+i4qye@Cb$QoBBBCR4GN@)Nlvz8Z z2C+>@B&XuEDr%Ji^G<2q8eOn^&cQlxExZ%u-2926Lt=)LHwmn?R)y%FuTKq$Tuba| z?=LbtK7HuS#+T&)Qama7>%{o5Pm%Cxvvj=-W;N;>W$}%8{8)x6?#vu5ocIf~0}n^B zzG)%Oi{===Up=YG%k%wozyN~wt^I+tsHkrXhwv^BS!V(F6r>L)2azI#;Iod|G}M@< zXU=MW;++}5`)xep?6uzht4s`hwLYkB5#X0~_kBQAb>7#(#=`kBH6VO_=f$2l{t`nP zzA@x_BO#bJ+r7_cCleVE_^NQ5sy_b_((w}wHhEZZI!a`+Ys!p>4qMlbJy~{WC+uHR z-oqyUu4rKS^M6?N_Bf?&AuS5;ijK^Pde(H)d7?o%U+_zUUFSj(yG8N&1AGc)fRqVL zW7G*Ga4~Psfyh2w4xq`KXcrZl9KIDh>CsitJ|{S=a(o>6WxQLXf7)~@$1z~2xx0D! 
z6E&8lg@(zVq`hty>%_iyD=Gducc)i~s_?6H@B{nVk>o+E&v;hoJ_p1!Y2s-adosm~ z_YmSpCnXR;A!62xbWOOvPIZZWc(Fe`T}JsUm!{a>T`n(G_R~SPe+;@K;B5n66+of7 zX7kWIFF%50g=sl6V}n5T-WbhEnlP@*n3=9V7$cB&Lv)|br61=ouT$A3(B-fQQq7R# zu>Yng!lydn?9LtM;Z0(!{q>F|H=DTt*H;@e8<7~tcyjBfzQmK#u~Ma*yUfzoQ~GnT ziJwjg=VV}qnB*gw3ma-}zj({sGoUeq;|sIe!&ES%HnDRgVcRg`=*LGDkuMjY3V^tB z7E_EA!BHkQlok|%l#oWdylvt~f=gU(o<58}f1zA%sq1z5j{uzuGf$jxG1DE&r{BEE z>zk|dUSYUP!gyaSy(5uyv$VG6+6Ynf6B~gwx%QX#HUO)Ac2quj>G$hkuj#CvfqkgY zP(fNCnjoI3SK}xjwmiBr(lq_Nxi(v&7y1{=OE2-v+MVVuo1e&`>l(7c4;X&*(BE)s z6eE7(#spLJE6tlJ;1PTtDmO=Kv>tt^av)IqO7i!H7iu}AeSkfD($eZ?y#l+%#hrub z`@G!vw#?v*qZ-~q?h|uzO|i_qc9Fo({(<#TQdf#j7*u3qSY54Kw;mt99)hjU?J~=g zC2y7B2_?Vo2wQlWYL0*+<{eTNA0vjYLTDoeX(*Yr&z)|f#pLPdAi`E`J{f22&S@r)Iq9gY}NBCm1))(WF56q77-Ieg_O!jS}W_xdf!u2Xo?d93cV(J>7z`?{7hWMM4 zQ-^}Zy)Dc%HlCqwdFG7+ZOES=BnoFfaW$Mv_RjupSsPFhUGk7oIrqZHXi-yRvB)|{_hVgJ}8h!VJvrVp{q3Mabhcsh20=EyI8QxHve%cBJhbV?l|3r$Fg z)(txad1`!ZD&K6lBkVLQetKEgk>im%mJ=$Ed_*pZ*o><&v9~oRZ%jm})I?#d#&>!b zenr3N=J=XZedWcy-*3)5$Fo82K4G|&BF_!#8VBdiz`q~S^!?U|Yu;)r|>aM0o1@YVha}g zBrUeAW&=8Yp>AlZHl-kaFpXgcV=Ees(&K1^JU~ z(usnS+aU*j_WO42&d18pQr;dJEA}ec$f*qTxi5dGWK7!1VCMXx^~2X$zMV#%piHzY zfeB9>$iO1{h4GpXNVhT`!J9*Qb+x7%3#mFuiBBB`ia*~sd)+sM@_y-pDZER&LzuQT zJ$!V>UeA%-=WzXH*zvdnTl7yAf3f&I{Ed1!4SzN#RypJsa!!x;dSX}GFNL7|=JJHp znI?Q(;%vL$MD|Lk;|GF+ct&|@V& zW1g!dmx5;$WzKo!3F$b8v>m#x)h&5l?ShmCuzj86+XLbF`#@_%aqh%K4Jp`Vxyap& zTLatI+c-9w z9^JHLjfXIMss<)t%WY~I5Gwfo3i~-D?K55P0nkZv*VQLSQDcU6C1Du<0f-yw*x(P1 z^NH29{XzF+%mj*~sgx`4MCRH&PU(cU`Il(|-}}@K(J)PoC;QSY&o5R6*Nx{@)FxY2Ki$Y~bn2_PvUub|2q&xbR4CMz)=RNmZfiwzA^|t-Jc&FIr=;xg z#h&#><;ahwBE(qxneG4#-}6-GM>nvGD77SUa9|zAU<50gRXYmwNabgjy4&QH=J+~l zWG>OiUtAHB94=H>c*H4s`t=if6Xl1^%6c;3ePkVN9P(}wB>dhf+R* z?O_QNC6E^jZ0Snl$4v!V=jKK`9>wL~88~5oB)U3!Q(EHqB~8u#&#Y=hkxp0wL>MJY zOF;^H7lImGi<@k*Wy4&$OF?7z{vI2u0+>B@!dNsRIW$NPSC1ulM|pYruLY@l>Jcbc zvI0sYk0)HDSI7Bb=b$zxwkP%PBJe;UJP5?D0&csV6$D8U^D*_8MjOgb;lG|2hBO~s zW&b1b9LLDaZNO6VXrCu~#T+#7QSocyUi#>A+ybx9n~4TXhjapu%9V*4C5!Rasd@7; 
zdUmjn*nN{10xvN^n>EI)gO>kwrpJGrdH;{&UW&dp-TsT^#FrdE$*2SoK6_K(|F^!O zpt_AOr1cj|MkWae z@^&*d_7wIWVQ6ibSU6Q-6~1dKi{hAAmlD1EXp9W_Y3`;P4Q2)DWL^8{qU+Vd=QRop zy$VV|wJ_*UX8i7776B7X5On8%|6Ah+q;j6lzj=#M0A>9K)vBX(L}H{Gs#GVt^L6|A z5`&H}wa2~r9rV4^#~~E95UNdpY*|!;nH$ES+sfm6k&B&*LGg!M<`!{orPqrL2e;go zet$AL{d~E~%HbDXa1kiBd8@?##jeWwwWc6r!=4be`zgRwk@hj+N z-v8d(f6_=6jb_+lo*SAbhi2*+?zXg|fgfmwEpSkseHoi{ZYC$0HwJ0)9?{h<(ok9y zg?DpGbXkQ)u5k!y-?RNFMzmCT-Esw6mkacPVi^zV|yQG zw`Lh!_@e5w`eZ@{9(XD5{BXimIP2B9Oy1WYL=AG{f2}@5h!78dB=@#IUjy9!8Xl2b zb?0MOdjcYF{}i=Q<$Z+ETl+CE>C5C!DMp~zci|nT@_X(R&zMsqASE1p(#yf${lh$) zC1hNK2&vqp3*d=S8z((C^U6WcbGgm;9=^K9byCVnj}RV}oOn^k959;kh9GPmpviHx z{^^nhiYVXlBcK!gny#4vm~xZgR0CDHK2 z<6DUH<+Nq@t9hEeyn5K)vW4J+P-I{M@+vt1Tjr)}XPm!U!MQOPd`BtQNslv8-wG1_ z>$@E|z*!69dI!Y^}8o?+p zp*#Rm-elze+_i^;vWX!*QzRU{_D2}e?Gq|(@XX&UxBpaX{Tn~}|LQeXA~-y!0m%yW z=l$wn8Hcrjx||F1G&g+rcO2;568?cD-};N?x5S@%okJjOSHhow=H{iFf#yY`3ge(T z$Eqv{TX~>kU+6|OD*`@VO6;{h#UQR=EyHOB$nJ5U4xdcZEjrwH#?aATggkiu=sUre zVs9ifR%2c|%J$0oq5S?JPM5`PhkOd@r^caO@+ypo!X5`{W}l`+4efeviN~xj4!mkT+^=9J)GG-(!T|I4(Vh&jv#zS5VS)1@cNKgvaz9Ek}Rk z=d7}0YF*RQ2j;=5D=GITBM+{7?C4_J?E?5uBk*&Val>#Z9@g|5*%k5VXPLItI-a+& z;4>Y#x3D8%p@H@e8Wyw|L8AQhg4fZLXcy*CdF)#EEVX}Hw?oIb#y7FiFc0+ZA>8s25CZED?8Zx-mrjHKc!|p)4L)Y6F$(KF!1Hp?^%fo zaTH}h>WjxpeX{oM567FrQ7>Q0dU)Y_%qWI=FPG-6T3b_YeN3V6;s*&ZLJCS0J)}bh zW0gJ8wAg?CdlLrxf*IH@<6Q$@Ba^YdgDW%nZ8h@?|M7yR{R+70pKp`i zQ$(JHY+t8?E%oO%OT0GICKkM@xmuTr<9 zq(kmzJ`PAU5?v03UZVkT$H56|$kMNwf&=VRGj2+p{Gc>i+dX_O-lfko@5!&S_n^=< zKvPHyLqCMaDJ(El+L*_O{7)(6evH{{%1^xT>T}Q8mKA&+>NKDeRa4{{+zl&C77$>x zPfWncR|X>IPKj88}XNzwZ(-Fh5PSQu~Rx8C~Lp}TrZ zyt)?3V)M!KS^nR4x~JMNH0Etzl6WXNJu)x6<=xEVLP!A>5`1LALRSo6Ro{7rC%(#sXyO=B05I7jwXYEw249mWPz(C?Z4C zkOEB*9(oP%$Ebls*Uceg*Ncd+r_%OcKlkHvDVXnklniSr3q@ClBqsae#3|3Dbn7yh zJg2GfjUrD~+dXNCZZ5TJ!c}sew|XU>CU$JDGR;IV%{%s?+mFnbV(GIS$@@ z9G@4Eb(HZA$-ifT8iI4Hqj?ZX6CDkvE6@C8$4KS%ZY3n=Na1t84n?`!9eUHgXwY4e-R9l87U=n5lnl*bz;aDSe8S z{llI7JzI*^>AW21?G>6-`hTDp3z@S*Hg79zezb!B?comx1y35orC3qDKDr{$mpm23 
z(@p1J{KaB-tjOX26>7Nv+}1Rf8M2hKF#uFxw#G6}?yvj*FZ|#7K63Z`u-mQh;Xk@N zI1r?6Zz@5&(7%7B77zVDFNjLXfAS@Ky3CM*)2IL?y{NH>Wk)Y^fDp^BYpd)&#fl!M zjbS|L1vs+UKTgav=O}T*`k+#BGZi$i?Zxes$b7#b_@`0(PiyWHx{jBgj_-i%3D>AF ztZ|s3;2lFE+8GrZh2s!dZ17((l{@lyYJP)p1+k05Y z7*3;EEh-7!aJcjLilpZ3JD`B@63-7?NJ9(Y-C@%r#3RME6*PEjox|}4zp)xw*mwSP z-u?s9E6HCo-=p#H2dG{=`5JE|q>8U@RC)wWkfX@RXi`gy|0&)b!Uv>j?7aG+*yp1z z9kH@hAe54mw_Z{-%XBcrM0_S!yKB$XB?mS#IQDe`76tAhb$?SDbp?D2))cE@5V2e( zl-ZQYGC+<3`e<+dJ=S$)C1NL~nb$Eb_&<$V)pPi33fnNM5x6Ud?9D) z`hnijT5CleL4R%o8w`FH{YR-i;h*f!zbUo?yH)gLg&qm2AHln~ zuY$AipHCKi0-_cTKn$q-6NdlJ*y0UEZ&_Aj_B=sW?GOrm`7cos>(IY_SKOf5bD&fI zFm+L&WrF^sZuoy|GT-T} zilQk*uu1(VTz%x1FJp(QOx2zly?kseA9{Mp2QS5xaKnp01w0(Y$NV{l*K4BnKQ>t2 zKJ)QjdBXgu-?e0fp3Wc#ESEWn=4sL=galg;(@i}JCu#1ouW#%rK2~dv2n(-%FP}WE z@jW%o4ecgG$sFnn)0c5P79UFk-1A=m8ox|-AqKW#kAm9h0MS&l?Z(`s$LxnNnZ41=7DdubEmN0 z2~B!Ffymmxi_V|m700(L&lA2rpP$pn29bFW4Vkr{k*-KxlR7b$v9-07?r%bE*gJb? z_G%~ZjD!=$Zji&1I6rMmfIrps@h5z;?VQp}th;x*&DTR3x)yl%D97=DvIA0PmmSgz z-1*$8XYHo?B|DmWWD>34s2UYKzbL`)z)hAmAKEDsL?Y`c2gdnaAJN{v{;ra4IeBl7 zt#PC@Uz^Humyk)lH-N+({$cj~5# zT`JlZ#te$F?466HEpL;?8_8CN4dZ3>JMY&VcxizoQ9qqemsZV9H{PUnKlM4@oI=Q? zrhR$LqC0|OLBc828j95=S|E{EaK;W1w`U1I0@pfODapT}!iJk~0#mcKwiA)5E3qtl zCo1X2Bp$U^{yViFLcH%cM9c)7>vW!8F*|--8cu!9u^*=gdR8E!u4o<{KmEceGK5&- zdy4`FPgSa{dzBhe{ZcDu2gESPSPn#IUr7EirRIdmaL(0*P6N^G@9g?r8dSpn5=@x)!l>EXyz(jAIe_;9AI5{)EIH2g%LYB`3 zi02Ks60LwRAMK9?%p5)hP8vb&YMX*T{qnwZ{^oSBZ-BS8VhL395&P^Xt_7)-mB4v% zlpONzAX9_}H_0%0-S2Dosp(@XuMqaI!>Wa?-OB>?n5OLhWE>cnEx$XABZ#e15 zWe@RzT7#a-=8EOJ(NY(WUf8I3OsE(BdHcU5g7bd@9))kT{ly~ZO)+G4c!Dg(IJc~M zmqBLKqZj*`CjoX}@&Wx_J%t8=*CK8*H{o=XEtoDzg>-X=VK$AS$!yYdet`gj@%R%! 
zgw|qr6?fc!k1#pGVL8e8Gg88?487Ap#Kq}};pGQliw~kf`!#?k^FwNFncq$53R_1& zfnIUrUpp-hIsdE0{Wq6v_CMf~6&BI7)3A;bLqcZ(xwXrNf{%e7K|&^KOkR77A-XQ1 zt7g7GyTtQSzSf@BR=U2gAKv%we8Ay(u@(_sFxfC_v8C9yas- zU`Egr+}p2B(yU^OeR}_K;i_GvTc-DSJn0aip?l>{QxC z8himFh?J$lDGC>>%{UQe?I))MW;~1@q*7PsK5M4uw>*OO^RwNT9b4S$n)CFYOkN0V z!^c7dn|QqoX@=Cr8j4}0zk=tDbz~qBCS;EYzJjZ>Sd$(I^4kp=R@h&7H>j%V?%g*m zPkyZrQLt_Jsu^^RRlM#UG$S+H6)&WmSD$h(DEG17hK7X}>4Jl?WNX^C_b zm3_eDC3`b`dtX0GZ~z7vM#vyNh;rCdz8NUOpcIjp$c)6&DcQeR;&ynXW_P%Ca-Zv; z6zKO6s^b(~axk&Z*FW9-=0lg@-Z|7fPM7up2!&YCf{KvGX+>1=mO3xo)KXKfHcD5e$d!;im|Qm~wkBXgC#fhw zjhOMRu0bT)k0IdEFU0L3A4yR5BC+dO8C;Bb^&w=f%Ads*-C~JeoTEzvR9UU%FP7bq zsxkv|;6@Mu9Ssp}x<}v`1UIq2y2>4AEgZcM!2##I-J{Xi~tCz|CM6uZ&MsrjHYjDnU|?x*f{oQzl*z zUYb9E;z5iMR+?3!v6VmxX!a6Ya+~%W=wVvKy}YY_7`~}l#X@oO42@IXh0@*W&z^wV#!fdMqH_>?X@312T@9&G5>Sb;mCU{X0_2P4*)FvW z>!qwYtz1>P37yC%dn(hu#uSkaHqeNh1tA9}>KYSa2j(YjJUk!pybQ=dYML9W+E-W2 zwCBs-Idj8TM=0clmW2l}bd%FmI2jh_XT}D`u>K0ed+B%|aUFIWbhk0cCgO$W5tpWx z$4Cz{mFL#KCr!z`P>jy*WxFk9{ZJ_cv!{R%xlLZ#2;MVZo!EtD-+MshK^rEUeX6fU zyc>OoF|13-ICXU2iQK27DKhMbw9dahtY!hMK^sjBwr>r0dYB2J!zuC+bV!lAwqdNl zr}U*T;&hvmqpwoa)&TRl*e!Os=OvZ>Ck%QV)a8X&vEndKJSQD8M3Gfw)mA74Tsj8v3>M;3?-!yY0n$&8gq ze|wnqSPfiqfc^EkOc_m0tovw<=yCt(_b%J1SDcPX5=P?}tp?@=U=K>rM&w`X-4yPQ zEI^&`AXyt*9$TB2lgBBJbmm&-K-fjUZB-)74-wZkcF>4 zl1CnC#a7F%MC%6TwYjfRxI$cROw`mqwd&crJ1Ubzxov4+S{(hjfa?M~{*wAL*Xq}| zADr&x+JCTxV}B@23%&3LD}_`c!ea1JI9;SU4NfXM;DM8xsQK)L3?g<_Bu(oNE9FL> z%`v$9DU;-ys^Jo=)s+9}j{Vm~hA06zBl##5x(VSc(wIC!l5ZQ}p|x&F!Y)zFVs-U| z^IyETGwW#bvItUE60^8eFnv7SSG8>Jm{sC;SY=rvP9D{Qt=4?D?kD~lBIJ%2)+poq zmMYe#;&`M+LOWOFx$<`*)bhgRa^tXP(8mTIaB3WFo(Q3xv})-^KLa|iI+Wm6)Uk=5 zdS})qaEj!<27@$$(Xoce|V|=@pg*UjKQRi1QKPA|XKjg>D zPFw46I88?GJ{lXj9@(Sgq2SSexXx3fW7K%bAos@wFWvk+8~|;7{`16FWeF(==$&=;p?m(&c$?IZZ=(@w7KwTV<0~%^3v%UhqUuNn3v! 
z*5KCRc}!{}nc?ee5QI?+C9T}7ThG{@ODBoD=~yo;Oca6{@&9|T!Lp??TWl za>_%BcMvKC40?7k!a*kpc}W+NJ3XuaR`Hut^tA~QzMv#^M|6spY_c1Ni!}5Xd>?ae zJjo3riM&yalqC;rh!S9!62}8J&NT$g;TZ>nNvDOOSWS~noDt-jt9eCCZv7cEvCXdw zZ0h@VQlrX{dgy61d=L7x_%cNqw%>Yg-aSc&+f6;Sv7T@GU|-{vBGs#1zpg%N_SWM_ z`wCF-2PlX;oO=fbxwwFidra3HT}Az}ab_CcN+#b%MfP_s;&ZWuj3R_%L-{ z_-PQ9C{cK?lbw?Za7pBKN^SN%MyVf!!@9-a_rZPm-G=&9qN@@i;q`IN98~X%@dkZ%4+YpfS3lmp z`op5s59kyK)6%G_jB@%#P}t+6pQk+~;yIAl$!Sp=N;Md+Zz5t#%L!;li3$^!AP*Kd zQUZPH#e;*62R=*Wt0G+uiDilNY_NrQFQ#c~c2yLP=+b)_GxDw>wg1L1#FIgNas9)a z^W`mPvqN$>x{nQVOwJ4AADTj_$Pv0cp%;CKHb4|fsOdlHFo|$!4KSZ|SMd;d*X=89 zrg)CrDw-15ugZR3I8f@eqz;oyjogoSNeCnwl=zk+T+GP0Tt7WWqcOh^_s7>Y^ghQG zyjDNFWoI;OAZ;BOb0ylmvRq?ZWg~Cz+(0&J!iEFnNP=##YPsx%5ZmU|EOF2aqFhrv zxjH4eQN48yI_8$wm!^4StGIpt2l-UW<3e$NW1l^_X`1bVTsnoDz!3%o$yQ=G*ywZnNgr~t+M!V{>hcIKY|||wJYM|eGff~nxZn#4C&Zb*v_4I9Dl3ojYI-TsC&1~ps z?ACp(Bz_%V#?00BM7^+ThU(HZN!$Zpdr9Zgcq5R;$_0f;Z+n}bA@%Nz;6~WSEEHbd z`7}@$YNPWzmTSY{e6t*`m^QR`9N|I&?0f%yM|)ac{_JFHc^7XK^u!uds_KEIBKNfQ z+~q?F$-UY~q*;O?GSS}urO6f)oHcVh@;8QD|4x*F@{KhHy@Tu`07|SH`TwUtC@tDx`GNG)-*8h&@ z57f{{^sv{x=wlX|NA3r~CPfwu6Pl#=V9a)#fIJYI9&!7HI zmowG#RXwNCuN>sI=pBd=*9=|3BoTtAA=P<=$xcVpJ`^CY48tB8VmhHpfqGXoOtgd5?t*Yz+Y&^r=bHmOEM(4;9G3oD^Ww<32K z+l}J42o%{hjI8c;TT0@7)ijzelR-S0a^bg+LT6vgGY`3O=^ncGo&qhL5V&qi zc5FN2_9XHX@jXtO7TAtEo!U9}LvmAY?8xSg#z${c!0>}h*5vm&GW#4V_B0WYTZ4E{ z?39Adr=g$RdN8|>gRrX2f@ndH^iLbOegT zv2~2vghZr7;S8+0&@+80r4rp;Pdd`6CDp zLrPbB2@LGXxKfP`vn!)b`oTm!)b8+V5An+#C6`eznIz-4+wC!eCOuKNSe_r`cu&)z zE{u4g#AuBjI9;-QyIFV!BwQ(CW2q}avE+np^kKR&XVu4#RxoAV$}4Xhc+&6ks(|%r zr*Pqgw2`(3?-&3IIsr-_t!GOQb^&ay>Vuwsnb((Zyt=0(++JnV^oig5L}=AL!@di; zT^mZokef^{Szr?COvsR@%wulaU&2H3)-2CCXpomCB9k4mdUkcHoZ=84DB&S?yyw{@lb}?)m^6a^|0-D3zmo4$`bmhGE zKwpS_FJ|5Ofg&O9beJpa7)lo1a#dGn0Un(NC8s7Y@W!HaDe`SWp7`2gq*|`+tggA) z!!+*4C9>`A<_e^HC#;;F8%jJiE(Rj4<&Z-^VZ7g@P7eF2L}#qD={;T^?}1Bw{Z=>! 
z8CLcZ(;5~}b+cR$PP&q4n8+y|^Gh)?K3*i3@e0XNgg{d(82K8u8zSZvfCc+8*SNsJ z%$3sdL0>AtTl}bfs6gRoNn?u(o^9b7zbBYNnD}gvaja?_n`nly1GrW$kjcR0|I&yq5D z-Rv!>J#ZvXB3U|UDCpsUaCz2+*KjvO1Fy2?Io7ca{6c=#F^bTxa-I}MzI9f?H0oBc z;gNwy4GE;~{e1=_$8L(NvOecZ(zfCxa}!*mt6`IE)Y6Iu@?4EeFwsneR=IYyF3l|U zyuVR?&Fx4dq4%oAQ4vlD&MpAS`8pEshJQX%ja(4P znivn!OhWatuGz78A=RAJJEEe8mz)o{EN@)CO?yqACwQ-u2v|OAddMJ3w5%;y#}ibN z2W1<^b0#cx5aSc({B5!uxe8KSVjA%rJj>bP>`SyLFt`4MeL0D^q)GVO2b%;qt zEx=<@nnQGH^#*i|t{GvLQ+daeiy%F#c)eeB;1MNEGtS86B;q081q^kYe58grkN~Mt zT26Skn(-2_`BT#xo>46yU(0w<0k9vpJf{QtQckxNzuMJ{E^=*cPC61Xv3SQaPDJCv zTpc->7*IiUrJIzxi?J8I`=;XmmXUk2G=&_f(E7GIKJ}4344BSWSUvKRhq*Yie~0J zTe&p#Bo807lbcUk?S+q-fn6sau=0Fvdw(KSkJJVyTX=J=!9{gX5daX(PQOYXe@)qc z*_&V{v?it=unUPU)y;o-kvk>>vFJLK4jnpVatw0Z>lBCsz*r+Fgyuuq^}=wSSy`Gmp+DCKXrc zZkA0q9V{zjOL&tk2_08Ycgipg{q6Jqp_;@=jsx59ldG^Q5EW?xAKhW9LzH{5cV8EE zmg4^_x?0i-oVKPLa!(6w=7gUcD&bNdNmtb;)EO)T9urr*WaXoMySK?CqFfv$n81u9 zVt7{h+~>eZOv!iKicN2NyeJ|*joIJb3o}(pfN=3Gf0JevlF)E zNzJclki8^s%^SY8Z&xhcsir1{~ip`79{@>~|D_t(^U zQhUJePL^Q9aL#CUxveXZw9QR24R+h35o!`kgD!6l+|FzJf3|GlHnPf7Qn zJz-Ou&7hwdb{`V~6RJ)y2!?a@QZBcdJ(`MWw6%8|ZYz?xIX%GnI|;F5kdU*uX?JMR zWz}XcQF{1rd-YL!j&-o5KMIbt=o)S%5q z6(}$2Mheqn&HHm9`(4Va&aXdbe%_W~$kveefEvj8T3SZ%jC9`9@;&A!WG=u>eU!oJ z_GAVrLnEYlay5ufwn<&2IsF3g*A=Z&26Nze(w?wCxPZ|n_rar@- zXoT27FyX*iDBM!8Pmb%LZ%+NDZJ3W{j)OdESRuw2r(G4E4^*xSouI?z%L7lMXuELd zDTJt!rC~fZmZp?JgNXQs_T8S|#Hdg%aA7{JtiIdsZXMaQeVe>|jeB23jHX1ZCd~7a z8lmdwBV5scq}~1{B^F{tTK+9178{6NJ~afY!M>n3`Mk>*d?=PWVh4Dg+|CLX@zkwr z-w<$y=D#C;vKijU`X9)Hm@N5g;U=KH{{ef}Kv;sVW?vBX{bwX{I5aZ@KN$};HIUjM zYk85(#k$%DGNqjHt6ClGUC)1G7=8FNdw2W4@VD~^%smV3+fwt}BW#@E29Bw`)hs}# zi`h6J*G}iav!42={b6bVY&G;}S#dK3_jnZ@g5CyF6q{cgmXjsmd1<|#B}|3~_@G|TY1b@ice|Q7nrUTBygYJcXY=D8 z_xwoRUG{qO@td*siQW$K@whwNJIizxg*_zOch{WZ4m~W%xl8ram|;5amO7d+)sf~w z#v8}XlYvU=*h<;hk`Ws9@sqnM~KL8JqytGcDdk5nd5z9rxsoEd7nK>`ys?e9- zzgU(aSvcu_ZoJ~^akiuYqC#Mv4;%ZF3G!$FJdku-<8Wuc?o(q`p?wAi_Ev87-vJX4 z;JeSoHq%#gLUGa*nIu}=mvbA_*9(S0&*aw1>B`n^G5qJYt>%RuXWPLQR3lQ_xynSU 
zu&HaKG+&t;56e8iEqwm+f&FSP>Q3F((PxX%w&<$@hDW@TThCd#$_0~eM@K5Y)fGNH zrlY6&jkDjJTpvADZ1#0P=(7efnS`vV@xjgx6heZ9?Pdps5!KRv-nVy}g?Yg=DkTx++!^xVD zUD@cKD$EAUF`I`&0x^#}%*>-@1EWiy&?>J-Dq9c{!wUP|rv<0^+C2L~t6spWDrFs{ z@IPV*<2)a>V2}i@9xzLC&WTShX(aEK@@e}dIJw!6QHA|71EN`ACPCIodpR@;S+%8( zDzcx7B2f!70>@8ZexzJ{Sf9_F>-oJ#%x~-CC+ow1=I{UhsP$i?*SnX|AfHm(>NkR~ zYKPwkv0rnALe^0mim0C(<~pnUA@daO3kWLxV`AA~_TPLo(6_>I_#F1;qZ})8PAg3l zm#;IpZNp6>M7&-d#jbbczpX-1gA2p1gJPon&1a;eOG-T}K3-CfMD!7O1VVlZ>>Cd8yC{D3`8 z{oU*QDP^AY1^A3R6ZPRZaO@?(PKs56=wpVX-N5?8x}nJDKq^@g+%^enZ-Hpr@Ef=4 zp?X`NQ-Es%uL~3;?*kbS-v7sc_&^q99S_JkY){5B$^X#u4bmCx#yYSQ*rI;Gq`3ZU z_|KsL^X<=B@gGa?pAW?finN_Ftxw4emBAo|{W?>)$f8*49shaV-m0COn@T~qZ9@9K z4^G@qJ9+R^fSr=p`4ak38YmN3$QvU$5LgGyYosX(VI}s!rrA#mzWFGEdQKDf3BYY-P?v&GifhH zmKfZeLx5=mctTjW3nK2IIAcRkQw445n87}SWmK@ulB^QT$(f$1?7Ib&l;_dVNcB!?(EyVtzrkF2VcZb%tp~Jx#Blg%7D1bcKF|Iq$(M}dO*%W zWB0&!xV5l?+1LO^84P@R>2nyG(^3$8RbdwNV)56y#(#WWiT!VAw73dR*QK#Hj~*-T zEQWZD2@VNUeWN3Mz3Gw>Vq4y#htDB0~%0G_#VuS42_; z0lS>Vh}*5ZgEmLMCZFpGe=Z^Z?QQP_`%$+~#qOokC#85^67?nZMc$=IUBy z0Ldo6{vDz0-cXpsJrbI8N|oOBNN3B({Gs=LhtczB5>kjYAi9sTAQUcz1aX(Q_Lq&yg@Pj7tf-aOPxZ@>T05LP^6hkuOl!+# z=rRxeIIXC0lfA()483ax#n~Kwwzv?@IXLI|z~0UE7-zrBjRR+Im$N(EsfcmxLLTvJ z7ix|`%GfJD$2jgtV1MMMyQlrw4_Jy)(o3hIGyDhh55-C#Q;N5YIHE-uN~tFwe>b<( zmI#%etmzu}|7wy}8F7T?pk%gEDR-cl_^U06(~vcKD72qiNV8T}#k}?msLwb*>96Ih zt5aUG$KdG>g~a2G&c~mxUpUVmZTWyaOhh$mW5yNga7bp zkMVHs31!Cu_Y=c0!Q!@g4;2=o#}NB{j?+Bu5u^6mmzEaTt#^lU410H;>v`=Zbn&qE z>jpn3KJnH03kPkm6H5_jUgR;xrQkZtLo}<585T$1j+c%Z4mAg=%PV?LV|bf=7aoR6 zn}r8@aM%uqlU(9DXb-489Y#dX;Bl^fCJydGeK_Sl)PSNz+GaI<6Rz^ zjU_lfH$HgC73?R&kfPxzJao3&hn(5zv}RL_tF6Z6{S8ufdy7xQ&kU$pl)m^}_n~dS z-&dA2P0VAAz)eH;Yg6J&-V76l9QuqBpkL@-)$!-kQv|Pmh=@0CZn@YJqZkTXjbdl@ z_;FC$sgeV83LNs)59Gr4nVV>1&hnqi*f^9da`~bpPDy;X?>p2^Rv(g$^8KZ)SK54- zHzv!m%hQJK;C>q|8l#P3<#KnMj)(8JlMgJlCNP5@%_$PJ8#qTp3H7cTv3XIks0KrK z|FI7zvSHlycB)GyC< z5**9iJ&p)B2EnXDj^L z?x`+5d259<6>&%UYOxO18+GPGvp_c9wmp1QoYLOJVIiU4gY34FcY=Pg}XGpRiB 
zqWDV#_T<#5ubq*wd>c9yK~Z?@<2>G)$KJ5H9$E}gR8LE134NqkPmIZ%ig0+LMxfP= zGA_w7ky?@k_9OPu=6Zd5D0S%T8!D=xajaVQB1Y1=jrv0H%PXrzMm>8q%$i}sEJPrM zK$^?+X2XTZ`m7-@AyfHXCHKI-JSsb&4{$ zh7(w0O5o%U?@IJe&xf09R@eP(_o+{}6d*>(P?v8uKjWqVlrN zk}wRHu_?{gIHc%-mDjd@-6Mq)FD&jvm^=s)oH!&jQFDUzB}JZ=IU&N{YSKj+@<&hX;M?LvEgQ8u_99UQ)@KHlp`f3kfO$fRWNm%Zl&qIxoFQ;UB%Lf z9dQ}o(A87#KJ#Kuaf{v3|B@8a)B)r+6WhBX7ku!;_OR40nWd7KS`JZftI3@w!3@m9 z8I&e@xm4kgo!>PkbxJ}^u+5f%U?zU2J)lH=tSC)Nn|+7~_K_XS$=fz)_N}Vv>$qc* zy}uaJq?^0O_*HC_X?-P-(9%XMtY>*T^Xm#F_eaE{(S+NwE^<2;InChSw^LL++h87} zhnP4(kc594Q179jSkZE_tIk68taoOYO#UmP^HdFYlIN6o*1%Wclc|EY%%l8K4g(Ac z`-w2Dz^H@iqi%);GAwm)w&LcfWlI>ZQ-SLCVjE43*8T3Wk(qqcKsi(9SO_InKgxjS zduu#TSf3(3v|xECp|L-H-YMp*i@auL(DJ0`w`V&)I-L+ZpGOd8wOJkyuJ@CS@DZ~e zPl~W=EEDPtF!Ovm%b$3=_;l`rMr70Qp*!dGc8UqMUxjT-eMFuw0v_;Qlxg<_KB)Wp zQ>}v@efI~yHK~XP85pUF^ga^nDwF!`$3IKpRpugc?VZ`gQFWhYlytaore=>=L_aw8 zHV-~Hepp{h>~O5li-WH}ey_l=&5fR`?v+kxFx3gPvKhLjH=>d}(L$MHXIfv~Dfsk3 z(46maw&LH)7h(U#BO9s_y(~S4TAN3(9yvDq?fC)w#M#d(kD9V<9LH@OaD#8I0psq1 z+ywq_HN&sDOn=iU@sit4cI zpdxb|9lbFv2nk^`=ls4wwGz-29MAOwHXpadN-#I-v`sboiUqv)8|q6SguOZC4VYp2 z`BqJ4dN+_>pe$8dod$|h(`8A8g!QXvBKleJ4OK5&yWi@Rf7UAh`eziw0ubS{qv(lE zQ&323E;ny5i7!Jj5}QN);_;h5Ro!mV#i7rgGzAtOPfhBd4fL~ReTFZ){>tI=MbK3}A5bI72g=7u zh`PT42?_%kGCEOR9Z23qp0znkR2+5z+V#O zzuvbAXed~+o{tY8*!DUO5Wlp z&FVCSxJbAX?aKxfT`Fht*vNkB^;wSABiED%#ka);eJkIsaEhIG%WBj@02p~5nhf1* z$f3|jfcUcP!HRj*+Aq_i8y-of2>O3EY8cWQ3iVGVnx=Ge|U401wTpgZV5G(SR_ z3+Ho(OiHQF7-9Nv#0)L^D(QasbaT&oXIkHC{Tj2WZUs4?TYw_Fg_FMxi)LWor z2q{YQ7=x31=_lvv^(%r}hK-$h8azskZfeUkXy4xxZzSwd5zPMOc6B5SwyDM1twq1< zfG=VPrI3RL#RTD5q!2jq*%}l2SkhJH2O{oxXEdj`-GB1!YTnc#HzR`ZVX^RM_bfx` z2+%!akoyrej2QQw_Gbw*bh?x;j^~`yZl_wuCm8aK)G4;PoV*=HIvrU_9Q`mYM~-Ti zu_aj##yO;s8YDJFmzZeF176Wr}pnVCm-#JcdosNOT~|6 z)rb@nGiy+{@?3IiJSPz z0l`o5e4S!^b>66I=B%P><%q?%&oS?t0;KP55#NOEl7Ma1;LE;&Hm=Ve)OLQb$LPTN z&Q2}fERmyX0+Jue7q|IynZg=Dz1!uBwDvI>*@jVpp1d_*E~6*fj&8eJKFnMcF?c?C z=GpFN7h(QV5^Bu144!d##6xU-4uac$9Vy$n(8XH0rglu@wpAoxI1$vlF47{h3wBCk 
z0v`bt_BY`M@J$bku|>Dc+IKQE5*Er7r6=0Z9G0Ih&|Q@xdoPp=f4ww@G1{Hx%SGSA zYDh#a+#@d=4`E2{!j5fJAOq-syUX0cXphz@PJ%*wuFjUJ5NU#}5NutNf)2+oyTjS> z3x~77HkZFx{jSPr0>MZQu3e(IjR^G zp0tDG5dVBjrmBta&j#C>j8u|5YK01ce%p8cn`8HXNYnaxfOIs<^?}CxD{B=L{WOGE zV6I;{kOi2*21A$|_@yFsssre{CVuH1tU%R>^+k87?FY;^5mm%PUx5fVkECchEY)On zR`ne~xv}>j_$>RXv@;Kp``)a8D@3~>IQ`has)P%)Zh=wt2PNCOj9d6d#NW7GsO`I< zk$C```KeF8y2Ehr!)DU~1h@?p`#<+%04}V2BmAbtJ+x{{1X}SAjkw8;X-0Sfd;-gQ z-d@@Yjhui2{VsK?@=Ybb`TcQVsYM59+2`E<-~r70z4dl>3q^M!`UgxxeT=Zl0&0U8 zbOzrB_IC!{uPFv{1ZPi7GrbyL!e3_zP5o-1k&L>%92~zRhRw|SfWng)vHfXrlAxSs z?qtu|nFV6d`1tsD`|(Hv{m7l^&&HKZzCHV}<@}NAeE4&#@b$^JuN|-4gdRG0u4#jY}#vr2N4^BC#&ZH)3xU9 z4_HtJ4aB&wv^QXBNBRJNAo?aV>?X?(?BKP}0L4tGNiX8pQ1|wB06U&1Y=dfdBZY@( zW$HaOVnEy`jioX;qtH2LsEBB2e|po=^oqjiSIs9%jmOm`)h_tf_i#5pM3c}ZPTeHQ=i zeXD@d)7eI_*#5n*iz>~A+AFZMlMo|h7AXb{TOWpOuN&86ne^c~Oa5-i)_WaKPKis? z;&s1Jk~6e%Gepe8b>}1_Kin*)`PoBk6t;*z2(rV4;CC~O@(A;W6;XBYeVXGcqDq2q zHC$Xo`^@ypwoKUAQLi=c{Nw@iusTG;Fys^(5u3;S#KG_!C<8IjO%yS^<^$dzunJs>aM8%y)IFjP9=?haH!J z0xTEl9eQOklf?n}iH`V&FyG@xnRcuJ^pal*h4AOl{o_{g=g@(b`{%6txAy42-}L?* zx<7~R&tqcpfBJ+CFg9n-QSjkHPobfDP$XFT<}Z$XRUib8>I*A?JQ+_D`_=d}V4R>J z8nN6cdni~Tdd|VCk$y|c?SLefEn=rqZ&a? zkWcIS(=@M~UL!-G-*9j3H79>FnleYYI*A16c5@H_0B(VBZ|A9gK=?sh!=&GbvjaZ8 zdVtlvkA-Gl*se zNbER#?>u3!MiRX+3?vCh(=xuVibw9(2s!hpPGb1**x17k_8xts@HYCbA$xeS2=-Bl z#Ss~#ww`-w!3WkrN#eNeZe#vp!s7R=_!vJ8<&cHu%+c8gOd1^N{hifTMt5d{*A5!0rHL#4+kvxvi~=zw&K! 
z^= zl;R^moAsq=)h+v2^i-)MqshY;yN-J1MvJlESe_VB8R^rn;FBB{(@VQ=cK#!5^H}Ic zKghLb@S0tqL;^wb_Kb@dxzgx+c?0S3<+j;!q$;ro!TXd1#E0}ru|1nY-Kf$Cx*c+F zMHF!qf3&}0m$)`(hvmhZJ7$owa^|@}jg9*woU(n7R7OL|ZR*siPKFY>aNe}9xU^HQ zRmQ$b2^ry?HC#VMfaQxRNY{x3$r}+~-O7pXLpgFlk)v>O_~zNRboA_~QbD(znYWCd z(x>;GpXkaofo925LmKR&siCmVpuaMq{hN=RMYYiK4zLFWZyGjQ(GaW)1ZoE$H63`p zJ25}o`K^7og{ApLN5`#;G4^TP%>rxEo^12xG}`c3_Yat?M^FuR=q^Mtm`0Ot5H?OP zg`lZBA?7Yea(>PH6WVFpb6$OJUgz&P<2sS^CE(|b$*amGv9XjC7#rzYP0DygnVS?~vTdLl8{e#v1?*z}Mne6+9imJ3q&VvD~T7 zxby<*P-k7me}U%HK=$iz_y6O|>;GRaN6)A0Zcsrz5siJ(h?EX0yc4kt5fsVM{)SQb zhYugfU*8!MSsU<@mDY@a$l$v=^Q%9Q2Xj#gYz57)G{p(+sChLx|(%l z?@bk!uKyurZ&7E?v@T5?qsX3v~LLBMm3|DC$QOf@Qc8ktivC+dmFm^ zwd42E$=}oCI`-v!h5(zfTof-;23pM*g3&U6PmarIXU3o=XBfhu4vRq7o?or|RbWxp zEqEwy0@}Ce;p0cUR8w=T0*Gge=2+I}K{*Z-=&3Ed7y*6)y5J&ywfS8KdrklMW(L0m zYd06r+v_K7Co51rnAj5hHU^3^-MlM*4~QjLU6C$sKRm%beC1fAad^?`)5JG%Au(qo z_QXgbW>x$Qz!?adEM{NWNcVA|nUL|+BARTr1B%_}#vL!OG=@RIVH1XB4&G8 z$&)Yjn8C?g)t&2V*Yy|wVA)wM=wgV$ zcVwcWb4GCy?>wGt+aQl*uc-=>S~_4lq@LY=_MX}2gm@lNg~Fz}tc^pFT;`drT=p{6 zaccSHW_PpVNPVBv(&7vB@KwW+$R_Rkmpu$)MmKf#yVu0`otz~}h&bgTs%qd+?MHM0 zp%E2T3hh~iM_P*1P~6{S%g5Sbf5C_8R+ewe(pVbZUUYXszNz)JiPrH;jn74ssKDCt zZ} zYI5!s3v75(dw&-^BV|7z0|XEcyBF>4++5%2oG)T5%}8EP<8k7UVdChK87MIPzDD^$ zlDo*R?6~tQdj6d)Zbz1wEhp1^lVhEaI<~atEyiChHqYqGzO)|%xmAqVD{4Y4}XI5yCO}qCl^9IWe@YG8u0P#<2*psxHLSQ>_XJkq ztB(XlZ&ieYt@9_rvk*g@eh#WV?|0U7rS@S0RUu4Sf~Z8&y;?(8HiF;z5P9Yk%lsXs zhRfEF{ev7C+l&oG3esdjkL)}p3@E`*ixIo+A4b(0bmt-D!f3h9i}~p_6~?b4rtTQV z4F_J(Iz7gQz6I4{08+RQEGabeX<`B^M6NkJuCQNE;rmo}-e$I@@7$2Pc}_aPkW*%& z`W3b6>{>Yb+tT_7!3@#LfsO#Mm4PlhREu&2sBqy5HJql9!#H0?W|oCdcHquBBRS}; z$3T$5CfZ0dM4>P_U97_Jd;_K`AG%t_2keU~8iI%zygY_#Ktza26wq#F`?!)f=Z2i) zrb@n1-(Cro|9~!5=Mle`9qnHHB`F~P%zdeqe%3AkM(;>vwL!)03vOhq@#(Ot%an)f z`V{woOwRy=&hft7*IkR2S4_mqwm3D`)qTUYo_luU+}*qN1NgB*{0@c`5l%Y+Lgu=) ziY|}S2Wje;DcZdeK{3y`3WqPcxGo)&yWGTc+X^+pH+z+9xOsQ(t2s3(t!3Q9%@NC` z_}0&)2cbNcb;b$qZ*(5`o=*&o={9p(noGD$+KXm!+@znPc~dpI{1hGFfGZNHDQZ`{e8UB 
zJ#%Vngv%29b`c9J5zKeGan82)9IVWq3Ym=ZG3V6eug17=zkf*6VsQ}Q;mU9@p83pp z#x;U~6Kjj78hY=XqGcF`&4|2~QJzM_7+IaiehDJyRXLy4c7!J`$nTNAaOn!V5MJBP z0n!0Ma;SfZ=Wsn#ax8&<)dA4U8NN+HRCf9S7Kb|7)cSYo8EEc%+sGPy{1{1DL81Qh z;z|W)wvPImtPrIPdQG%*RE)%Kc$AsABx%L9@ayTm+OjpqHkhK4Vts< z*aom3`K_yYSUbTHw*6{<4a)BeVIN>Z@?pMP)!lE$I#6T?!u6SC)w1ARKhaj1FE#3n zYlg>%HZnrEZ}9jDU3eLDDUFGza|7N>ye-&Xds!v6WXExxq=~(k>pvBt2a<%V<(H4Y z7`u9?GW3L~^0SO9h6Xm*HN@FyIdok+ke~*Q^{uyzcbNt-{;l7&4d4Bbo_jYsmV^!Tt`{f-Z*l0EY+8ZLlirDIy1DheLxxgc-8d- zwx??a%k-G;X))z&uTM6Yi5N|?-h~QD%Wmr zDQ{gb5)^2kY2jwL35+#7*&IE1`6IsMAa;ivCITr`L<``2U7_FSjB$6$Tm19`cH=1R z>0t9Al$g}D-JhR3os+VRpxcAr<#Np)s8)%=mq81dKt#FXx6N)g)|Phn+1XZB)VS3` zl0&2CC8BTMU%GQCyV`^MgY8Bqpk7f*r?sd@7Z)vmD*LPrBHfBIm;6rd<20LI1$Xkn^ya)CQJL z7Of4y6+0Q4-9FBA)U4muF2+Tomba9!K_o}W+xN1v*N&LIIpmdL6yz}?yr`yXJ*0%8 zor?n6UN7mpShtD33V0M+hO1Pw+4`8O z`1#hY4`K(QTB0I5<0wSdMr@t+^8kV7;+hlo2WnhKM$s?r{VEHhU)i4f)W+`+#CbR3 z+pDt}6U!n)mzH%)(onB!X;4qkosh;$&Zej0v+ad%yr#wGmCP<(oPsP zaSgLkM+=z=kd?~)X3hR8DMbsP8?;o~tttGN8!9jSl>QPV=+Wa1{B z`kGnA+7H#9soCpk10aC1MmyI6*XUV51RM7z=_X}+gWX$`#^fi0Rxx>pgSLmt9tl5m zhi?Nx*Y;~TfF?V^o6(PWJzX{{oUiqh3o1$9vXZ#P zO9R#S*N~!UQVJ0zd1fq*Oga)MI6-z6>|>}$RcLaJ8X46J^|a@_ zGlwqbEbOa=&Y+b!k#bYm`&?a;c6o~|#_cG-c#N~@nNaf(Iq%Y*j6~0zm8)(%nisF0 zaMRAJv5BopJ+?8Ds3~=>al{Yjg9H;ZjKN34(T_L~M3ogLND8JjgVzHi4vO(5_LzHO z#;Q(%)BA^avbti9L^K_Hs_CD0;+yQ$&&Z(vn%`%LPmIDw!;FKY1O9{|g@|wqF6wYI zG1BfzU2cfIs#?m|(<*QZPT2ZUV5QN|m$8$kM1;13g?Jh@mI{c}qDUt)S)CyhdwC2n z+-Q3=S5%;Dj*5enaOypoO11A@GhYguzED$d4Mmk#ebmkML6V!WL|vNf(4vU^*U=^^ zTtHcWk@Rt1E#*3sqtVqS$uC0|wRKKfwnUh=E?nBLWih?0ddNt#(Il-rzOTZrUP7^F zrj>M6vt;Ou|F9X#D69(4DL0pfWhPXnT0@5rEmdaW-onw(Lw98`O z_8$qd|FTuszIAVgaB!r7jlrGm&*2)(GH5Idy#=tv#t%tj1#6DX*6dAaxYn9!ngpn7 zm#6eax;VUz)1O~C9Uyk-Lc__0XIvA_3|_8}Lx?A+u|eH|NN0p*_v%JMqa}Zhi43Vv z!7MCEYtZHBo*kDzA3oXE8+EJ#KO71j{3%Wl5Up&LjrQvrf-mG?#xX&CQbdH1XHwm! 
zOIgOK$(8pU5sK&K_#=gF9OKHLKijyS{_xZ5SyN6qmFJ53*}^sNczgN{Y?hoyLyD7% zH8Y0&v!zVBpLF$m>ARAONN(#2J@h2jZji~g{df|f3Q|S!`*q-pq8j$fk82+4c5&t6 z9yG1>70Rs$tlFP*@qzHWPkfUR6Y-}g@9*3((qr!j!iM1`h^P1>g?ijRM+LZ3pWM1vX{i+yV*2S_C@-L@NVq@D3*!TPPK%$P_?w7)dHQ zoiprt0NOcVYMHFAW0sXCGZL=5V@*$JKpM7aKu_ZuZXdw1u7PBEX2?SY*0Xxk?r<-u z52QBDoK?My$u4&zqq!KpLU-9b3DR+dWdF)k^4TP#hiRh;K4lK{lw?(2!yUE1PjC2F zUt6N+rsl$C?ZIa_bH=6WP-JYM)PPhI?gTMPa87$+*otmN#I>GYTRh(x;vCjs8&)a!obhOgOg zSfCh(;(;F1knSe{^Xxo|XHa98dAEEqQlg-V8CXg%aWR`%ef>+RY3?xzPqcM`=(`#3 zGw-*-Z&=3u?ekFGF5m;Hq%i_vo)V7%Q@3F^=4}y$E3!%^cjf@)?BUnsZ}CIv149Pa zx_#5%NK|^=J*KN&Erq(md_LKQ6kwbo&-U}R*N^ClXfAHgMwjg*CaOI4V7y!@dQ~^`@ep-zq^-D*cf)e%$VFBjs2IC<3e^eF6jl38}T=A8(xIB_9dcaT)Y+dJ4wiWPLiMPjqI&Ur;@7B!A*tLkq35ElEj}K! zI>gla!(MtpjY~nmODy)|nMVpl&N`t=ocF$KpTg{0o&H5kMbX+Q$q`=gw#_YPV76jt zwuYAy8F4ah-1(5Hi`i#W_z4%%u;)H)=0Z-+FV@yJH7@~)#eM?ITXO@{t#1SFdrFtz zrANp+WcWh;xU)j-W?ux(fyI`LIQ#N!rB)k%9W+>Gdw*fY#k~DcbbKDKy1l5-V7o%5vUNk{r-J{jrtbJP021#P__}^{{+rTtK z1$UZ5oP91oxP!AN&nvJ_ib{8<&^7tv-Iu6#&TUoX1W{L~g3_ynaIF8%Hs_0U`MIIy zK%et~N8R#3X^$&GesFuC(LmzUJjTUb)%-a} z@$mKNA(?j-2}k)`^)3vh?U{aMTm6v#Kb8zg9;(~R^_L~n+G(NAI0V4vpR?s}&n_IS z=Iep}`l4su!kI=#1i`KN30|MoW!{vb5B7f@4it>y)MdY7p4qCy+@JlEJo zCf0C=+kxS@^lQsDSG%S^;F3m3{2|0Xy{r#tn@D zOS!J5nBL?3>3ijSuTq6w{Cng~E~+>snZEiEzIXRhVlgi^5eZgYu%r(sz`>!Xc(M~H zWKd_Ezd7c?WMim;)7KW1Ic#>5nRY*wC`yJ|h2Qb3+$hGbR{%H^KmgEsVp0Ud+F4Il zYgVmFQ9(5b6CY2S&$aiQ5G%Qh;)KSYV|P|*Wn7f$jr`DdWQUTQf=Er=dhpXjG4mH| zBITr)%|5+LEtkB2I}iA%+qBd=NilORcGkup4+mck&7#OV0`WT@z$NhlvKqq!ggu%_ zMU&Og{<2vk0Hgr`PCJY59!OP#vjM;E8T1q)yMQjB^^J$BG!#xu4CNnhIsxUDXEL_| zJanrStGO0xFIg9xB+%|>eq7pQr-9`H+5qe{aY%Nk1J!`0$wOou;F3k^{d33*r0>I{Mx+`u8J*BDP@J@H6mn1Vb1_GwQ>FE2C&!^Xh+hHT83;7tN;|F##-Y8fDBIw}&=S zH+qng5VVx2P~PP>A}^xWM2xO~<7s24COMb8i+%gWgQ2WDS-27GqwJ+fkcx+9b}(?P zBaNjXk8Mhq)xmmR!_I02Id&);71Z@&nlpLnBc99I&3?b*fREhkq1n@;ivE0QW}Y2N z$Hlh}9^$#tNb;#em(y4mC4W`E{Xcc|;BC3A&Q@(O*t%I?& z+%*?5(GB~?(q5^`qc}ztDLpbA9unJN78ni+)gp522f6K(&e>V||3b 
zZu5JvEm>l0BqMCx{!oH(H<4_aB$A&?KJRHdewcI6RKdpjvm^zFAm8^p5vpnlZ~jqFu{U%+G}e_ zj;i<7notXv&{$1M*3wRDAPjcT#uRgWXqFi4&?tvDK4VwW;j8(Dc8{#kL=Xu4OpXd* z+#JUhJUfXhSbe!v6~0QC5_4{^zBk4YtM7YGRDO^t>F%r(CG$Ks6lyF<)KOZsn`@wj znC_N4AgF)SSwiU5P;i)V>*zh~sRb5Ij)rfYBe%Vpz<`3 zTqM%9b)02hM!)#DqSkc8^sM+B!Q+|l^RdmM+dGlRXqWZKAer~9uT+B~X91fjVYVVU z&BNWJ<9)rlwBFXpCwC1V@4zP}-z37r&gzOZ6CDYAJcHUBx+_yU(!;haTXypAx+%EOeJJ{Aj9}F2$i{eqXBDQmT#@3GV0gVD>xUBh% zVoG{it%uj!tHU0ej?zypcPR182##<;T5yTfZW&4Y#0L-==mnX*CgxBE8Jokt=KnYoId6qoYomMn!N`1K_ z-AXofhnn5b1^RKH#@HiOeR9;o+88_-#5azTcLa<<>@z8BDZ5D4s}138lijZwz1*yvh~#7JA?tpul+hWkMRJmH0eNdVvvS&MGF+Mgx?1#d7}vey zfxMbh+XdZQ5AI2k+DLwzz_@_kjGV;~UAuM1o!@qUc84=mF`I!aw9R#zGGP?bNbFyD zKn`xe3LU0JXMxIl3`E&KYMdF?sDKnIOfU`AcqXn>79sfJ%#nB5TyeW$ANi^&fMMAm zvV+)x@t{Is`#G=LgTH*f@7li4{*;5ejsDj`lc9xwz1DQfTw0hJZwmyR#~D>QD!4#<<8yPqq*nZC!&Qx(AZsl(bq;w zlx(~A~U1w0Vu)WTKOFv);r6sE38Y+E6it0-~5J*zH$T6A0;bJ6GJr{Ye-WkdLvn*TYPY!$2p~>LY zWYDXdL@1mkFyptY4{+GB84okNI%Fkh zJ7K#pNS>y~cQ~!$pN?)J8#U}j=wL~P@qXeRq!Kf4lhU{qZV7b*-?aR(Ewy5AaNpmC z9vzceZ+63zE7n=ZHW45AL#e|Yg=Nj%WwoWaB~&_YpWjz3myZeA9TEI6^RcpUwhzjG zRQMd@w8z_F?bPONN3WgcP`kWSqvfyxN1)^fwyhNrJXeayUn6WYW=q{G{IHpI@qMMG zozmBbuY~f6UAfL@pbz7P!C<>tXuC`x`yx_V7DX_k=Mk_=09NESv1hfGj`&X))p_7Eg3wYft~dS7y*!HccD z(G!F+3~3lu%Q1`&2Ie`bN|oZ#aDAzy_(4Vgg)(JfdKYr;q@VuW^g%1!=K$)7PB+~h z7V-t}<{kz&?7g(=s30mgYGkCXA<}w}_t6v|=hi09j#LpZoVU!Uk%xX=MN_)OMFtZk z+uCSj`PJun?+}4et=qeY(x#N>6HDLiH}WtujG=?yRU=(enoPN@^Q1a(Gb^&a$vq1- z@tyFR=aH=VPTxDZ#Bdqf@YCOWh4QUkO%|BZetaj_as)WSSXGXEOL>S=wj-Gcr_+e`&ey9u|s>)_)S76)$Z3KUy%+@ zC>w<}n7Eny8KJkUrb|A%XH=cvZ+48dZ2RWUOIk)`o%?*|>OmsB?oxJDrBY}Et~X>HqeWD;@<1c{8>e!wugVW^_!#{k6; zA}55Qbvpt-S4DoE&?&}8j#~AO)34|Wgv>AVO~&@bqJ?ihp)5Ygxx_o0t?1#O6ysC1 z9`kAzsp;LRnWb`MO*^CJEq328Jz_e=nKu)Xv(IIfe2LMDmt9YFbAOeP3KG&^k0L2L z1!II^;W2#ZSvWhbwJ8iFkVlt`cd(&d^Z9IZLUr!S*_)}_mfbG&~^LT?d2Lt_S-|wpTi`S z{D2J;#@9oY9}v+zXwM0~4-_u{KI<+yo;_0cTnbBP`Mw>Jl{)Bubo7+Qj7B=Gk?zAV zqpiI*okbp}xF#f8o#fOeE-9>2!rK0Mh&%(Z|^7V#~%=<>{x>5&fk|q 
zzaNHjjh{%4>LyH51UMpIEcGW68M@wR)3YPz%L`p;NeXcydUi&r~ID!FhY zUVIF;!*gp*K=3)2d^5f0Sjii2bY7KXa>s}74_YWyD&*(SEU)A)pX)X79$hm!M@Ta* zG^;$Dn_Jv{KZ5rL+S;e%uZ$}H6OTrnSn}2E_+jeTPHRl1Xk5H(MEb@n-#e}UltpGKel$+ZXL!Oaeb;73_;dU^BMWq~B-(yYO8wsyQdZsTao zuVM*b`k(o*0ne}`4I0KU_pi-^G=_dg%;%p{3Ev$*A-90yY11wgsf-o!5VooJ8$!o* z<-c%5Y^f$oeMb#fQWulP0@rZ$gy2%@=n-1w_;HUR)AJ*aMEZkTe6`r(+vzEt{CiD` z*fI~&26-`vV(DcrLK+z3}^0%Rz!N&$C?43E4CX8w= z6eetgZWmX^PwVuo&c?53_U3+T_DZuj+@i}7zV9NNnDWGodI75rTV|O(4(bkYb?7j) zVIA6=%uAMaOnr6R2_=0vWd+l*>!V=I;qL9vt?p+En|n0j{3|d`?Ry&SJ!HNnXU}%a ze2>jBDUV#ysJQ$*g^_n0mhRYsqKA^?$%xE96p5U2WUB5|@~+DAo0Q z|0#>1hC?tQJLuhPk11=fhk92BPN-n|f3f%8QB7{$x;SFwz5$O;TMVeIUTM$qX=|oCEYNSh-E}?{gNDU;E@Is1rxz9M~cfNMcx#NEK z{O%p&{(%e=5?F86JJ+1geCG2!14wOU*WW<3e+}LKv;Y71q2NkZqMGqK2;hC6>~~G2 z|CM09)EfM+$=)0iAS?bCB$&T7den%-n_&&@!-Xv~iTsq49HRkBUd?TYw?`HgR zq0n&561*b4X(fPA45Ppqmzl}{*>i*}rcjVQ_t7pgC^TV99bB?YFzG_m?$x?haiTmu zy}C}#X~C^^0Q4Lh345?}(M)Nu#=9VbMVii5F!{Zj_>@Uh6BHzF`|{9W4Y8RJ?SjiH z(K@8|M2+ZMQ1IeaQ0A+h-shCo71#DL`A~FjD9akGj}t1SnNC$K5vKKtYh-x1le0q4 zt{yVt1nCn5{B7Jmijjz(GX06Cf$oH{v2KqDBKZF4#6A%(^4=9G>`+c zLI4{%lYPQ#6303UCPfI)t$SEn9`hdS6Bf0fEek#|5#T;9yOYT|tS*DTMTIq^gs=_V zMP}ZO_W=rII>m+bF@Td{)dO!e0(jrtj(J`8;1Or-A3o1mTp)3>WC48NUiRB2{U^sz zS`Xa?WH)y3k<#V~A&&v{d3du0?9?|RviSHae!m5<1TQPjj{hw6qTZI6BQhUl7v&+& zjZ(-U>bBpa2igse8jB{pNhS6tix>9mEXYEV zf}}mz$rNlccSM~Wwo#QSP0MDUMLY&Dvld-3IF(LNUlzb_fQ2R5l9au+%e%*1-$Y9T zW|I#?nF%&sfQlmtUjd3!yY!�+qudO7C*KO?1WegDrIC1hrfr4WAt>#z&dxi_dCm zeXIVcq7SL?v+n>wyk8?H^Fw_EF(9!z4g<$rfD*gAPKl~pTgU_=Wmu=4K)P>H=*Js% zvfNOiw6wYR%kczleo?AiX~W*%4{N2Gs&&@oht1#=MGN&qOYE*&(x6B=YkY zM_EY$#-NW_6;-6?*m3Xtb?NQqt}E(2@^giZRE7}qG`P#|vD>kSG2+;UP{`sfwgF2X zmBuoZ*+&^%xR3rHc$3=KFx_dkSt&;CJ%|tna)rxs?CSA$UT%jHW_mhN(1YQiTGr8^M$3a^BDWj#+Ces56oUL)*P~I zpGYE;p0NeNvK7JB@xX8}rCI5wCcw!K^YxT)UfKn!T5|J`EF097_j1~NxhFm)njN#A z36g1wjJcV)BkD?orcK25_xbB(1D59T(<|DG``3^~VMKASaiU6^zZ##=!-1Jw(s{1A zvL4c6;zA08<4Qxxpm0Ef8%YlTYZ<#2SfOejv?TV)Su!dDPUZ?5LM&mIEDpgbX&z&I 
z-`3b2GF^3^D;v(KpQbMht1|2`BZN};5|Y8s{0$2CW2iE((3G3t5kyQn`~^nH!F*3d zRbr`!$3eLhCf6J-IwCql`$DxZ&~mPK=UdJ)1>1gdNL+>He{xtQBiN}+`Kt(WYXd3@ zq+jWLm=1V_X{+-MT_~4q`KY0zVY@t)zQ0Hu!oKUUkD13gO*(1+zUO=Ad0|m*ft!@4 zEB8;JNkVt;W~==^RXN|9@;cv5e0hA_O0P{R*}Sx>&LS}AgmKV`HX^RNy=co=DgC`+ zoabP9VwJ__4Y~&_9?eY?n9L6GAcew8?Hb@AX|jXP8r_-X4xqj^WO~z9v_RTj`iz+< zX9Z?}h^sVy!$Bcp8xJ#if!GL~S^?009d5QFA~~J%U7hpyc8SXGBaa%$xkHR$Gp2zmGnzA;sb9Wfs-Ovy9R z_Crk!?dBUqh05;<^F_8PD4oR<%)7Nku?{u!Gu~P%Mw$EDI%3&N?uO3i3JG2%_@z-@ zST^ZzBD2BY9!D5$Y(0TM)k6h&rBiPng42Ts-7Q#dQ_a~SY8E!{7dDCbv0{6rK#~O> zUx-Vgd-uM${e86!q?tIS@=)N0PWzj5NlzeVEktKj9B6_`;zpd0Gwm~H3!cyZu0%oi6EEvMW z*24mTklj)vjcZDaDM1LTnq{8G2-OV(nNb2+=wmsWtg&3&nCip@cO zLB*bDI0U!j#dzEKH7Thqh4+JUV*@645|a$ zH*irO#DJyrB~`AKk5gtMBeVX=SAKH1UH(?<8YOQ9T2X(UmUzRM0^nf@OIA8~p%s=T zl~&hXwB%_Qk^M4Pk2l-qTvDxc+=kj%1!qY9i!76@OO>}SqQ))CQs#V6fAfJD{q1dJ zm8J0%>jaMY*>#XJoX5T%XQhK9B^~U^&!$_#;Q0OPKMD3X``0wrK#Sf2XJbK$wKHie zgpT27`)mE=*n#-|U;ZOtqbGpC5ev>dfcOC$fe>38Fyu5Uu&^(4qA=7shzO$beNG4c6Tc)}6(E`2Q}#|GNnPZzud;b32$O zvHRhrNsXkyW03e=YWg$gNwWVfe=9aor%>eg^5V3cQ7`f@M%3IaFe*7CsVA4`cVIiL z<_3KnfJgezu}i9cKRI&P#X1cAlZIotCc0Pp>GkBo4`3H7gauWeb{-q%%A?SwszdTJ?m*immuK+gxNCbMPQblCVV`f76Sy(Lqody8Q!AnGNDlV19=H%n&tCR= zbX9nIf=XrUiN&?qdi{V`Oo`F~Sy>OR^rPO@O@{szh~X8z#X^(G4N-8wA>W)meBAGQ zdS=^dw1bo88NXV7tn$oOnx?AY6e~e`yN*P?@m~e4+%$MMWOPeg`iTF!aKZm7JQQus z3WB@T8aA`V5rG80$p5%qq^I-=*e>jGCq<^64C`CANL4lL|6tK2Zc^$d^drVS0>}DEvDa!~K)vTaU+2jt^H%YP0{r6VI+}*Li_F6{HRIGVG(^Uqd$csP~Pj579o# zbfYe}%tiD2Q^O{`Vo!GAYYK`!dE7a5`RKHB-Yvf0rPeig<+D%b->mWw+teD{Z@RvN_R8dL(1>nyF_h~7|L`-MZlHi zSYBz#FGf2)zPn)Yft3ZH3lb|x4WTkj;ho!0GhnJZbvq)y&#)=}Hg_;L! ze|Q8T9>;1Pkh&xtJYFWa_tNfsF&Tsdo40?;!J@a?YFpWs&U@+8+EWj(fYB<_j5D_! 
zEt+{$gziU6S~`@;?sg_y)T(~26Tp_wZI&wsOWWxTfHHxm5H=Tp88X*;)LD3PysX8# zfs62w#5^WXURhG6A*`Sv?r?QUvy}c+^jxJ?@`=1B^d~xGubH(OSJY6q0j7wt@B~|tv}8y|KoVX{rBkRt zjKYB0wJZqLi~xe{-7nRCBWZoVuk|F&qU(5$98Kc#jrvRV-$5iT*JmCJC)-i_89esb zooH89&BRG+rwZCvFW*M73VF(Jq@`F(;c1QRsV}s>kv$2)W`Z1Y7uTD8(pzPT*^ML) z^>b)xYEh|O;r#^|-^LbNHM7m;AnFjl$$`{W{-p`pm3zv>QBp}-X(A8os<}ogE&yoc##0ZL=kc4t+Zc~^n z8&-VXMJR5{b{HQn1fPLhKc{QH0@fgKDfsFchdfx}aj8!^8{?Z>&e zn`H!VpE=ZI$~99{-2OuLz>7+O0AInQ=`<}mCzZPe`^{W`WVY$+f_e6{@6}ETQri>SMPJ8EM(JhA)Qit2F z+#R`MlKq_B3gV9*Za2Fb{*eji0KkEF{vE2jQgUSSgk!~TI=S(+u?}Kl)4`WY4t7;+ z-s3nJJgbfbh3;+*$j}CSD@~$nAdaA~!w>z4Z}$-Kw_|zI|s??lnqig^n+AVKP2SqcpOKfT&26>O=JEzP%SSJWwX2L-NF4F z5j=D83m97a`;A!x=gnKI(c_-Oxp!Pr!lK?3;=fNUZAty9=^>AnWaXp9sdWNf$SLIj z^h2tbUWhieBB};f^poSUcZxp#rU_p-g-bxQeh}l-{K|}e{4lwQ2~g(%I9i;Q27U

U!kr)8><}N9uAiUi!5= zTIew+yu7e}+I1Am1zp*3hVr2+sg>x$jEW3rnKpc>S_fT;70*;AaBP{GiuJygtohXr)d)F}T&G_ix_Ur!aQ%{U03t~&G zbO0wrQg`P+tBdou*{~JK>I?(4x4(#gCeVdye{u|9@8xCF&e~B+GofogIZo#g#(C%H z=?jb9>PN6&BxN@>e6M70SH=D#UXwieW>2W}zc{?t_OK5CNJAv)1$MD}_9w@UThx`n zOcfU4-`cH*{z1$1U-PQ}zrG*Zw=Zmqn|;6mU_4iQvDVU(qy}H1A91cU*`@=0eX5o4 zUU!M~=yJ`l%e+{*mi6-}?UVDCCsg8d zhYSma?Tda99VSmqo$r3)nk{2G#*Bl+8Jf%!kbzx}l@;@LF^xaGMniejgN0eYHEdp3 zzo0$3+EUqYnygi)y$mPUVYsnhw=zM#E(gw4-tC8g>3KK=__k-+8eENig56)Y6Y81bRKRgT$&F?MXPy_O z?ccS0FCjta>04`=Oj^sYwCO*gtnK~q{w^LmjpN9sVLLR zTLq4NLHDH2OKpPBx;x&6v;EjyLzqO~5kM!{(Zn?T<)uK?V-F2CTb(KCwX2Kh*r^AT zlei@MKzO4aKsRV6Q2%(hrUmK_5LBk5$|0%u=8JHu5n*?;4{FbV_2=#U-O0bTbEk83 z?_X?VF0QwR0W;al#Yq4zP9^_oP4)>puZxBY0bYDI}b0b%aUz`pFk#C z4mUZUG?VsF)gK@6F5k_uMGe^7*03gePv~U)1G(&PB|L8q`>93Pkk?NBFpLq_cLY$dA{BPzt#(!$Bd;e*kqlN2RVD_?NHQ|o1 zy??rsf7#f!{IeK5Y5sQ`vJ{oS_i)*cl>tE3;H6MKfOdhT}?X2pEz71Cg%mqklYLL|m zjBLokNbTdW*x|nnSrkQgFyXRBf9qRHl)1>o<{obb!p32>NRE-K& zm!@g8vJB~WS*c}&lin$x)sI#|Fz0C!4+(Nv`sTRZ^hE2S4eH^cDfa3RT+TC5E3hEq zsF6y5+xjN+_$HfkgiL_W4==qdKr0U8Pu8gV%ZJzUH1W~d{w+b?@8WO2tmVoVX%qOc zK@yCrJh|Jg?gH404r9@K(foB(M3X9R57PuXLvEtBBusr;cdscMbc(F`ZOp>zo7yuS zl+-oBcYKe%$9I_RyWGz$B~D2goFp`59%z@|YA_5@WQz$fx^?-j?sp6K=t{z=?JnUC zKHAq!DjH^N+T+c-+3grHn${8tM7*=KkR_VWG@KXOtmhZrQAz`2G%_^?7-AysT`5Ax z+U@aX31<5T-(&5gAM9c{AKVYP2=1DC7d5vjYzXWm_OY~eJ%%@~rS3unl;Xh`m&x_q zDJe;A&OE|kt0YY%h$|$9daEVsswf90fR*r*<54&%;3o&ani=gztJS3{D1m0$lBiUr z`=iAD(|XDwRlRaM?}7rS-tVgq*F=2;&?VK?k1Id#LeT#Jl5ymWI(hUlnio?P1~j0k z*=R%fOH3ObR)$4xCHoI1#SuCY;xGpi0Xb_so~as;%P-M;*UL3tFg-T6Ytg%*ZNpg= z;l)3cA5h}zcz%=aC_8h0yr|t$2_cQG7ZI@2tF)Mp?~~TrHeGA?QHKe zhcX-QDsOW+oV|KVfXlY?kPG3d@z~7kRkf8V807(}tb&qCYt^ECs9;z)SRIGVrcN-$ z$*J;S+#|T=>d;FClUH4?{(gR0qnDFC6VYNd-H-0|o@kn|U0}fIW2_ef^!GGEBSIL` zZzfR-*^-GTo2df&U^_5fO%KeViR^vlD0d~D*mw0-5(7bQiMUyJI`ytDr!A|Zd~ zDdOOLr|N8Bv%BkljLbFuG2?91_%-h18rWF#~VPKE4ElZa89E&s%K==AQnmSG1adlrF z`UxaGg#$`DR|$6M0T_O$n&ziPRVtn&@Gq#JO+@gGlN^Z!cgE*+kEK+Oe>da(u-b>Q 
zjS^3=c9*&D+Zv3WOxT7$NAIB`ZuFt83-py6NU$~Kn*`Zvg*7S8ZnB2fdpto4^L#5=MMbvESX^D%Bu{=I}2SIw85vBze!$oOu4n` zC;FaPNv&ui5I0+4)Y}v;dOvj{X{zT0`u7s(1{uNao%DIOIqB&lG3vEjK3BMHfKJl* zmCmDM8iH?cch>8icRDz%L=aN5q-xjfo9YSP4xtRwc&0><$@kS|`q6gO&88kQiRYfa znycgC@-;K*qU48BTFK||N1o3}68_-qA*NjdvBD0BsgMB0J=*FlEjWib8;rT)t))#( z%zRszJwnnCYjdf8ErV__8ygG0uUTjD`(X>Ur%D9sZBnTEm1?Tq3z{MQ*xXw!!Cs;*+2}4ggFH4j+ zagjnfSNOE}yvox^nLftY8M}UmHHDSsmE+4+tQXktY=YAYL$D0S(sDvBIe#Cu>|;)hvKpae0}29Ar1sWyKldOI7ZCR*2U>lvAs}kg6T6J;wxH)GK~>&)!du zFx)WEW=rHntvP#P{{Sc4vE$%S`DcFdPJu$*2&sVOHVB?U9kmR&qD9|}v+A{yXINvq zR6+F{wepkW_NiJ9=l_VE$e{z2LH2>*NHtnU+W_9-`X|6U?5nv!KCnDo2+ZYvz*q-r z8`%d^dLn8BK<7RQzJ1E{B~wV3_#M7M<7Nk4g=M?{lh^joyiUK~3(du_10ym^6>|1A)~{f1;Vxg3h_>4<*An5Bk4ZPrmMJMwHz7$r166 z>czrhm!c6__^o^fnAm*`=H>~DjGr71wU~rI2P5{H;Ms-w&2?w7Z4FEq*j#CgU~~Bv z`sw}s$AQNNKsmx09!yczGxZa#)7%llm-?7IZ+o7Um#^Lzt>EV`Dc<#%@2g19#ghjc z6r95oog_qk2Qil5n8pwuz{6kyaC_V{tdQC+YGxz6OevcB9KXLti#lwXCicR(?6q-c zv6lS)8MJ0QzfEczpJ8fo0DG_grVo@f`115#l(!Lh$o4)59^aR^Y2Hz7+Z>``b z(lDMSgPBV`-{RN@+^8b#1M(jh`C)r8XXi2UDDE#J1JWkM9}V<@-}XF4B6Frv+MOnE z9eMYl{Y1$};dC-QvhFC2)P&`Mv@~dZ3He}b73t_fQSWnAs8YS3U&E3YOf)9a)tEL8`1uay4eh$2kp6ZhsSf5wF)in%E#3%xZr_4;deZ)?=O1_$m@;-kCg z?&&7f^xb!k=;(aXWZucqr;<8Am?TBNAe~}@ynoLjR;kBcsNWigZ4j8Y7=<67uc+fK z`-q30=cQK^@b`^r)vkZ*vyoHqoLjgN6L8!vJ5jZ)kNz~>-&#>fdZDP|ZmfCMxytX~ zGl+w04MJt{_wlvWs_qWTeyiDxAPvkOrZ0%l?PzaSBwGf3VX%nctp6MVQ@b|J&D*>$ z6vOMy-!SBb5$atZnMqVy_4r~++1<=|`%+#@>4(WfIVqoX157`*EQw-31MVorQ?eu3 zWLQg~IN9`tHROhx(yFbYItSOv#GH>A(+V_u}0F3r%0+>i>ap2*Ks z2<-PRhXo~8#X)JP9S$F~*K@MnKlPs)ywz}URZAh={*Exb%-Og>8uJ|N$@=tD%xlzL z?ENdW#wIKNs;d5t#J)3qQ`Hq0N-VT|y!N~=DtLeV;qji}eUeS5I*uCjHxy`6)*Q<7 z^rTks>B%lbrM_f`DhQV^tr$jtGSbO@+lqzD3D__g)fKh`fT5}8*#LYGP}jEf)BmKS z6aCR4IV#Kgcwuz>goQ<$^XhVAjLebXgGTu~Hi}--3>(;D5Q;mZjC9OiB_vSh0v3OE zSQ@`StKX(dEhqkpnx2PoFC?(VUWjkZjHkv z!_voi}kfEOG_&8CIcm+_5$}HZHf1ro0;dOJ*y2IyF^oQ=c0)#5eVvC{Q&F^ z%zj9DN-Ipw%fVq18)_a|ol1=TI_M+^e?IYU-F?CqE(>+19 z-@G$**xgkj3|Yzna~MhcSfk9$ZuUG$2o!8sYSNTxJ@!RgCi)&!q1=Mw_b)N~PUC@W 
zdm8f|I{7@2E-bpUn}gZ>`j6tvyErgFj1P;~0>aoRHrGXl2O7>wuHnrEfW2~42a-R| z?BgpRHedhJg<}>e@g@Shqe}Ppjtmq;ggu|}9xBZ>Zm1~LdHqz++@avEZ$*hIx|=9U z^c-(FY*a}{_nOn4$LEJS^egk{Fmvc%)$1!@g-d1t+2l#w3#NUV7!)~~GokCH-w$e$ za7(F&#vX|q50%0eI*(+he!BFyWyVIBj7Wx4U2x&-T}(;v4vIIXLB(VuC*6`_?Ul>a z@X69CwDo^9#Ft1GeV=gNKIPf(rn&)YXAnDpeDki<>yya&n4m#7VV0{ll9KO zx7NIyuQQru--bWF*=yA`SD9oP>#Sj!PncT>V`j063C__()oF1i9}v>}44C9UIr!Go zL)7;{nbeQoc{-y2i2JsfKGNL$J%BD6WBfhk@sqrk{{23p`A-kQh~o4hV2UjQHWms} zW?X>8r-y=jGK|!-x?`8`7iE_yvvG|N zZ{EaOJ(>Nnt}{}L@F3pDrSZkbCuF=Gop)fUBUB0YW%zaf>8|Zw4U8tcjX*UeM5LF> zK8L$V@0%W~{lI)g<6p|Wz^`IbYxBN7qr5KZ?Bf8=w-p*uVrwFSv3Z>tW;+~|a^V-* z-5`GUMAe+9g-;HLGH)>Mcxz2-;F>5dqqk|ehA)_tx@0`$!@C~N7`eEo)L3(%Re|dW1Zw89#j5elR$<3;CB5!I%C5RG-^vE`bwbWrI|Z z5go+8je3C=_|OR0t`m}xQQ;YhXSpstmm~(Q$zC+qytaJG3g2T5$mj#u4cb0-p#84(mVZ{S;eXTna6iX7t!&*&55X=a>i*=& zHZV6%Vc22+qWZvnE&0i@sbZC$uWri2^x0;If+}O-B48Lqz5C%t;K0sErg4HS8)gh9 zR>FN*oG|a<*^1Awty|dv*d?t&?0Uuo;kwp@;_7{EHVsC_b*232C1wy6d_6dZQrKZ@xezOmD;5N0jatrX?V9xELpCZ6PtS%%`wXrYL@vzqDooEc} z4O`p{eI8;_SDV+77PWdxaXN!bqOanPLs1W^cUSD*Taoa1f;ax5CHAUBZoJDtxvQ71 z!wmGHf7Rx^BEN8~2{LmKUjKeL`8KnH6-t~6&ikv?~H&B83&1&wxV{h`(gOtqztH^jE3*T zB1Q7lN-vRQin3uvdW*9CQptVC%?}4#<*r{f2`1h+e<>!o*?KeN1-Oi<7?ob| zty1_jD_aIH3wCw`cL7={yn!=Y!|MsLz$;(jn0)+i`)l6({#iMtI%f5->qf|nIvGfR zN<6gYS|)^T78V*-L~)K4CRW;HFWecwZ{J_Ce6Pq_Q~F?#&wK0S>!rz}jDmtpmv;GC9jiVz)*GVFIj%E#EKR=OJ~Zqe#rfn&PIa#B2}#BzpF<%TlK(ktS!(oZ(5by+GL*4k6g-}DuW zEQpaCQz$R!@_mes42|Js&kM2)825mJ7}}XeVjrM@i6s9%5B&q9km@8mC2wR=$ty|z zLzesIoi0J!3B{m{2M5Ee(K%Q5acnYnHDILsYW&Fl%J;wGufWZFJ;Lo*oD{jZ0xR)# z0UF$b#kT_UTiS0(oj)D0U(_VXSzOH6D5?46Ed+i2-4dW|QU)vhZ(HwQy(InSIGi$g zURP`_Zp#do$o|#5H^vO7f6R_%pZB(-TR`~$35IUakWIL$q$*l)oR`}ae7EKOy3zEH z8iFVXcRtZ@JEGpuw>%TMQ(7B1NCLs+;hFNHxK8VXj)74z4Hx3k(eq4Q;Ui5v+{52T zFV$n}_;ZT6gY#rrg}sTxLAl?eH(hfd?6Qv8^x#c}je`_x%X4zt-+77hrIt5#P1}lY z(r2}vB~Fiz`^izk7GGFd9U7{xM4VtZ!>D=$fj@~}MhIf`v`b~4xA|fNLBK*=MJOw> zbJdYA{c%vNepzNi%m?i)bfxH*#W03Sm_b;iv6)MJQo9M^`?1YjhBCf?tJQifosI&_ 
zSA*8*I^cUO(~-s>h>zYgj_CmNEn<<4-sugd$`zj5MM+qNr>NLIvFC4XKpM z>}d4ZhhZFW>a3K^h!Uz_n4^8zE{?) z?Z>Z;zAN#6c>HU!bN4K*A{-dhTJ2zcD8%a@Xq|)~fObwdRCpd6>sIuUd5_G%2|nWC z4BwoOwLR2`u9c?;@w&!Ei`BI%5-4m8iX~%0$$sQ`O&?_1Te3enT%H2;g!xaTy4s#@ z#C~l3F^?X8v^%S^2y+IR1`xMEZhAp*_o$YNVpG3k^3@=5VJBM0f!~W_)m+{QU-dk4 zb9v=kCS8J6UAISv+DgW~tW!qw^(v#7>&`jE^>f^-4uF8v%HZ}R+l zSz_noX!{*mqHT!G<3@L)OCqcfWuDL+Kb7slRa9QR>gDjEOtNC-ewOt2?+mo|c?b_Mk5X?*QE!5L<-vajY5r%=^G?Idm6`nn z#);*h9P9P@!=ib^GV7M$b+k;c`ja_ z#-pc7Z<;<$I+}1W>pAQS3Dbr+slzkJ4ywzyK zDQSJo?V$P{DVN+=dAHQCd*>%k4^?nODURt7FZeSwvb>}331FujaejCCV3;n_t&`sD({7I1MGSs~Z8s;S>k!7m55I(YXIlczo9+;;(>vTlNTW ziB5nY27;~^QhsviWt8QwUm&o$ld>0j5}L8X7(xa}tg9lfgM|OE*WUr`nRIB^Pmc3O zRDD+HHjW){F+6vZ(Nd@L5Kuw!e*&$*hpQDAw!)mw33eAI>awUsc8h<8MdFP02yS>VTrKAXoG20sh*e0g+2!_j@oeYi;vYEuXN5#ZZETwU|dqYtbk)yc>cf`)RJNqGbJ?dZ}1Md5Layc zWR;P!O6YeCKD5#>eDV#pOm=C08pqFDQOznS^XuZDQs_JoNV6Gi?yk%(orMT}xPlga zYK!WDZPXNw;y|lzB2W#u`Oy2Ihwqt}%J98tuJ;fgtutle*YU^taAkE}MeKR$$uy-( zS-LkGZyNHz?DczWj8rGq84jJ^S1)~oZb#E-{frF~J__;`XTB2~xIK$1kHRg&{Z0Ce zD$B2llsB7+hHC0O;P@V^5`CYQ2pH|RYgu(_hBRc$#G$wbPt0$SE?M`iH{xz-l~PqR zekfOQBYjy6y@hZLKgK=&36=0i@Ge|3kK=jsi%rO<9g^azT^YQ@cMz;Xuqb*F8~4M z!$|;Yvh8WuPK%k+J=|*EKVIYE<2u9o1D}>6GV)$&#>Q2?q`m9(-rshB*lF09tL?Y? 
zA!Rl09p7L%TU;^L@s`zMw50aRrJU~G5#6kLiW8nRSw5dFWK>x+9PR)Uu0haHWH^&2 zq-A0kgr&}E(d;TjH@C15gfGoA?4cC0>5?TwfC+<8DB#_9&q^KA|A5rbgUWy0EJGZ2 zNc8nc)^@w#D%a*F(xKqmx>H+t>fY<)=M`5-#XXdiDV#9->%>J!mUa=%?KZt~z@DBa zRytH8q>!yiJDlWt;T0zzXPUONU6}oA;d8MDA(4qUK0ROC7Yd(Dm=LXij2Yrk|9wby z5*CrX*#HZoPwWDU{n5t{Da~XXA|26-CzpRu#3IW-Js&`MMBSWu#c6sOGoml8X|d6tU)sOl&q#>NQy%be>MT7xU<# zwp%j27%yTdVD6mlGw3pEk!;n~yj7&Rh4O1>Sg9SKnhpuHu6>89QUmi-$4v1B$-~y0PP-ig#4suGGQX+9&F1A>9L-G2367?InX8D>mIaGIoQg!P#NUF zn1iu%L;G}Dmk-raSNLbUl|4?s*3DruKmGxg@5K=j8M&>VFfo~5MXMc01m=vxv zVb!LT0@~HX{S?osTCT|@RgZ+997Q0{p40rasCxg@vO3R)E*tn_4Sf6PTkc-R|BgwT z|ApV7zK2x`fjyAtjb*nvk+wZ}v+lvEcjlP7lUPUs7NYjhnz}M28&M_;U4?1+FfYE< z%?*_ixZ`lW@){oVd$)Z0Yw!9lS1h%+?lc(j=V1Vm2QF8(;`UyI1WPaZhlQ8r$m+L0 z2qUUN95ac*eG(0V5a?QyNyo;3cG)}fA#QDI z)H`d%m6zzt>M2~hq902#s%^B-=6RbpixC)KTnc1v|d^Dr|4h9>cUzMZ5y9 ziG?FaV{*q8f@Ho}6w=c|-|;y`&VEOd=bI1V1Mu1q1;6*hcIO8Iu~czA{SKUFn;%{f*3mN- zm_FW}l*ZPXUyxdhp0*1ugYCx*q*L#Vn>GsrF_6H-+p?pqONL*ox{+DP%J8@AA)l7gKQ zC-#K(`J3E4=Od%;VHA1BN66alL1wJRxH6Fx4WHk;y-{0G8I~XGD74I;(VYjBk7HgH zI#w%53o258O!x@+z20GEd*DK?cxi~|E_H&;-yzdHi+Rl1UcFr5e0%kNrzE9ezG3^& zbHO*$PnWrVIvsq?Qj6jU`b*H|{>Ic1#SeI0F<$4&D&fd>um*7SB_1goS|?jxGr00a+5AoMVnt|wTaQzh!Q%uL$5_2lHh3_5A-HAa%Sr4719C*8& zTw&T)d`)&J1ESvH`U7~&le1U*N89!^(Kq?n?ykYBGZ!L`s}v#Uaija_?gMZ~T%gos zHR2Rfn6uyGkY1nV)9G6WlafEA2)*Jsc?sL?DqD~5qAFge^!M5&pyw)hGQPjae;1RL|fF}!_MFfO?!=j z1orU=JFycr_LRt61Mvw0hs;%{e9W=DX!&qM7;AKUgTxe4RDsbcg0Qs=@pTo`v{tvHqIYbm5 zfqK081nB2n)C6&i{>x-5Caa@MFvKA~j53PCBP)@&~*j}sPnuvcMOzjhPvB#e58J2ah7{SbuwTqg(F zWh}5S^wp?Ng`T6ef>0~rvSDd`-P<+_RU-PrP4{@=?s^d$uY2RyE;swJc;)mps}$Hj zQq$cjwpID>^3Adb6=(At7fVg6{Do3q1J|S#YHKqIPokJHw9vMwT~G(PCm8Kd-PDKt z$>FGz4zsrg70*Agm`z*OQD!Y238{?a={nGbkNSGiiif7!1T^}X{s0rSr(%~@J}~z{ zZ77<+=PIwDGB^)c&!GW>kr zo@;^G3hbx3O2ZcK3m-{o41`ygrEXZto@jNMzpmszIKGVWzzkw{qaCJXUm~b?!Xx1g z>JoJ?TGIC^=*l+sS%+t>#TTX-;G_LHZgykq0HK-NV zQBdV!U_@suvZyZtaeWc)ASPXq-0$z7gkS{5EoX_M_kq=WK>T@V?mv6QEvP^+NvBVS z^CE!DGj%XAPncZfY|?P^&6wUXh^ 
z_Eg2)eG8+EaZ>w=zVu4vsy_4R?B}{O5S#(?&H#?`Lc{ApjlZ+<@j{1n^2G=`Ym<=lO02#RAoWEv45erm;3E5Ns2O#D_w?as}4t>!vei`78N z3eNjlh~|iumocvDmubkgPkGgtpi=dPU!fYLtvwqf{QTHk%Em!j*QF*IE-5*^ z9jr&4*YPoDDHzQ3m-eb|T9&EvR#puYCf|Y_Bl@|zD!Abq5us_jsL`Kk1&%7Mp=dku ze9aZF+z!9T&LVw<&dW+;w^CBN6D`y9m#q(U-xOKSlgR7U(C}eVK-VL7bz+&4TR zKe>9yKNSHQbM;La!D65Xz|+!mb5QzGUB8EZDCSscbw#|bYUtykLWACr%=4!ir@C*g zWeKg^P2Q-znGY8q5R)>@l`iD0rf-Vj@-A^h60J*(t{~)%v7(ulw7AcRGCRtyZ14X1 zC#^o2X5Q6PVzVz5(vRHV3ggjBY~OmXjb?5=&?>)r?}+zlRt%5?Ywp9yG3{w~jTVLc zmeiFfG2f^e zWsMXSMy^Z3%|Cs)gwu?T)3H1IrXXf3CE$`;@crMe?8hAgw@N+K+5{`!*Mi-N>7C%} zUB*XR-0Agz2l=KZ6xbxwkx*AlaOb|qgAb$rM|*Dq4fX&3iz|u9nq5pq$zHM)GM^Gc z$ewj7A|B4lSQ!`QM6Gs<`~%jdqj|L^_%f9HR{|9j58 zzx%uQcka38bac-9Fw6V>dcB_8W8rg*7U}&e)|ZqQzc_05P)Q9{rf=BrdiVC`yQaaW z$@fFc!M2I%8&t2AZ#@;y1hb}V3r7vmBpleJuA_SNQ=cL8SwehPXTp6-7n3jC>z7gW z;HZkgDa>+NZ(^mBa~@20jx!G<=eTaixe;$|QhbXR<_!ah^lUL)Qb+y%9PSy6@6Km( zaFqH*t}lCR^#r4WgUJ?}mB=ijwuI%4${K8~frhqk?! zlH1_yKrk=hI>qm)tmUnmgEQN{+X|Vi3x*A!r#7ZoHH=uZwqX+z)uJ!K` zoU5)~TJYBJFl&lOJl(W#@+;mY&KK|8`{IGQzqKsBC~FQqFq~-<-SeKs!jN!1r&-wq zyZEfQ)z^9bbwP9DrH(Ls(bI$itrvTjdS&T-Lb7a9V_s(`t(S7@d@#Y9bm6=&!MP$7 z_ojC!O@L3u^3ruigG@y__HvWw#+8*C&+LVN|7fnbFcp)o(ds_wb1MB-$i)1!j}Cjd zFS^=;fbl3|K+%ldd^`Lb&Ud4&*mEq!!)>JTt4xx9ACK4#slRl_+R$Y4lJjM_*_VVW z=`QgSr;vGKVCr*S8rpl+Bn>c*ZD7%19(4Y_xeY+fLR6Mq?Z{>djmFgJMs^;9|0REK z%jWp09g95Rd?%iWy)s+2x*5)9x$0e+ZUaD*#&Rfp8m@bElzD`9Kg3~kcIg#G8W5)e zDkGpBb9i$DtNaeK$qO;*Q1#5H`O(IrF@W~)f#LU}Di-r&ZfuAg{_u#$dks52*}*&y zJ*J(aYmVuj@-%u&zdX{Y=^9K_U`yy0uI=Bi_N`L(7LS<~WM=5K+cj@a7mS@voFecy zOE3p!_^Zh^weM!8R~&+4f|`^~-5P@nzvg|d8+=sTv5Xg4JabvJ$H69mP$!YQ zx?t?^*=z(CSX=@Z0Rnnd3-#S6$mm*az0K>)b4G3+=Z<1jx$Gi^52PLrxp=x2v=eP1 zbg&=&AlM+axk=%EzIU=l#+WjGO-os8_0{$Moxlz-EpZ4L{_j?RAaucY(7ynA&*5ks zUVnnS`fni=4QyEc<+-jHAUr5t4E$>}OMkF;!g=0UxWXi(AB>V8VHjQuOvx*_2?~XN zqQN`lBPRIw8;+>ZouX3q6|1O%Lx1s=o?mmw6>+Yg#xjDFK`AJ7O<8^iyyAkU=ie;O zpriUDnk9Vue|E+6f8y)8plMdzgniqD{}S^R`u}F}GT%L}wfQeWe<1#OSA14H%cEk! 
z^b6MpLvQ{#Xj`zuJD5og0OjI$$Pom36AH_=5s`@{(fAOh+c5oyw?EBPbSCbD&qYI0X4=)6*)g5y0kyn-xc;yqv0^*>2sB-%hhYBn+!ryvQ9e#a+opN9uk8dip)I6VQ|Sev;fAOaM?r zxlGADEx{Rkq|C=}f6SRzd-JjLIXfXlRAekF<1F`y6K@P8eB5XFG-4FP3BQY(P@#HK zx;nVosbv&+ykE2A&&&vEEtP~uvAMSVRSN#{P*YWztDE9=EI@Y?UcaWRswU<$teX&HvEvBjD&3J+M=9LOhtQS{P>D z!`^UK1i(hN=;0P}AaKzET+HfaL4jc!^U?qiTvn*6rKNY#&nGZXQW$IxMk(AcSbm%{ zo3QjL+856f+g-whnPyF67{TTY4jZrPx9G3F{mh0UX4Glc&Hw0A2G0M|Oa9;aKPZjx zY0Rr;Aj|dzO>PUJhisvs2V?X%i;OJDqHEEN)A5gV;s5Au?T`7h`CidFz>f&Dg|+j>L`XoN-WnD_C2f1KxT>!kx0*~Y+IA{@6$rT_pm*9x z=!MbI`0}!6C8v)?p0gyrels)hLWVwkCx5JO^bjKTzT@X_s3`|r_wrIdUGNw45J6;p zh`ek9RlAjm`B#=)wU~EN7?X(lJ)oFXPPQvEi1ET zh@u-x8+=`gmdEa-Ty3#skK{OiY`YWzCkXoffBm}6Xiy2nqQ7aA`x#NyaE76J*53Yqa*<_MVy>V5WPXp6gk&z> zNARE*V6Y!{-hMl$_Z$CN?*D41%9lC(z%6J3Qmyk4h^K=4cB=oq#-RTvz3zXtOPlsL z0tQGvXsaA42dHEm^M(FjFxdYS`vmvtS<+%xt}$}a3qaHcL`PeY41wqf|L2!M(6sA> z*P!WwR-J^(BDxTqihyA6JR^kb$(YXSk-oa%aHRIC2JTMm_vH1yG#SsdoyM;VUsnK{ z6DNb7{frPnU8J1@+ua9#lkhb6!Y~+^Mh1SXUNh-+U%Xmf73vzpcjuKDQbD z8^=lbiIZ%!zyD_KmsiG6Pmt#UKlcx#Wk&}-!1wePzE<#vB=JN86orN z22d@z>x+2D;>X6?PZZgBpTLq=vx;9&UR_QXdtRZn!764e_Emis6y$6-7+75q<~W*c zhk#^(Cfn~ixoH+1(Z`PHixi`IWQ1B`|T9Es`1wGX!! 
zEvkYa-e9y-c-zrDkjWQK7ln~(lE^i3Q)0?+c~C;b_}d{8>9U_DulAY5+lPMnDT*9j z7D^^tQ4^qWnla?Rd;g|mG@JE2 zwGF&!!LsJT#?qnSh+K~e4A~w&>DYlFo5LVkyX0o&5EXMWvzTbk$%tLgz)u}9b(O`0 zu{G?*9sYEo)wumu*D(+2Cv)&xP-qUhfIe^XEH08M2-z7xuV~h&fPvpV3|K4QriRw- zDaZVZ>FM|BF535BX|Sj3S_gaXzd~{#kONidZ;#B&j)CYoR8|n-hG%xdLGNTXl6eU1 zJm0%Ipxqw7wPu^~5ZzH?lAV#_I_Nk1*k`WRll|pDn?MJLCu0BIvIp;upKgTA9MWDZ zG~Wojs_cy)OQ`arS})#NEK4VOxYwWMv`Ca2w98bo>mj7K{0iMtz-!te74XZO=Fz&c zUKmzDK0%&qC&1`-990i>664_+R^&s{ zbs2A&FuDWerkU8NFppseF|MZPTrbvBtG9aNl|WnCS?`zA@Am6Rw%7JNd2P2~PCv>` zadco50cVqODgyzHs#`yCy3H1#2 zOc|{h#i--#(mMK{orykI@_KqmJ~g9R4d&(LtKB3QrHysVA`vaMH@kIBIfNk1J7ZzK zlY|)~XJx-35>Pf{BTKaZa_B0gBO(3wMqve0-WhAic}XpIjCt*^sGX<6uJto2mHo{Y zTe2yqu8xm~<4C{Ar+(RztE1rJ%>q5)^XZ9nO`lwqA-o@-Vw!rrg!&QpS+%beRz9D4 z)#R7_h;$hFsvkI({!B~9L4uAlt01dHyF$rC-q?r>wBUH2OfGMMH_RWk{9Nyvy*d_K z__o>(>;^nbHSfj=f!z;}{$qRf#29si;%7qEj3qV8^_JqjNW_II{n5td(UUcD)Herm zuf}p7@6X^YA(qfL|v%naRp{ts^JZL^;wq!%voqRqvkYdJiSss2wE9SKCTgAjh6>v68m9gm~xp_(3|!=AgXzA$aN-Gl>v!$r#2=4XC+ z{#?-F`Co(~H{|qGl~LYy0Q1l^?TR;|x^`WKj3EZcDFrqtg>|`zR*YgvzsDDn`os#K zYAc#ZH$)vEi;d6MVV^34tz1pu2lN(5@l+)ZApa=91;S(2oUOlba@7I*OueT#$Xw{^ zqSPM2K)6!TfBJIb2@d1)c8?Q2_jT+gGdVJW@xik{?(yrF264Z0yyyOAflgu9r^y?< z&;1!d=>=6s|o!pJ-6A=}9eEAE5{wuviYYWe##|rX z?lp53F|DehJiw^!Mbljn*`Az=1^GETKWneNsoyPru6cNd(ocFtM$g4|II8Rq$po55 z2p(wOGDNzHOf4dME124@6d^yBKB%~Q(p8#|k4-z7RejeSbbdOSJ-~vV_X{Q&Ba{!J z=iq!On*m*U?~Hn$+V?Bhoe3+2TsK?*08%<5Fg)1Wjn=yb>XE>p(x1L}bN%-RoY$FlV_=!4e%EzchNP<5d@)eqO+(sJgL z6KA*}S7JEp6+nb(RGA}rbXb}Q$K5Jk{pjHA%_|U-j?7DWa_wuWSmb_dKC^x0@HTWU z<_kd-bln==@=@*(05R&dVg(084F%Ok(5n?9PDYu`UP>aF(VzD|9eY({U`G6;^4 zZ046~4<=6%R6MJY9IeZWj@>hFrTOtYhFH;rdDq!+@LNy%H6$YXt{N>K6Z{ga%!1Bp zSd2+6{ppOzEjDxdCNeuvy3&>$&;KSOt0eS*ZJERObc!nM<=29NAkq z2lk{{ggGoFIMkyD);%X3IqIWUa<<*}`iCoT`3*VGBz$4H@8S*o;rZwXD6*tW;LRQ| z5*pd(U{AA_ThRg`N&W47H=s&Gfozgk#`y&mSh|`@hV>n7H?yN#hU_tx%Ox??;A%q= zq>87Ne>HVr1+B31DL))&!gqo3HFllD$RG}?7WlkUM4@!rmyAPQ#g zV)tnL2x(4!cCbjN9eLFDP*6xa@igkVlZE3&N-i 
zfFT6#gt>4?5eT#*OQt-3v)B*L;nED;8X8~Sss3ft_zLxUe(1p`M2x_A2Tx@YG&3kPgevrY zb;2sti|8q1aUval_xa;rskam5Y^eNcm9JNvmFnE+ zgXWt}*(Q0$q$!3A8BKcpmd?k>qO%ft@e;(MFPKA6T8FTLVpcAOc=|Vi+iNgo$-^{b z@A@zO#VQ|(`!bSK``I39w!`o+u-ID>F&RuXid_;4Hiuz%e(Jy-Y&(i9t@;-AMbvoe z>e`XeN0FcXvUS{Xyt>+S9*}VYgkVbx>?n$D4#5`D-FzzfiwnnKAmLC*z`RR(s77MJ zT`oVQwB(7=0b>Q3t>)#)7k0;uBU?XXCs}7b;@}?1kotW449}RZ3`!cx&RZb}mZ9fh zO`wHgzs-OdRixf|zCt9r@rcW1xqi$%e#hz4h4<;c551~Hx=#C-Rpr>2_4oBL(kzp& zOkGY$)W6iQRR~JU#^PVop`N}Z^^{?P zJSsvM$2)2GulhdkPEh)u^4@c?-YO&6^#*% zN#{)MispKS5{;#7f2yUo(PbN0YO+S$$cSSr zdp@oBjpa6n#qu)l(t==)g|OgzrPfcXp5N~c#gfu#=nxU=Jt%-8Z9W75Leor@5WAMW zqNl%c?5pC_b;IQg)k8~-$wBeoGgbI+B-DI4^jG^CR_R}xR1QW=gbGk3B-@qoc4T#y z(YR6P8*ya)>D)^4+U!O#y1*smiTMe`>S1_=Y=Td=vh5iUD^Yyeud|}?bLf_fICoqm zx=NGIFGBqWY!K5fk;me6)eTESvgVyB>~TtYNDsYX*BX1DQL%iBlMNSpmibebv@0aP z__&~)N5oSTYLzaS?$~CYs5Ql>HRBC~gGMq76!q2B16NzQDlHRoW~dJ5cj+dku=QlR zfFrdL9OsK*#JvT114$aHmQ*$oE@p2Y5QfZaQ`9)C>TJrvUs`c>7MW-?Mf!H}Gwd{z z@Tcs?-!`8dLNIdSu7pTkG3XVsKlY(%^QU=k^9~Ug{q*`NMWo=Fn1)LIn-!hGYTx_p zTRRV2TOP&DP7-!6Cx1cH*gy+&rFFIW3^*;4eqt(lNeFP@T!vDLCg~ASA`Q?D0pzkO z+f;Y0bLU}tP4ru+yf;dzPDbS3w|l^bFyuMweV^|IJl`^exh7s&S$vlimF<50+f>#q z%x^$jBcm+z)IGPLqh#IDhdmiuO+iS6Z<(`f1u_)ofo>7U&3>3sU^K@9=F1+>es&|d7$zK9(14k?(7=^%(HGmoyo z1$8A?F??0jbmhjgo*|D@kKFAX)lM)lDE*trd|A!yGC&R6X~w|+UL_9CYP`C zUzU$@l1JJWA1v!RJrrzI@~hioX%YE3pGZ)eo~&~Cagz4gOpZ3-Z=@M^y7QdpH(!!W zUD>;a)cRm%obGw%7@*0c-LCrU-iIHpW778IK|(YO;?U-Us0C1JZEzH zWI$ruMQ?BdMOOk30-JfDJDCQ+`Fu4)1{L7^CyAmfdgGFNMG&6Ti=k48{w3eZ+Sl`0 zbTE^P%w)D81egG8*kToN0%*!AxY87u!@2UG$;~A#%3{C*64~qBQEt0IkRP)d zo^2|<-7JNzTv)EG2JMXM>eQuh%zEvYP5jQBCP_Q^{Ulu-dJDJ)LY+x*C+QJll#{uo zRO7`+wRbayr$tLx9L4bWe~fX{L{Wj1+D-s#6ak-080wl8iC5H-C3*~1Y5T=GRPErC z61~K)9~&LCyTwiBeDjW_&i?$Y-J-|(z7lT?@HP{19GJuMYlL^C&&JB zc7P&z<3#u`_Ld0cj&?gvKE<9iI3VO!6Ph&Z|M}~JRhcDUk^GOH)F?vwyya`;w7tm; z(nYI8y*TH5eOi$bt=#6gWSDV|5u(yADdWMzi{hVX$+PguKT@(C@c7^vHZrcmFQ!?B zQuFdx!E)?x+!1>hubMm~J}jdL0C*P=pNNzQK-#&u;QdLwukfEBRH 
z%e#UQZj^ij)sJ0+V1A`Nd8?yw!A04N=|si?Z~X-u)b5#bF+HU32M&C-EN5Tz_OoTv zhPzXqb<+{1zEuq)qJyPH2I~01>(2&P>k8j`>`cR*)yB3PfDRt+AoGV5S3-L(E5Ple zkaH6YA+958+?L$N-G+Lxm(&!$+M6F>eg1IF6!8M6OR-0TMG=ju#*$WM(L8v)JV>t% zE;OSgd$mSFNp0bmpX;PRphd;$OxDCt?|-sIV6DF?-!Mg)@56DC6~>W6$h)C@D2#b3pN9uY}@9V^-DO6s?-kTd&Wu50)(A@YzgGJ1_+Uv6t;I z1H;@85@ZoZH4w!8;blk*pdr=?|Kdx1oX%jtM-6Ld3N4;3imTcMJTPe>3>8_vDB$S+ z@W8^(uVQjrF`^SeHt3(gFb{3dm-GvwPC^!WkVkvw$EQ5S7q^$nLj9yfg(4ll72aiK zKTdy_-O1|9dDf8H41$A9L=U1ixdR57Hp`);2T_-Zwb3^*R@KAU-`NQr*Z05beNEn7 z{~)B-Qy2N+{qeu{5o$4X!8WF7G8F?fd?dg}2etMgO}ZNU(&OR=UP!JsU#0p*gl7eCYOlwc^6PapD?!@U+Ca~3g? zf)fej?6bHeoEr!8;3|fjelCEVJM;C%rMVX5;w1h}pVTQ!nWTI6ug#vFWq;PXR&W>}_OV4s)*@mAp zYz-=|j;bYppI(Zwv6(!EKBF3QSd`Tz3_O25Mm>CP@*kGbtze>j=>N5h&ZGw8>9o&C zfUMoMT%TOp0IEzAlS$`N7JTc!0l58Eh`A*N)879qBSRoWdu}x40m5Qr zLpd+*`)&QWmj*F5=4UQYX`w4&bb;kr*t$14phIdu->#ig<% zd7Z9VTb@#j>9$7u%o4*HrQbYnOh#b~d4|M2cJ!jz}UJr}{ilK*u8{+kM zw`z4S)!%q$?|f-r{Hf{5$jbLT(@1Xz z+p!+)2bxkD=zVm1n&6EHpvL$(Vt^cQ?QC@|$YpD)*|xf!BtWaq$iCxSdw#9z>)$IK z;cWWysX-FyrcPPDd((w@zXsEq290izxC~OamTJ|Te6Ta!%TFT4;pA)Pj{vry>oMx$ zTz>+ZClD*@*?dHrCS|1T>$4fSJ)nU zf3*1kF}K;~n?74ePW*_unjF z5#bR`(J`tXlt{6oi7*Asp}*7=a{PeGB*vA7PK|EFYTwCqxexq$rD7_}EYB2K+USY| ztLT=mVTm^9jjbYMR=ahbQ`dtq2b#Gc4{89UKi9tx(lljuk&B{APYnLfhE{cTQ zz;pu=Cq;B4n>~Tqy3l-=IB;LsL@=_M{B)K^2=w4NI6VVG2<^W@`ug>rw0)crohZ0^ zG7i6ZJH`h2VNJTCKB_eqxhNLyI)NkVx-B!-niF=jA8fcv8OBoMrxMb4eP8(0H&owA zr%A8CC$y<@kQhTkv;8k%UP^fhr4jkUNtZd0F*;q#A8 zwmy5k%iEa@9rs($q+MqIK#;3Y2(n@89yU>mq*Cr{>s{h#Y7q3I+pWGb#=5%R!~a@L z$m?z%nVIw0()Xo%ozAyYU5sQ>ol}iYB2%wUzN#9+JK!{%2;q)8ne*^>(iYbgP4QyMq?e0}l;PU+ey2nxvfpWZsL@1))JBP@qs_r8tjfC2mk&1nrG zS&BNhK)0pvJk6CHR_`ZC)X?-JtFC=yyzWMh9TZJkdnhU0VZ-;(O8LK<} zzBgF|{X6c?*3L~-3hB~>K1}crDKZ8vj=Spz_LESf=^U1%0t2=M(rw~J)aYUO1m>=D zTEx-?Td`mjSd(Fwk6p1{H=JxkfE0?_2$0N9YmTfnMLnsQh!YK|_f(`S&0UniPF6QW zTYQt%wdULuXngGaIQ?>$v}BQ(vDz+dt11sUG;w;;xyp8B!7?E=_2t zfHyT5+_U&KakBzS@?26Hk>q&gS&QfUvc;Kf)F%P+7h}Sy9f~RYzxPN~%2%Fc5tf;O 
z00QekGsH!KQw@e9SV3Cx`z2FgMYZK3Ha2iAJ4aP?F}f!|#>7)P{dJx4MxTR$yN9BY zEa7y4VCtfyv&mA<(6+htj=8l}n!jaOf!ojD-hw-gKXcrHhwBbRrq zjM*Dj=*1tJay5_5-h9T0|IwWk&U&p91%=bc(Rab(l3-*;0F6AN>@yXqwpQlhA=!~? zxzuwq35ShuL|y99zA68HwVh{s*J_RcX@S;w1#W=uJnuJ7n-_@$V$r^IJwJ+k@aL?_ z_0&qG>i!HL%jupu6GO>Kk_1()60Z!N5ran(Fkxnt=E{L!Mg*(gGVc^4rm3*-~RKp(2FY@V0j@LpE>7~XfbwlS0#QjkNs z{R-LJc1R2Wc~p7@ZeJ(r+rex?0@`hLmC4&lWd+lh(1oiRlxe)v7ux*i7=50k{ChSp zB6;t5EqYoQ_3UJKt~n1*V3m9M4Ll8g=cbOCVyu4?n)+Z=KQQ$bp-H2T{D_c5=?viJ zU}V3T+aYSxa8kfD{b{@BYR$mieBUM-Uk^>w~oFy3!e@Wm)t{_6`CS< z%O?d%PN(-bW`BW^FPl%m=7LEQydB}}v*_r+zgfg5U2=@^Zdt3Qx_t{qRA z_8cWiEB&y`zhSD>@l4g6cVa13-Sfl@to!x_rz9R;F-fny3m0g27>fma>$y1&)zuEp z4m+lD_l>2xXVyliH6J!-HiWH}^=E+u0|rUQuKy`%I4OM?B02%b`5Pw~l3-e^@oDwm zsz|-f@F&U7M){Rl!mpJH&y5l8j=s~qxqUqVmPj@vhvhnYZ0EL6H$kjbsMDJdnMUII{lR)=TcvkcnZ5vqaOjOZ{F}fXOSfy)ul=kO2xu z6Gyj*qTS6Q+2wX`tOzKDtO`Rs+IMU_hiGQu#x$F{Na>s&E*JB&vG-)oCV1Xz6F866 zhlFUHsMCzBZET!)>=TrKJEK2$-K|0SM>cf{<2iJKK`jz z6zllzgXuR1hZVoHVDEwIFl6|(VapNaMuJ+F`N*k}ol0-l1#nKm$$I7!n$!Tso8%6a zgTNcDIeA|HbLz&-&;FnQ@1kp4hje-zFBF71+S;<-ij+BwxqwC4KXEMYlJ7qF^6xfJM23NW1;%S%c4mC2#Yxn;Et zgucF2#`^ZQ4wi~$#OsPOd(eE%CqhcT1kzr$q0BT6qd; z@au#2_c%?ngBLC1Pyfukle%SOuX3fz`T0f1wCNf`_QM)l&9>kB{}#*64uIGJ#W#n-^q75ZI@^5ce2zn96>7u170@#z{c0biQ#QMlasM3VYOBY#Y z9{QW5#5*fGH>F^=o`J=0p5$v;>JiWr_w(WImHG4ru$7aDINOqCpdj-%3teDy4hCSA zvy9p(UHQ8^X=t}=V_Le^1$|0ENC7(*ZO88JwOE^dc_~sF`24qbF4r~R9W8K?XZ9J} z;`h|O={;2*#sy;90Ue#xrW$Wktuj>gm|aQyRb3f3eer{iwW9fh#z*4uL_C9kF&DH*b#rH3v`dL8|q$ z_Rk?i3F8E4pRKTRvFF{&uOFUOYKtMW#vAf7r=>e1_P`woLbuewD$(>Ki1{cqf1`5q zc}nIp#p~s1;lydNb>C~l$!vNedH}3D)=E}?@PSvW_Pc&Z4qA-aMtqscSG~k*V3@iB1;#S zz4+={yvCBLk>*B;Q?-2`vj6J)Ryiet;jc$N6uzK^8o^n@pM7d%a0s19&iyAql`eaAFZ(@bLAe8KH)uMy|P9ns0KLYsl@n!C=i1{yc`Lup+pW*fj{Y(BjXQlO#^zqp8F70##pL{SD5Y47@u=G+2N zd;L>i2W}BW{XQdfd?%wN?`I{QC^I2sSq5+9obMbRXfV=vk!_@YGPEc{%D(ZU7V_ks z%!0zjrObxAwSw7|Nhw}40Kdb28AvGb>#BwaVi;O@yG2}^UkpM5J)43Shl7wm89Xu-}Pevmseg9Bp z^>A*OBPeM~;aeiNN?U|H4UXRkG@6UqgP$O3M{UjAEx42!cd1+IP$%yRY?t}?pP@Mk 
zM&q4|=w~WDu^4XTz5fRP^MsY+ZKhDo)zO(nnTd~11_Or2klDdlc{>HC@ldlT#SY>< zSxU-DI_2lgD?KMJj%);tE|wM#VA?#Vw@p9$S5KeH_V#cK_=N!zWr#rRPasIa-4Kzw z{N#?Vj^tRhsIN*;uX(GMU4mWPefnh+>lhPnjp1+O83g!ig9v|Vd7E}0RrD+dQOQnX z1Joa0>W3eVdSp!}5I?n`gt^gz;t-G4T7rNg%ENE2*Ivol!^AW1#gx30!Z!u>%bEtN zyOWqtjUHez%0l`S392F#Kt{MwWZN005qwSwSXO9!?fN9t(ArlKdkGoJU2h;HWT|c! z{pbeqJVDSmhj+aa{H;H9izW}L5Yr2ap6Z?qne4!bmBLIYh(x+8k(ZP#u&Obm*K#x> z%s|>g-~#6r`}3xYt;d|a=f*}Tf}X1j9}3V_g{kqLjyHB|(&F!%tp~1ZT1FdV+*+!d zdlDK!^Ty&uvCAkQVtZ)40 z<*R*)6$?~AlZx|_x<@DS&s;j}(|ONTAAP=ZvrQ1*gaF@_e&pO2bY(34$cK5V+mCHO z5$6z|W&M$=Q@G~(+B7>pFO_@0KBgSI8eCkLc5CY%i&IQXKkvLbbi%I{CkeEc0x=WH zo9!?@U>cOCro@xSy`8n@bGy;IdRngf%HUC%|JX!PZVC!)IN6xB zbOR4R|6a8H(z%Rr`<(1Fep4OY6FC=`KAfERWuB#Cf|M}WYOq~&S*t%?^I>k;f{>Q7 z=ifQ#KnM5Tk)SXZ%S4dPQq&87NZgF%K=1Bf6M8hKvu-T=>AL z${Ay-&@Khg(s5-g%x}$2HAcz+QzRLjtIG{ zE!d%e`efGm1VR7f=i&S$lW%XM_bV6=ygTmCe$v7qjPn<`zi_fy@&P&#54ojlc0Vb-$;jZwPMo9vu+g@e?c(nd z*@Zz59nYRs1@2KY&RZpcT>lF$4{qWV2Aouwjz9B#;b!fS%C`F0XF5S-EOE3+^GVtA z2v)eu;B!14@17xXb%C#H=$wJrLgF`?&iR&0PW|iQG>#An;2j3GbSYFuyN$X@(TrDt z-;FCrUBY_0sF(}PI2e)loYiw3wY_ho2F0I$cem|B{i5)7zCxjrZIE(+$Z}wrYz|`Y z94hMe?BbsSXMEl>oNU0KpOiu zv(fdS$3z8^W(Qonk$pBv#ac&P_olu`8+7BhL+Po@~uiRP7Y9xN=){ zAHb&FS?NsI?D0Ai85X7$f5tMaHdr{@;b&Q%$6L~IlCvFl(XG4aVoO0w$<{`auXw#k|{Dhb+0cG0`Y=^s=G-jI(V~;0ob${D%%{q~MG*g2j ztQ-|^4j3r~!}sH6Ul0x)aGAtFKO;-_Clu&w6G4R>|Ggr`G#Y?%1ENWaP%Cs)+H z(CZyF{Ewr<*R?;D@lN(L53P`kn4G3$W50d$8)V{|gIA?_(aa^e+F>=v)M&Qe(w^?C zb$z4U60_`vPQ}DT<4=JK@Q2qemKsKI9^XzCV4F%j8>S1wH?XNZIS;v}6Acx#?`Bq6 zCB@_KrXjlVW3E7bLtBrCiPX6W7V_&c9Z9V7%CC^2PJD9?Z&V;Q!-UrG(swS^{^m0j z%~Eo;?;!i@_hxr&{LvrhkBD|O*;{43j>d-y7>YQ4qieqZsW|hAyz||ad2TcO z>X320!BpniHYCy)djjBta+c(ui&<-x@CGrO;j<~Z`xrQ;QiufcV8OgfJb^pn80umS z5#*36k97X3*DcKW6)7d0@-CAz_r&wIdBw1T@ z6)8gCH9E5~u?{@%#*+MgbBkZ%QoDZb%x#&?eVG#6ZRz+2^E>Rb1fXWy0zZKE#Q<>! 
zsqE5_`khXKpj!4tm|b;KeQmhL>9SNo!)O7v6z3?88&%H^DLp#R8uQ2I=YT0BO|GSS zK%dCT*Q+VKK!${p)?TYCKUPZD3qA5pa;o#x-zdik$q9oN7~VQ!`D^3FBV%w-#A@-%E_b-+%sz75$)-G?P|Y>nJ-;-vxWXE20Hn zC9<9kwkIoO1q@`-6}C8<7G1WvCR~>2#6EA9GtUJRoK!QVhOH>!ZBj41Nw0mv;VT-Y ze|T<)$?@mZT!kEI?tqSr_Q43Ft_=pM2!124&ELt2jSZYoPv}8vwC<{?oD@m)J6wEp zkp3Rm%+p}~>{d2BIHD6nzD$S)bMMYxR-zyfVi=%+DIeF6S=yW&^8{2vyRE`LsfIk# zbxS~>^Gcb7Ps?G!)8l*#7+c0Opf3%mKry-}PzZ`P9Kf&vpf2~_Z?a{ZICKJsS(PIXA zX8Dfz9sd_r4Zao*+ooVcSdr}JbhBx!bo1ExjLOhCLiEF$B)l1}81LZZY<^2}NU71k z^SQoa_NcwI{1qS8n;$2$=CL7M-Eahq9gh6Km9ZMgm=Pu17mZyA8o# zC0hAm{F|C7XTHC|cM_}WQ|U13O`%$ z)fK0Y)hd!7WG*r`?h%W@@K zHKfY1jHwD+Z}+TVbJNoPY}{-itlADp0l$Gu;9L>W00XKNHs2owavll8c^qW@F!ZD2vwsd!Upf;Ip(1wQ;;#_@Dgj0LXPV zA7S_#Swr23h&T)SZADZtHe}K+&3X*wV8)19TDfvZG%CtqK+&gYx&Ng}j=X_#dWfMJ z+H;J!ls+F^Vs^H|wC?;?K2}Vv;#pSCDQa@%WL|t#X>LYZL&c)`tC`+Y5sJ)qCNF5* z3#sgXdD786rj{6567MOTATL@kQ`E%r_V znZKa%h80n5dEEH~$m{YF$30jSm-rVE$savDmW!#(|*MG836C3Jl{5%LkGN{RbH zLRtRgEUmq-rdsmZ_SP23s_td6ySsel()Cn_aKOX|a8YICl1Kx$A0%?JxlLUQJLK+@MNP)neyKr|tyCZ-!eg|%N`jq38=af%(fXe5 z)htv!q8zy9C$ehR!a5(SikL%ZzUV|Y*F(i%F#tjUMV5JJARlPW-VezJ z^D^$h?1u15Cx2`ncQzv~$sI0Jt295{y{Q;gnVLgx5MaDm#;4Lya7TSuygJ{lq?V((9p6mZ{~FES$O$H}!! 
z?hlYT&B}AHC{bb_(Cfprn^@bGhzgy~oCKQJ-T8{@8qdRgdJ<<}iw+?A6lvrR-VKj+ z@Kg?J?LoGEq9%Zm&VS2>LJm`d=>bw5q*H3VW zy^24lU3nVgJF3T9Rj?d7~S(i&ip7(vvHr1NhEC zKIe47(=v|^-n;Nc=s``pv1*AdcR6V#AemRp;DzhOLA&jbKzBI~{P=InF8|Y>^Z#u{ z|L^l8`Tw+{ff^|a{SRdyHg1y*Siw1J=)cI?IR25f0q{TAJ8STZ|Gh%E|JL6Z0+lme wP@8E)h(WKP1n1A{h0+1=c@zIAv;2`_srj45KmNZ~a{!(6|0QDy*1sqI6DvU?mH+?% literal 0 HcmV?d00001 From 6c6721aa7ba99db9f45b5a9b1158868ede20a9a3 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 12 Nov 2023 18:20:17 +0000 Subject: [PATCH 16/46] Update preamble.md --- .../tutorials/alevin-commandline/preamble.md | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index 0202542a252820..15806b7708256b 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -7,10 +7,9 @@ As a recap, we fill go from raw FASTQ files to a cell x gene data matrix in AnnD 2. Making a transcript-to-gene ID mapping 3. Creating Salmon index 4. Quantification of transcript expression using Alevin -5. (Quality control using Alevin) -6. Creating Summarized Experiment from the Alevin output -7. Adding metadata -8. Combining samples data +5. Creating Summarized Experiment from the Alevin output +6. Adding metadata +7. Combining samples data ## Launching JupyterLab @@ -52,7 +51,7 @@ You have two options for how to proceed with this JupyterLab tutorial - you can > > 1. Select the **Bash** icon under **Notebook** > -> ![Bash icon](../../images/bash.png "Bash Notebook Button") +> ![Bash icon](../../images/scrna-pre-processing/bash.png "Bash Notebook Button") > > 2. 
Save your file (**File**: **Save**, or click the {% icon galaxy-save %} Save icon at the top left) > @@ -86,15 +85,15 @@ Before we start working on the tutorial notebook, we need to install required pa > > 1. Navigate back to the `Terminal` (see Option 1 in the box above) > 2. In the Terminal tab open, write the following, one line at a time: +>``` +>conda install -y -c bioconda atlas-gene-annotation-manipulation +>``` > ``` >conda install -y -c bioconda bioconductor-tximeta >``` >``` >conda install -y -c bioconda bioconductor-dropletutils >``` ->``` ->conda install -y -c bioconda atlas-gene-annotation-manipulation ->``` > {: .hands_on} From 66d8c06e0da2cce3c809f42784f7ab902388843a Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 12 Nov 2023 18:22:49 +0000 Subject: [PATCH 17/46] change order to avoid rtracklayer problem --- topics/single-cell/tutorials/alevin-commandline/preamble.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index 15806b7708256b..c655645402ab0d 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -85,13 +85,13 @@ Before we start working on the tutorial notebook, we need to install required pa > > 1. Navigate back to the `Terminal` (see Option 1 in the box above) > 2. 
In the Terminal tab open, write the following, one line at a time: ->``` ->conda install -y -c bioconda atlas-gene-annotation-manipulation ->``` > ``` >conda install -y -c bioconda bioconductor-tximeta >``` >``` +>conda install -y -c bioconda atlas-gene-annotation-manipulation +>``` +>``` >conda install -y -c bioconda bioconductor-dropletutils >``` > From 02d21d33c456ac2037de6b55492514a39cfb516e Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 12 Nov 2023 18:26:34 +0000 Subject: [PATCH 18/46] comment about rtracklayer --- topics/single-cell/tutorials/alevin-commandline/preamble.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index c655645402ab0d..fc387ada56ae4e 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -86,10 +86,10 @@ Before we start working on the tutorial notebook, we need to install required pa > 1. Navigate back to the `Terminal` (see Option 1 in the box above) > 2. In the Terminal tab open, write the following, one line at a time: > ``` ->conda install -y -c bioconda bioconductor-tximeta +>conda install -y -c bioconda bioconductor-tximeta # install this first to avoid problem with re-installation of rtracklayer >``` >``` ->conda install -y -c bioconda atlas-gene-annotation-manipulation +>conda install -y -c bioconda atlas-gene-annotation-manipulation >``` >``` >conda install -y -c bioconda bioconductor-dropletutils @@ -98,4 +98,4 @@ Before we start working on the tutorial notebook, we need to install required pa {: .hands_on} -Installation will take a while, so in the meantime, when it's running, you can open the notebook and follow the rest of this tutorial there! 
+Installation will take a long while, so in the meantime, when it's running, you can open the notebook and follow the rest of this tutorial there! From 9c298b6ca49e90aed9213909506b0c75982c10d2 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 12 Nov 2023 18:27:08 +0000 Subject: [PATCH 19/46] checking the notebook --- .../tutorials/alevin-commandline/tutorial.md | 61 ++++--------------- 1 file changed, 11 insertions(+), 50 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 9d775ae4fcbbd2..829aa62b81181a 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -1,7 +1,7 @@ --- layout: tutorial_hands_on -title: 'Generating a single cell matrix using Alevin (bash + R)' +title: 'Generating a single cell matrix using Alevin and combining datasets (bash + R)' subtopic: single-cell-CS-code priority: 1 zenodo_link: @@ -15,10 +15,10 @@ objectives: - Interpret quality control (QC) plots to make informed decisions on cell thresholds - Find relevant information in GTF files for the particulars of their study, and include this in data matrix metadata -time_estimation: 1H +time_estimation: 2H key_points: - - Create a scanpy-accessible AnnData object from FASTQ files, including relevant gene metadata + - Create a SCE object from FASTQ files, including relevant gene and cell metadata requirements: - @@ -35,11 +35,9 @@ follow_up_training: - scrna-case_alevin-combine-datasets tags: -- single-cell - 10x - paper-replication - jupyter-notebook -- interactive-tools contributions: @@ -70,7 +68,6 @@ Once you've downloaded a specific binary (here we're using version 1.10.0), just tar -xvzf salmon-1.10.0_linux_x86_64.tar.gz ``` - We're going to use Alevin {% cite article-Alevin %} for demonstration purposes, but we do not endorse one method over another. 
# Get Data @@ -82,62 +79,26 @@ wget -nv https://zenodo.org/ wget -nv https://zenodo.org/ ``` - - > > -> Test rendering +> How to differentiate between the two files if they are just called 'Read 1' and 'Read 2'? > > > > > -> > is it ok? +> > The file which contains the cell barcodes and UMI is significantly shorter (indeed, 20 bp!) compared to the other file containing longer, transcript read. For ease, we will use explicit file names. > > > {: .solution} > {: .question} - -Additionally, to map your reads, you will need a transcriptome to align against (a FASTA) as well as the gene information for each transcript (a gtf) file. These files are included in the data import step below. +Additionally, to map your reads, you will need a transcriptome to align against (a FASTA) as well as the gene information for each transcript (a gtf) file. These files are included in the data import step below. You can also download these for your species of interest [from Ensembl](https://www.ensembl.org/info/data/ftp/index.html). ```bash -wget -nv https://zenodo.org/ -wget -nv https://zenodo.org/ +wget -nv https://zenodo.org/record/4574153/files/Mus_musculus.GRCm38.100.gtf.gff +wget -nv https://zenodo.org/record/4574153/files/Mus_musculus.GRCm38.cdna.all.fa.fasta ``` - - -You can also download these for your species of interest [from Ensembl](https://www.ensembl.org/info/data/ftp/index.html). Once you find the cDNA FASTA file you are interested in, right click on the link and choose "Copy link address" and paste it along the command `wget -nv`, then extract it using `tar`. Here is the example how to do it: - -```bash -# Getting FASTA file -wget -nv https://ftp.ensembl.org/pub/release-110/fasta/mus_musculus/cdna/ -tar -``` -Do exactly the same to get the GTF file: - -```bash -# Getting GTF file -wget -nv https://ftp.ensembl.org/pub/release-110/gtf/mus_musculus -tar -``` - - - Why do we need FASTA and GTF files? 
To generate gene-level quantifications based on transcriptome quantification, Alevin and similar tools require a conversion between transcript and gene identifiers. We can derive a transcript-gene conversion from the gene annotations available in genome resources such as Ensembl. The transcripts in such a list need to match the ones we will use later to build a binary transcriptome index. If you were using spike-ins, you'd need to add these to the transcriptome and the transcript-gene mapping. @@ -149,7 +110,7 @@ We will use the murine reference annotation as retrieved from Ensembl in GTF for You can have a look at the Terminal tab again. Has the package `atlas-gene-annotation-manipulation` been installed yet? If yes, you can execute the code cell below and while it's running, I'll explain all the parameters we set here. ```bash -gtf2featureAnnotation.R -g gtf.gff -c fasta.fasta -d "transcript_id" -t "transcript" -f "transcript_id" -o map_code -l "transcript_id,gene_id" -r -e filtered_fasta_code +gtf2featureAnnotation.R -g gtf.gff -c fasta.fasta -d "transcript_id" -t "transcript" -f "transcript_id" -o map -l "transcript_id,gene_id" -r -e filtered_fasta ``` In essence, [gtf2featureAnnotation.R script](https://github.com/ebi-gene-expression-group/atlas-gene-annotation-manipulation) takes a GTF annotation file and creates a table of annotation by feature, optionally filtering a cDNA file supplied at the same time. Therefore the first parameter `-g` stands for "gtf-file" and requires a path to a valid GTF file. Then `-c` takes a cDNA file for extracting meta info and/or filtering - that's our FASTA! Where --parse-cdnas (that's our `-c`) is specified, we need to specify, using `-d`, which field should be used to compare to identifiers from the FASTA. We set that to "transcript_id" - feel free to inspect the GTF file to explore other attributes. We pass the same value in `-f`, meaning first-field, i.e. the name of the field to place first in output table. 
To specify which other fields to retain in the output table, we provide a comma-separated list of those fields, and since we're only interested in transcript to gene map, we put those two names ("transcript_id,gene_id") into `-l`. `-t` stands for the feature type to use, and in our case we're using "transcript". Guess what `-o` is! Indeed, that's the output annotation table - here we specify the file path of our transcript to gene map. We will also have another output denoted by `-e` and that's the path to a filtered FASTA. Finally, we also put `-r` which is there only to suppress header on output. Summarising, output will be an annotation table, and a FASTA-format cDNAs file with unannotated transcripts removed. @@ -163,7 +124,7 @@ Sometimes it's important that there are no transcripts in a FASTA-format transcr We will use Salmon in mapping-based mode, so first we have to build a salmon index for our transcriptome. We will run the salmon indexer as so: ```bash -salmon-latest_linux_x86_64/bin/salmon index -t filtered_fasta_code -i salmon_index_code -k 31 +salmon-latest_linux_x86_64/bin/salmon index -t filtered_fasta -i salmon_index -k 31 ``` Where `-t` stands for our filtered FASTA file, and `-i` is the output path for the mapping-based index. To build it, the function is using an auxiliary k-mer hash over k-mers of length 31. While the mapping algorithms will make use of arbitrarily long matches between the query and reference, the k size selected here will act as the minimum acceptable length for a valid match. Thus, a smaller value of k may slightly improve sensitivity. We find that a k of 31 seems to work well for reads of 75bp or longer, but you might consider a smaller k if you plan to deal with shorter reads. Also, a shorter value of k may improve sensitivity even more when using selective alignment (enabled via the `--validateMappings` flag). So, if you are seeing a smaller mapping rate than you might expect, consider building the index with a slightly smaller k. 
@@ -249,7 +210,7 @@ check if we can get alevinQC to work - paste the info from the other tutorial? # Alevin output to SummarizedExperiment Let's change gear a little bit. We've done the work in bash, and now we're switching to R to complete the processing. To do so, you have to change Kernel to R (either click on `Kernel` -> `Change Kernel...` in the upper left corner of your JupyterLab or click on the displayed current kernel in the upper right corner and change it). -![Figure showing the JupyterLab interface with an arrow pointing to the left corner, showing the option `Kernel` -> `Change Kernel...` and another arrow pointing to the right corner, showing the icon of the current kernel. The pop-up window asks which kernel should be chosen instead.](../../images//switch_kernel.jpg "Two ways of switching kernel.") +![Figure showing the JupyterLab interface with an arrow pointing to the left corner, showing the option `Kernel` -> `Change Kernel...` and another arrow pointing to the right corner, showing the icon of the current kernel. 
The pop-up window asks which kernel should be chosen instead.](../../images/scrna-pre-processing/switch_kernel.jpg "Two ways of switching kernel.") Now load the library that we have previously installed in terminal: From a837a440131cdfef21d5951de189f54808732e06 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 12 Nov 2023 18:42:32 +0000 Subject: [PATCH 20/46] salmon, alevin edits --- .../tutorials/alevin-commandline/tutorial.md | 23 ++++++++++--------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 829aa62b81181a..4fc8b876c79e81 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -95,8 +95,8 @@ wget -nv https://zenodo.org/ Additionally, to map your reads, you will need a transcriptome to align against (a FASTA) as well as the gene information for each transcript (a gtf) file. These files are included in the data import step below. You can also download these for your species of interest [from Ensembl](https://www.ensembl.org/info/data/ftp/index.html). ```bash -wget -nv https://zenodo.org/record/4574153/files/Mus_musculus.GRCm38.100.gtf.gff -wget -nv https://zenodo.org/record/4574153/files/Mus_musculus.GRCm38.cdna.all.fa.fasta +wget -c https://zenodo.org/record/4574153/files/Mus_musculus.GRCm38.100.gtf.gff -O GRCm38_gtf.gff +wget -c https://zenodo.org/record/4574153/files/Mus_musculus.GRCm38.cdna.all.fa.fasta -O GRCm38_cdna.fasta ``` Why do we need FASTA and GTF files? @@ -110,7 +110,7 @@ We will use the murine reference annotation as retrieved from Ensembl in GTF for You can have a look at the Terminal tab again. Has the package `atlas-gene-annotation-manipulation` been installed yet? 
If yes, you can execute the code cell below and while it's running, I'll explain all the parameters we set here. ```bash -gtf2featureAnnotation.R -g gtf.gff -c fasta.fasta -d "transcript_id" -t "transcript" -f "transcript_id" -o map -l "transcript_id,gene_id" -r -e filtered_fasta +gtf2featureAnnotation.R -g GRCm38_gtf.gff -c GRCm38_cdna.fasta -d "transcript_id" -t "transcript" -f "transcript_id" -o map -l "transcript_id,gene_id" -r -e filtered_fasta ``` In essence, [gtf2featureAnnotation.R script](https://github.com/ebi-gene-expression-group/atlas-gene-annotation-manipulation) takes a GTF annotation file and creates a table of annotation by feature, optionally filtering a cDNA file supplied at the same time. Therefore the first parameter `-g` stands for "gtf-file" and requires a path to a valid GTF file. Then `-c` takes a cDNA file for extracting meta info and/or filtering - that's our FASTA! Where --parse-cdnas (that's our `-c`) is specified, we need to specify, using `-d`, which field should be used to compare to identfiers from the FASTA. We set that to "transcript_id" - feel free to inspect the GTF file to explore other attributes. We pass the same value in `-f`, meaning first-field, ie. the name of the field to place first in output table. To specify which other fields to retain in the output table, we provide comma-separated list of those fields, and since we're only interested in transcript to gene map, we put those two names ("transcript_id,gene_id") into `-l`. `-t` stands for the feature type to use, and in our case we're using "transcript". Guess what `-o` is! Indeed, that's the output annotation table - here we specify the file path of our transcript to gene map. We will also have another output denoted by `-e` and that's the path to a filtered FASTA. Finally, we also put `-r` which is there only to suppress header on output. Summarising, output will be a an annotation table, and a FASTA-format cDNAs file with unannotated transcripts removed. 
@@ -141,7 +141,7 @@ reference salmon --> # Use Alevin @@ -160,6 +160,12 @@ Time to use Alevin now! Alevin works under the same indexing scheme (as salmon) > {: .question} +Alevin can be run using the following command: + +```bash +salmon-latest_linux_x86_64/bin/salmon alevin -l ISR -1 barcodes_701.fastq -2 transcript_701.fastq --dropseq -i salmon_index -p 10 -o alevin_output --tgMap map --freqThreshold 3 --keepCBFraction 1 --dumpFeatures +``` + All the required input parameters are described in [the documentation](https://salmon.readthedocs.io/en/latest/alevin.html), but for the ease of use, they are presented below as well: - `-l`: library type (same as salmon), we recommend using ISR for both Drop-seq and 10x-v2 chemistry. @@ -186,12 +192,6 @@ All the required input parameters are described in [the documentation](https://s We have also added some additional parameters (`--freqThreshold`, `--keepCBFraction`) and their values are derived from the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) after QC to stop Alevin from applying its own thresholds. However, if you're not sure what value to pick, you can simply allow Alevin to make its own calls on what constitutes empty droplets. -Once all the above requirement are satisfied, Alevin can be run using the following command: - -```bash -salmon-latest_linux_x86_64/bin/salmon alevin -l ISR -1 barcodes_701.fastq -2 transcript_701.fastq --dropseq -i salmon_index_code -p 10 -o alevin_output_code --tgMap map_code --freqThreshold 3 --keepCBFraction 1 --dumpFeatures -``` - This tool will take a while to run. Alevin produces many file outputs, not all of which we'll use. You can refer to the [Alevin documentation](https://salmon.readthedocs.io/en/latest/alevin.html) if you're curious what they all are, you can look through all the different files to find information such as the mapping rate, but we'll just pass the whole output folder directory for downstream analysis. 
@@ -210,7 +210,8 @@ check if we can get alevinQC to work - paste the info from the other tutorial? # Alevin output to SummarizedExperiment Let's change gear a little bit. We've done the work in bash, and now we're switching to R to complete the processing. To do so, you have to change Kernel to R (either click on `Kernel` -> `Change Kernel...` in the upper left corner of your JupyterLab or click on the displayed current kernel in the upper right corner and change it). -![Figure showing the JupyterLab interface with an arrow pointing to the left corner, showing the option `Kernel` -> `Change Kernel...` and another arrow pointing to the right corner, showing the icon of the current kernel. The pop-up window asks which kernel should be chosen instead.](../../images/scrna-pre-processing/switch_kernel.jpg "Two ways of switching kernel.") +![Figure showing the JupyterLab interface with an arrow pointing to the left corner, showing the option 'Kernel' -> 'Change Kernel...' and another arrow pointing to the right corner, showing the icon of the current kernel. 
The pop-up window asks which kernel should be chosen instead.](../../images/scrna-pre-processing/switch_kernel.jpg "Two ways of switching kernel.") + Now load the library that we have previously installed in terminal: From 1ba005dd510722bf5ac7cf708d36960f1b037f12 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sun, 12 Nov 2023 23:20:00 +0000 Subject: [PATCH 21/46] all code boxes bash to be rendered correctly --- .../tutorials/alevin-commandline/tutorial.md | 76 +++++++++---------- 1 file changed, 38 insertions(+), 38 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 4fc8b876c79e81..cb604be7abf899 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -215,7 +215,7 @@ Let's change gear a little bit. We've done the work in bash, and now we're switc Now load the library that we have previously installed in terminal: -```r +```bash library(tximeta) ``` @@ -223,7 +223,7 @@ The [tximeta package](https://bioconductor.org/packages/devel/bioc/vignettes/txi First, let's specify the path to the quants_mat.gz file: -```r +```bash path <- 'alevin_output/alevin/quants_mat.gz' ``` We will specify the following arguments when running *tximeta*: @@ -234,13 +234,13 @@ We will specify the following arguments when running *tximeta*: With that we can create a dataframe and pass it to tximeta to create SummarizedExperiment object. -```r +```bash coldata <- data.frame(files = path, names="sample701") alevin_se <- tximeta(coldata, type = "alevin") ``` Inspect the created object: -```r +```bash alevin_se ``` @@ -250,7 +250,7 @@ As you can see, *rowData names* and *colData names* are still empty. Before we a Some sub-populations of small cells may not be distinguished from empty droplets based purely on counts by barcode. 
Some libraries produce multiple ‘knees’ (see the [Alevin Galaxy tutorial]()) for multiple sub-populations. The [emptyDrops]() method has become a popular way of dealing with this. emptyDrops still retains barcodes with very high counts, but also adds in barcodes that can be statistically distinguished from the ambient profiles, even if total counts are similar. -```r +```bash library(DropletUtils) # load the library and required packages ``` @@ -263,12 +263,12 @@ emptyDrops takes multiple arguments that you can read about in the [documentatio Let's then extract the matrix from our `alevin_se` object. It's stored in *assays* -> *counts*. -```r +```bash matrix_alevin <- assays(alevin_se)$counts ``` And now run emptyDrops: -```r +```bash # Identify likely cell-containing droplets out <- emptyDrops(matrix_alevin, lower = 100, niters = 1000, retain = 20) out @@ -278,14 +278,14 @@ comment on those values --> False discovery rate - ??? -```r +```bash is.cell <- out$FDR <= 0.01 sum(is.cell, na.rm=TRUE) ``` We got rid of the background droplets containing no cells, so now we will filter the matrix that we passed on to emptyDrops, so that it corresponds to the remaining cells. -```r +```bash emptied_matrix <- matrix_alevin[,which(is.cell),drop=FALSE] # filter the matrix ``` @@ -294,20 +294,20 @@ From here, we can move on to adding metadata and we'll return to `emptied_matrix # Adding cell metadata The cell barcodes are stored in *colnames*. Let's extract them into a separate object: -```r +```bash barcode <- colnames(alevin_se) ``` Now, we can simply add those barcodes into *colData names* where we will keep the cell metadata. To do this, we will create a column called `barcode` in *colData* and pass the stored values into there. -```r +```bash colData(alevin_se)$barcode <- barcode colData(alevin_se) ``` That's only cell barcodes for now! 
However, after running *emptyDrops*, we generated lots of cell information that is currently stored in the `out` object (Total, LogProb, PValue, Limited, FDR). Let's add those values to cell metadata! Since we already have *barcodes* in there, we will simply bind the emptyDrops output to the existing dataframe: -```r +```bash colData(alevin_se) <- cbind(colData(alevin_se),out) colData(alevin_se) ``` @@ -319,13 +319,13 @@ As you can see, the new columns were appended successfully and now the dataframe The gene IDs are stored in *rownames*. Let's extract them into a separate object: -```r +```bash gene_ID <- rownames(alevin_se) ``` Analogously, we will add those gene IDs into *rowData names* which stores gene metadata. To do this, we will create a column called `gene_ID` in *rowData* and pass the stored values into there. -```r +```bash rowData(alevin_se)$gene_ID <- gene_ID ``` @@ -333,7 +333,7 @@ rowData(alevin_se)$gene_ID <- gene_ID Since gene symbols are much more informative than only gene IDs, we will add them to our metadata. We will base this annotation on Ensembl - the genome database – with the use of the library BioMart. We will use the archive Genome assembly GRCm38 to get the gene names. Please note that the updated version (GRCm39) is available, but some of the gene IDs are not in that EnsEMBL database. The code below is written in a way that it will work for the updated dataset too, but will produce ‘NA’ where the corresponding gene name couldn’t be found. -```r +```bash # get relevant gene names library("biomaRt") # load the BioMart library ensembl.ids <- gene_ID @@ -345,7 +345,7 @@ ensembl_m = useMart("ensembl", dataset="mmusculus_gene_ensembl", host='https://n # we specify the host with this archive. 
If you want to use the most recent version of the dataset, just run: # ensembl_m = useMart("ensembl", dataset="mmusculus_gene_ensembl") ``` -```r +```bash genes <- getBM(attributes=c('ensembl_gene_id','external_gene_name'), filters = 'ensembl_gene_id', values = ensembl.ids, @@ -355,11 +355,11 @@ genes <- getBM(attributes=c('ensembl_gene_id','external_gene_name'), # 'ensembl_gene_id' are genes IDs, # 'external_gene_name' are the genes symbols that we want to get for our values stored in ‘ensembl.ids’. ``` -```r +```bash # see the resulting data head(genes) ``` -```r +```bash # replace IDs for gene names gene_names <- ensembl.ids count = 1 @@ -377,17 +377,17 @@ for (geneID in gene_names) count = count + 1 # increased count so that every element in gene_names is replaced } ``` -```r +```bash # add the gene names into rowData in a new column gene_name rowData(alevin_se)$gene_name <- gene_names ``` -```r +```bash # see the changes rowData(alevin_se) ``` If you are working on your own data and it’s not mouse data, you can check available datasets for other species and just use relevant dataset in `useMart()` function. -```r +```bash listDatasets(mart) # available datasets ``` @@ -405,21 +405,21 @@ add mito annotation Let's go back to the `emptied_matrix` object. Do you remember how many cells were left after filtering? We can check that by looking at the matrix' dimensions: -```r +```bash dim(matrix_alevin) # check the dimension of the unfiltered matrix dim(emptied_matrix) # check the dimension of the filtered matrix ``` We've gone from X to Y cells. We've filtered the matrix, but not our SummarizedExperiment. We can subset `alevin_se` based on the cells that were left after filtering. 
We will store them in a separate list, as we did with the barcodes: -```r +```bash retained_cells <- colnames(emptied_matrix) retained_cells ``` And now we can subset our SummarizedExperiment based on the barcodes that are in the `retained_cells` list: -```r +```bash alevin_subset <- alevin_se[, colData(alevin_se)$barcode %in% retained_cells] alevin_subset ``` @@ -446,13 +446,13 @@ We are currently analysing sample N701, so let's finish it off by adding the inf We will label batch as an integer - "0" for sample N701, "1" for N702 etc. The way to do it is creating a list with zeros of the length corresponding to the number of cells that we have in our SummarizedExperiment object, like so: -```r +```bash batch <- rep("0", length(colnames(alevin_subset))) ``` And now create a batch slot in the *colData names* and append the `batch` list in the same way as we did with barcodes: -```r +```bash colData(alevin_subset)$batch <- batch colData(alevin_subset) ``` @@ -463,7 +463,7 @@ A new column appeared, full of zeros - as expected! It's all the same for genotype, but instead creating a list with zeros, we'll create a list with string "wildtype" and append it into genotype slot: -```r +```bash genotype <- rep("wildtype", length(colnames(alevin_subset))) colData(alevin_subset)$genotype <- genotype ``` @@ -471,13 +471,13 @@ colData(alevin_subset)$genotype <- genotype ## Sex You already know what to do, right? A list with string "male" and then adding it into a new slot! -```r +```bash sex <- rep("male", length(colnames(alevin_subset))) colData(alevin_subset)$sex <- sex ``` Check if all looks fine: -```r +```bash colData(alevin_subset) ``` @@ -494,7 +494,7 @@ But first, we have to save the results of our hard work on sample 701! Saving files is quite straight forward. Just specify which object you want to save and how you want the file to be named. Don't forget the extension! 
-```r +```bash save(alevin_subset, file = "alevin_701.RData") ``` @@ -521,7 +521,7 @@ The files are there! Now back to R - switch kernel again. Above we described all the steps done in R and explained what each bit of code does. Below all those steps are in one block of code, so read carefully and make sure you understand everything! -```r +```bash path2 <- 'alevin_output_702/alevin/quants_mat.gz' alevin2 <- tximeta(coldata = data.frame(files = path2, names = "sample702"), type = "alevin") @@ -593,25 +593,25 @@ Alright, another sample pre-processed! Pre-processed sample 702 is there, but we still need to load sample 701 that we saved before switching kernels. It's equally easy as saving the object: -```r +```bash load("alevin_701.RData") ``` Check if it was loaded ok: -```r +```bash alevin_701 ``` Now we can combine those two objects into one using one simple command: -```r +```bash alevin_combined <- cbind(alevin_701, alevin_702) alevin_combined ``` If you have more samples, just append them in the same way. We won't process another sample here, but pretending that we have third sample, we would combine it like this: -```r +```bash alevin_subset3 <- alevin_702 # copy dataset for demonstration purposes alevin_combined_demo <- cbind(alevin_combined, alevin_subset3) alevin_combined_demo @@ -624,7 +624,7 @@ You get the point, right? It's imporant though that the rowData names and colDat It is generally more common to use SingleCellExperiment format rather than SummarizedExperiment. 
The conversion is quick and easy, and goes like this: -```r +```bash alevin_sce <- as(alevin_combined, "SingleCellExperiment") alevin_sce ``` @@ -632,13 +632,13 @@ As you can see, all the embeddings have been successfully transfered during this You've already learned how to save and load objects in Jupyter notebook, let's then save the SCE file: -```r +```bash save(alevin_sce, file = "alevin_combined_sce.rdata") ``` The last thing that might be useful is exporting the files into your Galaxy history. To do it... guess what! Yes - switching kernels again! But this time we choose Python kernel and run the following command: -```python +```bash put("alevin_combined_sce.rdata") ``` From 3c5d72b8559dc645779215a836b5a34aa0d3d601 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 00:32:56 +0000 Subject: [PATCH 22/46] polishing up --- .../tutorials/alevin-commandline/tutorial.md | 22 +++++++++++++------ 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index cb604be7abf899..6f144d6867ce2b 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -75,8 +75,8 @@ We're going to use Alevin {% cite article-Alevin %} for demonstration purposes, We continue working on the same example data - a very small subset of the reads in a mouse dataset of fetal growth restriction {% cite Bacon2018 %} (see the [study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and the [project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). For the purposes of this tutorial, the datasets have been subsampled to only 50k reads (around 1% of the original files). Those are two fastq files - one with transcripts and the onther one with cell barcodes. 
You can download the files by running the code below: ```bash -wget -nv https://zenodo.org/ -wget -nv https://zenodo.org/ +wget -nv https://zenodo.org/records/10116786/files/transcript_701.fastq +wget -nv https://zenodo.org/records/10116786/files/barcodes_701.fastq ``` > @@ -326,7 +326,7 @@ gene_ID <- rownames(alevin_se) Analogically, we will add those genes IDs into *rowData names* which stores gene metadata. To do this, we will create a column called `gene_ID` in *rowData* and pass the stored values into there. ```bash -rowData(alevin)$gene_ID <- gene_ID +rowData(alevin_se)$gene_ID <- gene_ID ``` ## Adding genes symbols based on their IDs @@ -410,7 +410,7 @@ dim(matrix_alevin) # check the dim(emptied_matrix) # check the dimension of the filtered matrix ``` -We've gone from X to Y cells. We've filtered the matrix, but not our SummarizedExperiment. We can subset `alevin_se` based on the cells that were left after filtering. We will store them in a separate list, as we did with the barcodes: +We've gone from 3608 to 35 cells. We've filtered the matrix, but not our SummarizedExperiment. We can subset `alevin_se` based on the cells that were left after filtering. We will store them in a separate list, as we did with the barcodes: ```bash retained_cells <- colnames(emptied_matrix) @@ -495,7 +495,7 @@ But first, we have to save the results of our hard work on sample 701! Saving files is quite straight forward. Just specify which object you want to save and how you want the file to be named. Don't forget the extension! ```bash -save(alevin_subset, file = "alevin_701.RData") +save(alevin_subset, file = "alevin_701.rdata") ``` You will see the new file in the panel on the left. 
@@ -514,7 +514,10 @@ Normally, at this point you would switch kernel to bash to run alevin, and then Let's switch the kernel back to bash and run the following code to unzip the alevin output for sample 702: ```bash -unzip +wget https://zenodo.org/records/10116786/files/alevin_output_702.zip +``` +```bash +unzip alevin_output_702.zip ``` The files are there! Now back to R - switch kernel again. @@ -522,6 +525,10 @@ The files are there! Now back to R - switch kernel again. Above we described all the steps done in R and explained what each bit of code does. Below all those steps are in one block of code, so read carefully and make sure you understand everything! ```bash +library(tximeta) +library(DropletUtils) +library(biomaRt) + path2 <- 'alevin_output_702/alevin/quants_mat.gz' alevin2 <- tximeta(coldata = data.frame(files = path2, names = "sample702"), type = "alevin") @@ -594,7 +601,7 @@ Alright, another sample pre-processed! Pre-processed sample 702 is there, but we still need to load sample 701 that we saved before switching kernels. It's equally easy as saving the object: ```bash -load("alevin_701.RData") +load("alevin_701.rdata") ``` Check if it was loaded ok: @@ -625,6 +632,7 @@ You get the point, right? It's imporant though that the rowData names and colDat It is generally more common to use SingleCellExperiment format rather than SummarizedExperiment. 
The conversion is quick and easy, and goes like this: ```bash +# library(SingleCellExperiment) # might need to run this if code below is not working alevin_sce <- as(alevin_combined, "SingleCellExperiment") alevin_sce ``` From fc39fce768cdaeee6da31aa9c671423d6a16657e Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 08:04:54 +0000 Subject: [PATCH 23/46] Create tutorial.bib --- .../tutorials/alevin-commandline/tutorial.bib | 12 ++++++++++++ 1 file changed, 12 insertions(+) create mode 100644 topics/single-cell/tutorials/alevin-commandline/tutorial.bib diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.bib b/topics/single-cell/tutorials/alevin-commandline/tutorial.bib new file mode 100644 index 00000000000000..64338c14c52de5 --- /dev/null +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.bib @@ -0,0 +1,12 @@ +@article{Bacon2018, + doi = {10.3389/fimmu.2018.02523}, + url = {https://doi.org/10.3389/fimmu.2018.02523}, + year = {2018}, + month = nov, + publisher = {Frontiers Media {SA}}, + volume = {9}, + author = {Wendi A. Bacon and Russell S. Hamilton and Ziyi Yu and Jens Kieckbusch and Delia Hawkes and Ada M. Krzak and Chris Abell and Francesco Colucci and D. 
Stephen Charnock-Jones}, + title = {Single-Cell Analysis Identifies Thymic Maturation Delay in Growth-Restricted Neonatal Mice}, + journal = {Frontiers in Immunology} +} + From a7275d5cffc303d7dda3ba8eb9fd4feb90a6165e Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 08:08:09 +0000 Subject: [PATCH 24/46] remove duplicates and agenda --- .../tutorials/alevin-commandline/preamble.md | 10 ---------- 1 file changed, 10 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index fc387ada56ae4e..38f50909e95552 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -1,7 +1,6 @@ # Introduction This tutorial is the part of [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using [Alevin]( https://salmon.readthedocs.io/en/latest/alevin.html) in bash command line. It is a replication of the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) and will guide you through the same steps that you followed in the previous tutorial and will give you more understanding of what is happening ‘behind the scenes’ or ‘inside the tools’ if you will. -We will work on the case study data from a mouse model of fetal growth restriction {% cite Bacon2018 %} (see [the study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and [the project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). As a recap, we fill go from raw FASTQ files to a cell x gene data matrix in AnnData format. After completing the previous tutorial you should already know what is a data matrix and AnnData format. We will perform the following steps: 1. Getting the appropriate files 2. 
Making a transcript-to-gene ID mapping @@ -65,15 +64,6 @@ You have two options for how to proceed with this JupyterLab tutorial - you can Let's crack on! -> -> -> In this tutorial, we will cover: -> -> 1. TOC -> {:toc} -> -{: .agenda} - {% snippet topics/single-cell/faqs/notebook_warning.md %} From 3b2b45b46a57803ea815e32d5bf542defb52572e Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 08:23:50 +0000 Subject: [PATCH 25/46] add alevin ref --- topics/single-cell/tutorials/alevin-commandline/preamble.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index 38f50909e95552..05113af0e1e4be 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -1,6 +1,6 @@ # Introduction -This tutorial is the part of [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using [Alevin]( https://salmon.readthedocs.io/en/latest/alevin.html) in bash command line. It is a replication of the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) and will guide you through the same steps that you followed in the previous tutorial and will give you more understanding of what is happening ‘behind the scenes’ or ‘inside the tools’ if you will. +This tutorial is part of the [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using Alevin ({% cite srivastava2019alevin %}) on the bash command line.
It is a replication of the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) and will guide you through the same steps that you followed in the previous tutorial and will give you more understanding of what is happening ‘behind the scenes’ or ‘inside the tools’ if you will. As a recap, we fill go from raw FASTQ files to a cell x gene data matrix in AnnData format. After completing the previous tutorial you should already know what is a data matrix and AnnData format. We will perform the following steps: 1. Getting the appropriate files 2. Making a transcript-to-gene ID mapping From 5d25a0212f4cb65838cbd16513dd8d1ae07e0fa4 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 08:54:54 +0000 Subject: [PATCH 26/46] refs, links, change order --- .../tutorials/alevin-commandline/tutorial.md | 192 +++++++++--------- 1 file changed, 94 insertions(+), 98 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 6f144d6867ce2b..ef52dee6836d69 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -18,7 +18,7 @@ objectives: time_estimation: 2H key_points: - - Create a SCE object from FASTQ files, including relevant gene and cell metadata + - Create a SCE object from FASTQ files, including relevant gene and cell metadata, and do it all in Jupyter Notebook! requirements: - @@ -56,7 +56,7 @@ notebook: # Setting up the environment -Alevin is a tool integrated with the salmon software, so first we need to get salmon. You can install salmon using bioconda, but in this tutorial we will show an alternative method - downloading the pre-compiled binaries from the [releases page](https://github.com/COMBINE-lab/salmon/releases). 
+Alevin is a tool integrated with the [Salmon software](https://salmon.readthedocs.io/en/latest/salmon.html), so first we need to get Salmon. You can install salmon using bioconda, but in this tutorial we will show an alternative method - downloading the pre-compiled binaries from the [releases page](https://github.com/COMBINE-lab/salmon/releases). ```bash wget -nv https://github.com/COMBINE-lab/salmon/releases/download/v1.10.0/salmon-1.10.0_linux_x86_64.tar.gz @@ -68,7 +68,7 @@ Once you've downloaded a specific binary (here we're using version 1.10.0), just tar -xvzf salmon-1.10.0_linux_x86_64.tar.gz ``` -We're going to use Alevin {% cite article-Alevin %} for demonstration purposes, but we do not endorse one method over another. +We're going to use Alevin for demonstration purposes, but we do not endorse one method over another. # Get Data @@ -140,9 +140,6 @@ Where `-t` stands for our filtered FASTA file, and `-i` is the output the mappin reference salmon --> - # Use Alevin @@ -204,7 +201,7 @@ This tool will take a while to run. Alevin produces many file outputs, not all o # Alevin output to SummarizedExperiment @@ -219,7 +216,7 @@ Now load the library that we have previously installed in terminal: library(tximeta) ``` -The [tximeta package](https://bioconductor.org/packages/devel/bioc/vignettes/tximeta/inst/doc/tximeta.html) REF (Love et al. 2020) is used for import of transcript-level quantification data into R/Bioconductor and requires that the entire output of alevin is present and unmodified. +The [tximeta package](https://bioconductor.org/packages/devel/bioc/vignettes/tximeta/inst/doc/tximeta.html) created by {% cite Love2020 %} is used for import of transcript-level quantification data into R/Bioconductor and requires that the entire output of alevin is present and unmodified. First, let's specify the path to the quants_mat.gz file: @@ -248,7 +245,7 @@ As you can see, *rowData names* and *colData names* are still empty. 
Before we a # Identify barcodes that correspond to non-empty droplets -Some sub-populations of small cells may not be distinguished from empty droplets based purely on counts by barcode. Some libraries produce multiple ‘knees’ (see the [Alevin Galaxy tutorial]() for multiple sub-populations. The [emptyDrops]() method has become a popular way of dealing with this. emptyDrops still retains barcodes with very high counts, but also adds in barcodes that can be statistically distinguished from the ambient profiles, even if total counts are similar. +Some sub-populations of small cells may not be distinguished from empty droplets based purely on counts by barcode. Some libraries produce multiple ‘knees’ (see the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}#Basic-QC)) for multiple sub-populations. The emptyDrops method ({% cite Lun2019 %}) has become a popular way of dealing with this. emptyDrops still retains barcodes with very high counts, but also adds in barcodes that can be statistically distinguished from the ambient profiles, even if total counts are similar. ```bash library(DropletUtils) # load the library and required packages @@ -273,14 +270,12 @@ And now run emptyDrops: out <- emptyDrops(matrix_alevin, lower = 100, niters = 1000, retain = 20) out ``` - -False discovery rate - ??? +We also correct for multiple testing by controlling the false discovery rate (FDR) using the Benjamini-Hochberg (BH) method ({% cite Benjamini1995 %}). Putative cells are defined as those barcodes that have significantly poor fits to the ambient model at a specified FDR threshold. Here, we will use an FDR threshold of 1%. This means that the expected proportion of empty droplets in the set of retained barcodes is no greater than 1%, which we consider to be acceptably low for downstream analyses.
+ ```bash -is.cell <- out$FDR <= 0.01 -sum(is.cell, na.rm=TRUE) +is.cell <- out$FDR <= 0.01 +sum(is.cell, na.rm=TRUE) # check how many cells are left after filtering ``` We got rid of the background droplets containing no cells, so now we will filter the matrix that we passed on to emptyDrops, so that it corresponds to the remaining cells. @@ -314,6 +309,61 @@ colData(alevin_se) As you can see, the new columns were appended successfully and now the dataframe has 6 columns. +If you have a look at the Experimental Design from that study, you might notice that there is actually more information about the cells. The most important for us would be batch, genotype and sex, summarised in the small table below. + +| Index | Batch | Genotype | Sex | +|-------|-------|----------|--------| +| N701 | 0 | wildtype | male | +| N702 | 1 | knockout | male | +| N703 | 2 | knockout | female | +| N704 | 3 | wildtype | male | +| N705 | 4 | wildtype | male | +| N706 | 5 | wildtype | male | +| N707 | 6 | knockout | male | + +We are currently analysing sample N701, so let's add its information from the table. + +## Batch + +We will label batch as an integer - "0" for sample N701, "1" for N702 etc. The way to do it is creating a list with zeros of the length corresponding to the number of cells that we have in our SummarizedExperiment object, like so: + +```bash +batch <- rep("0", length(colnames(alevin_se))) +``` + +And now create a batch slot in the *colData names* and append the `batch` list in the same way as we did with barcodes: + +```bash +colData(alevin_se)$batch <- batch +colData(alevin_se) +``` + +A new column appeared, full of zeros - as expected! + +## Genotype + +It's all the same for genotype, but instead of creating a list with zeros, we'll create a list with the string "wildtype" and append it into the genotype slot: + +```bash +genotype <- rep("wildtype", length(colnames(alevin_se))) +colData(alevin_se)$genotype <- genotype +``` + +## Sex + +You already know what to do, right?
A list with string "male" and then adding it into a new slot! +```bash +sex <- rep("male", length(colnames(alevin_se))) +colData(alevin_se)$sex <- sex +``` + +Check if all looks fine: +```bash +colData(alevin_se) +``` + +3 new columns appeared with the information that we've just added - perfect! You can add any information you need in this way, as long as it's the same for all the cells from one sample. + # Adding gene metadata @@ -398,7 +448,7 @@ listDatasets(mart) # available datasets # Subsetting the object @@ -407,7 +457,7 @@ Let's go back to the `emptied_matrix` object. Do you remember how many cells wer ```bash dim(matrix_alevin) # check the dimension of the unfiltered matrix -dim(emptied_matrix) # check the dimension of the filtered matrix +dim(emptied_matrix) # check the dimension of the filtered matrix ``` We've gone from 3608 to 35 cells. We've filtered the matrix, but not our SummarizedExperiment. We can subset `alevin_se` based on the cells that were left after filtering. We will store them in a separate list, as we did with the barcodes: @@ -424,64 +474,7 @@ alevin_subset <- alevin_se[, colData(alevin_se)$barcode %in% retained_cells] alevin_subset ``` -And that's our subset! We have now filtered matrix, some gene and cell metadata... but we can do more! - -# Adding more metadata - -If you have a look at the Experimental Design from that study, you might notice that there is actually more information about the cells. The most important for us would be batch, genotype and sex, summarised in the small table below. - -| Index | Batch | Genotype | Sex | -|------ |--------------------| -| N701 | 0 | wildtype | male | -| N702 | 1 | knockout | male | -| N703 | 2 | knockout | female | -| N704 | 3 | wildtype | male | -| N705 | 4 | wildtype | male | -| N706 | 5 | wildtype | male | -| N707 | 6 | knockout | male | - -We are currently analysing sample N701, so let's finish it off by adding the information from the table. 
- -## Batch - -We will label batch as an integer - "0" for sample N701, "1" for N702 etc. The way to do it is creating a list with zeros of the length corresponding to the number of cells that we have in our SummarizedExperiment object, like so: - -```bash -batch <- rep("0", length(colnames(alevin_subset))) -``` - -And now create a batch slot in the *colData names* and append the `batch` list in the same way as we did with barcodes: - -```bash -colData(alevin_subset)$batch <- batch -colData(alevin_subset) -``` - -A new column appeared, full of zeros - as expected! - -## Genotype - -It's all the same for genotype, but instead creating a list with zeros, we'll create a list with string "wildtype" and append it into genotype slot: - -```bash -genotype <- rep("wildtype", length(colnames(alevin_subset))) -colData(alevin_subset)$genotype <- genotype -``` - -## Sex - -You already know what to do, right? A list with string "male" and then adding it into a new slot! -```bash -sex <- rep("male", length(colnames(alevin_subset))) -colData(alevin_subset)$sex <- sex -``` - -Check if all looks fine: -```bash -colData(alevin_subset) -``` - -3 new columns appeared with the information that we've just added - perfect! You can add any information you need in this way, as long as it's the same for all the cells from one sample. +And that's our subset, ready for downstream analysis! # More datasets @@ -525,31 +518,42 @@ The files are there! Now back to R - switch kernel again. Above we described all the steps done in R and explained what each bit of code does. Below all those steps are in one block of code, so read carefully and make sure you understand everything! 
```bash +## load libraries again ## library(tximeta) library(DropletUtils) library(biomaRt) -path2 <- 'alevin_output_702/alevin/quants_mat.gz' -alevin2 <- tximeta(coldata = data.frame(files = path2, names = "sample702"), type = "alevin") +## running tximeta ## +path2 <- 'alevin_output_702/alevin/quants_mat.gz' # path to alevin output for N702 +alevin2 <- tximeta(coldata = data.frame(files = path2, names = "sample702"), type = "alevin") # create SummarizedExperiment from Alevin output -matrix_alevin2 <- assays(alevin2)$counts +## running emptyDrops ## +matrix_alevin2 <- assays(alevin2)$counts # extract matrix from SummarizedExperiment +out2 <- emptyDrops(matrix_alevin2, lower = 100, niters = 1000, retain = 20) # apply emptyDrops +is.cell2 <- out2$FDR <= 0.01 # apply FDR threshold +emptied_matrix2 <- matrix_alevin2[,which(is.cell2),drop=FALSE] # subset the matrix to the cell-containing droplets -out2 <- emptyDrops(matrix_alevin2, lower = 100, niters = 1000, retain = 20) +## adding cell metadata ## +barcode2 <- colnames(alevin2) +colData(alevin2)$barcode <- barcode2 # add barcodes -is.cell2 <- out2$FDR <= 0.01 -sum(is.cell2, na.rm=TRUE) +colData(alevin2) <- cbind(colData(alevin2),out2) # add emptyDrops info -emptied_matrix2 <- matrix_alevin2[,which(is.cell2),drop=FALSE] +batch2 <- rep("1", length(colnames(alevin2))) +colData(alevin2)$batch <- batch2 # add batch info (alevin_subset2 does not exist yet at this point) -barcode2 <- colnames(alevin2) -colData(alevin2)$barcode <- barcode2 -colData(alevin2) <- cbind(colData(alevin2),out2) +genotype2 <- rep("knockout", length(colnames(alevin2))) +colData(alevin2)$genotype <- genotype2 # add genotype info (N702 is knockout in the table above) + +sex2 <- rep("male", length(colnames(alevin2))) +colData(alevin2)$sex <- sex2 # add sex info +## adding gene metadata ## gene_ID2 <- rownames(alevin2) rowData(alevin2)$gene_ID <- gene_ID2 # get relevant gene names -ensembl.ids2 <- gene_ID2 # fData() allows to access cds rowData table +ensembl.ids2 <- gene_ID2 mart <-
useEnsembl(biomart = "ENSEMBL_MART_ENSEMBL") # connect to a specified BioMart database and dataset hosted by Ensembl ensembl_m2 = useMart("ensembl", dataset="mmusculus_gene_ensembl", host='https://nov2020.archive.ensembl.org') @@ -575,20 +579,12 @@ for (geneID in gene_names2) count = count + 1 # increased count so that every element in gene_names is replaced } -rowData(alevin2)$gene_name <- gene_names2 +rowData(alevin2)$gene_name <- gene_names2 # add gene names to gene metadata +## create a subset of filtered object ## retained_cells2 <- colnames(emptied_matrix2) alevin_subset2 <- alevin2[, colData(alevin2)$barcode %in% retained_cells2] -batch2 <- rep("1", length(colnames(alevin_subset2))) -colData(alevin_subset2)$batch <- batch2 - -genotype2 <- rep("wildtype", length(colnames(alevin_subset2))) -colData(alevin_subset2)$genotype <- genotype2 - -sex2 <- rep("male", length(colnames(alevin_subset2))) -colData(alevin_subset2)$sex <- sex2 - alevin_702 <- alevin_subset2 alevin_702 ``` @@ -619,7 +615,7 @@ alevin_combined If you have more samples, just append them in the same way. We won't process another sample here, but pretending that we have third sample, we would combine it like this: ```bash -alevin_subset3 <- alevin_702 # copy dataset for demonstration purposes +alevin_subset3 <- alevin_702 # copy dataset for demonstration purposes alevin_combined_demo <- cbind(alevin_combined, alevin_subset3) alevin_combined_demo ``` @@ -632,7 +628,7 @@ You get the point, right? It's imporant though that the rowData names and colDat It is generally more common to use SingleCellExperiment format rather than SummarizedExperiment. 
The conversion is quick and easy, and goes like this: ```bash -# library(SingleCellExperiment) # might need to run this if code below is not working +# library(SingleCellExperiment) # might need to run this if code below is not working alevin_sce <- as(alevin_combined, "SingleCellExperiment") alevin_sce ``` From 4ed220355f29eb48bd4d88f9ae15ebaa20f5e149 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 08:55:10 +0000 Subject: [PATCH 27/46] more refs --- .../tutorials/alevin-commandline/tutorial.bib | 52 +++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.bib b/topics/single-cell/tutorials/alevin-commandline/tutorial.bib index 64338c14c52de5..1fe3a0f14e2df8 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.bib +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.bib @@ -10,3 +10,55 @@ @article{Bacon2018 journal = {Frontiers in Immunology} } +@article{Lun2019, + doi = {10.1186/s13059-019-1662-y}, + url = {https://doi.org/10.1186/s13059-019-1662-y}, + year = {2019}, + month = mar, + publisher = {Springer Science and Business Media {LLC}}, + volume = {20}, + number = {1}, + author = {Aaron T. L. Lun and Samantha Riesenfeld and Tallulah Andrews and The Phuong Dao and Tomas Gomes and John C. Marioni}, + title = {{EmptyDrops}: distinguishing cells from empty droplets in droplet-based single-cell {RNA} sequencing data}, + journal = {Genome Biology} +} + +@article{Love2020, + doi = {10.1371/journal.pcbi.1007664}, + url = {https://doi.org/10.1371/journal.pcbi.1007664}, + year = {2020}, + month = feb, + publisher = {Public Library of Science ({PLoS})}, + volume = {16}, + number = {2}, + pages = {e1007664}, + author = {Michael I. Love and Charlotte Soneson and Peter F. Hickey and Lisa K. Johnson and N.
Tessa Pierce and Lori Shepherd and Martin Morgan and Rob Patro}, + editor = {Mihaela Pertea}, + title = {Tximeta: Reference sequence checksums for provenance identification in {RNA}-seq}, + journal = {{PLOS} Computational Biology} +} + +@article{srivastava2019alevin, +title={Alevin efficiently estimates accurate gene abundances from dscRNA-seq data}, +author={Srivastava, Avi and Malik, Laraib and Smith, Tom and Sudbery, Ian and Patro, Rob}, +journal={Genome biology}, +volume={20}, +number={1}, +pages={65}, +year={2019}, +publisher={BioMed Central} +} + +@article{Benjamini1995, + doi = {10.1111/j.2517-6161.1995.tb02031.x}, + url = {https://doi.org/10.1111/j.2517-6161.1995.tb02031.x}, + year = {1995}, + month = jan, + publisher = {Wiley}, + volume = {57}, + number = {1}, + pages = {289--300}, + author = {Yoav Benjamini and Yosef Hochberg}, + title = {Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing}, + journal = {Journal of the Royal Statistical Society: Series B (Methodological)} +} From 021e6576c97b42e1eee3d08ca0c04244c18dc8e1 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 09:59:19 +0000 Subject: [PATCH 28/46] add mito flagging --- .../tutorials/alevin-commandline/tutorial.md | 25 ++++++++++++++++--- 1 file changed, 21 insertions(+), 4 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index ef52dee6836d69..eb41b6107f91c3 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -447,9 +447,22 @@ listDatasets(mart) # available datasets {: .warning} - +## Flag mitochondrial genes + +We can also flag mitochondrial genes. Usually those are the genes whose name starts with 'mt-' or 'MT-'. 
Therefore, we will store those characters in `mito_genes_names` and then use *grepl()* to find those genes stored in *gene_name* slot. + +```bash +mito_genes_names <- '^mt-|^MT-' # how mitochondrial gene names can start +mito <- grepl(mito_genes_names, rowData(alevin_se)$gene_name) # find mito genes +mito # see the resulting boolean list +``` + +Now we can add another slot in *rowData* and call it *mito* that will keep boolean values (true/false) to indicate which genes are mitochondrial. +```bash +rowData(alevin_se)$mito <- mito +rowData(alevin_se) +``` + # Subsetting the object @@ -579,7 +592,11 @@ for (geneID in gene_names2) count = count + 1 # increased count so that every element in gene_names is replaced } -rowData(alevin2)$gene_name <- gene_names2 # add gene names to gene metadata +rowData(alevin2)$gene_name <- gene_names2 # add gene names to gene metadata + +mito_genes_names <- '^mt-|^MT-' # how mitochondrial gene names can start +mito2 <- grepl(mito_genes_names, rowData(alevin2)$gene_name) # find mito genes +rowData(alevin2)$mito <- mito2 # add mitochondrial information to gene metadata ## create a subset of filtered object ## retained_cells2 <- colnames(emptied_matrix2) From c0cbb6b540179b95981f17bec2e8cd71b145d9e1 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 10:40:16 +0000 Subject: [PATCH 29/46] remove duplicate tag --- topics/single-cell/tutorials/alevin-commandline/tutorial.md | 1 - 1 file changed, 1 deletion(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index eb41b6107f91c3..2e18a682d823b5 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -37,7 +37,6 @@ follow_up_training: tags: - 10x - paper-replication -- jupyter-notebook contributions: From 3e584098c2cb1c2c7c3ef9401fbd9f3b677013f2 Mon Sep 17 00:00:00 
2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 10:48:36 +0000 Subject: [PATCH 30/46] make switching kernels more explicit --- .../tutorials/alevin-commandline/tutorial.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 2e18a682d823b5..ae1c392c490c2e 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -516,20 +516,23 @@ Normally, at this point you would switch kernel to bash to run alevin, and then > {: .warning} -Let's switch the kernel back to bash and run the following code to unzip the alevin output for sample 702: +Let's **switch the kernel back to bash** and run the following code to unzip the alevin output for sample 702: ```bash +# we're in bash again! wget https://zenodo.org/records/10116786/files/alevin_output_702.zip ``` ```bash unzip alevin_output_702.zip ``` -The files are there! Now back to R - switch kernel again. +The files are there! Now **back to R - switch kernel again**. Above we described all the steps done in R and explained what each bit of code does. Below all those steps are in one block of code, so read carefully and make sure you understand everything! ```bash +# we're in R now! + ## load libraries again ## library(tximeta) library(DropletUtils) @@ -656,9 +659,10 @@ You've already learned how to save and load objects in Jupyter notebook, let's t save(alevin_sce, file = "alevin_combined_sce.rdata") ``` -The last thing that might be useful is exporting the files into your Galaxy history. To do it... guess what! Yes - switching kernels again! But this time we choose Python kernel and run the following command: +The last thing that might be useful is exporting the files into your Galaxy history. To do it... guess what! Yes - **switching kernels again**! 
But this time we choose Python kernel and run the following command: ```bash +# that's Python now! put("alevin_combined_sce.rdata") ``` From 19a4f070504f102d09d3e7c4a962a80211f332bf Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 11:18:45 +0000 Subject: [PATCH 31/46] make sure everything works --- .../tutorials/alevin-commandline/tutorial.md | 23 ++++++++++--------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index ae1c392c490c2e..0d682f33dfe08c 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -483,7 +483,8 @@ And now we can subset our SummarizedExperiment based on the barcodes that are in ```bash alevin_subset <- alevin_se[, colData(alevin_se)$barcode %in% retained_cells] -alevin_subset +alevin_701 <- alevin_subset +alevin_701 ``` And that's our subset, ready for downstream analysis! @@ -500,7 +501,7 @@ But first, we have to save the results of our hard work on sample 701! Saving files is quite straight forward. Just specify which object you want to save and how you want the file to be named. Don't forget the extension! ```bash -save(alevin_subset, file = "alevin_701.rdata") +save(alevin_701, file = "alevin_701.rdata") ``` You will see the new file in the panel on the left. 
@@ -554,14 +555,14 @@ colData(alevin2)$barcode <- barcode2 colData(alevin2) <- cbind(colData(alevin2),out2) # add emptyDrops info -batch2 <- rep("1", length(colnames(alevin_subset2))) -colData(alevin_subset2)$batch <- batch2 # add batch info +batch2 <- rep("1", length(colnames(alevin2))) +colData(alevin2)$batch <- batch2 # add batch info -genotype2 <- rep("wildtype", length(colnames(alevin_subset2))) -colData(alevin_subset2)$genotype <- genotype2 # add genotype info +genotype2 <- rep("wildtype", length(colnames(alevin2))) +colData(alevin2)$genotype <- genotype2 # add genotype info -sex2 <- rep("male", length(colnames(alevin_subset2))) -colData(alevin_subset2)$sex <- sex2 # add sex info +sex2 <- rep("male", length(colnames(alevin2))) +colData(alevin2)$sex <- sex2 # add sex info ## adding gene metadata ## gene_ID2 <- rownames(alevin2) @@ -647,7 +648,7 @@ You get the point, right? It's imporant though that the rowData names and colDat It is generally more common to use SingleCellExperiment format rather than SummarizedExperiment. The conversion is quick and easy, and goes like this: ```bash -# library(SingleCellExperiment) # might need to run this if code below is not working +library(SingleCellExperiment) # might need to load this library alevin_sce <- as(alevin_combined, "SingleCellExperiment") alevin_sce ``` @@ -656,14 +657,14 @@ As you can see, all the embeddings have been successfully transfered during this You've already learned how to save and load objects in Jupyter notebook, let's then save the SCE file: ```bash -save(alevin_sce, file = "alevin_combined_sce.rdata") +save(alevin_sce, file = "alevin_sce.rdata") ``` The last thing that might be useful is exporting the files into your Galaxy history. To do it... guess what! Yes - **switching kernels again**! But this time we choose Python kernel and run the following command: ```bash # that's Python now! 
-put("alevin_combined_sce.rdata") +put("alevin_sce.rdata") ``` # Conclusion From 6036bdb194617998891084b7483cf42b10778864 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 11:46:54 +0000 Subject: [PATCH 32/46] add funding --- topics/single-cell/tutorials/alevin-commandline/tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 0d682f33dfe08c..e1ab1dca6c2fa7 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -45,7 +45,7 @@ contributions: - nomadscientist funding: - - + - eosc-life notebook: language: bash From 29314d2020f71058f8f110f0b513e82908a0cdd9 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 11:54:32 +0000 Subject: [PATCH 33/46] fix? 
--- topics/single-cell/tutorials/alevin-commandline/tutorial.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index e1ab1dca6c2fa7..40497daefbb0f6 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -38,12 +38,10 @@ tags: - 10x - paper-replication - contributions: authorship: - wee-snufkin - nomadscientist - funding: - eosc-life From 20a663a43678233a5d861d0dbeee639332c633bc Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Mon, 13 Nov 2023 11:58:26 +0000 Subject: [PATCH 34/46] fix bib --- .../tutorials/alevin-commandline/tutorial.bib | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.bib b/topics/single-cell/tutorials/alevin-commandline/tutorial.bib index 1fe3a0f14e2df8..b97706c130f31d 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.bib +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.bib @@ -39,14 +39,16 @@ @article{Love2020 } @article{srivastava2019alevin, -title={Alevin efficiently estimates accurate gene abundances from dscRNA-seq data}, -author={Srivastava, Avi and Malik, Laraib and Smith, Tom and Sudbery, Ian and Patro, Rob}, -journal={Genome biology}, -volume={20}, -number={1}, -pages={65}, -year={2019}, -publisher={BioMed Central} + doi = {10.1186/s13059-019-1670-y}, + url = {https://doi.org/10.1186/s13059-019-1670-y}, + year = {2019}, + month = mar, + publisher = {Springer Science and Business Media {LLC}}, + volume = {20}, + number = {1}, + author = {Avi Srivastava and Laraib Malik and Tom Smith and Ian Sudbery and Rob Patro}, + title = {Alevin efficiently estimates accurate gene abundances from {dscRNA}-seq data}, + journal = {Genome Biology} } @article{Benjamini1995, 
From 9557c5c03f8193575936205e63b8f15966230a72 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 25 Nov 2023 10:51:35 +0000 Subject: [PATCH 35/46] typo Co-authored-by: Pavankumar Videm --- topics/single-cell/tutorials/alevin-commandline/preamble.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index 05113af0e1e4be..f63e7e68c21478 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -1,7 +1,7 @@ # Introduction This tutorial is the part of [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using Alevin ({% cite srivastava2019alevin %}) in bash command line. It is a replication of the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) and will guide you through the same steps that you followed in the previous tutorial and will give you more understanding of what is happening ‘behind the scenes’ or ‘inside the tools’ if you will. -As a recap, we fill go from raw FASTQ files to a cell x gene data matrix in AnnData format. After completing the previous tutorial you should already know what is a data matrix and AnnData format. We will perform the following steps: +As a recap, we will go from raw FASTQ files to a cell x gene data matrix in AnnData format. After completing the previous tutorial you should already know what is a data matrix and AnnData format. We will perform the following steps: 1. Getting the appropriate files 2. Making a transcript-to-gene ID mapping 3. 
Creating Salmon index From 6d4b24ff67f3f943159855a1b460854a03752339 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 25 Nov 2023 10:53:06 +0000 Subject: [PATCH 36/46] typo Co-authored-by: Pavankumar Videm --- topics/single-cell/tutorials/alevin-commandline/preamble.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index f63e7e68c21478..d4741cb9a8c1cc 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -1,6 +1,6 @@ # Introduction -This tutorial is the part of [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using Alevin ({% cite srivastava2019alevin %}) in bash command line. It is a replication of the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) and will guide you through the same steps that you followed in the previous tutorial and will give you more understanding of what is happening ‘behind the scenes’ or ‘inside the tools’ if you will. +This tutorial is part of [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using Alevin ({% cite srivastava2019alevin %}) in bash command line. It is a replication of the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) and will guide you through the same steps that you followed in the previous tutorial and will give you more understanding of what is happening ‘behind the scenes’ or ‘inside the tools’ if you will. As a recap, we will go from raw FASTQ files to a cell x gene data matrix in AnnData format. After completing the previous tutorial you should already know what is a data matrix and AnnData format. 
We will perform the following steps: 1. Getting the appropriate files 2. Making a transcript-to-gene ID mapping From d7e9bd482a7767b0b4d37d3826412cc167fdc172 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 25 Nov 2023 10:53:22 +0000 Subject: [PATCH 37/46] typo Co-authored-by: Pavankumar Videm --- topics/single-cell/tutorials/alevin-commandline/preamble.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index d4741cb9a8c1cc..5f20894c9c4d38 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -14,7 +14,7 @@ As a recap, we will go from raw FASTQ files to a cell x gene data matrix in AnnD ## Launching JupyterLab > Data uploads & JupyterLab -> There are a few ways of importing and uploading data into JupyterLab. You might find yourself accidentally doing this differently than the tutorial, and that's ok. There are a few key steps where you will call files from a location - if these don't work from you, check that the file location is correct and change accordingly! +> There are a few ways of importing and uploading data into JupyterLab. You might find yourself accidentally doing this differently than the tutorial, and that's ok. There are a few key steps where you will call files from a location - if these don't work for you, check that the file location is correct and change accordingly! 
{: .warning} > {% snippet faqs/galaxy/interactive_tools_jupyter_launch.md %} From 093ffdd33b249d06673b7e35c185ab88bf2735dd Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 25 Nov 2023 11:08:01 +0000 Subject: [PATCH 38/46] starting a new launcher more explicit --- topics/single-cell/tutorials/alevin-commandline/preamble.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index 5f20894c9c4d38..cc59c712ded798 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -48,7 +48,7 @@ You have two options for how to proceed with this JupyterLab tutorial - you can > Option 2: Creating a notebook > -> 1. Select the **Bash** icon under **Notebook** +> 1. If you are in the Launcher window, Select the **Bash** icon under **Notebook** (to open a new Launcher go to File -> New Launcher). > > ![Bash icon](../../images/scrna-pre-processing/bash.png "Bash Notebook Button") > @@ -73,7 +73,7 @@ Before we start working on the tutorial notebook, we need to install required pa >Installing the packages > -> 1. Navigate back to the `Terminal` (see Option 1 in the box above) +> 1. Navigate back to the `Terminal` (if you haven't opened it yet, just go to File -> New -> Terminal) > 2. 
In the Terminal tab open, write the following, one line at a time: > ``` >conda install -y -c bioconda bioconductor-tximeta # install this first to avoid problem with re-installation of rtracklayer From e1ae7d795730c36ce675cf67d2d912714ae13368 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 25 Nov 2023 12:06:41 +0000 Subject: [PATCH 39/46] Pavan's suggestions --- .../tutorials/alevin-commandline/tutorial.md | 74 +++++++++++++------ 1 file changed, 50 insertions(+), 24 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 40497daefbb0f6..2dbd4f71156889 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -42,6 +42,8 @@ contributions: authorship: - wee-snufkin - nomadscientist + testing: + - pavanvidem funding: - eosc-life @@ -53,7 +55,7 @@ notebook: # Setting up the environment -Alevin is a tool integrated with the [Salmon software](https://salmon.readthedocs.io/en/latest/salmon.html), so first we need to get Salmon. You can install salmon using bioconda, but in this tutorial we will show an alternative method - downloading the pre-compiled binaries from the [releases page](https://github.com/COMBINE-lab/salmon/releases). +Alevin is a tool integrated with the [Salmon software](https://salmon.readthedocs.io/en/latest/salmon.html), so first we need to get Salmon. You can install Salmon using conda, but in this tutorial we will show an alternative method - downloading the pre-compiled binaries from the [releases page](https://github.com/COMBINE-lab/salmon/releases). 
```bash wget -nv https://github.com/COMBINE-lab/salmon/releases/download/v1.10.0/salmon-1.10.0_linux_x86_64.tar.gz @@ -65,6 +67,17 @@ Once you've downloaded a specific binary (here we're using version 1.10.0), just tar -xvzf salmon-1.10.0_linux_x86_64.tar.gz ``` +> Conda installation +> +> As mentioned, installing salmon using conda is also an option, and you can do it using the following command in the terminal: +> ``` +> conda install -c bioconda salmon +> ``` +> +> However, for this tutorial, it would be easier and quicker to use the downloaded pre-compiled binaries, as shown above. +> +{: .details} + We're going to use Alevin for demonstration purposes, but we do not endorse one method over another. # Get Data @@ -133,26 +146,17 @@ Where `-t` stands for our filtered FASTA file, and `-i` is the output the mappin > {: .details} - - # Use Alevin Time to use Alevin now! Alevin works under the same indexing scheme (as salmon) for the reference, and consumes the set of FASTA/Q files(s) containing the Cellular Barcode(CB) + Unique Molecule identifier (UMI) in one read file and the read sequence in the other. Given just the transcriptome and the raw read files, alevin generates a cell-by-gene count matrix (in a fraction of the time compared to other tools). -> -> -> How does Alevin work in detail? -> -> > -> > -> > Alevin works in two phases. In the first phase it quickly parses the read file containing the CB and UMI information to generate the frequency distribution of all the observed CBs, and creates a lightweight data-structure for fast-look up and correction of the CB. In the second round, alevin utilizes the read-sequences contained in the files to map the reads to the transcriptome, identify potential PCR/sequencing errors in the UMIs, and performs hybrid de-duplication while accounting for UMI collisions. Finally, a post-abundance estimation CB whitelisting procedure is done and a cell-by-gene count matrix is generated. 
-> > -> {: .solution} + +> How does Alevin work in detail? > -{: .question} +> Alevin works in two phases. In the first phase it quickly parses the read file containing the CB and UMI information to generate the frequency distribution of all the observed CBs, and creates a lightweight data-structure for fast-look up and correction of the CB. In the second round, alevin utilizes the read-sequences contained in the files to map the reads to the transcriptome, identify potential PCR/sequencing errors in the UMIs, and performs hybrid de-duplication while accounting for UMI collisions. Finally, a post-abundance estimation CB whitelisting procedure is done and a cell-by-gene count matrix is generated. +> +{: .details} Alevin can be run using the following command: @@ -174,7 +178,7 @@ All the required input parameters are described in [the documentation](https://s - `-p`: number of threads, the number of threads which can be used by alevin to perform the quantification, by default alevin utilizes all the available threads in the system, although we recommend using ~10 threads which in our testing gave the best memory-time trade-off. -- `-o`: output, path to folder where the output gene-count matrix (along with other meta-data) would be dumped. We simply call it alevin_output_code +- `-o`: output, path to folder where the output gene-count matrix (along with other meta-data) would be dumped. We simply call it alevin_output - `--tgMap`: transcript to gene map file, a tsv (tab-separated) file — with no header, containing two columns mapping of each transcript present in the reference to the corresponding gene (the first column is a transcript and the second is the corresponding gene). In our case, that's map_code generated by using gtf2featureAnnotation.R function. @@ -190,6 +194,20 @@ We have also added some additional parameters (`--freqThreshold`, `--keepCBFract This tool will take a while to run. Alevin produces many file outputs, not all of which we'll use. 
You can refer to the [Alevin documentation](https://salmon.readthedocs.io/en/latest/alevin.html) if you're curious what they all are, you can look through all the different files to find information such as the mapping rate, but we'll just pass the whole output folder directory for downstream analysis. +> +> +> 1. Can you find the information what was the mapping rate? +> 2. How many transcripts did Alevin find? +> +> > +> > +> > 1. As mentioned above, in *alevin_output* folder there will be many different files, including the log files. To check the mapping rate, go to *alevin_output* -> *logs* and open *salmon_quant* file. There you will find not only information about mapping rate, but also many more, calculated at salmon indexing step. +> > 2. Alevin log can be found in *alevin_output* -> *alevin* and the file name is also *alevin*. You can find many details about the alevin process there, including the number of transcripts found. +> > +> {: .solution} +> +{: .question} + > Process stopping > > The command above will display the log of the process and will say "Analyzed X cells (Y% of all)". For some reason, running Alevin may sometimes cause problems in Jupyter Notebook and this process will stop and not go to completion. This is the reason why we use hugely subsampled dataset here - bigger ones couldn't be fully analysed (they worked fine locally though). The dataset used in this tutorial shouldn't make any issues when you're using Jupyter notebook through galaxy.eu, however might not work properly on galaxy.org. If you're accessing Jupyter notebook via galaxy.eu and alevin process stopped, just restart the kernel and that should help. 
@@ -215,11 +233,18 @@ library(tximeta) The [tximeta package](https://bioconductor.org/packages/devel/bioc/vignettes/tximeta/inst/doc/tximeta.html) created by {% cite Love2020 %} is used for import of transcript-level quantification data into R/Bioconductor and requires that the entire output of alevin is present and unmodified. -First, let's specify the path to the quants_mat.gz file: +In the *alevin_output* -> *alevin* folder you can find the following files: +- *quants_mat.gz*- Compressed count matrix +- *quants_mat_rows.txt*- Row Index (CB-ids) of the matrix. +- *quants_mat_cols.txt* - Column Header (Gene-ids) of the matrix. +- *quants_tier_mat.gz* – Tier categorization of the matrix. + +We will only focus on *quants_mat.gz* though. First, let's specify the path to that file: ```bash path <- 'alevin_output/alevin/quants_mat.gz' ``` + We will specify the following arguments when running *tximeta*: - 'coldata' a data.frame with at least two columns: - files - character, paths of quantification files @@ -385,13 +410,19 @@ Since gene symbols are much more informative than only gene IDs, we will add the library("biomaRt") # load the BioMart library ensembl.ids <- gene_ID mart <- useEnsembl(biomart = "ENSEMBL_MART_ENSEMBL") # connect to a specified BioMart database and dataset hosted by Ensembl -ensembl_m = useMart("ensembl", dataset="mmusculus_gene_ensembl", host='https://nov2020.archive.ensembl.org') +ensembl_m = useMart("ensembl", dataset="mmusculus_gene_ensembl", version=100) # The line above connects to a specified BioMart database and dataset within this database. # In our case we choose the mus musculus database and to get the desired Genome assembly GRCm38, # we specify the host with this archive. If you want to use the most recent version of the dataset, just run: # ensembl_m = useMart("ensembl", dataset="mmusculus_gene_ensembl") ``` + +> Ensembl connection problems +> Sometimes you may encounter some connection issues with Ensembl. 
To improve performance Ensembl provides several mirrors of their site distributed around the globe. When you use the default settings for useEnsembl() your queries will be directed to your closest mirror geographically. In theory this should give you the best performance, however this is not always the case in practice. For example, if the nearest mirror is experiencing many queries from other users it may perform poorly for you. In such cases, the other mirrors should be chosen automatically. +> +{: .warning} + ```bash genes <- getBM(attributes=c('ensembl_gene_id','external_gene_name'), filters = 'ensembl_gene_id', @@ -438,11 +469,6 @@ If you are working on your own data and it’s not mouse data, you can check ava listDatasets(mart) # available datasets ``` -> Ensembl connection problems -> Sometimes you may encounter some connection issues with Ensembl. To improve performance Ensembl provides several mirrors of their site distributed around the globe. When you use the default settings for useEnsembl() your queries will be directed to your closest mirror geographically. In theory this should give you the best performance, however this is not always the case in practice. For example, if the nearest mirror is experiencing many queries from other users it may perform poorly for you. In such cases, the other mirrors should be chosen automatically. 
-> -{: .warning} - ## Flag mitochondrial genes @@ -569,7 +595,7 @@ rowData(alevin2)$gene_ID <- gene_ID2 # get relevant gene names ensembl.ids2 <- gene_ID2 mart <- useEnsembl(biomart = "ENSEMBL_MART_ENSEMBL") # connect to a specified BioMart database and dataset hosted by Ensembl -ensembl_m2 = useMart("ensembl", dataset="mmusculus_gene_ensembl", host='https://nov2020.archive.ensembl.org') +ensembl_m2 = useMart("ensembl", dataset="mmusculus_gene_ensembl", version=100) genes2 <- getBM(attributes=c('ensembl_gene_id','external_gene_name'), filters = 'ensembl_gene_id', From 84104f0d7074ba37ef9387d2097e90ed5767e44f Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 25 Nov 2023 12:07:24 +0000 Subject: [PATCH 40/46] typo Co-authored-by: Pavankumar Videm --- topics/single-cell/tutorials/alevin-commandline/tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 2dbd4f71156889..4140e601d0fd0a 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -684,7 +684,7 @@ You've already learned how to save and load objects in Jupyter notebook, let's t save(alevin_sce, file = "alevin_sce.rdata") ``` -The last thing that might be useful is exporting the files into your Galaxy history. To do it... guess what! Yes - **switching kernels again**! But this time we choose Python kernel and run the following command: +The last thing that might be useful is exporting the files into your Galaxy history. To do it... guess what! Yes - **switching kernels again**! But this time we choose **Python 3** kernel and run the following command: ```bash # that's Python now! 
From 67c3e6107247ba0c9814252bc19619814ed1c9b0 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 25 Nov 2023 12:09:45 +0000 Subject: [PATCH 41/46] typos from Pavan's review Co-authored-by: Pavankumar Videm --- .../tutorials/alevin-commandline/tutorial.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 4140e601d0fd0a..5597add4c7d6c2 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -82,7 +82,7 @@ We're going to use Alevin for demonstration purposes, but we do not endorse one # Get Data -We continue working on the same example data - a very small subset of the reads in a mouse dataset of fetal growth restriction {% cite Bacon2018 %} (see the [study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and the [project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). For the purposes of this tutorial, the datasets have been subsampled to only 50k reads (around 1% of the original files). Those are two fastq files - one with transcripts and the onther one with cell barcodes. You can download the files by running the code below: +We continue working on the same example data - a very small subset of the reads in a mouse dataset of fetal growth restriction {% cite Bacon2018 %} (see the [study in Single Cell Expression Atlas](https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-6945/results/tsne) and the [project submission](https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6945/)). For the purposes of this tutorial, the datasets have been subsampled to only 50k reads (around 1% of the original files). Those are two fastq files - one with transcripts and the another one with cell barcodes. 
You can download the files by running the code below: ```bash wget -nv https://zenodo.org/records/10116786/files/transcript_701.fastq @@ -131,25 +131,25 @@ Sometimes it's important that there are no transcripts in a FASTA-format transcr # Generate a transcriptome index -We will use Salmon in mapping-based mode, so first we have to build a salmon index for our transcriptome. We will run the salmon indexer as so: +We will use Salmon in mapping-based mode, so first we have to build a Salmon index for our transcriptome. We will run the Salmon indexer as so: ```bash salmon-latest_linux_x86_64/bin/salmon index -t filtered_fasta -i salmon_index -k 31 ``` -Where `-t` stands for our filtered FASTA file, and `-i` is the output the mapping-based index. To build it, the funciton is using an auxiliary k-mer hash over k-mers of length 31. While the mapping algorithms will make used of arbitrarily long matches between the query and reference, the k size selected here will act as the minimum acceptable length for a valid match. Thus, a smaller value of k may slightly improve sensitivity. We find that a k of 31 seems to work well for reads of 75bp or longer, but you might consider a smaller k if you plan to deal with shorter reads. Also, a shorter value of k may improve sensitivity even more when using selective alignment (enabled via the –validateMappings flag). So, if you are seeing a smaller mapping rate than you might expect, consider building the index with a slightly smaller k. +Where `-t` stands for our filtered FASTA file, and `-i` is the output of the mapping-based index. To build it, the function is using an auxiliary k-mer hash over k-mers of length 31. While the mapping algorithms will make use of arbitrarily long matches between the query and reference, the k size selected here will act as the minimum acceptable length for a valid match. Thus, a smaller value of k may slightly improve sensitivity. 
We find that a k of 31 seems to work well for reads of 75bp or longer, but you might consider a smaller k if you plan to deal with shorter reads. Also, a shorter value of k may improve sensitivity even more when using selective alignment (enabled via the `--validateMappings` flag). So, if you are seeing a smaller mapping rate than you might expect, consider building the index with a slightly smaller k. > What is the index? > -> To be able to search a transcriptome quickly, salmon needs to convert the text (FASTA) format sequences into something it can search quickly, called an 'index'. The index is in a binary rather than human-readable format, but allows fast lookup by Alevin. Because the types of biological and technical sequences we need to include in the index can vary between experiments, and because we often want to use the most up-to-date reference sequences from Ensembl or NCBI, we can end up re-making the indices quite often. +> To be able to search a transcriptome quickly, Salmon needs to convert the text (FASTA) format sequences into something it can search quickly, called an 'index'. The index is in a binary rather than human-readable format, but allows fast lookup by Alevin. Because the types of biological and technical sequences we need to include in the index can vary between experiments, and because we often want to use the most up-to-date reference sequences from Ensembl or NCBI, we can end up re-making the indices quite often. > {: .details} # Use Alevin -Time to use Alevin now! Alevin works under the same indexing scheme (as salmon) for the reference, and consumes the set of FASTA/Q files(s) containing the Cellular Barcode(CB) + Unique Molecule identifier (UMI) in one read file and the read sequence in the other. Given just the transcriptome and the raw read files, alevin generates a cell-by-gene count matrix (in a fraction of the time compared to other tools). +Time to use Alevin now! 
Alevin works under the same indexing scheme (as Salmon) for the reference, and consumes the set of FASTA/Q file(s) containing the Cellular Barcode (CB) + Unique Molecular Identifier (UMI) in one read file and the read sequence in the other. Given just the transcriptome and the raw read files, Alevin generates a cell-by-gene count matrix (in a fraction of the time compared to other tools). > How does Alevin work in detail? @@ -166,7 +166,7 @@ salmon-latest_linux_x86_64/bin/salmon alevin -l ISR -1 barcodes_701.fastq -2 tra All the required input parameters are described in [the documentation](https://salmon.readthedocs.io/en/latest/alevin.html), but for the ease of use, they are presented below as well: -- `-l`: library type (same as salmon), we recommend using ISR for both Drop-seq and 10x-v2 chemistry. +- `-l`: library type (same as Salmon), we recommend using ISR for both Drop-seq and 10x-v2 chemistry. - `-1`: CB+UMI file(s), alevin requires the path to the FASTQ file containing CB+UMI raw sequences to be given under this command line flag. Alevin also supports parsing of data from multiple files as long as the order is the same as in -2 flag. That's our barcodes_701.fastq file. @@ -293,7 +293,7 @@ out <- emptyDrops(matrix_alevin, lower = 100, niters = 1000, retain = 20) out ``` -We also correct for multiple testing by controlling the false discovery rate (FDR) using the Benjamini-Hochberg (BH) method ({% cite Benjamini1995 %}). Putative cells are defined as those barcodes that have significantly poor fits to the ambient model at a specified FDR threshold. Here, we will use an FDR threshold of 1%. This means that the expected proportion of empty droplets in the set of retained barcodes is no greater than 1%, which we consider to be acceptably low for downstream analyses. +We also correct for multiple testing by controlling the false discovery rate (FDR) using the Benjamini-Hochberg (BH) method ({% cite Benjamini1995 %}). 
Putative cells are defined as those barcodes that have significantly poor fits to the ambient model at a specified FDR threshold. Here, we will use an FDR threshold of 0.01. This means that the expected proportion of empty droplets in the set of retained barcodes is no greater than 1%, which we consider to be acceptably low for downstream analyses. ```bash is.cell <- out$FDR <= 0.01 @@ -522,7 +522,7 @@ But first, we have to save the results of our hard work on sample 701! ## Saving sample 701 data -Saving files is quite straight forward. Just specify which object you want to save and how you want the file to be named. Don't forget the extension! +Saving files is quite straightforward. Just specify which object you want to save and how you want the file to be named. Don't forget the extension! ```bash save(alevin_701, file = "alevin_701.rdata") @@ -695,7 +695,7 @@ put("alevin_sce.rdata") Well done! In this tutorial we have: - examined raw read data, annotations and necessary input files for quantification -- created an index in salmon and run Alevin +- created an index in Salmon and run Alevin - identified barcodes that correspond to non-empty droplets - added gene and cell metadata - applied the necessary conversion to pass these data to downstream processes. From e784f341e503635c4d264525a8da10f1bb694332 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 25 Nov 2023 12:12:27 +0000 Subject: [PATCH 42/46] typo --- topics/single-cell/tutorials/alevin-commandline/tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 5597add4c7d6c2..a13d558428efe8 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -154,7 +154,7 @@ Time to use Alevin now! 
Alevin works under the same indexing scheme (as Salmon) > How does Alevin work in detail? > -> Alevin works in two phases. In the first phase it quickly parses the read file containing the CB and UMI information to generate the frequency distribution of all the observed CBs, and creates a lightweight data-structure for fast-look up and correction of the CB. In the second round, alevin utilizes the read-sequences contained in the files to map the reads to the transcriptome, identify potential PCR/sequencing errors in the UMIs, and performs hybrid de-duplication while accounting for UMI collisions. Finally, a post-abundance estimation CB whitelisting procedure is done and a cell-by-gene count matrix is generated. +> Alevin works in two phases. In the first phase it quickly parses the read file containing the CB and UMI information to generate the frequency distribution of all the observed CBs, and creates a lightweight data-structure for fast-look up and correction of the CB. In the second round, Alevin utilizes the read-sequences contained in the files to map the reads to the transcriptome, identify potential PCR/sequencing errors in the UMIs, and performs hybrid de-duplication while accounting for UMI collisions. Finally, a post-abundance estimation CB whitelisting procedure is done and a cell-by-gene count matrix is generated. 
> {: .details} From 63248b8ac97d57e7d0640152ccd25cc3a19602d1 Mon Sep 17 00:00:00 2001 From: wee-snufkin <44121095+wee-snufkin@users.noreply.github.com> Date: Sat, 25 Nov 2023 12:38:59 +0000 Subject: [PATCH 43/46] warning about memory and the tutorial being just for teaching purposes --- topics/single-cell/tutorials/alevin-commandline/preamble.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index cc59c712ded798..97a1e3bb679356 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -10,6 +10,10 @@ As a recap, we will go from raw FASTQ files to a cell x gene data matrix in AnnD 6. Adding metadata 7. Combining samples data +> This tutorial is for teaching purposes +> We created this tutorial as a gateway to coding to demonstrate what happens behind the Galaxy buttons in the [corresponding tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}). This is why we are using massively subsampled data - it's only for demonstration purposes. If you want to perform this tutorial fully on your own data, you will need another compute power because it's simply not going to scale here. You can always use the Galaxy buttons Alevin version which has large memory and few cores dedicated. 
+{: .warning} + ## Launching JupyterLab From 4d2fe171432254b1cedb0465c62bc19d5e2c07f7 Mon Sep 17 00:00:00 2001 From: Pavankumar Videm Date: Tue, 5 Dec 2023 17:57:34 +0100 Subject: [PATCH 44/46] Update tutorial.md fix internal link --- topics/single-cell/tutorials/alevin-commandline/tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index a13d558428efe8..14ae22bf3340da 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -267,7 +267,7 @@ As you can see, *rowData names* and *colData names* are still empty. Before we a # Identify barcodes that correspond to non-empty droplets -Some sub-populations of small cells may not be distinguished from empty droplets based purely on counts by barcode. Some libraries produce multiple ‘knees’ (see the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}#Basic-QC) for multiple sub-populations. The emptyDrops method ({% cite Lun2019 %}) has become a popular way of dealing with this. emptyDrops still retains barcodes with very high counts, but also adds in barcodes that can be statistically distinguished from the ambient profiles, even if total counts are similar. +Some sub-populations of small cells may not be distinguished from empty droplets based purely on counts by barcode. Some libraries produce multiple ‘knees’ (see the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}#basic-qc) for multiple sub-populations. The emptyDrops method ({% cite Lun2019 %}) has become a popular way of dealing with this. emptyDrops still retains barcodes with very high counts, but also adds in barcodes that can be statistically distinguished from the ambient profiles, even if total counts are similar. 
```bash library(DropletUtils) # load the library and required packages From 7658b08486b1224a20f122fa13712361dd630f84 Mon Sep 17 00:00:00 2001 From: Pavankumar Videm Date: Fri, 8 Dec 2023 11:06:14 +0100 Subject: [PATCH 45/46] Apply suggestions from code review Co-authored-by: mtekman <20641402+mtekman@users.noreply.github.com> --- .../tutorials/alevin-commandline/preamble.md | 4 ++-- .../tutorials/alevin-commandline/tutorial.md | 15 +++++++++------ 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/preamble.md b/topics/single-cell/tutorials/alevin-commandline/preamble.md index 97a1e3bb679356..15f28facd2d29a 100644 --- a/topics/single-cell/tutorials/alevin-commandline/preamble.md +++ b/topics/single-cell/tutorials/alevin-commandline/preamble.md @@ -1,6 +1,6 @@ # Introduction -This tutorial is part of [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using Alevin ({% cite srivastava2019alevin %}) in bash command line. It is a replication of the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) and will guide you through the same steps that you followed in the previous tutorial and will give you more understanding of what is happening ‘behind the scenes’ or ‘inside the tools’ if you will. +This tutorial is part of [Single-cell RNA-seq: Case Study]({% link topics/single-cell/index.md %}) series and focuses on generating a single cell matrix using Alevin ({% cite srivastava2019alevin %}) in the bash command line. It is a replication of the [previous tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) and will guide you through the same steps that you followed in the previous tutorial and will give you more understanding of what is happening ‘behind the scenes’ or ‘inside the tools’ if you will. 
As a recap, we will go from raw FASTQ files to a cell x gene data matrix in AnnData format. After completing the previous tutorial you should already know what is a data matrix and AnnData format. We will perform the following steps: 1. Getting the appropriate files 2. Making a transcript-to-gene ID mapping @@ -11,7 +11,7 @@ As a recap, we will go from raw FASTQ files to a cell x gene data matrix in AnnD 7. Combining samples data > This tutorial is for teaching purposes -> We created this tutorial as a gateway to coding to demonstrate what happens behind the Galaxy buttons in the [corresponding tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}). This is why we are using massively subsampled data - it's only for demonstration purposes. If you want to perform this tutorial fully on your own data, you will need another compute power because it's simply not going to scale here. You can always use the Galaxy buttons Alevin version which has large memory and few cores dedicated. +> We created this tutorial as a gateway to coding to demonstrate what happens behind the Galaxy buttons in the [corresponding tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}). This is why we are using massively subsampled data - it's only for demonstration purposes. If you want to perform this tutorial fully on your own data, you will need more compute power because it's simply not going to scale here. You can always use the Galaxy buttons' Alevin version which has large memory and few cores dedicated. 
{: .warning} diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 14ae22bf3340da..942ec23a4afe3c 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -55,7 +55,7 @@ notebook: # Setting up the environment -Alevin is a tool integrated with the [Salmon software](https://salmon.readthedocs.io/en/latest/salmon.html), so first we need to get Salmon. You can install Salmon using conda, but in this tutorial we will show an alternative method - downloading the pre-compiled binaries from the [releases page](https://github.com/COMBINE-lab/salmon/releases). +Alevin is a tool integrated with the [Salmon software](https://salmon.readthedocs.io/en/latest/salmon.html), so first we need to get Salmon. You can install Salmon using conda, but in this tutorial we will show an alternative method - downloading the pre-compiled binaries from the [releases page](https://github.com/COMBINE-lab/salmon/releases). Note that binaries are usually compiled for specific CPU architectures, such as the 64-bit (x86_64) machine release referenced below. ```bash wget -nv https://github.com/COMBINE-lab/salmon/releases/download/v1.10.0/salmon-1.10.0_linux_x86_64.tar.gz @@ -112,7 +112,7 @@ wget -c https://zenodo.org/record/4574153/files/Mus_musculus.GRCm38.cdna.all.fa. Why do we need FASTA and GTF files? To generate gene-level quantifications based on transcriptome quantification, Alevin and similar tools require a conversion between transcript and gene identifiers. We can derive a transcript-gene conversion from the gene annotations available in genome resources such as Ensembl. The transcripts in such a list need to match the ones we will use later to build a binary transcriptome index. If you were using spike-ins, you'd need to add these to the transcriptome and the transcript-gene mapping. 
-We will use the murine reference annotation as retrieved from Ensembl in GTF format. This annotation contains gene, exon, transcript and all sorts of other information on the sequences. We will use these to generate the transcript-gene mapping by passing that information to a tool that extracts just the transcript identifiers we need. +We will use the murine reference annotation as retrieved from Ensembl (*GRCm38* or *mm10*) in GTF format. This annotation contains gene, exon, transcript and all sorts of other information on the sequences. We will use these to generate the transcript-gene mapping by passing that information to a tool that extracts just the transcript identifiers we need. # Generate a transcript to gene map and filtered FASTA @@ -125,7 +125,8 @@ gtf2featureAnnotation.R -g GRCm38_gtf.gff -c GRCm38_cdna.fasta -d "transcript_id In essence, [gtf2featureAnnotation.R script](https://github.com/ebi-gene-expression-group/atlas-gene-annotation-manipulation) takes a GTF annotation file and creates a table of annotation by feature, optionally filtering a cDNA file supplied at the same time. Therefore the first parameter `-g` stands for "gtf-file" and requires a path to a valid GTF file. Then `-c` takes a cDNA file for extracting meta info and/or filtering - that's our FASTA! Where --parse-cdnas (that's our `-c`) is specified, we need to specify, using `-d`, which field should be used to compare to identfiers from the FASTA. We set that to "transcript_id" - feel free to inspect the GTF file to explore other attributes. We pass the same value in `-f`, meaning first-field, ie. the name of the field to place first in output table. To specify which other fields to retain in the output table, we provide comma-separated list of those fields, and since we're only interested in transcript to gene map, we put those two names ("transcript_id,gene_id") into `-l`. `-t` stands for the feature type to use, and in our case we're using "transcript". Guess what `-o` is! 
Indeed, that's the output annotation table - here we specify the file path of our transcript to gene map. We will also have another output denoted by `-e` and that's the path to a filtered FASTA. Finally, we also put `-r` which is there only to suppress header on output. Summarising, output will be a an annotation table, and a FASTA-format cDNAs file with unannotated transcripts removed. -Why filtered FASTA? +**Why filtered FASTA?** + Sometimes it's important that there are no transcripts in a FASTA-format transcriptome that cannot be matched to a transcript/gene mapping. Salmon, for example, used to produce errors when this mismatch was present. We can synchronise the cDNA file by removing mismatches as we have done above. @@ -267,7 +268,7 @@ As you can see, *rowData names* and *colData names* are still empty. Before we a # Identify barcodes that correspond to non-empty droplets -Some sub-populations of small cells may not be distinguished from empty droplets based purely on counts by barcode. Some libraries produce multiple ‘knees’ (see the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}#basic-qc) for multiple sub-populations. The emptyDrops method ({% cite Lun2019 %}) has become a popular way of dealing with this. emptyDrops still retains barcodes with very high counts, but also adds in barcodes that can be statistically distinguished from the ambient profiles, even if total counts are similar. +Some sub-populations of small cells may not be distinguished from empty droplets based purely on counts by barcode. Some libraries produce multiple ‘knees’ (see the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}#basic-qc) for multiple sub-populations. The `emptyDrops` method ({% cite Lun2019 %}) has become a popular way of dealing with this. 
`emptyDrops` still retains barcodes with very high counts, but also adds in barcodes that can be statistically distinguished from the ambient profiles, even if total counts are similar. ```bash library(DropletUtils) # load the library and required packages @@ -335,6 +336,7 @@ If you have a look at the Experimental Design from that study, you might notice | Index | Batch | Genotype | Sex | |------ |--------------------| +|--:|--:|:--|:-:| | N701 | 0 | wildtype | male | | N702 | 1 | knockout | male | | N703 | 2 | knockout | female | @@ -465,6 +467,7 @@ rowData(alevin_se) ``` If you are working on your own data and it’s not mouse data, you can check available datasets for other species and just use relevant dataset in `useMart()` function. + ```bash listDatasets(mart) # available datasets ``` @@ -516,7 +519,7 @@ And that's our subset, ready for downstream analysis! # More datasets -We've done the analysis for one sample. But there are 7 samples in this experiment and it would be very handy to have all the information in one place. Therefore, you would need to repeat all the steps for the subsequent samples (that's when you'll appreciate wrapped tools and automation in Galaxy workflows!). To make your life easier, we will show you how to combine the datasets on smaller scale. Also, to save you some time, we've already run alevin on sample 702 (also subsampled to 50k reads). Let's quickly repeat the steps we performed in R to complete the analysis of sample 702 in the same way as we did with 701. +We've done the analysis for one sample. But there are 7 samples in this experiment and it would be very handy to have all the information in one place. Therefore, you would need to repeat all the steps for the subsequent samples (that's when you'll appreciate wrapped tools and automation in Galaxy workflows!). To make your life easier, we will show you how to combine the datasets on smaller scale. 
Also, to save you some time, we've already run Alevin on sample 702 (also subsampled to 50k reads). Let's quickly repeat the steps we performed in R to complete the analysis of sample 702 in the same way as we did with 701. But first, we have to save the results of our hard work on sample 701! @@ -664,7 +667,7 @@ alevin_combined_demo <- cbind(alevin_combined, alevin_subset3) alevin_combined_demo ``` -You get the point, right? It's imporant though that the rowData names and colData names are the same in each sample. +You get the point, right? It's important though that the rowData names and colData names are the same in each sample. # Saving and exporting the files From 0bc3c8260ea9113626b4b83d6fbd0342ea668321 Mon Sep 17 00:00:00 2001 From: Pavankumar Videm Date: Fri, 8 Dec 2023 11:30:24 +0100 Subject: [PATCH 46/46] add details boxes for input parameters of alevin and emptydrops --- .../tutorials/alevin-commandline/tutorial.md | 60 ++++++++++--------- 1 file changed, 33 insertions(+), 27 deletions(-) diff --git a/topics/single-cell/tutorials/alevin-commandline/tutorial.md b/topics/single-cell/tutorials/alevin-commandline/tutorial.md index 942ec23a4afe3c..c27a1bc73542bc 100644 --- a/topics/single-cell/tutorials/alevin-commandline/tutorial.md +++ b/topics/single-cell/tutorials/alevin-commandline/tutorial.md @@ -125,8 +125,7 @@ gtf2featureAnnotation.R -g GRCm38_gtf.gff -c GRCm38_cdna.fasta -d "transcript_id In essence, [gtf2featureAnnotation.R script](https://github.com/ebi-gene-expression-group/atlas-gene-annotation-manipulation) takes a GTF annotation file and creates a table of annotation by feature, optionally filtering a cDNA file supplied at the same time. Therefore the first parameter `-g` stands for "gtf-file" and requires a path to a valid GTF file. Then `-c` takes a cDNA file for extracting meta info and/or filtering - that's our FASTA! 
Where --parse-cdnas (that's our `-c`) is specified, we need to specify, using `-d`, which field should be used to compare to identfiers from the FASTA. We set that to "transcript_id" - feel free to inspect the GTF file to explore other attributes. We pass the same value in `-f`, meaning first-field, ie. the name of the field to place first in output table. To specify which other fields to retain in the output table, we provide comma-separated list of those fields, and since we're only interested in transcript to gene map, we put those two names ("transcript_id,gene_id") into `-l`. `-t` stands for the feature type to use, and in our case we're using "transcript". Guess what `-o` is! Indeed, that's the output annotation table - here we specify the file path of our transcript to gene map. We will also have another output denoted by `-e` and that's the path to a filtered FASTA. Finally, we also put `-r` which is there only to suppress header on output. Summarising, output will be a an annotation table, and a FASTA-format cDNAs file with unannotated transcripts removed. -**Why filtered FASTA?** - +Why filtered FASTA? Sometimes it's important that there are no transcripts in a FASTA-format transcriptome that cannot be matched to a transcript/gene mapping. Salmon, for example, used to produce errors when this mismatch was present. We can synchronise the cDNA file by removing mismatches as we have done above. @@ -167,27 +166,30 @@ salmon-latest_linux_x86_64/bin/salmon alevin -l ISR -1 barcodes_701.fastq -2 tra All the required input parameters are described in [the documentation](https://salmon.readthedocs.io/en/latest/alevin.html), but for the ease of use, they are presented below as well: -- `-l`: library type (same as Salmon), we recommend using ISR for both Drop-seq and 10x-v2 chemistry. - -- `-1`: CB+UMI file(s), alevin requires the path to the FASTQ file containing CB+UMI raw sequences to be given under this command line flag. 
Alevin also supports parsing of data from multiple files as long as the order is the same as in -2 flag. That's our barcodes_701.fastq file. - -- `-2`: Read-sequence file(s), alevin requires the path to the FASTQ file containing raw read-sequences to be given under this command line flag. Alevin also supports parsing of data from multiple files as long as the order is the same as in -1 flag. That's our transcript_701.fastq file. - -- `--dropseq` / `--chromium` / `--chromiumV3`: the protocol, this flag tells the type of single-cell protocol of the input sequencing-library. This is a study using the Drop-seq chemistry, so we specify that in the flag. - -- `-i`: index, file containing the salmon index of the reference transcriptome, as generated by salmon index command. - -- `-p`: number of threads, the number of threads which can be used by alevin to perform the quantification, by default alevin utilizes all the available threads in the system, although we recommend using ~10 threads which in our testing gave the best memory-time trade-off. - -- `-o`: output, path to folder where the output gene-count matrix (along with other meta-data) would be dumped. We simply call it alevin_output - -- `--tgMap`: transcript to gene map file, a tsv (tab-separated) file — with no header, containing two columns mapping of each transcript present in the reference to the corresponding gene (the first column is a transcript and the second is the corresponding gene). In our case, that's map_code generated by using gtf2featureAnnotation.R function. - -- `--freqThreshold` - minimum frequency for a barcode to be considered. We've chosen 3 as this will only remove cell barcodes with a frequency of less than 3, a low bar to pass but useful way of avoiding processing a bunch of almost certainly empty barcodes. - -- `--keepCBFraction` - fraction of cellular barcodes to keep. We're using 1 to quantify all! 
- -- `--dumpFeatures` - if activated, alevin dumps all the features used by the CB classification and their counts at each cell level. It’s generally used in pair with other command line flags. +> Alevin input parameters +> - `-l`: library type (same as Salmon), we recommend using ISR for both Drop-seq and 10x-v2 chemistry. +> +> - `-1`: CB+UMI file(s), alevin requires the path to the FASTQ file containing CB+UMI raw sequences to be given under this command line flag. Alevin also supports parsing of data from multiple files as long as the order is the same as in -2 flag. That's our barcodes_701.fastq file. +> +> - `-2`: Read-sequence file(s), alevin requires the path to the FASTQ file containing raw read-sequences to be given under this command line flag. Alevin also supports parsing of data from multiple files as long as the order is the same as in -1 flag. That's our transcript_701.fastq file. +> +> - `--dropseq` / `--chromium` / `--chromiumV3`: the protocol, this flag tells the type of single-cell protocol of the input sequencing-library. This is a study using the Drop-seq chemistry, so we specify that in the flag. +> +> - `-i`: index, file containing the salmon index of the reference transcriptome, as generated by salmon index command. +> +> - `-p`: number of threads, the number of threads which can be used by alevin to perform the quantification, by default alevin utilizes all the available threads in the system, although we recommend using ~10 threads which in our testing gave the best memory-time trade-off. +> +> - `-o`: output, path to folder where the output gene-count matrix (along with other meta-data) would be dumped. We simply call it alevin_output +> +> - `--tgMap`: transcript to gene map file, a tsv (tab-separated) file — with no header, containing two columns mapping of each transcript present in the reference to the corresponding gene (the first column is a transcript and the second is the corresponding gene). 
In our case, that's map_code generated by using gtf2featureAnnotation.R function. +> +> - `--freqThreshold` - minimum frequency for a barcode to be considered. We've chosen 3 as this will only remove cell barcodes with a frequency of less than 3, a low bar to pass but useful way of avoiding processing a bunch of almost certainly empty barcodes. +> +> - `--keepCBFraction` - fraction of cellular barcodes to keep. We're using 1 to quantify all! +> +> - `--dumpFeatures` - if activated, alevin dumps all the features used by the CB classification and their counts at each cell level. It’s generally used in pair with other command line flags. +> +{: .details} We have also added some additional parameters (`--freqThreshold`, `--keepCBFraction`) and their values are derived from the [Alevin Galaxy tutorial]({% link topics/single-cell/tutorials/scrna-case_alevin/tutorial.md %}) after QC to stop Alevin from applying its own thresholds. However, if you're not sure what value to pick, you can simply allow Alevin to make its own calls on what constitutes empty droplets. @@ -276,10 +278,14 @@ library(DropletUtils) # load the library and required packages emptyDrops takes multiple arguments that you can read about in the [documentation](https://rdrr.io/github/MarioniLab/DropletUtils/man/emptyDrops.html). However, in this case, we will only specify the following arguments: -- `m` - A numeric matrix-like object - usually a dgTMatrix or dgCMatrix - containing droplet data prior to any filtering or cell calling. Columns represent barcoded droplets, rows represent genes. -- `lower` - A numeric scalar specifying the lower bound on the total UMI count, at or below which all barcodes are assumed to correspond to empty droplets. -- `niters` - An integer scalar specifying the number of iterations to use for the Monte Carlo p-value calculations. -- `retain` - A numeric scalar specifying the threshold for the total UMI count above which all barcodes are assumed to contain cells. 
+> emptyDrops input parameters +> +> - `m` - A numeric matrix-like object - usually a dgTMatrix or dgCMatrix - containing droplet data prior to any filtering or cell calling. Columns represent barcoded droplets, rows represent genes. +> - `lower` - A numeric scalar specifying the lower bound on the total UMI count, at or below which all barcodes are assumed to correspond to empty droplets. +> - `niters` - An integer scalar specifying the number of iterations to use for the Monte Carlo p-value calculations. +> - `retain` - A numeric scalar specifying the threshold for the total UMI count above which all barcodes are assumed to contain cells. +> +{: .details} Let's then extract the matrix from our `alevin_se` object. It's stored in *assays* -> *counts*.