diff --git a/docs/source/getting_started/bioinformatics.md b/docs/source/getting_started/bioinformatics.md
index e8c24416d2952c419e14393cf4b78b45afb3e9f8..c7458136600fb49732e734111661d308ca8f7951 100644
--- a/docs/source/getting_started/bioinformatics.md
+++ b/docs/source/getting_started/bioinformatics.md
@@ -1,32 +1,544 @@
 # Bioinformatics Pipelines
 Bioinformatics pipelines are an integral component of next-generation sequencing (NGS). Processing raw sequence data to detect genomic alterations has significant impact on disease management and patient care. Because of the lack of published guidance, there is currently a high degree of variability in how members of the global molecular genetics and pathology community establish and validate bioinformatics pipelines.
 
-## Bioinformatics Analysis of NGS Data
-NGS bioinformatics pipelines are frequently platform specific and may be customizable on the basis of laboratory needs. A bioinformatics pipeline consists of the following major steps:
+## Bioinformatics Analysis of RNA-seq Data
+RNA-seq is a powerful platform for comprehensive investigation of the transcriptome.The General bioinformatics workflow for the quantitative analysis of RNA-seq data includes three parts: 
+- RNA-seq Quality Check; 
+- Quality timming of Adapters;
+- mapping sequencing reads to a reference genome or transcriptome; 
+- quantifying expression levels of individual genes and transcripts; and
+- identifying specific genes and transcripts that are differentially expressed between samples.
 
-### Sequence Generation
+### BASIC WORKFLOW
 
-> Sequence generation (signal processing and base calling) is the process that converts sensor (optical and nonoptical) data from the sequencing platform and identifies the sequence of nucleotides for each of the short fragments of DNA in the sample prepared for analysis. For each nucleotide sequenced in these short fragments (ie, raw reads), a corresponding Phred-like quality score is assigned, which is sequencing platform specific. The read sequences along with the Phred-like quality scores are stored in a FASTQ file, which is a de facto standard for representing biological sequence information
+#### FASTQ Files - RAW Data
+Raw RNA-seq data are typically formatted as **FASTQ** files. **FASTQ** is a text-based format storing the sequences of the reads as well as their sequencing quality. The file is organized in groups of four lines per read as shown below:
+
+> @NB500929:247:HL2TYBGX3:1:11101:25163:1060
+> GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+> +  
+> !â€™â€™*((((\*\*\*+))%%%++)(%%%%).1\*\*\*-+\*â€™â€™))**55CCF>>>>>>CCCCCCC65
+
+The first line starts with â€œ@â€ and is followed by a unique sequence identifier, which includes instrument ID (NB500929), run number (247), and flow cell ID (HL2TYBGX3), followed by the numbers specifying the location of the DNA fragment on the flowcell. In the case of paired-end sequencing, two FASTQ files for read 1 and read 2 include the same sequence identifiers plus the read number (1 or 2), which indicates whether the sequence comes from read 1 or read 2 of the DNA fragment. The second line contains the read sequence. The third line starts with a â€œ+â€ character and can optionally be followed by the same sequence identifier and any additional description. The fourth line encodes the sequencing quality scores for each base, which are coded as individual symbols according to a [coding scheme](https://support.illumina.com/help/BaseSpace_OLH_009008/Content/Source/Informatics/BS/QualityScoreEncoding_swBS.htm)
+
+#### QUALITY CHECK ON RAW DATA
+RNA-Seq has become one of the most widely used applications based on next-generation sequencing technology. However, raw RNA-Seq data may have quality issues, which can significantly distort analytical results and lead to erroneous conclusions. Therefore, the raw data must be subjected to vigorous quality control (QC) procedures before downstream analysis. Currently, an accurate and complete QC of RNA-Seq data requires of a suite of different QC tools used consecutively, which is inefficient in terms of usability, running time, file usage, and interpretability of the results.
+
+[FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) provides a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.
+
+The main functions of FastQC are:
+
+    Import of data from FASTQ files (also accepts BAM and SAM alignment files)
+    Quick overview of any likely sequencing problems
+    Summary graphs and tables to quickly assess your data
+    Export of results as an HTML-based report
+
+FastQC has a really well documented manual page with detailed explanations about every plot in the report.
+
+<details>
+<summary>Working on FastQC</summary>
+
+```bash
+    $ fastqc SRR20076358_1.fastq.gz                                                 
+```
+Running the above command produces the following output in console and Results in making reports as two files `SRR20076358_1_fastqc.zip` and `SRR20076358_1_fastqc.html`
+```bash
+Started analysis of SRR20076358_1.fastq.gz
+    Approx 5% complete for SRR20076358_1.fastq.gz
+    Approx 10% complete for SRR20076358_1.fastq.gz
+    Approx 15% complete for SRR20076358_1.fastq.gz
+    Approx 20% complete for SRR20076358_1.fastq.gz
+    Approx 25% complete for SRR20076358_1.fastq.gz
+    Approx 30% complete for SRR20076358_1.fastq.gz
+    Approx 35% complete for SRR20076358_1.fastq.gz
+    Approx 40% complete for SRR20076358_1.fastq.gz
+    Approx 45% complete for SRR20076358_1.fastq.gz
+    Approx 50% complete for SRR20076358_1.fastq.gz
+    Approx 55% complete for SRR20076358_1.fastq.gz
+    Approx 60% complete for SRR20076358_1.fastq.gz
+    Approx 65% complete for SRR20076358_1.fastq.gz
+    Approx 70% complete for SRR20076358_1.fastq.gz
+    Approx 75% complete for SRR20076358_1.fastq.gz
+    Approx 80% complete for SRR20076358_1.fastq.gz
+    Approx 85% complete for SRR20076358_1.fastq.gz
+    Approx 90% complete for SRR20076358_1.fastq.gz
+    Approx 95% complete for SRR20076358_1.fastq.gz
+    Analysis complete for SRR20076358_1.fastq.gz
+```
+Run the following command to view the result of the Quality Check as shown in _fig.2.1_
+
+```bash
+$ firefox SRR20076358_1.fastq.gz
+```
+```{image} ../img/fastqc_report.png
+```
+<p align="center">
+fig.2.1
+</p>
 </details>
 
+#### QUALITY TRIMMING OF ADAPTERS
+Trimming for adaptors and low quality bases is important part of the analysis pipeline for sequencing data. Typically, after you isolate and fragment your RNA sample, adaptors are attached to the ends of the sequences that are needed for sequencing .These adaptors need to be removed from the sequenced reads before downstream processing. An additional step that needs to be taken is removing low quality bases.
+
+Quality trimming decreases the overall number of reads, but increases to the total and proportion of uniquely mapped reads. Thus, you get more useful data for downstream analyses.
+
+There are many tools for trimming reads and removing adapters, such as Trim Galore!, Trimmomatic, Cutadapt, skewer, AlienTrimmer, BBDuk, and the most recent SOAPnuke and fastp.
+
+Trim Galore! is a wrapper script to automate quality and adapter trimming as well as quality control, with some added functionality to remove biased methylation positions for RRBS sequence files (for directional, non-directional (or paired-end) sequencing).
+<details>
+<summary>Working on trim_galore</summary>
+
+```bash
+$ trim_galore --gzip --fastqc --max_n 2 --paired --length 50 SRR11862696_1.fastq.gz SRR11862696_2.fastq.gz
+```
+<details>
+<summary>The above command will produce the following result in console
+</summary>
+
+```bash
+# trim_galore --gzip --fastqc --max_n 2 --paired --length 50 SRR11862696_1.fastq.gz SRR11862696_2.fastq.gz                              
+
+Multicore support not enabled. Proceeding with single-core trimming.
+Path to Cutadapt set as: 'cutadapt' (default)
+Cutadapt seems to be working fine (tested command 'cutadapt --version')
+Cutadapt version: 4.1
+single-core operation.
+No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)
+
+
+
+AUTO-DETECTING ADAPTER TYPE
+===========================
+Attempting to auto-detect adapter type from the first 1 million sequences of the first file (>> SRR11862696_1.fastq.gz <<)
+
+Found perfect matches for the following adapter sequences:
+Adapter type    Count   Sequence        Sequences analysed      Percentage
+Illumina        4229    AGATCGGAAGAGC   1000000 0.42
+Nextera 7       CTGTCTCTTATA    1000000 0.00
+smallRNA        3       TGGAATTCTCGG    1000000 0.00
+Using Illumina adapter for trimming (count: 4229). Second best hit was Nextera (count: 7)
+
+Writing report to 'SRR11862696_1.fastq.gz_trimming_report.txt'
+
+SUMMARISING RUN PARAMETERS
+==========================
+Input filename: SRR11862696_1.fastq.gz
+Trimming mode: paired-end
+Trim Galore version: 0.6.6
+Cutadapt version: 4.1
+Number of cores used for trimming: 1
+Quality Phred score cutoff: 20
+Quality encoding type selected: ASCII+33
+Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
+Maximum trimming error rate: 0.1 (default)
+Maximum number of tolerated Ns: 2
+Minimum required adapter overlap (stringency): 1 bp
+Minimum required sequence length for both reads before a sequence pair gets removed: 50 bp
+Running FastQC on the data once trimming has completed
+Output file(s) will be GZIP compressed
+
+Cutadapt seems to be fairly up-to-date (version 4.1). Setting -j 1
+Writing final adapter and quality trimmed output to SRR11862696_1_trimmed.fq.gz
+
+
+  >>> Now performing quality (cutoff '-q 20') and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file SRR11862696_1.fastq.gz <<< 
+10000000 sequences processed
+20000000 sequences processed
+30000000 sequences processed
+40000000 sequences processed
+This is cutadapt 4.1 with Python 3.10.6
+Command line parameters: -j 1 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC SRR11862696_1.fastq.gz
+Processing single-end reads on 1 core ...
+Finished in 1353.17 s (29 Âµs/read; 2.08 M reads/minute).
+
+=== Summary ===
+
+Total reads processed:              46,831,782
+Reads with adapters:                15,832,779 (33.8%)
+Reads written (passing filters):    46,831,782 (100.0%)
+
+Total basepairs processed: 4,730,009,982 bp
+Quality-trimmed:              45,195,419 bp (1.0%)
+Total written (filtered):  4,644,800,323 bp (98.2%)
+
+=== Adapter 1 ===
+
+Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 15832779 times
+
+Minimum overlap: 1
+No. of allowed errors:
+1-9 bp: 0; 10-13 bp: 1
+
+Bases preceding removed adapters:
+  A: 29.1%
+  C: 33.1%
+  G: 21.6%
+  T: 15.3%
+  none/other: 1.0%
+
+Overview of removed sequences
+length  count   expect  max.err error counts
+1       10545866        11707945.5      0       10545866
+2       3526202 2926986.4       0       3526202
+3       947984  731746.6        0       947984
+4       207831  182936.6        0       207831
+5       70578   45734.2 0       70578
+6       34326   11433.5 0       34326
+7       27723   2858.4  0       27723
+8       25762   714.6   0       25762
+9       23820   178.6   0       23189 631
+10      21148   44.7    1       19978 1170
+11      18346   11.2    1       17295 1051
+12      16859   2.8     1       16455 404
+13      14322   0.7     1       14086 236
+14      14148   0.7     1       13862 286
+15      13064   0.7     1       12807 257
+16      13012   0.7     1       12753 259
+17      12842   0.7     1       12598 244
+18      12040   0.7     1       11787 253
+19      9449    0.7     1       9243 206
+20      8880    0.7     1       8713 167
+21      8038    0.7     1       7853 185
+22      6802    0.7     1       6651 151
+23      6453    0.7     1       6261 192
+24      5955    0.7     1       5803 152
+25      5584    0.7     1       5415 169
+26      5190    0.7     1       5018 172
+27      5085    0.7     1       4924 161
+28      4953    0.7     1       4808 145
+29      4594    0.7     1       4427 167
+30      4196    0.7     1       4048 148
+31      3299    0.7     1       3176 123
+32      3192    0.7     1       3072 120
+33      2944    0.7     1       2808 136
+34      2785    0.7     1       2612 173
+35      2601    0.7     1       2458 143
+36      2053    0.7     1       1923 130
+37      2335    0.7     1       2155 180
+38      2403    0.7     1       2260 143
+39      2047    0.7     1       1855 192
+40      2027    0.7     1       1880 147
+41      1750    0.7     1       1537 213
+42      1814    0.7     1       1595 219
+43      1645    0.7     1       1545 100
+44      994     0.7     1       914 80
+45      1238    0.7     1       1163 75
+46      975     0.7     1       880 95
+47      1148    0.7     1       1021 127
+48      1106    0.7     1       991 115
+49      958     0.7     1       864 94
+50      980     0.7     1       839 141
+51      924     0.7     1       825 99
+52      828     0.7     1       721 107
+53      958     0.7     1       742 216
+54      1029    0.7     1       741 288
+55      917     0.7     1       785 132
+56      529     0.7     1       464 65
+57      698     0.7     1       537 161
+58      740     0.7     1       519 221
+59      829     0.7     1       535 294
+60      1114    0.7     1       576 538
+61      992     0.7     1       596 396
+62      1052    0.7     1       456 596
+63      1551    0.7     1       483 1068
+64      2633    0.7     1       533 2100
+65      4064    0.7     1       619 3445
+66      2119    0.7     1       542 1577
+67      1983    0.7     1       503 1480
+68      2846    0.7     1       449 2397
+69      5405    0.7     1       485 4920
+70      10363   0.7     1       662 9701
+71      47660   0.7     1       968 46692
+72      34144   0.7     1       2204 31940
+73      15329   0.7     1       970 14359
+74      8711    0.7     1       812 7899
+75      3735    0.7     1       481 3254
+76      3407    0.7     1       281 3126
+77      3785    0.7     1       93 3692
+78      2433    0.7     1       44 2389
+79      1587    0.7     1       35 1552
+80      964     0.7     1       17 947
+81      709     0.7     1       14 695
+82      469     0.7     1       8 461
+83      325     0.7     1       5 320
+84      295     0.7     1       5 290
+85      198     0.7     1       3 195
+86      161     0.7     1       6 155
+87      184     0.7     1       9 175
+88      143     0.7     1       3 140
+89      124     0.7     1       2 122
+90      142     0.7     1       1 141
+91      131     0.7     1       1 130
+92      164     0.7     1       3 161
+93      168     0.7     1       0 168
+94      188     0.7     1       1 187
+95      188     0.7     1       1 187
+96      267     0.7     1       0 267
+97      312     0.7     1       2 310
+98      390     0.7     1       0 390
+99      404     0.7     1       0 404
+100     767     0.7     1       0 767
+101     4375    0.7     1       0 4375
 
-### Sequence Alignment
+RUN STATISTICS FOR INPUT FILE: SRR11862696_1.fastq.gz
+=============================================
+46831782 sequences processed in total
+The length threshold of paired-end sequences gets evaluated later on (in the validation step)
 
-> Sequence alignment is the process of determining where each short DNA sequence read (each typically <250 bp) aligns with a reference genome (eg, the human reference genome used in clinical laboratories). This computationally intensive process assigns a Phred-scale mapping quality score to each of the short sequence reads, indicating the confidence of the alignment process. This step also provides a genomic context (location in the reference genome) to each aligned sequence read, which can be used to calculate the proportion of mapped reads and depth (coverage) of sequencing for one or more loci in the sequenced region of interest. The sequence alignment data are usually stored in a de facto standard binary alignment map (BAM) file format, which is a binary version of the sequence alignment/map format. The newer compressed representation [Compressed and Reference-oriented Alignment Map (CRAM)] or its encrypted version [Selective retrieval on Encrypted and Compressed Reference-oriented Alignment Map (SECRAM)]6 is a viable alternative that saves space and secures genetic information, although laboratories need to carefully validate variant calling impact if lossy (as opposed to lossless) compression settings are used in generating CRAM (European Nucleotide Archive, CRAM format specification version 3.0; http://samtools.github.io/hts-specs/CRAMv3.pdf, last accessed November 23, 2016) and SECRAM files.
+Writing report to 'SRR11862696_2.fastq.gz_trimming_report.txt'
 
-### Variant Calling
+SUMMARISING RUN PARAMETERS
+==========================
+Input filename: SRR11862696_2.fastq.gz
+Trimming mode: paired-end
+Trim Galore version: 0.6.6
+Cutadapt version: 4.1
+Number of cores used for trimming: 1
+Quality Phred score cutoff: 20
+Quality encoding type selected: ASCII+33
+Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
+Maximum trimming error rate: 0.1 (default)
+Maximum number of tolerated Ns: 2
+Minimum required adapter overlap (stringency): 1 bp
+Minimum required sequence length for both reads before a sequence pair gets removed: 50 bp
+Running FastQC on the data once trimming has completed
+Output file(s) will be GZIP compressed
 
-> Variant calling is the process of accurately identifying the differences or variations between the sample and the reference genome sequence. The typical input is a set of aligned reads in BAM or another similar format, which is traversed by the variant caller to identify sequence variants. Variant calling is a heterogeneous collection of algorithmic strategies based on the types of sequence variants, such as single-nucleotide variants (SNVs), small insertions and deletions (indels), copy number alterations, and large structural alterations (insertions, inversions, and translocations). The accuracy of variant calling is highly dependent on the quality of called bases and aligned reads. Therefore, prevariant calling processing, such as local realignment around expected indels and base quality score recalibration, is routinely used to ensure accurate and efficient variant calling. For SNVs and indels, the called variants are represented using the de facto standard variant call format (VCF; https://samtools.github.io/hts-specs/VCFv4.3.pdf, last accessed November 23, 2016). Alternative specifications exist for representing and storing variant calls [Genomic VCF Conventions, https://sites.google.com/site/gvcftools/home/about-gvcf/gvcf-conventions, last accessed November 23, 2016; The Sequence Ontology Genome Variation Format Version 1.10, https://github.com/The-Sequence-Ontology/Specifications/blob/master/gvf.md, last accessed November 23, 2016; The Human Genome Variation Society, Human Genome Variation Society (HGVS) Simple Version 15.11. 2016, http://varnomen.hgvs.org/bg-material/simple, last accessed November 23, 2016; Health GAfGa File Formats, https://www.ga4gh.org/ga4ghtoolkit/genomicdatatoolkit, last accessed November 27, 2017].
+Cutadapt seems to be fairly up-to-date (version 4.1). Setting -j -j 1
+Writing final adapter and quality trimmed output to SRR11862696_2_trimmed.fq.gz
 
-### Variant Filtering
 
-> Variant filtering is the process by which variants representing false-positive artifacts of the NGS method are flagged or filtered from the original VCF file on the basis of several sequence alignment and variant calling associated metadata (eg, mapping quality, base-calling quality, strand bias, and others). This is usually a postvariant calling step, although some variant callers incorporate this step as part of the variant calling process. This automated process may be used as a hard filter to allow annotation and review of only the assumed true variants.
+  >>> Now performing quality (cutoff '-q 20') and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file SRR11862696_2.fastq.gz <<< 
+10000000 sequences processed
+20000000 sequences processed
+30000000 sequences processed
+40000000 sequences processed
+This is cutadapt 4.1 with Python 3.10.6
+Command line parameters: -j 1 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC SRR11862696_2.fastq.gz
+Processing single-end reads on 1 core ...
+Finished in 1380.06 s (29 Âµs/read; 2.04 M reads/minute).
 
-### Variant Annotation
+=== Summary ===
 
-> Variant annotation performs queries against multiple sequence and variant databases to characterize each called variant with a rich set of metadata, such as variant location, predicted cDNA and amino acid sequence change (HGVS nomenclature), minor allele frequencies in human populations, and prevalence in different variant databases [eg, Catalogue of Somatic Mutations in Cancer, The Cancer Genome Atlas, Single-Nucleotide Polymorphism (SNP) Database, and ClinVar]. This information is used to further prioritize or filter variants for classification and interpretation.
+Total reads processed:              46,831,782
+Reads with adapters:                15,576,554 (33.3%)
+Reads written (passing filters):    46,831,782 (100.0%)
+
+Total basepairs processed: 4,730,009,982 bp
+Quality-trimmed:             103,667,050 bp (2.2%)
+Total written (filtered):  4,596,284,410 bp (97.2%)
+
+=== Adapter 1 ===
+
+Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 15576554 times
+
+Minimum overlap: 1
+No. of allowed errors:
+1-9 bp: 0; 10-13 bp: 1
+
+Bases preceding removed adapters:
+  A: 29.7%
+  C: 33.2%
+  G: 21.6%
+  T: 15.3%
+  none/other: 0.2%
+
+Overview of removed sequences
+length  count   expect  max.err error counts
+1       10415056        11707945.5      0       10415056
+2       3549673 2926986.4       0       3549673
+3       930335  731746.6        0       930335
+4       203432  182936.6        0       203432
+5       70152   45734.2 0       70152
+6       34001   11433.5 0       34001
+7       28487   2858.4  0       28487
+8       25528   714.6   0       25528
+9       23898   178.6   0       23109 789
+10      22040   44.7    1       20613 1427
+11      18086   11.2    1       16804 1282
+12      17308   2.8     1       16717 591
+13      15092   0.7     1       14434 658
+14      15841   0.7     1       15497 344
+15      11921   0.7     1       11544 377
+16      12960   0.7     1       12493 467
+17      14323   0.7     1       14027 296
+18      9700    0.7     1       9441 259
+19      11057   0.7     1       10834 223
+20      7979    0.7     1       7799 180
+21      7047    0.7     1       6893 154
+22      6828    0.7     1       6651 177
+23      7131    0.7     1       6180 951
+24      6703    0.7     1       6513 190
+25      5485    0.7     1       5227 258
+26      8361    0.7     1       5068 3293
+27      5341    0.7     1       4702 639
+28      5433    0.7     1       5188 245
+29      5099    0.7     1       4100 999
+30      8235    0.7     1       8058 177
+31      574     0.7     1       431 143
+32      3205    0.7     1       3094 111
+33      1589    0.7     1       1483 106
+34      2116    0.7     1       2014 102
+35      2219    0.7     1       2095 124
+36      2189    0.7     1       1992 197
+37      2105    0.7     1       1963 142
+38      2127    0.7     1       1985 142
+39      2108    0.7     1       1912 196
+40      1853    0.7     1       1753 100
+41      2786    0.7     1       1641 1145
+42      2087    0.7     1       2001 86
+43      974     0.7     1       885 89
+44      2286    0.7     1       1231 1055
+45      1723    0.7     1       1619 104
+46      777     0.7     1       690 87
+47      920     0.7     1       864 56
+48      912     0.7     1       829 83
+49      1068    0.7     1       869 199
+50      1074    0.7     1       957 117
+51      1098    0.7     1       1033 65
+52      631     0.7     1       589 42
+53      598     0.7     1       551 47
+54      661     0.7     1       608 53
+55      728     0.7     1       649 79
+56      528     0.7     1       476 52
+57      656     0.7     1       578 78
+58      602     0.7     1       561 41
+59      593     0.7     1       544 49
+60      632     0.7     1       548 84
+61      681     0.7     1       516 165
+62      1995    0.7     1       637 1358
+63      1889    0.7     1       1135 754
+64      4483    0.7     1       1659 2824
+65      13843   0.7     1       3320 10523
+66      6624    0.7     1       2171 4453
+67      1203    0.7     1       529 674
+68      354     0.7     1       196 158
+69      163     0.7     1       72 91
+70      80      0.7     1       19 61
+71      55      0.7     1       21 34
+72      81      0.7     1       19 62
+73      48      0.7     1       14 34
+74      36      0.7     1       8 28
+75      29      0.7     1       1 28
+76      38      0.7     1       5 33
+77      60      0.7     1       6 54
+78      31      0.7     1       3 28
+79      35      0.7     1       7 28
+80      39      0.7     1       3 36
+81      48      0.7     1       4 44
+82      29      0.7     1       5 24
+83      32      0.7     1       4 28
+84      34      0.7     1       3 31
+85      34      0.7     1       5 29
+86      32      0.7     1       6 26
+87      53      0.7     1       7 46
+88      32      0.7     1       3 29
+89      40      0.7     1       6 34
+90      35      0.7     1       5 30
+91      41      0.7     1       2 39
+92      61      0.7     1       1 60
+93      32      0.7     1       1 31
+94      32      0.7     1       2 30
+95      19      0.7     1       2 17
+96      25      0.7     1       0 25
+97      33      0.7     1       1 32
+98      68      0.7     1       1 67
+99      24      0.7     1       0 24
+100     27      0.7     1       0 27
+101     105     0.7     1       0 105
+
+RUN STATISTICS FOR INPUT FILE: SRR11862696_2.fastq.gz
+=============================================
+46831782 sequences processed in total
+The length threshold of paired-end sequences gets evaluated later on (in the validation step)
+
+Validate paired-end files SRR11862696_1_trimmed.fq.gz and SRR11862696_2_trimmed.fq.gz
+file_1: SRR11862696_1_trimmed.fq.gz, file_2: SRR11862696_2_trimmed.fq.gz
+
+
+>>>>> Now validing the length of the 2 paired-end infiles: SRR11862696_1_trimmed.fq.gz and SRR11862696_2_trimmed.fq.gz <<<<<
+Writing validated paired-end Read 1 reads to SRR11862696_1_val_1.fq.gz
+Writing validated paired-end Read 2 reads to SRR11862696_2_val_2.fq.gz
+
+
+
+
+
+
+
+Total number of sequences analysed: 46831782
+
+Number of sequence pairs removed because at least one read was shorter than the length cutoff (50 bp): 1159884 (2.48%)
+Number of sequence pairs removed because at least one read contained more N(s) than the specified limit of 2: 62468 (0.13%)
+
+
+  >>> Now running FastQC on the validated data SRR11862696_1_val_1.fq.gz<<<
+
+Started analysis of SRR11862696_1_val_1.fq.gz
+Approx 5% complete for SRR11862696_1_val_1.fq.gz
+Approx 10% complete for SRR11862696_1_val_1.fq.gz
+Approx 15% complete for SRR11862696_1_val_1.fq.gz
+Approx 20% complete for SRR11862696_1_val_1.fq.gz
+Approx 25% complete for SRR11862696_1_val_1.fq.gz
+Approx 30% complete for SRR11862696_1_val_1.fq.gz
+Approx 35% complete for SRR11862696_1_val_1.fq.gz
+Approx 40% complete for SRR11862696_1_val_1.fq.gz
+Approx 45% complete for SRR11862696_1_val_1.fq.gz
+Approx 50% complete for SRR11862696_1_val_1.fq.gz
+Approx 55% complete for SRR11862696_1_val_1.fq.gz
+Approx 60% complete for SRR11862696_1_val_1.fq.gz
+Approx 65% complete for SRR11862696_1_val_1.fq.gz
+Approx 70% complete for SRR11862696_1_val_1.fq.gz
+Approx 75% complete for SRR11862696_1_val_1.fq.gz
+Approx 80% complete for SRR11862696_1_val_1.fq.gz
+Approx 85% complete for SRR11862696_1_val_1.fq.gz
+Approx 90% complete for SRR11862696_1_val_1.fq.gz
+Approx 95% complete for SRR11862696_1_val_1.fq.gz
+Analysis complete for SRR11862696_1_val_1.fq.gz
+
+  >>> Now running FastQC on the validated data SRR11862696_2_val_2.fq.gz<<<
+
+Started analysis of SRR11862696_2_val_2.fq.gz
+Approx 5% complete for SRR11862696_2_val_2.fq.gz
+Approx 10% complete for SRR11862696_2_val_2.fq.gz
+Approx 15% complete for SRR11862696_2_val_2.fq.gz
+Approx 20% complete for SRR11862696_2_val_2.fq.gz
+Approx 25% complete for SRR11862696_2_val_2.fq.gz
+Approx 30% complete for SRR11862696_2_val_2.fq.gz
+Approx 35% complete for SRR11862696_2_val_2.fq.gz
+Approx 40% complete for SRR11862696_2_val_2.fq.gz
+Approx 45% complete for SRR11862696_2_val_2.fq.gz
+Approx 50% complete for SRR11862696_2_val_2.fq.gz
+Approx 55% complete for SRR11862696_2_val_2.fq.gz
+Approx 60% complete for SRR11862696_2_val_2.fq.gz
+Approx 65% complete for SRR11862696_2_val_2.fq.gz
+Approx 70% complete for SRR11862696_2_val_2.fq.gz
+Approx 75% complete for SRR11862696_2_val_2.fq.gz
+Approx 80% complete for SRR11862696_2_val_2.fq.gz
+Approx 85% complete for SRR11862696_2_val_2.fq.gz
+Approx 90% complete for SRR11862696_2_val_2.fq.gz
+Approx 95% complete for SRR11862696_2_val_2.fq.gz
+Analysis complete for SRR11862696_2_val_2.fq.gz
+Deleting both intermediate output files SRR11862696_1_trimmed.fq.gz and SRR11862696_2_trimmed.fq.gz
+
+====================================================================================================
+```
+
+</details>
+
+Produces quality control reports after trimming adapters from the raw data   using FastQC as `SRR11862696_1_val_1_fastqc.html` and `SRR11862696_2_val_2_fastqc.html`
+
+You can view quality control report using the following command:
+```bash
+$ firefox SRR11862696_1_val_1_fastqc.html
+```
+
+```{image} ../img/trim_galore_report.png
+
+```
+
+</details>
 
-### Variant Prioritization
+#### ALIGNMENT
+Once high-quality data are obtained from pre-processing, the next step is the read mapping or alignment.When studying an organism with a reference genome, it is possible to infer which transcripts are expressed by mapping the reads to the reference genome **(genome mapping)** or transcriptome **(transcriptome mapping)**. Mapping reads to the genome requires no knowledge of the set of transcribed regions or the way in which exons are spliced together. This approach allows the discovery of new, unannotated transcripts.
 
-> Variant prioritization uses variant annotations to identify clinically insignificant variants (eg, synonymous, deep intronic variants, and established benign polymorphisms), thereby presenting the remaining variants (known or unknown clinical significance) for further review and interpretation. Clinical laboratories often develop variant knowledge bases to facilitate this process.
-> Some clinical laboratories choose to apply hard filters on called variants on the basis of variant call metadata or from a data dictionary (variant filtering) as a component of the pipeline analysis software. Because its purpose is to hide certain variants from the view of the human interpreter, it is absolutely critical that filtering algorithms be thoroughly validated to ensure that only those variants meeting strict predefined criteria are being hidden from view. Otherwise, the human interpreter may miss clinically significant variants that may result in harming the patient.
diff --git a/docs/source/img/fastqc_report.png b/docs/source/img/fastqc_report.png
new file mode 100644
index 0000000000000000000000000000000000000000..bba66ebd7dd2f2fdc60700750cf8df41e37a32e8
Binary files /dev/null and b/docs/source/img/fastqc_report.png differ
diff --git a/docs/source/img/trim_galore_report.png b/docs/source/img/trim_galore_report.png
new file mode 100644
index 0000000000000000000000000000000000000000..7c60f7c50694095750563b6e60b60718ba9c371f
Binary files /dev/null and b/docs/source/img/trim_galore_report.png differ