#+title: Further development on Finder, a pipeline to identify Tandem Arrayed Genes
#+author: Samuel Ortion
#+date: 2023-2024
#+LATEX_CLASS: scientific-project
#+LATEX_HEADER: \usepackage{sty/lamme2024}
#+latex_header_extra: \newglossaryentry{LaMME}{name={LaMME},description={Laboratoire de Mathématiques et Modélisation d'Évry}}
#+bibliography: ../references.bib
#+exclude_tags: noexport
#+options: H:7
#+options: toc:nil

# ref. https://write.as/dani/writing-a-phd-thesis-with-org-mode
#+name: acronyms
| key  | abbreviation | full form                        |
|------+--------------+----------------------------------|
| TAG  | TAG          | Tandem Arrayed Gene              |
| FTAG | FTAG         | Families and Tandem Arrayed Gene |

#+begin_export latex
\hypersetup{
  pdfauthor={Samuel Ortion},
  pdftitle={},
  pdfkeywords={duplicate genes, workflow management systems, pipeline},
}
\pagenumbering{roman}
#+end_export

#+begin_abstract
Duplicate genes are an important component of genomes. They play a particular role in genome evolution, allowing species to explore new gene functions by offering a pool of usable genes to build on.

TODO:
#+end_abstract

#+begin_center
*keywords*: duplicate genes, tandem arrayed genes, pipeline
#+end_center

#+begin_export latex
\tableofcontents
#+end_export

[[printglossaries:]]

#+begin_export latex
\pagenumbering{arabic}
#+end_export

* Context
** What are duplicate genes?
Duplicate genes are genes that experienced a duplication event during species evolution. These genes are homologous.
*** Duplication mechanisms
#+name: fig:gene-duplication-mechanisms
#+CAPTION: Mechanisms leading to gene duplication
[[./figures/lallemand2020-fig1_copy.pdf]]

Several mechanisms may lead to gene duplication. We review them in this section.
**** Segment duplication
**** Retroduplication
Transposable elements account for a significant share of gene duplications [citation needed].
Retrotransposons, or RNA transposons, are one type of transposable element. Some retrotransposons are similar to retroviruses.
Retrotransposons may be duplicated in the genome through a mechanism known as "copy-and-paste". These transposons typically carry a reverse transcriptase gene. The protein encoded by this gene reverse-transcribes the RNA transcript of the transposon sequence, producing a DNA sequence that can then be inserted elsewhere in the genome. During this process, the RNA transcript may include a nearby gene sequence, which can thus be copied and pasted along with the retrotransposon.
**** Transduplication
DNA transposons are another type of transposable element whose transposition mechanism can also lead to gene duplication. This type of transposable element moves in the genome through a mechanism known as "cut-and-paste". The typical DNA transposon contains a transposase gene. The protein encoded by this gene recognizes two sites surrounding the donor transposon sequence in the chromosome, resulting in DNA cleavage. The transposase can then insert the transposon at a new location in the genome. As with retrotransposons, if a gene is present between the two cleavage sites of the donor transposon, it may move with the transposed sequence.
**** Tandem Duplication
**** Polyploidisation
***** Allopolyploidisation
***** Autopolyploidisation
***** Mechanisms
****** Polyspermy
****** Non-reduced gametes
**** Unequal crossing-over
A crossing-over may occur during cell division: a chromosome fragment is exchanged between two chromatids of a pair of chromosomes. If the two chromatids are cleaved at different positions, the exchanged fragments may have different lengths. Once the breaks are repaired, one of the resulting chromosomes incorporates a duplicated region, which may duplicate the genes present in this region, as represented in figure [[fig:gene-duplication-mechanisms]] B.
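
The effect of an unequal crossing-over on gene content can be illustrated with a toy model. The sketch below is only an illustration under strong assumptions: chromatids are represented as plain lists of gene labels, and the function name =unequal_crossover= and the labels =g1= to =g4= are hypothetical and not part of the Finder pipeline.

#+begin_src python
def unequal_crossover(chromatid_a, chromatid_b, break_a, break_b):
    """Exchange the tails of two chromatids cut at (possibly different) positions.

    Each chromatid is a list of gene labels; break_a and break_b are the
    indices at which the corresponding chromatid is cleaved.
    """
    recombinant_1 = chromatid_a[:break_a] + chromatid_b[break_b:]
    recombinant_2 = chromatid_b[:break_b] + chromatid_a[break_a:]
    return recombinant_1, recombinant_2


# Two chromatids carrying the same genes, cleaved at different positions.
chromatid = ["g1", "g2", "g3", "g4"]
longer, shorter = unequal_crossover(chromatid, chromatid, break_a=3, break_b=2)

print(longer)   # g3 appears twice, in tandem
print(shorter)  # g3 is missing
#+end_src

With misaligned breakpoints, one recombinant carries two adjacent copies of =g3= while the other has lost it. With identical breakpoints, both recombinants keep the original gene content.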
# TODO: check that this is really the B subfigure
*** Role in genome evolution
** Identification of duplicate genes
***
*** Finder
* Objectives
** Amend the existing Galaxy pipeline
Last year, an M1 student, Seanna Charles, worked on the Galaxy version of the gls: Finder pipeline [cite:@charlesFinalisationPipelineFTAG2023]. During my internship, I will continue this work.
** Port the Finder pipeline to a workflow manager

#+begin_export latex
\printbibliography
#+end_export