feat: Add llap for heading number

Samuel Ortion 2024-04-18 16:33:38 +02:00
parent 03f3efd668
commit b441d29ba8
Signed by: sortion
GPG Key ID: 9B02406F8C4FB765
6 changed files with 141 additions and 143 deletions

Binary file not shown.

@ -18,6 +18,8 @@
| FTAGFinder | FTAG Finder | Families and Tandemly Arrayed Genes Finder |
| WGD | WGD | Whole Genome Duplication |
| MCL | MCL | Markov Clustering |
| BLAST | BLAST | Basic Local Alignment Search Tool |
| | | |
#+name: glossary
| label | name | description |
@ -33,6 +35,7 @@
| segment_duplication | segment duplication | DNA sequences present in multiple locations within a genome that share a high level of sequence identity |
| subfunctionalization | subfunctionalization | Fate of a duplicate gene which gets a part of the original gene function, the function being shared among multiple duplicates |
| orthologues | orthologues | Homologous genes whose divergence started at a speciation event |
| neofunctionalization | neofunctionalization | Acquisition of a new function by the duplicate gene |
#+begin_export latex
\makeatletter
@ -63,7 +66,7 @@
#+end_export
* Scientific context
It is estimated that between 46% and 65.5% of human genes could be considered as duplicate genes\footnote{The estimates vary strongly depending on the criteria in use} [cite:@correaTransposableElementEnvironment2021].
\lettrine{D}uplicate genes represent an important fraction of Eukaryotic genes: it is estimated that between 46% and 65.5% of human genes could be considered as duplicates[fn:: The estimates vary strongly depending on the criteria in use, because ancient duplication events may be hard to detect.] [cite:@correaTransposableElementEnvironment2021].
Duplicate genes offer a pool of genetic material available for further experimentation during species evolution.
** Gene duplication mechanisms
@ -95,9 +98,9 @@ Multiple mechanisms may lead to a gene duplication. Their effect ranges from the
During an event of gls:WGD, the entire set of genes present on the chromosomes is duplicated ([[cref:fig:gene-duplication-mechanisms]] (A)).
gls:WGD can occur thanks to gls:polyspermy or in case of a non-reduced gamete.
Gls:polyploidisation is a mechanism leading to a species with at least three copies of an initial genome.
A striking example is probably /Triticum aestivum/ (wheat) which is hexaploid[fn:hexaploid: A hexaploid cell has six sets of homologous chromosomes.] due to several hybridisation events [cite:@golovninaMolecularPhylogenyGenus2007a].
A striking example is probably /Triticum aestivum/ (wheat) which is hexaploid due to hybridisation events [cite:@golovninaMolecularPhylogenyGenus2007a].
We distinguish two kinds of glspl:polyploidisation, based on the origin of the duplicate genome: (i) Gls:allopolyploidisation occurs when the supplementary chromosomes come from a divergent species. This is the case for /Triticum aestivum/ hybridisation, which consisted in the union of the chromosome set of a /Triticum/ species with those of an /Aegilops/ species. (ii) Gls:autopolyploidisation consists in the hybridisation or duplication of the whole genome within the same species.
We distinguish two kinds of glspl:polyploidisation, based on the origin of the duplicate genome: (i) Gls:allopolyploidisation occurs when the supplementary chromosomes come from a divergent species. This is the case for the /Triticum aestivum/ hybridisation, which consisted in the union of the chromosome set of a /Triticum/ species with that of an /Aegilops/ species. (ii) Gls:autopolyploidisation consists in the hybridisation or duplication of the whole genome within the same species.
*** Unequal crossing-over
Another source of gene duplication relies on unequal crossing-over. During cell division, a crossing-over occurs when two chromatids exchange fragments of chromosome. If the cleavage of the two chromatids occurs at different positions, the shared fragments may have different lengths. Homologous recombination of such uneven crossing-over leads to the incorporation of a duplicate region, as depicted in cref:fig:gene-duplication-mechanisms (B, C).
@ -122,26 +125,30 @@ Transposable elements may well be involved in the mechanism, as a high enrichmen
** Fate of duplicate genes in genome evolution
In his book /Evolution by Gene Duplication/, Susumu [[latex:textsc][Ohno]] proposed that gene duplication plays a major role in species evolution [cite:@ohnoEvolutionGeneDuplication1970], because it provides new genetic material from which new phenotypes can be built, while keeping a backup gene for the previous function.
Indeed, duplicate genes may evolve after duplication: they may be inactivated, becoming glspl:pseudogene; they may be deleted or conserved and so, they may acquire new functions.
Indeed, duplicate genes evolve after duplication: they may be inactivated, and become glspl:pseudogene; they may be deleted or conserved, and if conserved, they may or may not acquire a new function.
*** Pseudogenization
Duplicate genes may be inactivated and become pseudogenes. These pseudogenes keep a gene-like structure, which degrades as and when further genome modifications occur. However, they are no longer expressed.
*** Neofunctionalization
Duplicate genes may be conserved and gain a new function.
For instance, the current set of olfactory receptor genes results from several duplication and deletion events (in /Drosophila/: [cite/t:@nozawaEvolutionaryDynamicsOlfactory2007]), after which the duplicate olfactory genes specialized in the detection of particular chemical compounds.
*** Subfunctionalization
Two duplicate genes with the same original function may undergo gls:subfunctionalization, by which each gene conserves only one part of the function.
*** Functional redundancy
The two gene copies may keep the ancestral function: in this case the quantity of gene product may increase.
# *** Pseudogenization
As the genome evolves, duplicate genes may be inactivated and become pseudogenes. These pseudogenes keep a gene-like structure, which degrades as and when further genome modifications occur, but they are no longer expressed.
# *** Neofunctionalization
After duplication, the new gene copy may gain a new function. We call this possible outcome gls:neofunctionalization.
For instance, the current set of olfactory receptor genes results from several duplication and deletion events (for /Drosophila/, see: [cite/t:@nozawaEvolutionaryDynamicsOlfactory2007]), after which each duplicate olfactory gene specialized in the detection of a particular chemical compound.
# *** Subfunctionalization
Two duplicate genes with the same original function may undergo gls:subfunctionalization: each gene conserves only one part of the function.
# *** Functional redundancy
Another possibility is that the two gene copies keep the ancestral function, resulting in functional redundancy. In this case the quantity of gene product may increase.
** Methods to identify duplicate genes
[[latex:textsc][Lallemand]] et al. review the different methods used to detect duplicate genes. These methods depend on the type of duplicate genes they target and vary in computational burden as well as ease of use [cite:@lallemandOverviewDuplicatedGene2020].
Different methods exist to detect duplicate genes. These methods depend on the type of duplicate genes they target and vary in computational burden as well as in ease of use (for a review, see [cite/t:@lallemandOverviewDuplicatedGene2020]).
*** Paralog detection
Paralogs are homologous genes derived from a duplication event. We can identify them as homologous genes coming from the same genome, or as homologous genes between different species once we have filtered out gls:orthologues (homologous genes derived from a speciation event).
We can use two gene characteristics to assess the homology between two genes: gene structure or sequence similarity.
The sequence similarity can be tested with a sequence alignment tool, such as =BLAST= [cite:@altschulBasicLocalAlignment1990], =Psi-BLAST=, =HMMER3= [cite:@johnsonHiddenMarkovModel2010], or =diamond= [cite:@buchfinkSensitiveProteinAlignments2021]. These tools rely on heuristic algorithms: they may not return the optimal alignment, but they run much faster than exact algorithms, such as the classical Smith and Waterman algorithm [cite:@smithIdentificationCommonMolecular1981] or its optimized versions =PARALIGN= [cite:@rognesParAlignParallelSequence2001] or =SWIMM=.
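To make the contrast with heuristic tools concrete, the exact local alignment score can be written as a dynamic programming recurrence. The sketch below is a common simplified form, assuming a substitution score $s(a_i, b_j)$ and a linear gap penalty $d$; these symbols are illustrative notation that does not appear in the report, and the original formulation allows more general gap penalties.

```latex
\begin{align*}
H_{i,0} &= H_{0,j} = 0, \\
H_{i,j} &= \max
  \begin{cases}
    0, \\
    H_{i-1,j-1} + s(a_i, b_j), \\
    H_{i-1,j} - d, \\
    H_{i,j-1} - d
  \end{cases}
\end{align*}
```

The best local alignment score is $\max_{i,j} H_{i,j}$. Filling the matrix takes time proportional to the product of the two sequence lengths, which is why heuristic tools are preferred for proteome-scale comparisons.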
*** FTAG Finder
Developed in the LaMME laboratory, the FTAG Finder (Families and Tandemly Arrayed Genes Finder) pipeline is a simple pipeline targeting the detection of gls:TAG from the proteome of a single species [cite:@bouillonFTAGFinderOutil2016].
@ -170,6 +177,7 @@ For a given chromosome, the tool seeks genes belonging to the same family and lo
* Objectives for the internship
** Scientific questions
The underlying question of FTAG Finder is the study of the evolutionary fate of duplicate genes in Eukaryotes.
Duplicate genes are
** Extend the existing FTAG Finder Galaxy pipeline
Galaxy is a web-based platform for running accessible data analysis pipelines, first designed for use in genomics data analysis [cite:@goecksGalaxyComprehensiveApproach2010].
Last year, Séanna [[latex:textsc][Charles]] worked on the Galaxy version of the FTAG Finder pipeline during her M1 internship [cite:@charlesFinalisationPipelineFTAG2023]. I will continue this work.
@ -178,7 +186,7 @@ Last year, Séanna [[latex:textsc][Charles]] worked on the Galaxy version of the
Another objective of my internship will be to port FTAG Finder to a workflow manager better suited to larger and more reproducible analyses.
We will have to choose which tool to use.
The two main options are Snakemake and Nextflow. Snakemake is a Python-powered workflow manager based on rules /à la/ GNU Make [cite:@kosterSnakemakeScalableBioinformatics2012]. Nextflow is a Groovy-powered workflow manager, which relies on the dataflow paradigm [cite:@ditommasoNextflowEnablesReproducible2017]. Both are widely used in the bioinformatics community, and their use has been on the rise since they came out in 2012 and 2013 respectively [cite:@djaffardjyDevelopingReusingBioinformatics2023].
The two main options are Snakemake and Nextflow. Snakemake is a Python-powered workflow manager based on rules /à la/ GNU Make [cite:@kosterSnakemakeScalableBioinformatics2012]. Nextflow is a Groovy-powered workflow manager, which relies on the dataflow paradigm [cite:@ditommasoNextflowEnablesReproducible2017]. Both are widely used in the bioinformatics community. Their use has been on the rise since they came out in 2012 and 2013 respectively [cite:@djaffardjyDevelopingReusingBioinformatics2023].
#+begin_export latex
\flstop
@ -218,12 +226,6 @@ The application of both operator iteratively eventually ends up in a partition o
** Walktrap
Principle: construct vertex communities based on where an agent would get stuck in a random walk.
# LocalWords: speciation Subfunctionalization Neofunctionalization
# LocalWords: Pseudogenization
# Local Variables:
# eval: (progn (org-babel-goto-named-src-block "startup") (org-babel-execute-src-block) (outline-hide-sublevels 1))
# End:
* Setup :noexport:
#+name: startup
@ -233,3 +235,10 @@ Principle: construct vertex communities based on where an agent would get stuck
#+RESULTS: startup
: Loaded ./setup.el
# LocalWords: speciation subfunctionalization neofunctionalization
# LocalWords: pseudogenization bioinformatics
# Local Variables:
# eval: (progn (org-babel-goto-named-src-block "startup") (org-babel-execute-src-block) (outline-hide-sublevels 1))
# End:

BIN
report.pdf (Stored with Git LFS)

Binary file not shown.


@ -0,0 +1,97 @@
\RequirePackage[manualmark]{scrlayer-scrpage}
\iffalse
\renewcommand*\chaptermark[1]{%
\markboth{\Ifnumbered{chapter}{\chaptermarkformat}{}}{#1}% <- outdated macro replaced
}
\AfterTOCHead[toc]{\markboth{}{\contentsname}}
\fi
\clearpairofpagestyles
\clubpenalty = 10000
\widowpenalty = 10000
\automark[section]{part}
\setlength{\footheight}{120pt} % avoids scrlayer-scrpage warning:
% footheight too low warning
\setlength{\footskip}{185pt} % BAD HACK that moves the foot downwards
\KOMAoption{footwidth}{foot:53pt} % BAD HACK that moves the foot towards
\setkomafont{pagefoot}{\normalfont\footnotesize}
\setkomafont{pagenumber}{\normalfont \fontfamily{\sfdefault}\selectfont \normalsize \bfseries\color{black}}
\renewcommand{\partmark}[1]{%
\markboth{%
% use \@chapapp instead of \chaptername to avoid
% 'Chapter A Appendix ...', thanks to @farbverlust (issue #47)
\fontfamily{\sfdefault}\selectfont
{\color{fgBlue}\textbf{\partname\ \thepart}}%
\quad%
\protect\begin{minipage}[t]{.65\textwidth}%
#1%
\protect\end{minipage}%
}{}%
}
\newlength{\lensectionnumber}
\renewcommand{\sectionmark}[1]{%
\markright{%
\normalsize\fontfamily{\sfdefault}\selectfont\bfseries
\setlength{\lensectionnumber}{0em}
\settowidth{\lensectionnumber}{\textbf{\thesection}\quad}
\protect\begin{minipage}[t]{.72\textwidth}%
{\ }% bad hack to prevent a wrong baseline for the minipage
\protect\raggedleft%
\hangindent=\lensectionnumber%
{\color{black}\textbf{\fontfamily{\sfdefault}\selectfont\thesection}}%
\quad%
#1%
\protect\end{minipage}%
}%
}
\newcommand{\ctfooterline}{%
\color{black}\rule[-90pt]{1.25pt}{100pt}%
}
% Page number for odd (right) pages
\newcommand{\ctfooterrightpagenumber}{%
\ctfooterline%
\hspace*{10pt}%
\begin{minipage}[b]{1.5cm}%
\pagemark\ %
\end{minipage}%
}
%% Page number for even (left) pages
\newcommand{\ctfooterleftpagenumber}{%
\begin{minipage}[b]{1.5cm}%
\raggedleft\pagemark%
\end{minipage}%
\hspace*{10pt}%
\ctfooterline%
}
%% Defines the content for header and footer
\lehead{}
\cehead{}
\rehead{}
\lohead{}
\cohead{}
\rohead{}
\lefoot[% > plain
\ctfooterleftpagenumber%
]{% > scrheadings
\ctfooterleftpagenumber%
\hspace*{0.75cm}%
%\headmark%
}
\cefoot{}
\refoot{}
\lofoot{}
\cofoot{}
\rofoot[% > plain
\ctfooterrightpagenumber%
]{% > scrheadings
%\headmark%
\hspace*{0.75cm}%
\ctfooterrightpagenumber%
}


@ -1,4 +1,4 @@
\RequirePackage{lettrine}
% Font
\usepackage{fontspec}
@ -137,122 +137,7 @@
\fi
}
\usepackage{scrhack}
% From S. Ivanov hdr preamble
\iffalse
\titleformat{\chapter}[frame]
{\itshape\color{primary}}
{\filright
\normalsize
\enspace Chapter \thechapter\enspace}
{10mm}
{\fontsize{35}{20}\selectfont\normalfont\bfseries\filright\hspace{1ex}}
\titleformat{\section}{\Large\normalfont\bfseries\color{primary}}{\thesection \hspace{1ex}}{1ex}{}
\titleformat{\subsection}{\large\normalfont\bfseries\color{primary}}{\thesubsection \hspace{1ex}}{1ex}{}
\titleformat{\subsubsection}{\normalsize\normalfont\bfseries\color{primary}}{}{1ex}{}
\usepackage{scrhack}
\newcommand{\changelocaltocdepth}[1]{%
\addtocontents{toc}{\protect\setcounter{tocdepth}{#1}}%
\setcounter{tocdepth}{#1}%
}
\fi
%
% \usepackage{sty/cleanthesis-extracts}
\RequirePackage[manualmark]{scrlayer-scrpage}
\renewcommand*\chaptermark[1]{%
\markboth{\Ifnumbered{chapter}{\chaptermarkformat}{}}{#1}% <- outdated macro replaced
}
\AfterTOCHead[toc]{\markboth{}{\contentsname}}
\clearpairofpagestyles
\clubpenalty = 10000
\widowpenalty = 10000
\automark[section]{part}
\setlength{\footheight}{120pt} % avoids scrlayer-scrpage warning:
% footheight too low warning
\setlength{\footskip}{185pt} % BAD HACK that moves the foot downwards
\KOMAoption{footwidth}{foot:53pt} % BAD HACK that moves the foot towards
\setkomafont{pagefoot}{\normalfont\footnotesize}
\setkomafont{pagenumber}{\normalfont \fontfamily{\sfdefault}\selectfont \normalsize \bfseries\color{black}}
\renewcommand{\partmark}[1]{%
\markboth{%
% use \@chapapp instead of \chaptername to avoid
% 'Chapter A Appendix ...', thanks to @farbverlust (issue #47)
\fontfamily{\sfdefault}\selectfont
{\color{fgBlue}\textbf{\partname\ \thepart}}%
\quad%
\protect\begin{minipage}[t]{.65\textwidth}%
#1%
\protect\end{minipage}%
}{}%
}
\newlength{\lensectionnumber}
\renewcommand{\sectionmark}[1]{%
\markright{%
\normalsize\fontfamily{\sfdefault}\selectfont\bfseries
\setlength{\lensectionnumber}{0em}
\settowidth{\lensectionnumber}{\textbf{\thesection}\quad}
\protect\begin{minipage}[t]{.72\textwidth}%
{\ }% bad hack to prevent a wrong baseline for the minipage
\protect\raggedleft%
\hangindent=\lensectionnumber%
{\color{black}\textbf{\fontfamily{\sfdefault}\selectfont\thesection}}%
\quad%
#1%
\protect\end{minipage}%
}%
}
\newcommand{\ctfooterline}{%
\color{black}\rule[-90pt]{1.25pt}{100pt}%
}
% Page number for odd (right) pages
\newcommand{\ctfooterrightpagenumber}{%
\ctfooterline%
\hspace*{10pt}%
\begin{minipage}[b]{1.5cm}%
\pagemark\ %
\end{minipage}%
}
%% Page number for even (left) pages
\newcommand{\ctfooterleftpagenumber}{%
\begin{minipage}[b]{1.5cm}%
\raggedleft\pagemark%
\end{minipage}%
\hspace*{10pt}%
\ctfooterline%
}
%% Defines the content for header and footer
\lehead{}
\cehead{}
\rehead{}
\lohead{}
\cohead{}
\rohead{}
\lefoot[% > plain
\ctfooterleftpagenumber%
]{% > scrheadings
\ctfooterleftpagenumber%
\hspace*{0.75cm}%
%\headmark%
}
\cefoot{}
\refoot{}
\lofoot{}
\cofoot{}
\rofoot[% > plain
\ctfooterrightpagenumber%
]{% > scrheadings
%\headmark%
\hspace*{0.75cm}%
\ctfooterrightpagenumber%
}
\usepackage{sty/cleanthesis-footer}
\usepackage{sty/scr-legrand-heading}


@ -0,0 +1,7 @@
\colorlet{headingcolor}{black}
\renewcommand*{\sectionformat}{\llap{\textcolor{headingcolor}{\thesection}\hspace{1em}}}
\renewcommand*{\chapterformat}{\llap{\textcolor{headingcolor}{\thechapter}\hspace{1em}}}
\renewcommand*{\subsectionformat}{\llap{\textcolor{headingcolor}{\thesubsection}\hspace{1em}}}
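
A minimal sketch of how these redefinitions behave in isolation. It assumes a KOMA-Script class that defines chapters (here `scrreprt`) and the `xcolor` package for `\colorlet` and `\textcolor`; the document class and the sample headings are illustrative assumptions, not part of this commit. `\llap` typesets its argument in a zero-width box that extends to the left, so the number should hang in the margin while the heading text starts flush with the text block.

```latex
\documentclass{scrreprt}  % assumption: any KOMA-Script class with chapters
\usepackage{xcolor}       % provides \colorlet and \textcolor

\colorlet{headingcolor}{black}
% \llap puts the number in a zero-width box that sticks out to the left
\renewcommand*{\chapterformat}{\llap{\textcolor{headingcolor}{\thechapter}\hspace{1em}}}
\renewcommand*{\sectionformat}{\llap{\textcolor{headingcolor}{\thesection}\hspace{1em}}}
\renewcommand*{\subsectionformat}{\llap{\textcolor{headingcolor}{\thesubsection}\hspace{1em}}}

\begin{document}
\chapter{Scientific context}
\section{Gene duplication mechanisms}
\subsection{Whole genome duplication}
The heading numbers hang in the left margin.
\end{document}
```

Changing `headingcolor` in one place recolours all three heading numbers, which is presumably why the colour is declared with `\colorlet` rather than hard-coded in each macro.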