\part{Sequence alignment}
\chapter{Definitions}
A function $d$ is a distance between sequences over an alphabet $\Sigma$ if
\begin{itemize}
\item $\forall x \in \Sigma^{*}$, $d(x, x) = 0$
\item $\forall x, y \in \Sigma^{*}$, $d(x, y) = d(y, x)$
\item $\forall x, y, z \in \Sigma^{*}$, $d(x, z) \leq d(x, y) + d(y, z)$
\end{itemize}
Here we are interested in distances that represent the transformation of $x$ into $y$ using three types of basic operations:
\begin{itemize}
\item Substitution
\item Insertion
\item Deletion
\end{itemize}
Example of costs for these operations:
\begin{itemize}
\item $sub(a, b) = \begin{cases} 0 & \text{if } a = b \\ 1 & \text{otherwise} \end{cases}$
\item $del(a) = 1$
\item $ins(a) = 1$
\end{itemize}
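As a small worked illustration (the sequences are chosen for this example, and it assumes, as is usual for edit distances, that the cost of a transformation is the sum of the costs of its operations), consider $x = \texttt{ACG}$ and $y = \texttt{AGT}$ with the unit costs above:
\[
\underbrace{sub(\texttt{C}, \texttt{G}) + sub(\texttt{G}, \texttt{T})}_{\text{two substitutions}} = 1 + 1 = 2,
\qquad
\underbrace{del(\texttt{C}) + ins(\texttt{T})}_{\text{one deletion, one insertion}} = 1 + 1 = 2.
\]
Both transformations cost $2$. No single operation turns $x$ into $y$, since the two sequences have the same length and differ in two positions, so under these assumptions $2$ is the minimum cost.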
Let $X = x_{0} x_{1} \ldots x_{m-1}$ and $Y = y_{0} y_{1} \ldots y_{n-1}$ be two sequences over $\Sigma$.
An alignment of $X$ and $Y$ of size $p$, with $\max(m, n) \leq p \leq n + m$, is denoted
$z = \begin{pmatrix} \bar{x}_{0} \\ \bar{y}_{0} \end{pmatrix} \ldots \begin{pmatrix} \bar{x}_{p-1} \\ \bar{y}_{p-1} \end{pmatrix}$, where
\begin{itemize}
\item for $0 \leq i \leq p-1$, either $\bar{x}_{i} = x_{j}$ for some $0 \leq j \leq m-1$, or $\bar{x}_{i} = \varepsilon$
\item for $0 \leq i \leq p-1$, either $\bar{y}_{i} = y_{j}$ for some $0 \leq j \leq n-1$, or $\bar{y}_{i} = \varepsilon$
\item $X' = \bar{x}_{0} \bar{x}_{1} \ldots \bar{x}_{i} \ldots \bar{x}_{p-1}$ and $Y' = \bar{y}_{0} \bar{y}_{1} \ldots \bar{y}_{i} \ldots \bar{y}_{p-1}$ reduce to $X$ and $Y$ respectively once the $\varepsilon$ symbols are removed
\item there is no $i$ with $0 \leq i \leq p-1$ such that $\bar{x}_{i} = \bar{y}_{i} = \varepsilon$
\end{itemize}
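
A possible worked example (the sequences are chosen here for illustration and do not come from the text): for $X = \texttt{ACG}$ ($m = 3$) and $Y = \texttt{AGT}$ ($n = 3$),
\[
z = \begin{pmatrix} \texttt{A} \\ \texttt{A} \end{pmatrix}
\begin{pmatrix} \texttt{C} \\ \varepsilon \end{pmatrix}
\begin{pmatrix} \texttt{G} \\ \texttt{G} \end{pmatrix}
\begin{pmatrix} \varepsilon \\ \texttt{T} \end{pmatrix}
\]
is an alignment of size $p = 4$, with $X' = \texttt{A}\,\texttt{C}\,\texttt{G}\,\varepsilon$ and $Y' = \texttt{A}\,\varepsilon\,\texttt{G}\,\texttt{T}$, and no column contains two $\varepsilon$. If a column with $\varepsilon$ at the bottom is read as a deletion and a column with $\varepsilon$ at the top as an insertion (the usual reading, assumed here), then with the unit costs of the earlier example the cost of $z$ is
\[
sub(\texttt{A}, \texttt{A}) + del(\texttt{C}) + sub(\texttt{G}, \texttt{G}) + ins(\texttt{T}) = 0 + 1 + 0 + 1 = 2.
\]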