\part{Sequence alignment}

\section{Similarity between sequences}

A function $d$ is a distance on the sequences over an alphabet $\Sigma$ if it satisfies the following conditions:

\begin{itemize}
\item $\forall x \in \Sigma^{*}$, $d(x, x) = 0$
\item $\forall x, y \in \Sigma^{*}$, $d(x, y) = d(y, x)$
\item $\forall x, y, z \in \Sigma^{*}$, $d(x, z) \leq d(x, y) + d(y, z)$
\end{itemize}

Here we are interested in a distance that measures the cost of transforming $x$ into $y$ using three types of basic operations:

\begin{itemize}
\item Substitution
\item Insertion
\item Deletion
\end{itemize}

Example of costs for these operations, for letters $a, b \in \Sigma$:

\begin{itemize}
\item $sub(a, b) = \begin{cases} 0 & \text{if } a = b \\ 1 & \text{otherwise} \end{cases}$
\item $del(a) = 1$
\item $ins(a) = 1$
\end{itemize}

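
As an illustration of such unit costs (the two words below are chosen here only for the example), the word $ACGA$ can be transformed into $ATGCTA$ with one substitution followed by two insertions:
\[
ACGA \xrightarrow{\;sub(C, T)\;} ATGA \xrightarrow{\;ins(C)\;} ATGCA \xrightarrow{\;ins(T)\;} ATGCTA
\]
The total cost of this transformation is
\[
sub(C, T) + ins(C) + ins(T) = 1 + 1 + 1 = 3 .
\]
With these costs no cheaper transformation exists: the lengths differ by two, so at least two insertions are needed, and two insertions alone would suffice only if $ACGA$ were a subsequence of $ATGCTA$, which it is not.
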
Let $X = x_{0} x_{1} \ldots x_{m-1}$ and $Y = y_{0} y_{1} \ldots y_{n-1}$ be two sequences of lengths $m$ and $n$.

An alignment of $X$ and $Y$ of size $p$, with $\max(m, n) \leq p \leq n + m$, is written
\[
z = \begin{pmatrix} \bar{x}_{0} \\ \bar{y}_{0} \end{pmatrix} \ldots \begin{pmatrix} \bar{x}_{p-1} \\ \bar{y}_{p-1} \end{pmatrix}
\]

For $0 \leq i \leq p-1$, either $\bar{x}_{i} = x_{j}$ for some $0 \leq j \leq m - 1$, or $\bar{x}_{i} = \varepsilon$.

Likewise, for $0 \leq i \leq p-1$, either $\bar{y}_{i} = y_{j}$ for some $0 \leq j \leq n - 1$, or $\bar{y}_{i} = \varepsilon$.

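The two rows of the alignment form the sequences
\[
X' = \bar{x}_{0} \bar{x}_{1} \ldots \bar{x}_{p-1}, \qquad
Y' = \bar{y}_{0} \bar{y}_{1} \ldots \bar{y}_{p-1}
\]
Deleting every $\varepsilon$ from $X'$ gives back $X$, and likewise for $Y'$ and $Y$. Moreover, no column contains $\varepsilon$ in both rows: $\nexists i$, $0 \leq i \leq p-1$, such that $\bar{x}_{i} = \bar{y}_{i} = \varepsilon$.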
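
As a worked example of this notation (the sequences are chosen here only for illustration), take $X = ACGA$, so $m = 4$, and $Y = ATGCTA$, so $n = 6$. One possible alignment is
\[
z = \begin{pmatrix} A \\ A \end{pmatrix}
\begin{pmatrix} C \\ T \end{pmatrix}
\begin{pmatrix} G \\ G \end{pmatrix}
\begin{pmatrix} \varepsilon \\ C \end{pmatrix}
\begin{pmatrix} \varepsilon \\ T \end{pmatrix}
\begin{pmatrix} A \\ A \end{pmatrix}
\]
Its size is $p = 6$, which lies between $n = 6$ and $n + m = 10$. The rows are $X' = A\,C\,G\,\varepsilon\,\varepsilon\,A$ and $Y' = A\,T\,G\,C\,T\,A$, and no column has $\varepsilon$ in both rows. A natural reading of the columns in terms of the three basic operations, stated here as an assumption, is that a column with two letters is a substitution, a column with $\varepsilon$ in the top row is an insertion, and a column with $\varepsilon$ in the bottom row is a deletion. Under that reading, this alignment uses one substitution of $C$ by $T$, two insertions, and three substitutions of a letter by itself.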