Skip to content

[Rule] MaximumContactMapOverlap to ILP #1044

@zhangjieqingithub

Description

@zhangjieqingithub

Source

MaximumContactMapOverlap

Source model issue: #1043

Target

ILP/bool

Motivation

This rule gives MaximumContactMapOverlap immediate solver connectivity through the existing ILP hub.

The formulation follows the standard contact-map-overlap idea: choose an order-preserving residue alignment and maximize the number of contact pairs that are superimposed by the alignment.

Reference

  • Rumen Andonov, Noel Malod-Dognin, and Nicola Yanev, "Maximum Contact Map Overlap Revisited," Journal of Computational Biology 18(1):27-41, 2011. https://doi.org/10.1089/cmb.2009.0196
  • Wei Xie and Nikolaos V. Sahinidis, "A reduction-based exact algorithm for the contact map overlap problem," Journal of Computational Biology 14(5):637-654, 2007. https://doi.org/10.1089/cmb.2007.R007

Reduction Algorithm

Let the source instance contain ordered contact maps
G_1 = (V_1,E_1) and G_2 = (V_2,E_2), with
V_1 = {0,...,n_1-1} and V_2 = {0,...,n_2-1}.

Construct a binary ILP as follows.

  1. For each possible residue match (i,j) in V_1 x V_2, create a binary variable x_{i,j}.
    It means vertex i of G_1 is aligned to vertex j of G_2.

  2. For each pair of contacts {i,k} in E_1 and {j,l} in E_2 with i < k and j < l, create a binary variable y_{i,k,j,l}.
    It means contact {i,k} is preserved by aligning i to j and k to l.

  3. Add row constraints for every i in V_1:

    sum_j x_{i,j} <= 1.

    Each vertex of G_1 is matched at most once.

  4. Add column constraints for every j in V_2:

    sum_i x_{i,j} <= 1.

    Each vertex of G_2 is matched at most once.

  5. Add order-preservation constraints. For every i < k in V_1 and every j >= l in V_2, add:

    x_{i,j} + x_{k,l} <= 1.

    This forbids crossings and equal-image matches, so all matched images are strictly increasing.

  6. Link each preserved-contact variable to its endpoint-match variables:

    y_{i,k,j,l} <= x_{i,j}

    y_{i,k,j,l} <= x_{k,l}.

    Because every y has positive objective coefficient and the objective is maximization, an optimal solution sets y_{i,k,j,l} = 1 exactly when the two endpoint matches are selected.

  7. Maximize:

    sum y_{i,k,j,l}

    over all contact pairs from E_1 x E_2 that respect endpoint order.

  8. Extract the CMO alignment by mapping i to the unique j with x_{i,j} = 1, and leaving i unmatched if all x_{i,j} are zero.

Correctness Argument

The row and column constraints make the selected x variables a partial injective matching between the two ordered vertex sets. The crossing constraints ensure that if i < k in the first contact map and both vertices are matched, then their images occur in the same order in the second contact map. Therefore every feasible ILP assignment projects to a feasible order-preserving partial alignment.

Conversely, any feasible CMO alignment defines a feasible ILP assignment by setting x_{i,f(i)} = 1 for matched vertices and all other x variables to zero. The alignment is injective and order-preserving, so all row, column, and crossing constraints are satisfied.

For every contact {i,k} in E_1 and contact {j,l} in E_2, the corresponding y variable can be one only when both endpoint matches are selected. Since all y variables have positive objective coefficient and no negative side effect, an optimal ILP sets exactly the preserved-contact y variables to one.

Thus the ILP objective equals the number of contacts preserved by the alignment, and maximizing the ILP objective is exactly maximizing contact-map overlap.

Size Overhead

Let:

  • n_1 = num_vertices_1,
  • n_2 = num_vertices_2,
  • m_1 = num_contacts_1,
  • m_2 = num_contacts_2.

A direct formulation has:

Target metric Formula
num_vars n_1 * n_2 + m_1 * m_2
num_constraints n_1 + n_2 + n_1^2 * n_2^2 + 2 * m_1 * m_2

The crossing-constraint count is a conservative quadratic-pair bound. Implementations may generate only the invalid ordered pairs i < k and j >= l, but the asymptotic overhead remains O(n_1^2 n_2^2 + m_1 m_2).

Validation Method

Use closed-loop validation on small contact maps:

  • enumerate all order-preserving partial alignments directly,
  • compute the maximum contact overlap by brute force,
  • solve the ILP instance,
  • extract the alignment from the x variables,
  • verify that the extracted alignment is order-preserving and preserves exactly the ILP objective number of contacts.

Include cases where:

  • the best alignment leaves some residues unmatched,
  • a crossing alignment would preserve more contacts but must be rejected,
  • two alignments tie for optimum,
  • no contact is preserved,
  • all contacts of the smaller map can be preserved.

Example

Use the source instance from #1043.

First contact map:

E_1 = {{0,2}, {1,3}}.

Second contact map:

E_2 = {{0,3}, {1,4}, {0,2}}.

The optimal alignment is:

Vertex in G_1 Image in G_2
0 0
1 1
2 3
3 4

Set the corresponding match variables to one:

  • x_{0,0} = 1,
  • x_{1,1} = 1,
  • x_{2,3} = 1,
  • x_{3,4} = 1.

Then the preserved-contact variables include:

  • y_{0,2,0,3} = 1, since {0,2} maps to {0,3},
  • y_{1,3,1,4} = 1, since {1,3} maps to {1,4}.

The ILP objective value is therefore 2. Since G_1 has only two contacts, this is optimal and matches the source CMO objective.

BibTeX

@article{AndonovMalodDogninYanev2011CMO,
  author  = {Rumen Andonov and Noel Malod-Dognin and Nicola Yanev},
  title   = {Maximum Contact Map Overlap Revisited},
  journal = {Journal of Computational Biology},
  volume  = {18},
  number  = {1},
  pages   = {27--41},
  year    = {2011},
  doi     = {10.1089/cmb.2009.0196},
  url     = {https://doi.org/10.1089/cmb.2009.0196}
}

@article{XieSahinidis2007CMO,
  author  = {Wei Xie and Nikolaos V. Sahinidis},
  title   = {A Reduction-Based Exact Algorithm for the Contact Map Overlap Problem},
  journal = {Journal of Computational Biology},
  volume  = {14},
  number  = {5},
  pages   = {637--654},
  year    = {2007},
  doi     = {10.1089/cmb.2007.R007},
  url     = {https://doi.org/10.1089/cmb.2007.R007}
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    ruleA new reduction rule to be added.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status

    No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions