Next Article in Journal
A New Modified Helmholtz Equation for the Expression of the Gravity Gradient and the Intensity of an Electrostatic Field in Spherical Harmonics
Previous Article in Journal
Theorems for Boyd–Wong Contraction Mappings on Similarity Spaces
Previous Article in Special Issue
Edge-Weighted Consensus-Based Formation Control with Collision Avoidance for Mobile Robots Based on Multi-Strategy Mutation Differential Evolution
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

A Population-Based Local Search Algorithm for the Identifying Code Problem

by
Alejandro Lara-Caballero
* and
Diego González-Moreno
Department of Applied Mathematics and Systems, Universidad Autónoma Metropolitana, Cuajimalpa, Mexico City 05348, Mexico
*
Author to whom correspondence should be addressed.
Mathematics 2023, 11(20), 4361; https://doi.org/10.3390/math11204361
Submission received: 18 August 2023 / Revised: 30 September 2023 / Accepted: 6 October 2023 / Published: 20 October 2023

Abstract

:
The identifying code problem for a given graph involves finding a minimum subset of vertices such that each vertex of the graph is uniquely specified by its nonempty neighborhood within the identifying code. The combinatorial optimization problem has a wide variety of applications in location and detection schemes. Finding an identifying code of minimum possible size is a difficult task. In fact, it has been proven to be computationally intractable (NP-complete). Therefore, the use of heuristics to provide good approximations in a reasonable amount of time is justified. In this work, we present a new population-based local search algorithm for finding identifying codes of minimum cost. Computational experiments show that the proposed approach was found to be more effective than other state-of-the-art algorithms at generating high-quality solutions in different types of graphs with varying numbers of vertices.

1. Introduction

An identifying code is a dominating subset of vertices in a graph such that the closed neighborhood of any vertex has a distinct intersection with the identifying code. The fundamental idea is to uniquely identify an element in a system by its neighbors. The goal of the identifying code problem is to find a code of minimum cardinality for any given graph. Identifying codes can be applied to a wide variety of applications such as fault detection and location detection [1,2]. The definition of identifying codes was motivated by the need to solve fault diagnosis problems in multiprocessor systems [1]. The main objective of fault diagnosis is to test a system and locate faulty nodes. The aim is to choose the smallest subset of processors, known as a code, to perform fault diagnostics at the lowest possible cost. A processor in the code is called a codeword. A codeword tests itself and all the processors it is connected to. If a codeword detects a fault, it sends an alarm signal. Once some alarms are activated, the exact location of the fault can be determined.
The term identifying codes was first coined by Karpovsky et al. in [1], where they described the conceptual foundations and provided results and lower bounds for graphs with specific topologies, such as binary cubes, non-binary cubes, and trees. Since this prominent paper, the theoretical and practical aspects of identifying codes have attracted the attention of many scholars during the last few decades.
From a theoretical perspective, researchers have studied the identifying code for many classes of graphs. For instance, Charon et al. [3] studied the identifying codes of different types of infinite graphs, such as square and triangular lattices. They also showed that the problem complexity in several types of graphs is NP-hard. Some classes that have been extensively studied include trees [4], paths [5], cycles, [5,6,7], hypercubes [8,9], and grids [10]. For additional references on the theoretical aspect of identifying codes and other related problems, we refer the reader to D. Jean’s web page [11].
From a practical viewpoint, the identifying code problem has numerous applications in real-world domains, such as fault diagnosis of multiprocessor systems [1], compact routing in networks [12], emergency sensor networks in facilities [2], and the analysis of secondary RNA structures [13]. Due to its wide applicability, some researchers have focused on designing algorithms and techniques to solve the problem computationally.
Laifenfeld et al. studied covering problems using identifying codes in [14]. Approximation algorithms for computing identifying codes for certain types of graphs are given in [15,16]. In the context of emergency sensor networks, there have been several attempts to apply identifying codes for target identification [17]. In particular, Ray et al. [2] proposed a greedy algorithm called ID-CODE to construct irreducible identifying codes. It should be emphasized that their algorithm converges to a local minimum, which is not necessarily close to the global minimum. In [18], Xiao et al. formulated the problem using an integer program and developed a genetic algorithm to solve it. Recently, Horan et al. [19] compared three approaches for finding minimum identifying codes on Brujin graphs using quantum annealing.
The identifying code problem is proven to be NP-complete, even when restricted to the case of bipartite graphs [20]. This means that there is no known polynomial-time algorithm to solve the problem, so the use of heuristics is justified to obtain quality solutions in a reasonable time frame. To the best of our knowledge, the use of population-based approaches on the identifying code problem from a graph theory perspective is limited in the literature. This paper presents our work in this direction.
This study proposes a new population-based local search method (PB-LS) for solving the identifying code problem. While existing techniques offer viable solutions, they often struggle to balance exploration and exploitation, especially for large problem instances. Our main contribution is a local search strategy based on configuration checking that addresses these limitations by improving the quality of solutions. Our goal is to contribute to ongoing efforts to find higher-quality solutions for the identifying code problem.
The experimental results demonstrate that the proposed algorithm outperforms other state-of-the-art methods in several test instances, suggesting that combining population-based heuristics with a local search is a promising approach for solving the identifying code problem.
The rest of the manuscript is organized as follows. Section 2 provides some necessary notation and definitions. Section 3 introduces the proposed algorithm and details its components. Section 4 presents the computational experiments and discussion. In Section 5, conclusions and paths for future investigation are drawn.

2. Preliminaries

A graph G consists of a pair of sets ( V ( G ) , E ( G ) ) , where V ( G ) is the vertex set of G and E ( G ) is the edge set of G. Specifically, E ( G ) is a set of unordered pairs of vertices of V ( G ) . If u , v V ( G ) , an edge between u and v is denoted by u v . As we mentioned previously, the physical location of the components in many mathematical and real-life problems can be modeled using graphs. For example, given a multiprocessor system, we associate it with a graph G in the following way: the vertices of G are the processors and the edges represent the links between them. If there is just one fault at the time, an identifying code can be used to determine the exact location of the fault once a certain number of alarms are activated.
Let G be a graph and let v be a vertex of G. The neighborhood N ( v ) of v is the set of all vertices adjacent (directly connected) to v and the closed neighborhood N [ v ] of v is the set N ( v ) { v } . That is, the closed neighborhood includes both the neighbor vertices of v and v itself. Some definitions are described as follows.
Definition 1.
Given a graph G, a subset C of V ( G ) is called a separating code if for each pair of distinct vertices u , v of G, we have N [ u ] C N [ v ] C . (An example is shown in Figure 1).
Definition 2.
Given a graph G, a subset C of V ( G ) is called dominating if for every vertex v V ( G ) , we have C N [ v ] . (An example is shown in Figure 2).
Definition 3.
Given a graph G, a subset C of V ( G ) is called an identifying code if C is both a separating code and a dominating code of G. (An example is shown in Figure 3).
To illustrate the meaning of the identifying code, we use the graph in Figure 3, which represents a simplified multiprocessor system. To diagnose faulty processors, specific software routines are executed on a select group of processors. Suppose that these special processors are located at the vertices C = { v 1 , v 2 , v 5 , v 6 } . It can be verified that the intersection between this group and the close neighborhood of each processor is unique: N [ v 1 ] C = { v 1 , v 5 } , N [ v 2 ] C = { v 2 } , N [ v 3 ] C = { v 1 , v 2 , v 6 } , N [ v 4 ] C = { v 1 } , N [ v 5 ] C = { v 1 , v 5 , v 6 } , N [ v 6 ] C = { v 5 , v 6 } . Therefore, the vertex set C allows for the unique identification of a faulty processor and is an identifying code.
The aforementioned relationships can be expressed conveniently in matrix notation. If G is a graph with V ( G ) = { u 1 , u 2 , , u n } , the adjacency matrix of G is the n × n ( 0 , 1 ) -matrix A ( G ) = [ a i j ] , where
[ a i j ] = 1 if u i u j E ( G ) , 0 if u i u j E ( G ) .
The closed adjacency matrix is the matrix A + I , where I is the n × n -identity matrix. Let S be a subset of V ( G ) . We can associate a vector ( s 1 , s 2 , , s n ) with S such that s i = 1 , if s i S and s i = 0 , if s i S . The closed neighborhood of a vertex u i of G is the ith row of the closed adjacency matrix, that is
N [ u i ] = ( a i 1 , a i 2 , , a i n ) .
Let G be a graph of order n and vertex set V ( G ) = { u 1 , u 2 , , u n } . Let C be a subset of vertices of G with | C | = k . Let M = [ m i j ] be a k × n matrix with entries from { 0 , 1 } such that
[ m i j ] = 1 if u i N [ u j ] C , 0 if u i N [ u j ] C .
In other words, M is a matrix obtained from the closed adjacency matrix of G by deleting the rows corresponding to the vertices not in C. Observe that C is an identifying code of G if M satisfies the following conditions:
(i)
all the columns of M contain at least a 1,
(ii)
all the columns of M are distinct.
Condition (i) states that N [ u ] C for all v V ( G ) , and condition (ii) states that N [ u ] C N [ v ] C , for every u , v V ( G ) . To clarify the expression of an identifying code using the matrix point of view, consider as an example the graph G depicted in Figure 3. The vertex set C = { v 1 , v 2 , v 5 , v 6 } is an identifying code of G. We present the adjacency closed matrix A and the matrix M obtained by considering the row corresponding to vertices of C. Note that M satisfies conditions (i) and (ii).
A = 1 0 1 1 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 0 1 1 0 0 1 0 0 0 1 1 0 0 1 0 1 1 ,   M = 1 0 1 1 1 0 0 1 1 0 0 0 1 0 0 0 1 1 0 0 1 0 1 1 .
Therefore, the problem of finding a minimum identifying code is equivalent to that of finding a ( 0 , 1 ) -matrix M of n columns and the minimum number of rows satisfying the previous conditions; the rows of this matrix are an identifying code of the graph.

An Integer Formulation for the Identifying Code Problem

The minimum identifying code problem in graphs can be modeled by an integer program formulation [21]. Let G be a graph of order n and let A = [ a i j ] be the closed adjacency matrix of G, as defined in Section 2. Our vertex subset C is defined as a vector ( c 1 , c 2 , , c n ) T with c i = 1 , if c i C and c i = 0 , if c i C . Then, C is an identifying code of G if and only if the following conditions hold:
k = 1 n | a i k a j k | · c k 1 ,
A · C 1 T .
Inequality (5) implies that in order for C to be a separating code, the condition must be fulfilled for all pairs of vertices v i and v j , where i j and i , j { 1 , 2 , , n } . Inequality (6) is required for the dominating property to be satisfied. Inequalities in (5) and (6) can be grouped in a matrix U = [ u i j ] m x n , where m = ( n 2 n ) / 2 . We can then define the augmented matrix,
I G = U A .
Observe that I G consolidates both the separating and dominating constraints of the problem; we call I G the identifying matrix of G. Thus, the integer programming model is given by the following:
min         i = 1 n c i s . t . I G · C 1 T c i { 0 , 1 } , i { 1 , 2 , , n }
The formulation described in (8) has all binary variables, and the goal is to find the minimum-size code. Although the identifying code problem differs from set covering in terms of the type of constraints, some of the ideas from set covering can be adapted to solve it. The integer programming model has n 2 + n / 2 rows constraints, which can be numerous even for small graphs. For example, in a graph with 100 vertices, the number of constraints is approximately 5000. This makes it challenging to solve large-scale instances of the minimum identifying code problem. For this reason, the development of efficient algorithms to solve the problem is an important research topic, and new algorithms are needed.

3. Proposed Population-Based Local Search

In this section, we focus on the proposed approach for solving the identifying code problem. Population-based algorithms are general-purpose optimization methods that have been successfully applied to a wide range of combinatorial optimization problems [22]. Generally, population-based heuristics are strong at exploring the search space but weak at exploiting the solutions they find. Therefore, in this work, we employ a local search mechanism to augment the intensification ability. The flowchart of the proposed algorithm is shown in Figure 4.
In the following subsections, we describe the solution representation, fitness function, initial population creation, selection, and repair mechanism of the proposed algorithm. The repair mechanism was inspired on the work by Xu et al. [18], but we have made some modifications to improve its performance. We also describe the local search procedure in detail and the update population strategy.

3.1. Solution Representation and Fitness Function

A solution is represented as a binary array of length equal to the number of vertices in the instance that needs to be solved. Specifically, a value of 1 in the lth element of the array indicates that vertex l is in the solution; a value of 0 indicates that it is not. An example is shown in Figure 5. Here, we can see that vertices 1, 3 and 4 belong to the solution, while vertices 2 and 5 do not.
The fitness function used to evaluate the population is identical to our optimization objective. That is, the fitness of a vector solution C = ( c 1 , c 2 , , c n ) is calculated by the following equation:
f i t n e s s = i = 1 n c i
In other words, the smaller the size of the code, the better the fitness of a solution. One challenge of using binary representation is that operators can produce infeasible solutions (i.e., solutions that violate one or more constraints). Several mechanisms exist for handling infeasible solutions such as penalty functions and repair mechanisms [23]. Penalty functions penalize the fitnesses of infeasible solutions without distorting the fitness landscape. Repair mechanisms, on the other hand, seek to design specific operators to transform infeasible solutions into feasible ones. We chose the repair mechanism approach due to its good results in the literature [18] and because it can be difficult to determine an effective penalty function [24,25]. More details about the repair mechanism can be found in Section 3.5.

3.2. Population Initialization

The algorithm begins by creating a set of feasible solutions to form the initial population. To ensure that the initial population is diverse and spread out across the solution space, a random initialization method is used. The pseudocode for creating an initial solution is detailed in Algorithm 1.
At first, every vertex in the solution is established as a codeword (Line 1). It is known that a set that includes all vertices of a graph is an identifying code [2]. Once all vertices are set, to generate a uniformly distributed population over the search space, each vertex has a 0.5 probability of being removed (Lines 2–4). If removing the vertex results in an identifying code, we proceed to the next vertex. Otherwise, the removed vertex is reinserted into the solution, and we proceed to consider another vertex (Lines 5–9).
After this step, the feasible solution may still contain non-essential or redundant vertices. A vertex is considered redundant if it can be deleted from the solution and still be an identifying code. Consequently, some unnecessary vertices are eliminated randomly (Lines 10–17). It is worth noting that not all redundant vertices are removed. This initial redundancy in the solutions can provide more possibilities in the search process. This procedure is repeated N times until the initial population is fully populated.
Algorithm 1 Initial_Solution()
  1:
Generate C : = V
  2:
for all vertex v in C do
  3:
    if  r a n d ( ) < 0.5  then
  4:
         C : = C { v } ;
  5:
        if C is not an identifying code after removal then
  6:
            C S : = C S { v };
  7:
        end if
  8:
    end if
  9:
end for
10:
t : = | C |
11:
while  t > 0   do
12:
     v : = Randomly select a vertex from the first vertices of C;
13:
    if v is not essential then
14:
         C : = C { v } ;
15:
         t : = t 1 ;
16:
    end if
17:
end while
18:
return C;

3.3. Selection

The selection process determines which solution is chosen for being improved by the local search mechanism. The main idea is that the better a solution, the higher its likelihood of being selected. There are several selection mechanisms, but tournament selection is probably the most popular strategy due to its simplicity and efficiency [26].
This method involves randomly selecting n individuals from a population P; then, the selected individuals compete against each other. If n = 2 , the process is called a binary tournament selection. In our algorithm, a binary tournament selection has yielded effective outcomes.

3.4. Developed Local Search Algorithm

In large and complex search spaces, population-based algorithms have been seen as search techniques that can find high-performance regions. However, they are not ideal for fine-tuning solutions [27]. This paper presents a local search method based on the concept of configuration checking to intelligently add and remove vertices from the solution. Local search methods show great performance on seeking good solutions by exploring the neighborhood structure. In the following subsections, we discuss the score function, the configuration checking mechanism, and the main local search procedure.

3.4.1. Scoring Function

To calculate the score value, denoted as s c ( v ) , we consider the frequency of all the rows in I G , referring to the idea in [28]. Let R ( I G ) = { r 1 , r 2 , , r m } be the set that contains all the row vectors of I G and let q ( r ) be an attribute for each r i , i { 1 , 2 , , m } denotating the frequency of the ith row. For a row r of the constraints, if I ( r , v ) = 1 , we say that “row r is covered by vertex v”.
At the beginning, each row r is assigned a frequency of 1. After each iteration, the frequency of any row r that is not covered by the vertices in the candidate solution, C S , is increased by 1. This encourages the algorithm to find vertices that fulfill the dominating and separating code constraints for those uncovered rows in the next iterations. Based on this idea, we propose the following scoring function:
s c ( v ) = r C 1 q ( r ) , if v C S , r C 2 q ( r ) , if v C S .
where C 1 is the set of rows which will become uncovered if vertex v is removed from C S , while C 2 is a set of rows whose current status is uncovered and will be covered by adding v to C S . Regardless of whether we are removing or inserting vertices, we prefer one with the highest s c ( v ) value.

3.4.2. Configuration Checking Strategy

Configuration checking (CC) is a diversification strategy that was introduced in [29] to reduce the cycling problem in local search. The intuition behind CC is that by reducing cycles on local structures of the candidate solution, we may also reduce cycles on the whole candidate solution. CC has been shown to be effective in reducing cycling in local search for combinatorial optimization problems such as combinatorial interaction testing [30], minimum weighted vertex cover [31,32,33], constraint satisfaction problems such as satisfiability [34,35,36], maximum satisfiability [37], and a minimum weighted clique problem [38]. However, to the best of our knowledge, the configuration checking mechanism has not been adapted to the identifying code problem.
The CC strategy depends on the configuration concept. For each vertex, v, its configuration denotes the states of all the vertex in its neighborhood, N ( v ) . If the configuration of v has not changed since the last time it was removed from the candidate solution, C S , then it is not allowed to be added back to C S . To implement the CC strategy, a boolean array named c o n f is generally employed, if c o n f [ v ] = 1 , then v is authorized to be added to C S ; otherwise, c o n f [ v ] = 0 and v is forbidden to be added to C S .
In this work, we used the improved CC strategy, ES-CC, proposed by Wang et al. [28] to update the c o n f array after each iteration. As opposed to the traditional CC strategy, ES-CC considers the element or row states in addition to the neighboring states. A set R c h a n g e that includes all the rows of the I G that change its state to respect to v (i.e., rows that were covered by v and are now uncovered after v is removed) must be defined. Also, the set P v which contains all the rows of the I G covered by v should be specified.
For the identifying code problem, the strategy can be adapted as follows. In the beginning, the value of c o n f [ v ] is initialized as 1 for each vertex v ( G ) . Afterwards, if a vertex v has changed its state (i.e., being added or removed from C S ), then the value of c o n f [ v ] is assigned to 0. For each vertex u N ( v ) , if R c h a n g e P v , then c o n f [ u ] is assigned to 1.
Considering the score function and the CC strategy, the following rules to remove and add vertices can be defined.
  • Removing rule: Choose a vertex v with the highest s c ( v ) value and update the configuration to reflect this change.
  • Adding rule: Randomly select an uncovered row r and choose a vertex v that covers it. Vertex v should be allowed to be added to the solution with c o n f [ v ] = 1 and the highest s c ( v ) value. Ties are broken randomly. Update the configuration to reflect this change.

3.4.3. Local Search Main Procedure

The pseudocode of the local search is shown in Algorithm 2. The procedure receives as input a candidate solution, C S , and the maximum number of local search iterations, m a x _ i t e r . At the beginning, some variables are initialized: i t e r , t a b u , q ( r ) for all rows, while c o n f and s c ( v ) values are set for all vertices (Line 1). Also, the best solution found so far C S * is established as C S (Line 2).
Algorithm 2 Local Search Procedure ( C S , m a x _ i t e r )
  1:
Initialize i t e r , t a b u , q ( r ) for all rows, c o n f ( v ) and s c ( v ) for all vertices;
  2:
C S * : = C S ;
  3:
while  i t e r < m a x _ i t e r   do
  4:
    if all rows in I G are covered then
  5:
        if  C S < C S *  then
  6:
            C S * : = C S ;
  7:
            i t e r = 0 ;
  8:
        end if
  9:
         v : = a vertex selected based on Removal Rule;
10:
         C S : = C S { v }
11:
    end if
12:
     v : = a vertex selected based on Removal Rule and v t a b u ;
13:
     t a b u : =
14:
     C S : = C S { v } ;
15:
     v : = a vertex selected based on Adding Rule;
16:
     C S : = C S { v };
17:
     t a b u : = t a b u { v } ;
18:
end while
19:
return  C S * ;
While the maximum number of iterations is not reached, the solution is improved (Lines 13–18). If the algorithm arrives at a C S in which all vertices form an identifying code, then we update the best known solution C S * and reduce the size of C S (Lines 5–10). We define that C S < C S * if for the associated vectors C S = ( c 1 , c 2 , , c n ) and C S * = ( c 1 * , c 2 * , , c n * ) , it follows that i = 1 n c i < i = 1 n c i * . To remove a vertex, the algorithm decides based on the highest s c ( v ) value and prevents the vertex that has been added in the previous step from being removed with a tabu list (Lines 12–14). A new vertex is chosen by means of Adding Rule (Lines 15–16). Finally, the incumbent solution C S * is returned (Line 19).

3.5. Repair Mechanism

Because infeasible solutions can occur during the search process, the following repair mechanism is applied. This procedure consists of two main steps. First, we verify that all dominating and separating constraints are fulfilled. If some constraints are missing, this means that some rows of the I G matrix are uncovered. A row of matrix I G is said to be uncovered if there is no vertex that is included in the solution that satisfies the row constraint. We proceed to compute the set of these uncovered rows. Then, we consider these rows one at a time in their natural order. With probability of 0.5, we select a vertex that covers the row in consideration and covers the greatest number of remaining uncovered rows. Otherwise, we randomly select a vertex from all the vertices that cover the row.
Our approach differs from previous ones for identifying codes, such as the one proposed in [18]. Previous approaches used a greedy policy to select the vertex that covers the row in consideration. Our approach combines a greedy policy that considers the minimum cost-to-benefit ratio with a restricted criterion of randomness. In our experience, this flexibility promotes higher-quality solutions.
After the first phase, there may be some redundant vertices in the solution. In order to obtain an irreducible code, as stated in [2], we define a procedure to delete them. A vertex is considered redundant if it can be deleted from the solution and still be an identifying code. Therefore, we iterate over the solution sequentially and seek for possible removals by flipping set bits and verifying if the solution is still feasible.

3.6. Update Population Strategy

An adequate population update strategy is essential to maintain diversity and explore promising areas of the search space. After applying the proposed local search and repair mechanism, a retention policy should be used. This policy outlines whether a solution is good enough to join the population, and if so, which existing solution should be replaced.
In nature, the fittest members in a population are more likely to survive. This is known as the elite retention strategy [26]. Inspired by this strategy, we retain the top elements in the population following a steady-state model. Let | P o p | be the population size. During the search process, the population constantly maintains the top | P o p | optimal solutions. After each iteration, if the fitness of the new solution found by local search is higher than the fitness of the worst solution, the new solution replaces the inferior one. To increase diversity, if the fitness of the new solution is equal to the fitness of the worst solution, the new solution replaces the worst solution at random. One advantage of this model is that the newly generated solution is immediately available for selection. In our experiments, we found that the strategy was able to improve convergence.

4. Computational Experiments

This section presents the results of the experiments conducted to evaluate the proposed algorithm on a variety of graphs. We compare our algorithm with other state-of-the-art heuristics and an integer programming approach.

4.1. Test Instances

The computational experiments were carried out using different types of lattice graphs, namely square, triangular, and hexagonal grids. In previous studies, grids have been used to test algorithms for solving the identifying code problem. These studies have shown that grids can have complex landscapes for metaheuristics [18]. Also, the properties of lattices are well described in the literature, and they depict some essential components of real-world networks, such as roads [39]. We also explored the use of hypercube graphs due to their suitability as a network topology for communications [40]. To illustrate the different types of graphs employed in this paper, Figure 6 depicts some small graphs and their minimum identifying codes.

4.2. Data Preprocessing

The computational complexity of solving the identifying code problem depends primarily on the number of rows and columns in the I G . To reduce the dimension of the test instances, we implemented some of the reduction procedures outlined by Beasley [41]. Specifically, there are two basic steps: eliminating redundant rows (constraints) and eliminating columns (vertices) that are subsets of other columns.
Table 1 presents the details of the reduced instances for each type of graph. Input data for these well-known graph families can be generated using open-source frameworks such as Octave (https://octave.org) and SageMath (https://www.sagemath.org). In our work, we used SageMath to generate the adjacency matrices of the graphs of interest, which we then exported to text files that could be read by our C++ program. The type of graph, its dimensions, and the name of the instance are specified in columns 1, 2, and 3, respectively. The information related to the vertices is reported in columns 4–7, while the information regarding constraints is reported in columns 8–10. It is worth noting that increasing the size of the grids and hypercubes significantly increases the search space for each instance.
As can be observed, this method does not allow us to remove any vertex from the solution a priori. In our experiments, all algorithms only consider the essential constraints, or rows, in the core. It is worth noting that working on this reduced scheme is equivalent to working on the original one. In our experience, this preprocessing step does not significantly impact the quality of the solutions, but it reduces the overall running time of the algorithm.

4.3. Experimental Settings

The proposed approach is compared to several state-of-the-art metaheuristics: the genetic algorithm for the identifying code problem by Xu et al. [18] (GA), artificial bee colony (ABC), firefly algorithm (FFA), and cat swarm optimization (CSO). These algorithms have shown great performance in a wide variety of discrete problems. In the remainder of this paper, the following acronyms will be used to identify each algorithm: GA, ABC, FFA, CSO, and PB-LS (for the population-based local search algorithm described in Section 3). We also used the commercial software Lingo 19.0 to solve the integer program and compare the results to those obtained with the metaheuristics.
The techniques were coded in the C++ language and deployed on a desktop computer with a Pentium i7 (3.4 GHz) processor, 32 GB of RAM, and Microsoft Windows 10 operating system. The algorithms were compiled using MinGW 8.0.
Regarding the operating parameter settings, there are two main parameters that need to be tuned: the population size N and m a x _ i t e r , which controls the maximum number of iterations in the local search stage. The parameters were tuned using a brute force approach, as described in [42]. These parameter values provide good results, although they may not be optimal for all instances. To ensure a fair comparison with the standard algorithms, all parameter values are consistent, and the same parameters that appear in different algorithms have the same values. The remaining parameters for GA, ABC, FFA, and CSO were obtained from [18,43,44,45], respectively. The parameters for the proposed algorithm are shown in Table 2.
To ensure an impartial comparison of the internals of each metaheuristic, the initialization procedure from Section 3.2 was used for all algorithms. To handle constraints, the repair mechanism as described in Section 3.5 was applied uniformly across all techniques.

4.4. Experiments

Following previous works on identifying codes [18] and other related graph and set problems, we used a time limit as the stopping criterion for all algorithms. The cutoff time was set to 900 s (15 min) for all instances and all algorithms. To address the stochastic nature of heuristic techniques, 30 independent runs were performed for each algorithm. Additionally, the same seeds were used for all algorithms to ensure that all techniques started with the same initial solutions. Because the Lingo solver consistently produces the same result across different executions, we only recorded the minimum value found. Table 3, Table 4, Table 5 and Table 6 show the detailed results obtained by different instances of our proposed PB-LS. The algorithms are compared in terms of their minimum solution values (min), average solution values (avg) over 30 executions, and relative percentage deviation (RPD), which are given by the following:
R P D = m i n b e s t b e s t × 100 ,
where b e s t is the top solution obtained from all runs and algorithms for each instance. The best results per table row and instance are indicated in bold font.
Table 3 shows the results from the hexagonal lattice graphs. Our algorithm outperformed all others in three out of four instances, both in terms of the minimum and average solutions found. Only in instance H1 did GA, ABC, FFA, CSO, and Lingo find the best solution. ABC was the second-best competitor in terms of average solution. Instances H3 and H4 are among the most complex due to their dimensions, and PB-LS was able to find the best solutions for these instances.
Table 4 summarizes the results obtained from the hypercube test instances. Our algorithm achieved the best results on two out of four instances. It is worth noting that instances Q1 and Q2 have the fewest number of vertices and constraints among all the test instances.
Table 5 shows the results obtained from the grid graphs. Our algorithm had a good performance, achieving four out of five of the best solutions. GA, ABC, FFA, CSO and Lingo had similar results in terms of the minimum solution found.
Table 6 shows the results obtained from the triangular lattice test instances. Here, PB-LS was able to find three out of five of the best solutions. It is worth noting that both GA and FFA struggled with this type of graph, as reflected by their high RPD values.
Previous experiments and discussions have shown that the PB-LS algorithm for the identifying code problem outperforms competing algorithms, especially on large graphs.

5. Conclusions

Identifying codes are a concept borrowed from graph theory that can be used to solve problems such as fault detection and location detection. The computational complexity of the problem is NP-complete, making it challenging to find high-quality solutions using traditional techniques. In this scenario, heuristics are a viable alternative that can find good solutions in a reasonable amount of time.
In this paper, we propose a new population-based local search algorithm (PB-LS) to find an identifying code with minimum cost. Our PB-LS incorporates a local search mechanism with the concept of configuration checking to avoid cycles and achieve high-quality solutions.
The effectiveness of the proposed technique was evaluated using four different graph families with different vertex sizes. We compared the algorithm to other state-of-the-art metaheuristics, including GA, ABC, FFA, and CSO, in terms of minimum and average code size. We also used the commercial software Lingo to solve the integer program. The experimental results showed that the proposed algorithm was superior to other algorithms on most instances. However, the PB-LS is considered a strong alternative for finding minimum identifying codes.
There are several potential directions for future research. In particular, we believe that exploring advanced constraint handling methods such as adaptive penalty functions can improve solution quality. We are also interested in combining other population-based techniques and local search methods to create more effective algorithms for solving the identifying code problem. Additionally, we plan to explore the use of metaheuristics in conjunction with other approaches, such as mathematical programming, constraint programming, or machine learning. Finally, we plan to adapt the proposed strategy to solve other interesting problems. We believe that some of the ideas devised in this work can be generalized and applied to solve other known NP-hard problems in graph theory.

Author Contributions

Conceptualization, D.G.-M.; methodology, A.L.-C. and D.G.-M.; software, A.L.-C.; validation, A.L.-C. and D.G.-M.; formal analysis, A.L.-C. and D.G.-M.; investigation, A.L.-C. and D.G.-M.; data curation, A.L.-C.; writing—original draft preparation, A.L.-C. and D.G.-M.; writing—review and editing, A.L.-C. and D.G.-M.; visualization, A.L.-C. and D.G.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by CONAHCYT under project CB-2017-2018 No. 45208.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Karpovsky, M.; Chakrabarty, K.; Levitin, L. On a new class of codes for identifying vertices in graphs. IEEE Trans. Inform. Theory 1998, 44, 599–611. [Google Scholar]
  2. Ray, S.; Ungrangsi, R.; De Pellegrini, F.; Trachtenberg, A.; Starobinski, D. Robust location detection in emergency sensor networks. IEEE J. Sel. Areas Commun. 2004, 22, 1016–1025. [Google Scholar]
  3. Charon, I.; Hudry, O.; Lobstein, A. Identifying codes with small radius in some infinite regular graphs. Electron. J. Comb. 2002, 9, R11. [Google Scholar] [CrossRef] [PubMed]
  4. Bertrand, N.; Charon, I.; Hudry, O.; Lobstein, A. 1-identifying codes on trees. Australas. J. Comb. 2005, 31, 21–36. [Google Scholar]
  5. Chen, C.; Lu, C.; Miao, Z. Identifying codes and locating–dominating sets on paths and cycles. Discret. Appl. Math. 2011, 159, 1540–1547. [Google Scholar]
  6. Bertrand, N.; Charon, I.; Hudry, O.; Lobstein, A. Identifying and locating-dominating codes on chains and cycles. Eur. J. Comb. 2004, 25, 969–987. [Google Scholar]
  7. Gravier, S.; Moncel, J.; Semri, A. Identifying codes of cycles. Eur. J. Comb. 2006, 27, 767–776. [Google Scholar]
  8. Moncel, J. Monotonicity of the minimum cardinality of an identifying code in the hypercube. Discret. Appl. Math. 2006, 154, 898–899. [Google Scholar]
  9. Karpovsky, M.; Chakrabarty, K.; Levitin, L.; Avreky, D. On the covering of vertices for fault diagnosis in hypercubes. Inform. Process. Lett. 1999, 69, 99–103. [Google Scholar]
  10. Ben-Haim, Y.; Litsyn, S. Exact minimum density of codes identifying vertices in the square grid. SIAM J. Discret. Math. 2005, 19, 69–82. [Google Scholar]
  11. Cohen, G.; Maffray, F.; Manoussakis, Y.; Slater, P. Watching Systems, Identifying, Locating-Dominating and Discriminatiing Codes in Graphs. Available online: https://dragazo.github.io/bibdom/main.pdf (accessed on 9 August 2023).
  12. Laifenfeld, A.; Trachtenberg, R.C.; Starobinski, D. Joint monitoring and routing in wireless sensor networks using robust identifying codes. Mob. Netw. Appl. 2009, 14, 415–432. [Google Scholar]
  13. Haynes, T.; Knisley, D.; Seier, E.; Zou, D. A quantitative analysis of secondary RNA structure using domination based parameters on trees. BMC Bioinform. 2006, 7, 108. [Google Scholar]
  14. Laifenfeld, M.; Trachtenberg, A. Identifying codes and covering problems. IEEE Trans. Inf. Theory 2008, 54, 3929–3950. [Google Scholar]
  15. Suomela, J. Approximability of identifying codes and locating–dominating codes. Inf. Process. Lett. 2007, 103, 28–33. [Google Scholar]
  16. Xiao, Y.; Hadjicostis, C.; Thulasiraman, K. The d-identifying codes problem for vertex identification in graphs: Probabilistic analysis and an approximation algorithm. In Proceedings of the International Computing and Combinatorics Conference, Taipei, Taiwan, 15–18 August 2006. [Google Scholar]
  17. Qi, H.; Iyengar, S.S.; Chakrabarty, K. Distributed sensor networks—A review of recent research. J. Frankl. Inst. 2001, 338, 655–668. [Google Scholar]
  18. Xu, Y.C.; Xiao, R.B. Solving the Identifying Code Problem by a Genetic Algorithm. IEEE Trans. Syst. Man Cybern.-Part A Syst. Hum. 2007, 37, 41–46. [Google Scholar]
  19. Horan, V.; Adachi, S.; Bak, S. A comparison of approaches for finding minimum identifying codes on graphs. Quantum Inf. Process. 2016, 15, 1827–1848. [Google Scholar]
  20. Charon, I.; Hudry, O. Minimizing the size of an identifying or locating-dominating code in a graph is NP-hard. Theor. Comput. Sci. 2003, 290, 2109–2120. [Google Scholar]
  21. Xu, Y.C.; Xiao, R.B. Identifying code for directed graph. In Proceedings of the Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007), Qingdao, China, 30 July–1 August 2007. [Google Scholar]
  22. Weinand, J.M.; Sörensen, K.; San Segundo, P.; Kleinebrahm, M.; McKenna, R. Research trends in combinatorial optimization. Int. Trans. Oper. Res. 2022, 29, 667–705. [Google Scholar]
  23. Rahimi, I.; Gandomi, A.H.; Chen, F.; Mezura-Montes, E. A Review on Constraint Handling Techniques for Population-based Algorithms: From single-objective to multi-objective optimization. Arch. Comput. Methods Eng. 2023, 30, 2181–2209. [Google Scholar]
  24. Back, T.; Schutz, M.; Khuri, S. A comparative study of a penalty function, a repair heuristic, and stochastic operators with the set-covering problem. In Proceedings of the European Conference on Artificial Evolution, Brest, France, 4–6 September 1995; pp. 320–332. [Google Scholar]
  25. Bilal, N.; Galinier, P.; Guibault, F. A new formulation of the set covering problem for metaheuristic approaches. Int. Sch. Res. Not. 2013, 2013, 203032. [Google Scholar]
  26. Goldberg, D.E.; Deb, K. A comparative analysis of selection schemes used in genetic algorithms. In Foundations of Genetic Algorithm; Whitley, D., Ed.; Elsevier: Amsterdam, The Netherlands, 1991; pp. 69–93. [Google Scholar]
  27. Blum, C.; Roli, A. Metaheuristics in combinatorial optimization: Overview and conceptual comparison. ACM Comput. Surv. 2003, 35, 268–308. [Google Scholar]
  28. Wang, Y.; Pan, S.; Al-Shihabi, S.; Zhou, J.; Yang, N.; Yin, M. An improved configuration checking-based algorithm for the unicost set covering problem. Eur. J. Oper. Res. 2021, 294, 476–491. [Google Scholar]
  29. Cai, S.; Su, K.; Sattar, A. Local search with edge weighting and configuration checking heuristics for minimum vertex cover. Artif. Intell. 2011, 175, 1672–1696. [Google Scholar]
  30. Fu, Y.; Lei, Z.; Cai, S.; Lin, J.; Wang, H. WCA: A weighting local search for constrained combinatorial test optimization. Inf. Softw. Technol. 2020, 122, 1106288. [Google Scholar]
  31. Li, R.; Hu, S.; Zhang, H.; Yin, M. An efficient local search framework for the minimum weighted vertex cover problem. Inf. Sci. 2016, 372, 428–445. [Google Scholar]
  32. Li, R.; Hu, S.; Cai, S.; Gao, J.; Wang, Y.; Yin, M. NuMWVC: A novel local search for minimum weighted vertex cover problem. J. Oper. Res. Soc. 2019, 71, 1498–1509. [Google Scholar]
  33. Zhou, Y.; Li, J.; Liu, Y.; Lv, S.; Lai, Y.; Wang, J. Improved Memetic Algorithm for Solving the Minimum Weight Vertex Independent Dominating Set. Mathematics 2020, 8, 1155. [Google Scholar]
  34. Cai, S.W.; Su, K.L. Local search for Boolean Satisfiability with configuration checking and subscore. Artif. Intell. 2013, 204, 75–98. [Google Scholar]
  35. Cai, S.; Su, K. Configuration checking with aspiration in local search for sat. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, Toronto, ON, Canada, 22–26 July 2012. [Google Scholar]
  36. Luo, C.; Cai, S.; Su, K.; Wu, W. Clause states based configuration checking in local search for satisfiability. IEEE Trans. Cybern. 2015, 45, 1014–1027. [Google Scholar]
  37. Luo, C.; Cai, S.; Su, K.; Wu, W. Ccls: An efficient local search algorithm for weighted maximum satisfiability. IEEE Trans. Comput. 2015, 64, 1830–1843. [Google Scholar]
  38. Wang, Y.Y.; Cai, S.W.; Yin, M.H. Two efficient local search algorithms for maximum weight clique problem. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016. [Google Scholar]
  39. Church, R.L.; Wang, S. Solving the p-median problem on regular and lattice networks. Comput. Oper. Res. 2020, 123, 105057. [Google Scholar]
  40. Laing, A.K.; Krumme, D.W. Optimal permutation routing for low-dimensional hypercubes. Netw. Int. J. 2010, 55, 149–167. [Google Scholar]
  41. Beasley, J.E. An algorithm for set covering problem. Eur. J. Oper. Res. 1987, 31, 85–93. [Google Scholar] [CrossRef]
  42. Birattari, M. Tuning Metaheuristics: A Machine Learning Perspective; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
  43. Crawford, B.; Soto, R.; Cuesta, R.; Paredes, F.T. Application of the artificial bee colony algorithm for solving the set covering problem. Sci. World J. 2014, 2014, 189164. [Google Scholar]
  44. Crawford, B.; Soto, R.; Suárez, M.O.; Paredes, F.; Johnson, F. Binary firefly algorithm for the set covering problem. In Proceedings of the 2014 9th Iberian Conference on Information Systems and Technologies (CISTI), Barcelona, Spain, 18–21 June 2014. [Google Scholar]
  45. Crawford, B.; Soto, R.; Berríos, N.; Johnson, F.; Paredes, F. Binary cat swarm optimization for the set covering problem. In Proceedings of the 2015 10th Iberian Conference on Information Systems and Technologies (CISTI), Aveiro, Portugal, 17–20 June 2015; pp. 1–4. [Google Scholar]
Figure 1. An example of a separating code on the graph. The black vertices are the members of the separating code.
Figure 1. An example of a separating code on the graph. The black vertices are the members of the separating code.
Mathematics 11 04361 g001
Figure 2. An example of a dominating set on the graph. The black vertices are the members of the dominating set.
Figure 2. An example of a dominating set on the graph. The black vertices are the members of the dominating set.
Mathematics 11 04361 g002
Figure 3. An example of an identifying code on the graph. The black vertices are the codewords.
Figure 3. An example of an identifying code on the graph. The black vertices are the codewords.
Mathematics 11 04361 g003
Figure 4. Flowchart of the proposed algorithm.
Figure 4. Flowchart of the proposed algorithm.
Mathematics 11 04361 g004
Figure 5. Solution encoding scheme.
Figure 5. Solution encoding scheme.
Mathematics 11 04361 g005
Figure 6. Identifying codes for different types of graphs: (a) 3 × 3 hexagonal grid. (b) 3 hypercube. (c) 5 × 5 square grid. (d) 5 × 5 triangular grid. The black vertices are the codewords.
Figure 6. Identifying codes for different types of graphs: (a) 3 × 3 hexagonal grid. (b) 3 hypercube. (c) 5 × 5 square grid. (d) 5 × 5 triangular grid. The black vertices are the codewords.
Mathematics 11 04361 g006aMathematics 11 04361 g006b
Table 1. Test instances information.
Table 1. Test instances information.
GraphDimensionNameVerticesConstraints
TotalRemovedCoreTotalRemovedCore
Hexagonal5 × 5H17007024852202283
10 × 10H2240024028,92927,7871133
15 × 15H35100510130,305127,7722533
20 × 20H48800880387,640383,1574483
Hypercube5Q1320325282080272
6Q26406420801344736
7Q31280128825663361920
8Q4256025632,89628,0324864
Square5 × 5S125025325206119
10 × 10S2100010050504456594
15 × 15S3225022525,42524,0061419
20 × 20S4400040080,20077,6062594
25 × 25S56250625195,625191,5064119
Triangular5 × 5T12102123117358
10 × 10T26606622111952259
15 × 15T3136013693168665651
20 × 20T4231023126,79625,5701226
25 × 25T5351035161,77659,8001976
Table 2. Parameters of the proposed algorithm.
Table 2. Parameters of the proposed algorithm.
ParameterDescriptionValue
NPopulation size100
m a x _ i t e r Local search iterations150
Table 3. Computational experiments on hexagonal lattice test instances.
Table 3. Computational experiments on hexagonal lattice test instances.
   InstanceH1H2H3H4
Best32110236416
Our approach
PB-LSMin32110236416
Avg32110.4237.5418.6
RPD0000
Other approaches
GAMin32115247433
Avg32.7115.8248.1433.7
RPD04.54.64
ABCMin32112243425
Avg32112.5243.8426.6
RPD01.82.92.16
FFAMin32116249434
Avg32.7116.2249.6434
RPD05.45.54.32
CSOMin32111248433
Avg32.2113.8248.9435.5
RPD00.95.14.01
LingoMin32114251432
Avg----
RPD03.66.43.8
Table 4. Computational experiments on hypercube test instances.
Table 4. Computational experiments on hypercube test instances.
   InstanceQ1Q2Q3Q4
Best10193463
Our approach
PB-LSMin10193463
Avg10193463.5
RPD0000
Other approaches
GAMin10193567
Avg101936.167.8
RPD002.96.3
ABCMin10193566
Avg10193566.4
RPD002.94.7
FFAMin10193566
Avg10193566.5
RPD002.94.7
CSOMin10193567
Avg101935.868.3
RPD002.96.3
LingoMin10193565
Avg----
RPD002.93.2
Table 5. Computational experiments on square lattice test instances.
Table 5. Computational experiments on square lattice test instances.
   InstanceS1S2S3S4S5
Best103987158246
Our approach
PB-LSMin103987158246
Avg1039.688.4158.5247.8
RPD00000
Other approaches
GAMin104090164260
Avg1040.792.5166260.7
RPD02.63.43.85.7
ABCMin104089160255
Avg104089.8161.7256.3
RPD02.52.21.23.6
FFAMin104092166258
Avg1040.992.7166.5259.8
RPD02.55.75.14.8
CSOMin104090164259
Avg1040.690.7164.4260.2
RPD02.53.43.75.2
LingoMin104090160250
Avg-----
RPD02.53.41.21.6
Table 6. Computational experiments on triangular lattice test instances.
Table 6. Computational experiments on triangular lattice test instances.
   InstanceT1T2T3T4T5
Best9234676116
Our approach
PB-LSMin9234676116
Avg92346.778.1116.3
RPD00000
Other approaches
GAMin9234882124
Avg923.848.982.4126.3
RPD004.36.56.9
ABCMin9234778121
Avg92347.278.8121.6
RPD002.22.64.3
FFAMin9234984127
Avg923.74984127
RPD006.510.59.4
CSOMin9234778120
Avg924.248.280.7123.2
RPD002.22.63.4
LingoMin9234780120
Avg-----
RPD002.25.33.4
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lara-Caballero, A.; González-Moreno, D. A Population-Based Local Search Algorithm for the Identifying Code Problem. Mathematics 2023, 11, 4361. https://doi.org/10.3390/math11204361

AMA Style

Lara-Caballero A, González-Moreno D. A Population-Based Local Search Algorithm for the Identifying Code Problem. Mathematics. 2023; 11(20):4361. https://doi.org/10.3390/math11204361

Chicago/Turabian Style

Lara-Caballero, Alejandro, and Diego González-Moreno. 2023. "A Population-Based Local Search Algorithm for the Identifying Code Problem" Mathematics 11, no. 20: 4361. https://doi.org/10.3390/math11204361

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop