Article

Dynamic Multiagent Incentive Contracts: Existence, Uniqueness, and Implementation

1 Department of Industrial Engineering, Clemson University, Clemson, SC 29634, USA
2 Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI 48109, USA
* Author to whom correspondence should be addressed.
Current address: 277B Freeman Hall, Clemson, SC 29634, USA.
Current address: 2883 IOE Building, 1205 Beal Avenue, Ann Arbor, MI 48109-2117, USA.
Mathematics 2021, 9(1), 19; https://doi.org/10.3390/math9010019
Submission received: 5 May 2020 / Revised: 19 December 2020 / Accepted: 20 December 2020 / Published: 23 December 2020

Abstract

Multiagent incentive contracts are advanced techniques for solving decentralized decision-making problems with asymmetric information. The principal designs contracts that incentivize non-cooperating agents to act in his or her interest. Because of the asymmetric information, the principal must balance the loss of efficiency against the security of retaining the agents. We prove conditions under which optimal contracts exist and, with a mild strengthening, are unique, which makes the problem computationally tractable. The coupled principal-agent problems are converted to solving a Hamilton–Jacobi–Bellman equation with equilibrium constraints. Extending the incentive contract to a multiagent setting with history-dependent terminal conditions opens the door to new applications in corporate finance, institutional design, and operations research.

1. Introduction

In this paper, we consider the problem of a single party, called the principal, creating contracts to delegate a task to a group of different agents. Incentive contracts stimulate the agents to act in the principal's interest by compensating them so that two goals are achieved: (i) each agent accepts the offered contract (i.e., the contract satisfies the individually rational (IR) constraint); and (ii) each agent exerts effort at the desired level determined by the compensation spelled out in the contract (i.e., the contract satisfies the incentive-compatible (IC) constraint). Such incentive contracts have been used for many practical problems, ranging from corporate finance to strategic behavior in politics to institutional design [1,2,3,4,5,6,7,8,9,10].
In a dynamic setting, the goal as before is to incentivize agents to exert the desired effort over the planning horizon. To achieve this, each contract defines a stream of payoff amounts, which depend on the effort exerted by the corresponding agent. In the framework we consider in this paper, the agent's effort process is not perfectly observable, possibly due to the cost or the difficulty of monitoring it. Instead, the principal observes a noisy output process, which results from the effort exerted by the agent. This proxy creates information asymmetry about the agent's effort (the agent knows it, but the principal can only infer it from the proxy). The asymmetric information can create a potential moral hazard problem in the contract design [11], and system efficiency degrades because the first-best contract is not admissible. Given all these considerations, the incentive contract must resolve the moral hazard problem and maximize the principal's utility.
This fundamental incentive contract problem in the case of a single agent has been explored in many settings [7,12,13,14,15]. We consider the problem as a special case of stochastic Stackelberg differential games played between a principal and an agent. The principal expects the agent to exert a targeted level of effort and knows ex ante that once the agent has accepted the contract, it has no incentive to deviate from this target level (thus bypassing any resulting moral hazard). This incentive-compatible condition can be satisfied if the agents' actions form a subgame perfect Nash equilibrium. However, finding such a globally optimal contract over the planning horizon is not trivial. The dynamic moral hazard problem has been studied in a discrete-time setting, where the state space explodes exponentially in the size of the planning horizon (the curse of dimensionality in dynamic programming) [16]. Holmstrom and Milgrom [17] proposed a continuous-time model. In this setting, the agent's output process is represented by a stochastic differential equation (SDE) whose drift term is controlled by the agent's effort. As a result, the continuous-time incentive contract problem is a limit of discrete-time dynamic games whose number of stages becomes unbounded in any finite interval. Extensions of their work include Schättler and Sung [18], Sung [19], and Müller [20]. In recent years, following the groundbreaking work of Sannikov [21], there has been a resurgence of interest in dynamic contract theory. The main contribution of [21] was to parameterize the incentive-compatible constraint at each epoch using the Martingale representation theorem. As a consequence, we can decouple the principal's and the agent's problems by representing the agent's effort as a function of a parameter. The principal's problem can then be solved by dynamic programming (more specifically, a Hamilton–Jacobi–Bellman equation) for the incentive contract [3,15,21,22].
A significant extension of the single-agent incentive contract is multiagent incentive contracts. For example, a company hires multiple employees to collaborate on a project. Since employees with correlated responses may have different capabilities and utility functions, designing contracts separately for each is not viable. Koo et al. [23] presented the first multiagent extension of incentive contracts, initiating a stream of literature on team incentives using the Martingale approach [24,25,26,27]. In the multiagent setting, new challenges arise from the varied interactions between agents. For example, an arbitrary agent may compare both its effort and payoff with others'; this phenomenon is called inequity aversion [28]. Goukasian and Wan [29] showed that inequity aversion is present in multiagent incentive contracts and that agents' comparisons lower the effort levels they exert.
The critical condition for the existence of effective multiagent incentive contracts is that the agents' actions at each epoch must form a Nash equilibrium. This equilibrium then incentivizes each agent to choose the principal's desired actions and nullifies the moral hazard in the contract. The conditions for the existence of this equilibrium are still an open question. Prior work [27] assumed, without verification, that the existence conditions were satisfied in its setting; taking the agents' optimal actions to constitute a Nash equilibrium then rests on a circular argument. Yet, characterizing the existence of a Nash equilibrium in multiagent contracts is non-trivial [30,31,32], more so in the dynamic setting considered in this work. The following example demonstrates the importance of investigating the existence conditions in a static matrix game setting. A principal chooses to compensate two agents with payoff $c_i$, either low (L) or high (H), i.e., $c_i \in \{L, H\}$ for $i \in \{1, 2\}$. The agents put effort into a project and generate outputs $X_i \in \{A, B\}$ at level A or B.
  • The principal desires to stimulate Agent 1 to deliver output A and Agent 2 to deliver output B. The outcomes of signing contracts are represented by the matrices in Table 1, where each entry lists the principal's and the agent's utility received from the contract. If these two contracts are signed separately, the unique equilibria are $\{L, A\}$ with Agent 1 and $\{L, B\}$ with Agent 2.
  • We now assume that the two agents' outputs are aggregated in a linearly additive way. In this case, the principal's dominant policy is $[c_1, c_2] = [L, L]$. Notice that the existence and the number of equilibria may vary with the agents' utility functions $u_i(c_i, X_i, X_{-i})$. Three possible outcomes for the contracts are listed below (a short script after this list illustrates how such pure-strategy equilibria can be enumerated):
    • Unique Nash equilibrium: Assume that the utility of each agent depends only on its payoff, i.e., $u_i(c_i, X_i, X_{-i}) = c_i$. The agents' best responses are $[X_1, X_2] = [A, B]$. With a fixed $[c_1, c_2] = [L, L]$, their utility follows Table 2.
    • Multiple Nash equilibria: Assuming that the principal rewards whoever delivers B with an additional unit of compensation, there exist two Nash equilibria, $[X_1, X_2] = [A, B]$ and $[X_1, X_2] = [B, B]$, for which their utility follows Table 3.
    • No Nash equilibrium: Assuming that the utility of each agent is affected by the other's action such that the principal rewards the agents when their outputs match, i.e., $u_i(c_i, X_i, X_{-i}) = c_i + 2$ if $X_i = X_{-i}$, then there is no Nash equilibrium, as seen in Table 4.
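As a quick illustration of the first case, the following script enumerates the pure-strategy Nash equilibria of a two-agent bimatrix game. The payoff matrices are those of Table 2, reading each cell as $(u_1, u_2)$ with Agent 1 as the row player — an assumption on the table layout that is consistent with the unique equilibrium $[A, B]$ reported above; the helper function is our own sketch, not part of the paper's methodology.

```python
import itertools
import numpy as np

def pure_nash_equilibria(U1, U2, actions=("A", "B")):
    """Enumerate pure-strategy Nash equilibria of a bimatrix game.
    U1[i, j], U2[i, j] are the utilities of Agents 1 and 2 when Agent 1
    plays actions[i] and Agent 2 plays actions[j]."""
    eqs = []
    for i, j in itertools.product(range(len(actions)), repeat=2):
        best_for_1 = U1[i, j] >= U1[:, j].max()   # no profitable deviation for Agent 1
        best_for_2 = U2[i, j] >= U2[i, :].max()   # no profitable deviation for Agent 2
        if best_for_1 and best_for_2:
            eqs.append((actions[i], actions[j]))
    return eqs

# Utilities from Table 2 (the unique-equilibrium case):
U1 = np.array([[2, 2], [1, 1]])
U2 = np.array([[1, 2], [1, 2]])
print(pure_nash_equilibria(U1, U2))   # -> [('A', 'B')]
```

Running the same check on the payoffs of Table 3 reports the two equilibria listed above.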
Our goal in this paper is to find conditions that guarantee the existence of a unique multiagent Nash equilibrium in incentive contracts. Even when the existence problem is settled [27], the uniqueness of the multiagent Nash equilibrium must still be tackled. Characterizing a unique equilibrium has practical value, since coordinating agents to select the optimal Nash equilibrium is improbable; it also has theoretical value, since computing optimal contracts over a set of equilibria is intractable. Using a fixed-point theorem (specifically, the Kakutani fixed-point theorem), we prove the existence of a subgame perfect Nash equilibrium. The existence conditions include the assumption that all agents are risk-averse and that the effect of each agent's actions on the others' outputs follows a concave function. With a slight strengthening of the condition on the Hessian matrix of the interaction functions, and using the theorems of Gale and Nikaido [33] and Kojima and Saigal [34], we prove that the equilibrium is unique. These results then enable us to develop a provably convergent iterative procedure to solve for the incentive contracts.
Unlike the infinite horizon setting of [21], we consider the problem with a finite horizon where the terminal condition may be path-dependent. Such terminal conditions are widely used in modeling options, mortgage defaults, and car leasing, thus enhancing the applicability of the methodology.
The general notation used in the rest of this paper is as follows. $[n] = \{1, 2, \ldots, n\}$ is a set of indices. Bold variables are vectors or matrices of random variables or functions. In the equilibrium analysis for the $i$th agent, we denote a vector as $\mathbf{x} = [x_1, \ldots, x_i, \ldots, x_n] = [x_i, \mathbf{x}_{-i}]$, where $x_i$ indicates the variable associated with the $i$th agent and $\mathbf{x}_{-i}$ the variables associated with the other agents. $x_P$ indicates that the variable is associated with the principal. $\tilde{x}$ is a variable that deviates from $x$ in the domain of $x$. $D_x F$ is the Jacobian, and $D_x^2 F$ is the Hessian, of the $C^2$ function $F$ of $x$.
The remainder of this paper is organized as follows. In Section 2, we describe the setting of multiagent incentive contracts. In Section 3, we characterize the agents' optimal responses and prove the existence of a unique Nash equilibrium. In Section 4, we formulate the principal's problem as a Hamilton–Jacobi–Bellman equation and give an iterative procedure to implement the optimal incentive contracts. In Section 5, we conclude.

2. Setting

There is a single principal and $n$ agents (indexed by $i \in [n]$) entering the contracts simultaneously at epoch $t = 0$. A contract signed between the principal and each agent $i$ specifies the payoff $c_i(t)$ that the agent will receive for producing the output $X_i(t)$, a proxy for the agent's action $a_i(t)$ in working for the principal over the horizon $t \in [0, T]$. The vectors of the $n$ agents' actions and compensations are denoted as $\mathbf{a}(t)$ and $\mathbf{c}(t)$, respectively. Since the principal's goal is to incentivize the $n$ agents to collaborate on one project, these $n$ contracts are correlated in many ways. The principal's decision, the payoff $c_i(t)$ for agent $i$, lies in a domain $\mathcal{C}_i \subseteq \mathbb{R}$; agent $i$'s decision, the effort level $a_i(t)$, lies in a domain $\mathcal{A}_i \subseteq \mathbb{R}$. The size of these domains may vary with $i \in [n]$. The Cartesian products of the compensation and effort domains are denoted as $\mathcal{C}$ and $\mathcal{A}$, respectively.

2.1. Output Processes and Terminal Conditions

In an environment of uncertainty, the principal can only observe the output processes $\mathbf{X}(t) = [X_1(t), \ldots, X_n(t)]^T \in \mathcal{X}$, which are imperfect observations of the agents' actions. We assume that the dynamics of $X_i(t)$ follow an SDE that depends on all $n$ agents' actions $\mathbf{a}(t)$:
$$ dX_i(t) = f_i(\mathbf{a}(t))\,dt + \sigma_i\,dB_i(t), \qquad i \in [n], \tag{1} $$
which satisfies the following assumptions, a general extension of the multiagent contracts in [23,24,27,35]:
  • The drift term $f_i : \mathcal{A} \to \mathbb{R}_+$ in (1) is in an $L^2$ space, i.e., $\int_0^T f_i^2\,ds < \infty$ for all $i \in [n]$.
  • $f_i$ is partially differentiable almost everywhere with respect to $a_i(t)$ for all $i \in [n]$.
  • The diffusion term $\sigma_i$ is a known constant for all $i \in [n]$.
  • The Brownian motions $\mathbf{B}(t) = [B_1(t), \ldots, B_n(t)]^T$ are correlated with correlation matrix $\mathbb{E}(\mathbf{B}(t)\mathbf{B}(t)^T) = \Sigma$, which is strongly positive definite, i.e., $\mathbf{x}^T \Sigma \mathbf{x} \ge \alpha \|\mathbf{x}\|^2$ for all $\mathbf{x} \in \mathbb{R}^n$ and some constant $\alpha > 0$.
For each agent $i \in [n]$, there is a path-dependent terminal payoff $\Phi_i$ at the end of the planning horizon $T < \infty$. In other words, $\mathbf{\Phi}$ is a vector of functions of $\{\mathbf{X}(t), \mathbf{c}(t)\}_{0 \le t \le T}$. Path-dependent terminal conditions strengthen the commitments in contracts. Each agent could be charged a penalty if its cumulative outputs do not reach a specified target at termination. Similarly, the principal may rectify the payoff if the cumulative compensations do not reach a certain threshold. Let $\mathbf{Z}(t)$ denote the cumulative measures along the sample paths, whose dynamics $dZ_i$ for $i \in [n]$ follow:
$$ dZ_i(t) = \mu_{Z_i}(\mathbf{X}(t), \mathbf{c}(t))\,dt + \sigma_{Z_i}\,dB_{Z_i}(t), \tag{2} $$
where $\mu_{Z_i}, \sigma_{Z_i}$ are deterministic functions of appropriate dimension and the $B_{Z_i}$ are independent Brownian motions. The two sets of processes $\mathbf{B}(t)$ and $\mathbf{B}_Z(t) = (B_{Z_1}(t), \ldots, B_{Z_n}(t))^T$ are also independent of each other.
An example of a path-dependent terminal condition is of the Asian-option type, i.e., $\mathbf{Z}(t) \in \mathbb{R}^n$ represents the total observed output of the $n$ agents from zero to $t$:
$$ \mathbf{Z}(t) = \int_0^t \mathbf{X}(s)\,ds, \tag{3} $$
and this can be derived from (2) by letting $\mu_{Z_i}(\mathbf{X}, \mathbf{c}) = X_i$ and $\sigma_{Z_i} = 0$.
The two systems of SDEs, (1) and (2), are adapted to the filtration generated by the Brownian motions $B_i$ and $B_{Z_i}$ for all $i \in [n]$. It is a well-known result that the vector $(\mathbf{X}(t), \mathbf{Z}(t))$ is a Markov process.
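To make the setting concrete, the following minimal sketch simulates the output processes (1) together with the Asian-type cumulative processes (3) by the Euler–Maruyama scheme; the drift $f_i$, the effort path, and all numerical values are illustrative stand-ins, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, dt = 3, 1.0, 1e-3
steps = int(T / dt)
Sigma = 0.5 * np.eye(n) + 0.5 * np.ones((n, n))  # strongly positive definite correlation
L = np.linalg.cholesky(Sigma)                    # generates correlated Brownian increments
sigma = np.full(n, 0.2)                          # known constant diffusion terms

def f(a):
    # Illustrative drift: increasing and concave in each agent's own effort.
    return np.sqrt(a)

a = np.full(n, 1.0)        # constant effort path, for illustration only
X = np.zeros(n)            # output processes X_i(t)
Z = np.zeros(n)            # Z(t) = integral of X(s) ds (mu_Z = X, sigma_Z = 0)
for _ in range(steps):
    dB = L @ rng.standard_normal(n) * np.sqrt(dt)  # E[dB dB^T] = Sigma dt
    X += f(a) * dt + sigma * dB                    # Euler-Maruyama step of (1)
    Z += X * dt                                    # accumulate (3)
print("X(T) =", X, "Z(T) =", Z)
```

The pair $(X, Z)$ produced this way is exactly the Markov state on which the terminal payoff $\Phi_i(Z_i(T))$ is evaluated.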

2.2. Solving Optimal Contracts

$u_i : \mathcal{A} \times \mathcal{C}_i \to \mathbb{R}$ is the $i$th agent's instantaneous utility, i.e., the utility in $[t, t + dt)$, and $u_P : \mathcal{X} \times \mathcal{C} \to \mathbb{R}$ is the principal's instantaneous utility. Note that $u_i$ is possibly a function of all agents' actions.
The principal's and the agents' goals are to maximize their respective expected total discounted utilities over the finite horizon $[0, T]$. We denote the $i$th agent's expected total discounted utility by $U_i$ and the principal's expected total discounted utility from contracting with the $n$ agents by $U_P$, as follows:
$$ U_i = \mathbb{E}^{\mathbf{a}}\left[ r_i \int_0^T e^{-r_i s} u_i(\mathbf{a}(s), c_i(s))\,ds + r_i e^{-r_i T} \Phi_i(Z_i(T)) \right], \quad i \in [n], $$
$$ U_P = \mathbb{E}^{\mathbf{a}}\left[ r_P \int_0^T e^{-r_P s} u_P(\mathbf{X}(s), \mathbf{c}(s))\,ds - r_P\, \mathbf{1} \cdot e^{-r_P T} \mathbf{\Phi}(\mathbf{Z}(T)) \right], $$
where $r_i \in (0, 1)$ and $r_P \in (0, 1)$ are the discount rates of the $i$th agent and the principal, respectively. The discount rates in front of the integrals normalize the utility to annuity costs [16]. In the case that the principal is risk-neutral, i.e., $u_P$ is a linear function of $\mathbf{X}$, we can reduce the principal's problem using the following observation. Taking expectations of the increment of the $i$th agent's output process, $\mathbb{E}[dX_i(t)] = \mathbb{E}[f_i(\mathbf{a}(t))\,dt] + \mathbb{E}[\sigma_i\,dB_i(t)] = \mathbb{E}[f_i(\mathbf{a}(t))\,dt]$, using the fact that the expectation of an Ito integral is zero. Thus, in this special case, we can write $U_P$ in terms of $\mathbf{a}$ only [24].
Optimal multiagent contracts should maximize the principal's expected total discounted utility $U_P$ subject to (a) $n$ individual-rational (IR) constraints at $t = 0$ and (b) $n$ incentive-compatible (IC) constraints at every $t \in [0, T]$. The IR constraints guarantee that the agents agree to enter the contracts if their expected utilities exceed certain thresholds; the IC constraints guarantee that the agents realize the target efforts at each epoch of the horizon. In the presence of interactions between agents, we have the additional constraint that the $n$ agents' best responses constitute a Nash equilibrium at each $t \in [0, T]$. In summary, optimal multiagent contracts can be solved for as follows:
$$ \begin{aligned} \max_{\mathbf{c}(t),\, t \in [0,T]} \quad & U_P \\ \text{s.t.} \quad & U_i \ge \underline{W}_i, \quad i \in [n] \quad \text{(individual-rational constraint)}, \\ & a_i(t) \in \arg\max_{a_i} U_i, \quad i \in [n],\ t \in [0,T] \quad \text{(incentive-compatible constraint)}. \end{aligned} \tag{4} $$

3. Incentive-Compatible Constraints

In this section, we characterize an individual agent’s optimum action within given multiagent contracts.

3.1. Parametrization of the Individual Agent’s Problem

We analyze an arbitrary $i$th agent's optimal action given the other agents' optimal actions. Without loss of generality, we reformulate the analysis of the prior work [3,21] under the new multiagent contract setting.
In dynamic Stackelberg games, one commonly defines the continuation value $W_i(t)$ (the value function in dynamic programming) when the optimal actions $\mathbf{a}$ are taken by all agents in $[t, T]$, i.e., agent $i$'s conditional expected optimal discounted utility received from $t$ to $T$:
$$ W_i(t) = \mathbb{E}^{\mathbf{a}}\left[ \int_t^T r_i e^{-r_i (s-t)} u_i(\mathbf{a}(s), c_i(s))\,ds + r_i e^{-r_i (T-t)} \Phi_i(Z_i(T)) \,\Big|\, \mathcal{F}_t^{B, B_Z} \right], \tag{5} $$
where $\mathcal{F}_t^{B, B_Z}$ is the filtration generated by the Brownian motions $\mathbf{B}$ and $\mathbf{B}_Z$.
We now describe the dynamics of W i ( t ) for a single agent with a path-dependent terminal condition as follows:
Proposition 1.
There exists an $\mathcal{F}_t^{B, B_Z}$-adapted process $\mathbf{Y}_i(t) = (Y_{i1}(t), Y_{i2}(t))^T$ such that the continuation value $W_i(t)$ of the $i$th agent is represented by the process:
$$ dW_i(t) = r_i \big( W_i(t) - u_i(\mathbf{a}(t), c_i(t)) \big)\,dt + r_i Y_{i1}(t) \sigma_i\,dB_i(t) + r_i Y_{i2}(t) \sigma_{Z_i}\,dB_{Z_i}(t). $$
Conversely, a process $W_i(t)$ satisfying this SDE is the $i$th agent's continuation value.
Proof. 
Given fixed and optimal $n$-agent efforts $\{\mathbf{a}(t) : t \ge 0\}$ and the filtration $\mathcal{F}_t = \mathcal{F}_t^{B, B_Z}$, we have:
$$ U_i(t) = \mathbb{E}^{\mathbf{a}}\left[ \int_0^T r_i e^{-r_i s} u_i(\mathbf{a}(s), c_i(s))\,ds + r_i e^{-r_i T} \Phi_i(Z_i(T)) \,\Big|\, \mathcal{F}_t \right]. \tag{6} $$
$U_i(t)$ is an $\mathcal{F}_t$-Martingale, i.e., for any $s < t$, using (6) and iterated conditional expectations, it is readily seen that $\mathbb{E}^{\mathbf{a}}(U_i(t) \,|\, \mathcal{F}_s) = U_i(s)$. From the Martingale representation theorem [3], we obtain the existence of adapted processes $Y_{i1}(t)$ and $Y_{i2}(t)$ such that:
$$ dU_i(t) = r_i e^{-r_i t} Y_{i1}(t) \sigma_i\,dB_i(t) + r_i e^{-r_i t} Y_{i2}(t) \sigma_{Z_i}\,dB_{Z_i}(t). $$
From (5), it is easily seen that (6) can be rewritten as:
$$ U_i(t) = \int_0^t r_i e^{-r_i s} u_i(\mathbf{a}(s), c_i(s))\,ds + e^{-r_i t} W_i(t), $$
and using Ito’s lemma, we obtain the dynamics:
$$ dU_i(t) = r_i e^{-r_i t} u_i(\mathbf{a}(t), c_i(t))\,dt + e^{-r_i t}\,dW_i(t) - r_i e^{-r_i t} W_i(t)\,dt. $$
Equating the above two dynamics of d U i ( t ) gives the result. □
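For completeness, the equating step can be spelled out from the two displays above: matching the two expressions for $dU_i(t)$ gives
$$ r_i e^{-r_i t} u_i\,dt + e^{-r_i t}\,dW_i - r_i e^{-r_i t} W_i\,dt = r_i e^{-r_i t} Y_{i1} \sigma_i\,dB_i + r_i e^{-r_i t} Y_{i2} \sigma_{Z_i}\,dB_{Z_i}, $$
and multiplying through by $e^{r_i t}$ and rearranging yields the SDE of Proposition 1:
$$ dW_i = r_i (W_i - u_i)\,dt + r_i Y_{i1} \sigma_i\,dB_i + r_i Y_{i2} \sigma_{Z_i}\,dB_{Z_i}. $$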
The expansion of the state space (when compared to [21]) is needed to accommodate the path-dependent terminal condition, requiring the vector $(\mathbf{X}(t), \mathbf{Z}(t))^T$ to be part of the state space. Dynamic contracts between the principal and the $i$th agent must specify: (a) the instantaneous compensations $c_i(t)$; and (b) the two processes $Y_{i1}(t)$ and $Y_{i2}(t)$ as the sensitivities of the agent's continuation value $W_i(t)$ to the output $X_i(t)$ and the terminal process $Z_i(t)$, respectively.
Given a contract $\{c_i(t), \mathbf{Y}_i(t)\}_{t \in [0, T]}$, we use the one-shot deviation principle to derive the necessary condition for the optimality of the effort $\{a_i(t)\}_{0 \le t \le T}$ with given $\{\mathbf{Y}_i(t)\}_{0 \le t \le T}$. This optimality condition is equivalent to the IC constraint in (4). Such an optimality condition holds for an arbitrary $i$th agent's $a_i(t)$ given $\mathbf{a}_{-i}$.
Proposition 2.
For any fixed $\mathbf{a}_{-i}(t)$, the contracted compensation $c_i(t)$ for agent $i$ is implementable if and only if $\{a_i(t)\}$ satisfies:
$$ a_i(t) \in \arg\max_{\tilde{a}_i(t) \in \mathcal{A}_i}\ Y_{i1}(t)\, f_i(\mathbf{a}_{-i}(t), \tilde{a}_i(t)) + u_i(\mathbf{a}_{-i}(t), \tilde{a}_i(t), c_i(t)), \tag{7} $$
for all $t \in [0, T]$.
Proof. 
Let $\mathbf{a}(t)$ be the optimal effort vector, and let the effort of the $i$th agent, for a fixed $t > 0$, be:
$$ \tilde{a}_i(s) = \begin{cases} \tilde{a}_i(s) & \text{if } s < t, \\ a_i(s) & \text{if } s \ge t. \end{cases} $$
We denote $\tilde{\mathbf{a}} = (\mathbf{a}_{-i}, \tilde{a}_i)$. Choosing the actions $\tilde{\mathbf{a}}$ will change the dynamics of $X_i$ and $W_i$. To obtain the new dynamics, we apply Girsanov's theorem with the kernel $\phi(t) = f_i(\tilde{\mathbf{a}}(t)) - f_i(\mathbf{a}(t))$. The new dynamics, adapted to Brownian motions $\tilde{B}_i$ and $\tilde{B}_{Z_i}$ on the space $(\Omega, \mathcal{A}, \tilde{P})$, are given by:
$$ \sigma_i\,dB_i(t) = \sigma_i\,d\tilde{B}_i(t) + \phi(t)\,dt, \qquad \sigma_{Z_i}\,dB_{Z_i}(t) = \sigma_{Z_i}\,d\tilde{B}_{Z_i}(t). $$
Substituting into (1) and Proposition 1 under $\tilde{\mathbf{a}}$, the dynamics of $U_i(t)$ become:
$$ d\tilde{U}_i(t) = r_i e^{-r_i t} \Big[ \big( u_i(\tilde{\mathbf{a}}(t), c_i(t)) - u_i(\mathbf{a}(t), c_i(t)) + Y_{i1}(t) \big( f_i(\tilde{\mathbf{a}}(t)) - f_i(\mathbf{a}(t)) \big) \big)\,dt + Y_{i1}(t) \sigma_i\,d\tilde{B}_i(t) + Y_{i2}(t) \sigma_{Z_i}\,d\tilde{B}_{Z_i}(t) \Big]. $$
Since $a_i$ is optimal, the drift of this SDE must be non-positive. This completes the proof. □
These two propositions decouple the principal's problem from an arbitrary $i$th agent's problem. To specify target efforts that are not observable, the principal can incentivize the agent by recommending a sensitivity level $r_i \mathbf{Y}_i(t)$. With $n$ agents, finding the Nash equilibrium is equivalent to finding the optimal $\mathbf{Y}(t) = [\mathbf{Y}_1(t), \ldots, \mathbf{Y}_n(t)]$ jointly. The principal can create a contract with: (a) compensation functions $\{c_i(\mathbf{W}(t), \mathbf{X}(t), \mathbf{Z}(t))\}_{i \in [n]}$ for each agent $i$; and (b) sensitivity functions $\{r_i \mathbf{Y}_i(t)\}_{i \in [n]}$ that specify the target effort processes. Hence, we create multiagent contracts that provide consistent information to all agents over the planning horizon and are thus implementable.
Characterizing implementable multiagent contracts requires that the agents' actions $\mathbf{a}(t)$ form a multiagent Nash equilibrium at each epoch $t \in [0, T]$. We note that, in our formulation, there are interactions among the $n$ agents both in the instantaneous utility $u_i$ and in the drift term of the output process $f_i$ for all $i \in [n]$. The principal thus chooses a target effort level $\mathbf{a}(t)$ that forms a Nash equilibrium among the agents, so that each agent $i \in [n]$ is disincentivized from deviating from the target $a_i(t)$ when the other agents do not deviate, i.e., the targeted $a_i(t)$ is implemented.

3.2. Multiagent Nash Equilibrium

We now prove the existence of a Nash equilibrium among the $n$ agents' best responses (7) at a fixed epoch $t$. By Bellman's principle of optimality, it is sufficient to show the existence of a Nash equilibrium within the Hamiltonian of the IC constraint to prove the existence of a subgame perfect Nash equilibrium.
We need the following assumptions on the functions $u_i$ and $f_i$ for all $i \in [n]$ (numbered, since they are referenced below):
  1. $u_i : \mathcal{A} \times \mathcal{C}_i \to \mathbb{R}$ is twice continuously differentiable, decreasing in $c_i$, and concave in $a_i$.
  2. $f_i : \mathcal{A} \to \mathbb{R}_+$ is twice continuously differentiable, increasing and concave in $a_i$.
  3. For each $i$ and $\mathbf{a}$, $\partial f_i(\mathbf{a}_{-i}, 0) / \partial a_i \ne 0$, and $f_i(\mathbf{a}) \to \infty$ while $\partial f_i(\mathbf{a}) / \partial a_i \to 0$ as $a_i \to \infty$.
  4. The set $\bigcap_i \{(\mathbf{a}, \mathbf{c}) : u_i(\mathbf{a}, c_i) \ge 0\}$ is nonempty and compact.
  5. There exists an $m > 0$ such that $m < \sup_x u_i(\mathbf{a}_{-i}, x, c_i)$, and $u_i \to -\infty$ as $x \to \infty$, for all $i$ and $\mathbf{a}_{-i}, c_i$.
  6. $u_i(\mathbf{a}_{-i}, 0, c_i) \ge 0$ for each $\mathbf{a}_{-i}, c_i$.
The single-agent contract in [21] and the multiagent contracts in [24] are special cases of the functions above, with $u$ separable in $\mathbf{a}(t)$ and $\mathbf{c}(t)$ and $f(\mathbf{a}(t)) = \mathbf{a}(t)$. Assumption 4 is satisfied because an arbitrary agent can choose the effort $a_i(t) = 0$ to obtain zero utility. Assumption 6 is valid because an agent would not choose an effort $a_i(t) \in \mathcal{A}_i$ with $u_i < 0$. With these assumptions, we can show the following lemmas.
Lemma 1.
Let $\alpha_{-i} = (\mathbf{a}_{-i}, c_i)$, and define:
$$ g_i^{\alpha_{-i}}(x) = -\frac{u_i'(\mathbf{a}_{-i}(t), x, c_i(t))}{f_i'(\mathbf{a}_{-i}(t), x)}, $$
where $u_i'$ and $f_i'$ denote partial derivatives with respect to the $i$th agent's own effort $x$. Then $g_i^{\alpha_{-i}}$ is continuously differentiable and monotonically increasing as a function of $x$ in the domain $\mathcal{A}_i$. Furthermore, there exist $0 \le \beta_i < \gamma_i$ such that for each $\beta_i < y < \gamma_i$ and $\alpha_{-i} \in \mathbb{R}^n$, $g_i^{\alpha_{-i}}(x) = y$ has a solution.
Proof. 
$g_i^{\alpha_{-i}}$ is well defined from Assumption 3 on $f_i$, i.e., $f_i'$ is nonzero, and its monotonicity follows from the concavity of $u_i$ and $f_i$. We define:
$$ \hat{g}_i(x) = \inf_{\alpha \in \mathbb{R}^n} g_i^{\alpha}(x), \qquad \beta_i = \max\Big\{ 0,\ \sup_{\alpha \in \mathbb{R}^n} g_i^{\alpha}(0) \Big\}. $$
Let $\theta_i$ be the $i$th agent's greatest effort, i.e., $\theta_i = \sup \mathcal{A}_i$. Define $\gamma_i = \hat{g}_i(\theta_i)$, with $\theta_i$ sufficiently large so that $[\beta_i, \gamma_i]$ is nonempty. Such a $\gamma_i$ exists because $\hat{g}_i$ is an increasing function (see Figure 1).
For arbitrary $y \in [\beta_i, \gamma_i]$, let $\hat{x}$ be defined by $\hat{g}_i(\hat{x}) = y$. Such an $\hat{x} \in \mathcal{A}_i$ exists because the function $\hat{g}_i$ is monotonically increasing. Now, for any $\alpha_{-i}$, we have $g_i^{\alpha_{-i}}(\hat{x}) \ge y$ and $g_i^{\alpha_{-i}}(0) \le \beta_i$. The result follows from the continuity of $g_i^{\alpha_{-i}}$ and the intermediate value theorem. □
Applying Lemma 1 to all agents $i \in [n]$, we define the set $\mathcal{Y} = \prod_i [\beta_i, \gamma_i]$. We can now rigorously define the multiagent Nash equilibrium as follows.
Definition 1.
The multiple agents’ effort a is called a Nash equilibrium if and only if an arbitrary agent’s deviation from the stipulated effort level in a while the other agents follow their stipulated actions will result in a loss to the agent, i.e., for each i [ n ] ,
a i Γ i ( a i , c i , y i 1 ) = x ^ : x ^ = arg max x y i 1 f i ( a i , x ) + u i ( a i , x , c i ) .
Note that the multiagent equilibrium is independent of $Y_{i2}$. We now prove a simple lemma to characterize the equilibrium:
Lemma 2.
For all $t \in [0, T]$ and each $\mathbf{y}(t) \in \mathcal{Y}$ and $\mathbf{c}(t) \in \mathcal{C}$, if an $\mathbf{a}(t)$ satisfying (8) exists, it lies in the set $\bigcap_i \{(\mathbf{a}, c_i) : u_i(\mathbf{a}, c_i) \ge 0\}$.
Proof. 
For any given contract $\mathbf{y}(t), \mathbf{c}(t)$, let $\mathbf{a}(t)$ be a Nash equilibrium for each $t \in [0, T]$, and suppose $u_i(\mathbf{a}(s), c_i(s)) < 0$ for some $i \in [n]$ and $s \in (t_1, t_2)$. Thus, $\int_{t_1}^{t_2} u_i(a_i(s), \mathbf{a}_{-i}(s), c_i(s))\,ds < 0$. However, from Assumption 6,
$$ \int_{t_1}^{t_2} u_i(\mathbf{a}_{-i}(s), 0, c_i(s))\,ds \ge 0, $$
so agent $i$ would gain by deviating to zero effort. Thus, $\mathbf{a}(t)$ is not a Nash equilibrium for $t \in (t_1, t_2)$, a contradiction. The result follows from the fact that $u_i(a_i(t), \mathbf{a}_{-i}(t), c_i(t))$ is continuous and therefore, if negative anywhere, is strictly negative on a set of positive measure in $[0, T]$. □
The following corollary shows that agents continue to abide by the conditions of the contracts until the termination epoch.
Corollary 1.
A consequence of the implementation of the Nash equilibrium is that no agent has an incentive to leave the contracts before the terminal epoch T.
Proof. 
As is seen in the proof of Lemma 2, when agents’ actions form a multiagent Nash equilibrium, each agent receives a positive utility in any finite interval, thus making each agent’s total utility an increasing function of its continuation value. Therefore, no agent is motivated to deviate from the target action before the termination epoch T. □
The theorem below establishes the existence of such an equilibrium in each given epoch t.
Theorem 1.
For each given $\mathbf{y}(t) \in \mathcal{Y}$ and $\mathbf{c}(t) \in \mathcal{C}$, there exists a subgame perfect Nash equilibrium $\mathbf{a}(t) \in \mathcal{A}$ for every $t \in [0, T]$.
Proof. 
For a fixed agent $i \in [n]$, given the concavity of the objective in (8), a necessary and sufficient condition for $\hat{x}$ to solve the optimization problem is that $g_i^{\mathbf{a}_{-i}(t), c_i(t)}(\hat{x}) = y_{i1}(t)$. We note that, as defined in (8), $\Gamma_i(\mathbf{a}_{-i}(t), c_i(t), \mathbf{y}_i(t)) = \{x : g_i^{\mathbf{a}_{-i}(t), c_i(t)}(x) = y_{i1}(t)\}$.
We now define a point-to-set map,
$$ \Gamma(\mathbf{a}(t)) := \Gamma^{\mathbf{c}(t), \mathbf{y}(t)}(\mathbf{a}(t)) = \big[ \Gamma_1(\mathbf{a}_{-1}(t), c_1(t), \mathbf{y}_1(t)), \ldots, \Gamma_n(\mathbf{a}_{-n}(t), c_n(t), \mathbf{y}_n(t)) \big]. $$
Note that $\Gamma : \mathcal{A} \to \mathscr{A}$, where $\mathscr{A}$ is the set of all compact and convex subsets of $\mathcal{A}$. To see that $\Gamma$ is an upper hemicontinuous point-to-set map, let $\mathbf{a}^k$ be a sequence in $\mathcal{A}$ that converges to $\mathbf{a}^*$, and let $\mathbf{x}^k \in \Gamma(\mathbf{a}^k)$ for each $k$ be such that $\mathbf{x}^k$ converges to $\mathbf{x}^*$. To see that $\mathbf{x}^*$ is in $\Gamma(\mathbf{a}^*)$, we note that $x_i^k$ satisfies $g_i^{\alpha^k(t)}(x_i^k) = y_{i1}(t)$. From the definition of $g_i$ in Lemma 1, it is a continuous function of $\alpha$ and $x$; thus, $y_{i1}(t) = \lim_k g_i^{\alpha^k(t)}(x_i^k) = g_i^{\alpha^*(t)}(x_i^*)$ for each $i$. The existence of the Nash equilibrium now follows from Lemma 2, Assumption 4, and the Kakutani fixed-point theorem [36]. □
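Numerically, a fixed point of $\Gamma$ can often be found by iterating the agents' best responses. The following sketch does this for two agents with hypothetical primitives of the assumed form (quadratic effort cost, concave drift); none of these functional forms or constants come from the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def u(i, a, c):
    # Hypothetical agent utility: pay minus quadratic effort cost,
    # with a weak interaction term in the other agent's effort.
    return c[i] - a[i]**2 - 0.2 * a[i] * a[1 - i]

def f(i, a):
    # Hypothetical drift: increasing and concave in own effort.
    return np.sqrt(1.0 + a[i]) + 0.1 * a[1 - i]

def best_response(i, a, c, y):
    # Agent i solves (8): max over x of y_i1 * f_i(a_-i, x) + u_i(a_-i, x, c_i).
    def neg_obj(x):
        ax = a.copy(); ax[i] = x
        return -(y[i] * f(i, ax) + u(i, ax, c))
    return minimize_scalar(neg_obj, bounds=(0.0, 10.0), method="bounded").x

c, y = np.array([1.0, 1.0]), np.array([2.0, 2.0])
a = np.zeros(2)
for _ in range(100):                      # fixed-point iteration on (8)
    a_new = np.array([best_response(i, a, c, y) for i in range(2)])
    if np.max(np.abs(a_new - a)) < 1e-8:
        break
    a = a_new
print("approximate Nash equilibrium:", a)
```

Because the interaction terms are small relative to each agent's own effort cost (the diagonal-dominance idea formalized in the next subsection), the iteration is a contraction here and converges quickly.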

3.3. On the Uniqueness of the Nash Equilibrium in Multiagent Contracts

The individual incentive contract assumes that, if multiple subgame perfect Nash equilibria exist, the principal has the power to choose his or her preferred one. However, if multiple equilibria exist in multiagent contracts, all equilibria must first be found, and we must then look for plausible selection criteria to convince the agents to implement a specific chosen equilibrium. To avoid this computational problem at each epoch $t$, we impose reasonable and mild additional conditions that guarantee a unique Nash equilibrium. We now state these conditions:
  1. $u_i$ is strictly concave in $a_i$, and $u_i'(a_i, \mathbf{a}_{-i}, c_i) := \partial u_i(a_i, \mathbf{a}_{-i}, c_i) / \partial a_i < 0$ for each $i \in [n]$ and each $\mathbf{a}$.
  2. For each $i \in [n]$, let $u_{ij} := \partial^2 u_i / \partial a_i \partial a_j$ for each $i, j$ and, similarly, $f_{ij}$. The matrix $D^2 u_i$ ($D^2 f_i$) is such that its $i$th row is strictly diagonally dominant (diagonally dominant) in the variables $\mathbf{a}$, i.e.,
$$ -u_{ii} > \sum_{j \ne i} |u_{ij}| \qquad \Big( -f_{ii} \ge \sum_{j \ne i} |f_{ij}| \Big). $$
Remark 1.
Comments on the uniqueness conditions of the Nash equilibrium of agents:
1. Condition 1 stipulates that the optimal effort the agents exert is unique and also has a negative effect on their instantaneous utility, i.e., the marginal utility as a function of agent $i$'s effort $a_i$ is negative.
2. Condition 2 states that agent $i$'s own decision is what mostly drives the decrease in his or her marginal utility, whereas the other agents' efforts have a minor effect (note that strict concavity implies that $u_{ii}$ is negative).
3. The signs of the $u_{ij}$ are related to whether $a_i$ is a strategic complement or a strategic substitute [36]. Diagonal dominance thus assumes that the magnitude of the effect of any agent's own actions exceeds the magnitude of the combined strategic effects of all the other agents' actions.
We now prove a result:
Lemma 3.
Let $u_i$ and $f_i$ satisfy Conditions 1 and 2 above, let $g_i^{\alpha_{-i}}$ be as defined in Lemma 1, and let $\mathbf{g}(\mathbf{a}) = [g_1^{\alpha_{-1}}(\mathbf{a}), \ldots, g_n^{\alpha_{-n}}(\mathbf{a})]^T$. The Jacobian matrix of $\mathbf{g}$, $D_{\mathbf{a}} \mathbf{g}(\mathbf{a})$, is then a P-matrix, i.e., all its principal minors are positive.
Proof. 
We first show that $D_{\mathbf{a}} \mathbf{g}(\mathbf{a})$ is a strictly row diagonally dominant Jacobian matrix. Note that, suppressing the arguments $\mathbf{a}, \mathbf{c}$, we obtain:
$$ \frac{\partial g_i}{\partial a_i} = \frac{1}{f_i'} \Big\{ -u_{ii} + \frac{u_i'}{f_i'} f_{ii} \Big\}, \qquad \frac{\partial g_i}{\partial a_j} = \frac{1}{f_i'} \Big\{ -u_{ij} + \frac{u_i'}{f_i'} f_{ij} \Big\}. $$
The row dominance now follows from Condition 2 and the observations that $f_i' > 0$, $-u_{ii} > 0$, $f_{ii} \le 0$, and $g_i^{\alpha_{-i}} = -u_i' / f_i' > 0$ in the domain $\mathbf{g}^{-1}(\mathcal{Y}) \cap \mathcal{A}$. Furthermore, it is easy to see that each principal submatrix of $D_{\mathbf{a}} \mathbf{g}$ is also strictly row diagonally dominant. Using Gershgorin's theorem [37], it follows that all the principal submatrices of $D_{\mathbf{a}} \mathbf{g}$ are nonsingular. We now let $B$ be any such principal submatrix, let $I_B$ be the diagonal matrix of its diagonal elements, and let $A_B$ be the matrix of its off-diagonal elements. Define $B(t) = I_B + t A_B$ for each $t \in [0, 1]$. $B(t)$ is strictly row diagonally dominant, and hence nonsingular, for each $t$; since $\det(B(0)) > 0$ and $\det(B(t))$ is continuous and never vanishes on $[0, 1]$, $\det(B(1))$ is also positive. Thus, $D_{\mathbf{a}} \mathbf{g}$ is a P-matrix. □
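The diagonal-dominance argument is easy to check numerically: any strictly row diagonally dominant matrix with a positive diagonal passes a brute-force test of all principal minors. The matrix below is an illustrative stand-in for $D_{\mathbf{a}} \mathbf{g}$, not one derived from the paper's functions.

```python
import numpy as np
from itertools import combinations

def is_p_matrix(M):
    """Brute-force check that every principal minor of M is positive."""
    n = M.shape[0]
    return all(np.linalg.det(M[np.ix_(idx, idx)]) > 0
               for k in range(1, n + 1)
               for idx in combinations(range(n), k))

# Strictly row diagonally dominant with positive diagonal, as in Lemma 3:
J = np.array([[ 3.0, -1.0,  0.5],
              [ 0.8,  2.5, -1.0],
              [-0.5,  0.7,  2.0]])
assert all(J[i, i] > sum(abs(J[i, j]) for j in range(3) if j != i) for i in range(3))
print(is_p_matrix(J))   # True
```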
Theorem 2.
Assume that Conditions 1 and 2 above hold. Then, for each epoch $t \in [0, T]$, the Nash equilibrium is unique.
Proof. 
From the strict concavity of $u_i$ and (8), we see that, for given $\mathbf{y}$ and $\mathbf{c}$, $\mathbf{a}$ is a Nash equilibrium if and only if:
$$ \mathbf{g}(\mathbf{a}) = \mathbf{y}. $$
Let $\theta_i$ be the largest effort agent $i$ can exert, as in Lemma 1; define $\hat{\mathcal{A}} = \prod_i [0, \theta_i]$, and consider the set $\mathbf{g}(\hat{\mathcal{A}}) = \{\mathbf{y} : \mathbf{g}(\mathbf{a}) = \mathbf{y},\ \mathbf{a} \in \hat{\mathcal{A}}\}$. Using the P-matrix property of $D_{\mathbf{a}} \mathbf{g}$, the fact that $\hat{\mathcal{A}}$ is a hypercube, and the Gale–Nikaido theorem [33] (or [34]), we see that $\mathbf{g}$ maps $\hat{\mathcal{A}}$ homeomorphically onto $\mathbf{g}(\hat{\mathcal{A}})$. The uniqueness follows as $\mathcal{Y} \subseteq \mathbf{g}(\hat{\mathcal{A}})$. □
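Since $D_{\mathbf{a}} \mathbf{g}$ is a P-matrix and hence nonsingular, the unique equilibrium can be computed by solving $\mathbf{g}(\mathbf{a}) = \mathbf{y}$ with Newton's method. The sketch below reuses the hypothetical two-agent primitives from the earlier best-response example; it illustrates the computation and is not the paper's procedure.

```python
import numpy as np

def g(a):
    # g_i = -u_i'/f_i' for the hypothetical primitives used earlier:
    # u_i = c_i - a_i^2 - 0.2 a_i a_j,  f_i = sqrt(1 + a_i) + 0.1 a_j.
    up = np.array([-2*a[0] - 0.2*a[1], -2*a[1] - 0.2*a[0]])   # u_i' < 0
    fp = 0.5 / np.sqrt(1.0 + a)                               # f_i' > 0
    return -up / fp

def jacobian(a, h=1e-6):
    # Central finite-difference Jacobian D_a g.
    J = np.zeros((len(a), len(a)))
    for j in range(len(a)):
        e = np.zeros(len(a)); e[j] = h
        J[:, j] = (g(a + e) - g(a - e)) / (2 * h)
    return J

def solve_equilibrium(y, tol=1e-10):
    a = np.zeros_like(y)
    for _ in range(50):                   # Newton iteration for g(a) = y
        r = g(a) - y
        if np.linalg.norm(r) < tol:
            break
        a -= np.linalg.solve(jacobian(a), r)
    return a

a_star = solve_equilibrium(np.array([2.0, 2.0]))
print(a_star, g(a_star))                  # g(a_star) reproduces y
```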

4. The Optimal Multiagent Contracts

In this section, we solve for the optimal multiagent contracts, given that the $n$ agents exert the equilibrium efforts characterized in Section 3. We denote the principal's controls as $\mathbf{v}(t) = (\mathbf{c}(t), \mathbf{y}(t))$. Define:
$$ U_P^{\mathbf{v}} = \mathbb{E}^{\mathbf{v}}\left[ \int_0^T r_P e^{-r_P s} u_P(\mathbf{X}^{\mathbf{v}}(s), \mathbf{c}(s))\,ds - r_P e^{-r_P T}\, \mathbf{1} \cdot \mathbf{\Phi}(\mathbf{Z}(T)) \right]. \tag{9} $$
With the parameterized IC constraints and a well-defined set of Nash equilibria $\Theta(\mathbf{v})$ for given $\{\mathbf{v}(t)\}$ for all $t \in [0, T]$, the principal's problem is as follows:
$$ \mathbf{v}^* = \underset{\{\mathbf{v}\, :\, \mathbf{a}(t) \in \Theta(\mathbf{v}(t)),\ 0 \le t \le T\}}{\arg\max}\ U_P^{\mathbf{v}}, \tag{10} $$
$$ U_P^* := U_P^{\mathbf{v}^*}. \tag{11} $$
We note that finding all Nash equilibrium points is generally not possible, but when the equilibrium is unique, problem (10) can be solved. Let $R_P^{\mathbf{v}}$ be the present value of the conditional expectation of the principal's continuation value at time $t$ when the policy $\mathbf{v}(\xi)$ is followed for $\xi \in [t, T]$. Thus:
$$ R_P^{\mathbf{v}}(t) = \mathbb{E}^{\mathbf{v}}\left[ \int_t^T r_P e^{-r_P \xi} u_P(\mathbf{X}(\xi), \mathbf{c}(\xi))\,d\xi - r_P e^{-r_P T}\, \mathbf{1} \cdot \mathbf{\Phi}(\mathbf{Z}(T)) \,\Big|\, \mathcal{F}_t^{B, B_Z} \right], \tag{12} $$
and note that $\{R_P^{\mathbf{v}}(t)\}$ is a random process. Therefore, define:
$$ U_P^{\mathbf{v}}(t) = \mathbb{E}^{\mathbf{v}}\big[ U_P^{\mathbf{v}} \,\big|\, \mathcal{F}_t^{B, B_Z} \big] = \int_0^t r_P e^{-r_P \xi} u_P(\mathbf{x}(\xi), \mathbf{c}(\xi))\,d\xi + R_P^{\mathbf{v}}(t). \tag{13} $$
If the optimal solution $\mathbf{v}^*$ exists, then $R_P^{\mathbf{v}^*}(t)$ is an $\mathcal{F}_t^{B, B_Z}$-adapted Martingale and thus has zero drift; for any other $\mathbf{v}$, its drift is non-positive. We now make the following assumption about the principal's continuation value:
Assumption 1.
We assume that the value $R_P^{\mathbf{v}}(t)$ has a $C^{1,2,2,2}$ functional form $F^{\mathbf{v}}(t, \mathbf{W}(t), \mathbf{X}(t), \mathbf{Z}(t))$ in the variables $t$, the $n$ agents' continuation vector $\mathbf{W}(t)$, the observed output vector $\mathbf{X}(t)$, and the termination value descriptor vector $\mathbf{Z}(t)$.
$C^k$ denotes the differentiability class with respect to the corresponding scalar or vector argument. In what follows, for ease of exposition, we shorten $F^{\mathbf{v}}(t, \mathbf{W}(t), \mathbf{X}(t), \mathbf{Z}(t))$ to $F_t^{\mathbf{v}}$ whenever there is no possibility of confusion.
Remark 2.
The state space includes $\mathbf{Z}(t)$ to ensure that the vector process $(\mathbf{W}(t), \mathbf{X}(t), \mathbf{Z}(t))$ is Markov. In the special case that $T \to +\infty$ (i.e., an infinite time horizon with the transversality condition), the state space does not contain $t$, as in [3].
Then, the optimal utility received by the principal, following the optimal control $\mathbf{v}^*$, can also be written as:
$$ U_P^*(t) = \int_0^t r_P e^{-r_P \xi} u_P(\mathbf{x}(\xi), \mathbf{c}(\xi))\,d\xi + F^{\mathbf{v}^*}(t, \mathbf{W}(t), \mathbf{X}(t), \mathbf{Z}(t)). \tag{14} $$
Note that, at epoch $t$, $\{\mathbf{x}(\xi)\}_{\xi \le t}$ is realized, and $\{\mathbf{X}(\xi)\}_{\xi > t}$ is determined by the control $\mathbf{v}^*$. Following the argument of Proposition 1, we see that $U_P^*(t)$, defined by (11) and (13), is an $\mathcal{F}^{B, B_Z}$-adapted Martingale and thus has zero drift. Applying the multidimensional Ito lemma and the dynamics of $\mathbf{W}(t)$, $\mathbf{X}(t)$, and $\mathbf{Z}(t)$, we obtain the dynamics of $U_P^*(t)$. Thus, we can solve for $F$ by setting the drift of its dynamics to zero.
To obtain the drift term, we recall the dynamics of the state variables. For notational convenience, we let $\boldsymbol{\sigma} = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)$, $\mathbf{Y}_1(t) = \mathrm{diag}(r_1 \sigma_1 Y_{11}(t), \ldots, r_n \sigma_n Y_{n1}(t))$, $\mathbf{Y}_2(t) = \mathrm{diag}(r_1 \sigma_{Z_1} Y_{12}(t), \ldots, r_n \sigma_{Z_n} Y_{n2}(t))$, and $\mathbf{r} = \mathrm{diag}(r_1, \ldots, r_n)$. Let $L$ be the Cholesky factor of $\Sigma$ (i.e., $\Sigma = L L^T$), the covariance matrix of $\mathbf{B}(t)$. There exists a process $\hat{\mathbf{B}}$, a vector of $n$ independent Brownian motions, with:
$$ \mathbf{B}(t) = L \hat{\mathbf{B}}(t), \qquad \boldsymbol{\mu}(t, \mathbf{W}(t)) = \mathbf{W}(t) - \mathbf{u}(\mathbf{a}(\mathbf{y}(t)), \mathbf{c}(t)), \qquad \boldsymbol{\sigma}(\mathbf{y}(t)) = \mathbf{Y}_1(t) L. $$
Using Proposition 1, we get:
$$ d\mathbf{W}(t) = \mathbf{r}\big( \mathbf{W}(t) - \mathbf{u}(\mathbf{a}(\mathbf{y}(t)), \mathbf{c}(t)) \big)\,dt + \boldsymbol{\sigma}(\mathbf{y}(t))\,d\hat{\mathbf{B}}(t) + \mathbf{Y}_2(t)\,d\mathbf{B}_Z(t), $$
and similarly for the dynamics of $\mathbf{Z}(t)$, using (2).
We define a differential operator $\mathcal{H}^{\mathbf{v}}$ as a function of the control vector $\mathbf{v} = (\mathbf{y}, \mathbf{c})^T$ as follows:
$$ \begin{aligned} \mathcal{H}^{\mathbf{v}} F_t = {} & \mathbf{r}\, D_{\mathbf{w}} F_t\, \boldsymbol{\mu}(t, \mathbf{w}(t)) + D_{\mathbf{x}} F_t\, \mathbf{f}(\mathbf{a}(\mathbf{y}(t))) + D_{\mathbf{z}} F_t\, \boldsymbol{\mu}_Z(\mathbf{x}(t), \mathbf{c}(t)) \\ & + \tfrac{1}{2}\, \mathrm{trace}\Big( \boldsymbol{\sigma}(\mathbf{y}(t))\, D_{\mathbf{w}}^2 F_t\, \boldsymbol{\sigma}(\mathbf{y}(t)) + \mathbf{Y}_2(t)\, D_{\mathbf{w}}^2 F_t\, \mathbf{Y}_2(t) + L^T \boldsymbol{\sigma}\, D_{\mathbf{x}}^2 F_t\, \boldsymbol{\sigma} L \\ & \qquad\qquad + \boldsymbol{\sigma}(\mathbf{y}(t))\, D_{\mathbf{w}\mathbf{x}}^2 F_t\, \boldsymbol{\sigma} L + \boldsymbol{\sigma}_Z\, D_{\mathbf{z}}^2 F_t\, \boldsymbol{\sigma}_Z \Big), \end{aligned} \tag{15} $$
where $D_{\mathbf{x}} F_t$ and $D_{\mathbf{x}}^2 F_t$ are the first and second derivative matrices of $F_t$ with respect to $\mathbf{x}$. We note that, in the above, we suppressed the superscript of $F_t$.
Applying the multidimensional Ito lemma to (13), we get the drift of the dynamics of $U_P^{\mathbf{v}}(t)$ as:
$$ \frac{\partial}{\partial t} F_t^{\mathbf{v}} + r_P e^{-r_P t} u_P(\mathbf{X}(t), \mathbf{c}(t)) + \mathcal{H}^{\mathbf{v}} F_t^{\mathbf{v}}. \tag{16} $$
We now prove the theorem that verifies Assumption 1 and sets up a Hamilton–Jacobi–Bellman equation that solves the problem (9):
Theorem 3.
The principal’s problem can be formulated as the Hamilton–Jacobi–Bellman equation:
$$ \begin{aligned} & \frac{\partial}{\partial t} F_t + \max_{\mathbf{v} = (\mathbf{y}, \mathbf{c})} \Big\{ r_P e^{-r_P t} u_P(\mathbf{x}(t), \mathbf{c}(t)) + \mathcal{H}^{\mathbf{v}} F_t \Big\} = 0, \\ & \text{s.t.} \quad F(T, \mathbf{w}, \mathbf{x}, \mathbf{z}) = -r_P e^{-r_P T}\, \mathbf{1} \cdot \mathbf{\Phi}(\mathbf{z}) \quad \forall\, \mathbf{w}, \mathbf{x}, \mathbf{z}, \\ & \phantom{\text{s.t.}\quad} \mathbf{a}(\mathbf{v}(t)) \in \Theta(\mathbf{v}(t)) \quad \forall\, t \in [0, T]. \end{aligned} \tag{17} $$
Let its solution be $G(t, \mathbf{w}, \mathbf{x}, \mathbf{z})$ with the control $\hat{\mathbf{v}}(t, \mathbf{w}, \mathbf{x}, \mathbf{z})$. Then $F^{\mathbf{v}^*} = G$ and $\mathbf{v}^* = \hat{\mathbf{v}}$ solve the optimization problem (9), and Assumption 1 is thus verified.
Proof. 
For ease of notation, we define $\mathbf{s} = (\mathbf{w}, \mathbf{x}, \mathbf{z})^T$ and let $G_t = G(t, \mathbf{s}(t))$ be the weak solution of Equation (17) under the control $\hat{\mathbf{v}}$.
Now, using an arbitrary control law $\mathbf{v}$ such that $\mathbf{a}(\mathbf{v}(t)) \in \Theta(\mathbf{v}(t))$ for all $t \in [0, T]$, with the state dynamics of $\mathbf{S}^{\mathbf{v}}$ governed by the Brownian motions $\mathbf{B}, \mathbf{B}_Z$, and since $G$ solves the HJB equation, we see that:
$$ \frac{\partial}{\partial t} G_t + r_P e^{-r_P t} u_P(\mathbf{x}^{\mathbf{v}}(t), \mathbf{c}(t)) + \mathcal{H}^{\mathbf{v}} G_t \le 0, \tag{18} $$
for all $\mathbf{v}$. Thus, we have, for each time $\xi \in [0, T]$,
$$ \frac{\partial}{\partial t} G_\xi^{\mathbf{v}} + \mathcal{H}^{\mathbf{v}} G_\xi^{\mathbf{v}} \le -r_P e^{-r_P \xi} u_P(\mathbf{x}^{\mathbf{v}}(\xi), \mathbf{c}(\xi)). $$
Integrating from $t$ to $T$, applying Ito's lemma to $G(t, \mathbf{S})$, and taking conditional expectations (which sets the stochastic integral to zero), we see that:
$$ G_t^{\mathbf{v}} = \mathbb{E}^{\mathbf{v}}\left[ G_T^{\mathbf{v}} - \int_t^T \Big( \frac{\partial}{\partial t} G_\xi^{\mathbf{v}} + \mathcal{H}^{\mathbf{v}} G_\xi^{\mathbf{v}} \Big)\,d\xi \,\Big|\, \mathcal{F}_t^{B, B_Z} \right]. $$
From the boundary condition, we also have $G_T = -r_P e^{-r_P T}\, \mathbf{1} \cdot \mathbf{\Phi}(\mathbf{z}^{\mathbf{v}})$. Combining the above expression with Inequality (18), we obtain:
$$ G_t^{\mathbf{v}} \ge \mathbb{E}^{\mathbf{v}}\left[ \int_t^T r_P e^{-r_P \xi} u_P(\mathbf{X}^{\mathbf{v}}(\xi), \mathbf{c}^{\mathbf{v}}(\xi))\,d\xi - r_P e^{-r_P T}\, \mathbf{1} \cdot \mathbf{\Phi}(\mathbf{Z}^{\mathbf{v}}(T)) \,\Big|\, \mathcal{F}^{B, B_Z} \right] = R_P^{\mathbf{v}}(t). $$
Since the control $\mathbf{v}$ was chosen arbitrarily and $R_P^{\mathbf{v}}$ is as in (12), with $\mathbf{v}^*$ the optimal solution to problem (9), we have:
$$ G_t \ge \sup_{\mathbf{v}} G_t^{\mathbf{v}} \ge \sup_{\mathbf{v}} R_P^{\mathbf{v}}(t) = R_P^*(t). \tag{19} $$
To see the converse, let $G_t$ and $\hat{\mathbf{v}}$ solve the HJB Equation (17). Ito's lemma gives, as in (18), with an Ito integral $J$:
$$ \int_t^T \Big( \frac{\partial}{\partial t} G_\xi + \mathcal{H}^{\hat{\mathbf{v}}} G_\xi \Big)\,d\xi + J = G_T - G_t. $$
Using (17) and the above with a minor rearrangement, and taking the expectation conditioned on $\mathcal{F}^{B, B_Z}$, we get:
$$ G_t = \mathbb{E}^{\hat{\mathbf{v}}}\left[ \int_t^T r_P e^{-r_P \xi} u_P(\mathbf{X}^{\hat{\mathbf{v}}}(\xi), \mathbf{c}(\xi))\,d\xi - r_P e^{-r_P T}\, \mathbf{1} \cdot \mathbf{\Phi}(\mathbf{Z}^{\hat{\mathbf{v}}}(T)) \,\Big|\, \mathcal{F}^{B, B_Z} \right] = R_P^{\hat{\mathbf{v}}}(t). $$
Since $\hat{\mathbf{v}}$ is a control and $R_P^*(t)$ is the optimal continuation value under the optimal control $\mathbf{v}^*$, we have $R_P^*(t) \ge R_P^{\hat{\mathbf{v}}}(t)$. Thus, combining with (19), we get:
$$ G_t \ge R_P^*(t) \ge R_P^{\hat{\mathbf{v}}}(t) = G_t. $$
The theorem now follows: the above inequalities give $G_t = R_P^*(t)$ for arbitrary $t$, and $\hat{\mathbf{v}}$ is the optimal contract. □

Iterative Algorithm for Solving Multiagent Contracts

Since adding an equilibrium constraint causes new computational issues, we propose here an iterative algorithm to obtain the optimal multiagent contracts in Theorem 3. The main idea is to integrate a numerical method for solving the HJB equation (Howard's algorithm [38]) with a fixed-point algorithm (the Eaves–Saigal algorithm [39]). For brevity, we denote the state variable at time $t$ by a time-generic vector $\mathbf{s} = (\mathbf{w}, \mathbf{x}, \mathbf{z}) \in \mathbb{R}^{3n}$ (note that the mesh width for each type of state may vary) and the control at time $t$ by $\mathbf{v} = (\mathbf{c}, \mathbf{y}) \in \mathbb{R}^{2n}$. We discretize the $\mathbf{s}$–$t$ plane by choosing uniform mesh widths $\Delta \mathbf{s} = (\Delta w, \Delta x, \Delta z) \in \mathbb{R}^{3n}$ and a time step $\Delta t$ such that $T / \Delta t \in \mathbb{N}$. We define the discrete mesh points $\mathbf{s}_{\mathbf{i}, \mathbf{j}, \mathbf{k}}$ (componentwise products) by:
$$ \mathbf{s}_{\mathbf{i}, \mathbf{j}, \mathbf{k}} = (\mathbf{i}, \mathbf{j}, \mathbf{k})\, \Delta \mathbf{s}, \quad (\mathbf{i}, \mathbf{j}, \mathbf{k}) = (i_1, \ldots, i_n, j_1, \ldots, j_n, k_1, \ldots, k_n) \in \mathbb{N}^{3n}, \quad t_\tau = \tau \Delta t, \quad \tau \in [T / \Delta t]. $$
Our goal is to compute an approximation $F_{\mathbf{i}, \mathbf{j}, \mathbf{k}}^{\tau}$ to the solution $F(t, \mathbf{w}, \mathbf{x}, \mathbf{z})$ of (17) by discretization and a finite difference method.
Now, define the approximation of the Hamiltonian operator $\mathcal{H}^{\mathbf{v}} F_t$ in (15) as $\mathcal{H}^{\mathbf{v}} \hat{F}_t^{\tau}$ (we use a forward-in-time and central-in-space scheme) with the following approximations for the gradients:
$$ \frac{\partial \hat{F}_t^{\tau}}{\partial t} = \frac{F_{\mathbf{i}, \mathbf{j}, \mathbf{k}}^{\tau+1} - F_{\mathbf{i}, \mathbf{j}, \mathbf{k}}^{\tau}}{\Delta t}, \qquad D_{\mathbf{w}} \hat{F}_t^{\tau} \big|_{\ell} = \frac{F_{\mathbf{i}+e_\ell, \mathbf{j}, \mathbf{k}}^{\tau} - F_{\mathbf{i}-e_\ell, \mathbf{j}, \mathbf{k}}^{\tau}}{2 \Delta w}, \quad \ell \in [n], $$
$$ D_{\mathbf{x}} \hat{F}_t^{\tau} \big|_{\ell} = \frac{F_{\mathbf{i}, \mathbf{j}+e_\ell, \mathbf{k}}^{\tau} - F_{\mathbf{i}, \mathbf{j}-e_\ell, \mathbf{k}}^{\tau}}{2 \Delta x}, \qquad D_{\mathbf{z}} \hat{F}_t^{\tau} \big|_{\ell} = \frac{F_{\mathbf{i}, \mathbf{j}, \mathbf{k}+e_\ell}^{\tau} - F_{\mathbf{i}, \mathbf{j}, \mathbf{k}-e_\ell}^{\tau}}{2 \Delta z}, \quad \ell \in [n], $$
where $e_\ell \in \mathbb{R}^n$ is the unit vector with one in the $\ell$th entry and zero elsewhere. The $(\ell, \ell')$th entry of the approximation of a Hessian (we only present the Hessian with respect to $\mathbf{w}$) is:
$$ D_{\mathbf{w}}^2 \hat{F}_t^{\tau} \big|_{\ell, \ell'} = \begin{cases} \dfrac{F_{\mathbf{i}+e_\ell+e_{\ell'}, \mathbf{j}, \mathbf{k}}^{\tau} - F_{\mathbf{i}+e_\ell-e_{\ell'}, \mathbf{j}, \mathbf{k}}^{\tau} - F_{\mathbf{i}-e_\ell+e_{\ell'}, \mathbf{j}, \mathbf{k}}^{\tau} + F_{\mathbf{i}-e_\ell-e_{\ell'}, \mathbf{j}, \mathbf{k}}^{\tau}}{4 \Delta w^2} & \text{if } \ell \ne \ell', \\[2ex] \dfrac{F_{\mathbf{i}+e_\ell, \mathbf{j}, \mathbf{k}}^{\tau} - 2 F_{\mathbf{i}, \mathbf{j}, \mathbf{k}}^{\tau} + F_{\mathbf{i}-e_\ell, \mathbf{j}, \mathbf{k}}^{\tau}}{\Delta w^2} & \text{otherwise}. \end{cases} $$
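As a sanity check on these stencils, the snippet below implements the one-dimensional central differences on a uniform grid and verifies the expected second-order accuracy on a smooth test function; it is a minimal illustration, not the paper's solver.

```python
import numpy as np

def gradient_central(F, dw):
    # (F_{i+1} - F_{i-1}) / (2 dw), one-sided at the boundaries.
    g = np.empty_like(F)
    g[1:-1] = (F[2:] - F[:-2]) / (2 * dw)
    g[0], g[-1] = (F[1] - F[0]) / dw, (F[-1] - F[-2]) / dw
    return g

def hessian_diag_central(F, dw):
    # (F_{i+1} - 2 F_i + F_{i-1}) / dw^2 on the interior.
    h = np.empty_like(F)
    h[1:-1] = (F[2:] - 2 * F[1:-1] + F[:-2]) / dw**2
    h[0], h[-1] = h[1], h[-2]          # crude extrapolation at the edges
    return h

w = np.linspace(0.0, 1.0, 101)
F = np.sin(w)
err = np.abs(gradient_central(F, w[1] - w[0])[1:-1] - np.cos(w)[1:-1]).max()
print(err)   # about 1e-5: consistent with the O(dw^2) truncation error
```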
We define the function $\Psi^{\mathbf{v}} := r_P u_P + \mathcal{H}^{\mathbf{v}} \hat{F}_t$ and the principal's value function under the optimal control at time $t$ as $F^* := F^{\mathbf{v}^*}(t, \mathbf{s})$. We initialize with the boundary condition $F(T, \mathbf{w}, \mathbf{x}, \mathbf{z}) = -r_P e^{-r_P T}\, \mathbf{1} \cdot \mathbf{\Phi}(\mathbf{z})$ as the terminal condition, together with well-posedness conditions for the state space. In particular, we note that, in an $n$-agent contract, when $n_1$ agents have zero continuation value $w$, we first need to solve an $(n - n_1)$-agent subproblem as a boundary condition. In the $m$th step of the policy iteration, policy evaluation under the controls $\mathbf{v}^m$ is conducted by solving the approximation of the PDE, $\partial \hat{F}_t^{\tau} / \partial t + \Psi^{\mathbf{v}^m} = 0$.
Since the PDE under an arbitrary control is well-posed, we can find a weak solution for $F_t$ [40]. We then (1) solve a fixed-point problem to find the agents' unique optimal responses $\mathbf{a}(t) \in \Theta(\mathbf{v}^m)$ and (2) use a greedy algorithm to improve the policy as:
$$ \mathbf{v}^{m+1} = \arg\max_{\mathbf{v} \in \mathcal{V}} \Psi^{\mathbf{v}}. $$
Summarizing the above, we can solve for the optimal multiagent contracts by adopting the following backward scheme (a minimal code sketch follows the list):
  1. Initialize the terminal condition $F(T, \mathbf{s}) = -r_P e^{-r_P T}\, \mathbf{1} \cdot \mathbf{\Phi}(\mathbf{z})$.
  2. While $t = T - \tau \Delta t \ge 0$, with a fixed $\epsilon > 0$:
    (a) For each state $(\mathbf{w}, \mathbf{x}, \mathbf{z})$, start with an arbitrary contract $\mathbf{v}^0 = \{\mathbf{c}^0, \mathbf{y}^0\}$.
    (b) Solve a fixed-point problem such that $\mathbf{a}(t) \in \Theta(\mathbf{v}^0)$. If the conditions in Section 3.3 are satisfied, the equilibrium is unique.
    (c) Solve for the boundary conditions as a single-agent contract, as in [21]. Then, solve a parabolic PDE within (17), i.e., with fixed contracts, to obtain $\tilde{F}(t, \mathbf{s})$ [39].
    (d) Optimize the objective value $\tilde{F}(t, \mathbf{s})$ for each state $\mathbf{s} = (\mathbf{w}, \mathbf{x}, \mathbf{z})$ by gradient ascent. The gradient is $\nabla_{\mathbf{v}} \tilde{F} \in \mathbb{R}^{2n}$, and the step size $\gamma$ can be determined by a line search. If $\|\nabla_{\mathbf{v}} \tilde{F}\| \ge \epsilon$, go back to (b) with the new contracts $\mathbf{v}^0 \leftarrow (\mathbf{c}, \mathbf{y})$.
    (e) Go to Step 3 if $\|\nabla_{\mathbf{v}} \tilde{F}\| < \epsilon$.
  3. Update the contracts $\{\mathbf{c}(t), \mathbf{y}(t)\}$ and the continuation value $F(t, \mathbf{s})$. Go to Step 2 with $\tau \leftarrow \tau + 1$.
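The following skeleton mirrors the backward scheme on a deliberately tiny toy problem: one scalar contract pair $(c, y)$, a closed-form agent response standing in for step (b), and a one-line flow utility standing in for the PDE evaluation of step (c). All functional forms and constants are our own illustrations, not the paper's.

```python
import numpy as np

def agent_response(v):
    # Step (b): the unique equilibrium response to contract v = (c, y) (toy form).
    return 0.5 * v[1]

def principal_flow(v):
    # Toy concave flow utility of the principal under contract v.
    c, y = v
    a = agent_response(v)
    return a - 0.25 * y**2 - (c - a)**2

def backward_scheme(T=1.0, dt=0.05, eps=1e-5, lr=0.3):
    F = -1.0                                # terminal condition (toy stand-in value)
    tau, v = 0, None
    while tau * dt <= T:
        v = np.array([0.0, 0.0])            # step (a): arbitrary initial contract
        for _ in range(500):                # steps (b)-(e): ascent with re-solved response
            grad = np.array([(principal_flow(v + dv) - principal_flow(v)) / 1e-6
                             for dv in 1e-6 * np.eye(2)])
            if np.linalg.norm(grad) < eps:  # step (e): local optimality reached
                break
            v = v + lr * grad               # step (d): gradient ascent, fixed step size
        F = F + dt * principal_flow(v)      # step 3: one backward evaluation step
        tau += 1
    return F, v

F0, v0 = backward_scheme()
print("F(0) ~", F0, " optimal toy contract ~", v0)   # v0 converges to about (0.5, 1.0)
```

In the full algorithm, `principal_flow` is replaced by the discretized $\Psi^{\mathbf{v}}$ from the finite-difference scheme, and `agent_response` by the fixed-point solve of step (b).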
Lemma 4.
The iterative algorithm for multiagent incentive contracts converges to the optimal contract as $m \to \infty$.
Proof. 
The backward scheme is a generic Howard's algorithm, which guarantees that the sequence $F^m$ converges to $F^*$ and $\mathbf{v}^m$ converges to $\mathbf{v}^*$ as $m \to \infty$ [38]. In addition, we need to guarantee that the following three conditions are met. First, under any implementable contracts, the numerical method can evaluate the value $F$ in (17); this holds because the weak solution of a linear parabolic PDE can be computed by the finite difference method [40]. Second, for any given $\mathbf{v}^m$, the Nash equilibrium of the agents $\mathbf{a}(t)$ exists by Theorem 1, so the feasible region is non-empty for each $\mathbf{v}^m$. Finally, if there were multiple Nash equilibria, we would have to carry out the policy-search procedure in a vector-valued setting and compare the objective values of all Nash equilibria, which is known to be difficult if not impossible. Imposing the uniqueness conditions of Theorem 2, searching for all multiagent Nash equilibria is not required [39], and the convergence of the iterative algorithm follows. With these conditions, Howard's algorithm solves (17) to optimality and obtains the optimal contract by Theorem 3. □
The multiagent Nash equilibrium is defined for noncooperative multiplayer concave games, where each player's objective function is concave only in his or her own decisions and not necessarily concave with respect to the other players' decisions. Alternative approaches that fully exploit the structure of concave games in searching for equilibria were reviewed in [41]. The above procedure has been implemented to solve a multiagent incentive contract designed for the simultaneous penetration of electric vehicles and charging stations (with real-world data) in the transportation infrastructure [42].

5. Conclusions

Multiagent incentive contracts with broad applications are hard to solve in general. We characterize sufficient conditions under which the Nash equilibrium of the agents exists, as well as additional requirements for the Nash equilibrium to be unique. We develop a backward iterative algorithm to find the optimal contracts. The implications of our results are two-fold. First, compared to the single-agent setting, multiagent contracts can model either team collaboration or competition, depending on the context. Second, the conditions for existence and uniqueness offer new insights into the inertia of effective contracting in multiagent systems.
The limitations of the multiagent incentive contract model include:
  • The Martingale approach is restricted to SDE output processes in which each agent's decision only affects the drift term. An extension in which agents control the diffusion term of the output process may cause significant technical difficulties, even in the single-agent case.
  • The coupled gradient-based and fixed-point optimization restricts the computational efficiency of solving the contracts. In the absence of a unique multiagent Nash equilibrium, the proposed algorithm can only compute locally optimal contracts, and the verification theorem (Theorem 3) then fails. Developing more efficient algorithms for multiagent contracts with multiple Nash equilibria is a meaningful future direction.
In summary, this work presents a solvable multiagent incentive contract model that opens the door to implementing dynamic contracts with a wide range of applications in quantitative finance, economics, operations research, and decentralized control.

Author Contributions

Authors’ individual contributions: conceptualization, Q.L. and R.S.; methodology, Q.L. and R.S.; formal analysis, Q.L. and R.S.; writing, original draft preparation, Q.L. and R.S.; writing, review and editing, Q.L. and R.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; nor in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
IC    incentive-compatible
IR    individual-rational
PDE   partial differential equation
SPNE  subgame perfect Nash equilibrium
HJB   Hamilton–Jacobi–Bellman equation

References

  1. Aïd, R.; Possamaï, D.; Touzi, N. Optimal electricity demand response contracting with responsiveness incentives. arXiv 2018, arXiv:1810.09063.
  2. Brunnermeier, M.K.; Sannikov, Y. A macroeconomic model with a financial sector. Am. Econ. Rev. 2014, 104, 379–421.
  3. Cvitanic, J.; Zhang, J. Contract Theory in Continuous-Time Models; Springer Science & Business Media: Berlin, Germany, 2012.
  4. DeMarzo, P.M.; Sannikov, Y. Optimal security design and dynamic capital structure in a continuous-time agency model. J. Financ. 2006, 61, 2681–2724.
  5. Faingold, E.; Sannikov, Y. Reputation in continuous-time games. Econometrica 2011, 79, 773–876.
  6. Fuchs, W. Contracting with repeated moral hazard and private evaluations. Am. Econ. Rev. 2007, 97, 1432–1448.
  7. Guo, L.; Ye, J.J. Necessary optimality conditions for optimal control problems with equilibrium constraints. SIAM J. Control Optim. 2016, 54, 2710–2733.
  8. Luo, Q.; Saigal, R.; Chen, Z.; Yin, Y. Accelerating the adoption of automated vehicles by subsidies: A dynamic games approach. Transp. Res. Part B Methodol. 2019, 129, 226–243.
  9. Mastrolia, T.; Ren, Z. Principal-Agent problem with common agency without communication. SIAM J. Financ. Math. 2018, 9, 775–799.
  10. Nadtochiy, S.; Zariphopoulou, T. Optimal Contract for a Fund Manager with Capital Injections and Endogenous Trading Constraints. SIAM J. Financ. Math. 2019, 10, 698–722.
  11. Laffont, J.J.; Martimort, D. The Theory of Incentives: The Principal-Agent Model; Princeton University Press: Princeton, NJ, USA, 2009.
  12. Piskorski, T.; Tchistyi, A. Optimal mortgage design. Rev. Financ. Stud. 2010, 23, 3098–3140.
  13. Williams, N. A solvable continuous time dynamic principal–agent model. J. Econom. Theory 2015, 159, 989–1015.
  14. Demarzo, P.M.; Sannikov, Y. Learning, termination, and payout policy in dynamic incentive contracts. Rev. Econ. Stud. 2016, 84, 182–236.
  15. Cvitanić, J.; Possamaï, D.; Touzi, N. Dynamic programming approach to principal–agent problems. Financ. Stoch. 2018, 22, 1–37.
  16. Spear, S.E.; Srivastava, S. On repeated moral hazard with discounting. Rev. Econ. Stud. 1987, 54, 599–617.
  17. Holmstrom, B.; Milgrom, P. Aggregation and linearity in the provision of intertemporal incentives. Econometrica 1987, 55, 303–328.
  18. Schättler, H.; Sung, J. The first-order approach to the continuous-time principal–agent problem with exponential utility. J. Econom. Theory 1993, 61, 331–371.
  19. Sung, J. Linearity with project selection and controllable diffusion rate in continuous-time principal-agent problems. RAND J. Econ. 1995, 26, 720–743.
  20. Müller, H.M. Asymptotic efficiency in dynamic principal-agent problems. J. Econom. Theory 2000, 91, 292–301.
  21. Sannikov, Y. A continuous-time version of the principal-agent problem. Rev. Econ. Stud. 2008, 75, 957–984.
  22. Cvitanić, J.; Possamaï, D.; Touzi, N. Moral hazard in dynamic risk management. Manag. Sci. 2016, 63, 3328–3346.
  23. Keun Koo, H.; Shim, G.; Sung, J. Optimal Multi-Agent Performance Measures for Team Contracts. Math. Financ. 2008, 18, 649–667.
  24. Thakur, A. Continuous-Time Principal Multi-Agent Problem: Moral Hazard in Teams & Fiscal Federalism; Working Paper; Princeton University: Princeton, NJ, USA, 2015.
  25. Georgiadis, G. Projects and team dynamics. Rev. Econ. Stud. 2014, 82, 187–218.
  26. Elie, R.; Mastrolia, T.; Possamaï, D. A Tale of a Principal and Many, Many Agents. Math. Oper. Res. 2018, 377–766.
  27. Elie, R.; Possamaï, D. Contracting theory with competitive interacting agents. SIAM J. Control Optim. 2019, 57, 1157–1188.
  28. Englmaier, F.; Wambach, A. Optimal incentive contracts under inequity aversion. Games Econom. Behav. 2010, 69, 312–328.
  29. Goukasian, L.; Wan, X. Optimal incentive contracts under relative income concerns. Math. Financ. Econ. 2010, 4, 57–86.
  30. Ma, C.T. Unique implementation of incentive contracts with many agents. Rev. Econ. Stud. 1988, 55, 555–572.
  31. Demski, J.S.; Sappington, D. Optimal incentive contracts with multiple agents. J. Econom. Theory 1984, 33, 152–171.
  32. Holmstrom, B.; Milgrom, P. Multitask principal-agent analyses: Incentive contracts, asset ownership, and job design. J. Econ. Org. 1991, 7, 24.
  33. Gale, D.; Nikaido, H. The Jacobian matrix and global univalence of mappings. Math. Ann. 1965, 159, 81–93.
  34. Kojima, M.; Saigal, R. A study of PC1 homeomorphisms on subdivided polyhedrons. SIAM J. Math. Anal. 1979, 10, 1299–1312.
  35. Shan, Y. Optimal contracts for research agents. RAND J. Econ. 2017, 48, 94–124.
  36. Osborne, M.J.; Rubinstein, A. A Course in Game Theory; MIT Press: Cambridge, MA, USA, 1994.
  37. Householder, A.S. The Theory of Matrices in Numerical Analysis; Courier Corporation: North Chelmsford, MA, USA, 2013.
  38. Jensen, M.; Smears, I. On the convergence of finite element methods for Hamilton–Jacobi–Bellman equations. SIAM J. Numer. Anal. 2013, 51, 137–162.
  39. Eaves, B.C.; Saigal, R. Homotopies for computing fixed points in unbounded regions. Math. Program. 1972, 3, 225–237.
  40. Evans, L.C. Partial Differential Equations; American Mathematical Society: Providence, RI, USA, 2010; Volume 19.
  41. Chernov, A. On Some Approaches to Find Nash Equilibrium in Concave Games. Autom. Remote Control 2019, 80, 964–988.
  42. Yin, Y.; Luo, Q.; Zhan, J.; Saigal, R. Dynamic Subsidies for a Synergy between Charging Infrastructure Development and Electric Vehicle Adoption; Working Paper; University of Michigan: Ann Arbor, MI, USA, 2020.
Figure 1. Proof of the existence of a solution to $g_i^{\alpha_{-i}}(x) = y$ in Lemma 1.
Table 1. Static incentive contracts with two agents. Each cell lists the principal's and the agent's utility.

         Agent 1          Agent 2
         A       B        A       B
  L      4,2     2,1      2,1     4,2
  H      3,3     1,2      1,2     3,3
Table 2. Agents’ output with a single Nash equilibrium.
Table 2. Agents’ output with a single Nash equilibrium.
AB
A2,12,2
B1,11,2
Table 3. Agents’ output with multiple Nash equilibria.
Table 3. Agents’ output with multiple Nash equilibria.
AB
A2,12,3
B2,12,3
Table 4. Agents’ output with no Nash equilibrium.
Table 4. Agents’ output with no Nash equilibrium.
AB
A4,32,2
B1,13,4
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
