Analysis of a Queue with General Service Demands and Multiple Servers with Variable Service Capacities

De Muynck, Michiel; Bruneel, Herwig; Wittevrongel, Sabine

doi:10.3390/math11040953


by Michiel De Muynck, Herwig Bruneel and Sabine Wittevrongel *

SMACS Research Group, Department of Telecommunications and Information Processing (TELIN), Ghent University (UGent), Sint-Pietersnieuwstraat 41, B-9000 Gent, Belgium

* Author to whom correspondence should be addressed.

Mathematics 2023, 11(4), 953; https://doi.org/10.3390/math11040953

Submission received: 30 December 2022 / Revised: 4 February 2023 / Accepted: 8 February 2023 / Published: 13 February 2023

(This article belongs to the Special Issue Queue and Stochastic Models for Operations Research II)


Abstract:

We study a non-classical discrete-time queuing system in which each customer requests a variable amount of service, called its “service demand”, from a system with multiple servers, each of which can provide a variable amount of service, called its “service capacity”, in each time slot. The service demands are independent from customer to customer and follow a general distribution, whereas the service capacities follow a phase-type distribution and are independent from server to server and from slot to slot. Since an exact analysis of this general queuing system is infeasible, we propose several approximations for the key performance characteristics of this system, such as the mean system content and the mean customer delay in steady state. The accuracy of each of these approximations is assessed against simulations in several numerical examples.

Keywords:

discrete-time queuing theory; service demands; service capacities; multi-server

MSC:

60K25; 90B22; 68M20

1. Introduction

Consider a queuing phenomenon in which the time it takes to serve each customer is variable. Two of the most common causes for this variability in the service times are that the amount of work that each customer requires from the system can vary from customer to customer, and that the rate at which the server(s) in the system can serve the customers can vary over time and/or from server to server in the case of multi-server systems.

In the classical queuing-theory literature, several kinds of queuing models are used to study such queuing phenomena. One class of queuing models uses the concept of “service times”, i.e., the time it takes to serve each customer is modeled explicitly. These service times are usually assumed to be independent and identically distributed (i.i.d.) from customer to customer. This kind of queuing model is commonly used to model queuing phenomena where the reason for the variable service times is the first reason above, i.e., the demands of each customer are variable. Another class of queuing models that is widely used in the literature uses the concept of “service rate”, especially in continuous-time queuing models and for queuing phenomena where the reason for the variable service times is the second reason above, i.e., that the capacity of the system to serve customers varies over time.

However, in some queuing phenomena, both the service requirements of each customer and the rate of service over time can be variable. Examples include grocery stores with variable-sized shopping carts and servers with variable levels of experience, web services with variable types of requests and variable server capacity due to spot pricing, Wi-Fi connections with variable packet sizes and variable connection quality, etc. For these kinds of queuing phenomena, neither the concept of service time nor the concept of service rate is sufficient to fully describe the queuing system. It is then more precise to explicitly model the “service demands”, i.e., the amount of work that each customer requires from the system, and the “service capacities” of the server(s), i.e., the amount of work that each server can provide in a given time slot.

Several studies in the scientific literature have proposed and analyzed queuing models with variable service demands and variable service capacities or rates of service, both in continuous time (see, e.g., [1,2,3]) and in discrete time (see, e.g., [4,5,6,7,8,9,10,11,12]). However, these discrete-time studies are primarily focused on single-server queuing phenomena. To the best of our knowledge, no existing studies have considered discrete-time queuing models with variable service demands, variable service capacities, and multiple servers.

Even more classical multi-server queuing models are notoriously difficult to study analytically. For instance, in the queuing-theory literature, no closed-form expressions have been reported for the steady-state system-content or customer-delay distributions of the continuous-time

G / G / c

queuing model with c servers and generally distributed inter-arrival times and service times. The same is true for the discrete-time

G e o^{X} / G I / c

queuing model with so-called general independent arrivals from slot to slot (i.e., with geometrically distributed inter-arrival times between batch arrivals), generally distributed service times and c servers. Herein, the standard Kendall’s notation

A^{X} / B / c

is used to describe a queuing model (see [13]), where A refers to the distribution of the inter-arrival times; X refers to a random variable for the arrival batch size that is only present in case of batch arrivals; B refers to the service-time distribution; and c refers to the number of servers. Exact results for the steady-state system-content or customer-delay distributions have only been reported for special cases, where specific model restrictions are introduced on, e.g., the number of servers, the distribution of the inter-arrival times, and/or the service-time distribution. Examples of such classical multi-server queuing models can be found in [14,15,16,17,18] for a continuous-time setting and in [19,20,21,22,23,24,25,26] for a discrete-time setting.

It may seem surprising that multi-server systems are more difficult to analyze than single-server systems with variable service capacity. After all, a system with two servers which each have a service capacity of x seems reasonably similar to a system with one server with a service capacity of

2 x

. However, there is a crucial difference between those two systems, which is that the single-server system is work-conserving whereas the multi-server system is not. Indeed, if there is exactly one customer in the system, then in the multi-server system, this customer will be served by one of the two servers while the other server is idle and its service capacity remains unused, whereas in the single-server system, the entire service capacity is used to serve the customer. This fundamental difference is what makes it impossible to precisely analyze the multi-server system using the techniques available in the current literature.

Therefore, in this paper, we develop several techniques which may be used to obtain approximations for the key performance characteristics such as the mean number of customers in the system at the beginning of an arbitrary slot or the mean delay of an arbitrary customer. Some of these are focused on being accurate only in specific situations, such as low or high traffic load. We also propose and analyze one approximation, called the Two-Phase approximation, which is aimed at being accurate regardless of the situation. These various approximations are then verified using several numerical simulations.

The rest of this paper is structured as follows. In Section 2, we formalize the multi-server queuing model studied in this paper and define the mathematical notation used throughout. The various approximations for the probability generating functions (pgfs) and means of the system-content and customer-delay distributions are discussed in Section 3. In Section 4, we compare the accuracy of these approximations against simulations for several example scenarios. Section 5 provides a brief conclusion to the paper. Finally, in Appendix A and Appendix B, we state and prove two theorems which are used in the analysis of the considered queuing model.

2. Queuing Model

In this section, we give a precise description of the queuing model studied in this paper, as well as the mathematical notation used in the analysis.

The studied queuing model is a discrete-time queuing model, i.e., time is divided into (time) slots. Customers that arrive during a time slot are only allowed into the system at the end of that slot. The number of customers that arrive during time slot k is denoted as

A_{k}

. These numbers of arrivals

{A_{k} | k = 1, 2, . . .}

are assumed to be i.i.d. random variables with a general common distribution with pgf

A (z)

, i.e.,

A (z) ≜ \sum_{n = 0}^{\infty} P (arbitrary slot contains n arrivals) z^{n} .

(1)

The mean number of arrivals per slot is denoted as

λ ≜ A^{'} (1)

.

The “service demands”, i.e., the amount of work that each customer requires from the system, are assumed to be positive integer numbers of “work units”. The service demand of customer k is denoted as

S_{k}

. The service demands

{S_{k} | k = 1, 2, . . .}

are assumed to be i.i.d. random variables with a general common distribution with pgf

S (z)

. The mean service demand is denoted as

τ ≜ S^{'} (1)

.

There are c servers in our queuing model for some given integer

c \geq 1

. In each time slot, each of these servers has a “service capacity”, which is the maximum amount of work that the server can provide to customers during that time slot. The service capacity of server i during slot k is an integer number of work units, denoted as

R_{i, k}

. The service capacities

{R_{i, k} | i = 1, 2, . . ., c, k = 1, 2, . . .}

are assumed to be i.i.d. random variables with a common distribution with pgf

R (z)

. The mean service capacity of a server (per slot) is denoted as

μ ≜ R^{'} (1)

. The pgf

R (z)

is assumed to be a rational function. We therefore introduce the mutually prime polynomials

P_{R} (z)

and

Q_{R} (z)

such that

R (1 / z) = \frac{P_{R} (z)}{Q_{R} (z)},

(2)

and we let m denote the degree of

Q_{R} (z)

.

The

R_{i, k}

work units of service capacity are always executed in a work-conserving manner, i.e., if the available service capacity of a server is larger than the (remaining) service demand of the customer in service by that server, then a next customer can immediately begin service (during the same time slot). This repeats until either the available service capacity of the server runs out or there are no more customers left in the queue. As such, it is possible that many customers are served during one slot by one server. Conversely, if the remaining service demand of the customer in service by server i is larger than

R_{i, k}

, then this remaining service demand simply decreases by

R_{i, k}

and the service of the customer continues in the next slot

k + 1

. As such, one customer’s service may last many slots.

The scheduling mechanism by which customers are allocated to servers is as follows. There is a single queue which is shared between all servers. Each customer can only receive service from a single server. At the beginning of each slot, the customer at the front of the queue goes to a random server that has at least 1 work unit of service capacity available. This is repeated until either there are no more customers in the queue or all servers are serving customers with a total (remaining) service demand that is greater than or equal to their service capacity during that slot.

The number of customers in the system at the beginning of the kth time slot is denoted as

B_{k}

. This is also called the system content at the beginning of the kth slot. The delay of the kth customer, which is the whole number of slots between the end of this customer’s arrival slot and the end of the last slot during which the customer receives service, is denoted as

D_{k}

. The two main performance characteristics that we are interested in are the limiting distributions of

B_{k}

and

D_{k}

as k goes to infinity, i.e., in so-called “steady state”. When these two random variables have a limiting distribution, the system is said to be “stable”. This is the case when the mean total service demand of all customers arriving per slot is less than the mean service capacity per slot, i.e.,

λ τ < c μ

, or alternatively, when the load

ρ = λ τ / (c μ) < 1

. When the system is stable, we denote the pgf of the limiting distribution of

B_{k}

and

D_{k}

as, respectively,

B (z)

and

D (z)

. Alternatively,

B (z)

is the pgf of the system content at the beginning of an arbitrary slot in steady state, and

D (z)

is the pgf of the delay of an arbitrary customer in steady state.
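As a quick numerical illustration of the stability condition, the following minimal sketch computes the load from finite probability mass functions. The function names and the list-based pmf representation (index n holds the probability of the value n) are assumptions made for this example only and do not come from the paper.

```python
def mean(pmf):
    """Mean of a distribution given as a list: pmf[n] = P(X = n)."""
    return sum(n * q for n, q in enumerate(pmf))


def load(arrival_pmf, demand_pmf, capacity_pmf, c):
    """Load rho = lambda * tau / (c * mu) of the multi-server model."""
    lam = mean(arrival_pmf)    # mean number of arrivals per slot
    tau = mean(demand_pmf)     # mean service demand in work units
    mu = mean(capacity_pmf)    # mean service capacity per server per slot
    return lam * tau / (c * mu)


def is_stable(arrival_pmf, demand_pmf, capacity_pmf, c):
    """Stability condition: lambda * tau < c * mu, i.e., rho < 1."""
    return load(arrival_pmf, demand_pmf, capacity_pmf, c) < 1.0
```

For instance, with at most one arrival per slot (probability 0.5), a fixed demand of 2 work units, a capacity of 0 or 1 work unit with equal probability, and c = 4 servers, the load evaluates to 0.5.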

A concept that will also be used in the analysis of this system is the concept of “unfinished work”, which is simply the sum of all remaining work units of the service demand of all customers in the system. The unfinished work observed by a customer C is defined as the unfinished work in the system right after customer C arrives, i.e., it is the sum of three independent terms:

1. The (remaining) service demand of all customers already in the system before the arrival slot of customer C and still in the system at the end of the arrival slot of customer C;
2. The work units of other customers arriving in the same slot as customer C but arriving before customer C;
3. The service demand $S_{C}$ of customer C itself.

3. Approximations

As mentioned in the introduction, obtaining the exact expressions for the system-content and the customer-delay distributions for this general model is extremely difficult. To the best of our knowledge, none of the techniques commonly used in the queuing-theory literature can solve this queuing model exactly. Therefore, in this section, we will explore several approaches to obtain approximations for the pgfs of the system-content and customer-delay distribution.

3.1. Geometric Service Times: The GI-Geom-c Approximation

If the mean service demand

τ

of a customer is larger than the mean service capacity

μ

of each server, then we can obtain our first approximation by assuming that each server serves at most one customer at a time and that, in each slot, the service of any customer in service ends with probability

μ / τ

, independently from slot to slot and from customer to customer. Under these assumptions, the system behaves identically to the classical multi-server model with (shifted) geometric service times with mean

τ / μ

. An expression for the pgf of the steady-state system content and customer delay was obtained in [19]. We refer to this approximation as the GI-Geom-c approximation.
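To see why this construction yields the stated mean, write q = μ/τ for the assumed per-slot completion probability and T for the resulting service time in slots (both symbols are introduced here for illustration only). Then T is shifted geometric:

```latex
P(T = n) = (1 - q)^{n - 1} q, \quad n \geq 1,
\qquad
E[T] = \sum_{n = 1}^{\infty} n (1 - q)^{n - 1} q = \frac{1}{q} = \frac{\tau}{\mu} .
```

Note that q is a valid probability only when τ ≥ μ, which is consistent with the condition under which this approximation is introduced.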

3.2. Deterministic Service Times: The GI-D-c Approximation

Similarly to the previous approximation, when

τ

is an integer multiple of

μ

, we can obtain another approximation by considering the system as a classical multi-server queuing system with deterministic service times of

τ / μ

slots. An expression for the pgf of the system content for this type of system was obtained in [20], and an expression for the pgf of the customer delay was obtained in [21]. We refer to this approximation as the GI-D-c approximation.

3.3. The Single-Server Approximation

The next approximation that we will investigate, which we will refer to as the Single-Server approximation, is to consider the c servers as a single server whose service capacity in each slot is equal to the combined service capacity of the c servers in the original queuing system during that same slot. In other words, this is the same as assuming that the c servers all work together on serving the customer at the front of the queue, in a strictly FCFS manner, rather than each server serving their own customer(s) or potentially being idle if the number of customers in the system is less than c.

Since the service capacity of any given server in an arbitrary slot in steady state is assumed to be independent from the service capacities of the other servers and to have rational pgf

R (z)

, the combined service capacity of all c servers has pgf

R {(z)}^{c}

. This pgf

R {(z)}^{c}

is also a rational function, where the denominator of

R {(1 / z)}^{c}

is a polynomial in z of degree

c m

, with m being the degree of the denominator of

R (1 / z)

. An expression for the pgf, mean, and other moments of the system-content and customer-delay distributions in this single-server model was obtained in [10]. Note that the roots for

ξ

of the characteristic equation

1 - z R {(1 / ξ)}^{c} = 0

can be found by calculating the roots of the characteristic equation

1 - z_{i} R (1 / ξ) = 0

of the single-server system, where

z_{i}, i = 0, . . ., c - 1

are the complex cth order roots of z. This implies that the former set of roots are distinct for all z if and only if the latter roots are distinct for all z (as can also be seen in [10], Appendix D).
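The relation between the two characteristic equations can be checked numerically. The sketch below does this for one illustrative choice that is not taken from the paper: geometric service capacities with P(R = n) = (1 - p) p^n, for which R(1/ξ) = (1 - p) ξ / (ξ - p), so that P_R(ξ) = (1 - p) ξ, Q_R(ξ) = ξ - p and m = 1. All function names are hypothetical.

```python
import cmath


def cth_roots(z, c):
    """Return the c complex c-th roots z_i of z."""
    r, phi = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / c), (phi + 2 * cmath.pi * i) / c)
            for i in range(c)]


def zeros_via_single_server(z, p, c):
    """Solve Q_R(xi) - z_i * P_R(xi) = 0 for every c-th root z_i of z.

    For the assumed geometric capacities this equation is linear in xi:
    xi - p - z_i * (1 - p) * xi = 0.
    """
    return [p / (1 - z_i * (1 - p)) for z_i in cth_roots(z, c)]


def combined_characteristic(xi, z, p, c):
    """Evaluate Q_R(xi)**c - z * P_R(xi)**c, i.e., the numerator of
    1 - z * R(1/xi)**c for the assumed geometric capacities."""
    return (xi - p) ** c - z * ((1 - p) * xi) ** c
```

Each of the c values returned by `zeros_via_single_server` makes `combined_characteristic` vanish up to rounding error. Since that polynomial has degree c, these values are all of its zeros when they are distinct.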

3.4. The Infinite-Server Approximation

Another approximation, which is only useful for situations where the load is very low, is to assume that each arriving customer always finds an idle server and can begin its service in the first slot after its arrival slot. This is effectively the same as assuming that there are an infinite number of servers, so we refer to this approximation as the Infinite-Server approximation.

With this assumption of an infinite number of servers, the distribution of the customer delay is very straightforward to calculate. Since each customer always finds an idle server upon arrival, it can begin service in the next slot after its arrival, and it finishes service as soon as the cumulative service capacity of the server it is assigned to matches or exceeds the service demand of the customer. We can therefore use Theorem A1 in Appendix A to obtain our approximation of the pgf

D (z)

of the customer delay

D_{C}

as

D (z) = \frac{z - 1}{z} \sum_{k = 0}^{m - 1} \frac{S (α_{k} (z))}{R^{'} (1 / α_{k} (z))} \cdot \frac{α_{k} (z)}{α_{k} (z) - 1},

(3)

where the zeros

α_{k} (z)

are, as defined in Theorem A1, the m zeros for

ξ

of (A2). Note that this result does not depend on the pgf

A (z)

of the number of arrivals, as is expected for an infinite-server system.
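Expression (3) can be evaluated numerically in a simple special case. The sketch below again assumes geometric service capacities, P(R = n) = (1 - p) p^n, which are not prescribed by the paper. It further assumes that the zeros α_k(z) are the zeros for ξ of Q_R(ξ) - z P_R(ξ), in line with the definition of α_k^{(c)}(z) used for (9) with c = 1. Under these assumptions m = 1 and the single zero is p / (1 - z (1 - p)). The second function computes the same pgf directly from the definition of the delay as the first slot in which the accumulated capacity matches or exceeds the service demand. The function names and the dict-based pmf representation are hypothetical.

```python
from math import comb


def delay_pgf_formula(z, demand_pmf, p):
    """Evaluate (3) for geometric capacities; demand_pmf maps s -> P(S = s)."""
    alpha = p / (1 - z * (1 - p))              # single zero (m = 1)
    x = 1 / alpha
    r_prime = (1 - p) * p / (1 - p * x) ** 2   # R'(x) for R(x) = (1-p)/(1-p*x)
    s_alpha = sum(q * alpha ** s for s, q in demand_pmf.items())
    return (z - 1) / z * s_alpha / r_prime * alpha / (alpha - 1)


def delay_pgf_direct(z, demand_pmf, p, max_slots=200):
    """Truncated sum of P(D = k) z^k with D = min{k : R_1 + ... + R_k >= S}."""
    def prob_capacity_below(k, s):
        # P(R_1 + ... + R_k < s); the sum of k geometrics is negative binomial.
        if k == 0:
            return 1.0  # no capacity accumulated yet and s >= 1
        return sum(comb(n + k - 1, n) * (1 - p) ** k * p ** n for n in range(s))

    total = 0.0
    for k in range(1, max_slots + 1):
        prob_k = sum(q * (prob_capacity_below(k - 1, s) - prob_capacity_below(k, s))
                     for s, q in demand_pmf.items())
        total += prob_k * z ** k
    return total
```

For any 0 < z < 1 the two functions should agree up to rounding and truncation error. Neither computation involves the arrival process, as expected for an infinite-server system.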

3.5. The Two-Phase Approximation

Obviously, the above Infinite-Server approximation is very poor if the load is high, but it turns out to be a very good approximation for low loads. In fact, unlike all our other approximations so far, this approximation does become exact as the load goes to 0.

Conversely, the behavior of the Single-Server approximation is in many regards the opposite. For low loads, where most customers are served while they are the only customer in the system, the Single-Server approximation assumes that these lone customers are served with the combined service capacity of all servers in the system, while in reality, they only receive service from a single server with a mean service capacity

μ

. Thus, if the load is low, the Single-Server approximation is expected to be a poor approximation. However, for a high load, where most of the time there are many customers in the system and the delays of most customers are very long, the Single-Server approximation becomes a lot better. Indeed, while a customer C is waiting in the queue because all servers are fully occupied, the system with multiple servers behaves very similarly to the Single-Server approximation with regard to customer C. In this case, the number of unfinished work units that are “in front” of customer C, i.e., belonging to customers that are either in service or in front of customer C in the queue, is easily seen to decrease every slot by the combined service capacity of all servers. The difference between both systems only manifests itself for customer C when customer C enters service. At that point, it only takes approximately

τ / μ

slots on average before customer C leaves the multi-server system. Since this final contribution of roughly

τ / μ

slots does not depend on the arrival rate

λ

while the total mean customer delay grows unboundedly as

λ

increases, the relative difference between the multi-server system and the Single-Server approximation goes to 0 as the load goes to 1.

We therefore have one approximation, namely the Infinite-Server approximation, that is very good if the load is low, but terrible if the load is high, and another approximation, namely the Single-Server one, that is good for high load, but poor for low load. Luckily, it is possible to combine both approaches and create an approximation that is good for both high and low load. This is exactly what we will do in this subsection.

Specifically, we proceed by approximating the number of slots that the customer spends in the shared queue, i.e., the “waiting time”, and the number of slots that the customer spends in service, i.e., the “service time”, separately, and we make the simplifying assumption that these two quantities are independent.

Note that the service-time distribution is not simply a function of the service-demand and the service-capacity distributions. It also depends on other quantities including the distribution of the number of arrivals and the load. For instance, when the load is low, most customers arrive in an empty system, and the available service capacity for the first slot of service of a customer will be close to the service capacity in an arbitrary slot. On the contrary, if the load is high, most customers’ first slot of service is also the last slot of service of a different customer, and the available service capacity of the server during that slot is divided between both customers. Generally, determining the exact service-time distribution is expected to be as difficult as determining the distribution of the steady-state system content or customer delay.

As already indicated, the goal in this subsection is to obtain an approximation which is simultaneously accurate for both high and low loads. Because the waiting time is the dominant term of the customer delay when the load is high, while the service time is the dominant term when the load is low, we will attempt to find an approximation for the waiting-time distribution which is as accurate as possible when the load is high while we will attempt to find an approximation for the service-time distribution which is as accurate as possible when the load is low. The latter is achieved by assuming that, during the first slot of service of an arbitrary customer C, the distribution of the service capacity that the server can spend on this customer is the same as the distribution of the service capacity in an arbitrary slot, i.e., that the server that the customer went to does not have other customers making use of the server’s available service capacity. The probability that this is true approaches 1 as the load approaches 0, because then the expected number of other customers in the system at the beginning of the arrival slot of an arbitrary customer also approaches 0. With this assumption, our approximation for the service-time distribution is in fact the same as the distribution of the customer delay under the Infinite-Server approximation. We denote the pgf of this distribution as

S_{\infty} (z)

.

Approximating the waiting-time distribution of the arbitrarily selected customer C is more difficult. As the waiting time is the dominant term of the customer delay

D_{C}

for a high load, we want our approximation to be as accurate as possible when the load is high. Under high load, in most slots, it is the case that all servers are fully occupied serving customers, i.e., using all their available service capacity. During these slots, the service process is work-conserving, and the unfinished work in the system decreases by the combined service capacity of all servers (and increases by the combined service demand of all arrivals), just like in a single-server system. It is therefore reasonable to assume that if the load is high, the distribution of the unfinished work (both at the beginning of an arbitrary slot and as observed by an arbitrary customer) is similar to that in a single-server system with the same total service capacity (as considered under the Single-Server approximation), with the relative difference between the two decreasing as the load increases.

We will now obtain an approximation for the pgf

V (z)

of the unfinished work

V_{C}

observed by an arbitrary customer C. As noted in Section 2,

V_{C}

consists of three independent terms:

1. The (remaining) service demand of all customers already in the system before the arrival slot of customer C and still in the system at the end of the arrival slot of customer C;
2. The work units of other customers arriving in the same slot as customer C but arriving before customer C;
3. The service demand $S_{C}$ of customer C itself.

We now consider each of these three terms separately. The third term, of course, simply has pgf

S (z)

. For the second term, it is a well-known property from renewal theory (see, e.g., [27,28]) that for any queue with independent, ordered arrivals, the pgf

F (z)

of the number of customers arriving in the same slot as an arbitrary customer C but arriving before customer C is given by

F (z) = \frac{A (z) - 1}{λ (z - 1)} .

(4)

The pgf of the total number of work units of these customers, i.e., the second term above, is then given by

F (S (z))

. By the reasoning in the previous paragraphs, the pgf of the first term is approximately equal to

U (z) / A (S (z))

(as also seen in Appendix B), where

U (z)

is now the pgf of the unfinished work at the beginning of an arbitrary slot in the multi-server system as obtained under our Single-Server approximation, for which an expression can be obtained from Theorem A2 (for the case where the single server has a service-capacity pgf

R {(z)}^{c}

and mean service capacity

c μ

).
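The renewal-theory result (4) has a simple interpretation at the level of probability mass functions: the probability that j customers arrive before the tagged customer in the same slot equals P(A > j) / λ. The following minimal sketch, with hypothetical function names and a list-based pmf (index n holds the probability of n arrivals), computes this pmf so that it can be compared with (4).

```python
def arrivals_before_tagged_pmf(arrival_pmf):
    """pmf of the number of same-slot arrivals ahead of an arbitrary customer.

    Entry j equals P(A > j) / lambda, whose pgf is (A(z) - 1) / (lambda (z - 1)).
    """
    lam = sum(n * q for n, q in enumerate(arrival_pmf))
    return [sum(arrival_pmf[j + 1:]) / lam for j in range(len(arrival_pmf) - 1)]


def pgf(pmf, z):
    """Evaluate the pgf of a list-based pmf at the point z."""
    return sum(q * z ** n for n, q in enumerate(pmf))
```

For example, with P(A = 0) = 0.5, P(A = 1) = 0.3 and P(A = 2) = 0.2, the tagged customer is first in its slot with probability 0.5 / 0.7 and second with probability 0.2 / 0.7.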

We now wish to approximate the pgf

W (z)

of the waiting time

W_{C}

of the arbitrary customer C. To do this, we begin by defining

{\hat{V}}_{C}

as the unfinished work observed by customer C belonging to customers other than C. In other words,

{\hat{V}}_{C}

is equal to the sum of the first and second term of the unfinished work

V_{C}

observed by customer C above. We also define

R^{(k, c)}

as the combined service capacity of all c servers during the k slots following the arrival slot of customer C. Note that the pgf of

R^{(k, c)}

is

R {(z)}^{k c}

, while the pgf

\hat{V} (z)

of

{\hat{V}}_{C}

is

\begin{matrix} \hat{V} (z) & = \frac{U (z)}{A (S (z))} \frac{A (S (z)) - 1}{λ (S (z) - 1)} . \end{matrix}

(5)

If

R (z)

is a rational function, where

P_{R} (z)

and

Q_{R} (z)

denote the mutually prime polynomials in, respectively, the numerator and denominator of

R (1 / z)

, we can use Theorem A2 to obtain the expression

\begin{matrix} \hat{V} (z) & = \frac{c μ - λ τ}{λ} \cdot \frac{z - 1}{1 - R {(1 / z)}^{c} A (S (z))} \cdot \frac{A (S (z)) - 1}{S (z) - 1} \\ \cdot \prod_{ζ \in S_{R}^{- 1}} {(\frac{1 - ζ}{z - ζ})}^{μ_{ζ} c} \prod_{ξ \in N_{T_{c}}^{-}} {(\frac{z - ξ}{1 - ξ})}^{n_{ξ}}, \end{matrix}

(6)

where

S_{R}^{- 1}

denotes the set of poles of

R (1 / z)

,

μ_{ζ}

denotes the multiplicity of the pole

ζ \in S_{R}^{- 1}

,

N_{T_{c}}^{-}

denotes the set of zeros of

T_{c} (z) ≜ Q_{R} {(z)}^{c} - A (S (z)) P_{R} {(z)}^{c}

inside or on the unit circle, excluding the zero at

z = 1

, and

n_{ξ}

denotes the multiplicity of a zero

ξ

in this set.

Now note that if this combined service capacity

R^{(k, c)}

of all servers during the k slots after the arrival slot of customer C is larger than

{\hat{V}}_{C}

, then customer C cannot be waiting in the queue anymore after k slots. Indeed, while a customer is in the queue, all servers must be fully occupied, in which case the unfinished work in front of that customer decreases by the combined service capacity of all servers. Thus, if the customer C would still be waiting in the queue after k slots, the unfinished work in front of the customer would be equal to

{\hat{V}}_{C} - R^{(k, c)}

, which in this case is negative, and therefore impossible. In conclusion, we have the following implication:

R^{(k, c)} > {\hat{V}}_{C} \Rightarrow W_{C} < k .

(7)

Unfortunately, the reverse of this implication is not necessarily true. In other words, if the combined service capacity

R^{(k, c)}

during the k slots after customer C’s arrival slot is less than or equal to the unfinished work

{\hat{V}}_{C}

, then that does not necessarily mean that the customer C is still waiting in the queue after k slots. Indeed, it may happen that some of these

{\hat{V}}_{C}

work units in front of customer C belong to a customer

C_{2}

that entered the service with a server whose (remaining) available service capacity is less than the service demand of the customer

C_{2}

. In that case, the service of customer

C_{2}

will not finish during that slot, even if there are other servers with available service capacity, because each customer can only receive the service from a single server. These other servers will instead serve different customers, potentially including the arbitrary customer C. Thus, while these excess work units of the service demand of customer

C_{2}

count towards the unfinished work

{\hat{V}}_{C}

in front of customer C, they do not prevent customer C from going to a different server. Intuitively, such a situation, in which the reverse of (7) does not hold, seems more likely when the variance of the service demands is higher.

Calculating the exact value that the combined service capacities of the servers needs to reach in order for customer C to be able to begin its service, or calculating the exact waiting time

W_{C}

of customer C, is very difficult. Therefore, we make a final simplifying assumption that the implication (7) is nonetheless a two-way relationship. In other words, we assume that

R^{(k, c)} > {\hat{V}}_{C} \Leftrightarrow W_{C} < k .

(8)

Note that if the service demands of all customers other than C are equal to one work unit, then this relationship (8) is always true, because if

{\hat{V}}_{C} \geq R^{(k, c)}

then there are

{\hat{V}}_{C} - R^{(k, c)} \geq 0

customers in front of customer C in the queue at the beginning of the kth slot after the arrival slot of customer C, so that customer C is still waiting in the queue after k slots. However, if the service demands of the other customers are not always equal to 1 work unit, the assumption that (8) is true is an approximation. Moreover, as already indicated above, it is intuitively expected that (8) is more likely to break down in case of service demands with a higher variance.

Under the assumption that the waiting time

W_{C}

ends when the accumulated service capacity

R^{(k, c)}

matches or exceeds

{\hat{V}}_{C}

, we can use Theorem A1 to obtain our approximation for the pgf

W (z)

of the waiting time as

\begin{matrix} W (z) & = \frac{z - 1}{z} \sum_{k = 0}^{c m - 1} \frac{\hat{V} (α_{k}^{(c)} (z))}{c R^{'} (1 / α_{k}^{(c)} (z)) R {(1 / α_{k}^{(c)} (z))}^{c - 1}} \cdot \frac{α_{k}^{(c)} (z)}{α_{k}^{(c)} (z) - 1} \\ = \frac{c μ - λ τ}{λ} \sum_{k = 0}^{c m - 1} \frac{1 - z}{c R^{'} (1 / α_{k}^{(c)} (z)) R {(1 / α_{k}^{(c)} (z))}^{c - 1}} \cdot \frac{α_{k}^{(c)} (z)}{S (α_{k}^{(c)} (z)) - 1} \\ \cdot \frac{1 - A (S (α_{k}^{(c)} (z)))}{z - A (S (α_{k}^{(c)} (z)))} \prod_{ζ \in S_{R}^{- 1}} {(\frac{1 - ζ}{α_{k}^{(c)} (z) - ζ})}^{μ_{ζ} c} \prod_{ξ \in N_{T_{c}}^{-}} {(\frac{α_{k}^{(c)} (z) - ξ}{1 - ξ})}^{n_{ξ}}, \end{matrix}

(9)

where

α_{k}^{(c)} (z)

denotes the kth zero for

ξ

of

Q_{R} {(ξ)}^{c} - z P_{R} {(ξ)}^{c}

.
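For capacity distributions with a simple structure, these zeros can be written down explicitly. The sketch below assumes, purely for illustration, a negative-binomial capacity pgf of the form R(x) = ((1 - p) / (1 - p x))^r, which may differ from the exact parametrization used in this paper; the function names are hypothetical. Under that assumption, Q_R(ξ)^c - z P_R(ξ)^c = (ξ - p)^{rc} - z ((1 - p) ξ)^{rc}, so the c m = r c zeros are ξ = p / (1 - (1 - p) w), where w runs over the (rc)th roots of z.

```python
import cmath

def nb_capacity_zeros(z, p, phases, c):
    """Zeros xi of Q_R(xi)^c - z * P_R(xi)^c, ASSUMING the capacity pgf
    R(x) = ((1 - p) / (1 - p * x)) ** phases (illustrative parametrization)."""
    q = 1.0 - p
    n = phases * c                      # c * m zeros, with m = phases
    root = complex(z) ** (1.0 / n)      # principal n-th root of z
    zeros = []
    for k in range(n):
        w = root * cmath.exp(2j * cmath.pi * k / n)   # k-th n-th root of z
        zeros.append(p / (1.0 - q * w))
    return zeros

def combined_capacity_pgf(x, p, phases, c):
    """pgf R(x)^c of the combined capacity of c independent servers."""
    return ((1.0 - p) / (1.0 - p * x)) ** (phases * c)

# Example: p = 2/3 and 5 phases give a mean capacity of 5 * p / (1 - p) = 10.
zeros = nb_capacity_zeros(0.5, 2.0 / 3.0, phases=5, c=5)
```

Each returned zero α satisfies R(1/α)^c = 1/z, which offers a quick numerical sanity check before the zeros are substituted into (9).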

Our final approximation for the pgf

D (z)

of customer delay is then obtained using our aforementioned assumption that the waiting time and service time of a customer are independent. This leads to

D (z) = W (z) S_{\infty} (z)

, where

S_{\infty} (z)

is the pgf of the customer delay under the Infinite-Server approximation.
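In terms of probability mass functions, a product of pgfs corresponds to a discrete convolution. The following minimal sketch illustrates this with made-up pmfs (the numbers are illustrative assumptions, not results from this paper) and checks that the mean delay is the sum of the mean waiting time and the mean service time.

```python
def convolve(pmf_a, pmf_b):
    """pmf of the sum of two independent non-negative integer variables."""
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out

def mean(pmf):
    """Mean of a pmf given as a list indexed by the value."""
    return sum(n * p for n, p in enumerate(pmf))

# Hypothetical pmfs: waiting time of 0 or 1 slots, service time of 1 or 2 slots.
waiting_pmf = [0.5, 0.5]
service_pmf = [0.0, 0.25, 0.75]
delay_pmf = convolve(waiting_pmf, service_pmf)
```

Here `delay_pmf[n]` plays the role of the coefficient of z^n in D(z) = W(z) S_∞(z).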

We refer to the approximation derived in this subsection as the Two-Phase approximation, highlighting the fact that this approximation estimates the durations of the queuing phase and the service phase separately. Informally, this model assumes that, in the first phase, a customer C waits in the queue until the unfinished work belonging to other customers with service priority over C is executed, which is assumed to happen at a rate corresponding to the combined service capacity of all servers. In the second phase, while the customer C is in service, this customer must wait until the

S_{C}

work units of its own service demand are executed, which is assumed to happen at a rate corresponding to the service capacity of a single server.

While this model only provides an approximation for the pgf of the customer delay, and while there is no clear way to obtain a similar approximation for the pgf of the system content, an approximation for the mean system content can always be derived from the approximate mean customer delay using Little’s law (as can be seen in [29,30]), as this (simple) relationship applies to any queuing system in steady state.
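As a small illustration of this step, Little’s law states that the mean system content equals the mean arrival rate multiplied by the mean customer delay. The function name and the numbers below are illustrative assumptions, not values taken from the numerical examples.

```python
def mean_system_content(arrival_rate, mean_delay):
    """Little's law: mean number of customers in the system in steady state."""
    return arrival_rate * mean_delay

# Hypothetical values: 0.5 arrivals per slot and a mean delay of 8 slots.
content = mean_system_content(0.5, 8.0)
```

Any of the mean-delay approximations of this section can be passed in as `mean_delay`, provided the arrival rate is expressed per slot.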

3.6. Monte-Carlo Simulations

Finally, in order to be able to compare the accuracy of the various approximations derived in this paper, we also use Monte-Carlo simulations of the model with multiple servers over

10^{8}

time slots. While these simulations are in most cases more accurate than the other approximations derived in this paper and are therefore taken as the “ground truth” in the next section, these simulations can often take an extremely long time to complete, whereas the analytical expressions usually only take a few seconds to evaluate. For brevity, these simulations are referred to as the Monte-Carlo estimate in the next section.
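To make the idea concrete, the following is a minimal sketch of such a slot-by-slot simulation, not the simulator used for the results in this paper. It relies on several assumptions that are ours: customers arriving in a slot can be served from the next slot onward, servers are processed in index order within a slot, a server that finishes a customer spends its leftover capacity on the next waiting customer in the same slot, and the delay of a customer is counted in whole slots after its arrival slot. All names are hypothetical.

```python
from collections import deque

def simulate_mean_delay(num_slots, c, sample_arrivals, sample_demand, sample_capacity):
    """Estimate the mean customer delay (in slots) of a FCFS queue with c servers.

    The three sample_* arguments are zero-argument callables returning
    non-negative integers (arrivals per slot, work units per customer,
    work units a server can execute in one slot).
    """
    queue = deque()             # waiting customers as [arrival_slot, remaining_work]
    in_service = [None] * c     # customer currently held by each server
    total_delay = 0
    served = 0
    for slot in range(num_slots):
        for s in range(c):
            capacity = sample_capacity()
            while capacity > 0:
                if in_service[s] is None:
                    if not queue:
                        break   # nothing to serve: remaining capacity is lost
                    in_service[s] = queue.popleft()
                customer = in_service[s]
                work = min(capacity, customer[1])
                customer[1] -= work
                capacity -= work
                if customer[1] == 0:        # service completed in this slot
                    total_delay += slot - customer[0]
                    served += 1
                    in_service[s] = None
        # Assumption: arrivals of this slot join the queue at the end of the slot.
        for _ in range(sample_arrivals()):
            queue.append([slot, sample_demand()])
    return total_delay / served if served else float("nan")

# Deterministic toy inputs: one arrival per slot, each demanding 4 work units.
two_servers = simulate_mean_delay(50, 2, lambda: 1, lambda: 4, lambda: 2)
one_server = simulate_mean_delay(50, 1, lambda: 1, lambda: 4, lambda: 4)
```

In this toy setting, both systems offer 4 work units of capacity per slot, yet the two-server system needs two slots per customer because a customer can only use one server at a time. Random samplers (for instance Poisson arrivals and negative-binomial capacities) can be passed in the same way.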

4. Numerical Examples

In this section, we consider several numerical examples to test the accuracy of the various approximations described in the previous section.

In our first numerical example, shown in Figure 1, we study the impact of the load

ρ

on the accuracy of our approximations of the mean customer delay. We consider a queuing system with

c = 5

servers, where the service capacities of the servers are independent and have a negative-binomial distribution with 5 phases and with a mean value of

μ = 10

work units. The service demands have a (shifted) negative-binomial distribution, with five phases but with a mean value of

τ = 50

work units. The number of arrivals per slot follows a Poisson distribution with mean

λ

that is scaled such that the system obtains the desired load

ρ = λ τ / (c μ)

. In this example,

λ

thus equals

ρ

.
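The scaling of the arrival rate can be sketched as follows, reading the load as ρ = λτ/(cμ); the function name is a hypothetical helper.

```python
def arrival_rate_for_load(rho, c, mu, tau):
    """Mean number of arrivals per slot that yields load rho = lam * tau / (c * mu)."""
    return rho * c * mu / tau

# Parameters of this example: c = 5 servers, mu = 10 and tau = 50 work units.
lam = arrival_rate_for_load(0.5, c=5, mu=10, tau=50)
```

With these parameters c μ = τ, which is why the arrival rate coincides with the load.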

Unsurprisingly, as the load

ρ

increases to 1, the mean customer delay approaches ∞. As can be seen in Figure 1a, all of our approximations model this behavior correctly except for the Infinite-Server approximation, which does not take the arrival rate into account at all. For high loads, however, it is difficult to see in Figure 1a which approximations are most accurate. Therefore, in Figure 1b, we show each of our estimates relative to the estimate from the Monte-Carlo simulations, i.e., Figure 1b shows the ratio between each of our mean delay approximations and the Monte-Carlo estimate. In this figure, it can be seen that the Infinite-Server approximation is very accurate if the load is less than 0.4, but falls off dramatically afterwards, as expected. Conversely, the Single-Server approximation is very inaccurate for loads less than 0.9, but becomes more and more accurate as the load approaches 1. This agrees with the discussion in Section 3.5. It can also be seen that the GI-D-c approximation consistently underestimates the mean delay, which is in accordance with observations in [8,10] that a lower variance of the service-demand or service-capacity distributions typically results in lower mean system content as well as mean delay in single-server systems. The same thus also appears to be true for this multi-server model. The GI-Geom-c approximation interestingly underestimates the mean customer delay for low loads while it overestimates the mean delay for high loads. This can be explained by the fact that this approximation estimates the mean service time for all loads as simply

τ / μ

, in this case five slots, whereas the real mean service time for low load is slightly higher, namely approximately 5.6 slots, as can be seen, e.g., from the simulation results in Figure 1a, as the load

ρ

goes to 0. For high loads, this model overestimates the mean customer delay because it assumes that the service times are geometrically distributed, whereas the relatively low variance (in comparison to the geometric distribution) of both the service-demand and the service-capacity distributions causes the real service times to have relatively low variance as well. This is true for all values of the load

ρ

. Finally, the Two-Phase approach accurately approximates the mean customer delay for both high and low loads

ρ

, as intended. Even for intermediate values of the load

ρ

, the Two-Phase estimate of the mean customer delay never deviates from the value obtained by the Monte-Carlo simulations by more than 5% (i.e., the ratio between the Two-Phase estimate and the Monte-Carlo estimate in Figure 1b is always between 0.95 and 1.05).

In our second numerical example, shown in Figure 2, we study the effect of the number of servers on the accuracy of our various approximations of the mean customer delay. The distributions of the service demands and the service capacities are the same as in the first numerical example. The distribution of the number of arrivals per slot is again a Poisson distribution, although for this example, the arrival rate

λ

is always chosen such that the load

ρ = 0.95

. This high load makes the Infinite-Server estimate a poor approximation, similar to the previous example for high

ρ

, especially if the number of servers c is low. It is no surprise that as the number of servers increases, the quality of the Infinite-Server approximation increases while the quality of the Single-Server approximation decreases. The Two-Phase approximation is relatively accurate regardless of the number of servers, deviating by no more than 4% from the Monte-Carlo estimate. As in the previous example, the GI-D-c approximation consistently underestimates the mean customer delay. The GI-Geom-c approximation overestimates the mean delay because the load is high, although it appears to become very accurate as the number of servers increases.

However, this last observation is mostly a coincidence, as we will demonstrate in our third numerical example, shown in Figure 3. In this example, all the parameters are the same as in our second example, with the only difference being that the service demands follow a “bursty” distribution, where each customer has a 95% chance of having a geometrically distributed service demand with mean 30 work units and a 5% chance of having a (shifted) geometrically distributed service demand with mean 430 work units. The mean service demand is still

τ = 50

work units. Because the service demands have a much higher variance, the mean customer delay is much higher in this case. As neither the GI-Geom-c nor the GI-D-c approximation takes the variance of the service-demand distribution into account, these approximations significantly underestimate the mean customer delay. The Two-Phase approximation is also less accurate than in the previous example, deviating up to 25% from the Monte-Carlo estimate. This is because the assumption that (8) is true in the Two-Phase approach overestimates the waiting time

W_{C}

by an amount that depends on the variance of the service-demand distribution. Indeed, as explained in Section 3.5, this assumption that (8) is true is exact if the service demands of the customers other than C are all deterministically equal to one work unit. Using a similar reasoning, it can be seen that the higher the variance of the service demands becomes, the less realistic this assumption becomes. For instance, if both the variance of the service demands as well as the amount of unfinished work

{\hat{V}}_{C}

in front of customer C are large, then it is not unlikely that most of these work units are nevertheless only part of the service demand of a single customer, and are therefore only occupying a single server, leaving the other servers free to serve other customers, potentially including the arbitrary customer C. Therefore, we conclude that the Two-Phase approximation becomes less accurate if the variance of the service demands is high. In Figure 3, it can even be seen that the Single-Server approximation is more accurate than the Two-Phase approximation in this case, although this is aided by the fact that the load

ρ = 0.95

is relatively high.
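The mean and variance of this bursty service-demand distribution can be checked with exact arithmetic. The sketch below assumes that both mixture components are geometric distributions on {1, 2, ...} (so that a component with mean m has variance m(m - 1)); if a component starts at 0 instead, its variance would be m(m + 1). The helper names are hypothetical.

```python
from fractions import Fraction

def shifted_geometric_moments(mean):
    """First and second moments of a geometric distribution on {1, 2, ...}."""
    m = Fraction(mean)
    variance = m * (m - 1)
    return m, variance + m * m

def mixture_mean_and_variance(components):
    """components: list of (weight, component mean) pairs, weights summing to 1."""
    mean = sum(w * shifted_geometric_moments(m)[0] for w, m in components)
    second = sum(w * shifted_geometric_moments(m)[1] for w, m in components)
    return mean, second - mean ** 2

bursty = [(Fraction(95, 100), 30), (Fraction(5, 100), 430)]
bursty_mean, bursty_variance = mixture_mean_and_variance(bursty)
```

Under this assumption, the mixture has a mean of 50 work units and a variance of 17650, i.e., a standard deviation of roughly 133 work units.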

In our next numerical example, shown in Figure 4, we study the impact of the ratio

τ / μ

, which is a rough approximation for the mean “service time”. In this example, there are

c = 5

servers, and the service capacities follow a uniform distribution between 0 and 10 work units (inclusive), so that the mean service capacity

μ

of a server is equal to 5 work units, while the service demands are deterministically equal to

τ

work units. As the GI-D-c approximation is only defined when

τ / μ

is a positive integer, in this example, we will only consider values of

τ

which are multiples of the mean service capacity

μ = 5

. The arrival process is again a Poisson process, and the arrival rate

λ

is scaled such that the load

ρ

is always equal to 0.9. In Figure 4, it can be seen that as the ratio

τ / μ

increases, the mean customer delay also increases. This is a consequence of the fact that the mean “service time” of an arbitrary customer increases as

τ / μ

increases. Note that in Figure 4, the Two-Phase approximation is relatively close to the Monte-Carlo estimate, with a relative error of no more than 5%. This is in agreement with the discussion in the previous numerical example with respect to the impact of the variance of the service-demand distribution on the accuracy in the Two-Phase approximation, because the variance of the service demands in this example is low (zero).

The mean customer delay is not the only performance characteristic that one might be interested in. The variance, other moments, tail probabilities, or the entire probability mass function of the customer delay may also be of interest. In Figure 5, we show the probability mass function of the customer delay for the various approximations discussed in Section 3. The arrival, service-demand, and service-capacity distributions and the number of servers

c = 5

are the same as in Figure 1, with the arrival rate

λ

chosen such that the load

ρ = 0.7

. In Figure 5, it can be seen that the Two-Phase approach provides a good approximation for the probability distribution of the customer delay. In fact, it appears that the rate of exponential decay of the probability mass function is the same for the Single-Server approximation and the Two-Phase approximation as for the real multi-server queuing system. This is the case not only for this particular example but also for all other examples that we have explored. This can be explained as follows. As mentioned in Section 3.5, when all servers are busy in a certain slot k, i.e., when the total available service capacity is utilized during slot k, the total number of unfinished work units in front of a customer in the queue decreases by the total combined service capacity of all servers. This holds for both the Single-Server approximation and the Two-Phase approach (in the term corresponding to the queuing phase) as well as for the real system. Furthermore, as the delay of a customer C increases, the probability that all servers are occupied during all the slots where this customer C is in the system approaches 1. If there are many customers in front of customer C when this customer arrives, then it will take a similar amount of time for customer C to reach the front of the queue regardless of whether there is only one server or multiple (provided, of course, that the combined service capacity is the same). Therefore, the ratio of

P (D_{C} = n + 1) / P (D_{C} = n)

should be similar for large n for the Single-Server estimate, the Two-Phase approach, and the real multi-server queuing system.
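The decay rate can be read off a probability mass function by inspecting the ratio of successive probabilities. The sketch below uses two made-up pmfs (mixtures of geometric terms, chosen only for illustration and unrelated to the figures of this paper) that differ for small n but share the same dominant geometric term, and therefore the same tail ratio.

```python
def tail_decay_ratios(pmf):
    """Successive ratios pmf[n + 1] / pmf[n] wherever pmf[n] > 0."""
    return [pmf[n + 1] / pmf[n] for n in range(len(pmf) - 1) if pmf[n] > 0]

def geometric_mixture_pmf(weights, rates, length):
    """Illustrative pmf: pmf[n] = sum_i weights[i] * (1 - rates[i]) * rates[i] ** n."""
    return [sum(w * (1 - a) * a ** n for w, a in zip(weights, rates))
            for n in range(length)]

pmf_a = geometric_mixture_pmf([0.5, 0.5], [0.8, 0.3], 80)
pmf_b = geometric_mixture_pmf([0.9, 0.1], [0.5, 0.8], 80)
ratio_a = tail_decay_ratios(pmf_a)[-1]
ratio_b = tail_decay_ratios(pmf_b)[-1]
```

Both ratios approach 0.8 for large n, even though the two pmfs look quite different for small n.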

In our last numerical example, shown in Figure 6, we study the impact of the variance of the service-capacity distribution on the mean customer delay. In this example, there are

c = 5

servers, and the service capacities follow a negative-binomial distribution with a mean value of

μ = 5

work units and r phases, for various r. As r increases, the variance of the service-capacity distribution decreases. The service demands follow a (shifted) negative-binomial distribution with five phases and a mean of

τ = 10

work units, and the arrivals follow a Poisson distribution with a mean arrival rate

λ

chosen such that the load

ρ = 0.95

. Note that the GI-Geom-c and GI-D-c estimates do not depend on r, as expected. In Figure 6, it can be seen that the variance of the service-capacity distribution plays a relatively small role in determining the mean customer delay. Indeed, the more servers there are, the smaller the impact of the variance of the service capacity of a single server becomes.
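The dependence of the variance on r, and the reduced impact with more servers, can be sketched as follows. The code assumes that a negative-binomial capacity with r phases is the sum of r i.i.d. geometric variables on {0, 1, ...}, which gives a variance of μ(1 + μ/r); this parametrization is an assumption on our part. It also computes the squared coefficient of variation of the combined capacity of c independent servers, which scales as 1/c.

```python
def negative_binomial_variance(mean, phases):
    """Variance of a sum of `phases` i.i.d. geometric variables on {0, 1, ...}
    with total mean `mean` (assumed parametrization)."""
    return mean * (1 + mean / phases)

def combined_capacity_scv(mean, phases, c):
    """Squared coefficient of variation of the total capacity of c servers."""
    return negative_binomial_variance(mean, phases) / (c * mean ** 2)

variances = [negative_binomial_variance(5, r) for r in (1, 2, 5, 10)]
scv_one_server = combined_capacity_scv(5, 1, 1)
scv_five_servers = combined_capacity_scv(5, 1, 5)
```

Under this assumption, the variance drops from 30 for r = 1 to 10 for r = 5, while pooling five servers divides the squared coefficient of variation by five.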

5. Conclusions

In this paper, we studied a multi-server queuing model, where each customer can only receive service from a single server. This means that if there are fewer customers in the system than the number of servers, some of the available service capacity is not used, making the system not work-conserving and much harder to analyze. While obtaining exact expressions for the pgfs of the system content and the customer delay does not appear feasible using current techniques, we studied several approximations and compared the obtained estimates using numerical examples. One approximation in particular, the Two-Phase approximation, provides a good approximation for both the mean customer delay and the tail probabilities of the delay, for both low and high traffic loads. This approximation works especially well when the variance of the service demands is low.

Author Contributions

Conceptualization, M.D.M., S.W., H.B.; methodology, M.D.M.; software, M.D.M.; validation, M.D.M., S.W.; formal analysis, M.D.M. and S.W.; writing—original draft preparation, M.D.M.; writing—review and editing, S.W. and H.B.; visualization, M.D.M.; supervision, S.W., H.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The numerical results calculated in this study are presented in the figures in this article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

pgf	Probability generating function
i.i.d.	Independent and identically distributed
Geom	Geometric
GI	General and independent
FCFS	First-come–first-served

Appendix A. Time to Serve V Work Units

In this appendix, we state and prove a relationship between the pgf

V (z)

of an arbitrary positive random variable V and the number of slots it takes for a given server to serve V work units. The proof is based on [10], Section 3.2 and Appendix B.

Theorem A1.

Given a (single) server with i.i.d. service capacities from slot to slot with rational pgf

R (z)

, and given a positive random variable V with pgf

V (z)

that is independent of the service capacities of this server, if we define

D_{V}

as the number of slots from an arbitrary slot K in steady state until the accumulated service capacity of the given server matches or exceeds V, then an expression for the pgf

D_{V} (z)

of

D_{V}

can be obtained as

D_{V} (z) = \frac{z - 1}{z} \sum_{k = 0}^{m - 1} \frac{V (α_{k} (z))}{R^{'} (1 / α_{k} (z))} \cdot \frac{α_{k} (z)}{α_{k} (z) - 1},

(A1)

where the functions

α_{k} (z), k = 0, . . ., m - 1

are the m zeros for ξ of

1 - z R (1 / ξ),

(A2)

if these zeros

α_{k} (z)

are distinct, which is the case for all but at most

2 m - 1

values of z.

Proof.

We define the random variable

R^{(k)}

as the accumulated service capacity of the given server s during the k slots following slot K:

\begin{matrix} R^{(k)} ≜ \sum_{i = 1}^{k} R_{s, K + i} . \end{matrix}

Note that because

R^{(k)}

is the sum of k independent random variables with the same pgf

R (z)

, the pgf of

R^{(k)}

is

R {(z)}^{k}

. By definition, the following relationship between

D_{V}

and

R^{(k)}

holds:

\begin{matrix} D_{V} > k \Leftrightarrow V > R^{(k)} . \end{matrix}

(A3)

Consequently, the probabilities of both events must be the same, so that

\begin{matrix} P (D_{V} > k) = P (V > R^{(k)}) . \end{matrix}

(A4)

The next step is now the derivation of the pgf

D_{V} (z)

of

D_{V}

. To this end, we use the easily proven identity

\begin{matrix} \sum_{k = 0}^{\infty} P (D_{V} > k) z^{k} = \frac{D_{V} (z) - 1}{z - 1} . \end{matrix}
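For completeness, this identity can be verified by writing the tail probability as a sum and interchanging the order of summation. The following is a short sketch of that argument, assuming that z lies inside the region where the series converge absolutely and that the random variable is finite with probability 1:

```latex
\begin{aligned}
\sum_{k = 0}^{\infty} P (D_{V} > k) z^{k}
  &= \sum_{k = 0}^{\infty} \sum_{n = k + 1}^{\infty} P (D_{V} = n) z^{k}
   = \sum_{n = 1}^{\infty} P (D_{V} = n) \sum_{k = 0}^{n - 1} z^{k} \\
  &= \sum_{n = 1}^{\infty} P (D_{V} = n) \frac{z^{n} - 1}{z - 1}
   = \frac{D_{V} (z) - 1}{z - 1} .
\end{aligned}
```

The last step uses the fact that the probabilities sum to 1 and that the term for n = 0 contributes nothing to either sum.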

Multiplying both sides of (A4) by

z^{k}

and summing over all k, we can write

\begin{matrix} \frac{D_{V} (z) - 1}{z - 1} & = \sum_{k = 0}^{\infty} \sum_{i = 0}^{\infty} P (V = i, R^{(k)} < i) z^{k} \\ = \sum_{k = 0}^{\infty} \sum_{i = 0}^{\infty} \sum_{j = 0}^{i - 1} P (V = i) P (R^{(k)} = j) z^{k}, \end{matrix}

(A5)

due to the statistical independence of the

R^{(k)}

’s and V. Using the probability generating property of generating functions, we can rewrite this equation as

\frac{D_{V} (z) - 1}{z - 1} = \sum_{k = 0}^{\infty} \sum_{i = 0}^{\infty} \sum_{j = 0}^{i - 1} \frac{P (V = i)}{j!} {\left. \frac{\partial^{j}}{\partial x^{j}} {(z R (x))}^{k} \right|}_{x = 0} .

(A6)

We can use Cauchy’s differentiation formula to replace the partial derivatives in (A6) with contour integrals. Cauchy’s differentiation formula states that for any function

f (z)

that is holomorphic on a certain open subset U of the complex plane containing a point a,

lim_{x \to a} \frac{d^{j}}{{d x}^{j}} f (x) = \frac{j!}{2 π ı} \oint_{L} \frac{f (ζ)}{{(ζ - a)}^{j + 1}} d ζ,

(A7)

where L is any contour completely contained in U that circles a in the counterclockwise direction exactly once. Applying this formula with

a = 0

and

f (ζ) = {(z R (ζ))}^{k}

in (A6), we find

\begin{matrix} \frac{D_{V} (z) - 1}{z - 1} & = \frac{1}{2 π ı} \sum_{k = 0}^{\infty} \sum_{i = 0}^{\infty} \sum_{j = 0}^{i - 1} \oint_{L} P (V = i) \frac{{(z R (ζ))}^{k}}{ζ^{j + 1}} d ζ \\ = \frac{1}{2 π ı} \sum_{k = 0}^{\infty} \sum_{i = 0}^{\infty} \oint_{L} P (V = i) {(z R (ζ))}^{k} \frac{ζ^{- i} - 1}{1 - ζ} d ζ, \end{matrix}

(A8)

where L is any contour around the origin that remains inside the radius of convergence

R_{R}

of

R (z)

, i.e., such that

\forall ζ \in L : | ζ | < R_{R}

. Note that the radius of convergence of

R (z)

and that of

R {(z)}^{k}

are equal.

The infinite summation of contour integrals in (A8) is equal to the contour integral of the infinite series (i.e., we may “swap” the summation and integration symbols) if the contour L is chosen such that the resulting infinite series is uniformly convergent. It is important to question when such a contour can be constructed. This is the case if

\forall ζ \in L : | 1 / ζ | < R_{V}

and

| z R (ζ) | < 1

. The former condition imposes a lower bound on

| ζ |

, whereas the latter imposes an upper bound on

| R (ζ) |

that depends on z. Since this upper bound is most severe when

ζ

is real and positive, and since

R (ζ)

is an increasing function of

ζ

on the part of the real axis where

0 \leq ζ < R_{R}

, the bounds can be rewritten as

R (1 / R_{V}) < R (| ζ |) < | 1 / z |

. We conclude that a contour can be constructed if and only if

| z | < 1 / R (1 / R_{V})

. It follows that the radius of convergence

R_{D_{V}}

of

D_{V} (z)

is given by

R_{D_{V}} = 1 / R (1 / R_{V})

.

If

| z | < R_{D_{V}}

, we may therefore choose L as described above and bring the summations in (A8) behind the integral. We obtain

\frac{D_{V} (z) - 1}{z - 1} = \frac{1}{2 π ı} \oint_{L} \frac{V (1 / ζ) - 1}{(1 - ζ) (1 - z R (ζ))} d ζ .

(A9)

Substituting

z = 0

in (A9), since

D_{V} (0) = 0

, we obtain

1 = \frac{1}{2 π ı} \oint_{L} \frac{V (1 / ζ) - 1}{1 - ζ} d ζ .

Using this result in (A9) again, we find

D_{V} (z) = \frac{z}{2 π ı} \oint_{L} \frac{V (1 / ζ) - 1}{1 - ζ} \cdot \frac{1 - R (ζ)}{1 - z R (ζ)} d ζ .

(A10)

In order to further simplify the above expression, we now split the integrand into two terms, as follows:

\frac{V (1 / ζ)}{1 - ζ} \frac{1 - R (ζ)}{1 - z R (ζ)} - \frac{1}{1 - ζ} \frac{1 - R (ζ)}{1 - z R (ζ)} .

(A11)

The latter term has no poles inside L, since L was chosen such that

\forall ζ \in L : | z R (ζ) | < 1

, which implies (due to Rouché’s theorem) that

1 - z R (ζ)

has no zeros inside L, and the simple zero of the denominator at

ζ = 1

(if that would be inside L) is canceled by the zero of the numerator at

ζ = 1

. We conclude that the contribution of the latter term to the value of the contour integral in (A10) is zero. Therefore, we can rewrite (A10) as

D_{V} (z) = \frac{z}{2 π ı} \oint_{L} \frac{V (1 / ζ)}{1 - ζ} \cdot \frac{1 - R (ζ)}{1 - z R (ζ)} d ζ .

(A12)

Moreover, we change the integration variable in (A12) to

ξ = 1 / ζ

(which yields a factor

- 1 / ξ^{2}

in the integrand), and we invert the integration path L into

L^{'}

but still integrate in counterclockwise sense (which yields an extra factor of

- 1

, since the inversion of L is a clockwise path). This then leads to the following relationship between the pgfs

D_{V} (z)

and

V (z)

:

D_{V} (z) = \frac{z}{2 π ı} \oint_{L^{'}} \frac{V (ξ)}{ξ (ξ - 1)} \cdot \frac{1 - R (1 / ξ)}{1 - z R (1 / ξ)} d ξ,

(A13)

where

L^{'}

is a contour around the origin such that

\forall ξ \in L^{'} : R_{R}^{- 1} < | ξ | < R_{V}

and

| z R (1 / ξ) | < 1

.

Since we assume that the pgf

R (z)

of the service capacities is a rational function, i.e.,

R (1 / z)

is given by (2), Equation (A13) can be rewritten as

D_{V} (z) = \frac{z}{2 π ı} \oint_{L^{'}} \frac{V (ξ)}{ξ (ξ - 1)} \cdot \frac{Q_{R} (ξ) - P_{R} (ξ)}{Q_{R} (ξ) - z P_{R} (ξ)} d ξ .

(A14)

We now focus on the poles of the integrand in (A14) inside the contour

L^{'}

. Since V is positive by the theorem’s assumptions,

V (0)

must equal 0. The zero of the factor

V (ξ)

in the numerator of the integrand at

ξ = 0

then ensures that the factor

ξ

in the denominator does not cause a pole of the integrand at

ξ = 0

. Furthermore, the factor

Q_{R} (ξ) - P_{R} (ξ)

in the numerator ensures that the factor

(ξ - 1)

in the denominator does not cause a pole of the integrand at

ξ = 1

. Finally, since the contour

L^{'}

was chosen such that

\forall ξ \in L^{'} : | ξ | < R_{V}

,

V (ξ)

has no poles inside

L^{'}

either. Therefore, the only poles of the integrand in (A14) inside

L^{'}

are the zeros for

ξ

of

Q_{R} (ξ) - z P_{R} (ξ),

(A15)

or equivalently, of

1 - z R (1 / ξ) .

(A16)

Since (A15) is by definition a polynomial in

ξ

of degree m, (A15) has exactly m zeros for

ξ

, counted with their multiplicities. We denote these zeros for a given value of z by

α_{k} (z)

,

k = 0, 1, . . ., m - 1

. It can easily be seen that all these zeros lie inside

L^{'}

. Indeed, the contour

L^{'}

was chosen such that

\forall ξ \in L^{'} : | z R (1 / ξ) | < 1

. This implies that

| z P_{R} (ξ) | < | Q_{R} (ξ) |

, so using Rouché’s theorem, we can say that (A15) has as many zeros inside

L^{'}

as

Q_{R} (ξ)

. However, all m zeros of

Q_{R} (ξ)

must lie inside

L^{'}

, because

L^{'}

was chosen such that

\forall ξ \in L^{'} : | ξ | > 1 / R_{R}

. This means that (A15) must have exactly m zeros for

ξ

inside

L^{'}

.

To calculate the value of the contour integral in (A14), we can therefore apply Cauchy’s residue theorem (see, e.g., [31]). Note that the zeros

α_{k} (z)

are not necessarily distinct. For a given value of z, let

α (z)

denote the set of zeros for

ξ

of (A15). The pgf

D_{V} (z)

is then obtained as

D_{V} (z) = z \sum_{ξ^{*} \in α (z)} \underset{ξ = ξ^{*}}{Res} [\frac{V (ξ)}{ξ (ξ - 1)} \cdot \frac{Q_{R} (ξ) - P_{R} (ξ)}{Q_{R} (ξ) - z P_{R} (ξ)}],

(A17)

where the residue at a pole

ξ^{*}

with multiplicity

m^{*}

is given by

\frac{1}{(m^{*} - 1)!} lim_{ξ \to ξ^{*}} \frac{d^{m^{*} - 1}}{d ξ^{m^{*} - 1}} [{(ξ - ξ^{*})}^{m^{*}} \frac{V (ξ)}{ξ (ξ - 1)} \cdot \frac{Q_{R} (ξ) - P_{R} (ξ)}{Q_{R} (ξ) - z P_{R} (ξ)}] .

(A18)

Since all quantities in expression (A17) are known or can be calculated numerically (when z is known), this expression may be used to evaluate

D_{V} (z)

for any z. However, due to the

(m^{*} - 1)

st derivative with respect to

ξ

in (A18), the evaluation of

D_{V} (z)

may be difficult in practice if the zeros

α_{k} (z)

are not distinct.

Note that if a zero

ξ

of (A16) has a multiplicity larger than 1, then

R^{'} (1 / ξ) / ξ^{2} = 0

. Since

R^{'} (1 / ξ) / ξ^{2}

is the derivative of

- R (1 / ξ)

, a rational function with degree of the numerator and the denominator at most m, there are at most

2 m - 1

values of

ξ

for which

R^{'} (1 / ξ) / ξ^{2} = 0

, with at most

2 m - 1

corresponding values of z (see (A16)). Therefore, for all but at most

2 m - 1

values of z, the zeros

α_{k} (z)

of (A16) are distinct. For those z, a substantially simpler expression for

D_{V} (z)

is available, because we can simplify (A17) to

D_{V} (z) = \sum_{k = 0}^{m - 1} V (α_{k} (z)) \frac{α_{k} (z)}{α_{k} (z) - 1} \frac{1 - R (1 / α_{k} (z))}{R^{'} (1 / α_{k} (z))} .

(A19)

Due to (A16), we moreover have that

R (1 / α_{k} (z)) = 1 / z

. This allows us to simplify (A19) further to (A1). □
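As a numerical illustration of (A1), the sketch below evaluates the formula for an assumed negative-binomial capacity pgf R(x) = ((1 - p) / (1 - p x))^r, for which the m = r zeros of (A2) are available in closed form as α_k(z) = p / (1 - (1 - p) w_k), with w_k the rth roots of z. The parametrization and the function name are illustrative assumptions. Two cases with an elementary answer serve as checks: with r = 1 and a deterministic V = v, (A1) reduces to z (p / (1 - (1 - p) z))^v, and with V = 1 the result is the pgf of a geometric distribution with success probability 1 - (1 - p)^r.

```python
import cmath

def time_to_serve_pgf(z, V, p, phases):
    """Evaluate (A1) at z (z != 0, zeros assumed distinct), ASSUMING the
    capacity pgf R(x) = ((1 - p) / (1 - p * x)) ** phases. V is the pgf of V."""
    q = 1.0 - p
    root = complex(z) ** (1.0 / phases)
    total = 0.0
    for k in range(phases):
        w = root * cmath.exp(2j * cmath.pi * k / phases)
        alpha = p / (1.0 - q * w)          # k-th zero of 1 - z R(1 / xi)
        x = 1.0 / alpha
        # Derivative R'(x) of the assumed capacity pgf.
        r_prime = phases * p * q ** phases / (1.0 - p * x) ** (phases + 1)
        total += V(alpha) / r_prime * alpha / (alpha - 1.0)
    return (z - 1.0) / z * total

p, z = 0.6, 0.5
q = 1.0 - p
cubic = time_to_serve_pgf(z, lambda x: x ** 3, p, phases=1)    # V = 3, r = 1
unit = time_to_serve_pgf(z, lambda x: x, p, phases=2)          # V = 1, r = 2
```

The returned values are complex numbers whose imaginary parts vanish up to rounding when z is real and positive.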

Appendix B. Unfinished Work in Single-Server System

In this appendix, we state and prove an expression for the pgf

U (z)

of the unfinished work at the beginning of an arbitrary slot in a single-server system. The proof is based on [10], Appendix A.

Theorem A2.

If there is only a single server, i.e., if

c = 1

, then the pgf

U (z)

of the unfinished work at the beginning of an arbitrary slot in steady state is given by

U (z) = (μ - λ τ) \frac{(z - 1) A (S (z))}{1 - R (1 / z) A (S (z))} \prod_{ζ \in S_{R}^{- 1}} {(\frac{1 - ζ}{z - ζ})}^{μ_{ζ}} \prod_{ξ \in N_{T}^{-}} {(\frac{z - ξ}{1 - ξ})}^{n_{ξ}},

(A20)

where

S_{R}^{- 1}

denotes the set of poles of

R (1 / z)

,

μ_{ζ}

denotes the multiplicity of the pole

ζ \in S_{R}^{- 1}

,

N_{T}^{-}

denotes the set of zeros of

T (z) ≜ Q_{R} (z) - A (S (z)) P_{R} (z)

inside or on the unit circle, excluding the zero at

z = 1

, and

n_{ξ}

denotes the multiplicity of a zero ξ in this set.

Proof.

The unfinished work in the system at the beginning of slot k is denoted as

U_{k}

. This unfinished work

U_{k}

has a very elegant system equation. The reason for this is that the service process is work-conserving. If the service capacity of the single server in slot k is equal to

R_{k}

, then we know that, in total,

R_{k}

work units worth of service demand will be executed during that slot, unless the total unfinished work in the system at the beginning of slot k is less than

R_{k}

. Which customers will receive this service is a more difficult question to answer, but the total amount of work that will be executed is not. We find

U_{k + 1} = {(U_{k} - R_{k})}^{+} + \sum_{i = 1}^{A_{k}} S_{k, i},

(A21)

where

{(. . .)}^{+}

denotes

max (0, . . .)

, and

S_{k, i}

denotes the service demand of the ith customer arriving in slot k. Taking the z-transform of both sides of Equation (A21), we obtain

\begin{matrix} E [z^{U_{k + 1}}] = A (S (z)) E [z^{{(U_{k} - R_{k})}^{+}}], \end{matrix}

(A22)

where we have used the independence of the variables

U_{k}

,

R_{k}

,

A_{k}

and the

S_{k, i}

’s. An expression for

U (z)

can then be obtained by taking the limit for

k \to \infty

. We find

\begin{matrix} U (z) & ≜ lim_{k \to \infty} E [z^{U_{k}}] = A (S (z)) lim_{k \to \infty} E [z^{{(U_{k} - R_{k})}^{+}}] . \end{matrix}

(A23)
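The system equation (A21) is straightforward to iterate along a sample path. The following minimal sketch does so for given sequences of service capacities and arriving service demands; the function name and the input values are hypothetical.

```python
def unfinished_work_path(capacities, demands_per_slot, initial_work=0):
    """Iterate U_{k+1} = max(U_k - R_k, 0) + (sum of demands arriving in slot k)."""
    path = [initial_work]
    for r_k, demands in zip(capacities, demands_per_slot):
        path.append(max(path[-1] - r_k, 0) + sum(demands))
    return path

# Hypothetical sample path: capacity 3 in every slot, varying arrivals.
path = unfinished_work_path([3, 3, 3, 3], [[5], [2, 2], [], [1]])
```

Averaging such a path over many slots with randomly sampled inputs gives an empirical estimate that can be compared with the mean obtained from U(z).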

Since the unfinished work

U_{k}

and the service capacity

R_{k}

are independent random variables and

R_{k}

is a random variable with a rational generating function

R (z)

, we can use a method based on complex contour integration, similar to that presented in [32] (for the analysis of the classical discrete-time

G^{(G)} / Geo / 1

queue), to further work out the above equation. Under the assumption of a stable system, i.e., under the equilibrium condition

λ τ < μ

, the following equation is then obtained for the steady-state pgf

U (z)

of the unfinished work at the beginning of a slot (see [32]):

U (z) = A (S (z)) [U (z) R (1 / z) + (z - 1) \sum_{ζ \in S_{R}^{- 1}} F_{ζ} (z)] .

(A24)

Equation (A24) is valid at least for all z inside the unit circle, with

z \notin S_{R}^{- 1}

, where

S_{R}^{- 1}

denotes the set of singularities of

R (1 / z)

. The function

F_{ζ} (z)

in (A24) is defined as

F_{ζ} (z) = \frac{1}{2 π ı} \oint_{C_{ζ}} \frac{U (ξ) R (1 / ξ)}{(ξ - z) (ξ - 1)} d ξ,

(A25)

with

ı^{2} = - 1

and

C_{ζ}

being a small (counterclockwise) contour around

ζ

but not around any other singularity of

R (1 / ξ)

, nor any singularity of

U (ξ)

, nor around 1 or z.

Let us now assume that the service-capacity pgf

R (z)

is a rational function. Then, all singularities

ζ

of

R (1 / z)

are poles and we can write

R (1 / z) = \frac{P_{R} (z)}{Q_{R} (z)} = \frac{P_{R} (z)}{\prod_{ζ \in S_{R}^{- 1}} {(z - ζ)}^{μ_{ζ}}},

(A26)

wherein

P_{R} (z)

and

Q_{R} (z)

are two mutually prime polynomials and

μ_{ζ}

denotes the multiplicity of the singularity

ζ \in S_{R}^{- 1}

. Note that the degree of

P_{R} (z)

cannot be higher than the degree

m = \sum_{ζ \in S_{R}^{- 1}} μ_{ζ}

of

Q_{R} (z)

, since

{lim}_{z \to \infty} R (1 / z) = R (0) \in [0, 1]

. Therefore, using the expression for the residue of a complex function at a pole

ζ

with multiplicity

μ_{ζ}

, we easily find that the contour integral

F_{ζ} (z)

takes the form

F_{ζ} (z) = \sum_{k = 1}^{μ_{ζ}} \frac{c_{k}}{{(z - ζ)}^{k}},

(A27)

for yet unknown constants

c_{k}

, which in turn leads to

\sum_{ζ \in S_{R}^{- 1}} F_{ζ} (z) = \frac{N (z)}{Q_{R} (z)},

(A28)

with

N (z)

an unknown polynomial of degree

m - 1

. Hence, we obtain

U (z) = \frac{(z - 1) A (S (z)) N (z)}{Q_{R} (z) - A (S (z)) P_{R} (z)} .

(A29)

Using Rouché’s theorem, it can now be shown (as can be seen in, e.g., [33]) that the denominator

T (z) ≜ Q_{R} (z) - A (S (z)) P_{R} (z)

has exactly m zeros inside or on the unit circle, one of which is equal to 1. Since

U (z)

must remain bounded in these zeros, the numerator of

U (z)

has to vanish as well, with at least the same multiplicity. This completely determines the polynomial

N (z)

and the pgf

U (z)

except for a constant factor. With the normalization condition

U (1) = 1

, we finally obtain the desired expression (A20) for

U (z)

. □

References

1. Boxma, O.J.; Kurkova, I.A. The M/G/1 queue with two service speeds. Adv. Appl. Probab. 2001, 33, 520–540.
2. Halfin, S. Steady-state distribution for the buffer content of an M/G/1 queue with varying service rate. SIAM J. Appl. Math. 1972, 23, 356–363.
3. Mahabhashyam, S.R.; Gautam, N. On queues with Markov modulated service rates. Queueing Syst. 2005, 51, 89–113.
4. Bruneel, H.; Walraevens, J.; Claeys, D.; Wittevrongel, S. Analysis of a discrete-time queue with geometrically distributed service capacities. In Proceedings of the 19th International Conference on Analytical and Stochastic Modeling Techniques and Applications, ASMTA 2012, Lecture Notes in Computer Science, Grenoble, France, 4–6 June 2012; Al-Begain, K., Fiems, D., Vincent, J.M., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; Volume 7314, pp. 121–135.
5. Walraevens, J.; Bruneel, H.; Claeys, D.; Wittevrongel, S. The discrete-time queue with geometrically distributed service capacities revisited. In Proceedings of the 20th International Conference on Analytical and Stochastic Modeling Techniques and Applications, ASMTA 2013, Lecture Notes in Computer Science, Ghent, Belgium, 8–10 July 2013; Dudin, A., De Turck, K., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; Volume 7984, pp. 443–456.
6. Bruneel, H.; Wittevrongel, S.; Claeys, D.; Walraevens, J. Discrete-time queues with variable service capacity: A basic model and its analysis. Ann. Oper. Res. 2016, 239, 359–380.
7. Yao, Y.C.; Miao, D.W.C. Sample-path analysis of general arrival queueing systems with constant amount of work for all customers. Queueing Syst. 2014, 76, 283–308.
8. De Muynck, M.; Wittevrongel, S.; Bruneel, H. Analysis of discrete-time queues with general service demands and finite-support service capacities. Ann. Oper. Res. 2017, 252, 3–28.
9. De Muynck, M.; Bruneel, H.; Wittevrongel, S. Delay analysis of a queue with general service demands and phase-type service capacities. In Proceedings of the 10th International Conference on Queueing Theory and Network Applications, QTNA 2015, Advances in Intelligent Systems and Computing, Hanoi, Vietnam, 17–20 August 2015; van Do, T., Takahashi, Y., Yue, W., Nguyen, V.H., Eds.; Springer International Publishing: Berlin/Heidelberg, Germany, 2016; Volume 383, pp. 29–39.
10. De Muynck, M.; Bruneel, H.; Wittevrongel, S. Analysis of a discrete-time queue with general service demands and phase-type service capacities. J. Ind. Manag. Optim. 2017, 13, 1901–1926.
11. De Muynck, M.; Bruneel, H.; Wittevrongel, S. Delay analysis of a queue with general service demands and correlated service capacities. In Proceedings of the 13th International Conference on Queueing Theory and Network Applications, QTNA 2018, Lecture Notes in Computer Science, Tsukuba, Japan, 25–27 July 2018; Takahashi, Y., Phung-Duc, T., Wittevrongel, S., Yue, W., Eds.; Springer International Publishing: Berlin/Heidelberg, Germany, 2018; Volume 10932, pp. 64–85.
12. De Muynck, M.; Bruneel, H.; Wittevrongel, S. Analysis of a queue with general service demands and correlated service capacities. Ann. Oper. Res. 2020, 293, 73–99.
13. Kendall, D.G. Stochastic processes occurring in the theory of queues and their analysis by the method of the imbedded Markov chain. Ann. Math. Stat. 1953, 24, 338–354.
14. Brandwajn, A.; Begin, T. Reduced complexity in M/Ph/c/N queues. Perform. Eval. 2014, 78, 42–54.
15. Kim, C.; Dudin, A.; Dudin, S.; Dudina, O. Hysteresis control by the number of active servers in queueing system MMAP/PH/N with priority service. Perform. Eval. 2016, 101, 20–33.
16. Morozov, E.; Pagano, M.; Peshkova, I.; Rumyantsev, A. Sensitivity analysis and simulation of a multiserver queueing system with mixed service time distribution. Mathematics 2020, 8, 1277.
17. Dudin, A.; Dudina, O.; Dudin, S.; Gaidamaka, Y. Self-service system with rating dependent arrivals. Mathematics 2022, 10, 297.
18. Grosof, I.; Harchol-Balter, M.; Scheller-Wolf, A. WCFS: A new framework for analyzing multiserver systems. Queueing Syst. 2022, 102, 143–174.
19. Gao, P.; Wittevrongel, S.; Bruneel, H. Discrete-time multiserver queues with geometric service times. Comput. Oper. Res. 2004, 31, 81–99.
20. Bruneel, H.; Wuyts, I. Analysis of discrete-time multiserver queueing models with constant service times. Oper. Res. Lett. 1994, 15, 231–236.
21. Gao, P.; Wittevrongel, S.; Walraevens, J.; Moeneclaey, M.; Bruneel, H. Calculation of delay characteristics for multiserver queues with constant service times. Eur. J. Oper. Res. 2009, 199, 170–175.
22. Miao, D.W.C.; Chen, H. On the variances of system size and sojourn time in a discrete-time DAR(1)/D/1 queue. Probab. Eng. Inf. Sci. 2011, 25, 519–535.
23. Kim, J.; Kim, B.; Kang, J. Discrete-time multiserver queue with impatient customers. Electron. Lett. 2013, 49, 38–39.
24. He, Q.; Alfa, A. Construction of Markov chains for discrete time MAP/PH/K queues. Perform. Eval. 2015, 93, 17–26.
25. Chaudhry, M.L.; Kim, J.J.; Banik, A.D. Analytically simple and computationally efficient results for the GI(X)/Geo/c queues. J. Probab. Stat. 2019, 2019, 6480139.
26. Kim, J.J.; Chaudhry, M.L.; Goswami, V.; Banik, A.D. A new and pragmatic approach to the GI(X)/Geo/c/N queues using roots. Methodol. Comput. Appl. Probab. 2021, 23, 273–289.
27. Bruneel, H.; Kim, B.G. Discrete-Time Models for Communication Systems Including ATM; Kluwer Academic Publishers: Boston, MA, USA, 1993; p. 200.
28. Mitrani, I. Modelling of Computer and Communication Systems; Cambridge University Press: Cambridge, UK, 1987; p. 192.
29. Little, J.D.C. A proof for the queuing formula: L = λW. Oper. Res. 1961, 9, 383–387.
30. Fiems, D.; Bruneel, H. A note on the discretization of Little’s result. Oper. Res. Lett. 2002, 30, 17–18.
31. González, M.O. Classical Complex Analysis; Marcel Dekker: New York, NY, USA, 1992; p. 767.
32. Vinck, B.; Bruneel, H. Analyzing the discrete-time G(G)/Geo/1 queue using complex contour integration. Queueing Syst. 1994, 18, 47–67.
33. Adan, I.J.B.F.; van Leeuwaarden, J.S.H.; Winands, E.M.M. On the application of Rouché’s theorem in queueing theory. Oper. Res. Lett. 2006, 34, 355–360.

Figure 1. Mean customer delay versus load $\rho$ for a 5-server system with Poisson arrivals, shifted negative-binomial service demands with 5 phases and mean $\tau = 50$, and negative-binomial service capacities with 5 phases and mean $\mu = 10$, as estimated by various approximations: (a) in absolute value and (b) relative to the Monte-Carlo estimate.

Figure 2. Mean customer delay versus the number of servers $c$ for a system with Poisson arrivals, shifted negative-binomial service demands with 5 phases and mean $\tau = 50$, and negative-binomial service capacities with 5 phases and mean $\mu = 10$, with variable mean arrival rate $\lambda$ such that the load $\rho = 0.95$, as estimated by various approximations: (a) in absolute value and (b) relative to the Monte-Carlo estimate.

Figure 3. Mean customer delay versus the number of servers $c$ for a system with Poisson arrivals, bursty service demands, and negative-binomial service capacities with 5 phases and mean $\mu = 10$, with variable mean arrival rate $\lambda$ such that the load $\rho = 0.95$, as estimated by various approximations: (a) in absolute value and (b) relative to the Monte-Carlo estimate.

Figure 4. Mean customer delay versus mean service demand $\tau$ for a five-server system with Poisson arrivals, deterministic service demands of $\tau$ work units, and uniformly distributed service capacities between 0 and 10 work units (inclusive), with arrival rate $\lambda$ scaled such that the load $\rho = 0.9$, as estimated by various approximations.

Figure 5. Probability mass function of the customer delay for a 5-server system with Poisson arrivals, shifted negative-binomial service demands with 5 phases and mean $\tau = 50$, and negative-binomial service capacities with 5 phases and mean $\mu = 10$, with load $\rho = 0.7$, as estimated by various approximations.

Figure 6. Mean customer delay for a 5-server system with Poisson arrivals, shifted negative-binomial service demands with 5 phases and mean $\tau = 10$, and negative-binomial service capacities with $r$ phases and mean $\mu = 5$, with load $\rho = 0.95$, for various $r$, as estimated by various approximations.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
