Special Issue "Advances in Markovian Dynamic and Stochastic Optimization Models in Diverse Application Areas"

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Probability and Statistics".

Deadline for manuscript submissions: 31 August 2023 | Viewed by 9596

Special Issue Editor

Department of Statistics, Carlos III University of Madrid, 28903 Getafe (Madrid), Spain
Interests: operations research; dynamic and stochastic optimization

Special Issue Information

Dear Colleagues,

Markovian dynamic and stochastic optimization is an active research area concerning the design and analysis of optimal or nearly optimal policies for Markov decision models of stochastic systems evolving over time. Such models arise in a wide variety of application areas, including manufacturing, marketing, service operations, finance, call centers, and cloud service systems.

In this Special Issue, we shall collect recent theoretical and application-oriented advances regarding Markovian dynamic and stochastic optimization models in any application area. This includes the design and analysis of optimal and nearly optimal policies, performance analysis, large-scale systems, queueing systems, bandit models, and computational studies.

Prof. Dr. José Niño-Mora
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2100 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Markov decision processes
  • stochastic dynamic programming
  • optimal policies
  • optimal control
  • queueing systems
  • bandit models
  • reinforcement learning
  • machine learning
  • operations research
  • dynamic and stochastic optimization

Published Papers (5 papers)


Research

Article
Multi-Gear Bandits, Partial Conservation Laws, and Indexability
Mathematics 2022, 10(14), 2497; https://doi.org/10.3390/math10142497 - 18 Jul 2022
Cited by 1 | Viewed by 710
Abstract
This paper considers what we propose to call multi-gear bandits, which are Markov decision processes modeling a generic dynamic and stochastic project fueled by a single resource and which admit multiple actions representing gears of operation naturally ordered by their increasing resource consumption. The optimal operation of a multi-gear bandit aims to strike a balance between project performance costs or rewards and resource usage costs, which depend on the resource price. A computationally convenient and intuitive optimal solution is available when such a model is indexable, meaning that its optimal policies are characterized by a dynamic allocation index (DAI), a function of state–action pairs representing critical resource prices. Motivated by the lack of general indexability conditions and efficient index-computing schemes, and focusing on the infinite-horizon finite-state and -action discounted case, we present a verification theorem ensuring that, if a model satisfies two proposed PCL-indexability conditions with respect to a postulated family of structured policies, then it is indexable and such policies are optimal, with its DAI being given by a marginal productivity index computed by a downshift adaptive-greedy algorithm in AN steps, with A+1 actions and N states. The DAI is further used as the basis of a new index policy for the multi-armed multi-gear bandit problem.
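As an illustration of what the DAI in the abstract above means, the following sketch sweeps the resource price λ and records the optimal gear per state of a small discounted MDP via value iteration. This is a brute-force calibration, not the paper's AN-step downshift adaptive-greedy algorithm; all model numbers are hypothetical, and the frozen dynamics (identity transitions) are chosen only so the switch prices can be checked by hand.

```python
import numpy as np

def optimal_gears(P, r, c, lam, beta=0.9, tol=1e-9):
    """Optimal gear (action) per state at resource price lam, by value
    iteration on net one-period rewards r[s, a] - lam * c[a]."""
    S, A = r.shape
    V = np.zeros(S)
    while True:
        # Q[s, a] = net reward + discounted expected continuation value
        Q = r - lam * c + beta * np.einsum('ast,t->sa', P, V)
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < tol:
            return Q.argmax(axis=1)
        V = V_new

# Hypothetical 3-state, 3-gear project with frozen dynamics (P = identity)
S, A = 3, 3
P = np.stack([np.eye(S)] * A)                    # shape (A, S, S): P[a, s, s']
r = np.outer(np.arange(S) + 1.0, np.arange(A))   # r[s, a] = a * (s + 1)
c = np.array([0.0, 1.0, 3.0])                    # resource use grows with gear

print(optimal_gears(P, r, c, lam=0.4))  # cheap resource: top gear everywhere
print(optimal_gears(P, r, c, lam=1.4))  # dear resource: lower states downshift
```

The DAI of a state–gear pair is precisely the critical price λ at which that gear stops being optimal; indexability means the optimal gear moves down monotonically as λ rises, which is what the sweep exhibits here.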
Article
Pricing the Volatility Risk Premium with a Discrete Stochastic Volatility Model
Mathematics 2021, 9(17), 2038; https://doi.org/10.3390/math9172038 - 25 Aug 2021
Cited by 2 | Viewed by 1753
Abstract
Investors’ decisions on capital markets depend on their anticipation and preferences about risk, and volatility is one of the most common measures of risk. This paper proposes a method of estimating the market price of volatility risk by incorporating both conditional heteroscedasticity and nonlinear effects in market returns, while accounting for asymmetric shocks. We develop a model that allows dynamic risk premiums for the underlying asset and for the volatility of the asset under the physical measure. Specifically, a nonlinear in mean time series model combining the asymmetric autoregressive conditional heteroscedastic model with leverage (NGARCH) is adapted for modeling return dynamics. The local risk-neutral valuation relationship is used to model investors’ preferences of volatility risk. The transition probabilities governing the evolution of the price of the underlying asset are adjusted for investors’ attitude towards risk, presenting the asset returns as a function of the risk premium. Numerical studies on asset return data show the significance of market shocks and levels of asymmetry in pricing the volatility risk. Estimated premiums could be used in option pricing models, turning options markets into volatility trading markets, and in measuring reactions to market shocks. Full article
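To make the return dynamics concrete, here is a minimal simulator of a Duan-style NGARCH(1,1) under the physical measure, with a leverage parameter theta and a per-period volatility risk premium lam. It is a generic textbook specification, not the authors' estimated model, and every parameter value is an assumption chosen only to keep the process stationary.

```python
import numpy as np

def simulate_ngarch(n, omega=1e-6, alpha=0.05, beta=0.90, theta=0.5,
                    lam=0.02, rf=0.0, seed=0):
    """Simulate n returns under NGARCH(1,1):
       r_t = rf + lam*sqrt(h_t) - h_t/2 + sqrt(h_t)*z_t
       h_{t+1} = omega + beta*h_t + alpha*h_t*(z_t - theta)^2."""
    rng = np.random.default_rng(seed)
    # start at the unconditional variance (persistence < 1 by assumption)
    h = omega / (1.0 - beta - alpha * (1.0 + theta**2))
    rets, vols = np.empty(n), np.empty(n)
    for t in range(n):
        z = rng.standard_normal()
        rets[t] = rf + lam * np.sqrt(h) - 0.5 * h + np.sqrt(h) * z
        vols[t] = np.sqrt(h)
        # leverage: a negative shock z raises next-period variance more
        h = omega + beta * h + alpha * h * (z - theta) ** 2
    return rets, vols

rets, vols = simulate_ngarch(5000)
print(rets.mean(), vols.mean())
```

With theta > 0, negative shocks inflate (z - theta)^2 more than positive ones, which is the asymmetric (leverage) effect the abstract refers to; the premium lam shifts the conditional mean return with the level of volatility.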

Article
Three-Stage Numerical Solution for Optimal Control of COVID-19
Mathematics 2021, 9(15), 1777; https://doi.org/10.3390/math9151777 - 27 Jul 2021
Viewed by 1553
Abstract
In this paper, we present a three-stage algorithm for finding numerical solutions for optimal control problems. The algorithm first performs an exhaustive search through a discrete set of widely dispersed solutions which are representative of large subregions of the search space; then, it uses the search results to initialize a Monte Carlo process that searches quasi-randomly for a best solution; then, it finally uses a Newton-type iteration to converge to a solution that satisfies mathematical conditions of local optimality. We demonstrate our methodology on an epidemiological model of the coronavirus disease with testing and distancing controls applied over a period of 180 days to two different subpopulations (low-risk and high-risk), where model parameters are chosen to fit the city of Houston, Texas, USA. In order to enable the user to select his/her preferred trade-off between (number of deaths) and (herd immunity) outcomes, the objective function includes costs for deaths and non-immunity. Optimal strategies are estimated for a grid of (death cost) × (non-immunity cost) combinations, in order to obtain a Pareto curve that represents optimum trade-offs. The levels of the four controls for the different Pareto-optimal solutions over the 180-day period are visually represented and their characteristics discussed. Three different variants of the algorithm are run in order to determine the relative importance of the three stages in the optimization. Results from the three algorithm variants are fairly consistent, indicating that solutions are robust. Results also show that the Monte Carlo stage plays an especially prominent role in the optimization, but that all three stages of the process make significant contributions towards finding lower-cost, more effective control strategies.
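The three-stage structure described in the abstract — coarse exhaustive search, Monte Carlo refinement, then a Newton-type local step — can be sketched on a toy one-dimensional objective. The objective and every constant below are hypothetical stand-ins for the epidemic cost; the point is only the pipeline, in which the global stages pick the right basin and Newton polishes to first-order optimality.

```python
import numpy as np

def f(x):    # toy multimodal objective (stand-in for the epidemic cost)
    return (x**2 - 1) ** 2 + 0.3 * x

def df(x):   # first derivative, used by the Newton stage
    return 4 * x**3 - 4 * x + 0.3

def ddf(x):  # second derivative
    return 12 * x**2 - 4

def three_stage_minimize(seed=0):
    # Stage 1: exhaustive search over widely dispersed representative points
    grid = np.linspace(-2.0, 2.0, 41)
    x = grid[np.argmin(f(grid))]
    # Stage 2: Monte Carlo refinement around the best grid point
    rng = np.random.default_rng(seed)
    samples = x + 0.1 * rng.standard_normal(200)
    x = samples[np.argmin(f(samples))]
    # Stage 3: Newton iteration to a point satisfying local optimality
    for _ in range(50):
        step = df(x) / ddf(x)
        x -= step
        if abs(step) < 1e-12:
            break
    return x

print(three_stage_minimize())
```

The grid stage keeps Newton away from the wrong basin (the objective has two local minima), and the Monte Carlo stage hands Newton a starting point where the curvature is positive, mirroring the division of labor the paper reports.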

Review

Review
Markovian Restless Bandits and Index Policies: A Review
Mathematics 2023, 11(7), 1639; https://doi.org/10.3390/math11071639 - 28 Mar 2023
Viewed by 1062
Abstract
The restless multi-armed bandit problem is a paradigmatic modeling framework for optimal dynamic priority allocation in stochastic models of wide-ranging applications that has been widely investigated and applied since its inception in a seminal paper by Whittle in the late 1980s. The problem has generated a vast and fast-growing literature from which a significant sample is thematically organized and reviewed in this paper. While the main focus is on priority-index policies due to their intuitive appeal, tractability, asymptotic optimality properties, and often strong empirical performance, other lines of work are also reviewed. Theoretical and algorithmic developments are discussed, along with diverse applications. The main goals are to highlight the remarkable breadth of work that has been carried out on the topic and to stimulate further research in the field.
Review
Reinforcement Learning Approaches to Optimal Market Making
Mathematics 2021, 9(21), 2689; https://doi.org/10.3390/math9212689 - 22 Oct 2021
Cited by 5 | Viewed by 3467
Abstract
Market making is the process whereby a market participant, called a market maker, simultaneously and repeatedly posts limit orders on both sides of the limit order book of a security in order to both provide liquidity and generate profit. Optimal market making entails dynamic adjustment of bid and ask prices in response to the market maker’s current inventory level and market conditions with the goal of maximizing a risk-adjusted return measure. This problem is naturally framed as a Markov decision process, a discrete-time stochastic (inventory) control process. Reinforcement learning, a class of techniques based on learning from observations and used for solving Markov decision processes, lends itself particularly well to it. Recent years have seen a very strong uptick in the popularity of such techniques in the field, fueled in part by a series of successes of deep reinforcement learning in other domains. The primary goal of this paper is to provide a comprehensive and up-to-date overview of the current state-of-the-art applications of (deep) reinforcement learning focused on optimal market making. The analysis indicated that reinforcement learning techniques provide superior performance in terms of the risk-adjusted return over more standard market making strategies, typically derived from analytical models.
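The MDP framing in the abstract above — inventory as state, quote placement as action, risk-adjusted profit as reward — can be illustrated with tabular Q-learning on a deliberately tiny toy model. This is not the setup of any paper surveyed in the review (those use deep RL on limit-order-book features); the dynamics, rewards, and all constants here are hypothetical.

```python
import numpy as np

# Toy inventory-control MDP for a market maker (all numbers hypothetical):
# state  = inventory in {-2, ..., 2}
# action = 0 buy-skew, 1 symmetric quotes, 2 sell-skew
# reward = spread captured minus a quadratic penalty on resulting inventory
N_INV, ACTIONS = 5, 3
SPREAD = np.array([0.6, 1.0, 0.6])  # symmetric quotes capture the full spread

def train(steps=60000, alpha=0.1, gamma=0.9, eps=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_INV, ACTIONS))
    i = 0                                     # start flat
    for _ in range(steps):
        s = i + 2                             # map inventory to row index
        a = rng.integers(ACTIONS) if rng.random() < eps else int(Q[s].argmax())
        j = int(np.clip(i + 1 - a, -2, 2))    # skew moves inventory one unit
        r = SPREAD[a] - 0.2 * j**2            # risk-adjusted per-step reward
        Q[s, a] += alpha * (r + gamma * Q[j + 2].max() - Q[s, a])
        i = j
    return Q

Q = train()
print(Q.argmax(axis=1))  # greedy quote skew for inventories -2, ..., +2
```

Even in this caricature, the learned greedy policy skews quotes against the inventory, which is the qualitative behavior the optimal market making literature derives analytically.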
