Abstract
Trade execution is an ongoing optimisation problem in finance, focused on securing the best possible prices by forecasting future price movements and liquidity. This is particularly relevant to institutional investors transacting large positions.
Deep reinforcement learning is particularly well-suited to this problem given its ability to capture highly non-linear relationships within a sequential decision-making context. However, research that employs deep reinforcement learning for this task is limited, with most work focusing instead on established mathematical methods. Numerous reinforcement learning formulations have been explored, but progress has been hindered by the lack of comparisons between studies. To address this, we develop a framework that allows comparisons to be made on a consistent basis, so that advances can be made in determining more effective deep reinforcement learning formulations. To facilitate this comparison, a comprehensive suite of benchmarks, diagnostics and analytics is implemented in the framework.
The deep reinforcement learning approaches consistently outperform the benchmark strategies on a median basis, and do so with a lower standard deviation. The most effective reinforcement learning approach is the one with the most degrees of freedom in the action space. This finding highlights the ability of deep reinforcement learning techniques to learn effective execution strategies, even when relatively unconstrained. To assess robustness, strategies are evaluated across two trading horizons of varying lengths, and the behaviour of the reinforcement learning agents is analysed.
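To make the "degrees of freedom in the action space" point concrete, the following is a minimal illustrative sketch, not taken from the paper: all order sizes, step counts, and policy parameters are hypothetical. It contrasts an even TWAP-style benchmark schedule with a less constrained action space in which the agent chooses what fraction of the remaining inventory to trade at each step.

```python
import numpy as np

# Hypothetical example (not from the paper): execute a parent order of
# 10,000 shares over 10 decision steps.
TOTAL_SHARES = 10_000
STEPS = 10

# Benchmark: a TWAP-style schedule splits the order evenly across the horizon.
twap_schedule = np.full(STEPS, TOTAL_SHARES / STEPS)

def unconstrained_step(fraction, remaining, steps_left):
    """Least constrained action space: the agent picks any fraction of the
    remaining inventory each step, with a forced cleanup at the horizon end."""
    if steps_left == 1:
        return remaining  # must complete the parent order by the deadline
    return remaining * float(np.clip(fraction, 0.0, 1.0))

# Example policy: front-load by trading 30% of whatever remains each step.
remaining = float(TOTAL_SHARES)
schedule = []
for step in range(STEPS):
    qty = unconstrained_step(0.3, remaining, STEPS - step)
    schedule.append(qty)
    remaining -= qty

# Both schedules complete the order, but the unconstrained one can adapt
# its shape; the benchmark cannot.
assert abs(sum(schedule) - TOTAL_SHARES) < 1e-6
assert abs(twap_schedule.sum() - TOTAL_SHARES) < 1e-6
```

A more constrained formulation might instead only scale the even TWAP slice within narrow bounds; the paper's finding is that the less constrained formulation learned the more effective execution strategy.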
| Original language | English |
|---|---|
| Article number | 102876 |
| Pages (from-to) | 1-18 |
| Number of pages | 18 |
| Journal | Pacific-Basin Finance Journal |
| Volume | 94 |
| DOIs | |
| Publication status | Published - Dec 2025 |