Neural Networks for Conditional Probability Estimation

Forecasting Beyond Point Predictions

  • Book
  • © 1999

Overview

  • Provides unique, comprehensive coverage of generalisation and regularisation
  • Provides the first real-world test results for recent theoretical findings on the generalisation performance of committees

Part of the book series: Perspectives in Neural Computing (PERSPECT.NEURAL)

About this book

Conventional applications of neural networks usually predict a single value as a function of given inputs. In forecasting, for example, a standard objective is to predict the future value of some entity of interest on the basis of a time series of past measurements or observations. Typical training schemes aim to minimise the sum of squared deviations between predicted and actual values (the 'targets'), by which, ideally, the network learns the conditional mean of the target given the input. If the underlying conditional distribution is Gaussian or at least unimodal, this may be a satisfactory approach. However, for a multimodal distribution, the conditional mean does not capture the relevant features of the system, and the prediction performance will, in general, be very poor. This calls for a more powerful and sophisticated model, which can learn the whole conditional probability distribution. Chapter 1 demonstrates that even for a deterministic system and 'benign' Gaussian observational noise, the conditional distribution of a future observation, conditional on a set of past observations, can become strongly skewed and multimodal. In Chapter 2, a general neural network structure for modelling conditional probability densities is derived, and it is shown that a universal approximator for this extended task requires at least two hidden layers. A training scheme is developed from a maximum likelihood approach in Chapter 3, and the performance of this method is demonstrated on three stochastic time series in Chapters 4 and 5.
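
The contrast between a least-squares point prediction and a full conditional density can be illustrated with a small numerical sketch. It is not taken from the book: the bimodal target, the mode locations (-1 and +1), the noise level (0.1) and all variable names are assumptions chosen for illustration, and the mixture parameters are set by hand rather than learned by a network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bimodal target for one fixed input:
# y lies near -1 or +1 with equal probability (assumed, not from the book).
n = 5_000
modes = rng.choice([-1.0, 1.0], size=n)
y = modes + 0.1 * rng.standard_normal(n)

# Point prediction: scan constant predictions and keep the one with the
# smallest sum of squared deviations. It lands near the sample mean.
candidates = np.linspace(-1.5, 1.5, 301)
sse = ((y[None, :] - candidates[:, None]) ** 2).sum(axis=1)
best_point = candidates[np.argmin(sse)]

# Fraction of observed targets within 0.5 of that point prediction.
frac_near = np.mean(np.abs(y - best_point) < 0.5)


def gaussian_pdf(x, mu, sigma):
    """Density of a univariate Gaussian with mean mu and std sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


# Model A: a single Gaussian with maximum likelihood mean and std.
nll_single = -np.log(gaussian_pdf(y, y.mean(), y.std())).mean()

# Model B: a two-component Gaussian mixture with the generating
# parameters plugged in by hand (an assumption made for brevity).
mixture = 0.5 * gaussian_pdf(y, -1.0, 0.1) + 0.5 * gaussian_pdf(y, 1.0, 0.1)
nll_mixture = -np.log(mixture).mean()

print(f"least-squares point prediction: {best_point:.2f}")
print(f"fraction of targets within 0.5 of it: {frac_near:.3f}")
print(f"mean negative log-likelihood, single Gaussian: {nll_single:.3f}")
print(f"mean negative log-likelihood, mixture: {nll_mixture:.3f}")
```

Under these assumptions the least-squares prediction sits between the two modes, where almost no targets occur, while the mixture density assigns a clearly lower negative log-likelihood to the data than the single Gaussian does.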

Table of contents (18 chapters)

Authors and Affiliations

  • Neural Systems Group, Department of Electrical & Electronic Engineering, Imperial College, London, UK

    Dirk Husmeier

Accessibility Information

PDF accessibility summary

This PDF is not accessible. It is based on scanned pages and does not support features such as screen reader compatibility or descriptions for non-text content (e.g., images and graphs). However, it likely supports searchable and selectable text based on OCR (Optical Character Recognition). Users with accessibility needs may not be able to use this content effectively. Please contact us through the accessibility request webform if you require assistance or an alternative format.
