Research Report No. UVACTS-15-13-11
                                             May, 2003

 

 

 

 

 

 

 

 

Low-Cost Modified Adaptive Control (Phase I)

State-Based Modeling: An Analysis of Error in Order Approximation

 

 

By:

Christopher Patrick Allred

Dr. William Scherer

 

 

 

A Research Project Report for:

The Virginia Transportation Research Council (VTRC)

 

 

Christopher Patrick Allred

University of Virginia

 

Dr. William T. Scherer

Department of Systems and Information Engineering

University of Virginia

Email: wts@virginia.edu

 

 

 

 

 

 

Center for Transportation Studies at the University of Virginia produces outstanding transportation professionals, innovative research results and provides important public service. The Center for Transportation Studies is committed to academic excellence, multi-disciplinary research and to developing state-of-the-art facilities. Through a partnership with the Virginia Department of Transportation’s (VDOT) Research Council (VTRC), CTS faculty hold joint appointments, VTRC research scientists teach specialized courses, and graduate student work is supported through a Graduate Research Assistantship Program. CTS receives substantial financial support from two federal University Transportation Center Grants: the Mid-Atlantic Universities Transportation Center (MAUTC), and through the National ITS Implementation Research Center (ITS Center). Other related research activities of the faculty include funding through FHWA, NSF, US Department of Transportation, VDOT, other governmental agencies and private companies.

 

Disclaimer: The contents of this report reflect the views of the authors, who are responsible for the facts and the accuracy of the information presented herein.  This document is disseminated under the sponsorship of the Department of Transportation, University Transportation Centers Program, in the interest of information exchange.  The U.S. Government assumes no liability for the contents or use thereof.

 

 

 

CTS Website                                                                         Center for Transportation Studies

http://cts.virginia.edu                                                                                 University of Virginia

351 McCormick Road, P.O. Box 400742

Charlottesville, VA 22904-4742

                                                                                                                                434.924.6362

 

           

1. Report No.

2. Government Accession No.

3. Recipient’s Catalog No.

 

UVACTS-15-13-11

 

 

4. Title and Subtitle

5. Report Date

Low-Cost Modified Adaptive Control (Phase I)

State-Based Modeling: An Analysis of Error in Order Approximation

 

May, 2003

 

6. Performing Organization Code

 

 

7. Author(s)

Christopher Patrick Allred, William T. Scherer

8. Performing Organization Report No.

 

 

 

 

9. Performing Organization and Address

10. Work Unit No. (TRAIS)

 

Center for Transportation Studies

 

University of Virginia

11. Contract or Grant No.

PO Box 400742

Charlottesville, VA 22904-7472

 

12. Sponsoring Agencies' Name and Address

13. Type of Report and Period Covered

Office of University Programs, Research and Special Programs Administration

US Department of Transportation

400 Seventh Street, SW

Washington DC 20590-0001

 

Final Report

 

 

14. Sponsoring Agency Code

 

 

 

15.  Supplementary Notes

 

 

16. Abstract

When building discrete time, discrete-state data-driven models, transition probabilities are stored in Markov transition matrices.  Augmenting the history or order of the associated Markov chains allows for present states to be predicated upon states in previous time periods without contradicting the Markovian property.  However, when faced with real data, the “true” order of the model is not always known.  This project details the inadequacies of the present method of evaluating order, and outlines and evaluates new methods to approximate the model order including simulation, the likelihood-based Bayesian Information Criterion, Schwarz Information Criterion, and Akaike Information Criterion as well as using comparison vectors, an ad-hoc method.  In addition, this project suggests an approach for using these techniques to approximate order, and discusses the error associated with using that prediction model. 

 

17 Key Words

18. Distribution Statement

 

No restrictions. This document is available to the public.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Abstract

 

When building discrete time, discrete-state data-driven models, transition probabilities are stored in Markov transition matrices.  Augmenting the history or order of the associated Markov chains allows for present states to be predicated upon states in previous time periods without contradicting the Markovian property.  However, when faced with real data, the “true” order of the model is not always known.  This project details the inadequacies of the present method of evaluating order, and outlines and evaluates new methods to approximate the model order including simulation, the likelihood-based Bayesian Information Criterion, Schwarz Information Criterion, and Akaike Information Criterion as well as using comparison vectors, an ad-hoc method.  In addition, this project suggests an approach for using these techniques to approximate order, and discusses the error associated with using that prediction model. 


 

Acknowledgements

            I would like to sincerely thank my advisor, Professor William T. Scherer, for his guidance and supervision.  He has been both a teacher and a friend throughout the thesis effort.  Thanks go to Thomas Pomroy whose related research helped establish the foundations of the project.  I am grateful to Dr. James Lark and Dr. Roman Krzysztofowicz for the insight they brought to the project, as well as taking the time to serve on my committee.

            Research for this thesis was funded by Providian and the Virginia Department of Transportation Smart Travel Lab.


Table of Contents

Chapter 1:  Introduction.. 1

1.1.  ..................................................................................... State-Based Modeling.. 1

1.2.  Estimation of Model Order.. 2

1.3.  Overview... 3

1.4.  Thesis Statement.. 3

Chapter 2:  Literature Review... 4

2.1.  ...................................... State-Based Modeling as a Markov Process. 4

2.1.1.    The Markov Property.. 6

2.1.2.    Order in State-Based Models. 6

2.1.3.    True Order in State-based Models. 8

2.2. Error Factors in Data Driven Modeling.. 10

2.3. Bayes’ Factors. 12

2.3.1.    Description of Bayes’ Factors. 12

2.3.2.    Prior Distributions. 14

2.3.3.    Bayesian Information Criterion.. 16

2.3.4.    Testing of Bayesian Information Criterion.. 16

2.3.5.    Large Sample Bayes’ Procedure (Schwarz Information Criterion) 17

2.3.6.    Akaike Information Criterion.. 18

2.4.  .............................................. Convergence Bounds of Markov Chains. 19

2.5.   MTD Model.. 22

Chapter 3:  Methodology.. 24

3.1.  ....................................................................................... Problem Description.. 24

3.2.  Formulaic Description of the Problem... 24

3.2.1.    Control Generation.. 27

3.3.  .................................................... Time Dependence of Data Evaluation.. 28

3.4.  Simulated Model Order Estimation vs. Statistical Model Order Estimation   33

3.5.  Ad-Hoc Simulation Approach for the Determination of Model Order   33

3.3.1.    Coding the Simulation.. 35

3.6.  .................................................................. Comparison Vector Technique.. 37

3.6.1.    Coding Issues Associated with the Comparison Vector.. 44

3.7.  ........................................................ Information Criterion Techniques. 44

3.7.1.    Programming Notes. 45

3.8.  ..................................................................................................................... Testing.. 46

3.9.  ...................................................................................... Description of Error.. 47

3.9.1.    Evaluation of Error.. 48

Chapter 4:  Results. 51

4.1.  ...................................................... Small-Sample Known Order Dataset.. 51

4.1.1.    Data Description.. 51

4.1.2.    Data Creation.. 52

4.1.3.    Simulation Results. 54

4.1.3.    Comparison Vector Results. 56

4.1.4.    Information Criterion Results. 57

4.1.5.    Testing.. 57

4.2.  ......................................................... Large-Sample Known-Order Model.. 59

4.2.1.    Data Description.. 59

4.2.2.    Simulation Results. 60

4.2.3.    Comparison Vector Results. 61

4.2.4.    Information Criteria Results. 62

4.2.5.    Testing.. 63

4.3.  Bank of Scotland Data.. 65

4.3.1.    Data Description.. 65

4.3.2.    Simulation Approach.. 67

4.3.3.    Comparison Vector Results. 69

4.3.3.    Information Criteria Results. 69

4.3.4.    Testing.. 70

4.4.  Calgary Healthcare.. 72

4.4.1.    Data Description.. 72

4.4.2.    Simulation Results. 75

4.4.3.    Comparison Vector Technique. 76

4.4.4.    Information Criteria Results. 77

4.4.5.    Testing.. 77

4.5.  Virginia Department of Transportation Data.. 81

4.5.1.    Data Description.. 81

4.5.2.    Simulation Results. 83

4.5.3.    Comparison Vector Results. 84

4.5.4.    Information Criteria Results. 85

4.5.5.    Testing.. 85

Chapter 5:  Conclusions. 88

5.1.  Observations about techniques. 88

5.2.  Recommendations. 90

5.2.1.    Reccomendation Test. 92

5.3.  Contributions. 95

5.4.  Future Work.. 96

References. 97

Appendix A: Number Generation Code.. 100

Appendix B: Simulation Technique Code.. 104

Appendix C: Comparison Vector Technique Code.. 111

Appendix D: Bayesian Information Criterion Code.. 115

Appendix E: Referenced Prior Densities. 121

 


 List of Figures

Figure 1 - Expansion of 2 state Markov chain – the grey cells indicate that the calculated probabilities must be 0, since a transition between the two states is impossible. 21

Figure 2-(Top) MTD(2) Model, # signifies that that transition is unimportant.  (Bottom) MTD(2) model applied to Markov Transition Matrix. 23

Figure 3 - Data Makeup. 26

Figure 4 - 1st and 2nd order 2-state TPMs. 29

Figure 5 – 2nd order augmented 1st order matrix for a 2-state model 30

Figure 6 - Dataset Orientation. 34

Figure 7 - Observed prior transitions from the same prior data: 1st order (top) 2nd order (middle) 3rd order (bottom) 36

Figure 8- Formulation of rth order transition count model 38

Figure 9 - Transition count matrices generated from simulated data. 39

Figure 10 - Comparison vector generated from N(k) 39

Figure 11 - 1st, 2nd, and 3rd order transition probability matrices. 40

Figure 12 – Comparison Vector associated with matrix from figure 12, C(4,k) 40

Figure 13 - Formulaic Approach to Comparison Vectors. 42

Figure 14 - Experimental Data Description. 48

Figure 15 - Convergence of probabilities. 49

Figure 16 - Makeup of Small Sample Known Dataset 52

Figure 17 - Known 3rd order matrix. 53

Figure 18 - Transition Matrix Squared: P(3) 53

Figure 19 - Absolute Error of Small Sample Model 54

Figure 20 - Mean Absolute Error for Small Sample Model 55

Figure 21 - Absolute Prediction Error for 8 Time Periods. 58

Figure 22 - Makeup of Large-Sample Known-Order Model 60

Figure 23 - Mean Average Error for 343 accountrealization Large-Sample Known Order Model 61

Figure 24 – Mean Percent Error of Prediction for Large Sample Known Order Model 64

Figure 25 - Data Classification for Bank of Scotland Bank AccountRealization Data. 67

Figure 26 - Mean Absolute Error for Bank of Scotland Data. 68

Figure 27 - Prediction Mean Average Percentage Error for Individual AccountRealizations. 71

Figure 28 - Histogram for 12,000 accountrealization sample. 73

Figure 29 - Percentage in Each State as Classified by the Max Value. 74

Figure 30 - Data Breakdown for Calgary Healthcare Datasets. 75

Figure 31 - Mean Absolute Percent Error for Calgary Healthcare Dataset 76

Figure 32 - Percentage Error measuring prediction in Calgary Healthcare data. 79

Figure 33 - Predicted Cost at Each Period vs. Real Cost 80

Figure 34 - Percent Error in Approximation of Cost Associated with 3rd and 4th Order Models. 81

Figure 35 - Makeup of Transportation Dataset 83

Figure 36 - Mean Absolute Error for Simulation of Transportation Data. 84

Figure 37 - Prediction Error on Transportation Data for 6 simulated time periods. 86

Figure 38 - Mean Absolute Error for 4 orders. 93

 


List of Tables

Table 1 - Sample Transition Probability Matrix. 5

Table 2 – 2nd Order Augmented State Matrix. 7

Table 3 - Simplified 2nd Order Augmented Transition Probability Matrix. 8

Table 4 - True 2-state 2nd order model 8

Table 5 - Expanded 2 state 3rd order model 9

Table 6 - "State Ceilings" associated with MS Excel (N is the number of states, r is the order of the model, the grey sections indicate models that are infeasible in MS Excel) 27

Table 7- Creating Matrix for Small Sample Known Order Model 52

Table 8-Values for Mean Absolute Error of 473 accountrealization sample. 55

Table 9- Difference in Closeness of Comparison Vectors. 56

Table 10 - Information Criterion Results for Small-Sample Known- Order Dataset 57

Table 11 - Absolute Error (Max 5000) per simulated time epoch. 57

Table 12 - Mean Absolute Error and Mean Absolute Percent Error over 8 simulated periods. 58

Table 13 - Short-term individual error 58

Table 14 - Generating Probability Matrix. 59

Table 15 - Mean Average Error and Mean Average Percent Error for 343 accountrealization 6 period Large Sample Known Order Model 61

Table 16-Large-sample Known-order model Comparison Vector Convergence. 62

Table 17 - Information Criterion Data for Large-Sample Known-Order dataset 62

Table 18 - Mean Average Error and Mean Average Percent Error for "Group of AccountRealizations" on Large Known Order Dataset 63

Table 19- State Classification Table for Bank of Scotland Data; All Balances are given in Pounds. 66

Table 20- Mean Absolute Error and Mean Absolute Percent Error for Simulation about Bank of Scotland Data  68

Table 21 - Bank of Scotland Model Comparison Vector Closeness Measures. 69

Table 22 - Information Criterion Analysis for Bank of Scotland Data. 69

Table 23 - Mean Absolute Error and Mean Absolute Percent Error for the Bank of Scotland Data. 71

Table 24 - Histogram Breakdown for 12,000 accountrealization Calgary Sample. 73

Table 25 - Comparison Vector Convergence Table for Calgary Healthcare. 77

Table 26 - Information Criteria Data for Calgary Healthcare. 77

Table 27 - Mean Absolute Error and Mean Absolute Percent Error for Calgary Healthcare Data. 78

Table 28 - Predicted Cost at Each Period vs. Real Cost 80

Table 29 - State Description Diagram, where μt represents the time-average at time t, and σ represents the standard deviation at that period. 82

Table 30 - Mean Absolute Error and Mean Absolute Percentage Error for the Simulation of Transportation data  83

Table 31 - Difference in Closeness of Comparison Vectors. 85

Table 32 - Information Criteria Calculation Results. 85

Table 33 - Mean Average Error and Mean Absolute Percent Error for the Group of AccountRealization Error accountrealizationing method. 86

Table 34 - Difference between suggested error and real error of the model from results section. 88

Table 35 - Sample Size Breakdown. 91

Table 36 - Reccomendation Matrix. 92

Table 37 - State Discretization, where μt represents the average for all accountrealizations at time t 92

Table 38 - Mean Absolute Error and Mean Average Percent Error using the Simulation Technique on Precipitation Data  93

Table 39 - Expected Predicted Rainfall vs. Real Rainfall 94

 


Glossary

T – time scale for which T={t:0,1,…T}

A – accountrealization space for which A = {a:0,1,…A}

S – state space for which S = {s: 0,1,… S}

K – order space for which K = {k: 0,1,…r,…}

 - worst case order of K for the comparison vector; the order of the largest transition probability matrix contained in the comparison vector

P(k) – kth order transition probability matrix

pij – transition probability contained in P(k); given the system is in state i , the probability of transitioning into state j at the next time period

pij(n) – n-step transition probability; given the system is in state i , the probability of transitioning into state j n periods in the future

πj – the steady-state probability of being in state j

π – set of steady-state probabilities; π = (π0, π1, … πS)  

- a sequence of events occurring at times t, for a accountrealizations

- an indexed event occurring at time t for accountrealization a

nij…kl – number of occurrences of the sequence of states ij…kl in a given data range

sk – vector of states for a kth order transition probability matrix

- estimated kth order transition probability matrix

N(k) – kth order transition matrix

C(k, r) – the comparison vector created for an rth order model, containing the transition matrices up to an order k

μt(i) – the probability of being in state i at time t

β – Convergence Factor

y – set of observed data

θ(k) – parameters of the kth order Markov model

ρi – The number of parameters in model i

Mi – model i

 

 


Chapter 1:
Introduction

1.1.  State-Based Modeling

State-based modeling derives its name from the process of discretizing real-world events into states, and characterizing the underlying process of a system by the movement between states. The theory of state-based modeling may be applied to a plethora of systems.  Example applications from this thesis include banking, healthcare, transportation and climatology to show the broad reaching impact of the modeling. 

The crux of modeling lies in the assumption that future values are dependent on present and past states of the system.  By determining a guiding process, the future values of a system may be predicted based upon some observed data.  The features of this underlying process are stored in a Markov model.  It becomes necessary to determine how much past information is sufficient to allow for an accurate prediction of future values of a system.  The increasing complexity of the model when incorporating more past information calls a limitation on the amount of history in the model.

This notion of limiting past information leads us to specify the amount of history to include in a Markov model, which is known as model order.  Model order is defined as the number of past periods of information required to predict future values.  Within in this paradigm of model order, there exists the concept of “true order”:  true order may be defined as the real order required to predict future values precisely.  At any order greater than the true order of a system all extra history has no modeling value; the model gains complexity but no improvement on forecasting.  In the real world, however, the true order of a system is rarely known; therefore an estimation of order is necessary to make correct predictions about the data.

The increasing complexity of higher order models leads to notions of incompleteness of certain models.  If models are characterized by some given input, the model exhibits an exponential increase in complexity as one state of history is appended to the estimated order[1].  This leads to the same data characterizing a significantly more involved process which may indicate there are some unknown properties of the model.[2]  It becomes obvious that some estimation of model order can significantly reduce the uncertainty about the model properties.

 

1.2.    Estimation of Model Order

This thesis uses five different order estimation techniques to estimate the model order of five datasets.  Two of these, the simulation technique and comparison vector technique, are new methods to approximate model order using the recent advances in computing via simulation.  The other three order estimating techniques, the Bayesian, Schwarz and Akaike Information Criteria combine research on frequentist theory and stochastic processes to create methods to estimate the order of a model.  Through a rigorous numerical analysis, recommendations are made on the methodology to estimate model order given any sample. 

 

1.3.         Overview

Chapter 2 details the research behind the methods for estimating order, as well as creating a theoretical basis for the problem.  The chapter reviews key assumptions about the data, and the characterization of the data, leading to a clearer understanding of the analysis.  In addition, this chapter reviews relevant literature relating to the field of order approximation of stochastic models.  Chapter 3 describes the methodology associated with performing the tests estimating the model orders.  This section outlines the measures of error and the implications of some of our methodological assumptions.  Chapter 4 details the results from the tests on the five data sets using the five different techniques, and makes some passing assumptions.  Chapter 5 outlines the conclusions obtained from the results as well as a general recommendation, contributions, and future work.

 

1.4.         Thesis Statement

There exists some approach to estimating the order of a model that can provide a precise estimation of the underlying process, allowing for accuracy in predicting future values of a system guided by the same process.


Chapter 2:
Literature Review

2.1.  State-Based Modeling as a Markov Process

State-based modeling, also known as Markovian modeling, is the process from which a structure of states and probabilities generated from observed data predicts the future values of a system.  These Markovian states, ideally, “are descriptors that completely capture the status of the system and contain all the information necessary for decision making regarding the system being studied.”[3]

            Formally, a discrete time Markov chain consists of a set, S, of possible states, where , which occur over a set of states, T, where .  These occurrences may be categorized in a sequence that can be defined as, where Xt represents an event at time t, for which the conditional probability, also known as the transition probability,   remains stationary for all values of n.[4]

            The values of the transition probabilities, pij, satisfy the conditions:, and .  These probabilities can be formed into a transition probability matrix, . The transition probability matrix encapsulates all the possible transitions and the conditional probabilities, allowing for calculations of the modeled process.  The n-step transition probabilities are defined as , and are equivalent to the individual probabilities in the matrix Pn.  In the long run, it is found that the probabilities of transitioning to a future state converge to a set of steady-state probabilities, π.  π may be defined as the set of probabilities πj governed by the equation  , implying that the long term probability of being in state j is independent of events that transpired in the past.   serves as the unique positive solution to the system of equations  , and [5]

            Data-driven modeling, which includes state-based modeling, uses the concept of Markov chains and Markov processes to estimate future values of a system.  The overriding present methodology is to create a model and fit the model to the data, e.g. linear regression.  Data-driven modeling calls for the model to be created by the data, and therefore the modeled process must be subject to observations and not intuition.  To this end, in building a data-driven model, transition probabilities are obtained from observed data, and the states are defined by the system modeler.  The transition probabilities and states then form a Markov model, which may be used for estimating future states of the system.  Consider an example, with the states S = {0,1}.  From these states, one can create a transition probability matrix (see Table 1) from observed state counts.

Table 1 - Sample Transition Probability Matrix

The probabilities, pab, are not known but must be estimated by, which can be calculated as follows for the preceding table:

nab represents the number of one-step observed transitions in the dataset from state a to state b.  It is easy to see how the observed transitions may be formed into transition probabilities to be used in state-based modeling.

2.1.1.    The Markov Property

Governing all of Markov modeling, the Markov property states that the observed process is memory-less. 

Def:(Markov property) –“ [occurs when] states of a process evolve in time… conditional on a knowledge of the state of the system at any given time t the past (before t) and the future (after t) are statistically independent.”[6]

This memory-less property is important in the simulation and functionality of Markov processes.  Through n-step transition probabilities, one can show that the state of the system at a future time epoch can be probabilistically determined based solely on the one-step transition probabilities. 

2.1.2.    Order in State-Based Models

In real-world models, the Markov property may seem like an unnecessary and constricting requirement of modeling.  However, in Markov models, the state does not need to be defined as a state in a single time epoch.  The notion of state augmentation dictates that history can be incorporated into the memory-less Markov models.  The data of previous time periods is incorporated into a Markov model by altering the definitions of the states to include the past k periods.  This concept of an extra k periods held in the state definition, dictates that the model is of kth order.  Order is formally defined as the number of discrete time epochs in “history” of an augmented state definition.  For example, for the 2 state system S={0,1}, a second order model would consist of the states (0-0, 0-1, 1-0, 1-1), where state 0-1 would indicate state 0 was observed at time t-1, and state 1 was observed at time t.  Formally the vector of expanded states may be defined as sk(i), for  , where sk = (0-0-…-0, 0-0-…-1, …, 0-0-…-S, …S-S-…-0, S-S-…-S) indexed by i.  One can easily conclude that the corresponding transition probability matrix would have the dimensions of Sk x Sk.

These higher order state-based models may be created from observed data using the same method with which the first order state-based model was created. Consider an example of first order states of A, B.  The 2nd order augmented state space would consist of states A-A, A-B, B-A, B-B.  Table 2 shows the 2nd order augmented transition probability matrix.

Table 2 – 2nd Order Augmented State Matrix

The probability estimates are defined as follows:

,

where naab is defined as the number of states that are in state a at time t-2 and in state a at time t-1 and in state b at time t, and S is the total number of states that exist in the single time dimension.  It can be seen that the augmented matrix is further constrained as there are augmented states that are non-communicative.  For example, AA cannot transition to BA, because that would assume that the state in the 2nd time epoch of history is both A and B.   These paradoxes lead to further simplification of the higher order transition probability matrix (see Table 3).

Table 3 - Simplified 2nd Order Augmented Transition Probability Matrix

Additionally, in shorthand notation,   may be written as