shahid lecture-11- mkag1273


Upload: nchakori

Post on 15-Jan-2017


Category:

Engineering



TRANSCRIPT

Page 1: Shahid Lecture-11- MKAG1273

MAL1303: STATISTICAL HYDROLOGY

Fitting Distribution & Markov Chain Analysis

Dr. Shamsuddin Shahid

Department of Hydraulics and Hydrology

Faculty of Civil Engineering, Universiti Teknologi Malaysia

Room No.: M46-332; Phone: 07-5531624; Mobile: 0182051586 Email: [email protected]


Page 2: Shahid Lecture-11- MKAG1273

Modeling Distribution

One common application of probability distributions is modeling univariate data with a specific probability distribution. This involves the following two steps:

1. Determination of the "best-fitting" distribution.
2. Estimation of the parameters (shape, location, and scale parameters) for that distribution.

There are various methods, both numerical and graphical, for estimating the parameters of a probability distribution and assessing the fit:

1. Moments
2. Maximum likelihood
3. Least squares
4. Probability plots
5. Statistical tests
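As an illustration of the first estimation method (moments), the sketch below estimates the shape and scale of a two-parameter Gamma distribution by matching the sample mean and variance. The function name and the sample values are hypothetical and are not taken from the lecture data.

```python
from statistics import mean, pvariance


def gamma_moments(sample):
    """Method-of-moments estimates (shape, scale) for a Gamma distribution."""
    m = mean(sample)
    v = pvariance(sample, mu=m)  # population variance of the sample
    shape = m * m / v            # alpha = mean^2 / variance
    scale = v / m                # beta = variance / mean
    return shape, scale


# Hypothetical sample, used only to show the arithmetic.
sample = [2.0, 4.0, 4.0, 6.0]
alpha, beta = gamma_moments(sample)
print(alpha, beta)
```

For a Gamma distribution the mean is αβ and the variance is αβ², which is where the two expressions in the function come from.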

11/23/2015 Shamsuddin Shahid, FKA, UTM


Page 3: Shahid Lecture-11- MKAG1273

Probability Plot

Statistical Tests

• Chi-square Test
• Kolmogorov-Smirnov (K-S) Test
• Anderson-Darling Test


Page 4: Shahid Lecture-11- MKAG1273

Fitting Data Distribution


Page 5: Shahid Lecture-11- MKAG1273

Fitting Data Distribution


Page 6: Shahid Lecture-11- MKAG1273

Fitting Data Distribution


Page 7: Shahid Lecture-11- MKAG1273

Chi-square Test

Kolmogorov-Smirnov (K-S) Test

Anderson-Darling (AD) Test


Page 8: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test

• A fully non-parametric test for comparing two distributions
• Does not depend on approximations for the distribution

Given two cumulative probability functions $F_X$ and $F_Y$, the test statistics are

$D^{+} = \max_x \left( F_X(x) - F_Y(x) \right)$

$D^{-} = \max_x \left( F_Y(x) - F_X(x) \right)$

Usually the value $D = \max\{D^{+}, D^{-}\}$ is used.
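A minimal sketch of the two one-sided statistics and their maximum, assuming both cumulative functions are empirical CDFs built from samples. The helper names and the two samples are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF of a sample as a function of x."""
    data = sorted(sample)
    n = len(data)
    return lambda x: bisect_right(data, x) / n


def ks_two_sample(xs, ys):
    """Return (D+, D-, D) evaluated at every observed point."""
    fx, fy = ecdf(xs), ecdf(ys)
    points = sorted(set(xs) | set(ys))
    d_plus = max(fx(p) - fy(p) for p in points)
    d_minus = max(fy(p) - fx(p) for p in points)
    return d_plus, d_minus, max(d_plus, d_minus)


# Hypothetical samples.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.5, 3.5, 4.5, 5.5]
d_plus, d_minus, d = ks_two_sample(xs, ys)
print(d_plus, d_minus, d)
```

Because empirical CDFs are step functions that only change at data points, it is enough to evaluate the differences at the pooled observations.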


Page 9: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test

$D^{+} = \max_x \left( F_X(x) - F_Y(x) \right)$

$D^{-} = \max_x \left( F_Y(x) - F_X(x) \right)$


Page 10: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov Test: Advantages

• It is non-parametric and hence robust.
• It does not rely only on the location of the mean (as the t-test does).
• It works for non-normal data (the t-test can fail if the data are too far from normal).
• It is not sensitive to scaling.
• It is more powerful than the χ² test.

However, it is less sensitive than the t-test if the data are indeed normal.


Page 11: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test: Example

Problem: Samples of groundwater depth (meter) in a catchment are collected as given below. What is the distribution of the data?


Page 12: Shahid Lecture-11- MKAG1273

Probability Plots

Normal Distribution

Gamma Distribution (α=2; β=2)


Page 13: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test

To identify the distribution of the data, we have to fit the data with different types of distributions.

Let us first try the Gamma distribution with α=2; β=2.

Therefore,

Ho: Groundwater depth data follow the Gamma distribution (α=2; β=2)

Ha: Groundwater depth data do not follow the Gamma distribution (α=2; β=2)


Page 14: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test

Steps:
1. Arrange the data in ascending order.
2. Rank the data.
3. Calculate the observed cumulative frequency as i/(n+1), where i is the rank and n is the number of data points.
4. Calculate the expected cumulative frequency of the data for the particular distribution of interest.
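Steps 1 to 3 can be sketched as follows. The function name and the depth values are hypothetical, since the lecture's data table is not reproduced in this transcript.

```python
def observed_cumulative_frequency(data):
    """Sort the data, rank it, and return (value, rank, i/(n+1)) rows."""
    ordered = sorted(data)                 # step 1: arrange in order
    n = len(ordered)
    return [
        (x, i, i / (n + 1))                # steps 2 and 3: rank and frequency
        for i, x in enumerate(ordered, start=1)
    ]


# Hypothetical groundwater depths in metres.
depths = [5.2, 4.1, 6.3, 5.8]
rows = observed_cumulative_frequency(depths)
for value, rank, freq in rows:
    print(value, rank, freq)
```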


Page 15: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test

Steps: Calculate the expected cumulative frequency of the data for the particular distribution of interest.

In the present case we calculate the expected cumulative frequency for the Gamma distribution (α=2; β=2) using the spreadsheet function

GAMMADIST(x, α, β, cumulative)

Example: GAMMADIST(4.13, 2, 2, 1) = 0.6113
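The spreadsheet result can be cross-checked outside a spreadsheet. The sketch below uses the closed form of the Gamma CDF that holds when the shape parameter is a positive integer (as it is here, α=2); the function name is hypothetical, and a general-purpose routine would be needed for non-integer shapes.

```python
import math


def gamma_cdf_integer_shape(x, alpha, beta):
    """Gamma CDF for integer shape alpha and scale beta (Erlang form)."""
    if x <= 0:
        return 0.0
    t = x / beta
    # F(x) = 1 - exp(-t) * sum_{k=0}^{alpha-1} t^k / k!
    tail = sum(t ** k / math.factorial(k) for k in range(alpha))
    return 1.0 - math.exp(-t) * tail


p = gamma_cdf_integer_shape(4.13, 2, 2)
print(round(p, 4))
```

With x = 4.13, α = 2 and β = 2 this agrees with the value 0.6113 quoted above.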


Page 16: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test

The K–S statistic Dn is defined as:

Dn = max[|Fn(x) – F(x)|]

where
n = total number of data points
F(x) = distribution function of the fitted distribution
Fn(x) = i/(n+1)
i = the cumulative rank of the data point
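A minimal sketch of this definition, assuming the fitted distribution is supplied as a Python function. The data and the uniform CDF used here are hypothetical and serve only to make the arithmetic easy to follow.

```python
def ks_statistic(data, cdf):
    """Dn = max |Fn(x) - F(x)| with Fn(x_i) = i/(n+1) for the i-th ranked value."""
    ordered = sorted(data)
    n = len(ordered)
    return max(
        abs(i / (n + 1) - cdf(x))
        for i, x in enumerate(ordered, start=1)
    )


def uniform_cdf(x):
    """Hypothetical fitted distribution: uniform on [0, 10]."""
    return min(max(x / 10.0, 0.0), 1.0)


# Hypothetical data.
data = [9.0, 2.0, 5.0]
dn = ks_statistic(data, uniform_cdf)
print(round(dn, 4))
```

In practice the `cdf` argument would be the Gamma or normal cumulative function of the candidate distribution.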


Page 17: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test

The K–S statistic,

Dn = max[|Fn(x) – F(x)|] = 0.5302

α = 0.05; n = 17; Dcritical = 0.318

Since 0.5302 > 0.318, the null hypothesis is rejected.

Decision: Groundwater depth is significantly different from the Gamma distribution (α=2; β=2).


Page 18: Shahid Lecture-11- MKAG1273


Page 19: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test

Let us now try the normal distribution.

Therefore,

Ho: Groundwater depth data are normally distributed

Ha: Groundwater depth data are not normally distributed


Page 20: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test

In the present case we calculate the expected cumulative frequency for the normal distribution.

Mean of the data = 5.34
Standard deviation = 0.865722

NORMDIST(x, mean, stdev, cumulative)

Example: NORMDIST(4.13, 5.34, 0.865722, 1) = 0.0811
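The same cumulative probability can be reproduced with the error function from the standard library. This is a sketch of the calculation, not the spreadsheet's own implementation; the function name is hypothetical.

```python
import math


def normal_cdf(x, mean, stdev):
    """Cumulative normal probability, comparable to NORMDIST(x, mean, stdev, 1)."""
    z = (x - mean) / stdev
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


p = normal_cdf(4.13, 5.34, 0.865722)
print(round(p, 4))
```

The standardized value is z = (4.13 – 5.34)/0.865722 ≈ –1.40, which gives a cumulative probability close to the 0.0811 quoted above.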


Page 21: Shahid Lecture-11- MKAG1273

Kolmogorov-Smirnov (K-S) Test

The K–S statistic,

Dn = max[|Fn(x) – F(x)|] = 0.1648

α = 0.05; n = 17; Dcritical = 0.318

Since 0.1648 < 0.318, the null hypothesis cannot be rejected.

Decision: Groundwater depth is not significantly different from the normal distribution.


Page 22: Shahid Lecture-11- MKAG1273


Page 23: Shahid Lecture-11- MKAG1273

Anderson-Darling (AD) Test

The Anderson-Darling (AD) test is also widely used in practice.

The AD goodness-of-fit test can be carried out using the following formula:

$AD = -n - \sum_{i=1}^{n} \frac{2i-1}{n} \left[ \ln F(Z_{(i)}) + \ln\left(1 - F(Z_{(n+1-i)})\right) \right]$

The hypothesis is rejected if AD > CV, where

CV = 0.752 / (1 + 0.75/n + 2.25/n²)
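The statistic and the critical value can be sketched as below, using the standard form of the Anderson-Darling sum over the ordered data Z(1) ≤ ... ≤ Z(n). The function names, the data, and the uniform CDF are hypothetical. The constant 0.752 is the one usually tabulated for a normal distribution with estimated parameters at the 5% level, so applying the same CV to other distributions is an assumption.

```python
import math


def anderson_darling(data, cdf):
    """AD = -n - sum((2i-1)/n * [ln F(Z_(i)) + ln(1 - F(Z_(n+1-i)))])."""
    z = sorted(data)
    n = len(z)
    total = 0.0
    for i in range(1, n + 1):
        # z[i - 1] is Z_(i); z[n - i] is Z_(n+1-i) in zero-based indexing.
        total += (2 * i - 1) / n * (
            math.log(cdf(z[i - 1])) + math.log(1.0 - cdf(z[n - i]))
        )
    return -n - total


def critical_value(n):
    """CV = 0.752 / (1 + 0.75/n + 2.25/n^2), as given in the lecture."""
    return 0.752 / (1 + 0.75 / n + 2.25 / n ** 2)


# Hypothetical data tested against a hypothetical uniform(0, 1) CDF.
data = [0.25, 0.5, 0.75]
ad = anderson_darling(data, lambda x: x)
cv17 = critical_value(17)
print(round(ad, 4), round(cv17, 4))
```

Note that F must be strictly between 0 and 1 at every data point, otherwise the logarithms are undefined.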


Page 24: Shahid Lecture-11- MKAG1273

Anderson-Darling (AD) Test: Example

Groundwater depth data of a catchment are given below. Find the best distribution that fits the data.

Solution:

First, we shall try the normal distribution, then other distributions.


Page 25: Shahid Lecture-11- MKAG1273

Anderson-Darling (AD) Test: Example

$AD = -n - \sum_{i=1}^{n} \frac{2i-1}{n} \left[ \ln F(Z_{(i)}) + \ln\left(1 - F(Z_{(n+1-i)})\right) \right]$


Page 26: Shahid Lecture-11- MKAG1273

Anderson-Darling (AD) Test: Example

Gamma distribution with α=1; β=5.

Decision: Groundwater depth is not significantly different from the Gamma distribution with α=1; β=5.


Page 27: Shahid Lecture-11- MKAG1273

Stochastic Processes

A stochastic process is the counterpart to a deterministic process.

Instead of dealing with only one possible way the process might develop over time, in a stochastic or random process there is some indeterminacy described by probability distributions.

This means that even if the initial condition (or starting point) is known, there are many paths the process might take, but some paths may be more probable and others less so.

In the simplest possible case, a stochastic process amounts to a sequence of random variables known as a time series.


Page 28: Shahid Lecture-11- MKAG1273

Stochastic Hydrology

Stochastic hydrology is mainly concerned with the assessment of uncertainty in model predictions.

Stochastic hydrology is an essential base of water resources systems analysis, due to the inherent randomness of the input, and consequently of the results.

Stochastic hydrology is very important in the decision-making process regarding the planning and management of water systems.


Page 29: Shahid Lecture-11- MKAG1273

Stochastic Hydrology

In the simplest possible case, a stochastic process amounts to a sequence of random variables known as a time series.

Stochastic analysis recognizes the pattern of random events, subject to a certain degree of uncertainty.

One way of identifying such patterns is known as Markov Chain Analysis.


Page 30: Shahid Lecture-11- MKAG1273

Markov Chain Analysis

• A Markov chain, named for Andrey Markov, is a random process in which the next state depends only on the current state and not on the past.

• A Markov analysis looks at a sequence of events and analyzes the tendency of one event to be followed by another.

• A Markov process is useful for analyzing dependent random events, that is, events whose likelihood depends on what happened last.

• The Markov chain is based on the assumption that the occurrence of one event depends upon the immediately preceding event.


Page 31: Shahid Lecture-11- MKAG1273

Markov Chain Analysis

• Markov chains are widely used in hydrology to predict the occurrence of hydrological events.

• Markov chain analysis has been used to quantify tendencies of hydrological processes, for example whether a certain phenomenon is likely to increase in the near term.

• Prediction of hydrological hazards or any other natural events.

• Prediction of weather, river discharge, etc.


Page 32: Shahid Lecture-11- MKAG1273

Markov Chain Analysis

Let us consider a sequence of events as given below:

ABACAABCABBBCABCCABBCA

Is there any pattern present in the sequence?

Apparently there is no clear pattern of occurrence of events.

Markov chain tries to find the patterns present in the sequence.

Once patterns are identified, it is possible to estimate the probability of future events.


Page 33: Shahid Lecture-11- MKAG1273

Markov Analysis: Example

Let us consider rainfall time series data for twenty years, as given below. We want to identify the pattern in the rainfall and predict future rainfall.


Page 34: Shahid Lecture-11- MKAG1273

Markov Analysis: Example


Page 35: Shahid Lecture-11- MKAG1273

Markov Analysis: Example

The sequence of precipitation climate is:

D N N N W W VW N N W W VW N D N N W W W VW D


Page 36: Shahid Lecture-11- MKAG1273

Markov Analysis

Step-1: Find the transitional frequency matrix

D N N N W W VW N N W W VW N D N N W W W VW D
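Step 1 amounts to counting how often each state is followed by each other state. A minimal sketch using the sequence above; the ordering of states in the matrix (D, N, W, VW) is an assumption made for display.

```python
from collections import Counter

sequence = "D N N N W W VW N N W W VW N D N N W W W VW D".split()
states = ["D", "N", "W", "VW"]  # assumed display order

# Count each (current, next) pair of consecutive states.
pairs = Counter(zip(sequence, sequence[1:]))

# Rows are the current state, columns are the next state.
frequency = [[pairs[(a, b)] for b in states] for a in states]

for state, row in zip(states, frequency):
    print(f"{state:>2}", row)
```

The 21 observations give 20 transitions, so the entries of the matrix sum to 20.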


Page 37: Shahid Lecture-11- MKAG1273

Markov Analysis

Step-2: Find the transitional probability matrix
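Each row of the transition frequency matrix is divided by its row total to give transition probabilities. A minimal sketch for the precipitation sequence of this example, again assuming the state order D, N, W, VW:

```python
from collections import Counter

sequence = "D N N N W W VW N N W W VW N D N N W W W VW D".split()
states = ["D", "N", "W", "VW"]  # assumed display order
pairs = Counter(zip(sequence, sequence[1:]))

probability = []
for a in states:
    row = [pairs[(a, b)] for b in states]
    total = sum(row)
    # Guard against a state that is never left (row total of zero).
    probability.append([count / total if total else 0.0 for count in row])

for state, row in zip(states, probability):
    print(f"{state:>2}", [round(p, 3) for p in row])
```

Every row with at least one transition sums to 1, since it describes what follows a given state.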


Page 38: Shahid Lecture-11- MKAG1273

Markov Analysis

Step-3: Construction of flow diagram


Page 39: Shahid Lecture-11- MKAG1273

Markov Analysis

Step-4: Find the likely cycles.


Page 40: Shahid Lecture-11- MKAG1273

Markov Analysis: Testing of Transitional Frequency Matrix

χ² test for randomness in the transition frequency matrix

Dividing each column total of the observed transition frequency matrix by the total number of transitions, the fixed probability vector is calculated:

N 0.4
W 0.35
VW 0.15
D 0.1

The expected random transition probability matrix is then determined by these probabilities.
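The fixed probability vector can be reproduced from the precipitation sequence of this example: the column total for a state is the number of transitions that end in that state. A minimal sketch, with hypothetical variable names:

```python
from collections import Counter

sequence = "D N N N W W VW N N W W VW N D N N W W W VW D".split()
transitions = list(zip(sequence, sequence[1:]))

# Column totals: how many transitions arrive in each state.
column_totals = Counter(nxt for _, nxt in transitions)

fixed = {s: column_totals[s] / len(transitions) for s in ["N", "W", "VW", "D"]}
print(fixed)

# Under randomness every row of the expected probability matrix is this vector.
expected_probability = [list(fixed.values()) for _ in fixed]
```

Under the null hypothesis of random transitions, the next state does not depend on the current one, which is why every row of the expected matrix is the same.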


Page 41: Shahid Lecture-11- MKAG1273

Markov Analysis

The probabilities are converted into expected counts by multiplying by the row totals of the observed transition frequency matrix, to give the expected random transition frequency matrix.

Now we have an observed transition frequency matrix and an expected random transition frequency matrix in the same form.

The difference between the observed and expected frequencies can be evaluated using the χ² statistic.


Page 42: Shahid Lecture-11- MKAG1273

Markov Analysis

Null Hypothesis (H0): the data come from a population of transitions that are random; the probability of encountering a climate is not dependent on the previous climate.

Alternative Hypothesis (HA): the data come from a population of transitions that are non-random; the probability of encountering a climate is dependent on the previous climate.


Page 43: Shahid Lecture-11- MKAG1273

Markov Analysis

$\chi^2 = \sum_j \frac{(O_j - E_j)^2}{E_j}$

Degrees of freedom:
ν = ((no. of states) – 1)² = (4 – 1)² = 9

χ²(0.05, 9) = 16.92

χ²(calculated) > χ²(critical)

Null hypothesis rejected.

Decision: There is a significant Markov property. The occurrence of a climate is, to an extent, dependent on the preceding climate.
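The test can be sketched for any square transition frequency matrix as below. Expected counts are row total times column total divided by the number of transitions, which is the same as multiplying the fixed probability vector by the row totals. The function name and the 2×2 matrix are hypothetical and are not the lecture's data.

```python
def chi_square_randomness(observed):
    """Return (chi-square, degrees of freedom) for a square transition frequency matrix."""
    k = len(observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(row[j] for row in observed) for j in range(k)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(k):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # skip cells with no expected transitions
                chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2, (k - 1) ** 2


# Hypothetical two-state transition counts.
observed = [[8, 2], [3, 7]]
chi2, dof = chi_square_randomness(observed)
print(round(chi2, 4), dof)
```

The computed value is then compared with the tabulated critical value for the chosen significance level and degrees of freedom.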


Page 44: Shahid Lecture-11- MKAG1273

Markov Analysis
