Poisson Statistics: Difference between revisions

{{Probability distribution
  | name      = Poisson
  | type      = mass
  | pdf_image  = [[File:poisson pmf.svg|325px|Plot of the Poisson PMF]]<br />The horizontal axis is the index ''k'', the number of occurrences. The function is only defined at integer values of ''k''. The connecting lines are only guides for the eye.
  | cdf_image  = [[File:poisson cdf.svg|325px|Plot of the Poisson CDF]]<br />The horizontal axis is the index ''k'', the number of occurrences. The CDF is discontinuous at the integers of ''k'' and flat everywhere else because a variable that is Poisson distributed only takes on integer values.
  | notation  = <math>\mathrm{Pois}(\lambda)\,</math>
  | parameters = ''λ'' > 0 ([[real number|real]])
  | support    = ''k'' ∈ { 0, 1, 2, 3, ... }
  | pdf        = <math>\frac{\lambda^k}{k!}\cdot e^{-\lambda}</math>
  | cdf        = <math>\frac{\Gamma(\lfloor k+1\rfloor, \lambda)}{\lfloor k\rfloor !}\!</math> --or-- <math>e^{-\lambda} \sum_{i=0}^{\lfloor k\rfloor} \frac{\lambda^i}{i!}\ </math>
(for <math>k\ge 0</math> where <math>\Gamma(x, y)\,\!</math> is the [[Incomplete gamma function]] and <math>\lfloor k\rfloor</math> is the [[floor function]])
  | mean      = <math>\lambda\,\!</math>
  | median    = <math>\approx\lfloor\lambda+1/3-0.02/\lambda\rfloor</math>
  | mode      = <math>\lfloor\lambda\rfloor,\,\lceil\lambda\rceil - 1</math>
  | variance  = <math>\lambda\,\!</math>
  | skewness  = <math>\lambda^{-1/2}\,</math>
  | kurtosis  = <math>\lambda^{-1}\,</math>
  | entropy    = <math>\lambda[1\!-\!\log(\lambda)]\!+\!e^{-\lambda}\sum_{k=0}^\infty \frac{\lambda^k\log(k!)}{k!}</math>
(for large <math>\lambda</math>)
<math>\frac{1}{2}\log(2 \pi e \lambda) - \frac{1}{12 \lambda} - \frac{1}{24 \lambda^2} - \frac{19}{360 \lambda^3} + O\left(\frac{1}{\lambda^4}\right)</math>
  | pgf        = <math> \exp(\lambda(z - 1))\,</math>
  | mgf        = <math>\exp(\lambda (e^{t}-1))\,</math>
  | char      = <math>\exp(\lambda (e^{it}-1))\,</math>
}}
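
The pmf, CDF, mean and variance entries above can be checked numerically. The following is a minimal Python sketch, not part of the original article; the function names are hypothetical, and the pmf is evaluated in log space only as a convenience for large ''k''.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Pois(lam), i.e. lam**k / k! * exp(-lam), in log space."""
    if k < 0:
        return 0.0
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_cdf(k: float, lam: float) -> float:
    """P(X <= k), using the finite-sum form with floor(k) as the upper limit."""
    if k < 0:
        return 0.0
    return sum(poisson_pmf(i, lam) for i in range(math.floor(k) + 1))


def moments(lam: float, terms: int = 200):
    """Mean and variance obtained by summing over k = 0 .. terms - 1.

    The truncation at `terms` is an assumption that is adequate for small lam.
    """
    mean = sum(k * poisson_pmf(k, lam) for k in range(terms))
    var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in range(terms))
    return mean, var
```

For moderate ''λ'', `moments(lam)` should return two values that both agree with ''λ'' to within rounding error, matching the mean and variance rows.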

Revision as of 14:33, 3 October 2013

In probability theory, a Poisson process is a stochastic process that counts the number of events[note 1] occurring in a given time interval and records the times at which they occur. The time between each pair of consecutive events has an exponential distribution with parameter λ, and each of these inter-arrival times is assumed to be independent of the others.

The Poisson process is a continuous-time process; the sum of a Bernoulli process can be thought of as its discrete-time counterpart. A Poisson process is a pure-birth process, the simplest example of a birth-death process. It is also a point process on the real half-line.

Definition


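The basic form of Poisson process, often referred to simply as "the Poisson process", is a continuous-time counting process {N(t), t ≥ 0} that possesses the following properties:

  • N(0) = 0
  • Independent increments (the numbers of occurrences counted in disjoint intervals are independent of each other)
  • Stationary increments (the probability distribution of the number of occurrences counted in any time interval depends only on the length of the interval)
  • The probability distribution of N(t) is a Poisson distribution.
  • No counted occurrences are simultaneous.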

Consequences of this definition include:

  • The probability distribution of the waiting time until the next occurrence is an exponential distribution.
  • The occurrences are distributed uniformly on any interval of time. (Note that N(t), the total number of occurrences, has a Poisson distribution over (0, t], whereas the location of an individual occurrence on t ∈ (a, b] is uniform.)
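
The uniformity consequence suggests one way to generate a realisation on a fixed window: draw the total count first, then place that many points uniformly. The Python sketch below is illustrative only; the function names are hypothetical, and the count is drawn by counting unit-rate exponential gaps, a simple method chosen here for self-containment rather than speed.

```python
import random


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Poisson(mean) variate: number of unit-rate exponential gaps fitting in [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def sample_event_times(lam: float, horizon: float, rng: random.Random) -> list:
    """Event times on the window up to `horizon`: Poisson count, then sorted uniforms."""
    n = sample_poisson(lam * horizon, rng)
    return sorted(rng.uniform(0.0, horizon) for _ in range(n))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print(sample_event_times(2.0, 5.0, rng))
```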

Other types of Poisson process are described below.

Types

Homogeneous

Figure: Sample path of a counting Poisson process N(t).

The homogeneous Poisson process is one of the best-known Lévy processes. It is characterized by a rate parameter λ, also known as the intensity, such that the number of events in the time interval (t, t + τ] follows a Poisson distribution with associated parameter λτ. This relation is given as

<math>P[N(t+\tau) - N(t) = k] = \frac{e^{-\lambda\tau}(\lambda\tau)^k}{k!}, \qquad k = 0, 1, \ldots,</math>

where N(t + τ) − N(t) = k is the number of events in the time interval (t, t + τ].

Just as a Poisson random variable is characterized by its scalar parameter λ, a homogeneous Poisson process is characterized by its rate parameter λ, which is the expected number of "events" or "arrivals" that occur per unit time.
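
As a rough illustration of the rate parameter, the Python sketch below evaluates the probability of k events in an interval of length τ, namely exp(−λτ)(λτ)^k / k!, and builds arrival times from independent exponential gaps with parameter λ. The function names are hypothetical and the code is a sketch under those assumptions, not a reference implementation.

```python
import math
import random


def increment_probability(k: int, lam: float, tau: float) -> float:
    """P[N(t + tau) - N(t) = k] = exp(-lam * tau) * (lam * tau)**k / k!"""
    return math.exp(-lam * tau) * (lam * tau) ** k / math.factorial(k)


def simulate_arrivals(lam: float, horizon: float, rng: random.Random) -> list:
    """Arrival times on (0, horizon], built from i.i.d. Exp(lam) inter-arrival gaps."""
    times, t = [], rng.expovariate(lam)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(lam)
    return times


if __name__ == "__main__":
    rng = random.Random(1)
    arrivals = simulate_arrivals(4.0, 3.0, rng)
    # On average about lam * horizon = 12 arrivals are expected on this window.
    print(len(arrivals), increment_probability(0, 4.0, 1.0))
```

Averaged over many runs, the number of simulated arrivals falling in any unit-length window should be close to λ, the expected number of arrivals per unit time.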

Here N(t) denotes a sample path of a homogeneous Poisson process; it should not be confused with a density or distribution function.

Non-homogeneous


In general, the rate parameter may change over time; such a process is called a non-homogeneous Poisson process or inhomogeneous Poisson process. In this case, the generalized rate function is given as λ(t). Now the expected number of events between time a and time b is

<math>\lambda_{a,b} = \int_a^b \lambda(t)\,dt.</math>

Thus, the number of arrivals in the time interval (a, b], given as N(b) − N(a), follows a Poisson distribution with associated parameter <math>\lambda_{a,b}</math>:

<math>P[N(b) - N(a) = k] = \frac{e^{-\lambda_{a,b}}(\lambda_{a,b})^k}{k!}, \qquad k = 0, 1, \ldots</math>

A rate function λ(t) in a non-homogeneous Poisson process can be either a deterministic function of time or an independent stochastic process; the latter case gives rise to a Cox process. A homogeneous Poisson process may be viewed as the special case λ(t) = λ, a constant rate.
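
For a deterministic rate function, one common way to simulate such a process is thinning (the Lewis–Shedler method), which the article itself does not describe: candidates are generated from a homogeneous process with a rate that bounds λ(t) from above, and each candidate at time t is kept with probability λ(t) divided by that bound. The Python sketch below assumes such a bound `rate_max` is known. All names, including the example rate 2 + sin(t), are hypothetical. The expected count over (a, b], the integral of λ(t), is approximated with a midpoint rule.

```python
import math
import random


def expected_count(rate, a: float, b: float, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of rate(t) over (a, b]."""
    h = (b - a) / steps
    return h * sum(rate(a + (i + 0.5) * h) for i in range(steps))


def simulate_thinning(rate, rate_max: float, horizon: float, rng: random.Random) -> list:
    """Arrival times on (0, horizon] for an intensity with rate(t) <= rate_max."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_max)          # next candidate from a homogeneous process
        if t > horizon:
            return times
        if rng.random() * rate_max <= rate(t):  # keep with probability rate(t) / rate_max
            times.append(t)


def example_rate(t: float) -> float:
    """Illustrative intensity, bounded above by 3."""
    return 2.0 + math.sin(t)


if __name__ == "__main__":
    rng = random.Random(3)
    print(expected_count(example_rate, 0.0, 2 * math.pi))
    print(len(simulate_thinning(example_rate, 3.0, 2 * math.pi, rng)))
```

With the example rate, the expected count over one full period is 4π, so simulated counts should average a little over 12.5.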

Spatial

An important variation on the (notionally time-based) Poisson process is the spatial Poisson process. In the case of a one-dimensional space (a line), the theory differs from that of a time-based Poisson process only in the interpretation of the index variable. For higher-dimensional spaces, where the index variable (now x) is in some vector space V (e.g. R² or R³), a spatial Poisson process can be defined by the requirement that the random variables defined as the counts of the number of "events" inside each of a number of non-overlapping finite sub-regions of V should each have a Poisson distribution and should be independent of each other.
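
A minimal Python sketch of the homogeneous case on a rectangle in R² follows, assuming a constant intensity per unit area. The total count is Poisson with mean intensity × area, and the points are then placed uniformly. The function names are hypothetical.

```python
import random


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Poisson(mean) variate: number of unit-rate exponential gaps fitting in [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def sample_spatial(intensity: float, width: float, height: float, rng: random.Random) -> list:
    """Points of a homogeneous spatial Poisson process on [0, width] x [0, height]."""
    n = sample_poisson(intensity * width * height, rng)
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]


def count_in(points, x0: float, x1: float, y0: float, y1: float) -> int:
    """Number of points falling in the sub-region [x0, x1) x [y0, y1)."""
    return sum(1 for x, y in points if x0 <= x < x1 and y0 <= y < y1)


if __name__ == "__main__":
    rng = random.Random(2)
    pts = sample_spatial(5.0, 4.0, 2.0, rng)
    print(count_in(pts, 0.0, 2.0, 0.0, 2.0), count_in(pts, 2.0, 4.0, 0.0, 2.0))
```

Counts in the two non-overlapping halves each have mean intensity × area (20 in this example) and, by the defining requirement, are independent of each other.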

Space-time

A further variation on the Poisson process, the space-time Poisson process, allows for separately distinguished space and time variables. Even though this can theoretically be treated as a pure spatial process by treating "time" as just another component of a vector space, it is convenient in most applications to treat space and time separately, both for modeling purposes in practical applications and because of the types of properties of such processes that it is interesting to study.

In comparison to a time-based inhomogeneous Poisson process, the extension to a space-time Poisson process can introduce a spatial dependence into the rate function, such that it is defined as <math>\lambda(x, t)</math>, where <math>x \in V</math> for some vector space V (e.g. R² or R³). However, a space-time Poisson process may have a rate function that is constant with respect to either or both of x and t. For any set <math>S \subset V</math> (e.g. a spatial region) with finite measure <math>\mu(S)</math>, the number of events occurring inside this region can be modeled as a Poisson process with associated rate function <math>\lambda_S(t)</math> such that

<math>\lambda_S(t) = \int_S \lambda(x, t)\,d\mu(x).</math>

Separable space-time processes

In the special case that this generalized rate function is a separable function of time and space, we have:

<math>\lambda(x, t) = f(x)\,\lambda(t)</math>

for some function <math>f(x)</math>. Without loss of generality, let

<math>\int_V f(x)\,d\mu(x) = 1.</math>

(If this is not the case, λ(t) can be scaled appropriately.) Now, <math>f(x)</math> represents the spatial probability density function of these random events in the following sense. The act of sampling this space-time Poisson process is equivalent to sampling a Poisson process with rate function λ(t), and associating with each event a random vector <math>X</math> sampled from the probability density function <math>f(x)</math>. A similar result can be shown for the general (non-separable) case.
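
As a short worked step, assume the separable form in which the space-time rate is the product of a spatial function f(x) and a temporal rate λ(t), with f non-negative and integrating to 1 over V with respect to the measure μ. The rate of events falling in a region S then factors as shown below. The symbol p_S is introduced here only for illustration.

```latex
\begin{align*}
\lambda_S(t) &= \int_S \lambda(x, t)\, d\mu(x)
              = \lambda(t) \int_S f(x)\, d\mu(x)
              = p_S\, \lambda(t),
\qquad p_S := \int_S f(x)\, d\mu(x) \in [0, 1].
\end{align*}
```

So each event of the process with rate λ(t) lands in S with probability p_S, which is consistent with reading f as the spatial probability density.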

Characterisation

In its most general form, the only two conditions for a counting process to be a Poisson process are:[citation needed]

  • Orderliness: which roughly means
<math>\lim_{\Delta t \to 0} P(N(t+\Delta t) - N(t) > 1 \mid N(t+\Delta t) - N(t) \geq 1) = 0,</math>
which implies that arrivals don't occur simultaneously (but this is actually a mathematically stronger statement).
  • Memorylessness (also called evolution without after-effects): the number of arrivals occurring in any bounded interval of time after time t is independent of the number of arrivals occurring before time t.

These seemingly unrestrictive conditions actually impose a great deal of structure on the Poisson process. In particular, they imply that the times between consecutive events (called inter-arrival times) are independent random variables. For the homogeneous Poisson process, these inter-arrival times are exponentially distributed with parameter λ (mean 1/λ).

Also, the memorylessness property entails that the number of events in any time interval is independent of the number of events in any other interval that is disjoint from it. This latter property is known as the independent increments property of the Poisson process.

Properties

As defined above, the stochastic process {N(t)} is a Markov process, or more specifically, a continuous-time Markov process.[citation needed]

To illustrate the exponentially distributed inter-arrival times property, consider a homogeneous Poisson process N(t) with rate parameter λ, and let <math>T_k</math> be the time of the kth arrival, for k = 1, 2, 3, ... . Clearly the number of arrivals before some fixed time t is less than k if and only if the waiting time until the kth arrival is more than t. In symbols, the event [N(t) < k] occurs if and only if the event [<math>T_k</math> > t] occurs. Consequently the probabilities of these events are the same:

<math>P(T_k > t) = P(N(t) < k).</math>

In particular, consider the waiting time until the first arrival. Clearly that time is more than t if and only if the number of arrivals before time t is 0. Combining this latter property with the above probability distribution for the number of homogeneous Poisson process events in a fixed interval gives

<math>P(T_1 > t) = P(N(t) = 0) = P[N(t) - N(0) = 0] = \frac{e^{-\lambda t}(\lambda t)^0}{0!} = e^{-\lambda t}.</math>

Consequently, the waiting time until the first arrival <math>T_1</math> has an exponential distribution, and is thus memoryless. One can similarly show that the other inter-arrival times <math>T_k - T_{k-1}</math> share the same distribution. Together with the independence of the inter-arrival times, this makes them independent, identically distributed (i.i.d.) exponential random variables with parameter λ > 0 and expected value 1/λ. For example, if the average rate of arrivals is 5 per minute, then the average waiting time between arrivals is 1/5 minute (12 seconds).
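
The equivalence between the event that the kth arrival time exceeds t and the event that fewer than k arrivals occur by time t can be checked by simulation. The Python sketch below is illustrative only, with hypothetical function names. It compares the Poisson sum for fewer than k arrivals with the observed fraction of runs in which a sum of k independent exponential gaps with parameter λ exceeds t.

```python
import math
import random


def prob_fewer_than_k(k: int, lam: float, t: float) -> float:
    """P(N(t) < k) = sum over j < k of exp(-lam * t) * (lam * t)**j / j!"""
    return sum(math.exp(-lam * t) * (lam * t) ** j / math.factorial(j) for j in range(k))


def empirical_tail(k: int, lam: float, t: float, runs: int, rng: random.Random) -> float:
    """Fraction of runs in which the kth arrival time, a sum of k Exp(lam) gaps, exceeds t."""
    hits = 0
    for _ in range(runs):
        if sum(rng.expovariate(lam) for _ in range(k)) > t:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    rng = random.Random(4)
    # With lam = 5 per minute, compare both sides for the third arrival at t = 0.5 minutes.
    print(prob_fewer_than_k(3, 5.0, 0.5), empirical_tail(3, 5.0, 0.5, 20000, rng))
```

For k = 1 the first function reduces to exp(−λt), the exponential tail derived above.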

Applications

The classic example of a phenomenon well modelled by a Poisson process is deaths due to horse kicks in the Prussian army, as shown by Ladislaus Bortkiewicz in 1898.[1][2] The following examples are also well modelled by the Poisson process:

  • Requests for telephone calls at a switchboard.
  • Goals scored in a soccer match.[3]
  • Requests for individual documents on a web server.[4]
  • Particle emissions due to radioactive decay of an unstable substance. In this case the Poisson process is non-homogeneous in a predictable manner: the emission rate declines as particles are emitted.
  • L. F. Richardson showed that the outbreak of war followed a Poisson process from 1820 to 1950.[5]
  • Photons landing on a photodiode, in particular in low-light environments. This phenomenon is related to shot noise.

In queueing theory, the times of customer/job arrivals at queues are often assumed to be a Poisson process.

Occurrence

The Palm–Khintchine theorem shows that the superposition of many low-intensity non-Poisson point processes will be close to a Poisson process.

Notes

  1. The word event used here is not an instance of the concept of event as frequently used in probability theory.

References

  1. Ladislaus von Bortkiewicz, Das Gesetz der kleinen Zahlen [The law of small numbers] (Leipzig, Germany: B.G. Teubner, 1898). On page 1, Bortkiewicz presents the Poisson distribution. On pages 23-25, Bortkiewicz presents his famous analysis of "4. Beispiel: Die durch Schlag eines Pferdes im preussischen Heere Getöteten." (4. Example: Those killed in the Prussian army by a horse's kick.).
  2. Template:Cite book
  3. doi:10.1209/0295-5075/89/38007
  4. Cite error: Invalid <ref> tag; no text was provided for refs named ArlittMartin
  5. doi:10.1511/2002.1.10
