Right-Continuous Function

Topics from the Theory of Characteristic Functions

George Roussas , in An Introduction to Measure-Theoretic Probability (Second Edition), 2014

11.1 Definition of the Characteristic Function of a Distribution and Basic Properties

In all that follows, d.f.s are nonnegative, nondecreasing, right-continuous functions with finite variations; it is not assumed that the variations are necessarily bounded by 1 unless otherwise stated (see also Exercises 4 and 5 in Chapter 8).

Definition 1

The characteristic function f of a d.f. F (in the sense of Definition 1 in Chapter 8; see also Remark 5 there) is, in general, a complex-valued function defined on R by

(11.1) f(t) = ∫_ℝ e^{itx} dF(x) = ∫_ℝ cos(tx) dF(x) + i ∫_ℝ sin(tx) dF(x).

The integration in (11.1) is to be understood either in the sense of Riemann–Stieltjes, or as integration with respect to the measure induced by F (see also Appendix B). The integral is well defined for all t ∈ ℝ, since cos(tx) and sin(tx) are F-integrable. If F is the d.f. of a r.v. X, then (11.1) may be rewritten as

(11.2) f_X(t) = E e^{itX} = E cos(tX) + i E sin(tX).
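Definition (11.2) can be checked numerically. The sketch below (an illustration not in the original text) estimates E cos(tX) + i E sin(tX) by sample means for X ~ N(0, 1), whose characteristic function is known to be e^{−t²/2}:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal(100_000)    # samples of X ~ N(0, 1)

def chf(t, X):
    # f_X(t) = E cos(tX) + i E sin(tX), estimated by sample means (Eq. 11.2)
    return np.mean(np.cos(t * X)) + 1j * np.mean(np.sin(t * X))

t = 1.0
approx = chf(t, X)
exact = np.exp(-t**2 / 2)           # the known ch.f. of N(0, 1)
print(abs(approx - exact))          # small Monte Carlo error
```

Any distribution with a known closed-form ch.f. can be substituted for the normal here; only the comparison line changes.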

Some basic properties of a ch.f. are gathered next in the form of a theorem.

Theorem 1

(i)

|f(t)| ≤ Var F, t ∈ ℝ, and f(0) = Var F. In particular, if f(0) = 1 and 0 ≤ F(x) ≤ 1, then f is the ch.f. of a r.v.

(ii)

f is uniformly continuous on ℝ.

(iii)

If f is the ch.f. of a r.v. X, then f_{αX+β}(t) = e^{iβt} f_X(αt), t ∈ ℝ, where α and β are constants.

(iv)

If f is the ch.f. of a r.v. X, then f_{−X}(t) = f̄_X(t), t ∈ ℝ, where, for z = x + iy (x, y ∈ ℝ), z̄ = x − iy.

(v)

If for some positive integer n the nth moment E X^n is finite, then (d^n/dt^n) f_X(t)|_{t=0} = i^n E X^n.

Remark 1

In the proof of the theorems, as well as in other cases, the following property is used:

|∫_ℝ [g(x) + ih(x)] dμ| ≤ ∫_ℝ |g(x) + ih(x)| dμ = ∫_ℝ [g²(x) + h²(x)]^{1/2} dμ,

where g and h are real-valued functions, and ∫_ℝ [g(x) + ih(x)] dμ = ∫_ℝ g(x) dμ + i ∫_ℝ h(x) dμ. Its justification is left as an exercise (see Exercise 1).

Proof of Theorem 1

For convenience omit ℝ in the integration. Then

(i)

|f(t)| = |∫ e^{itx} dF(x)| ≤ ∫ |e^{itx}| dF(x) = Var F, and f(0) = ∫ dF(x) = Var F. If f(0) = 1, then Var F = 1, which together with 0 ≤ F(x) ≤ 1, x ∈ ℝ, implies F(−∞) = 0, F(∞) = 1, so that F is the d.f. of a r.v.

(ii)

|f(t + h) − f(t)| = |∫ e^{i(t+h)x} dF(x) − ∫ e^{itx} dF(x)| = |∫ [e^{i(t+h)x} − e^{itx}] dF(x)| = |∫ e^{itx}(e^{ihx} − 1) dF(x)| ≤ ∫ |e^{itx}| |e^{ihx} − 1| dF(x) = ∫ |e^{ihx} − 1| dF(x). Now |e^{ihx} − 1| ≤ 2, which is independent of h and F-integrable. Furthermore, |e^{ihx} − 1| → 0 as h → 0. Therefore the Dominated Convergence Theorem applies and gives

∫ |e^{ihx} − 1| dF(x) → 0 as h → 0.

So |f(t + h) − f(t)| is bounded by a quantity that is independent of t and tends to 0 as h → 0. This establishes the uniform continuity of f.
(iii)

f_{αX+β}(t) = E e^{it(αX+β)} = E[e^{iβt} e^{i(αt)X}] = e^{iβt} E[e^{i(αt)X}] = e^{iβt} f_X(αt).

(iv)

f_{−X}(t) = E e^{it(−X)} = E e^{i(−t)X} = E[cos(−tX) + i sin(−tX)] = E[cos(tX) − i sin(tX)] = E cos(tX) − i E sin(tX), which is the complex conjugate of E cos(tX) + i E sin(tX) = f_X(t); that is, f_{−X}(t) = f̄_X(t).

(v)

Consider, e.g., the interval [−r, r] for some r > 0. Then, for t ∈ [−r, r], (∂/∂t) e^{itX} = iX e^{itX} exists, and |(∂/∂t) e^{itX}| ≤ |X|, independent of t and integrable. Then, by Theorem 5 in Chapter 5,

(d/dt) f(t) = (d/dt) ∫ e^{itx} dF(x) = ∫ (∂/∂t) e^{itx} dF(x) = i ∫ x e^{itx} dF(x),

and, in particular, (d/dt) f(t)|_{t=0} = i ∫ (x e^{itx})|_{t=0} dF(x) = i ∫ x dF(x) = i E X.

The same applies for any k, 1 ≤ k ≤ n, since (d^k/dt^k) e^{itx} = i^k x^k e^{itx} exists, and |(d^k/dt^k) e^{itx}| ≤ |x|^k, independent of t and integrable. In particular, (d^k/dt^k) f(t)|_{t=0} = i^k ∫ x^k dF(x) = i^k E X^k.
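Property (v) can be illustrated numerically (an illustration not in the original text). Taking the known ch.f. f(t) = e^{−t²/2} of a standard normal r.v., a central-difference approximation of f″(0) should return i² E X² = −E X² = −1:

```python
import math

def f(t):
    # ch.f. of X ~ N(0, 1): f(t) = exp(-t^2 / 2)
    return math.exp(-t**2 / 2)

# second-order central difference approximating f''(0)
h = 1e-3
second_deriv_at_0 = (f(h) - 2 * f(0.0) + f(-h)) / h**2

# Theorem 1(v): f''(0) = i^2 E X^2 = -E X^2 = -1 for N(0, 1)
print(second_deriv_at_0)   # ≈ -1
```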

URL: https://www.sciencedirect.com/science/article/pii/B9780128000427000116

Methods and Models in Neurophysics

Emery N. Brown , in Les Houches, 2005

3. The conditional intensity function and interevent time probability density

Neural spike trains are characterized by their interspike interval probability models. In Section 2, we showed how elementary interspike interval probability models can be derived from elementary stochastic dynamical systems models of neurons. By viewing the neural spike train as a point process, we can present a characterization of the spike train in terms of its conditional intensity function. We develop this characterization in this section and relate the conditional intensity function to the interspike interval probability models of Section 2. The presentation here follows closely [5].

Let (0, T] denote the observation interval and let 0 < u_1 < u_2 < … < u_{J−1} < u_J ≤ T be a set of J spike time measurements. For t ∈ (0, T] let N_{0:t} be the sample path of the point process over (0, t]. It is defined as the event N_{0:t} = {0 < u_1 < u_2 < … < u_j ≤ t ∩ N(t) = j}, where N(t) is the number of spikes in (0, t] and j ≤ J. The sample path is a right-continuous function that jumps 1 at the spike times and is constant otherwise [1, 5–8]. The function N_{0:t} tracks the location and number of spikes in (0, t] and hence contains all the information in the sequence of spike times (Fig. 4A). The counting process N(t) gives the total number of events that have occurred up through time t. The counting process satisfies

Fig. 4. A. The construction of the sample path N_{0:t} from the spike times u_1, u_2, u_3, u_4. At time t, N_{0:t} = {u_1, u_2, u_3, u_4 ∩ N(t) = 4}. B. The discretization of the time axis allows us to evaluate the probability of each spike occurrence or non-occurrence as a local Bernoulli process. By Eq. 3.3, the probability of the event u_2, i.e. a 1 between t_{k−1} and t_k, is λ(t_k|H_k)Δ, whereas the probability of the immediately preceding non-event, a 0 between t_{k−2} and t_{k−1}, is 1 − λ(t_{k−1}|H_{k−1})Δ. In this plot, we have taken Δ_k = Δ for all k = 1,…, K (reprinted and used with permission of CRC Press).

i)

N(t) ≥ 0.

ii)

N(t) is an integer-valued function.

iii)

If s < t, then N(s) ≤ N(t).

iv)

For s < t, N(t) – N(s) is the number of events in (s, t).

We define the conditional intensity function for t ∈ (0, T] as

(3.1) λ(t|H_t) = lim_{Δ→0} Pr(N(t + Δ) − N(t) = 1 | H_t)/Δ,

where H_t is the history of the sample path and of any covariates up to time t. In general λ(t|H_t) depends on the history of the spike train and therefore is also termed the stochastic intensity. In survival analysis the conditional intensity function is called the hazard function [9, 10]. This is because the hazard function can be used to define the probability of an event in the interval [t, t + Δ) given that there has not been an event up to t. For example, it might represent the probability that a piece of equipment fails in [t, t + Δ) given that it has worked up to time t [9]. As another example, it might define the probability that a patient receiving a new therapy dies in the interval [t, t + Δ) given that he/she has survived up to time t [10]. It follows that λ(t|H_t) can be defined in terms of the interspike interval probability density at time t, p(t|H_t), as

(3.2) λ(t|H_t) = p(t|H_t) / [1 − ∫_0^t p(u|H_t) du].

We gain insight into the definition of the conditional intensity function in Eq. 3.1 by considering the following heuristic derivation of Eq. 3.2 based on the definition of the hazard function. We compute explicitly the probability of the event, a spike in [t, t + Δ) given Ht and that there has been no spike in (0, t). That is,

(3.3) Pr(u ∈ [t, t + Δ) | u > t, H_t) = Pr(u ∈ [t, t + Δ) ∩ u > t | H_t) / Pr(u > t | H_t) = Pr(u ∈ [t, t + Δ) | H_t) / Pr(u > t | H_t) = ∫_t^{t+Δ} p(u|H_u) du / [1 − ∫_0^t p(u|H_u) du] = p(t|H_t)Δ / [1 − ∫_0^t p(u|H_u) du] + o(Δ) = λ(t|H_t)Δ + o(Δ),

where o(Δ) refers to all events of order smaller than Δ, such as two or more spikes occurring in an arbitrarily small interval. This establishes Eq. 3.2.

The power of the conditional intensity function is that if it can be defined, as Eq. 3.3 suggests, then it completely characterizes the stochastic structure of the spike train. In any time interval [t, t + Δ), λ(t|H_t)Δ defines the probability of a spike given the history up to time t. If the spike train is an inhomogeneous Poisson process, then λ(t|H_t) = λ(t) becomes the Poisson rate function. Thus, the conditional intensity function (Eq. 3.1) is a history-dependent rate function that generalizes the definition of the Poisson rate function. Similarly, Eq. 3.1 is also a generalization of the hazard function for renewal processes [9, 10].
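The local Bernoulli interpretation in Eq. 3.3 also suggests a direct simulation scheme: partition (0, T] into bins of width Δ and draw a spike in each bin with probability λ(t|H_t)Δ. A minimal sketch (not from the original text) for the history-free inhomogeneous Poisson case, with an illustrative rate function λ(t) = 5 + 4 sin(2πt):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_spikes(rate_fn, T=100.0, dt=1e-3):
    # local Bernoulli discretization (Eq. 3.3): in each bin of width dt,
    # a spike occurs with probability lambda(t) * dt
    t = np.arange(0.0, T, dt)
    lam = rate_fn(t)
    spikes = rng.random(t.size) < lam * dt
    return t[spikes]

# history-free case: inhomogeneous Poisson with rate 5 + 4 sin(2 pi t)
spike_times = simulate_spikes(lambda t: 5.0 + 4.0 * np.sin(2 * np.pi * t))
print(spike_times.size / 100.0)   # empirical rate ≈ the mean rate, 5 spikes/s
```

For a history-dependent λ(t|H_t) the bins must be processed sequentially, since the rate in each bin depends on the spikes already drawn.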

Example 3.1. Conditional intensity function of the Gamma probability density.

The gamma probability density for the integrate and fire model in Eq. 2.4 is

(3.4) p_k(t) = λ e^{−λt} (λt)^{k−1} / Γ(k).

From Eq. 3.2, it follows that the conditional intensity function is

(3.5) λ(t) = [λ e^{−λt} (λt)^{k−1} / Γ(k)] / [1 − ∫_0^t p_k(u) du].
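Eq. 3.5 can be evaluated numerically by approximating the integral in the denominator with a trapezoidal sum. A sketch (with illustrative values of λ and k, not taken from the text); for k = 1 the gamma density reduces to the exponential and the conditional intensity is the constant λ:

```python
import numpy as np
from math import gamma

def p_gamma(t, lam=10.0, k=3):
    # gamma interspike-interval density, Eq. (3.4)
    return lam * np.exp(-lam * t) * (lam * t)**(k - 1) / gamma(k)

def cond_intensity(t, lam=10.0, k=3, n=10_001):
    # Eq. (3.5): lambda(t) = p_k(t) / (1 - int_0^t p_k(u) du),
    # with the integral approximated by a trapezoidal sum
    u = np.linspace(0.0, t, n)
    vals = p_gamma(u, lam, k)
    cdf = float(np.sum((vals[1:] + vals[:-1]) * np.diff(u) / 2.0))
    return p_gamma(t, lam, k) / (1.0 - cdf)

# for k = 1 the density is exponential, so the intensity is constant at lam
print(cond_intensity(0.2, lam=10.0, k=1))   # ≈ 10
```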

Example 3.2. Conditional intensity function of the inverse Gaussian probability density.

The inverse Gaussian probability density for the Wiener process integrate and fire model in Eq. 2.20 is

(3.6) p(t|μ, ρ) = (ρ/(2πt³))^{1/2} exp{−ρ(t − μ)²/(2μ²t)}.

From Eq. 3.2, the conditional intensity function for this model is

(3.7) λ(t|μ, ρ) = (ρ/(2πt³))^{1/2} exp{−ρ(t − μ)²/(2μ²t)} / [1 − ∫_0^t (ρ/(2πu³))^{1/2} exp{−ρ(u − μ)²/(2μ²u)} du].

URL: https://www.sciencedirect.com/science/article/pii/S0924809905800204

Stochastic Dynamics

Don Kulasiri , Wynand Verwoerd , in North-Holland Series in Applied Mathematics and Mechanics, 2002

2.3 What is Stochastic Calculus?

In standard calculus we deal with differentiable functions, which are continuous except perhaps at certain locations of the domain under consideration. To understand the continuity of functions better we make use of the definitions of limits. We call a function f continuous at the point t = t_0 if

lim_{t→t_0} f(t) = f(t_0)

regardless of the direction from which t approaches t_0. A right-continuous function at t_0 has a limiting value only when t approaches t_0 from the right, i.e. t is larger than t_0 in the vicinity of t_0. We will denote this as

f(t_0+) = lim_{t→t_0+} f(t) = f(t_0).

Similarly a left-continuous function at t0 can be represented as

f(t_0−) = lim_{t→t_0−} f(t) = f(t_0).

These statements imply that a continuous function is both right-continuous and left-continuous at a given point t. Often we encounter functions having discontinuities; hence the need for the above definitions. To measure the size of a discontinuity, we define a "jump" at a point t to be a discontinuity where both f(t+) and f(t−) exist, the size of the jump being ∆f(t) = f(t+) − f(t−). Jumps are discontinuities of the first kind, and any other discontinuity is called a discontinuity of the second kind. Obviously a function can have only a countable number of jumps in a given range. From the mean value theorem in calculus it can be shown that we can differentiate a function in a given interval only if the function is either continuous or has a discontinuity of the second kind in the interval. Stochastic calculus is the calculus dealing with often non-differentiable functions having jumps but no discontinuities of the second kind. One such example is the Wiener process (Brownian motion). One realization of the standard Wiener process is given in Figure 2.1.

Figure 2.1. An example of a function dealt with in stochastic calculus. This function is continuous but not differentiable at any point.

Without going into details of how we computed this function (we will do that in Chapter 3), we can see that the increments are irregular and we cannot define a derivative according to the mean value theorem. This is because the function changes erratically within small intervals, however small the interval may be, and we cannot define a derivative at a given point in the conventional sense. Therefore we have to devise new mathematical tools that are useful in dealing with these irregular, non-differentiable functions.
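The one-sided limits defined above can be probed numerically for a simple step function. In this sketch (not part of the original text) the small ε is a crude stand-in for the limiting process:

```python
def right_limit(f, t0, eps=1e-12):
    # crude numerical stand-in for lim_{t -> t0+} f(t)
    return f(t0 + eps)

def left_limit(f, t0, eps=1e-12):
    # crude numerical stand-in for lim_{t -> t0-} f(t)
    return f(t0 - eps)

# a right-continuous step function with a single jump at t = 1
def step(t):
    return 0.0 if t < 1.0 else 1.0

# jump of the first kind: both one-sided limits exist
jump = right_limit(step, 1.0) - left_limit(step, 1.0)   # Δf(1) = f(1+) - f(1-)
print(jump, step(1.0) == right_limit(step, 1.0))        # size-1 jump; right-continuous
```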

URL: https://www.sciencedirect.com/science/article/pii/S0167593102800031

Preliminaries

Jaroslav Hájek , ... Pranab K. Sen , in Theory of Rank Tests (Second Edition), 1999

2.3.4 The Neyman-Pearson lemma.

This basic lemma shows that the most powerful test for testing a simple hypothesis against a simple alternative may be found quite easily.

Lemma 1

In testing p against q at level α the most powerful test may be found as follows:

(1) Ψ 0 ( x ) = { 1 , i f q ( x ) > k p ( x ) , 0 , i f q ( x ) < k p ( x ) ,

where k and Ψ0 (x) for x such that q(x) = k p(x) should and can be defined so that

(2) ∫ Ψ_0 dP = α.

Proof.

Observe that α(c) = P(q(X) > cp(X )) is a non-increasing and right-continuous function of c such that α(0 − 0) = 1 and α(∞) = 0. Therefore for each α ∈ (0,1) there exists a k ≥ 0 such that

(3) α(k − 0) ≥ α ≥ α(k).

If k is a continuity point, then (2) follows from (1) regardless of the choice of Ψ0(x) for x such that q(x) = kp(x). If k is a point of discontinuity, it suffices to put

(4) Ψ_0(x) = [α − α(k)] / [α(k − 0) − α(k)]

for x such that q(x) = kp(x).

Now for any other critical function Ψ, 0 ≤ Ψ ≤ 1 and (1) imply that either sign(Ψ_0 − Ψ) = sign(q − kp) or at least one of these expressions equals 0, so that for all x

(5) [Ψ_0(x) − Ψ(x)][q(x) − k p(x)] ≥ 0.

Consequently, if ∫ Ψ dP ≤ α,

(6) ∫ Ψ_0 dQ − ∫ Ψ dQ ≥ k ∫ (Ψ_0 − Ψ) dP = k(α − ∫ Ψ dP) ≥ 0,

which was to be proved.
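As a concrete illustration of Lemma 1 (not part of the original text), take p = N(0, 1) and q = N(1, 1). Then q(x)/p(x) = e^{x − 1/2} is increasing in x, so the most powerful level-α test rejects for x > c with P_p(X > c) = α; since α(c) is continuous here, no randomization is needed. A Monte Carlo check of the size:

```python
import math
import random

random.seed(0)

def p(x):
    # null density: N(0, 1)
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def q(x):
    # alternative density: N(1, 1)
    return math.exp(-(x - 1)**2 / 2) / math.sqrt(2 * math.pi)

alpha = 0.05
c = 1.6448536269514722          # upper 5% point of N(0, 1)
k = math.exp(c - 0.5)           # q(x) > k p(x)  <=>  x > c

# estimate the size of the test Psi_0 under the null p
samples = [random.gauss(0.0, 1.0) for _ in range(200_000)]
size = sum(q(x) > k * p(x) for x in samples) / len(samples)
print(size)   # ≈ 0.05: the test attains level alpha
```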

Remark 1.

We also see that Ψ has the same power as Ψ0 only if (5) equals 0 a.s., i.e. if Ψ satisfies (1) a.s. with respect to Lebesgue measure.

Remark 2.

We have made no use of the fact that the measure, with respect to which the densities are defined, is the Lebesgue measure. And as a matter of fact the Neyman-Pearson lemma holds for densities defined with respect to any σ-finite measure.

URL: https://www.sciencedirect.com/science/article/pii/B9780126423501500205

Recent Progress in Functional Analysis

Bertram M. Schreiber , in North-Holland Mathematics Studies, 2001

1 Introduction

The notion of a stochastic process which is continuous in probability (stochastically continuous in [11]) arises in numerous contexts in probability theory [2,4,5,11,15]. Indeed, the Poisson process is continuous in probability, and this notion plays a role in the study of its generalizations and, from a broader point of view, in the theory of processes with independent increments [11]. For instance, the work of X. Fernique [9] on random right-continuous functions with left-hand limits (so-called càdlàg functions) involves continuity in probability in an essential way.

The study of processes continuous in probability as a generalization of the notion of a continuous function began with the approximation theorems of K. Fan [7] (cf. [5], Theorems VI.III.III and VI.III.IV) and D. Dugué ([5], Theorem VI.III.V) on the unit interval. These results were generalized to convex domains in higher dimensions in [12], where the problem of describing all compact sets in the complex plane on which every random function continuous in probability can be uniformly approximated in probability by random polynomials was raised. This problem, as well as the corresponding question for rational approximation, was taken up in [1]. Along with some stimulating examples, the authors of [1] prove, under the natural assumptions appearing below, that random polynomial approximation holds over Jordan curves and the closures of Jordan domains.

In this note we study the space of functions continuous in probability over a general topological space and develop the analogue of the space C(K) for K compact. This space has the structure of a Fréchet algebra. We investigate the closed ideals of this algebra and then introduce the notion of a stochastic uniform algebra.

Just as in the deterministic, classical case, there are natural stochastic uniform algebras defined by the appropriate concept of random approximation. We shall highlight some results from [3] which show that random polynomial approximation in the plane obtains for a very large class of compact sets. For instance, if K is a compact set with the property that every continuous function on ∂K can be uniformly approximated by rational functions, then every function continuous in probability on K (with respect to a nonatomic measure) and random holomorphic on the interior of K can be uniformly approximated in probability by random polynomials.

URL: https://www.sciencedirect.com/science/article/pii/S0304020801800554

Preliminaries

Yuriy E. Obzherin , Elena G. Boyko , in Semi-Markov Models, 2015

1.3 Preliminaries on semi-Markov processes with arbitrary phase space of states

We present necessary results from the theory of semi-Markov processes (SMPs) with arbitrary phase space of states [14–16].

Definition [16]. Semi-Markov kernel (SM-kernel) in a measurable space (E, B ) is the function Q ( t , x , B ) , satisfying the conditions:

(1)

Q(t, x, B) is a nondecreasing right-continuous function of t ≥ 0, with Q(0, x, B) = 0, x ∈ E, B ∈ B;

(2)

with t > 0 fixed, Q(t, x, B) is a semistochastic kernel: Q(t, x, B) ≤ 1;

(3)

Q(+∞, x, B) is a stochastic kernel in x, B; that is, Q(+∞, x, E) = 1.

An SMP with arbitrary phase space of states is defined by means of a Markov renewal process (MRP).

Definition [16]. An MRP is a two-dimensional Markov chain {ξ_n, θ_n; n ≥ 0} taking values in E × [0, ∞). Its transition probabilities are given by the expression:

P(ξ_{n+1} ∈ B, θ_{n+1} ≤ t / ξ_n = x) = Q(t, x, B),

where Q ( t , x , B ) is an SM-kernel in (E, B ).

The first component {ξ_n; n ≥ 0} of the MRP {ξ_n, θ_n; n ≥ 0} is a Markov chain. Its transition probabilities are defined by means of the SM-kernel Q(t, x, B):

P(x, B) = P(ξ_{n+1} ∈ B / ξ_n = x) = Q(+∞, x, B).

It is called the embedded Markov chain (EMC) of the MRP {ξ_n, θ_n; n ≥ 0}. The RVs θ_n, n ≥ 0, making up the second component of the MRP {ξ_n, θ_n; n ≥ 0}, determine the intervals between the moments τ_n of Markov restoration:

τ_n = Σ_{k=1}^n θ_k, n ≥ 1; τ_0 = 0.

Consider the counting process ν(t) = sup{n : τ_n ≤ t}, which counts the number of Markov renewal moments in [0, t].

Definition [16]. The process ξ(t) = ξ_{ν(t)} is the SMP corresponding to the MRP {ξ_n, θ_n; n ≥ 0}.

It can be concluded from the definition that an SMP is a jump right-continuous process: ξ(t + 0) = ξ(t).

Another way of defining an SMP is the following [16]:

(1)

stochastic kernel

P(x, B) = P(ξ_{n+1} ∈ B / ξ_n = x), x ∈ E, B ∈ B,

(2)

DF of sojourn times of EMC ξ n ; n 0 transitions

G(t, x, y) = G_{xy}(t) = P(θ_{n+1} ≤ t / ξ_n = x, ξ_{n+1} = y)

are defined.

Then SM-kernel Q ( t , x , B ) is defined by the formula [16]:

(1.14) Q(t, x, B) = ∫_B G(t, x, y) P(x, dy).

Let us write out definitions and formulas of some reliability and efficiency characteristics of restorable systems described by means of SMP.

Let a system S be described by SMP ξ ( t ) with a phase space (E, B ). Assume the set of SMP ξ ( t ) states can be represented as

E = E_+ ∪ E_−, E_+ ∩ E_− = ∅, E_+ ∈ B, E_− ∈ B,

where E_+ and E_− are interpreted as the sets of system S up- and down-states, respectively.

Definition [16]. The stationary availability factor K_a of system S is the number given by

K_a = lim_{t→+∞} P(ξ(t) ∈ E_+ / ξ(0) = x),

under the assumption that the limit exists and does not depend on the initial state x ∈ E.

The following stationary reliability characteristics of restorable systems are often in use. Their formal definition is given in [16]:

(a)

mean stationary operating time to failure T_+,

(b)

mean stationary restoration time T_−.

The EMC {ξ_n; n ≥ 0} stationary distribution ρ(B) satisfies the integral equation:

(1.15) ρ(B) = ∫_E ρ(dx) P(x, B), B ∈ B.

It was proved in [16] that if the unique stationary distribution of the EMC {ξ_n; n ≥ 0} of the SMP ξ(t) describing the system S operation exists, the characteristics K_a, T_+, T_− are given by the formulas:

(1.16) K_a = ∫_{E_+} m(x) ρ(dx) / ∫_E m(x) ρ(dx),

(1.17) T_+ = ∫_{E_+} m(x) ρ(dx) / ∫_{E_+} P(x, E_−) ρ(dx),

(1.18) T_− = ∫_{E_−} m(x) ρ(dx) / ∫_{E_+} P(x, E_−) ρ(dx),

under some assumptions.

Here ρ(dx) denotes the EMC {ξ_n; n ≥ 0} stationary distribution, and m(x) is the mean sojourn time in state x ∈ E. One should note that the characteristics K_a, T_+, T_− are related as follows:

(1.19) K_a = T_+ / (T_+ + T_−).
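Relation (1.19) can be checked by simulating the simplest case, a two-state (up/down) alternating process with exponential sojourn times. The mean sojourn times 9 and 1 below are illustrative and not from the text:

```python
import random

random.seed(2)

# two-state semi-Markov model: up-state mean sojourn 9, down-state mean 1
m_up, m_down = 9.0, 1.0

t, up_time = 0.0, 0.0
total = 1_000_000.0
state_up = True
while t < total:
    mean = m_up if state_up else m_down
    s = random.expovariate(1.0 / mean)       # sojourn time in the current state
    if state_up:
        up_time += min(s, total - t)         # clip the final sojourn at `total`
    t += s
    state_up = not state_up

K_a = up_time / total
print(K_a)   # ≈ m_up / (m_up + m_down) = 0.9, matching (1.19)
```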

The Markov renewal equation [16] plays an important role in the theory of SMP. It is as follows:

(1.20) u(x, t) = g(x, t) + ∫_0^t ∫_E Q(ds, x, dy) u(y, t − s), x ∈ E, t ≥ 0.

Markov renewal equations for some SMP characteristics are given in [16]. The Markov renewal equation for the distribution R̄_x(t) of the sojourn time of the SMP ξ(t) in a certain subset E_0 of states is often applied [16]:

(1.21) R̄_x(t) = F̄_x(t) + ∫_0^t ∫_{E_0} Q(ds, x, dy) R̄_y(t − s),

its consequence is the equation for the mean sojourn time in the subset E_0 [15]:

U(x) = ∫_{E_0} P(x, dy) U(y) + m(x),

where m ( x ) is the SMP ξ ( t ) mean sojourn time in x .

Stationary efficiency characteristics of system operation are: S, the mean specific income per calendar time unit, and C, the mean specific expenses per time unit of up-state. In terms of the SM model, these characteristics are given by the ratios [18,26]:

(1.22) S = ∫_E m(x) f_s(x) ρ(dx) / ∫_E m(x) ρ(dx),

(1.23) C = ∫_E m(x) f_c(x) ρ(dx) / ∫_{E_+} m(x) ρ(dx),

where f s ( x ) , f c ( x ) are functions denoting income and expenses in each state.

In the monograph, the following method of approximation of system stationary reliability characteristics, introduced in [14], is applied.

Let the initial system S operation be described by an SMP ξ(t) with a phase space (E, B). The set E of states is divided into two subsets E_+ and E_−, so that E = E_+ ∪ E_−, E_+ ∩ E_− = ∅. Assume the kernel P(x, B), B ∈ B, of the EMC {ξ_n; n ≥ 0} of the SMP ξ(t) is close to the kernel P^{(0)}(x, B), B ∈ B, of the EMC {ξ_n^{(0)}; n ≥ 0} of a supporting system S^{(0)} having unique stationary distribution ρ^{(0)}(B), B ∈ B.

Then instead of the expressions (1.17) and (1.18) we can use the following formulas [14]:

(1.24) T_+ ≈ ∫_{E_+} m(x) ρ^{(0)}(dx) / ∫_{E_+} P^{(r)}(x, E_−) ρ^{(0)}(dx), T_− ≈ ∫_{E_+} ρ^{(0)}(dx) ∫_{E_−} m(y) P^{(r)}(x, dy) / ∫_{E_+} P^{(r)}(x, E_−) ρ^{(0)}(dx),

approximating characteristics of the initial system S .

Here, ρ^{(0)}(dx) is the EMC {ξ_n^{(0)}; n ≥ 0} stationary distribution for the supporting system; m(x) is the mean sojourn time in the states of the initial system; P^{(r)}(x, E_−) is the probability of EMC {ξ_n; n ≥ 0} transition from up- into down-states along a minimal path for the initial system; and r is the minimum number of steps necessary for transition from the states of E_+ belonging to the ergodic class E^{(0)} of the initial system to the set of down-states E_−. Under r = 1, formula (1.24) takes the form:

(1.25) T_+ ≈ ∫_{E_+} m(x) ρ^{(0)}(dx) / ∫_{E_+} P(x, E_−) ρ^{(0)}(dx), T_− ≈ ∫_{E_+} ρ^{(0)}(dx) ∫_{E_−} m(y) P(x, dy) / ∫_{E_+} P(x, E_−) ρ^{(0)}(dx).

The kernel P(x, B) of the initial system EMC {ξ_n; n ≥ 0} is close to the kernel P^{(0)}(x, B) of the supporting system EMC {ξ_n^{(0)}; n ≥ 0}; that is why, under r = 1, along with the second formula in (1.25), the following approximating formula for T_− can be used:

(1.26) T_− ≈ ∫_{E_−} m(x) ρ^{(0)}(dx) / ∫_{E_+} P(x, E_−) ρ^{(0)}(dx).

To approximate system stationary efficiency characteristics, instead of (1.22) and (1.23) the following ratios will be used:

(1.27) S ≈ ∫_E m(x) f_s(x) ρ^{(0)}(dx) / ∫_E m(x) ρ^{(0)}(dx), C ≈ ∫_E m(x) f_c(x) ρ^{(0)}(dx) / ∫_{E_+} m(x) ρ^{(0)}(dx),

where ρ^{(0)}(dx) is the stationary distribution of the supporting system EMC {ξ_n^{(0)}; n ≥ 0}; m(x) is the mean sojourn time in the states of the initial system; and f_s(x), f_c(x) are the functions denoting income and expenses in each state of the initial system.

Semi-Markov models of latent failures control are built under the following assumptions:

(1)

From the point of view of reliability, a system component is a minimal compound element (detail), which can fail, be controlled, and be restored.

(2)

Component failure is detected only while control is executed.

(3)

After failure detection, the restoration process begins immediately.

(4)

A component is completely restored by the restoration process.

(5)

The DFs of the RVs (operating time to failure, time periods between the moments of control execution, control time, and restoration time) are arbitrary.

The stages of semi-Markov model construction and system stationary characteristics definition are represented in Figure 1.2.

Figure 1.2. General scheme of semi-Markov model building

URL: https://www.sciencedirect.com/science/article/pii/B9780128022122000012

Time Series Analysis: Methods and Applications

Kanchan Mukherjee , in Handbook of Statistics, 2012

6.1 M- and R-Estimators

Let τ = (a, b) denote a generic value in the parameter space Ω_1 × Ω_2 and let θ = (α, β) be the true parameter. To estimate α, we proceed in three steps. Using E{σ(Y_{i−1}, β) η_i} = 0 in (29), we first propose a preliminary estimator α̂_p; note that the proposal does not take into account the heteroscedasticity of the model, and hence it gives a consistent but inefficient estimator. Next, we use α̂_p to construct an estimator β̂ of the parameter β. Finally, substituting α̂_p and β̂ in (29), the heteroscedastic model is transformed to an approximate nonlinear homoscedastic autoregressive model (36), and we use standard robust estimation procedures for homoscedastic models to propose an improved estimator of α.

In the sequel, μ̇ and σ̇ denote the derivatives of the functions μ and σ, respectively, with respect to their second arguments. Also, for a vector y, its jth coordinate is denoted by y_j.

Step 1:

Define

ℳ(a) := n^{−1/2} Σ_{i=1}^n μ̇(Y_{i−1}, a) {X_i − μ(Y_{i−1}, a)}.

Since E[ℳ(α)] = 0, we define a preliminary estimator α̂_p of α by the relation

(34) α̂_p := argmin{Σ_{j=1}^{r_1} |ℳ_j(a)|; a ∈ Ω_1},

where ℳ_j(a) is the jth coordinate of the vector ℳ(a), 1 ≤ j ≤ r_1.

In particular, when μ(y, a) = y′a,

α̂_p = (Σ_{i=1}^n Y_{i−1} Y′_{i−1})^{−1} Σ_{i=1}^n X_i Y_{i−1}.
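In the scalar AR(1) case μ(y, a) = ya, this preliminary estimator is ordinary least squares. A sketch with simulated data (the value α = 0.5 and the homoscedastic noise are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(3)

# simulate an AR(1) model X_i = alpha * X_{i-1} + eta_i with alpha = 0.5
n, alpha_true = 20_000, 0.5
X = np.zeros(n)
for i in range(1, n):
    X[i] = alpha_true * X[i - 1] + rng.standard_normal()

# preliminary estimator for mu(y, a) = y a (scalar case):
# alpha_hat = (sum Y_{i-1}^2)^{-1} * sum X_i Y_{i-1}
Y = X[:-1]
alpha_hat = float(Y @ X[1:]) / float(Y @ Y)
print(alpha_hat)   # ≈ 0.5
```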

Step 2:

Let

η_i(τ) := {X_i − μ(Y_{i−1}, a)} / σ(Y_{i−1}, b), 1 ≤ i ≤ n,

denote the ith residual. Let κ be a nondecreasing right-continuous function on ℝ such that E{η_1 κ(η_1)} = 1. This is automatically satisfied, for example, when κ is the identity function (κ(x) ≡ x). Consider the statistic

M_s(τ) := n^{−1/2} Σ_{i=1}^n [σ̇(Y_{i−1}, b)/σ(Y_{i−1}, b)] {η_i(τ) κ(η_i(τ)) − 1}.

Since E [ M s ( α , β ) ] = 0 , an estimator of the scale parameter β is defined by the relation

β̂ := argmin{Σ_{j=1}^{r_2} |M_{sj}(α̂_p, b)|; b ∈ Ω_2}.

Note that (29) can be written as

(35) X_i / σ(Y_{i−1}, β) = μ(Y_{i−1}, α) / σ(Y_{i−1}, β) + η_i.

This in turn can be approximated by

(36) X_i / σ(Y_{i−1}, β̂) ≈ μ(Y_{i−1}, α) / σ(Y_{i−1}, β̂) + η_i,

which is a nonlinear autoregressive model with homoscedastic errors.

Now using the standard definition for homoscedastic nonlinear model (35), the class of M-estimators and R-estimators based on appropriate score functions ψ and φ, respectively, can be defined as follows; see the study by Bose and Mukherjee (2003) for a similar two-step idea.

Step 3:

Let ψ be a nondecreasing and bounded function on ℝ such that E{ψ(η_1)} = 0. An example is the function ψ(x) = sign(x) when the η_i's are symmetrically distributed around 0.

Let φ: [0, 1] → ℝ belong to the class

𝒞 = {φ; φ: [0, 1] → ℝ is right continuous, nondecreasing, with φ(1) − φ(0) = 1}.

An example of a function belonging to this class is φ(u) = u − 1/2; it is called the Wilcoxon rank score function. Define the M-statistic

M_ψ(τ) = n^{−1/2} Σ_{i=1}^n [μ̇(Y_{i−1}, a)/σ(Y_{i−1}, b)] ψ{η_i(τ)}.

Since E [ M ψ ( α , β ) ] = 0 , from (35), an M estimator of α corresponding to the score function ψ is defined as

α̂_M := argmin{Σ_{j=1}^{r_1} |M_{ψj}(a, β̂)|; a ∈ Ω_1}.

Define the rank statistic as

S_φ(τ) = n^{−1/2} Σ_{i=1}^n [μ̇(Y_{i−1}, a)/σ(Y_{i−1}, b) − n^{−1} Σ_{j=1}^n μ̇(Y_{j−1}, a)/σ(Y_{j−1}, b)] φ(R_{iτ}/(n + 1)), τ ∈ Ω,

where R_{iτ} = Σ_{j=1}^n I{η_j(τ) ≤ η_i(τ)} is the rank of η_i(τ) among {η_j(τ); 1 ≤ j ≤ n}. Hence E[S_φ(α, β)] = 0, and so a generalized R-estimator of α corresponding to the score function φ is defined as

α̂_R := argmin{Σ_{j=1}^{r_1} |S_{φj}(a, β̂)|; a ∈ Ω_1}.

URL: https://www.sciencedirect.com/science/article/pii/B9780444538581000065

Interpolation of Operators

In Pure and Applied Mathematics, 1988

Definition 1.5

Suppose f belongs to M 0(R, μ). The decreasing rearrangement of f is the function f* defined on [0, ∞) by

(1.9) f*(t) = inf{λ : μ_f(λ) ≤ t}, (t ≥ 0).

We use here the convention that inf Ø = ∞. Thus, if μf (λ) > t for all λ ≥ 0, then f*(t) = ∞. Also, if (R, μ) is a finite measure space, then the distribution function μf is bounded by μ(R) and so f*(t) = 0 for all t ≥ μ(R). In this case, we may regard f* as a function defined on the interval [0, μ(R)). Notice also that if μf happens to be continuous and strictly decreasing, then f* is simply the inverse of μ f on the appropriate interval. In fact, for general f, if we first form the distribution function μf and then form the distribution function mμf of μf (with respect to Lebesgue measure m on [0, ∞)) we obtain precisely the decreasing rearrangement f*. This is an immediate consequence of the identities

(1.10) f*(t) = sup{λ : μ_f(λ) > t} = m_{μ_f}(t), (t ≥ 0),

which follow from (1.9), the fact that μ f is decreasing, and the definition of the distribution function.

Examples 1.6

(a)

Now we compute the decreasing rearrangement of the simple function f given by (1.6). Referring to (1.9) and Figure 1, we see that f*(t) = 0 if t ≥ m_3. Also, if m_3 > t ≥ m_2, then f*(t) = a_3; if m_2 > t ≥ m_1, then f*(t) = a_2; and so on. Hence,

(1.11) f*(t) = Σ_{j=1}^n a_j χ_{[m_{j−1}, m_j)}(t), (t ≥ 0),

where we have taken m_0 = 0.
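The computation in (1.11) amounts to sorting the blocks of the simple function by height and laying them end to end from the origin. A sketch (with hypothetical heights a_j and measures μ(E_j), not taken from the text):

```python
import numpy as np

heights  = np.array([5.0, 3.0, 1.0, 4.0])   # hypothetical block values a_j
measures = np.array([0.5, 1.0, 2.0, 0.5])   # hypothetical measures mu(E_j)

# sort the blocks by height, largest first, and lay them end to end from 0
order = np.argsort(-heights)
a, m = heights[order], measures[order]
breaks = np.concatenate(([0.0], np.cumsum(m)))   # 0 = m_0 < m_1 < ... < m_n

def f_star(t):
    # decreasing rearrangement of the simple function, as in (1.11)
    for j in range(len(a)):
        if breaks[j] <= t < breaks[j + 1]:
            return a[j]
    return 0.0   # f*(t) = 0 for t beyond the total measure

print(f_star(0.25), f_star(0.75), f_star(3.0))   # 5.0 4.0 1.0
```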

Geometrically, we are merely rearranging the vertical blocks in the graph of f in decreasing order to obtain the decreasing rearrangement f* (see Figure 2); the values of f* at the jumps are determined by the right continuity (Proposition 1.7).

Figure 2. Graphs of f and f*.

(b)

It is sometimes more useful to section functions into horizontal blocks rather than vertical ones. Thus, the simple function f in (1.6) may be represented also as follows:

(1.12) f(x) = Σ_{k=1}^n b_k χ_{F_k}(x),

where the coefficients bk are positive and the sets Fk each have finite measure and form an increasing sequence F 1F 2 ⊂ … ⊂ Fn . Comparison with (1.6) shows that

b_k = a_k − a_{k+1}, F_k = ∪_{j=1}^k E_j, (k = 1, 2, …, n).

In this case, the decreasing rearrangement is viewed as being formed by sliding the blocks in each horizontal layer to form a single larger block positioned with its left-hand end against the vertical axis (see Figure 3). Thus

(1.13) f* = Σ_{k=1}^n b_k χ_{[0, μ(F_k))}.

Figure 3. Graphs of f and f*.

(c)

Let f(x) = 1 − e^{−x}, (0 < x < ∞). The distribution function m_f (with respect to Lebesgue measure m on (0, ∞)) is infinite for 0 ≤ λ < 1 and equal to zero for all λ ≥ 1. Hence f*(t) = 1 for all t ≥ 0 (cf. Figure 4). This example shows that a considerable amount of information may be lost in passing to the decreasing rearrangement. Such information, however, is irrelevant as far as L^p-norms (or any other rearrangement-invariant norms) are concerned. Thus, the L^p-norms of f and f* are both infinite when 1 ≤ p < ∞, and the L^∞-norms are both equal to 1.

Figure 4. Graphs of f(x) = 1 − e−x and f*(t).

Proposition 1.7

Suppose f, g, and fn , (n = 1, 2, …), belong to M 0(R, μ) and let a be any scalar. The decreasing rearrangement f* is a nonnegative, decreasing, right-continuous function on [0, ∞). Furthermore,

(1.14) | g | ≤ | f | μ-a.e. ⇒ g * ≤ f * ;

(1.15) ( a f ) * = | a | f * ;

(1.16) ( f + g ) * ( t 1 + t 2 ) ≤ f * ( t 1 ) + g * ( t 2 ) , ( t 1 , t 2 ≥ 0 ) ;

(1.17) | f | ≤ lim inf n → ∞ | f n | μ-a.e. ⇒ f * ≤ lim inf n → ∞ f n * ;

in particular,

| f n | ↑ | f | μ-a.e. ⇒ f n * ↑ f * ;

(1.18) f * ( μ f ( λ ) ) ≤ λ , ( μ f ( λ ) < ∞ ) ; μ f ( f * ( t ) ) ≤ t , ( f * ( t ) < ∞ ) ;

(1.19) f and f * are equimeasurable;

(1.20) ( | f | p ) * = ( f * ) p , ( 0 < p < ∞ ) .

Proof. That f* is nonnegative, decreasing, and right-continuous follows from Proposition 1.3 and the fact that f* is itself a distribution function (cf. (1.10)). The properties (1.14), (1.15), and (1.17) are immediate consequences of their counterparts in Proposition 1.3 and the definition of the decreasing rearrangement.

For property (1.18), fix λ ≥ 0 and suppose t = μf(λ) is finite. Then (1.9) gives

f * ( μ f ( λ ) ) = f * ( t ) = inf { λ ′ : μ f ( λ ′ ) ≤ t } ≤ λ

(the final inequality holds because λ itself satisfies μ f ( λ ) = t , so λ belongs to the set over which the infimum is taken),

which establishes the first part of (1.18). For the second part, fix t ≥ 0 and suppose λ = f*(t) is finite. By (1.9), there is a sequence λ n ↓ λ with μ f ( λ n ) ≤ t, so the right-continuity of μ f (Proposition 1.3) gives

μ f ( f * ( t ) ) = μ f ( λ ) = lim n → ∞ μ f ( λ n ) ≤ t .

This establishes (1.18).
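When μ is counting measure on a finite set, f* is realized simply by sorting |f| in decreasing order, and the two inequalities of (1.18) can be checked numerically. The sketch below is an illustrative special case of the general argument, with made-up sample values:

```python
# Finite check of (1.18) for counting measure: mu_f(lam) counts the
# points where |f| > lam, and f*(t) is the (floor(t)+1)-th largest
# value of |f| for t < n.  The sample values are illustrative.

vals = [0.5, 3.0, 1.0, 3.0, 2.0]
dec = sorted(vals, reverse=True)  # 3.0, 3.0, 2.0, 1.0, 0.5

def mu_f(lam):
    # distribution function mu{x : |f(x)| > lam}
    return sum(1 for v in vals if v > lam)

def f_star(t):
    # f*(t) = inf{lam >= 0 : mu_f(lam) <= t}
    i = int(t)
    return dec[i] if i < len(dec) else 0.0

for lam in (0.0, 0.5, 1.0, 2.5, 3.0):
    assert f_star(mu_f(lam)) <= lam   # first inequality of (1.18)
for t in range(6):
    assert mu_f(f_star(t)) <= t       # second inequality of (1.18)
print("both inequalities of (1.18) hold on this example")
```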

Returning to (1.16), we may assume that λ = f*(t 1) + g*(t 2) is finite since otherwise there is nothing to prove. Let t = μ f + g (λ). Then by the triangle inequality and the second of the inequalities in (1.18) we have

t = μ { x : | f ( x ) + g ( x ) | > f * ( t 1 ) + g * ( t 2 ) } ≤ μ { x : | f ( x ) | > f * ( t 1 ) } + μ { x : | g ( x ) | > g * ( t 2 ) } = μ f ( f * ( t 1 ) ) + μ g ( g * ( t 2 ) ) ≤ t 1 + t 2 .

This shows in particular that t is finite. Hence, using the first of the inequalities in (1.18) and the fact that (f + g)* is decreasing, we obtain

( f + g ) * ( t 1 + t 2 ) ≤ ( f + g ) * ( t ) = ( f + g ) * ( μ f + g ( λ ) ) ≤ λ = f * ( t 1 ) + g * ( t 2 ) ,

and this establishes (1.16).
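The subadditivity property (1.16) can likewise be sanity-checked in the finite, counting-measure setting, where each rearrangement is a sorted sequence. A sketch, with illustrative data:

```python
# Discrete check of (1.16): (f+g)*(t1+t2) <= f*(t1) + g*(t2).
# Counting measure on a finite index set; rearrangement = sorting.
# The sample sequences are illustrative choices.

import itertools

f = [1.0, 4.0, 2.0, 0.0, 3.0]
g = [2.0, 0.0, 5.0, 1.0, 1.0]
h = [abs(a + b) for a, b in zip(f, g)]   # |f + g| pointwise

def star(vals):
    dec = sorted((abs(v) for v in vals), reverse=True)
    return lambda t: dec[int(t)] if int(t) < len(dec) else 0.0

fs, gs, hs = star(f), star(g), star(h)
for t1, t2 in itertools.product(range(5), repeat=2):
    assert hs(t1 + t2) <= fs(t1) + gs(t2)
print("(1.16) verified on this example")
```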

For an arbitrary function f in M 0, we can find a sequence of nonnegative simple functions fn , (n = 1, 2, …), such that fn ↑ |f|. It is clear (cf. Example 1.6(a)) that for each n the functions fn and fn * are equimeasurable, that is,

(1.21) μ f n ( λ ) = m f n * ( λ ) , ( λ ≥ 0 ) .

But fn ↑ |f| and fn * ↑ f* (by (1.17)), so property (1.5), applied to each of the distribution functions in (1.21), shows that

(1.22) μ f ( λ ) = m f * ( λ ) , ( λ ≥ 0 ) .

Hence, f and f* are equimeasurable, as asserted by (1.19).

Finally, from (1.22) we have

μ | f | p ( λ ) = μ f ( λ 1 / p ) = m f * ( λ 1 / p ) = m ( f * ) p ( λ ) , ( λ ≥ 0 ) .

Passing to the decreasing rearrangements by means of (1.9), we obtain (1.20).

The next result gives alternative descriptions of the L p-norm in terms of the distribution function and the decreasing rearrangement.

Proposition 1.8

Let f ∈ M 0 . If 0 < p < ∞, then

(1.23) ∫ R | f | p d μ = p ∫ 0 ∞ λ p − 1 μ f ( λ ) d λ = ∫ 0 ∞ f * ( t ) p d t .

Furthermore, in the case p = ∞,

(1.24) ess sup x ∈ R | f ( x ) | = inf { λ : μ f ( λ ) = 0 } = f * ( 0 ) .

Proof. In view of (1.5), (1.17), and the monotone convergence theorem, it will suffice to prove (1.23) for an arbitrary nonnegative simple function f. With f written in the form (1.6), we saw that its decreasing rearrangement f* is given by (1.11). But then it is clear from (1.8) that

∫ | f | p d μ = ∑ j = 1 n a j p μ ( E j ) = ∑ j = 1 n a j p m ( [ m j − 1 , m j ) ) = ∫ 0 ∞ ( f * ) p d m .

Similarly, using the expressions (1.6) and (1.7) for f and its distribution function μ f , we have

p ∫ 0 ∞ λ p − 1 μ f ( λ ) d λ = p ∑ j = 1 n m j ∫ a j + 1 a j λ p − 1 d λ = ∑ j = 1 n ( a j p − a j + 1 p ) m j = ∑ j = 1 n a j p μ ( E j ) = ∫ | f | p d μ ,

where the third equality follows from (1.8) and a summation by parts.

This establishes (1.23). The proof of (1.24) is straightforward and we omit it.
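For a simple function, all three expressions in (1.23) reduce to finite sums, so the identity can be verified directly. The sketch below uses the same illustrative data as the earlier examples, with p = 2; the middle integral is evaluated exactly since μ f is a step function:

```python
# Check of (1.23) for a simple function with values a_j on sets E_j:
# int |f|^p dmu  =  p * int_0^inf lam^{p-1} mu_f(lam) dlam
#                =  int_0^inf f*(t)^p dt.
# mu_f is the step function mu_f(lam) = m_j for a_{j+1} <= lam < a_j.
# Sample data are illustrative, not from the text.

values   = [3.0, 2.0, 1.0]   # a_1 > a_2 > a_3 > 0
measures = [1.0, 2.0, 1.0]   # mu(E_j)
p = 2
n = len(values)

# left-hand side: sum a_j^p mu(E_j)
lhs = sum(a ** p * m for a, m in zip(values, measures))

# middle: p * int lam^{p-1} mu_f(lam) dlam = sum m_j (a_j^p - a_{j+1}^p)
a = values + [0.0]                                # a_{n+1} = 0
m = [sum(measures[:j + 1]) for j in range(n)]     # m_j
mid = sum(m[j] * (a[j] ** p - a[j + 1] ** p) for j in range(n))

# right-hand side: int (f*)^p dm = sum a_j^p (m_j - m_{j-1})
m0 = [0.0] + m
rhs = sum(values[j] ** p * (m0[j + 1] - m0[j]) for j in range(n))

print(lhs, mid, rhs)
# → 18.0 18.0 18.0
```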
