Estimation of different entropies via Abel–Gontscharoff Green functions and Fink’s identity using Jensen type functionals

Khuram Ali Khan (Department of Mathematics, University of Sargodha, Sargodha, Pakistan)
Tasadduq Niaz (Department of Mathematics, University of Sargodha, Sargodha, Pakistan) (Department of Mathematics, The University of Lahore, Sargodha-Campus, Sargodha, Pakistan)
Đilda Pečarić (Catholic University of Croatia, Zagreb, Croatia)
Josip Pečarić (RUDN University, Moscow, Russia)

Arab Journal of Mathematical Sciences

ISSN: 1319-5166

Article publication date: 31 December 2018

Issue publication date: 31 August 2020


Abstract

In this work, we estimate different entropies, such as the Shannon entropy and the Rényi and Csiszár divergences, by using Jensen-type functionals. The Zipf–Mandelbrot law and a hybrid Zipf–Mandelbrot law are used to estimate the Shannon entropy. Abel–Gontscharoff Green functions and Fink's identity are used to construct new inequalities, which are then generalized for m-convex functions.

Citation

Khan, K.A., Niaz, T., Pečarić, Đ. and Pečarić, J. (2020), "Estimation of different entropies via Abel–Gontscharoff Green functions and Fink’s identity using Jensen type functionals", Arab Journal of Mathematical Sciences, Vol. 26 No. 1/2, pp. 15-39. https://doi.org/10.1016/j.ajmsc.2018.12.002

Publisher: Emerald Publishing Limited

Copyright © 2019, Khuram Ali Khan, Tasadduq Niaz, Đilda Pečarić and Josip Pečarić

License

Published in Arab Journal of Mathematical Sciences. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) license. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this license may be seen at http://creativecommons.org/licences/by/4.0/legalcode


1. Introduction and preliminary results

In recent years many researchers have generalized various inequalities using identities involving Green functions. For example, in [24] Nasir et al. generalized the Popoviciu inequality using the Montgomery identity along with a new Green function, and in [25] Niaz et al. used Fink's identity along with new Abel–Gontscharoff-type Green functions for the ‘two-point right focal’ problem to generalize a refinement of the Jensen inequality.

Many ranked phenomena, such as the frequencies of the most commonly used words, the sizes of the largest cities of countries, and the incomes of billionaires, can be described by Zipf's law. An f-divergence expresses the distance between two probability distributions as an average value weighted by a specified function. The Csiszár f-divergence [11,12] is a general measure of this kind; a special case is the Kullback–Leibler divergence, used to quantify the discrepancy between probability distributions (see [20,21]). The notion of distance is stronger than that of divergence, since a distance satisfies symmetry and the triangle inequality. Probability theory has applications in many fields, and divergences between probability distributions find many applications there as well.

Many natural phenomena, such as the distribution of wealth and income in a society, the distribution of Facebook likes, and the distribution of football goals, follow a power-law distribution (Zipf's law). The distribution of city sizes follows a power-law distribution as well. Auerbach [3] first proposed that the distribution of city sizes can be well approximated by a Pareto (power-law) distribution. This idea was refined by many researchers, but Zipf [32] contributed most significantly to the field. The distribution of city sizes has been investigated by many scholars of urban economics, such as Rosen and Resnick [29], Black and Henderson [4], Ioannides and Overman [19], Soo [30], Anderson and Ge [2], and Bosker et al. [5]. Zipf's law states that the rank of a city with a certain number of inhabitants varies proportionally to the city size with some negative exponent that is close to unity. In other words, the product of city sizes and their ranks is roughly constant: the population of the second largest city is one half of that of the largest city, the third largest city has one third of the population of the largest city, and the population of the nth city is 1/n of the largest city's population. This is called the rank-size rule, also known as Zipf's law. Hence Zipf's law not only states that the city-size distribution follows a Pareto distribution, but also that the estimated value of the shape parameter is equal to unity.

In [18] L. Horváth et al. introduced some new functionals based on the f-divergence functional and obtained some estimates for them. They obtained bounds for the f-divergence and the Rényi divergence by applying a cyclic refinement of Jensen's inequality. They also constructed new inequalities for the Rényi and Shannon entropies and used the Zipf–Mandelbrot law to illustrate the results.

Inequalities involving higher-order convexity have been used by many physicists in higher-dimensional problems since higher-order convexity was founded by T. Popoviciu (see [27, p. 15]). It is quite an interesting fact that some results that are true for convex functions do not remain valid for higher-order convex functions.

In [27, p. 16], the following criterion is given to check the m-convexity of a function.

If f^(m) exists, then f is m-convex if and only if f^(m) ≥ 0.
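As a concrete illustration, m-convexity can also be probed numerically through m-th order divided differences, which are nonnegative exactly when f is m-convex. The sketch below (helper name ours) checks this for f(x) = x³, which is 3-convex on all of ℝ (f‴ = 6 ≥ 0) yet is not convex there:

```python
def divided_difference(f, xs):
    """m-th order divided difference of f over the m+1 distinct points xs."""
    if len(xs) == 1:
        return f(xs[0])
    return (divided_difference(f, xs[1:]) - divided_difference(f, xs[:-1])) \
        / (xs[-1] - xs[0])

cube = lambda x: x ** 3

# 3rd-order divided differences of x^3 equal the leading coefficient 1 >= 0
# (3-convexity), while a 2nd-order divided difference over negative points
# is negative (x^3 is not convex on all of R).
d3 = divided_difference(cube, [-2.0, -1.0, 0.0, 1.0])
d2 = divided_difference(cube, [-2.0, -1.0, 0.0])
```
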

In recent years many researchers have generalized inequalities for m-convex functions; for example, S. I. Butt et al. generalized the Popoviciu inequality for m-convex functions using Taylor's formula, the Lidstone polynomial, the Montgomery identity, Fink's identity, Abel–Gontscharoff interpolation and the Hermite interpolating polynomial (see [6–10]).

Jensen's inequality has attracted great interest for many years, and researchers have refined it by defining new functions (see [16,17]). In particular, L. Horváth and J. Pečarić [14,17] (see also [15, p. 26]) gave a refinement of Jensen's inequality for convex functions. To state this refinement, they introduced the following essential notions.

Let X be a set, and let:

P(X) := the power set of X,

|X| := the number of elements of X,

ℕ := the set of natural numbers including 0.

Consider q ≥ 1 and r ≥ 2 to be fixed integers. Define the functions

F_{r,S} : {1,…,q}^r → {1,…,q}^{r−1},  1 ≤ S ≤ r,

F_r : {1,…,q}^r → P({1,…,q}^{r−1}),

and

T_r : P({1,…,q}^r) → P({1,…,q}^{r−1}),

by

F_{r,S}(i_1,…,i_r) := (i_1, i_2, …, i_{S−1}, i_{S+1}, …, i_r),  1 ≤ S ≤ r,

F_r(i_1,…,i_r) := ⋃_{S=1}^{r} {F_{r,S}(i_1,…,i_r)},

and

T_r(I) := ∅ if I = ∅, and T_r(I) := ⋃_{(i_1,…,i_r)∈I} F_r(i_1,…,i_r) if I ≠ ∅.

Next, let the function

α_{r,i} : {1,…,q}^r → ℕ,  1 ≤ i ≤ q,

be defined by

α_{r,i}(i_1,…,i_r) := the number of occurrences of i in the sequence (i_1,…,i_r).

For each I ∈ P({1,…,q}^r) let

α_{I,i} := ∑_{(i_1,…,i_r)∈I} α_{r,i}(i_1,…,i_r),  1 ≤ i ≤ q.

(H1) Let n, m be fixed positive integers such that n ≥ 1, m ≥ 2, and let I_m be a subset of {1,…,n}^m such that

α_{I_m,i} ≥ 1,  1 ≤ i ≤ n.

Introduce the sets I_l ⊆ {1,…,n}^l (m−1 ≥ l ≥ 1) inductively by

I_{l−1} := T_l(I_l),  m ≥ l ≥ 2.

Obviously I_1 = {1,…,n} by (H1), and this ensures that α_{I_1,i} = 1 (1 ≤ i ≤ n). From (H1) we also have α_{I_l,i} ≥ 1 (m−1 ≥ l ≥ 1, 1 ≤ i ≤ n).

For m ≥ l ≥ 2, and for any (j_1,…,j_{l−1}) ∈ I_{l−1}, let

H_{I_l}(j_1,…,j_{l−1}) := {((i_1,…,i_l), k) ∈ I_l × {1,…,l} | F_{l,k}(i_1,…,i_l) = (j_1,…,j_{l−1})}.

With the help of these sets they define the functions η_{I_m,l} : I_l → ℕ (m ≥ l ≥ 1) inductively by

η_{I_m,m}(i_1,…,i_m) := 1,  (i_1,…,i_m) ∈ I_m;

η_{I_m,l−1}(j_1,…,j_{l−1}) := ∑_{((i_1,…,i_l),k)∈H_{I_l}(j_1,…,j_{l−1})} η_{I_m,l}(i_1,…,i_l).

They define some special expressions for 1 ≤ l ≤ m, as follows:

A_{m,l} = A_{m,l}(I_m, x_1,…,x_n, p_1,…,p_n; f) := ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) × (∑_{j=1}^{l} p_{i_j}/α_{I_m,i_j}) f( (∑_{j=1}^{l} (p_{i_j}/α_{I_m,i_j}) x_{i_j}) / (∑_{j=1}^{l} p_{i_j}/α_{I_m,i_j}) )
and prove the following theorem.
Theorem 1.1.

Assume (H1), and let f : I → ℝ be a convex function, where I ⊆ ℝ is an interval. If x_1,…,x_n ∈ I and p_1,…,p_n are positive real numbers such that ∑_{S=1}^{n} p_S = 1, then

(1) f(∑_{S=1}^{n} p_S x_S) ≤ A_{m,m} ≤ A_{m,m−1} ≤ … ≤ A_{m,2} ≤ A_{m,1} = ∑_{S=1}^{n} p_S f(x_S).

We define the following functionals by taking the differences of the refinement of Jensen's inequality given in (1):

(2) Θ_1(f) = A_{m,r} − f(∑_{S=1}^{n} p_S x_S),  r = 1,…,m,

(3) Θ_2(f) = A_{m,r} − A_{m,k},  1 ≤ r < k ≤ m.

Under the assumptions of Theorem 1.1, we have

(4) Θ_i(f) ≥ 0,  i = 1, 2.

Inequalities (4) are reversed if f is concave on I.
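For r = 1, the functional Θ_1 is the classical Jensen gap, since A_{m,1} reduces to ∑ p_S f(x_S). A minimal numerical sanity check of this endpoint of (4), with the interior terms A_{m,l} omitted (function name ours), is:

```python
import math

def jensen_gap(f, p, x):
    """Classical Jensen gap: sum p_S f(x_S) - f(sum p_S x_S); >= 0 for convex f."""
    mean = sum(pi * xi for pi, xi in zip(p, x))
    return sum(pi * f(xi) for pi, xi in zip(p, x)) - f(mean)

p = [0.2, 0.5, 0.3]      # positive weights summing to 1
x = [0.0, 1.0, 2.0]

gap_convex = jensen_gap(math.exp, p, x)            # exp is convex: gap >= 0
gap_concave = jensen_gap(lambda t: -t * t, p, x)   # -t^2 is concave: gap <= 0
```
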

In [26], the Green function G : [α_1, α_2] × [α_1, α_2] → ℝ is defined as

(5) G(u,v) = { (u−α_2)(v−α_1)/(α_2−α_1),  α_1 ≤ v ≤ u;  (v−α_2)(u−α_1)/(α_2−α_1),  u ≤ v ≤ α_2. }

The function G is convex with respect to v and, by symmetry, also convex with respect to u. One can also note that G is continuous.

In [31] it is shown that any function f : [α_1,α_2] → ℝ with f ∈ C²([α_1,α_2]) can be written as

(6) f(u) = ((α_2−u)/(α_2−α_1)) f(α_1) + ((u−α_1)/(α_2−α_1)) f(α_2) + ∫_{α_1}^{α_2} G(u,v) f″(v) dv.
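Representation (6) can be checked numerically for a concrete smooth function; the sketch below uses f = exp (so f″ = exp) on [0, 1] with a simple midpoint rule, purely as an illustration (helper names ours):

```python
import math

a1, a2 = 0.0, 1.0

def G(u, v):
    """Green function (5) on [a1, a2] x [a1, a2]."""
    if v <= u:
        return (u - a2) * (v - a1) / (a2 - a1)
    return (v - a2) * (u - a1) / (a2 - a1)

def midpoint_integral(g, a, b, n=20000):
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

u = 0.3
# Right-hand side of (6) with f = exp, f'' = exp.
rhs = ((a2 - u) / (a2 - a1)) * math.exp(a1) \
    + ((u - a1) / (a2 - a1)) * math.exp(a2) \
    + midpoint_integral(lambda v: G(u, v) * math.exp(v), a1, a2)
```

The quadrature error is far below the tolerance used, so `rhs` agrees with exp(0.3) to many digits.
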

2. Inequalities for Csiszár divergence

In [11,12] Csiszár introduced the following notion.

Definition 1.

Let f : ℝ_+ → ℝ_+ be a convex function, and let r = (r_1,…,r_n) and q = (q_1,…,q_n) be positive probability distributions. Then the f-divergence functional is defined by

(7) I_f(r, q) := ∑_{i=1}^{n} q_i f(r_i/q_i).

He stated that, by defining

(8) f(0) := lim_{x→0^+} f(x);  0 f(0/0) := 0;  0 f(a/0) := lim_{x→0^+} x f(a/x),  a > 0,

we can also use nonnegative probability distributions.
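For strictly positive distributions, the functional (7) is immediate to evaluate; a small sketch (function names ours) with the convex function f(t) = (t − 1)², which yields the Pearson χ²-divergence, checking nonnegativity and that I_f(r, r) = 0:

```python
def f_divergence(f, r, q):
    """Csiszar f-divergence (7) for strictly positive distributions r, q."""
    return sum(qi * f(ri / qi) for ri, qi in zip(r, q))

chi2 = lambda t: (t - 1.0) ** 2   # convex on (0, inf), chi2(1) = 0

r = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]
d = f_divergence(chi2, r, q)      # > 0 here, and 0 exactly when r == q
```
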

In [18], L. Horváth et al. gave the following functional based on the previous definition.

Definition 2.

Let I ⊆ ℝ be an interval, let f : I → ℝ be a function, and let r = (r_1,…,r_n) ∈ ℝ^n and q = (q_1,…,q_n) ∈ (0,∞)^n be such that

r_S/q_S ∈ I,  S = 1,…,n.

Then they define the sum Î_f(r, q) as

(9) Î_f(r, q) := ∑_{S=1}^{n} q_S f(r_S/q_S).

We apply Theorem 1.1 to Î_f(r, q).

Theorem 2.1.

Assume (H1), let I ⊆ ℝ be an interval, and let r = (r_1,…,r_n) and q = (q_1,…,q_n) be in (0,∞)^n such that

r_S/q_S ∈ I,  S = 1,…,n.

(i) If f : I → ℝ is a convex function, then

(10) Î_f(r, q) = ∑_{S=1}^{n} q_S f(r_S/q_S) = A_{m,1}^{[1]} ≥ A_{m,2}^{[1]} ≥ … ≥ A_{m,m−1}^{[1]} ≥ A_{m,m}^{[1]} ≥ f(∑_{S=1}^{n} r_S / ∑_{S=1}^{n} q_S) ∑_{S=1}^{n} q_S,

where

(11) A_{m,l}^{[1]} = ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) f( (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) / (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ).

If f is a concave function, then the inequality signs in (10) are reversed.

(ii) If f : I → ℝ is a function such that x ↦ x f(x) (x ∈ I) is convex, then

(12) (∑_{S=1}^{n} r_S) f(∑_{S=1}^{n} r_S / ∑_{S=1}^{n} q_S) ≤ A_{m,m}^{[2]} ≤ A_{m,m−1}^{[2]} ≤ … ≤ A_{m,2}^{[2]} ≤ A_{m,1}^{[2]} = ∑_{S=1}^{n} r_S f(r_S/q_S) = Î_{id f}(r, q),

where

A_{m,l}^{[2]} = ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ( (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) / (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ) × f( (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) / (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ).

Proof.

  • (i) Taking p_S = q_S / ∑_{T=1}^{n} q_T and x_S = r_S/q_S in Theorem 1.1, we have

(13) f( ∑_{S=1}^{n} (q_S/∑_{T=1}^{n} q_T)(r_S/q_S) ) ≤ ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) × (∑_{j=1}^{l} q_{i_j}/(α_{I_m,i_j} ∑_{T=1}^{n} q_T)) f( (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) / (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ) ≤ ∑_{S=1}^{n} (q_S/∑_{T=1}^{n} q_T) f(r_S/q_S).

Multiplying (13) by ∑_{T=1}^{n} q_T, we obtain (10).

  • (ii) Using f := id·f (where "id" is the identity function) in Theorem 1.1, we have

(14) (∑_{S=1}^{n} p_S x_S) f(∑_{S=1}^{n} p_S x_S) ≤ ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) × (∑_{j=1}^{l} p_{i_j}/α_{I_m,i_j}) ( (∑_{j=1}^{l} (p_{i_j}/α_{I_m,i_j}) x_{i_j}) / (∑_{j=1}^{l} p_{i_j}/α_{I_m,i_j}) ) f( (∑_{j=1}^{l} (p_{i_j}/α_{I_m,i_j}) x_{i_j}) / (∑_{j=1}^{l} p_{i_j}/α_{I_m,i_j}) ) ≤ ∑_{S=1}^{n} p_S x_S f(x_S).

Now, substituting p_S = q_S/∑_{T=1}^{n} q_T and x_S = r_S/q_S, S = 1,…,n, this chain becomes

(15) (∑_{S=1}^{n} r_S / ∑_{T=1}^{n} q_T) f(∑_{S=1}^{n} r_S / ∑_{S=1}^{n} q_S) ≤ A_{m,l}^{[2]} / ∑_{T=1}^{n} q_T ≤ (1/∑_{T=1}^{n} q_T) ∑_{S=1}^{n} r_S f(r_S/q_S).

Multiplying (15) by ∑_{T=1}^{n} q_T on both sides, we get (12). □

3. Inequalities for Shannon Entropy

Definition 3

(See [18]). The Shannon entropy of a positive probability distribution r = (r_1,…,r_n) is defined by

(16) S := −∑_{S=1}^{n} r_S log(r_S).

Corollary 3.1.

Assume (H1).

(i) If q = (q_1,…,q_n) ∈ (0,∞)^n, and the base of log is greater than 1, then

(17) S ≤ A_{m,m}^{[3]} ≤ A_{m,m−1}^{[3]} ≤ … ≤ A_{m,2}^{[3]} ≤ A_{m,1}^{[3]} = log(n / ∑_{S=1}^{n} q_S) ∑_{S=1}^{n} q_S,

where

(18) A_{m,l}^{[3]} = −((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) log(∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}).

If the base of log is between 0 and 1, then the inequality signs in (17) are reversed.

(ii) If q = (q_1,…,q_n) is a positive probability distribution and the base of log is greater than 1, then we have the estimates for the Shannon entropy of q:

(19) S ≤ A_{m,m}^{[4]} ≤ A_{m,m−1}^{[4]} ≤ … ≤ A_{m,2}^{[4]} ≤ A_{m,1}^{[4]} = log(n),

where

A_{m,l}^{[4]} = −((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) log(∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}).

Proof.

(i) Using f:=log and r=(1,,1) in Theorem 2.1 (i), we get (17).

(ii) It is the special case of (i). □

Definition 4

(See [18])

The Kullback–Leibler divergence between the positive probability distributions r = (r_1,…,r_n) and q = (q_1,…,q_n) is defined by

(20) D(r, q) := ∑_{S=1}^{n} r_S log(r_S/q_S).
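Definition (20) translates directly into code; the sketch below (function name ours, natural logarithm) also checks the nonnegativity D(r, q) ≥ 0 for probability distributions, which is the content of Corollary 3.2 (ii):

```python
import math

def kl_divergence(r, q):
    """Kullback-Leibler divergence (20) for strictly positive distributions."""
    return sum(ri * math.log(ri / qi) for ri, qi in zip(r, q))

r = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
d = kl_divergence(r, q)   # > 0 for these distinct distributions
```
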

Corollary 3.2.

Assume (H1).

(i) Let r = (r_1,…,r_n) ∈ (0,∞)^n and q = (q_1,…,q_n) ∈ (0,∞)^n. If the base of log is greater than 1, then

(21) ∑_{S=1}^{n} r_S log(∑_{S=1}^{n} r_S / ∑_{S=1}^{n} q_S) ≤ A_{m,m}^{[5]} ≤ A_{m,m−1}^{[5]} ≤ … ≤ A_{m,2}^{[5]} ≤ A_{m,1}^{[5]} = ∑_{S=1}^{n} r_S log(r_S/q_S) = D(r, q),

where

A_{m,l}^{[5]} = ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ( (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) / (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ) × log( (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) / (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ).

If the base of log is between 0 and 1, then the inequalities in (21) are reversed.

(ii) If r and q are positive probability distributions, and the base of log is greater than 1, then we have

(22) D(r, q) = A_{m,1}^{[6]} ≥ A_{m,2}^{[6]} ≥ … ≥ A_{m,m−1}^{[6]} ≥ A_{m,m}^{[6]} ≥ 0,

where

A_{m,l}^{[6]} = ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ( (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) / (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ) × log( (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) / (∑_{j=1}^{l} q_{i_j}/α_{I_m,i_j}) ).

If the base of log is between 0 and 1, then the inequality signs in (22) are reversed.

Proof.

  • (i) On taking f:=log in Theorem 2.1 (ii), we get (21).

  • (ii) Since r and q are positive probability distributions, ∑_{S=1}^{n} r_S = ∑_{S=1}^{n} q_S = 1, so the smallest term in (21) is

(23) ∑_{S=1}^{n} r_S log(∑_{S=1}^{n} r_S / ∑_{S=1}^{n} q_S) = 0.

Hence for positive probability distributions r and q, (21) becomes (22). □

4. Inequalities for Rényi Divergence and Entropy

The Rényi divergence and entropy come from [28].

Definition 5.

Let r = (r_1,…,r_n) and q = (q_1,…,q_n) be positive probability distributions, and let λ ≥ 0, λ ≠ 1.

  • (a) The Rényi divergence of order λ is defined by

(24) D_λ(r, q) := (1/(λ−1)) log( ∑_{i=1}^{n} q_i (r_i/q_i)^λ ).

  • (b) The Rényi entropy of order λ of r is defined by

(25) H_λ(r) := (1/(1−λ)) log( ∑_{i=1}^{n} r_i^λ ).

The Rényi divergence and the Rényi entropy can also be extended to nonnegative probability distributions. If λ → 1 in (24), we recover the Kullback–Leibler divergence, and if λ → 1 in (25), we recover the Shannon entropy. In the next two results, inequalities for the Rényi divergence are given.
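These definitions, the λ → 1 limit, and the monotonicity in the order λ that Theorem 4.1 (i) refines can all be illustrated numerically; a sketch with natural logarithms (function names ours):

```python
import math

def renyi_divergence(r, q, lam):
    """Renyi divergence (24) of order lam (lam >= 0, lam != 1)."""
    return math.log(sum(qi * (ri / qi) ** lam for ri, qi in zip(r, q))) / (lam - 1.0)

def kl_divergence(r, q):
    return sum(ri * math.log(ri / qi) for ri, qi in zip(r, q))

r = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

d_half = renyi_divergence(r, q, 0.5)
d_two = renyi_divergence(r, q, 2.0)              # D_lam is nondecreasing in lam
d_near_one = renyi_divergence(r, q, 1.0 + 1e-6)  # approaches the KL divergence
```
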

Theorem 4.1.

Assume (H1), and let r = (r_1,…,r_n) and q = (q_1,…,q_n) be probability distributions.

(i) If 0 ≤ λ ≤ μ with λ, μ ≠ 1, and the base of log is greater than 1, then

(26) D_λ(r, q) ≤ A_{m,m}^{[7]} ≤ A_{m,m−1}^{[7]} ≤ … ≤ A_{m,2}^{[7]} ≤ A_{m,1}^{[7]} = D_μ(r, q),

where

A_{m,l}^{[7]} = (1/(μ−1)) log( ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) × ( (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) )^{(μ−1)/(λ−1)} ).

The reverse inequalities hold in (26) if the base of log is between 0 and 1.

(ii) If 1 < μ and the base of log is greater than 1, then

(27) D_1(r, q) = D(r, q) = ∑_{S=1}^{n} r_S log(r_S/q_S) ≤ A_{m,m}^{[8]} ≤ A_{m,m−1}^{[8]} ≤ … ≤ A_{m,2}^{[8]} ≤ A_{m,1}^{[8]} = D_μ(r, q),

where

A_{m,l}^{[8]} = (1/(μ−1)) log( ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) × exp( (μ−1) (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j}) log(r_{i_j}/q_{i_j})) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ) );

here the base of exp is the same as the base of log, and the reverse inequalities hold if the base of log is between 0 and 1.

(iii) If 0 ≤ λ < 1, and the base of log is greater than 1, then

(28) D_λ(r, q) ≤ A_{m,m}^{[9]} ≤ A_{m,m−1}^{[9]} ≤ … ≤ A_{m,2}^{[9]} ≤ A_{m,1}^{[9]} = D_1(r, q),

where

(29) A_{m,l}^{[9]} = (1/(λ−1)) ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) × log( (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ).

Proof.

By applying Theorem 1.1 with I = (0,∞), f : (0,∞) → ℝ, f(t) = t^{(μ−1)/(λ−1)},

p_S := r_S,  x_S := (r_S/q_S)^{λ−1},  S = 1,…,n,

we have

(30) (∑_{S=1}^{n} q_S (r_S/q_S)^λ)^{(μ−1)/(λ−1)} = (∑_{S=1}^{n} r_S (r_S/q_S)^{λ−1})^{(μ−1)/(λ−1)} ≤ ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) × ( (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) )^{(μ−1)/(λ−1)} ≤ ∑_{S=1}^{n} r_S ((r_S/q_S)^{λ−1})^{(μ−1)/(λ−1)}

if either 0 ≤ λ < 1 < μ or 1 < λ ≤ μ, and the reverse inequalities in (30) hold if 0 ≤ λ ≤ μ < 1. Raising to the power 1/(μ−1), in all cases we obtain

(31) (∑_{S=1}^{n} q_S (r_S/q_S)^λ)^{1/(λ−1)} ≤ ( ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) × ( (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) )^{(μ−1)/(λ−1)} )^{1/(μ−1)} ≤ ( ∑_{S=1}^{n} r_S ((r_S/q_S)^{λ−1})^{(μ−1)/(λ−1)} )^{1/(μ−1)} = (∑_{S=1}^{n} q_S (r_S/q_S)^μ)^{1/(μ−1)}.

Since log is increasing when its base is greater than 1, (26) now follows. If the base of log is between 0 and 1, then log is decreasing and the inequalities in (26) are reversed. For λ = 1 and μ = 1 we obtain (ii) and (iii), respectively, by taking the limit as the corresponding parameter tends to 1. □

Theorem 4.2.

Assume (H1), and let r = (r_1,…,r_n) and q = (q_1,…,q_n) be probability distributions. If either 0 ≤ λ < 1 and the base of log is greater than 1, or 1 < λ and the base of log is between 0 and 1, then

(32) (1/∑_{S=1}^{n} q_S (r_S/q_S)^λ) ∑_{S=1}^{n} q_S (r_S/q_S)^λ log(r_S/q_S) = A_{m,1}^{[10]} ≤ A_{m,2}^{[10]} ≤ … ≤ A_{m,m−1}^{[10]} ≤ A_{m,m}^{[10]} ≤ D_λ(r,q) ≤ A_{m,m}^{[11]} ≤ A_{m,m−1}^{[11]} ≤ … ≤ A_{m,2}^{[11]} ≤ A_{m,1}^{[11]} = D_1(r,q),

where

A_{m,l}^{[10]} = (1/((λ−1) ∑_{S=1}^{n} q_S (r_S/q_S)^λ)) ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) × (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) log( (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) )

and

A_{m,l}^{[11]} = (1/(λ−1)) ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) × log( (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ).

The inequalities in (32) are reversed if either 0 ≤ λ < 1 and the base of log is between 0 and 1, or 1 < λ and the base of log is greater than 1.

Proof.

We prove only the case 0 ≤ λ < 1 with the base of log greater than 1; the other cases can be proved similarly. Since 1/(λ−1) < 0 and the function log is concave, choosing I = (0,∞), f := log, p_S := r_S, x_S := (r_S/q_S)^{λ−1} in Theorem 1.1, we have

(33) D_λ(r, q) = (1/(λ−1)) log( ∑_{S=1}^{n} q_S (r_S/q_S)^λ ) = (1/(λ−1)) log( ∑_{S=1}^{n} r_S (r_S/q_S)^{λ−1} ) ≤ (1/(λ−1)) ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) log( (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ) ≤ (1/(λ−1)) ∑_{S=1}^{n} r_S log((r_S/q_S)^{λ−1}) = ∑_{S=1}^{n} r_S log(r_S/q_S) = D_1(r, q),

which gives the upper bound for D_λ(r, q).

Since the base of log is greater than 1, the function x ↦ x log(x) (x > 0) is convex; as moreover 1/(λ−1) < 0, Theorem 1.1 gives

(34) D_λ(r, q) = (1/(λ−1)) log( ∑_{S=1}^{n} q_S (r_S/q_S)^λ ) = (1/((λ−1) ∑_{S=1}^{n} q_S (r_S/q_S)^λ)) (∑_{S=1}^{n} q_S (r_S/q_S)^λ) log( ∑_{S=1}^{n} q_S (r_S/q_S)^λ ) ≥ (1/((λ−1) ∑_{S=1}^{n} q_S (r_S/q_S)^λ)) ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ( (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ) log( (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ) = (1/((λ−1) ∑_{S=1}^{n} q_S (r_S/q_S)^λ)) ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) log( (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j})(r_{i_j}/q_{i_j})^{λ−1}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ) ≥ (1/((λ−1) ∑_{S=1}^{n} q_S (r_S/q_S)^λ)) ∑_{S=1}^{n} r_S (r_S/q_S)^{λ−1} log((r_S/q_S)^{λ−1}) = (1/∑_{S=1}^{n} q_S (r_S/q_S)^λ) ∑_{S=1}^{n} q_S (r_S/q_S)^λ log(r_S/q_S),

which gives the lower bound for D_λ(r, q). □

By using Theorems 4.1 and 4.2 and Definition 5, some inequalities for the Rényi entropy are obtained. Let 1/n = (1/n,…,1/n) denote the discrete uniform probability distribution.
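The identity H_λ(r) = log(n) − D_λ(r, 1/n), which appears as Eq. (40) in the proof of Corollary 4.3 below, is easy to confirm numerically; a sketch with natural logarithms (function names ours):

```python
import math

def renyi_divergence(r, q, lam):
    return math.log(sum(qi * (ri / qi) ** lam for ri, qi in zip(r, q))) / (lam - 1.0)

def renyi_entropy(r, lam):
    """Renyi entropy (25) of order lam != 1."""
    return math.log(sum(ri ** lam for ri in r)) / (1.0 - lam)

r = [0.5, 0.3, 0.2]
n = len(r)
uniform = [1.0 / n] * n
lam = 0.7

lhs = renyi_entropy(r, lam)
rhs = math.log(n) - renyi_divergence(r, uniform, lam)  # identity (40)
```
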

Corollary 4.3.

Assume (H1), and let r = (r_1,…,r_n) and q = (q_1,…,q_n) be positive probability distributions.

  • (i) If 0 ≤ λ ≤ μ, λ, μ ≠ 1, and the base of log is greater than 1, then

(35) H_λ(r) = log(n) − D_λ(r, 1/n) ≥ A_{m,m}^{[12]} ≥ A_{m,m−1}^{[12]} ≥ … ≥ A_{m,2}^{[12]} ≥ A_{m,1}^{[12]} = H_μ(r),

where

A_{m,l}^{[12]} = (1/(1−μ)) log( ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) × (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) × ( (∑_{j=1}^{l} r_{i_j}^λ/α_{I_m,i_j}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) )^{(μ−1)/(λ−1)} ).

The reverse inequalities hold in (35) if the base of log is between 0 and 1.

  • (ii) If 1 < μ and the base of log is greater than 1, then

(36) S = −∑_{S=1}^{n} r_S log(r_S) ≥ A_{m,m}^{[13]} ≥ A_{m,m−1}^{[13]} ≥ … ≥ A_{m,2}^{[13]} ≥ A_{m,1}^{[13]} = H_μ(r),

where

A_{m,l}^{[13]} = log(n) + (1/(1−μ)) log( ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) × exp( (μ−1) (∑_{j=1}^{l} (r_{i_j}/α_{I_m,i_j}) log(n r_{i_j})) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ) ),

and the base of exp is the same as the base of log. The inequalities in (36) are reversed if the base of log is between 0 and 1.

  • (iii) If 0 ≤ λ < 1, and the base of log is greater than 1, then

(37) H_λ(r) ≥ A_{m,m}^{[14]} ≥ A_{m,m−1}^{[14]} ≥ … ≥ A_{m,2}^{[14]} ≥ A_{m,1}^{[14]} = S,

where

(38) A_{m,l}^{[14]} = (1/(1−λ)) ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) × log( (∑_{j=1}^{l} r_{i_j}^λ/α_{I_m,i_j}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ).

The inequalities in (37) are reversed if the base of log is between 0 and 1.

Proof.

  • (i) Putting q = 1/n in (24), we have

(39) D_λ(r, 1/n) = (1/(λ−1)) log( ∑_{S=1}^{n} n^{λ−1} r_S^λ ) = log(n) + (1/(λ−1)) log( ∑_{S=1}^{n} r_S^λ ),

and therefore

(40) H_λ(r) = log(n) − D_λ(r, 1/n).

Now, using Theorem 4.1 (i) and (40), we get

(41) H_λ(r) = log(n) − D_λ(r, 1/n) ≥ log(n) − (1/(μ−1)) log( n^{μ−1} ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) × (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ( (∑_{j=1}^{l} r_{i_j}^λ/α_{I_m,i_j}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) )^{(μ−1)/(λ−1)} ) ≥ log(n) − D_μ(r, 1/n) = H_μ(r).

(ii) and (iii) can be proved similarly. □

Corollary 4.4.

Assume (H1), and let r = (r_1,…,r_n) and q = (q_1,…,q_n) be positive probability distributions.

If either 0 ≤ λ < 1 and the base of log is greater than 1, or 1 < λ and the base of log is between 0 and 1, then

(42) −(1/∑_{S=1}^{n} r_S^λ) ∑_{S=1}^{n} r_S^λ log(r_S) = A_{m,1}^{[15]} ≥ A_{m,2}^{[15]} ≥ … ≥ A_{m,m−1}^{[15]} ≥ A_{m,m}^{[15]} ≥ H_λ(r) ≥ A_{m,m}^{[16]} ≥ A_{m,m−1}^{[16]} ≥ … ≥ A_{m,2}^{[16]} ≥ A_{m,1}^{[16]} = H(r),

where

A_{m,l}^{[15]} = (1/((λ−1) ∑_{S=1}^{n} r_S^λ)) ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}^λ/α_{I_m,i_j}) log( n^{λ−1} (∑_{j=1}^{l} r_{i_j}^λ/α_{I_m,i_j}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) )

and

A_{m,l}^{[16]} = (1/(1−λ)) ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) log( (∑_{j=1}^{l} r_{i_j}^λ/α_{I_m,i_j}) / (∑_{j=1}^{l} r_{i_j}/α_{I_m,i_j}) ).

The inequalities in (42) are reversed if either 0 ≤ λ < 1 and the base of log is between 0 and 1, or 1 < λ and the base of log is greater than 1.

Proof.

The proof is similar to Corollary 4.3 by using Theorem 4.2. □

5. Inequalities by using Zipf–Mandelbrot law

In probability theory and statistics, the Zipf–Mandelbrot law is a discrete probability distribution. It is a power-law distribution on ranked data, named after the linguist G. K. Zipf, who suggested a simpler distribution called Zipf's law. Zipf's law is defined as follows (see [32]).

Definition 6.

Let N be the number of elements, S their rank, and t the value of the exponent characterizing the distribution. Zipf's law then predicts that, out of a population of N elements, the normalized frequency of the element of rank S, f(S, N, t), is

(43) f(S, N, t) = (1/S^t) / (∑_{j=1}^{N} 1/j^t).
The Zipf–Mandelbrot law is defined as follows (see [22]).

Definition 7.

The Zipf–Mandelbrot law is a discrete probability distribution depending on three parameters N ∈ {1,2,…}, q ∈ [0,∞) and t > 0, and is defined by

(44) f(S; N, q, t) := 1/((S+q)^t H_{N,q,t}),  S = 1,…,N,

where

(45) H_{N,q,t} = ∑_{j=1}^{N} 1/(j+q)^t.

If the total mass of the law is taken over all of ℕ, then for q ≥ 0, t > 1, S ∈ ℕ, the density function of the Zipf–Mandelbrot law becomes

(46) f(S; q, t) = 1/((S+q)^t H_{q,t}),

where

(47) H_{q,t} = ∑_{j=1}^{∞} 1/(j+q)^t.

For q = 0, the Zipf–Mandelbrot law (44) becomes Zipf's law (43).
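Definition 7 is straightforward to realize in code; the sketch below (function name ours) builds the finite law (44), verifies that it is a probability distribution, and checks that q = 0 recovers Zipf's law (43):

```python
def zipf_mandelbrot(N, q, t):
    """Finite Zipf-Mandelbrot law (44): f(S; N, q, t) for S = 1, ..., N."""
    H = sum(1.0 / (j + q) ** t for j in range(1, N + 1))   # H_{N,q,t} from (45)
    return [1.0 / ((S + q) ** t * H) for S in range(1, N + 1)]

zm = zipf_mandelbrot(10, 2.0, 1.5)
zipf = zipf_mandelbrot(10, 0.0, 1.5)   # q = 0: plain Zipf's law, f proportional to 1/S^t
```
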

Conclusion 5.1.

Assume (H1) and let r be the Zipf–Mandelbrot law. By Corollary 4.3 (iii) we get: if 0 ≤ λ < 1 and the base of log is greater than 1, then

(48) H_λ(r) = (1/(1−λ)) log( (1/H_{N,q,t}^λ) ∑_{S=1}^{N} 1/(S+q)^{λt} ) ≥ (1/(1−λ)) ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) (∑_{j=1}^{l} 1/(α_{I_m,i_j} (i_j+q)^t H_{N,q,t})) × log( (1/H_{N,q,t}^{λ−1}) (∑_{j=1}^{l} 1/(α_{I_m,i_j} (i_j+q)^{λt})) / (∑_{j=1}^{l} 1/(α_{I_m,i_j} (i_j+q)^t)) ) ≥ (t/H_{N,q,t}) ∑_{S=1}^{N} log(S+q)/(S+q)^t + log(H_{N,q,t}) = S.

The inequalities in (48) are reversed if the base of log is between 0 and 1.

Conclusion 5.2.

Assume (H1), and let r_1 and r_2 be Zipf–Mandelbrot laws with parameters N ∈ {1,2,…}, q_1, q_2 ∈ [0,∞) and t_1, t_2 > 0, respectively. Then, from Corollary 3.2 (ii), if the base of log is greater than 1 we have

(49) D̄(r_1, r_2) = ∑_{S=1}^{N} (1/((S+q_1)^{t_1} H_{N,q_1,t_1})) log( ((S+q_2)^{t_2} H_{N,q_2,t_2}) / ((S+q_1)^{t_1} H_{N,q_1,t_1}) ) ≥ ((m−1)!/(l−1)!) ∑_{(i_1,…,i_l)∈I_l} η_{I_m,l}(i_1,…,i_l) × (∑_{j=1}^{l} 1/(α_{I_m,i_j} (i_j+q_2)^{t_2} H_{N,q_2,t_2})) ( (∑_{j=1}^{l} 1/(α_{I_m,i_j} (i_j+q_1)^{t_1} H_{N,q_1,t_1})) / (∑_{j=1}^{l} 1/(α_{I_m,i_j} (i_j+q_2)^{t_2} H_{N,q_2,t_2})) ) × log( (∑_{j=1}^{l} 1/(α_{I_m,i_j} (i_j+q_1)^{t_1} H_{N,q_1,t_1})) / (∑_{j=1}^{l} 1/(α_{I_m,i_j} (i_j+q_2)^{t_2} H_{N,q_2,t_2})) ) ≥ 0.

The inequalities in (49) are reversed if the base of log is between 0 and 1.

6. Shannon entropy, Zipf–Mandelbrot law and hybrid Zipf–Mandelbrot law

Here we maximize the Shannon entropy using the method of Lagrange multipliers under some equality constraints, and obtain the Zipf–Mandelbrot law.

Theorem 6.1.

If J = {1,2,…,N}, then for a given q ≥ 0 the probability distribution that maximizes the Shannon entropy under the constraints

∑_{S∈J} r_S = 1,  ∑_{S∈J} r_S ln(S+q) := ψ,

is the Zipf–Mandelbrot law.

Proof.

If J = {1,2,…,N}, we set the Lagrange multipliers λ and t and consider the expression

S̃ = −∑_{S=1}^{N} r_S ln(r_S) − λ(∑_{S=1}^{N} r_S − 1) − t(∑_{S=1}^{N} r_S ln(S+q) − ψ).

Just for the sake of convenience, replace λ by ln(λ) − 1; then the last expression gives

S̃ = −∑_{S=1}^{N} r_S ln(r_S) − (ln(λ) − 1)(∑_{S=1}^{N} r_S − 1) − t(∑_{S=1}^{N} r_S ln(S+q) − ψ).

From ∂S̃/∂r_S = 0, for S = 1,2,…,N, we get

r_S = 1/(λ(S+q)^t),

and on using the constraint ∑_{S=1}^{N} r_S = 1, we have

λ = ∑_{S=1}^{N} 1/(S+q)^t,

where t > 0, concluding that

r_S = 1/((S+q)^t H_{N,q,t}),  S = 1,2,…,N. □

Remark 6.2.

Observe that the Shannon entropy of the Zipf–Mandelbrot law can be bounded from above (see [23]):

S = −∑_{S=1}^{N} f(S,N,q,t) ln(f(S,N,q,t)) ≤ −∑_{S=1}^{N} f(S,N,q,t) ln(q_S),

where (q_1,…,q_N) is a positive N-tuple such that ∑_{S=1}^{N} q_S = 1.
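This upper bound is an instance of Gibbs' inequality, −∑ p_S ln p_S ≤ −∑ p_S ln q_S. A numerical sketch for a Zipf–Mandelbrot law (function name ours), taking q to be the uniform distribution so that the bound becomes ln N:

```python
import math

def zipf_mandelbrot(N, q, t):
    """Finite Zipf-Mandelbrot law (44)."""
    H = sum(1.0 / (j + q) ** t for j in range(1, N + 1))
    return [1.0 / ((S + q) ** t * H) for S in range(1, N + 1)]

N = 50
p = zipf_mandelbrot(N, 1.5, 1.2)
entropy = -sum(pi * math.log(pi) for pi in p)
# Cross-entropy against the uniform N-tuple q_S = 1/N: equals log(N).
cross_entropy_uniform = -sum(pi * math.log(1.0 / N) for pi in p)
```
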

Theorem 6.3.

If J = {1,…,N}, then the probability distribution that maximizes the Shannon entropy under the constraints

∑_{S∈J} r_S := 1,  ∑_{S∈J} r_S ln(S+q) := Ψ,  ∑_{S∈J} S r_S := η,

is the hybrid Zipf–Mandelbrot law given by

r_S = w^S/((S+q)^k Φ_J(k,q,w)),  S ∈ J,

where

Φ_J(k,q,w) = ∑_{S∈J} w^S/(S+q)^k.

Proof.

First consider J = {1,…,N}; we set the Lagrange multipliers and consider the expression

S̃ = −∑_{S=1}^{N} r_S ln(r_S) + ln(w)(∑_{S=1}^{N} S r_S − η) − (ln(λ) − 1)(∑_{S=1}^{N} r_S − 1) − k(∑_{S=1}^{N} r_S ln(S+q) − Ψ).

On setting ∂S̃/∂r_S = 0, for S = 1,…,N, we get

−ln(r_S) + S ln(w) − ln(λ) − k ln(S+q) = 0;

after solving for r_S and using the constraint ∑_{S=1}^{N} r_S = 1, we get λ = ∑_{S=1}^{N} w^S/(S+q)^k, and we recognize this as the partial sum of Lerch's transcendent, which we denote by

Φ_N^*(k,q,w) = ∑_{S=1}^{N} w^S/(S+q)^k,  with w ≥ 0, k > 0. □

Remark 6.4.

Observe that, for the hybrid Zipf–Mandelbrot law as well, the Shannon entropy can be bounded from above (see [23]):

S = −∑_{S=1}^{N} f_h(S,N,q,k) ln(f_h(S,N,q,k)) ≤ −∑_{S=1}^{N} f_h(S,N,q,k) ln(q_S),

where (q_1,…,q_N) is any positive N-tuple such that ∑_{S=1}^{N} q_S = 1.

Under the assumptions of Theorem 2.1 (i), define the nonnegative functionals as follows:

(50) Θ_3(f) = A_{m,r}^{[1]} − f(∑_{S=1}^{n} r_S / ∑_{S=1}^{n} q_S) ∑_{S=1}^{n} q_S,  r = 1,…,m,

(51) Θ_4(f) = A_{m,r}^{[1]} − A_{m,k}^{[1]},  1 ≤ r < k ≤ m.

Under the assumptions of Theorem 2.1 (ii), define the nonnegative functionals as follows:

(52) Θ_5(f) = A_{m,r}^{[2]} − (∑_{S=1}^{n} r_S) f(∑_{S=1}^{n} r_S / ∑_{S=1}^{n} q_S),  r = 1,…,m,

(53) Θ_6(f) = A_{m,r}^{[2]} − A_{m,k}^{[2]},  1 ≤ r < k ≤ m.

Under the assumptions of Corollary 3.1 (i), define the following nonnegative functionals:

(54) Θ_7(f) = A_{m,r}^{[3]} + ∑_{i=1}^{n} q_i log(q_i),  r = 1,…,m,

(55) Θ_8(f) = A_{m,r}^{[3]} − A_{m,k}^{[3]},  1 ≤ r < k ≤ m.

Under the assumptions of Corollary 3.1 (ii), define the following nonnegative functionals:

(56) Θ_9(f) = A_{m,r}^{[4]} − S,  r = 1,…,m,

(57) Θ_{10}(f) = A_{m,r}^{[4]} − A_{m,k}^{[4]},  1 ≤ r < k ≤ m.

Under the assumptions of Corollary 3.2 (i), define the nonnegative functionals as follows:

(58) Θ_{11}(f) = A_{m,r}^{[5]} − (∑_{S=1}^{n} r_S) log(∑_{S=1}^{n} r_S / ∑_{S=1}^{n} q_S),  r = 1,…,m,

(59) Θ_{12}(f) = A_{m,r}^{[5]} − A_{m,k}^{[5]},  1 ≤ r < k ≤ m.

Under the assumptions of Corollary 3.2 (ii), define the nonnegative functional:

(60) Θ_{13}(f) = A_{m,r}^{[6]} − A_{m,k}^{[6]},  1 ≤ r < k ≤ m.

Under the assumptions of Theorem 4.1 (i), consider the following functionals:

(61) Θ_{14}(f) = A_{m,r}^{[7]} − D_λ(r, q),  r = 1,…,m,

(62) Θ_{15}(f) = A_{m,r}^{[7]} − A_{m,k}^{[7]},  1 ≤ r < k ≤ m.

Under the assumptions of Theorem 4.1 (ii), consider the following functionals:

(63) Θ_{16}(f) = A_{m,r}^{[8]} − D_1(r, q),  r = 1,…,m,

(64) Θ_{17}(f) = A_{m,r}^{[8]} − A_{m,k}^{[8]},  1 ≤ r < k ≤ m.

Under the assumptions of Theorem 4.1 (iii), consider the following functionals:

(65) Θ_{18}(f) = A_{m,r}^{[9]} − D_λ(r, q),  r = 1,…,m,

(66) Θ_{19}(f) = A_{m,r}^{[9]} − A_{m,k}^{[9]},  1 ≤ r < k ≤ m.

Under the assumptions of Theorem 4.2, consider the following nonnegative functionals:

(67) Θ_{20}(f) = D_λ(r, q) − A_{m,r}^{[10]},  r = 1,…,m,

(68) Θ_{21}(f) = A_{m,k}^{[10]} − A_{m,r}^{[10]},  1 ≤ r < k ≤ m,

(69) Θ_{22}(f) = A_{m,r}^{[11]} − D_λ(r, q),  r = 1,…,m,

(70) Θ_{23}(f) = A_{m,r}^{[11]} − A_{m,k}^{[11]},  1 ≤ r < k ≤ m,

(71) Θ_{24}(f) = A_{m,r}^{[11]} − A_{m,k}^{[10]},  r = 1,…,m, k = 1,…,m.

Under the assumptions of Corollary 4.3 (i), consider the following nonnegative functionals:

(72) Θ_{25}(f) = H_λ(r) − A_{m,r}^{[12]},  r = 1,…,m,

(73) Θ_{26}(f) = A_{m,k}^{[12]} − A_{m,r}^{[12]},  1 ≤ r < k ≤ m.

Under the assumptions of Corollary 4.3 (ii), consider the following functionals:

(74) Θ_{27}(f) = S − A_{m,r}^{[13]},  r = 1,…,m,

(75) Θ_{28}(f) = A_{m,k}^{[13]} − A_{m,r}^{[13]},  1 ≤ r < k ≤ m.

Under the assumptions of Corollary 4.3 (iii), consider the following functionals:

(76) Θ_{29}(f) = H_λ(r) − A_{m,r}^{[14]},  r = 1,…,m,

(77) Θ_{30}(f) = A_{m,k}^{[14]} − A_{m,r}^{[14]},  1 ≤ r < k ≤ m.

Under the assumptions of Corollary 4.4, define the following functionals:

(78) Θ_{31}(f) = A_{m,r}^{[15]} − H_λ(r),  r = 1,…,m,

(79) Θ_{32}(f) = A_{m,r}^{[15]} − A_{m,k}^{[15]},  1 ≤ r < k ≤ m,

(80) Θ_{33}(f) = H_λ(r) − A_{m,r}^{[16]},  r = 1,…,m,

(81) Θ_{34}(f) = A_{m,k}^{[16]} − A_{m,r}^{[16]},  1 ≤ r < k ≤ m,

(82) Θ_{35}(f) = A_{m,r}^{[15]} − A_{m,k}^{[16]},  r = 1,…,m, k = 1,…,m.

7. Generalization of the refinement of Jensen-, Rényi- and Shannon-type inequalities via Fink's identity and the Abel–Gontscharoff Green function

In [13], A. M. Fink gave the following result.

Let f : [α_1,α_2] → ℝ, where [α_1,α_2] is an interval, be a function such that f^{(n−1)} is absolutely continuous. Then the following identity holds:

(83) f(z) = (n/(α_2−α_1)) ∫_{α_1}^{α_2} f(ζ) dζ + ∑_{λ=1}^{n−1} ((n−λ)/λ!) ( (f^{(λ−1)}(α_2)(z−α_2)^λ − f^{(λ−1)}(α_1)(z−α_1)^λ) / (α_2−α_1) ) + (1/((n−1)!(α_2−α_1))) ∫_{α_1}^{α_2} (z−ζ)^{n−1} F_{α_1}^{α_2}(ζ,z) f^{(n)}(ζ) dζ,

where

(84) F_{α_1}^{α_2}(ζ,z) = { ζ−α_1,  α_1 ≤ ζ ≤ z ≤ α_2;  ζ−α_2,  α_1 ≤ z < ζ ≤ α_2. }

A complete reference for the Abel–Gontscharoff polynomial and the theorem for the ‘two-point right focal’ problem is given in [1].

The Abel–Gontscharoff ‘two-point right focal’ interpolating polynomial for n = 2 can be given as

(85) f(z) = f(α_1) + (z−α_1) f′(α_2) + ∫_{α_1}^{α_2} G_1(z,w) f″(w) dw,

where

(86) G_1(z,w) = { α_1−w,  α_1 ≤ w ≤ z;  α_1−z,  z ≤ w ≤ α_2. }
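Identity (85) can be verified numerically for a concrete smooth function; the sketch below uses f = exp (so f′ = f″ = exp) on [0, 1] with a midpoint rule, purely as an illustration (helper names ours):

```python
import math

a1, a2 = 0.0, 1.0

def G1(z, w):
    """'Two-point right focal' Green function (86)."""
    return (a1 - w) if w <= z else (a1 - z)

def midpoint_integral(g, a, b, n=20000):
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

z = 0.4
# Right-hand side of (85) with f = exp: f(a1) + (z - a1) f'(a2) + integral term.
rhs = math.exp(a1) + (z - a1) * math.exp(a2) \
    + midpoint_integral(lambda w: G1(z, w) * math.exp(w), a1, a2)
```
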

In [8], S. I. Butt et al. gave some new types of Green functions defined as

(87) G_2(z,w) = { α_2−z,  α_1 ≤ w ≤ z;  α_2−w,  z ≤ w ≤ α_2 },

(88) G_3(z,w) = { z−α_1,  α_1 ≤ w ≤ z;  w−α_1,  z ≤ w ≤ α_2 },

(89) G_4(z,w) = { α_2−w,  α_1 ≤ w ≤ z;  α_2−z,  z ≤ w ≤ α_2 }.

Figure 1 shows the graphs of the Green functions G_i(z,w), i = 1,2,3,4, defined in (86)–(89), for a fixed value of w. They also introduced some new Abel–Gontscharoff-type identities using these new Green functions, in the following lemma.

Lemma A.

Let f : [α_1,α_2] → ℝ be a twice differentiable function and let G_k (k = 2,3,4) be the ‘two-point right focal problem’-type Green functions defined by (87)–(89). Then the following identities hold:

(90) f(z) = f(α_2) − (α_2−z) f′(α_1) − ∫_{α_1}^{α_2} G_2(z,w) f″(w) dw,

(91) f(z) = f(α_2) − (α_2−α_1) f′(α_2) + (z−α_1) f′(α_1) + ∫_{α_1}^{α_2} G_3(z,w) f″(w) dw,

(92) f(z) = f(α_1) + (α_2−α_1) f′(α_1) − (α_2−z) f′(α_2) + ∫_{α_1}^{α_2} G_4(z,w) f″(w) dw.
Theorem 7.1.

Assume (H1), and let f : I = [α_1,α_2] → ℝ be a function such that, for an integer m ≥ 3, f^{(m−1)} is absolutely continuous. Also let x_1,…,x_n ∈ I and p_1,…,p_n be positive real numbers such that ∑_{i=1}^{n} p_i = 1. Assume that F_{α_1}^{α_2}, G_k (k = 1,2,3,4) and Θ_i (i = 1,…,35) are as defined in (84), (86)–(89), (2), (3) and (50)–(82), respectively.

Then:

  1. For k=1,3,4 we have the following identities:

    (93) Θ_i(f) = (m−2) ((f′(α_2)−f′(α_1))/(α_2−α_1)) ∫_{α_1}^{α_2} Θ_i(G_k(·,w)) dw + (1/(α_2−α_1)) ∫_{α_1}^{α_2} Θ_i(G_k(·,w)) × ∑_{λ=1}^{m−3} ((m−2−λ)/λ!) (f^{(λ+1)}(α_2)(w−α_2)^λ − f^{(λ+1)}(α_1)(w−α_1)^λ) dw + (1/((m−3)!(α_2−α_1))) ∫_{α_1}^{α_2} f^{(m)}(ζ) × ( ∫_{α_1}^{α_2} Θ_i(G_k(·,w)) (w−ζ)^{m−3} F_{α_1}^{α_2}(ζ,w) dw ) dζ,  i = 1,…,35.

  2. For k=2 we have

(94) Θ_i(f) = (−1)(m−2) ((f′(α_2)−f′(α_1))/(α_2−α_1)) ∫_{α_1}^{α_2} Θ_i(G_2(·,w)) dw + ((−1)/(α_2−α_1)) ∫_{α_1}^{α_2} Θ_i(G_2(·,w)) × ∑_{λ=1}^{m−3} ((m−2−λ)/λ!) (f^{(λ+1)}(α_2)(w−α_2)^λ − f^{(λ+1)}(α_1)(w−α_1)^λ) dw + ((−1)/((m−3)!(α_2−α_1))) ∫_{α_1}^{α_2} f^{(m)}(ζ) × ( ∫_{α_1}^{α_2} Θ_i(G_2(·,w)) (w−ζ)^{m−3} F_{α_1}^{α_2}(ζ,w) dw ) dζ,  i = 1,…,35.

Proof.

  • (i) Using the Abel–Gontscharoff-type identities (85), (91) and (92) in Θ_i(f), i = 1,…,35, and using the properties of Θ_i(f), we get

(95) Θ_i(f) = ∫_{α_1}^{α_2} Θ_i(G_k(·,w)) f″(w) dw,  k = 1,3,4.

From identity (83), applied to f″ with m−2 in place of n, we get

(96) f″(w) = (m−2) ((f′(α_2)−f′(α_1))/(α_2−α_1)) + ∑_{λ=1}^{m−3} ((m−2−λ)/λ!) ( (f^{(λ+1)}(α_2)(w−α_2)^λ − f^{(λ+1)}(α_1)(w−α_1)^λ) / (α_2−α_1) ) + (1/((m−3)!(α_2−α_1))) ∫_{α_1}^{α_2} (w−ζ)^{m−3} F_{α_1}^{α_2}(ζ,w) f^{(m)}(ζ) dζ.

Using (95) and (96) and applying Fubini's theorem, we get (93) for k = 1,3,4.

  • (ii) Substituting the Abel–Gontscharoff-type identity (90) into Θ_i(f), i = 1,…,35, and following steps similar to those in (i), we get (94). □

Theorem 7.2.

Assume (H1), and let f : I = [α_1,α_2] → ℝ be a function such that, for an integer m ≥ 3, f^{(m−1)} is absolutely continuous. Also let x_1,…,x_n ∈ I and p_1,…,p_n be positive real numbers such that ∑_{i=1}^{n} p_i = 1. Assume that F_{α_1}^{α_2}, G_k (k = 1,2,3,4) and Θ_i (i = 1,…,35) are as defined in (84), (86)–(89), (2), (3) and (50)–(82), respectively. For m ≥ 3 assume that

(97) ∫_{α_1}^{α_2} Θ_i(G_k(·,w)) (w−ζ)^{m−3} F_{α_1}^{α_2}(ζ,w) dw ≥ 0,  ζ ∈ [α_1,α_2],  i = 1,…,35,

for k = 1,3,4. If f is an m-convex function, then:

  • (i) For k = 1,3,4, the following holds:

(98) Θ_i(f) ≥ (m−2) ((f′(α_2)−f′(α_1))/(α_2−α_1)) ∫_{α_1}^{α_2} Θ_i(G_k(·,w)) dw + (1/(α_2−α_1)) ∫_{α_1}^{α_2} Θ_i(G_k(·,w)) × ∑_{λ=1}^{m−3} ((m−2−λ)/λ!) (f^{(λ+1)}(α_2)(w−α_2)^λ − f^{(λ+1)}(α_1)(w−α_1)^λ) dw,  i = 1,…,35.

  • (ii) For k = 2, we have

(99) Θ_i(f) ≥ (−1)(m−2) ((f′(α_2)−f′(α_1))/(α_2−α_1)) ∫_{α_1}^{α_2} Θ_i(G_2(·,w)) dw + ((−1)/(α_2−α_1)) ∫_{α_1}^{α_2} Θ_i(G_2(·,w)) × ∑_{λ=1}^{m−3} ((m−2−λ)/λ!) (f^{(λ+1)}(α_2)(w−α_2)^λ − f^{(λ+1)}(α_1)(w−α_1)^λ) dw,  i = 1,…,35.

Proof.

(i) Since f^{(m−1)} is absolutely continuous on [α_1,α_2], f^{(m)} exists almost everywhere. Moreover, since f is m-convex, we have f^{(m)}(ζ) ≥ 0 a.e. on [α_1,α_2]. Hence, applying identity (93) of Theorem 7.1 together with assumption (97), we obtain (98).

  • (ii) Similar to (i). □

Remark A.

We can investigate the bounds for the identities related to the generalization of the refinement of Jensen's inequality by using inequalities for the Čebyšev functional, and some results relating to Grüss- and Ostrowski-type inequalities can be constructed as in Section 3 of [6]. Also, we can construct nonnegative functionals from inequalities (98) and (99), give related mean value theorems, and construct new families of m-exponentially convex functions and Cauchy means related to these functionals, as in Section 4 of [6].

Figures

Figure 1. Graph of the Green functions for fixed w.

References

[1] R.P. Agarwal, P.J.Y. Wong, Error Inequalities in Polynomial Interpolation and their Applications, Kluwer Academic Publishers, Dordrecht/Boston/London, 1993.

[2] G. Anderson, Y. Ge, The size distribution of Chinese cities, Reg. Sci. Urban Econ. 35 (6) (2005) 756–776.

[3] F. Auerbach, Das Gesetz der Bevölkerungskonzentration, Petermanns Geographische Mitteilungen 59 (1913) 74–76.

[4] D. Black, V. Henderson, Urban evolution in the USA, J. Econ. Geogr. 3 (4) (2003) 343–372.

[5] M. Bosker, S. Brakman, H. Garretsen, M. Schramm, A century of shocks: the evolution of the German city size distribution 1925–1999, Reg. Sci. Urban Econ. 38 (4) (2008) 330–347.

[6] S.I. Butt, K.A. Khan, J. Pečarić, Generalization of Popoviciu inequality for higher order convex functions via Taylor's polynomial, Acta Univ. Apulensis Math. Inform. 42 (2015) 181–200.

[7] S.I. Butt, K.A. Khan, J. Pečarić, Popoviciu type inequalities via Hermite's polynomial, Math. Inequal. Appl. 19 (4) (2016) 1309–1318.

[8] S.I. Butt, N. Mehmood, J. Pečarić, New generalizations of Popoviciu type inequalities via new Green functions and Fink's identity, Trans. A. Razmadze Math. Inst. 171 (3) (2017) 293–303.

[9] S.I. Butt, J. Pečarić, Weighted Popoviciu type inequalities via generalized Montgomery identities, Rad Hrvat. Akad. Znan. Umjet. Mat. Znan. 19 (523) (2015) 69–89.

[10] S.I. Butt, J. Pečarić, Popoviciu's Inequality for n-Convex Functions, Lap Lambert Academic Publishing, 2016.

[11] I. Csiszár, Information measures: a critical survey, in: Trans. 7th Prague Conf. on Info. Th., Statist. Decis. Funct., Random Process and 8th European Meeting of Statist., Vol. B, Academia Prague, 1978, pp. 73–86.

[12] I. Csiszár, Information-type measures of difference of probability distributions and indirect observations, Stud. Sci. Math. Hungar. 2 (1967) 299–318.

[13] A.M. Fink, Bounds on the deviation of a function from its averages, Czechoslovak Math. J. 42 (2) (1992) 289–310.

[14] L. Horváth, A method to refine the discrete Jensen's inequality for convex and mid-convex functions, Math. Comput. Modelling 54 (9–10) (2011) 2451–2459.

[15] L. Horváth, K.A. Khan, J. Pečarić, Combinatorial Improvements of Jensen's Inequality / Classical and New Refinements of Jensen's Inequality with Applications, in: Monographs in Inequalities 8, Element, Zagreb, 2014.

[16] L. Horváth, K.A. Khan, J. Pečarić, Refinement of Jensen's inequality for operator convex functions, Advances in Inequalities and Applications (2014).

[17] L. Horváth, J. Pečarić, A refinement of the discrete Jensen's inequality, Math. Inequal. Appl. 14 (2011) 777–791.

[18] L. Horváth, Đ. Pečarić, J. Pečarić, Estimations of f- and Rényi divergences by using a cyclic refinement of the Jensen's inequality, Bull. Malays. Math. Sci. Soc. (2017) 1–14.

[19] Y.M. Ioannides, H.G. Overman, Zipf's law for cities: an empirical examination, Reg. Sci. Urban Econ. 33 (2) (2003) 127–137.

[20] S. Kullback, Information Theory and Statistics, Courier Corporation, 1997.

[21] S. Kullback, R.A. Leibler, On information and sufficiency, Ann. Math. Statist. 22 (1) (1951) 79–86.

[22] N. Lovričević, Đ. Pečarić, J. Pečarić, Zipf–Mandelbrot law, f-divergences and the Jensen-type interpolating inequalities, J. Inequal. Appl. 2018 (1) (2018) 36.

[23] M. Matić, C.E. Pearce, J. Pečarić, Shannon's and related inequalities in information theory, in: Survey on Classical Inequalities, Springer, Dordrecht, 2000, pp. 127–164.

[24] N. Mehmood, R.P. Agarwal, S.I. Butt, J. Pečarić, New generalizations of Popoviciu-type inequalities via new Green's functions and Montgomery identity, J. Inequal. Appl. 2017 (1) (2017) 108.

[25] T. Niaz, K.A. Khan, J. Pečarić, On generalization of refinement of Jensen's inequality using Fink's identity and Abel–Gontscharoff Green function, J. Inequal. Appl. 2017 (1) (2017) 254.

[26] J. Pečarić, K.A. Khan, I. Perić, Generalization of Popoviciu type inequalities for symmetric means generated by convex functions, J. Math. Comput. Sci. 4 (6) (2014) 1091–1113.

[27] J. Pečarić, F. Proschan, Y.L. Tong, Convex Functions, Partial Orderings and Statistical Applications, Academic Press, New York, 1992.

[28] A. Rényi, On measures of information and entropy, in: Proceedings of the Fourth Berkeley Symposium on Mathematics, Statistics and Probability, 1960, pp. 547–561.

[29] K.T. Rosen, M. Resnick, The size distribution of cities: an examination of the Pareto law and primacy, J. Urban Econ. 8 (2) (1980) 165–186.

[30] K.T. Soo, Zipf's Law for cities: a cross-country investigation, Reg. Sci. Urban Econ. 35 (3) (2005) 239–263.

[31] D.V. Widder, Completely convex functions and Lidstone series, Trans. Amer. Math. Soc. 51 (1942) 387–398.

[32] G.K. Zipf, Human Behaviour and the Principle of Least Effort, Addison-Wesley, Reading, MA, 1949.

Acknowledgements

The research of the fourth author was supported by the Ministry of Education and Science of the Russian Federation (Agreement No. 02.a03.21.0008). The authors wish to thank the anonymous referees for their very careful reading of the manuscript and their fruitful comments and suggestions. Authors' contribution: all authors jointly worked on the results, and all read and approved the final manuscript. Competing interests: the authors declare that there is no conflict of interest regarding the publication of this paper.

The publisher wishes to inform readers that the article "Estimation of different entropies via Abel–Gontscharoff Green functions and Fink's identity using Jensen type functionals" was originally published by the previous publisher of the Arab Journal of Mathematical Sciences and that the pagination of this article has subsequently changed. There has been no change to the content of the article. This change was necessary for the journal to transition from the previous publisher to the new one, and the publisher sincerely apologises for any inconvenience caused. To access and cite this article, please use: Khan, K.A., Niaz, T., Pečarić, Đ. and Pečarić, J. (2018), "Estimation of different entropies via Abel–Gontscharoff Green functions and Fink's identity using Jensen type functionals", Arab Journal of Mathematical Sciences, Vol. 26 No. 1/2, pp. 15-39. The original publication date for this paper was 31 December 2018.

Corresponding author

Tasadduq Niaz can be contacted at: tasadduq_khan@yahoo.com
