<a name="docA001">Index beginIndex this file
2009-06-08-19-10 start
This file http://freeman2.com/tute0008.htm
is Freeman's reading notes. Although Freeman
always keep correct view point. But Freeman's
capability is limited, plus no one proofread
this file. Then you can still find wrong view
point. When read, please put question mark as
often as possible. If you suspect any view
point wrong, please ask a math expert near by.
<a name="docA002">,<a name="textbook">
This file is a note for reading inequality
book written by ProfessorJ. MichaelSteeleThe Cauchy-Schwarz Master Class★★★★★
Below use 'textbook' as abbreviation.
Freeman also read web pages online, and will
indicate the source URL at discussion point.
<a name="docA003">
Freeman study mechanical engineering.
Engineering mathematics do not teach
inequality. Above book is first inequality
book. First time read, it was very hard.
Although high school time learned
a*a + b*b >= 2*a*b
But this little knowledge do not help.
<a name="docA004">
This file follow textbook chapter section
order, but not continuous, Freeman skip
those uncertain sections/problems.
This file first function is to learn
inequality. Second function is to learn
how to use html code to write math
equations.
<a name="docA005">Index beginIndex this file
On 2009-01-27-10-08 Freeman accessed
the next page
http://www.sftw.umac.mo/~fstitl/10mmo/inequality.html
save as sftw.umac.mo_text_math_eqn_good.htm
Above page is the main reference for html
math equation.
In order to let reader to build html math
equation, previous file tute0007.htm page
end has math symbol and internal code.
This file third function is to display how
to draw curves in web page. The main engine
is XYGraph v2.3 - Technical Figures
Thank you for read Freeman's inequality page.
2009-06-08-19-47 stop
<a name="docA006">
2009-08-23-15-00 start
□ Exercise 1.14 solution
Exercise 1.14 problem (a) is solved by hint
Exercise 1.14 problem (b) is out of LiuHH's
reach. What is "cardinality of a set B⊂Z3" ?
Without clear understand the definition,
LiuHH is unable to solve problem (b)
LiuHH major in mechanical engineering.
LiuHH is mathematics admirer and outsider.
<a name="docA007">
To solve problems in
The Cauchy-Schwarz Master Class
LiuHH's goal is to solve 50% of the exercise
problems. In reality, solve 40% is good enough.
Please do not be surprise that LiuHH skip some
text section, skip some problems.
2009-08-23-15-12 stop
<a name="ch13c001">Index beginIndex this file
2010-07-10-08-27 start
■ Exercise 13.1 problem statement
textbook page 204
(Two Doubly Stochastic Giveaways)
<a name="ch13c002">
Show that for positive x,y,z one
has the product bound
xyz≦(x/2+y/3+z/6)*(x/3+2y/3)
*(x/6+5z/6) ---eqn.BP001
and the awe inspiring reciprocal
bound
---page 276
---line 18
---eqn.BP004
width of above equation
<a name="ch13c007">
2010-07-10-09-16 here
one gets
(a,b,c)(<)(x,y,z) ---eqn.BP005
The inequalities of the exercise
then follow from the Schur
concavity of the map
(x,y,z)→xyz ---eqn.BP006
<a name="ch13c008">
and the Schur convexity of the
map
(x,y,z)→1/x5+1/y5+1/z5 ---eqn.BP007
2010-07-10-09-22 stop
<a name="ch13c009">Index beginIndex this file
2010-07-10-11-33 start
■ Exercise 13.1 solution
Review Problem 13.4<a name="ch13c010">
Schur's Majorization Inequality
The required condition is
if
φ:(a,b)→Real ---eqn.BO104
is a convex function,
<a name="ch13c011">
then the function
f:(a,b)n→Real ---eqn.BO105
defined by the sum
f(x1,x2,...,xn)
=∑[k=1,n]φ(xk) ---eqn.13.17
is Schur convex.
<a name="ch13c012">
We can change two 'convex' to
two 'concave', theorem is still
true for the concave case.
<a name="ch13c013">
eqn.13.17 say f(x) is a sum of
φ(t) with variable t change to
x1,x2,...,xn But we are given
eqn.BP001 which is a production
of x,y and z, not summation !!
<a name="ch13c014">Index beginIndex this file
Second question is eqn.BP001
has production of x,y and z,
then φ(t)=t, how can we say
φ(t)=t is convex or concave?
f(x)=x is a straight line !!
<a name="ch13c015">
Here we need help from
log(ABC)=log(A)+log(B)+log(C) ---eqn.BO151we can drop log(),
inequality is unchanged.
Let us set
φ(t)=log(t) ---eqn.BP008
//Alert: do NOT set φ(t)=t !!
<a name="ch13c016">
Define
f(x,y,z)=φ(x)+φ(y)+φ(z) ---eqn.BP009
that is
f(x,y,z)=log(x) ---eqn.BP010
+log(y)+log(z)
<a name="ch13c017">
If log(t) is convex, then f(x,y,z)
is Schur convex.
If log(t) is concave, then f(x,y,z)
is Schur concave.
<a name="ch13c018">
eqn.BP008 is analytic, let us
use second differentiation to
check.
φ(t)=log(t) ---eqn.BP008
d[φ(t)]/dt=d[log(t)]/dt
d[φ(t)]/dt=1/t ---eqn.BP011
<a name="ch13c019">Index beginIndex this file
dd[φ(t)]/dt/dt=d[1/t]/dt
dd[φ(t)]/dt/dt=-1/t/t ---eqn.BP012
Log function argument t must be
positive [log(negative) undefined]
second differentiation -1/t/t
is negative. (t squared)
<a name="ch13c020">
This tell us that
φ(t)=log(t) ---eqn.BP008
is a concave function. It follows
f(x,y,z)=log(x) ---eqn.BP010
+log(y)+log(z)
is Schur concave.
<a name="ch13c021">
After determined f(x,y,z) is
Schur concave. Next need find
out two sequences (a,b,c) and
(x,y,z) with majorization
relation. We expect that either
(a,b,c) (<) (x,y,z) ---eqn.BP013
or
(x,y,z) (<) (a,b,c) ---eqn.BP014
<a name="ch13c022">
Go back to eqn.BP001
Because we use
φ(t)=log(t) ---eqn.BP008
we need take log for eqn.BP001.
Take log or drop log do not
change inequality.
<a name="ch13c023">
Take or drop log(), exp()
monotone increase function.
do not change inequality
direction.
Take or drop -log(), -exp()
monotone decrease function.
change inequality direction
<a name="ch13c024">Index beginIndex this file
log(eqn.BP001) is next
log(xyz) ---eqn.BP015
≦log[(x/2+y/3+z/6)
*(x/3+2y/3)*(x/6+5z/6)]
//alert '≦' is same as eqn.BP001
it is
<a name="ch13c025">
log(x)+log(y)+log(z)
≦log[(x/2+y/3+z/6)]
+log[(x/3+2y/3)]
+log[(x/6+5z/6)] ---eqn.BP016
Both side use x,y,z, a little
confuse.
<a name="ch13c026">
Change eqn.BP016 to
log(a)+log(b)+log(c)
≦log[(x/2+y/3+z/6)]
+log[(x/3+2y/3)]
+log[(x/6+5z/6)] ---eqn.BP017
eqn.BP017 is Schur concave
function, not Schur convex.
<a name="ch13c027">
eqn.BP017 suggest us use two
sequences (a,b,c) and (x,y,z)
with the following relation
a=x/2+y/3+z/6 ---eqn.BP018
b=x/3+2y/3 +0*z ---eqn.BP019
c=x/6+5z/6 +0*y ---eqn.BP020
In matrix form it is
---page 276
---line 18
---eqn.BP003
width of above equation
<a name="ch13c029">Index beginIndex this file
Sequences (a,b,c) and (x,y,z)
are related by eqn.BP003.
Do they have majorization
relation? Exam the matrix
in eqn.BP003.
<a name="ch13c030">
(1) matrix all elements are
nonnegative. OK
(2) matrix column sum are
col.1=1/2+1/3+1/6=1
col.2=1/3+2/3+ 0 =1
col.3=1/6+ 0 +5/6=1
OK
<a name="ch13c031">
(3) matrix row sum are
row.1=1/2+1/3+1/6=1
row.2=1/3+2/3+ 0 =1
row.3=1/6+ 0 +5/6=1
OK. easy, symmetry !
<a name="ch13c032">
Three OK passed majorization
check. //why? ABCD
We have
(a,b,c) (<) (x,y,z) ---eqn.BP013
Problem 13.4 tell us
<a name="ch13c033">
for Schur CONCAVE function with
(a,b,c) (<) (x,y,z) ---eqn.BP013
we have //REVERSE inequality
f(a,b,c) ≧ f(x,y,z) ---eqn.BP021
Refer to
f(x,y,z)=log(x) ---eqn.BP010
+log(y)+log(z)
<a name="ch13c034">Index beginIndex this file
eqn.BP021 say that
log(a)+log(b)+log(c)
≧ ---eqn.BP022
log(x)+log(y)+log(z)
It is same as
log(abc)≧log(xyz) ---eqn.BP023
<a name="ch13c035">Drop log get
abc ≧ xyz ---eqn.BP024
a, b, c are defined in eqn.BP018
to eqn.BP020.
<a name="ch13c036">
Recover a,b,c get
(x/2+y/3+z/6)*(x/3+2y/3)
*(x/6+5z/6)
≧ xyz ---eqn.BP025
eqn.BP025 is same as eqn.BP001
Exercise 13.1 part one solved.
2010-07-10-13-07 stop
<a name="ch13c037">
2010-07-10-15-14 start
Exercise 13.1 part two is much
better than part one. Because
eqn.BP002 is an addition
equation. Not a production one.
<a name="ch13c038">
It is easy to see that
φ:(a,b)→Real ---eqn.BO104
is
φ(t)=1/t5 ---eqn.BP026
Is this φ(t) convex? concave?
Reader please do the work,
differentiate φ(t) twice, you
will find the second derivative
of φ(t) is positive. So
<a name="ch13c039">Index beginIndex this file
This φ(t) is convex. Then
f(x,y,z)=1/x5+1/y5+1/z5 ---eqn.BP027
is Schur convex.
Next we need to identify two
sequences.
<a name="ch13c040">
From eqn.BP002 we
find sequence (x,y,z)
and sequence
([z+y]/2, [3x+y+2z]/6,
[3x+2y+z]/6) ---eqn.BP028
Let the longer one be (a,b,c)
<a name="ch13c041">
by define
a=[z+y]/2 ---eqn.BP029
b=[3x+y+2z]/6 ---eqn.BP030
c=[3x+2y+z]/6 ---eqn.BP031
Write (a,b,c) and (x,y,z) in
matrix form, we get eqn.BP004<a name="ch13c042">
Sequences (a,b,c) and (x,y,z)
are related by eqn.BP004.
Do they have majorization
relation? Exam the matrix
<a name="ch13c043">
in eqn.BP004.
(1) matrix all elements are
nonnegative. OK
(2) matrix column sum are
col.1= 0 +1/2+1/2=1
col.2=1/2+1/6+1/3=1
col.3=1/2+1/3+1/6=1
OK
<a name="ch13c044">Index beginIndex this file
(3) matrix row sum are
row.1= 0 +1/2+1/2=1
row.2=1/2+1/6+1/3=1
row.3=1/2+1/3+1/6=1
OK. easy, symmetry !
<a name="ch13c045">
Three OK passed majorization
check. //why? ABCD
We have
(a,b,c) (<) (x,y,z) ---eqn.BP032
<a name="ch13c046">
for Schur CONVEX function with
(a,b,c) (<) (x,y,z)
we have //same direction inequality
f(a,b,c) ≦ f(x,y,z) ---eqn.BP033
Refer to
f(x,y,z)=1/x5+1/y5+1/z5 ---eqn.BP027
<a name="ch13c047">
we have
1/a5+1/b5+1/c5
≦ ---eqn.BP034
1/x5+1/y5+1/z5<a name="ch13c048">
Recover a,b,c from eqn.BP029
to eqn.BP031, we find
eqn.BP034 is same as our target
eqn.BP002.
Exercise 13.1 part two solved.
2010-07-10-16-03 stop
<a name="ch13c049">Index beginIndex this file
2010-07-10-16-06 start
■ Exercise 13.2 problem statement
textbook page 204
(Finding the Majorization)
<a name="ch13c050">
Given 1≦k≦n and real numbers
xj>0, 1≦j≦n ---eqn.BP035
such that
max(x1,x2,...,xn)
≦(x1+x2+...+xn)/k ---eqn.BP036
Show that one has
---page 204
---line 17
---eqn.13.21
width of above equation
2010-07-10-16-20 stop
<a name="ch13c052">Index beginIndex this file
2010-07-10-17-00 start
■ Exercise 13.2 hint
textbook page 276
<a name="ch13c053">
If we set
s=(x1+x2+...+xn)/k ---eqn.BP037
we have
(x1,x2,...,xn) (<)
(s,s,...,s,0,0,...0) ---eqn.BP038
when we take k copies of s.
<a name="ch13c054">
Thus, for convex
φ:[0,∞) to Real ---eqn.BP039
Schur's Majorization Inequality
(13.18)
∑[k=1,n]φ(αk)
≦∑[k=1,n]φ(βk) ---eqn.13.18<a name="ch13c055">
gives us
φ(x1)+φ(x2)+...+φ(xn)
≦(n-k)φ(0)+kφ(s) ---eqn.BP040
and we can set
φ(x)=1/(1+x) ---eqn.BP041
to obtain the bound (13.21)
2010-07-10-17-12 stop
<a name="ch13c056">Index beginIndex this file
2010-07-10-17-26 start
■ Exercise 13.2 solution
<a name="ch13c057">Schur's Majorization Inequality
require a one variable function
φ:(a,b)→Real ---eqn.BO104
require a n variable function
f(x1,x2,...,xn)
=∑[k=1,n]φ(xk) ---eqn.13.17
and require two sequences which
have majorization relation.
<a name="ch13c058">
Exercise 13.2 problem statement
give us 1≦k≦n and real numbers
xj>0, 1≦j≦n ---eqn.BP035
such that
max(x1,x2,...,xn)
≦(x1+x2+...+xn)/k ---eqn.BP036
<a name="ch13c059">
From Problem 13.5
max(x,y,z)≦(x+y+z)/2<1 ---eqn.13.19
we learned to build two sequences
(x,y,z) and (s,s,0)
Now here Exercise 13.2 eqn.BP036
has similar pattern as eqn.13.19
<a name="ch13c060">
We can build two sequences.
First sequence is
(x1,x2,...,xn)
For second sequence, define
s=(x1+x2+...+xn)/k ---eqn.BP037
then set second sequence as
(s,s,...,s,0,0,...0) ---eqn.BP042
<a name="ch13c061">Index beginIndex this file
Name two sequences as following
x_seq. (x1,x2,...,xn)
s_seq. (s,s,...,s,0,0,...0)
<a name="ch13c062">
Majorization relation of eqn.BP038
need one term comparison
two terms comparison etc.
For comparison consideration,
let random magnitude sequence
(x1,x2,...,xn) ---eqn.BP043
be sorted to
(x[1],x[2],...,x[n]) ---eqn.BP044
Such that
x[1]≧x[2]≧...≧x[n] ---eqn.BP045
Then x[1] is max(x1,x2,...,xn)
<a name="ch13c063">
The given condition
max(x1,x2,...,xn)
≦(x1+x2+...+xn)/k ---eqn.BP036
let us have
x[1]≦s=(x1+x2+...+xn)/k ---eqn.BP046
One term inequality is true.
2010-07-10-17-53 stop
<a name="ch13c064">
2010-07-10-19-21 start
//# one term two terms
// three terms comparison.
Use a numerical example.
Let sequence be
α seq.=[6,5,4,3,2,1] ---eqn.BP047
β seq.=[7,7,7,0,0,0] ---eqn.BP048
here n=6, k=3.
Both sequence total sum be 21
max(6,5,4,3,2,1)<(6+5+4+3+2+1)/3 ---eqn.BP049
that is 6 < 7 ---eqn.BP050
<a name="ch13c065">
One term comparison is
x[1]≦s---eqn.BP046
In numerical example: 6 < 7
Two terms comparison is
x[1]+x[2]≦s+s ---eqn.BP051
Numerical example: 6+5 < 7+7
<a name="ch13c066">Index beginIndex this file
Three terms comparison is
x[1]+x[2]+x[3]≦s+s+s ---eqn.BP052
Numerical example: 6+5+4<7+7+7
For this numerical example,
three terms comparison
β seq.=[7,7,7,0,0,0] reach
total sum 21.
<a name="ch13c067">
The following comparison
α seq. ≦ β seq. ---eqn.BP053
is true until sum n terms,
where
∑[n]α seq. = ∑[n]β seq. ---eqn.BP054
Above symbolic equation plus
numerical example tell us that
<a name="ch13c068">
given condition
max(x1,x2,...,xn)
≦(x1+x2+...+xn)/k ---eqn.BP036
promise //why? ABCD
the majorization relation
(x1,x2,...,xn) (<)
(s,s,...,s,0,0,...0) ---eqn.BP038
where s is defined at eqn.BP037
Above checked two sequences.
<a name="ch13c069">
Below find one variable function
φ:(a,b)→Real ---eqn.BO104
eqn.13.21 less than side tell
us clearly, choose
φ(t)=1/(1+t) ---eqn.BP055
otherwise we can not build
eqn.13.21 less than side.
<a name="ch13c070">Index beginIndex this file
First see one variable function
φ(t)=1/(1+t) ---eqn.BP055
is convex or concave.
eqn.BP055 is f(t)=1/t parallel
shift to t=-1.
f(t)=1/t is convex function
shape like ╰╯.
<a name="ch13c071">
Then φ(t)=1/(1+t) is convex.
therefore
f(x1,x2,...,xn)
=∑[k=1,n]φ(xk) ---eqn.13.17
is Schur convex.
<a name="ch13c072">
We can also differentiate
φ(t)=1/(1+t)
twice to get positive result.
This is an easy work, leave it
to reader.
<a name="ch13c073">
In Exercise 13.2
f(x1,x2,...,xn)
=∑[j=1,n]1/(1+xj) ---eqn.BP056
The following work is to
straighten eqn.13.21 greater
than side.
<a name="ch13c074">Index beginIndex this file
for x_seq. : f(x1,x2,...,xn)
for s_seq. : f(s,s,...,s,0,0,...0)
x_seq. already settled
x_seq. eqn.BP056 is eqn.13.21
less than side.
<a name="ch13c075">
Next see (s,s,...,s,0,0,...0)
f(s,s,...,s,0,0,...0)
=1/(1-s)+...+1/(1-s) //k times
+1/(1-0)+...+1/(1-0) //n-k times
=k/(1-s) + (n-k)*1/(1-0)
=(n-k) + k/(1-s) ---eqn.BP057
eqn.BP057 is eqn.13.21 greater
than side.
<a name="ch13c076">
Substitute
s=(x1+x2+...+xn)/k ---eqn.BP037
into eqn.BP056
f(s,s,...,s,0,0,...0) ---eqn.BP058
=(n-k) + k/[1-(x1+x2+...+xn)/k]
=(n-k) + k*k/[k-(x1+x2+...+xn)]
<a name="ch13c077">
Final clean up.
Exercise 13.2 two sequences have
(x1+x2+...+xn) (<)
(s,s,...,s,0,0,...0) ---eqn.BP038
Our problem is Schur convex.
then
f(x1,x2,...,xn)
≦ ---eqn.BP059
f(s,s,...,s,0,0,...0)
<a name="ch13c078">
which gives
eqn.BP056 ≦ eqn.BP058
this result is same as target
eqn.13.21
Exercise 13.2 is done.
2010-07-10-20-25 stop
<a name="ch13c079">Index beginIndex this file
2010-07-10-22-02 start
■ Exercise 13.3 problem statement
textbook page 205
(A Refinement of the 1-Trick)
Given integers 0<m<n and real
numbers x1,x2,...,xn
such that
---page 205
---line 5
---eqn.13.23
width of above equation
<a name="ch13c083">
This refinement of the familiar
1-Trick lower bound was crucial
to the discovery and proof of
the Szemeredi Regularity Lemma,
which is one of the cornerstone
of modern combinatorial theory.
2010-07-10-22-24 stop
<a name="ch13c084">Index beginIndex this file
2010-07-10-22-26 start
■ Exercise 13.3 hint
textbook page 276
If one sets
yk=μ+δ/m for 1≦k≦m ---eqn.BP061
yk=μ-δ/(n-m) for m<k≦n ---eqn.BP062
where
μ=(x1+x2+...+xn)/n ---eqn.BP063
<a name="ch13c085">
then from the condition (13.22)
it follows easily that
y(<)x ---eqn.BP064
The map
f(x)=x12+x22+...+xn2 ---eqn.BP065
is Schur convex,
<a name="ch13c086">
so we have
f(y)<f(x) ---eqn.BP066 ●●change
and after expansion, this is
precisely the target inequality
(13.23) For the connection to
Szemeredi Regularity Lemma,
see Komlos and Simonovits (1996)
2010-07-10-22-36 stop
<a name="ch13c087">
2010-07-10-22-37 start
eqn.BP066 ●●change
because
sequence has n comparisons
sequence use '(<)'
Function f(y) compare with f(x)
just once, use '<'
textbook page 277 line 2 use
f(y) (<) f(x) ---eqn.BP067
2010-07-10-22-43 stop
<a name="ch13c088">Index beginIndex this file
■ Exercise 13.3 draft work
Prepare numerical example.
Please copy next code, paste to
[Box3, input JS command] of
calculator page, then click
[Test box3 command, output to box4]
Box 4 has output result.
//<a name="ch13c089">
//[[ program start
//2010-07-11-11-33
var af=[]
var bt=[]
af[0]=1.5
af[1]=2.6
af[2]=2.7
af[3]=1.7
af[4]=2.8
var n=af.length
var m=3
var afsum=afsu2=0
var afmax=afmxi=0
//<a name="ch13c090">
var i,j,k;
for(i=0;i<n;i++)afsum+=af[i];
for(i=0;i<m;i++)afsu2+=af[i];
for(i=0;i<n;i++)if(af[i]>afmax)
{afmax=af[i];afmxi=i}
var afavg=afsum/n
var afav2=afsu2/m
var dt=m*(afav2-afavg);
var bt0=afavg+dt/m
var bt1=afavg-dt/(n-m)
//<a name="ch13c091">
for(i=0;i<m;i++)bt[i]=bt0;
for(i=m;i<n;i++)bt[i]=bt1;
var btsum=0;
for(i=0;i<n;i++)btsum+=bt[i];
af //α majorize β !!
bt //NOT β majorize α !!
dt //δ must be positive
btsum //β sum
afsum //α sum
afavg //total n average
afav2 //leading m av.
//2010-07-11-11-50
//]] program stop
<a name="ch13c092">
2010-07-11-18-30
x seq.
1.5,2.6,2.7,1.7,2.8
y seq.
2.2666666666666666,2.2666666666666666,2.2666666666666666,2.250000000000001,2.250000000000001
<a name="ch13c093">Index beginIndex this file
x seq. sum to 11.3
2.8 2.7 2.6 1.7 1.5 ---eqn.BP068
y seq. sum to 11.3
34/15 34/15 34/15 9/4 9/4 ---eqn.BP069
<a name="ch13c094">
x seq. majorize y seq.
transformation matrix is
[[ ---eqn.BP070
0.5076923076923081 0.07322033898305072 0.013057124921531679 0.02219711236660385 0.3838331160365057
0 0.5262711864406782 0.09384808537350915 0.15954174513496552 0.2203389830508472
0 0 0.6296296296296296 0.3703703703703704 0
0 0.325 0.25 0.425 0
0.49230769230769194 0.07550847457627117 0.013465160075329565 0.022890772128060257 0.39582790091264713
]]
<a name="ch13c095">
2010-07-11-18-40
x seq.
1.5,2.6,2.7,1.7,2.8
y seq.
2.2666666666666666,2.2666666666666666,2.2666666666666666,2.250000000000001,2.250000000000001
δ=dt=
0.01999999999999913
<a name="ch13c096">
btsum
11.3
afsum
11.3
(x1+x2+x3+x4+x5)/5 afavg
2.26
(x1+x2+x3)/3 afav2
2.2666666666666666
<a name="ch13c097">
2010-07-11-18-51 start
Given relation eqn.13.22 is for
x sequence x1,x2,...,xn ---eqn.BP071
Problem did not require x seq.
be sorted. Maximum x element
can be anywhere. Then
How do I know majorization
comparison method?
<a name="ch13c098">Index beginIndex this file
LiuHH created a small program
and build a x sequence
x.seq = 1.5 2.6 2.7 1.7 2.8
for numerical study.
x.seq is random ordered.
Smallest number 1.5 is first one.
Greatest number 2.8 is last one.
<a name="ch13c099">
Total number n=5
Leading number m=3
Total average = 2.26000000000
(1.5+2.6+2.7+1.7+2.8)/5 ---eqn.BP072
Leading average = 2.26666666666
(1.5+2.6+2.7)/3 ---eqn.BP073
Leading average > Total average
<a name="ch13c100">eqn.13.22 tell us how to find δ
Divide eqn.13.22 by m, result
is
Leading m term average ---eqn.BP074
= Total (five) n average + δ/m
<a name="ch13c101">
For the numerical example
Leading three term average ---eqn.BP075
= Total five term average + δ/m
that is
2.26666666666666
=2.26000000000000 + δ/3
δ=(2.26666666666666-2.26)*3
δ=0.02 ---eqn.BP076
<a name="ch13c102">
Now create y sequence
Follow eqn.BP061
Leading y = μ+δ/m for 1≦k≦m
Leading y ---eqn.BP077
= Total five term average + δ/m
= Leading three term average
<a name="ch13c103">Index beginIndex this file
Follow eqn.BP062
Later y = μ-δ/(n-m) ---eqn.BP078
for m<k≦n
<a name="ch13c104">
Check y seq. total sum.
(Later y)*(n-m)=(n-m)*μ-δ
(Leading y)*m = m *μ+δ
(Later y)*(n-m)+(Leading y)*m
= (n-m)*μ-δ + m*μ+δ
= n*μ = total x seq. sum ---eqn.BP079
y seq. sum = x seq. sum OK
2010-07-11-19-36 stop
<a name="ch13c105">
2010-07-11-21-29 start
■ Exercise 13.3 solution
Exercise 13.3 two sequences
majorization calculation is not
straight forward.
<a name="ch13c106">
Given sequence
x1,x2,...,xn no sort requirement.
Then maximum value element can
be any one. Given sequence has
total n elements. First m (<n)
elements average value is key
role.
<a name="ch13c107">
LiuHH created one seq.
x.seq = 1.5 2.6 2.7 1.7 2.8
Here n=5 total five elements.
Total average name as n_ave
n_ave=(1.5+2.6+2.7+1.7+2.8)/5 ---eqn.BP072
n_ave=2.26000000000
<a name="ch13c108">Index beginIndex this file
Set m=3, take first three for
leading term average, name as
m_ave
m_ave=(1.5+2.6+2.7)/3 ---eqn.BP073
m_ave=2.26666666666
<a name="ch13c109">
n_ave and m_ave are related
by eqn.13.22. Problem give
one word
δ≧0 ---eqn.BP060
without mention the relative
magnitude of n_ave and m_ave.
<a name="ch13c110">eqn.13.22 divide by m get
m_average=n_average+δ/m ---eqn.BP080
Problem given 0<m<n and require
δ≧0. δ/m is positive. then
m_average > n_average ---eqn.BP081
<a name="ch13c111">
Next puzzle is that I do not
know which x sequence element
has maximum value. How can I
start majorization calculation?
After create numerical example
and after think, found answer.
<a name="ch13c112">Maximum value x sequence element
is NOT important.
I create
x.seq = 1.5 2.6 2.7 1.7 2.8
hide the maximum at the end and
put minimum as first element.
I can still solve the problem.
<a name="ch13c113">Index beginIndex this file
The reason is that new sequence
y.seq. has only two values.
These two values are eqn.BP061
and eqn.BP062. Define
y_big=eqn.BP061 value
y_sml=eqn.BP062 value.
<a name="ch13c114">
For this numerical example
y_big=(1.5+2.6+2.7)/3
y_big=2.2666666666666666
y_sml=2.25
y_big repeat m (3) times.
y_sml repeat n-m (5-3) times.
<a name="ch13c115">
y.seq=
2.2666666666666666
2.2666666666666666
2.2666666666666666
2.25
2.25
<a name="ch13c116">
x.seq = 1.5 2.6 2.7 1.7 2.8
y.seq = 34/15 34/15 34/15 9/4 9/4
<a name="ch13c117">
For majorization
one element comparison
y[1] compare with x[1]
y[1] is 34/15
x[1] still NOT known.
I can not use 1.5, it is smallest.
I can not use 2.8, it is the last.
<a name="ch13c118">Index beginIndex this fileThe tricky part is right here.
Instead of compare
x.seq = 1.5 2.6 2.7 1.7 2.8
y.seq = 34/15 34/15 34/15 9/4 9/4
We compare
x.seq = 1.5 2.6 2.7 forget
y.seq = 34/15 34/15 34/15 forget
<a name="ch13c119">
x.seq is not in descending order?
It does not matter here. Because
34/15 is average of first m terms
1.5 2.6 2.7
<a name="ch13c120">
Please see
# (α_bar,...,α_bar)(<)(α1,α2,...,αn)
three average 34/15 34/15 34/15
is majorized by 1.5 2.6 2.7
For first m elements
y.seq (<) x.seq ---eqn.BP082
<a name="ch13c121">For the rest n-m elements,
y.seq has second average value
Now we compare
x.seq = forget 1.7 2.8
y.seq = forget 9/4 9/4
From
# (α_bar,...,α_bar)(<)(α1,α2,...,αn)
again for remaining n-m elements
y.seq (<) x.seq ---eqn.BP083
<a name="ch13c122">
Finally, it follows easily that
overall n elements comparison
y.seq (<) x.seq ---eqn.BP084
eqn.BP084 is same as eqn.BP064
2010-07-11-22-23 here
<a name="ch13c123">Index beginIndex this file
Next find one variable function
φ(t), find n variable function
f(x1,x2,...,xn)
=∑[k=1,n]φ(xk) ---eqn.13.17
<a name="ch13c124">eqn.13.23 greater than (left)
side IS n variable function
eqn.13.23 say that one variable
function
φ(t)=t*t ---eqn.BP085
<a name="ch13c125">
t*t is convex ╰╯, and eqn.BP084
give us y.seq (<) x.seq. Then
Problem 13.4 Schur's Majorization
Inequality promise we have
f(y.seq) ≦ f(x.seq) ---eqn.BP086
2010-07-11-22-37 stop
<a name="ch13c126">
2010-07-12-08-10 start
For comparison purpose. Put two
equations side by side
y.seq (<) x.seq ---eqn.BP084
f(y.seq) ≦ f(x.seq) ---eqn.BP086
<a name="ch13c127">
Argument value x.seq, y.seq) and
function value f(x.seq),f(y.seq)
have SAME inequal direction, this is
Schur CONVEX. On the other hand,
Schur CONCAVE has opposite direction.
y.seq (<) x.seq ---eqn.BP084
f(y.seq) ≧ f(x.seq) ---eqn.BP087
<a name="ch13c128">Index beginIndex this file
Exercise 13.3 use eqn.BP084 and
eqn.BP086 (not eqn.BP087).
We have
f(x1,x2,...,xn)
=∑[k=1,n]φ(xk) ---eqn.13.17
Substitute eqn.BP085 for φ(t)
f(x1,x2,...,xn)
=∑[k=1,n]xk*xk ---eqn.BP088
<a name="ch13c129">
Similarly
f(y1,y2,...,yn)
=∑[k=1,n]yk*yk ---eqn.BP089
y sequence is defined as
yk=μ+δ/m for 1≦k≦m ---eqn.BP061
yk=μ-δ/(n-m) for m<k≦n ---eqn.BP062
<a name="ch13c130">
then
f(y1,y2,...,yn)
=m*(μ+δ/m)*(μ+δ/m) ---eqn.BP090
+(n-m)*[μ-δ/(n-m)]*[μ-δ/(n-m)]
"m*" because repeat m times.
"(n-m)*" for repeat (n-m) times.
<a name="ch13c131">
We need pay attention to
μ=(x1+x2+...+xn)/n ---eqn.BP063
because nμ=x1+x2+...+xn ---eqn.BP091
show up at final answer.
<a name="ch13c132">
Continue expand eqn.BP090
f(y1,y2,...,yn)
=m*(μ+δ/m)*(μ+δ/m) ---eqn.BP090
+(n-m)*[μ-δ/(n-m)]*[μ-δ/(n-m)]
=m*(μμ+2μδ/m+δδ/m/m)
+(n-m)*[μμ-2μδ/(n-m)
+δδ/(n-m)/(n-m)]
<a name="ch13c133">Index beginIndex this file
=(n-m+m)*μμ
+m*(2μδ/m)
+δδ/m
+(n-m)*[-2μδ/(n-m)]
+δδ/(n-m)
=n*μμ
+2μδ
+δδ/m
-2μδ
+δδ/(n-m)
<a name="ch13c134">
=nμ*nμ/n //insert 1=n/n
+δδ[1/m + 1/(n-m)]
=nμ*nμ/n
+δδ(m+n-m)/m/(n-m)
<a name="ch13c135">
Finally
f(y1,y2,...,yn)
=nμ*nμ/n ---eqn.BP092
+δδn/m/(n-m)
eqn.BP092 is target eqn.13.23
less than (right) side exactly.
<a name="ch13c136">
eqn.BP086 give us the required
inequality.
eqn.BP088 give us the greater
than side.
Exercise 13.3 is done.
2010-07-12-08-43 stop
<a name="ch13c137">
2010-07-12-11-37 start
About [L] button in jsmajor2.htm
L=Leap.
To understand a problem, LiuHH
need numerical example to get
some idea. Exercise 13.3 is not
easy to solve if just look for
greatest element comparison.
<a name="ch13c138">Index beginIndex this file
LiuHH created one example
x seq. sum to 11.3
1.5 2.6 2.7 1.7 2.8
34/15 34/15 34/15 9/4 9/4
y seq. sum to 11.3
<a name="ch13c139">
create x seq. majorize y seq.
transformation matrix eqn.BP070
Decimal number is too long,
change to quotient number.
First row is
0.5076923076923081 0.07322033898305072 0.013057124921531679 0.02219711236660385 0.3838331160365057
<a name="ch13c140">
2010-07-11-12-12 (working time)
33/65 108/1475 104/7965 0.02219711236660385 1472/3835
2010-07-11-12-32 found
0.02219711236660385 = 884/39825
Then use program jsmajor2.htm
to convert decimal to quotient.
my HP pavilion a255c computer
run 3 minutes from 1 to 39825
<a name="ch13c141">
Immediately decide change
from one n2=[ box ]
to two b:[begin] e:[end] box
After change, fill
b:[39000] e:[40000]
Click [F] to fill number to
box M5
Then click [x*M5] to get result
at box M6. After this change,
computer run just one second.
<a name="ch13c142">
I know the number is 884/39825
What if I do not know ? I need
start from 1 to 3000, next
3001 to 6000 continue 14 times
fill number to b:[ ] e:[ ]
14 times !? I am lazy, I do not
want to fill numbers that many
times. Next change program,
add [L] button for Leap.
<a name="ch13c143">Index beginIndex this file
If box b:[ ] e:[ ] have number
b:[ 1 ] e:[ 3000 ] Click [L]
program generate 3001 to 6000
to box M5 and update b:[ ] e:[ ]
<a name="ch13c144">
If box b:[ ] e:[ ] have number
b:[ 9000 ] e:[ 7001 ] Click [L]
program generate 7000 to 5001
to box M5 and update b:[ ] e:[ ]
It reduce number too. Stop work
at negative number in b:[ ] e:[ ]
<a name="ch13c145">
This function is an improvement
for LiuHH. Hope you like this
change.
2010-07-12-12-10 stop
<a name="ch13c146">x seq. y seq. have next
majorization matrix [y]=[M][x]
[[ ---eqn.BP070
33/65 108/1475 104/7965 884/39825 1472/3835
0 621/1180 299/3186 5083/31860 13/59
0 0 17/27 10/27 0
0 13/40 1/4 17/40 0
32/65 891/11800 143/10620 2431/106200 1518/3835
]]
<a name="ch13c147">
2431/106200 =
(11*13*17)/(2*2*2*3*3*5*5*59)
2010-07-12-12-31 done change
from decimal to quotient
2010-07-12-13-38 verified
above matrix
all column sum=1
all row sum=1
<a name="ch13c148">Index beginIndex this file
2010-07-12-14-25 start
■ Exercise 13.4 problem statement
textbook page 205
(Symmetric Polynomials and
Schur Concavity)
<a name="ch13c149">
After observing that the kth
elementary symmetric function
ek(x)=ek(x1,x2,...,xn)=
= ∑[1≦i1<i2<...<ik≦n]
xi1*xi2*...*xik ---eqn.BP093
//k=2: e2(x) expansion eqn.BJ103<a name="ch13c150">
satisfy the elegant
"cancellation identity"
∂ek(x)/∂xs= ---eqn.13.24
ek-1(x1,x2,...,xs-1,xs+1,...,xn)
Show that ek(x) is Schur concave
for x∈[0,∞)n. //x∈[0,∞)n is
//critical for this problem
//Similar requirement
2010-07-12-14-49 stop
<a name="ch13c151">
2010-07-12-15-25 start
■ Exercise 13.4 hint
textbook page 277
Two applications of cancellation
identity (13.24) permit one to
<a name="ch13c152">
reduce Schur's differential
(13.4) to
-(xs-xt)2* ---eqn.BP094
ek-1(x1,x2,...,xs-1,xs+1,...,xn)
and this polynomial is obvious
nonpositive for x∈[0,∞)n.
2010-07-12-15-31 stop
<a name="ch13c153">Index beginIndex this file
2010-07-12-15-46 start
■ Exercise 13.4 discussion
To understand the meaning of
eqn.BP093 and eqn.13.24
Let x=[a,b,c,d] ---eqn.BP095
x has n=4 elements.
<a name="ch13c154">
Attention: 'e' is elementary
symmetric function name.
'e' is NOT one of x.
Consider k=3, we have
e3(x)=e3(a,b,c,d)=
=bcd+cad+dab+abc ---eqn.BP096
eqn.BP096 is eqn.BP093 at
n=4 and k=3.
<a name="ch13c155">
Assume
xs=b ---eqn.BP097
then eqn.13.24 become
∂ek(x)/∂xs ---eqn.BP098
=∂e3(a,b,c,d)/∂b
=∂[bcd+cad+dab+abc]/∂b
= 1*cd+ 0 +da*1+ac*1
e2,b(a,b,c,d) ---eqn.BP099
= cd +da +ac
<a name="ch13c156">
2010-07-12-16-03 here
Above is first applications of
cancellation identity eqn.13.24
"Two applications of cancel-
lation identity" do NOT mean
take second derivative.
It is take first derivative
of a different variable. See
eqn.13.4, it has two first
derivatives.
<a name="ch13c157">
Next, assume
xt=c ---eqn.BP100
then get
∂ek(x)/∂xt ---eqn.BP101
=∂e3(a,b,c,d)/∂c
=∂[bcd+cad+dab+abc]/∂c
= 1*bd+1*ad+0 +ab*1
e2,c(a,b,c,d) ---eqn.BP102
= bd +ad +ab
<a name="ch13c158">Index beginIndex this file
Schur's Criterion eqn.13.4
for the specific case is
(b-c)*[(cd +da +ac)
-(bd +ad +ab)]
=(b-c)*[(cd-bd)+(ac-ab)]
=(b-c)*[-d(b-c)-a(b-c)]
=(b-c)*(b-c)*[-d-a] ---eqn.BP103
<a name="ch13c159">Problem givenx∈[0,∞)n
all a,b,c,d are nonnegative.
(b-c)*(b-c) is nonnegative,
[-d-a] is nonpositive. Over
all eqn.BP103 is nonpositive.
That is Schur's Criterion≦0
Then we have Schur concave.
2010-07-12-16-32 here
<a name="ch13c160">
Two points worth your attention
First attention.
If x=[a,b,c,d] is observed
length data. e3(x) has three
variables multiplied. e3(x)
has dimension of length to
third power. Similarly e2(x)
has dimension of length to
second power.
<a name="ch13c161">
In general ek(x)
has dimension of length to
k-th power. Next equation
∂ek(x)/∂xt=ek-1(x) ---eqn.BP104
is consistent in physics
dimension.
<a name="ch13c162">Index beginIndex this file
Second attention.
The result of eqn.BP098 is NOT
complete for a symmetric ek-1(x)
If add
e2,b(a,b,c,d) ---eqn.BP099
= cd +da +ac
and
e2,c(a,b,c,d) ---eqn.BP102
= bd +ad +ab
still NOT complete.
<a name="ch13c163">
For complete e2(a,b,c,d) must
sum over all variables
e2(a,b,c,d)= ---eqn.BP105
∂e3(x)/∂a+∂e3(x)/∂b
+∂e3(x)/∂c+∂e3(x)/∂d
otherwise it is not symmetry.
<a name="ch13c164">
Schur's Criterion eqn.13.4
is not a symmetry equation
but it apply to any pair of
variables.
2010-07-12-17-02 stop
<a name="ch13c165">Index beginIndex this file
2010-07-12-18-32 start
■ Exercise 13.4 solution
Exercise 13.4 is very close to
From Schur concavity to AM-GM
(textbook page 194 lower half)
<a name="ch13c166">eqn.BN079 fit Exercise 13.4
perfectly.
The difference is that AM-GM
use highest power k=n,
ek(x)=en(x)=
f(x)=x1*x2*...*xn ---eqn.BN078<a name="ch13c167">
Here Exercise 13.4 use k-th
power for 1≦k≦n
ek(x)=ek(x1,x2,...,xn)=
= ∑[1≦i1<i2<...<ik≦n]
xi1*xi2*...*xik ---eqn.BP093<a name="ch13c168">
The conclusion at eqn.BN083
is Schur concave, same as
Exercise 13.4 . Compare
eqn.BP094 with eqn.BN083.
<a name="ch13c169">
Can you do Exercise 13.4 ?
you have two references,
First is Exercise 13.4 hint
Second is
From Schur concavity to AM-GM
Good luck.
2010-07-12-18-53 stop
<a name="ch13c170">Index beginIndex this file
2010-07-12-19-12 start
■ Exercise 13.5 problem statement
textbook page 205
(Schur Concavity and Measure
of Dispersion)
<a name="ch13c171">
Many methods have been proposed
to measure dispersion. Statisti-
cians, for example, often use
the sample variance
where x_bar=(x1+x2+...+xn)/n ---eqn.BP107
width of above equation
for x in Realn, n≧2.
<a name="ch13c173">
while information theorists
rely on the entropy
h(p)=-∑[k=1,n]pk*log(pk) ---eqn.BP108
to measure of dispersion of
the probability distribution
(p1,p2,...,pn) where
pk≧0 ---eqn.BP109
<a name="ch13c174">
and
p1+p2+...+pn=1 ---eqn.BP110
Show that both the sample
variance s(x) and entropy
h(p) are Schur concave.
2010-07-12-19-34 stop
<a name="ch13c175">Index beginIndex this file
2010-07-12-21-03 start
■ Exercise 13.5 hint
textbook page 277
Use Schur's criterion (13.4)
<a name="ch13c176">
and note
(xj-xk)*(sxj(x)-sxk(x))
=2(xj-xk)2/(n-1)≧0 ---eqn.BP111
and
(pj-pk)*(hpj(x)-hpk(x))
=(pj-pk)*[log(pj)-log(pk)]≧0 ---eqn.BP112
<a name="ch13c177">
where the subscripts connote
partial derivatives.
Incidentally, the second
formula verifies that h(p) is
Schur convex on all of (0,∞)n
not just the subset of (0,∞)n
where p has sum equal to one.
2010-07-12-21-14 stop
<a name="ch13c178">Index beginIndex this file
2010-07-12-22-24 start
■ Exercise 13.5 solution
<a name="ch13c179">
Schur's criterion (short as SC)
SC=(xj-xk)*[∂f(x)/∂xj //see (13.4)
-∂f(x)/∂xk] ---eqn.BP113
If SC≦0, f(x) is Schur concave
If SC≧0, f(x) is Schur convex
//more<a name="ch13c180">
Set sample variance s(x) to be
f(x). evaluate eqn.BP113
as following
(n-1)*∂f(x)/∂xj = ---eqn.BP114
∂{∑[i=1,n](xi-x_bar)2}/∂xj<a name="ch13c181">
eqn.BP114 is summation, those
xi i≠j all drop out.
Only xi i=j left. Continue
(n-1)*∂f(x)/∂xj ---eqn.BP115
=d[(xj-x_bar)2]/dxj
=2(xj-x_bar)*(dxj-0)/dxj
=2(xj-x_bar)
<a name="ch13c182">
eqn.BP106 has division of
(n-1), here move (n-1) to
other side of equation to
simplify calculation.
<a name="ch13c183">Index beginIndex this file
Similarly
(n-1)*∂f(x)/∂xk ---eqn.BP116
=2(xk-x_bar)
Beck to eqn.BP113
<a name="ch13c184">
Schur's criterion
SC=(xj-xk)*[∂f(x)/∂xj ---eqn.BP117
-∂f(x)/∂xk]
=(xj-xk)*[2(xj-x_bar)
-2(xk-x_bar)]/(n-1)
<a name="ch13c185">
SC=2(xj-xk)2/(n-1)≧0 ---eqn.BP118
The sample variance s(x) is
Schur Convex.
2010-07-12-23-00 here
<a name="ch13c186">
Next see the entropy h(p).
Set entropy h(p) to be f(p)
and evaluate eqn.BP113
∂f(p)/∂pj ---eqn.BP119
=∂{-∑[i=1,n]pi*log(pi)}/∂pj<a name="ch13c187">
eqn.BP119 is summation, those
pi i≠j all drop out.
Only pi i=j left. Continue
∂f(p)/∂pj ---eqn.BP120
=d{-pj*log(pj)}/dpj
=log(pj)*d{-pj}/dpj
+(-pj)*d{log(pj)}/dpj<a name="ch13c188">Index beginIndex this file
∂f(p)/∂pj=-log(pj)-1 ---eqn.BP121
Similarly
∂f(p)/∂pk=-log(pk)-1 ---eqn.BP122
<a name="ch13c189">
SC=(pj-pk)*{∂f(p)/∂pj ---eqn.BP123
-∂f(p)/∂pk}
=(pj-pk)*{[-log(pj)-1]
-[-log(pk)-1]}
SC=(pj-pk)*{-[log(pj)-log(pk)]}
<a name="ch13c190">
Schur's criterion
SC=-(pj-pk)*[log(pj)-log(pk)] ---eqn.BP124
≦0
log() is monotone increase.
(pj-pk) and [log(pj)-log(pk)]
have same sign. With a negative
sign in front of equation, the
entropy h(p) is Schur concave.
2010-07-12-23-25 stop
<a name="ch13c191">
2010-07-13-07-22 start
LiuHH reviewed calculation several
times. But can not find Liu's
derivation where is wrong.
Textbook page 205 last line
last word is 'convex'. Require
reader to prove both sample
variance s(x) and the entropy
h(p) are Schur convex.
<a name="ch13c192">Index beginIndex this file
2009-02-01-15-22 LiuHH download
http://www-stat.wharton.upenn.edu/~steele/Publications/Books/CSMC/CSMC_Errata.pdf
CSMC_Errata.pdf change
from 'convex' to 'concave'.
Both sample variance s(x) and
the entropy h(p) are Schur
concave.
<a name="ch13c193">
Either way, sample variance s(x)
and entropy h(p) should be of
same type, because two names
are actually one thing.
<a name="ch13c194">
LiuHH's work get
the sample variance s(x) convex
and the entropy h(p) concave.
There must be some where wrong.
Before correction,
Exercise 13.5 is NOT DONE
2010-07-13-07-40 stop
<a name="ch13c195">Index beginIndex this file
2010-07-13-07-50 start
■ Exercise 13.6 problem statement
textbook page 206
(Another Inversion Preserving Form)
<a name="ch13c196">
If
pk≧0 ---eqn.BP125
and
p1+p2+...+pn=1 ---eqn.BP126
and
0<α ---eqn.BP127
Show that
<a name="ch13c198">
Incidentally, way back in
Exercise 1.6 we used Cauchy's
inequality to deal with the
case α=2. //●●change was 'α=1'
Remarkably often majorization
helps one to put a consequence
of Cauchy's inequality into a
broader context.
2010-07-13-08-08 stop
<a name="ch13c199">Index beginIndex this file
2010-07-13-08-30 start
■ Exercise 13.6 hint
textbook page 277
<a name="ch13c200">
Since //see here above/below
(1/n,1/n,...,1/n)(<)p ---eqn.BP128
//p=(p1,p2,...,pn) ---eqn.BP129
this is a special case of the
bound (13.18) for
φ(x)=(x+1/x)α ---eqn.BP130
<a name="ch13c201">
Since //●●Discuss φ''(x)
φ''(x)=α(x+1/x)α(x+x3)-2
*{(1+x2-x4)+α(1-x2)2} ---eqn.BP131
must be positive for 0≦x≦1
and α>0.
<a name="ch13c202">
The relevance of Schur convexity
to this problem was noted by
Marshall and Olkin (1979 p.72)
a proof using Lagrange multiplier
is given by Mitrinovic
(1970 p. 282)
2010-07-13-08-43 stop
<a name="ch13c203">
2010-07-13-11-18 start
■ Exercise 13.6 solution
<a name="ch13c204">Index beginIndex this file
Review:
Schur's Majorization Inequality
requirement are
a one variable function φ(t)
φ:(a,b)→Real ---eqn.BO104
be defined.
<a name="ch13c205">
a n variable function f(x)
f(x1,x2,...,xn)
=∑[k=1,n]φ(xk) ---eqn.13.17
be defined.
<a name="ch13c206">
n variable function must be a
sum of one variable function
n times with variable replaced
n times. Replaced variable
match a n elements sequence.
<a name="ch13c207">
Two such sequences α and β
constitute two different
version of n variable functions
f(α) and f(β).
If two sequences α and β have
majorization relation, then
two n variable functions have
specified inequality relation.
<a name="ch13c208">
If α(<)β
if n variable function f(x) is
Schur convex, then
f(α)≦f(β)
<a name="ch13c209">Index beginIndex this file
If α(<)β
if n variable function f(x) is
Schur concave, then
f(α)≧f(β)
"(<)" indicate compare n times
"<" indicate compare once.
<a name="ch13c210">
If α and β do not have
majorization relation, then
f(α) and f(β) magnitude
comparison is uncertain.
2010-07-13-11-36 here
<a name="ch13c211">
For Exercise 13.6, the key
point to watch is eqn.13.25
eqn.13.25 greater than (right)
side tell us that
n variable function f(x) is
f(x)= ---eqn.BP132
∑[k=1,n](pk +1/pk)α<a name="ch13c212">
eqn.BP132 tell us that
one variable function φ(t)
is
φ(t)=(t +1/t)α ---eqn.BP133
<a name="ch13c213">
Next we need two sequences.
eqn.BP132 f(x) tell us
that first sequence is
p=(p1,p2,...,pn) ---eqn.BP129
<a name="ch13c214">Index beginIndex this fileeqn.13.25 less than side
do not have a sequence !!
It is all constant. This
arrangement suggest that
second sequence is a constant
sequence.
<a name="ch13c215">
Because two sequence must have
same total sum. Then second
sequence is a constant of total
average of first sequence.
Define
s=(p1+p2+...+pn)/n
second sequence is
(s, s, ....., s) ---eqn.BP134
repeat n times.
<a name="ch13c216">
Given condition eqn.BP126
tell us that
s=1/n ---eqn.BP135
second sequence is
(1/n, 1/n, ....., 1/n) ---eqn.BP136
repeat n times.
2010-07-13-11-52 stop
<a name="ch13c217">
2010-07-13-14-52 start
p=(p1,p2,...,pn) ---eqn.BP129
majorize
(1/n, 1/n, ....., 1/n) ---eqn.BP136
is discussed in
# (α_bar,...,α_bar)(<)(α1,α2,...,αn)
<a name="ch13c218">
The main point is to check
one variable function φ(t)
convexity/concavity for
t in [0,1].
The condition t in [0,1] is
determined by eqn.BP125 and
eqn.BP126.
We are given 0<α ---eqn.BP127
which we need to use later.
<a name="ch13c219">Index beginIndex this file
One variable function φ(t)
φ(t)=(t +1/t)α ---eqn.BP133
is analytic expression, we
can differentiate φ(t) once
to get φ'(t) and twice to
get φ''(t). For t in [0,1]
positive φ''(t) → convex φ(t)
negative φ''(t) → concave φ(t)
<a name="ch13c220">
First find φ'(t)
φ'(t)=d[φ(t)]/dt ---eqn.BP137
=d[(t +1/t)α]/dt
=α*(t +1/t)α-1*d(t +1/t)/dt
=α*(t +1/t)α-1*(1 -1/t/t)
<a name="ch13c221">
2010-07-13-15-15 here
check first differentiation
[[
alfa=0.2
x=1.5
phi0=pow(x+1/x,alfa)
phi1=alfa*pow(x+1/x,alfa-1)*(1-1/x/x)
phi0
phi1
]]
<a name="ch13c222">complex2.htm#calculator output
phi0
1.1672353193296931
phi1
0.05985822150408684
<a name="ch13c223">integral.htm#differentiate input
f(x) ---eqn.BP138
pow(x+1/x,0.2)
d[f(x)]/dx ---eqn.BP139
0.2*pow(x+1/x,0.2-1)*(1-1/x/x)
x0
1.5
dx
0.00001
<a name="ch13c224">Index beginIndex this fileintegral.htm#differentiate output
Numerical Answer : 0.05985847935097154
Analytic Solution: 0.05985822150408684
first differentiation OK
2010-07-13-15-22 here
<a name="ch13c225">
Next find φ''(t)
φ''(t)=d[φ'(t)]/dt ---eqn.BP140
=d[α*(t +1/t)α-1*(1 -1/t/t)]/dt
=(1 -1/t/t)[α*(α-1)*(t +1/t)α-2]d(t +1/t)/dt
+α*(t +1/t)α-1*d(1 -1/t/t)/dt
=(1 -1/t/t)[α*(α-1)*(t +1/t)α-2]*(1 -1/t/t)
+α*(t +1/t)α-1*[0 -1*(-2)/t/t/t]
φ''(t)=d[φ'(t)]/dt ---eqn.BP141
=(1 -1/t/t)2[α*(α-1)*(t +1/t)α-2]
+α*(t +1/t)α-1*2/t/t/t
2010-07-13-15-29 here
<a name="ch13c226">
check second differentiation
[[
alfa=0.2
x=1.5
phi0=pow(x+1/x,alfa)
phi1=alfa*pow(x+1/x,alfa-1)*(1-1/x/x)
phi2=(1-1/x/x)*(1-1/x/x)*alfa*(alfa-1)*pow(x+1/x,alfa-2)+alfa*pow(x+1/x,alfa-1)*2/x/x/x
phi1
phi2
]]
<a name="ch13c227">
complex2.htm#calculator output
phi1
0.05985822150408684
phi2
0.051570160065059434
<a name="ch13c228">
integral.htm#differentiate input
f(x) //first derivative ---eqn.BP142
0.2*pow(x+1/x,0.2-1)*(1-1/x/x)
d[f(x)]/dx //second derivative ---eqn.BP143
(1-1/x/x)*(1-1/x/x)*0.2*(0.2-1)*pow(x+1/x,0.2-2)+0.2*pow(x+1/x,0.2-1)*2/x/x/x
x0
1.5
dx
0.00001
<a name="ch13c229">Index beginIndex this fileintegral.htm#differentiate output
Numerical Answer : 0.051569353461122784
Analytic Solution: 0.051570160065059434
second differentiation OK
2010-07-13-15-36 here
<a name="ch13c230">
2010-07-13-17-24
eqn.BP141 φ''(t) is not similar
to textbook φ''(t), following
change eqn.BP141 φ''(t) to
textbook φ''(t)
φ''(x)= ---eqn.BP144
=(1 -1/x/x)2[α*(α-1)*(x +1/x)α-2]
+α*(x +1/x)α-1*2/x/x/x
=α*(x +1/x)α*(1 -1/x/x)2[(α-1)*(x +1/x)-2] ---eqn.BP145
+α*(x +1/x)α*(x +1/x)-1*2/x/x/x
=α*(x +1/x)α* ---eqn.BP146
{(1 -1/x/x)2[(α-1)*(x +1/x)-2]
+(x +1/x)-1*2/x/x/x}
<a name="ch13c231">
=α*(x +1/x)α* ---eqn.BP147
{(1 -2/x2+1/x4)[(α-1)*((x2+1)/x)-2]
+((x2+1)/x)-1*2/x/x/x}
=α*(x +1/x)α* ---eqn.BP148
{(1 -2/x2+1/x4)[(α-1)*(x2/(x2+1)2)]
+(x/(x2+1))*2/x/x/x}
<a name="ch13c232">
φ''(x)= ---eqn.BP149
=α*(x +1/x)α*
{(1 -2/x2+1/x4)[(α-1)*(x2/(x2+1)2)]
+(x/(x2+1))*2/x/x/x}
2010-07-13-18-31 OK
α*pow(x +1/x,α)*((1 -2/x/x +1/x/x/x/x)*(α-1)*x*x/(x*x +1)/(x*x +1)+2/x/x/(x*x +1))
<a name="ch13c233">
φ''(x)= ---eqn.BP150
=α*(x +1/x)α*
{(1 -2/x2+1/x4)[(α-1)*(x2/(x2+1)2)]
+(2/(x2+1))/x/x}
<a name="ch13c234">Index beginIndex this file
=α*(x +1/x)α* ---eqn.BP151
{(1 -2/x2+1/x4)*(α)*(x2/(x2+1)2)
-(1 -2/x2+1/x4)* 1 *(x2/(x2+1)2)
+(2/(x2+1))/x/x}
2010-07-13-19-04 verified
OK up to here 9907131907
<a name="ch13c235">
φ''(x)= ---eqn.BP152
=α*(x +1/x)α*
{(1 -2/x2+1/x4)*(α)*(x2/(x2+1)2)-(1 -2/x2+1/x4)* 1 *(x2/(x2+1)2)
+(2/(x2+1))/x/x}
Above is LiuHH calculated φ''(x)
attempt to match textbook φ''(x).
Below try to see φ''(x)≧0.
<a name="ch13c236">
α coef. //above red line
(α)* ---eqn.BP153
(1 -2/x2+1/x4)*(x2/(x2+1)2)
= (α)*(1-1/x/x)2*(x2/(x2+1)2)
α coef. ≧ 0
2010-07-13-19-13
<a name="ch13c237">
non-α coef. = ---eqn.BP154 //above blue line
(x*x+1)-1 * //common positive term
[2/x/x -(1-1/x/x)*(1-1/x/x)*x*x/(x*x+1)]
0<x<1
The trouble here is above line
has term A minus term B.
A-B<=>0 all possible.
<a name="ch13c238">
2010-07-13-19-30 in
2/x/x -(1-1/x/x)*(1-1/x/x)*x*x/(x*x+1) ---eqn.BP155
2/x/x
is positive
-(1-1/x/x)*(1-1/x/x)*x*x/(x*x+1)
is negative
their ratio is
<a name="ch13c239">Index beginIndex this file
negative : positive = ---eqn.BP156
(1-1/x/x)*(1-1/x/x)*x*x/(x*x+1)
:
2/x/x
<a name="ch13c240">
negative : positive =
((1-1/x/x)*(1-1/x/x)*x*x/(x*x+1))*x*x/2
= ---eqn.BP157
(x*x-1)*(x*x-1)/(x*x+1)/2
2010-07-13-19-40
<a name="ch13c241">
2010-07-13-20-01
Because 0<x<1, the ratio
(x*x-1)*(x*x-1)/(x*x+1)/2 ---eqn.BP158
has maximum value 0.5 at x=0
has minimum value 0.0 at x=1
Ratio range in [0, 0.5]
positive term ≧ negative term
non-α coef. ≧ 0
therefore φ''(x) is nonnegative
in 0<x<1 for 0<α<∞
<a name="ch13c242">
Conclude
one variable function φ(t)
φ(t)=(t +1/t)α ---eqn.BP133
is convex in 0<x<1 for 0<α<∞
<a name="ch13c243">Index beginIndex this file
n variable function f(x) is
f(x)= ---eqn.BP132
∑[k=1,n](pk +1/pk)α
is Schur convex.
<a name="ch13c244">p=(p1,p2,...,pn) ---eqn.BP129
majorize
(1/n, 1/n, ....., 1/n) ---eqn.BP136
then
f(p) is greater than f(s)
s is defined in eqn.BP134
and eqn.BP136
2010-07-13-20-14 here
<a name="ch13c245">
f(p) is eqn.13.25 greater than
side summation.
Next find f(s)
Refer to eqn.13.25
f(s) is eqn.13.25 less than side
summation, but change pk to 1/n
1/n come from s sequence and
eqn.BP135<a name="ch13c246">
f(s)=
k=n
∑
k=1
(
1/n
+
1
1/n
)
α
=
n*
(
n2+1
n
)
α
=
(n2+1)α
nα-1
---page 206
---line 3
---eqn.BP159
width of above equation
Exercise 13.6 is done
2010-07-13-20-27 stop
<a name="ch13c247">Index beginIndex this file
2010-07-13-22-30 start
●●Discuss φ''(t)
Exercise 13.6 start from
one variable function φ(t)
φ(t)=(t +1/t)α ---eqn.BP133<a name="ch13c248">
Because second derivative φ''(t)
determine φ(t) is convex or
concave. We need find φ'(t)
and φ''(t).
<a name="ch13c249">
First find φ'(t)
φ'(t)=d[φ(t)]/dt ---eqn.BP137
=α*(t +1/t)α-1*(1 -1/t/t)
<a name="ch13c250">
Second find
φ''(x)= ---eqn.BP152
=α*(x +1/x)α*
{(1 -2/x2+1/x4)*(α)*(x2/(x2+1)2)-(1 -2/x2+1/x4)* 1 *(x2/(x2+1)2)
+(2/(x2+1))/x/x}
<a name="ch13c251">
To verify differentiation, use localhttp://freeman2.com/integral.htm
First verify φ(t) to φ'(t)
in integral.htm fill in the
following //α=0.2 , x=1.5
<a name="ch13c252">Index beginIndex this file
[[
f(x)
pow(x+1/x,0.2)
d[f(x)]/dx
0.2*pow(x+1/x,0.2-1)*(1-1/x/x)
x0
1.5
dx
0.00001
]]
Click [Differentiate] button
<a name="ch13c253">
Output
Numerical Answer : 0.05985847935097154
Analytic Solution: 0.05985822150408684
Two numbers are the same
first differentiation OK
<a name="ch13c254">
Second verify φ'(t) to φ''(t)
in integral.htm fill in the
following //α=0.2 , x=1.5
[[
<a name="ch13c255">
f(x)
0.2*pow(x+1/x,0.2-1)*(1-1/x/x)
d[f(x)]/dx
(1-1/x/x)*(1-1/x/x)*0.2*(0.2-1)*pow(x+1/x,0.2-2)+0.2*pow(x+1/x,0.2-1)*2/x/x/x
x0
1.5
dx
0.00001
]]
<a name="ch13c256">
Output
Numerical Answer : 0.051569353461122784
Analytic Solution: 0.051570160065059434
second differentiation OK
<a name="ch13c257">Index beginIndex this file
Textbook eqn.BP131 has
φ''(x)=α(x+1/x)α(x+x3)-2
*{(1+x2-x4)+α(1-x2)2} ---eqn.BP131
=α*pow(x+1/x,α)*pow(x+x*x*x,-2)
*((1+x*x-x*x*x*x)+α*(1-x*x)*(1-x*x))
= //α=0.2
0.2*pow(x+1/x,0.2)*pow(x+x*x*x,-2)*((1+x*x-x*x*x*x)+0.2*(1-x*x)*(1-x*x))
<a name="ch13c258">
in integral.htm fill in the
following //α=0.2 , x=1.5
[[
f(x)
0.2*pow(x+1/x,0.2-1)*(1-1/x/x)
d[f(x)]/dx
0.2*pow(x+1/x,0.2)*pow(x+x*x*x,-2)*((1+x*x-x*x*x*x)+0.2*(1-x*x)*(1-x*x))
x0
1.5
dx
0.00001
]]
<a name="ch13c259">
integral.htm#differentiate output
Numerical Answer : 0.051569353461122784
Analytic Solution: -0.014734331447159836
second differentiation different
from textbook equation !!
2010-07-13-15-48 work here
2010-07-13-23-03 doc. here
<a name="ch13c260">
Main point is φ''(x) equation
study notes eqn.BP152 and
textbook eqn.BP131
are different.
Equation short section evaluation
Red line in eqn.BP152 evaluate
same value as eqn.BP131
Blue line in eqn.BP152 evaluate
different value from eqn.BP131
<a name="ch13c261">
LiuHH's φ''(x) equation eqn.BP152
could be wrong.
2010-07-13-23-08 stop
<a name="ch13c262">Index beginIndex this file
2010-07-14-09-45 start
■ Exercise 13.7 problem statement
textbook page 206
(A Birthday Problem)
<a name="ch13c263">
Given n random people, what is
the probability that two or more
of them have the same birthday?
Under the natural (but approximate!)
model where the birthdays are viewed
as an independent and uniformly
distributed in the set {1,2,...,365}
<a name="ch13c264">
show that this probability is at
least 1/2 if n≧23. For the more
novel bit, show this probability
does not go down if one drops the
assumption that the birthdays are
uniformly distributed.
2010-07-14-09-52 stop
<a name="ch13c265">Index beginIndex this file
2010-07-14-09-57 start
■ Exercise 13.7 hint
textbook page 277
<a name="ch13c266">
In the uniform case the probability
is
1-(1-1/365)*(1-2/365)*...*(1-22/365)
≈ 0.5079... ---eqn.BP160
<a name="ch13c267">
In the general case the probability
is
1-en(p1,p2,...,p365) ---eqn.BP161
where en(p) is the n th
symmetric polynomial and pk is
the probability that a randomly
chosen person is born on day k.
<a name="ch13c268">
By Exercise 13.4 the polynomial
en(p) is Schur concave, and this
is even more than one needs.
The connection between
majorization and the birthday
problem has been made in
Clevenson and Watkins (1991) and
Proschan and Joag-Dev (1992);
McConnell (2001) gives a
treatment for nonuniform
probabilities without explicit
recourse to majorization.
2010-07-14-10-08 stop
//<a name="ch13c269">
//9907141009
var i,j,k,n,s,p;
n=365;
s=1-1/n;
j=2;
k=22
for(i=j;i<=k;i++)
s=s*(1-i/n)
1-s
//9907141015
<a name="ch13c270">
answer
1-s
0.5072972343239854
2010-07-14-10-16
<a name="ch13c271">Index beginIndex this file
2010-07-14-10-47
■ Exercise 13.7 solution
What is the meaning of eqn.BP160?
eqn.BP160 say:
Probability have at least one
match = total probability 1
minus prob. no match at all.
<a name="ch13c272">
Assume there are n=23 person.
Assume take person AA birthday
as reference. Other 22 person
have possibly one birthday or
two birthdays or ... up to
other 22 person have
22 different birthdays.
<a name="ch13c273">
case 1: (1-1/365) in eqn.BP160 say
other 22 person have one birthday
(they all have same birthday). The
probability of this one day match
AA birthday is 1/365. Probability
of no match AA birthday: (1-1/365)
<a name="ch13c274">
case 2: (1-2/365) in eqn.BP160 say
other 22 person have two birthdays
Probability of this two day match
AA birthday is 2/365. Probability
of no match AA birthday: (1-2/365)
<a name="ch13c275">Index beginIndex this file
The probability of birthdays ALL
different from reference AA is
case 1 * case 2 *...* case 22
that is
(1-1/365)*(1-2/365)*...*(1-22/365)
this is the value of probability
not match AA birthday at all.
<a name="ch13c276">
Total probability 1 subtract
no match probability
1-(1-1/365)*(1-2/365)*...*(1-22/365)
is the probability at least one
case match AA.
Above is the explanation for
eqn.BP160
2010-07-14-11-04 stop
<a name="ch13c277">
2010-07-14-11-28 start
eqn.BP160 use equal probability
for each day. It is possible to
assign non-equal probability for
365 days. eqn.BP161 works for
non-equal probability. Because
<a name="ch13c278">
en(p1,p2,...,p365) in
1-en(p1,p2,...,p365) ---eqn.BP161
is symmetric summation, it takes
all non-equal probability days
into calculation.
<a name="ch13c279">
en(p1,p2,...,p365) is no match
probability.
1-en(p1,p2,...,p365) is match
probability.
2010-07-14-11-35 stop
<a name="ch13c280">Index beginIndex this file
2010-07-15-19-40 start
Exercise 13.4 prove that
ek(x) is Schur concave
for x∈[0,∞)n.
Exercise 13.7 use next equation
1-en(p1,p2,...,p365) ---eqn.BP161
+en() is Schur concave
-en() is Schur convex<a name="ch13c281">
eqn.BP161 is Schur convex
equation. For Schur convex
sequence inequality direction
function inequality direction
are same direction.
<a name="ch13c282">
The discussion at
# (α_bar,...,α_bar)(<)(α1,α2,...,αn)
tell us that
uniform birthday probability
is majorized by
non-uniform birthday probability
<a name="ch13c283">
For Schur convex eqn.BP161,
it is more diverse than the
uniform case. That is
the probability two or more
of them have the same birthday
as reference AA does not go down.
2010-07-15-19-56 stop
<a name="ch13c284">Index beginIndex this file
2010-07-14-12-22 start
■ Exercise 13.8 problem statement
textbook page 206
(SDRs and the Marriage Problem)
<a name="ch13c285">
If S1,S2,...,Sn is a collection
of subsets of the set S, we may
say that the set
R={x1,x2,...,xn}⊂S ---eqn.BP162
is a system of distinct
representatives (or an SDR)
provided that the elements of
R are all distinct and
xk∈Sk ---eqn.BP163
for each 1≦k≦n.
<a name="ch13c286">
Prove that a necessary and
sufficient condition for the
existence of an SDR is that
one has the inequality
---page 206
---line 21
---eqn.13.26
width of above equation
where |C| is used as shorthand
for the cardinality of a set C.
<a name="ch13c288">
The quaint term "marriage problem"
comes from a 1949 article by
Hermann Weyl who essentially put
the issue as follows:
<a name="ch13c289">
given a set of girls and boys,
it is possible for each girl to
marry a boy she knows if and only
if each subset of k girls knows
at least k boys.
<a name="ch13c290">
The marriage lemma is one of the
most widely applied results in
all of combinatorial theory, and
it has many applications to the
theory of inequalities. In
particular, it is of great help
with the final exercise which
develops Birkhoff's theorem.
2010-07-14-12-46 stop
<a name="ch13c291">Index beginIndex this file
2010-07-14-13-56 start
■ Exercise 13.8 hint
textbook page 277
<a name="ch13c292">
The necessity of the condition
is immediate, so we just need
to prove sufficiency.
In Weyl's terms, girl j knows
precisely the boys in the set
Sj, so for a given set A of
girls, every boy in the set
∪j∈ASj ---eqn.BP164
will be known by some girl in
A. We now consider two cases.
<a name="ch13c293">
In case I, we assume that the
inequality (13.26) is strict
for all A with |A|<n. Girl n
then marries any boy b she
knows. Since the condition
(13.26) continue to hold for
all A⊂{1,2,...,n-1} when each
Sj 1≦j≦n-1 is replaced by
Sj\{b}, the remaining girls
can be married by induction
to the remaining boys.
<a name="ch13c294">
In case II, we assume that
equality holds in the bound
(13.26) for some A0 with
|A0|<n. ---eqn.BP165
We then let
B=∪j∈A0Sj ---eqn.BP166
and set
S'j=Sj\B ---eqn.BP167
for all
j∈A0c ---eqn.BP168
<a name="ch13c295">Index beginIndex this file
The girls in A0 can be married
to the boys in B by induction,
and it remains to show that the
girls in A0c can be married to
the boys in Bc.
<a name="ch13c296">
We now take any A⊂A0c and note
that
|
∪
j∈A0∪A
Sj
|
≧
|A0∪A|=|A|+|A0|
---page 278
---line 14
---eqn.BP169
width of above equation
<a name="ch13c297">
We also have the identity
|
∪
j∈A0∪A
Sj
|
=
|
{
∪
j∈A0
Sj
}
∪
{
∪
j∈A
S'j
}
|
=
|A0|
+
|
∪
j∈A
S'j
|
---page 278
---line 16
---eqn.BP170
width of above equation
<a name="ch13c298">
Thus, we find for all A⊂A0c that we have
|
∪
j∈A
S'j
|
≧
|A|
---page 278
---line 18
---eqn.BP171
width of above equation
<a name="ch13c299">
2010-07-14-14-49 here
that is, every set of k girls
in A0c knows at least k boys
in Bc. By induction the girls
in A0c can be married to the
boys in Bc.
<a name="ch13c300">Index beginIndex this file
This proof is essentially the
one given by Halmos and Vaughan
(1950) The marriage lemma is a
cornerstone of the large and
active field of matching theory
which is beautifully surveyed
by Lovasz and Plummer (1986)
2010-07-14-14-54 stop
2010-07-15-16-02 done first proofread
<a name="ch13c301">
Exercise 13.8 and Exercise 13.9
need time to think. Now upload
what I have in hand first.
2010-07-15-16-03 stop
2010-07-15-20-08 done second proofread
2010-07-15-20-35 done spelling check
<a name="ch13c302">Index beginIndex this file
■ Exercise 13.8 problem statement
textbook page 206
(SDRs and the Marriage Problem)
■ Exercise 13.8 hint
textbook page 277
<a name="ch13c303">
2010-07-16-21-39 start
■ Exercise 13.8 solution
LiuHH still need more time to
study Exercise 13.8. At present
time Exercise 13.8 is NOT DONE
2010-07-16-21-40 stop
<a name="ch13c304">
2010-07-17-07-38 start
■ Exercise 13.9 problem statement
textbook page 207
(Birkhoff's Theorem)
<a name="ch13c305">
Given a permutation σ∈Sn, the
permutation matrix associated
with σ is the n×n matrix
Pσ=(Pσ(j,k):1≦j,k≦n) ---eqn.BP172
with matrices
Pσ(j,k)=1 if σ(j)=k ---eqn.BP173
Pσ(j,k)=0 otherwise ---two lines
<a name="ch13c306">
Show that if D is an n×n doubly
stochastic matrix, then there
exist nonnegative weights
{wσ: σ∈Sn} ---eqn.BP174
such that
∑[σ∈Sn]{wσ} = 1 ---eqn.BP175
and
∑[σ∈Sn]{wσPσ} = D ---eqn.13.27
<a name="ch13c307">
In other words, every doubly
stochastic matrix is an average
of permutation matrices.
2010-07-17-07-51 stop
<a name="ch13c308">
2010-07-17-07-56 start
■ Exercise 13.9 hint
textbook page 278
One can argue by induction on
the number of nonzero entries
of D, but it is perhaps more
convrete to look for an
algorithm to compute the
required convex combinations.
<a name="ch13c309">
Either way, the basic idea is
to use the marriage lemma to
make step-bystep progress.
<a name="ch13c310">
For each 1≦j≦n, we let Sj
denote the set of all k such
that djk>0, and we note that
for each A⊂{1,2,...,n} one
has
---page 279
---line 2
---eqn.BP176
width of above equation
<a name="ch13c312">
2010-07-17-08-23 here
By the marriage lemma, there is
a system of SDRs of
{S1,S2,...,Sn} ---eqn.BP177
so, we can define a permutation
σ by taking σ(j) to be the
representative from Sj for each
j=1,2,...,n.
<a name="ch13c313">
Now we let Pσ be
the permutation matrix
associated with σ and set
α=min(djσ(j))>0 ---eqn.BP178
If α=1 then D is a permutation
matrix, and there is nothing
left to prove. On the other hand,
<a name="ch13c314">
if α<1 consider the new matrix
D' defined by setting
D'=(1-α)-1(D-αPσ) ---eqn.BP179
We then have
D=αPσ+(1-α)D' ---eqn.BP180
and D' is a doubly stochastic
matrix with more zero entries
than D.
<a name="ch13c315">
The proof may now be
completed by applying the
induction hypothesis to D'.
Alternatively, one can compute
the required summands by
repeating the analogous steps
until the representatiin is
complete; at most n2 steps
will be needed.
2010-07-17-08-42 stop
<a name="ch13c316">
2010-07-17-09-02 start
■ Exercise 13.9 discussion
Both Exercise 13.8 and 13.9
are new to Liu,Hsinhan.
2010-07-14-19-14 in
google search for
proof of Birkhoff Theorem
About 221,000 results (0.43 seconds)
<a name="ch13c317">
Birkhoff's Theorem
The proof of Birkhoff's theorem
uses Hall's marriage theorem. ...
Proof of Birkhoff's theorem: We
proceed by induction on the number
of nonzero entries in ...
2010-07-14-19-23 LiuHH access
http://math.uncc.edu/~ghetyei/courses/old/F09.3116/birkhofft.pdf<a name="ch13c318">
Among all web pages I accessed
this math.uncc.edu page is most
close to Exercise 13.9. It has
numerical example show how to
decompose a doubly stochastic
matrix to interpolation of
permutation matrices.
<a name="ch13c319">
math.uncc.edu page say
[[
.....
Then, by Hall’s theorem, there
is a subset A of the vertices
in one part such that the set
R(A) of all vertices connected
to some vertex in A has strictly
less than |A| elements.
.....
<a name="ch13c320">
For example in the associated
graph above (1, 3), (2, 1), (3, 2)
is a perfect matching so we
underline x13, x21 and x32.
.....
]]
<a name="ch13c321">
"there is a subset A" ?
How to find this subset A ??
Why (1,3), (2,1), (3,2) is a
perfect matching?
<a name="ch13c322">Textbook say
[[
By the marriage lemma, there is
a system of SDRs of
{S1,S2,...,Sn} ---eqn.BP177
so, we can define a permutation
σ by taking σ(j) to be the
representative from Sj for each
j=1,2,...,n.
.....
]]
<a name="ch13c323">
"there is a system of SDRs"
How to find this SDR system??
"we can define a permutation"
How to define? what is the
guide line?
<a name="ch13c324">
LiuHH studied several times,
still not get the idea how to
create a permutation matrix.
Wait for futher reading and
further thinking.
<a name="ch13c325">
At current time Exercise 13.9
is NOT DONE
2010-07-17-09-15 stop
========= Chapter thirteen end here =========
<a name="Copyright">Index beginIndex this file
2009-06-19-10-48
If you are interested in inequality,
suggest you buy the book
The Cauchy-Schwarz Master Class
written by Prof. J. Michael Steele
The Cauchy-Schwarz Master Class is
this web page's textbook.
To buy textbook, that is to show thanks
to Prof. J.M. Steele's great work.
and it is also respect copyright law.
Thank you. Freeman 2009-06-19
The Cauchy-Schwarz Master Class
J. MichaelSteele★★★★★
ISBN 978-0-521-54677-5
2009-06-19-10-56