VOLUME 8 NUMBER 2 JULY TO DECEMBER 2004 ISSN: 1870-6525
Morfismos, Comunicaciones Estudiantiles, Departamento de Matemáticas, Cinvestav
Managing Editors • Isidoro Gitler • Jesús González
Editorial Board • Luis Carrera • Samuel Gitler • Onésimo Hernández-Lerma • Héctor Jasso Fuentes • Miguel Maldonado • Raúl Quiroga Barranco • Enrique Ramírez de Arellano • Enrique Reyes • Armando Sánchez • Martín Solís • Leticia Zárate
Associate Editors • Ricardo Berlanga • Emilio Lluis Puebla • Isaías López • Guillermo Pastor • Víctor Pérez Abreu • Carlos Prieto • Carlos Rentería • Luis Verde
Technical Secretaries • Roxana Martínez • Laura Valencia
Morfismos can be consulted electronically under “Revista Morfismos” at http://www.math.cinvestav.mx. For further information, call 50 61 38 71. All correspondence should be addressed to Mrs. Laura Valencia, Departamento de Matemáticas del Cinvestav, Apartado Postal 14-740, México, D.F. 07000, or sent by e-mail to laura@math.cinvestav.mx.
Information for Authors
The Editorial Board of Morfismos, Comunicaciones Estudiantiles del Departamento de Matemáticas del CINVESTAV, invites undergraduate and graduate students to submit articles for publication in this journal under the following guidelines:
• All articles will be sent to specialists for refereeing. Nevertheless, articles will be regarded only as preliminary versions and may therefore be published in other specialized journals.
• The author's name must be accompanied by his or her academic level and the institution where he or she studies or works.
• The article must begin with an abstract that states briefly and concisely the main result to be communicated.
• It is recommended that submitted articles be written in LaTeX and sent electronically. Interested authors can obtain the LaTeX2e format used by Morfismos under “Revista Morfismos” at http://www.math.cinvestav.mx, or directly from the Departamento de Matemáticas del CINVESTAV. Using this format will help speed up publication of the article.
• If the article contains illustrations or figures, they must be presented in a form that meets the reproduction quality of Morfismos.
• Authors will receive a total of 15 offprints of each published article.
• Articles should be addressed to Mrs. Laura Valencia, Departamento de Matemáticas del Cinvestav, Apartado Postal 14-740, México, D.F. 07000, or to the e-mail address laura@math.cinvestav.mx
Author Information
Morfismos, the student journal of the Mathematics Department of Cinvestav, invites undergraduate and graduate students to submit manuscripts to be published under the following guidelines:
• All manuscripts will be refereed by specialists. However, accepted papers will be considered to be “preliminary versions” in that authors may republish their papers in other journals, in the same or similar form.
• In addition to his/her affiliation, the author must state his/her academic status (student, professor, ...).
• Each manuscript should begin with an abstract summarizing the main results.
• Morfismos encourages electronically submitted manuscripts prepared in LaTeX. Authors may retrieve the LaTeX2e macros used for Morfismos through the web site http://www.math.cinvestav.mx, at “Revista Morfismos”, or by direct request to the Mathematics Department of Cinvestav. The use of these macros will help in the production process and also to minimize publishing costs.
• All illustrations must be of professional quality.
• 15 offprints of each article will be provided free of charge.
• Manuscripts submitted for publication should be sent to Mrs. Laura Valencia, Departamento de Matemáticas del Cinvestav, Apartado Postal 14-740, México, D.F. 07000, or to the e-mail address: laura@math.cinvestav.mx
Editorial Guidelines
“Morfismos” is the semiannual journal of the students of the Departamento de Matemáticas del CINVESTAV; one of its main objectives is for students to gain experience in writing up mathematical results. Publication of papers will not be restricted to students of CINVESTAV; we also wish to encourage the participation of students in Mexico and abroad, as well as invited contributions from researchers. Mathematical research reports and summaries of bachelor's, master's, or doctoral theses may be published in Morfismos. The articles that appear will be original, either in their results or in their methods. To judge this, the Editorial Board will appoint referees of recognized standing with experience in the clear communication of mathematical ideas and concepts. Although Morfismos is a refereed journal, papers will be regarded as preliminary versions that may later appear in other specialized journals. If you have any suggestion about the journal, let the editors know and we will gladly study the possibility of implementing it. We hope that this publication will foster, as a first experience, the development of a correct style of writing mathematics.
Editorial Guidelines “Morfismos” is the journal of the students of the Mathematics Department of CINVESTAV. One of its main objectives is for students to acquire experience in writing mathematics. Morfismos appears twice a year. Publication of papers is not restricted to students of CINVESTAV; we want to encourage students in Mexico and abroad to submit papers. Mathematics research reports or summaries of bachelor, master and Ph.D. theses will be considered for publication, as well as invited contributed papers by researchers. Papers submitted should be original, either in the results or in the methods. The Editors will assign as referees well–established mathematicians. Even though Morfismos is a refereed journal, the papers will be considered as preliminary versions which could later appear in other mathematical journals. If you have any suggestions about the journal, let the Editors know and we will gladly study the possibility of implementing them. We expect this journal to foster, as a preliminary experience, the development of a correct style of writing mathematics.
Contents
Homotopy triangulations of a manifold triple, Rolando Jiménez and Yuri V. Muranov, p. 1
On information measures and prior distributions: a synthesis, Francisco Venegas-Martínez, p. 27
On a problem of Steinhaus concerning binary sequences, Shalom Eliahou and Delphine Hachez, p. 51
Errata, p. 81
Morfismos, Vol. 8, No. 2, 2004, pp. 1–25
Homotopy triangulations of a manifold triple ∗
Rolando Jiménez and Yuri V. Muranov
Abstract The set of homotopy triangulations of a given manifold fits into a surgery exact sequence which is the main tool for the classification of manifolds. In the present paper we describe relations between homotopy triangulations of different manifolds for a given manifold triple and its connection to surgery theory. We introduce a group of obstructions to split a homotopy equivalence along a pair of submanifolds and study its properties. The main results are given by commutative diagrams of exact sequences.
2000 Mathematics Subject Classification: 57R67, 57Q10, 19J25, 19G24, 18F25. Keywords and phrases: surgery on manifolds, surgery and splitting obstruction groups, surgery exact sequence, the set of homotopy triangulations.
1 Introduction
Let $X^n$ be a closed $n$–dimensional $CAT$ ($CAT = TOP, PL, Diff$) manifold with fundamental group $\pi = \pi_1(X)$. A fundamental problem of geometric topology is to describe all closed $n$–dimensional $CAT$–manifolds which are homotopy (simple homotopy) equivalent to $X$. The main tool for such an investigation is the surgery exact sequence (see [23, 20])

\[
\cdots \to L_{n+1}(\pi) \to S_n(X) \to [X, G/CAT] \stackrel{\sigma}{\to} L_n(\pi) \to \cdots \tag{1.1}
\]

∗ Invited article. Partially supported by Russian Foundation for Fundamental Research Grant no. 02–01–00014, CONACyT, DGAPA–UNAM, Fulbright–García Robles and UW–Madison.
The elements of the set $[X, G/CAT]$ are called normal invariants [23, §10]. In the present paper we shall work in the category of topological manifolds ($CAT = TOP$), and consider the groups $L_*(\pi) = L^s_*(\pi)$ which give obstructions to simple homotopy equivalence. The elements of $S_n(X) = S^s_n(X)$ are called homotopy triangulations or $s$–triangulations of the manifold $X$ (see [23, §10], [20]). The structure set $S^s_n(X)$ is the set of $s$–cobordism classes of $CAT$–manifolds which are simple homotopy equivalent to $X^n$ (see [23, §10] and [20, p. 542]).

Let $Y \subset X$ be a submanifold of codimension $q$ in $X$. A simple homotopy equivalence $f : M \to X$ splits along the submanifold $Y$ if it is homotopic to a map $g$ transversal to $Y$ such that, for $N = g^{-1}(Y)$, the restrictions
\[
g|_N : N \to Y, \qquad g|_{M \setminus N} : M \setminus N \to X \setminus Y
\]
are simple homotopy equivalences. Let $\partial U$ be the boundary of a tubular neighborhood $U$ of the submanifold $Y$ in $X$. There exists a group $LS_{n-q}(F)$ of obstructions to splitting (see [23, 20]) which depends only on $n - q \bmod 4$ and a pushout square
\[
F = \left(\begin{array}{ccc}
\pi_1(\partial U) & \to & \pi_1(X \setminus Y) \\
\downarrow & & \downarrow \\
\pi_1(U) & \to & \pi_1(X)
\end{array}\right) \tag{1.2}
\]
of fundamental groups with orientations.

We consider the group $LP_{n-q}(F)$ of obstructions to surgery on pairs of manifolds $(X, Y)$ as defined in [23, 20]. This group likewise depends only on $n - q \bmod 4$ and the square $F$. In [20] Ranicki introduced a set $S_{n+1}(X, Y, \xi)$ of homotopy triangulations of a pair of manifolds $(X, Y)$, where $\xi$ denotes the normal bundle of $Y$ in $X$. This set consists of concordance classes of maps $f : (M, N) \to (X, Y)$ which are split along $Y$, and it fits into a commutative braid of exact sequences ([20, Proposition 7.2.6])
\[
\begin{array}{ccccccccc}
S_{n+1}(X,Y,\xi) & & \longrightarrow & & H_n(X, L_\bullet) & & \longrightarrow & & L_n(\pi_1(X)) \\
 & \searrow & & \nearrow & & \searrow & & \nearrow & \\
 & & S_{n+1}(X) & & & & LP_{n-q}(F) & & \\
 & \nearrow & & \searrow & & \nearrow & & \searrow & \\
L_{n+1}(\pi_1(X)) & & \longrightarrow & & LS_{n-q}(F) & & \longrightarrow & & S_n(X,Y,\xi)
\end{array} \tag{1.3}
\]
where $H_n(X, L_\bullet) \cong [X, G/TOP]$ and the spectrum $L_\bullet$ is a one–connected cover of the $\Omega$–spectrum $L(\mathbb{Z})$ with $(L_\bullet)_0 \simeq G/TOP$ [21] (see also [23, 20]).
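One way to read the braid (1.3), offered only as a sketch: it interlocks four long exact sequences, each of which alternates between the outer rows and the middle entries. The display below assumes the `amsmath` package and uses the same symbols as (1.3); the first line is the surgery exact sequence of $X$ in the topological category.

```latex
\begin{align*}
&\cdots \to L_{n+1}(\pi_1(X)) \to S_{n+1}(X) \to H_n(X, L_\bullet) \to L_n(\pi_1(X)) \to \cdots \\
&\cdots \to S_{n+1}(X,Y,\xi) \to S_{n+1}(X) \to LS_{n-q}(F) \to S_n(X,Y,\xi) \to \cdots \\
&\cdots \to S_{n+1}(X,Y,\xi) \to H_n(X, L_\bullet) \to LP_{n-q}(F) \to S_n(X,Y,\xi) \to \cdots \\
&\cdots \to L_{n+1}(\pi_1(X)) \to LS_{n-q}(F) \to LP_{n-q}(F) \to L_n(\pi_1(X)) \to \cdots
\end{align*}
```

The other braids in the paper can be unwound in the same way.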
Consider a triple $Z^{n-q-q'} \subset Y^{n-q} \subset X^n$ of closed topological manifolds. We shall suppose that every submanifold is locally flat in the ambient manifold and is equipped with the structure of a normal topological bundle (see [20, p. 562–563]). The groups $LT_{n-q-q'}(X, Y, Z)$ of obstructions to surgery on manifold triples were recently introduced in [18]. The groups $LT_*$ are natural generalizations of the obstruction groups $LP_*$ to surgery on manifold pairs. The set $S_{n+1}(X, Y, Z)$ of homotopy triangulations of the triple $(X, Y, Z)$ is the natural generalization of the structure sets $S_{n+1}(X)$ and $S_{n+1}(X, Y, \xi)$. This set fits into the exact sequence
\[
\cdots \to LT_{k+1}(X,Y,Z) \to S_{n+1}(X,Y,Z) \to H_n(X, L_\bullet) \to LT_k(X,Y,Z) \to \cdots \tag{1.4}
\]
where $k = n - q - q'$. The relations between $S_*(X, Y, Z)$ and $S_*(X, Y, \xi)$ are given by the following braid of exact sequences [18]
\[
\begin{array}{ccccccccc}
S_{n+1}(X,Y,Z) & & \longrightarrow & & H_n(X, L_\bullet) & & \longrightarrow & & LP_{n-q}(F) \\
 & \searrow & & \nearrow & & \searrow & & \nearrow & \\
 & & S_{n+1}(X,Y,\xi) & & & & LT_k(X,Y,Z) & & \\
 & \nearrow & & \searrow & & \nearrow & & \searrow & \\
LP_{n-q+1}(F) & & \longrightarrow & & LS_k(\Psi) & & \longrightarrow & & S_n(X,Y,Z)
\end{array} \tag{1.5}
\]
where $\Psi$ is the square of fundamental groups in the splitting problem for the pair $(Y, Z)$. Note that the map $S_{n+1}(X, Y, Z) \to S_{n+1}(X, Y, \xi)$ in (1.5) is a natural forgetful map.

We have the following topological normal bundles: $\xi$ for the submanifold $Y$ in $X$, $\eta$ for the submanifold $Z$ in $Y$, and $\nu$ for the submanifold $Z$ in $X$. Let $U_\xi$ be the space of the normal bundle $\xi$. We shall suppose that the space $U_\nu$ of the normal bundle $\nu$ is identified with the space $V_\xi$ of the restriction of the bundle $\xi$ to the space $U_\eta$ of the normal bundle $\eta$ in such a way that $\partial U_\nu = \partial U_\xi|_{U_\eta} \cup U_\xi|_{\partial U_\eta}$.

In the present paper we describe various relations between the sets of homotopy triangulations $S_*(X)$, $S_*(Y)$, $S_*(Z)$, $S_*(X, Y, \xi)$, $S_*(X, Z, \nu)$, $S_*(Y, Z, \eta)$, and $S_*(X, Y, Z)$ which arise naturally for a triple of embedded manifolds. The main results are given by commutative diagrams of exact sequences. We also introduce a group $LSP_*$ of obstructions to splitting a simple homotopy equivalence $f : M \to X$ along a pair of embedded submanifolds $(Z \subset Y) \subset X$, and describe its relations to the classical obstruction groups in surgery theory. The group $LSP_*$ is a natural straightforward generalization of the group $LS_*$ if we consider the pair of submanifolds $Z \subset Y$ instead of the submanifold $Y$.
2 Preliminaries
In this section we recall some definitions and results used in the paper (see [23, 20, 10, 11, 18, 21, 22, 8]).

The definition of a topological normal map $(f, b) : M \to X$ is given in [20, p. 36] (see also [21, 19]). For $n \geq 5$ the set of concordance classes of topological normal maps into a manifold $X$ coincides with the set $[X, G/TOP] \cong H_n(X, L_\bullet)$ (see [20, 21, 19]).

We shall use the definition of a topological manifold pair $(X, Y, \xi)$ as given in [20, §7.2]. Here $Y$ is a submanifold of codimension $q$ for which the topological normal bundle
\[
(D^q, S^{q-1}) \to (E(\xi), S(\xi)) \to Y
\]
is defined. A topological normal map
\[
((f, b), (g, c)) : (M, N, \xi_1) \to (X, Y, \xi)
\]
with $N = f^{-1}(Y)$ is defined in [20, §7.2]. For this map the restrictions
\[
(f, b)|_N = (g, c) : N \to Y \quad \text{and} \quad (f, b)|_P = (h, d) : (P, S(\xi_1)) \to (Z, S(\xi))
\]
are topological normal maps, where $P = M \setminus E(\xi_1)$ and $Z = X \setminus E(\xi)$. Additionally, the restriction $(h, d)|_{S(\xi_1)} : S(\xi_1) \to S(\xi)$ coincides with the induced map $(g, c)^! : S(\xi_1) \to S(\xi)$, and we have $(f, b) = (g, c)^! \cup (h, d)$. It follows from [20, Proposition 7.2.3] that the set of concordance classes of normal maps to the pair $(X, Y, \xi)$ coincides with the set of concordance classes of normal maps to the manifold $X$.

A normal map $((f, b), (g, c)) : (M, N) \to (X, Y)$ represents an element of $S_{n+1}(X, Y, \xi)$ if the maps
\[
f : M \to X, \quad g : N \to Y, \quad \text{and} \quad h : (P, S(\nu)) \to (Z, S(\xi))
\]
are $s$–triangulations (see [20, p. 571]). It follows from the definition of an $s$–triangulation of the pair $(X, Y, \xi)$ that the forgetful maps
\[
\begin{array}{ll}
S_{n+1}(X, Y, \xi) \to S_{n+1}(X), & ((f, b), (g, c)) \mapsto (f, b); \\
S_{n+1}(X, Y, \xi) \to S_{n-q+1}(Y), & ((f, b), (g, c)) \mapsto (g, c)
\end{array}
\]
are well defined. In general, the map $S_{n+1}(X, Y, \xi) \to S_{n+1}(X)$ is neither an epimorphism nor a monomorphism [20, p. 571].
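The failure of injectivity and surjectivity can be located in the braid (1.3). The following is only a sketch, assuming the `amsmath` package and the group structures of the topological category. The names $\phi$, $\theta$ and $\partial$ are introduced here for illustration and are not the paper's notation. Extending the sequence of (1.3) that passes through $S_{n+1}(X, Y, \xi)$ and $S_{n+1}(X)$ one step to the left gives:

```latex
\begin{gather*}
LS_{n-q+1}(F) \xrightarrow{\ \partial\ } S_{n+1}(X,Y,\xi)
  \xrightarrow{\ \phi\ } S_{n+1}(X)
  \xrightarrow{\ \theta\ } LS_{n-q}(F), \\
\operatorname{im}\phi = \ker\theta, \qquad \ker\phi = \operatorname{im}\partial .
\end{gather*}
```

Under these assumptions the forgetful map $\phi$ is onto exactly when the splitting obstruction $\theta$ vanishes identically, and it is injective exactly when $\partial$ is trivial.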
Recall that if $(X, Y)$ is a pair of topological spaces, equipped with an orientation homomorphism, then the relative groups $S_*(X, Y)$ are well defined [20, p. 560]. These groups fit into the following exact sequences
\[
\begin{array}{l}
\cdots \to S_n(Y) \to S_n(X) \to S_n(X, Y) \to S_{n-1}(Y) \to \cdots, \\
\cdots \to H_n(X, Y; L_\bullet) \to L_n(\pi_1(Y) \to \pi_1(X)) \to S_n(X, Y) \to \cdots.
\end{array} \tag{2.1}
\]
For a manifold pair $(X, Y)$ the groups $S_*(X, Y)$ differ from the groups $S_*(X, Y, \xi)$.

A topological normal map
\[
((f, b), (g, c), (h, d)) : (M, N, K) \to (X, Y, Z)
\]
of a triple of manifolds $(X, Y, Z)$ is given by topological normal maps of manifold pairs
\[
((f, b), (g, c)) : (M, N) \to (X, Y) \quad \text{and} \quad ((g, c), (h, d)) : (N, K) \to (Y, Z).
\]
A topological normal map $(f, b) \in [X, G/TOP]$ with $f : M \to X$ (see [20, 18]) defines a topological normal map $((f, b), (g, c), (h, d)) : (M, N, K) \to (X, Y, Z)$, as follows from topological transversality (see [20, Proposition 7.2.3]).

A topological normal map $((f, b), (g, c), (h, d)) : (M, N, K) \to (X, Y, Z)$ is an $s$–triangulation of the triple $(X, Y, Z)$ if the constituent normal maps
\[
((f, b), (g, c)) : (M, N) \to (X, Y), \quad ((g, c), (h, d)) : (N, K) \to (Y, Z), \quad \text{and} \quad ((f, b), (h, d)) : (M, K) \to (X, Z)
\]
are $s$–triangulations. The set of concordance classes of $s$–triangulations of the triple $(X, Y, Z)$ is denoted by $S_{n+1}(X, Y, Z)$. As follows from [18], this set has a group structure and fits into the commutative braid of exact sequences (1.5).

In [18] the groups $LT_*(X, Y, Z)$ and the map
\[
\Theta_*(f, b) : [X, G/TOP] = H_n(X, L_\bullet) \to LT_{n-q-q'}(X, Y, Z)
\]
are defined in such a way that the normal map $(f, b)$ is normally bordant to an $s$–triangulation of the triple $(X, Y, Z)$ if and only if $\Theta_*(f, b) = 0$ (for $n - q - q' \geq 5$).

Proposition 2.1 Suppose that a topological normal map
\[
((f, b), (g, c), (h, d)) : (M, N, K) \to (X, Y, Z)
\]
gives $s$–triangulations of manifold pairs
\[
((f, b), (g, c)) : (M, N) \to (X, Y) \quad \text{and} \quad ((g, c), (h, d)) : (N, K) \to (Y, Z).
\]
Then the map $((f, b), (h, d)) : (M, K) \to (X, Z)$ is an $s$–triangulation of the pair $(X, Z)$.

Proof: The maps $f : M \to X$ and $h : K \to Z$ are simple homotopy equivalences by definition. Denote by $V_\xi$ the space of the restriction of the bundle $\xi$ to the space $U_\eta$. We can identify the space $V_\xi$ with the space $U_\nu$ of the bundle $\nu$. The map $g$ is already split along the submanifold $Z$. Hence its restriction is a simple homotopy equivalence on $Y \setminus Z$ and, therefore, we have a simple homotopy equivalence $f|_H$, where $H$ is $U_\xi \setminus U_\nu$. By definition the restriction of the map $f$ gives a simple homotopy equivalence on $E = X \setminus U_\xi$. The map $f|_{H \cap E}$ will be a simple homotopy equivalence since $g$ is a simple homotopy equivalence on $Y \setminus Z$ and on the boundary of the tubular neighborhood. Hence the map $f|_{H \cup E}$ will be a simple homotopy equivalence (see [8, §23]). We can identify $X \setminus U_\nu$ with $H \cup E$ and the proposition is proved. □

Proposition 2.2 For the triple $Z \subset Y \subset X$ the natural forgetful maps fit into the following commutative diagrams
\[
\begin{array}{ccc}
S_{n+1}(X, Y, Z) & \to & S_{n+1}(X, Y, \xi) \\
\downarrow & & \downarrow \\
S_{n-q+1}(Y, Z, \eta) & \to & S_{n-q+1}(Y),
\end{array}
\qquad
\begin{array}{ccc}
S_{n+1}(X, Y, Z) & \to & S_{n+1}(X, Y, \xi) \\
\downarrow & & \downarrow \\
S_{n+1}(X, Z, \nu) & \to & S_{n+1}(X),
\end{array}
\]
and
\[
\begin{array}{ccc}
S_{n+1}(X, Y, Z) & \to & S_{n+1}(X, Z, \nu) \\
\downarrow & & \downarrow \\
S_{n-q+1}(Y, Z, \eta) & \to & S_{n-q-q'+1}(Z).
\end{array}
\]

Proof: The result follows from the definition of an $s$–triangulation of a triple of manifolds. □

In the present paper we shall use realizations of the various groups and natural maps in surgery theory on the spectra level (see [23, 20, 18, 21, 1, 8]). The surgery exact sequence (1.1) in the topological category
is realized on the spectra level [21]. The commutative diagrams (1.3) and (1.5) are realized on the spectra level (see [20, 18]), too.

We recall here that transfer and induced maps of $L$–groups are realized on the spectra level. A homomorphism of oriented groups $f : \pi \to \pi'$ induces a cofibration of $\Omega$–spectra [8]
\[
L(\pi) \longrightarrow L(\pi') \longrightarrow L(f),
\]
where $\pi_n(L(\pi)) = L_n(\pi)$ and similarly for the other spectra. The homotopy long exact sequence of this cofibration gives a relative exact sequence of $L_*$–groups of the map $f$
\[
\cdots \to L_n(\pi) \to L_n(\pi') \to L_n(f) \to L_{n-1}(\pi) \to \cdots.
\]
Let $p : E \to X$ be a bundle over an $n$–dimensional manifold $X$ whose fiber is an $m$–dimensional manifold $M^m$. Then the transfer map (see [23, 10, 11])
\[
p^* : L_n(\pi_1(X)) \to L_{n+m}(\pi_1(E))
\]
is well defined. This map is realized on the spectra level by a map $p^! : L(\pi_1(X)) \to \Sigma^{-m} L(\pi_1(E))$. For the pair of manifolds $(X, Y)$ consider a homotopy commutative diagram of spectra (see [23, 20, 1])
\[
\begin{array}{ccccc}
L(\pi_1(Y)) & \to & \Sigma^{-q} L(\pi_1(\partial U) \to \pi_1(Y)) & \to & \Sigma^{-q} L(\pi_1(X \setminus Y) \to \pi_1(X)) \\
 & \searrow & \downarrow & & \downarrow \\
 & & \Sigma^{1-q} L(\pi_1(\partial U)) & \to & \Sigma^{1-q} L(\pi_1(X \setminus Y)),
\end{array} \tag{2.2}
\]
where the left maps are the transfer maps on the spectra level. The spectrum $LS(F)$ is defined as a homotopy cofiber of the map
\[
\Sigma L(\pi_1(Y)) \to \Sigma^{-q-1} L(\pi_1(X \setminus Y) \to \pi_1(X)),
\]
and the spectrum $LP(F)$ is defined as a homotopy cofiber of the map
\[
\Sigma^{-1} L(\pi_1(Y)) \to \Sigma^{-q} L(\pi_1(X \setminus Y))
\]
(see [1, 13, 17, 12, 8]). Then the homotopy groups of these spectra coincide with the splitting obstruction groups $\pi_n(LS(F)) \cong LS_n(F)$ and the surgery obstruction groups for the manifold pair $\pi_n(LP(F)) \cong LP_n(F)$.

It follows from [20, 18, 21] that the sets of $s$–triangulations are realized on the spectra level. We shall denote by $S(X)$ the corresponding
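spectrum for the structure set $S_{n+1}(X)$, and similarly for the other structure sets.

For the triple $Z^{n-q-q'} \subset Y^{n-q} \subset X^n$ denote by $j$ the natural inclusion
\[
(X \setminus Z, Y \setminus Z) \to (X, Y)
\]
of $CW$–pairs of codimension $q$ (see [20, §7.2]). Let $F_Z$ be the square of fundamental groups for the pair $(X \setminus Z, Y \setminus Z)$. Let $W = X \setminus Z$. The map $j$ induces the map of squares $F_Z \to F$ and therefore a commutative diagram
\[
\begin{array}{ccccccccc}
 & & \vdots & & \vdots & & \vdots & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & LS_{n-q}(F_Z) & \to & L_{n-q}(\pi_1(Y \setminus Z)) & \to & L_n(\pi_1(X \setminus Y) \to \pi_1(W)) & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & LS_{n-q}(F) & \to & L_{n-q}(\pi_1(Y)) & \to & L_n(\pi_1(X \setminus Y) \to \pi_1(X)) & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & LNS_k & \to & L_{n-q}(\pi_1(Y \setminus Z) \to \pi_1(Y)) & \stackrel{tr^{rel}}{\to} & L_n(\pi_1(W) \to \pi_1(X)) & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & \vdots & & \vdots & & \vdots & &
\end{array} \tag{2.3}
\]
where $k = n - q - q'$. The two upper right horizontal maps induce the map $tr^{rel}$ between the relative groups of the two upper right vertical maps; the relative groups of the induced map are the groups $LNS_*$ (see [7, 15]). Diagram (2.3) is realized on the spectra level.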
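As a sketch of how the bottom row of (2.3) arises, the relative exact sequence of $L_*$–groups recalled earlier in this section can be specialized to the inclusion-induced homomorphism $f : \pi_1(Y \setminus Z) \to \pi_1(Y)$. The name $f$ is used here only for illustration, and the display assumes the `amsmath` package. This gives the middle column of the diagram:

```latex
\begin{align*}
\cdots &\to L_{n-q}(\pi_1(Y\setminus Z)) \to L_{n-q}(\pi_1(Y)) \\
       &\to L_{n-q}\bigl(\pi_1(Y\setminus Z)\to\pi_1(Y)\bigr)
        \to L_{n-q-1}(\pi_1(Y\setminus Z)) \to \cdots
\end{align*}
```

The relative term $L_{n-q}(\pi_1(Y \setminus Z) \to \pi_1(Y))$ is the source of the relative transfer map $tr^{rel}$ in (2.3).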
3 Homotopy triangulations of a triple of manifolds
In this section we describe relations between various structure sets which arise naturally for a triple of manifolds. Denote by Φ the square of fundamental groups in the splitting problem for the pair (X, Z). Recall here that F , FZ , and Ψ denote the similar squares for the pairs (X, Y ), (X \ Z, Y \ Z), and (Y, Z), respectively.
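Theorem 3.1 The natural forgetful maps of Proposition 2.2 fit into the following commutative diagrams of exact sequences
\[
\begin{array}{ccccccccc}
S_n(X \setminus Y) & & \longrightarrow & & S_n(X, Y, \xi) & & \longrightarrow & & LS_{k-1}(\Psi) \\
 & \searrow & & \nearrow & & \searrow & & \nearrow & \\
 & & S_n(X, Y, Z) & & & & S_{n-q}(Y) & & \\
 & \nearrow & & \searrow & & \nearrow & & \searrow & \\
LS_k(\Psi) & & \longrightarrow & & S_{n-q}(Y, Z, \eta) & & \longrightarrow & & S_{n-1}(X \setminus Y),
\end{array} \tag{3.1}
\]
\[
\begin{array}{ccccccccc}
 & & \vdots & & \vdots & & \vdots & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & LS_{n-q}(F_Z) & \to & LS_{n-q}(F) & \to & LNS_k & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & S_n(X, Y, Z) & \to & S_n(X, Y, \xi) & \to & LS_{k-1}(\Psi) & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & S_n(X, Z, \nu) & \to & S_n(X) & \to & LS_{k-1}(\Phi) & \to & \cdots, \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & \vdots & & \vdots & & \vdots & &
\end{array} \tag{3.2}
\]
and
\[
\begin{array}{ccccccccc}
 & & \vdots & & \vdots & & \vdots & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & LS_{n-q}(F_Z) & \to & S_n(X, Y, Z) & \to & S_n(X, Z, \nu) & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & S_{n-q}(Y \setminus Z) & \to & S_{n-q}(Y, Z, \eta) & \to & S_k(Z) & \to & \cdots, \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & S_{n-1}(X \setminus Z, X \setminus Y) & \to & S_{n-1}(X \setminus Y) & \to & S_{n-1}(X \setminus Z) & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & \vdots & & \vdots & & \vdots & &
\end{array} \tag{3.3}
\]
where $k = n - q - q'$. The diagrams are realized on the spectra level.

Proof: Diagram (3.1) was obtained in [18]. The group $H_n(X, L_\bullet)$ maps to the groups of a commutative diagram of forgetful maps
\[
\begin{array}{ccc}
LT_{n-q-q'}(X, Y, Z) & \to & LP_{n-q}(F) \\
\downarrow & & \downarrow \\
LP_{n-q-q'}(\Phi) & \to & L_n(\pi_1(X)),
\end{array} \tag{3.4}
\]
which is realized on the spectra level (see [18, 15]). We obtain a commutative diagram in the form of a pyramid with the group $H_n(X, L_\bullet)$ at the top. This diagram is realized on the spectra level. The cofibres of the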
maps which correspond to the side edges give a homotopy commutative diagram of spectra of structure sets
\[
\begin{array}{ccc}
S(X, Y, Z) & \to & S(X, Y, \xi) \\
\downarrow & & \downarrow \\
S(X, Z, \nu) & \to & S(X),
\end{array}
\]
which realizes the second commutative diagram from Proposition 2.2. It is easy to see that diagram (3.2) follows from the diagram above and diagrams (1.3), (2.3).

Now consider a commutative diagram
\[
\begin{array}{ccc}
H_n(X, L_\bullet) & = & H_n(X, L_\bullet) \\
\downarrow & & \downarrow \\
H_{n-q}(Y, L_\bullet) & \to & H_{n-q-q'}(Z, L_\bullet),
\end{array} \tag{3.5}
\]
in which the vertical maps and the lower horizontal map are given by compositions of transfer maps and isomorphisms (see [20, p. 579])
\[
\begin{array}{rcl}
H_{n-q}(Y, L_\bullet) & \cong & H_n(X, X \setminus Y; L_\bullet), \\
H_{n-q-q'}(Z, L_\bullet) & \cong & H_n(X, X \setminus Z; L_\bullet), \\
H_{n-q-q'}(Z, L_\bullet) & \cong & H_{n-q}(Y, Y \setminus Z; L_\bullet).
\end{array}
\]
Consider a natural map of (3.5) to the commutative diagram of forgetful maps (see [18, 15])
\[
\begin{array}{ccc}
LT_{n-q-q'}(X, Y, Z) & \to & LP_{n-q-q'}(\Phi) \\
\downarrow & & \downarrow \\
LP_{n-q-q'}(\Psi) & \to & L_{n-q-q'}(\pi_1(Z)).
\end{array} \tag{3.6}
\]
Diagrams (3.5) and (3.6) and the obtained maps between them are realized on the spectra level. Cofibres give a homotopy commutative diagram of spectra of structure sets
\[
\begin{array}{ccc}
S(X, Y, Z) & \to & S(X, Z, \nu) \\
\downarrow & & \downarrow \\
\Sigma^q S(Y, Z, \eta) & \to & \Sigma^{q+q'} S(Z).
\end{array} \tag{3.7}
\]
Diagram (3.7) realizes commutative diagram (3.3) on the spectra level. Now the result follows similarly to the previous case. □

Theorem 3.2 Let $\Phi$ and $\Psi$ be the squares of fundamental groups arising in the splitting problems for the manifold pairs $(X, Z)$ and $(Y, Z)$,
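respectively. Then there exists a commutative braid of exact sequences
\[
\begin{array}{ccccccccc}
LNS_{k-1} & & \longrightarrow & & LS_{k-1}(\Psi) & & \longrightarrow & & S_{k-1}(Z) \\
 & \searrow & & \nearrow & & \searrow & & \nearrow & \\
 & & S_l(Y, Y \setminus Z) & & & & LS_{k-1}(\Phi) & & \\
 & \nearrow & & \searrow & & \nearrow & & \searrow & \\
S_k(Z) & & \longrightarrow & & S_n(X, X \setminus Z) & & \longrightarrow & & LNS_{k-2},
\end{array} \tag{3.8}
\]
where $k = n - q - q'$, $l = n - q$.

Proof: The transfer maps give isomorphisms
\[
\begin{array}{ccc}
H_{n-q-q'}(Z, L_\bullet) & \stackrel{\cong}{\longrightarrow} & H_{n-q}(Y, Y \setminus Z; L_\bullet) \\
 & {\scriptstyle \cong} \searrow & \downarrow {\scriptstyle \cong} \\
 & & H_n(X, X \setminus Z; L_\bullet).
\end{array} \tag{3.9}
\]
Consider a commutative triangle (see [7, 15])
\[
\begin{array}{ccc}
L_{n-q-q'}(\pi_1(Z)) & \longrightarrow & L_{n-q}(\pi_1(Y \setminus Z) \to \pi_1(Y)) \\
 & \searrow & \downarrow {\scriptstyle tr^{rel}} \\
 & & L_n(\pi_1(X \setminus Z) \to \pi_1(X)).
\end{array} \tag{3.10}
\]
The right vertical map in (3.10) is a relative transfer map from (2.3) and the two other maps are compositions of transfer maps and maps induced by inclusions. For the pair $(X, Y)$ the corresponding map on the spectra level is described in (2.2). Taking the obstruction to surgery we obtain maps from (3.9) into (3.10) for which homotopy cofibers give a homotopy commutative triangle of the spectra of structure sets
\[
\begin{array}{ccc}
\Sigma^{q+q'} S(Z) & \to & \Sigma^q S(Y, Y \setminus Z) \\
 & \searrow & \downarrow \\
 & & S(X, X \setminus Z).
\end{array} \tag{3.11}
\]
The cofiber of the horizontal map in (3.11) is $\Sigma^{k+1} LS(\Psi)$ and the cofiber of the sloping map is $\Sigma^{k+1} LS(\Phi)$, as follows from [20, Proposition 7.2.6, ii]. Thus we have a push–out square of spectra
\[
\begin{array}{ccc}
\Sigma^q S(Y, Y \setminus Z) & \to & \Sigma^{k+1} LS(\Psi) \\
\downarrow & & \downarrow \\
S(X, X \setminus Z) & \to & \Sigma^{k+1} LS(\Phi),
\end{array} \tag{3.12}
\]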
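where the right vertical map fits into (3.2). Now the homotopy long exact sequences of maps in (3.12) give diagram (3.8). □

Theorem 3.3 There exists the following braid of exact sequences
\[
\begin{array}{ccccccccc}
H_k(Z, L_\bullet) & & \longrightarrow & & L_n(C \to D) & & \longrightarrow & & LNS_{k-1} \\
 & \searrow & & \nearrow & & \searrow & & \nearrow & \\
 & & L_l(A \to B) & & & & S_n(X, X \setminus Z) & & \\
 & \nearrow & & \searrow & & \nearrow & & \searrow & \\
LNS_k & & \longrightarrow & & S_l(Y, Y \setminus Z) & & \longrightarrow & & H_{k-1}(Z, L_\bullet),
\end{array} \tag{3.13}
\]
where $k = n - q - q'$, $l = n - q$, $\pi_1(Y \setminus Z) = A$, $\pi_1(Y) = B$, $\pi_1(X \setminus Z) = C$, $\pi_1(X) = D$. Diagram (3.13) is realized on the spectra level.

Proof: The proof is similar to that of Theorem 3.2. It is necessary to use the definition of the map
\[
tr^{rel} : L_l(A \to B) \to L_n(C \to D)
\]
from (2.3) and isomorphisms (3.9). □

Theorem 3.4 There exists a commutative diagram of exact sequences
\[
\begin{array}{ccccccccc}
 & & \vdots & & \vdots & & \vdots & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & S_l(Y \setminus Z) & \to & S_n(X \setminus Z, X \setminus Y) & \to & LS_{l-1}(F_Z) & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & S_l(Y) & \to & S_n(X, X \setminus Y) & \to & LS_{l-1}(F) & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & S_l(Y, Y \setminus Z) & \to & S_n(X, X \setminus Z) & \to & LNS_{k-1} & \to & \cdots, \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & \vdots & & \vdots & & \vdots & &
\end{array} \tag{3.14}
\]
where $l = n - q$ and $k = n - q - q'$. Diagram (3.14) is realized on the spectra level.

Proof: The proof is similar to that of Theorem 3.2. □

Theorem 3.5 There exists a commutative braid of exact sequences
\[
\begin{array}{ccccccccc}
H_n(X, L_\bullet) & & \longrightarrow & & LP_k(\Phi) & & \longrightarrow & & LS_{l-1}(F_Z) \\
 & \searrow & & \nearrow & & \searrow & & \nearrow & \\
 & & LT_k(X, Y, Z) & & & & S_n(X, Z, \nu) & & \\
 & \nearrow & & \searrow & & \nearrow & & \searrow & \\
LS_l(F_Z) & & \longrightarrow & & S_n(X, Y, Z) & & \longrightarrow & & H_{n-1}(X, L_\bullet),
\end{array} \tag{3.15}
\]
where $k = n - q - q'$ and $l = n - q$. Diagram (3.15) is realized on the spectra level.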
Proof: Consider the maps from $H_n(X, L_\bullet)$ to the groups $LT_k(X, Y, Z)$ and $LP_k(\Phi)$ obtained by taking the obstructions to surgery. We obtain a commutative triangle in which the third map is the natural forgetful map $LT_k(X, Y, Z) \to LP_k(\Phi)$. Now the result follows from the definitions of the groups $S_n(X, Z, \nu)$ and $S_n(X, Y, Z)$ and from commutative diagram (3.2). □

Theorem 3.6 There exists a commutative diagram of exact sequences
\[
\begin{array}{ccccccccc}
 & & \vdots & & \vdots & & \vdots & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & H_n(X \setminus Y, L_\bullet) & \to & H_n(X, L_\bullet) & \to & H_{n-q}(Y, L_\bullet) & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & L_n(\pi_1(X \setminus Y)) & \to & LT_k(X, Y, Z) & \to & LP_k(\Psi) & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & S_n(X \setminus Y) & \to & S_n(X, Y, Z) & \to & S_l(Y, Z, \eta) & \to & \cdots, \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & \vdots & & \vdots & & \vdots & &
\end{array} \tag{3.16}
\]
where $l = n - q$ and $k = n - q - q'$. Diagram (3.16) is realized on the spectra level.

Proof: The right upper square in (3.16) is commutative and it is realized on the spectra level by [20, 18]. Now diagram (3.16) is obtained by considering the homotopy long exact sequences of maps from this square. □
4 Splitting a homotopy equivalence along a submanifold pair
In this section we introduce the obstruction groups $LSP_* = LSP_*(X, Y, Z)$ for a triple of embedded manifolds $Z \subset Y \subset X$. These groups fit into an exact sequence
\[
\cdots \to LSP_{n-q-q'} \to LT_{n-q-q'}(X, Y, Z) \to L_n(\pi_1(X)) \to LSP_{n-q-q'-1} \to \cdots. \tag{4.1}
\]
The groups $LSP_*(X, Y, Z)$ are a natural straightforward generalization of the splitting obstruction groups $LS_*(F)$ for the manifold $X$ with a
submanifold $Y$ to the case when the manifold $X$ contains a pair of embedded submanifolds $(Z \subset Y) \subset X$. Hence the groups $LSP_*(X, Y, Z)$ are obstruction groups for doing surgery on the manifold pair $(Y, Z)$ inside the manifold $X$. In particular, there is a natural forgetful map
\[
LSP_*(X, Y, Z) \to LP_*(\Psi)
\]
forgetting the ambient manifold $X$. We also describe relations between the introduced groups and the sets of homotopy triangulations which arise for the triple of manifolds.

Recall here that in [18] the spectrum $LT(X, Y, Z)$ with homotopy groups $LT_n(X, Y, Z) = \pi_n(LT(X, Y, Z))$ is defined as a homotopy cofiber of the map
\[
\Sigma^{-q'-1} v : \Sigma^{-q'-1} LP(F) \to LS(\Psi),
\]
where $\Sigma$ denotes the suspension functor. The map of homotopy groups induced by $v$ coincides with the composition
\[
LP_{n-q+1}(F) \to S_{n+1}(X, Y, \xi) \to S_{n-q+1}(Y) \to LS_{n-q-q'}(\Psi), \tag{4.2}
\]
where the middle map is the natural forgetful map and the other maps are described in (1.3). Hence, from the cofibration sequence of the map $v$, we obtain the map $t : \Sigma^{q+q'} LT(X, Y, Z) \to \Sigma^q LP(F)$. The composition of the map $t$ with the natural forgetful map $LP_{n-q}(F) \to L_n(\pi_1(X))$ on the spectra level provides a map $s : \Sigma^{q+q'} LT(X, Y, Z) \to L(\pi_1(X))$. We define a spectrum $LSP(X, Y, Z)$ as the spectrum fitting into the cofibration
\[
LSP(X, Y, Z) \to LT(X, Y, Z) \to \Sigma^{-q-q'} L(\pi_1(X)). \tag{4.3}
\]
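Let $LSP_n = LSP_n(X, Y, Z)$ denote the homotopy group $\pi_n(LSP(X, Y, Z))$. As follows from the definition, these groups fit into the long exact sequence in (4.1).

Theorem 4.1 There exists a commutative braid of exact sequences
\[
\begin{array}{ccccccccc}
S_{n+1}(X, Y, Z) & & \longrightarrow & & H_n(X, L_\bullet) & & \longrightarrow & & L_n(\pi_1(X)) \\
 & \searrow & & \nearrow & & \searrow & & \nearrow & \\
 & & S_{n+1}(X) & & & & LT_{n-q-q'} & & \\
 & \nearrow & & \searrow^{\alpha} & & \nearrow & & \searrow & \\
L_{n+1}(\pi_1(X)) & & \longrightarrow & & LSP_{n-q-q'} & & \longrightarrow & & S_n(X, Y, Z)
\end{array} \tag{4.4}
\]
which is realized on the spectra level.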
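As a reading aid, the braid (4.4) can be unwound into four interlocking long exact sequences. The display below is a sketch that assumes the `amsmath` package and writes $k = n - q - q'$. Its first line is the surgery exact sequence (1.1) in the topological category, the third is (1.4), and the fourth is (4.1); the second line contains the map $\alpha$.

```latex
\begin{align*}
&\cdots \to L_{n+1}(\pi_1(X)) \to S_{n+1}(X) \to H_n(X, L_\bullet) \to L_n(\pi_1(X)) \to \cdots \\
&\cdots \to S_{n+1}(X,Y,Z) \to S_{n+1}(X) \xrightarrow{\ \alpha\ } LSP_k \to S_n(X,Y,Z) \to \cdots \\
&\cdots \to S_{n+1}(X,Y,Z) \to H_n(X, L_\bullet) \to LT_k \to S_n(X,Y,Z) \to \cdots \\
&\cdots \to L_{n+1}(\pi_1(X)) \to LSP_k \to LT_k \to L_n(\pi_1(X)) \to \cdots
\end{align*}
```

The index shift in the last line agrees with the cofibration (4.3): with the convention $\pi_n(L(\pi)) = L_n(\pi)$ one has $\pi_k(\Sigma^{-q-q'} L(\pi_1(X))) = L_{k+q+q'}(\pi_1(X)) = L_n(\pi_1(X))$.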
Proof: The definition of the spectrum $LT$ yields a homotopy commutative right square of a homotopy commutative diagram of spectra
\[
\begin{array}{ccccc}
\Sigma^{-1} S(X) & \to & X \wedge L_\bullet & \to & L(\pi_1(X)) \\
\downarrow & & \downarrow & & \downarrow {\scriptstyle =} \\
\Sigma^{q+q'} LSP & \to & \Sigma^{q+q'} LT & \to & L(\pi_1(X))
\end{array} \tag{4.5}
\]
where the existence of the left vertical map follows from [22]. The cofibers of the two horizontal maps of the left square in (4.5) coincide. Hence the left square is a pull–back square and the homotopy long exact sequences of this square give diagram (4.4). □

Commutative diagram (4.4) is a natural generalization of diagram (1.3) to the case of a triple of embedded manifolds. The left vertical map in (4.5) induces a map
\[
\alpha : S_{n+1}(X) \to LSP_{n-q-q'}(X, Y, Z)
\]
in (4.4). The geometric meaning of this map is explained in the following theorem.

Theorem 4.2 Let $f : M \to X$ be a simple homotopy equivalence which represents an element of $S_{n+1}(X)$. Then $\alpha(f) = 0$ if and only if the homotopy class of the map $f$ contains an $s$–triangulation of the triple $(X, Y, Z)$ (that is, $f$ splits along the pair $Z \subset Y$).

Proof: Let the homotopy class of the map $f$ contain an $s$–triangulation of the triple $(X, Y, Z)$. Then the element $f$ lies in the image of the forgetful map $S_{n+1}(X, Y, Z) \to S_{n+1}(X)$. The composition
\[
S_{n+1}(X, Y, Z) \to S_{n+1}(X) \to LSP_{n-q-q'}
\]
is trivial, as follows from (4.4). Hence $\alpha(f) = 0$. Conversely, let $\alpha(f) = 0$. Then the same exact sequence shows that $f$ lies in the image of $S_{n+1}(X, Y, Z) \to S_{n+1}(X)$, and the result follows. □

Suppose that the pairs of manifolds $(X, Y)$ and $(Y, Z)$ are Browder–Livesay pairs (see [3]). In this case the spectrum $\Sigma^2 LT(X, Y, Z)$ coincides with the third member of the filtration in the construction of the surgery spectral sequence of Hambleton and Kharshiladze (see [18, 6]). Then the map
\[
rp : L_n(\pi_1(X)) \to LSP_{n-q-q'-1}(X, Y, Z)
\]
from (4.1) is a natural generalization of the Browder–Livesay invariant
\[
r : L_n(\pi_1(X)) \to LS_{n-q-1}(F) = LN_{n-q-1}(\pi_1(X \setminus Y) \to \pi_1(X))
\]
(see [3, 5]). Recall here (see [3]) that if $r(x) \neq 0$, then the element $x \in L_n(\pi_1(X))$ is not realized by a normal map of closed manifolds. In fact, the map $rp$ carries information equivalent to that given by the first and second Browder–Livesay invariants (see [15, 6, 5]).
Proposition 4.3 For a triple of manifolds $(X, Y, Z)$, let $(X, Y)$ and $(Y, Z)$ be Browder–Livesay pairs. If $rp(x) \neq 0$, then the element $x \in L_n(\pi_1(X))$ is not realized by a normal map of closed manifolds.

Proof: The result follows from [15, Proposition 3] and [5]. □
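Recall here the diagram (see [18])
\[
\begin{array}{ccccccccccc}
\to & L_n(\pi_1(X \setminus Y)) & & \longrightarrow & & LP_{n-q}(F) & & \longrightarrow & & LS_{k-1}(\Psi) & \to \\
 & & \searrow & & \nearrow & & \searrow & & \nearrow & & \\
 & & & LT_k & & & & L_{n-q}(\pi_1(Y)) & & & \\
 & & \nearrow & & \searrow & & \nearrow & & \searrow & & \\
\to & LS_k(\Psi) & & \longrightarrow & & LP_k(\Psi) & & \longrightarrow & & L_{n-1}(\pi_1(X \setminus Y)) & \to
\end{array}
\tag{4.6}
\]
that gives the relations of $LT_*$-groups to splitting obstruction groups and to surgery obstruction groups for manifold pairs. Diagram (4.6) is realized on the spectra level (see [18]). The relations of the $LSP_*$-groups to the classical surgery obstruction groups for the triple $(X, Y, Z)$ are given by the following result.

Theorem 4.4 There exist braids of exact sequences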
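\[
\begin{array}{ccccccccccc}
\to & L_n(C) & & \longrightarrow & & L_n(D) & & \longrightarrow & & LSP_{k-1} & \to \\
 & & \searrow & & \nearrow & & \searrow & & \nearrow & & \\
 & & & LT_k(X,Y,Z) & & & & L_n(C \to D) & & & \\
 & & \nearrow & & \searrow & & \nearrow & & \searrow & & \\
\to & LSP_k & & \longrightarrow & & LP_k(\Psi) & & \longrightarrow & & L_{n-1}(C) & \to,
\end{array}
\tag{4.7}
\]
\[
\begin{array}{ccccccccccc}
\to & LS_{n-q}(F_Z) & & \longrightarrow & & LT_k(X,Y,Z) & & \longrightarrow & & L_n(D) & \to \\
 & & \searrow & & \nearrow & & \searrow & & \nearrow & & \\
 & & & LSP_k & & & & LP_k(\Phi) & & & \\
 & & \nearrow & & \searrow & & \nearrow & & \searrow & & \\
\to & L_{n+1}(D) & & \longrightarrow & & LS_k(\Phi) & & \longrightarrow & & LS_{n-q-1}(F_Z) & \to,
\end{array}
\tag{4.8}
\]
\[
\begin{array}{ccccccccccc}
\to & LS_{n-q}(F_Z) & & \longrightarrow & & LS_{n-q}(F) & & \longrightarrow & & LS_{k-1}(\Psi) & \to \\
 & & \searrow & & \nearrow & & \searrow & & \nearrow & & \\
 & & & LSP_k & & & & LNS_k & & & \\
 & & \nearrow & & \searrow & & \nearrow & & \searrow & & \\
\to & LS_k(\Psi) & & \longrightarrow & & LS_k(\Phi) & & \longrightarrow & & LS_{n-q-1}(F_Z) & \to,
\end{array}
\tag{4.9}
\]
and
\[
\begin{array}{ccccccccccc}
\to & LS_{n-q+1}(F) & & \longrightarrow & & LS_k(\Psi) & & \longrightarrow & & LT_k & \to \\
 & & \searrow & & \nearrow & & \searrow & & \nearrow & & \\
 & & & LP_{n-q+1}(F) & & & & LSP_k & & & \\
 & & \nearrow & & \searrow & & \nearrow & & \searrow & & \\
\to & LT_{k+1} & & \longrightarrow & & L_{n+1}(D) & & \longrightarrow & & LS_{n-q}(F) & \to,
\end{array}
\tag{4.10}
\]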
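where $C = \pi_1(X \setminus Y)$, $D = \pi_1(X)$, $k = n - q - q'$. Diagrams (4.7)–(4.10) are realized on the spectra level.

Proof: Consider a homotopy commutative diagram
\[
\begin{array}{ccc}
LP_{n-q-q'}(\Psi) & \longrightarrow & L_n(\pi_1(X \setminus Y) \to \pi_1(X)) \\
\downarrow & & \downarrow \\
L_{n-1}(\pi_1(X \setminus Y)) & \stackrel{=}{\longrightarrow} & L_{n-1}(\pi_1(X \setminus Y))
\end{array}
\]
in which the upper horizontal map and the left map are compositions of the natural forgetful map $LP_{n-q-q'}(\Psi) \to L_{n-q}(\pi_1(Y))$ and maps of $L$-groups induced from (2.2). This diagram is realized on the spectra level, and by [22] we obtain a map of cofibration sequences
\[
\begin{array}{ccccc}
LT(X,Y,Z) & \to & LP(\Psi) & \to & \Sigma^{-q-q'+1} L(\pi_1(X \setminus Y)) \\
\downarrow & & \downarrow & & \downarrow {\scriptstyle =} \\
\Sigma^{-q-q'} L(\pi_1(X)) & \to & \Sigma^{-q-q'} L(\pi_1(X \setminus Y) \to \pi_1(X)) & \to & \Sigma^{-q-q'+1} L(\pi_1(X \setminus Y))
\end{array}
\]
where the left square is a pull-back square of spectra. The homotopy long exact sequences of this square provide the commutative braid of exact sequences (4.7) if we use the definition of the $LSP_*$-groups.

The natural forgetful maps (see [18, 15]) $LT_{n-q-q'} \to LP_{n-q-q'}(\Phi) \to L_n(\pi_1(X))$ provide a map of cofibration sequences
\[
\begin{array}{ccccc}
LSP(X,Y,Z) & \to & LT(X,Y,Z) & \to & \Sigma^{-q-q'} L(\pi_1(X)) \\
\downarrow & & \downarrow & & \downarrow {\scriptstyle =} \\
LS(\Phi) & \to & LP(\Phi) & \to & \Sigma^{-q-q'} L(\pi_1(X)).
\end{array}
\]
Now, similarly to the previous result, we obtain diagram (4.8), since the map $LT_* \to LP_*(\Phi)$ fits into (3.15).

Transfer maps and diagram (2.3) provide a map of the commutative diagram
\[
\begin{array}{ccc}
LP_{n-q-q'}(\Psi) & \longrightarrow & L_{n-q}(\pi_1(Y)) \\
\downarrow & & \downarrow \\
L_{n-q-q'}(\pi_1(Z)) & \longrightarrow & L_{n-q}(\pi_1(Y \setminus Z) \to \pi_1(Y))
\end{array}
\tag{4.11}
\]
to the commutative diagram
\[
\begin{array}{ccc}
L_n(\pi_1(X \setminus Y) \to \pi_1(X)) & \stackrel{=}{\longrightarrow} & L_n(\pi_1(X \setminus Y) \to \pi_1(X)) \\
\downarrow & & \downarrow \\
L_n(\pi_1(X \setminus Z) \to \pi_1(X)) & \longrightarrow & L_n(\pi_1(X \setminus Z) \to \pi_1(X)).
\end{array}
\tag{4.12}
\]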
All these maps are realized on the spectra level. Hence, on the spectra level, the cofibers of the maps from (4.11) to (4.12) provide a homotopy commutative diagram of spectra
\[
\begin{array}{ccc}
LSP & \to & \Sigma^{-q'} LS(F) \\
\downarrow & & \downarrow \\
LS(\Phi) & \to & LNS
\end{array}
\tag{4.13}
\]
as follows from (2.3) and (4.6). The realizations on the spectra level of diagrams (4.11) and (4.12) give pull-back squares. Hence the homotopy commutative square (4.13) is a pull-back, and diagram (4.9) is obtained from the homotopy long exact sequences of (4.13).

The natural forgetful maps $LT_{n-q-q'} \to LP_{n-q}(F) \to L_n(\pi_1(X))$ from (4.6) provide a homotopy commutative diagram of spectra
\[
\begin{array}{ccccc}
LT(X,Y,Z) & \to & \Sigma^{-q'} LP(F) & \to & \Sigma LS(\Psi) \\
{\scriptstyle =} \downarrow & & \downarrow & & \downarrow \\
LT(X,Y,Z) & \to & \Sigma^{-q-q'} L(\pi_1(X)) & \to & \Sigma LSP(X,Y,Z),
\end{array}
\tag{4.14}
\]
where the rows are cofibrations and the right vertical map is defined by [22]. Hence the right square in (4.14) is a pull-back and its homotopy long exact sequences give diagram (4.10). □

Corollary 4.5 There exist exact sequences
\[ \cdots \to LSP_k \to LS_{n-q}(F) \to LS_{k-1}(\Psi) \to \cdots, \]
\[ \cdots \to LSP_k \to LS_k(\Phi) \to LS_{n-q-1}(F_Z) \to \cdots, \]
and
\[ \cdots \to LSP_k \to LP_k(\Psi) \to L_n(\pi_1(X \setminus Y) \to \pi_1(X)) \to \cdots, \]
in which the left maps are natural forgetful maps.

Now we describe some relations between the introduced groups $LSP_*$ and various structure sets which arise for the triple $(X, Y, Z)$.

Theorem 4.6 There exist braids of exact sequences
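\[
\begin{array}{ccccccccccc}
\to & S_n(X) & & \longrightarrow & & LSP_{k-1} & & \longrightarrow & & S_{l-1}(Y,Z,\eta) & \to \\
 & & \searrow & & \nearrow & & \searrow & & \nearrow & & \\
 & & & S_n(X, X \setminus Y) & & & & S_{n-1}(X,Y,Z) & & & \\
 & & \nearrow & & \searrow & & \nearrow & & \searrow & & \\
\to & S_l(Y,Z,\eta) & & \longrightarrow & & S_{n-1}(X \setminus Y) & & \longrightarrow & & S_{n-1}(X) & \to,
\end{array}
\tag{4.15}
\]
\[
\begin{array}{ccccccccccc}
\to & H_l(Y, L_\bullet) & & \longrightarrow & & L_n(\pi_1(X \setminus Y) \to \pi_1(X)) & & \longrightarrow & & LSP_{k-1} & \to \\
 & & \searrow & & \nearrow & & \searrow & & \nearrow & & \\
 & & & LP_k(\Psi) & & & & S_n(X, X \setminus Y) & & & \\
 & & \nearrow & & \searrow & & \nearrow & & \searrow & & \\
\to & LSP_k & & \longrightarrow & & S_l(Y,Z,\eta) & & \longrightarrow & & H_{l-1}(Y, L_\bullet) & \to,
\end{array}
\tag{4.16}
\]
\[
\begin{array}{ccccccccccc}
\to & LS_l(F_Z) & & \longrightarrow & & S_n(X,Y,Z) & & \longrightarrow & & S_n(X) & \to \\
 & & \searrow & & \nearrow & & \searrow & & \nearrow & & \\
 & & & LSP_k & & & & S_n(X,Z,\nu) & & & \\
 & & \nearrow & & \searrow & & \nearrow & & \searrow & & \\
\to & S_{n+1}(X) & & \longrightarrow & & LS_k(\Phi) & & \longrightarrow & & LS_{l-1}(F_Z) & \to,
\end{array}
\tag{4.17}
\]
and
\[
\begin{array}{ccccccccccc}
\to & LS_{l+1}(F) & & \longrightarrow & & LS_k(\Psi) & & \longrightarrow & & S_n(X,Y,Z) & \to \\
 & & \searrow & & \nearrow & & \searrow & & \nearrow & & \\
 & & & S_{n+1}(X,Y,\xi) & & & & LSP_k & & & \\
 & & \nearrow & & \searrow & & \nearrow & & \searrow & & \\
\to & S_{n+1}(X,Y,Z) & & \longrightarrow & & S_{n+1}(X) & & \longrightarrow & & LS_l(F) & \to,
\end{array}
\tag{4.18}
\]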
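where $l = n - q$, $k = n - q - q'$. Diagrams (4.15)–(4.18) are realized on the spectra level.

Proof: Transfer maps give a commutative diagram (see [20])
\[
\begin{array}{ccc}
H_{n-q}(Y; L_\bullet) & \stackrel{\cong}{\longrightarrow} & H_n(X, X \setminus Y; L_\bullet) \\
 & \searrow & \downarrow \\
 & & H_{n-1}(X \setminus Y; L_\bullet).
\end{array}
\tag{4.19}
\]
Consider the commutative triangle
\[
\begin{array}{ccc}
LP_{n-q-q'}(\Psi) & \longrightarrow & L_n(\pi_1(X \setminus Y) \to \pi_1(X)) \\
 & \searrow & \downarrow \\
 & & L_{n-1}(\pi_1(X \setminus Y))
\end{array}
\tag{4.20}
\]
which follows from the commutative diagram obtained in the proof of Theorem 4.4. The results of [20, Proposition 7.2.6] provide the map from (4.19) to (4.20). On the spectra level, the cofibers of this map give a homotopy commutative triangle of spectra of structure sets
\[
\begin{array}{ccc}
S(Y,Z,\eta) & \longrightarrow & \Sigma^{-q} S(X, X \setminus Y) \\
 & \searrow & \downarrow \\
 & & \Sigma^{-q+1} S(X \setminus Y).
\end{array}
\tag{4.21}
\]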
By [22] diagram (4.21) induces a map of cofibration sequences
\[
\begin{array}{ccccc}
S(Y,Z,\eta) & \to & \Sigma^{-q} S(X, X \setminus Y) & \to & \Sigma^{q'+1} LSP \\
{\scriptstyle =} \downarrow & & \downarrow & & \downarrow \\
S(Y,Z,\eta) & \to & \Sigma^{-q+1} S(X \setminus Y) & \to & \Sigma^{-q+1} S(X,Y,Z),
\end{array}
\]
where the right square is a pull-back square of spectra. The homotopy long exact sequences of this square provide the commutative braid of exact sequences in (4.15).

In a similar way the map from (4.19) to (4.20) provides a pull-back square
\[
\begin{array}{ccc}
\Sigma^{q'} LP(\Psi) & \to & \Sigma^{-q} L(\pi_1(X \setminus Y) \to \pi_1(X)) \\
\downarrow & & \downarrow \\
S(Y,Z,\eta) & \to & \Sigma^{-q} S(X, X \setminus Y)
\end{array}
\]
where the cofibers of the vertical maps are homotopy equivalent to the spectrum $Y_+ \wedge L_\bullet$. From this the braid of exact sequences (4.16) follows.

The diagram (4.17) is obtained in a similar way if we consider on the spectra level the homotopy commutative triangle of the cofibers of the map from $H_n(X, L_\bullet)$ to the triangle of natural forgetful maps
\[
\begin{array}{ccc}
LT_{n-q-q'} & \longrightarrow & LP_{n-q-q'}(\Phi) \\
 & \searrow & \downarrow \\
 & & L_n(\pi_1(X)).
\end{array}
\tag{4.22}
\]
We obtain diagram (4.18) in a similar way to the construction of diagram (4.17). To do this we have to consider the commutative triangle
\[
\begin{array}{ccc}
LT_{n-q-q'} & \longrightarrow & LP_{n-q}(F) \\
 & \searrow & \downarrow \\
 & & L_n(\pi_1(X)).
\end{array}
\]
So the proof is complete.
5 Examples

Now we give examples of how to compute some $LSP$-groups. Consider the triple
\[ (Z \subset Y \subset X) = (\mathbb{R}P^n \subset \mathbb{R}P^{n+1} \subset \mathbb{R}P^{n+2}) \]
of real projective spaces with $n \geq 5$. The orientation homomorphism $w : \pi_1(\mathbb{R}P^k) = \mathbb{Z}/2 \to \{\pm 1\}$ is trivial for $k$ odd and nontrivial for $k$ even. We have the following table for surgery obstruction groups (see [23, 9])
\[
\begin{array}{l|cccc}
 & n=0 & n=1 & n=2 & n=3 \\ \hline
L_n(1) & \mathbb{Z} & 0 & \mathbb{Z}/2 & 0 \\
L_n(\mathbb{Z}/2^+) & \mathbb{Z} \oplus \mathbb{Z} & 0 & \mathbb{Z}/2 & \mathbb{Z}/2 \\
L_n(\mathbb{Z}/2^-) & \mathbb{Z}/2 & 0 & \mathbb{Z}/2 & 0
\end{array}
\]
where the superscript "+" denotes the trivial orientation of the corresponding group and the superscript "−" denotes the nontrivial orientation.

We have two squares for codimension one splitting problems which appear for different pairs $\mathbb{R}P^k \subset \mathbb{R}P^{k+1}$ of the considered triple. We denote by
\[
F^{\pm} = \left( \begin{array}{ccc}
1 & \to & 1 \\
\downarrow & & \downarrow \\
\mathbb{Z}/2^{\mp} & \to & \mathbb{Z}/2^{\pm}
\end{array} \right)
\]
the oriented square $F$ of fundamental groups in accordance with the orientation $\{\pm\}$ of the ambient manifold. We have the following isomorphisms (see [9, p. 15] and [23])
\[ LS_n(F^+) = LN_n(1 \to \mathbb{Z}/2^+) = BL_{n+1}(+) = L_{n+2}(1) \]
and
\[ LS_n(F^-) = LN_n(1 \to \mathbb{Z}/2^-) = BL_{n+1}(-) = L_n(1). \]
Now we recall intermediate computations of the obstruction groups $LP_*(F^{\pm})$ and $LT_*(X, Y, Z)$ from [18].
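The table and the two isomorphisms above lend themselves to a small lookup, which can be convenient when chasing the diagrams that follow. The Python sketch below is only an illustration: the names `L_GROUPS`, `L` and `LS`, as well as the encoding of groups as strings, are hypothetical choices made here, and the index $n$ is reduced mod 4 on the assumption that the listed groups are 4-periodic in $n$.

```python
# Surgery obstruction groups from the table above, indexed by n mod 4.
# Groups are encoded as plain strings (an illustrative choice, not the paper's).
L_GROUPS = {
    "1":    ["Z",   "0", "Z/2", "0"],    # trivial group
    "Z/2+": ["Z+Z", "0", "Z/2", "Z/2"],  # Z/2 with trivial orientation
    "Z/2-": ["Z/2", "0", "Z/2", "0"],    # Z/2 with nontrivial orientation
}


def L(n, group):
    """Return L_n(group), assuming 4-periodicity in n."""
    return L_GROUPS[group][n % 4]


def LS(n, sign):
    """Return LS_n(F^sign) via LS_n(F^+) = L_{n+2}(1) and LS_n(F^-) = L_n(1)."""
    if sign == "+":
        return L(n + 2, "1")
    if sign == "-":
        return L(n, "1")
    raise ValueError("sign must be '+' or '-'")


if __name__ == "__main__":
    for n in range(4):
        print(n, LS(n, "+"), LS(n, "-"))
```

Encoding the table once and deriving $LS_n(F^{\pm})$ from it keeps the index shifts in one place, which is where slips tend to occur in hand computations.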
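The computation of $LP_*$-groups for a pair $Y \subset X$ is based on the following braid of exact sequences [23]
\[
\begin{array}{ccccccccccc}
\to & L_{n+1}(C) & & \longrightarrow & & L_{n+1}(D) & & \stackrel{\partial}{\longrightarrow} & & LS_{n-q}(F) & \to \\
 & & \searrow & & \nearrow & & \searrow & & \nearrow & & \\
 & & & LP_{n-q+1}(F) & & & & L_{n+1}(C \to D) & & & \\
 & & \nearrow & & \searrow & & \nearrow & & \searrow & & \\
\to & LS_{n-q+1}(F) & & \longrightarrow & & L_{n-q+1}(B) & & \longrightarrow & & L_n(C) & \to
\end{array}
\tag{4.23}
\]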
where A = π1 (∂U ), B = π1 (Y ), C = π1 (X \ Y ), and D = π1 (X).
In the case of the squares $F^{\pm}$ we have $q = 1$, and the natural map that forgets the ambient manifold, $LS_n(F^{\pm}) \to L_n(\mathbb{Z}/2^{\mp})$, coincides with the map $l_n : BL_n(\pm) \to L_{n-1}(\mathbb{Z}/2^{\mp})$ which is described in [9, p. 35]. Using this result and a diagram chase in diagram (4.23) we obtain the surgery obstruction groups (see also [16])
\[ LP_n(F^+) = LP_{n-1}(F^-) = \mathbb{Z}/2,\ \mathbb{Z}/2,\ \mathbb{Z}/2,\ \mathbb{Z} \]
for $n = 0, 1, 2, 3 \pmod 4$, respectively. Now a diagram chase in diagram (4.6) provides the following results.

Proposition 5.1 [18] Let $M^{n-k}$ be a closed simply connected topological manifold. For the triple of manifolds
\[ (Z^n \subset Y^{n+1} \subset X^{n+2}) = (M^{n-k} \times \mathbb{R}P^k \subset M^{n-k} \times \mathbb{R}P^{k+1} \subset M^{n-k} \times \mathbb{R}P^{k+2}) \]
with $n \geq 5$ we have the following results.

For $k$ odd the groups $LT_n$ are isomorphic to $\mathbb{Z} \oplus \mathbb{Z}/2$, $\mathbb{Z}/2$, $\mathbb{Z} \oplus \mathbb{Z}/2$, $\mathbb{Z}/2$ for $n = 0, 1, 2, 3 \pmod 4$, respectively.

For $k$ even $LT_0 \cong \mathbb{Z}/2 \oplus \mathbb{Z}/2$ and $LT_1 \cong \mathbb{Z}/2$. The groups $LT_3$ and $LT_2$ fit into the exact sequence
\[ 0 \to LT_3 \to \mathbb{Z} \to \mathbb{Z} \to LT_2 \to \mathbb{Z}/2 \to 0. \]
Now we apply these results to compute the $LSP_*$-groups in the considered cases.

Theorem 5.2 Under the assumptions of Proposition 5.1 we have the following:
For $k$ odd the groups $LSP_n$ are isomorphic to $\mathbb{Z}$, $\mathbb{Z}$, $\mathbb{Z}/2$, $\mathbb{Z}/2$ for $n = 0, 1, 2, 3 \pmod 4$, respectively.

For $k$ even we have isomorphisms $LSP_0 \cong LSP_1 \cong \mathbb{Z}/2$. The groups $LSP_3$ and $LSP_2$ fit into the exact sequence
\[ 0 \to LSP_3 \to \mathbb{Z} \to \mathbb{Z} \to LSP_2 \to 0. \]

Proof: Consider the case when $k$ is odd. From diagram (4.6) in the considered case we conclude that all maps $LT_n \to LP_{n+1}(F^+)$ are epimorphisms (see also [18]). Now it is easy to describe the maps $LP_n(F^+) \to L_{n+1}(\mathbb{Z}/2^+)$ from diagram (4.23). For $n \equiv 1 \pmod 4$ and $n \equiv 2 \pmod 4$ these maps are isomorphisms $\mathbb{Z}/2 \to \mathbb{Z}/2$, as follows by considering the exact sequences lying in diagram (4.23). For $n \equiv 0 \pmod 4$ the map is trivial since the group $L_1(\mathbb{Z}/2^+)$ is trivial. The map $\mathbb{Z} = LP_3(F^+) \to L_0(\mathbb{Z}/2^+) = \mathbb{Z} \oplus \mathbb{Z}$ is an inclusion onto a direct summand. The image of this map coincides with the image of the map $L_0(1) \to L_0(\mathbb{Z}/2^+)$ that is induced by the inclusion $1 \to \mathbb{Z}/2^+$. This follows from the commutative triangle
\[
\begin{array}{ccc}
 & LP_3(F^+) = \mathbb{Z} & \\
{\scriptstyle \cong} \nearrow & & \searrow \\
\mathbb{Z} = L_0(1) & \stackrel{\mathrm{mono}}{\longrightarrow} & L_0(\mathbb{Z}/2^+) = \mathbb{Z} \oplus \mathbb{Z}
\end{array}
\]
which lies in diagram (4.23). From diagram (4.7) we obtain an exact sequence
\[ \cdots \to LT_n \stackrel{\tau}{\longrightarrow} L_{n+2}(\mathbb{Z}/2^+) \to LSP_{n-1} \to LT_{n-1} \to \cdots \tag{4.24} \]
The map $\tau$ in (4.24) is a composition $LT_n \to LP_{n+1}(F^+) \to L_{n+2}(\mathbb{Z}/2^+)$ of maps that we already know.
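From this we obtain that $\tau$ is trivial for $n = 3$, an isomorphism $\mathbb{Z}/2 \to \mathbb{Z}/2$ for $n = 1$, an epimorphism $\mathbb{Z} \oplus \mathbb{Z}/2 \to \mathbb{Z}/2$ with kernel $\mathbb{Z}$ for $n = 0$, and a map $\mathbb{Z} \oplus \mathbb{Z}/2 \to \mathbb{Z} \oplus \mathbb{Z}$ with kernel $\mathbb{Z}/2$ and cokernel $\mathbb{Z}$ for $n = 2$. Writing $\tau_n$ for the map $\tau$ with source $LT_n$, exactness of (4.24) gives short exact sequences $0 \to \operatorname{coker} \tau_n \to LSP_{n-1} \to \ker \tau_{n-1} \to 0$, and in each of the four cases one of the two outer terms vanishes. This gives the result of the theorem for $k$ odd. The case of $k$ even is treated in a similar way. □

Dr. Rolando Jimenez, Instituto de Matemáticas, UNAM, Unidad Cuernavaca, Av. Universidad S/N, Col. Lomas de Chamilpa, 62210 Cuernavaca, Morelos, México. rolando@aluxe.matcuer.unam.mx

Prof. Yuri V. Muranov, Vitebsk State University, Moskovskii pr. 33, 210026 Vitebsk, Belarus. ymuranov@mail.ru; ymuranov@imk.edu.by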
References

[1] Bak A.; Muranov Yu. V., Splitting along submanifolds and L-spectra, J. Math. Sci. (N. Y.) 123 No. 4 (2004), 4169–4184.
[2] Browder W.; Livesay G. R., Fixed point free involutions on homotopy spheres, Bull. Amer. Math. Soc. 73 (1967), 242–245.
[3] Cappell S. E.; Shaneson J. L., Pseudo-free actions. I., Lecture Notes in Math. 763 (1979), 395–447.
[4] Cohen M. M., A Course in Simple-Homotopy Theory, Graduate Texts in Mathematics 10, Springer–Verlag, New York, 1973.
[5] Hambleton I., Projective surgery obstructions on closed manifolds, Lecture Notes in Math. 967 (1982), 101–131.
[6] Hambleton I.; Kharshiladze A. F., A spectral sequence in surgery theory, Sb. Mat. 183 (1992), 3–14.
[7] Hambleton I.; Pedersen E., Topological Equivalences of Linear Representations for Cyclic Groups, MPI, Preprint, 1997.
[8] Hambleton I.; Ranicki A. A.; Taylor L., Round L-theory, J. Pure Appl. Algebra 47 (1987), 131–154.
[9] López de Medrano S., Involutions on Manifolds, Springer–Verlag, New York, 1971.
[10] Lück W.; Ranicki A. A., Surgery obstructions of fibre bundles, J. Pure Appl. Algebra 81 No. 2 (1992), 139–189.
[11] Lück W.; Ranicki A. A., Surgery transfer, Lecture Notes in Math. 1361 (1988), 167–246.
[12] Malešič J.; Muranov Yu. V.; Repovš D., Splitting obstruction groups in codimension 2, Mat. Zametki 69 (2001), 52–73.
[13] Muranov Yu. V., Splitting obstruction groups and quadratic extension of antistructures, Izv. Math. 59 No. 6 (1995), 1207–1232.
[14] Muranov Yu. V., Splitting problem, Proc. Steklov Inst. Math. 212 (1996), 123–146.
[15] Muranov Yu. V.; Jimenez R., Transfer maps for triples of manifolds, Mat. Zametki, In print.
[16] Muranov Yu. V.; Kharshiladze A. F., Browder–Livesay groups of Abelian 2-groups, Sb. Mat. 181 (1990), 1061–1098.
[17] Muranov Yu. V.; Repovš D., Groups of obstructions to surgery and splitting for a manifold pair, Sb. Math. 188 No. 3 (1997), 449–463.
[18] Muranov Yu. V.; Repovš D.; Spaggiari F., Surgery on triples of manifolds, Sb. Mat. 8 (2003), 1251–1271.
[19] Ranicki A. A., Algebraic L-theory and Topological Manifolds, Cambridge Tracts in Math., Cambridge University Press, Cambridge, 1992.
[20] Ranicki A. A., Exact Sequences in the Algebraic Theory of Surgery, Math. Notes 26, Princeton Univ. Press, Princeton, N. J., 1981.
[21] Ranicki A. A., The total surgery obstruction, Lecture Notes in Math. 763 (1979), 275–316.
[22] Switzer R., Algebraic Topology–Homotopy and Homology, Grundlehren Math. Wiss. 212, Springer, New York, 1975.
[23] Wall C. T. C., Surgery on Compact Manifolds, Academic Press, London–New York, 1970. (Second Edition, Mathematical Surveys and Monographs 69, A. A. Ranicki Editor, Amer. Math. Soc., Providence, R. I., 1999.)
Morfismos, Vol. 8, No. 2, 2004, pp. 27–50
On information measures and prior distributions: a synthesis

Francisco Venegas-Martínez
Abstract This paper suggests a new approach to reconciling, in a systematic way, all inferential methods that maximize a specific criterion functional to produce non-informative and informative priors. In particular, Good’s (1968) Minimax Evidence Priors (MEP), Zellner’s (1971) Maximal Data Information Priors (MDIP) and Bernardo’s (1979) Reference Priors (RP) are seen as special cases of maximizing a more general criterion functional. In a unifying approach Good-Bernardo-Zellner’s priors are introduced and applied to a number of Bayesian inference problems, including the Kalman filter and Normal linear model. Moreover, the paper focuses, under plausible conditions, on the existence and uniqueness of the solutions of the derived optimization problems.
2000 Mathematics Subject Classification: 62F15, 49K20. Keywords and phrases: information measures, Bayesian inference.
1 Introduction

The distinctive task in Bayesian inference of deriving priors, in such a way that the inferential content of the data is minimally affected in the posterior, has been of great interest for more than two centuries since the early work of Bayes (1763). More current approaches to this problem, based on the maximization of a specific criterion functional, have been suggested by Good (1968), Zellner (1971) and Bernardo (1979), among others. It is also important to mention that recent literature has included inference procedures to provide a posterior without having a prior, like the Bayesian method of moments (BMOM) introduced by Zellner (1996) and (1998).
In Good's (1968) principle of maximum invariantized negative cross-entropy, the minimax evidence method of deriving priors was presented for the first time. In this approach the initial density is taken as the square root of Fisher's information. Zellner's (1971) book introduced a method to obtain priors through the maximization of the total information about the parameters provided by independent replications of an experiment (prior average information in the data minus the information in the prior). In Bernardo (1979) a procedure was proposed to produce reference priors by maximizing the expected information about the parameters provided by independent replications of an experiment (average information in the posterior minus the information in the prior).

All of the above methods have comparative and absolute advantages in several respects and have been applied to a large number of inference problems:

(i) While Zellner's method is based on an exact finite-sample criterion functional, Good's approach uses a limiting criterion functional, and Bernardo's procedure relies on asymptotic results. In Bernardo's proposal a reference prior (posterior) is defined as the limit of a sequence of priors (posteriors) that maximize finite-sample criteria. In a pragmatic approach in which results are most important, many reference prior algorithms have been developed; see, for instance, Berger, Bernardo and Mendoza (1989), Berger and Bernardo (1989), (1992a), (1992b), Bernardo and Smith (1994, ch. 5), and Bernardo and Ramón (1997).

(ii) The criterion functional used by Bernardo is a cross-entropy, which satisfies a number of remarkable properties, in particular, invariance with respect to one-to-one transformations of the parameters (Lindley 1956). In contrast, the total information functional employed by Zellner is invariant only for the location-scale family and under linear transformations of the parameters. To generate invariance under other relevant transformations, not necessarily one-to-one, side conditions could be needed, as suggested by Zellner (1971).

(iii) These methods have been tested by seeing how well they perform in particular examples. The evaluation is often based on contrasting the derived priors with Jeffreys' (1961) priors, which are usually improper. Even though improper priors can be associated with unbounded measures consistent with Renyi's (1970) axioms
on probability measures, some technical difficulties remain, see: Box and Tiao (1973), p. 314; Akaike (1978), p. 58; and Berger and Bernardo (1992a), p. 37. It is also important to mention that Jeffreys’ priors can lead to singularities producing inadequate results at certain values of the parameters; see Jeffreys (1967, p. 359). Of course, if MEP, MDIP, and RP priors were to be used to contrast the performance of other priors, the former priors could also produce unsatisfactory results under certain circumstances. In this paper, we attempt to reconcile all inferential methods that maximize a criterion functional to produce non-informative and informative priors. In our general approach, Good’s Minimax Evidence Priors (1968 and 1969), Zellner’s Maximal Data Information Priors (1971, 1977, 1991, 1993, 1995, 1996a and 1996b) and Bernardo’s Reference Priors (1979 and 1997) are seen as special cases of maximizing a more general indexed criterion functional. Thus, properties of the derived priors will depend on the choice of indexes from a wide range of possibilities, instead of on a few personal points of view with ad hoc modifications. In the spirit of Akaike (1978) and Smith (1979), we can say that this will look more like Mathematics than Psychology–without underestimating the importance of the latter in the Bayesian framework. This unified approach will enable us to explore a vast range of possibilities for constructing priors. It is worthwhile to note that our general method extends in a natural way Soofi’s (1994) pyramid by adding more vertices and including their convex hull. In any event, a good choice will depend on the specific characteristics of the problem we are concerned with. Needless to say, the chosen method should also provide good predictions. This work is organized as follows. In section 2, we will introduce an indexed family of information functionals. 
In section 3, on the basis of asymptotic normality, we will state a relationship between Bernardo’s (1979) criterion functional and some members of the indexed family. In section 4, we will study a Bayesian inference problem associated with convex combinations of relevant members of the proposed indexed family. Here, we will introduce Good-Bernardo-Zellner’s priors as well as their controlled versions as solutions of maximizing discounted entropy. We will pay special attention to the existence and uniqueness of the solution of the corresponding optimization problems. In section 5, we will study Good-Bernardo-Zellner’s priors as Kalman Filtering priors. In section 6, we examine the relationship between Good-Bernardo-Zellner’s
priors and the Normal linear model. Finally, in section 7, we will draw conclusions, acknowledge limitations, and make suggestions for further research.
2 An indexed family of information functionals
In this section, we define an indexed family of information functionals and study some distinguished members. For the sake of simplicity, we will remain in the single parameter case. The extension to the multidimensional parameter case will lead to conceptual complications. This is not surprising when dealing with information measures and priors; see Jeffreys (1961), Zellner (1971), Box and Tiao (1973), and Berger and Bernardo (1992a).

Suppose that we wish to make inferences about an unknown parameter $\theta \in \Theta \subseteq \mathbb{R}$ of a distribution $P_\theta$, from which there is available an observation, say, $X$. Assume that $P_\theta$ has density $f(x|\theta)$ (Radon-Nikodym derivative) with respect to some fixed dominating $\sigma$-finite measure $\lambda$ on $\mathbb{R}$ for all $\theta \in \Theta \subseteq \mathbb{R}$, that is, $dP_\theta/d\lambda = f(x|\theta)$ for all $\theta \in \Theta \subseteq \mathbb{R}$, thus $P_\theta(A) = \int_A f(x|\theta)\,d\lambda(x)$ for all Borel sets $A \subset \mathbb{R}$.

The Bayesian approach is to assume that there is a prior density, $\pi(\theta)$, describing initial knowledge about the likelihood of the values of the parameter, $\theta$. We will assume that $\pi(\theta)$ is a density with respect to some $\sigma$-finite measure $\mu$ on $\mathbb{R}$. Once a prior distribution, $\pi(\theta)$, has been prescribed, the information provided by the data, $x$, about the parameter is used to modify the initial knowledge, as expressed in $\pi(\theta)$, via Bayes' theorem to obtain a posterior distribution of $\theta$, namely, $f(\theta|x) \propto f(x|\theta)\pi(\theta)$ for every $x \in \mathbb{R}$ (using $f$ generically to represent densities). The normalized posterior distribution is then used to make inferences about $\theta$.

Let us define an infinite system of nested functionals:
\[
V_{\gamma,\alpha,\delta}(\pi) = \frac{1}{1-\gamma} \int \pi(\theta)\, G(I(\theta), F(\theta), \gamma, \alpha, \delta)\, d\mu(\theta) \tag{1}
\]
where
\[
G(I(\theta), F(\theta), \gamma, \alpha, \delta) = \log \left( \frac{\exp\left\{ [F(\theta)/I(\theta)]^{1-\delta}\, [I(\theta)]^{\frac{1-\gamma}{1+\alpha}} - \delta [I(\theta)]^{1-\alpha} \right\}}{\pi(\theta)^{1-\gamma}} \right),
\]
$0 \leq \gamma < 1$, $\alpha \in \{0, 1\}$, $\delta \in \{0, 1\}$, and
\[
I(\theta) = \int \left[ \frac{\partial}{\partial\theta} \log f(x|\theta) \right]^2 f(x|\theta)\, d\lambda(x) \tag{2}
\]
is Fisher's information about $\theta$ provided by an observation $X$ with density $f(x|\theta)$, and
\[
F(\theta) = \int f(x|\theta) \log f(x|\theta)\, d\lambda(x) \tag{3}
\]
is the negative Shannon's information of $f(x|\theta)$, provided $I(\theta)$ and $F(\theta)$ exist. In the case that $n$ independent observations of $X$ are drawn from $P_\theta$, say, $(X_1, X_2, \ldots, X_n)$, then $I(\theta)$ and $F(\theta)$ will still stand for the average Fisher's information and the average negative Shannon's information of $f(x|\theta)$, respectively.

It is not unusual to deal with indexed functionals in inference problems about a distribution, as in Good (1968). It is worthwhile pointing out that for each triad $(\gamma, \alpha, \delta)$ taking values in $0 \leq \gamma < 1$, $\alpha \in \{0, 1\}$, $\delta \in \{0, 1\}$, $V_{\gamma,\alpha,\delta}(\pi)$ is a criterion functional that can be used to derive a prior $\pi(\theta)$, $\theta \in \Theta$, belonging to a feasible set $\mathcal{C}$. Usually, $\mathcal{C}$ is defined by constraints in terms of potential values of $\theta$.

Note now that for the location parameter family $f(x|\theta) = f(x - \theta)$, $\theta \in \mathbb{R}$, with the properties
\[
\int [f'(x)]^2 / f(x)\, d\lambda(x) < \infty \quad \text{and} \quad \int f(x) \log f(x)\, d\lambda(x) < \infty,
\]
where $\lambda = \mu$ stands for the Lebesgue measure, we have that both $I(\theta)$ and $F(\theta)$ are constant. Observe also that the scale parameter family $f(x|\theta) = (1/\theta) f(x/\theta)$, $\theta > 0$, with the above properties, satisfies the following relationship:
\[
F(\theta) = \tfrac{1}{2} \log I(\theta) + \text{constant}. \tag{4}
\]
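The two remarks above can be checked numerically on a concrete family. The following Python sketch is only an illustration under stated assumptions: it takes the standard normal as the base density $f$ (a choice made here, not in the text), approximates the score in (2) by a central difference, truncates the integrals at ten standard deviations, and uses a hand-written Simpson rule; all function names are hypothetical.

```python
import math


def simpson(g, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fisher_information(density, theta, lo, hi, eps=1e-5):
    """I(theta) as in (2); the score is a central difference in theta."""
    def integrand(x):
        score = (math.log(density(x, theta + eps))
                 - math.log(density(x, theta - eps))) / (2.0 * eps)
        return score * score * density(x, theta)
    return simpson(integrand, lo, hi)


def negative_entropy(density, theta, lo, hi):
    """F(theta) as in (3): the integral of f log f."""
    def integrand(x):
        f = density(x, theta)
        return f * math.log(f)
    return simpson(integrand, lo, hi)


def location(x, theta):
    """Location family f(x - theta) with standard normal f (assumed)."""
    return normal_pdf(x, theta, 1.0)


def scale(x, theta):
    """Scale family (1/theta) f(x/theta) with standard normal f (assumed)."""
    return normal_pdf(x, 0.0, theta)


def location_values(theta):
    """Return (I(theta), F(theta)) for the location family."""
    lo, hi = theta - 10.0, theta + 10.0
    return (fisher_information(location, theta, lo, hi),
            negative_entropy(location, theta, lo, hi))


def scale_gap(theta):
    """F(theta) - (1/2) log I(theta), which (4) says does not depend on theta."""
    lo, hi = -10.0 * theta, 10.0 * theta
    i_theta = fisher_information(scale, theta, lo, hi)
    f_theta = negative_entropy(scale, theta, lo, hi)
    return f_theta - 0.5 * math.log(i_theta)


if __name__ == "__main__":
    for t in (-1.0, 0.5, 3.0):
        print("location", t, location_values(t))
    for t in (0.5, 1.0, 2.0):
        print("scale", t, scale_gap(t))
```

For the location family the printed pairs should not change with $\theta$, and for the scale family the printed gap should be the same for every $\theta$, in line with (4).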
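The indexed family with which we will be concerned is given by
\[
\mathcal{A} = \mathrm{conv}\left[\, \overline{\{V_{\gamma,\alpha,\delta}(\pi)\}} \,\right] = \text{convex hull of the closure of the family } \{V_{\gamma,\alpha,\delta}(\pi)\}.
\]
We readily identify a number of distinguished members of $\mathcal{A}$: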
(i) Criterion for Maximum Entropy Priors (MAXENTP):
\[
V_{0,0,1}(\pi) = -\int \pi(\theta) \log \pi(\theta)\, d\mu(\theta),
\]
which is just Shannon's information measure of a density $\pi(\theta)$, or Jaynes' (1957) criterion functional to derive maximum entropy priors. Notice also that (3) can be rewritten in a simpler way as $F(\theta) = -V_{0,0,1}(f(x|\theta))$.

(ii) Criterion for Minimax Evidence Priors (MEP):
\[
V_{1,1,1}(\pi) \stackrel{\mathrm{def}}{=} \lim_{\gamma \to 1} V_{\gamma,1,1}(\pi) = -\int \pi(\theta) \log \frac{\pi(\theta)}{p(\theta)}\, d\mu(\theta) - \log C, \tag{5}
\]
which is Good's invariantized negative cross-entropy, taking as initial density $p(\theta) = C[I(\theta)]^{1/2}$ with $C = \{\int [I(\theta)]^{1/2}\, d\mu(\theta)\}^{-1}$, provided that $\int [I(\theta)]^{1/2}\, d\mu(\theta) < \infty$. We can also write (5) as
\[
V_{1,1,1}(\pi) - V_{0,0,1}(\pi) = \int \pi(\theta) \log [I(\theta)]^{1/2}\, d\mu(\theta). \tag{6}
\]

(iii) Criterion for Maximal Data Information Priors (MDIP):
\[
V_{0,0,0}(\pi) = \int\!\!\int f(x) f(\theta|x) \log \frac{\ell(\theta|x)}{\pi(\theta)}\, d\mu(\theta)\, d\lambda(x), \tag{7}
\]
which is Zellner's criterion functional in his MDIP approach. Here, as usual,
\[
f(\theta|x) = \frac{f(x|\theta)\pi(\theta)}{f(x)}, \qquad f(x) = \int f(x|\theta)\pi(\theta)\, d\mu(\theta),
\]
and $\ell(\theta|x) = f(x|\theta)$ is the likelihood function. An alternative formulation of (7), which is often useful, is given by
\[
V_{0,0,0}(\pi) - V_{0,0,1}(\pi) = \int \pi(\theta) F(\theta)\, d\mu(\theta). \tag{8}
\]

Some members of $\mathcal{A}$ define new criterion functionals in which the information provided by the sampling model, $I(\theta)$, plays a role:
(iv) Criterion for Maximal Modified Data Information Priors (MMDIP):
\[
V_{0,1,0}(\pi) = \int\!\!\int f(x) f(\theta|x) \log \frac{[\ell(\theta|x)]\,[I(\theta)]^{1/2}}{\pi(\theta)}\, d\mu(\theta)\, d\lambda(x), \tag{9}
\]
which is the prior average information in the data modified by Fisher's information minus the information in the prior. Note that when $I(\theta)$ is constant, (9) reduces to Zellner's criterion functional (up to an additive constant).

(v) Criterion for Maximal Fisher Information Priors (MFIP):
\[
V_{0,1,1}(\pi) = -\int \pi(\theta) \log \frac{\pi(\theta)}{\exp\{[I(\theta)]^{1/2}\}}\, d\mu(\theta) - 1, \tag{10}
\]
which is the prior average Fisher's information minus the information in the prior.
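The identities (6) and (8) can be illustrated on a simple model. The Python sketch below is a minimal check under assumptions introduced here for illustration only: a single Bernoulli observation, $f(x|\theta) = \theta^x(1-\theta)^{1-x}$, a Beta(2, 2) prior $\pi(\theta) = 6\theta(1-\theta)$, Lebesgue measure for $\mu$ and counting measure for $\lambda$. The function names are hypothetical, and the integrals are approximated with a hand-written Simpson rule.

```python
import math


def simpson(g, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def xlogx(x):
    """x log x, extended by continuity to x = 0."""
    return 0.0 if x == 0.0 else x * math.log(x)


def prior(theta):
    """Beta(2, 2) density, an illustrative choice of pi."""
    return 6.0 * theta * (1.0 - theta)


def neg_shannon(theta):
    """F(theta) of (3) for one Bernoulli observation."""
    return xlogx(theta) + xlogx(1.0 - theta)


def fisher(theta):
    """I(theta) of (2) for one Bernoulli observation, 0 < theta < 1."""
    return 1.0 / (theta * (1.0 - theta))


def v001():
    """V_{0,0,1}(pi): the entropy of the prior."""
    return simpson(lambda t: -xlogx(prior(t)), 0.0, 1.0)


def v000_direct():
    """V_{0,0,0}(pi) from (7), using f(x) f(theta|x) = f(x|theta) pi(theta)."""
    def integrand(t):
        p = prior(t)
        if p == 0.0:
            return 0.0
        total = 0.0
        for lik in (1.0 - t, t):  # f(0|theta) and f(1|theta)
            if lik > 0.0:
                total += lik * p * (math.log(lik) - math.log(p))
        return total
    return simpson(integrand, 0.0, 1.0)


def expected_neg_shannon():
    """Right-hand side of (8): the prior average of F."""
    return simpson(lambda t: prior(t) * neg_shannon(t), 0.0, 1.0)


def expected_half_log_fisher():
    """Right-hand side of (6): the prior average of log I^{1/2}."""
    def integrand(t):
        p = prior(t)
        return 0.0 if p == 0.0 else p * 0.5 * math.log(fisher(t))
    return simpson(integrand, 0.0, 1.0)


if __name__ == "__main__":
    print("V000 via (7):", v000_direct())
    print("V000 via (8):", v001() + expected_neg_shannon())
    print("V111 via (6):", v001() + expected_half_log_fisher())
```

Computing $V_{0,0,0}$ once from the double-integral definition (7) and once from (8) gives two independent routes to the same number, so agreement of the first two printed values is a quick sanity check of the algebra.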
3 Revisiting Bernardo's reference priors

The maximization of Bernardo's (1979) criterion is usually a difficult problem to deal with. In order to get a simpler alternative procedure under specific conditions, we will derive a useful asymptotic approximation between Bernardo's criterion functional (or Lindley's information measure, 1956) and some members of the class $\mathcal{A}$. As stated in Bernardo (1979), the concept of reference prior is very general. However, in order to keep the analysis tractable, we will restrict ourselves to the continuous one-dimensional parameter case.

Suppose that there are available $n$ independent observations, say, $(X_1, X_2, \ldots, X_n)$, of a distribution $P_\theta$, $\theta \in \Theta \subseteq \mathbb{R}$. Accordingly, the random vector $(X_1, X_2, \ldots, X_n)$ has density
\[
d\mathbf{P}_\theta / d\nu = f(\xi|\theta) = \prod_{k=1}^{n} f(x_k|\theta),
\]
for all $\xi = (x_1, x_2, \ldots, x_n)$ and all $\theta \in \Theta \subseteq \mathbb{R}$, where
\[
\mathbf{P}_\theta = \underbrace{P_\theta \otimes P_\theta \otimes \cdots \otimes P_\theta}_{n} \quad \text{and} \quad \nu = \underbrace{\lambda \otimes \lambda \otimes \cdots \otimes \lambda}_{n}.
\]
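Following Lindley (1956), a measure of the expected information about $\theta$ of a sampling model $f(x|\theta)$ provided by a random sample of size $n$, when the prior distribution of $\theta$ is $\pi(\theta)$, is defined to be
\[
L^{(n)}(\pi) = \int\!\!\int f(\xi) f(\theta|\xi) \log \frac{f(\theta|\xi)}{\pi(\theta)}\, d\mu(\theta)\, d\nu(\xi). \tag{11}
\]
In order to obtain an asymptotic approximation of (11) in terms of $V_{1,1,1}$ and $V_{0,0,1}$, we state a limit theorem which justifies the passage to the limit under the integral signs in (11). The theorem rules out the possibility that the essentials of the statistical model, $f(\xi|\theta)$, change when samples grow in size. Let us rewrite (11) as
\[
L^{(n)}(\pi) = V_{0,0,1}(\pi) + \log \sqrt{n} - \int\!\!\int \log \left\{ \int T_n(\omega) W_n(\omega)\, d\mu(\omega) \right\} f(\xi|\theta)\pi(\theta)\, d\nu(\xi)\, d\mu(\theta), \tag{12}
\]
where
\[
T_n(\omega) = \frac{f(X_1, X_2, \ldots, X_n \,|\, \theta + \frac{\omega}{\sqrt{n}})}{f(X_1, X_2, \ldots, X_n \,|\, \theta)} \tag{13}
\]
and
\[
W_n(\omega) = \frac{\pi(\theta + \frac{\omega}{\sqrt{n}})}{\pi(\theta)}. \tag{14}
\]
Throughout the paper, both $\lambda$ and $\mu$ will stand for the Lebesgue measure on $\mathbb{R}$. Also, we will assume that all densities involved are Lebesgue measurable in both arguments, $x$ and $\theta$.

Theorem 3.1 Assume that the following conditions hold:

(I) $\Theta$ is an open interval in $\mathbb{R}$;

(II) The function $\sqrt{f(x|\theta)}$ is absolutely continuous in $\theta$, and $\{x \mid f(x|\theta) > 0\}$ is independent of $\theta$;

(III) If $\theta, \theta' \in \Theta$, then $\theta \neq \theta'$ implies $\lambda\{x \mid f(x|\theta) \neq f(x|\theta')\} > 0$;

(IV) $\frac{\partial}{\partial\theta} \log f(x|\theta)$ exists for all $\theta \in \Theta$ and every $x$;

(V) $I(\theta)$ is a continuous and bounded function in $\Theta$;

(VI) For all $\delta > 0$, and all $\theta \in \Theta$,
\[
\int_{B_\delta(\omega/\sqrt{n})} \left[ \sqrt{f(x \,|\, \theta + \tfrac{\omega}{\sqrt{n}})} - \sqrt{f(x|\theta)} \right]^2 d\lambda(x) = o(\tfrac{1}{n}),
\]
where
\[
B_\delta(\tfrac{\omega}{\sqrt{n}}) = \left\{ x : \left| \sqrt{f(x \,|\, \theta + \tfrac{\omega}{\sqrt{n}})} - \sqrt{f(x|\theta)} \right| > \delta \sqrt{f(x|\theta)} \right\};
\]

(VII) There exist $c > 0$ and $\tau > 0$ such that
\[
\int \left| \pi(\theta + u) - \pi(\theta) \right| d\mu(\theta) \leq c|u|^{\tau};
\]

(VIII) For all $\rho > 0$,
\[
\int_{|\omega| > n^{\rho}} \left[ T_n(\omega) W_n(\omega) - T_n(\omega) \right] d\mu(\omega) \stackrel{P}{\longrightarrow} 0;
\]

(IX) The sequence of random variables $\{\log U_n\}_{n=1}^{\infty}$, where
\[
U_n = \int T_n(\omega) W_n(\omega)\, d\mu(\omega),
\]
satisfies
\[
\lim_{\varepsilon \to \infty} \sup_{n \geq 1} \int_{|\log U_n| \geq \varepsilon} |\log U_n|\, dP = 0,
\]
where
\[
P\{\xi \in A, \theta \in B\} = \int_B \pi(\theta) \int_A f(\xi|\theta)\, d\nu(\xi)\, d\mu(\theta),
\]
for all Borel sets $A \subseteq \mathbb{R}^n$ and $B \subseteq \Theta$.

Then, as $n \to \infty$,
\[
L^{(n)}(\pi) - V_{1,1,1}(\pi) = -V_{0,0,1}(\phi) + \log C\sqrt{n} + o(1), \tag{15}
\]
where $\phi(z)$ is the density of $Z \sim N(0, 1)$, and $C$ is taken as in (5).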
Some comments are in order: (I)-(IV) are standard regularity conditions, (V) states desirable properties for $I(\theta)$, (VI) is a bounded variance condition, (VII) is a smoothness condition, (VIII) is a convergence condition, and (IX) says that the sequence $\{\log U_n\}_{n=1}^{\infty}$ is uniformly integrable with respect to $P$. It can be shown that (I)-(VI) lead to
\[
T_n(\omega) \stackrel{L}{\longrightarrow} \exp\left\{ \omega \sqrt{I(\theta)}\, Z - \tfrac{1}{2} \omega^2 I(\theta) \right\}, \tag{16}
\]
where $Z \sim N(0, 1)$, and (16) along with (VII)-(IX) imply
\[
\log U_n = \log \int T_n(\omega) W_n(\omega)\, d\mu(\omega) \stackrel{L}{\longrightarrow} \log \sqrt{2\pi / I(\theta)} + \tfrac{1}{2} Z^2,
\]
from where the conclusion of the theorem follows. Notice that the right-hand side of (15) is independent of $\pi$. Thus, if conditions (I)-(IX) are fulfilled, instead of maximizing $L^{(\infty)}(\pi)$, which is usually a difficult problem, we have as an alternative procedure maximizing $V_{1,1,1}(\pi)$, which is independent of $n$. Notice that for maximization purposes the right-hand side of (15) becomes a constant. Finally, it is worthwhile to note that the location parameter family $f(x|\theta) = f(x - \theta)$, with $\sqrt{f(x)}$ absolutely continuous on $\mathbb{R}$ and $\int [f'(x)]^2 / f(x)\, d\lambda(x) < \infty$, fully satisfies the conditions of Theorem 3.1.
4 Good-Bernardo-Zellner priors

In this section we introduce Good-Bernardo-Zellner's priors as maximizers of convex combinations of relevant members of the class $\mathcal{A}$. Very often, there exist priors for which entropy becomes infinite, especially when dealing with the non-informative case. In order to overcome this difficulty, we suggest the concept of discounted entropy. We also introduce Good-Bernardo-Zellner's controlled priors as solutions of maximizing discounted entropy. We emphasize the existence and uniqueness of the solutions of the corresponding variational and optimal control problems.

Throughout this section, we will be studying a number of Bayesian inferential problems related to convex combinations of distinctive elements of $\mathcal{A}$. Let
\[
M_\phi(\pi) \stackrel{\mathrm{def}}{=} \phi V_{1,1,1}(\pi) + (1 - \phi) V_{0,0,0}(\pi), \qquad 0 \leq \phi \leq 1.
\]
Plainly, $M_\phi(\pi) \in \mathcal{A}$. To see that $M_\phi(\pi)$ is concave w.r.t. $\pi$, it is enough to observe, as in Zellner (1991), that
\[
V_{0,0,0}(\pi(\theta)) = L^{(1)}(\pi(\theta)) + V_{0,0,1}(\pi(\theta)) - V_{0,0,1}(f(x)),
\]
is a sum of concave functions w.r.t. $\pi$ (up to the constant $V_{0,0,1}(f(x))$). Since $V_{1,1,1}(\pi)$ is concave w.r.t. $\pi$, $M_\phi(\pi)$ is also concave w.r.t. $\pi$. Zellner (1996b) provides a criterion functional that agrees with $M_\phi(\pi)$ once the roles of $\phi$ and $1 - \phi$ are exchanged, given by
\[
G_\phi[\pi(\theta)] = \int \left\{ \phi F(\theta) + (1 - \phi) \log [I(\theta)]^{1/2} \right\} \pi(\theta)\, d\mu(\theta) - \int \pi(\theta) \log \pi(\theta)\, d\mu(\theta).
\]
Indeed, from (5), (6) and (8), we get
\[
\begin{array}{rcl}
G_\phi[\pi(\theta)] & = & \phi\, (V_{0,0,0} - V_{0,0,1}) + (1 - \phi)\, (V_{1,1,1} - V_{0,0,1}) + V_{0,0,1} \\
 & = & \phi V_{0,0,0} + (1 - \phi) V_{1,1,1} - V_{0,0,1} + V_{0,0,1} \\
 & = & M_{1-\phi}(\pi).
\end{array}
\]

Usually, in the absence of data, supplementary information, in terms of expectations about the parameter, comes from additional knowledge of the experiment, or from the experience of the experimenter, namely,
\[
\int a_k(\theta) \pi(\theta)\, d\mu(\theta) = \bar{a}_k, \qquad k = 1, 2, \ldots, s, \tag{17}
\]
where both the functions $a_k$ and the constants $\bar{a}_k$, $k = 1, 2, \ldots, s$, are known. Hereafter, we will assume that (17) does not lead to any contradiction about $\pi(\theta)$. We now turn to maximizing $M_\phi(\pi)$ subject to supplementary information.

Proposition 4.1 Consider the Good-Bernardo-Zellner problem:
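\[ \text{Maximize } M_\phi(\pi) \quad (\text{with respect to } \pi) \]
\[ \text{subject to } C:\ \int a_k(\theta)\pi(\theta)\,d\mu(\theta) = \bar a_k, \qquad k = 0, 1, 2, \dots, s, \qquad a_0 \equiv 1 = \bar a_0. \]
Then a necessary condition for a maximum is
\[ \pi^*_\phi(\theta) \propto [I(\theta)]^{\phi/2}\exp\Big\{(1-\phi)F(\theta) + \sum_{k=0}^{s}\lambda_k a_k(\theta)\Big\}, \tag{18} \]
where $\lambda_k$, $k = 0, 1, \dots, s$, are the Lagrange multipliers associated with the constraints $C$ (cf. Zellner 1995).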
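A numerical illustration of (18) in the simplest setting may help. The sketch below is not part of the paper. It is written in Python and takes a single Bernoulli observation with no supplementary information, so that only the normalizing multiplier matters. It assumes that $I(\theta)$ is Fisher's information, $1/(\theta(1-\theta))$ in this model, and that $F(\theta)$ denotes $\sum_x f(x|\theta)\log f(x|\theta)$; all function names are hypothetical.

```python
import math


def fisher_information(theta):
    """Fisher information of one Bernoulli observation: 1 / (theta (1 - theta))."""
    return 1.0 / (theta * (1.0 - theta))


def neg_entropy(theta):
    """F(theta) = sum over x in {0, 1} of f(x|theta) log f(x|theta) (assumed definition)."""
    return theta * math.log(theta) + (1.0 - theta) * math.log(1.0 - theta)


def unnormalized_prior(theta, phi):
    """Right-hand side of (18) when only the normalization constraint is present."""
    return fisher_information(theta) ** (phi / 2.0) * math.exp((1.0 - phi) * neg_entropy(theta))


def normalizing_constant(phi, n=100_000):
    """Midpoint rule on (0, 1); midpoints never touch the endpoints, where the
    integrand is unbounded for phi > 0."""
    h = 1.0 / n
    integral = h * sum(unnormalized_prior((i + 0.5) * h, phi) for i in range(n))
    return 1.0 / integral


zellner_constant = normalizing_constant(0.0)   # phi = 0: theta^theta (1 - theta)^(1 - theta)
jeffreys_constant = normalizing_constant(1.0)  # phi = 1: theta^(-1/2) (1 - theta)^(-1/2)
print(round(zellner_constant, 4), round(jeffreys_constant, 4))
```

Under these assumptions the constant for $\phi = 0$ is also the height of the normalized prior at the endpoints, since $\theta^\theta(1-\theta)^{1-\theta}$ tends to 1 there. Intermediate values of $\phi$ can be passed to `normalizing_constant` in the same way.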
Notice that when no supplementary information is available, $\pi^*_\phi(\theta)$ is appropriate for an unprejudiced experimenter; otherwise it will be suitable for an informed experimenter who is in favor of $C$. Observe also that $\pi^*_1(\theta)$ is Good-Bernardo's prior, and $\pi^*_0(\theta)$ is Zellner's prior. Consider the binomial distribution for a single observation, $f(x|\theta) = \theta^x(1-\theta)^{1-x}$, $0 \le \theta \le 1$. In such a case, $\pi^*_1(\theta) \propto \theta^{-1/2}(1-\theta)^{-1/2}$ and $\pi^*_0(\theta) \propto \theta^\theta(1-\theta)^{1-\theta}$ for $\theta \in [0, 1]$, which are quite different. Notice that $\pi^*_1(\theta)$ becomes infinite at $\theta = 0$ and $\theta = 1$. On the other hand, $\pi^*_0(\theta)$, once normalized, rises monotonically to 1.6186 as $\theta$ moves from $1/2$ toward $\theta = 0$ or $\theta = 1$. Yet another view in this regard (Geisser, 1993) states that when the sample size is fairly large it does not matter which prior is employed, and the uniform prior may as well be used for $\theta$.

Corollary 4.1 Consider the location and scale parameter families, $f(x|\theta) = f(x-\theta)$, $\theta \in \mathbb{R}$, and $f(x|\theta) = (1/\theta)f(x/\theta)$, $\theta > 0$, respectively, both satisfying
\[ \int \frac{[f'(x)]^2}{f(x)}\,d\lambda(x) < \infty \quad\text{and}\quad \int f(x)\log f(x)\,d\lambda(x) < \infty. \]
Then, Good-Bernardo's and Zellner's priors agree regardless of the value of $\phi \in (0, 1)$.

The proof of the above corollary for the scale parameter case follows from (4). It is important to point out that when there is no supplementary information, we require $\mu(\Theta) < \infty$. Of course, the parameter space $\Theta$ can have bounds as wide as needed to cover the region where the likelihood for $\theta$ is appreciable. Notice that Proposition 4.1 can be used recursively when there is more supplementary information to be added, say,
\[ \int a_k(\theta)\pi(\theta)\,d\mu(\theta) = \bar a_k, \qquad k = s+1, s+2, \dots, t. \tag{19} \]
In such a case, in a cross-entropy formulation (Kullback 1959), we take (18) as the initial density, and (19) as the additional information. Hence,
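\[ \begin{aligned} \pi^*_\phi(\theta) &\propto [I(\theta)]^{\phi/2}\exp\Big\{(1-\phi)F(\theta) + \sum_{k=0}^{s}\lambda_k a_k(\theta)\Big\}\exp\Big\{\sum_{k=s+1}^{t}\lambda_k a_k(\theta)\Big\} \\ &= [I(\theta)]^{\phi/2}\exp\Big\{(1-\phi)F(\theta) + \sum_{k=0}^{t}\lambda_k a_k(\theta)\Big\}. \end{aligned} \]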
To deal with the (local) uniqueness of the solution of the problem stated in Proposition 4.1, we rewrite the constraints, $C$, as a function of the multipliers in the form
\[ A(\Lambda) = \Big[\int a_k(\theta)\pi^*_\phi(\theta)\,d\mu(\theta)\Big]_{k=0}^{s} = \bar A, \]
where $\bar A^T = (\bar a_0, \bar a_1, \dots, \bar a_s)$ and $\Lambda^T = (\lambda_0, \lambda_1, \dots, \lambda_s)$ (the superindex $T$ denotes the usual vector or matrix transposition).

Proposition 4.2 Let $\pi^*_\phi(\theta)$ be as in (18), and suppose that $a_k$, $k = 0, 1, \dots, s$, are linearly independent continuous functions in $L^2[\Theta, \pi^*_\phi d\mu]$ (the space of all $\pi^*_\phi d\mu$-measurable functions $a(\theta)$ defined on $\Theta$ such that $|a(\theta)|^2$ is $\pi^*_\phi d\mu$-integrable). Suppose that $A(\Lambda)$ is defined on an open set $\Delta \subset \mathbb{R}^{s+1}$, and let $\Lambda_o$ be a solution of $A(\Lambda) = \bar A$ for a fixed value of $\bar A = \bar A_o$. Then, there exists a neighborhood of $\Lambda_o$, $N(\Lambda_o)$, in which $\Lambda_o$ is the unique solution of $A(\Lambda) = \bar A_o$.

The proof follows from the fact that $A(\Lambda)$ is continuously differentiable on $\Delta$, with nonsingular derivative
\[ A'(\Lambda) = \Big[\int a_j(\theta)a_\ell(\theta)\pi^*_\phi(\theta)\,d\mu(\theta)\Big]_{0 \le j,\ell \le s}, \]
and from a straightforward application of the inverse function theorem. From Proposition 4.1 we may derive the following necessary condition, which is useful in practical situations.

Proposition 4.3 The multipliers $\Lambda^T = (\lambda_0, \lambda_1, \dots, \lambda_s)$ appearing in (18) satisfy the following non-linear system of $s+1$ equations:
\[ 1 = \lambda_0 + \log\Big\{\int [I(\theta)]^{\phi/2}e^{(1-\phi)F(\theta)}\prod_{k=1}^{s}e^{\lambda_k a_k(\theta)}\,d\mu(\theta)\Big\}, \]
\[ 1 = \lambda_0 - \log\bar a_k + \log\Big\{\int a_k(\theta)[I(\theta)]^{\phi/2}e^{(1-\phi)F(\theta)}\prod_{u=1}^{s}e^{\lambda_u a_u(\theta)}\,d\mu(\theta)\Big\}, \qquad k = 1, 2, \dots, s. \]
Moreover,
(i) if the integral in the first equality has a closed-form solution, then the rest of the multipliers can be found from the relations
\[ \frac{\partial\lambda_0}{\partial\lambda_k} = \bar a_k, \qquad k = 1, 2, \dots, s, \]
(ii) the formula
\[ \phi V_{1,1,1}(\pi^*_\phi) + (1-\phi)\big[V_{0,0,0}(\pi^*_\phi) - 2V_{0,0,1}(\pi^*_\phi)\big] = 1 - \sum_{k=0}^{s}\lambda_k\bar a_k \]
holds for all $0 \le \phi \le 1$.

Very often, experimenters are concerned with assigning weights $\bar a_k$, $k = 1, 2, \dots, s$, to regions $A_k$, $k = 1, 2, \dots, s$, to express, according to experience, how likely it is that $\theta$ belongs to each region. The following result, based on Proposition 4.3, characterizes Good-Bernardo-Zellner's priors when such supplementary information comes in the form of quantiles, and both $I(\theta)$ and $F(\theta)$ are constant. Under such assumptions, the non-linear system of $s+1$ equations given in Proposition 4.3 is transformed into a homogeneous linear system of the same dimension as shown below:

Proposition 4.4 Suppose that the sets $A_k = (b_k, b_{k+1}]$, $k = 1, 2, \dots, s-1$, and $A_s = (b_s, b_{s+1})$ with $b_1 < b_2 < \cdots < b_{s+1}$, $s \ge 2$, constitute a partition of $\Theta$, $0 < \mu(\Theta) < \infty$. Suppose also that both $I(\theta)$ and $F(\theta)$ are constant. Let $\bar a_1, \bar a_2, \dots, \bar a_s > 0$ be such that $\sum_{k=1}^{s}\bar a_k = 1$, and
\[ \int I_{A_k}(\theta)\pi(\theta)\,d\mu(\theta) = \bar a_k, \qquad k = 1, 2, \dots, s. \]
If we define new multipliers $\omega_0 = e^{1-\lambda_0}/D$, where $D = [I(\theta)]^{\phi/2}e^{(1-\phi)F(\theta)}$, and $\omega_k = e^{\lambda_k}$, $k = 1, 2, \dots, s$, then $\Omega = (\omega_0, \omega_1, \dots, \omega_s)$ can be found from the following homogeneous linear system:
\[ \begin{pmatrix} -1 & u_1 & u_2 & \cdots & u_s \\ -1 & v_1 & 0 & \cdots & 0 \\ -1 & 0 & v_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -1 & 0 & 0 & \cdots & v_s \end{pmatrix} \begin{pmatrix} \omega_0 \\ \omega_1 \\ \omega_2 \\ \vdots \\ \omega_s \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \tag{20} \]
where $u_k = \mu(A_k)$, and $v_k = \bar a_k^{-1}u_k$, $k = 1, 2, \dots, s$.
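The linear system (20) is easy to check by hand for small $s$, and a short computation makes the structure concrete. The following Python sketch is an illustration only: the measures $u_k$ and the weights $\bar a_k$ are hypothetical values, and the helper names are not from the paper. It builds the matrix of (20), applies it to the vector $(1, v_1^{-1}, \dots, v_s^{-1})$, and confirms with exact rational arithmetic that the residual vanishes and that the resulting step-function prior gives each region $A_k$ the prescribed mass.

```python
from fractions import Fraction


def system_matrix(u, a_bar):
    """Matrix of (20): first row (-1, u_1, ..., u_s), then rows (-1, 0, ..., v_k, ..., 0)."""
    s = len(u)
    v = [u_k / a_k for u_k, a_k in zip(u, a_bar)]  # v_k = u_k / a_bar_k
    rows = [[Fraction(-1)] + list(u)]
    for k in range(s):
        row = [Fraction(-1)] + [Fraction(0)] * s
        row[k + 1] = v[k]
        rows.append(row)
    return rows


def candidate_solution(u, a_bar):
    """Vector (1, 1/v_1, ..., 1/v_s) with 1/v_k = a_bar_k / u_k."""
    return [Fraction(1)] + [a_k / u_k for u_k, a_k in zip(u, a_bar)]


def apply(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


# Hypothetical data: three regions with measures u_k and weights a_bar_k summing to 1.
u = [Fraction(1, 2), Fraction(1), Fraction(3, 2)]
a_bar = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)]

omega = candidate_solution(u, a_bar)
residual = apply(system_matrix(u, a_bar), omega)
# Mass that the prior sum_k (1/v_k) I_{A_k} assigns to each region A_k.
masses = [w * u_k for w, u_k in zip(omega[1:], u)]
print(residual, masses)
```

Exact fractions are used so that the check does not depend on floating-point tolerance.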
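Observe that the determinant, $\Delta$, of the matrix in (20) is given by
\[ \Delta = \Big(\sum_{k=1}^{s}\bar a_k - 1\Big)\frac{\prod_{k=1}^{s}u_k}{\prod_{k=1}^{s}\bar a_k}, \]
which vanishes because $\sum_{k=1}^{s}\bar a_k = 1$; this guarantees that there exists a nontrivial solution, unique up to a scalar factor. In this case, the solution is $\Omega^{*T} = (1, v_1^{-1}, v_2^{-1}, \dots, v_s^{-1})$, and $\pi^*_\phi = \sum_{k=1}^{s}v_k^{-1}I_{A_k}$.

The following proposition extends Good-Bernardo-Zellner's priors to a richer family by using the MMDIP and MFIP criteria:

Proposition 4.5 Let
\[ N_{\phi,\psi}(\pi) \stackrel{\mathrm{def}}{=} \phi V_{1,1,1}(\pi) + (1-\phi)(1-\psi)V_{0,0,0}(\pi) + \frac{\psi(1-\phi)}{2}\big[V_{0,1,1} + V_{0,1,0}\big], \qquad 0 \le \phi, \psi \le 1. \]
Then
(i) $N_{\phi,\psi}(\pi) \in A$ and is concave w.r.t. $\pi$.
(ii) A necessary condition for $\pi$ to be a maximum of the problem
\[ \text{Maximize } N_{\phi,\psi}(\pi) \quad \text{subject to } C:\ \int a_k(\theta)\pi(\theta)\,d\mu(\theta) = \bar a_k, \qquad k = 0, 1, 2, \dots, s, \]
where $a_0 \equiv 1 = \bar a_0$, is given by
\[ \pi^*_{\phi,\psi}(\theta) \propto [I(\theta)]^{\phi/2}\exp\Big\{(1-\phi)(1-\psi)F(\theta) + \frac{\psi(1-\phi)}{2}\Big([I(\theta)]^{1/2} + \frac{F(\theta)}{[I(\theta)]^{1/2}}\Big) + \sum_{k=0}^{s}\lambda_k a_k(\theta)\Big\}, \tag{21} \]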
where $\lambda_k$, $k = 0, 1, \dots, s$, are the Lagrange multipliers associated with the constraints $C$.

The second term inside the exponential of (21) is the average between Fisher's information and the negative relative Shannon-Fisher's information. Notice that $\pi^*_{\phi,0}(\theta)$ is just Good-Bernardo-Zellner's prior. In the following proposition, Good-Bernardo-Zellner type priors are derived as MAXENTP solutions by treating (5) and (8) as constraints (for the rationale of MAXENTP methods see Jaynes' 1982 seminal paper).
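Proposition 4.6 Consider the Jaynes–Good–Bernardo–Zellner problem:
\[ \text{Maximize } V_{0,0,1}(\pi) \]
\[ \text{subject to: } \begin{cases} V_{1,1,1}(\pi) - V_{0,0,1}(\pi) = \bar b_1, \\ V_{0,0,0}(\pi) - V_{0,0,1}(\pi) = \bar b_2, \\ \displaystyle\int a_k(\theta)\pi(\theta)\,d\mu(\theta) = \bar a_k, \quad k = 0, 1, 2, \dots, s, \quad a_0 \equiv 1 = \bar a_0. \end{cases} \]
Then a necessary condition for a maximum is
\[ \pi^*(\theta) \propto [I(\theta)]^{\rho_1/2}\exp\Big\{\rho_2 F(\theta) + \sum_{k=0}^{s}\lambda_k a_k(\theta)\Big\}, \tag{22} \]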
where $\rho_j$, $j = 1, 2$, and $\lambda_k$, $k = 0, 1, \dots, s$, are the Lagrange multipliers associated with the constraints. Unlike the coefficients $\phi$ and $1-\phi$ appearing in (18), the multipliers $\rho_j$, $j = 1, 2$, do not necessarily add up to 1.

There typically exist priors for which Shannon-Jaynes entropy becomes infinite. One way to overcome this problem consists of discounting entropy at a constant rate $\nu > 0$. The following proposition introduces Good-Bernardo-Zellner's controlled priors as solutions of maximizing discounted entropy:

Proposition 4.7 Consider the discounted version of the problem stated in the preceding proposition:
\[ \text{Maximize } -\int e^{-\nu\theta}\pi(\theta)\log\pi(\theta)\,d\mu(\theta), \]
subject to:
\[ \begin{cases} \dfrac{1}{\pi(\theta)}\dfrac{dh_1(\theta)}{d\mu(\theta)} = \log[I(\theta)]^{1/2}, & h_1(-\infty) = 0, \quad h_1(\infty) = V_{1,1,1}(\pi) - V_{0,0,1}(\pi) < \infty, \\[2ex] \dfrac{1}{\pi(\theta)}\dfrac{dh_2(\theta)}{d\mu(\theta)} = F(\theta), & h_2(-\infty) = 0, \quad h_2(\infty) = V_{0,0,0}(\pi) - V_{0,0,1}(\pi) < \infty, \\[2ex] \dfrac{1}{\pi(\theta)}\dfrac{dg_k(\theta)}{d\mu(\theta)} = a_k(\theta), & g_k(-\infty) = 0, \quad g_k(\infty) < \infty, \quad k = 0, 1, 2, \dots, s, \end{cases} \]
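where $a_0 \equiv 1 = \bar a_0$. Then, a necessary condition for $\pi^*(\theta)$ to be an optimal control is given by
\[ \pi^*(\theta) \propto [I(\theta)]^{\rho_1(\theta)/2}\exp\Big\{\rho_2(\theta)F(\theta) + \sum_{k=0}^{s}\lambda_k(\theta)a_k(\theta)\Big\}, \tag{23} \]
where $\rho_j(\theta) = \rho_{j0}e^{\nu\theta}$, $j = 1, 2$, and $\lambda_k(\theta) = \lambda_{k0}e^{\nu\theta}$, $k = 0, 1, \dots, s$, are the costate variables associated with the state variables $h_j(\theta)$, $j = 1, 2$, and $g_k(\theta)$, $k = 0, 1, \dots, s$, respectively. Furthermore, the constants $\rho_{j0}$, $j = 1, 2$, and $\lambda_{k0}$, $k = 0, 1, \dots, s$, can be computed from the following non-linear system of $s+3$ equations:
\[ 1 + \log h_1(\infty) = \log\Big\{\int \log[I(\theta)]^{1/2}\,m(\rho_{10}, \rho_{20}, \lambda_{00}, \lambda_{10}, \dots, \lambda_{s0}; \theta)\,d\mu(\theta)\Big\}, \]
\[ 1 + \log h_2(\infty) = \log\Big\{\int F(\theta)\,m(\rho_{10}, \rho_{20}, \lambda_{00}, \lambda_{10}, \dots, \lambda_{s0}; \theta)\,d\mu(\theta)\Big\}, \]
\[ 1 + \log g_k(\infty) = \log\Big\{\int a_k(\theta)\,m(\rho_{10}, \rho_{20}, \lambda_{00}, \lambda_{10}, \dots, \lambda_{s0}; \theta)\,d\mu(\theta)\Big\}, \qquad k = 0, 1, 2, \dots, s, \]
where
\[ m(\rho_{10}, \rho_{20}, \lambda_{00}, \lambda_{10}, \dots, \lambda_{s0}; \theta) = \Big([I(\theta)]^{\rho_{10}/2}e^{\rho_{20}F(\theta)}e^{\lambda_{00}}\prod_{u=1}^{s}e^{\lambda_{u0}a_u(\theta)}\Big)^{e^{\nu\theta}}. \]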
5 Kalman filtering priors
In this section, we will study Good-Bernardo-Zellner's priors as Kalman Filtering priors (Kalman 1960, and Kalman and Bucy 1961). We will continue to work with the single parameter case, and focus our attention on both the location and scale parameter families. Let $Y_1, Y_2, \dots, Y_t$ be a set of indirect measurements, from a polling system or a sample survey, of an unobserved state variable $\beta_t$. The objective is to make inferences about $\beta_t$. The relationship between $Y_t$ and $\beta_t$ is specified by the measurement equation, sometimes also called the observation equation:
\[ Y_t = A_t\beta_t + \varepsilon_t, \tag{24} \]
where $A_t \neq 0$ is known, and $\varepsilon_t$ is the observation error distributed as $N(0, \sigma^2_{\varepsilon_t})$ with $\sigma^2_{\varepsilon_t}$ known. Notice that the main difference between the
measurement equation and the linear model is that, in the former, the coefficient $\beta_t$ changes with time. Furthermore, we suppose that $\beta_t$ is driven by a first order autoregressive process, that is,
\[ \beta_t = Z_t\beta_{t-1} + \eta_{t-1}, \tag{25} \]
where $Z_t \neq 0$ is known, and $\eta_t \sim N(0, \sigma^2_{\eta_t})$ with $\sigma^2_{\eta_t}$ known. In what follows, we will assume that $\beta_0$, $\varepsilon_t$, and $\eta_t$ are independent random variables. We might state nonlinear versions of (24) and (25), but this would not make any essential difference in the subsequent analysis. Suppose now that, at time $t = 0$, supplementary information is given by $\hat\beta_0$ and $\hat\sigma_0^2$, the mean and variance of $\beta_0$, respectively. That is,
\[ C: \begin{cases} \displaystyle\int_{-\infty}^{\infty}\pi(\beta_0)\,d\beta_0 = 1, \\[2ex] \displaystyle\int_{-\infty}^{\infty}\beta_0\,\pi(\beta_0)\,d\beta_0 = \hat\beta_0, \\[2ex] \displaystyle\int_{-\infty}^{\infty}(\beta_0 - \hat\beta_0)^2\,\pi(\beta_0)\,d\beta_0 = \hat\sigma_0^2. \end{cases} \tag{26} \]
In this case, Good-Bernardo-Zellner's prior is given by
\[ \pi^*_\phi(\beta_0) \propto [I(\beta_0)]^{\phi/2}\exp\big\{(1-\phi)F(\beta_0) + \lambda_0 + \lambda_1\beta_0 + \lambda_2(\beta_0 - \hat\beta_0)^2\big\}, \tag{27} \]
where $\lambda_j$, $j = 0, 1, 2$, are Lagrange multipliers. Suppose that, at time $t$, we wish to make inferences about the conditional state variable $\theta_t = \beta_t|I_t$, where $I_t = \{Y_1, Y_2, \dots, Y_{t-1}\}$. To obtain a posterior distribution of $\theta_t$, the information provided by the measurement $Y_t$, with density $f(Y_t|\theta_t)$, is used to modify the initial knowledge in $\pi^*_\phi(\theta_t)$ according to Bayes' theorem:
\[ f(\theta_t|Y_t) \propto f(Y_t|\theta_t)\,\pi^*_\phi(\theta_t). \tag{28} \]
We are now in a position to state the Bayesian recursive updating procedure of the Kalman Filter (KF) for both the location and scale parameter families $f(Y_t|\theta) = f(Y_t - \theta)$, $\theta \in \mathbb{R}$, and $f(Y_t|\theta) = (1/\theta)f(Y_t/\theta)$, $\theta > 0$, respectively. To start off the KF procedure, we substitute (27) in (26), obtaining that Good-Bernardo-Zellner's prior at time $t = 0$ is given by $N(\hat\beta_0, \hat\sigma_0^2)$, which describes the initial knowledge of the system. Proceeding inductively, at time $t$, $\hat\beta_{t-1}$ and
$\hat\sigma^2_{t-1}$ become supplementary information, and therefore Good-Bernardo-Zellner's prior at time $t$ is represented by
\[ \theta_t = \beta_t|I_t \sim N(Z_t\hat\beta_{t-1}, M_t), \tag{29} \]
where
\[ M_t = Z_t^2\hat\sigma^2_{t-1} + \sigma^2_{\eta_{t-1}}. \tag{30} \]
The sampling model (or likelihood function) is determined by
\[ Y_t|\theta_t \sim N(A_t\beta_t, \sigma^2_{\varepsilon_t}). \tag{31} \]
The posterior distribution, at time $t$, is then obtained by substituting both (29) and (31) in (28), so
\[ f(\theta_t|Y_t) \propto \exp\Big\{-\tfrac12\big[(A_t\beta_t - Y_t)^2\sigma^{-2}_{\varepsilon_t} + (\beta_t - Z_t\hat\beta_{t-1})^2M_t^{-1}\big]\Big\}. \]
Noting that $\pi^*_\phi(\theta_t)$ is a natural conjugate prior, we may complete the squares to get
\[ \theta_t|Y_t \sim N\big(Z_t\hat\beta_{t-1} + K_t(Y_t - A_tZ_t\hat\beta_{t-1}),\ M_t - K_tA_tM_t\big), \]
where
\[ K_t = M_tA_t(\sigma^2_{\varepsilon_t} + A_t^2M_t)^{-1}. \tag{32} \]
This, of course, means that
\[ \begin{cases} \hat\beta_t = Z_t\hat\beta_{t-1} + K_t(Y_t - A_tZ_t\hat\beta_{t-1}), \\ \hat\sigma_t^2 = M_t - K_tA_tM_t. \end{cases} \tag{33} \]
We then proceed with the next iteration. Equations (33), (30), and (32) are known in the literature as the KF. The above analysis can be summarized in the following proposition:

Proposition 5.1 Consider the state-space representation:
\[ \begin{cases} Y_t = A_t\beta_t + \varepsilon_t, \\ \beta_t = Z_t\beta_{t-1} + \eta_{t-1}, \end{cases} \]
defined as in (24) and (25). Suppose that supplementary information on the mean and variance of $\beta_0$ is available. Let $\theta_t = \beta_t|I_t$, where
$I_t = \{Y_1, Y_2, \dots, Y_{t-1}\}$, and consider the location and scale parameter families $f(Y_t|\theta) = f(Y_t - \theta)$, $\theta \in \mathbb{R}$, and $f(Y_t|\theta) = (1/\theta)f(Y_t/\theta)$, $\theta > 0$, respectively, along with the properties stated in Corollary 4.1. Then, under Good-Bernardo-Zellner's prior, $\pi^*_\phi(\theta_t)$, the posterior estimate of $\beta_t$, $\hat\beta_t$, is given by
\[ \hat\beta_t = \omega_tZ_t\hat\beta_{t-1} + (1-\omega_t)(Y_t/A_t), \]
where $\omega_t = \sigma^2_{\varepsilon_t}(\sigma^2_{\varepsilon_t} + A_t^2M_t)^{-1}$.
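The recursion (30), (32), (33) and the weighted-average form of Proposition 5.1 describe the same update, since $1 - K_tA_t = \omega_t$. The Python sketch below, which is an illustration and not part of the paper, runs both forms side by side on hypothetical observations with constant, hypothetical values of $A_t$, $Z_t$ and the two noise variances, and records the largest discrepancy between them.

```python
def kalman_step(beta_prev, var_prev, y, a, z, var_eps, var_eta):
    """One scalar Kalman filter update following (30), (32) and (33)."""
    m = z * z * var_prev + var_eta                # (30): prior variance M_t
    k = m * a / (var_eps + a * a * m)             # (32): gain K_t
    prior_mean = z * beta_prev                    # mean of the prior (29)
    beta = prior_mean + k * (y - a * prior_mean)  # (33): posterior mean
    var = m - k * a * m                           # (33): posterior variance
    return beta, var, m


def weighted_average_form(beta_prev, m, y, a, z, var_eps):
    """Posterior mean written as in Proposition 5.1."""
    omega = var_eps / (var_eps + a * a * m)
    return omega * z * beta_prev + (1.0 - omega) * (y / a)


# Hypothetical inputs: initial mean and variance, constant coefficients, a few measurements.
beta, var = 0.0, 1.0
a, z, var_eps, var_eta = 2.0, 0.9, 0.5, 0.1
observations = [1.2, 0.7, 1.9, 1.4]

max_gap = 0.0
for y in observations:
    new_beta, new_var, m = kalman_step(beta, var, y, a, z, var_eps, var_eta)
    alt_beta = weighted_average_form(beta, m, y, a, z, var_eps)
    max_gap = max(max_gap, abs(new_beta - alt_beta))
    beta, var = new_beta, new_var

print(beta, var, max_gap)
```

The weighted-average form makes the role of the prior visible: $\omega_t$ shrinks toward 0 as the measurement noise variance decreases, so the estimate then follows $Y_t/A_t$.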
6 Revisiting the normal linear model
The results on Good-Bernardo-Zellner priors given so far can be easily extended to the multi-dimensional parameter case, namely, $\theta = (\theta_1, \theta_2, \dots, \theta_m) \in \Theta \subseteq \mathbb{R}^m$, $m > 1$. Consider a vector of independent normal random variables $(X_1, X_2, \dots, X_n)$ with common and known variance $\sigma^2$ satisfying
\[ E(X_k) = a_{k1}\theta_1 + a_{k2}\theta_2 + \cdots + a_{km}\theta_m, \qquad k = 1, 2, \dots, n, \tag{34} \]
where $A = (a_{ij})$ is a matrix of known coefficients for which $(A^TA)^{-1}$ exists. Let $X$ and $\theta$ stand for the column vectors of variables $X_k$ and parameters $\theta_j$, respectively. Then (34) can be written in matrix notation as $E(X) = A\theta$. In this case, we have
\[ f(\xi|\theta) = \Big(\frac{1}{2\pi\sigma^2}\Big)^{n/2}\exp\Big\{-\frac{1}{2\sigma^2}\|\xi - A\theta\|^2\Big\}, \tag{35} \]
where $\xi = (x_1, x_2, \dots, x_n)$. Since $\sigma^2$ has been assumed known, only the location parameter is unknown. The analogue of (2) is now given by the matrix
\[ I_n(\theta) \equiv \Big[\int \Big(\frac{\partial}{\partial\theta_j}\log f(x|\theta)\Big)\Big(\frac{\partial}{\partial\theta_\ell}\log f(x|\theta)\Big)f(x|\theta)\,d\lambda(x)\Big]_{1 \le j,\ell \le m} = \frac{1}{\sigma^2}A^TA, \]
and so $\det[I_n(\theta)]$ is constant, which implies that the Good-Bernardo-Zellner prior distribution $\pi^*_\phi(\theta)$, describing a situation of vague information on $\theta$, must be a locally uniform prior distribution.
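Let $\hat\theta$ be the least squares estimate for $\theta$; then it is known that $A^TA\hat\theta = A^TX$, $E(\hat\theta) = \theta$, and $\mathrm{Var}(\hat\theta) = \sigma^2(A^TA)^{-1}$. Noting from equation (35) that
\[ f(\xi|\theta) = \Big(\frac{1}{2\pi\sigma^2}\Big)^{n/2}\exp\Big\{-\frac{1}{2\sigma^2}\big(\|\xi - A\hat\theta\|^2 + \langle A^TA(\theta - \hat\theta), \theta - \hat\theta\rangle\big)\Big\}, \]
and applying Bayes' theorem, we get as the posterior distribution of $\theta$
\[ f(\theta|\xi) = (2\pi)^{-m/2}\big(\det[\tfrac{1}{\sigma^2}A^TA]\big)^{1/2}\exp\Big\{-\tfrac12\big\langle\tfrac{1}{\sigma^2}A^TA(\theta - \hat\theta), \theta - \hat\theta\big\rangle\Big\}. \]
If supplementary information in mean, $c$, and variance-covariance matrix, $D$, is now incorporated, then the (informative) Good-Bernardo-Zellner prior is given by
\[ \pi^*_\phi(\theta) = (2\pi)^{-m/2}(\det[D])^{-1/2}\exp\Big\{-\tfrac12\langle D^{-1}(\theta - c), \theta - c\rangle\Big\}. \]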
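The posterior distribution is now
\[ f(\theta|\xi) = (2\pi)^{-m/2}(\det[B])^{1/2}\exp\Big\{-\tfrac12\Big\langle B\big[\theta - \big((DB)^{-1}c + \tfrac{1}{\sigma^2}B^{-1}A^TA\hat\theta\big)\big],\ \theta - \big((DB)^{-1}c + \tfrac{1}{\sigma^2}B^{-1}A^TA\hat\theta\big)\Big\rangle\Big\}, \]
where $B = D^{-1} + \frac{1}{\sigma^2}A^TA$.

7 Summary and conclusions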
We have presented, in a unified framework, a number of well-known methods that maximize a criterion functional to obtain non-informative and informative priors. Our general procedure is, by itself, capable of dealing with a range of interesting issues in Bayesian analysis. However, in this paper, we have limited our attention to Good-Bernardo-Zellner’s priors as well as their application to some Bayesian inference problems, including the Kalman filter and the Normal linear model. There exist priors for which Shannon-Jaynes entropy becomes infinite. In order to overcome this difficulty we proposed discounted entropy. We introduced Good-Bernardo-Zellner’s controlled priors which maximize discounted entropy at a constant rate. Throughout the paper, we have emphasized the existence and uniqueness of the solutions of the corresponding variational and optimal control problems. There are, of
course, many other members of the class A that deserve much more attention than we have attempted to give here. Needless to say, more work will be required in this direction. Results will be reported elsewhere.

Acknowledgement
The author wishes to thank an anonymous referee for valuable guidance and numerous suggestions. The author is indebted to Arnold Zellner, José M. Bernardo, Jim Berger, and George C. Tiao for helpful comments on earlier drafts of this paper. As usual, the author bears sole responsibility for opinions and errors.

Francisco Venegas-Martínez
Department of Finance, Tecnológico de Monterrey, Calle del Puente 222, 14380 México D. F.
fvenegas@itesm.mx
References

[1] Akaike H., A new look at the Bayes procedure, Biometrika 65 (1978), 53–59.
[2] Bayes T., An essay towards solving a problem in the doctrine of chances, Philos. Trans. R. Soc. London 53 (1763), 370–418. (Reprinted in Biometrika 45 (1958), 243–315.)
[3] Berger J. O.; Bernardo J. M., Estimating a product of means: Bayesian analysis with reference priors, J. Amer. Statist. Assoc. 84 (1989), 200–207.
[4] Berger J. O.; Bernardo J. M., On the development of reference priors, Bayesian Statistics 4 (1992a), 35–60.
[5] Berger J. O.; Bernardo J. M., Ordered group reference priors with application to a multinomial problem, Biometrika 79 (1992b), 25–37.
[6] Berger J. O.; Bernardo J. M.; Mendoza M., On priors that maximize expected information, Recent Developments in Statistics and Their Applications (1989), 1–20.
[7] Bernardo J. M., Noninformative priors do not exist, J. Statist. Plann. Inference B65 (1997), 177–189.
[8] Bernardo J. M., Reference posterior distributions for Bayesian inference, J. Roy. Statist. Soc. Ser. B Stat. Methodol. 41 (1979), 113–147.
[9] Bernardo J. M.; Ramón J. M., An Introduction to Bayesian Reference Analysis: Inference on the Ratio of Multinomial Parameters, Tech. Rep. 3–97, Universitat de València, Spain, 1997.
[10] Bernardo J. M.; Smith A. F. M., Bayesian Theory, John Wiley & Sons, Wiley Series in Probability and Mathematical Statistics, New York, 1994.
[11] Box G. E. P.; Tiao G. C., Bayesian Inference and Statistical Analysis, Addison-Wesley Series in Behavioral Science: Quantitative Methods, Massachusetts, 1973.
[12] Geisser S., Predictive Inference: An Introduction, Chapman & Hall, New York, 1993.
[13] Good I. J., Utility of a distribution, Nature 219 (1968), 1392.
[14] Good I. J., What is the use of a distribution?, Multivariate Analysis (Krishnaiah, ed.) Vol. II, Academic Press, New York (1969), 183–203.
[15] Jaynes E. T., Information theory and statistical mechanics, Phys. Rev. 106 (1957), 620–630.
[16] Jaynes E. T., On the rationale of maximum-entropy methods, Proc. of the IEEE 70 (1982), 939–952.
[17] Jeffreys H., Theory of Probability, 3rd edition, Oxford University Press, Oxford, 1961.
[18] Kalman R. E., A new approach to linear filtering and prediction problems, Transactions ASME, Series D, J. of Basic Engineering 82 (1960), 35–45.
[19] Kalman R. E.; Bucy R., New results in linear filtering and prediction theory, Transactions ASME, Series D, J. of Basic Engineering 83 (1961), 95–108.
[20] Kullback S., Information Theory and Statistics, Wiley, New York, 1959.
[21] Lindley D. V., On a measure of information provided by an experiment, Ann. Math. Statist. 27 (1956), 986–1005.
[22] Rényi A., Foundations of Probability, Holden-Day, San Francisco, 1970.
[23] Soofi E. S., Capturing the intangible concept of information, J. Amer. Statist. Assoc. 89 (1994), 1243–1254.
[24] Zellner A., An Introduction to Bayesian Inference in Econometrics, Wiley, New York, 1971.
[25] Zellner A., Bayesian method of moments (BMOM) analysis of mean and regression models, Bayesian Analysis in Econometrics and Statistics, Edward Elgar Publishing Limited, UK (1997), 291–304. (Also available in Prediction and Modelling Honouring Seymour Geisser, Springer-Verlag, New York, Berlin, Heidelberg (1996).)
[26] Zellner A., Bayesian methods and entropy in economics and econometrics, Maximum Entropy and Bayesian Methods, Dordrecht, Netherlands: Kluwer (1996), 17–31.
[27] Zellner A., Maximal data information prior distributions, New Developments in the Application of Bayesian Methods, Amsterdam: North-Holland (1977), 201–232.
[28] Zellner A., Models, prior information and Bayesian analysis, J. Econometrics 75 (1996a), 51–58.
[29] Zellner A., Past and recent results on maximal data information priors, J. Statist. Plann. Inference 49 (1996b), 3–8.
[30] Zellner A., The finite sample properties of simultaneous equations' estimates and estimators: Bayesian and non-Bayesian approaches, J. Econometrics 83 (1998), 185–212.
Morfismos, Vol. 8, No. 2, 2004, pp. 51–80
On a problem of Steinhaus concerning binary sequences

Shalom Eliahou    Delphine Hachez$^{1}$
Abstract A finite ±1 sequence X yields a binary triangle ∆X whose first row is X, and whose (k + 1)st row is the sequence of pairwise products of consecutive entries of its kth row, for all k ≥ 1. We say that X is balanced if its derived triangle ∆X contains as many +1’s as −1’s. In 1963, Steinhaus asked whether there exist balanced binary sequences of every length n ≡ 0 or 3 mod 4. While this problem has been solved in the affirmative by Harborth in 1972, we present here a different solution. We do so by constructing strongly balanced binary sequences, i.e. binary sequences of length n all of whose initial segments of length n − 4t are balanced, for 0 ≤ t ≤ n/4. Our strongly balanced sequences do occur in every length n ≡ 0 or 3 mod 4. Moreover, we provide a complete classification of sufficiently long strongly balanced binary sequences.
2000 Mathematics Subject Classification: 05A05, 05A15. Keywords and phrases: Steinhaus, balanced binary sequence, derived sequence, derived triangle.
$^{1}$ The present work is part of the PhD thesis work of the second author at the Université du Littoral Côte d'Opale under the direction of Professor S. Eliahou.

1 Introduction

Let $X = (x_1, x_2, \dots, x_n)$ be a binary sequence of length $n$, i.e. a sequence with $x_i = \pm 1$ for all $i$. We define the derived sequence $\partial X$ of $X$ by $\partial X = (y_1, y_2, \dots, y_{n-1})$ where $y_i = x_ix_{i+1}$ for all $i$. By convention, we agree that $\partial X = \emptyset$ whenever $n = 0$ or $1$, where $\emptyset$ stands for the empty binary sequence of length 0. More generally, for $k \ge 0$, we shall denote by $\partial^kX$ the $k$th derived sequence of $X$, defined recursively as usual by $\partial^0X = X$ and $\partial^kX = \partial(\partial^{k-1}X)$ for $k \ge 1$. We shall denote by $\Delta X$ the collection of the derived sequences $X, \partial X, \dots, \partial^{n-1}X$ of $X$. This collection may be pictured as a triangle, as in the following example: if $X = (+1, +1, -1, +1, -1, +1, +1)$, abbreviated as $++-+-++$, then
\[ \Delta X = \begin{array}{c} +\ +\ -\ +\ -\ +\ + \\ +\ -\ -\ -\ -\ + \\ -\ +\ +\ +\ - \\ -\ +\ +\ - \\ -\ +\ - \\ -\ - \\ + \end{array} \]
We shall henceforth refer to $\Delta X$ as the derived triangle of $X$. If $Y = (y_1, \dots, y_m)$ is any finite collection of numbers, we denote the sum of its entries by $S(Y) = \sum_{i=1}^{m}y_i$. For instance, if $X = (x_1, x_2, \dots, x_n)$ is a binary sequence, then $S(\Delta X)$ represents the sum of the entries in the derived triangle $\Delta X$ of $X$, i.e. $S(\Delta X) = \sum_{k=0}^{n-1}S(\partial^kX)$.

Definition 1.1 A binary sequence $X = (x_1, x_2, \dots, x_n)$ is balanced if $S(\Delta X) = 0$. In other words, $X$ is balanced if its derived triangle $\Delta X$ contains as many $+1$'s as $-1$'s.
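The definitions of $\partial X$, $\Delta X$ and $S(\Delta X)$ translate directly into a few lines of code. The sketch below is in Python and is only an illustration; it is not the authors' Mathematica implementation, and the function names are chosen here for convenience. It recomputes the derived triangle of $X = ++-+-++$ and its entry sum.

```python
def derive(x):
    """Derived sequence: pairwise products of consecutive entries."""
    return [a * b for a, b in zip(x, x[1:])]


def derived_triangle(x):
    """Rows X, dX, d^2 X, ..., down to the last non-empty derived sequence."""
    rows = []
    row = list(x)
    while row:
        rows.append(row)
        row = derive(row)
    return rows


def triangle_sum(x):
    """S(Delta X): sum of all entries of the derived triangle."""
    return sum(sum(row) for row in derived_triangle(x))


def is_balanced(x):
    return triangle_sum(x) == 0


def parse(signs):
    """Turn a string such as '++-+' into a list of +1 / -1 entries."""
    return [1 if c == "+" else -1 for c in signs]


X = parse("++-+-++")
rows = derived_triangle(X)
plus_count = sum(row.count(1) for row in rows)
minus_count = sum(row.count(-1) for row in rows)
print(plus_count, minus_count, is_balanced(X))
```

The empty sequence has an empty triangle, so `is_balanced([])` is true, in line with the convention $\partial X = \emptyset$ for $n = 0$ or $1$.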
For example, the above binary sequence X = ++−+−++ is balanced, as its derived triangle contains 14 positive signs and 14 negative signs in total. This sequence, as well as other balanced sequences of length 11, 12, 19 and 20, appear in [3], where the author proposed the following problem. Problem Is there a balanced binary sequence of length n for every n ≡ 0 or 3 mod 4? (The term “balanced” is not used by Steinhaus.) Note that the condition n ≡ 0 or 3 mod 4 is necessary for the existence of a balanced binary sequence X of length n. Indeed, the derived triangle of X contains n(n + 1)/2 entries; if n ≡ 1 or 2 mod 4, this number of entries is odd, and therefore S(∆X) cannot vanish. The above problem has been solved in the affirmative in [Harborth 1972]. In this paper, we shall present a new solution to the problem of
Steinhaus, by constructing binary sequences satisfying a much stronger condition. Definition 1.2 A binary sequence X = (x1 , . . . , xn ) is strongly balanced if the initial segment (x1 , . . . , xn−4t ) of X is balanced, for every 0 ≤ t ≤ n/4. Alternatively, strongly balanced sequences may be defined recursively, as follows. As initial conditions, balanced sequences of length 0 or 3 are considered as strongly balanced. For n ≥ 4, the sequence (x1 , . . . , xn ) is defined as strongly balanced if and only if it is balanced and (x1 , . . . , xn−4 ) is strongly balanced. For instance, the above binary sequence X = + + − + − + + is strongly balanced of length 7, as X and its initial segment of length 3, namely + + −, are both balanced. Another example of a strongly balanced binary sequence is given by P = + − + + − + + + + − −−, of length 12. Indeed, the initial segments of length 4, 8 and 12 of P , namely + − ++, + − + + − + ++ and P itself, are all balanced as easily seen. On the other hand, the sequences Y7 = + + + − + + − and Y8 = + + + + − + −− are both balanced, but not strongly so. Indeed, the initial segments of length 3 of Y7 and length 4 of Y8 are both constant +1 sequences, and therefore cannot be balanced. We shall denote by sb(n) the number of strongly balanced binary sequences of length n. There is no a priori reason to expect that strongly balanced sequences should exist at all for n large. But fortunately, the task of searching for all such sequences lends itself very well to computer experimentation (see below). The outcome of our experiments is quite surprising. Initially, the number sb(n) for n ≡ 0 mod 4 strictly increases, from n = 4 up to n = 36. Then it starts to decrease (non-strictly) up to length n = 92, where it finally stabilizes to the constant sb(n) = 4 for all n = 4m ≥ 92. For n ≡ 3 mod 4, the situation is similar, though more complicated: provided n ≥ 127, we find that sb(n) = 14 if n ≡ 3, 7 mod 12, and sb(n) = 12 if n ≡ 11 mod 12. 
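The examples above, and the search for all strongly balanced sequences of a given small length, can be reproduced with a short program. The following Python sketch is an illustration under our own naming, not the authors' code: it tests Definition 1.2 directly on the sequences $X$, $P$, $Y_7$ and $Y_8$, and then builds the strongly balanced sequences of lengths 3 and 4 by keeping the balanced extensions of shorter ones.

```python
from itertools import product


def triangle_sum(x):
    """Sum of all entries of the derived triangle of x."""
    total, row = 0, list(x)
    while row:
        total += sum(row)
        row = [a * b for a, b in zip(row, row[1:])]
    return total


def is_balanced(x):
    return triangle_sum(x) == 0


def is_strongly_balanced(x):
    """Every initial segment of length n - 4t, 0 <= t <= n/4, must be balanced."""
    n = len(x)
    return all(is_balanced(x[: n - 4 * t]) for t in range(n // 4 + 1))


def parse(signs):
    return tuple(1 if c == "+" else -1 for c in signs)


def extend(level):
    """Balanced extensions by 4 entries of sequences that are already strongly balanced."""
    return [x + tail
            for x in level
            for tail in product((1, -1), repeat=4)
            if is_balanced(x + tail)]


X = parse("++-+-++")
P = parse("+-++-++++---")
Y7 = parse("+++-++-")
Y8 = parse("++++-+--")

length3 = [x for x in product((1, -1), repeat=3) if is_balanced(x)]
length4 = extend([()])       # start from the empty sequence
length7 = extend(length3)    # next step in the class n = 3 mod 4
print(len(length3), len(length4), len(length7))
```

Repeated calls to `extend` give the sets for lengths 8, 12, ... and 11, 15, ..., whose sizes can be compared with the values of $sb(n)$ reported in the text.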
A convenient way to summarize the behavior of the numbers $sb(n)$ is to exhibit properties of their generating function $g(t) = \sum_{n=0}^{\infty}sb(n)t^n$. For example, the eventual periodicity of $sb(n)$ for $n$ large is reflected by the property of the generating function $g(t)$ of being a rational function. Our main result in this paper is the following
Theorem 1.3 The generating function $g(t) = \sum_{n=0}^{\infty}sb(n)t^n$ of the number $sb(n)$ of strongly balanced binary sequences of length $n$ is given by the following rational function:
\[ g(t) = \frac{4t^{92}}{1-t^4} + f_0(t) + \frac{(14 + 12t^4 + 14t^8)\,t^{127}}{1-t^{12}} + f_3(t), \]
where $f_0(t)$ and $f_3(t)$ are the following polynomials:
\[ \begin{aligned} f_0(t) ={}& 1 + 6t^4 + 18t^8 + 30t^{12} + 52t^{16} + 80t^{20} + 88t^{24} + 106t^{28} + 116t^{32} + 124t^{36} + 106t^{40} + 92t^{44} \\ &+ 92t^{48} + 90t^{52} + 64t^{56} + 44t^{60} + 38t^{64} + 32t^{68} + 20t^{72} + 20t^{76} + 8t^{80} + 8t^{84} + 6t^{88}, \end{aligned} \]
\[ \begin{aligned} f_3(t) ={}& 4t^3 + 8t^7 + 16t^{11} + 26t^{15} + 36t^{19} + 48t^{23} + 48t^{27} + 66t^{31} + 88t^{35} + 108t^{39} + 114t^{43} \\ &+ 90t^{47} + 88t^{51} + 104t^{55} + 92t^{59} + 60t^{63} + 48t^{67} + 28t^{71} + 26t^{75} + 26t^{79} + 20t^{83} + 16t^{87} \\ &+ 18t^{91} + 14t^{95} + 14t^{99} + 14t^{103} + 14t^{107} + 16t^{111} + 14t^{115} + 14t^{119} + 16t^{123}. \end{aligned} \]
In the above formula for $g(t)$, the terms $t^n$ are separated according as $n \equiv 0$ or $3 \bmod 4$, for better readability and because their behavior is different.

Corollary 1.4 For every natural number $n \equiv 0$ or $3 \bmod 4$, there exists a strongly balanced binary sequence of length $n$.

Proof: Consider first the case $n \equiv 0 \bmod 4$. By expanding the summand $4t^{92}/(1-t^4)$ as $4t^{92} + 4t^{96} + 4t^{100} + \dots$ in the formula for $g(t)$, we see that $sb(n) = 4$ for every $n = 4m \ge 92$, as stated earlier. And the summand $f_0(t)$ in $g(t)$ gives the exact value of $sb(n)$ for $0 \le n = 4m \le 88$, which is nowhere zero. Similarly, for the case $n \equiv 3 \bmod 4$, we see that $sb(n) = 14$ for every $n \equiv 3$ or $7 \bmod 12$ with $n \ge 127$, and $sb(n) = 12$ for every $n \equiv 11 \bmod 12$ with $n \ge 131$. This follows from expanding the summand $(14 + 12t^4 + 14t^8)t^{127}/(1-t^{12})$ as an infinite series. Smaller values of $n$ are taken care of by the polynomial $f_3(t)$. For example, $sb(51) = 88$, $sb(55) = 104$ and $sb(59) = 92$. Alternatively, one may note that, if there exists a strongly balanced binary sequence $X$ of length $n$, then the initial segment of length $n-4$ of $X$ is also a strongly balanced binary sequence. This follows directly from the definition. $\Box$
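Reading individual values of $sb(n)$ off the rational function in Theorem 1.3 amounts to a table lookup for small $n$ and a periodic rule for large $n$. The Python sketch below is a convenience written for this purpose, with hypothetical names; the coefficient lists are simply transcribed from $f_0(t)$ and $f_3(t)$ as stated in the theorem.

```python
# Coefficients of t^0, t^4, ..., t^88 in f_0(t), as listed in Theorem 1.3.
F0 = [1, 6, 18, 30, 52, 80, 88, 106, 116, 124, 106, 92,
      92, 90, 64, 44, 38, 32, 20, 20, 8, 8, 6]

# Coefficients of t^3, t^7, ..., t^123 in f_3(t), as listed in Theorem 1.3.
F3 = [4, 8, 16, 26, 36, 48, 48, 66, 88, 108, 114,
      90, 88, 104, 92, 60, 48, 28, 26, 26, 20, 16,
      18, 14, 14, 14, 14, 16, 14, 14, 16]


def sb(n):
    """Coefficient of t^n in g(t) from Theorem 1.3."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n % 4 == 0:
        # f_0 covers n <= 88; the term 4 t^92 / (1 - t^4) covers n >= 92.
        return F0[n // 4] if n <= 88 else 4
    if n % 4 == 3:
        if n <= 123:
            return F3[(n - 3) // 4]
        # (14 + 12 t^4 + 14 t^8) t^127 / (1 - t^12): period 12 starting at n = 127.
        return 12 if n % 12 == 11 else 14
    return 0  # no terms with n = 1 or 2 mod 4 appear in g(t)


never_zero = all(sb(n) > 0 for n in range(500) if n % 4 in (0, 3))
print(sb(36), sb(51), sb(92), sb(127), sb(131), never_zero)
```

The final check mirrors the argument of Corollary 1.4 on a finite range: every admissible length below 500 has a positive coefficient.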
The set of all strongly balanced binary sequences of small length n (n ≤ 127, say) may be constructed by the method described in Section 3.
The eventual periodicity of sb(n) is a consequence of Theorems 2.1 and 2.2 below.
2 A classification of long strongly balanced sequences
In this section, we shall describe the set of all strongly balanced binary sequences of length $n \ge 92$ for $n \equiv 0 \bmod 4$, and $n \ge 127$ for $n \equiv 3 \bmod 4$. These two sets admit periodic structures. In order to present the results, we introduce the following notation.

Notation If $P$, $Q$ are finite binary sequences, we shall denote by $PQ^\infty$ the infinite eventually periodic sequence which starts with $P$ and continues with $Q$ repeated periodically thereafter. If $R$ is yet another finite binary sequence, and if $k \in \mathbb{N}$, we shall denote by $PQ^kR$ the sequence starting with $P$, continuing with $Q$ repeated $k$ times, and ending with $R$. Finally, if $T = (t_1, \dots, t_m, \dots)$ is any finite or infinite sequence of length $\ge m$, we shall denote by $T[m] = (t_1, \dots, t_m)$ the initial segment of length $m$ of $T$.
2.1 The case n ≡ 0 mod 4
Let $Q_1, \dots, Q_4$ denote the following infinite eventually periodic binary sequences. We will show that every initial segment $Q_i[n]$ with $n \equiv 0 \bmod 4$ is strongly balanced, and that there are no other strongly balanced binary sequences of length $n$, provided $n = 4m \ge 92$. These statements are formalized in the next theorem.

$Q_1 = +-++\,(++-++-+---++)^\infty$,
$Q_2 = (+-++-++++---)^\infty$,
$Q_3 = +-+-\,(+--++++--+++)^\infty$,
$Q_4 = +-+-\,(-+-+++-+-+++)^\infty$.

Theorem 2.1 For every $n \equiv 0 \bmod 4$, the initial segment of length $n$ of each of $Q_1$, $Q_2$, $Q_3$ and $Q_4$ is a strongly balanced binary sequence. Conversely, every strongly balanced binary sequence of length $n$ with $n \equiv 0 \bmod 4$ and $n \ge 92$ is an initial segment of one of $Q_1$, $Q_2$, $Q_3$ or $Q_4$.

Parts of the proof of this result can be found in Section 6.
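The first claim of Theorem 2.1 can be spot-checked for small lengths by brute force. The Python sketch below is an illustration only, with names chosen here: it writes each $Q_i$ as a pair (initial part, period), builds the initial segment $Q_i[n]$, and tests Definition 1.2 on it for the multiples of 4 up to a chosen bound. It checks finitely many cases and is in no way a substitute for the proof.

```python
def triangle_sum(x):
    """Sum of all entries of the derived triangle of x."""
    total, row = 0, list(x)
    while row:
        total += sum(row)
        row = [a * b for a, b in zip(row, row[1:])]
    return total


def is_strongly_balanced(x):
    """All initial segments of length n - 4t, 0 <= t <= n/4, have triangle sum 0."""
    n = len(x)
    return all(triangle_sum(x[: n - 4 * t]) == 0 for t in range(n // 4 + 1))


def initial_segment(prefix, period, n):
    """First n entries of the eventually periodic sequence prefix + period^infinity."""
    signs = (prefix + period * (n // len(period) + 1))[:n]
    return [1 if c == "+" else -1 for c in signs]


# (initial part, period) for Q_1, ..., Q_4 as displayed above Theorem 2.1.
Q = {
    "Q1": ("+-++", "++-++-+---++"),
    "Q2": ("", "+-++-++++---"),
    "Q3": ("+-+-", "+--++++--+++"),
    "Q4": ("+-+-", "-+-+++-+-+++"),
}

bound = 48  # hypothetical cut-off for the spot check
report = {
    name: all(is_strongly_balanced(initial_segment(p, q, n))
              for n in range(4, bound + 1, 4))
    for name, (p, q) in Q.items()
}
print(report)
```

Because strong balance of $Q_i[n]$ already includes balance of every shorter segment $Q_i[n - 4t]$, testing only the largest $n$ would suffice; the loop over all multiples of 4 is kept for readability.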
2.2 The case n ≡ 3 mod 4
This case is more complicated. Let $R_1, \dots, R_{12}$ denote the following infinite eventually periodic binary sequences. Their initial segments of length $n \equiv 3 \bmod 4$ are all strongly balanced. Moreover, they account for all sufficiently long strongly balanced binary sequences, except for two more exotic ones in length $n \equiv 3 \bmod 12$ and $n \equiv 7 \bmod 12$. For instance, one of these extra sequences for $n \equiv 3 \bmod 12$ is $R_5[n-4]\;+-+-$, that is, the initial segment of length $n-4$ of $R_5$ appended with the sequence $+-+-$.

$R_1 = ++-\,(+-++++-++++-)^\infty$,
$R_2 = ++----+\,(++--+-+-+--+)^\infty$,
$R_3 = +-+\,(+++-+-++++-+)^\infty$,
$R_4 = +-++++-\,(+-++-+----++++-+-+------)^\infty$,
$R_5 = +-++++-\,(-++-++++-++-)^\infty$,
$R_6 = +-+-+--\,(+-+-+--+++--)^\infty$,
$R_7 = +-+-+--\,(+-+-------+-+--+-++---+-)^\infty$,
$R_8 = +-+\,(-+---+--++-+)^\infty$,
$R_9 = -++\,(++-++++-+-++)^\infty$,
$R_{10} = -++++-+\,(--+++--+-+-+)^\infty$,
$R_{11} = -----+-\,(+--+++--+-+-)^\infty$,
$R_{12} = ---\,(--+--++++---)^\infty$.
Theorem 2.2 Let n ≡ 3 mod 4. Then, the initial segment of length n of each of R1, ..., R12 is a strongly balanced binary sequence. Moreover, if n ≥ 127, then every strongly balanced binary sequence of length n is an initial segment of one of R1, ..., R12, with the following exceptions:
• If n ≡ 3 mod 12, there are two more strongly balanced binary sequences of length n, namely R5[n − 4] + − + − and R8[n − 4] + − + +.
• If n ≡ 7 mod 12, there are also two more strongly balanced binary sequences of length n, namely R8[n − 8] + − + + − + + +, and either R5[n − 8] + − + − − − − − if n ≡ 7 mod 24, or R5[n − 8] + − + − − + − + if n ≡ 19 mod 24.
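To make the exceptional sequences of Theorem 2.2 concrete, here is a small Python sketch that assembles them as sign strings. The names `segment`, `exceptional`, `to_signs` and `weight` are hypothetical helpers, the strings for R5 and R8 are transcribed from the list above with ASCII `+` and `-`, and the function does not itself enforce the hypothesis n ≥ 127.

```python
def to_signs(text):
    """Convert a '+'/'-' string into a list of +1/-1 entries."""
    return [1 if ch == "+" else -1 for ch in text]

def weight(seq):
    """Sum of all entries of the derived triangle of seq."""
    total, row = 0, list(seq)
    while row:
        total += sum(row)
        row = [a * b for a, b in zip(row, row[1:])]
    return total

# (preperiod, period) of R5 and R8, transcribed from the list above.
R5 = ("+-++++-", "-++-++++-++-")
R8 = ("+-+", "-+---+--++-+")

def segment(seq, n):
    """Initial segment of length n of the eventually periodic sequence seq."""
    pre, per = seq
    return (pre + per * (n // len(per) + 1))[:n]

def exceptional(n):
    """The two extra sequences named in Theorem 2.2 for length n, if any."""
    if n % 12 == 3:
        return [segment(R5, n - 4) + "+-+-", segment(R8, n - 4) + "+-++"]
    if n % 12 == 7:
        tail = "+-+-----" if n % 24 == 7 else "+-+--+-+"
        return [segment(R8, n - 8) + "+-++-+++", segment(R5, n - 8) + tail]
    return []

if __name__ == "__main__":
    for s in exceptional(135):
        print(len(s), weight(to_signs(s)))
```

If the transcription is faithful, the printed weights for n ≥ 127 should be 0, in line with the theorem.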
Steinhaus’ binary sequences
The proof is similar to that of Theorem 2.1. See the last comment in Section 6. Even though Theorems 2.1 and 2.2 achieve the complete description of all sufficiently long strongly balanced binary sequences, we should point out that there are other infinite families of (simply) balanced binary sequences. For example, for all n ≡ 3 mod 4, the sequence Q1 [n]+ happens to be balanced. Similarly, for all n ≡ 8 mod 12, the sequence R1 [n] + − − + is balanced as well. And of course, there are the sequences in [2] which originally solved the problem of Steinhaus. None of the presently discussed sequences are strongly balanced, though.
3 The method
We shall explain now the method by which we have obtained the results above, and shall also supply our specific Mathematica implementation of it. The idea is quite simple. Assume X is a strongly balanced binary sequence of length n. An extension of X is any binary sequence Y containing X as an initial segment. Let Y be any one of the 16 possible extensions of X of length n + 4. Then, Y is strongly balanced if and only if Y is balanced. This holds because X itself is strongly balanced. Consequently, if we know the set SB(n) of all strongly balanced binary sequences of length n, and if card(SB(n)) = t, then in order to construct the set SB(n + 4), it is enough to consider the 16t extensions of length n + 4 of all the elements in SB(n), and select those which are simply (hence strongly) balanced. This is a computational task of low complexity. In summary, our method is a greedy algorithm, which aims to construct all strongly balanced sequences at increasing lengths. For lengths divisible by 4, the algorithm may start with the set {∅} of (strongly) balanced sequences of length 0. In length 3 mod 4, it will start with the set {+ + −, + − +, − + +, − − −} of all (strongly) balanced sequences of length 3. Here are the very concise Mathematica functions which we have written to implement the method. The first four functions (derive, triangle, weight and ext4) take as argument an arbitrary finite binary
sequence s, e.g. s = {1, 1, -1, 1} in Mathematica syntax.

1. The function derive[s] outputs the derived sequence ∂s of s, that is, the sequence of pairwise products of consecutive terms in s.

    derive[s_] := Table[s[[i]] s[[i+1]], {i, 1, Length[s] - 1}]

2. Then, the function triangle[s] outputs the derived triangle ∆s of s, i.e. the list of all higher order derived sequences of s.

    triangle[s_] := Block[{s1, tri}, s1 = s; tri = {s1};
      While[Length[s1] > 1, s1 = derive[s1]; AppendTo[tri, s1]];
      tri]

3. The function weight[s] outputs the sum of the entries in the derived triangle ∆s of s.

    weight[s_] := Apply[Plus, Flatten[triangle[s]]]

4. The function ext4[s] outputs the list of all balanced binary sequences containing s as an initial segment and 4 units longer. Note that, if s is strongly balanced, then ext4[s] outputs the list of all strongly balanced sequences containing s as an initial segment and 4 units longer.

    ext4[s_] := Block[{l, sext}, l = {};
      Do[sext = Join[s, {x1, x2, x3, x4}];
         If[weight[sext] == 0, AppendTo[l, sext]],
         {x1, -1, 1, 2}, {x2, -1, 1, 2},
         {x3, -1, 1, 2}, {x4, -1, 1, 2}];
      l]

5. Finally, given a non-negative integer n ≡ 0 or 3 mod 4, the function strong[n] successively builds all strongly balanced binary sequences of length m with m ≤ n and m ≡ n mod 4.

    strong[n_] := strong[n] = (
      If[n == 0, Return[{{}}]];
      If[n == 3, Return[{{1, 1, -1}, {1, -1, 1}, {-1, 1, 1}, {-1, -1, -1}}]];
      Flatten[Map[ext4, strong[n - 4]], 1])
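For readers without Mathematica, the five functions above can be transcribed into Python along the following lines. This is a sketch of the same extension method rather than the authors' implementation; sequences are tuples of +1/-1 entries, and memoization is delegated to `functools.lru_cache`.

```python
from functools import lru_cache
from itertools import product

def derive(s):
    """Derived sequence: pairwise products of consecutive terms."""
    return tuple(a * b for a, b in zip(s, s[1:]))

def triangle(s):
    """Derived triangle: s followed by all its higher order derived sequences."""
    rows = [tuple(s)]
    while len(rows[-1]) > 1:
        rows.append(derive(rows[-1]))
    return rows

def weight(s):
    """Sum of the entries of the derived triangle of s."""
    return sum(sum(row) for row in triangle(s))

def ext4(s):
    """All balanced extensions of the tuple s that are 4 units longer."""
    return [s + tail for tail in product((-1, 1), repeat=4)
            if weight(s + tail) == 0]

@lru_cache(maxsize=None)
def strong(n):
    """All strongly balanced binary sequences of length n (n = 0 or 3 mod 4)."""
    if n == 0:
        return ((),)
    if n == 3:
        return ((1, 1, -1), (1, -1, 1), (-1, 1, 1), (-1, -1, -1))
    if n < 0 or n % 4 not in (0, 3):
        raise ValueError("n must be a non-negative integer, 0 or 3 mod 4")
    return tuple(x for s in strong(n - 4) for x in ext4(s))

if __name__ == "__main__":
    print([len(strong(n)) for n in range(0, 25, 4)])
```

The printed list gives the numbers of strongly balanced sequences for n = 0, 4, ..., 24, which can be compared with the output of the Mathematica version.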
For instance, the command

    Sum[Length[strong[n]]*t^n, {n, 0, 88, 4}]

will output the polynomial $f_0(t)$ of Theorem 1.3, where $f_0(t) = \sum_{i=0}^{22} sb(4i)\, t^{4i}$ displays the numbers sb(n) for each length n = 4i ≤ 88. This computation takes about 90 seconds on a standard PC with a Pentium 4m processor clocked at 1.6 GHz.
4 Other possible strengthenings
We describe here two other attempts at strengthening the notion of balanced sequences. However, in contrast to strongly balanced sequences, these other strengthenings turn out to admit only finitely many complying binary sequences.
4.1 M-sequences
In our first attempt, we shall be seeking binary sequences $X = (x_1, \dots, x_n)$ having the property M defined recursively as follows: X is balanced, and its middle segment $(x_3, \dots, x_{n-2})$ of length n − 4 is also balanced and satisfies property M. By convention, balanced binary sequences of length 0 or 3 satisfy property M. (Compare with the similar-looking recursive definition of strongly balanced sequences.) For brevity, sequences satisfying property M will be called M-sequences. We shall restrict our attention to lengths n ≡ 0 mod 4.

As it turns out, there are binary M-sequences of length n for every n = 4, 8, ..., 96. In length 96, there remain exactly two binary M-sequences. Quite surprisingly, neither of these two sequences can be extended to a sequence of length 100 still satisfying property M. Consequently, there are no binary M-sequences X of length n ≡ 0 mod 4 with n ≥ 100. Thus, the generating function $g_M(t) = \sum_X t^{l(X)}$, where X runs over the set of all balanced binary M-sequences of even length, and where l(X) denotes the length of X, is a polynomial of degree 96, given by the following expression:

$$g_M(t) = 2t^{96} + 8t^{92} + 10t^{88} + 14t^{84} + 22t^{80} + 22t^{76} + 30t^{72} + 48t^{68} + 76t^{64} + 88t^{60} + 108t^{56} + 130t^{52} + 174t^{48} + 226t^{44} + 222t^{40} + 198t^{36} + 172t^{32} + 144t^{28} + 138t^{24} + 94t^{20} + 60t^{16} + 40t^{12} + 20t^{8} + 6t^{4} + 1.$$
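Property M lends itself to the same kind of incremental search, this time growing sequences outwards from the middle. The following Python sketch is an illustration under that reading of the definition; `is_m_sequence` and `m_sequences` are hypothetical names, and only lengths divisible by 4 are handled by the enumeration.

```python
from itertools import product

def weight(seq):
    """Sum of all entries of the derived triangle of seq."""
    total, row = 0, list(seq)
    while row:
        total += sum(row)
        row = [a * b for a, b in zip(row, row[1:])]
    return total

def is_m_sequence(x):
    """Check property M by peeling two entries off each end repeatedly."""
    x = tuple(x)
    while True:
        if weight(x) != 0:
            return False
        if len(x) <= 3:  # balanced sequences of length 0 or 3 satisfy M
            return True
        x = x[2:-2]

def m_sequences(n):
    """All binary M-sequences of length n, for n a multiple of 4."""
    current = [()]
    for _ in range(n // 4):
        current = [left + mid + right
                   for mid in current
                   for left in product((1, -1), repeat=2)
                   for right in product((1, -1), repeat=2)
                   if weight(left + mid + right) == 0]
    return current

if __name__ == "__main__":
    print([len(m_sequences(n)) for n in (0, 4, 8, 12)])
```

The printed counts can be compared with the low-order coefficients of $g_M(t)$; pushing `m_sequences` to larger n costs a few triangle computations per candidate.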
For definiteness, here are the two binary M-sequences of length 96:

++++−−−−−+++−−−−+−++−+−−+−++−+−−
++−−−−+−−−−−−+++−−+++++−++++−++−
+−−++++−−−−−−−−−++−−−++−+−−+++++,

+++++−−+−++−−−++−−−−−−−+−++++−−+
−++−++++−+++++−−+++−−−−−−+−−−−++
−−+−++−+−−+−++−+−−−−+++−−−−−++++.
4.2 Universal balanced binary sequences
In our second attempt, we seek universal balanced binary sequences, i.e. balanced binary sequences X = (x1 , . . . , xn ) with the property that every initial segment (x1 , . . . , xk ), with k ≡ 0 or 3 mod 4, is also balanced. There are exactly 6 universal balanced binary sequences of length 11, namely + − + + + + − + − + +, + − + + + + − − + + −, + − + + + + − − + − +, + − + − + − − + − + −, + − + − + − − − + + + and + − + − + − − − + − −. As easily checked, by adding one more ± sign at the end of each of these 6 sequences, we find that there are no universal balanced binary sequences in length 12 or higher.
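The search for universal balanced sequences is small enough to be repeated by exhaustive extension. Below is a Python sketch of that check; `universal_by_length` and `show` are hypothetical names, and signs are printed with ASCII `+` and `-`.

```python
from itertools import product

def weight(seq):
    """Sum of all entries of the derived triangle of seq."""
    total, row = 0, list(seq)
    while row:
        total += sum(row)
        row = [a * b for a, b in zip(row, row[1:])]
    return total

def universal_by_length(max_len):
    """Map each k = 0 or 3 mod 4 (k <= max_len) to the universal sequences of length k."""
    lengths = [k for k in range(1, max_len + 1) if k % 4 in (0, 3)]
    found, current, prev = {}, [()], 0
    for k in lengths:
        # Extend every surviving prefix to length k and keep the balanced ones.
        current = [x + tail
                   for x in current
                   for tail in product((1, -1), repeat=k - prev)
                   if weight(x + tail) == 0]
        found[k] = current
        prev = k
    return found

def show(x):
    """Render a tuple of +1/-1 entries as a sign string."""
    return "".join("+" if v == 1 else "-" for v in x)

if __name__ == "__main__":
    for k, seqs in universal_by_length(12).items():
        print(k, [show(x) for x in seqs])
```

Running it lists the survivors at lengths 3, 4, 7, 8, 11 and 12, so the six sequences of length 11 and the absence of any in length 12 can be read off directly.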
5 Related open problems
We propose here a few open problems in the same spirit as that of Steinhaus.

Problem 1 Are there infinitely many symmetric balanced binary sequences, such as X = + + − + − + +? More generally, what is the set of lengths of all such sequences? For instance, it may be shown that there exist no symmetric balanced binary sequences of length n ≡ 4 mod 8.

Problem 2 The balanced sequences X of length 12 and 20 given in [3] have the property that S(X) = 0, where S(X) is the sum of the entries in X. As a consequence, their derived sequences, of length 11 and 19, respectively, are also balanced. It would be of great interest to know, more generally, whether for every n divisible by 4, there exists a balanced binary sequence X of length n satisfying S(X) = 0. We did
find such sequences in every length n = 4k with n ≤ 36. However, we do not know whether they exist in higher length. This problem was suggested by Michel Kervaire during a phone conversation with one of the authors.

Problem 3 For every binary sequence X of length n ≡ 1 or 2 mod 4, the sum S(∆X) of the entries of the derived triangle ∆X of X is an odd number. It is natural to ask whether the value S(∆X) = 1 (respectively S(∆X) = −1) is attained for every n ≡ 1 or 2 mod 4. More generally, given any integer v, are there infinitely many finite binary sequences X such that S(∆X) = v? We know at least that the answer is positive for v = −3, −2, 1, 2, 4 and 5, by taking suitable initial segments of some of the Qi and the Ri. The answer is also positive for v = −1, with the sequence Q1[n] + − for every n ≡ 11 mod 12. Still more generally, what can be said about the generating function $G_n(t) = \sum_X t^{S(\Delta X)}$, where X runs over the set of all binary sequences of length n?

Problem 4 The notion of balanced sequence makes sense not only with entries ±1, but more generally with entries taken in any (commutative) ring R. Indeed, let $X = (x_1, \dots, x_n)$ be a sequence with entries $x_i \in R$ for all i. The derived sequence $\partial X = (y_1, \dots, y_{n-1})$ of X can still be defined by $y_i = x_i x_{i+1}$ for all 1 ≤ i ≤ n − 1, and this gives rise again to the derived triangle ∆X of X, namely the collection of the $\partial^k X$. Of course, the sequence X is said to be balanced if the sum of the entries in ∆X is 0 ∈ R. Are there interesting infinite families of balanced sequences in this more general setting? For instance, let p be a prime number, let ζ be a primitive pth root of unity, and let R = Z[ζ]. In a forthcoming note, we shall show that, for p = 3, the ring R contains infinitely many balanced sequences of powers of ζ. We do not know whether this remains true for larger primes p.

The referee has suggested the following related problem. Let G be a finite group, even a non-abelian one.
Are there infinitely many sequences X with entries in G whose derived triangle ∆X contains the same number of occurrences of each group element?

Problem 5 This is really a family of problems. We may consider higher-dimensional analogues of balanced sequences, such as balanced binary matrices, balanced binary 3-dimensional tensors, or balanced binary simplices for example. In general, the concept of a balanced object X
will make sense whenever there is a suitable notion of derived object X ↦ ∂X, with strictly decreasing sizes. The derived object should be constructed by taking the product of the neighbours for each suitable position in X, as is the case for sequences. A given object X will then be said to be balanced whenever the sum of the entries in the collection of its iterated derived objects $\partial^k X$ is zero. Consider, for example, the following notion of a balanced binary square matrix. If $A = (a_{i,j})_{1 \le i,j \le n}$ is a binary matrix of order n, define ∂A as the binary matrix $(b_{i,j})_{1 \le i,j \le n-1}$ of order n − 1, where $b_{i,j} = a_{i,j}\, a_{i,j+1}\, a_{i+1,j}\, a_{i+1,j+1}$. The derived pyramid ∆A is then defined as the collection of $\partial^k A$ for 0 ≤ k ≤ n − 1. Note that again, the total number of binary entries in ∆A is even if and only if n is congruent to 0 or 3 mod 4. Are there infinitely many balanced binary matrices?

Problem 6 Let X be an arbitrary binary sequence of length n. Does there exist a balanced binary sequence Y having X as an initial segment? (This problem is due to Pierre Duchet.) For instance, let $J_n$ be the constant +1 sequence of length n. What is the length j(n) of a shortest possible balanced binary extension of $J_n$, if one exists at all? We know by construction that j(100) ≤ 236.
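The matrix analogue proposed in Problem 5 can be explored with a few lines of Python. This is a sketch of the definition of ∂A given there, with hypothetical function names; matrices are plain lists of rows with entries +1 or -1.

```python
def derive_matrix(a):
    """Derived matrix: entry (i, j) is the product of the 2x2 block of a
    whose top-left corner is (i, j)."""
    n = len(a)
    return [[a[i][j] * a[i][j + 1] * a[i + 1][j] * a[i + 1][j + 1]
             for j in range(n - 1)]
            for i in range(n - 1)]

def pyramid(a):
    """Derived pyramid: a followed by its iterated derived matrices."""
    layers = [a]
    while len(layers[-1]) > 1:
        layers.append(derive_matrix(layers[-1]))
    return layers

def pyramid_weight(a):
    """Sum of all entries of the derived pyramid; a is balanced when this is 0."""
    return sum(v for layer in pyramid(a) for row in layer for v in row)

def entry_count(n):
    """Total number of entries in the derived pyramid of a matrix of order n."""
    return sum(k * k for k in range(1, n + 1))

if __name__ == "__main__":
    print(pyramid_weight([[1, -1, 1], [1, 1, -1], [-1, 1, 1]]))
    print([n for n in range(1, 13) if entry_count(n) % 2 == 0])
```

The second line of output lists the orders n for which a balanced matrix is not ruled out by parity, matching the condition n ≡ 0 or 3 mod 4 stated in the problem.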
6 Highlights of the proof of Theorem 2.1
We shall give here parts of the proof of Theorem 2.1. There are two things to prove. First, that the initial segments Qi [n] are balanced, for every n ≡ 0 mod 4. And second, that there are no other strongly balanced binary sequences of length n ≡ 0 mod 4, provided n ≥ 92. We shall restrict our attention to Q1 . (The phenomena are similar for Q2 , Q3 , Q4 .) The fact that S(∆Q1 [n]) = 0 for n ≡ 0 mod 4 will follow from a certain periodic structure of the derived triangle ∆Q1 [n]. This structure then allows us to control which extensions of Q1 [n] remain strongly balanced, leading to the classification statement. This is already quite tedious. Consequently, we shall not discuss Theorem 2.2 concerning sequences of length n ≡ 3 mod 4. However, the phenomena are similar again, and it should become clear that a complete proof can be written in this case as well.
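[Figure 1: Structure of the derived triangle of Q1[52]. The triangle is assembled from T16 (sum S = 0) and from NE/SW diagonal strips of width 4 built from the components A1, A2, A3, B1, B2, B3, C1, C2, C3. The sums indicated in the figure are S(A1) = S(A2) = S(A3) = 0, S(B1) = S(C1) = −4, S(B2) = S(C2) = 0 and S(B3) = S(C3) = 4.]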
Recall from subsection 2.1 that Q1 = + − + + (+ + − + + − + − − − + +)^∞. Let n ≡ 0 mod 4 be a given positive integer. We claim that ∆Q1[n] has a periodic structure, as illustrated in Figure 1. More specifically, we will prove that, if n ≥ 16, there are nine types of NE/SW diagonal strips of width 4, denoted A1, A2, A3, B1, B2, B3, C1, C2, C3, such that the derived triangle ∆Q1[n] is the periodic assembly of T16 = ∆Q1[16] and of the components Ai, Bi, Ci, as depicted in Figure 1. Note that the components A1, B1, C1 appear on the top of the derived triangle, the components A3, B3, C3 on its SW side, and A2, B2, C2 occupy the rest of the triangle (except T16). The sum of each component is as indicated (e.g. A1 has sum S(A1) = 0, B1 has sum S(B1) = −4, and so on). According to this structure of ∆Q1[n], we see that each full NE/SW diagonal strip of width 4 on the right of T16 has sum zero, and therefore
S(∆Q1[n]) = 0, as claimed. In order to establish this structure, we need to introduce a few notations selecting certain specific parts of these NE/SW diagonal strips.

Notation.
• $x^q_p$ denotes the pth digit in the qth row of ∆Q1[n], for all 1 ≤ p ≤ n and 1 ≤ q ≤ n − p + 1. In particular, the first row of ∆Q1[n], i.e. Q1[n] itself, is constituted by the elements $x^1_1, x^1_2, \dots, x^1_n$, and the left side of the triangle ∆Q1[n] consists of $x^1_1, x^2_1, \dots, x^n_1$. The basic defining property of the triangle ∆Q1[n] thus reads $x^{q+1}_p = x^q_p\, x^q_{p+1}$.
• $d_i$ denotes the ith NE/SW diagonal of ∆Q1[n], i.e. $d_i$ is the right side of the triangle ∆Q1[i], for all 1 ≤ i ≤ n.
• For i ≡ 1 mod 4 and j ≡ 1 mod 4, 1 ≤ j ≤ i, $T_i^j$ denotes the trapezoid of Figure 3.
• For i ≡ 1 mod 4, we set $S_i = T_i^i$. This special trapezoid $S_i$ corresponds to the last four NE/SW diagonals of ∆Q1[i + 3], and will be called a strip.
• For i ≡ 1 mod 4 and j ≡ 2 mod 4, 2 ≤ j ≤ i, $P_i^j$ denotes the parallelogram of Figure 2, of width 4 and length 12.

A few remarks are in order. First observe that, because of the basic property $x^{q+1}_p = x^q_p\, x^q_{p+1}$, the trapezoid $T_i^j$ is completely determined by its top row and its left side, namely by $x^1_i, x^1_{i+1}, x^1_{i+2}, x^1_{i+3}$ and $x^1_i, x^2_{i-1}, \dots, x^j_{i+1-j}$. Now, this left side of $T_i^j$ is itself determined by $x^1_i$ and by the right side of the adjacent trapezoid $T_{i-4}^{j-4}$. We record these observations as follows.

Fact 1 The trapezoid $T_i^j$ is completely determined by its top row and by the right side of $T_{i-4}^{j-4}$.

Similar remarks can be made about the parallelogram $P_i^j$, and we have:

Fact 2 The parallelogram $P_i^j$ is completely determined by the bottom of the quadrilateral just above it and by the right side of $P_{i-4}^{j-4}$.

Finally, given i ≡ 1 mod 4, let j be the unique element in the set {1, 5, 9} which is congruent to i mod 12. Clearly, with these notations,
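$$\begin{array}{llll}
x^{j}_{i} \\
x^{j+1}_{i-1} & x^{j+1}_{i} \\
x^{j+2}_{i-2} & x^{j+2}_{i-1} & x^{j+2}_{i} \\
x^{j+3}_{i-3} & x^{j+3}_{i-2} & x^{j+3}_{i-1} & x^{j+3}_{i} \\
x^{j+4}_{i-4} & x^{j+4}_{i-3} & x^{j+4}_{i-2} & x^{j+4}_{i-1} \\
\vdots & \vdots & \vdots & \vdots \\
x^{j+11}_{i-11} & x^{j+11}_{i-10} & x^{j+11}_{i-9} & x^{j+11}_{i-8} \\
x^{j+12}_{i-11} & x^{j+12}_{i-10} & x^{j+12}_{i-9} \\
x^{j+13}_{i-11} & x^{j+13}_{i-10} \\
x^{j+14}_{i-11}
\end{array}$$

Figure 2: Parallelogram $P_i^j$ (entries listed row by row).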
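$$\begin{array}{llll}
x^{1}_{i} & x^{1}_{i+1} & x^{1}_{i+2} & x^{1}_{i+3} \\
x^{2}_{i-1} & x^{2}_{i} & x^{2}_{i+1} & x^{2}_{i+2} \\
x^{3}_{i-2} & x^{3}_{i-1} & x^{3}_{i} & x^{3}_{i+1} \\
\vdots & \vdots & \vdots & \vdots \\
x^{j}_{i+1-j} & x^{j}_{i+2-j} & x^{j}_{i+3-j} & x^{j}_{i+4-j} \\
x^{j+1}_{i+1-j} & x^{j+1}_{i+2-j} & x^{j+1}_{i+3-j} \\
x^{j+2}_{i+1-j} & x^{j+2}_{i+2-j} \\
x^{j+3}_{i+1-j}
\end{array}$$

Figure 3: Trapezoid $T_i^j$ (entries listed row by row).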
the strip $S_i$ is the concatenation, in the NE/SW direction, of the trapezoid $T_i^j$ and of the (i − j)/12 parallelograms $P_i^{j+1}, P_i^{j+13}, \dots, P_i^{i-11}$. We will denote the NE/SW concatenation by the symbol +. With this notation, we have

$$S_i = T_i^j + P_i^{j+1} + P_i^{j+13} + \dots + P_i^{i-11}.$$

We now define the 9 special components Ai, Bi, Ci, where A1, B1, C1 are trapezoids, whereas A2, B2, C2, A3, B3, C3 are parallelograms. Each component is listed row by row, from top to bottom.

∗ A1 := $T_{17}^{5}$ =
+ + − +
+ + − −
+ + − +
− + − −
− − − +
+ + −
+ −
−

∗ A2 := $P_{29}^{6}$ =
+
− +
+ − +
+ − − −
+ − + +
− − − +
+ + + −
− + + −
+ − + −
+ − − −
+ − + +
− − − +
+ + −
+ −
−

∗ A3 := $P_{17}^{6}$ =
+
− +
+ − +
+ − − −
+ − + +
− − − +
+ + + −
− + + −
+ − + −
+ − − −
+ − + +
+ − − +
− + −
− −
+

∗ B1 := $T_{21}^{9}$ =
+ − + −
+ − − −
− − + +
− + − +
+ − − −
+ − + +
− − − +
+ + + −
− + + −
− + −
− −
+

∗ B2 := $P_{33}^{10}$ =
+
+ −
+ − +
− − − +
+ + + −
− + + −
+ − + −
+ − − −
+ − + +
− − − +
+ + + −
− + + −
− + −
− −
+

∗ B3 := $P_{21}^{10}$ =
+
+ −
+ − +
− − − +
+ + + −
− + + −
+ − + −
+ − − −
+ − + +
− − − +
+ + + −
+ + + −
+ + −
+ −
−

∗ C1 := $T_{25}^{13}$ =
− − + +
+ + − +
− + − −
− − − +
− + + −
+ − + −
+ − − −
+ − + +
− − − +
+ + + −
− + + −
+ − + −
+ − − −
− + +
− +
−

∗ C2 := $P_{37}^{14}$ =
+
− −
+ + +
− + + −
+ − + −
+ − − −
+ − + +
− − − +
+ + + −
− + + −
+ − + −
+ − − −
− + +
− +
−

∗ C3 := $P_{25}^{14}$ =
+
− −
+ + +
− + + −
+ − + −
+ − − −
+ − + +
− − − +
+ + + −
− + + −
+ − + −
− − − −
+ + +
+ +
+
We shall need to observe some resemblances between some of these components, to be used with Facts 1 and 2. • The SW edge of A1 (respectively B1 , C1 ) is equal to the SW edge of A2 (respectively B2 , C2 ). • The 12-tuple composed by the last 12 digits of the right side of C1 is equal to the 12-tuple containing the digits of the right side of C2 . We claim that the strips Si come in 3 different types, depending on the class i ≡ 1, 5 or 9 mod 12. Here is the general key formula we want to prove: Claim 1 ∀k ∈ N, k ≥ 1,
$$S_{12k+5} = A_1 + (k-1)A_2 + A_3, \qquad S_{12k+9} = B_1 + (k-1)B_2 + B_3, \qquad S_{12(k+1)+1} = C_1 + (k-1)C_2 + C_3.$$
As we will see, this results from the structure of the 9 components Ai , Bi , Ci and Facts 1 and 2, and may be proved by induction on k. To start the induction, one verifies the claim in ∆Q1 [40] by direct observation.
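The diagonals $d_i$ and the strips $S_i$ are easy to generate by machine, which gives an independent way to inspect the components and their sums. The Python sketch below is an illustration with hypothetical names (`q1_segment`, `diagonals`, `strip_sums`); it uses the fact that consecutive entries of a diagonal satisfy $x^{q+1}_p = x^q_p\, x^q_{p+1}$, so each diagonal is obtained from the previous one and the new top entry.

```python
from itertools import cycle, islice

def q1_segment(n):
    """Initial segment Q1[n] as a list of +1/-1 entries."""
    to_signs = lambda text: [1 if ch == "+" else -1 for ch in text]
    pre, per = to_signs("+-++"), to_signs("++-++-+---++")
    return (pre + list(islice(cycle(per), n)))[:n]

def diagonals(seq):
    """Return [d_1, ..., d_n], where d_i is the right side of the derived
    triangle of seq[:i], listed from row 1 downwards."""
    diags, prev = [], []
    for x in seq:
        cur = [x]
        for entry in prev:
            # entry in row q+1 = product of the two row-q entries above it
            cur.append(cur[-1] * entry)
        diags.append(cur)
        prev = cur
    return diags

def strip_sums(seq):
    """Sums of the strips S_1, S_5, S_9, ... made of four consecutive diagonals."""
    d = diagonals(seq)
    return [sum(sum(diag) for diag in d[i:i + 4])
            for i in range(0, len(d) - 3, 4)]

if __name__ == "__main__":
    print(strip_sums(q1_segment(52)))
```

Slicing the diagonals $d_{17}, \dots, d_{20}$ row by row recovers A1 and A3, and the list printed for Q1[52] shows the strip sums that Claim 2 below asserts to be zero.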
Assume now that the claim is true for k = 1, 2. In particular, we know that S37 = C1 + C2 + C3. We will show that S41 = A1 + 2A2 + A3. By periodicity of the sequence Q1, we know that the top of S41 is equal to the top of A1. Thus, using Fact 1, we derive that the trapezoid $T_{41}^{17}$ is equal to A1 + A2. Indeed, it is completely determined by the top of A1 and the right side of C1 (by the hypothesis for S37), and the same is true for A1 + A2 in S29, by the hypothesis for S29. Thus, the parallelogram just under A1 + A2 in S41 is completely determined by the bottom of A2 and the right side of C2, which is equal to the last 12 digits of the right side of C1. According to the verifications we have just made for the previous trapezoid, the same holds for A2, whence Fact 2 implies:

$$T_{41}^{29} = A_1 + A_2 + A_2.$$

Finally, similar arguments enable us to show that the last parallelogram of S41 is equal to the last parallelogram of S29, namely A3. Hence we get S41 = A1 + 2A2 + A3, and we are done. The case k ≥ 3 can be treated in the same way, by induction.

Claim 2 ∀n ≡ 1 mod 4, S(Sn) = 0.

Using Claim 1, it suffices to compute the sum of each of the 9 components Ai, Bi, Ci and of the first irregular strips. For every n ≤ 37, we check the equality by direct computations of sums in the triangle ∆Q1[40]. For n ≥ 41, we have to consider three possibilities, according to Claim 1:
• if n = 12k + 1, k ≥ 2, then S(Sn) = S(C1) + (k − 2)S(C2) + S(C3) = −4 + (k − 2) × 0 + 4 = 0;
• if n = 12k + 5, k ≥ 2, then S(Sn) = S(A1) + (k − 1)S(A2) + S(A3) = 0 + (k − 1) × 0 + 0 = 0;
• if n = 12k + 9, k ≥ 2, then S(Sn) = S(B1) + (k − 1)S(B2) + S(B3) = −4 + (k − 1) × 0 + 4 = 0.
This proves Claim 2. It follows that S(∆Q1[n]) = 0, i.e. that Q1[n] is balanced, for every n ≡ 0 mod 4.

We now turn to the proof of the second part of Theorem 2.1, namely that every strongly balanced binary sequence of length n with n = 4m ≥ 92 is equal to Qi[n] for some 1 ≤ i ≤ 4. We do this by induction on n, starting at n = 92. In order to construct all strongly balanced binary sequences of length 92, we use the method of Section 3, implemented in the given Mathematica functions. For example, issuing the command strong[92] to Mathematica will output exactly four sequences, namely Q1[92], Q2[92], Q3[92] and Q4[92]. This computation uses exact integer arithmetic only. This establishes the case n = 92.

Let n ≥ 92 with n ≡ 0 mod 4. It remains to show that, if X = Qi[n] for some i ∈ {1, 2, 3, 4}, then there is a unique extension X′ of X, of length n + 4, such that X′ is (simply, hence strongly) balanced, and X′ = Qi[n + 4]. (In fact, this statement already holds true for n ≥ 52 if i = 1 or 3, and for n ≥ 64 if i = 2 or 4.) Once again, we restrict our attention to Q1, so X = Q1[n]. We denote an arbitrary extension of length n + 4 of X as the concatenation $Y = Y(x_1, x_2, x_3, x_4) = X x_1 x_2 x_3 x_4$, where $x_1, x_2, x_3, x_4$ are unknown binary digits satisfying $x_i^2 = 1$. Our task is to determine those values of $x_i \in \{\pm 1\}$ for which S(∆Y) = 0. In order to do this, we need to determine the structure of the derived triangle $\Delta Y(x_1, x_2, x_3, x_4)$ in terms of the unknown $x_1, x_2, x_3, x_4$.

Claim 3 For every n ∈ N, n ≡ 0 mod 4, the last strip $S_{n+1}(x_1, x_2, x_3, x_4)$ of the triangle $\Delta(Q_1[n]\, x_1 x_2 x_3 x_4)$ has the following structure:

$$S_{n+1}(x_1, x_2, x_3, x_4) = \begin{cases} C_1' + (k-2)C_2' + C_3' & \text{if } n = 12k, \\ A_1' + (k-1)A_2' + A_3' & \text{if } n = 12k + 4, \\ B_1' + (k-1)B_2' + B_3' & \text{if } n = 12k + 8, \end{cases}$$

where A′1, B′1, C′1 are trapezoids and A′2, B′2, C′2, A′3, B′3, C′3 parallelograms. These components have the same size as the corresponding components Ai, Bi, Ci, and are depicted below.
Not surprisingly, A′i, B′i, C′i share similar properties as Ai, Bi, Ci, i.e. the bottom of A′1 (respectively B′1, C′1) is equal to the bottom of A′2 (respectively B′2, C′2), and the 12-tuple composed by the last 12 digits of the right side of C′1 is equal to the 12-tuple containing the digits of the right side of C′2. Thus, the proof of Claim 3 is similar to that of Claim 1. Here are A′i, B′i, C′i, explicitly. Each component is listed row by row, from top to bottom, with the entries of a row separated by commas.

∗ A′1 =
x1, x2, x3, x4
x1, x1x2, x2x3, x3x4
x1, x2, x1x3, x2x4
−x1, x1x2, x1x2x3, x1x2x3x4
−x1, −x2, x3, x4
x1, x1x2, −x2x3, x3x4
−x1, x2, −x1x3, −x2x4
x1, −x1x2, −x1x2x3, x1x2x3x4
x1, −x2, x3, −x4
x1, −x1x2, −x2x3, −x3x4
−x1, −x2, x1x3, x2x4
x1, x1x2, −x1x2x3, x1x2x3x4
−x1, x2, −x3, −x4
x1, −x1x2, −x2x3, x3x4
x1, −x2, x1x3, −x2x4
x1, −x1x2, −x1x2x3, −x1x2x3x4
−x1, −x2, x3, x4
x1x2, −x2x3, x3x4
−x1x3, −x2x4
x1x2x3x4

∗ A′2 =
x1
−x1, x2
x1, −x1x2, −x1x2x3
x1, −x2, x3, −x4
x1, −x1x2, −x2x3, −x3x4
−x1, −x2, x1x3, x2x4
x1, x1x2, −x1x2x3, x1x2x3x4
−x1, x2, −x3, −x4
x1, −x1x2, −x2x3, x3x4
x1, −x2, x1x3, −x2x4
x1, −x1x2, −x1x2x3, −x1x2x3x4
−x1, −x2, x3, x4
x1x2, −x2x3, x3x4
−x1x3, −x2x4
x1x2x3x4

∗ A′3 =
x1
−x1, x2
x1, −x1x2, −x1x2x3
x1, −x2, x3, −x4
x1, −x1x2, −x2x3, −x3x4
−x1, −x2, x1x3, x2x4
x1, x1x2, −x1x2x3, x1x2x3x4
−x1, x2, −x3, −x4
x1, −x1x2, −x2x3, x3x4
x1, −x2, x1x3, −x2x4
x1, −x1x2, −x1x2x3, −x1x2x3x4
x1, −x2, x3, x4
−x1x2, −x2x3, x3x4
x1x3, −x2x4
−x1x2x3x4

∗ B′1 =
x1, x2, x3, x4
x1, x1x2, x2x3, x3x4
−x1, x2, x1x3, x2x4
−x1, −x1x2, x1x2x3, x1x2x3x4
x1, x2, −x3, x4
x1, x1x2, −x2x3, −x3x4
−x1, x2, −x1x3, x2x4
x1, −x1x2, −x1x2x3, −x1x2x3x4
−x1, −x2, x3, x4
x1x2, −x2x3, x3x4
−x1x3, −x2x4
x1x2x3x4

∗ B′2 =
x1
x1, x2
x1, x1x2, −x1x2x3
−x1, x2, −x3, −x4
x1, −x1x2, −x2x3, x3x4
−x1, −x2, x1x3, −x2x4
x1, x1x2, −x1x2x3, −x1x2x3x4
x1, x2, −x3, x4
x1, x1x2, −x2x3, −x3x4
−x1, x2, −x1x3, x2x4
x1, −x1x2, −x1x2x3, −x1x2x3x4
−x1, −x2, x3, x4
x1x2, −x2x3, x3x4
−x1x3, −x2x4
x1x2x3x4

∗ B′3 =
x1
x1, x2
x1, x1x2, −x1x2x3
−x1, x2, −x3, −x4
x1, −x1x2, −x2x3, x3x4
−x1, −x2, x1x3, −x2x4
x1, x1x2, −x1x2x3, −x1x2x3x4
x1, x2, −x3, x4
x1, x1x2, −x2x3, −x3x4
−x1, x2, −x1x3, x2x4
x1, −x1x2, −x1x2x3, −x1x2x3x4
x1, −x2, x3, x4
−x1x2, −x2x3, x3x4
x1x3, −x2x4
−x1x2x3x4

∗ C′1 =
x1, x2, x3, x4
−x1, x1x2, x2x3, x3x4
x1, −x2, x1x3, x2x4
x1, −x1x2, −x1x2x3, x1x2x3x4
x1, −x2, x3, −x4
−x1, −x1x2, −x2x3, −x3x4
−x1, x2, x1x3, x2x4
−x1, −x1x2, x1x2x3, x1x2x3x4
x1, x2, −x3, x4
−x1, x1x2, −x2x3, −x3x4
x1, −x2, −x1x3, x2x4
−x1, −x1x2, x1x2x3, −x1x2x3x4
−x1, x2, −x3, −x4
−x1x2, −x2x3, x3x4
x1x3, −x2x4
−x1x2x3x4

∗ C′2 =
−x1
x1, x2
−x1, x1x2, x1x2x3
x1, −x2, x3, −x4
−x1, −x1x2, −x2x3, −x3x4
−x1, x2, x1x3, x2x4
−x1, −x1x2, x1x2x3, x1x2x3x4
x1, x2, −x3, x4
−x1, x1x2, −x2x3, −x3x4
x1, −x2, −x1x3, x2x4
−x1, −x1x2, x1x2x3, −x1x2x3x4
−x1, x2, −x3, −x4
−x1x2, −x2x3, x3x4
x1x3, −x2x4
−x1x2x3x4

∗ C′3 =
−x1
x1, x2
−x1, x1x2, x1x2x3
x1, −x2, x3, −x4
−x1, −x1x2, −x2x3, −x3x4
−x1, x2, x1x3, x2x4
−x1, −x1x2, x1x2x3, x1x2x3x4
x1, x2, −x3, x4
−x1, x1x2, −x2x3, −x3x4
x1, −x2, −x1x3, x2x4
−x1, −x1x2, x1x2x3, −x1x2x3x4
x1, x2, −x3, −x4
x1x2, −x2x3, x3x4
−x1x3, −x2x4
x1x2x3x4

We are now in a position to determine the sum S(∆Q1[n]x1x2x3x4) in terms of the xi. Since we already know that S(∆Q1[n]) = 0, it follows
that $S(\Delta Q_1[n]\, x_1 x_2 x_3 x_4) = S(S_{n+1}(x_1, x_2, x_3, x_4))$. From this remark and Claim 3, we have, for all n ≥ 36:

$$S(\Delta Q_1[n]\, x_1 x_2 x_3 x_4) = \begin{cases} S(C_1') + (k-2)S(C_2') + S(C_3') & \text{if } n = 12k, \\ S(A_1') + (k-1)S(A_2') + S(A_3') & \text{if } n = 12k + 4, \\ S(B_1') + (k-1)S(B_2') + S(B_3') & \text{if } n = 12k + 8. \end{cases}$$
Writing n = 12k + r with r ∈ {0, 4, 8}, we shall use the notation wk,r (x1 , x2 , x3 , x4 ) = S(∆Q1 [n]x1 x2 x3 x4 ).
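The quantity $w_{k,r}$ can also be evaluated by brute force, straight from its definition as the sum of a derived triangle, which gives a way to cross-check the closed formulas and the tables of values in the three cases. The Python sketch below is an illustration only; `w` and `balanced_extensions` are hypothetical names and the transcription of Q1 uses ASCII signs.

```python
from itertools import cycle, islice, product

def weight(seq):
    """Sum of all entries of the derived triangle of seq."""
    total, row = 0, list(seq)
    while row:
        total += sum(row)
        row = [a * b for a, b in zip(row, row[1:])]
    return total

def q1_segment(n):
    """Initial segment Q1[n] as a list of +1/-1 entries."""
    to_signs = lambda text: [1 if ch == "+" else -1 for ch in text]
    pre, per = to_signs("+-++"), to_signs("++-++-+---++")
    return (pre + list(islice(cycle(per), n)))[:n]

def w(k, r, x):
    """S of the derived triangle of Q1[12k + r] extended by x = (x1, x2, x3, x4)."""
    return weight(q1_segment(12 * k + r) + list(x))

def balanced_extensions(k, r):
    """All 4-digit extensions x for which Q1[12k + r] x is balanced."""
    return [x for x in product((1, -1), repeat=4) if w(k, r, x) == 0]

if __name__ == "__main__":
    for k in range(0, 8):
        print(k, balanced_extensions(k, 0))
```

According to the case analysis, only (−1, −1, 1, 1) should survive in the printed lists once k ≥ 5, while smaller k may show additional extensions.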
Computing explicitly S(A′i), S(B′i), S(C′i) from the above figures, we get:

$$w_{k,0}(x_1,x_2,x_3,x_4) = 5x_1 + x_2(-1 + x_1) + x_3(1 - x_1 + x_2 - 2x_1x_2) + x_4(1 + x_2 + x_3 + 3x_1x_2x_3) + k\,[-4x_1 + 2x_2(1 - x_1) + x_3(-1 + x_1 - 3x_2 + 3x_1x_2) + x_4(-1 + x_2 - x_3 - x_1x_2x_3)],$$

$$w_{k,4}(x_1,x_2,x_3,x_4) = 3x_1 + x_2(1 + x_1) + x_3(2 + 2x_1 + x_1x_2) + 2x_4(1 + x_3) + k\,[4x_1 - 2x_2(1 + x_1) + x_3(1 + x_1 - 3x_2 - 3x_1x_2) + x_4(-1 - x_2 + x_3 + x_1x_2x_3)],$$

$$w_{k,8}(x_1,x_2,x_3,x_4) = 3x_1 + x_2(3 - x_1) + x_3(1 + x_1 - x_2) + x_4(3 + x_2 + x_3 - x_1x_2x_3) + k\,[4x_1 + 2x_2(1 + x_1) + x_3(-1 - x_1 - 3x_2 - 3x_1x_2) + x_4(1 - x_2 + x_3 - x_1x_2x_3)].$$

Successively replacing $(x_1, x_2, x_3, x_4)$ by each of the 16 binary sequences of length 4, we obtain 48 polynomial functions of degree 1 in k. We must then determine the zeroes of these polynomials.

Case 1: r = 0, i.e. we consider sequences of the type Q1[12k]x1x2x3x4. We obtain the following values of $w_{k,0}(x_1, x_2, x_3, x_4) = S(\Delta Q_1[12k]\, x_1 x_2 x_3 x_4)$:
w_{k,0}(1, 1, 1, 1) = 10 − 6k
w_{k,0}(1, 1, 1, −1) = −2 − 2k
w_{k,0}(1, 1, −1, 1) = 4 − 2k
w_{k,0}(1, 1, −1, −1) = 8 − 6k
w_{k,0}(1, −1, 1, 1) = 4 − 6k
w_{k,0}(1, −1, 1, −1) = 8 − 2k
w_{k,0}(1, −1, −1, 1) = 6 − 6k
w_{k,0}(1, −1, −1, −1) = 2 − 2k
w_{k,0}(−1, 1, 1, 1) = −2
w_{k,0}(−1, 1, 1, −1) = −2
w_{k,0}(−1, 1, −1, 1) = −8 + 16k
w_{k,0}(−1, 1, −1, −1) = −16 + 16k
w_{k,0}(−1, −1, 1, 1) = 0
w_{k,0}(−1, −1, 1, −1) = −8 + 8k
w_{k,0}(−1, −1, −1, 1) = −6 − 4k
w_{k,0}(−1, −1, −1, −1) = 2 − 4k.
Given that w_{k,0}(−1, −1, 1, 1) = 0, independently of k, we see that the sequence Q1[12k] − − + + is (simply, hence strongly) balanced. But, as easily checked, Q1[12k] − − + + = Q1[12k + 4]. The 15 other polynomials may vanish for small values of k, yielding "exotic" short strongly balanced sequences. However, direct inspection reveals that none of these other functions vanishes for k ≥ 5. Consequently, Q1[12k + 4] is the unique balanced extension of length 12k + 4 of Q1[12k], provided k ≥ 5. Note that, in the context of this proof, we have k ≥ 7 in fact, since we are assuming n ≥ 92.

Case 2: r = 4, i.e. we consider the binary sequences Q1[12k + 4]x1x2x3x4. We obtain the following values of w_{k,4}(x1, x2, x3, x4):

w_{k,4}(1, 1, 1, 1) = 14 − 4k
w_{k,4}(1, 1, 1, −1) = 6 − 4k
w_{k,4}(1, 1, −1, 1) = 0
w_{k,4}(1, 1, −1, −1) = 8k
w_{k,4}(1, −1, 1, 1) = 8 + 16k
w_{k,4}(1, −1, 1, −1) = 16k
w_{k,4}(1, −1, −1, 1) = −2
w_{k,4}(1, −1, −1, −1) = −2
w_{k,4}(−1, 1, 1, 1) = −6k
w_{k,4}(−1, 1, 1, −1) = −8 − 2k
w_{k,4}(−1, 1, −1, 1) = −2 − 6k
w_{k,4}(−1, 1, −1, −1) = −2 − 2k
w_{k,4}(−1, −1, 1, 1) = 2 − 2k
w_{k,4}(−1, −1, 1, −1) = −6 − 6k
w_{k,4}(−1, −1, −1, 1) = −4 − 6k
w_{k,4}(−1, −1, −1, −1) = −4 − 2k.

Here we have w_{k,4}(1, 1, −1, 1) = 0, independently of k. Thus, the sequence Q1[12k + 4] + + − + is (simply, hence strongly) balanced. Again, one easily checks that Q1[12k + 4] + + − + = Q1[12k + 8]. The other 15 functions do not vanish for k ≥ 2. Therefore, Q1[12k + 8] is the unique balanced extension of length 12k + 8 of Q1[12k + 4], provided k ≥ 2.

Case 3: r = 8, i.e. we consider the binary sequences Q1[12k + 8]x1x2x3x4. Here are the values of w_{k,8}(x1, x2, x3, x4):

w_{k,8}(1, 1, 1, 1) = 10
w_{k,8}(1, 1, 1, −1) = 2
w_{k,8}(1, 1, −1, 1) = 8 + 16k
w_{k,8}(1, 1, −1, −1) = 16k
w_{k,8}(1, −1, 1, 1) = 8 + 8k
w_{k,8}(1, −1, 1, −1) = 0
w_{k,8}(1, −1, −1, 1) = −2 − 4k
w_{k,8}(1, −1, −1, −1) = −2 − 4k
w_{k,8}(−1, 1, 1, 1) = 6 − 2k
w_{k,8}(−1, 1, 1, −1) = −6 − 6k
w_{k,8}(−1, 1, −1, 1) = 4 − 6k
w_{k,8}(−1, 1, −1, −1) = −2k
w_{k,8}(−1, −1, 1, 1) = −4 − 2k
w_{k,8}(−1, −1, 1, −1) = −8 − 6k
w_{k,8}(−1, −1, −1, 1) = −6 − 2k
w_{k,8}(−1, −1, −1, −1) = −10 − 6k.

Again, w_{k,8}(1, −1, 1, −1) = 0, but none of the other 15 functions vanishes for k ≥ 4. Moreover, Q1[12k + 8] + − + − = Q1[12k + 12]. Therefore, Q1[12k + 12] is the unique balanced extension of length 12k + 12 of Q1[12k + 8] for k ≥ 4.

With the above three cases, we have verified that, for every n ≡ 0 mod 4 with n ≥ 52, the sequence Q1[n] admits a unique balanced binary extension of length n + 4, namely Q1[n + 4]. Similar phenomena as those described here for Q1 occur for the other sequences Q2, Q3, Q4, R1, ..., R12, and for the supplementary strongly balanced sequences described in Theorem 2.2. This explains why, after a somewhat chaotic initial behavior, the set SB(n) of strongly balanced binary sequences of length n ultimately becomes periodic.

Acknowledgement We are grateful to Pierre Duchet for attracting our attention to the problem of Steinhaus. It is worthwhile to note that Duchet uses this problem with school pupils, by challenging them to find as large balanced binary sequences as possible. Interestingly, some pupils were able to discover such sequences in length as high as 123, or even 240 [1]. We also thank Michel Kervaire, Pierre de la Harpe and the anonymous referee for useful comments about this paper.

Shalom Eliahou
Laboratoire de Mathématiques Pures et Appliquées Joseph Liouville
Université du Littoral Côte d'Opale
B.P. 699, 62228 Calais cedex, France.
eliahou@lmpa.univ-littoral.fr
Delphine Hachez
Laboratoire de Mathématiques Pures et Appliquées Joseph Liouville
Université du Littoral Côte d'Opale
B.P. 699, 62228 Calais cedex, France.
hachez@lmpa.univ-littoral.fr
References

[1] Duchet P., La règle des signes, MATh.en.JEANS (1995), 139–140.
[2] Harborth H., Solution of Steinhaus's problem with plus and minus signs, J. Comb. Th. (A) 12 (1972), 253–259.
[3] Steinhaus H., One Hundred Problems in Elementary Mathematics, Pergamon, Elmsford, N.Y., 1963.
Morfismos, Vol. 8, No. 2, 2004
Errata
In Vol. 8, No. 1 (June 2004), page 55, references 3 and 4 should be:
[3] de Mier A.; Noy M., On graphs determined by their Tutte polynomials, Graphs Combin. 20 (2004), 105–119.
[4] Márquez A.; de Mier A.; Noy M.; Revuelta M. P., Locally grid graphs: Classification and Tutte uniqueness, Discr. Math. 266 (2003), 327–352.
Morfismos, Comunicaciones Estudiantiles del Departamento de Matemáticas del CINVESTAV, was printed in June 2005 at the reproduction workshop of the same department, located at Av. IPN 2508, Col. San Pedro Zacatenco, México, D.F. 07300. The print run, on imported opaline paper of 36 kilograms and 34 × 25.5 cm, consists of 500 copies bound in green "tintoreto" cover stock.
Technical support: Omar Hernández Orozco.
Contents

Homotopy triangulations of a manifold triple, Rolando Jiménez and Yuri V. Muranov
On information measures and prior distributions: a synthesis, Francisco Venegas-Martínez
On a problem of Steinhaus concerning binary sequences, Shalom Eliahou and Delphine Hachez
Errata