1.3 Periodicity

In this section, some basic properties of periodic functions are carefully established, from a viewpoint that favors investigation of periodicities in discrete-domain data.

The next section addresses more delicate properties of periodic functions.

Introduction to Periodic Functions

Roughly, a periodic function is one with values that repeat some basic pattern. The graphs of two periodic functions are shown below.

The usual definition of a periodic function is adapted, as follows, to better suit the investigation of periodicities in discrete-domain data:

Definition period $\,p\,$; periodic function

Let $\,f\,$ be a real-valued function, with $\,{\cal D}(f) \subset \Bbb R\,.$

The function $\,f\,$ has a period $\,p\,,$ where $\,p\,$ is a real number, if and only if the domain of $\,f\,$ contains both $\,x + p\,$ and $\,x - p\,$ whenever it contains $\,x\,,$ and if

$$f(x + p) = f(x-p) = f(x)$$

for all $\,x\in{\cal D}(f)\,.$

A function $\,f\,$ is periodic if and only if it has a nonzero period.

The symbol $\,x\pm p$

The symbol ‘$\,\pm\,$’ is read as ‘plus or minus’, and the abbreviation ‘$\,x \pm p\,$’ is used for ‘$\,x + p\ \ \text{ or }\ \ x-p\,$’.

Similarly, ‘$\,f(x \pm p)\,$’ is shorthand for:

$$ f(x + p)\ \ \text{ or }\ \ f(x-p) $$

The requirement that both $\,x + p\,$ and $\,x - p\,$ be in the domain of $\,f\,$ is essential. It assures that one can always move both to the right and to the left in the domain of $\,f\,.$ Without this requirement, it is possible to have ‘functions with a nonzero period’ that no reasonable person would want to call periodic, as illustrated in a later example.
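As a small illustration of the definition (the function below is chosen here for illustration, and is not one of the graphed examples), let $\,f : \Bbb Z \to \Bbb R\,$ be defined by $\,f(n) = (-1)^n\,.$ Whenever $\,n\,$ is an integer, so are $\,n + 2\,$ and $\,n - 2\,,$ and:

$$ \begin{align} f(n \pm 2) &= (-1)^{n}\cdot (-1)^{\pm 2}\cr\cr &= (-1)^n\cr\cr &= f(n) \end{align} $$

Thus, $\,f\,$ has a period $\,2\,$ and is periodic. It does not have a period $\,1\,,$ since $\,f(n + 1) = -f(n) \ne f(n)\,.$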

Some immediate consequences of the definition

Here are some immediate consequences of the definition just given:

Every function has period $\,0$

Every function has period $\,0\,,$ since $\,f(x \pm 0) = f(x)\,$ for all $x \in {\cal D}(f)\,.$

Symmetry in the definition; if $\,f\,$ has a nonzero period, then it has a positive period

Whenever a function $\,f\,$ has a period $\,p\,,$ it also has a period $\,-p\,.$ This is a consequence of the symmetry in the definition. Thus, if a function $\,f\,$ has any nonzero period, then it necessarily has a positive period.

A constant function has all periods

A constant function $\,f : \Bbb R \to \Bbb R\,$ has all periods, since for all real numbers $\,x\,$ and $\,p\,,$ $\,f(x \pm p) = f(x)\,.$

Example: A Function with Rational Number Periods

Define $\,g : \Bbb R \to \Bbb R\,$ by:

$$ g(x) = \cases{ 1 & \text{for $\,x\in\Bbb Q$}\cr\cr 0 & \text{for $\,x\notin\Bbb Q$} } $$

Every rational number is a period of this function, and these are the only periods. To see this, argue as follows:

Let $\,p\,$ be any rational number. Then,

$$ \begin{align} x\in \Bbb Q &\implies x\pm p \in \Bbb Q\cr\cr &\implies 1 = g(x) = g(x\pm p) \end{align} $$

and

$$ \begin{align} x\notin \Bbb Q &\implies x\pm p \notin \Bbb Q\cr\cr &\implies 0 = g(x) = g(x\pm p)\,, \end{align} $$

which shows that every rational number is a period of $\,g\,.$ The argument used two facts: a sum of rational numbers is rational, and the sum of a rational number and an irrational number is irrational.

To see that no irrational number is a period, let $\,p\,$ be any irrational number, and choose $\,x := -p\,.$ Thus, $\,x\,$ is irrational, so $\,g(x) = 0\,.$

However, $\,x + p = -p + p = 0\,$ is rational, so $\,g(x + p) = 1\,.$ Thus, $\,g(x) = g(x + p)\,$ does not hold for all $\,x\,,$ so the irrational number $\,p\,$ is not a period of $\,g\,.$

Example: A Function with Period $\,2$

The discrete-domain function $\,f\,$ graphed below has period $\,2\,.$ Observe that it also has periods $\,2k\,$ for all integers $\,k\,.$

Building new periodic functions

The next result gives a way to build ‘new’ periodic functions from functions with a common domain and a common period.

Lemma 1 sums, scalar multiples, and products of functions with a period $\,p\,$

Let $\,f\,$ and $\,g\,$ be functions with common domain $\,\cal D\,$ and a common period $\,p\,.$

Then the functions $\,f\pm g\,,$ $\,fg\,,$ and $\,kf\,$ (for all real numbers $\,k\,$) also have a period $\,p\,.$

If $\,g(x) \neq 0\,$ for all $\,x \in \cal D\,,$ then $\,\frac fg\,$ has a period $\,p\,.$

Proof

Recall that the functions $\,f\pm g\,,$ $\,fg\,,$ $\,kf\,$ and $\,\frac fg\,$ are defined by:

$$ \begin{gather} (f\pm g)(x) := f(x) \pm g(x)\cr\cr (fg)(x) := f(x)\cdot g(x)\cr\cr (kf)(x) := k\cdot f(x)\cr\cr \bigl(\frac fg\bigr)(x) := \frac{f(x)}{g(x)}\,,\ \ g(x)\ne 0 \end{gather} $$

Now, for every $\,x\in \cal D\,$:

$$ \begin{align} (f + g)(x\pm p) &:= f(x\pm p) + g(x\pm p)\cr\cr &= f(x) + g(x)\cr\cr &:= (f+g)(x) \end{align} $$

This shows that $\,f + g\,$ also has a period $\,p\,.$ The remaining cases are similar. $\blacksquare$
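For instance, the product case can be written out in the same way (a sketch of one of the ‘remaining cases’). The domain of $\,fg\,$ is $\,\cal D\,,$ which contains $\,x \pm p\,$ whenever it contains $\,x\,,$ because $\,f\,$ has a period $\,p\,.$ For every $\,x\in \cal D\,$:

$$ \begin{align} (fg)(x\pm p) &= f(x\pm p)\cdot g(x\pm p)\cr\cr &= f(x)\cdot g(x)\cr\cr &= (fg)(x) \end{align} $$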

Lemma 1 holds for all finite sums and products

By induction, it follows easily that Lemma 1 also holds for all finite sums and products.

For example, suppose that functions $\,f\,,$ $\,g\,$ and $\,h\,$ have common domain $\,\cal D\,$ and a common period $\,p\,.$ Then, one application of the lemma shows that $\,f+g\,$ has a period $\,p\,$; a second application shows that $\,(f + g) + h\,$ has a period $\,p\,.$

If a pattern repeats itself on an interval of length $\,p\,,$ then it necessarily repeats itself on intervals of length $\,2p\,,$ $\,3p\,,$ $\,4p,\,\ldots\,.$ This idea is formalized next.

Lemma 2 any multiple of a period is also a period

If a function $\,f\,$ has a period $\,p\,,$ then it also has periods $\,kp\,$ for $\,k \in \Bbb Z\,.$

In particular, whenever $\,x \in {\cal D}(f)\,,$ then so are $\,x + kp\,,\ \ k \in \Bbb Z\,.$

Proof: A Typical Induction Argument

The proof proceeds by induction. Let $\,S(k)\,$ be the statement:

‘ $\,f\,$ has a period $\,kp\,$’

The function $\,f\,$ has period $\,0\cdot p = 0\,,$ since $\,f(x \pm 0) = f(x)\,$ for all $\,x \in {\cal D}(f)\,.$ Thus, $\,S(0)\,$ is true.

In what follows, it is shown that whenever $\,S(k)\,$ is true (for $\,k \ge 0\,$), then $\,S(k + 1)\,$ is also true. Since $\,-p\,$ is a period of $\,f\,$ whenever $\,p\,$ is, this will complete the proof.

For induction, suppose that $\,f\,$ has period $\,kp\,$ (i.e., $\,S(k)\,$ is true). Then:

$$ \begin{align} &x\in {\cal D}(f)\cr\cr &\quad \implies x\pm kp \in {\cal D}(f)\cr &\qquad\qquad \text{($\,f\,$ has a period $\,kp\,$)}\cr\cr &\quad \implies (x\pm kp) \pm p \in {\cal D}(f)\cr &\qquad\qquad \text{($\,f\,$ has a period $\,p\,$)} \end{align} $$

In particular, both $\,x + (k+1)p\,$ and $\,x - (k+1)p\,$ are in $\,{\cal D}(f)\,.$ Furthermore:

$$ \begin{align} &f(x)\cr\cr &\quad = f(x + kp) = f(x - kp)\cr &\qquad \text{($\,f\,$ has a period $\,kp\,$)}\cr\cr &\quad = f\bigl((x + kp) + p\bigr) = f\bigl((x - kp) - p\bigr)\cr &\qquad \text{($\,f\,$ has a period $\,p\,$)}\cr\cr &\quad = f\bigl(x + (k + 1)p\bigr) = f\bigl(x - (k+1)p\bigr) \end{align} $$

It has been shown that whenever $\,x\in {\cal D}(f)\,,$ so are $\,x \pm (k+1)p\,,$ and $\,f(x) = f\bigl( x \pm (k+1)p\bigr)\,.$ Thus, $\,f\,$ has a period $\,(k + 1)p\,.$ Therefore, $\,S(k + 1)\,$ is true. $\blacksquare$
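To make the inductive step concrete, here is the case $\,k = 1\,$ written out, showing that $\,2p\,$ is a period whenever $\,p\,$ is. If $\,x\in{\cal D}(f)\,,$ then $\,x \pm p \in {\cal D}(f)\,,$ and hence $\,(x\pm p)\pm p = x \pm 2p\,$ is also in $\,{\cal D}(f)\,.$ Furthermore:

$$ \begin{align} f(x + 2p) &= f\bigl((x + p) + p\bigr) = f(x + p) = f(x)\cr\cr f(x - 2p) &= f\bigl((x - p) - p\bigr) = f(x - p) = f(x) \end{align} $$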

Lemma 3 if $\,f\,$ has a period $\,p\,,$ then domain elements $\,|p|\,$ apart have identical function values

Let $\,f\,$ have a period $\,p\,.$ Any two domain elements whose distance apart is $\,|p|\,$ must have identical function values.

Whenever two domain elements are any multiple of $\,|p|\,$ apart, then they must have identical function values.

Proof

Suppose that $\,f\,$ has a period $\,p\,.$ Let $\,x\,$ and $\,y\,$ be in the domain of $\,f\,,$ with $\,|x - y| = |p|\,.$ Switching names, if necessary, suppose that $\,x \ge y\,.$

If $\,p \ge 0\,,$ then $\,x = y + p\,,$ and so:

$$f(x) = f(y + p) = f(y)$$

If $\,p \lt 0\,,$ then $\,y = x + p\,,$ and so:

$$f(y) = f(x + p) = f(x)$$

In either case, $\,f(x) = f(y)\,.$

By the previous lemma, $\,f\,$ also has periods $\,kp\,,$ for all $\,k \in \Bbb Z\,.$ So if $\,|x - y| = |kp|\,,$ then an application of the result just proved shows that $\,f(x) = f(y)\,.\ \ \blacksquare$
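As a quick illustration (the function here is chosen only as an example), let $\,h : \Bbb Z \to \Bbb R\,$ assign to each integer $\,n\,$ its remainder upon division by $\,3\,,$ so that $\,h\,$ has a period $\,3\,.$ The domain elements $\,10\,$ and $\,1\,$ are a multiple of $\,3\,$ apart, so they must have the same function value:

$$ |10 - 1| = 9 = |3\cdot 3| \quad\implies\quad h(10) = 1 = h(1) $$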

The next lemma shows that if a function has a positive period $\,p\,,$ then on any half-open interval of length $\,p\,,$ the function must take on all of its function values. Thus, there is no ‘natural starting place’ for the repeating pattern that the function exhibits.

The sketch below illustrates that a function with a positive period $\,p\,$ may not take on all its function values on an open interval of length $\,p\,.$ It is interesting to note that this is a ‘fault’ of the definition of the length of an interval; the intervals $\,(a, b)\,,$ $\,[a, b]\,,$ $\,[a, b)\,$ and $\,(a, b]\,$ all have the same length.

a function with positive period p may NOT take on all its function values on an open interval of length p

Period $\,3\,$; take $\,I := (1,4)$

Lemma 4 a function with a positive period $\,p\,$ takes on all its function values on any half-open interval of length $\,p$

Let $\,f\,$ have a positive period $\,p\,.$ On any half-open interval of length $\,p\,,$ $\,f\,$ must take on all the values in its range.

On any interval that contains a half-open interval of length $\,p\,,$ $\,f\,$ must take on all its output values.

Proof

Let $\,f\,$ have a positive period $\,p\,,$ and let $\,I\,$ be any half-open interval of length $\,p\,.$ Thus, $\,I\,$ is of the form $\,(a, b]\,$ or $\,[a, b)\,,$ where $\,b - a = p\,.$

Suppose, for contradiction, that there is an output of $\,f\,$ which is not assumed in $\,I\,$; that is, suppose there exists $\,y \in {\cal R}(f)\,$ for which there is no $\,x \in ({\cal D}(f) \cap I)\,$ with $\,f(x) = y\,.$

Since $\,y\in {\cal R}(f)\,,$ there must exist $\,\tilde x\in {\cal D}(f)\,$ with $\,y = f(\tilde x)\,.$ Write $\,\tilde x = x + np\,$ for some $\,x \in I\,,\ n \in \Bbb Z\,.$ (To see that this is always possible, merely place enough copies of the interval $\,I\,,$ each of length $\,p\,,$ end-to-end so that they eventually cover $\,\tilde x\,$. See the sketch below.)

Since $\,\tilde x\,$ is in the domain of $\,f\,,$ and the distance from $\,\tilde x\,$ to $\,x\,$ is a multiple of $\,p\,,$ $\,x\,$ is also in $\,{\cal D}(f)\,.$ Furthermore,

$$ \begin{align} y = f(\tilde x) &= f(x + np)\cr &= f(x)\,, \end{align} $$

yielding the desired contradiction. $\blacksquare$
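The decomposition $\,\tilde x = x + np\,$ used in the proof can be made explicit. The formulas below are one way to do this; the floor and ceiling notation is introduced here and is not used elsewhere in this section.

$$ \begin{gather} I = [a,a+p):\quad n := \left\lfloor \frac{\tilde x - a}{p}\right\rfloor\cr\cr I = (a,a+p]:\quad n := \left\lceil \frac{\tilde x - a}{p}\right\rceil - 1\cr\cr \text{in either case,}\quad x := \tilde x - np \in I \end{gather} $$

For example, with $\,p = 3\,,$ $\,I = [1,4)\,$ and $\,\tilde x = 11.5\,,$ one gets $\,n = \lfloor 10.5/3\rfloor = 3\,$ and $\,x = 11.5 - 9 = 2.5 \in [1,4)\,.$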

Investigating a different definition for a periodic function

It is tempting to simplify the definition of a periodic function as follows:

Proposed Definition: The function $\,f\,$ has a period $\,p\,$ if the domain of $\,f\,$ contains $\,x + p\,$ whenever it contains $\,x\,,$ and if

$$f(x + p) = f(x)$$

for all $\,x\in {\cal D}(f)\,.$

With this ‘proposed definition’, a function need not have a period $\,-p\,$ whenever it has a period $\,p\,.$

For example, the function graphed below has a period $\,2\,,$ but not a period $\,-2\,.$

Domain = $\,\{1,2,3,\ldots\}$

The problem with this ‘proposed definition’ is that it is possible to define functions with nonzero periods that no reasonable person would want to call periodic. This is illustrated in the next example.

Example

Define a function $\,f\,$ as follows: let $\,0\in {\cal D}(f)\,,$ and require $\,f\,$ to have periods $\,1\,$ and $\,\pi\,$ (as per the ‘proposed definition’).

Then, the numbers $\,1,2,3,4,\ldots\,$ must all be in $\,{\cal D}(f)\,,$ as must $\,\pi,2\pi,3\pi,4\pi,\ldots\,.$

Then, all elements of the form $\,k + n\pi\,$ must be in $\,{\cal D}(f)\,,$ where $\,k\,$ and $\,n\,$ are nonnegative integers. Now, the set $\{k + n\pi\ |\ k,n\in\Bbb Z,\ k\ge 0,\ n\ge 0\}$ is closed under addition (in particular, under adding $\,1\,$ or $\,\pi\,$), so take this set to be $\,{\cal D}(f)\,.$

Define $\,f\,$ to be $\,1\,$ everywhere on its domain. A moment’s reflection confirms that $\,f\,$ does indeed have periods $\,1\,$ and $\,\pi\,.$

The problem, however, is this: the graph of the function does not periodically repeat itself as one moves from left to right, since new domain elements are constantly being added!

In particular, the graph of $\,f\,$ does not repeat itself on intervals of length $\,1\,$ or $\,\pi\,,$ so most people would feel uncomfortable calling $\,f\,$ ‘periodic’. A partial graph of $\,f\,$ is shown next.

A ‘periodic’ function?
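One way to quantify the failure (a sketch; this counting argument is an addition to the example) is to count the domain elements in the unit intervals $\,[m,m+1)\,,$ for integers $\,m \ge 0\,.$ For each integer $\,n\ge 0\,$ with $\,n\pi \lt m+1\,,$ there is exactly one integer $\,k\ge 0\,$ with $\,k + n\pi \in [m,m+1)\,,$ and different values of $\,n\,$ give different elements, since $\,\pi\,$ is irrational. Hence:

$$ \begin{gather} \#\bigl({\cal D}(f)\cap [m,m+1)\bigr) = \left\lfloor \frac{m+1}{\pi}\right\rfloor + 1\cr\cr [0,1):\ \ \{0\}\cr [3,4):\ \ \{3,\ \pi\}\cr [6,7):\ \ \{6,\ 3+\pi,\ 2\pi\}\cr [9,10):\ \ \{9,\ 6+\pi,\ 3+2\pi,\ 3\pi\} \end{gather} $$

The number of domain elements in $\,[m,m+1)\,$ grows without bound as $\,m\,$ increases, which is incompatible with a pattern that repeats on intervals of length $\,1\,.$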

Contents of the next section

Thus far, this section has carefully established basic properties of functions with a period $\,p\,.$ Many interesting questions arise when one considers the set of all periods of a function.

Does every periodic function have a least positive period? (No.) If a periodic function does not have a least positive period, can anything be said about it? (Yes. The periods must be dense in $\Bbb R\,,$ and the function must be either constant, or everywhere discontinuous.)

Is the sum of two periodic functions (with different periods) necessarily periodic? (No.) If two functions have the same least positive period, must their sum have this same least positive period? (No.)

These are some of the questions addressed in Section 1.4.

Some Basic Reshaping Techniques

This section closes with some interesting ‘reshaping’ results that can be used, in certain cases, to identify periodic components in a finite list of real numbers. The next few definitions and notation will considerably simplify the statement and proof of the ‘reshaping’ theorem.

Definition mean of a list

Let $\,\boldsymbol{\rm y} = (y_1,\ldots,y_N)\,$ be a finite list. The mean of the list $\,\boldsymbol{\rm y}\,$ is the number $\,\mu_{\boldsymbol{\rm y}}\,$ defined by:

$$ \mu_{\boldsymbol{\rm y}} := \frac 1N(y_1 + y_2 + \cdots + y_N) $$

Lemma 5 the mean of a sum of lists is the sum of the means

Let $\,\boldsymbol{\rm x}_1, \boldsymbol{\rm x}_2, \ldots, \boldsymbol{\rm x}_M\,$ be lists, each of the same (finite) length, and let:

$$ \boldsymbol{\rm S} = \boldsymbol{\rm x}_1 + \boldsymbol{\rm x}_2 + \cdots + \boldsymbol{\rm x}_M $$

Then:

$$ \mu_{\boldsymbol{\rm S}} = \mu_{\boldsymbol{\rm x}_1} + \mu_{\boldsymbol{\rm x}_2} + \cdots + \mu_{\boldsymbol{\rm x}_M} $$

Thus, the mean of a sum of lists is the sum of the means of the component lists.

Proof of Lemma 5

First, it is shown that the result holds when $\,\boldsymbol{\rm S}\,$ is generated by only two lists. Thus, suppose that:

$$ \begin{align} \boldsymbol{\rm S} &= (x_1,\ldots,x_N) + (y_1,\ldots,y_N)\cr\cr &= (x_1 + y_1,\ldots,x_N+y_N) \end{align} $$

Then:

$$ \begin{align} \mu_{\boldsymbol{\rm S}} &=\frac 1N\bigl( (x_1 + y_1) + \cdots + (x_N + y_N)\bigr)\cr\cr &= \frac 1N(x_1 + \cdots + x_N) + \frac 1N(y_1 + \cdots + y_N)\cr\cr &= \mu_{\boldsymbol{\rm x}} + \mu_{\boldsymbol{\rm y}} \end{align} $$

The remainder of the proof follows by induction. Suppose that the result holds for $\,K\,$ lists, where $\,K \ge 2\,$ is a fixed integer, and suppose that $\,\boldsymbol{\rm S}\,$ is a sum of $\,K + 1\,$ lists, denoted by $\,\boldsymbol{\rm x}_1,\ldots, \boldsymbol{\rm x}_{K+1}\,.$ Write

$$ \boldsymbol{\rm S} = \overbrace{\boldsymbol{\rm x}_1 + \cdots + \boldsymbol{\rm x}_K} + \overbrace{\boldsymbol{\rm x}_{K+1}}\,, $$

so that $\,\boldsymbol{\rm S}\,$ is viewed as a sum of two lists. Then,

$$ \begin{align} \mu_{\boldsymbol{\rm S}} &= \mu_{ \boldsymbol{\rm x}_1 + \cdots + \boldsymbol{\rm x}_K } + \mu_{\boldsymbol{\rm x}_{K+1}}\cr\cr &= (\mu_{\boldsymbol{\rm x}_1} + \cdots + \mu_{\boldsymbol{\rm x}_K} ) + \mu_{\boldsymbol{\rm x}_{K+1}}\,, \end{align} $$

where the inductive hypothesis was used in the second step. $\blacksquare$
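A quick numerical check of the two-list case (the lists below are chosen here purely for illustration):

$$ \begin{gather} \boldsymbol{\rm x} = (1,2,3)\,,\quad \boldsymbol{\rm y} = (4,0,2)\,,\quad \boldsymbol{\rm S} = \boldsymbol{\rm x} + \boldsymbol{\rm y} = (5,2,5)\cr\cr \mu_{\boldsymbol{\rm x}} = \frac 13(1+2+3) = 2\,,\quad \mu_{\boldsymbol{\rm y}} = \frac 13(4+0+2) = 2\cr\cr \mu_{\boldsymbol{\rm S}} = \frac 13(5+2+5) = 4 = \mu_{\boldsymbol{\rm x}} + \mu_{\boldsymbol{\rm y}} \end{gather} $$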

Definition cycle

Let $\,p\,$ be a positive integer. The phrase ‘$p$-cycle’ is used to denote any list of the form:

$$ \begin{gather} \bigl(\overbrace{x_1,x_2,\ldots,x_p}^{\text{$1^{\text{st}}$ cycle}}, \overbrace{x_1,x_2,\ldots,x_p}^{\text{$2^{\text{nd}}$ cycle}},\,\ldots\, ,\cr \overbrace{x_1,x_2,\ldots,x_p}^{\text{$r^{\text{th}}$ cycle}}\bigr)\ ; \end{gather} $$

That is, a $p$-cycle is any list composed of $\,p\,$ numbers in a specified order, that are repeated $\,r\,$ times, where $\,r\,$ is a positive integer.

The length of a $p$-cycle is necessarily a multiple of $\,p\,.$

For positive integers $\,p\,$ and $\,q\,,$ a $p$-cycle and a $q$-cycle are called relatively prime if $\,p\,$ and $\,q\,$ are relatively prime; that is, if $\,p\,$ and $\,q\,$ have no common factors other than $\,1\,.$

Notation for cycles

Let $\,r\,$ be a positive integer, and let $\boldsymbol{\rm x} = (x_1,\ldots,x_p)\,.$ The notation ${}^r\boldsymbol{\rm x}\,$ is used for the $p$-cycle formed from $\,r\,$ repetitions of $\,\boldsymbol{\rm x}\,$:

$$ \begin{gather} {}^r\boldsymbol{\rm x} := \bigl(\overbrace{x_1,x_2,\ldots,x_p}^{\text{$1^{\text{st}}$ cycle}}, \overbrace{x_1,x_2,\ldots,x_p}^{\text{$2^{\text{nd}}$ cycle}},\,\ldots\, ,\cr \overbrace{x_1,x_2,\ldots,x_p}^{\text{$r^{\text{th}}$ cycle}}\bigr) \end{gather} $$

$\mu_{ ({}^r\boldsymbol{\rm x})} = \mu_{\boldsymbol{\rm x}}$

Observe that the mean of $\boldsymbol{\rm x} = (x_1,\ldots,x_p)\,$ is equal to the mean of any $p$-cycle $\,{}^r\boldsymbol{\rm x}\,,$ since:

$$ \mu_{\boldsymbol{\rm x}} = \frac 1p(x_1 + \cdots + x_p) $$

and

$$ \mu_{ ({}^r\boldsymbol{\rm x})} = \frac 1{rp}\bigl( r(x_1 + \cdots + x_p)\bigr) = \frac 1p(x_1 + \cdots + x_p) = \mu_{\boldsymbol{\rm x}} $$

Example

Suppose that a list $\,\boldsymbol{\rm S}\,$ of length $\,6\,$ is a sum of a $2$-cycle and a $3$-cycle, say:

$$ \begin{align} \boldsymbol{\rm S} &= (\overbrace{2,5}, \overbrace{2,5}, \overbrace{2,5}) + (\overbrace{3,0,-1}, \overbrace{3,0,-1})\cr\cr &= (5,5,1,8,2,4) \end{align} $$

There are of course an infinite number of ‘similar’ cycles that could have summed to yield $\,\boldsymbol{\rm S}\,,$ since given any constant $\,K\,$:

$$ \begin{alignat}{6} \boldsymbol{\rm S} =\ &(2+K,\ &&5+K,\ &&2+K,\ &&5+K,\ && 2+K,\ && 5+K)\cr\cr +\ &(3-K, && 0-K, && {-}1-K,\ && 3-K, && 0-K, && {-}1-K) \end{alignat} $$

That is, one component can be shifted any given amount $\,K\,,$ provided that the remaining component is shifted the opposite amount, $\,-K\,.$ Thus, given only the sum list $\,\boldsymbol{\rm S} = (5,5,1,8,2,4)\,,$ it is impossible to recover the precise cycles that originally generated $\,\boldsymbol{\rm S}\,.$

However, it is possible in this case to recover a zero-mean $2$-cycle and a zero-mean $3$-cycle which, when summed and then increased entrywise by $\,\mu_{\boldsymbol{\rm S}}\,,$ yield $\,\boldsymbol{\rm S}\,.$ Only some basic reshaping techniques are needed. The technique is presented in the proof of the next theorem.

Theorem identifying periodic components in a finite list

Let $\,p\,$ and $\,q\,$ be relatively prime positive integers; that is, $\,p\,$ and $\,q\,$ have no common factors other than $\,1\,.$

Let $\,\boldsymbol{\rm S}\,$ be a list of length $\,N\,,$ where $\,N\,$ is a multiple of both $\,p\,$ and $\,q\,$ (and hence, $\,N\,$ is a multiple of $\,pq\,$).

Suppose that $\,\boldsymbol{\rm S}\,$ is a sum of a $p$-cycle and a $q$-cycle, that is:

$$ \begin{align} \boldsymbol{\rm S} =\ &(\overbrace{x_1\,,\,\ldots\,,\,x_p}\ ,\ \overbrace{x_1\,,\,\ldots\,,\,x_p}\ ,\ \ldots,\cr &\qquad \overbrace{x_1\,,\,\ldots\,,\,x_p})\cr\cr +\ & (\overbrace{y_1,\ldots,y_q},\overbrace{y_1,\ldots,y_q},\overbrace{y_1,\ldots,y_q},\ldots,\cr &\qquad \ \ \ \overbrace{y_1,\ldots,y_q}) \end{align} $$

By letting

$$ \begin{gather} \boldsymbol{\rm x} := (x_1,\ldots,x_p)\cr \text{and}\cr \boldsymbol{\rm y} := (y_1,\ldots,y_q) \end{gather} $$

and using cycle notation, $\,\boldsymbol{\rm S}\,$ can be written more compactly as:

$$ \boldsymbol{\rm S} = {}^{(N/p)}\boldsymbol{\rm x} + {}^{(N/q)}\boldsymbol{\rm y} $$

Then, given only the sum $\,\boldsymbol{\rm S}\,$ (and not the component lists), there is a constructive method for finding zero-mean lists

$$ \begin{gather} \tilde{\boldsymbol{\rm x}}_0 := (\tilde x_1,\ldots,\tilde x_p)\cr \text{and}\cr \tilde{\boldsymbol{\rm y}}_0 := (\tilde y_1,\ldots,\tilde y_q) \end{gather} $$

such that

$$ \boldsymbol{\rm S} = {}^{(N/p)}\tilde{\boldsymbol{\rm x}}_0 + {}^{(N/q)}\tilde{\boldsymbol{\rm y}}_0 + {\boldsymbol{\rm M}}_{\boldsymbol{\rm S}}\,, $$

where $\,{\boldsymbol{\rm M}}_{\boldsymbol{\rm S}}\,$ is the list of length $\,N\,$ having every entry equal to $\,\mu_{\boldsymbol{\rm S}}\,.$

Proof: Producing the Lists $\,\tilde{\boldsymbol{\rm x}}_0\,$ and $\,\tilde{\boldsymbol{\rm y}}_0$

The proof illustrates the procedure that yields the zero-mean lists $\,\tilde{\boldsymbol{\rm x}}_0\,$ and $\,\tilde{\boldsymbol{\rm y}}_0\,.$

Renaming, if necessary, suppose that $\,p \gt q\,.$ In the following summation of the two components forming $\,\boldsymbol{\rm S}\,,$ the notation $\,y_?\,$ is used to indicate that the exact subscript on $\,y\,$ depends on the relationship between $\,p\,$ and $\,q\,$:

$$ \begin{align} \boldsymbol{\rm S}\ &=\ (x_1, x_2,\ldots, x_q,\cr &\qquad\ x_{q+1},\ldots,x_p,\cr &\qquad\ x_1,\ldots,x_p)\cr &\ \ + (y_1,y_2,\ldots,y_q,\cr &\qquad\ y_1,\ldots,y_?,\cr &\qquad\ y_?,\ldots,y_q)\cr\cr &=\ (x_1+y_1, x_2+y_2,\ldots,x_q+y_q,\cr &\qquad\ x_{q+1}+y_1,\ldots,x_p+y_?,\cr &\qquad\ x_1 + y_?,\ldots,x_p + y_q) \end{align} $$

Subtract $\mu_{\boldsymbol{\rm S}}\,$ from each entry in $\,\boldsymbol{\rm S}\,,$ and call the resulting list $\,\boldsymbol{\rm S}_0\,$ (the subscript is a reminder that $\,\boldsymbol{\rm S}_0\,$ has mean $\,0\,$). To find $\,\tilde{\boldsymbol{\rm x}}_0\,,$ first reshape $\,\boldsymbol{\rm S}_0\,$ into rows of length $\,p\,$:

$$ \small \begin{align} &( \overbrace{x_1+y_1-\mu_{\boldsymbol{\rm S}}, x_2+y_2-\mu_{\boldsymbol{\rm S}},\ldots, x_q+y_q-\mu_{\boldsymbol{\rm S}},\ldots, x_p+y_?-\mu_{\boldsymbol{\rm S}}}^{\text{first row}},\cr &\qquad x_1+y_?-\mu_{\boldsymbol{\rm S}},\ldots) \end{align} $$

to get:

$$ \small \begin{alignat}{2} &x_1+y_1-\mu_{\boldsymbol{\rm S}}\quad x_2+y_2-\mu_{\boldsymbol{\rm S}}\quad &&\cdots\quad x_p+y_?-\mu_{\boldsymbol{\rm S}}\cr &x_1+y_?-\mu_{\boldsymbol{\rm S}}\quad x_2+y_?-\mu_{\boldsymbol{\rm S}}\quad &&\cdots\quad x_p+y_?-\mu_{\boldsymbol{\rm S}}\cr &\qquad \vdots\qquad\qquad\quad\ \ \vdots\qquad\quad &&\ \ \, \vdots\qquad\quad\ \ \ \vdots\cr &x_1+y_?-\mu_{\boldsymbol{\rm S}}\quad x_2+y_?-\mu_{\boldsymbol{\rm S}}\quad &&\cdots\quad x_p+y_q-\mu_{\boldsymbol{\rm S}}\cr \end{alignat} $$

There are $\,\frac Np\,$ rows in the arrangement above. Summing the entries in each of the $\,p\,$ columns gives the column sums:

$$ \begin{gather} \left(\frac Np(x_1-\mu_{\boldsymbol{\rm S}}) + \sum_{\text{col 1}} y_i\right)\cr \left(\frac Np(x_2-\mu_{\boldsymbol{\rm S}}) + \sum_{\text{col 2}} y_i\right)\cr \vdots\cr \left(\frac Np(x_p-\mu_{\boldsymbol{\rm S}}) + \sum_{\text{col $p$}} y_i\right) \end{gather} $$

There are $\,\frac Nq\,$ copies of $\,(y_1,\ldots,y_q)\,$ in the entire list. Column $\,j\,$ holds the entries at positions $\,j,\ j+p,\ j+2p,\ \ldots\,$ of the list, and since $\,p\,$ and $\,q\,$ are relatively prime, these positions run through the subscripts $\,1,\ldots,q\,$ equally often. Thus, each column contains each of $\,y_1,\ldots,y_q\,$ exactly $\,\frac N{pq}\,$ times, and hence:

$$ \begin{align} &\sum_{\text{col $j$}} y_i\cr &\quad = \frac N{pq}(y_1 + \cdots + y_q)\cr\cr &\quad = \frac Np\mu_{\boldsymbol{\rm y}}\,,\quad j = 1,\ldots,p \end{align} $$
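To see the equal division in a small case (an illustrative choice of $\,p = 3\,,$ $\,q = 2\,$ and $\,N = 6\,$), list only the $\,y\,$ terms and reshape them into rows of length $\,3\,$:

$$ \begin{gather} (y_1,\ y_2,\ y_1,\ y_2,\ y_1,\ y_2)\cr\cr \begin{matrix} y_1 & y_2 & y_1\cr y_2 & y_1 & y_2 \end{matrix} \end{gather} $$

Each of the three columns contains $\,y_1\,$ and $\,y_2\,$ exactly $\,\frac N{pq} = 1\,$ time, so every column sum of the $\,y\,$ terms is $\,y_1 + y_2 = \frac Np\mu_{\boldsymbol{\rm y}} = 2\mu_{\boldsymbol{\rm y}}\,.$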

Dividing each column sum by $\,\frac Np\,$ gives the averages of each column:

$$ \begin{align} &\overbrace{x_1 - \mu_{\boldsymbol{\rm S}} + \mu_{\boldsymbol{\rm y}}}^{1^{\text{st}}\text{ column average}}\cr\cr &\overbrace{x_2 - \mu_{\boldsymbol{\rm S}} + \mu_{\boldsymbol{\rm y}}}^{2^{\text{nd}}\text{ column average}}\cr\cr &\qquad\quad \vdots\cr\cr &\overbrace{x_p - \mu_{\boldsymbol{\rm S}} + \mu_{\boldsymbol{\rm y}}}^{\text{last column average}} \end{align} $$

Since $\,\mu_{\boldsymbol{\rm S}} = \mu_{\boldsymbol{\rm x}} + \mu_{\boldsymbol{\rm y}}\,,$ these column averages can be written more simply as:

$$ \begin{align} &\overbrace{x_1 - \mu_{\boldsymbol{\rm x}}}^{1^{\text{st}}\text{ column average}}\cr\cr &\overbrace{x_2 - \mu_{\boldsymbol{\rm x}}}^{2^{\text{nd}}\text{ column average}}\cr\cr &\qquad\quad \vdots\cr\cr &\overbrace{x_p - \mu_{\boldsymbol{\rm x}}}^{\text{last column average}} \end{align} $$

Therefore, the column averages of the reshaped list recover the list $\,\boldsymbol{\rm x}\,,$ with each entry decreased by $\,\mu_{\boldsymbol{\rm x}}\,.$ Define $\,\tilde{\boldsymbol{\rm x}}_0\,$ to be the list consisting of these column averages:

$$ \tilde{\boldsymbol{\rm x}}_0 := (x_1 - \mu_{\boldsymbol{\rm x}}, x_2 - \mu_{\boldsymbol{\rm x}},\ldots, x_p - \mu_{\boldsymbol{\rm x}}) $$

Clearly, $\,\tilde{\boldsymbol{\rm x}}_0\,$ has zero mean.

A similar reshaping of $\,\boldsymbol{\rm S}_0\,$ into $\,\frac Nq\,$ rows of length $\,q\,$ yields the column averages:

$$ \tilde{\boldsymbol{\rm y}}_0 := (y_1 - \mu_{\boldsymbol{\rm y}}, y_2 - \mu_{\boldsymbol{\rm y}},\ldots, y_q - \mu_{\boldsymbol{\rm y}}) $$

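Then, letting $\,\boldsymbol{\rm M}_{\boldsymbol{\rm x}}\,$ and $\,\boldsymbol{\rm M}_{\boldsymbol{\rm y}}\,$ denote the lists of length $\,N\,$ having every entry equal to $\,\mu_{\boldsymbol{\rm x}}\,$ and $\,\mu_{\boldsymbol{\rm y}}\,,$ respectively (so that $\,\boldsymbol{\rm M}_{\boldsymbol{\rm x}} + \boldsymbol{\rm M}_{\boldsymbol{\rm y}} = \boldsymbol{\rm M}_{\boldsymbol{\rm S}}\,,$ since $\,\mu_{\boldsymbol{\rm S}} = \mu_{\boldsymbol{\rm x}} + \mu_{\boldsymbol{\rm y}}\,$):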

$$ \begin{align} &{}^{(N/p)}\tilde{\boldsymbol{\rm x}}_0 + {}^{(N/q)}\tilde{\boldsymbol{\rm y}}_0\cr\cr &\quad = {}^{(N/p)}\boldsymbol{\rm x} - \boldsymbol{\rm M}_{\boldsymbol{\rm x}} + {}^{(N/q)}\boldsymbol{\rm y} - \boldsymbol{\rm M}_{\boldsymbol{\rm y}}\cr\cr &\quad = \boldsymbol{\rm S} - \boldsymbol{\rm M}_{\boldsymbol{\rm S}}\,, \end{align} $$

from which:

$$ \boldsymbol{\rm S} = {}^{(N/p)}\tilde{\boldsymbol{\rm x}}_0 + {}^{(N/q)}\tilde{\boldsymbol{\rm y}}_0 + \boldsymbol{\rm M}_{\boldsymbol{\rm S}}\quad \blacksquare $$
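As a worked check of the procedure, apply it to the list $\,\boldsymbol{\rm S} = (5,5,1,8,2,4)\,$ from the earlier example, using only the sum list and the cycle lengths $\,3\,$ and $\,2\,$:

$$ \begin{gather} \mu_{\boldsymbol{\rm S}} = \frac{25}6\,,\qquad \boldsymbol{\rm S}_0 = \bigl(\tfrac 56,\ \tfrac 56,\ -\tfrac{19}6,\ \tfrac{23}6,\ -\tfrac{13}6,\ -\tfrac 16\bigr)\cr\cr \text{rows of length $3$:}\quad \begin{matrix} \tfrac 56 & \tfrac 56 & -\tfrac{19}6\cr \tfrac{23}6 & -\tfrac{13}6 & -\tfrac 16 \end{matrix} \quad\implies\quad \text{column averages}\ \ \bigl(\tfrac 73,\ -\tfrac 23,\ -\tfrac 53\bigr)\cr\cr \text{rows of length $2$:}\quad \begin{matrix} \tfrac 56 & \tfrac 56\cr -\tfrac{19}6 & \tfrac{23}6\cr -\tfrac{13}6 & -\tfrac 16 \end{matrix} \quad\implies\quad \text{column averages}\ \ \bigl(-\tfrac 32,\ \tfrac 32\bigr) \end{gather} $$

These are the generating cycles $\,(3,0,-1)\,$ and $\,(2,5)\,$ with their means $\,\frac 23\,$ and $\,\frac 72\,$ subtracted. Adding the recovered cycles and the mean reproduces $\,\boldsymbol{\rm S}\,$; for instance, the first entry is $\,\frac 73 - \frac 32 + \frac{25}6 = 5\,,$ and the third entry is $\,-\frac 53 - \frac 32 + \frac{25}6 = 1\,.$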

The theorem does not hold if $\,p\,$ and $\,q\,$ are not relatively prime, even if $\,N\,$ is a multiple of $\,pq\,.$ Also, the theorem does not hold if $\,p\,$ and $\,q\,$ are relatively prime, but $\,N\,$ is not a multiple of $\,pq\,.$
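A small example (constructed here for illustration) shows how the column-averaging procedure from the proof can fail when $\,p\,$ and $\,q\,$ share a factor. Take $\,p = 4\,,$ $\,q = 2\,$ and $\,N = 8 = pq\,,$ with $\,\boldsymbol{\rm x} = (0,1,0,0)\,$ and $\,\boldsymbol{\rm y} = (1,0)\,,$ so that $\,\mu_{\boldsymbol{\rm x}} = \frac 14\,$ and $\,\mu_{\boldsymbol{\rm y}} = \frac 12\,$:

$$ \begin{gather} \boldsymbol{\rm S} = {}^{2}\boldsymbol{\rm x} + {}^{4}\boldsymbol{\rm y} = (1,1,1,0,1,1,1,0)\,,\qquad \mu_{\boldsymbol{\rm S}} = \tfrac 34\cr\cr \boldsymbol{\rm S}_0 = \bigl(\tfrac 14,\ \tfrac 14,\ \tfrac 14,\ -\tfrac 34,\ \tfrac 14,\ \tfrac 14,\ \tfrac 14,\ -\tfrac 34\bigr)\cr\cr \text{rows of length $4$:}\quad \text{column averages}\ \ \bigl(\tfrac 14,\ \tfrac 14,\ \tfrac 14,\ -\tfrac 34\bigr) \ne \bigl(-\tfrac 14,\ \tfrac 34,\ -\tfrac 14,\ -\tfrac 14\bigr)\cr\cr \text{rows of length $2$:}\quad \text{column averages}\ \ \bigl(\tfrac 14,\ -\tfrac 14\bigr) \ne \bigl(\tfrac 12,\ -\tfrac 12\bigr) \end{gather} $$

On the right of each ‘$\ne$’ is the corresponding generating list with its mean subtracted. The column averages do not recover these lists, and they do not sum back to $\,\boldsymbol{\rm S}\,$ either: the first entry of the reconstruction would be $\,\frac 14 + \frac 14 + \frac 34 = \frac 54 \ne 1\,.$ The reason is that the $\,y\,$ terms are no longer equally divided among the columns; with rows of length $\,4\,,$ the first column contains only $\,y_1\,.$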

Extending the previous theorem

The constructive technique just discussed also works if $\,\boldsymbol{\rm S}\,$ is a sum of any finite number of pairwise relatively prime cycles, provided that $\,N\,$ is a multiple of the product of the cycle lengths.

To see this, suppose that the theorem holds for $\,K\,$ cycles, and suppose that

$$ \boldsymbol{\rm S} = \overbrace{\boldsymbol{\rm C}_1 + \cdots + \boldsymbol{\rm C}_K}^{:= \boldsymbol{\rm C}} + \overbrace{\boldsymbol{\rm C}_{K+1}}\,, $$

where $\,\boldsymbol{\rm C}_i\,$ is a $p_i$-cycle.

The list $\,\boldsymbol{\rm C}\,$ is a $p_1p_2\cdots p_K$-cycle, which is relatively prime to $\,p_{K+1}\,,$ by hypothesis. Thus, applying the theorem to $\,\boldsymbol{\rm C}\,$ and $\,\boldsymbol{\rm C}_{K+1}\,$ yields:

$$ \boldsymbol{\rm S} = \tilde{\boldsymbol{\rm C}} + \tilde{\boldsymbol{\rm C}}_{K+1} + \boldsymbol{\rm M}_{\boldsymbol{\rm S}} $$

Then, using the inductive hypothesis on $\,\tilde{\boldsymbol{\rm C}}\,,$ and the fact that $\,\mu_{\tilde{\boldsymbol{\rm C}}} = 0\,,$ one obtains:

$$ \boldsymbol{\rm S} = \tilde{\boldsymbol{\rm C}}_1 + \cdots + \tilde{\boldsymbol{\rm C}}_{K+1} + \boldsymbol{\rm M}_{\boldsymbol{\rm S}} $$

Uniqueness of decomposition into relatively prime cycles

In general, a positive integer $\,N\,$ may be viewed as a product of two relatively prime integers (each greater than $\,1\,$) in more than one way. For example, $\,30 = 2 \cdot 15 = 3 \cdot 10 = 5 \cdot 6\,.$ Thus, given a list $\,\boldsymbol{\rm S}\,$ of length $\,30\,,$ one might separately investigate the hypotheses that:

$\boldsymbol{\rm S}\,$ is a sum of a $2$-cycle and a $15$-cycle;
$\boldsymbol{\rm S}\,$ is a sum of a $3$-cycle and a $10$-cycle;
$\boldsymbol{\rm S}\,$ is a sum of a $5$-cycle and a $6$-cycle.

Under what conditions will $\,N\,$ have a unique decomposition as a product of two relatively prime integers? The next proposition answers this question.

Proposition

Let $\,N\,$ be a positive integer. Then, $\,N\,$ has a unique representation (up to the order of the factors) as a product of two relatively prime integers, each greater than $\,1\,,$ if and only if the prime factorization of $\,N\,$ is of the form $\,p^mq^n\,,$ where $\,p\,$ and $\,q\,$ are prime with $\,p\ne q\,,$ and $\,m\,$ and $\,n\,$ are positive integers.

Proof

“$\impliedby$”

Suppose that the prime decomposition of $\,N\,$ is of the form $\,N = p^mq^n\,$ for primes $\,p\ne q\,,$ $\,m\,$ and $\,n\,$ positive integers.

Then, $\,p^m\,$ and $\,q^n\,$ are relatively prime integers with product $\,N\,.$ Any other regrouping of the prime factors as a product of two numbers is necessarily of the form $\,N = d_1d_2\,,$ where either $\,d_1\,$ and $\,d_2\,$ have a common factor $\,p\,$; or $\,d_1\,$ and $\,d_2\,$ have a common factor $\,q\,.$

“$\implies$”

Suppose that $\,N\,$ has a unique representation as a product of relatively prime integers.

If $\,N\,$ has exactly one distinct prime in its prime decomposition, say $\,N = p^m\,$ for a prime $\,p\,$ and positive integer $\,m\,,$ then $\,N\,$ cannot be written as a product of relatively prime integers.

If $\,N\,$ has three or more distinct primes in its prime decomposition, say $\,N = p^mq^nr^jx\,$ where $\,p\,,$ $\,q\,$ and $\,r\,$ are distinct primes, $\,m\,,$ $\,n\,$ and $\,j\,$ are positive integers, and $\,x\,$ is either $\,1\,$ or a product of primes other than $\,p\,,$ $\,q\,,$ and $\,r\,,$ then $\,N\,$ can be written as a product of two relatively prime integers in more than one way: say,

$$ \begin{align} N &= (p^m)\cdot(q^nr^jx)\cr &= (q^n)\cdot(p^mr^jx) \end{align} $$

Thus, $\,N\,$ must have precisely two distinct primes in its prime factorization. $\blacksquare$
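A few small cases illustrate the proposition (worked out here as an illustration, requiring both factors to exceed $\,1\,$ and disregarding the order of the factors):

$$ \begin{gather} 12 = 2^2\cdot 3:\quad 4\cdot 3\ \text{ only}\ \ \text{($2\cdot 6\,$ has common factor $\,2$)}\cr\cr 8 = 2^3:\quad \text{none}\ \ \text{($2\cdot 4\,$ has common factor $\,2$)}\cr\cr 30 = 2\cdot 3\cdot 5:\quad 2\cdot 15\ ,\ \ 3\cdot 10\ ,\ \ 5\cdot 6 \end{gather} $$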

MATLAB Example

The following diary of an actual MATLAB session illustrates the reshaping procedure, while reviewing the necessary MATLAB commands.

The illustrated procedure could be implemented much more efficiently. However, efficient implementations often obscure simple underlying ideas. Therefore, in what follows, efficiency has been sacrificed for the sake of clarity.

% construct a 'known unknown':
%   a 2-cycle, 3-cycle and 5-cycle
x = [1 2];
x = [x x x x x x x x x x x x x x x];
% x is a 2-cycle of length 30
y = [-1 0 4];
y = [y y y y y y y y y y];
% y is a 3-cycle of length 30
z = [-1 3 1 2 0];
z = [z z z z z z];
% z is a 5-cycle of length 30
% sum the components to get the 'known unknown'
S = x + y + z;
% subtract off the mean of S to get S0
S0 = S - mean(S);
% reshape S0 to test for a 2-cycle,
%   using the 'reshape' command
r1 = reshape(S0,2,15)'

r1 =

   -4.5000   1.5000
    2.5000  -0.5000
   -2.5000   1.5000
   -0.5000  -0.5000
    3.5000  -2.5000
   -3.5000   5.5000
   -2.5000   0.5000
    1.5000  -3.5000
    0.5000   3.5000
   -1.5000  -1.5000
    0.5000   0.5000
   -1.5000   4.5000
   -3.5000  -2.5000
    4.5000  -1.5000
   -0.5000   2.5000

% average the columns, using the 'sum' command,
%   dividing by # of rows
r1 = sum(r1) / 15

r1 =
   -0.5000 0.5000

% Observe that this is precisely x - mean(x)!
temp = x - mean(x);
temp(1:2)

ans =
   -0.5000 0.5000

% Now, repeat, testing for the 3-cycle and 5-cycle
r2 = reshape(S0,3,10)';
r2 = sum(r2) / 10

r2 =
  -2   -1    3

r3 = reshape(S0,5,6)';
r3 = sum(r3) / 6

r3 =
  -2    2    0    1    -1

% Now, build r1, r2, and r3 into length-30 cycles
r1 = [r1 r1 r1 r1 r1 r1 r1 r1 r1 r1 r1 r1 r1 r1 r1];
r2 = [r2 r2 r2 r2 r2 r2 r2 r2 r2 r2];
r3 = [r3 r3 r3 r3 r3 r3];

% Sum r1, r2 and r3, and add  mean(S)  back in,
%   to recover S:

predict = r1 + r2 + r3 + mean(S)

predict =
  Columns 1 through 12
    -1  5  6  3  1  5  3  3  7  1  0  9
  Columns 13 through 24
     1  4  5  0  4  7  2  2  4  4  2  8
  Columns 25 through 30
     0  1  8  2  3  6

% Compare with S:
S - predict

ans =
  Columns 1 through 12
     0  0  0  0  0  0  0  0  0  0  0  0
  Columns 13 through 24
     0  0  0  0  0  0  0  0  0  0  0  0
  Columns 25 through 30
     0  0  0  0  0  0
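The session above repeats each list by hand. A more compact arrangement of the same procedure is sketched below. It is only one possible arrangement, not the session's own code: it uses the standard functions `repmat`, `numel` and `mean(A,2)` in place of the hand-typed repetitions and `sum(...)/rows`, and the variable names `periods`, `colAvg`, `predict` and `maxErr` are choices made here. It assumes that the cycle lengths are pairwise relatively prime and that the length of `S` is a multiple of their product.

```matlab
% Build the same 'known unknown' as in the session, using repmat.
x = repmat([1 2], 1, 15);        % 2-cycle of length 30
y = repmat([-1 0 4], 1, 10);     % 3-cycle of length 30
z = repmat([-1 3 1 2 0], 1, 6);  % 5-cycle of length 30
S = x + y + z;

N = numel(S);
periods = [2 3 5];               % assumed pairwise relatively prime
S0 = S - mean(S);                % zero-mean version of S

% Start the reconstruction from the constant list with every entry mean(S).
predict = mean(S) * ones(1, N);
for p = periods
    % reshape(S0,p,N/p) puts consecutive blocks of length p in columns,
    % so averaging along dimension 2 averages entries p apart.
    colAvg = mean(reshape(S0, p, N/p), 2)';   % 1-by-p zero-mean cycle
    predict = predict + repmat(colAvg, 1, N/p);
end

% Largest deviation between S and its reconstruction.
maxErr = max(abs(S - predict));
```

If the assumptions hold, `maxErr` should be zero up to rounding error; a clearly nonzero value suggests that `S` is not a sum of cycles with the tested lengths.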