Finding Hidden Patterns in Datasets: Abstract

This is my (Dr. Carol JVF Burns) dissertation, submitted in 1994 for my Doctor of Arts in Mathematics at Idaho State University.

The original dissertation title was ‘Identifying Hidden Periodicities in Discrete-Domain Data’. I chose a simpler title for a web audience.

The dissertation is Copyright © Carol J.V. Fisher 1994, with All Rights Reserved.

My dissertation was saved on floppy disks, which were damaged and are now unreadable.

Fortunately, I have the original (decades-old) pages. I used the CZUR ET24 Pro book scanner to scan them and convert them to editable text. This scanner was a dream to use, and saved me hundreds of hours of typing from scratch.

The original dissertation is largely preserved, but formatting decisions appropriate for the web have been made. In particular, long paragraphs are broken into several pieces.

The dissertation will ‘emerge’ online, beginning in December 2024. Enjoy!

Content that has been added or changed from the original dissertation is bordered in red, as indicated here.

Abstract

This dissertation investigates the problem of identifying hidden periodicities in discrete-domain data, with emphasis on identification for the purpose of prediction.

By definition, discrete-domain data is a collection of ordered pairs, $\,\{(t_i,y_i)\}_{i=1}^{N\text{ or }\infty}\,,$ with the property that the time values $\,t_i\,$ can be arranged in strictly increasing order.
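
The condition on the time values amounts to requiring that no time value repeats. A minimal sketch of that check follows, written in Python rather than the dissertation's MATLAB; the function name `is_discrete_domain` is a hypothetical one chosen for illustration.

```python
def is_discrete_domain(pairs):
    """Return True if the time values can be arranged in strictly increasing order.

    That is possible exactly when no time value repeats.
    """
    times = sorted(t for t, _ in pairs)
    return all(a < b for a, b in zip(times, times[1:]))


data = [(2.0, 0.7), (0.5, 1.2), (1.0, -0.3)]   # unordered, but times are distinct
print(is_discrete_domain(data))                 # True
print(is_discrete_domain(data + [(1.0, 4.0)]))  # False: t = 1.0 appears twice
```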

The data values $\,y_i\,$ are allowed to come from the set of real numbers, denoted by $\,\Bbb R\,.$ Implementations of all the techniques discussed herein are incorporated throughout the dissertation, using the MATLAB software package.

GNU Octave is free software that is largely compatible with MATLAB. As I get the dissertation online, I will test all provided code with Octave and indicate any change(s) that must be made.

Prerequisites for this Dissertation

The dissertation is written as the basis for a textbook, and assumes a mathematical background typical of an undergraduate degree in engineering: three semesters of calculus, introductory courses in linear algebra and statistics, and a moderate amount of mathematical maturity. Additional information that is essential for an understanding of the material is included in the appendices.

Chapter 1: Periodic Functions

Periodicity is studied in Chapter 1. The usual definition of periodic functions (as functions from $\,\Bbb R\,$ to $\,\Bbb R\,$) is generalized, to provide a viewpoint that favors investigation of periodicities in discrete-domain data. It is verified that functions obeying this generalized definition satisfy the properties commonly associated with periodicity. The set of all periods of a periodic function is studied.

A reshaping method for identifying relatively prime periodic components is presented. Important logical considerations regarding the use of identified periodic components for prediction are discussed. The chapter closes with an overview of historical contributions in the search for hidden periodicities.
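
The reshaping method itself is developed in Chapter 1 and is not reproduced in this abstract. As a rough illustration of the general idea of reshaping only, the Python sketch below assumes uniformly spaced data and a candidate integer period $\,p\,$: the values are stacked into rows of length $\,p\,$, and each column is checked for how much it varies. Both function names are hypothetical, and this is not claimed to be the dissertation's algorithm.

```python
def reshape_by_period(y, p):
    """Stack y into rows of length p, dropping any incomplete final row."""
    rows = len(y) // p
    return [y[r * p:(r + 1) * p] for r in range(rows)]


def column_spread(y, p):
    """Largest (max - min) over the columns of the reshaped table.

    Assumes len(y) >= p, so that the table has at least one row.
    """
    table = reshape_by_period(y, p)
    return max(max(col) - min(col) for col in zip(*table))


y = [1, 4, 2, 1, 4, 2, 1, 4, 2, 1, 4, 2]   # repeats every 3 samples
print(column_spread(y, 3))   # 0: every column is constant
print(column_spread(y, 4))   # 3: columns mix different phases
```

With noisy data the spread would not drop to zero, so in practice one would compare candidate periods rather than look for an exact zero.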

Chapter 2: ‘Fitting’ a Data Set with a Function

Chapter 2 develops techniques for fitting a data set with a function. A ‘turning point’ test for random behavior is developed. If the hypothesis of random behavior cannot be rejected, one may still be able to take advantage of the turning points in economics data, via an interesting application of a Martingale Algorithm.
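
A common textbook form of the turning point test counts the interior values that are strictly larger or strictly smaller than both neighbors. For $\,N\,$ independent, identically distributed observations, this count has mean $\,2(N-2)/3\,$ and variance $\,(16N-29)/90\,$, and is approximately normal for large $\,N\,$. The Python sketch below computes the count and the resulting standardized score; it is an assumption that the dissertation's test takes exactly this form, and the function name is hypothetical.

```python
import math


def turning_point_test(y):
    """Return (count, z) for the turning point test.

    A turning point is an interior value that is strictly greater than both
    neighbours or strictly less than both neighbours.
    """
    n = len(y)
    if n < 3:
        raise ValueError("need at least three observations")
    count = sum(
        1
        for i in range(1, n - 1)
        if (y[i] > y[i - 1] and y[i] > y[i + 1])
        or (y[i] < y[i - 1] and y[i] < y[i + 1])
    )
    mean = 2.0 * (n - 2) / 3.0           # expected count under randomness
    variance = (16.0 * n - 29.0) / 90.0  # variance of the count under randomness
    z = (count - mean) / math.sqrt(variance)
    return count, z


trend_count, trend_z = turning_point_test(list(range(20)))   # steady trend
zigzag_count, zigzag_z = turning_point_test([0, 1] * 10)     # strict alternation
print(trend_count, zigzag_count)   # 0 18
```

A score far below zero (too few turning points, as with the trend) or far above zero (too many, as with the alternation) is evidence against random behavior at the usual normal-theory thresholds.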

Linear and nonlinear least squares approximation techniques are presented for situations where there are specific conjectured components in the data. Condition numbers and discrete orthogonal functions are discussed in the context of overcoming numerical difficulties with computer applications.
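
As one small instance of linear least squares with a conjectured component, suppose the data is thought to contain a sinusoid of known period. The model $\,y \approx a\cos(\omega t) + b\sin(\omega t)\,$ is linear in $\,a\,$ and $\,b\,$, so the coefficients solve a $\,2\times 2\,$ system of normal equations. The Python sketch below (hypothetical function name, no constant term, exact test data) solves that system by Cramer's rule.

```python
import math


def fit_sinusoid(t, y, period):
    """Least-squares fit of y ~ a*cos(w t) + b*sin(w t), with w = 2*pi/period."""
    w = 2.0 * math.pi / period
    c = [math.cos(w * ti) for ti in t]
    s = [math.sin(w * ti) for ti in t]
    # Entries of the 2x2 normal equations (A^T A) x = A^T y.
    scc = sum(ci * ci for ci in c)
    sss = sum(si * si for si in s)
    scs = sum(ci * si for ci, si in zip(c, s))
    scy = sum(ci * yi for ci, yi in zip(c, y))
    ssy = sum(si * yi for si, yi in zip(s, y))
    det = scc * sss - scs * scs
    if abs(det) < 1e-12:
        raise ValueError("normal equations are (nearly) singular")
    # Cramer's rule for the two unknowns.
    a = (scy * sss - ssy * scs) / det
    b = (ssy * scc - scy * scs) / det
    return a, b


t = list(range(20))
w = 2.0 * math.pi / 5.0
y = [2.0 * math.cos(w * ti) + 3.0 * math.sin(w * ti) for ti in t]
a, b = fit_sinusoid(t, y, period=5.0)
print(round(a, 6), round(b, 6))   # 2.0 3.0
```

Forming the normal equations directly is convenient for a two-parameter model, but it can be poorly conditioned for larger models, which is one reason condition numbers and orthogonal functions matter in practice.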

Gradient methods and a genetic algorithm are presented as ways to carry out nonlinear least squares approximation. Cubic spline interpolation is presented as a way to achieve a uniform time list, if necessary.
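
To illustrate how cubic spline interpolation can produce a uniform time list, the Python sketch below builds a natural cubic spline (second derivative zero at both ends) through nonuniformly spaced data and evaluates it on an evenly spaced grid. The natural end condition and the function name are assumptions made for this example; the dissertation may use a different end condition.

```python
import bisect


def natural_cubic_spline(t, y):
    """Return a function S(x) interpolating (t[i], y[i]) with a natural cubic spline.

    t must be strictly increasing and contain at least two points.
    """
    n = len(t) - 1
    h = [t[i + 1] - t[i] for i in range(n)]
    # Right-hand side of the tridiagonal system for the c coefficients.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (y[i + 1] - y[i]) / h[i] - 3.0 * (y[i] - y[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal solve.
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (t[i + 1] - t[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    # Back substitution; c[0] = c[n] = 0 is the "natural" end condition.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (y[j + 1] - y[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def spline(x):
        # Locate the interval [t[j], t[j+1]] containing x (clamped at the ends).
        j = min(max(bisect.bisect_right(t, x) - 1, 0), n - 1)
        dx = x - t[j]
        return y[j] + dx * (b[j] + dx * (c[j] + dx * d[j]))

    return spline


t = [0.0, 0.7, 1.5, 3.0, 4.0]              # nonuniform times
y = [1.0, 2.4, 4.0, 7.0, 9.0]              # values on the line y = 2t + 1
s = natural_cubic_spline(t, y)
uniform_t = [0.5 * k for k in range(9)]    # 0.0, 0.5, ..., 4.0
uniform_y = [s(x) for x in uniform_t]      # resampled values on the uniform grid
```

Because the sample data lies on a straight line, the spline reproduces that line, which gives a quick sanity check on the resampled values.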

Discrete Fourier theory and the periodogram are presented as useful tools, particularly when the data is thought to have unknown periodic components. Efficient computation of the periodogram via the discrete Fourier transform is discussed.
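
As a small illustration of how the periodogram exposes a periodic component, the Python sketch below computes $\,I_k = |Y_k|^2/N\,$ directly from the definition of the discrete Fourier transform. The normalization is an assumption (several conventions are in use), and the direct sum costs on the order of $\,N^2\,$ operations; a fast Fourier transform gives the same values far more efficiently.

```python
import cmath
import math


def periodogram(y):
    """Return I[k] = |Y_k|^2 / N for k = 0..N//2, where Y is the DFT of y."""
    n = len(y)
    result = []
    for k in range(n // 2 + 1):
        yk = sum(y[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        result.append(abs(yk) ** 2 / n)
    return result


n = 32
y = [math.cos(2 * math.pi * 3 * j / n) for j in range(n)]   # 3 cycles in 32 samples
spectrum = periodogram(y)
# Skip k = 0 (the mean) when searching for the dominant frequency index.
peak = max(range(1, len(spectrum)), key=lambda k: spectrum[k])
print(peak)   # 3, corresponding to a period of 32/3 samples
```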

Chapter 3: Filter Theory

Chapter 3 discusses nonrecursive mathematical filters and their corresponding transfer functions. Such filters can be used for removal of noise, and are useful identification tools when the data contains sinusoidal components.
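
To make the connection between a nonrecursive filter and its transfer function concrete, the Python sketch below applies a three-term moving average, $\,\text{out}_n = \sum_k c_k\,x_{n-k}\,$, and evaluates $\,H(\omega) = \sum_k c_k\,e^{-i\omega k}\,$. The function names and the sign convention for $\,H\,$ are assumptions made for this example. The moving average has $\,H(2\pi/3) = 0\,$, so it removes a sinusoid of period 3 entirely.

```python
import cmath
import math


def apply_filter(x, coeffs):
    """Nonrecursive filter: out[n] = sum_k coeffs[k] * x[n - k].

    Only outputs that use fully available input values are returned, so the
    result has len(x) - len(coeffs) + 1 entries.
    """
    m = len(coeffs)
    return [
        sum(coeffs[k] * x[n - k] for k in range(m))
        for n in range(m - 1, len(x))
    ]


def transfer_function(coeffs, omega):
    """H(omega) = sum_k coeffs[k] * exp(-i * omega * k)."""
    return sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(coeffs))


coeffs = [1 / 3, 1 / 3, 1 / 3]                # three-term moving average
omega = 2 * math.pi / 3                       # sinusoid with period 3
x = [math.sin(omega * n) for n in range(12)]
filtered = apply_filter(x, coeffs)
print(abs(transfer_function(coeffs, omega)))  # close to 0
print(max(abs(v) for v in filtered))          # close to 0: component removed
```

The magnitude $\,|H(\omega)|\,$ shows how strongly each frequency is passed or suppressed, which is what makes the transfer function useful for choosing a filter.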

An overall approach to identifying hidden periodicities, together with examples, is given to conclude the dissertation.