**Background information:
Jacobs University "Smart Systems" seminar wins international
financial time series competition**

*H. Jaeger, Nov 3, 2007*

**The competition** (http://www.neural-forecasting-competition.com/NN3/index.htm)

*Background / motivation of the competition*

Forecasting financial time series (like stock indices,
currency exchange rates, or gross national products) is a task of obvious
economic and commercial interest. In academic and applied economics research,
this task has been investigated intensively over decades, and numerous
specialized forecasting methods have been developed. There are dedicated
journals, textbooks, and professional societies (e.g., http://www.forecasters.org/, http://forecasters.org/ijf/index.html; for more, see the sponsors listed at the competition website).

Independently of this
specialized field of forecasting research, other methods for predicting time
series have been developed in other fields of engineering. These other methods are
"general-purpose", that is, they are not specialized to a particular
target domain from the outset, and they are often motivated by cognitive
science or neuroscience. Collectively, these fields may be referred to as *Computational
Intelligence* (CI) approaches. A major type
of CI method is *neural networks*, which
can be seen as complex information processing systems that are modelled after
biological brains (with a lot of abstraction).

CI methods have been applied to
financial forecasting in the past, but infrequently and with very limited
success. This is particularly noteworthy for neural-network-based methods,
which created some publicity in the 1990s and the years of the economic
bubble. When put to stringent tests, it was consistently found that CI methods
in general, and neural network methods in particular, were inferior to
specialized "traditional" methods. Such tests were done in individual
scientific studies, but also within large-scale international time series competitions, the last (and largest)
of which was the "M3" competition conducted in 1999 [1].

Since these negative findings, however, the
state of the art in neural network and CI prediction methods has
advanced. The main purpose of the "NN3" competition, in which the
Jacobs group participated, was to again compare the neural network / CI
methods available today with the "traditional" methods.

*The competition*


The organizers published online a
set of 111 financial time series. The only information given about these data
was that they were monthly measurements of a variety of micro- and
macroeconomic indicators from a variety of sources. To give an impression,
the figure below shows some of the given series:

**Figure.** Four of the 111 competition time series.

The task was to predict these time
series 18 months ahead (i.e., to compute 18 additional data points per time series). A
particular (intended) difficulty of this task arises from the circumstance that
the time series came from a wide diversity of sources and thus reflected a
likewise large variety of financial/market mechanisms. Thus, one of the main
theoretical and technical challenges was that the prediction methods had to be
very *flexible*, *robust*, and *versatile*.

Because many prediction methods
(both "traditional" and CI) are labour- and compute-intensive,
predicting 111 series poses a real resource and workload challenge for labs
using these methods. Thus, there was also an option to participate on a subset
of only 11 of the given time series.

The competition was organized by a
team from the Lancaster University Management School, sponsored by a grant from
the financial consulting company SAS (http://www.sas.com/index.html)
and by the International Institute of Forecasters (http://www.forecasters.org/).

The competition was announced in
Fall 2006; the deadline for submitting predictions was May 14, 2007. There were
25 submissions for the full set of 111 series and 44 for the reduced set of 11.
The results were announced at the end of October 2007 (http://www.neural-forecasting-competition.com/NN3/results.htm).

**The seminar**

In the Smart Systems graduate program in Computer Science, our first-year master's
students have to take three *research seminars* in the Spring semester. This is the standard opportunity
for first-year master's students to come into close contact with the research
fields of the professors of the Smart Systems graduate program, enabling them to
choose the special-interest area for the subsequent Master's thesis (which they
work on in their 2nd year). Since the competition deadline coincided perfectly
with the end of the Spring semester at Jacobs, I found that participating in
this competition was a great opportunity for hands-on education in *machine
learning*, my research area. So in Spring
2007, I announced that the topic of my research seminar would be financial time
series prediction and participation in this competition.

Five master students registered
for this seminar:

- Iulian Illies (mathematics)

- Olegas Kosuchinas (Smart Systems)

- Monserrat Rincon (mathematics)

- Vytenis Sakenas (Smart Systems)

- Narunas Vaskevicius (Smart Systems)

Neither these students nor I had
any previous experience with financial forecasting. We therefore structured the
3 months that we had as follows:

- month 1:
    - collective study of basic principles of financial datasets and "traditional" forecasting methods
    - implementation of two of the "traditional" forecasting methods
    - preliminary analysis of the 111 datasets (extracting trend, cyclic, and random components; general aim: getting a feeling for the various mechanisms and features of these time series)

- month 2:
    - for the students: implementation of 5 textbook methods for predicting time series; these were general-purpose methods from various fields, but not specialized financial forecasting methods (two methods from CI: *feedforward neural networks* and *support vector machines*; two methods from signal processing: *wavelets* and *linear IIR filters*; one method from statistical machine learning: *local nonparametric modelling*)
    - for myself: adaptation of the general-purpose prediction method that I had developed over the preceding years (*Echo State Networks*, a special type of neural network)

- month 3:
    - comparative assessment of the methods implemented so far (2 traditional financial, 5 generic textbook, 1 "home-made")
    - design of a "best-possible" prediction method, based on the experience gained so far; this turned out to combine some data-preprocessing steps from "traditional" financial forecasting with Echo State Networks for computing the predictions
    - actually computing the predictions, writing a report paper (a requisite of the competition), and submitting

Of course, we finished only
just-in-time, that is, submitted shortly before the midnight deadline on a
Sunday night, after a feverish weekend spent all together in my office, feeding
on pizzas, doing all the last-minute optimizations that we could think of.

Because we implemented and tested
a variety of "standard" methods during our project, we had a fairly
clear impression that the method which we ultimately designed would be highly
competitive – in fact, we were quite confident that we would score highly
in the competition. But we also knew that at the same time, other highly
qualified groups around the world would feel the same way...

What did these students *learn* in this seminar? Besides the technical content that was
acquired (basics of financial forecasting, an overview of other standard
prediction methods, basics of Echo State Network theory), there were two other
yields from this seminar:

- first-hand experience in the organisation and day-to-day
administration of a complex joint project under unforgiving deadline
pressure, and

- a truly hands-on experience of what "machine
learning" *really* means, beyond all
theory: namely, that the ultimate quality of the models hinges crucially on the
efforts of the *human in the loop*, on
the "feeling" that one has to develop for a particular type of data
and modelling task, which in turn guides the design and parameter-tuning of
the implemented algorithms.

After a week or so of recovery
(and after the students had finished their final exams in other courses), we
met a last time for a barbecue on the lush campus green, on a wonderfully sunny
and warm evening (see picture below).

**Figure.** The team: Olegas Kosuchinas (lying), Iulian Illies and
Narunas Vaskevicius (sitting), Vytenis Sakenas, Herbert Jaeger and Monserrat
Rincon (standing).

Smart Systems seminars are usually
graded, that is, every student receives a grade for his/her specific
performance. We felt that this would not do justice to the great joint effort
and the fully shared work that characterized this seminar. Therefore, we asked
the Dean for permission to register this course as an ungraded course.
Consequently, each participant received a "pass" note in the
transcript and, to compensate, a personal letter of recommendation.

**The method**

The prediction method which we
designed combined

- a preprocessing stage, in which each given time series was
split into a variety of components: for this stage we relied on the specific
decomposition methods developed in financial forecasting;

- a prediction stage, in which the series components were individually
predicted 18 months ahead (to be subsequently recombined): for this stage we
employed Echo State Networks (ESNs), a generic neural-network-based modelling technique pioneered by H. Jaeger.
In the seminar we developed and added a few additional "tricks and
tweaks" to the basic ESN architecture, yielding specific performance
improvements for short, non-stationary time series such as those we encountered in
the competition.
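To illustrate the kind of component split used in the preprocessing stage (this is a generic classical decomposition, not our exact pipeline; the function name, the centred-moving-average trend estimate, and the additive model are assumptions of this sketch), a monthly series can be separated into trend, seasonal, and residual parts like this:

```python
import numpy as np

def classical_decompose(y, period=12):
    """Additive decomposition of a monthly series into trend, seasonal,
    and residual components, using a centred moving average (a generic
    textbook scheme, shown here for illustration only)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Centred moving average as the trend estimate; undefined at the edges.
    kernel = np.ones(period) / period
    ma = np.convolve(y, kernel, mode="valid")   # length n - period + 1
    ma = (ma[:-1] + ma[1:]) / 2                 # centre for even period
    trend = np.full(n, np.nan)
    half = period // 2
    trend[half:half + len(ma)] = ma
    detrended = y - trend
    # Seasonal index: mean detrended value per calendar month, normalised.
    seasonal_idx = np.array([np.nanmean(detrended[m::period])
                             for m in range(period)])
    seasonal_idx -= seasonal_idx.mean()
    seasonal = np.tile(seasonal_idx, n // period + 1)[:n]
    residual = y - trend - seasonal
    return trend, seasonal, residual
```

For a series with a clean linear trend plus a fixed monthly pattern, this split recovers both parts almost exactly; real competition series are of course far noisier.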

Our approach is presented in more
detail in our competition report at http://www.neural-forecasting-competition.com/downloads/NN3/methods/27-NN3_Herbert_Jaeger_report.pdf.

Echo State Networks have been very
successful in predicting time series before. For example, ESNs could predict *chaotic* time series with an almost 10,000-fold improvement in
accuracy over previous methods [2],
or identify speakers from voice recordings, improving the error rate on a
widely used benchmark from a previous best of 1.8% to zero [3].
Herbert Jaeger collaborates closely with a company (Planet AG, www.planet.de) which specializes in text and
handwriting recognition systems and, among other things, develops and delivers parcel
sorting systems for Federal Express. Planet will base its core recognition
modules for handwritten addresses on ESNs, and is currently financing a PhD
position in H. Jaeger's group. International patents for the ESN method are held by the Fraunhofer Society, Herbert Jaeger's former employer.
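The basic ESN scheme (a fixed random recurrent "reservoir" whose states are read out by a linear map trained with ridge regression, then run freely with output feedback to generate a forecast) can be sketched in a few lines. This is not the competition code; all hyperparameter values and the function name below are illustrative assumptions:

```python
import numpy as np

def esn_fit_predict(series, horizon=18, n_res=100, rho=0.8,
                    washout=10, ridge=1e-4, seed=0):
    """Minimal Echo State Network forecaster: fixed random reservoir,
    linear readout fitted by ridge regression on one-step-ahead targets,
    then free-running prediction for `horizon` steps. All settings here
    are illustrative, not the tuned competition values."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=n_res)          # input weights
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))    # reservoir weights
    w *= rho / np.max(np.abs(np.linalg.eigvals(w)))    # set spectral radius
    # Drive the reservoir with the training series (teacher forcing).
    x = np.zeros(n_res)
    states, targets = [], []
    for t in range(len(series) - 1):
        x = np.tanh(w @ x + w_in * series[t])
        if t >= washout:                               # discard initial transient
            states.append(x.copy())
            targets.append(series[t + 1])
    S, d = np.array(states), np.array(targets)
    # Ridge (Tikhonov) regression for the linear readout.
    w_out = np.linalg.solve(S.T @ S + ridge * np.eye(n_res), S.T @ d)
    # Free-running forecast: feed each output back in as the next input.
    u, preds = series[-1], []
    for _ in range(horizon):
        x = np.tanh(w @ x + w_in * u)
        u = x @ w_out
        preds.append(u)
    return np.array(preds)
```

Only the readout weights are learned; the reservoir stays fixed, which is what makes ESN training a simple linear regression problem.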

**Significance of winning the competition**

Winning this competition does not imply that our method is the best currently possible. Inspecting the competition results page, one finds the following entries at the top of the scoring table, which need an explanation:

The SMAPE column gives the average
relative prediction error. Our entry has a SMAPE of 15.18%. Two other
entries have better SMAPEs, of 14.84% and 14.89%, respectively. These
refer to predictions made with methods rooted in the financial
forecasting tradition, which were used in this competition as points of
reference. These specialized "financial forecasting" methods were not
ranked in this competition, which expressly targeted CI methods. One finds,
in sum, that our method placed close to the forefront of a group of
top-performing "traditional" methods.
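For reference, SMAPE (symmetric mean absolute percentage error) in its commonly used form divides each absolute error by the mean of the absolute actual and forecast values; the exact formula used by the NN3 evaluation may differ in detail from this sketch:

```python
import numpy as np

def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent.
    One common definition (undefined where actual and forecast
    are both zero); the exact NN3 formula may differ in detail."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs(a - f) / ((np.abs(a) + np.abs(f)) / 2))
```

For example, forecasting 110 and 190 against true values 100 and 200 gives a SMAPE of about 7.3%.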

The significance of this outcome
is that here, *for the first time, a generic neural network method turned out
to be competitive with specialized financial forecasting methods*. Given that the two superior contributions were entered by
renowned financial modelling institutions/groups with long-standing
experience in financial forecasting, while ours "just" reflects a brief
educational adventure (in true Jacobs University spirit), the performance level
that we reached becomes noteworthy. We see here an indication that neural
networks are finally living up to the original hopes of the field, namely, that
learning from biological systems is ultimately the best way of realizing
artificial intelligence.

[1] Spyros Makridakis, Michele Hibon: The M3-Competition: results, conclusions and implications. International Journal of Forecasting 16 (2000) 451-476

[2] H. Jaeger, H. Haas (2004): Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication. Science, April 2, 2004, 78-80

[3] H. Jaeger, M. Lukosevicius, D. Popovici (2007): Optimization and Applications of Echo State Networks with Leaky Integrator Neurons. Neural Networks 20(3), 335-352