M.S. Thesis Project
(in cooperation with Ernst & Young, Copenhagen)

Title: Efficient payment transaction matching

Problem domain:

Payment transaction matching (henceforth simply "transaction matching")
is the problem of matching two sets of payment transactions, each
containing an amount, a date and other attributes; e.g. one
representing bank account transactions, the others postings to a
journal or day book. Matching (Danish "afstemning") means, informally: (a) identfying
corresponding (groups of) transactions based on (payment) amounts,
dates, identifying strings and/or other criteria; and (b) computing the difference, if any, of the
aggregate amounts of the matched groups.

Background:

Ernst & Young have developed an Excel/VBA based tool called RECON for performing
rule-based transaction matching. The rules enable user-specified matching.
RECON works as follows: For each rule, for each transaction group in the one
set satisfying the rule and for each transaction group in the other set
satisfying the rule, check whether the two groups match according to
the rule and, if so, output aggregate amounts of the two matching
groups as well as their difference. Because RECON employs nested loops, it results in prohibitive
Theta(r m n) run-time complexity, where r is the number of rules, m
the number of transactions in one set and n the number of transactions
in the other set.

Objectives:

The overall goal of the project is to develop a transaction matching tool
with drastically improved time efficiency.

Specific objectives are:

1. Analysis and formal definition of relevant variations of matching.
What exactly does "matching" mean?

2. An efficient and scalable algorithm for and implementation of
automatic, scalable (quasi-linear time for fixed r) transaction matching based on
specifications of matching rules; in particular it should have
interactive performance on tens or hundreds of thousands of
transactions in each set and respond in no more than a few seconds for
millions of transactions in each of the two transaction sets.
In particular, it should employ and evaluate applicability of discriminatory joins with
symbolic products [1].

2a. Optional extension: Consider both Boolean (does or does not match)
and probabilistic matching (provide "goodness of match" measure for ranking
different possible matches).
For this part, no particular GUI or implementation platform is stipulated.

3. A tool that: incorporates the algorithm in 2; supports manual matching
and unmatching (revoking/removing matchings "by hand"); supports convenient user
interaction for import and export of data, for reporting of matching results,
and for specification of matching rules. Use of .NET (see 4 below) is welcome but
not required.

4. Analysis of programming platforms for developing the tool for use in
and at Ernst & Young; that is, taking account of Ernst & Young's
technical setup and intentions for the tool. In particular, consider
the .NET platform as development platform, with VisualStudio's
support for development in high-level languages such as C# and F#, as
an alternative to developing with Excel and VBA.

Advisors:

Main advisor: Fritz Henglein, DIKU
External advisor: Nishandan Ganesalingam, Ernst & Young

The main advisor is responsible for the project overall and is primarily responsible
for advising on objectives 1 and 2 above. The external advisor is primarily
responsible for advising on objectives 3 and 4.

Period:

Start: Between January 1st and 30th, 2012
End: ca. June 30th, 2012

Size:

1-2 persons, 30 ECTS points each.

Prerequisites:

o Eligibility for starting an M.S. thesis.
o Programming Languages and Systems profile recommended, but not required.
o Signing of nondisclosure agreement, which provides access to
source code of and information on RECON.

Language:

Advising: English or Danish (. M.S. thesis: English.

More information:

Contact Fritz Henglein, DIKU room 3-2-17, tel. 30589576, email henglein@diku.dk.
(Please note: unavailable Dec. 23, 2011 to January 8th, 2012.)

[1] Fritz Henglein, Ken Friis Larsen, "Generic Multiset Programming with Discrimination-based Joins and Symbolic Cartesian Products",
Higher-Order Symbolic Computation, 2011, DOI 10.1007/s10990-011-9078-8; preprint available from http://www.diku.dk/~henglein > "Papers".

Project info
Deadline: 
31. Januar 2012
Contact info
Kontakt: 
Fritz Henglein
E-mail: