[BIO98 Logo] The British Informatics Olympiad is
the computing competition for schools and colleges.
We are proud to present BIO'98.
[Data Connection logo]
-----

The 1998 British Informatics Olympiad exam

Solution to Question 3: Alphametics (Cryptarithms)

3 (a) [24 marks]. Write a program which solves addition alphametics. Your program should first accept a number n, between 3 and 6 inclusive, which will indicate the number of lines to follow. The following n lines will each contain a word of no more than 10 characters. The alphametic to be solved is the sum of the first n-1 words, to total the final word.

Alphametics are quite common brainteaser questions. They can usually be solved by hand in a methodical way by building up sets of constraints and using these to narrow down on a solution. The task in this question is to write a program to do the process automatically. We will consider two different approaches.

Brute force

A basic approach to the problem would be to try all possible number combinations, checking if a solution had been found for each. In the worst case of 10 unknown characters, this would give 10! = 3,628,800 different permutations to check. This means that an inefficient solution will probably not run fast enough. However, this number is small enough that a carefully coded solution can run quite quickly, and is the approach we will take in solving this problem.

Logical programming

It is possible to take the approach a person would use and program it into a computer. The alphametic is broken up into a set of logical rules which must all be satisfied by a valid solution. The number of solutions is small and the search space of this approach is usually related to it, so this would seem to be a way to produce a very fast program. Several 'Artificial Intelligence' languages such as Prolog exist and would be very appropriate for this kind of problem - but they are not commonly available. Writing a solution that draws entirely on logic would be very hard given the time constraints, but it is possible to use aspects of this approach to make brute force rather faster.

An efficient brute force algorithm

Assuming that the entire range of possible solutions can be generated quickly - and it can - brute force could be used if an algorithm can be found that allows each possible solution to be checked swiftly. To do this it is necessary to minimise the need to decode or look up values, and preferably keep to simple arithmetic that will run quickly. We can rewrite the alphametic - ignoring the leading zeros constraint - as an element by element sum of two short arrays. This can be evaluated very quickly. The reasoning goes as follows.

Let there be N lines and M characters. We write the nth line as
Wn1 * X1 + ... + WnM * XM
where Xm is the value assigned to character m. The sum can then be written as
W11 * X1 + ... + W1M * XM +
W21 * X1 + ... + W2M * XM =
W31 * X1 + ... + W3M * XM

The solution must satisfy
W1 * X1 + ... + WM * XM = 0
where Wm = W1m + ... + W(N-1)m - WNm, and X1..XM can only be distinct digits from 0 to 9. We can check for leading zeros later, as there will be few and we want to keep this routine fast.

It is necessary to choose a data representation that will allow us to manage the (potentially large) values that can be generated - which may be of the order of 10 to the power 11. Long (32-bit) integers will not do. However, many machines provide a 64-bit integer type (comp in Turbo Pascal, long in Java), a 8 or 6-byte floating point type (double or real) which is also suitable. Even if only 32-bit integers are used, the programs will only fail when the words of the alphametic are 8 characters or more - in fact, only one of the six test cases used in marking probes this.

The algorithm is therefore:

It is also possible to maintain a 'sum so far' by permuting X1..XM in turn and storing
W1 * X1 + ... + Wp * Xp
where p is the latest letter that is being generated.

An example solution that implements this method has been written as bio98q3.java. It can be viewed in a Java-enabled browser or run on a Java-capable computer from the example solution to Q3 page.

Marking

The program for part 3(a) is to be marked with six tests.

In tests 1 and 2 the alphametics have multiple solutions. All valid solutions are listed, sorted by total; a program need only give one solution. [4] marks are to be given for any correct solution, but [2] of those marks should be deducted if the program also prints 'Unique'.

Test 1: [4]
3
ONE
ONE
TWO
206 + 206 = 412
216 + 216 = 432
231 + 231 = 462
236 + 236 = 472
271 + 271 = 542
281 + 281 = 562
286 + 286 = 572
291 + 291 = 582
407 + 407 = 814
417 + 417 = 834
427 + 427 = 854
432 + 432 = 864
452 + 452 = 904
457 + 457 = 914
467 + 467 = 934
482 + 482 = 964
Test 2: [4]
3
FATHER
MOTHER
PARENT
186753 + 296753 = 483506
286753 + 196753 = 483506

Tests 3 and 4 have unique solutions. For each test [4] marks should be given for printing the correct solution and the word 'Unique'. If the word 'Unique' is absent, only [2] marks should be given for the correct solution.

Test 3: [4]
4
SEVEN
SEVEN
SIX
TWENTY
68782 + 68782 + 650 = 138214
Unique
Test 4: [4]
6
THREE
THREE
TWO
TWO
ONE
ELEVEN
84611 + 84611 + 803 + 803 + 391 = 171219
Unique

Tests 5 and 6 have no solutions. For each test [4] marks should be given for printing the word 'Impossible', and [0] marks are available for any other output.

Test 5: [4]
3
BIO
FIRST
ROUND
Impossible
Test 6: [4]
5
SEVENTEEN
SEVENTEEN
SEVENTEEN
SEVENTEEN
SIXTYEIGHT
Impossible

(N.B. Test 5 is not the same as the example alphametic given in question 3(b).)

A total of 24 marks are available for a question that solves all of these test cases for 3 (a).

-----

3 (b) [4 marks]. Find two solutions to BIO + ROUND = FIRST.

This alphametic was chosen to be one that could relatively easily be solved by hand, so that participants who had not solved 3 (a) could pick up marks here. From inspection, F must be R+1, O must be 9 and I must be 0 (zero).

There are 16 different solutions, listed below and sorted by total. Score [2] marks for a single correct solution, and [4] marks for two correct solutions. Additional correct solutions, and any incorrect solutions, should be ignored.

509 + 19638 = 20147
609 + 19538 = 20147
309 + 19847 = 20156
809 + 19347 = 20156
309 + 19865 = 20174
809 + 19365 = 20174
509 + 19674 = 20183
609 + 19574 = 20183
509 + 39817 = 40326
809 + 39517 = 40326
509 + 39862 = 40371
809 + 39562 = 40371
709 + 59814 = 60523
809 + 59714 = 60523
709 + 59832 = 60541
809 + 59732 = 60541
-----

3 (c) [3 marks]. If you are only allowed to use a single letter, is it possible to construct an addition alphametic with a solution? If you believe it is possible give an example alphametic, and if you believe it is impossible give a brief explanation why.

[1] mark is available for making the assertion that yes, it is possible. A further two marks are available by giving an example alphametic [1] and a numeric solution to it [1]: for example
A+A+A+A+A+A+A+A+A+A+A = AA has a solution with A equal to 1,2,3,4,5,6,7,8 or 9.

-----

3 (d) [5 marks]. An alphametic is being created, where ABC and DEA are to be summed, and the total is to only use the letters A, B, C, D or E. How many possibilities (i.e. valid letter combinations) are there for the total, so that the resulting alphametic has a solution? How many different sums can be represented by such an alphametic?

This requires us to write a program to generate all possible totals (any combination of three or four of the letters A to E, allowing a letter to be repeated) and, for each, check if it has a valid solution.

Separate counts of the valid combinations and the different sums must be maintained - we can count the different sums if an algorithm goes through the letter/digit permutations in a non-repeating way, but several different sums may be represented by the same letter combinations, so they must be counted at the end.

To speed the search up, the numeric codes (W) corresponding to each solution can be generated in advance, and a search similar to that in 3 (a) is then possible.

This part is also solved as function try3d() in bio98q3.java, and is called by entering -1 for the number of lines. It can be viewed in a Java-enabled browser or run on a Java-capable computer from the example solution to Q3 page. The code displays every combination and so runs rather slowly in the browser or in a console. Running it with a command-line Java interpreter and redirecting the output to a file makes it much quicker.

Marking

[2] There are 163 valid alphametics/letter combinations
[3] 1136 different sums can be represented.

Maximum 5 marks for part 3 (d).

-----

The British Informatics Olympiad