My research and investigations

Archive for January, 2011

ML as a Programming Language

As a programming language, ML is quite powerful: it features both functional and procedural styles.

ML is born a functional language — indeed the very first implementation of ML is in Lisp, the grand-daddy of functional programming. Like Lisp, ML has a top-level read-evaluate-print loop, making it interactive. However, ML is not an interpretative language: the top-level interactive loop is actually running an incremental compiler. Most ML implementations will compile ML into some form of lambda expressions. Possible ways to deal with these lambda expressions are:

  1. Interpret the lambda expressions.
  2. Compile the lambda expressions into byte-codes of some abstract machine, and run the abstract machine.
  3. Convert the abstract machine byte-codes into native machine codes, to be run directly in real machines.

In pure functional programming, data are only transformed, never modified, hence no side-effects. This simplifies reasoning about program correctness. However, real-world applications need to interact with the world (e.g. response to mouse clicks) — which depends on side-effects. Procedural programming is largely based on data changes, the most explicit being the use of pointers in C/C++ to modify memory locations directly.

Intended for building practical applications (like theorem provers) ML provides procedural features such as assignments and references. As a result ML shares many goodies of mainstream programming languages, e.g. use of modules and separate compilation.

However, even with these many features, ML is not a bloated language. It has a well-thought-out design, just using typed lambda calculus. The evolution of ML is a story about how to implement functional languages efficiently, and how to achieve performance comparable to procedural languages.


Learning ML

I start to play with ML, the MetaLanguage for theorem-provers. It is developed by Robin Milner and others in the early 1970s at the University of Edinburgh. It is now standardized as SML, the Standard ML.

A good ML Tutorial is this:
Tips for Computer Scientists on Standard ML (Revised) by Mads Tofte (2009), which describes all of Standard ML in 20 pages.

To try out ML, I install Moscow ML by Sergei Romanenko. The Windows installer is very convenient. It runs in command-line mode, which is interactive enough for trying out ML.

For a few lines, typing is OK. For many more lines, an editor is much preferred. I’m using TextPad, and I’m able to locate an SML syntax file for TextPad.

Poly/ML by David Matthews is also good for learning ML. It is Windows-based, and the documentation includes a short tutorial on programming Windows in Poly/ML. Sounds exciting.

I can run ML scripts with both Moscow ML and Poly/ML. The Poly/ML window even has copy/paste — so I can copy ML snippet from tutorials and paste to the window for execution.

The language ML is rich with features, so it may take some time to know it well.

Finite Fields – many Models

The story starts here.
G: Let the first one introduce himself.

B: George Boole (1815-1864), from England. Did you read my book?
X: Which book?
G: An Investigation of the Laws of Thought of 1854, a classic. I read it just before I logged off.

G: You have another model for GF(2)?
B: Yes, and it isn’t based on numbers.

G: Spell out the details, please.
B: It is based on simple sentences, those you can tell immediately whether it is true or false.

G: This is straight from your book!
B: I also consider simple ways to combine such sentences, only the words and, or, called connectives.

G: That’s two symbols, two operations. Interesting.
B: The connectives can join the simple sentences in 8 possible ways:

Sentence Computations  
Or false true
false false true
true true true
And false true
false false false
true false true

X: Exactly how do you compute with sentences?

G: Think logically. I’m here and you’re here: true or false? I’m here or I’m there: true or false?
X: I see. When connected by and, both parts must be true to have true finally. When connected by or, both parts must be false to be false finally.

G: That’s Boole’s Laws of Thoughts!
B: And it gives a model of GF(2) if we use this dictionary:

Boole’s proposed model of GF(2)  
Symbol Meaning
0 false
1 true
Operation Meaning
+ logical or
× logical and

X: That’s brilliant! A symbolic model for a symbolic system!

G: Not quite. This model is wrong!
G: I spot the error: in your model: true or true is true. Translate by the dictionary: 1+1=1, which is not correct in GF(2).

B: Oops! How embarrassing! I made a boolean error!
G: That’s the lesson. We need to check carefully to ensure all dictionary translations are perfect.

B: Wait a second. I can fix my error!
G: Don’t make the same mistake twice.

B: No, I won’t. Instead of or, use exclusive or, shortened to xor. I just redo all the computations:

Sentence Computations  
Xor false true
false false true
true true false
And false true
false false false
true false true

X: What is xor, or exclusive or?
G: It means either this or that, but not both. Tomorrow either the sun will shine or rain will fall. If tomorrow is foggy or a rainbow appears, my prediction is wrong.

B: And the Boole model of GF(2) is:

Boole model of GF(2)  
Symbol Meaning
0 false
1 true
Operation Meaning
+ logical xor
× logical and

G: That’s an impressive quick fix!
G: There are further check … it’s all OK, I declare the Boole model of GF(2) is verified correct.

X: Did you just invent a logical operation to fit into the model?
B: No. It’s all in my book. Using either-or to join sentences is rather common.

G: Now the second model. Your name please.
E: Euclid of Alexandria (300 BC) from Greece. Excuse me for my dates as I’ve lost count.

G,G,B: (bowing) We all read your book.
X: Which book?
G,G,B: (angry looks) Ssh!

E: My classic text Elements was the bible of geometry for more than 20 centuries, until the rise of modern math.
G: Your method of doing math is still being used in all math textbooks.

E: I try to catch up with modern math, so I’ve been doing a bit of abstract algebra lately.
G: I suppose you have a model of GF(2) based on geometry?

E: Take a non-symmetric alphabet, say the letter E in my name. There are two ways you can play with it: either leave it, or reflect it to ?.
B: See, Euclid is using my xor!

E: The first task, which is doing nothing, I call it identity. The second task, which is flipping over, I call it reverse.
B: I can see that.

E: Given two tasks, you can either do both of them, or just do the first one and ignoring the second one.
B: Another xor. First is diligent, while the second is lazy.

E: There are 8 ways to combine the tasks with the do-both or do-one operations:

Task Computations  
Do Both identity reverse
identity identity reverse
reverse reverse identity
Do One identity reverse
identity identity identity
reverse identity reverse

E: This gives a model of GF(2) using the following dictionary:

Euclid’s proposed model of GF(2)  
Symbol Meaning
0 identity
1 reverse
Operation Meaning
+ Do -Both
× Do-one

X: Another impressive feat!

G: Look carefully ¡X this model is wrong!
G: If your Do-one means to do the first one, then the Do-one table has an error: assume the verticals denote the first, we should have Do-one(reverse,identity)=reverse, not identity as shown.

E: Oops, I may not be so good in algebra!
B: We do have to check thoroughly,

E: But I think I can fix this ….. Yes, I can. Instead of Do-one meaning to do the first one, let’s change it to do the easy one, assuming the task reverse is harder than the task identity, which is doing nothing.
B: This gives the revised 8 computations:

Task Computations  
Do-them-both identity reverse
identity identity reverse
reverse reverse identity
Do -easy-one identity reverse
identity identity identity
reverse identity reverse

E: And with this dictionary we surely have a model of GF(2):

Euclid model of GF(2)  
Symbol Meaning
0 identity, the easy task
1 reverse, the hard task
Operation Meaning Name
x+y do x, do y. Do-them-both
x×y do easy(x,y) Do-easy-one

G: Let me do further check …. they’re OK. I declare the Euclid model of GF(2) verified correct.

G: This Euclid model inspire me to have still another model of GF(2).
X: Another model?!

G: Sure. This one is based on numbers, but not modulo 2.
G: Let hear the details.

G: Take the two numbers +1 and 1. Symbolic + is numerically multiply, and symbolic × is take the maximum.
G: This is very interesting. Let me compute:

Numeric Computations  
Multiply +1 -1
+1 +1 -1
-1 -1 +1
Maximum +1 -1
+1 +1 +1
-1 +1 -1

B: And use this dictionary for the model of GF(2):

Gauss model of GF(2)  
Symbol Meaning
0 +1
1 -1
Operation Meaning
x+y multiply(x,y)
x×y maximum(x,y)

G: Further checks are OK. This is indeed Gauss model of GF(2).

X: How to get this model from Euclid model?
G: When I see that reversing twice give identity, I recall (1)(1)=+1. Euclid used the easier task to pick the identity result in his multiplication, so I try the lesser of 1 and +1. It doesn’t work, but reverting to the bigger of 1 and +1 works.

G: Very good indeed. This inspires me to yet another model of GF(2)!
X: Still more?

G: Yes. Take the symbols 0 and 1 as numbers, but let + to mean difference, and × to mean minimum.
X: What is difference?

G: That’s subtraction without sign: always taking the smaller from the bigger.
X: Oh, I see. Let me do the computations:

Numeric Computations  
Difference 0 1
0 0 1
1 1 0
Minimum 0 1
0 0 0
1 0 1

G: To match GF(2), the dictionaries for symbols and operations are:

Galois model of GF(2)  
Symbol Meaning
0 Number 0
1 Number 1
Operation Meaning
x+y difference(x,y)
x×y minimum(x,y)

G: That’s excellent.
G: Further checks are OK in this case. This is my Galois model of GF(2).

B: It is thrilling to see so many models of GF(2).
X: Is there a most suitable one among the many choices?

G: That’s a good question. It’s too late. We’ll call it a day.
X: Yes, next time.

Y: This is way too late.

X: I learn too many models of GF(2) in one day. I’m exhausted.

Software Bugs and Theorem Proving

Apple iPhone users are greeted with a silent alarm at the first day of 2011 (see here and here). All evidence points to a software bug in the operating system.

This is not the first widespread software bug from a well-known manufacturer. In 1994, a math professor found that the Intel Pentium CPU had a bug when doing divisions. It caused a sensation in the IT world (see here), and ultimately Intel recalled all flawed processors.

All modern electronic devices (your desktop, laptop, notebook, netbook, digital camera, mobile phone, mp3 player, etc.) contain software. These software are designed and implemented by smart engineers, and checked and tested by a team of testers. The aim is to get rid of every last software bug. After all these rigorous processes the software is 99.99999% correct — but it is the 0.00001% that embarrasses the company.

The only way to claim the perfect 100% is to prove it, mathematically. This means translating the design and implementation into the language of math, and to show that the implementation follows logically from the design. This is tedious work, not usually done by a human mathematician. Rather it is the job for theorem-provers, special programs that can perform automated reasoning.

You’ve probably never heard of these terms. It’s because traditionally theorem-provers exist only in universities, for academics doing research in automated theorem proving. Since the Pentium bug, hardware companies (like AMD) are relying on theorem-provers to verify microprocessor designs. Even software companies (like Microsoft) are doing research in theorem proving.

Now if you put on your logic hat, you’ll find an interesting situation: theorem-provers are themselves software coded by human, how can we be sure that theorem-provers don’t have bugs?

The answer is: we can’t be sure, but we do take measures to reduce the chance of any bugs.

Firstly, theorem-provers have a very small core, call the inference engine. Basically, it is capable of manipulating symbols, according to a few simple rules. Keeping everything simple means less hiding spots for bugs. The system starts with a few fundamental math facts (like every number has a next one), and requires any additional facts must be first proved by the inference engine. Thus the system builds up its power on a firm basis.

Secondly, theorem-provers are open-source projects. This means the source code of system, in particular its core inference engine, is open for inspection. In these days of Internet, you can download the source code. With appropriate tools, you can compile the codes and run the system in your own computer. People with like interest in the system will form a group checking the merits of the system, and contribute to any bug-fix or improvement of the source code. There is a saying in IT community: “given enough eyeballs, all bugs are shallow”.

The theorem-prover I’m using for my project is HOL4, which is open-source with a small core. Here is a readable introduction, and here is a technical overview.

My supervisor is one of the main developers of HOL4.