Pause before disliking

Actually I do not think that there are any wrong reasons for liking a statue or a picture. Someone may like a landscape painting because it reminds him of home, or a portrait because it reminds him of a friend. There is nothing wrong with that… It is only when some irrelevant memory makes us prejudiced, when we instinctively turn away from a magnificent picture of an alpine scene because we dislike climbing, that we should search our mind for the reason for the aversion which spoils a pleasure we might otherwise have had. There are wrong reasons for disliking a work of art.

- E.H. Gombrich, The Story of Art

 

A sage function for computing Kloosterman sums

This is a function I use to compute the Kloosterman sum

\mathcal{K}_q(a) = \sum_{x \in \mathbb{F}_q} \zeta^{\text{Tr}(x^{-1}+ax)}

where \zeta is a primitive p-th root of unity.

It’s written for sage, the free, open-source computer algebra system. The reason for writing this is laziness; sage’s internal implementation of Kloosterman sums uses an approach which is different from mine, defining them via Dirichlet characters. I never bothered to find out if that approach could be made to be compatible with the more familiar (to me) finite field definition above. Perhaps I’m missing a trick by not understanding that approach; however the code below worked well enough for me to either falsify or corroborate any conjectures I was working on.

It makes use of one pretty handy optimisation: the connection between the Kloosterman sum and the cardinality of a specific elliptic curve in the characteristic 2 and 3 cases.

def Kloosterman(a):
  L=a.parent()
  assert(L.is_field())
  assert(L.is_finite())
  p == L.characteristic()
  q = L.cardinality()
  if a == 0:
    return 0

  if p == 2:
    return KloostermanBinary(a)

  if p == 3:
    return KloostermanTernary(a)

  Zeta = CC(e^(2*I*pi/p))
  K = 0
  for x in L:
    K = K + Zeta^(ZZ((x^(q-2)+a*x).trace()))

  assert (K.imag_part() < 0.0001)
  K = K.real_part()
  return K

def KloostermanBinary(a):
  L=a.parent()
  assert(L.is_field())
  assert(L.is_finite())
  assert(L.characteristic() == 2)
  if a == 0:
    return 0

  q = L.cardinality()
  return EllipticCurve(L,[1,0,0,0,a]).cardinality()-q

def KloostermanTernary(a):
  L=a.parent()
  assert(L.is_field())
  assert(L.is_finite())
  assert(L.characteristic() == 3)
  if a == 0:
    return 0

  q = L.cardinality()
  return EllipticCurve(L,[0,1,0,0,-a]).cardinality()-q

Do you want to solve problems, or create them?

Both are legitimate.

Karl Gerstner is a graphic designer and artist. He is quoted here as saying

I see graphic design as a matter of solving problems; art as a matter of inventing them.

My own career has spanned engineering and mathematics. I think a similar distinction applies between those fields.

It’s certainly true that engineering is a matter of solving problems. And while problem-solving is a part of mathematics, it is certainly not the only goal of mathematics. For evidence, one could consider the late Bill Thurston’s wonderful essay On proof and progress in mathematics, and his fascinating account of how his solutions of the major problems in foliation theory discouraged other mathematicians from doing research in the area, thereby all but killing the subject as an active field.

What is the purpose of mathematics? That’s a knotty question. One glib, but not entirely inaccurate answer is this: the aim of mathematics is to produce more mathematics. Under this view, the task of problem creation is at least as important as problem solving.

The virtue of mathematical proof

The virtue of a logical proof is not that it compels belief, but that it suggests doubts.

-H.G. Forder, The Foundations of Euclidean Geometry

It is one of the chief merits of proofs that they instil a certain scepticism as to the result proved. As soon as it was found that the similarity of whole and part could be proved to be impossible for every finite whole, it became not unplausible to suppose that for infinite wholes, where the impossibility could not be proved, there was in fact no such impossibility.

-Bertrand Russell, The Principles of Mathematics

If a conjecture seems very plausible or even self-evident, one should prove it; one may find that it hinges on very sophisticated and dubious lemmas.

-Imre Lakatos, Proofs and Refutations

Self-evident, adj.  Evident to one’s self and to nobody else.

-Ambrose Bierce, The Devil’s Dictionary

A curious characterization of trigonometric addition

The following is a rather odd observation on the familiar trigonometric addition formulae, by which I mean those trusty old stalwarts

\cos (\theta + \phi) = \cos\theta \cos\phi - \sin\theta \sin\phi \ \dots\ (1)

and

\sin (\theta + \phi) = \cos\theta \sin\phi + \sin\theta \cos\phi \ \dots\ (2)

One way to view this pair of formulae is as providing an operation of addition on the points of a circle, under which it is a group with identity (1,0). I’ll denote this addition operation by a plus sign enclosed in a circle:

(x_1,y_1) \oplus (x_2,y_2) = (x_1x_2-y_1y_2, x_1y_2+x_2y_1)

For a specific example, let’s take

\theta = \frac{\pi}{6},\ \phi = \frac{5\pi}{8}:

Addition of points on a circle

The red point in the above picture is the sum of the two blue points.

Now, let’s draw a hyperbola. (Why? Well, why not?) A hyperbola can be specified by three points; we’ll take the two summands (the blue points) and the point (-1,0).

Superimposing a hyperbola

The hyperbola intersects the circle in a fourth point. Let’s draw a vertical line through this point of intersection.Circle, hyperbola, and vertical line

It turns out that this vertical line crosses the circle at precisely the red point, the sum of the two blue points!
So there’s some sort of connection between the addition of points on a circle, and the intersection of a hyperbola with this circle.
Here’s a little sage script that verifies that this is general, at least where the points are distinct:

P.<ct,st,cf,sf> = PolynomialRing(ZZ,4)
x = ct*cf-st*sf
y = -st*cf-ct*sf

A = (ct-cf)*st*sf
B = (ct+1)*cf*sf-(cf+1)*ct*st
D = (cf+1)*st - (ct+1)*sf

A*(1+ct) + B*st +D*ct*st == 0

A*(1+cf) + B*sf +D*cf*sf == 0

A*(1+x)+B*y+D*x*y     ==     ct^3*cf*sf^2 + ct^2*st*cf*sf\\
 - ct^2*st*sf^3 + ct^2*st*sf - ct*st^2*cf^3 + ct*st^2*cf\\
 - ct*st*cf^2*sf - ct*st*sf^3 + ct*st*sf - ct*cf*sf^2\\
 + st^3*cf^2*sf + st^3*cf*sf - st*cf^2*sf - st*cf*sf

#Now swap ct^2 for 1-st^2, cf^2 for 1 - sf^2

ct*(1-st^2)*cf*sf^2 + (1-st^2)*st*cf*sf - (1-st^2)*st*sf^3\\
 + (1-st^2)*st*sf - ct*st^2*cf*(1-sf^2) + ct*st^2*cf\\
 - ct*st*(1-sf^2)*sf - ct*st*sf^3 + ct*st*sf - ct*cf*sf^2\\
 + st^3*(1-sf^2)*sf + st^3*cf*sf\\
 - st*(1-sf^2)*sf - st*cf*sf     ==     0

Where did the idea to do this come from? If you’ve ever seen the ‘chord-and-tangent’ addition of points on an elliptic curve, you might notice some family resemblance. The connecting tissue is provided by Edwards curves. These are curves with equation

x^2+y^2 = 1+ dx^2y^2

If we set the parameter d to be zero, we simply get a circle. If we take a d which is different from zero (and also different from one), we have a fascinating class of curves discovered by Harold Edwards to be closely connected to elliptic curves. A paper by Arène, Lange, Naehrig and Ritzenthaler illustrated that the chord-and-tangent addition procedure on the elliptic curve maps to a hyperbola on the Edwards curve.

With this background, the observation above is simply that this hyperbolic point addition procedure also holds for that degenerate Edwards curve, the circle. Nonetheless, it is, to the best of my knowledge, not an observation that has been made before.

The sum of the inverses of finite field elements with given trace

For this post, I’m going to assume the reader has enough mathematics to know what a finite field is, and that a finite field has a function defined on it called the trace, but not much more than that.

One of my favourite results from my thesis is the following (Lemma 4.2, p. 44):

The sum of the inverses of the elements of a finite field with a given nonzero trace is the inverse of that trace.

Or, more symbolically, if

T_{r} = \{a\in \mathbb{F}_{q} | {\rm{Tr}}(a) = r \}, \ r \ne 0,

then

\sum_{t \in T_{r}} t^{-1} = r^{-1}.

Although the proof doesn’t require anything beyond undergraduate mathematics, I think it’s quite a nice little proof all the same, and I’ll give an outline of it below. Given that the result is so straightforward to state, I’m a little surprised that, to the best of my knowledge, it has not been noted before.

In the course of the proof, I also show the following related results:

  1. The product of the elements of a finite field with a given trace is that trace.
  2. The product of the inverses of the elements of a finite field with a given nonzero trace is the inverse of that trace.

The second fact is an immediate corollary of the first. The proof of these two facts is even simpler than the title result, but again, not to my knowledge previously known.

The Proof

Let’s fix some notation first. Let q = p^n be the order of the finite field where p is a prime. As above,

T_{r} = \{a\in \mathbb{F}_{q} | {\rm{Tr}}(a) = r \}, r \in \mathbb{F}_{p} \setminus \{0\}

and

{\rm{Tr}}:\mathbb{F}_{q} \to \mathbb{F}_{p},

a \mapsto a + a^p + a^{p^2}+\cdots+a^{p^{n-1}}

is the trace function.

Let g(x) be the monic polynomial which vanishes on the p^{n-1} elements with trace r. This polynomial we can express in two ways:

g(x) = \prod_{t\in T_{r}} (x-t),

and

g(x) = x^{p^{n-1}} + x^{p^{n-2}} + \cdots + x - r.

Looking at the independent term of that polynomial, then, we get the identity

\prod_{t \in T_{r}}(-t) = -r.

By the way, this implies that

(-1)^{p^{n-1}}\prod_{t \in T_{r}}t = -r,

so

\prod_{t \in T_{r}}t = r.

A moment’s thought about the possible cases (p being even or odd) should make it clear why we can drop the minus sign.

Inverting this last equation gives us that

\prod_{t \in T_{r}}t^{-1} = r^{-1}.

This demonstrates facts 1. and 2. above. With that out of the way, let’s proceed with the main proof.

The reciprocal polynomial of g is defined, as usual, to be

g^{*}(x) = x^{p^{n-1}}g(1/x).

So

g^{*}(x) = x^{p^{n-1}}\prod_{t \in T_{r}}(\frac{1}{x}-t)

= \prod_{t \in T_{r}}(1-tx)

= \prod_{t \in T_{r}}(t)(t^{-1}-x)

= \prod_{t \in T_{r}}(-t)(x-t^{-1})

= \prod_{t \in T_{r}}(-t)\prod_{t \in T_{r}}(x-t^{-1})

= -r\prod_{t \in T_{r}}(x-t^{-1})

= -rh(x),

where

h(x) = \prod_{t \in T_{r}}(x-t^{-1})

is the monic polynomial which vanishes on the p^{n-1} inverses of the elements with trace r.

Thus,

h(x) = -r^{-1}g^{*}(x).

Now, remember that the reciprocal of a polynomial f is the polynomial whose coefficients are the coefficients of f, but in the reverse order.

So if

f = a_mx^m+a_{m-1}x^{m-1}+\cdots+a_1x+a_0,

then

f^{*} = a_0x^m+a_{1}x^{m-1}+\cdots+a_{m-1}x+a_m.

This is immediate from the definition. So, since we have that

h(x) = -r^{-1}g^{*}(x),

this means that

h(x) = x^{p^{n-1}}-r^{-1}x^{p^{n-1}-1}-\cdots-r^{-1}x^{p^{n-1}-p^{n-2}} - r^{-1}.

Recalling that

h(x) = \prod_{t \in T_{r}}(x-t^{-1}),

and comparing the second coefficient (i.e. the coefficient of x^{p^{n-1}-1} ) of each of these representations of this polynomial, it is clear that

\sum_{t \in T_{r}}(-t^{-1}) = -r^{-1},

or equivalently,

\sum_{t \in T_{r}}t^{-1} = r^{-1},

QED.

What is Popper’s problem of the empirical basis?

For the last couple of weeks I’ve been reading Karl Popper’s The Logic of Scientific Discovery, a true masterpiece of the philosophy of science.  Both intellectually rich and clearly and persuasively written, the book really is a joy to read. This post is my attempt to get to grips with one of the knottier problems considered within it, that of the empirical basis.

Falsifiability and Basic Statements

When I’m asked why anyone should bother to study philosophy, one of the advances I mention is Popper’s introduction of the condition of falsifiability (as opposed to the verificationist stance of the logical positivists) as that which distinguishes the statements of science from those of non-science. In his own words (with his emphasis):

We must choose a criterion which allows us to admit to the domain of empirical science even statements which cannot be verified. But I shall certainly admit a system as empirical or scientific only if it is capable of being tested by experience. These considerations suggest that not the verifiability but the falsifiability of a system is to be taken… it must be possible for an empirical or scientific system to be refuted by experience.

This is eminently sensible. How, for example, could one hope to verify that Newton’s law of universal gravitation holds between all masses in the universe, for all time? But if Newton’s law were false, the act of falsification would simply involve observing an instance in which certain masses failed to attract each other as predicted. Thanks to Popper, this condition is now generally known and accepted by the scientific community (physicists will dismiss an unfalsifiable theory as ‘not even wrong‘).

The crux of Popper’s principle is this: a statement of a scientific theory should be universal. Roughly speaking, it asserts something about all entities with some given property. On the other hand, the statements which humans can make based on their own experience are basic; they correspond to particular events. The difference is that between the statements  ‘All swans are white’ and ‘This swan is white’. The latter can, in principle, be conclusively decided one way or the other; the former cannot be conclusively verified, only falsified.

From a universal statement, we can deduce a multitude of basic statements. The negation of any one of these basic statements will falsify that universal statement. For example, the existence of any one black swan will disprove the theory that all swans are white.

To the contrary (and this is Hume’s problem of induction), we cannot infer a universal statement from the conjunction of any finite number of basic statements. In our example, no matter how many white swans we observe, we cannot conclude that all swans must be white.

To digress for a moment, we might ask, what about the statement ‘All swans currently living on earth are white’? It is an ‘all-statement’, but can, in principle be conclusively decided. As this refers to entities within a bounded region of spacetime, Popper distinguishes this as a numerically universal statement, as opposed to a strict universal statement; only the strict universal statements are those which Popper sees as being characteristically scientific.

The Empirical Basis

Now we get to the problem mentioned in the title of this post, the problem of the empirical basis. The problem: what precisely is the relationship between perceptual experiences and basic statements?

Clearly a basic statement, one which describes an event, has its source in the experience of someone’s senses. On the other hand, the rules of logic only give us the means to deduce statements from other statements, not from experiences. To Popper, the great danger is that of falling into the trap of what he terms psychologism; the view that basic statements are at bottom justified by a statement of the type ‘I am personally convinced that so-and-so is true’. It’s plain to see why allowing justifications of this nature will play havoc with the rational search for truth.

Popper’s Answer

What is Popper’s solution to this problem? He abandons any notion of logically justifying basic statements. Instead, “basic statements are accepted as the result of a decision or agreement; and to that extent they are conventions”. Much like in statistical hypothesis testing, we do not exactly assert the truth of a basic statement; we simply decide whether to accept or reject it. This is a procedure based on rules, but we do not have any absolute guarantee that this procedure will give us the correct result – any more than we can guarantee that a jury’s verdict in a criminal trial will be the correct one.

To Popper, the role of sense experience is to provide evidence which can be used to arrive at a decision – but ultimately, the acceptance or rejection of a basic statement is a decision.