Generalization of strip formula to discs of different sizes

These notes, dated 30 June 1985, provide a generalization of the strip method for estimating the number of chance alignments in a random set of sites.  The sites are supposed to be discs, and a set of sites is considered to be aligned if a straight line can be drawn that intersects all the discs.  The discs are now allowed to be of different sizes, whereas in the strip formula they are required to be all the same size. 

The clarification at the end was added at the time, in response to a reader who complained that the proof was “very hard going”.  The note on practical calculation was added in January 2014.  MB

The area (e.g. Ordnance Survey map) in which the sites lie is assumed to be a bounded convex subset K of E2.  The sites are S1,,Sn where each Si is a closed connected subset of K whose width in any direction is small compared with the dimensions of K.  A set of r3 sites is considered to form an alinement iff there exists a straight line L meeting each site in the set.  Note that L meets Si iff L meets the convex hull Si of Si.  We suppose that for each i the position of Si in K and the orientation of Si in (0,2π) have uniform distributions, and that the 2n distributions are independent.  We derive a formula for the expected number of r-site alinements among the n sites, having length less than a given length q

The mean width di of Si is defined as follows.  Given a line L of azimuth θ let wi(θ) be the width of the projection of Si onto L.  Then

di= 1 wi(θ)dθ.
0
(1)

Equivalently, di is defined by the fact that πdi is the perimeter of Si (easy proof ).  For each i, let Pi be a point inside, and fixed relatively to, Si (say the centroid of Si). 

We first investigate the probability that S1,,Sr form an r-site alinement with end sites S1 and S2.  The required formula will then follow by a simple combinatorial argument. 

First fix P1,P2 so that P1P2=t<q.  We may assume that t is large compared with the dimensions of the Si.  Take (x,y)-axes with origin at P1 so that P2 is at (t,0).  Let Pi be at (xi,yi).  Next fix the orientations of the Si, and let the maximum and minimum y-coordinates of points in Si then be yi+bi±ci.  For 3ir let xi be restricted to some small interval in (0,t); say

tui<xi<t(ui+δui), 0<ui<1.
(2)

Let L be a movable straight line meeting S1 and S2; say L passes through (0,z1) and (t,z2) where

|zibi|<ci (i=1,2).
(3)

For given (z1,z2) the line L meets the other sites iff

|(1ui)z1+uiz2yibi|ci (3ir).
(4)

Let C(z1,z2) in Er−2 be the cuboid of all ( y3,,yr) such that (4) holds.  We want to find the content of the (r2)-dimensional solid U given by

U={C(z1,z2)(3)holds}.
(5)

Let W in Er be the cuboid

W={(z1,,zr)|zibi|ci}

and let V in Er−1 be the image of W under the linear map

g:(z1,,zr)(z1z2,y3,,yr),
yi=(1ui)z1+uiz2zi.
(6)

Let ƒ be the projection from Er−1 to Er−2 given by

ƒ:( y0,y3,,yr)( y3,,yr)
(7)

so that U=ƒ(V).  Since V is convex, each point in the interior of U is the image of exactly 2 points on the boundary of V.  Hence

U= 1 |ƒ(X)|
2
(8)

summing over all faces X of V.  Clearly each face X of V is the image g(Y ) of some face Y of W (not conversely).  Fix i and j with 1i<jr and consider the set of 4 faces Y given by

Y:zi=bi±ci, zj=bj±cj.
(9)

Since coordinates 1 and 2 are distinguished there are 4 cases. 

Case 1: i>2 and j>2

Let h be the linear function on Er−1 given by

h( y0,y3,,yr)=(uiuj)y0+yiyj.
(10)

Then by (6) and (10)

h(g(z1,,zr))=zi+zj
(11)

so that h is constant on g(Y ).  The signs on the RHS of (11) are opposite, hence g(Y ) is a face of V iff the signs in (9) are taken opposite.  If X=g(Y ) is such a face then we easily calculate

|ƒ(X)|=|uiuj|   (2ck).
ki,j
(12)

Case 2: i=2,j>2.  The same argument with

h( y0,y3,,yr)=(1uj)y0yj
(13)

leads to 2 faces X of V such that

|ƒ(X)|=|1uj|   (2ck).
k2,j
(14)

Case 3: i=1,j>2

h( y0,y3,,yr)=ujy0+yj
(15)
|ƒ(X)|=|uj|   (2ck).
k1,j
(16)

Case 4: i=1,j=2

h( y0,y3,,yr)=y0
(17)
|ƒ(X)|=   (2ck).
k1,2
(18)

For convenience define

u1=0,u2=1.
(19)

Then substituting into (8) from (12), (14), (16) and (18) (recall that each describes 2 faces X) we get

|U|=   |uiuj|Cij
1i<jr
(20)

where

Cij=   (2ck).
ki,j
(21)

So if now P3,,Pr vary freely, the probability that S1,,Sr form an alinement with end sites S1,S2 is

|K|2rtr2 1 1 |U|du3…dur
0 0
 =|K|2rtr2 ( C12+ 1   C1j+ 1   C2j+  1   Cij ) .
2 j>2 2 j>2 2 i,j>2
(22)

Since the orientations of the Si are independent and uniform, we can now allow them to vary and replace each in (22) by its mean value

  dk.
ki,j

Now allow the end sites to be any pair of S1,,Sr.  To sum all the terms such as (22) note that the sum of the coefficients of the Cij in (22) is

1+  1 (r2)+  1 (r2)+  1 (r2)(r3)=  1 r(r+1).
2 2 6 6

so that the probability that S1,,Sr are alined is

|K|2rtr2 1 r(r+1)       dk.
6 i<j ki,j
(23)

If any set of r sites can be chosen, we get nCr expressions like (23) in which each product of d’s appears nr+2C2 times.  Finally, let the distance t=P1P2 between the independently and uniformly distributed points P1,P2 of K have p.d.f. p(t).  Then the expected number of r-site alinements of length less than q among the n sites S1,,Sn is given by

1 (nr+2)(nr+1)r(r+1)   di1dir2|K|2r q p(t)tr2dt.
12 i1<<ir2 0

If the orientations of the Si are non-uniform (but still independent) one could try modifying the above argument, using the joint p.d.f. of t,θ where θ is the azimuth of P1P2.  Note that if K is a disc then the formula applies even for non-uniformly distributed orientations.  If K is a square the discrepancy is likely to be small. 

(March 2013) We check that the above formula reduces to the strip formula when the alinements have unlimited length (q=) and the discs are all the same size (di=2c for each i).  In this case

  di1dir2=nCr2(2c)r2, p(t)tr2dt=Mr2,
i1<<ir2 0

so that with the notation A=|K| the result is seen to be the same as for the strip formula. 

Clarification

Diagram to clarify the idea behind the proof

The above diagram, drawn for the case of 4-point alinements, may clarify the idea behind the proof. 

The cuboid C(z1,z2) is a movable rectangle, of which KLMN is a particular position.  As z1,z2 vary, so the lower right corner of the rectangle moves over the parallelogram NPQR.  Thus U, the union of the rectangles, is the octagon formed by the outer edges of the figure. 

By drawing in the solid lines inside U we see that U is the projection of a 3-dimensional solid V with 12 parallelogram faces.  The projections of the faces of V cover U exactly twice. 

By drawing in the dotted lines (which represent lines inside V) we see that V in turn is the projection of a 4-dimensional cuboid W.  Each face of V is the projection of a 2-dimensional face of W.  The 2-dimensional faces of W fall into groups of 4, of which 2 project into faces of V, while 2 project into the interior of V and are excluded from the calculation. 

Practical calculation

Recall that we are given the diameters d1,,dn of the n discs.  For r-point alinements, r=3,4,5,, define m=r2.  The formula above requires the calculation of Sm, the sum of all products of m diameters with distinct indices.  Formally

Sm=   di1dim.
1i1<<imn

In principle this could be done by simply working through all m-element subsets of 1,,n and adding the products.  In practice this approach is likely to take far too much computer time.  A better method is as follows. 

For m=1,2,3, define

Pm=d1m+d2m++dnm.

It is not difficult to show that

 S1=P1
2S2=P1S1P2
3S3=P1S2P2S1+P3
4S4=P1S3P2S2+P3S1P4
5S5=P1S4P2S3+P3S2P4S1+P5

and so on.  These formulae allow S1,S2,S3, to be calculated in succession.