Generalization of strip formula to discs of different sizes
These notes, dated 30 June 1985, provide a generalization of the
strip method
for estimating the number of chance alignments in a random set of sites.
The sites are supposed to be discs, and a set of sites is
considered to be aligned if a straight line can be drawn that intersects all the discs.
The discs are now allowed to be of different sizes,
whereas in the strip formula they are required to be all the same size.
The clarification at the end was added at the time,
in response to a reader who complained that the proof was “very hard going”.
The note on practical calculation was added in January 2014.
MB
The area (e.g. Ordnance Survey map) in which the sites lie is assumed to be
a bounded convex subset K of E^{2}. The sites are
S_{1},…,S_{n} where each S_{i} is a closed connected subset of K whose width in
any direction is small compared with the dimensions of K. A set of r≥3
sites is considered to form an alinement iff there exists a straight line L
meeting each site in the set. Note that L meets S_{i} iff L meets the convex hull
S_{i} of S_{i}.
We suppose that for each i the position of S_{i}
in K and the orientation of S_{i} in (0,2π) have uniform distributions,
and that the 2n distributions are independent. We derive a formula for the
expected number of rsite alinements among the n sites, having length less
than a given length q.
The mean width d_{i} of S_{i} is defined as follows. Given a line L of azimuth
θ let w_{i}(θ) be the width of the projection of S_{i} onto L.
Then
d_{i}= 
1 
∫ 
^{2π} 
w_{i}(θ)dθ. 
2π 
_{0} 

(1) 
Equivalently, d_{i} is defined by the fact that πd_{i} is the perimeter of
S_{i} (easy proof^{ }).
For each i, let P_{i} be a point inside, and fixed
relatively to, S_{i}
(say the centroid of S_{i}).
We first investigate the probability that S_{1},…,S_{r} form an rsite
alinement with end sites S_{1} and S_{2}. The required formula will then follow
by a simple combinatorial argument.
First fix P_{1},P_{2} so that P_{1}P_{2}=t<q. We may assume that t is large
compared with the dimensions of the S_{i}. Take (x,y)axes with origin at
P_{1} so that P_{2} is at (t,0).
Let P_{i} be at (x_{i},y_{i}). Next fix the orientations of the S_{i}, and let
the maximum and minimum ycoordinates of points in
S_{i} then be y_{i}+b_{i}±c_{i}. For 3≤i≤r let x_{i} be restricted to
some small interval in (0,t); say
tu_{i}<x_{i}<t(u_{i}+δu_{i}), 0<u_{i}<1. 

(2) 
Let L be a movable straight line meeting S_{1} and S_{2}; say L passes
through (0,z_{1}) and
(t,z_{2}) where
z_{i}−b_{i}<c_{i} (i=1,2). 

(3) 
For given (z_{1},z_{2}) the line L meets the other sites iff
(1−u_{i})z_{1}+u_{i}z_{2}−y_{i}−b_{i}≤c_{i} (3≤i≤r). 

(4) 
Let C(z_{1},z_{2}) in E^{r−2} be the cuboid of all ( y_{3},…,y_{r}) such that
(4) holds. We want to find the content of the (r−2)dimensional solid U
given by
U=  ∪  {C(z_{1},z_{2})│(3)holds}. 

(5) 
Let W in E^{r} be the cuboid
W={(z_{1},…,z_{r})│z_{i}−b_{i}≤c_{i}} 

and let V in E^{r−1} be the image of W under the linear map
g:(z_{1},…,z_{r})→(z_{1}−z_{2},y_{3},…,y_{r}), 

y_{i}=(1−u_{i})z_{1}+u_{i}z_{2}−z_{i}. 

(6) 
Let ƒ be the projection from E^{r−1} to E^{r−2} given by
ƒ:( y_{0},y_{3},…,y_{r})→( y_{3},…,y_{r}) 

(7) 
so that U=ƒ(V). Since V is convex, each point in the interior of U is
the image of exactly 2 points on the boundary of V. Hence
summing over all faces X of V. Clearly each face X of V is the image
g(Y ) of some face Y of W (not conversely). Fix i and j with
1≤i<j≤r and consider the set of 4 faces Y given by
Y:z_{i}=b_{i}±c_{i}, z_{j}=b_{j}±c_{j}. 

(9) 
Since coordinates 1 and 2 are distinguished there are 4 cases.
Case 1: i>2 and j>2.
Let h be the linear function on E^{r−1} given by
h( y_{0},y_{3},…,y_{r})=(u_{i}−u_{j})y_{0}+y_{i}−y_{j} . 

(10) 
Then by (6) and (10)
h(g(z_{1},…,z_{r}))=−z_{i}+z_{j} 

(11) 
so that h is constant on g(Y ). The signs on the RHS of (11) are opposite,
hence g(Y ) is a face of V iff the signs in (9) are taken opposite.
If X=g(Y ) is such a face then we easily calculate
ƒ(X)=u_{i}−u_{j} 
∏ 
^{ } 
(2c_{k}). 
_{k}_{≠i,j} 

(12) 
Case 2: i=2,j>2.
The same argument with
h( y_{0},y_{3},…,y_{r})=(1−u_{j})y_{0}−y_{j} 

(13) 
leads to 2 faces X of V such that
ƒ(X)=1−u_{j} 
∏ 
^{ } 
(2c_{k}). 
_{k}_{≠2,j} 

(14) 
Case 3: i=1,j>2.
h( y_{0},y_{3},…,y_{r})=u_{j}y_{0}+y_{j} 

(15) 
ƒ(X)=u_{j} 
∏ 
^{ } 
(2c_{k}). 
_{k}_{≠1,j} 

(16) 
Case 4: i=1,j=2.
h( y_{0},y_{3},…,y_{r})=y_{0} 

(17) 
ƒ(X)= 
∏ 
^{ } 
(2c_{k}). 
_{k}_{≠1,2} 

(18) 
For convenience define
Then substituting into (8) from (12), (14), (16) and (18) (recall that each
describes 2 faces X) we get
U= 
∑ 
^{ } 
u_{i}−u_{j}C_{ij} 
_{1≤i<j≤r} 

(20) 
where
C_{ij}= 
∏ 
^{ } 
(2c_{k}). 
_{k}_{≠i,j} 

(21) 
So if now P_{3},…,P_{r} vary freely, the probability that S_{1},…,S_{r}
form an alinement with end sites S_{1},S_{2} is
K^{2−r}t^{r}^{−2} 
∫ 
^{1} 
… 
∫ 
^{1} 
Udu_{3}…du_{r} 
_{0} 
_{0} 

=K^{2−r}t^{r}^{−2} 
( 
C_{1}_{2}+ 
1 
∑ 
^{ } 
C_{1j}+ 
1 
∑ 
^{ } 
C_{2j}+ 
1 
∑ 
^{ } 
C_{ij} 
) 
. 
2 
_{j}_{>2} 
2 
_{j}_{>2} 
2 
_{i}_{,j>2} 

(22) 
Since the orientations of the S_{i} are independent and uniform, we can now
allow them to vary and replace each in (22) by its mean value
∏ 
^{ } 
d_{k} . 
_{k}_{≠i,j} 

Now allow the end sites to be any pair of S_{1},…,S_{r}.
To sum all the terms such as (22) note that the sum of the coefficients of the
C_{ij} in (22) is
1+ 
1 
(r−2)+ 
1 
(r−2)+ 
1 
(r−2)(r−3)= 
1 
r(r+1). 
2 
2 
6 
6 

so that the probability that S_{1},…,S_{r} are alined is
K^{2−r}t^{r}^{−2}⋅ 
1 
r(r+1) 
∑ 
^{ } 
∏ 
^{ } 

d_{k} . 
6 
_{i}_{<j} 
_{k}_{≠i,j} 

(23) 
If any set of r sites can be chosen, we get ^{n}C_{r} expressions like (23) in
which each product of d’s appears ^{n}^{−r+2}C_{2} times. Finally, let the
distance t=P_{1}P_{2} between the independently and uniformly distributed points
P_{1},P_{2} of K have p.d.f. p(t). Then the expected number of rsite
alinements of length less than q among the n sites S_{1},…,S_{n} is
given by
1 
(n−r+2)(n−r+1)r(r+1) 
∑ 
^{ } 
d_{i}_{1}…d_{ir}_{−2}K^{2−r} 
∫ 
^{q} 
p(t)t^{r}^{−2}dt. 
12 
_{i}_{1<…<ir−2} 
_{0} 

If the orientations of the S_{i} are nonuniform (but still independent) one
could try modifying the above argument, using the joint p.d.f. of t,θ
where θ is the azimuth of P_{1}P_{2}. Note that if K is a disc then the
formula applies even for nonuniformly distributed orientations. If K is a
square the discrepancy is likely to be small.
(March 2013) We check that the above formula reduces to the
strip formula
when the alinements have unlimited length (q=∞) and the discs are all the same size
(d_{i}=2c for each i). In this case
∑ 
^{ } 
d_{i}_{1}…d_{ir}_{−2}=^{n}C_{r}_{−2}(2c)^{r}^{−2}, 
∫ 
^{∞} 
p(t)t^{r}^{−2}dt=M_{r}_{−2} , 
_{i}_{1<…<ir−2} 
_{0} 

so that with the notation A=K the result is seen to be the same as for the strip formula.
Clarification
The above diagram, drawn for the case of 4point alinements, may clarify the
idea behind the proof.
The cuboid C(z_{1},z_{2}) is a movable rectangle, of which KLMN is a particular
position. As z_{1},z_{2} vary, so the lower right corner of the rectangle moves
over the parallelogram NPQR. Thus U, the union of the rectangles, is the
octagon formed by the outer edges of the figure.
By drawing in the solid lines inside U we see that U is the projection of a
3dimensional solid V with 12 parallelogram faces. The projections of the
faces of V cover U exactly twice.
By drawing in the dotted lines (which represent lines inside V) we see that
V in turn is the projection of
a 4dimensional cuboid W. Each face of V is the projection of a
2dimensional face of W. The 2dimensional faces of W fall into groups of 4,
of which 2 project into faces of V, while 2 project into the interior of V
and are excluded from the calculation.
Practical calculation
Recall that we are given the diameters d_{1},…,d_{n} of the n discs.
For rpoint alinements, r=3,4,5,…, define m=r−2.
The formula above requires the calculation of S_{m},
the sum of all products of m diameters with distinct indices. Formally
S_{m}= 
∑ 
^{ } 
d_{i}_{1}…d_{im}. 
_{1≤i1<…<im≤n} 

In principle this could be done by simply working through all melement subsets of 1,…,n and adding the products.
In practice this approach is likely to take far too much computer time. A better method is as follows.
For m=1,2,3,… define
P_{m}=d_{1}^{m}+d_{2}^{m}+⋯+d_{n}^{m}. 

It is not difficult to show that
S_{1}=P_{1} 
2S_{2}=P_{1}S_{1}−P_{2} 
3S_{3}=P_{1}S_{2}−P_{2}S_{1}+P_{3} 
4S_{4}=P_{1}S_{3}−P_{2}S_{2}+P_{3}S_{1}−P_{4} 
5S_{5}=P_{1}S_{4}−P_{2}S_{3}+P_{3}S_{2}−P_{4}S_{1}+P_{5} 

and so on. These formulae allow S_{1},S_{2},S_{3},… to be calculated in succession.