Area estimation
So, how does all of this relate to the square of the sum of lengths divided by the edge count, $L_M^2/e$, and the estimate of a shape’s overall area, $A$? Suppose that the average mosaic piece resembled a square, not an octagon, but still had a perimeter eight times the average edge length $a$. Each side would be $2a$ long and the area would be $4a^2$. The overall area across the graph would therefore be the piece count times $4a^2$. In this limited case, $A$ is just $L_M^2/e$ because there are four edges per $2a \times 2a$ square on average: given 25 squares, the area is $100a^2$; $e = 4 \times 25 = 100$; $L_M = 4 \times 25a = ea = 100a$; and $L_M^2/e = 100a^2 = A$.
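The arithmetic of the square case can be checked with a few lines of Python. This is only an illustrative sketch: the function name and its arguments are hypothetical, and the four-edges-per-piece figure is taken directly from the text rather than derived.

```python
def square_case(n_pieces: int, a: float) -> tuple[float, float]:
    """Return (direct area, L_M^2 / e) for the idealised square case.

    Hypothetical helper: assumes every piece is a square of side 2a and
    that there are four edges of length a per piece, as in the text.
    """
    side = 2 * a                       # perimeter 8a spread over four sides
    direct_area = n_pieces * side**2   # piece count times 4a^2
    e = 4 * n_pieces                   # edge count
    total_length = e * a               # L_M = e * a
    return direct_area, total_length**2 / e


direct, estimate = square_case(25, 1.0)
print(direct, estimate)  # 100.0 100.0
```

With 25 pieces and a = 1 the two figures coincide, which is the sense in which $A = L_M^2/e$ holds in this limited case.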
Because the pieces actually average out to octagons, it might seem that the area of each one would be the area of a regular octagon with side $a$, which is $2(1 + 2^{0.5})a^2 = 4.828a^2$. Thus, we might estimate $A$ as $(4.828/4)\,L_M^2/e = 1.207\,L_M^2/e$. However, for a given perimeter and number of sides, the maximal area of any polygon is reached when it expands in all directions to become regular (because it then most closely approximates a circle). No matter what the construction process, polygons subject to any kind of randomness must be smaller. Thus, the 4.828 figure may be too high.
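The octagon constant can be reproduced from the standard formula for the area of a regular polygon, $n s^2 / (4 \tan(\pi/n))$. The sketch below is illustrative only; the names `regular_polygon_area` and `estimate_area` are hypothetical and do not refer to any function in the package.

```python
import math


def regular_polygon_area(n_sides: int, side: float) -> float:
    """Area of a regular polygon with n_sides sides of the given length."""
    return n_sides * side**2 / (4 * math.tan(math.pi / n_sides))


# Area of a regular octagon with unit sides: 2 * (1 + sqrt(2)) ~ 4.828.
OCTAGON_FACTOR = regular_polygon_area(8, 1.0)

# Ratio to the square case (area 4a^2): ~ 1.207.
AREA_CONSTANT = OCTAGON_FACTOR / 4


def estimate_area(total_length: float, n_edges: int) -> float:
    """Hypothetical helper: A ~ 1.207 * L_M^2 / e."""
    return AREA_CONSTANT * total_length**2 / n_edges


print(round(OCTAGON_FACTOR, 3), round(AREA_CONSTANT, 3))  # 4.828 1.207
```

For example, `estimate_area(100.0, 100)` scales the square-case area of 100 up by the 1.207 factor.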
Nonetheless, simulations provide no evidence that the figure is in fact too high. A good explanation is that the average edge in a mosaic abuts a larger-than-average piece by definition. For example, if half of the mosaic consists of 6-edge pieces and half of 10-edge pieces, the average edge abuts a shape of $(6^2 + 10^2)/16 = 8.5$ edges, not eight. The larger a piece, the more closely it approximates a circle, the shape having the lowest perimeter-to-area ratio: a square with a perimeter of eight has an area of four, whereas a circle with the same perimeter (circumference) $C$ has an area of $C^2/(4\pi) = 5.093$. This effect seems to cancel out the overestimation due to irregularity in polygon shapes, and as a result, throughout the rest of this paper I employ the $4.828/4 = 1.207$ regularisation constant.
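The two numerical examples in this paragraph can be verified directly. The sketch below is illustrative and uses hypothetical function names: it weights each piece by its own edge count, which is what it means for a randomly chosen edge to abut a larger-than-average piece.

```python
import math


def edge_weighted_mean_size(piece_sizes: list[int]) -> float:
    """Mean edge count of the piece abutted by a randomly chosen edge.

    A piece with k edges is abutted by k edges, so it is weighted by k.
    """
    return sum(k * k for k in piece_sizes) / sum(piece_sizes)


def circle_area_from_circumference(c: float) -> float:
    """Area of a circle with circumference c, i.e. c^2 / (4 * pi)."""
    return c**2 / (4 * math.pi)


sizes = [6, 10]                         # half 6-edge, half 10-edge pieces
print(sum(sizes) / len(sizes))          # 8.0, the plain average
print(edge_weighted_mean_size(sizes))   # 8.5, the edge-weighted average
print(round(circle_area_from_circumference(8), 3))  # 5.093
```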
Turning briefly to MSTs, which can be computed using the mosaic package’s tgraph function, each includes about 4/5 as many edges as a mosaic because the edge:point ratio is nearly one in a large MST. However, an MST’s total length should be less than 4/5 of the corresponding mosaic’s length because an MST should avoid many long edges. Perhaps the MST on average simply avoids the longest out of every five edges. It can do this because there are four points for every five edges in the mosaic (see above), and there is a near 1:1 ratio of edges to points in an MST. However, the choice may come down to only two edges because the others cannot be avoided: if the points form a line, the MST must either cross from the left and leave out the last edge or vice versa. The longer segment when a line is subdivided at random comprises 3/4 of the length on average, so the MST’s length should be $(3 + 1/4)/5 = 65\%$ of the mosaic’s. Thus, if $L_T$ is the length of the MST, then instead of $A = 1.207\,L_M^2/e$ we would predict $A = (1.207/0.65^2)\,L_T^2/e = 2.857\,L_T^2/e$. However, in practice, MST-based area estimates are highly problematic because the 0.65 constant seems to vary in simulation according to the shape of the object: for example, it is higher for circles and rings, and actually close to 0.8. Therefore, an MST-based approach is not recommended.
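The 3/4 figure and the resulting constants can be checked numerically. The following is a minimal sketch, assuming that "subdivided at random" means a single break point drawn uniformly along the line. The function name, the trial count, and the seed are all hypothetical choices, and the length ratio is computed exactly as it is written in the text.

```python
import random


def mean_longer_fraction(trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the mean share taken by the longer segment
    when a unit line is cut at a uniformly random point (assumed model)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        u = rng.random()
        total += max(u, 1.0 - u)
    return total / trials


# Ratio of MST length to mosaic length, as given in the text.
LENGTH_RATIO = (3 + 1 / 4) / 5            # 0.65

# Since L_M = L_T / 0.65, the area constant is divided by 0.65 squared.
MST_CONSTANT = 1.207 / LENGTH_RATIO**2    # ~ 2.857

print(round(mean_longer_fraction(), 2))   # close to 0.75
print(round(MST_CONSTANT, 3))             # 2.857
```

Replacing `LENGTH_RATIO` with 0.8, the value reported for circles and rings, gives a constant of about 1.886, which illustrates how sensitive the MST-based estimate is to this ratio.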