Tom Cashman

I am a scientist in the HoloLens research team at Microsoft Cambridge.

I have previously worked on geometric problems in computer graphics and computer vision at the University of Cambridge, the University of Lugano and TranscenData Europe Ltd.

This page gives details on the publications and projects I've worked on. It would be great to hear from you if you have questions or feedback on any of this work.

The Phong Surface: Efficient 3D Model Fitting using Lifted Optimization

Jingjing Shen, Thomas J. Cashman, Qi Ye, Tim Hutton, Toby Sharp, Federica Bogo, Andrew Fitzgibbon, and Jamie Shotton
Proceedings of the European Conference on Computer Vision (ECCV), 2020

Real-time perceptual and interaction capabilities in mixed reality require a range of 3D tracking problems to be solved at low latency on resource-constrained hardware such as head-mounted devices. Indeed, for devices such as HoloLens 2 where the CPU and GPU are left available for applications, multiple tracking subsystems are required to run on a continuous, real-time basis while sharing a single Digital Signal Processor. To solve model-fitting problems for HoloLens 2 hand tracking, where the computational budget is approximately 100 times smaller than that of an iPhone 7, we introduce a new surface model: the 'Phong surface'. Using ideas from computer graphics, the Phong surface describes the same 3D shape as a triangulated mesh model, but with continuous surface normals which enable the use of lifting-based optimization, providing significant efficiency gains over ICP-based methods. We show that Phong surfaces retain the convergence benefits of smoother surface models, while triangle meshes do not.
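As a rough illustration of the 'continuous surface normals' idea, the sketch below blends per-vertex normals across a single triangle using barycentric coordinates, as in Phong shading from computer graphics. It is not the paper's implementation; the function names and the sample triangle are assumptions made for illustration.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def phong_point_and_normal(vertices, vertex_normals, u, v):
    """Evaluate a triangle at barycentric coordinates (u, v, 1 - u - v).

    The position is the usual flat-triangle point. The normal is the
    normalized blend of the three vertex normals, so it varies
    continuously across the triangle instead of being constant per face.
    """
    w = 1.0 - u - v
    weights = (u, v, w)
    point = tuple(
        sum(wt * vert[k] for wt, vert in zip(weights, vertices))
        for k in range(3)
    )
    blended = tuple(
        sum(wt * n[k] for wt, n in zip(weights, vertex_normals))
        for k in range(3)
    )
    return point, normalize(blended)

# Hypothetical triangle in the z = 0 plane with normals that tilt outwards.
tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
normals = tuple(
    normalize(n) for n in ((-1.0, -1.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0))
)

# Evaluate at the centroid of the triangle.
point, normal = phong_point_and_normal(tri, normals, 1.0 / 3.0, 1.0 / 3.0)
```

At a corner the blended normal reduces to that vertex's own normal, so two triangles sharing a vertex or edge agree on the normal there even though their face normals differ.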

@inproceedings{Shen:2020:TPS,
  title   = {The Phong Surface: Efficient 3D Model Fitting using
             Lifted Optimization},
  author  = {Jingjing Shen and Thomas J. Cashman and Qi Ye and
             Tim Hutton and Toby Sharp and Federica Bogo and Andrew
             Fitzgibbon and Jamie Shotton},
  booktitle = {Proceedings of the European Conference on Computer
               Vision (ECCV)},
  year    = {2020}
}

Efficient and precise interactive hand tracking through joint, continuous optimization of pose and correspondences

Jonathan Taylor, Lucas Bordeaux, Thomas Cashman, Bob Corish, Cem Keskin, Eduardo Soto, David Sweeney, Julien Valentin, Benjamin Luff, Arran Topalian, Erroll Wood, Sameh Khamis, Pushmeet Kohli, Toby Sharp, Shahram Izadi, Richard Banks, Andrew Fitzgibbon and Jamie Shotton
ACM Transactions on Graphics 35(4), pp. #143, 1–12, Proc. SIGGRAPH 2016

Fully articulated hand tracking promises to enable fundamentally new interactions with virtual and augmented worlds, but the limited accuracy and efficiency of current systems have prevented widespread adoption. Today's dominant paradigm uses machine learning for initialization and recovery followed by iterative model-fitting optimization to achieve a detailed pose fit. We follow this paradigm, but make several changes to the model-fitting, namely using: (1) a more discriminative objective function; (2) a smooth-surface model that provides gradients for non-linear optimization; and (3) joint optimization over both the model pose and the correspondences between observed data points and the model surface. While each of these changes may actually increase the cost per fitting iteration, we find a compensating decrease in the number of iterations. Further, the wide basin of convergence means that fewer starting points are needed for successful model fitting. Our system runs in real time on the CPU only, which frees up the commonly over-burdened GPU for experience designers. The hand tracker is efficient enough to run on low-power devices such as tablets. We can track up to several meters from the camera to provide a large working volume for interaction, even using the noisy data from current-generation depth cameras. Quantitative assessments on standard datasets show that the new approach exceeds the state of the art in accuracy. Qualitative results take the form of live recordings of a range of interactive experiences enabled by this new approach.
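To make change (3) concrete, here is a toy sketch of joint optimization over pose and correspondences: the model parameters and one correspondence variable per data point are updated together in every step, rather than alternating between closest-point search and fitting. The model is a 2D circle rather than a hand, and plain gradient descent with backtracking stands in for a non-linear least-squares solver. All names and data are assumptions for illustration, not the method used in the paper.

```python
import math

def energy(params, points):
    """Sum of squared distances between each data point and its
    corresponding model point on the circle."""
    cx, cy, r = params[0], params[1], params[2]
    total = 0.0
    for i, (px, py) in enumerate(points):
        t = params[3 + i]  # correspondence: angle on the circle for point i
        dx = cx + r * math.cos(t) - px
        dy = cy + r * math.sin(t) - py
        total += dx * dx + dy * dy
    return total

def gradient(params, points):
    """Analytic gradient of the energy w.r.t. pose AND correspondences."""
    cx, cy, r = params[0], params[1], params[2]
    grad = [0.0] * len(params)
    for i, (px, py) in enumerate(points):
        t = params[3 + i]
        c, s = math.cos(t), math.sin(t)
        dx = cx + r * c - px
        dy = cy + r * s - py
        grad[0] += 2.0 * dx
        grad[1] += 2.0 * dy
        grad[2] += 2.0 * (dx * c + dy * s)
        grad[3 + i] = 2.0 * r * (dy * c - dx * s)
    return grad

def fit_lifted(points, params, iterations=200):
    """Gradient descent over the joint vector [cx, cy, r, t_0, ..., t_(n-1)]."""
    params = list(params)
    current = energy(params, points)
    for _ in range(iterations):
        grad = gradient(params, points)
        step = 0.1
        for _ in range(40):  # backtracking: halve the step until energy drops
            trial = [p - step * g for p, g in zip(params, grad)]
            trial_energy = energy(trial, points)
            if trial_energy < current:
                params, current = trial, trial_energy
                break
            step *= 0.5
    return params, current

# Hypothetical data: 8 points on a circle with centre (1, 2) and radius 3.
data = [(1.0 + 3.0 * math.cos(a), 2.0 + 3.0 * math.sin(a))
        for a in (2.0 * math.pi * k / 8 for k in range(8))]

# Start from a rough pose; initialise each correspondence with the angle of
# the data point as seen from the initial centre.
start = [0.0, 0.0, 1.0] + [math.atan2(py, px) for px, py in data]
initial_energy = energy(start, data)
fitted, final_energy = fit_lifted(data, start)
```

Because the correspondences are ordinary variables of the energy, every accepted step moves the circle and slides the matched points along it at the same time. A practical system would more likely use a second-order solver that exploits the sparsity of the per-point variables.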

@article{Taylor:2016:EPI,
  title   = {Efficient and Precise Interactive Hand Tracking
             through Joint, Continuous Optimization of Pose and
             Correspondences},
  author  = {Jonathan Taylor and Lucas Bordeaux and Thomas Cashman
             and Bob Corish and Cem Keskin and Eduardo Soto and
             David Sweeney and Julien Valentin and Benjamin Luff
             and Arran Topalian and Erroll Wood and Sameh Khamis
             and Pushmeet Kohli and Toby Sharp and Shahram Izadi
             and Richard Banks and Andrew Fitzgibbon and
             Jamie Shotton},
  journal = {ACM Transactions on Graphics},
  year    = {2016},
  volume  = {35},
  number  = {4},
  pages   = {\#143, 1--12}
}

What shape are dolphins? Building 3D morphable models from 2D images

Thomas J. Cashman and Andrew W. Fitzgibbon
IEEE Transactions on Pattern Analysis and Machine Intelligence 35(1), pp. 232–244, 2013

3D morphable models are low-dimensional parametrizations of 3D object classes which provide a powerful means of associating 3D geometry with 2D images. However, morphable models are currently generated from 3D scans, so for general object classes such as animals they are economically and practically infeasible. We show that, given a small amount of user interaction (little more than that required to build a conventional morphable model), there is enough information in a collection of 2D pictures of certain object classes to generate a full 3D morphable model, even in the absence of surface texture. The key restriction is that the object class should not be strongly articulated, and that a very rough rigid model should be provided as an initial estimate of the 'mean shape'.

The model representation is a linear combination of subdivision surfaces, which we fit to image silhouettes and any identifiable key points using a novel combined continuous-discrete optimization strategy. Results are demonstrated on several natural object classes, and show that models of rather high quality can be obtained from this limited information.

Full MATLAB source code is available from CodePlex.

The CodePlex release also contains our data sets for bananas, pigeons, polar bears and (of course) dolphins.

See the included documentation to reproduce our results.

@article{Cashman:2013:WSD,
  author  = {Thomas J. Cashman and Andrew W. Fitzgibbon},
  title   = {What shape are dolphins? Building {3D} morphable
             models from {2D} images},
  journal = {IEEE Transactions on Pattern Analysis and Machine
             Intelligence},
  volume  = 35,
  number  = 1,
  pages   = {232--244},
  year    = 2013
}

A continuous, editable representation for deforming mesh sequences with separate signals for time, pose and shape

Thomas J. Cashman and Kai Hormann
Computer Graphics Forum 31(2), pp. 735–744, 2012

It is increasingly popular to represent non-rigid motion using a deforming mesh sequence: a discrete sequence of frames, each of which is given as a mesh with a common graph structure. Such sequences have the flexibility to represent a wide range of mesh deformations used in practice, but they are also highly redundant, expensive to store, and difficult to edit in a time-coherent manner. We address these limitations with a continuous representation that extracts redundancy in three separate phases, leading to separate editable signals in time, pose and shape. The representation can be applied to any deforming mesh sequence, in contrast to previous domain-specific approaches. By modifying the three signal components, we demonstrate time-coherent editing operations such as local repetition of part of a sequence, frame rate conversion and deformation transfer. We also show that our representation makes it possible to design new deforming sequences simply by sketching a curve in a 2D pose space.

Binaries for Windows (32 bit, 8.2 MB)
We use CHOLMOD for solving sparse linear systems in certain shape spaces. CHOLMOD can be compiled with the graph partitioning software METIS to give better performance, but cannot be distributed in this form under the terms of the GNU GPL. The version of CHOLMOD in this implementation has therefore been compiled without METIS, but you may gain better performance by compiling from source against CHOLMOD linked with METIS and/or a version of the BLAS which has been optimized for your processor.

Source code (97 KB)

Sample 'flying squirrel' mesh sequence from Big Buck Bunny (8.3 MB)

Sample 'flag' mesh sequence (36.5 MB)

@article{Cashman:2012:CER,
  author  = {Thomas J. Cashman and Kai Hormann},
  title   = {A continuous, editable representation for deforming
             mesh sequences with separate signals for time, pose
             and shape},
  journal = {Computer Graphics Forum},
  volume  = 31,
  number  = 2,
  year    = 2012,
  pages   = {735--744},
  note    = {Proceedings of Eurographics}
}

NURBS with extraordinary points: High-degree, non-uniform, rational subdivision schemes

Thomas J. Cashman, Ursula H. Augsdörfer, Neil A. Dodgson and Malcolm A. Sabin
ACM Transactions on Graphics 28(3), pp. #46, 1–9, Proc. SIGGRAPH 2009

We present a subdivision framework that adds extraordinary vertices to NURBS of arbitrarily high degree. The surfaces can represent any odd-degree NURBS patch exactly. Our rules handle non-uniform knot vectors, and are not restricted to midpoint knot insertion. In the absence of multiple knots at extraordinary points, the limit surfaces have bounded curvature.
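For readers new to subdivision, the sketch below shows the classical uniform 'refine and smooth' (Lane–Riesenfeld) step for a closed B-spline curve of a chosen degree. It covers only the simplest curve case: it does not handle non-uniform knots, rational weights, surfaces or extraordinary points, which are what the paper addresses. Function and variable names are assumptions for illustration.

```python
def refine_and_smooth(polygon, degree):
    """One Lane-Riesenfeld subdivision step on a closed control polygon.

    Refine: duplicate every control point.
    Smooth: replace each point by the midpoint of itself and its cyclic
    successor, repeated `degree` times.
    """
    points = [p for p in polygon for _ in range(2)]  # refine
    for _ in range(degree):                          # smooth
        n = len(points)
        points = [
            tuple((a + b) / 2.0 for a, b in zip(points[i], points[(i + 1) % n]))
            for i in range(n)
        ]
    return points

# Hypothetical control polygon: a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Three steps of cubic (degree 3) subdivision.
curve = square
for _ in range(3):
    curve = refine_and_smooth(curve, degree=3)
```

Each step doubles the number of control points, and repeated steps converge towards the uniform B-spline curve of the given degree defined by the original polygon.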

Binaries for Windows (32 bit, 5.4 MB)

Source code (960.4 KB)

Precomputed tables containing bounded curvature solutions (68.6 KB)

@article{Cashman:2009:NEP,
  title   = {{NURBS} with Extraordinary Points:
             High-degree, Non-uniform, Rational Subdivision
             Schemes},
  author  = {Thomas J. Cashman and Ursula H. Augsd{\"o}rfer and
             Neil A. Dodgson and Malcolm A. Sabin},
  journal = {ACM Transactions on Graphics},
  year    = {2009},
  volume  = {28},
  number  = {3},
  pages   = {\#46, 1--9}
}

Full list of publications

The Phong Surface: Efficient 3D Model Fitting using Lifted Optimization
Jingjing Shen, Thomas J. Cashman, Qi Ye, Tim Hutton, Toby Sharp, Federica Bogo, Andrew Fitzgibbon and Jamie Shotton
Proc. European Conference on Computer Vision (ECCV), 2020
See above for an accompanying video and further information.
PDF (4.0 MB). PDF including supplementary material (5.1 MB)

QRkit: Sparse, composable QR decompositions for efficient and stable solutions to problems in computer vision
Jan Svoboda, Thomas Cashman and Andrew Fitzgibbon
Proc. IEEE Winter Conference on Applications of Computer Vision, pp. 1263–1272, 2018
PDF (754 KB). Supplementary material (332 KB). DOI link

An efficient background term for 3D reconstruction and tracking with smooth surface models
Mariano Jaimez, Thomas J. Cashman, Andrew W. Fitzgibbon, Javier Gonzalez-Jimenez and Daniel Cremers
Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2575–2583, 2017
PDF including supplementary material (2.5 MB). DOI link

Efficient and precise interactive hand tracking through joint, continuous optimization of pose and correspondences
Jonathan Taylor, Lucas Bordeaux, Thomas Cashman, Bob Corish, Cem Keskin, Eduardo Soto, David Sweeney, Julien Valentin, Benjamin Luff, Arran Topalian, Erroll Wood, Sameh Khamis, Pushmeet Kohli, Toby Sharp, Shahram Izadi, Richard Banks, Andrew Fitzgibbon and Jamie Shotton
ACM Transactions on Graphics 35(4), pp. #143, 1–12, Proc. SIGGRAPH 2016
See above for an accompanying video and further information.
PDF (7.0 MB). Supplementary material (2.8 MB). DOI link

Fits like a glove: Rapid and reliable hand shape personalization
David Joseph Tan, Thomas Cashman, Jonathan Taylor, Andrew Fitzgibbon, Daniel Tarlow, Sameh Khamis, Shahram Izadi and Jamie Shotton
Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 5610–5619, 2016
PDF (2.3 MB). Supplementary material (190 KB)

Watertight conversion of trimmed CAD surfaces to Clough–Tocher splines
Jiří Kosinka and Thomas J. Cashman
Computer Aided Geometric Design 37, pp. 25–41, 2015
PDF (3.0 MB). DOI link

Efficient interpolation of articulated shapes using mixed shape spaces
Stefano Marras, Thomas J. Cashman and Kai Hormann
Computer Graphics Forum 32(8), pp. 258–270, 2013
PDF (2.7 MB). DOI link

A smoothness criterion for monotonicity-preserving subdivision
Michael S. Floater, Carolina V. Beccari, Thomas J. Cashman and Lucia Romani
Advances in Computational Mathematics 39(1), pp. 193–204, 2013
PDF (145 KB). DOI link

Generalized Lane–Riesenfeld algorithms
Thomas J. Cashman, Kai Hormann and Ulrich Reif
Computer Aided Geometric Design 30(4), pp. 398–409, 2013
PDF (1.1 MB). DOI link

What shape are dolphins? Building 3D morphable models from 2D images
Thomas J. Cashman and Andrew W. Fitzgibbon
IEEE Transactions on Pattern Analysis and Machine Intelligence 35(1), pp. 232–244, 2013
See above for an accompanying video and CodePlex for a sample implementation.
PDF including supplementary material (4.6 MB). DOI link

A mixed shape space for fast interpolation of articulated shapes
Stefano Marras, Thomas J. Cashman and Kai Hormann
Proceedings of Vision, Modeling, and Visualization 2012, pp. 159–166, 2012
PDF (8.1 MB). DOI link

A continuous, editable representation for deforming mesh sequences with separate signals for time, pose and shape
Thomas J. Cashman and Kai Hormann
Computer Graphics Forum 31(2), pp. 735–744, 2012
See above for a sample implementation, video, and other auxiliary materials.
PDF (1.2 MB). DOI link

Beyond Catmull–Clark? A survey of advances in subdivision surface methods
Thomas J. Cashman
Computer Graphics Forum 31(1), pp. 42–61, 2012
PDF (2.0 MB). DOI link

NURBS-compatible subdivision surfaces
Thomas J. Cashman
PhD thesis, 2010
PDF (5.8 MB)

Numerical checking of C¹ for arbitrary degree quadrilateral subdivision schemes
Ursula H. Augsdörfer, Thomas J. Cashman, Neil A. Dodgson and Malcolm A. Sabin
E. Hancock, R. Martin, M. Sabin (Eds.): Mathematics of Surfaces 2009, LNCS 5654, pp. 45–54, 2009
PDF (375 KB). DOI link

Deriving box-spline subdivision schemes
Neil A. Dodgson, Ursula H. Augsdörfer, Thomas J. Cashman and Malcolm A. Sabin
E. Hancock, R. Martin, M. Sabin (Eds.): Mathematics of Surfaces 2009, LNCS 5654, pp. 106–123, 2009
DOI link

NURBS with extraordinary points: High-degree, non-uniform, rational subdivision schemes
Thomas J. Cashman, Ursula H. Augsdörfer, Neil A. Dodgson and Malcolm A. Sabin
ACM Transactions on Graphics 28(3), pp. #46, 1–9, Proc. SIGGRAPH 2009
See above for a sample implementation, video, and other auxiliary materials.
PDF (2.6 MB). DOI link

Selective knot insertion for symmetric, non-uniform refine and smooth B-spline subdivision
Thomas J. Cashman, Neil A. Dodgson and Malcolm A. Sabin
Computer Aided Geometric Design 26(4), pp. 472–479, 2009
PDF (151 KB). DOI link

A symmetric, non-uniform, refine and smooth subdivision algorithm for general degree B-splines
Thomas J. Cashman, Neil A. Dodgson and Malcolm A. Sabin
Computer Aided Geometric Design 26(1), pp. 94–104, 2009
PDF (378 KB). DOI link

Non-uniform B-spline subdivision using refine and smooth
Thomas J. Cashman, Neil A. Dodgson and Malcolm A. Sabin
R. Martin, M. Sabin, J. Winkler (Eds.): Mathematics of Surfaces 2007, LNCS 4647, pp. 121–137, 2007
PDF (637 KB). DOI link

Bounded curvature subdivision without eigenanalysis
Malcolm A. Sabin, Thomas J. Cashman, Ursula H. Augsdörfer and Neil A. Dodgson
R. Martin, M. Sabin, J. Winkler (Eds.): Mathematics of Surfaces 2007, LNCS 4647, pp. 391–411, 2007
DOI link