Tom Cashman

I am a scientist in the HoloLens research team at Microsoft Cambridge.

I have previously worked on geometric problems in computer graphics and computer vision at the University of Cambridge, the University of Lugano and TranscenData Europe Ltd.

This page gives details on the publications and projects I've worked on. It would be great to hear from you if you have questions or feedback on any of this work.

The Phong Surface: Efficient 3D Model Fitting using Lifted Optimization

Jingjing Shen, Thomas J. Cashman, Qi Ye, Tim Hutton, Toby Sharp, Federica Bogo, Andrew Fitzgibbon, and Jamie Shotton
Proceedings of the European Conference on Computer Vision (ECCV), 2020

Realtime perceptual and interaction capabilities in mixed reality require a range of 3D tracking problems to be solved at low latency on resource-constrained hardware such as head-mounted devices. Indeed, for devices such as HoloLens 2 where the CPU and GPU are left available for applications, multiple tracking subsystems are required to run on a continuous, real-time basis while sharing a single Digital Signal Processor. To solve model-fitting problems for HoloLens 2 hand tracking, where the computational budget is approximately 100 times smaller than an iPhone 7, we introduce a new surface model: the 'Phong surface'. Using ideas from computer graphics, the Phong surface describes the same 3D shape as a triangulated mesh model, but with continuous surface normals which enable the use of lifting-based optimization, providing significant efficiency gains over ICP-based methods. We show that Phong surfaces retain the convergence benefits of smoother surface models, while triangle meshes do not.
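As a rough illustration of the idea (not the code from the paper), the sketch below evaluates a point on a flat triangle and pairs it with a normal blended from per-vertex normals, as in Phong shading. The function name `phong_surface_point` and the tuple-based data layout are assumptions made for this example; the paper's exact formulation may differ.

```python
import math

def phong_surface_point(vertices, normals, bary):
    """Evaluate one triangle at barycentric coordinates (u, v, w).

    vertices, normals: three 3-tuples each (per-vertex data).
    Returns (position, unit_normal). The position lies on the flat
    triangle; the normal is the normalised blend of the vertex normals,
    so it varies continuously across shared edges of a mesh.
    """
    u, v, w = bary
    # Linear interpolation of the three vertex positions.
    position = tuple(u * a + v * b + w * c for a, b, c in zip(*vertices))
    # Linear interpolation of the three vertex normals, then renormalise.
    blended = tuple(u * a + v * b + w * c for a, b, c in zip(*normals))
    length = math.sqrt(sum(x * x for x in blended))
    if length == 0.0:
        raise ValueError("interpolated normal has zero length")
    return position, tuple(x / length for x in blended)

# Hypothetical triangle with a different normal at each vertex.
triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
vertex_normals = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
position, normal = phong_surface_point(triangle, vertex_normals, (0.5, 0.5, 0.0))
```

Because the normal is a smooth function of the barycentric coordinates, it has well-defined derivatives inside each triangle, which is what a gradient-based fitting method needs.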

@inproceedings{Shen:2020:TPS,
  title   = {The Phong Surface: Efficient 3D Model Fitting using
             Lifted Optimization},
  author  = {Jingjing Shen and Thomas J. Cashman and Qi Ye and
             Tim Hutton and Toby Sharp and Federica Bogo and Andrew
             Fitzgibbon and Jamie Shotton},
  booktitle = {Proceedings of the European Conference on Computer
               Vision (ECCV)},
  year    = {2020}
}

Efficient and precise interactive hand tracking through joint, continuous optimization of pose and correspondences

Jonathan Taylor, Lucas Bordeaux, Thomas Cashman, Bob Corish, Cem Keskin, Eduardo Soto, David Sweeney, Julien Valentin, Benjamin Luff, Arran Topalian, Erroll Wood, Sameh Khamis, Pushmeet Kohli, Toby Sharp, Shahram Izadi, Richard Banks, Andrew Fitzgibbon and Jamie Shotton
ACM Transactions on Graphics 35(4), pp. #143, 1–12, Proc. SIGGRAPH 2016

Fully articulated hand tracking promises to enable fundamentally new interactions with virtual and augmented worlds, but the limited accuracy and efficiency of current systems has prevented widespread adoption. Today's dominant paradigm uses machine learning for initialization and recovery followed by iterative model-fitting optimization to achieve a detailed pose fit. We follow this paradigm, but make several changes to the model-fitting, namely using: (1) a more discriminative objective function; (2) a smooth-surface model that provides gradients for non-linear optimization; and (3) joint optimization over both the model pose and the correspondences between observed data points and the model surface. While each of these changes may actually increase the cost per fitting iteration, we find a compensating decrease in the number of iterations. Further, the wide basin of convergence means that fewer starting points are needed for successful model fitting. Our system runs in real-time on CPU only, which frees up the commonly over-burdened GPU for experience designers. The hand tracker is efficient enough to run on low-power devices such as tablets. We can track up to several meters from the camera to provide a large working volume for interaction, even using the noisy data from current-generation depth cameras. Quantitative assessments on standard datasets show that the new approach exceeds the state of the art in accuracy. Qualitative results take the form of live recordings of a range of interactive experiences enabled by this new approach.
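To give a feel for change (3), here is a toy sketch of joint optimization over pose and correspondences. It is a deliberate simplification: the "model" is a unit circle, the "pose" is just a 2D translation, each correspondence is an angle on the circle, and plain gradient descent stands in for the non-linear solver used in the paper. The function name `fit_circle_lifted`, the step size and the iteration count are all assumptions chosen for this example.

```python
import math

def fit_circle_lifted(points, t0, steps=2000, rate=0.05):
    """Fit the translation of a unit circle to 2D points.

    The energy is sum_i |t + (cos th_i, sin th_i) - x_i|^2, minimised
    jointly over the translation t and one angle th_i per data point.
    Returns (t, thetas, energy).
    """
    tx, ty = t0
    # Start each correspondence at the closest model point for the initial pose.
    thetas = [math.atan2(y - ty, x - tx) for x, y in points]
    for _ in range(steps):
        gx = gy = 0.0
        new_thetas = []
        for (x, y), th in zip(points, thetas):
            # Residual between the model point and the data point.
            rx = tx + math.cos(th) - x
            ry = ty + math.sin(th) - y
            # Gradient with respect to the pose accumulates over all points.
            gx += 2.0 * rx
            gy += 2.0 * ry
            # Gradient with respect to this point's own correspondence.
            g_th = 2.0 * (-rx * math.sin(th) + ry * math.cos(th))
            new_thetas.append(th - rate * g_th)
        # Pose and correspondences are updated in the same step.
        tx -= rate * gx
        ty -= rate * gy
        thetas = new_thetas
    energy = sum((tx + math.cos(th) - x) ** 2 + (ty + math.sin(th) - y) ** 2
                 for (x, y), th in zip(points, thetas))
    return (tx, ty), thetas, energy

# Synthetic data: eight noise-free samples of a unit circle centred at (2, -1).
data = [(2.0 + math.cos(k * math.pi / 4), -1.0 + math.sin(k * math.pi / 4))
        for k in range(8)]
translation, angles, energy = fit_circle_lifted(data, t0=(1.5, -0.5))
```

The contrast is with ICP-style alternation, which would recompute every closest point from scratch after each pose update; here the correspondences are ordinary unknowns that move together with the pose.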

@article{Taylor:2016:EPI,
  title   = {Efficient and Precise Interactive Hand Tracking
             through Joint, Continuous Optimization of Pose and
             Correspondences},
  author  = {Jonathan Taylor and Lucas Bordeaux and Thomas Cashman
             and Bob Corish and Cem Keskin and Eduardo Soto and
             David Sweeney and Julien Valentin and Benjamin Luff
             and Arran Topalian and Erroll Wood and Sameh Khamis
             and Pushmeet Kohli and Toby Sharp and Shahram Izadi
             and Richard Banks and Andrew Fitzgibbon and
             Jamie Shotton},
  journal = {ACM Transactions on Graphics},
  year    = {2016},
  volume  = {35},
  number  = {4},
  pages   = {\#143, 1--12}
}

What shape are dolphins? Building 3D morphable models from 2D images

Thomas J. Cashman and Andrew W. Fitzgibbon
IEEE Transactions on Pattern Analysis and Machine Intelligence 35(1), pp. 232–244, 2013

3D morphable models are low-dimensional parametrizations of 3D object classes which provide a powerful means of associating 3D geometry to 2D images. However, morphable models are currently generated from 3D scans, so for general object classes such as animals they are economically and practically infeasible. We show that, given a small amount of user interaction (little more than that required to build a conventional morphable model), there is enough information in a collection of 2D pictures of certain object classes to generate a full 3D morphable model, even in the absence of surface texture. The key restriction is that the object class should not be strongly articulated, and that a very rough rigid model should be provided as an initial estimate of the 'mean shape'.

The model representation is a linear combination of subdivision surfaces, which we fit to image silhouettes and any identifiable key points using a novel combined continuous-discrete optimization strategy. Results are demonstrated on several natural object classes, and show that models of rather high quality can be obtained from this limited information.
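The linear-combination idea can be sketched in a few lines. Subdivision is linear in the control vertices, so blending control meshes blends the limit surfaces too. The snippet below only does the blending step, on plain vertex lists; the name `morphable_shape` and the data layout are assumptions for illustration and are not taken from the MATLAB release.

```python
def morphable_shape(mean, basis, coeffs):
    """Return mean + sum_k coeffs[k] * basis[k], vertex by vertex.

    mean:   list of control vertices, each an (x, y, z) tuple.
    basis:  list of deformation modes, each with the same layout as mean.
    coeffs: one scalar weight per mode.
    """
    shape = [list(vertex) for vertex in mean]
    for alpha, mode in zip(coeffs, basis):
        for vertex, offset in zip(shape, mode):
            for axis in range(len(vertex)):
                vertex[axis] += alpha * offset[axis]
    return [tuple(vertex) for vertex in shape]

# Hypothetical two-vertex "mesh" with two deformation modes.
mean = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
basis = [
    [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)],  # mode 0: move vertices apart in z
    [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)],   # mode 1: shift both vertices in x
]
instance = morphable_shape(mean, basis, [2.0, 0.5])
```

Fitting then amounts to choosing the coefficients (together with the basis and a pose per image) so that the projected surface agrees with the silhouettes and key points.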

Full MATLAB source code is available from CodePlex.

The CodePlex release also contains our data sets for bananas, pigeons, polar bears and (of course) dolphins.

See the included documentation to reproduce our results.

@article{Cashman:2013:WSD,
  author  = {Thomas J. Cashman and Andrew W. Fitzgibbon},
  title   = {What shape are dolphins? Building {3D} morphable
             models from {2D} images},
  journal = {IEEE Transactions on Pattern Analysis and Machine
             Intelligence},
  volume  = 35,
  number  = 1,
  pages   = {232--244},
  year    = 2013
}

A continuous, editable representation for deforming mesh sequences with separate signals for time, pose and shape

Thomas J. Cashman and Kai Hormann
Computer Graphics Forum 31(2), pp. 735–744, 2012

It is increasingly popular to represent non-rigid motion using a deforming mesh sequence: a discrete sequence of frames, each of which is given as a mesh with a common graph structure. Such sequences have the flexibility to represent a wide range of mesh deformations used in practice, but they are also highly redundant, expensive to store, and difficult to edit in a time-coherent manner. We address these limitations with a continuous representation that extracts redundancy in three separate phases, leading to separate editable signals in time, pose and shape. The representation can be applied to any deforming mesh sequence, in contrast to previous domain-specific approaches. By modifying the three signal components, we demonstrate time-coherent editing operations such as local repetition of part of a sequence, frame rate conversion and deformation transfer. We also show that our representation makes it possible to design new deforming sequences simply by sketching a curve in a 2D pose space.
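For comparison, the most naive form of frame rate conversion simply interpolates vertex positions linearly between neighbouring frames. The sketch below shows that baseline only; it is not the representation from the paper, which resamples a separate continuous time signal instead. The function name `resample_sequence` is an assumption for this example.

```python
def resample_sequence(frames, new_count):
    """Resample a deforming mesh sequence to new_count frames.

    frames: list of frames; each frame is a list of (x, y, z) vertices,
            and all frames share the same vertex ordering.
    Vertex positions are linearly interpolated between the two nearest
    input frames; the first and last frames are reproduced exactly.
    """
    if len(frames) < 2 or new_count < 2:
        raise ValueError("need at least two input and two output frames")
    last = len(frames) - 1
    resampled = []
    for k in range(new_count):
        t = k * last / (new_count - 1)   # position on the input time axis
        i = min(int(t), last - 1)        # index of the earlier input frame
        w = t - i                        # blend weight toward frame i + 1
        resampled.append([
            tuple((1.0 - w) * a + w * b for a, b in zip(va, vb))
            for va, vb in zip(frames[i], frames[i + 1])
        ])
    return resampled

# Hypothetical one-vertex sequence moving along x, resampled from 3 to 5 frames.
sequence = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)], [(4.0, 0.0, 0.0)]]
doubled = resample_sequence(sequence, 5)
```

Interpolating raw vertex positions like this shortens rotating parts between frames, which is one reason to separate the pose and shape signals before resampling.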

Binaries for Windows (32-bit, 8.2 MB)
We use CHOLMOD to solve sparse linear systems in certain shape spaces. CHOLMOD can be compiled with the graph partitioning software METIS for better performance, but cannot be distributed in that form under the terms of the GNU GPL. The version of CHOLMOD in these binaries has therefore been compiled without METIS. You may get better performance by compiling from source against CHOLMOD linked with METIS, and/or against a version of the BLAS that has been optimized for your processor.

Source code (97 KB)

Sample 'flying squirrel' mesh sequence from Big Buck Bunny (8.3 MB)

Sample 'flag' mesh sequence (36.5 MB)

@article{Cashman:2012:CER,
  author  = {Thomas J. Cashman and Kai Hormann},
  title   = {A continuous, editable representation for deforming
             mesh sequences with separate signals for time, pose
             and shape},
  journal = {Computer Graphics Forum},
  volume  = 31,
  number  = 2,
  year    = 2012,
  pages   = {735--744},
  note    = {Proceedings of Eurographics}
}

NURBS with extraordinary points: High-degree, non-uniform, rational subdivision schemes

Thomas J. Cashman, Ursula H. Augsdörfer, Neil A. Dodgson and Malcolm A. Sabin
ACM Transactions on Graphics 28(3), pp. #46, 1–9, Proc. SIGGRAPH 2009

We present a subdivision framework that adds extraordinary vertices to NURBS of arbitrarily high degree. The surfaces can represent any odd degree NURBS patch exactly. Our rules handle non-uniform knot vectors, and are not restricted to midpoint knot insertion. In the absence of multiple knots at extraordinary points, the limit surfaces have bounded curvature.
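For context, the uniform special case that these rules generalise is the classical Lane–Riesenfeld algorithm: duplicate every control point, then average neighbours once per degree. The sketch below applies one such step to a closed control polygon. It handles neither non-uniform knots, nor rational weights, nor extraordinary vertices, so it should be read as background rather than as the scheme from the paper; the name `lane_riesenfeld_step` is an assumption for this example.

```python
def lane_riesenfeld_step(points, degree):
    """One uniform refine-and-smooth step on a closed control polygon.

    points: list of coordinate tuples, treated cyclically.
    degree: B-spline degree; degree 2 reproduces Chaikin's corner
            cutting and degree 3 gives cubic B-spline subdivision.
    Returns a polygon with twice as many control points.
    """
    # Refine: duplicate every control point.
    refined = [p for p in points for _ in range(2)]
    # Smooth: replace each point by the midpoint of it and its successor,
    # repeated once per degree.
    for _ in range(degree):
        n = len(refined)
        refined = [
            tuple((a + b) / 2 for a, b in zip(refined[i], refined[(i + 1) % n]))
            for i in range(n)
        ]
    return refined

# A unit square as the control polygon; one quadratic step cuts each corner.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
chaikin = lane_riesenfeld_step(square, 2)
```

Every smoothing pass here is a midpoint average, which corresponds to inserting knots at the midpoints of uniform intervals.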

@article{Cashman:2009:NEP,
  title   = {{NURBS} with Extraordinary Points:
             High-degree, Non-uniform, Rational Subdivision
             Schemes},
  author  = {Thomas J. Cashman and Ursula H. Augsd{\"o}rfer and
             Neil A. Dodgson and Malcolm A. Sabin},
  journal = {ACM Transactions on Graphics},
  year    = {2009},
  volume  = {28},
  number  = {3},
  pages   = {\#46, 1--9}
}

Full list of publications

  • The Phong Surface: Efficient 3D Model Fitting using Lifted Optimization
    Jingjing Shen, Thomas J. Cashman, Qi Ye, Tim Hutton, Toby Sharp, Federica Bogo, Andrew Fitzgibbon and Jamie Shotton
    Proc. European Conference on Computer Vision (ECCV), 2020
    See above for an accompanying video and further information.
    PDF (4.0 MB). PDF including supplementary material (5.1 MB)
  • QRkit: Sparse, composable QR decompositions for efficient and stable solutions to problems in computer vision
    Jan Svoboda, Thomas Cashman and Andrew Fitzgibbon
    Proc. IEEE Winter Conference on Applications of Computer Vision, pp. 1263–1272, 2018
    PDF (754 KB). Supplementary material (332 KB). DOI link
  • An efficient background term for 3D reconstruction and tracking with smooth surface models
    Mariano Jaimez, Thomas J. Cashman, Andrew W. Fitzgibbon, Javier Gonzalez-Jimenez and Daniel Cremers
    Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2575–2583, 2017
    PDF including supplementary material (2.5 MB). DOI link
  • Efficient and precise interactive hand tracking through joint, continuous optimization of pose and correspondences
    Jonathan Taylor, Lucas Bordeaux, Thomas Cashman, Bob Corish, Cem Keskin, Eduardo Soto, David Sweeney, Julien Valentin, Benjamin Luff, Arran Topalian, Erroll Wood, Sameh Khamis, Pushmeet Kohli, Toby Sharp, Shahram Izadi, Richard Banks, Andrew Fitzgibbon and Jamie Shotton
    ACM Transactions on Graphics 35(4), pp. #143, 1–12, Proc. SIGGRAPH 2016
    See above for an accompanying video and further information.
    PDF (7.0 MB). Supplementary material (2.8 MB). DOI link
  • Fits like a glove: Rapid and reliable hand shape personalization
    David Joseph Tan, Thomas Cashman, Jonathan Taylor, Andrew Fitzgibbon, Daniel Tarlow, Sameh Khamis, Shahram Izadi and Jamie Shotton
    Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 5610–5619, 2016
    PDF (2.3 MB). Supplementary material (190 KB)
  • Watertight conversion of trimmed CAD surfaces to Clough–Tocher splines
    Jiří Kosinka and Thomas J. Cashman
    Computer Aided Geometric Design 37, pp. 25–41, 2015
    PDF (3.0 MB). DOI link
  • Efficient interpolation of articulated shapes using mixed shape spaces
    Stefano Marras, Thomas J. Cashman and Kai Hormann
    Computer Graphics Forum 32(8), pp. 258–270, 2013
    PDF (2.7 MB). DOI link
  • A smoothness criterion for monotonicity-preserving subdivision
    Michael S. Floater, Carolina V. Beccari, Thomas J. Cashman and Lucia Romani
    Advances in Computational Mathematics 39(1), pp. 193–204, 2013
    PDF (145 KB). DOI link
  • Generalized Lane–Riesenfeld algorithms
    Thomas J. Cashman, Kai Hormann and Ulrich Reif
    Computer Aided Geometric Design 30(4), pp. 398–409, 2013
    PDF (1.1 MB). DOI link
  • What shape are dolphins? Building 3D morphable models from 2D images
    Thomas J. Cashman and Andrew W. Fitzgibbon
    IEEE Transactions on Pattern Analysis and Machine Intelligence 35(1), pp. 232–244, 2013
    See above for an accompanying video and CodePlex for a sample implementation.
    PDF including supplementary material (4.6 MB). DOI link
  • A mixed shape space for fast interpolation of articulated shapes
    Stefano Marras, Thomas J. Cashman and Kai Hormann
    Proceedings of Vision, Modeling, and Visualization 2012, pp. 159–166, 2012
    PDF (8.1 MB). DOI link
  • A continuous, editable representation for deforming mesh sequences with separate signals for time, pose and shape
    Thomas J. Cashman and Kai Hormann
    Computer Graphics Forum 31(2), pp. 735–744, 2012
    See above for a sample implementation, video, and other auxiliary materials.
    PDF (1.2 MB). DOI link
  • Beyond Catmull–Clark? A survey of advances in subdivision surface methods
    Thomas J. Cashman
    Computer Graphics Forum 31(1), pp. 42–61, 2012
    PDF (2.0 MB). DOI link
  • NURBS-compatible subdivision surfaces
    Thomas J. Cashman
    PhD thesis, 2010
    PDF (5.8 MB)
  • Numerical checking of C1 for arbitrary degree quadrilateral subdivision schemes
    Ursula H. Augsdörfer, Thomas J. Cashman, Neil A. Dodgson and Malcolm A. Sabin
    E. Hancock, R. Martin, M. Sabin (Eds.): Mathematics of Surfaces 2009, LNCS 5654, pp. 45–54, 2009
    PDF (375 KB). DOI link
  • Deriving box-spline subdivision schemes
    Neil A. Dodgson, Ursula H. Augsdörfer, Thomas J. Cashman and Malcolm A. Sabin
    E. Hancock, R. Martin, M. Sabin (Eds.): Mathematics of Surfaces 2009, LNCS 5654, pp. 106–123, 2009
    DOI link
  • NURBS with extraordinary points: High-degree, non-uniform, rational subdivision schemes
    Thomas J. Cashman, Ursula H. Augsdörfer, Neil A. Dodgson and Malcolm A. Sabin
    ACM Transactions on Graphics 28(3), pp. #46, 1–9, Proc. SIGGRAPH 2009
    See above for a sample implementation, video, and other auxiliary materials.
    PDF (2.6 MB). DOI link
  • Selective knot insertion for symmetric, non-uniform refine and smooth B-spline subdivision
    Thomas J. Cashman, Neil A. Dodgson and Malcolm A. Sabin
    Computer Aided Geometric Design 26(4), pp. 472–479, 2009
    PDF (151 KB). DOI link
  • A symmetric, non-uniform, refine and smooth subdivision algorithm for general degree B-splines
    Thomas J. Cashman, Neil A. Dodgson and Malcolm A. Sabin
    Computer Aided Geometric Design 26(1), pp. 94–104, 2009
    PDF (378 KB). DOI link
  • Non-uniform B-spline subdivision using refine and smooth
    Thomas J. Cashman, Neil A. Dodgson and Malcolm A. Sabin
    R. Martin, M. Sabin, J. Winkler (Eds.): Mathematics of Surfaces 2007, LNCS 4647, pp. 121–137, 2007
    PDF (637 KB). DOI link
  • Bounded curvature subdivision without eigenanalysis
    Malcolm A. Sabin, Thomas J. Cashman, Ursula H. Augsdörfer and Neil A. Dodgson
    R. Martin, M. Sabin, J. Winkler (Eds.): Mathematics of Surfaces 2007, LNCS 4647, pp. 391–411, 2007
    DOI link