I am a data scientist specialising in natural language processing (NLP). I help organisations extract value from unstructured data, such as PDFs, transcripts, emails, or contracts. In 2018, I founded Fast Data Science to deliver NLP consulting services.
I have been working in NLP since 2008, after completing an MPhil in Computer Speech, Text and Internet Technology at Fitzwilliam College, University of Cambridge. I specialise in NLP for text data and for the pharmaceutical industry.
I offer data science and data strategy consulting in machine learning and natural language processing. Whether you would like advice on starting a data science project or are interested in a longer-term engagement, I'm keen to hear from you. More information is available on my company website, fastdatascience.com.
You can reach me on +44 20 3488 574 or at fastdatascience.com/contact/. You can also add me on LinkedIn or follow me on GitHub.
My main area of focus is natural language processing (NLP), and I have undertaken many projects in industries such as pharmaceuticals, healthcare, market research and law, where unstructured text data is the norm. I completed an MPhil in Computer Speech, Text and Internet Technology at the University of Cambridge in 2008, and since then I have worked exclusively in machine learning, mostly in NLP. In 2018, I founded my consultancy, Fast Data Science Ltd, which focuses on NLP. I have built NLP pipelines from scratch and worked on natural language dialogue systems, document classifiers and text-based recommender systems. For these tasks I have used both traditional machine learning techniques and state-of-the-art approaches such as neural networks, large language models (LLMs) and generative models.
A good data and AI strategy should be tailored to your organisation and robust against disruption. It should take into account your current business needs, your potential for growth, and the possibilities that AI can bring in the future. If you are starting a new venture and know that data science and AI will form part of your business model, please contact me for an AI strategy consulting session to ensure that your data processes and data collection are on the right track.
I offer end-to-end machine learning consulting and development. I start with an on-site meeting to discuss your requirements and to brainstorm the projects and objectives with the highest ROI for your organisation. This is followed by a data exploration phase, where I establish what is achievable for your use case using AI. I then begin an intensive phase of model development, testing thousands of competing models against a validation dataset, and finally we move on to model deployment.
I offer expert witness and expert advisor services for civil litigation in the fields of NLP, data science and AI. I have completed the Cardiff University Bond Solon Civil Expert Certificate, and I am certified to give expert evidence in civil proceedings in England and Wales. I am certified by Microsoft as a Microsoft Azure Associate Data Scientist. I also routinely undertake due diligence engagements for private equity and other investors who are considering acquiring companies in the NLP and data science space. An example of a CPR35/PD35 compliant expert witness report in the field of machine learning is available to download.
I helped Tarion develop a pathway to use data science and NLP to predict when a form submitted by a homeowner, containing multiple free text fields, is likely to result in escalation to physical inspection of a building, a major structural defect, or litigation.
I worked on a data harmonisation tool for psychologists, which uses LLMs to help researchers combine data from disparate sources.
I assisted the Gates Foundation with an NLP tool that identifies key risk factors for a clinical trial failing to deliver informative results.
Robnik, M & Wood, T. Testing Adiabatic Invariance in Separatrix Crossing. Nonlinear Phenomena in Complex Systems, 2007.
Wood TA and McNair D. Clinical Trial Risk Tool: software application using natural language processing to identify the risk of trial uninformativeness. Gates Open Res 2023, 7:56 doi: 10.12688/gatesopenres.14416.1.
McElroy, E., Moltrecht, B., Scopel Hoffmann, M., Wood, T. A., & Ploubidis, G. (2023, January 6). Harmony – A global platform for contextual harmonisation, translation and cooperation in mental health research. Retrieved from osf.io/bct6k (submitted for publication)
Moltrecht, B., Wood, TA., Scopel Hoffmann, M., McElroy, E. Harmony: a web-tool for retrospective, multilingual harmonisation of questionnaire items using natural language processing. Retrieved from https://psyarxiv.com/zvqbf/ (submitted for publication)
Wood, TA., McNair, D., Clinical Trial Risk Tool: software application using natural language processing to identify the risk of trial uninformativeness [version 1; peer review: awaiting peer review], Gates Open Research 2023, 7:56.
Mather, JF, Othman, A, Streit, S, Dumitran, I & Wood, T. System and method for biometric protocol standards. US patent 9838388, 2017.