Talks and presentations

See a map of all the places I've given a talk!

Minimum Viable Model Estimates for Machine Learning Projects

December 18, 2020

Talk, CSEA 2020, Sydney, Australia

Prioritization of machine learning projects requires estimates of both the potential ROI of the business case and the technical difficulty of building a model with the required characteristics. In this work we present a technique for estimating the minimum performance a predictive model must achieve, given information about how it will be used. The technique yields robust, objective comparisons between potential projects. The resulting estimates allow data scientists and managers to evaluate whether a proposed machine learning project is likely to succeed before any modelling is done. The technique is implemented in the open source application MinViME (Minimum Viable Model Estimator), which can be installed from PyPI, the Python package index, or downloaded directly from the GitHub repository.
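To give a flavour of what a "minimum viable" estimate can look like, here is a minimal sketch of a break-even calculation. It is not the MinViME method or its API; the function name, the parameters and the simple cost model (a fixed benefit per true positive and a fixed cost per false positive) are assumptions made purely for illustration.

```python
def min_viable_precision(benefit_per_tp: float, cost_per_fp: float) -> float:
    """Break-even precision for acting on a classifier's positive predictions.

    Hypothetical cost model (an assumption, not taken from MinViME):
    each true positive earns `benefit_per_tp`, each false positive
    loses `cost_per_fp`. With precision p, the expected value per
    flagged case is p * benefit - (1 - p) * cost, which is
    non-negative when p >= cost / (benefit + cost).
    """
    if benefit_per_tp <= 0 or cost_per_fp < 0:
        raise ValueError("benefit must be positive and cost non-negative")
    return cost_per_fp / (benefit_per_tp + cost_per_fp)


def expected_value_per_flag(precision: float, benefit_per_tp: float,
                            cost_per_fp: float) -> float:
    """Expected payoff of acting on one flagged case at a given precision."""
    return precision * benefit_per_tp - (1.0 - precision) * cost_per_fp
```

Under these assumptions, a project where a correct flag is worth 90 and a wrong flag costs 10 only needs a model with precision above 0.1 to break even, which is a far easier target than one where the two numbers are reversed.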

Modern Machine Learning Language Models

October 16, 2020

Talk, Selenium Day, 16 Oct 2020, Sydney, Australia

In this invited talk for Selenium Day I presented an overview of the technical innovations that have led to modern machine learning successes in language processing. I discussed what is special about processing text, the fundamentals of recurrent processing, the development of attention and self-attention models, and finally how these led to the Transformer architecture.
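As a rough illustration of the self-attention step mentioned above, the following is a minimal sketch of single-head scaled dot-product self-attention in plain Python. The helper names and the list-of-lists matrix representation are assumptions chosen to keep the example dependency-free; it is not code from the talk.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x is a (tokens x d_model) matrix; w_q, w_k, w_v project it to
    queries, keys and values. Returns (outputs, attention_weights).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = matmul(q, k_t)               # token-to-token similarities
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights
```

Each output row is a weighted average of the value vectors, with weights determined by how strongly that token's query matches every token's key. Unlike a recurrent model, every token attends to every other token in one step.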

Event Link

Introduction to Bayesian Machine Learning

November 28, 2019

Talk, Machine Learning & Deep Learning Day, 28 Nov 2019, Sydney, Australia

In this invited talk for the Machine Learning & Deep Learning Day I presented a ground-up introduction to the fundamentals of Bayesian Machine Learning. I introduced the idea of Bayesian statistics and described the connections between maximum likelihood, maximum a posteriori and finally the Bayesian goal of a complete estimate of the posterior distribution. I introduced Markov Chain Monte Carlo and the Metropolis-Hastings algorithm. Finally, I shared some brief cautions on how people coming from frequentist machine learning tend to go wrong, either through their expectations or their implementations.
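The Metropolis-Hastings idea can be sketched in a few lines. The example below is a minimal random-walk sampler for the mean of Gaussian data under a Gaussian prior; the model, the function names and the default settings are all assumptions chosen for illustration and are not taken from the talk.

```python
import math
import random


def log_posterior(mu, data, prior_sd=10.0, noise_sd=1.0):
    """Unnormalised log posterior of the mean `mu`.

    Assumed model: data ~ Normal(mu, noise_sd), mu ~ Normal(0, prior_sd).
    """
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_lik = sum(-0.5 * ((x - mu) / noise_sd) ** 2 for x in data)
    return log_prior + log_lik


def metropolis_hastings(data, n_samples=5000, step_sd=0.5, seed=0):
    """Random-walk Metropolis sampler.

    The Gaussian proposal is symmetric, so the Hastings correction
    cancels and the acceptance probability is min(1, posterior ratio).
    """
    rng = random.Random(seed)
    mu = 0.0
    current_lp = log_posterior(mu, data)
    samples = []
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step_sd)
        proposal_lp = log_posterior(proposal, data)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            mu, current_lp = proposal, proposal_lp
        samples.append(mu)  # a rejected proposal repeats the current value
    return samples


if __name__ == "__main__":
    data = [2.3, 1.7, 2.9, 1.4, 2.2, 2.6, 1.9, 2.0]
    draws = metropolis_hastings(data)[1000:]  # discard burn-in
    mle = sum(data) / len(data)               # maximum likelihood estimate
    print("MLE:", mle)
    print("Posterior mean (MCMC):", sum(draws) / len(draws))
```

For this conjugate model the maximum likelihood estimate is the sample mean and the maximum a posteriori estimate is that mean shrunk slightly towards the prior. The sampler goes further by returning draws from the whole posterior, from which intervals and other summaries can be computed.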

Event Link

DataRobot Vs The Red Queen

September 10, 2019

Talk, Chief Data and Analytics Officer Conference, Melbourne, Australia

In this talk I gave a brief overview of the Red Queen effect, which has been used in evolutionary biology to describe the co-evolution of competing species. I applied this idea to the competition between organisations that are using data science and machine learning to differentiate themselves from their competitors.

Building Model Factories with the DataRobot API

May 31, 2018

Talk, Sydney Data Science Meet-Up, Sydney, Australia

In this Sydney Data Science Sponsored Meet-Up talk I gave an introduction to the idea of Model Factories, discussing the history of the idea and how it has led to AutoML systems like DataRobot, which ultimately enable us to build new forms of automated ML systems.

Being Bayesian

November 29, 2016

Talk, Sydney Data Science Meet-Up, Sydney, Australia

In this Sydney Data Science Meet-Up talk I gave an overview of the history and reasoning that led to the distinction between Frequentist and Bayesian Inference. I gave several worked examples and showed the results of simulations designed to answer the question of when we should prefer one over the other.

Full video of the talk here

Protein Structure Search Strategies

December 13, 2010

Talk, BioTec Dresden, Dresden, Germany

In this BIOTEC Post-Doc Seminar Series talk I gave an overview of the algorithms used to search protein databases for functional motifs and active sites that determine biological function and potential biomedical applications.

Can Comparative Genomics Improve Transcription Factor Binding Site Prediction

April 07, 2008

Talk, Institution Presentation for the Institute for Molecular Bioscience, Brisbane, Australia

In this presentation for the Institute for Molecular Bioscience at the University of Queensland I summarised some of the observations and conclusions that Tim Bailey and I had reached while working on using information from gene sequence alignments to improve our ability to identify transcription factor binding sites.

The Statistical Power of Phylogenetic Motif Models

March 30, 2008

Talk, RECOMB: Research in Computational Molecular Biology, Singapore

In this talk I presented the results of the research paper completed with Tim Bailey on exploiting the phylogenetic information in comparative gene sequence alignments to improve the prediction of transcription factor binding sites.

Evolving PTS2 Motifs

July 20, 2006

Talk, Congress on Evolutionary Computation, Vancouver, Canada

In this talk I presented the work done with Mikael Boden on designing evolutionary algorithms to create regular-expression-like motifs that distinguish proteins carrying the PTS2 motif. This is a difficult classification task because large data sets are lacking and the sequences in the signalling section of the proteins are highly variable.

Predicting Nuclear Proteins

July 12, 2006

Talk, ACB Bioinformatics Student Symposium, Auckland, New Zealand

In this talk I presented initial work done with Mikael Boden on building machine learning systems to classify proteins that are bound for the nucleus after transcription. It involved creating new datasets and evaluating a range of existing techniques.