Below we showcase several projects in which SIH has delivered a report.
Wheat yield prediction with uncertainty estimates
Predicting crop yield using a range of proximal and remote sensor measurements is an area of active research. Such predictions are important for optimising crop management (e.g. nitrogen application), and robust associated uncertainty estimates help to improve this process and understand its limitations. We wrote code implementing a Bayesian regression model with spatially correlated residuals for application to wheat crop yield forecasting using a range of sensor data. We used this to generate predictive maps of wheat yield with robust uncertainty bounds.
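To make the idea of a Bayesian regression with spatially correlated residuals concrete, here is a minimal sketch. It is not the project code: the exponential covariance function, the fixed hyperparameters (`sill`, `length_scale`, `nugget`, `prior_var`) and the synthetic data are all assumptions chosen for illustration, and a real analysis would estimate the covariance parameters rather than fix them.

```python
import numpy as np


def exp_cov(a, b, sill, length_scale):
    """Exponential covariance between two sets of 2-D coordinates."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / length_scale)


def fit_predict(X, y, coords, X_new, coords_new,
                sill=1.0, length_scale=50.0, nugget=0.1, prior_var=100.0):
    """Return posterior mean of the weights, plus predictive mean and
    standard deviation at new sites (all hyperparameters are assumed)."""
    n, p = X.shape
    K = exp_cov(coords, coords, sill, length_scale) + nugget * np.eye(n)
    K_inv = np.linalg.inv(K)

    # Conjugate Gaussian update with prior beta ~ N(0, prior_var * I).
    post_cov = np.linalg.inv(X.T @ K_inv @ X + np.eye(p) / prior_var)
    post_mean = post_cov @ X.T @ K_inv @ y

    # Prediction = regression trend + spatially interpolated residual.
    k_star = exp_cov(coords_new, coords, sill, length_scale)
    W = k_star @ K_inv
    mean = X_new @ post_mean + W @ (y - X @ post_mean)

    # Variance = residual-field uncertainty + weight uncertainty.
    A = X_new - W @ X
    var = ((sill + nugget)
           - np.sum(W * k_star, axis=1)
           + np.sum((A @ post_cov) * A, axis=1))
    return post_mean, mean, np.sqrt(var)


if __name__ == "__main__":
    rng = np.random.default_rng(0)          # synthetic data only
    coords = rng.uniform(0, 100, size=(30, 2))
    X = np.column_stack([np.ones(30), rng.normal(size=30)])
    y = X @ np.array([2.0, 1.5]) + rng.normal(scale=0.3, size=30)
    coords_new = rng.uniform(0, 100, size=(5, 2))
    X_new = np.column_stack([np.ones(5), rng.normal(size=5)])
    beta, mean, sd = fit_predict(X, y, coords, X_new, coords_new)
    print(beta, mean, sd)
```

Evaluating `fit_predict` over a grid of locations would give a predictive map together with a per-cell standard deviation, which is one simple way to express the uncertainty bounds mentioned above.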
Bayesian Updating for Childhood Obesity Grant Proposal
SIH supported a grant proposal by the Centre for Translational Data Science by demonstrating the value of Bayesian modelling when collecting and analysing longitudinal data on childhood obesity. We built cross-sectional Bayesian variable selection models to select important factors and models for predicting children’s BMI, mental health and sleep quality across multiple ages, for each child in the Longitudinal Study of Australian Children (LSAC). A vector-autoregressive model was then applied to visualise the unexplained variation in the preceding models. We constructed visualisations to demonstrate the importance of understanding uncertainty over the course of data collection, and the potential for using Bayesian adaptive trials during collection.
Labelling Clause Type at Scale for LCT
LCT studies how knowledge is built through teaching and, in order to determine the trajectory of knowledge building, proposes to categorise each clause in a teaching transcript. SIH made this process of labelling clauses much faster and more scalable. First, SIH developed software that uses natural language processing to convert a lesson transcript into a spreadsheet in which each row contains a clause to be categorised. Second, SIH developed a machine learning classifier that learns from these spreadsheets and predicts the labels of future clauses. Finally, SIH developed techniques to visualise the trajectory of knowledge building through a lesson whose clauses have been categorised.
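The first step, turning a transcript into one spreadsheet row per clause, can be sketched as follows. This is an illustrative assumption, not SIH's software: the regular expression is a crude stand-in for real natural language processing (a syntactic parser would be needed for reliable clause boundaries), and the column names `clause_id`, `clause` and `label` are hypothetical.

```python
import csv
import io
import re

# Naive clause boundaries (an assumption): whitespace after sentence-ending
# punctuation or a semicolon, or a comma followed by a common conjunction.
_CLAUSE_BOUNDARY = re.compile(
    r"(?<=[.!?;])\s+|,\s+(?=(?:and|but|so|because|which|while)\b)",
    flags=re.IGNORECASE,
)


def split_clauses(transcript):
    """Split a transcript into rough clause-like segments."""
    parts = _CLAUSE_BOUNDARY.split(transcript)
    return [p.strip() for p in parts if p and p.strip()]


def clauses_to_csv(transcript):
    """Return CSV text with one row per clause and an empty label column
    ready to be filled in by a human coder or a classifier."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["clause_id", "clause", "label"])
    for i, clause in enumerate(split_clauses(transcript), start=1):
        writer.writerow([i, clause, ""])
    return buffer.getvalue()


if __name__ == "__main__":
    text = "Plants make food, and they need light. This is photosynthesis."
    print(clauses_to_csv(text))
```

Leaving the `label` column empty keeps the same file format usable both for manual coding and as training data for a classifier.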
Identifying ram mating behaviour
Monitoring livestock has historically been labour-intensive. The advent of on-animal sensors means this monitoring can be conducted remotely, continuously, and accurately. The ability to identify the precise time when sheep are mating using ram-mounted accelerometer data would unlock unprecedented information on the reproductive performance of these animals. We fitted a classifier model to data from collar accelerometers, labelled by videoing rams in the presence of ewes in oestrus. We then wrote code to detect change points in new acceleration data and to predict the occurrence of mating events.
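As an illustration of change-point detection on an acceleration trace, the sketch below flags indices where the mean of the signal shifts between two adjacent windows. The source does not say which change-point method was used, so this sliding-window mean comparison, along with the `window` and `threshold` values, is an assumption chosen for simplicity.

```python
from statistics import fmean


def detect_change_points(signal, window=5, threshold=1.0):
    """Return indices where the mean of the next `window` samples differs
    from the mean of the previous `window` samples by more than
    `threshold`, keeping the strongest index in each consecutive run."""
    scores = {}
    for i in range(window, len(signal) - window + 1):
        left = fmean(signal[i - window:i])
        right = fmean(signal[i:i + window])
        diff = abs(right - left)
        if diff > threshold:
            scores[i] = diff

    change_points = []
    run = []
    for i in sorted(scores):
        # A gap in the candidate indices closes the current run.
        if run and i != run[-1] + 1:
            change_points.append(max(run, key=scores.get))
            run = []
        run.append(i)
    if run:
        change_points.append(max(run, key=scores.get))
    return change_points


if __name__ == "__main__":
    # Synthetic trace: quiet, then a burst of activity, then quiet again.
    trace = [0.0] * 20 + [3.0] * 20 + [0.0] * 20
    print(detect_change_points(trace))  # [20, 40]
```

The segments between detected change points could then be summarised (for example by mean and variance) and passed to a classifier to decide whether each segment looks like a mating event.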
eSCAPE parallel landscape evolution benchmarking
eSCAPE is a parallel landscape evolution model, built to simulate topography dynamics at various space and time scales. SIH benchmarked eSCAPE’s performance across multiple CPUs and nodes on the University of Sydney’s Artemis HPC, visualising the program’s overall runtimes as well as the runtimes of specific functions within the program. SIH created reusable scripts so that the researcher can easily reassess eSCAPE’s performance as code development continues.
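A reusable benchmarking script of this kind typically records per-function wall-clock times and summarises how runtime scales with CPU count. The sketch below shows one way to do both; it is not the SIH scripts, `toy_step` is a hypothetical stand-in for a model routine, and the runtimes passed to `scaling_table` are placeholders rather than measured eSCAPE results.

```python
import time
from collections import defaultdict
from functools import wraps

# Wall-clock samples per function name, filled in by the decorator below.
timings = defaultdict(list)


def timed(func):
    """Record the wall-clock duration of every call to `func`."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings[func.__name__].append(time.perf_counter() - start)
    return wrapper


def scaling_table(runtimes):
    """Given {cpu_count: runtime_seconds}, return speedup and parallel
    efficiency relative to the run with the fewest CPUs."""
    base_cpus = min(runtimes)
    base_time = runtimes[base_cpus]
    table = {}
    for cpus in sorted(runtimes):
        speedup = base_time / runtimes[cpus]
        efficiency = speedup * base_cpus / cpus
        table[cpus] = {"speedup": speedup, "efficiency": efficiency}
    return table


@timed
def toy_step(n):
    # Hypothetical stand-in for a model routine; not part of eSCAPE.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    for _ in range(3):
        toy_step(10_000)
    print({name: sum(t) / len(t) for name, t in timings.items()})
    # Placeholder runtimes, purely illustrative.
    print(scaling_table({1: 100.0, 2: 55.0, 4: 30.0}))
```

An efficiency close to 1 means adding CPUs is paying off almost linearly; a value that drops quickly suggests communication or serial sections are dominating.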
Video Tracking Predator-Prey Interactions in Fish
By video-tracking the interaction between prey mosquitofish, Gambusia holbrooki, and their predator, jade perch, Scortum barcoo, under controlled conditions, we provide some of the first fine-scale characterisation of how prey adapt their behaviour according to their continuous assessment of risk based on both predator behaviour and angular distance to the predator’s mouth. When these predators were inactive and posed less of an immediate threat, prey were often found within the attack cone of the predator showing reductions in speed and acceleration, characteristic of predator-inspection behaviour. However, when predators became active, prey swam faster with greater acceleration and were closer together within the attack cone of predators. Most importantly, this study provides evidence that prey do not adopt a uniform response to the presence of a predator. Instead, we demonstrate that prey are capable of rapidly and dynamically updating their assessment of risk and showing fine-scale adjustments to their behaviour.
Paper: “Fine-scale behavioural adjustments of prey on a continuum of risk”. M.I.A. Kent, J.E. Herbert-Read, G.D. McDonald, A.J. Wood, A.J.W. Ward. Proceedings of the Royal Society B. 2019
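The quantities discussed above (prey speed, angular distance to the predator's mouth, and membership of the attack cone) can all be derived from tracked positions. The following is a minimal geometric sketch, not the study's analysis code: the coordinate conventions are assumptions, and the 30-degree cone half-angle is an arbitrary placeholder, since the source does not state how the attack cone was defined.

```python
import math


def speed(p0, p1, dt):
    """Speed between two tracked (x, y) positions separated by dt seconds."""
    return math.dist(p0, p1) / dt


def angular_distance(mouth_xy, heading, prey_xy):
    """Absolute angle in radians between the predator's heading and the
    bearing from its mouth to the prey (0 = directly in front)."""
    bearing = math.atan2(prey_xy[1] - mouth_xy[1], prey_xy[0] - mouth_xy[0])
    diff = bearing - heading
    # Wrap the difference into [-pi, pi] before taking the magnitude.
    diff = math.atan2(math.sin(diff), math.cos(diff))
    return abs(diff)


def in_attack_cone(mouth_xy, heading, prey_xy, half_angle=math.radians(30)):
    """True if the prey lies within `half_angle` of the predator's heading.
    The default half-angle is an assumed placeholder value."""
    return angular_distance(mouth_xy, heading, prey_xy) <= half_angle


if __name__ == "__main__":
    mouth, heading = (0.0, 0.0), 0.0      # predator facing along +x
    print(angular_distance(mouth, heading, (1.0, 1.0)))  # about pi / 4
    print(in_attack_cone(mouth, heading, (1.0, 0.1)))    # True
    print(speed((0.0, 0.0), (3.0, 4.0), 0.5))            # 10.0
```

Computing these values frame by frame gives time series that can be compared between periods when the predator is active and inactive.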
Where can deep-sea iron nodules be found?
Potato-sized nodules of iron ore found on the ocean floor are of commercial mining interest. However, the negative ecological effects of mining these nodules are of concern. SIH constructed a global predictive model of nodule occurrence by combining data from thousands of ocean floor samples with global maps of oceanic variables. The environments in which these deposits do and do not occur could then be characterised to generate insight into the potential consequences of proposed mining.
Predicting unnecessary CT scans
- Professor Jonathan Morris, Kolling Institute of Medical Research and Sydney Medical School; Dr Felicity Gallimore
- The University of Sydney Medical School
- Data Science (Dr Aldo Saavedra, Dr Madhura Killedar, Dr Joel Nothman and Mr Peter Thiem)
- Predictive modelling, Inferential modelling, Description and basic visualization, Language as data
Diagnostic imaging in hospitals is costly due to expensive machines and their operators, as well as the cost of moving patients in and out of radiography. Published studies of emergency presentations have shown that the number of brain computed tomography (CT-Brain) scans performed is increasing over time, while the proportion of scans giving no cause for concern remains the same and represents the largest category.
We sought to determine whether a substantial portion of CT scans performed in North Sydney LHD were unnecessary. We translated this research question into something determinable from data: identify CT-Brain cases where the unconcerning outcome of a scan could be predicted from clinical knowledge available prior to the scan. By first constructing a text classifier to label CT scan reports as unconcerning, we were able to use clustering and predictive modelling to weakly identify some patient features that predicted unconcerning CT results.
While the project had the potential to impact clinical policy surrounding the application of CT scans in Emergency Departments, the weak results suggest that, if any excessive expenditure problem exists, it is not simple to resolve. At the same time, we have developed methodologies for performing similar studies towards rationalising diagnostic scan expenditure.
Identifying Nerve Function Profiles in Motor Neurodegenerative Disorders
Nerve excitability measurements can identify patterns of nerve dysfunction associated with many diseases of the nervous system. The researchers manage a database containing around 20 years of peripheral nerve excitability studies. A software package, QTRAC, is used to generate ~35 properties that are analysed in a research context. Additional information, such as clinical survey data and the temperature of the nerve at the time of the test, is incorporated to help make a diagnosis. Importantly, diagnosis of the disorder is not always 100% accurate. SIH used machine learning to predict the likelihood of motor neuron disease (MND) for a patient given nerve excitability measurements. The model had reasonable ability to rank individual cases in order of increasing MND risk. SIH delivered this model in a software package for future use in research as well as in a clinical setting, with the intention of improving the speed and accuracy of MND diagnosis to improve treatment outcomes for patients.
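A model's ability to rank cases by risk is commonly summarised by the area under the ROC curve (AUC): the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The source does not state which metric was used for the MND model, so treating AUC as the ranking measure is an assumption, and the scores in the example are made-up numbers rather than project results.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve computed by direct pairwise comparison.

    `labels` uses 1 for a positive case and 0 for a negative case.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up risk scores for four cases; the last two are positive.
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

An AUC of 0.5 corresponds to random ordering and 1.0 to a perfect ranking, so values in between quantify how reliably higher scores correspond to true cases.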