PhD Research Scholarship – Data Analytics And Biostatistics

We are seeking expressions of interest from candidates who wish to apply for a PhD to develop and validate a natural language processing (NLP) algorithm. The algorithm will use the free-text notes (‘progress notes’) in Australian general practice electronic health records (EHRs) to identify risk-taking behaviours (e.g. drug and alcohol use, smoking, sexual activity) and health conditions (e.g. depression, obesity) among young people attending general practice.

Patient information in EHRs can be entered in a variety of formats, classified as structured or unstructured. Structured fields use standardised codes to store patient data such as demographics, diagnoses, investigations, and medications. Unstructured fields include narrative text such as patient consultation or ‘progress’ notes, which summarise the interaction between patient and clinician and store valuable medical information related to patient care. Most research studies using EHRs have relied on the structured fields because their content is organised and coded, which makes the data easy to extract and analyse computationally on a large scale. However, using only data from structured fields can miss important information about the patient and the consultation. Progress note data are difficult to analyse, and NLP is a tool that can be used to analyse such unstructured data. To date, however, NLP tools have not been applied to general practice EHR data in the Australian setting.

This PhD project will be nested within the University of Melbourne-led, NHMRC-funded RAd Health Trial (Rebate Adolescent Health) and will address a knowledge gap by developing and validating NLP tools to identify adolescent risk-taking behaviours and health conditions in general practice EHR progress notes. The candidate will be required to work closely with other students undertaking natural language processing studies, as well as with the data analytics and RAd Health Trial teams.

The successful candidate will undertake up to 3.5 years of full-time study based at the Department of General Practice, leading to a Doctoral degree from the University of Melbourne. The Department welcomes applications from high-achieving graduates with an interest in advancing research into data analytics and biostatistics. The successful applicant will be able to commence their studies in 2023, in line with University of Melbourne intake policies.


To be eligible for this scholarship, you must:

  • be an outstanding applicant with an Honours or Master's degree in a relevant discipline and strong academic results
  • have a demonstrated interest in data analytics and/or biostatistics

Selection criteria

Eligibility will be assessed on the following criteria:

  • satisfy the Faculty of Medicine, Dentistry and Health Sciences entry requirements for PhD, including minimum academic achievement of first-class honours or equivalent
  • demonstrate a deep interest in medical and health research with a focus on data analysis skills
  • demonstrate a high level of skill in statistics, mathematics, data analytics or a related discipline
  • demonstrate strong coding skills in software such as Stata, Python and R
  • demonstrate effective verbal and written communication skills
  • demonstrate openness to learning new things, versatility, creativity, and problem-solving skills with attention to detail

What are the benefits?

A living allowance of $33,000 per year (2023 full-time study rate, indexed annually) for up to 3.5 years. The allowance may be converted to top-up funding if the candidate receives a university or national PhD scholarship.

How to apply?

Applicants should provide an expression of interest (EOI). The EOI should address the selection criteria outlined above and demonstrate an interest in data analytics and/or biostatistics. The EOI must include a copy of your CV and academic transcripts.

For more information, download the Scholarship Description.