Satellite data and machine learning for public policy

As part of the Urban Analytics Data Dive at the Alan Turing Institute, I recently spent two days in an interdisciplinary team working on data science solutions to public policy problems. We focused on identifying potential sites for new housing within cities by combining satellite images, an urban atlas, planning applications, and other data sets. I am excited to say that the jury awarded our team second prize.

As an astronomer, I proposed identifying brownfield sites (previously developed but now disused industrial sites) from their spectral signature in Sentinel-2 satellite images. I trained a classifier on the satellite data, using an existing pilot register of brownfield sites in the Greater Manchester area as labels. Such a classifier could then be applied to identify candidate sites in other cities. The two days of the data dive were not enough to turn this idea into a production-ready tool, but I have written up my ideas and a first demonstration here.
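To make the idea of classifying pixels by their spectral signature concrete, here is a toy sketch. It is not the code from the data dive: the reflectance values are made up rather than taken from Sentinel-2, the four columns stand for four hypothetical bands, and a simple nearest-centroid rule stands in for whatever classifier one would actually use.

```python
import random


def fit_centroids(samples, labels):
    """Return the mean spectrum (one value per band) for each class label."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = [sum(col) / len(col) for col in zip(*rows)]
    return centroids


def predict(centroids, spectrum):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(spectrum, centroids[label]))
    return min(centroids, key=dist)


def synthetic_pixels(centre, n, rng, noise=0.02):
    """Draw n noisy spectra around a made-up class centre (not real data)."""
    return [[rng.gauss(c, noise) for c in centre] for _ in range(n)]


rng = random.Random(0)
# Made-up reflectances in four hypothetical bands, chosen only so that
# the two classes are separable. They are not measured values.
brownfield = synthetic_pixels([0.12, 0.14, 0.18, 0.22], 200, rng)
other = synthetic_pixels([0.05, 0.08, 0.06, 0.40], 200, rng)

# Train on the first 150 pixels of each class, hold out the remaining 50.
train_x = brownfield[:150] + other[:150]
train_y = ["brownfield"] * 150 + ["other"] * 150
centroids = fit_centroids(train_x, train_y)

test_x = brownfield[150:] + other[150:]
test_y = ["brownfield"] * 50 + ["other"] * 50
correct = sum(predict(centroids, x) == y for x, y in zip(test_x, test_y))
accuracy = correct / len(test_y)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

With real data, the training labels would come from the brownfield register and the spectra from the satellite pixels inside and outside the registered sites. Testing on a different city would be the more honest check of whether the classifier generalises.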

Workshop: speeding up numerical Python

Most of the code I write for my research is in Python. Standard numerical libraries such as NumPy and SciPy perform well in many cases, but heavier numerical computations can push them to their limits. In addition, Python's Global Interpreter Lock prevents multiple threads from executing Python code in parallel. To work around these limitations, I have started using Numba (a just-in-time compiler), multiprocessing (the standard library's interface for process-based parallel execution), and Cython (compiled C extensions for Python).
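As a rough illustration of the multiprocessing approach, here is a minimal sketch that splits a CPU-bound task across worker processes. It is not taken from the course notebooks; the function names `count_primes` and `split_range` and the problem sizes are chosen purely for this example.

```python
import math
from multiprocessing import Pool


def count_primes(bounds):
    """Count primes in the half-open range [start, stop) by trial division."""
    start, stop = bounds
    count = 0
    for n in range(max(start, 2), stop):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def split_range(stop, n_chunks):
    """Split [0, stop) into at most n_chunks contiguous (start, stop) pairs."""
    step = -(-stop // n_chunks)  # ceiling division
    return [(lo, min(lo + step, stop)) for lo in range(0, stop, step)]


if __name__ == "__main__":
    # Each chunk is handled by a separate process with its own interpreter,
    # so the Global Interpreter Lock does not serialise the work.
    chunks = split_range(50_000, 8)
    with Pool(processes=4) as pool:
        total = sum(pool.map(count_primes, chunks))
    print(total)
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the main module. Because arguments and results are pickled and sent between processes, this pattern pays off only when each chunk does enough work to outweigh that overhead.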

I recently gave a workshop on these tools for fellow graduate students at the Institute of Astronomy. The course was written entirely in Jupyter notebooks, which are available here:
Python optimization with Numba and Cython: Course material, download notebook
Parallel Python with multiprocessing: Course material, download notebook

Neuroimaging widget for Jupyter notebooks

As part of the Cambridge Brainhack 2017, Jan Freyberg and I wrote a Python package for interactive visualisation of neuroimaging data in Jupyter notebooks. It is available from GitHub. Here is a (non-interactive) example notebook and a short demo video.