Covid-19 Monitoring

In an earlier post  I explained how to get Covid-19 data from the Public Health England API. You can download my data monitor PBI filehere but you should refer to that post to explain how to create queries to get the specific local authority data you need for your school.


What data is there and how should we monitor it?

The PHE data has different indicators for different area types (they’re also subject to change so check here for the latest info). At local authority level you can get new cases, cumulative cases (which is all since the outbreak began, not all currently ill) and cumulative cases per 100k of population. I think cumulative cases is irrelevant so I’m monitoring new cases in the past 7 days. Cumulative case rate is also irrelevant but useful for calculating the borough population.

My tracker also monitors hospitalisation and death figures although these are only available at the national not the local authority or regional level.

There obviously isn’t a clear dividing line between a safe and unsafe number of cases but for context in the week leading up to the lockdown in Leicester (21st to 28th June 2020) there were 141 new cases per 100k population. At the MAT where I’m currently working we’re considering 100 cases per 100k a serious concern although obviously the rate of increase matters too.

Data Model

My model consists of 3 datasets from PHE

NewDailyCases – this is a separate query for each local authority I’m interested in, combined into a single table in Power Query

NationalData – a single query with the hospitalisation and death data

AreaInfo – this is a query for a specific date, the date is irrelevant it just means we get a list of unique local authorities which allows us to divide their cumulative cases by their cumulative case rate to get their population expressed in units of 100k (which I called ‘pop’).

Plus

Students – I’ve got some dummy student data with a borough for each student. If you need to look up the local authority of students’ postcode you can download that info from the Office of National Statistics https://geoportal.statistics.gov.uk/

DateTable – a table made in DAX using CALENDAR to control dates on NationalData and NewDailyCases with a single slicer.

Covid-19 Score

Because the transmission risk to a school community comes from where the school is and where the pupils live, we came up with the concept of the Covid-19 Score which incorporates both these factors. This is a particular concern where I work in London as pupils live in various boroughs, if most of your kids live in a single local authority it might not be worth it.

The methodology for the Covid-19 score is:

Home Borough Element

=  New cases per 100k population in last 7 days * percentage of students on roll living in borough 1

+ New cases per 100k population in last 7 days * percentage of students on roll living in borough 2

…repeated for each borough where pupils are resident

School Borough Element

= New cases per 100k population in last 7 days in borough where school is located

Covid-19 Score = Home Borough Element + School Borough Element

                                                                                2

We can calculate this in DAX using ADDCOLUMNS and SUMMARIZE. The formula is here:

Covid-19 Score =

//Note that this measure takes a single, hard-coded school site from the measure 'School Location'. For a MAT this could be the SELECTEDVALUE of the school borough from your schools table

VAR SchoolLocation = [School Location]

VAR CVRefDate = MAX(NewDailyCases[date])-7

VAR Onroll = COUNTROWS(Students)  

VAR BoroughProportions = ADDCOLUMNS(

    SUMMARIZE(

                Students,

            Students[Borough]),

                               "PupCt",    CALCULATE(COUNTROWS(Students)),

                              "Cases",    VAR B = Students[Borough] RETURN                                                                          CALCULATE(SUM(NewDailyCases[newCasesBySpecimenDate]),                                                                                                                                                                      NewDailyCases[areaName]=B,                                                                                NewDailyCases[date]>CVRefDate),

                                        "Pop",       VAR B = Students[Borough] RETURN                                                        CALCULATE(SELECTEDVALUE(AreaInfo[Pop]),AreaInfo[areaName]=B))

VAR PupilHomeCount = SUMX(BoroughProportions,([PupCt]/Onroll)*([Cases]/[Pop]))

VAR SchoolCount = SUMX(

                        FILTER(BoroughProportions,

                                                Students[Borough]=SchoolLocation),

                        [Cases]/[Pop])

VAR R = (PupilHomeCount+SchoolCount)/2

RETURN

R

The important things to note about this measure are that all students in the students are assumed to be on roll (you may need to add filters) and that the names in the field Students[borough] match those in NewDailyCases[areaName]

Rate of Change

Perhaps more important that any measure of cases is the rate at which they’re changing. Fortunately time-intelligence functions in DAX make it pretty straight forward to calculate the Covid-19 score from the previous week and express the percentage change to this week. The formula DATESINPERIOD returns a set of dates for the previous week in this measure and can be used as a CALCULATE modifier:

Change from last week =

VAR LW = CALCULATE([Covid-19 Score],DATESINPERIOD(DateTable[Date],MAX(DateTable[Date])-13,7,DAY))

VAR TW = [Covid-19 Score]

RETURN

DIVIDE(TW-LW,LW)

 

Mapping borough scores

In theory you can tell BI that say, ‘Merton’ is a place and then the Maps visual will find it on the map (you do this by clicking on the field in the field list and choosing a Data Category in Column Tools. My experience with this hasn’t been great though as Bing seems to think a lot of places in South London are in Pennsylvania (which to be fair they are…in a sense).

So in order to place my local authority stats on the map I cheated a bit by looking up the postcodes of the council offices for my three sample local authorities (Croydon, Merton and Sutton) and adding a conditional column in Power Query to my AreaInfo table to quasi manually add the postcode to the LA (if areaName = Croydon postcode = …etc)

I have a simple measure for the seven day case count for a single LA:

SevenDayCaseCountPer100K =

VAR MaxD = MAX(NewDailyCases[date])

VAR MinD = MaxD-7

 

VAR X = CALCULATE(SUM(NewDailyCases[newCasesBySpecimenDate]),

                                                            NewDailyCases[date] > MinD,

                                                            NewDailyCases[date] <= MaxD)

VAR Bpop = SELECTEDVALUE(AreaInfo[Pop])

RETURN

DIVIDE(X,Bpop)        

 

And my original plan was use the size property of the map visual to express the differences. The problem with this was that there’s no way of applying an external scale to the sizes – so Croydon with 5.4 cases per 100k had a bubble three times the size of Merton and Sutton which had 5.3.

Instead I switched to a colour scale for the bubbles. If you go to the Data Colours format property you can conditionally format the colour to get progressively stronger the higher the case count.


 

Seven day smoothing for deaths

For hospitalisations I used a simple sum of [nationalData]hospitalCases, but for deaths if you do that it looks like this


 

To show an average of the last seven days instead this is the measure I used (note that using ADDCOLUMNS and CALCULATE(SUM… instead of SUM( inside SUMMARIZE is more principled in terms of maximising performance, but with the low data volumes we’re dealing with here there’s no discernible difference)

SevenDayAverageDeaths =

VAR Pend = SELECTEDVALUE(NationalData[date])

VAR PStart = Pend - 7

VAR CombinedNats = CALCULATETABLE(

                                        SUMMARIZE(NationalData,NationalData[date],

                                        "DeathTot",SUM(NationalData[new28DaysDeathsByPublishDate])),

                                   NationalData[date] <= Pend,

                                   NationalData[date] > PStart)

               

VAR X = AVERAGEX(CombinedNats,[DeathTot])

                                                                         

RETURN

Note that CALCULATETABLE (like CALCULATE) replaces the filter context of fields used as its filters which is important as we need to override the filtering to a single date applied by the X axis of the chart.                

 

The file is downloadable here and you’re free to adapt and use it for non commercial purposes. If you’d like BI training to help you to set it up then please contact me on twitter @Bi4Schools or at ed@rsic.co.uk

  

Comments

  1. I feel SQL, power BI and other tools like these are actually very helpful to provide for more and more aspects of this.
    Powerbi Read Rest

    ReplyDelete

Post a Comment

Popular posts from this blog

Attainment 8 in BI - Part 1

Subject - Class - Pupil Table

Data modelling for a large MAT