You can access the R code for this analysis on my GitHub: https://github.com/InmaculadaRM/LiveBirthsMap

Live births in Scotland 2022
# I load my favorite packages (I don't always use all of them, but I keep them
# all in my template).
library(tidyverse)
library(janitor)
library(lubridate)
library(kableExtra)
library(formatR)
library(scales)
library(sp)
library(sf)
library(gridExtra)
library(latticeExtra)
library(cowplot)

Data used

To perform this analysis, two datasets were retrieved from the Scottish Health and Social Care Open Data platform, and one from the Spatial Data Metadata Portal, Scotland’s catalogue of spatial data.

  1. Births by hospital: the number of live births and stillbirths by hospital of birth, sourced from the Scottish Morbidity Record 02 (SMR02). It contains 8266 observations of 5 variables.

  2. Hospitals in Scotland: a listing of all NHS hospitals across Scotland, with 277 observations of 16 variables.

  3. Geographical spatial data for the Scottish Health Boards: an ESRI shapefile defining the boundaries of the NHS Health Boards in Scotland.

Reading the data and cleaning variable names:

# read in the .csv files and clean the variable names with clean_names()
births <- read_csv("https://www.opendata.nhs.scot/dataset/df10dbd4-81b3-4bfa-83ac-b14a5ec62296/resource/d534ae02-7890-4fbc-8cc7-f223d53fb11b/download/10.3_birthsbyhospital.csv") %>%
  clean_names() %>%
  separate(financial_year, into = c("year", NA), sep = "/")

hospitals <- read_csv("https://www.opendata.nhs.scot/dataset/cbd1802e-0e04-4282-88eb-d7bdcfb120f0/resource/c698f450-eeed-41a0-88f7-c1e40a568acc/download/current-hospital_flagged20211216.csv") %>%
  clean_names()
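
The `separate()` call above keeps only the first part of the financial year label. Here is a minimal sketch on made-up values, assuming the column holds strings such as "1997/98" (the exact label format in the source file is an assumption):

```r
library(tidyr)
library(tibble)

# Toy data: the "1997/98" label format is an assumption for illustration
toy <- tibble(financial_year = c("1997/98", "2022/23"), births = c(11, 39))

# NA in `into` drops the piece after the "/", so only the start year is kept
toy_year <- separate(toy, financial_year, into = c("year", NA), sep = "/")

toy_year
```

The resulting `year` column is text, not a number; passing `convert = TRUE` to `separate()` would return integers instead.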

# read in the .shp file
# you need to download all the shapefile's files to your computer and change the path below
path <- "D:/SpatiaDataFiles/SG_NHS_HealthBoards_2019.shp"
hb_spatial <- st_read(path)
## Reading layer `SG_NHS_HealthBoards_2019' from data source 
##   `D:\SpatiaDataFiles\SG_NHS_HealthBoards_2019.shp' using driver `ESRI Shapefile'
## Simple feature collection with 14 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 5512.998 ymin: 530250.8 xmax: 470332 ymax: 1220302
## CRS:           NA
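
The output above reports `CRS: NA`, meaning the layer was read without a coordinate reference system. The map still draws, but reprojecting it or overlaying other layers would need one. A hedged sketch on a toy point: the coordinates and the choice of EPSG:27700 (British National Grid) are assumptions, not something the source confirms, so check the dataset's metadata before applying this to `hb_spatial`.

```r
library(sf)

# Toy geometry with no CRS attached (coordinates are made up)
toy_geom <- st_sfc(st_point(c(300000, 700000)))
is.na(st_crs(toy_geom))   # TRUE: no CRS yet

# st_set_crs() only labels the coordinates; it does not transform them.
# EPSG:27700 is an assumption for illustration.
toy_geom <- st_set_crs(toy_geom, 27700)
st_crs(toy_geom)$epsg
```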

The ‘outcome’ variable can take the values ‘Live’, ‘Still’ or ‘Unknown’. We are going to represent live births.

head(births)
## # A tibble: 6 × 5
##   year  ca     hospital outcome smr02births
##   <chr> <chr>  <chr>    <chr>         <dbl>
## 1 1997  RA2704 A103H    Live             11
## 2 1997  RA2704 A103H    Still             1
## 3 1997  RA2704 B120H    Live             39
## 4 1997  RA2704 C206H    Live              1
## 5 1997  RA2704 C313H    Live              2
## 6 1997  RA2704 C418H    Live              2
table(births$outcome)
## 
##    Live   Still Unknown 
##    7440    1329      65
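
Note that `table()` counts records in the dataset, not births. To total the births themselves per outcome, each record has to be weighted by `smr02births`. A small sketch on made-up rows:

```r
library(dplyr)
library(tibble)

# Toy rows, invented for illustration
toy_births <- tibble(
  outcome     = c("Live", "Live", "Still", "Live"),
  smr02births = c(11, 39, 1, 2)
)

# Number of records per outcome (what table() reports)
count(toy_births, outcome)

# Number of births per outcome: weight each record by smr02births
births_by_outcome <- count(toy_births, outcome, wt = smr02births, name = "births")
births_by_outcome
```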

Total live births in Scotland in the financial year 2022-23

births %>%
  filter(year==2022 & outcome=="Live") %>%
  summarize(new_babies_2022 = sum(smr02births))
## # A tibble: 1 × 1
##   new_babies_2022
##             <dbl>
## 1           45061

Babies born at home in 2022

#number of babies born at home
births %>%
  filter(year==2022 & outcome=="Live" & hospital=="D201N") %>%
  summarize("Babies born at home in 2022"= sum(smr02births))
## # A tibble: 1 × 1
##   `Babies born at home in 2022`
##                           <dbl>
## 1                           209

Evolution of the total number of live births over time, from 1997 to 2022

baby_year <- births %>% filter (outcome=="Live") %>%
  group_by(year) %>%
  summarise(number_of_babies = sum(smr02births))
kable(baby_year)
year number_of_babies
1997 58282
1998 56471
1999 54073
2000 52498
2001 50799
2002 50977
2003 52585
2004 53366
2005 52971
2006 54982
2007 57983
2008 58525
2009 58066
2010 57696
2011 57952
2012 56406
2013 55274
2014 55365
2015 54571
2016 53644
2017 51938
2018 50556
2019 48642
2020 46158
2021 47518
2022 45061
ggplot(baby_year, aes(year, number_of_babies)) + geom_col(fill="#0097a7", alpha=0.3)+
  geom_text(aes(label = number_of_babies), vjust=-0.3, size =2.8, color='#005B70') +
  labs(
    title = "Number of live births in Scottish hospitals",
    subtitle = "(by financial year)",
    caption = "Data from: Public Health Scotland") +
  ylab("number of births")
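
Because `year` comes out of `separate()` as text, ggplot2 treats it as a discrete axis and prints every label. If a continuous axis is preferred, one option (a sketch, not what the plot above does) is to convert it to an integer first. The three values below are copied from the table above.

```r
library(dplyr)
library(ggplot2)
library(tibble)

baby_year_toy <- tibble(
  year = c("2020", "2021", "2022"),
  number_of_babies = c(46158, 47518, 45061)
)

p <- baby_year_toy %>%
  mutate(year = as.integer(year)) %>%   # text -> integer
  ggplot(aes(year, number_of_babies)) +
  geom_col(fill = "#0097a7", alpha = 0.3) +
  scale_x_continuous(breaks = seq(2020, 2022, by = 1)) +
  ylab("number of births")

p
```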

Evolution of stillbirths from 1997 to 2022

still_year <- births %>% 
  filter (outcome=="Still") %>%
  group_by(year) %>%
  summarise(still_births = sum(smr02births))
kable(still_year)
year still_births
1997 307
1998 318
1999 240
2000 286
2001 254
2002 245
2003 322
2004 256
2005 269
2006 295
2007 293
2008 298
2009 297
2010 277
2011 260
2012 231
2013 235
2014 200
2015 199
2016 216
2017 214
2018 173
2019 155
2020 190
2021 175
2022 157
ggplot(still_year, aes(year, still_births)) + geom_col(fill="brown", alpha=0.4) +
  ylim(0, 1000)
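
Raw stillbirth counts fall partly because total births fall, so a rate can put the two series on a common footing. The sketch below uses stillbirths per 1,000 total (live plus still) births and ignores the ‘Unknown’ outcomes, which is a simplifying assumption. The figures are copied from the two tables above.

```r
library(dplyr)
library(tibble)

rates <- tibble(
  year  = c("1997", "2022"),
  live  = c(58282, 45061),
  still = c(307, 157)
) %>%
  # stillbirths per 1,000 total births, rounded to two decimals
  mutate(still_per_1000 = round(1000 * still / (live + still), 2))

rates
```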

Live births at home (not all home births may be recorded in this dataset).

home <- births %>%
  # D201N is the code for home births; keep live births only
  filter(hospital == "D201N" & outcome == "Live") %>%
  group_by(year) %>%
  summarize(home_delivered = sum(smr02births))
ggplot(home, aes(year, home_delivered)) + geom_col(fill="#0097a7", alpha=0.3) +
   ylab("Number of babies") +
  geom_text(aes(label = home_delivered), vjust=-0.1, size =3, color='#0097a7') +
    labs(
    title = "Trends in home delivery births in Scotland",
    subtitle = "(by financial year)",
    caption = "Data from: Public Health Scotland")


Number of births in each hospital - table

# subsetting live births in 2022, grouped by hospital
newborns22 <- births %>%
  # D201N is the code for home births (52 births in 2021)
  filter(year==2022 & outcome=="Live" & hospital!= "D201N") %>%
  group_by(hospital) %>%
  summarize(babies_2022 = sum(smr02births)) %>%
  arrange(desc(babies_2022))
head(hospitals)
## # A tibble: 6 × 15
##   hospital_code hospital_name        address_line1 address_line2 address_line2qf
##   <chr>         <chr>                <chr>         <chr>         <chr>          
## 1 A101H         Arran War Memorial … Lamlash       Isle of Arran <NA>           
## 2 A103H         Ayrshire Central Ho… Kilwinning R… Irvine        <NA>           
## 3 A105H         Kirklandside Hospit… Kirklandside  Kilmarnock    <NA>           
## 4 A110H         Lady Margaret Hospi… College St    Millport      <NA>           
## 5 A111H         University Hospital… Kilmarnock R… Kilmarnock    <NA>           
## 6 A112H         Brooksby Day Hospit… 18 Greenock … Largs         <NA>           
## # ℹ 10 more variables: address_line3 <chr>, address_line3qf <chr>,
## #   address_line4 <chr>, address_line4qf <chr>, postcode <chr>,
## #   health_board <chr>, hscp <chr>, council_area <chr>,
## #   intermediate_zone <chr>, data_zone <chr>

Finding the column names in the hospitals dataset

names(hospitals)
##  [1] "hospital_code"     "hospital_name"     "address_line1"    
##  [4] "address_line2"     "address_line2qf"   "address_line3"    
##  [7] "address_line3qf"   "address_line4"     "address_line4qf"  
## [10] "postcode"          "health_board"      "hscp"             
## [13] "council_area"      "intermediate_zone" "data_zone"

Joining births dataset with hospital dataset:

births_2022 <- newborns22 %>%
  left_join(hospitals, by=c("hospital" = "hospital_code")) %>%
  select(hospital, hospital_name, health_board, babies_2022)
  
kable(births_2022, 
      caption = "Live births in Scottish hospitals in 2022") %>%
  kable_styling(latex_options = "striped", font_size = 12)
Live births in Scottish hospitals in 2022
hospital hospital_name health_board babies_2022
S314H Royal Infirmary of Edinburgh at Little France S08000024 5534
G405H Queen Elizabeth University Hospital S08000031 5154
G108H The Princess Royal Maternity Unit S08000031 4586
N161H Aberdeen Maternity Hospital S08000020 4509
L308H University Hospital Wishaw S08000032 4042
C418H Royal Alexandra Hospital S08000031 3201
T101H Ninewells Hospital S08000030 3184
V217H Forth Valley Royal Hospital S08000019 2774
A111H University Hospital Crosshouse S08000015 2690
F705H Victoria Maternity Unit S08000029 2519
S308H St John’s Hospital S08000024 2351
H202H Raigmore Hospital S08000022 1818
Y146H Dumfries & Galloway Royal Infirmary S08000017 1075
B120H Borders General Hospital S08000016 663
W107H Western Isles Hospital S08000028 119
T304H Arbroath Infirmary S08000030 106
N411H Dr Gray’s Hospital S08000020 96
Z102H Gilbert Bain Hospital S08000026 95
T202H Perth Royal Infirmary S08000030 92
N333H Peterhead Community Hospital S08000020 75
N331H Inverurie Hospital S08000020 67
R103H The Balfour S08000025 50
C121H Lorn & Islands Hospital S08000022 11
H212H Belford Hospital S08000022 10
C106H Cowal Community Hospital S08000022 9
H103H Caithness General Hospital S08000022 9
C313H Inverclyde Royal Hospital S08000031 5
C206H Vale of Leven General Hospital S08000031 4
H224H Mid-Argyll Community Hospital and Integrated Care Centre S08000022 3
W108H Uist & Barra Hospital S08000028 1
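
A left join keeps every row of `newborns22` even when a hospital code has no match in `hospitals`, in which case the name and health board come back as `NA`. A quick way to spot such codes is `anti_join()`. The sketch below uses a tiny made-up pair of tables; the code `X999H` is hypothetical.

```r
library(dplyr)
library(tibble)

toy_newborns <- tibble(hospital = c("S314H", "X999H"), babies_2022 = c(5534, 3))
toy_hospitals <- tibble(
  hospital_code = "S314H",
  hospital_name = "Royal Infirmary of Edinburgh at Little France",
  health_board  = "S08000024"
)

# Rows of toy_newborns whose code is missing from toy_hospitals
unmatched <- anti_join(toy_newborns, toy_hospitals,
                       by = c("hospital" = "hospital_code"))
unmatched
```

An empty result would mean every hospital code found a match.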

Number of live births in each NHS Health Board - table

#calculate births for each health board
births_hb<- births_2022 %>%
  group_by(health_board) %>%
  summarise(Newborns = sum(babies_2022)) %>%
  arrange(desc(Newborns))

kable(births_hb, 
      caption = "Live births by Health Boards in 2022") %>%
  kable_styling(latex_options = "striped", font_size = 12)
Live births by Health Boards in 2022
health_board Newborns
S08000031 12950
S08000024 7885
S08000020 4747
S08000032 4042
S08000030 3382
S08000019 2774
S08000015 2690
S08000029 2519
S08000022 1860
S08000017 1075
S08000016 663
S08000028 120
S08000026 95
S08000025 50

Joining our births and hospital data with the spatial data for the NHS Health Board boundaries:

# join the spatial data with the births by health board
births_spatial <- hb_spatial %>%
  left_join(births_hb, by = c("HBCode" = "health_board"))
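
The order of the join matters here: with the `sf` object on the left, the result keeps its geometry column and `sf` class, so `geom_sf()` can draw it. A minimal sketch with two made-up squares standing in for health boards (the codes `A` and `B` and the helper `square()` are hypothetical):

```r
library(sf)
library(dplyr)
library(tibble)

# Helper that builds a unit square starting at x = x0 (illustration only)
square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}

toy_boards <- st_sf(HBCode = c("A", "B"), geometry = st_sfc(square(0), square(2)))
toy_counts <- tibble(health_board = "A", Newborns = 120)

toy_joined <- left_join(toy_boards, toy_counts, by = c("HBCode" = "health_board"))

class(toy_joined)      # still includes "sf"
toy_joined$Newborns    # board "B" has no match, so its count is NA
```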

Plotting the map

baby2022_map <- ggplot(births_spatial, aes(fill = Newborns)) + 
  geom_sf(size = 0.1, color = "#0097a7") + 
  scale_fill_viridis_c(option = "mako", direction = -1) +
    labs(
    title = "Live births in Scotland 2022",
    subtitle = "by Health Boards",
    caption = "Data from: Public Health Scotland & Scottish Government spatial data") + 
  coord_sf() +
  theme_void()
baby2022_map

See more data fun and drawings on the author's website: www.inmaruiz.com

Software and packages used (or not used, but mentioned at least)

R: R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

janitor: Firke S (2021). janitor: Simple Tools for Examining and Cleaning Dirty Data. R package version 2.1.0, https://CRAN.R-project.org/package=janitor.

Tidyverse: Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686 https://doi.org/10.21105/joss.01686.

Knitr: Yihui Xie (2022). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.40.

kableExtra: Zhu H (2021). kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.3.4, https://CRAN.R-project.org/package=kableExtra.

ggplot: H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016

formatR: Xie Y (2023). formatR: Format R Code Automatically. R package version 1.14, https://CRAN.R-project.org/package=formatR.

lubridate: Garrett Grolemund, Hadley Wickham (2011). Dates and Times Made Easy with lubridate. Journal of Statistical Software, 40(3), 1-25. URL https://www.jstatsoft.org/v40/i03/.

rgdal: Bivand R, Keitt T, Rowlingson B (2023). rgdal: Bindings for the ‘Geospatial’ Data Abstraction Library. R package version 1.6-4, https://CRAN.R-project.org/package=rgdal.

sp: Pebesma, E.J., R.S. Bivand, 2005. Classes and methods for spatial data in R. R News 5 (2), https://cran.r-project.org/doc/Rnews/. Roger S. Bivand, Edzer Pebesma, Virgilio Gomez-Rubio, 2013. Applied spatial data analysis with R, Second edition. Springer, NY. https://asdar-book.org/

sf: Pebesma, E., 2018. Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal 10 (1), 439-446, https://doi.org/10.32614/RJ-2018-009.

gridExtra: Auguie B (2017). gridExtra: Miscellaneous Functions for “Grid” Graphics. R package version 2.3, https://CRAN.R-project.org/package=gridExtra.

latticeExtra: Sarkar D, Andrews F (2022). latticeExtra: Extra Graphical Utilities Based on Lattice. R package version 0.6-30, https://CRAN.R-project.org/package=latticeExtra.

cowplot: Wilke C (2020). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 1.1.1, https://CRAN.R-project.org/package=cowplot.

Spatial Data Metadata Portal, Scotland’s catalogue of spatial data.