In this vignette, we’ll demonstrate how DemoKin can be used to compute kinship networks for an average member of a given (female) population. Let us call her Focal: an average Swedish woman who has always lived in Sweden and whose family has never left the country. Here, we’ll show how DemoKin can be used to compute the number and age distribution of Focal’s relatives under a range of assumptions, including living and deceased kin.

1. Kin counts with time-invariant rates

First, we compute kin counts in a time-invariant framework. We assume that Focal and all of her relatives experience the 2015 mortality and fertility rates throughout their entire lives (Caswell 2019). The DemoKin package includes data from Sweden as an example: age-by-year matrices of survival probabilities (swe_px), survival ratios (swe_Sx), fertility rates (swe_asfr), and population numbers (swe_pop). You can see the data contained in DemoKin with data(package="DemoKin"). This data comes from the Human Mortality Database and Human Fertility Database (see ?DemoKin::get_HMDHFD).

To implement the time-invariant model, the function DemoKin::kin expects a vector of survival probabilities and a vector of fertility rates, both indexed by age. In this example, we extract the data for the year 2015 and run the matrix models:

library(DemoKin)
library(tidyr)
library(dplyr)
library(ggplot2)
library(knitr)
# First, get vectors for a given year
swe_surv_2015 <- swe_px[,"2015"]
swe_asfr_2015 <- swe_asfr[,"2015"]
# Run kinship models
swe_2015 <- kin(p = swe_surv_2015, f = swe_asfr_2015, time_invariant = TRUE)
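
The column extraction above relies on the example matrices having ages in rows and years in columns, with the years as column names. A minimal sketch with a toy matrix (toy_px and its values are hypothetical, not DemoKin data) shows what this kind of indexing returns:

```r
# Toy age-by-year matrix of survival probabilities (values are made up)
toy_px <- matrix(
  c(0.997, 0.998,
    0.999, 0.999,
    0.990, 0.992),
  nrow = 3, byrow = TRUE,
  dimnames = list(age = c("0", "1", "2"), year = c("2014", "2015"))
)

# Selecting a column by its year label returns one value per age
toy_px_2015 <- toy_px[, "2015"]
toy_px_2015

# The resulting vector has as many entries as there are ages
length(toy_px_2015) == nrow(toy_px)
```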

1.1. Value

DemoKin::kin() returns a list containing two data frames: kin_full and kin_summary.

kin_full contains expected kin counts by year (or cohort), age of Focal, and age of kin. Note that the columns year and cohort are NA when kin is called with time_invariant = TRUE (as in this example).

head(swe_2015$kin_full)
## # A tibble: 6 × 7
##   kin   age_kin age_focal living  dead cohort year 
##   <chr>   <int>     <int>  <dbl> <dbl> <lgl>  <lgl>
## 1 d           0         0      0     0 NA     NA   
## 2 d           0         1      0     0 NA     NA   
## 3 d           0         2      0     0 NA     NA   
## 4 d           0         3      0     0 NA     NA   
## 5 d           0         4      0     0 NA     NA   
## 6 d           0         5      0     0 NA     NA

kin_summary is a ‘summary’ data frame derived from kin_full.

head(swe_2015$kin_summary)
## # A tibble: 6 × 10
##   age_focal kin   year  cohort count_living mean_age sd_age count_dead
##       <int> <chr> <lgl> <lgl>         <dbl>    <dbl>  <dbl>      <dbl>
## 1         0 coa   NA    NA           0.275      8.32   6.14  0.0000633
## 2         0 cya   NA    NA           0.0898     4.05   3.68  0.0000370
## 3         0 d     NA    NA           0        NaN    NaN     0        
## 4         0 gd    NA    NA           0        NaN    NaN     0        
## 5         0 ggd   NA    NA           0        NaN    NaN     0        
## 6         0 ggm   NA    NA           0.320     84.4    6.43  0.0287   
## # ℹ 2 more variables: count_cum_dead <dbl>, mean_age_lost <dbl>

It is obtained by summing over all ages of kin, which gives expected kin counts by year or cohort and age of Focal (but not by age of kin). Consider this simplified example for living kin counts:

kin_summary_example <- 
  swe_2015$kin_full %>% 
  select(year, cohort, kin, age_focal, age_kin, living, dead) %>% 
  group_by(year, cohort, kin, age_focal) %>% 
  summarise(count_living = sum(living)) 

head(kin_summary_example)
## # A tibble: 6 × 5
## # Groups:   year, cohort, kin [1]
##   year  cohort kin   age_focal count_living
##   <lgl> <lgl>  <chr>     <int>        <dbl>
## 1 NA    NA     coa           0        0.275
## 2 NA    NA     coa           1        0.291
## 3 NA    NA     coa           2        0.305
## 4 NA    NA     coa           3        0.318
## 5 NA    NA     coa           4        0.330
## 6 NA    NA     coa           5        0.341
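
The remaining columns of kin_summary can be read in the same spirit. As a rough sketch, assuming that mean_age and sd_age are the mean and standard deviation of age_kin weighted by the expected number of living kin (an assumption about how the summary is defined, not something stated in this vignette), the calculation for one kin type at one age of Focal would look as follows. The data frame toy_kin and its values are made up for illustration:

```r
# Toy slice of a kin_full-like table: one kin type, one age of Focal
# (values are made up)
toy_kin <- data.frame(
  age_kin = c(60, 61, 62, 63),
  living  = c(0.10, 0.30, 0.40, 0.20)
)

# Expected number of living kin: sum over all ages of kin
count_living <- sum(toy_kin$living)

# Mean and standard deviation of kin age, weighted by expected counts
mean_age <- sum(toy_kin$living * toy_kin$age_kin) / count_living
sd_age <- sqrt(
  sum(toy_kin$living * toy_kin$age_kin^2) / count_living - mean_age^2
)

c(count_living = count_living, mean_age = mean_age, sd_age = sd_age)
```

Under this assumption, a kin type with no expected living members leads to a division of zero by zero, which would explain the NaN entries for d, gd, and ggd at age 0 in the output above.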

1.2. Visualizing the distribution of kin

Let us now visualize the distribution of relatives over Focal’s lifecourse using the summary data.frame kin_summary:

swe_2015[["kin_summary"]] %>%
  ggplot() +
  geom_line(aes(age_focal, count_living)) +
  theme_bw() +
  labs(y = "Expected number of living relatives") +
  facet_wrap(~kin)

Here, each relative type is identified by a unique code. Note that DemoKin uses different codes than Caswell (2019); the equivalence between the two sets of codes is given in the following table:

| DemoKin | Caswell | Labels_female | Labels_male | Labels_2sex |
|:--------|:--------|:--------------|:------------|:------------|
| coa | t | Cousins from older aunts | Cousins from older uncles | Cousins from older aunts/uncles |
| cya | v | Cousins from younger aunts | Cousins from younger uncles | Cousins from younger aunts/uncles |
| c | NA | Cousins | Cousins | Cousins |
| d | a | Daughters | Sons | Children |
| gd | b | Grand-daughters | Grand-sons | Grand-children |
| ggd | c | Great-grand-daughters | Great-grand-sons | Great-grand-children |
| ggm | h | Great-grandmothers | Great-grandfathers | Great-grandparents |
| gm | g | Grandmothers | Grandfathers | Grandparents |
| m | d | Mother | Father | Parents |
| nos | p | Nieces from older sisters | Nephews from older brothers | Niblings from older siblings |
| nys | q | Nieces from younger sisters | Nephews from younger brothers | Niblings from younger siblings |
| n | NA | Nieces | Nephews | Niblings |
| oa | r | Aunts older than mother | Uncles older than father | Aunts/Uncles older than parents |
| ya | s | Aunts younger than mother | Uncles younger than father | Aunts/Uncles younger than parents |
| a | NA | Aunts | Uncles | Aunts/Uncles |
| os | m | Older sisters | Older brothers | Older siblings |
| ys | n | Younger sisters | Younger brothers | Younger siblings |
| s | NA | Sisters | Brothers | Siblings |
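
To illustrate how these codes map onto readable labels, here is a minimal sketch that does the relabeling by hand with a named lookup vector built from a few rows of the table. The objects female_labels and toy_counts are hypothetical and the counts are made up; it is not meant to reproduce what DemoKin does internally.

```r
# Lookup from DemoKin codes to female labels (a subset of the table)
female_labels <- c(
  m  = "Mother",
  gm = "Grandmothers",
  d  = "Daughters",
  os = "Older sisters",
  ys = "Younger sisters"
)

# Toy kin counts keyed by DemoKin code (values are made up)
toy_counts <- data.frame(
  kin   = c("d", "m", "ys"),
  count = c(0.7, 0.9, 0.5),
  stringsAsFactors = FALSE
)

# Attach the readable label that corresponds to each code
toy_counts$kin_label <- unname(female_labels[toy_counts$kin])
toy_counts
```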

We can also visualize the age distribution of relatives when Focal is 35 years old. Relatives are again identified by their DemoKin codes; the function DemoKin::rename_kin() can be used to identify each relative type by its full name:

swe_2015[["kin_full"]] %>%
  filter(age_focal == 35) %>% 
  ggplot() +
  geom_line(aes(age_kin, living))  +
  geom_vline(xintercept = 35, color=2) +
  labs(y = "Expected number of living relatives") +
  theme_bw() +
  facet_wrap(~kin)

The one-sex model implemented in DemoKin assumes that the given fertility input applies to both sexes. Note that, if using survival ratios (\(S_x\)) instead of probabilities (\(p_x\)), fertility vectors should account for female person-year exposure, using \(\left(\frac{f_x + f_{x+1} S_x}{2}\right)\frac{L_0}{l_0}\) instead of only \(f_x\); see Preston et al. (2001).
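
As a worked illustration of this adjustment, the following sketch applies the formula to made-up vectors. The function adjust_fertility, its arguments, and all input values are hypothetical, and the fertility rate beyond the last age is assumed to be zero:

```r
# Adjust age-specific fertility for person-year exposure:
# ((f_x + f_{x+1} * S_x) / 2) * L_0 / l_0
# Assumption: fertility beyond the last age in fx is zero
adjust_fertility <- function(fx, Sx, L0_over_l0) {
  stopifnot(length(fx) == length(Sx))
  fx_next <- c(fx[-1], 0)  # f_{x+1}, padded with zero at the last age
  (fx + fx_next * Sx) / 2 * L0_over_l0
}

# Toy inputs (values are made up)
fx <- c(0, 0.05, 0.10, 0.04)
Sx <- c(0.999, 0.998, 0.997, 0.996)

fx_adjusted <- adjust_fertility(fx, Sx, L0_over_l0 = 0.997)
round(fx_adjusted, 4)
```

Note that the adjusted vector assigns some fertility to the age just before childbearing starts, because it averages the rate at age x with the survival-weighted rate at age x + 1.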

The kin function also returns a summary output with the count of living kin and the mean and standard deviation of kin age, by type of kin, for each age of Focal:

swe_2015[["kin_summary"]] %>% 
  filter(age_focal == 35) %>% 
  select(kin, count_living, mean_age, sd_age) %>% 
  mutate_if(is.numeric, round, 2) %>% 
  kable()
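| kin | count_living | mean_age | sd_age |
|:----|-------------:|---------:|-------:|
| coa | 0.38 | 39.23 | 8.41 |
| cya | 0.42 | 27.59 | 8.47 |
| d   | 0.70 | 5.52  | 3.80 |
| gd  | 0.00 | 0.42  | 0.71 |
| ggd | 0.00 | NaN   | NaN  |
| ggm | 0.00 | 96.90 | 2.60 |
| gm  | 0.18 | 88.62 | 5.02 |
| m   | 0.93 | 65.36 | 5.10 |
| nos | 0.36 | 9.72  | 5.93 |
| nys | 0.16 | 3.80  | 3.19 |
| oa  | 0.37 | 70.04 | 6.32 |
| os  | 0.43 | 40.18 | 4.25 |
| ya  | 0.45 | 58.87 | 6.75 |
| ys  | 0.47 | 28.49 | 4.42 |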

Finally, we can visualize the estimated kin counts by type of kin using a network diagram. Continuing with age 35:

swe_2015[["kin_summary"]] %>% 
  filter(age_focal == 35) %>% 
  select(kin, count = count_living) %>% 
  plot_diagram(rounding = 2)