In this vignette, we’ll demonstrate how DemoKin can be used to compute
kinship networks for an average member of a given (female) population.
Let us call her Focal: an average Swedish woman who has always lived in
Sweden and whose family has never left the country. We’ll compute the
number and age distribution of Focal’s relatives, both living and
deceased, under a range of assumptions.
First, we compute kin counts in a time-invariant framework. We assume
that Focal and all of her relatives experience the 2015 mortality and
fertility rates throughout their entire lives (Caswell 2019). The
DemoKin package includes data from Sweden as an example: age-by-year
matrices of survival probabilities (swe_px), survival ratios (swe_Sx),
fertility rates (swe_asfr), and population numbers (swe_pop). You can
list the datasets included in DemoKin with data(package="DemoKin").
These data come from the Human Mortality Database and the Human
Fertility Database (see ?DemoKin::get_HMDHFD).
To implement the time-invariant model, the function DemoKin::kin
expects a vector of survival probabilities and a vector of fertility
rates. In this example, we extract both vectors for the year 2015 and
run the matrix models:
library(DemoKin)
library(tidyr)
library(dplyr)
library(ggplot2)
library(knitr)
# First, get vectors for a given year
swe_surv_2015 <- swe_px[,"2015"]
swe_asfr_2015 <- swe_asfr[,"2015"]
# Run kinship models
swe_2015 <- kin(p = swe_surv_2015, f = swe_asfr_2015, time_invariant = TRUE)
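To give a sense of what "running the matrix models" involves, here is a minimal base-R sketch of a simplified version of the recursion for daughters described in Caswell (2019): the daughters alive at Focal's next age are the survivors of the current daughters plus Focal's new births. The rates below are toy values, and the names p, f, U, F_mat and daughters are illustrative assumptions; this is not DemoKin's implementation, and it ignores refinements such as infant survival within the birth interval.

```r
# Toy rates for five age classes (0-4); illustrative values, not Swedish data
p <- c(0.9, 0.9, 0.9, 0.9, 0)   # survival probability from age x to x + 1
f <- c(0, 0, 0.5, 0.5, 0)       # daughters born per woman at age x
n <- length(p)

# U moves survivors down the subdiagonal; F_mat puts births in the first age class
U <- matrix(0, n, n)
U[cbind(2:n, 1:(n - 1))] <- p[-n]
F_mat <- matrix(0, n, n)
F_mat[1, ] <- f

# Column j holds the age distribution of daughters when Focal is aged j - 1
daughters <- matrix(0, n, n)
for (x in 1:(n - 1)) {
  e_x <- replace(numeric(n), x, 1)   # Focal sits in age class x - 1 (index x)
  daughters[, x + 1] <- U %*% daughters[, x] + F_mat %*% e_x
}

# Expected number of living daughters at each age of Focal
colSums(daughters)
```

Summing each column gives the expected number of living daughters by age of Focal, which is the kind of quantity reported for the kin type d in the output below.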
DemoKin::kin() returns a list containing two data frames: kin_full and
kin_summary. kin_full contains expected kin counts by year (or cohort),
age of Focal and age of kin. Note that the columns year and cohort are
empty (NA) when kin is called with time_invariant = TRUE, as in this
example.
head(swe_2015$kin_full)
## # A tibble: 6 × 7
##   kin   age_kin age_focal living  dead cohort year
##   <chr>   <int>     <int>  <dbl> <dbl> <lgl>  <lgl>
## 1 d           0         0      0     0 NA     NA
## 2 d           0         1      0     0 NA     NA
## 3 d           0         2      0     0 NA     NA
## 4 d           0         3      0     0 NA     NA
## 5 d           0         4      0     0 NA     NA
## 6 d           0         5      0     0 NA     NA
kin_summary is a ‘summary’ data frame derived from kin_full.
head(swe_2015$kin_summary)
## # A tibble: 6 × 10
##   age_focal kin   year  cohort count_living mean_age sd_age count_dead
##       <int> <chr> <lgl> <lgl>         <dbl>    <dbl>  <dbl>      <dbl>
## 1         0 coa   NA    NA            0.275     8.32   6.14  0.0000633
## 2         0 cya   NA    NA           0.0898     4.05   3.68  0.0000370
## 3         0 d     NA    NA                0      NaN    NaN          0
## 4         0 gd    NA    NA                0      NaN    NaN          0
## 5         0 ggd   NA    NA                0      NaN    NaN          0
## 6         0 ggm   NA    NA            0.320     84.4   6.43     0.0287
## # ℹ 2 more variables: count_cum_dead <dbl>, mean_age_lost <dbl>
To produce it, we sum over all ages of kin, which gives a data frame of expected kin counts by year or cohort and age of Focal (but not by age of kin). Consider this simplified example for living kin counts:
kin_summary_example <-
  swe_2015$kin_full %>%
  select(year, cohort, kin, age_focal, age_kin, living, dead) %>%
  group_by(year, cohort, kin, age_focal) %>%
  summarise(count_living = sum(living))
head(kin_summary_example)
## # A tibble: 6 × 5
## # Groups:   year, cohort, kin [1]
##   year  cohort kin   age_focal count_living
##   <lgl> <lgl>  <chr>     <int>        <dbl>
## 1 NA    NA     coa           0        0.275
## 2 NA    NA     coa           1        0.291
## 3 NA    NA     coa           2        0.305
## 4 NA    NA     coa           3        0.318
## 5 NA    NA     coa           4        0.330
## 6 NA    NA     coa           5        0.341
Let us now visualize the distribution of relatives over Focal’s
lifecourse using the summary data.frame kin_summary:
swe_2015[["kin_summary"]] %>%
  ggplot() +
  geom_line(aes(age_focal, count_living)) +
  theme_bw() +
  labs(y = "Expected number of living relatives") +
  facet_wrap(~kin)
Here, each relative type is identified by a unique code. Note that
DemoKin uses different codes than Caswell (2019); the equivalence
between the two sets of codes is given in the following table:
DemoKin | Caswell | Labels_female | Labels_male | Labels_2sex |
---|---|---|---|---|
coa | t | Cousins from older aunts | Cousins from older uncles | Cousins from older aunts/uncles |
cya | v | Cousins from younger aunts | Cousins from younger uncles | Cousins from younger aunts/uncles |
c | NA | Cousins | Cousins | Cousins |
d | a | Daughters | Sons | Children |
gd | b | Grand-daughters | Grand-sons | Grandchildren |
ggd | c | Great-grand-daughters | Great-grand-sons | Great-grandchildren |
ggm | h | Great-grandmothers | Great-grandfathers | Great-grandparents |
gm | g | Grandmothers | Grandfathers | Grandparents |
m | d | Mother | Father | Parents |
nos | p | Nieces from older sisters | Nephews from older brothers | Niblings from older siblings |
nys | q | Nieces from younger sisters | Nephews from younger brothers | Niblings from younger siblings |
n | NA | Nieces | Nephews | Niblings |
oa | r | Aunts older than mother | Uncles older than father | Aunts/Uncles older than parents |
ya | s | Aunts younger than mother | Uncles younger than father | Aunts/Uncles younger than parents |
a | NA | Aunts | Uncles | Aunts/Uncles |
os | m | Older sisters | Older brothers | Older siblings |
ys | n | Younger sisters | Younger brothers | Younger siblings |
s | NA | Sisters | Brothers | Siblings |
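If you only need to relabel a handful of codes by hand, a named character vector works as a lookup table. The sketch below is a base-R illustration that copies a few female labels from the table above; kin_labels, codes and labels are hypothetical names, and codes missing from the lookup are left unchanged.

```r
# Hand-written lookup for a few DemoKin codes (female labels from the table above)
kin_labels <- c(
  d  = "Daughters",
  m  = "Mother",
  gm = "Grandmothers",
  os = "Older sisters",
  ys = "Younger sisters"
)

# Codes as they would appear in a kin column
codes <- c("m", "d", "ys", "coa")

# Look up each code; codes missing from the lookup come back as NA
labels <- unname(kin_labels[codes])

# Fall back to the original code where no label was found
missing <- is.na(labels)
labels[missing] <- codes[missing]
labels
```

For the full set of labels, the package's own table and DemoKin::rename_kin() remain the reference.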
We can also visualize the age distribution of relatives when Focal is
35 years old (now, with full names to identify each relative type using
the function DemoKin::rename_kin()):
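swe_2015[["kin_full"]] %>%
  # Attach full kin names (the column that holds them may depend on the DemoKin version)
  rename_kin() %>%
  filter(age_focal == 35) %>%
  ggplot() +
  geom_line(aes(age_kin, living)) +
  # Vertical line marks Focal's own age
  geom_vline(xintercept = 35, color = 2) +
  labs(y = "Expected number of living relatives") +
  theme_bw() +
  facet_wrap(~kin)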
The one-sex model implemented in DemoKin assumes that the given
fertility input applies to both sexes. Note that, if using survival
ratios (\(S_x\)) instead of probabilities (\(p_x\)), fertility vectors
should account for female person-year exposure, using
\(\left(\frac{f_x+f_{x+1}S_x}{2}\right)\frac{L_0}{l_0}\) instead of
only \(f_x\) (see Preston et al. 2001).
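As an illustration of the exposure adjustment above, the following base-R sketch applies the formula to toy vectors. All values and names (f, S, L0, l0, f_adj) are made up for the example, and setting \(f_{x+1}\) to zero beyond the last age class is an assumption of this sketch.

```r
# Toy inputs; illustrative values, not Swedish data
f  <- c(0, 0.2, 0.4, 0.1)      # age-specific fertility rates f_x
S  <- c(0.99, 0.98, 0.97, 0)   # survival ratios S_x
L0 <- 0.995                    # life-table person-years lived in the first age class
l0 <- 1                        # life-table radix

# f_{x+1}: shift fertility by one age; assumed to be 0 after the last age class
f_next <- c(f[-1], 0)

# Adjusted fertility: ((f_x + f_{x+1} * S_x) / 2) * (L_0 / l_0)
f_adj <- (f + f_next * S) / 2 * L0 / l0
f_adj
```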
The kin function also includes a summary output with the count of
living kin and the mean and standard deviation of kin age, by type of
kin, for each age of Focal:
swe_2015[["kin_summary"]] %>%
  filter(age_focal == 35) %>%
  select(kin, count_living, mean_age, sd_age) %>%
  mutate_if(is.numeric, round, 2) %>%
  kable()
kin | count_living | mean_age | sd_age |
---|---|---|---|
coa | 0.38 | 39.23 | 8.41 |
cya | 0.42 | 27.59 | 8.47 |
d | 0.70 | 5.52 | 3.80 |
gd | 0.00 | 0.42 | 0.71 |
ggd | 0.00 | NaN | NaN |
ggm | 0.00 | 96.90 | 2.60 |
gm | 0.18 | 88.62 | 5.02 |
m | 0.93 | 65.36 | 5.10 |
nos | 0.36 | 9.72 | 5.93 |
nys | 0.16 | 3.80 | 3.19 |
oa | 0.37 | 70.04 | 6.32 |
os | 0.43 | 40.18 | 4.25 |
ya | 0.45 | 58.87 | 6.75 |
ys | 0.47 | 28.49 | 4.42 |
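To make the summary columns concrete, here is a small base-R sketch of how a count, mean age and standard deviation of age can be computed from counts by age of kin. It assumes count-weighted moments, which is one natural reading of these columns rather than a statement about DemoKin's internals; toy_kin_full and its values are hypothetical.

```r
# Toy slice in the layout of kin_full: one kin type at one age of Focal
toy_kin_full <- data.frame(
  kin       = "d",
  age_focal = 35,
  age_kin   = 0:3,
  living    = c(0.1, 0.2, 0.3, 0.4)
)

# Expected number of living kin: sum over ages of kin
count_living <- sum(toy_kin_full$living)

# Count-weighted mean age of kin
mean_age <- sum(toy_kin_full$age_kin * toy_kin_full$living) / count_living

# Count-weighted standard deviation of age of kin
sd_age <- sqrt(
  sum(toy_kin_full$age_kin^2 * toy_kin_full$living) / count_living - mean_age^2
)

c(count_living = count_living, mean_age = mean_age, sd_age = sd_age)
```

When the expected count is zero, the weighted mean divides zero by zero, which is consistent with the NaN entries shown for ggd in the table above.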
Finally, we can visualize the estimated kin counts by type of kin using a network diagram. Continuing with Focal at age 35:
swe_2015[["kin_summary"]] %>%
  filter(age_focal == 35) %>%
  select(kin, count = count_living) %>%
  plot_diagram(rounding = 2)