This function removes duplicate records and ensures that the most recent record of each duplicate pair is kept. When a peer-reviewed publication coexists with a preprint or a conference proceeding, the peer-reviewed version is kept.

Usage

dup_rm_pairwise(ls_df, id_dup_pair, to_dataframe = TRUE)

Arguments

ls_df

A list of data frames containing the partitioned dataset (i.e., output #1 of simi_ptn_pair()).

id_dup_pair

A data frame listing the record ID and partition ID of each duplicate pair after checked duplicates have been resolved (i.e., output of dup_resolve_pairwise()).

to_dataframe

Logical: should the list of data frames be merged into a single data frame? Defaults to TRUE.

Value

The input ls_df with duplicates removed. If to_dataframe = TRUE, the resulting list of data frames is merged into a single data frame; otherwise, a list of data frames is returned.

Examples

if (FALSE) {
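# Remove duplicates and merge the result into a single data frame
df_2 <- dup_rm_pairwise(ls_df, id_dup_pair, to_dataframe = TRUE)
# or keep the result as a list of data frames
ls_df_2 <- dup_rm_pairwise(ls_df, id_dup_pair, to_dataframe = FALSE)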
}
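
The difference between the two return shapes can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy data, the id and year columns, and the ids_to_drop vector are hypothetical stand-ins for the partitioned dataset and for the records that would be discarded from each duplicate pair.

```r
# Hypothetical partitioned dataset: a list of data frames with the same columns.
ls_df <- list(
  data.frame(id = c(1, 2), year = c(2019, 2021)),
  data.frame(id = c(3, 4), year = c(2020, 2020))
)

# Hypothetical ids of the records to discard (one from each duplicate pair).
ids_to_drop <- c(1, 4)

# Remove the flagged rows from every partition; the result is still a list
# of data frames (analogous to to_dataframe = FALSE).
ls_kept <- lapply(ls_df, function(df) {
  df[!(df$id %in% ids_to_drop), , drop = FALSE]
})

# Bind the partitions into a single data frame
# (analogous to to_dataframe = TRUE).
df_kept <- do.call(rbind, ls_kept)
```

Here ls_kept keeps the partition structure, while df_kept holds the same remaining rows in one data frame.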