Skip to contents

This function removes duplicate rows from a data frame based on specified fields, applying a function to handle duplicate values in the dependent variable.

Usage

resolve_duplicates(
  df,
  fields = NULL,
  duplicate_function = mean,
  dependent_variable = "DV",
  na.rm = TRUE
)

Arguments

df

A data frame to remove duplicates from

fields

A character vector of field names to check for duplicates. If NULL, defaults to c("USUBJID", "TIME", "ANALYTE") for NIF data.

duplicate_function

A function to apply to duplicate values. Default is mean. The function should take a vector and return a single value.

dependent_variable

The name of the field to apply the duplicate_function to. Defaults to "DV".

na.rm

Logical indicating whether to remove NA values when applying the duplicate_function. Defaults to TRUE.

Value

A data frame with duplicate rows removed