Estimating the empirical Lorenz curve and Gini coefficient in the presence of error with nested data.

Overview

abstract

  • The Lorenz curve is a graphical tool that is widely used to characterize the concentration of a measure in a population, such as wealth. It is frequently the case that the measure of interest used to rank experimental units when estimating the empirical Lorenz curve, and the corresponding Gini coefficient, is subject to random error. This error can result in an incorrect ranking of experimental units, which inevitably leads to a curve that exaggerates the degree of concentration (variation) in the population. We consider a specific data configuration with a hierarchical structure where multiple observations are aggregated within experimental units to form the outcome whose distribution is of interest. Within this context, we explore this bias and discuss several widely available statistical methods that have the potential to reduce or remove the bias in the empirical Lorenz curve. The properties of these methods are examined and compared in a simulation study. This work is motivated by a health outcomes application that seeks to assess the concentration of black patient visits among primary care physicians. The methods are illustrated on data from this study.
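
The bias described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or one of the correction methods they compare. The function names (`lorenz_curve`, `gini`, `simulate_observed`), the binomial error model, and the parameter values (200 units, 20 observations per unit, a common true rate of 0.3) are all assumptions chosen for illustration. Every unit is given the same underlying rate, so the true concentration is zero, yet the Lorenz curve built from the noisy unit-level proportions still shows apparent concentration.

```python
import random


def lorenz_curve(values):
    """Empirical Lorenz curve as (cumulative unit share, cumulative value share) points."""
    ordered = sorted(values)          # rank units from smallest to largest value
    total = sum(ordered)
    if total == 0:
        raise ValueError("Lorenz curve is undefined when all values are zero")
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / n, running / total))
    return points


def gini(values):
    """Gini coefficient: 1 minus twice the trapezoidal area under the Lorenz curve."""
    pts = lorenz_curve(values)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return 1.0 - 2.0 * area


def simulate_observed(true_rates, obs_per_unit, rng):
    """Assumed error model: each unit's observed rate is a binomial proportion
    from obs_per_unit nested binary observations with that unit's true rate."""
    observed = []
    for p in true_rates:
        hits = sum(1 for _ in range(obs_per_unit) if rng.random() < p)
        observed.append(hits / obs_per_unit)
    return observed


rng = random.Random(0)               # fixed seed so the run is repeatable
true_rates = [0.3] * 200             # hypothetical: every unit has the same true rate
observed_rates = simulate_observed(true_rates, 20, rng)

gini_true = gini(true_rates)         # no concentration in the true rates
gini_observed = gini(observed_rates) # positive purely because of sampling error

print(f"Gini of true rates:     {gini_true:.3f}")
print(f"Gini of observed rates: {gini_observed:.3f}")
```

Increasing `obs_per_unit` in this sketch shrinks the within-unit error, so the observed Gini coefficient should move toward the true value. That is one informal way to see why the number of nested observations per unit matters for the size of the bias.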

publication date

  • July 20, 2008

Research

keywords

  • Black People
  • Models, Statistical
  • Primary Health Care

Identity

PubMed Central ID

  • PMC3465674

Scopus Document Identifier

  • 48049100348

Digital Object Identifier (DOI)

  • 10.1002/sim.3151

PubMed ID

  • 18172873

Additional Document Info

volume

  • 27

issue

  • 16