Article
Confidence bands for a distribution function with merged data from multiple sources
We consider nonparametric estimation of a distribution function when data are collected from multiple overlapping data sources. The main statistical challenges include (1) heterogeneity of data sets, (2) unidentified duplicated records across data sets, and (3) dependence due to sampling without replacement from a data source. The proposed estimator is computable without identifying duplication but corrects bias from duplicated records. We show the uniform consistency of the proposed estimator over the real line and its weak convergence to a Gaussian process. Based on these asymptotic properties, we propose a simulation-based confidence band that enjoys asymptotically correct coverage probability. The finite sample performance is evaluated through a simulation study. A Wilms tumor example is provided.
- Language
-
English
- Bibliographic citation
-
Journal: Statistics in Transition New Series ; ISSN: 2450-0291 ; Volume: 21 ; Year: 2020 ; Issue: 4 ; Pages: 144-158 ; New York: Exeley
- Subject
-
confidence band
data integration
Gaussian process
- Event
-
Intellectual creation
- (who)
-
Saegusa, Takumi
- Event
-
Publication
- (who)
-
Exeley
- (where)
-
New York
- (when)
-
2020
- DOI
-
doi:10.21307/stattrans-2020-035
- Handle
- Last update
-
10.03.2025, 11:44 AM CET
Data provider
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have any questions about the object, please contact the data provider.
Object type
- Article
Associated
- Saegusa, Takumi
- Exeley
Time of origin
- 2020