This paper investigates inference on the coefficients in the linear projection of an outcome variable y on covariates (x,z) when data are available from two independent random samples: the first sample contains information only on the variables (y,z), while the second contains information only on the covariates (x,z). In this context, the validity of existing inference procedures depends crucially on the assumptions imposed on the joint distribution of (y,z,x). This paper introduces a novel characterization of the identified set of the coefficients of interest when no assumption on this joint distribution is imposed beyond the existence of second moments. One finding is that inference is necessarily nonstandard because the function characterizing the identified set is a nondifferentiable (yet directionally differentiable) function of the data. The paper then introduces an estimator and a confidence interval based on the directional differential of the function characterizing the identified set. Monte Carlo experiments explore the numerical performance of the proposed estimator and confidence interval.
- Least Squares Projection
- Data Combination