Research

"Though this be madness, yet there is method in't." — Hamlet (II, ii, 206)

Financial support from the European Research Council through ERC Starting Grant 338187; the United Kingdom Research and Innovation Frontier Research Grant; the Economic and Social Research Council; the British Academy; and the National Science Foundation is gratefully acknowledged.
ORCID: 0000-0003-3611-3448

I. Publications


Leveraging Subjective Expectations for Production Functions

with Steve Bond, Agnes Norris Keiller and John Van Reenen

American Economic Association Papers and Proceedings, 2026

BibTeX

We propose estimating production functions using firms' subjective expectations of future output and inputs, data that are becoming increasingly available in surveys. This note compares their proposed estimator to traditional dynamic panel data (e.g., Blundell and Bond, 2000) and proxy variable methods (e.g., Olley and Pakes, 1996). While NPR allows for nonlinear productivity processes, we discuss commonalities with the former when those processes are linear. We note that NPR may be more robust to oligopolistic competition than the latter since it does not employ input demand relations to proxy for productivity.

@article{depaula2026leveraging,
  author = {de Paula, \'{A}ureo and Bond, Steve and Norris Keiller, Agnes and Van Reenen, John},
  title = {Leveraging Subjective Expectations for Production Functions},
  journal = {American Economic Association Papers and Proceedings},
  year = {2026}
}

VOG: Using Volcanic Eruptions to Estimate the Impact of Air Pollution on Student Learning Outcomes

with T. Halliday, L. Lusher and R. Inafuku

Oxford Bulletin of Economics and Statistics, 88(1), February 2026, pp.113-140

Featured in: Star Advertiser, KITV, Hawaii Tribune Herald, Hawaii Public Radio

BibTeX

We pair variation stemming from volcanic eruptions with the census of Hawaii's public schools' student test scores to estimate the impact of PM2.5 and SO2 on student performance. Increased particulate pollution decreases test scores. These results are concentrated among schools with the highest long-term average levels of pollution. The effects of PM2.5 are larger for the poorest pupils by a factor of at least three. We demonstrate that poor air quality disproportionately impacts the human capital accumulation of economically disadvantaged children.

@article{depaula2026vog,
  author = {de Paula, \'{A}ureo and Halliday, Timothy and Lusher, Lester and Inafuku, Rachel},
  title = {{VOG}: Using Volcanic Eruptions to Estimate the Impact of Air Pollution on Student Learning Outcomes},
  journal = {Oxford Bulletin of Economics and Statistics},
  volume = {88},
  number = {1},
  pages = {113--140},
  year = {2026}
}

Intergenerational Mobility in Socio-Emotional Skills

with O. Attanasio and A. Toppeta

Journal of Public Economics, 248, August 2025, 105423

BibTeX

This paper investigates the intergenerational transmission of socio-emotional skills during childhood, using data from the 1970 British Cohort Study (BCS70) in the United Kingdom. This dataset enables us to measure two dimensions of socio-emotional development: internalising and externalising skills. More importantly, we can use multiple measures of parents’ skills collected during both their childhood and their adulthood. Whereas parent–child skills are strongly related when both are measured contemporaneously, they remain correlated when both are measured in childhood, with a stronger transmission observed from mothers to their children. The BCS70 data finally enable us to estimate the correlation between the grandmother’s internalising skill and the grandchildren’s skills, after accounting for parental skills.

@article{depaula2025intergenerational,
  author = {de Paula, \'{A}ureo and Attanasio, Orazio and Toppeta, Alessandro},
  title = {Intergenerational Mobility in Socio-Emotional Skills},
  journal = {Journal of Public Economics},
  volume = {248},
  pages = {105423},
  year = {2025}
}

Identifying Network Ties from Panel Data: Theory and an Application to Tax Competition

with I. Rasul and P. Souza

Review of Economic Studies, 92(4), July 2025, pp.2691-2729

Replication files · Errata · GitHub repository for code

BibTeX

Social interactions determine many economic behaviors, but information on social ties does not exist in most publicly available and widely used datasets. We present results on the identification of social networks from observational panel data that contains no information on social ties between agents. In the context of a canonical social interactions model, we provide sufficient conditions under which the social interactions matrix, endogenous and exogenous social effect parameters are globally identified. We then describe how high-dimensional estimation techniques can be used to estimate the interactions model based on the Adaptive Elastic Net Generalized Method of Moments. We employ the method to study tax competition across US states.

@article{depaula2025network,
  author = {de Paula, \'{A}ureo and Rasul, Imran and Souza, Pedro},
  title = {Identifying Network Ties from Panel Data: Theory and an Application to Tax Competition},
  journal = {Review of Economic Studies},
  volume = {92},
  number = {4},
  pages = {2691--2729},
  year = {2025}
}

Subjective Expectations and Demand for Contraception

with C. Valente and G. Miller

Journal of Econometrics, 249, Part B, 105997, 2025

Featured in: PolicyBristol Briefing 82 · Best Paper Prize, Essen Health Conference

BibTeX

One-quarter of married, fertile-age women in Sub-Saharan Africa report not wanting a pregnancy and yet do not practice contraception. We collect detailed data on the subjective beliefs of married, adult women not wanting a pregnancy and estimate a structural model of contraceptive choices. Both our structural model and a validation exercise using an exogenous shock to beliefs show that correcting women’s beliefs about pregnancy risk absent contraception can increase use considerably. Our structural estimates further indicate that costly interventions like eliminating supply constraints would only modestly increase contraceptive use, while confirming the importance of partners’ preferences highlighted in related literature.

@article{depaula2025contraception,
  author = {de Paula, \'{A}ureo and Valente, Christine and Miller, Grant},
  title = {Subjective Expectations and Demand for Contraception},
  journal = {Journal of Econometrics},
  volume = {249},
  number = {Part B},
  pages = {105997},
  year = {2025}
}

Are Self-Reported Fertility Preferences Biased? Evidence from Indirect Elicitation Methods

with C. Valente, W. Q. Toh, I. Jalingo, A. Lepine and G. Miller

Proceedings of the National Academy of Sciences, 121 (34) e2407629121, 2024

BibTeX

Desired fertility measures are routinely collected and used by researchers and policy makers, but their self-reported nature raises the possibility of reporting bias. In this paper, we test for the presence of such bias by comparing responses to direct survey questions with indirect questions offering a varying, randomized, degree of confidentiality to respondents in a socioeconomically diverse sample of Nigerian women (N=6,256). We find that women report higher fertility preferences when asked indirectly, but only when their responses afford them complete confidentiality, not when their responses are simply blind to the enumerator. Our results suggest that there may be fewer unintended pregnancies than currently thought and that the effectiveness of family planning policy targeting may be weakened by the bias we uncover. We conclude with suggestions for future work on how to mitigate reporting bias.

@article{depaula2024fertility,
  author = {de Paula, \'{A}ureo and Valente, Christine and Toh, Wei Qiang and Jalingo, Istifanus and Lepine, Aurelia and Miller, Grant},
  title = {Are Self-Reported Fertility Preferences Biased? Evidence from Indirect Elicitation Methods},
  journal = {Proceedings of the National Academy of Sciences},
  volume = {121},
  number = {34},
  pages = {e2407629121},
  year = {2024}
}

Identification in Simple Binary Outcome Panel Data Models

with Bo Honoré

Econometrics Journal, 24(2), May 2021, pp.C78-C93

BibTeX

This paper first reviews some of the approaches that have been taken to estimate the common parameters of binary outcome models with fixed effects. We limit attention to situations in which the researcher has access to a data set with a large number of units (individuals or companies, for example) observed over a number of time periods. We then apply some of the existing approaches to study fixed-effects panel data versions of entry games, like the ones studied in Bresnahan and Reiss (1991) and Tamer (2003).

@article{depaula2021binary,
  author = {de Paula, \'{A}ureo and Honor\'{e}, Bo},
  title = {Identification in Simple Binary Outcome Panel Data Models},
  journal = {Econometrics Journal},
  volume = {24},
  number = {2},
  pages = {C78--C93},
  year = {2021}
}

Preparing for a Pandemic: Spending Dynamics and Panic Buying During the COVID-19 First Wave

with Martin O'Connell and Kate Smith

Fiscal Studies, 42(2), 2021, pp.249-264

BibTeX

In times of heightened uncertainty, consumers face incentives to build up precautionary stocks of essential supplies. We study consumer spending dynamics during one such time, the first infection wave of the COVID-19 pandemic, using household scanner data covering fast-moving consumer goods in the United Kingdom. We document large increases in demand for storable products, such as food staples and household supplies, in the days before lockdown. Households in all socio-economic groups exhibit unusually high demand pre-lockdown, but there is a clear gradient, with the largest demand spikes for wealthier households. Although stories of people purchasing extreme amounts received a lot of attention, higher aggregate demand was mainly driven by more households than usual choosing to buy storable products, with only small increases in average quantities bought on a given trip. Temporary limits on the number of units per transaction, introduced following the demand spike, are therefore unlikely to lead to the avoidance of stock-outs.

@article{depaula2021pandemic,
  author = {de Paula, \'{A}ureo and O'Connell, Martin and Smith, Kate},
  title = {Preparing for a Pandemic: Spending Dynamics and Panic Buying During the {COVID-19} First Wave},
  journal = {Fiscal Studies},
  volume = {42},
  number = {2},
  pages = {249--264},
  year = {2021}
}

The Informativeness of Estimation Moments

with Bo Honoré and Thomas Jorgensen

Journal of Applied Econometrics, 35(7), Nov/Dec 2020, pp.797-813 (Lead Article)

BibTeX

This paper introduces measures for how each moment contributes to the precision of parameter estimates in generalized method of moments settings. For example, one of the measures asks what would happen to the variance of the parameter estimates if a particular moment was dropped from the estimation. The measures are all easy to compute. We illustrate the usefulness of the measures through two simple examples as well as an application to a model of joint retirement planning of couples. We estimate the model using the British Household Panel Survey, and we find evidence of complementarities in leisure. Our sensitivity measures illustrate that the estimate of the complementarity is primarily informed by the distribution of differences in planned retirement dates. The estimated econometric model can be interpreted as a bivariate ordered-choice model that allows for simultaneity. This makes the model potentially useful in other applications.

@article{depaula2020informativeness,
  author = {de Paula, \'{A}ureo and Honor\'{e}, Bo and Jorgensen, Thomas},
  title = {The Informativeness of Estimation Moments},
  journal = {Journal of Applied Econometrics},
  volume = {35},
  number = {7},
  pages = {797--813},
  year = {2020}
}

Trade Networks and the Strength of Strong Ties

Advances in Econometrics, 42, October 2020

BibTeX

Evidence suggests that, in the presence of imperfect market institutions, individuals devote resources to the establishment of reliable connections in order to attenuate the frictions that reduce trading and insurance opportunities. In this paper I survey the relevant literature on strategic formation of networks and use it to study this particular economic situation. A simple model is built to show that the investment in strong ties often, though not always, produces stable configurations that manage to improve upon the imperfections of market institutions.

@article{depaula2020trade,
  author = {de Paula, \'{A}ureo},
  title = {Trade Networks and the Strength of Strong Ties},
  journal = {Advances in Econometrics},
  volume = {42},
  year = {2020}
}

Econometric Models of Network Formation

Annual Review of Economics, 12, August 2020, pp.775-799

BibTeX

This article provides a selective review of the recent literature on econometric models of network formation. I start with a brief exposition on basic concepts and tools for the statistical description of networks; then I offer a review of dyadic models, focusing on statistical models on pairs of nodes, and I describe several developments of interest to the econometrics literature. I also present a discussion of nondyadic models in which link formation might be influenced by the presence or absence of additional links, which themselves are subject to similar influences. This argument is related to the statistical literature on conditionally specified models and the econometrics of game theoretical models. I close with a (nonexhaustive) discussion of potential areas for further development.

@article{depaula2020formation,
  author = {de Paula, \'{A}ureo},
  title = {Econometric Models of Network Formation},
  journal = {Annual Review of Economics},
  volume = {12},
  pages = {775--799},
  year = {2020}
}

The Econometric Analysis of Network Data

book co-edited with Bryan Graham

Academic Press/Elsevier, 2020

BibTeX

The Econometric Analysis of Network Data serves as an entry point for advanced students, researchers, and data scientists seeking to perform effective analyses of networks, especially inference problems. It introduces the key results and ideas in an accessible, yet rigorous way. While a multi-contributor reference, the work is tightly focused and disciplined, providing latitude for varied specialties in one authorial voice.

@book{depaula2020networkdata,
  editor = {Graham, Bryan and de Paula, \'{A}ureo},
  title = {The Econometric Analysis of Network Data},
  publisher = {Academic Press/Elsevier},
  year = {2020}
}

Strategic Network Formation

in B. Graham and A. de Paula (eds.), The Econometric Analysis of Network Data, pp.42-64, 2020

BibTeX


@incollection{depaula2020strategic,
  author = {de Paula, \'{A}ureo},
  title = {Strategic Network Formation},
  booktitle = {The Econometric Analysis of Network Data},
  editor = {Graham, Bryan and de Paula, \'{A}ureo},
  pages = {42--64},
  publisher = {Academic Press/Elsevier},
  year = {2020}
}

VOG: Using Volcanic Eruptions to Estimate the Health Costs of Particulates

with Tim Halliday and John Lynham

Economic Journal, 129 (620), May 2019, pp.1782-1816

Top 10% Most Downloaded Papers in The Economic Journal (Jan 2018 – Dec 2019)

BibTeX

The negative consequences of long-term exposure to particulate pollution are well-established but a number of studies find no effect of short-term exposure on health outcomes. The high correlation of industrial pollutants complicates the estimation of the impact of individual pollutants on health. In this study, we use emissions from Kīlauea volcano, which are uncorrelated with other pollution sources, to estimate the impact of pollutants on local emergency room admissions and a precise measure of costs. A one standard deviation increase in particulates leads to a 23-36% increase in expenditures on ER visits for pulmonary outcomes, mostly among the very young.

@article{depaula2019vog,
  author = {de Paula, \'{A}ureo and Halliday, Timothy and Lynham, John},
  title = {{VOG}: Using Volcanic Eruptions to Estimate the Health Costs of Particulates},
  journal = {Economic Journal},
  volume = {129},
  number = {620},
  pages = {1782--1816},
  year = {2019}
}

Biomarkers and Self-reported Sexual Behaviors

with Lucia Corno

Economica, 86 (342), 2019, pp.229-261

BibTeX

High-risk sexual behaviours are generally unobserved and difficult to identify. In this paper, we investigate the accuracy of two risky-behaviour measures: biomarkers for sexually transmitted infections (STIs) and self-reported data. We build an epidemiological model to assess the relative performance of biomarkers versus self-reported data. We then suggest an econometric strategy that combines both types of measures to estimate actual unobserved risky sexual behaviours. Using data from the Demographic and Health Survey in 28 countries, we calibrate the model and provide conditions under which self-reported data are a better proxy for risky sexual behaviours than biomarkers. In countries with low STI prevalence, biomarkers have a higher probability of misclassification than self-reported answers. We apply our econometric strategy to the data and show that the probability of actual risky behaviour is much higher than the probability of self-reported risky behaviour and of testing positive for an STI.

@article{depaula2019biomarkers,
  author = {de Paula, \'{A}ureo and Corno, Lucia},
  title = {Biomarkers and Self-reported Sexual Behaviors},
  journal = {Economica},
  volume = {86},
  number = {342},
  pages = {229--261},
  year = {2019}
}

A New Model for Interdependent Durations

with Bo E. Honoré

Quantitative Economics, 9 (3), November 2018, pp.1299-1333

BibTeX

This paper introduces a bivariate version of the generalized accelerated failure time model. It allows for simultaneity in the econometric sense that the two realized outcomes depend structurally on each other. Another feature of the proposed model is that it will generate equal durations with positive probability. Our approach takes a stylized economic model that leads to a univariate generalized accelerated failure time model as a starting point. In this model, agents decide when to transition from an initial state to a new one, and the covariates influence the difference in the utility flow in the two states. We introduce simultaneity by allowing the utility flow to depend on the status of the other person. The econometric model is then completed by assuming that the observed outcome is the Nash bargaining solution in that simple economic model. The advantage of this approach is that it includes independent realizations from the generalized accelerated failure time model as a special case, and deviations from this special case can be given an economic interpretation. We establish identification under assumptions that are similar to those in the literature on nonparametric estimation of duration models. We illustrate the model by studying the joint retirement decisions in married couples using the Health and Retirement Study. In that example it seems reasonable to allow for the possibility that each partner’s optimal retirement time depends on the retirement time of the spouse. Moreover, the data suggest that the wife and the husband retire at the same time for a non-negligible fraction of couples. The main empirical finding is that the simultaneity is economically important. In our preferred specification the indirect utility associated with being retired increases by approximately 5% when one’s spouse retires.

@article{depaula2018newmodel,
  author = {de Paula, \'{A}ureo and Honor\'{e}, Bo E.},
  title = {A New Model for Interdependent Durations},
  journal = {Quantitative Economics},
  volume = {9},
  number = {3},
  pages = {1299--1333},
  year = {2018}
}

Identifying Preferences in Networks with Bounded Degree

with Seth Richards-Shubik and Elie Tamer

Econometrica, 86 (1), January 2018, pp.263-288

BibTeX

This paper provides a framework for identifying preferences in a large network where links are pairwise stable. Network formation models present difficulties for identification, especially when links can be interdependent, for example, when indirect connections matter. We show how one can use the observed proportions of various local network structures to learn about the underlying preference parameters. The key assumption for our approach restricts individuals to have bounded degree in equilibrium, implying a finite number of payoff-relevant local structures. Our main result provides necessary conditions for parameters to belong to the identified set. We then develop a quadratic programming algorithm that can be used to construct this set.

@article{depaula2018preferences,
  author = {de Paula, \'{A}ureo and Richards-Shubik, Seth and Tamer, Elie},
  title = {Identifying Preferences in Networks with Bounded Degree},
  journal = {Econometrica},
  volume = {86},
  number = {1},
  pages = {263--288},
  year = {2018}
}

Econometrics of Network Models

In B. Honoré, A. Pakes, M. Piazzesi and L. Samuelson (Eds.), Advances in Economics and Econometrics: Eleventh World Congress, pp.268-323, 2017. Cambridge University Press

BibTeX

In this article I provide a (selective) review of the recent econometric literature on networks. I start with a discussion of developments in the econometrics of group interactions. I subsequently provide a description of statistical and econometric models for network formation and approaches for the joint determination of networks and interactions mediated through those networks. Finally, I give a very brief discussion of measurement issues in both outcomes and networks. My focus is on identification and computational issues, but estimation aspects are also discussed.

@incollection{depaula2017networkmodels,
  author = {de Paula, \'{A}ureo},
  title = {Econometrics of Network Models},
  booktitle = {Advances in Economics and Econometrics: Eleventh World Congress},
  editor = {Honor\'{e}, Bo and Pakes, Ariel and Piazzesi, Monika and Samuelson, Larry},
  pages = {268--323},
  publisher = {Cambridge University Press},
  year = {2017}
}

Identification and Estimation of Preference Distributions when Voters are Ideological

with Antonio Merlo

Review of Economic Studies, 84 (3), 2017, pp.2138-2168

BibTeX

This article studies the non-parametric identification and estimation of voters’ preferences when voters are ideological. We establish that voter preference distributions and other parameters of interest can be identified from aggregate electoral data. We also show that these objects can be consistently estimated and illustrate our analysis by performing an actual estimation using data from the 1999 European Parliament elections.

@article{depaula2017voters,
  author = {de Paula, \'{A}ureo and Merlo, Antonio},
  title = {Identification and Estimation of Preference Distributions when Voters are Ideological},
  journal = {Review of Economic Studies},
  volume = {84},
  number = {3},
  pages = {2138--2168},
  year = {2017}
}

How Beliefs about HIV Affect Risky Behaviors: Evidence from Malawi

with Gil Shapira and Petra Todd

Journal of Applied Econometrics, Sep/Oct 2014, 29 (6), pp.944-964

BibTeX

This paper examines how beliefs about own HIV status affect decisions to engage in risky sexual behavior, as measured by having extramarital sex and/or multiple sex partners. The empirical analysis is based on a panel survey of males from the 2006 and 2008 rounds of the Malawi Diffusion and Ideational Change Project (MDICP). The paper develops a behavioral model of the belief-risky behavior relationship and estimates the causal effect of beliefs on risky behavior using the Arellano and Carrasco (2003) semiparametric panel data estimator, which accommodates both unobserved heterogeneity and belief endogeneity arising from a possible dependence of current beliefs on past risky behavior. Results show that downward revisions in the belief assigned to being HIV positive increase risky behavior and upward revisions decrease it. For example, based on a linear specification, a decrease in the perceived probability of being HIV positive from 10 to 0 percentage points increases the probability of engaging in risky behavior (extramarital affairs) from 8.3 to 14.1 percentage points. We also develop and implement a modified version of the Arellano and Carrasco (2003) estimator to allow for misreporting of risky behavior and find estimates to be robust to a range of plausible misreporting levels.

@article{depaula2014hiv,
  author = {de Paula, \'{A}ureo and Shapira, Gil and Todd, Petra},
  title = {How Beliefs about {HIV} Affect Risky Behaviors: Evidence from Malawi},
  journal = {Journal of Applied Econometrics},
  volume = {29},
  number = {6},
  pages = {944--964},
  year = {2014}
}

Econometric Analysis of Games with Multiple Equilibria

Annual Review of Economics, 5, January 2013, pp.107-131

BibTeX

This article reviews the recent literature on the econometric analysis of games in which multiple solutions are possible. Multiplicity does not necessarily preclude the estimation of a particular model (and, in certain cases, even improves its identification), but ignoring it can lead to misspecifications. The review starts with a general characterization of structural models that highlights how multiplicity affects the classical paradigm. Because the information structure is an important guide to identification and estimation strategies, I discuss games of complete and incomplete information separately. Although many of the techniques discussed here can be transported across different information environments, some are specific to particular models. Models of social interactions are also surveyed. I close with a brief discussion of postestimation issues and research prospects.

@article{depaula2013games,
  author = {de Paula, \'{A}ureo},
  title = {Econometric Analysis of Games with Multiple Equilibria},
  journal = {Annual Review of Economics},
  volume = {5},
  pages = {107--131},
  year = {2013}
}

Inference of Signs of Interaction Effects in Simultaneous Games with Incomplete Information

with Xun Tang

Econometrica, 80 (1), January 2012, pp.143-172

BibTeX

This paper studies the inference of interaction effects in discrete simultaneous games with incomplete information. We propose a test for the signs of state-dependent interaction effects that does not require parametric specifications of players’ payoffs, the distributions of their private signals, or the equilibrium selection mechanism. The test relies on the commonly invoked assumption that players’ private signals are independent conditional on observed states. The procedure is valid in (but does not rely on) the presence of multiple equilibria in the data-generating process (DGP). As a by-product, we propose a formal test for multiple equilibria in the DGP. We also implement the test using data on radio programming of commercial breaks in the United States, and infer stations’ incentives to synchronize their commercial breaks. Our results support the earlier finding by Sweeting (2009) that stations have stronger incentives to coordinate and air commercials at the same time during rush hours and in smaller markets.

@article{depaula2012signs,
  author = {de Paula, \'{A}ureo and Tang, Xun},
  title = {Inference of Signs of Interaction Effects in Simultaneous Games with Incomplete Information},
  journal = {Econometrica},
  volume = {80},
  number = {1},
  pages = {143--172},
  year = {2012}
}

The Informal Sector: An Equilibrium Model and Some Empirical Evidence from Brazil

with José A. Scheinkman

Review of Income and Wealth, 57 (S1), May 2011, S8-S26

BibTeX

We test implications of a simple equilibrium model of informality using a survey of 48,000+ small firms in Brazil. In the model, agents’ ability to manage production differs and informal firms face a higher cost of capital and limitation on size, although these informal firms avoid tax payments. As a result, informal firms are managed by less able entrepreneurs, are smaller, and employ a lower capital–labor ratio. The model predicts that the interaction of an index of observable inputs to entrepreneurial ability and formality is positively correlated with firm size, which we verify in the data. Using the model, we estimate that informal firms in our dataset faced at least 1.3 times the cost of capital of formal firms.

@article{depaula2011informal,
  author = {de Paula, \'{A}ureo and Scheinkman, Jos\'{e} A.},
  title = {The Informal Sector: An Equilibrium Model and Some Empirical Evidence from Brazil},
  journal = {Review of Income and Wealth},
  volume = {57},
  number = {S1},
  pages = {S8--S26},
  year = {2011}
}

Value Added Taxes, Chain Effects and Informality

with José A. Scheinkman

American Economic Journal: Macroeconomics, 2, October 2010, pp.195-221

BibTeX

We present an equilibrium model of tax avoidance and test its implications using a survey of firms in Brazil. In the model, the credit method used to collect value-added tax (VAT) creates informality chains—clients or suppliers of informal firms are more likely to be informal. An increase in enforcement in a production stage increases formality downstream and upstream. Various empirical measures of formality of suppliers and buyers, and of enforcement downstream and upstream, are positively correlated with formality. When the VAT is applied in a single stage of production at a rate estimated by the authorities, these chain effects disappear.

@article{depaula2010vat,
  author = {de Paula, \'{A}ureo and Scheinkman, Jos\'{e} A.},
  title = {Value Added Taxes, Chain Effects and Informality},
  journal = {American Economic Journal: Macroeconomics},
  volume = {2},
  pages = {195--221},
  year = {2010}
}

Interdependent Durations

with Bo E. Honoré

Review of Economic Studies, 77(3), July 2010, pp.1138-1163

BibTeX

This paper studies the identification of a simultaneous equation model involving duration measures. It proposes a game theoretic model in which durations are determined by strategic agents. In the absence of strategic motives, the model delivers a version of the generalized accelerated failure time model. In its most general form, the system resembles a classical simultaneous equation model in which endogenous variables interact with observable and unobservable exogenous components to characterize an economic environment. In this paper, the endogenous variables are the individually chosen equilibrium durations. Even though a unique solution to the game is not always attainable in this context, the structural elements of the economic system are shown to be semi-parametrically identified. We also present a brief discussion of estimation ideas and a set of simulation studies on the model.

@article{honoredepaula2010,
  author = {Honor\'{e}, Bo E. and de Paula, \'{A}ureo},
  title = {Interdependent Durations},
  journal = {Review of Economic Studies},
  volume = {77},
  number = {3},
  pages = {1138--1163},
  year = {2010}
}

Inference in a Synchronization Game with Social Interactions

Journal of Econometrics, 148(1), January 2009, pp.56-71

BibTeX

This paper studies inference in a continuous time game where an agent’s decision to quit an activity depends on the participation of other players. In equilibrium, similar actions can be explained not only by direct influences but also by correlated factors. Our model can be seen as a simultaneous duration model with multiple decision makers and interdependent durations. We study the problem of determining the existence and uniqueness of equilibrium stopping strategies in this setting. This paper provides results and conditions for the detection of these endogenous effects. First, we show that the presence of such effects is a necessary and sufficient condition for simultaneous exits. This allows us to set up a nonparametric test for the presence of such influences, which is robust to multiple equilibria. Second, we provide conditions under which parameters in the game are identified. Finally, we apply the model to data on desertion in the Union Army during the American Civil War, and find evidence of endogenous influences.

@article{depaula2009synchronization,
  author = {de Paula, \'{A}ureo},
  title = {Inference in a Synchronization Game with Social Interactions},
  journal = {Journal of Econometrics},
  volume = {148},
  number = {1},
  pages = {56--71},
  year = {2009}
}

Conditional Moments and Independence

The American Statistician, 62(3), August 2008, pp.219-221

BibTeX

Consider two random variables X and Y. In initial probability and statistics courses, a discussion of various concepts of dissociation between X and Y is customary. These concepts typically involve independence and uncorrelatedness. An example is shown where E(Y^n|X) = E(Y^n) and E(X^n|Y) = E(X^n) for n = 1, 2,... and yet X and Y are not stochastically independent. The bivariate distribution is constructed using a well-known example in which the distribution of a random variable is not uniquely determined by its sequence of moments. Other similar families of distributions with identical moments can be used to display such a pair of random variables. It is interesting to note in class that even such a degree of dissociation between the moments of X and Y does not imply stochastic independence.

@article{depaula2008conditional,
  author = {de Paula, \'{A}ureo},
  title = {Conditional Moments and Independence},
  journal = {The American Statistician},
  volume = {62},
  number = {3},
  pages = {219--221},
  year = {2008}
}

II. Working Papers

The Rise of Online Dating and Heterogamous Marriages

with Y. Hwang and F. Yang

First version: September 2025. Featured in: Financial Times, Valor Economico. IFS WP

BibTeX

We study how online dating shapes intermarriage by race and education in the United States. Using American Community Survey data and a continuous-treatment difference-in-differences design, we find sizable effects on interracial marriage that vary across platforms, but weaker effects on educational homogamy. To probe mechanisms, we surveyed retrospective dating histories, partner preferences, and online dating behavior. Individual fixed-effects panel regressions show that effects of meeting a partner online vary with users’ preferences and filter usage: those with strong same-race preferences use a race filter to meet same-race partners. Thus, platform features and user preferences jointly shape dating patterns.

@techreport{hwangdepaulayang2025,
  author = {Hwang, Yujung and de Paula, \'{A}ureo and Yang, Fangzhu},
  title = {The Rise of Online Dating and Heterogamous Marriages},
  type = {Working Paper Series},
  year = {2025},
  institution = {IFS},
  note = {WP 25/59},
}

Prediction Sets and Conformal Inference with Interval Outcomes

with Weiguang Liu and Elie Tamer

First version: January 2025. (submitted) ArXiv

BibTeX

Given data on a random variable \(Y\), a prediction set with miscoverage level \(\alpha \in (0,1)\) is a set that contains a new draw of \(Y\) with probability \(1-\alpha\). Among all prediction sets satisfying this coverage property, the oracle prediction set is the one with minimal volume. The oracle prediction set offers a complementary view of the distribution of \(Y\), beyond point estimators such as the mean and quantiles, and has attracted considerable interest recently. This paper develops methods for estimating such prediction sets conditional on observed covariates when \(Y\) is censored or interval-valued. We characterise the oracle prediction set under partial identification induced by interval censoring and propose consistent estimators for both oracle prediction intervals and more general oracle prediction sets consisting of multiple disjoint intervals. In addition, we apply conformal inference to construct finite-sample valid prediction sets for interval outcomes that remain consistent as the sample size grows, using a conformity score tailored to interval data. The proposed procedure accounts for irreducible prediction uncertainty due to the stochastic nature of outcomes, modelling uncertainty arising from partial identification, and sampling uncertainty that vanishes as sample size increases. We conduct Monte Carlo simulations and two empirical applications using UK job postings data and the US Current Population Survey. The results demonstrate the robustness and efficiency of the proposed methods.

@article{liudepaulatamer2025,
  author = {Liu, Weiguang and de Paula, \'{A}ureo and Tamer, Elie},
  title = {Prediction Sets and Conformal Inference with Interval Outcomes},
  type = {Working Paper Series},
  year = {2025},
}

Estimating Production Functions Using Subjective Expectations Data

with Agnes Norris Keiller and John Van Reenen

First version: July 2024. (revision requested at the Journal of Political Economy) ArXiv

BibTeX

Standard methods for estimating production functions in the Olley and Pakes (1996) tradition require assumptions on input choices. We introduce a new method that exploits (increasingly available) data on a firm's expectations of its future output and inputs that allows us to obtain consistent production function parameter estimates while relaxing these input demand assumptions. In contrast to dynamic panel methods, our proposed estimator can be implemented on very short panels (including a single cross-section), and Monte Carlo simulations show it outperforms alternative estimators when firms' material input choices are subject to optimization error. Implementing a range of production function estimators on UK data, we find our proposed estimator yields results that are either similar to or more credible than commonly used alternatives. These differences are larger in industries where material inputs appear harder to optimize. We show that TFP implied by our proposed estimator is more strongly associated with future jobs growth than existing methods, suggesting that failing to adequately account for input endogeneity may underestimate the degree of dynamic reallocation in the economy.

@article{norriskeillerdepaulavanreenen2024,
  author = {Norris Keiller, Agnes and de Paula, \'{A}ureo and Van Reenen, John},
  title = {Estimating Production Functions Using Subjective Expectations Data},
  type = {Working Paper Series},
  year = {2024},
}

Nowcasting with Signature Methods

with Samuel Cohen, Giulia Mantoan, Lars Nesheim, Arthur Turrell and Lingyi Yang

First version: May 2023. ArXiv

BibTeX

We introduce a new method of nowcasting using regression on path signatures. Path signatures capture the geometric properties of sequential data. Because signatures embed observations in continuous time, they naturally handle mixed frequencies and missing data. We prove theoretically, and with simulations, that regression on signatures subsumes the linear Kalman filter and retains desirable consistency properties. Nowcasting with signatures is more robust to disruptions in data series than previous methods, making it useful in stressed times (for example, during COVID-19). This approach is performant in nowcasting US GDP growth, and in nowcasting UK unemployment.

@article{cohenetal2023,
  author = {Cohen, Samuel and Mantoan, Giulia and Nesheim, Lars and de Paula, \'{A}ureo and Turrell, Arthur and Yang, Lingyi},
  title = {Nowcasting with Signature Methods},
  type = {Working Paper Series},
  year = {2023},
}

A Sticky View of Hoarding

with Christopher Hansman, Harrison Hong and Vishal Singh

BibTeX

Hoarding disrupts the functioning of markets. Yet little is known about its determinants. We analyze major recent hoarding episodes through the lens of an optimal inventory model in which risk-averse agents hoard both as a precautionary hedge against price uncertainty and to speculate when prices are predictable. Using supermarket scanner data, we provide reduced-form evidence of the importance of the speculative motive due to sticky retail prices. We use our model to quantify that speculation accounts for a meaningful fraction of overall hoarding, although smaller than precaution in our episodes.

@article{hansmanetal2023,
  author = {Hansman, Christopher and Hong, Harrison and de Paula, \'{A}ureo and Singh, Vishal},
  title = {A Sticky View of Hoarding},
  type = {Working Paper Series},
  year = {2023},
}

Inference of Multiple Equilibria in Discrete Games with Correlated Types

with Xun Tang

BibTeX

We study testable implications of multiple equilibria in discrete games with incomplete information. Unlike de Paula and Tang (2012), we allow the players' private signals to be correlated. In static games, we leverage independence of private types across games whose equilibrium selection is correlated. In dynamic games with serially correlated discrete unobserved heterogeneity, our testable implication builds on the fact that the distribution of a sequence of choices and states is a mixture over equilibria and unobserved heterogeneity. The number of mixture components is a known function of the length of the sequence as well as the cardinality of equilibria and unobserved heterogeneity support. In both static and dynamic cases, these testable implications are implementable using existing statistical tools.

@article{depaulatang2020,
  author = {de Paula, \'{A}ureo and Tang, Xun},
  title = {Inference of Multiple Equilibria in Discrete Games with Correlated Types},
  type = {Working Paper Series},
  year = {2020},
}

CCP and the Estimation of Nonseparable Dynamic Models

with Dennis Kristensen and Lars Nesheim

BibTeX

In this paper we generalize the so-called CCP estimator of Hotz and Miller (1993) to a broader class of dynamic discrete choice (DDC) models that allow period payoff functions to be non-separable in observable and unobservable (to the econometrician) variables. Such nonseparabilities are common in applied microeconomic environments and our generalized CCP estimator allows for computationally simple estimation in this class of DDC models. We first establish invertibility results between conditional choice probabilities (CCPs) and value functions and use this to derive a policy iteration mapping in our more general framework. This is used to develop a pseudo-maximum likelihood estimator of model parameters that sidesteps the need for solving the model.

@article{kristensennesheimdepaula2015,
  author = {Kristensen, Dennis and Nesheim, Lars and de Paula, \'{A}ureo},
  title = {CCP and the Estimation of Nonseparable Dynamic Models},
  type = {Working Paper Series},
  year = {2015},
}

III. Work in Progress

A. Econometrics

Estimating Nesting Structures

with Ali Hortacsu, Jonas Lieber and Julien Monardo

Estimation using List Randomised Variables

with Aurelia Lepine and Federico Tagliati

A Test for Independent Censoring in Duration Models

with Giulia Brancaccio and Bo E. Honore

Identification in a Dynamic Roy Model

with Bo E. Honore

B. Empirical and Applied Micro

Spillovers in Social Programme Participation: Evidence from Chile

with Pedro Carneiro, Barbara Flores, Emanuela Galasso, Rita Ginja and Ben Waltmann

Joint Subjective Expectations in Households

with Enrico Miglino and Jonathan Shaw