Progressives and Never-Trumpers: Contrastive Principal Component Analysis as an Alternative Method for Public Opinion Research



This paper applies an emerging machine learning approach to scaling, contrastive principal component analysis (cPCA), to explore latent political subgroups among respondents to the Public Policy Institute of California's 2018 Statewide California Survey. Standard scaling methods, such as PCA and factor analysis, often recover ideology as their first dimension when applied to public opinion data, and in doing so they tend to gloss over important subgroup variation (e.g., moderate versus conservative Republicans). cPCA, by contrast, is well suited to subgroup analysis because it identifies the dimensions along which one subset of the data (the target group) varies the most while the other subset (the background group) varies the least. We find that we can distinguish the progressive from the moderate wing of the Democratic Party, and the pro-Trump from the anti-Trump wing of the Republican Party, using respondents from the opposing party as the background group in each case. These results recover important subgroup variation directly and also highlight the general usefulness of cPCA for identifying subgroups and latent divisions in public opinion research.
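To make the target-versus-background idea concrete, the following is a minimal sketch of cPCA in its commonly described form: take the leading eigenvectors of the target covariance matrix minus a multiple of the background covariance matrix. It is an illustration under stated assumptions, not the paper's replication code. The function name `contrastive_pca`, the contrast weight `alpha`, and the synthetic data are all hypothetical; the survey data and the authors' parameter choices are not reproduced here.

```python
import numpy as np


def contrastive_pca(target, background, alpha=1.0, n_components=2):
    """Sketch of cPCA: top eigenvectors of cov(target) - alpha * cov(background).

    `alpha` (an assumed tuning parameter) controls how strongly variance
    shared with the background group is penalized.
    """
    # Center each group on its own column means.
    target_centered = target - target.mean(axis=0)
    background_centered = background - background.mean(axis=0)

    # Covariance matrices with variables in columns.
    cov_target = np.cov(target_centered, rowvar=False)
    cov_background = np.cov(background_centered, rowvar=False)

    # The contrast matrix is symmetric, so eigh applies.
    contrast = cov_target - alpha * cov_background
    eigvals, eigvecs = np.linalg.eigh(contrast)

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    order = np.argsort(eigvals)[::-1][:n_components]
    directions = eigvecs[:, order]

    # Project the target group onto the contrastive directions.
    scores = target_centered @ directions
    return directions, scores


# Synthetic stand-in for survey items (NOT real data):
# item 0 varies strongly in both groups (think: a shared ideology dimension),
# item 1 varies mainly within the target group (a within-party division),
# item 2 is low-variance noise in both groups.
rng = np.random.default_rng(0)
background = rng.normal(size=(1000, 3)) * np.array([3.0, 0.5, 0.5])
target = rng.normal(size=(1000, 3)) * np.array([3.0, 2.0, 0.5])

directions, scores = contrastive_pca(target, background, alpha=1.0, n_components=1)
```

In this toy setup, ordinary PCA on the target group alone would load mainly on item 0, the highest-variance item, whereas the contrastive direction loads mainly on item 1, the item whose variance is specific to the target group.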

Keywords: scaling; contrastive learning; public opinion

The preprint can be downloaded here.

The online appendix is available here.

The paper, data, and replication code are available here.