Crowd-sourced text analysis: Reproducible and agile production of political data

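Empirical social science often relies on data that are not observed in the field, but are transformed into quantitative variables by expert researchers who analyze and interpret qualitative raw sources. While generally considered the most valid way to produce data, this expert-driven process is inherently difficult to replicate or to assess on grounds of reliability. Using crowd-sourcing to distribute text for reading and interpretation by massive numbers of nonexperts, we generate results comparable to those using experts to read and interpret the same texts, but do so far more quickly and flexibly. Crucially, the data we collect can be reproduced and extended transparently, making crowd-sourced datasets intrinsically reproducible. This focuses researchers’ attention on the fundamental scientific objective of specifying reliable and replicable methods for collecting the data needed, rather than on the content of any particular dataset. We also show that our approach works straightforwardly with different types of political text, written in different languages. While findings reported here concern text analysis, they have far-reaching implications for expert-generated data in the social sciences.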

Bibliographic Details
Main Authors:	BENOIT, Kenneth; CONWAY, Drew; LAUDERDALE, Benjamin E.; LAVER, Michael; MIKHAYLOV, Slava
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2016
Subjects:	Models and Methods; Political Science
Online Access:	https://ink.library.smu.edu.sg/soss_research/3970 https://ink.library.smu.edu.sg/context/soss_research/article/5228/viewcontent/Crowd_sourcedTA_av.pdf
Institution:	Singapore Management University

id	sg-smu-ink.soss_research-5228
record_format	dspace
spelling	sg-smu-ink.soss_research-5228 2024-09-02T06:30:48Z 2016-05-01T07:00:00Z text application/pdf info:doi/10.1017/S0003055416000058 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School of Social Sciences eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Models and Methods Political Science
description	Empirical social science often relies on data that are not observed in the field, but are transformed into quantitative variables by expert researchers who analyze and interpret qualitative raw sources. While generally considered the most valid way to produce data, this expert-driven process is inherently difficult to replicate or to assess on grounds of reliability. Using crowd-sourcing to distribute text for reading and interpretation by massive numbers of nonexperts, we generate results comparable to those using experts to read and interpret the same texts, but do so far more quickly and flexibly. Crucially, the data we collect can be reproduced and extended transparently, making crowd-sourced datasets intrinsically reproducible. This focuses researchers’ attention on the fundamental scientific objective of specifying reliable and replicable methods for collecting the data needed, rather than on the content of any particular dataset. We also show that our approach works straightforwardly with different types of political text, written in different languages. While findings reported here concern text analysis, they have far-reaching implications for expert-generated data in the social sciences.
format	text
author	BENOIT, Kenneth CONWAY, Drew LAUDERDALE, Benjamin E. LAVER, Michael MIKHAYLOV, Slava
author_sort	BENOIT, Kenneth
title	Crowd-sourced text analysis: Reproducible and agile production of political data
publisher	Institutional Knowledge at Singapore Management University
publishDate	2016
url	https://ink.library.smu.edu.sg/soss_research/3970 https://ink.library.smu.edu.sg/context/soss_research/article/5228/viewcontent/Crowd_sourcedTA_av.pdf
_version_	1814047823525576704

Similar Items