Building legal datasets

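Data-centric AI calls for better, not just bigger, datasets. As data protection laws with extra-territorial reach proliferate worldwide, ensuring datasets are legal is an increasingly crucial yet overlooked component of “better”. To help dataset builders become more willing and able to navigate this complex legal space, this paper reviews key legal obligations surrounding ML datasets, examines the practical impact of data laws on ML pipelines, and offers a framework for building legal datasets.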

Bibliographic Details
Main Author: SOH, Jerrold
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
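Subjects: Legal datasets; machine learning; data laws; data protection laws; Computer Law; Databases and Information Systems; Internet Law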
Online Access: https://ink.library.smu.edu.sg/sol_research/3442
https://ink.library.smu.edu.sg/context/sol_research/article/5400/viewcontent/BuildingDatasets_2021_wp.pdf
Institution: Singapore Management University
id sg-smu-ink.sol_research-5400
record_format dspace
spelling sg-smu-ink.sol_research-5400
2023-02-07T06:43:48Z
2021-11-01T07:00:00Z
application/pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection Yong Pung How School Of Law
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Legal datasets
machine learning
data laws
data protection laws
Computer Law
Databases and Information Systems
Internet Law
description Data-centric AI calls for better, not just bigger, datasets. As data protection laws with extra-territorial reach proliferate worldwide, ensuring datasets are legal is an increasingly crucial yet overlooked component of “better”. To help dataset builders become more willing and able to navigate this complex legal space, this paper reviews key legal obligations surrounding ML datasets, examines the practical impact of data laws on ML pipelines, and offers a framework for building legal datasets.
format text
author SOH, Jerrold
title Building legal datasets
publisher Institutional Knowledge at Singapore Management University
publishDate 2021