Data Governance in the Age of Large-Scale Data-Driven Language Technology
Publication: Contribution to conference › Paper › Research › peer-reviewed
Standard
Data Governance in the Age of Large-Scale Data-Driven Language Technology. / Jernite, Yacine; Nguyen, Huu; Biderman, Stella; Rogers, Anna; Masoud, Maraim; Danchev, Valentin; Tan, Samson; Luccioni, Alexandra Sasha; Subramani, Nishant; Johnson, Isaac; Dupont, Gerard; Dodge, Jesse; Lo, Kyle; Talat, Zeerak; Radev, Dragomir; Gokaslan, Aaron; Nikpoor, Somaieh; Henderson, Peter; Bommasani, Rishi; Mitchell, Margaret.
2022. 2206-2222.
RIS
TY - CONF
T1 - Data Governance in the Age of Large-Scale Data-Driven Language Technology
AU - Jernite, Yacine
AU - Nguyen, Huu
AU - Biderman, Stella
AU - Rogers, Anna
AU - Masoud, Maraim
AU - Danchev, Valentin
AU - Tan, Samson
AU - Luccioni, Alexandra Sasha
AU - Subramani, Nishant
AU - Johnson, Isaac
AU - Dupont, Gerard
AU - Dodge, Jesse
AU - Lo, Kyle
AU - Talat, Zeerak
AU - Radev, Dragomir
AU - Gokaslan, Aaron
AU - Nikpoor, Somaieh
AU - Henderson, Peter
AU - Bommasani, Rishi
AU - Mitchell, Margaret
PY - 2022
Y1 - 2022
AB - The recent emergence and adoption of Machine Learning technology, and specifically of Large Language Models, has drawn attention to the need for systematic and transparent management of language data. This work proposes an approach to global language data governance that attempts to organize data management amongst stakeholders, values, and rights. Our proposal is informed by prior work on distributed governance that accounts for human values and grounded by an international research collaboration that brings together researchers and practitioners from 60 countries. The framework we present is a multi-party international governance structure focused on language data, and incorporating technical and organizational tools needed to support its work.
KW - Faculty of Social Sciences
KW - dataset
KW - technology governance
KW - data rights
KW - language data
U2 - 10.1145/3531146.3534637
DO - 10.1145/3531146.3534637
M3 - Paper
SP - 2206
EP - 2222
ER -
ID: 346456359