1. Unlocking digital archives: cross-disciplinary perspectives on AI and born-digital data
Lise Jaillant (School of Social Sciences, Loughborough University)
Annalina Caputo (Faculty of Engineering and Computing, Dublin City University)
Citation: Jaillant L, Caputo A. Unlocking digital archives: cross-disciplinary perspectives on AI and born-digital data[J]. AI & Society, 2022(37): 1-13.
Abstract: Co-authored by a Computer Scientist and a Digital Humanist, this article examines the challenges faced by cultural heritage institutions in the digital age, which have led to the closure of the vast majority of born-digital archival collections. It focuses particularly on cultural organizations such as libraries, museums and archives, used by historians, literary scholars and other Humanities scholars. Most born-digital records held by cultural organizations are inaccessible due to privacy, copyright, commercial and technical issues. Even when born-digital data are publicly available (as in the case of web archives), users often need to physically travel to repositories such as the British Library or the Bibliothèque Nationale de France to consult web pages. Provided with enough sample data from which to learn and train their models, AI, and more specifically machine learning algorithms, offer the opportunity to improve and ease the access to digital archives by learning to perform complex human tasks. These vary from providing intelligent support for searching the archives to automate tedious and time-consuming tasks. In this article, we focus on sensitivity review as a practical solution to unlock digital archives that would allow archival institutions to make non-sensitive information available. This promise to make archives more accessible does not come free of warnings for potential pitfalls and risks: inherent errors, “black box” approaches that make the algorithm inscrutable, and risks related to bias, fake, or partial information. Our central argument is that AI can deliver its promise to make digital archival collections more accessible, but it also creates new challenges – particularly in terms of ethics. In the conclusion, we insist on the importance of fairness, accountability and transparency in the process of making digital archives more accessible.
2. Using Machine Learning to Enhance Archival Processing of Social Media Archives
Lizhou Fan (University of Michigan)
Zhanyuan Yin (University of Chicago)
Huizi Yu (Brown University)
Anne J Gilliland (School of Education and Information Studies, University of California, Los Angeles)
Journal: ACM Journal on Computing and Cultural Heritage (JOCCH)
Citation: Fan L, Yin Z, Yu H, et al. Using Machine Learning to Enhance Archival Processing of Social Media Archives[J]. ACM Journal on Computing and Cultural Heritage (JOCCH), 2022, 15(3): 1-23.
Abstract: This article reports on a study using machine learning to identify incidences and shifting dynamics of hate speech in social media archives. To better cope with the archival processing need for such large-scale and fast evolving archives, we propose the Data-driven and Circulating Archival Processing (DCAP) method. As a proof-of-concept, our study focuses on an English language Twitter archive relating to COVID-19: Tweets were repeatedly scraped between February and June 2020, ingested and aggregated within the COVID-19 Hate Speech Twitter Archive (CHSTA), and analyzed for hate speech using the Generative Adversarial Network–inspired DCAP method. Outcomes suggest that it is possible to use machine learning and data analytics to surface and substantiate trends from CHSTA and similar social media archives that could provide immediately useful knowledge for crisis response, in controversial situations, or for public policy development, as well as for subsequent historical analysis. The approach shows potential for integrating multiple aspects of the archival workflow and supporting automatic iterative redescription and reappraisal activities in ways that make them more accountable and more rapidly responsive to changing societal interests and unfolding developments.
3. Archives and AI: An Overview of Current Debates and Future Perspectives
Giovanni Colavizza (Faculty of Humanities, University of Amsterdam)
Tobias Blanke (Faculty of Humanities, University of Amsterdam)
Charles Jeurgens (Faculty of Humanities, University of Amsterdam)
Julia Noordegraaf (Faculty of Humanities, University of Amsterdam)
Journal: ACM Journal on Computing and Cultural Heritage (JOCCH)
Citation: Colavizza G, Blanke T, Jeurgens C, et al. Archives and AI: an overview of current debates and future perspectives[J]. ACM Journal on Computing and Cultural Heritage (JOCCH), 2021, 15(1): 1-15.
Abstract: The digital transformation is turning archives, both old and new, into data. As a consequence, automation in the form of artificial intelligence techniques is increasingly applied both to scale traditional recordkeeping activities, and to experiment with novel ways to capture, organise, and access records. We survey recent developments at the intersection of Artificial Intelligence and archival thinking and practice. Our overview of this growing body of literature is organised through the lenses of the Records Continuum model. We find four broad themes in the literature on archives and artificial intelligence: theoretical and professional considerations, the automation of recordkeeping processes, organising and accessing archives, and novel forms of digital archives. We conclude by underlining emerging trends and directions for future work, which include the application of recordkeeping principles to the very data and processes that power modern artificial intelligence and a more structural—yet critically aware—integration of artificial intelligence into archival systems and practice.
4. On Constructing a Scientific Archives Network: Exploring Computational Approaches to the Cybernetics Thought Collective
Bethany G. Anderson (University of Illinois at Urbana-Champaign Archives)
Citation: Anderson B G. On Constructing a Scientific Archives Network: Exploring Computational Approaches to the Cybernetics Thought Collective[J]. Archivaria: The Journal of the Association of Canadian Archivists, 2021(91): 104-147.
5. Neither physical nor juridical persons: electronic personhood and an evolving theory of archival diplomatics
Devon Mordell (Leddy Library, University of Windsor)
Journal: Archives and Records
Citation: Mordell D. Neither physical nor juridical persons: electronic personhood and an evolving theory of archival diplomatics[J]. Archives and Records, 2021, 42(1): 25-39.
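Abstract: In 2017, the European Union (EU) adopted resolution P8_TA-PRO0051, outlining a series of recommendations to the Commission on Civil Law Rules for Robotics. Although the resolution is ostensibly premised on the future, it is firmly grounded in present concerns: mapping the legal and ethical implications of autonomous robots for European legislators. The resolution includes a proposal to explore creating a legal status for electronic persons, so that autonomous robots can be held liable for the damage they cause. A legislative instrument granting electronic personhood stands to transform an unassuming field, archival diplomatics, because the prospect of electronic persons poses a particular challenge to its foundational concepts. The definition of persons in diplomatics, a core element of records, has not yet addressed the possibility of electronic persons or electronic personhood. The article offers an exploratory overview drawn from legal scholarship on robots to show what personhood status for autonomous robots and artificial intelligence systems might mean for the evolving theory of archival diplomatics.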
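Abstract: In recent years, as artificial intelligence, blockchain and related technologies have developed and been applied, many scholars have stressed the need to use new technologies to meet the records and archives management challenges of the information age, in particular through intelligent assignment of retention periods. In Recordkeeping Informatics for a Networked Age, the Australian archival scholar Frank Upward and his co-authors state explicitly that technology-assisted appraisal must be given sufficient attention, or people will drown in information. In practice, some archival institutions abroad have begun surveys and preliminary experiments: the UK National Archives and the National Archives of Australia have investigated and studied machine-learning-assisted assignment of retention periods. The computational archival science workshop of the IEEE Big Data conference, one of the leading venues in computational archival science, emphasizes integrating computational science with archival theory in order to support long-term preservation, appraisal and related work. These developments indicate that the era of machine-assisted assignment of retention periods is approaching and that the question warrants study.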
7. Archival appraisal and artificial intelligence: how, and by whom, history will be told in the future