Paper Title

YouNiverse: Large-Scale Channel and Video Metadata from English-Speaking YouTube

Authors

Manoel Horta Ribeiro, Robert West

Abstract

YouTube plays a key role in entertaining and informing people around the globe. However, studying the platform is difficult due to the lack of randomly sampled data and of systematic ways to query the platform's colossal catalog. In this paper, we present YouNiverse, a large collection of channel and video metadata from English-language YouTube. YouNiverse comprises metadata from over 136k channels and 72.9M videos published between May 2005 and October 2019, as well as channel-level time-series data with weekly subscriber and view counts. Leveraging channel ranks from socialblade.com, an online service that provides information about YouTube, we are able to assess and enhance the representativeness of the sample of channels. Additionally, the dataset also contains a table specifying which videos a set of 449M anonymous users commented on. YouNiverse, publicly available at https://doi.org/10.5281/zenodo.4650046, will empower the community to do research with and about YouTube.
