Collaboration networks and citation patterns

{Moody04} James Moody. The Structure of a Social Science Collaboration Network: Disciplinary Cohesion from 1963 to 1999. American Sociological Review, 2004, 69(2):213-238.
{Guimerà05} Roger Guimerà, Brian Uzzi, Jarrett Spiro, and Luís A. Nunes Amaral. Team assembly mechanisms determine collaboration network structure and team performance. Science, 2005, 308(5722):697-702. The authors develop a model for the assembly of teams of creative agents in which the selection of the members of a team is controlled by three parameters: (i) the number, m, of team members; (ii) the probability, p, of selecting incumbents, that is, agents already belonging to the network; and (iii) the propensity, q, of incumbents to select past collaborators.
{Tang08} Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. ArnetMiner: extraction and mining of academic social networks. Proc. of ACM KDD, 2008.
{Uzzi13} Brian Uzzi, Satyam Mukherjee, Michael Stringer, Ben Jones. Atypical Combinations and Scientific Impact. Science, 2013, 342:468. Highest-impact science is primarily grounded in exceptionally conventional combinations of prior work yet simultaneously features an intrusion of unusual combinations. Papers of this type were twice as likely to be highly cited works.
{Fu14} Tom Z. Fu, Qianqian Song, and Dah Ming Chiu. The academic social network. Scientometrics, 2014, 101(1): 203-239.
{Guan16} JianCheng Guan, KaiRui Zuo, KaiHua Chen, Richard C.M. Yam. Does country-level R&D efficiency benefit from the collaboration network structure? Research Policy, 2016, 45(4):770-784.
{Kong19} Xiangjie Kong, Yajie Shi, Wei Wang, Kai Ma, Liangtian Wan, Feng Xia. The Evolution of Turing Award Collaboration Network: Bibliometric-Level and Network-Level Metrics. IEEE Transactions on Computational Social Systems, 2019, 6(6):1318-1328. Although the Turing Award Collaboration Network has small-world properties, it is not a scale-free network.
{Wu19} Lingfei Wu, Dashun Wang & James A. Evans. Large teams develop and small teams disrupt science and technology. Nature, 2019, 566:378–382. Work from larger teams builds on more-recent and popular developments, and attention to their work comes immediately. By contrast, contributions by smaller teams search more deeply into the past, are viewed as disruptive to science and technology and succeed further into the future—if at all.
{Jiang21} Song Jiang, Bernard J. Koch, Yizhou Sun. HINTS: Citation Time Series Prediction for New Publications via Dynamic Heterogeneous Information Network Embedding. Proc. of WWW, 2021. We propose HINTS, a novel end-to-end deep learning framework that converts citation signals from dynamic heterogeneous information networks (DHIN) into citation time series. HINTS imputes pseudo-leading values for a paper in the years before it is published from DHIN embeddings, and then transforms these embeddings into the parameters of a formal model that can predict citation counts immediately after publication.
{Rungta22} Mukund Rungta, Janvijay Singh, Saif M. Mohammad, Diyi Yang. Geographic Citation Gaps in NLP Research. Proc. of EMNLP, 2022. We first created a dataset of 70,000 papers from the ACL Anthology, extracted their meta-information, and generated their citation network. We then show that not only are there substantial geographical disparities in paper acceptance and citation but also that these disparities persist even when controlling for a number of variables such as venue of publication and sub-field of NLP.
{Sheng23} Jingran Sheng, Bo Liang, Lin Wang, Xiaofan W. Evolution of scientific collaboration based on academic ages. Physica A, 2023, 624:128846. This paper sets up a scientific research dataset containing over 150 million articles in a wide range of disciplines and fields. After a preliminary cleansing, we select a subset of papers from this dataset with the publication year from 1955 to 2015, including more than 7.9 million authors and 22 million papers. Based on this dataset, we investigate and analyze the scientific collaborative patterns between authors and between countries and regions from the perspective of the authors’ academic age (AA).
{Liu23} Lu Liu, Benjamin F. Jones, Brian Uzzi & Dashun Wang. Data, measurement and empirical methods in the science of science. Nature Human Behaviour, 2023, 7:1046–1058. This Review considers this growing, multidisciplinary literature through the lens of data, measurement and empirical methods. We discuss the purposes, strengths and limitations of major empirical approaches, seeking to increase understanding of the field’s diverse methodologies and expand researchers’ toolkits.
{Wu23} Youyou Wu, Yang Yang and Brian Uzzi. A discipline-wide investigation of the replicability of Psychology papers over the past two decades. PNAS, 2023, 120(6):e2208863120. Using a validated machine learning model that estimates a paper’s likelihood of replication, we found evidence that both supports and refutes speculations drawn from a relatively small sample of manual replications.
{Peng24} Hao Peng, Huilian Sophie Qiu, Henrik Barslund Fosse, and Brian Uzzi. Promotional language and the adoption of innovative ideas in science. PNAS, 2024, 121(25):e2320066121. Using three longitudinal samples of funded and unfunded grant applications from three of the world’s largest funders—the NIH, the NSF, and the Novo Nordisk Foundation—we find that the percentage of promotional language in a grant proposal is associated with the grant’s probability of being funded, its estimated innovativeness, and its predicted levels of citation impact.
{Nandi24} Rabindra Nath Nandi, Suman Kalyan Maity, Brian Uzzi, Sourav Medya. An Experimental Analysis on Evaluating Patent Citations. Proc. of EMNLP, 2024. We create a semantic graph of patents based on their semantic similarities, enabling the use of Graph Neural Network (GNN)-based approaches for predicting citations.
{Gao24} Jian Gao & Dashun Wang. Quantifying the use and potential benefits of artificial intelligence in scientific research. Nature Human Behaviour, 2024, 8:2281–2292. We find that the use and benefits of AI appear widespread throughout the sciences, growing especially rapidly since 2015. However, there is a substantial gap between AI education and its application in research, highlighting a misalignment between AI expertise supply and demand.
{Hill25} Ryan Hill, Yian Yin, Carolyn Stein, Xizhao Wang, Dashun Wang & Benjamin F. Jones. The pivot penalty in research. Nature, 2025, 642:999–1006. We find a pervasive ‘pivot penalty’, in which the impact of new research steeply declines the further a researcher moves from their previous work.
{Yan24} Pengwei Yan, Yangyang Kang, Zhuoren Jiang, Kaisong Song, Tianqianjin Lin, Changlong Sun, and Xiaozhong Liu. 2024. Modeling Scholarly Collaboration and Temporal Dynamics in Citation Networks for Impact Prediction. In Proceedings of SIGIR '24 (short paper). Association for Computing Machinery, New York, NY, USA, 2522–2526. https://doi.org/10.1145/3626772.3657926
{Xue24} Zhikai Xue, Guoxiu He, Zhuoren Jiang, Sichen Gu, Yangyang Kang, Star Zhao, and Wei Lu. 2024. Predicting Scientific Impact Through Diffusion, Conformity, and Contribution Disentanglement. In Proceedings of CIKM '24. Association for Computing Machinery, New York, NY, USA, 2764–2774. https://doi.org/10.1145/3627673.3679546
{Yang23} C. Yang and J. Han, "Revisiting Citation Prediction with Cluster-Aware Text-Enhanced Heterogeneous Graph Neural Networks," 2023 IEEE 39th International Conference on Data Engineering (ICDE), Anaheim, CA, USA, 2023, pp. 682-695, doi: 10.1109/ICDE55515.2023.00058.
{Geng22} Hao Geng, Deqing Wang, Fuzhen Zhuang, Xuehua Ming, Chenguang Du, Ting Jiang, Haolong Guo, and Rui Liu. 2022. Modeling Dynamic Heterogeneous Graph and Node Importance for Future Citation Prediction. In Proceedings of CIKM '22. Association for Computing Machinery, New York, NY, USA, 572–581. https://doi.org/10.1145/3511808.3557398
{Liang16} Ronghua Liang and Xiaorui Jiang. 2016. Scientific ranking over heterogeneous academic hypernetwork. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI'16). AAAI Press, 20–26.
{Zhang23} Yu Zhang, Bowen Jin, Xiusi Chen, Yanzhen Shen, Yunyi Zhang, Yu Meng, and Jiawei Han. 2023. Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '23). Association for Computing Machinery, New York, NY, USA, 3458–3469. https://doi.org/10.1145/3580305.3599544
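
Note on {Guimerà05}: the three parameters in that entry (team size m, incumbent probability p, past-collaborator propensity q) can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation. The function name `assemble_teams`, the fallback to a newcomer when no incumbent is available, and the omission of any agent retirement rule are assumptions of this sketch.

```python
import random
from itertools import combinations


def assemble_teams(n_teams, m, p, q, seed=0):
    """Simplified team-assembly simulation (hypothetical helper).

    m: number of members per team
    p: probability that a member is drawn from the incumbents
    q: probability that an incumbent is drawn from past collaborators
       of members already on the team
    """
    rng = random.Random(seed)
    incumbents = []       # agents that have already been on a team
    collaborators = {}    # agent id -> set of past collaborators
    next_id = 0
    teams = []

    for _ in range(n_teams):
        team = []
        for _ in range(m):
            candidate = None
            # Incumbents not yet on this team.
            pool = [a for a in incumbents if a not in team]
            if pool and rng.random() < p:
                # Past collaborators of current members, excluding members.
                past = sorted(
                    {c for a in team for c in collaborators.get(a, ())}
                    - set(team)
                )
                if past and rng.random() < q:
                    candidate = rng.choice(past)
                else:
                    candidate = rng.choice(pool)
            if candidate is None:
                # Assumption: fall back to a brand-new agent (a newcomer).
                candidate = next_id
                next_id += 1
            team.append(candidate)

        # Register members as incumbents and record the new collaborations.
        for a in team:
            collaborators.setdefault(a, set())
            if a not in incumbents:
                incumbents.append(a)
        for a, b in combinations(team, 2):
            collaborators[a].add(b)
            collaborators[b].add(a)
        teams.append(team)

    return teams, collaborators


if __name__ == "__main__":
    teams, collaborators = assemble_teams(n_teams=200, m=4, p=0.6, q=0.5)
    print("agents:", len(collaborators))
    print("mean degree:",
          sum(len(v) for v in collaborators.values()) / len(collaborators))
```

In this sketch, p = 0 makes every team consist only of newcomers, so the network is a set of disconnected cliques, while raising p and q makes teams reuse existing agents and existing ties.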