64th ISI World Statistics Congress - Ottawa, Canada

Big Data Analysis of Scientific Networks: Methods and Insights


Frederick Kin Hing Phoa


  • DRS Jing-Wen Huang
  • Thorsten Koch
  • Mr Frederick Kin Hing Phoa
  • Prof. Dr. Junji Nakano
  • Prof. Hohyun Jung
  • Category: International Association for Statistical Computing (IASC)


    In the era of big data, citation information is gathered from large-scale databases and represented in network form. Citation network analysis is a quantitative method for identifying the important and influential literature of a field based on how often a publication is cited in other publications. This analysis has recently become an essential tool for evaluating scientific achievement across different entities, including research articles, individual researchers, scientific journals, international conferences, research institutes, and many others. This invited session gathers scientometrics scholars to showcase their statistical methods and metrics for analyzing such scientific networks and to share the findings and insights from their analyses. The topics include network generative mechanisms, underlying model assumptions, statistical metrics, and data visualization.
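
    As a minimal illustration of the citation-count idea described above (not a method presented in the session), the sketch below counts how often each paper is cited in a small directed citation list and ranks papers by that count. The paper labels and the helper names `in_degrees` and `rank_by_citations` are hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical toy data: each pair is (citing paper, cited paper).
citations = [
    ("P2", "P1"),
    ("P3", "P1"),
    ("P3", "P2"),
    ("P4", "P1"),
    ("P4", "P3"),
]


def in_degrees(edges):
    """Return a dict mapping each paper to the number of times it is cited."""
    counts = Counter(cited for _citing, cited in edges)
    # Papers that cite others but are never cited get an explicit count of 0.
    for citing, _cited in edges:
        counts.setdefault(citing, 0)
    return dict(counts)


def rank_by_citations(edges):
    """Sort papers by citation count (descending), breaking ties by label."""
    degrees = in_degrees(edges)
    return sorted(degrees.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_by_citations(citations)
for paper, count in ranking:
    print(paper, count)
```

    A raw count like this treats every citation equally and ignores field, age, and network position, which is part of the motivation for the more refined metrics discussed in the session.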

    Specifically, Professor Junji Nakano will talk about "A Stochastic Generative Model for Citation Networks among Academic Papers". This work proposes a stochastic generative model to represent the directed graph formed by citations among academic papers. It approximates the importance of a cited paper by its in-degree. Using the Web of Science database to examine the features of the model, the work generates simulated graphs and demonstrates their similarity to the original data.

    Professor Hohyun Jung will talk about "Weighted Evolving Hypergraph Model with Preferential Attachment". This work proposes a weighted evolving hypergraph model with preferential attachment. The model allows variability in the number and size of the hyperedges to be connected. The degree distribution of the model can be expressed as a mixture of the degree distributions obtained with a fixed number of hyperedges to be connected. The model is applied to the analysis of the Web of Science.

    Professor Thorsten Koch will talk about "Article's Scientific Prestige: Measuring the Impact of Individual Articles in the Web of Science". This work performs a citation analysis of Web of Science publications, comprising more than 63 million articles and 1.45 billion citations across 254 subjects from 1981 to 2020, and proposes the Article's Scientific Prestige (ASP) metric for measuring the scientific impact of individual articles in this large-scale, hierarchical, multidisciplinary citation network. ASP tends to provide more persuasive rankings than existing metrics when the articles are not highly cited. Journal grade, by contrast, is unable to properly reflect the scientific impact of individual articles.

    Professor Frederick Kin Hing Phoa will talk about "A New Planetarium Mode to Visualize Ego-Centric Networks and its Applications to Large-Scale Scientific Citation and Collaboration Networks". This work introduces an efficient three-step approach to optimally allocate alter nodes uniformly on the surface of a concentric sphere, taking into account the existing edges among alter nodes and avoiding overlap between node clusters. It is applied to the visualization of author collaboration networks. An additional patch is then introduced to handle networks with directed edges between the ego node and its alter nodes, and it is applied to the visualization of scientific citation networks.
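
    To give a rough sense of what placing alter nodes uniformly on a sphere around an ego node can look like, the sketch below uses a Fibonacci (golden-angle) lattice. This is a swapped-in, generic technique and not the three-step approach of the talk: it ignores the edges among alter nodes and any node clusters. The function name `fibonacci_sphere`, the node labels, and the radius are assumptions made for illustration, with the ego node taken to sit at the origin.

```python
import math


def fibonacci_sphere(n, radius=1.0):
    """Return n roughly evenly spaced (x, y, z) points on a sphere.

    Generic golden-angle lattice; it does not use any network structure.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # Spread the heights evenly over the open interval (-1, 1).
        z = 1.0 - (2.0 * i + 1.0) / n
        ring = math.sqrt(max(0.0, 1.0 - z * z))  # radius of the circle at height z
        theta = golden_angle * i                 # rotate by the golden angle each step
        points.append((
            radius * ring * math.cos(theta),
            radius * ring * math.sin(theta),
            radius * z,
        ))
    return points


# Hypothetical alter nodes of an ego placed at the origin (0, 0, 0).
alters = ["A", "B", "C", "D", "E", "F"]
layout = dict(zip(alters, fibonacci_sphere(len(alters), radius=2.0)))
for name, (x, y, z) in layout.items():
    print(f"{name}: ({x:.3f}, {y:.3f}, {z:.3f})")
```

    A layout that respects the edges among alters, as the talk describes, would additionally need to decide which alter goes to which position, so that connected alters and clusters end up near each other without overlapping.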