Text classification is one of the most critical areas of machine learning and artificial intelligence research. One problem in developing text classification models is that their performance depends on the quality of the labels, which are typically produced by humans. In this study, we propose a new network community detection-based approach to automatically label and classify text data into multiclass value spaces. Specifically, we build a network with sentences as network nodes and pairwise cosine similarities between sentences as link weights.
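The network construction described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how sentences are vectorized, so plain bag-of-words counts are assumed here, and the names `cosine_similarity` and `build_sentence_network` are hypothetical. A community detection algorithm of one's choice would then be run on the resulting weighted network.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_sentence_network(sentences):
    """Return {(i, j): weight} with sentences as nodes and cosine similarity as link weight."""
    # Assumption: whitespace tokenization and raw counts stand in for the real features.
    vectors = [Counter(s.lower().split()) for s in sentences]
    edges = {}
    for i, j in combinations(range(len(sentences)), 2):
        w = cosine_similarity(vectors[i], vectors[j])
        if w > 0:  # keep only sentence pairs that share at least one word
            edges[(i, j)] = w
    return edges


sentences = ["the cat sat", "the cat ran", "stocks fell sharply"]
network = build_sentence_network(sentences)
```

In this toy input, only the first two sentences share words, so the network contains a single weighted link between nodes 0 and 1.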
Cognition and Linguistics
Ultraslow diffusion (i.e. logarithmic diffusion) has been extensively studied theoretically, but has hardly been observed empirically. In this presentation, we first show ultraslow-like diffusion in the time series of word counts of already popular words by analysing three different nationwide language databases: (i) newspaper articles (Japanese), (ii) blog articles (Japanese), and (iii) page views of Wikipedia (English, French, Chinese, and Japanese).
Based on a biological model of speciation and a recent application of these ideas to opinion bi-polarization, we develop a simple model of argument persuasion, which allows us to analyze the effects of different world views. In the model, agents exchange beliefs about facts. Agents evaluate these facts and form an attitudinal judgement on an issue through their cultural glasses. Facts may, if believed, contribute positively or negatively to this judgement in a way borrowed from expectancy value theory.
Human decision-making and behavior play significant roles in the introduction, spread, recognition, reporting and containment of new, emerging or foreign diseases and pests. Detection and mitigation strategies against the introduction of disease are commonly termed “biosecurity”.
We propose a new method to understand the origins of new technological domains. To define new technological domains, we rely on the taxonomy provided by the US patent office (USPTO). Because a large number of patents are applied for every year, and because patent officers need an efficient tool to search for prior art when evaluating novelty claims in patent applications, the USPTO has developed and maintained an elaborate taxonomy of inventions.
Individuals in social groups benefit from having accurate information about their group mates. Animals can use different strategies to infer or learn about each other’s underlying quality, such as rules-of-thumb or individual recognition. Few studies have addressed the tradeoffs amongst such strategies. We investigate these tradeoffs using an agent-based model, where animals assess each other using either signal-based assessment strategies or individual recognition.
Predicting innovation is a peculiar problem in data science. Following its definition, an innovation is always a never-seen-before event, making the usual approach of learning patterns from the past a useless exercise. Here we propose a strategy to address the problem in the context of innovative patents, by defining innovation as never-seen-before associations of technologies. We think of technological codes present in patents as a vocabulary and the whole technological corpus as written in a specific, evolving language.
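The notion of innovation as a never-seen-before association of technologies can be illustrated with a small sketch. It only flags code pairs that have not co-occurred in earlier patents and does not attempt the prediction step, which the abstract does not detail. The function names and the codes `A01`, `B02`, `C03` are hypothetical placeholders, not real classification codes.

```python
from itertools import combinations


def seen_pairs(patents):
    """Collect every unordered pair of codes that co-occurs in some earlier patent."""
    pairs = set()
    for codes in patents:
        pairs.update(combinations(sorted(set(codes)), 2))
    return pairs


def novel_pairs(new_patent, history):
    """List the code pairs in new_patent that never co-occurred in history."""
    known = seen_pairs(history)
    candidates = combinations(sorted(set(new_patent)), 2)
    return [pair for pair in candidates if pair not in known]


# Hypothetical toy data: each patent is a set of technology codes.
history = [{"A01", "B02"}, {"B02", "C03"}]
new = {"A01", "B02", "C03"}
novel = novel_pairs(new, history)
```

Here the new patent combines `A01` and `C03`, a pair absent from the history, so that pair is the only association flagged as novel.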
Human consciousness is one of the most fascinating topics in the field of neuroscience. Beyond the theoretical advances that can result from investigations in this area, there are also relevant implications at the clinical level. For instance, a quantitative method for detecting the transition from conscious to unconscious states might provide strong support to clinicians during surgeries (e.g. by giving patients the optimal amount of anesthetic).