Time: Thursday, October 30, 1:30 p.m.
Venue: Room 1110, 11th Floor, 吕大龙楼 (Lü Dalong Building)
Talk 1: Brain network science modelling of sparse neural networks enables Transformers and LLMs to perform as fully connected
Speaker: Diego Cerretti
Abstract:
This study aims to enlarge our current knowledge of how brain-inspired network science principles can be applied to training artificial neural networks (ANNs) with sparse connectivity. Dynamic sparse training (DST) emulates the synaptic turnover of real brain networks, reducing the computational demands of training and inference in ANNs. However, existing DST methods struggle to maintain peak performance at high connectivity sparsity levels. Cannistraci-Hebb training (CHT) is a brain-inspired DST method for growing synaptic connectivity in sparse neural networks. CHT leverages a gradient-free, topology-driven link regrowth mechanism, which has been shown to achieve an ultra-sparse (1% connectivity or lower) advantage over fully connected networks across various tasks. Yet CHT suffers from two main drawbacks: (i) its time complexity is O(Nd³), where N is the network size in nodes and d is the node degree, so it can be efficiently applied only to ultra-sparse networks; (ii) it rigidly selects the top link-prediction scores, which is inappropriate for the early training epochs, when the network topology contains many unreliable connections. Here, we design the first brain-inspired network model, termed bipartite receptive field (BRF), to initialize the connectivity of sparse artificial neural networks. We then propose a GPU-friendly, matrix-multiplication-based approximation of the CH link predictor, which reduces the computational complexity to O(N³) and enables fast link prediction in large-scale models. Moreover, we introduce the Cannistraci-Hebb training soft rule (CHTs), which adopts a flexible strategy for sampling connections in both link removal and regrowth, balancing exploration and exploitation of the network topology. Finally, we propose a sigmoid-based gradual density decay strategy, leading to an advanced framework referred to as CHTss.
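The sigmoid-based gradual density decay mentioned above could be sketched as follows. The exact parameterization used in CHTss is not given in the abstract, so the endpoints `d_start` and `d_final` and the steepness `k` are illustrative assumptions:

```python
import math

def sigmoid_density_schedule(step, total_steps, d_start=1.0, d_final=0.05, k=10.0):
    """Hypothetical sigmoid-shaped density decay from d_start to d_final.

    step        -- current training step
    total_steps -- number of steps over which density decays
    k           -- steepness of the sigmoid transition (assumed value)
    """
    t = step / total_steps
    # Sigmoid centered at the midpoint of training: goes from ~1 to ~0.
    s = 1.0 / (1.0 + math.exp(k * (t - 0.5)))
    # Normalize so the schedule hits d_start and d_final exactly at the endpoints.
    s0 = 1.0 / (1.0 + math.exp(-k * 0.5))
    s1 = 1.0 / (1.0 + math.exp(k * 0.5))
    s_norm = (s - s1) / (s0 - s1)
    return d_final + (d_start - d_final) * s_norm
```

The sigmoid shape keeps the network dense early (when gradients are most informative) and sparsifies it smoothly toward the target density, in contrast to an abrupt one-shot pruning step.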
Empirical results show that BRF offers performance advantages over previous network science models. Using 1% of the connections, CHTs outperforms fully connected networks in MLP architectures on visual classification tasks, compressing some networks to fewer than 30% of the nodes. Using 5% of the connections, CHTss outperforms fully connected networks in two Transformer-based machine translation tasks. Finally, with only 30% of the connections, both CHTs and CHTss achieve superior performance over other dynamic sparse training methods, and perform on par with, or even surpass, their fully connected counterparts in language modeling across various sparsity levels within the LLaMA model family.
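To give a concrete picture of the gradient-free, topology-driven regrowth described in this abstract, here is a simplified sketch: a length-3 path counter on a bipartite layer mask, computed with matrix multiplications. This is not the exact CH link predictor used by CHT/CHTs, only a hypothetical stand-in that shows why such scores are GPU-friendly:

```python
import numpy as np

def l3_path_scores(A):
    """Score candidate links in a bipartite layer by length-3 path counts.

    A is the (n_in x n_out) binary adjacency mask of a sparse layer.
    (A @ A.T @ A)[i, j] counts length-3 paths i -> j; here it serves as a
    simplified, matrix-multiplication-friendly proxy for the CH predictor.
    """
    A = A.astype(float)
    scores = A @ A.T @ A
    scores[A > 0] = -np.inf  # only score links that do not yet exist
    return scores

def regrow_top_k(A, k):
    """Regrow the k non-existing links with the highest path scores."""
    scores = l3_path_scores(A)
    flat = np.argsort(scores, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat, A.shape)
    new = A.copy()
    new[rows, cols] = 1
    return new
```

The soft rule (CHTs) would replace the hard top-k selection in `regrow_top_k` with probabilistic sampling proportional to the scores, trading exploitation for exploration early in training.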
Talk 2: A generalized logistic-logit function and its application to multi-layer perceptron and neuron segmentation
Speaker: 谷文祺
Abstract:
Logistic and logit functions play important roles in modern science, serving as foundational tools in various applications, including artificial neural networks (ANNs). While there are functions that produce distinct logistic and logit curves, no single, unified framework has been developed to generate both. We introduce a generalized logistic-logit function (CMG-GLLF) to fill this gap. CMG-GLLF provides four interpretable and trainable parameters that allow explicit control over curve type and steepness, asymmetry, and the upper and lower limits of the x- and y-axes. We explore CMG-GLLF's potential in basic machine intelligence tasks. We propose a trainable input feature modulator (IFM) for the multi-layer perceptron (MLP) that consists of learning the parameters of the CMG-GLLF for each input-layer node during backpropagation, giving the MLP superior accuracy and faster learning in image classification. Furthermore, CMG-GLLF used as a data transformation enhances the accuracy of affinity-graph-based neuron segmentation. CMG-GLLF combines in a unique framework the ability of the logistic and logit functions to modulate signals or variables, covering a full spectrum of attenuation or amplification transformations. CMG-GLLF is flexible and trainable, has the potential to advance machine learning models, and can inspire further applications to other data analysis challenges across domains of science.
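The exact form of CMG-GLLF is not given in the abstract. As a generic illustration of a four-parameter logistic with a closed-form inverse (logit), here is a Richards-style curve where `lower`/`upper` set the y-axis limits, `steepness` sets the slope, and `asymmetry` skews the curve around its midpoint; all names and the parameterization are assumptions, not the authors' definition:

```python
import math

def generalized_logistic(x, lower=0.0, upper=1.0, steepness=1.0, asymmetry=1.0):
    """Four-parameter generalized logistic curve (hypothetical stand-in
    for CMG-GLLF). With defaults, it reduces to the standard sigmoid."""
    return lower + (upper - lower) / (1.0 + math.exp(-steepness * x)) ** (1.0 / asymmetry)

def generalized_logit(y, lower=0.0, upper=1.0, steepness=1.0, asymmetry=1.0):
    """Exact inverse of generalized_logistic for y in (lower, upper)."""
    z = ((upper - lower) / (y - lower)) ** asymmetry - 1.0
    return -math.log(z) / steepness
```

Because the pair is mutually inverse, the same parameter set can be trained once and applied in either direction, which is the kind of unified logistic/logit behavior the abstract describes; in an IFM-like setting, the four parameters would be made trainable per input node.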