The “Vega” of Wuhan University wins world championship
Author: Jiang Zhaoxi    Date: 2022-06-02

The Vega v1 model proposed by the Wuhan University-JD Trusted Artificial Intelligence Joint Research Center has topped the General Language Understanding Evaluation (GLUE) leaderboard, a prestigious global natural language processing benchmark, with an overall average score of 91.3, setting a new world record in this area.

“Vega” makes its stunning debut

The GLUE benchmark, jointly launched by New York University, the University of Washington, the Alphabet subsidiary DeepMind, and other institutions, is considered an important yardstick for measuring progress in natural language processing and pre-training techniques.

In the recently released GLUE rankings, Vega v1, a natural language processing model with a very large parameter scale proposed by the Wuhan University-JD Trusted Artificial Intelligence Joint Research Center, scored 91.3, outperforming submissions from Microsoft, Facebook, and Stanford University and clearly demonstrating Vega v1's leading position in artificial intelligence technology.

GLUE list ranking chart

This mysterious and romantic name comes from α Lyrae, the name of the super-large-scale computing cluster at JD Explore Academy, whose support made training at this scale possible. Vega is the common name of α Lyrae, the brightest star in the constellation Lyra, and the team hoped the Vega v1 model would likewise stand out among pre-training models.

Team members working at their computers

As a general-purpose language model, Vega v1 can be applied to a variety of natural language processing tasks and has a wide range of prospective applications, such as intelligent question answering, chatbots, grammar correction, and autonomous driving. By applying model compression techniques such as pruning and distillation, a lighter version of Vega v1 with far fewer parameters can be obtained and deployed on smart terminals, making people's daily lives more convenient.
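As an illustration, here is a minimal PyTorch sketch of knowledge distillation, one of the compression techniques mentioned above. The layer sizes, temperature, and loss weighting are hypothetical stand-ins, not details of Vega v1 itself.

```python
# Minimal knowledge-distillation sketch: a small "student" learns to match
# a large frozen "teacher". All sizes/hyperparameters here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

temperature = 2.0  # softens logits so the student sees inter-class structure
alpha = 0.5        # balance between soft (teacher) loss and hard-label loss

# Hypothetical teacher (large) and student (small) classifiers.
teacher = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 2))
student = nn.Sequential(nn.Linear(768, 256), nn.GELU(), nn.Linear(256, 2))
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

def distill_step(features, labels):
    """One training step: match the teacher's soft targets and the true labels."""
    with torch.no_grad():
        teacher_logits = teacher(features)
    student_logits = student(features)
    # KL divergence between softened distributions, scaled by T^2 as usual.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard_loss = F.cross_entropy(student_logits, labels)
    loss = alpha * soft_loss + (1 - alpha) * hard_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy batch: 8 feature vectors with binary labels.
print(distill_step(torch.randn(8, 768), torch.randint(0, 2, (8,))))
```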

In addition to the powerful capabilities of the model itself, the team has also adopted many matching and fine-tuning strategies to efficiently update the model's parameters with a small number of annotated samples for specific downstream natural language processing tasks, effectively improving the accuracy of the Vega v1 model.
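One common low-resource strategy, shown here only as a generic sketch rather than the team's actual recipe, is to freeze the large pre-trained backbone and update only a small task head, so that a handful of annotated samples suffice:

```python
# Parameter-efficient fine-tuning sketch: freeze a (pretend) pre-trained
# backbone and train only a small task-specific head on few labeled samples.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(768, 768), nn.GELU())  # stand-in for a pre-trained encoder
head = nn.Linear(768, 2)                                  # small task-specific classifier

for p in backbone.parameters():
    p.requires_grad = False  # keep the pre-trained weights fixed

optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# 16 "annotated samples" standing in for a small downstream dataset.
features = torch.randn(16, 768)
labels = torch.randint(0, 2, (16,))

for epoch in range(10):
    logits = head(backbone(features))
    loss = loss_fn(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(f"final loss: {loss.item():.4f}")
```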

A breakthrough to new heights

A lack of generalization is ubiquitous in artificial intelligence. For instance, each AI task often requires training a dedicated model on a task-specific dataset, and a model that performs well on its own task may not perform as well on other tasks.

To address this challenge and broaden the generality of artificial intelligence, more and more AI systems employ generic pre-trained models. Good results can be achieved by first training a generic model on a large-scale dataset and then fine-tuning it for a specific task, effectively easing the problem of insufficient model generalizability.
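The recipe looks roughly like the sketch below, written with the open-source Hugging Face libraries. Since Vega v1 itself is not publicly released, the generic checkpoint bert-base-uncased stands in for the pre-trained model, and SST-2, one of the nine GLUE tasks, stands in for the downstream task.

```python
# Generic pretrain-then-finetune recipe (downloads weights and data on first run).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# SST-2 (sentiment classification) is one of the nine GLUE tasks.
dataset = load_dataset("glue", "sst2")
encoded = dataset.map(
    lambda batch: tokenizer(batch["sentence"], truncation=True,
                            padding="max_length", max_length=128),
    batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sst2-finetune", num_train_epochs=1,
                           per_device_train_batch_size=32),
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()  # fine-tunes the generic model for the specific task
```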

As a large-scale pre-trained language model, Vega v1 has likewise achieved strong results on a variety of downstream tasks. Compared with other models on the GLUE leaderboard, Vega v1 achieves breakthroughs in several pre-training techniques: an energy-efficient parallelized training framework and data-utilization approach, an innovative model architecture with billions of parameters, and an improved self-supervised pre-training objective that lets the model learn whole-sentence representations at the granularity of words, phrases, and short sentences, yielding multi-granular sentence-level representations. Together, these make the model more competitive.
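The multi-granular idea can be pictured with a toy masking function that hides spans at word, phrase, and short-sentence granularity. The span lengths and masking ratio below are illustrative assumptions, as the actual Vega v1 objective is not public.

```python
# Toy multi-granular span masking: spans of different lengths stand in for
# words, phrases, and short sentences (spans may overlap in this toy version).
import random

GRANULARITIES = {"word": 1, "phrase": 3, "short_sentence": 8}  # span lengths in tokens

def mask_spans(tokens, mask_ratio=0.15, mask_token="[MASK]"):
    """Replace randomly chosen spans of mixed granularity with mask tokens."""
    tokens = list(tokens)
    budget = max(1, int(len(tokens) * mask_ratio))  # tokens to mask in total
    masked = 0
    while masked < budget:
        span_len = GRANULARITIES[random.choice(list(GRANULARITIES))]
        span_len = min(span_len, budget - masked, len(tokens))
        start = random.randrange(0, len(tokens) - span_len + 1)
        for i in range(start, start + span_len):
            tokens[i] = mask_token
        masked += span_len
    return tokens

sentence = "the vega model learns sentence representations at several granularities".split()
print(mask_spans(sentence))
```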

The test results of the Vega v1 model

The GLUE benchmark covers nine major NLP tasks, including natural language inference, semantic textual similarity, and question answering. Human baseline results for each task were published when the benchmark was established, representing human-level performance on each task. With continued research, pre-trained models have come to outperform the human baselines on several GLUE tasks, but on sentiment analysis and co-reference resolution they had still performed more poorly than humans.
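The per-task GLUE scores can be reproduced with the open-source evaluate library; "sst2" below is chosen only as an example task name, and the predictions are dummy values.

```python
# Computing a GLUE task metric with Hugging Face's `evaluate` library.
import evaluate

metric = evaluate.load("glue", "sst2")  # SST-2 uses accuracy
print(metric.compute(predictions=[1, 0, 1], references=[1, 0, 0]))
# -> {'accuracy': 0.666...}
```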

The Vega v1 model not only topped the GLUE leaderboard with the highest overall average score, but also surpassed the human baselines on these two challenging tasks for the first time, taking the intelligence of pre-trained models to a new level.

In the future, the team plans to fully upgrade the Vega v1 model by integrating trusted artificial intelligence and other technologies, enhancing its interpretability, privacy protection, and fairness while continuing to improve its text-understanding capabilities.

The dream makers behind Vega v1

As a research institute jointly established by the WHU Artificial Intelligence Institute, the School of Computer Science, and JD, the Wuhan University-JD Trusted Artificial Intelligence Joint Research Center has published dozens of high-level research papers since its establishment in 2021. It has also won first place both in the Video + Depth track of the ICCV 2021 Benchmarking Multi-Target Tracking Competition and on GLUE, the top benchmark in global natural language processing.

Group photo of the Vega v1 model R&D team

In the process of model training and competition, the team encountered many difficulties. Lacking experience in large-scale model training, they had to learn many things from scratch, and the heavy computing demands of model training posed severe challenges for managing those resources effectively. In the face of these difficulties, the team worked together to analyze problems, debug code, and discuss solutions until the early morning. It is these efforts that allowed the Vega v1 model to be continuously optimized and improved.

Dr. Zhong Qihuang, a core member of the team, believes that both learning and research require composure and dedication: choose a direction, set a timeline, and the rest is hard work and persistence; time will give the final answer. In this way, the team has been able to overcome difficulties and achieve excellence in artificial intelligence research, shining brightly like the star Vega in the sky.

Like Vega shining in the sky,

The Vega v1 model also yields brilliant results

In its own domain.

On the path of pursuing dreams,

May we all keep shining

In the company of stars.

Rewritten by Zhou Chuangyu

Edited by Su Xinyue, Zou Xiaohan, Sylvia, Xi Bingqing


