Overview
Nanjing University is ranked third in the C9 League, an official alliance of nine top universities in mainland China. The Learning and Mining from DatA (LAMDA) Group is affiliated with the National Key Laboratory for Novel Software Technology (http://keysoftlab.nju.edu.cn/) and the Department of Computer Science and Technology (http://cs.nju.edu.cn/) at Nanjing University (http://www.nju.edu.cn/). The Founding Director of LAMDA is Prof. Zhi-Hua Zhou (http://cs.nju.edu.cn/zhouzh).
The main research interests of the LAMDA Group include machine learning, data mining, pattern recognition, information retrieval, evolutionary computation, neural computation and other related areas.
Supernova Award Category
Data to Decisions
The Problem
Constrained by the characteristics of neural network algorithms, deep learning does not deliver optimal training results on discrete, non-continuously differentiable data. It has other shortcomings as well, including the need for huge amounts of labeled data, the difficulty of theoretical analysis and over-reliance on hyperparameters. In view of this, AI scientists continue to search for new research methods.
Deep learning, predominantly based on the Deep Neural Network (DNN), has made significant progress in AI applications including image, video and voice, which have been great boosters for many industries. Data shows that by 2023, AI’s industrial value is expected to hit US$14.2 billion. But amid the fanfare, one question remains: does DNN represent the future of AI technology?
The answer is unquestionably “no.” Feedback from real applications shows that DNN is not dominant on non-continuously differentiable data. Kaggle (http://www.kaggle.com/competitions) competitions have likewise shown that DNN does not achieve ideal results on discrete data. On top of that, DNN faces several problems, including the massive amount of labeled data it needs for training, complicated theoretical analysis and stringent demands for parameter tuning.
The Solution
Professor Zhi-Hua Zhou and his team at the LAMDA Group proposed an all-new method, Deep Forest, which uses a multilayered ensemble of decision tree forests to open up a new path. Zhou put forward this view: when an AI training approach satisfies three conditions (layer-by-layer processing, feature transformation inside the model and adequate model complexity), it can be on a par with, or even outperform, DNN. Based on this view, he and his team successively published three heavyweight papers outlining new AI training methods and a deep learning blueprint.
In lab and real-world demonstrations, Deep Forest has been shown to obtain better results than DNN in application scenarios such as financial data analysis and sentiment classification, where discrete, hybrid or symbolic modeling is required.
The results
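Before the fraud-detection results, it may help to make the layer-by-layer idea from The Solution concrete. The following is a minimal sketch, not the LAMDA implementation: it assumes scikit-learn and NumPy are available, uses a synthetic dataset, and fixes the number of layers and forests arbitrarily. The helper names (make_layer, fit_cascade, predict_proba_cascade) are hypothetical. Each layer trains two kinds of forest, and the class probabilities they produce are appended to the original features as input to the next layer, which is one way to obtain layer-by-layer processing with feature transformation inside the model.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict, train_test_split


def make_layer(seed):
    """One cascade layer: two different kinds of forest (an illustrative choice)."""
    return [
        RandomForestClassifier(n_estimators=50, random_state=seed),
        ExtraTreesClassifier(n_estimators=50, random_state=seed),
    ]


def fit_cascade(X, y, n_layers=3, seed=0):
    """Fit a fixed-depth cascade of forest layers and return the fitted layers."""
    layers, features = [], X
    for depth in range(n_layers):
        forests = make_layer(seed + depth)
        # Out-of-fold class probabilities become extra features for the next
        # layer; using out-of-fold predictions limits label leakage.
        class_vectors = [
            cross_val_predict(f, features, y, cv=3, method="predict_proba")
            for f in forests
        ]
        for f in forests:
            f.fit(features, y)
        layers.append(forests)
        # Next layer sees the original features plus this layer's class vectors.
        features = np.hstack([X] + class_vectors)
    return layers


def predict_proba_cascade(layers, X):
    """Pass samples through every layer and average the last layer's forests."""
    features = X
    for forests in layers:
        class_vectors = [f.predict_proba(features) for f in forests]
        features = np.hstack([X] + class_vectors)
    return np.mean(class_vectors, axis=0)


# Synthetic binary classification data, used purely for illustration.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

layers = fit_cascade(X_train, y_train)
proba = predict_proba_cascade(layers, X_test)
accuracy = float(np.mean(proba.argmax(axis=1) == y_test))
print(f"toy accuracy: {accuracy:.2f}")
```

The published Deep Forest work describes further components that this sketch leaves out, such as deciding the number of layers from validation performance, so it should be read only as an illustration of the cascade structure.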
A close collaboration with an Internet financial services giant showed that Deep Forest performs well on large-scale financial risk control tasks, helping users avoid unnecessary economic losses. In jointly published results, Deep Forest was validated on the “automatic detection of cash-out fraud.” The two teams built the training dataset by sampling users with payment activity in O2O transactions over a period of a few months, and used data captured over the following few months as the test dataset.
Ultimately, hundreds of the most significant features were selected for the Deep Forest training process. After training on the massive sample set and comparing against logistic regression, DNN and Multiple Additive Regression Trees (MART), the results show that Deep Forest outperforms the other models whether measured by recall or by precision. We can infer that Deep Forest can help finance companies build better anti-fraud risk control solutions for detecting cash-out fraud and effectively lower economic losses. The value of Deep Forest has likewise been verified in application scenarios at several securities and financial services companies. We believe that as the model and algorithms are further refined and optimized, Deep Forest will play an increasingly important part in AI applications across more industries and fields.
Metrics
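For readers less familiar with the two measures used in the model comparison, the sketch below computes precision and recall for a binary fraud label using only the Python standard library. The labels and the two sets of predictions are invented toy values, not data from the study, and the names model_a and model_b are hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 marks fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate but flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud that was missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy ground truth and predictions (invented for illustration only).
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
predictions = {
    "model_a": [1, 0, 0, 0, 0, 1, 1, 0, 0, 0],
    "model_b": [1, 1, 0, 1, 0, 1, 1, 0, 0, 0],
}

for name, y_pred in predictions.items():
    precision, recall = precision_recall(y_true, y_pred)
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

In this toy data the second model catches every fraud case but flags more legitimate users, which is why a comparison between models is usually reported on both measures rather than one.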
Deep Forest is used for automatic detection of cash-out fraud at Ant Financial Services Group, the #1 online financial services company in the world. The results can be found in the paper Distributed Deep Forest and its Application to Automatic Detection of Cash-out Fraud (https://arxiv.org/abs/1805.04234), 16 Mar 2020. Deep Forest is also used by Envision Energy, a green technology company, to detect mechanical defects in wheel gears.
The Technology
With the strong computing power of Intel Xeon Scalable processors, the potential of Deep Forest is being explored and developed in great depth. Intel® Advanced Vector Extensions 512 (Intel® AVX-512) also helps Deep Forest perform multitask parallel processing. Intel continuously optimizes its software, compilers and other tools using feedback from LAMDA.
Disruptive Factor
As a leading AI research team, LAMDA is well aware of the importance of high-performance hardware and processors to AI research. When it started work on Deep Forest, LAMDA began working closely with Intel on an in-depth technological exchange. Besides leveraging Intel’s deep optimization of processors, compilers and instruction sets to improve the efficiency and quality of training, the team also optimized Deep Forest’s algorithms and model designs for the hardware infrastructure so that they better meet the needs of industry. In the near future, optimized technologies for Deep Forest models and algorithms may be integrated into new Intel processor instruction sets or other Intel software and hardware products, which would further strengthen support for applying Deep Forest in industries such as financial services, manufacturing and energy.
Shining Moment
To improve collaboration, Intel and Nanjing University established an Intel Parallel Computing Center in September 2018 (https://software.intel.com/content/www/us/en/develop/articles/intel-parallel-computing-center-at-lamda-group-nanjing-university.html). It was one of the first AI-oriented research centers set up under the Intel Parallel Computing Centers program. We believe that further integration of LAMDA’s groundbreaking research and Intel’s leading computing platforms is bound to spark more AI innovations and help China’s AI industry achieve substantial transformation.