newsletter.etymo

25th January - 7th February 2019

1779 new papers

In this newsletter from Etymo, you can find out about the latest developments in machine learning research, including the most popular datasets, the most frequently appearing keywords, the important research papers associated with those keywords, and the most trending papers of the past two weeks.

If you and your friends like this newsletter, you can subscribe to our fortnightly newsletters here.

Fortnight Summary

1779 papers were published in the past two weeks. Computer vision (CV) is still a main research area, as reflected in the popularity of the CV datasets and the most trending papers.

We present emerging research interests under the "Trending Phrases" section. The papers in this section show some cutting-edge results. There are four good papers, related to Hyperbox-based Machine Learning, CycleGAN, Exact Bayesian Inference, and Reduced Basis respectively.

Other notable developments in research include the following:

  • Models using lightweight and dynamic convolutions that are more efficient than self-attention models: Pay Less Attention with Lightweight and Dynamic Convolutions
  • Deep generative models for classification under ultra-sparse categorisation of the training data: Semi-Unsupervised Learning with Deep Generative Models: Clustering and Classifying using Ultra-Sparse Labels
  • A high-level review of general linguistic intelligence and the practices of existing language models: Learning and Evaluating General Linguistic Intelligence
  • A new generation of learning algorithms for recurrent neural networks: Biologically inspired alternatives to backpropagation through time for learning in recurrent neural nets
  • A study on the repeated selection (RS) choice representation: Choosing to Rank
  • A kernel approach to model selection for simulator-based statistical models: Model Selection for Simulator-based Statistical Models: A Kernel Approach
  • A new ensemble method for heterogeneous transfer learning (including code and datasets): Funnelling: A New Ensemble Method for Heterogeneous Transfer Learning and its Application to Polylingual Text Classification
  • The use of supervised machine learning (ML) as a tool for more accurate cosmic shear measurement: Weak-lensing shear measurement with machine learning: teaching artificial neural networks about feature noise

Some of the notable review papers include:

  • A high-bias, low-variance introduction to Machine Learning for physicists
  • Deep Learning in Mobile and Wireless Networking: A Survey
  • A survey of state-of-the-art mixed data clustering algorithms

Popular Datasets

Computer vision is still the main focus area of research.

Name      | Type                                   | Number of Papers
----------|----------------------------------------|-----------------
MNIST     | Handwritten Digits                     | 96
ImageNet  | Image Dataset                          | 52
CIFAR-10  | Tiny Image Dataset in 10 Classes       | 48
COCO      | Common Objects in Context              | 14
SVHN      | The Street View House Numbers Dataset  | 12
KITTI     | Autonomous Driving                     | 11
CelebA    | Large-scale CelebFaces Attributes      | 10
CIFAR-100 | Tiny Image Dataset in 100 Classes      | 10

Trending Phrases

In this section, we present a list of phrases that appeared significantly more often in this newsletter than in previous newsletters:

  • Hyperbox-based Machine Learning
  • CycleGAN
  • Exact Bayesian Inference
  • Reduced Basis

Etymo Trending

Presented below is a list of the most trending papers added in the last two weeks.

  • Pay Less Attention with Lightweight and Dynamic Convolutions:
    The authors show that a very lightweight convolution can perform competitively with the best reported self-attention results. In the paper, they introduce dynamic convolutions, which are simpler and more efficient than self-attention (a minimal sketch of the idea follows this list).

  • Semi-Unsupervised Learning with Deep Generative Models: Clustering and Classifying using Ultra-Sparse Labels:
    Semi-unsupervised learning is an extreme case of semi-supervised learning with ultra-sparse categorisation, where some classes are sparsely labelled and other classes appear only as unlabelled data in the training set (see the second sketch after this list). The paper introduces two deep generative models for classification in this regime that extend previous deep generative models designed for semi-supervised learning.

  • Learning and Evaluating General Linguistic Intelligence:
    The authors define general linguistic intelligence as the ability to reuse previously acquired knowledge about a language's lexicon, syntax, semantics, and pragmatic conventions to adapt to new tasks quickly. They analyze state-of-the-art natural language understanding models and conduct an extensive empirical investigation. In conclusion, many models require a lot of in-domain training examples (e.g., for fine-tuning or training task-specific modules) and are prone to catastrophic forgetting. In addition, models mostly overfit to the quirks of particular datasets.
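To make the dynamic-convolution idea concrete, here is a minimal PyTorch sketch of a per-timestep, softmax-normalised depthwise convolution in the spirit of Pay Less Attention with Lightweight and Dynamic Convolutions. The module name DynamicConv1d, the tensor shapes, and the default head and kernel sizes are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn.functional as F
    from torch import nn

    class DynamicConv1d(nn.Module):
        """Sketch of a dynamic convolution: the kernel at each timestep is
        predicted from that timestep's input (an assumption-laden toy, not
        the paper's released code)."""

        def __init__(self, dim, kernel_size=3, heads=4):
            super().__init__()
            assert dim % heads == 0 and kernel_size % 2 == 1
            self.kernel_size = kernel_size
            self.heads = heads
            # The "dynamic" part: predict one kernel per head from the
            # current timestep alone, instead of learning a fixed kernel.
            self.kernel_proj = nn.Linear(dim, heads * kernel_size)

        def forward(self, x):
            # x: (batch, time, dim)
            b, t, d = x.shape
            h, k = self.heads, self.kernel_size
            # Normalise kernel weights over the kernel width with softmax.
            kernels = self.kernel_proj(x).view(b, t, h, k).softmax(dim=-1)
            # Gather a length-k window around every timestep.
            pad = k // 2
            windows = F.pad(x, (0, 0, pad, pad)).unfold(1, k, 1)  # (b, t, d, k)
            windows = windows.reshape(b, t, h, d // h, k)
            # Weight sharing across the d/h channels of each head is what
            # keeps the layer lightweight relative to a full convolution.
            out = torch.einsum('bthck,bthk->bthc', windows, kernels)
            return out.reshape(b, t, d)

    # Tiny usage check: a (batch=2, time=10, dim=64) input maps to the same shape.
    layer = DynamicConv1d(dim=64, kernel_size=3, heads=4)
    print(layer(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])

Because each kernel depends only on a fixed-width local window, the cost grows linearly with sequence length, in contrast to the quadratic cost of self-attention.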
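The semi-unsupervised regime is easiest to see in data form. Below is a small NumPy sketch that builds such a training split; the class counts and the choice of which classes keep labels are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy dataset: 5 classes, 100 examples each.
    n_classes, n_per_class = 5, 100
    y = np.repeat(np.arange(n_classes), n_per_class)

    labelled_classes = {0, 1}   # only these classes ever receive labels
    labels_per_class = 10       # "ultra-sparse": 10 labels out of 100

    observed = np.zeros_like(y, dtype=bool)
    for c in labelled_classes:
        idx = rng.choice(np.where(y == c)[0], size=labels_per_class, replace=False)
        observed[idx] = True

    y_train = np.where(observed, y, -1)   # -1 marks an unlabelled example
    # Classes 2-4 appear only as unlabelled data: a model in this regime
    # must cluster them while also classifying the sparsely labelled 0-1.
    print((y_train >= 0).sum(), "labelled of", y_train.size, "examples")  # 20 of 500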

Frequent Words

"Learning", "Model", "Data" and "Training" are the most frequent words. The top two papers associated with each of the keywords are:

Hope you have enjoyed this newsletter! If you have any comments or suggestions, please email ernest@etymo.io or steven@etymo.io.