Etymo AI newsletter #5

5th October - 18th October 2018

In this newsletter from Etymo, you can find the latest developments in machine learning research, including the most popular datasets, the most frequently appearing keywords with the important research papers associated with them, and the most trending papers of the past two weeks.

If you and your friends like this newsletter, you can subscribe to our fortnightly newsletters here.

1457 new papers

Due to some technical issues, this newsletter was produced one day later than usual. Please accept our apologies, and thank you for your patience. We send out the newsletter every other Monday! Since the last newsletter, Etymo has added 1457 new papers published in the past two weeks.

Fortnight Summary

In this newsletter, we changed the order of the sections to reflect users' interests and feedback.

There was a strong focus on computer vision (CV) in the papers published in the last two weeks, as reflected in the popularity of the CV datasets used. The ranking of the datasets appearing in research papers stayed almost the same as in the last couple of newsletters. Twitter, previously the top non-image-based dataset, also dropped this time.

In the past two weeks, there was a strong surge of interest in Saliency Models (spotting the objects that draw focus or attention in images), as reflected in the "Trending Phrases" section. The popular papers on Saliency Models include a review of the state-of-the-art technologies (Bottom-up Attention, Models of), and two interesting studies on the gap between the current technologies and human performance (Saliency Prediction in the Deep Learning Era: An Empirical Investigation and Invariance Analysis of Saliency Models versus Human Gaze During Scene Free Viewing). There was also emerging interest in motion vectors for video processing, including Combined Static and Motion Features for Deep-Networks Based Activity Recognition in Videos and Fast Semantic Segmentation on Video Using Block Motion-Based Feature Interpolation.

The trending papers of the last two weeks included a milestone NLP paper introducing a new technique, BERT, that exceeds human performance on the SQuAD v1.1 question-answering benchmark (BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding), and two papers on Meta-Learning: a review (Meta-Learning: A Survey) and a new algorithm (Unsupervised Learning via Meta-Learning).

In other areas of machine learning, there were again reviews and summaries of the current state of machine learning (Applications of Deep Reinforcement Learning in Communications and Networking: A Survey, and Machine Learning: Basic Principles). There was also a new method for clustering (Dirichlet Process Parsimonious Mixtures for clustering), a practical framework for wider application of machine learning (Model Cards for Model Reporting), and a new prediction algorithm (A Periodicity-based Parallel Time Series Prediction Algorithm in Cloud Computing Environments). Last but not least, there were interesting studies on selecting optimal subsets from a well-defined set, such as optimal dataset reduction (Finding Average Regret Ratio Minimizing Set in Database) and problem set optimisation (Learning-Theoretic Foundations of Algorithm Configuration for Combinatorial Partitioning Problems).

Popular Datasets

Computer vision is still the main focus area of research. The ranking of the datasets used has not changed much compared to the last newsletter. RGB-D, a large dataset of 300 common household objects in 3D, is near the top for the first time.

Name	Type	Number of Papers
MNIST	Handwritten Digits	62
ImageNet	Image Dataset	31
CIFAR-10	Tiny Image Dataset in 10 Classes	28
COCO	Common Objects in Context	12
Cityscapes	Urban Street Scenes	12
KITTI	Autonomous Driving	11
RGB-D	3D Image Dataset	9
CelebA	Large-scale CelebFaces Attributes	9
SVHN	The Street View House Numbers Dataset	9

Trending Phrases

We have started monitoring trending words/phrases. Below is a list of words/phrases that appeared significantly more often in this newsletter than in the previous newsletters.

Saliency Models

Motion Vectors
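
For readers curious how a phrase might be flagged as trending, here is a minimal sketch in Python. It is not Etymo's actual method, which is not described in this newsletter. It assumes each paper is available as a plain-text string and simply compares the share of papers mentioning a phrase in the current fortnight with the share in the previous one. The function names, the `min_ratio` threshold and the `smoothing` constant are all hypothetical choices made for illustration.

```python
from collections import Counter


def phrase_rates(docs, phrases):
    """Return the fraction of documents that mention each phrase (case-insensitive)."""
    counts = Counter()
    for doc in docs:
        text = doc.lower()
        for phrase in phrases:
            if phrase.lower() in text:
                counts[phrase] += 1
    total = max(len(docs), 1)  # avoid division by zero for an empty period
    return {phrase: counts[phrase] / total for phrase in phrases}


def trending(current_docs, previous_docs, phrases, min_ratio=2.0, smoothing=0.01):
    """List phrases whose mention rate grew by at least min_ratio (assumed threshold).

    The smoothing constant keeps the ratio finite when a phrase did not
    appear at all in the previous period.
    """
    current = phrase_rates(current_docs, phrases)
    previous = phrase_rates(previous_docs, phrases)
    scored = []
    for phrase in phrases:
        ratio = (current[phrase] + smoothing) / (previous[phrase] + smoothing)
        if ratio >= min_ratio:
            scored.append((phrase, ratio))
    # Largest relative increase first
    return [phrase for phrase, _ in sorted(scored, key=lambda item: -item[1])]


if __name__ == "__main__":
    # Toy documents, invented purely for this example
    this_fortnight = [
        "saliency models for images",
        "deep saliency models",
        "motion vectors in video",
        "image classification",
    ]
    last_fortnight = [
        "image classification",
        "object detection",
        "image classification on mnist",
        "speech recognition",
    ]
    candidates = ["saliency models", "motion vectors", "image classification"]
    print(trending(this_fortnight, last_fortnight, candidates))
```

A ratio of mention rates is only one possible definition of "significantly more"; a statistical test over several past newsletters would be a reasonable alternative.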

Etymo Trending

Presented below is a list of the most trending papers added in the last two weeks.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding:
Authors from Google AI produced this milestone paper in NLP, introducing Bidirectional Encoder Representations from Transformers (BERT). BERT achieves new state-of-the-art results on eleven natural language processing tasks, including SQuAD v1.1 question answering, where it outperforms human performance by 2.0%. It is worth noting that training the base model (BERT-Base) took 4 days on 4 Cloud TPUs, while training the large model (BERT-Large) took 4 days on 16 Cloud TPUs.

Meta-Learning: A Survey:
This 29-page paper provides an overview of the state-of-the-art technologies in the realm of meta-learning.

Unsupervised Learning via Meta-Learning:
This 21-page paper presents an unsupervised learning method that explicitly optimizes for the ability to learn a variety of tasks from small amounts of data. When integrated with meta-learning, relatively simple mechanisms for task design, such as clustering unsupervised representations, lead to good performance on a variety of downstream tasks.

Frequent Words

"Learning", "Model", "Data" and "Set" are the most frequent words. The top two papers associated with each of these keywords are:

Model

Learning

Data

Set