Etymo AI newsletter #1

10th August - 23rd August 2018

1078 new papers

Etymo added 1078 new papers published in the past two weeks. On average, each of these papers has 3.9 authors.

The table below indicates the number of papers added by Etymo each day:

10/08/2018	11/08/2018	12/08/2018	13/08/2018	14/08/2018	15/08/2018	16/08/2018
79	51	43	100	133	91	106

17/08/2018	18/08/2018	19/08/2018	20/08/2018	21/08/2018	22/08/2018	23/08/2018
88	37	42	107	104	89	8
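As a quick sanity check on the table above, the daily counts can be tallied in a few lines of Python. This is only a sketch: the variable names are illustrative and not part of any Etymo tooling, and the numbers are copied by hand from the table.

```python
# Daily paper counts copied from the table above (10/08/2018 to 23/08/2018).
daily_counts = [79, 51, 43, 100, 133, 91, 106, 88, 37, 42, 107, 104, 89, 8]

total = sum(daily_counts)                  # should match the headline figure
mean_per_day = total / len(daily_counts)   # average papers added per day
busiest = max(daily_counts)                # largest single-day count

print(f"total={total}, mean per day={mean_per_day:.1f}, busiest day={busiest}")
```

The total agrees with the 1078 papers reported at the top of the newsletter.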

Fortnight Summary

Research published in the last two weeks shows a strong focus on computer vision (CV), as reflected in the popularity of the CV datasets used. Interest in CV can be subdivided into handwriting recognition, autonomous driving, face detection, general object classification, and the handling of low-resolution or blurred images.

In other areas of machine learning, there are good developments in both speech synthesis (Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks) and natural language processing (NLP) (SeVeN: Augmenting Word Embeddings with Unsupervised Relation Vectors).

There are also new insights into more general machine learning techniques and understanding, such as a better model selection technique (Model Selection via the VC-Dimension), a new understanding of black box machine learning (Shedding Light on Black Box Machine Learning Algorithms: Development of an Axiomatic Framework to Assess the Quality of Methods that Explain Individual Predictions), learning from small samples (Small Sample Learning in Big Data Era) and a new method of outlier detection (Outlier Detection by Consistent Data Selection Method).

Popular Datasets

Computer vision is still the main focus area of research. MNIST and ImageNet were the most heavily used datasets in the recent papers, while LFW was used by seven papers.

Name	Type	Number of Papers
MNIST	Handwritten digits	35
ImageNet	Images	27
Cityscapes	Urban street scene	15
CIFAR-10	Tiny images	13
COCO	Common objects in context	11
KITTI	Autonomous driving	8
RGB-D	Household objects	7
VOC	Object class recognition	7
LFW	Face detection	7

Frequent Words

"Model", "learning" and "data" are the most frequent words. Below is a word cloud of all keywords from the last two weeks' papers:

The top two papers associated with each of the keywords are:

Model

Learning

Data

Etymo Trending

Below is a list of the top trending papers added in the last two weeks. The ranking is constructed from a combination of Etymo stars and tweets. Please star your favourites on Etymo Scholar to influence the results.

Evaluating and Understanding the Robustness of Adversarial Logit Pairing:
This 17-page paper hypothesises that representation learning algorithms should be evaluated in terms of information content rather than on a pixel level. The authors define a new method called Deep INFOMAX (DIM) and compare it with variational autoencoders (VAE), adversarial autoencoders (AAE), BiGAN and Noise As Targets (NAT). The comparison between methods uses the CIFAR10, CIFAR100, Tiny ImageNet and STL-10 datasets.

Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks:
This concise 6-page paper proposes a multi-head convolutional neural network (MCNN) architecture for waveform synthesis. The model is compared with the Griffin-Lim (GL) algorithm and the single-pass spectrogram inversion (SPSI) algorithm on the LibriSpeech dataset.

SeVeN: Augmenting Word Embeddings with Unsupervised Relation Vectors:
This 14-page paper presents SeVeN (Semantic Vector Networks), an algorithm that encodes relationships between words in the form of a graph. The authors compare their embeddings in several ways with the word2vec Google News embeddings.

We hope you have enjoyed our very first newsletter! If you have any comments or suggestions, please email ernest@etymo.io or steven@etymo.io. If you're having trouble with this email, you can read this newsletter online here. The source for all our newsletters can be found on GitHub.