Staying on top of recent work is an important part of being a good researcher, but this can be quite difficult. Thousands of new papers are published every year at the main ML and NLP conferences, not to mention all the specialised workshops and everything that shows up on arXiv. The only filter that I applied was to exclude papers older than 2016, as the goal is to give an overview of the more recent work. Once I was done, I thought this would be a sensible place to summarise my own work as well, so at the end of the list you will also find brief summaries of the papers I published in 2017.

A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task
Danqi Chen, Jason Bolton, Christopher D. Manning.
They hand-reviewed 100 samples from the dataset and conclude that around 25% of the questions are difficult or impossible to answer even for a human, mostly due to the anonymisation process. They present a simple classifier that achieves unexpectedly good results, and a neural network based on attention that beats all previous results by quite a margin.

Word Translation Without Parallel Data
Alexis Conneau, Guillaume Lample, Marc’Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou.
https://arxiv.org/pdf/1710.04087
Inducing word translations using only monolingual corpora for two languages. Separate embeddings are trained for each language, and a mapping between the two spaces is learned through an adversarial objective, along with an orthogonality constraint on the most frequent words.

Variational Neural Machine Translation
Biao Zhang, Deyi Xiong, Jinsong Su, Hong Duan, Min Zhang.
They model the posterior probability of the latent variable z, conditioned on both input and output, along with a prior conditioned only on the input. During training, these two distributions are optimised to be similar using the Kullback-Leibler distance, and during testing the prior is used. They report improvements on Chinese-English and English-German translation, compared to using the original encoder-decoder NMT framework.

Numerically Grounded Language Models for Semantic Error Correction
Georgios P. Spithourakis, Isabelle Augenstein, Sebastian Riedel.
https://arxiv.org/abs/1608.04147
They create an LSTM neural language model that 1) has better handling of numerical values, and 2) is conditioned on a knowledge base. First, the numerical value of each token is given as an additional signal to the network at each time step. The numerical grounding helps quite a bit, and the best results are obtained when the knowledge base conditioning is also added. They show improvement on two error correction datasets.

They also propose two modifications to the process of generating adversarial images – making it into a more gradual iterative process, and optimising for a specific adversarial class.

Extracting token-level signals of syntactic processing from fMRI – with an application to POS induction
Joachim Bingel, Maria Barrett, Anders Søgaard.
For this they use a dataset of fMRI recordings, where the subjects were reading a chapter of Harry Potter. The main issue is that fMRI has very low temporal resolution – there is only one fMRI reading per 4 tokens, and in general it takes around 4-14 seconds for something to show up in fMRI.

They propose a joint model for 1) identifying event keywords in a text, 2) identifying entities, and 3) identifying the connections between these events and entities.

Black Holes and White Rabbits: Metaphor Identification with Visual Features
Ekaterina Shutova, Douwe Kiela, Jean Maillard.
The basic system uses word embedding similarity – the cosine between the word embeddings. They then explore variations using phrase embeddings, such as cos(phrase − word1, word2), which is similar to the word regularity operations of Mikolov.
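As a toy illustration of the similarity measures in the metaphor identification summary above, here is a minimal numpy sketch. The vectors are made-up stand-ins, not the embeddings used in the paper:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" (illustrative values only).
word1 = np.array([0.9, 0.1, 0.0, 0.2])
word2 = np.array([0.1, 0.8, 0.3, 0.0])
phrase = word1 + word2  # a simple stand-in for a phrase embedding

# Direct word-word similarity, as in the basic system:
sim_basic = cosine(word1, word2)

# Phrase-based variant: subtract one word from the phrase vector
# and compare the remainder to the other word:
sim_phrase = cosine(phrase - word1, word2)
```

With the additive phrase vector used here, `phrase - word1` is exactly `word2`, so `sim_phrase` comes out as 1.0; with a learned phrase embedding the two scores would genuinely differ.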
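The Kullback-Leibler distance used during training in the Variational NMT summary above has a closed form when both distributions are diagonal Gaussians, which is the usual setup in variational models. A minimal sketch (function name and toy values are mine, not from the paper):

```python
import numpy as np

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) for two diagonal Gaussians, summed over dimensions."""
    var_q, var_p = np.exp(logvar_q), np.exp(logvar_p)
    return float(np.sum(
        0.5 * (logvar_p - logvar_q)
        + (var_q + (mu_q - mu_p) ** 2) / (2.0 * var_p)
        - 0.5
    ))

# Identical posterior and prior -> the KL term is zero:
mu, logvar = np.zeros(4), np.zeros(4)
print(kl_diag_gaussians(mu, logvar, mu, logvar))  # 0.0
```

Minimising this term pulls the approximate posterior towards the prior, which is what makes it safe to use the prior alone at test time.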
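For the orthogonality-constrained mapping in the Word Translation Without Parallel Data summary above: the paper learns the mapping adversarially, but a standard closed-form way to obtain an orthogonal map between two embedding spaces, given some aligned pairs, is the Procrustes solution. A sketch on random stand-in data (not real multilingual embeddings, and not the adversarial training itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy source/target embedding matrices for n "aligned" words.
n, d = 50, 8
X = rng.standard_normal((n, d))                        # source-language vectors
true_W, _ = np.linalg.qr(rng.standard_normal((d, d)))  # hidden orthogonal map
Y = X @ true_W                                         # target-language vectors

# Procrustes: the orthogonal W minimising ||XW - Y|| is U V^T,
# where U, S, V^T come from the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt
```

On this noise-free toy data the recovered `W` is orthogonal and maps `X` exactly onto `Y`; with real embeddings the fit is only approximate, which is why the paper combines it with adversarial training and refinement.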