
Machine Learning

Comparing ResMem and MemNet

In my previous posts about memorability (see the project link above), I’ve been talking about the performance of my models fairly matter-of-factly. I’ve been comparing their scores, reporting them in abstracts, and discussing how one model performs better than another and why I think that happens. Some questions arise, though. For example, why did I get such a vastly different score with MemNet than what Khosla et al.

ResMem and M3M

In my last post on computer vision and memorability, I looked at an existing model and started experimenting with variations on that architecture. The most successful attempts were those that used residual neural networks (ResNets). These are a type of deep neural network built to mimic specific visual structures in the brain. ResMem, one of the new models, uses a variation on ResNet in its architecture to leverage that optical identification power towards memorability estimation.

ResMem Release

A user-ready version of ResMem is now available on PyPI! The model included in the package is designed to estimate the memorability of an input image but is not intended for feature space analysis. It is optimized for accuracy by allowing the ResNet features to be retrained, and has been dubbed “ResMemRetrain” for this reason. Statistically, the retrained model performs better than the model in which ResNet is left as-is, receiving a Spearman rank correlation of 0.
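Since the models here are scored by Spearman rank correlation, a minimal sketch of that metric may help. It is written in plain Python rather than with the resmem package, and the names `rank`, `pearson`, and `spearman` are illustrative assumptions, not part of any library mentioned above.

```python
import math


def rank(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))
```

Because only the ordering matters, a model whose predictions are monotonically related to the measured memorability scores gets a correlation of 1 even if the raw values are on a different scale.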

MemNet: Models for Predicting Image Memorability

MemNet[1] was an attempt to build a neural network-based model to predict the memorability of an image. This attempt was carried out by Khosla et al. at the Computer Science and Artificial Intelligence Laboratory at MIT with moderate success. It is the most commonly used neural network regression for this purpose, and has been used and cited in many research papers since publication. There are some problems, however. MemNet was built in Caffe, a deep learning framework that has been defunct since shortly after MemNet’s publication.

Blaseball Ticker Analysis

Introduction: In the last week, large swathes of the Internet have been enamored with the simulated sport of blaseball. Blaseball describes itself as “baseball at your mercy, baseball perfected”. I’ve been describing it as “Nightvale-esque simulated baseball with rules decided by a voting system”. It’s fun, that’s for sure. Blaseball saw a boom after being picked up by a number of news outlets, and a number of blaseball users thought “now, this is fun, but I want to know it like I know myself”.

Accelerated Gammatones

Introduction: Last winter, I worked on a personal project I call ongaku (from the Japanese for ‘music’). This was an attempt to use manifold learning to create a metric space for music. The preprocessing relied heavily on a method called Gammatone Cepstral Coefficients (Valero and Alias 2012). This method was intended to replace Mel Frequency Cepstral Coefficients, where mel frequency is a logarithmic transformation of sound frequency that attempts to simulate human perception of sound.
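To make the mel frequency transformation mentioned above concrete, here is a small sketch. Several mel formulas exist; this one uses the common 2595 * log10(1 + f / 700) form, which is an assumption and not necessarily the variant used in ongaku. The function names are illustrative.

```python
import math


def hz_to_mel(frequency_hz):
    """Convert a frequency in Hz to mels (assumed 2595 * log10 variant)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def mel_to_hz(mels):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mels / 2595.0) - 1.0)
```

The scale is roughly linear at low frequencies and logarithmic at high ones, so equal steps in mels correspond to ever larger steps in Hz as pitch rises.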

Sex, Conspiracy, and Video Games: Urban Legends on the Internet

Introduction: Urban Legends, or Contemporary Legends, are a type of folklore popularized by Jan Harold Brunvand in his book The Vanishing Hitchhiker: American Urban Legends & Their Meanings. Urban Legends are considered to be the modern continuation of the human tradition of folklore and legends. Folklore does not occur exclusively in so-called primitive or traditional societies, and by studying modern folklore in the same way that we study the folklore of traditional societies, we can also learn about modern societies (Brunvand 2003).

Plurals and ML

Plurals and Machine Learning: Using older machine learning models to conjugate English verbs produced rather silly results. These models performed at an acceptable level for many words, but when given nonsense words as an input these models would produce humorous conjugations. For example, we have:

Verb       Human Generated Past-Tense   Machine Generated Past-Tense
mail       mailed                       membled
conflict   conflicted                   conflafted
wink       winked                       wok
quiver     quivered                     quess
satisfy    satisfied                    sedderded
smairf     smairfed                     sprurice
trilb      tribled                      treelilt
smeej      smeejed                      leefloag
frilg      frilged                      freezled

Naturally, my girlfriend and I found this hilarious.

Ongaku

Overview: Ongaku is a method for creating playlists programmatically, using only the content of the songs. It uses gammatone cepstral analysis to create unique matrices to represent each song. A gammatone cepstrum is similar to the more common spectra used for audio analysis. Instead of doing a Fourier transform, we do an inverse Fourier transform, and then apply a transformation according to the gammatone function. This function was designed to mimic the signals sent to the brain through the cochlear nerve, the nerve which connects the ear to the brain.
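For readers unfamiliar with the gammatone function, the sketch below evaluates its usual textbook impulse response, t^(n-1) * exp(-2πbt) * cos(2πft + φ). The fourth-order default, the Glasberg and Moore bandwidth formula, and the 1.019 scaling factor are common choices assumed here for illustration. They are not taken from the Ongaku code, and the function names are hypothetical.

```python
import math


def erb(center_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def gammatone_impulse_response(t, center_hz, order=4, amplitude=1.0, phase=0.0):
    """Value of a gammatone filter's impulse response at time t (seconds)."""
    if t < 0:
        return 0.0  # the filter is causal: no response before the impulse
    bandwidth = 1.019 * erb(center_hz)  # assumed conventional scaling
    envelope = amplitude * t ** (order - 1) * math.exp(-2.0 * math.pi * bandwidth * t)
    return envelope * math.cos(2.0 * math.pi * center_hz * t + phase)
```

A bank of these filters at different center frequencies gives a rough model of how the cochlea splits sound into frequency bands, which is the property the paragraph above refers to.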

Fluxx for Robots

Introduction: Fluxx is a mildly popular card game. All things considered, it’s actually fairly contentious. It has a score of 5.7 on Board Game Geek, but it has a serious cult following. This is a Python implementation of version 5.0 of the Vanilla Fluxx card game. The full rules are available here. The core idea is that the players can play cards which change the rules of the game. Each turn, each player draws a certain number of cards and plays a certain number of cards.