Sort of SVD

1. Simon's Method

Initialize:

initialize CurrentEstimate[user,movie] to MovieAverage[movie] + UserOffset[user] (see Simon's page)
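
A minimal Python sketch of this initialization. It assumes plain, unregularized means: MovieAverage is the mean training rating for a movie, and UserOffset is the user's mean deviation from those movie averages. Simon's page may blend or regularize these differently. The function name and the (user, movie, rating) triple format are hypothetical.

```python
from collections import defaultdict

def baseline_estimates(ratings):
    """ratings: list of (user, movie, rating) training triples."""
    movie_sum, movie_cnt = defaultdict(float), defaultdict(int)
    for _, movie, rating in ratings:
        movie_sum[movie] += rating
        movie_cnt[movie] += 1
    # Assumption: plain mean rating per movie.
    movie_average = {m: movie_sum[m] / movie_cnt[m] for m in movie_sum}

    off_sum, off_cnt = defaultdict(float), defaultdict(int)
    for user, movie, rating in ratings:
        off_sum[user] += rating - movie_average[movie]
        off_cnt[user] += 1
    # Assumption: plain mean deviation from the movie averages per user.
    user_offset = {u: off_sum[u] / off_cnt[u] for u in off_sum}
    return movie_average, user_offset
```

CurrentEstimate[user,movie] is then movie_average[movie] + user_offset[user].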

Train a Feature:

initialize UserVector[user] to small random numbers
initialize MovieVector[movie] to small random numbers

repeatedly:
   for each user:
      for each training movie (for this user):
         uv = UserVector[user]
         mv = MovieVector[movie]
         err = ActualRating[user,movie] - CurrentEstimate[user,movie] - uv * mv
         UserVector[user] += learningRate * err * mv
         MovieVector[movie] += learningRate * err * uv
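
The training loop above could be written in Python roughly as follows. This is a sketch, not a tuned implementation: the function name, the flat list of (user, movie, rating) triples, the dict-based CurrentEstimate, and the range used for "small random numbers" are all assumptions.

```python
import random

def train_feature(ratings, estimate, num_users, num_movies,
                  learning_rate, epochs, seed=0):
    """
    ratings:  list of (user, movie, rating) training triples
    estimate: dict (user, movie) -> CurrentEstimate before this feature
    Returns (user_vec, movie_vec) for one feature.
    """
    rng = random.Random(seed)
    # Assumption: "small random numbers" means uniform in [-0.01, 0.01].
    user_vec = [rng.uniform(-0.01, 0.01) for _ in range(num_users)]
    movie_vec = [rng.uniform(-0.01, 0.01) for _ in range(num_movies)]
    for _ in range(epochs):
        for user, movie, rating in ratings:
            uv = user_vec[user]
            mv = movie_vec[movie]
            err = rating - estimate[(user, movie)] - uv * mv
            # Both updates use the values read before this step.
            user_vec[user] += learning_rate * err * mv
            movie_vec[movie] += learning_rate * err * uv
    return user_vec, movie_vec
```

On real data the ratings would be grouped by user, as in the pseudocode, so each UserVector entry stays hot in cache.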

Apply a Feature:

for each user:
   uv = UserVector[user]
   for each training and testing movie (for this user):
      mv = MovieVector[movie]
      CurrentEstimate[user,movie] += uv * mv

2. Eliminate UserVector

If we want to provide estimates to new users without first training the features on them, it would be handy to get rid of the UserVector. Given the current error and the MovieVector, the user's value is the least-squares fit: the dot product of the error with the MovieVector, divided by the dot product of the MovieVector with itself (using only movies the user has rated).
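
As a sketch of that formula: the value uv that minimizes the sum of (err - uv * mv)^2 over a user's rated movies is sum(err * mv) / sum(mv * mv). The helper name below is hypothetical, and returning 0 when the denominator is zero is an assumption the notes do not cover.

```python
def user_value(errors, movie_values):
    """
    errors:       ActualRating - CurrentEstimate for each movie the user rated
    movie_values: MovieVector entries for the same movies, in the same order
    """
    err_mv = sum(e * m for e, m in zip(errors, movie_values))
    mv_mv = sum(m * m for m in movie_values)
    if mv_mv == 0.0:
        # Assumption: with no signal on this feature, contribute nothing.
        return 0.0
    return err_mv / mv_mv
```

For example, if every error is exactly twice the matching movie value, the fit returns 2.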

Train a Feature:

initialize MovieVector[movie] to small random numbers

repeatedly:
   for each user:
      errMv = 0
      MvMv = 0
      for each training movie (for this user):
         mv = MovieVector[movie]
         err = ActualRating[user,movie] - CurrentEstimate[user,movie]
         errMv += err * mv
         MvMv += mv * mv
      uv = errMv / MvMv
      for each training movie (for this user):
         mv = MovieVector[movie]
         err = ActualRating[user,movie] - CurrentEstimate[user,movie] - uv * mv
         MovieVector[movie] += learningRate * err * uv
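
One epoch of this training loop might look like the following in Python. It is a sketch under assumptions: residuals (ActualRating - CurrentEstimate) are precomputed and grouped by user, the names are hypothetical, and a zero MvMv yields uv = 0.

```python
def fit_user_value(residuals, movie_vec):
    """residuals: list of (movie, ActualRating - CurrentEstimate) for one user."""
    err_mv = sum(err * movie_vec[m] for m, err in residuals)
    mv_mv = sum(movie_vec[m] ** 2 for m, _ in residuals)
    # Assumption: fall back to 0 when the user's movies carry no weight.
    return err_mv / mv_mv if mv_mv > 0.0 else 0.0

def train_epoch(residuals_by_user, movie_vec, learning_rate):
    """Update movie_vec in place; no per-user value is stored."""
    for residuals in residuals_by_user.values():
        uv = fit_user_value(residuals, movie_vec)  # first pass over the user
        for movie, err in residuals:               # second pass over the user
            mv = movie_vec[movie]
            movie_vec[movie] += learning_rate * (err - uv * mv) * uv

def squared_error(residuals_by_user, movie_vec):
    """Training squared error with each user's value refit."""
    total = 0.0
    for residuals in residuals_by_user.values():
        uv = fit_user_value(residuals, movie_vec)
        total += sum((err - uv * movie_vec[m]) ** 2 for m, err in residuals)
    return total
```

Calling train_epoch repeatedly with a decaying learning_rate corresponds to the "repeatedly" loop above.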

Apply a Feature:

for each user:
   errMv = 0
   MvMv = 0
   for each training movie (for this user):
      mv = MovieVector[movie]
      err = ActualRating[user,movie] - CurrentEstimate[user,movie]
      errMv += err * mv
      MvMv += mv * mv
   uv = errMv / MvMv
   for each training and testing movie (for this user):
      mv = MovieVector[movie]
      CurrentEstimate[user,movie] += uv * mv

This gets a probe RMSE of 0.9636, which is not great.
I start each feature with learningRate = 0.1, multiply it by 0.95 after each epoch, and continue for 150 epochs. This takes just under 4 minutes per feature.
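
The schedule described here can be sketched as a small generator. The function name is hypothetical; the defaults are the values quoted above.

```python
def learning_rates(initial=0.1, decay=0.95, epochs=150):
    """Yield the learning rate to use for each epoch of one feature."""
    rate = initial
    for _ in range(epochs):
        yield rate
        rate *= decay  # shrink after every epoch
```

By the last epoch the rate has dropped by more than three orders of magnitude, so the late epochs only make very small adjustments.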

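3. Discount

Apply only a fraction (1 - discount) of each fitted feature to the estimate. Training is the same as in section 2.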

Apply a Feature:

for each user:
   errMv = 0
   MvMv = 0
   for each training movie (for this user):
      mv = MovieVector[movie]
      err = ActualRating[user,movie] - CurrentEstimate[user,movie]
      errMv += err * mv
      MvMv += mv * mv
   uv = errMv / MvMv
   for each training and testing movie (for this user):
      mv = MovieVector[movie]
      CurrentEstimate[user,movie] += uv * mv * (1 - discount)
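
A Python sketch of the discounted apply step. The data layout (dicts keyed by user and by (user, movie)), the function name, and the zero-denominator fallback are assumptions made for illustration.

```python
def apply_feature(train_residuals, all_movies, estimate, movie_vec, discount):
    """
    train_residuals: user -> list of (movie, ActualRating - CurrentEstimate)
    all_movies:      user -> training and testing movies for that user
    estimate:        dict (user, movie) -> CurrentEstimate, updated in place
    """
    for user, residuals in train_residuals.items():
        err_mv = sum(err * movie_vec[m] for m, err in residuals)
        mv_mv = sum(movie_vec[m] ** 2 for m, _ in residuals)
        # Assumption: users with MvMv == 0 get no contribution.
        uv = err_mv / mv_mv if mv_mv > 0.0 else 0.0
        for movie in all_movies[user]:
            estimate[(user, movie)] += uv * movie_vec[movie] * (1.0 - discount)
```

With discount = 0 this reduces to the plain apply step from section 2.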

4. Per User Discount

Somewhat better results can be obtained by applying a bigger discount to users who have rated few movies. Might MvMv be a better basis for the discount than the movie count? If a user hasn't rated any movies that are significantly involved with the feature, maybe that is worth a larger discount.

Apply a Feature:

for each user:
   errMv = 0
   MvMv = 0
   for each training movie (for this user):
      mv = MovieVector[movie]
      err = ActualRating[user,movie] - CurrentEstimate[user,movie]
      errMv += err * mv
      MvMv += mv * mv
   uv = errMv / MvMv
   disc = (SOME_CONST + discount * TrainingMovieCount[user]) / (SOME_CONST + TrainingMovieCount[user])
   for each training and testing movie (for this user):
      mv = MovieVector[movie]
      CurrentEstimate[user,movie] += uv * mv * (1 - disc)
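
To see how the disc formula behaves, here is a small Python sketch. The function name is hypothetical, and the values of discount and SOME_CONST in the usage note are made up for illustration; the notes do not give them.

```python
def per_user_discount(training_movie_count, discount, some_const):
    """
    Blend between a full discount (1.0) for users with no training movies
    and the base discount for users with many training movies.
    """
    return (some_const + discount * training_movie_count) / (
        some_const + training_movie_count
    )
```

With discount = 0.1 and SOME_CONST = 50 (both illustrative), a user with no training movies gets disc = 1, so the feature is ignored, and disc falls toward 0.1 as the count grows.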
