{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CSCE 470 :: Information Storage and Retrieval :: Texas A&M University :: Fall 2017\n", "\n", "\n", "# Homework 3 and 4 United Forever: Recommenders and Classification!\n", "\n", "### 200 points [10% of your final grade]\n", "\n", "### Due: November 16, 2017\n", "\n", "*Goals of this homework:* Put your knowledge of recommenders and classifiers to work. \n", "\n", "*Submission Instructions (ecampus):* To submit your homework, rename this notebook as `lastname_firstinitial_hw#.ipynb`. For example, my homework submission would be: `caverlee_j_hw3.ipynb`. Submit this notebook via **ecampus**. Your IPython notebook should be completely self-contained, with the results visible in the notebook. We should not have to run any code from the command line, nor should we have to run your code within the notebook (though we reserve the right to do so)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 1: Recommending Movies\n", "\n", "For this first part, we're going to use part of the MovieLens 100k dataset. Prior to the Netflix Prize, the MovieLens data was **the** most important collection of movie ratings.\n", "\n", "First off, we need to load the data (see the data files in the \"Resources\" tab, including u.user, u.item, and ua.base). Here, we provide you with some helper code to load the data using [Pandas](http://pandas.pydata.org/). Pandas is a nice package for Python data analytics." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | MovieId | Title | UserId | Rating | Age | Gender | Occupation | ZipCode |\n", "
|---|---|---|---|---|---|---|---|---|\n", "
| 0 | 1 | Toy Story (1995) | 1 | 5 | 24 | M | technician | 85711 |\n", "
| 1 | 2 | GoldenEye (1995) | 1 | 3 | 24 | M | technician | 85711 |\n", "
| 2 | 3 | Four Rooms (1995) | 1 | 4 | 24 | M | technician | 85711 |\n", "
| 3 | 4 | Get Shorty (1995) | 1 | 3 | 24 | M | technician | 85711 |\n", "
| 4 | 5 | Copycat (1995) | 1 | 3 | 24 | M | technician | 85711 |\n", "