CFEngine API

org.recommender.server
Class DataManager

java.lang.Object
  |
  +--org.recommender.server.DataManager

public class DataManager
extends java.lang.Object

Implementation of the DataManager. This class handles all the cached data in a CF Engine application, including reading and writing user info, item info, and ratings info. It is also responsible for deciding when to cache data in memory and when to write it out to disk. The interface should be generic enough that the transition is painless.

Author:
Sameer Kadam, Irwin Yoon, Olivier Godde, Daniel Lowd, Yun Wang, Matt McLaughlin

Field Summary
(package private)  java.util.LinkedList dbUsers
           
 
Constructor Summary
DataManager(boolean loadRatings, boolean doSampling)
          Create a new DataManager object, establish connection with database, and cache ratings into memory.
 
Method Summary
 void addCachedRating(ItemRating rating)
          Add ratings to cache but not to the database.
 void addRating(ItemRating rating)
          Adds a new rating to both database and the cache.
 void cacheRatingGroup(java.util.Iterator iter)
           
 void cacheTypedItemList()
          Cache the typed item list from the database into memory to avoid excessive disk I/O.
 void cacheUserTopN(int userID, ItemPrediction[] predictions)
          Cache user TopN data computed by one of our algorithms.
 boolean checkCachedTopN(int userID)
          Check whether a user's TopN data is cached on the server side.
 void checkCachedUser(int userID)
          Check this user's rating records and load them into the cache if not done yet. Upon return from this method, the user's record is guaranteed to be in memory and will not be evicted until you call decUserCount() on the UserInfo structure.
 void decCachedUsers()
           
 void decNumItems()
           
 void decNumUsers()
           
 void decUseCount(int userID)
          Sets the inUse field of the specified user.
 int getAddedRatings(int userID)
          Get the value of a user's counter of added ratings.
 java.util.Iterator getAllUsers()
          Return an iterator through all user ID's, each one an Integer object.
 ItemPrediction[] getCachedTopN(int userID, int number, int offset)
          Returns a user's top N recommendations, retrieved from the server side memory cache.
 java.util.Vector getCachedTypedItemList(int typeID)
          Get a vector of specific typed Item List.
 java.util.Iterator getCachedTypedItemLists(int typeID)
          Get an iterator through a list of specific typed Item List.
 float getCoverage()
          Get the percentage of item entries that are not null i.e., items for which we can compute a prediction.
 float getErrRating()
          Accessor, return error rating
 ItemInfo getItemInfo(int itemID)
           
 java.util.Iterator getItemList(int userID)
          Returns a list of item ratings for a given user using the cached data. Loads the user into the cache if not done yet (because of data sampling).
 ItemRating getItemRatingCached(int userID, int itemID)
          Get rating from the cache.
 ItemRating[] getItemRatingList(int itemID)
          Return an array of ItemRatings, one for each user who has rated the given item.
 float getItemSD(int itemID)
          Returns the standard deviation of the item's ratings from the cache.
 java.lang.String getItemTypeInfoDatabaseName()
          Accessor, return itemTypeTableName
 int getMaxItem()
          Accessor, return max item ID
 float getMaxRating()
          Accessor, return MaxRating
 int getMaxType()
           
 int getMaxUser()
          Accessor, return max user ID
 float getMeanItemRatingCached(int itemID)
          Return the mean of item's ratings from the cache.
 float getMeanRatingCached(int userID)
          Return the mean of user's ratings from the cache.
 float getMinRating()
          Accessor, return MinRating
 java.util.Iterator getNeighborhood(int itemID)
          Returns a list of ItemRatings representing ratings of users.
 int getNextItemId()
           
 int getNextUserId()
           
 int getNumItems()
           
 int getNumRaters(int itemID)
          Returns the number of users who have rated the corresponding item.
 int getNumRatings(int userID)
          Returns the number of ratings for the corresponding user.
 int getNumTopNRecords()
          Accessor, return num of recommendations cached
 int getNumUsers()
           
 int getNumUsersCached()
           
 float getRatingCached(int userID, int itemID)
          Get rating from the cache.
 java.lang.String getRatingDatabaseName()
          Accessor, return ratingTableName
 java.util.Iterator getRatingList(java.lang.String whereClause)
          Get an iterator through a list of all ItemRatings in the given table, ordered by UserID.
 int getSizeOfTypedItemList(int typeID)
          Get size of specific typed Item List.
 java.lang.String getStatistics()
           
 ItemPrediction[] getTopNByType(int userID, int number, int offset, int type, ItemPrediction[] pred)
          Returns a user's typed top N recommendations. No caching is performed.
 java.util.LinkedList getTypedItemIDList(int typeID)
          Get the whole list of ItemID of specific type
 UserInfo getUser(int userID)
          Accessor, return UserInfo
 UserInfo getUserInfo(int userID)
          Return a UserInfo object to methods that need to perform multiple computations on a single user.
 java.lang.String getUserInfoTableName()
          Accessor, return userInfo table's name
 java.util.Iterator getUserList(java.lang.String tableName)
          Get an iterator through a list of all userIDs in the given table, ordered by UserID.
 ItemRating[] getUserRatingList(int userID)
          Return an array of ItemRatings, one for each item the given user has rated.
 float getUserSD(int userID)
          Returns the standard deviation of the user's ratings from the cache.
 java.util.LinkedList givenNHide(int userID, int NumParam, int option)
           
 void hideRatingFromRecords(int user, int item)
          Hiding the rating value for the item we are trying to predict for, re-compute the user SD and user average rating.
 void incCachedUsers()
           
 void incCacheHits()
           
 void incCacheMisses()
           
 void incNumItems()
           
 void incNumUsers()
           
 void initializeData()
          This method initializes the data structures for caching with the sizes of the max user and item ID.
static void insertionSort(int[] a, float[] b)
          Simple insertion sort.
 UserInfo loadUser(int userID, boolean sampled)
          Cache new user.
 void printAllTables()
          Method to print contents of all tables.
 void printRatingTable()
          Print ratingTable
 void refreshCache()
          Remove useless data from the cache.
 void removeCachedRating(ItemRating rating)
          Remove rating from the cache.
 void removeCachedRatingGroup(java.util.Iterator iter)
           
 void removeRating(int userID, int itemID)
          Remove rating from our cache.
 void resetUserCounter(int userID)
          Reset a user's counter of added ratings.
 void sampleUsers()
          Function called by the Resample Thread, at regular intervals of time.
 void setMaxItems(int itemID)
           
 void setMaxUsers(int userID)
           
 void setNumItems(int numItems)
           
 void setNumUsers(int numUsers)
           
 void setNumUsersCached(int numUsersCached)
           
 void shutdown()
           
 java.util.LinkedList sparcifyData(java.util.Iterator iter, int NumParam, int option)
           
 void updateStats(int userID)
          Recalculate mean and standard deviation for a given user.
 void userUnload(int userID)
          Remove a user from the cache.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

dbUsers

java.util.LinkedList dbUsers
Constructor Detail

DataManager

public DataManager(boolean loadRatings,
                   boolean doSampling)
Create a new DataManager object, establish connection with database, and cache ratings into memory.

Parameters:
loadRatings -
doSampling -
Method Detail

getRatingCached

public float getRatingCached(int userID,
                             int itemID)
Get rating from the cache.

Parameters:
userID -
itemID -
Returns:
rating

getItemRatingCached

public ItemRating getItemRatingCached(int userID,
                                      int itemID)
                               throws java.sql.SQLException
Get rating from the cache.

Parameters:
userID -
itemID -
Returns:
rating
Throws:
java.sql.SQLException

printAllTables

public void printAllTables()
Method to print contents of all tables.


printRatingTable

public void printRatingTable()
Print ratingTable


getNeighborhood

public java.util.Iterator getNeighborhood(int itemID)
Returns a list of ItemRatings representing ratings of users. A neighbor means someone who has rated itemID.

Parameters:
itemID -
Returns:
Iterator of neighbors

getItemInfo

public ItemInfo getItemInfo(int itemID)

getItemList

public java.util.Iterator getItemList(int userID)
Returns a list of item ratings for a given user using the cached data. Loads the user into the cache if not done yet (because of data sampling).

Parameters:
userID -
Returns:
Iterator of items.

getItemRatingList

public ItemRating[] getItemRatingList(int itemID)
Return an array of ItemRatings, one for each user who has rated the given item.

Parameters:
itemID -
Returns:
retArray an array of ItemRatings

getUserRatingList

public ItemRating[] getUserRatingList(int userID)
Return an array of ItemRatings, one for each item the given user has rated.

Parameters:
userID -
Returns:
retArray an array of itemRatings

getAllUsers

public java.util.Iterator getAllUsers()
Return an iterator through all user ID's, each one an Integer object.

Returns:
Iterator of userIDs

getMeanRatingCached

public float getMeanRatingCached(int userID)
Return the mean of user's ratings from the cache.

Parameters:
userID -
Returns:
mean rating

getMeanItemRatingCached

public float getMeanItemRatingCached(int itemID)
Return the mean of item's ratings from the cache.

Parameters:
itemID -
Returns:
mean rating

updateStats

public void updateStats(int userID)
Recalculate mean and standard deviation for a given user. Used when ratings in the database change for testing and when a new rating is added.

Parameters:
userID -
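
The recalculation described above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the ratings are passed in as a plain float array rather than read from the cache, and the population form of the standard deviation (dividing by n) is assumed because the documentation does not say which form DataManager uses.

```java
/** Hypothetical stand-in; not the actual DataManager implementation. */
class RatingStatsSketch {

    /** Arithmetic mean of the ratings; 0 for an empty array (assumed convention). */
    static float mean(float[] ratings) {
        if (ratings.length == 0) {
            return 0f;
        }
        double sum = 0;
        for (float r : ratings) {
            sum += r;
        }
        return (float) (sum / ratings.length);
    }

    /** Population standard deviation (divides by n, an assumption). */
    static float standardDeviation(float[] ratings) {
        if (ratings.length == 0) {
            return 0f;
        }
        double m = mean(ratings);
        double squaredDeviations = 0;
        for (float r : ratings) {
            double d = r - m;
            squaredDeviations += d * d;
        }
        return (float) Math.sqrt(squaredDeviations / ratings.length);
    }

    public static void main(String[] args) {
        float[] ratings = {2f, 4f, 4f, 4f, 5f, 5f, 7f, 9f};
        System.out.println("mean=" + mean(ratings) + " sd=" + standardDeviation(ratings));
    }
}
```

For the sample ratings in main, the mean is 5 and the population standard deviation is 2.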

getUserSD

public float getUserSD(int userID)
Returns the standard deviation of the user's ratings from the cache.

Parameters:
userID -
Returns:
userSD

getItemSD

public float getItemSD(int itemID)
Returns the standard deviation of the item's ratings from the cache.

Parameters:
itemID -
Returns:
itemSD

insertionSort

public static void insertionSort(int[] a,
                                 float[] b)
Simple insertion sort.

Parameters:
a - an array of int items - the sort key.
b - an array of parallel ratings of the same length as a
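
The parallel-array contract above can be sketched as follows. This is a minimal illustration, not the CFEngine source: the class name is hypothetical, and ascending order is an assumption because the documentation does not state the sort direction.

```java
import java.util.Arrays;

/** Hypothetical stand-in; not the actual DataManager implementation. */
class ParallelInsertionSortSketch {

    /**
     * Sorts keys in ascending order (assumed) and applies the same moves to
     * values, so that values[i] stays paired with keys[i].
     */
    static void insertionSort(int[] keys, float[] values) {
        if (keys.length != values.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        for (int i = 1; i < keys.length; i++) {
            int key = keys[i];
            float value = values[i];
            int j = i - 1;
            // Shift larger keys, and their paired values, one slot to the right.
            while (j >= 0 && keys[j] > key) {
                keys[j + 1] = keys[j];
                values[j + 1] = values[j];
                j--;
            }
            keys[j + 1] = key;
            values[j + 1] = value;
        }
    }

    public static void main(String[] args) {
        int[] itemIDs = {30, 10, 20};
        float[] ratings = {3.0f, 1.0f, 2.0f};
        insertionSort(itemIDs, ratings);
        System.out.println(Arrays.toString(itemIDs) + " " + Arrays.toString(ratings));
    }
}
```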

sampleUsers

public void sampleUsers()
Function called by the Resample Thread at regular intervals of time. Select the most relevant user data from the database and insert it (if necessary) into the cache. After this operation, remove the users that are no longer selected by the sampling method and have their reference bit set to "0". Note that this method will have no effect if you haven't rerun the calcUserInfo utility. This method is synchronized to prevent it from being called by two different threads.


getRatingList

public java.util.Iterator getRatingList(java.lang.String whereClause)
                                 throws java.sql.SQLException
Get an iterator through a list of all ItemRatings in the given table, ordered by UserID. Used for experiments.

Parameters:
whereClause -
Returns:
Iterator of ratings
Throws:
java.sql.SQLException

cacheTypedItemList

public void cacheTypedItemList()
                        throws java.sql.SQLException
Cache the typed item list from the database into memory to avoid excessive disk I/O.

Throws:
java.sql.SQLException

getTypedItemIDList

public java.util.LinkedList getTypedItemIDList(int typeID)
                                        throws java.sql.SQLException
Get the whole list of ItemID of specific type

Parameters:
typeID -
Returns:
LinkedList of itemIDs
Throws:
java.sql.SQLException

getCachedTypedItemLists

public java.util.Iterator getCachedTypedItemLists(int typeID)
Get an iterator through a list of specific typed Item List.

Parameters:
typeID -
Returns:
Iterator of typed item list

getCachedTypedItemList

public java.util.Vector getCachedTypedItemList(int typeID)
Get a vector of specific typed Item List.

Parameters:
typeID -
Returns:
Vector of typed item list

getSizeOfTypedItemList

public int getSizeOfTypedItemList(int typeID)
Get size of specific typed Item List.

Parameters:
typeID -
Returns:
size

getItemTypeInfoDatabaseName

public java.lang.String getItemTypeInfoDatabaseName()
Accessor, return itemTypeTableName

Returns:
itemTypeTableName

getUser

public UserInfo getUser(int userID)
Accessor, return UserInfo

Returns:
UserInfo

getUserList

public java.util.Iterator getUserList(java.lang.String tableName)
                               throws java.sql.SQLException
Get an iterator through a list of all userIDs in the given table, ordered by UserID.

Parameters:
tableName -
Returns:
Iterator of userIDs
Throws:
java.sql.SQLException

initializeData

public void initializeData()
                    throws java.sql.SQLException
This method initializes the data structures for caching with the sizes of the max user and item ID.

Throws:
java.sql.SQLException

getCoverage

public float getCoverage()
                  throws java.sql.SQLException,
                         CFInternalErrorException
Get the percentage of item entries that are not null i.e., items for which we can compute a prediction.

Returns:
percentage
Throws:
java.sql.SQLException
CFInternalErrorException
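
The coverage measure can be illustrated with a short sketch. The class name and the array-based input are hypothetical, and returning a value on a 0 to 100 scale is an assumption; the documentation only says "percentage".

```java
/** Hypothetical stand-in; not the actual DataManager implementation. */
class CoverageSketch {

    /**
     * Percentage (0 to 100, assumed scale) of entries that are not null,
     * i.e. entries for which a prediction could be computed.
     */
    static float coverage(Object[] predictions) {
        if (predictions.length == 0) {
            return 0f; // assumed convention for an empty input
        }
        int nonNull = 0;
        for (Object p : predictions) {
            if (p != null) {
                nonNull++;
            }
        }
        return 100f * nonNull / predictions.length;
    }

    public static void main(String[] args) {
        Object[] predictions = {"a", null, "b", null};
        System.out.println(coverage(predictions));
    }
}
```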

getRatingDatabaseName

public java.lang.String getRatingDatabaseName()
Accessor, return ratingTableName

Returns:
ratingTableName

getMinRating

public float getMinRating()
Accessor, return MinRating

Returns:
MinRating

getMaxRating

public float getMaxRating()
Accessor, return MaxRating

Returns:
MaxRating

getErrRating

public float getErrRating()
Accessor, return error rating

Returns:
error rating

getMaxUser

public int getMaxUser()
Accessor, return max user ID

Returns:
max user ID

getMaxItem

public int getMaxItem()
Accessor, return max item ID

Returns:
max item ID

getMaxType

public int getMaxType()

getUserInfoTableName

public java.lang.String getUserInfoTableName()
Accessor, return userInfo table's name

Returns:
userInfo table name

getNumTopNRecords

public int getNumTopNRecords()
Accessor, return num of recommendations cached

Returns:
numTopNRecords

hideRatingFromRecords

public void hideRatingFromRecords(int user,
                                  int item)
Hiding the rating value for the item we are trying to predict for, re-compute the user SD and user average rating. Useful for MAE computation and other experiments.

Parameters:
user -
item -

getNumRatings

public int getNumRatings(int userID)
Returns the number of ratings for the corresponding user.

Parameters:
userID -
Returns:
number of ratings

getNumRaters

public int getNumRaters(int itemID)
Returns the number of users who have rated the corresponding item.

Parameters:
itemID -
Returns:
number of users

loadUser

public UserInfo loadUser(int userID,
                         boolean sampled)
                  throws java.sql.SQLException
Cache new user. Update data structures for caching when a new user comes in. Also take care of updating userStats. When a user record is loaded, its inUse variable is set to true. While the inUse variable is set, the user cannot be loaded or unloaded. You may need to immediately unset this variable.

Parameters:
userID -
Returns:
the new UserInfo structure that was created for this user
Throws:
java.sql.SQLException

userUnload

public void userUnload(int userID)
Remove a user from the cache. This will not unload active users. If the UserInfo structure of the specified userID has a non-zero inUse field, then this method will *NOT* unload the user. It will simply return.

Parameters:
userID -
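
The interaction between the inUse field and unloading can be sketched as below. This is an assumption-laden illustration: the class is hypothetical, inUse is modeled as an integer counter per user, and userUnload returns a boolean here (the documented method returns void) only so the outcome is visible.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in; not the actual DataManager implementation. */
class PinnedUserCacheSketch {
    // userID -> inUse counter (assumed representation of the inUse field).
    private final Map<Integer, Integer> inUse = new HashMap<>();

    /** Ensures the user is cached and marks the record as in use. */
    synchronized void checkCachedUser(int userID) {
        inUse.merge(userID, 1, Integer::sum);
    }

    /** Releases one use of the record; never drops below zero. */
    synchronized void decUseCount(int userID) {
        inUse.computeIfPresent(userID, (id, n) -> Math.max(0, n - 1));
    }

    /** Unloads the user only if the record is cached and not in use. */
    synchronized boolean userUnload(int userID) {
        Integer n = inUse.get(userID);
        if (n == null || n != 0) {
            return false; // not cached, or still active: simply return
        }
        inUse.remove(userID);
        return true;
    }

    synchronized boolean isCached(int userID) {
        return inUse.containsKey(userID);
    }

    public static void main(String[] args) {
        PinnedUserCacheSketch cache = new PinnedUserCacheSketch();
        cache.checkCachedUser(7);
        System.out.println("unloaded while in use: " + cache.userUnload(7));
        cache.decUseCount(7);
        System.out.println("unloaded after release: " + cache.userUnload(7));
    }
}
```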

checkCachedUser

public void checkCachedUser(int userID)
                     throws java.sql.SQLException
Check this user's rating records and load them into the cache if not done yet. Upon return from this method, the user's record is guaranteed to be in memory and will not be evicted until you call decUserCount() on the UserInfo structure.

Parameters:
userID -
Throws:
java.sql.SQLException

addRating

public void addRating(ItemRating rating)
               throws java.sql.SQLException
Adds a new rating to both the database and the cache. Adds this rating to the waiting queue for the database and places it in the cache (in both the ratings matrix and its transpose). These operations might be done concurrently by client threads, so the data structure objects must be synchronized.

Parameters:
rating - an ItemRating
Throws:
java.sql.SQLException
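
Keeping a ratings matrix and its transpose in step under concurrent access might look like the sketch below. Everything here is an assumption made for illustration: the class name, the nested-map representation, and the use of plain int and float arguments in place of ItemRating. The database waiting queue is left out entirely.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in; not the actual DataManager implementation. */
class RatingMatrixSketch {
    // userID -> (itemID -> rating): the ratings matrix.
    private final Map<Integer, Map<Integer, Float>> byUser = new HashMap<>();
    // itemID -> (userID -> rating): its transpose.
    private final Map<Integer, Map<Integer, Float>> byItem = new HashMap<>();

    /** Stores the rating in both views; synchronized so they never diverge. */
    synchronized void addCachedRating(int userID, int itemID, float rating) {
        byUser.computeIfAbsent(userID, k -> new HashMap<>()).put(itemID, rating);
        byItem.computeIfAbsent(itemID, k -> new HashMap<>()).put(userID, rating);
    }

    /** Removes the rating from both views if present. */
    synchronized void removeCachedRating(int userID, int itemID) {
        Map<Integer, Float> row = byUser.get(userID);
        if (row != null) {
            row.remove(itemID);
        }
        Map<Integer, Float> column = byItem.get(itemID);
        if (column != null) {
            column.remove(userID);
        }
    }

    /** Number of ratings given by a user. */
    synchronized int getNumRatings(int userID) {
        Map<Integer, Float> row = byUser.get(userID);
        return row == null ? 0 : row.size();
    }

    /** Number of users who rated an item. */
    synchronized int getNumRaters(int itemID) {
        Map<Integer, Float> column = byItem.get(itemID);
        return column == null ? 0 : column.size();
    }

    public static void main(String[] args) {
        RatingMatrixSketch cache = new RatingMatrixSketch();
        cache.addCachedRating(1, 10, 4f);
        cache.addCachedRating(2, 10, 3f);
        System.out.println("raters of item 10: " + cache.getNumRaters(10));
    }
}
```

The transpose makes per-item lookups (for example, the raters of an item) as cheap as per-user lookups, at the cost of writing every rating twice.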

addCachedRating

public void addCachedRating(ItemRating rating)
Add ratings to cache but not to the database.

Parameters:
rating - an ItemRating

removeRating

public void removeRating(int userID,
                         int itemID)
                  throws java.sql.SQLException
Remove rating from our cache. Function is Thread safe.

Parameters:
userID -
itemID -
Throws:
java.sql.SQLException

removeCachedRating

public void removeCachedRating(ItemRating rating)
Remove rating from the cache.

Parameters:
rating - an ItemRating


refreshCache

public void refreshCache()
Remove useless data from the cache. This function is called by a special thread whose task is to update the cache. It is only run when the cache fills up, that is, when ((getNumUsersCached() - maxSampledUsers) > maxCachedUsers). First, it sets the reference bits of all records in the cache to "0", then sleeps for a certain amount of time. After waking up, it iterates through the cache records once again, clearing all records whose reference bit is not "1".
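
The two-pass reference-bit scheme described above can be sketched as follows. The class and method names are hypothetical, the reference bit is modeled as a Boolean per cached user, and the sleep between the two passes is left to the caller so the example stays deterministic.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical stand-in; not the actual DataManager implementation. */
class ReferenceBitCacheSketch {
    // userID -> reference bit (true stands for "1", false for "0").
    private final Map<Integer, Boolean> referenceBits = new HashMap<>();

    /** Loads a record; a freshly loaded record counts as referenced (assumed). */
    synchronized void load(int userID) {
        referenceBits.put(userID, true);
    }

    /** Records an access by setting the reference bit of a cached record. */
    synchronized void touch(int userID) {
        if (referenceBits.containsKey(userID)) {
            referenceBits.put(userID, true);
        }
    }

    /** First pass: set every reference bit to "0". */
    synchronized void clearReferenceBits() {
        referenceBits.replaceAll((id, bit) -> false);
    }

    /** Second pass: drop every record whose bit is still "0"; returns the count. */
    synchronized int evictUnreferenced() {
        int evicted = 0;
        Iterator<Map.Entry<Integer, Boolean>> it = referenceBits.entrySet().iterator();
        while (it.hasNext()) {
            if (!it.next().getValue()) {
                it.remove();
                evicted++;
            }
        }
        return evicted;
    }

    synchronized boolean isCached(int userID) {
        return referenceBits.containsKey(userID);
    }

    public static void main(String[] args) {
        ReferenceBitCacheSketch cache = new ReferenceBitCacheSketch();
        cache.load(1);
        cache.load(2);
        cache.clearReferenceBits();
        cache.touch(2); // accessed during the sleep interval
        System.out.println("evicted: " + cache.evictUnreferenced());
    }
}
```

Records touched between the two passes survive; the rest are treated as cold and removed.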


checkCachedTopN

public boolean checkCachedTopN(int userID)
Check whether a user's TopN data is cached on the server side.

Parameters:
userID -
Returns:
boolean

cacheUserTopN

public void cacheUserTopN(int userID,
                          ItemPrediction[] predictions)
Cache user TopN data computed by one of our algorithms.

Parameters:
predictions -

getCachedTopN

public ItemPrediction[] getCachedTopN(int userID,
                                      int number,
                                      int offset)
Returns a user's top N recommendations, retrieved from the server-side memory cache. Parameters specify the number of recommendations to retrieve and the offset.

Returns:
The specified number of recommendations, with "offset" starting at zero
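
The number/offset semantics can be sketched as a simple paging helper. This is an illustration only: the class name is hypothetical, an int array of item IDs stands in for ItemPrediction[], and clamping at the end of the list (rather than throwing) is an assumption, since the documentation does not say what happens when fewer recommendations are available.

```java
import java.util.Arrays;

/** Hypothetical stand-in; not the actual DataManager implementation. */
class TopNPagingSketch {

    /**
     * Returns up to `number` entries of `ranked`, starting at the zero-based
     * `offset`. Requests past the end are clamped (assumed behavior).
     */
    static int[] page(int[] ranked, int number, int offset) {
        if (number < 0 || offset < 0) {
            throw new IllegalArgumentException("number and offset must be non-negative");
        }
        int from = Math.min(offset, ranked.length);
        int to = (int) Math.min((long) from + number, ranked.length);
        return Arrays.copyOfRange(ranked, from, to);
    }

    public static void main(String[] args) {
        int[] rankedItemIDs = {11, 12, 13, 14, 15};
        System.out.println(Arrays.toString(page(rankedItemIDs, 2, 0)));
        System.out.println(Arrays.toString(page(rankedItemIDs, 2, 4)));
    }
}
```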

getTopNByType

public ItemPrediction[] getTopNByType(int userID,
                                      int number,
                                      int offset,
                                      int type,
                                      ItemPrediction[] pred)
Returns a user's typed top N recommendations. No caching is performed.

Returns:
The specified number of recommendations, with "offset" starting at zero

resetUserCounter

public void resetUserCounter(int userID)
Reset a user's counter of added ratings.

Parameters:
userID -

getAddedRatings

public int getAddedRatings(int userID)
Get the value of a user's counter of added ratings.

Parameters:
userID -

incNumItems

public void incNumItems()

decNumItems

public void decNumItems()

incNumUsers

public void incNumUsers()

decNumUsers

public void decNumUsers()

incCachedUsers

public void incCachedUsers()

decCachedUsers

public void decCachedUsers()

setMaxUsers

public void setMaxUsers(int userID)

setMaxItems

public void setMaxItems(int itemID)

shutdown

public void shutdown()

cacheRatingGroup

public void cacheRatingGroup(java.util.Iterator iter)

removeCachedRatingGroup

public void removeCachedRatingGroup(java.util.Iterator iter)

sparcifyData

public java.util.LinkedList sparcifyData(java.util.Iterator iter,
                                         int NumParam,
                                         int option)

givenNHide

public java.util.LinkedList givenNHide(int userID,
                                       int NumParam,
                                       int option)

getUserInfo

public UserInfo getUserInfo(int userID)
Return a UserInfo object to methods that need to perform multiple computations on a single user.

Parameters:
userID -
Returns:
a single UserInfo structure

getNextItemId

public int getNextItemId()
                  throws java.sql.SQLException,
                         CFInternalErrorException
Returns:
An unused integer itemid that is guaranteed to be unique
Throws:
java.sql.SQLException
CFInternalErrorException

getNextUserId

public int getNextUserId()
                  throws java.sql.SQLException,
                         CFInternalErrorException
Returns:
An unused integer userid that is guaranteed to be unique
Throws:
java.sql.SQLException
CFInternalErrorException

getStatistics

public java.lang.String getStatistics()

getNumItems

public int getNumItems()

setNumItems

public void setNumItems(int numItems)

getNumUsersCached

public int getNumUsersCached()

setNumUsersCached

public void setNumUsersCached(int numUsersCached)

getNumUsers

public int getNumUsers()

setNumUsers

public void setNumUsers(int numUsers)

incCacheHits

public void incCacheHits()

incCacheMisses

public void incCacheMisses()

decUseCount

public void decUseCount(int userID)
Sets the inUse field of the specified user.

Parameters:
userID -

Copyright © 2003 - Oregon State University www.orst.edu