X-MAP is a large-scale heterogeneous recommender which is built on top of Apache Spark and implemented in Python.
- Provides heterogeneous recommendation based on artificial AlterEgos of users across multiple application domains.
- Any classical homogeneous recommendation algorithm can be run in the target application domain using these AlterEgos.
- Provides formal privacy guarantees.
X-MAP requires Python 3, NumPy 1.10.4, and Apache Spark 1.6.1 pre-installed on your machine.
Please refer to the Anaconda and Apache Spark documentation for installation instructions.
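As a quick sanity check before running anything, a small script along the following lines can report whether the prerequisites are visible from the current environment. This is only a sketch: the helper name `check_prerequisites` is hypothetical and not part of X-MAP, it assumes `spark-submit` is expected on your `PATH`, and it checks for presence only, not for the exact versions listed above.

```python
import importlib.util
import shutil
import sys


def check_prerequisites():
    """Return a dict mapping each prerequisite name to True or False.

    Hypothetical helper for illustration; it is not shipped with X-MAP.
    """
    return {
        # X-MAP requires Python 3.
        "python3": sys.version_info[0] >= 3,
        # find_spec returns None when a top-level module cannot be located.
        "numpy": importlib.util.find_spec("numpy") is not None,
        # Assumption: spark-submit should be reachable through PATH.
        "spark-submit": shutil.which("spark-submit") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print("{:<14}{}".format(name, "found" if ok else "MISSING"))
```

Anything reported as missing can then be installed by following the documentation mentioned above, or sidestepped by using the docker image.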
We also provide a docker image for your convenience.
If you do not have docker and docker-compose installed on your machine, please check the official documentation for installation guidelines.
Next, open the console, go to the platform folder, and execute the following command to set up the corresponding docker image.
docker-compose build
Once you have modified the scripts in the X-MAP folder, you should rebuild the package using the following command:
python setup.py install
We provide an egg file, located in dist/xmap-0.1.0-py3.5.egg, that you can use for your application.
X-MAP is tested on real traces from Amazon. In the current implementation, the input data follows the below-mentioned format:
Note that the timestamp is required if you want to implement algorithms that incorporate the temporal behaviour of users, which AlterEgos also support.
We provide two demonstrations here: twodomain_demo.py and multidomain_demo.py. You can also tune the parameters in the file parameters.yaml.
Note that the scripts should run successfully using the docker image that we provide. Please check your local system settings (e.g., directory paths) when working with the application.
The following is a simple example of how to run X-MAP on a local machine:
spark-submit --master local[4] \
--py-files dist/xmap-0.1.0-py3.5.egg twodomain_demo.py
The following is a simple example of how to run X-MAP on a cluster of machines:
spark-submit --py-files xmap-0.1.0-py3.5.egg \
--num-executors 30 --executor-cores 3 --executor-memory 12g \
--driver-memory 12g --driver-cores 4 twodomain_demo.py
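If you would rather launch the demos from a Python script than type the commands by hand, the same invocations can be assembled as argument lists. The sketch below is an illustration only: `build_spark_submit` is a hypothetical helper, not part of X-MAP, and it uses only the flags that appear in the commands above.

```python
import subprocess


def build_spark_submit(script, egg, master=None, extra_args=()):
    """Assemble a spark-submit argument list (hypothetical helper)."""
    cmd = ["spark-submit"]
    if master is not None:
        cmd += ["--master", master]
    cmd += ["--py-files", egg]
    # Extra flags such as --num-executors are passed through unchanged.
    cmd += list(extra_args)
    # The application script goes last, after all spark-submit options.
    cmd.append(script)
    return cmd


local_cmd = build_spark_submit(
    "twodomain_demo.py",
    "dist/xmap-0.1.0-py3.5.egg",
    master="local[4]",
)
print(" ".join(local_cmd))

# To actually launch the job, uncomment the next line:
# subprocess.check_call(local_cmd)
```

For a cluster run, the resource flags from the second command can be supplied through `extra_args`, for example `extra_args=["--num-executors", "30", "--executor-cores", "3"]`.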
X-MAP can be easily used with any publicly available recommender library. We provide an example below for using Spark's built-in MLlib library with X-MAP.
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.recommendation import ALS
from xmap.core import *
# Use the X-MAP components to build the AlterEgo profiles.
sourceRDD = baseliner_clean_data_pipeline(...)
targetRDD = baseliner_clean_data_pipeline(...)
trainRDD, testRDD = baseliner_split_data_pipeline(...)
item2item_simRDD = baseliner_calculate_sim_pipeline(...)
extendedsimRDD = extender_pipeline(...)
alterEgo_profileRDD = generator_pipeline(...)
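# Use MLlib's ALS to perform matrix factorization.
# Note: pyspark.ml estimators operate on DataFrames, so convert the inputs first if the pipelines return plain RDDs.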
als = ALS(...)
model = als.fit(alterEgo_profileRDD)
predictions = model.transform(testRDD)
evaluator = RegressionEvaluator(...)
rmse = evaluator.evaluate(predictions)
print("Root-mean-square error = " + str(rmse))
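For readers unfamiliar with the metric printed at the end, root-mean-square error is the square root of the mean squared difference between predicted and actual ratings. The plain-Python sketch below illustrates that definition on a handful of made-up ratings; the function name `rmse` and the sample values are assumptions for illustration and are unrelated to the Amazon traces or to the elided evaluator arguments above.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up ratings, purely for illustration.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]

# Squared errors are 1, 0, 1 and 4, so the mean is 1.5.
print("Root-mean-square error = " + str(rmse(predicted, actual)))
```

A lower value means the predicted ratings are closer to the held-out ratings, which makes the metric convenient for comparing different recommenders trained on the same AlterEgo profiles.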
Please report potential bugs on GitHub. If you have an open-ended or research-related question, you can post it to the X-MAP group.