Recommendation Systems Using Alternating Least Squares with PySpark, Implemented on the Flask Framework

Framework: Flask
- flask_bcrypt
- flask_wtf
- flask_paginate
Database: MongoDB
- pymongo
- mongo-connect (pyspark package)
Styling: Tailwind CSS
- @tailwindcss/forms
Apache Spark: PySpark

Overview

Demo

Data

ml-latest-small - The small MovieLens dataset
data.json - Contains the scraped movie data written by to_json.py

Application Flow

graph LR
    A[Movielens Dataset] --> |preprocessing| B(Training Model)
    C[Experiment] --> |Tuning| B
    B --> |Integration| D[Flask prototype]

Scraping

selection.py - Selects the imdbIds of movies across different genres and stores them in a dictionary, with the genre tag as the key and a list of imdbIds as the value.
scrape.py - Uses bs4 and Requests to extract each movie's imdbId, title, year, poster, rating, summary, time, and genres, and returns them serialized with json.dumps.
to_json.py - Appends the scraped movie data to the list in data.json, which must initially contain an empty list ("[]"). Seeding the file with an empty list is a little odd, but it works. Saving to JSON first makes it possible to add both movieId and imdbId and confirm that the two IDs match, to process movies with multiple genres and ensure that genres is stored as an array in the database, and to filter which records are inserted (for example, only movies from before or after a given year).
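
The append step described for to_json.py can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: the function name `append_movie`, the comma-separated genre string, and the sample record are all hypothetical. It also shows why the file is seeded with "[]": an empty file is not valid JSON, so there would be no list to append to.

```python
import json
from pathlib import Path


def append_movie(path, movie):
    """Append one scraped movie record to the JSON list stored at `path`.

    Hypothetical helper: the real to_json.py may be organized differently.
    Returns the number of records in the file after the append.
    """
    file = Path(path)
    text = file.read_text(encoding="utf-8") if file.exists() else ""
    # An empty file cannot be parsed as JSON, so treat it as an empty list.
    movies = json.loads(text) if text.strip() else []

    record = dict(movie)
    genres = record.get("genres", [])
    # Assumption: genres may arrive as "A, B"; store them as a list so the
    # database receives an array.
    if isinstance(genres, str):
        genres = [g.strip() for g in genres.split(",") if g.strip()]
    record["genres"] = genres

    movies.append(record)
    file.write_text(json.dumps(movies, indent=2), encoding="utf-8")
    return len(movies)


if __name__ == "__main__":
    # Illustrative record only; the field values are placeholders.
    sample = {"movieId": 1, "imdbId": "0000001", "title": "Example", "genres": "Drama, Comedy"}
    print(append_movie("data.json", sample))
```

Keeping both movieId and imdbId in each record makes it straightforward to check the two IDs against each other before the data is inserted into the database.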

Experiment ALS

CF_ALS.ipynb - This notebook contains the ALS experiment, including hyperparameter tuning, along with the data preprocessing used by both the ALS model and the web application. After preprocessing, PySpark with mongo-connect inserts the data into MongoDB.
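
For orientation, ALS hyperparameter tuning in PySpark might look roughly like the sketch below. It is not taken from the notebook: the search space (`RANKS`, `REG_PARAMS`), the 80/20 split, and the function names are assumptions, and the column names follow the MovieLens ratings file (userId, movieId, rating). The PySpark imports sit inside `tune_als` so the grid helper can be used without a Spark installation.

```python
from itertools import product

# Hypothetical search space; the notebook's actual values are not stated here.
RANKS = [5, 10]
REG_PARAMS = [0.05, 0.1]


def candidate_params():
    """Return every (rank, regParam) combination as ALS keyword arguments."""
    return [{"rank": r, "regParam": reg} for r, reg in product(RANKS, REG_PARAMS)]


def tune_als(ratings_df, seed=42):
    """Fit one ALS model per candidate and return (rmse, params, model) for the best.

    `ratings_df` is assumed to be a Spark DataFrame with userId, movieId and
    rating columns.
    """
    from pyspark.ml.evaluation import RegressionEvaluator
    from pyspark.ml.recommendation import ALS

    train, test = ratings_df.randomSplit([0.8, 0.2], seed=seed)
    evaluator = RegressionEvaluator(
        metricName="rmse", labelCol="rating", predictionCol="prediction"
    )

    best = None
    for params in candidate_params():
        als = ALS(
            userCol="userId",
            itemCol="movieId",
            ratingCol="rating",
            coldStartStrategy="drop",  # skip users/movies unseen during training
            seed=seed,
            **params,
        )
        model = als.fit(train)
        rmse = evaluator.evaluate(model.transform(test))
        if best is None or rmse < best[0]:
            best = (rmse, params, model)
    return best
```

Setting `coldStartStrategy="drop"` keeps NaN predictions for unseen users or movies out of the RMSE, which would otherwise make every candidate evaluate to NaN.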

Flask Web Application

The web application is in the web folder.

Static folder - Contains the CSS config for Tailwind CSS, the JavaScript that handles the rating display, and other static files.
Templates folder - Contains the HTML templates rendered with Jinja2.

layouts folder - Base templates for the shared layout, including the header and footer.
Other folders and files - Templates for the pages that match each file or folder name.

Running Web App Locally

Before running, make sure the prerequisites are installed and the configuration is set:

Install Node (for Tailwind CSS), Python, and Apache Spark.
Set the MongoDB port in web/app.py.

pip install -r requirement.txt
cd web
npm install 
npm run watch
python app.py
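
As a rough illustration of the MongoDB port setting mentioned above, the connection configuration in an app like web/app.py could look like the sketch below. Everything here is an assumption rather than the repository's code: the environment variable names, the `movies` database name, and both helper functions are hypothetical. 27017 is MongoDB's default port.

```python
import os

# Hypothetical settings; adjust to match the local MongoDB instance.
MONGO_HOST = os.environ.get("MONGO_HOST", "localhost")
MONGO_PORT = int(os.environ.get("MONGO_PORT", "27017"))


def mongo_uri(host=MONGO_HOST, port=MONGO_PORT, database="movies"):
    """Build a MongoDB connection URI from host, port and database name."""
    return f"mongodb://{host}:{port}/{database}"


def get_database(host=MONGO_HOST, port=MONGO_PORT, database="movies"):
    """Connect with pymongo and return the database named in the URI."""
    from pymongo import MongoClient  # imported lazily; requires pymongo

    client = MongoClient(mongo_uri(host, port, database))
    return client.get_default_database()
```

Reading the port from an environment variable is one way to avoid editing the source when MongoDB runs on a non-default port.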

inurhuda00/flask-als

Folders and files

Latest commit

History

Repository files navigation

Recomendation Systems Using Alternating Least Square With Pyspark, Implemented on Flask Framework

Overview

Demo

Data

Application Flow

Scraping

Experiment ALS

Flask Web Application

Running Web App Locally

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages