Peer-To-Peer Lending: Optimizing The Classification And Matching Of Digital Karmic Personas
AUTHOR(S)
Sanya Sharma
KEYWORDS
Dimensionality Reduction, Feature Selection, Gradient Boosted Decision Tree, Karmic Digital Persona, Latent Dirichlet Allocation, Peer-to-Peer Lending, Principal Component Analysis
ABSTRACT
Student debt in America amounts to $1.6 trillion and is a growing concern. My personalized peer-to-peer lending platform, Coinsequence, aims to alleviate this crisis and to offer do-good investors a curated platform where they can invest directly in students’ karma and profiles. In this paper, I propose several machine learning (ML) techniques to personalize peer-to-peer lending efficiently and to find ideal investor-student matches on the platform. Investors search for students by entering criteria that can range from a single keyword to free-flowing text, for example from “tennis” to “volunteers at a community shelter to help stray animals get sterilized.” To analyze these inputs, I will assess Latent Dirichlet Allocation (LDA) as a way to sort them into categories and so facilitate efficient pairing. Furthermore, I will evaluate a Gradient Boosted Decision Tree (GBDT) algorithm to match students’ digital personas to investors’ searches, using the clusters produced by LDA in the conditional tests of the decision trees. The Coinsequence dataset of students’ digital personas, composed of thousands of activities logged by millions of students, would be massive and would require the removal of redundant dimensions and features. To carry out this refinement, I will evaluate Principal Component Analysis (PCA) and a hybrid of filter and wrapper feature selection methods to remove extraneous variables and extract relevant features, which would help the ML algorithms achieve high accuracy and efficiency.
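As a rough illustration of what a hybrid of filter and wrapper feature selection can look like, the following minimal sketch first drops near-constant features (the filter step) and then greedily adds the features that most improve a classifier's accuracy (the wrapper step). It is not the Coinsequence implementation: the toy data, the function names (`filter_step`, `loo_accuracy`, `wrapper_step`), the variance threshold, and the nearest-centroid classifier are all assumptions chosen to keep the example self-contained.

```python
import random
from statistics import mean, pvariance

# Hypothetical toy data: each row is one student, each column one feature.
# Column 0 tracks the label, column 1 is constant, column 2 is pure noise.
rng = random.Random(0)
y = [i % 2 for i in range(40)]  # assumed label: 1 = investor matched the student
X = [[2.0 * label + rng.gauss(0, 0.3), 1.0, rng.gauss(0, 1.0)] for label in y]


def filter_step(X, min_variance=1e-3):
    """Filter: keep the column indices whose variance exceeds a threshold."""
    return [j for j in range(len(X[0]))
            if pvariance([row[j] for row in X]) > min_variance]


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on `features`."""
    correct = 0
    for i in range(len(X)):
        # Build one centroid per class from every row except the held-out one.
        centroids = {}
        for label in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            if rows:
                centroids[label] = [mean(r[j] for r in rows) for j in features]
        point = [X[i][j] for j in features]
        predicted = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )
        correct += predicted == y[i]
    return correct / len(X)


def wrapper_step(X, y, candidates, min_gain=0.01):
    """Wrapper: greedy forward selection driven by classifier accuracy."""
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        scored = [(loo_accuracy(X, y, selected + [j]), j) for j in remaining]
        score, j = max(scored, key=lambda s: (s[0], -s[1]))
        if score - best < min_gain:
            break  # no remaining feature improves accuracy enough
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best


kept = filter_step(X)                       # cheap pass over all features
selected, best = wrapper_step(X, y, kept)   # costly pass over the survivors only
print("after filter:", kept)
print("after wrapper:", selected, "accuracy:", round(best, 3))
```

The ordering is the point of the hybrid: the filter is cheap and needs no model, so it can prune a very wide feature set before the wrapper, which retrains a model for every candidate, runs on what remains.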
REFERENCES
[1] Google Sheets - create and edit spreadsheets online, for free. (n.d.). Retrieved from https://docs.google.com/spreadsheets/d/1me3oiIwLcHSxnV0a_5RGdhNvqykxOuKJnUNjHJ1KxpI/edit#gid=120744116
[2] The AI Behind LinkedIn Recruiter search and recommendation systems. (n.d.). Retrieved from https://engineering.linkedin.com/blog/2019/04/ai-behind-linkedin-recruiter-search-and-recommendation-systems
[3] Dwivedi, P. (2019, March 27). NLP: Extracting the main topics from your dataset using LDA in minutes. Retrieved from https://towardsdatascience.com/nlp-extracting-the-main-topics-from-your-dataset-using-lda-in-minutes-21486f5aa925
[4] Li, S. (2018, June 01). Topic Modeling and Latent Dirichlet Allocation (LDA) in Python. Retrieved from https://towardsdatascience.com/topic-modeling-and-latent-dirichlet-allocation-in-python-9bf156893c24
[5] Luhaniwal, V. (2019, October 04). Feature selection using Wrapper methods in Python. Retrieved from https://towardsdatascience.com/feature-selection-using-wrapper-methods-in-python-f0d352b346f
[6] Magee, J. F. (2014, August 01). Decision Trees for Decision Making. Retrieved from https://hbr.org/1964/07/decision-trees-for-decision-making
[7] Raj, J. T. (2019, March 14). Dimensionality Reduction for Machine Learning. Retrieved from https://towardsdatascience.com/dimensionality-reduction-for-machine-learning-80a46c2ebb7e