I am a third-year undergraduate at UC San Diego pursuing a Bachelor of Science in mathematics and computer science. I'm specializing in machine learning, quantitative finance, and statistics/probability.
I am currently a summer analyst at BlackRock on the ETF & Index Investments research team, working on natural language processing. At UC San Diego, I am a member of the Mathematical Neuroscience lab, advised by Gabriel Silva, where I research adaptive and dynamic neural networks. Last summer, I interned at CareFusion BD as a data scientist, working on time-series forecasting and machine learning models.
Team: ETF & Index Investments Global Research and Analytics
Developed time-series forecasting and machine learning models to predict drug shortages and price changes. Analyzed and visualized datasets with more than 10 million drug usage and transaction records. Used dimension-reduction techniques (PCA, SVD, LDA), Fourier and log transformations, and resampling techniques (bagging, boosting) to identify correlations between variables and extract underlying patterns in the data. Worked with multiple regression and classification models: linear models, ARIMA, boosted trees (xgboost), random forests, and SVMs.
Tutor for Object-Oriented Programming (CSE 11) and Data Structures (CSE 12). Worked with the instructor to design and write programming assignments and their specifications. Held office hours and led review sessions to explain programming concepts and help students implement programming assignments by analyzing and debugging their code. Graded homework and exams, and wrote submission/grading scripts for programming assignments.
Wrote Python scripts to set up a continuous integration server that automates package builds. Developed multiple native Linux (Ubuntu, CentOS, Debian) packages using Bash and Python for Kolibri, Learning Equality's flagship application. Optimized software setup on all platforms by implementing efficient installation scripts.
Developing dynamic, adaptive neural networks using data assimilation training methods. Applying non-linear transfer functions between neurons and layers to create complex dynamic geometric networks.
Support vector machines (SVMs) are an extremely powerful machine learning tool for solving a variety of classification problems. Not only are they less prone to over-fitting because of their large margins, but they are also easy to optimize because the underlying problem is convex. In this paper we review both the soft-margin and hard-margin formulations of linear SVMs. First, we discuss how to solve soft-margin SVMs via the dual formulation, and justify why the dual problem in fact gives the optimal solution of the primal form. Then, we discuss the kernel trick for solving non-linear classification problems using convex optimization. Finally, we perform classification on real-world data with both linear and non-linear SVMs, using the algorithms developed earlier.
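For reference, the soft-margin primal and dual problems mentioned in this abstract are usually written as follows. This is the standard textbook form, not necessarily the notation used in the paper itself; the symbols $w$, $b$, $\xi_i$, $\alpha_i$, the penalty parameter $C$, and the kernel $K$ are assumptions introduced here for illustration.

```latex
% Soft-margin primal, for training pairs (x_i, y_i) with y_i in {-1, +1}
\min_{w,\,b,\,\xi} \quad \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0

% Corresponding dual
\max_{\alpha} \quad \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, x_i^\top x_j
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0
```

Because the primal is a convex quadratic program with affine constraints, strong duality holds and the dual optimum recovers the primal solution through $w = \sum_i \alpha_i y_i x_i$. The hard-margin case corresponds to dropping the slack variables $\xi_i$ (equivalently, removing the upper bound $C$ on $\alpha_i$), and the kernel trick replaces each inner product $x_i^\top x_j$ with $K(x_i, x_j)$.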
Analysis of the negative effects of gentrification in San Diego in the 21st century. Visualized the change in demographics across all San Diego neighborhoods using heat maps. Identified the neighborhoods most affected by gentrification and found patterns among multiple socio-economic factors such as poverty, population, uninsurance, and property value. Languages/Tools Used: Python (Pandas, NumPy, Matplotlib, Patsy), Jupyter Notebooks.
Data-science-powered web application that performs sentiment analysis on YouTube comments. Trained the model using machine learning techniques on a dataset of 1 million tweets. Wrote Python scripts for web scraping and for performing sentiment analysis on the comments. Languages/Tools Used: Python, Natural Language Toolkit, Flask.
Android application that eases the process of connecting with people across multiple social media platforms. Integrated the database, added location tracking, and developed the app structure. Languages/Tools Used: Java (Android), XML, Google Firebase.