Matthew Alhonte
Verified Expert in Engineering
Data Scientist and Developer
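Matt has officially worked as a Python-based data scientist for the past six years; however, he has been working at the intersection of statistics and programming for the past decade (since before the term "data scientist" became popular). He combines strong technical skills with a rigorous background in experimental design and statistical inference. More recently, he's been focusing on machine learning, including some natural language processing and computer vision.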
Preferred Environment
PyCharm, Git, Spacemacs, Visual Studio Code (VS Code), Jupyter
The most amazing...
...thing I've done is reverse engineer an undocumented file format containing electrophysiology readings.
Work Experience
Data Science Consultant
Ophidian Scientific
- Assisted numerous small clients with data-related work, ranging from data science and analysis to data engineering and machine learning engineering.
- Designed and built ETL pipelines in Python, Dask, and Prefect.
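- Oversaw migrations between Google Sheets and Airtable. Airtable automation was executed in Python.
- Used operations research libraries in Python to optimize teams for the sports betting site FanDuel.
- Built a natural language processing (NLP) classifier for archiving articles for a finance-based publication.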
Data Scientist & Data Architect
Birch Infrastructure
- Helped design data infrastructure for a utility-scale renewable energy and data center company.
- Created data pipelines with Prefect, mostly stitching together Google Cloud Functions and Cloud Run jobs.
- Managed a BigQuery data warehouse with dbt, creating table schemas and transformations.
- Set up data infrastructure (including Prefect and dbt).
Senior Data Scientist
The University of Colorado — Office of Data Analytics
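- Performed statistical analyses and modeling to support student success and helped establish practices during the restructuring of the university's Office of Data Analytics.
- Created and presented results and visualizations to senior management using Jupyter and Zeppelin.
- Developed a Monte Carlo simulation-based model to forecast student retention each semester.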
- Built a Bayesian model of re-offense after student misconduct.
- Modeled the effects of different kinds of financial aid with XGBoost.
- Created a model to predict student GPAs with scikit-learn and Keras.
- Helped establish practices during the restructuring of the university's Office of Data Analytics.
Data Engineer
NOMI Beauty
- Designed and built data infrastructure for a startup that made it easier to book hair and makeup appointments in hotel rooms.
- Architected a big data pipeline with Spark, Kafka, and Cassandra.
- Built data dashboards in Tableau for the operations team.
- Designed an ETL for survey data from Typeform's API into MySQL.
- Created reports in Jupyter notebooks, using Python with Altair and Seaborn for data visualization.
- Designed and implemented a database schema in MySQL.
- Designed and supported ETL from Couchbase to MySQL using Python.
Data Science and Blockchain Integration Consultant
Tanktwo, Inc.
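- Built a blockchain-based solution for managing IoT devices and the data they generate.
- Created a demo of a potential network using Hyperledger.
- Simulated a private blockchain network in action, using Python.
- Helped present a demo to venture capitalists considering an investment.
- Researched the best blockchain implementations for the business's needs.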
Data Science Consultant
Hospital for Special Surgery
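- Worked on data science topics in a neurology lab studying intraoperative neurophysiological monitoring (IONM): the monitoring of muscles and nerves during surgery to prevent injury.
- Reverse engineered an undocumented file format containing biosignal data.
- Attempted to use scikit-learn to classify nerve conduction readings as indicating either injury or a response to anesthesia.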
- Visualized biosignal data with Plotly and presented findings.
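- Investigated the Higuchi fractal dimension of nerve conduction readings during surgery as a means of assessing potential injury.
- Analyzed biosignal data using the Python data suite (NumPy, Pandas, and SciPy).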
Natural Language Processing Consultant
New York City Department of Administrative Services
- Scraped PDF files with Python to help digitize the back catalog of the publication the City Record.
- Helped design a schema for entries (such as extracting addresses).
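- Created data-cleaning mechanisms to standardize entries reported in different formats by more than 100 city agencies.
- Performed exploratory natural language processing (NLP) on a century-long corpus of the publication using Python and NLTK.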
- Worked to integrate the pipeline into their MS Access database.
Integration and Development Consultant
Broadband Technologies Group
- Provided computer vision-based assistance for digitizing video archives.
- Used OpenCV and Python to tag damaged video areas.
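- Implemented automated repair of certain kinds of damaged video in Python.
- Helped build an Android app that provides synchronized captions for live performances.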
- Prepared presentations with Jupyter.
Research Assistant
Hunter College
- Designed and validated a novel psychometric scale.
- Analyzed survey data in SPSS.
- Presented findings at research conferences.
- Maintained relationships with the lab after graduation, eventually moving from data analysis to Python.
- Worked on the publication of older data.
Summer Research Assistant
Yale School of Medicine
- Designed and directed a small study investigating psychopathic traits and behavior in the ultimatum game.
- Analyzed GSR data.
- Administered computer-based tasks to study participants using Presentation and DMDX.
- Analyzed data from surveys and computer-based tasks.
- Built and maintained a database of participants.
Experience
Spring 2018 Complexity Challenge
http://github.com/mattalhonte/sfi-challenge
Graph Theory Notes
Binary Grid Search
http://hackersandslackers.com/tuning-machine-learning-hyperparameters-with-binary-search/
Recasting Low-cardinality Columns as Categoricals
http://hackersandslackers.com/recasting-low-cardinality-columns-as-categoricals-2
Removing Duplicate Columns in Pandas
http://hackersandslackers.com/remove-duplicate-columns-in-pandas
Downcast Numerical Data Types with Pandas
http://hackersandslackers.com/downcast-numerical-columns-python-pandas/
Sentiment Analysis With AWS SageMaker
http://github.com/mattalhonte/sagemaker-deployment/tree/master/Project
Epilepsy Classifier
http://github.com/mattalhonte/epilepsy-classifier
Python to Rust
Splitting Columns With Pandas
http://hackersandslackers.com/splitting-columns-with-pandas/
Education
Bachelor of Arts Degree in Psychology
Hunter College - New York City, NY, USA
Certifications
Machine Learning Engineer Nanodegree
Udacity
Skills
Libraries/APIs
Pandas, Scikit-learn, TensorFlow Deep Learning Library (TFLearn), XGBoost, NumPy, Keras, Dask, SciPy, OpenCV, Natural Language Toolkit (NLTK), PySpark, TensorFlow
Tools
DataViz, Jupyter, Spacemacs, PyCharm, SPSS, Plotly, DMDX, Git, Amazon SageMaker, BigQuery
Languages
Python 3, Python, SQL, Snowflake, Clojure, Rust
Paradigms
Data Science, Database Design, Agile, Functional Programming, ETL
Platforms
Jupyter Notebook, Amazon Web Services (AWS), Docker, Hyperledger, Oracle Database, Linux, Zeppelin, Apache Kafka, Google Cloud Platform (GCP), Visual Studio Code (VS Code)
Frameworks
Spark
Storage
Databases, NoSQL, Cassandra, PostgreSQL, MySQL
Other
Data, Statistical Data Analysis, Exploratory Data Analysis, Unstructured Data Analysis, Complex Data Analysis, Statistical Methods, Statistical Modeling, Statistical Forecasting, Statistical Analysis, Statistical Significance, Random Forests, Random Forest Regression, Experimental Design, Time Series, Machine Learning, Predictive Modeling, Data Visualization, Data Analysis, Data Analytics, Statistics, Computational Statistics, Bayesian Statistics, Statistical Programming, Amazon Machine Learning, Tf-idf, Convolutional Neural Networks (CNN), Analysis of Variance (ANOVA), Dashboards, Analytical Dashboards, Data Build Tool (dbt), Deep Learning, Natural Language Processing (NLP), Mathematical Modeling, Data Engineering, GPT, Generative Pre-trained Transformers (GPT), Operations Research, Simulations, PyEEG, Scientific Data Analysis, Prefect, Serverless