Nicolas Keller, Developer in Berlin, Germany

Nicolas Keller

Verified Expert in Engineering

Data Scientist and Developer

Berlin, Germany
Toptal Member Since
January 21, 2020

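With a strong mathematical background (a master's degree in mathematics), Nicolas is a passionate data scientist who brings an ideal combination of machine learning knowledge, practical programming skills, and a problem-solving, analytical mindset to a project. He has experience translating business problems into data-driven solutions, most recently as a data scientist at the global insurance company Allianz.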






Preferred Environment

Jupyter Notebook, RStudio, Git, Linux

The most amazing...

...thing I've coded is an R package that predicts life insurance claims based on individual characteristics. It is a novel approach going beyond the status quo.

Work Experience

Data Science Lead

2021 - PRESENT
  • Led the development of data science and data engineering projects in the clinical trial domain.
  • Set up a cloud infrastructure for model training, deployment, and application prototyping, which increased the impact and visibility of our team within the organization.
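  • Designed and managed a Neo4j graph database to organize data sets centrally and used graph algorithms to answer complex business questions.
  • Developed an interface to the graph database, allowing non-technical users to ask questions in natural language. We trained a deep learning model to translate English into Cypher (a graph query language).
  • Created an optimization algorithm to select the best clinical trial sites and oversaw its rollout and integration into business operations.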
Technologies: Python, Neo4j, Amazon EC2, Machine Learning, Large Language Models (LLMs), GPT, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Data Engineering, Project Leadership, Dataiku, Quantum Computing, Pandas, Data Science, Web Scraping, Web Crawlers, Microsoft Excel, Statistical Analysis, SQL, Linux, Business Intelligence (BI), NumPy, Scikit-learn, Git, Databases, Data Analysis, Dashboards, Dashboard Development, Data Visualization, Dashboard Design, Plotly, Process Automation, Jupyter Notebook

Data Engineer

2020 - PRESENT
  • Supported the external adjudication process of two medical studies.
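  • Automated the process of merging, filling, and sending a large number of PDF forms using Python.
  • Tracked the data exchange with adjudicators through automated Excel sheets and provided a dashboard to create progress reports.
  • Used Selenium to automate downloading and gathering PDF files from a website, a task that would otherwise have taken weeks of manual work.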
Technologies: Dashboard Design, Selenium, Automation, Web Crawlers, Python, Reporting, Database Management, Microsoft Office, Pandas, Data Science, Web Scraping, Microsoft Excel, Linux, Git, Databases, Data Analysis, Dashboards, Dashboard Development, Data Visualization, Plotly, Process Automation, GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Jupyter Notebook

Data Scientist

2020 - 2021
Focus Sensors Limited
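  • Studied the implementation and mathematical concepts of an extensive codebase of anomaly detection algorithms for sensor data.
  • Changed the core architecture from static data files to streaming data using Kafka.
  • Tested and optimized the new architecture with respect to processing time and output integrity.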
Technologies: Python, Algorithms, Apache Kafka, SciPy, Testing, Streaming Data, Signal Processing, Docker, Pandas, Data Science, Linux, NumPy, Scikit-learn, Git, Data Analysis, Data Visualization, Jupyter Notebook

Data Scientist

2020 - 2021
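  • Implemented and productionized a personalized machine learning algorithm to classify transaction data using AWS SageMaker and Lambda infrastructure.
  • Detected trends in customer behavior and created frequent reports presenting the results, which have been published regularly on the company's website.
  • Completed various data analyses and PoCs in response to business requests, using SQL and Python as the back end and Jupyter Notebooks and Plotly to present the results.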
Technologies: Financial Data, Applied Mathematics, XGBoost, LightGBM, Time Series Analysis, Reporting, Data Visualization, Data Analysis, MongoDB, Automation, Amazon Athena, Databases, Git, Scikit-learn, NumPy, Business Intelligence (BI), Linux, Statistical Analysis, Algorithms, Data Science, Data Reporting, Pandas, Data Analytics, Amazon Web Services (AWS), Natural Language Processing (NLP), Machine Learning, Plotly, Jupyter Notebook, SQL, Amazon SageMaker, Python, Redshift, Big Data, Dashboards, Dashboard Development, Dashboard Design

Data Scientist

2020 - 2020
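  • Analyzed microloan data to identify the relevant features influencing repayment behavior.
  • Implemented and tested a Python module that returns a credit risk score along with a detailed explanation.
  • Deployed the module on AWS SageMaker and Lambda infrastructure, fully integrating it with the current system.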
Technologies: Amazon Web Services (AWS), Financial Data, Software Engineering, Loans & Lending, Credit Risk, Amazon SageMaker, Python, Redshift, Pandas, Data Science, Statistical Analysis, NumPy, Scikit-learn, Data Analysis, Data Visualization, Plotly

Data Scientist

2019 - 2020
Sopra Steria España
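  • Developed new methods to measure the business success of retail clients and implemented them in Python.
  • Performed post-analysis of retail promotions using SQL and Python and presented the results to stakeholders.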
  • Optimized SQL queries to extract insights from large tables.
  • Refactored and optimized an internal Python package to extract and visualize statistics of large database tables.
Technologies: Reporting, Data Visualization, Data Analysis, SQL Server Management Studio (SSMS), Business Intelligence (BI), Pandas, Data Analytics, Microsoft Excel, Azure, Databricks, Python, SQL, Big Data, MySQL, Data Science, Statistical Analysis

Data Scientist (Master Thesis Student)

2019 - 2019
  • Wrote my thesis about machine learning methods to model life tables.
  • Performed the preprocessing, analysis, and modeling of data with a size of over 100GB.
  • Built and tested an R package for internal usage in the actuarial department.
  • Gave a final presentation in front of experts as part of an official training series.
  • Implemented exhaustive performance optimization of R code using vectorization, parallelization, and optimized packages.
Technologies: Applied Mathematics, Ggplot2, Data Analysis, Mathematics, SQL, Algorithms, Data Analytics, Plotly, Markdown, LaTeX, Python, R, Machine Learning, Data Science, Data Visualization

Data Scientist

2018 - 2019
  • Implemented and supported extensive interactive data-driven dashboards in R-Shiny.
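  • Developed a product recommendation system for corporate clients based on client characteristics and product history.
  • Built an efficient automated system for the early detection of product or business process issues based on client complaint data.
  • Implemented visualizations of complex data and presentations of insights using Plotly, D3.js, and R Markdown.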
  • Performed topic modeling and text mining of client complaint texts using LDA.
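  • Built presentations about theory and programming packages in the field of machine learning.
  • Created internal programming packages to simplify and streamline frequently used data science tasks.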
Technologies: Markdown, Microsoft PowerPoint, Natural Language Processing (NLP), XGBoost, LightGBM, Ggplot2, Financial Markets, Random Forests, Reporting, Dashboard Design, Data Visualization, Dashboard Development, Dashboards, Data Analysis, CSS, Databases, Business Intelligence (BI), Statistical Analysis, Data Science, Data Reporting, Machine Learning, Data Analytics, Microsoft Excel, Plotly, RStudio Shiny, SQL, Git, Python, RStudio, MySQL, Web Scraping, Web Crawlers


2017 - 2018
Fraunhofer Institute for Industrial Mathematics ITWM
  • Worked on the Senrisk project (senrisk.eu), predicting price movements of corporate and sovereign bonds based on news sentiment.
  • Built recurrent neural networks in PyTorch to predict financial time series.
  • Developed statistical methods for fraud detection in the health insurance industry.
  • Implemented a Python package for financial time series prediction, including the integration to a web service.
  • Built a software prototype in R Shiny to visualize the impact of different sample sizes in a fraud detection context.
Technologies: Financial Data, Applied Mathematics, Neural Networks, Time Series Analysis, Financial Markets, Optimization, Random Forests, Data Analysis, Keras, Scikit-learn, Statistical Analysis, Algorithms, Data Science, Machine Learning, Data Analytics, R, RStudio Shiny, PyTorch, Anaconda, Linux, Python, RStudio, Data Visualization


2016 - 2017
Universidad Técnica Federico Santa María
  • Implemented financial valuation software based on the Black-Scholes model in C#.
  • Created a detailed report about the theoretical foundations of option price valuation.
  • Conducted research related to the Black-Scholes model and financial time series.
Technologies: Financial Markets, Data Analysis, Data Analytics, Microsoft Excel, R, C#


As a member of the Fraunhofer ITWM research institute, I participated in the EU-funded SENRISK project. The project's main goal was to predict the prices of corporate and sovereign bonds based on news sentiment.

My part was mainly the implementation of the prediction system. We used recurrent neural networks and boosting methods and built a Python package to streamline the whole process.

Analysis and Visualization of WhatsApp Chats
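A Python toolkit to visualize WhatsApp chats. It offers some fun visualizations of single or group chats. Additionally, it has an interactive dashboard for navigating the visualizations. It converts raw text files into convenient data frames and handles different input formats, including those of different iOS and Android versions.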

Machine Learning Demonstration Tool
This Shiny app serves as a small user interface to demonstrate some standard machine learning tasks. You can upload an example data set and edit, visualize, and model it. I used it for demonstration purposes, especially when showing basic ML concepts to non-technical users.

Android App Course Analyzer
An Android application to track given courses. It is used by a small group of people for a specific business use case. You can insert a course and specify received evaluations.

On the main page, you have an overview of all courses, and you can export and import a list of the courses. Finally, you can see statistics of the evaluations, along with a map that shows the course locations and provides some additional information. Currently, it is only available in Spanish.

2016 - 2019

Master of Science Degree in Financial and Actuarial Mathematics

Technical University Kaiserslautern - Kaiserslautern, Germany

2016 - 2016

Spent an Exchange Year in Financial Mathematics

Universidad Técnica Federico Santa María - Valparaíso, Chile

2013 - 2016

Bachelor of Science Degree in Mathematics

Technical University Kaiserslautern - Kaiserslautern, Germany


Quantum Applications Lab



Quantum Business Foundations



Neo4j Certified Professional



Big Data Fundamentals with PySpark



Applying SQL to Real-world Problems



Pandas, Ggplot2, XGBoost, Keras, Scikit-learn, Beautiful Soup, NumPy, PySpark, PyTorch, SciPy


Plotly, Amazon SageMaker, Amazon Athena, Git, LaTeX, Microsoft Excel, PyCharm, Microsoft PowerPoint


RStudio Shiny, LightGBM, Selenium


Python, R, SQL, C#, Markdown, Kotlin, HTML, CSS


Data Science, Business Intelligence (BI), Automation, Testing


Jupyter Notebook, RStudio, Amazon Web Services (AWS), Linux, Azure, Anaconda, Databricks, Apache Kafka, Docker, Amazon EC2, Dataiku


Redshift, Databases, SQL Server Management Studio (SSMS), MySQL, MongoDB, Neo4j, Database Management, Graph Databases


Data Analysis, Dashboards, Data Analytics, Applied Mathematics, Mathematics, Dashboard Development, Data Visualization, Machine Learning, Natural Language Processing (NLP), Random Forests, Dashboard Design, Reporting, Data Reporting, GPT, Generative Pre-trained Transformers (GPT), Financial Markets, Financial Data, Big Data, Algorithms, Process Automation, Optimization, Web Scraping, Web Crawlers, Time Series Analysis, Statistical Analysis, Neural Networks, Credit Risk, Loans & Lending, Software Engineering, Data Engineering, Android Development, Streaming Data, Signal Processing, Large Language Models (LLMs), Project Leadership, Microsoft Office, Quantum Computing, Quantum Machine Learning
