Ben for short. Experienced in end-to-end data science & engineering using Python, but I consider myself a generalist. Deeply excited about autonomy and cryptocurrency (disjointly). Standing up out of chairs whenever possible. Keeping beginner's mind as best as I can.

The AI Project Manager

Year: 2018

Publication: IEEE International Conference on Big Data

Authors: Benjamin Schreck, Shankar Mallapur, Sarvesh Damle, Nitin John James, Sanjeev Vohra, Rajendra Prasad, Kalyan Veeramachaneni

In this paper we describe a system to augment human project managers of large-scale software projects by predicting future SLA violations, using a machine learning approach. We detail the way we collected, processed, and formatted the data from a much larger database ill-suited for the problem at hand. We then detail how we extracted training examples through prediction engineering, as well as our semi-automated feature engineering approach. Our system was able to achieve a false positive rate of .311, compared to a non-machine learning baseline and a simple machine learning baseline without feature engineering, which scored .689 and .723 respectively...

Link

Machine learning 2.0: Engineering data driven AI products

Year: 2018

Publication:

Authors: Max Kanter, Benjamin Schreck, Kalyan Veeramachaneni

In this paper, we propose a paradigm shift from the current practice of creating machine learning models that requires months-long discovery, exploration and “feasibility report” generation, followed by re-engineering for deployment, in favor of a rapid 8-week-long process of development, understanding, validation and deployment that can be executed by developers or subject matter experts (non-ML experts) using reusable APIs. It accomplishes what we call a “minimum viable data-driven model,” delivering a ready-to-use machine learning model for problems that haven’t been solved before using machine learning…

Link

What would a data scientist ask? Automatically formulating and solving prediction problems

Year: 2016

Publication: IEEE International Conference on Data Science and Advanced Analytics

Authors: Benjamin Schreck, Kalyan Veeramachaneni

In this paper, we designed a formal language, called Trane, for describing prediction problems over relational datasets, and implemented a system that allows data scientists to specify problems in that language. We show that this language is able to describe several prediction problems, including ones on Kaggle, a data science competition website. We express 29 different Kaggle problems in this language. We designed an interpreter, which translates input from the user, specified in this language, into a series of transformation and aggregation operations…

Link
Video

Data Science Foundry for MOOCs

Year: 2015

Publication: IEEE International Conference on Data Science and Advanced Analytics

Authors: Sebastien Boyer, Ben U. Gelman, Benjamin Schreck, Kalyan Veeramachaneni.

In this paper, we present the concept of a data science foundry for data from Massive Open Online Courses. In the foundry we present a series of software modules that transform the data into different representations. Ultimately, each online learner is represented using a set of variables that capture his/her online behavior. These variables are captured longitudinally over an interval. Using this representation we then build a predictive analytics stack that is able to predict online learners' behavior in real time as the course progresses. To demonstrate the efficacy of the foundry, we attempt to solve an important prediction problem...

Link

Better Scheduling Software and User Interface for Liquid-Handling Robots

Year: 2014

Publication: 6th International Workshop on Bio-Design Automation

Authors: Benjamin Schreck, Jonathan Babb, Ron Weiss

This paper outlines the design of a software system that enhances the workflow efficiency and usability of a liquid-handling robot for executing wet lab protocols, with a focus on synthetic biology. The system we propose builds on the BioCAD software suite...

Link