Ben for short. Experienced in end-to-end data science & engineering using Python, but I consider myself a generalist. Deeply excited about autonomy and cryptocurrency (disjointly). Standing up out of chairs whenever possible. Keeping beginner's mind as best as I can.

Feature Labs

Co-Founder & Chief Data Scientist

Years: 2016 - 2018

Purpose & History
  • Feature Labs spun out of MIT’s Data-to-AI (DAI) lab to solve a problem we kept seeing: medium to large enterprises were unable to build and deploy machine learning models
  • Helped lead a team of ~10 people, including mentoring less experienced members
  • Marketed and sold enterprise software products
  • Worked with remote contractors and partners across the world
  • Architected and developed enterprise and open-source software products that scale across data size (from kilobytes to terabytes) and complexity (hundreds of columns, tens of tables, all data types)
  • Developed tutorials, demos, blogs, and other learning materials
  • Developed novel data science techniques & software for feature selection, feature engineering, automation, data ingestion, working with multimodal data, dealing with the time dimension, fusing deep learning and traditional machine learning techniques, and catering to non-technical users through a UI
  • Gave demos to potential customers
  • Helped raise venture capital
  • Contributed significant work to our team's entry into the DARPA D3M program/competition. This included building core primitives and automation software (see Projects tab), interfacing with our team members at MIT (led by Prof. Saman Amarasinghe), interfacing with other D3M teams who developed primitives or UIs, contributing to shared code (see Open Source), and giving presentations.
  • Core contributor to our open-source product Featuretools, the most popular feature engineering library on GitHub. See Projects for more details.
Core Product
  • Enterprise-ready end-to-end backend system and UI for predictive analytics to be used concurrently by multiple developers and subject-matter experts at corporations
  • Deployment either on customer's private cloud or on AWS
  • ML 2.0 workflow for deployment-focused machine learning (I had a lead role in the first deployment, at Accenture)
  • Novel feature engineering techniques for working with time-based, relational datasets, which could be small or large, multimodal, and complex.
  • Novel techniques for dealing with the time dimension to increase confidence and deployment accuracy
  • Prediction Engineering as a new data science abstraction; for an overview, see my paper and video
  • Successfully scaled to large data sets and multiple simultaneous users by partitioning and processing in parallel across cores & machines.
  • Successfully trained customers' software engineers with no or very little prior machine learning experience to develop their own complex predictive models
  • Raised $1.5 MM in seed funding from Flybridge Capital
  • Bootstrapped for the first year on customer revenue
  • Sold 6-figure annual software licenses
  • Significant recurring funding from the DARPA D3M program

These are some of Feature Labs' publicly disclosed programs & customers, or users of software produced by Feature Labs.