Featuretools is intended to be run on datasets that can fit in memory on one machine. For advice on handling large datasets, refer to Improving Computational Performance.
If you would like to test Feature Labs APIs for running Featuretools natively on Apache Spark or Dask, please let us know here.
Bring your own labels
If you are doing supervised machine learning, you must supply your own labels and cutoff times. To structure this process, you can use Compose, which is an open source project for automatically generating labels with cutoff times.
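To make the idea of labels with cutoff times concrete, here is a minimal sketch that builds such a table by hand with pandas. It does not use Compose; the transaction data, the column names, and the `make_labels` helper are all hypothetical and exist only for illustration. Each row pairs an instance id with a cutoff time and a label computed only from data at or after that cutoff.

```python
import pandas as pd

# Hypothetical transaction log; the column names are illustrative only.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "time": pd.to_datetime([
        "2021-01-05", "2021-01-20", "2021-02-10",
        "2021-01-15", "2021-02-20",
    ]),
    "amount": [20.0, 35.0, 80.0, 10.0, 5.0],
})


def make_labels(df, cutoff, window, threshold):
    """Hypothetical helper: label each customer by whether their total
    spend in [cutoff, cutoff + window) exceeds `threshold`."""
    # Only data at or after the cutoff is used to compute the label.
    future = df[(df["time"] >= cutoff) & (df["time"] < cutoff + window)]
    spend = future.groupby("customer_id")["amount"].sum()

    labels = pd.DataFrame({
        "customer_id": sorted(df["customer_id"].unique()),
        "time": cutoff,  # the same cutoff time for every customer
    })
    # Customers with no transactions in the window get a spend of 0.
    labels["label"] = labels["customer_id"].map(spend).fillna(0.0) > threshold
    return labels


cutoff_times = make_labels(
    transactions,
    cutoff=pd.Timestamp("2021-02-01"),
    window=pd.Timedelta(days=30),
    threshold=50.0,
)
print(cutoff_times)
```

A table shaped like this (instance id, time, then label) is the kind of input that can be supplied through the `cutoff_time` argument of `featuretools.dfs`, so that features for each row are computed only from data available up to that row's cutoff time. Check the documentation for your installed version for the exact column requirements.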