Save Intermediate Feature Matrix Results

In this tutorial, we will go over how to save intermediate results when computing the feature matrix.

import featuretools as ft

In this example, we will use retail data from customers of a UK website, covering December 2010 to December 2011.

es = ft.demo.load_retail(nrows=10000)

Let’s use a simple feature for this example.

region = ft.Feature(es["customers"]["Country"])

We can supply “cutoff times” to specify that we want to calculate features one year after a customer’s first invoice.

import pandas as pd
cutoff_times = es["customers"].df[["CustomerID", "first_invoices_time"]].rename(
    columns={"CustomerID": "instance_id", "first_invoices_time": "time"})
cutoff_times["time"] = cutoff_times["time"] + pd.Timedelta("365 days")

Here is what some of the cutoff times look like.

         instance_id                time
17850.0      17850.0 2011-12-01 08:26:00
13047.0      13047.0 2011-12-01 08:34:00
12583.0      12583.0 2011-12-01 08:45:00
13748.0      13748.0 2011-12-01 09:00:00
15100.0      15100.0 2011-12-01 09:09:00
15291.0      15291.0 2011-12-01 09:32:00
14688.0      14688.0 2011-12-01 09:37:00
14527.0      14527.0 2011-12-01 09:41:00
15311.0      15311.0 2011-12-01 09:41:00
17809.0      17809.0 2011-12-01 09:41:00
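
The first row of the table can be checked by hand. The sketch below uses only the standard library and assumes a first-invoice time of 2010-12-01 08:26 for customer 17850; that value is inferred by subtracting 365 days from the cutoff shown above and is not stated in the source. The second date is purely hypothetical and is there to show that a fixed 365-day offset is not always the same as one calendar year.

```python
from datetime import datetime, timedelta

# Assumed first-invoice time for customer 17850 (inferred from the table,
# not given in the tutorial).
first_invoice = datetime(2010, 12, 1, 8, 26)

# Same fixed offset the tutorial applies with pd.Timedelta("365 days").
cutoff = first_invoice + timedelta(days=365)
print(cutoff)  # 2011-12-01 08:26:00

# Hypothetical date: the 365-day window ends one day before the calendar
# anniversary because 2012 is a leap year.
leap_case = datetime(2011, 3, 1) + timedelta(days=365)
print(leap_case.date())  # 2012-02-29
```

For this dataset the distinction rarely matters, but it is worth keeping in mind if the cutoff is meant to be an exact anniversary.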

If you want to save intermediate computations as CSVs, pass the path of the directory where they should be saved. For example, if you pass a directory called “ft_temp”, CSV files will be written to that directory, each named according to the timestamp it represents.

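import os
save_progress = os.path.join(os.getcwd(), 'ft_temp')
# Create the output directory if it does not already exist.
if not os.path.exists(save_progress):
    os.makedirs(save_progress)
fm_save = ft.calculate_feature_matrix([region], entityset=es,
                                      cutoff_time=cutoff_times,
                                      save_progress=save_progress)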

As seen below, there are now files in the directory, named by timestamp.

% ls ft_temp/
ft_2011_12_01_03-08-00-000000.csv  ft_2011_12_02_05-03-00-000000.csv
ft_2011_12_01_09-00-00-000000.csv  ft_2011_12_02_05-19-00-000000.csv
ft_2011_12_01_12-43-00-000000.csv  ft_2011_12_02_12-07-00-000000.csv
ft_2011_12_01_12-51-00-000000.csv  ft_2011_12_02_12-18-00-000000.csv
ft_2011_12_02_03-19-00-000000.csv  ft_2011_12_03_12-57-00-000000.csv
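
Because the timestamp is encoded in each file name, the saved files can be grouped without opening them. The following is a minimal sketch, assuming the naming pattern is `ft_<year>_<month>_<day>_<hour>-<minute>-<second>-<microsecond>.csv`, which is inferred from the listing above rather than documented here. The helper `checkpoint_date` is hypothetical and not part of featuretools. Only the date is extracted, since the listing does not say whether the hour field uses a 12-hour or 24-hour clock.

```python
import re
from collections import defaultdict

# A few file names copied from the directory listing above.
names = [
    "ft_2011_12_01_03-08-00-000000.csv",
    "ft_2011_12_02_05-03-00-000000.csv",
    "ft_2011_12_01_09-00-00-000000.csv",
    "ft_2011_12_03_12-57-00-000000.csv",
]

# Assumed file name pattern (inferred from the listing, not from the library docs).
PATTERN = re.compile(
    r"^ft_(\d{4})_(\d{2})_(\d{2})_(\d{2})-(\d{2})-(\d{2})-(\d{6})\.csv$"
)


def checkpoint_date(name):
    """Return (year, month, day) from a file name, or None if it does not match."""
    match = PATTERN.match(name)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups()[:3])
    return (year, month, day)


# Group file names by the date encoded in their names.
by_date = defaultdict(list)
for name in names:
    date = checkpoint_date(name)
    if date is not None:
        by_date[date].append(name)

for date in sorted(by_date):
    print(date, len(by_date[date]))
```

In practice the list of names could come from `os.listdir(save_progress)` instead of being written out by hand.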