Changelog

v0.12.0 Oct 31, 2019

Enhancements
- Added First primitive (GH#770)
- Added Entropy aggregation primitive (GH#779)
- Allow custom naming for multi-output primitives (GH#780)
Fixes
- Prevents user from removing base entity time index using additional_variables (GH#768)
- Fixes error when a multioutput primitive was supplied to dfs as a groupby trans primitive (GH#786)
Changes
- Drop Python 2 support (GH#759)
- Add unit parameter to AvgTimeBetween (GH#771)
- Require Pandas 0.24.1 or higher (GH#787)
Documentation Changes
- Update featuretools slack link (GH#765)
- Set up repo to use Read the Docs (GH#776)
- Add First primitive to API reference docs (GH#782)
Testing Changes
- CircleCI fixes (GH#774)
- Disable PIP progress bars (GH#775)

Thanks to the following people for contributing to this release: @ablacke-ayx, @BoopBoopBeepBoop, @jeffzi, @kmax12, @rwedge, @thehomebrewnerd, @twdobson

v0.11.0 Sep 30, 2019

Warning

The next non-bugfix release of Featuretools will not support Python 2

Enhancements

Improve how files are copied and written (GH#721)

Add number of rows to graph in entityset.plot (GH#727)

Added support for pandas DateOffsets in DFS and Timedelta (GH#732)

Enable feature-specific top_n value using a dictionary in encode_features (GH#735)

Added progress_callback parameter to dfs() and calculate_feature_matrix() (GH#739, GH#745)

Enable specifying primitives on a per column or per entity basis (GH#748)

Fixes

Fixed entity set deserialization (GH#720)

Added error message when DateTimeIndex is a variable but not set as the time_index (GH#723)

Fixed CumCount and other group-by transform primitives that take ID as input (GH#733, GH#754)

Fix progress bar undercounting (GH#743)

Updated training_window error assertion to only check against observations (GH#728)

Don’t delete the whole destination folder while saving entityset (GH#717)

Changes

Raise warning and not error on schema version mismatch (GH#718)

Change feature calculation to return in order of instance ids provided (GH#676)

Removed time remaining from displayed progress bar in dfs() and calculate_feature_matrix() (GH#739)

Raise warning in normalize_entity() when time_index of base_entity has an invalid type (GH#749)

Remove toolz as a direct dependency (GH#755)

Allow boolean variable types to be used in the Multiply primitive (GH#756)

Documentation Changes

Updated URL for Compose (GH#716)

Testing Changes

Update dependencies (GH#738, GH#741, GH#747)

Thanks to the following people for contributing to this release: @angela97lin, @chidauri, @christopherbunn, @frances-h, @jeff-hernandez, @kmax12, @MarcoGorelli, @rwedge, @thehomebrewnerd

Breaking Changes

Feature calculations will return in the order of instance ids provided instead of the order of time points instances are calculated at.
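
As a rough illustration of this ordering change, the sketch below contrasts the two behaviors in plain Python. It does not call Featuretools; the `cutoff_times` data and both helper functions are hypothetical and only model the ordering described above.

```python
# Illustration only: plain Python, not Featuretools internals.
from datetime import datetime

# Hypothetical (instance id, cutoff time) pairs, in the order the caller provided them.
cutoff_times = [
    (3, datetime(2019, 3, 1)),
    (1, datetime(2019, 1, 1)),
    (2, datetime(2019, 2, 1)),
]


def order_by_time(rows):
    """Model of the earlier ordering: rows follow the time points they are calculated at."""
    return [instance_id for instance_id, _ in sorted(rows, key=lambda row: row[1])]


def order_as_provided(rows):
    """Model of the new ordering: rows follow the instance ids as provided."""
    return [instance_id for instance_id, _ in rows]


old_order = order_by_time(cutoff_times)
new_order = order_as_provided(cutoff_times)
```

Code that relied on the time-sorted order would need to sort the returned feature matrix explicitly after this release.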

v0.10.1 Aug 25, 2019

Fixes
- Fix serialized LatLong data being loaded as strings (GH#712)
Documentation Changes
- Fixed FAQ cell output (GH#710)

Thanks to the following people for contributing to this release: @gsheni, @rwedge

v0.10.0 Aug 19, 2019

Warning

The next non-bugfix release of Featuretools will not support Python 2

Enhancements

Give more frequent progress bar updates and update chunk size behavior (GH#631, GH#696)

Added drop_first as param in encode_features (GH#647)

Added support for stacking multi-output primitives (GH#679)

Generate transform features of direct features (GH#623)

Added serializing and deserializing from S3 and deserializing from URLs (GH#685)

Added nlp_primitives as an add-on library (GH#704)

Added AutoNormalize to Featuretools plugins (GH#699)

Added functionality for relative units (month/year) in Timedelta (GH#692)

Added categorical-encoding as an add-on library (GH#700)

Fixes

Fix performance regression in DFS (GH#637)

Fix deserialization of feature relationship path (GH#665)

Set index after adding ancestor relationship variables (GH#668)

Fix user-supplied variable_types modification in Entity init (GH#675)

Don’t calculate dependencies of unnecessary features (GH#667)

Prevent normalize entity’s new entity having same index as base entity (GH#681)

Update variable type inference to better check for string values (GH#683)

Changes

Moved dask, distributed imports (GH#634)

Documentation Changes

Miscellaneous changes (GH#641, GH#658)

Modified doc_string of top_n in encoding (GH#648)

Hyperlinked ComposeML (GH#653)

Added FAQ (GH#620, GH#677)

Fixed FAQ question with multiple question marks (GH#673)

Testing Changes

Add master, and release tests for premium primitives (GH#660, GH#669)

Miscellaneous changes (GH#672, GH#674)

Thanks to the following people for contributing to this release: @alexjwang, @allisonportis, @ayushpatidar, @CJStadler, @ctduffy, @gsheni, @jeff-hernandez, @jeremyliweishih, @kmax12, @rwedge, @zhxt95

v0.9.1 July 3, 2019

Enhancements
- Speedup groupby transform calculations (GH#609)
- Generate features along all paths when there are multiple paths between entities (GH#600, GH#608)
Fixes
- Select columns of dataframe using a list (GH#615)
- Change type of features calculated on Index features to Categorical (GH#602)
- Filter dataframes through forward relationships (GH#625)
- Specify Dask version in requirements for python 2 (GH#627)
- Keep dataframe sorted by time during feature calculation (GH#626)
- Fix bug in encode_features that created duplicate columns of features with multiple outputs (GH#622)
Changes
- Remove unused variance_selection.py file (GH#613)
- Remove Timedelta data param (GH#619)
- Remove DaysSince primitive (GH#628)
Documentation Changes
- Add installation instructions for add-on libraries (GH#617)
- Clarification of Multi Output Feature Creation (GH#638)
- Miscellaneous changes (GH#632, GH#639)
Testing Changes
- Miscellaneous changes (GH#595, GH#612)

Thanks to the following people for contributing to this release: @CJStadler, @kmax12, @rwedge, @gsheni, @kkleidal, @ctduffy

v0.9.0 June 19, 2019

Enhancements
- Add unit parameter to timesince primitives (GH#558)
- Add ability to install optional add on libraries (GH#551)
- Load and save features from open files and strings (GH#566)
- Support custom variable types (GH#571)
- Support entitysets which have multiple paths between two entities (GH#572, GH#544)
- Added show_info function, more output information added to CLI featuretools info (GH#525)
Fixes
- Normalize_entity specifies error when ‘make_time_index’ is an invalid string (GH#550)
- Schema version added for entityset serialization (GH#586)
- Renamed features have names correctly serialized (GH#585)
- Improved error message for index/time_index being the same column in normalize_entity and entity_from_dataframe (GH#583)
- Removed all mentions of allow_where (GH#587, GH#588)
- Removed unused variable in normalize entity (GH#589)
- Change time since return type to numeric (GH#606)
Changes
- Refactor get_pandas_data_slice to take single entity (GH#547)
- Updates TimeSincePrevious and Diff Primitives (GH#561)
- Remove unnecessary time_last variable (GH#546)
Documentation Changes
- Add Featuretools Enterprise to documentation (GH#563)
- Miscellaneous changes (GH#552, GH#573, GH#577, GH#599)
Testing Changes
- Miscellaneous changes (GH#559, GH#569, GH#570, GH#574, GH#584, GH#590)

Thanks to the following people for contributing to this release: @alexjwang, @allisonportis, @CJStadler, @ctduffy, @gsheni, @kmax12, @rwedge

v0.8.0 May 17, 2019

Rename NUnique to NumUnique (GH#510)
Serialize features as JSON (GH#532)
Drop all variables at once in normalize_entity (GH#533)
Remove unnecessary sorting from normalize_entity (GH#535)
Features cache their names (GH#536)
Only calculate features for instances before cutoff (GH#523)
Remove all relative imports (GH#530)
Added FullName Variable Type (GH#506)
Add error message when target entity does not exist (GH#520)
New demo links (GH#542)
Remove duplicate features check in DFS (GH#538)
featuretools_primitives entry point expects list of primitive classes (GH#529)
Update ALL_VARIABLE_TYPES list (GH#526)
More Informative N Jobs Prints and Warnings (GH#511)
Update sklearn version requirements (GH#541)
Update Makefile (GH#519)
Remove unused parameter in Entity._handle_time (GH#524)
Remove build_ext code from setup.py (GH#513)
Documentation updates (GH#512, GH#514, GH#515, GH#521, GH#522, GH#527, GH#545)
Testing updates (GH#509, GH#516, GH#517, GH#539)

Thanks to the following people for contributing to this release: @bphi, @CharlesBradshaw, @CJStadler, @glentennis, @gsheni, @kmax12, @rwedge

Breaking Changes

NUnique has been renamed to NumUnique.

Previous behavior

from featuretools.primitives import NUnique

New behavior

from featuretools.primitives import NumUnique

v0.7.1 Apr 24, 2019

Automatically generate feature name for controllable primitives (GH#481)
Primitive docstring updates (GH#489, GH#492, GH#494, GH#495)
Change primitive functions that returned strings to return functions (GH#499)
CLI customizable via entrypoints (GH#493)
Improve calculation of aggregation features on grandchildren (GH#479)
Refactor entrypoints to use decorator (GH#483)
Include doctests in testing suite (GH#491)
Documentation updates (GH#490)
Update how standard primitives are imported internally (GH#482)

Thanks to the following people for contributing to this release: @bukosabino, @CharlesBradshaw, @glentennis, @gsheni, @jeff-hernandez, @kmax12, @minkvsky, @rwedge, @thehomebrewnerd

v0.7.0 Mar 29, 2019

Improve Entity Set Serialization (GH#361)
Support calling a primitive instance’s function directly (GH#461, GH#468)
Support other libraries extending featuretools functionality via entrypoints (GH#452)
Remove featuretools install command (GH#475)
Add GroupByTransformFeature (GH#455, GH#472, GH#476)
Update Haversine Primitive (GH#435, GH#462)
Add commutative argument to SubtractNumeric and DivideNumeric primitives (GH#457)
Add FilePath variable_type (GH#470)
Add PhoneNumber, DateOfBirth, URL variable types (GH#447)
Generalize infer_variable_type, convert_variable_data and convert_all_variable_data methods (GH#423)
Documentation updates (GH#438, GH#446, GH#458, GH#469)
Testing updates (GH#440, GH#444, GH#445, GH#459)

Thanks to the following people for contributing to this release: @bukosabino, @CharlesBradshaw, @ColCarroll, @glentennis, @grayskripko, @gsheni, @jeff-hernandez, @jrkinley, @kmax12, @RogerTangos, @rwedge

Breaking Changes

ft.dfs now has a groupby_trans_primitives parameter that DFS uses to automatically construct features that group by an ID column and then apply a transform primitive to each group. This change applies to the following primitives: CumSum, CumCount, CumMean, CumMin, and CumMax.
Previous behavior
ft.dfs(entityset=es, target_entity='customers', trans_primitives=["cum_mean"])
New behavior
ft.dfs(entityset=es, target_entity='customers', groupby_trans_primitives=["cum_mean"])

Related to the above change, cumulative transform features are now defined using a new feature class, GroupByTransformFeature.

Previous behavior

ft.Feature([base_feature, groupby_feature], primitive=CumulativePrimitive)

New behavior

ft.Feature(base_feature, groupby=groupby_feature, primitive=CumulativePrimitive)
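
To make "group by an ID column and then apply a transform primitive" concrete, here is a minimal sketch of a grouped cumulative mean in plain Python. It is not the Featuretools implementation; the function name `cum_mean_by_group` and the sample data are assumptions made for illustration.

```python
# Illustration only: models what a group-by cumulative mean computes.
from collections import defaultdict


def cum_mean_by_group(values, group_ids):
    """Return the running mean of values, tracked separately for each group id."""
    totals = defaultdict(float)  # running sum per group
    counts = defaultdict(int)    # running row count per group
    result = []
    for value, group in zip(values, group_ids):
        totals[group] += value
        counts[group] += 1
        result.append(totals[group] / counts[group])
    return result


# Hypothetical transaction amounts and the customer each row belongs to.
amounts = [10.0, 20.0, 30.0, 40.0]
customer_ids = ["a", "b", "a", "b"]
grouped = cum_mean_by_group(amounts, customer_ids)
```

Each output row depends only on earlier rows with the same ID, which is why the grouping feature is passed separately from the base feature in the new API.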

v0.6.1 Feb 15, 2019

Cumulative primitives (GH#410)
Entity.query_by_values now preserves row order of underlying data (GH#428)
Implementing Country Code and Sub Region Codes as variable types (GH#430)
Added IPAddress and EmailAddress variable types (GH#426)
Install data and dependencies (GH#403)
Add TimeSinceFirst, fix TimeSinceLast (GH#388)
Allow user to pass in desired feature return types (GH#372)
Add new configuration object (GH#401)
Replace NUnique get_function (GH#434)
_calculate_identity_features now only returns the features asked for, instead of the entire entity (GH#429)
Primitive function name uniqueness (GH#424)
Update NumCharacters and NumWords primitives (GH#419)
Removed Variable.dtype (GH#416, GH#433)
Change to zipcode rep, str for pandas (GH#418)
Remove pandas version upper bound (GH#408)
Make S3 dependencies optional (GH#404)
Check that agg_primitives and trans_primitives are right primitive type (GH#397)
Mean primitive changes (GH#395)
Fix transform stacking on multi-output aggregation (GH#394)
Fix list_primitives (GH#391)
Handle graphviz dependency (GH#389, GH#396, GH#398)
Testing updates (GH#402, GH#417, GH#433)
Documentation updates (GH#400, GH#409, GH#415, GH#417, GH#420, GH#421, GH#422, GH#431)

Thanks to the following people for contributing to this release: @CharlesBradshaw, @csala, @floscha, @gsheni, @jxwolstenholme, @kmax12, @RogerTangos, @rwedge

v0.6.0 Jan 30, 2019

Primitive refactor (GH#364)
Mean ignore NaNs (GH#379)
Plotting entitysets (GH#382)
Add seed features later in DFS process (GH#357)
Multiple output column features (GH#376)
Add ZipCode Variable Type (GH#367)
Add primitive.get_filepath and example of primitive loading data from external files (GH#380)
Transform primitives take series as input (GH#385)
Update dependency requirements (GH#378, GH#383, GH#386)
Add modulo to override tests (GH#384)
Update documentation (GH#368, GH#377)
Update README.md (GH#366, GH#373)
Update CI tests (GH#359, GH#360, GH#375)

Thanks to the following people for contributing to this release: @floscha, @gsheni, @kmax12, @RogerTangos, @rwedge

v0.5.1 Dec 17, 2018

Add missing dependencies (GH#353)
Move comment to note in documentation (GH#352)

v0.5.0 Dec 17, 2018

Add specific error for duplicate additional/copy_variables in normalize_entity (GH#348)
Removed EntitySet._import_from_dataframe (GH#346)
Removed time_index_reduce parameter (GH#344)
Allow installation of additional primitives (GH#326)
Fix DatetimeIndex variable conversion (GH#342)
Update Sklearn DFS Transformer (GH#343)
Clean up entity creation logic (GH#336)
remove casting to list in transform feature calculation (GH#330)
Fix sklearn wrapper (GH#335)
Add readme to pypi
Update conda docs after move to conda-forge (GH#334)
Add wrapper for scikit-learn Pipelines (GH#323)
Remove parse_date_cols parameter from EntitySet._import_from_dataframe (GH#333)

Thanks to the following people for contributing to this release: @bukosabino, @georgewambold, @gsheni, @jeff-hernandez, @kmax12, and @rwedge.

v0.4.1 Nov 29, 2018

Resolve bug preventing using first column as index by default (GH#308)
Handle return type when creating features from Id variables (GH#318)
Make id an optional parameter of EntitySet constructor (GH#324)
Handle primitives with same function being applied to same column (GH#321)
Update requirements (GH#328)
Clean up DFS arguments (GH#319)
Clean up Pandas Backend (GH#302)
Update properties of cumulative transform primitives (GH#320)
Feature stability between versions documentation (GH#316)
Add download count to GitHub readme (GH#310)
Fixed #297 update tests to check error strings (GH#303)
Remove usage of fixtures in agg primitive tests (GH#325)

v0.4.0 Oct 31, 2018

Remove ft.utils.gen_utils.getsize and make pympler a test requirement (GH#299)
Update requirements.txt (GH#298)
Refactor EntitySet.find_path(…) (GH#295)
Clean up unused methods (GH#293)
Remove unused parents property of Entity (GH#283)
Removed relationships parameter (GH#284)
Improve time index validation (GH#285)
Encode features with “unknown” class in categorical (GH#287)
Allow where clauses on direct features in Deep Feature Synthesis (GH#279)
Change to fullargsspec (GH#288)
Parallel verbose fixes (GH#282)
Update tests for python 3.7 (GH#277)
Check duplicate rows cutoff times (GH#276)
Load retail demo data using compressed file (GH#271)

v0.3.1 Sept 28, 2018

Handling time rewrite (GH#245)
Update deep_feature_synthesis.py (GH#249)
Handling return type when creating features from DatetimeTimeIndex (GH#266)
Update retail.py (GH#259)
Improve Consistency of Transform Primitives (GH#236)
Update demo docstrings (GH#268)
Handle non-string column names (GH#255)
Clean up merging of aggregation primitives (GH#250)
Add tests for Entity methods (GH#262)
Handle no child data when calculating aggregation features with multiple arguments (GH#264)
Add is_string utils function (GH#260)
Update python versions to match docker container (GH#261)
Handle where clause when no child data (GH#258)
No longer cache demo csvs, remove config file (GH#257)
Avoid stacking “expanding” primitives (GH#238)
Use randomly generated names in retail csv (GH#233)
Update README.md (GH#243)

v0.3.0 Aug 27, 2018

Improve performance of all feature calculations (GH#224)
Update agg primitives to use more efficient functions (GH#215)
Optimize metadata calculation (GH#229)
More robust handling when no data at a cutoff time (GH#234)
Workaround categorical merge (GH#231)
Switch which CSV is associated with which variable (GH#228)
Remove unused kwargs from query_by_values, filter_and_sort (GH#225)
Remove convert_links_to_integers (GH#219)
Add conda install instructions (GH#223, GH#227)
Add example of using Dask to parallelize to docs (GH#221)

v0.2.2 Aug 20, 2018

Remove unnecessary check no related instances call and refactor (GH#209)
Improve memory usage through support for pandas categorical types (GH#196)
Bump minimum pandas version from 0.20.3 to 0.23.0 (GH#216)
Better parallel memory warnings (GH#208, GH#214)
Update demo datasets (GH#187, GH#201, GH#207)
Make primitive lookup case insensitive (GH#213)
Use capital name (GH#211)
Set class name for Min (GH#206)
Remove variable_types from normalize entity (GH#205)
Handle parquet serialization with last time index (GH#204)
Reset index of cutoff times in calculate feature matrix (GH#198)
Check argument types for .normalize_entity (GH#195)
Type checking ignore entities. (GH#193)

v0.2.1 July 2, 2018

Cpu count fix (GH#176)
Update flight (GH#175)
Move feature matrix calculation helper functions to separate file (GH#177)

v0.2.0 June 22, 2018

Multiprocessing (GH#170)
Handle unicode encoding in repr throughout Featuretools (GH#161)
Clean up EntitySet class (GH#145)
Add support for building and uploading conda package (GH#167)
Parquet serialization (GH#152)
Remove variable stats (GH#171)
Make sure index variable comes first (GH#168)
No last time index update on normalize (GH#169)
Remove list of times as an option for cutoff_time in calculate_feature_matrix (GH#165)
Config does error checking to see if it can write to disk (GH#162)

v0.1.21 May 30, 2018

Support Pandas 0.23.0 (GH#153, GH#154, GH#155, GH#159)
No EntitySet required in loading/saving features (GH#141)
Use s3 demo csv with better column names (GH#139)
more reasonable start parameter (GH#149)
add issue template (GH#133)
Improve tests (GH#136, GH#137, GH#144, GH#147)
Remove unused functions (GH#140, GH#143, GH#146)
Update documentation after recent changes / removals (GH#157)
Rename demo retail csv file (GH#148)
Add names for binary (GH#142)
EntitySet repr to use get_name rather than id (GH#134)
Ensure config dir is writable (GH#135)

v0.1.20 Apr 13, 2018

Primitives as strings in DFS parameters (GH#129)
Integer time index bugfixes (GH#128)
Add make_temporal_cutoffs utility function (GH#126)
Show all entities, switch shape display to row/col (GH#124)
Improved chunking when calculating feature matrices (GH#121)
Fix NaN handling in NumCharacters (GH#118)
modify ignore_variables docstring (GH#117)

v0.1.19 Mar 21, 2018

More descriptive DFS progress bar (GH#69)
Convert text variable to string before NumWords (GH#106)
EntitySet.concat() reindexes relationships (GH#96)
Keep non-feature columns when encoding feature matrix (GH#111)
Uses full entity update for dependencies of uses_full_entity features (GH#110)
Update column names in retail demo (GH#104)
Handle Transform features that need access to all values of entity (GH#91)

v0.1.18 Feb 27, 2018

fixes related instances bug (GH#97)
Adding non-feature columns to calculated feature matrix (GH#78)
Relax numpy version req (GH#82)
Remove entity_from_csv, tests, and lint (GH#71)

v0.1.17 Jan 18, 2018

LatLong type (GH#57)
Last time index fixes (GH#70)
Make median agg primitives ignore nans by default (GH#61)
Remove Python 3.4 support (GH#64)
Change normalize_entity to update secondary_time_index (GH#59)
Unpin requirements (GH#53)
associative -> commutative (GH#56)
Add Words and Chars primitives (GH#51)

v0.1.16 Dec 19, 2017

fix EntitySet.combine_variables and standardize encode_features (GH#47)
Python 3 compatibility (GH#16)

v0.1.15 Dec 18, 2017

Fix variable type in demo data (GH#37)
Custom primitive kwarg fix (GH#38)
Changed order and text of arguments in make_trans_primitive docstring (GH#42)

v0.1.14 November 20, 2017

Last time index (GH#33)
Update Scipy version to 1.0.0 (GH#31)

v0.1.13 November 1, 2017

Add MANIFEST.in (GH#26)

v0.1.11 October 31, 2017

Package linting (GH#7)
Custom primitive creation functions (GH#13)
Split requirements to separate files and pin to latest versions (GH#15)
Select low information features (GH#18)
Fix docs typos (GH#19)
Fixed Diff primitive for rare nan case (GH#21)
Added some missing docstrings (GH#23)
Trend fix (GH#22)
Remove as_dir=False option from EntitySet.to_pickle() (GH#20)
Entity Normalization Preserves Types of Copy & Additional Variables (GH#25)

v0.1.10 October 12, 2017

NumTrue primitive added and docstring of other primitives updated (GH#11)
fixed hash issue with same base features (GH#8)
Head fix (GH#9)
Fix training window (GH#10)
Add associative attribute to primitives (GH#3)
Add status badges, fix license in setup.py (GH#1)
fixed head printout and flight demo index (GH#2)

v0.1.9 September 8, 2017

Documentation improvements
New featuretools.demo.load_mock_customer function

v0.1.8 September 1, 2017

Bug fixes
Added Percentile transform primitive

v0.1.7 August 17, 2017

Performance improvements for approximate in calculate_feature_matrix and dfs
Added Week transform primitive

v0.1.6 July 26, 2017

Added load_features and save_features to persist and reload features

Added save_progress argument to calculate_feature_matrix

Added approximate parameter to calculate_feature_matrix and dfs

Added load_flight to ft.demo

v0.1.5 July 11, 2017

Windows support

v0.1.3 July 10, 2017

Renamed feature submodule to primitives

Renamed prediction_entity arguments to target_entity

Added training_window parameter to calculate_feature_matrix

v0.1.2 July 3rd, 2017

Initial release