Changelog

v0.12.0 Oct 31, 2019
  • Enhancements
    • Added First primitive (GH#770)

    • Added Entropy aggregation primitive (GH#779)

    • Allow custom naming for multi-output primitives (GH#780)

  • Fixes
    • Prevents user from removing base entity time index using additional_variables (GH#768)

    • Fixes error when a multi-output primitive was supplied to dfs as a groupby trans primitive (GH#786)

  • Changes
    • Drop Python 2 support (GH#759)

    • Add unit parameter to AvgTimeBetween (GH#771)

    • Require Pandas 0.24.1 or higher (GH#787)

  • Documentation Changes
    • Update featuretools slack link (GH#765)

    • Set up repo to use Read the Docs (GH#776)

    • Add First primitive to API reference docs (GH#782)

  • Testing Changes
    • CircleCI fixes (GH#774)

    • Disable PIP progress bars (GH#775)

Thanks to the following people for contributing to this release: @ablacke-ayx, @BoopBoopBeepBoop, @jeffzi, @kmax12, @rwedge, @thehomebrewnerd, @twdobson

v0.11.0 Sep 30, 2019

Warning

The next non-bugfix release of Featuretools will not support Python 2.

  • Enhancements
    • Improve how files are copied and written (GH#721)

    • Add number of rows to graph in entityset.plot (GH#727)

    • Added support for pandas DateOffsets in DFS and Timedelta (GH#732)

    • Enable feature-specific top_n value using a dictionary in encode_features (GH#735)

    • Added progress_callback parameter to dfs() and calculate_feature_matrix() (GH#739, GH#745)

    • Enable specifying primitives on a per column or per entity basis (GH#748)

  • Fixes
    • Fixed entity set deserialization (GH#720)

    • Added error message when DateTimeIndex is a variable but not set as the time_index (GH#723)

    • Fixed CumCount and other group-by transform primitives that take ID as input (GH#733, GH#754)

    • Fix progress bar undercounting (GH#743)

    • Updated training_window error assertion to only check against observations (GH#728)

    • Don’t delete the whole destination folder while saving entityset (GH#717)

  • Changes
    • Raise warning and not error on schema version mismatch (GH#718)

    • Change feature calculation to return in order of instance ids provided (GH#676)

    • Removed time remaining from displayed progress bar in dfs() and calculate_feature_matrix() (GH#739)

    • Raise warning in normalize_entity() when time_index of base_entity has an invalid type (GH#749)

    • Remove toolz as a direct dependency (GH#755)

    • Allow boolean variable types to be used in the Multiply primitive (GH#756)

  • Documentation Changes
    • Updated URL for Compose (GH#716)

  • Testing Changes

Thanks to the following people for contributing to this release: @angela97lin, @chidauri, @christopherbunn, @frances-h, @jeff-hernandez, @kmax12, @MarcoGorelli, @rwedge, @thehomebrewnerd

Breaking Changes

  • Feature calculations will return in the order of instance ids provided instead of the order of time points instances are calculated at.
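
The ordering change can be pictured with a small pure-Python sketch. It does not call Featuretools; the `requested` data and both helper functions are hypothetical and only model the row order described above.

```python
# Hypothetical (instance_id, cutoff_time) pairs, in the order the caller supplied them.
requested = [(3, "2019-01-03"), (1, "2019-01-01"), (2, "2019-01-02")]


def order_by_time(pairs):
    """Model of the earlier behavior: rows come back sorted by cutoff time."""
    return [instance_id for instance_id, _ in sorted(pairs, key=lambda p: p[1])]


def order_by_instance_ids(pairs):
    """Model of the new behavior: rows come back in the order the ids were provided."""
    return [instance_id for instance_id, _ in pairs]


old_order = order_by_time(requested)
new_order = order_by_instance_ids(requested)
print(old_order)  # [1, 2, 3]
print(new_order)  # [3, 1, 2]
```

Code that relied on time-sorted output may need to sort the result itself.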

v0.10.1 Aug 25, 2019
  • Fixes
    • Fix serialized LatLong data being loaded as strings (GH#712)

  • Documentation Changes
    • Fixed FAQ cell output (GH#710)

Thanks to the following people for contributing to this release: @gsheni, @rwedge

v0.10.0 Aug 19, 2019

Warning

The next non-bugfix release of Featuretools will not support Python 2.

  • Enhancements
    • Give more frequent progress bar updates and update chunk size behavior (GH#631, GH#696)

    • Added drop_first as param in encode_features (GH#647)

    • Added support for stacking multi-output primitives (GH#679)

    • Generate transform features of direct features (GH#623)

    • Added serializing and deserializing from S3 and deserializing from URLs (GH#685)

    • Added nlp_primitives as an add-on library (GH#704)

    • Added AutoNormalize to Featuretools plugins (GH#699)

    • Added functionality for relative units (month/year) in Timedelta (GH#692)

    • Added categorical-encoding as an add-on library (GH#700)

  • Fixes
    • Fix performance regression in DFS (GH#637)

    • Fix deserialization of feature relationship path (GH#665)

    • Set index after adding ancestor relationship variables (GH#668)

    • Fix user-supplied variable_types modification in Entity init (GH#675)

    • Don’t calculate dependencies of unnecessary features (GH#667)

    • Prevent normalize entity’s new entity having same index as base entity (GH#681)

    • Update variable type inference to better check for string values (GH#683)

  • Changes
    • Moved dask, distributed imports (GH#634)

  • Documentation Changes
  • Testing Changes

Thanks to the following people for contributing to this release: @alexjwang, @allisonportis, @ayushpatidar, @CJStadler, @ctduffy, @gsheni, @jeff-hernandez, @jeremyliweishih, @kmax12, @rwedge, @zhxt95

v0.9.1 July 3, 2019
  • Enhancements
    • Speedup groupby transform calculations (GH#609)

    • Generate features along all paths when there are multiple paths between entities (GH#600, GH#608)

  • Fixes
    • Select columns of dataframe using a list (GH#615)

    • Change type of features calculated on Index features to Categorical (GH#602)

    • Filter dataframes through forward relationships (GH#625)

    • Specify Dask version in requirements for Python 2 (GH#627)

    • Keep dataframe sorted by time during feature calculation (GH#626)

    • Fix bug in encode_features that created duplicate columns of features with multiple outputs (GH#622)

  • Changes
    • Remove unused variance_selection.py file (GH#613)

    • Remove Timedelta data param (GH#619)

    • Remove DaysSince primitive (GH#628)

  • Documentation Changes
    • Add installation instructions for add-on libraries (GH#617)

    • Clarification of Multi Output Feature Creation (GH#638)

    • Miscellaneous changes (GH#632, GH#639)

  • Testing Changes

Thanks to the following people for contributing to this release: @CJStadler, @kmax12, @rwedge, @gsheni, @kkleidal, @ctduffy

v0.9.0 June 19, 2019
  • Enhancements
    • Add unit parameter to timesince primitives (GH#558)

    • Add ability to install optional add on libraries (GH#551)

    • Load and save features from open files and strings (GH#566)

    • Support custom variable types (GH#571)

    • Support entitysets which have multiple paths between two entities (GH#572, GH#544)

    • Added show_info function, more output information added to CLI featuretools info (GH#525)

  • Fixes
    • Normalize_entity specifies error when ‘make_time_index’ is an invalid string (GH#550)

    • Schema version added for entityset serialization (GH#586)

    • Renamed features have names correctly serialized (GH#585)

    • Improved error message for index/time_index being the same column in normalize_entity and entity_from_dataframe (GH#583)

    • Removed all mentions of allow_where (GH#587, GH#588)

    • Removed unused variable in normalize entity (GH#589)

    • Change time since return type to numeric (GH#606)

  • Changes
    • Refactor get_pandas_data_slice to take single entity (GH#547)

    • Updates TimeSincePrevious and Diff Primitives (GH#561)

    • Remove unnecessary time_last variable (GH#546)

  • Documentation Changes
  • Testing Changes

Thanks to the following people for contributing to this release: @alexjwang, @allisonportis, @CJStadler, @ctduffy, @gsheni, @kmax12, @rwedge

v0.8.0 May 17, 2019
  • Rename NUnique to NumUnique (GH#510)

  • Serialize features as JSON (GH#532)

  • Drop all variables at once in normalize_entity (GH#533)

  • Remove unnecessary sorting from normalize_entity (GH#535)

  • Features cache their names (GH#536)

  • Only calculate features for instances before cutoff (GH#523)

  • Remove all relative imports (GH#530)

  • Added FullName Variable Type (GH#506)

  • Add error message when target entity does not exist (GH#520)

  • New demo links (GH#542)

  • Remove duplicate features check in DFS (GH#538)

  • featuretools_primitives entry point expects list of primitive classes (GH#529)

  • Update ALL_VARIABLE_TYPES list (GH#526)

  • More Informative N Jobs Prints and Warnings (GH#511)

  • Update sklearn version requirements (GH#541)

  • Update Makefile (GH#519)

  • Remove unused parameter in Entity._handle_time (GH#524)

  • Remove build_ext code from setup.py (GH#513)

  • Documentation updates (GH#512, GH#514, GH#515, GH#521, GH#522, GH#527, GH#545)

  • Testing updates (GH#509, GH#516, GH#517, GH#539)

Thanks to the following people for contributing to this release: @bphi, @CharlesBradshaw, @CJStadler, @glentennis, @gsheni, @kmax12, @rwedge

Breaking Changes

  • NUnique has been renamed to NumUnique.

    Previous behavior

    from featuretools.primitives import NUnique
    

    New behavior

    from featuretools.primitives import NumUnique
    
v0.7.1 Apr 24, 2019
  • Automatically generate feature name for controllable primitives (GH#481)

  • Primitive docstring updates (GH#489, GH#492, GH#494, GH#495)

  • Change primitive functions that returned strings to return functions (GH#499)

  • CLI customizable via entrypoints (GH#493)

  • Improve calculation of aggregation features on grandchildren (GH#479)

  • Refactor entrypoints to use decorator (GH#483)

  • Include doctests in testing suite (GH#491)

  • Documentation updates (GH#490)

  • Update how standard primitives are imported internally (GH#482)

Thanks to the following people for contributing to this release: @bukosabino, @CharlesBradshaw, @glentennis, @gsheni, @jeff-hernandez, @kmax12, @minkvsky, @rwedge, @thehomebrewnerd

v0.7.0 Mar 29, 2019
  • Improve Entity Set Serialization (GH#361)

  • Support calling a primitive instance’s function directly (GH#461, GH#468)

  • Support other libraries extending featuretools functionality via entrypoints (GH#452)

  • Remove featuretools install command (GH#475)

  • Add GroupByTransformFeature (GH#455, GH#472, GH#476)

  • Update Haversine Primitive (GH#435, GH#462)

  • Add commutative argument to SubtractNumeric and DivideNumeric primitives (GH#457)

  • Add FilePath variable_type (GH#470)

  • Add PhoneNumber, DateOfBirth, URL variable types (GH#447)

  • Generalize infer_variable_type, convert_variable_data and convert_all_variable_data methods (GH#423)

  • Documentation updates (GH#438, GH#446, GH#458, GH#469)

  • Testing updates (GH#440, GH#444, GH#445, GH#459)

Thanks to the following people for contributing to this release: @bukosabino, @CharlesBradshaw, @ColCarroll, @glentennis, @grayskripko, @gsheni, @jeff-hernandez, @jrkinley, @kmax12, @RogerTangos, @rwedge

Breaking Changes

  • ft.dfs now has a groupby_trans_primitives parameter that DFS uses to automatically construct features that group by an ID column and then apply a transform primitive to each group. This change applies to the following primitives: CumSum, CumCount, CumMean, CumMin, and CumMax.

    Previous behavior

    ft.dfs(entityset=es,
           target_entity='customers',
           trans_primitives=["cum_mean"])
    

    New behavior

    ft.dfs(entityset=es,
           target_entity='customers',
           groupby_trans_primitives=["cum_mean"])
    
  • Related to the above change, cumulative transform features are now defined using a new feature class, GroupByTransformFeature.

    Previous behavior

    ft.Feature([base_feature, groupby_feature], primitive=CumulativePrimitive)
    

    New behavior

    ft.Feature(base_feature, groupby=groupby_feature, primitive=CumulativePrimitive)
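
As a rough picture of what "group by an ID column and then apply a transform primitive" means for a cumulative mean, here is a minimal pure-Python sketch. It is not the Featuretools implementation; `groupby_cum_mean`, `amounts`, and `customer_ids` are hypothetical names, and the real CumMean primitive may treat missing values differently.

```python
from collections import defaultdict


def groupby_cum_mean(values, group_ids):
    """Running mean of `values`, computed separately within each group id."""
    totals = defaultdict(float)  # running sum per group
    counts = defaultdict(int)    # number of rows seen per group
    result = []
    for value, group in zip(values, group_ids):
        totals[group] += value
        counts[group] += 1
        result.append(totals[group] / counts[group])
    return result


amounts = [10.0, 20.0, 30.0, 40.0]
customer_ids = ["a", "b", "a", "b"]
cum_means = groupby_cum_mean(amounts, customer_ids)
print(cum_means)  # [10.0, 20.0, 20.0, 30.0]
```

Each output row depends only on earlier rows that share its group id, which is why the grouping feature is passed separately from the base feature.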
    
v0.6.1 Feb 15, 2019
  • Cumulative primitives (GH#410)

  • Entity.query_by_values now preserves row order of underlying data (GH#428)

  • Implementing Country Code and Sub Region Codes as variable types (GH#430)

  • Added IPAddress and EmailAddress variable types (GH#426)

  • Install data and dependencies (GH#403)

  • Add TimeSinceFirst, fix TimeSinceLast (GH#388)

  • Allow user to pass in desired feature return types (GH#372)

  • Add new configuration object (GH#401)

  • Replace NUnique get_function (GH#434)

  • _calculate_identity_features now only returns the features asked for, instead of the entire entity (GH#429)

  • Primitive function name uniqueness (GH#424)

  • Update NumCharacters and NumWords primitives (GH#419)

  • Removed Variable.dtype (GH#416, GH#433)

  • Change to zipcode rep, str for pandas (GH#418)

  • Remove pandas version upper bound (GH#408)

  • Make S3 dependencies optional (GH#404)

  • Check that agg_primitives and trans_primitives are right primitive type (GH#397)

  • Mean primitive changes (GH#395)

  • Fix transform stacking on multi-output aggregation (GH#394)

  • Fix list_primitives (GH#391)

  • Handle graphviz dependency (GH#389, GH#396, GH#398)

  • Testing updates (GH#402, GH#417, GH#433)

  • Documentation updates (GH#400, GH#409, GH#415, GH#417, GH#420, GH#421, GH#422, GH#431)

Thanks to the following people for contributing to this release: @CharlesBradshaw, @csala, @floscha, @gsheni, @jxwolstenholme, @kmax12, @RogerTangos, @rwedge

v0.6.0 Jan 30, 2019

Thanks to the following people for contributing to this release: @floscha, @gsheni, @kmax12, @RogerTangos, @rwedge

v0.5.1 Dec 17, 2018
  • Add missing dependencies (GH#353)

  • Move comment to note in documentation (GH#352)

v0.5.0 Dec 17, 2018
  • Add specific error for duplicate additional/copy_variables in normalize_entity (GH#348)

  • Removed EntitySet._import_from_dataframe (GH#346)

  • Removed time_index_reduce parameter (GH#344)

  • Allow installation of additional primitives (GH#326)

  • Fix DatetimeIndex variable conversion (GH#342)

  • Update Sklearn DFS Transformer (GH#343)

  • Clean up entity creation logic (GH#336)

  • Remove casting to list in transform feature calculation (GH#330)

  • Fix sklearn wrapper (GH#335)

  • Add readme to pypi

  • Update conda docs after move to conda-forge (GH#334)

  • Add wrapper for scikit-learn Pipelines (GH#323)

  • Remove parse_date_cols parameter from EntitySet._import_from_dataframe (GH#333)

Thanks to the following people for contributing to this release: @bukosabino, @georgewambold, @gsheni, @jeff-hernandez, @kmax12, and @rwedge.

v0.4.1 Nov 29, 2018
  • Resolve bug preventing using first column as index by default (GH#308)

  • Handle return type when creating features from Id variables (GH#318)

  • Make id an optional parameter of EntitySet constructor (GH#324)

  • Handle primitives with same function being applied to same column (GH#321)

  • Update requirements (GH#328)

  • Clean up DFS arguments (GH#319)

  • Clean up Pandas Backend (GH#302)

  • Update properties of cumulative transform primitives (GH#320)

  • Feature stability between versions documentation (GH#316)

  • Add download count to GitHub readme (GH#310)

  • Fixed #297: update tests to check error strings (GH#303)

  • Remove usage of fixtures in agg primitive tests (GH#325)

v0.4.0 Oct 31, 2018
  • Remove ft.utils.gen_utils.getsize and make pympler a test requirement (GH#299)

  • Update requirements.txt (GH#298)

  • Refactor EntitySet.find_path(…) (GH#295)

  • Clean up unused methods (GH#293)

  • Remove unused parents property of Entity (GH#283)

  • Removed relationships parameter (GH#284)

  • Improve time index validation (GH#285)

  • Encode features with “unknown” class in categorical (GH#287)

  • Allow where clauses on direct features in Deep Feature Synthesis (GH#279)

  • Change to fullargsspec (GH#288)

  • Parallel verbose fixes (GH#282)

  • Update tests for python 3.7 (GH#277)

  • Check duplicate rows cutoff times (GH#276)

  • Load retail demo data using compressed file (GH#271)

v0.3.1 Sept 28, 2018
  • Handling time rewrite (GH#245)

  • Update deep_feature_synthesis.py (GH#249)

  • Handling return type when creating features from DatetimeTimeIndex (GH#266)

  • Update retail.py (GH#259)

  • Improve Consistency of Transform Primitives (GH#236)

  • Update demo docstrings (GH#268)

  • Handle non-string column names (GH#255)

  • Clean up merging of aggregation primitives (GH#250)

  • Add tests for Entity methods (GH#262)

  • Handle no child data when calculating aggregation features with multiple arguments (GH#264)

  • Add is_string utils function (GH#260)

  • Update python versions to match docker container (GH#261)

  • Handle where clause when no child data (GH#258)

  • No longer cache demo csvs, remove config file (GH#257)

  • Avoid stacking “expanding” primitives (GH#238)

  • Use randomly generated names in retail csv (GH#233)

  • Update README.md (GH#243)

v0.3.0 Aug 27, 2018
  • Improve performance of all feature calculations (GH#224)

  • Update agg primitives to use more efficient functions (GH#215)

  • Optimize metadata calculation (GH#229)

  • More robust handling when no data at a cutoff time (GH#234)

  • Workaround categorical merge (GH#231)

  • Switch which CSV is associated with which variable (GH#228)

  • Remove unused kwargs from query_by_values, filter_and_sort (GH#225)

  • Remove convert_links_to_integers (GH#219)

  • Add conda install instructions (GH#223, GH#227)

  • Add example of using Dask to parallelize to docs (GH#221)

v0.2.2 Aug 20, 2018
  • Remove unnecessary check no related instances call and refactor (GH#209)

  • Improve memory usage through support for pandas categorical types (GH#196)

  • Bump minimum pandas version from 0.20.3 to 0.23.0 (GH#216)

  • Better parallel memory warnings (GH#208, GH#214)

  • Update demo datasets (GH#187, GH#201, GH#207)

  • Make primitive lookup case insensitive (GH#213)

  • Use capital name (GH#211)

  • Set class name for Min (GH#206)

  • Remove variable_types from normalize entity (GH#205)

  • Handle parquet serialization with last time index (GH#204)

  • Reset index of cutoff times in calculate feature matrix (GH#198)

  • Check argument types for .normalize_entity (GH#195)

  • Type checking for ignore_entities (GH#193)

v0.2.1 July 2, 2018
  • CPU count fix (GH#176)

  • Update flight (GH#175)

  • Move feature matrix calculation helper functions to separate file (GH#177)

v0.2.0 June 22, 2018
  • Multiprocessing (GH#170)

  • Handle unicode encoding in repr throughout Featuretools (GH#161)

  • Clean up EntitySet class (GH#145)

  • Add support for building and uploading conda package (GH#167)

  • Parquet serialization (GH#152)

  • Remove variable stats (GH#171)

  • Make sure index variable comes first (GH#168)

  • No last time index update on normalize (GH#169)

  • Remove list of times as an option for cutoff_time in calculate_feature_matrix (GH#165)

  • Config does error checking to see if it can write to disk (GH#162)

v0.1.21 May 30, 2018
v0.1.20 Apr 13, 2018
  • Primitives as strings in DFS parameters (GH#129)

  • Integer time index bugfixes (GH#128)

  • Add make_temporal_cutoffs utility function (GH#126)

  • Show all entities, switch shape display to row/col (GH#124)

  • Improved chunking when calculating feature matrices (GH#121)

  • Fixed NaN handling in NumCharacters (GH#118)

  • Modify ignore_variables docstring (GH#117)

v0.1.19 Mar 21, 2018
  • More descriptive DFS progress bar (GH#69)

  • Convert text variable to string before NumWords (GH#106)

  • EntitySet.concat() reindexes relationships (GH#96)

  • Keep non-feature columns when encoding feature matrix (GH#111)

  • Uses full entity update for dependencies of uses_full_entity features (GH#110)

  • Update column names in retail demo (GH#104)

  • Handle Transform features that need access to all values of entity (GH#91)

v0.1.18 Feb 27, 2018
  • Fixes related instances bug (GH#97)

  • Adding non-feature columns to calculated feature matrix (GH#78)

  • Relax numpy version req (GH#82)

  • Remove entity_from_csv, tests, and lint (GH#71)

v0.1.17 Jan 18, 2018
  • LatLong type (GH#57)

  • Last time index fixes (GH#70)

  • Make median agg primitives ignore nans by default (GH#61)

  • Remove Python 3.4 support (GH#64)

  • Change normalize_entity to update secondary_time_index (GH#59)

  • Unpin requirements (GH#53)

  • associative -> commutative (GH#56)

  • Add Words and Chars primitives (GH#51)

v0.1.16 Dec 19, 2017
  • Fix EntitySet.combine_variables and standardize encode_features (GH#47)

  • Python 3 compatibility (GH#16)

v0.1.15 Dec 18, 2017
  • Fix variable type in demo data (GH#37)

  • Custom primitive kwarg fix (GH#38)

  • Changed order and text of arguments in make_trans_primitive docstring (GH#42)

v0.1.14 November 20, 2017
  • Last time index (GH#33)

  • Update Scipy version to 1.0.0 (GH#31)

v0.1.13 November 1, 2017
  • Add MANIFEST.in (GH#26)

v0.1.11 October 31, 2017
  • Package linting (GH#7)

  • Custom primitive creation functions (GH#13)

  • Split requirements to separate files and pin to latest versions (GH#15)

  • Select low information features (GH#18)

  • Fix docs typos (GH#19)

  • Fixed Diff primitive for rare nan case (GH#21)

  • Added some missing docstrings (GH#23)

  • Trend fix (GH#22)

  • Remove as_dir=False option from EntitySet.to_pickle() (GH#20)

  • Entity Normalization Preserves Types of Copy & Additional Variables (GH#25)

v0.1.10 October 12, 2017
  • NumTrue primitive added and docstring of other primitives updated (GH#11)

  • Fixed hash issue with same base features (GH#8)

  • Head fix (GH#9)

  • Fix training window (GH#10)

  • Add associative attribute to primitives (GH#3)

  • Add status badges, fix license in setup.py (GH#1)

  • Fixed head printout and flight demo index (GH#2)

v0.1.9 September 8, 2017
  • Documentation improvements

  • New featuretools.demo.load_mock_customer function

v0.1.8 September 1, 2017
  • Bug fixes

  • Added Percentile transform primitive

v0.1.7 August 17, 2017
  • Performance improvements for approximate in calculate_feature_matrix and dfs

  • Added Week transform primitive

v0.1.6 July 26, 2017

  • Added load_features and save_features to persist and reload features

  • Added save_progress argument to calculate_feature_matrix

  • Added approximate parameter to calculate_feature_matrix and dfs

  • Added load_flight to ft.demo

v0.1.5 July 11, 2017

  • Windows support

v0.1.3 July 10, 2017

  • Renamed feature submodule to primitives

  • Renamed prediction_entity arguments to target_entity

  • Added training_window parameter to calculate_feature_matrix

v0.1.2 July 3rd, 2017

  • Initial release