Changelog
- v0.12.0 Oct 31, 2019
Thanks to the following people for contributing to this release: @ablacke-ayx, @BoopBoopBeepBoop, @jeffzi, @kmax12, @rwedge, @thehomebrewnerd, @twdobson
- v0.11.0 Sep 30, 2019
Warning
The next non-bugfix release of Featuretools will not support Python 2
- Enhancements
Improve how files are copied and written (GH#721)
Add number of rows to graph in entityset.plot (GH#727)
Added support for pandas DateOffsets in DFS and Timedelta (GH#732)
Enable feature-specific top_n value using a dictionary in encode_features (GH#735)
Added progress_callback parameter to dfs() and calculate_feature_matrix() (GH#739, GH#745)
Enable specifying primitives on a per column or per entity basis (GH#748)
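The per-feature top_n option (GH#735) can be pictured with a minimal, standard-library sketch: each feature keeps its own number of most-frequent categories, and unlisted features fall back to a default. The helper `top_categories`, its `default_top_n` argument, and the sample data are hypothetical and do not call Featuretools; the real encode_features may differ in details such as tie-breaking.

```python
from collections import Counter


def top_categories(columns, top_n, default_top_n=10):
    """Return the most frequent values to encode for each column.

    `top_n` may be an int (applied to every column) or a dict mapping
    column name -> int; `default_top_n` is used for unlisted columns.
    This is a hypothetical helper, not part of Featuretools.
    """
    selected = {}
    for name, values in columns.items():
        if isinstance(top_n, dict):
            n = top_n.get(name, default_top_n)
        else:
            n = top_n
        counts = Counter(values)
        selected[name] = [value for value, _ in counts.most_common(n)]
    return selected


# Illustrative data only.
columns = {
    "country": ["US", "US", "FR", "DE", "FR", "US"],
    "device": ["phone", "tablet", "phone", "desktop", "phone", "tablet"],
}
chosen = top_categories(columns, top_n={"country": 1, "device": 2})
```

Here `country` keeps one category and `device` keeps two, which is the kind of per-feature control a dictionary makes possible.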
- Fixes
Fixed entity set deserialization (GH#720)
Added error message when DateTimeIndex is a variable but not set as the time_index (GH#723)
Fixed CumCount and other group-by transform primitives that take ID as input (GH#733, GH#754)
Fix progress bar undercounting (GH#743)
Updated training_window error assertion to only check against observations (GH#728)
Don’t delete the whole destination folder while saving entityset (GH#717)
- Changes
Raise warning and not error on schema version mismatch (GH#718)
Change feature calculation to return in order of instance ids provided (GH#676)
Removed time remaining from displayed progress bar in dfs() and calculate_feature_matrix() (GH#739)
Raise warning in normalize_entity() when time_index of base_entity has an invalid type (GH#749)
Remove toolz as a direct dependency (GH#755)
Allow boolean variable types to be used in the Multiply primitive (GH#756)
- Documentation Changes
Updated URL for Compose (GH#716)
Thanks to the following people for contributing to this release: @angela97lin, @chidauri, @christopherbunn, @frances-h, @jeff-hernandez, @kmax12, @MarcoGorelli, @rwedge, @thehomebrewnerd
Breaking Changes
Feature calculations now return rows in the order of the instance ids provided, instead of the order of the time points at which instances are calculated.
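A small standard-library sketch of what this ordering change means in practice. The function name and data are hypothetical, and the sketch assumes each instance id appears only once. It only shows results that were computed in time order being handed back in the order the ids were requested, not how Featuretools does this internally.

```python
def reorder_to_requested(rows_in_time_order, requested_ids):
    """Return (instance_id, value) rows in the order of `requested_ids`.

    Hypothetical helper: assumes every requested id appears exactly once
    in `rows_in_time_order`.
    """
    by_id = dict(rows_in_time_order)
    return [(instance_id, by_id[instance_id]) for instance_id in requested_ids]


requested_ids = [3, 1, 2]
# Rows as they might come out of a calculation that processes time points in order.
rows_in_time_order = [(1, "a"), (2, "b"), (3, "c")]
result = reorder_to_requested(rows_in_time_order, requested_ids)
```

Code that relied on the old time-based row order may need an explicit sort after upgrading.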
- v0.10.1 Aug 25, 2019
- Fixes
Fix serialized LatLong data being loaded as strings (GH#712)
- Documentation Changes
Fixed FAQ cell output (GH#710)
Thanks to the following people for contributing to this release: @gsheni, @rwedge
- v0.10.0 Aug 19, 2019
Warning
The next non-bugfix release of Featuretools will not support Python 2
- Enhancements
Give more frequent progress bar updates and update chunk size behavior (GH#631, GH#696)
Added drop_first as param in encode_features (GH#647)
Added support for stacking multi-output primitives (GH#679)
Generate transform features of direct features (GH#623)
Added serializing and deserializing from S3 and deserializing from URLs (GH#685)
Added nlp_primitives as an add-on library (GH#704)
Added AutoNormalize to Featuretools plugins (GH#699)
Added functionality for relative units (month/year) in Timedelta (GH#692)
Added categorical-encoding as an add-on library (GH#700)
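Relative units such as months (GH#692) differ from fixed-length units because a month is not a constant number of days. The standard-library sketch below shows one way to step back by calendar months. The helper `subtract_months` is hypothetical, and clamping to the last valid day of the target month is an assumption for illustration, not a statement of how Featuretools' Timedelta resolves such cases.

```python
import calendar
from datetime import date


def subtract_months(day, months):
    """Return `day` moved back by `months` calendar months.

    The day of month is clamped to the last valid day of the target
    month (an assumption made for this sketch).
    """
    # Count months since year 0 so that year boundaries are handled by divmod.
    index = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


cutoff = date(2019, 3, 31)
window_start = subtract_months(cutoff, 1)
```

Going one month back from March 31, 2019 lands on February 28, 2019, a span of 31 days, whereas one month back from a date in May would span 30 days.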
- Fixes
Fix performance regression in DFS (GH#637)
Fix deserialization of feature relationship path (GH#665)
Set index after adding ancestor relationship variables (GH#668)
Fix user-supplied variable_types modification in Entity init (GH#675)
Don’t calculate dependencies of unnecessary features (GH#667)
Prevent normalize entity’s new entity having same index as base entity (GH#681)
Update variable type inference to better check for string values (GH#683)
- Changes
Moved dask, distributed imports (GH#634)
Thanks to the following people for contributing to this release: @alexjwang, @allisonportis, @ayushpatidar, @CJStadler, @ctduffy, @gsheni, @jeff-hernandez, @jeremyliweishih, @kmax12, @rwedge, @zhxt95
- v0.9.1 July 3, 2019
- Fixes
Select columns of dataframe using a list (GH#615)
Change type of features calculated on Index features to Categorical (GH#602)
Filter dataframes through forward relationships (GH#625)
Specify Dask version in requirements for python 2 (GH#627)
Keep dataframe sorted by time during feature calculation (GH#626)
Fix bug in encode_features that created duplicate columns of features with multiple outputs (GH#622)
Thanks to the following people for contributing to this release: @CJStadler, @kmax12, @rwedge, @gsheni, @kkleidal, @ctduffy
- v0.9.0 June 19, 2019
- Enhancements
Add unit parameter to timesince primitives (GH#558)
Add ability to install optional add on libraries (GH#551)
Load and save features from open files and strings (GH#566)
Support custom variable types (GH#571)
Support entitysets which have multiple paths between two entities (GH#572, GH#544)
Added show_info function, more output information added to CLI featuretools info (GH#525)
- Fixes
Normalize_entity specifies error when ‘make_time_index’ is an invalid string (GH#550)
Schema version added for entityset serialization (GH#586)
Renamed features have names correctly serialized (GH#585)
Improved error message for index/time_index being the same column in normalize_entity and entity_from_dataframe (GH#583)
Removed unused variable in normalize entity (GH#589)
Change time since return type to numeric (GH#606)
Thanks to the following people for contributing to this release: @alexjwang, @allisonportis, @CJStadler, @ctduffy, @gsheni, @kmax12, @rwedge
- v0.8.0 May 17, 2019
Rename NUnique to NumUnique (GH#510)
Serialize features as JSON (GH#532)
Drop all variables at once in normalize_entity (GH#533)
Remove unnecessary sorting from normalize_entity (GH#535)
Features cache their names (GH#536)
Only calculate features for instances before cutoff (GH#523)
Remove all relative imports (GH#530)
Added FullName Variable Type (GH#506)
Add error message when target entity does not exist (GH#520)
New demo links (GH#542)
Remove duplicate features check in DFS (GH#538)
featuretools_primitives entry point expects list of primitive classes (GH#529)
Update ALL_VARIABLE_TYPES list (GH#526)
More Informative N Jobs Prints and Warnings (GH#511)
Update sklearn version requirements (GH#541)
Update Makefile (GH#519)
Remove unused parameter in Entity._handle_time (GH#524)
Remove build_ext code from setup.py (GH#513)
Documentation updates (GH#512, GH#514, GH#515, GH#521, GH#522, GH#527, GH#545)
Thanks to the following people for contributing to this release: @bphi, @CharlesBradshaw, @CJStadler, @glentennis, @gsheni, @kmax12, @rwedge
Breaking Changes
NUnique has been renamed to NumUnique.
Previous behavior
from featuretools.primitives import NUnique
New behavior
from featuretools.primitives import NumUnique
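Code that has to run against releases on both sides of this rename could use an import fallback along the following lines. This is only a sketch of a common compatibility pattern, not something the release notes prescribe; the final fallback to `None` is an assumption added so the snippet also runs where Featuretools is not installed.

```python
try:
    # New name, available after the rename.
    from featuretools.primitives import NumUnique
except ImportError:
    try:
        # Old name, used before the rename.
        from featuretools.primitives import NUnique as NumUnique
    except ImportError:
        # Featuretools is not importable at all (assumption for this sketch).
        NumUnique = None
```

The rest of the code can then refer to `NumUnique` regardless of which release is installed.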
- v0.7.1 Apr 24, 2019
Automatically generate feature name for controllable primitives (GH#481)
Primitive docstring updates (GH#489, GH#492, GH#494, GH#495)
Change primitive functions that returned strings to return functions (GH#499)
CLI customizable via entrypoints (GH#493)
Improve calculation of aggregation features on grandchildren (GH#479)
Refactor entrypoints to use decorator (GH#483)
Include doctests in testing suite (GH#491)
Documentation updates (GH#490)
Update how standard primitives are imported internally (GH#482)
Thanks to the following people for contributing to this release: @bukosabino, @CharlesBradshaw, @glentennis, @gsheni, @jeff-hernandez, @kmax12, @minkvsky, @rwedge, @thehomebrewnerd
- v0.7.0 Mar 29, 2019
Improve Entity Set Serialization (GH#361)
Support calling a primitive instance’s function directly (GH#461, GH#468)
Support other libraries extending featuretools functionality via entrypoints (GH#452)
Remove featuretools install command (GH#475)
Add commutative argument to SubtractNumeric and DivideNumeric primitives (GH#457)
Add FilePath variable_type (GH#470)
Add PhoneNumber, DateOfBirth, URL variable types (GH#447)
Generalize infer_variable_type, convert_variable_data and convert_all_variable_data methods (GH#423)
Thanks to the following people for contributing to this release: @bukosabino, @CharlesBradshaw, @ColCarroll, @glentennis, @grayskripko, @gsheni, @jeff-hernandez, @jrkinley, @kmax12, @RogerTangos, @rwedge
Breaking Changes
ft.dfs now has a groupby_trans_primitives parameter that DFS uses to automatically construct features that group by an ID column and then apply a transform primitive to each group. This change applies to the following primitives: CumSum, CumCount, CumMean, CumMin, and CumMax.
Previous behavior
ft.dfs(entityset=es, target_entity='customers', trans_primitives=["cum_mean"])
New behavior
ft.dfs(entityset=es, target_entity='customers', groupby_trans_primitives=["cum_mean"])
Related to the above change, cumulative transform features are now defined using a new feature class, GroupByTransformFeature.
Previous behavior
ft.Feature([base_feature, groupby_feature], primitive=CumulativePrimitive)
New behavior
ft.Feature(base_feature, groupby=groupby_feature, primitive=CumulativePrimitive)
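To make "group by an ID column and then apply a transform primitive" concrete, here is a minimal standard-library sketch of a grouped cumulative mean. The function `grouped_cum_mean` and the sample data are hypothetical; the sketch illustrates the semantics only and is not the Featuretools implementation of CumMean.

```python
from collections import defaultdict


def grouped_cum_mean(group_ids, values):
    """Running mean of `values`, computed separately for each group id.

    The output has one entry per input row, in input order, which is
    what makes this a transform rather than an aggregation.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    result = []
    for group_id, value in zip(group_ids, values):
        totals[group_id] += value
        counts[group_id] += 1
        result.append(totals[group_id] / counts[group_id])
    return result


# Illustrative data: two customers with interleaved transactions.
customer_ids = ["a", "b", "a", "b", "a"]
amounts = [10.0, 4.0, 20.0, 8.0, 30.0]
running = grouped_cum_mean(customer_ids, amounts)
```

Each row's value depends only on earlier rows with the same customer id, so the two customers' running means do not mix.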
- v0.6.1 Feb 15, 2019
Cumulative primitives (GH#410)
Entity.query_by_values now preserves row order of underlying data (GH#428)
Implementing Country Code and Sub Region Codes as variable types (GH#430)
Added IPAddress and EmailAddress variable types (GH#426)
Install data and dependencies (GH#403)
Add TimeSinceFirst, fix TimeSinceLast (GH#388)
Allow user to pass in desired feature return types (GH#372)
Add new configuration object (GH#401)
Replace NUnique get_function (GH#434)
_calculate_identity_features now only returns the features asked for, instead of the entire entity (GH#429)
Primitive function name uniqueness (GH#424)
Update NumCharacters and NumWords primitives (GH#419)
Change to zipcode rep, str for pandas (GH#418)
Remove pandas version upper bound (GH#408)
Make S3 dependencies optional (GH#404)
Check that agg_primitives and trans_primitives are right primitive type (GH#397)
Mean primitive changes (GH#395)
Fix transform stacking on multi-output aggregation (GH#394)
Fix list_primitives (GH#391)
Documentation updates (GH#400, GH#409, GH#415, GH#417, GH#420, GH#421, GH#422, GH#431)
Thanks to the following people for contributing to this release: @CharlesBradshaw, @csala, @floscha, @gsheni, @jxwolstenholme, @kmax12, @RogerTangos, @rwedge
- v0.6.0 Jan 30, 2019
Primitive refactor (GH#364)
Mean ignore NaNs (GH#379)
Plotting entitysets (GH#382)
Add seed features later in DFS process (GH#357)
Multiple output column features (GH#376)
Add ZipCode Variable Type (GH#367)
Add primitive.get_filepath and example of primitive loading data from external files (GH#380)
Transform primitives take series as input (GH#385)
Add modulo to override tests (GH#384)
Thanks to the following people for contributing to this release: @floscha, @gsheni, @kmax12, @RogerTangos, @rwedge
- v0.5.1 Dec 17, 2018
- v0.5.0 Dec 17, 2018
Add specific error for duplicate additional/copy_variables in normalize_entity (GH#348)
Removed EntitySet._import_from_dataframe (GH#346)
Removed time_index_reduce parameter (GH#344)
Allow installation of additional primitives (GH#326)
Fix DatetimeIndex variable conversion (GH#342)
Update Sklearn DFS Transformer (GH#343)
Clean up entity creation logic (GH#336)
Remove casting to list in transform feature calculation (GH#330)
Fix sklearn wrapper (GH#335)
Add readme to pypi
Update conda docs after move to conda-forge (GH#334)
Add wrapper for scikit-learn Pipelines (GH#323)
Remove parse_date_cols parameter from EntitySet._import_from_dataframe (GH#333)
Thanks to the following people for contributing to this release: @bukosabino, @georgewambold, @gsheni, @jeff-hernandez, @kmax12, and @rwedge.
- v0.4.1 Nov 29, 2018
Resolve bug preventing using first column as index by default (GH#308)
Handle return type when creating features from Id variables (GH#318)
Make id an optional parameter of EntitySet constructor (GH#324)
Handle primitives with same function being applied to same column (GH#321)
Update requirements (GH#328)
Clean up DFS arguments (GH#319)
Clean up Pandas Backend (GH#302)
Update properties of cumulative transform primitives (GH#320)
Feature stability between versions documentation (GH#316)
Add download count to GitHub readme (GH#310)
Fixed #297 update tests to check error strings (GH#303)
Remove usage of fixtures in agg primitive tests (GH#325)
- v0.4.0 Oct 31, 2018
Remove ft.utils.gen_utils.getsize and make pympler a test requirement (GH#299)
Update requirements.txt (GH#298)
Refactor EntitySet.find_path(…) (GH#295)
Clean up unused methods (GH#293)
Remove unused parents property of Entity (GH#283)
Removed relationships parameter (GH#284)
Improve time index validation (GH#285)
Encode features with “unknown” class in categorical (GH#287)
Allow where clauses on direct features in Deep Feature Synthesis (GH#279)
Change to fullargsspec (GH#288)
Parallel verbose fixes (GH#282)
Update tests for python 3.7 (GH#277)
Check duplicate rows cutoff times (GH#276)
Load retail demo data using compressed file (GH#271)
- v0.3.1 Sept 28, 2018
Handling time rewrite (GH#245)
Update deep_feature_synthesis.py (GH#249)
Handling return type when creating features from DatetimeTimeIndex (GH#266)
Update retail.py (GH#259)
Improve Consistency of Transform Primitives (GH#236)
Update demo docstrings (GH#268)
Handle non-string column names (GH#255)
Clean up merging of aggregation primitives (GH#250)
Add tests for Entity methods (GH#262)
Handle no child data when calculating aggregation features with multiple arguments (GH#264)
Add is_string utils function (GH#260)
Update python versions to match docker container (GH#261)
Handle where clause when no child data (GH#258)
No longer cache demo csvs, remove config file (GH#257)
Avoid stacking “expanding” primitives (GH#238)
Use randomly generated names in retail csv (GH#233)
Update README.md (GH#243)
- v0.3.0 Aug 27, 2018
Improve performance of all feature calculations (GH#224)
Update agg primitives to use more efficient functions (GH#215)
Optimize metadata calculation (GH#229)
More robust handling when no data at a cutoff time (GH#234)
Workaround categorical merge (GH#231)
Switch which CSV is associated with which variable (GH#228)
Remove unused kwargs from query_by_values, filter_and_sort (GH#225)
Remove convert_links_to_integers (GH#219)
Add example of using Dask to parallelize to docs (GH#221)
- v0.2.2 Aug 20, 2018
Remove unnecessary check no related instances call and refactor (GH#209)
Improve memory usage through support for pandas categorical types (GH#196)
Bump minimum pandas version from 0.20.3 to 0.23.0 (GH#216)
Make primitive lookup case insensitive (GH#213)
Use capital name (GH#211)
Set class name for Min (GH#206)
Remove variable_types from normalize entity (GH#205)
Handle parquet serialization with last time index (GH#204)
Reset index of cutoff times in calculate feature matrix (GH#198)
Check argument types for .normalize_entity (GH#195)
Type checking ignore entities. (GH#193)
- v0.2.1 July 2, 2018
- v0.2.0 June 22, 2018
Multiprocessing (GH#170)
Handle unicode encoding in repr throughout Featuretools (GH#161)
Clean up EntitySet class (GH#145)
Add support for building and uploading conda package (GH#167)
Parquet serialization (GH#152)
Remove variable stats (GH#171)
Make sure index variable comes first (GH#168)
No last time index update on normalize (GH#169)
Remove list of times as an option for cutoff_time in calculate_feature_matrix (GH#165)
Config does error checking to see if it can write to disk (GH#162)
- v0.1.21 May 30, 2018
No EntitySet required in loading/saving features (GH#141)
Use s3 demo csv with better column names (GH#139)
More reasonable start parameter (GH#149)
Add issue template (GH#133)
Update documentation after recent changes / removals (GH#157)
Rename demo retail csv file (GH#148)
Add names for binary (GH#142)
EntitySet repr to use get_name rather than id (GH#134)
Ensure config dir is writable (GH#135)
- v0.1.20 Apr 13, 2018
Primitives as strings in DFS parameters (GH#129)
Integer time index bugfixes (GH#128)
Add make_temporal_cutoffs utility function (GH#126)
Show all entities, switch shape display to row/col (GH#124)
Improved chunking when calculating feature matrices (GH#121)
Fixed num characters NaN handling (GH#118)
Modify ignore_variables docstring (GH#117)
- v0.1.19 Mar 21, 2018
More descriptive DFS progress bar (GH#69)
Convert text variable to string before NumWords (GH#106)
EntitySet.concat() reindexes relationships (GH#96)
Keep non-feature columns when encoding feature matrix (GH#111)
Uses full entity update for dependencies of uses_full_entity features (GH#110)
Update column names in retail demo (GH#104)
Handle Transform features that need access to all values of entity (GH#91)
- v0.1.18 Feb 27, 2018
- v0.1.17 Jan 18, 2018
LatLong type (GH#57)
Last time index fixes (GH#70)
Make median agg primitives ignore nans by default (GH#61)
Remove Python 3.4 support (GH#64)
Change normalize_entity to update secondary_time_index (GH#59)
Unpin requirements (GH#53)
associative -> commutative (GH#56)
Add Words and Chars primitives (GH#51)
- v0.1.16 Dec 19, 2017
- v0.1.15 Dec 18, 2017
- v0.1.14 November 20, 2017
- v0.1.13 November 1, 2017
Add MANIFEST.in (GH#26)
- v0.1.11 October 31, 2017
Package linting (GH#7)
Custom primitive creation functions (GH#13)
Split requirements to separate files and pin to latest versions (GH#15)
Select low information features (GH#18)
Fix docs typos (GH#19)
Fixed Diff primitive for rare nan case (GH#21)
Added some missing docstrings (GH#23)
Trend fix (GH#22)
Remove as_dir=False option from EntitySet.to_pickle() (GH#20)
Entity Normalization Preserves Types of Copy & Additional Variables (GH#25)
- v0.1.10 October 12, 2017
NumTrue primitive added and docstring of other primitives updated (GH#11)
Fixed hash issue with same base features (GH#8)
Head fix (GH#9)
Fix training window (GH#10)
Add associative attribute to primitives (GH#3)
Add status badges, fix license in setup.py (GH#1)
Fixed head printout and flight demo index (GH#2)
- v0.1.9 September 8, 2017
Documentation improvements
New featuretools.demo.load_mock_customer function
- v0.1.8 September 1, 2017
Bug fixes
Added Percentile transform primitive
- v0.1.7 August 17, 2017
Performance improvements for approximate in calculate_feature_matrix and dfs
Added Week transform primitive
- v0.1.6 July 26, 2017
Added load_features and save_features to persist and reload features
Added save_progress argument to calculate_feature_matrix
Added approximate parameter to calculate_feature_matrix and dfs
Added load_flight to ft.demo
- v0.1.5 July 11, 2017
Windows support
- v0.1.3 July 10, 2017
Renamed feature submodule to primitives
Renamed prediction_entity arguments to target_entity
Added training_window parameter to calculate_feature_matrix
- v0.1.2 July 3rd, 2017
Initial release