Improved operators, locking mechanism, and seeding initial population with decision trees #65
…iations I will not create something fully random if variation fails, because the population can lose the locked nodes or weights. Instead, I try subtree mutations and, if those also fail, clone the parent.

This is how each operator handles locked weights:

- Delete: should not work on nodes with a locked weight.
- Toggle: will not work if there is a fixed weight (it cannot be turned on or off).
- Subtree: will keep the weight.
- Point: will keep the weight.
- Insert: should "steal" the weight from the fixed node.
- Crossover: will keep the weight of the receiving parent.
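The fallback order described above (the requested variation first, then subtree mutation, then a clone of the parent) could be sketched roughly as follows. This is only an illustration, not the code from this PR: `vary_with_fallback`, the callable signatures, the convention that a failed operator returns `None`, and the `max_subtree_tries` limit are all hypothetical.

```python
import copy
from typing import Callable, Optional

# Hypothetical stand-in for an individual; the real project uses its own classes.
Individual = dict
Operator = Callable[[Individual], Optional[Individual]]


def vary_with_fallback(
    parent: Individual,
    variation: Operator,
    subtree_mutation: Operator,
    max_subtree_tries: int = 3,  # assumption: the retry count is not stated in the PR
) -> Individual:
    """Apply `variation`; on failure retry with subtree mutation, then clone."""
    child = variation(parent)
    if child is not None:
        return child

    # Subtree mutation keeps locked weights, so it is a safe second attempt.
    for _ in range(max_subtree_tries):
        child = subtree_mutation(parent)
        if child is not None:
            return child

    # Last resort: an exact copy of the parent, so locked nodes and weights
    # are never lost to a fully random replacement.
    return copy.deepcopy(parent)
```

Cloning as the last step means a failed variation can never reduce the number of individuals that carry the locked nodes.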
That was making complexity values explode by considering intermediate nodes when doing the recursive calculation, often leading to overflow of the integer value. I also updated the C++ test cases to print the min and max values for each data type, so we can manually check whether the type is suitable for the calculations we are doing.
…ataframe. Better error message when accessing the dataset directly does not find the feature by its name. New assertions in API interface tests.
This PR introduces several improvements and fixes:
- Added `start_from_decision_trees` to `EstimatorInterface` and the underlying C++ `Parameters`, allowing the initial population to consist only of decision trees. This is now exposed in Python and passed to the C++ bindings.
- In `BrushEstimator.partial_fit`, added the `keep_current_weights` parameter to control whether current weights are locked during optimization, and ensured the population is replicated from the best estimator before fitting. The underlying C++ `lock_nodes` method and its Python bindings now support this option.
- Fixed `average_precision_score` in C++ to correctly handle cases where all predicted probabilities are constant, matching sklearn's behavior, and fixed an off-by-one error in the loop.
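To illustrate the constant-probability case from the last item, here is a small pure-Python sketch of step-wise average precision that treats tied scores as a single threshold. It is not the C++ implementation from this PR. The function name `average_precision` and the choice to return 0.0 when there are no positive labels are assumptions made for the example.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AP = sum over distinct thresholds of (R_k - R_{k-1}) * P_k."""
    n_pos = sum(1 for label in y_true if label == 1)
    if n_pos == 0:
        return 0.0  # assumption: convention chosen for this sketch only

    # Visit samples from highest to lowest score.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)

    tp = fp = 0
    prev_recall = 0.0
    ap = 0.0
    i = 0
    n = len(order)
    while i < n:
        threshold = y_score[order[i]]
        # Consume every sample tied at this score before evaluating the
        # threshold; splitting ties would make AP depend on sample order.
        while i < n and y_score[order[i]] == threshold:
            if y_true[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        recall = tp / n_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

When every score is identical there is exactly one threshold, so recall jumps from 0 to 1 in a single step and the result equals the fraction of positive labels.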