Precision, Recall and F1 score of 1 or 2 regression trees #17
Thanks for the note. A couple of questions:
1. What is the class imbalance in the data set? 50-50? It appears that all the examples are classified as one class.
2. What is the depth of the trees?
A couple of favors.
1. Can you please open the regressionTrees.txt file in the models directory?
2. Can you please send us the .bk file?
Thanks
Sriraam
On May 3, 2018, at 8:12 PM, Rodrigo Azevedo <[email protected]> wrote:
When I set the trees parameter to learn 1 or 2 regression trees I always get those metrics as NaN or 0.
% Precision = NaN at threshold = 0,500
% Recall = 0,000000
% F1 = NaN
I've tried it several times with same dataset, settings and different numbers of learning trees.
Thanks for helping! I tried to learn a single regression tree with the following datasets:
1. IMDB dataset provided in the BoostSRL Wiki: 95 positive examples and 146 negative examples.
2. NELL sports dataset: 174 positive examples and 174 negative examples.
Thanks a lot for providing the information. I assume you are getting proper AUC ROC and AUC PR. With probabilistic classifiers such as ours, precision and recall are not straightforward measures. In our code, the threshold on the prediction probabilities used to decide predicted positives and negatives has been hard-coded to 0.5 since the first development cycle. So when the predicted probabilities are all lower than 0.5 (especially with 1 tree), precision comes out as NaN. We will take this up as an open issue and make the threshold dynamic/customizable in our next full release cycle. However, as a quick fix on your side, if you are using the source code directly, you can change the threshold for different data sets and see what works: go to the class "edu.wisc.cs.will.boosting.RDN.RunBoostedRDN.java" (lines 455 - 458), change the threshold from 0.5 to your preferred value in the infer() method, and recompile. Or, you may just use the AUCs as your performance metric if that is suitable for you. Thanks
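To make the NaN behavior concrete, here is a minimal Python sketch (not BoostSRL's actual Java code; the function name and the example numbers are made up for illustration) of how thresholded precision/recall behaves: when no example's probability clears the cutoff, there are no predicted positives, and precision is 0/0, i.e. NaN.

```python
def precision_recall_f1(probs, labels, threshold=0.5):
    """Compute precision/recall/F1 for predicted probabilities and 0/1 labels."""
    preds = [p >= threshold for p in probs]
    tp = sum(1 for pr, y in zip(preds, labels) if pr and y == 1)
    fp = sum(1 for pr, y in zip(preds, labels) if pr and y == 0)
    fn = sum(1 for pr, y in zip(preds, labels) if not pr and y == 1)
    # No predicted positives -> tp + fp == 0 -> precision is 0/0 (NaN).
    precision = tp / (tp + fp) if (tp + fp) > 0 else float('nan')
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (tp + fp) > 0 and (precision + recall) > 0 else float('nan'))
    return precision, recall, f1

# All probabilities below 0.5, as can happen with a single regression tree:
probs = [0.28, 0.25, 0.31, 0.12]
labels = [1, 1, 0, 0]
print(precision_recall_f1(probs, labels, threshold=0.5))  # → (nan, 0.0, nan)
print(precision_recall_f1(probs, labels, threshold=0.2))  # lower cutoff gives finite scores
```

This is why lowering the hard-coded cutoff (or using the threshold-free AUCs) recovers meaningful scores.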
I'm getting proper AUC ROC and AUC PR; however, all the examples are being classified as False. Positive examples are getting probabilities under 0.3, while negative examples are getting probabilities above 0.7 (1-prob). I've changed some parameters such as treeDepth and numOfClauses, but I didn't get better results.
The best threshold is 0.19016607954333642.
In addition, is there a way to infer using the combined regression tree? I'm currently removing the boosted trees, renaming the combined tree to regressionTree0, and modifying the model file. Thanks.
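A "best threshold" like the 0.19 quoted above can be found by scanning candidate cutoffs and keeping the one that maximizes F1. Here is a hypothetical Python sketch of that idea (the function name and the probabilities are illustrative, not values from the IMDB run):

```python
def best_threshold(probs, labels):
    """Scan each predicted probability as a cutoff; return (threshold, F1) maximizing F1."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(probs)):
        preds = [p >= t for p in probs]
        tp = sum(1 for pr, y in zip(preds, labels) if pr and y == 1)
        fp = sum(1 for pr, y in zip(preds, labels) if pr and y == 0)
        fn = sum(1 for pr, y in zip(preds, labels) if not pr and y == 1)
        if tp == 0:  # skip cutoffs with no true positives (F1 undefined or 0)
            continue
        prec, rec = tp / (tp + fp), tp / (tp + fn)
        f1 = 2 * prec * rec / (prec + rec)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Illustrative probabilities: positives cluster low, as described in the thread.
probs = [0.29, 0.22, 0.19, 0.31, 0.10, 0.15]
labels = [1, 1, 1, 0, 0, 0]
print(best_threshold(probs, labels))  # picks 0.19 on this toy data
```

With a dynamic or user-supplied threshold, this kind of scan (on a validation set) replaces the hard-coded 0.5 cutoff.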
Hi Rodrigo, Were you able to change the hard-coded threshold in the source code to a different (lower) value, as I outlined in my previous message, and then try running it? Thanks
As you can see, Precision = NaN occurs at threshold 0.5 ... We will take care of this issue by making the threshold dynamic in our next release.
Hi Mayukh, Yes, I changed it and I'm able to see the scores now.
Thank you.