Precision, Recall and F1 score of 1 or 2 regression trees #17

Open
rodrigoazs opened this issue May 4, 2018 · 7 comments
@rodrigoazs

When I set the trees parameter to learn only 1 or 2 regression trees, precision, recall and F1 always come out as NaN or 0:

% Precision = NaN at threshold = 0,500
% Recall = 0,000000
% F1 = NaN

I've tried it several times with the same dataset and settings, and with different numbers of trees.

@boost-starai
Contributor

boost-starai commented May 4, 2018 via email

@rodrigoazs
Author

Thanks for helping!

I tried to learn one single regression tree with the following dataset:

IMDB dataset provided in the BoostSRL Wiki

setParam: treeDepth=3.

95 positive examples and 146 negative examples.

setParam: stringsAreCaseSensitive = true.

usePrologVariables: true.


(female_gender(A, 0.15676004621062295) :-  /* #neg=101 #pos=43 */ workedunder_2_1_genre(A, adrama, 1)).
(female_gender(A, 0.13087620782678497) :-  /* #neg=8 #pos=3 */ workedunder_2_1_genre(A, acomedy, 7)).
(female_gender(A, 0.16584124279182003) :-  /* #neg=9 #pos=4 */ workedunder_2_1_genre(A, ascifi, 1)).
(female_gender(A, 0.35814893509951234) :-  /* #neg=7 #pos=7 */ workedunder_2_1_genre(A, acomedy, 2)).
(female_gender(A, 0.40360348055405776) :-  /* #neg=5 #pos=6 */ workedunder_2_1_genre(A, acrime, 2)).
(female_gender(A, 0.5133213488926157) :-  /* #neg=10 #pos=19 */ workedunder_2_1_genre(A, aaction, 1)).
female_gender(_, 0.524815601766179) /* #neg=7 #pos=14 */ .

usePrologVariables: true.

% maxTreeDepthInNodes                 = 5
% maxTreeDepthInLiterals              = 12
% maxNumberOfLiteralsAtAnInteriorNode = 3
% maxFreeBridgersInBody               = 1
% maxNumberOfClauses                  = 8
% maxNodesToConsider                  = 10
% maxNodesToCreate                    = 10.000
% maxAcceptableNodeScoreToStop        = 0,003
% negPosRatio                         = 2,000
% testNegPosRatio                     = -1,000
% # of pos examples                   = 243
% # of neg examples                   = 0



%%%%%  WILL-Produced Tree #1 @ 22:23:51 5/3/18.  [Using 2.584.856 memory cells.]  %%%%%

% FOR female_gender(A):
%   if ( workedunder_2_1_genre(A, adrama, 1) )
%   then return 0.15676004621062295;  // std dev = 0,458, 144,000 (wgt'ed) examples reached here.  /* #neg=101 #pos=43 */
%   else if ( workedunder_2_1_genre(A, acomedy, 7) )
%   | then return 0.13087620782678497;  // std dev = 0,445, 11,000 (wgt'ed) examples reached here.  /* #neg=8 #pos=3 */
%   | else if ( workedunder_2_1_genre(A, ascifi, 1) )
%   | | then return 0.16584124279182003;  // std dev = 0,462, 13,000 (wgt'ed) examples reached here.  /* #neg=9 #pos=4 */
%   | | else if ( workedunder_2_1_genre(A, acomedy, 2) )
%   | | | then return 0.35814893509951234;  // std dev = 0,500, 14,000 (wgt'ed) examples reached here.  /* #neg=7 #pos=7 */
%   | | | else if ( workedunder_2_1_genre(A, acrime, 2) )
%   | | | | then return 0.40360348055405776;  // std dev = 0,498, 11,000 (wgt'ed) examples reached here.  /* #neg=5 #pos=6 */
%   | | | | else if ( workedunder_2_1_genre(A, aaction, 1) )
%   | | | | | then return 0.5133213488926157;  // std dev = 2,560, 29,000 (wgt'ed) examples reached here.  /* #neg=10 #pos=19 */
%   | | | | | else return 0.524815601766179;  // std dev = 2,160, 21,000 (wgt'ed) examples reached here.  /* #neg=7 #pos=14 */


% Clauses:

female_gender(A, 0.15676004621062295) :- 
     workedunder_2_1_genre(A, adrama, 1), 
     !. // Clause #1.

female_gender(A, 0.13087620782678497) :- 
     workedunder_2_1_genre(A, acomedy, 7), 
     !. // Clause #2.

female_gender(A, 0.16584124279182003) :- 
     workedunder_2_1_genre(A, ascifi, 1), 
     !. // Clause #3.

female_gender(A, 0.35814893509951234) :- 
     workedunder_2_1_genre(A, acomedy, 2), 
     !. // Clause #4.

female_gender(A, 0.40360348055405776) :- 
     workedunder_2_1_genre(A, acrime, 2), 
     !. // Clause #5.

female_gender(A, 0.5133213488926157) :- 
     workedunder_2_1_genre(A, aaction, 1), 
     !. // Clause #6.

female_gender(A, 0.524815601766179) :- !. // Clause #7.


% The flattened versions of these clauses:

flattened_female_gender(a, 0.15676004621062295) :-  /* #neg=101 #pos=43 */ 
   workedunder_2_1_genre(a, adrama, 1),
   !. // Flattened version of clause #1.

flattened_female_gender(a, 0.13087620782678497) :-  /* #neg=8 #pos=3 */ 
   workedunder_2_1_genre(a, acomedy, 7),
   !. // Flattened version of clause #2.

flattened_female_gender(a, 0.16584124279182003) :-  /* #neg=9 #pos=4 */ 
   workedunder_2_1_genre(a, ascifi, 1),
   !. // Flattened version of clause #3.

flattened_female_gender(a, 0.35814893509951234) :-  /* #neg=7 #pos=7 */ 
   workedunder_2_1_genre(a, acomedy, 2),
   !. // Flattened version of clause #4.

flattened_female_gender(a, 0.40360348055405776) :-  /* #neg=5 #pos=6 */ 
   workedunder_2_1_genre(a, acrime, 2),
   !. // Flattened version of clause #5.

flattened_female_gender(a, 0.5133213488926157) :-  /* #neg=10 #pos=19 */ 
   workedunder_2_1_genre(a, aaction, 1),
   !. // Flattened version of clause #6.

flattened_female_gender(underscore, 0.524815601766179) :-  /* #neg=7 #pos=14 */ 
   !. // Flattened version of clause #7.


% The unique flattened literals:
%   workedunder_2_1_genre(a, acomedy, 2)
%   workedunder_2_1_genre(a, aaction, 1)
%   workedunder_2_1_genre(a, adrama, 1)
%   workedunder_2_1_genre(a, acomedy, 7)
%   workedunder_2_1_genre(a, ascifi, 1)
%   workedunder_2_1_genre(a, acrime, 2)


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%  Final call for computing score for female_gender.  %%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

stepLength_tree1(5.0E-4).

logPrior(-1.8).
female_gender(E, Total) :- // A general accessor. 
   female_gender(E, 1000000, Total), !.
female_gender(E, Total) :- waitHere("This should not fail", female_gender(E, Total)).

female_gender(E, TreesToUse, Total) :- // A tree-limited accessor (e.g., for tuning the number of trees to use).
   logPrior(LogPrior),
   getScore_female_gender_tree1(E, TreesToUse, Total1),
   Total is LogPrior + Total1,
   !.
female_gender(E, TreesToUse, Total) :- waitHere("This should not fail", female_gender(E, TreesToUse, Total)).

getScore_female_gender_tree1(E, TreesToUse, 0.0) :- 1 > TreesToUse, !.
getScore_female_gender_tree1(E, TreesToUse, Total1) :- female_gender_tree1(E, Total), stepLength_tree1(StepLen), Total1 is Total * StepLen.

flattenedLiteralsInThisSetOfTrees(female_gender, 6, [
   workedunder_2_1_genre(a, acomedy, 2),
   workedunder_2_1_genre(a, aaction, 1),
   workedunder_2_1_genre(a, adrama, 1),
   workedunder_2_1_genre(a, acomedy, 7),
   workedunder_2_1_genre(a, ascifi, 1),
   workedunder_2_1_genre(a, acrime, 2)]).

%%%%%  WILL-Produced Tree Combined @ 22:23:52 5/3/18.  [Using 2.611.240 memory cells.]  %%%%%

% FOR female_gender(A):
%   if ( workedunder_2_1_genre(A, adrama, 1) )
%   then return 0.15676004621062295;  // std dev = 0,458, 144,000 (wgt'ed) examples reached here.  /* #neg=101 #pos=43 */
%   else if ( workedunder_2_1_genre(A, ascifi, 1) )
%   | then return 0.16584124279182003;  // std dev = 0,462, 13,000 (wgt'ed) examples reached here.  /* #neg=9 #pos=4 */
%   | else if ( workedunder_2_1_genre(A, acomedy, 7) )
%   | | then return 0.13087620782678497;  // std dev = 0,445, 11,000 (wgt'ed) examples reached here.  /* #neg=8 #pos=3 */
%   | | else if ( workedunder_2_1_genre(A, acomedy, 2) )
%   | | | then return 0.35814893509951234;  // std dev = 0,500, 14,000 (wgt'ed) examples reached here.  /* #neg=7 #pos=7 */
%   | | | else if ( workedunder_2_1_genre(A, acrime, 2) )
%   | | | | then return 0.40360348055405776;  // std dev = 0,498, 11,000 (wgt'ed) examples reached here.  /* #neg=5 #pos=6 */
%   | | | | else if ( workedunder_2_1_genre(A, adrama, 2) )
%   | | | | | then return 0.6081489350995122;  // std dev = 0,866, 4,000 (wgt'ed) examples reached here.  /* #neg=1 #pos=3 */
%   | | | | | else return 0.5103228481429896;  // std dev = 3,230, 46,000 (wgt'ed) examples reached here.  /* #neg=16 #pos=30 */


% Clauses:

female_gender(A, 0.15676004621062295) :- 
     workedunder_2_1_genre(A, adrama, 1), 
     !. // Clause #1.

female_gender(A, 0.16584124279182003) :- 
     workedunder_2_1_genre(A, ascifi, 1), 
     !. // Clause #2.

female_gender(A, 0.13087620782678497) :- 
     workedunder_2_1_genre(A, acomedy, 7), 
     !. // Clause #3.

female_gender(A, 0.35814893509951234) :- 
     workedunder_2_1_genre(A, acomedy, 2), 
     !. // Clause #4.

female_gender(A, 0.40360348055405776) :- 
     workedunder_2_1_genre(A, acrime, 2), 
     !. // Clause #5.

female_gender(A, 0.6081489350995122) :- 
     workedunder_2_1_genre(A, adrama, 2), 
     !. // Clause #6.

female_gender(A, 0.5103228481429896) :- !. // Clause #7.


% The flattened versions of these clauses:

flattened_female_gender(a, 0.15676004621062295) :-  /* #neg=101 #pos=43 */ 
   workedunder_2_1_genre(a, adrama, 1),
   !. // Flattened version of clause #1.

flattened_female_gender(a, 0.16584124279182003) :-  /* #neg=9 #pos=4 */ 
   workedunder_2_1_genre(a, ascifi, 1),
   !. // Flattened version of clause #2.

flattened_female_gender(a, 0.13087620782678497) :-  /* #neg=8 #pos=3 */ 
   workedunder_2_1_genre(a, acomedy, 7),
   !. // Flattened version of clause #3.

flattened_female_gender(a, 0.35814893509951234) :-  /* #neg=7 #pos=7 */ 
   workedunder_2_1_genre(a, acomedy, 2),
   !. // Flattened version of clause #4.

flattened_female_gender(a, 0.40360348055405776) :-  /* #neg=5 #pos=6 */ 
   workedunder_2_1_genre(a, acrime, 2),
   !. // Flattened version of clause #5.

flattened_female_gender(a, 0.6081489350995122) :-  /* #neg=1 #pos=3 */ 
   workedunder_2_1_genre(a, adrama, 2),
   !. // Flattened version of clause #6.

flattened_female_gender(underscore, 0.5103228481429896) :-  /* #neg=16 #pos=30 */ 
   !. // Flattened version of clause #7.


% The unique flattened literals:
%   workedunder_2_1_genre(a, acomedy, 2)
%   workedunder_2_1_genre(a, adrama, 1)
%   workedunder_2_1_genre(a, acomedy, 7)
%   workedunder_2_1_genre(a, ascifi, 1)
%   workedunder_2_1_genre(a, acrime, 2)
%   workedunder_2_1_genre(a, adrama, 2)

NELL sports dataset

174 positive examples and 174 negative examples

//Parameters
setParam: maxTreeDepth=6.
setParam: nodeSize=1.
setParam: numOfClauses=8.
//Modes
mode: male(+name).
mode: athleteledsportsteam(+athlete,+sportsteam).
mode: athleteledsportsteam(-athlete,+sportsteam).
mode: athleteledsportsteam(+athlete,-sportsteam).
mode: athleteplaysforteam(+athlete,+sportsteam).
mode: athleteplaysforteam(+athlete,-sportsteam).
mode: athleteplaysforteam(-athlete,+sportsteam).
mode: athleteplaysinleague(+athlete,+sportsleague).
mode: athleteplaysinleague(+athlete,-sportsleague).
mode: athleteplaysinleague(-athlete,+sportsleague).
mode: athleteplayssport(+athlete,+sport).
mode: athleteplayssport(+athlete,-sport).
mode: athleteplayssport(-athlete,+sport).
mode: teamalsoknownas(+sportsteam,+sportsteam).
mode: teamalsoknownas(+sportsteam,-sportsteam).
mode: teamalsoknownas(-sportsteam,+sportsteam).
mode: teamplaysagainstteam(+sportsteam,+sportsteam).
mode: teamplaysagainstteam(+sportsteam,-sportsteam).
mode: teamplaysagainstteam(-sportsteam,+sportsteam).
mode: teamplaysinleague(+sportsteam,+sportsleague).
mode: teamplaysinleague(+sportsteam,-sportsleague).
mode: teamplaysinleague(-sportsteam,+sportsleague).
mode: teamplayssport(+sportsteam,+sport).
mode: teamplayssport(+sportsteam,-sport).
mode: teamplayssport(-sportsteam,+sport).
setParam: stringsAreCaseSensitive = true.

usePrologVariables: true.


(athleteplaysforteam(A, B, 0.8446354215859994) :-  /* #neg=1 #pos=73 */ athleteledsportsteam(_, B), athleteledsportsteam(A, B), !).
(athleteplaysforteam(_, A, 0.5061489350995088) :-  /* #neg=88 #pos=162 */ athleteledsportsteam(_, A), teamalsoknownas(A, _), !).
(athleteplaysforteam(A, B, 0.6669724645112775) :-  /* #neg=13 #pos=55 */ athleteledsportsteam(_, B), athleteplaysinleague(A, UniqueVar1), teamplaysinleague(B, UniqueVar1), !).
(athleteplaysforteam(A, B, 0.055517356152143954) :-  /* #neg=61 #pos=15 */ athleteledsportsteam(_, B), athleteplaysinleague(A, _), !).
(athleteplaysforteam(_, A, 0.1793168183111909) :-  /* #neg=93 #pos=44 */ athleteledsportsteam(_, A), !).
(athleteplaysforteam(A, B, 0.16249676118646889) :-  /* #neg=16 #pos=7 */ teamalsoknownas(B, _), athleteledsportsteam(A, _), !).
(athleteplaysforteam(A, B, 0.324815601766179) :-  /* #neg=16 #pos=14 */ teamalsoknownas(B, _), athleteplaysinleague(A, _), !).
(athleteplaysforteam(_, A, 0.5387044906550683) :-  /* #neg=23 #pos=49 */ teamalsoknownas(A, _), !).
(athleteplaysforteam(_, _, 0.17946041050934608) :-  /* #neg=207 #pos=98 */ !).
usePrologVariables: true.

% maxTreeDepthInNodes                 = 6
% maxTreeDepthInLiterals              = 12
% maxNumberOfLiteralsAtAnInteriorNode = 1
% maxFreeBridgersInBody               = 1
% maxNumberOfClauses                  = 8
% maxNodesToConsider                  = 10
% maxNodesToCreate                    = 10.000
% maxAcceptableNodeScoreToStop        = 0,003
% negPosRatio                         = 2,000
% testNegPosRatio                     = -1,000
% # of pos examples                   = 1.035
% # of neg examples                   = 0



%%%%%  WILL-Produced Tree #1 @ 20:56:42 5/3/18.  [Using 6.846.760 memory cells.]  %%%%%

% FOR athleteplaysforteam(A, B):
%   if ( athleteledsportsteam(C, B) )
%   then if ( athleteledsportsteam(A, B) )
%   | then return 0.8446354215859994;  // std dev = 0,115, 74,000 (wgt'ed) examples reached here.  /* #neg=1 #pos=73 */
%   | else if ( teamalsoknownas(B, D) )
%   | | then return 0.5061489350995088;  // std dev = 0,478, 250,000 (wgt'ed) examples reached here.  /* #neg=88 #pos=162 */
%   | | else if ( athleteplaysinleague(A, E) )
%   | | | then if ( teamplaysinleague(B, E) )
%   | | | | then return 0.6669724645112775;  // std dev = 0,393, 68,000 (wgt'ed) examples reached here.  /* #neg=13 #pos=55 */
%   | | | | else return 0.055517356152143954;  // std dev = 0,398, 76,000 (wgt'ed) examples reached here.  /* #neg=61 #pos=15 */
%   | | | else return 0.1793168183111909;  // std dev = 0,467, 137,000 (wgt'ed) examples reached here.  /* #neg=93 #pos=44 */
%   else if ( teamalsoknownas(B, F) )
%   | then if ( athleteledsportsteam(A, G) )
%   | | then return 0.16249676118646889;  // std dev = 0,460, 23,000 (wgt'ed) examples reached here.  /* #neg=16 #pos=7 */
%   | | else if ( athleteplaysinleague(A, H) )
%   | | | then return 0.324815601766179;  // std dev = 0,499, 30,000 (wgt'ed) examples reached here.  /* #neg=16 #pos=14 */
%   | | | else return 0.5387044906550683;  // std dev = 0,466, 72,000 (wgt'ed) examples reached here.  /* #neg=23 #pos=49 */
%   | else return 0.17946041050934608;  // std dev = 0,467, 305,000 (wgt'ed) examples reached here.  /* #neg=207 #pos=98 */


% Clauses:

athleteplaysforteam(A, B, 0.8446354215859994) :- 
     athleteledsportsteam(C, B), 
     athleteledsportsteam(A, B), 
     !. // Clause #1.

athleteplaysforteam(A, B, 0.5061489350995088) :- 
     athleteledsportsteam(C, B), 
     teamalsoknownas(B, D), 
     !. // Clause #2.

athleteplaysforteam(A, B, 0.6669724645112775) :- 
     athleteledsportsteam(C, B), 
     athleteplaysinleague(A, D), 
     teamplaysinleague(B, D), 
     !. // Clause #3.

athleteplaysforteam(A, B, 0.055517356152143954) :- 
     athleteledsportsteam(C, B), 
     athleteplaysinleague(A, D), 
     !. // Clause #4.

athleteplaysforteam(A, B, 0.1793168183111909) :- 
     athleteledsportsteam(C, B), 
     !. // Clause #5.

athleteplaysforteam(A, B, 0.16249676118646889) :- 
     teamalsoknownas(B, C), 
     athleteledsportsteam(A, D), 
     !. // Clause #6.

athleteplaysforteam(A, B, 0.324815601766179) :- 
     teamalsoknownas(B, C), 
     athleteplaysinleague(A, D), 
     !. // Clause #7.

athleteplaysforteam(A, B, 0.5387044906550683) :- 
     teamalsoknownas(B, C), 
     !. // Clause #8.

athleteplaysforteam(A, B, 0.17946041050934608) :- !. // Clause #9.


% The flattened versions of these clauses:

flattened_athleteplaysforteam(a, b, 0.8446354215859994) :-  /* #neg=1 #pos=73 */ 
   athleteledsportsteam(underscore, b),
   athleteledsportsteam(a, b),
   !. // Flattened version of clause #1.

flattened_athleteplaysforteam(underscore, a, 0.5061489350995088) :-  /* #neg=88 #pos=162 */ 
   athleteledsportsteam(underscore, a),
   teamalsoknownas(a, underscore),
   !. // Flattened version of clause #2.

flattened_athleteplaysforteam(a, b, 0.6669724645112775) :-  /* #neg=13 #pos=55 */ 
   athleteledsportsteam(underscore, b),
   athleteplaysinleague(a, uniqueVar1),
   teamplaysinleague(b, uniqueVar1),
   !. // Flattened version of clause #3.

flattened_athleteplaysforteam(a, b, 0.055517356152143954) :-  /* #neg=61 #pos=15 */ 
   athleteledsportsteam(underscore, b),
   athleteplaysinleague(a, underscore),
   !. // Flattened version of clause #4.

flattened_athleteplaysforteam(underscore, a, 0.1793168183111909) :-  /* #neg=93 #pos=44 */ 
   athleteledsportsteam(underscore, a),
   !. // Flattened version of clause #5.

flattened_athleteplaysforteam(a, b, 0.16249676118646889) :-  /* #neg=16 #pos=7 */ 
   teamalsoknownas(b, underscore),
   athleteledsportsteam(a, underscore),
   !. // Flattened version of clause #6.

flattened_athleteplaysforteam(a, b, 0.324815601766179) :-  /* #neg=16 #pos=14 */ 
   teamalsoknownas(b, underscore),
   athleteplaysinleague(a, underscore),
   !. // Flattened version of clause #7.

flattened_athleteplaysforteam(underscore, a, 0.5387044906550683) :-  /* #neg=23 #pos=49 */ 
   teamalsoknownas(a, underscore),
   !. // Flattened version of clause #8.

flattened_athleteplaysforteam(underscore, underscore, 0.17946041050934608) :-  /* #neg=207 #pos=98 */ 
   !. // Flattened version of clause #9.


% The unique flattened literals:
%   athleteledsportsteam(a, underscore)
%   athleteledsportsteam(a, b)
%   teamplaysinleague(b, uniqueVar1)
%   athleteplaysinleague(a, underscore)
%   athleteledsportsteam(underscore, b)
%   athleteplaysinleague(a, uniqueVar1)
%   teamalsoknownas(a, underscore)
%   teamalsoknownas(b, underscore)
%   athleteledsportsteam(underscore, a)


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%  Final call for computing score for athleteplaysforteam.  %%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

stepLength_tree1(1.0).

logPrior(-1.8).
athleteplaysforteam(D, E, Total) :- // A general accessor. 
   athleteplaysforteam(D, E, 1000000, Total), !.
athleteplaysforteam(D, E, Total) :- waitHere("This should not fail", athleteplaysforteam(D, E, Total)).

athleteplaysforteam(D, E, TreesToUse, Total) :- // A tree-limited accessor (e.g., for tuning the number of trees to use).
   logPrior(LogPrior),
   getScore_athleteplaysforteam_tree1(D, E, TreesToUse, Total1),
   Total is LogPrior + Total1,
   !.
athleteplaysforteam(D, E, TreesToUse, Total) :- waitHere("This should not fail", athleteplaysforteam(D, E, TreesToUse, Total)).

getScore_athleteplaysforteam_tree1(D, E, TreesToUse, 0.0) :- 1 > TreesToUse, !.
getScore_athleteplaysforteam_tree1(D, E, TreesToUse, Total1) :- athleteplaysforteam_tree1(D, E, Total), stepLength_tree1(StepLen), Total1 is Total * StepLen.

flattenedLiteralsInThisSetOfTrees(athleteplaysforteam, 9, [
   athleteledsportsteam(a, underscore),
   athleteledsportsteam(a, b),
   teamplaysinleague(b, uniqueVar1),
   athleteplaysinleague(a, underscore),
   athleteledsportsteam(underscore, b),
   athleteplaysinleague(a, uniqueVar1),
   teamalsoknownas(a, underscore),
   teamalsoknownas(b, underscore),
   athleteledsportsteam(underscore, a)]).

%%%%%  WILL-Produced Tree Combined @ 20:56:43 5/3/18.  [Using 7.125.176 memory cells.]  %%%%%

% FOR athleteplaysforteam(A, B):
%   if ( athleteledsportsteam(A, B) )
%   then return 0.8446354215859994;  // std dev = 0,115, 74,000 (wgt'ed) examples reached here.  /* #neg=1 #pos=73 */
%   else if ( teamalsoknownas(B, C) )
%   | then if ( teamplaysagainstteam(B, C) )
%   | | then return 0.5093117257971828;  // std dev = 0,477, 344,000 (wgt'ed) examples reached here.  /* #neg=120 #pos=224 */
%   | | else return 0.11621345122854462;  // std dev = 0,438, 31,000 (wgt'ed) examples reached here.  /* #neg=23 #pos=8 */
%   | else if ( teamplayssport(B, D) )
%   | | then if ( athleteledsportsteam(A, E) )
%   | | | then return 0.15172691675088706;  // std dev = 0,455, 109,000 (wgt'ed) examples reached here.  /* #neg=77 #pos=32 */
%   | | | else if ( athleteplayssport(A, D) )
%   | | | | then return 0.6167696247546849;  // std dev = 0,428, 58,000 (wgt'ed) examples reached here.  /* #neg=14 #pos=44 */
%   | | | | else if ( athleteplayssport(A, F) )
%   | | | | | then return 0.0268236338946931;  // std dev = 0,374, 83,000 (wgt'ed) examples reached here.  /* #neg=69 #pos=14 */
%   | | | | | else if ( athleteledsportsteam(G, B) )
%   | | | | | | then return 0.2361016910050248;  // std dev = 5,464, 127,000 (wgt'ed) examples reached here.  /* #neg=79 #pos=48 */
%   | | | | | | else return 0.30100607795665635;  // std dev = 5,877, 140,000 (wgt'ed) examples reached here.  /* #neg=78 #pos=62 */
%   | | else return 0.032061978577773244;  // std dev = 0,379, 69,000 (wgt'ed) examples reached here.  /* #neg=57 #pos=12 */


% Clauses:

athleteplaysforteam(A, B, 0.8446354215859994) :- 
     athleteledsportsteam(A, B), 
     !. // Clause #1.

athleteplaysforteam(A, B, 0.5093117257971828) :- 
     teamalsoknownas(B, C), 
     teamplaysagainstteam(B, C), 
     !. // Clause #2.

athleteplaysforteam(A, B, 0.11621345122854462) :- 
     teamalsoknownas(B, C), 
     !. // Clause #3.

athleteplaysforteam(A, B, 0.15172691675088706) :- 
     teamplayssport(B, C), 
     athleteledsportsteam(A, D), 
     !. // Clause #4.

athleteplaysforteam(A, B, 0.6167696247546849) :- 
     teamplayssport(B, C), 
     athleteplayssport(A, C), 
     !. // Clause #5.

athleteplaysforteam(A, B, 0.0268236338946931) :- 
     teamplayssport(B, C), 
     athleteplayssport(A, D), 
     !. // Clause #6.

athleteplaysforteam(A, B, 0.2361016910050248) :- 
     teamplayssport(B, C), 
     athleteledsportsteam(D, B), 
     !. // Clause #7.

athleteplaysforteam(A, B, 0.30100607795665635) :- 
     teamplayssport(B, C), 
     !. // Clause #8.

athleteplaysforteam(A, B, 0.032061978577773244) :- !. // Clause #9.


% The flattened versions of these clauses:

flattened_athleteplaysforteam(a, b, 0.8446354215859994) :-  /* #neg=1 #pos=73 */ 
   athleteledsportsteam(a, b),
   !. // Flattened version of clause #1.

flattened_athleteplaysforteam(underscore, a, 0.5093117257971828) :-  /* #neg=120 #pos=224 */ 
   teamalsoknownas(a, uniqueVar2),
   teamplaysagainstteam(a, uniqueVar2),
   !. // Flattened version of clause #2.

flattened_athleteplaysforteam(underscore, a, 0.11621345122854462) :-  /* #neg=23 #pos=8 */ 
   teamalsoknownas(a, underscore),
   !. // Flattened version of clause #3.

flattened_athleteplaysforteam(a, b, 0.15172691675088706) :-  /* #neg=77 #pos=32 */ 
   teamplayssport(b, underscore),
   athleteledsportsteam(a, underscore),
   !. // Flattened version of clause #4.

flattened_athleteplaysforteam(a, b, 0.6167696247546849) :-  /* #neg=14 #pos=44 */ 
   teamplayssport(b, uniqueVar3),
   athleteplayssport(a, uniqueVar3),
   !. // Flattened version of clause #5.

flattened_athleteplaysforteam(a, b, 0.0268236338946931) :-  /* #neg=69 #pos=14 */ 
   teamplayssport(b, underscore),
   athleteplayssport(a, underscore),
   !. // Flattened version of clause #6.

flattened_athleteplaysforteam(underscore, a, 0.2361016910050248) :-  /* #neg=79 #pos=48 */ 
   teamplayssport(a, underscore),
   athleteledsportsteam(underscore, a),
   !. // Flattened version of clause #7.

flattened_athleteplaysforteam(underscore, a, 0.30100607795665635) :-  /* #neg=78 #pos=62 */ 
   teamplayssport(a, underscore),
   !. // Flattened version of clause #8.

flattened_athleteplaysforteam(underscore, underscore, 0.032061978577773244) :-  /* #neg=57 #pos=12 */ 
   !. // Flattened version of clause #9.


% The unique flattened literals:
%   athleteledsportsteam(a, underscore)
%   teamalsoknownas(a, uniqueVar2)
%   athleteledsportsteam(a, b)
%   athleteplayssport(a, uniqueVar3)
%   teamplaysagainstteam(a, uniqueVar2)
%   teamplayssport(b, underscore)
%   teamalsoknownas(a, underscore)
%   teamplayssport(b, uniqueVar3)
%   teamplayssport(a, underscore)
%   athleteplayssport(a, underscore)
%   athleteledsportsteam(underscore, a)

@mayukhdas
Contributor

mayukhdas commented May 4, 2018

Thanks a lot for providing the information.

I am assuming you are getting proper AUC ROC and AUC PR. With probabilistic classifiers such as ours, precision and recall are not straightforward measures, because they depend on the cutoff used to turn probabilities into labels. In our code the threshold on the predicted probabilities, which decides predicted positives and negatives, has been hard-coded to 0.5 since the first development cycle. So when the predicted probabilities are all lower than 0.5 (especially with 1 tree), no example is predicted positive and precision comes out as NaN.
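
To make the NaN concrete, here is a minimal Java sketch of thresholded precision, recall and F1. It is not the BoostSRL implementation: the class and method names are hypothetical, and treating a probability greater than or equal to the threshold as a positive prediction is an assumption. When no example reaches the threshold, precision is 0/0, which is NaN in floating point, recall is 0, and F1 is NaN, matching the reported output.

```java
final class ThresholdMetrics {

    /** Returns {precision, recall, f1} for probabilities cut at the given threshold. */
    static double[] metrics(double[] probs, boolean[] labels, double threshold) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < probs.length; i++) {
            // Assumption: ">=" decides a positive prediction; the real code may use ">".
            boolean predictedPositive = probs[i] >= threshold;
            if (predictedPositive && labels[i]) tp++;
            else if (predictedPositive) fp++;
            else if (labels[i]) fn++;
        }
        // Floating-point division: 0.0 / 0.0 yields NaN instead of throwing.
        double precision = (double) tp / (tp + fp);
        double recall = (double) tp / (tp + fn);
        double f1 = 2 * precision * recall / (precision + recall);
        return new double[] {precision, recall, f1};
    }

    public static void main(String[] args) {
        // Illustrative probabilities, all well below 0.5.
        double[] probs = {0.28, 0.28, 0.17, 0.15, 0.17, 0.22, 0.15, 0.22};
        boolean[] labels = {true, true, true, true, false, false, false, false};

        double[] atHalf = metrics(probs, labels, 0.5);
        System.out.println("precision=" + atHalf[0] + " recall=" + atHalf[1] + " f1=" + atHalf[2]);

        double[] atTenth = metrics(probs, labels, 0.1);
        System.out.println("precision=" + atTenth[0] + " recall=" + atTenth[1] + " f1=" + atTenth[2]);
    }
}
```

With a threshold of 0.1 every example in this toy set is predicted positive, so recall becomes 1.0 and precision equals the fraction of positives.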

We will take this up as an open issue and make the threshold dynamic/customizable through our next full release cycle.

However, as a quick fix on your side, if you are using the source code directly, you can change the threshold for each data set and see what works.

Just go to the class "edu.wisc.cs.will.boosting.RDN.RunBoostedRDN.java", change the threshold from 0.5 to your preferred value in the infer() method (lines 455-458), and recompile.

public void infer() {
	InferBoostedRDN infer = new InferBoostedRDN(cmdArgs, setup);
	// The second argument is the hard-coded threshold; replace 0.5 with your value.
	infer.runInference(fullModel, 0.5);
}

Or you may just use the AUCs as your performance metric, if that is suitable for you.

Thanks
--Mayukh

@rodrigoazs
Author

I'm getting proper AUC ROC and AUC PR; however, all the examples are being classified as False. Positive examples get probabilities under 0.3, while negative examples get probabilities above 0.7 (reported as 1 - prob). I've changed some parameters, such as treeDepth and numOfClauses, but I didn't get better results.

athleteplaysforteam(jj_putz, new_york_mets) 0.1487444500776306
athleteplaysforteam(rafer_alston, houston_rockets) 0.2778072427204634
athleteplaysforteam(evgeni_malkin, penguins) 0.2778072427204634
athleteplaysforteam(chris_bosh, toronto_raptors) 0.2778072427204634
athleteplaysforteam(matt_holliday, rockies) 0.2778072427204634
athleteplaysforteam(michael_leighton, carolina) 0.16513046764826847
!athleteplaysforteam(jose_calderon, leafs) 0.8348695323517316
!athleteplaysforteam(john_stockton, seahawks) 0.7847983085615956
!athleteplaysforteam(jason_varitek, toronto_raptors) 0.8512555499223694
!athleteplaysforteam(mike_richter, jacksonville_jaguars) 0.7792490418814738
!athleteplaysforteam(jeff_teague, rockies) 0.7847983085615956
!athleteplaysforteam(andrew_alberts, maple_leafs) 0.8348695323517316
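
These probabilities are consistent with the single-tree scoring clauses in the log earlier in this thread, where `Total is LogPrior + Total1` with `logPrior(-1.8)` and `stepLength_tree1(1.0)`. The Java sketch below assumes the total score is mapped to a probability with the logistic sigmoid; that mapping is not shown in the log, but it reproduces the values listed above. The class and method names are hypothetical.

```java
final class SingleTreeProbability {

    /** Probability for one tree: sigmoid(logPrior + leafValue * stepLength). */
    static double probability(double logPrior, double leafValue, double stepLength) {
        double total = logPrior + leafValue * stepLength;
        // Assumption: logistic sigmoid maps the summed score to a probability.
        return 1.0 / (1.0 + Math.exp(-total));
    }

    public static void main(String[] args) {
        double logPrior = -1.8;   // logPrior(-1.8) from the log
        double stepLength = 1.0;  // stepLength_tree1(1.0) from the NELL run

        // Largest leaf value of NELL tree #1: roughly 0.278, still far below 0.5.
        System.out.println(probability(logPrior, 0.8446354215859994, stepLength));

        // A low-scoring leaf of the same tree: roughly 0.149.
        System.out.println(probability(logPrior, 0.055517356152143954, stepLength));
    }
}
```

Under this assumption a single tree can only push an example past 0.5 if its leaf value times the step length exceeds 1.8, and the largest leaf here is about 0.84. That is why every example is classified as False at the 0.5 threshold.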

The best threshold is 0.19016607954333642.

% F1 = 0.9971346704871061
% Threshold = 0.19016607954333642

%   AUC ROC   = 0,738357
%   AUC PR    = 0,762346
%   CLL	      = -0,876825
%   Precision = NaN at threshold = 0,500
%   Recall    = 0,000000
%   F1        = NaN
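
For reference, a data-driven threshold like the "best threshold" above could be picked by trying each predicted probability as a candidate cutoff and keeping the one with the highest F1. The Java sketch below illustrates that idea only; it is not how BoostSRL computes its reported value, and the class name, method names and toy data are hypothetical.

```java
final class BestF1Threshold {

    /** F1 at a threshold, using F1 = 2*TP / (2*TP + FP + FN); returns 0 when undefined. */
    static double f1At(double[] probs, boolean[] labels, double threshold) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < probs.length; i++) {
            boolean predictedPositive = probs[i] >= threshold; // assumed convention
            if (predictedPositive && labels[i]) tp++;
            else if (predictedPositive) fp++;
            else if (labels[i]) fn++;
        }
        int denominator = 2 * tp + fp + fn;
        return denominator == 0 ? 0.0 : 2.0 * tp / denominator;
    }

    /** Tries every predicted probability as a cutoff and returns the one with the best F1. */
    static double bestThreshold(double[] probs, boolean[] labels) {
        double best = Double.NaN;
        double bestF1 = -1.0;
        for (double candidate : probs) {
            double f1 = f1At(probs, labels, candidate);
            if (f1 > bestF1) {
                bestF1 = f1;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] probs = {0.28, 0.28, 0.17, 0.22, 0.15, 0.15};
        boolean[] labels = {true, true, true, false, false, false};
        double threshold = bestThreshold(probs, labels);
        System.out.println("threshold=" + threshold + " f1=" + f1At(probs, labels, threshold));
    }
}
```

A threshold tuned on the same examples it is evaluated on will look optimistic, so it is safer to choose it on held-out data.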

In addition, is there a way to infer using the combined regression tree? I'm currently removing the boosted trees, renaming the combined tree to regressionTree0 and modifying the model file.

Thanks.

@mayukhdas
Contributor

Hi Rodrigo,

Were you able to change the hard-coded threshold in the source code to a different (lower) value, as I outlined in my previous message, and then try running it?

Thanks
--Mayukh

@mayukhdas
Contributor

As you can see, Precision = NaN is reported at threshold 0.5.
In your example the threshold should be around 0.19, or 0.2 at most.
Try changing the source code at the place I pointed out and see.

We will make the threshold dynamic in our next release.

@rodrigoazs
Author

Hi Mayukh,

Yes, I changed it and I can see the scores now.

%   AUC ROC   = 0.658075
%   AUC PR    = 0.565851
%   CLL	      = -0.862083
%   Precision = 0.394191 at threshold = 0.100
%   Recall    = 1.000000
%   F1        = 0.565476

Thank you.

@mayukhdas mayukhdas added this to the Under Development milestone May 15, 2018
@mayukhdas mayukhdas mentioned this issue May 17, 2018
@starling-lab starling-lab deleted a comment from boost-starai Jul 12, 2018