Classification Report for GPT-4o-mini

Consolidated Classification Report

| Test | Total Samples | True Violations | Predicted Violations | Accuracy | Precision (macro) | Recall (macro) | F1-score (macro) | Precision (weighted) | Recall (weighted) | F1-score (weighted) |
|---|---|---|---|---|---|---|---|---|---|---|
| base | 411 | 248 | 312 | 0.815 | 0.858 | 0.773 | 0.787 | 0.841 | 0.815 | 0.803 |
| 0%_64 | 411 | 248 | 310 | 0.791 | 0.821 | 0.749 | 0.76 | 0.809 | 0.791 | 0.778 |
| 0%_128 | 411 | 248 | 337 | 0.764 | 0.835 | 0.707 | 0.712 | 0.812 | 0.764 | 0.738 |
| 0%_256 | 411 | 248 | 338 | 0.752 | 0.817 | 0.693 | 0.697 | 0.796 | 0.752 | 0.724 |
| 0%_512 | 411 | 248 | 332 | 0.752 | 0.803 | 0.697 | 0.701 | 0.786 | 0.752 | 0.727 |
| 0%_1024 | 411 | 248 | 325 | 0.779 | 0.83 | 0.728 | 0.738 | 0.812 | 0.779 | 0.759 |
| 0%_2048 | 411 | 248 | 340 | 0.718 | 0.763 | 0.657 | 0.653 | 0.748 | 0.718 | 0.684 |
| 0%_4096 | 411 | 248 | 385 | 0.662 | 0.802 | 0.575 | 0.522 | 0.768 | 0.662 | 0.576 |
| 0%_8192 | 411 | 248 | 392 | 0.645 | 0.789 | 0.553 | 0.485 | 0.756 | 0.645 | 0.544 |
| 0%_16384 | 411 | 248 | 399 | 0.628 | 0.768 | 0.532 | 0.445 | 0.737 | 0.628 | 0.511 |
| 0%_32768 | 411 | 248 | 400 | 0.625 | 0.763 | 0.529 | 0.439 | 0.733 | 0.625 | 0.506 |
| 25%_64 | 411 | 248 | 341 | 0.735 | 0.795 | 0.674 | 0.674 | 0.776 | 0.735 | 0.703 |
| 25%_128 | 411 | 248 | 352 | 0.723 | 0.803 | 0.656 | 0.648 | 0.78 | 0.723 | 0.682 |
| 25%_256 | 411 | 248 | 360 | 0.708 | 0.8 | 0.636 | 0.621 | 0.774 | 0.708 | 0.659 |
| 25%_512 | 411 | 248 | 343 | 0.72 | 0.773 | 0.658 | 0.654 | 0.757 | 0.72 | 0.685 |
| 25%_1024 | 411 | 248 | 365 | 0.686 | 0.766 | 0.611 | 0.586 | 0.745 | 0.686 | 0.628 |
| 25%_2048 | 411 | 248 | 350 | 0.713 | 0.777 | 0.646 | 0.638 | 0.758 | 0.713 | 0.672 |
| 25%_4096 | 411 | 248 | 393 | 0.642 | 0.786 | 0.55 | 0.479 | 0.754 | 0.642 | 0.54 |
| 25%_8192 | 411 | 248 | 402 | 0.62 | 0.752 | 0.523 | 0.427 | 0.723 | 0.62 | 0.495 |
| 25%_16384 | 411 | 248 | 402 | 0.611 | 0.638 | 0.512 | 0.412 | 0.632 | 0.611 | 0.483 |
| 25%_32768 | 411 | 248 | 377 | 0.594 | 0.524 | 0.508 | 0.443 | 0.541 | 0.594 | 0.503 |
| 50%_64 | 411 | 248 | 343 | 0.725 | 0.782 | 0.663 | 0.66 | 0.765 | 0.725 | 0.691 |
| 50%_128 | 411 | 248 | 346 | 0.742 | 0.822 | 0.679 | 0.678 | 0.798 | 0.742 | 0.708 |
| 50%_256 | 411 | 248 | 332 | 0.723 | 0.756 | 0.666 | 0.666 | 0.745 | 0.723 | 0.695 |
| 50%_512 | 411 | 248 | 340 | 0.713 | 0.754 | 0.652 | 0.648 | 0.741 | 0.713 | 0.679 |
| 50%_1024 | 411 | 248 | 354 | 0.713 | 0.789 | 0.644 | 0.634 | 0.767 | 0.713 | 0.669 |
| 50%_2048 | 411 | 248 | 372 | 0.674 | 0.763 | 0.594 | 0.56 | 0.74 | 0.674 | 0.606 |
| 50%_4096 | 411 | 248 | 399 | 0.628 | 0.768 | 0.532 | 0.445 | 0.737 | 0.628 | 0.511 |
| 50%_8192 | 411 | 248 | 401 | 0.628 | 0.809 | 0.531 | 0.44 | 0.77 | 0.628 | 0.507 |
| 50%_16384 | 411 | 248 | 382 | 0.601 | 0.546 | 0.513 | 0.443 | 0.56 | 0.601 | 0.504 |
| 50%_32768 | 411 | 248 | 371 | 0.574 | 0.474 | 0.491 | 0.428 | 0.5 | 0.574 | 0.488 |
| 75%_64 | 411 | 248 | 339 | 0.73 | 0.782 | 0.67 | 0.669 | 0.765 | 0.73 | 0.699 |
| 75%_128 | 411 | 248 | 357 | 0.715 | 0.805 | 0.645 | 0.634 | 0.78 | 0.715 | 0.669 |
| 75%_256 | 411 | 248 | 339 | 0.701 | 0.731 | 0.64 | 0.634 | 0.721 | 0.701 | 0.666 |
| 75%_512 | 411 | 248 | 331 | 0.725 | 0.758 | 0.669 | 0.67 | 0.747 | 0.725 | 0.698 |
| 75%_1024 | 411 | 248 | 371 | 0.681 | 0.779 | 0.602 | 0.572 | 0.754 | 0.681 | 0.616 |
| 75%_2048 | 411 | 248 | 364 | 0.674 | 0.733 | 0.598 | 0.571 | 0.717 | 0.674 | 0.615 |
| 75%_4096 | 411 | 248 | 397 | 0.618 | 0.664 | 0.523 | 0.435 | 0.654 | 0.618 | 0.501 |
| 75%_8192 | 411 | 248 | 396 | 0.601 | 0.536 | 0.505 | 0.412 | 0.551 | 0.601 | 0.481 |
| 75%_16384 | 411 | 248 | 379 | 0.594 | 0.522 | 0.507 | 0.439 | 0.54 | 0.594 | 0.5 |
| 75%_32768 | 411 | 248 | 385 | 0.569 | 0.411 | 0.478 | 0.392 | 0.449 | 0.569 | 0.46 |
| 100%_64 | 411 | 248 | 334 | 0.747 | 0.799 | 0.69 | 0.694 | 0.782 | 0.747 | 0.72 |
| 100%_128 | 411 | 248 | 342 | 0.713 | 0.758 | 0.651 | 0.646 | 0.744 | 0.713 | 0.678 |
| 100%_256 | 411 | 248 | 341 | 0.706 | 0.743 | 0.644 | 0.638 | 0.731 | 0.706 | 0.67 |
| 100%_512 | 411 | 248 | 343 | 0.696 | 0.729 | 0.632 | 0.624 | 0.719 | 0.696 | 0.658 |
| 100%_1024 | 411 | 248 | 365 | 0.676 | 0.742 | 0.6 | 0.573 | 0.724 | 0.676 | 0.617 |
| 100%_2048 | 411 | 248 | 382 | 0.655 | 0.75 | 0.569 | 0.518 | 0.727 | 0.655 | 0.571 |
| 100%_4096 | 411 | 248 | 397 | 0.633 | 0.775 | 0.538 | 0.456 | 0.744 | 0.633 | 0.52 |
| 100%_8192 | 411 | 248 | 376 | 0.635 | 0.658 | 0.551 | 0.501 | 0.652 | 0.635 | 0.555 |
| 100%_16384 | 411 | 248 | 383 | 0.584 | 0.479 | 0.494 | 0.417 | 0.504 | 0.584 | 0.481 |
| 100%_32768 | 411 | 248 | 385 | 0.589 | 0.494 | 0.498 | 0.419 | 0.516 | 0.589 | 0.484 |
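
The macro and weighted columns can be cross-checked from the count columns alone. The sketch below is illustrative and rests on two assumptions not stated in the report: that the task is binary (violation vs. non-violation), and that "macro" is the unweighted mean over the two classes while "weighted" is the support-weighted mean. The function names `confusion_from_summary` and `macro_and_weighted` are hypothetical helpers, not part of the original evaluation code.

```python
def confusion_from_summary(total, true_pos, pred_pos, accuracy):
    """Recover (tp, fp, fn, tn) for an assumed binary task from summary columns."""
    # TP + TN; the accuracy in the table is rounded, so round back to a count.
    correct = round(accuracy * total)
    # From TN = TP + total - pred_pos - true_pos and correct = TP + TN.
    twice_tp = correct - total + true_pos + pred_pos
    if twice_tp < 0 or twice_tp % 2:
        raise ValueError("summary values are inconsistent after rounding")
    tp = twice_tp // 2
    fp = pred_pos - tp
    fn = true_pos - tp
    tn = correct - tp
    if min(fp, fn, tn) < 0:
        raise ValueError("summary values are inconsistent after rounding")
    return tp, fp, fn, tn


def _div(a, b):
    # Treat an undefined ratio (empty denominator) as 0.0.
    return a / b if b else 0.0


def macro_and_weighted(tp, fp, fn, tn):
    """Macro and support-weighted precision, recall, and F1 over both classes."""
    supports = [tp + fn, tn + fp]  # true violations, true non-violations
    precisions = [_div(tp, tp + fp), _div(tn, tn + fn)]
    recalls = [_div(tp, tp + fn), _div(tn, tn + fp)]
    f1s = [_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]
    n = sum(supports)
    out = {}
    for name, per_class in (("precision", precisions),
                            ("recall", recalls),
                            ("f1", f1s)):
        out[name + "_macro"] = sum(per_class) / len(per_class)
        out[name + "_weighted"] = _div(
            sum(s * v for s, v in zip(supports, per_class)), n
        )
    return out


# Values taken from the "base" row of the table.
tp, fp, fn, tn = confusion_from_summary(411, 248, 312, 0.815)
metrics = macro_and_weighted(tp, fp, fn, tn)
print((tp, fp, fn, tn))
print({k: round(v, 3) for k, v in metrics.items()})
```

Under these assumptions the base row corresponds to 242 true positives, 70 false positives, 6 false negatives, and 93 true negatives, and the recomputed metrics agree with the reported values to three decimals. Weighted recall equals accuracy by construction, which is why those two columns are identical in every row.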