Classification Report for Mistral Large 2

Invalid Model Output Count

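| Test | Invalid Count |
| --- | --- |
| base | 1 |
| 0%_64 | 1 |
| 0%_128 | 1 |
| 0%_256 | 1 |
| 0%_512 | 1 |
| 0%_1024 | 1 |
| 0%_2048 | 1 |
| 0%_4096 | 1 |
| 0%_8192 | 1 |
| 0%_16384 | 2 |
| 0%_32768 | 6 |
| 25%_64 | 1 |
| 25%_128 | 1 |
| 25%_256 | 1 |
| 25%_512 | 1 |
| 25%_1024 | 1 |
| 25%_4096 | 1 |
| 25%_8192 | 1 |
| 25%_16384 | 2 |
| 25%_32768 | 8 |
| 50%_64 | 1 |
| 50%_128 | 1 |
| 50%_256 | 1 |
| 50%_512 | 1 |
| 50%_2048 | 1 |
| 50%_4096 | 1 |
| 50%_8192 | 1 |
| 50%_16384 | 2 |
| 50%_32768 | 33 |
| 75%_128 | 1 |
| 75%_256 | 1 |
| 75%_512 | 1 |
| 75%_1024 | 1 |
| 75%_4096 | 1 |
| 75%_8192 | 3 |
| 75%_16384 | 6 |
| 75%_32768 | 4 |
| 100%_64 | 1 |
| 100%_256 | 1 |
| 100%_1024 | 1 |
| 100%_8192 | 1 |
| 100%_32768 | 3 |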
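
The invalid counts line up with the Total Samples column of the consolidated report below: each test listed here reports 411 minus its invalid count, and tests absent from this table report 411. The sketch below checks that relation on a few rows. It assumes, as an inference from the numbers rather than anything the report states, that every test starts from 411 items and that invalid outputs are dropped before scoring. `EXPECTED_ITEMS` and `scored_samples` are illustrative names.

```python
EXPECTED_ITEMS = 411  # assumption: inferred from the two tables, not stated in the report


def scored_samples(invalid_count: int, expected_items: int = EXPECTED_ITEMS) -> int:
    """Return how many samples remain once invalid model outputs are dropped."""
    if not 0 <= invalid_count <= expected_items:
        raise ValueError("invalid_count must be between 0 and expected_items")
    return expected_items - invalid_count


# (invalid count, reported Total Samples) pairs copied from the two tables.
reported = {
    "base": (1, 410),
    "0%_32768": (6, 405),
    "50%_32768": (33, 378),
    "25%_2048": (0, 411),  # not listed in the invalid-count table
}

for test, (invalid, total) in reported.items():
    assert scored_samples(invalid) == total, test
```

If the relation holds for every row, the invalid-count table only needs to list the tests with at least one invalid output, which matches how it is presented.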

Consolidated Classification Report

| Test | Total Samples | True Violations | Predicted Violations | Accuracy | Precision (macro) | Recall (macro) | F1-score (macro) | Precision (weighted) | Recall (weighted) | F1-score (weighted) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| base | 410 | 247 | 256 | 0.871 | 0.868 | 0.86 | 0.864 | 0.87 | 0.871 | 0.87 |
| 0%_64 | 410 | 247 | 252 | 0.856 | 0.851 | 0.847 | 0.849 | 0.856 | 0.856 | 0.856 |
| 0%_128 | 410 | 247 | 255 | 0.859 | 0.855 | 0.848 | 0.851 | 0.858 | 0.859 | 0.858 |
| 0%_256 | 410 | 247 | 252 | 0.861 | 0.856 | 0.852 | 0.854 | 0.86 | 0.861 | 0.861 |
| 0%_512 | 410 | 247 | 254 | 0.856 | 0.852 | 0.846 | 0.849 | 0.855 | 0.856 | 0.856 |
| 0%_1024 | 410 | 247 | 258 | 0.851 | 0.848 | 0.839 | 0.843 | 0.851 | 0.851 | 0.85 |
| 0%_2048 | 410 | 247 | 246 | 0.841 | 0.834 | 0.835 | 0.835 | 0.842 | 0.841 | 0.842 |
| 0%_4096 | 410 | 247 | 248 | 0.846 | 0.84 | 0.839 | 0.839 | 0.846 | 0.846 | 0.846 |
| 0%_8192 | 410 | 247 | 259 | 0.829 | 0.825 | 0.816 | 0.819 | 0.828 | 0.829 | 0.828 |
| 0%_16384 | 409 | 246 | 248 | 0.839 | 0.832 | 0.831 | 0.831 | 0.838 | 0.839 | 0.838 |
| 0%_32768 | 405 | 243 | 331 | 0.714 | 0.751 | 0.656 | 0.653 | 0.739 | 0.714 | 0.682 |
| 25%_64 | 410 | 247 | 266 | 0.832 | 0.83 | 0.814 | 0.82 | 0.831 | 0.832 | 0.83 |
| 25%_128 | 410 | 247 | 266 | 0.841 | 0.841 | 0.825 | 0.831 | 0.841 | 0.841 | 0.839 |
| 25%_256 | 410 | 247 | 262 | 0.841 | 0.839 | 0.827 | 0.832 | 0.841 | 0.841 | 0.84 |
| 25%_512 | 410 | 247 | 269 | 0.834 | 0.835 | 0.815 | 0.822 | 0.834 | 0.834 | 0.832 |
| 25%_1024 | 410 | 247 | 278 | 0.817 | 0.821 | 0.793 | 0.801 | 0.819 | 0.817 | 0.813 |
| 25%_2048 | 411 | 248 | 276 | 0.815 | 0.817 | 0.792 | 0.8 | 0.816 | 0.815 | 0.811 |
| 25%_4096 | 410 | 247 | 279 | 0.81 | 0.814 | 0.785 | 0.793 | 0.811 | 0.81 | 0.805 |
| 25%_8192 | 410 | 247 | 336 | 0.695 | 0.719 | 0.635 | 0.629 | 0.711 | 0.695 | 0.661 |
| 25%_16384 | 409 | 246 | 347 | 0.699 | 0.75 | 0.634 | 0.623 | 0.735 | 0.699 | 0.657 |
| 25%_32768 | 403 | 244 | 365 | 0.615 | 0.587 | 0.531 | 0.479 | 0.595 | 0.615 | 0.535 |
| 50%_64 | 410 | 247 | 271 | 0.839 | 0.841 | 0.819 | 0.827 | 0.84 | 0.839 | 0.836 |
| 50%_128 | 410 | 247 | 275 | 0.829 | 0.833 | 0.807 | 0.816 | 0.831 | 0.829 | 0.826 |
| 50%_256 | 410 | 247 | 271 | 0.844 | 0.847 | 0.825 | 0.832 | 0.845 | 0.844 | 0.841 |
| 50%_512 | 410 | 247 | 275 | 0.824 | 0.828 | 0.802 | 0.81 | 0.826 | 0.824 | 0.821 |
| 50%_1024 | 411 | 248 | 280 | 0.82 | 0.825 | 0.795 | 0.804 | 0.822 | 0.82 | 0.816 |
| 50%_2048 | 410 | 247 | 260 | 0.822 | 0.817 | 0.807 | 0.811 | 0.821 | 0.822 | 0.821 |
| 50%_4096 | 410 | 247 | 296 | 0.763 | 0.771 | 0.727 | 0.736 | 0.768 | 0.763 | 0.753 |
| 50%_8192 | 410 | 247 | 351 | 0.668 | 0.693 | 0.6 | 0.58 | 0.686 | 0.668 | 0.619 |
| 50%_16384 | 409 | 246 | 355 | 0.66 | 0.686 | 0.589 | 0.564 | 0.679 | 0.66 | 0.606 |
| 50%_32768 | 378 | 229 | 266 | 0.706 | 0.696 | 0.671 | 0.675 | 0.701 | 0.706 | 0.697 |
| 75%_64 | 411 | 248 | 276 | 0.83 | 0.833 | 0.807 | 0.816 | 0.831 | 0.83 | 0.826 |
| 75%_128 | 410 | 247 | 277 | 0.834 | 0.84 | 0.811 | 0.82 | 0.837 | 0.834 | 0.83 |
| 75%_256 | 410 | 247 | 287 | 0.815 | 0.826 | 0.786 | 0.796 | 0.82 | 0.815 | 0.809 |
| 75%_512 | 410 | 247 | 275 | 0.829 | 0.833 | 0.807 | 0.816 | 0.831 | 0.829 | 0.826 |
| 75%_1024 | 410 | 247 | 271 | 0.834 | 0.836 | 0.814 | 0.822 | 0.835 | 0.834 | 0.831 |
| 75%_2048 | 411 | 248 | 235 | 0.832 | 0.824 | 0.831 | 0.827 | 0.836 | 0.832 | 0.833 |
| 75%_4096 | 410 | 247 | 283 | 0.78 | 0.782 | 0.752 | 0.76 | 0.781 | 0.78 | 0.774 |
| 75%_8192 | 408 | 245 | 322 | 0.694 | 0.704 | 0.641 | 0.639 | 0.7 | 0.694 | 0.667 |
| 75%_16384 | 405 | 246 | 318 | 0.63 | 0.601 | 0.572 | 0.562 | 0.612 | 0.63 | 0.599 |
| 75%_32768 | 407 | 245 | 175 | 0.646 | 0.664 | 0.667 | 0.646 | 0.689 | 0.646 | 0.648 |
| 100%_64 | 410 | 247 | 277 | 0.834 | 0.84 | 0.811 | 0.82 | 0.837 | 0.834 | 0.83 |
| 100%_128 | 411 | 248 | 280 | 0.83 | 0.836 | 0.805 | 0.815 | 0.833 | 0.83 | 0.826 |
| 100%_256 | 410 | 247 | 283 | 0.82 | 0.828 | 0.793 | 0.803 | 0.823 | 0.82 | 0.814 |
| 100%_512 | 411 | 248 | 281 | 0.832 | 0.84 | 0.807 | 0.817 | 0.836 | 0.832 | 0.828 |
| 100%_1024 | 410 | 247 | 286 | 0.812 | 0.822 | 0.784 | 0.794 | 0.817 | 0.812 | 0.806 |
| 100%_2048 | 411 | 248 | 286 | 0.796 | 0.801 | 0.767 | 0.776 | 0.798 | 0.796 | 0.789 |
| 100%_4096 | 411 | 248 | 308 | 0.752 | 0.767 | 0.709 | 0.717 | 0.76 | 0.752 | 0.737 |
| 100%_8192 | 410 | 247 | 340 | 0.715 | 0.76 | 0.654 | 0.649 | 0.746 | 0.715 | 0.68 |
| 100%_16384 | 411 | 248 | 212 | 0.723 | 0.72 | 0.729 | 0.719 | 0.74 | 0.723 | 0.726 |
| 100%_32768 | 408 | 246 | 323 | 0.64 | 0.621 | 0.583 | 0.573 | 0.627 | 0.64 | 0.608 |
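
The report does not say how the macro and weighted columns were computed. The sketch below assumes the usual two-class definitions: per-class precision, recall, and F1 for "violation" and "no violation", averaged either equally (macro) or by class support (weighted). It first recovers a confusion matrix from a row's Total Samples, True Violations, Predicted Violations, and Accuracy, then recomputes the remaining columns. The function names are illustrative. Under these assumptions the `base` row is reproduced to three decimals; other rows have not been checked here.

```python
def confusion_from_summary(total, true_violations, predicted_violations, accuracy):
    """Recover (tp, fp, fn, tn) from one row of the consolidated table.

    With tp unknown: fp = predicted - tp, fn = true - tp, tn = total - true - fp,
    so accuracy * total = tp + tn = 2 * tp + total - true - predicted.
    Accuracy is rounded in the table, hence the round() on tp.
    """
    tp = round((accuracy * total - total + true_violations + predicted_violations) / 2)
    fp = predicted_violations - tp
    fn = true_violations - tp
    tn = total - true_violations - fp
    return tp, fp, fn, tn


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def summarize(tp, fp, fn, tn):
    """Compute accuracy plus macro and weighted precision, recall, and F1."""
    per_class = []
    # First tuple scores the "violation" class, second the "no violation" class.
    for hit, false_alarm, miss in ((tp, fp, fn), (tn, fn, fp)):
        precision = safe_div(hit, hit + false_alarm)
        recall = safe_div(hit, hit + miss)
        f1 = safe_div(2 * precision * recall, precision + recall)
        per_class.append((precision, recall, f1, hit + miss))  # last item: support

    total = tp + fp + fn + tn
    report = {"accuracy": safe_div(tp + tn, total)}
    for i, name in enumerate(("precision", "recall", "f1")):
        report[f"{name}_macro"] = sum(c[i] for c in per_class) / len(per_class)
        report[f"{name}_weighted"] = safe_div(sum(c[i] * c[3] for c in per_class), total)
    return report


# Values from the `base` row of the consolidated table.
tp, fp, fn, tn = confusion_from_summary(410, 247, 256, 0.871)
print((tp, fp, fn, tn))  # (225, 31, 22, 132)
print({name: round(value, 3) for name, value in summarize(tp, fp, fn, tn).items()})
```

Weighted recall equals accuracy by construction, which is why those two columns match in every row of the table.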