
Classification Report for LLaMa3.1 70B

Invalid Model Output Count

| Test | Invalid Count |
| --- | --- |
| 0%_512 | 1 |
| 25%_64 | 2 |
| 25%_512 | 1 |
| 50%_64 | 1 |
| 50%_1024 | 1 |
| 50%_4096 | 2 |
| 50%_32768 | 1 |
| 75%_64 | 1 |
| 75%_512 | 1 |
| 75%_32768 | 1 |
| 100%_64 | 1 |
| 100%_128 | 1 |
| 100%_256 | 1 |
| 100%_512 | 1 |
| 100%_2048 | 1 |

Tests with invalid model outputs have correspondingly fewer total samples in the consolidated report below (for example, 25%_64, with 2 invalid outputs, has 409 samples instead of 411).

Consolidated Classification Report

| Test | Total Samples | True Violations | Predicted Violations | Accuracy | Precision (macro) | Recall (macro) | F1-score (macro) | Precision (weighted) | Recall (weighted) | F1-score (weighted) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| base | 411 | 248 | 243 | 0.856 | 0.849 | 0.853 | 0.851 | 0.857 | 0.856 | 0.857 |
| 0%_64 | 411 | 248 | 225 | 0.798 | 0.791 | 0.801 | 0.793 | 0.807 | 0.798 | 0.8 |
| 0%_128 | 411 | 248 | 233 | 0.793 | 0.784 | 0.792 | 0.787 | 0.798 | 0.793 | 0.795 |
| 0%_256 | 411 | 248 | 225 | 0.808 | 0.801 | 0.811 | 0.803 | 0.816 | 0.808 | 0.809 |
| 0%_512 | 410 | 247 | 208 | 0.768 | 0.767 | 0.778 | 0.765 | 0.787 | 0.768 | 0.771 |
| 0%_1024 | 411 | 248 | 197 | 0.715 | 0.72 | 0.729 | 0.713 | 0.743 | 0.715 | 0.718 |
| 0%_2048 | 411 | 248 | 72 | 0.513 | 0.639 | 0.584 | 0.488 | 0.679 | 0.513 | 0.465 |
| 0%_4096 | 411 | 248 | 86 | 0.543 | 0.655 | 0.607 | 0.526 | 0.695 | 0.543 | 0.508 |
| 0%_8192 | 411 | 248 | 110 | 0.616 | 0.709 | 0.671 | 0.609 | 0.75 | 0.616 | 0.599 |
| 0%_16384 | 411 | 248 | 125 | 0.618 | 0.687 | 0.666 | 0.615 | 0.724 | 0.618 | 0.607 |
| 0%_32768 | 411 | 248 | 80 | 0.543 | 0.669 | 0.61 | 0.523 | 0.711 | 0.543 | 0.503 |
| 25%_64 | 409 | 247 | 259 | 0.848 | 0.845 | 0.835 | 0.839 | 0.848 | 0.848 | 0.847 |
| 25%_128 | 411 | 248 | 266 | 0.835 | 0.833 | 0.818 | 0.823 | 0.834 | 0.835 | 0.833 |
| 25%_256 | 411 | 248 | 264 | 0.844 | 0.843 | 0.829 | 0.834 | 0.844 | 0.844 | 0.843 |
| 25%_512 | 410 | 247 | 259 | 0.854 | 0.851 | 0.841 | 0.845 | 0.853 | 0.854 | 0.853 |
| 25%_1024 | 411 | 248 | 273 | 0.808 | 0.807 | 0.786 | 0.793 | 0.807 | 0.808 | 0.804 |
| 25%_2048 | 411 | 248 | 258 | 0.805 | 0.798 | 0.791 | 0.794 | 0.804 | 0.805 | 0.804 |
| 25%_4096 | 411 | 248 | 279 | 0.754 | 0.749 | 0.727 | 0.733 | 0.752 | 0.754 | 0.749 |
| 25%_8192 | 411 | 248 | 286 | 0.766 | 0.767 | 0.736 | 0.743 | 0.767 | 0.766 | 0.759 |
| 25%_16384 | 411 | 248 | 310 | 0.723 | 0.729 | 0.678 | 0.682 | 0.727 | 0.723 | 0.705 |
| 25%_32768 | 411 | 248 | 265 | 0.652 | 0.633 | 0.628 | 0.629 | 0.647 | 0.652 | 0.648 |
| 50%_64 | 410 | 247 | 266 | 0.851 | 0.852 | 0.835 | 0.841 | 0.851 | 0.851 | 0.849 |
| 50%_128 | 411 | 248 | 263 | 0.847 | 0.845 | 0.832 | 0.837 | 0.846 | 0.847 | 0.845 |
| 50%_256 | 411 | 248 | 261 | 0.847 | 0.844 | 0.833 | 0.837 | 0.846 | 0.847 | 0.845 |
| 50%_512 | 411 | 248 | 261 | 0.842 | 0.839 | 0.828 | 0.832 | 0.841 | 0.842 | 0.841 |
| 50%_1024 | 410 | 247 | 262 | 0.807 | 0.802 | 0.791 | 0.795 | 0.806 | 0.807 | 0.806 |
| 50%_2048 | 411 | 248 | 269 | 0.764 | 0.757 | 0.742 | 0.747 | 0.762 | 0.764 | 0.761 |
| 50%_4096 | 409 | 247 | 280 | 0.773 | 0.771 | 0.745 | 0.752 | 0.772 | 0.773 | 0.767 |
| 50%_8192 | 411 | 248 | 275 | 0.745 | 0.737 | 0.719 | 0.724 | 0.741 | 0.745 | 0.74 |
| 50%_16384 | 411 | 248 | 277 | 0.662 | 0.643 | 0.631 | 0.634 | 0.654 | 0.662 | 0.655 |
| 50%_32768 | 410 | 247 | 175 | 0.663 | 0.682 | 0.686 | 0.663 | 0.709 | 0.663 | 0.665 |
| 75%_64 | 410 | 247 | 264 | 0.846 | 0.845 | 0.831 | 0.836 | 0.846 | 0.846 | 0.845 |
| 75%_128 | 411 | 248 | 265 | 0.847 | 0.846 | 0.831 | 0.837 | 0.846 | 0.847 | 0.845 |
| 75%_256 | 411 | 248 | 262 | 0.844 | 0.842 | 0.83 | 0.835 | 0.844 | 0.844 | 0.843 |
| 75%_512 | 410 | 247 | 265 | 0.834 | 0.833 | 0.817 | 0.823 | 0.834 | 0.834 | 0.832 |
| 75%_1024 | 411 | 248 | 269 | 0.813 | 0.81 | 0.793 | 0.799 | 0.812 | 0.813 | 0.81 |
| 75%_2048 | 411 | 248 | 263 | 0.798 | 0.792 | 0.781 | 0.785 | 0.796 | 0.798 | 0.796 |
| 75%_4096 | 411 | 248 | 274 | 0.752 | 0.745 | 0.727 | 0.732 | 0.749 | 0.752 | 0.747 |
| 75%_8192 | 411 | 248 | 235 | 0.701 | 0.69 | 0.694 | 0.691 | 0.706 | 0.701 | 0.702 |
| 75%_16384 | 411 | 248 | 186 | 0.64 | 0.651 | 0.656 | 0.639 | 0.675 | 0.64 | 0.643 |
| 75%_32768 | 410 | 247 | 171 | 0.639 | 0.66 | 0.663 | 0.639 | 0.687 | 0.639 | 0.64 |
| 100%_64 | 410 | 247 | 259 | 0.839 | 0.835 | 0.826 | 0.83 | 0.838 | 0.839 | 0.838 |
| 100%_128 | 410 | 247 | 264 | 0.846 | 0.845 | 0.831 | 0.836 | 0.846 | 0.846 | 0.845 |
| 100%_256 | 410 | 247 | 256 | 0.837 | 0.832 | 0.825 | 0.828 | 0.836 | 0.837 | 0.836 |
| 100%_512 | 410 | 247 | 263 | 0.839 | 0.837 | 0.824 | 0.829 | 0.838 | 0.839 | 0.837 |
| 100%_1024 | 411 | 248 | 254 | 0.83 | 0.823 | 0.819 | 0.821 | 0.829 | 0.83 | 0.829 |
| 100%_2048 | 410 | 248 | 274 | 0.79 | 0.788 | 0.767 | 0.773 | 0.789 | 0.79 | 0.786 |
| 100%_4096 | 411 | 248 | 305 | 0.725 | 0.729 | 0.683 | 0.688 | 0.727 | 0.725 | 0.71 |
| 100%_8192 | 411 | 248 | 270 | 0.757 | 0.749 | 0.734 | 0.739 | 0.754 | 0.757 | 0.753 |
| 100%_16384 | 411 | 248 | 229 | 0.706 | 0.696 | 0.702 | 0.698 | 0.713 | 0.706 | 0.708 |
| 100%_32768 | 411 | 248 | 305 | 0.701 | 0.697 | 0.657 | 0.66 | 0.698 | 0.701 | 0.684 |
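
The macro and weighted columns use the terminology of scikit-learn's `classification_report`: macro averaging treats the violation and non-violation classes equally, while weighted averaging weights each class by its support. Assuming that convention, the sketch below shows one way a single row of the consolidated report could be reproduced; the `summarize` helper and the toy labels are hypothetical, and this is not the evaluation code actually used for this report.

```python
# Hedged sketch: reproducing one row of the consolidated report, assuming
# binary labels (1 = violation, 0 = no violation) and that invalid model
# outputs have already been dropped. Not the report's actual evaluation code.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def summarize(y_true, y_pred):
    """Return the per-test column values shown in the consolidated report."""
    p_macro, r_macro, f1_macro, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    p_wtd, r_wtd, f1_wtd, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted", zero_division=0
    )
    return {
        "Total Samples": len(y_true),
        "True Violations": sum(y_true),
        "Predicted Violations": sum(y_pred),
        "Accuracy": round(accuracy_score(y_true, y_pred), 3),
        "Precision (macro)": round(p_macro, 3),
        "Recall (macro)": round(r_macro, 3),
        "F1-score (macro)": round(f1_macro, 3),
        "Precision (weighted)": round(p_wtd, 3),
        "Recall (weighted)": round(r_wtd, 3),
        "F1-score (weighted)": round(f1_wtd, 3),
    }


# Toy usage (hypothetical labels, not data from this report):
# summarize([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```

Because macro averaging ignores class support, the macro and weighted columns diverge most when the predicted violation count drifts far from the true count, as in the low-scoring 0% rows above.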