
Classification Report for Llama 3.1 8B

Invalid Model Output Count

| Test | Invalid Count |
|---|---|
| base | 2 |
| 0%_64 | 2 |
| 0%_128 | 2 |
| 0%_256 | 9 |
| 0%_512 | 7 |
| 0%_2048 | 2 |
| 0%_8192 | 4 |
| 0%_16384 | 15 |
| 0%_32768 | 6 |
| 25%_64 | 4 |
| 25%_128 | 5 |
| 25%_256 | 6 |
| 25%_512 | 2 |
| 25%_1024 | 11 |
| 25%_2048 | 5 |
| 25%_4096 | 23 |
| 25%_8192 | 10 |
| 25%_16384 | 14 |
| 25%_32768 | 13 |
| 50%_64 | 4 |
| 50%_128 | 3 |
| 50%_256 | 7 |
| 50%_512 | 6 |
| 50%_1024 | 2 |
| 50%_2048 | 7 |
| 50%_4096 | 5 |
| 50%_8192 | 3 |
| 50%_16384 | 12 |
| 50%_32768 | 11 |
| 75%_64 | 4 |
| 75%_128 | 3 |
| 75%_256 | 2 |
| 75%_1024 | 1 |
| 75%_2048 | 15 |
| 75%_4096 | 14 |
| 75%_8192 | 1 |
| 75%_16384 | 18 |
| 75%_32768 | 11 |
| 100%_64 | 9 |
| 100%_128 | 4 |
| 100%_256 | 4 |
| 100%_512 | 4 |
| 100%_2048 | 3 |
| 100%_4096 | 4 |
| 100%_8192 | 1 |
| 100%_16384 | 15 |
| 100%_32768 | 16 |

Consolidated Classification Report

| Test | Total Samples | True Violations | Predicted Violations | Accuracy | Precision (macro) | Recall (macro) | F1-score (macro) | Precision (weighted) | Recall (weighted) | F1-score (weighted) |
|---|---|---|---|---|---|---|---|---|---|---|
| base | 409 | 247 | 227 | 0.775 | 0.767 | 0.776 | 0.769 | 0.782 | 0.775 | 0.777 |
| 0%_64 | 409 | 246 | 240 | 0.76 | 0.75 | 0.753 | 0.752 | 0.762 | 0.76 | 0.761 |
| 0%_128 | 409 | 247 | 242 | 0.753 | 0.742 | 0.745 | 0.743 | 0.755 | 0.753 | 0.754 |
| 0%_256 | 402 | 240 | 216 | 0.721 | 0.715 | 0.723 | 0.716 | 0.731 | 0.721 | 0.724 |
| 0%_512 | 404 | 245 | 228 | 0.681 | 0.67 | 0.675 | 0.671 | 0.688 | 0.681 | 0.683 |
| 0%_1024 | 411 | 248 | 226 | 0.625 | 0.616 | 0.62 | 0.616 | 0.635 | 0.625 | 0.628 |
| 0%_2048 | 409 | 246 | 216 | 0.516 | 0.51 | 0.511 | 0.508 | 0.531 | 0.516 | 0.521 |
| 0%_4096 | 411 | 248 | 181 | 0.555 | 0.568 | 0.57 | 0.554 | 0.591 | 0.555 | 0.558 |
| 0%_8192 | 407 | 244 | 327 | 0.614 | 0.585 | 0.556 | 0.539 | 0.595 | 0.614 | 0.576 |
| 0%_16384 | 396 | 242 | 380 | 0.626 | 0.656 | 0.525 | 0.446 | 0.649 | 0.626 | 0.516 |
| 0%_32768 | 405 | 243 | 380 | 0.588 | 0.5 | 0.5 | 0.419 | 0.52 | 0.588 | 0.482 |
| 25%_64 | 407 | 245 | 244 | 0.762 | 0.751 | 0.752 | 0.752 | 0.762 | 0.762 | 0.762 |
| 25%_128 | 406 | 243 | 257 | 0.719 | 0.708 | 0.701 | 0.703 | 0.716 | 0.719 | 0.717 |
| 25%_256 | 405 | 242 | 258 | 0.709 | 0.697 | 0.689 | 0.692 | 0.705 | 0.709 | 0.706 |
| 25%_512 | 409 | 246 | 284 | 0.687 | 0.674 | 0.654 | 0.657 | 0.681 | 0.687 | 0.678 |
| 25%_1024 | 400 | 239 | 272 | 0.688 | 0.675 | 0.658 | 0.661 | 0.682 | 0.688 | 0.68 |
| 25%_2048 | 406 | 243 | 312 | 0.628 | 0.606 | 0.578 | 0.57 | 0.614 | 0.628 | 0.601 |
| 25%_4096 | 388 | 227 | 335 | 0.598 | 0.577 | 0.537 | 0.497 | 0.582 | 0.598 | 0.535 |
| 25%_8192 | 401 | 239 | 386 | 0.579 | 0.429 | 0.489 | 0.387 | 0.46 | 0.579 | 0.453 |
| 25%_16384 | 397 | 237 | 391 | 0.612 | 0.803 | 0.519 | 0.414 | 0.765 | 0.612 | 0.48 |
| 25%_32768 | 398 | 242 | 373 | 0.626 | 0.632 | 0.533 | 0.467 | 0.631 | 0.626 | 0.53 |
| 50%_64 | 407 | 247 | 266 | 0.722 | 0.709 | 0.699 | 0.702 | 0.718 | 0.722 | 0.719 |
| 50%_128 | 408 | 246 | 264 | 0.676 | 0.66 | 0.653 | 0.655 | 0.672 | 0.676 | 0.673 |
| 50%_256 | 404 | 241 | 273 | 0.683 | 0.67 | 0.655 | 0.658 | 0.677 | 0.683 | 0.676 |
| 50%_512 | 405 | 242 | 310 | 0.644 | 0.629 | 0.596 | 0.59 | 0.635 | 0.644 | 0.619 |
| 50%_1024 | 409 | 246 | 302 | 0.609 | 0.578 | 0.563 | 0.558 | 0.591 | 0.609 | 0.588 |
| 50%_2048 | 404 | 241 | 320 | 0.631 | 0.614 | 0.578 | 0.566 | 0.619 | 0.631 | 0.598 |
| 50%_4096 | 406 | 244 | 363 | 0.603 | 0.563 | 0.525 | 0.475 | 0.573 | 0.603 | 0.527 |
| 50%_8192 | 408 | 245 | 377 | 0.618 | 0.615 | 0.534 | 0.473 | 0.616 | 0.618 | 0.528 |
| 50%_16384 | 399 | 240 | 385 | 0.622 | 0.701 | 0.528 | 0.443 | 0.683 | 0.622 | 0.507 |
| 50%_32768 | 400 | 241 | 373 | 0.63 | 0.664 | 0.543 | 0.482 | 0.656 | 0.63 | 0.538 |
| 75%_64 | 407 | 245 | 278 | 0.722 | 0.714 | 0.693 | 0.698 | 0.719 | 0.722 | 0.715 |
| 75%_128 | 408 | 246 | 265 | 0.703 | 0.69 | 0.68 | 0.683 | 0.699 | 0.703 | 0.7 |
| 75%_256 | 409 | 246 | 325 | 0.665 | 0.661 | 0.61 | 0.603 | 0.663 | 0.665 | 0.635 |
| 75%_512 | 411 | 248 | 344 | 0.669 | 0.682 | 0.604 | 0.589 | 0.678 | 0.669 | 0.627 |
| 75%_1024 | 410 | 247 | 299 | 0.615 | 0.586 | 0.571 | 0.567 | 0.599 | 0.615 | 0.596 |
| 75%_2048 | 396 | 234 | 333 | 0.609 | 0.587 | 0.548 | 0.519 | 0.593 | 0.609 | 0.557 |
| 75%_4096 | 397 | 236 | 325 | 0.594 | 0.558 | 0.536 | 0.511 | 0.569 | 0.594 | 0.549 |
| 75%_8192 | 410 | 247 | 375 | 0.6 | 0.548 | 0.516 | 0.454 | 0.561 | 0.6 | 0.512 |
| 75%_16384 | 393 | 235 | 378 | 0.616 | 0.672 | 0.526 | 0.44 | 0.66 | 0.616 | 0.502 |
| 75%_32768 | 400 | 245 | 362 | 0.618 | 0.577 | 0.528 | 0.478 | 0.588 | 0.618 | 0.538 |
| 100%_64 | 402 | 241 | 258 | 0.749 | 0.74 | 0.73 | 0.733 | 0.746 | 0.749 | 0.746 |
| 100%_128 | 407 | 244 | 287 | 0.713 | 0.706 | 0.679 | 0.683 | 0.709 | 0.713 | 0.702 |
| 100%_256 | 407 | 245 | 318 | 0.649 | 0.634 | 0.595 | 0.588 | 0.639 | 0.649 | 0.62 |
| 100%_512 | 407 | 246 | 301 | 0.693 | 0.685 | 0.649 | 0.652 | 0.689 | 0.693 | 0.677 |
| 100%_1024 | 411 | 248 | 313 | 0.652 | 0.635 | 0.602 | 0.599 | 0.642 | 0.652 | 0.629 |
| 100%_2048 | 408 | 246 | 315 | 0.645 | 0.626 | 0.593 | 0.586 | 0.633 | 0.645 | 0.618 |
| 100%_4096 | 407 | 244 | 343 | 0.6 | 0.559 | 0.533 | 0.502 | 0.571 | 0.6 | 0.546 |
| 100%_8192 | 410 | 247 | 382 | 0.602 | 0.555 | 0.515 | 0.444 | 0.566 | 0.602 | 0.505 |
| 100%_16384 | 396 | 243 | 380 | 0.619 | 0.592 | 0.515 | 0.432 | 0.598 | 0.619 | 0.506 |
| 100%_32768 | 395 | 239 | 369 | 0.646 | 0.721 | 0.557 | 0.5 | 0.703 | 0.646 | 0.557 |
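The macro and weighted columns above are presumably the standard aggregates (as produced by e.g. scikit-learn's `classification_report`): macro averages the per-class scores equally, while weighted scales each class's score by its support. A minimal sketch for the binary violation / no-violation case, with hypothetical helper names:

```python
from collections import Counter

def per_class(y_true, y_pred, cls):
    """Precision, recall, F1 treating `cls` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

def averaged(y_true, y_pred):
    """Return (macro, weighted) triples of (precision, recall, F1)."""
    classes = sorted(set(y_true))
    support = Counter(y_true)
    stats = {c: per_class(y_true, y_pred, c) for c in classes}
    # Macro: unweighted mean over classes.
    macro = tuple(sum(stats[c][i] for c in classes) / len(classes)
                  for i in range(3))
    # Weighted: mean over classes scaled by class support.
    weighted = tuple(sum(stats[c][i] * support[c] for c in classes) / len(y_true)
                     for i in range(3))
    return macro, weighted
```

With imbalanced labels (as in the tables, where roughly 60% of samples are true violations), the two averages diverge: weighted averaging favors the majority class, which is why the weighted precision can sit well above the macro precision in rows like 25%_16384.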