I would have liked to see Zipf weighting and PCA applied to token-level models like GloVe and BPEmb for a fair baseline (I assume for these models just a plain mean sentence vector was computed). Or is M2V_base_glove_subword already exactly that?
Is M2V_base_output also using tokenlearn fine-tuning?
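For concreteness, here is a minimal sketch of the kind of baseline I mean. It assumes "Zipf weighting" means scaling each token vector by the log of its frequency rank (treating the tokenizer vocabulary as roughly frequency-sorted, as Model2Vec does) and that PCA is applied to the whole embedding table before pooling; all function names and the random embedding table are illustrative, not from any of the repos discussed here:

```python
import numpy as np

def apply_pca(emb, dim):
    # Center the embedding table and project onto the top principal
    # components via SVD (equivalent to PCA without variance scaling).
    centered = emb - emb.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T

def zipf_weight(emb):
    # Assume row index ~ frequency rank; downweight via log(1 + rank).
    ranks = np.arange(1, emb.shape[0] + 1)
    return emb * np.log(1 + ranks)[:, None]

def sentence_vector(token_ids, emb):
    # Plain mean pooling over the (already weighted) token embeddings.
    return emb[token_ids].mean(axis=0)

# Toy example with a random "GloVe-like" embedding table.
rng = np.random.default_rng(0)
vocab_emb = rng.normal(size=(1000, 50))
processed = zipf_weight(apply_pca(vocab_emb, 32))
vec = sentence_vector([3, 17, 256], processed)
print(vec.shape)
```

With this in place, the "fair baseline" question is just whether the static-embedding models were pooled with or without the `apply_pca` and `zipf_weight` steps.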