What happens if the training data (jsonl), has examples with token counts smaller (or longer) than the context length?

We use dataset packing: sequences shorter than the maximum sequence length are concatenated, with a separator token between them. If an example exceeds the maximum sequence length, it is split into non-overlapping chunks so the entire example is still used.
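A minimal sketch of how dataset packing can work, assuming a hypothetical separator token id and maximum sequence length (the actual tokenizer and limits depend on the model being fine-tuned):

```python
SEP_TOKEN = -1    # assumed separator token id (placeholder)
MAX_SEQ_LEN = 8   # assumed maximum sequence length

def pack_examples(examples):
    """Pack tokenized examples into fixed-length sequences.

    Short examples are concatenated with a separator token between them;
    anything past MAX_SEQ_LEN spills into the next non-overlapping chunk,
    so every token of every example is used exactly once.
    """
    stream = []
    for tokens in examples:
        stream.extend(tokens)
        stream.append(SEP_TOKEN)  # separator between examples
    # Split the concatenated stream into non-overlapping chunks.
    return [stream[i:i + MAX_SEQ_LEN] for i in range(0, len(stream), MAX_SEQ_LEN)]

# One short example and one example longer than MAX_SEQ_LEN:
packed = pack_examples([[1, 2, 3], [4, 5, 6, 7, 8, 9, 10, 11, 12]])
```

Here the short example shares a sequence with the start of the long one, and the long example's remaining tokens land in the next chunk rather than being truncated.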
