Why was my job cancelled?
Updated over 8 months ago

There are two reasons that a job may be automatically cancelled.

  1. You do not have sufficient balance on your account to cover the cost of the job.

  2. You have entered an incorrect WandB API key.
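The sample event log in this article reports "API key must be 40 characters long, yours was 17", so one cheap safeguard against the second cause is to check the key's length before submitting a job. The following is a minimal sketch under that assumption; the function name `wandb_key_problem` is hypothetical, and a key of the right length can still be rejected for other reasons.

```python
# Expected length taken from the error message in the sample event log.
EXPECTED_KEY_LENGTH = 40


def wandb_key_problem(key):
    """Return a description of an obvious problem with the key, or None.

    Hypothetical helper: it only checks that a key is present and has the
    expected length; it does not contact WandB to validate the key.
    """
    if not key or not key.strip():
        return "no WandB API key provided"
    key = key.strip()
    if len(key) != EXPECTED_KEY_LENGTH:
        return (
            f"API key must be {EXPECTED_KEY_LENGTH} characters long, "
            f"yours was {len(key)}"
        )
    return None
```

For example, calling `wandb_key_problem(os.environ.get("WANDB_API_KEY"))` before launching a job would flag a truncated or missing key, assuming the key is stored in that environment variable.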

You can determine why your job was cancelled in either of two ways:

(1) Check the events list for your job with the together CLI tool:

$ together list-events <job-fine-tune-id>

(2) Open the events list in the web interface: https://api.together.ai > Jobs > Cancelled Job > Events List

The following is an example of a job that was cancelled due to an incorrect WandB key (see messages 4 and 5):

$ together list-events ft-392ef45d-a4f4-4a4d-b50c-c5b551d852c9
+----+--------------------------------------------------------------------------------------------------------+---------------------------------+----------------------+
|    | Message                                                                                                | Type                            | Hash                 |
+====+========================================================================================================+=================================+======================+
|  0 | Fine tune request created                                                                              | JOB_PENDING                     |                      |
+----+--------------------------------------------------------------------------------------------------------+---------------------------------+----------------------+
|  1 | Job started at Tue Jan 23 07:20:10 PST 2024                                                            | JOB_START                       | 8275378180435023547  |
+----+--------------------------------------------------------------------------------------------------------+---------------------------------+----------------------+
|  2 | Model data downloaded for togethercomputer/RedPajama-INCITE-7B-Chat at Tue Jan 23 07:22:34 PST 2024    | MODEL_DOWNLOAD_COMPLETE         | -988165705840572841  |
+----+--------------------------------------------------------------------------------------------------------+---------------------------------+----------------------+
|  3 | Training data downloaded for togethercomputer/RedPajama-INCITE-7B-Chat at Tue Jan 23 07:22:36 PST 2024 | TRAINING_DATA_DOWNLOAD_COMPLETE | -1605514659064971718 |
+----+--------------------------------------------------------------------------------------------------------+---------------------------------+----------------------+
|  4 | WandB login or init failed: API key must be 40 characters long, yours was 17                           | WANDB_INIT                      | -748217494451531697  |
+----+--------------------------------------------------------------------------------------------------------+---------------------------------+----------------------+
|  5 | Job cancelled due to error in WandB login/init                                                         | CANCEL_REQUESTED                |                      |
+----+--------------------------------------------------------------------------------------------------------+---------------------------------+----------------------+
|  6 | Training started for model /work/ft-392ef45d-a4f4-4a4d-b50c-c5b551d852c9/model                         | TRAINING_START                  | 1731272555435848274  |
+----+--------------------------------------------------------------------------------------------------------+---------------------------------+----------------------+
|  7 | Job stopped due to cancel request                                                                      | JOB_STOPPED                     |                      |
+----+--------------------------------------------------------------------------------------------------------+---------------------------------+----------------------+
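When the event list is long, it can help to pull out the cancel request and the event logged just before it, since that pair usually explains the cancellation (messages 4 and 5 in the table above). The following is a rough sketch that parses saved `list-events` output; it assumes the output keeps the four-column grid layout shown above and that messages contain no `|` characters. The function names and the shortened `SAMPLE` text are illustrative and not part of the together CLI.

```python
def parse_event_table(text):
    """Parse grid-style list-events output into a list of dicts.

    Assumes four columns (index, message, type, hash) as in the table above.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Data rows start and end with "|"; border rows start with "+".
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        # Skip the header row (empty index cell) and malformed rows.
        if len(cells) != 4 or not cells[0].isdigit():
            continue
        rows.append(
            {
                "index": int(cells[0]),
                "message": cells[1],
                "type": cells[2],
                "hash": cells[3],
            }
        )
    return rows


def cancellation_context(rows):
    """Return (previous_event, cancel_event) for the first CANCEL_REQUESTED row.

    Returns (None, None) when no cancel request is present.
    """
    for position, row in enumerate(rows):
        if row["type"] == "CANCEL_REQUESTED":
            previous = rows[position - 1] if position > 0 else None
            return previous, row
    return None, None


# Shortened copy of three rows from the table above, with narrower columns.
SAMPLE = """\
+----+---------+------+------+
|    | Message | Type | Hash |
+====+=========+======+======+
|  0 | Fine tune request created | JOB_PENDING | |
|  4 | WandB login or init failed: API key must be 40 characters long, yours was 17 | WANDB_INIT | -748217494451531697 |
|  5 | Job cancelled due to error in WandB login/init | CANCEL_REQUESTED | |
+----+---------+------+------+
"""

previous_event, cancel_event = cancellation_context(parse_event_table(SAMPLE))
if cancel_event is not None:
    print("Cancel reason:", cancel_event["message"])
    if previous_event is not None:
        print("Preceding event:", previous_event["message"])
```

Looking at the event immediately before the cancel request is a heuristic, not a guarantee; if the two messages do not explain the cancellation, read the full list.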

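The following is an example of an event log in the web Jobs tab for a job that was cancelled because the billing limit was reached:

[Screenshot: event log in the web Jobs tab for a job cancelled after the billing limit was reached]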
