I generated the dataset using the method shown in the docs.
Now, when I fine-tune the bloom 1.3B base model on it, I get this error:
[2023-10-07 06:39:28,529] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
trainable params: 1179648 || all params: 1066493952 || trainable%: 0.11060990995662018
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/xturing/datasets/instruction_dataset.py", line 89, in from_jsonl
data["text"].append(json_line["text"])
KeyError: 'text'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/content/main.py", line 8, in <module>
dataset = InstructionDataset('./dataset/tasks.jsonl')
File "/usr/local/lib/python3.10/dist-packages/xturing/datasets/instruction_dataset.py", line 64, in __init__
self.data = {"train": HFDataset.from_dict(self.from_jsonl(path))}
File "/usr/local/lib/python3.10/dist-packages/xturing/datasets/instruction_dataset.py", line 93, in from_jsonl
raise ValueError(
ValueError: The jsonl file should have keys text, instruction and target
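To see which rows trigger this error, a quick check along the following lines may help. It is only a sketch: the required key names are taken from the error message above, and `find_missing_keys` is a hypothetical helper, not part of xturing.

```python
import json

# Key names copied from the ValueError message above.
REQUIRED_KEYS = ("text", "instruction", "target")


def find_missing_keys(path, required=REQUIRED_KEYS):
    """Return (line_number, missing_keys) for each row lacking a required key.

    Hypothetical helper for diagnosis only; it is not an xturing function.
    """
    problems = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            row = json.loads(line)
            missing = [key for key in required if key not in row]
            if missing:
                problems.append((line_number, missing))
    return problems


if __name__ == "__main__":
    # Path taken from the traceback above.
    for line_number, missing in find_missing_keys("./dataset/tasks.jsonl"):
        print(f"line {line_number}: missing {missing}")
```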
{"id": "seed_task_2", "instruction": "What are the major matters related to the enactment and revision?", "instances": [{"input": "", "output": "Answer: The major matters include integrating existing , addressing overlaps and conflicts, and incorporating parts."}]}
This is a sample row generated by your InstructionSet dataset method using custom data.
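The sample row has `instruction` and `instances` but none of the `text` or `target` keys named in the error message, which would explain the `KeyError`. One possible workaround is to flatten each row before loading it. The sketch below assumes that `input` maps to `text` and `output` maps to `target`; that mapping is a guess based on the error message, not something confirmed by the xturing docs. `convert_row`, `convert_file`, and the output path are hypothetical names.

```python
import json


def convert_row(row):
    """Flatten one generated row into records with instruction/text/target keys.

    Assumption: "input" corresponds to "text" and "output" to "target".
    """
    return [
        {
            "instruction": row["instruction"],
            "text": instance.get("input", ""),
            "target": instance.get("output", ""),
        }
        for instance in row.get("instances", [])
    ]


def convert_file(src_path, dst_path):
    """Write one flattened record per instance; return the number written."""
    count = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            for record in convert_row(json.loads(line)):
                dst.write(json.dumps(record, ensure_ascii=False) + "\n")
                count += 1
    return count


if __name__ == "__main__":
    # Input path from the traceback; the output path is a made-up example.
    written = convert_file("./dataset/tasks.jsonl",
                           "./dataset/tasks_converted.jsonl")
    print(f"wrote {written} records")
```

If the mapping is right, pointing `InstructionDataset` at the converted file instead of `./dataset/tasks.jsonl` should get past the key check.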