Runtime error

Exit code: 1. Reason:

```
want to force a new download, use `force_download=True`.
  warnings.warn(
config.json:   0%|          | 0.00/967 [00:00<?, ?B/s]
config.json: 100%|██████████| 967/967 [00:00<00:00, 7.83MB/s]
configuration_phi3.py:   0%|          | 0.00/11.2k [00:00<?, ?B/s]
configuration_phi3.py: 100%|██████████| 11.2k/11.2k [00:00<00:00, 67.9MB/s]
A new version of the following files was downloaded from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct:
- configuration_phi3.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
modeling_phi3.py:   0%|          | 0.00/73.2k [00:00<?, ?B/s]
modeling_phi3.py: 100%|██████████| 73.2k/73.2k [00:00<00:00, 54.3MB/s]
A new version of the following files was downloaded from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct:
- modeling_phi3.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
`flash-attention` package not found, consider installing for better performance: No module named 'flash_attn'.
Current `flash-attention` does not support `window_size`. Either upgrade or use `attn_implementation='eager'`.
Traceback (most recent call last):
  File "/home/user/app/app.py", line 25, in <module>
    model = AutoModelForCausalLM.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3202, in from_pretrained
    hf_quantizer.validate_environment(
  File "/usr/local/lib/python3.10/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 62, in validate_environment
    raise RuntimeError("No GPU found. A GPU is needed for quantization.")
RuntimeError: No GPU found. A GPU is needed for quantization.
```
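The failure is the final `RuntimeError`: the app loads the model with bitsandbytes 4-bit quantization, and `validate_environment()` rejects this because the Space is running on CPU-only hardware. Below is a minimal sketch of a guarded load, assuming the call at `app.py` line 25 passes a `BitsAndBytesConfig` (the original code is not shown in the log); `MODEL_ID`, `REVISION`, and the exact quantization settings are illustrative, not the Space's actual values. The sketch also switches to `attn_implementation="eager"` as the log suggests, and pins a `revision` to address the repeated remote-code download warnings.

```python
# Sketch only: the original app.py is not in the log, so the config
# values below are assumptions, not the Space's actual settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "microsoft/Phi-3-mini-4k-instruct"
# Hypothetical pin: replace "main" with a commit hash from the model's
# "Files and versions" page so the remote code files stop re-downloading.
REVISION = "main"

common_kwargs = dict(
    trust_remote_code=True,       # the log shows configuration_phi3.py / modeling_phi3.py being fetched
    revision=REVISION,
    attn_implementation="eager",  # flash_attn is not installed, per the warning above
)

if torch.cuda.is_available():
    # 4-bit bitsandbytes quantization requires a CUDA device.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        quantization_config=quant_config,
        device_map="auto",
        **common_kwargs,
    )
else:
    # CPU fallback: load unquantized so validate_environment() never
    # raises "No GPU found. A GPU is needed for quantization."
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float32,
        **common_kwargs,
    )

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=REVISION)
```

Alternatively, keep the 4-bit configuration as-is and assign GPU hardware to the Space in its settings; the error occurs only because the CPU tier has no CUDA device.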
