Can I turn llama_flash_attn_monkey_patch off? #132

@itsjustfons

I see that the training pipeline here uses a monkey patch to replace LlamaAttention.forward with a custom forward pass that uses flash_attn. My system, however, does not support flash_attn.

If I turned off the monkey patch, would the regular LlamaAttention.forward be able to run training correctly and produce similar results?

e.g.

# Need to call this before importing transformers.
from video_chatgpt.train.llama_flash_attn_monkey_patch import replace_llama_attn_with_flash_attn

# replace_llama_attn_with_flash_attn()  # What if we just turned this off and trained with the default attn function from LLaMA?

from video_chatgpt.train.train import train

if __name__ == "__main__":
    train()
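One way to avoid commenting the call out by hand is to guard the patch on whether flash_attn is importable. This is a hedged sketch, not code from the repo: `has_module` and `maybe_patch_flash_attn` are hypothetical helpers I'm introducing, and the video_chatgpt import path just mirrors the snippet above.

```python
import importlib.util


def has_module(name: str) -> bool:
    # True if `name` is importable in this environment.
    return importlib.util.find_spec(name) is not None


def maybe_patch_flash_attn() -> bool:
    # Apply the flash-attention monkey patch only when flash_attn is
    # actually installed; otherwise leave the stock
    # LlamaAttention.forward in place. Returns whether the patch ran.
    # Note: like the original snippet, this must run before
    # transformers is imported.
    if not has_module("flash_attn"):
        return False
    from video_chatgpt.train.llama_flash_attn_monkey_patch import (
        replace_llama_attn_with_flash_attn,
    )
    replace_llama_attn_with_flash_attn()
    return True
```

As for correctness: FlashAttention computes exact (not approximate) attention, so disabling the patch should change speed and memory use rather than the math; results should match up to minor floating-point differences, though training will likely be slower and may need a smaller batch size to fit in memory.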
