Could you share the hardware you used for your experiments? And how much GPU memory is needed for training?