Are there plans to support multimodal prompts instead of text-only ones? This would be very helpful for evaluating in-context learning.