Luke Demi put together a fun session today! We tested five prompts against five different models, scored the results in a matrix, and ranked the models roughly in this order:
- Google Bard
- Anthropic Claude v2 (https://claude.ai)
- Falcon Instruct (40b) self-hosted https://huggingface.co/tiiuae/falcon-40b-instruct
- Orca mini 3b (on my laptop) https://huggingface.co/psmathur/orca_mini_3b
I was surprised by Google Bard's performance; it seems to make good use of internal Google APIs to provide real-time information.
You can find the matrix we used here: