When you submit a prompt to Bing, a processing model first judges its complexity: if it deems the prompt simple, it routes it to the simpler model, Microsoft's Turing; if it deems the prompt complicated, it routes it to GPT-4 (Link). My speculation is that Creative Mode has a high likelihood of using GPT-4, and the other modes use the Turing model more frequently.

The issue is that we don't know which model is used for a given answer. I exclusively use Creative Mode and I don't notice a change in the quality of the answers (given the natural variation in an LLM's answers, it's really hard to tell whether a different model was used or the same model is just dumb in this case or with this prompt). Every aspect of Bing is also really slow compared to ChatGPT, including rerunning the prompt, limited turns, etc. (Yes, I'm grateful to have this technology; I'm just trying to optimize for speed and quality here.)

So my question is: how can we steer Bing to use a specific model, to ensure consistency in the answers?

