
> The idea of using a smaller version of the same (or a similar) model as a check is interesting.

I built my chat app around this idea, and to save money. When it comes to coding, I feel Sonnet 3.5 is still the best, but I don't start with it. I tend to use cheaper models at the beginning, since it usually takes a few iterations to get to a certain point and I don't want to waste tokens in the process. Once I've reached that point, or it's clear the LLM isn't helping, I bring in Sonnet to review things.

Here is an example of how a conversation between models plays out:

https://beta.gitsense.com/?chat=bbd69cb2-ffc9-41a3-9bdb-095c...
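Roughly, the escalation looks like the sketch below. This is not the app's actual code; the model names and the chat_completion helper are assumptions standing in for whatever chat API is in use.

    def chat_completion(model: str, messages: list[dict]) -> str:
        """Hypothetical wrapper around the chat API in use (assumption)."""
        raise NotImplementedError

    def iterate_then_review(task: str,
                            cheap_model: str = "some-cheaper-model",
                            review_model: str = "claude-3-5-sonnet") -> str:
        """Iterate with a cheaper model first, then bring in Sonnet to review."""
        messages = [{"role": "user", "content": task}]
        for _ in range(3):  # a few cheap iterations to rough out a solution
            draft = chat_completion(cheap_model, messages)
            messages.append({"role": "assistant", "content": draft})
            messages.append({"role": "user", "content": "Refine the above."})
        # Once the draft has reached a certain state, ask the stronger model to review it
        messages.append({"role": "user",
                         "content": "Review the conversation so far and critique the latest draft."})
        return chat_completion(review_model, messages)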

The reason this works for my application is that my system prompt includes the following lines:

# Critical Context Information

Your name is {{gs-chat-llm-model}} and the current date and time is {{gs-chat-datetime}}.

When I make an API call, I replace the template strings with the actual model name and the current date. I also include instructions in the first user message telling the model to sign off on each of its messages. With the system prompt and the message signatures, you can then ask "what do you think of <LLM's> response".
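As a minimal sketch of that substitution (the template strings are the ones from the prompt above; the sign-off wording is my assumption):

    import datetime

    SYSTEM_PROMPT_TEMPLATE = """\
    # Critical Context Information

    Your name is {{gs-chat-llm-model}} and the current date and time is {{gs-chat-datetime}}.
    """

    def build_system_prompt(model: str) -> str:
        """Fill in the template strings right before making the API call."""
        now = datetime.datetime.now().strftime("%Y-%m-%d %H:%M")
        return (SYSTEM_PROMPT_TEMPLATE
                .replace("{{gs-chat-llm-model}}", model)
                .replace("{{gs-chat-datetime}}", now))

    # Included in the first user message so each model signs its replies,
    # which lets you later ask "what do you think of <LLM's> response".
    # Exact wording is an assumption.
    SIGN_OFF_INSTRUCTION = "End every response with a signature line naming your model."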


