Google is comparing its AI model, Gemini, against Anthropic’s Claude to evaluate performance. Contractors score each model’s responses on truthfulness and clarity, sometimes spending up to 30 minutes on a single prompt. Some Gemini outputs have been flagged for safety issues, whereas Claude applies stricter safety measures and often refuses to answer potentially harmful prompts. Internal documents show that Google is using Claude’s responses for evaluation, but it is unclear whether Google has Anthropic’s permission to do so. Anthropic’s terms prohibit such use without approval. Google stated that it does not train Gemini on Anthropic models but does compare model outputs as part of evaluation. Concerns have also arisen about contractors rating Gemini’s responses on sensitive topics outside their expertise, which could undermine the accuracy of those ratings.
Grey Matterz Thoughts
Evaluating Gemini against Claude highlights the growing competition in AI. However, transparency and adherence to ethical guidelines are crucial to maintaining trust in this rapidly advancing field.
Source: https://shorturl.at/gCJrd