Evaluating AI Language Model Outputs