Evaluation, benchmark, and scorecard tool targeting performance (throughput and latency), accuracy on popular evaluation harnesses, safety, and hallucination
GenAIEval is a comprehensive evaluation, benchmark, and scorecard tool designed to assess the performance, accuracy, safety, and hallucination tendencies of AI models, particularly large language models (LLMs). It supports popular evaluation harnesses such as lm-evaluation-harness and bigcode-evaluation-harness, making it well suited for developers and researchers working on AI model optimization and benchmarking.
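To make the scorecard idea concrete, here is a purely illustrative sketch of how the dimensions named above (throughput, latency, accuracy, safety, hallucination) might be collected into a single report. The `Scorecard` class and its fields are hypothetical and do not reflect GenAIEval's actual API:

```python
from dataclasses import dataclass, asdict

# Hypothetical illustration only: GenAIEval's real data structures may differ.
@dataclass
class Scorecard:
    throughput_tok_s: float    # performance: tokens generated per second
    latency_ms: float          # performance: time to first token, in milliseconds
    accuracy: float            # harness score in [0, 1], e.g. from lm-evaluation-harness
    safety: float              # fraction of prompts answered safely, in [0, 1]
    hallucination_rate: float  # fraction of unsupported answers, in [0, 1]

    def summary(self) -> dict:
        """Collect all evaluation dimensions into one report dict."""
        return asdict(self)

card = Scorecard(throughput_tok_s=850.0, latency_ms=42.0,
                 accuracy=0.78, safety=0.97, hallucination_rate=0.05)
print(card.summary()["accuracy"])  # → 0.78
```

In practice the accuracy field would be filled from a harness run, while throughput and latency would come from a serving benchmark.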