Benchmark LLM performance on legal documents with Google Sheets and OpenRouter
AI & ML Data & Analytics File Management

Benchmark LLM performance on legal documents with Google Sheets and OpenRouter

This workflow demonstrates a simple way to run evals on a set of test cases stored in a Google Sheet. The example we are using comes from an info extraction...

Get This Workflow

About This Workflow

What This Workflow Does

This workflow automates benchmarking the performance of Large Language Models (LLMs) on legal documents using Google Sheets and OpenRouter. It enables users to run evaluations on a set of test cases stored in a Google Sheet, streamlining the process and providing insights into the model's performance. The workflow can be used to fine-tune and optimize LLMs for specific tasks, such as information extraction from legal documents.

Who Should Use This

This workflow is ideal for developers, data scientists, and engineers who work with Large Language Models and want to automate the process of evaluating their performance on legal documents. It can also be useful for anyone interested in AI summarization and natural language processing.

Key Features

  • Runs evaluations on a set of test cases stored in a Google Sheet
  • Utilizes OpenRouter to connect with external services and run evaluations
  • Provides insights into the performance of Large Language Models on legal documents
  • Supports fine-tuning and optimization of LLMs for specific tasks
  • Integrates with Google Sheets to store and manage test cases and evaluation results

How to Get Started

To use this workflow, simply import it into your n8n account and configure the Google Sheet and OpenRouter integrations to connect with your external services. You can then customize the workflow to suit your specific needs and start automating the evaluation process for your Large Language Models.

Use This Workflow in n8n →

Affiliate Disclosure: We may earn a commission if you sign up for n8n through our links. This doesn't affect our recommendations.

Get This Workflow →