weave / Evaluation
Sets up an evaluation that includes a set of scorers and a dataset.
Calling evaluation.evaluate({ model }) passes the rows of the dataset to the model, matching the dataset's column names to the argument names of model.predict.
All of the scorers are then called, and their results are saved in Weave.
Example
import * as weave from 'weave';
// Collect your examples into a dataset
const dataset = new weave.Dataset({
  id: 'my-dataset',
  rows: [
    { question: 'What is the capital of France?', expected: 'Paris' },
    { question: 'Who wrote "To Kill a Mockingbird"?', expected: 'Harper Lee' },
    { question: 'What is the square root of 64?', expected: '8' },
  ],
});
// Define a custom scoring function
const scoringFunction = weave.op(function isEqual({ modelOutput, datasetRow }) {
  // Strict equality avoids accidental type coercion between output and expected value
  return modelOutput === datasetRow.expected;
});
// Define the function to evaluate
const model = weave.op(async function alwaysParisModel({ question }) {
  return 'Paris';
});
// Start the evaluation
const evaluation = new weave.Evaluation({
  id: 'my-evaluation',
  dataset: dataset,
  scorers: [scoringFunction],
});
const results = await evaluation.evaluate({ model });
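The flow described above (each dataset row goes to the model, then every scorer sees the model output and the row) can be sketched without the library. This is a conceptual illustration only, not Weave's implementation: `SketchRow`, `SketchScorer`, `SketchResult`, and `runEvaluationSketch` are hypothetical names, and the sketch leaves out tracing, column mapping, and saving results.

```typescript
// Conceptual sketch only; none of these names come from the Weave SDK.
type SketchRow = { question: string; expected: string };
type SketchScorer = (args: { modelOutput: unknown; datasetRow: SketchRow }) => unknown;
type SketchResult = { modelOutput: unknown; scores: Record<string, unknown> };

async function runEvaluationSketch(
  rows: SketchRow[],
  model: (row: SketchRow) => Promise<unknown>,
  scorers: Record<string, SketchScorer>,
): Promise<SketchResult[]> {
  const results: SketchResult[] = [];
  for (const datasetRow of rows) {
    // The row's columns act as the model's named arguments.
    const modelOutput = await model(datasetRow);
    // Every scorer receives both the model output and the original row.
    const scores: Record<string, unknown> = {};
    for (const [name, scorer] of Object.entries(scorers)) {
      scores[name] = await scorer({ modelOutput, datasetRow });
    }
    results.push({ modelOutput, scores });
  }
  return results;
}

const sketchRows: SketchRow[] = [
  { question: 'What is the capital of France?', expected: 'Paris' },
  { question: 'What is the square root of 64?', expected: '8' },
];
const alwaysParis = async (_row: SketchRow) => 'Paris';
const isEqual: SketchScorer = ({ modelOutput, datasetRow }) =>
  modelOutput === datasetRow.expected;

runEvaluationSketch(sketchRows, alwaysParis, { isEqual }).then((results) => {
  for (const r of results) console.log(r.modelOutput, r.scores);
});
```

Passing a row object to the model and a `{ modelOutput, datasetRow }` object to each scorer mirrors the destructured arguments used in the example above.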
Type parameters
| Name | Type |
|---|---|
| R | extends DatasetRow |
| E | extends DatasetRow |
| M | M |
Hierarchy
WeaveObject
↳ Evaluation
Table of contents
Constructors
Properties
Accessors
Methods
Constructors
constructor
• new Evaluation<R, E, M>(parameters): Evaluation<R, E, M>
Type parameters
| Name | Type |
|---|---|
| R | extends DatasetRow |
| E | extends DatasetRow |
| M | M |
Parameters
| Name | Type |
|---|---|
| parameters | EvaluationParameters<R, E, M> |
Returns
Evaluation<R, E, M>
Overrides
WeaveObject.constructor
Defined in
evaluation.ts:148
Properties
__savedRef
• Optional __savedRef: ObjectRef | Promise<ObjectRef>
Inherited from
WeaveObject.__savedRef
Defined in
weaveObject.ts:73
Accessors
description
• get description(): undefined | string
Returns
undefined | string
Inherited from
WeaveObject.description
Defined in
weaveObject.ts:100
name
• get name(): string
Returns
string
Inherited from
WeaveObject.name
Defined in
weaveObject.ts:96
Methods
evaluate
▸ evaluate(«destructured»): Promise<Record<string, any>>
Parameters
| Name | Type | Default value |
|---|---|---|
| «destructured» | Object | undefined |
| › maxConcurrency? | number | 5 |
| › model | WeaveCallable<(...args: [{ datasetRow: R }]) => Promise<M>> | undefined |
| › nTrials? | number | 1 |
Returns
Promise<Record<string, any>>
Defined in
evaluation.ts:163
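The page does not describe what maxConcurrency and nTrials do beyond their defaults. The sketch below shows one plausible reading, stated as an assumption: nTrials repeats every dataset row that many times, and maxConcurrency caps how many model calls are in flight at once. `expandTrials` and `mapWithConcurrency` are hypothetical helpers, not part of the Weave SDK.

```typescript
// Hypothetical helper (assumed semantics): repeat the full row list nTrials times.
function expandTrials<T>(rows: T[], nTrials = 1): T[] {
  const out: T[] = [];
  for (let trial = 0; trial < nTrials; trial++) {
    out.push(...rows);
  }
  return out;
}

// Hypothetical helper: run fn over items with at most `limit` calls in flight.
// Results keep the order of the input items.
async function mapWithConcurrency<T, U>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<U>,
): Promise<U[]> {
  const results: U[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    // Each worker claims the next unprocessed index until none remain.
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index]);
    }
  }
  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Usage: 3 rows x 2 trials, never more than 2 calls running at once.
const trialRows = expandTrials(['a', 'b', 'c'], 2);
mapWithConcurrency(trialRows, 2, async (row) => row.toUpperCase()).then((outputs) => {
  console.log(outputs.length);
});
```

A worker pool like this keeps memory and rate-limit pressure bounded, which is a common reason for an evaluation API to expose a concurrency cap.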
predictAndScore
▸ predictAndScore(«destructured»): Promise<{ model_latency: number = modelLatency; model_output: any = modelOutput; model_success: boolean = !modelError; scores: { [key: string]: any; } }>
Parameters
| Name | Type |
|---|---|
| «destructured» | Object |
| › columnMapping? | ColumnMapping<R, E> |
| › example | R |
| › model | WeaveCallable<(...args: [{ datasetRow: E }]) => Promise<M>> |
Returns
Promise<{ model_latency: number = modelLatency; model_output: any = modelOutput; model_success: boolean = !modelError; scores: { [key: string]: any; } }>
Defined in
evaluation.ts:231
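The return type above suggests that model_success is derived from whether the model call threw (`!modelError`) and that model_latency is a measured duration. A minimal sketch of a function producing an object with the same field names follows. It is an illustration under assumptions, not the SDK's code: `predictAndScoreSketch` and `ExampleScorer` are hypothetical, the latency unit (milliseconds) is assumed, skipping scorers after a failed model call is assumed, and columnMapping is not modeled.

```typescript
// Hypothetical types and function; only the returned field names come from the signature above.
type ExampleRow = Record<string, unknown>;
type ExampleScorer = (args: { modelOutput: unknown; datasetRow: ExampleRow }) => unknown;

async function predictAndScoreSketch(
  example: ExampleRow,
  model: (row: ExampleRow) => Promise<unknown>,
  scorers: Record<string, ExampleScorer>,
) {
  const start = Date.now();
  let modelOutput: unknown = null;
  let modelError = false;
  try {
    modelOutput = await model(example);
  } catch {
    // A thrown error is recorded instead of aborting the whole evaluation.
    modelError = true;
  }
  const modelLatency = Date.now() - start; // milliseconds (assumed unit)

  const scores: Record<string, unknown> = {};
  if (!modelError) {
    // Assumption: scorers only run when the model produced an output.
    for (const [name, scorer] of Object.entries(scorers)) {
      scores[name] = await scorer({ modelOutput, datasetRow: example });
    }
  }
  return {
    model_output: modelOutput,
    model_success: !modelError,
    model_latency: modelLatency,
    scores,
  };
}

// Usage: one successful prediction scored by a single equality scorer.
predictAndScoreSketch(
  { question: 'What is the capital of France?', expected: 'Paris' },
  async () => 'Paris',
  { isEqual: ({ modelOutput, datasetRow }) => modelOutput === datasetRow.expected },
).then((result) => console.log(result.model_success, result.scores));
```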
saveAttrs
▸ saveAttrs(): Object
Returns
Object
Inherited from
WeaveObject.saveAttrs
Defined in
weaveObject.ts:77