Package: vitals 0.2.0.9000

Simon Couch

vitals: Large Language Model Evaluation

A port of 'Inspect', a widely adopted 'Python' framework for large language model evaluation. Specifically aimed at 'ellmer' users who want to measure the effectiveness of their large language model-based products, the package supports prompt engineering, tool usage, multi-turn dialog, and model-graded evaluations.

Authors: Simon Couch [aut, cre], Max Kuhn [ctb], Hadley Wickham [ctb], Mine Cetinkaya-Rundel [ctb], Posit Software, PBC [cph, fnd]

vitals_0.2.0.9000.tar.gz
vitals_0.2.0.9000.zip (r-4.7), vitals_0.2.0.9000.zip (r-4.6), vitals_0.2.0.9000.zip (r-4.5)
vitals_0.2.0.9000.tgz (r-4.6-any), vitals_0.2.0.9000.tgz (r-4.5-any)
vitals_0.2.0.9000.tar.gz (r-4.6-any), vitals_0.2.0.9000.tar.gz (r-4.5-any)
vitals_0.2.0.9000.tgz (r-4.5-emscripten)
vitals.pdf | vitals.html
vitals/json (API)
NEWS

# Install 'vitals' in R:
install.packages('vitals', repos = c('https://tidyverse.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/tidyverse/vitals/issues

Pkgdown/docs site: https://vitals.tidyverse.org

Datasets:
  • are - An R Eval

7.63 score, 52 stars, 72 scripts, 377 downloads, 15 exports, 36 dependencies

Last updated from: 39b3f3db47. Checks: 9 OK. Indexed: yes.

Target                 Result  Time
linux-devel-x86_64     OK      160
source / vignettes     OK      231
linux-release-x86_64   OK      153
macos-release-arm64    OK      129
macos-oldrel-arm64     OK      121
windows-devel          OK      191
windows-release        OK      141
windows-oldrel         OK      101
wasm-release           OK      123

Exports: detect_answer, detect_exact, detect_includes, detect_match, detect_pattern, generate, generate_structured, model_graded_fact, model_graded_qa, Task, vitals_bind, vitals_bundle, vitals_log_dir, vitals_log_dir_set, vitals_view

Dependencies: askpass, cli, coro, cpp11, curl, dplyr, ellmer, fastmap, generics, glue, httpuv, httr2, jsonlite, later, lifecycle, magrittr, openssl, otel, pillar, pkgconfig, promises, purrr, R6, rappdirs, Rcpp, rlang, S7, stringi, stringr, sys, tibble, tidyr, tidyselect, utf8, vctrs, withr

Readme and manuals

Help Manual

Help page: Topics
An R Eval: are
Convert a chat to a solver function: generate
Convert a chat to a solver function with structured output: generate_structured
Scoring with string detection: detect_answer, detect_exact, detect_includes, detect_match, detect_pattern, scorer_detect
Model-based scoring: model_graded_fact, model_graded_qa, scorer_model
Creating and evaluating tasks: Task
Concatenate task samples for analysis: vitals_bind
Prepare logs for deployment: vitals_bundle
The log directory: vitals_log_dir, vitals_log_dir_set
Interactively view local evaluation logs: vitals_view
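
As a rough sketch of how the help topics above fit together, the helper below wires the 'are' dataset, a 'generate' solver, and a 'model_graded_qa' scorer into a 'Task' and evaluates it. The function name run_are_eval and its log_dir argument are hypothetical and not part of vitals; the sketch also assumes 'chat' is an 'ellmer' chat object and that model API credentials are configured, so treat it as illustrative rather than as the package's recommended workflow.

```r
# Hypothetical helper (not part of vitals): builds and runs one evaluation.
# `chat` is assumed to be an ellmer chat object, e.g. ellmer::chat_anthropic().
run_are_eval <- function(chat, log_dir = tempdir()) {
  # Point vitals at a directory for evaluation logs (changes a session-wide setting).
  vitals::vitals_log_dir_set(log_dir)

  # A task combines a dataset, a solver, and a scorer.
  task <- vitals::Task$new(
    dataset = vitals::are,
    solver = vitals::generate(chat),
    scorer = vitals::model_graded_qa()
  )

  # Run the solver and scorer over the samples; this calls the model API.
  task$eval()
  task
}

# Usage (not run here; requires vitals, ellmer, and an API key):
# task <- run_are_eval(ellmer::chat_anthropic())
# vitals::vitals_view()
```

Defining the workflow as a function keeps the example free of side effects until it is called; nothing is sent to a model when the definition alone is sourced.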