Prepping for v3.0.0b2 by jlarson4 · Pull Request #1185 · TransformerLensOrg/TransformerLens

jlarson4 · 2026-02-26T18:40:56Z

Description

Prepping the model registry, adding new architectures

Compatibility tests for Python 3.10, 3.11, and 3.12 now run sequentially instead of in parallel. This takes longer, but in v3.x running all three at once alongside the full coverage test generates enough HuggingFace requests to get rate limited
Updated the docs to have separate tables for HookedTransformer compatibility and TransformerBridge compatibility
- TransformerBridge table can be filtered by model name, model status, organization, and architecture.
- Filters are saved and filtered pages can be shared by URL
- Model details are available by clicking the "details" link in each row
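
One way to picture the shareable filter URLs is a round trip between filter state and a query string. The sketch below is only an illustration: the docs site most likely does this in browser-side code, and the query-parameter names, row fields, and function names here are assumptions, not the site's actual implementation.

```python
from urllib.parse import parse_qs, urlencode

# The four filters come from the PR description; the key names are assumed.
FILTER_KEYS = ("name", "status", "organization", "architecture")


def filters_to_query(filters: dict[str, str]) -> str:
    """Serialize the non-empty filters into a URL query string."""
    active = {key: filters[key] for key in FILTER_KEYS if filters.get(key)}
    return urlencode(active)


def query_to_filters(query: str) -> dict[str, str]:
    """Rebuild filter state from a query string, ignoring unknown keys."""
    parsed = parse_qs(query)
    return {key: parsed[key][0] for key in FILTER_KEYS if key in parsed}


def apply_filters(rows: list[dict], filters: dict[str, str]) -> list[dict]:
    """Keep rows matching every filter; the name filter is a substring match."""

    def matches(row: dict) -> bool:
        for key, wanted in filters.items():
            value = str(row.get(key, ""))
            if key == "name":
                if wanted.lower() not in value.lower():
                    return False
            elif value != wanted:
                return False
        return True

    return [row for row in rows if matches(row)]
```

Keeping the filter state entirely in the query string means a shared link reproduces the same view without any server-side session.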
Added testing for Model Registry
Benchmark System
- Improved floating-point precision and made sure the dtype requested for a run is honored across all benchmark functionality
- Removed cross-model comparisons in favor of comparing against the HuggingFace model
- HF Scraper discovered 4,908 models that should be supported by the TransformerLens system across all architectures
- Registry data: supported_models.json (54K+ lines), architecture_gaps.json, verification_history.json
- Added architecture_gaps.json identifying unsupported HF architectures
- Added batch verification system verify_models for running supported models through the benchmark system
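
The commit messages in this PR mention a `min_downloads=500` threshold for the HF scraper. As a rough sketch of how scraped records could be split into supported candidates and architecture gaps, assuming a hypothetical record shape with `architecture` and `downloads` fields (the real scraper and the JSON schemas may differ):

```python
from collections import Counter

MIN_DOWNLOADS = 500  # threshold quoted in the commit messages


def select_candidates(models, supported_architectures, min_downloads=MIN_DOWNLOADS):
    """Split scraped records into supported candidates and per-architecture gap counts.

    `models` is an iterable of dicts with hypothetical keys
    "architecture" and "downloads".
    """
    supported = []
    gaps = Counter()
    for model in models:
        # Drop rarely used models before looking at the architecture.
        if model.get("downloads", 0) < min_downloads:
            continue
        arch = model.get("architecture")
        if arch in supported_architectures:
            supported.append(model)
        else:
            gaps[arch] += 1
    return supported, dict(gaps)
```

The gap counts returned here are the kind of information a file such as architecture_gaps.json could hold, though its actual layout is not shown in this PR description.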
Verification System
- Added handling for models of all sizes
- Verified that 480+ models load within the TransformerLens system
- Resolved issues with high-value models

Details

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
This change requires a documentation update

Screenshots

Checklist:

I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have not rewritten tests relating to key interfaces which would affect backward compatibility

* Testing R1 Distills to confirm functional in TransformerLens
* Updating order to be alphabetical
* Setup StableLM architecture adapter
* Resolved weight and qk issues with stablelm. Added more models
* Added more models
* reformatted
* Created an ArchitectureAdapter for OpenElm, handled trusting remote code
* Fix formatting
* Removed test file, update benchmark
* Add mock model test
* More benchmark adjustments
* removed improperly listed supported models
* Updating to resolve existing weight diff issues
* began working through issues with existing architecture benchmarks
* Resolve any existing weight folding issues we can possibly resolve
* Fixing test failures
* Clean up format and other changes
* Added text quality benchmark, updated to pass CI
* Cleaned up comment, tightened tolerances further for bfloat16 models
* Removed unnecessary testing file
* Cleanup of redundant code
* Resolve type issues and format issues
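
The commit above mentions tightening tolerances for bfloat16 models. A minimal sketch of a dtype-aware comparison against reference outputs could look like the following; the tolerance values and function names are illustrative assumptions, not the thresholds the benchmark suite actually uses.

```python
# Hypothetical absolute tolerances per dtype; real values would be tuned empirically.
TOLERANCES = {"float32": 1e-4, "float16": 1e-2, "bfloat16": 5e-2}


def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(reference, candidate, dtype):
    """True if the candidate stays within the tolerance assumed for this dtype."""
    return max_abs_diff(reference, candidate) <= TOLERANCES[dtype]
```

Lower-precision dtypes get looser bounds because rounding error accumulates faster, so the same numerical gap can pass for bfloat16 and fail for float32.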

* created initial model registry tool
* fixed some formatting issues
* ran format
* Updated to resolve merge issues, added verification system with model compatibility reporting
  - Added batch verification system (verify_models.py) with status codes (0=unverified, 1=verified, 2=skipped, 3=failed)
  - HF scraper with min_downloads=500 threshold (4,846 models across 20 architectures)
  - 58 models verified, HF token sanitization in verification notes
  - Registry data stored in single supported_models.json and verification_history.json files
  - API, validation, and benchmark tooling updated for new registry format
* Model registry updates, including new report generation and alias drift detection features.
* Initial pass test of all architecture adapters
* Updating the last adapters to ensure successful runs
* Type and format fixes before PR

---------

Co-authored-by: jlarson4 <jonahalarson@comcast.net>
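
The status codes listed in this commit (0=unverified, 1=verified, 2=skipped, 3=failed) map naturally onto an integer enum. The class and helper names below are hypothetical and only sketch how such codes might be represented and tallied; verify_models.py may structure this differently.

```python
from enum import IntEnum


class VerificationStatus(IntEnum):
    """Status codes as listed in the commit message."""

    UNVERIFIED = 0
    VERIFIED = 1
    SKIPPED = 2
    FAILED = 3


def summarize(status_codes):
    """Count how many entries carry each status, keyed by lowercase name."""
    counts = {status.name.lower(): 0 for status in VerificationStatus}
    for code in status_codes:
        # VerificationStatus(code) raises ValueError for unknown codes.
        counts[VerificationStatus(code).name.lower()] += 1
    return counts
```

Because `IntEnum` members compare equal to plain integers, the codes can still be stored as numbers in a JSON registry file and converted back on load.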

* created initial model registry tool
* fixed some formatting issues
* ran format
* Updated to resolve merge issues, added verification system with model compatibility reporting
  - Added batch verification system (verify_models.py) with status codes (0=unverified, 1=verified, 2=skipped, 3=failed)
  - HF scraper with min_downloads=500 threshold (4,846 models across 20 architectures)
  - 58 models verified, HF token sanitization in verification notes
  - Registry data stored in single supported_models.json and verification_history.json files
  - API, validation, and benchmark tooling updated for new registry format
* Model registry updates, including new report generation and alias drift detection features.
* Initial pass test of all architecture adapters
* Updating the last adapters to ensure successful runs
* Type and format fixes before PR
* First verification test of the top 10 models for each language. Documenting verification changes via a new page in the docs site.
* Fixed initial issues with Gemma models, OLMo2, OpenELM, and Llama.
* Additional smaller fixes for other popular models
* Added single model run for verify models
* Resolving format issues
* Updated to include verification of the R1 distills
* fixed bug where cfg.d_mlp was sometimes dropped
* Additional batch of model verification and verification system improvements
* Gemma3 issue resolutions
* Updating docs to add a prefix filter
* Updating supported models to properly track text quality
* Additional Partial verification, added text quality to supported models, improve documentation
* Docstring test fix

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Jonah Larson <jlarson@Jonahs-MacBook-Pro.local>
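
The commit messages mention HF token sanitization in verification notes. A minimal sketch of that idea is a regex pass that redacts anything shaped like a HuggingFace access token before a note is stored. The pattern below assumes tokens start with `hf_` followed by at least 20 alphanumeric characters; the function name and the exact pattern are assumptions, not the PR's actual code.

```python
import re

# Assumed token shape: "hf_" followed by 20 or more alphanumeric characters.
HF_TOKEN_PATTERN = re.compile(r"hf_[A-Za-z0-9]{20,}")


def sanitize_note(note: str) -> str:
    """Replace anything that looks like an HF access token with a placeholder."""
    return HF_TOKEN_PATTERN.sub("[REDACTED]", note)
```

Running the redaction at write time, rather than when notes are displayed, keeps tokens out of files such as verification_history.json altogether.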

* created initial model registry tool
* fixed some formatting issues
* ran format
* Updated to resolve merge issues, added verification system with model compatibility reporting
  - Added batch verification system (verify_models.py) with status codes (0=unverified, 1=verified, 2=skipped, 3=failed)
  - HF scraper with min_downloads=500 threshold (4,846 models across 20 architectures)
  - 58 models verified, HF token sanitization in verification notes
  - Registry data stored in single supported_models.json and verification_history.json files
  - API, validation, and benchmark tooling updated for new registry format
* Model registry updates, including new report generation and alias drift detection features.
* Initial pass test of all architecture adapters
* Updating the last adapters to ensure successful runs
* Type and format fixes before PR
* First verification test of the top 10 models for each language. Documenting verification changes via a new page in the docs site.
* Fixed initial issues with Gemma models, OLMo2, OpenELM, and Llama.
* Additional smaller fixes for other popular models
* Added single model run for verify models
* Resolving format issues
* Updated to include verification of the R1 distills
* fixed bug where cfg.d_mlp was sometimes dropped
* Additional batch of model verification and verification system improvements
* Gemma3 issue resolutions
* Updating docs to add a prefix filter
* Updating supported models to properly track text quality
* Additional Partial verification, added text quality to supported models, improve documentation
* Docstring test fix
* Fixed a small bug to ensure MoE weights are properly formatted. Improve memory estimate for verify_models. Add verification for 36 more high value models
* Updating docs copy

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Jonah Larson <jlarson@Jonahs-MacBook-Pro.local>
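
The last commit mentions an improved memory estimate for verify_models. The PR does not show how that estimate is computed, so the following is only a back-of-the-envelope sketch: parameter count times bytes per parameter, scaled by an assumed overhead factor. The function names, the 1.2 overhead, and the skip rule are all hypothetical.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def estimate_memory_gb(n_params: int, dtype: str = "float32", overhead: float = 1.2) -> float:
    """Rough memory needed to hold the weights, in GiB.

    `overhead` is an assumed fudge factor for activations and buffers.
    """
    return n_params * BYTES_PER_PARAM[dtype] * overhead / 1024**3


def should_skip(n_params: int, available_gb: float, dtype: str = "float32") -> bool:
    """Skip verification when the estimate exceeds the available memory."""
    return estimate_memory_gb(n_params, dtype) > available_gb
```

An estimate like this lets a batch run mark oversized models as skipped up front instead of failing partway through a load.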

jlarson4 and others added 4 commits February 19, 2026 15:19

jlarson4 merged commit d5561da into dev-3.x Feb 26, 2026
44 of 45 checks passed

jlarson4 merged 4 commits into dev-3.x from dev-3.x-canary


2 participants
