add parse_audience to all taxila scrapers #1229
Open
mikesndrs wants to merge 5 commits into ElixirTeSS:master from
fbacall (Member) requested changes on Feb 13, 2026:
Can you add some tests for the auto_parse_vars feature?
Contributor
Pull request overview
This PR introduces a config-driven “auto-parse from description” mechanism (via JSON mapping files) intended to populate fields like target_audience/keywords during ingestion, and removes the per-scraper parse_audience(...) assignments from the Taxila ingestors.
Changes:
- Add feature.auto_parse_vars and JSON mapping files to auto-populate selected fields from description.
- Apply auto-parsing during add_event/add_material.
- Remove explicit target_audience parsing lines from multiple Taxila scrapers and add a unit test for the new behavior.
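To make the mechanism above concrete, here is a minimal sketch of keyword-based auto-parsing from a description. It is not the PR's implementation: the mapping shape (category name to a list of trigger keywords), the `auto_parse` helper, and the sample categories are all assumptions made for illustration, since the actual contents of the JSON mapping files are not shown here.

```ruby
require 'json'

# Hypothetical mapping in the spirit of auto_parser_mappings/target_audience.json.
# The real file's structure is not shown on this page; this shape is an assumption.
MAPPING_JSON = <<~JSON
  {
    "researchers": ["researcher", "postdoc"],
    "students": ["student", "phd candidate"]
  }
JSON

AUDIENCE_MAPPING = JSON.parse(MAPPING_JSON)

# Hypothetical helper: return every category whose keywords occur in the
# description, using a case-insensitive substring match.
def auto_parse(description, mapping)
  text = description.to_s.downcase
  mapping.select { |_category, keywords| keywords.any? { |kw| text.include?(kw.downcase) } }.keys
end

puts auto_parse('Workshop for PhD candidates and postdocs', AUDIENCE_MAPPING).inspect
```

Plain substring matching is the simplest choice but can over-match (for example, a keyword embedded in a longer word), so a real implementation might prefer word-boundary matching.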
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 26 comments.
| File | Description |
|---|---|
| test/unit/ingestors/ingestor_test.rb | Adds tests for the new auto_parse_vars ingestion behavior. |
| lib/ingestors/taxila/wur_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/uva_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/uu_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/utwente_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/tdcc_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/surf_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/rug_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/rdnl_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/oscm_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/oscd_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/odissei_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/nwo_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/maastricht_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/leiden_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/lcrdm_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/han_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/dtls_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/taxila/dans_ingestor.rb | Removes explicit target_audience assignment (now relying on auto-parse). |
| lib/ingestors/material_ingestion.rb | Adds auto-parsing for materials (currently contains a breaking variable reference). |
| lib/ingestors/event_ingestion.rb | Adds auto-parsing for events and replaces the previous parse_audience method. |
| lib/ingestors/auto_parser_mappings/target_audience.json | Adds keyword-to-audience-category mappings for auto-parsing. |
| lib/ingestors/auto_parser_mappings/keywords.json | Adds keyword-to-keyword-category mappings for auto-parsing. |
| config/tess.example.yml | Documents new feature.auto_parse_vars configuration option. |
Summary of changes
Target audience missing for some Taxila scrapers
Checklist
to license it to the TeSS codebase under the BSD license.