Fix #4225: Dynamically generate manifest extensions from package handlers #4857
HasTheDev wants to merge 2 commits into aboutcode-org:develop
Signed-off-by: HasTheDev <hassanazam2021@gmail.com>
3c8e1a2 to 7a79419
…age handlers Signed-off-by: HasTheDev <122232470+HasTheDev@users.noreply.github.com>
7a79419 to 5b358b7
AyanSinhaMahapatra left a comment
@HasTheDev thank you for your PR, but your code does not work as intended.
This needs some major changes and restructuring, and also needs to use newer code which was merged and is helpful for this functionality.
      "is_script": false,
      "is_legal": true,
    - "is_manifest": false,
    + "is_manifest": true,
This is wrong, not a manifest.
The path of copyright files should be checked in its entirety to validate whether this is a Debian copyright file, and only then can it be classified as a manifest.
We possibly need to maintain a reject list of datafile handlers which should be ignored while checking whether a file is a manifest, because:
- sometimes the path pattern is not enough and the is_datafile() checks happen in functions that either perform more checks or open the files
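The reject-list idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the handler entries, the `REJECTED_DATASOURCE_IDS` name, and the `is_manifest()` helper are hypothetical stand-ins, not the project's actual code, and `fnmatch` is used here in place of whatever matcher the toolkit really uses.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-ins for registered datafile handlers, written as
# (datasource_id, path_patterns). The real handlers are classes.
HANDLERS = [
    ("npm_package_json", ("*/package.json",)),
    ("debian_copyright_in_source", ("*/debian/copyright",)),
    ("debian_copyright_standalone", ("*/copyright", "*_copyright")),
]

# Hypothetical reject list: handlers whose path patterns are too loose to
# classify a file as a manifest without opening it.
REJECTED_DATASOURCE_IDS = frozenset(["debian_copyright_standalone"])


def is_manifest(path):
    """Return True if the whole path matches a non-rejected handler pattern."""
    # Normalize to a POSIX-style path with a leading slash so that patterns
    # such as "*/package.json" also match a file at the root.
    path = "/" + path.replace("\\", "/").lstrip("/")
    for datasource_id, patterns in HANDLERS:
        if datasource_id in REJECTED_DATASOURCE_IDS:
            continue
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return True
    return False
```

With these stand-ins, a path such as `pkg/debian/copyright` is matched on its full path, while a bare `docs/copyright` is left unclassified because its only matching handler is on the reject list.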
    # Seed the set with the original legacy list to appease old, rigid tests
    manifest_ends = set([
        'package.json',
Why do you have this list? It is not dynamically extracted; it just replaces one static list with another, even though you have additional checks.
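For contrast, a derivation that relies only on the registered handlers, with no seeded static list, might look like the sketch below. The handler classes and `load_handlers()` are hypothetical stand-ins so that the example runs on its own; in the real module the handlers would presumably come from a function-local import of `APPLICATION_PACKAGE_DATAFILE_HANDLERS`, as the PR description mentions.

```python
from functools import lru_cache


# Hypothetical stand-ins for registered package datafile handler classes.
class NpmPackageJsonHandler:
    path_patterns = ("*/package.json",)


class GemspecHandler:
    path_patterns = ("*.gemspec",)


class NoPatternHandler:
    # Some handlers may declare no patterns at all.
    path_patterns = ()


def load_handlers():
    # Assumption: the real code would do a deferred, function-local import
    # here (to sidestep the circular import) instead of returning stand-ins.
    return [NpmPackageJsonHandler, GemspecHandler, NoPatternHandler]


@lru_cache(maxsize=None)
def get_manifest_patterns():
    """Collect every path pattern declared by the handlers, computed once."""
    patterns = set()
    for handler in load_handlers():
        patterns.update(getattr(handler, "path_patterns", None) or ())
    return frozenset(patterns)
```

Caching the result means the handlers are only walked on first use, so nothing has to be assigned at import time at the bottom of the module.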
    from commoncode.fileutils import file_base_name


    def get_dynamic_manifest_ends():
        """
See how we now have a package manifest patterns index in https://github.com/aboutcode-org/scancode-toolkit/pull/4606/changes#diff-13120b0eb8c69b520b66229f7090c12b1102859a76ddd19b4789de2ed1b8818cR85-R100. Can you reuse this to do a fast manifest pattern check directly?
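To illustrate what a fast pattern check could mean in general, the sketch below compiles a set of glob patterns into a single regular expression once and reuses it for every path. It is not the index from the linked PR, whose structure is not shown here; the `PATTERNS` values and the helper names are assumptions made for the example.

```python
import fnmatch
import re

# Hypothetical sample of handler path patterns.
PATTERNS = ("*/package.json", "*.gemspec", "*/debian/copyright")


def build_matcher(patterns):
    """Compile all glob patterns into one alternation, built only once."""
    # fnmatch.translate() turns a glob into a regex source string; each one
    # is wrapped in a non-capturing group before joining with "|".
    combined = "|".join("(?:%s)" % fnmatch.translate(p) for p in patterns)
    return re.compile(combined)


MANIFEST_MATCHER = build_matcher(PATTERNS)


def matches_manifest_pattern(path):
    """Return True if the full path matches any of the compiled patterns."""
    return MANIFEST_MATCHER.match(path) is not None
```

A single precompiled matcher avoids looping over every handler for each scanned file, which is the kind of saving a shared patterns index would also provide.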
Fixes #4225
Replaces the hardcoded `_MANIFEST_ENDS` list in `src/summarycode/classify.py` with a dynamically generated set of manifest file extensions from `APPLICATION_PACKAGE_DATAFILE_HANDLERS`.
Key Updates:
- Created `get_dynamic_manifest_ends()` to safely extract `path_patterns` from registered package handlers.
- Placed the import of `APPLICATION_PACKAGE_DATAFILE_HANDLERS` locally inside the function and moved the assignment to the bottom of the file to prevent the circular import issue.
- Added a fallback to seed the dynamic set with the legacy hardcoded list to ensure 100% backwards compatibility with outdated test suite files (like `elm-package.json`, `project.clj`, and `metadata`).
- All `test_classify.py` tests passing locally.
Tasks
Reviewed contribution guidelines
PR is descriptively titled 📑 and links the original issue above 🔗
Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
Run tests locally to check for errors.
Commits are in uniquely-named feature branch and has no merge conflicts 📁
Updated documentation pages (if applicable)
Updated CHANGELOG.rst (if applicable)
Signed-off-by: HasTheDev <122232470+HasTheDev@users.noreply.github.com>