Update spdx_parser.py to handle spdx file parsing logic to generate correct key value pair of dictionary #5
…orrect key value pair of dictionary. Additional regex logic helps find the correct delta between two SPDX files. Without this fix, ProductName is sometimes recorded incorrectly, either as "v2" or without the full path of the software. This enhancement helps detect the same product name across two SPDX files.
@anthonyharrison kindly start the review.
anthonyharrison left a comment
I am not convinced that this is a valid or complete change.
packages = {}
package = ""
version = None
githubStr = "pkg.go.dev/"
This is not appropriate, as it only works for Go packages. sbomdiff needs to be generic for all SBOMs and languages.
Agreed. I will add a check so that this is done only for Go libraries.
Could you suggest how to find packages for other languages such as Java, Python, etc.?
> Could you suggest how to find packages for other languages such as Java, Python, etc.?
I don't think this is a valid change.
The difference in package names is due to differences between SBOM generators. Comparing SBOMs from different generators is a valid use case, and I can see how useful it is to detect that the generators are creating different names for the same package. However, making sbomdiff cater for different generators and establish that different package names refer to the same package is beyond its scope, as it is not viable to accommodate the naming approaches adopted by all the different SBOM generators.
version = line[16:].strip().rstrip("\n")
if line_elements[0] == "PackageLicenseConcluded":
license = line_elements[1].strip().rstrip("\n")
if line_elements[0] == "PackageHomePage":
If we add this check, we also need to add a corresponding check for all SBOM formats, not just SPDX tag-value SBOMs.
Yes, I will add the same check for JSON, XML, YAML, and RDF.
Thanks @anthonyharrison
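For illustration only, here is a rough sketch of what the corresponding check might look like for SPDX JSON. It assumes the standard SPDX JSON field names `packages`, `name`, `versionInfo`, and `homepage`; the function name `parse_spdx_json` and the regex are hypothetical and are not taken from sbomdiff or from this PR. Like the tag-value change, it only recognises Go packages.

```python
import json
import re

# Hypothetical pattern: captures the module path that follows "pkg.go.dev/".
GO_HOMEPAGE_RE = re.compile(r"pkg\.go\.dev/(?P<path>[^\s?#]+)")


def parse_spdx_json(text):
    """Return {package_name: version} from an SPDX JSON document (sketch)."""
    document = json.loads(text)
    packages = {}
    for entry in document.get("packages", []):
        name = entry.get("name")
        if name is None:
            continue
        # Prefer the full Go module path when the homepage points at pkg.go.dev.
        match = GO_HOMEPAGE_RE.search(entry.get("homepage") or "")
        if match:
            name = match.group("path").rstrip("/")
        packages[name] = entry.get("versionInfo")
    return packages
```

The XML, YAML, and RDF parsers would need an equivalent step, which is part of why the check has to be repeated per format.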
Problem statement:
The same software is reported as a difference between two SPDX files: the generated report shows the product as removed and then added back.
If spdx file [1] contains:
If spdx file [2] contains:
Command used to generate report:
# python ./cli.py --sbom spdx -o sbom_diff_reports.txt -f text file1.spdx file2.spdx
# echo $?
1
cat sbom_diff_reports.txt
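The behaviour in the problem statement can be reproduced with a minimal sketch of a name-keyed comparison. This is not sbomdiff's actual implementation; `diff_packages` and the package name `github.com/example/lib/v2` are hypothetical, chosen only to show why one package recorded under two names appears as one removal plus one addition.

```python
def diff_packages(old, new):
    """Compare two {package_name: version} dictionaries by name."""
    old_names, new_names = set(old), set(new)
    return {
        "removed": sorted(old_names - new_names),
        "added": sorted(new_names - old_names),
        "changed": sorted(
            name for name in old_names & new_names if old[name] != new[name]
        ),
    }


# The same package, recorded under two different names (hypothetical data).
file1_packages = {"v2": "2.1.0"}
file2_packages = {"github.com/example/lib/v2": "2.1.0"}

result = diff_packages(file1_packages, file2_packages)
print(result)
```

Because the comparison is keyed on the package name alone, the version match is never considered and the package is listed under both "removed" and "added".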
Solution:
Additional regex logic helps find the correct delta between two SPDX files. Without this fix, ProductName is sometimes recorded incorrectly, either as "v2" or without the full path of the software. This enhancement helps detect the same product name across two SPDX files.
With the fix, the report looks like this:
cat sbom_diff_reports.txt
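The idea behind the solution can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: it assumes the full package path can be recovered from a `PackageHomePage` value containing `pkg.go.dev/` (the marker visible in the diff), and the function name, regex, and sample package are hypothetical. It applies to Go packages only.

```python
import re

# Hypothetical pattern: captures the module path that follows "pkg.go.dev/".
GO_HOMEPAGE_RE = re.compile(r"pkg\.go\.dev/(?P<path>[^\s?#]+)")


def parse_spdx_tag_value(text):
    """Return {package_name: version} from SPDX tag-value text (sketch)."""
    packages = {}
    name = None
    version = None
    for line in text.splitlines():
        tag, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Tag: value" line
        tag, value = tag.strip(), value.strip()
        if tag == "PackageName":
            if name is not None:
                packages[name] = version  # store the previous package
            name, version = value, None
        elif tag == "PackageVersion":
            version = value
        elif tag == "PackageHomePage" and name is not None:
            match = GO_HOMEPAGE_RE.search(value)
            if match:
                # Replace a short name such as "v2" with the full module path.
                name = match.group("path").rstrip("/")
    if name is not None:
        packages[name] = version
    return packages
```

With this normalisation, a package recorded as `v2` in one file and by its full path in the other resolves to the same dictionary key, so a name-keyed diff no longer reports it as removed and added.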
Reviewers:
@anthonyharrison @briancaine