Clean up shutterstock descriptions #4615
base: main
c64ef74 to d678af7
This is great! 🙏 I think it might be better to add a certain belt (with braces). It doesn’t happen as often with Shutterstock any more, but it used to (so it will at least affect migration of old images). It is also a common occurrence for e.g. Getty’s intermediaries (Anadolu etc., example PROD id This is the case (PROD id Current code cleans up the removed string without checking whether it contains only tokens that are already available in their proper fields. This prevents humans from being able to fix the byline manually.
Good shout, e753e11 adds a check to only remove the credit field if the first part of the slash-delimited byline in the description matches the metadata. It includes cd2de8ef064b2a137f32a0250447cf454afe0fe1 as a fixture. I presume the person we care about is (edit) always the first one?
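For readers following along, here is a rough sketch of the kind of check described above. It is not the code from e753e11: the object and method names are hypothetical, and it assumes the photographer is always the first slash-delimited part of the byline, which is the open question in the comment.

```scala
object BylineMatchSketch {
  // Hypothetical helper, not taken from the PR: returns true when the first
  // slash-delimited part of the byline found in the description equals the
  // byline already held in metadata (ignoring surrounding whitespace).
  def firstPartMatches(descriptionByline: String, metadataByline: Option[String]): Boolean = {
    val firstPart = descriptionByline.split("/").headOption.map(_.trim)
    firstPart.exists(part => part.nonEmpty && metadataByline.map(_.trim).contains(part))
  }
}
```

Under that assumption, a description byline of `Jane Doe/Shutterstock` matches a metadata byline of `Jane Doe`, while a missing or different metadata byline means the credit stays in the description.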
c7c6e18 to 7c4fab0
removing special instructions and credit information when they are already defined in metadata
…ant to accommodate)
7c4fab0 to 3e42726
private def matchMandatoryCreditBylines(suppliersReference: String) = s"Mandatory Credit: Photo by (.*)?\\(${Regex.quote(suppliersReference)}\\)\n"
I don't think the group should be optional, and we can be slightly neater by leaving a space between the groups.
- private def matchMandatoryCreditBylines(suppliersReference: String) = s"Mandatory Credit: Photo by (.*)?\\(${Regex.quote(suppliersReference)}\\)\n"
+ private def matchMandatoryCreditBylines(suppliersReference: String) = s"Mandatory Credit: Photo by (.*) \\(${Regex.quote(suppliersReference)}\\)\n"
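To illustrate the difference between the two patterns, here is a small self-contained sketch. The two pattern strings are copied from the suggestion; the wrapper object, the `byline` helper and the sample description (including the reference `12345a`) are made up for illustration.

```scala
import scala.util.matching.Regex

object MandatoryCreditSketch {
  // Pattern as originally written in the diff: optional group, no separating space.
  def optionalGroup(suppliersReference: String): Regex =
    s"Mandatory Credit: Photo by (.*)?\\(${Regex.quote(suppliersReference)}\\)\n".r

  // Pattern as suggested in review: mandatory group followed by a literal space.
  def requiredGroup(suppliersReference: String): Regex =
    s"Mandatory Credit: Photo by (.*) \\(${Regex.quote(suppliersReference)}\\)\n".r

  // Hypothetical helper: extracts the captured byline, if the pattern matches.
  def byline(pattern: Regex, description: String): Option[String] =
    pattern.findFirstMatchIn(description).flatMap(m => Option(m.group(1)))
}

object MandatoryCreditDemo extends App {
  import MandatoryCreditSketch._
  // Invented sample description, not a real fixture.
  val description = "Mandatory Credit: Photo by Jane Doe/Shutterstock (12345a)\nA caption."
  println(byline(optionalGroup("12345a"), description)) // Some(Jane Doe/Shutterstock ) with a trailing space
  println(byline(requiredGroup("12345a"), description)) // Some(Jane Doe/Shutterstock)
}
```

With the original pattern the captured byline keeps the trailing space before the bracketed reference; the suggested pattern consumes that space outside the group, so the capture can be compared with metadata without trimming.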
What does this change?
Clean up Shutterstock image descriptions by removing special instructions and credit information when they are already wholly defined in metadata. When they are not in the metadata, we leave them in the description. This is especially important if the description is the only place a photographer's byline is included.
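As a rough end-to-end illustration of the credit half of that rule, the sketch below strips the credit line only when the metadata already carries the same byline. Everything here is an assumption for illustration: `Meta` is a simplified stand-in rather than the project's real model, the pattern is borrowed from the review suggestion, and special instructions are not handled.

```scala
import scala.util.matching.Regex

object DescriptionCleanupSketch {
  // Hypothetical, simplified metadata holder (not the project's real type).
  final case class Meta(
    description: String,
    byline: Option[String],
    suppliersReference: Option[String]
  )

  // Assumed credit-line pattern, taken from the review suggestion.
  private def creditLine(ref: String): Regex =
    s"Mandatory Credit: Photo by (.*) \\(${Regex.quote(ref)}\\)\n".r

  // Remove the credit line only if its first slash-delimited part equals the
  // metadata byline; otherwise return the metadata unchanged.
  def clean(meta: Meta): Meta = {
    val cleaned = for {
      ref    <- meta.suppliersReference
      byline <- meta.byline
      m      <- creditLine(ref).findFirstMatchIn(meta.description)
      if m.group(1).split("/").headOption.map(_.trim).contains(byline.trim)
    } yield meta.description.replace(m.matched, "")
    meta.copy(description = cleaned.getOrElse(meta.description))
  }
}
```

If the byline is missing from metadata or differs from the one in the description, `clean` leaves the description untouched, so the credit is never lost from its only location.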
How should a reviewer test this change?
How can success be measured?
Staff spend less time cleaning captions.
Who should look at this?
Tested? Documented?