Adding a new corruption to our code requires two things: a function that implements it, and a dictionary entry that points to that function.
For meaning-altering and meaning-preserving corruptions, the function needs to be added in gather_corruptions.py. It should take an entry (a dictionary containing a SICK dataset entry) and return 0 if no corruption is found, 1 if sentence A is corrupted, or 2 if sentence B is corrupted. The function then needs to be added to that file's global dictionary corruptions, following this syntax:
'corr_name' : (function, 'description'),
Additionally, for meaning-preserving corruptions, the corruption name needs to be added to the global set m_p in metrics.py.
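As a rough illustration of the contract described above, here is a minimal sketch of a detector function and its dictionary entry. The name simple_negation, the entry keys 'sentence_A' and 'sentence_B', and the detection rule itself are assumptions made for this example; they are not taken from gather_corruptions.py, so check the keys your entries actually use.

```python
def simple_negation(entry):
    """Hypothetical detector: one sentence equals the other plus the word 'not'.

    Returns 0 if no corruption is found, 1 if sentence A is the corrupted
    one, and 2 if sentence B is the corrupted one.
    """
    # Assumed key names; the real entry dictionary may use different ones.
    a = entry['sentence_A'].lower().split()
    b = entry['sentence_B'].lower().split()

    a_negated = 'not' in a
    b_negated = 'not' in b
    # Exactly one of the two sentences must contain 'not'.
    if a_negated == b_negated:
        return 0
    # Apart from 'not', the two sentences must be identical.
    if [t for t in a if t != 'not'] != [t for t in b if t != 'not']:
        return 0
    return 1 if a_negated else 2


# Entry in the global dictionary, following the syntax shown above.
corruptions = {
    'simple_negation': (simple_negation, 'one sentence is the other plus "not"'),
}
```

With these assumptions, an entry pairing "A jet is flying" with "A jet is not flying" returns 2, and swapping the two sentences returns 1.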
For fluency-disruption corruptions, the new function needs to be added in generate_corruptions.py, and it should take a sentence and return a corrupted sentence. The new function needs to be added in the global corruptions dictionary using the following syntax:
If you wish to share a new corruption, submit a pull request on GitHub.
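For the fluency-disruption case, the function only needs to map a sentence to a corrupted sentence. The sketch below is a deliberately naive example of that shape: it drops the first preposition it finds. The function name drop_first_preposition and the small PREPOSITIONS set are hypothetical and are not part of generate_corruptions.py.

```python
# Small illustrative preposition list; an assumption, not the repository's.
PREPOSITIONS = {'in', 'on', 'at', 'by', 'with', 'from', 'to', 'of'}


def drop_first_preposition(sentence):
    """Hypothetical fluency disruption: remove the first preposition found.

    Takes a sentence (string) and returns a corrupted sentence (string).
    If the sentence has no preposition from the list, it is returned unchanged.
    """
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok.lower() in PREPOSITIONS:
            # Rebuild the sentence without the preposition at position i.
            return ' '.join(tokens[:i] + tokens[i + 1:])
    return sentence
```

For example, drop_first_preposition("A man danced in costume") gives "A man danced costume". A real corruption would likely rely on a parser or chunker rather than a fixed word list.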
|corruption||original sentence||corrupted sentence|
|negated subject||"A man is playing a harp"||"There is no man playing a harp"|
|negated action||"A jet is flying"||"A jet is not flying"|
|antonym replacement||"A dog with short hair"||"A dog with long hair"|
|active to passive||"A man is cutting a potato"||"A potato is being cut by a man"|
|synonymous phrases||"A dog is eating a doll"||"A dog is biting a doll"|
|determiner substitution||"A cat is eating food"||"The cat is eating food"|
|double PP||"A boy walks at night"||"A boy walks at night at night"|
|remove head from PP||"A man danced in costume"||"A man danced costume"|
|re-order chunked phrases||"A woman is slicing garlics"||"Is slicing garlics a woman"|