Adding corruptions

Adding a new corruption to our code takes two steps: writing a function and adding a dictionary entry that points to that function.

For meaning-altering and meaning-preserving corruptions, add the function to gather_corruptions.py. It should take an entry (a dictionary containing a SICK dataset entry) and return 0 if no corruption is found, 1 if sentence A is corrupted, or 2 if sentence B is corrupted. Then add the function to that file's global dictionary corruptions using this syntax:

'corr_name': (function, 'description'),

Additionally, for meaning-preserving corruptions, the corruption name needs to be added to the global set m_p in metrics.py.
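The steps above for a detection function can be sketched as follows. This is only an illustration: the corruption name added_not and its logic are hypothetical, and the keys 'sentence_A' and 'sentence_B' are an assumption about how an entry stores the two sentences, so adjust them to match the actual entry layout.

```python
# Sketch of a detection function for gather_corruptions.py.
# Assumption: an entry stores its sentences under 'sentence_A' and 'sentence_B'.

def added_not(entry):
    """Hypothetical detector: flags the sentence that contains 'not'
    when the other sentence does not.

    Returns 0 for no corruption found, 1 if sentence A is corrupted,
    2 if sentence B is corrupted.
    """
    a_has_not = 'not' in entry['sentence_A'].lower().split()
    b_has_not = 'not' in entry['sentence_B'].lower().split()
    if a_has_not and not b_has_not:
        return 1
    if b_has_not and not a_has_not:
        return 2
    return 0


# Stand-in for the file's global dictionary; in the real file you would
# add only the new entry to the existing dictionary.
corruptions = {
    'added_not': (added_not, "one sentence contains 'not', the other does not"),
}
```

Called on an entry whose sentence A is "A jet is flying" and whose sentence B is "A jet is not flying", this sketch returns 2.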

For fluency-disruption corruptions, add the new function to generate_corruptions.py. It should take a sentence and return a corrupted sentence. Then add the function to the global corruptions dictionary using the following syntax:

'corr_name': function,
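A fluency-disruption corruption could look like the sketch below. The corruption repeat_last_word is hypothetical and chosen only because it needs no parsing; it is not one of the corruptions shipped with the code.

```python
# Sketch of a generator function for generate_corruptions.py.

def repeat_last_word(sentence):
    """Hypothetical corruption: repeat the final word of the sentence."""
    words = sentence.split()
    if not words:
        # Nothing to repeat in an empty sentence; return it unchanged.
        return sentence
    return ' '.join(words + [words[-1]])


# Stand-in for the file's global dictionary; in the real file you would
# add only the new entry to the existing dictionary.
corruptions = {
    'repeat_last_word': repeat_last_word,
}
```

With this sketch, corruptions['repeat_last_word']('A jet is flying') gives 'A jet is flying flying'.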

If you wish to share a new corruption, submit a pull request on GitHub.

Current corruptions

| Name | Before | After |
| --- | --- | --- |
| negated subject | "A man is playing a harp" | "There is no man playing a harp" |
| negated action | "A jet is flying" | "A jet is not flying" |
| antonym replacement | "A dog with short hair" | "A dog with long hair" |
| active to passive | "A man is cutting a potato" | "A potato is being cut by a man" |
| synonymous phrases | "A dog is eating a doll" | "A dog is biting a doll" |
| determiner substitution | "A cat is eating food" | "The cat is eating food" |
| double PP | "A boy walks at night" | "A boy walks at night at night" |
| remove head from PP | "A man danced in costume" | "A man danced costume" |
| re-order chunked phrases | "A woman is slicing garlics" | "Is slicing garlics a woman" |