Experiment with multiple regularization terms: one for addition, another for multiplication, and others for pow, log, etc. A better approach might be to use random noise as the second operand instead of other data points from the dataset.
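The note does not define the regularization terms, so the following is only a minimal sketch of one possible reading. It assumes each term penalizes the gap between f(op(x, z)) and op(f(x), f(z)) for a model f, where the second operand z is either random noise or another data point from the batch. The names `OPS`, `op_penalty`, `total_regularizer`, `weights`, and `noise_scale` are hypothetical, and only addition and multiplication are shown; pow and log could be added the same way but would need inputs restricted to their domains.

```python
import numpy as np

# Hypothetical per-operation table; the original note does not define it.
OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


def op_penalty(f, x, z, op):
    """Mean squared gap between f(op(x, z)) and op(f(x), f(z))."""
    return float(np.mean((f(op(x, z)) - op(f(x), f(z))) ** 2))


def total_regularizer(f, x, weights, rng, use_noise=True, noise_scale=1.0):
    """Weighted sum of per-operation penalties (assumed form).

    The second operand z is Gaussian noise when use_noise is True,
    otherwise a shuffled copy of the batch (other data points).
    """
    if use_noise:
        z = rng.normal(scale=noise_scale, size=x.shape)
    else:
        z = x[rng.permutation(len(x))]
    return sum(w * op_penalty(f, x, z, OPS[name]) for name, w in weights.items())
```

With separate weights per operation, each term can be switched on or off (weight 0) to compare its effect, and the `use_noise` flag allows a direct comparison between noise operands and dataset operands.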