It's beginning to look like empirical evidence from users may be the proper foundation of a test protocol. But there still has to be a quantifiable way to measure damage to the liner, with the results compared in a double-blind setup.
In other words, you first have to model the behavior or reaction, then control the conditions, and finally have a viable way to measure the outcome.

Guess I've been in clinical trials programming too long!

Carl