Sunday, October 20, 2013

A Questionable Study on Digital Sampling's Impact

At Volokh Conspiracy, Stewart Baker has a nicely titled post discussing a recently posted paper by W. Michael Schuster on artists who create mashups that digitally sample copyrighted works. I've posted previously about these mashups, specifically on whether mashup creators could copyright a mashup and sue others who create similar mashups.

Here is the (lengthy) abstract of Schuster's paper:

This Article presents an empirical study on the effect that digital sampling has on sales of copyrighted songs and how this effect should influence the fair use analysis. To conduct this research, a group of previously sampled songs had to be identified and sales information for these songs collected. The over 350 songs sampled in musician Gregg Gillis’s (AKA Girl Talk’s) most recent album presents an ideal dataset because the album’s instantaneous popularity allows for its influence to be analyzed through a comparison of the sampled songs’ sales immediately before and after release. Collecting and comparing sales information for these songs found that — to a 92.5% degree of statistical significance — the copyrighted songs sold better in the year after being sampled relative to the year before. To the extent that the Copyright Act instructs courts to analyze (among other considerations) the effect that an alleged fair use has on the potential market for the original work, these findings favor the conclusion that digital sampling is a fair use (though each statutory fair use consideration should still be reviewed).  
Additionally, the songs sampled in the subject album were evaluated to ascertain the length of each sample and to what degree each sampled song had experienced prior commercial success. This collected data was used to test the hypothesis that sampled songs which were more recognizable to listeners (e.g., songs that were commercial hits or songs that were sampled for a relatively longer period) would see a greater sales increase after being sampled. The collected data did not find a correlation in post-sampling sales increases and sample length or prior commercial success, but further study may be warranted. 
Beyond supporting the premise that digital sampling may constitute fair use, the results of this study raise several notable issues and subjects for future study. One such issue is that courts only address an alleged fair use’s effect on the market for the original as a binary system, wherein the only options are harm to the market (disfavoring fair use) or no harm to the market (favoring fair use). There is no accepted rule on how to treat a market benefit (such as the one evidenced here). The failure to address this issue is questionable because a market benefit actually furthers the utilitarian goal of copyright by incentivizing the creation of new works through economic gain. The current research makes clear the need for precedent on how the fair use analysis should treat actions (e.g., digital sampling) that may increase sales of the original work. Additionally, this study sets the ground work for an objective financial review of fair use and market effect, which would yield needed predictability and stability to the fair use doctrine (at least, with regard to digital sampling).
I'm not one to hesitate to criticize the empirical methodology of studies, but in this case, Baker got there first.  He notes:

Actually, though, I think the article is a little too comforting. I am always skeptical of scholarly research that reinforces academic prejudices, since scholars tend adjust their standards of proof to fit their prejudices. Hostility to copyright is pretty much the norm in academic circles, and if you read the article skeptically, it loses much of its persuasiveness. Schuster achieves his results by playing with the sample, dropping nine songs from a sample of about 200 because they completely wreck his argument. His reason for dropping the songs is that they were hits in the 30 months prior to the release of Girl Talk’s album, and hits by definition suffer declining sales after topping out. If he didn’t drop those songs, Schuster’s data would show a 50% drop in sales of the songs that Girl Talk samples. 
Schuster says he’s just correcting for noise in the data, and it isn’t appropriate to charge Girl Talk with the natural rhythm of pop music sales. Maybe so, but once you start making big after-the-fact adjustments to a sample of 200, you can prove pretty much anything. At best, Schuster has developed an interesting hypothesis that ought to be tested by a new experiment untainted by data cherry-picking.
The only point that I would add to Baker's reaction is that I was already suspicious of the study by the time I finished the abstract, because of Schuster's note that his study arrived at a conclusion "to a 92.5% degree of statistical significance." This is an oddly specific way of framing the results. While I am not an expert on statistics, I think that this phrasing is a way of avoiding an admission that the result does not meet the typically accepted thresholds of P=.05 or P=.01 (for the really strong claims). Those P-values would translate into levels of statistical significance of 95% and 99%, respectively. On that reading, a 92.5% figure corresponds to a P-value of about .075, so it would seem that Schuster's study falls short of both widely accepted levels. For more on statistical significance thresholds, see these posts about the .05 threshold at the Empirical Legal Studies blog here and here, and a post cautioning against overreliance on these thresholds here.
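The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration under my own assumption that a "degree of statistical significance" of X% means a P-value of 1 - X/100; the paper's abstract does not spell out its definition, and the function names here are made up for the example.

```python
# Assumption (mine, not the paper's): "X% degree of statistical
# significance" is read as a p-value of 1 - X/100.

CONVENTIONAL_ALPHAS = (0.05, 0.01)  # the usual .05 and .01 thresholds


def p_value_from_significance(significance_pct: float) -> float:
    """Convert a significance level in percent to the implied p-value."""
    # Round to suppress floating-point noise (e.g. 1 - 0.95 != 0.05 exactly).
    return round(1.0 - significance_pct / 100.0, 10)


def thresholds_met(p_value: float, alphas=CONVENTIONAL_ALPHAS) -> dict:
    """Report, for each threshold, whether the p-value is at or below it."""
    return {alpha: p_value <= alpha for alpha in alphas}


if __name__ == "__main__":
    p = p_value_from_significance(92.5)
    print(f"implied p-value: {p}")
    for alpha, met in thresholds_met(p).items():
        print(f"meets the {alpha} threshold: {met}")
```

Under that assumption, the 92.5% figure clears neither conventional threshold, while a 95% figure would sit exactly at the .05 line.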

In the spirit of that last post I mentioned above, I don't think that this worry about significance should be fatal to Schuster's study, but I think that it, combined with Schuster's selection methods, should raise some doubts.  At the same time, however, I think that Schuster is on to something interesting, and a wider study may well lead to more solid results in favor of his thesis.
