The following is a copy of the CMU Committee of Inquiry report looking into allegations about the Rimm study.

-----------------

CARNEGIE MELLON UNIVERSITY
Internal Memorandum

TO: Paul Christiano, Provost
FROM: George Duncan, Sara Kiesler, Mary Shaw
DATE: August 1, 1995
SUBJECT: Report of Committee of Inquiry on A Study of Marketing Pornography via Computer Networks

Pursuant to your request of July 17, the Committee of Inquiry has conducted a discreet inquiry of the research study published June 1995 in The Georgetown Law Journal (GLJ), volume 83 number 5, as Marketing pornography on the information superhighway: A survey of 917,410 images, descriptions, short stories, and animations downloaded 8.5 million times by customers in over 2,000 cities in forty countries, provinces and territories. This article was authored by Martin Rimm.

The Committee of Inquiry was formed because many persons have made allegations of research misconduct about this study. You have received such allegations, and many have been widely disseminated both inside and outside the university. Further, you have indicated that these allegations could not be resolved by confidential counseling and informal means.

As background, the research study was carried out and the article written while Martin Rimm was an undergraduate at Carnegie Mellon. Rimm arranged with faculty members to sponsor proposals for Small Undergraduate Research Grants (SURG) in Carnegie Mellon's Undergraduate Research Initiative program (URI) and to supervise independent study courses related to this research. Many other people helped him with the study. These included faculty members, staff, and students here, as well as other people outside the University.

The Committee reviewed allegations and complaints against the research and the study. Persons external to the University, members of the Carnegie Mellon faculty, staff, and students, and persons listed in the article as contributors have made over 200 such allegations.
In the course of the inquiry, more allegations emerged. These allegations, of course, contain duplications and overlaps; further, there was wide variance in the degree of care with which they were put forward. We distilled the list to 22 allegations for consideration. To simplify exposition and review, we have organized both the allegations and our discussion into categories broadly coinciding with the various phases of research. Appendix A documents the sources of external allegations.

In further discharge of its responsibilities, the Committee interviewed participants and others with relevant knowledge in order to decide whether there are sufficient grounds to investigate the allegations.

The Committee finds that there is sufficient cause to believe that misconduct in this research has occurred. It therefore unanimously recommends that a formal investigation be conducted, in accordance with Organization Announcement 320, Policy for Handling Alleged Misconduct in Research at Carnegie Mellon University.

In the subsequent sections of this report, the Committee will identify which allegations of misconduct it considers to be serious deviations from accepted practices in proposing, carrying out, or reporting results from the research. It also will indicate why it believes that there is reason for formal investigation of these allegations. The Committee believes a formal investigation will have to consider not only serious violations of research standards, but also a number of less serious problems in the research, including faculty oversight, student conduct, Carnegie Mellon policy violations, and adherence to scientific norms.

The nature of this research has made this inquiry complex. The research pertained to human sexuality, a topic that is emotionally and politically explosive. As a result of the human, social and political implications of the work, it would be given special scrutiny and held to high standards of scientific integrity.
At the same time, the research concerned interactions on computer networks, a relatively new domain where research standards and university policy have not been fully discussed and debated. (They should be.) Because of this context, and although a few of the problems are easily categorized as "serious" or "not serious," many of the allegations fall into a gray area. Since at this point we cannot always fully assess them, these issues should be investigated without presumption that they are in fact serious. Consistent with its mandate, the Committee has stopped short of resolving all questions of fact, assessment of responsibility, and judgments of appropriate actions by the University.

This report has the following sections:

I. Applicable standards and policies
II. Contributors to the research
III. Chronology of the research
IV. Findings
V. Summary of recommendations
Appendices: Public documents that serve as basis for allegations

I. Applicable standards and policies

The research was conducted by Martin Rimm while he was an undergraduate at Carnegie Mellon University. Policies applying to undergraduate research appear in the CMU Student Handbook (1994-1995). Among the sections relevant to this case are policies on Computing and Information Resources (pp. 48-49), on Human Subjects in Research (pp. 54-55), and on Cheating and Plagiarism (pp. 7-8).

Three faculty and at least one staff member are on record as having supervised and collaborated in the research. Other faculty and staff may have been involved as well. Policies applying to faculty appear in the CMU Faculty Handbook (1993-1994). Relevant sections include Statement on Individual Responsibility in Shared Computing Environments (pp. 35-36), Separation of Individual's and Institution's Interests (pp. 33-34), Organization Announcement 320, Policy for Handling Alleged Misconduct in Research (pp. 31-32), and Cheating and Plagiarism (pp. 34-35).
Policies applying to staff are stated in the staff handbook, Human Resources: A Guide. A policy on Confidentiality of Administrative Data appears as chapter 1, volume 7 of the Carnegie Mellon Policy Library. Appendix E collects relevant Carnegie Mellon policies.

These policies are relevant not only in their own right, but also because they help establish the accepted practices for research. Investigators may not determine, themselves, that the research carries minimal risk and need not be reviewed. Institutional Review Boards, in reviewing a proposal, evaluate the proposal against federal standards and ethical principles in the conduct of research with human participants.

In addition to Carnegie Mellon policy, research and other professional activities are governed by codes of ethics of various disciplines, for example the American Psychological Association (APA) in Appendix B, the Society for the Scientific Study of Sexuality (SSSS) in Appendix C, and the Institute of Electrical and Electronics Engineers (IEEE) in Appendix D.

II. Contributors to the research

Faculty advisors and collaborators (1994-5 academic year and 1994 summer).

Martin Rimm obtained the help of many people in this research. Some are listed in footnote 1 of the article in GLJ. Some are not listed, the article says, because they requested anonymity. Some, whether listed or not, have disavowed the research. Three faculty members are on record as having supervised the research with Martin Rimm, and considerable evidence from notes of meetings and e-mail shows that they collaborated on the research and the preparation of the article, and that they helped Martin Rimm in other ways to obtain a degree, apply to graduate school, and so forth. A staff member provided data to Martin Rimm.

Edward Zuckerman. Adjunct Associate Professor, Psychology. From Spring, 1994 to August, 1994, he was Martin Rimm's advisor on an undergraduate SURG grant.
According to Martin Rimm, he remained involved during the rest of the year, giving "management advice." Edward Zuckerman and David Banks provided e-mail corroboration of Zuckerman's participation in the research through Fall Semester, 1994.

David L. Banks. Associate Professor, Statistics. He began working with Martin Rimm on the research in July, 1994. He was the faculty member on an independent study course to carry out this research that Martin Rimm registered for in Fall, 1994. He remained closely associated with the project through Spring, 1995, except during the period December, 1994 - March, 1995.

Marvin Sirbu. Professor, Engineering and Public Policy (EPP, GSIA, ECE Departments). From August, 1994 to late spring, Marvin Sirbu was Martin Rimm's unofficial advisor. He also pursued a grant proposal at the Department of Justice (DoJ) with Martin Rimm and David Banks. He also was the faculty member on an independent study course to carry out this research that Martin Rimm registered for in Fall, 1994. He is identified by Martin Rimm in the Georgetown Law Journal as his "principal faculty advisor".

The principals provided us with corroboration, from e-mail exchanges and personal calendars, that David Banks and Marvin Sirbu met frequently with Martin Rimm (approximately once a week) during the Fall Semester, 1994, that they contributed ideas for the research, suggested methods of data collection (Sirbu), carried out statistical analyses (Banks), and provided comments on drafts of the article. Martin Rimm and Edward Zuckerman both report that Edward Zuckerman was involved less deeply in the study after August, 1994.

Other faculty or former faculty.

Timothy McGuire and Nancy Melone. Former GSIA faculty members at Carnegie Mellon. From Spring 1994, they consulted with Martin Rimm, since he was interested in the "marketing" aspects of pornography. Nancy Melone was interested in computers and marketing.
Martin Rimm says they were involved through the 1994-5 academic year, giving "management" advice.

Ronald A. Rohrer. Wilkoff University Professor. ECE. Academic advisor to Martin Rimm until spring, 1994. Martin Rimm said Ronald Rohrer read the pornographer's handbook (which Martin Rimm described as a take-off on Machiavelli). Martin Rimm said Ronald Rohrer encouraged Martin to do the research and that he continued to consult with Ronald Rohrer and tell him about the pornography study throughout the research.

Staff.

John G. Myers. Systems Programmer. Computing Services Administration (System Software Department). He gave data to Martin Rimm -- archives of newsgroup statistics going back through 1988.

Undergraduate Research Initiative staff (Barbara Lazarus, Associate Provost for Academic Affairs, and Jesse Ramey, Director) and committee. Approved SURG grants of $1,500. Gave Martin access to office, telephone, fax, etc.

Also cited in Martin Rimm's acknowledgment footnote are Chris Hendrickson and Robert P. Kail, both Associate Deans of CIT.

Other nonsupervisory persons involved, or said to be involved, with the research:

Patrick Aboyoun, programmer. Quit project when Martin refused to tell him the nature of the data to be coded.
Hal Wine. Computer scientist Martin met on the net. Helped Martin write a Perl script and taught Martin Perl. This person is not part of the Carnegie Mellon community.
Melissa Rosenstock. Did reliability checks on the parser software against the data dictionary. She was paid by Martin.
Ted Irani. Junior Math/CS. He was on the "parser team." Also involved later.
Paul Bordallo. HSS student. Same as above.
G. Alexander Flett. Sophomore Math/CS. (Role not established.)
Lisa Sigal. Ph.D. student, HSS (history). (Disavowed the study.)
Christopher Reeve. Sophomore ECE. (Role not established.)
CJ (Catherine) Taylor. Visiting Project Scientist. Robotics. (Role not established.)
Erikas Napjus. System Designer. Network Development.
A message he wrote for a network newsgroup was incorporated into a footnote. Exchanged e-mail with Martin Rimm through Fall Semester, 1994, and has published portions of it on the net.
Carolyn Speranza. Artist/Lecturer, Department of Art. Said to have provided illustrations for "The Pornographer's Handbook."

Other persons related to the research

Robert Thomas. A BBS operator who sold pornographic images, and who subsequently was convicted of obscenity, and who now is imprisoned and is pursuing an appeal.
Philip Elmer-DeWitt. Martin Rimm said he established a "relationship" with this reporter from Time, leading to a cover story. Martin Rimm says he gave DeWitt partial results before the publication of the Georgetown article.
Dean Kaplan. Editor on the Georgetown Law Journal who made the initial contacts for Martin Rimm. Two sources told us that Kaplan is Vice President of the National Coalition for the Protection of Children and Families and suggested that his political agenda influenced his editorial position.
Bob Flores. An Acting Deputy Chief in the Criminal Division, Department of Justice (DoJ). Discussed with Martin Rimm, David Banks, and Marvin Sirbu a proposal for them to analyze some data his division had seized from BBS operators.
Unnamed BBS operator. When first interviewed, Martin Rimm said he obtained customer information from this operator and persuaded the same BBS operator to obtain customer information from other BBS operators.

III. Chronology of the research (sources in parentheses):

Winter, 1993-4: Martin Rimm gave his academic ECE advisor, Ronald Rohrer, a copy of the pornographer's handbook, a spoof of Machiavelli. (Alternative title: The Pornographer King.) Ronald Rohrer advised Martin Rimm to turn it into a piece of research. (Martin Rimm)

March 18, 1994: Martin Rimm submitted a proposal, with Edward Zuckerman as advisor, to SURG. He applied for, and received, a $500 grant for the Summer/Fall grant period.
This grant was for a study of pornography that could be purchased and downloaded from commercial BBS operators using electronic networks. (SURG proposal)

May, 1994: Martin Rimm mailed form to Robert Thomas to join Amateur Action as a subscriber, and subsequently did subscribe. According to Robert Thomas, Martin Rimm kept complaining about how Thomas kept his statistics. Martin Rimm quit but later resubscribed (Robert Thomas' attorney). Martin Rimm, as a subscriber, was able to obtain statistics on which images were downloaded most, controlling for date on which the image was offered for sale. (Martin Rimm)

June, 1994: Marvin Sirbu and Martin Rimm met. Discussed telecommunications policy, rules and liability of sysops (Marvin Sirbu, Martin Rimm).

Spring/Summer: Enlisting several students and others to help, Martin Rimm developed a linguistic parser used to sort captions describing images into one of several categories of pornography, such as pedophilia, soft core, hard core, paraphilia, etc. Students helped test the reliability of the parser. This work was supervised by Edward Zuckerman, with help from Nancy Melone and Tim McGuire (Ed Zuckerman, Martin Rimm).

Spring/Summer: Edward Zuckerman lived close to Martin Rimm, and they saw one another outside of Carnegie Mellon. Martin Rimm told Edward Zuckerman that he would obtain customer (individual-level) data from BBS operators. They discussed using telephone numbers to estimate area codes for a community standards analysis. Edward Zuckerman, in answer to a query, stated that he understood Martin Rimm had obtained the data as a trade for something. He understood customers could be identified in the database, and assumed Martin Rimm would use the telephone numbers to obtain area codes and delete the identifying information. He didn't think it was a big deal. (Ed Zuckerman first interview.)

July, 1994: Martin Rimm walked into David Banks' office, and asked him how to analyze downloads.
(Subsequent meetings July 21, Aug 2, Aug 19, Aug 23, Aug 26.) David Banks recalled that Martin Rimm asked him to be his advisor. David Banks agreed to be the faculty supervisor of an independent study course for Martin Rimm. He said he was unaware of Marvin Sirbu's participation for 2 months. (David Banks)

August 10, 1994: Martin Rimm sent e-mail to Robert Thomas offering to help. (e-mail in footnote 1)

August, 1994: Martin Rimm proposed an independent study course to Marvin Sirbu. Martin Rimm showed Marvin Sirbu a draft of the pornography paper. Marvin Sirbu arranged for a special topics course for Martin Rimm, through EPP. The draft document was to be revised and become an INI report. (Marvin Sirbu)

September 3, 1994: Draft 3 of manuscript, dated September 3, 1994. (Ed Zuckerman)

September 13, 1994: Martin Rimm told his advisors that he was having discussions with a friend at GLJ, an editor on the review, who would forward a draft of the study to the senior editor to consider for publication. The friend was Dean Kaplan. (Edward Zuckerman; Robert Thomas' attorney)

September 30, 1994: Meeting of Martin Rimm, Marvin Sirbu, Edward Zuckerman, and David Banks. Agreement that Marvin Sirbu would be the main advisor of Martin Rimm (Marvin Sirbu, David Banks).

September (?), 1994: David Banks arranged to give a talk to the local chapter of the American Statistical Association on December 1 about the research. The talk would focus on David Banks' statistical analyses of the parser-coded captions, showing the most available pornography from BBS operators vs the types of pornography that customers downloaded most. (David Banks, Martin Rimm)

September, October, 1994: Weekly meetings of Martin Rimm with David Banks and Marvin Sirbu. Also plentiful e-mail. David Banks agreed to provide histograms for the draft paper. Martin Rimm asserted he would be sole author. David Banks discussed a second paper that would be coauthored with Martin Rimm.
David Banks obtained Martin Rimm's agreement that he would have access to the data. (David Banks)

Early October, 1994: Martin Rimm and David Banks discussed newsgroup statistics with John Leong, and Leong told them how it was done. (David Banks)

October, 1994: Martin Rimm sent to David Banks part of a database which showed the log files from one BBS pornographic service. The fields included name, telephone number, address, driver's license, age, dates and times at which pictures had been downloaded. (David Banks) In addition to that dataset, Martin Rimm also obtained another set of log files from a different BBS operator, described by Martin Rimm as a friend. This friend obtained further log files of customer-level information from friends who were in the same business. (Martin Rimm; also p. 1862 of GLJ article)

October 14, 1994: Meeting of Marvin Sirbu, Martin Rimm, David Banks, and Rick Carley. First part of discussion was ordinary faculty-student advising on such topics as Martin Rimm's graduate school choice and GREs. After Rick Carley left, the other faculty discussed with Martin Rimm their previous comments on the draft paper. (Rick Carley, David Banks)

Prior to October 15, 1994: Martin Rimm sent President Mehrabian e-mail informing him that his group had discovered imagery on the Usenet which had been declared obscene by several courts of law. He sent names and descriptions of these images. (David Banks)

October 15, 1994: Martin Rimm sent an outline of the structure and findings of the marketing pornography research to Erwin Steinberg, Michael Murphy, and Don Hale. (Don Hale)

October 17, 1994: Meeting among Martin Rimm, Marvin Sirbu, David Banks, Erwin Steinberg, Don Hale, and Michael Murphy to discuss sexually-oriented bboard trees on the Andrew bboards system. (Don Hale)

October 26, 1994: Martin Rimm submitted another SURG proposal for the Spring, 1995 grant period. His advisor on the proposal was David Banks. (SURG proposal)

Early November, 1994: A former employee of the U.S. Department of Justice, a friend of Martin Rimm's, proposed to Bob Flores that Flores contact Martin Rimm. Flores did so, thinking Rimm had data that Flores might want to have re-analyzed (Flores). David Banks drafted a proposal to the Department of Justice that would use the linguistic parser with DoJ files seized in the Amateur Action and Pequena Penacha cases. He sent the proposal to the INI, which would submit it. (David Banks)

November, 1994: Martin Rimm telephoned Michael Mehta (new PhD, now at York University). Martin Rimm said he was the principal investigator of an interdisciplinary research team of Carnegie Mellon faculty and requested copies of Mehta's study of sexually oriented images on the Internet. (Michael Mehta)

November 3, 1994: Bill Arms decided to withdraw 6 bboard trees (mainly alt.sex.binaries) from the Andrew system. Marvin Sirbu suggested that the researchers access AMS.prof files before the bboards became unavailable (Martin Rimm, David Banks). They did so on three occasions in the week prior to November 8, 1994. (Martin Rimm)

November 8, 1994: The new bboard policy went into effect.

November 9, 1994: A rally took place in protest of the Carnegie Mellon censorship action.

November 11, 1994: Meeting of David Banks and Martin Rimm with George Duncan concerning the Carnegie Mellon data. Approx 1 hour. Martin Rimm told David Banks this would help cover his ass (David Banks). George Duncan told them how to anonymize the files immediately, and explained how they had to aggregate categories to prevent individuals from being identified (David Banks, Martin Rimm, George Duncan). Martin Rimm agreed to obtain the help of an undergraduate to go through the confidential data, coding it for blind statistical analysis. (e-mail from David Banks) However, Rimm informed Banks he would not do this until Spring.
(David Banks)

November 13, 1994: Martin Rimm informed Edward Zuckerman, Marvin Sirbu and David Banks that GLJ was considering his article, and therefore there should be no press release and David Banks should not give his talk to the ASA on December 1. (David Banks)

November 13, 1994: Martin Rimm asked Bill Arms to remove andrew.ms.stats from the Andrew system because they would identify Carnegie Mellon (a university studied) in the subsequent article. Marvin Sirbu reprimanded Martin Rimm (Marvin Sirbu). Marvin Sirbu also asked Martin Rimm not to tell reporters he was a research associate, and not to cut corners. (David Banks, Marvin Sirbu)

November 21, 1994 (?): Time magazine published an article, Censoring Cyberspace. Subsequently Martin Rimm befriended Phil DeWitt, who authored the story (Martin Rimm).

November 28, 1994: Martin Rimm informed Edward Zuckerman, David Banks, Nancy Melone, Tim McGuire, Marvin Sirbu and 20 others that our article had been accepted for publication in the May, 1995 issue of GLJ. He also asked them to keep information about the study from others. He said the journal insisted on an embargo. Banks reminded Rimm that he was giving a talk on the study December 1. (David Banks)

November 28-30, 1994: David Banks and Martin Rimm argued about the talk on Dec. 1. Martin Rimm insisted David Banks cancel the talk. David Banks asked his department head, John Lehoczky, for help. David Banks offered to speak with the editor of the journal. Marvin Sirbu supported Martin Rimm and the journal editor (David Banks, John Lehoczky). But also he tried to act as a broker and suggested David give a talk on the methods without revealing any data or labels. (Marvin Sirbu)

December 1, 1994: David Banks gave his "content free" talk. (John Lehoczky) David Banks continued to complain during the next week, as he had done the statistical work. Martin Rimm was angry about the talk also.
Martin Rimm subsequently did not communicate with David Banks about the study, except to ensure Martin's grade would not be held hostage by the argument. (David Banks)

December 1, 1994: David Banks complained to Marvin Sirbu that Martin Rimm kept pulling participants into the research without keeping everyone informed. He complained he did not know where the legal footnotes came from or the quality of their supplier, or his/her bias. He also complained he knew of at least three undergraduates who might have or would have access to sensitive data. He said Martin Rimm needed close supervision. Marvin Sirbu agreed and said he would "get after" Martin Rimm. (David Banks)

December, 1994: Visit by Marvin Sirbu and Martin Rimm to Bob Flores. Discussion of work they might do for him: Marvin Sirbu and Martin Rimm proposed to do analyses of the files Justice seized in two BBS obscenity cases, including the files of Robert Thomas. (Flores) Although Bob Flores was sure that Martin Rimm was there, Marvin Sirbu's narrative did not mention his participation (Marvin Sirbu).

December, 1994: David Banks attempted to rejoin the project, asking to direct undergraduates working on the data and to work on the proposal. Martin Rimm rebuffed him. (David Banks, Marvin Sirbu)

December, 1994: Edward Zuckerman said he read revision L.7 of the manuscript (Edward Zuckerman)

Approx. Jan 21-23, 1995: David Banks requested that his analyses and figures be removed from the article if he were not included in the DoJ proposal. (David Banks, Marvin Sirbu)

Feb, 1995: Martin Rimm said he did a major revision of the article in February.

Mar, 1995: Martin Rimm visited with Bob Flores at DoJ.

May, 1995: David Banks visited Bob Flores about the DoJ proposal, learning then that Martin Rimm and Marvin Sirbu had already visited with Flores. (David Banks; visit confirmed by Flores)

July 3, 1995: Cyberporn cover story by Time Magazine. Discussed Robert Thomas; calls him the Marquis de Cyberspace.
July 14, 1995: David Banks disavowed research study (David Banks).

July, 1995: Robert Thomas learned in July, 1995 that subscriber Martin Rimm was the same person who did the Carnegie Mellon study featuring Thomas. (Robert Thomas' attorney)

Chronology not established:

Undated: John Myers provided archives to Martin Rimm, containing names of Carnegie Mellon persons having Andrew accounts (Myers e-mail to committee)

Undated: Martin Rimm acquired customer log files from Robert Thomas (first interview with Rimm/second SURG proposal/Zuckerman e-mail to committee)

Undated: Martin Rimm acquired customer log files from an unnamed BBS operator, who in turn obtained customer log files from other BBS operators (first interview with Rimm/second SURG proposal)

Undated: Martin Rimm acquired demographic data on Carnegie Mellon individual staff, faculty and students (claimed in article)

IV. Findings of the Committee of Inquiry

The findings of the inquiry relate to the study as authored by Martin Rimm. Other questions have arisen about the quality of participation from faculty advisors (Marvin Sirbu, David Banks, Ed Zuckerman), the oversight provided by the Undergraduate Research Initiative office, and actions of John Myers of the Academic Computing staff. As noted above, the situation is complex. The Committee has identified both questions of research misconduct and less serious allegations [1]. The Committee's findings are organized in categories reflecting phases of the research process, and within each phase in response to the allegations. Sources of allegations appear in Appendix A; those that arose during the inquiry rather than from outside sources are flagged with an asterisk.

A.
Preparation for research (IRB review, planning confidentiality procedures, obtaining organization and institutional permissions; deciding responsibilities for oversight)

The Committee finds that all of the allegations in this category should be investigated for research misconduct.

A.1 Absence of IRB review

An IRB process must be instituted for research involving human participants ("It is the responsibility of investigators who plan to use human subjects in research to obtain written consent from the IRB prior to conducting an investigation involving human subjects," CMU Student Handbook 1994-1995, p.55). The research described in the first SURG grant proposal and supervised by Edward Zuckerman does not involve human participants. This research involved creating software to automate content analysis of captions of sexually-oriented images. However, at some point during this research, Martin Rimm learned he could obtain individual-level data from BBS operators and proceeded to do so. He said he discussed this with Edward Zuckerman and other faculty and staff. The second SURG proposal discusses plans to obtain individual-level data. An IRB review was not performed, nor does it appear that the participants discussed whether such a review was required. Our evidence from records of conversations and discussions with the principal participants suggests that questions of ethical conduct arising from collection of data, risks to persons, or threats to the integrity of the scientific process were evaluated (if at all) after the data collection or other procedures were performed, not beforehand.

A.2* Failure to institute procedures for faculty oversight of student research and to create procedures to protect confidentiality of sensitive data.

Lapses in faculty oversight threatened the integrity of the work. In particular, procedures for protecting confidentiality are essential to this kind of work because leaks can put people at risk.
We also encountered questions regarding relations among faculty. According to at least two of his faculty advisors and others with whom the Committee talked, Martin Rimm routinely told persons, including co-workers and faculty, different, inaccurate, incomplete accounts of his activities and of what was happening on the research project. For instance, in August, 1994, he failed to tell David Banks that he had been working with Marvin Sirbu, and he failed to tell Marvin Sirbu that he had been working with David Banks. It is possible that in the two independent study courses Martin Rimm took during Fall, 1994, one with Marvin Sirbu and one with David Banks, the same work may have been used to obtain course credit in each.

[1] "Research misconduct", in this case and in accordance with Organization Announcement 320, refers to "Serious deviation from accepted practices in preparing, carrying out or reporting results from research".

B. Collection of data from individuals

B.1 Obtaining personal information on private BBS customers and not asking for their informed consent

B.2 Failure to obtain informed consent from Carnegie Mellon individuals whose AMS.prof files were accessed for research

The Committee believes that most of the individual-level data in this research were obtained from databases, rather than directly from individuals. Hence allegations B.1 and B.2 do not warrant investigation.

B.3 Deceptive collection of customer information from BBS operators

Data obtained from Robert Thomas and other unnamed BBS operators are a different matter. The Committee was informed by an attorney in contact with Thomas that Thomas did not know Martin Rimm was doing a study of the sort published by GLJ. There is no record of informed consent from BBS operators or of an "arms-length" review of the ethics of the data collection procedures. Hence allegation B.3 does warrant investigation.

C.
Collection of data from databases

The Committee finds that all of the allegations in this category should be investigated.

C.1 Invasion of privacy of home directories and files

C.2* Transferring databases from administrative control to a researcher's control

C.3 Misrepresentation in obtaining research data (offering marketing help to BBS operators (e.g., the "Pornographer's Handbook") in exchange for private information on customers, and not explaining the purpose as research)

During this research project, six different sets of data were collected or are alleged to have been collected:

(1) Captions downloaded from (pay) BBS systems, describing images available. Customers must give evidence of age and usually pay a nominal fee to access this information. With the captions there is information regarding the date the image was made available, and how frequently it has been downloaded;
(2) Brian Reid's Arbitron statistics on Internet newsgroup usage;
(3) customer data from BBS operators including Robert Thomas and another BBS operator, who in turn obtained data from other BBS operators;
(4) programs and 1988-1994 monthly archived statistics from John Myers on Andrew newsgroups, also containing thousands of names of individual faculty, staff, and students;
(5) individual-level data collected on three occasions in October, 1994, on Andrew users who accessed newsgroups and who had world-readable files (the default for all but freshmen and other new accounts as of August, 1994);
(6) demographic data on Carnegie Mellon staff, faculty, and students including age, sex, nationality, marital status, position, and department (described in article but probably not collected).

The Committee finds that datasets (1) and (2) do not present sufficient problems to warrant investigation as they do not contain individual-level data. The Reid statistics are gathered from newsgroup accesses at a variety of sites and are reported to him at the aggregate level.
Reid's laboratory confirms that all the information transmitted to them consists of aggregated statistics and no information can be associated with an individual user.

Dataset (3). Martin Rimm said he collected these data and also claimed, in the article itself, that he collected the data. The second SURG proposal described the customer-level BBS dataset as follows: "The records indicate every file that every customer downloaded, as well as the on-line activities, for 2,500 customers... Detailed profiles of 2,500 consumers can be developed and analyzed..." Footnote 88 in the article, describing the customer-level dataset, says that the team decided to remove customer names from the "Carnegie Mellon database." This is corroborated by David Banks, who said he saw an excerpt of the data, which contained names, telephone numbers, images downloaded, etc.

The Committee does not have direct information about how Martin Rimm convinced BBS operators to supply these data. Allegations were made to the effect that Martin Rimm offered attractive marketing inducements, such as software and analysis of customer ordering preferences by region or area code, in exchange for individual-level customer data. Our interviews with the principals have not laid these allegations to rest.2 Furthermore, the allegations that Martin Rimm misrepresented the research and lied to BBS operators are of serious consequence because the data contain information that, if provided to law enforcement, employers, or relatives, could be harmful to the individuals.

Dataset (4). The Andrew statistics on popular newsgroups from 1988. John Myers gave Martin Rimm archives of statistics on individuals' accesses to Andrew files from 1988. These statistics were gathered monthly (sometimes more often), and for technical programming reasons, the statistics included the name of the last reader of each newsgroup. John Myers said he warned Martin Rimm that these data were confidential.
However, no assurance was made regarding confidentiality. (For example, Myers did not strip the names before giving Martin the dataset.) John Leong, the technical director of computing services, told the Committee that it is the policy of Academic Computing never to provide information on individual users to anyone except on direct instructions from the university attorney.

2 Robert Thomas' wife gave this copy of an e-mail message from Martin Rimm to Robert Thomas to an attorney with the Electronic Frontier Foundation. In a public posting complaining that this message had been released, Martin Rimm confirmed that he sent the message.

-----------
From : Martin Rimm
Number : 205 of 229
To : Robert Thomas
Date : 08/10/94 8: pm
Subject : Question
Reference : NONE
Read : 08/11/94 9:37am
Private Conf: : 000 - Amateur Action

I'll tell you, Robert, in spite of my few comments, I still think you're a fucking genius. Every time I run your list through my computer I learn new things that no one else in the business ever thought of. I'd like to help in any way I can. I hope you count me among your friends.

Martin

Dataset (5). AMS.prof data on Andrew accesses. In October, 1994, after the release of information to the press that obscene images could be accessed by Carnegie Mellon students from newsgroups, the administration decided to remove alt.binaries newsgroups from the Andrew bboard system. At Marvin Sirbu's suggestion (according to Banks and Rimm, though not corroborated), Sirbu, Banks, Rimm and Zuckerman decided to capture data on Carnegie Mellon faculty, staff, and students before the new Carnegie Mellon policy went into effect. They had one week to do this, and quickly wrote (with help from others) a Perl script to capture the data. They collected data three times in one week in October, ending in a dataset with each person's name and last access before the program captured the file. These files were in users' home directories.
It is against Carnegie Mellon policy to access files of persons without their permission. ("All files belong to somebody. They should be assumed to be private and confidential unless the owner has explicitly made them available to others." CMU Student Handbook 1994-1995, p.48). However, the similarity in the mode of data capture at some of these sites to that used to capture dataset (5), plus the fact that this collection scheme has been used for some ten years without complaint by users, may weaken any case that the capture of dataset (5) was in itself an abuse of privilege.

Dataset (6). Demographics on faculty, staff, and students at Carnegie Mellon. This dataset would not be an issue except that, as is explained in footnote 40 of the article, the team planned to merge such data with the dataset on newsgroup accesses, in order to study "detailed demographics of the university population of computer pornography consumers."

Use of databases for purposes other than those for which they were originally collected raises delicate ethical problems involving the expectations and vulnerability of data subjects. This is of particular concern when two or more databases are linked, because the resulting richness of the data aggravates the problem of maintaining confidentiality. Thus it is especially important in such cases to have an arms-length review that establishes (1) that the intended use serves an important social concern, scientific issue, or significant public policy issue, (2) the intended analyses require the proposed data, and (3) that the intended analyses would not compromise the dignity or personal freedom of the data subjects or providers.

Footnote 40 cites "several privacy experts" as the reason the team "opted" not to report the results of this analysis. However, there is no mention that the combined database was destroyed.
Complicating this whole issue is the fact that the Committee of Inquiry has unresolved questions about whether this dataset ever existed. If it did not, the discussion of this linked dataset in the article is misleading, tending to falsely inflate the perceived capabilities of the author and research team. On the other hand, if it did exist, the release of demographic data (either by making it accessible to any user, as claimed, or by direct release to Martin Rimm) would violate the Carnegie Mellon policy on confidentiality of administrative data.

D. Storage and 'cleaning' of data

D.1* Failure to strip datasets of information that could be used to identify individuals

The Committee learned through its interviews that individual-level identifying information was not removed from any datasets. This allegation should be investigated further.

Dataset (3). Edward Zuckerman, in an initial conversation with the Committee by telephone, said that in the summer of 1994, he and Martin discussed Martin's obtaining individual-level information on customers from BBS operators in "exchange for something." Martin told Ed that the data contained telephone numbers, and that he was going to use these to obtain area codes so he could do an analysis of community standards. They discussed doing this work to study community standards. (Community standards are not mentioned in the GLJ article.) Edward Zuckerman said he assumed that Martin would strip the phone numbers out of the dataset as soon as he obtained the area codes.

Dataset (4). The Andrew statistics on popular newsgroups from 1988. John Myers did not strip names from the archives he gave to Martin Rimm. Martin Rimm told the Committee he did not strip the names away either.

Dataset (5).
On November 11, 1994, after the AMS.prof files were collected, George Duncan advised Martin Rimm and David Banks that names in datasets must be replaced by id codes in a manner that would ensure the persons in the datasets were anonymous. He also explained that this might not be sufficient to ensure anonymity if demographic data on each person put them in very small groups. Martin Rimm, after the discussion regarding AMS.prof files with David Banks and George Duncan, told David Banks that he would hire an undergraduate to strip individual-level information from that database and substitute codes. He did not do so. Martin Rimm told Marvin Sirbu and David Banks that he was too busy and would do so in the Spring Semester, 1995. However, in talking with the Committee, he said he did not do this and said he destroyed the AMS.prof files "last week" (i.e., the week of July 9, 1995).

E. Sampling, coding, analyzing, and interpreting findings and literature

E.1 Strong, unmistakable bias in the sampling, collection, coding, and interpretation of the data, misinterpretation of the literature, and bias in the writing of the article

With respect to E.1, the Committee finds the external criticism credible and believes the allegation should be investigated, since it bears on the scientific integrity of the research, on the faculty oversight of the research, and on the research conduct of Martin Rimm. The Committee agrees with allegations that serious scholarly flaws mar the GLJ article, suggesting inadequate faculty oversight and failure to obtain peer review. Any one of these flaws might show up in a piece of research, especially one by an undergraduate. However, the sheer number of problems and the polemic tone of the article bespeak a serious lack of oversight, and reflect the secrecy which characterized the publication of the article.
Further, the Committee was unable to dispel the criticism that some of the bias amounts to deliberate misrepresentation in support of a political agenda.

E.2 Poor methodology (e.g., misinterpretations, statistical flaws, absence of clear statement of methodology)

Much of the external criticism, especially from the academic community, addresses shortcomings of the work itself. The following examples illustrate the methodological flaws. This list is not intended to be exhaustive or comprehensive, but only illustrative. Appendix A contains more detail, and the detailed critiques are appended as a further resource.

1. Inadequate and, in some cases, misleading, description of methods of data collection. For instance, on page 1895 the GLJ article refers to obtaining customer data through "various methodologies developed by the research team programmers."

2. Misrepresentation of other literature. For instance, a causal connection between pornography and other illegal activity has never been established, yet in the footnotes of the GLJ article a strong connection is alleged based on this literature.

3. Absence of discussion of sampling and coding biases. For example, the sequential coding scheme (detailed in a flow chart in the article) and method of aggregating data may well have inflated the relative frequency of pedophilia and obscene material in images.

4. Misleading presentation. The title and other sections of the article obscure the difference between customers of pornography (who have provided proof of age and a payment) and downloaders of images obtained freely on the Internet. This aspect of the article increased the likelihood that the press and readers would conclude that obscene material is widely available via the Internet.

5. Footnote 30 is misleading, since it implies that individuals whose files were not world-readable could be pedophiles. In August, 1994, Carnegie Mellon changed the defaults for new accounts, making the AMS.prof files unreadable by others.
A criticism of the Rimm article referred to Dmitri Schoeman, who collected statistics about files on the Andrew system. Dmitri Schoeman said that as of October, 1994, after the new policy [defaults] was put into effect, there were 15074 accounts on the Andrew system. Of those, approximately 13236 accounts (~87.8%) had publicly readable configuration files. So the 11% unreadable files discussed in the footnote would be entirely accounted for by the fact that in August of 1994 the default protection of new accounts, including those of freshmen, was changed to make them unreadable. If this fact were known to Martin Rimm, as it seems plausible it would have been, the interpretation in footnote 11 would be deliberately misleading.

6. Failure to acknowledge or explain baserates, which would greatly change a reader's interpretation of the results.3

7. Absence of a discussion of limitations of the research, statistical artifacts, unsubstantiated causal statements, misuse of statistics, and alternative explanations of the data.

Marvin Sirbu, David Banks, and Edward Zuckerman read revisions and drafts of the article through at least December, 1994, and some of them may have read the February 1995 revision. These were all occasions on which flaws could have been corrected. Marvin Sirbu told us that he told Martin Rimm to re-write the methods section, but Martin Rimm did not do this.

Many of the specific methodological flaws are the sorts of things that not uncommonly occur in student, especially undergraduate, work. They should certainly be evaluated and reviewed with the student. However, this should be done by the faculty advisors, not by the institution. The work was not problematic as a student project; it became so only when it was widely and publicly disseminated. The problems of dissemination are another matter, addressed in allegations under heading F. To the extent that the pattern of flaws amounts to deception, however, the allegations fall under E.1.
3 For example, readers and posters in newsgroups have thousands of specific interests (quilting, behavior problems of dogs, Sam Smith fans, C++ programming, Pittsburgh restaurants, and so forth). When accesses to these newsgroups are counted, the apparent popularity of sex-oriented material, as well as the "top-forty," is seen to be a minor percentage of the activities of all persons reading newsgroups.

F. Preparing manuscripts, submitting to journals, consultation with colleagues, and dissemination (including giving appropriate credit and attribution to persons and literature, identifying specific sources at the point of reference and more generally)

The Committee finds that some, but not all, of these allegations should be investigated further.

F.1 Manipulation of publication and dissemination process (conspiracy with an editor of the Georgetown Law Journal (GLJ) who is an official of a political group), to use the research for political aims

The Committee felt that investigation of this allegation is not properly within the purview of Carnegie Mellon.

F.2 Absence of peer review

F.3 Absence of open communication and review of the research, through agreement to an "embargo" by the GLJ

In mid-November, Martin Rimm informed his co-workers and faculty advisers that the GLJ was seriously considering publication and therefore no details, press releases, or talks should be given. He told David Banks not to give a scheduled and pre-publicized talk to the local chapter of the American Statistical Association. Subsequently, Martin Rimm and Marvin Sirbu agreed to the embargo, and no public discussion of substance took place. Although the December 1 talk took place, David Banks was upset at having to give what he viewed as an empty talk, and Martin Rimm cut off contact with David Banks for a period of weeks because he felt David Banks had revealed too much.

F.4 Plagiarism and misuse of other persons' research, along with misrepresentation to obtain information from research "competitors"

Plagiarism does not appear to be an issue, since there is no evidence that text is essentially identical in both papers. Misrepresentation and failure to acknowledge sources should be investigated. There was a prior study of content analysis of sexual material on the network, known to Martin Rimm, that does not appear in the references in the article.4 According to the first author of this study, Michael Mehta, Martin Rimm contacted him claiming to be the principal investigator of a team of faculty researchers. Assuming Martin Rimm was a professional colleague, Michael Mehta discussed his work with Martin Rimm at length, and suggested to him the idea of comparing commercial vs. noncommercial images and perhaps other ideas. (See Student Handbook regarding giving credit and attributions to others' work.)

4 Michael D. Mehta & Dwaine E. Plaza, York University's Department of Sociology, "A content analysis of pornographic images on the Internet." Martin Rimm called Mehta in September, 1994. Mehta has reported that Rimm claimed to be the principal investigator directing a team of faculty members at Carnegie Mellon. Mehta interpreted this to mean that Rimm was a senior faculty member. Rimm said he was going to publish a book and would consider including Mehta's research. Mehta sent the paper, and never heard from him again. The Mehta-Plaza paper is on the same topic as the Rimm paper. For example, the content analysis is done on images not captions, and reports reliability coefficients, reports what is in pictures rather than calling them "pedophilia," "hebephilia," etc., and it examines alternative perspectives, for example, pointing out that in none of the so-called pedophilia was there any depiction of sex with children.

F.5 Failure to separate the contributions of the individual author from the institution; inappropriate use of institution's prestige to lend credibility to the study (references in the article to: "Carnegie Mellon research team, Carnegie Mellon study, Carnegie Mellon database, etc." and use of "Carnegie" as private publisher)

This allegation should be pursued. The GLJ article does not differentiate between the individual investigators and Carnegie Mellon University. In interactions with others about his research, Martin Rimm failed to identify himself as a student and used the name of Carnegie Mellon University in inappropriate ways to further his own agenda. He failed to identify himself as an undergraduate student in the GLJ article (although it could be inferred from a footnote that he was a student, though not that he was an undergraduate), to fully report the sources of his financial support, and to acknowledge the specific roles in the research of the persons listed as advisers and contributors. Throughout the article, the research was described as a Carnegie Mellon study; the databases, as Carnegie Mellon databases; the author and contributors, as a Carnegie Mellon research team. The author apparently misrepresented himself to a reporter with Time Magazine as a "research associate."

In addition to these misrepresentations with respect to the report, Martin Rimm caused an entry for "The Pornographer's Handbook; How to Exploit Women, Dupe Men, and Make Lots of Money" to be listed in Books in Print. This entry lists the publisher as "Carnegie", allegedly with an address that matches a Pittsburgh address for a Martin Rimm.

F.6 Misrepresentation of sections of the article as written by the author or contributors, though actually written by others

This allegation should be pursued. Information received in private communication from EFF (Electronic Frontier Foundation) sources and from David Banks raises questions about whether the legal footnotes were written by Martin Rimm.
Also, Carolyn Speranza has suggested that she was incorrectly identified as the illustrator of "The Pornographer's Handbook."

F.7 Failure to cite the work of others

This allegation should be pursued. Martin Rimm apparently used Michael Mehta's ideas and drew from his paper, and did not cite him or give him credit. In addition but more ambiguously, Erikas Napjus claims he wrote footnote 137 in exchange for specific credit, and was not given this credit, though he was cited as a contributor to the study. Although neither of these justifies a charge of plagiarism, both appear to support the allegation of failure to grant credit.

G. Other aspects of preventing and minimizing risk and harm to research participants

The Committee believes that these allegations warrant further investigation, although it would be difficult to pursue G.1, G.2, and G.3 without further harming individuals or subjecting them to risk.

G.1 Revelation of harmful information about a BBS operator

G.2 Breach of confidentiality of identified files on BBS customers

G.3 Breach of confidentiality of Carnegie Mellon files, causing rumors and damage to faculty, staff, and students

The datasets (1), (3)-(6) were kept on Martin Rimm's home machine. According to David Banks, and corroborated by e-mail he sent to Marvin Sirbu, at least three undergraduates might or would access these data. The undergraduates were not supervised by faculty or staff. With regard to G.3, the Committee has heard rumors that some undergraduates searched the database for information about their instructors' use of sexually-oriented newsgroups.

G.4 Exploitation of research participants--using analyses of data they provided against them (i.e., Robert Thomas and customers)

This allegation is very serious, indeed. It should be pursued. Every ethical standard we know of prohibits using research data against research participants.
With regard to G.4, Robert Thomas is currently imprisoned for distributing obscene materials in a community. He is appealing the verdict. According to his attorney, the research report and Time article, describing Thomas' BBS as the "Marquis de Cyberspace," seriously damage Thomas' chances of a successful appeal. His attorney says that Thomas did not know, until two weeks ago, that the Martin Rimm subscriber was the author of the Carnegie Mellon study described in Time.

Even more serious is the proposal to the Department of Justice. The DoJ learned about Martin Rimm's research from a former employee who referred Martin Rimm to Bob Flores, Acting Deputy Chief of Child Exploitation and Obscenity Section, Criminal Division. Martin Rimm and Bob Flores talked by phone, and subsequently, Martin Rimm, Marvin Sirbu, and David Banks proposed to the Criminal Division of the Department of Justice to analyze files seized in two obscenity cases, one of them being the Thomas files. David Banks wrote a draft proposal in the fall of 1994. Marvin Sirbu wrote a cover letter to the proposal when it was sent to the Division in the spring of 1995.

The proposal offered to analyze data at the DoJ, including customer files seized from Robert Thomas and another BBS operator. The analyses would, among other things, compare the level of obscenity of Thomas' images with those of other BBS operators, and would identify his "top 100" customers. The researchers said that they knew, from their Usenet contacts, that they could obtain download histories for each customer. Both of these analyses could have very negative consequences for Robert Thomas and his customers, from whom, and about whom, the researchers had collected research data.

Bob Flores told the Committee that he was interested in this proposal. He said he discussed the analyses with the group on at least three occasions.
He discussed the options of using them as expert witnesses, hiring them as consultants to re-analyze his datasets, or sending them to another division, if he decided the DoJ should fund some research. Bob Flores made it clear to the Committee that his interest was both general, since there was not much research on the topic, and specifically prosecutorial. However, he would not be able to fund research, only specific analyses related to prosecutions. He asked the group not to give him findings before publication of the GLJ article, lest they be exposed to the discovery process.

V. Summary of Recommendations

The picture that emerges is of a single student project that got seriously out of hand. It is not clear now, and may never be clear, to what extent Martin Rimm evaded, ignored, and circumvented his advisors and to what extent they actively participated in various aspects of the work.

We recommend that a number of the allegations be investigated. A few may lead to charges of research misconduct. Others will be relatively less serious violations of university policies. We are not able to clearly separate these in all cases, so we simply recommend investigation of 13 of the 22 allegations (as we have organized them).

A. Preparation for research (IRB review, planning confidentiality procedures, obtaining organization and institutional permissions, deciding responsibilities for oversight)

A.1 Absence of IRB review: Should be investigated

A.2* Failure to institute procedures for faculty oversight of student research and to create procedures to protect confidentiality of sensitive data: Should be investigated

B. Collection of data from individuals

B.1 Obtaining personal information on private BBS customers and not asking for their informed consent: Does not warrant investigation

B.2 Failure to obtain informed consent from Carnegie Mellon individuals whose AMS.prof files were accessed for research: Does not warrant investigation

B.3 Deceptive collection of customer information from BBS operators: Should be investigated

C. Collection of data from databases

C.1 Invasion of privacy of home directories and files: Should be investigated

C.2* Transferring databases from administrative control to a researcher's control: Should be investigated

C.3 Misrepresentation in obtaining research data (offering marketing help to BBS operators (e.g., the "Pornographer's Handbook") in exchange for private information on customers, and not explaining the purpose as research): Should be investigated

D. Storage and 'cleaning' of data

D.1* Failure to strip datasets of information that could be used to identify individuals: Should be investigated

E. Sampling, coding, analyzing, and interpreting findings and literature

E.1 Strong, unmistakable bias in the sampling, collection, coding, and interpretation of the data, misinterpretation of the literature, and bias in the writing of the article: Should be investigated

E.2 Poor methodology (e.g., misinterpretations, statistical flaws, absence of clear statement of methodology): Should be re-evaluated by advisors and reviewed with student

F. Preparing manuscripts, submitting to journals, consultation with colleagues, and dissemination (including giving appropriate credit and attribution to persons and literature, identifying specific sources at the point of reference and more generally)

F.1 Manipulation of publication and dissemination process (conspiracy with an editor of the Georgetown Law Journal (GLJ) who is an official of a political group), to use the research for political aims: Not within the purview of Carnegie Mellon

F.2 Absence of peer review: Should be investigated

F.3 Absence of open communication and review of the research, through agreement to an "embargo" by the GLJ: Should be investigated

F.4 Plagiarism and misuse of other persons' research, along with misrepresentation to obtain information from research "competitors": Does not warrant investigation

F.5 Failure to separate the contributions of the individual author from the institution; inappropriate use of institution's prestige to lend credibility to the study (references in the article to: "Carnegie Mellon research team, Carnegie Mellon study, Carnegie Mellon database, etc." and use of "Carnegie" as private publisher): Should be investigated

F.6 Misrepresentation of sections of the article as written by the author or contributors, though actually written by others: Probably cannot be pursued by Carnegie Mellon

F.7 Failure to cite the work of others: Should be investigated

G. Other aspects of preventing and minimizing risk and harm to research participants

G.1 Revelation of harmful information about a BBS operator: Probably cannot be pursued without risk or harm to individuals

G.2 Breach of confidentiality of identified files on BBS customers: Probably cannot be pursued without risk or harm to individuals

G.3 Breach of confidentiality of Carnegie Mellon files, causing rumors and damage to faculty, staff, and students: Probably cannot be pursued without risk or harm to individuals

G.4 Exploitation of research participants--using analyses of data they provided against them (e.g., Robert Thomas and customers): Should be investigated