Ethical and privacy considerations for research using online fandom data

Brianna Dym and Casey Fiesler

University of Colorado Boulder, Boulder, Colorado, United States

[0.1] Abstract—As online fandom continues to grow, so do the public data created by fan creations and interactions. With researchers and journalists regularly engaging with those data (and not always asking permission), many fans are concerned that their content might end up in front of the wrong audience, which could lead to privacy violations or even harassment from within or outside of fandom. To better understand fan perspectives on the collection and analysis of public data as a methodology, we conducted both an interview study and a survey to solicit responses that would help provide a broader understanding of fandom's privacy norms as they relate to the ethical use of data. We use these findings to revisit and recommend best practices for working with public data within fandom.

[0.2] Keywords—Data privacy; Ethics; LGBTQ; Online communities; Public data; Social norms

Dym, Brianna, and Casey Fiesler. 2020. "Ethical and Privacy Considerations for Research Using Online Fandom Data." In "Fan Studies Methodologies," edited by Julia E. Largent, Milena Popova, and Elise Vist, Transformative Works and Cultures, no. 33.

1. Introduction

[1.1] Transformative fandom has long occupied the grey spaces online. Because fandom is both entirely open to the public and living in the cracks of obscurity, accessing its communities sometimes feels like knowing the secret knock to get into the speakeasy tucked away in the back alley. People come inside. They order their favorite special at the bar ("I'll have an 'And They Were Roommates,' please"). They make idle conversation with other patrons as they wait for the next performer or their favorite bartender to come along. At the end of the night, these visitors return home, friends and family often none the wiser as to where they were or the specific transactions of their nightlife, though any of them could have easily come to the speakeasy if they had only known where to look.

[1.2] The metaphor neatly encapsulates online fandom life, though in the digital world those interactions also leave data traces, and knowing who has seen them is not as easy as looking around the room to see who else is there. The transactions of their nightlife have been broadcast the entire time, leaving a trail of public data that are accessible, identifiable, and valuable.

[1.3] Even outside the context of fandom, many people are unsure at best as to whether or not their data are publicly viewable and how they might be accessed by third parties (Proferes 2017). Journalists, for example, might include a public tweet in an article to represent public opinion on a news-relevant issue. Meanwhile, internet researchers are increasingly collecting public data that may seem ephemeral to social media users but in fact lingers, and many of those users are entirely unaware that their data can legally be collected for research purposes (Fiesler and Proferes 2018).

[1.4] The disconnect between people, their data, and how others use them is complicated by the fact that researchers disagree over norms for studying public data as well as a definition for what public even means (Vitak, Shilton, and Ashktorab 2016). For example, while some researchers might argue that public data are freely accessible and therefore usable (Zimmer 2010), fan studies scholars often argue that scholars ought to inform or ask people about using their data, regardless of whether or not they are accessible (Nielsen 2016). While not all disciplines are in agreement, scholarship on the ethics of studying fandom has emphasized the need to preserve fan privacy (Busse and Hellekson 2012), prioritize transparency in research (Bennett 2018), and gain permission from fans before referencing their work in academic research (Busse and Hellekson 2012). These principles often sit in contrast to how many disciplines approach the ethics of using public data in their own work.

[1.5] Fan studies cover a wide variety of domains and methods. For example, fan scholars have provided literary analysis of fan works (Coppa 2017), analysis of human behavior (Guerrero-Pico, Establés, and Ventura 2018), and an understanding of the different kinds of learning that occur within fan communities (Black 2005; Evans et al. 2017). Bethan Jones (2016) has identified two primary traditions of approaching fan studies research: (1) literary analysis, when the content creator is assumed inaccessible and a fan work is the subject of study, and (2) human subjects research, in which the creator is the subject of study, whether with consent (e.g., interviews or surveys) or without consent (e.g., collecting public trace data).

[1.6] Traditional human subjects research, where consent must be established for data collection, often falls under the purview of ethical review bodies such as institutional review boards (IRBs) at US universities. However, many IRBs consider the collection of public data to not constitute human subjects research because it typically does not involve direct interaction with humans or collection of personally identifiable information (Vitak et al. 2017). Perhaps in part because of this line drawn by IRBs, a common perception among researchers is that the most important question for using data is whether they are public, regardless of other relevant factors such as users' expectations of privacy (Zimmer 2010). Our goal with the work here is to bridge discussions in fan studies with broader conversations about use of public data in research across all disciplines.

[1.7] Fan studies are an interesting context for examining the intersection of research ethics and privacy, in part because of the long-standing social norms that dictate sharing and privacy behavior (Dym and Fiesler 2018a). We also think that it is important that fans have a voice in determining best practices for researchers. To that end, we conducted an interview study and a large-scale survey in which we asked about privacy and ethical concerns in relation to fandom and public data. By fandom, we mean transformative fandom, or an online fan community that both creates and shares fan works that are transformative of source material through writing fan fiction, creating fan art, or other creative practices.

[1.8] Our findings further illustrate the concerns that fandom communities have about privacy, safety, and the integrity of their content. This includes (1) special concern for fans from marginalized backgrounds (especially LGBTQ fans) who may face serious consequences from privacy violations; (2) fear of harassment within fandom; and (3) the importance of positionality within and understanding of fandom for researchers. Our findings contribute to specific recommendations for working with public data generated within fandom, including special care to avoid amplifying fan content and ways that researchers can care for the trust they receive. These findings stand to inform broader discussions on what constitutes the ethical use of public data within other types of research as well, particularly for vulnerable populations.

2. Background

[2.1] Determining the risks and benefits of using public data is a complex process, especially when those data may incorporate vulnerable communities or privacy-sensitive contexts. Fandom falls into these categories not only because of the large number of LGBTQ participants (Dym and Fiesler 2018b) but also because of the different stigmas associated with fandom. Fan data also have a history of negatively affecting their owners when unearthed to the broader world (Busse and Hellekson 2012). We informed our study with prior literature that connects work about research using public data, the current status of privacy and data within fandom, and the contextual nature of online privacy.

[2.2] Though using public data, particularly from social media, without consent is common practice—whether by researchers, journalists, or marketers—social media users are often confused about the nature of those data, whether they are public, and what the rules are surrounding their use and how far those data might spread (Proferes 2017). With respect to research specifically, most people are unaware that researchers collect data from platforms such as Twitter (Fiesler and Proferes 2018).

[2.3] Even when anonymized by not including usernames, content from social media collected and shared in research articles can be easily traced back to its creator. Ayers and colleagues (2018) reported that 72 percent of the articles in a corpus of medical research quoted a tweet, and that the quoted tweets could be traced back to the Twitter user 84 percent of the time. Nonresearchers have taken note of this issue as well, with at least one journalist contacting people whose tweets appeared in a study only to find out they had no idea their posts had been quoted (Fiesler and Proferes 2018). Research has also shown that Twitter users prefer that journalists not quote their tweets directly and instead make inferences from large, anonymous data sets (Dubois, Gruzd, and Jacobson 2018).

[2.4] Even when data are not public, social media data might still be available to researchers without the users' knowledge; for example, we have seen negative public reactions when companies such as Facebook and Dropbox studied users without their explicit consent (boyd 2016; Dreyfuss 2018). Controversies surrounding unexpected uses of data (e.g., Cambridge Analytica) are increasingly commonplace as a result of a gap in understanding between users, their data, and entities that make secondary uses of those data (Fiesler and Hallinan 2018). Though prior work has shown that people are generally more comfortable with their data being used for university research as compared with governmental or commercial uses (Williams, Burnap, and Sloan 2017), their level of comfort varies greatly depending on contextual factors such as what content is being used, who the researchers are, or how the data are being analyzed and for what purpose, all of which go far beyond whether the data are public (Fiesler and Proferes 2018). However, consideration for these factors may not be commonplace among internet researchers, who lack guidance from traditional ethics review bodies such as IRBs (Vitak et al. 2017) and do not have agreement for best practices within their own communities (Vitak, Shilton, and Ashktorab 2016).

[2.5] Fan studies occupies a somewhat unique position in this regard, with Transformative Works and Cultures recommending that researchers gain permission from content creators before using fan works or blog posts in a journal article (Hellekson and Busse 2009). Fan studies, however, became a subdivision of internet research when fandoms moved online, with fan studies' history rooted in ethnographic and participant-observer studies of communities (Jenkins 1988). While some researchers approach fandom as participant observers, others come to fandom as literary scholars. In this case, researchers draw from public data that were generated by a person, and as such may contain information that the person might not consent to having analyzed outside of its intended community (Reid 2016). Unlike traditional texts, fan works are personal and tied to the people and communities they are created in as living data, so they carry consequences with their use and analysis (Jensen 2016).

[2.6] Fans expect that their content will stay within fandom (Busse and Hellekson 2012), and fan studies scholars often enforce their own expectations of obtaining a fan's permission before discussing or researching their fan works (Fathallah 2016; Reid 2016; Zubernis and Davis 2016). Newer investigations into fan studies have touched on large, quantitative data gained through surveys, noting the importance of online surveys and a researcher's responsibility to communicate information back to participants (Bennett 2018), and qualitative researchers advocate for a goodwill approach to ethically study fan communities using ethnographic methods (Kelley 2016).

[2.7] Recent discussions on ethical fan studies have aligned with Brittany Kelley's perspective, following a participant-observer model of research (Busse 2018, 9). However, there are also moments in fandom when the fans-first approach does not make for the best scholarship. Natasha Whiteman (2016) has highlighted the "localised nature of ethical decision-making in qualitative research" (309), encouraging researchers to make case-by-case ethical judgments rather than universal rules regarding human subjects research. Covert research might be necessary when an online community's expectation of privacy contrasts with an ethical obligation to research the community (Whiteman 2012), or when a community might be actively hostile toward the researcher (Chess and Shaw 2015; Massanari 2017).

[2.8] The duality of fandom data as both a private and public object can be framed within the concept of contextual integrity (Nissenbaum 2004), meaning that a user's expectation of privacy and the sensitivity of their data are always contextual and depend on different motivating factors. For example, a person in fandom who identifies as transgender online but presents as cisgender outside of fandom might have more significant privacy concerns than a cisgender, heterosexual person who presents as such, both within and outside of fandom. Someone who must maintain a public image separate from adult themes might be more protective of their erotic fan works than someone who professionally writes erotic romance novels.

[2.9] Helen Nissenbaum summarizes these concerns: "Almost everything—things that we do, events that occur, transactions that take place—happens in a context not only of place but of politics, convention, and cultural expectation" (2004, 137). This concept has found its way into some ethical best practices. For example, the Association of Internet Researchers recommends, "The greater the vulnerability of the community / author / participant, the greater the obligation of the researcher to protect [them]" (Markham and Buchanan 2012, 4). This key guideline posits that context matters in research.

[2.10] For example, Bethan Jones (2016) has explored considerations a researcher must take into account when studying antifandom, or fandom content that is negative toward a particular topic. Jones highlighted that drawing attention to antifans, especially those who make disparaging or problematic statements, increases the risk that they could suffer more harassment than they might otherwise encounter.

[2.11] Jones's article identifies one of the many contexts in which public data gathered from fandom might make someone more vulnerable than they initially considered. Previous research has also highlighted how LGBTQ members of fandom are particularly vulnerable to privacy risks, especially if they are still in the closet (Dym and Fiesler 2018b). To better understand how public data are contextual within fandom and the steps researchers can take to minimize their impact, we explored people's thoughts on privacy, safety, and ethics within their fan communities. Our findings explore the risks of research amplifying public data from vulnerable communities.

3. Methods

[3.1] For the first stage of our research, we recruited participants for an interview study, posting a recruitment ad on social media sites that are gathering places for fans (including Twitter and Tumblr) and seeking fandom participants at least eighteen years old who were willing to talk to us about privacy, ethics, and fandom. In particular, we asked participants about their perceptions of researchers and journalists interacting with fandom, considering that these two groups often repurpose public data for secondary use.

[3.2] In our recruitment material, we specified that interviews would take place remotely over a participant's preferred communication service (e.g., Skype, phone, instant message) and could be voice or text based. Our interview protocol, designed to discuss themes in fandom that might come up in otherwise everyday conversation about fan communities, was approved by our university's IRB as low risk.

[3.3] The participants were given a consent form before the scheduled interview, which we then verbally walked them through, emphasizing that the participant could stop the interview at any time, reschedule, or request to have some or all of their interview data removed from the study at any time. Responding to interview volunteers on a first-come, first-served basis, we conducted twenty-five interviews in total. Of our participants, twenty-one identified as part of the LGBTQ community. In addition, our participants identified in the majority as female, and in our larger survey we found that more of our participants identified as nonbinary and transgender than as cisgender men. Participants also identified as majority white and US residents. These demographics are similar to other studies about fandom (Fiesler and Bruckman 2014; Fiesler, Morrison, and Bruckman 2016; Nielsen 2016).

[3.4] We conducted semistructured interviews (Seidman 2013) via voice or instant message depending on the interviewees' preferences. The interviews lasted between twenty-five and ninety minutes, with most lasting about sixty minutes. After transcribing the interviews, we conducted a thematic analysis of the data (Braun and Clarke 2006), with the authors discussing and converging on emergent themes.

[3.5] From this analysis, we focused on themes central to privacy expectations and to the etiquette that participants requested of researchers. These themes informed the questions for a broader survey, in which we asked thousands of fan community members about their thoughts on their data, how those data are used, and overall concerns surrounding data privacy. The survey targeted multiple aspects of transformative fandom, including privacy and safety concerns. We recruited respondents from Tumblr and Twitter, obtaining 4,117 respondents who identified as fan fiction writers or readers. The participants were required to be at least eighteen years old and had to click through a consent form. The respondent demographics were similar overall to our interviews and to previous fandom studies. In neither study were the participants compensated monetarily, but we provided the option to leave their email address if they wanted to be informed about our results.

[3.6] The respondents answered a number of multiple-choice and Likert scale questions, as well as open-answer questions about research ethics and privacy related to online fandom data. We analyzed the open-answer responses qualitatively based on the themes we had already identified from interviews. The quotations included in our findings come from both interview and survey responses, and the participants' identities are anonymized using participant numbers, with PI denoting an interview participant and PS denoting a survey participant.

4. Findings

[4.1] Based on our analysis of the data collected, we found that people within fandom have nuanced and protective views regarding their data and their use by both researchers and journalists. The participants highlighted the ways their data are contextual and might carry unintended consequences when shared outside of fandom, or even when shared within other parts of fandom. However, the fans also felt trust toward researchers, emphasizing the responsibility that researchers take on to present fan data ethically.

[4.2] These findings identify a risk that research and journalism around fandom carry with them for fan communities—the risk of amplifying fan content to an audience it was never intended for. Amplification can be a problem for other types of online content as well (for example, in a medical context; Ayers et al. 2018), but our findings reveal that the large number of LGBTQ people within fandom generates a unique set of risks. Our participants, whether identifying as LGBTQ themselves or simply thinking about their friends, worried that exposing fandom content to a broader audience could lead to fans being accidentally outed.

[4.3] Whether or not a fan harbored specific fears for how their data might be used against them, a majority of fans worried about the privacy of their data. Out of 4,006 survey participants, fewer than 10 percent reported using their real name in online fandom. The vast majority prefer to use pseudonyms or other methods for obscuring their identity. This practice sits in contrast to social media sites that encourage users to provide real identities when online (Cho 2018), explaining in part why platforms like Facebook are not commonly used in transformative fandom (Dym and Fiesler 2018b). Pseudonyms represent a norm strongly anchored within fan communities, which also benefits young, vulnerable LGBTQ people existing within fandom circles or the digital platforms fandom cohabitates (Cho 2018).

[4.4] Based on our interview data, which revealed common patterns of privacy concerns among fandom participants, we examined the prevalence of these at scale with the survey by providing multiple-choice options for the question "Which of these concerns about privacy in fandom do you share?" Only 6.7 percent of participants responded that they did not have privacy concerns at all. Meanwhile, 34.5 percent of participants were concerned about "people in my real life finding out about specific types of content I consume, create, or share," 22 percent about "my real name/identity being 'outed' to other people in fandom," 18 percent about "being outed as a fandom participant to people in real life," and 14 percent about "being outed with respect to another identity (e.g., sexual orientation or gender identity)." Fewer than 5 percent of participants listed "other" privacy concerns; the most common involved how private companies might use their data.

[4.5] Despite these common concerns, only 5.9 percent of participants (235 out of 3,985) stated that they had personally experienced a privacy violation in fandom. The fact still remains that a majority of people in fandom have fears about their privacy being violated. Outsiders coming into fandom, such as researchers and journalists, are susceptible to violating fan privacy if they are unaware of those specific concerns and norms that work to keep fans and their data safe. To unpack these concerns further, next we explore key concepts related to fan communities, public data, and how researchers can continue to engage responsibly with this space.
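As a quick check on the arithmetic behind the figure above, the reported share can be recomputed from the two counts given in the text. This is only an illustrative sketch: the helper name `percent_share` and the choice to round to one decimal place are assumptions made here, not a description of how the survey data were actually processed.

```python
def percent_share(count: int, total: int, digits: int = 1) -> float:
    """Return count as a percentage of total, rounded to `digits` decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, digits)


# Counts reported in the text: 235 of the 3,985 respondents to this question
# said they had personally experienced a privacy violation in fandom.
violation_share = percent_share(235, 3985)
print(f"{violation_share} percent reported a privacy violation")
```

The denominator is the number of people who answered this particular question (3,985), not the full pool of 4,117 survey respondents, which is why the base differs from question to question.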

[4.6] Despite the common ethical heuristic that publicness is the most important factor in determining whether online data can be part of research (Vitak, Shilton, and Ashktorab 2016), prior work has shown that other contextual factors are also important, such as how the data are analyzed, what kind of data they are, and how they are used (Fiesler and Proferes 2018). Drawing from this prior work as well as topics that came up in interviews, we asked survey participants about different contexts of data use. On a Likert scale of "very uncomfortable" to "very comfortable," participants rated their level of comfort for different uses of both public fan works and public fan discussion. We found that fans are more comfortable with academic research than they are with journalism, and more comfortable with quantitative than with qualitative research. Based on open response answers and interviews, these differences seem to trace in part to fear of amplification: media articles have more reach than academic articles, and reports of quantitative research are less likely to include identifiable data than reports of qualitative research. Participants also brought up other contextual factors, such as how the article portrays fandom or whether the data themselves are sensitive.

[4.7] However, even for researchers determining what public data may or may not be sensitive, how someone feels about their own data being used might vary depending on their identity or other factors invisible to the researcher. Nissenbaum's (2004) framework of contextual integrity encapsulates this problem—data generated in fandom will be contextual to the owner and their specific privacy needs. For example, someone living alone or with a supportive partner might be less worried about their privacy than someone still living with their parents or someone in a career that requires a certain public image. Again, LGBTQ identity emerged as a strong factor in privacy concerns. When talking about the importance of their privacy in fandom, our interview participants often recounted past and current experiences where fandom functioned as a safe space to explore their LGBTQ identity:

[4.8] I think that fandom for me was such an outlet…I grew up in a small island, a very small community and to talk about homosexuality was the kind of thing that would get you put in the cupboard…And I think that for a baby LGBTQ person…fandom is a fantastic arena for learning about the world. (PI-17)

[4.9] When you have younger individuals, privacy and anonymity of sexuality is huge. People who are out online…are not out to their friends and family, but present themselves as who they identify as online. Which this brings in an extra layer of problems, because if they're identified, this outs them to their network. (PI-10)

[4.10] When PI-17 was first involved in fandom, it was their one outlet to explore LGBTQ identity. Fandom continues to represent a safe space for LGBTQ people, though fear of being outed as LGBTQ is still a major concern among users, as PI-10 described. And this fear is not unfounded. It is easy for researchers and journalists to release sensitive data that, despite being anonymized, can still be used to reidentify study participants (Ayers et al. 2018). In our own findings, several participants had stories of online content being amplified beyond fandom and leading to unintended consequences:

[4.11] I have a friend who was outed by being featured in a local newspaper kissing her girlfriend when the paper did not get the permission of my friend nor her girlfriend. It was disastrous to say the least. Having worked for a newspaper, I know that isn't protocol, but it's still a risk run when you exist in a space where others have no regard for the ethical standards or basic consent agreements. (PI-15)

[4.12] Though these anecdotes and fears often related to journalism, participants' concerns about personal information being shared outside fandom can apply to academic research as well (Fiesler and Proferes 2018). However, the type of content a fan produces and interacts with, beyond personal details, can cause problems as well. Certain portrayals of fandom might cause harm to a fan simply by association, and our participants were wary of having their content presented outside of its intended context and of potentially amplifying that content to the wrong audience.

[4.13] Prior work in fan studies about research ethics has focused on respecting a person's preference for attributing their pseudonym to the referenced fan work (Busse and Hellekson 2012; Busse and Farley 2013). This issue highlights competing values: preserving a fan's privacy versus giving someone credit where they might desire it. Privacy and anonymity are important, but fandom also has a strong social norm toward attribution (Tushnet 2007; Fiesler and Bruckman 2014). However, this ethical tension is not unique to fandom; in all human subjects research there are cases when, despite traditional norms to always anonymize participants, it may be appropriate to give participants credit for their work and commentary (Bruckman, Luther, and Fiesler 2015; Fiesler, Morrison, and Bruckman 2016). The norm within fan studies has been to leave this decision up to the fan creator. However, there are also cases of research using public data where either the data set is too large to make asking for consent for every data point feasible, or where it may be impossible to track down the content creators.

[4.14] To explore fan perspectives on this research practice, we asked survey participants if they would want their pseudonym attributed if their fan works or fan discussion posts were used in different contexts, including research, journalism, and use by other fans (e.g., reposting or remixing). Participants could answer yes, no, or "it depends" with the request to explain why. For both fan discussion and fan works, about one-third of respondents chose "it depends," again emphasizing the contextual nature of these decisions.

[4.15] Unsurprisingly, given the existing fandom norms around attribution (Tushnet 2007), 90 percent of participants insisted on attribution when other fans use their content, either reshared or remixed. A smaller majority of fans also wanted attribution (by fannish pseudonym) when their work is used for research (60 percent) or for journalism (51 percent). Despite the majority of participants indicating a desire for attribution, a third of all answers included qualifying statements about when this practice would be acceptable.

[4.16] Open-answer responses around attribution as well as other ethics and privacy-related questions also emphasized important contexts for when using data at all may or may not be acceptable. In the sample of responses that we analyzed qualitatively, we found that participants cared deeply about whether their work was being used in research that would benefit fandom. While many responses contained an altruistic sense of preserving a positive image of fandom, responses also focused on the dangers that accompany amplification, in particular that studies negative toward fandom could bring negative attention to fans. If their data were going to be used in a research study or a news article they did not agree with (one that was harsh toward them or extremely negative about fandom overall), the participants were far more reluctant to allow their data to be used, let alone to have their names attached to it.

[4.17] If the news article were negative or sneering in tone, as many have been in the past, I don't want to deal with its readership finding my fandom accounts. (PS-4009)

[4.18] What [is] the publication that published an article about fanfiction? Is it fan-friendly, honestly interested in fan culture? Or is it written to mock fanfic? In the latter case, I wouldn't want my writing in it at all, and definitely not with my name attached to it. (PS-4032)

[4.19] Though letting the participants dictate the type of findings that come from research could be a threat to scientific integrity, for many participants the concern comes from a fear of harm as a result of the research, regardless of a researcher's intentions. These are consequences that can be mitigated regardless of the argument of any particular article.

[4.20] Interview participants also explained the nuanced reasons behind keeping certain online activities separated. Many participants expressed concern about not wanting people in their real lives to know about their fandom life, because of the stigma associated with fandom, certain types of content, or a different type of identity they present online. Others noted that presenting a certain persona within fandom might also rely on keeping their fandom identity separated from other professional contexts.

[4.21] I want to be seen as approachable by fandom and approachable by professional entities in their respective spheres. I don't necessarily want my employers…to know that I dedicated well over a year to a 400,000 word queer fanfiction for Overwatch. It might seem like I'm "wasting" my spare time…But in that same mindset, I don't want my academia and professionalism to make the fandoms I'm involved with view me as a snob. (PI-15)

[4.22] Being viewed as a snob, as PI-15 put it, was one of the milder concerns participants expressed, but this highlights a key concern among participants. Certain fandom activities are intended for certain audiences, just as activities outside of fandom are intended for nonfandom audiences. Overwhelmingly the participants stated that they did not necessarily fear what a research study or journalistic article might do with their data, but they did fear how other members of fandom might treat them in response. People writing about or engaging with sensitive and troubling topics often reported experiencing anti or anonymous hate in addition to threats to their physical safety, often through their personal identifying information being doxxed (publicly published) to broader fan communities.

[4.23] Though many parts of transformative fandom can be much more positive than other online communities (Campbell et al. 2016), there are still toxic fandom spaces. Sometimes, that toxic behavior can be self-regulated by strong community norms within a fan space (Guerrero-Pico, Establés, and Ventura 2018), but those negative behaviors still find their way in. When discussing fear of amplification, many participants expressed concern over harassment within fandom. Many survey participants shared the perception that they might be harassed if the wrong corner of fandom found their social posts. Some interview participants even asked that, in the course of this research, we take care to protect their identities (e.g., by not listing the fandoms they are involved with) to avoid backlash from the community, even if their data were anonymized.

[4.24] Have you been in fandom lately? I don't want to be stalked or lectured online for accidentally having a "wrong" opinion. (PS-4075)

[4.25] Fandom has become even more hostile in recent years, so with increased visibility comes increased likelihood of brigading over perceived threats, whether or not it's warranted. If the wrong person talks about your work on Twitter or Tumblr, you're in for hell. (PS-3976)

[4.26] Interview participants described concern over anti culture in fandom, where certain fan groups would maliciously target other fans who were participating in ways they did not approve of. For example, one person stated that they refused to talk about a certain pairing they like on Tumblr because antis had a history of harassing and doxxing people who enjoyed the pairing. Anti behavior introduces a new set of concerns for researchers to be aware of. If crediting a person's pseudonym, directly quoting their posts, or identifying specific fandoms as part of a contentious topic might draw negative attention, then what precautions are appropriate to take?

[4.27] In traditional human subjects research we only gather and publish information a participant is willing to disclose, but this protection is only in place when a researcher is engaging with participants directly. When using public data instead, these protections are absent. However, we found that overwhelmingly participants placed their trust in individual researchers to make the right, ethical choices regarding their data.

[4.28] People in fandom are generally protective of their privacy. However, participants from both our surveys and interviews expressed a certain amount of trust in researchers that was absent from their views on other types of outsiders (e.g., journalists) engaging with their data.

[4.29] I personally don't trust news media to use quotes in context or to take the subject at all seriously. In academic settings, if there is a good peer and ethics review, I'm much more comfortable. (PS-2762)

[4.30] I'm pro researchers coming into fandom and asking questions. I'm against journalists coming in and stealing things. If a journalist wants to ask things, I suppose that's okay. But, again, their questions tend to be more invasive and try to catch people out on things that I find offensive. (PI-2)

[4.31] These statements highlight the perception our survey participants shared toward researchers—they are perceived as not looking for the "gotcha" statements that the participants associated with many journalists. And while the participants acknowledged that there are journalists who come from within fandom and do good work, they generally trusted researchers more broadly. This trust can complicate the responsibility a researcher has with participant data. As participant PS-2762 stated, fans might be more comfortable with an academic setting, specifically with a peer-reviewed study, than they would be with a reporter. Another reason for this comfort was the perception that media articles reach a broader audience than academic work, making the risks accompanying amplification more severe. As PS-1091 stated, unlike academic research, journalism reaches a "public audience."

[4.32] Given our findings about the importance of individual context in making ethical decisions about fandom data, a fundamental issue with collecting data without consent is that the researcher might have no idea what this context might be. Accordingly, many participants expressed that using their data from fandom without permission was unacceptable.

[4.33] Journalists and scientists need to ask [people in fandom] before popularizing, publicizing, or publishing other people's tweets or tumblr posts. Anything else is unethical. (PS-2230)

[4.34] However, other participants recognized that there were ways for researchers or journalists to have a better understanding of context, insisting that anyone making use of fan data should spend time within a fan community before collecting from it. Not only would this help them recognize what kind of content is acceptable for sharing, but it might also help outsiders represent fandom more fairly. The participants had more trust for researchers and journalists who come from within fandom.

[4.35] I prefer it when [researchers] have a sense of what's going on. That sort of notion of like, I'm not a voyeur. I do think an insider perspective is important…I think your positionality, with regards to the community that you're studying, is important to consider. I prefer it when people are one of us. (PI-21)

[4.36] I think it's helpful if whoever is coming in to research or write about fandom has a bit of a fandom background. [Fandoms] can be completely different from each other too, so that's sometimes a little hard to say. I'm thinking of that class about fan fiction at the university…where some fic got on the syllabus and the writers weren't told and they were not happy about that, because the people commenting were not observing the same community norms as in the fic writers' fandoms. (PI-9)

[4.37] As PI-9 pointed out, even those with the best intentions might not understand the norms of a community—norms that are often established to help protect the fans who are sharing their content. However, all of the concerns that our findings revealed could potentially be mitigated with some thoughtful best practices for using fan data—or public data in any context.

5. Best practices for using online fandom data

[5.1] In recommending best practices for public data use, we discuss the importance of obtaining permission to use public data and obscuring that public data to prevent reidentification. We also discuss when properly attributing public data back to someone might be appropriate, as well as ways of giving back to a community after collecting data. We close with recommendations for learning a community’s norms around data and a reflection on how the lessons learned from fan communities can be applied to other online spaces. While fandom is a rich and unique case study, it also serves as an example for how we might consider the consequences of collecting public data elsewhere online.

[5.2] The consequences that fans could face if their data are improperly disclosed elsewhere can be severe. Fans fear harassment from fellow fans, doxxing, or potentially being outed as who they are in fandom to people in their off-line lives. Furthermore, determining which data might be innocuous or not is challenging due to individual contexts, even for researchers with the best of intentions. Even seemingly harmless information, like a person's preference for LGBTQ characters or content, could cause problems.

[5.3] I don't think being outed as a fan or that sort of thing has impacted anyone that I know about, unless they're a fan of something horrific. But, I definitely know that being outed to what you are and how you act in the fandoms in particular, has caused problem. (PI-12)

[5.4] From our findings based on fan perspectives on these issues, we recommend best practices for researchers looking to work with public data created through transformative fandom. It is evident that these data, despite being publicly available, carry enough risk that ethical considerations beyond their publicness are necessary.

[5.5] Across our recommendations, we urge researchers to consider the weight and context of a fan's data, and the consequences of elevating them beyond their intended audience. Our findings reveal layers of publicness to user data that people are sensitive to. A majority of fan content is created and shared within highly contextual, semipublic spaces that have a specific audience in mind. When a researcher or journalist lifts these data from their original context and places them within a new one, there are certain considerations they should undertake to minimize the impact of recontextualizing those data to a new, potentially broader audience.

[5.6] The fan studies community has discussed the pros and cons of requiring permission to study fans' content. Our participants expressed feelings of vulnerability and even fear about their content being used in unexpected ways. These fears are in part related to the increased scale of fandom, which, thanks to entirely public platforms like Tumblr, is also more accessible from the outside (Dym and Fiesler 2018a). This expanded connectivity has challenged traditional norms; as one participant stated:

[5.7] There used to be a certain type of etiquette [in fandom], and now that type of etiquette is shit. (PI-12)

[5.8] Fandom at a larger scale and with an increasing number of newcomers who might not understand existing norms has also led to more conflict within fandom. Additionally, other types of concerns around privacy may be highly contextual to the individual and not apparent to the researcher without asking. Therefore, we recommend that when it is possible and appropriate to the situation, researchers should attempt to gain permission from fans whose content is being quoted or used in research.

[5.9] However, while it is true we should respect fan privacy, some cultural phenomena deserve to be discussed, and obtaining permission without exception might be prohibitive. For example, the data set might be too large, or it might be difficult to contact the creators. Moreover, obtaining permission can be challenging or even dangerous when researching groups who might be hostile to the researcher or broader online communities (Chess and Shaw 2015; Massanari 2017). Prior work has highlighted the ethical conundrums of working with antifans (Jones 2016), specifically when it comes to asking permission to share their anti opinions, and our research points to a very real concern among fan communities that they might become a target for this harassment. Therefore, when asking permission would be impossible, or when there is very low risk with the data involved, there are other ways to protect fan privacy. These heuristics make for a good starting point in untangling the potential risks associated with using any public data for research.

[5.10] One way we can protect fans' privacy is by obscuring specific text posts, especially those that could link to identifying information (particularly a real identity) or to contentious events within a fan community. If verbatim quotes from public content are not obfuscated before being presented as part of research, these posts can often be traced back to their source even when usernames are not included (Ayers et al. 2018). We recommend taking steps to obscure public data by rewriting sentences to paraphrase and other methods of ethical fabrication (Markham 2012). Obscuring data can allow researchers to delve into contentious or sensitive subjects in fandom without potentially putting community members at risk. Even for low-risk data, such as commonly phrased statements or tags (e.g., from Tumblr or Archive of Our Own) where obfuscation may not be as important, we recommend collecting only the data required for the analysis. For example, even though data such as usernames might be available, they may not be relevant to the analysis.
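As a minimal illustration of collecting only the data required for the analysis, the following Python sketch keeps an explicit list of fields needed for a study and drops everything else, including usernames and links, before the data are stored. The field names (`text`, `tags`, `username`, `url`) and the sample record are hypothetical and stand in for whatever a particular platform or collection tool provides; this is one possible approach under those assumptions, not a prescribed procedure, and it does not replace paraphrasing quoted text.

```python
# Hypothetical field names: only the fields the analysis actually needs.
ANALYSIS_FIELDS = ("text", "tags")


def minimize_records(records, keep=ANALYSIS_FIELDS):
    """Return copies of the records that contain only the fields in `keep`.

    Anything not listed (e.g., usernames or post URLs) is never carried
    forward into the research data set.
    """
    return [
        {field: record[field] for field in keep if field in record}
        for record in records
    ]


# Made-up example record for illustration only.
collected = [
    {
        "username": "example_fan",
        "url": "https://example.org/post/1",
        "text": "placeholder post text",
        "tags": ["fluff", "roommates"],
    }
]

minimal = minimize_records(collected)
```

Using an allowlist of fields, rather than a list of fields to remove, means that any identifying field a platform adds later is excluded by default.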

[5.11] One concern with obscuring data is that it adds to the tension between credit and privacy in fandom. Though there are no perfect solutions for this tension, our findings identify considerations researchers should take into account when deciding whether to attribute fans when their content is quoted or referenced. When permission can be sought, we encourage researchers to abide by a person's wishes with regard to whether they want their work attributed back to them. However, when researchers cannot seek permission (perhaps because they cannot identify or contact the content creator), we recommend erring on the side of caution and not attributing that person's name to the content. Because many of the fears around privacy violations affect fandom's most vulnerable members (e.g., LGBTQ fans), we contend that the risk-benefit assessment of attribution should weigh more strongly in favor of protecting those who might suffer from unwanted attention.

[5.12] As part of caring for the trust that fandom places in researchers, we think it is important to share research findings back to the community. Even outside of fandom, prior work has revealed that users whose social media data are part of a research project want to be told about that research and would be interested in reading research papers (Fiesler and Proferes 2018). Though we have focused a great deal on the potential harm of research in considering ethics, the benefits of research are important as well. It is important to note that neither our findings nor prior work on research ethics for public data (Fiesler and Proferes 2018) have suggested that this is a methodology that should be abolished altogether, despite some risk—many of our participants were excited about the idea of research shining a light on the practices and communities of fandom. Therefore, it should be a best practice to share this research with them. When conducting human subjects research, it is not difficult to optionally collect contact information for participants who want to be directly informed of the results of the study (as we did with our interviews and surveys here); in the case of using public data, this might mean sharing the results on social media where the studied population might see it.

[5.13] Many of our participants discussed the different ways that rules within fandom at large can differ among smaller fandom groups. Learning the etiquette of a fan space takes work, and until people know that etiquette they might behave in ways that the fan community does not expect or accept. Spending time within a fan community before researching or writing about it allows someone to better judge what public data might be safe to uplift from fandom to a broader audience.

[5.14] Though our participants shared the general perception that people do not write about fandom unless they started out within fandom (which may be true), online fandom is an important phenomenon that raises many interesting research questions in fields outside of fan studies (Dym et al. 2018). We recommend that anyone coming from the outside to write about fandom (whether a researcher or journalist) should spend time there and take the time to talk to fans and to understand and learn their norms. However, even researchers who are part of fandom may not know the norms of a specific fan community, so the same applies when delving into unfamiliar fandom spaces. It is critical that we are mindful of each user's reasonable expectations of privacy, which may be dependent on the community or platform.

[5.15] Fan studies scholars acknowledge that data created in fan communities are usually created with a certain understanding that those spaces are closed off to the public to some degree (Busse and Hellekson 2012). Although these data are certainly public, community norms determine what is and is not acceptable to do with them. As one participant said,

[5.16] There are things that should not be spoken of outside fandom. It's kind of like fight club. First rule of fandom. (PI-17)

[5.17] This first rule of fandom is a norm broadly understood by most people in fan communities. The very nature of research violates this rule, however, considering that our goal is to lift artifacts of fandom from their original space and bring them to a new audience. Ethnographic work can lend itself well to researchers observing other social media spaces, potentially saving someone a misstep before it happens. While learning the norms of a community has been a long-held value of ethnographers, this research illustrates the expectation that fan community members hold that researchers abide by this practice regardless of their discipline or methods.

[5.18] The five practices we detailed (obtaining permission, obscuring data, attribution, giving back, and learning community norms) are not just a useful heuristic for ethical research in fan communities but are also valuable considerations to keep in mind when working with all forms of public data, especially when those data come from people with particularly sensitive privacy concerns (Cho 2018; Dym and Fiesler 2018b). Our findings emphasize the importance of moving beyond the notion of publicness as a one-size-fits-all ethical heuristic for data collection, and instead considering important factors such as who the data belong to, what kind of content they are, and what the researcher intends to do with them.

[5.19] Even for public data sets in which individual participants might number in the tens of thousands to the millions, it might be possible to talk to some members of the target population in order to better understand what values and concerns people might hold that would deter them from consenting to their data being used. Better understanding of the community can also highlight instances when attribution is desirable and beneficial for a person or community, or when and how ethical fabrication might be more appropriate. Spending time with the community a researcher intends to study permits learning important norms and better understanding how to safely use public data, thus reducing the risk of harm.

[5.20] We also argue that researchers should, when possible, give back to the communities they research, sharing their results in accessible ways. However, most importantly, we stress the importance of making all these decisions contextually—for the specific subjects of study, the specific community, and the specific methods for analysis and reporting, with an emphasis on understanding the risks of amplifying content beyond its intended audience.

6. Conclusion

[6.1] Fan studies are uniquely positioned within internet research because of their long history of engaging with research ethics concerning human subjects. However, along with fandom's move online we have seen more large-scale data explorations. As different types of data integrate into fan studies and beyond, we must pause and ask ourselves to what ethical standards we will hold our researchers and how we might better take care and hold space for the complex and contextual data generated through online transactions.

[6.2] The different layers of publicness inherent to online communities contribute to unspoken norms concerning what kinds of data people share, and where they share them. Within fandom, where a person's audience might be limited to a close-knit group of a hundred people, someone might be more than comfortable sharing a highly personal or contentious post or fan work. When a researcher or journalist relocates that same content—even when it is public—to a different space, they are changing the audience, potentially amplifying it to a much broader audience than ever originally intended and thus exposing the data’s owner to more privacy risks. By incorporating a mindfulness of these shifting concerns in privacy, context, and audience, we can better ensure the continued ethical use and study of data generated within fandom and beyond.

7. Acknowledgments

[7.1] As always, we would like to thank our participants for sharing their stories. We also would like to thank the members of the Internet Rules Lab (IRL) for their support throughout the research process. This work was funded by NSF award IIS-1704369 as part of the PERVADE (Pervasive Data Ethics for Computational Research) project.

8. References

Ayers, John W., Theodore L. Caputi, Camille Nebeker, and Mark Dredze. 2018. "Don't Quote Me: Reverse Identification of Research Participants in Social Media Studies." NPJ Digital Medicine 1:30.

Bennett, Lucy. 2018. "Surveying Fandom: The Ethics, Design, and Use of Surveys in Fan Studies." In The Routledge Companion to Media Fandom, edited by Melissa A. Click and Suzanne Scott, 36–44. New York: Routledge.

Black, Rebecca W. 2005. "Access and Affiliation: The Literacy and Composition Practices of English‐Language Learners in an Online Fanfiction Community." Journal of Adolescent and Adult Literacy 49 (2): 118–28.

boyd, danah. 2016. "Untangling Research and Practice: What Facebook's 'Emotional Contagion' Study Teaches Us." Research Ethics 12 (1): 4–13.

Braun, Virginia, and Victoria Clarke. 2006. "Using Thematic Analysis in Psychology." Qualitative Research in Psychology 3 (2): 77–101.

Bruckman, Amy, Kurt Luther, and Casey Fiesler. 2015. "When Should We Use Real Names in Published Accounts of Internet Research?" In Digital Research Confidential: The Secrets of Studying Behavior Online, edited by Eszter Hargittai and Christian Sandvig, 243–58. Cambridge, MA: MIT Press.

Busse, Kristina. 2018. "The Ethics of Studying Online Fandom." In The Routledge Companion to Media Fandom, edited by Melissa A. Click and Suzanne Scott, 9–17. New York: Routledge.

Busse, Kristina, and Shannon Farley. 2013. "Remixing the Remix: Fannish Appropriation and the Limits of Unauthorised Use." M/C Journal 16 (4).

Busse, Kristina, and Karen Hellekson. 2012. "Identity, Ethics, and Fan Privacy." In Fan Culture: Theory/Practice, edited by Katherine Larsen and Lynn Zubernis, 38–56. Newcastle upon Tyne: Cambridge Scholars.

Campbell, Julie, Cecilia Aragon, Katie Davis, Sarah Evans, Abigail Evans, and David Randall. 2016. "Thousands of Positive Reviews: Distributed Mentoring in Online Fan Communities." In CSCW '16: Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing, 691–704. New York: Association for Computing Machinery.

Chess, Shira, and Adrienne Shaw. 2015. "A Conspiracy of Fishes, or, How We Learned to Stop Worrying about #GamerGate and Embrace Hegemonic Masculinity." Journal of Broadcasting and Electronic Media 59 (1): 208–20.

Cho, Alexander. 2018. "Default Publicness: Queer Youth of Color, Social Media, and Being Outed by the Machine." New Media and Society 20 (9): 3183–200.

Coppa, Francesca. 2017. The Fanfiction Reader: Folk Tales for the Digital Age. Ann Arbor: University of Michigan Press.

Dreyfuss, Emily. 2018. "Was It Ethical for Dropbox to Share Customer Data with Scientists?" Wired, July 24, 2018.

Dubois, Elizabeth, Anatoliy Gruzd, and Jenna Jacobson. 2018. "Journalists' Use of Social Media to Infer Public Opinion: The Citizens' Perspective." Social Science Computer Review 38 (1): 57–74.

Dym, Brianna, Cecilia Aragon, Julia Bullard, Ruby Davis, and Casey Fiesler. 2018. "Online Fandom: Boldly Going Where Few CSCW Researchers Have Gone Before." In CSCW '18: Companion of the 2018 ACM Conference on Computer Supported Cooperative Work and Social Computing, 121–24. New York: Association for Computing Machinery.

Dym, Brianna, and Casey Fiesler. 2018a. "Generations, Migrations, and the Future of Fandom's Private Spaces." In "The Future of Fandom," edited by Kristina Busse and Karen Hellekson, special 10th anniversary issue, Transformative Works and Cultures, no. 28.

Dym, Brianna, and Casey Fiesler. 2018b. "Vulnerable and Online: Fandom's Case for Stronger Privacy Norms and Tools." In CSCW '18: Companion of the 2018 ACM Conference on Computer Supported Cooperative Work and Social Computing, 329–32. New York: Association for Computing Machinery.

Evans, Sarah, Katie Davis, Abigail Evans, Julie Ann Campbell, David P. Randall, Kodlee Yin, and Cecilia Aragon. 2017. "More Than Peer Production: Fanfiction Communities as Sites of Distributed Mentoring." In CSCW '17: Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, 259–72. New York: Association for Computing Machinery.

Fathallah, Judith. 2016. "Transparency and Reciprocity: Respecting Fannish Spaces in Scholarly Research." Journal of Fandom Studies 4 (3): 251–54.

Fiesler, Casey, and Amy S. Bruckman. 2014. "Remixers' Understandings of Fair Use Online." In CSCW '14: Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing, 1023–32. New York: Association for Computing Machinery.

Fiesler, Casey, and Blake Hallinan. 2018. "'We Are the Product': Public Reactions to Online Data Sharing and Privacy Controversies in the Media." In CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, no. 53. New York: Association for Computing Machinery.

Fiesler, Casey, Shannon Morrison, and Amy S. Bruckman. 2016. "An Archive of Their Own: A Case Study of Feminist HCI and Values in Design." In CHI '16: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2574–85. New York: Association for Computing Machinery.

Fiesler, Casey, and Nicholas Proferes. 2018. "'Participant' Perceptions of Twitter Research Ethics." Social Media + Society 4 (1).

Guerrero-Pico, María del Mar, María-José Establés, and Rafael Ventura. 2018. "Killing Off Lexa: 'Dead Lesbian Syndrome' and Intra-fandom Management of Toxic Fan Practices in an Online Queer Community." Participations 15 (1): 311–33.

Hellekson, Karen, and Kristina Busse. 2009. "Fan Privacy and TWC's Editorial Philosophy." Organization for Transformative Works, December 5, 2009.

Jenkins, Henry. 1988. "Star Trek Rerun, Reread, Rewritten: Fan Writing as Textual Poaching." Critical Studies in Mass Communication 5 (2): 85–107.

Jensen, Thessa. 2016. "Let's Make It Personal! Ontological Ethics in Fan Studies." Journal of Fandom Studies 4 (3): 255–73.

Jones, Bethan. 2016. "'I Hate Beyoncé and I Don't Care Who Knows It': Towards an Ethics of Studying Anti-fandom." Journal of Fandom Studies 4 (3): 283–99.

Kelley, Brittany. 2016. "Toward a Goodwill Ethics of Online Research Methods." Transformative Works and Cultures, no. 22.

Markham, Annette. 2012. "Fabrication as Ethical Practice: Qualitative Inquiry in Ambiguous Internet Contexts." Information, Communication and Society 15 (3): 334–53.

Markham, Annette, and Elizabeth Buchanan. 2012. Ethical Decision-Making and Internet Research: Recommendations from the AOIR Ethics Working Committee (Version 2.0). Association of Internet Researchers Ethics.

Massanari, Adrienne. 2017. "#Gamergate and The Fappening: How Reddit's Algorithm, Governance, and Culture Support Toxic Technocultures." New Media and Society 19 (3): 329–46.

Nielsen, Ej. 2016. "Dear Researcher: Rethinking Engagement with Fan Authors." Journal of Fandom Studies 4 (3): 223–49.

Nissenbaum, Helen. 2004. "Privacy as Contextual Integrity." Washington Law Review 79 (1): 119–58.

Proferes, Nicholas. 2017. "Information Flow Solipsism in an Exploratory Study of Beliefs about Twitter." Social Media + Society 3 (1).

Reid, Robin Anne. 2016. "Ethics, Fan Studies and Institutional Review Boards." Journal of Fandom Studies 4 (3): 275–81.

Seidman, Irving. 2013. Interviewing as Qualitative Research: A Guide for Researchers in Education and the Social Sciences. 4th ed. New York: Teachers College Press.

Tushnet, Rebecca. 2007. "Payment in Credit: Copyright Law and Subcultural Creativity." Law and Contemporary Problems 70 (2): 135–74.

Vitak, Jessica, Nicholas Proferes, Katie Shilton, and Zahra Ashktorab. 2017. "Ethics Regulation in Social Computing Research: Examining the Role of Institutional Review Boards." Journal of Empirical Research on Human Research Ethics 12 (5): 372–82.

Vitak, Jessica, Katie Shilton, and Zahra Ashktorab. 2016. "Beyond the Belmont Principles: Ethical Challenges, Practices, and Beliefs in the Online Data Research Community." In CSCW '16: Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing, 941–53. New York: Association for Computing Machinery.

Whiteman, Natasha. 2012. "Undoing Ethics." In Undoing Ethics: Rethinking Practice in Online Research, 135–49. London: Springer.

Whiteman, Natasha. 2016. "Unsettling Relations: Disrupting the Ethical Subject in Fan Studies Research." Journal of Fandom Studies 4 (3): 307–23.

Williams, Matthew L., Pete Burnap, and Luke Sloan. 2017. "Towards an Ethical Framework for Publishing Twitter Data in Social Research: Taking into Account Users' Views, Online Context and Algorithmic Estimation." Sociology 51 (6): 1149–68.

Zimmer, Michael. 2010. "'But the Data Is Already Public': On the Ethics of Research in Facebook." Ethics and Information Technology 12 (4): 313–25.

Zubernis, Lynn, and Kelsey Davis. 2016. "Growing Pains: The Changing Ethical Landscape of Fan Studies." Journal of Fandom Studies 4 (3): 301–6.