Putting Large Data Collections in Conversation with Small Data and Situated Ways of Knowing
Taken together, the graphs, charts, and maps in this webtext begin to build a polyvocal and nuanced understanding of anti-gentrification encounters located through Twitter data collected from 2016 to 2018, perhaps even developing an "antenarrative" of gentrification in our contemporary moment. Natasha Jones, Kristen Moore, and Rebecca Walton (2016) stated that "in contrast to narratives, which Boje (2011) conceived as characterized by 'stability and order and univocality,' antenarratives are poly-vocal, dynamic, and fragmented—yet highly interconnected" (p. 212). Jones and colleagues employed the concept of the antenarrative to theorize how dominant narratives can be troubled and made more multifaceted by the inclusion of antenarratives that complicate the dominant perspective. Jones et al. demonstrated, for instance, how the dominant perspective that technical communication has been historically apolitical can be troubled by recovering and amplifying tech comm's parallel history of activism. Similarly, I argue that the rhetorical feminist methodology I've begun to demonstrate throughout this webtext can help researchers locate encounters, amplify resistance, and create visualizations that contribute antenarratives that are "poly-vocal, dynamic, and fragmented–yet highly interconnected" and that "link the static dominant narrative of the past with the dynamic 'lived story' of the present to enable reflective (past oriented) and prospective (future oriented) sense making" (Boje, qtd. in Jones et al., 2016, p. 212). Locating and amplifying anti-gentrification encounters and rhetoric circulating on Twitter can provide an important corrective to dominant narratives about internet activism, data and data visualization, and urban renewal and gentrification.
As we continue to engage with large datasets and the data analysis and visualization technologies necessary to work with them, feminist methodologies can assist rhetoric and writing studies in formulating and practicing critical and inventive digital methodologies.
Going Further in Participatory and Community-Engaged Visualization
Before I summarize what I see as the benefits of feminist rhetorical methodologies for data and visualization, I will first identify some limitations of my study of anti-gentrification rhetoric on Twitter. Throughout this webtext, I've tried to point out how the choices I've made and the affordances of the tools I've used have shaped the resulting visualizations and the partial knowledge they communicate, as well as to emphasize the importance of situating data in bodies and geographic locations; however, as scholars like Catherine D'Ignazio and Lauren Klein (2016) pointed out, feminist visualization practices should also endeavor to "examine power and aspire towards empowerment." The authors went on to explain that doing so involves "ensuring that the outcomes of our design research connect back to the communities that first made them possible." At this stage in my research, I haven't yet done enough to include the communities that made this data possible, particularly the Defend Boyle Heights (DBH) community. Although there are logistical hurdles to doing so, truly situated feminist methodologies would involve these efforts toward community-informed research and visualization and would also offer reciprocity to the communities. In the case of my study, this will involve contacting Defend Boyle Heights, sharing the data I've collected, offering opportunities for input and correction, and providing an opportunity for the community to opt out of any additional data collection and research. As an effort toward reciprocity, I'll offer the data collection, analysis, and visualizations to the community to be used for their own purposes. DBH might use the visualizations in their efforts to persuade others of the negative impact of gentrifying forces in their community and to illustrate their tactics and impact to build new alliances.
As I acknowledge these shortcomings and ways in which I will seek to modify my methodology to more closely align with feminist concerns for situated and contextualized research, I also urge us to begin conversations within our universities and other places of research about how they might build the institutional infrastructures to support community-connected data research and provide the support necessary for ethical community research partnerships.
Feminism-informed methodologies offer rhetoric and writing studies critical frameworks to push back at disembodied data epistemologies that rely on notions that such ways of knowing are objective, neutral, and apolitical. This is no less important for those of us who engage with large Twitter datasets, which can also "appear immediate, seamless, unified, isolated from time and space" (Wolff, 2015). Rhetorical feminist methodologies push back at such epistemologies by illuminating their structuring perspectives and the rhetorical qualities of data, computational analytics, and visualization. I've also emphasized that we can enact feminist rhetorical methodologies through conscious and rigorous efforts to attend to and influence data structuring in collection, analysis, and visualization; through efforts like these, data visualization can also orient from perspectives that dissent from dominant cultural, social, and technological narratives and perhaps play a small role in undermining dominant power structures like data epistemologies and gentrification.
By choosing to depart from computational analytics of the large dataset and create a smaller data study through Defend Boyle Heights' tweet perspective, I dissented from the pattern-finding and quantification that is the primary mode of understanding large datasets. Instead, I attempted to amplify DBH's dissenting perspective on urban change as cultural and human erasure by mapping the forces they identify as negatively impacting their landscape and communities. I also accounted for the embodied, communicative, and legal actions resisting these forces in order to retain physical and cultural rights in Boyle Heights. Although the visualizations in this webtext make some progress in actualizing the goals of this research and practicing a rhetorical feminist methodology, particularly through polyvocality and multiplicity, I've also elaborated on the ways in which the existing tools' affordances and my technological skill level weren't adequate for reaching the kind of embodied and situated visualizations that would go further in disrupting the omniscient and totalizing view of charts, graphs, and maps; however, with more collaboration and dedicated resources, rhetoric and writing studies can better actualize our own visualization goals and offer ways to turn the big data paradigm inside out.
First, we can go further to privilege manual coding informed by grounded theory over algorithmic analysis. Several of the rhetorical data studies I've referred to in this webtext, such as work from William I. Wolff (2015), Laurie Gries (2015), and Danielle Endres et al. (2016), adopted some form of grounded theory, even though they also used computational modes at various stages of data collection, analysis, and visualization. LaDona Knigge and Meghan Cope (2006) emphasized that grounded theory acknowledges "analysis to be a social practice with all its attendant subjectivities, partial knowledges, and positionalities" (p. 2026). As my own research progresses, I could use the themes derived from the grounded theory coding of DBH's tweets to develop ways to apply these codes to an analysis of the larger dataset. This point informs the second way we might influence data analysis and visualization studies more broadly: allowing small data to inform large data analysis. Rather than leading with large-scale pattern analysis and deriving outcomes from algorithmic analysis, the more embodied, situated accounts in our small data analysis could become the orientation guiding the analysis of the whole. For instance, I might use the thematic and action codes I've derived from DBH tweets to train topic modeling tools like scikit-learn or MALLET in an analysis of the whole dataset.
I don't expect that the whole dataset can be reduced to these codes any more than DBH's tweet codes totally encapsulate an understanding of their embodied actions, communication practices, or legal actions, nor do I expect the DBH codes to encompass all the entities at play in gentrification on a global scale; however, leading with theories born from small data significantly departs from what Rob Kitchin (2014) identified as one of the false epistemological beliefs of the big data paradigm: that "through the application of agnostic data analytics the data can speak for themselves free of human bias or framing, and any patterns and relationships within Big Data are inherently meaningful and truthful" (p. 4). What’s more, by starting with small data analysis informed by critical and rhetorical theories as well as traditional humanities research practices of close reading and immersive research, we resist adopting the blackboxed logic of algorithmic analysis and the seeming neutrality of pattern-finding in favor of explicitly forming theories, hypotheses, and critical stances that don't relinquish research outcomes to the novel results of computational analytics but instead bring our theoretical and methodological choices to bear on data research.
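One way to picture small data guiding the analysis of a larger collection is sketched below in Python. It is a minimal illustration under stated assumptions, not the method used in this study: it swaps simple keyword matching in for topic modeling, the code labels only echo the embodied, communicative, and legal actions discussed above, and the seed terms and example tweets are invented placeholders. Tweets that match no code are set aside for human reading rather than discarded.

```python
import re
from collections import defaultdict

# Hypothetical codebook: the labels echo the action categories named in the
# text; the seed terms are invented placeholders, not the study's real codes.
CODEBOOK = {
    "embodied_action": {"march", "protest", "picket"},
    "communicative_action": {"flyer", "statement", "meeting"},
    "legal_action": {"lawsuit", "eviction", "ordinance"},
}


def tokenize(text):
    """Lowercase a tweet and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def apply_codebook(tweets, codebook):
    """Group tweets under every code whose seed terms they contain.

    Returns (coded, uncoded): a dict mapping code -> list of tweets, and a
    list of tweets that matched no code and still need a human reader.
    """
    coded = defaultdict(list)
    uncoded = []
    for tweet in tweets:
        tokens = tokenize(tweet)
        matches = [code for code, seeds in codebook.items() if tokens & seeds]
        for code in matches:
            coded[code].append(tweet)
        if not matches:
            uncoded.append(tweet)
    return dict(coded), uncoded


# Invented example tweets, for illustration only.
SAMPLE_TWEETS = [
    "Join the march against displacement this Saturday",
    "Tenants filed a lawsuit to stop the eviction",
    "New coffee shop opening downtown",
]

coded, uncoded = apply_codebook(SAMPLE_TWEETS, CODEBOOK)
```

Keeping the unmatched tweets in their own list reflects the point made above: the codes orient the analysis of the whole dataset without pretending to exhaust it.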
Finally, feminist rhetorical methodologies allow not just for a critical awareness of the affordances of tools, but an active engagement to retool them according to our theoretical perspectives and research goals. When we reorient our practices away from the quantitatively greatest data points in the Tableau chart and graph visualizations and toward using Tableau to identify what these measures might otherwise miss, re-arrangement becomes a key rhetorical strategy for locating and reading, as human readers, the individual tweets that might otherwise become the missing data in our collections. By first being aware of the guiding perspective of tools and visualizations like Tableau and then actively using them differently (arranging the affordances to locate and highlight data that is present but not deemed significant by these technologies), we not only take more responsibility for our research outcomes but develop practices that come closer to actualizing the playful and inventive potential of engaging technological infrastructures advocated for by rhetoric scholars like Gries (2015) and Douglas Eyman (2016). We also demonstrate to other fields that data isn't neutral, tools aren't fact-finding machines, and visualizations don't lack authorial framing and perspective. By doing so, perhaps we progress towards what Kitchin (2014) identified as one of the positive potentialities of humanities-based engagement with large datasets and their attendant technologies: modifying positivist data epistemologies to show that "the research conducted is reflexive and open with respect to the research process, acknowledging the contingencies and relationalities of the approach employed, thus producing nuanced and contextualized accounts" (p. 9).
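The same re-arrangement can be sketched outside of Tableau. The short Python example below is an illustrative assumption rather than a record of the Tableau workflow described here: it counts hashtags across a set of tweets and then, instead of ranking the most frequent, returns the tweets carrying the least frequent ones so that a person can read them. The function names, the `max_count` threshold, and the sample tweets are all hypothetical.

```python
import re
from collections import Counter


def hashtags(text):
    """Return the lowercased hashtags in a tweet, without the '#' sign."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]


def rare_hashtag_tweets(tweets, max_count=1):
    """Return (tweet, rare_tags) pairs for tweets using infrequent hashtags.

    A hashtag counts as rare when it appears at most `max_count` times across
    the whole collection, which inverts the usual most-frequent-first ranking.
    """
    counts = Counter(tag for tweet in tweets for tag in hashtags(tweet))
    rare = {tag for tag, n in counts.items() if n <= max_count}
    results = []
    for tweet in tweets:
        found = sorted(rare.intersection(hashtags(tweet)))
        if found:
            results.append((tweet, found))
    return results


# Invented example tweets, for illustration only.
SAMPLE_TWEETS = [
    "Rents keep rising #gentrification",
    "Neighbors organizing tonight #gentrification #tenantsunion",
    "Another development approved #gentrification",
]

to_read = rare_hashtag_tweets(SAMPLE_TWEETS)
```

The output is a reading list, not a finding: the threshold only decides which tweets a human reader looks at first.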
Although I eagerly await free and accessible data collection, analysis, and visualization tools built around our shared disciplinary principles and needs, I also share Aaron Beveridge's (2018) sense of urgency that the time to develop data analysis methodologies is now. As Cathy O'Neil (2016) emphasized in Weapons of Math Destruction, data-born ways of knowing are only increasing and "predictive models are, increasingly, the tools we will be relying on to run our institutions, deploy our resources, and manage our lives" (p. 218). As such, data literacy, which includes methodologies for practicing data analysis and visualization, is paramount to our research and teaching practices. Teaching students data literacy through hands-on engagement with collection, analysis, and visualization tools can make manifest the ways in which data is not abstract but situated and constructed. Furthermore, as datafied ways of knowing and visualization become dominant modes of communicating about and understanding the world, students should develop the literacies and skills that allow critical engagements with data and visualization.
As I hope this webtext demonstrates, rhetoric and writing studies scholars are uniquely equipped to offer critical and generative methodologies that challenge the assumptions around data and visualizations while also offering productive and inventive ways forward. Although many of the tools I've employed are low or no-cost and many of them don't require any programming skills, this project would be improved by the contribution of others with different skill sets and experience, such as other scholars, technologists, librarians, and community members. Such collaborations increase our own literacies and expand the possibilities of our research and visualizations. My hope is that more of us will not just explore large datasets and analytics, but create inter-institutional, cross-institutional, and community collaborations that, through the sharing of different skills and disciplinary perspectives, reinvent what’s possible and preferable with data analysis and visualization.