Data Science is More “Human” Than You Think

Data science… that big, shiny, new field that has everyone talking.

Before entering graduate school, I didn’t know much about data science. I only knew that it was some newfangled branch of computer science, and that I wasn’t interested. Me? I wanted to work with people. I wanted to make change in the world. I wanted to use psychology (my undergraduate major). I wanted to help people and study humanity. Data science doesn’t have any of those things, I thought.

Make no mistake, data science (especially data science specific to civic work) is, and will remain, a very human field. This was solidified for me at my very first data science conference, Data Science for Social Good (DSSG) 2017, where acquaintances and peers alike listened to and learned from one another to see the human side of it all.

To make a long story short, I spent the majority of my energy simply listening to all of the smart folks around me. I learned a lot and came away with a few tidbits of advice for data newbies who might be under the impression (as I once was) that data science is an inanimate field. It is far from it. Here are a few of the most significant things that I heard, followed by my interpretations:

“It takes a team.”

Data science is not about isolating yourself in a cubicle. Realize that you’re not getting anywhere alone, and that any data science project will need a team. That team does not need to be made up entirely of data scientists. In fact, the broader the team’s combined knowledge, the better.

“Hire a conscience.”

Data science requires thoughtful minds, so employ compassionate professionals who can do more than just solve data problems. Seek to hire people who are conscientious, self-aware, and socially aware. In particular, seek those who can critically evaluate a problem from more than one perspective.

For example, say your question is, “How can we help mitigate the negative effects of gentrification?” Sociologists and other social scientists, African American studies majors, urban planners, and so on are all plausible team members for a sensitive question of this complexity.

“Data is not racist.”

While some might say data itself has no opinion (“data is not racist,” said one conference presenter), I would have to disagree. Perhaps data can be racist, because data collection methods can introduce bias into datasets. It is important to recognize that any person is likely to bring their own biases into a project. This becomes a problem when we think about the real-life outcomes of data science projects: the reallocation of city funds, resource development, and where time and attention are spent.

While it is not always clear how to actively mitigate these issues (ethical and unbiased methodologies are more easily delineated than they are actually practiced), there are steps that we can take. For example, make an effort to think critically about your research questions. Have others take a second look. Do your questions make assumptions based on some unfounded opinion? Remember, research should be unassuming.

“This is about working with people, not with data.”

People = data. Perhaps the fact that information is often represented numerically causes us to forget this, but the truth remains: Data typically comes from people.

Facebook data comes from Facebook users, health data comes from patients, housing data comes from renters and homeowners, and so on. This being said, always search for the humanity behind your data. Be considerate of that reality. This consideration will help you to fairly scope a line of questioning and ground the project in reality.

“Thank you for being here. I appreciate you all being here to listen today, because I need help with this.”

Data science is collaborative. Openness and peer-to-peer evaluation are necessary components of efficient data science. This returns us to the first quote I shared: “It takes a team.” Emphasize conversation, ask for feedback, and adopt a culture of community.

It’s true that for newcomers (and I speak from experience), certain topics within the field of data science can feel impersonal. The words “data” and “science” alone can evoke mental images of lifeless objects (computers, infinite numbers, rows upon rows of data entries). However, it is important to remember what data science runs on: human beings.

The people whose lives, experiences, or characteristics have been translated into quantitative or qualitative data points comprise our datasets. Conscientious minds are then called upon to evaluate these data. The gentle, caring touch of a well-rounded team of people is necessary to avoid biased, racist or discriminatory methods. Humanity envelops data science, and data science cannot function effectively without humanity. So for anyone who argues that data science will one day be monopolized by robots, I disagree.

Data science thrives when you keep people in mind. Specifically, civic data science lacks insight and complexity without the oversight of humanity.
