Algorithms and AI affect so many aspects of everyday life. Although AI can deliver extraordinary value and insight, it can also hurt people. In this episode of Working Better, we dive deep into the issue of bias in AI to illuminate the stories, consequences, and complicated ethical questions that have experts pushing for progress.
- Kyle Hundman – Data Science Manager, American Family Insurance
- Deena McKay – Functional Delivery Consultant, Kin + Carta; Founder and Host of Black Tech Unplugged
- Maxwell Young – UX Designer, Kin + Carta
- Nicolas Kayser-Bril – Journalist, AlgorithmWatch
“Alexa, play Whitney Houston as loud as possible.”
Voice assistants are a great way to demonstrate how an algorithm works. In its simplest form, an algorithm is just a sequence of steps designed to accomplish a task.
Alexa uses a voice recognition algorithm to understand that I want music, that I want that music to be Whitney Houston’s, and that I want it played at maximum volume. It moves through a carefully designed sequence of rules that ends with “I Wanna Dance with Somebody” playing loud enough to wake up my neighbors. As requested.
So let’s say, hypothetically, that’s how I start every Friday morning. Except this Friday, I just say, “Alexa, play some music.” Alexa will then be more likely to play Whitney Houston, or something like it, because it has learned my preferences and can now better predict what I want to hear.
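To make "a sequence of steps" concrete, here is a minimal Python sketch of the idea described above. It is a toy, not how Alexa actually works: the function name `handle_request`, the keyword matching, and the sample listening history are all assumptions made up for illustration.

```python
from collections import Counter


def handle_request(command: str, listening_history: list[str]) -> str:
    """Toy voice-assistant pipeline: a fixed sequence of steps (hypothetical)."""
    text = command.lower()

    # Step 1: does the request ask for music at all?
    if "play" not in text:
        return "Sorry, I can only play music."

    # Step 2: was a known artist named explicitly?
    for artist in set(listening_history):
        if artist.lower() in text:
            return f"Playing {artist}"

    # Step 3: no artist named, so fall back on the most-played artist so far.
    if listening_history:
        favorite, _ = Counter(listening_history).most_common(1)[0]
        return f"Playing {favorite}"

    return "Playing something popular"


# Made-up history standing in for "every Friday morning".
history = ["Whitney Houston", "Whitney Houston", "Whitney Houston", "Prince"]
print(handle_request("Alexa, play some music", history))
```

The "learning" here is nothing more than counting past requests, which is enough to show why the output depends entirely on what the system has already seen.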
(00:48) The Pervasiveness of Algorithms
That’s just one example of how algorithms, machine learning, and AI are used in everyday life. It’s also a fairly harmless example. Which is not always the case. Algorithms are used to predict the things you might buy, the fastest route to the grocery store, your qualifications for a job, how likely you are to pay back a loan, which Pokemon you are based on your grocery list, and more.
But, whatever their purpose, algorithms all have one thing in common: they’re designed by people.
In case you haven’t caught a headline for the last 5,000 years or so, people are far from perfect. So when the stuff that goes into these algorithms is designed by humans and modeled after human behavior, the output can be just as flawed. Bias in the form of racism, sexism, and other kinds of discrimination becomes solidified in code and embedded into everyday products, and it affects people’s lives in very real ways.
So today we’re going to shed light on the dangers of bias in AI, why it’s so hard to fix, and what we can do to overcome it and help create more representative, equitable, and accountable AI.
(02:10) The Building Blocks of AI
First, a little algorithm and AI 101.
Let’s say an algorithm is a building. Data points and lines of code are like brick, mortar, and concrete–raw material used in different ways for different purposes. Some become apartment buildings. Some become museums. And, thankfully, some become Wendy’s restaurants.
Artificial intelligence, then, is sort of like a city–a collection of different buildings, all designed to interact, depend on, and benefit from one another. Today–we’re going to talk a lot about algorithms, the buildings designed by people, which can accomplish extraordinary things–but can also cause harm in all sorts of ways.
Deena: “When you would go wash your hands and you put your hand under the sink, would it work automatically?”
Maxx: Yes, typically, yeah.
Deena McKay, a delivery consultant here at Kin + Carta, was talking with our producer Maxx (who is white). Maxx thought Deena might just be checking up on his COVID hygiene, but she was actually illustrating just how widespread this issue is, even with a fairly low-tech example:
Deena: “So, me being a person of color, it doesn’t work automatically. Sometimes I have to move my hand around. Or sometimes I have to maybe even go to an entirely different sink because of the way that these things were created. Was it with a diverse thought? And sometimes people who are Brown/Black minorities, our hands don’t automatically get recognized, even just for washing our hands, which is crazy because we obviously need to wash our hands.”
Yes, we do. And with that type of fundamental failure, it doesn’t take much to imagine how it could lead to much more severe consequences. As Deena explained, “If you have that concept of we can barely wash our hands, imagine what would happen if it was a self-driving car, and it didn’t recognize me walking across the street. It’s going to hit me.”
Deena is also the host of another podcast that we highly encourage you to check out called Black Tech Unplugged. It is an amazing podcast where Deena talks with other Black people currently working in tech to share their stories about how they got started and encourage other people of color to work in the tech industry.
(04:12) Joy Buolamwini – The Coded Gaze
If you’ve heard anything recently about racial bias in AI, you may have heard about the remarkable work of Joy Buolamwini. In her own words, Joy is a poet of code who uses art and research to illuminate the social implications of artificial intelligence. Joy was working at the MIT Media Lab when she made a startling discovery. Joy explains, via a talk at the 2019 World Economic Forum:
“I was working on a project that used computer vision, didn’t work on my face, until I did something. I pulled out a white mask, and then I was detected.”
In the talk, Joy shows a video of herself sitting in front of a computer vision system. In this system, white male faces are recognized immediately, but when she sits down, nothing–until she puts on an expressionless, seemingly plastic white mask. Joy set out to determine why this was happening, to uncover the biases within widely used facial recognition systems, and help build solutions to correct the issue.
Joy’s story is the subject of a new documentary called Coded Bias, which premiered at the Sundance Film Festival earlier this year. Joy is also the founder of the Algorithmic Justice League, an organization aiming to illuminate the social implications and dangers of artificial intelligence. As Joy says, if Black faces are harder for AI to detect accurately, it means there’s a much higher chance they’ll be misidentified.
(05:32) Wrongfully Accused
Take the story of Robert Williams, a man from Detroit who was wrongfully arrested at his home for a crime he didn’t commit. In a piece produced by the ACLU, Robert describes his conversation with police after he was first detained.
“The detective turns over a picture and says, ‘That’s not you?’ I look, and I say ‘No, that’s not me.’ He turns another paper over and says ‘I guess that’s not you either.’ I pick that paper up and hold it next to my face, and I say ‘That’s not me. I hope you don’t think all Black people look alike.’ And he says, ‘The computer says it’s you.’”
Although companies including Amazon and IBM have announced they are halting the development of facial recognition programs for police use, Robert’s story is, unfortunately, becoming all too common.
However, the dangers of bias in AI aren’t always so easily seen and demonstrated. They’re not always as tangible as a computer seeing a white face, but not a Black face, or a soap dispenser recognizing white hands more than Black hands.
One study found that a language processing algorithm was more likely to rate white names as “more pleasant” than Black names.
In 2016, an algorithm judged a virtual beauty contest of over 600,000 applicants from around the world–and almost exclusively chose white finalists.
There are well-documented cases in healthcare, financial services, and the justice system, and the list goes on.
(06:58) How does bias in AI happen?
So how do these things happen?
The most obvious place to start is with the data being fed into an algorithm.
(07:05) Bad Data
For image recognition models–the algorithms used in things like soap dispensers or facial recognition software–if the model is trained on data made up mostly of white faces or white hands, it’s going to learn to recognize white skin more easily. Because many of these systems were trained on such a disproportionate sample of white men, Joy gave the phenomenon a name:
“I ran into a problem, a problem I call the pale male data issue. So, in machine learning, which includes techniques being used for computer vision–hence finding the pattern of the face–data is destiny. And right now if we look at many of the training sets or even the benchmarks by which we judge progress, we find that there’s an over-representation of men with 75 percent male for this National Benchmark from the US government, and 80 percent lighter-skinned individuals. So pale male data sets are destined to fail the rest of the world, which is why we have to be intentional about being inclusive.” – Joy Buolamwini
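A first step toward being "intentional about being inclusive" is simply measuring who is in the training set. The sketch below is a minimal illustration: the function `representation_report` and the 100 sample records are hypothetical, with counts chosen only to mirror the 75 percent male and 80 percent lighter-skinned proportions in the quote above.

```python
from collections import Counter


def representation_report(records, attribute):
    """Return each group's share of the dataset for one attribute."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


# Made-up dataset of 100 labeled faces (illustrative counts only).
faces = (
    [{"gender": "male", "skin": "lighter"}] * 60
    + [{"gender": "male", "skin": "darker"}] * 15
    + [{"gender": "female", "skin": "lighter"}] * 20
    + [{"gender": "female", "skin": "darker"}] * 5
)

print(representation_report(faces, "gender"))
print(representation_report(faces, "skin"))
```

A report like this does not fix anything on its own, but it makes the skew visible before a model is trained on it.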
In 2015, Amazon experienced a similar situation. Recruiters at Amazon had built an experimental AI model to help streamline the company’s search for top talent. The tool took thousands of candidates’ resumes, and would quickly identify top prospects, saving hiring managers countless hours. Even when the algorithm was designed to weigh gender neutrally, Amazon found it was heavily favoring men.
Why? The benchmark for top talent was developed by observing patterns in resumes Amazon had received over the previous 10 years, which belonged to, you guessed it, mostly men. The system learned to penalize resumes containing words like “women’s” as in “women’s college” or “women’s debate team” because they weren’t phrases likely to show up in previous applicants’ resumes.
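The mechanism is easy to reproduce on a toy scale. The sketch below is not Amazon’s system, whose details are not public here; it is a deliberately naive scorer, with a hypothetical `learn_token_scores` function and six made-up resumes, that rates each word by how often past resumes containing it led to a hire. Gender is never an input, yet the word "women's" still ends up with the lowest score because of the skewed history.

```python
from collections import defaultdict


def learn_token_scores(history):
    """Score each word as (hire rate of resumes containing it) - (overall hire rate).

    history is a list of (resume_text, was_hired) pairs.
    """
    overall_rate = sum(was_hired for _, was_hired in history) / len(history)
    seen = defaultdict(int)
    hired = defaultdict(int)
    for text, was_hired in history:
        for token in set(text.lower().split()):
            seen[token] += 1
            hired[token] += was_hired
    return {token: hired[token] / seen[token] - overall_rate for token in seen}


# Made-up hiring history that reflects past decisions, not candidate quality.
history = [
    ("captain chess club", True),
    ("captain debate team", True),
    ("software engineer intern", True),
    ("captain women's chess club", False),
    ("women's debate team", False),
    ("software engineer", True),
]

scores = learn_token_scores(history)
for token, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{token:10s} {score:+.2f}")
```

Removing an explicit gender field is not enough when other features carry the same signal, which is the proxy problem in miniature.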
(08:40) Diversity of Perspective
“It really comes down to the fact that you need more multidisciplinary people making these decisions. Twitter was invented by a bunch of white guys at a table, and they never thought of any problems that wouldn’t affect them as white guys.” – Max Young
That’s Max Young, a UX designer from the Kin + Carta UX team. Max says that often the simplest place to start is by looking at who is in the room. Deena agrees: “I would always like to see more people who look like me, in the workplace, doing tech work.”
(09:39) Reinforcing Systemic Bias
There are also cases where algorithms that overlook broader systemic issues–like gender and racial inequality–can actually continue to reinforce them. To help explore this idea, we sat down with Kyle Hundman. Kyle leads a team at the Data Science and Analytics Lab at American Family Insurance.
“If your algorithm is a mirror of humanity, you failed and your algorithm is biased.” – Kyle Hundman, Data Science Manager, American Family Insurance
It really is the simplest way to understand it. AI isn’t really artificial intelligence. At Kin + Carta, we often prefer to think of it as augmented intelligence, because it’s not a computer thinking on its own. It’s a computer thinking the way we think, and behaving as we behave, which means it needs to be examined very carefully.
Take the story of COMPAS, an algorithm developed to estimate the likelihood that a defendant will commit another crime. A 2016 ProPublica study analyzed 10,000 defendants scored by the COMPAS system, and the findings were clear: among defendants who did not commit a crime over a two-year period, Black defendants were twice as likely as their white counterparts to have been classified as higher risk. The system had effectively learned to rate Black defendants as higher risk at a disproportionate rate because it was mimicking the bias that we know exists in arrest records.
It’s also one of the reasons some are calling for an overhaul of credit reports as we know them in the US. The short of it is that, beginning in the 1930s, neighborhoods in many American cities were subject to “redlining” policies, allowing mortgage lenders to label predominantly Black neighborhoods as “high-risk” areas, effectively denying Black residents access to credit for years. Even decades after those practices were outlawed, advocates point out that even the simplest of data points can still lead to a disproportionate impact. Kyle helped illustrate one such example, as well as how important, yet still entangled, the conversation can be:
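The measure behind that finding is the false positive rate: of the people who did not go on to reoffend, what share had been labeled high risk? The sketch below shows the calculation on made-up records. The group labels "A" and "B", the helper names, and every count are hypothetical, chosen only so that one group’s rate comes out at twice the other’s; none of it is ProPublica’s data.

```python
def false_positive_rate(records, group):
    """Share of people in `group` who did NOT reoffend but were labeled high risk."""
    did_not_reoffend = [
        r for r in records if r["group"] == group and not r["reoffended"]
    ]
    flagged = [r for r in did_not_reoffend if r["high_risk"]]
    return len(flagged) / len(did_not_reoffend)


def make_records(group, high_risk, reoffended, count):
    """Build `count` identical records (illustrative data only)."""
    return [
        {"group": group, "high_risk": high_risk, "reoffended": reoffended}
        for _ in range(count)
    ]


records = (
    make_records("A", True, False, 40)     # flagged, did not reoffend
    + make_records("A", False, False, 60)  # not flagged, did not reoffend
    + make_records("B", True, False, 20)
    + make_records("B", False, False, 80)
    + make_records("A", True, True, 50)    # reoffenders are ignored by this metric
    + make_records("B", True, True, 50)
)

fpr_a = false_positive_rate(records, "A")
fpr_b = false_positive_rate(records, "B")
print(f"Group A: {fpr_a:.0%}, Group B: {fpr_b:.0%}, ratio: {fpr_a / fpr_b:.1f}x")
```

Comparing error rates per group, rather than overall accuracy, is what exposes this kind of disparity.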
Kyle: “Just because of the use of location and anything that you’re doing that’s consumer facing, because you have all of these historical factors of discrimination and injustice in our country, and those often date back hundreds of years, and still manifest themselves today, it’s a really tricky question to ask, well, can location be a proxy for some of these historical injustices? How much is that still present today? How much does that matter in what we’re doing right now? And then how much of that is actually perpetuating some of those injustices? And that’s where the conversation gets really tricky and really deep.”
(12:14) Understanding the Bigger Picture
There’s clearly no easy solution, but one thing seems clear: the broader social context can’t be ignored when algorithms are making decisions about things like hiring, access to loans, or criminal sentencing.
Focusing on really narrow data sets and ignoring the backdrop of racial and gender inequality makes as much sense as summarizing 2020 by saying “Traffic jams were at an all time low.” Whether it’s true or not, you’re very much missing the bigger picture.
Which raises the question about education: for anyone in the tech world–designers, developers, data scientists–should AI skills and social understanding be considered inseparable? Like Laverne and Shirley? Bacon and eggs? Or being from Minnesota and saying “you betcha”?
Kyle Hundman: “I think it should be. And I think that it’s now more culturally relevant than it’s ever been before, and it’s getting a lot of attention rightfully so.”
Max Young: “When you get a bunch of engineers together and you say, ‘Come up with the system to figure out credit scores,’ maybe it’d be good to have a historian in there to say, ‘We’ve actually come across this problem before, let’s try to fix it rather than just maintain the situation.’”
Responsibility and Action
Kyle says one of the most powerful examples of multi-disciplinary teams could be in how companies are addressing diversity and inclusion.
Kyle: “We’ve seen recently diversity and inclusion departments pop up in corporations. I think those will become technical, and I think you’ll have bias audits where you have technical people, that this is their focus, and they want to make sure that corporations are being responsible.”
We also asked Kyle about the responsibility of folks like him to uncover and uproot issues of algorithmic bias. He said that in many ways, it’s about better data science, and more accurate models, period.
Kyle Hundman: “I think it’s a healthy way to look at it as due diligence, and it should be core to any modeling exercise. I think there are a lot of situations where that’s actually beneficial to model development and that bias might actually hurt performance, where if you oversample and you have one class that’s over-represented, that’s a fundamental flaw in your data and you need to fix that. You want to fix that issue no matter what your task is or what your data looks like. I think, in a lot of situations, there’s empirical evidence of this in that fixing some of these bias issues actually improves your model and actually improves your accuracy.”
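One common way to address an over-represented class is to rebalance the training data before fitting a model. The sketch below shows random oversampling of the smaller classes; it is one technique among several and not necessarily what Kyle’s team uses. The function name `oversample_minority`, the `label` key, and the sample rows are assumptions made up for illustration.

```python
import random
from collections import Counter


def oversample_minority(rows, label_key, seed=0):
    """Duplicate randomly chosen rows from smaller classes until all classes match the largest."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)

    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap (k is 0 for the largest class).
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Made-up data: 90 rows of one class, 10 of another.
rows = [{"id": i, "label": "majority"} for i in range(90)]
rows += [{"id": 90 + i, "label": "minority"} for i in range(10)]

balanced = oversample_minority(rows, "label")
print(Counter(row["label"] for row in balanced))
```

Duplicating rows adds no new information about the under-represented group, so this is a stopgap; collecting more representative data is the more durable fix.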
So with a system like COMPAS, how do we “fix it”?
We can’t really say, because COMPAS is a proprietary algorithm owned by its creators. However, this brings us to another key issue here: transparency.
(14:57) The “Black Box” Problem
“The black box.” And no, not the thing on a plane that holds all its juicy plane secrets. Fun fact: did you know that “black boxes” on planes are actually not black at all? They’re bright orange so they can be found more easily in the event of a crash.
In the case of AI, the black box isn’t something physical at all. It’s “black” in its lack of visibility. We asked Kyle to help explain what the black box issue with deep learning is really all about.
Kyle Hundman: “Because the combinations are endless, you can’t really pinpoint how a single input moves through that network and interacts with all of these other features and lights up neurons partially or fully. There’s just so much depth and so much interaction throughout this whole thing. You can’t peel that apart.”
When we can’t peel it apart, how do we know how an algorithm is coming to an answer? And how do we know it’s being unbiased in arriving at that answer? In response to calls for more transparency, big tech firms have released a variety of different “tool kits” to help give a window into how AI systems work.
Earlier this year, Microsoft released its new “Fairlearn” tool kit for its machine learning platform on Azure, allowing anyone using the platform to test and hopefully prevent incidents of bias. LinkedIn released its Fairness Toolkit used to govern how AI recommends job listings, shared content, or potential job candidates. This type of transparency is at least a step in the right direction, right?
(16:31) Who can hold companies accountable?
That’s what we asked Nicolas Kayser-Bril from AlgorithmWatch, a non-profit organization based in Berlin, Germany, that’s focused on research and advocacy about algorithms and their impact on society. Nicolas pointed out that transparency is important, but really only part of the equation:
Nicolas Kayser-Bril: “It’s of course, very important to look under the hood, but I wouldn’t say that transparency is the most important issue. The most important issue is enforcement. The problem is that we know there is a problem; we know which companies are the problem. I mean, when I as a journalist called the enforcement organizations, they’re like, ‘Oh, thank you very much we might look into it in five years.’ Because they have no funding, no expertise, and no political support to simply enforce the law. And no business in their right mind will ever be transparent to the point that they admit to breaking the law. This will never happen.”
So should algorithms be better regulated? Should the public and the government treat data and artificial intelligence like any other potentially dangerous commodity? Nicolas says the way we look at food service can be a helpful comparison. “When you go to the restaurant, you don’t ask to go to the kitchen in the name of transparency to look for yourself which bacteria are living there. You trust that the government sends hygiene inspectors to do it on your behalf.”
(18:01) Rising to the Occasion
Another good example of a group that can do great harm or great good is doctors. What if we looked at medicine as an example of how to regulate AI and ensure that it meets ethical standards? Doctors are regulated privately by medical boards and publicly by state licensing agencies. In the case of AI, we need an industry group to set the standards for what tests AI should be subject to in order to validate its fairness. Those tools, like the Fairness Toolkit, would be open source. State or federal law can mandate that the AI has to pass those tests. Ideally, the AI algorithm itself would be open source, but, until we can get companies to give up their intellectual property, passing a consistent set of black box tests would be better than nothing. Even now, you can work with the Algorithmic Justice League and request an algorithmic audit much in the same way we currently work with security firms to do a security audit.
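A black box test needs only the model’s decisions, not its internals. The sketch below shows what one such check might look like: it calls an opaque `predict` function, compares approval rates across groups, and fails the model if the lowest rate falls too far below the highest. Everything in it is an assumption for illustration, including the function `black_box_audit`, the 0.8 threshold, the stand-in model that approves by zip code, and the made-up applicants. It is not the API of any toolkit mentioned above.

```python
def black_box_audit(predict, applicants, group_key, min_ratio=0.8):
    """Compare approval rates across groups using only the model's outputs."""
    totals, approved = {}, {}
    for applicant in applicants:
        group = applicant[group_key]
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (1 if predict(applicant) else 0)

    rates = {group: approved[group] / totals[group] for group in totals}
    highest = max(rates.values())
    # If nobody is approved at all, there is no disparity to measure.
    ratio = min(rates.values()) / highest if highest > 0 else 1.0
    return {"rates": rates, "ratio": ratio, "passed": ratio >= min_ratio}


def opaque_model(applicant):
    """Stand-in for a proprietary model: it never sees `group`, only location."""
    return applicant["zip"] == "11111"


# Made-up applicants: the two groups are unevenly spread across two zip codes.
applicants = (
    [{"group": "X", "zip": "11111"}] * 8
    + [{"group": "X", "zip": "22222"}] * 2
    + [{"group": "Y", "zip": "11111"}] * 3
    + [{"group": "Y", "zip": "22222"}] * 7
)

result = black_box_audit(opaque_model, applicants, "group")
print(result)
```

Because the test treats the model as a function, a regulator or auditor could run it without the company disclosing its source code, which is the trade-off the paragraph above describes.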
The debate about regulating AI and algorithms will undoubtedly continue. The ethical questions are complicated, and, at least in the short run, it looks as though the responsibility will be up to the builders–the makers, and practitioners creating these systems–to be really deliberate in how we understand the impact of algorithmic bias, better hold ourselves accountable, and ultimately prove that AI can actually improve the human mind, rather than just imitate it.
Because remember what Kyle told us:
“If your algorithm is a mirror of humanity, you failed and your algorithm is biased.”
Speaking of groundbreaking feats of human achievement – it’s about that time. That’s right folks – it’s Cooler Terms with Pooler and Hermes.
(19:34) Cooler Terms with Pooler and Hermes
Scott: Joining me as always is Katie Pooler and Katie, I just realized that I introduce myself every episode but you never have.
Katie: Truthfully, I needed a few episodes before I felt comfortable enough to formally attach my name and identity to the podcast.
Scott: OK, but hardly anyone listens to this podcast, so I think you will be OK to introduce yourself.
Katie: I am Katie Pooler, and, in addition to being our CFO – Chief Fun Officer – I also work for our Connective Digital Services here at Kin + Carta. In fact, Connective Digital Services is the official Cooler Term for IT. We do a lot of things, but, essentially, I’m a solutions or systems engineer for the operations side of Kin + Carta. People come to me with problems, and it’s my job to find solutions. In fact, I think this is why you asked me for help with the podcast.
Scott: Thanks for not starting with ‘For those of you who don’t know me’ when introducing yourself. What is that about? Isn’t that what all introductions are for? For people who don’t know you?
Katie: For those of you who don’t know me, I am our president and CFO, I climbed Kilimanjaro, and I have immaculate credit and perfect work attendance. For those of you who do know me, don’t tell them I’m full of shit.
Scott: For those of you who don’t know me, how dare you? How dare you. I’m clearly someone you should already know.
Katie: You know who does know me Scott? The algorithm. It knows what I want, what I need, I assume it knows everything about me. So where is my algorithm-inspired soul mate? It’s 2020, we’re stuck inside, and we hate being on Zoom all day. I live alone. I have been so close to purchasing cardboard cutouts of celebrities just to add some variety to my social life. The algorithm knows what I want before I want it, why not use those powers for good?
Scott: What would you like to be able to do with it?
Katie: I wish we were able to use our unconscious biases more effectively, and not to discriminate against race, gender, abilities. What am I talking about? I’m talking about using algorithms to determine whether a person is likely to microwave fish in the break room, take off their shoes on an airplane, or do they watch Big Bang Theory?
Scott: I would pay cash money for that service.
Katie: Seriously though, that show is awful.
Scott: I don’t need an unbiased algorithm to tell me that.
Scott: Thanks for tuning in. Let us know what you think of the podcast and if you have any ideas for future episodes. Reach out to us on Twitter, Facebook, LinkedIn, and Instagram or just dream it to us on the astral plane. We are everywhere.