Humane, Humanist, and Humanitarian AI

Aligning technology with human values

Every design choice behind an AI system carries a question that is easy to skip: whose values does it serve? AI now sits inside hiring, healthcare, criminal justice, food aid, and migration policy, so the answer affects real people in real places, and getting it wrong has a cost.

I build machine learning systems for Earth observation, climate work, and humanitarian response. Over the years that work has taught me that three things have to be part of an AI system from the start, rather than bolted on later. I call them the three Hs. A humane AI respects people and stays within ethical limits. A humanist AI helps people do more instead of replacing them. A humanitarian AI points its tools toward the people who need them most. Here is what each one means in practice, and where we still fall short.

1. Keeping AI humane

A humane system starts from the idea that people are not data points. The choices that break this rule are common. A credit model tuned only to predict default can score very well and still leave out applicants from neighborhoods with thin credit histories. The goal is met and the person is missed. The fix is to treat ethics as part of the design. Fairness, transparency, accountability, and the right to question a decision should be built into the system, not parked in a document nobody reads. As Fei-Fei Li puts it, AI is made and deployed by people, so its moral character comes from human choices. The technology will not fix itself.

The clearest failure is bias repeated at scale. When training data carries old patterns from policing, lending, or healthcare, models tend to copy them and sometimes make them worse. The COMPAS recidivism tool is the usual example: a 2016 ProPublica investigation found that Black defendants who did not go on to reoffend were wrongly flagged as high risk at about twice the rate of white defendants who did not reoffend. Researchers including Alexandra Chouldechova and Jon Kleinberg later showed that there are several valid definitions of fairness, and that when base rates differ between groups an imperfect predictor cannot satisfy all of them at once. That does not let us off the hook. It means picking a fairness measure is a value choice that should be made in the open. Joy Buolamwini and Timnit Gebru found a similar gap in commercial facial analysis, with error rates up to 34.7% for darker-skinned women against 0.8% for lighter-skinned men. The gap stayed hidden until researchers from outside the usual demographic ran the tests.
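The tension between fairness definitions can be seen with a little arithmetic. The sketch below uses made-up confusion-matrix counts for two hypothetical groups (these are not COMPAS figures, and the class and variable names are illustrative). Both groups get the same positive predictive value and the same false negative rate, yet because their base rates differ, their false positive rates cannot match.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts for one group (illustrative numbers only)."""
    tp: int  # flagged high risk, did reoffend
    fp: int  # flagged high risk, did not reoffend
    fn: int  # flagged low risk, did reoffend
    tn: int  # flagged low risk, did not reoffend

    @property
    def base_rate(self) -> float:
        # Share of the group that actually had the outcome.
        return (self.tp + self.fn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def ppv(self) -> float:
        # Of those flagged, how many had the outcome.
        return self.tp / (self.tp + self.fp)

    @property
    def fnr(self) -> float:
        # Of those with the outcome, how many were missed.
        return self.fn / (self.fn + self.tp)

    @property
    def fpr(self) -> float:
        # Of those without the outcome, how many were wrongly flagged.
        return self.fp / (self.fp + self.tn)


# Hypothetical groups of 1,000 people each with different base rates.
groups = {
    "A": Confusion(tp=300, fp=100, fn=200, tn=400),
    "B": Confusion(tp=120, fp=40, fn=80, tn=760),
}

for name, c in groups.items():
    print(
        f"group {name}: base rate={c.base_rate:.2f} "
        f"PPV={c.ppv:.2f} FNR={c.fnr:.2f} FPR={c.fpr:.2f}"
    )
```

With these counts both groups have a PPV of 0.75 and an FNR of 0.40, but the false positive rate is 0.20 for group A and 0.05 for group B. Equalizing the false positive rates would break one of the other two measures, which is why the choice of measure has to be argued for rather than assumed.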

Emotional AI needs care of its own. A system can be trained to spot patterns linked to human feelings, but that is a long way from understanding them. This sets a limit. In trauma counseling, grief support, or end-of-life care, what matters is a person being there, a point Joseph Weizenbaum already made in 1976. Simulated empathy can also mislead, since a model can produce text that scores well on empathy while missing the person in front of it. AI can still help here by screening for risk, sharing information, and handling routine tasks so caregivers have more time. The test for any such tool is simple: does it make human connection more likely or less likely?

2. Toward a humanist AI

The idea of AI as a partner is older than it looks. In the 1960s, J.C.R. Licklider and Douglas Engelbart wrote about computers that extend human thinking instead of standing in for it. Licklider's 1960 paper imagined machines carrying the computation while people bring goals, intuition, and values. In climate science the best tools still follow that pattern. They help scientists work through decades of satellite imagery or run large sets of weather simulations, and they pay off most when the scientist can question the model and add field knowledge. A wildfire model that says "this region is high risk" helps less than one that says "soil moisture is 30% below the seasonal baseline and recent imagery shows dry fuel where fires have started before." A fire manager can then judge whether that matches conditions on the ground.

Most current systems are strong pattern matchers. They find statistical regularities and extend them, which is useful but has a real limit: they can confuse what goes together with what causes what. In a refugee camp, a system might find that sites with more latrines have lower rates of waterborne disease and suggest building more latrines. The real driver may be clean water and handwashing. Build more latrines over the same dirty water and the outcomes stay flat while money is spent. Causal reasoning, tied to the work of Judea Pearl among others, is making progress but is far from solved, so claims that AI "understands" cause and effect deserve some doubt. A related issue is hallucination, where a language model states something fluent and false because it has no way to check itself. In humanitarian logistics, medicine, or law that is dangerous, and the answer is to ground outputs in verified data and human review.
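The latrine example can be made concrete with a toy dataset. The sketch below is entirely synthetic: the camp records, field layout, and numbers are invented for illustration. Disease rates depend only on whether a camp has clean water, but camps with clean water also happen to have more latrines, so a pooled correlation makes latrines look protective. Splitting the data by water quality removes the association.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Synthetic camps: (has_clean_water, latrine_count, disease_cases_per_1000).
# Disease is driven by water quality; latrine counts merely track it.
camps = [
    (True, 40, 5), (True, 50, 6), (True, 60, 6), (True, 70, 5),
    (False, 10, 30), (False, 20, 31), (False, 30, 31), (False, 40, 30),
]


def latrine_disease_correlation(records):
    latrines = [r[1] for r in records]
    disease = [r[2] for r in records]
    return pearson(latrines, disease)


pooled = latrine_disease_correlation(camps)
clean_only = latrine_disease_correlation([c for c in camps if c[0]])
dirty_only = latrine_disease_correlation([c for c in camps if not c[0]])

print(f"pooled:      {pooled:+.2f}")      # strongly negative
print(f"clean water: {clean_only:+.2f}")  # no association within the group
print(f"dirty water: {dirty_only:+.2f}")  # no association within the group
```

Stratifying like this only works when someone already suspects the confounder and has measured it, which is one reason local knowledge matters: the people running the camp are often the ones who know what else differs between sites.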

There is also the question of who builds these tools. Most are made by large organizations for communities that had no say in the design, and when those communities are smallholder farmers in sub-Saharan Africa or refugees in camps in Bangladesh, the gap shows. Tools built without local input often fail on the ground, because the assumptions in the data, the proxies, and even the language can make a system useless where it is sent. Participatory design, where community members help set goals and flag harmful edge cases, has now been documented in places like Nepal and Cameroon. It takes more time and a different set of skills, but it produces systems people trust and that cause fewer surprises. In high-stakes humanitarian work it is closer to a requirement than a bonus.

3. Humanitarian AI in action

The UN Sustainable Development Goals are a handy way to frame what humanitarian AI could do: less hunger, better health, wider access to education, faster climate adaptation. Take food security. The UN's FAO estimates the number of undernourished people grew from about 572 million in 2014 to around 733 million in 2023, pushed by conflict, climate shocks, and economic trouble. Aid has long been reactive, arriving after a crisis is already visible. AI can help shift toward acting early, spotting risk signals in time to move resources and funding before conditions collapse. The Famine Early Warning Systems Network combines satellite imagery, rainfall estimates, market prices, and conflict tracking, and it has flagged worsening conditions months ahead of emergency classifications. In South Sudan those warnings came months before formal famine declarations. The gap there was political will and logistics, not detection. These systems are correlational and only as good as their data, and they work best when local analysts can check whether a signal makes sense.

Migration shows both sides of AI at once. The same tools that help refugees reach services and work through legal steps can also be used to watch, screen, and shut them out. On the helpful side, GeoMatch, built by the Immigration Policy Lab at Stanford and ETH Zurich, uses data on job markets, language, and community fit to suggest where to place refugee families. Modeling published in Science estimated that algorithmic placement could raise refugee employment by about 40% in the United States and 75% in Switzerland against traditional methods. Those figures come from historical data, not a live trial. Caseworkers still make the final call and add what the model cannot see. Multilingual chatbots also help migrants understand asylum steps and their rights. The harder side is real too: rights groups including Amnesty International have documented facial recognition and risk scoring at borders with little independent testing, where errors can be severe and hard to reverse. The EU AI Act, in force since 2024, classifies many AI systems used in migration, asylum, and border control as high risk and adds requirements for transparency and human oversight.
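Placement of this kind can be pictured as an assignment problem. The sketch below is not GeoMatch's implementation. It assumes a hypothetical table of predicted employment probabilities for three families across three locations with one slot each, and it searches every possible assignment by brute force for the one with the highest total. All names and numbers are invented.

```python
from itertools import permutations

# Hypothetical predicted probability that a family finds work in each location.
predicted = {
    "family_1": {"X": 0.30, "Y": 0.50, "Z": 0.20},
    "family_2": {"X": 0.40, "Y": 0.45, "Z": 0.10},
    "family_3": {"X": 0.25, "Y": 0.35, "Z": 0.30},
}
locations = ["X", "Y", "Z"]  # one open slot per location (an assumption)


def best_assignment(predicted, locations):
    """Try every one-to-one assignment and keep the highest total score."""
    families = list(predicted)
    best, best_total = None, float("-inf")
    for perm in permutations(locations, len(families)):
        total = sum(predicted[f][loc] for f, loc in zip(families, perm))
        if total > best_total:
            best, best_total = dict(zip(families, perm)), total
    return best, best_total


assignment, total = best_assignment(predicted, locations)
for family, location in assignment.items():
    print(f"{family} -> {location}")
print(f"total predicted employment: {total:.2f}")
```

Every family's best individual option here is location Y, so the suggestion has to trade off between them: family_1 goes to Y, family_2 to X, and family_3 to Z. Brute force only works for a handful of cases, and a real system would need a proper optimizer and capacity constraints. The output is a suggestion for a caseworker to review, not a decision.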

The barriers to the three Hs are mostly institutional, not technical. We already know how to build more interpretable models, run impact audits, design participatory processes, and bring causal reasoning in. What has been missing is steady pressure to do it. Rules like the EU AI Act set a floor, impact assessments make organizations weigh effects before deployment, and transparency about training data and subgroup performance lets outsiders hold developers to account. A wider public understanding helps too, since an informed public is harder to mislead and quicker to ask questions when systems fail. The work is hard and nothing is guaranteed, but the examples already exist. Fairness research has changed real products, early warning systems buy time that saves lives, and GeoMatch is placing families where they are more likely to find work. These exist because specific people decided the extra care was worth it. The open question is whether that becomes the norm.