#WiDS2022 Our tools and techniques influence the path we take and the decisions we make. If we have tools that drive complex machines, then we develop solutions for complex machines. What about all sorts of solutions that use simple machines? We leave out swaths of people & econ
#WiDS2022 How do we build equity into our data sets and not repeat mistakes of the past?
Models have choice built into them, but they can be opaque. We don't pay attention to constraints. Do we assume everyone has the same level of choice as we do? (Diverse teams help.)
#WiDS2022 Tierra Bills, UCLA: Confronting Data Bias in Travel Demand Modeling
Bias in surveys comes through:
- non-coverage, non-response, accessibility (digital divide).
We don't know how severe the under-representation is, or what the impact is.
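One rough way to put a number on under-representation is to compare each group's share of the survey sample with its share of the population. This is only a sketch and not something shown in the talk: the function name, the income groups, and all the numbers below are made up for illustration.

```python
from collections import Counter

def representation_ratios(sample_groups, population_shares):
    """Return (sample share / population share) for each group.

    A ratio below 1.0 means the group is under-represented in the sample;
    a ratio above 1.0 means it is over-represented.
    """
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: (counts.get(group, 0) / total) / share
        for group, share in population_shares.items()
    }

# Hypothetical survey: 10 respondents labelled by income group.
respondents = ["high"] * 6 + ["middle"] * 3 + ["low"] * 1
# Hypothetical population shares (for example from a census).
population = {"high": 0.2, "middle": 0.5, "low": 0.3}

ratios = representation_ratios(respondents, population)
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
```

This only works when trustworthy population shares exist, and it says nothing about the impact on the model, which is the harder question raised above.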
#WiDS2022 Tierra Bills, UCLA: Confronting Data Bias in Travel Demand Modeling
To confront bias, include:
data collection, data cleaning, model estimation/prediction, analysis, and metrics. All this goes into transportation decisions.
Her focus: data collection = surveys
#WiDS2022 Tierra Bills, UCLA: Confronting Data Bias in Travel Demand Modeling
Today, this looks like: Uber/Lyft wait times, funding for rail over bus, pollution within vulnerable communities.
How to confront this using data science?
#WiDS2022 Tierra Bills, UCLA: Confronting Data Bias in Travel Demand Modeling
equity: power, economic stability,
Transportation injustice and inequities: one example is the construction of the interstates. Decisions made w/o investigation = devastation.
#WiDS2022 Alex Hanna DAIR Institute The genealogy of data:
3. how have benchmark datasets become hegemonic or paradigmatic?
4. What are the current work practices, norms and routines that structure data collection, curation and annotation?
#WiDS2022 Alex Hanna DAIR Institute - ethical data collection. The genealogy of data
1. how do dataset developers describe & motivate the decisions that go into their creation?
2. what are the histories & contingent conditions of creation of benchmark datasets?
#WiDS2022 Alex Hanna DAIR Institute - ethical data collection. Data as infrastructure.
1. datasets determine what a model learns
2. datasets benchmark algorithms
3. datasets serve as model organisms
4. datasets provide methodological grounds for model development in industry contexts
#WiDS2022 Alex Hanna DAIR Institute - ethical data collection. Data as infrastructure. Yummy!!!
ImageNet - developed at Stanford for computer vision research. 14M images, 20k categories.
#WiDS2022 Panel Tanveer Syeda-Mahmood. knowledge representation, then vision - recognition, attention, image search.
Lots of exploratory projects.
Arrived at school and was told that AI was done, she was too late. LOL!
#WiDS2022 Panel Jinoos Yazdany. Electronic health records are used for billing now and don't include things that matter to patients, like outcomes!!
How to aggregate data? Study quality of care and discover gaps. Privacy and security come into play. Quality measures.
#WiDS2022 Panel Sylvia Plevritis - Data Driven computational models. Developed a simulation model to see if MRI could be used to screen for cancer in high risk patients. Used by Cancer Society to develop guidelines for clinical practice.
#WiDS2022 Panel -- Tina Hernandez-Boussard had a great mentor, made all the difference. Combined data science and healthcare. Developing cancer patient digital twin... NLP to capture the patient voice to include values and goals into care. Woooot!
#WiDS2022 Tanveer Syeda-Mahmood IBM Research "A Turing Test for Chest Radiology AI" Coarse-grained findings have been addressed. We now need to focus on fine-grained findings.
#WiDS2022 Tanveer Syeda-Mahmood IBM Research "A Turing Test for Chest Radiology AI" building models... then generate a report.
Then benchmark against live humans -- to compare: found that in some ways, humans are better, in some ways machines are better...
#WiDS2022 Tanveer Syeda-Mahmood IBM Research "A Turing Test for Chest Radiology AI" other issues - a variety of data sets from a variety of institutions. imbalance in data. unlabeled data.
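On the imbalance point: one common heuristic (not necessarily what the speaker's team used) is to weight each class inversely to its frequency, so rare findings count more during training. A minimal sketch, with a hypothetical function name and made-up label counts; it follows the same formula as scikit-learn's "balanced" class weights, n_samples / (n_classes * class_count).

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1.0, common classes below 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }

# Hypothetical label set: 90 normal images, 10 with a finding.
labels = ["normal"] * 90 + ["finding"] * 10
weights = inverse_frequency_weights(labels)
for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

Weights like these would typically be passed to a loss function or a sampler. They do nothing for the unlabeled-data or multi-institution issues, which need different techniques.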