Ethical Challenges in Biomedical Engineering - Data Collection, Analysis, and Interpretation
Oct 15, 2022
I recently contributed to the design of a new course for first-year biomedical engineering PhD students at Johns Hopkins on Ethical Challenges in Biomedical Engineering. I also taught the first module on Data Collection, Analysis, and Interpretation, focusing on a particular case study involving women in the biomedical sciences. Below is a reflection on my motivations behind the course, the materials for my module, and my hopes for the course’s impact.
Course background and motivations
Ethical Challenges in Biomedical Engineering is a semester-long course for first-year biomedical engineering PhD students. It is modeled after the discussions around ethical challenges in biomedical data analysis I would hold in my own lab meetings. I started holding these discussions after noticing that our current course offerings in science, technology, engineering, and math (STEM) education were preparing our students with the hard quantitative skills to address challenging biomedical engineering problems, but appeared to be lacking when it came to giving them an intellectual framework for evaluating whether a problem is worth tackling in the first place, and for weighing the ethical and moral considerations behind the choices we must make in tackling these problems. Therefore, my goal for these discussions is to provide students with the space to begin exploring ethical gray areas that impact biomedical engineering through guided discussion and case studies.
As faculty, we hold the power and responsibility to teach in the context of our labs as well as the classroom. So when the BME department was looking to offer a new class for first-year students, I pitched expanding these discussions of ethical gray areas beyond my lab and into the bigger classroom. With additional discussions and organization led by Rachel Karchin, we streamlined the course structure and established additional side goals for the classroom setting: for first-year students to get acquainted with faculty, make new friends with each other, and eat pizza together on a Monday evening.
To take advantage of the breadth of biomedical engineering research experiences available at Johns Hopkins, we turned to the biomedical engineering faculty and asked them to identify an ethical gray area in their subfield. Faculty proposed thought-provoking topics including “Technical, social, regulatory and ethical decisions involving life saving devices” (Nitish Thakor), “Bioethics and biosecurity challenges for protein engineering” (Jamie Spangler), “Rights vs ownership of genes and germline” (Josh Doloff), “Biomedical device obsolescence” (Patrick Kanold and Tilak Ratnanather), “Neural prosthetics” (Kathleen Cullen), and more.
Each faculty member then designed a module based on their topic of expertise. Each module consisted of a brief 15-minute lecture introducing the topic, a 45-minute discussion in small groups guided by either faculty or teaching assistants, a 20-minute reconvening to share thoughts from the small group discussions, and a final 10 minutes for concluding remarks.
Module 1: Data Collection, Analysis, and Interpretation
I had the pleasure of kicking off the first module on “Data Collection, Analysis, and Interpretation.”
Setting the tone
First, I believed it to be important to set the tone of the course and the subsequent discussions by emphasizing that:
- This is a space of learning, not judgement
- There are no right or wrong answers here. The ultimate purpose of this discussion is not to produce a consensus but to promote critical thinking
- Please respect each other’s viewpoints
- Keep in mind that we all have biases. The discussions that will take place in your group will reflect the biases of the participants, including the faculty and teaching assistants
A brief introduction
Next, in my brief lecture, I emphasized how large-scale data analysis is becoming an increasingly important part of the practice of science. With advances in genomics as well as machine learning and other data-driven fields, the amount of data that is and can be collected is growing rapidly. Furthermore, all this data must be analyzed and interpreted. Thus, the choices that scientists make regarding how data is collected, analyzed, and interpreted play crucial roles in how this data is translated into knowledge and even policy. However, these choices may be influenced by social factors, including personal viewpoints and lived experiences. Yet, many groups are under-represented in biomedical engineering and the broader scientific community. In particular, women remain under-represented among engineering and computational science degree holders. Likewise, only a small percentage of senior faculty and leadership in academic science departments and industry are women.
As such, for our case study, I focused on an effort to determine how the scientific community may most effectively work to improve the representation of women in science. In 2020, a group of scientists conducted a large-scale data-driven study (now retracted) on “the association between early career informal mentorship in academic collaborations and junior author performance.” The scientists analyzed a dataset of 215 million scientist names from 222 million historic papers from US-based affiliations. They chose to identify “mentor-protégé” pairs based on co-publication, to quantify “mentorship impact” based on the number of publications from the protégé independent of the mentor 7 years after graduation and the number of citations accumulated 5 years after publication, and to infer mentor and protégé genders based on historically annotated first names. Based on these quantifications, the scientists’ data analyses revealed that “increasing the proportion of female mentors is associated with a reduction in post-mentorship impact of female protégés.” The scientists interpreted these data analyses as demonstrating that “current diversity policies promoting female–female mentorships, as well-intended as they may be, could hinder the careers of women who remain in academia in unexpected ways. Female scientists, in fact, may benefit from opposite-gender mentorships in terms of their publication potential and impact throughout their post-mentorship careers.”
Students were not expected to have read the paper. The exact details of the paper are actually not so important. Rather, the goal is for students to think generally and broadly about data collection, analysis, and interpretation using this particular paper as a case study to guide discussion.
Discussion Questions Set 1:
- Data is often thought of as unbiased and objective. What do you think about this? What is the difference between data and knowledge?
- Do you think that data collection, analysis, and interpretation are affected by the viewpoints and lived experiences of those who conduct the science? What does this mean with respect to the notion of the practice of science as unbiased and objective?
- What factors do you think have influenced (or will influence) your perspective, style, and approach when it comes to data collection, analysis, interpretation, and more broadly speaking, the conduct of science itself?
Discussion Questions Set 2:
- What do you think about the data of 215 million scientists and 222 million historic papers from US-based affiliations? Do characteristics of the data (such as its size, how and when it was collected, etc.) influence your interpretation of the downstream analysis results?
- What do you think about the choice of quantifying mentorship as co-publishing? What assumptions and/or biases do you think may be introduced by this choice? How does this choice influence your interpretation of the downstream analysis results?
- What do you think about the choice to estimate gender based on first names? What assumptions and/or biases do you think may be introduced by this choice? How does this choice influence your interpretation of the downstream analysis results?
- What do you think about the choice of quantifying mentorship outcome based on publication number and citation counts? What assumptions and/or biases do you think may be introduced by this choice? How does this choice influence your interpretation of the downstream analysis results?
- What do you think about the scientists’ conclusion that to improve the representation of women in science, women students should choose to be mentored by men rather than women? What do you think would be the long-term ramifications if such a suggestion were implemented?
- Put yourself in the scientists’ shoes. If you were given this data, what different choices would you have made (if any) in analyzing it? Likewise, if given these data analysis results, how would you have interpreted them?
Discussion Questions Set 3:
- Do you think it is important to determine how the scientific community may most effectively work to improve the representation of women in science? Why or why not?
- If you were given a million dollars to determine how the scientific community may most effectively work to improve the representation of women in science, what would you do? Would you conduct a similar study or do something totally different?
After students reconvened to share their thoughts, I had a chance to give my concluding remarks. I emphasized that, whether it is spatial transcriptomics data or paper citation networks data, when we generate/collect data, we must understand where it comes from and understand its limitations. When we make choices regarding how this data is analyzed, we have the freedom to exercise our creativity but also the obligation to exercise discretion. When we interpret the results of our data analysis, we must exercise the most humility.
With regard to “the association between early career informal mentorship in academic collaborations and junior author performance” in particular, I highlighted the female mentors in my own career development. Some of these were formal research advisors, while others were informal career advisors, mentors outside academia, and even peers. I emphasized to students the importance of finding mentors beyond their formal research advisors and of creating the communal infrastructure to support identifying potential mentors moving forward.
Hopes for impact
In general, my hope is that through these discussions, these first-year PhD students (and all students pursuing academia) will come to appreciate that they are becoming producers of knowledge in academic spaces. The knowledge produced in academic spaces has been, and continues to be, used in a myriad of ways. So as producers of knowledge, in the words of Prof. Chandra Talpade Mohanty, we must ask ourselves: Why are we producing this knowledge? For whom are we producing this knowledge? What communities do they serve?
Try it out for yourself!
- What ethical gray areas impact your field or subfield of study?
- Discuss these questions in your own class or lab!