Home - ua-datalab/AI-for-Professionals GitHub Wiki
mindmap
  ((**AI/ML Toolkit**))
    id(**Code Development**)
      Visual Studio Code
      Jupyter Notebooks
      Marimo
      Quarto
    id(**Data Analysis Platforms**)
      KNIME
      OpenRefine
      Orange
    id(**Machine Learning <br/> Deep Learning**)
      Scikit-Learn
      PyTorch
      TensorFlow
    id(**Natural Language Processing <br/> NLP**)
      spaCy
      NLTK
    id(**Geospatial Analysis**)
      QGIS
      Felt
    id(**Databases**)
      DuckDB
    id(**Data Visualization**)
      Data-to-Viz
      Voyager
      Flourish
      Datawrapper
      Google Looker Studio
      PowerBI
      Plotly
      Shiny
      Tableau
    id(**Generating Ideas**)
      ChatGPT
      Claude
      Open Source LLMs
      UArizona AI Verde
      Gemini
      Google Notebook
      Perplexity AI
    id(**Collaborative Research <br/> #38; Information Gathering**)
      Elicit
      Research Rabbit
      SciSpace
      Scite
      Semantic Scholar
    id(**Project Documentation**)
      GitHub Pages
      Google Docs
      Notion
    id(**Brainstorming <br/> #38; Mind Mapping**)
      NotebookLM
      Miro
      MindMeister
- Understand fundamental data concepts without the jargon.
- Explore free, user-friendly AI and data tools.
- Apply these concepts to real-world public health challenges through hands-on activities.
- Navigate the ethical landscape of AI in healthcare.
Please see: Presentation Slides
"Start Your Journey Here": begin with the Learning Objectives, or jump to this page's Overview.
Learning Objectives
Upon completion of this session and engagement with this resource, you will be able to:
- Define fundamental data concepts (e.g., dataset, variable, data types, big data) in the context of public health.
- Identify key sources of public health data and recognize indicators of data quality.
- Explain the basic principles of data visualization and its importance in communicating public health information.
- Describe the core concepts of Artificial Intelligence (AI), Machine Learning (ML), and Large Language Models (LLMs) without technical jargon.
- Utilize prompt engineering techniques to effectively interact with LLMs (like ChatGPT/Gemini) for tasks relevant to public health.
- Recognize common open-source software and AI tools accessible to non-coders for basic data exploration and AI interaction.
- Discuss ethical considerations and potential biases associated with using data and AI in public health.
- Apply these concepts through practical, non-coding exercises simulating real-world public health scenarios.
Overview
- The "Why": Why data and AI/ML are no longer just for tech experts but essential for all healthcare professionals.
- Example: How rapid data analysis helped track COVID-19 spread and inform public response.
- What We'll Cover: A roadmap of the session, emphasizing the non-coding, practical approach.
- Focus on Empowerment: This session is designed to give you the confidence to engage with data and AI tools, improving your daily work and contributions to public health.
Introduction: You don't need a PhD in computer science or a big budget to start using powerful tools!
- Tools: ChatGPT (OpenAI) / Gemini (Google) / Claude (Anthropic)
  - Main Features: Natural language understanding and generation, summarization, brainstorming, drafting text, answering questions.
  - Practical Healthcare Applications:
    - Drafting patient education materials (e.g., "Explain type 2 diabetes in simple terms for a brochure").
    - Summarizing research papers or public health reports for quick insights.
    - Brainstorming public health campaign slogans or outreach strategies.
    - Generating FAQs for common health concerns.
  - Prompt Engineering Focus:
    - Concept: How to ask the right questions to get the best results. A strong prompt states four things: Role, Task, Format, and Constraints.
    - Example:
      - Weak Prompt: "Tell me about vaccination."
      - Strong Prompt: "Act as a public health advisor. Create a list of 5 key benefits of childhood vaccination for parents of newborns, written in clear, empathetic language, suitable for a pamphlet. Each benefit should be one sentence."
- Tools: PubMed Central / Perplexity AI / Semantic Scholar
  - Main Features: Search and discovery tools that surface sources you can cite and check. Perplexity AI adds a conversational interface that answers questions with cited sources. Excellent for research and fact-finding.
  - Practical Healthcare Applications:
    - Quickly finding evidence-based information on specific health conditions or interventions.
    - Identifying recent research papers on a public health topic.
    - Checking the source of health claims.
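For readers curious how the Role, Task, Format, Constraints pattern from the prompt engineering notes above looks in code, here is a minimal, optional sketch in Python. The function name `build_prompt` and its parameter names are made up for illustration; they are not part of any chatbot's API. The resulting text can simply be pasted into ChatGPT, Gemini, or Claude.

```python
def build_prompt(role: str, task: str, output_format: str, constraints: str) -> str:
    """Assemble one prompt string from the four parts: Role, Task, Format, Constraints."""
    parts = [
        f"Act as {role}.",                 # Role: who the assistant should be
        f"{task}.",                        # Task: what it should produce
        f"Format: {output_format}.",       # Format: how the answer should look
        f"Constraints: {constraints}.",    # Constraints: limits on the answer
    ]
    return " ".join(parts)


# The four parts below are taken from the "Strong Prompt" example above.
prompt = build_prompt(
    role="a public health advisor",
    task="Create a list of 5 key benefits of childhood vaccination for parents of newborns",
    output_format="a pamphlet written in clear, empathetic language",
    constraints="each benefit should be one sentence",
)
print(prompt)
```

Filling in all four parts every time is a simple habit that keeps a prompt from sliding back toward the weak version.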
- Tools: Google Forms / Microsoft Forms
  - Main Features: Easy-to-create surveys, quizzes, and feedback forms. Collects responses in a spreadsheet.
  - Practical Healthcare Applications: Community health needs assessments, patient satisfaction surveys, collecting sign-ups for health workshops, post-event feedback.
- Tool: KoboToolbox
  - Main Features: Free, open-source suite for field data collection, often used in humanitarian and development contexts. Works offline.
  - Practical Healthcare Applications: Epidemiological surveys in remote areas, health facility assessments, monitoring public health interventions.
- Tools: OpenRefine / Google Sheets / Microsoft Excel (Online/Free versions)
  - Main Features: Organizing data in tables, basic calculations (sum, average), creating simple charts (bar, pie, line).
  - Practical Healthcare Applications: Tracking patient appointments, managing small project budgets, creating simple dashboards for clinic metrics, and visualizing immunization rates over time.
- Tools: DataVoyager / Flourish / Datawrapper
  - Main Features: Web-based tools for creating interactive and embeddable charts, maps, and tables with no coding. Generous free tiers.
  - Practical Healthcare Applications: Creating compelling visuals for public health reports, presentations, or websites (e.g., mapping disease prevalence, showing trends in health behaviors).
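The basic spreadsheet calculations mentioned above (sum, average) and a simple bar chart can also be sketched in a few lines of Python, for anyone who wants to peek behind the spreadsheet. This is optional and purely illustrative: the monthly dose counts are made-up numbers, and the variable names are assumptions.

```python
from statistics import mean

# Made-up example data: vaccine doses given per month at a hypothetical clinic.
doses_by_month = {"Jan": 120, "Feb": 135, "Mar": 150, "Apr": 141}

total_doses = sum(doses_by_month.values())     # same idea as =SUM(...) in a spreadsheet
average_doses = mean(doses_by_month.values())  # same idea as =AVERAGE(...)

print(f"Total doses: {total_doses}")
print(f"Average per month: {average_doses:.1f}")

# A very simple text bar chart: one '#' for every 10 doses.
for month, doses in doses_by_month.items():
    print(f"{month} {'#' * (doses // 10)} {doses}")
```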
Learning Modules
- Content: Defining data, datasets, variables, data types (quantitative vs. qualitative). Importance of context. What is "Big Data" in simple terms?
- Experiential Learning Use Cases (Non-Coding):
  - Scenario Analysis: Given a brief public health scenario (e.g., a local flu outbreak), identify 5 types of data that would be useful to collect (e.g., number of cases, age of patients, vaccination status, onset date, symptoms). Classify each as quantitative or qualitative.
  - "Spot the Data": Look at a simplified public health infographic (provided). Identify 3 key data points and what they represent.
  - Data Scavenger Hunt: Find a public health statistic from a reputable source (e.g., WHO, CDC website). Describe what it measures and its unit of measurement.
  - Dataset Exploration (Conceptual): Review a very small, clean sample dataset (e.g., 10 rows, 5 columns in a table about patient demographics and a health outcome). Identify the variables and their likely data types.
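The quantitative vs. qualitative distinction from this module can be illustrated with a small, optional Python sketch. The rows below are made-up values for a fictional flu outbreak, and the rule used (a variable is quantitative if every value is a number) is a deliberate simplification chosen for illustration, not a complete definition.

```python
# Made-up sample rows for a fictional local flu outbreak.
records = [
    {"age": 34, "vaccinated": "yes", "main_symptom": "fever", "symptom_count": 3},
    {"age": 7, "vaccinated": "no", "main_symptom": "cough", "symptom_count": 5},
    {"age": 61, "vaccinated": "yes", "main_symptom": "fever", "symptom_count": 2},
]


def classify_variable(values) -> str:
    """Return 'quantitative' if every value is a number, otherwise 'qualitative'."""
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if is_numeric else "qualitative"


# Collect each column's values across all rows and classify that variable.
variable_types = {
    name: classify_variable([row[name] for row in records])
    for name in records[0]
}

for name, kind in variable_types.items():
    print(f"{name}: {kind}")
```

Real datasets need more care than this rule provides. For example, a ZIP code is stored as a number but works as a category.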
- Content: Common public health data sources (surveys, surveillance systems, electronic health records, census data). Characteristics of good quality data (accuracy, completeness, timeliness, reliability, relevance).
- Experiential Learning Use Cases (Non-Coding):
  - Source Evaluation: Given two fictional data sources for child malnutrition rates (one from a well-known NGO, one from an anonymous blog), discuss which is likely more reliable and why.
  - "Is This Data Healthy?": Review a small, sample dataset with obvious errors or missing values (e.g., age = 150, city missing for half the entries). Identify 3 quality issues.
  - Survey Critique: Review a short sample survey (provided) with leading questions or biased options. Identify 2-3 problematic questions and suggest improvements.
  - Brainstorming Data Gaps: For a specific public health goal (e.g., reducing smoking in teens), brainstorm what data is needed and where it might be found or how it could be collected ethically.
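The "Is This Data Healthy?" exercise above can be mirrored in a short, optional Python sketch that flags the same kinds of problems (an age of 150, missing cities). The records are made up, and the cutoff `MAX_PLAUSIBLE_AGE = 120` is an assumption chosen for illustration.

```python
# Made-up records with deliberate problems, echoing the exercise above.
patients = [
    {"id": 1, "age": 42, "city": "Tucson"},
    {"id": 2, "age": 150, "city": ""},
    {"id": 3, "age": None, "city": "Phoenix"},
    {"id": 4, "age": 29, "city": ""},
]

MAX_PLAUSIBLE_AGE = 120  # assumption: ages above this are treated as errors


def find_quality_issues(rows):
    """Return a list of (record id, problem description) pairs."""
    issues = []
    for row in rows:
        age = row["age"]
        if age is None:
            issues.append((row["id"], "age is missing"))        # completeness
        elif not 0 <= age <= MAX_PLAUSIBLE_AGE:
            issues.append((row["id"], f"age {age} is implausible"))  # accuracy
        if not row["city"]:
            issues.append((row["id"], "city is missing"))       # completeness
    return issues


issues = find_quality_issues(patients)
for patient_id, problem in issues:
    print(f"Record {patient_id}: {problem}")
```

The checks map onto two of the quality characteristics listed in this module: accuracy (implausible values) and completeness (missing values).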
- Content: Why visualize data? Common chart types (bar, line, pie, map) and when to use them. Principles of clear and honest visualization (avoiding misleading charts).
- Experiential Learning Use Cases (Non-Coding):
  - Chart Match-Up: Given 3 simple datasets (e.g., disease cases over time, comparison of risk factors by percentage, geographical distribution of clinics) and 3 chart types (line, pie, map), match the best chart to each dataset.
  - "Bad Chart" Detective: Analyze a misleading chart (provided) and explain why it's problematic (e.g., truncated y-axis, confusing colors).
  - Sketch-a-Visual: Given a public health message (e.g., "Cases of X disease have increased by 50% in the last year among young adults"), sketch a simple chart idea to communicate this effectively.
  - Interactive Exploration: Use a link to a pre-made interactive chart on Flourish or Datawrapper (e.g., showing global health indicators). Explore the chart and write down two insights you gained.
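To see why a truncated y-axis (mentioned in the "Bad Chart" Detective exercise) is misleading, a little arithmetic helps. The optional sketch below uses made-up case counts and a hypothetical helper, `apparent_ratio`, to compare how much taller one bar looks than another when the axis starts at 0 versus at 95.

```python
def apparent_ratio(smaller: float, larger: float, axis_start: float) -> float:
    """Ratio of the two drawn bar heights when the y-axis starts at axis_start."""
    return (larger - axis_start) / (smaller - axis_start)


# Made-up numbers: cases rose from 100 to 110, a real increase of 10%.
cases_last_year, cases_this_year = 100, 110

honest = apparent_ratio(cases_last_year, cases_this_year, axis_start=0)
truncated = apparent_ratio(cases_last_year, cases_this_year, axis_start=95)

print(f"Axis starting at 0: the taller bar looks {honest:.1f}x the shorter one")
print(f"Axis starting at 95: the taller bar looks {truncated:.1f}x the shorter one")
```

With the axis starting at 95, the bars are drawn 5 and 15 units tall, so a 10% increase looks like a tripling.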
- Content: Simple definitions of AI, Machine Learning (learning from data), and Large Language Models (understanding and generating text). Focus on what LLMs can do for public health professionals.
- Experiential Learning Use Cases (Non-Coding):
  - LLM Task Brainstorm: List 3 routine tasks in your public health role (e.g., drafting emails, summarizing meeting notes, finding information) where an LLM like ChatGPT could assist.
  - Prompt Practice - Summarization: Take a short paragraph from a public health news article (provided). Use ChatGPT/Gemini with the prompt: "Summarize this text in one sentence for a busy public health official." Compare the output to the original.
  - Prompt Practice - Idea Generation: Use ChatGPT/Gemini with the prompt: "Act as a health communication specialist. Brainstorm 5 catchy slogans for a campaign encouraging handwashing in primary schools."
  - AI Output Critique: Given a short AI-generated text about a health topic, identify one strength and one potential area for improvement or fact-checking (e.g., Is it too generic? Does it cite sources? Is the tone appropriate?).
- Content: Importance of data privacy (anonymization, confidentiality). Potential for bias in data and AI algorithms. Fairness, accountability, and transparency in AI.
- Experiential Learning Use Cases (Non-Coding):
  - Case Study Discussion: Read a short, fictional case study about an AI tool used for disease prediction that shows biased results against a certain demographic. Discuss 2-3 ethical concerns.
  - Privacy Brainstorm: A local clinic wants to share patient data for a research study on diabetes trends. What are 3 key privacy considerations or steps they must take before sharing? (e.g., de-identification, informed consent).
  - Bias Identification: "An AI model is trained primarily on health data from one ethnic group to predict heart disease risk." Discuss potential bias if this model is then applied to a diverse population.
  - "Ethical AI" Checklist Creation: Brainstorm a 3-5 point checklist of questions a public health professional should ask before adopting a new AI tool in their work (e.g., Where did the data come from? How is privacy protected? Has it been tested for fairness?).
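As an optional illustration of the de-identification step raised in the Privacy Brainstorm exercise, the sketch below removes direct identifiers and replaces exact ages with 10-year bands. The records are made up, and the choice of which fields count as identifiers (`DIRECT_IDENTIFIERS`) is an assumption for this example. Treat it as a picture of the idea only: removing names and phone numbers does not by itself make a dataset anonymous.

```python
# Made-up patient records for a fictional diabetes study.
patients = [
    {"name": "A. Example", "phone": "555-0100", "age": 47, "hba1c": 7.2},
    {"name": "B. Sample", "phone": "555-0101", "age": 63, "hba1c": 8.1},
]

# Assumption for this sketch: these fields directly identify a person.
DIRECT_IDENTIFIERS = {"name", "phone"}


def age_band(age: int) -> str:
    """Generalize an exact age into a 10-year band, e.g. 47 becomes '40-49'."""
    lower = (age // 10) * 10
    return f"{lower}-{lower + 9}"


def deidentify(row: dict) -> dict:
    """Drop direct identifiers and replace the exact age with an age band."""
    shared = {key: value for key, value in row.items() if key not in DIRECT_IDENTIFIERS}
    shared["age"] = age_band(shared["age"])
    return shared


shared_rows = [deidentify(patient) for patient in patients]
for row in shared_rows:
    print(row)
```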
- Connected Papers. Connected Papers is a web-based AI tool that helps researchers explore and discover academic papers in their field of interest.
- Elicit. Elicit AI is a free tool developed by Ought that helps researchers with various aspects of the research process, particularly literature reviews.
- Perplexity. Perplexity AI is an artificial intelligence-powered search engine that aims to provide users with comprehensive and accurate answers to their questions.
- PubMed Central. PubMed Central (PMC) is a free digital archive of full-text biomedical and life sciences journal articles. It's a repository maintained by the U.S. National Library of Medicine (NLM), and many articles are made available for free access due to NIH funding policies.
- Research Rabbit. ResearchRabbit is a free, AI-powered online platform that helps researchers map and explore the literature in their field.
- SciSpace. SciSpace is an AI-powered research platform designed to help academics efficiently navigate scholarly literature.
- Scite. Scite is an AI-powered research platform that helps users understand and evaluate research articles by providing context and classification for citations.
- Semantic Scholar. Semantic Scholar is a free, AI-powered search engine and research tool that helps scientists and researchers discover and understand scientific literature.
- OpenAI ChatGPT. ChatGPT is a large language model chatbot created by OpenAI that can engage in human-like conversations and generate text based on various prompts. OpenAI's GPT models also power Microsoft Copilot.
- Gemini. Google Gemini is a large language model (LLM) and multimodal AI assistant that can be accessed through a chatbot interface.
- Google NotebookLM. NotebookLM (Google NotebookLM) is a research and note-taking online tool developed by Google Labs that uses artificial intelligence (AI), specifically Google Gemini, to assist users in interacting with their documents.
- Google AI Studio. Google AI Studio is a free, browser-based Integrated Development Environment (IDE) that allows users to experiment with and prototype applications using Google's Gemini family of generative AI models.
- Claude. Claude AI is a large language model (LLM) and AI chatbot developed by Anthropic that excels at natural language processing (NLP).
- U of Arizona AI Verde. Local LLMs.
- Chatbox. Chatbox software is a user interface, typically a pop-up window or widget on a website or application, that facilitates communication between a user and either a live agent (human) or a chatbot (AI-powered). Requires an API key.
General References:
- Visual Studio Code (VS Code). Visual Studio Code is a free, cross-platform code editor developed by Microsoft.
- Jupyter Notebooks. A Jupyter Notebook is a web-based interactive computing environment that allows users to create and share documents containing live code, equations, visualizations, and narrative text.
- Marimo. marimo is an open-source reactive notebook for Python: reproducible, git-friendly, SQL built-in, executable as a script, and shareable as an app.
- Quarto. Quarto provides a unified authoring framework for data science, combining your code, its results, and your prose. Quarto documents are fully reproducible.
- KNIME. KNIME (Konstanz Information Miner) is a free and open-source data analytics platform that allows users to build data science workflows without extensive coding skills.
- OpenRefine. OpenRefine is a free, open-source software tool that cleans, transforms, and enriches data, especially when dealing with messy or incomplete datasets.
- Orange Data Mining. Orange is a visual programming toolkit that facilitates data visualization, machine learning, and data analysis.
- DuckDB. DuckDB is a high-performance, embedded, in-process, OLAP (Online Analytical Processing) relational database management system (RDBMS) that is designed for data analysis.
- A synthetic healthcare dataset
- Awesome Healthcare Datasets
- Centers for Disease Control and Prevention (CDC) - Datasets
- Global.health
- The Humanitarian Data Exchange (HDX)
- NIH-Supported Data Sharing Resources
- World Health Organization: Global Health Observatory
- Synthea: Synthetic Patient Generator | (GitHub)
- Data-to-Viz.com. From Data to Viz leads you to the most appropriate graph for your data. It links to the code to build it and lists common caveats you should avoid.
- dataVoyager. Data Voyager is a data visualization tool that helps users explore and analyze data by combining manual and automated chart specification techniques.
- Datawrapper. Datawrapper is a user-friendly web-based tool for creating and sharing data visualizations like charts, maps, and tables.
- Exploratory. Exploratory's Simple UI experience makes it possible for anyone to use Data Science to explore data quickly, discover deeper insights, and communicate effectively (Downloadable. Installs R).
- Google Looker Studio. Looker Studio is a free, web-based data visualization and reporting tool from Google Cloud that allows users to create interactive dashboards and reports from various data sources.
- RAWGraphs. RAWGraphs is a free, open-source web-based tool designed for creating data visualizations, particularly for designers and those who want to create custom visualizations without extensive coding.
- ObservableHQ. ObservableHQ is a platform and ecosystem for building interactive web-based data visualizations and dashboards.
- Tableau. Tableau is a visual analytics platform and business intelligence (BI) software that helps users visualize, analyze, and share data.
- PowerBI. Power BI is a suite of business analytics services and software from Microsoft designed to help users visualize and analyze data to gain insights and make informed decisions.
- Shiny. A Shiny app is an interactive web application built using the Shiny framework, which is available as a package for R and for Python.
- plotly. Plotly provides online graphing, analytics, and statistics tools for individuals and collaboration, as well as scientific graphing libraries for Python, R, MATLAB, Julia, and others.
- Gradio. Gradio is a Python library that simplifies building interactive web applications, particularly for machine learning demos and applications.
- Streamlit. Streamlit is an open-source Python library that makes it easy to build and share interactive, data-rich web apps.
- QGIS. QGIS (formerly Quantum GIS) is a free and open-source Geographic Information System (GIS) software that allows users to create, analyze, and manage spatial data.
- Felt. Felt is a software platform that enables users to easily create, visualize, analyze, and share maps online.
- Scikit-Learn. Scikit-learn is a free and open-source machine learning library for the Python programming language.
- PyTorch. PyTorch is an open-source machine learning framework based on the Torch library, originally developed by Meta AI. It is used for applications such as computer vision and natural language processing.
- TensorFlow. TensorFlow is a software library for machine learning and artificial intelligence. It can be used across a range of tasks, but is used mainly for training and inference of neural networks. It is one of the most popular deep learning frameworks, alongside others such as PyTorch.
- spaCy. spaCy is a free, open-source library for advanced Natural Language Processing (NLP) in Python.
- NLTK. NLTK (Natural Language Toolkit) is a leading Python library for working with human language data.
- A Glossary of Terms in Artificial Intelligence for Healthcare. S. Shamtej Singh Rana, Jacob S. Ghahremani, Joshua J. Woo, Ronald A. Navarro, Prem N. Ramkumar, Arthroscopy: The Journal of Arthroscopic & Related Surgery, Volume 41, Issue 2, 2025, Pages 516-531, ISSN 0749-8063.
- A Guide to Data Literacy: How to Interpret Data in Media. UC Berkeley, School of Information.
- Artificial Intelligence in healthcare. Directorate-General for Health and Food Safety, European Commission.
- [Harnessing Artificial Intelligence for Health](https://www.who.int/teams/digital-health-and-innovation/harnessing-artificial-intelligence-for-health). World Health Organization.
- Health Equity and Ethical Considerations in Using Artificial Intelligence in Public Health and Medicine. Dankwa-Mullan I., Prev Chronic Dis 2024;21:240245.
- Shaping the future of AI in healthcare through ethics and governance. Bouderhem, R., Humanit Soc Sci Commun 11, 416 (2024).
- The potential for artificial intelligence in healthcare. Thomas Davenport, Ravi Kalakota. Future Healthcare Journal, Volume 6, Issue 2, 2019, Pages 94-98, ISSN 2514-6645.
- 6 ways AI is transforming healthcare. (2025) World Economic Forum.
Created: 04/29/2025 (C. Lizárraga)
Updated: 06/10/2025 (C. Lizárraga)
UArizona DataLab Learning Resources
UArizona DataLab, Data Science Institute, University of Arizona.