Think day on data management
As I’ve been working on my think days, I’ve made it a habit to share the generalized prompts here for anyone who may want to go through this exercise. I design the prompts together with ChatGPT and then tailor them to my own situation.
Think Day Design (1.5 hours)
Data Management and Improvement in a Research Group
Purpose
This Think Day is designed to help research group leaders step back from day-to-day firefighting and reflect strategically on how data is handled across their group. The focus is not on compliance or tooling for its own sake, but on improving research quality, efficiency, continuity, and resilience.
The session treats data management as research infrastructure: something that quietly determines whether a group scales, survives turnover, and can build cumulatively on its own work.
Intended outcomes
By the end of the session, participants should have:
- A clear picture of how data currently flows through their group.
- Insight into the main risks and inefficiencies in current practices.
- A small set of realistic, enforceable standards.
- A concrete improvement roadmap with priorities and ownership.
- Greater clarity on their own leadership stance toward structure, autonomy, and responsibility.
Structure and topics (90 minutes)
1. Framing the problem (≈10 minutes)
Participants begin by clarifying why data management matters for their specific context, rather than in abstract policy terms.
Reflection topics include:
- How poor data practices already affect time, quality, or continuity.
- Whether current practices depend on individuals rather than systems.
- What kind of research group they want to be running in 3–5 years.
The goal is to articulate a problem statement that anchors the rest of the session.
2. Mapping current reality (≈20 minutes)
This segment focuses on making implicit practices explicit.
Participants map their group’s current data lifecycle, typically including:
- Data creation (experiments, simulations, surveys, fieldwork, etc.).
- Storage locations and access.
- Folder structures and naming conventions.
- Documentation and metadata practices.
- Sharing within the group and with collaborators.
- Archiving after publications or project completion.
- Security, backups, and ethical considerations.
Attention is paid to where things work smoothly and where they tend to fail, especially during transitions such as people leaving or projects ending.
3. Identifying pain points and risks (≈15 minutes)
Participants identify recurring problems and risks related to data management.
Typical prompts include:
- Where time is repeatedly lost.
- Where reproducibility breaks down.
- Where knowledge disappears.
- Where strategic reuse of data fails.
Each issue is considered in terms of impact, frequency, and ease of improvement, allowing participants to focus on the few problems that matter most.
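For those who like to make this ranking explicit, here is a minimal sketch in Python. It assumes each issue is scored from 1 to 5 on impact, frequency, and ease of improvement, and that the three scores are simply multiplied. Both the scale and the multiplication are my own assumptions for illustration, and the example issues are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    name: str
    impact: int      # 1 (minor) to 5 (severe)
    frequency: int   # 1 (rare) to 5 (constant)
    ease: int        # 1 (hard to improve) to 5 (easy to improve)

    @property
    def priority(self) -> int:
        # Assumed scoring rule: a plain product of the three scores.
        return self.impact * self.frequency * self.ease


def top_issues(issues, n=3):
    """Return the n issues with the highest priority score."""
    return sorted(issues, key=lambda issue: issue.priority, reverse=True)[:n]


# Hypothetical issues, for illustration only.
issues = [
    Issue("Raw data scattered across personal laptops", 5, 3, 4),
    Issue("Analysis scripts not versioned", 4, 4, 3),
    Issue("No handover notes when people leave", 5, 2, 4),
    Issue("Inconsistent file names", 2, 5, 5),
]

for issue in top_issues(issues):
    print(issue.priority, issue.name)
```

A spreadsheet does the same job; the point is only to force a comparison so that two or three problems stand out.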
4. Defining “good enough” standards (≈20 minutes)
This section is about designing minimum viable rigor, not perfection.
Participants reflect on what should be:
- Mandatory across the group.
- Recommended but flexible.
- Explicitly left to individual preference.
Topics often include folder structures, naming conventions, documentation requirements, handover rules, version control, and long-term storage.
The emphasis is on standards that are:
- Easy to explain.
- Easy to enforce.
- Effective in reducing confusion and rework.
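As an illustration of what "easy to enforce" can look like, here is a small Python sketch that checks file names against a naming convention. The convention itself (date, project, description, all lowercase, as in 2024-03-01_beamtest_run-01.csv) is a hypothetical example and not a recommendation; each group would substitute its own pattern.

```python
import re

# Hypothetical convention: YYYY-MM-DD_project_description.ext, all lowercase.
NAME_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}_[a-z0-9]+_[a-z0-9-]+\.[a-z0-9]+$")


def nonconforming(filenames):
    """Return the file names that do not follow the assumed convention."""
    return [name for name in filenames if not NAME_PATTERN.match(name)]


# Illustrative file names only.
example_files = [
    "2024-03-01_beamtest_run-01.csv",
    "final_v2_REAL.xlsx",
]

print(nonconforming(example_files))
```

If a rule can be checked in a handful of lines like this, it is probably simple enough to explain to a new group member in one sentence.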
5. Tools, roles, and responsibility (≈15 minutes)
Here the focus shifts from norms to execution.
Participants reflect on:
- Which tools are already in use and how consistently they are applied.
- Where behavioral standardization matters more than new tools.
- Who currently holds “institutional memory” in the group.
Roles and responsibilities are clarified across levels (PI, PhD students, postdocs, group-level practices), with attention to balancing trust and accountability.
6. Improvement roadmap (≈10 minutes)
Participants translate insights into action by sketching a simple roadmap:
- Short-term improvements (next few months).
- Medium-term structural changes (6–12 months).
- Long-term ambitions (1–3 years).
Each action is associated with an owner, a timeline, and a success criterion, keeping the plan realistic and implementable.
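The roadmap can live in a notebook or a shared document, but for those who prefer something structured, here is a minimal Python sketch. It assumes one record per action and flags any action that still lacks an owner, a timeline, or a success criterion. The field names and the example actions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Action:
    description: str
    horizon: str                 # "short", "medium", or "long"
    owner: str = ""
    timeline: str = ""
    success_criterion: str = ""

    def missing_fields(self):
        """List the required fields that are still empty."""
        required = ("owner", "timeline", "success_criterion")
        return [name for name in required if not getattr(self, name).strip()]


# Hypothetical roadmap entries, for illustration only.
roadmap = [
    Action(
        "Agree on a shared folder template",
        "short",
        owner="PI",
        timeline="next two months",
        success_criterion="All new projects use the template",
    ),
    Action(
        "Move archived datasets to institutional storage",
        "medium",
        owner="Postdoc",
    ),
]

# Actions that are not yet fully specified, with the fields they lack.
incomplete = {a.description: a.missing_fields() for a in roadmap if a.missing_fields()}
print(incomplete)
```

Anything that shows up as incomplete is a wish rather than a plan, which is a useful check before the session ends.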
7. Reflection and leadership alignment (≈10 minutes)
The session closes by connecting data practices to leadership identity.
Participants reflect on:
- How their current approach reflects their leadership style.
- Where they may need to be firmer or clearer.
- How current practices would scale if the group grew significantly.
Many conclude by drafting a short data management philosophy for their group, articulating the values that guide decisions going forward.
I like to share insightful messages about data management.