IPCAI 2023 Reproducibility Checklist
IPCAI is dedicated to publishing high-quality, reproducible and responsible research following best practices. Motivated by the success and increasing use of reproducibility checklists at other major conferences, this year we request authors to complete a reproducibility checklist as part of the submission on CMT. The reproducibility questions are given below.
The main purpose of the checklist is to help verify that the materials, methods and analysis presented in an IPCAI paper are clearly defined, and that, ideally, the results and conclusions can be reproduced independently given full access to these resources. The resources should be described in the paper itself or via supplementary material for additional details (especially, wherever possible, by linking to data, and the software implementations of algorithms and analysis used in the paper).
A good IPCAI paper will strive for the highest level of reproducibility. However, experimental setups described in IPCAI papers often comprise complex hardware and software, and it must be kept in mind that full independent reproducibility cannot always be ensured, despite best efforts by the authors. Access to all materials needed to reproduce results may not be possible, and the use of proprietary systems that are subject to licence restrictions is common in IPCAI. For instance, reported software run-times depend on the specific hardware on which the software was run; that hardware is part of the experimental material needed for exact reproduction, and it generally cannot be shared. Additionally, an implementation can be so large that checking (or even running) the code is often neither feasible nor sufficient for full reproducibility. In such cases where full reproducibility is impossible, it is expected that the results are reproducible in principle. That is, when the experimental materials and implementation are not readily available, the work should be described in the paper in sufficient detail that field experts would likely be able to reproduce it given access to such resources. The checklist questions below are designed to promote this important aspect.
Reproducibility Frequently Asked Questions
Q1: How are the checklist responses used during the reviewing process?
The checklist is designed to remind authors of items they could address in their submissions to help the reviewers to understand and evaluate the work, similar to a pre-flight checklist or pre-surgery checklist. The checklist responses are visible to reviewers and the review form will ask reviewers to rate a submission with respect to whether the submission includes enough information for reproducibility. However, the reviewers will rate the submission based on the submission itself, not simply based on the checklist responses. Also, the checklist responses are visible to Program Chairs, who might take them into consideration when making the final accept/reject decisions.
Q2: If our paper addresses all the items on the checklist, is that sufficient for reproducibility?
Not necessarily. IPCAI research is quite varied and no checklist could cover all the items necessary to reproduce all papers. Use your best judgement to include items relevant for your work, regardless of whether they are in the reproducibility checklist.
Q3: As the corresponding (submitting) author, am I required to fill out the reproducibility checklist in the full-paper submission form?
Yes. You won’t be able to submit the full paper without filling out the reproducibility checklist first. Note that some items in the checklist may not be relevant to your submission.
Q4: As the corresponding author, I have completed the checklist in the submission form. Am I required to address those items explicitly in the submission itself (i.e., in the main paper or supplementary material)?
The items from the checklist are not required to be addressed explicitly in a submission, but authors will surely find their submission is of higher quality if they do so for items that are relevant to their work. Please use your best judgement to determine what checklist items are relevant to your work and what items should be addressed explicitly in the submission itself.
Q5: As the corresponding author, should we address checklist items in the main paper or in the supplementary materials?
It depends on which place is more appropriate. For example, important values such as the number of model parameters and the size of training data should be included in the main paper, whereas less important ones can be included in the appendix in supplementary material. This is the same kind of decision you have to make for other types of information such as math formulas. Just remember that reviewers have access to supplementary materials.
Q6: As the corresponding author, should we submit code and/or data as part of supplementary materials?
IJCARS, and the IPCAI special issue, have a type 1 research data policy. That is, where possible and applicable, authors are encouraged to deposit data that support the findings of their research in a public repository. Authors who do not have a preferred repository should consult Springer Nature’s list of repositories and research data policy. The same is true for code: authors are encouraged to make code that supports their findings publicly available, including setup, dependencies and execution instructions. Public code and data links can be referenced in the main text or in the supplementary materials. As per the author guidelines, statements about public access to resources, i.e., data, code or other materials, should be collected in one place: the Declarations section under the 'Data, code and/or material availability' heading, including access links.
Q7: As a reviewer, should I lower my rating of a submission if it doesn’t address some checklist items in the submission?
Please see Q1 above. Reviewers should use their best judgement to determine whether certain checklist items should be addressed in the submission.
The checklist is closely aligned to the one used at MICCAI 2022, with additional questions relevant for IPCAI. Questions in black font are from the MICCAI 2022 reproducibility checklist. Questions in red font are new to IPCAI 2023. Questions should be answered on CMT with ‘Yes’, ‘No’ or ‘Not applicable’. An optional free text field can be used for further clarification by the authors. Some of the new IPCAI questions concerning clinical research and statistical methods are based on A CHecklist for statistical Assessment of Medical Papers (CHAMP), and we recommend that you also consult that paper.
1. For all experimental setups/studies, check if you include
A clear description of the experimental objectives, tested hypotheses, study design, controls and materials (also see item 5).
References to any published details about the study, e.g., its approved protocol and record on clinicaltrials.gov.
Declaration of any deviations from previously published protocols or design violations. For example, non-response in surveys (missing data) or non-compliance by patients in clinical studies.
A discussion of key assumptions or simplifications.
A discussion of potential biases, e.g., selection, information or confounding biases.
A discussion of study limitations.
2. For all experiments involving users/testers of a system, surveys, or questionnaires, check if you include
A clear description of the participant cohort, including demographics, competencies and experience.
A clear description of the intended population.
A clear description of the participant selection/sampling methodology and any exclusion criteria.
A clear description of any participant training, priming or key instructions.
The list of questions, or forms and response options.
3. For all preclinical experiments, e.g., with phantoms, animal studies, or in-silico studies, check if you include
A clear description of models, preparation and choice of settings/parameters.
A discussion of model validity and important clinical aspects that have not been accounted for by the model.
For animal studies, full conformity with the journal and publisher's ethics guidelines and rules.
4. For all clinical studies, check if you include
Details of the study type/model.
A clear description of the patient cohort and target population.
A clear description of selection/sampling methodology and any exclusion criteria.
A verification of the study description and analysis by a clinical research expert.
Full conformity with the journal and publisher's ethics guidelines and rules.
5. For all hardware and systems, check if you include
A clear declaration of what hardware or system you used (e.g., make, model and hardware/software versions).
A clear description of hardware setup, controls and any calibration processes.
A clear description of possible assumptions or limitations to using this specific hardware.
6. For all statistical analyses, check if you include
A description of the statistical methods, appropriateness and validity of assumptions.
A description of all variables considered for statistical analysis.
A justification of sample size.
A description of how missing data or outliers are handled, e.g., if a baseline or proposed method fails to return a result (missing data).
A declaration of the software tools used.
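As an illustration of the sample size item above, the following is a minimal sketch of one common justification: a power calculation for a two-sided comparison of two group means, using the normal approximation. The function name, the default significance level and power, and the example values are assumptions made for illustration only; the appropriate calculation depends on the study design and the statistical test actually used.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample comparison of means.

    Normal approximation:
        n = 2 * ((z_{1 - alpha/2} + z_{power}) * sigma / delta) ** 2
    where delta is the smallest difference worth detecting and sigma is the
    assumed common standard deviation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)  # round up to a whole number of samples


# Illustrative values only: detect a 0.5-unit difference when the SD is 1.0.
print(sample_size_per_group(delta=0.5, sigma=1.0))
```

Reporting the assumed effect size, variability, significance level and power alongside the resulting number lets a reader check the justification independently.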
7. For all models and algorithms, check if you include
A clear declaration of what software framework and version you used.
A clear explanation of any assumptions.
A clear description of the mathematical setting, algorithm, and/or model.
8. For all datasets used, check if you include:
The relevant statistics, such as number of examples.
Description of the study cohort.
For existing datasets, citations, versions, as well as descriptions if they are not publicly available.
For new data collected, a complete description of the data collection process, such as descriptions of the experimental setup, device(s) used, image acquisition parameters, subjects/objects involved, instructions to annotators, and methods for quality control.
A link to a downloadable version of the dataset (if public).
Whether ethics approval was necessary for the data.
9. For all code related to this work that you have made available or will release if this work is accepted, check if you include:
Specification of dependencies.
(Pre-) trained model(s).
Dataset or link to the dataset needed to run the code.
README file including a table of results accompanied by a precise command to run to produce those results.
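One way to support the dependency item above is to record the software environment automatically when the code is run. The following is a minimal sketch using only the Python standard library; the function name and the package names passed to it are placeholders chosen for illustration, not a required format.

```python
import json
import platform
import sys
from importlib import metadata


def environment_report(packages):
    """Collect interpreter, OS and package versions for a reproducibility log."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


# The package names below are examples; list the ones your code depends on.
report = environment_report(["numpy", "torch"])
print(json.dumps(report, indent=2))
```

Saving this report next to each set of results, and pinning the same versions in the dependency specification, makes it easier for others to recreate the environment.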
10. For all reported experimental results, check if you include:
The range of hyper-parameters considered, method to select the best hyper-parameter configuration, and specification of all hyper-parameters used to generate results.
Information on sensitivity regarding parameter changes.
The exact number of training and evaluation runs.
Details on how baseline methods were implemented and tuned.
The details of train / validation / test data, and isolation of test data.
A clear definition of the specific evaluation metrics and/or statistics used to report results.
A description of results with central tendency (e.g., mean) and variation (e.g., error bars).
An analysis of statistical significance of reported differences in performance between methods.
The average runtime for each result, or estimated energy cost.
A description of the memory footprint.
Discussion of clinical significance.
A description of the computing infrastructure used (hardware and software).
An analysis of situations in which the method failed.
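The items above on central tendency, variation and statistical significance can be illustrated with a short sketch. It reports the mean and sample standard deviation for two methods and runs an exact paired sign-flip permutation test on the per-case differences. The scores are made-up placeholders, the function names are hypothetical, and the permutation test is only one of several reasonable choices; the right test depends on the data and study design.

```python
import itertools
import statistics


def summarize(scores):
    """Return (mean, sample standard deviation) of a list of scores."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_permutation_p_value(a, b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Enumerates all 2**n sign assignments, so it is only practical for small n.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    count = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point rounding in the sums.
        if stat >= observed - 1e-12:
            count += 1
        total += 1
    return count / total


# Illustrative, made-up per-case scores (not real results).
baseline = [0.80, 0.82, 0.78, 0.85, 0.81, 0.79]
proposed = [0.84, 0.85, 0.80, 0.88, 0.83, 0.84]

for name, scores in (("baseline", baseline), ("proposed", proposed)):
    mean, sd = summarize(scores)
    print(f"{name}: mean={mean:.3f}, sd={sd:.3f}, n={len(scores)}")

p_value = paired_permutation_p_value(proposed, baseline)
print(f"paired permutation test: p={p_value:.4f}")
```

Stating the number of cases, the test used and whether it was paired or unpaired alongside the p-value helps reviewers judge whether the reported difference is meaningful.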
These guidelines use the following sources:
The MICCAI 2022, AAAI 2022, IJCAI 2021 and NeurIPS 2020 reproducibility checklists
The IJCAI-ECAI 2022 reproducibility guideline FAQ
Pineau et al. Improving Reproducibility in Machine Learning Research. arXiv:2003.12206
Mansournia et al. A CHecklist for statistical Assessment of Medical Papers (the CHAMP statement): explanation and elaboration, BJSM 2021