Workshop on Common Model Infrastructure

IEEE ICDM 2019, November 8-11, 2019, Beijing, China

The continuing, rapid accumulation of large amounts of data, together with the ubiquitous use of data science and computational science, requires us to confront the question of how to manage increasingly complex modeling processes and the large numbers of data-driven and compute-intensive models being generated.

Current modeling practices are rather ad hoc, often depending upon the experience and expertise of individual data and computational scientists and on the nature of pre-processing steps, which may be particular to individual domains and/or application areas.

There is a need for the ability to catalog, share, and discover extant models for the purposes of reproducibility and reuse. For example, very different application domains/disciplines may be able to utilize similar models and/or modeling tools. Yet, such sharing is limited in current practice. Modeling results often have poor reproducibility. Information on when/how a model works, and when it may fail, is oftentimes not clearly recorded. To support reproducibility and reuse, model provenance as well as the original intent behind the knowledge discovery process should be well-recorded. The goal should be to make predictive analytics algorithms, and other models, more transparent to end-users.

The CMI Workshop will focus on this key, emerging issue for the ICDM community, viz., what approaches, protocols, and standards are needed to create a ModelCommons for cataloging analytics procedures and for model management? This workshop provides a forum for researchers and practitioners to discuss the challenges and solutions for supporting the discovery, sharing, use, and reuse of data mining, machine learning, statistical analysis, and analytics models.

The CMI Workshop at IEEE ICDM 2019 is a continuation of the successful workshops on Common Model Infrastructure held at KDD 2018 and the NIPS 2018 Expo. A report from the Workshop on Common Model Infrastructure at ACM KDD 2018 appeared in a special issue of the IEEE Data Engineering Bulletin on Machine Learning Life-cycle Management, December 2018.

Important Dates

  • All deadlines are at 11:59 PM Pacific Daylight Time
  • Workshop Duration: half-day.
  • Workshop paper submissions deadline: August 7, 2019.
  • Workshop paper notification: September 4, 2019.
  • Camera-ready deadline and copyright forms: September 8, 2019.

Call for Papers

The need for ModelCommons for cataloging analytics procedures and for model management has emerged as a key issue for the ICDM community. This workshop provides a forum for researchers to discuss emerging challenges and solutions for supporting discovery, sharing, and use/reuse of data mining, machine learning, statistical analysis, and analytics models. The resulting approaches, protocols, and standards will help in defining and implementing the functionality necessary for a ModelCommons.

Topics of interest include, but are not limited to:

  • Lifecycle management of data mining models and software
  • Data mining systems and platforms that support cataloging, searching, recommending, and discovering data mining models
  • Systems to support reproducibility of data mining analyses
  • Common data mining model infrastructure to support applications of data mining across a wide range of disciplines including, for example, social sciences, physical sciences, engineering, life sciences, web, marketing, finance, precision medicine, health informatics, and other domains.
  • Privacy and security issues in creating shared platforms and shareable models
  • Metadata for recording and describing data mining models, including when/how a given model works, and when it may fail
  • Model storage, versioning, exchange, and provenance management
  • Transparency of data mining algorithms / models
  • Publishing and reuse of data mining pipelines
  • Integration with existing modeling tools and data analytics infrastructure
  • Reports on experiences with model management infrastructure, model exchange formats, etc., from practice in industry and elsewhere.

The expected audience for this workshop includes researchers and practicing engineers in fields affiliated with data mining systems.

Technical Program Committee

  • Chaitan Baru (Co-Chair), UC San Diego, San Diego, USA.
  • Luke Huan (Co-Chair), Baidu Research, Beijing, China.
  • Robert Grossman, University of Chicago, Chicago, USA.
  • Bill Howe, University of Washington, Seattle, USA.
  • Vandana Janeja, University of Maryland Baltimore County, Baltimore, USA.
  • Joaquin Vanschoren, Eindhoven University of Technology, Eindhoven, Netherlands.

Contact

Contact the organizers above for general questions.

Submission guidelines for research papers/posters

Paper submissions should be limited to a maximum of ten (10) pages, in the IEEE 2-column format, including the bibliography and any possible appendices. Submissions longer than 10 pages will be rejected without review. All submissions will be triple-blind reviewed by the Program Committee on the basis of technical quality, relevance to the scope of the conference, originality, significance, and clarity. The following sections give further information for authors.

Triple blind submission guidelines

Since 2011, ICDM has imposed a triple blind submission and review policy for all submissions. Authors must therefore not use identifying information in the text of the paper, and bibliographic references must be written so as to preserve anonymity. Any papers available on the Web (including arXiv) no longer qualify for ICDM submissions, as their author information is already public.

What is triple blind reviewing?

The traditional blind paper submission hides the referee names from the authors, and the double-blind paper submission also hides the author names from the referees. Triple-blind reviewing further hides the referee names from other referees during paper discussions before acceptance decisions are made. The names of authors and referees remain known only to the PC Co-chairs, and the author names are disclosed only after the ranking and acceptance of submissions are finalized. It is imperative that all authors of ICDM submissions conceal their identity and affiliation information in their paper submissions. It does not suffice to remove the author names and affiliations from the first page; identifying information must also be removed from the content of each paper submission.

How to prepare your submissions

The authors shall omit their names from the submission. For formatting templates with author and institution information, simply replace all of this information in the template with “Anonymous”.

In the submission, the authors should refer to their own prior work in the same way as the prior work of any other author, and include all relevant citations. This can be done either by referring to their prior work in the third person or by referencing papers generically. For example, if your name is Smith and you have worked on clustering, instead of saying “We extend our earlier work on distance-based clustering (Smith 2005),” you might say “We extend Smith’s (Smith 2005) earlier work on distance-based clustering.” Likewise, do not write “In our previous work [3]”, as it reveals that citation 3 was written by the current authors. The authors shall exclude citations to their own work that is not fundamental to understanding the paper, including prior versions (e.g., technical reports, unpublished internal documents) of the submitted paper. The authors shall remove mention of funding sources, personal acknowledgments, and other such auxiliary information that could be related to their identities. These can be reinstated in the camera-ready copy once the paper is accepted for publication. Statements about well-known or unique systems that could identify an author shall be kept as vague as possible with respect to the authors' identities. The submitted files shall be named with care to ensure that author anonymity is not compromised by the file name. For example, do not name your submission “Smith.pdf”; instead, give it a name that is descriptive of the title of your paper, such as “ANewApproachtoClustering.pdf” (or a shorter version of the same).

Accepted papers will be published in the conference proceedings by the IEEE Computer Society Press. All manuscripts are submitted as full papers and are reviewed based on their scientific merit. The reviewing process is confidential. There is no separate abstract submission step. There are no separate industrial, application, short paper, or poster tracks during submission. Manuscripts must be submitted electronically through the online submission system. We do not accept email submissions.

Attending the Workshop

ICDM is a premier forum for presenting and discussing current research in data mining. Therefore, at least one author of each accepted paper must complete the conference registration and present the paper at the workshop, in order for the paper to be included in the proceedings and the workshop/conference program.