We plan to organize a three-hour tutorial covering the content below. To keep the tutorial interactive, we will run quizzes at regular intervals and take questions between segments:
1. Introduction to NLG from Structured Data and Knowledge Bases (20 mins)
- Data-to-text and text-to-text paradigms
- Motivation: Why is this problem important?
- Challenges in structured data translation: Why can known text-to-text methods not be applied to this problem?
- Roadmap of the tutorial
2. Heuristic Driven Methods (20 mins)
- Rule-based approaches
- Template-based approaches
- Current industry solutions
- Shortcomings of the paradigm
3. Statistical and Neural Methods (30 mins)
- Probabilistic Generation Models
- Context-free Grammar-based Approaches
- Three-phase Approach: Planning, Selection, and Surface Realization
- End-to-end Encoder-Decoder Paradigm
- Seq2seq approaches with attention
4. Evaluation Methods for NLG (10 mins)
- N-gram-based methods: BLEU, ROUGE, etc.
- Document-similarity-based methods
- Task-specific evaluation
- Human evaluation metrics
- Challenges in Evaluation
5. Hybrid and Scalable Methods (20 mins)
- Structured data input formats
- Canonicalization
- Simple Language Generation
- Ranking of simple sentences
- Sentence Compounding
- Coreference Replacement
6. Role of Semantics and Pragmatics (15 mins)
- Role of Knowledge Graphs
- Domain-specific Ontologies
- Reasoning and Inference in Generation
7. Open Problems and Future Directions (20 mins)
- Structure-aware Generation
- Theme/Topic-based Generation
- Argumentative Text Generation
- Controllable Text Generation
- Creative Text Generation
8. Conclusion and Closing Remarks (15 mins)