D7 Midterm - Milestone #1
Algorithm Research/Decision Process
Algorithm Choice
- Neural networks are the state of the art for text summarization, but Kevin recommends we first try a simpler, statistics-based approach
- Chose the Text Rank algorithm
- Implement by end of Fall '18 Semester
Text Rank - Graph Based Algorithm
- Provides a relevancy score for each sentence in an article
- Takes the highest-scoring sentences and orders them as they appear in the article to form the summary
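
The selection step described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the function name `select_summary` and the parameter `k` (how many sentences to keep) are hypothetical, and the scores are assumed to have already been computed by the ranking step.

```python
def select_summary(sentences, scores, k=3):
    """Return the k highest-scoring sentences in their original article order.

    `sentences` and `scores` are parallel lists; `k` is a hypothetical
    summary-length parameter chosen for this sketch.
    """
    # Indices of the k sentences with the highest relevancy scores.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    # Sorting the indices restores the order in which the sentences appear.
    return [sentences[i] for i in sorted(top)]
```

Keeping the selected sentences in article order, rather than in score order, is what lets the summary read as a shortened version of the original text.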
Text Rank - How It Works
- Based on Google's PageRank algorithm
- Each sentence in an article is a node in the graph
- Each node has a relevancy weight based on keywords, sentence placement within the article, etc.
- Starting from a random node, repeatedly move to the most similar neighboring node, incrementing each node's relevancy counter whenever it is visited
- Once the traversals are complete, the algorithm outputs the sentences with the highest relevancy scores
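
The scoring idea above can be sketched as follows. Note the substitution: instead of simulating random traversals and counting visits, this sketch uses the iterative PageRank-style update, which approximates how often a similarity-weighted random walk would visit each sentence. Several details are assumptions made for illustration and are not taken from the slides: Jaccard word overlap as the similarity measure, a damping factor of 0.85, a fixed 50 iterations, and the helper names `tokenize`, `similarity`, and `text_rank`. Keyword and sentence-placement weighting are not modeled here.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def similarity(a, b):
    """Jaccard word overlap between two sentences (an assumed measure)."""
    wa, wb = tokenize(a), tokenize(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def text_rank(sentences, damping=0.85, iterations=50):
    """Return one relevancy score per sentence, in input order."""
    n = len(sentences)
    if n == 0:
        return []
    # weights[i][j] is the edge weight between sentences i and j;
    # a sentence has no edge to itself.
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    # Total outgoing weight per node, used to normalize transitions.
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Score flowing into node i from every connected neighbor j,
            # in proportion to how similar j is to i.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores
```

With this sketch, a sentence that shares no words with the rest of the article receives only the baseline score `(1 - damping) / n`, while sentences that overlap with many others accumulate higher scores.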